Version: 2.31

JinaRanker

Use this component to rank documents based on their similarity to the query using Jina AI models.


Most common position in a pipeline	In a query pipeline, after a component that returns a list of documents (such as a Retriever)
Mandatory init variables	`api_key`: The Jina API key. Can be set with `JINA_API_KEY` env var.
Mandatory run variables	`query`: A query string; `documents`: A list of documents
Output variables	`documents`: A list of documents
API reference	Jina
GitHub link	https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/jina
Package name	`jina-haystack`

Overview

`JinaRanker` ranks the given documents by how similar they are to the given query. It uses Jina AI ranking models; the full list is available on Jina AI's website. The default model for this Ranker is `jina-reranker-v1-base-en`.

Additionally, you can use the optional `top_k` and `score_threshold` parameters with `JinaRanker`:

- The Ranker's `top_k` is the number of documents it returns (if it's the last component in the pipeline) or forwards to the next component.
- If you set the `score_threshold` for the Ranker, it only returns documents with a similarity score (computed by the Jina AI model) above this threshold.
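
The way the two parameters interact can be sketched in plain Python. This is an illustrative stand-in, not the `JinaRanker` implementation: `ScoredDoc` and `select_ranked` are hypothetical names, the scores are made up, and excluding a score exactly equal to the threshold is an assumption based on the word "above".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ScoredDoc:
    """Hypothetical stand-in for a document that already has a similarity score."""
    content: str
    score: float


def select_ranked(
    docs: List[ScoredDoc],
    top_k: Optional[int] = None,
    score_threshold: Optional[float] = None,
) -> List[ScoredDoc]:
    """Sort by score (highest first), then apply the threshold and top_k."""
    ranked = sorted(docs, key=lambda d: d.score, reverse=True)
    if score_threshold is not None:
        # Assumption: "above" means strictly greater than the threshold.
        ranked = [d for d in ranked if d.score > score_threshold]
    if top_k is not None:
        ranked = ranked[:top_k]
    return ranked


# Made-up scores, for illustration only.
docs = [
    ScoredDoc("Berlin is in Germany", 0.12),
    ScoredDoc("Paris is in France", 0.91),
    ScoredDoc("Lyon is in France", 0.78),
]

top_two = select_ranked(docs, top_k=2)
confident = select_ranked(docs, score_threshold=0.8)
print([d.content for d in top_two])    # ['Paris is in France', 'Lyon is in France']
print([d.content for d in confident])  # ['Paris is in France']
```

With both parameters set, fewer than `top_k` documents can come back if not enough of them clear the threshold.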

Installation

To start using this integration with Haystack, install the package with:

shell

pip install jina-haystack

Authorization

By default, the component reads the API key from the `JINA_API_KEY` environment variable. Alternatively, you can pass a Jina API key at initialization with `api_key`, like this:

python

from haystack.utils import Secret

ranker = JinaRanker(api_key=Secret.from_token("<your-api-key>"))

To get your API key, head to Jina AI’s website.
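
If you rely on the environment variable, it can help to check that it is set before you build the Ranker. The snippet below is a minimal sketch that uses only the standard library. `jina_api_key_configured` is a hypothetical helper, not part of `jina-haystack`, and it only checks that the variable is non-empty, not that the key is valid.

```python
import os


def jina_api_key_configured(env_var: str = "JINA_API_KEY") -> bool:
    """Return True if the environment variable holds a non-empty value."""
    return bool(os.environ.get(env_var, "").strip())


if not jina_api_key_configured():
    # Hypothetical message; adapt it to your own logging or error handling.
    print("JINA_API_KEY is not set; export it or pass api_key explicitly.")
```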

Usage

On its own

You can use JinaRanker outside of a pipeline to order documents based on your query.

To run the Ranker, pass a query, provide the documents, and set the number of documents to return in the top_k parameter.

python

from haystack import Document
from haystack_integrations.components.rankers.jina import JinaRanker

docs = [Document(content="Paris"), Document(content="Berlin")]

ranker = JinaRanker()

ranker.run(query="City in France", documents=docs, top_k=1)

In a pipeline

This is an example of a pipeline that retrieves documents from an InMemoryDocumentStore based on keyword search (using InMemoryBM25Retriever). It then uses the JinaRanker to rank the retrieved documents according to their similarity to the query.

python

from haystack import Document, Pipeline
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack_integrations.components.rankers.jina import JinaRanker

docs = [
    Document(content="Paris is in France"),
    Document(content="Berlin is in Germany"),
    Document(content="Lyon is in France"),
]
document_store = InMemoryDocumentStore()
document_store.write_documents(docs)

retriever = InMemoryBM25Retriever(document_store=document_store)
ranker = JinaRanker()

ranker_pipeline = Pipeline()
ranker_pipeline.add_component(instance=retriever, name="retriever")
ranker_pipeline.add_component(instance=ranker, name="ranker")

ranker_pipeline.connect("retriever.documents", "ranker.documents")

query = "Cities in France"
ranker_pipeline.run(
    data={
        "retriever": {"query": query, "top_k": 3},
        "ranker": {"query": query, "top_k": 2},
    },
)
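
`Pipeline.run` returns a dictionary keyed by component name, so the ranked documents are found under the `"ranker"` key, in its `documents` output. The sketch below shows one way to turn that result into (content, score) pairs. `summarize_ranked` is a hypothetical helper, and the `SimpleNamespace` objects with made-up scores stand in for real `Document` objects so that the snippet runs without an API key.

```python
from types import SimpleNamespace
from typing import Any, Dict, List, Tuple


def summarize_ranked(
    result: Dict[str, Dict[str, Any]], component: str = "ranker"
) -> List[Tuple[str, float]]:
    """Extract (content, score) pairs from a pipeline result dictionary."""
    documents = result.get(component, {}).get("documents", [])
    return [(doc.content, doc.score) for doc in documents]


# Stand-in for the value returned by ranker_pipeline.run(...); scores are made up.
fake_result = {
    "ranker": {
        "documents": [
            SimpleNamespace(content="Paris is in France", score=0.91),
            SimpleNamespace(content="Lyon is in France", score=0.78),
        ]
    }
}

for content, score in summarize_ranked(fake_result):
    print(f"{score:.2f}  {content}")
```

In a real run, you would assign the return value of `ranker_pipeline.run(...)` to a variable and pass it to the helper instead of `fake_result`.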
