JinaRanker

Use this component to rank documents based on their similarity to the query using Jina AI models.

Most common position in a pipeline: In a query pipeline, after a component that returns a list of documents (such as a Retriever)
Mandatory init variables: "api_key": The Jina API key. Can be set with the JINA_API_KEY env var.
Mandatory run variables: "query": A query string; "documents": A list of documents
Output variables: "documents": A list of documents
API reference: Jina
GitHub link: https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/jina

Overview

JinaRanker ranks the given documents based on how similar they are to the given query. It uses Jina AI ranking models – check out the full list at Jina AI’s website. The default model for this Ranker is jina-reranker-v1-base-en.

Additionally, you can use the optional top_k and score_threshold parameters with JinaRanker:

  • The Ranker's top_k is the maximum number of documents it returns (if it's the last component in the pipeline) or forwards to the next component.
  • If you set the score_threshold for the Ranker, it will only return documents with a similarity score (computed by the Jina AI model) above this threshold.
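The way the two parameters combine can be sketched in plain Python. This only illustrates the selection logic described above and is not the component's implementation: the select helper and the example scores are hypothetical, and keeping a score that is exactly equal to the threshold is an assumption of this sketch.

```python
from typing import List, Optional, Tuple


def select(scored: List[Tuple[str, float]],
           top_k: Optional[int] = None,
           score_threshold: Optional[float] = None) -> List[Tuple[str, float]]:
    """Hypothetical helper: order (text, score) pairs, then apply top_k and score_threshold."""
    # Highest score first, as a ranker would order its output.
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    # Keep at most top_k entries (None means no limit).
    if top_k is not None:
        ranked = ranked[:top_k]
    # Drop entries below the threshold (assumption: an equal score is kept).
    if score_threshold is not None:
        ranked = [pair for pair in ranked if pair[1] >= score_threshold]
    return ranked


# Made-up scores, for illustration only.
scored = [("Berlin", 0.12), ("Paris", 0.91), ("Lyon", 0.74)]
print(select(scored, top_k=2))                       # the two best-scoring entries
print(select(scored, top_k=3, score_threshold=0.5))  # entries scoring at least 0.5
```

With both parameters set, the Ranker can return fewer than top_k documents, because the threshold may remove some of them.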

Installation

To start using this integration with Haystack, install the package with:

pip install jina-haystack

Authorization

The component reads the API key from the JINA_API_KEY environment variable by default. Alternatively, you can pass a Jina API key at initialization through the api_key parameter, wrapped in a Secret (imported from haystack.utils), like this:

ranker = JinaRanker(api_key=Secret.from_token("<your-api-key>"))

To get your API key, head to Jina AI’s website.
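Since the component relies on the JINA_API_KEY environment variable by default, it can help to check for the variable before building the Ranker, so that a missing key is reported with a clear message. A minimal sketch using only the standard library follows; the jina_key_available helper is hypothetical and not part of the integration.

```python
import os
from typing import Mapping


def jina_key_available(env: Mapping[str, str] = os.environ) -> bool:
    """Hypothetical helper: report whether a non-empty JINA_API_KEY is set."""
    return bool(env.get("JINA_API_KEY"))


if jina_key_available():
    print("JINA_API_KEY is set; JinaRanker() can pick it up by default.")
else:
    print("JINA_API_KEY is not set; export it or pass api_key explicitly.")
```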

Usage

On its own

You can use JinaRanker outside of a pipeline to order documents based on your query.

To run the Ranker, pass a query, provide the documents, and set the number of documents to return in the top_k parameter.

from haystack import Document
from haystack_integrations.components.rankers.jina import JinaRanker

docs = [Document(content="Paris"), Document(content="Berlin")]

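# With no api_key argument, the key is read from the JINA_API_KEY environment variable
ranker = JinaRanker()

# The result is a dictionary; the ranked documents are under the "documents" key
result = ranker.run(query="City in France", documents=docs, top_k=1)
print(result["documents"])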
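The run method returns a dictionary with a documents key, as listed under the output variables. A minimal sketch of reading that result follows. To stay runnable without an API key, it uses SimpleNamespace stand-ins instead of real documents and a made-up score. The summarize helper is hypothetical, and the sketch assumes each returned document exposes content and score attributes.

```python
from types import SimpleNamespace


def summarize(result: dict) -> list:
    """Hypothetical helper: turn a ranker result into (content, score) pairs."""
    return [(doc.content, doc.score) for doc in result["documents"]]


# Stand-in for the dictionary returned by ranker.run(...); the score is made up.
fake_result = {"documents": [SimpleNamespace(content="Paris", score=0.9)]}

for content, score in summarize(fake_result):
    print(f"{score:.2f}  {content}")
```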

In a pipeline

This is an example of a pipeline that retrieves documents from an InMemoryDocumentStore based on keyword search (using InMemoryBM25Retriever). It then uses the JinaRanker to rank the retrieved documents according to their similarity to the query.

from haystack import Document, Pipeline
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack_integrations.components.rankers.jina import JinaRanker

docs = [Document(content="Paris is in France"), 
        Document(content="Berlin is in Germany"),
        Document(content="Lyon is in France")]
document_store = InMemoryDocumentStore()
document_store.write_documents(docs)

retriever = InMemoryBM25Retriever(document_store=document_store)
ranker = JinaRanker()

ranker_pipeline = Pipeline()
ranker_pipeline.add_component(instance=retriever, name="retriever")
ranker_pipeline.add_component(instance=ranker, name="ranker")

ranker_pipeline.connect("retriever.documents", "ranker.documents")

query = "Cities in France"
ranker_pipeline.run(data={"retriever": {"query": query, "top_k": 3},
                          "ranker": {"query": query, "top_k": 2}})
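The pipeline's output is a dictionary keyed by component name, so the ranked documents are expected under the "ranker" entry. The sketch below shows that lookup on a stand-in output, so it runs without a document store or an API key. The ranked_contents helper and the fake_output value are hypothetical.

```python
from types import SimpleNamespace


def ranked_contents(pipeline_output: dict, component: str = "ranker") -> list:
    """Hypothetical helper: list the document contents produced by one component."""
    return [doc.content for doc in pipeline_output[component]["documents"]]


# Stand-in for the value returned by ranker_pipeline.run(...).
fake_output = {
    "ranker": {
        "documents": [
            SimpleNamespace(content="Paris is in France"),
            SimpleNamespace(content="Lyon is in France"),
        ]
    }
}
print(ranked_contents(fake_output))
```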

Related Links

Check out the API reference in the GitHub repo (https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/jina) or in our docs.