
Ranker

The ranking models supported by Haystack are transformer-based, which makes them sensitive to word order and syntax. The improvement in result quality that the Ranker brings comes at the cost of some additional computation time.

In Haystack, you can use any Cross-Encoder model that returns a single logit as a similarity score. See the Sentence Transformers page for examples.

Position in a Pipeline: After a Retriever
Input: Documents
Output: Documents
Classes: SentenceTransformersRanker

Usage

To use the Ranker in a pipeline, run:

from haystack.document_stores import ElasticsearchDocumentStore
from haystack.nodes import BM25Retriever, SentenceTransformersRanker
from haystack import Pipeline

# Connect to a running Elasticsearch instance
document_store = ElasticsearchDocumentStore()

# Sparse retriever to fetch candidate documents, followed by a
# cross-encoder Ranker to re-sort them
retriever = BM25Retriever(document_store=document_store)
ranker = SentenceTransformersRanker(model_name_or_path="cross-encoder/ms-marco-MiniLM-L-12-v2")

p = Pipeline()
p.add_node(component=retriever, name="BM25Retriever", inputs=["Query"])
p.add_node(component=ranker, name="Ranker", inputs=["BM25Retriever"])
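
You can then run a query through the pipeline. A minimal sketch, assuming documents have already been indexed in the document store (the query string and top_k values here are illustrative):

# Retrieve 100 candidates with BM25, then keep the 10 best after re-ranking
result = p.run(
    query="What is the capital of Germany?",
    params={"BM25Retriever": {"top_k": 100}, "Ranker": {"top_k": 10}},
)
for doc in result["documents"]:
    print(doc.score, doc.content)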

The SentenceTransformersRanker can also be used in isolation by calling its predict() method after initialization.
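
For example, a minimal sketch of standalone usage (the query and documents are illustrative):

from haystack import Document
from haystack.nodes import SentenceTransformersRanker

ranker = SentenceTransformersRanker(model_name_or_path="cross-encoder/ms-marco-MiniLM-L-12-v2")
docs = [
    Document(content="Paris is the capital of France."),
    Document(content="Berlin is the capital of Germany."),
]
# Returns the same documents, re-sorted by cross-encoder relevance to the query
reranked = ranker.predict(query="What is the capital of Germany?", documents=docs)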

Use Case

As an example, a Ranker can pair nicely with a sparse retriever, such as the BM25Retriever. While the BM25Retriever is fast and lightweight, it is not sensitive to word order but rather treats text as a bag of words. By placing a Ranker afterwards, you can offset this weakness and get a better-sorted list of relevant documents.

Training

The Ranker needs to be initialized with a model trained on a text pair classification task. The SentenceTransformersRanker has a train() method for this purpose. Alternatively, this FARM script shows how to train a text pair classification model.
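
As an illustration, here is a minimal sketch that fine-tunes a Cross-Encoder with the sentence-transformers library directly; the resulting model directory can then be passed to SentenceTransformersRanker via model_name_or_path. The training pairs, hyperparameters, and output path are placeholders:

from torch.utils.data import DataLoader
from sentence_transformers import CrossEncoder, InputExample

# Start from an existing cross-encoder checkpoint; num_labels=1 yields a single similarity logit
model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-12-v2", num_labels=1)

# Text pair classification data: (query, passage) pairs labelled with relevance
train_samples = [
    InputExample(texts=["What is the capital of Germany?", "Berlin is the capital of Germany."], label=1.0),
    InputExample(texts=["What is the capital of Germany?", "Paris is the capital of France."], label=0.0),
]
train_dataloader = DataLoader(train_samples, shuffle=True, batch_size=16)

model.fit(train_dataloader=train_dataloader, epochs=1, warmup_steps=100)
model.save("my-finetuned-cross-encoder")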

