MetaFieldRanker
MetaFieldRanker ranks Documents based on the value of a meta field you specify. It's a lightweight Ranker that can improve your pipeline's results without slowing it down.
| Name | MetaFieldRanker | 
| Folder path | /rankers/ | 
| Most common position in a pipeline | In a query pipeline, after a component that returns a list of documents, such as a Retriever | 
| Mandatory input variables | “documents”: A list of documents; “top_k”: The maximum number of documents to return. If not provided, the Ranker returns all documents it received. | 
| Output variables | “documents”: A list of documents | 
Overview
MetaFieldRanker sorts documents by the value of a specific meta field, in descending or ascending order. The returned list of Document objects is arranged in the selected order. String values are sorted alphabetically or in reverse alphabetical order (for example, Tokyo, Paris, Berlin).
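The ordering described above can be illustrated in plain Python. This is a minimal sketch, not MetaFieldRanker's implementation: it uses plain dictionaries in place of Document objects, and the helper name `sort_by_meta` and its parameters are hypothetical.

```python
from typing import Any, Dict, List, Optional


def sort_by_meta(
    docs: List[Dict[str, Any]],
    meta_field: str,
    descending: bool = True,
    top_k: Optional[int] = None,
) -> List[Dict[str, Any]]:
    """Hypothetical helper: order docs by the value stored under meta[meta_field]."""
    ranked = sorted(docs, key=lambda d: d["meta"][meta_field], reverse=descending)
    # top_k=None keeps every document that was passed in
    return ranked if top_k is None else ranked[:top_k]


# Numeric meta values: highest rating first
docs = [
    {"content": "Paris", "meta": {"rating": 1.3}},
    {"content": "Berlin", "meta": {"rating": 0.7}},
    {"content": "Tokyo", "meta": {"rating": 2.1}},
]
by_rating = [d["content"] for d in sort_by_meta(docs, "rating")]

# String meta values: descending order means reverse alphabetical
cities = [{"content": c, "meta": {"city": c}} for c in ["Paris", "Berlin", "Tokyo"]]
by_city_desc = [d["content"] for d in sort_by_meta(cities, "city")]
```

With these inputs, both lists come out as Tokyo, Paris, Berlin: the first because of the ratings, the second because descending string order is reverse alphabetical.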
MetaFieldRanker comes with the optional parameters weight and ranking_mode, which you can use to combine the score a Retriever assigned to a document with the value of its meta field. The weight parameter lets you balance the importance of the Document's content and the meta field in the ranking process. The ranking_mode parameter defines how the scores from the Retriever and the Ranker are combined.
This Ranker is useful in query pipelines, like retrieval-augmented generation (RAG) pipelines or document search pipelines. It ensures the documents are ordered by their meta field value. You can also use it after a Retriever (such as the InMemoryEmbeddingRetriever) to combine the Retriever’s score with a document’s meta value for improved ranking.
By default, MetaFieldRanker sorts documents based only on the meta field. You can change this by setting the weight to a value less than 1 when initializing the component, so that the Retriever's score also contributes to the ranking. For more details on the initialization settings, check out the API reference for this component.
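To make the effect of the weight more concrete, here is a small plain-Python sketch of one possible linear blend. The formula, the rank-based meta score, and the helper name `combine_linear` are assumptions made for illustration; they are not taken from the component's source, so check the API reference for the actual behavior of each ranking_mode.

```python
from typing import Dict, List


def combine_linear(
    retriever_scores: Dict[str, float],
    meta_values: Dict[str, float],
    weight: float = 1.0,
) -> List[str]:
    """Hypothetical blend: (1 - weight) * retriever score + weight * meta-rank score."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    # Assumption: convert meta values to rank-based scores in (0, 1],
    # so the document with the highest meta value gets 1.0
    by_meta = sorted(meta_values, key=meta_values.get, reverse=True)
    n = len(by_meta)
    meta_score = {doc_id: (n - rank) / n for rank, doc_id in enumerate(by_meta)}
    combined = {
        doc_id: (1 - weight) * retriever_scores[doc_id] + weight * meta_score[doc_id]
        for doc_id in meta_values
    }
    return sorted(combined, key=combined.get, reverse=True)


# Illustrative inputs; retriever scores are assumed to lie in [0, 1]
retriever_scores = {"Paris": 0.9, "Berlin": 0.2, "Barcelona": 0.1}
ratings = {"Paris": 1.3, "Berlin": 0.7, "Barcelona": 2.1}

meta_only = combine_linear(retriever_scores, ratings, weight=1.0)
blended = combine_linear(retriever_scores, ratings, weight=0.3)
```

In this sketch, a weight of 1 orders the documents purely by rating (Barcelona, Paris, Berlin), while a weight of 0.3 lets the strong retriever score move Paris ahead of Barcelona.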
Usage
On its own
You can use this Ranker outside of a pipeline to sort documents.
This example uses the MetaFieldRanker to rank two simple documents. When running the Ranker, you pass the query, provide the documents, and set the maximum number of documents to return using the top_k parameter.
from haystack import Document
from haystack.components.rankers import MetaFieldRanker
docs = [Document(content="Paris", meta={"rating": 1.3}), Document(content="Berlin", meta={"rating": 0.7})]
ranker = MetaFieldRanker(meta_field="rating")
ranker.run(query="City in France", documents=docs, top_k=1)
In a pipeline
Below is an example of a pipeline that retrieves documents from an InMemoryDocumentStore based on keyword search (using InMemoryBM25Retriever). It then uses the MetaFieldRanker to rank the retrieved documents based on the meta field rating, using the Ranker's default settings:
from haystack import Document, Pipeline
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.components.rankers import MetaFieldRanker
docs = [Document(content="Paris", meta={"rating": 1.3}),
        Document(content="Berlin", meta={"rating": 0.7}),
        Document(content="Barcelona", meta={"rating": 2.1})]
document_store = InMemoryDocumentStore()
document_store.write_documents(docs)
retriever = InMemoryBM25Retriever(document_store=document_store)
ranker = MetaFieldRanker(meta_field="rating")
document_ranker_pipeline = Pipeline()
document_ranker_pipeline.add_component(instance=retriever, name="retriever")
document_ranker_pipeline.add_component(instance=ranker, name="ranker")
document_ranker_pipeline.connect("retriever.documents", "ranker.documents")
query = "Cities in France"
document_ranker_pipeline.run(data={"retriever": {"query": query, "top_k": 3}, 
                                   "ranker": {"query": query, "top_k": 2}})
