Top-p (nucleus) sampling identifies and retains a subset of documents based on their cumulative probabilities. Rather than selecting a fixed number of documents, it keeps the highest-scoring documents that together account for a specified share of the total probability.
The practical goal of the TopPSampler is to return a list of documents whose combined scores reach the top_p value. For example, when top_p is set to a high value, more documents are returned, which can result in more varied outputs. The value is typically set between 0 and 1.
The run() method takes a query and a set of documents, calculates the similarity scores between the query and each document, and then filters the documents based on the cumulative probability of these scores. This lets the TopPSampler select the most relevant documents based on their similarity to the query.
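The filtering step can be illustrated with a minimal, library-free sketch. It assumes the similarity scores are turned into probabilities with a softmax and that documents are kept, highest first, until their cumulative probability reaches top_p. The function name top_p_filter and the sample scores are hypothetical, and the exact cutoff rule inside TopPSampler may differ in detail.

```python
import math


def top_p_filter(scores, top_p):
    """Return indices of the highest-scoring items whose cumulative
    softmax probability reaches top_p (illustrative sketch only)."""
    if not scores:
        return []
    # Softmax: convert raw similarity scores into probabilities summing to 1.
    # Subtracting the maximum keeps the exponentials numerically stable.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank items from most to least probable.
    order = sorted(range(len(scores)), key=lambda i: probs[i], reverse=True)

    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:  # threshold reached, stop adding documents
            break
    return kept


# Hypothetical query-document similarity scores.
scores = [4.0, 2.0, 1.0, -1.0]
print(top_p_filter(scores, 0.5))   # [0]
print(top_p_filter(scores, 0.95))  # [0, 1]
```

With these sample scores, raising top_p from 0.5 to 0.95 grows the result from one document to two, which matches the behavior described above: a higher top_p returns more documents.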
You can additionally check out the Model Parameters documentation.
By default, TopPSampler uses the ms-marco-MiniLM-L-6-v2 model, but you can replace it with any other cross-encoder model. For a full list of models, see Hugging Face.
TopPSampler is used in combination with other nodes, such as WebRetriever, to limit the number of results they return. Here's an example of TopPSampler in a pipeline:
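from haystack import Pipeline
from haystack.nodes import TopPSampler
from haystack.nodes.retriever.web import WebRetriever

# Retrieve web results, then let the sampler filter them by cumulative probability.
retriever = WebRetriever(api_key="<your_api_key_here>", mode="preprocessed_documents")
sampler = TopPSampler(top_p=0.95)

p = Pipeline()
p.add_node(component=retriever, name="Retriever", inputs=["Query"])
p.add_node(component=sampler, name="Sampler", inputs=["Retriever"])
print(p.run(query="What's the secret of the Universe?"))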