Uses nucleus sampling to filter documents. Useful in combination with WebRetriever to choose documents that are diverse but relevant to the query.
What's nucleus (top_p) sampling?
Nucleus sampling is expressed in the top_p parameter used in generative question answering. It controls the level of randomness and diversity in the generated text.
When top_p is set to a high value, the model is more likely to generate diverse and creative outputs. When it is set to a low value, the model is more likely to generate predictable and less risky outputs.
Nucleus sampling is often used in combination with other parameters, such as top_k, to balance creativity and coherence in the generated text.
See also Model Parameters.
While nucleus, or top p, sampling is usually mentioned in the context of the next token selection in generative NLP models, we can also use it to filter documents based on the cumulative probability of the similarity scores between the query and the documents.
In this context, top p sampling selects a diverse subset of the documents most relevant to the query while also removing unrelated documents. The technique converts the similarity scores between the query and the documents into probabilities, ranks the documents from most to least similar, and keeps the top-ranked documents until their cumulative probability reaches top_p.
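The filtering step described above can be sketched in plain Python. This is a minimal illustration, not TopPSampler's actual implementation: the function name top_p_filter is hypothetical, and it assumes that the similarity scores are turned into probabilities with a softmax.

```python
import math


def top_p_filter(scores, top_p):
    """Return indices of the documents kept by top p filtering, best first.

    Hypothetical helper for illustration only; assumes a softmax over the scores.
    """
    if not scores:
        return []

    # Softmax turns raw similarity scores into a probability distribution.
    # Subtracting the maximum score keeps exp() numerically stable.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank document indices from most to least probable.
    order = sorted(range(len(scores)), key=lambda i: probs[i], reverse=True)

    # Keep documents until the cumulative probability reaches top_p.
    kept = []
    cumulative = 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return kept
```

With scores such as [-3.0, 5.0, -2.0, 4.0] and top_p=0.95, the two high-scoring documents already cover more than 95% of the probability mass, so the two low-scoring ones are dropped. Lowering top_p shrinks the kept set further, and the sketch always keeps at least one document.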
By default, TopPSampler uses the ms-marco-MiniLM-L-6-v2 model, but you can replace it with any other cross-encoder model. For a full list of models, see Hugging Face.
TopPSampler is used in combination with other nodes, such as WebRetriever, to limit the number of results they return. Here's an example of TopPSampler in a pipeline:
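from haystack import Pipeline
from haystack.nodes import TopPSampler, WebRetriever

# WebRetriever fetches documents from the web for the query.
retriever = WebRetriever(api_key="<your_api_key_here>", mode="preprocessed_documents")
# TopPSampler keeps the top-ranked documents up to a cumulative probability of 0.95.
sampler = TopPSampler(top_p=0.95)

p = Pipeline()
p.add_node(component=retriever, name="Retriever", inputs=["Query"])
p.add_node(component=sampler, name="Sampler", inputs=["Retriever"])
print(p.run(query="What's the secret of the Universe?"))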