DocumentMerger
Create a single Document from multiple Documents using the DocumentMerger node. This is particularly useful when you want to create summaries based on multiple Documents using the Summarizer.
Usage
To initialize a DocumentMerger, run:
from haystack.nodes import DocumentMerger
document_merger = DocumentMerger(separator=" ")
To run a DocumentMerger on its own, use the merge() method:
from haystack import Document
d1 = Document(content="Here's an introductory sentence.")
d2 = Document(content="Let's continue on with some text.")
d3 = Document(content="And finally, we'll end with a conclusion.")
# Merge the three Documents; the merged Document is the first element of the result
merged_documents = document_merger.merge([d1, d2, d3])
merged_documents[0].content
# "Here's an introductory sentence. Let's continue on with some text. And finally, we'll end with a conclusion."
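To make the effect of the separator concrete, here is a minimal plain-Python sketch of the joining step. It is an illustration only, not the Haystack implementation: SimpleDoc and merge_contents are hypothetical names introduced here, and the sketch ignores metadata and every other Document field.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SimpleDoc:
    """Hypothetical stand-in for a Haystack Document; it only holds text."""
    content: str


def merge_contents(docs: List[SimpleDoc], separator: str = " ") -> List[SimpleDoc]:
    """Join the text of all docs with `separator` and return a one-item list."""
    merged_text = separator.join(doc.content for doc in docs)
    return [SimpleDoc(content=merged_text)]


docs = [
    SimpleDoc("Here's an introductory sentence."),
    SimpleDoc("Let's continue on with some text."),
    SimpleDoc("And finally, we'll end with a conclusion."),
]

# Same input, two different separators
merged_space = merge_contents(docs, separator=" ")
merged_newline = merge_contents(docs, separator="\n")

print(merged_space[0].content)
print(merged_newline[0].content)
```

With a space as the separator, the sketch produces the same string as the merge() example above. With "\n", each original Document's text ends up on its own line, which may be preferable when the downstream node benefits from visible boundaries between sources.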
To use a DocumentMerger within a pipeline, run:
from haystack.pipelines import Pipeline
# Assumes `retriever` and `summarizer` have already been initialized
pipeline = Pipeline()
pipeline.add_node(component=retriever, name="Retriever", inputs=["Query"])
pipeline.add_node(component=document_merger, name="DocumentMerger", inputs=["Retriever"])
pipeline.add_node(component=summarizer, name="Summarizer", inputs=["DocumentMerger"])
summary = pipeline.run(query="Political parties of Australia")