
DocumentMerger

Create a single Document from multiple Documents using the DocumentMerger node. This is particularly useful when you want to create summaries based on multiple Documents using the Summarizer.

Position in a Pipeline: After a Retriever and before a Summarizer in a querying pipeline.
Input: Documents
Output: Documents
Classes: DocumentMerger

Usage

To initialize a DocumentMerger, run:

from haystack.nodes import DocumentMerger

document_merger = DocumentMerger(separator=" ")

To run a DocumentMerger on its own, use the merge() method:

from haystack import Document

# Three short Documents to merge into one
d1 = Document(content="Here's an introductory sentence.")
d2 = Document(content="Let's continue on with some text.")
d3 = Document(content="And finally, we'll end with a conclusion.")

merged_documents = document_merger.merge([d1, d2, d3])

merged_documents[0].content
# "Here's an introductory sentence. Let's continue on with some text. And finally, we'll end with a conclusion."
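To make the role of the separator concrete, here is a minimal sketch in plain Python that mimics only the content-joining step. It is an illustration under stated assumptions, not the library's implementation: `SimpleDoc` and `merge_contents` are hypothetical names introduced here, and any handling of Document metadata is left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SimpleDoc:
    """Hypothetical stand-in for a Document; holds only text content."""
    content: str


def merge_contents(documents: List[SimpleDoc], separator: str = " ") -> List[SimpleDoc]:
    """Join the content of all documents and return it as a single-item list."""
    merged_text = separator.join(doc.content for doc in documents)
    return [SimpleDoc(content=merged_text)]


docs = [SimpleDoc("First."), SimpleDoc("Second."), SimpleDoc("Third.")]

# The same inputs merged with two different separators
space_joined = merge_contents(docs, separator=" ")
newline_joined = merge_contents(docs, separator="\n")

print(space_joined[0].content)
print(repr(newline_joined[0].content))
```

A newline separator keeps the original Document boundaries visible in the merged text, which may matter if a downstream node is sensitive to paragraph breaks.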

To use a DocumentMerger within a pipeline, run:

from haystack import Pipeline

# Assumes `retriever` and `summarizer` have already been initialized
pipeline = Pipeline()
pipeline.add_node(component=retriever, name="Retriever", inputs=["Query"])
pipeline.add_node(component=document_merger, name="DocumentMerger", inputs=["Retriever"])
pipeline.add_node(component=summarizer, name="Summarizer", inputs=["DocumentMerger"])
summary = pipeline.run(query="Political parties of Australia")