Writes Documents to a DocumentStore.
Module document_writer
DocumentWriter
@component
class DocumentWriter()
Writes documents to a DocumentStore.
Usage example:
from haystack import Document
from haystack.components.writers import DocumentWriter
from haystack.document_stores.in_memory import InMemoryDocumentStore
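docs = [
    Document(content="Python is a popular programming language"),
]
doc_store = InMemoryDocumentStore()
# Write through the component rather than calling the store directly.
writer = DocumentWriter(document_store=doc_store)
writer.run(documents=docs)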
DocumentWriter.__init__
def __init__(document_store: DocumentStore,
             policy: DuplicatePolicy = DuplicatePolicy.NONE)
Create a DocumentWriter component.
Arguments:
document_store
: The DocumentStore where the documents are to be written.
policy
: The policy to apply when a Document with the same id already exists in the DocumentStore.
DuplicatePolicy.NONE
: Default policy, behaviour depends on the Document Store.
DuplicatePolicy.SKIP
: If a Document with the same id already exists, it is skipped and not written.
DuplicatePolicy.OVERWRITE
: If a Document with the same id already exists, it is overwritten.
DuplicatePolicy.FAIL
: If a Document with the same id already exists, an error is raised.
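The three explicit policies can be illustrated with a small stand-alone model. This is a sketch in plain Python, not Haystack code: ToyPolicy, ToyStore, and DuplicateError are hypothetical names, and NONE is left out because its behaviour depends on the Document Store.

```python
from enum import Enum


class ToyPolicy(Enum):
    """Hypothetical stand-in for DuplicatePolicy (NONE omitted: store-specific)."""
    SKIP = "skip"
    OVERWRITE = "overwrite"
    FAIL = "fail"


class DuplicateError(Exception):
    """Hypothetical error raised by the toy store under ToyPolicy.FAIL."""


class ToyStore:
    """Hypothetical id -> content mapping; not a Haystack DocumentStore."""

    def __init__(self):
        self.docs = {}

    def write(self, docs, policy):
        """Write (id, content) pairs and return how many were stored."""
        written = 0
        for doc_id, content in docs:
            if doc_id in self.docs:
                if policy is ToyPolicy.SKIP:
                    continue  # keep the existing document untouched
                if policy is ToyPolicy.FAIL:
                    raise DuplicateError(f"id {doc_id!r} already exists")
                # ToyPolicy.OVERWRITE falls through and replaces the content
            self.docs[doc_id] = content
            written += 1
        return written


store = ToyStore()
first = store.write([("a", "old")], ToyPolicy.FAIL)
skipped = store.write([("a", "new"), ("b", "other")], ToyPolicy.SKIP)
overwritten = store.write([("a", "new")], ToyPolicy.OVERWRITE)
```

In this model a skipped document does not count toward the returned total, so the count reflects only what actually reached the store.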
DocumentWriter.to_dict
def to_dict() -> Dict[str, Any]
Serializes the component to a dictionary.
Returns:
Dictionary with serialized data.
DocumentWriter.from_dict
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> "DocumentWriter"
Deserializes the component from a dictionary.
Arguments:
data
: The dictionary to deserialize from.
Raises:
DeserializationError
: If the document store is not properly specified in the serialization data or its type cannot be imported.
Returns:
The deserialized component.
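The failure condition described under Raises can be pictured with a short sketch. It assumes, purely for illustration, that the serialized document store carries a dotted import path under a "type" key. The names ToyDeserializationError and resolve_store_type are hypothetical and are not part of Haystack.

```python
import importlib
from typing import Any, Dict


class ToyDeserializationError(Exception):
    """Hypothetical stand-in for DeserializationError."""


def resolve_store_type(store_data: Dict[str, Any]) -> type:
    """Import the class named by the assumed 'type' key of store_data."""
    dotted_path = store_data.get("type")
    if not isinstance(dotted_path, str):
        # The store is "not properly specified": no usable type entry.
        raise ToyDeserializationError("missing 'type' entry")
    module_name, _, class_name = dotted_path.rpartition(".")
    try:
        module = importlib.import_module(module_name)
        return getattr(module, class_name)
    except (ImportError, AttributeError, ValueError) as exc:
        # The type "cannot be imported": bad module or missing attribute.
        raise ToyDeserializationError(f"cannot import {dotted_path!r}") from exc


resolved = resolve_store_type({"type": "collections.OrderedDict"})
```

Both failure modes surface as the same exception type here, which mirrors the single error listed above.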
DocumentWriter.run
@component.output_types(documents_written=int)
def run(documents: List[Document], policy: Optional[DuplicatePolicy] = None)
Run the DocumentWriter on the given input data.
Arguments:
documents
: A list of documents to write to the store.
policy
: The policy to use when encountering duplicate documents.
Raises:
ValueError
: If the specified document store is not found.
Returns:
A dictionary with the key documents_written, holding the number of documents written.
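Because policy is optional in run, a per-call value can differ from the one given to __init__. The sketch below models one plausible reading, assumed here rather than stated in this reference: when policy is None, the constructor's policy is used. ToyWriter and fake_write are hypothetical names, and policies are plain strings to keep the example self-contained.

```python
from typing import Callable, Dict, List, Optional


class ToyWriter:
    """Hypothetical stand-in for DocumentWriter; not Haystack code."""

    def __init__(self, write_fn: Callable[[List[str], str], int], policy: str = "NONE"):
        self.write_fn = write_fn  # hypothetical hook standing in for the store
        self.policy = policy      # default policy fixed at construction time

    def run(self, documents: List[str], policy: Optional[str] = None) -> Dict[str, int]:
        # Assumption: a missing per-call policy falls back to the constructor's.
        effective = self.policy if policy is None else policy
        return {"documents_written": self.write_fn(documents, effective)}


seen = []  # records which policy each call ended up using


def fake_write(documents: List[str], policy: str) -> int:
    """Pretend to store the documents and report how many were written."""
    seen.append(policy)
    return len(documents)


writer = ToyWriter(fake_write, policy="SKIP")
default_result = writer.run(["doc one", "doc two"])
override_result = writer.run(["doc three"], policy="OVERWRITE")
```

The result is keyed by documents_written, matching the name declared in the output_types decorator on run.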