STACKITDocumentEmbedder

This component enables document embedding using the STACKIT API.

Most common position in a pipeline: Before a DocumentWriter in an indexing pipeline
Mandatory init variables: "model": The model used through the STACKIT API
Mandatory run variables: "documents": A list of documents to be embedded
Output variables: "documents": A list of documents enriched with embeddings
API reference: STACKIT
GitHub link: https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/stackit

Overview

STACKITDocumentEmbedder computes embeddings for documents using embedding models served by STACKIT through its API.

Parameters

To use the STACKITDocumentEmbedder, set your API key in the STACKIT_API_KEY environment variable. Alternatively, set the api_key parameter using Haystack’s secret management, either to read the key from an environment variable with a different name or to supply it as a token.
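As a minimal sketch of the alternative described above, the helper below reads the key from a custom environment variable. The function name build_embedder and the variable name MY_STACKIT_KEY are hypothetical, chosen for illustration; Secret.from_env_var comes from Haystack's secret management, and the model name is the one used in the examples on this page.

```python
import os


def build_embedder(env_var: str = "STACKIT_API_KEY"):
    """Create a STACKITDocumentEmbedder that reads its API key from `env_var`.

    `build_embedder` is a hypothetical helper, not part of the integration.
    """
    # Fail early with a clear message if the variable is missing or empty.
    if not os.environ.get(env_var):
        raise RuntimeError(f"Environment variable {env_var} is not set")

    # Imported lazily so the check above runs even before the packages are installed.
    from haystack.utils import Secret
    from haystack_integrations.components.embedders.stackit import STACKITDocumentEmbedder

    return STACKITDocumentEmbedder(
        model="intfloat/e5-mistral-7b-instruct",
        api_key=Secret.from_env_var(env_var),
    )
```

With the key exported as MY_STACKIT_KEY, calling build_embedder("MY_STACKIT_KEY") should return a ready-to-use component. To pass the key as a token instead, Secret.from_token can be used in place of Secret.from_env_var.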

Select a supported model with the model parameter when initializing the component. See the full list of supported models on the STACKIT website.

Optionally, you can change the default api_base_url, which is "https://api.openai-compat.model-serving.eu01.onstackit.cloud/v1".
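To illustrate what api_base_url refers to, here is a rough sketch of the request an OpenAI-compatible embeddings endpoint typically expects. The /embeddings route and the model and input fields follow the OpenAI API convention; whether STACKIT's service matches this exactly is an assumption based on the "openai-compat" host name. The helper build_embeddings_request is hypothetical and only constructs the request, it does not send it.

```python
import json

# Default base URL quoted in the text above.
DEFAULT_BASE_URL = "https://api.openai-compat.model-serving.eu01.onstackit.cloud/v1"


def build_embeddings_request(texts, model, api_base_url=DEFAULT_BASE_URL):
    """Return the (url, json_body) pair for an OpenAI-style embeddings call.

    Assumption: the service exposes `<api_base_url>/embeddings` and accepts
    a JSON body with `model` and `input`, as the OpenAI API does.
    """
    # Tolerate a trailing slash on a custom base URL.
    url = api_base_url.rstrip("/") + "/embeddings"
    body = json.dumps({"model": model, "input": list(texts)})
    return url, body


url, body = build_embeddings_request(["I love pizza!"], "intfloat/e5-mistral-7b-instruct")
print(url)
```

In normal use the component builds and sends this request for you; changing api_base_url only changes the host the request goes to.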

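Unlike the STACKIT chat generator, this component does not take text generation parameters such as generation_kwargs. Embedding options such as prefix, suffix, batch_size, and meta_fields_to_embed are set when initializing the component; see the API reference for the full list.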

The component needs a list of documents as input to operate.

Usage

Install the stackit-haystack package to use the STACKITDocumentEmbedder and set an environment variable called STACKIT_API_KEY to your API key.

pip install stackit-haystack

On its own

from haystack import Document
from haystack_integrations.components.embedders.stackit import STACKITDocumentEmbedder

doc = Document(content="I love pizza!")

document_embedder = STACKITDocumentEmbedder(model="intfloat/e5-mistral-7b-instruct")

result = document_embedder.run([doc])
print(result["documents"][0].embedding)

# [0.0215301513671875, 0.01499176025390625, ...]

In a pipeline

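You can also use STACKITDocumentEmbedder together with a pipeline. The example below embeds the documents and writes them to a document store, then queries them with a retrieval pipeline.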

from haystack import Document
from haystack import Pipeline
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack_integrations.components.embedders.stackit import STACKITTextEmbedder, STACKITDocumentEmbedder
from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever

document_store = InMemoryDocumentStore()

documents = [Document(content="My name is Wolfgang and I live in Berlin"),
             Document(content="I saw a black horse running"),
             Document(content="Germany has many big cities")]

document_embedder = STACKITDocumentEmbedder(model="intfloat/e5-mistral-7b-instruct")
documents_with_embeddings = document_embedder.run(documents)['documents']
document_store.write_documents(documents_with_embeddings)

text_embedder = STACKITTextEmbedder(model="intfloat/e5-mistral-7b-instruct")

query_pipeline = Pipeline()
query_pipeline.add_component("text_embedder", text_embedder)
query_pipeline.add_component("retriever", InMemoryEmbeddingRetriever(document_store=document_store))
query_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")

query = "Where does Wolfgang live?"

result = query_pipeline.run({"text_embedder":{"text": query}})

print(result['retriever']['documents'][0])

# Document(id=..., content: 'My name is Wolfgang and I live in Berlin', score: ...)
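The example above calls the embedder directly before writing to the document store. As a minimal sketch of the position named at the top of this page (before a DocumentWriter in an indexing pipeline), the two steps can also be wired together. The helper build_indexing_pipeline and the component names "document_embedder" and "document_writer" are hypothetical; Pipeline and DocumentWriter are standard Haystack classes.

```python
def build_indexing_pipeline(document_store, model="intfloat/e5-mistral-7b-instruct"):
    """Wire STACKITDocumentEmbedder in front of a DocumentWriter.

    `build_indexing_pipeline` is a hypothetical helper for illustration.
    Requires haystack-ai, stackit-haystack, and STACKIT_API_KEY to be set.
    """
    # Imported lazily so that defining this helper does not require the packages.
    from haystack import Pipeline
    from haystack.components.writers import DocumentWriter
    from haystack_integrations.components.embedders.stackit import STACKITDocumentEmbedder

    pipeline = Pipeline()
    pipeline.add_component("document_embedder", STACKITDocumentEmbedder(model=model))
    pipeline.add_component("document_writer", DocumentWriter(document_store=document_store))
    # The embedder's enriched documents become the writer's input.
    pipeline.connect("document_embedder.documents", "document_writer.documents")
    return pipeline
```

With the document_store and documents from the example above, this would be run as build_indexing_pipeline(document_store).run({"document_embedder": {"documents": documents}}), replacing the separate run and write_documents calls.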

You can find more usage examples in the STACKIT integration repository and its integration page.