
PineconeDocumentStore

Use a Pinecone vector database with Haystack.

Pinecone is a cloud-based vector database. It is fast and easy to use.
Unlike other solutions (such as Qdrant and Weaviate), it can't run locally on the user's machine, but it provides a generous free tier.

Installation

Install the Pinecone Haystack integration with:

pip install pinecone-haystack

Initialization

  • To use Pinecone as a Document Store in Haystack, sign up for a free Pinecone account and get your API key.
    The Pinecone API key can be explicitly provided or automatically read from the environment variable PINECONE_API_KEY (recommended).
  • In Haystack, each PineconeDocumentStore operates in a specific namespace of an index. If not provided, both index and namespace are set to "default".
    If the index already exists, the Document Store connects to it. Otherwise, it creates a new index.
  • When creating a new index, you can provide a spec in the form of a dictionary. This allows choosing between serverless and pod deployment options and setting additional parameters. Refer to the Pinecone documentation for more details. If not provided, a default spec with serverless deployment in the us-east-1 region will be used (compatible with the free tier).
  • You can provide dimension and metric, but they are only taken into account if the Pinecone index does not already exist.
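
The spec dictionary mentioned above can be sketched as follows. This is a minimal illustration, not part of the integration: the helper names serverless_spec and pod_spec are hypothetical, and the keys used for the pod variant (environment, pod_type, pods) and their example values are assumptions based on Pinecone's pod options, so check the Pinecone documentation before relying on them.

```python
from typing import Any, Dict


def serverless_spec(cloud: str = "aws", region: str = "us-east-1") -> Dict[str, Any]:
    """Build a serverless spec; the defaults mirror the default spec described above."""
    return {"serverless": {"cloud": cloud, "region": region}}


def pod_spec(environment: str, pod_type: str = "p1.x1", pods: int = 1) -> Dict[str, Any]:
    """Build a pod-based spec (hypothetical helper; keys and values are assumptions)."""
    return {"pod": {"environment": environment, "pod_type": pod_type, "pods": pods}}


default_spec = serverless_spec()
example_pod_spec = pod_spec(environment="us-west1-gcp")

print(default_spec)
print(example_pod_spec)
```

Either dictionary could then be passed as the spec argument when the Document Store is created. It only has an effect if the index does not already exist.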

Then, you can use the Document Store like this:

from haystack import Document
from haystack_integrations.document_stores.pinecone import PineconeDocumentStore

# Make sure you have the PINECONE_API_KEY environment variable set
document_store = PineconeDocumentStore(
    index="default",
    namespace="default",
    dimension=5,
    metric="cosine",
    spec={"serverless": {"region": "us-east-1", "cloud": "aws"}},
)

document_store.write_documents([
    Document(content="This is first", embedding=[0.0]*5),
    Document(content="This is second", embedding=[0.1, 0.2, 0.3, 0.4, 0.5]),
])
print(document_store.count_documents())

Supported Retrievers

PineconeEmbeddingRetriever: Retrieves documents from the PineconeDocumentStore based on their dense embeddings (vectors).
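
To make "retrieval based on dense embeddings" concrete, here is a small library-independent sketch of ranking documents by cosine similarity, the metric used in the example above. It does not call Pinecone or PineconeEmbeddingRetriever; in practice the search runs inside Pinecone. The function names and toy vectors are made up for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Cosine is undefined for zero vectors; returning 0.0 is a choice made for this sketch.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_cosine(
    query: Sequence[float], docs: Dict[str, Sequence[float]], top_k: int = 10
) -> List[Tuple[str, float]]:
    """Return the top_k (content, score) pairs, highest score first."""
    scored = [(content, cosine_similarity(query, emb)) for content, emb in docs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 5-dimensional embeddings, matching dimension=5 in the example above.
docs = {
    "This is first": [1.0, 0.0, 0.0, 0.0, 0.0],
    "This is second": [0.1, 0.2, 0.3, 0.4, 0.5],
}
query_embedding = [0.1, 0.2, 0.3, 0.4, 0.5]

ranked = rank_by_cosine(query_embedding, docs, top_k=2)
for content, score in ranked:
    print(f"{score:.3f}  {content}")
```

Because the query vector equals the second document's embedding, that document ranks first with a score of 1 (up to floating-point rounding).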