PineconeDocumentStore
Use a Pinecone vector database with Haystack.
API reference | Pinecone |
GitHub link | https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/pinecone |
Pinecone is a cloud-based vector database. It is fast and easy to use.
Unlike other solutions (such as Qdrant and Weaviate), it can’t run locally on the user's machine but provides a generous free tier.
Installation
Install the Pinecone Haystack integration with:
pip install pinecone-haystack
Initialization
- To use Pinecone as a Document Store in Haystack, sign up for a free Pinecone account and get your API key. The Pinecone API key can be explicitly provided or automatically read from the environment variable PINECONE_API_KEY (recommended).
- In Haystack, each PineconeDocumentStore operates in a specific namespace of an index. If not provided, both index and namespace are "default". If the index already exists, the Document Store connects to it. Otherwise, it creates a new index.
- When creating a new index, you can provide a spec in the form of a dictionary. This allows choosing between serverless and pod deployment options and setting additional parameters. Refer to the Pinecone documentation for more details. If not provided, a default spec with serverless deployment in the us-east-1 region will be used (compatible with the free tier).
- You can provide dimension and metric, but they are only taken into account if the Pinecone index does not already exist.
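The spec dictionary described in the list above can be sketched as follows. This is a minimal illustration in plain Python, not part of the integration: the helper names serverless_spec and pod_spec are hypothetical, and the pod keys (environment, pod_type) and the "p1.x1" default are assumptions based on Pinecone's pod options, so check the Pinecone documentation before relying on them. Only the serverless shape is confirmed by this page.

```python
from typing import Any, Dict


def serverless_spec(region: str = "us-east-1", cloud: str = "aws") -> Dict[str, Any]:
    """Build a serverless spec dictionary (hypothetical helper).

    The defaults mirror the serverless us-east-1 deployment that this page
    describes as the fallback when no spec is provided.
    """
    return {"serverless": {"region": region, "cloud": cloud}}


def pod_spec(environment: str, pod_type: str = "p1.x1") -> Dict[str, Any]:
    """Build a pod-based spec dictionary (hypothetical helper).

    ASSUMPTION: the key names "environment" and "pod_type" and the default
    pod type are not stated on this page; verify them in the Pinecone docs.
    """
    return {"pod": {"environment": environment, "pod_type": pod_type}}


# Either dictionary could be passed as the spec argument when the index
# does not exist yet.
default_spec = serverless_spec()
example_pod_spec = pod_spec(environment="us-east1-gcp")
```

Like dimension and metric, the spec only matters when a new index is created; an existing index keeps its deployment settings.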
Then, you can use the Document Store like this:
from haystack import Document
from haystack_integrations.document_stores.pinecone import PineconeDocumentStore

# Make sure you have the PINECONE_API_KEY environment variable set
document_store = PineconeDocumentStore(
    index="default",
    namespace="default",
    dimension=5,
    metric="cosine",
    spec={"serverless": {"region": "us-east-1", "cloud": "aws"}},
)

document_store.write_documents([
    Document(content="This is first", embedding=[0.0]*5),
    Document(content="This is second", embedding=[0.1, 0.2, 0.3, 0.4, 0.5]),
])

print(document_store.count_documents())
Supported Retrievers
PineconeEmbeddingRetriever: Retrieves documents from the PineconeDocumentStore based on their dense embeddings (vectors).
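To make "retrieval based on dense embeddings" concrete, here is a small conceptual sketch in plain Python of ranking documents by cosine similarity, the metric used in the example on this page. It does not call Pinecone or Haystack and is not how Pinecone computes results internally; the function names cosine_similarity and rank_by_cosine are hypothetical, and scoring a zero vector as 0.0 is a convention chosen for this sketch only.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Cosine similarity is undefined for a zero vector;
        # this sketch scores it as 0.0 by convention.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_cosine(
    query: Sequence[float],
    docs: List[Tuple[str, Sequence[float]]],
    top_k: int = 10,
) -> List[Tuple[str, float]]:
    """Return up to top_k (content, score) pairs, highest score first."""
    scored = [(content, cosine_similarity(query, emb)) for content, emb in docs]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# The two documents from the usage example on this page.
docs = [
    ("This is first", [0.0] * 5),
    ("This is second", [0.1, 0.2, 0.3, 0.4, 0.5]),
]
ranked = rank_by_cosine([0.1, 0.2, 0.3, 0.4, 0.5], docs, top_k=1)
```

Here the query vector matches the second document's embedding exactly, so that document ranks first. In practice the retriever delegates this search to the Pinecone index rather than scoring documents on the client.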