Docker
Learn how to deploy your Haystack pipelines through Docker starting from the basic Docker container to a complex application.
Running Haystack in Docker
The most basic form of Haystack deployment happens through Docker containers. Becoming familiar with running and customizing Haystack Docker images is useful as they form the basis for more advanced deployment.
Haystack releases are officially distributed through the deepset/haystack Docker image. Haystack images come in different flavors depending on the specific components they ship and the Haystack version.
At the moment, the only flavor available for Haystack 2.0 is base, which ships exactly what you would get by installing Haystack locally with pip install haystack-ai.
You can pull a specific Haystack flavor using Docker tags. For example, to pull the image containing Haystack 2.0.0-beta.7, run the command:
docker pull deepset/haystack:base-v2.0.0-beta.7
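The tag in the command above appears to combine the flavor and the version as flavor-vVERSION. The following is a minimal Python sketch of that naming, assuming the pattern inferred from this single example holds for other releases. The helper name haystack_image is hypothetical and is not part of Haystack.

```python
def haystack_image(version: str, flavor: str = "base") -> str:
    """Build an image reference such as deepset/haystack:base-v2.0.0-beta.7.

    Assumption: tags follow "<flavor>-v<version>", inferred from the single
    example in this guide rather than from an official naming rule.
    """
    if not version or not flavor:
        raise ValueError("version and flavor must be non-empty")
    return f"deepset/haystack:{flavor}-v{version}"


if __name__ == "__main__":
    print(haystack_image("2.0.0-beta.7"))
```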
Although the base flavor is meant to be customized, it can also be used to quickly run Haystack scripts locally without the need to set up a Python environment and its dependencies. For example, this is how you would print Haystack’s version running a Docker container:
docker run -it --rm deepset/haystack:base-v2.0.0-beta.7 python -c "from haystack.version import __version__; print(__version__)"
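If you want to launch the same check from a script, the one-liner above can be expressed as an argument list. This is only a sketch: the function name version_check_command is hypothetical, and the -it flags are left out on the assumption that the script may run without a terminal attached, where requesting a TTY would fail.

```python
import shutil
import subprocess
from typing import List


def version_check_command(image: str) -> List[str]:
    """Return the argv that prints Haystack's version inside a container."""
    code = "from haystack.version import __version__; print(__version__)"
    # -it is omitted on purpose: no interactive terminal is assumed here.
    return ["docker", "run", "--rm", image, "python", "-c", code]


if __name__ == "__main__":
    cmd = version_check_command("deepset/haystack:base-v2.0.0-beta.7")
    if shutil.which("docker") is None:
        # Docker is not installed; show the command instead of running it.
        print("docker not found; would run:", cmd)
    else:
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
        print(result.stdout.strip())
```

Passing the Python snippet as its own list element avoids shell quoting issues.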
Customizing the Haystack Docker Image
Chances are your application will be more complex than a simple script, and you’re going to need to install additional dependencies inside the Docker image along with Haystack.
For example, you might want to run a simple indexing pipeline in a Docker container, with Chroma as your Document Store. The base image only contains a basic install of Haystack, so you also need to install the Chroma integration package (chroma-haystack). The best approach is to create a custom Docker image that ships the extra dependency.
Assuming you have a main.py script in your current folder, the Dockerfile would look like this:
FROM deepset/haystack:base-v2.0.0-beta.7
RUN pip install chroma-haystack
COPY ./main.py /usr/src/myapp/main.py
ENTRYPOINT ["python", "/usr/src/myapp/main.py"]
Then you can create your custom Haystack image with:
docker build . -t my-haystack-image
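Before adding a real pipeline, a placeholder main.py can confirm that the custom image contains the expected packages. The sketch below uses only the Python standard library; the helper name installed_version is hypothetical, and the distribution names are the ones used in the pip commands in this guide.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names taken from the pip install commands in this guide.
    for name in ("haystack-ai", "chroma-haystack"):
        found = installed_version(name)
        print(f"{name}: {found if found else 'not installed'}")
```

Running the container built from the Dockerfile above with this script as main.py should list a version for both packages if the installation succeeded.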
Complex Application with Docker Compose
One can go pretty far with a Haystack application running in Docker: with an internet connection available, the container can reach external services providing vector databases, inference endpoints, and observability features.
Still, you might want to orchestrate additional services for your Haystack container locally, for example, to reduce costs or increase performance. When your application runtime depends on more than one Docker container, Docker Compose is a great tool to keep everything together.
As an example, let’s say your application wraps two pipelines: one to index documents into a Qdrant instance and the other to query those documents at a later time. This setup would require two Docker containers: one to run the pipelines (for example, using Hayhooks) and a second to run a Qdrant instance.
The Haystack part of this application would run on a custom Docker image in order to fulfill the dependency on the QdrantDocumentStore, and the Dockerfile would look like this:
FROM deepset/haystack:base-v2.0.0-beta.7
EXPOSE 1416
RUN pip install qdrant-haystack hayhooks sentence-transformers
CMD ["hayhooks", "run", "--pipelines-dir", "/pipelines", "--host", "0.0.0.0"]
We wouldn’t need to customize Qdrant, so its official Docker image would work perfectly. The docker-compose.yml file would then look like this:
services:
  qdrant:
    image: qdrant/qdrant:latest
    ports:
      - 6333:6333
      - 6334:6334
    expose:
      - 6333
      - 6334
    configs:
      - source: qdrant_config
        target: /qdrant/config/production.yaml
    volumes:
      - ./qdrant_data:/qdrant_data
  hayhooks:
    image: deepset/hayhooks:main
    ports:
      - "1416:1416"
    volumes:
      - ./pipelines:/pipelines
configs:
  qdrant_config:
    content: |
      log_level: INFO
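On the default Compose network, containers can reach each other by service name, so code running in the hayhooks container would address Qdrant as qdrant on port 6333. Compose starts containers without waiting for the services inside them to be ready, so a small readiness check can help. The following is a minimal sketch using only the standard library; the function name wait_for_port and the timeout values are assumptions for illustration.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Not reachable yet: pause briefly, then retry until the deadline.
            time.sleep(0.2)
    return False


if __name__ == "__main__":
    # "qdrant" is the Compose service name; 6333 is one of its listed ports.
    ready = wait_for_port("qdrant", 6333)
    print("Qdrant reachable" if ready else "Qdrant not reachable")
```

From the host machine, the same check would target localhost instead of qdrant, since the ports are published there. An open port only shows that the process is listening, not that it is fully initialized.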
For a functional example of a Docker Compose deployment, check out the “Qdrant Indexing” demo on GitHub.