Kubernetes

Learn how to deploy your Haystack Pipelines through Kubernetes.

The best way to get Haystack running as a workload in a container orchestrator like Kubernetes is to create a service to expose one or more Hayhooks instances.

Create a Haystack Service

A single Hayhooks instance can serve multiple Pipelines. As a first step, it might be beneficial to start with a simple service running a single pod:

kind: Pod
apiVersion: v1
metadata:
  name: hayhooks
  labels:
    app: haystack
spec:
  containers:
  # We use the unstable release to keep this example fresh and relevant.
  - image: deepset/hayhooks:main
    name: hayhooks
    # Since we're using the moving tag `main`, we have to always pull
    # to be sure we're getting the latest.
    imagePullPolicy: Always

---

kind: Service
apiVersion: v1
metadata:
  name: haystack-service
spec:
  selector:
    app: haystack
  ports:
  # Default port used by the Hayhooks Docker image
  - port: 1416

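Because no type is specified, haystack-service is a ClusterIP service and is reachable only from inside the cluster. If you have an ingress properly configured in front of it, the service will expose the Hayhooks API to external clients. From there, you can manage and run Pipelines. See the Hayhooks docs for more details.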
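As an illustration of what "properly configured" could look like, here is a minimal Ingress sketch that routes requests to haystack-service. It assumes the NGINX ingress controller is installed in the cluster; the resource name haystack-ingress is hypothetical, and the host, path, and rewrite annotation simply mirror the defaults listed in the Chart Values section of this page. Adapt it to whatever controller your cluster actually runs.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  # Hypothetical name, chosen for this example only
  name: haystack-ingress
  annotations:
    # Assumes the NGINX ingress controller: strips the /haystack prefix
    # before the request is forwarded to Hayhooks
    nginx.ingress.kubernetes.io/rewrite-target: /$2
spec:
  rules:
  - host: localhost
    http:
      paths:
      - path: /haystack(/|$)(.*)
        pathType: ImplementationSpecific
        backend:
          service:
            # The Service defined above
            name: haystack-service
            port:
              number: 1416
```

With a setup like this, a request to /haystack/status on the ingress host would be forwarded to /status on one of the Hayhooks pods.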

Auto-Run Pipelines at Pod Start

Hayhooks can load Haystack Pipelines at startup, so they are available as soon as the server is up. We can use this mechanism to have our pods serve one or more Pipelines immediately after they start.

Since Pipeline YAML definitions are rather small, we can use a ConfigMap to make them available to our pods. In this example, we create the config map directly from files, assuming you have a local folder ./pipelines containing one or more Pipeline definitions:

kubectl create configmap pipelines --from-file=pipelines

You can double-check if the config map was correctly created with:

kubectl describe configmap pipelines
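If you prefer to keep everything declarative, the same config map can be written as a manifest and applied with kubectl apply -f. The sketch below is only a skeleton: the file name my_pipeline.yml is hypothetical, and the body under it is a placeholder that you would replace with a real serialized Pipeline definition.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  # Must match the name referenced by the Pod's volume definition
  name: pipelines
data:
  # Each key becomes a file in the mounted directory.
  # The file name is hypothetical; use one key per Pipeline.
  my_pipeline.yml: |
    # Placeholder: paste the serialized Haystack Pipeline here.
    components: {}
    connections: []
```

Creating the config map from a folder with --from-file produces the same structure, with one key per file in the folder.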

The next step is to mount the config map pipelines as a volume in our pods (see the Kubernetes docs for more details) by changing the Pod resource definition from the previous example like this:

kind: Pod
apiVersion: v1
metadata:
  name: hayhooks
  labels:
    app: haystack
spec:
  containers:
  # We use the unstable release to keep this example fresh and relevant.
  - image: deepset/hayhooks:main
    name: hayhooks
    # Since we're using the moving tag `main`, we have to always pull
    # to be sure we're getting the latest.
    imagePullPolicy: Always
    # Mount the ConfigMap containing the pipelines under
    # /opt/pipelines in the container
    volumeMounts:
      - name: config-volume
        mountPath: /opt/pipelines
    # Instruct Hayhooks that the pipelines we want to run at startup
    # will be found under /opt/pipelines
    env:
    - name: HAYHOOKS_PIPELINES_DIR
      value: /opt/pipelines
  volumes:
    - name: config-volume
      configMap:
        name: pipelines

Roll Out Multiple Pods

Haystack Pipelines are usually stateless – a perfect use case for distributing the requests to multiple pods running the same set of Pipelines. The easiest way to tell Kubernetes to roll out multiple replicas is through a Deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: haystack-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: haystack
  template:
    metadata:
      labels:
        app: haystack
    spec:
      containers:
      # We use the unstable release to keep this example fresh and relevant.
      - image: deepset/hayhooks:main
        name: hayhooks
        # Since we're using the moving tag `main`, we have to always pull
        # to be sure we're getting the latest.
        imagePullPolicy: Always
        # Mount the ConfigMap containing the pipelines under
        # /opt/pipelines in the container
        volumeMounts:
          - name: config-volume
            mountPath: /opt/pipelines
        # Instruct Hayhooks that the pipelines we want to run at startup
        # will be found under /opt/pipelines
        env:
        - name: HAYHOOKS_PIPELINES_DIR
          value: /opt/pipelines
      volumes:
        - name: config-volume
          configMap:
            name: pipelines

Applying the above configuration will create three pods. Each pod will run its own instance of Hayhooks, all serving the same Pipelines provided by the config map we created in the previous example.
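When several replicas sit behind one Service, it helps to let Kubernetes know when each pod is ready to receive traffic. The fragment below sketches how the container entry in the Deployment could be extended with probes. It assumes Hayhooks answers on /status on port 1416, which matches the probe defaults listed in the Chart Values section; the port name http is a label chosen for this example.

```yaml
# Fragment: replaces the container entry under
# spec.template.spec.containers in the Deployment above.
# volumeMounts and env stay as they are and are omitted for brevity.
- image: deepset/hayhooks:main
  name: hayhooks
  imagePullPolicy: Always
  ports:
  # Name the port so the probes can refer to it
  - name: http
    containerPort: 1416
  # Traffic is sent to the pod only after this check succeeds
  readinessProbe:
    httpGet:
      path: /status
      port: http
  # The container is restarted if this check keeps failing
  livenessProbe:
    httpGet:
      path: /status
      port: http
```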

Deploy With Helm

Haystack comes with a Helm chart that can be used to deploy a fully functional application on a Kubernetes cluster. The chart is distributed through the deepset repository, so the first thing to do is add it to Helm:

helm repo add deepset https://deepset-ai.github.io/charts/

To install the chart on a local cluster (such as KIND or Minikube) with the default values, you need to give a name to your application (for example, full-coral) and invoke Helm like this:

helm install full-coral deepset/hayhooks

The chart will create a ConfigMap you can use to store Pipeline definitions that will be automatically served when a Hayhooks pod starts. To update the config map when the application has already started, you can run:

kubectl create configmap full-coral-default-pipelines --from-file=your/pipelines/dir -o yaml --dry-run=client | kubectl replace -f -

Then, restart the application so it will pick up the Pipelines:

kubectl rollout restart deployment full-coral-haystack-deployment

Chart Values

Value                                       Type    Default
replicaCount                                int     1
image.repository                            string  deepset/hayhooks
image.pullPolicy                            string  Always
image.tag                                   string  main
pipelinesDirMount                           string  /opt/pipelines
nameOverride                                string  ""
fullnameOverride                            string  ""
podAnnotations                              object  {}
podLabels                                   object  {}
podSecurityContext                          object  {}
securityContext                             object  {}
service.port                                int     1416
ingress.enabled                             bool    true
ingress.className                           string  ""
ingress.annotations                         string  nginx.ingress.kubernetes.io/rewrite-target: /$2
ingress.hosts                               list    [ host: localhost paths: - path: /haystack(/|$)(.*) pathType: ImplementationSpecific ]
ingress.tls                                 list    []
resources                                   object  {}
livenessProbe.httpGet.path                  string  /status
livenessProbe.httpGet.port                  string  http
readinessProbe.httpGet.path                 string  /status
readinessProbe.httpGet.port                 string  http
autoscaling.enabled                         bool    false
autoscaling.minReplicas                     int     1
autoscaling.maxReplicas                     int     100
autoscaling.targetCPUUtilizationPercentage  int     80
volumes                                     list    []
volumeMounts                                list    []
nodeSelector                                object  {}
tolerations                                 list    []
affinity                                    object  {}
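
To change any of these defaults, you can pass your own values file to Helm. The following is a small sketch that only uses keys from the list above; the file name values.yaml and the specific numbers are arbitrary choices for illustration.

```yaml
# values.yaml (hypothetical file name)
# Run three Hayhooks pods instead of one
replicaCount: 3
image:
  # Same moving tag as the chart default, spelled out for clarity
  tag: main
  pullPolicy: Always
ingress:
  # Skip the bundled ingress, for example when the cluster
  # has no ingress controller installed
  enabled: false
```

You could then install the chart with helm install full-coral deepset/hayhooks -f values.yaml, or apply the same file to an existing release with helm upgrade.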