

You can deploy Haystack as a REST API. Using Haystack through an API is useful if you want to deploy NLP functionality in your web or mobile app. Learn how the API works and how to set it up.

The API uses a web server to receive HTTP requests and pass them on to a running instance of Haystack for processing, before returning Haystack results as an HTTP response.

The diagram below illustrates how the Haystack REST API is structured:

A diagram illustrating the structure of the Haystack REST API. The user sends an HTTP request to the gunicorn HTTP server. The server then sends the question to Haystack, which communicates with a datastore, such as Elasticsearch or Weaviate. Haystack then sends the answer back to the gunicorn server, which returns it to the user in an HTTP response.

Background: Haystack Pipelines

The Haystack pipeline is at the core of Haystack’s functionality, whether Haystack is used directly through the Python bindings or through a REST API.

A pipeline is a sequence of nodes where each node performs a dedicated function, for example retrieving Documents from a DocumentStore or extracting an answer to a query from a text Document. Each node in a pipeline takes the output of the preceding node as its input.
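
To make the idea of chained nodes concrete, here is a minimal sketch in plain Python. It does not use Haystack classes: the `retriever`, `reader`, and `run_pipeline` names are hypothetical stand-ins, and the word-overlap matching is only a toy substitute for real retrieval.

```python
from typing import Callable, Dict, List

# Toy stand-ins for pipeline nodes; these are NOT Haystack classes.
Node = Callable[[Dict], Dict]


def retriever(data: Dict) -> Dict:
    """Keep only the documents that share at least one word with the query."""
    words = set(data["query"].lower().split())
    hits = [d for d in data["documents"] if words & set(d.lower().split())]
    return {**data, "documents": hits}


def reader(data: Dict) -> Dict:
    """Pick the first retrieved document as the 'answer'."""
    answer = data["documents"][0] if data["documents"] else None
    return {**data, "answer": answer}


def run_pipeline(nodes: List[Node], data: Dict) -> Dict:
    """Feed each node the output of the preceding node."""
    for node in nodes:
        data = node(data)
    return data


result = run_pipeline(
    [retriever, reader],
    {
        "query": "father of Arya",
        "documents": ["Ned is the father of Arya", "Winter is coming"],
    },
)
print(result["answer"])
```

The only point of the sketch is the loop in `run_pipeline`: every node receives what the previous node returned.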

The Haystack REST API exposes an HTTP interface for interacting with a pipeline. For instance, you can use the REST API to send a query submitted in the body of an HTTP request to the Haystack pipeline. Haystack then processes the request and returns the answer in an HTTP response.

When running Haystack as a REST API, you need to define the pipeline you'll use in the API as a YAML file. Check out the YAML File Definitions for more details.

The example Haystack pipeline is defined in the rest_api/rest_api/pipeline/pipelines.haystack-pipeline.yml file.

Setting up a REST API with Haystack

A simple Haystack API is already defined in the project’s default docker-compose.yml file. The easiest way to start this Haystack API is to clone the Haystack repository and then run docker-compose:

git clone https://github.com/deepset-ai/haystack.git
cd haystack
docker-compose up

docker-compose starts two Docker containers:

  • haystack-api: this container runs both Haystack and the HTTP API server.
  • elasticsearch: the datastore that backs the Haystack QA system in this example.



We recommend running Docker with at least 8 GB of RAM to make sure all containers run properly.

By default, the Haystack API container starts with a pipeline defined in the rest_api/rest_api/pipeline/pipelines.yaml file. If you want to direct the API container to use a YAML file at a different location, you can set the PIPELINE_YAML_PATH environment variable in docker-compose.yml.
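
The lookup described above can be pictured with a short sketch. This is not the code Haystack itself uses. The helper name `resolve_pipeline_yaml_path` is hypothetical, and treating an empty value as unset is an assumption. The default path is the one named in the text.

```python
import os
from pathlib import Path

# Default location taken from the text above.
DEFAULT_PIPELINE_YAML = "rest_api/rest_api/pipeline/pipelines.yaml"


def resolve_pipeline_yaml_path(environ=os.environ) -> Path:
    """Return the path from PIPELINE_YAML_PATH, or the default if it is unset or empty."""
    return Path(environ.get("PIPELINE_YAML_PATH") or DEFAULT_PIPELINE_YAML)


print(resolve_pipeline_yaml_path())
```

Passing the environment as an argument keeps the helper easy to check with a plain dictionary instead of the real process environment.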

Interacting with the Haystack API through Swagger Documentation

When you start the Haystack containers, the API server starts running by default. The API server includes a Swagger documentation site where you can view all endpoints available through the REST API.

A screenshot of the Swagger website with Haystack REST API

You can use the Swagger page to view the available endpoints and expected input and output formats for each endpoint. The Swagger site also includes the option to send sample API requests and inspect responses.


Hosted REST API Definitions

See Haystack REST API for definitions of the different endpoints available on the latest main version.

Running HTTP API without Docker

If you don't want to use Docker with your Haystack API, start the REST API server and the supporting Haystack pipeline by running the gunicorn server manually. Replace <host>:<port> with the address the server should bind to:

gunicorn rest_api.application:app -b <host>:<port> -k uvicorn.workers.UvicornWorker -t 300

This is the same command that’s used in the haystack-api container definition. Remember that you need a running Elasticsearch instance to run the Haystack API.

Indexing Documents in the Haystack REST API DocumentStore

This example uses an Elasticsearch container pre-loaded with articles about the Game of Thrones series. In a production environment, you’d start with either an existing or empty data store.

There are two options for indexing Documents when working with an empty data store. You can either use an indexing script or load files through another pipeline exposed through an API.

Option 1: Using an Indexing Script

You can use a Python script to load Documents into your data store. The script runs before the Haystack API is initialized. The Build Your First QA System tutorial contains an example script that preprocesses files and saves them into the DocumentStore.

If you go with an indexing script, this is what you need to do:

  1. Initialize the DocumentStore.
  2. Run the indexing script.
  3. Start the REST API.
  4. Submit queries.
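
The preprocessing part of step 2 could look roughly like the sketch below. It is an illustration, not the tutorial's script. The helper name `files_to_documents` and the one-dictionary-per-paragraph split are assumptions, and the `content`/`meta` dictionary layout is assumed to match what your DocumentStore accepts.

```python
from pathlib import Path
from typing import Dict, List


def files_to_documents(folder: str) -> List[Dict]:
    """Turn every .txt file in `folder` into one dictionary per non-empty paragraph."""
    documents = []
    for path in sorted(Path(folder).glob("*.txt")):
        text = path.read_text(encoding="utf-8")
        for paragraph in text.split("\n\n"):
            paragraph = paragraph.strip()
            if paragraph:
                documents.append({"content": paragraph, "meta": {"name": path.name}})
    return documents


# Assumed next step in a real indexing script (not run here), given an
# already initialized DocumentStore object named `document_store`:
#   document_store.write_documents(files_to_documents("data/"))
```

Keeping the file handling separate from the DocumentStore call lets you check the preprocessing without a running Elasticsearch instance.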

Option 2: Using the REST API

When using the indexing pipeline, here is what you need to do:

  1. Initialize the DocumentStore.
  2. Start the REST API.
  3. Add Documents through the index endpoint.
  4. Submit queries.

The example pipelines.yaml file defines an indexing pipeline. The pipeline processes text and PDF files submitted to the file-upload API endpoint. If you submit an API request to the indexing endpoint, the indexing pipeline processes the submitted content, loads it into Elasticsearch, and makes it available to the query pipeline.

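Here’s how you can add a Document through the indexing API by using cURL to make the HTTP request. Replace <file-upload-endpoint-URL> with the address of the file-upload endpoint on your API server:

curl -X POST -H 'Accept: application/json' -F files='@/Users/alexey/Downloads/Sansa Stark - Wikipedia.pdf' '<file-upload-endpoint-URL>'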



You can use any other HTTP request tool instead of cURL. We recommend referring to the Swagger documentation page for complete examples of file-upload requests, as putting them together by hand with cURL can prove difficult.
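
If you prefer Python over cURL, the same kind of upload can be sketched with the standard library alone. The helper name `build_upload_request` is hypothetical, the URL in the usage comment is a placeholder, and the `files` form field name is taken from the cURL example. The sketch only builds the request and sends nothing. It also assumes the file name contains no double quotes.

```python
import urllib.request
import uuid


def build_upload_request(url: str, filename: str, content: bytes) -> urllib.request.Request:
    """Build a multipart/form-data POST carrying one file in the 'files' field."""
    boundary = uuid.uuid4().hex
    body = b"".join(
        [
            f"--{boundary}\r\n".encode(),
            f'Content-Disposition: form-data; name="files"; filename="{filename}"\r\n'.encode(),
            b"Content-Type: application/octet-stream\r\n\r\n",
            content,
            f"\r\n--{boundary}--\r\n".encode(),
        ]
    )
    headers = {
        "Accept": "application/json",
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Usage (placeholder URL, not executed here):
#   with open("document.pdf", "rb") as f:
#       request = build_upload_request("<file-upload-endpoint-URL>", "document.pdf", f.read())
#   with urllib.request.urlopen(request) as response:
#       print(response.read().decode())
```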

Querying the Haystack REST API

After adding Documents to the DocumentStore, call the API endpoints directly to retrieve answers. When working with the API’s JSON output, use jq to make the output easier to read. Let’s try querying the Haystack API directly without using the UI. Here’s an example of a query with cURL and jq. Replace <query-endpoint-URL> with the address of the query endpoint on your API server:

curl -X 'POST' \
  '<query-endpoint-URL>' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "query": "Who is the father of Arya Stark?",
  "params": {}
}' | jq .



To learn how to pass parameters into your pipeline, see the Pipeline Arguments.
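
A query can also be assembled from Python. The sketch below uses only the standard library and mirrors the headers and JSON body of the cURL example. The helper name `build_query_request` is hypothetical and the URL in the usage comment is a placeholder. Nothing is sent until you pass the request to `urllib.request.urlopen`.

```python
import json
import urllib.request
from typing import Any, Dict, Optional


def build_query_request(
    url: str, query: str, params: Optional[Dict[str, Any]] = None
) -> urllib.request.Request:
    """Build a JSON POST request with the 'query' and 'params' fields."""
    payload = {"query": query, "params": params or {}}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


# Usage (placeholder URL, not executed here):
#   request = build_query_request("<query-endpoint-URL>", "Who is the father of Arya Stark?")
#   with urllib.request.urlopen(request) as response:
#       print(json.dumps(json.load(response), indent=2))
```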

You get the same information in the response as what you previously saw in the UI. Here is an excerpt:

   "query":"Who is the father of Arya Stark?",
         "context":"\n====Season 1====\nArya accompanies her father Ned and her sister Sansa to King's Landing. Before their departure, Arya's half-brother Jon Snow gifts A",
         "content":"\n===In the Riverlands===\nThe Stark army reaches the Twins, a bridge stronghold controlled by Walder Frey, who agrees to allow the army to cross the river and to commit his troops in return for Robb and Arya Stark marrying two of his children.\nTyrion Lannister suspects his father Tywin, who decides Tyrion and his barbarians will fight in the vanguard, wants him killed...",

Building a Custom API Endpoint

Existing API endpoints are defined using FastAPI route methods, for example in the rest_api/rest_api/controller/search.py file.

You can add custom endpoints to the Haystack API by defining new routes with the FastAPI methods. New endpoints are handy if you want to make multiple pipelines available under different API endpoints, add custom metrics, or modify how parameters are passed to the pipelines.

Check out the FastAPI intro tutorial for details on how to use FastAPI methods.
