Telemetry

Haystack relies on anonymous usage statistics to support continuous software improvements for all its users. To this end, a limited set of event messages is shared automatically, for example, which kind of DocumentStore is used. The sections below explain how it works.

What Information Is Shared?

Telemetry in Haystack comprises anonymous usage statistics of base components, such as DocumentStore, Retriever, Reader, or any other pipeline node. An event is sent when one of these components is initialized so that we can learn which of them are most relevant to the community. For the same reason, an event is sent when one of the tutorials is executed. Each event contains an anonymous user ID, which is a randomly generated UUID. There is no way to infer your identity from the user ID or any other content of the event.

To prevent revealing a user's identity, telemetry will never use the following properties:

  • IP addresses
  • Hostnames
  • File paths
  • Queries
  • Document contents

Here is an example of the event that is sent when tutorial 1 is executed by running Tutorial1_Basic_QA_Pipeline.py:

{
    "event": "tutorial 1 executed",
    "distinct_id": "9baab867-3bc8-438c-9974-a192c9d53cd1",
    "properties": {
        "os_family": "Darwin",
        "os_machine": "arm64",
        "os_version": "21.3.0",
        "haystack_version": "1.0.0",
        "python_version": "3.9.6",
        "torch_version": "1.9.0",
        "transformers_version": "4.13.0",
        "execution_env": "script",
        "n_gpu": 0
    }
}
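
To make the shape of such an event more concrete, here is a minimal sketch in Python that assembles a payload with the same top-level fields. It is an illustration only, not Haystack's actual implementation: the helper name build_event is hypothetical, and the use of the standard platform and uuid modules to fill in the values is an assumption.

```python
import json
import platform
import uuid


def build_event(event_name: str, user_id: str) -> dict:
    """Hypothetical helper: assemble a payload shaped like the example event."""
    return {
        "event": event_name,
        "distinct_id": user_id,
        "properties": {
            # Assumption: OS metadata is read via the standard platform module.
            "os_family": platform.system(),
            "os_machine": platform.machine(),
            "os_version": platform.release(),
            "python_version": platform.python_version(),
        },
    }


# A randomly generated UUID carries no information about the user or machine.
anonymous_id = str(uuid.uuid4())
event = build_event("tutorial 1 executed", anonymous_id)
print(json.dumps(event, indent=4))
```

Note that nothing in this payload comes from IP addresses, hostnames, file paths, queries, or document contents.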

How Does Telemetry Help?

Telemetry allows us to understand the needs of the community. "What pipeline nodes are most popular?", "Should we focus on supporting one specific DocumentStore?", and "How many people use Haystack on Windows?" are examples of questions that telemetry helps us answer and that steer the development process. Metadata about the operating system and installed dependency versions helps us quickly identify and address issues caused by specific environment setups. By sharing this information, you enable us to continuously improve Haystack for all its users.

How Can I Opt Out?

You can disable telemetry by setting the environment variable HAYSTACK_TELEMETRY_ENABLED to "False".

If you are using a bash shell, you can add the following line to the file ~/.bashrc to permanently disable telemetry: export HAYSTACK_TELEMETRY_ENABLED=False. If you are using zsh as your shell, for example, on macOS, you need to add that line to the file ~/.zshrc.

To permanently disable telemetry on Windows, set a user-level environment variable by running this command in the standard command prompt: setx HAYSTACK_TELEMETRY_ENABLED "False". If you are using Windows PowerShell, the command is: [Environment]::SetEnvironmentVariable("HAYSTACK_TELEMETRY_ENABLED","False","User").

Note that you might need to restart your shell for the command to take effect.
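
If you prefer to opt out for a single Python process, for example in a script or notebook, the same variable can be set from Python. The sketch below assumes that the variable must be set before Haystack is imported for it to be picked up. The helper is_opted_out is hypothetical and only checks the documented value "False"; it is not part of Haystack.

```python
import os

# Assumption: set this before importing haystack so the setting is in place
# by the time any Haystack component is initialized.
os.environ["HAYSTACK_TELEMETRY_ENABLED"] = "False"


def is_opted_out(environ=os.environ) -> bool:
    """Hypothetical check: True if the documented opt-out value is set."""
    return environ.get("HAYSTACK_TELEMETRY_ENABLED") == "False"


print("Telemetry opt-out set:", is_opted_out())
```

Unlike the shell configuration above, this setting lasts only for the current process and any child processes it starts.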