
Translator

The Translator does what it says on the tin: it translates text from one language into another. Have a look at examples of how to use the Translator in your search app.

Some translation models are language-specific while others are multilingual. See the Hugging Face Model Hub for a list of available models.

Position in a Pipeline: After preprocessing in an indexing pipeline or after the Retriever in a querying pipeline
Input: Documents
Output: Documents
Classes: TransformersTranslator

Usage

You can use the Translator component directly to translate your query or documents:

from haystack import Document
from haystack.nodes import TransformersTranslator

# Documents to translate (English source text)
DOCS = [
    Document(
        content="""Heinz von Foerster was an Austrian American scientist
                  combining physics and philosophy, and widely attributed
                  as the originator of Second-order cybernetics."""
    )
]

# English-to-French translation model from the Hugging Face Model Hub
translator = TransformersTranslator(model_name_or_path="Helsinki-NLP/opus-mt-en-fr")

# Translate the documents; query is None because there is no query to translate
res = translator.translate(documents=DOCS, query=None)
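
The model name used above encodes the translation direction: `opus-mt-en-fr` translates from English to French. As a minimal sketch, assuming the language pair you need follows the same naming pattern, a small helper can build the name from two language codes. The function `opus_mt_model_name` is hypothetical and not part of Haystack, and not every language pair has a published model, so check the Hugging Face Model Hub before relying on a generated name.

```python
def opus_mt_model_name(source_lang: str, target_lang: str) -> str:
    """Build a Helsinki-NLP OPUS-MT style model name for a language pair.

    Hypothetical helper: it only formats a string and does not check
    that a model with this name actually exists on the Model Hub.
    """
    return f"Helsinki-NLP/opus-mt-{source_lang.lower()}-{target_lang.lower()}"


print(opus_mt_model_name("en", "fr"))  # Helsinki-NLP/opus-mt-en-fr
print(opus_mt_model_name("fr", "en"))  # Helsinki-NLP/opus-mt-fr-en
```

The resulting string can be passed as `model_name_or_path` when creating a `TransformersTranslator`.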

Using the TranslationWrapperPipeline

Let's imagine you have an English corpus of technical docs, but the mother tongue of many of your users is French. You can use a Translator node in your pipeline to:

  • Translate the incoming query from French to English.
  • Search in your English corpus for the right document or answer.
  • Translate the results back from English to French.

from haystack.pipelines import TranslationWrapperPipeline, DocumentSearchPipeline
from haystack.nodes import TransformersTranslator

# my_dpr_retriever is assumed to be an already initialized Retriever over the English corpus
pipeline = DocumentSearchPipeline(retriever=my_dpr_retriever)

# French-to-English for incoming queries, English-to-French for outgoing results
in_translator = TransformersTranslator(model_name_or_path="Helsinki-NLP/opus-mt-fr-en")
out_translator = TransformersTranslator(model_name_or_path="Helsinki-NLP/opus-mt-en-fr")

pipeline_with_translation = TranslationWrapperPipeline(
    input_translator=in_translator,
    output_translator=out_translator,
    pipeline=pipeline,
)
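
To make the three steps concrete, here is a library-free sketch of the wrapping pattern. It is an illustration only, not how Haystack implements `TranslationWrapperPipeline`: the function `wrap_with_translation`, the toy dictionaries, and the substring search are all hypothetical stand-ins for real translation models and a real Retriever.

```python
from typing import Callable, List


def wrap_with_translation(
    search: Callable[[str], List[str]],
    translate_in: Callable[[str], str],
    translate_out: Callable[[str], str],
) -> Callable[[str], List[str]]:
    """Return a search function that translates the query in and the results out."""

    def wrapped(query: str) -> List[str]:
        corpus_query = translate_in(query)           # 1. translate the incoming query
        results = search(corpus_query)               # 2. search the corpus in its own language
        return [translate_out(r) for r in results]   # 3. translate the results back

    return wrapped


# Toy stand-ins (assumptions for illustration, not real models or a real index)
FR_TO_EN = {"chat": "cat"}
EN_TO_FR = {"the cat sleeps": "le chat dort"}
CORPUS = ["the cat sleeps", "the dog barks"]


def toy_search(query: str) -> List[str]:
    """Return every corpus entry that contains the query as a substring."""
    return [doc for doc in CORPUS if query in doc]


search_fr = wrap_with_translation(
    toy_search,
    translate_in=lambda q: FR_TO_EN.get(q, q),
    translate_out=lambda t: EN_TO_FR.get(t, t),
)

print(search_fr("chat"))  # ['le chat dort']
```

Keeping the two translators separate from the search step means the inner pipeline never needs to know that the user speaks a different language than the corpus.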