Converters
Converters extract data from files in different formats and cast it into the unified Document format. Converters are available for PDFs, images, DOCX files, and more.
Converter | Description |
---|---|
AzureOCRDocumentConverter | Converts PDF (both searchable and image-only), JPEG, PNG, BMP, TIFF, DOCX, XLSX, PPTX, and HTML to Documents. |
HTMLToDocument | Converts HTML files to Documents. |
MarkdownToDocument | Converts markdown files to Documents. |
OpenAPIServiceToFunctions | Transforms OpenAPI service specifications into a format compatible with OpenAI's function calling mechanism. |
OutputAdapter | Helps the output of one component fit into the input of another. |
PyPDFToDocument | Converts PDF files to Documents. |
TikaDocumentConverter | Converts various file types to Documents using Apache Tika. |
TextFileToDocument | Converts text files to Documents. |
UnstructuredFileConverter | Converts text files and directories to a Document. |
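To make the idea of "casting into a unified Document format" concrete, here is a minimal standard-library sketch. It is not Haystack's implementation: `Document`, `SketchTextFileConverter`, and `demo` are hypothetical names invented for illustration, and the `run(sources=...)` call returning a dictionary with a `documents` key is an assumption about how such a converter might be shaped. Check the API reference for the actual signatures.

```python
from dataclasses import dataclass, field
from pathlib import Path
import tempfile


@dataclass
class Document:
    """Hypothetical stand-in for a unified document: text plus metadata."""
    content: str
    meta: dict = field(default_factory=dict)


class SketchTextFileConverter:
    """Hypothetical converter that reads plain-text files into Documents."""

    def __init__(self, encoding: str = "utf-8"):
        self.encoding = encoding

    def run(self, sources):
        # Assumed interface: take a list of paths, return {"documents": [...]}.
        documents = []
        for source in sources:
            path = Path(source)
            text = path.read_text(encoding=self.encoding)
            # Keep the origin of the text in the metadata.
            documents.append(Document(content=text, meta={"file_path": str(path)}))
        return {"documents": documents}


def demo():
    """Write a small text file, convert it, and return the resulting Documents."""
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "note.txt"
        sample.write_text("Converters produce Documents.", encoding="utf-8")
        result = SketchTextFileConverter().run(sources=[sample])
        return result["documents"]
```

Whatever the input format, the point of the pattern is that downstream components only ever see the same shape: text content plus metadata describing where it came from.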
Related Links
See the parameter details in our API reference: