
Content Uploader

The Content Uploader is a specialized loader for ingesting single files directly into Katara. It creates a document from the uploaded content, allowing you to add specific files to your corpus without setting up a crawler or external integration.

Configuration

File Upload

You can upload one file at a time. The file must be in one of the supported text-based formats listed under Supported Formats, and its content is processed as UTF-8 text.

Limits and Constraints

  • File Size: Maximum 10 MB per file.

  • Filename: Maximum 255 characters. Filenames cannot contain / or \ characters.

  • Encoding: Content must be UTF-8 encoded.
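These limits can be checked on the client before an upload is attempted. The following is a minimal sketch, not Katara's own validation logic: the function name `check_upload` and the error messages are hypothetical, and it assumes "10 MB" means 10 × 1024 × 1024 bytes.

```python
MAX_FILE_BYTES = 10 * 1024 * 1024  # assumption: 10 MB interpreted as 10 MiB
MAX_FILENAME_CHARS = 255


def check_upload(filename: str, content: bytes) -> list:
    """Return a list of problems; an empty list means the file passes."""
    problems = []
    if len(content) > MAX_FILE_BYTES:
        problems.append("file exceeds 10 MB")
    if len(filename) > MAX_FILENAME_CHARS:
        problems.append("filename longer than 255 characters")
    if "/" in filename or "\\" in filename:
        problems.append("filename contains / or \\")
    try:
        # Decoding is the simplest way to confirm the bytes are valid UTF-8.
        content.decode("utf-8")
    except UnicodeDecodeError:
        problems.append("content is not valid UTF-8")
    return problems
```

Returning every problem at once, rather than raising on the first one, lets a caller report all issues with a file in a single pass.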

Supported Formats

The loader accepts Markdown files and a set of other text-based formats, listed here by file extension.

Markdown Extensions

  • .md, .markdown, .mdown, .mkd, .qmd, .rmd

Other Text Formats

  • .adoc, .asc, .asciidoc, .csv, .htm, .html, .json, .latex, .tex, .text, .tsv, .txt, .vtt, .xhtml, .xml
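A pre-upload check against the two extension lists above might look like the sketch below. The function name `classify_extension` is hypothetical, and matching extensions case-insensitively is an assumption; the loader's actual matching rules are not documented here.

```python
from pathlib import PurePath
from typing import Optional

MARKDOWN_EXTENSIONS = {".md", ".markdown", ".mdown", ".mkd", ".qmd", ".rmd"}
OTHER_TEXT_EXTENSIONS = {
    ".adoc", ".asc", ".asciidoc", ".csv", ".htm", ".html", ".json", ".latex",
    ".tex", ".text", ".tsv", ".txt", ".vtt", ".xhtml", ".xml",
}


def classify_extension(filename: str) -> Optional[str]:
    """Return "markdown", "text", or None if the extension is not listed."""
    # Assumption: extensions are compared case-insensitively.
    ext = PurePath(filename).suffix.lower()
    if ext in MARKDOWN_EXTENSIONS:
        return "markdown"
    if ext in OTHER_TEXT_EXTENSIONS:
        return "text"
    return None
```

Only the final suffix is inspected, so a name such as `archive.tar.gz` is evaluated as `.gz` and rejected.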

Metadata

When uploading a file, you can optionally provide additional information to help categorize and manage the content.

  • Source ID: An optional identifier for the content source. If not provided, it defaults to the filename.

  • Language: The primary language used in the document.

  • Published At: The original publication date of the content.
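The Source ID default can be illustrated with a small sketch. The `UploadMetadata` class, its field names, and the `effective_source_id` helper are all hypothetical and only mirror the options listed above; treating an empty string as "not provided" is also an assumption.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class UploadMetadata:
    # Hypothetical field names; every field is optional.
    source_id: Optional[str] = None
    language: Optional[str] = None
    published_at: Optional[date] = None


def effective_source_id(filename: str, meta: UploadMetadata) -> str:
    """Fall back to the filename when no Source ID was supplied."""
    # Assumption: an empty string counts as "not provided".
    return meta.source_id or filename
```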

Tags

Tags help organize your documents for better filtering and retrieval.

  • Agent Tags: Default tags configured on the Content Uploader agent apply to every upload.

  • Per-Upload Tags: You can specify additional tags for each specific file. These merge with the default agent tags.
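One plausible reading of the merge is an order-preserving union of the two tag lists. The sketch below assumes exactly that: agent tags come first, and a tag that appears in both lists is kept once. The function name `merge_tags` is hypothetical, and the real merge may order or deduplicate tags differently.

```python
def merge_tags(agent_tags: list, upload_tags: list) -> list:
    """Combine default agent tags with per-upload tags, dropping duplicates."""
    # dict.fromkeys keeps the first occurrence of each tag and preserves order.
    return list(dict.fromkeys([*agent_tags, *upload_tags]))
```

For example, merging agent tags `["docs", "internal"]` with per-upload tags `["internal", "q3"]` yields `["docs", "internal", "q3"]` under these assumptions.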

Access and Ownership

  • Ownership: The document owner is set to the user performing the upload. If no user context is available, ownership falls back to the agent's owner.

  • Sharing: Uploaded documents inherit the sharing configuration from the Content Uploader agent's default document shares.
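The ownership rule is a simple fallback, sketched here with a hypothetical `resolve_owner` helper. Representing users as plain string identifiers and a missing user context as `None` are assumptions made for illustration.

```python
from typing import Optional


def resolve_owner(uploading_user: Optional[str], agent_owner: str) -> str:
    """Prefer the uploading user; fall back to the agent's owner."""
    # None stands in for "no user context is available".
    return uploading_user if uploading_user is not None else agent_owner
```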
