What is a Corpus?

Last updated 28 days ago

A corpus is a collection of documents used to inform your AI agents, allowing them to produce personalized responses and actions.

The problem

AI models like ChatGPT are trained on vast amounts of data spanning everything from cat pictures to Reddit threads about baking cookies. Much of this data is not directly helpful for organizational needs such as technical writing or answering questions about a specific set of documentation.

Additionally, a foundation model's training data is frozen at a certain date, so the model is unaware of anything published about a topic after that date. This often leads to incomplete or misleading answers.

Solution: Extending AI context

A Corpus is a RAG (retrieval-augmented generation), a method for extending AI context to include relevant, targeted, specific information about a certain topic. Katara allows organizations to use data loader agents to pull information from live sources such as Discord, Slack, websites, and GitHub. This data is collected and categorized in your Corpus. Generative agents then use the Corpus to inform their answers and content, which is critical to producing meaningful and accurate answers. You can refresh the links periodically or automatically to pull in the latest data about the topic.
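
To make the idea concrete, here is a minimal sketch of how a corpus can extend a model's context. It is an illustration only, not Katara's implementation or API: the names `Document`, `Corpus`, `retrieve`, and `build_prompt` are hypothetical, and simple keyword overlap stands in for whatever retrieval method a production RAG system would use (typically embedding-based search).

```python
import re
from dataclasses import dataclass, field
from typing import List, Set


def tokens(text: str) -> Set[str]:
    """Lowercase a string and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


@dataclass
class Document:
    source: str  # hypothetical label for where the text came from, e.g. "github"
    text: str


@dataclass
class Corpus:
    """Hypothetical in-memory corpus; a real one would persist and index documents."""
    documents: List[Document] = field(default_factory=list)

    def add(self, source: str, text: str) -> None:
        self.documents.append(Document(source, text))

    def retrieve(self, question: str, k: int = 2) -> List[Document]:
        """Return up to k documents sharing the most words with the question."""
        q = tokens(question)
        scored = [(len(q & tokens(d.text)), d) for d in self.documents]
        # Drop documents with no overlap, then rank by overlap (highest first).
        scored = [pair for pair in scored if pair[0] > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [d for _, d in scored[:k]]


def build_prompt(corpus: Corpus, question: str, k: int = 2) -> str:
    """Prepend the retrieved documents to the question as extra context."""
    context = "\n".join(f"[{d.source}] {d.text}" for d in corpus.retrieve(question, k))
    return f"Context:\n{context}\n\nQuestion: {question}"


# Example data, invented for illustration.
corpus = Corpus()
corpus.add("github", "Release 2.1 adds API key rotation under Settings.")
corpus.add("discord", "Community call is moved to Thursday.")
corpus.add("website", "An API key is required for every request.")

print(build_prompt(corpus, "Where do I find API key rotation?"))
```

The prompt that reaches the generative model now carries the two relevant documents, while the unrelated Discord message is left out. Refreshing the corpus amounts to re-running the loaders and replacing the stored documents, so the next prompt reflects the latest data.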