📑What is a Corpus?
AI models like ChatGPT are trained on vast amounts of data, spanning everything from cat pictures to Reddit threads about baking cookies. Much of this data is not directly useful for organizational needs such as technical writing or answering questions about a specific set of documentation.
Additionally, a foundation model's training data is frozen at a cutoff date, so the model is unaware of anything published about a topic after that date. As a result, its answers are often incomplete or misleading.
A Corpus is Katara's knowledge base for retrieval-augmented generation (RAG), a method for extending an AI model's context with relevant, targeted information about a specific topic. Katara lets organizations use Data Loader agents to pull information from live sources such as Discord, Slack, websites, and GitHub. This data is collected and categorized in your Corpus, and Generative Agents use the Corpus to inform their answers and content, which is critical to producing meaningful, accurate results. You can refresh the links periodically or automatically to pull in the latest data about the topic.
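To make the retrieval idea concrete, here is a minimal Python sketch of the general RAG pattern: rank stored documents against a question, then prepend the best matches to the prompt. It is not Katara's implementation or API. The `Document` class, the `retrieve` and `build_prompt` helpers, the sample texts, and the simple word-overlap scoring are all assumptions made for illustration; production systems typically use embedding-based search instead.

```python
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Document:
    source: str  # hypothetical label for where the text came from
    text: str


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query: str, doc: Document) -> int:
    """Count the query tokens that also occur in the document."""
    q = Counter(tokenize(query))
    d = Counter(tokenize(doc.text))
    return sum(min(q[t], d[t]) for t in q)


def retrieve(query: str, corpus: list[Document], k: int = 2) -> list[Document]:
    """Return up to k documents with the highest non-zero overlap score."""
    ranked = sorted(corpus, key=lambda doc: score(query, doc), reverse=True)
    return [doc for doc in ranked[:k] if score(query, doc) > 0]


def build_prompt(query: str, corpus: list[Document], k: int = 2) -> str:
    """Prepend the retrieved documents to the question as context."""
    context = "\n".join(
        f"[{doc.source}] {doc.text}" for doc in retrieve(query, corpus, k)
    )
    return f"Context:\n{context}\n\nQuestion: {query}"


# Toy corpus standing in for data gathered from live sources (made-up content).
corpus = [
    Document("github", "Release 2.0 adds a staking command to the CLI."),
    Document("slack", "The team lunch is on Friday."),
    Document("website", "Use the CLI staking command to delegate tokens."),
]

prompt = build_prompt("How do I use the staking command?", corpus)
print(prompt)
```

The prompt passed to the model now carries the two most relevant snippets and leaves out the unrelated one. Refreshing a corpus, in this simplified picture, amounts to replacing the `corpus` list with newly loaded documents.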