
Bridging the Gap: Why We Built Perseus to Solve the AI Memory Problem

Artificial Intelligence is evolving rapidly, but it faces a persistent hurdle: reliable memory.


While Large Language Models (LLMs) excel at processing information in the moment, they lack a dependable long-term memory. This isn't just a problem for Retrieval-Augmented Generation (RAG); it is a critical bottleneck for AI Agents. For an agent to act autonomously—to plan, reason, and execute complex workflows—it needs structured, retrievable knowledge, not just a probability-based guess.

The industry consensus is that Knowledge Graphs (KGs) are the missing link. They provide the structure necessary for AI to "remember" facts accurately. However, adopting KGs has historically been difficult due to high costs, complexity, and inconsistent results.

At Lettria, we wanted to understand exactly why this process was so difficult, so we could build a tool that actually works for the community.

First, We Did the Science

Before writing a single line of code for our new SDK, we went back to the drawing board to investigate the root causes of failure in text-to-graph systems. Our research, detailed in our paper "Text2KGBench-LettrIA: A Refined Benchmark for Text2Graph Systems", uncovered three major structural flaws in how the industry was building and evaluating these systems:

  1. Imprecise Ontologies: We found that existing frameworks relied on "flat," non-hierarchical designs that lacked formal rigor. If the blueprint (the ontology) is ambiguous, the model cannot organize data effectively.

  2. Inconsistent Data: Standard benchmarks suffered from erratic annotations and a lack of standardization, making it impossible to reliably train models or measure their success.

  3. The "Bigger is Better" Myth: Perhaps most importantly, our experiments revealed that massive, general-purpose proprietary models often struggle with strict schema adherence, leading to hallucinations.

To fix this, we developed a new, rigorously corrected benchmark with over 14,000 high-fidelity triples and strict hierarchical ontologies. This research proved a crucial point: smaller, fine-tuned open-source models can significantly outperform massive proprietary models in graph construction tasks, provided they are trained on high-quality data.

Introducing Perseus SDK: Research Applied

We leveraged these scientific insights to build the Perseus SDK. It is designed to make high-quality graph construction accessible, scalable, and accurate for developers and enterprises alike.

Here is how Perseus addresses the challenges we identified:

1. Accuracy Through Specialization

Our research showed that generalist models often fail to follow strict output formats. Perseus utilizes the fine-tuning methodologies validated in our benchmarks, ensuring that the extracted data strictly adheres to your schema. This drastically reduces hallucinations and ensures that your AI Agents are acting on verified facts, not statistical noise.
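To make the idea of strict schema adherence concrete, here is a minimal, illustrative sketch of post-extraction validation: every extracted triple is checked against a whitelist of allowed relation signatures, so anything a model hallucinates outside the schema never enters the graph. The relation names and `Triple` type are hypothetical examples, not the Perseus SDK's actual API.

```python
from dataclasses import dataclass

# Hypothetical schema: the allowed (subject_type, predicate, object_type)
# signatures. In practice this would be derived from your ontology.
ALLOWED_RELATIONS = {
    ("Project", "managedBy", "Person"),
    ("Person", "worksFor", "Organization"),
}

@dataclass(frozen=True)
class Triple:
    subject_type: str
    predicate: str
    object_type: str

def conforms(triple: Triple) -> bool:
    """Accept only triples whose signature appears in the schema."""
    return (triple.subject_type, triple.predicate, triple.object_type) in ALLOWED_RELATIONS

# A hallucinated predicate is filtered out instead of polluting the graph.
extracted = [
    Triple("Project", "managedBy", "Person"),   # valid per schema
    Triple("Project", "inventedBy", "Dragon"),  # hallucinated -> dropped
]
clean = [t for t in extracted if conforms(t)]
```

A fine-tuned model trained to respect the schema rarely produces triples that this filter rejects, which is precisely the behavior our benchmarks measure.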

2. Solving the "Cold Start" with Automated Ontology

One of the hardest parts of building a graph is defining the "Ontology"—the rules and logic of your data structure (e.g., defining that a Project must have a Manager).

To lower this barrier, we have built an Ontology Toolkit that can automatically generate a tailored ontology directly from your unstructured data. This feature, which will soon be integrated directly into the Perseus SDK, allows you to move from raw documents to a structured logic layer in minutes, giving you full control over how your data is organized without needing a PhD in information science.
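To illustrate what such a structured logic layer looks like, here is a small sketch of a hierarchical ontology with a cardinality constraint expressing "a Project must have a Manager," plus a check that flags instances violating it. The class and relation names, and the dictionary layout, are hypothetical; this is not the Ontology Toolkit's actual output format.

```python
# Hypothetical hierarchical ontology: classes with parents, and relations
# with domain, range, and a minimum-cardinality constraint.
ontology = {
    "classes": {
        "Agent":   {"parent": None},
        "Person":  {"parent": "Agent"},
        "Manager": {"parent": "Person"},
        "Project": {"parent": None},
    },
    "relations": {
        # "A Project must have at least one Manager."
        "hasManager": {"domain": "Project", "range": "Manager", "min_count": 1},
    },
}

def missing_required_relations(instance: dict) -> list[str]:
    """Return the relations this instance is required to have but lacks."""
    cls = instance["type"]
    missing = []
    for name, spec in ontology["relations"].items():
        if spec["domain"] == cls and len(instance.get(name, [])) < spec["min_count"]:
            missing.append(name)
    return missing
```

Once the rules live in data rather than in prompt text, the same constraints can be enforced at extraction time, at load time, or during later updates to the graph.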

3. Efficiency and Scalability

Running millions of documents through a giant model like GPT-4 is prohibitively expensive and slow. By utilizing optimized, fine-tuned models—which we found to be orders of magnitude faster with lower latency—Perseus allows you to process massive datasets efficiently. This makes it feasible to build dynamic, living knowledge bases that update in real-time.

Why This Matters

The release of the Perseus SDK represents a shift from "guessing" to "knowing."

  • For Developers: You get a Python-native experience that integrates seamlessly with graph databases like Neo4j, abstracting away the complexity of prompt engineering.
  • For Partners: You can offer solutions where AI Agents are grounded in reality, capable of complex reasoning with a memory you can trust.
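As a rough sketch of what loading extracted triples into Neo4j involves, the helper below turns one (subject, predicate, object) triple into a parameterized Cypher `MERGE` statement. The predicate whitelist and entity label are hypothetical; the Perseus client abstracts this layer, but the underlying pattern is standard Cypher.

```python
def triple_to_cypher(subj: str, pred: str, obj: str) -> tuple[str, dict]:
    """Build a parameterized Cypher MERGE for one triple.

    Cypher cannot parameterize relationship types, so the predicate is
    interpolated into the query only after a strict whitelist check.
    """
    allowed = {"MANAGED_BY", "WORKS_FOR"}  # hypothetical schema predicates
    if pred not in allowed:
        raise ValueError(f"predicate {pred!r} not in schema")
    query = (
        "MERGE (s:Entity {name: $subj}) "
        "MERGE (o:Entity {name: $obj}) "
        f"MERGE (s)-[:{pred}]->(o)"
    )
    return query, {"subj": subj, "obj": obj}

# With the official neo4j Python driver, each statement would run as:
#   with driver.session() as session:
#       session.run(query, **params)
```

Using `MERGE` rather than `CREATE` keeps repeated ingestion idempotent: re-processing the same document does not duplicate nodes or edges.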

We believe that better benchmarks lead to better tools. With Perseus, we are sharing the results of our research to help the entire community build the next generation of reliable AI.

Start Building Today

You can access the Perseus Client on GitHub to get started, or reach out to us to learn more about our upcoming automated ontology features.

Want to see Lettria in action on your documents?


Frequently Asked Questions

Can Perseus integrate with existing enterprise systems?

Yes. Lettria’s platform, including Perseus, is API-first, with over 50 native connectors and workflow automation integrations (such as Power Automate and webhooks). This lets you embed document intelligence into existing compliance, audit, and risk management systems quickly, without disrupting current processes or requiring an extensive IT overhaul.

How does Perseus accelerate compliance workflows?

It dramatically reduces time spent on manual document parsing and risk identification by automating ontology building and semantic reasoning across large document sets. It can process an entire RFP answer in a few seconds, highlighting all compliant and non-compliant sections against one or multiple regulations, guidelines, or policies. This helps you quickly identify risks and ensure full compliance without manual review delays.

What differentiates Lettria Knowledge Studio from other AI compliance tools?

Lettria focuses on document intelligence for compliance, one of the hardest and most complex untapped challenges in the field. To tackle this, Lettria uses a unique text-to-graph generation model that is 30% more accurate and runs 400x faster than popular LLMs for parsing complex, multimodal compliance documents. It preserves document layout features like tables and diagrams, as well as semantic relationships, enabling precise extraction and understanding of compliance content.

Callout

Start to accelerate your AI adoption today.

Boost RAG accuracy by 30 percent and watch your documents explain themselves.