
Faster, Smarter, Sharper, Surer: The GraphRAG by Lettria

Discover the evolution of retrieval with GraphRAG. Move beyond text chunks to a structured knowledge graph featuring low-latency performance, multimodal intelligence, and verified visual grounding.

Key takeaways

⚡ Faster (Low Latency): Heavy pre-computation, such as community summary generation and temporal validity tagging, combined with a high-efficiency vector database, ensures that retrieval remains near-instant, even when navigating complex relationships at scale.
🧠 Smarter (Graph Retrieval & Multimodal Intelligence): Our advanced Graph Retrieval engine transforms unstructured text into a rich, structured knowledge graph of entities, relationships, and communities, enabling precise answers to both granular and high-level queries. Combined with native multimodal processing—spanning tables, charts, and drawings—it delivers a powerful foundation for deeper reasoning and more context-aware insights.
🎯 Sharper (Semantic & Lexical Vector Search Precision): We dynamically combine dense retrieval—powered by semantic understanding—with lexical search for exact keyword matching, enabling high-precision results even for complex queries involving multiple sub-questions, entity comparisons, and nuanced constraints across any domain.
✅ Surer (Verified Grounding): Our “visual truth” UI allows users to audit every answer through side-by-side document views and visual bounding boxes on the original source files.

The evolution of retrieval: from chunks to structure

The hidden cost of verbose AI

Traditional RAG relies on chunks—blocks of raw text stored in a vector database. In enterprise contexts, chunks are often verbose and include irrelevant filler, quickly consuming the model’s limited context window. As noise increases, reasoning quality drops—driving higher costs and weaker answers.

GraphRAG shifts from raw text blocks to structured graph elements. By extracting entities and relationships, we provide a dense, high-precision map of the data. This is strengthened by the use of an ontology: a formal schema that guides extraction and ensures entities and relationships remain consistent with the domain.
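To make the idea concrete, here is a minimal sketch of ontology-guided extraction in Python. The schema, entity types, and `conforms` helper are illustrative assumptions, not Lettria's actual pipeline; the point is simply that triples falling outside the schema never enter the graph.

```python
from dataclasses import dataclass

# Illustrative ontology for a banking domain: the entity types that exist
# and the relations that are allowed between them.
ONTOLOGY = {
    "entity_types": {"Bank", "CapitalRatio", "RiskFactor"},
    "relations": {
        ("Bank", "REPORTS", "CapitalRatio"),
        ("Bank", "EXPOSED_TO", "RiskFactor"),
    },
}

@dataclass(frozen=True)
class Triple:
    head: str
    head_type: str
    relation: str
    tail: str
    tail_type: str

def conforms(t: Triple, ontology: dict) -> bool:
    """Accept a triple only if its types and relation match the schema."""
    return (
        t.head_type in ontology["entity_types"]
        and t.tail_type in ontology["entity_types"]
        and (t.head_type, t.relation, t.tail_type) in ontology["relations"]
    )

# Whatever the extractor proposes, off-schema triples are filtered out.
raw = [
    Triple("Acme Bank", "Bank", "REPORTS", "Tier 1 ratio of 13.4%", "CapitalRatio"),
    Triple("Acme Bank", "Bank", "LOCATED_IN", "Paris", "City"),  # rejected: not in schema
]
graph_triples = [t for t in raw if conforms(t, ONTOLOGY)]
```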

Furthermore, our GraphRAG uses Communities to answer those high-level "Global" questions that usually break standard RAG. While a Local question might ask for a specific data point (e.g., "What is the Tier 1 capital ratio?"), a Global question requires a synthesis of the entire document (e.g., "What are the primary risk themes?"). By clustering interconnected entities within a document into thematic groups, the system can synthesize these broad insights—a capability that is particularly critical for navigating long, complex reports.
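As a simplified illustration, a Global question can be answered from pre-computed community summaries with a map-reduce pattern. The `llm` callable and the prompts below are stand-ins for the example, not Lettria's actual orchestration:

```python
def answer_global(question: str, community_summaries: list[str], llm) -> str:
    """Answer a document-level ("Global") question from pre-computed
    community summaries. `llm` is any prompt -> text completion function."""
    # Map: pull out whatever each thematic community contributes.
    partials = [
        llm(f"List anything relevant to: {question}\n\nCommunity summary:\n{s}")
        for s in community_summaries
    ]
    # Reduce: synthesize the partial findings into a single answer.
    evidence = "\n".join(p for p in partials if p.strip())
    return llm(f"Synthesize an answer to: {question}\n\nFindings:\n{evidence}")
```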

Unlocking hidden knowledge in PDFs

Enterprise knowledge isn’t only plain text—critical insights sit in tables, diagrams, and images, especially inside PDFs. Many systems skip these layers, creating blind spots in risk assessment and contract analysis.

Our ingestion pipeline uses multimodal models to extract content from every document layer. Before indexing, content goes through a tuned splitting process:

  • Chunks too large → more noise, worse context efficiency
  • Chunks too small or abruptly truncated → broken coreference, degraded retrieval

By pairing precise chunking with table/visual interpretation, the system builds context that traditional text-only search can’t match—so answers can be found in nested tables, images, or across multiple paragraphs.
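As a rough illustration of the trade-off above, a tuned splitter can pack whole sentences into chunks within size bounds instead of cutting at fixed character offsets. The thresholds here are illustrative, not production values:

```python
import re

def split_text(text: str, min_chars: int = 300, max_chars: int = 1200) -> list[str]:
    """Greedy splitter: pack whole sentences into bounded chunks so text is
    never cut mid-sentence (which would break coreference). A single
    oversized sentence simply becomes its own chunk."""
    sentences = re.split(r"(?<=[.!?])\s+", text)
    chunks, current = [], ""
    for sent in sentences:
        if current and len(current) + len(sent) + 1 > max_chars:
            chunks.append(current)
            current = sent
        else:
            current = f"{current} {sent}".strip()
    if current:
        if chunks and len(current) < min_chars:
            chunks[-1] += " " + current  # merge a too-small trailing chunk
        else:
            chunks.append(current)
    return chunks
```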

The four pillars of trustworthy GraphRAG

To support regulated sectors like finance, legal, and healthcare, our GraphRAG platform is built on four technical pillars.

1) Low-latency performance

Speed is a non-negotiable requirement for enterprise adoption. We push the "heavy lifting" of relationship mapping and structural analysis into the ingestion phase (pre-computation). This ensures that at query time, the Graph Retrieval engine can rely on a high-efficiency vector database to fetch the most relevant nodes instantly. Key architectural drivers for our low-latency performance include:

  • Pre-computed Communities: We run the Leiden algorithm during ingestion to cluster related nodes into thematic groups and pre-summarize each one.
  • Temporal Validity: We store validity dates in the metadata, which allows the retriever to select the information that applies at the relevant point in time.
  • Metadata-Filtered Retrieval: Because every chunk is traceable to its source document, we can apply filters based on document names. Narrowing the search scope before vector retrieval begins reduces noise, improves latency, and focuses computation on truly relevant content.

By minimizing runtime computation through these pre-structured layers, we keep latency low even when the knowledge base scales to millions of nodes and thousands of concurrent documents.
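For illustration, here is what ingestion-time community pre-computation could look like using the open-source `igraph` and `leidenalg` packages. The toy entity graph is invented for the example and is not a Lettria data structure:

```python
import igraph as ig
import leidenalg

# Toy entity graph: vertices are extracted entities, edges are relationships.
g = ig.Graph()
g.add_vertices(["Bank", "Tier 1 Ratio", "Basel III", "Cyber Risk", "Phishing"])
g.add_edges([(0, 1), (1, 2), (3, 4), (0, 3)])

# Ingestion-time clustering: each partition becomes a "community" whose
# members are summarized once, so query time pays no clustering cost.
partition = leidenalg.find_partition(g, leidenalg.ModularityVertexPartition)
communities = [[g.vs[i]["name"] for i in cluster] for cluster in partition]
```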

2) Structural connectivity via Graph Retrieval

Relevance depends on understanding how data points connect. Our Graph Retrieval engine performs per-document synthesis to identify thematic communities within each file. Instead of treating text chunks as isolated islands, the system navigates the graph to link related concepts. This allows the AI to "connect the dots" across a document, providing a level of synthesis far more coherent and accurate than traditional vector search over independent chunks.

This is orchestrated by a smart Query Scanner process that detects user intent and routes each query to either a local or a global search. This ensures the system only retrieves and uses communities when comprehensive context is required, maintaining both precision and speed.
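A toy sketch of such a router is shown below. The keyword cues are a crude stand-in for a real intent classifier or LLM-based scanner:

```python
GLOBAL_CUES = ("overall", "themes", "summary", "main risks", "across the document")

def route(query: str) -> str:
    """Route to "global" (community-level) or "local" (entity-level) search.
    A production Query Scanner would use a trained classifier, not keywords."""
    q = query.lower()
    return "global" if any(cue in q for cue in GLOBAL_CUES) else "local"

assert route("What is the Tier 1 capital ratio?") == "local"
assert route("What are the primary risk themes overall?") == "global"
```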


3) Semantic vs. lexical vector search

We combine two distinct retrieval worlds:

  • Dense Search: Captures the semantic "intent" and meaning behind a question.
  • Lexical (Sparse) Search: Captures exact keywords, technical identifiers, and unique codes.

This dual-layer approach is essential for “needle-in-a-haystack” queries, such as identifying specific clause IDs, policy numbers, or specialized medical codes.
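A minimal hybrid-retrieval sketch, assuming the open-source `rank_bm25` and `sentence-transformers` packages; the model name, corpus, and fusion weight are illustrative choices, not our production stack:

```python
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer, util

docs = [
    "Clause 14.2 sets the early termination fee at two monthly payments.",
    "Policy PN-88231 covers flood damage up to the insured amount.",
]
query = "What does policy PN-88231 cover?"

# Lexical (sparse) scores: exact hits on identifiers like "PN-88231".
bm25 = BM25Okapi([d.lower().split() for d in docs])
lexical = bm25.get_scores(query.lower().split())

# Dense scores: semantic similarity, robust to paraphrasing.
model = SentenceTransformer("all-MiniLM-L6-v2")
dense = util.cos_sim(model.encode(query), model.encode(docs))[0]

# Simple weighted fusion; production systems often use reciprocal rank fusion.
alpha = 0.5
fused = [
    alpha * float(d) + (1 - alpha) * float(l) / (max(lexical) + 1e-9)
    for d, l in zip(dense, lexical)
]
best_doc = docs[max(range(len(docs)), key=lambda i: fused[i])]
```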

4) Verified grounding

Trust is built through evidence. In our SaaS interface, the AI’s answer appears on the left while the source document is displayed on the right. Every claim made by the AI is tied to visual bounding boxes on the original PDF.

Users can also filter their search to a specific subset of documents, ensuring the context remains "trusted," less noisy, and highly accurate.
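As an illustration of the data that makes this auditing possible, a grounded answer might carry evidence records shaped like the following. The field names are assumptions for the example, not Lettria's actual schema:

```python
# Illustrative shape only: each claim points at an exact region of the source.
grounded_answer = {
    "claim": "The Tier 1 capital ratio is 13.4%.",
    "evidence": [
        {
            "document": "annual_report_2023.pdf",
            "page": 47,
            # Bounding box in PDF points (x0, y0, x1, y1), used by the UI
            # to draw the highlight on the source page.
            "bbox": [72.0, 388.5, 312.4, 402.1],
        }
    ],
}
```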

A production-ready architecture

Synchronized knowledge bases

A key engineering challenge is keeping multiple data stores in sync. We ensure consistent, atomic writes across both our vector index and knowledge graph during ingestion—even in the event of process failures—preventing divergence between systems. This guarantees that every piece of indexed content is fully aligned across representations, improving retrieval reliability, traceability, and overall system integrity.
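In pseudocode terms, the pattern resembles a staged commit with rollback. The store interfaces and the `extract_triples` helper below are hypothetical, sketched only to show the failure-handling shape:

```python
def ingest(chunks, vector_store, graph_store, extract_triples):
    """Keep the vector index and the knowledge graph aligned: stage the
    vector write, attempt the graph write, and roll back on failure so the
    two stores never diverge. All interfaces here are hypothetical."""
    txn = vector_store.begin()
    try:
        vector_store.add(chunks, txn=txn)
        graph_store.add(extract_triples(chunks))
        vector_store.commit(txn)
    except Exception:
        vector_store.rollback(txn)  # undo the staged write: no partial state
        raise
```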

Precision through document filtering

In a SaaS environment, users often need to narrow their focus to a specific project, department, or timeframe. Our architecture supports document-level filtering, which drastically improves context quality. By restricting retrieval to a specific set of files, the system produces cleaner, more authoritative answers that are strictly grounded in the user's chosen sources.
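A sketch of what such a pre-filter looks like in practice; the `index.search` API and `embed` helper are hypothetical stand-ins, though most vector databases expose an equivalent metadata filter:

```python
def filtered_search(index, embed, query: str, allowed_docs: list[str], k: int = 5):
    """Restrict retrieval to a user-selected document subset *before* vector
    scoring, so only the chosen files can contribute context."""
    return index.search(
        vector=embed(query),
        filter={"document_name": {"$in": allowed_docs}},  # metadata pre-filter
        top_k=k,
    )
```

A caller would pass something like `allowed_docs=["msa_2024.pdf", "sow_q3.pdf"]` to scope the answer to those two contracts.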

FAQ

  • How does graph structure help the AI reason better than chunks?

Structured graph elements are built through a cross-sectional analysis of each document, producing concise, interconnected representations of the content. This graph structure also normalizes the data, making it simpler for the model to exploit. As a result, more useful information fits into the model's context, improving accuracy on complex questions.

  • Can the system really read tables inside my PDFs?

Yes. Multimodal parsing identifies table/image structure and converts it into structured knowledge, so the AI can retrieve and explain it like any other content.

  • What is the benefit of the lexical part of search?

Semantic search identifies results based on meaning, while lexical search matches exact terms (such as policy numbers, medical terms, or clause IDs). Used together, they balance recall (finding relevant results) and precision (finding exact matches).

  • What is the benefit of using an ontology during ingestion?

An ontology acts as a rigorous schema for Text-to-Graph, specifying which entities/relations matter so the resulting graph stays accurate and domain-aligned.

  • How do I know the AI isn’t making things up?

Verified grounding provides direct visual evidence: the answer is tied to exact locations in the source document, highlighted with bounding boxes.

Conclusion: trust as an infrastructure choice

Enterprise AI must move from “plausible” to deterministic answers. Our GraphRAG solution treats trust as core infrastructure: structural connectivity, hybrid retrieval, low latency, and visual auditability.

By prioritizing structured Graph Retrieval over verbose chunks and providing a transparent audit trail, enterprises can finally deploy AI with total confidence—transforming complex document repositories into a verifiable competitive advantage.

Ready to revolutionize your RAG?

Start accelerating your AI adoption today.

Boost RAG accuracy by 30 percent and watch your documents explain themselves.