Structuring Data Into Graphs With Text To Graph

Leverage Lettria's Text2Graph to convert unstructured text into structured, searchable data using NoSQL graph databases.

Introduction

Databases are the standard tool for storing and processing data. Traditional databases organize data into structured tables that allow fast, simple queries and help organizations find patterns in their data.

Databases work well for data that is already structured, but what about the text files in your organization? What about files that cannot easily be converted into ordered tables? Are they a round peg that cannot fit into the square hole of a database, leaving this data unprocessed and unorganized?

This is no longer the case. NoSQL ("Not Only SQL") databases allow for the storage of unstructured data, and by using a special type of NoSQL database, Lettria has created Text2Graph: a project that finds structure in your text files and presents the data in a graph format.

So how does it work?

Graph databases are a type of NoSQL database that can be used to navigate relationships in a file. By specifying common relationships when the file is ingested, Lettria’s Text2Graph builds structure around unstructured text files, giving your organization the opportunity to parse and study these files in ways that were not possible before.

Graph databases

Graph databases are a piece of this magic. A graph database organizes data as nodes and edges.

  • Nodes are entities in the database. They are like a record in a traditional database, but there is no limit to the number of attributes that a node may have.
  • Edges are the relationships (links) that connect the nodes in the database. Edges can be directed or undirected.

For example, in a social media application, Albert and Betty would be nodes, and their friendship would be an edge. The friendship is mutual, so the edge has no direction.

In a family tree where Charlie is the parent of David, the edge will be named ‘parent’, and it will have a specific direction - from Charlie to David.
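
To make the two examples concrete, here is a minimal Python sketch of nodes and edges held in plain in-memory structures. The `Graph` and `Edge` classes and the `neighbors` helper are hypothetical names chosen for illustration; they are not part of Text2Graph or of any particular graph database.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    label: str
    directed: bool = True


@dataclass
class Graph:
    nodes: dict = field(default_factory=dict)  # node name -> attribute dict
    edges: list = field(default_factory=list)

    def add_node(self, name, **attributes):
        # A node may carry any number of attributes.
        self.nodes.setdefault(name, {}).update(attributes)

    def add_edge(self, source, target, label, directed=True):
        self.edges.append(Edge(source, target, label, directed))

    def neighbors(self, name, label):
        """Nodes reachable from `name` over edges with the given label."""
        found = []
        for edge in self.edges:
            if edge.label != label:
                continue
            if edge.source == name:
                found.append(edge.target)
            elif edge.target == name and not edge.directed:
                # Undirected edges can be followed from either end.
                found.append(edge.source)
        return found


graph = Graph()
for person in ("Albert", "Betty", "Charlie", "David"):
    graph.add_node(person, kind="person")

graph.add_edge("Albert", "Betty", "friend", directed=False)  # mutual
graph.add_edge("Charlie", "David", "parent", directed=True)  # Charlie -> David

print(graph.neighbors("Betty", "friend"))    # ['Albert']
print(graph.neighbors("Charlie", "parent"))  # ['David']
print(graph.neighbors("David", "parent"))    # []
```

The undirected friendship can be followed from either Albert or Betty, while the directed 'parent' edge can only be followed from Charlie to David.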

Finding structure in a text document

Lettria’s Text2Graph finds the structure in your text documents, defining the nodes and edges as the data is inserted into a graph database.  Lettria’s ontology function is the magic that finds the structure and provides that data to the database.  

What is an ontology?  

An ontology is a formal representation of the concepts and relationships in a specific domain. By creating ontologies for your data, Lettria creates the template used to structure your text files. For example, the Lettria “Business Event Detection” ontology is built around merger and acquisition events and relationships. The screenshot below shows business events around acquisitions, bidding wars, mergers, and demergers. The roles are also described: the acquirer and acquiree in an acquisition, or the participants in a merger or demerger.

When a business news article is run through this ontology, Text2Graph structures the data around it and builds a graph.
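
One way to picture an ontology is as a table of allowed event types and the roles each one accepts. The Python sketch below is an illustration under that assumption only: the `ONTOLOGY` dictionary, its role names, and the `unknown_roles` helper are hypothetical and do not reflect Lettria's actual ontology format.

```python
# Hypothetical, simplified ontology: event type -> roles it may carry.
# Bidding wars are left out because the article does not name their roles.
ONTOLOGY = {
    "Acquisition": {"acquirer", "acquiree", "amount"},
    "Merger": {"participant"},
    "Demerger": {"participant"},
}


def unknown_roles(event_type, roles):
    """Return the role names the ontology does not allow for this event type."""
    if event_type not in ONTOLOGY:
        raise KeyError(f"event type not in ontology: {event_type}")
    return sorted(set(roles) - ONTOLOGY[event_type])


# An extracted event whose roles all fit the ontology.
print(unknown_roles("Acquisition", {"acquirer": "A", "acquiree": "B"}))  # []

# 'seller' is not a role the Acquisition event type knows about.
print(unknown_roles("Acquisition", {"acquirer": "A", "seller": "B"}))    # ['seller']
```

Constraining extraction to a fixed set of event types and roles is what lets every article produce a graph with the same, predictable shape.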

A simple example of Lettria’s Text2Graph ontology in action

We’ll create a fake M&A article:

SolarFlare Innovations has agreed to buy QuantumWise Technologies for an estimated $129M.  

Lettria creates the following graph:

The sentence has been charted into four nodes: SolarFlare, QuantumWise, Acquisition Event, and Amount. Clicking on the center node reveals the edges, where we discover that SolarFlare is the ‘acquirer’, QuantumWise is the ‘acquiree’, and the Amount is, well, the ‘amount’ of the deal. Each node is also described in the right navigation.

Clearly, the Text2Graph ontology has structured the data to make it easier to parse and search. By converting unstructured data into a searchable index, we can now better understand files that were previously unsearchable!
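
As a rough illustration of what "searchable" means here, the graph from the example can be written down as (source, edge, target) triples and filtered with a few lines of Python. The triple layout, the edge directions, and the `find` helper are assumptions made for this sketch, not Text2Graph's actual output format.

```python
# Assumed layout: (source node, edge label, target node), with every edge
# pointing from the central event node to the entity that fills the role.
triples = [
    ("Acquisition Event", "acquirer", "SolarFlare Innovations"),
    ("Acquisition Event", "acquiree", "QuantumWise Technologies"),
    ("Acquisition Event", "amount", "$129M"),
]


def find(triples, source=None, label=None, target=None):
    """Return the triples matching every field that is not None."""
    return [
        t for t in triples
        if (source is None or t[0] == source)
        and (label is None or t[1] == label)
        and (target is None or t[2] == target)
    ]


# Who is buying?
print([t[2] for t in find(triples, label="acquirer")])
# ['SolarFlare Innovations']

# What role does QuantumWise Technologies play?
print([t[1] for t in find(triples, target="QuantumWise Technologies")])
# ['acquiree']
```

A real graph database would answer the same questions through its own query language; the point is only that once the sentence is stored as labeled edges, "who acquired whom, and for how much" becomes a lookup rather than a text search.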

This example was created using Lettria’s Text2Graph, a tool that quickly and accurately extracts information from unstructured data such as text files and builds a structure around the content so that it can be queried like a database. The data can be loaded into any graph database and visualized as needed.

Interested in trying this for your company? Contact us for a demo!
