The Power of Storing Corporate Data in a Graph Database

By Rebecca Rabb

May 12, 2021

As a company grows, it needs to ensure it is surrounding itself with the right people, partnering with reputable organizations, and remaining flexible in the face of market volatility and changing societal conditions. Prior to graph databases such as Neo4j, how would a company gather information about the constituents of its network? And not just the obvious first-degree connections, but the connections multiple degrees further out, where possible risk is harder to detect? For example, do our partners have relationships that could somehow jeopardize our corporate integrity? Or, how do we analyze the impact of conditions outside our control, such as natural disasters, a scathing op-ed, or a global pandemic? Whether you’re an established name in international commerce or a budding start-up, these are all elements that can and should be taken into account, but that were difficult, if not impossible, to fully grasp prior to the advent of graph databases.

Graph databases help address these types of issues in a manner that is efficient, transparent, and accessible to every area of a business. For an individual or organization with many connections, the complexity of background checks and related resource searches grows exponentially the further out the exploration goes. For example, Company A could have 100 subsidiaries, and those subsidiaries could each have another 100 themselves, and so on. Using traditional technologies and approaches, continuing the exploration far enough to thoroughly vet the extent of the web-like network can become incredibly resource intensive, in both time and computation, and critical nodes are often missed by those legacy, frequently manual approaches. Below I’ll review two use cases where graph databases can significantly improve outcomes.
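To put rough numbers on that growth, here is a minimal sketch. It assumes, purely for illustration, that every entity has exactly 100 subsidiaries at every level; real corporate structures are far less uniform, and the function name is hypothetical.

```python
def entities_within(depth, branching=100):
    """Count entities reachable within `depth` levels below the root company,
    assuming every entity has exactly `branching` subsidiaries."""
    return sum(branching ** level for level in range(1, depth + 1))


for depth in (1, 2, 3):
    print(depth, entities_within(depth))
```

Under that assumption, three levels down already means more than a million entities to vet, which is why manual exploration tends to stop early.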

Use Case 1: Conflict of Interest

In business, in politics, and in science, one looks to scope potential conflicts of interest, avoid them where possible, or at the very least disclose and document them. For example, in business, is one of the board members voting with their own holdings in mind? In the political sphere, what about a lawmaker championing a bill that’s in their own best interest rather than their constituents’? In science, is a researcher publishing results that advocate for or conflict with other projects or grants?

Below is an example from the political space. We begin with a politician sponsoring a given bill that would affect legislation in two industries:

We then look to establish a circular relationship to determine how close to, or far removed from, the interests of that bill the individual is. At first glance, this politician’s disclosed holdings show no association. The politician owns a stake in a few businesses, including company 1, and company 1 does business with companies 2, 3, and 4.

By adding weights to these relationships, we’re able to determine just how much influence the individual has on the downstream entities. The weights multiply along the path: if the politician has a 40% stake in company 1, company 1 fully owns company 4, and company 4 owns 60% of downstream company 5, then the politician effectively holds a 24% stake (40% × 100% × 60%) in a company they may or may not have known about or disclosed.
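The multiplication can be sketched in a few lines of Python. The node names and the `effective_stake` helper are hypothetical and only mirror the percentages in the example; in a graph database the weights would typically live as properties on the ownership relationships.

```python
from math import prod

# Hypothetical ownership edges: (owner, owned) -> fraction of the owned entity held.
stakes = {
    ("politician", "company_1"): 0.40,
    ("company_1", "company_4"): 1.00,
    ("company_4", "company_5"): 0.60,
}


def effective_stake(path, stakes):
    """Multiply the stake on each consecutive hop of an ownership path."""
    return prod(stakes[(owner, owned)] for owner, owned in zip(path, path[1:]))


path = ["politician", "company_1", "company_4", "company_5"]
share = effective_stake(path, stakes)
print(f"{share:.0%}")
```

This covers a single path only. If an owner reaches the same company through several paths, the per-path products would need to be summed.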

Finally, what brings the web full circle is that company 5, which we’ve established has a considerable tie to this politician, operates in one of the industries that the initial bill benefits. What we’ve established here is a means to look that much deeper into the intersection of lawmaking and the businesses it may benefit. Setting aside whether the intent is malicious or not, individuals at the federal and even state levels often have numerous spheres of influence that extend far and wide. To keep the legislative process fair and transparent, graph databases can be used to identify those relationships and determine whether a person in authority has a vested interest in the outcome of a bill becoming law. That formerly obscured connection is shown below:
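As a rough illustration, separate from the diagram, the search for that full-circle connection can be sketched as a walk over ownership edges that stops wherever a company operates in an industry the bill affects. All names here (`owns`, `operates_in`, `industry_a`, and so on) are hypothetical stand-ins, not a real schema.

```python
# Hypothetical data: who owns whom, and which industry each company operates in.
owns = {
    "politician": ["company_1"],
    "company_1": ["company_4"],
    "company_4": ["company_5"],
}
operates_in = {"company_1": "industry_c", "company_5": "industry_a"}
bill_affects = {"industry_a", "industry_b"}


def ownership_paths_into(start, owns, operates_in, target_industries):
    """Depth-first walk of ownership edges; collect every path that ends at a
    company operating in one of the target industries."""
    hits = []
    stack = [[start]]
    while stack:
        path = stack.pop()
        for child in owns.get(path[-1], []):
            if child in path:  # guard against circular ownership
                continue
            new_path = path + [child]
            if operates_in.get(child) in target_industries:
                hits.append(new_path)
            stack.append(new_path)
    return hits


conflicts = ownership_paths_into("politician", owns, operates_in, bill_affects)
print(conflicts)
```

Each returned path is a candidate conflict of interest that a reviewer can then weigh, for example by the effective stake along it.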

Use Case 2: The Ripple Effect

For this scenario, let’s imagine a wave of mad cow disease sweeping through Midwest farms, causing cattle to become sick across the region. One can conclude that this severely affects agriculture, certain commodity prices, restaurants, and more, but to what extent? Would certain sectors or companies be impacted more severely than others? How would one go about assessing the magnitude of the impact or risk?

We can begin by establishing a couple of basic relationships where the disease has spread to two prominent beef farms in the Midwest. Details such as each farm’s geographic location, number of infected animals, or a relative impact score can all be added as properties on the nodes and relationships.

Then we can add a number of other dimensions to this graph, including who buys directly from these farms, which companies are intermediaries between the farms and a larger exchange, and the second- and third-degree connections that could be impacted by this event.

The power of the graph lies in the efficiency and transparency with which it surfaces these relationships. Stakeholders at all levels can visualize the paths between this disease and an affected ETF, monitor prices on an exchange, or count the number of “hops” between their organization and one that is being impacted more severely.
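Counting hops is a shortest-path question. The sketch below uses a plain breadth-first search over a small, made-up supply-chain graph; the entities and edges are assumptions for illustration, and in a graph database the same question would normally be asked with a path query rather than hand-written traversal code.

```python
from collections import deque

# Hypothetical, undirected supply-chain relationships.
edges = [
    ("mad_cow_outbreak", "farm_a"),
    ("mad_cow_outbreak", "farm_b"),
    ("farm_a", "meat_packer"),
    ("farm_b", "meat_packer"),
    ("meat_packer", "restaurant_chain"),
    ("meat_packer", "commodity_exchange"),
    ("commodity_exchange", "beef_etf"),
]


def build_adjacency(edges):
    """Turn an edge list into a node -> neighbors mapping."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def hops_from(source, adjacency):
    """Breadth-first search: fewest hops from `source` to each reachable node."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return distance


hops = hops_from("mad_cow_outbreak", build_adjacency(edges))
print(hops["restaurant_chain"], hops["beef_etf"])
```

In this toy graph the restaurant chain sits three hops from the outbreak and the ETF four, which gives a first, coarse ordering of exposure.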

In this way an organization can begin establishing risk profiles, because what is considered unacceptable risk may differ from company to company. Some examples of criteria that different organizations may evaluate differently in the graph analysis include:

  • Entities must be a certain distance from one another

  • Entities cannot be in a given list of “risky addresses”

  • Companies cannot source more than a given threshold of their supply from one entity (diversity of supply chain)

  • There cannot be any company/subsidiary relationship or competing interests between similar companies (as described in use case 1)

You can customize the criteria and reach of your risk algorithm in the graph analysis, and the graph model also allows for flexibility as your requirements change. As new legislation or compliance requirements emerge, organizations can add new nodes and add or modify relationships and properties so that the graph model adjusts to the evolving context.
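One way to picture organization-specific criteria is as a small, configurable profile that is checked against facts pulled from the graph. The sketch below covers three of the criteria listed above; the `RiskProfile` class, its field names, and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class RiskProfile:
    """Thresholds one organization might choose; all values are illustrative."""
    min_hops: int = 2
    risky_addresses: set = field(default_factory=set)
    max_supply_share: float = 0.5


def violations(profile, hops_to_flagged, address, supply_shares):
    """Return the criteria an entity fails under the given profile."""
    found = []
    if hops_to_flagged is not None and hops_to_flagged < profile.min_hops:
        found.append("too close to a flagged entity")
    if address in profile.risky_addresses:
        found.append("listed risky address")
    if any(share > profile.max_supply_share for share in supply_shares.values()):
        found.append("supply too concentrated")
    return found


strict = RiskProfile(min_hops=3, risky_addresses={"1 Example Way"}, max_supply_share=0.4)
lenient = RiskProfile(min_hops=1, max_supply_share=0.8)

entity = dict(
    hops_to_flagged=2,
    address="22 Sample Road",
    supply_shares={"farm_a": 0.6, "farm_b": 0.4},
)
strict_result = violations(strict, **entity)
lenient_result = violations(lenient, **entity)
print(strict_result, lenient_result)
```

The same entity fails two checks under the strict profile and none under the lenient one. When requirements change, only the profile needs to change.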

Even with these simple examples, it becomes evident that graphs can be an accessible yet powerful tool for modeling the multi-faceted issues that businesses face every day. Whether it be evaluating the extent of a conflict of interest, assessing the impact of a catastrophic event such as disease or natural disaster, or building related risk profiles to establish action thresholds and alerts, graph databases can move organizations from manually intensive, hunch-based risk management to automated, fact-based risk mitigation.


Graphable delivers insightful graph database (e.g. Neo4j), machine learning (ML), and natural language processing (NLP) projects, as well as graph and Domo analytics, with measurable impact. We are known for operating ethically, communicating well, and delivering on time. With hundreds of successful projects across most industries, we thrive in the most challenging data integration and data science contexts, driving analytics success.

Want to find out more about the Hume knowledge graph / insights platform? As the Americas exclusive reseller, we are happy to connect and tell you more. Book a demo by contacting us here.

