Powering Businesses with Knowledge Graphs
At its core, a graph is a mathematical representation of relationships between pairs of objects or entities. These entities are usually called nodes or vertices, while the relationships between them are known as edges. Most will point to the problem of the Seven Bridges of Königsberg as the first application of graphs to a navigation problem around a city. The goal was to find a walking route through the city that would cross every bridge exactly once. Leonhard Euler, who is credited with this first graph solution, reduced the routing problem by representing the land masses as nodes and each bridge as an edge connecting two nodes. This representation let him reason about the sequence of bridge crossings from one land mass to the next, and he showed that no such route exists: all four land masses are touched by an odd number of bridges, while a valid route allows at most two.
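The degree-counting argument behind the bridge problem can be sketched in a few lines. This is a minimal illustration, not a historical reconstruction: the single-letter land mass labels and the function names are assumptions, and the check assumes the graph is connected.

```python
from collections import Counter

# The four land masses (nodes) and seven bridges (edges) of Königsberg.
# "A" is the central island; the labels are illustrative, not from the article.
BRIDGES = [
    ("A", "B"), ("A", "B"),  # two bridges between the island and one bank
    ("A", "C"), ("A", "C"),  # two bridges between the island and the other bank
    ("A", "D"),
    ("B", "D"),
    ("C", "D"),
]

def odd_degree_nodes(edges):
    """Return the nodes touched by an odd number of edges."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(node for node, d in degree.items() if d % 2 == 1)

def has_euler_walk(edges):
    """For a connected graph, a walk that uses every edge exactly once
    exists only when zero or two nodes have odd degree."""
    return len(odd_degree_nodes(edges)) in (0, 2)

print(odd_degree_nodes(BRIDGES))  # ['A', 'B', 'C', 'D']
print(has_euler_walk(BRIDGES))    # False
```

Because all four land masses have an odd number of bridges, the check fails, which matches the classic result that no such walking route exists.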
Graphs serve functions well beyond simplifying information for way-finding and routing. Edges between nodes can be assigned distances or given direction. Nodes can be treated like physical objects, sized and given mass. By treating nodes and edges as objects, gravity can be added to the system to adjust the layout of the graph like a system of planets. What makes graphs interesting is that they can be used as a mathematical representation of space and at the same time represent the mathematical constructs from space. The underlying structure of graphs allows solutions to be built using linear algebra (finding null spaces) and geometry (angles). After all, the neural network models used for machine learning are, at their core, graphs.
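To make the idea of edges with distance and direction concrete, here is a small sketch that stores a directed, weighted graph as nested dictionaries and runs Dijkstra's algorithm over it. The node names, the distances, and the function name `shortest_distances` are all assumptions made for illustration.

```python
import heapq

# Directed, weighted edges: node -> {neighbor: distance}.
# All names and distances are illustrative.
graph = {
    "A": {"B": 4, "C": 1},
    "B": {"D": 1},
    "C": {"B": 2, "D": 5},
    "D": {},
}

def shortest_distances(graph, source):
    """Dijkstra's algorithm: shortest distance from source to each reachable node."""
    dist = {source: 0}
    queue = [(0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry, a shorter path was already found
        for neighbor, weight in graph[node].items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return dist

print(shortest_distances(graph, "A"))  # {'A': 0, 'B': 3, 'C': 1, 'D': 4}
print(shortest_distances(graph, "D"))  # {'D': 0}
```

The second call shows why direction matters: every node is reachable from A, but no edge leads out of D.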
What is a Knowledge Graph?
There is no broad consensus on what a knowledge graph actually is. In the most general sense, a knowledge graph is a method of storing data as related entities. At their core, knowledge graphs are a machine representation of human knowledge. We all remember and convey knowledge by piecing together "data" from disparate sources of information and building relationships of different types between them. Perhaps the most widely recognized implementation is Google's Knowledge Graph, which uses connected data to populate the infoboxes, or knowledge panels, to the right of its search results. Getting these to show up for a search query requires a little time spent on schema.org to learn the standardized naming conventions used to connect data in JSON-LD format. To get a feel for these, you can paste sample JSON-LD into Google's Rich Results Test and experiment with the output it generates.
For a more tangible example, consider how a local business could be represented as a knowledge graph from the perspective of schema.org.
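As a rough sketch of that perspective, the snippet below builds a schema.org-style description of a restaurant as a Python dictionary, serializes it as JSON-LD, and flattens it into subject, predicate, object triples. The business details are made-up placeholders, and `to_triples` is a hypothetical helper that simplifies heavily compared with real JSON-LD processing.

```python
import json

# All business details below are made-up placeholders for illustration.
local_business = {
    "@context": "https://schema.org",
    "@type": "Restaurant",
    "name": "Example Bistro",
    "address": {
        "@type": "PostalAddress",
        "streetAddress": "123 Example Street",
        "addressLocality": "Springfield",
        "addressCountry": "US",
    },
    "servesCuisine": "French",
    "telephone": "+1-555-0100",
}

def to_triples(entity, subject):
    """Flatten a nested dict into (subject, predicate, object) triples.

    A simplification for illustration, not a JSON-LD processor: nested
    objects become new nodes named after the path that leads to them.
    """
    triples = []
    for key, value in entity.items():
        if key == "@context":
            continue  # the context names the vocabulary, it is not an edge
        if isinstance(value, dict):
            child = f"{subject}/{key}"
            triples.append((subject, key, child))
            triples.extend(to_triples(value, child))
        else:
            triples.append((subject, key, value))
    return triples

print(json.dumps(local_business, indent=2))
for triple in to_triples(local_business, "biz"):
    print(triple)
```

Each triple is one edge in the graph: the business node points to its name, its cuisine, and a separate address node that carries its own properties.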
Knowledge Graphs Mirror Human Cognition
A point of caution: knowledge graphs are not designed to be transactional storage systems, but rather higher-order systems that distill information from our transactions. No human remembers all the conditions surrounding them every time they perform addition, such as the time of day, the weather, or whether the addition was done with pen and paper or on a calculator. The only knowledge the brain needs to retain is that 2 plus 2 equals 4. Any remaining information is irrelevant and does not have to be stored for the person to know the result of adding 2 and 2 together. Effectively, acquiring knowledge is a process in which repeated practice distills key pieces of information that get connected and retained, while irrelevant data is erased from memory.
The same holds true for businesses seeking to leverage knowledge graphs. As an example, take a restaurant that is trying to build a knowledge graph connecting different wines with menu items so that pairings can be personalized for different customers. One way of solving this problem would be to link ingredients and flavor profiles with wine bouquets manually using human knowledge. While time-consuming, expert sommeliers and chefs working together could use their knowledge to create relationships in a spreadsheet that can be built into the knowledge graph. A small subsection of the graph will look like the image below. For example, the chefs and sommeliers could tag earthy red wines to be paired with steak and link a dry white wine with the seabass, one wine and one entree at a time. These subject matter experts can even rate the strength of the pairings, and these ratings can be stored on the relationships (indicated by the thickness of the lines).
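A minimal sketch of such expert-curated relationships, assuming each pairing is stored as a weighted edge between a wine and an entree. The wine names, entree names, strength values, and the `wines_for` helper are all invented for illustration.

```python
# Expert-curated pairings stored as weighted edges: (wine, entree) -> strength.
# Names and strengths (on a 0 to 1 scale) are invented for illustration.
pairings = {
    ("Earthy Pinot Noir", "Steak"): 0.9,
    ("Dry Riesling", "Seabass"): 0.8,
    ("Earthy Pinot Noir", "Seabass"): 0.3,
}

def wines_for(entree, edges):
    """Return (wine, strength) pairs for an entree, strongest pairing first."""
    matches = [(wine, strength) for (wine, dish), strength in edges.items() if dish == entree]
    return sorted(matches, key=lambda match: match[1], reverse=True)

print(wines_for("Seabass", pairings))
# [('Dry Riesling', 0.8), ('Earthy Pinot Noir', 0.3)]
```

The strength on each edge plays the role of the expert's rating, so a recommendation is simply the strongest edge attached to the entree.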
On the surface, wine-to-entree relationships reflect human knowledge, and we've now moved this expert knowledge from a single mind into our knowledge graph. Moving out from our simple wine-to-entree relationship, we can now make a second-order connection between entree ingredients and the individual notes of the wine bouquet. With this second-order connection, we can add new wines to the menu nightly without having to manually review each one, using their tasting notes and the dish ingredients to infer the best pairings between them. The expert sommelier is unlikely to memorize every relationship between entree and wine, but has a fundamental understanding of the collection of ingredients that pair with the collective notes of any particular bouquet. To replicate this, new entrees or wines can be integrated into the knowledge graph, and different similarity metrics can be used to find the best pairings for them.
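One way to sketch this second-order inference is with Jaccard similarity, which is only one of many possible similarity metrics and is chosen here for simplicity. The example assumes wines and entrees are tagged from a shared flavor vocabulary; the tags and the `best_wine` helper are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity: shared tags divided by all distinct tags."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical tags drawn from a shared flavor vocabulary.
wine_notes = {
    "Pinot Noir": {"earthy", "cherry", "mushroom"},
    "Sauvignon Blanc": {"citrus", "herbal", "mineral"},
}
entree_flavors = {
    "Mushroom Steak": {"earthy", "mushroom", "umami"},
    "Lemon Seabass": {"citrus", "herbal", "briny"},
}

def best_wine(entree, wines, entrees):
    """Pick the wine whose tasting notes overlap most with the entree's flavors."""
    flavors = entrees[entree]
    return max(wines, key=lambda wine: jaccard(wines[wine], flavors))

print(best_wine("Mushroom Steak", wine_notes, entree_flavors))  # Pinot Noir
print(best_wine("Lemon Seabass", wine_notes, entree_flavors))   # Sauvignon Blanc
```

A new wine only needs its tasting notes added to `wine_notes` to be considered, with no manual review of each entree.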
Another way of building a knowledge graph is to use customer history to find the best relationships between wines and entrees. The knowledge graph itself would not need to store every transaction from every customer. Instead, it would store only the historical purchase patterns of customers and their ratings. By finding the highest-rated customer experiences, the probability that an entree should pair with a given wine can be determined without explicit knowledge of the relationships between entree ingredients and wine bouquet. This process can be done using analytics and requires minimal human intervention, letting patterns within the data guide the way. The caveat is that this approach assumes the restaurant patron is the best source of knowledge in this space and that the aggregate behavior of consumers will reveal the best pairings overall.
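The distillation from transactions to pairing probabilities might look like the sketch below. The order history, the rating threshold of 4, and the `pairing_scores` function are assumptions chosen for illustration; only the aggregated scores would need to live in the graph.

```python
from collections import defaultdict

# Hypothetical order history: (wine, entree, customer rating from 1 to 5).
orders = [
    ("Pinot Noir", "Steak", 5),
    ("Pinot Noir", "Steak", 4),
    ("Riesling", "Steak", 4),
    ("Riesling", "Steak", 2),
    ("Riesling", "Seabass", 5),
    ("Pinot Noir", "Seabass", 2),
]

def pairing_scores(history, min_rating=4):
    """For each entree, the share of highly rated orders that went to each wine."""
    counts = defaultdict(lambda: defaultdict(int))
    for wine, entree, rating in history:
        if rating >= min_rating:  # keep only the highest-rated experiences
            counts[entree][wine] += 1
    scores = {}
    for entree, by_wine in counts.items():
        total = sum(by_wine.values())
        scores[entree] = {wine: n / total for wine, n in by_wine.items()}
    return scores

for entree, by_wine in pairing_scores(orders).items():
    print(entree, by_wine)
```

Here two of the three highly rated steak orders included Pinot Noir, so it receives two thirds of the weight for steak, while the poorly rated orders drop out entirely.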
Most importantly, the flexibility of the knowledge graph allows us to leverage all three of these options in whatever way is best for the business. We can customize the importance of the expert knowledge, the inferred matches, and the transaction history to tune the recommendations drawn from our highly connected data, with everything updating in real time and results returned almost instantly, even across the millions of wine and food pairings possible in the example above.
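Blending the three signals can be as simple as a weighted average. The sketch below assumes each signal has already been scaled to a value between 0 and 1; the weights, the scores, and the `blended_score` function are illustrative assumptions rather than recommended settings.

```python
def blended_score(expert, inferred, history, weights=(0.5, 0.3, 0.2)):
    """Weighted average of three pairing signals, each assumed to be in [0, 1]."""
    w_expert, w_inferred, w_history = weights
    total = w_expert + w_inferred + w_history
    return (w_expert * expert + w_inferred * inferred + w_history * history) / total

# Hypothetical signals for pairing two wines with one entree.
candidates = {
    "Pinot Noir": {"expert": 0.9, "inferred": 0.5, "history": 0.6},
    "Riesling": {"expert": 0.2, "inferred": 0.1, "history": 0.8},
}

def rank(candidates, weights=(0.5, 0.3, 0.2)):
    """Order wines from best to worst under the given signal weights."""
    return sorted(
        candidates,
        key=lambda wine: blended_score(**candidates[wine], weights=weights),
        reverse=True,
    )

print(rank(candidates))                     # ['Pinot Noir', 'Riesling']
print(rank(candidates, weights=(0, 0, 1)))  # ['Riesling', 'Pinot Noir']
```

Shifting all the weight to customer history flips the ranking, which is the kind of business-specific tuning the paragraph above describes.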
Unstructured Data and Knowledge Graphs
While beyond the scope of this article, Knowledge Graphs can also leverage unstructured data alongside structured data to achieve their full potential. If companies combine free-form text such as reviews, articles, 10-Ks, legislation, and technical documents, and connect the meaning and value of that data with their structured data, a Knowledge Graph has the potential to have a much wider and more significant impact on the business. By using methods from Natural Language Processing (NLP) to detect key topics, gauge sentiment, and extract additional context, hidden relationships can be discovered to drive new insights for the organization.
Where can Knowledge Graphs be Deployed?
Knowledge graphs can be deployed effectively in many places. From powering chatbots to content management, the potential for knowledge graphs to drive businesses is nearly limitless. Here is a shortlist of potential use cases for knowledge graphs in business:
Research and Development Tracking and Whitespacing
Augmented Intelligence for Personalization
Automated Content Tagging
Simply put, the ability to connect information together holds tremendous power in the digital space. In upcoming blogs we will take deeper dives into some of these use cases.