Operationalizing the PageRank Algorithm: Protein-Protein Interaction Graph Analysis [Video]

By Sean Robinson, MS / Lead Data Scientist

July 1, 2022

Watch our video below for a quick PageRank algorithm demonstration using a protein-protein interaction graph. Graphable Lead Data Scientist Sean Robinson explores how to operationalize centrality measures such as PageRank from the Neo4j Graph Data Science (GDS) library. The GraphAware Hume knowledge graph platform puts the value of graph-based analytics into the hands of everyday analysts and decision-makers through an easy-to-interpret visualization.

Video Transcript: PageRank Algorithm (Protein-Protein Interaction Graph Example)

Today we’re going to operationalize the PageRank Algorithm with a protein-protein interaction graph example. We’ll be using the Clinical Knowledge Graph (CKG) to surface which proteins are most important within a given protein complex. Then we’ll see which biological processes are impacted by those most influential proteins. 

Using PageRank algorithm on protein-protein interaction graph
Adding a Style With Graph Data Science Algorithms 

I’m going to start by looking up my complex in question. Let’s expand the proteins for this complex to see what our interaction network looks like. Here we can see a number of proteins, but there’s quite a bit of chaos to our visualization. Let’s put some order to that chaos using some graph data science algorithms.

I’m going to add a style to my proteins. Let’s base it on size, so that each protein node is scaled to help the most influential proteins stand out. I’m going to select my PageRank attribute, which I’ve previously calculated and written to my database so Hume will be able to access it.
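For readers who want to see what a stored PageRank score represents, here is a minimal Python sketch of PageRank by power iteration on a tiny interaction network. It is not the Neo4j GDS implementation used in the video; the protein names, the edge list, the damping factor of 0.85, and the choice to treat interactions as undirected are all assumptions made for illustration.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over an undirected interaction list.

    Assumption: each interaction is treated as a two-way link, so every
    node that appears in `edges` has at least one neighbor.
    """
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    n = len(neighbors)
    scores = {node: 1.0 / n for node in neighbors}  # start uniform

    for _ in range(iterations):
        new_scores = {}
        for node in neighbors:
            # Each neighbor passes on an equal share of its current score.
            incoming = sum(scores[m] / len(neighbors[m]) for m in neighbors[node])
            new_scores[node] = (1 - damping) / n + damping * incoming
        scores = new_scores
    return scores


# Hypothetical toy network: P1 interacts with every other protein.
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3")]
scores = pagerank(edges)

for protein, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{protein}: {score:.3f}")
```

In this toy network the hub protein P1 ends up with the highest score, which is the kind of signal the size style is meant to surface.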

I’m going to have it relative to my visualization (rather than globally across my entire database) to keep things in context in terms of what I’m seeing on the screen. I’m even going to increase my size multiplier a bit so that those most influential proteins really stand out. 
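To make the idea of relative sizing concrete, the sketch below rescales scores using only the nodes currently on screen, then applies a size multiplier. The video does not describe the formula Hume actually uses, so the min-max scaling, the function name `relative_sizes`, and the `base_size` and `multiplier` parameters are hypothetical.

```python
def relative_sizes(scores, visible, base_size=1.0, multiplier=3.0):
    """Scale node sizes relative to the visible nodes only.

    Assumption: min-max scaling over the visible subset. The smallest
    visible score maps to base_size and the largest maps to
    base_size + multiplier.
    """
    shown = {node: scores[node] for node in visible}
    lo, hi = min(shown.values()), max(shown.values())
    span = hi - lo
    sizes = {}
    for node, score in shown.items():
        fraction = (score - lo) / span if span else 0.0
        sizes[node] = base_size + multiplier * fraction
    return sizes


# Hypothetical scores; "D" is in the database but not on screen.
all_scores = {"A": 0.1, "B": 0.3, "C": 0.5, "D": 0.9}
sizes = relative_sizes(all_scores, visible=["A", "B", "C"])
print(sizes)
```

Because "D" is excluded, "C" receives the largest size even though it is not the highest-scoring node in the whole dataset, which mirrors the relative versus global choice described above.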

Now let’s add our rule. Our most influential proteins now clearly stand out, which is incredibly useful when we have a large, chaotic visualization.

Protein-protein interaction graph with increased size multiplier
Using Processes Action

Let’s select those proteins and take a look at what processes they might be impacting. I’ll clear out the rest of my visualization. Now we have a much simpler picture of our most influential proteins. I’ll select all of those proteins and use my Processes action to see which biological processes they’re affecting.

Using Bio Processes action to analyze most influential proteins

So here we can see they’re impacting rRNA processing, translation, and a number of other biological processes. This can be incredibly useful when we’re trying to do something like targeted drug discovery for one of these processes.

So I hope this has provided useful insight into how to use your graph data science algorithms to filter and clean up your protein-protein interaction graph visualizations. I encourage you to think about which other visualizations could be enhanced by combining them with graph algorithms in the same way. I look forward to seeing what you do!


Graphable delivers insightful graph database (e.g. Neo4j consulting), machine learning (ML), and natural language processing (NLP) projects, as well as graph and Domo consulting for BI/analytics, with measurable impact. We are known for operating ethically, communicating well, and delivering on time. With hundreds of successful projects across most industries, we thrive in the most challenging data integration and data science contexts, driving analytics success.

Want to find out more about our Hume consulting on the Hume knowledge graph / insights platform? As the Americas principal reseller, we are happy to connect and tell you more. Book a demo by contacting us here.

Check out our article, What is a Graph Database? for more info.

