Overview: What Is Neo4j (Graph Database)?
Neo4j is the leading graph database. It's available as open source and as a fully supported commercial offering. Created in 2007, Neo4j is a NoSQL database that's Java-based, schema-optional and massively scalable. So what is Neo4j? It's commonly viewed as the best enterprise-ready graph database available today. In this article, we discuss graph databases and examine the features and advantages of Neo4j.
What Is a Graph Database?
Fundamentally, graph databases store data in the form of graphs. A graph is a mathematical concept that describes elements in terms of vertices (nodes) and edges (relationships) to understand connections and patterns within the information being studied. When using a graph database like Neo4j, these graphs are often represented visually.
Graph databases are a relatively new class of database leveraged for use cases that are particularly focused on connectedness within data. In other words, while graph databases store data as nodes and edges, they focus more heavily on the relationships that are often hidden among the many elements within masses of data. In a graph database, relationships are first-class citizens along with data objects.
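To make the idea of relationships as first-class citizens concrete, here is a minimal in-memory sketch in Python. It is not Neo4j's API or storage format; the `Node`, `Relationship` and `Graph` classes and the sample data are assumptions made purely for illustration. The point is that a relationship is stored as its own object, with a type and properties, rather than being implied by matching keys.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A vertex with a label and free-form properties (illustrative only)."""
    node_id: int
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """An edge stored as its own object, with a type and properties."""
    start: int
    end: int
    rel_type: str
    properties: dict = field(default_factory=dict)


class Graph:
    """A toy graph store; not how Neo4j stores data on disk."""

    def __init__(self):
        self.nodes = {}
        self.relationships = []

    def add_node(self, node_id, label, **properties):
        self.nodes[node_id] = Node(node_id, label, properties)

    def relate(self, start, end, rel_type, **properties):
        self.relationships.append(Relationship(start, end, rel_type, properties))

    def outgoing(self, node_id, rel_type):
        """Return the nodes reached from node_id via relationships of rel_type."""
        return [
            self.nodes[rel.end]
            for rel in self.relationships
            if rel.start == node_id and rel.rel_type == rel_type
        ]


graph = Graph()
graph.add_node(1, "Person", name="Ada")
graph.add_node(2, "Person", name="Grace")
graph.add_node(3, "Company", name="Acme")
graph.relate(1, 2, "KNOWS", since=2019)   # the relationship carries its own data
graph.relate(1, 3, "WORKS_AT")

known_names = [node.properties["name"] for node in graph.outgoing(1, "KNOWS")]
print(known_names)
```

Because the `KNOWS` relationship is a stored object, it can hold properties such as `since` and can be queried by type, just like a node.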
While Neo4j is often referred to as schemaless, a better term is "schema optional" since a schema can be used but is not required. (See What is GraphAware Hume? for more information on creating Neo4j schemas.)
Why Are Graph Databases Valuable?
We live in a world that's becoming more and more connected. As a result, our data is becoming more connected as well. Given the volume of data that's produced globally each day, the relationships inside the data are fast becoming more valuable than the data itself. The unique value of graph databases comes from their ability to surface new interconnected knowledge, natively and at scale, as analytical insights that have material impacts on businesses and other organizations.
Can’t RDBMS Do This?
With enough development time and compute power, an RDBMS can do many things for which it's not ideally suited. Unlike graph databases, traditional relational databases do not natively store relationships among data sets. Rather, an RDBMS stores only the data itself and must calculate relationships at run time, typically through joins. This is time-consuming and compute-expensive when the same information can be returned in milliseconds from a simple graph query. Practically speaking, an RDBMS is ill suited for many use cases in which Neo4j and other graph databases excel.
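The difference between computing relationships at run time and storing them can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a benchmark: the tables, the `PersonNode` class and both functions are hypothetical, and a real RDBMS would use indexes and a query optimizer rather than the naive scans shown here.

```python
# Relational style: rows live in tables, and the relationship is
# recomputed at query time by matching key values.
people = [(1, "Ada"), (2, "Grace"), (3, "Linus")]
friendships = [(1, 2), (2, 3)]  # (person_id, friend_id) key pairs


def friends_via_join(person_id):
    """Resolve friends by scanning the link table, then the people table."""
    names = []
    for left, right in friendships:
        if left == person_id:
            for pid, name in people:
                if pid == right:
                    names.append(name)
    return names


# Graph style: each node holds direct references to its neighbours,
# so a traversal follows stored connections instead of matching keys.
class PersonNode:
    def __init__(self, name):
        self.name = name
        self.friends = []


ada, grace, linus = PersonNode("Ada"), PersonNode("Grace"), PersonNode("Linus")
ada.friends.append(grace)
grace.friends.append(linus)


def friends_of_friends(node):
    """Walk two hops by following references; no table lookups are needed."""
    return [fof.name for friend in node.friends for fof in friend.friends]


join_result = friends_via_join(1)
graph_result = friends_of_friends(ada)
print(join_result, graph_result)
```

In the join version, every extra hop adds another round of key matching against whole tables. In the graph version, each hop costs only as much as the number of neighbours actually visited.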
Leading Graph Databases
While Neo4j is the most mature and well-adopted graph database in the world by a significant margin, there are dozens of others available, which are generally divided into native (e.g. Neo4j) and multimodel (e.g. CosmosDB). For more detail on the differences between native and multimodel graph databases, check out our article where we discuss Neo4j architecture. At present, the top ten graph databases across both native and multimodel include:
- Microsoft CosmosDB (certainly over-ranked due to CosmosDB multi-model SQL Server popularity)
- Amazon Neptune
Principal Neo4j Features
- Cypher. A familiar SQL-like query language
- LPG. Labelled property graph model
- Native graph. Includes native graph storage and native graph processing engine (GPE)
- Index-optional. Index-free adjacency drives performance; Neo4j can also leverage indices using Apache Lucene
- Schema optional. Neo4j offers a schema-optional interaction made possible by the nature of the data and the storage itself; there are use cases for deploying a schema, but certainly not in all cases
- UNIQUE. Supports UNIQUE constraints
- Strong UI. Neo4j Data Browser is an easy way to execute Cypher commands
- ACID compliant. Neo4j's database integrity is founded on atomicity, consistency, isolation and durability
- Exports. Supports exporting to JSON and Excel
- API access. Neo4j exposes a REST API that any REST-capable client can call (e.g., Java, Spring), plus two Java APIs: a native Java API and a Cypher API
- Sharding. Sharding is Neo4j's advanced distributed computing offering, available as of version 4.0
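Two of the features above, schema-optional storage and UNIQUE constraints, work together in a way that a short sketch can illustrate. The Python below is an assumption-laden toy, not Neo4j's implementation or API: `SchemaOptionalStore`, `require_unique` and `ConstraintViolation` are hypothetical names. It shows nodes with differing property sets being accepted without any schema, while a uniqueness rule is enforced only once it has been declared.

```python
class ConstraintViolation(Exception):
    """Raised when a declared uniqueness rule would be broken (hypothetical)."""


class SchemaOptionalStore:
    """Toy node store: no schema is required, but constraints can be added."""

    def __init__(self):
        self.nodes = []    # list of (label, properties) pairs
        self.unique = {}   # (label, property) -> set of values already in use

    def require_unique(self, label, prop):
        """Declare that `prop` must be unique among nodes with `label`."""
        seen = set()
        for node_label, props in self.nodes:
            if node_label == label and prop in props:
                if props[prop] in seen:
                    raise ConstraintViolation(f"existing duplicate {prop}")
                seen.add(props[prop])
        self.unique[(label, prop)] = seen

    def create_node(self, label, **props):
        """Store a node, rejecting it if it breaks a declared constraint."""
        applicable = [
            (c_prop, seen)
            for (c_label, c_prop), seen in self.unique.items()
            if c_label == label and c_prop in props
        ]
        # Validate every constraint before recording anything.
        for c_prop, seen in applicable:
            if props[c_prop] in seen:
                raise ConstraintViolation(f"{label}.{c_prop} must be unique")
        for c_prop, seen in applicable:
            seen.add(props[c_prop])
        self.nodes.append((label, props))


store = SchemaOptionalStore()
# No schema declared: nodes with different property sets are both accepted.
store.create_node("Person", name="Ada")
store.create_node("Person", name="Grace", email="grace@example.com")

# Opt in to a rule only where the use case calls for one.
store.require_unique("Person", "email")
try:
    store.create_node("Person", name="Imposter", email="grace@example.com")
    duplicate_rejected = False
except ConstraintViolation:
    duplicate_rejected = True

print(duplicate_rejected, len(store.nodes))
```

The design choice mirrored here is that structure is opt-in: data can be loaded freely at first, and rules are added later for the properties that need them.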
Top Advantages of Neo4j (Graph Database)
- First mover. Neo4j is more widely adopted in the market than any other solution. In fact, Neo4j Founder Emil Eifrem actually coined the term "graph database".
- Community. Neo4j has a thriving user community, active forums, deep documentation and resources for any graph database-related questions.
- Performance. Neo4j is one of the very few true native graph databases, which enables index-free adjacency for massive performance gains.
- Availability. From massive real-time applications to analytically focused graph and graph data machine learning apps, Neo4j sets the bar for meeting HA requirements.
- ACID Compliant. Neo4j's performance across both read and write has been demonstrably scaled to the enterprise. It has also maintained integrity with true ACID compliance, which is still lacking in most other offerings.
- Easy access. It's easy to interact with Neo4j, whether through the Neo4j Browser UI with Cypher query language (which is much easier than alternatives like Gremlin) or through the Java API.
- Unstructured / semi-structured / textual data. Deriving value from the absolutely massive and growing amount of unstructured textual data available today has never been easy. Given the unique capacity for graphs to embed meaning and connect concepts, Neo4j is the ideal solution for driving insight. Additionally, Neo4j is a leader in natural language processing (NLP) with graphs.
- Graph data science. Neo4j is the commercial leader in data science with graphs, including NLP use cases. Given that Google has asserted the future of data science is being built around network theory / graph databases, this is a significant advantage for Neo4j.
When to Use Neo4j
What is Neo4j used for? While Neo4j can be used for most use cases, its unique value becomes evident for use cases that are focused on connectedness within the data. Some of the most common use cases include:
Close to 1,000 commercial customers and nearly 5,000 startups use Neo4j. Some of the well-known commercial brands that use Neo4j include:
- The National Geographic Society
- US Army
- Thomson Reuters
- Volvo Cars