Integrated enterprises are the organizations capable of implementing change and surviving unexpected crises such as COVID-19. Across the board, the resounding answer seems to be that Artificial Intelligence (AI) is the critical capability that provides an edge over the competition. Whether deployed to deliver a better customer experience, optimize the supply chain, or make better demand forecasts, AI clearly has the potential to deliver value to any organization, provided it is deployed in production.
“Through 2020, 80% of AI projects will remain alchemy, run by wizards whose talents will not scale in the organization.” — Gartner
Despite the promise of AI, very few data science initiatives ever make it into production. Beyond Gartner, VentureBeat reports that 87% of data science projects will fail to make it to production, providing zero return on investment. Gartner also lists three barriers to AI adoption: 1) skills gap; 2) fear of the unknown; and 3) data scope and quality. From our experience at Graphable, these three barriers hold true from a 30,000-foot perspective. But more fundamental barriers exist on the ground that have to be overcome. We are going to unpack three critical silo walls that affect the day-to-day implementation of enterprise AI.
1. Not Understanding that Information Matters
Very few organizations value the information they have as a true enterprise asset. Where inventory of physical products is given a dollar value, data sitting in databases is generally viewed as a cost center with license fees and server or cloud costs. Instead of receiving the same value and care as physical products, data is treated as a liability rather than an asset. As an example, poorly made products sitting in the warehouse are marked down and dumped, but poorly managed, low-quality data can be left to sit untouched and unchecked.
A critical piece of the puzzle is understanding that business transactions of all kinds involve the transfer of information. From the beginnings of information theory, Claude Shannon understood that not only does information matter, information IS matter. Since the days of World War II, Shannon had understood that the same scientific constructs, rules, and laws that govern physical systems apply to information. Think about the earliest business transactions: barter trades in the earliest civilizations. Every time furs and shells changed hands, there was an exchange of information in the form of design, language, and raw materials.
In today’s digital business environment, the ability to track every decision and transaction in the organization is critical. Yet we see so many businesses continue to view data movement and management as a burden, which in turn delays the development of AI solutions. Without centralized data sources that are managed and governed appropriately, data science efforts stall for want of the very features needed to build models.
2. Defining Data Science By “The How” Over “The What”
Any internet search on data science seems to lead to articles and blog posts on what “real” data science is. Understandably, there is a need in the data science community to distinguish itself from analysts. Some capture this as a path from descriptive statistics (what has happened) to inferential statistics (what changed) up to predictive analytics (what should happen). Others might even define the “true” data scientist by whether they use R or Python. A common tendency is to treat the use of machine learning, rather than statistical regression methods, as the key differentiator between data science and AI on one side and analytics on the other.
The unfortunate outcome of what can only be described as “tech tribalism” is that data science tends to become a solution in search of a problem. Instead of starting with simple solutions that are augmented over time, data science teams start off on large-scale machine learning projects. After months of infrastructure building, data cleansing, feature engineering, and model training and testing, these projects end up getting shut down because the mounting development costs can no longer be justified. Data science teams should be focused on directly solving business problems efficiently from the outset, then building in additional capabilities and enhancements over time.
3. Not Viewing Data Science as a Key Business Function
Compounding the “how” vs. “what” problem, many businesses treat any function involving data as living exclusively in the realm of information technology. Unfortunately, this causes data science to be treated as a form of tech support rather than a core business function. In order to “stay in their lane”, data science teams are incentivized to focus on platform and algorithm building instead of engaging with business teams and having a say in how the business could be driven forward. This leaves the “wizards” with minimal interest in engaging with the very business functions that data science should be supporting.
This separation of data science teams from business execution creates a different kind of skills gap than the one that is widely reported. Usually, businesses struggle to find tech talent to develop machine learning processes and algorithms. At the same time, there is also a shortage of data science experts who have sufficient depth of business understanding. Another missing piece of the skills puzzle is the ability to articulate the mathematical underpinnings of their approaches to business sponsors and partners. It is a common misunderstanding that business partners are math-averse. Simply put, most people do not need to hear about things like “hyperparameter tuning for gradient descent in a boosted tree” to understand what a mathematical model is doing and why it works. Data science teams of the future have no choice but to become part of the business functions they support, and to communicate with the business in understandable terms, in the context of the business.
Solution: Understanding that Businesses are Networks
At its core, every business is a network. Information flows from marketing through ads to the consumer. When a purchase happens, there is a flow of information in the form of a product to the customer, who in turn shares feedback in the form of ratings and reviews back to the business. The manufacturer transmits information in the form of inventory through the logistics network. Simply put, what you have is a set of interconnected business entities: a graph of nodes and relationships. Below is what a network for a fast food or quick-service restaurant chain might look like. The network shows the many interactions between internal and external dimensions that impact the business as a whole.
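As a rough sketch of the idea, such a network can be represented as nodes joined by typed relationships. The entities and relationship names below are invented for illustration; a production system would typically use a graph database rather than in-memory Python.

```python
from collections import defaultdict

# Hypothetical quick-service restaurant network: each edge is
# (source node, relationship type, target node).
edges = [
    ("Social Ads",          "PROMOTES",  "Spicy Chicken Sandwich"),
    ("TV Ads",              "PROMOTES",  "Spicy Chicken Sandwich"),
    ("Customer A",          "PURCHASED", "Spicy Chicken Sandwich"),
    ("Customer A",          "REVIEWED",  "Spicy Chicken Sandwich"),
    ("Poultry Supplier",    "SUPPLIES",  "Distribution Center"),
    ("Distribution Center", "STOCKS",    "Store 42"),
    ("Store 42",            "SELLS",     "Spicy Chicken Sandwich"),
]

# Build an adjacency list keyed by source node.
graph = defaultdict(list)
for src, rel, dst in edges:
    graph[src].append((rel, dst))

# One-hop view of everything that touches a single menu item,
# spanning marketing, customers, and the supply chain at once.
item = "Spicy Chicken Sandwich"
inbound = [(src, rel) for src, outs in graph.items()
           for rel, dst in outs if dst == item]
print(inbound)
```

The point of the sketch is that marketing, customer, and logistics entities all live in one connected structure, so a single query can cross those silos.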
Using a graph as the data backbone of the analytics team provides a platform for developing explainable, transparent insights that can be understood and made actionable by any part of the organization. By storing the data in the form of a network that mimics the way information moves across the enterprise, it becomes possible to understand the impacts of different dimensions of the organization simultaneously. The ability to analyze information along any path in the network, in any direction, allows the data science team to rapidly yield insights by leveraging the underlying structure of the network. Graph analytics allows the team to quickly assess how multiple dimensions of the organization might be impacting one another: for example, which marketing channel is driving the most sales for a particular menu item, or whether a particular menu item should be discounted given a price change in the commodities market.
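The marketing-attribution question above amounts to traversing the path from channel to customer to item. A minimal sketch, with entirely made-up channels, customers, and items:

```python
from collections import Counter

# Hypothetical edges: which channel reached which customer,
# and which customer bought which item.
exposures = [
    ("Email", "c1"), ("Email", "c2"),
    ("Social", "c2"), ("Social", "c3"),
]
purchases = [("c1", "Burger"), ("c2", "Burger"), ("c3", "Fries")]

# Traverse channel -> customer -> item to attribute Burger sales
# back to the channels that reached those buyers.
buyers = {cust for cust, item in purchases if item == "Burger"}
attributed = Counter(ch for ch, cust in exposures if cust in buyers)

print(attributed.most_common(1))  # the channel driving the most Burger sales
```

Real attribution models are far more nuanced, but the graph framing is the same: follow the relationship paths from one business dimension to another and aggregate along the way.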
Deeper dives into the latent structure of the graph can reveal even more about the business. Graph analytics can surface critical hubs in the network, pointing to areas where the organization might be over- or under-represented. Analyses of the structure of the enterprise network can also be used to test the resiliency of the supply chain by tracking the number of paths through which products can move from the manufacturer to the distribution center and eventually on to the customer. Community-identification methods can be deployed to find clusters of similar customers, which can then be used for viral marketing by linking ads to purchases. There are nearly endless possibilities for new, direct, and transparent methods of extracting insight from the enterprise information network.
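Two of these structural analyses, hub detection and path-based resiliency, can be sketched in a few lines. The toy supply-chain topology below is an assumption for illustration; real networks would use a graph database and proper centrality algorithms.

```python
from collections import defaultdict

# Hypothetical supply-chain edges (directed, manufacturer -> customer).
edges = [
    ("Manufacturer", "DC East"), ("Manufacturer", "DC West"),
    ("DC East", "Store 1"), ("DC East", "Store 2"),
    ("DC West", "Store 2"),
    ("Store 1", "Customer"), ("Store 2", "Customer"),
]

adj = defaultdict(list)
degree = defaultdict(int)
for src, dst in edges:
    adj[src].append(dst)
    degree[src] += 1   # out-degree
    degree[dst] += 1   # in-degree

# 1) Hub detection: the highest-degree node is a critical junction
#    whose failure would disrupt the most flows.
hub = max(degree, key=degree.get)

# 2) Resiliency: count distinct paths from manufacturer to customer.
#    More independent paths means a more resilient chain.
def count_paths(node, target):
    if node == target:
        return 1
    return sum(count_paths(nxt, target) for nxt in adj[node])

paths = count_paths("Manufacturer", "Customer")
print(hub, paths)
```

Here the network offers three distinct manufacturer-to-customer routes, so losing any one distribution center still leaves product flowing, which is exactly the kind of question the path count answers.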
Graphable delivers insightful graph database (e.g. Neo4j) / machine learning (ml) / natural language processing (nlp) projects as well as graph and Domo analytics with measurable impact. We are known for operating ethically, communicating well, and delivering on-time. With hundreds of successful projects across most industries, we thrive in the most challenging data integration and data science contexts, driving analytics success.