As the world becomes increasingly interconnected and systems increasingly complex, it is imperative to use technologies designed to take advantage of relationships and their dynamic characteristics. Businesses today face extremely complex challenges and opportunities that require more flexible and intelligent approaches.
The business graph framework for data scientists aims to improve predictions that lead to better decisions and innovation. Neo4j for Graph Data Science embeds the predictive power of relationships and network structures into existing data to answer previously intractable questions and increase the accuracy of predictions.
OpenGov Asia had the opportunity to speak with Dr. Alicia Frame, Senior Director of Product Management at Neo4j, to better understand graph data science.
Alicia is Neo4j's lead on all things graph data science: working closely with engineering to create a world-class platform for connected data science, collaborating with customers and practitioners to understand how graphs can be put into practice, and educating the data science community about the power of connections.
Neo4j is a graph company: its technology identifies connections within data so that insights can be derived from them. Without those connections, the data itself may not have actionable meaning. Organizations need connections to make sense of otherwise isolated data points.
Alicia differentiates a database platform from a data science platform. In a database, organizations store their data and query it to search for important things. Data science is about taking advantage of the connections between millions, if not billions, of data points. Graph data science leverages these connections to determine what is important and meaningful.
Industry requirements for data mining
There are three main requirements. First, as organizations accumulate more data, the speed at which they can access, retrieve, and interpret it becomes important, whether that is the speed of a query or the speed at which an algorithm runs.
The second thing is expressiveness. The more data there is, the more important it is that the data represents something meaningful. In a graph context, organizations need to structure data the same way it is represented in real life.
The final point is that the more data organizations have, the harder it is to know exactly what to look for in a data set. Having the tools to search for important patterns becomes crucial, so that end users can focus their effort on what matters instead of spending years sifting through useless information.
In OpenGov Asia's conversation with Nik Vora, Vice President, Asia-Pacific, he explains that graph technology is important because it can extract inherent value from the data itself. The purpose of the technology is to store information without restricting it to a predefined schema.
Alicia agrees with this. A graph data platform represents not only individual data points but all the connections between them. Traditional data storage can lose that critical piece of information, such as the relationship between two people or items. A graph data platform represents data faithfully; relationships and connections are preserved. When organizations access data through a query or a machine learning model, they always capture the core meaning without throwing important information away.
Graph Data Science
Graph Data Science is all about letting connected data speak for itself. An organization could run an unsupervised graph algorithm to find the signal in the noise: based on how the data is connected, the algorithm shows which nodes and concepts are the most important.
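To make "finding the most important nodes" concrete, here is a minimal sketch of one well-known unsupervised centrality measure, PageRank, written as plain power iteration. It is an illustration only and not Neo4j's implementation; the function name, the toy edge list, and the parameter defaults are all assumptions chosen for the example.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Score nodes by PageRank using plain power iteration."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    count = len(nodes)
    rank = {n: 1.0 / count for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is shared evenly.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - damping) / count + damping * dangling / count
        new_rank = {n: base for n in nodes}
        # Each node passes an equal share of its rank along every out-link.
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


# Hypothetical toy graph: (source, target) means "source points to target".
edges = [
    ("alice", "carol"),
    ("bob", "carol"),
    ("dave", "carol"),
    ("carol", "alice"),
    ("dave", "bob"),
]
scores = pagerank(edges)
most_important = max(scores, key=scores.get)
print(most_important)
```

Nothing in the input says which node matters; the ranking comes purely from how the nodes are connected.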
The same approach could be applied to a customer graph to show how communities of customers interact, information that is useful for segmentation.
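As a rough illustration of graph-based segmentation, the sketch below groups customers into connected components, the simplest form of community grouping. Production community detection typically uses more refined algorithms; the customer names and interaction pairs here are hypothetical.

```python
from collections import defaultdict, deque


def connected_components(edges):
    """Group nodes so that every member of a group can reach every other."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen = set()
    components = []
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`.
        seen.add(start)
        queue = deque([start])
        members = []
        while queue:
            node = queue.popleft()
            members.append(node)
            for nxt in sorted(neighbours[node]):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        components.append(sorted(members))
    return components


# Hypothetical customer interactions (undirected pairs).
interactions = [
    ("ana", "ben"),
    ("ben", "cy"),
    ("dee", "eli"),
    ("eli", "fay"),
    ("fay", "dee"),
]
segments = connected_components(interactions)
print(segments)
```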
Organizations could go further by doing supervised machine learning on the graph. This way they can predict how the graph will change in the future. Graph Data Science allows organizations to learn from the structure of the graph – not just from the people they are connected to, but from the entire graph. It predicts what relationship will form next. It’s about moving from knowing what to look for to highlighting what’s important and unusual to then predicting the future and what will change.
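One simple way to reason about "which relationship will form next" is the common-neighbours score: two unconnected nodes that share many neighbours are more likely to connect. The sketch below computes that score; it is a heuristic that is often used as an input feature for a supervised link prediction model, not a trained model in itself, and the data is hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def common_neighbour_scores(edges):
    """Score each unconnected pair by how many neighbours the two nodes share."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    scores = {}
    for a, b in combinations(sorted(neighbours), 2):
        if b in neighbours[a]:
            continue  # already connected, nothing to predict
        shared = len(neighbours[a] & neighbours[b])
        if shared:
            scores[(a, b)] = shared
    return scores


# Hypothetical existing relationships (undirected pairs).
friendships = [
    ("ana", "ben"),
    ("ana", "cy"),
    ("ben", "dee"),
    ("cy", "dee"),
    ("dee", "eli"),
]
scores = common_neighbour_scores(friendships)
# Highest score first; ties are broken alphabetically.
ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
print(ranked)
```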
Knowledge graphs and graph data science
Dr. Maya Natarajan, Senior Manager, Knowledge Graphs at Neo4j, believes that knowledge graphs are extremely useful for organizations trying to solve their business challenges. She says that what makes a knowledge graph unique is semantics, one of its key components and benefits.
The semantics are encoded with the data in the graph itself. This is how knowledge graphs embed intelligence into data and dramatically improve its value. Essentially, knowledge graphs increase the value of data through semantics by adding more context.
Knowledge graphs are often implemented as the first phase of graph data science. Alicia describes a knowledge graph as a heterogeneous graph, that is, a graph composed of different types of nodes, such as people, places, and objects.
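A heterogeneous graph can be pictured as nodes that carry a type label and relationships that carry a type name. The following sketch models that idea with plain Python data classes; the class names, labels, and relationship types are hypothetical and are not Neo4j's data model or API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    id: str
    label: str  # node type, e.g. "Person", "Place", "Object"


@dataclass
class KnowledgeGraph:
    nodes: dict = field(default_factory=dict)
    relationships: list = field(default_factory=list)

    def add_node(self, node_id, label):
        self.nodes[node_id] = Node(node_id, label)

    def relate(self, source, rel_type, target):
        # Both endpoints must already exist in the graph.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("add both nodes before relating them")
        self.relationships.append((source, rel_type, target))

    def neighbours(self, node_id, rel_type=None, label=None):
        """Follow outgoing relationships, optionally filtered by type or label."""
        found = []
        for source, kind, target in self.relationships:
            if source != node_id:
                continue
            if rel_type is not None and kind != rel_type:
                continue
            if label is not None and self.nodes[target].label != label:
                continue
            found.append(target)
        return found


# Hypothetical example data.
kg = KnowledgeGraph()
kg.add_node("ana", "Person")
kg.add_node("lisbon", "Place")
kg.add_node("laptop", "Object")
kg.relate("ana", "LIVES_IN", "lisbon")
kg.relate("ana", "OWNS", "laptop")

print(kg.neighbours("ana", label="Place"))
print(kg.neighbours("ana", rel_type="OWNS"))
```

Because the types live in the data itself, the same structure can answer "where does this person live?" and "what does this person own?" without a separate table per question.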
The first step in graph data science is to have a graph. The vast majority of Neo4j customers start with a Knowledge Graph to find out what information they have, how it relates to other concepts, and how it relates to their business issues.
Once they’ve built a Knowledge Graph, doing Graph Data Science is all about figuring out what problems they’re trying to solve, what questions they want to ask, and how they turn everything they know into accurate predictions.
Switch from reactive models to predictive models
Companies often start in a reactive phase. For example, organizations look for fraud only after it has happened and then find out who committed it. Alicia thinks this approach is useful but limited, because the ultimate goal is to prevent fraud rather than to catch fraudsters.
Moving to prediction means learning which patterns precede a certain outcome. Once organizations know the patterns associated with certain characteristics, they can derive accurate predictions.
Alicia illustrates predictive modeling with the example of a large global pharmaceutical company. The company has electronic medical records. For every patient it had data on, it could lay out the sequence of events observed in that patient's care pathway, and all of that data was connected in a graph.
What interests the company is learning from this information: who looks like someone who will benefit from certain interventions? Who benefits from this medicine? And who would benefit from this medicine in the future? Once it knows what the graph pattern looks like for someone who benefits from the drug, it can find people with similar characteristics and intervene early to improve patient outcomes.
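The idea of finding patients with similar characteristics can be sketched with Jaccard similarity, which compares the sets of events two patients share. This is only an illustrative stand-in for whatever similarity method the company actually used, which the interview does not specify; the patient identifiers and event names are invented.

```python
def jaccard(a, b):
    """Share of events two patients have in common, between 0.0 and 1.0."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_similar(target, records, top_n=2):
    """Rank the other patients by similarity to `target`, best first."""
    scored = [
        (jaccard(records[target], events), name)
        for name, events in records.items()
        if name != target
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(name, round(score, 2)) for score, name in scored[:top_n]]


# Hypothetical care pathways: patient -> set of observed events.
pathways = {
    "patient_a": {"diagnosis_x", "lab_test", "drug_1"},
    "patient_b": {"diagnosis_x", "lab_test", "drug_2"},
    "patient_c": {"diagnosis_y", "surgery"},
    "patient_d": {"diagnosis_x", "lab_test", "drug_1", "follow_up"},
}
similar = most_similar("patient_a", pathways)
print(similar)
```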
In closing, Alicia says she has been using Neo4j for over 10 years. Neo4j was the first graph data platform in existence and, without a doubt, the first graph data science platform. On top of the solid foundations of a database, it offers a powerful and scalable data science platform for the enterprise.
Neo4j has tested its products on tens of billions of nodes to ensure that its algorithms terminate, give the correct answer, and are easy to use. When organizations combine a mature, long-standing database product with innovative data science, they get predictive capabilities together with the ability to operate them at scale. Neo4j meets the bar for maturity, scalability, speed, and feature completeness.
For more information, visit https://neo4j.com/product/graph-data-science/