• Neela Sakhardande

Graph Technology & Graph Databases

To get ahead of the curve, companies are investing heavily in AI for enhanced personalisation and a better customer experience. AI has enormous potential, but it is still in its early stages and has a long way to go before it can, for example, grasp social and contextual cues. In the meantime, AI systems are only as good as the data used to train them.


Connections are Important!


Today's software does not possess super-intelligence, and it has no understanding of context. No matter how big or fast the computers you run AI on, it is almost impossible to get meaningful results when one crucial element is missing: relationships. To turn data into actionable insights, you need to understand how the pieces relate to one another, i.e. the data needs to be connected. This is where graph technology comes into the picture.


Graph Technology

Have you seen those movies where the camera zooms out from a detective board, showing all the pictures, notes and news articles connected by thumbtacks and yarn? You can immediately see the power of connecting the dots. Now imagine taking this detective board and applying a mathematical engine that can query the relationships in its data. That is a graph database!


Let's compare this with the familiar territory of relational databases. A defining attribute of a relational database is how strictly it constrains its relationships, which makes it ideal for processing transactions and building deterministic analytics. However, those same constraints often make distant, multi-step relationships awkward to express and explore, since each extra step typically means another join.


For example, assume you have all the data ever gathered on university students and professors. Now suppose you want to know how a group of 10 students from 10 completely different universities are related. At first glance, you might think that since the students did not attend the same university, there cannot be any connection between them. But the professors were once university students themselves. If we look at who taught the professors, we may find that they all shared a common professor when they (the professors) were students.
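
To make the idea concrete, here is a minimal Python sketch of that multi-hop lookup. It is not how any particular graph database works internally. The names and the sample data (student_a, prof_root and so on) are invented for illustration, and the STUDENT_OF mapping is simply assumed to record "who was taught by whom".

```python
from collections import deque

# Hypothetical sample data: each person maps to the professors who taught them.
# Professors appear as keys too, because they were once students themselves.
STUDENT_OF = {
    "student_a": ["prof_x"],
    "student_b": ["prof_y"],
    "student_c": ["prof_z"],
    "prof_x": ["prof_root"],
    "prof_y": ["prof_root"],
    "prof_z": ["prof_w"],
    "prof_w": ["prof_root"],
}


def teaching_ancestors(person, edges):
    """Return everyone reachable from `person` by following STUDENT_OF edges."""
    seen = set()
    queue = deque(edges.get(person, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited via another path
        seen.add(current)
        queue.extend(edges.get(current, []))
    return seen


def common_professors(students, edges):
    """Professors that appear in the teaching lineage of every given student."""
    lineages = [teaching_ancestors(s, edges) for s in students]
    return set.intersection(*lineages) if lineages else set()


shared = common_professors(["student_a", "student_b", "student_c"], STUDENT_OF)
print(shared)
```

None of the three students shares a direct professor, yet following the relationships one or two steps further reveals a professor common to all of them.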


[Figure: Students and professors graph]

Now, let us define some of the characteristics of a graph database. First, you have nodes, which are essentially records. Nodes are connected by relationships. Each relationship has a type and a direction, and can carry properties of its own. In our case, the direction points from each student towards the professor who taught them (and so, eventually, to the original professor), the relationship type is STUDENT_OF, and the properties are the year and the semester in which the student was taught.
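
As a rough illustration, these building blocks could be modelled in plain Python as shown here. This is only a sketch of the concepts, not the storage format or API of any real graph database. The class names, field names and sample values (Node, Relationship, alice, the year 2015) are all assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    """A record in the graph, e.g. a student or a professor."""
    node_id: str
    label: str  # hypothetical labels: "Student", "Professor"


@dataclass
class Relationship:
    """A typed, directed connection from `start` to `end` with its own properties."""
    start: Node
    end: Node
    rel_type: str
    properties: dict = field(default_factory=dict)


alice = Node("alice", "Student")
prof = Node("prof_root", "Professor")

# Direction: student -> professor. The properties live on the relationship itself.
taught = Relationship(
    start=alice,
    end=prof,
    rel_type="STUDENT_OF",
    properties={"year": 2015, "semester": "autumn"},
)


def outgoing(node, relationships, rel_type):
    """Relationships of the given type that start at `node`."""
    return [r for r in relationships if r.start == node and r.rel_type == rel_type]


print(outgoing(alice, [taught], "STUDENT_OF"))
```

Because the relationship is directed, asking for the outgoing STUDENT_OF relationships of the professor returns nothing, while the same question for the student returns the relationship together with its year and semester.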


[Figure: Students and professor graph]

Querying this database is not like writing a typical SQL query. Graph database vendors usually have their own query languages, and standardisation is something the industry is still working out. This leads us to the drawbacks of graph databases.


Drawbacks:

  • One apparent drawback is the tendency to get a little conspiratorial, just as the detective board on the movie screen can sometimes make somebody look crazy.

  • Graph databases can surface connections that don't mean anything. For example, imagine the assumption you could make if all the students in the previous example ended up dropping out of school. Does that mean the original professor had some kind of influence on that unfortunate outcome? Anything is possible, but we should be sceptical of such conspiratorial patterns. Graph databases are therefore usually a tool for raising questions, not necessarily for answering them.

  • Another problem is that it is not unusual to send a graph database query into oblivion, usually because the query plan can be very hard to conceptualize. This can lead to slow queries and to results that are misinterpreted.
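
One common way to keep a traversal from running away is to cap how many hops it may follow. The sketch below shows the idea in plain Python under stated assumptions: the function name reachable_within, the max_hops parameter and the sample edges are hypothetical, and real graph query languages expose path-length limits through their own syntax.

```python
from collections import deque


def reachable_within(start, edges, max_hops):
    """Breadth-first traversal that stops expanding once `max_hops` is reached.

    Returns a dict mapping each reachable node to its hop distance from `start`.
    """
    distance = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        depth = distance[current]
        if depth == max_hops:
            continue  # do not follow relationships beyond the hop limit
        for neighbour in edges.get(current, []):
            if neighbour not in distance:  # also guards against cycles
                distance[neighbour] = depth + 1
                queue.append(neighbour)
    del distance[start]
    return distance


# Hypothetical sample graph containing a cycle (e -> a).
EDGES = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": ["a"]}

nearby = reachable_within("a", EDGES, max_hops=2)
print(nearby)
```

With a limit of two hops the traversal never reaches node e, and the visited-node check keeps the cycle from looping forever. The cost of a bound like this is that connections beyond the limit are silently missed, so the limit itself becomes part of how the result must be interpreted.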

Data science teams use graph databases to test inferences. The relationships they discover, and their relevance to the organization, are often what gets bubbled up into the data warehouse. The flexibility of a graph data model also allows you to add new nodes and relationships as required. This avoids expensive data migration, and all your original data remains intact.