Ex-Googler startup DGraph Labs raises US$1.1 million in seed funding round to build industry’s first open source, native and distributed graph database


SAN FRANCISCO, May 17, 2016 (GLOBE NEWSWIRE) -- DGraph Labs, a graph database company started by ex-Googler Manish Jain, today announced a US$1.1 million seed funding round co-led by Bain Capital Ventures and Blackbird Ventures, with participation from Atlassian co-founder Mike Cannon-Brookes and ex-Googler Mark Cummins. DGraph will use the funds to speed development of the first open source, native and distributed graph database, which promises to dramatically improve the performance, scalability and efficiency of cloud and big data applications.

“Graph data structures store objects and the relationships between them. In these data structures, the relationship is as important as the object. Graph databases are, therefore, designed to store the relationships as first-class citizens,” said Manish Jain, Founder and CEO of DGraph. “Accessing those connections is an efficient, constant-time operation that allows you to traverse millions of objects quickly. Many companies including Google, Facebook, Twitter, eBay, LinkedIn and Dropbox use graph databases to power their smart search engines and newsfeeds.”

Use of graph databases has grown significantly in a world where “big data” is now a common term.  Apart from social networks, other applications of graph databases include user behavior analysis, e-commerce recommendations, fraud detection, internet of things, medical and DNA research, unstructured text mining, machine learning and artificial intelligence.

While it’s possible to store graph data sets in old-school table-based databases (relational databases), querying graph data in relational databases is computationally intensive because it requires a large number of table joins to identify the relationships. This means that the compute time required increases as the number of results increases, making traditional databases particularly inappropriate for large data sets.
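The difference can be illustrated with a minimal Python sketch. It is not DGraph code: the sample data and all function names are hypothetical, and the table scan is a deliberate simplification (a real relational database would normally use indexes). The same two-hop question, "whom do the people Alice follows follow?", is answered once by scanning an edge table for each hop, and once by following relationships that are stored with each object.

```python
from collections import defaultdict

# Hypothetical sample data: (source, relationship, target) rows.
EDGES = [
    ("alice", "follows", "bob"),
    ("bob", "follows", "carol"),
    ("carol", "follows", "dave"),
    ("alice", "follows", "erin"),
]


def two_hop_table(edges, start):
    """Table style: each hop is a self-join, modeled here as a full scan."""
    first_hop = {dst for src, rel, dst in edges
                 if src == start and rel == "follows"}
    return {dst for src, rel, dst in edges
            if src in first_hop and rel == "follows"}


def build_adjacency(edges):
    """Graph style: store each object's relationships alongside the object."""
    adjacency = defaultdict(list)
    for src, rel, dst in edges:
        adjacency[(src, rel)].append(dst)
    return adjacency


def two_hop_graph(adjacency, start):
    """Each hop is a direct lookup; work depends on edges followed, not table size."""
    result = set()
    for friend in adjacency.get((start, "follows"), []):
        result.update(adjacency.get((friend, "follows"), []))
    return result
```

Both functions return the same answer for the sample data. The table version reads every row on every hop, while the graph version touches only the relationships it actually traverses.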

Traditional SQL databases also suffer from flat, inflexible table schemas. Graphs can have complex and flexible schemas, which allow multiple types of entities to co-exist and properties to change without having to rewrite data.

The main factor that has prevented graph databases from becoming more mainstream is that until now they have been either non-distributed or non-native.

“In 2016, it’s hard to believe that there are still no commercial graph databases that are truly distributed,” said Salil Deshpande, Managing Director at Bain Capital Ventures.  “Graph databases that exist today are not truly distributed: they run fine on one node but rely on a variety of architectural hacks to run on multiple nodes, and are thus not scalable. Whereas the ones that are distributed are not really graph databases: they are simply overlays of graph functionality on top of non-graph databases, which results in poor query performance especially when joins are involved, and query performance being correlated with the size and nature of the result set rather than the complexity of the query.”

“To date, serious graph data implementations inside the likes of Google, Facebook, Amazon, etc., have required custom databases,” said Rick Baker, Co-Founder of Blackbird Ventures.  “DGraph will provide this power to the rest of the world at a time that large connected data sets are becoming increasingly important.”

DGraph is an open source, native and distributed graph database, developed for real-time, low-latency and high-throughput query flow. Its data distribution is designed to minimize the number of network calls, keeping them directly proportional to the complexity of a query, not the number of results. This is intended to keep 95th-percentile query latency low even as you add more servers to the cluster, allowing the cluster to scale smoothly. It is already showing significant performance advantages over the rest of the market.
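One way such a property could arise is sketched below in Python. This is an illustration under stated assumptions, not a description of DGraph's implementation: it assumes data is partitioned by relationship type, with one hypothetical server per relationship, and that each step of a query is sent to that server as a single batched call. The class `PredicateShard` and the functions `run_path_query` and `demo` are invented for this sketch.

```python
from collections import defaultdict


class PredicateShard:
    """Hypothetical server holding every edge for one relationship type."""

    def __init__(self):
        self.edges = defaultdict(set)
        self.calls = 0  # number of simulated network calls received

    def add(self, src, dst):
        self.edges[src].add(dst)

    def expand(self, sources):
        """One simulated network call that expands a whole batch of sources."""
        self.calls += 1
        out = set()
        for src in sources:
            out |= self.edges.get(src, set())
        return out


def run_path_query(shards, start, predicates):
    """Follow a chain of relationships, making one call per step of the query."""
    frontier = {start}
    for predicate in predicates:
        frontier = shards[predicate].expand(frontier)
    return frontier


def demo(fanout):
    """Return (result count, total calls) for a two-step query over `fanout` users."""
    shards = {"follows": PredicateShard(), "likes": PredicateShard()}
    for i in range(fanout):
        shards["follows"].add("alice", f"user{i}")
        shards["likes"].add(f"user{i}", f"post{i}")
    results = run_path_query(shards, "alice", ["follows", "likes"])
    total_calls = sum(shard.calls for shard in shards.values())
    return len(results), total_calls
```

In this sketch, `demo(3)` and `demo(1000)` both make two calls, one per step in the query, even though the second returns far more results.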

DGraph allows scaling a database from a single laptop to serving terabytes of structured data via commodity hardware. It's built to survive machine failures and partial data center collapses.

It also naturally provides joins, which distributed SQL databases typically don't. In fact, joins are the most common operation for a graph database. Doing joins in a distributed environment is a hard problem, but DGraph has the right team to solve it.

Before starting DGraph, Manish worked at Google in the Web Search and Knowledge Graph Infrastructure group for over six years. The idea for DGraph came out of his experience at Google, where he tackled similar issues serving the Knowledge Graph, and various real-time updating data feeds to power Knowledge Cards and One-Boxes.

“This is a strong team continuing their work on a problem they’ve had considerable success with inside Google.  It’s great to have the opportunity to back them to bring their expertise to the world,” said Rick Baker, co-founder of Blackbird Ventures.

DGraph Labs will use the funds to recruit engineers and build core technology. “We’re a distributed team. We want to hire smart and experienced backend engineers from the US, Canada, Australia, and anywhere else we find talent,” said Jain. “Startups working on hard-core backend infrastructure problems are rare. Typically such problems get taken on by large companies such as Google, Facebook or Twitter. But their systems would never run on a laptop or a single server. We are solving a more complex scalability problem in a small and fun environment.”

DGraph Labs, Inc.
Website: http://dgraph.io
GitHub: https://github.com/dgraph-io/dgraph

Manish Jain, Founder