SAN FRANCISCO, CA--(Marketwired - Jul 5, 2016) - Databricks, the company founded by the team that created Apache® Spark™, today announced that Edmunds, the leading car information and shopping network, has implemented Databricks to improve the overall customer experience of its website. Edmunds serves nearly 20 million visitors each month and makes it easy for shoppers to browse not only dealer inventory, but also vehicle reviews, shopping tips, photos, videos, and feature stories.

Accurate vehicle data is of the utmost importance for Edmunds' website visitors. The data team at Edmunds integrates a wide spectrum of data, ranging from its proprietary data sets to paid data sources, to automatically populate the details of each vehicle from its VIN. The rapid growth in the volume and complexity of vehicle data created enormous challenges in maintaining data integrity. For example, determining the percentage of Subarus with missing option details or Hondas with an incorrect exterior color was the kind of problem that the Edmunds engineering team spent hours trying to solve.

While Edmunds evaluated Apache Spark as a solution to its data challenges, the company also determined that its analysts and data professionals needed a comprehensive data platform that provided managed services to simplify its Spark deployment and increase productivity.

"We chose Databricks when we knew we wanted to move to Apache Spark because we needed advanced functions beyond the open source software to solve our analytics challenges. Databricks simplifies our data access and ingest, helps with jobs and cluster management, and enables data exploration and reporting," said Greg Rokita, executive director of technology at Edmunds.

With the implementation of Databricks, Edmunds was able to democratize data access across its organization, allowing its data engineering, data science, and business analyst teams to work collaboratively on their data at scale. Edmunds also achieved the following quantitative results:

  • Accelerated ad hoc data exploration and analysis sixfold, allowing the company to answer data integrity questions faster;
  • Improved reporting speed by reducing processing time by 60 percent, or an average of 3-5 hours per week for the engineering team;
  • Improved vehicle data quality metrics across its website by 35 percent.

"Apache Spark enabled Edmunds to understand the quality, quantity, and cost of data sources at scale. The power of Spark was fully realized with the implementation of the Databricks platform, enabling Edmunds data engineers and business analysts to collaborate through one tool. Now Edmunds drives better inventory and offers a more personalized experience for customers who use the platform to guide their car buying decisions," said Kavitha Mariappan, vice president of marketing at Databricks.

For more information, download the case study:
Visit Databricks at

About Databricks:
Databricks' vision is to empower anyone to easily build and deploy advanced analytics solutions. The company was founded by the team who created Apache® Spark™, a powerful open source data processing engine built for sophisticated analytics, ease of use, and speed. Databricks is the largest contributor to the open source Apache Spark project, providing 10x more code than any other company. The company has also trained over 20,000 users on Apache Spark and has the largest number of customers deploying Spark to date. Databricks provides a just-in-time data platform to simplify data integration, real-time experimentation, and robust deployment of production applications. Databricks is venture-backed by Andreessen Horowitz and NEA. For more information, contact

© Databricks 2016. All rights reserved. Apache, Apache Spark and Spark are trademarks of the Apache Software Foundation.

Contact Information:

Media Contact:
Suzanne Block
Merritt Group for Databricks
P: 617-824-0981