Informatica and Hortonworks Partner to Deliver Parallel Parsing to the Apache Hadoop Community

Informatica HParser Community Edition to be Available for Free to Hortonworks Data Platform Users

REDWOOD CITY and SUNNYVALE, Calif., Nov. 2, 2011 (GLOBE NEWSWIRE) -- Informatica Corporation (NASDAQ: INFA), the world’s number one independent provider of data integration software, and Hortonworks, a leading contributor to Apache Hadoop projects, today announced that Informatica HParser Community Edition will be available free of charge to Hortonworks Data Platform users as a download from the Hortonworks website. The two companies collaborated on the distribution of the Community Edition of Informatica HParser, which is based on Informatica B2B Data Transformation’s differentiated approach to parsing complex files and messages with unmatched efficiency and scale.

Informatica HParser Community Edition provides users with a powerful, engine-based interactive tool to simplify and speed the data parsing and analytics process for some of the most popular data types used in Apache Hadoop including logs, Omniture, XML and JSON. Part of the recently announced set of Informatica HParser offerings, the community edition exploits the parallelism of the MapReduce framework and helps developers in small and large enterprises more easily take advantage of Apache Hadoop.  

Unique Advantages of Informatica HParser Community Edition

With Informatica HParser Community Edition, users of the Hortonworks Data Platform are able to:

  • Accelerate deployment - Informatica HParser Community Edition provides out-of-the-box support for data parsing of logs, Omniture, XML and JSON files and messages.
  • Boost developer productivity - Users leverage a visual, drag-and-drop utility for quickly and easily creating parsing definitions.
  • Speed the parsing of complex data - HParser natively exploits the parallelism inside the MapReduce framework.

Supporting Quotes

  • "Apache Hadoop has clearly reached a key inflection point in terms of broad market adoption within the enterprise," said Eric Baldeschwieler, chief executive officer, Hortonworks. "As organizations seek to extract greater value from complex and unstructured data, there is a growing need to parse data including XML, JSON, PDF, WORD, binary data, logs, call detail records (CDRs), etc. We are excited to see Informatica bringing its years of experience in transformation and parsing technology to the Apache Hadoop community. We are pleased to be able to offer the advanced transformation capabilities of the community edition of HParser to all users of the Hortonworks Data Platform at no charge."
  • "As we advance the Apache Hadoop framework for storing, processing and analyzing big data, Informatica HParser’s approach to enable parallel parsing across the Hadoop clusters available inside MapReduce is a unique addition to the Apache Hadoop community," said Arun C. Murthy, co-founder and lead of the next generation MapReduce project in Apache Hadoop. "Informatica HParser on the Hortonworks Data Platform uniquely takes advantage of the Apache Hadoop programming framework and is linearly scalable. We look forward to further collaboration between Informatica and Hortonworks to accelerate the development of parsing tasks and empower organizations to exploit the unstructured and complex data sources that have been traditionally untapped."
  • "Informatica is taking a critical step toward making Hadoop easier to use together with Hortonworks, a team that has contributed more than 80 percent of all the code in Apache Hadoop," said Juan Carlos Soto, senior vice president and general manager, B2B Data Exchange and Cloud Data Integration, Informatica. "With Informatica HParser Community Edition, we are delivering free of charge, world-class data parsing capabilities for the most popular data types in Hadoop and empowering enterprises to turn big data into competitive advantage and ultimately maximize their Return on Data."

Tweet this: Hortonworks Powers Apache #Hadoop Data Platform with @InformaticaCorp HParser Community Edition #bigdata

About Informatica

Informatica Corporation (NASDAQ: INFA) is the world’s number one independent provider of data integration software. Organizations around the world rely on Informatica to gain a competitive advantage with timely, relevant and trustworthy data for their top business imperatives. Worldwide, over 4,500 enterprises depend on Informatica for data integration, data quality and big data solutions to access, integrate and trust their information assets residing on-premise and in the Cloud. For more information, call +1 650-385-5000 (1-800-653-3871 in the U.S.).

Note: Informatica, Informatica HParser, Informatica B2B Data Transformation and Informatica B2B Data Exchange are trademarks or registered trademarks of Informatica Corporation in the United States and in jurisdictions throughout the world. All other company and product names may be trade names or trademarks of their respective owners.