ST. PAUL, MN--(Marketwired - Nov 2, 2015) - EDRM, the leading standards organization for the e-discovery market, announced today the release of a new 5 GB Micro Dataset. The new EDRM dataset was assembled to meet the e-discovery data testing and process validation needs of software and tool providers, litigation support organizations, law firms and educational organizations.

The new dataset follows the August release of an initial dataset offering, a 136.9 MB zip file containing the latest versions of everything from Microsoft Office and Adobe Acrobat files to image files. That file is available to the public on the EDRM website and has been downloaded more than 300 times.

The new dataset is similar but much larger, at approximately 5.5 GB. The full dataset is sourced from publicly available data and free from copyright restrictions. It was assembled by the Digital Forensics Research Laboratories at the Auckland University of Technology, in collaboration with the EDRM Dataset team.

The EDRM Micro Dataset is valued for its large variety of file types and other challenges characteristic of ESI collected in discovery cases. The files have various levels of corruption, and the dataset contains a duplicate set of files that are encrypted, to support exception handling exercises and advanced testing.

The EDRM Micro Dataset mix of file types includes:

  • A variety of .csv files
  • Websites and web pages
  • Adobe Acrobat files
  • Graphic files and photographs
  • Public census data
  • Microsoft Office files
  • Audio files
  • Four email boxes with shared correspondence, threads and attachments
  • Multiple EnCase .e01 files containing data from a phone and another data source

"The EDRM Dataset team has done outstanding work in advancing the industry with the development of advanced datasets that better reflect the types of data anomalies and challenges faced by e-discovery professionals today," said George Socha, co-founder of EDRM. "EDRM members will benefit greatly from their work, in addition to the education, guidelines and latest in industry best practices provided to members."

The Dataset team includes:

  • Eric Robi, president, Elluma Discovery
  • Michael Lappin, director, Technology and Sales Engineering, Nuix
  • Chad Main, founder, Percipient
  • Henry Moreno, eDiscovery manager, Dell Inc.
  • Brian Cusack, director, AUT Digital Forensic Research Laboratories, and professor, ECU Security Research Center, Auckland University of Technology

The EDRM Micro Dataset is available exclusively to EDRM members. Current EDRM members have been notified by email with instructions for file downloading. Organizations interested in EDRM membership will find information at

About EDRM

EDRM creates practical resources to improve e-discovery and information governance. Launched in May 2005, EDRM was established to create standards and guidelines in the e-discovery market. In January 2006, EDRM published the Electronic Discovery Reference Model, followed by additional resources such as IGRM, CARRM and the Talent Task Matrix. Since its launch, EDRM has comprised 360 organizations, including 190 service and software providers, 75 law firms, 72 corporations, 13 governmental entities, three industry groups and seven educational institutions involved with e-discovery and information governance.

Contact Information:

Tom Gelbmann
Phone: 651-483-0022