Latest Releases of Open Source Tools from Iterative.ai Extend Traditional Software Tools for Machine Learning Engineers

Open Source Projects Data Version Control (DVC) and Continuous Machine Learning (CML) extend tools like Git and CI/CD for MLOps


SAN FRANCISCO, March 03, 2021 (GLOBE NEWSWIRE) -- Iterative.ai, the MLOps company dedicated to streamlining the workflow of data scientists, today announced the latest releases of its Data Version Control (DVC) and Continuous Machine Learning (CML) open source projects. DVC and CML remove the need for proprietary AI Platforms (such as AWS SageMaker and Microsoft Azure Machine Learning) by extending traditional software tools like Git and CI/CD to meet the needs of ML Engineers.

ML Engineers, who work with unstructured data, need GitHub for collaboration and CI/CD systems to resolve issues among themselves and between the team and the production system. Because adequate tools for versioning data and models have been lacking, Iterative.ai has built the open source tools DVC and CML on top of GitHub, GitLab and Bitbucket to fill this gap.

“AI Platforms are siloed and require everything to go into their own systems creating vendor lock-in,” said Dmitry Petrov, CEO and founder of Iterative.ai. “Iterative.ai allows users to stay within their application development space and effectively extend the familiar dev environments with tools to support Machine Learning Engineers and Data Scientists.”

DVC brings agility, reproducibility, and collaboration into the existing data science workflow. DVC provides users with a Git-like interface for versioning data and models, bringing version control to machine learning and solving the challenges of reproducibility. DVC is built on top of Git: users create lightweight metafiles that are versioned in Git, while the large files themselves are handled by DVC rather than stored in Git. It works with remote storage for large files, either in the cloud or on on-premises network storage.
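To illustrate the metafile idea described above, the following is a minimal Python sketch, not DVC's actual implementation or file format. The function name `track_file`, the `.meta.json` suffix, the choice of SHA-256, and the cache layout are all assumptions made for illustration. The sketch copies a large file into a hash-addressed cache directory, which stands in for remote storage, and writes a small metafile that could be committed to Git in place of the data.

```python
import hashlib
import json
import shutil
from pathlib import Path


def track_file(data_path: Path, cache_dir: Path) -> Path:
    """Copy a file into a hash-addressed cache and write a small metafile.

    Hypothetical helper for illustration only; it does not reproduce
    DVC's real commands, hashing scheme, or metafile format.
    """
    # Hash the file in 1 MiB chunks so large files are never fully in memory.
    digest = hashlib.sha256()
    with data_path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    file_hash = digest.hexdigest()

    # Store the content under its hash; identical content is stored once.
    cache_dir.mkdir(parents=True, exist_ok=True)
    cached = cache_dir / file_hash
    if not cached.exists():
        shutil.copy2(data_path, cached)

    # The metafile is tiny and is the only thing Git would need to version.
    meta = {
        "path": data_path.name,
        "sha256": file_hash,
        "size": data_path.stat().st_size,
    }
    meta_path = data_path.with_name(data_path.name + ".meta.json")
    meta_path.write_text(json.dumps(meta, indent=2))
    return meta_path
```

In this sketch, checking out an older Git commit would restore an older metafile, and the matching data could then be fetched from the cache by its hash.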

CML is an open-source library for implementing continuous integration and delivery (CI/CD) in machine learning projects. Users can automate parts of their development workflow, including model training and evaluation, comparing ML experiments across their project history, and monitoring changing datasets. CML will also auto-generate reports with metrics and plots in each Git pull request.
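The kind of metrics report described above can be sketched in a few lines of Python. This is an illustrative example under stated assumptions, not CML's API: the function `build_report` and the metric names are hypothetical, and the step that posts the report to a pull request is left out.

```python
def build_report(baseline: dict[str, float], candidate: dict[str, float]) -> str:
    """Render metrics shared by two experiments as a Markdown table.

    Hypothetical helper for illustration; in a CI job, the returned text
    would be handed to whatever tool comments on the pull request.
    """
    lines = [
        "| metric | baseline | candidate | change |",
        "| --- | --- | --- | --- |",
    ]
    # Compare only metrics present in both runs, in a stable order.
    for name in sorted(baseline.keys() & candidate.keys()):
        old, new = baseline[name], candidate[name]
        lines.append(f"| {name} | {old:.4f} | {new:.4f} | {new - old:+.4f} |")
    return "\n".join(lines)


if __name__ == "__main__":
    # Example values are made up for demonstration.
    print(build_report({"accuracy": 0.90}, {"accuracy": 0.92}))
```

A CI workflow could run model training on each commit, collect the resulting metrics, and attach a table like this to the pull request so reviewers see the effect of a change alongside the code.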

Together, CML and DVC provide ML Engineers with a number of features and benefits that support data provenance, machine learning model management and automation, including:

  • GitFlow for data science - Use GitLab or GitHub to manage ML experiments and ML models and to track modified data.
  • Repository & knowledge library - Maintain a code repository with data files, ML model files, and model metrics. Keep track of ML experiments to share knowledge about successful ideas as well as failures. No Git repo required.
  • Collaboration - Collaborate on ML experiments with pipeline and workflow visualization. Data scientists, ML engineers, and DevOps teams work concurrently instead of waiting for handoffs.
  • Reporting - Auto-generate ML experiment reports with metrics and plots in each Git Pull Request.

DVC and CML are available today via GitHub and GitLab. To schedule a demo, visit www.iterative.ai.

About Iterative
Based in San Francisco, Iterative.ai is the company behind the development of DVC and CML, open-source tools to streamline the workflow of data scientists. Iterative.ai integrates ML workflows into current practices for software development instead of creating a separate AI platform. For more information, visit www.iterative.ai.

Media Contact:
Joe Eckert, jeckert@eckertcomms.com
Ray George, ray@eckertcomms.com