As Kubernetes Nears 2 Million Lines of Code, Commit Velocity and API Point to Project Maturity as Innovation Moves to a Growing Number of Satellite Projects

Analysis by source{d} provides the Kubernetes community with unique metrics and insights that are otherwise buried within the project codebase

SEATTLE, Dec. 11, 2018 (GLOBE NEWSWIRE) -- KubeCon -- As the Kubernetes project nears 2 million lines of code (including all languages and generated files), the 4-year-old open source project is showing many signs of maturity, according to an analysis by source{d}, the company enabling Machine Learning for large-scale code analysis.

The velocity of commits for the core Kubernetes project seems to be slowing down as the community focus moves to infrastructure testing, cluster federation, and Machine Learning and HPC (High Performance Computing) workload management. With just under 16,000 methods, the Kubernetes API also seems to be stabilizing despite its high level of complexity.

This analysis leverages source{d} Engine to retrieve and analyze all the Kubernetes git repositories through SQL queries to get insights into the project codebase history, as well as emerging trends.  
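
To illustrate the kind of SQL query such an analysis relies on, here is a minimal sketch that counts commits per month. It is not source{d} Engine itself: Python's built-in sqlite3 module stands in for the engine, and the `commits` table, its column names and the sample rows are hypothetical assumptions made for this example.

```python
import sqlite3

# Hypothetical sample rows: (commit hash, author email, ISO-8601 author date).
# Real data would come from the project's git history.
SAMPLE_COMMITS = [
    ("a1", "dev1@example.com", "2018-03-02T10:00:00"),
    ("b2", "dev2@example.org", "2018-03-15T12:30:00"),
    ("c3", "dev1@example.com", "2018-04-01T09:00:00"),
]


def commits_per_month(rows):
    """Return a list of (YYYY-MM, commit count) tuples ordered by month."""
    conn = sqlite3.connect(":memory:")
    try:
        # Assumed schema for illustration only.
        conn.execute(
            "CREATE TABLE commits ("
            "commit_hash TEXT, author_email TEXT, author_when TEXT)"
        )
        conn.executemany("INSERT INTO commits VALUES (?, ?, ?)", rows)
        query = """
            SELECT strftime('%Y-%m', author_when) AS month, COUNT(*) AS n
            FROM commits
            GROUP BY month
            ORDER BY month
        """
        return conn.execute(query).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for month, count in commits_per_month(SAMPLE_COMMITS):
        print(month, count)
```

Plotting these monthly counts over a project's full history is one way to observe a change in commit velocity.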

“Kubernetes has clearly evolved from one of the most active open source projects of all time to a production-ready platform for the enterprise,” said Francesc Campoy, vice president of product and developer relations at source{d}. “These source{d} Engine queries have revealed useful insights into the relatively young Kubernetes project; imagine the possibilities for large companies which have very heterogeneous and old codebases.”

The source{d} analysis includes:

  • Release schedule over the past three years
  • Number of files/lines of code over time
  • Number of public APIs over time
  • Number of commits per month
  • Top repositories and number of commits
  • Contributors by number of commits
  • Organizations by number of commits
  • Number of languages over time
  • Number of lines of code for programming languages
  • Popularity of languages over time

Here are some details on findings of the analysis:

  • While the number of lines of code continues to grow towards the 2 million mark, the commit velocity has been decreasing since March 2018, which suggests that the project has reached a high level of maturity and stability.
  • As the project matures, most of the contributions are now directed to upgrades and tools for Kubernetes testing (infra-test) as well as the cluster federation, Machine Learning / HPC workloads and the AWS ALB Ingress controller Special Interest Groups.
  • Even though Google is the main contributor to Kubernetes by number of commits, individuals (those with and emails) achieve a similar number. The exact number of organizations contributing is harder to measure, but the analysis shows that people from more than 600 different email domains have contributed, including major cloud providers such as Red Hat, Huawei and Microsoft.
  • At its outset in 2014, the Kubernetes project had 15 programming languages, a number that quickly increased to 35 by the beginning of 2017. Given that Kubernetes came from Google, it’s not surprising to see that Go is by far the dominant language followed by Python, YAML and Markdown. The analysis shows that other languages such as Gradle and Lua have been dropped while some others like Assembly, SQL and Java made a comeback.
  • The number of API endpoints exported in the Kubernetes codebase is stabilizing at just under 16,000, which points to both maturity and complexity. The decrease between some releases (during 2017) might reveal a lack of backward compatibility.
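
As a rough illustration of how contributions can be grouped by email domain, the sketch below tallies commits per domain from a list of author email addresses. The function name and the sample addresses are hypothetical; this is not the method used in the report, only one plausible way to approximate it.

```python
from collections import Counter


def commits_by_domain(author_emails):
    """Count commits per email domain (the part after the last '@')."""
    counts = Counter()
    for email in author_emails:
        _, sep, domain = email.rpartition("@")
        # Addresses without an '@' cannot be attributed to any domain.
        counts[domain.lower() if sep else "(unknown)"] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical author emails, one entry per commit.
    emails = ["a@Example.com", "b@example.com", "c@other.org", "broken"]
    for domain, n in commits_by_domain(emails).most_common():
        print(domain, n)
```

An email domain is only a proxy for an organization, which is why the exact number of contributing organizations is harder to measure than the number of domains.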

source{d} Engine turns code into an analyzable and productive asset across large codebases, facilitating the digital transformation of large, traditional companies through software modernization and the adoption of Inner Source practices.

Full copies of the report can be downloaded here. Companies interested in getting their own code base analyzed can request an analysis here.

About source{d}
source{d}, the only open core company to turn code into actionable data and business intelligence, is building the tech stack that enables large-scale code analysis and machine learning on code. Used by top engineers at the world’s leading companies, source{d} develops projects transparently, collaborating with the broader community of Machine Learning on Code researchers. Headquartered in Madrid, with a U.S. office in San Francisco, source{d} has raised $10 million from Otium, Sunstone Capital and others. To learn more, visit

Editorial Contact:
Joseph Eckert for source{d}

Photos accompanying this announcement are available at:

