Complete Genomics Publishes Paper Describing Its Informatics Approach for High-Accuracy Whole Human Genome Sequencing


MOUNTAIN VIEW, Calif., Jan. 19, 2012 (GLOBE NEWSWIRE) -- Complete Genomics Inc. (Nasdaq:GNOM) announced today that the Journal of Computational Biology has published online the company's paper describing some of the computational methods that enable it to produce highly accurate whole human genome sequencing data. The paper is available at http://www.liebertonline.com/doi/full/10.1089/cmb.2011.0201.

Complete Genomics performs whole human genome sequencing using proprietary biochemistry based on DNA nanoball arrays and combinatorial probe-anchor ligation sequencing [1]. Because these methods produce reads with unique characteristics, Complete Genomics has developed new methods to call single nucleotide polymorphisms (SNPs), short substitutions and insertions/deletions (indels).

"The methods described in this paper produce very accurate variant calls," said Dr. Clifford Reid, chairman, president and CEO of Complete Genomics. "The algorithms described in this paper have been used for all of the genomes in our 69-genome public data repository and for the more than 3,800 complete, deeply sequenced human genomes we have delivered to customers to date." Access to Complete Genomics' genome data repository is provided free of charge at http://www.completegenomics.com/sequence-data/download-data/.

The effectiveness of the company's sequencing and bioinformatics approach is borne out in customer research papers in which its data were used to investigate lung cancer [2], Miller syndrome [3], craniosynostosis [4] and hypercholesterolemia [5]. These papers were published in Nature, Science, The American Journal of Human Genetics and Human Molecular Genetics, respectively. The company's data also compared favorably with that of another sequencing technology in a study published online in Nature Biotechnology in December 2011 [6].

Complete Genomics' approach employs a local de novo assembly process, which uses a combination of Bayesian analysis and graph-based techniques, for each variation. This de novo assembly approach, which was pioneered by Complete Genomics, has since been adopted by other organizations.

The company's assembly approach allows it to call both alleles at a position independently. This enables Complete Genomics to make complex calls in cases where both alleles differ from the reference. Furthermore, its algorithms are particularly adept at detecting variants that are located close to each other. Complete Genomics' technology is also capable of detecting previously unknown indels, whereas some other approaches can only check whether a known indel is present. This additional insight is included in the rich variant reports that Complete Genomics delivers to its customers. These reports also include copy number variations (CNVs), structural variations (SVs), transposable element insertions, and a comparison of tumor and normal samples if applicable. The comprehensiveness of the standard data reports provided reduces researchers' data analysis burden when working with Complete Genomics data.
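As a rough illustration of what calling the two alleles independently makes possible, the following minimal Python sketch labels a diploid site from two separately called allele sequences. It is not Complete Genomics' software: the function name `classify_diploid_call` and the labels it returns are hypothetical and chosen only for this example.

```python
from typing import Tuple


def classify_diploid_call(reference: str, alleles: Tuple[str, str]) -> str:
    """Label a diploid site from two independently called alleles.

    Hypothetical helper for illustration only; the labels are not taken
    from any Complete Genomics file format.
    """
    first, second = alleles
    # Compare each allele with the reference on its own.
    differs = [allele != reference for allele in (first, second)]
    if not any(differs):
        return "hom-ref"   # both alleles match the reference
    if not all(differs):
        return "het-ref"   # exactly one allele differs from the reference
    # Both alleles differ from the reference; they may or may not match each other.
    return "hom-alt" if first == second else "het-alt"


# A site where one allele carries a deletion and the other an insertion,
# so both differ from the reference and from each other.
print(classify_diploid_call("AT", ("A", "ATT")))
```

The last case, in which both alleles differ from the reference and from each other, is the kind of complex call the paragraph above describes. An approach that only tests for a single alternate allele per site would not represent it directly.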

Complete Genomics continues to refine its methods, making improvements in the quality and cost of data it produces to enable large-scale disease and cancer studies in the translational research market. "I'm always looking for ways to optimize our algorithms so that they run faster and produce more accurate output," said Bruce Martin, senior vice president of product development. "As a result, Complete Genomics can now map and assemble a genome in less than a day with very high sensitivity and specificity."

References

1. Drmanac R, et al.: Human Genome Sequencing Using Unchained Base Reads on Self-Assembling DNA Nanoarrays. Science, published online 5 November 2009 (doi:10.1126/science.1181498).

2. Lee W, Jiang Z, Liu J et al.: The mutation spectrum revealed by paired genome sequences from a lung cancer patient. Nature 465, 473–477 (2010).

3. Roach JC, Glusman G, Smit AFA et al.: Analysis of genetic inheritance in a family quartet by whole-genome sequencing. Science 328(5978), 636–639 (2010).

4. Nieminen P, Morgan NV et al.: Inactivation of IL11 signaling causes craniosynostosis, delayed tooth eruption, and supernumerary teeth. Am. J. Hum. Genet. 89(1), 67–81 (2011).

5. Rios J, Stein E, Shendure J, Hobbs HH, Cohen JC et al.: Identification by whole-genome resequencing of gene defect responsible for severe hypercholesterolemia. Hum. Mol. Genet. 19(22), 4313–4318 (2010).

6. Lam HYK, Clark MJ, Chen R et al.: Performance comparison of whole-genome sequencing platforms. Nature Biotechnology. Advance online publication 18 December 2011.

About Complete Genomics

Complete Genomics is the complete human genome sequencing company that has developed and commercialized an innovative DNA sequencing platform. The Complete Genomics Analysis Platform (CGA™ Platform) combines Complete Genomics' proprietary human genome sequencing technology with its advanced informatics and data management software. The innovative, end-to-end, outsourced CGA™ Service provides customers with data ready for genome-based research. Additional information can be found at http://www.completegenomics.com.

The Complete Genomics logo is available at http://www.globenewswire.com/newsroom/prs/?pkgid=8216

Forward-looking Statements

Certain statements in this press release, including statements relating to the ability of Complete Genomics' standard data reports to reduce researchers' data analysis burden and its continuing refinement of its sequencing methods to enable large-scale disease and cancer studies, are forward-looking statements that are subject to risks and uncertainties. Readers are cautioned that these forward-looking statements are based on management's current expectations, and actual results may differ materially from those projected. The following factors, without limitation, could cause actual results to differ materially from those in the forward-looking statements: the company's limited operating history, delays in production due to technical issues, delays in capacity expansion, its ability to reduce the average cost of its sequencing service, the timing and extent of reductions in the price of its genomic sequencing service, growth in the market for complete human genomes and any potential inability to increase yield. More information on potential factors that could affect the Company's financial results can be found in its Annual Report on Form 10-K filed on March 30, 2011 and its Quarterly Reports on Form 10-Q, including those listed under the caption "Risk Factors." The Company disclaims any obligation to update information contained in these forward-looking statements, whether as a result of new information, future events or otherwise.
