ASHG 2023 conference: Exclusive insights from Complete Genomics

At the American Society of Human Genetics (ASHG) 2023 meeting in Washington, D.C., geneticists from around the world were briefed on how Complete Genomics’ DNBSEQ technology is changing whole genome sequencing, the use of the DNBSEQ-T7 in clinical labs for structural variant detection, and a University of Massachusetts researcher’s use of STOmics’ Stereo-Seq and DNBSEQ-T7 data.

DNBSEQ technology and new advances in whole genome sequencing

Dr. Radoje Drmanac, PhD

Chief Scientific Officer, Complete Genomics

Key takeaways

End-to-end genomic solutions: Complete Genomics provides comprehensive genomic research solutions, from sample processing to bioinformatics.
DNBSEQ technology: The DNBSEQ line, including the DNBSEQ-G99 and DNBSEQ-T7, offers high-throughput, accurate genome sequencing.
Cost-effective genome sequencing: Sequencing costs have fallen notably, exemplified by the $150 genome milestone achieved with the DNBSEQ-T7.

Talk summary

Dr. Drmanac introduced Complete Genomics, founded in 2005 in Silicon Valley and acquired in 2013. The company, with over $1B in investments and technology development, now boasts over 200 employees, nearly 600 patents, around 3,000 global customers, and more than 6,000 research publications.

Of these publications, the vast majority report RNA-Seq findings, followed by a long tail of additional techniques (WGS, targeted sequencing, metagenomics…) typical for next-generation sequencing. Complete Genomics has paid attention to the entire laboratory workflow, offering liquid handling automation that spans sample disruption and handling, nucleic acid purification, automated NGS library preparation, and sequencing, all the way through to bioinformatics solutions.

The DNBSEQ method (DNA NanoBall sequencing) is the common technology throughout the sequencer lineup. The “G-Series” DNBSEQ-G99 produces 8 to 48 Gb of sequence per run and can be set up to finish a run “in a few hours if needed” for time-sensitive applications. The “T-Series” DNBSEQ-T7 produces 1 to 7 Tb of sequence per run and uses a four-flowcell system, a design that is unique in the life sciences industry and offers great flexibility.

Dr. Drmanac then presented the DNBSEQ technology, illustrated in Figure 1. Rolling circle replication copies the same single template repeatedly, which is an important nuance of the technology: no clonal errors arise, because only the original template is copied. About two hundred copies of the template make up each DNA NanoBall (DNB). On a high-density patterned flowcell, each spot holds a single DNB, with >95% occupancy.

Another important benefit of not having clonal amplification errors is the elimination of index hopping, a known phenomenon with amplification-based template preparation.

He then presented a timeline of whole genome sequencing costs. Starting with a $50,000 genome in 2009 and the $5,000 genome in 2010, the timeline continued through a series of NGS instrument launches, including the DNBSEQ-G400 in 2018, and reached the $150 genome with the DNBSEQ-T7 in 2019.

To illustrate what increases in density (and throughput) are possible, he showed the current T20 flowcell format: DNBs about 200 nm in size, spaced 500 nm apart (the “pitch size”). A future Txx model flowcell could keep the DNBs at the same size with a pitch of only 250 nm, raising the density four-fold, because halving the pitch quadruples the number of spots per unit area. With a 125 nm pitch and DNBs shrunk to 100 nm, another four-fold increase could be achieved.
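The pitch arithmetic can be checked with a minimal sketch. It assumes the spots sit on a regular square grid, which the talk summary does not state, so the absolute spots-per-area figures are illustrative only; the fold changes depend only on the ratio of pitches. The function names are hypothetical.

```python
def density_gain(old_pitch_nm: float, new_pitch_nm: float) -> float:
    """Fold change in spots per unit area when the grid pitch changes.

    Density scales with 1 / pitch**2, so halving the pitch gives 4x.
    """
    return (old_pitch_nm / new_pitch_nm) ** 2


def spots_per_mm2(pitch_nm: float) -> float:
    """Spots per square millimetre, assuming a square grid (an assumption)."""
    spots_per_mm = 1_000_000 / pitch_nm  # 1 mm = 1,000,000 nm
    return spots_per_mm ** 2


if __name__ == "__main__":
    for pitch in (500, 250, 125):
        print(
            f"{pitch} nm pitch: {spots_per_mm2(pitch):,.0f} spots/mm^2, "
            f"{density_gain(500, pitch):.0f}x the density at 500 nm"
        )
```

Going from 500 nm to 125 nm is two successive halvings, so the combined gain is sixteen-fold relative to the current format.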

Shifting to the applications of whole-genome sequencing, he mentioned the 2014 Nature publication¹ on the diagnosis of severe intellectual disability in 50 individuals.

Dr. Drmanac continued with a brief description of indel and SNV accuracy for the whole-genome sequence of NA12878, with PCR-free libraries independently prepared in duplicate and run on the DNBSEQ-G400. Discordance between replicates ranged from 2.05% to 2.65%. Of interest was a side-by-side comparison chart of Genome in a Bottle (GIAB) sample HG002 between the DNBSEQ-G400 and the Illumina NovaSeq 6000™. For single nucleotide variants (SNVs), the false negative (FN) and false positive (FP) counts, recall and precision were all comparable. In contrast, for indels the FN and FP counts were improved two-fold, depending on the analysis pipeline used (DNAScope or GATK).
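For readers less familiar with the benchmark metrics named above, the sketch below shows the standard definitions of recall and precision in terms of true positive (TP), FP and FN counts. The counts used are made up purely for illustration and are not results from the talk; the function names are hypothetical.

```python
def recall(tp: int, fn: int) -> float:
    """Fraction of true variants that were called: TP / (TP + FN)."""
    return tp / (tp + fn)


def precision(tp: int, fp: int) -> float:
    """Fraction of called variants that are true: TP / (TP + FP)."""
    return tp / (tp + fp)


if __name__ == "__main__":
    # Hypothetical counts, NOT taken from the presentation.
    tp, fp, fn = 990, 10, 20
    print(f"recall    = {recall(tp, fn):.4f}")
    print(f"precision = {precision(tp, fp):.4f}")

    # Halving FN moves those calls into TP; halving FP removes false calls.
    tp2, fp2, fn2 = tp + fn // 2, fp // 2, fn // 2
    print(f"recall    (errors halved) = {recall(tp2, fn2):.4f}")
    print(f"precision (errors halved) = {precision(tp2, fp2):.4f}")
```

Because recall and precision are already close to 1 for high-quality call sets, a two-fold reduction in FN or FP shows up as a small change in these ratios, which is why benchmark charts often report the raw error counts as well.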

In a different analysis of the same sample and datasets (HG002 at 30x coverage, both using PCR-free libraries), results were compared against the NIST “Challenging Medically Relevant Gene” benchmark, which curates 273 genes. The Complete Genomics and Illumina measures were essentially equivalent across the board: of the 273 genes, Illumina detected 127 perfectly, while Complete Genomics detected 130.

Dr. Drmanac concluded by presenting single-tube Long Fragment Read technology (stLFR) for haplotype phasing, and a new technology called Spatio-Temporal Enhanced REsolution Omic Sequencing (Stereo-Seq) for spatial transcriptomics. For long fragment reads, a single-tube protocol enables 70 Mb contig lengths (N50) and the detection of compound heterozygous SNVs. It also offers higher coverage of difficult regions such as tandem repeats, the aforementioned challenging medically relevant genes, and areas of poor coverage termed “blind zones” in genes such as SMN1, NBPF4 and CHRNA7.
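The N50 figure quoted above is a length-weighted summary statistic: it is the length L such that contigs of length L or longer together cover at least half of the total assembled length. A minimal sketch of the usual calculation follows; the function name and the example lengths are hypothetical and are not data from the talk.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig (or block) lengths.

    Lengths are sorted from longest to shortest and accumulated until the
    running sum reaches at least half of the total; the length at which
    that happens is the N50.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("n50() requires at least one length")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable: running sum always reaches the total")


if __name__ == "__main__":
    # Hypothetical lengths in megabases, for illustration only.
    example_mb = [2, 3, 4, 5, 6]
    print(f"N50 = {n50(example_mb)} Mb")
```

Unlike a mean or median, N50 is dominated by the longest pieces, so a 70 Mb N50 indicates that half of the assembled sequence lies in contigs of at least that length.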