
Solving the Challenges of Comprehensive Carrier Screening by Long-Read Sequencing

Contributed Commentary by Ninad Pendse, Asuragen

August 23, 2024 | For clinical laboratories, keeping up with the changing landscape of carrier screening has been challenging lately. A maze of ancestry-based screening guidelines has been replaced with new guidelines that call for screening an order of magnitude more genes. From assay development to validating several different gene tests, the result has been a new burden on clinical labs, all while demand for carrier screening is increasing.

Three years ago, the American College of Medical Genetics and Genomics (ACMG) updated its carrier screening recommendations, moving away from a patchwork approach that suggested certain gene sets for patients of certain ancestries. Under that approach, genetic screening for just two conditions, spinal muscular atrophy and cystic fibrosis, had been more broadly recommended. The new ACMG guidelines responded to two key factors: the need to increase equity in healthcare and DNA-backed evidence indicating that many patients lack a comprehensive understanding of their own ancestry. Carrier screening is now recommended for all women using a panel of 97 autosomal recessive genes plus 16 X-linked genes. Other professional societies, such as the American College of Obstetricians and Gynecologists, are expected to follow suit.

For clinical labs, the updated guidance means getting on a path to be able to detect, validate, and screen 113 genes to achieve ACMG’s vision for patients. It’s a tremendous leap in testing complexity and capacity, and it will take many labs considerable time to ramp up. But as lab teams begin to move in this direction, they will all encounter a common challenge: a number of genes included in the guidelines are extremely difficult to analyze. 

The problem comes down to the technology most commonly used for this kind of work: short-read DNA sequencing. These ubiquitous sequencers excel at detecting certain types of genetic changes, including single nucleotide variants, many copy number variants, and small insertions or deletions. The vast majority of the 113 genes recommended for carrier screening can be analyzed quite effectively with short-read sequencing technologies. 

But other types of genetic variation are confounding for these sequencers. Spotting large insertions and deletions, differentiating a gene from its pseudogene, getting accurate counts of a repeat expansion—as the variation becomes longer or more complex, it becomes difficult or even impossible to resolve with short-read data. 

The challenge, of course, lies in read length. Consider a trinucleotide repeat expansion that stretches to 1,000 bases. With reads that might only be 200 bp or 300 bp long, even the most sophisticated algorithm cannot properly align and assemble those reads to get an accurate count of repeats. The only way to truly represent this type of sequence is to capture it in its entirety without the need for assembly. The same is true for many genomic elements that stretch far beyond the bounds of a single short read.
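To make the read-length argument concrete, here is a minimal sketch in Python. It assumes that a read can only yield a direct repeat count if it covers the entire repeat plus a short stretch of unique sequence on each side. The function name `read_can_span` and the 20-base flank are illustrative assumptions, not values taken from any particular assay or aligner.

```python
def read_can_span(read_length: int, repeat_length: int, flank: int = 20) -> bool:
    """Return True if a single read could cover the whole repeat plus both flanks.

    `flank` is a hypothetical anchor length: the unique sequence needed on
    each side of the repeat to place the read unambiguously.
    """
    if min(read_length, repeat_length, flank) < 0:
        raise ValueError("lengths must be non-negative")
    return read_length >= repeat_length + 2 * flank


# The 1,000-base expansion from the text, checked against two short-read
# lengths and one long-read length (the 5,000-base read is illustrative).
for read_length in (200, 300, 5000):
    print(read_length, read_can_span(read_length, repeat_length=1000))
```

Under these assumptions, neither the 200 bp nor the 300 bp read can span the expansion, while a read of a few thousand bases can capture it in one piece.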

This has major implications in the realm of carrier screening. Of the top 10 neonatal genetic conditions, six are caused by variants in genes that cannot be characterized accurately with short-read sequencing. Several other hard-to-analyze genes are included in the 113 genes recommended for screening by ACMG.

Consider a few examples. Gaucher disease is associated with copy number variation, structural rearrangements, and SNVs/indels in the GBA gene, but assigning variants correctly requires distinguishing GBA from its highly homologous pseudogene, GBAP1. Variants must be phased to GBA specifically to be relevant, which is difficult when the variant lies far from the features that differentiate GBA from GBAP1; structural rearrangements further complicate long-range PCR.

Fragile X syndrome is caused by a repeat expansion in the 5’ UTR of the FMR1 gene. Not only is it critical to get an accurate count of CGG repeats, which can range into the hundreds, but it’s also important to spot any AGG interruptions in the sequence because they have clinical relevance. In one final example, large inversions in an intron of F8 are responsible for nearly half of all cases of severe hemophilia A.
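As a rough illustration of what counting repeats and spotting interruptions involves once the full repeat tract has been captured in a single read, consider the Python sketch below. It assumes the tract has already been extracted, is in frame, and contains only CGG and AGG triplets; real reads contain sequencing errors and need more careful handling. The function name `summarize_cgg_tract` and the choice to include AGG triplets in the total are assumptions made for illustration.

```python
def summarize_cgg_tract(tract: str) -> dict:
    """Count triplets in an FMR1-style repeat tract and locate AGG interruptions.

    Assumes `tract` is an in-frame string made up only of CGG and AGG
    triplets. Positions are 1-based triplet indices.
    """
    tract = tract.upper()
    if len(tract) % 3 != 0:
        raise ValueError("tract length must be a multiple of 3")

    triplets = [tract[i:i + 3] for i in range(0, len(tract), 3)]
    unexpected = sorted(set(triplets) - {"CGG", "AGG"})
    if unexpected:
        raise ValueError(f"unexpected triplets in tract: {unexpected}")

    return {
        # Assumption: the total counts AGG interruptions as repeat units.
        "total_repeats": len(triplets),
        "cgg_count": triplets.count("CGG"),
        "agg_positions": [i + 1 for i, t in enumerate(triplets) if t == "AGG"],
    }


# A made-up tract: three runs of nine CGGs separated by two AGG interruptions.
example = "CGG" * 9 + "AGG" + "CGG" * 9 + "AGG" + "CGG" * 9
print(summarize_cgg_tract(example))
```

The point of the sketch is that this bookkeeping becomes simple only when one read contains the whole tract; with short reads, the triplets would first have to be stitched together from fragments that all look alike.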

Most labs deal with these challenging genes by resorting to different techniques for each gene, such as qPCR for SMN1, PCR/CE for FMR1, and long-range PCR followed by Sanger sequencing for GBA. Multiplex ligation-dependent probe amplification and other approaches are used for other troublesome genes. For robust and reliable carrier screening, though, it would be simpler for labs to have a single platform that can resolve all difficult genes, leaving short-read sequencing for the easy ones. 

Long-read sequencing can fill this gap. Depending on the platform, long-read sequencers can generate reads that are routinely thousands of bases or even hundreds of thousands of bases long, capturing the largest genomic elements completely. Long-read data also makes it easier to phase DNA, so maternal and paternal alleles can be represented when needed. If capital expenditures are a concern for clinical labs, at least one type of long-read sequencing—nanopore-based sequencing—is economical enough to be implemented without a big-ticket purchase. 

The significant increase in genes that should be covered with carrier screening does not have to mean added complexity for clinical labs. Handling the straightforward genes with readily available short-read sequencing and complementing that with a single long-read workflow to cover the dozen or so challenging genes would allow labs to step away from overly complicated testing strategies that currently require several different testing platforms. 

 

Ninad Pendse serves as senior product manager for molecular diagnostic products at Asuragen, a Bio-Techne brand. He can be reached at Ninad.Pendse@bio-techne.com
