Breakthroughs in gene therapy are only possible with an exact understanding of the genetic underpinnings of disease. To develop safe and effective gene therapies, researchers need confidence that genomic data is both complete and accurate. Here, DDW Editor Reece Armstrong speaks to Neil Ward, General Manager and VP at PacBio, about the benefits of long-read sequencing and how it can make genomic data more complete, potentially leading to a better understanding of disease.
Could you explain the differences between short-read and long-read sequencing?
While typical short-read technologies can sequence around 300 base pairs, long-read technologies, including PacBio HiFi sequencing, can generate reads of around 15,000-20,000 base pairs in length. This considerable difference in read length affords researchers a more complete and accurate view of genomic variation. The key difference between the two technologies is that long reads are more effective at unambiguously determining how all the pieces of a sequenced genome fit together. Visualising the whole genome is especially useful when sequencing new organisms, or stretches of DNA with many repeats or translocations, which are difficult to see from short sections of DNA code. Long reads are often applied in complex areas, such as rare diseases and neurodegenerative diseases, which require deeper insights into more regions of the human genome that cannot be achieved with short reads alone.
What is the significance of long-read sequencing and its ability to delve into the final 8% of the genome?
Until very recently, the human genome most frequently used as a reference genome was only 92% complete. However, just this year the Telomere-to-Telomere (T2T) Consortium finally finished sequencing the complete human genome, with the help of PacBio’s sequencing technology. Accurately sequencing and assembling this remaining 8% of the genome was only possible because of highly accurate long-read sequencing technologies. We now have the tools to explore these challenging parts of the genome and figure out the roles that they play in human disease.
How advanced is current genomic sequencing technology and does it offer enough accuracy to help researchers understand disease and develop effective gene therapies?
Recent years have seen a great deal of progress in both short- and long-read sequencing. The ability to see the full genome, including difficult-to-sequence regions, is essential for understanding how genetic variations drive susceptibility to disease, response to therapies, and many other phenotypes. The improved understanding that highly accurate long-read sequencing brings will help to advance gene therapies, including those based on adeno-associated virus (AAV) vectors. Long-read data allows researchers to comprehensively discover, design and confirm AAV gene therapy approaches. Moreover, the ability to see the full length of a sequence is important for discovery and for identifying truncated products, while high accuracy is important for identifying undesired mutations.
For years now, the UK has been something of a leader in genomic research. Do you agree with this sentiment, and did it play a part in PacBio’s EMEA expansion into London?
I would certainly agree that the UK has an impressive heritage in genomics, and without a doubt is a leader in the field. This was a key factor in our decision to set up our EMEA HQ in London, where we are close to the scientific hubs in Cambridge and Oxford. We will also benefit from the diverse and skilled workforce found in and around the city. As well as the scientific know-how embedded in the UK, there is a growing tech scene around King’s Cross, where our lab is located, with Google just next door, and we’re always looking for tech-savvy people who can apply their skills to science.
How has PacBio’s work with Genomics England helped inform the use case for long-read sequencing as a tool to help diagnose rare disease patients?
We’re delighted to be working with Genomics England to demonstrate how PacBio HiFi sequencing can help identify the genetic causes of rare diseases that remain undiagnosed after short-read sequencing. The study will re-sequence a selection of biobanked samples collected during Genomics England’s 100,000 Genomes Project that have been previously analysed with short-read sequencing. This study is also part of our broader efforts to demonstrate the use case for long-read sequencing in identifying rare diseases. Some of our other rare disease-focused research collaborations include Care4Rare Canada Consortium, ARUP Laboratories, UCLA Health, and Children’s Mercy Kansas City.
What are the opportunities to use genomic sequencing to guard against future pandemics? Is there any application in the antimicrobial resistance space?
Genomic sequencing plays an essential role in discovering novel pathogens and detecting their changing biology, helping to improve disease surveillance. In antimicrobial resistance research, scientists need continual genomic-level analysis to understand resistance and the evolution of microbes. However, in these areas, research efforts should focus not just on sequencing known human pathogens, but also on sequencing all micro-organisms to better understand how they are evolving. We are keen to see greater efforts to globally characterise all life, and PacBio supports this aim through our involvement in studies including the Human Pangenome Project and the Darwin Tree of Life Project.
Now that scientists have a better understanding of the whole human genome, is there anything you’re particularly excited about in terms of the research to follow?
One area I am personally excited about is the breakthroughs that could follow in cancer diagnosis and treatment. I’m especially interested in the potential application of sequencing technologies in precision oncology, and how the combination of short- and long-read technologies may advance this field. Cancer is a disease of the genome, and the more we can understand cancer at a molecular level, the quicker we can make diagnoses and the better we can treat people.