Niven R. Narain, PhD, President and CEO of BPGbio, asks how a better calibre of biobanks in life sciences can lead to more drugs making it to market.
Artificial Intelligence (AI) has become a popular buzzword in the drug discovery and development lexicon, but not all approaches to AI-driven drug discovery produce meaningful insights that translate into real compounds which ultimately demonstrate clinical efficacy in human trials. As an early adopter of AI tools, I’ve witnessed how the quality of the inputs into an AI model is a key determinant of the quality of its outputs. In other words, garbage in, garbage out. The calibre of ‘biobanked’ samples, along with clinical and demographic annotation, wet lab experiments, and real-world data, is critical to ensuring AI-derived insights translate into clinical successes.
Use of AI tools does not equal guaranteed drug success
AI has garnered praise for its potential to enable faster iteration and testing of drug compounds across the broader chemical space through in silico experiments. Research that once took months to complete, requiring the synthesis and testing of compounds, can now be done in days or even hours using AI. But despite significant investment in companies built on AI tools, most have yet to see the products of their in silico experimentation prove successful in human clinical trials, and the most advanced human trials of AI-developed drug candidates are still in Phase II. We eagerly await future data read-outs from these trials because, after all, it is the drug that patients are waiting for, not our fancy algorithms.
The recent and unfortunate failure of BenevolentAI’s medicine in development for eczema is a cautionary tale for an industry desperate for more rapid solutions to its pipeline woes, the high costs of developing new compounds, and the deafening cacophony of calls for additional regulation to decrease drug prices.
As an early adopter of AI technology in drug discovery, I have long known a reality that has only recently come into focus for all to see: AI-designed compounds, just like compounds designed by other methods, fail because of poor translation from model to human. Improving the success of AI-driven drug discovery therefore requires us to improve the AI’s output through better data inputs, followed by rigorous pre-clinical, pharmacodynamic, and mechanism studies.
Not all biobanks are created equal
Biobanks provide a treasure trove of clinical data for exploration with AI tools. In anticipation of the long-awaited personalised medicine revolution, many different types of biobanks have emerged to catalogue samples. But the utility of biobanks varies. Private biobanks from consumer genomics companies like 23andMe may have millions of samples, but without deep clinical annotation and phenotypic data they may miss the important nuances that come from having both genotypic and phenotypic information, preferably from longitudinal samples. The combination of pre-diseased, diseased, and post-diseased samples from the same patients with their clinical data and rich annotation (demographic data, phenotypic data, and the results of wet lab experiments) is a key differentiator for companies like BPGbio, which use multiomics data from private biobanks as AI inputs.
The quality and diversity of the samples themselves is another issue that can impede the effectiveness of an AI model based on human samples. Our company’s clinically annotated 100,000+ patient sample biobank was catalyzed by our relationship with the U.S. Department of Defense (DoD) and over 50 strategic relationships with medical schools, universities, and hospitals worldwide. Even so, embracing diversity, not only ethnic but also geographic, must always remain at the forefront of biobanking efforts. As the field of AI-derived drug discovery grows, it’s critical that biobanks are populated using broader efforts to obtain clinically annotated samples from diverse populations. Preserving tissues and other sample types, alongside robust clinical annotation, will only serve to improve AI models like ours for future discovery efforts in drugs and diagnostics.
Better inputs, better outputs
Using biobanked samples from the U.S. DoD and institutions around the world, and a proprietary AI model running on the Frontier supercomputer at Oak Ridge National Laboratory, BPGbio has been able to identify promising diagnostic markers and multiple drug compounds currently in Phase II studies for aggressive diseases including glioblastoma multiforme (GBM) and pancreatic cancer. We also have development programs studying compounds for epidermolysis bullosa (EB), squamous cell carcinoma (SCC), chemo-induced alopecia (CIA), and sarcopenia, as well as a deep pipeline of early-stage drug targets and compounds being assessed for various solid and liquid tumours and neurodegenerative diseases.
The methods employed in building our AI models have also fuelled innovation in diagnostics, most notably in cancers of the prostate, pancreas, and breast, as well as Parkinson’s disease.
Truly solving the translational challenges that have plagued this industry and resulted in high failure rates for compounds in human trials will require a dramatic scale-up of the approach we have pioneered and advanced. Both public and private biobanking efforts need to expand their activities to collect more samples, across more disease indications, from more diverse patients, with richer clinical annotation, over multiple time points.
Towards personalised medicine for populations
From my vantage point, we are still in the early days of understanding the value of biobanked samples and AI models in designing new drugs. In the future, by leveraging expanded biobanks and robust clinical annotation data, I hope that the industry will be able to create increasingly personalised and optimised medicines for population subsets. For that vision to become a reality, our industry needs to deepen cross-industry collaboration and dramatically expand the collection, annotation, and use of biobanked samples alongside other lab-derived data, so that we can optimise the AI models from which we design medicines that best represent the populations that require them.