From data to cure: The journey of AI in cancer trials


Deepika Khedekar, Associate Centralized Clinical Trial Lead at IQVIA Inc, looks at the challenges of utilising artificial intelligence (AI) in oncology clinical research.

Ten million [1] people around the world die from cancer each year. Yet, while the number of cancer clinical trials has soared from 19,211 in 2013 to a whopping 26,396 in 2022 [3], and they keep surging in 2024, the success stories are few and far between. MIT research [4] tells us a hard truth: about 95% of these trials fall short, leaving behind a trail of broken hopes for cancer patients each year.

Challenges in oncology research

Time is one of the scarcest resources in cancer research, yet cancer clinical trials last about 14-18 months longer than non-oncology clinical trials, and the total research cycle for oncology drugs hovers around 12 years [5]. Set that against the average overall five-year survival rate in pancreatic cancer, which is a mere 12% or so [6], and our cancer trials are far too slow, to say the least. Data is another challenge. Oncology trials produce about 3.1 million data points per protocol, as opposed to 1.9 million in non-oncology trials [5]. That is a lot of data that needs to be analysed quickly, efficiently and reliably.

The next challenge is enrollment. Cancer trials are notoriously selective because eligibility often depends on the genetic profile of patients. That brings the enrollment rate down to about 14% in non-small cell lung cancer trials, compared to 54% in non-oncology trials [5]. Cancer also progresses rapidly, which leads to high drop-out rates. So even if a trial enrolls enough patients, there is no guarantee that it will proceed as planned, since the remaining pool of patients may be too small to measure drug efficacy and toxicity with statistical confidence. These and many other challenges weigh on the success rate of oncology trials, which is currently stuck at a mere 5% [4]. This is where artificial intelligence and machine learning might come in handy.

Foundations of artificial intelligence: From machine learning to neural networks

Artificial intelligence (AI), machine learning (ML), and neural networks (NNs) are all interconnected yet distinct. AI is a broad domain of computer science focused on creating systems that perform tasks typically requiring human intelligence. ML, a subset of AI, involves teaching systems to learn and improve from experience, much as humans learn from their mistakes. Instead of programming specific rules for every situation, ML lets the system learn these rules by examining examples.
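To make "learning rules from examples" concrete, here is a minimal sketch. Everything in it is an illustrative assumption: the measurements and labels are made up, and `learn_threshold` is a hypothetical helper rather than part of any study cited here. Instead of hard-coding a cut-off, the code picks the cut-off that best separates the labelled examples.

```python
def learn_threshold(examples):
    """Pick the cut-off that classifies the most examples correctly.

    examples: list of (value, label) pairs, label is True or False.
    The learned rule is: predict True when value >= cut-off.
    """
    candidates = sorted({value for value, _ in examples})
    best_cutoff, best_correct = None, -1
    for cutoff in candidates:
        # Count how many examples this candidate rule gets right.
        correct = sum((value >= cutoff) == label for value, label in examples)
        if correct > best_correct:
            best_cutoff, best_correct = cutoff, correct
    return best_cutoff


# Made-up training data: (measurement, responded to treatment?)
examples = [(1.0, False), (2.0, False), (3.5, True), (4.0, True), (5.0, True)]
cutoff = learn_threshold(examples)
print(cutoff)
```

Nobody wrote the rule "3.5 or above" into the program; it came out of the examples. Real ML models learn far richer rules, but the principle is the same.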

A neural network is a type of ML mechanism designed to imitate human brain functionality, with interconnected nodes processing information. Each of these nodes is a mathematical function. These artificial networks excel at identifying patterns in complex data. AI, as the overarching concept, employs ML to learn from data, and NNs are a specific method in ML, particularly effective in pattern recognition within complex data. That brings us to our next question – so what problems can AI solve in oncology research?
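The idea that each node is a mathematical function can be sketched in a few lines. This is a toy illustration, not a model from any study cited here: the weights are hand-picked assumptions, whereas a real network learns them from data, and the sigmoid is just one common choice of activation function.

```python
import math


def neuron(inputs, weights, bias):
    """One node: a weighted sum of the inputs, squashed by a sigmoid into (0, 1)."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def tiny_network(inputs):
    """Two hidden nodes feeding one output node (hypothetical weights)."""
    h1 = neuron(inputs, [0.5, -0.2], 0.1)
    h2 = neuron(inputs, [-0.3, 0.8], 0.0)
    return neuron([h1, h2], [1.0, -1.0], 0.0)


print(tiny_network([1.0, 2.0]))
```

Connecting nodes so that the output of one becomes the input of the next is what lets a network capture patterns that no single node could represent.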

Predicting treatment outcomes with AI & tumour genetics in oncology trials

In a study [8] conducted by the University of California San Diego School of Medicine, researchers used artificial intelligence to overcome a significant hurdle in oncology: predicting chemotherapy resistance.

The researchers delved into the genetic makeup of tumours, recognising that genetic mutations heavily influence how tumours respond to drugs. They concentrated on 718 genes often used in cancer genetic testing and fed mutations in those genes into their machine learning model. Their AI model, trained on publicly available drug response data, identified 41 key molecular assemblies – think of them as teams of proteins – that significantly influence drug efficacy. This approach was particularly impactful in cervical cancer, where about 35% of tumours typically resist treatment. The AI model could distinguish which tumours were likely to respond and which weren’t, while also shedding light on the molecular mechanisms behind these outcomes, which makes its predictions more transparent.

So, what does this mean for clinical trials in oncology? First off, it could considerably improve our ability to predict treatment outcomes in clinical trials. This means we can be more selective about which treatments we test in oncology trials, focusing on those likely to be effective. In turn, this strategy could reduce the failure rate of cancer clinical trials. Also, understanding the molecular whys and hows of treatment resistance opens up new pathways for developing oncology drugs. The role of AI in advancing cancer clinical research goes well beyond predicting treatment outcomes.

Accelerating drug discovery through AI

With around 20,000 different types of proteins in the human body and more than 10,000 chemical compounds to explore for drug discovery, the traditional methods of identifying effective protein-drug pairs to target cancer cells are slow and computationally intensive. Why? Because in addition to evaluating a whopping 20,000 × 10,000 (that is, 200 million) drug-protein pairs, the conventional process uses 3D structures of proteins in these computations. So, what this really means is that we’re going to need a whole lot of computing power to crunch the numbers and, at the same time, process the 3D structural data for each pair before we can discover a new drug for cancer.

ConPLex, an AI model developed by MIT and Tufts University researchers, addresses this challenge [9]. It encodes protein information into numerical representations of amino-acid sequences. Trained on known protein-drug interactions, it learned to associate specific protein features with drug-binding ability without the need to process the 3D data associated with each protein. The model was tested on 4,700 candidate drug molecules for their ability to bind to a set of 51 proteins. It identified 19 drug-protein pairs with strong drug-protein affinity scores, 12 of which showed the strongest affinity. ConPLex’s ability to screen over 100 million compounds in a day drastically reduces the time and cost of drug discovery in oncology, significantly improving the efficiency and accuracy of this process that typically takes about four to five years in the absence of AI.
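A quick back-of-the-envelope calculation puts these figures side by side. It is only a rough sketch: it uses the approximate protein and compound counts quoted earlier, and it assumes that the reported throughput of 100 million compounds a day can be read as 100 million drug-protein pairs scored per day, which may not match how the researchers measured it.

```python
PROTEINS = 20_000    # approximate protein types in the human body (from the text)
COMPOUNDS = 10_000   # approximate candidate compounds (from the text)

# Assumption: treat "100 million compounds in a day" as pairs scored per day.
PAIRS_PER_DAY = 100_000_000

total_pairs = PROTEINS * COMPOUNDS
days_needed = total_pairs / PAIRS_PER_DAY

print(f"{total_pairs:,} pairs, about {days_needed:g} days at the assumed rate")
```

Under that assumption, an exhaustive first pass over every pair takes days rather than years, which is why sequence-based screening is attractive as a filter before slower, structure-based work.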

AI’s role in oncology research is pivotal, offering advancements in predicting treatment outcomes and accelerating drug discovery. However, these innovations also come with challenges and risks that must be carefully managed.

Challenges on the path to AI integration in cancer research

Data privacy and bias in datasets

One of the most significant challenges facing AI in cancer trials revolves around data: obtaining large and diverse datasets that are both comprehensive and respectful of patient privacy is a complex task. Cancer datasets are particularly sensitive since they contain confidential health information, necessitating stringent ethical and privacy measures. The need to protect this data isn’t just about compliance with laws like HIPAA or GDPR; it’s about maintaining trust. Without robust privacy measures, we risk not only ethical breaches but also eroding the very foundation of patient trust that clinical trials depend on. But here’s the kicker: these same privacy measures can make it tough to get the diverse, rich datasets AI needs to learn effectively. It’s a bit of a catch-22, and finding the balance is more art than science.

Furthermore, these datasets often suffer from biases, failing to represent the full diversity of the population adequately. This lack of diversity in data can skew AI algorithms, potentially resulting in less effective or even ineffective treatments for underrepresented groups. Bridging this gap requires meticulous attention to data collection and processing, ensuring AI models are trained on datasets that are not only vast but also inclusive and representative. That brings us to our next challenge – the fragmentation of data and the inherent opacity of many AI models, which further complicate the landscape of oncology research.

Data fragmentation & model opacity

Data used to train AI models comes in from a range of sources – different hospitals, labs, research centres – and it’s rarely in a consistent format. This lack of data standardisation is a roadblock to creating AI models that work across the board. AI models require data that is consistent and comparable across different sources. Normalising these data to a common scale without losing significant information is a challenging but essential step in developing accurate and reliable AI models [10]. Effective data normalisation ensures that the AI model’s predictions are based on true biological or clinical signals rather than artifacts of data processing.
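One simple way to put measurements from different sites on a common scale is to standardise each site's values separately (a per-site z-score). The sketch below is illustrative only: the site names and values are made up, `zscore_by_site` is a hypothetical helper, and real harmonisation pipelines are considerably more involved.

```python
from statistics import mean, pstdev


def zscore_by_site(records):
    """Standardise each value against its own site's mean and spread.

    records: list of (site, value) pairs.
    Returns a list of (site, z) pairs in the same order.
    """
    by_site = {}
    for site, value in records:
        by_site.setdefault(site, []).append(value)

    stats = {site: (mean(vals), pstdev(vals)) for site, vals in by_site.items()}

    out = []
    for site, value in records:
        mu, sd = stats[site]
        # A site whose values never vary carries no spread to scale by.
        z = 0.0 if sd == 0 else (value - mu) / sd
        out.append((site, z))
    return out


# Made-up example: two sites reporting the same marker on different scales.
records = [("A", 10.0), ("A", 12.0), ("A", 14.0),
           ("B", 100.0), ("B", 120.0), ("B", 140.0)]
normalised = zscore_by_site(records)
print(normalised)
```

After normalisation the two sites line up despite their different raw scales. The trade-off is the one noted above: standardising per site also erases any genuine difference between sites, so the choice of method has to be weighed against the signal it might remove.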

Compounding this, the complexity of AI, especially deep learning models, introduces a “black box” problem where the rationale behind AI-driven decisions remains obscure. This opacity can erode confidence in AI-assisted clinical decisions, as stakeholders may find it challenging to trust outcomes when the underlying logic is inaccessible. Addressing these issues is essential for harnessing the full potential of AI in clinical settings, ensuring both the reliability of trials and the establishment of trust in AI-driven healthcare solutions. Data is not the only challenge we have in this journey. There are more – regulations, validations and integrations.

Regulatory compliance, AI model validation, and clinical workflow integration

Incorporating AI models into cancer clinical trials also involves a dual challenge: navigating the stringent regulatory approval process and validating the models for safety and efficacy, while seamlessly integrating these tools into existing trial and clinical workflows. This isn’t just about ticking boxes; it’s about ensuring these AI models genuinely work across the board – for every kind of patient you might encounter in a trial, without leaving anyone at a disadvantage.

Moreover, the integration of AI into the day-to-day operations of cancer trials calls for intuitive design and user-centric interfaces that align with the fast-paced environment of oncology clinics, ensuring that AI tools augment rather than complicate clinical processes. This endeavor requires a concerted effort from developers, clinicians, and regulatory bodies to forge AI solutions that not only achieve validation and regulatory milestones but also enhance clinical workflow efficiency.

The future of cancer research

The integration of AI and ML into cancer clinical trials represents a pivotal shift toward addressing the intricate challenges that have long impeded progress in oncology research. From the protracted durations of clinical trials to the immense complexity of data analysis required, the traditional approaches to cancer research are being outpaced by the rapid advancement of the disease itself. However, the advent of AI and ML offers a transformative potential to surmount these hurdles, through enhanced prediction of treatment outcomes and the acceleration of drug discovery processes. This has a profound impact on both the efficiency and precision of oncology trials.

Yet, the journey towards fully harnessing AI in cancer research is fraught with challenges – data privacy concerns, dataset biases, data fragmentation, model opacity, regulatory hurdles, and the need for seamless clinical integration. Addressing these issues demands a delicate balance between innovation and ethical responsibility, ensuring that AI-driven solutions are both effective and equitable. The question is profound yet simple – how do we ensure that the AI revolution in cancer research not only advances science but also ensures that every patient benefits from these innovations?

Our efforts to refine AI models should go beyond mere technical achievements; they should reflect a commitment to inclusivity and ethics, making certain that our advancements in AI not only propel scientific progress but also do so with profound humanity.


  1. World Health Organization. Cancer Key Facts. Accessed January 6, 2024.
  2. World Health Organization. Childhood Cancer Key Facts. Accessed January 6, 2024.
  3. World Health Organization. International Clinical Trials Registry Platform (ICTRP). Accessed January 15, 2024.
  4. Wong CH, Siah KW, Lo AW. Estimation of clinical trial success rates and related parameters. Biostatistics. 2019;20(2):366.
  5. WCG. Emerging Challenges in Oncology Trials: Enrollment, Protocol Deviations, and Growing Data. Accessed January 15, 2024.
  6. ASCO. Pancreatic Cancer: Statistics. Accessed January 15, 2024.
  7. National Human Genome Research Institute. GENE. Accessed January 24, 2024.
  8. UC San Diego Today. AI Harnesses Tumor Genetics to Predict Treatment Response. Accessed January 18, 2024.
  9. Anne Trafton. MIT News. New model offers a way to speed up drug discovery. Accessed January 25, 2024.
  10. Zhang B, Shi H, Wang H. Machine Learning and AI in Cancer Prognosis, Prediction, and Treatment Selection: A Critical Approach. J Multidiscip Healthc. 2023;16:1779-1791. Published 2023 Jun 26.

About the author

Deepika Khedekar is an Associate Centralized Clinical Trial Lead at IQVIA Inc, where she spearheads clinical trial monitoring programmes for major pharmaceutical companies. In her 12+ years in the pharmaceutical industry, she has led Phase I, II and III clinical trial programmes in diverse therapeutic areas for leading US and Australia-based pharmaceutical organisations, such as Gilead Sciences, Macleods Pharma and Arrowhead Pharmaceuticals.
