AI’s Groundbreaking Role in New Drug Target Discovery

Drug discovery and target identification are notoriously expensive and slow: preclinical drug development (the research stage that precedes clinical trials) can take more than five years and cost billions of dollars.

Even the experimental drugs that make it to clinical trials tend to flame out quickly. Around 90 percent of clinical drug development ultimately fails, costing companies between US$30 million and more than $300 million per clinical trial.

Failure during the preclinical or clinical trial phases is costly for drug companies and potentially disastrous for patients waiting for treatment.

But AI has begun to revolutionize the drug discovery and target identification process, helping pharma companies improve time to market while creating safer, more effective products for less money.

What is Target Identification in Drug Discovery?

Drug discovery is the process of discovering new medications and involves several steps, including molecular simulation, the prediction of drug properties, de novo drug design, and drug target identification.

The last of these steps, target identification, is the process of identifying the biological molecules and cellular pathways that drugs can act on to deliver therapeutic benefits. It helps researchers understand how drugs will interact with the body, determine appropriate dosing levels, and assess whether a drug could trigger an adverse reaction in patients.

How Does AI Help the Target Identification Process?

Despite the well-oiled clinical trial processes in place at most established pharmaceutical firms, even the most productive among them can manage only between three and a dozen clinical trials per year.

Increased Velocity

The introduction of AI tools into drug development, including drug target identification, can help speed up this process, according to Amgen’s Marissa Mock and colleagues, writing in Nature.

Thanks to such AI techniques (along with other innovations, such as robotic workstations), the authors from Amgen say their company spends 60 percent less time on drug development up to the clinical trial stage than it did five years ago.

Developing Protein Therapeutics: A Comparison

Stage or metric | Manual (traditional) approach | AI approach
Screening to identify proteins that will bind to a desired target at the appropriate strength | 6 months | 3 months
Modifying proteins to have the right properties | 18 months | 6 months
Rate of successful production of a clinical trial drug candidate | ~50% | >90%


Greater Efficiency

AI has already proven effective at improving efficiency across the entire drug discovery lifecycle, including target identification. The target identification process helps researchers determine appropriate dosing levels, understand how drugs will interact with the body, and ascertain whether the drug could trigger an adverse reaction.

To improve drug targeting, AI models are trained on large datasets (including omics, phenotypic, expression, disease association, patient, and clinical trial data), which helps them understand how diseases work while identifying new proteins with specific therapeutic benefits.

More Accurate Predictions

When properly trained, these AI models can quickly recognize patterns in the amino acid sequence of a protein and other data to predict a drug candidate’s efficacy, safety, and ease of manufacture. The models can then predict the medicinal properties of a specific protein or design a modified protein with more desirable properties, such as a longer shelf life.

AI models can also run deep analyses on which other processes within the body could be affected by a particular compound when attempting to target a specific pathway. They can even predict the molecular structure of targets in 3D, helping to accelerate drug design through more effective drug binding.  

Before AI, much of this painstaking work was performed manually and took much longer. 
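
As a rough illustration of how a protein's amino acid sequence might be turned into numbers a model can learn from, the sketch below computes a simple amino acid composition vector. It is a minimal example, not the featurization any company named in this article is known to use; the function name and the 10-residue fragment are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes for the 20 standard residues


def composition(sequence):
    """Return the fraction of each standard amino acid in a protein sequence."""
    counts = {aa: 0 for aa in AMINO_ACIDS}
    for residue in sequence.upper():
        if residue in counts:  # skip non-standard or unknown symbols
            counts[residue] += 1
    total = sum(counts.values()) or 1  # avoid division by zero on empty input
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical 10-residue fragment, used only to show the shape of the output
features = composition("MKTAYIAKQR")
print(len(features))  # one value per standard amino acid
```

A fixed-length vector like this could be fed to a conventional classifier or regressor. Production systems typically rely on much richer representations that preserve residue order.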

Milestones in AI-Enabled Drug Discovery

Researchers have made much progress over the past few years in applying AI models to drug discovery and target identification. Here are some of the most important recent milestones in the quest for AI-enabled drug discovery:

  • Early 2020: Exscientia announces the first AI-designed drug molecule in human clinical trials
  • July 2021: Google DeepMind’s AlphaFold predicts the structures of 330,000 proteins (the AlphaFold Protein Structure Database now includes more than 200 million proteins)
  • February 2022: Insilico Medicine begins Phase I clinical trials for the first AI-discovered molecule based on a novel drug target discovered by the company’s Pharma.AI platform
  • January 2023: AbSci creates and validates de novo antibodies in silico using generative AI, the first company ever to do so 
  • February 2023: The FDA grants an Orphan Drug Designation to Insilico Medicine’s experimental AI-developed drug for the treatment of idiopathic pulmonary fibrosis
  • September 2023: Deep Genomics launches BigRNA, an AI foundation model with “the potential to reshape the landscape of RNA therapeutic discovery”
  • September 2023: Researchers from the University of Cambridge and Insilico Medicine develop a new way to find novel drug targets for diseases caused by dysregulation of the protein phase separation process
  • September 2023: DeepMind releases its new AI tool, AlphaMissense, which researchers say can analyze missense variants to predict genetic diseases with 90 percent accuracy

The Future of AI-Enabled Drug Targeting

The industry’s enthusiasm for AI-enabled drug discovery and drug targeting is evident in the numbers: Morgan Stanley says “even modest improvements” in early-stage drug development could equate to 50 new therapies (representing approximately $50 billion) over the next decade. 

Indeed, third-party investment in AI drug discovery research hit $5.2 billion at the end of 2021 after doubling yearly for the previous five years.

However, one significant blocker to progress in this area is that individual drug companies sometimes can’t generate enough internal data to perform AI-enabled drug targeting on their own. That’s why some in the industry, including the authors from Amgen cited earlier, have proposed data collaboration among companies without compromising competitive information.

Suggested approaches include federated learning, a type of decentralized machine learning in which each participant trains a model locally on its own raw data and shares only model updates, rather than pooling raw data on a central server. Because the raw data never leaves its owner, data privacy among participants improves.

Implementing decentralized federated learning in the drug development pipeline can “improve developers’ predictive abilities, benefiting both the firms and the patients,” according to the authors of the Nature study quoted earlier.
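
To make the idea concrete, here is a minimal sketch of one common federated learning scheme, federated averaging. It is not the method described in the Nature article. It assumes three hypothetical sites with synthetic data and a one-feature linear model, and every function name is invented for illustration.

```python
import random


def local_update(weights, data, lr=0.1, epochs=5):
    """Run a few epochs of gradient descent on one site's private data."""
    w, b = weights
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y
            grad_w += 2 * err * x
            grad_b += 2 * err
        w -= lr * grad_w / len(data)
        b -= lr * grad_b / len(data)
    return (w, b)


def federated_average(updates, sizes):
    """Combine site models, weighting each by its number of samples."""
    total = sum(sizes)
    w = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    b = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return (w, b)


def make_site_data(n, rng):
    """Synthetic stand-in for a private dataset: y = 3x + 1 plus noise."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1, 1)
        data.append((x, 3 * x + 1 + rng.gauss(0, 0.1)))
    return data


rng = random.Random(0)
sites = [make_site_data(n, rng) for n in (40, 60, 100)]  # three hypothetical firms

global_model = (0.0, 0.0)
for _ in range(50):  # communication rounds
    # Each site trains locally; only the resulting weights leave the site
    updates = [local_update(global_model, data) for data in sites]
    global_model = federated_average(updates, [len(d) for d in sites])

print(global_model)  # should land near the true slope 3 and intercept 1
```

Only the two model parameters cross site boundaries in each round, while the raw `(x, y)` pairs stay where they were generated. A real deployment would add safeguards such as secure aggregation, which this sketch omits.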

This approach could yield even more accurate predictions when combined with active learning, a form of semi-supervised machine learning in which the model itself identifies which additional training examples would most improve its predictions.
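
The sketch below shows one simple form of active learning, pool-based uncertainty sampling, under heavily simplified assumptions. The `oracle` function stands in for a costly lab assay, the one-dimensional "activity threshold" is hypothetical, and none of the names come from the sources cited in this article.

```python
import random


def oracle(x):
    """Stand-in for an expensive experiment: hypothetical rule, active if x > 0.62."""
    return x > 0.62


def fit_threshold(labeled):
    """Estimate the boundary as the midpoint between the highest inactive
    and the lowest active labeled example."""
    inactive = [x for x, y in labeled if not y]
    active = [x for x, y in labeled if y]
    return (max(inactive) + min(active)) / 2


rng = random.Random(1)
pool = [rng.random() for _ in range(1000)]            # unlabeled candidates
labeled = [(0.0, oracle(0.0)), (1.0, oracle(1.0))]    # two seed labels

for _ in range(10):  # labeling budget of 10 experiments
    threshold = fit_threshold(labeled)
    # Uncertainty sampling: query the candidate closest to the current boundary,
    # where the model is least sure of the label
    query = min(pool, key=lambda x: abs(x - threshold))
    pool.remove(query)
    labeled.append((query, oracle(query)))

threshold = fit_threshold(labeled)
print(threshold)  # estimated boundary after 10 queries
```

Because each query targets the region where the model is least certain, ten labels are enough here to narrow the boundary far more tightly than ten randomly chosen labels typically would.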

Either way, given the rapid advances of the past few years, it’s not difficult to envision a near future featuring fully automated drug discovery powered by AI.

CapeStart’s machine learning engineers and data scientists work with pharmaceutical and medical companies every day to get products to market faster, enhance internal efficiency, and improve health outcomes. Contact us today to learn more about how we can help accelerate your medical research. 
