Digital: AI & ML and Robotics

The future of generative AI and automation in drug discovery

How is generative AI being utilised in drug discovery to improve processes and secure higher-quality drugs?

Yann Gaston-Mathé at Iktos

In a year where both the Chemistry and Physics Nobel Prizes have been awarded for foundational contributions to artificial intelligence (AI), it is the perfect time to explore how AI and robotics are reshaping drug discovery. This field stands out as a key area for AI application, free from many of the ethical or regulatory challenges faced in other domains. As new technologies emerge, the exploration of chemical space for novel therapies is accelerating. By streamlining molecular design, automating synthesis and enhancing the overall quality and speed of discovery, AI is opening the door to the next frontier in drug discovery.

These advances are transforming the design, make, test, analyse (DMTA) cycle. AI and robotics enable scientists to better explore chemical space, design novel compounds through multiparametric approaches, automate molecule synthesis, and integrate assay results and predictions into subsequent design iterations. The impact of AI and robotics on drug discovery is clear – faster, more efficient processes and higher-quality drug candidates.

Beyond traditional computational chemistry

Drug hunting teams strive to promote preclinical candidate molecules with the desired balance of potency, selectivity, safety, solubility, bioavailability and novelty – essential properties that determine a drug’s success in clinical phases and treating patients. Over the past 30 years, computational chemistry has dramatically expanded the capabilities of drug discovery, allowing teams to screen vast libraries of small molecules in the hopes of identifying a promising starting point. Despite this, traditional computational methods such as library enumeration or the use of prebuilt virtual libraries often explore only limited regions of chemical space, or employ ‘blind’ virtual screenings without sufficient biological or structural guidance. These approaches are typically followed by sequential optimisation cycles aided by multiple computational chemistry techniques, addressing individual molecular properties one at a time – potency first, then selectivity – followed by other pharmacokinetic attributes.

While these methods have contributed to notable advancements, they frequently result in a suboptimal process that has not significantly accelerated discovery timelines and has had little impact on the current DMTA paradigm of sequential property optimisation. Worse still, optimising one molecular attribute often compromises others, necessitating further rounds of refinement.

In addition, traditional computational chemistry approaches face significant limitations, particularly when aiming to deliver truly innovative therapies. In the case of novel targets, there is often insufficient information available at the project’s inception, meaning that the right solution cannot be found in existing data sets or screening libraries. Even for well-established targets, new strategies must be devised to differentiate from prior work. As boundaries in chemical space are pushed, the limits of current predictive models are likely to be encountered, with their applicability diminishing in unexplored regions. This highlights the need for more dynamic and exploratory approaches that can adapt as new information becomes available throughout the drug discovery process.

The advent of chemical space exploration with generative AI

Generative AI (genAI) introduced a new era in small molecule drug discovery. Its application in chemistry research began with the first models trained on drug-like compounds in 2016. Since then, the focus has shifted towards using genAI to refine the early-stage design process, enabling medicinal chemists to devise more innovative strategies for discovering new therapeutics. Deep learning (DL) architectures have transformed numerous aspects of drug discovery, from protein structure prediction and reaction mechanism elucidation, to absorption, distribution, metabolism, excretion and toxicity (ADMET) property prediction, molecular docking, and the design and optimisation of small molecules.

Rather than relying on the traditional approach of screening vast libraries in the hopes of finding a viable candidate, AI-driven drug discovery now enables the design of molecules from the outset to meet specific therapeutic needs. Advances in genAI allow for a multiparametric approach, optimising several characteristics simultaneously, such as potency, selectivity and pharmacokinetics. This is possible as genAI can explore previously uncharted regions of chemical space, creating novel solutions beyond existing compounds and data sets. Importantly, project objectives are considered early in the design cycle and optimised in parallel. This streamlines the process and reduces the need for extensive compound libraries by focusing on a smaller, more targeted set of molecules with the potential to meet the required criteria for drug development. In turn, this aims to decrease discovery timelines by optimising the design step of the DMTA cycle and reducing the total number of cycles needed.

However, the process is not without challenges. While genAI can explore an estimated chemical space of 10^60 potential compounds, these must still be tested in the laboratory, and so the compounds designed by genAI must meet the limitations of chemical synthesis. To address this, many genAI models incorporate synthetic accessibility filters in a postprocessing step, scoring compounds based on their likelihood of being synthetically achievable. This refinement narrows the scope of possibilities, promoting those compounds with a higher likelihood of being experimentally viable from within a subset of molecules that were not optimised for synthetic viability. As explained by Rafael Gomez-Bombarelli, professor at Massachusetts Institute of Technology (MIT), US: “It is critical to connect the creativity of genAI models to the constraints of real chemistry, that needs to be made and tested quickly.”
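To make the idea of multiparametric scoring followed by a synthetic accessibility filter concrete, the following is a minimal Python sketch. It is illustrative only and does not represent any particular platform: the `Candidate` fields, the property values, the geometric-mean scoring and the `max_sa` threshold are all assumptions chosen for the example, with the accessibility estimate loosely following the common 1 (easy) to 10 (hard) convention.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class Candidate:
    # All fields are hypothetical, illustrative values.
    name: str
    potency: float      # predicted score, assumed already scaled to 0..1
    selectivity: float  # predicted score, assumed already scaled to 0..1
    pk: float           # predicted pharmacokinetics score, scaled to 0..1
    sa_score: float     # assumed synthetic accessibility: 1 (easy) to 10 (hard)


def desirability(c: Candidate) -> float:
    """Geometric mean of the property scores.

    A single poor property pulls the whole score down, so all
    objectives are weighed together rather than one at a time.
    """
    scores = (c.potency, c.selectivity, c.pk)
    return prod(scores) ** (1 / len(scores))


def shortlist(candidates, max_sa=4.0, top_n=2):
    """Post-processing step: drop hard-to-make molecules, then rank the rest."""
    feasible = [c for c in candidates if c.sa_score <= max_sa]
    return sorted(feasible, key=desirability, reverse=True)[:top_n]


candidates = [
    Candidate("mol_A", 0.90, 0.90, 0.90, 7.5),  # best profile, hard to make
    Candidate("mol_B", 0.80, 0.70, 0.60, 3.0),
    Candidate("mol_C", 0.95, 0.20, 0.90, 2.5),  # potent but unselective
    Candidate("mol_D", 0.60, 0.60, 0.60, 2.0),
]

picked = [c.name for c in shortlist(candidates)]
print(picked)
```

In this toy data set the molecule with the best property profile, `mol_A`, is discarded by the filter because it was never designed with synthesis in mind, which is the limitation of post hoc filtering noted above.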

GenAI meets the reality of traditional DMTA cycles

The manual and segmented nature of the DMTA cycle remains a significant barrier to accelerating the drug discovery process. Pairing genAI design tools with traditional synthesis and characterisation steps limits the overall productivity gains in the discovery pipeline.

For the design of synthetically feasible molecules that meet specific project requirements, a comprehensive system is needed – one that accounts for synthetic constraints alongside bioactivity and physicochemical properties to achieve an optimal outcome. Although there have been notable advancements in the ‘make’ stage of the DMTA cycle through automation, AI and machine learning (ML) have yet to make a substantial impact on synthesis itself, which remains largely unchanged from the methods used for decades. A typical workflow still follows these steps:

Design: Chemists generate lists of molecules for synthesis

Retrosynthetic planning: Analyse literature, devise synthetic routes and determine available building blocks

Materials sourcing: Order the necessary reagents and materials

Experimental setup: Prepare reaction vessels and conditions and run the reactions

Reaction monitoring: Oversee reactions, carry out work-up processes (quenching, extraction, evaporation) and prepare for purification

Purification: Select and optimise purification methods, run and monitor the purification, and concentrate the product

Characterisation: Confirm the structure and purity of the synthesised compound using appropriate analytical techniques.

This process is highly manual, labour-intensive and time-consuming, with low throughput – often only two to four reactions per day per chemist – since many steps cannot be parallelised efficiently. The unpredictable nature of synthesis timelines introduces bottlenecks that hinder the overall speed of the DMTA cycle, delaying progress in drug discovery. To further complicate matters, molecule synthesis is often outsourced to contract research organisations (CROs), either due to a lack of in-house expertise or to reduce costs, which introduces additional logistical challenges and managerial overhead.

Is automation enough? Not really…

The promise of robotics in drug discovery lies in the ability to significantly improve efficiency. Robots can operate continuously and parallelise many of the reaction, work-up and purification steps, making the synthesis of chemical matter far more efficient. While automated synthesis holds great potential, its implementation often presents challenges. Retrofitting automated synthesis platforms onto existing synthetic processes is not always effective. To overcome these challenges, it is necessary to re-evaluate all the steps in both the ‘design’ and ‘make’ process, and refactor them with robotics in mind. By integrating synthesis planning with design, both tailored to an automated synthesis platform, the entire DMTA cycle can be optimised. Currently, automated synthesis platforms are applied in three primary ways:

Library design

This approach focuses on synthesising large numbers of molecules using the same reaction framework but with slight variations in the building blocks. Typically employed in a combinatorial fashion, the repetitive nature of the setup makes it well-suited for automation. Although AI may not always be required, this method excels at producing simple molecules through straightforward reaction steps, often applied to build screening libraries or test one series at a time.

Reaction optimisation

In this method, the same building blocks are used repeatedly, while the reaction conditions are adjusted to optimise outcomes. Although yield is often the primary goal, other factors such as reducing impurities, improving purification or enhancing safety may also be considered. Automation and robotics can streamline this process, especially when scaling up the reaction for larger batches.

Reaction screening

This can be approached with or without AI. Scientists can manually design the screening profile or use ML models to predict reaction outcomes, which helps to efficiently identify optimal conditions. Robotics, when applied, accelerates the process, especially when handling more complex molecules or when focusing on one reaction step at a time for specific projects.

Automation for general drug design is more complex as it must accommodate a variety of reactions, exit vectors and building blocks. Platforms capable of handling this complexity need to leverage the full power of AI to manage multiple projects and synthesise complex molecules through multi-step processes as efficiently as possible. Successful implementation of this approach relies on the integration of molecule generation, retrosynthesis planning and reaction condition optimisation into a cohesive, autonomous drug design platform that empowers human experts to drive drug discovery programmes without the need to micromanage each aspect of the process.

The future of genAI and automation: towards an autonomous platform for general drug design

Ideally, design, synthesis and robotic workflow planning should occur in parallel, or iteratively, to provide continuous feedback throughout an autonomous process. By integrating these elements, the efficiency of the ‘make’ step can be determined by the principles employed during the ‘design’ step – ensuring high-quality compounds, while greatly improving the overall effectiveness of the DMTA process. Looking ahead, the development of a fully autonomous platform for general drug design will rely on several key components:

Integration of design and synthesis

When designing molecules, it is essential to consider synthetic constraints from the outset. Even small modifications to a target molecule could make it significantly easier to synthesise – enabling faster testing of hypotheses and accelerating the discovery process – without having to compromise on the quality of a chemical series to meet project requirements.

Integration with compound ordering and inventory management

A seamless connection between the platform, ordering systems and laboratory information management systems (LIMS) is crucial. By integrating these systems with design and synthesis planning tools, the platform can prioritise molecules that utilise in-house or easily sourced building blocks, while managing the scheduling and ordering of materials that may take longer to obtain.
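As a rough illustration of this kind of prioritisation, the Python sketch below ranks planned routes by when their slowest building block would arrive and lists what needs to be ordered. The inventory snapshot, the building block names, the lead times and the `UNKNOWN_LEAD_TIME` penalty are all hypothetical; a real system would query the ordering and inventory systems instead of a hard-coded dictionary.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Route:
    molecule: str
    building_blocks: Tuple[str, ...]


# Hypothetical inventory snapshot: building block -> lead time in days
# (0 means the material is already in house).
LEAD_TIME_DAYS = {
    "bb_amine_1": 0,
    "bb_acid_2": 0,
    "bb_boronate_3": 5,
    "bb_halide_4": 21,
}
UNKNOWN_LEAD_TIME = 60  # assumed penalty when no supplier is known


def lead_time(block: str) -> int:
    return LEAD_TIME_DAYS.get(block, UNKNOWN_LEAD_TIME)


def ready_in_days(route: Route) -> int:
    """A route can start only once its slowest building block has arrived."""
    return max(lead_time(bb) for bb in route.building_blocks)


def plan(routes):
    """Return routes ordered by earliest start, plus the blocks to order."""
    ordered = sorted(routes, key=ready_in_days)
    to_order = sorted(
        {bb for r in routes for bb in r.building_blocks if lead_time(bb) > 0}
    )
    return ordered, to_order


routes = [
    Route("mol_X", ("bb_amine_1", "bb_halide_4")),
    Route("mol_Y", ("bb_amine_1", "bb_acid_2")),
    Route("mol_Z", ("bb_acid_2", "bb_boronate_3")),
]

ordered, to_order = plan(routes)
print([r.molecule for r in ordered])
print(to_order)
```

Here `mol_Y` can be started immediately from stock, while the materials for `mol_Z` and `mol_X` are ordered in the background and those molecules are scheduled for later.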

Generalised automated synthesis capabilities

Once the design and synthetic pathway are planned, the laboratory procedures should be readily transferable to a robotic platform. This includes transforming procedures from patents or literature into machine-readable instructions.
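One way to picture machine-readable instructions is as a list of typed steps that can be checked against what a given robot supports. The sketch below is hypothetical throughout: the `Step` structure, the action names, the reagents and the `SUPPORTED_ACTIONS` set are invented for illustration, and it does not attempt the harder problem of extracting such steps from patent or literature text.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Step:
    action: str                               # e.g. "add", "stir", "quench"
    params: Dict[str, object] = field(default_factory=dict)


# Hypothetical set of operations the robotic platform can execute.
SUPPORTED_ACTIONS = {"add", "stir", "heat", "quench", "filter"}


def unsupported_steps(procedure: List[Step]) -> List[str]:
    """List the actions a chemist would still need to perform or redesign."""
    return [s.action for s in procedure if s.action not in SUPPORTED_ACTIONS]


# A hand-encoded, illustrative procedure (not taken from any real protocol).
procedure = [
    Step("add", {"reagent": "bb_amine_1", "equiv": 1.0}),
    Step("add", {"reagent": "bb_acid_2", "equiv": 1.2}),
    Step("stir", {"minutes": 120, "temp_c": 25}),
    Step("quench", {"reagent": "water"}),
    Step("extract", {"solvent": "ethyl acetate"}),
]

gaps = unsupported_steps(procedure)
print(gaps)
```

A check of this kind flags, before anything is run, which parts of a literature procedure would need to be adapted to the platform, in this example the extraction step.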

Integration of purification and biological testing

Purification, analysis and biological testing must be key components of automated platforms. This integration ensures that critical data points are fed back into the system quickly, allowing for continuous refinement of the design cycles and fostering a truly autonomous drug design ecosystem.

Ultimately, such a platform will give scientists – medicinal and computational chemists and biologists – oversight and high-level control of the drug discovery process. Combined with their expertise, these new technologies will enable them to run more programmes in parallel and generate higher volumes of relevant data quickly, resulting in high-quality compounds designed and made in shorter time frames. With the ability to probe more fundamental questions, there is tremendous potential to achieve unprecedented breakthroughs in addressing critical human health needs with novel, AI-first medicines.


Yann Gaston-Mathé, co-founder and chief executive officer of Iktos, is a seasoned R&D professional, strategy consultant and biotech entrepreneur with over 20 years of experience in pharma (Servier, Ipsen), molecular diagnostics (IntegraGen) and strategy consulting (Capgemini Consulting, BearingPoint, Cepton). He is also a skilled data scientist with several patents and publications in the fields of biostatistics and biomarker discovery and validation.