Computer-aided design: progress towards more rational drug discovery
Drug discovery is a science, a craft and an art, and often it requires a fair amount of serendipity or plain luck to succeed. Success is rare and costly, as evidenced by numerous reports and reviews (1,2). In its quest for success, industry nowadays relies more than ever on modern computational tools in all stages of the process.
Traditionally, Computer-Aided Drug Design (CADD) was applied almost exclusively during compound progression, guiding chemistry efforts with structure-based and pharmacophore techniques and quantitative structure-activity relationships (QSARs). In recent years, however, it has considerably extended its range of applications both upstream and downstream in the pipeline, and it has started to live up to its name.
In its early years, the focus was on improving potency and on designing better inhibitors, agonists or antagonists, with little, if any, attention paid to other important drug properties like bioavailability and lack of toxicity; its task was binder design rather than drug design. Throughput was low, as computers were much slower than today and many tasks still had to be performed manually. In spite of the significant investment required for a specialised group, including hardware, software and staff, more and more companies employed CADD, but the impact of this support on projects was variable and often did not live up to the very high expectations.
Its capability to rationalise and explain experimental findings was uncontested, but correct a priori predictions remained the exception rather than the rule, not least because the confidence of medicinal chemists in these methods was not always sufficient for them to set out and actually prepare synthetically challenging compounds. The rationally designed neuraminidase inhibitor zanamivir (Relenza) (3) remained the community’s rather lonely flagship example for quite some time.
Moving upstream: virtual screening
The advent of high-throughput screening (HTS) and combinatorial chemistry brought a paradigm change to pharma: it was now possible to synthesise and test far more compounds than ever before. Equally unprecedented were the amounts of data that needed to be stored, managed and analysed (‘mined’) and chem(o)informatics emerged and thrived at the meeting grounds of CADD, database design, statistics and computer science. It was the time of ‘the bigger, the better’ in screening campaigns, but it soon became apparent that large numbers are not a guarantee for success.
Many reasons contributed to disappointing results, where either no (novel) positives could be identified, or only compounds that failed in validation, were chemically unattractive or subsequently proved to be not optimisable. Problems with solubility, stability and detection resulted in large numbers of false positives (and probably negatives!). The screening of huge combinatorial libraries was even more disillusioning: sometimes almost entire libraries had the desired effect on the target, indicating a waste of resources in synthesis and screening, but too often no hits could be identified at all.
In addition, early libraries, already confined by ‘easy’ chemistry, suffered from a lack of purity and definition as well as from compounds that were too hydrophobic and too rich in features (4), in other words: too large and too complex. In an attempt to improve hit rates, computational pre-screening steps were introduced to select compounds for HTS, ideally even before their synthesis or purchase. Lipinski’s ‘rule of five’ (5), originally derived with a completely different goal in mind, along with new descriptors like PSA, the polar surface area, became very popular with combinatorial and medicinal chemists, because they are relatively simple and easy to interpret.
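As an illustration of how simple such a pre-screening filter can be, the following minimal sketch counts ‘rule of five’ violations from descriptor values that are assumed to have been calculated elsewhere. The class and function names are hypothetical, and the default of tolerating one violation is a common convention rather than anything prescribed in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed properties of one compound (supplied by the caller)."""
    mol_weight: float       # molecular weight in Da
    logp: float             # calculated octanol/water partition coefficient
    h_bond_donors: int      # number of OH and NH groups
    h_bond_acceptors: int   # number of N and O atoms


def rule_of_five_violations(d: Descriptors) -> int:
    """Count how many of the four 'rule of five' limits are exceeded."""
    return sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ])


def passes_rule_of_five(d: Descriptors, max_violations: int = 1) -> bool:
    """Assumed convention: poor absorption is flagged at two or more violations."""
    return rule_of_five_violations(d) <= max_violations


if __name__ == "__main__":
    # Hypothetical descriptor values, for illustration only.
    small = Descriptors(mol_weight=350.0, logp=2.5, h_bond_donors=2, h_bond_acceptors=5)
    large = Descriptors(mol_weight=650.0, logp=6.2, h_bond_donors=2, h_bond_acceptors=5)
    print(passes_rule_of_five(small), passes_rule_of_five(large))
```

A filter of this kind only flags compounds; whether flagged compounds are actually removed from a screening set remains a project decision.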
The analysis of databases of drugs and drug candidates like CMC (6), MDDR (7), WDI (8) or others yielded ranges for many more molecular descriptors previously employed in QSAR studies, defining a ‘drug-like property’ (9) profile for interesting compounds. Taking into account that molecular weight and lipophilicity tend to increase during optimisation programmes, more restricted ranges for desirable ‘lead like’ compounds were proposed as an extension to this scheme (10).
Other groups (11) argued that fitting into an appropriate property profile was not enough; they went a step further and developed complex models that are able to differentiate between ‘drug like’ compounds, ie compounds similar to the ones found in the respective databases, and ‘non-drugs’, typically taken from cleaned vendor catalogues. More recently it has been suggested that different target classes can display different requirements and that these ‘rules’ should be redefined, based on the target under consideration (12).
Other classes of compounds that were considered ‘nuisance’ or ‘swill’ (13) and often excluded from screening by computational filters were those which carried potentially reactive moieties (14) (like strong Michael acceptors) or perceived toxic liabilities (eg aromatic nitro groups) and ‘promiscuous’ compounds (15) (aka ‘frequent hitters’) that showed unspecific activity against a large number of unrelated targets.
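The idea of a ‘frequent hitter’ can be made concrete with a purely data-driven flag: count, for each compound, the fraction of unrelated assays in which it was reported active. The sketch below is only an assumption-laden illustration and not the method of the cited studies (15), which rely on trained models or on mechanistic tests for aggregation; the function name and the 30% threshold are hypothetical.

```python
from collections import Counter
from typing import Iterable, Set, Tuple


def frequent_hitters(
    hits: Iterable[Tuple[str, str]],
    n_assays: int,
    max_hit_fraction: float = 0.3,
) -> Set[str]:
    """Return compounds active in more than max_hit_fraction of the assays.

    hits: (compound_id, assay_id) pairs, one per reported active result.
    n_assays: number of unrelated assays each compound was tested in.
    """
    if n_assays <= 0:
        raise ValueError("n_assays must be positive")
    unique_hits = set(hits)  # ignore duplicate reports of the same result
    per_compound = Counter(compound for compound, _ in unique_hits)
    return {
        compound
        for compound, count in per_compound.items()
        if count / n_assays > max_hit_fraction
    }


if __name__ == "__main__":
    # Hypothetical screening history over four unrelated targets.
    history = [("A", "t1"), ("A", "t2"), ("A", "t3"), ("B", "t1"), ("B", "t1")]
    print(frequent_hitters(history, n_assays=4))
```

Such a flag needs a reasonably large screening history to be meaningful and cannot say anything about compounds that have never been tested.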
A different approach to reducing the number of compounds to purchase, synthesise and ultimately screen has its roots in the basic dogma of medicinal chemistry, namely that similar compounds have similar effects. If close analogues are likely to be either all active (albeit to a varying degree) or all inactive, then screening compounds that are too similar results in redundancy. To avoid this, schemes based on dissimilarity (‘diversity’) were proposed to select only a small number (one?) of representatives from larger subsets of similar compounds (16).
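One common dissimilarity-based scheme is greedy MaxMin selection, sketched below under simplifying assumptions: each compound is represented by a set of integer feature identifiers (a stand-in for a real fingerprint), and dissimilarity is one minus the Tanimoto coefficient. This is not necessarily the procedure used in the studies cited above; all names are hypothetical.

```python
from typing import Dict, FrozenSet, List


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto similarity of two feature sets (1.0 if both are empty)."""
    if not a and not b:
        return 1.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def maxmin_pick(fps: Dict[str, FrozenSet[int]], n_pick: int, seed_id: str) -> List[str]:
    """Greedily pick compounds that are maximally dissimilar to those already picked."""
    if n_pick <= 0:
        return []
    picked = [seed_id]
    # Distance of every remaining compound to its nearest picked compound.
    min_dist = {
        cid: 1.0 - tanimoto(fp, fps[seed_id])
        for cid, fp in fps.items()
        if cid != seed_id
    }
    while min_dist and len(picked) < n_pick:
        # Ties are broken alphabetically so the result is reproducible.
        nxt = max(sorted(min_dist), key=min_dist.get)
        picked.append(nxt)
        del min_dist[nxt]
        for cid in min_dist:
            dist = 1.0 - tanimoto(fps[cid], fps[nxt])
            if dist < min_dist[cid]:
                min_dist[cid] = dist
    return picked


if __name__ == "__main__":
    # Two hypothetical series, each with a close analogue.
    library = {
        "a": frozenset({1, 2, 3}),
        "a2": frozenset({1, 2, 3, 4}),
        "b": frozenset({7, 8, 9}),
        "b2": frozenset({7, 8, 9, 10}),
    }
    print(maxmin_pick(library, n_pick=2, seed_id="a"))
```

With two picks the selection takes one representative from each series, which is exactly the behaviour the diversity argument asks for.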
This methodology neglects, however, that the similarity dogma only holds in general, but that even minor modifications in critical, series-dependent positions can result in a complete loss of activity (17). Selecting only one or few of these apparently similar compounds by chance is therefore risky, even more so if one considers the incidence of false negatives. Nilakantan and Nunn have estimated that it takes of the order of 100 analogues around the same scaffold to be reasonably certain to recognise this series, if it indeed confers activity (18).
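Why a single representative is risky can be seen from a deliberately simple model, which is an assumption made here for illustration and not the analysis of Nilakantan and Nunn: if each analogue of a series is independently active with some probability, and each true active is missed with some false-negative rate, the chance of seeing at least one hit grows only gradually with the number of analogues screened. The parameter values are hypothetical.

```python
def prob_series_detected(
    n_analogues: int,
    active_fraction: float,
    false_negative_rate: float,
) -> float:
    """Probability that at least one analogue of a series is seen as a hit.

    Assumes independent analogues: each is truly active with probability
    active_fraction, and a true active is missed with probability
    false_negative_rate.
    """
    p_hit = active_fraction * (1.0 - false_negative_rate)
    return 1.0 - (1.0 - p_hit) ** n_analogues


if __name__ == "__main__":
    # Hypothetical values: 5% of analogues active, 20% of actives missed.
    for n in (1, 10, 50, 100):
        print(n, round(prob_series_detected(n, 0.05, 0.2), 3))
```

Under these assumed values a single representative would reveal the series only a few percent of the time, whereas several tens of analogues are needed before detection becomes likely.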
QSAR methods were not the only ones applied, however; docking and pharmacophore techniques have also changed considerably over the years in order to face the new challenges. The ever-increasing speed of computers made it possible to automate large parts of the respective workflows and to achieve the throughput necessary to process millions of compounds and even larger virtual libraries.
The two methodologies are complementary in their requirements: pharmacophores require known actives, eg natural substrates or published compounds, whereas docking needs the three-dimensional structure of the target. The number of targets for which both can be used is rapidly expanding, however, due to the explosive increase in structures from both x-ray and NMR, the improved accuracy of homology modelling and the possibility to project pharmacophore points from the known structure of a target for which no actives are known.
To exploit the respective strengths of both approaches, hybrid methods are being developed where pharmacophores guide the docking towards relevant poses and ensure that critical interactions are respected. Probably the biggest remaining challenge in the field is to move away from static models and to consider target flexibility in a suitable manner.
Moving downstream: predictive ADMETox
In spite of new technologies, streamlined processes and increased R&D expenditures, however, output as measured in new drugs entering the market remained stagnant (2). A number of detailed analyses of the reasons why compounds fail in later phases of drug discovery, when already large sums of money have been invested in them, highlighted the importance of successful candidates having an appropriate pharmacokinetic profile and no toxic liabilities (19).
This issue was addressed experimentally by applying relevant tests (eg permeation of Caco2 cell layers) earlier than before in the progression pathway, by adapting them to high-throughput conditions and by developing surrogate screens (eg PAMPA). On the computational side, work on robust global models for properties other than logP and maybe solubility was in the past hampered by the lack of reliable data sets of sufficient size and diversity. But the increase in screening (and publishing) finally provided a basis for the development of predictive models for many key ADME parameters, in spite of complications like different protocols and large intra-laboratory variation for some endpoints.
The potential savings associated with the early identification of possible deficiencies are considerable and it does not come as a surprise that the area has not only attracted academic interest, but also the classical providers of modelling and QSAR software and companies from neighbouring fields.
The requirements of HTS and CombiChem, the expressed preference for oral dosage forms and, of course, the success of Lipinski’s ‘rule of five’ (5) explain the dominance of models for solubility and absorption, but numerous commercial packages are now available, catering for the full range of properties of interest (20), up to sophisticated systems for mechanism based simulations of pharmacokinetics and -dynamics in humans. But in spite of all the efforts and the significant progress made in recent years, none of these methods can currently claim to replace the critical experiments on real compounds. This is especially true for the prediction of toxicity (21).
Back to the roots
The discovery of small-molecule drugs still begins with the identification of suitable hits in screening and developments in this domain will unfailingly involve CADD eventually. In their attempts to find smaller compounds as better starting points for their work, chemists have readily taken on the concept of ligand efficiency in analysing HTS data (22), but the actual screening of ever-smaller entities was hindered by the fact that the activities of such fragments are often below the detection or operational limits of classical biochemical assays.
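Ligand efficiency, as introduced in the literature cited above (22), relates the free energy of binding to the number of non-hydrogen atoms. The sketch below assumes a dissociation constant in mol/l (in practice an IC50 is often substituted) and a temperature of 300 K; the function name and the example values are hypothetical.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal / (mol K)


def ligand_efficiency(kd_molar: float, heavy_atoms: int, temperature_k: float = 300.0) -> float:
    """Binding free energy per non-hydrogen atom, in kcal/mol per atom.

    Uses dG = R * T * ln(Kd), so tighter binders give a more negative dG
    and therefore a larger (positive) ligand efficiency.
    """
    if kd_molar <= 0 or heavy_atoms <= 0:
        raise ValueError("kd_molar and heavy_atoms must be positive")
    delta_g = GAS_CONSTANT_KCAL * temperature_k * math.log(kd_molar)
    return -delta_g / heavy_atoms


if __name__ == "__main__":
    # Hypothetical comparison: a 10 nM HTS hit with 38 heavy atoms
    # versus a 1 mM fragment with 12 heavy atoms.
    print(round(ligand_efficiency(1e-8, 38), 2))
    print(round(ligand_efficiency(1e-3, 12), 2))
```

In this assumed comparison the weakly binding fragment is the more efficient binder per atom, which is the argument for starting from smaller compounds despite their low absolute activity.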
Other technologies, like x-ray structure determination, NMR or plasmon resonance detection do not suffer from this disadvantage to the same degree and are increasingly employed (23). If these approaches prove successful, they will undoubtedly result in new tasks for the computational chemists: the chemical space around smaller compounds is larger and to optimise them efficiently, structure-based and de novo design strategies will need to be updated, adapted and improved.
New developments in other fields, too, will certainly impact the way we will design drugs in the future, but how is more difficult to envision. Chemical genetics/genomics, for example, can potentially provide active hits for which the molecular target is not known; optimising them while simultaneously trying to identify their target(s) may well become one of the new big challenges for CADD. But this is not a new situation: so far, CADD has always been able to react and follow the changes in the drug discovery environment and, in doing so, it grew into an integrated and essential player, a constant companion. In fact, many tools and methods that were previously only used by CADD experts have become standard applications, are now routinely used by combinatorial and medicinal chemists and have changed their daily work.
Drug discovery still is a science, a craft and an art, but Computer-Aided Drug Design will continue to play a major role in the attempts to make it more rational and successful in the future.
This article originally featured in the DDW Summer 2005 Issue
Dr Wolfgang Sauer studied chemistry at the Universities of Erlangen and Strathclyde (Glasgow); he received his Dr.rer.nat. from the Computer Chemistry Center in Erlangen in 1994. After five years with Oxford Molecular (meanwhile Accelrys) he joined the Serono Pharmaceutical Research Institute close to Geneva, where he is responsible for computational chemistry and the CADD support of the medicinal and combinatorial chemists.
1 eg Mullin, R. Drug Development Costs About $1.7 Billion. Chem. & Eng. News 81(50) 8 (2003); Bains, W. Failure Rates in Drug Discovery. Drug Disc. World 5(4) 9-18 (2004).
2 Schmid, EF, Smith, DA. Is pharmaceutical R&D just a game of chance or can strategy make a difference? Drug Disc. Today 9(1) 18-26 (2004).
3 von Itzstein, M et al. Rational design of potent sialidase-based inhibitors of influenza virus replication. Nature 363(6428) 418-423 (1993).
4 eg Carr, R, Hann, M. The right road to drug discovery? Fragment-based screening casts doubt on the Lipinski route. Modern Drug Discov. 5(4) 45-48 (2002).
5 Lipinski, CA et al. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv. Drug Del. Rev. 23(1) 3-25 (1997) and 46(1) 3-26 (2001).
6 Derived from the Drug Compendium in Pergamon’s Comprehensive Medicinal Chemistry, available from ElsevierMDL, http://www.mdli.com.
7 MDL Drug Data Report, available from ElsevierMDL, http://www.mdli.com, developed in cooperation with Prous Science Publishers, http://www.prous.com.
8 World Drug Index, available from Thomson Derwent, http://thomsonderwent.com.
9 eg Viswanadhan, VN et al. Knowledge-based approaches in the design and selection of compound libraries for drug discovery. Curr. Op. Drug Disc. & Dev. 5(3) 400-406 (2002).
10 eg Hann, MM, Oprea, TI. Pursuing the leadlikeness concept in pharmaceutical research. Current Opinion in Chemical Biology 8(3) 255-263 (2004).
11 Ajay et al. Can We Learn To Distinguish between ‘Drug-like’ and ‘Nondrug-like’ Molecules? J. Med. Chem. 41(18) 3314-3324 (1998); Sadowski, J, Kubinyi, H. A Scoring Scheme for Discriminating between Drugs and Nondrugs. J. Med. Chem. 41(18) 3325-3329 (1998).
12 eg Mason, JS. What are the developments in the drug industry? Presentation at the SMI conference ‘Drug Design’, 23 Feb. 2004, London.
13 Walters, WP, Stahl, MT, Murcko, MA. Virtual Screenings – an overview. Drug Disc. Today 3(4) 160-178 (1998); Charifson, PS, Walters, WP. Filtering databases and chemical libraries. J. Comput. Aided Mol. Des. 16() 311-323 (2002).
14 Rishton, GM. Nonleadlikeness and leadlikeness in biochemical screening. Drug Disc. Today 8(2) 86-96 (2003).
15 Roche, O et al. Development of a Virtual Screening Method for Identification of ‘Frequent Hitters’ in Compound Libraries. J. Med. Chem. 45(1) 137-142 (2002); McGovern, SL et al. A Common Mechanism Underlying Promiscuous Inhibitors from Virtual and High-Throughput Screening. J. Med. Chem. 45(8) 1712-1722 (2002); McGovern, SL et al. A Specific Mechanism of Nonspecific Inhibition. J. Med. Chem. 46(20) 4265-4272 (2003); Seidler, J et al. Identification and Prediction of Promiscuous Aggregating Inhibitors among Known Drugs. J. Med. Chem. 46(21) 4477-4486 (2003).
16 Matter, H. Selecting Optimally Diverse Compounds from Structure Databases: a Validation Study of Two-Dimensional and Three-Dimensional Molecular Descriptors. J. Med. Chem. 40(8) 1219-1229 (1997); Brown, RD, Martin, YC. An Evaluation of Structural Descriptors and Clustering Methods for Use in Diversity Selection. SAR QSAR Environ. Res. 23-39 (1998); Martin, YC, Kofron, JL, Traphagen, LM. J. Med. Chem. 45(19) 4350-4358 (2002).
17 Kubinyi, H. Similarity and Dissimilarity – a Medicinal Chemists View. Perspect. Drug Disc. Des. 11 225-252 (1998); Martin, YC. Diverse Viewpoints on Computational Aspects of Molecular Diversity. J. Combin. Chem. 3(3) 231-250 (2001).
18 Nilakantan, R, Nunn, DS. A fresh look at pharmaceutical screening library design. Drug Disc. Today 8(15) 668-672 (2003).
19 Kennedy, T. Managing the Drug Discovery/Development Interface. Drug Disc. Today 2(10) 436-444 (1997).
20 Clark, DE. Computational Prediction of ADMET Properties: Recent Developments and Future Challenges. Ann. Rep. Compu. Chem. 1 133-151 (2005).
21 Greene, N. Computer systems for the prediction of toxicity: an update. Adv. Drug Deliv. Rev. 54(3) 417-431 (2002); Egan, WJ, Zlokarnik, G, Grootenhuis, PDJ. In silico prediction of drug safety: despite progress there is abundant room for improvement. Drug Disc. Today: Technologies 1(4) 381-387 (2004).
22 Hopkins, AL, Groom, CR, Alex, A. Ligand efficiency: a useful metric for lead selection. Drug Disc. Today 9(10) 430-431 (2004); Rees, DC et al. Fragment-Based Lead Discovery. Nature Rev. Drug Disc. 3(8) 660-672 (2004); Abad-Zapatero, C, Metz, JT. Ligand efficiency indices as guideposts for drug discovery. Drug Disc. Today 10(7) 464-469 (2005).
23 Lundqvist, T. The devil is still in the details – driving early drug discovery forward with biophysical experimental methods. Cur. Op. Drug Disc. Dev. 8(4) 513-519 (2005).