Exploring the horizons of small molecule drug discovery: the evolution and application of the ideal fragment library
With the pharmaceutical industry facing unprecedented challenges in small molecule drug discovery, this paper argues that, with the correct design of the fragment library, Fragment-Based Drug Discovery has emerged as a complementary strategy to High Throughput Screening.
Over the last decade, fragment screening has emerged as a complementary strategy to high throughput screening (HTS) and Fragment Based Drug Discovery (FBDD) has gained wide acceptance within the pharmaceutical and biotechnology sectors, with a number of fragments progressed into lead series and on to clinical candidates (1-4).
Initial hits are identified by screening small libraries, typically of 500-2,000 fragments of low molecular weight (MW 100-300Da), against a target. Fragments bind with low affinity (KD values in the high micromolar to millimolar range) and are, therefore, screened at high concentration using biophysical techniques such as NMR, x-ray crystallography and surface plasmon resonance (SPR).
Historically, HTS libraries have been populated by ‘drug-like’ molecules, mostly chosen to comply with Lipinski’s rule of five (5,6). However, these molecules tend to be large and lipophilic and thus are difficult to develop into potent compounds without reducing their ‘drug-likeness’.
Poor absorption, distribution and metabolism characteristics caused by high molecular weight and lipophilicity are major reasons for attrition of lead candidates. Hit rates from HTS screens are invariably low, as complex screening compounds form mismatches with receptors due to suboptimal interactions or steric clashes. Fragments are simpler and smaller so are more likely to fit into the binding site without these unfavourable interactions (7). Fragments typically bind with lower affinity to the target sites than larger drug-like molecules that can form many more interactions, but the binding efficiency per atom is as high or higher.
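The 'binding efficiency per atom' referred to above is commonly expressed as ligand efficiency: the free energy of binding divided by the number of non-hydrogen atoms. The following is a minimal sketch of that calculation; the two compounds and their KD values are hypothetical and chosen only to illustrate the comparison.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def ligand_efficiency(kd_molar, heavy_atoms, temperature_k=298.15):
    """Return ligand efficiency in kcal/mol per heavy atom.

    LE = -deltaG / N_heavy, with deltaG = R * T * ln(KD).
    """
    if kd_molar <= 0 or heavy_atoms <= 0:
        raise ValueError("KD and heavy atom count must be positive")
    delta_g = R_KCAL * temperature_k * math.log(kd_molar)
    return -delta_g / heavy_atoms


# Hypothetical values for illustration only.
fragment_le = ligand_efficiency(kd_molar=500e-6, heavy_atoms=12)  # weak fragment
lead_le = ligand_efficiency(kd_molar=10e-9, heavy_atoms=36)       # potent lead
print(f"fragment LE = {fragment_le:.2f}, lead LE = {lead_le:.2f}")
```

In this illustration the weakly binding fragment has the higher efficiency per atom even though its affinity is more than four orders of magnitude lower.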
Fragment hits can then be readily optimised into potent leads by synthesising larger compounds that pick up additional target-ligand interactions resulting in improved affinity for the target, while still maintaining ‘drug-likeness’. Astex Pharmaceuticals reports that it is nearly always possible to obtain nM lead compounds through the synthesis of 20-100 analogues starting from the fragment hit for a wide range of target classes (8).
What makes a library ideal?
The design of a fragment library is critical to ensure that high quality hits are obtained and there have been a number of discussions in recent literature over what constitutes the ideal fragment library (2,4,9-13). In this paper, we will discuss the main considerations, including the physico-chemical properties of fragments, removal of unwanted chemical functionalities, medicinal chemistry tractability, overall diversity and the size of the library, with particular focus on the importance of aqueous solubility to the ultimate success of the fragment library screen.
While the exact methodology used to design proprietary fragment libraries varies, the steps followed are similar. We will illustrate this with the design of the Maybridge Ro3 Library, summarised as a flowchart in Figure 1.
The starting point for fragment library design is a pool of available compounds, obtained from in-house collections and commercial sources. In order to computationally evaluate these compounds, files containing 2D connectivity information are generated (eg SD files or SMILES strings) and the set of molecules is then ‘filtered’ against a number of physico-chemical properties such as MW, logP, predicted solubility and flexibility.
Scientists at Astex analysed a diverse set of fragment hits that were identified against a variety of targets and concluded that the hits obeyed, on average, the ‘Rule of 3’ where MW was <300Da, cLogP was ≤3, and the numbers of hydrogen bond donors, hydrogen bond acceptors and rotatable bonds were all ≤3. In addition, a polar surface area of ≤60 Å² (14) was considered important for good cell permeability. These parameters have been widely accepted as providing a starting point for identifying ‘ideal’ fragments and tailored variants of them are used as filters for most fragment libraries.
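As a minimal sketch, the ‘Rule of 3’ thresholds quoted above can be applied as a simple filter over pre-computed descriptors. The Fragment record, the compound names and all descriptor values below are hypothetical; in practice the descriptors would come from a cheminformatics toolkit.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    name: str
    mw: float        # molecular weight, Da
    clogp: float     # calculated logP
    hbd: int         # hydrogen bond donors
    hba: int         # hydrogen bond acceptors
    rot_bonds: int   # rotatable bonds
    psa: float       # polar surface area, square angstroms


def passes_rule_of_three(f, use_psa=True):
    """Apply the thresholds from the text: MW < 300, all others <= 3, PSA <= 60."""
    ok = (f.mw < 300 and f.clogp <= 3 and f.hbd <= 3
          and f.hba <= 3 and f.rot_bonds <= 3)
    if use_psa:
        ok = ok and f.psa <= 60
    return ok


# Hypothetical descriptor values for illustration only.
pool = [
    Fragment("frag_a", 162.2, 1.4, 1, 2, 1, 41.5),
    Fragment("frag_b", 312.4, 3.6, 2, 5, 6, 88.0),
    Fragment("frag_c", 221.3, 2.1, 1, 3, 2, 71.0),
]
kept = [f.name for f in pool if passes_rule_of_three(f)]
print(kept)
```

The use_psa switch reflects the fact that libraries apply tailored variants of these limits; here frag_c is rejected only because of its polar surface area.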
A nearly linear relationship between molecular weight and binding efficiency was observed by Hajduk, who retrospectively deconstructed 18 highly optimised inhibitors until the minimal binding elements could be identified (15). He elegantly showed that, to obtain nM potency with a lead candidate which rigorously obeys the rule of five, the initial fragment must have a MW of <250Da unless its potency is better than about 30μM.
This fact, plus the higher probability of finding hits with smaller fragments, means that in many fragment libraries the upper MW limit is kept below 250Da. A lower MW limit of about 100Da is often applied, as crystal soaking experiments have shown that compounds with a MW <100Da will, at high concentrations, bind to most active sites (16). Other libraries apply a lower limit of 150Da, as smaller fragments have a greater tendency to bind in multiple orientations (17).
An alternative approach is reported by GSK, which has developed a fragment set for ‘reduced complexity screening’ with fragments incorporating <22 heavy atoms (18). This method ensures compounds containing heavy atoms such as bromine, which are useful for further synthetic manipulation, are not excluded.
Solubility prediction models may be used to remove compounds with predicted poor aqueous solubility (19-21). Vernalis reports that 88% of fragments in its first SeeDs library were correctly predicted to have a solubility ≥2mM using a linear model validated for small drug-like compounds (22). Molecules containing undesirable functionality from a medicinal chemistry point of view, such as reactive groups and known toxic motifs, are filtered out, while fragments containing chemical functionality which allows rapid chemical evolution and optimisation of the fragment hits are positively selected.
At this point in the library design, specific groups of compounds may be included, such as those incorporating privileged structures of known drug compounds (23), scaffolds found in natural products (24) or target focused fragments, derived from pharmacophore screens (25).
Size and diversity
The next step in the design process is to reduce the number of members to a workable library size while maintaining maximum diversity. There is some debate about the optimum number of fragments that should be included in a library. A central concept behind fragment screening is that a small number of fragments can probe a much greater proportion of available ‘chemical space’ than an HTS screen of larger molecules (26).
A recent analysis indicates that screening a 1,000-member fragment library which averages 14 heavy atoms (MW 190) is equivalent to screening a library of more than 10¹⁸ molecules that averages 32 heavy atoms (MW 450) (27). This is exemplified by Novartis, which observes that hit rates from its NMR-based fragment screens are 10-1,000 times higher than for HTS assays (28), and by Vernalis, which reports hit rates ranging from around 0.5% for challenging protein-protein interaction targets to 7% for kinases from a fragment library of around 1,200 fragments (27,29).
To some extent, the choice of library size depends on the assay: x-ray crystallography is a relatively low throughput technique and libraries of <1,000 are typically employed (30); NMR screens can accommodate 1,000-3,000 member libraries; higher throughput is obtained with SPR, where libraries of several thousand may be screened.
Chemical diversity is generally assessed by some type of clustering algorithm, with 2D fingerprints as molecular descriptors to compare similarity (31,32). For example, clustering for the Maybridge Ro3 library was accomplished with the dbclus algorithm (33), which uses standard Daylight fingerprints to identify dense clusters in a consistent and automated manner. Similarity within each cluster reflects the Tanimoto value used, and the cluster centroid is similar to every other molecule within the cluster.
AstraZeneca reports development of a similar in-house clustering method (13), while Vernalis uses 2D-pharmacophore graph triangle fingerprints in the second and subsequent iterations of its SeeDs library (29). To achieve the required number of diverse molecules for a particular library, the Tanimoto coefficient may be tailored to give a suitable number of clusters and singletons. For the Maybridge Ro3 library of 1,500 compounds, a Tanimoto level of 0.66 was chosen, as this resulted in 819 clusters and 690 singletons from a starting pool of 8,000 pre-filtered structures.
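To make the roles of the Tanimoto threshold, centroids and singletons concrete, the following is a simplified sketch of centroid-based clustering. It is not the dbclus implementation: fingerprints are represented as plain sets of 'on' bit positions, and the six toy fingerprints are hypothetical. Only the 0.66 threshold is taken from the text.

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def cluster(fps, threshold):
    """Greedy centroid clustering: repeatedly pick the molecule with the most
    unassigned neighbours at or above the threshold and group them with it."""
    names = list(fps)
    neighbours = {
        n: {m for m in names if m != n and tanimoto(fps[n], fps[m]) >= threshold}
        for n in names
    }
    unassigned = set(names)
    clusters = []
    while unassigned:
        # Ties are broken alphabetically so the result is reproducible.
        centroid = max(sorted(unassigned),
                       key=lambda n: len(neighbours[n] & unassigned))
        members = (neighbours[centroid] & unassigned) | {centroid}
        clusters.append((centroid, sorted(members)))
        unassigned -= members
    return clusters


# Hypothetical toy fingerprints (sets of on-bit positions).
fps = {
    "a": {1, 2, 3, 4},
    "b": {1, 2, 3, 5},
    "c": {1, 2, 3, 4, 5},
    "d": {10, 11, 12},
    "e": {10, 11, 12, 13},
    "f": {20, 21},
}
result = cluster(fps, threshold=0.66)
singletons = [c for c, members in result if len(members) == 1]
print(result, singletons)
```

Raising the threshold splits the pool into more, tighter clusters and more singletons; lowering it merges them, which is how the level can be tuned towards a target library size.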
Quality control and solubility
Finally, quality control and solubility of the fragment library are important, both initially and on storage. Fragments are tested to confirm identity, purity (typically >95%) and solubility. If the compounds are stored in DMSO, regular examination, both visually checking for precipitation (13) and purity analysis, is essential. Regardless of the biophysical technique employed for screening, the fragments must be soluble in aqueous media at high enough concentrations for weak binding interactions to be measured.
Poor fragment solubility can also compromise the robustness of the screening data through aggregation and promiscuous inhibition (34). Although solubility is related to lipophilicity, other factors affecting solubility are difficult to model, therefore experimental determination of fragment solubility is critical.
Experimental solubility measurement
Practical methods for the determination of aqueous solubility of fragment molecules are not well described in the literature. A new, high-throughput solubility measurement protocol was developed for the Maybridge Ro3 libraries, using a Stem Clarity Solubility Station with IR transmission measurement giving a ‘soluble’ or ‘insoluble’ result for each fragment at 200mM in DMSO, at 5mM in aqueous pH 7.5 phosphate buffer (containing 2.5% DMSO) and at 1mM in aqueous buffer (containing 0.5% DMSO).
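The DMSO percentages quoted above are consistent with simple dilution of the 200mM DMSO stock into buffer. A quick check of that arithmetic, as a minimal sketch (the function name is illustrative and not part of the published protocol):

```python
def diluted_concentration_mm(stock_mm, dmso_percent):
    """Final fragment concentration (mM) when the DMSO stock makes up
    dmso_percent (v/v) of the aqueous sample."""
    if not 0 <= dmso_percent <= 100:
        raise ValueError("dmso_percent must be between 0 and 100")
    return stock_mm * dmso_percent / 100.0


STOCK_MM = 200.0  # stock concentration stated in the text
for pct in (2.5, 0.5):
    print(f"{pct}% DMSO -> {diluted_concentration_mm(STOCK_MM, pct):g} mM")
```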
The cut-off transmission value, below which a compound was deemed to be insoluble, was validated by visually examining the sample tubes in a 1,000-compound subset (35). A correlation of 96.9% was achieved between the transmission result and the visual result. The full set of 4,000 Ro3-compliant fragments, pre-filtered using the parameters detailed in Figure 1, was subjected to the solubility measurement protocol. The percentage of insoluble compounds found in each MW range at concentrations of 5mM and 1mM in aqueous buffer is shown in Figure 2.
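The validation step described above amounts to comparing a thresholded transmission reading with a visual call for each tube. A minimal sketch of that comparison follows; the cut-off of 70 and the five readings are hypothetical, since the actual cut-off value is not given here.

```python
def classify(transmission, cutoff):
    """Readings below the cut-off are deemed insoluble."""
    return "soluble" if transmission >= cutoff else "insoluble"


def concordance(transmissions, visual_calls, cutoff):
    """Fraction of samples where the instrument call matches the visual call."""
    if len(transmissions) != len(visual_calls) or not visual_calls:
        raise ValueError("need two non-empty lists of equal length")
    matches = sum(classify(t, cutoff) == v
                  for t, v in zip(transmissions, visual_calls))
    return matches / len(visual_calls)


# Hypothetical readings (percent transmission) and visual inspection results.
readings = [92.0, 88.5, 40.2, 75.0, 30.1]
visual = ["soluble", "soluble", "insoluble", "insoluble", "insoluble"]
agreement = concordance(readings, visual, cutoff=70.0)
print(f"agreement = {agreement:.1%}")
```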
There is a clear correlation between increasing MW and poor solubility and this data lends more credence to the view that ‘smaller is better’, when selecting fragments for inclusion in a library.
Assessing fragment binding using biosensor technology
Once developed, the quality of a fragment library can be put to the test by screening it against targets. Structural methods, such as NMR and crystallography, are commonly used to identify the relative positions of, and specific contacts between, a fragment and its target. While these methods can provide high-resolution detail about the binding interface, they require relatively large amounts of reagents, have limited throughput, and often do not provide insight into the strength of a binding interaction.
In contrast, label-free interaction analyses, such as surface plasmon resonance (SPR) biosensor technology, have rapidly evolved as the methods of choice for screening fragment libraries upstream of structural analysis (36,37).
Biosensor technology’s advantages include its low sample consumption (typically 1-5μg of target to create a reaction surface and <5μL of each fragment (prepared at 10mM)) and relatively high throughput. A significant benefit of biosensor technology is the ability to determine both affinity (KD, equilibrium dissociation constant) and specificity. Establishing that a fragment binds in a stoichiometric manner to the target is an unappreciated benefit of the methodology.
Historically, biosensor technology was used to characterise protein/protein and antibody/antigen interactions because it was thought that the technology lacked the sensitivity to be useful in small molecule analyses. Fortunately, over the past 10 to 15 years improvements in experimental design and data processing have been adopted throughout the user community to a point where small molecule analysis has become fairly routine. Additionally, advances in instrument hardware have improved both sensitivity and sampling throughput (38).
GE Healthcare’s Biacore 4000 platform can test four samples over four target surfaces at one time. Bio-Rad’s ProteOnXPR36 can measure six samples over six targets. ForteBio’s Octet384 can be configured with up to 16 sensor tips for higher parallel processing.
Even plate based label-free systems such as SRU Biosystems’ BIND technology are being effectively utilised to triage fragments prior to screening on a target to identify poorly behaved (eg ‘sticky’ or insoluble) fragments. And, towards the next step in increased throughput, ICx Nomadics’ SensiQ can automatically dilute analytes to test a gradient of concentrations for each fragment within a 96- or 384-well plate, thereby combining preliminary screening and follow-up affinity testing into one assay (39).
Today’s sensor technology is readily capable of screening libraries of several thousand compounds. In addition, most biosensor platforms have the capability to screen each fragment in parallel against multiple targets. This means one can select for fragments that bind only to the target of interest. Identifying these selective hits is essential for a successful fragment screen. Novice users are surprised to see how often small molecules bind indiscriminately to proteins when the compounds are assayed at high concentrations. Fortunately, biosensor technology can be used to identify the selective compounds.
Basic steps in biosensor-based fragment screening
Start with an active target
Since most biosensor technologies are essentially mass based, the binding of low-molecular-mass analytes inherently produces small changes in the biosensor binding response. The quality of the results is directly proportional to the quality of the starting material. Unlike enzymatic assays that can often be conducted on material with low specific activity, biosensor analysis requires high activity to begin with.
Immobilise the target and control protein on the sensor surface
Biosensor analyses require that the targets be tethered to the sensing surface.
For fragment screening, this is actually advantageous because the same sensor surface can be used to sample many fragments. Proteins can be immobilised using a variety of chemistries ranging from amine-, carboxyl-, and thiol-coupling to capturing methods including biotinylation, poly-His-fusions and GST-fusions.
The best practice is to try different coupling methods and select the one that retains the highest functional activity of the target and then mimic those conditions for the control protein. The control can be unrelated to the target or something more specific to the target class, depending on the goals of the fragment screen. However, a successful fragment screen requires equal attention be paid to the control protein as to the target.
As illustrated by the cartoon in Figure 3A, the control protein can even be an unrelated second target, which means the fragment library can simultaneously be screened against two targets.
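Because the response is essentially mass based, the largest signal a fragment can give on a surface is often estimated from the ratio of the molecular weights and the amount of target immobilised. The sketch below uses that widely used approximation; it assumes a fully active surface, and the molecular weights and immobilisation level are hypothetical.

```python
def theoretical_rmax(analyte_mw, target_mw, immobilised_ru, stoichiometry=1):
    """Estimate the maximum binding response (in response units) for an analyte.

    Rmax = immobilised level * (analyte MW / target MW) * stoichiometry,
    assuming every immobilised target molecule is active.
    """
    if analyte_mw <= 0 or target_mw <= 0 or immobilised_ru < 0:
        raise ValueError("molecular weights must be positive, level non-negative")
    return immobilised_ru * analyte_mw / target_mw * stoichiometry


# Hypothetical example: 200Da fragment, 40kDa target, 8,000 RU immobilised.
rmax = theoretical_rmax(analyte_mw=200.0, target_mw=40000.0, immobilised_ru=8000.0)
print(f"expected Rmax = {rmax:.1f} RU")
```

A signal of a few tens of response units at saturation illustrates why high surface activity and a well-matched control surface matter for fragments.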
Establish target activity and stability
Prior to running a full fragment screen, a positive control compound is often used to confirm the immobilised target is active. It is possible to run a fragment screen without a positive control, but in those cases it is imperative to have good control protein surfaces to help with the hit selection process.
For enzymatic systems, substrates can be a good starting point for a control. The responses produced from concentration series of the control compounds confirm the targets are active enough to detect small molecule binding and the controls bind selectively to one target or the other, as well as reveal the range of signals one can expect to see for fragments (Figure 3B).
To establish the stability of the target surfaces, the binding of control compounds is tested repeatedly over time (Figure 3C). In cases where the target rapidly loses activity, many biosensor systems can support analysis at lower temperatures (for example, at 4°C) and/or addition of protein stabilisers such as glycerol to improve target stability.
Screen the library
To increase throughput, fragment libraries are typically screened at one concentration to first identify potential binders. With biosensors, screening is normally done at concentrations between 50 and 500μM. These concentrations are lower than those run in structure-based fragment screening studies, where a high percentage of site occupancy is needed. Biosensors are capable of detecting binding at concentrations that are well below the KD. The use of lower concentrations helps reduce the number of false positives and favours the selection of the higher-affinity fragments.
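The point about screening below the KD can be illustrated with the simple 1:1 binding isotherm, in which the fraction of occupied sites is C/(C + KD). The sketch below is illustrative only; the 1mM KD and the 40 RU maximum response are hypothetical values.

```python
def fractional_occupancy(conc, kd):
    """Fraction of target sites occupied at equilibrium for 1:1 binding.

    conc and kd must be in the same units.
    """
    if conc < 0 or kd <= 0:
        raise ValueError("conc must be non-negative and kd positive")
    return conc / (conc + kd)


def expected_response(conc, kd, rmax):
    """Equilibrium response for a surface with maximum response rmax."""
    return rmax * fractional_occupancy(conc, kd)


KD_UM = 1000.0   # hypothetical fragment with KD = 1mM
RMAX_RU = 40.0   # hypothetical maximum response
for conc_um in (50.0, 200.0, 500.0):
    occ = fractional_occupancy(conc_um, KD_UM)
    print(f"{conc_um:g}uM: occupancy {occ:.1%}, "
          f"response {expected_response(conc_um, KD_UM, RMAX_RU):.1f} RU")
```

Under these assumptions a 1mM binder screened at 200μM occupies only about one sixth of the sites, which still gives a measurable response but would be too low for most structural methods.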
Identify the poorly behaved compounds
When reviewing the sensorgram response data, the first step in hit selection is to look at the fragments’ responses from a reference surface that has no protein immobilised (Figure 4A, left panel). Plotting raw response points taken just after each compound test on the reference surface is used to identify compounds that stick non-specifically or aggregate on the sensor surface itself (Figure 4A, right panel).
Any compounds that show significant binding to the reference surface should be omitted from further consideration. In the full 1,500-compound Ro3 library from Maybridge, only a few fragments bound to the reference surface or showed unusual injection response indicating aggregation. This low occurrence of poorly-behaved compounds is a result of the careful selection process the compounds have gone through in establishing this fragment library.
Check the control
The left panels of Figure 4B show the responses obtained for a fragment screen against two targets. Report-point trend plots (right panels of Figure 4B) of the processed response data from the surfaces are useful for visualising the behaviour of the assay over time. In this example, the binding of one control slowly decreased over the time required for this screen (Figure 4B, lower right panel). Across the panel of fragments, a majority show little or no binding to either target surface (which we would expect for a non-focused library), while a few fragments appeared to be promising hits.
Identify the selective binders using versus plots
To identify selective hits efficiently, the responses at the end of the binding phase from one target surface are plotted versus the responses from the other target surface (Figure 4C). The data for many of the apparent binders actually lie along a diagonal indicating they bind similarly to both proteins. Compounds in this region are not worth following up on, as they do not show specificity. The most interesting compounds will lie off the diagonal. The closer to the axis they are, the more specific they will be (note the positions of the highly selective positive-control compounds).
The number of binders to choose for follow-up analyses will of course vary depending on the library and the target, but it is the quality, not the quantity, of potential hits that really matters at this stage. Often project teams find that the hits identified in SPR screens do not show up as reliable binders in structural analyses. This is a result of poor hit selection in the screen.
Too often investigators flag all compounds that show target binding as hits, but only later find that most of their hits are non-specific binders. By including an off-target or secondary target in the analysis, and plotting the responses in a versus plot, it is easy to identify fragments that bind specifically to one target and to disregard those fragments that are not selective.
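The hit-selection logic described in the preceding steps (discard reference-surface binders, then separate diagonal from off-diagonal points in the versus plot) can be sketched as a simple classifier. All threshold values below are hypothetical defaults chosen for illustration; suitable values depend on the instrument, surface density and assay noise.

```python
def classify_fragment(ref_ru, target_a_ru, target_b_ru,
                      max_reference_ru=5.0, min_binding_ru=5.0,
                      noise_floor_ru=1.0, selectivity_ratio=3.0):
    """Classify one fragment from its responses on three surfaces.

    ref_ru: raw response on the protein-free reference surface.
    target_a_ru, target_b_ru: processed responses on the two target surfaces.
    """
    if ref_ru > max_reference_ru:
        return "poorly behaved"        # sticks to or aggregates on the chip
    high = max(target_a_ru, target_b_ru)
    # Clamp the smaller response so noise or negative values cannot inflate the ratio.
    low = max(min(target_a_ru, target_b_ru), noise_floor_ru)
    if high < min_binding_ru:
        return "non-binder"
    if high / low < selectivity_ratio:
        return "non-selective"         # lies near the diagonal of the versus plot
    return "selective for A" if target_a_ru > target_b_ru else "selective for B"


# Hypothetical responses: (reference, target A, target B) in response units.
screen = {
    "frag_1": (0.5, 20.0, 18.0),
    "frag_2": (0.2, 25.0, 1.5),
    "frag_3": (0.1, 2.0, 1.0),
    "frag_4": (12.0, 30.0, 2.0),
    "frag_5": (0.3, -1.0, 15.0),
}
calls = {name: classify_fragment(*resp) for name, resp in screen.items()}
print(calls)
```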
Follow-up concentration-dependent studies
Follow-up assays of the selected hits are typically run in full concentration series to demonstrate the binding is stoichiometric and establish affinity of the fragment. While ranking hits by their relative affinities may be useful at this point in the discovery process, fragments that bind stoichiometrically provide the highest likelihood for success in structural analysis.
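One common way to extract an affinity from such a concentration series is to fit the equilibrium responses to a 1:1 isotherm, R = Rmax × C/(KD + C). The sketch below does this with a plain log-spaced grid search over KD and a closed-form least-squares Rmax at each grid point, so that it needs no external libraries; dedicated biosensor software uses more sophisticated fitting. The data are synthetic, generated from a hypothetical KD of 400μM and Rmax of 30 RU.

```python
def fit_steady_state(concs, responses, kd_min=1e-6, kd_max=1e-1, steps=2000):
    """Fit R = Rmax * C / (KD + C) and return (KD, Rmax).

    KD is scanned on a log-spaced grid (molar units); for each trial KD the
    least-squares Rmax has the closed form sum(R*x) / sum(x*x), x = C/(KD+C).
    """
    best = None
    for i in range(steps + 1):
        kd = kd_min * (kd_max / kd_min) ** (i / steps)
        x = [c / (kd + c) for c in concs]
        rmax = sum(r * xi for r, xi in zip(responses, x)) / sum(xi * xi for xi in x)
        sse = sum((r - rmax * xi) ** 2 for r, xi in zip(responses, x))
        if best is None or sse < best[0]:
            best = (sse, kd, rmax)
    return best[1], best[2]


# Synthetic, noise-free data from hypothetical parameters (illustration only).
TRUE_KD, TRUE_RMAX = 400e-6, 30.0
concs = [31.25e-6, 62.5e-6, 125e-6, 250e-6, 500e-6, 1e-3]  # two-fold series, molar
responses = [TRUE_RMAX * c / (TRUE_KD + c) for c in concs]

kd_fit, rmax_fit = fit_steady_state(concs, responses)
print(f"KD = {kd_fit * 1e6:.0f} uM, Rmax = {rmax_fit:.1f} RU")
```

A fitted Rmax that greatly exceeds the value expected for one fragment per target molecule would suggest that binding is not stoichiometric.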
Follow-up competition studies
It is possible to carry out competition studies using the biosensor to identify potential binding sites for hits from a screen. In these blocking experiments, the hits are tested for binding in the presence of a saturating concentration of a known binder. This added information about whether a fragment is competitive or non-competitive for known site binders can further help identify which compounds to pursue in structural analyses and hit optimisation programmes.
Often structural analogues of potential hits may be present in the library or available as part of a larger compound collection. Given the speed of the biosensor methodology, an analysis of existing analogues can typically provide additional insight into the structure-activity relationship for a particular framework to aid in the selection process. It is also not uncommon to find more potent fragments even within a small collection of analogues.
After choosing a well-behaved library and establishing the activity and stability of a target and control protein, SPR-based fragment screening and follow-up studies are relatively straightforward. The most significant challenge is hit selection. Wisely choosing which fragments are indeed worth pursuing in downstream analyses requires careful evaluation of the responses obtained for the entire library from the target, control protein and reference surfaces.
Conclusions - Exploring the horizons of small molecule drug discovery
The pharmaceutical industry faces unprecedented challenges in small molecule drug discovery. Low hit rates from HTS screens and high attrition of subsequent lead candidates have led medicinal chemists to seek other ways of identifying progressable hit compounds. Fragment screening permits a much larger chemical space to be probed by screening a relatively low number of diverse fragments and yields hits that can be readily optimised into potent leads, while still maintaining ‘drug-likeness’.
The success of the screen ultimately depends on the design of the fragment library. Aqueous solubility is a key consideration, as compounds are screened at very high concentrations in order to detect weak binding. Fragment libraries are being actively enhanced to support recent advancements in biosensor screening technologies that have now increased the feasibility of higher sensitivity screening of larger numbers of compounds against multiple targets.
This article originally featured in the DDW Winter 2011/12 Issue
Dr David Myszka is the Director of Biosensor Tools LLC, a contract services and consulting firm with a focus on biosensor-based drug discovery. Over the past 18 years he has published more than 150 research articles and reviews on biosensor technology. Further information can be obtained at www.biosensortools.com.
Dr Jane Paul is the Maybridge Value Stream Manager and oversees the development of the Maybridge Ro3 Library. Previously she was a Chemistry Team Leader responsible for generating New Building Blocks and screening compounds for the Maybridge portfolio. Prior to joining Thermo Fisher Scientific in 1999, Jane spent four years at Chiroscience Ltd and carried out postdoctoral research at The Novartis Institute of Medical Sciences.
1 Hajduk, PJ, Greer, J. Nat. Rev. Drug Discov., 2007, 6, 211.
2 Hubbard, RE, Murray, JB. Methods in Enzymology, 2011, 493, 509-531.
3 Chessari, G, Woodhead, AJ. Drug Discov. Today, 2009, 14, 668-675.
4 Congreve, M, Chessari, G, Tisi, D, Woodhead, AJ. J. Med. Chem., 2008, 51, 3661-3680.
5 Lipinski, CA, Lombardo, F, Dominy, BW, Feeney, PJ. Adv. Drug Deliv. Rev., 1997, 23, 3-25.
6 Lipinski, CA, Lombardo, F, Dominy, BW, Feeney, PJ. Adv. Drug Deliv. Rev., 2001, 46, 3-26.
7 Hann, MM, Leach, AR, Harper, G. J. Chem. Inf. Comput. Sci., 2001, 41, 856-864.
8 Murray, CW, Rees, DC. Nature Chemistry, 2009, 1, 187-192.
9 DesJarlais, RE. Methods in Enzymology, 2011, 493, 137-155.
10 Chen, IJ, Hubbard, RE. J. Comput. Aid. Mol. Des., 2009, 23, 603-620.
11 Whittaker, M, Hesterkamp, T, Barker, J. Europ. BioPharm. Rev., Autumn 2006. A Cohesive view of Fragments.
12 Hubbard, RE, Davis, B, Chen, I, Drysdale, MJ. Curr. Top. Med. Chem., 2007, 7, 1568.
13 Blomberg, N, Cosgrove, DA, Kenny, PW, Kolmodin, K. J. Comput. Aid. Mol. Des., DOI 10.1007/s10822-009-964-5.
14 Congreve, M, Carr, R, Murray, C, Jhoti, H. Drug Discov. Today, 2003, 8, 876-877.
15 Hajduk, PJ. J. Med. Chem., 2006, 49, 6972-6972.
16 English, AC, Groom, CR, Hubbard, RE. Protein Eng., 2001, 14, 47-59.
17 Lepre, CA. Methods in Enzymology, 2011, 493, 219-237.
18 Leach, AR, Hann, MM, Burrows, JN, Griffen, EJ. Mol. BioSyst., 2006, 2, 429-446.
19 Butina, D, Gola, JMR. J. Chem. Inf. Comput. Sci., 2003, 43, 837-841.
20 Baurin, N, Baker, R, Richardson, C, Chen, I, Foloppe, N, Potter, A, Jorden, A, Roughley, S, Parratt, M, Greaney, P, Morley, D, Hubbard, R. J. Chem. Inf. Comput. Sci., 2004, 44, 643-651.
21 Tetko, IV, Tanchuk, VY, Kasheva, TN, Villa, AEP. J. Chem. Inf. Comput. Sci., 2001, 41, 1488-1493.
22 Baurin, N, Aboul-Ela, F, Barril, X, Davis, B, Drysdale, M, Dymock, B, Finch, H, Fromont, C, Richardson, C, Simmonite, H, Hubbard, R. J. Chem. Inf. Comput. Sci., 2004, 44, 2157-2166.
23 Gianti, E, Sartori, L. J. Chem. Inform. Mod., 2008, 48, 2129-2139.
24 Siegal, G, AB, E, Schultz, J. Drug Discov. Today, 2007, 12, 1032-1039.
25 Akritopoulou-Zanze, I, Hajduk, PJ. Drug Discov. Today, 2009, 14, 291-297.
26 Fink, T, Reymond, JL. J. Chem. Inf. Model., 2007, 47, 342-353.
27 Roughley, SD, Hubbard, RE. J. Med. Chem., 2011, 54, 3989-4005.
28 Schuffenhauer, A. Curr. Top. Med. Chem., 2005, 5, 751-762.
29 Hubbard, RE, Davis, B, Chen, I, Drysdale, MJ. Curr. Top. Med. Chem., 2007, 7, 1568.
30 Tounge, BA, Parker, MH. Methods in Enzymology, 2011, 493, 3-20.
31 Shemetulskis, NE, Weininger, D, Blankley, CJ, Yang, JJ, Humblet, C. J. Chem. Inf. Comput. Sci., 1996, 36, 826-871.
32 Rogers, D, Hahn, M. J. Chem. Inf. Model., 2010, 50, 742-754.
33 Butina, D. J. Chem. Inf. Comput. Sci., 1999, 39, 747-750.
34 Shoichet, BK et al. J. Med. Chem., 2002, 45, 8, 1712-1722.
35 Roeschlaub, CA, Redwood, CJ, Cross, DJ, Bridge, E, Green, S, Overfield, D, Evans, D. High-throughput technique for rapid measurement of fragment solubility, poster available at www.maybridge.com.
36 Congreve, M, Rich, RL, Myszka, DG, Siegal, G, Marshall, F. Methods in Enzymology, 2011, 493, 115-136.
37 Rich, RL, Myszka, DG. Anal. Biochem., 2010, 402, 170-178.
38 Rich, RL, Myszka, DG. Anal. Biochem., 2007, 361, 1-6.
39 Rich, RL, Quinn, JG, Morton, T, Stepp, JD, Myszka, DG. Anal. Biochem., 2010, 407, 270-277.