Precise genome editing: the key to a CRISPR drug discovery pipeline?
The recent emergence of precise genome editing methodologies and the CRISPR/Cas9 technology, in particular, may prove the key to unlocking this huge potential.
A key focus of drug discovery teams in recent years has been the generation of more physiologically relevant models, which could provide better correlation to the clinical setting and thereby reduce candidate drug attrition.
The ability to efficiently and precisely edit a cell’s DNA to generate both in vitro and in vivo models, using CRISPR/Cas9, represents a significant opportunity in this area, as discussed here.
In spite of the progress made since the advent of genetic engineering methodologies in the 1970s, the biggest challenge to the widespread modulation of genes has been the lack of an efficient and widely applicable method for precisely introducing genetic changes. Historical approaches, such as the use of self-splicing introns, while demonstrating the potential of genetic engineering, proved laborious and time-consuming, with sub-optimal efficiency, precision and scalability. This led to the transient or stable over-expression of exogenous DNA becoming the method of choice, despite modification of a gene in the endogenous context being the desired goal.
Progress towards endogenous manipulation was made through the development of approaches using targetable nucleases, such as the zinc-finger nucleases (ZFN) (2003) and transcription activator-like effector (TALE) nucleases (TALENs) (2010)1. Both of these techniques use the nuclease domain of the FokI restriction enzyme, coupled to a targeting motif to direct DNA cleavage activity to the desired site. In the case of ZFN, this targeting role is performed by engineered zinc-finger domains, derived from natural transcription factors, each of which is directed towards a three-base pair sequence. Similarly, TALENs are directed by a series of TALE repeat elements, although in this scenario, each module targets a single base pair. With both ZFN and TALENs, targeting to the desired sequence enables FokI dimerisation and subsequent DNA cleavage to form a double-strand break (DSB). The cell’s own DNA repair mechanisms can then introduce the desired modification.
While both systems form effective ‘DNA scissors’ for performing precise modifications, they also have their limitations. The need to re-engineer the full protein module of the zinc-finger or TALE domain increases the complexity of tool design and synthesis and does not lend itself to easy multiplexing. The development of the CRISPR/Cas9 approach overcomes many of these issues, which is why it is proving transformative.
CRISPR/Cas systems occur naturally as a bacterial mechanism enabling acquired immunity to phage infection. In this scenario the bacterium is able to integrate short DNA sequences from invading pathogens into its genome, resulting in repeating arrays, called Clustered Regularly Interspaced Short Palindromic Repeats, or CRISPRs. If a related virus infects the cell, these sequences provide bacterial immunity by cleaving the incoming genome. With the elucidation of this system, the potential for its use in genome engineering became apparent and has led to the development of the CRISPR/Cas9 gene editing technology2.
Many of the advantages of the CRISPR/Cas9 approach arise from the simplicity of the system itself, which only requires the introduction of two key components. The first is the Cas9 nuclease which comprises the HNH and RuvC-like nuclease domains and acts to cleave the two DNA strands, thereby forming a DSB. The second component is responsible for directing the Cas9 to the appropriate cleavage site and generally takes the form of a single guide RNA (sgRNA) molecule (commonly termed CRISPR). This sgRNA is designed to be complementary to the DNA sequence around the modification site, and has been engineered from the bacterial system, in which both a trans-activating crRNA (tracrRNA) and a CRISPR RNA (crRNA) were required.
Successful targeting of the Cas9 nuclease and sgRNA is dependent on the target DNA itself, more specifically a short sequence, widely found throughout the genome, termed the protospacer adjacent motif (PAM) flanking the 3’ end of the targeted site. This separation of the targeting and nuclease elements is a key feature of the CRISPR/Cas9 system, as it offers many advantages. Rather than needing to re-engineer the nuclease for every targeting event, the Cas9 remains consistent, with only the need to include different sgRNA molecules to target different sites within the genome. The relative ease of generating such RNA molecules, rather than re-engineering the protein components, offers time and cost benefits. Furthermore, this feature enables the multiplexing of more than one modification in a single experiment through the inclusion of multiple sgRNA alongside the Cas9.
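The PAM requirement described above can be illustrated with a short sketch that scans a DNA sequence for candidate target sites. It assumes the commonly used Streptococcus pyogenes Cas9, which recognises a 5'-NGG-3' PAM next to a 20-nucleotide target; neither value is stated in this article, and the function names are hypothetical. It is a simplified illustration, not a guide design tool, and it performs no off-target scoring.

```python
import re

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))

def find_target_sites(seq, protospacer_len=20):
    """List candidate sites as (strand, start, protospacer, pam).

    Assumption: an NGG PAM sits immediately 3' of the protospacer.
    For the '-' strand, `start` is an index into the reverse complement,
    not into the original sequence.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    sites = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            sites.append((strand, match.start(), match.group(1), match.group(2)))
    return sites
```

Because only the protospacer differs between sites, targeting a new locus in this sketch means choosing a different entry from the returned list, while the Cas9 component is unchanged.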
Once the sgRNA has directed Cas9 to the desired location and the nuclease has generated a DSB, the process then relies on the cell’s own DNA repair mechanisms to complete gene editing. Repair can occur through the non-homologous end joining (NHEJ) pathway, which is often error-prone, giving rise to insertion and deletion events (indels) that may result in genetic knockouts. Alternatively, repair by the homologous recombination (HR) pathway, with the provision of appropriately designed DNA oligonucleotides or vectors, enables the introduction of specific changes within the genetic sequence.
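One reason an indel may, rather than must, produce a knockout is the reading frame: a net length change that is not a multiple of three shifts the frame of a coding sequence. The sketch below classifies an edited allele on that basis alone. It assumes the edit lies within a protein-coding region, and the function name is hypothetical; in-frame indels can still disrupt function, and a frameshift does not guarantee loss of the protein.

```python
def indel_effect(ref_allele, edited_allele):
    """Classify an edit by its net length change relative to the reference.

    Assumption: both alleles cover the same coding region, so a length
    change that is not divisible by three shifts the reading frame.
    """
    delta = len(edited_allele) - len(ref_allele)
    if delta == 0:
        return "no length change"
    kind = "insertion" if delta > 0 else "deletion"
    frame = "frameshift" if delta % 3 else "in-frame"
    return f"{abs(delta)} bp {kind}, {frame}"
```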
With appropriate bioinformatic analysis, oligonucleotide design and CRISPR methodology, growing evidence suggests that off-target effects are not a major concern. Nevertheless, several approaches have been designed which decrease the chance of off-target cleavage. The most common of these is a modified Cas9, termed nickase3. In this molecule, one of the two cleavage domains of the Cas9 enzyme has been mutated such that the nuclease only cuts a single DNA strand, resulting in a single strand break (SSB) or ‘nick’. Through the use of two offset sgRNAs directed to sites in close proximity, it is possible to induce a pair of nicks on opposite strands, which function like a DSB, inducing DNA repair via the aforementioned repair mechanisms. The need for dual recognition reduces off-target effects. Additionally, this approach has been taken further with the development of an editing molecule where catalytically inactive, or ‘dead’, Cas9 (dCas9) has been fused to the FokI nuclease domain, introducing the requirement for two dCas9-FokI monomers to increase cleavage specificity4.
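The paired-nickase idea can be sketched as a search for two guide sites on opposite strands that lie close together. The example below is a simplified illustration: the function name, the input format and the default offset window are assumptions, not values from this article, and real designs also consider the relative orientation of the two guides.

```python
from itertools import product

def find_nickase_pairs(sites, min_offset=0, max_offset=100):
    """Pair guide sites on opposite strands whose nicks lie close together.

    `sites` is an iterable of (strand, nick_position) tuples, with strand
    '+' or '-' and positions given in forward-strand coordinates.
    The offset window is an illustrative assumption.
    """
    plus = [s for s in sites if s[0] == "+"]
    minus = [s for s in sites if s[0] == "-"]
    pairs = []
    for p, m in product(plus, minus):
        offset = abs(p[1] - m[1])
        if min_offset <= offset <= max_offset:
            pairs.append((p, m, offset))
    # Closest pairs first.
    return sorted(pairs, key=lambda pair: pair[2])
```

A lone guide that binds elsewhere in the genome produces only a nick in this scheme, which is why requiring a nearby partner on the opposite strand reduces off-target DSBs.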
In addition to the nuclease and the nickase entities, the rapid advances being made in the CRISPR arena have supplemented the ‘CRISPR toolbox’ with a range of variations that further demonstrate the potential impact of the technique (Figure 2). In addition to conventional gene knockouts and modifications, inducible systems have been developed, a technique known as iCRISPR5. In this approach Cas9 and sgRNA are engineered into a safe harbour locus within the genome (eg the ROSA26 or the AAVS1 locus), and Cas9 activity is only induced upon introduction of a pharmacological agent, thereby allowing real time monitoring of knockout events.
The use of dCas9 coupled to an effector domain makes it possible to selectively inhibit or activate specific gene loci through the techniques of CRISPRi6 and CRISPRa7. CRISPRi employs the ability of dCas9 to bind to, but not cut, a specific DNA sequence. In doing so, the presence of the dCas9 protein itself prevents access of RNA polymerase and thereby represses transcription in a reversible manner. This offers a level of sophistication in that it is possible to modulate the level of inhibition, rather than just inducing complete loss of the protein. An extension of this technique involves the use of a fusion of dCas9 to a transcriptional silencing domain (eg dCas9-KRAB). In a similar manner, the generation of dCas9 molecules fused to transcriptional activating moieties, such as VP64, enables interrogation of the effects of activating a gene, a technique termed CRISPRa. This allows ‘gain of function’ studies to be performed, which yields considerable value in studying the normal function of a target.
The ‘CRISPR toolbox’ also comprises an array of techniques and enabling technologies. For example, delivery of the Cas9 can be achieved not only through the use of various transfection methodologies for DNA entry into the cell, but also through the introduction of Cas9 mRNA or protein. Fusion of Cas9 to a fluorescent tag, such as GFP, enables the enrichment of targeted cells through the use of FACS, increasing cell line generation efficiency. Similarly, the use of droplet digital PCR (ddPCR) with appropriate primers and probes, enables quantitative analysis of the sequence being targeted, thereby giving an indication of both the allele number of the targeted gene and also the efficiency of modification.
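As an illustration of how ddPCR readouts can be turned into a modification efficiency, the sketch below applies the standard Poisson correction to droplet counts and then compares edited and wild-type signals. The function names and the two-probe layout (one probe for the edited allele, one for the wild-type allele) are assumptions made for this example rather than a description of any particular assay.

```python
import math

def copies_per_droplet(positive, total):
    """Mean template copies per droplet, using the Poisson correction.

    A droplet can hold more than one copy, so the mean is estimated from
    the fraction of negative droplets: lambda = -ln(1 - positive / total).
    """
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total")
    return -math.log(1 - positive / total)

def edited_fraction(edited_positive, wildtype_positive, total):
    """Fraction of alleles carrying the edit (hypothetical two-probe assay)."""
    edited = copies_per_droplet(edited_positive, total)
    wildtype = copies_per_droplet(wildtype_positive, total)
    if edited + wildtype == 0:
        return 0.0
    return edited / (edited + wildtype)
```

For a clonal line, an edited fraction near one half would be consistent with one modified allele out of two, although copy number varies between cell lines and needs to be established separately.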
Together, the CRISPR/Cas9 methodology allows a huge range of changes to be made, including gene knockouts, knock-ins, the introduction of specific mutations or insertions and the ability to induce or repress transcription, to name but a few. The most significant consideration with all these changes is that they are modifications to the endogenous gene, which allows study of the protein at its native levels. This is an advantage compared to overexpression systems, in which the protein exists at raised levels when compared to the normal levels of both the protein of choice and its interacting partners. Thus, the range of changes that can be made, coupled with the greater physiological relevance, delivers a technique that will likely have a huge impact across the entire breadth of the drug discovery process (Figure 3).
Identifying CRISPR targets
An increasingly important step in the drug discovery process is the identification of novel, validated targets, whose pharmacological modulation may yield the desired therapeutic modality. To date, only a fraction of the proposed ‘druggable’ genome has been explored and techniques such as the screening of siRNA libraries against a specific phenotypic endpoint have been deployed to identify novel targets for a particular condition. Although siRNA libraries have proved to be an extremely valuable tool in target identification, this technique has several limitations. Gene inactivation through siRNA is often incomplete and thereby not representative of a true loss of function. In addition, confounding off-target effects of siRNA molecules are widely reported and represent a significant challenge.
The ability to specifically modulate the endogenous gene with CRISPR/Cas9 and introduce a complete genetic knockout, while minimising off-target effects, offers an improved approach to target identification. Moreover, the ability to scale this approach through the generation of genome-wide CRISPR libraries, coupled with the use of lentiviral delivery methods, enables high-throughput loss-of-function screens to be performed rapidly and identify genes whose activity is important for the specific endpoint being measured. For example, such approaches have been used by scientists at the Sanger Institute of the Wellcome Trust in Cambridge, UK, where a library of ~88,000 sgRNAs targeting approximately 19,000 genes was used to identify previously unknown genes that modulate specific endpoints8. In addition, the use of CRISPRa is of value in the target identification phase, as it offers the potential to perform large scale gain-of-function screens, which could identify a different range of targets from inhibitory screens9. Following the identification of novel targets, rigorous target validation (TV) is required and this is another area where CRISPR can have a major impact by enabling the efficient introduction of engineered alterations into the genome.
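The readout of a pooled loss-of-function screen of this kind is typically a table of sgRNA sequencing counts before and after selection. The sketch below shows one simple way such counts could be summarised per gene, using a log2 fold change for each guide and the median across guides. It is an illustrative assumption, not the analysis used in the cited study; the function name, input dictionaries and pseudocount are hypothetical.

```python
import math
from collections import defaultdict
from statistics import median

def gene_scores(counts_before, counts_after, guide_to_gene, pseudocount=1.0):
    """Median log2 fold change of sgRNA abundance per gene.

    counts_before / counts_after map guide IDs to read counts;
    guide_to_gene maps guide IDs to gene names. Counts are scaled by the
    total reads in each sample, and a pseudocount avoids division by zero.
    """
    total_before = sum(counts_before.values())
    total_after = sum(counts_after.values())
    per_gene = defaultdict(list)
    for guide, gene in guide_to_gene.items():
        before = (counts_before.get(guide, 0) + pseudocount) / total_before
        after = (counts_after.get(guide, 0) + pseudocount) / total_after
        per_gene[gene].append(math.log2(after / before))
    return {gene: median(values) for gene, values in per_gene.items()}
```

Genes whose guides are consistently depleted (strongly negative scores) would be candidates for follow-up; using several guides per gene and a robust summary such as the median limits the influence of any single guide.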
Through the generation of knockout cell lines, it is possible to evaluate the effect of loss of the target, giving an indication of the potential effects of pharmacological inhibition. While this will often be achievable through a simple genetic knockout (resulting from indels introduced during NHEJ repair), iCRISPR strategies have also been developed that enable the study of knockouts which are thought to be lethal to the cell. The use of CRISPRa can also yield information about the activation of the target, which may represent a pharmacological approach. In conjunction with the use of CRISPRi, modulation of the target through both activation and inhibition can yield increased confidence in normal functioning of the gene product.
Many targets are identified through mutational analysis and CRISPR enables the rapid generation of cell lines harbouring such mutations, in the desired cellular background, to recapitulate this scenario. The recent development of sequencing technologies, and next-generation sequencing (NGS) in particular, helps provide a much greater understanding of the genetics underlying many conditions. By coupling this knowledge with CRISPR/Cas9 tools, more physiologically relevant systems can be generated on which to base pharmacological studies. For example, isogenically paired cell lines harbouring specific changes on an identical genetic background enable interrogation of both the genetic changes and the action of lead molecules.
A further benefit of the CRISPR/Cas9 approach is the speed and relative ease with which cellular models can be generated. Through standard processes such as transfection methodologies, FACS sorting and sequence analysis, cellular modifications can be rapidly engineered (Figure 4). Thus in conjunction, these approaches can provide a wealth of data on the normal functioning of a target within the cellular environment. This, in turn, should yield better validated targets for progression into full drug discovery campaigns, with appropriate hit-finding strategies.
Designing CRISPR assay cascades
A crucial step in attempting to identify better lead molecules and reducing candidate drug attrition is the development of more physiologically relevant assays. Historically, cell assays have often involved the transient or stable overexpression of targets or the use of fixed-cell endpoints; however, the more representative nature of CRISPR-modified cells offers huge advantages. Assays can be developed that monitor the behaviour of either the native or the modified target at endogenous levels, which should prove advantageous compared with the use of overexpressing cell lines or reporter gene lines. CRISPR technology allows tags to be added to the native protein in a strategy called ‘endogenous tagging’10. The incorporation of a fluorescent tag such as GFP, for example, allows the direct visualisation of the target, both at native levels and also in real time. This lends itself to application in high content imaging assays, where the location, movement and behaviour of the protein within the cell can be monitored, and removes the need for either the overexpression of tagged variants or the use of fixed endpoints with antibody-based visualisation, both of which have their challenges. Thus for proteins where suitable antibodies are not available, or where it is desirable to monitor the target in real time, this may well become the method of choice. Again, the pace of development of the CRISPR field is evident in the advances being made: the recent development of the SunTag approach11, whereby CRISPR/Cas9 can be used to tag a single protein molecule with up to 24 copies of a fluorescent tag, enables both improved imaging of proteins and an approach for multimerising proteins on a target scaffold. Even at the DNA level, the use of appropriate sgRNAs with dCas9 fused to a fluorescent tag enables the direct visualisation of the chromosomal DNA at a specific locus in live cells.
Developing CRISPR disease models
Another approach is the introduction of mutations and knockout cell lines, to generate isogenically paired cell lines for assay use. These can be screened in a differential manner, to identify compounds that work in one cellular context, but not another. This could prove particularly advantageous in phenotypic approaches where hit molecules that only show a phenotypic effect in the wild type, but not mutant/knockout setting of a gene, can be identified and linked to the edited target. The isogenic approach will be increasingly valuable with the growing drive to align lead molecules to a specific clinical patient segment as early as possible. Taking this a step further, it is also possible to generate more complex cellular models through the introduction of a range of changes and this may be a case where the multiplexing capability of CRISPR/Cas9 is valuable. Should the desired clinical context not be available in an appropriate cell type, CRISPR could be utilised to engineer the scenario of choice.
This is not only true of basic immortalised cell types, as CRISPR has also been used to introduce modifications into more representative cell types such as primary cells and induced pluripotent stem cells (iPSC), further increasing the physiological relevance of the desired engineered model. Although CRISPR may have huge impact for the generation of complex cellular models, it is not limited to in vitro assays. To date a range of species have been modified using the CRISPR technology, including mice and other rodents, zebrafish and primates. The ability to use CRISPR in an in vivo setting has delivered significant cost and time savings in the generation of animal models. Direct injection of Cas9 and transcribed sgRNA into fertilised zygotes allows the traditional ES cell targeting phase to be bypassed, thereby reducing the generation time of murine models from over a year to just a few months, while improving precision and reducing animal use12. Moreover, CRISPR allows the insertion or the induction of target-specific genetic mutations directly into the adult animal, circumventing the issues associated with the study of embryonic lethal genes.
In addition, many of the advantages that have already been discussed for cellular model generation also hold true in the in vivo setting. For example, the use of multiplexed sgRNAs has been achieved in vivo, which again facilitates the ability to generate accurate models of complex human diseases in an efficient manner. This functionality should prove particularly valuable in the lead generation to clinical candidate selection stages of the drug discovery pipeline. The generation of complex physiologically-relevant in vivo models is critical for numerous capabilities, including efficacy testing, drug metabolism and pharmacokinetic studies and safety profiling. The use of CRISPR extends beyond the preclinical environment because it offers drug discovery teams the potential to respond rapidly to breaking clinical data. For example, a key challenge within oncology is the capability of a tumour to acquire resistance to pharmacological agents. This often results from the selection of mutations with resistance-conferring properties. Upon the clinical observation and characterisation of such genetic modifications, the efficient nature of CRISPR targeting allows the rapid generation of both cellular and in vivo models that recapitulate the clinically observed scenario. This, in turn, enables drug discovery teams to rapidly respond to breaking clinical data in order to design the next generation of therapeutic models.
In addition to this widespread impact across the breadth of the classical small molecule drug discovery process, CRISPR potentially represents a therapeutic opportunity in its own right, namely as a gene therapy. So far CRISPR/Cas9 has been used to modulate genes in in vitro and in vivo models in a wide range of non-human species. The recent correction of the hereditary tyrosinemia disease phenotype through the direct hydrodynamic delivery of CRISPR/Cas9 components and an oligonucleotide repair template into mouse liver highlights the promise of CRISPR for gene therapy13. This is further supported by clinical trials on the ex vivo use of ZFN to knock out the CCR5 receptor as a treatment for HIV infection, which yielded reductions in HIV DNA and RNA levels.
This raises the possibility of the use of Cas9 as a direct therapeutic for treating genetic disorders in humans. Rather than traditional gene therapy methods, such as gene augmentation, which deliver additional functional copies of a gene, Cas9-induced repair of the endogenous mutated gene offers significant advantages in terms of direct repair, although numerous challenges remain. Probably the greatest of these is the need for efficient and efficacious targeted gene delivery systems. These would need to be tailored for the individual applications, to enable targeting to the specific disease tissues or cell types. Another consideration that would need to be addressed is the safety of Cas9 introduction, as well as the related potential for off-target effects. Numerous hurdles remain, but the potential is growing for the ultimate delivery of a successful therapeutic CRISPR/Cas9 modality.
A CRISPR future?
As outlined in this brief review, the potential impact of the CRISPR/Cas9 technology in pharmaceutical research is enormous. The breadth of impact spans the entire drug discovery process from target identification through to the generation of models based on clinical observations and both the use and potential opportunities are rapidly increasing. The dramatic pace of development of the CRISPR genome editing field is highlighted by the growing number of recent publications. Searching PubMed for the term ‘CRISPR’, both independently and in conjunction with ‘genome editing’ shows the exponential growth of the field over recent years. Whereas a growing number of articles on the bacterial CRISPR system were published annually prior to 2012, since the reports of CRISPR/Cas9 as a gene editing tool, the number of publications has grown enormously.
In parallel, significant investments and collaborative agreements are being put in place to both drive forward progress and leverage the potential opportunities. Recent examples include the range of agreements put in place by AstraZeneca and others to collaborate with leading academic, biotechnology and reagent partners around the world. AstraZeneca has announced four research collaborations aimed at harnessing the power of CRISPR for use across its drug discovery platform to identify new drug targets and enable target validation in systems that more closely resemble human disease. The collaborations complement the company’s in-house CRISPR programme and, in line with its open innovation approach, findings will be published in peer-reviewed journals to advance the application of CRISPR technology.
We currently reside at an exciting point in the development and application of genome editing technologies. Not only is the potential of CRISPR currently being realised in the development of in vitro and in vivo models for drug discovery, but the wider potential opportunities ahead are being brought closer by the dramatic progress being made in laboratories throughout the world. This is why there is a belief that CRISPR/Cas9 technology could be a lynchpin in demonstrating what science can do to transform the drug discovery process.
The authors wish to thank Emanuela Cuomo, David Fisher and Martin Main for input into the manuscript; Max Gallucci for illustration support, along with Linda MacCallum, John Wiseman and all members of the AstraZeneca Genome Editing teams in the UK and Sweden for their support with precise genome editing.
Dr Jonathan Wrigley is Associate Director of the Cell Reagents & Assay Development group within Discovery Sciences at AstraZeneca, Cambridge, UK. The group is responsible for the provision of cells, cellular reagents and development of cellular assays for a wide range of therapeutic areas, and includes the generation of cell lines through precise genome editing. Previously, Jonathan obtained a BSc in Biochemistry & Molecular Biology from the University of Leeds and subsequently continued his studies there, obtaining a PhD on the biochemical characterisation of retinal degeneration. In 2000, Jonathan joined Merck, Sharp and Dohme, working at the Neuroscience Research centre in Harlow, during which time he published work on the characterisation and targeting of gamma-secretase for Alzheimer’s disease. In 2006 Jonathan joined AstraZeneca, initially leading a team in Oncology Lead Generation and later in the Assay Sciences group, before moving into his current role in 2011.
Dr Marcello Maresca is an Associate Principal Scientist in the Transgenic function in the Reagents & Assay Development department. He has helped to implement the genome editing platform in AstraZeneca and he is currently working on the strategy and generation of transgenic animals and reagents for genome edited cell line generation activities at AstraZeneca. He obtained his PhD in molecular biology from the Max Planck Institute of Molecular Cell Biology and Genetics in Dresden, where he worked on the development of the most used genome editing tool in bacteria, recombineering. He did his postdoctoral work at the Novartis Institute of BioMedical Research where he developed new methods of genome editing in mammals using ZFNs, TALENs and the CRISPR system.
Dr Karen Birmingham is the Global Science Media Relations Director at AstraZeneca. She works alongside R&D teams to understand their work and share this with the media to explain what science can do to deliver innovative medicines to patients. Karen joined AstraZeneca from GSK where she was most recently the communications and engagement lead for their bioelectronic medicines programme. Previously, Karen worked in science journalism and was Senior Editor at Nature Medicine, the Editor of three healthcare journals for the Royal College of Nursing and a reporter on Scrip. She obtained a BSc in Pharmacology from the University of Leeds and a PhD in neuroscience from the University of Bath.
Dr Mohammad Bohlooly-Y is an Associate Director working as head of the Transgenic function in the Reagents & Assay Development department. His team has global responsibility for generation of transgenic animals, transgenic in vivo mouse physiology validation and reagents for PGE based cell line generation activities at AstraZeneca. His team supports all Innovative Medicine areas in AstraZeneca and is based in Sweden. He obtained his PhD in physiology from Gothenburg University and joined AstraZeneca in 2001. He has published more than 50 papers in peer-reviewed journals and served as a reviewer on several scientific journals.
Dr Lorenz Mayr joined AstraZeneca in September 2012 as Vice-President, Reagents & Assay Development with global responsibility for generation of biological reagents and assay development. This encompasses the generation of proteins and cell lines for hit finding, hit-to-lead and lead optimisation activities including structure & biophysics activities across all therapeutic areas, the generation of tool antibodies, transgenic animals, stem cells and primary cells as tools for target validation studies and lead optimisation programmes. His department in the UK and Sweden is responsible for assay development activities for biochemical, cell-based and phenotypic assays for all therapeutic areas at AstraZeneca. Before that, he worked as Executive Director at Novartis Pharma in Basel, Switzerland, at Bayer Pharma Research in Wuppertal, Germany, at Bayer Central Research in Leverkusen, Germany and at the MIT/Whitehead Institute in Cambridge, Massachusetts (USA). He has published more than 50 papers in peer-reviewed journals and serves on several editorial and scientific advisory boards, including two terms at the Board of Directors for the Society of Biomolecular Sciences (2004-11) and working as the Conference Chair of the MipTec Drug Discovery Conference, Europe’s largest drug discovery event, held in Basel, Switzerland.
1 Carroll, D. Genome Engineering with Targetable Nucleases. Annu Rev Biochem. 83, 409-439 (2014).
2 Doudna, JA and Charpentier, E. The new frontier of genome engineering with CRISPR-Cas9. Science 346, (6213):1258096. doi: 10.1126/science.1258096 (2014).
3 Ran et al. Double nicking by RNA-guided CRISPR Cas9 for enhanced genome editing specificity. Cell 154, 1380-9. doi: 10.1016/j.cell.2013.08.021 (2013).
4 Tsai, SQ et al. Dimeric CRISPR RNA-guided FokI nucleases for highly specific genome editing. Nature Biotechnology 32, 569-576. (2014).
5 Gonzalez, F et al. An iCRISPR platform for rapid, multiplexable and inducible genome editing in human pluripotent stem cells. Cell Stem Cell 15, 215–226 (2014).
6 Qi, LS et al. Repurposing CRISPR as an RNA-guided platform for sequence specific control of gene expression. Cell 152, 1173-1183 (2013).
7 Gilbert, et al. CRISPR-mediated modular RNA-guided regulation of transcription in eukaryotes. Cell 154, 442-451 (2013).
8 Koike-Yusa et al. Genome-wide recessive screening in mammalian cells with a lentiviral CRISPR-guide RNA library. Nature Biotechnology 32, 267-277 (2014).
9 Gilbert, LA et al. Genome-scale CRISPR-mediated control of gene repression and activation. Cell 159, 1-15 (2014).
10 Fetter, J et al. Endogenous gene tagging with fluorescent proteins. Chromosomal Mutagenesis, Methods in Molecular Biology vol. 1239, Chapter 12. (2015).
11 Tanenbaum, ME et al. A protein tagging system for signal amplification in gene expression and fluorescence imaging. Cell 159, 635-646 (2014).
12 Platt, RJ et al. CRISPR-Cas9 Knockin Mice for genome editing and cancer modelling. Cell 159, 440-455 (2014).
13 Yin, H et al. Genome editing with Cas9 in adult mice corrects a disease mutation and phenotype. Nature Biotechnology 32, 551-553 (2014).