Informatics
Drug Discovery World
Blockchain technology in drug discovery: use-cases in R&D
By Dr Richard Shute
Fall 2017

There is no doubt that the interest and hype around blockchain or distributed ledger technology (DLT) builds daily. The explosion of articles and books recently is testament to this. But what is a blockchain and how might DLT impact healthcare, Pharma and R&D?

Running through the profusion of publications on blockchain technology in the traditional press (1), online (2) and via social media (3), there is a thread that speculates on how DLT will disrupt most, if not all, industries (4) – building out from FinTech and bitcoin (5), which is where blockchain first started, to industries as diverse as diamonds, music and real estate (6). But what is a blockchain? For the sake of this article, which is focused on what you can do with the technology, ie the use-cases, rather than the ‘nuts and bolts’ of what the technology is, a good, relatively digestible definition or description is as follows (7):

“A blockchain is a digital, distributed transaction ledger, with identical copies maintained on multiple computer systems controlled by different entities. Anyone participating in a blockchain can review the entries in it; users can update the blockchain only by consensus of a majority of participants. Once entered into a blockchain, information can never be erased; ideally, a blockchain contains an accurate and verifiable record of every transaction ever made.”

Throughout these articles, one industry routinely appears listed as ripe for disruption by DLT – healthcare (8). Indeed, a recent article has identified 10 new blockchain-based healthcare startups (9); there are many more! But while there are 10, 20, 100 or more would-be blockchain startups rushing to get their Initial Coin Offerings (ICOs) (10) out there so that they can put their killer idea into practice, it would seem sensible to at least try to identify the sweet-spots within any industry, and particularly healthcare, where blockchain technology might actually make a difference. To that end, a few more sober and hype-reduced articles have been published on the use-cases of DLT that might make a real difference to industry (11).

To help provide some more insight into those use-cases that might be more beneficial and compelling within the healthcare industry, and more specifically in drug discovery, in May 2017 Curlew Research facilitated a workshop on blockchain technology at BioIT World Expo in Boston (12). One session of the workshop (13) used the knowledge of the nearly 30 attendees, many of whom had Pharma and Biotech expertise, and some of whom already had blockchain experience, to try to identify some of the more compelling business processes or capabilities in the medicines and healthcare value chain from target to patient where DLT might be beneficial. In this article I expand on the suggestions and insights which came out of that workshop, focusing on potential use-cases in drug discovery R&D, from target identification up to regulatory submission, review and approval.

In a subsequent piece I will look at the post-R&D space from regulatory submission, via manufacturing, to the patient themselves.

Blockchain use-cases in drug discovery R&D

In the drug discovery R&D domain (defined for the purposes of this article as from Target ID to the end of Phase III clinical), there were a number of important use-cases identified by the attendees.

These clustered into seven higher level areas:

1. Patents and Intellectual Property (IP)

2. Genomic data management

3. Raw and refined research data

4. Collaboration

5. Clinical trials

6. Licensing

7. Electronic signatures


1. Patents and Intellectual Property (IP)

A critical part of R&D is the generation of intellectual property (IP), usually (but not exclusively) through the production of patents such as ‘composition of matter’ linked to utility. One of the earliest clear use-cases for DLT was its ability to timestamp unalterably a file or document, through storage of the hash of the file (14,15) on the blockchain. Such a file timestamping service is already available through sites such as Proofofexistence.com (16).

Currently IP is handled pre-clinically by most organisations through the use of electronic lab notebooks (ELNs). These electronic record (ER) systems are now well-accepted legally for the capturing of IP (17), as they are able to provide proof of who created the ER (ie Identity), when it was created (ie Timestamping) and what it contained on the date of creation (ie proof of Content). One additional, ideal facet would be the ability to prove that the ER has not been altered since it was created (ie Immutability). These four, critical properties: Identity, Timestamping, proof of Content (via hashing) and Immutability – what I term ITCI – are four of the fundamental use-cases provided by blockchain technology. In the area of patents, IP and ELNs, it therefore seems likely that highly secure, blockchain-enabled ELNs will be with us in the not too distant future. DLT could also enable a more fine-grained approach to patenting which could advantageously support the increasing trend towards pre-clinical drug discovery patents being filed later and with fewer compounds (18) – sometimes with only the one compound of real interest. If the ITCI data on that compound and its relevant bioactivity have been captured on a blockchain, then there is a clear, unarguable ‘stake in the ground’ for when that material has been first generated. We will have to wait and see how, or whether, this might affect ‘first to file’ (19), the patenting approach now adopted pretty much everywhere in research/discovery based industries.
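The four ITCI properties can be illustrated with a minimal sketch. It is a single-machine toy, not a distributed ledger: the names `make_record` and `chain_is_intact` are hypothetical, Identity is represented by a plain author string where a real system would use a cryptographic signature, and Immutability is approximated by linking each entry to the hash of the one before it, so that any later edit becomes detectable.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def make_record(author: str, content: bytes, prev_hash: str, timestamp: str) -> dict:
    """Build one ledger entry covering Identity, Timestamping and Content."""
    record = {
        "author": author,                     # Identity (illustrative only)
        "timestamp": timestamp,               # Timestamping
        "content_hash": sha256_hex(content),  # proof of Content
        "prev_hash": prev_hash,               # link to the previous entry
    }
    # The record's own hash covers every field above.
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    record["record_hash"] = sha256_hex(payload)
    return record


def chain_is_intact(chain: list) -> bool:
    """Recompute every hash; any edited or re-ordered entry breaks the chain."""
    prev = GENESIS
    for record in chain:
        body = {k: v for k, v in record.items() if k != "record_hash"}
        payload = json.dumps(body, sort_keys=True).encode("utf-8")
        if record["prev_hash"] != prev or record["record_hash"] != sha256_hex(payload):
            return False
        prev = record["record_hash"]
    return True


now = datetime.now(timezone.utc).isoformat()
entry1 = make_record("chemist_a", b"Compound 1: synthesis notes", GENESIS, now)
entry2 = make_record("chemist_b", b"Compound 1: assay result", entry1["record_hash"], now)
chain = [entry1, entry2]

print(chain_is_intact(chain))  # True
entry1["content_hash"] = sha256_hex(b"edited notes")  # simulate tampering
print(chain_is_intact(chain))  # False
```

Only hashes are stored in this sketch, never the experimental content itself, which is consistent with keeping the underlying IP private while still fixing a 'stake in the ground'.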

2. Genomic data management

Genomic data plays an important role in the early stages of drug discovery R&D as part of the feedback from the clinic to target identification and validation. Since the advent of the ‘$1,000 Genome’ (20) there is an ever-increasing amount of genomic data being generated: on patients by healthcare organisations, physicians and researchers; and on ‘healthy’ individuals by choice, often driven by a desire to know more about their ethnic origins and their genetic predisposition to disease. While this explosion of genomic ‘big data’ is invaluable to the research community, including those scientists in Pharma R&D, there is a risk that individuals’ genomic data – the most fundamental data about ‘you’ – might be hacked, stolen or otherwise abused if good security and privacy controls are not put in place. This use-case of security and privacy of genomic data is one where blockchain technology can play a major beneficial role; indeed the first presentation specifically on blockchain at BioIT World Expo in 2016 was on this very topic (21). More recently, an article in Forbes reinforced this use-case (22) and introduced another new blockchain start-up (23) whose aim is to offer services to manage and market genomic data securely (24). Once again, four of the fundamental properties of blockchains: Identity, Timestamping, Content and Immutability (ITCI), seem to offer great potential to make genomic data more widely and more securely available to individuals, healthcare workers and researchers.

3. Raw and refined research data

The 21st century drug discovery research process now universally involves the use of instruments and techniques such as NMR, Mass Spec, HPLC, etc, to support the laboratory-based work and to help prove or disprove the experimental products, findings and conclusions. The instruments produce what is known as raw data (the output files), which are then processed into more human-readable and interpretable files known as refined data. Raw and refined data files comprise much of the evidence that an experiment has been performed (successfully or not), and they often make up a significant part of an ELN entry, whether they are stored within the ELN system or not (eg in a scientific data management system or SDMS (25)).

One use-case that arose during the workshop concerned these raw and refined data files and relates to the growing issue of file tampering (26). If data files could be ‘stored’ on a public blockchain (eg via hashing on the bitcoin blockchain through a service such as Proofofexistence.com or Tierion (27)) then the ITCI aspects described above could be exploited and an audit trail back from publication to the original file could be established. To exemplify this further, consider an image file generated from a cellular staining experiment, or a photograph of a 2D gel; both of these are human-readable, refined data files. If, at the time of generation, the file were hashed (14) using a public hashing algorithm (eg SHA256 or MD5), and the hash stored both in an SDMS and on a public blockchain (28), then the research could later be published with the image or photograph reproduced in the paper along with its hash. If the associated electronic data file is also made available for downloading (for example via Figshare or Zenodo (29)), anyone can take that file, perform the same hashing routine and compare this hash to that published in the paper and to the same hash stored on the public blockchain. If the hashes are identical, this proves the files are identical (proof of Content) and unchanged (Immutability). It will also give the time when the original file was placed on to the blockchain (Timestamping). Blockchain enabling of ELNs, SDMSs and LIMS, combined with greater, more open availability of key, supportive research data files so that more checking for integrity via hashing and timestamping is made possible, could thus help to heal this growing sore within the scientific publication domain.
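The checking step described above can be sketched in a few lines using Python's standard `hashlib` module. The function names and the file name `gel_image.tif` are hypothetical, and the 'published' hash here simply stands in for whatever value appears in the paper or on the blockchain; looking that value up on a real chain is outside the scope of the sketch.

```python
import hashlib
import tempfile
from pathlib import Path


def file_sha256(path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hash of a file, read in chunks so large files fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, published_hash: str) -> bool:
    """Compare a downloaded file against the hash printed in the paper."""
    return file_sha256(path) == published_hash.strip().lower()


with tempfile.TemporaryDirectory() as tmp:
    image = Path(tmp) / "gel_image.tif"
    image.write_bytes(b"pretend these bytes are a 2D gel photograph")
    published = file_sha256(image)  # hash recorded at the time of generation

    print(matches_published_hash(image, published))  # True

    image.write_bytes(b"pretend these bytes are a retouched photograph")
    print(matches_published_hash(image, published))  # False
```

SHA-256 is used here rather than MD5 because practical collisions are known for MD5, which weakens it as evidence that a file has not been altered.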

4. Collaboration

One of the primary foundations of blockchain technology, as described by Satoshi Nakamoto in his/her/their original whitepaper (30), is the concept of disintermediation or the enablement of “multiple parties who do not fully trust each other to safely and directly share a single database without requiring a trusted intermediary” (11a). If, in that quotation, you replace the words “a single database” by any of the following: data, results, information or know-how, then you have the basis for a system that can support more effective collaboration between parties who would otherwise be very nervous or suspicious about working together. Drug discovery R&D is now increasingly being done by a network of organisations, often in a classic, centralised customer-supplier relationship, but more and more in a collaborative, peer-to-peer, decentralised ecosystem where information, data and results are shared more or less openly (31).

Blockchain technology, probably with off-chain storage, which is needed because current block size in most blockchains is not sufficiently big to allow full datasets to be stored on-chain, could enable this more collaborative, distributed mode of product (not just drug) discovery to become the standard research model, so superseding the historical centralised mode. DLT-enabled or merely blockchain-connected ELNs, LIMS and SDMSs as predicted above could facilitate and de-risk this research revolution without compromising critical company IP. True, global collaborative drug discovery could finally be fully enabled by the blockchain.

5. Clinical trials

Use-cases around blockchain support for the area of clinical trials and the supply chain comprising Pharma-Physician-Patient-Data have been the subject of several recent articles (32). The exact mechanisms for how DLT could positively impact clinical trials’ support are still being hotly debated. Indeed, a heavily-publicised paper describing how “blockchain-timestamped clinical trials protocols could improve the trustworthiness of medical science” has recently been retracted (33). Nevertheless, blockchain technology and its four primary facets of ITCI could most definitely smooth and facilitate the undeniably complex transactional space that is the clinical trial. DLT could have a significant impact, from patient recruitment, supply of trial medicine (or placebo), to protocol and results securing; and from billing, to more easily allowing individual physicians to run different trials for multiple Pharma. DLT may even lead to the situation where physicians are disintermediated from the clinical trials process altogether!

6. Licensing

In use-case #4 above, I discuss how DLT could help secure the newer, decentralised, more collaborative form of drug discovery. An additional arm of effective medicines’ discovery that has become more and more popular over the last 20 or more years has been the cross-licensing and joint development of clinical candidates. Two good examples from my own time in Big Pharma are lisinopril and rosuvastatin (34). Licensing in the R&D environment does not have to just involve compounds. Technology and IP – from the use of software and content to scientific techniques, and to the use of high-tech equipment – can all be subject to licensing under a variety of different contractual models, eg fixed price, measured usage, named users, royalties, ‘all-you-can-eat’, etc. Licensing is governed by the contract that sits behind the deal. Blockchain technology has huge potential in the governing and management of contracts through the use of so-called ‘smart contracts’ (35), which grew from ideas initially put forward by Nick Szabo in the 1990s (36). The DLT smart contract (SC) approach has been pioneered by the Ethereum Foundation (37). In essence, DLT SCs combine the concepts of classical contract law with the ‘stored procedure’ ideas of traditional database technology (38) but using blockchain technology. The blockchain checks that certain key contractual conditions have been met, and then enforces the subsequent contractual commitments automatically. To describe this in a little more detail (39):

“A contract is an agreement between two or more people involving conditional commitments, ie: “If you do X for me, I will do Y for you.” A legal contract makes those conditional commitments legally enforceable. If you fail to do X for me, I can take you to court and have you ordered to do X, or ordered to pay me compensation for failing to do X. A smart contract is effectively the same, only you use some technological infrastructure to ensure that conditions have been met and/or to automatically enforce commitments. This can be done using blockchain technology because the distributed ledger system can be used to confirm whether contractual conditions have been met.”

So if, for example, a pre-clinical drug candidate is cross-licensed from Institution ‘A’ to Pharma ‘B’ with royalty payments from ‘B’ to ‘A’ being dependent on say, compound submission to the FDA, this can be encoded as a smart contract. Then, as soon as the regulatory package has been received by the FDA, a payment is made. In the clinical trials domain, the use of smart contracts to manage billing and reduce fraudulent claims has also been proposed as a beneficial use-case for DLT in healthcare (32c). Patient recruitment to a trial could also be subject to a DLT SC such that the moment the target for genuine trial recruits has been reached by a physician or healthcare organisation, some sort of remuneration to that doctor or hospital could automatically be enabled. Another possible example of the use of DLT SCs (not just in healthcare and not just in R&D) could be pay-as-you-go usage of software or of a high-tech scientific instrument (40).
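The licensing example can be sketched as ordinary code to show the ‘if X, then Y’ logic that a smart contract automates. This is a plain Python simulation, not code for Ethereum or any other blockchain platform; the class `MilestoneLicence`, the `oracle` field (the party trusted to report that the milestone has happened) and the payment amount are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MilestoneLicence:
    """Toy model of a licence whose payment is released by a reported milestone."""
    licensor: str   # receives the payment (Institution 'A')
    licensee: str   # makes the payment (Pharma 'B')
    oracle: str     # hypothetical party trusted to report the milestone
    milestone: str  # event that releases the payment
    payment: int    # amount owed when the milestone is reached
    paid: bool = False
    log: list = field(default_factory=list)

    def report(self, event: str, reported_by: str) -> bool:
        """Record an event; release the payment once if the conditions are met."""
        self.log.append(f"{reported_by} reported: {event}")
        if self.paid or reported_by != self.oracle or event != self.milestone:
            return False
        self.paid = True
        self.log.append(f"{self.licensee} pays {self.payment} to {self.licensor}")
        return True


deal = MilestoneLicence(
    licensor="Institution A",
    licensee="Pharma B",
    oracle="regulator_feed",
    milestone="FDA submission received",
    payment=1_000_000,
)

# A report from anyone other than the trusted source is logged but ignored.
print(deal.report("FDA submission received", reported_by="Pharma B"))        # False
# The trusted source confirms the milestone, so the payment is released.
print(deal.report("FDA submission received", reported_by="regulator_feed"))  # True
# A repeat report cannot trigger a second payment.
print(deal.report("FDA submission received", reported_by="regulator_feed"))  # False
```

The sketch also makes one practical design question visible: the contract can only be as reliable as the source that tells it the regulatory package has actually been received.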

I have given just a few examples of the potential for blockchain-enabled smart contracts in the healthcare domain, but the huge growth in the use of the Ethereum blockchain over the last six to nine months (as measured by the increase in value of ‘ether’, the coin or token that enables one to use the Ethereum blockchain (41)) is evidence of the massive interest in DLT SCs in many industries, not just in healthcare. Time will tell where the more specific value-added uses of blockchain-based smart contracts in healthcare will make their mark.

7. Electronic signatures

The digital signing of documents has become ubiquitous in modern-day business, not just across healthcare but across multiple industries. Wherever it is critical for an individual or an organisation or even an electronic device to affix ‘something’ which confirms their identity to a document, a file, or to an experiment, etc – in this modern digital age, that ‘something’ now takes the form of a digital signature. A natural consequence of this is the concept of digital identity (42). The use of blockchain technology to support a higher level of security and immutability to next generation digital identity, and to allow individuals to take back a higher degree of control over their own identity, has become one of the biggest areas of DLT exploration over the last few years (43). Numerous start-up companies (44) and several initiatives have been established to look at DLT and digital identity, eg the Decentralised Identity Foundation (DIF) (45). Many of the more significant global technology players have also become involved in such ventures (46).

In drug discovery R&D digital signatures are used, inter alia:

- In ELNs, to confirm that an experiment has been performed by an individual and that it is understandable to a second individual who is “knowledgeable in the art”.

- In documentation that will form part of any later regulatory submission.

- In contracts.

- In Good Laboratory, Manufacturing or Clinical Practice (GLP/GMP/GCP) documents.

Digital signatures are in fact used right across the R&D domain. Any technology that can make digital signing and identity confirmation more secure and certain, and so make fraud and identity misappropriation less likely, will make the discovery and marketing of the drugs that are prescribed to patients more trustworthy and more trusted. While preliminary descriptions of the use of DLT in digital signatures have been published (47), it will probably be digital identity initiatives such as DIF that produce the new industry-standard signatures that future blockchain-enabled ELNs and document management systems will use across the R&D space, again, not just in healthcare but in many other industries.

Conclusion

In this article on blockchain technology in drug discovery, I have focused on potential use-cases in R&D. They all build upon four fundamental facets of DLT: Identity, Timestamping, Content and Immutability (ITCI). Many are supported by observations and suggestions put forward by others, both online and in the literature. Taken all together, it is clear that DLT is going to be a major technological player in healthcare and drug discovery in the future. Furthermore, that future is not too far away, according to a survey of 120 senior pharmaceutical and life science leaders conducted in June 2017 by The Pistoia Alliance (48). This survey concluded that “interest in blockchain is high with a significant 83% expecting blockchain to be adopted in under five years.” (49) When polled on the single most significant hurdle or perceived limitation that blockchain must overcome before it is adopted widely in healthcare, “the leaders identified the biggest hurdle as regulatory issues (45%), followed by concerns over data privacy (26%).” A more focused question on the specific areas where blockchain tech will have the greatest impact suggested that manufacturing and the medicines’ supply chain (68%) followed by the health records area, including genomic data (60%) were the areas likely to be most affected by DLT.

These observations suggest that blockchain technology may have its biggest impact post-R&D. I contend, however, that DLT will also disrupt the pre-submission, R&D space. Some suggestions about how and where that disruption might occur along the medicines discovery value chain are described above. Time will tell whether any of these lead to successful blockchain-based products, services or businesses, or whether some other currently unknown use-case will appear that becomes the killer app for blockchain just as email was for the internet (50). Rest assured, blockchain technology is here to stay and medicines discovery will be changed by it.

Acknowledgement

I should like to thank Dr Nick Lynch of Curlew Research for his help, advice and support in the writing of this article.

---

Dr Richard Shute is an experienced medicinal chemist and informatics IS/IT manager. His PhD was on β-lactam antibiotics from Nottingham University. Richard worked for more than 25 years in Big Pharma at ICI, Zeneca and AstraZeneca; half that time in chemistry, the other half in informatics. Richard has worked for Curlew Research as a consultant since 2015.

---

References

1 (a) “Blockchain Revolution”, Don Tapscott & Alex Tapscott, Portfolio/Penguin 2016, ISBN: 978-0-141-23785-4; (b) http://www.telegraph.co.uk/technology/news/10881213/Thecoming-digital-anarchy.html; (c) https://www.theguardian.com/world/2016/jul/07/blockchainanswer-life-universeeverything-bitcoin-technology; (d) https://www.economist.com/news/world-if/21724906trust-business-little-noticedhuge-startups-deployingblockchain-technologythreaten.

2 (a) https://hbr.org/2017/01/the-truth-about-blockchain; (b) https://www.forbes.com/sites/forbestechcouncil/2017/02/16/how-blockchain-willevolve-in-2017/#306322876343; (c) https://www.wired.com/2017/03/forget-bitcoin-blockchain-revealwhats-true-today-tomorrow/; (d) http://www.coindesk.com/information/; (e) http://www.coindesk.com/research/stateblockchain-q1-2017/.

3 https://www.linkedin.com/pulse/8-lessons-blockchainyou-wont-read-anywhere-elsejiri-kram.

4 (a) “The Business Blockchain”, William Mougayar, Wiley & Sons 2016, ISBN: 978-1-119-30031-1; (b) http://in.pcmag.com/amazon-webservices/112363/feature/blockchain-the-invisible-technologythats-changing-the-world.

5 https://www2.deloitte.com/us/en/pages/aboutdeloitte/articles/pressreleases/deloitte-surveyblockchain-reaches-beyondfinancial-services-with-someindustries-moving-faster.html.

6 http://www.nasdaq.com/article/how-the-blockchain-isbeing-used-beyond-bitcoinand-finance-cm571563.

7 http://deloitte.wsj.com/cfo/2016/02/26/beyond-bitcoinblockchain-is-coming-todisrupt-your-industryweekend-reading/.

8 (a) https://public.dhe.ibm.com/common/ssi/ecm/gb/en/gbe03790usen/GBE03790USEN.PDF; (b) https://www.healthit.gov/sites/default/files/8-31blockchain-ibm_ideationchallenge_aug8.pdf; (c) https://hashedhealth.com/blockchain-101-healthcare/; (d) https://hashedhealth.com/hashed-health-blockchain-proofconcept/; (e) http://www.healthcareitnews.com/news/blockchain-faces-toughroadblocks-healthcare.

9 https://www.intelligenthq.com/innovationmanagement/blockchainhealthcare-startups/.

10 (a) https://www.forbes.com/sites/laurashin/2017/05/16/icos-why-people-are-investingin-this-380-millionphenomenon; (b) http://www.coindesk.com/an-amoraldefense-of-blockchain-tokensas-an-ok-thing/; (c) https://www2.deloitte.com/us/en/pages/consulting/articles/initialcoin-offering-a-newparadigm.html.

11 (a) http://www.multichain.com/blog/2016/05/fourgenuine-blockchain-use-cases/; (b) https://bravenewcoin.com/news/moodys-new-reportidentifies-25-top-blockchainuse-cases-from-a-list-of-120/; (c) https://everisnext.com/2016/05/31/blockchaindisruptive-use-cases/; (d) https://www.coindesk.com/four-quadrants-dividingconquering-crypto-universe.

12 http://curlewresearch.com/curlew-blockchain-workshopbioit-2017/.

13 https://www.slideshare.net/RichardShute1/an-introductionto-blockchain-in-healthcare/.

14 http://unixwiz.net/techtips/iguide-crypto-hashes.html.

15 There are many hashing algorithms available and many online sites that can take files and generate the relevant hash. See for example http://hash.online-convert.com/sha256-generator which calculates the SHA256 hash for any given inputted file.

16 https://proofofexistence.com/about.

17 http://accelrys.com/micro/notebook/documents/eln-legalissues-sandercock.pdf.

18 I use the term ‘compound’ in this article to cover small molecules, the traditional drugs of Pharma, as well as the more modern large molecule therapeutics including biomolecules, antibodies, RNA-derived materials (eg siRNA), proteins, etc.

19 https://en.wikipedia.org/wiki/First_to_file_and_first_to_invent.

20 (a) http://www.nature.com/news/technology-the-1-000genome-1.14901; (b) http://www.bio-itworld.com/2016/3/28/how-veritasgenetics-plans-make-999whole-genome-stick.html.

21 https://www.slideshare.net/RichardShute1/securingpersonal-genomic-data-res.

22 https://www.forbes.com/sites/patricklin/2017/05/08/blockchain-the-missing-linkbetween-genomics-and-privacy.

23 http://www.encrypgen.com.

24 Disclaimer: Curlew Research staff act as professional advisors to Encrypgen LLC.

25 https://www.limswiki.org/index.php/Scientific_data_management_system.

26 http://go.nature.com/2toIbSl.

27 https://tierion.com/.

28 Depositing the hash of a file on to a public blockchain does not constitute publication of the data or the results contained within the file.

29 (a) https://figshare.com/; (b) http://about.zenodo.org/.

30 https://bitcoin.org/en/bitcoin-paper.

31 (a) http://onlinelibrary.wiley.com/doi/10.1207/s15516709cog2102_1/epdf; (b) http://csmres.co.uk/cs.public.upd/article-downloads/Khanna.pdf; (c) http://eu.wiley.com/WileyCDA/WileyTitle/productCd0470917377.html; (d) https://www.yahoo.com/news/seven-pharma-companiespartner-gates-153531737.html.

32 (a) https://www.linkedin.com/pulse/blockchain-use-caseshealthcare-anca-petre – use-case #2; (b) https://f1000research.com/articles/6-66/v3; (c) https://www.forbes.com/sites/reenitadas/2017/05/08/does-blockchain-have-a-place-inhealthcare – use-cases #1 & #4.

33 https://f1000research.com/articles/5-222/v3.

34 Zestril® and Crestor®, both licensed by Zeneca Pharmaceuticals (now AstraZeneca) from Merck and Sumitomo respectively – both went on to be ‘blockbuster’ drugs.

35 https://medium.com/blockchain-dreams/smartcontracts-101-really-smart59f76cdd06ef.

36 http://www.alamut.com/subj/economics/nick_szabo/smartContracts.html.

37 https://ethereum.org/foundation.

38 https://en.wikipedia.org/wiki/Stored_procedure.

39 http://hplusmagazine.com/2015/11/18/blockchaintechnology-smart-contractsand-smart-property/.

40 If one considers a car to be a ‘high-tech instrument’ then the use of blockchain technology to manage and keep track of the usage of self-driving cars is another (albeit non-healthcare) use-case that has been seriously proposed for the future: http://www.coindesk.com/blockchainmove-self-driving-cars-fastlane/.

41 http://coinmarketcap.com/currencies/ethereum/.

42 (a) https://www.techopedia.com/definition/23915/digitalidentity; (b) https://www.adelaide.edu.au/press/titles/digital-identity/Digital_Identity_Ebook.pdf.

43 (a) https://mydigitalblockchain.com/2016/04/06/howthe-blockchain-could-becomethe-next-e-signature/; (b) http://www.coindesk.com/docusign-founder-sees-blockchaintech-potential-identitymanagement/.

44 https://letstalkpayments.com/22-companies-leveragingblockchain-for-identitymanagement-andauthentication/.

45 https://decentralisedidentity.github.io/.

46 (a) https://www.ibm.com/blockchain/identity/; (b) http://www.coindesk.com/blockchain-consortium-drawsenterprise-giants-torevolutionize-digital-identity/.

47 https://blog.signatura.co/using-the-blockchain-as-adigital-signature-schemef584278ae826.

48 http://www.pistoiaalliance.org.

49 http://www.drugdiscoverytoday.com/view/46077/83-oflife-science-leaders-believeblockchain-will-be-adoptedwithin-five-years-finds-surveyfrom-the-pistoia-alliance/.

50 http://radar.oreilly.com/2013/12/email-the-internetsfirst-and-last-killer-app.html.