Received: July 23, 2018; Published: July 31, 2018
*Corresponding author: Jae-Eun Lee, Division of Biobank for Health Sciences, Korea National Institute of Health, Korea
Medicine is shifting toward personalized and precision medicine based on individual genetic, environmental, clinical, and lifestyle characteristics. The biobank is an essential infrastructure for the successful implementation of personalized and precision medicine. Research on developing artificial intelligence (AI) technology for personalized and precision medicine is being actively conducted; however, research on using AI in biobanking remains scarce. This article presents current issues in biobanking and the future possibilities of AI in biobanking.
Keywords: Artificial intelligence, Biobank, Biobanking, Precision biobanking, Precision medicine
Abbreviations: AI: Artificial Intelligence, ML: Machine Learning, NLP: Natural Language Processing, SOPs: Standard Operating Procedures, STR: Short Tandem Repeat, SNP: Single Nucleotide Polymorphism
It is important to establish a collection plan for biosamples and related data in view of research trends, preanalytical and analytical variables, disease trends, and health-related information. In particular, the biobank must control the entire lifecycle of biosamples, because biosamples such as serum, plasma, urine, and tissue may be affected by preanalytical variables (e.g., biosample collection, processing, transport, and storage conditions) or analytical variables (e.g., the type of analyte and the method of analysis) [1-4]. The biobank may also need to develop and manage dynamic consent for future biomedical research. Dynamic consent makes it possible to reduce the cost of recruiting and managing biobank participants, and to continuously secure biosamples and related information (e.g., electronic clinical records and life log data) for the follow-up of participants. It also allows research to be carried out flexibly, reflecting new analytical techniques. Current major issues in biobanking for personalized and precision medicine are summarized as follows:
a) Developing and managing dynamic consent
b) Collecting, processing, transporting, storing, and distributing biosamples in consideration of preanalytical and analytical variables
c) Securing information (such as timestamp) on the entire lifecycle of biosamples
d) Selecting and classifying collected biosamples suitable for specific intended uses
e) Establishing a biosample collection plan in consideration of research and disease trends and the inventory status of biosamples
Although research on applying AI to various fields, including medicine, is being actively conducted, research on using AI in biobanking remains scarce. However, I believe that in the foreseeable future, a new generation of biobanking using AI will be launched. AI systems are able to process large amounts of data simultaneously and rapidly, and to learn from each incremental case to continually improve accuracy. AI techniques include machine learning (ML) methods and natural language processing (NLP) techniques. ML and NLP techniques extract information from structured data (such as images, genetic data, and electrophysiological recordings) and unstructured data (such as clinical notes and the literature), respectively. ML techniques can reveal complex relationships. NLP techniques translate text-based data into structured data that can be analyzed using ML techniques. In the medical field, AI applications could be used for various tasks, including disease diagnosis, outcome prediction, and medical image analysis. Indeed, Watson, a question-answering AI computer system that can answer questions posed in natural language, was developed in IBM’s DeepQA project. IBM Watson comes in several variants, including Watson for Genomics, which interprets genetic data, and Watson for Oncology, which recommends treatments for cancer patients. AI also has the potential to play a variety of roles in supporting people working in biobanks.
Dynamic consent in biobank research will be acquired and managed through web-based communication between AI and biobank participants; for example, AI-based systems could read and explain the contents of the consent form to participants and answer their questions. When a participant withdraws consent, the AI system could discard the biosample-related data and ask the biobank’s administrator to destroy the participant’s biosamples. It could also report research progress to participants in real time. Biosamples are useful if they are obtained in a standardized way. AI will develop standard operating procedures (SOPs) or standardized criteria for acquiring biosamples suitable for specific intended uses by analyzing the literature on preanalytical and analytical variables for each type of biosample; for example, Marzi et al. proposed that, to detect microRNAs for lung cancer diagnosis, blood should be allowed to clot for 2-3 h at room temperature and the serum should be separated immediately after centrifugation. AI could establish serum sampling conditions for a microRNA-based clinical test for lung cancer diagnosis by analyzing similar accumulated research results. AI systems will collect and manage information on the history of collection, processing, transport, and storage of biosamples. This information can be used to select biosamples suitable for particular research purposes. AI has the potential to interpret various types of medical image data (e.g., magnetic resonance imaging, radiography, and ultrasound imaging). AI systems will extract significant information from the electronic medical records of biobank participants and will collect information about health status by analyzing participants’ medical image data.
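As an illustration of how recorded lifecycle timestamps could be used to select biosamples, the following is a minimal rule-based sketch in Python, not an AI system. It checks serum records against the clotting condition cited above (2-3 h at room temperature). The record fields, the sample identifiers, and the simplification that clotting time equals the interval between collection and centrifugation are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class SerumRecord:
    """Hypothetical lifecycle record for one serum biosample."""
    sample_id: str
    collected_at: datetime
    centrifuged_at: datetime
    clot_temperature: str  # assumed coding: "room" or "4C"


def clot_duration(record: SerumRecord) -> timedelta:
    # Assumption: clotting time is the interval from collection to centrifugation.
    return record.centrifuged_at - record.collected_at


def meets_clotting_criterion(
    record: SerumRecord,
    min_clot: timedelta = timedelta(hours=2),
    max_clot: timedelta = timedelta(hours=3),
) -> bool:
    """True if the sample clotted at room temperature for 2-3 h."""
    return (
        record.clot_temperature == "room"
        and min_clot <= clot_duration(record) <= max_clot
    )


# Illustrative records only; these are not real data.
records = [
    SerumRecord("S-001", datetime(2018, 7, 2, 9, 0), datetime(2018, 7, 2, 11, 30), "room"),
    SerumRecord("S-002", datetime(2018, 7, 2, 9, 0), datetime(2018, 7, 2, 9, 40), "room"),
    SerumRecord("S-003", datetime(2018, 7, 2, 9, 0), datetime(2018, 7, 2, 11, 30), "4C"),
]

eligible = [r.sample_id for r in records if meets_clotting_criterion(r)]
print(eligible)
```

In practice the thresholds would come from the SOP established for the intended use rather than being hard-coded defaults.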
AI will define and measure the quality of biosamples; for example, AI systems could assess DNA integrity from DNA gel electrophoresis images and could determine the percentage of tumor and necrosis from digital histopathology images of tissue samples. The results of short tandem repeat (STR) analysis and single nucleotide polymorphism (SNP) genotyping conducted for quality control of biosamples could be used to judge whether they match the participants’ gender information or DNA sequencing data. In addition, AI will establish a biosample collection plan for future biomedical research by analyzing the biobank’s distribution and inventory status and research trends (such as publication and patent trends in biomedical research). As biosamples are used for research, empty slots appear irregularly in biosample storage equipment. If the AI system is linked with an automated sample storage system, it will relocate biosamples to use storage space efficiently.
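The gender concordance check described above can be sketched as a simple rule, shown here in Python. This sketch assumes that the STR panel reports the amelogenin sex marker, as many commercial STR kits do, and that an X-only call indicates female and an X/Y call indicates male. The field names and sample values are hypothetical, and a real quality control pipeline would need to handle atypical genotypes.

```python
from typing import Optional


def sex_from_amelogenin(alleles: tuple) -> Optional[str]:
    """Infer sex from an amelogenin call; return None if inconclusive."""
    observed = set(alleles)
    if observed == {"X"}:
        return "female"
    if observed == {"X", "Y"}:
        return "male"
    return None  # e.g. missing or unexpected call


def flag_discordant(samples: list) -> list:
    """Return (sample_id, recorded_sex, inferred_sex) for samples needing review."""
    flagged = []
    for sample in samples:
        inferred = sex_from_amelogenin(sample["amelogenin"])
        if inferred is None or inferred != sample["recorded_sex"]:
            flagged.append((sample["sample_id"], sample["recorded_sex"], inferred))
    return flagged


# Illustrative records only; these are not real data.
samples = [
    {"sample_id": "D-101", "recorded_sex": "female", "amelogenin": ("X", "X")},
    {"sample_id": "D-102", "recorded_sex": "female", "amelogenin": ("X", "Y")},
    {"sample_id": "D-103", "recorded_sex": "male", "amelogenin": ()},
]

for entry in flag_discordant(samples):
    print(entry)
```

Flagged samples are only candidates for manual review; a mismatch may reflect a labeling error, a sample swap, or a data entry error in the participant record.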
AI applications will analyze the contents of a research proposal and then recommend biosamples suitable for its specific uses. For this, AI could extract important information (e.g., the type of biosample, the type of analyte, the method of analysis, the target disease, and the research purpose) from the research proposal and could analyze the literature on preanalytical and analytical variations related to these elements. Next, AI could select biosamples suitable for the study in consideration of biosample collection, processing, and storage details, results of quality control, and participants’ clinical information. Biobanks are able to assess the value of their biosamples through bibliographic analysis of publications and patents specifying the use of their biosamples. AI systems will extract publications and patents specifying the use of the biobank’s biosamples from bibliographic databases (e.g., Scopus and Embase) and will analyze their research purposes.
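The selection step described above can be sketched as a filter over the biobank inventory, shown here in Python. The sketch assumes that the requirements have already been extracted from the proposal (the NLP step is not shown). All class names, fields, and values are hypothetical and serve only to illustrate the matching logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Biosample:
    """Hypothetical inventory entry."""
    sample_id: str
    sample_type: str
    disease: str
    storage_temp_c: int
    qc_passed: bool


@dataclass(frozen=True)
class ProposalRequirements:
    """Hypothetical requirements, assumed already extracted from a proposal."""
    sample_type: str
    disease: str
    max_storage_temp_c: int


def recommend(inventory: list, req: ProposalRequirements) -> list:
    """Return IDs of biosamples that satisfy every requirement and passed QC."""
    return [
        s.sample_id
        for s in inventory
        if s.sample_type == req.sample_type
        and s.disease == req.disease
        and s.storage_temp_c <= req.max_storage_temp_c
        and s.qc_passed
    ]


# Illustrative inventory only; these are not real data.
inventory = [
    Biosample("B-001", "serum", "lung cancer", -80, True),
    Biosample("B-002", "serum", "lung cancer", -20, True),
    Biosample("B-003", "plasma", "lung cancer", -80, True),
    Biosample("B-004", "serum", "lung cancer", -80, False),
]

requirements = ProposalRequirements("serum", "lung cancer", max_storage_temp_c=-70)
print(recommend(inventory, requirements))
```

A hard filter like this is the simplest possible choice; a deployed system might instead rank candidates by how closely their preanalytical history matches the planned analysis.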
This article describes how AI technology can promote standardization and innovation in biobanking. AI systems will continue to evolve with the development of big data analysis technology. In the foreseeable future, AI technology will enable precision biobanking by supporting the work of people in the biobank. The development of AI systems for biobanking should proceed in a direction that addresses the challenges presented in this article.