Global text mining and development of pharmacogenomic knowledge resource for precision medicine

Debleena Guin, Jyoti Rani, Priyanka Singh, Sandeep Grover, Shivangi Bora, Puneet Talwar, Muthusamy Karthikeyan, K. Satyamoorthy, C. Adithan, S. Ramachandran, Luciano Saso, Yasha Hasija, Ritushree Kukreti

Research output: Contribution to journal (Article)

Abstract

Understanding patients' genomic variations and their effect in protecting or predisposing them to drug response phenotypes is important for providing personalized healthcare. Several studies have manually curated such genotype-phenotype relationships into organized databases from clinical trial data or published literature. However, there are no text mining tools available to extract high-accuracy information from such existing knowledge. In this work, we used a semiautomated text mining approach to retrieve a complete pharmacogenomic (PGx) resource integrating disease-drug-gene-polymorphism relationships to derive a global perspective for ease in therapeutic approaches. We used an R package, pubmed.mineR, to automatically retrieve PGx-related literature. We identified 1,753 disease types and 666 drugs, associated with 4,132 genes and 33,942 polymorphisms collated from 180,088 publications. With further manual curation, we obtained a total of 2,304 PGx relationships. We evaluated the performance of our approach (precision = 0.806) against benchmark datasets such as Pharmacogenomic Knowledgebase (PharmGKB) (0.904), Online Mendelian Inheritance in Man (OMIM) (0.600), and the Comparative Toxicogenomics Database (CTD) (0.729). We validated our study by comparing our results with 362 commercially used US Food and Drug Administration (FDA)-approved drug labeling biomarkers. Of the 2,304 PGx relationships identified, 127 belonged to the FDA list of 362 approved pharmacogenomic markers, indicating that our semiautomated text mining approach may reveal significant PGx information with markers for drug response prediction. In addition, it is a scalable and state-of-the-art approach to curation for PGx clinical utility.
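The abstract reports precision figures and an overlap with the FDA marker list but does not say how they were computed. The following is a minimal sketch that assumes the standard definition of precision (true positives divided by all mined relationships). The function name `precision` and the placeholder drug-gene pairs are hypothetical and are not taken from the study; only the counts 127 and 362 come from the abstract.

```python
def precision(mined, benchmark):
    """Fraction of mined relationships that also appear in the benchmark set.

    Assumes the standard definition TP / (TP + FP); the abstract does not
    state the exact procedure the authors used.
    """
    mined = set(mined)
    if not mined:
        raise ValueError("no mined relationships to evaluate")
    true_positives = mined & set(benchmark)
    return len(true_positives) / len(mined)


# Hypothetical placeholder (drug, gene) pairs, for illustration only.
mined_pairs = {
    ("drugA", "GENE1"),
    ("drugB", "GENE2"),
    ("drugC", "GENE3"),
    ("drugD", "GENE4"),  # not in the benchmark, so it counts as a false positive
}
benchmark_pairs = {
    ("drugA", "GENE1"),
    ("drugB", "GENE2"),
    ("drugC", "GENE3"),
    ("drugE", "GENE5"),
}

toy_precision = precision(mined_pairs, benchmark_pairs)

# Counts reported in the abstract: 127 of the 362 FDA-listed markers
# appeared among the curated PGx relationships.
fda_overlap = 127 / 362

print(f"toy precision: {toy_precision:.3f}")
print(f"FDA marker overlap: {fda_overlap:.3f}")
```

Read this way, the FDA comparison corresponds to roughly 35% of the 362 listed markers being recovered, which is a coverage figure over the FDA list rather than a precision over the 2,304 curated relationships.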

Original language: English
Article number: 839
Journal: Frontiers in Pharmacology
Volume: 10
Issue number: JULY
DOIs: https://doi.org/10.3389/fphar.2019.00839
Publication status: Published - 01-01-2019

Fingerprint

Precision Medicine
Data Mining
Pharmacogenetics
United States Food and Drug Administration
Pharmaceutical Preparations
Drug Labeling
Toxicogenetics
Databases
Genetic Databases
Phenotype
Benchmarking
Knowledge Bases
PubMed
Genes
Publications
Biomarkers
Genotype
Clinical Trials
Delivery of Health Care
Therapeutics

All Science Journal Classification (ASJC) codes

  • Pharmacology
  • Pharmacology (medical)

Cite this

Guin, D., Rani, J., Singh, P., Grover, S., Bora, S., Talwar, P., ... Kukreti, R. (2019). Global text mining and development of pharmacogenomic knowledge resource for precision medicine. Frontiers in Pharmacology, 10(JULY), [839]. https://doi.org/10.3389/fphar.2019.00839
Guin, Debleena ; Rani, Jyoti ; Singh, Priyanka ; Grover, Sandeep ; Bora, Shivangi ; Talwar, Puneet ; Karthikeyan, Muthusamy ; Satyamoorthy, K. ; Adithan, C. ; Ramachandran, S. ; Saso, Luciano ; Hasija, Yasha ; Kukreti, Ritushree. / Global text mining and development of pharmacogenomic knowledge resource for precision medicine. In: Frontiers in Pharmacology. 2019 ; Vol. 10, No. JULY.
@article{8245c0b038d1458fad48259b62e26d70,
title = "Global text mining and development of pharmacogenomic knowledge resource for precision medicine",
abstract = "Understanding patients' genomic variations and their effect in protecting or predisposing them to drug response phenotypes is important for providing personalized healthcare. Several studies have manually curated such genotype-phenotype relationships into organized databases from clinical trial data or published literature. However, there are no text mining tools available to extract high-accuracy information from such existing knowledge. In this work, we used a semiautomated text mining approach to retrieve a complete pharmacogenomic (PGx) resource integrating disease-drug-gene-polymorphism relationships to derive a global perspective for ease in therapeutic approaches. We used an R package, pubmed.mineR, to automatically retrieve PGx-related literature. We identified 1,753 disease types and 666 drugs, associated with 4,132 genes and 33,942 polymorphisms collated from 180,088 publications. With further manual curation, we obtained a total of 2,304 PGx relationships. We evaluated the performance of our approach (precision = 0.806) against benchmark datasets such as Pharmacogenomic Knowledgebase (PharmGKB) (0.904), Online Mendelian Inheritance in Man (OMIM) (0.600), and the Comparative Toxicogenomics Database (CTD) (0.729). We validated our study by comparing our results with 362 commercially used US Food and Drug Administration (FDA)-approved drug labeling biomarkers. Of the 2,304 PGx relationships identified, 127 belonged to the FDA list of 362 approved pharmacogenomic markers, indicating that our semiautomated text mining approach may reveal significant PGx information with markers for drug response prediction. In addition, it is a scalable and state-of-the-art approach to curation for PGx clinical utility.",
author = "Debleena Guin and Jyoti Rani and Priyanka Singh and Sandeep Grover and Shivangi Bora and Puneet Talwar and Muthusamy Karthikeyan and K. Satyamoorthy and C. Adithan and S. Ramachandran and Luciano Saso and Yasha Hasija and Ritushree Kukreti",
year = "2019",
month = "1",
day = "1",
doi = "10.3389/fphar.2019.00839",
language = "English",
volume = "10",
journal = "Frontiers in Pharmacology",
issn = "1663-9812",
publisher = "Frontiers Media S. A.",
number = "JULY",

}

Guin, D, Rani, J, Singh, P, Grover, S, Bora, S, Talwar, P, Karthikeyan, M, Satyamoorthy, K, Adithan, C, Ramachandran, S, Saso, L, Hasija, Y & Kukreti, R 2019, 'Global text mining and development of pharmacogenomic knowledge resource for precision medicine', Frontiers in Pharmacology, vol. 10, no. JULY, 839. https://doi.org/10.3389/fphar.2019.00839

Global text mining and development of pharmacogenomic knowledge resource for precision medicine. / Guin, Debleena; Rani, Jyoti; Singh, Priyanka; Grover, Sandeep; Bora, Shivangi; Talwar, Puneet; Karthikeyan, Muthusamy; Satyamoorthy, K.; Adithan, C.; Ramachandran, S.; Saso, Luciano; Hasija, Yasha; Kukreti, Ritushree.

In: Frontiers in Pharmacology, Vol. 10, No. JULY, 839, 01.01.2019.

TY - JOUR

T1 - Global text mining and development of pharmacogenomic knowledge resource for precision medicine

AU - Guin, Debleena

AU - Rani, Jyoti

AU - Singh, Priyanka

AU - Grover, Sandeep

AU - Bora, Shivangi

AU - Talwar, Puneet

AU - Karthikeyan, Muthusamy

AU - Satyamoorthy, K.

AU - Adithan, C.

AU - Ramachandran, S.

AU - Saso, Luciano

AU - Hasija, Yasha

AU - Kukreti, Ritushree

PY - 2019/1/1

Y1 - 2019/1/1

AB - Understanding patients' genomic variations and their effect in protecting or predisposing them to drug response phenotypes is important for providing personalized healthcare. Several studies have manually curated such genotype-phenotype relationships into organized databases from clinical trial data or published literature. However, there are no text mining tools available to extract high-accuracy information from such existing knowledge. In this work, we used a semiautomated text mining approach to retrieve a complete pharmacogenomic (PGx) resource integrating disease-drug-gene-polymorphism relationships to derive a global perspective for ease in therapeutic approaches. We used an R package, pubmed.mineR, to automatically retrieve PGx-related literature. We identified 1,753 disease types and 666 drugs, associated with 4,132 genes and 33,942 polymorphisms collated from 180,088 publications. With further manual curation, we obtained a total of 2,304 PGx relationships. We evaluated the performance of our approach (precision = 0.806) against benchmark datasets such as Pharmacogenomic Knowledgebase (PharmGKB) (0.904), Online Mendelian Inheritance in Man (OMIM) (0.600), and the Comparative Toxicogenomics Database (CTD) (0.729). We validated our study by comparing our results with 362 commercially used US Food and Drug Administration (FDA)-approved drug labeling biomarkers. Of the 2,304 PGx relationships identified, 127 belonged to the FDA list of 362 approved pharmacogenomic markers, indicating that our semiautomated text mining approach may reveal significant PGx information with markers for drug response prediction. In addition, it is a scalable and state-of-the-art approach to curation for PGx clinical utility.

UR - http://www.scopus.com/inward/record.url?scp=85071341662&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85071341662&partnerID=8YFLogxK

U2 - 10.3389/fphar.2019.00839

DO - 10.3389/fphar.2019.00839

M3 - Article

AN - SCOPUS:85071341662

VL - 10

JO - Frontiers in Pharmacology

JF - Frontiers in Pharmacology

SN - 1663-9812

IS - JULY

M1 - 839

ER -