Overview of the BioCreative VI Precision Medicine Track: mining protein interactions and mutations for precision medicine
The Precision Medicine Initiative is a multicenter effort that aims to formulate personalized treatments by leveraging individual patient data (clinical, genome sequence and functional genomic data) together with the information in large knowledge bases (KBs) that integrate genome annotation, disease...
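Main Authors: | Islamaj Doğan, Rezarta; Kim, Sun; Chatr-aryamontri, Andrew; Wei, Chih-Hsuan; Comeau, Donald C; Antunes, Rui; Matos, Sérgio; Chen, Qingyu; Elangovan, Aparna; Panyam, Nagesh C; Verspoor, Karin; Liu, Hongfang; Wang, Yanshan; Liu, Zhuang; Altınel, Berna; Hüsünbeyi, Zehra Melce; Özgür, Arzucan; Fergadis, Aris; Wang, Chen-Kai; Dai, Hong-Jie; Tran, Tung; Kavuluru, Ramakanth; Luo, Ling; Steppi, Albert; Zhang, Jinfeng; Qu, Jinchan; Lu, Zhiyong |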
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2019 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6348314/ https://www.ncbi.nlm.nih.gov/pubmed/30689846 http://dx.doi.org/10.1093/database/bay147 |
_version_ | 1783390078206214144 |
---|---|
author | Islamaj Doğan, Rezarta; Kim, Sun; Chatr-aryamontri, Andrew; Wei, Chih-Hsuan; Comeau, Donald C; Antunes, Rui; Matos, Sérgio; Chen, Qingyu; Elangovan, Aparna; Panyam, Nagesh C; Verspoor, Karin; Liu, Hongfang; Wang, Yanshan; Liu, Zhuang; Altınel, Berna; Hüsünbeyi, Zehra Melce; Özgür, Arzucan; Fergadis, Aris; Wang, Chen-Kai; Dai, Hong-Jie; Tran, Tung; Kavuluru, Ramakanth; Luo, Ling; Steppi, Albert; Zhang, Jinfeng; Qu, Jinchan; Lu, Zhiyong |
author_sort | Islamaj Doğan, Rezarta |
collection | PubMed |
description | The Precision Medicine Initiative is a multicenter effort that aims to formulate personalized treatments by leveraging individual patient data (clinical, genome sequence and functional genomic data) together with the information in large knowledge bases (KBs) that integrate genome annotation, disease association studies, electronic health records and other data types. The biomedical literature provides a rich foundation for populating these KBs, reporting genetic and molecular interactions that provide the scaffold for the cellular regulatory systems and detailing the influence of genetic variants on these interactions. The goal of the BioCreative VI Precision Medicine Track was to extract this particular type of information. The track was organized into two tasks: (i) a document triage task, focused on identifying scientific literature containing experimentally verified protein–protein interactions (PPIs) affected by genetic mutations, and (ii) a relation extraction task, focused on extracting the affected interactions (protein pairs). To assist system developers and task participants, a large-scale corpus of PubMed documents was manually annotated for this task. Ten teams worldwide contributed 22 distinct text-mining models for the document triage task, and six teams worldwide contributed 14 different text-mining systems for the relation extraction task. When the text-mining system predictions were compared with human annotations for the triage task, the best F-score was 69.06%, the best precision was 62.89%, the best recall was 98.0% and the best average precision was 72.5%. For the relation extraction task, when taking homologous genes into account, the best F-score was 37.73%, the best precision was 46.5% and the best recall was 54.1%. Submitted systems explored a wide range of methods, from traditional rule-based, statistical and machine learning systems to state-of-the-art deep learning methods. Given the level of participation and the individual team results, we find the Precision Medicine Track to be successful in engaging the text-mining research community. The track also produced a manually annotated corpus of 5509 PubMed documents, developed by BioGRID curators and relevant for precision medicine. The data set is freely available to the community, and the specific interactions have been integrated into the BioGRID data set. In addition, this challenge provided the first results of automatically identifying PubMed articles that describe PPIs affected by mutations, as well as extracting the affected relations from those articles. Still, much progress is needed for computer-assisted precision medicine text mining to become mainstream. Future work should focus on addressing the remaining technical challenges and incorporating the practical benefits of text-mining tools into real-world precision medicine information-related curation. |
format | Online Article Text |
id | pubmed-6348314 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-6348314 2019-01-31 Database (Oxford) Original Article Oxford University Press 2019-01-28 /pmc/articles/PMC6348314/ /pubmed/30689846 http://dx.doi.org/10.1093/database/bay147 Text en Published by Oxford University Press 2019. This work is written by US Government employees and is in the public domain in the US. |
title | Overview of the BioCreative VI Precision Medicine Track: mining protein interactions and mutations for precision medicine |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6348314/ https://www.ncbi.nlm.nih.gov/pubmed/30689846 http://dx.doi.org/10.1093/database/bay147 |
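
The description above reports precision, recall, F-score and average precision for the document triage task. As a minimal sketch of how such figures are commonly computed, the Python below uses the standard set-based definitions and a common ranked-list definition of average precision. The function names and the toy document IDs are hypothetical, and the record does not state the exact evaluation script used by the track, so this should not be read as the organizers' implementation.

```python
from typing import Sequence, Set, Tuple


def precision_recall_f1(predicted: Set[str], gold: Set[str]) -> Tuple[float, float, float]:
    """Set-based precision, recall and F1 (harmonic mean of the two)."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


def average_precision(ranked: Sequence[str], gold: Set[str]) -> float:
    """Average precision over a ranked list.

    Assumption: precision is taken at the rank of each relevant document
    and the sum is divided by the total number of relevant documents.
    """
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in gold:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(gold) if gold else 0.0


# Toy data (hypothetical IDs): three relevant documents, four returned in ranked order.
gold_docs = {"d1", "d3", "d4"}
ranked_docs = ["d1", "d2", "d3", "d5"]

p, r, f = precision_recall_f1(set(ranked_docs), gold_docs)
ap = average_precision(ranked_docs, gold_docs)
print(f"precision={p:.4f} recall={r:.4f} f1={f:.4f} average_precision={ap:.4f}")
```

Precision, recall and F1 depend only on which documents are returned, while average precision also rewards placing relevant documents near the top of the ranking.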