Molecular profiling of thyroid cancer subtypes using large-scale text mining

BACKGROUND: Thyroid cancer is the most common endocrine tumor with a steady increase in incidence. It is classified into multiple histopathological subtypes with potentially distinct molecular mechanisms. Identifying the most relevant genes and biological pathways reported in the thyroid cancer lite...

Bibliographic Details
Main authors: Wu, Chengkun, Schwartz, Jean-Marc, Brabant, Georg, Nenadic, Goran
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4290788/
https://www.ncbi.nlm.nih.gov/pubmed/25521965
http://dx.doi.org/10.1186/1755-8794-7-S3-S3
author Wu, Chengkun
Schwartz, Jean-Marc
Brabant, Georg
Nenadic, Goran
collection PubMed
description BACKGROUND: Thyroid cancer is the most common endocrine tumor with a steady increase in incidence. It is classified into multiple histopathological subtypes with potentially distinct molecular mechanisms. Identifying the most relevant genes and biological pathways reported in the thyroid cancer literature is vital for understanding of the disease and developing targeted therapeutics. RESULTS: We developed a large-scale text mining system to generate a molecular profiling of thyroid cancer subtypes. The system first uses a subtype classification method for the thyroid cancer literature, which employs a scoring scheme to assign different subtypes to articles. We evaluated the classification method on a gold standard derived from the PubMed Supplementary Concept annotations, achieving a micro-average F1-score of 85.9% for primary subtypes. We then used the subtype classification results to extract genes and pathways associated with different thyroid cancer subtypes and successfully unveiled important genes and pathways, including some instances that are missing from current manually annotated databases or most recent review articles. CONCLUSIONS: Identification of key genes and pathways plays a central role in understanding the molecular biology of thyroid cancer. An integration of subtype context can allow prioritized screening for diagnostic biomarkers and novel molecular targeted therapeutics. Source code used for this study is made freely available online at https://github.com/chengkun-wu/GenesThyCan.
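The abstract reports a micro-average F1-score for subtype classification. As a minimal sketch of how such a score is typically computed, the Python below pools true positives, false positives, and false negatives over all articles before computing precision and recall. The function name, article ids, and label sets are hypothetical illustrations, not taken from the study, and this is not the authors' evaluation code.

```python
def micro_f1(gold, predicted):
    """Micro-averaged F1 over multi-label assignments.

    gold, predicted: dicts mapping an article id to a set of subtype labels.
    Counts are pooled across all articles before precision/recall are computed.
    """
    tp = fp = fn = 0
    for article_id in gold.keys() | predicted.keys():
        g = gold.get(article_id, set())
        p = predicted.get(article_id, set())
        tp += len(g & p)   # labels assigned and correct
        fp += len(p - g)   # labels assigned but not in the gold standard
        fn += len(g - p)   # gold labels that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: three articles, made-up subtype assignments.
gold = {
    "a1": {"papillary"},
    "a2": {"follicular"},
    "a3": {"medullary", "papillary"},
}
predicted = {
    "a1": {"papillary"},
    "a2": {"papillary"},
    "a3": {"medullary"},
}
score = micro_f1(gold, predicted)
print(f"micro-F1 = {score:.4f}")
```

Because counts are pooled, frequent subtypes weigh more heavily in a micro-average than rare ones; a macro-average would instead average per-subtype scores.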
format Online
Article
Text
id pubmed-4290788
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4290788 2015-01-15 Molecular profiling of thyroid cancer subtypes using large-scale text mining Wu, Chengkun Schwartz, Jean-Marc Brabant, Georg Nenadic, Goran BMC Med Genomics Research BioMed Central 2014-12-08 /pmc/articles/PMC4290788/ /pubmed/25521965 http://dx.doi.org/10.1186/1755-8794-7-S3-S3 Text en Copyright © 2014 Wu et al.; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Molecular profiling of thyroid cancer subtypes using large-scale text mining
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4290788/
https://www.ncbi.nlm.nih.gov/pubmed/25521965
http://dx.doi.org/10.1186/1755-8794-7-S3-S3