Comprehensive Analysis of MILE Gene Expression Data Set Advances Discovery of Leukaemia Type and Subtype Biomarkers

Large collections of data in studies on cancer such as leukaemia provoke the necessity of applying tailored analysis algorithms to ensure supreme information extraction. In this work, a custom-fit pipeline is demonstrated for thorough investigation of the voluminous MILE gene expression data set. Th...

Bibliographic Details
Main authors:	Labaj, Wojciech; Papiez, Anna; Polanski, Andrzej; Polanska, Joanna
Format:	Online Article Text
Language:	English
Published:	Springer Berlin Heidelberg 2017
Subjects:	Original Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5366179/ https://www.ncbi.nlm.nih.gov/pubmed/28303531 http://dx.doi.org/10.1007/s12539-017-0216-9

_version_	1782517545464496128
author	Labaj, Wojciech Papiez, Anna Polanski, Andrzej Polanska, Joanna
author_facet	Labaj, Wojciech Papiez, Anna Polanski, Andrzej Polanska, Joanna
author_sort	Labaj, Wojciech
collection	PubMed
description	Large collections of data in studies on cancer such as leukaemia provoke the necessity of applying tailored analysis algorithms to ensure supreme information extraction. In this work, a custom-fit pipeline is demonstrated for thorough investigation of the voluminous MILE gene expression data set. Three analyses are accomplished, each for gaining a deeper understanding of the processes underlying leukaemia types and subtypes. First, the main disease groups are tested for differential expression against the healthy control as in a standard case-control study. Here, the basic knowledge on molecular mechanisms is confirmed quantitatively and by literature references. Second, pairwise comparison testing is performed for juxtaposing the main leukaemia types among each other. In this case by means of the Dice coefficient similarity measure the general relations are pointed out. Moreover, lists of candidate main leukaemia group biomarkers are proposed. Finally, with this approach being successful, the third analysis provides insight into all of the studied subtypes, followed by the emergence of four leukaemia subtype biomarkers. In addition, the class enhanced DEG signature obtained on the basis of novel pipeline processing leads to significantly better classification power of multi-class data classifiers. The developed methodology consisting of batch effect adjustment, adaptive noise and feature filtration coupled with adequate statistical testing and biomarker definition proves to be an effective approach towards knowledge discovery in high-throughput molecular biology experiments. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1007/s12539-017-0216-9) contains supplementary material, which is available to authorized users.
format	Online Article Text
id	pubmed-5366179
institution	National Center for Biotechnology Information
language	English
publishDate	2017
publisher	Springer Berlin Heidelberg
record_format	MEDLINE/PubMed
spelling	pubmed-53661792017-04-10 Comprehensive Analysis of MILE Gene Expression Data Set Advances Discovery of Leukaemia Type and Subtype Biomarkers Labaj, Wojciech Papiez, Anna Polanski, Andrzej Polanska, Joanna Interdiscip Sci Original Research Article Large collections of data in studies on cancer such as leukaemia provoke the necessity of applying tailored analysis algorithms to ensure supreme information extraction. In this work, a custom-fit pipeline is demonstrated for thorough investigation of the voluminous MILE gene expression data set. Three analyses are accomplished, each for gaining a deeper understanding of the processes underlying leukaemia types and subtypes. First, the main disease groups are tested for differential expression against the healthy control as in a standard case-control study. Here, the basic knowledge on molecular mechanisms is confirmed quantitatively and by literature references. Second, pairwise comparison testing is performed for juxtaposing the main leukaemia types among each other. In this case by means of the Dice coefficient similarity measure the general relations are pointed out. Moreover, lists of candidate main leukaemia group biomarkers are proposed. Finally, with this approach being successful, the third analysis provides insight into all of the studied subtypes, followed by the emergence of four leukaemia subtype biomarkers. In addition, the class enhanced DEG signature obtained on the basis of novel pipeline processing leads to significantly better classification power of multi-class data classifiers. The developed methodology consisting of batch effect adjustment, adaptive noise and feature filtration coupled with adequate statistical testing and biomarker definition proves to be an effective approach towards knowledge discovery in high-throughput molecular biology experiments. 
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1007/s12539-017-0216-9) contains supplementary material, which is available to authorized users. Springer Berlin Heidelberg 2017-03-16 2017 /pmc/articles/PMC5366179/ /pubmed/28303531 http://dx.doi.org/10.1007/s12539-017-0216-9 Text en © The Author(s) 2017 Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
spellingShingle	Original Research Article Labaj, Wojciech Papiez, Anna Polanski, Andrzej Polanska, Joanna Comprehensive Analysis of MILE Gene Expression Data Set Advances Discovery of Leukaemia Type and Subtype Biomarkers
title	Comprehensive Analysis of MILE Gene Expression Data Set Advances Discovery of Leukaemia Type and Subtype Biomarkers
title_full	Comprehensive Analysis of MILE Gene Expression Data Set Advances Discovery of Leukaemia Type and Subtype Biomarkers
title_fullStr	Comprehensive Analysis of MILE Gene Expression Data Set Advances Discovery of Leukaemia Type and Subtype Biomarkers
title_full_unstemmed	Comprehensive Analysis of MILE Gene Expression Data Set Advances Discovery of Leukaemia Type and Subtype Biomarkers
title_short	Comprehensive Analysis of MILE Gene Expression Data Set Advances Discovery of Leukaemia Type and Subtype Biomarkers
title_sort	comprehensive analysis of mile gene expression data set advances discovery of leukaemia type and subtype biomarkers
topic	Original Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5366179/ https://www.ncbi.nlm.nih.gov/pubmed/28303531 http://dx.doi.org/10.1007/s12539-017-0216-9
work_keys_str_mv	AT labajwojciech comprehensiveanalysisofmilegeneexpressiondatasetadvancesdiscoveryofleukaemiatypeandsubtypebiomarkers AT papiezanna comprehensiveanalysisofmilegeneexpressiondatasetadvancesdiscoveryofleukaemiatypeandsubtypebiomarkers AT polanskiandrzej comprehensiveanalysisofmilegeneexpressiondatasetadvancesdiscoveryofleukaemiatypeandsubtypebiomarkers AT polanskajoanna comprehensiveanalysisofmilegeneexpressiondatasetadvancesdiscoveryofleukaemiatypeandsubtypebiomarkers
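
The description above mentions the Dice coefficient as the similarity measure used to relate the main leukaemia types. As a rough illustration only, the sketch below computes the standard Dice coefficient, 2|A ∩ B| / (|A| + |B|), between two sets of differentially expressed genes. The function name, the gene identifiers, and the convention for two empty sets are assumptions made for this example; the record does not describe how the authors implemented the measure.

```python
def dice_coefficient(genes_a, genes_b):
    """Dice similarity between two collections of gene identifiers.

    Returns 2 * |A & B| / (|A| + |B|), a value between 0 and 1.
    """
    a, b = set(genes_a), set(genes_b)
    if not a and not b:
        # Assumed convention: two empty lists count as identical.
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical DEG lists for two disease groups (placeholder names, not real results).
degs_group_1 = {"GENE_A", "GENE_B", "GENE_C"}
degs_group_2 = {"GENE_B", "GENE_C", "GENE_D"}

similarity = dice_coefficient(degs_group_1, degs_group_2)
print(f"Dice similarity: {similarity:.3f}")
```

A value near 1 would indicate that two groups share most of their differentially expressed genes, while a value near 0 would indicate largely distinct signatures.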
