Integrating multiple immunogenetic data sources for feature extraction and mining somatic hypermutation patterns: the case of “towards analysis” in chronic lymphocytic leukaemia

BACKGROUND: Somatic Hypermutation (SHM) refers to the introduction of mutations within rearranged V(D)J genes, a process that increases the diversity of Immunoglobulins (IGs). The analysis of SHM has offered critical insight into the physiology and pathology of B cells, leading to strong prognostica...

Bibliographic Details
Main authors: Kavakiotis, Ioannis; Xochelli, Aliki; Agathangelidis, Andreas; Tsoumakas, Grigorios; Maglaveras, Nicos; Stamatopoulos, Kostas; Hadzidimitriou, Anastasia; Vlahavas, Ioannis; Chouvarda, Ioanna
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4905615/
https://www.ncbi.nlm.nih.gov/pubmed/27295298
http://dx.doi.org/10.1186/s12859-016-1044-3
_version_ 1782437280933216256
author Kavakiotis, Ioannis
Xochelli, Aliki
Agathangelidis, Andreas
Tsoumakas, Grigorios
Maglaveras, Nicos
Stamatopoulos, Kostas
Hadzidimitriou, Anastasia
Vlahavas, Ioannis
Chouvarda, Ioanna
author_sort Kavakiotis, Ioannis
collection PubMed
description BACKGROUND: Somatic Hypermutation (SHM) refers to the introduction of mutations within rearranged V(D)J genes, a process that increases the diversity of Immunoglobulins (IGs). The analysis of SHM has offered critical insight into the physiology and pathology of B cells, leading to strong prognostication markers for clinical outcome in chronic lymphocytic leukaemia (CLL), the most frequent adult B-cell malignancy. In this paper we present a methodology for integrating multiple immunogenetic and clinicobiological data sources in order to extract features and create high-quality datasets for SHM analysis in IG receptors of CLL patients. This dataset is used as the basis for a higher-level integration procedure, inspired by social choice theory. This is applied in the Towards Analysis, our attempt to investigate the potential ontogenetic transformation of genes belonging to specific stereotyped CLL subsets towards other genes or gene families, through SHM. RESULTS: The data integration process, followed by feature extraction, resulted in the generation of a dataset containing information about mutations occurring through SHM. The Towards Analysis, performed on the integrated dataset by applying voting techniques, revealed the distinct behaviour of subset #201 compared to other subsets, as regards SHM-related movements among gene clans, both in allele-conserved and non-conserved gene areas. With respect to movement between genes, a high percentage of movement towards pseudogenes was found in all CLL subsets. CONCLUSIONS: This data integration and feature extraction process can set the basis for exploratory analysis or a fully automated computational data mining approach on many as yet unanswered, clinically relevant biological questions.
format Online
Article
Text
id pubmed-4905615
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4905615 2016-06-14 Integrating multiple immunogenetic data sources for feature extraction and mining somatic hypermutation patterns: the case of “towards analysis” in chronic lymphocytic leukaemia Kavakiotis, Ioannis Xochelli, Aliki Agathangelidis, Andreas Tsoumakas, Grigorios Maglaveras, Nicos Stamatopoulos, Kostas Hadzidimitriou, Anastasia Vlahavas, Ioannis Chouvarda, Ioanna BMC Bioinformatics Research BACKGROUND: Somatic Hypermutation (SHM) refers to the introduction of mutations within rearranged V(D)J genes, a process that increases the diversity of Immunoglobulins (IGs). The analysis of SHM has offered critical insight into the physiology and pathology of B cells, leading to strong prognostication markers for clinical outcome in chronic lymphocytic leukaemia (CLL), the most frequent adult B-cell malignancy. In this paper we present a methodology for integrating multiple immunogenetic and clinicobiological data sources in order to extract features and create high-quality datasets for SHM analysis in IG receptors of CLL patients. This dataset is used as the basis for a higher-level integration procedure, inspired by social choice theory. This is applied in the Towards Analysis, our attempt to investigate the potential ontogenetic transformation of genes belonging to specific stereotyped CLL subsets towards other genes or gene families, through SHM. RESULTS: The data integration process, followed by feature extraction, resulted in the generation of a dataset containing information about mutations occurring through SHM. The Towards Analysis, performed on the integrated dataset by applying voting techniques, revealed the distinct behaviour of subset #201 compared to other subsets, as regards SHM-related movements among gene clans, both in allele-conserved and non-conserved gene areas. With respect to movement between genes, a high percentage of movement towards pseudogenes was found in all CLL subsets.
CONCLUSIONS: This data integration and feature extraction process can set the basis for exploratory analysis or a fully automated computational data mining approach on many as yet unanswered, clinically relevant biological questions. BioMed Central 2016-06-06 /pmc/articles/PMC4905615/ /pubmed/27295298 http://dx.doi.org/10.1186/s12859-016-1044-3 Text en © Kavakiotis et al. 2016 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Integrating multiple immunogenetic data sources for feature extraction and mining somatic hypermutation patterns: the case of “towards analysis” in chronic lymphocytic leukaemia
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4905615/
https://www.ncbi.nlm.nih.gov/pubmed/27295298
http://dx.doi.org/10.1186/s12859-016-1044-3