Identification of recurrent genetic patterns from targeted sequencing panels with advanced data science: a case-study on sporadic and genetic neurodegenerative diseases
BACKGROUND: Targeted Next Generation Sequencing is a common and powerful approach used in both clinical and research settings. However, at present, a large fraction of the acquired genetic information is not used since pathogenicity cannot be assessed for most variants. Further complicating this sce...
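Main authors: | Tarozzi, M., Bartoletti-Stella, A., Dall’Olio, D., Matteuzzi, T., Baiardi, S., Parchi, P., Castellani, G., Capellari, S. |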
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8830183/ https://www.ncbi.nlm.nih.gov/pubmed/35144616 http://dx.doi.org/10.1186/s12920-022-01173-4 |
_version_ | 1784648224553828352 |
---|---|
author | Tarozzi, M. Bartoletti-Stella, A. Dall’Olio, D. Matteuzzi, T. Baiardi, S. Parchi, P. Castellani, G. Capellari, S. |
author_facet | Tarozzi, M. Bartoletti-Stella, A. Dall’Olio, D. Matteuzzi, T. Baiardi, S. Parchi, P. Castellani, G. Capellari, S. |
author_sort | Tarozzi, M. |
collection | PubMed |
description | BACKGROUND: Targeted Next Generation Sequencing is a common and powerful approach used in both clinical and research settings. However, at present, a large fraction of the acquired genetic information is not used since pathogenicity cannot be assessed for most variants. Further complicating this scenario is the increasingly frequent description of a poly/oligogenic pattern of inheritance, in which multiple variants contribute to increased disease risk. We present an approach in which the entire genetic information provided by targeted sequencing is transformed into binary data, on which we performed statistical, machine learning, and network analyses to extract all valuable information from the entire genetic profile. To test this approach and explore the presence of recurrent genetic patterns without bias, we studied a cohort of 112 patients affected either by genetic Creutzfeldt–Jakob disease (CJD) caused by two mutations in the PRNP gene (p.E200K and p.V210I) with different penetrance or by sporadic Alzheimer disease (sAD). RESULTS: Unsupervised methods can identify functionally relevant sources of variation in the data, such as haplogroups and polymorphisms that do not follow Hardy–Weinberg equilibrium, for example NOTCH3 rs11670823 (c.3837+21T>A). Supervised classifiers can recognize clinical phenotypes with high accuracy based on the mutational profile of patients. In addition, we found a similar alteration of allele frequencies compared with the European population in sporadic patients and in V210I-CJD, a poorly penetrant PRNP mutation, and sAD, suggesting shared oligogenic patterns in different types of dementia. Pathway enrichment and protein–protein interaction network analyses revealed different altered pathways between the two PRNP mutations. CONCLUSIONS: We propose this workflow as a possible approach to gain deeper insights into the genetic information derived from targeted sequencing, to identify recurrent genetic patterns, and to improve the understanding of complex diseases. This work could also represent a possible starting point for a predictive tool for personalized medicine and advanced diagnostic applications. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12920-022-01173-4. |
format | Online Article Text |
id | pubmed-8830183 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-8830183 2022-02-11 Identification of recurrent genetic patterns from targeted sequencing panels with advanced data science: a case-study on sporadic and genetic neurodegenerative diseases Tarozzi, M. Bartoletti-Stella, A. Dall’Olio, D. Matteuzzi, T. Baiardi, S. Parchi, P. Castellani, G. Capellari, S. BMC Med Genomics Research BACKGROUND: Targeted Next Generation Sequencing is a common and powerful approach used in both clinical and research settings. However, at present, a large fraction of the acquired genetic information is not used since pathogenicity cannot be assessed for most variants. Further complicating this scenario is the increasingly frequent description of a poly/oligogenic pattern of inheritance, in which multiple variants contribute to increased disease risk. We present an approach in which the entire genetic information provided by targeted sequencing is transformed into binary data, on which we performed statistical, machine learning, and network analyses to extract all valuable information from the entire genetic profile. To test this approach and explore the presence of recurrent genetic patterns without bias, we studied a cohort of 112 patients affected either by genetic Creutzfeldt–Jakob disease (CJD) caused by two mutations in the PRNP gene (p.E200K and p.V210I) with different penetrance or by sporadic Alzheimer disease (sAD). RESULTS: Unsupervised methods can identify functionally relevant sources of variation in the data, such as haplogroups and polymorphisms that do not follow Hardy–Weinberg equilibrium, for example NOTCH3 rs11670823 (c.3837+21T>A). Supervised classifiers can recognize clinical phenotypes with high accuracy based on the mutational profile of patients. In addition, we found a similar alteration of allele frequencies compared with the European population in sporadic patients and in V210I-CJD, a poorly penetrant PRNP mutation, and sAD, suggesting shared oligogenic patterns in different types of dementia. Pathway enrichment and protein–protein interaction network analyses revealed different altered pathways between the two PRNP mutations. CONCLUSIONS: We propose this workflow as a possible approach to gain deeper insights into the genetic information derived from targeted sequencing, to identify recurrent genetic patterns, and to improve the understanding of complex diseases. This work could also represent a possible starting point for a predictive tool for personalized medicine and advanced diagnostic applications. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12920-022-01173-4. BioMed Central 2022-02-10 /pmc/articles/PMC8830183/ /pubmed/35144616 http://dx.doi.org/10.1186/s12920-022-01173-4 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | Identification of recurrent genetic patterns from targeted sequencing panels with advanced data science: a case-study on sporadic and genetic neurodegenerative diseases |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8830183/ https://www.ncbi.nlm.nih.gov/pubmed/35144616 http://dx.doi.org/10.1186/s12920-022-01173-4 |