SuperDCA for genome-wide epistasis analysis
The potential for genome-wide modelling of epistasis has recently surfaced given the possibility of sequencing densely sampled populations and the emerging families of statistical interaction models. Direct coupling analysis (DCA) has previously been shown to yield valuable predictions for single pr...
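Main authors: | Puranen, Santeri; Pesonen, Maiju; Pensar, Johan; Xu, Ying Ying; Lees, John A.; Bentley, Stephen D.; Croucher, Nicholas J.; Corander, Jukka |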
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Microbiology Society 2018 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6096938/ https://www.ncbi.nlm.nih.gov/pubmed/29813016 http://dx.doi.org/10.1099/mgen.0.000184 |
_version_ | 1783348201402662912 |
---|---|
author | Puranen, Santeri Pesonen, Maiju Pensar, Johan Xu, Ying Ying Lees, John A. Bentley, Stephen D. Croucher, Nicholas J. Corander, Jukka |
collection | PubMed |
description | The potential for genome-wide modelling of epistasis has recently surfaced given the possibility of sequencing densely sampled populations and the emerging families of statistical interaction models. Direct coupling analysis (DCA) has previously been shown to yield valuable predictions for single protein structures, and has recently been extended to genome-wide analysis of bacteria, identifying novel interactions in the co-evolution between resistance, virulence and core genome elements. However, earlier computational DCA methods have not been scalable to enable model fitting simultaneously to 10^4–10^5 polymorphisms, representing the amount of core genomic variation observed in analyses of many bacterial species. Here, we introduce a novel inference method (SuperDCA) that employs a new scoring principle, efficient parallelization, optimization and filtering on phylogenetic information to achieve scalability for up to 10^5 polymorphisms. Using two large population samples of Streptococcus pneumoniae, we demonstrate the ability of SuperDCA to make additional significant biological findings about this major human pathogen. We also show that our method can uncover signals of selection that are not detectable by genome-wide association analysis, even though our analysis does not require phenotypic measurements. SuperDCA, thus, holds considerable potential in building understanding about numerous organisms at a systems biological level. |
format | Online Article Text |
id | pubmed-6096938 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Microbiology Society |
record_format | MEDLINE/PubMed |
spelling | pubmed-6096938 2018-08-20 SuperDCA for genome-wide epistasis analysis Puranen, Santeri Pesonen, Maiju Pensar, Johan Xu, Ying Ying Lees, John A. Bentley, Stephen D. Croucher, Nicholas J. Corander, Jukka Microb Genom Research Article Microbiology Society 2018-05-29 /pmc/articles/PMC6096938/ /pubmed/29813016 http://dx.doi.org/10.1099/mgen.0.000184 Text en © 2018 The Authors http://creativecommons.org/licenses/by/4.0/ This is an open access article under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium, provided the original author and source are credited. |
title | SuperDCA for genome-wide epistasis analysis |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6096938/ https://www.ncbi.nlm.nih.gov/pubmed/29813016 http://dx.doi.org/10.1099/mgen.0.000184 |