Parsimonious Higher-Order Hidden Markov Models for Improved Array-CGH Analysis with Applications to Arabidopsis thaliana
Array-based comparative genomic hybridization (Array-CGH) is an important technology in molecular biology for the detection of DNA copy number polymorphisms between closely related genomes. Hidden Markov Models (HMMs) are popular tools for the analysis of Array-CGH data, but current methods are only...
Main authors: | Seifert, Michael; Gohr, André; Strickert, Marc; Grosse, Ivo |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2012 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3257270/ https://www.ncbi.nlm.nih.gov/pubmed/22253580 http://dx.doi.org/10.1371/journal.pcbi.1002286 |
_version_ | 1782221134207385600 |
---|---|
author | Seifert, Michael Gohr, André Strickert, Marc Grosse, Ivo |
author_facet | Seifert, Michael Gohr, André Strickert, Marc Grosse, Ivo |
author_sort | Seifert, Michael |
collection | PubMed |
description | Array-based comparative genomic hybridization (Array-CGH) is an important technology in molecular biology for the detection of DNA copy number polymorphisms between closely related genomes. Hidden Markov Models (HMMs) are popular tools for the analysis of Array-CGH data, but current methods are only based on first-order HMMs having constrained abilities to model spatial dependencies between measurements of closely adjacent chromosomal regions. Here, we develop parsimonious higher-order HMMs enabling the interpolation between a mixture model ignoring spatial dependencies and a higher-order HMM exhaustively modeling spatial dependencies. We apply parsimonious higher-order HMMs to the analysis of Array-CGH data of the accessions C24 and Col-0 of the model plant Arabidopsis thaliana. We compare these models against first-order HMMs and other existing methods using a reference of known deletions and sequence deviations. We find that parsimonious higher-order HMMs clearly improve the identification of these polymorphisms. Moreover, we perform a functional analysis of identified polymorphisms revealing novel details of genomic differences between C24 and Col-0. Additional model evaluations are done on widely considered Array-CGH data of human cell lines indicating that parsimonious HMMs are also well-suited for the analysis of non-plant specific data. All these results indicate that parsimonious higher-order HMMs are useful for Array-CGH analyses. An implementation of parsimonious higher-order HMMs is available as part of the open source Java library Jstacs (www.jstacs.de/index.php/PHHMM). |
format | Online Article Text |
id | pubmed-3257270 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-3257270 2012-01-17 Parsimonious Higher-Order Hidden Markov Models for Improved Array-CGH Analysis with Applications to Arabidopsis thaliana Seifert, Michael Gohr, André Strickert, Marc Grosse, Ivo PLoS Comput Biol Research Article Public Library of Science 2012-01-12 /pmc/articles/PMC3257270/ /pubmed/22253580 http://dx.doi.org/10.1371/journal.pcbi.1002286 Text en Seifert et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Seifert, Michael Gohr, André Strickert, Marc Grosse, Ivo Parsimonious Higher-Order Hidden Markov Models for Improved Array-CGH Analysis with Applications to Arabidopsis thaliana |
title | Parsimonious Higher-Order Hidden Markov Models for Improved Array-CGH Analysis with Applications to Arabidopsis thaliana |
title_full | Parsimonious Higher-Order Hidden Markov Models for Improved Array-CGH Analysis with Applications to Arabidopsis thaliana |
title_fullStr | Parsimonious Higher-Order Hidden Markov Models for Improved Array-CGH Analysis with Applications to Arabidopsis thaliana |
title_full_unstemmed | Parsimonious Higher-Order Hidden Markov Models for Improved Array-CGH Analysis with Applications to Arabidopsis thaliana |
title_short | Parsimonious Higher-Order Hidden Markov Models for Improved Array-CGH Analysis with Applications to Arabidopsis thaliana |
title_sort | parsimonious higher-order hidden markov models for improved array-cgh analysis with applications to arabidopsis thaliana |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3257270/ https://www.ncbi.nlm.nih.gov/pubmed/22253580 http://dx.doi.org/10.1371/journal.pcbi.1002286 |
work_keys_str_mv | AT seifertmichael parsimonioushigherorderhiddenmarkovmodelsforimprovedarraycghanalysiswithapplicationstoarabidopsisthaliana AT gohrandre parsimonioushigherorderhiddenmarkovmodelsforimprovedarraycghanalysiswithapplicationstoarabidopsisthaliana AT strickertmarc parsimonioushigherorderhiddenmarkovmodelsforimprovedarraycghanalysiswithapplicationstoarabidopsisthaliana AT grosseivo parsimonioushigherorderhiddenmarkovmodelsforimprovedarraycghanalysiswithapplicationstoarabidopsisthaliana |