Parsimonious Higher-Order Hidden Markov Models for Improved Array-CGH Analysis with Applications to Arabidopsis thaliana

Bibliographic Details
Main authors: Seifert, Michael; Gohr, André; Strickert, Marc; Grosse, Ivo
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3257270/
https://www.ncbi.nlm.nih.gov/pubmed/22253580
http://dx.doi.org/10.1371/journal.pcbi.1002286
author Seifert, Michael
Gohr, André
Strickert, Marc
Grosse, Ivo
collection PubMed
description Array-based comparative genomic hybridization (Array-CGH) is an important technology in molecular biology for the detection of DNA copy number polymorphisms between closely related genomes. Hidden Markov Models (HMMs) are popular tools for the analysis of Array-CGH data, but current methods are only based on first-order HMMs having constrained abilities to model spatial dependencies between measurements of closely adjacent chromosomal regions. Here, we develop parsimonious higher-order HMMs enabling the interpolation between a mixture model ignoring spatial dependencies and a higher-order HMM exhaustively modeling spatial dependencies. We apply parsimonious higher-order HMMs to the analysis of Array-CGH data of the accessions C24 and Col-0 of the model plant Arabidopsis thaliana. We compare these models against first-order HMMs and other existing methods using a reference of known deletions and sequence deviations. We find that parsimonious higher-order HMMs clearly improve the identification of these polymorphisms. Moreover, we perform a functional analysis of identified polymorphisms revealing novel details of genomic differences between C24 and Col-0. Additional model evaluations are done on widely considered Array-CGH data of human cell lines indicating that parsimonious HMMs are also well-suited for the analysis of non-plant specific data. All these results indicate that parsimonious higher-order HMMs are useful for Array-CGH analyses. An implementation of parsimonious higher-order HMMs is available as part of the open source Java library Jstacs (www.jstacs.de/index.php/PHHMM).
format Online
Article
Text
id pubmed-3257270
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3257270 2012-01-17 PLoS Comput Biol Research Article
Public Library of Science 2012-01-12 /pmc/articles/PMC3257270/ /pubmed/22253580 http://dx.doi.org/10.1371/journal.pcbi.1002286 Text en Seifert et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Parsimonious Higher-Order Hidden Markov Models for Improved Array-CGH Analysis with Applications to Arabidopsis thaliana
topic Research Article