Parametric Alignment of Drosophila Genomes

The classic algorithms of Needleman–Wunsch and Smith–Waterman find a maximum a posteriori probability alignment for a pair hidden Markov model (PHMM). To process large genomes that have undergone complex genome rearrangements, almost all existing whole genome alignment methods apply fast heuristics...

Bibliographic Details
Main authors: Dewey, Colin N; Huggins, Peter M; Woods, Kevin; Sturmfels, Bernd; Pachter, Lior
Format: Text
Language: English
Published: Public Library of Science, 2006
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1480539/
https://www.ncbi.nlm.nih.gov/pubmed/16789815
http://dx.doi.org/10.1371/journal.pcbi.0020073
author Dewey, Colin N
Huggins, Peter M
Woods, Kevin
Sturmfels, Bernd
Pachter, Lior
collection PubMed
description The classic algorithms of Needleman–Wunsch and Smith–Waterman find a maximum a posteriori probability alignment for a pair hidden Markov model (PHMM). To process large genomes that have undergone complex genome rearrangements, almost all existing whole genome alignment methods apply fast heuristics to divide genomes into small pieces that are suitable for Needleman–Wunsch alignment. In these alignment methods, it is standard practice to fix the parameters and to produce a single alignment for subsequent analysis by biologists. As the number of alignment programs applied on a whole genome scale continues to increase, so does the disagreement in their results. The alignments produced by different programs vary greatly, especially in non-coding regions of eukaryotic genomes where the biologically correct alignment is hard to find. Parametric alignment is one possible remedy. This methodology resolves the issue of robustness to changes in parameters by finding all optimal alignments for all possible parameters in a PHMM. Our main result is the construction of a whole genome parametric alignment of Drosophila melanogaster and Drosophila pseudoobscura. This alignment draws on existing heuristics for dividing whole genomes into small pieces for alignment, and it relies on advances we have made in computing convex polytopes that allow us to parametrically align non-coding regions using biologically realistic models. We demonstrate the utility of our parametric alignment for biological inference by showing that cis-regulatory elements are more conserved between Drosophila melanogaster and Drosophila pseudoobscura than previously thought. We also show how whole genome parametric alignment can be used to quantitatively assess the dependence of branch length estimates on alignment parameters.
format Text
id pubmed-1480539
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-1480539 2006-06-23 Parametric Alignment of Drosophila Genomes Dewey, Colin N Huggins, Peter M Woods, Kevin Sturmfels, Bernd Pachter, Lior PLoS Comput Biol Research Article
Public Library of Science 2006-06 2006-06-23 /pmc/articles/PMC1480539/ /pubmed/16789815 http://dx.doi.org/10.1371/journal.pcbi.0020073 Text en © 2006 Dewey et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Parametric Alignment of Drosophila Genomes
topic Research Article