
MafFilter: a highly flexible and extensible multiple genome alignment files processor



Bibliographic Details
Main Authors: Dutheil, Julien Y, Gaillard, Sylvain, Stukenbrock, Eva H
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects: Software
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3904536/
https://www.ncbi.nlm.nih.gov/pubmed/24447531
http://dx.doi.org/10.1186/1471-2164-15-53
author Dutheil, Julien Y
Gaillard, Sylvain
Stukenbrock, Eva H
author_facet Dutheil, Julien Y
Gaillard, Sylvain
Stukenbrock, Eva H
author_sort Dutheil, Julien Y
collection PubMed
description BACKGROUND: Sequence alignments are the starting point for most evolutionary and comparative analyses. Full genome sequences can be compared to study patterns of within and between species variation. Genome sequence alignments are complex structures containing information such as coordinates, quality scores and synteny structure, which are stored in Multiple Alignment Format (MAF) files. Processing these alignments therefore involves parsing and manipulating typically large MAF files in an efficient way. RESULTS: MafFilter is a command-line driven program written in C++ that enables the processing of genome alignments stored in the Multiple Alignment Format in an efficient and extensible manner. It provides an extensive set of tools which can be parametrized and combined by the user via option files. We demonstrate the software’s functionality and performance on several biological examples covering Primate genomics and fungal population genomics. Example analyses involve window-based alignment filtering, feature extractions and various statistics, phylogenetics and population genomics calculations. CONCLUSIONS: MafFilter is a highly efficient and flexible tool to analyse multiple genome alignments. By allowing the user to combine a large set of available methods, as well as designing his/her own, it enables the design of custom data filtering and analysis pipelines for genomic studies. MafFilter is an open source software available at http://bioweb.me/maffilter.
format Online
Article
Text
id pubmed-3904536
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-39045362014-01-29 MafFilter: a highly flexible and extensible multiple genome alignment files processor Dutheil, Julien Y Gaillard, Sylvain Stukenbrock, Eva H BMC Genomics Software BACKGROUND: Sequence alignments are the starting point for most evolutionary and comparative analyses. Full genome sequences can be compared to study patterns of within and between species variation. Genome sequence alignments are complex structures containing information such as coordinates, quality scores and synteny structure, which are stored in Multiple Alignment Format (MAF) files. Processing these alignments therefore involves parsing and manipulating typically large MAF files in an efficient way. RESULTS: MafFilter is a command-line driven program written in C++ that enables the processing of genome alignments stored in the Multiple Alignment Format in an efficient and extensible manner. It provides an extensive set of tools which can be parametrized and combined by the user via option files. We demonstrate the software’s functionality and performance on several biological examples covering Primate genomics and fungal population genomics. Example analyses involve window-based alignment filtering, feature extractions and various statistics, phylogenetics and population genomics calculations. CONCLUSIONS: MafFilter is a highly efficient and flexible tool to analyse multiple genome alignments. By allowing the user to combine a large set of available methods, as well as designing his/her own, it enables the design of custom data filtering and analysis pipelines for genomic studies. MafFilter is an open source software available at http://bioweb.me/maffilter. BioMed Central 2014-01-22 /pmc/articles/PMC3904536/ /pubmed/24447531 http://dx.doi.org/10.1186/1471-2164-15-53 Text en Copyright © 2014 Dutheil et al.; licensee BioMed Central Ltd. 
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Software
Dutheil, Julien Y
Gaillard, Sylvain
Stukenbrock, Eva H
MafFilter: a highly flexible and extensible multiple genome alignment files processor
title MafFilter: a highly flexible and extensible multiple genome alignment files processor
title_full MafFilter: a highly flexible and extensible multiple genome alignment files processor
title_fullStr MafFilter: a highly flexible and extensible multiple genome alignment files processor
title_full_unstemmed MafFilter: a highly flexible and extensible multiple genome alignment files processor
title_short MafFilter: a highly flexible and extensible multiple genome alignment files processor
title_sort maffilter: a highly flexible and extensible multiple genome alignment files processor
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3904536/
https://www.ncbi.nlm.nih.gov/pubmed/24447531
http://dx.doi.org/10.1186/1471-2164-15-53
work_keys_str_mv AT dutheiljulieny maffilterahighlyflexibleandextensiblemultiplegenomealignmentfilesprocessor
AT gaillardsylvain maffilterahighlyflexibleandextensiblemultiplegenomealignmentfilesprocessor
AT stukenbrockevah maffilterahighlyflexibleandextensiblemultiplegenomealignmentfilesprocessor
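
The description in this record mentions parsing MAF files and window-based alignment filtering. The following is a minimal sketch of that idea in Python, not MafFilter's own code or option syntax. It assumes the standard MAF layout (an `a` line opens a block, `s` lines carry source, start, size, strand, source size and aligned text). The names `MafSequence`, `parse_maf`, `max_window_gaps` and `filter_blocks` are hypothetical, and rejecting a whole block when any window holds too many gaps is a simplification chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class MafSequence:
    """One 's' line of a MAF alignment block (hypothetical helper type)."""
    src: str
    start: int
    size: int
    strand: str
    src_size: int
    text: str


def parse_maf(lines: Iterable[str]) -> Iterator[List[MafSequence]]:
    """Yield alignment blocks as lists of MafSequence, one block per 'a' line."""
    block = None
    for raw in lines:
        fields = raw.split()
        if not fields:
            # A blank line closes the current block.
            if block:
                yield block
            block = None
        elif fields[0] == "a":
            # An 'a' line opens a new block.
            if block:
                yield block
            block = []
        elif fields[0] == "s" and block is not None:
            src, start, size, strand, src_size, text = fields[1:7]
            block.append(
                MafSequence(src, int(start), int(size), strand, int(src_size), text)
            )
        # Header and comment lines ('#...') and other line types are ignored.
    if block:
        yield block


def max_window_gaps(block: List[MafSequence], window: int) -> int:
    """Return the largest number of gap characters in any window of columns."""
    length = len(block[0].text)  # all rows in a block share the same length
    col_gaps = [sum(s.text[i] == "-" for s in block) for i in range(length)]
    if length <= window:
        return sum(col_gaps)
    current = sum(col_gaps[:window])
    best = current
    for i in range(window, length):
        # Slide the window one column to the right.
        current += col_gaps[i] - col_gaps[i - window]
        best = max(best, current)
    return best


def filter_blocks(blocks, window: int = 10, max_gaps: int = 5):
    """Keep only blocks where no window exceeds max_gaps gap characters."""
    for block in blocks:
        if max_window_gaps(block, window) <= max_gaps:
            yield block


# Invented toy data, for illustration only.
EXAMPLE = """##maf version=1
a score=100
s hg.chr1 0 8 + 100 ACGTACGT
s pt.chr1 5 8 + 120 ACGTACGA

a score=50
s hg.chr1 20 8 + 100 ACGTACGT
s pt.chr1 30 4 + 120 A----CGT
"""

kept = list(filter_blocks(parse_maf(EXAMPLE.splitlines()), window=4, max_gaps=2))
```

With the toy data above, the second block contains a run of four gaps inside a four-column window, so a threshold of two gaps per window retains only the first block. Both functions are generators, so blocks are read and filtered one at a time rather than loading the whole file, which matters for the large MAF files the description refers to.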