
Mega2: validated data-reformatting for linkage and association analyses

BACKGROUND: In a typical study of the genetics of a complex human disease, many different analysis programs are used to test for linkage and association. This requires extensive and careful data reformatting, as many of these analysis programs use differing input formats. Writing scripts to facilit...

Bibliographic Details
Main Authors: Baron, Robert V; Kollar, Charles; Mukhopadhyay, Nandita; Weeks, Daniel E
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4269913/
https://www.ncbi.nlm.nih.gov/pubmed/25687422
http://dx.doi.org/10.1186/s13029-014-0026-y
author Baron, Robert V
Kollar, Charles
Mukhopadhyay, Nandita
Weeks, Daniel E
collection PubMed
description BACKGROUND: In a typical study of the genetics of a complex human disease, many different analysis programs are used to test for linkage and association. This requires extensive and careful data reformatting, as many of these analysis programs use differing input formats. Writing scripts to facilitate this can be tedious, time-consuming, and error-prone. To address these issues, the open-source Mega2 data reformatting program provides validated and tested data conversions from several commonly used input formats to many output formats.
RESULTS: Mega2, the Manipulation Environment for Genetic Analysis, facilitates the creation of analysis-ready datasets from data gathered as part of a genetic study. It transparently allows users to process genetic data for family-based or case/control studies accurately and efficiently. In addition to data validation checks, Mega2 provides analysis setup capabilities for a broad choice of commonly used genetic analysis programs. First released in 2000, Mega2 has recently been significantly improved in a number of ways. We have rewritten it in C++ and have reduced its memory requirements. Mega2 can now read input files in LINKAGE, PLINK, and VCF/BCF formats, as well as its own specialized annotated format. It supports conversion to many commonly used formats including SOLAR, PLINK, Merlin, Mendel, SimWalk2, Cranefoot, IQLS, FBAT, MORGAN, BEAGLE, Eigenstrat, Structure, and PLINK/SEQ. When controlled by a batch file, Mega2 can be used non-interactively in data reformatting pipelines. Support for genetic data from several species besides humans has been added.
CONCLUSIONS: By providing tested and validated data reformatting, Mega2 facilitates more accurate and extensive analyses of genetic data, avoiding the need to write, debug, and maintain one’s own custom data reformatting scripts. Mega2 is freely available at https://watson.hgen.pitt.edu/register/.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13029-014-0026-y) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4269913
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4269913 2014-12-18 Mega2: validated data-reformatting for linkage and association analyses Baron, Robert V Kollar, Charles Mukhopadhyay, Nandita Weeks, Daniel E Source Code Biol Med Software Review BioMed Central 2014-12-05 /pmc/articles/PMC4269913/ /pubmed/25687422 http://dx.doi.org/10.1186/s13029-014-0026-y Text en © Baron et al.; licensee BioMed Central Ltd. 2014 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Software Review
Baron, Robert V
Kollar, Charles
Mukhopadhyay, Nandita
Weeks, Daniel E
Mega2: validated data-reformatting for linkage and association analyses
title Mega2: validated data-reformatting for linkage and association analyses
title_sort mega2: validated data-reformatting for linkage and association analyses
topic Software Review
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4269913/
https://www.ncbi.nlm.nih.gov/pubmed/25687422
http://dx.doi.org/10.1186/s13029-014-0026-y