Mega2: validated data-reformatting for linkage and association analyses
BACKGROUND: In a typical study of the genetics of a complex human disease, many different analysis programs are used, to test for linkage and association. This requires extensive and careful data reformatting, as many of these analysis programs use differing input formats. Writing scripts to facilit...
Main authors: | Baron, Robert V; Kollar, Charles; Mukhopadhyay, Nandita; Weeks, Daniel E |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2014 |
Subjects: | Software Review |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4269913/ https://www.ncbi.nlm.nih.gov/pubmed/25687422 http://dx.doi.org/10.1186/s13029-014-0026-y |
_version_ | 1782349412287119360 |
---|---|
author | Baron, Robert V Kollar, Charles Mukhopadhyay, Nandita Weeks, Daniel E |
author_facet | Baron, Robert V Kollar, Charles Mukhopadhyay, Nandita Weeks, Daniel E |
author_sort | Baron, Robert V |
collection | PubMed |
description | BACKGROUND: In a typical study of the genetics of a complex human disease, many different analysis programs are used, to test for linkage and association. This requires extensive and careful data reformatting, as many of these analysis programs use differing input formats. Writing scripts to facilitate this can be tedious, time-consuming, and error-prone. To address these issues, the open source Mega2 data reformatting program provides validated and tested data conversions from several commonly-used input formats to many output formats. RESULTS: Mega2, the Manipulation Environment for Genetic Analysis, facilitates the creation of analysis-ready datasets from data gathered as part of a genetic study. It transparently allows users to process genetic data for family-based or case/control studies accurately and efficiently. In addition to data validation checks, Mega2 provides analysis setup capabilities for a broad choice of commonly-used genetic analysis programs. First released in 2000, Mega2 has recently been significantly improved in a number of ways. We have rewritten it in C++ and have reduced its memory requirements. Mega2 now can read input files in LINKAGE, PLINK, and VCF/BCF formats, as well as its own specialized annotated format. It supports conversion to many commonly-used formats including SOLAR, PLINK, Merlin, Mendel, SimWalk2, Cranefoot, IQLS, FBAT, MORGAN, BEAGLE, Eigenstrat, Structure, and PLINK/SEQ. When controlled by a batch file, Mega2 can be used non-interactively in data reformatting pipelines. Support for genetic data from several other species besides humans has been added. CONCLUSIONS: By providing tested and validated data reformatting, Mega2 facilitates more accurate and extensive analyses of genetic data, avoiding the need to write, debug, and maintain one’s own custom data reformatting scripts. Mega2 is freely available at https://watson.hgen.pitt.edu/register/. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13029-014-0026-y) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-4269913 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4269913 2014-12-18 Mega2: validated data-reformatting for linkage and association analyses Baron, Robert V Kollar, Charles Mukhopadhyay, Nandita Weeks, Daniel E Source Code Biol Med Software Review BACKGROUND: In a typical study of the genetics of a complex human disease, many different analysis programs are used, to test for linkage and association. This requires extensive and careful data reformatting, as many of these analysis programs use differing input formats. Writing scripts to facilitate this can be tedious, time-consuming, and error-prone. To address these issues, the open source Mega2 data reformatting program provides validated and tested data conversions from several commonly-used input formats to many output formats. RESULTS: Mega2, the Manipulation Environment for Genetic Analysis, facilitates the creation of analysis-ready datasets from data gathered as part of a genetic study. It transparently allows users to process genetic data for family-based or case/control studies accurately and efficiently. In addition to data validation checks, Mega2 provides analysis setup capabilities for a broad choice of commonly-used genetic analysis programs. First released in 2000, Mega2 has recently been significantly improved in a number of ways. We have rewritten it in C++ and have reduced its memory requirements. Mega2 now can read input files in LINKAGE, PLINK, and VCF/BCF formats, as well as its own specialized annotated format. It supports conversion to many commonly-used formats including SOLAR, PLINK, Merlin, Mendel, SimWalk2, Cranefoot, IQLS, FBAT, MORGAN, BEAGLE, Eigenstrat, Structure, and PLINK/SEQ. When controlled by a batch file, Mega2 can be used non-interactively in data reformatting pipelines. Support for genetic data from several other species besides humans has been added. CONCLUSIONS: By providing tested and validated data reformatting, Mega2 facilitates more accurate and extensive analyses of genetic data, avoiding the need to write, debug, and maintain one’s own custom data reformatting scripts. Mega2 is freely available at https://watson.hgen.pitt.edu/register/. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13029-014-0026-y) contains supplementary material, which is available to authorized users. BioMed Central 2014-12-05 /pmc/articles/PMC4269913/ /pubmed/25687422 http://dx.doi.org/10.1186/s13029-014-0026-y Text en © Baron et al.; licensee BioMed Central Ltd. 2014 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Software Review Baron, Robert V Kollar, Charles Mukhopadhyay, Nandita Weeks, Daniel E Mega2: validated data-reformatting for linkage and association analyses |
title | Mega2: validated data-reformatting for linkage and association analyses |
title_full | Mega2: validated data-reformatting for linkage and association analyses |
title_fullStr | Mega2: validated data-reformatting for linkage and association analyses |
title_full_unstemmed | Mega2: validated data-reformatting for linkage and association analyses |
title_short | Mega2: validated data-reformatting for linkage and association analyses |
title_sort | mega2: validated data-reformatting for linkage and association analyses |
topic | Software Review |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4269913/ https://www.ncbi.nlm.nih.gov/pubmed/25687422 http://dx.doi.org/10.1186/s13029-014-0026-y |
work_keys_str_mv | AT baronrobertv mega2validateddatareformattingforlinkageandassociationanalyses AT kollarcharles mega2validateddatareformattingforlinkageandassociationanalyses AT mukhopadhyaynandita mega2validateddatareformattingforlinkageandassociationanalyses AT weeksdaniele mega2validateddatareformattingforlinkageandassociationanalyses |
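
The description in this record notes that hand-written reformatting scripts are tedious and error-prone. As a rough illustration of the kind of work involved (this is not Mega2's code or interface), the Python sketch below parses PLINK-style .ped rows, runs two simple pedigree checks, and recodes allele labels as integers in a LINKAGE-style numeric layout. The names `Person`, `parse_ped_line`, `validate`, and `recode_alleles`, the sample rows, and the choice of checks are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple

MISSING = "0"  # assumed missing-value code for parents and alleles


@dataclass
class Person:
    family: str
    ident: str
    father: str
    mother: str
    sex: str        # "1" = male, "2" = female in PLINK-style files
    phenotype: str
    genotypes: List[Tuple[str, str]]


def parse_ped_line(line: str) -> Person:
    """Parse one whitespace-delimited row: 6 leading columns, then allele pairs."""
    fields = line.split()
    if len(fields) < 6 or (len(fields) - 6) % 2 != 0:
        raise ValueError(f"malformed PED row: {line!r}")
    alleles = fields[6:]
    pairs = [(alleles[i], alleles[i + 1]) for i in range(0, len(alleles), 2)]
    return Person(*fields[:6], genotypes=pairs)


def validate(people: List[Person]) -> List[str]:
    """Report parents that are absent from the file or have the wrong sex code."""
    by_id = {(p.family, p.ident): p for p in people}
    problems = []
    for p in people:
        for role, parent_id, expected_sex in (
            ("father", p.father, "1"),
            ("mother", p.mother, "2"),
        ):
            if parent_id == MISSING:
                continue  # founder: no parent recorded
            parent = by_id.get((p.family, parent_id))
            if parent is None:
                problems.append(f"{p.family}/{p.ident}: {role} {parent_id} not in file")
            elif parent.sex != expected_sex:
                problems.append(
                    f"{p.family}/{p.ident}: {role} {parent_id} has sex code {parent.sex}"
                )
    return problems


def recode_alleles(people: List[Person]) -> List[str]:
    """Recode each marker's allele labels as 1, 2, ... in sorted label order."""
    if not people:
        return []
    n_markers = len(people[0].genotypes)
    if any(len(p.genotypes) != n_markers for p in people):
        raise ValueError("rows have differing marker counts")
    codes = []
    for m in range(n_markers):
        labels = sorted({a for p in people for a in p.genotypes[m] if a != MISSING})
        codes.append({label: str(i + 1) for i, label in enumerate(labels)})
    rows = []
    for p in people:
        recoded = []
        for m, (a, b) in enumerate(p.genotypes):
            # Labels not in the map (the missing code) stay as MISSING.
            recoded += [codes[m].get(a, MISSING), codes[m].get(b, MISSING)]
        head = [p.family, p.ident, p.father, p.mother, p.sex, p.phenotype]
        rows.append(" ".join(head + recoded))
    return rows


if __name__ == "__main__":
    sample = ["F1 1 0 0 1 1 A A G T", "F1 2 0 0 2 1 A C G G", "F1 3 1 2 1 2 A C 0 0"]
    trio = [parse_ped_line(row) for row in sample]
    for problem in validate(trio):
        print("WARNING:", problem)
    print("\n".join(recode_alleles(trio)))
```

Even a sketch this small has to settle questions such as which code marks missing data and in what order alleles are numbered. A validated converter is meant to handle such details consistently across many output formats.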