Canary: an automated tool for the conversion of MaCH imputed dosage files to PLINK files
BACKGROUND: Previous studies have demonstrated the value of re-analysing publicly available genetics data with recent analytical approaches. Publicly available datasets, such as the Women’s Health Initiative (WHI) offered by the database of genotypes and phenotypes (dbGaP), provide a wealthy resourc...
Main authors: | Bennett, Adam N.; Rainford, Jethro; Huang, Xiaotai; He, Qian; Chan, Kei Hang Katie |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2022 |
Subjects: | Software |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9327220/ https://www.ncbi.nlm.nih.gov/pubmed/35896971 http://dx.doi.org/10.1186/s12859-022-04822-8 |
_version_ | 1784757460900249600 |
---|---|
author | Bennett, Adam N. Rainford, Jethro Huang, Xiaotai He, Qian Chan, Kei Hang Katie |
author_facet | Bennett, Adam N. Rainford, Jethro Huang, Xiaotai He, Qian Chan, Kei Hang Katie |
author_sort | Bennett, Adam N. |
collection | PubMed |
description | BACKGROUND: Previous studies have demonstrated the value of re-analysing publicly available genetics data with recent analytical approaches. Publicly available datasets, such as the Women’s Health Initiative (WHI) offered by the database of Genotypes and Phenotypes (dbGaP), provide a rich resource for researchers to perform multiple analyses, including Genome-Wide Association Studies (GWAS). Often, the genetic information of individuals in these datasets is stored in imputed dosage files output by MaCH (mldose and mlinfo files). To perform GWAS with these data, researchers must first convert them to a file format compatible with their tool of choice, e.g., PLINK. Currently, there is no published tool that easily converts the datasets provided in MaCH dosage files into PLINK-ready files. RESULTS: Herein, we present Canary, a Singularity-based tool that converts MaCH dosage files into PLINK-compatible files with a single line of user input at the command line. Further, we provide a detailed tutorial on the preparation of phenotype files. Moreover, Canary comes with preinstalled software often used during GWAS, to further increase the ease of use of HPC systems for researchers. CONCLUSIONS: Until now, conversion of imputed data in the form of MaCH mldose and mlinfo files needed to be completed manually. Canary uses Singularity container technology to allow users to automatically convert these MaCH files into PLINK-compatible files. Additionally, Canary provides researchers with a platform to conduct GWAS analysis more easily, as it contains essential software needed for conducting GWAS, such as PLINK and Bioconductor. We hope that this tool will greatly increase the ease with which researchers can perform GWAS with imputed data, particularly in HPC environments. |
format | Online Article Text |
id | pubmed-9327220 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-9327220 2022-07-28 Canary: an automated tool for the conversion of MaCH imputed dosage files to PLINK files Bennett, Adam N. Rainford, Jethro Huang, Xiaotai He, Qian Chan, Kei Hang Katie BMC Bioinformatics Software BACKGROUND: Previous studies have demonstrated the value of re-analysing publicly available genetics data with recent analytical approaches. Publicly available datasets, such as the Women’s Health Initiative (WHI) offered by the database of Genotypes and Phenotypes (dbGaP), provide a rich resource for researchers to perform multiple analyses, including Genome-Wide Association Studies (GWAS). Often, the genetic information of individuals in these datasets is stored in imputed dosage files output by MaCH (mldose and mlinfo files). To perform GWAS with these data, researchers must first convert them to a file format compatible with their tool of choice, e.g., PLINK. Currently, there is no published tool that easily converts the datasets provided in MaCH dosage files into PLINK-ready files. RESULTS: Herein, we present Canary, a Singularity-based tool that converts MaCH dosage files into PLINK-compatible files with a single line of user input at the command line. Further, we provide a detailed tutorial on the preparation of phenotype files. Moreover, Canary comes with preinstalled software often used during GWAS, to further increase the ease of use of HPC systems for researchers. CONCLUSIONS: Until now, conversion of imputed data in the form of MaCH mldose and mlinfo files needed to be completed manually. Canary uses Singularity container technology to allow users to automatically convert these MaCH files into PLINK-compatible files. Additionally, Canary provides researchers with a platform to conduct GWAS analysis more easily, as it contains essential software needed for conducting GWAS, such as PLINK and Bioconductor. We hope that this tool will greatly increase the ease with which researchers can perform GWAS with imputed data, particularly in HPC environments. BioMed Central 2022-07-27 /pmc/articles/PMC9327220/ /pubmed/35896971 http://dx.doi.org/10.1186/s12859-022-04822-8 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Software Bennett, Adam N. Rainford, Jethro Huang, Xiaotai He, Qian Chan, Kei Hang Katie Canary: an automated tool for the conversion of MaCH imputed dosage files to PLINK files |
title | Canary: an automated tool for the conversion of MaCH imputed dosage files to PLINK files |
title_full | Canary: an automated tool for the conversion of MaCH imputed dosage files to PLINK files |
title_fullStr | Canary: an automated tool for the conversion of MaCH imputed dosage files to PLINK files |
title_full_unstemmed | Canary: an automated tool for the conversion of MaCH imputed dosage files to PLINK files |
title_short | Canary: an automated tool for the conversion of MaCH imputed dosage files to PLINK files |
title_sort | canary: an automated tool for the conversion of mach imputed dosage files to plink files |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9327220/ https://www.ncbi.nlm.nih.gov/pubmed/35896971 http://dx.doi.org/10.1186/s12859-022-04822-8 |