BGData - A Suite of R Packages for Genomic Analysis with Big Data
We created a suite of packages to enable analysis of extremely large genomic data sets (potentially millions of individuals and millions of molecular markers) within the R environment. The package offers: a matrix-like interface for .bed files (PLINK’s binary format for genotype data), a novel class...
Main Authors: | Grueneberg, Alexander; de los Campos, Gustavo |
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
Genetics Society of America
2019
|
Subjects: | Software and Data Resources |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6505159/ https://www.ncbi.nlm.nih.gov/pubmed/30894453 http://dx.doi.org/10.1534/g3.119.400018 |
_version_ | 1783416703057657856 |
---|---|
author | Grueneberg, Alexander; de los Campos, Gustavo |
author_facet | Grueneberg, Alexander; de los Campos, Gustavo |
author_sort | Grueneberg, Alexander |
collection | PubMed |
description | We created a suite of packages to enable analysis of extremely large genomic data sets (potentially millions of individuals and millions of molecular markers) within the R environment. The package offers: a matrix-like interface for .bed files (PLINK’s binary format for genotype data), a novel class of linked arrays that allows linking data stored in multiple files to form a single array accessible from the R computing environment, methods for parallel computing capabilities that can carry out computations on very large data sets without loading the entire data into memory and a basic set of methods for statistical genetic analyses. The package is accessible through CRAN and GitHub. In this note, we describe the classes and methods implemented in each of the packages that make the suite and illustrate the use of the packages using data from the UK Biobank. |
format | Online Article Text |
id | pubmed-6505159 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Genetics Society of America |
record_format | MEDLINE/PubMed |
spelling | pubmed-65051592019-05-21 BGData - A Suite of R Packages for Genomic Analysis with Big Data Grueneberg, Alexander de los Campos, Gustavo G3 (Bethesda) Software and Data Resources We created a suite of packages to enable analysis of extremely large genomic data sets (potentially millions of individuals and millions of molecular markers) within the R environment. The package offers: a matrix-like interface for .bed files (PLINK’s binary format for genotype data), a novel class of linked arrays that allows linking data stored in multiple files to form a single array accessible from the R computing environment, methods for parallel computing capabilities that can carry out computations on very large data sets without loading the entire data into memory and a basic set of methods for statistical genetic analyses. The package is accessible through CRAN and GitHub. In this note, we describe the classes and methods implemented in each of the packages that make the suite and illustrate the use of the packages using data from the UK Biobank. Genetics Society of America 2019-03-20 /pmc/articles/PMC6505159/ /pubmed/30894453 http://dx.doi.org/10.1534/g3.119.400018 Text en Copyright © 2019 Grueneberg, de los Campos http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Software and Data Resources Grueneberg, Alexander de los Campos, Gustavo BGData - A Suite of R Packages for Genomic Analysis with Big Data |
title | BGData - A Suite of R Packages for Genomic Analysis with Big Data |
title_full | BGData - A Suite of R Packages for Genomic Analysis with Big Data |
title_fullStr | BGData - A Suite of R Packages for Genomic Analysis with Big Data |
title_full_unstemmed | BGData - A Suite of R Packages for Genomic Analysis with Big Data |
title_short | BGData - A Suite of R Packages for Genomic Analysis with Big Data |
title_sort | bgdata - a suite of r packages for genomic analysis with big data |
topic | Software and Data Resources |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6505159/ https://www.ncbi.nlm.nih.gov/pubmed/30894453 http://dx.doi.org/10.1534/g3.119.400018 |
work_keys_str_mv | AT gruenebergalexander bgdataasuiteofrpackagesforgenomicanalysiswithbigdata AT deloscamposgustavo bgdataasuiteofrpackagesforgenomicanalysiswithbigdata |