Molecular Acquisition, Cleaning and Evaluation in R (MACER) - A tool to assemble molecular marker datasets from BOLD and GenBank
Molecular sequence data is an essential component for many biological fields of study. The strength of these data is in their ability to be centralised and compared across research studies. There are many online repositories for molecular sequence data, some of which are very large accumulations of...
Main Authors: | Young, Robert G; Gill, Rekkab; Gillis, Daniel; Hanner, Robert H |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Pensoft Publishers 2021 |
Subjects: | R Package |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8443542/ https://www.ncbi.nlm.nih.gov/pubmed/34594153 http://dx.doi.org/10.3897/BDJ.9.e71378 |
_version_ | 1783753202983763968 |
---|---|
author | Young, Robert G; Gill, Rekkab; Gillis, Daniel; Hanner, Robert H |
author_sort | Young, Robert G |
collection | PubMed |
description | Molecular sequence data is an essential component for many biological fields of study. The strength of these data is in their ability to be centralised and compared across research studies. There are many online repositories for molecular sequence data, some of which are very large accumulations of varying data types like NCBI’s GenBank. Due to the size and the complexity of the data in these repositories, challenges arise in searching for data of interest. While data repositories exist for molecular markers, taxa and other specific research interests, repositories may not contain, or be suitable for, more specific applications. Manually accessing, searching, downloading, accumulating, dereplicating and cleaning data to construct project-specific datasets is time-consuming. In addition, the manual assembly of datasets presents challenges with reproducibility. Here, we present the MACER package to assist researchers in assembling molecular datasets and provide reproducibility in the process. |
format | Online Article Text |
id | pubmed-8443542 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Pensoft Publishers |
record_format | MEDLINE/PubMed |
spelling | pubmed-8443542 2021-09-29 Molecular Acquisition, Cleaning and Evaluation in R (MACER) - A tool to assemble molecular marker datasets from BOLD and GenBank Young, Robert G Gill, Rekkab Gillis, Daniel Hanner, Robert H Biodivers Data J R Package Molecular sequence data is an essential component for many biological fields of study. The strength of these data is in their ability to be centralised and compared across research studies. There are many online repositories for molecular sequence data, some of which are very large accumulations of varying data types like NCBI’s GenBank. Due to the size and the complexity of the data in these repositories, challenges arise in searching for data of interest. While data repositories exist for molecular markers, taxa and other specific research interests, repositories may not contain, or be suitable for, more specific applications. Manually accessing, searching, downloading, accumulating, dereplicating and cleaning data to construct project-specific datasets is time-consuming. In addition, the manual assembly of datasets presents challenges with reproducibility. Here, we present the MACER package to assist researchers in assembling molecular datasets and provide reproducibility in the process. Pensoft Publishers 2021-09-08 /pmc/articles/PMC8443542/ /pubmed/34594153 http://dx.doi.org/10.3897/BDJ.9.e71378 Text en Robert G Young, Rekkab Gill, Daniel Gillis, Robert H Hanner https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
title | Molecular Acquisition, Cleaning and Evaluation in R (MACER) - A tool to assemble molecular marker datasets from BOLD and GenBank |
topic | R Package |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8443542/ https://www.ncbi.nlm.nih.gov/pubmed/34594153 http://dx.doi.org/10.3897/BDJ.9.e71378 |