
Molecular Acquisition, Cleaning and Evaluation in R (MACER) - A tool to assemble molecular marker datasets from BOLD and GenBank

Molecular sequence data is an essential component for many biological fields of study. The strength of these data is in their ability to be centralised and compared across research studies. There are many online repositories for molecular sequence data, some of which are very large accumulations of...


Bibliographic Details
Main Authors: Young, Robert G; Gill, Rekkab; Gillis, Daniel; Hanner, Robert H
Format: Online Article Text
Language: English
Published: Pensoft Publishers 2021
Subjects: R Package
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8443542/
https://www.ncbi.nlm.nih.gov/pubmed/34594153
http://dx.doi.org/10.3897/BDJ.9.e71378
author Young, Robert G
Gill, Rekkab
Gillis, Daniel
Hanner, Robert H
collection PubMed
description Molecular sequence data is an essential component for many biological fields of study. The strength of these data is in their ability to be centralised and compared across research studies. There are many online repositories for molecular sequence data, some of which are very large accumulations of varying data types like NCBI’s GenBank. Due to the size and the complexity of the data in these repositories, challenges arise in searching for data of interest. While data repositories exist for molecular markers, taxa and other specific research interests, repositories may not contain, or be suitable for, more specific applications. Manually accessing, searching, downloading, accumulating, dereplicating and cleaning data to construct project-specific datasets is time-consuming. In addition, the manual assembly of datasets presents challenges with reproducibility. Here, we present the MACER package to assist researchers in assembling molecular datasets and provide reproducibility in the process.
format Online
Article
Text
id pubmed-8443542
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Pensoft Publishers
record_format MEDLINE/PubMed
spelling pubmed-8443542 2021-09-29 Molecular Acquisition, Cleaning and Evaluation in R (MACER) - A tool to assemble molecular marker datasets from BOLD and GenBank Young, Robert G Gill, Rekkab Gillis, Daniel Hanner, Robert H Biodivers Data J R Package Molecular sequence data is an essential component for many biological fields of study. The strength of these data is in their ability to be centralised and compared across research studies. There are many online repositories for molecular sequence data, some of which are very large accumulations of varying data types like NCBI’s GenBank. Due to the size and the complexity of the data in these repositories, challenges arise in searching for data of interest. While data repositories exist for molecular markers, taxa and other specific research interests, repositories may not contain, or be suitable for, more specific applications. Manually accessing, searching, downloading, accumulating, dereplicating and cleaning data to construct project-specific datasets is time-consuming. In addition, the manual assembly of datasets presents challenges with reproducibility. Here, we present the MACER package to assist researchers in assembling molecular datasets and provide reproducibility in the process. Pensoft Publishers 2021-09-08 /pmc/articles/PMC8443542/ /pubmed/34594153 http://dx.doi.org/10.3897/BDJ.9.e71378 Text en Robert G Young, Rekkab Gill, Daniel Gillis, Robert H Hanner https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Molecular Acquisition, Cleaning and Evaluation in R (MACER) - A tool to assemble molecular marker datasets from BOLD and GenBank
topic R Package
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8443542/
https://www.ncbi.nlm.nih.gov/pubmed/34594153
http://dx.doi.org/10.3897/BDJ.9.e71378