Molecular Acquisition, Cleaning and Evaluation in R (MACER) - A tool to assemble molecular marker datasets from BOLD and GenBank

Bibliographic Details
Main authors: Young, Robert G; Gill, Rekkab; Gillis, Daniel; Hanner, Robert H
Format: Online Article Text
Language: English
Published: Pensoft Publishers 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8443542/
https://www.ncbi.nlm.nih.gov/pubmed/34594153
http://dx.doi.org/10.3897/BDJ.9.e71378
Description
Summary: Molecular sequence data are an essential component of many biological fields of study. The strength of these data is in their ability to be centralised and compared across research studies. There are many online repositories for molecular sequence data, some of which are very large accumulations of varying data types, such as NCBI’s GenBank. Due to the size and complexity of the data in these repositories, challenges arise in searching for data of interest. While data repositories exist for molecular markers, taxa and other specific research interests, repositories may not contain, or be suitable for, more specific applications. Manually accessing, searching, downloading, accumulating, dereplicating and cleaning data to construct project-specific datasets is time-consuming. In addition, the manual assembly of datasets presents challenges with reproducibility. Here, we present the MACER package to assist researchers in assembling molecular datasets and provide reproducibility in the process.