Colombia, an unknown genetic diversity in the era of Big Data
BACKGROUND: Latin America harbors some of the most biodiverse countries in the world, including Colombia. Despite the increasing use of cutting-edge technologies in genomics and bioinformatics in several biological science fields around the world, the region has fallen behind in the inclusion of the...
Main authors: | Noreña – P, Alejandra; González Muñoz, Andrea; Mosquera-Rendón, Jeanneth; Botero, Kelly; Cristancho, Marco A. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2018 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6288850/ https://www.ncbi.nlm.nih.gov/pubmed/30537922 http://dx.doi.org/10.1186/s12864-018-5194-8 |
_version_ | 1783379869887889408 |
---|---|
author | Noreña – P, Alejandra; González Muñoz, Andrea; Mosquera-Rendón, Jeanneth; Botero, Kelly; Cristancho, Marco A. |
author_facet | Noreña – P, Alejandra; González Muñoz, Andrea; Mosquera-Rendón, Jeanneth; Botero, Kelly; Cristancho, Marco A. |
author_sort | Noreña – P, Alejandra |
collection | PubMed |
description | BACKGROUND: Latin America harbors some of the most biodiverse countries in the world, including Colombia. Despite the increasing use of cutting-edge technologies in genomics and bioinformatics in several biological science fields around the world, the region has fallen behind in the inclusion of these approaches in biodiversity studies. In this study, we used data mining methods to search in four main public databases of genetic sequences such as: NCBI Nucleotide and BioProject, Pathosystems Resource Integration Center, and Barcode of Life Data Systems databases. We aimed to determine how much of the Colombian biodiversity is contained in genetic data stored in these public databases and how much of this information has been generated by national institutions. Additionally, we compared this data for Colombia with other countries of high biodiversity in Latin America, such as Brazil, Argentina, Costa Rica, Mexico, and Peru. RESULTS: In Nucleotide, we found that 66.84% of total records for Colombia have been published at the national level, and this data represents less than 5% of the total number of species reported for the country. In BioProject, 70.46% of records were generated by national institutions and the great majority of them is represented by microorganisms. In BOLD Systems, 26% of records have been submitted by national institutions, representing 258 species for Colombia. This number of species reported for Colombia span approximately 0.46% of the total biodiversity reported for the country (56,343 species). Finally, in PATRIC database, 13.25% of the reported sequences were contributed by national institutions. Colombia has a better biodiversity representation in public databases in comparison to other Latin American countries, like Costa Rica and Peru. Mexico and Argentina have the highest representation of species at the national level, despite Brazil and Colombia, which actually hold the first and second places in biodiversity worldwide. 
CONCLUSIONS: Our findings show gaps in the representation of the Colombian biodiversity at the molecular and genetic levels in widely consulted public databases. National funding for high-throughput molecular research, NGS technologies costs, and access to genetic resources are limiting factors. This fact should be taken as an opportunity to foster the development of collaborative projects between research groups in the Latin American region to study the vast biodiversity of these countries using ‘omics’ technologies. |
format | Online Article Text |
id | pubmed-6288850 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6288850 2018-12-14 Colombia, an unknown genetic diversity in the era of Big Data Noreña – P, Alejandra González Muñoz, Andrea Mosquera-Rendón, Jeanneth Botero, Kelly Cristancho, Marco A. BMC Genomics Research BioMed Central 2018-12-11 /pmc/articles/PMC6288850/ /pubmed/30537922 http://dx.doi.org/10.1186/s12864-018-5194-8 Text en © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Noreña – P, Alejandra González Muñoz, Andrea Mosquera-Rendón, Jeanneth Botero, Kelly Cristancho, Marco A. Colombia, an unknown genetic diversity in the era of Big Data |
title | Colombia, an unknown genetic diversity in the era of Big Data |
title_full | Colombia, an unknown genetic diversity in the era of Big Data |
title_fullStr | Colombia, an unknown genetic diversity in the era of Big Data |
title_full_unstemmed | Colombia, an unknown genetic diversity in the era of Big Data |
title_short | Colombia, an unknown genetic diversity in the era of Big Data |
title_sort | colombia, an unknown genetic diversity in the era of big data |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6288850/ https://www.ncbi.nlm.nih.gov/pubmed/30537922 http://dx.doi.org/10.1186/s12864-018-5194-8 |
work_keys_str_mv | AT norenapalejandra colombiaanunknowngeneticdiversityintheeraofbigdata AT gonzalezmunozandrea colombiaanunknowngeneticdiversityintheeraofbigdata AT mosquerarendonjeanneth colombiaanunknowngeneticdiversityintheeraofbigdata AT boterokelly colombiaanunknowngeneticdiversityintheeraofbigdata AT cristanchomarcoa colombiaanunknowngeneticdiversityintheeraofbigdata |