
4SpecID: Reference DNA Libraries Auditing and Annotation System for Forensic Applications

Forensic genetics is a fast-growing field that frequently requires DNA-based taxonomy, namely when the evidence consists of parts of specimens, often highly processed in food, potions, or ointments. Reference DNA sequence libraries, such as BOLD or GenBank, are essential tools for taxonomic assignment, parti...


Bibliographic Details
Main Authors: Neto, Luís; Pinto, Nádia; Proença, Alberto; Amorim, António; Conde-Sousa, Eduardo
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7824288/
https://www.ncbi.nlm.nih.gov/pubmed/33401773
http://dx.doi.org/10.3390/genes12010061
_version_ 1783640040785575936
author Neto, Luís
Pinto, Nádia
Proença, Alberto
Amorim, António
Conde-Sousa, Eduardo
author_facet Neto, Luís
Pinto, Nádia
Proença, Alberto
Amorim, António
Conde-Sousa, Eduardo
author_sort Neto, Luís
collection PubMed
description Forensic genetics is a fast-growing field that frequently requires DNA-based taxonomy, namely when the evidence consists of parts of specimens, often highly processed in food, potions, or ointments. Reference DNA sequence libraries, such as BOLD or GenBank, are essential tools for taxonomic assignment, particularly when morphology is inadequate for classification. The auditing and curation of these datasets require reliable mechanisms, preferably with automated data preprocessing. Software tools have been developed to grade these datasets using the number of records as the primary criterion, which is not compliant with forensic standards, where the priority is validation from independent sources. 4SpecID is an efficient and freely available software tool developed to audit and annotate reference libraries, specifically designed for forensic applications. Its intuitive, user-friendly interface can access virtually any database and includes specific data mining functions tuned for the widespread BOLD repositories. The tool was evaluated on a MacBook laptop and a dual-Xeon server with a large BOLD dataset (Culicidae, 36,115 records); the best execution time to grade the dataset on the laptop was 0.28 s. Datasets of the Bovidae and Felidae families were used to evaluate the quality of the tool and the relevance of validation from independent sources.
format Online
Article
Text
id pubmed-7824288
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-7824288 2021-01-24 4SpecID: Reference DNA Libraries Auditing and Annotation System for Forensic Applications Neto, Luís Pinto, Nádia Proença, Alberto Amorim, António Conde-Sousa, Eduardo Genes (Basel) Article Forensic genetics is a fast-growing field that frequently requires DNA-based taxonomy, namely when the evidence consists of parts of specimens, often highly processed in food, potions, or ointments. Reference DNA sequence libraries, such as BOLD or GenBank, are essential tools for taxonomic assignment, particularly when morphology is inadequate for classification. The auditing and curation of these datasets require reliable mechanisms, preferably with automated data preprocessing. Software tools have been developed to grade these datasets using the number of records as the primary criterion, which is not compliant with forensic standards, where the priority is validation from independent sources. 4SpecID is an efficient and freely available software tool developed to audit and annotate reference libraries, specifically designed for forensic applications. Its intuitive, user-friendly interface can access virtually any database and includes specific data mining functions tuned for the widespread BOLD repositories. The tool was evaluated on a MacBook laptop and a dual-Xeon server with a large BOLD dataset (Culicidae, 36,115 records); the best execution time to grade the dataset on the laptop was 0.28 s. Datasets of the Bovidae and Felidae families were used to evaluate the quality of the tool and the relevance of validation from independent sources. MDPI 2021-01-02 /pmc/articles/PMC7824288/ /pubmed/33401773 http://dx.doi.org/10.3390/genes12010061 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Neto, Luís
Pinto, Nádia
Proença, Alberto
Amorim, António
Conde-Sousa, Eduardo
4SpecID: Reference DNA Libraries Auditing and Annotation System for Forensic Applications
title 4SpecID: Reference DNA Libraries Auditing and Annotation System for Forensic Applications
title_full 4SpecID: Reference DNA Libraries Auditing and Annotation System for Forensic Applications
title_fullStr 4SpecID: Reference DNA Libraries Auditing and Annotation System for Forensic Applications
title_full_unstemmed 4SpecID: Reference DNA Libraries Auditing and Annotation System for Forensic Applications
title_short 4SpecID: Reference DNA Libraries Auditing and Annotation System for Forensic Applications
title_sort 4specid: reference dna libraries auditing and annotation system for forensic applications
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7824288/
https://www.ncbi.nlm.nih.gov/pubmed/33401773
http://dx.doi.org/10.3390/genes12010061
work_keys_str_mv AT netoluis 4specidreferencednalibrariesauditingandannotationsystemforforensicapplications
AT pintonadia 4specidreferencednalibrariesauditingandannotationsystemforforensicapplications
AT proencaalberto 4specidreferencednalibrariesauditingandannotationsystemforforensicapplications
AT amorimantonio 4specidreferencednalibrariesauditingandannotationsystemforforensicapplications
AT condesousaeduardo 4specidreferencednalibrariesauditingandannotationsystemforforensicapplications