Structural Analysis of Biodiversity

Large, recently available genomic databases cover a wide range of life forms, suggesting an opportunity for insights into the genetic structure of biodiversity. In this study we refine our recently described technique using indicator vectors to analyze and visualize nucleotide sequences. The indicator vect...

Bibliographic Details
Main authors: Sirovich, Lawrence; Stoeckle, Mark Y.; Zhang, Yu
Format: Text
Language: English
Published: Public Library of Science 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2827552/
https://www.ncbi.nlm.nih.gov/pubmed/20195371
http://dx.doi.org/10.1371/journal.pone.0009266
author Sirovich, Lawrence
Stoeckle, Mark Y.
Zhang, Yu
collection PubMed
description Large, recently available genomic databases cover a wide range of life forms, suggesting an opportunity for insights into the genetic structure of biodiversity. In this study we refine our recently described technique using indicator vectors to analyze and visualize nucleotide sequences. The indicator vector approach generates correlation matrices, dubbed Klee diagrams, which represent a novel way of assembling and viewing large genomic datasets. To explore its potential utility, here we apply the improved algorithm to a collection of almost 17,000 DNA barcode sequences covering 12 widely separated animal taxa, demonstrating that indicator vectors for classification gave correct assignments in all 11,000 test cases. Indicator vector analysis revealed discontinuities corresponding to species- and higher-level taxonomic divisions, suggesting an efficient approach to classification of organisms from poorly studied groups. Compared with standard distance metrics, indicator vectors preserve diagnostic character probabilities, enable automated classification of test sequences, and generate high-information-density single-page displays. These results support the application of indicator vectors for comparative analysis of large nucleotide data sets and raise the prospect of gaining insight into broad-scale patterns in the genetic structure of biodiversity.
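The description above mentions group-level vectors, a correlation matrix between groups, and classification of test sequences. The sketch below illustrates that general idea only. It is not the authors' algorithm: it assumes aligned sequences of equal length, substitutes a plain per-group mean of one-hot encoded sequences for the indicator vectors, and uses Pearson correlation. All function names and the toy sequences are hypothetical.

```python
from math import sqrt

BASES = "ACGT"

def one_hot(seq):
    """Encode an aligned nucleotide sequence as a flat 0/1 vector (4 entries per site)."""
    vec = []
    for base in seq.upper():
        vec.extend(1.0 if base == b else 0.0 for b in BASES)
    return vec

def mean_vector(vectors):
    """Average equal-length vectors element-wise."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def correlation(u, v):
    """Pearson correlation between two equal-length vectors."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    du = [x - mu for x in u]
    dv = [x - mv for x in v]
    num = sum(a * b for a, b in zip(du, dv))
    den = sqrt(sum(a * a for a in du) * sum(b * b for b in dv))
    return num / den if den else 0.0

def group_profiles(groups):
    """Assumption: a group's profile is the mean one-hot vector of its sequences."""
    return {name: mean_vector([one_hot(s) for s in seqs])
            for name, seqs in groups.items()}

def correlation_matrix(profiles):
    """Return sorted group names and the matrix of pairwise profile correlations."""
    names = sorted(profiles)
    rows = [[correlation(profiles[a], profiles[b]) for b in names] for a in names]
    return names, rows

def classify(seq, profiles):
    """Assign a test sequence to the group whose profile it correlates with most."""
    vec = one_hot(seq)
    return max(profiles, key=lambda name: correlation(vec, profiles[name]))

# Toy data, invented for illustration; not taken from the barcode dataset.
groups = {
    "taxon_A": ["ACGTACGT", "ACGTACGA"],
    "taxon_B": ["TTGCATGC", "TTGCATGG"],
}
profiles = group_profiles(groups)
names, matrix = correlation_matrix(profiles)
predicted = classify("ACGTACGC", profiles)
```

Rendering `matrix` as a heat map with groups ordered by taxonomy would give a display loosely comparable to the correlation matrices the description calls Klee diagrams, under the same simplifying assumptions.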
format Text
id pubmed-2827552
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-2827552 2010-03-02 Structural Analysis of Biodiversity Sirovich, Lawrence Stoeckle, Mark Y. Zhang, Yu PLoS One Research Article Public Library of Science 2010-02-24 /pmc/articles/PMC2827552/ /pubmed/20195371 http://dx.doi.org/10.1371/journal.pone.0009266 Text en Sirovich et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Structural Analysis of Biodiversity
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2827552/
https://www.ncbi.nlm.nih.gov/pubmed/20195371
http://dx.doi.org/10.1371/journal.pone.0009266