Structural Analysis of Biodiversity
Large, recently-available genomic databases cover a wide range of life forms, suggesting opportunity for insights into genetic structure of biodiversity. In this study we refine our recently-described technique using indicator vectors to analyze and visualize nucleotide sequences. The indicator vect...
Main authors: | Sirovich, Lawrence; Stoeckle, Mark Y.; Zhang, Yu |
---|---|
Format: | Text |
Language: | English |
Published: | Public Library of Science 2010 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2827552/ https://www.ncbi.nlm.nih.gov/pubmed/20195371 http://dx.doi.org/10.1371/journal.pone.0009266 |
_version_ | 1782177963588976640 |
---|---|
author | Sirovich, Lawrence Stoeckle, Mark Y. Zhang, Yu |
author_sort | Sirovich, Lawrence |
collection | PubMed |
description | Large, recently-available genomic databases cover a wide range of life forms, suggesting opportunity for insights into genetic structure of biodiversity. In this study we refine our recently-described technique using indicator vectors to analyze and visualize nucleotide sequences. The indicator vector approach generates correlation matrices, dubbed Klee diagrams, which represent a novel way of assembling and viewing large genomic datasets. To explore its potential utility, here we apply the improved algorithm to a collection of almost 17000 DNA barcode sequences covering 12 widely-separated animal taxa, demonstrating that indicator vectors for classification gave correct assignment in all 11000 test cases. Indicator vector analysis revealed discontinuities corresponding to species- and higher-level taxonomic divisions, suggesting an efficient approach to classification of organisms from poorly-studied groups. As compared to standard distance metrics, indicator vectors preserve diagnostic character probabilities, enable automated classification of test sequences, and generate high-information density single-page displays. These results support application of indicator vectors for comparative analysis of large nucleotide data sets and raise prospect of gaining insight into broad-scale patterns in the genetic structure of biodiversity. |
format | Text |
id | pubmed-2827552 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2010 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-2827552 2010-03-02 Structural Analysis of Biodiversity Sirovich, Lawrence Stoeckle, Mark Y. Zhang, Yu PLoS One Research Article Public Library of Science 2010-02-24 /pmc/articles/PMC2827552/ /pubmed/20195371 http://dx.doi.org/10.1371/journal.pone.0009266 Text en Sirovich et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
title | Structural Analysis of Biodiversity |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2827552/ https://www.ncbi.nlm.nih.gov/pubmed/20195371 http://dx.doi.org/10.1371/journal.pone.0009266 |
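
The description above names indicator vectors and the correlation matrices built from them (Klee diagrams) but does not give the algorithm. As a rough illustration only, the sketch below assumes a simplified stand-in: each aligned sequence is one-hot encoded over A, C, G, T, each group is summarized by its per-site base frequencies, the group vectors are centered and scaled to unit length, and their pairwise dot products form the correlation matrix. The taxon names, toy sequences, and all function names are hypothetical, and the published method may construct its indicator vectors differently.

```python
import math

ALPHABET = "ACGT"


def one_hot(seq):
    """Encode an aligned sequence as a flat 0/1 vector of length 4 * len(seq)."""
    vec = []
    for base in seq.upper():
        vec.extend(1.0 if base == letter else 0.0 for letter in ALPHABET)
    return vec


def group_profile(seqs):
    """Average the one-hot vectors of a group: per-site base frequencies."""
    encoded = [one_hot(s) for s in seqs]
    return [sum(col) / len(encoded) for col in zip(*encoded)]


def center_and_normalize(vectors):
    """Subtract the mean across groups, then scale each vector to unit length."""
    mean = [sum(col) / len(vectors) for col in zip(*vectors)]
    unit = []
    for v in vectors:
        centered = [x - m for x, m in zip(v, mean)]
        norm = math.sqrt(sum(x * x for x in centered)) or 1.0
        unit.append([x / norm for x in centered])
    return unit, mean


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def correlation_matrix(unit_vectors):
    """Pairwise dot products of unit vectors (the matrix a Klee diagram displays)."""
    return [[dot(u, v) for v in unit_vectors] for u in unit_vectors]


def classify(seq, unit_vectors, mean, labels):
    """Assign a test sequence to the group whose vector it correlates with most."""
    centered = [x - m for x, m in zip(one_hot(seq), mean)]
    scores = [dot(centered, u) for u in unit_vectors]
    return labels[scores.index(max(scores))]


# Hypothetical toy data: three groups of aligned 6-base sequences.
groups = {
    "taxon_a": ["ACGTAC", "ACGTAT"],
    "taxon_b": ["TTGCAC", "TTGCAA"],
    "taxon_c": ["GGATCC", "GGATCA"],
}
labels = list(groups)
profiles = [group_profile(groups[name]) for name in labels]
indicators, mean = center_and_normalize(profiles)
klee = correlation_matrix(indicators)
predicted = classify("TTGCAT", indicators, mean, labels)
```

Here `klee` is a 3 x 3 symmetric matrix with ones on the diagonal, and `predicted` holds the label assigned to the test sequence. Real barcode data would first need alignment to a common length, which this sketch takes for granted.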