Optimizations for the EcoPod field identification tool
BACKGROUND: We sketch our species identification tool for palm sized computers that helps knowledgeable observers with census activities. An algorithm turns an identification matrix into a minimal length series of questions that guide the operator towards identification. Historic observation data fr...
Main authors: | Manoharan, Aswath; Stamberger, Jeannie; Yu, YuanYuan; Paepcke, Andreas |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2008 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2322985/ https://www.ncbi.nlm.nih.gov/pubmed/18366649 http://dx.doi.org/10.1186/1471-2105-9-150 |
_version_ | 1782152608289390592 |
---|---|
author | Manoharan, Aswath Stamberger, Jeannie Yu, YuanYuan Paepcke, Andreas |
author_facet | Manoharan, Aswath Stamberger, Jeannie Yu, YuanYuan Paepcke, Andreas |
author_sort | Manoharan, Aswath |
collection | PubMed |
description | BACKGROUND: We sketch our species identification tool for palm sized computers that helps knowledgeable observers with census activities. An algorithm turns an identification matrix into a minimal length series of questions that guide the operator towards identification. Historic observation data from the census geographic area helps minimize question volume. We explore how much historic data is required to boost performance, and whether the use of history negatively impacts identification of rare species. We also explore how characteristics of the matrix interact with the algorithm, and how best to predict the probability of observing a previously unseen species. RESULTS: Point counts of birds taken at Stanford University's Jasper Ridge Biological Preserve between 2000 and 2005 were used to examine the algorithm. A computer identified species by correctly answering, and counting the algorithm's questions. We also explored how the character density of the key matrix and the theoretical minimum number of questions for each bird in the matrix influenced the algorithm. Our investigation of the required probability smoothing determined whether Laplace smoothing of observation probabilities was sufficient, or whether the more complex Good-Turing technique is required. CONCLUSION: Historic data improved identification speed, but only impacted the top 25% most frequently observed birds. For rare birds the history based algorithms did not impose a noticeable penalty in the number of questions required for identification. For our dataset neither age of the historic data, nor the number of observation years impacted the algorithm. Density of characters for different taxa in the identification matrix did not impact the algorithms. Intrinsic differences in identifying different birds did affect the algorithm, but the differences affected the baseline method of not using historic data to exactly the same degree. 
We found that Laplace smoothing performed better for rare species than Simple Good-Turing, and that, contrary to expectation, the technique did not then adversely affect identification performance for frequently observed birds. |
format | Text |
id | pubmed-2322985 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2008 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-23229852008-04-18 Optimizations for the EcoPod field identification tool Manoharan, Aswath Stamberger, Jeannie Yu, YuanYuan Paepcke, Andreas BMC Bioinformatics Research Article BACKGROUND: We sketch our species identification tool for palm sized computers that helps knowledgeable observers with census activities. An algorithm turns an identification matrix into a minimal length series of questions that guide the operator towards identification. Historic observation data from the census geographic area helps minimize question volume. We explore how much historic data is required to boost performance, and whether the use of history negatively impacts identification of rare species. We also explore how characteristics of the matrix interact with the algorithm, and how best to predict the probability of observing a previously unseen species. RESULTS: Point counts of birds taken at Stanford University's Jasper Ridge Biological Preserve between 2000 and 2005 were used to examine the algorithm. A computer identified species by correctly answering, and counting the algorithm's questions. We also explored how the character density of the key matrix and the theoretical minimum number of questions for each bird in the matrix influenced the algorithm. Our investigation of the required probability smoothing determined whether Laplace smoothing of observation probabilities was sufficient, or whether the more complex Good-Turing technique is required. CONCLUSION: Historic data improved identification speed, but only impacted the top 25% most frequently observed birds. For rare birds the history based algorithms did not impose a noticeable penalty in the number of questions required for identification. For our dataset neither age of the historic data, nor the number of observation years impacted the algorithm. Density of characters for different taxa in the identification matrix did not impact the algorithms. 
Intrinsic differences in identifying different birds did affect the algorithm, but the differences affected the baseline method of not using historic data to exactly the same degree. We found that Laplace smoothing performed better for rare species than Simple Good-Turing, and that, contrary to expectation, the technique did not then adversely affect identification performance for frequently observed birds. BioMed Central 2008-03-17 /pmc/articles/PMC2322985/ /pubmed/18366649 http://dx.doi.org/10.1186/1471-2105-9-150 Text en Copyright © 2008 Manoharan et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( (http://creativecommons.org/licenses/by/2.0) ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Manoharan, Aswath Stamberger, Jeannie Yu, YuanYuan Paepcke, Andreas Optimizations for the EcoPod field identification tool |
title | Optimizations for the EcoPod field identification tool |
title_full | Optimizations for the EcoPod field identification tool |
title_fullStr | Optimizations for the EcoPod field identification tool |
title_full_unstemmed | Optimizations for the EcoPod field identification tool |
title_short | Optimizations for the EcoPod field identification tool |
title_sort | optimizations for the ecopod field identification tool |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2322985/ https://www.ncbi.nlm.nih.gov/pubmed/18366649 http://dx.doi.org/10.1186/1471-2105-9-150 |
work_keys_str_mv | AT manoharanaswath optimizationsfortheecopodfieldidentificationtool AT stambergerjeannie optimizationsfortheecopodfieldidentificationtool AT yuyuanyuan optimizationsfortheecopodfieldidentificationtool AT paepckeandreas optimizationsfortheecopodfieldidentificationtool |
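
The description field above names Laplace smoothing of observation probabilities but does not state the formula the authors used. The following is a minimal sketch, assuming the standard add-one form P(s) = (count(s) + alpha) / (N + alpha * S), where N is the number of historic observations and S the number of species in the identification matrix. The function name, the `alpha` parameter, and the species counts are hypothetical and are not taken from the article.

```python
from collections import Counter


def laplace_smoothed_probabilities(observations, species, alpha=1.0):
    """Estimate P(species) from historic observations with add-alpha smoothing.

    Assumption: `species` is the full list of taxa in the identification
    matrix. Observations of taxa outside that list are ignored.
    """
    counts = Counter(observations)
    # Denominator: observations of listed species plus alpha per species.
    total = sum(counts[s] for s in species) + alpha * len(species)
    # Every species, including never-observed ones, gets a nonzero share.
    return {s: (counts[s] + alpha) / total for s in species}


# Hypothetical point-count history: two species seen, one never seen.
species = ["Acorn Woodpecker", "Spotted Towhee", "Varied Thrush"]
observations = ["Acorn Woodpecker"] * 6 + ["Spotted Towhee"] * 2

probs = laplace_smoothed_probabilities(observations, species)
for name in species:
    print(f"{name}: {probs[name]:.3f}")
```

With these made-up counts the unseen species still receives probability 1/11 rather than zero, which is the property that keeps a history-weighted question order from ruling out a previously unobserved bird.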