SA-SSR: a suffix array-based algorithm for exhaustive and efficient SSR discovery in large genetic sequences
Summary: Simple Sequence Repeats (SSRs) are used to address a variety of research questions in a variety of fields (e.g. population genetics, phylogenetics, forensics, etc.), due to their high mutability within and between species. Here, we present an innovative algorithm, SA-SSR, based on suffix an...
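Main authors: | Pickett, B. D.; Karlinsey, S. M.; Penrod, C. E.; Cormier, M. J.; Ebbert, M. T. W.; Shiozawa, D. K.; Whipple, C. J.; Ridge, P. G. |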
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2016 |
Subjects: | Applications Notes |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5013907/ https://www.ncbi.nlm.nih.gov/pubmed/27170037 http://dx.doi.org/10.1093/bioinformatics/btw298 |
_version_ | 1782452236593397760 |
---|---|
author | Pickett, B. D. Karlinsey, S. M. Penrod, C. E. Cormier, M. J. Ebbert, M. T. W. Shiozawa, D. K. Whipple, C. J. Ridge, P. G. |
author_facet | Pickett, B. D. Karlinsey, S. M. Penrod, C. E. Cormier, M. J. Ebbert, M. T. W. Shiozawa, D. K. Whipple, C. J. Ridge, P. G. |
author_sort | Pickett, B. D. |
collection | PubMed |
description | Summary: Simple Sequence Repeats (SSRs) are used to address a variety of research questions in a variety of fields (e.g. population genetics, phylogenetics, forensics, etc.), due to their high mutability within and between species. Here, we present an innovative algorithm, SA-SSR, based on suffix and longest common prefix arrays for efficiently detecting SSRs in large sets of sequences. Existing SSR detection applications are hampered by one or more limitations (i.e. speed, accuracy, ease-of-use, etc.). Our algorithm addresses these challenges while being the most comprehensive and correct SSR detection software available. SA-SSR is 100% accurate and detected >1000 more SSRs than the second best algorithm, while offering greater control to the user than any existing software. Availability and implementation: SA-SSR is freely available at http://github.com/ridgelab/SA-SSR Contact: perry.ridge@byu.edu Supplementary information: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-5013907 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-50139072016-09-12 SA-SSR: a suffix array-based algorithm for exhaustive and efficient SSR discovery in large genetic sequences Pickett, B. D. Karlinsey, S. M. Penrod, C. E. Cormier, M. J. Ebbert, M. T. W. Shiozawa, D. K. Whipple, C. J. Ridge, P. G. Bioinformatics Applications Notes Summary: Simple Sequence Repeats (SSRs) are used to address a variety of research questions in a variety of fields (e.g. population genetics, phylogenetics, forensics, etc.), due to their high mutability within and between species. Here, we present an innovative algorithm, SA-SSR, based on suffix and longest common prefix arrays for efficiently detecting SSRs in large sets of sequences. Existing SSR detection applications are hampered by one or more limitations (i.e. speed, accuracy, ease-of-use, etc.). Our algorithm addresses these challenges while being the most comprehensive and correct SSR detection software available. SA-SSR is 100% accurate and detected >1000 more SSRs than the second best algorithm, while offering greater control to the user than any existing software. Availability and implementation: SA-SSR is freely available at http://github.com/ridgelab/SA-SSR Contact: perry.ridge@byu.edu Supplementary information: Supplementary data are available at Bioinformatics online. Oxford University Press 2016-09-01 2016-05-11 /pmc/articles/PMC5013907/ /pubmed/27170037 http://dx.doi.org/10.1093/bioinformatics/btw298 Text en © The Author 2016. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Applications Notes Pickett, B. D. Karlinsey, S. M. Penrod, C. E. Cormier, M. J. Ebbert, M. T. W. Shiozawa, D. K. Whipple, C. J. Ridge, P. G. SA-SSR: a suffix array-based algorithm for exhaustive and efficient SSR discovery in large genetic sequences |
title | SA-SSR: a suffix array-based algorithm for exhaustive and efficient SSR discovery in large genetic sequences |
title_full | SA-SSR: a suffix array-based algorithm for exhaustive and efficient SSR discovery in large genetic sequences |
title_fullStr | SA-SSR: a suffix array-based algorithm for exhaustive and efficient SSR discovery in large genetic sequences |
title_full_unstemmed | SA-SSR: a suffix array-based algorithm for exhaustive and efficient SSR discovery in large genetic sequences |
title_short | SA-SSR: a suffix array-based algorithm for exhaustive and efficient SSR discovery in large genetic sequences |
title_sort | sa-ssr: a suffix array-based algorithm for exhaustive and efficient ssr discovery in large genetic sequences |
topic | Applications Notes |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5013907/ https://www.ncbi.nlm.nih.gov/pubmed/27170037 http://dx.doi.org/10.1093/bioinformatics/btw298 |
work_keys_str_mv | AT pickettbd sassrasuffixarraybasedalgorithmforexhaustiveandefficientssrdiscoveryinlargegeneticsequences AT karlinseysm sassrasuffixarraybasedalgorithmforexhaustiveandefficientssrdiscoveryinlargegeneticsequences AT penrodce sassrasuffixarraybasedalgorithmforexhaustiveandefficientssrdiscoveryinlargegeneticsequences AT cormiermj sassrasuffixarraybasedalgorithmforexhaustiveandefficientssrdiscoveryinlargegeneticsequences AT ebbertmtw sassrasuffixarraybasedalgorithmforexhaustiveandefficientssrdiscoveryinlargegeneticsequences AT shiozawadk sassrasuffixarraybasedalgorithmforexhaustiveandefficientssrdiscoveryinlargegeneticsequences AT whipplecj sassrasuffixarraybasedalgorithmforexhaustiveandefficientssrdiscoveryinlargegeneticsequences AT ridgepg sassrasuffixarraybasedalgorithmforexhaustiveandefficientssrdiscoveryinlargegeneticsequences |