SSRMMD: A Rapid and Accurate Algorithm for Mining SSR Feature Loci and Candidate Polymorphic SSRs Based on Assembled Sequences
Microsatellites or simple sequence repeats (SSRs) are short tandem repeats of DNA widespread in genomes and transcriptomes of diverse organisms and are used in various genetic studies. Few software programs that mine SSRs can be further used to mine polymorphic SSRs, and these programs have poor por...
Main authors: | Gou, Xiangjian; Shi, Haoran; Yu, Shifan; Wang, Zhiqiang; Li, Caixia; Liu, Shihang; Ma, Jian; Chen, Guangdeng; Liu, Tao; Liu, Yaxi |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2020 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7398111/ https://www.ncbi.nlm.nih.gov/pubmed/32849772 http://dx.doi.org/10.3389/fgene.2020.00706 |
author | Gou, Xiangjian Shi, Haoran Yu, Shifan Wang, Zhiqiang Li, Caixia Liu, Shihang Ma, Jian Chen, Guangdeng Liu, Tao Liu, Yaxi |
---|---|
collection | PubMed |
description | Microsatellites or simple sequence repeats (SSRs) are short tandem repeats of DNA widespread in genomes and transcriptomes of diverse organisms and are used in various genetic studies. Few software programs that mine SSRs can be further used to mine polymorphic SSRs, and these programs have poor portability, have slow computational speed, are highly dependent on other programs, and have low marker development rates. In this study, we develop an algorithm named Simple Sequence Repeat Molecular Marker Developer (SSRMMD), which uses improved regular expressions to rapidly and exhaustively mine perfect SSR loci from any size of assembled sequence. To mine polymorphic SSRs, SSRMMD uses a novel three-stage method to assess the conservativeness of SSR flanking sequences and then uses the sliding window method to fragment each assembled sequence to assess its uniqueness. Furthermore, molecular biology assays support the polymorphic SSRs identified by SSRMMD. SSRMMD is implemented using the Perl programming language and can be downloaded from https://github.com/GouXiangJian/SSRMMD. |
format | Online Article Text |
id | pubmed-7398111 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
journal | Front Genet |
publication_date | 2020-07-27 |
license | Copyright © 2020 Gou, Shi, Yu, Wang, Li, Liu, Ma, Chen, Liu and Liu. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | SSRMMD: A Rapid and Accurate Algorithm for Mining SSR Feature Loci and Candidate Polymorphic SSRs Based on Assembled Sequences |
topic | Genetics |
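
The description above says that SSRMMD uses improved regular expressions to mine perfect SSR loci, but this record does not give the actual patterns. As a rough illustration only, the general idea of regex-based SSR mining can be sketched in Python with a backreference pattern. SSRMMD itself is written in Perl. The function names, the repeat thresholds in `MIN_REPEATS`, and the output tuple layout below are all assumptions made for this example, not taken from the tool.

```python
import re

# Hypothetical minimum repeat counts per motif length (assumed for this
# sketch; not the defaults of SSRMMD).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """Return True if the motif is not itself a repeat of a shorter unit."""
    # Classic trick: s is a repetition of a shorter string iff s occurs
    # inside (s + s) with the first and last characters removed.
    return motif not in (motif + motif)[1:-1]


def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Find perfect SSRs as (start, end, motif, repeats), 0-based, end-exclusive."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-base motif followed by at least n - 1 exact copies of itself.
        pattern = re.compile(rf"([ACGT]{{{k}}})\1{{{n - 1},}}")
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip e.g. "AA" so a poly-A run is reported once, with motif "A".
            if not is_primitive(motif):
                continue
            hits.append((m.start(), m.end(), motif, (m.end() - m.start()) // k))
    return sorted(hits)


if __name__ == "__main__":
    demo = "GGC" + "AT" * 7 + "CCGT" + "A" * 12 + "TTC" + "CAG" * 5 + "T"
    for start, end, motif, repeats in find_perfect_ssrs(demo):
        print(f"{start}\t{end}\t({motif}){repeats}")
```

The sketch scans the sequence once per motif length, which is simple but not necessarily how a tool tuned for speed would do it. It also makes no attempt at the flanking-sequence conservativeness check or the sliding-window uniqueness check that the description mentions.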