SSRMMD: A Rapid and Accurate Algorithm for Mining SSR Feature Loci and Candidate Polymorphic SSRs Based on Assembled Sequences

Microsatellites or simple sequence repeats (SSRs) are short tandem repeats of DNA widespread in genomes and transcriptomes of diverse organisms and are used in various genetic studies. Few software programs that mine SSRs can be further used to mine polymorphic SSRs, and these programs have poor por...

Bibliographic Details
Main Authors: Gou, Xiangjian, Shi, Haoran, Yu, Shifan, Wang, Zhiqiang, Li, Caixia, Liu, Shihang, Ma, Jian, Chen, Guangdeng, Liu, Tao, Liu, Yaxi
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7398111/
https://www.ncbi.nlm.nih.gov/pubmed/32849772
http://dx.doi.org/10.3389/fgene.2020.00706
_version_ 1783565899586863104
author Gou, Xiangjian
Shi, Haoran
Yu, Shifan
Wang, Zhiqiang
Li, Caixia
Liu, Shihang
Ma, Jian
Chen, Guangdeng
Liu, Tao
Liu, Yaxi
author_facet Gou, Xiangjian
Shi, Haoran
Yu, Shifan
Wang, Zhiqiang
Li, Caixia
Liu, Shihang
Ma, Jian
Chen, Guangdeng
Liu, Tao
Liu, Yaxi
author_sort Gou, Xiangjian
collection PubMed
description Microsatellites or simple sequence repeats (SSRs) are short tandem repeats of DNA widespread in genomes and transcriptomes of diverse organisms and are used in various genetic studies. Few software programs that mine SSRs can be further used to mine polymorphic SSRs, and these programs have poor portability, have slow computational speed, are highly dependent on other programs, and have low marker development rates. In this study, we develop an algorithm named Simple Sequence Repeat Molecular Marker Developer (SSRMMD), which uses improved regular expressions to rapidly and exhaustively mine perfect SSR loci from any size of assembled sequence. To mine polymorphic SSRs, SSRMMD uses a novel three-stage method to assess the conservativeness of SSR flanking sequences and then uses the sliding window method to fragment each assembled sequence to assess its uniqueness. Furthermore, molecular biology assays support the polymorphic SSRs identified by SSRMMD. SSRMMD is implemented using the Perl programming language and can be downloaded from https://github.com/GouXiangJian/SSRMMD.
format Online
Article
Text
id pubmed-7398111
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-73981112020-08-25 SSRMMD: A Rapid and Accurate Algorithm for Mining SSR Feature Loci and Candidate Polymorphic SSRs Based on Assembled Sequences Gou, Xiangjian Shi, Haoran Yu, Shifan Wang, Zhiqiang Li, Caixia Liu, Shihang Ma, Jian Chen, Guangdeng Liu, Tao Liu, Yaxi Front Genet Genetics Microsatellites or simple sequence repeats (SSRs) are short tandem repeats of DNA widespread in genomes and transcriptomes of diverse organisms and are used in various genetic studies. Few software programs that mine SSRs can be further used to mine polymorphic SSRs, and these programs have poor portability, have slow computational speed, are highly dependent on other programs, and have low marker development rates. In this study, we develop an algorithm named Simple Sequence Repeat Molecular Marker Developer (SSRMMD), which uses improved regular expressions to rapidly and exhaustively mine perfect SSR loci from any size of assembled sequence. To mine polymorphic SSRs, SSRMMD uses a novel three-stage method to assess the conservativeness of SSR flanking sequences and then uses the sliding window method to fragment each assembled sequence to assess its uniqueness. Furthermore, molecular biology assays support the polymorphic SSRs identified by SSRMMD. SSRMMD is implemented using the Perl programming language and can be downloaded from https://github.com/GouXiangJian/SSRMMD. Frontiers Media S.A. 2020-07-27 /pmc/articles/PMC7398111/ /pubmed/32849772 http://dx.doi.org/10.3389/fgene.2020.00706 Text en Copyright © 2020 Gou, Shi, Yu, Wang, Li, Liu, Ma, Chen, Liu and Liu. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. 
No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Genetics
Gou, Xiangjian
Shi, Haoran
Yu, Shifan
Wang, Zhiqiang
Li, Caixia
Liu, Shihang
Ma, Jian
Chen, Guangdeng
Liu, Tao
Liu, Yaxi
SSRMMD: A Rapid and Accurate Algorithm for Mining SSR Feature Loci and Candidate Polymorphic SSRs Based on Assembled Sequences
title SSRMMD: A Rapid and Accurate Algorithm for Mining SSR Feature Loci and Candidate Polymorphic SSRs Based on Assembled Sequences
title_full SSRMMD: A Rapid and Accurate Algorithm for Mining SSR Feature Loci and Candidate Polymorphic SSRs Based on Assembled Sequences
title_fullStr SSRMMD: A Rapid and Accurate Algorithm for Mining SSR Feature Loci and Candidate Polymorphic SSRs Based on Assembled Sequences
title_full_unstemmed SSRMMD: A Rapid and Accurate Algorithm for Mining SSR Feature Loci and Candidate Polymorphic SSRs Based on Assembled Sequences
title_short SSRMMD: A Rapid and Accurate Algorithm for Mining SSR Feature Loci and Candidate Polymorphic SSRs Based on Assembled Sequences
title_sort ssrmmd: a rapid and accurate algorithm for mining ssr feature loci and candidate polymorphic ssrs based on assembled sequences
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7398111/
https://www.ncbi.nlm.nih.gov/pubmed/32849772
http://dx.doi.org/10.3389/fgene.2020.00706
work_keys_str_mv AT gouxiangjian ssrmmdarapidandaccuratealgorithmforminingssrfeaturelociandcandidatepolymorphicssrsbasedonassembledsequences
AT shihaoran ssrmmdarapidandaccuratealgorithmforminingssrfeaturelociandcandidatepolymorphicssrsbasedonassembledsequences
AT yushifan ssrmmdarapidandaccuratealgorithmforminingssrfeaturelociandcandidatepolymorphicssrsbasedonassembledsequences
AT wangzhiqiang ssrmmdarapidandaccuratealgorithmforminingssrfeaturelociandcandidatepolymorphicssrsbasedonassembledsequences
AT licaixia ssrmmdarapidandaccuratealgorithmforminingssrfeaturelociandcandidatepolymorphicssrsbasedonassembledsequences
AT liushihang ssrmmdarapidandaccuratealgorithmforminingssrfeaturelociandcandidatepolymorphicssrsbasedonassembledsequences
AT majian ssrmmdarapidandaccuratealgorithmforminingssrfeaturelociandcandidatepolymorphicssrsbasedonassembledsequences
AT chenguangdeng ssrmmdarapidandaccuratealgorithmforminingssrfeaturelociandcandidatepolymorphicssrsbasedonassembledsequences
AT liutao ssrmmdarapidandaccuratealgorithmforminingssrfeaturelociandcandidatepolymorphicssrsbasedonassembledsequences
AT liuyaxi ssrmmdarapidandaccuratealgorithmforminingssrfeaturelociandcandidatepolymorphicssrsbasedonassembledsequences