Optimizing depth and type of high‐throughput sequencing data for microsatellite discovery


Bibliographic Details
Main author: Chapman, Mark A.
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6858294/
https://www.ncbi.nlm.nih.gov/pubmed/31832281
http://dx.doi.org/10.1002/aps3.11298
collection PubMed
description
PREMISE: Simple sequence repeat (SSR) markers (microsatellites) are a mainstay of many labs, especially when working on a limited budget, carrying out preliminary analyses, and in teaching. Whether SSRs mined from plant genomes or transcriptomes are preferred for certain applications, and the depth of sequencing needed to allow efficient SSR discovery, has not been tested.
METHODS: I used genome and transcriptome high‐throughput sequencing data at a range of sequencing depths to compare efficacy of SSR identification. I then tested primers from tomato for amplification, polymorphism, and transferability to related species.
RESULTS: Small assemblies (two million read pairs) identified ca. 200–2000 potential markers from the genome assemblies and ca. 600–3650 from the transcriptome assemblies. Genome‐derived contigs were often short, potentially precluding primer design. Genomic SSR primers were less transferable across species but exhibited greater variation (partially explained by being composed of more repeat units) than transcriptome‐derived primers.
DISCUSSION: Small high‐throughput sequencing resources may be sufficient for identification of hundreds of SSRs. Genomic data may be preferable in species with low polymorphism, but transcriptome data may result in longer loci (more amenable to primer design) and primers may be more transferable to related species.
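The abstract does not say which tool was used to identify SSRs, so the following is only a minimal sketch of the general idea: scan an assembled contig for perfect tandem repeats and discard hits that sit too close to a contig end for primer design, which is the problem the abstract notes for short genome-derived contigs. The function name `find_ssrs`, the repeat thresholds in `MIN_REPEATS`, and the `min_flank` parameter are all assumptions made for illustration, not values from the article.

```python
import re

# Hypothetical thresholds (motif length -> minimum repeat units); not from the article.
MIN_REPEATS = {2: 6, 3: 5, 4: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS, min_flank=0):
    """Return (start, motif, n_units) for each perfect SSR found in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, min_units in sorted(min_repeats.items()):
        # One motif of motif_len bases followed by at least (min_units - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_units - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. AAAA or AGAG).
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            start, end = match.span()
            # Require flanking sequence on both sides so primers could be placed.
            if start < min_flank or len(seq) - end < min_flank:
                continue
            hits.append((start, motif, (end - start) // motif_len))
    return hits


contig = "GATTACA" + "AG" * 8 + "CCGTTGACT"
print(find_ssrs(contig))                # SSR reported with its start, motif, and unit count
print(find_ssrs(contig, min_flank=10))  # same contig, rejected for lack of flanking sequence
```

Counting repeat units per locus in this way also gives the quantity the abstract links to polymorphism, since loci with more repeat units tended to show greater variation.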
id pubmed-6858294
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
Journal: Appl Plant Sci
Published: John Wiley and Sons Inc. 2019-11-03
Record: pubmed-6858294 (2019-12-12)
License: © 2019 Chapman. Applications in Plant Sciences is published by Wiley Periodicals, Inc. on behalf of the Botanical Society of America. This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
topic Application Article