Nano‐Strainer: A workflow for the identification of single‐copy nuclear loci for plant systematic studies, using target capture kits and Oxford Nanopore long reads
In modern plant systematics, target enrichment enables simultaneous analysis of hundreds of genes. However, when dealing with reticulate or polyploidization histories, few markers may suffice, but often are required to be single‐copy, a condition that is not necessarily met with commercial capture k...
Main authors: | Scheunert, Agnes; Lautenschlager, Ulrich; Ott, Tankred; Oberprieler, Christoph |
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
John Wiley and Sons Inc.
2023
|
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10354226/ https://www.ncbi.nlm.nih.gov/pubmed/37475726 http://dx.doi.org/10.1002/ece3.10190 |
_version_ | 1785074881406173184 |
---|---|
author | Scheunert, Agnes Lautenschlager, Ulrich Ott, Tankred Oberprieler, Christoph |
author_facet | Scheunert, Agnes Lautenschlager, Ulrich Ott, Tankred Oberprieler, Christoph |
author_sort | Scheunert, Agnes |
collection | PubMed |
description | In modern plant systematics, target enrichment enables the simultaneous analysis of hundreds of genes. However, when dealing with reticulate or polyploidization histories, a few markers may suffice, but these are often required to be single‐copy, a condition that is not necessarily met by commercial capture kits. In addition, large genome sizes can render target capture ineffective, so that amplicon sequencing would be preferable; however, knowledge about suitable loci is often missing. Here, we present a comprehensive workflow for the identification of putative single‐copy nuclear markers in a genus of interest, by mining a small target capture dataset from a few representative taxa. The proposed pipeline assesses the sequence variability contained in the data from targeted loci and assigns reads to their respective genes via a combined BLAST/clustering procedure. Cluster consensus sequences are then examined based on four pre‐defined criteria presumed to indicate the absence of paralogy. This is done by calculating four specialized indices; loci are ranked according to their performance in these indices, and top‐scoring loci are considered putatively single‐ or low‐copy. The approach can be applied to any probe set. As it relies on long reads, the present contribution also provides template workflows for processing Nanopore‐based target capture data. The markers obtained are further tested and then used for amplicon sequencing. To detect any paralogy remaining in these data, which might occur in groups with rampant paralogy, we also employ the long‐read assembly tool canu. In diploid representatives of the young Compositae genus Leucanthemum, which is characterized by high levels of polyploidy, our approach resulted in the successful amplification of 13 loci. Modifications to remove traces of paralogy were made in seven of these. A species tree from the markers correctly reproduced the main relationships in the genus, albeit at low resolution. The presented workflow has the potential to support phylogenetic research, for example in polyploid plant groups. |
format | Online Article Text |
id | pubmed-10354226 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | John Wiley and Sons Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-10354226 2023-07-20 Nano‐Strainer: A workflow for the identification of single‐copy nuclear loci for plant systematic studies, using target capture kits and Oxford Nanopore long reads Scheunert, Agnes Lautenschlager, Ulrich Ott, Tankred Oberprieler, Christoph Ecol Evol Research Articles John Wiley and Sons Inc. 2023-07-18 /pmc/articles/PMC10354226/ /pubmed/37475726 http://dx.doi.org/10.1002/ece3.10190 Text en © 2023 The Authors. Ecology and Evolution published by John Wiley & Sons Ltd. https://creativecommons.org/licenses/by/4.0/ This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/) License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. |
title |
Nano‐Strainer: A workflow for the identification of single‐copy nuclear loci for plant systematic studies, using target capture kits and Oxford Nanopore long reads |
topic | Research Articles |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10354226/ https://www.ncbi.nlm.nih.gov/pubmed/37475726 http://dx.doi.org/10.1002/ece3.10190 |
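
The description above says that loci are ranked according to their performance in four indices and that top‐scoring loci are treated as putatively single‐ or low‐copy. The record does not name the indices or say how they are combined, so the following is only a minimal sketch under stated assumptions: the class `LocusScores`, the function `rank_loci`, the convention that higher index values are better, and the rank‐sum aggregation are all hypothetical illustrations, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocusScores:
    """Hypothetical per-locus record: four index values, higher assumed better."""
    locus: str
    indices: tuple[float, float, float, float]


def rank_loci(scores: list[LocusScores]) -> list[str]:
    """Return locus names ordered by summed per-index rank (lowest sum first).

    Assumption: rank-sum aggregation. The source only states that loci are
    ranked by their performance in four indices, not how ranks are combined.
    Ties within one index are broken by input order (sorted() is stable).
    """
    if not scores:
        return []
    rank_sum = {s.locus: 0 for s in scores}
    for i in range(len(scores[0].indices)):
        # Rank 1 goes to the locus with the highest value for index i.
        ordered = sorted(scores, key=lambda s: s.indices[i], reverse=True)
        for rank, s in enumerate(ordered, start=1):
            rank_sum[s.locus] += rank
    # Lowest rank sum first; the locus name breaks ties deterministically.
    return sorted(rank_sum, key=lambda name: (rank_sum[name], name))


# Toy values, invented purely for illustration.
toy = [
    LocusScores("locus_A", (0.9, 0.8, 0.7, 0.9)),
    LocusScores("locus_B", (0.5, 0.9, 0.6, 0.4)),
    LocusScores("locus_C", (0.2, 0.1, 0.3, 0.5)),
]
ranking = rank_loci(toy)
```

With these toy values, `ranking[0]` is the candidate a user would test first for amplification. A rank sum is used here only because it avoids having to put differently scaled indices on a common scale; a weighted score would be an equally plausible choice.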