Fully automated pipeline for detection of sex linked genes using RNA-Seq data
BACKGROUND: Sex chromosomes present a genomic region which to some extent, differs between the genders of a single species. Reliable high-throughput methods for detection of sex chromosomes specific markers are needed, especially in species where genome information is limited. Next generation sequen...
Main authors: | Michalovova, Monika; Kubat, Zdenek; Hobza, Roman; Vyskot, Boris; Kejnovsky, Eduard |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2015 |
Subjects: | Software |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4367819/ https://www.ncbi.nlm.nih.gov/pubmed/25884927 http://dx.doi.org/10.1186/s12859-015-0509-0 |
_version_ | 1782362546879070208 |
---|---|
author | Michalovova, Monika Kubat, Zdenek Hobza, Roman Vyskot, Boris Kejnovsky, Eduard |
author_facet | Michalovova, Monika Kubat, Zdenek Hobza, Roman Vyskot, Boris Kejnovsky, Eduard |
author_sort | Michalovova, Monika |
collection | PubMed |
description | BACKGROUND: Sex chromosomes present a genomic region which to some extent, differs between the genders of a single species. Reliable high-throughput methods for detection of sex chromosomes specific markers are needed, especially in species where genome information is limited. Next generation sequencing (NGS) opens the door for identification of unique sequences or searching for nucleotide polymorphisms between datasets. A combination of classical genetic segregation analysis along with RNA-Seq data can present an ideal tool to map and identify sex chromosome-specific expressed markers. To address this challenge, we established genetic cross of dioecious plant Rumex acetosa and generated RNA-Seq data from both parental generation and male and female offspring. RESULTS: We present a pipeline for detection of sex linked genes based on nucleotide polymorphism analysis. In our approach, tracking of nucleotide polymorphisms is carried out using a cross of preferably distant populations. For this reason, only 4 datasets are needed – reads from high-throughput sequencing platforms for parent generation (mother and father) and F1 generation (male and female progeny). Our pipeline uses custom scripts together with external assembly, mapping and variant calling software. Given the resource-intensive nature of the computation, servers with high capacity are a requirement. Therefore, in order to keep this pipeline easily accessible and reproducible, we implemented it in Galaxy – an open, web-based platform for data-intensive biomedical research. Our tools are present in the Galaxy Tool Shed, from which they can be installed to any local Galaxy instance. As an output of the pipeline, user gets a FASTA file with candidate transcriptionally active sex-linked genes, sorted by their relevance. At the same time, a BAM file with identified genes and alignment of reads is also provided. Thus, polymorphisms following segregation pattern can be easily visualized, which significantly enhances primer design and subsequent steps of wet-lab verification. CONCLUSIONS: Our pipeline presents a simple and freely accessible software tool for identification of sex chromosome linked genes in species without an existing reference genome. Based on combination of genetic crosses and RNA-Seq data, we have designed a high-throughput, cost-effective approach for a broad community of scientists focused on sex chromosome structure and evolution. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0509-0) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-4367819 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4367819 2015-03-21 Fully automated pipeline for detection of sex linked genes using RNA-Seq data Michalovova, Monika Kubat, Zdenek Hobza, Roman Vyskot, Boris Kejnovsky, Eduard BMC Bioinformatics Software BACKGROUND: Sex chromosomes present a genomic region which to some extent, differs between the genders of a single species. Reliable high-throughput methods for detection of sex chromosomes specific markers are needed, especially in species where genome information is limited. Next generation sequencing (NGS) opens the door for identification of unique sequences or searching for nucleotide polymorphisms between datasets. A combination of classical genetic segregation analysis along with RNA-Seq data can present an ideal tool to map and identify sex chromosome-specific expressed markers. To address this challenge, we established genetic cross of dioecious plant Rumex acetosa and generated RNA-Seq data from both parental generation and male and female offspring. RESULTS: We present a pipeline for detection of sex linked genes based on nucleotide polymorphism analysis. In our approach, tracking of nucleotide polymorphisms is carried out using a cross of preferably distant populations. For this reason, only 4 datasets are needed – reads from high-throughput sequencing platforms for parent generation (mother and father) and F1 generation (male and female progeny). Our pipeline uses custom scripts together with external assembly, mapping and variant calling software. Given the resource-intensive nature of the computation, servers with high capacity are a requirement. Therefore, in order to keep this pipeline easily accessible and reproducible, we implemented it in Galaxy – an open, web-based platform for data-intensive biomedical research. Our tools are present in the Galaxy Tool Shed, from which they can be installed to any local Galaxy instance. As an output of the pipeline, user gets a FASTA file with candidate transcriptionally active sex-linked genes, sorted by their relevance. At the same time, a BAM file with identified genes and alignment of reads is also provided. Thus, polymorphisms following segregation pattern can be easily visualized, which significantly enhances primer design and subsequent steps of wet-lab verification. CONCLUSIONS: Our pipeline presents a simple and freely accessible software tool for identification of sex chromosome linked genes in species without an existing reference genome. Based on combination of genetic crosses and RNA-Seq data, we have designed a high-throughput, cost-effective approach for a broad community of scientists focused on sex chromosome structure and evolution. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0509-0) contains supplementary material, which is available to authorized users. BioMed Central 2015-03-11 /pmc/articles/PMC4367819/ /pubmed/25884927 http://dx.doi.org/10.1186/s12859-015-0509-0 Text en © Michalovova et al.; licensee BioMed Central. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. |
spellingShingle | Software Michalovova, Monika Kubat, Zdenek Hobza, Roman Vyskot, Boris Kejnovsky, Eduard Fully automated pipeline for detection of sex linked genes using RNA-Seq data |
title | Fully automated pipeline for detection of sex linked genes using RNA-Seq data |
title_full | Fully automated pipeline for detection of sex linked genes using RNA-Seq data |
title_fullStr | Fully automated pipeline for detection of sex linked genes using RNA-Seq data |
title_full_unstemmed | Fully automated pipeline for detection of sex linked genes using RNA-Seq data |
title_short | Fully automated pipeline for detection of sex linked genes using RNA-Seq data |
title_sort | fully automated pipeline for detection of sex linked genes using rna-seq data |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4367819/ https://www.ncbi.nlm.nih.gov/pubmed/25884927 http://dx.doi.org/10.1186/s12859-015-0509-0 |
work_keys_str_mv | AT michalovovamonika fullyautomatedpipelinefordetectionofsexlinkedgenesusingrnaseqdata AT kubatzdenek fullyautomatedpipelinefordetectionofsexlinkedgenesusingrnaseqdata AT hobzaroman fullyautomatedpipelinefordetectionofsexlinkedgenesusingrnaseqdata AT vyskotboris fullyautomatedpipelinefordetectionofsexlinkedgenesusingrnaseqdata AT kejnovskyeduard fullyautomatedpipelinefordetectionofsexlinkedgenesusingrnaseqdata |