Fully automated pipeline for detection of sex linked genes using RNA-Seq data

BACKGROUND: Sex chromosomes represent a genomic region that, to some extent, differs between the sexes of a single species. Reliable high-throughput methods for detection of sex chromosome-specific markers are needed, especially in species where genome information is limited. Next generation sequen...

Bibliographic Details
Main Authors: Michalovova, Monika, Kubat, Zdenek, Hobza, Roman, Vyskot, Boris, Kejnovsky, Eduard
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4367819/
https://www.ncbi.nlm.nih.gov/pubmed/25884927
http://dx.doi.org/10.1186/s12859-015-0509-0
_version_ 1782362546879070208
author Michalovova, Monika
Kubat, Zdenek
Hobza, Roman
Vyskot, Boris
Kejnovsky, Eduard
author_facet Michalovova, Monika
Kubat, Zdenek
Hobza, Roman
Vyskot, Boris
Kejnovsky, Eduard
author_sort Michalovova, Monika
collection PubMed
description BACKGROUND: Sex chromosomes represent a genomic region that, to some extent, differs between the sexes of a single species. Reliable high-throughput methods for detection of sex chromosome-specific markers are needed, especially in species where genome information is limited. Next generation sequencing (NGS) opens the door for identification of unique sequences or searching for nucleotide polymorphisms between datasets. A combination of classical genetic segregation analysis with RNA-Seq data can be an ideal tool to map and identify sex chromosome-specific expressed markers. To address this challenge, we established a genetic cross of the dioecious plant Rumex acetosa and generated RNA-Seq data from both the parental generation and male and female offspring. RESULTS: We present a pipeline for detection of sex-linked genes based on nucleotide polymorphism analysis. In our approach, nucleotide polymorphisms are tracked using a cross of preferably distant populations. For this reason, only four datasets are needed: reads from high-throughput sequencing platforms for the parental generation (mother and father) and the F1 generation (male and female progeny). Our pipeline uses custom scripts together with external assembly, mapping and variant calling software. Because the computation is resource-intensive, high-capacity servers are required. Therefore, to keep the pipeline easily accessible and reproducible, we implemented it in Galaxy, an open, web-based platform for data-intensive biomedical research. Our tools are available in the Galaxy Tool Shed, from which they can be installed on any local Galaxy instance. As output, the user gets a FASTA file with candidate transcriptionally active sex-linked genes, sorted by their relevance. A BAM file with the identified genes and the alignment of reads is also provided. Thus, polymorphisms following the segregation pattern can be easily visualized, which significantly facilitates primer design and subsequent steps of wet-lab verification. CONCLUSIONS: Our pipeline is a simple and freely accessible software tool for identification of sex chromosome-linked genes in species without an existing reference genome. Based on a combination of genetic crosses and RNA-Seq data, we have designed a high-throughput, cost-effective approach for a broad community of scientists focused on sex chromosome structure and evolution. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0509-0) contains supplementary material, which is available to authorized users.
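The abstract describes tracking nucleotide polymorphisms through a cross using four datasets (mother, father, female progeny, male progeny) but does not spell out the segregation rules. The following is a minimal sketch of how such a filter could look, assuming a simple XY system; the function name `classify_site`, the input layout, and the two rules are illustrative assumptions, not the authors' implementation.

```python
from typing import Dict, Set

# Hypothetical input: for one variant site on an assembled contig, the set of
# alleles observed in the reads of each of the four datasets.
Site = Dict[str, Set[str]]


def classify_site(site: Site) -> str:
    """Return 'Y-linked', 'X-linked', or 'uninformative' for one variant site.

    Assumed rules (not taken from the source):
      * Y-linked: an allele seen in the father and sons, but not in the
        mother or daughters.
      * X-linked: an allele seen in the father and daughters, but not in the
        mother or sons (the paternal X passes only to daughters).
    """
    mother, father = site["mother"], site["father"]
    daughters, sons = site["daughters"], site["sons"]

    # Only alleles the father carries and the mother lacks can be traced.
    paternal_only = father - mother
    for allele in sorted(paternal_only):
        in_sons = allele in sons
        in_daughters = allele in daughters
        if in_sons and not in_daughters:
            return "Y-linked"
        if in_daughters and not in_sons:
            return "X-linked"
    return "uninformative"


y_site = {"mother": {"A"}, "father": {"A", "G"},
          "daughters": {"A"}, "sons": {"A", "G"}}
x_site = {"mother": {"C"}, "father": {"T"},
          "daughters": {"C", "T"}, "sons": {"C"}}
autosomal_site = {"mother": {"A"}, "father": {"A", "G"},
                  "daughters": {"A", "G"}, "sons": {"A", "G"}}
```

A paternal allele inherited by both sexes, as in `autosomal_site`, carries no sex-linkage signal under these assumed rules. A real pipeline would also need read-depth thresholds and support from several sites per contig before calling a gene sex-linked.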
format Online
Article
Text
id pubmed-4367819
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4367819 2015-03-21 Fully automated pipeline for detection of sex linked genes using RNA-Seq data Michalovova, Monika Kubat, Zdenek Hobza, Roman Vyskot, Boris Kejnovsky, Eduard BMC Bioinformatics Software BioMed Central 2015-03-11 /pmc/articles/PMC4367819/ /pubmed/25884927 http://dx.doi.org/10.1186/s12859-015-0509-0 Text en © Michalovova et al.; licensee BioMed Central. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.
spellingShingle Software
Michalovova, Monika
Kubat, Zdenek
Hobza, Roman
Vyskot, Boris
Kejnovsky, Eduard
Fully automated pipeline for detection of sex linked genes using RNA-Seq data
title Fully automated pipeline for detection of sex linked genes using RNA-Seq data
title_full Fully automated pipeline for detection of sex linked genes using RNA-Seq data
title_fullStr Fully automated pipeline for detection of sex linked genes using RNA-Seq data
title_full_unstemmed Fully automated pipeline for detection of sex linked genes using RNA-Seq data
title_short Fully automated pipeline for detection of sex linked genes using RNA-Seq data
title_sort fully automated pipeline for detection of sex linked genes using rna-seq data
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4367819/
https://www.ncbi.nlm.nih.gov/pubmed/25884927
http://dx.doi.org/10.1186/s12859-015-0509-0