Sequence Profiling of the Saccharomyces cerevisiae Genome Permits Deconvolution of Unique and Multialigned Reads for Variant Detection
Advances in high-throughput sequencing (HTS) technologies have accelerated our knowledge of genomes in hundreds of organisms, but the presence of repetitions found in every genome raises challenges to unambiguously map short reads. In particular, short polymorphic reads that are multialigned hinder...
Main authors: | Jubin, Claire; Serero, Alexandre; Loeillet, Sophie; Barillot, Emmanuel; Nicolas, Alain |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Genetics Society of America 2014 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4059241/ https://www.ncbi.nlm.nih.gov/pubmed/24558267 http://dx.doi.org/10.1534/g3.113.009464 |
_version_ | 1782321217884127232 |
---|---|
author | Jubin, Claire Serero, Alexandre Loeillet, Sophie Barillot, Emmanuel Nicolas, Alain |
author_facet | Jubin, Claire Serero, Alexandre Loeillet, Sophie Barillot, Emmanuel Nicolas, Alain |
author_sort | Jubin, Claire |
collection | PubMed |
description | Advances in high-throughput sequencing (HTS) technologies have accelerated our knowledge of genomes in hundreds of organisms, but the repeats found in every genome make it difficult to map short reads unambiguously. In particular, short polymorphic reads that are multialigned hinder our capacity to detect mutations. Here, we present two complementary bioinformatics strategies to perform more robust analyses of genome content and sequencing data, validated by use of the fully sequenced Saccharomyces cerevisiae genome. First, we created an annotated HTS profile for the reference genome, based on the production of virtual HTS reads. Using variable read lengths and different numbers of mismatches, we found that 35-nt reads, with a maximum of 6 mismatches, target 89.5% of the genome to unique (U) regions. Longer reads of 50−100 nt provided little additional benefit to the extent of the U regions. Second, to analyze the remaining multialigned (M) regions, we identified the intragenomic single-nucleotide variants and thus defined the unique (M(U)) and multialigned (M(M)) subregions, as exemplified for the polymorphic copies of the six flocculation genes and the 50 Ty retrotransposons. As a resource, the coordinates of the U and M regions of the yeast genome have been added to the Saccharomyces Genome Database (www.yeastgenome.org). The benefit of this advanced method of genome annotation was confirmed by our ability to identify acquired single-nucleotide polymorphisms in the U and M regions of an experimentally sequenced variant wild-type yeast strain. |
format | Online Article Text |
id | pubmed-4059241 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Genetics Society of America |
record_format | MEDLINE/PubMed |
spelling | pubmed-4059241 2014-06-16 Sequence Profiling of the Saccharomyces cerevisiae Genome Permits Deconvolution of Unique and Multialigned Reads for Variant Detection Jubin, Claire Serero, Alexandre Loeillet, Sophie Barillot, Emmanuel Nicolas, Alain G3 (Bethesda) Investigations Genetics Society of America 2014-02-20 /pmc/articles/PMC4059241/ /pubmed/24558267 http://dx.doi.org/10.1534/g3.113.009464 Text en Copyright © 2014 Jubin et al. http://creativecommons.org/licenses/by/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution Unported License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Sequence Profiling of the Saccharomyces cerevisiae Genome Permits Deconvolution of Unique and Multialigned Reads for Variant Detection |
topic | Investigations |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4059241/ https://www.ncbi.nlm.nih.gov/pubmed/24558267 http://dx.doi.org/10.1534/g3.113.009464 |
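
The abstract in this record describes profiling a reference genome with virtual reads and classifying regions as unique (U) or multialigned (M) for a given read length and mismatch tolerance. The following is only a minimal sketch of that idea on a toy sequence, not the authors' pipeline. The function names, the toy reference, the forward-strand-only brute-force comparison, and the choice to label each read by its start position are all assumptions made for illustration; a real analysis of the yeast genome would rely on a dedicated short-read aligner.

```python
from typing import List


def hamming_within(a: str, b: str, max_mismatches: int) -> bool:
    """Return True if two equal-length strings differ at no more than max_mismatches positions."""
    mismatches = 0
    for x, y in zip(a, b):
        if x != y:
            mismatches += 1
            if mismatches > max_mismatches:
                return False
    return True


def profile_reference(reference: str, read_length: int, max_mismatches: int) -> List[str]:
    """Label every virtual read (one per start position) as 'U' or 'M'.

    Assumption: a read is unique ('U') when the only window of the reference
    it matches within max_mismatches is its own origin; otherwise it is
    multialigned ('M'). Only the forward strand is considered in this sketch.
    """
    n_reads = len(reference) - read_length + 1
    reads = [reference[i:i + read_length] for i in range(n_reads)]
    labels = []
    for read in reads:
        # Each read always matches itself, so hits is at least 1.
        hits = sum(1 for other in reads if hamming_within(read, other, max_mismatches))
        labels.append("U" if hits == 1 else "M")
    return labels


def unique_fraction(labels: List[str]) -> float:
    """Fraction of virtual reads labelled unique."""
    return labels.count("U") / len(labels) if labels else 0.0


# Hypothetical toy reference; "ACGT" occurs twice, so reads starting there are multialigned.
toy_reference = "ACGTACGTTTGACCA"
labels = profile_reference(toy_reference, read_length=4, max_mismatches=0)
print("".join(labels), unique_fraction(labels))
```

Raising `max_mismatches` in this sketch can only keep or reduce the number of reads labelled `U`, because every additional tolerated mismatch can only add candidate matches. Sweeping `read_length` and `max_mismatches` over a grid is one simple way to explore the kind of trade-off between read length, mismatch tolerance, and unique coverage that the abstract reports.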