Systematic Inference of Copy-Number Genotypes from Personal Genome Sequencing Data Reveals Extensive Olfactory Receptor Gene Content Diversity

Bibliographic Details
Main authors: Waszak, Sebastian M., Hasin, Yehudit, Zichner, Thomas, Olender, Tsviya, Keydar, Ifat, Khen, Miriam, Stütz, Adrian M., Schlattl, Andreas, Lancet, Doron, Korbel, Jan O.
Format: Text
Language: English
Published: Public Library of Science 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2978733/
https://www.ncbi.nlm.nih.gov/pubmed/21085617
http://dx.doi.org/10.1371/journal.pcbi.1000988
author Waszak, Sebastian M.
Hasin, Yehudit
Zichner, Thomas
Olender, Tsviya
Keydar, Ifat
Khen, Miriam
Stütz, Adrian M.
Schlattl, Andreas
Lancet, Doron
Korbel, Jan O.
collection PubMed
description Copy-number variations (CNVs) are widespread in the human genome, but comprehensive assignments of integer locus copy-numbers (i.e., copy-number genotypes) that, for example, enable discrimination of homozygous from heterozygous CNVs, have remained challenging. Here we present CopySeq, a novel computational approach with an underlying statistical framework that analyzes the depth-of-coverage of high-throughput DNA sequencing reads, and can incorporate paired-end and breakpoint junction analysis based CNV-analysis approaches, to infer locus copy-number genotypes. We benchmarked CopySeq by genotyping 500 chromosome 1 CNV regions in 150 personal genomes sequenced at low-coverage. The assessed copy-number genotypes were highly concordant with our performed qPCR experiments (Pearson correlation coefficient 0.94), and with the published results of two microarray platforms (95–99% concordance). We further demonstrated the utility of CopySeq for analyzing gene regions enriched for segmental duplications by comprehensively inferring copy-number genotypes in the CNV-enriched >800 olfactory receptor (OR) human gene and pseudogene loci. CopySeq revealed that OR loci display an extensive range of locus copy-numbers across individuals, with zero to two copies in some OR loci, and two to nine copies in others. Among genetic variants affecting OR loci we identified deleterious variants including CNVs and SNPs affecting ∼15% and ∼20% of the human OR gene repertoire, respectively, implying that genetic variants with a possible impact on smell perception are widespread. Finally, we found that for several OR loci the reference genome appears to represent a minor-frequency variant, implying a necessary revision of the OR repertoire for future functional studies. 
CopySeq can ascertain genomic structural variation in specific gene families as well as at a genome-wide scale, where it may enable the quantitative evaluation of CNVs in genome-wide association studies involving high-throughput sequencing.
format Text
id pubmed-2978733
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-2978733 2010-11-17 PLoS Comput Biol Research Article Public Library of Science 2010-11-11 /pmc/articles/PMC2978733/ /pubmed/21085617 http://dx.doi.org/10.1371/journal.pcbi.1000988 Text en Waszak et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Systematic Inference of Copy-Number Genotypes from Personal Genome Sequencing Data Reveals Extensive Olfactory Receptor Gene Content Diversity
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2978733/
https://www.ncbi.nlm.nih.gov/pubmed/21085617
http://dx.doi.org/10.1371/journal.pcbi.1000988