
Copy number variant detection in inbred strains from short read sequence data

Summary: We have developed an algorithm to detect copy number variants (CNVs) in homozygous organisms, such as inbred laboratory strains of mice, from short read sequence data. Our novel approach exploits the fact that inbred mice are homozygous at virtually every position in the genome to detect CN...


Bibliographic Details
Main authors: Simpson, Jared T., McIntyre, Rebecca E., Adams, David J., Durbin, Richard
Format: Text
Language: English
Published: Oxford University Press 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2820678/
https://www.ncbi.nlm.nih.gov/pubmed/20022973
http://dx.doi.org/10.1093/bioinformatics/btp693
author Simpson, Jared T.
McIntyre, Rebecca E.
Adams, David J.
Durbin, Richard
collection PubMed
description Summary: We have developed an algorithm to detect copy number variants (CNVs) in homozygous organisms, such as inbred laboratory strains of mice, from short read sequence data. Our novel approach exploits the fact that inbred mice are homozygous at virtually every position in the genome to detect CNVs using a hidden Markov model (HMM). This HMM uses both the density of sequence reads mapped to the genome, and the rate of apparent heterozygous single nucleotide polymorphisms, to determine genomic copy number. We tested our algorithm on short read sequence data generated from re-sequencing chromosome 17 of the mouse strains A/J and CAST/EiJ with the Illumina platform. In total, we identified 118 copy number variants (43 for A/J and 75 for CAST/EiJ). We investigated the performance of our algorithm through comparison to CNVs previously identified by array-comparative genomic hybridization (array CGH). We performed quantitative-PCR validation on a subset of the calls that differed from the array CGH data sets. Availability: The software described in this manuscript, named cnD for copy number detector, is free and released under the GPL. The program is implemented in the D programming language using the Tango library. Source code and pre-compiled binaries are available at http://www.sanger.ac.uk/resources/software/cnd.html Contact: rd@sanger.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.
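The idea in the summary above, an HMM that combines read density with the rate of apparent heterozygous SNPs, can be illustrated with a minimal Python sketch. This is not the cnD implementation (which is written in D): the window-based input, the Poisson emission model, the transition probability, and every name and parameter value below are assumptions chosen only for illustration. The sketch rests on one observation: in a homozygous genome, extra copies of a region that have diverged in sequence pile up on the same reference position and look like heterozygous sites, so a raised het-SNP count supports a copy number above 1.

```python
import math

# Hypothetical copy-number states relative to the reference (1 = normal).
STATES = [0, 1, 2, 3]
# Assumed mean read count per window at copy number 1 (illustrative value).
BASE_DEPTH = 20.0
# Small background depth so the deletion state still has a positive mean.
NOISE_DEPTH = 0.5
# Assumed mean number of apparent heterozygous SNPs per window, by state.
HET_MEAN = {0: 0.05, 1: 0.05, 2: 2.0, 3: 3.0}


def poisson_logpmf(k, lam):
    """Log-probability of observing count k under a Poisson with mean lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def emission_logp(state, depth, het):
    """Joint log-likelihood of read depth and het-SNP count in one window."""
    depth_mean = BASE_DEPTH * state + NOISE_DEPTH
    return poisson_logpmf(depth, depth_mean) + poisson_logpmf(het, HET_MEAN[state])


def viterbi(windows, stay_prob=0.99):
    """Return the most likely copy-number state for each (depth, het) window."""
    n = len(STATES)
    log_stay = math.log(stay_prob)
    log_move = math.log((1.0 - stay_prob) / (n - 1))

    # Uniform prior over states for the first window.
    score = [math.log(1.0 / n) + emission_logp(s, *windows[0]) for s in STATES]
    back = []
    for depth, het in windows[1:]:
        new_score, pointers = [], []
        for j, s in enumerate(STATES):
            # Best predecessor for state j, counting the transition cost.
            best_i = max(
                range(n),
                key=lambda i: score[i] + (log_stay if i == j else log_move),
            )
            trans = log_stay if best_i == j else log_move
            new_score.append(score[best_i] + trans + emission_logp(s, depth, het))
            pointers.append(best_i)
        score = new_score
        back.append(pointers)

    # Trace the best path backwards from the best final state.
    idx = max(range(n), key=lambda i: score[i])
    path = [idx]
    for pointers in reversed(back):
        idx = pointers[idx]
        path.append(idx)
    path.reverse()
    return [STATES[i] for i in path]


if __name__ == "__main__":
    # Made-up windows: normal, duplicated (high depth, het SNPs), normal, deleted.
    demo = [(20, 0)] * 5 + [(41, 3)] * 5 + [(19, 0)] * 5 + [(0, 0)] * 3
    print(viterbi(demo))
```

The high `stay_prob` penalizes state changes, so a single noisy window is unlikely to produce a call on its own; a run of windows with both doubled depth and several apparent heterozygous sites is needed before the path switches to a duplicated state.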
format Text
id pubmed-2820678
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-28206782010-02-12 Copy number variant detection in inbred strains from short read sequence data Simpson, Jared T. McIntyre, Rebecca E. Adams, David J. Durbin, Richard Bioinformatics Applications Note Summary: We have developed an algorithm to detect copy number variants (CNVs) in homozygous organisms, such as inbred laboratory strains of mice, from short read sequence data. Our novel approach exploits the fact that inbred mice are homozygous at virtually every position in the genome to detect CNVs using a hidden Markov model (HMM). This HMM uses both the density of sequence reads mapped to the genome, and the rate of apparent heterozygous single nucleotide polymorphisms, to determine genomic copy number. We tested our algorithm on short read sequence data generated from re-sequencing chromosome 17 of the mouse strains A/J and CAST/EiJ with the Illumina platform. In total, we identified 118 copy number variants (43 for A/J and 75 for CAST/EiJ). We investigated the performance of our algorithm through comparison to CNVs previously identified by array-comparative genomic hybridization (array CGH). We performed quantitative-PCR validation on a subset of the calls that differed from the array CGH data sets. Availability: The software described in this manuscript, named cnD for copy number detector, is free and released under the GPL. The program is implemented in the D programming language using the Tango library. Source code and pre-compiled binaries are available at http://www.sanger.ac.uk/resources/software/cnd.html Contact: rd@sanger.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. Oxford University Press 2010-02-15 2009-12-18 /pmc/articles/PMC2820678/ /pubmed/20022973 http://dx.doi.org/10.1093/bioinformatics/btp693 Text en © The Author(s) 2009. Published by Oxford University Press. 
http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Copy number variant detection in inbred strains from short read sequence data
topic Applications Note