
Copy number variation analysis based on AluScan sequences

BACKGROUND: AluScan combines inter-Alu PCR using multiple Alu-based primers with opposite orientations and next-generation sequencing to capture a huge number of Alu-proximal genomic sequences for investigation. Its requirement of only sub-microgram quantities of DNA facilitates the examination of l...


Bibliographic Details
Main authors: Yang, Jian-Feng, Ding, Xiao-Fan, Chen, Lei, Mat, Wai-Kin, Xu, Michelle Zhi, Chen, Jin-Fei, Wang, Jian-Min, Xu, Lin, Poon, Wai-Sang, Kwong, Ava, Leung, Gilberto Ka-Kit, Tan, Tze-Ching, Yu, Chi-Hung, Ke, Yue-Bin, Xu, Xin-Yun, Ke, Xiao-Yan, Ma, Ronald CW, Chan, Juliana CN, Wan, Wei-Qing, Zhang, Li-Wei, Kumar, Yogesh, Tsang, Shui-Ying, Li, Shao, Wang, Hong-Yang, Xue, Hong
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects: Methodology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4273479/
https://www.ncbi.nlm.nih.gov/pubmed/25558350
http://dx.doi.org/10.1186/s13336-014-0015-z
_version_ 1782349843354615808
author Yang, Jian-Feng
Ding, Xiao-Fan
Chen, Lei
Mat, Wai-Kin
Xu, Michelle Zhi
Chen, Jin-Fei
Wang, Jian-Min
Xu, Lin
Poon, Wai-Sang
Kwong, Ava
Leung, Gilberto Ka-Kit
Tan, Tze-Ching
Yu, Chi-Hung
Ke, Yue-Bin
Xu, Xin-Yun
Ke, Xiao-Yan
Ma, Ronald CW
Chan, Juliana CN
Wan, Wei-Qing
Zhang, Li-Wei
Kumar, Yogesh
Tsang, Shui-Ying
Li, Shao
Wang, Hong-Yang
Xue, Hong
author_sort Yang, Jian-Feng
collection PubMed
description BACKGROUND: AluScan combines inter-Alu PCR, using multiple Alu-based primers with opposite orientations, with next-generation sequencing to capture a large number of Alu-proximal genomic sequences for investigation. Because it requires only sub-microgram quantities of DNA, it facilitates the examination of large numbers of samples. However, the special features of AluScan data make it difficult to call copy number variations (CNVs) directly with the calling algorithms designed for whole genome sequencing (WGS) or exome sequencing.
RESULTS: In this study, an AluScanCNV package was assembled for efficient CNV calling from AluScan sequencing data. It applies a Geary-Hinkley transformation (GHT) to read-depth ratios, either between paired test-control samples or between test samples and a reference template constructed from reference samples, to call localized CNVs. A GISTIC-like algorithm then identifies recurrent CNVs, and circular binary segmentation (CBS) reveals large extended CNVs. To evaluate the utility of CNVs called from AluScan data, the AluScans from 23 non-cancer and 38 cancer genomes were analyzed. The glioma samples yielded the familiar extended copy-number losses on chromosomes 1p and 9. The recurrent somatic CNVs identified from liver cancer samples were similar to those reported for liver cancer WGS, showing a striking enrichment of copy-number gains in chromosomes 1q and 8q. When localized or recurrent CNV-features capable of distinguishing between liver and non-liver cancer samples were selected by correlation-based machine learning, a highly accurate separation of the liver and non-liver cancer classes was attained.
CONCLUSIONS: The results obtained from non-cancer and cancerous tissues indicate that the AluScanCNV package can be employed to call localized, recurrent and extended CNVs from AluScan sequences. Moreover, both the localized and recurrent CNVs identified by this method can be subjected to machine-learning selection to yield distinguishing CNV-features capable of separating liver cancers from other types of cancer. Since the method is applicable to any human DNA sample, with or without a paired control, it can also be employed to analyze the constitutional CNVs of individuals.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13336-014-0015-z) contains supplementary material, which is available to authorized users.
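The abstract names a Geary-Hinkley transformation of read-depth ratios but does not spell it out. The following is a minimal sketch of the textbook form of that transformation, not the AluScanCNV implementation. It assumes per-window read counts that are roughly Poisson (variance equal to the mean) and uncorrelated between test and control; the function names `geary_hinkley` and `window_z_scores` are hypothetical.

```python
import math


def geary_hinkley(ratio, mu_x, mu_y, sigma_x, sigma_y, rho=0.0):
    """Map a ratio X/Y of two normal variables to an approximately
    standard-normal score (Geary-Hinkley transformation).

    mu_*/sigma_* are the means and standard deviations of X and Y,
    rho is their correlation (assumed 0 by default).
    """
    numerator = mu_y * ratio - mu_x
    variance = (sigma_y ** 2) * ratio ** 2 \
        - 2.0 * rho * sigma_x * sigma_y * ratio \
        + sigma_x ** 2
    return numerator / math.sqrt(variance)


def window_z_scores(test_counts, control_counts):
    """Score each window's test/control read-depth ratio.

    Assumption (not from the source): counts are Poisson-like, so the
    variance of a window count is approximated by the mean count.
    Windows with zero control reads are reported as None.
    """
    mu_x = sum(test_counts) / len(test_counts)
    mu_y = sum(control_counts) / len(control_counts)
    sigma_x = math.sqrt(mu_x)
    sigma_y = math.sqrt(mu_y)
    scores = []
    for x, y in zip(test_counts, control_counts):
        if y == 0:
            scores.append(None)  # ratio undefined for this window
            continue
        scores.append(geary_hinkley(x / y, mu_x, mu_y, sigma_x, sigma_y))
    return scores


if __name__ == "__main__":
    # Toy counts: the third window has twice the test depth.
    print(window_z_scores([100, 100, 200], [100, 100, 100]))
```

Under these assumptions a strongly positive score suggests a copy-number gain in that window and a strongly negative one a loss; the thresholds and the normalization the package actually uses are not given in this record.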
format Online
Article
Text
id pubmed-4273479
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4273479 2015-01-02 Copy number variation analysis based on AluScan sequences J Clin Bioinforma Methodology BioMed Central 2014-12-05 /pmc/articles/PMC4273479/ /pubmed/25558350 http://dx.doi.org/10.1186/s13336-014-0015-z Text en © Yang et al.; licensee BioMed Central. 2014 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Copy number variation analysis based on AluScan sequences
topic Methodology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4273479/
https://www.ncbi.nlm.nih.gov/pubmed/25558350
http://dx.doi.org/10.1186/s13336-014-0015-z