A reference human genome dataset of the BGISEQ-500 sequencer
Background: BGISEQ-500 is a new desktop sequencer developed by BGI. Using DNA nanoball and combinational probe anchor synthesis developed from Complete Genomics™ sequencing technologies, it generates short reads at a large scale. Findings: Here, we present the first human whole-genome sequencing dat...
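Main Authors: | Huang, Jie, Liang, Xinming, Xuan, Yuankai, Geng, Chunyu, Li, Yuxiang, Lu, Haorong, Qu, Shoufang, Mei, Xianglin, Chen, Hongbo, Yu, Ting, Sun, Nan, Rao, Junhua, Wang, Jiahao, Zhang, Wenwei, Chen, Ying, Liao, Sha, Jiang, Hui, Liu, Xin, Yang, Zhaopeng, Mu, Feng, Gao, Shangxian |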
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2017 |
Subjects: | Data Note |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5467036/ https://www.ncbi.nlm.nih.gov/pubmed/28379488 http://dx.doi.org/10.1093/gigascience/gix024 |
_version_ | 1783243200695107584 |
---|---|
author | Huang, Jie Liang, Xinming Xuan, Yuankai Geng, Chunyu Li, Yuxiang Lu, Haorong Qu, Shoufang Mei, Xianglin Chen, Hongbo Yu, Ting Sun, Nan Rao, Junhua Wang, Jiahao Zhang, Wenwei Chen, Ying Liao, Sha Jiang, Hui Liu, Xin Yang, Zhaopeng Mu, Feng Gao, Shangxian |
author_facet | Huang, Jie Liang, Xinming Xuan, Yuankai Geng, Chunyu Li, Yuxiang Lu, Haorong Qu, Shoufang Mei, Xianglin Chen, Hongbo Yu, Ting Sun, Nan Rao, Junhua Wang, Jiahao Zhang, Wenwei Chen, Ying Liao, Sha Jiang, Hui Liu, Xin Yang, Zhaopeng Mu, Feng Gao, Shangxian |
author_sort | Huang, Jie |
collection | PubMed |
description | Background: BGISEQ-500 is a new desktop sequencer developed by BGI. Using DNA nanoball and combinational probe anchor synthesis developed from Complete Genomics™ sequencing technologies, it generates short reads at a large scale. Findings: Here, we present the first human whole-genome sequencing dataset of BGISEQ-500. The dataset was generated by sequencing the widely used cell line HG001 (NA12878) in two sequencing runs of paired-end 50 bp (PE50) and two sequencing runs of paired-end 100 bp (PE100). We also include examples of the raw images from the sequencer for reference. Finally, we identified variations using this dataset, estimated the accuracy of the variations, and compared it to that of the variations identified from similar amounts of publicly available HiSeq2500 data. Conclusions: We found similar single nucleotide polymorphism (SNP) detection accuracy for the BGISEQ-500 PE100 data (false positive rate [FPR] = 0.00020%, sensitivity = 96.20%) compared to the PE150 HiSeq2500 data (FPR = 0.00017%, sensitivity = 96.60%), and better SNP detection accuracy than the PE50 data (FPR = 0.0006%, sensitivity = 94.15%). For insertions and deletions (indels), however, we found lower accuracy for the BGISEQ-500 data (FPR = 0.00069% and 0.00067% for PE100 and PE50, respectively; sensitivity = 88.52% and 70.93%) than for the HiSeq2500 data (FPR = 0.00032%, sensitivity = 96.28%). Our dataset can serve as the reference dataset, providing basic information not just for future development, but also for all research and applications based on the new sequencing platform. |
format | Online Article Text |
id | pubmed-5467036 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-5467036 2017-06-19 A reference human genome dataset of the BGISEQ-500 sequencer Huang, Jie Liang, Xinming Xuan, Yuankai Geng, Chunyu Li, Yuxiang Lu, Haorong Qu, Shoufang Mei, Xianglin Chen, Hongbo Yu, Ting Sun, Nan Rao, Junhua Wang, Jiahao Zhang, Wenwei Chen, Ying Liao, Sha Jiang, Hui Liu, Xin Yang, Zhaopeng Mu, Feng Gao, Shangxian Gigascience Data Note Background: BGISEQ-500 is a new desktop sequencer developed by BGI. Using DNA nanoball and combinational probe anchor synthesis developed from Complete Genomics™ sequencing technologies, it generates short reads at a large scale. Findings: Here, we present the first human whole-genome sequencing dataset of BGISEQ-500. The dataset was generated by sequencing the widely used cell line HG001 (NA12878) in two sequencing runs of paired-end 50 bp (PE50) and two sequencing runs of paired-end 100 bp (PE100). We also include examples of the raw images from the sequencer for reference. Finally, we identified variations using this dataset, estimated the accuracy of the variations, and compared it to that of the variations identified from similar amounts of publicly available HiSeq2500 data. Conclusions: We found similar single nucleotide polymorphism (SNP) detection accuracy for the BGISEQ-500 PE100 data (false positive rate [FPR] = 0.00020%, sensitivity = 96.20%) compared to the PE150 HiSeq2500 data (FPR = 0.00017%, sensitivity = 96.60%), and better SNP detection accuracy than the PE50 data (FPR = 0.0006%, sensitivity = 94.15%). For insertions and deletions (indels), however, we found lower accuracy for the BGISEQ-500 data (FPR = 0.00069% and 0.00067% for PE100 and PE50, respectively; sensitivity = 88.52% and 70.93%) than for the HiSeq2500 data (FPR = 0.00032%, sensitivity = 96.28%). Our dataset can serve as the reference dataset, providing basic information not just for future development, but also for all research and applications based on the new sequencing platform. 
Oxford University Press 2017-04-01 /pmc/articles/PMC5467036/ /pubmed/28379488 http://dx.doi.org/10.1093/gigascience/gix024 Text en © The Author 2017. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Data Note Huang, Jie Liang, Xinming Xuan, Yuankai Geng, Chunyu Li, Yuxiang Lu, Haorong Qu, Shoufang Mei, Xianglin Chen, Hongbo Yu, Ting Sun, Nan Rao, Junhua Wang, Jiahao Zhang, Wenwei Chen, Ying Liao, Sha Jiang, Hui Liu, Xin Yang, Zhaopeng Mu, Feng Gao, Shangxian A reference human genome dataset of the BGISEQ-500 sequencer |
title | A reference human genome dataset of the BGISEQ-500 sequencer |
title_full | A reference human genome dataset of the BGISEQ-500 sequencer |
title_fullStr | A reference human genome dataset of the BGISEQ-500 sequencer |
title_full_unstemmed | A reference human genome dataset of the BGISEQ-500 sequencer |
title_short | A reference human genome dataset of the BGISEQ-500 sequencer |
title_sort | reference human genome dataset of the bgiseq-500 sequencer |
topic | Data Note |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5467036/ https://www.ncbi.nlm.nih.gov/pubmed/28379488 http://dx.doi.org/10.1093/gigascience/gix024 |