
Efficient Mining of Variants From Trios for Ventricular Septal Defect Association Study

Ventricular septal defect (VSD) is a fatal congenital heart disease with severe consequences for affected infants. Early diagnosis plays an important role, particularly through genetic variants. Existing panel-based approaches to variant mining suffer from a shortage of large panels, costly sequenci...


Bibliographic Details
Main authors: Jiang, Peng, Hu, Yaofei, Wang, Yiqi, Zhang, Jin, Zhu, Qinghong, Bai, Lin, Tong, Qiang, Li, Tao, Zhao, Liang
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2019
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6694746/
https://www.ncbi.nlm.nih.gov/pubmed/31440271
http://dx.doi.org/10.3389/fgene.2019.00670
_version_ 1783443892273676288
author Jiang, Peng
Hu, Yaofei
Wang, Yiqi
Zhang, Jin
Zhu, Qinghong
Bai, Lin
Tong, Qiang
Li, Tao
Zhao, Liang
collection PubMed
description Ventricular septal defect (VSD) is a fatal congenital heart disease with severe consequences for affected infants. Early diagnosis plays an important role, particularly through genetic variants. Existing panel-based approaches to variant mining suffer from a shortage of large panels, costly sequencing, and missed rare variants. Although a trio-based method alleviates these limitations to some extent, it is agnostic to novel mutations and computationally intensive. Considering these limitations, we propose a novel variant-mining algorithm for trio-based sequencing data and apply it to a VSD trio to identify associated mutations. Our approach starts by filtering irrelevant k-mers from the sequences of a trio via a newly conceived coupled Bloom filter, then corrects sequencing errors using a statistical approach and extends the retained k-mers into long sequences. These extended sequences are used as input for identifying the needed variants. The obtained variants are then comprehensively analyzed against existing databases to mine VSD-related mutations. Experiments show that our trio-based algorithm narrows down candidate coding genes and lncRNAs by about 10-fold and 5-fold, respectively, compared with single sequence-based approaches. Meanwhile, our algorithm is 10 times faster and uses two orders of magnitude less memory than the existing state-of-the-art approach. By applying our approach to a VSD trio, we identify a previously unreported gene (CD80), a combination of two genes (MYBPC3 and TRDN), and a lncRNA (NONHSAT096266.2), which are highly likely to be VSD-related.
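The k-mer filtering step in the description can be illustrated with a minimal sketch. It uses a plain Bloom filter, not the coupled Bloom filter of the article, whose design is not given in this record. The names `BloomFilter`, `kmers`, and `child_specific_kmers`, the parameters `k` and `min_count`, and the count threshold standing in for the statistical error correction are all assumptions made for illustration.

```python
import hashlib
from collections import Counter


class BloomFilter:
    """Plain Bloom filter (an assumption; not the article's coupled variant)."""

    def __init__(self, num_bits=1 << 16, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive several bit positions by salting one hash function with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # May report false positives, never false negatives.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def kmers(seq, k):
    """Yield every substring of length k in seq."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def child_specific_kmers(child_reads, parent_reads, k=5, min_count=2):
    """Keep child k-mers that are absent from the parents and seen often enough.

    The min_count threshold is a crude, hypothetical stand-in for the
    statistical error correction mentioned in the description.
    """
    parental = BloomFilter()
    for read in parent_reads:
        for km in kmers(read, k):
            parental.add(km)
    counts = Counter(km for read in child_reads for km in kmers(read, k))
    return {km for km, c in counts.items()
            if c >= min_count and km not in parental}


if __name__ == "__main__":
    parents = ["ACGTACGTAC"]
    child = ["ACGTTCGTAC", "ACGTTCGTAC"]  # one substitution relative to the parents
    print(sorted(child_specific_kmers(child, parents)))
```

Because a Bloom filter never produces false negatives, every k-mer shared with a parent is discarded; a rare false positive can only discard an additional child k-mer, which makes the filter conservative. Extending the retained k-mers into longer sequences, the next step named in the description, is not covered by this sketch.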
format Online
Article
Text
id pubmed-6694746
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-6694746 2019-08-22 Efficient Mining of Variants From Trios for Ventricular Septal Defect Association Study Jiang, Peng Hu, Yaofei Wang, Yiqi Zhang, Jin Zhu, Qinghong Bai, Lin Tong, Qiang Li, Tao Zhao, Liang Front Genet Genetics Frontiers Media S.A.
2019-08-08 /pmc/articles/PMC6694746/ /pubmed/31440271 http://dx.doi.org/10.3389/fgene.2019.00670 Text en Copyright © 2019 Jiang, Hu, Wang, Zhang, Zhu, Bai, Tong, Li and Zhao http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Efficient Mining of Variants From Trios for Ventricular Septal Defect Association Study
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6694746/
https://www.ncbi.nlm.nih.gov/pubmed/31440271
http://dx.doi.org/10.3389/fgene.2019.00670