Efficient Mining of Variants From Trios for Ventricular Septal Defect Association Study
Ventricular septal defect (VSD) is a fatal congenital heart disease with severe consequences for affected infants. Early diagnosis plays an important role, particularly through genetic variants. Existing panel-based approaches to variant mining suffer from a shortage of large panels, costly sequenci...
Main authors: | Jiang, Peng; Hu, Yaofei; Wang, Yiqi; Zhang, Jin; Zhu, Qinghong; Bai, Lin; Tong, Qiang; Li, Tao; Zhao, Liang |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A., 2019 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6694746/ https://www.ncbi.nlm.nih.gov/pubmed/31440271 http://dx.doi.org/10.3389/fgene.2019.00670 |
_version_ | 1783443892273676288 |
---|---|
author | Jiang, Peng Hu, Yaofei Wang, Yiqi Zhang, Jin Zhu, Qinghong Bai, Lin Tong, Qiang Li, Tao Zhao, Liang |
author_facet | Jiang, Peng Hu, Yaofei Wang, Yiqi Zhang, Jin Zhu, Qinghong Bai, Lin Tong, Qiang Li, Tao Zhao, Liang |
author_sort | Jiang, Peng |
collection | PubMed |
description | Ventricular septal defect (VSD) is a fatal congenital heart disease with severe consequences for affected infants. Early diagnosis plays an important role, particularly through genetic variants. Existing panel-based approaches to variant mining suffer from a shortage of large panels, costly sequencing, and missed rare variants. Although a trio-based method alleviates these limitations to some extent, it is agnostic to novel mutations and computationally intensive. Considering these limitations, we study a novel variant mining algorithm for trio-based sequencing data and apply it to a VSD trio to identify associated mutations. Our approach starts by filtering irrelevant k-mers from the sequences of a trio via a newly conceived coupled Bloom filter, then corrects sequencing errors using a statistical approach and extends the retained k-mers into long sequences. These extended sequences serve as input for identifying the variants of interest. The obtained variants are then comprehensively analyzed against existing databases to mine VSD-related mutations. Experiments show that our trio-based algorithm narrows down candidate coding genes and lncRNAs by about 10-fold and 5-fold, respectively, compared with single-sequence-based approaches. Meanwhile, our algorithm is 10 times faster and two orders of magnitude more memory-frugal than the existing state-of-the-art approach. By applying our approach to a VSD trio, we identify an unreported gene (CD80), a combination of two genes (MYBPC3 and TRDN), and a lncRNA (NONHSAT096266.2), which are highly likely to be VSD-related. |
format | Online Article Text |
id | pubmed-6694746 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-6694746 2019-08-22 Efficient Mining of Variants From Trios for Ventricular Septal Defect Association Study Jiang, Peng Hu, Yaofei Wang, Yiqi Zhang, Jin Zhu, Qinghong Bai, Lin Tong, Qiang Li, Tao Zhao, Liang Front Genet Genetics Ventricular septal defect (VSD) is a fatal congenital heart disease with severe consequences for affected infants. Early diagnosis plays an important role, particularly through genetic variants. Existing panel-based approaches to variant mining suffer from a shortage of large panels, costly sequencing, and missed rare variants. Although a trio-based method alleviates these limitations to some extent, it is agnostic to novel mutations and computationally intensive. Considering these limitations, we study a novel variant mining algorithm for trio-based sequencing data and apply it to a VSD trio to identify associated mutations. Our approach starts by filtering irrelevant k-mers from the sequences of a trio via a newly conceived coupled Bloom filter, then corrects sequencing errors using a statistical approach and extends the retained k-mers into long sequences. These extended sequences serve as input for identifying the variants of interest. The obtained variants are then comprehensively analyzed against existing databases to mine VSD-related mutations. Experiments show that our trio-based algorithm narrows down candidate coding genes and lncRNAs by about 10-fold and 5-fold, respectively, compared with single-sequence-based approaches. Meanwhile, our algorithm is 10 times faster and two orders of magnitude more memory-frugal than the existing state-of-the-art approach. By applying our approach to a VSD trio, we identify an unreported gene (CD80), a combination of two genes (MYBPC3 and TRDN), and a lncRNA (NONHSAT096266.2), which are highly likely to be VSD-related. Frontiers Media S.A. 
2019-08-08 /pmc/articles/PMC6694746/ /pubmed/31440271 http://dx.doi.org/10.3389/fgene.2019.00670 Text en Copyright © 2019 Jiang, Hu, Wang, Zhang, Zhu, Bai, Tong, Li and Zhao http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Genetics Jiang, Peng Hu, Yaofei Wang, Yiqi Zhang, Jin Zhu, Qinghong Bai, Lin Tong, Qiang Li, Tao Zhao, Liang Efficient Mining of Variants From Trios for Ventricular Septal Defect Association Study |
title | Efficient Mining of Variants From Trios for Ventricular Septal Defect Association Study |
title_full | Efficient Mining of Variants From Trios for Ventricular Septal Defect Association Study |
title_fullStr | Efficient Mining of Variants From Trios for Ventricular Septal Defect Association Study |
title_full_unstemmed | Efficient Mining of Variants From Trios for Ventricular Septal Defect Association Study |
title_short | Efficient Mining of Variants From Trios for Ventricular Septal Defect Association Study |
title_sort | efficient mining of variants from trios for ventricular septal defect association study |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6694746/ https://www.ncbi.nlm.nih.gov/pubmed/31440271 http://dx.doi.org/10.3389/fgene.2019.00670 |
work_keys_str_mv | AT jiangpeng efficientminingofvariantsfromtriosforventricularseptaldefectassociationstudy AT huyaofei efficientminingofvariantsfromtriosforventricularseptaldefectassociationstudy AT wangyiqi efficientminingofvariantsfromtriosforventricularseptaldefectassociationstudy AT zhangjin efficientminingofvariantsfromtriosforventricularseptaldefectassociationstudy AT zhuqinghong efficientminingofvariantsfromtriosforventricularseptaldefectassociationstudy AT bailin efficientminingofvariantsfromtriosforventricularseptaldefectassociationstudy AT tongqiang efficientminingofvariantsfromtriosforventricularseptaldefectassociationstudy AT litao efficientminingofvariantsfromtriosforventricularseptaldefectassociationstudy AT zhaoliang efficientminingofvariantsfromtriosforventricularseptaldefectassociationstudy |
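
The description field above mentions filtering irrelevant k-mers from a trio with a Bloom filter. The record does not say how the authors' "coupled" Bloom filter is built, so the following is only a minimal sketch of the general idea under stated assumptions: a plain Bloom filter is loaded with the parents' k-mers, and child k-mers that appear in it are discarded. The class `BloomFilter`, the functions `kmers` and `child_specific_kmers`, and all parameter values are hypothetical names chosen for illustration. Reverse complements, error correction, and k-mer extension are not modeled.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter (an assumption; not the paper's coupled variant)."""

    def __init__(self, num_bits=1 << 16, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive num_hashes bit positions by hashing the item with a seed prefix.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "little") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def __contains__(self, item):
        # May report false positives, never false negatives.
        return all(self.bits[pos >> 3] & (1 << (pos & 7))
                   for pos in self._positions(item))


def kmers(seq, k):
    """Yield every length-k substring of seq."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def child_specific_kmers(child_reads, father_reads, mother_reads, k):
    """Keep child k-mers that the parental Bloom filter does not contain."""
    parents = BloomFilter()
    for read in list(father_reads) + list(mother_reads):
        for km in kmers(read, k):
            parents.add(km)
    kept = set()
    for read in child_reads:
        for km in kmers(read, k):
            if km not in parents:
                kept.add(km)
    return kept


if __name__ == "__main__":
    father = ["ACGTACGT"]
    mother = ["ACGTACGT"]
    child = ["ACGTTCGT"]  # one substitution relative to the parents
    print(sorted(child_specific_kmers(child, father, mother, k=4)))
```

Because a Bloom filter has no false negatives, every k-mer seen in a parent is guaranteed to be filtered out; the cost is that a false positive can occasionally discard a genuinely child-specific k-mer, which is why the filter size and hash count would need tuning for real sequencing data.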