Computational framework for targeted high-coverage sequencing based NIPT

Non-invasive prenatal testing (NIPT) enables accurate detection of fetal chromosomal trisomies. The majority of publicly available computational methods for sequencing-based NIPT analyses rely on low-coverage whole-genome sequencing (WGS) data and are not applicable for targeted high-coverage sequen...

Bibliographic Details
Main authors: Teder, Hindrek; Paluoja, Priit; Rekker, Kadri; Salumets, Andres; Krjutškov, Kaarel; Palta, Priit
Format: Online Article Text
Language: English
Published: Public Library of Science 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6613673/
https://www.ncbi.nlm.nih.gov/pubmed/31283802
http://dx.doi.org/10.1371/journal.pone.0209139
author Teder, Hindrek
Paluoja, Priit
Rekker, Kadri
Salumets, Andres
Krjutškov, Kaarel
Palta, Priit
collection PubMed
description Non-invasive prenatal testing (NIPT) enables accurate detection of fetal chromosomal trisomies. The majority of publicly available computational methods for sequencing-based NIPT analyses rely on low-coverage whole-genome sequencing (WGS) data and are not applicable to targeted high-coverage sequencing data from cell-free DNA samples. Here, we present a novel computational framework for targeted high-coverage sequencing-based NIPT analysis. The framework uses a hidden Markov model (HMM) in conjunction with a supplemental machine learning model, such as a decision tree (DT) or support vector machine (SVM), to detect fetal trisomy and the parental origin of additional fetal chromosomes. These models were developed using simulated datasets covering a wide range of biologically relevant scenarios with various chromosomal quantities, parental origins of extra chromosomes, fetal DNA fractions, and sequencing read depths. The developed models were tested on simulated and experimental targeted sequencing datasets. Consequently, we determined the functional feasibility and limitations of each proposed approach and demonstrated that the read count-based HMM achieved the best overall classification accuracy of 0.89 for detecting fetal euploidies and trisomies on the simulated dataset. Furthermore, we show that by applying the DT and SVM to the HMM classification results, it was possible to increase the final trisomy classification accuracy to 0.98 and 0.99, respectively. We demonstrate that read count and allelic ratio-based models can achieve high accuracy (up to 0.98) for detecting fetal trisomy even if the fetal fraction is as low as 2%. Existing commercial NIPT analysis currently requires a fetal fraction of at least 4%, which can be a challenge in cases of early gestational age (<10 weeks) or high maternal body mass index (>35 kg/m²). More accurate detection can be achieved at higher sequencing depth using the HMM in conjunction with supplemental models, which significantly improve trisomy detection, especially in borderline scenarios (e.g., very low fetal fraction), and enable NIPT to be performed even earlier than 10 weeks of pregnancy.
format Online
Article
Text
id pubmed-6613673
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-6613673 2019-07-23 Computational framework for targeted high-coverage sequencing based NIPT Teder, Hindrek Paluoja, Priit Rekker, Kadri Salumets, Andres Krjutškov, Kaarel Palta, Priit PLoS One Research Article Public Library of Science 2019-07-08 /pmc/articles/PMC6613673/ /pubmed/31283802 http://dx.doi.org/10.1371/journal.pone.0209139 Text en © 2019 Teder et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Computational framework for targeted high-coverage sequencing based NIPT
topic Research Article