
Maximal Segmental Score Method for Localizing Recessive Disease Variants Based on Sequence Data

BACKGROUND: Due to the affordability of whole-genome sequencing, the genetic association design can now address rare diseases. However, some common statistical association methods only consider homozygosity mapping and need several criteria, such as sliding windows of a given size and statistical si...

Bibliographic Details
Main Authors: Hsieh, Ai-Ru, Sie, Jia Jyun, Chang, Chien Ching, Ott, Jurg, Lian, Ie-Bin, Fann, Cathy S. J.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2020
Subjects: Genetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7325894/
https://www.ncbi.nlm.nih.gov/pubmed/32655614
http://dx.doi.org/10.3389/fgene.2020.00555
_version_ 1783552227614392320
author Hsieh, Ai-Ru
Sie, Jia Jyun
Chang, Chien Ching
Ott, Jurg
Lian, Ie-Bin
Fann, Cathy S. J.
author_facet Hsieh, Ai-Ru
Sie, Jia Jyun
Chang, Chien Ching
Ott, Jurg
Lian, Ie-Bin
Fann, Cathy S. J.
author_sort Hsieh, Ai-Ru
collection PubMed
description BACKGROUND: Due to the affordability of whole-genome sequencing, the genetic association design can now address rare diseases. However, some common statistical association methods only consider homozygosity mapping and need several criteria, such as sliding windows of a given size and statistical significance threshold setting, such as P-value < 0.05 to achieve good power in rare disease association detection. METHODS: Our region-specific method, called expanded maximal segmental score (eMSS), converts p-values into continuous scores based on the maximal segmental score (MSS) (Lin et al., 2014) for detecting disease-associated segments. Our eMSS considers the whole genome sequence data, not only regions of homozygosity in candidate genes. Unlike sliding window methods of a given size, eMSS does not need predetermined parameters, such as window size or minimum or maximum number of SNPs in a segment. The performance of eMSS was evaluated by simulations and real data analysis for autosomal recessive diseases multiple intestinal atresia (MIA) and osteogenesis imperfecta (OI), where the number of cases is extremely small. For the real data, the results by eMSS were compared with a state-of-the-art method, HDR-del (Imai et al., 2016). RESULTS: Our simulation results show that eMSS had higher power as the number of non-causal haplotype blocks decreased. The type I error for eMSS under different scenarios was well controlled, p < 0.05. For our observed data, the bone morphogenetic protein 1 (BMP1) gene on chromosome 8, the Violaxanthin de-epoxidase-related chloroplast (VDR) gene on chromosome 12 associated with OI, and the tetratricopeptide repeat domain 7A (TTC7A) gene on chromosome 2 associated with MIA have previously been identified as harboring the relevant pathogenic mutations. 
CONCLUSIONS: When compared to HDR-del, our eMSS is powerful in analyzing even small numbers of recessive cases, and the results show that the method can further reduce numbers of candidate variants to a very small set of susceptibility pathogenic variants underlying OI and MIA. When we conduct whole-genome sequence analysis, eMSS used 3/5 the computation time of HDR-del. Without additional parameters needing to be set in the segment detection, the computational burden for eMSS is lower compared with that in other region-specific approaches.
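The description above says that p-values are converted into continuous scores and that the highest-scoring segment is located without a preset window size. The following is a minimal sketch of that general idea only, not the authors' eMSS implementation: the function name `maximal_segment`, the score transform `-log10(p) - offset`, and the default `offset=1.0` are all assumptions made for illustration.

```python
import math


def maximal_segment(p_values, offset=1.0):
    """Find the contiguous run of markers with the highest summed score.

    Assumed scoring (hypothetical, not taken from the paper): each p-value
    becomes -log10(p) - offset, so small p-values score positive and large
    ones score negative. p-values must lie in (0, 1].
    Returns (start, end, score) with 0-based inclusive indices.
    """
    best_score = float("-inf")
    best_start = best_end = 0
    run_score = 0.0
    run_start = 0
    for i, p in enumerate(p_values):
        score = -math.log10(p) - offset
        if run_score <= 0.0:
            # A non-positive running sum cannot help a later segment,
            # so start a new candidate segment at this marker.
            run_score = score
            run_start = i
        else:
            run_score += score
        if run_score > best_score:
            best_score = run_score
            best_start, best_end = run_start, i
    return best_start, best_end, best_score


# Illustrative p-values along one chromosome (made-up numbers).
p_values = [0.5, 0.8, 0.001, 0.0001, 0.01, 0.9, 0.3]
start, end, score = maximal_segment(p_values)
print(start, end, round(score, 3))
```

Because the segment boundaries fall out of the running sum, the scan makes a single pass over the markers and needs no window length; the offset is the only tuning value in this sketch, and how the published method calibrates its scores should be checked against the article itself.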
format Online
Article
Text
id pubmed-7325894
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-73258942020-07-09 Maximal Segmental Score Method for Localizing Recessive Disease Variants Based on Sequence Data Hsieh, Ai-Ru Sie, Jia Jyun Chang, Chien Ching Ott, Jurg Lian, Ie-Bin Fann, Cathy S. J. Front Genet Genetics BACKGROUND: Due to the affordability of whole-genome sequencing, the genetic association design can now address rare diseases. However, some common statistical association methods only consider homozygosity mapping and need several criteria, such as sliding windows of a given size and statistical significance threshold setting, such as P-value < 0.05 to achieve good power in rare disease association detection. METHODS: Our region-specific method, called expanded maximal segmental score (eMSS), converts p-values into continuous scores based on the maximal segmental score (MSS) (Lin et al., 2014) for detecting disease-associated segments. Our eMSS considers the whole genome sequence data, not only regions of homozygosity in candidate genes. Unlike sliding window methods of a given size, eMSS does not need predetermined parameters, such as window size or minimum or maximum number of SNPs in a segment. The performance of eMSS was evaluated by simulations and real data analysis for autosomal recessive diseases multiple intestinal atresia (MIA) and osteogenesis imperfecta (OI), where the number of cases is extremely small. For the real data, the results by eMSS were compared with a state-of-the-art method, HDR-del (Imai et al., 2016). RESULTS: Our simulation results show that eMSS had higher power as the number of non-causal haplotype blocks decreased. The type I error for eMSS under different scenarios was well controlled, p < 0.05. 
For our observed data, the bone morphogenetic protein 1 (BMP1) gene on chromosome 8, the Violaxanthin de-epoxidase-related chloroplast (VDR) gene on chromosome 12 associated with OI, and the tetratricopeptide repeat domain 7A (TTC7A) gene on chromosome 2 associated with MIA have previously been identified as harboring the relevant pathogenic mutations. CONCLUSIONS: When compared to HDR-del, our eMSS is powerful in analyzing even small numbers of recessive cases, and the results show that the method can further reduce numbers of candidate variants to a very small set of susceptibility pathogenic variants underlying OI and MIA. When we conduct whole-genome sequence analysis, eMSS used 3/5 the computation time of HDR-del. Without additional parameters needing to be set in the segment detection, the computational burden for eMSS is lower compared with that in other region-specific approaches. Frontiers Media S.A. 2020-06-12 /pmc/articles/PMC7325894/ /pubmed/32655614 http://dx.doi.org/10.3389/fgene.2020.00555 Text en Copyright © 2020 Hsieh, Sie, Chang, Ott, Lian and Fann. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Genetics
Hsieh, Ai-Ru
Sie, Jia Jyun
Chang, Chien Ching
Ott, Jurg
Lian, Ie-Bin
Fann, Cathy S. J.
Maximal Segmental Score Method for Localizing Recessive Disease Variants Based on Sequence Data
title Maximal Segmental Score Method for Localizing Recessive Disease Variants Based on Sequence Data
title_full Maximal Segmental Score Method for Localizing Recessive Disease Variants Based on Sequence Data
title_fullStr Maximal Segmental Score Method for Localizing Recessive Disease Variants Based on Sequence Data
title_full_unstemmed Maximal Segmental Score Method for Localizing Recessive Disease Variants Based on Sequence Data
title_short Maximal Segmental Score Method for Localizing Recessive Disease Variants Based on Sequence Data
title_sort maximal segmental score method for localizing recessive disease variants based on sequence data
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7325894/
https://www.ncbi.nlm.nih.gov/pubmed/32655614
http://dx.doi.org/10.3389/fgene.2020.00555
work_keys_str_mv AT hsiehairu maximalsegmentalscoremethodforlocalizingrecessivediseasevariantsbasedonsequencedata
AT siejiajyun maximalsegmentalscoremethodforlocalizingrecessivediseasevariantsbasedonsequencedata
AT changchienching maximalsegmentalscoremethodforlocalizingrecessivediseasevariantsbasedonsequencedata
AT ottjurg maximalsegmentalscoremethodforlocalizingrecessivediseasevariantsbasedonsequencedata
AT lianiebin maximalsegmentalscoremethodforlocalizingrecessivediseasevariantsbasedonsequencedata
AT fanncathysj maximalsegmentalscoremethodforlocalizingrecessivediseasevariantsbasedonsequencedata