EagleC: A deep-learning framework for detecting a full range of structural variations from bulk and single-cell contact maps

The Hi-C technique has been shown to be a promising method to detect structural variations (SVs) in human genomes. However, algorithms that can use Hi-C data for a full-range SV detection have been severely lacking. Current methods can only identify interchromosomal translocations and long-range int...

Bibliographic Details
Main authors: Wang, Xiaotao, Luan, Yu, Yue, Feng
Format: Online Article Text
Language: English
Published: American Association for the Advancement of Science 2022
Subjects: Biomedicine and Life Sciences
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9200291/
https://www.ncbi.nlm.nih.gov/pubmed/35704579
http://dx.doi.org/10.1126/sciadv.abn9215
author Wang, Xiaotao
Luan, Yu
Yue, Feng
collection PubMed
description The Hi-C technique has been shown to be a promising method to detect structural variations (SVs) in human genomes. However, algorithms that can use Hi-C data for a full-range SV detection have been severely lacking. Current methods can only identify interchromosomal translocations and long-range intrachromosomal SVs (>1 Mb) at less-than-optimal resolution. Therefore, we develop EagleC, a framework that combines deep-learning and ensemble-learning strategies to predict a full range of SVs at high resolution. We show that EagleC can uniquely capture a set of fusion genes that are missed by whole-genome sequencing or nanopore. Furthermore, EagleC also effectively captures SVs in other chromatin interaction platforms, such as HiChIP, Chromatin interaction analysis with paired-end tag sequencing (ChIA-PET), and capture Hi-C. We apply EagleC in more than 100 cancer cell lines and primary tumors and identify a valuable set of high-quality SVs. Last, we demonstrate that EagleC can be applied to single-cell Hi-C and used to study the SV heterogeneity in primary tumors.
format Online
Article
Text
id pubmed-9200291
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher American Association for the Advancement of Science
record_format MEDLINE/PubMed
spelling pubmed-9200291 2022-06-27 EagleC: A deep-learning framework for detecting a full range of structural variations from bulk and single-cell contact maps Wang, Xiaotao Luan, Yu Yue, Feng Sci Adv Biomedicine and Life Sciences American Association for the Advancement of Science 2022-06-15 /pmc/articles/PMC9200291/ /pubmed/35704579 http://dx.doi.org/10.1126/sciadv.abn9215 Text en Copyright © 2022 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works. Distributed under a Creative Commons Attribution NonCommercial License 4.0 (CC BY-NC).
https://creativecommons.org/licenses/by-nc/4.0/
This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license (https://creativecommons.org/licenses/by-nc/4.0/), which permits use, distribution, and reproduction in any medium, so long as the resultant use is not for commercial advantage and provided the original work is properly cited.
title EagleC: A deep-learning framework for detecting a full range of structural variations from bulk and single-cell contact maps
topic Biomedicine and Life Sciences