EagleC: A deep-learning framework for detecting a full range of structural variations from bulk and single-cell contact maps
The Hi-C technique has been shown to be a promising method to detect structural variations (SVs) in human genomes. However, algorithms that can use Hi-C data for a full-range SV detection have been severely lacking. Current methods can only identify interchromosomal translocations and long-range int...
Main authors: | Wang, Xiaotao; Luan, Yu; Yue, Feng |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Association for the Advancement of Science, 2022 |
Subjects: | Biomedicine and Life Sciences |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9200291/ https://www.ncbi.nlm.nih.gov/pubmed/35704579 http://dx.doi.org/10.1126/sciadv.abn9215 |
_version_ | 1784728027943403520 |
---|---|
author | Wang, Xiaotao Luan, Yu Yue, Feng |
author_facet | Wang, Xiaotao Luan, Yu Yue, Feng |
author_sort | Wang, Xiaotao |
collection | PubMed |
description | The Hi-C technique has been shown to be a promising method to detect structural variations (SVs) in human genomes. However, algorithms that can use Hi-C data for a full-range SV detection have been severely lacking. Current methods can only identify interchromosomal translocations and long-range intrachromosomal SVs (>1 Mb) at less-than-optimal resolution. Therefore, we develop EagleC, a framework that combines deep-learning and ensemble-learning strategies to predict a full range of SVs at high resolution. We show that EagleC can uniquely capture a set of fusion genes that are missed by whole-genome sequencing or nanopore. Furthermore, EagleC also effectively captures SVs in other chromatin interaction platforms, such as HiChIP, Chromatin interaction analysis with paired-end tag sequencing (ChIA-PET), and capture Hi-C. We apply EagleC in more than 100 cancer cell lines and primary tumors and identify a valuable set of high-quality SVs. Last, we demonstrate that EagleC can be applied to single-cell Hi-C and used to study the SV heterogeneity in primary tumors. |
format | Online Article Text |
id | pubmed-9200291 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | American Association for the Advancement of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-9200291 2022-06-27 EagleC: A deep-learning framework for detecting a full range of structural variations from bulk and single-cell contact maps Wang, Xiaotao Luan, Yu Yue, Feng Sci Adv Biomedicine and Life Sciences The Hi-C technique has been shown to be a promising method to detect structural variations (SVs) in human genomes. However, algorithms that can use Hi-C data for a full-range SV detection have been severely lacking. Current methods can only identify interchromosomal translocations and long-range intrachromosomal SVs (>1 Mb) at less-than-optimal resolution. Therefore, we develop EagleC, a framework that combines deep-learning and ensemble-learning strategies to predict a full range of SVs at high resolution. We show that EagleC can uniquely capture a set of fusion genes that are missed by whole-genome sequencing or nanopore. Furthermore, EagleC also effectively captures SVs in other chromatin interaction platforms, such as HiChIP, Chromatin interaction analysis with paired-end tag sequencing (ChIA-PET), and capture Hi-C. We apply EagleC in more than 100 cancer cell lines and primary tumors and identify a valuable set of high-quality SVs. Last, we demonstrate that EagleC can be applied to single-cell Hi-C and used to study the SV heterogeneity in primary tumors. American Association for the Advancement of Science 2022-06-15 /pmc/articles/PMC9200291/ /pubmed/35704579 http://dx.doi.org/10.1126/sciadv.abn9215 Text en Copyright © 2022 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works. Distributed under a Creative Commons Attribution NonCommercial License 4.0 (CC BY-NC).
https://creativecommons.org/licenses/by-nc/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license (https://creativecommons.org/licenses/by-nc/4.0/), which permits use, distribution, and reproduction in any medium, so long as the resultant use is not for commercial advantage and provided the original work is properly cited. |
spellingShingle | Biomedicine and Life Sciences Wang, Xiaotao Luan, Yu Yue, Feng EagleC: A deep-learning framework for detecting a full range of structural variations from bulk and single-cell contact maps |
title | EagleC: A deep-learning framework for detecting a full range of structural variations from bulk and single-cell contact maps |
title_full | EagleC: A deep-learning framework for detecting a full range of structural variations from bulk and single-cell contact maps |
title_fullStr | EagleC: A deep-learning framework for detecting a full range of structural variations from bulk and single-cell contact maps |
title_full_unstemmed | EagleC: A deep-learning framework for detecting a full range of structural variations from bulk and single-cell contact maps |
title_short | EagleC: A deep-learning framework for detecting a full range of structural variations from bulk and single-cell contact maps |
title_sort | eaglec: a deep-learning framework for detecting a full range of structural variations from bulk and single-cell contact maps |
topic | Biomedicine and Life Sciences |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9200291/ https://www.ncbi.nlm.nih.gov/pubmed/35704579 http://dx.doi.org/10.1126/sciadv.abn9215 |
work_keys_str_mv | AT wangxiaotao eaglecadeeplearningframeworkfordetectingafullrangeofstructuralvariationsfrombulkandsinglecellcontactmaps AT luanyu eaglecadeeplearningframeworkfordetectingafullrangeofstructuralvariationsfrombulkandsinglecellcontactmaps AT yuefeng eaglecadeeplearningframeworkfordetectingafullrangeofstructuralvariationsfrombulkandsinglecellcontactmaps |