CBCR: A Curriculum Based Strategy For Chromosome Reconstruction
In this paper, we introduce a novel algorithm that aims to estimate chromosomes’ structure from their Hi-C contact data, called Curriculum Based Chromosome Reconstruction (CBCR). Specifically, our method performs this three dimensional reconstruction using cis-chromosomal interactions from Hi-C data...
Main authors: | Hovenga, Van; Oluwadare, Oluwatosin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8073114/ https://www.ncbi.nlm.nih.gov/pubmed/33923653 http://dx.doi.org/10.3390/ijms22084140 |
_version_ | 1783684059303510016 |
---|---|
author | Hovenga, Van; Oluwadare, Oluwatosin |
author_facet | Hovenga, Van; Oluwadare, Oluwatosin |
author_sort | Hovenga, Van |
collection | PubMed |
description | In this paper, we introduce a novel algorithm, called Curriculum Based Chromosome Reconstruction (CBCR), that aims to estimate a chromosome's structure from its Hi-C contact data. Specifically, our method performs this three-dimensional reconstruction using cis-chromosomal interactions from Hi-C data. CBCR takes intra-chromosomal Hi-C interaction frequencies as input and outputs a set of (x, y, z) coordinates, in the form of a .pdb file, that estimate the chromosome's three-dimensional structure. The algorithm relies on progressively training a distance-restraint-based algorithm with a strategy we refer to as curriculum learning. Curriculum learning divides the Hi-C data into classes based on contact frequency and progressively re-trains the distance-restraint algorithm based on the assumed importance of each curriculum in predicting the underlying chromosome structure. The distance-restraint algorithm relies on a modification of a Gaussian maximum likelihood function that scales probabilities based on the importance of features. We evaluate the performance of CBCR on both simulated and actual Hi-C data and also validate it on FISH, HiChIP, and ChIA-PET data. We also compare the performance of CBCR to several current methods. Our analysis shows that the use of curricula affects the rate of convergence of the optimization while decreasing the computational cost of our distance-restraint algorithm. In addition, CBCR is more robust to increases in data resolution and therefore yields superior reconstruction accuracy on higher-resolution data than all other methods in our comparison. |
format | Online Article Text |
id | pubmed-8073114 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-8073114 2021-04-27 CBCR: A Curriculum Based Strategy For Chromosome Reconstruction. Hovenga, Van; Oluwadare, Oluwatosin. Int J Mol Sci. Article. MDPI 2021-04-16 /pmc/articles/PMC8073114/ /pubmed/33923653 http://dx.doi.org/10.3390/ijms22084140 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Hovenga, Van Oluwadare, Oluwatosin CBCR: A Curriculum Based Strategy For Chromosome Reconstruction |
title | CBCR: A Curriculum Based Strategy For Chromosome Reconstruction |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8073114/ https://www.ncbi.nlm.nih.gov/pubmed/33923653 http://dx.doi.org/10.3390/ijms22084140 |
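
The curriculum strategy summarized in the description field (dividing Hi-C contacts into classes by contact frequency, then re-training progressively) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the equal-size split, the highest-frequency-first ordering, the cumulative stages, and the `fit_step` callback that stands in for the distance-restraint optimizer are all assumptions made for the example.

```python
from typing import Callable, Iterator, List, Tuple, TypeVar

# One intra-chromosomal contact: (bin_i, bin_j, interaction frequency).
Contact = Tuple[int, int, float]
State = TypeVar("State")


def split_into_curricula(contacts: List[Contact], n_classes: int) -> List[List[Contact]]:
    """Rank contacts by frequency (highest first) and cut them into n_classes groups.

    Assumption: classes are near-equal in size; the paper may define them differently.
    """
    if n_classes < 1:
        raise ValueError("n_classes must be at least 1")
    ranked = sorted(contacts, key=lambda c: c[2], reverse=True)
    size, extra = divmod(len(ranked), n_classes)
    classes: List[List[Contact]] = []
    start = 0
    for k in range(n_classes):
        # The first `extra` classes take one additional contact each.
        end = start + size + (1 if k < extra else 0)
        classes.append(ranked[start:end])
        start = end
    return classes


def curriculum_stages(classes: List[List[Contact]]) -> Iterator[List[Contact]]:
    """Yield cumulative training sets: stage k holds classes 0..k (an assumption)."""
    seen: List[Contact] = []
    for cls in classes:
        seen = seen + cls
        yield list(seen)


def train_with_curricula(
    contacts: List[Contact],
    n_classes: int,
    fit_step: Callable[[State, List[Contact]], State],
    state: State,
) -> State:
    """Re-run a hypothetical optimizer step once per stage, warm-starting from `state`."""
    for stage in curriculum_stages(split_into_curricula(contacts, n_classes)):
        state = fit_step(state, stage)
    return state


# Toy contact list used only to illustrate the split.
example_contacts: List[Contact] = [
    (0, 1, 90.0), (0, 2, 40.0), (1, 2, 85.0),
    (0, 3, 5.0), (1, 3, 12.0), (2, 3, 70.0),
]
example_classes = split_into_curricula(example_contacts, 3)
```

In a real reconstruction, `fit_step` would update the (x, y, z) coordinates against the contacts in the current stage; here it is left abstract so that only the curriculum scheduling is shown.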