CBCR: A Curriculum Based Strategy For Chromosome Reconstruction

In this paper, we introduce Curriculum Based Chromosome Reconstruction (CBCR), a novel algorithm that estimates a chromosome's structure from its Hi-C contact data. Specifically, our method performs this three-dimensional reconstruction using cis-chromosomal interactions from Hi-C data...

Bibliographic Details
Main authors: Hovenga, Van, Oluwadare, Oluwatosin
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8073114/
https://www.ncbi.nlm.nih.gov/pubmed/33923653
http://dx.doi.org/10.3390/ijms22084140
collection PubMed
description In this paper, we introduce Curriculum Based Chromosome Reconstruction (CBCR), a novel algorithm that estimates a chromosome's structure from its Hi-C contact data. Specifically, our method performs this three-dimensional reconstruction using cis-chromosomal interactions from Hi-C data. CBCR takes intra-chromosomal Hi-C interaction frequencies as input and outputs, as a .pdb file, a set of (x, y, z) coordinates that estimate the chromosome's three-dimensional structure. The method progressively trains a distance-restraint-based algorithm with a strategy we refer to as curriculum learning. Curriculum learning divides the Hi-C data into classes based on contact frequency and progressively re-trains the distance-restraint algorithm according to the assumed importance of each curriculum in predicting the underlying chromosome structure. The distance-restraint algorithm relies on a modification of a Gaussian maximum likelihood function that scales probabilities based on the importance of features. We evaluate the performance of CBCR on both simulated and actual Hi-C data, and we also validate it on FISH, HiChIP, and ChIA-PET data. We further compare the performance of CBCR to several current methods. Our analysis shows that the use of curricula affects the rate of convergence of the optimization while decreasing the computational cost of our distance-restraint algorithm. In addition, CBCR is more robust to increases in data resolution and therefore reconstructs higher-resolution data more accurately than all other methods in our comparison.
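The curriculum idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes a simple inverse relation between contact frequency and distance (`d = 1 / f**alpha`), uses a plain squared-error objective (which corresponds to a Gaussian likelihood with fixed variance) in place of the paper's modified likelihood, and adds curricula cumulatively from the highest-frequency class down. All function names, parameters, and the toy helix data are hypothetical.

```python
import numpy as np

def contacts_to_distances(freq, alpha=1.0):
    """Assumed conversion: target distance d = 1 / f**alpha for observed contacts."""
    dist = np.zeros_like(freq, dtype=float)
    mask = freq > 0
    dist[mask] = 1.0 / freq[mask] ** alpha
    return dist, mask

def split_into_curricula(freq, mask, n_classes=3):
    """Group observed bin pairs (i < j) into classes, highest frequency first."""
    i, j = np.triu_indices_from(freq, k=1)
    keep = mask[i, j]
    i, j = i[keep], j[keep]
    order = np.argsort(-freq[i, j])
    return [(i[c], j[c]) for c in np.array_split(order, n_classes)]

def stress(coords, pairs, target):
    """Mean squared difference between model and target distances."""
    i, j = pairs
    d = np.linalg.norm(coords[i] - coords[j], axis=1)
    return float(np.mean((d - target[i, j]) ** 2))

def fit(coords, pairs, target, steps=500, lr=0.01):
    """Gradient descent on 0.5 * sum of squared distance errors over `pairs`."""
    i, j = pairs
    for _ in range(steps):
        diff = coords[i] - coords[j]
        d = np.linalg.norm(diff, axis=1) + 1e-9   # avoid division by zero
        g = ((d - target[i, j]) / d)[:, None] * diff
        grad = np.zeros_like(coords)
        np.add.at(grad, i, g)    # accumulate per-pair gradients on bin i
        np.add.at(grad, j, -g)   # and the opposite sign on bin j
        coords = coords - lr * grad
    return coords

# Toy data (hypothetical): 20 bins along a helix, frequency = 1 / distance.
t = np.linspace(0.0, 4.0 * np.pi, 20)
true = np.column_stack([np.cos(t), np.sin(t), 0.3 * t])
d_true = np.linalg.norm(true[:, None] - true[None], axis=2)
freq = np.zeros_like(d_true)
off_diag = d_true > 0
freq[off_diag] = 1.0 / d_true[off_diag]

target, mask = contacts_to_distances(freq)
curricula = split_into_curricula(freq, mask, n_classes=3)
all_pairs = (np.concatenate([c[0] for c in curricula]),
             np.concatenate([c[1] for c in curricula]))

rng = np.random.default_rng(0)
coords = rng.normal(size=(len(t), 3))
initial = stress(coords, all_pairs, target)

# Re-train progressively: each stage adds the next curriculum to the restraints.
for k in range(1, len(curricula) + 1):
    stage = (np.concatenate([c[0] for c in curricula[:k]]),
             np.concatenate([c[1] for c in curricula[:k]]))
    coords = fit(coords, stage, target)

final = stress(coords, all_pairs, target)
print(f"stress before: {initial:.3f}, after: {final:.3f}")
```

Starting from the highest-frequency class means the early stages fit only short-range restraints, which are the ones a frequency-to-distance conversion tends to pin down most reliably; whether that ordering matches the paper's "assumed importance" is an assumption of this sketch.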
format Online
Article
Text
id pubmed-8073114
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8073114 2021-04-27 CBCR: A Curriculum Based Strategy For Chromosome Reconstruction Hovenga, Van Oluwadare, Oluwatosin Int J Mol Sci Article MDPI 2021-04-16 /pmc/articles/PMC8073114/ /pubmed/33923653 http://dx.doi.org/10.3390/ijms22084140 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).