Cargando…

CBCR: A Curriculum Based Strategy For Chromosome Reconstruction

In this paper, we introduce a novel algorithm that aims to estimate chromosomes’ structure from their Hi-C contact data, called Curriculum Based Chromosome Reconstruction (CBCR). Specifically, our method performs this three dimensional reconstruction using cis-chromosomal interactions from Hi-C data...

Descripción completa

Detalles Bibliográficos
Autores principales: Hovenga, Van, Oluwadare, Oluwatosin
Formato: Online Artículo Texto
Lenguaje:English
Publicado: MDPI 2021
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8073114/
https://www.ncbi.nlm.nih.gov/pubmed/33923653
http://dx.doi.org/10.3390/ijms22084140
Descripción
Sumario:In this paper, we introduce a novel algorithm that aims to estimate chromosomes’ structure from their Hi-C contact data, called Curriculum Based Chromosome Reconstruction (CBCR). Specifically, our method performs this three dimensional reconstruction using cis-chromosomal interactions from Hi-C data. CBCR takes intra-chromosomal Hi-C interaction frequencies as an input and outputs a set of [Formula: see text] coordinates that estimate the chromosome’s three dimensional structure in the form of a .pdb file. The algorithm relies on progressively training a distance-restraint-based algorithm with a strategy we refer to as curriculum learning. Curriculum learning divides the Hi-C data into classes based on contact frequency and progressively re-trains the distance-restraint algorithm based on the assumed importance of each curriculum in predicting the underlying chromosome structure. The distance-restraint algorithm relies on a modification of a Gaussian maximum likelihood function that scales probabilities based on the importance of features. We evaluate the performance of CBCR on both simulated and actual Hi-C data and perform validation on FISH, HiChIP, and ChIA-PET data as well. We also compare the performance of CBCR to several current methods. Our analysis shows that the use of curricula affects the rate of convergence of the optimization while decreasing the computational cost of our distance-restraint algorithm. Also, CBCR is more robust to increases in data resolution and therefore yields superior reconstruction accuracy of higher resolution data than all other methods in our comparison.