Impact of data resolution on three-dimensional structure inference methods
BACKGROUND: Assays capable of detecting genome-wide chromatin interactions have produced massive amounts of data and led to a greater understanding of chromosomal three-dimensional (3D) structure. As technology becomes more sophisticated, data of ever higher resolution are being produced,...
Main authors: | Park, Jincheol; Lin, Shili |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2016 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4744395/ https://www.ncbi.nlm.nih.gov/pubmed/26852142 http://dx.doi.org/10.1186/s12859-016-0894-z |
_version_ | 1782414482252759040 |
---|---|
author | Park, Jincheol Lin, Shili |
author_facet | Park, Jincheol Lin, Shili |
author_sort | Park, Jincheol |
collection | PubMed |
description | BACKGROUND: Assays capable of detecting genome-wide chromatin interactions have produced massive amounts of data and led to a greater understanding of chromosomal three-dimensional (3D) structure. As technology becomes more sophisticated, data of ever higher resolution are being produced, going from the initial 1 megabase (Mb) resolution to the current 10 kilobase (Kb) or even 1 Kb resolution. The availability of genome-wide interaction data necessitates the development of analytical methods to recover the underlying 3D spatial chromatin structure, but challenges abound. Most existing methods were proposed for analyzing data at low resolution (1 Mb); their behavior on higher-resolution data is thus unknown. A key feature of such data is the high proportion of “0” contact counts among all available data, in other words, an excess of zeros. RESULTS: To address the excess of zeros, we propose a truncated Random effect EXpression (tREX) method that can handle data at various resolutions. We then assess the performance of tREX and a number of leading existing methods for recovering the underlying chromatin 3D structure. This was accomplished by creating in-silico data that mimic multiple levels of resolution and submitting the methods to a “stress test”. Finally, we applied tREX and the comparison methods to a Hi-C dataset for which FISH measurements are available to evaluate estimation accuracy. CONCLUSION: The proposed tREX method achieves consistently good performance in all 30 simulated settings considered. It is not only robust to resolution level and underlying parameters, but also insensitive to model misspecification. This conclusion is based on observations made in terms of 3D structure estimation accuracy and preservation of topologically associated domains. Application of the methods to the human lymphoblastoid cell line data on chromosomes 14 and 22 further substantiates the superior performance of tREX: the 3D structure constructed by tREX is consistent with the FISH measurements, and the corresponding distances predicted by tREX have a higher correlation with the FISH measurements than those of any of the comparison methods. SOFTWARE: An open-source R package is available at http://www.stat.osu.edu/~statgen/Software/tRex. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-0894-z) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-4744395 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-47443952016-02-07 Impact of data resolution on three-dimensional structure inference methods Park, Jincheol Lin, Shili BMC Bioinformatics Research Article BACKGROUND: Assays capable of detecting genome-wide chromatin interactions have produced massive amounts of data and led to a greater understanding of chromosomal three-dimensional (3D) structure. As technology becomes more sophisticated, data of ever higher resolution are being produced, going from the initial 1 megabase (Mb) resolution to the current 10 kilobase (Kb) or even 1 Kb resolution. The availability of genome-wide interaction data necessitates the development of analytical methods to recover the underlying 3D spatial chromatin structure, but challenges abound. Most existing methods were proposed for analyzing data at low resolution (1 Mb); their behavior on higher-resolution data is thus unknown. A key feature of such data is the high proportion of “0” contact counts among all available data, in other words, an excess of zeros. RESULTS: To address the excess of zeros, we propose a truncated Random effect EXpression (tREX) method that can handle data at various resolutions. We then assess the performance of tREX and a number of leading existing methods for recovering the underlying chromatin 3D structure. This was accomplished by creating in-silico data that mimic multiple levels of resolution and submitting the methods to a “stress test”. Finally, we applied tREX and the comparison methods to a Hi-C dataset for which FISH measurements are available to evaluate estimation accuracy. CONCLUSION: The proposed tREX method achieves consistently good performance in all 30 simulated settings considered. It is not only robust to resolution level and underlying parameters, but also insensitive to model misspecification. This conclusion is based on observations made in terms of 3D structure estimation accuracy and preservation of topologically associated domains. Application of the methods to the human lymphoblastoid cell line data on chromosomes 14 and 22 further substantiates the superior performance of tREX: the 3D structure constructed by tREX is consistent with the FISH measurements, and the corresponding distances predicted by tREX have a higher correlation with the FISH measurements than those of any of the comparison methods. SOFTWARE: An open-source R package is available at http://www.stat.osu.edu/~statgen/Software/tRex. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-0894-z) contains supplementary material, which is available to authorized users. BioMed Central 2016-02-06 /pmc/articles/PMC4744395/ /pubmed/26852142 http://dx.doi.org/10.1186/s12859-016-0894-z Text en © Park and Lin. 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Article Park, Jincheol Lin, Shili Impact of data resolution on three-dimensional structure inference methods |
title | Impact of data resolution on three-dimensional structure inference methods |
title_full | Impact of data resolution on three-dimensional structure inference methods |
title_fullStr | Impact of data resolution on three-dimensional structure inference methods |
title_full_unstemmed | Impact of data resolution on three-dimensional structure inference methods |
title_short | Impact of data resolution on three-dimensional structure inference methods |
title_sort | impact of data resolution on three-dimensional structure inference methods |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4744395/ https://www.ncbi.nlm.nih.gov/pubmed/26852142 http://dx.doi.org/10.1186/s12859-016-0894-z |
work_keys_str_mv | AT parkjincheol impactofdataresolutiononthreedimensionalstructureinferencemethods AT linshili impactofdataresolutiononthreedimensionalstructureinferencemethods |