Benchmark study comparing liftover tools for genome conversion of epigenome sequencing data
As reference genome assemblies are updated there is a need to convert epigenome sequence data from older genome assemblies to newer versions, to facilitate data integration and visualization on the same coordinate system. Conversion can be done by re-alignment of the original sequence data to the ne...
Main authors: | Luu, Phuc-Loi; Ong, Phuc-Thinh; Dinh, Thanh-Phuoc; Clark, Susan J |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7671393/ https://www.ncbi.nlm.nih.gov/pubmed/33575605 http://dx.doi.org/10.1093/nargab/lqaa054 |
_version_ | 1783610920467955712 |
---|---|
author | Luu, Phuc-Loi Ong, Phuc-Thinh Dinh, Thanh-Phuoc Clark, Susan J |
author_facet | Luu, Phuc-Loi Ong, Phuc-Thinh Dinh, Thanh-Phuoc Clark, Susan J |
author_sort | Luu, Phuc-Loi |
collection | PubMed |
description | As reference genome assemblies are updated there is a need to convert epigenome sequence data from older genome assemblies to newer versions, to facilitate data integration and visualization on the same coordinate system. Conversion can be done by re-alignment of the original sequence data to the new assembly or by converting the coordinates of the data between assemblies using a mapping file, an approach referred to as ‘liftover’. Compared to re-alignment approaches, liftover is a more rapid and cost-effective solution. Here, we benchmark six liftover tools commonly used for conversion between genome assemblies by coordinates, including UCSC liftOver, rtracklayer::liftOver, CrossMap, NCBI Remap, flo and segment_liftover, to determine how they performed for whole-genome bisulphite sequencing (WGBS) and ChIP-seq data. Our results show high correlation between the six tools for conversion of 43 WGBS paired samples. For the chromatin sequencing data, interval conversion of 366 ChIP-seq datasets showed that segment_liftover generates more reliable results than UCSC liftOver. However, we found some regions do not always remain the same after liftover. To further increase the accuracy of liftover and avoid misleading results, we developed a three-step guideline that removes aberrant regions to ensure more robust genome conversion between reference assemblies. |
format | Online Article Text |
id | pubmed-7671393 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-76713932021-02-10 Benchmark study comparing liftover tools for genome conversion of epigenome sequencing data Luu, Phuc-Loi Ong, Phuc-Thinh Dinh, Thanh-Phuoc Clark, Susan J NAR Genom Bioinform Methods Article As reference genome assemblies are updated there is a need to convert epigenome sequence data from older genome assemblies to newer versions, to facilitate data integration and visualization on the same coordinate system. Conversion can be done by re-alignment of the original sequence data to the new assembly or by converting the coordinates of the data between assemblies using a mapping file, an approach referred to as ‘liftover’. Compared to re-alignment approaches, liftover is a more rapid and cost-effective solution. Here, we benchmark six liftover tools commonly used for conversion between genome assemblies by coordinates, including UCSC liftOver, rtracklayer::liftOver, CrossMap, NCBI Remap, flo and segment_liftover, to determine how they performed for whole-genome bisulphite sequencing (WGBS) and ChIP-seq data. Our results show high correlation between the six tools for conversion of 43 WGBS paired samples. For the chromatin sequencing data, interval conversion of 366 ChIP-seq datasets showed that segment_liftover generates more reliable results than UCSC liftOver. However, we found some regions do not always remain the same after liftover. To further increase the accuracy of liftover and avoid misleading results, we developed a three-step guideline that removes aberrant regions to ensure more robust genome conversion between reference assemblies. Oxford University Press 2020-08-06 /pmc/articles/PMC7671393/ /pubmed/33575605 http://dx.doi.org/10.1093/nargab/lqaa054 Text en © The Author(s) 2019. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics. 
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methods Article Luu, Phuc-Loi Ong, Phuc-Thinh Dinh, Thanh-Phuoc Clark, Susan J Benchmark study comparing liftover tools for genome conversion of epigenome sequencing data |
title | Benchmark study comparing liftover tools for genome conversion of epigenome sequencing data |
title_full | Benchmark study comparing liftover tools for genome conversion of epigenome sequencing data |
title_fullStr | Benchmark study comparing liftover tools for genome conversion of epigenome sequencing data |
title_full_unstemmed | Benchmark study comparing liftover tools for genome conversion of epigenome sequencing data |
title_short | Benchmark study comparing liftover tools for genome conversion of epigenome sequencing data |
title_sort | benchmark study comparing liftover tools for genome conversion of epigenome sequencing data |
topic | Methods Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7671393/ https://www.ncbi.nlm.nih.gov/pubmed/33575605 http://dx.doi.org/10.1093/nargab/lqaa054 |
work_keys_str_mv | AT luuphucloi benchmarkstudycomparingliftovertoolsforgenomeconversionofepigenomesequencingdata AT ongphucthinh benchmarkstudycomparingliftovertoolsforgenomeconversionofepigenomesequencingdata AT dinhthanhphuoc benchmarkstudycomparingliftovertoolsforgenomeconversionofepigenomesequencingdata AT clarksusanj benchmarkstudycomparingliftovertoolsforgenomeconversionofepigenomesequencingdata |
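
The description above refers to converting coordinates between assemblies with a mapping file. As a rough illustration only, the sketch below shows the core idea for a single interval, assuming a heavily simplified mapping made of ungapped aligned blocks. The `Block` type, the `lift_interval` function, and the example coordinates are all hypothetical and are not taken from the record or from any of the benchmarked tools; real mapping files also carry strand, chromosome changes, and gaps, which this sketch ignores.

```python
from typing import List, NamedTuple, Optional, Tuple


class Block(NamedTuple):
    """One ungapped aligned block on a single chromosome (hypothetical, simplified)."""
    chrom: str
    old_start: int  # 0-based, inclusive, in the old assembly
    old_end: int    # 0-based, exclusive, in the old assembly
    new_start: int  # start of the same block in the new assembly


def lift_interval(
    chrom: str, start: int, end: int, blocks: List[Block]
) -> Optional[Tuple[str, int, int]]:
    """Shift an interval into the new assembly if one block fully contains it.

    Returns None when no single block covers the interval, which is the
    simplest way to flag regions that cannot be converted unambiguously.
    """
    for b in blocks:
        if b.chrom == chrom and b.old_start <= start and end <= b.old_end:
            offset = b.new_start - b.old_start
            return (chrom, start + offset, end + offset)
    return None


# Made-up mapping: two blocks on chr1 with different offsets.
BLOCKS = [
    Block("chr1", 0, 1000, 500),
    Block("chr1", 2000, 3000, 1600),
]

if __name__ == "__main__":
    print(lift_interval("chr1", 100, 200, BLOCKS))   # inside the first block
    print(lift_interval("chr1", 900, 1100, BLOCKS))  # crosses a block boundary
```

An interval that crosses a block boundary is reported as unconverted here rather than split. Whether to split, merge, or drop such intervals is a design choice on which tools may differ.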