
Benchmark study comparing liftover tools for genome conversion of epigenome sequencing data

As reference genome assemblies are updated there is a need to convert epigenome sequence data from older genome assemblies to newer versions, to facilitate data integration and visualization on the same coordinate system. Conversion can be done by re-alignment of the original sequence data to the ne...


Bibliographic Details
Main authors: Luu, Phuc-Loi, Ong, Phuc-Thinh, Dinh, Thanh-Phuoc, Clark, Susan J
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7671393/
https://www.ncbi.nlm.nih.gov/pubmed/33575605
http://dx.doi.org/10.1093/nargab/lqaa054
_version_ 1783610920467955712
author Luu, Phuc-Loi
Ong, Phuc-Thinh
Dinh, Thanh-Phuoc
Clark, Susan J
author_facet Luu, Phuc-Loi
Ong, Phuc-Thinh
Dinh, Thanh-Phuoc
Clark, Susan J
author_sort Luu, Phuc-Loi
collection PubMed
description As reference genome assemblies are updated, there is a need to convert epigenome sequence data from older genome assemblies to newer versions, to facilitate data integration and visualization on the same coordinate system. Conversion can be done by re-alignment of the original sequence data to the new assembly or by converting the coordinates of the data between assemblies using a mapping file, an approach referred to as ‘liftover’. Compared to re-alignment approaches, liftover is a more rapid and cost-effective solution. Here, we benchmark six liftover tools commonly used for conversion between genome assemblies by coordinates, including UCSC liftOver, rtracklayer::liftOver, CrossMap, NCBI Remap, flo and segment_liftover, to determine how they performed for whole genome bisulphite sequencing (WGBS) and ChIP-seq data. Our results show high correlation between the six tools for conversion of 43 WGBS paired samples. For the chromatin sequencing data, interval conversion of 366 ChIP-seq datasets showed that segment_liftover generates more reliable results than UCSC liftOver. However, we found that some regions do not always remain the same after liftover. To further increase the accuracy of liftover and avoid misleading results, we developed a three-step guideline that removes aberrant regions to ensure more robust genome conversion between reference assemblies.
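The coordinate-conversion idea in the description can be illustrated with a minimal sketch. It is not the method of any of the six benchmarked tools: the `Block` record, `build_index` and `lift_interval` are hypothetical names, and the mapping is a simplified stand-in for a real mapping file (ungapped, same-strand, non-overlapping aligned blocks with 0-based, half-open coordinates). An interval is converted only when it lies entirely inside one block; otherwise it is reported as unmapped, which shows in a simplified way why some regions cannot be carried over unchanged.

```python
from bisect import bisect_right
from typing import Dict, List, NamedTuple, Optional, Tuple


class Block(NamedTuple):
    """One ungapped aligned block between two assemblies (hypothetical format)."""
    src_chrom: str
    src_start: int  # 0-based, inclusive, on the old assembly
    src_end: int    # exclusive, on the old assembly
    dst_chrom: str
    dst_start: int  # start of the same block on the new assembly


def build_index(blocks: List[Block]) -> Dict[str, List[Block]]:
    """Group blocks by source chromosome, sorted by source start."""
    index: Dict[str, List[Block]] = {}
    for block in sorted(blocks, key=lambda b: (b.src_chrom, b.src_start)):
        index.setdefault(block.src_chrom, []).append(block)
    return index


def lift_interval(
    index: Dict[str, List[Block]], chrom: str, start: int, end: int
) -> Optional[Tuple[str, int, int]]:
    """Convert [start, end) to the new assembly, or return None if unmapped."""
    blocks = index.get(chrom, [])
    starts = [b.src_start for b in blocks]
    # Find the last block whose source start is <= the interval start.
    i = bisect_right(starts, start) - 1
    if i < 0:
        return None
    block = blocks[i]
    # Assumption: an interval that leaves its block is treated as unmapped.
    if end > block.src_end:
        return None
    offset = block.dst_start - block.src_start
    return (block.dst_chrom, start + offset, end + offset)


if __name__ == "__main__":
    # Toy mapping: the second block is shifted by 500 bp in the new assembly.
    toy_blocks = [
        Block("chr1", 0, 1000, "chr1", 0),
        Block("chr1", 1000, 2000, "chr1", 1500),
    ]
    toy_index = build_index(toy_blocks)
    print(lift_interval(toy_index, "chr1", 1200, 1300))
    print(lift_interval(toy_index, "chr1", 900, 1100))
```

In this toy mapping the first query falls inside the shifted block and is moved by 500 bp, while the second spans a block boundary and is returned as unmapped. How such boundary-spanning intervals are handled is a design choice of this sketch, not a description of any particular tool.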
format Online
Article
Text
id pubmed-7671393
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-76713932021-02-10 Benchmark study comparing liftover tools for genome conversion of epigenome sequencing data Luu, Phuc-Loi Ong, Phuc-Thinh Dinh, Thanh-Phuoc Clark, Susan J NAR Genom Bioinform Methods Article As reference genome assemblies are updated, there is a need to convert epigenome sequence data from older genome assemblies to newer versions, to facilitate data integration and visualization on the same coordinate system. Conversion can be done by re-alignment of the original sequence data to the new assembly or by converting the coordinates of the data between assemblies using a mapping file, an approach referred to as ‘liftover’. Compared to re-alignment approaches, liftover is a more rapid and cost-effective solution. Here, we benchmark six liftover tools commonly used for conversion between genome assemblies by coordinates, including UCSC liftOver, rtracklayer::liftOver, CrossMap, NCBI Remap, flo and segment_liftover, to determine how they performed for whole genome bisulphite sequencing (WGBS) and ChIP-seq data. Our results show high correlation between the six tools for conversion of 43 WGBS paired samples. For the chromatin sequencing data, interval conversion of 366 ChIP-seq datasets showed that segment_liftover generates more reliable results than UCSC liftOver. However, we found that some regions do not always remain the same after liftover. To further increase the accuracy of liftover and avoid misleading results, we developed a three-step guideline that removes aberrant regions to ensure more robust genome conversion between reference assemblies. Oxford University Press 2020-08-06 /pmc/articles/PMC7671393/ /pubmed/33575605 http://dx.doi.org/10.1093/nargab/lqaa054 Text en © The Author(s) 2019. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics.
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Methods Article
Luu, Phuc-Loi
Ong, Phuc-Thinh
Dinh, Thanh-Phuoc
Clark, Susan J
Benchmark study comparing liftover tools for genome conversion of epigenome sequencing data
title Benchmark study comparing liftover tools for genome conversion of epigenome sequencing data
title_full Benchmark study comparing liftover tools for genome conversion of epigenome sequencing data
title_fullStr Benchmark study comparing liftover tools for genome conversion of epigenome sequencing data
title_full_unstemmed Benchmark study comparing liftover tools for genome conversion of epigenome sequencing data
title_short Benchmark study comparing liftover tools for genome conversion of epigenome sequencing data
title_sort benchmark study comparing liftover tools for genome conversion of epigenome sequencing data
topic Methods Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7671393/
https://www.ncbi.nlm.nih.gov/pubmed/33575605
http://dx.doi.org/10.1093/nargab/lqaa054
work_keys_str_mv AT luuphucloi benchmarkstudycomparingliftovertoolsforgenomeconversionofepigenomesequencingdata
AT ongphucthinh benchmarkstudycomparingliftovertoolsforgenomeconversionofepigenomesequencingdata
AT dinhthanhphuoc benchmarkstudycomparingliftovertoolsforgenomeconversionofepigenomesequencingdata
AT clarksusanj benchmarkstudycomparingliftovertoolsforgenomeconversionofepigenomesequencingdata