SwiftOrtho: A fast, memory-efficient, multiple genome orthology classifier
BACKGROUND: Gene homology type classification is required for many types of genome analyses, including comparative genomics, phylogenetics, and protein function annotation. Consequently, a large variety of tools have been developed to perform homology classification across genomes of different speci...
Main authors: | Hu, Xiao; Friedberg, Iddo |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6812468/ https://www.ncbi.nlm.nih.gov/pubmed/31648300 http://dx.doi.org/10.1093/gigascience/giz118 |
_version_ | 1783462667477843968 |
---|---|
author | Hu, Xiao Friedberg, Iddo |
author_sort | Hu, Xiao |
collection | PubMed |
description | BACKGROUND: Gene homology type classification is required for many types of genome analyses, including comparative genomics, phylogenetics, and protein function annotation. Consequently, a large variety of tools have been developed to perform homology classification across genomes of different species. However, when applied to large genomic data sets, these tools require high memory and CPU usage, typically available only in computational clusters. FINDINGS: Here we present a new graph-based orthology analysis tool, SwiftOrtho, which is optimized for speed and memory usage when applied to large-scale data. SwiftOrtho uses long k-mers to speed up homology search, while using a reduced amino acid alphabet and spaced seeds to compensate for the loss of sensitivity due to long k-mers. In addition, it uses an affinity propagation algorithm to reduce the memory usage when clustering large-scale orthology relationships into orthologous groups. In our tests, SwiftOrtho was the only tool that completed orthology analysis of proteins from 1,760 bacterial genomes on a computer with only 4 GB RAM. Using various standard orthology data sets, we also show that SwiftOrtho has a high accuracy. CONCLUSIONS: SwiftOrtho enables the accurate comparative genomic analyses of thousands of genomes using low-memory computers. SwiftOrtho is available at https://github.com/Rinoahu/SwiftOrtho |
format | Online Article Text |
id | pubmed-6812468 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-6812468 2019-10-28 SwiftOrtho: A fast, memory-efficient, multiple genome orthology classifier Hu, Xiao Friedberg, Iddo Gigascience Technical Note Oxford University Press 2019-10-24 /pmc/articles/PMC6812468/ /pubmed/31648300 http://dx.doi.org/10.1093/gigascience/giz118 Text en © The Author(s) 2019. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Technical Note Hu, Xiao Friedberg, Iddo SwiftOrtho: A fast, memory-efficient, multiple genome orthology classifier |
title | SwiftOrtho: A fast, memory-efficient, multiple genome orthology classifier |
topic | Technical Note |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6812468/ https://www.ncbi.nlm.nih.gov/pubmed/31648300 http://dx.doi.org/10.1093/gigascience/giz118 |
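
The abstract in this record says that SwiftOrtho pairs long k-mers with a reduced amino acid alphabet and spaced seeds to recover sensitivity. The Python sketch below illustrates that general idea only. The residue grouping in `GROUPS`, the seed pattern `1101011`, and the function names are all hypothetical choices made for illustration and are not taken from SwiftOrtho's source code.

```python
# Illustrative sketch only: the grouping and seed pattern are assumptions,
# not the parameters used by SwiftOrtho.

# Hypothetical reduced alphabet: residues in one group collapse to the
# group's first letter, so similar residues compare as equal.
GROUPS = ["AST", "CFILMVY", "DN", "EQ", "G", "H", "KR", "P", "W"]
REDUCE = {aa: group[0] for group in GROUPS for aa in group}


def reduce_sequence(seq):
    """Map each residue to its group representative; unknown residues become 'X'."""
    return "".join(REDUCE.get(aa, "X") for aa in seq.upper())


def spaced_seeds(seq, pattern="1101011"):
    """Yield (start, seed) pairs over the reduced sequence.

    Only window positions where `pattern` has '1' contribute to the seed;
    positions marked '0' are ignored, so a mismatch there is tolerated.
    """
    keep = [i for i, flag in enumerate(pattern) if flag == "1"]
    reduced = reduce_sequence(seq)
    for start in range(len(reduced) - len(pattern) + 1):
        window = reduced[start:start + len(pattern)]
        yield start, "".join(window[i] for i in keep)


def shared_seeds(a, b, pattern="1101011"):
    """Return the set of seeds that occur in both sequences."""
    seeds_a = {seed for _, seed in spaced_seeds(a, pattern)}
    seeds_b = {seed for _, seed in spaced_seeds(b, pattern)}
    return seeds_a & seeds_b
```

Under these assumptions, two peptides that differ at every position in the full 20-letter alphabet, such as `MKVLAAG` and `LRIITSG`, can still share a seed, because each substitution stays within a group. A real seed-and-extend search would use such shared seeds only as candidate anchors for a full alignment.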