Application of a superword array in genome assembly
We introduce a data structure called a superword array for quickly finding matches between DNA sequences. The superword array possesses some desirable features of the lookup table and suffix array. We describe simple algorithms for constructing and using a superword array to find pairs of sequences...
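The abstract names the idea but this record does not reproduce the paper's definitions, so the following is only a minimal illustrative sketch, not the PCAP.REP implementation. It assumes a superword is a tuple of consecutive fixed-length words taken from a read, that the array is the list of all superword occurrences sorted lexicographically, and that a superword counts as "unique" when it occurs in exactly two distinct reads. The function names and the parameters `word_len` and `words_per_superword` are hypothetical.

```python
from itertools import groupby


def build_superword_array(reads, word_len=3, words_per_superword=2):
    """Return all (superword, read_index, offset) entries, sorted by superword.

    Assumption: a superword is a tuple of consecutive words of length
    word_len; the paper's actual definition may differ.
    """
    span = word_len * words_per_superword
    entries = []
    for read_idx, read in enumerate(reads):
        for offset in range(len(read) - span + 1):
            superword = tuple(
                read[offset + i * word_len: offset + (i + 1) * word_len]
                for i in range(words_per_superword)
            )
            entries.append((superword, read_idx, offset))
    # Sorting places equal superwords next to each other, as in a suffix array.
    entries.sort()
    return entries


def pairs_sharing_unique_superword(entries):
    """Report read pairs that share a superword found in exactly two reads.

    Assumption: "unique" means the superword occurs in exactly two
    distinct reads; this threshold is illustrative only.
    """
    pairs = set()
    for _, group in groupby(entries, key=lambda entry: entry[0]):
        read_ids = sorted({read_idx for _, read_idx, _ in group})
        if len(read_ids) == 2:
            pairs.add((read_ids[0], read_ids[1]))
    return sorted(pairs)


if __name__ == "__main__":
    reads = ["ACGTACGGA", "TACGGATTC", "GGGCCCAAA"]
    array = build_superword_array(reads)
    print(pairs_sharing_unique_superword(array))  # [(0, 1)]
```

In this sketch the sort provides the suffix-array-like property (equal keys are adjacent, so one linear scan finds all matches), while the fixed-length words give the lookup-table-like property of comparing short, uniform keys.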
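Main Authors: | Huang, Xiaoqiu; Yang, Shiaw-Pyng; Chinwalla, Asif T.; Hillier, LaDeana W.; Minx, Patrick; Mardis, Elaine R.; Wilson, Richard K. |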
---|---|
Format: | Text |
Language: | English |
Published: | Oxford University Press 2006 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1325203/ https://www.ncbi.nlm.nih.gov/pubmed/16397298 http://dx.doi.org/10.1093/nar/gkj419 |
_version_ | 1782126470739525632 |
---|---|
author | Huang, Xiaoqiu Yang, Shiaw-Pyng Chinwalla, Asif T. Hillier, LaDeana W. Minx, Patrick Mardis, Elaine R. Wilson, Richard K. |
author_sort | Huang, Xiaoqiu |
collection | PubMed |
description | We introduce a data structure called a superword array for quickly finding matches between DNA sequences. The superword array possesses some desirable features of the lookup table and suffix array. We describe simple algorithms for constructing and using a superword array to find pairs of sequences that share a unique superword. The algorithms are implemented in a genome assembly program called PCAP.REP for computation of overlaps between reads. Experimental results produced by PCAP.REP and PCAP on a whole-genome dataset show that PCAP.REP produced a more accurate and contiguous assembly than PCAP. |
format | Text |
id | pubmed-1325203 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2006 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-1325203 2006-01-10 Application of a superword array in genome assembly Huang, Xiaoqiu Yang, Shiaw-Pyng Chinwalla, Asif T. Hillier, LaDeana W. Minx, Patrick Mardis, Elaine R. Wilson, Richard K. Nucleic Acids Res Article We introduce a data structure called a superword array for quickly finding matches between DNA sequences. The superword array possesses some desirable features of the lookup table and suffix array. We describe simple algorithms for constructing and using a superword array to find pairs of sequences that share a unique superword. The algorithms are implemented in a genome assembly program called PCAP.REP for computation of overlaps between reads. Experimental results produced by PCAP.REP and PCAP on a whole-genome dataset show that PCAP.REP produced a more accurate and contiguous assembly than PCAP. Oxford University Press 2006 2006-01-05 /pmc/articles/PMC1325203/ /pubmed/16397298 http://dx.doi.org/10.1093/nar/gkj419 Text en © The Author 2006. Published by Oxford University Press. All rights reserved |
title | Application of a superword array in genome assembly |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1325203/ https://www.ncbi.nlm.nih.gov/pubmed/16397298 http://dx.doi.org/10.1093/nar/gkj419 |