Comparison of De Novo Transcriptome Assemblers and k-mer Strategies Using the Killifish, Fundulus heteroclitus

BACKGROUND: De novo assembly of non-model organisms’ transcriptomes has recently been on the rise, in concert with the number of de novo transcriptome assembly software programs. There is a knowledge gap as to which assembler software or k-mer strategy is best for construction of an optimal de novo as...

Bibliographic Details
Main authors:	Rana, Satshil B.; Zadlock, Frank J.; Zhang, Ziping; Murphy, Wyatt R.; Bentivegna, Carolyn S.
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2016
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4824410/ https://www.ncbi.nlm.nih.gov/pubmed/27054874 http://dx.doi.org/10.1371/journal.pone.0153104

author	Rana, Satshil B.; Zadlock, Frank J.; Zhang, Ziping; Murphy, Wyatt R.; Bentivegna, Carolyn S.
collection	PubMed
description	BACKGROUND: De novo assembly of non-model organisms’ transcriptomes has recently been on the rise, in concert with the number of de novo transcriptome assembly software programs. There is a knowledge gap as to which assembler software or k-mer strategy is best for constructing an optimal de novo assembly. Additionally, there is a lack of consensus on which evaluation metrics should be used to assess the quality of de novo transcriptome assemblies. RESULT: Six different assembly strategies were evaluated using four different assemblers. Trinity was run with its default single k-mer value of 25, while Bridger, Oases, and SOAPdenovo-Trans were run with multiple k-mer strategies. Bridger, Oases, and SOAPdenovo-Trans used a small multiple k-mer (SMK) strategy consisting of k-mer lengths of 21, 25, 27, 29, 31, and 33. Additionally, Oases and SOAPdenovo-Trans were run using a large multiple k-mer (LMK) strategy consisting of k-mer lengths of 25, 35, 45, 55, 65, 75, and 85. Eleven metrics were used to evaluate each assembly strategy, including three genome-related evaluation metrics (contig number, N50 length, and contigs >1 kb) and eight transcriptome evaluation metrics (reads mapped back to transcripts (RMBT), number of full-length transcripts, number of open reading frames, Detonate RSEM-EVAL score, and percent alignment to the southern platyfish, Amazon molly, BUSCO, and CEGMA databases). Strategies were ranked by how many of the eleven metrics placed them within the top three: the Bridger assembly performed best (10 of 11), followed by the Oases SMK assembly (8 of 11), the Oases LMK assembly (6 of 11), the Trinity assembly (4 of 11), the SOAP LMK assembly (4 of 11), and the SOAP SMK assembly (3 of 11). CONCLUSION: This study provides an in-depth investigation of multiple k-mer strategies and concludes that the assembler itself had a greater impact than k-mer size, regardless of the strategy employed. Additionally, the comprehensive set of transcriptome evaluation metrics used in this study highlighted the need to choose metrics according to user-defined research goals. Based on the evaluation metrics performed, the Bridger assembly was able to construct the best assembly of the testis transcriptome in Fundulus heteroclitus.
format	Online Article Text
id	pubmed-4824410
institution	National Center for Biotechnology Information
language	English
publishDate	2016
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-4824410 2016-04-22 Comparison of De Novo Transcriptome Assemblers and k-mer Strategies Using the Killifish, Fundulus heteroclitus Rana, Satshil B. Zadlock, Frank J. Zhang, Ziping Murphy, Wyatt R. Bentivegna, Carolyn S. PLoS One Research Article Public Library of Science 2016-04-07 /pmc/articles/PMC4824410/ /pubmed/27054874 http://dx.doi.org/10.1371/journal.pone.0153104 Text en © 2016 Rana et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title	Comparison of De Novo Transcriptome Assemblers and k-mer Strategies Using the Killifish, Fundulus heteroclitus
topic	Research Article
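
The description in this record lists contig number, N50 length, and contigs >1 kb as its three genome-related metrics. As a rough illustration only, and not the authors' pipeline, the sketch below computes these three values from a list of contig lengths. It assumes the common definition of N50: the length of the contig at which the running total, taken from longest to shortest, first reaches half of the total assembled bases. The function name `genome_style_metrics` and the toy lengths are hypothetical.

```python
from typing import Dict, List


def genome_style_metrics(contig_lengths: List[int], min_long: int = 1000) -> Dict[str, int]:
    """Return contig number, N50 length, and the count of contigs longer than min_long.

    Hypothetical helper for illustration; not taken from the study's code.
    """
    if not contig_lengths:
        return {"contig_number": 0, "n50": 0, "contigs_over_1kb": 0}

    # Walk the contigs from longest to shortest, accumulating bases.
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    n50 = 0
    for length in ordered:
        running += length
        if running >= half_total:
            # First contig at which the running total covers half of all bases.
            n50 = length
            break

    return {
        "contig_number": len(ordered),
        "n50": n50,
        "contigs_over_1kb": sum(1 for n in ordered if n > min_long),
    }


# Toy example with made-up lengths (total 4000 bases, half is 2000).
print(genome_style_metrics([200, 400, 600, 1200, 1600]))
# {'contig_number': 5, 'n50': 1200, 'contigs_over_1kb': 2}
```

In practice the lengths would be read from an assembler's FASTA output. These length-based metrics say nothing about correctness, which is consistent with the record's point that transcriptome-specific metrics are also needed.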
