De novo sequence assembly requires bioinformatic checking of chimeric sequences

De novo assembly of sequence reads from next generation sequencing platforms is a common strategy for detecting presence and sequencing of viruses in biospecimens. Amplification artifacts and presence of several related viruses in the same specimen can lead to assembly of erroneous, chimeric sequenc...

Bibliographic Details
Main Authors: Arroyo Mühr, Laila Sara, Lagheden, Camilla, Hassan, Sadaf Sakina, Kleppe, Sara Nordqvist, Hultin, Emilie, Dillner, Joakim
Format: Online Article Text
Language: English
Published: Public Library of Science 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7417191/
https://www.ncbi.nlm.nih.gov/pubmed/32777809
http://dx.doi.org/10.1371/journal.pone.0237455
author Arroyo Mühr, Laila Sara
Lagheden, Camilla
Hassan, Sadaf Sakina
Kleppe, Sara Nordqvist
Hultin, Emilie
Dillner, Joakim
author_sort Arroyo Mühr, Laila Sara
collection PubMed
description De novo assembly of sequence reads from next generation sequencing platforms is a common strategy for detecting presence and sequencing of viruses in biospecimens. Amplification artifacts and presence of several related viruses in the same specimen can lead to assembly of erroneous, chimeric sequences. We now report that such chimeras can also occur between viral and non-viral biological sequences incorrectly joined together which may cause erroneous detection of viruses, highlighting the importance of performing a chimera checking step in bioinformatics pipelines. Using Illumina NextSeq and metagenomic sequencing, we analyzed 80 consecutive non-melanoma skin cancers (NMSCs) from 11 immunosuppressed patients together with 11 NMSCs from patients who had only developed 1 NMSC. We aligned high-quality reads against a Human Papillomavirus (HPV) database and found HPV sequences in 9/91 specimens. A previous bioinformatic analysis of the same crude sequencing data from some of these samples had found an additional 3 specimens to be HPV-positive after performing de novo assembly. The reason for the discrepancy was investigated and found to be mostly caused by chimeric sequences containing both viral and non-viral sequences. Non-viral sequences were present in these 3 samples. To avoid erroneous detection of HPV when performing sequencing, we thus developed a novel script to identify HPV chimeric sequences.
format Online
Article
Text
id pubmed-7417191
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-7417191 2020-08-19 De novo sequence assembly requires bioinformatic checking of chimeric sequences Arroyo Mühr, Laila Sara Lagheden, Camilla Hassan, Sadaf Sakina Kleppe, Sara Nordqvist Hultin, Emilie Dillner, Joakim PLoS One Research Article
Public Library of Science 2020-08-10 /pmc/articles/PMC7417191/ /pubmed/32777809 http://dx.doi.org/10.1371/journal.pone.0237455 Text en © 2020 Arroyo Mühr et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title De novo sequence assembly requires bioinformatic checking of chimeric sequences
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7417191/
https://www.ncbi.nlm.nih.gov/pubmed/32777809
http://dx.doi.org/10.1371/journal.pone.0237455