Refinement of Draft Genome Assemblies of Pigeonpea (Cajanus cajan)

Bibliographic Details
Main Authors: Marla, Soma S., Mishra, Pallavi, Maurya, Ranjeet, Singh, Mohar, Wankhede, Dhammaprakash Pandhari, Kumar, Anil, Yadav, Mahesh C., Subbarao, N., Singh, Sanjeev K., Kumar, Rajesh
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2020
Subjects: Genetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7770131/
https://www.ncbi.nlm.nih.gov/pubmed/33384719
http://dx.doi.org/10.3389/fgene.2020.607432
author Marla, Soma S.
Mishra, Pallavi
Maurya, Ranjeet
Singh, Mohar
Wankhede, Dhammaprakash Pandhari
Kumar, Anil
Yadav, Mahesh C.
Subbarao, N.
Singh, Sanjeev K.
Kumar, Rajesh
author_facet Marla, Soma S.
Mishra, Pallavi
Maurya, Ranjeet
Singh, Mohar
Wankhede, Dhammaprakash Pandhari
Kumar, Anil
Yadav, Mahesh C.
Subbarao, N.
Singh, Sanjeev K.
Kumar, Rajesh
author_sort Marla, Soma S.
collection PubMed
description Genome assembly of short reads from large plant genomes remains a challenge in computational biology despite major developments in next-generation sequencing. Recently, several draft assemblies have been reported for sequenced plant genomes. The reported draft genome assemblies of Cajanus cajan have different levels of genome completeness and a large number of repeats, gaps, and segmental duplications. Draft assemblies with portions of the genome missing are shorter than the original reference genome. These assemblies have low map accuracy, which affects subsequent functional annotation and the prediction of gene components desired by crop researchers. Genome coverage, i.e., the number of sequenced raw reads mapped onto a certain location of the genome, is an important indicator of completeness and assembly quality in draft assemblies. The present work aimed to improve the coverage of the reported de novo sequenced draft genomes (GCA_000340665.1 and GCA_000230855.2) of pigeonpea, a legume widely cultivated in India. The two recently sequenced assemblies, A1 and A2, comprised 72% and 75% of the estimated coverage of the genome, respectively. We employed an assembly reconciliation approach to compare the draft assemblies and merge them, filling the gaps using an algorithm that sorts the mate-pair library by size, to generate a high-quality, near-complete assembly with enhanced contiguity. The majority of gaps present within scaffolds were filled with right-sized mate-pair reads. The improved assembly contained fewer gaps than the reported draft assemblies, resulting in an improved genome coverage of 82.4%. Map accuracy of the improved assembly was evaluated using various quality metrics and by the presence of specific trait-related functional genes. The paired-end and mate-pair local libraries employed helped reduce gaps, repeats, and other sequence errors, resulting in longer scaffolds than in the two draft assemblies. We also report the prediction of putative host resistance genes against Fusarium wilt disease and the evaluation of their performance under both wet-laboratory and field phenotypic conditions.
format Online
Article
Text
id pubmed-7770131
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-7770131 2020-12-30 Refinement of Draft Genome Assemblies of Pigeonpea (Cajanus cajan) Front Genet Genetics Frontiers Media S.A. 2020-12-15 /pmc/articles/PMC7770131/ /pubmed/33384719 http://dx.doi.org/10.3389/fgene.2020.607432 Text en Copyright © 2020 Marla, Mishra, Maurya, Singh, Wankhede, Kumar, Yadav, Subbarao, Singh and Kumar. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Genetics
Marla, Soma S.
Mishra, Pallavi
Maurya, Ranjeet
Singh, Mohar
Wankhede, Dhammaprakash Pandhari
Kumar, Anil
Yadav, Mahesh C.
Subbarao, N.
Singh, Sanjeev K.
Kumar, Rajesh
Refinement of Draft Genome Assemblies of Pigeonpea (Cajanus cajan)
title Refinement of Draft Genome Assemblies of Pigeonpea (Cajanus cajan)
title_full Refinement of Draft Genome Assemblies of Pigeonpea (Cajanus cajan)
title_fullStr Refinement of Draft Genome Assemblies of Pigeonpea (Cajanus cajan)
title_full_unstemmed Refinement of Draft Genome Assemblies of Pigeonpea (Cajanus cajan)
title_short Refinement of Draft Genome Assemblies of Pigeonpea (Cajanus cajan)
title_sort refinement of draft genome assemblies of pigeonpea (cajanus cajan)
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7770131/
https://www.ncbi.nlm.nih.gov/pubmed/33384719
http://dx.doi.org/10.3389/fgene.2020.607432