
Structural variation and its potential impact on genome instability: Novel discoveries in the EGFR landscape by long-read sequencing

Structural variation (SV) is typically defined as variation within the human genome that exceeds 50 base pairs (bp). SV may be copy number neutral or it may involve duplications, deletions, and complex rearrangements. Recent studies have shown SV to be associated with many human diseases. However, s...

Bibliographic Details
Main Authors: Cook, George W., Benton, Michael G., Akerley, Wallace, Mayhew, George F., Moehlenkamp, Cynthia, Raterman, Denise, Burgess, Daniel L., Rowell, William J., Lambert, Christine, Eng, Kevin, Gu, Jenny, Baybayan, Primo, Fussell, John T., Herbold, Heath D., O’Shea, John M., Varghese, Thomas K., Emerson, Lyska L.
Format: Online Article Text
Language: English
Published: Public Library of Science 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6961855/
https://www.ncbi.nlm.nih.gov/pubmed/31940362
http://dx.doi.org/10.1371/journal.pone.0226340
author Cook, George W.
Benton, Michael G.
Akerley, Wallace
Mayhew, George F.
Moehlenkamp, Cynthia
Raterman, Denise
Burgess, Daniel L.
Rowell, William J.
Lambert, Christine
Eng, Kevin
Gu, Jenny
Baybayan, Primo
Fussell, John T.
Herbold, Heath D.
O’Shea, John M.
Varghese, Thomas K.
Emerson, Lyska L.
author_sort Cook, George W.
collection PubMed
description Structural variation (SV) is typically defined as variation within the human genome that exceeds 50 base pairs (bp). SV may be copy number neutral or it may involve duplications, deletions, and complex rearrangements. Recent studies have shown SV to be associated with many human diseases. However, studies of SV have been challenging due to technological constraints. With the advent of third generation (long-read) sequencing technology, exploration of longer stretches of DNA not easily examined previously has been made possible. In the present study, we utilized third generation (long-read) sequencing techniques to examine SV in the EGFR landscape of four haplotypes derived from two human samples. We analyzed the EGFR gene and its landscape (+/- 500,000 base pairs) using this approach and were able to identify a region of non-coding DNA with over 90% similarity to the most common activating EGFR mutation in non-small cell lung cancer. Based on previously published Alu-element genome instability algorithms, we propose a molecular mechanism to explain how this non-coding region of DNA may be interacting with and impacting the stability of the EGFR gene and potentially generating this cancer-driver gene. By these techniques, we were also able to identify previously hidden structural variation in the four haplotypes and in the human reference genome (hg38). We applied previously published algorithms to compare the relative stabilities of these five different EGFR gene landscape haplotypes to estimate their relative potentials to generate the EGFR exon 19, 15 bp canonical deletion. To our knowledge, the present study is the first to use the differences in genomic architecture between targeted cancer-linked phased haplotypes to estimate their relative potentials to form a common cancer-linked driver mutation.
format Online
Article
Text
id pubmed-6961855
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-6961855 2020-01-26 PLoS One Research Article Public Library of Science 2020-01-15 /pmc/articles/PMC6961855/ /pubmed/31940362 http://dx.doi.org/10.1371/journal.pone.0226340 Text en © 2020 Cook et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Structural variation and its potential impact on genome instability: Novel discoveries in the EGFR landscape by long-read sequencing
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6961855/
https://www.ncbi.nlm.nih.gov/pubmed/31940362
http://dx.doi.org/10.1371/journal.pone.0226340