Clustering and curation of electropherograms: an efficient method for analyzing large cohorts of capillary electrophoresis glycomic profiles for bioprocessing operations

The accurate assessment of antibody glycosylation during bioprocessing requires the high-throughput generation of large amounts of glycomics data. This allows bioprocess engineers to identify critical process parameters that control the glycosylation critical quality attributes. The advances made in...

Bibliographic Details
Main authors: Walsh, Ian, Choo, Matthew S F, Chiin, Sim Lyn, Mak, Amelia, Tay, Shi Jie, Rudd, Pauline M, Yuansheng, Yang, Choo, Andre, Swan, Ho Ying, Nguyen-Khuong, Terry
Format: Online Article Text
Language: English
Published: Beilstein-Institut 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7476600/
https://www.ncbi.nlm.nih.gov/pubmed/32952725
http://dx.doi.org/10.3762/bjoc.16.176
author Walsh, Ian
Choo, Matthew S F
Chiin, Sim Lyn
Mak, Amelia
Tay, Shi Jie
Rudd, Pauline M
Yuansheng, Yang
Choo, Andre
Swan, Ho Ying
Nguyen-Khuong, Terry
collection PubMed
description The accurate assessment of antibody glycosylation during bioprocessing requires the high-throughput generation of large amounts of glycomics data. This allows bioprocess engineers to identify critical process parameters that control the glycosylation critical quality attributes. The advances made in protocols for capillary electrophoresis-laser-induced fluorescence (CE-LIF) measurements of antibody N-glycans have increased the potential for generating large datasets of N-glycosylation values for assessment. With large cohorts of CE-LIF data, peak picking and peak area calculations still remain a problem for fast and accurate quantitation, despite the presence of internal and external standards to reduce misalignment for the qualitative analysis. The peak picking and area calculation problems are often due to fluctuations introduced by varying process conditions resulting in heterogeneous peak shapes. Additionally, peaks with co-eluting glycans can produce peaks of a non-Gaussian nature in some process conditions and not in others. Here, we describe an approach to quantitatively and qualitatively curate large cohort CE-LIF glycomics data. For glycan identification, a previously reported method based on internal triple standards is used. For determining the glycan relative quantities our method uses a clustering algorithm to ‘divide and conquer’ highly heterogeneous electropherograms into similar groups, making it easier to define peaks manually. Open-source software is then used to determine peak areas of the manually defined peaks. We successfully applied this semi-automated method to a dataset (containing 391 glycoprofiles) of monoclonal antibody biosimilars from a bioreactor optimization study. The key advantage of this computational approach is that all runs can be analyzed simultaneously with high accuracy in glycan identification and quantitation and there is no theoretical limit to the scale of this method.
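The divide-and-conquer idea in the description can be illustrated with a minimal sketch. This is not the authors' implementation: the record names neither the clustering algorithm nor the open-source software, so the greedy correlation-based grouping, the trapezoidal integration, the 0.9 threshold, the toy traces, and all function names below are assumptions chosen only for illustration.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equal-length intensity traces."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat trace carries no shape information
    return cov / sqrt(var_a * var_b)


def cluster_traces(traces, min_corr=0.9):
    """Greedy grouping (assumed stand-in for the paper's clustering step).

    Each trace joins the first cluster whose first member it correlates
    with at >= min_corr; otherwise it starts a new cluster.
    """
    clusters = []  # each cluster is a list of trace indices
    for i, trace in enumerate(traces):
        for members in clusters:
            if pearson(traces[members[0]], trace) >= min_corr:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def relative_areas(trace, windows):
    """Trapezoidal area inside each manually defined (start, end) index
    window, returned as fractions of the summed window areas."""
    areas = []
    for start, end in windows:
        seg = trace[start:end + 1]
        areas.append(sum((seg[k] + seg[k + 1]) / 2 for k in range(len(seg) - 1)))
    total = sum(areas)
    return [a / total for a in areas] if total else areas


# Toy electropherograms: the first two share a shape, the third differs.
traces = [
    [0, 1, 4, 1, 0, 0, 2, 0],
    [0, 2, 8, 2, 0, 0, 4, 0],
    [0, 0, 1, 0, 0, 3, 6, 3],
]
groups = cluster_traces(traces)

# Peak windows would be drawn by hand once per cluster; these are made up.
windows = [(1, 3), (5, 7)]
fractions = relative_areas(traces[0], windows)
```

Grouping similar traces first means one set of hand-drawn peak windows can be reused for every run in a cluster, which is the practical point of the divide-and-conquer step; the choice of similarity measure and threshold here is arbitrary and would need tuning on real data.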
format Online
Article
Text
id pubmed-7476600
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Beilstein-Institut
record_format MEDLINE/PubMed
spelling pubmed-7476600 2020-09-18 Beilstein J Org Chem Full Research Paper Beilstein-Institut 2020-08-27 /pmc/articles/PMC7476600/ /pubmed/32952725 http://dx.doi.org/10.3762/bjoc.16.176 Text en
Copyright © 2020, Walsh et al. This is an Open Access article under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0). Please note that the reuse, redistribution and reproduction in particular requires that the authors and source are credited. The license is subject to the Beilstein Journal of Organic Chemistry terms and conditions (https://www.beilstein-journals.org/bjoc/terms).
title Clustering and curation of electropherograms: an efficient method for analyzing large cohorts of capillary electrophoresis glycomic profiles for bioprocessing operations
topic Full Research Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7476600/
https://www.ncbi.nlm.nih.gov/pubmed/32952725
http://dx.doi.org/10.3762/bjoc.16.176