Next-generation cell line selection methodology leveraging data lakes, natural language generation and advanced data analytics

Cell line development is an essential stage in biopharmaceutical development that often lies on the critical path. Failure to fully characterise the lead clone during initial screening can lead to lengthy project delays during scale-up, which can potentially compromise commercial manufacturing succe...

Bibliographic Details
Main authors: Goldrick, Stephen; Alosert, Haneen; Lovelady, Clare; Bond, Nicholas J.; Senussi, Tarik; Hatton, Diane; Klein, John; Cheeks, Matthew; Turner, Richard; Savery, James; Farid, Suzanne S.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10277482/
https://www.ncbi.nlm.nih.gov/pubmed/37342509
http://dx.doi.org/10.3389/fbioe.2023.1160223
author Goldrick, Stephen
Alosert, Haneen
Lovelady, Clare
Bond, Nicholas J.
Senussi, Tarik
Hatton, Diane
Klein, John
Cheeks, Matthew
Turner, Richard
Savery, James
Farid, Suzanne S.
collection PubMed
description Cell line development is an essential stage in biopharmaceutical development that often lies on the critical path. Failure to fully characterise the lead clone during initial screening can lead to lengthy project delays during scale-up, which can potentially compromise commercial manufacturing success. In this study, we propose a novel cell line development methodology, referred to as CLD(4), which involves four steps enabling autonomous data-driven selection of the lead clone. The first step involves the digitalisation of the process and storage of all available information within a structured data lake. The second step calculates a new metric, referred to as the cell line manufacturability index (MI(CL)), which quantifies the performance of each clone by considering the selection criteria relevant to productivity, growth and product quality. The third step implements machine learning (ML) to identify any potential risks associated with process operation and relevant critical quality attributes (CQAs). The final step of CLD(4) takes into account the available metadata and summarises all relevant statistics generated in steps 1–3 in an automated report utilising a natural language generation (NLG) algorithm. The CLD(4) methodology was implemented to select the lead clone of a recombinant Chinese hamster ovary (CHO) cell line producing high levels of an antibody-peptide fusion with a known product quality issue related to end-point trisulfide bond (TSB) concentration. CLD(4) identified sub-optimal process conditions leading to increased levels of trisulfide bond that would not be identified through conventional cell line development methodologies. CLD(4) embodies the core principles of Industry 4.0 and demonstrates the benefits of increased digitalisation, data lake integration, predictive analytics and autonomous report generation to enable more informed decision-making.
format Online
Article
Text
id pubmed-10277482
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-10277482 2023-06-20 Next-generation cell line selection methodology leveraging data lakes, natural language generation and advanced data analytics. Goldrick, Stephen; Alosert, Haneen; Lovelady, Clare; Bond, Nicholas J.; Senussi, Tarik; Hatton, Diane; Klein, John; Cheeks, Matthew; Turner, Richard; Savery, James; Farid, Suzanne S. Front Bioeng Biotechnol. Bioengineering and Biotechnology. Frontiers Media S.A. 2023-06-05 /pmc/articles/PMC10277482/ /pubmed/37342509 http://dx.doi.org/10.3389/fbioe.2023.1160223 Text en. Copyright © 2023 Goldrick, Alosert, Lovelady, Bond, Senussi, Hatton, Klein, Cheeks, Turner, Savery and Farid. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Next-generation cell line selection methodology leveraging data lakes, natural language generation and advanced data analytics
topic Bioengineering and Biotechnology