
CeL-ID: cell line identification using RNA-seq data

BACKGROUND: Cell lines form the cornerstone of cell-based experimentation studies into understanding the underlying mechanisms of normal and disease biology including cancer. However, it is commonly acknowledged that contamination of cell lines is a prevalent problem affecting biomedical science and...


Bibliographic Details
Main authors: Mohammad, Tabrez A., Tsai, Yun S., Ameer, Safwa, Chen, Hung-I Harry, Chiu, Yu-Chiao, Chen, Yidong
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6360649/
https://www.ncbi.nlm.nih.gov/pubmed/30712511
http://dx.doi.org/10.1186/s12864-018-5371-9
author Mohammad, Tabrez A.
Tsai, Yun S.
Ameer, Safwa
Chen, Hung-I Harry
Chiu, Yu-Chiao
Chen, Yidong
collection PubMed
description BACKGROUND: Cell lines are a cornerstone of cell-based experimental studies aimed at understanding the underlying mechanisms of normal and disease biology, including cancer. However, it is commonly acknowledged that contamination of cell lines is a prevalent problem in biomedical science, and available methods for cell line authentication suffer from limited access and are too daunting and time-consuming for many researchers. Therefore, a new and cost-effective approach for authentication and quality control of cell lines is needed. RESULTS: We have developed a new RNA-seq-based approach named CeL-ID for cell line authentication. CeL-ID uses RNA-seq data to identify variants and compares them with the variant profiles of other cell lines. RNA-seq data for 934 CCLE cell lines downloaded from NCI GDC were used to generate cell-line-specific variant profiles, and pairwise correlations were calculated using the frequencies and depth-of-coverage values of all the variants. Comparative analysis revealed that variant profiles differ significantly from cell line to cell line, whereas identical, synonymous and derivative cell lines share high variant identity and are highly correlated (ρ > 0.9). Our benchmarking studies revealed that the CeL-ID method can identify a cell line with high accuracy and can be a valuable tool for cell line authentication in biomedical science. Finally, CeL-ID estimates possible cross-contamination using a linear mixture model if no perfect match is detected. CONCLUSIONS: In this study, we show the utility of an RNA-seq-based approach for cell line authentication.
Our comparative analysis of variant profiles derived from RNA-seq data revealed that the variant profile of each cell line is distinct and overall shares low variant identity with other cell lines, whereas identical or synonymous cell lines show significantly high variant identity; hence, variant profiles can be used as a discriminatory/identifying feature in a cell line authentication model. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12864-018-5371-9) contains supplementary material, which is available to authorized users.
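The matching and contamination steps summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of Pearson correlation on variant allele frequencies alone (the abstract also mentions depth of coverage, which is ignored here), and the two-component form of the mixture model are all assumptions made for illustration. Only the ρ > 0.9 cutoff is taken from the abstract.

```python
import numpy as np


def profile_correlation(query, reference):
    """Pearson correlation between two variant-frequency vectors.

    Both vectors are assumed to be aligned on the same variant sites.
    """
    q = np.asarray(query, dtype=float)
    r = np.asarray(reference, dtype=float)
    return float(np.corrcoef(q, r)[0, 1])


def best_match(query, references, threshold=0.9):
    """Return (name, correlation, is_match) for the best-correlated reference.

    `references` is a hypothetical dict mapping cell line name -> profile.
    The 0.9 threshold mirrors the rho > 0.9 reported in the abstract.
    """
    scores = {name: profile_correlation(query, ref)
              for name, ref in references.items()}
    name = max(scores, key=scores.get)
    return name, scores[name], scores[name] > threshold


def mixture_fraction(query, ref_a, ref_b):
    """Least-squares estimate of p in: query ~ p * ref_a + (1 - p) * ref_b.

    Assumed two-component linear mixture; the result is clipped to [0, 1].
    """
    q, a, b = (np.asarray(v, dtype=float) for v in (query, ref_a, ref_b))
    d = a - b
    denom = float(d @ d)
    if denom == 0.0:
        raise ValueError("reference profiles are identical")
    p = float((q - b) @ d) / denom
    return min(1.0, max(0.0, p))


# Toy variant allele frequencies for two hypothetical cell lines.
line_a = [1.0, 0.0, 0.5, 0.0, 1.0, 0.5]
line_b = [0.0, 1.0, 0.0, 0.5, 1.0, 0.0]
contaminated = 0.7 * np.array(line_a) + 0.3 * np.array(line_b)

print(best_match(line_a, {"A": line_a, "B": line_b}))
print(mixture_fraction(contaminated, line_a, line_b))
```

In this toy setup a clean sample correlates almost perfectly with its own reference, while the mixed sample yields an estimated fraction close to the proportion used to build it. Real profiles would first need variant calling and alignment of sites across cell lines, which this sketch does not cover.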
format Online
Article
Text
id pubmed-6360649
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6360649 2019-02-08 CeL-ID: cell line identification using RNA-seq data Mohammad, Tabrez A. Tsai, Yun S. Ameer, Safwa Chen, Hung-I Harry Chiu, Yu-Chiao Chen, Yidong BMC Genomics Research BioMed Central 2019-02-04 /pmc/articles/PMC6360649/ /pubmed/30712511 http://dx.doi.org/10.1186/s12864-018-5371-9 Text en © The Author(s). 2019 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title CeL-ID: cell line identification using RNA-seq data
topic Research