Celda: a Bayesian model to perform co-clustering of genes into modules and cells into subpopulations using single-cell RNA-seq data
Date
2022-09
Authors
Wang, Zhe
Yang, Shiyi
Koga, Yusuke
Corbett, Sean E.
Shea, Conor V.
Johnson, W. Evan
Yajima, Masanao
Campbell, Joshua David
Version
Published version
Citation
Z. Wang, S. Yang, Y. Koga, S.E. Corbett, C.V. Shea, W.E. Johnson, M. Yajima, J.D. Campbell. 2022. "Celda: a Bayesian model to perform co-clustering of genes into modules and cells into subpopulations using single-cell RNA-seq data." NAR Genomics and Bioinformatics, Volume 4, Issue 3, lqac066. https://doi.org/10.1093/nargab/lqac066
Abstract
Single-cell RNA-seq (scRNA-seq) has emerged as a powerful technique to quantify gene expression in individual cells and to elucidate the molecular and cellular building blocks of complex tissues. We developed a novel Bayesian hierarchical model called Cellular Latent Dirichlet Allocation (Celda) to perform co-clustering of genes into transcriptional modules and cells into subpopulations. Celda can quantify the probabilistic contribution of each gene to each module, each module to each cell population, and each cell population to each sample. In a peripheral blood mononuclear cell dataset, Celda identified a subpopulation of proliferating T cells and a plasma cell, both of which were missed by two other common single-cell workflows. Celda also identified transcriptional modules that could be used to characterize unique and shared biological programs across cell types. Finally, Celda outperformed other approaches for clustering genes into modules on simulated data. Celda presents a novel method for characterizing transcriptional programs and cellular heterogeneity in scRNA-seq data.
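The three levels of contribution named in the abstract (gene to module, module to cell population, cell population to sample) can be pictured as a chain of probability tables. The sketch below is a toy illustration only, not the Celda implementation or its inference procedure: every number is made up, the variable names are hypothetical, and placing each gene in exactly one module is an assumption chosen to mimic a clustering of genes.

```python
# Toy illustration only: all values are invented, not taken from the paper.

def compose(outer, inner):
    """Multiply two row-stochastic matrices given as lists of lists."""
    return [
        [sum(row[k] * inner[k][j] for k in range(len(inner)))
         for j in range(len(inner[0]))]
        for row in outer
    ]

# Hypothetical sample -> cell population proportions (2 samples, 2 populations).
sample_to_population = [[0.7, 0.3],
                        [0.2, 0.8]]

# Hypothetical cell population -> module proportions (2 populations, 2 modules).
population_to_module = [[0.9, 0.1],
                        [0.4, 0.6]]

# Hypothetical module -> gene proportions (2 modules, 4 genes).
# Assumption: each gene has nonzero weight in exactly one module.
module_to_gene = [[0.5, 0.5, 0.0, 0.0],
                  [0.0, 0.0, 0.25, 0.75]]

# Chain the levels: sample -> population -> module -> gene.
sample_to_module = compose(sample_to_population, population_to_module)
sample_to_gene = compose(sample_to_module, module_to_gene)

for row in sample_to_gene:
    print([round(p, 4) for p in row], "sum =", round(sum(row), 4))
```

Because every table has rows that sum to one, each composed row is again a probability distribution over genes, which is what lets contributions at one level be read through to the next.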
License
© The Author(s) 2022. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.