CanSig Benchmarks Methods for Reproducible Cancer Cell State Discovery from Single-Cell Transcriptomic Data
Single-cell RNA sequencing facilitates the discovery of gene expression signatures that define cell states across patients, which could be used in patient stratification and precision oncology. However, the lack of standardization in computational methodologies used to analyze these data impedes the reproducibility of signature detection. To address this, we developed CanSig, a comprehensive benchmarking tool that evaluates methods for identifying transcriptional signatures in cancer. CanSig integrates metrics for batch correction and biological signal conservation with a transcriptional signature correlation metric to score methods according to signature rediscovery, cross-dataset reproducibility, and clinical relevance. CanSig was applied to 13 methods on 12 single-cell RNA sequencing datasets from five human cancer types (glioblastoma, breast cancer, lung adenocarcinoma, rhabdomyosarcoma, and cutaneous squamous cell carcinoma), representing 185 patients and 174,000 malignant cells. The signatures identified with these methods correlated with clinically relevant outcomes, including patient survival and lymph node metastasis. These results identified Harmony, BBKNN, and fastMNN as the highest-scoring integration methods for discovering shared transcriptional states in cancer. Overall, CanSig provides a standardized, reproducible framework for uncovering clinically relevant cancer cell states in single-cell transcriptomics.
Significance: The development of CanSig facilitates computational strategies for the reproducible discovery of shared cancer cell states, improving cross-study reliability and enabling the detection of clinically relevant gene signatures.
This article is part of a special series: Driving Cancer Discoveries with Computational Research, Data Science, and Machine Learning/AI.
File: can-25-0940.pdf (main document, published version; open access, CC BY-NC-ND; Adobe PDF, 6.58 MB)