Infoscience

research article

CanSig Benchmarks Methods for Reproducible Cancer Cell State Discovery from Single-Cell Transcriptomic Data

Barkmann, Florian; Yates, Josephine; Czyż, Paweł; et al.
January 22, 2026
Cancer Research

Single-cell RNA sequencing facilitates the discovery of gene expression signatures that define cell states across patients, which could be used in patient stratification and precision oncology. However, the lack of standardization in computational methodologies used to analyze these data impedes the reproducibility of signature detection. To address this, we developed CanSig, a comprehensive benchmarking tool that evaluates methods for identifying transcriptional signatures in cancer. CanSig integrates metrics for batch correction and biological signal conservation with a transcriptional signature correlation metric to score methods according to signature rediscovery, cross-dataset reproducibility, and clinical relevance. CanSig was applied to 13 methods on 12 single-cell RNA sequencing datasets from five human cancer types (glioblastoma, breast cancer, lung adenocarcinoma, rhabdomyosarcoma, and cutaneous squamous cell carcinoma), representing 185 patients and 174,000 malignant cells. The signatures identified with these methods correlated with clinically relevant outcomes, including patient survival and lymph node metastasis. These results identified Harmony, BBKNN, and fastMNN as the highest-scoring integration methods for discovering shared transcriptional states in cancer. Overall, CanSig provides a standardized, reproducible framework for uncovering clinically relevant cancer cell states in single-cell transcriptomics.

Significance: The development of CanSig facilitates computational strategies for the reproducible discovery of shared cancer cell states, improving cross-study reliability and enabling the detection of clinically relevant gene signatures.

This article is part of a special series: Driving Cancer Discoveries with Computational Research, Data Science, and Machine Learning/AI.

Name: can-25-0940.pdf
Type: Main Document
Version: Published version
Access type: Open access
License condition: CC BY-NC-ND
Size: 6.58 MB
Format: Adobe PDF
Checksum (MD5): 5a920d6167dba76088ac823c98c02370


Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved