Scoring transcriptional shifts in scRNAseq

The OAR score reveals cellular transcriptional shifts, allowing cell prioritization for downstream applications. For best results, apply the test to a group of cells in which you expect some common transcriptional programs, e.g. one cell type across various biological samples or conditions. OAR scores are cluster agnostic (no cluster labels are required) and are robust across:

Technologies
Technical batches/library preparations
Organisms

The OAR score is a measure of transcriptional shifts among cells. A cell with a positive OAR score is one in which a set of genes appears to be expressed more homogeneously than in the other cells tested, which makes it a highly distinct cell.

Motivation

scRNAseq data is very sparse (50-90% of expression values are 0). Sparsity is generally attributed to technical limitations associated with capturing RNA molecules from individual cells. Some 0s are expected, and are a consequence of the Gamma-Poisson distribution of count data¹, whereas “Drop-out” (when 0s occur where positive counts are expected) is a problem associated with specific technologies (UMI- vs. nonUMI-based)².
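As a rough illustration of why some 0s are expected, the sketch below simulates Gamma-Poisson (negative binomial) counts and compares the observed fraction of 0s with the theoretical probability of a 0. The mean and dispersion values are hypothetical, chosen only so that the result falls in the sparsity range quoted above; they are not estimates from any real dataset.

```r
# Minimal sketch: zeros expected under a Gamma-Poisson (negative binomial) model.
# `mu` and `size` are hypothetical values picked for illustration only.
set.seed(1)
mu   <- 0.5   # assumed mean count per gene per cell
size <- 0.2   # assumed dispersion parameter (smaller = more overdispersed)

# Simulate a genes x cells count matrix
sim_counts <- matrix(rnbinom(2000 * 100, size = size, mu = mu), nrow = 2000)

observed_zero_fraction <- mean(sim_counts == 0)

# Theoretical P(count = 0) = (size / (size + mu))^size
expected_zero_fraction <- dnbinom(0, size = size, mu = mu)

print(c(observed = observed_zero_fraction, expected = expected_zero_fraction))
```

With these assumed parameters, roughly three quarters of the simulated values are 0 without any drop-out being modeled.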

Sparsity has been used to:

Is there something else we can learn about cellular identity from sparsity in scRNAseq data?

Test overview

At the core of the OAR score is the identification of gene co-expression patterns, followed by comparing the distribution of genes expressed in the identified patterns in each cell individually.

Base Test

To calculate the OAR score we:

Estimate Hamming distances between binarized vectors of gene expression.
Group genes into co-expression patterns based on the Hamming distances between them. Genes with unique patterns, i.e. with no “neighbors”, are grouped together.
Compare the distribution of gene expression across identified patterns for each cell with a Kruskal-Wallis test.
Scale the resulting corrected p value distributions across all cells to obtain the OAR score.
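The four steps above can be sketched in base R on simulated data. This is an illustrative approximation, not the package's implementation: the single-linkage grouping, the tolerance `tol`, the Benjamini-Hochberg correction, and the -log10 plus z-score scaling are all assumptions made for this example, and the simulated modules are hypothetical.

```r
# Toy sketch of the four steps; NOT the OARscRNA implementation.
set.seed(42)
n_cells <- 40

# Simulated counts (genes in rows, cells in columns):
# three hypothetical gene modules, each switched on in a random subset of cells
module_on      <- matrix(rbinom(3 * n_cells, 1, 0.5), nrow = 3)
module_of_gene <- rep(1:3, each = 15)
module_counts  <- module_on[module_of_gene, ] *
  matrix(rpois(45 * n_cells, 3), nrow = 45)
# plus fifteen genes with unrelated, sparse expression
lone_counts <- matrix(rbinom(15 * n_cells, 1, 0.5) * rpois(15 * n_cells, 3),
                      nrow = 15)
counts <- rbind(module_counts, lone_counts)
rownames(counts) <- paste0("gene", seq_len(nrow(counts)))
colnames(counts) <- paste0("cell", seq_len(n_cells))

# Step 1: binarize expression and compute Hamming distances between genes.
# On 0/1 vectors, the Manhattan distance equals the Hamming distance.
bin <- (counts > 0) + 0
d   <- dist(bin, method = "manhattan")
ham <- as.matrix(d)

# Step 2: group genes with similar patterns (assumed: single linkage cut at
# a hypothetical tolerance); genes with no neighbors are pooled in group 0.
tol <- 6
grp <- cutree(hclust(d, method = "single"), h = tol)
group_sizes <- table(grp)
singletons  <- names(group_sizes)[group_sizes == 1]
grp[as.character(grp) %in% singletons] <- 0L

# Step 3: for each cell, compare expression across gene groups
p_raw <- apply(counts, 2, function(x) kruskal.test(x, g = factor(grp))$p.value)

# Step 4: correct the p values and scale across cells (assumed transformation)
p_adj     <- p.adjust(p_raw, method = "BH")
oar_score <- as.numeric(scale(-log10(p_adj)))
names(oar_score) <- colnames(counts)

head(sort(oar_score, decreasing = TRUE))
```

In this sketch, a positive score marks a cell whose expression differs across gene groups more strongly than in the average cell of the toy dataset.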

Installation

To install the latest version of our package, run:

devtools::install_github("Sanin-Lab/OARscRNA")

If you want to install our vignettes (takes a few minutes!), try:

devtools::install_github("Sanin-Lab/OARscRNA", build_vignettes = TRUE)

For Mac and Linux users: The package uses FastHamming::hamming_distance() to speed up the Hamming distance calculation. Unless OpenMP is installed on your computer, the function will default to using all available threads. To install it via Homebrew, run in your terminal: brew install libomp

Usage

To calculate an OAR score from a Seurat object with default parameters run:

oar(data = seurat.obj)

Or from a matrix of unnormalized read counts, run:

oar(data = read.counts, seurat_v5 = FALSE)

This will automatically filter genes with low expression, identify a suitable tolerance for the Hamming distance, and return an OAR score, corrected and uncorrected p values, and sparsity for each cell (column) in the supplied object.

If a Seurat object is supplied, the results are added as columns in the meta.data slot.

For full details on all parameters, including a step-by-step breakdown of the process, please visit our documentation or view our vignettes with browseVignettes(package = "OAR").

Tutorials and Applications

Quick overview

Follow our quick guide on running the analysis with a single line: vignette("introductory_vignette")

Cell prioritization for downstream analysis

Follow our step-by-step tutorial on exploring how we can identify highly activated plasmacytoid dendritic cells based on a high OAR score: vignette("detailed_tutorial")

Cell prioritization across biological conditions

OAR scores are typically more informative when distinguishing among cells of the same type. When working with a dataset with diverse cell types or dramatically affected by a biological variable, it can be helpful to split the data by that factor and run the test independently: vignette("OAR_Factor")
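As a minimal sketch of the splitting step, assuming a plain count matrix and a hypothetical per-cell factor called `condition`, the cells can be separated in base R before scoring each subset. The matrix and factor here are simulated placeholders, the call to oar() is left as a comment because it needs the package and real data, and the vignette may organize this differently.

```r
# Minimal sketch: split a count matrix by a per-cell factor before scoring.
# `read.counts` and `condition` are simulated placeholders for illustration.
set.seed(7)
read.counts <- matrix(rpois(200 * 12, 1), nrow = 200,
                      dimnames = list(paste0("gene", 1:200),
                                      paste0("cell", 1:12)))
condition <- rep(c("control", "treated"), each = 6)  # hypothetical factor

# Column indices of the cells belonging to each level of the factor
cells_by_condition <- split(seq_len(ncol(read.counts)), condition)

# One count matrix per level, keeping all genes
counts_by_condition <- lapply(cells_by_condition,
                              function(i) read.counts[, i, drop = FALSE])

# Each subset could then be scored independently, for example:
# results <- lapply(counts_by_condition,
#                   function(m) oar(data = m, seurat_v5 = FALSE))

sapply(counts_by_condition, dim)
```

For a Seurat object, Seurat::SplitObject() with its split.by argument would be one way to get the equivalent list of objects.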

Model gene expression data at the single cell level

Identify the genes responsible for high OAR scores at single-cell resolution: vignette("Gene_expression")

References

scRNAseq implementation: Chen, R., Moore, H., Gueguen, PM., Kelly, B., Sanin, DE. (2025). Co-expression patterns in single-cell transcriptomes reveal transcriptional shifts. In press.