This R Markdown document performs a high-resolution cell-level spatial analysis using Xenium breast cancer data and VoltRon package. Unlike Visium (spot-based), Xenium captures individual cells, allowing us to investigate local microenvironments, immune-tumor interactions, and expression hotspots.
We perform a full analysis including: data filtering, disk-based representation for large-scale handling, clustering, marker analysis, and hotspot detection.
If you want to reproduce the experiment: 👉 Click here to the Docker and the data setup instructions
Compared to spot-based methods (e.g., Visium), Xenium captures individual cell or even molecule-level information, enabling high-resolution analysis of cell identity, spatial interactions, and functional zones within tissues. This allows us to examine not just spatial zones of gene expression but individual cells and their microenvironments.
library(HDF5Array)
library(ImageArray)
## Warning: replacing previous import 'pizzarr::slice' by 'IRanges::slice' when
## loading 'ZarrArray'
library(BPCells)
library(VoltRon)
library(rhdf5)
library(Seurat)
library(patchwork)
library(ComplexHeatmap)
We start by importing the Xenium dataset and filtering out cells with low expression counts (e.g., low RNA molecule counts per cell) to remove noise. This helps reduce background signal and computational burden.
Xen_R1 <- importXenium("../workshop/data/BreastCancer/Xenium_R1/outs/",
sample_name = "XeniumR1",
overwrite_resolution = TRUE,
resolution_level = 3)
# Filter cells with low molecule counts
Xen_R1 <- subset(Xen_R1, Count > 5)
Xenium datasets are often large (especially in molecule-resolution),
so we store them on disk using an HDF5VoltRon
format.
Disk-backed formats allow us to analyze large spatial datasets without
crashing R due to memory limits.
# Save and reload Xenium object
Xen_R1_ondisk <- saveVoltRon(Xen_R1,
format = "HDF5VoltRon",
output = "../workshop/data/ondisk/Xen_R1",
replace = TRUE)
Xen_R1_ondisk <- loadVoltRon("../workshop/data/ondisk/Xen_R1/")
We normalize gene expression, perform dimensionality reduction (PCA + UMAP), and cluster cells using SNN. The UMAP visualization shows distinct clusters of cells based on expression, while the spatial plot reveals how those clusters organize within tissue.
Xen_R1_ondisk <- normalizeData(Xen_R1_ondisk, sizefactor = 1000)
Xen_R1_ondisk <- getPCA(Xen_R1_ondisk, dims = 20, overwrite = TRUE)
Xen_R1_ondisk <- getUMAP(Xen_R1_ondisk, dims = 1:20, overwrite = TRUE)
Xen_R1_ondisk <- getProfileNeighbors(Xen_R1_ondisk, dims = 1:20, method = "SNN")
Xen_R1_ondisk <- getClusters(Xen_R1_ondisk, resolution = 1.3, label = "Cluste rs", graph = "SNN")
# UMAP + spatial plots
p1 <- vrEmbeddingPlot(Xen_R1_ondisk, group.by = "Clusters", embedding = "umap", pt.size = 0.4, label = TRUE)
p2 <- vrSpatialPlot(Xen_R1_ondisk, group.by = "Clusters", pt.size = 0.18)
(p1 | p2)
## Warning: Converting to a dense matrix may use excessive memory
## This message is displayed once every 8 hours.
## Warning: Removed 689 rows containing non-finite outside the scale range
## (`stat_bin2d()`).
## Warning: Removed 450 rows containing missing values or values outside the scale range
## (`geom_tile()`).
We identify marker genes and annotate each cluster based on known cell types. This gives biological meaning to the clusters and helps define what kinds of cells are where in the tissue.
# Convert to Seurat for marker detection
Xen_R1$Clusters <- Xen_R1_ondisk$Clusters
Xen_R1_seu <- VoltRon::as.Seurat(Xen_R1, cell.assay = "Xenium", type = "image")
Idents(Xen_R1_seu) <- "Clusters"
Xen_R1_seu <- NormalizeData(Xen_R1_seu, scale.factor = 1000)
markers <- FindAllMarkers(Xen_R1_seu)
# Annotate clusters with known labels
annotations <- read.table("../workshop/data/BreastCancer/Xenium_R1/annotation.txt")[,1]
CellType <- factor(Xen_R1_ondisk$Clusters)
levels(CellType) <- annotations
Xen_R1_ondisk$CellType <- as.character(CellType)
# Visualize annotations
p1 <- vrSpatialPlot(Xen_R1_ondisk, group.by = "CellType", pt.size = 0.18, alpha = 1)
p2 <- vrEmbeddingPlot(Xen_R1_ondisk, group.by = "CellType", embedding = "umap", pt.size = 0.4, label = TRUE)
# Save annotated object
Xen_R1_ondisk <- saveVoltRon(Xen_R1_ondisk)
(p1 | p2)
## Warning: Removed 689 rows containing non-finite outside the scale range
## (`stat_bin2d()`).
## Warning: Removed 450 rows containing missing values or values outside the scale range
## (`geom_tile()`).
Now that we know cell types, we want to see how they’re arranged in tissue. Using spatial neighborhoods, we build a graph of cell-cell proximity, compute the abundance of cell types around each cell, normalize them (CLR), and perform k-means clustering.
These niche clusters reveal tissue microenvironments — like immune zones or invasive tumor nests — based on compositional cell-type context.
# Build spatial neighbor graph
Xen_R1_ondisk <- getSpatialNeighbors(Xen_R1_ondisk, radius = 30, method = "radius")
# Build niche matrix
Xen_R1_ondisk <- getNicheAssay(Xen_R1_ondisk, label = "CellType", graph.type = "radius", overwrite =TRUE)
# Normalize + cluster niche profiles
vrMainFeatureType(Xen_R1_ondisk) <- "Niche"
Xen_R1_ondisk <- normalizeData(Xen_R1_ondisk, method = "CLR")
Xen_R1_ondisk <- getClusters(Xen_R1_ondisk, nclus = 9, method = "kmeans", label = "niche_clusters")
# Visualize
p1 <- vrSpatialPlot(Xen_R1_ondisk, group.by = "niche_clusters", alpha = 1)
p2 <- vrHeatmapPlot(Xen_R1_ondisk, features = vrFeatures(Xen_R1_ondisk), group.by = "niche_clusters")
(p1 | p2)
## `use_raster` is automatically set to TRUE for a matrix with more than
## 2000 columns You can control `use_raster` argument by explicitly
## setting TRUE/FALSE to it.
##
## Set `ht_opt$message = FALSE` to turn off this message.
##
## Attaching package: 'gridExtra'
## The following object is masked from 'package:BiocGenerics':
##
## combine
## Warning: Removed 689 rows containing non-finite outside the scale range
## (`stat_bin2d()`).
## Warning: Removed 448 rows containing missing values or values outside the scale range
## (`geom_tile()`).
💬 Interpretation: In the clustering plots, different niche clusters represent distinct spatial microenvironments. For example: cluster 2 (ACTA2 and KRT-positive epithelial cells) suggest structural zones, while immune-dense niches cluster separately. These patterns help us map functional domains within the tissue, guided by cellular composition and spatial context.
To explore immune infiltration or stromal interactions, we subset and visualize specific cell types. These plots help confirm findings from the niche heatmaps.
p1 <- vrSpatialPlot(Xen_R1_ondisk, group.by = "CellType", pt.size = 0.18, alpha = 1,
group.ids = c("ACTA2_myoepithelial", "KRT15_myoepithelial"))
p2 <- vrSpatialPlot(Xen_R1_ondisk, group.by = "CellType", pt.size = 1, alpha = 1,
group.ids = c("CD4_TCells", "CD8_TCells", "BCells"), n.tile = 400)
(p1 | p2)
## Warning: Removed 53 rows containing missing values or values outside the scale range
## (`geom_point()`).
## Warning: Removed 94 rows containing non-finite outside the scale range
## (`stat_bin2d()`).
## Warning: Removed 230 rows containing missing values or values outside the scale range
## (`geom_tile()`).
🔍 Interpretation of Spatial & Niche Clustering Plots
- ACTA2 / KRT15 myoepithelial cells form structural boundaries, likely marking tumor edges or ductal structures.
- CD4 / CD8 T cells and B cells highlight areas of immune infiltration, with CD8+ T cells suggesting cytotoxic responses and B cells possibly forming lymphoid-like structures.
This analysis detects statistical hotspots of a
specific gene (e.g. PGR
), identifying areas where
expression is significantly elevated in both a cell and its
neighbors.
It uses Getis-Ord statistics, comparing local
vs. global expression means, and outputs per-cell p-values.
# Build spatial neighbor graph for hotspot
Xen_R1_ondisk <- getSpatialNeighbors(Xen_R1_ondisk, method = "radius", radius = 15, graph.key = "radius_hot")
# Define RNA features as main
vrMainFeatureType(Xen_R1_ondisk) <- "RNA"
# Visualize gene
p1 <- vrSpatialFeaturePlot(Xen_R1_ondisk, features = "PGR", alpha = 1, background.color = "black", n.tile = 300)
# Run hotspot test
Xen_R1_ondisk <- getHotSpotAnalysis(Xen_R1_ondisk, features = "PGR", graph.type = "radius_hot", alpha.value = 0.001)
# Visualize significant regions
p2 <- vrSpatialFeaturePlot(Xen_R1_ondisk, features = "PGR_hotspot_stat", alpha = 1, background.color = "black", n.tile = 400)
p3 <- vrSpatialPlot(Xen_R1_ondisk, group.by = "PGR_hotspot_flag", alpha = 1, background.color = "black", n.tile = 400)
p1 + p2 + p3
💬 Explanation: Hotspot analysis pinpoints regions of biological activity, such as hormone-active tumor zones or interferon responses. It complements clustering by analyzing spatial patterns of single gene features.
This Xenium pipeline provides a high-resolution, cell-level spatial analysis of breast cancer tissue, including:
These steps help us understand cellular organization, interactions, and functional zones — essential for cancer biology, immunology, and tissue profiling.
Use a pre-built Docker image with VoltRon and all dependencies pre-installed.
Make sure Docker Desktop is installed.
Pull the image:
docker pull amanukyan1385/rstudio-voltron:main
docker run --rm -ti \
-e PASSWORD=yourpassword \
-p 8787:8787 \
-v Path/to/your/dataset:/home/rstudio/project \
amanukyan1385/rstudio-voltron:main
rstudio
yourpassword
# you can change your own
passwordYou can also use Docker Desktop GUI if you prefer volume mounting.
Download the complete dataset bundle used in this notebook:
Download all datasets (ZIP) - including datasets for other markdown