This R Markdown document demonstrates how to:
We will use:
S2_N
,
S2_orf1ab
).If you want to reproduce the experiment: 👉 Click here to the Docker and the data setup instructions
Xen_R1
: Processed Xenium object of breast cancer
section (cells).Xen_R1_image
: Corresponding H&E image.vr2_merged_acute1
: COVID-19 Xenium dataset with both
cells and molecules
(Xenium
and Xenium_mol
).acutecase1_membrane.geojson
: Annotated hyaline membrane
regions.acutecase1_heimage.jpg
: H&E image of lung
tissue.library(VoltRon)
library(dplyr)
library(Seurat)
library(ComplexHeatmap)
library(basilisk)
library(vitessceR)
library(ggnewscale)
Spatial transcriptomics often captures DAPI-based coordinates, while histology gives tissue context (e.g., H&E). Registration bridges these layers. These integrated features enables morphology-guided cell analysis:
For example:
Xen_R1 <- loadVoltRon("../workshop/data/ondisk/Xen_R1/")
We store them on disk using an HDF5VoltRon format. Disk-backed formats allow us to analyze large spatial datasets without crashing R due to memory limits.
Xen_R1_image <- importImageData("../workshop/data/BreastCancer/Xenium_R1/Xenium_FFPE_Human_Breast_Cancer_Rep1_he_image_highres.tif",
sample_name = "HEimage",
channel_names = "H&E", tile.size = 100)
Xen_R1_image_disk <- saveVoltRon(Xen_R1_image,
format = "HDF5VoltRon",
output = "../data/ondisk/Xen_R1_image/",
replace = TRUE)
# load H&E data from disk
Xen_R1_image_disk <- loadVoltRon("../data/ondisk/Xen_R1_image/")
We now perform automated image registration. Behind the scenes, this detects features (e.g., nuclei, shapes) and applies homography to align two spatial maps.
xen_reg <- registerSpatialData(object_list = list(Xen_R1, Xen_R1_image_disk)) # This will give you a interface to do the registration for two images like below
We then transfer the aligned image and visualize the clusters overlayed on H&E.
Xenium_reg <- xen_reg$registered_spat[[2]]
vrImages(Xen_R1[["Assay1"]], channel = "H&E") <- vrImages(Xenium_reg, name = "main_reg", channel = "H&E")
vrSpatialPlot(Xen_R1, group.by = "Clusters", channel = "H&E")
## Warning: Converting to a dense matrix may use excessive memory
## This message is displayed once every 8 hours.
## Warning: Removed 347 rows containing missing values or values outside the scale range
## (`geom_tile()`).
COVID-19 lungs contain viral RNAs (S2_N
,
S2_orf1ab
), some inside cells, some in hyaline
membrane regions (mucosal damage zones). We analyze:
We study:
vr2_merged_acute1 <- readRDS(file = "../workshop/data/COVID19/acutecase1_annotated.rds")
SampleMetadata(vr2_merged_acute1)
## Assay Layer Sample
## Assay7 Xenium Section1 acute case 1
## Assay8 Xenium_mol Section1 acute case 1
Xenium captures not only transcripts within cells but also those outside the cells. They use DAPI staining to mark the boundary and calculate the transcripts within the cells to get the Cell assay. This allows Xenium to provide Cell assay and Molecule assay. This feature provides a more comprehensive understanding of spatial transcriptomics.
# Cells
p1 <- vrSpatialPlot(vr2_merged_acute1, assay = "Xenium", group.by = "CellType", plot.segments = TRUE)
# Molecules
p2 <- vrSpatialPlot(vr2_merged_acute1, assay = "Xenium_mol", group.by = "gene", group.ids = c("S2_N", "S2_orf1ab"), n.tile = 500)
p3 <- vrSpatialPlot(vr2_merged_acute1, assay = "Xenium_mol", group.by = "gene", group.ids = c("S2_N", "S2_orf1ab"), n.tile = 500) |>
addSpatialLayer(vr2_merged_acute1, assay = "Xenium", group.by = "CellType", plot.segments = TRUE)
(p1 | p2 | p3)
## Warning: Removed 54 rows containing missing values or values outside the scale range
## (`geom_tile()`).
## Removed 54 rows containing missing values or values outside the scale range
## (`geom_tile()`).
💬 Explanation: Xenium can capture all transcripts with in the slides. Molecules often lie outside cells, particularly in damaged zones, making tissue morphology crucial for interpretation.
Extra Notes: The expression of S2_N and S2_orf1ab genes are done by 10X company. Also, we don’t know the expression here. I think they capture those gene with some threshold and everything above the threshold they will count as a dot in in image. That’s why we do not discuss the expression for S2_N and S2_orf1ab genes but discuss the counts of them!
We load an H&E image annotated by a pathologist for hyaline membranes. These are regions of epithelial damage. With this, we could study how hyaline membrane affects COVID-19 activity.
After registration, we register the image and its annotations to our main dataset.
imgdata <- importImageData("../workshop/data/COVID19/acutecase1_heimage.jpg", tile.size = 10,
segments = "../workshop/data/COVID19/acutecase1_membrane.geojson", sample_name = "acute case 1 (HE)",
channel_names = "H&E")
imgdata <- flipCoordinates(imgdata, assay = "ROIAnnotation")
vr2_merged_acute1 <- modulateImage(vr2_merged_acute1, brightness = 300)
xen_reg <- registerSpatialData(object_list = list(vr2_merged_acute1, imgdata)) # This will have interface like below
imgdata_reg <- xen_reg$registered_spat[[2]]
vrImages(vr2_merged_acute1[["Assay7"]], name = "main", channel = "H&E") <- vrImages(imgdata_reg, assay = "Assay1", name = "main_reg")
vrImages(vr2_merged_acute1[["Assay8"]], name = "main", channel = "H&E") <- vrImages(imgdata_reg, assay = "Assay1", name = "main_reg")
# Add ROI Assay
vr2_merged_acute1 <- addAssay(vr2_merged_acute1,
assay = imgdata_reg[["Assay2"]],
metadata = Metadata(imgdata_reg, assay = "ROIAnnotation"),
assay_name = "ROIAnnotation",
sample = "acute case 1", layer = "Section1")
vrMainSpatial(vr2_merged_acute1[["Assay9"]]) <- "main_reg"
vrMainAssay(vr2_merged_acute1) <- "ROIAnnotation"
vr2_merged_acute1$Region <- "Hyaline Membrane"
p1 <- vrSpatialPlot(vr2_merged_acute1, assay = "Xenium_mol", group.by = "gene",
group.ids = c("S2_N", "S2_orf1ab"), n.tile = 500) |>
addSpatialLayer(vr2_merged_acute1, assay = "ROIAnnotation",
group.by = "Region", alpha = 0.3, spatial = "main_reg",
colors = list(`Hyaline Membrane` = "blue"))
p2 <- vrSpatialPlot(vr2_merged_acute1, assay = "ROIAnnotation",
group.by = "Region", alpha = 0.3, spatial = "main_reg",
colors = list(`Hyaline Membrane` = "blue"))
(p1 | p2)
## Warning: Duplicated `override.aes` is ignored.
## Warning: Removed 54 rows containing missing values or values outside the scale range
## (`geom_tile()`).
💬 Explanation: Here, can see the images are overlapping. But we still don’t know how many S2_N and S2_orf1ab are there within hyaline membrane. Therefore, we need to do the following steps.
To assess whether molecules are enriched in hyaline membrane zones. This allows us to tag each molecule with the region it resides in — helping quantify infection by zone.
vrSpatialNames(vr2_merged_acute1, assay = "all")
vrMainSpatial(vr2_merged_acute1[["Assay9"]]) <- "main_reg"
vr2_merged_acute1 <- transferData(object = vr2_merged_acute1,
from = "Assay9",
to = "Assay8",
features = "Region")
s2_summary_hyaline <- Metadata(vr2_merged_acute1, assay = "Xenium_mol") %>%
filter(gene %in% c("S2_N", "S2_orf1ab"), Region == "Hyaline Membrane") %>%
summarise(S2_N = sum(gene == "S2_N"),
S2_orf1ab = sum(gene == "S2_orf1ab"),
ratio = sum(gene == "S2_N")/sum(gene == "S2_orf1ab")) %>%
as.matrix()
s2_summary_hyaline
## S2_N S2_orf1ab ratio
## [1,] 50323 32769 1.535689
🧠 Interpretation:
S2_N / S2_orf1ab
ratio informs us about viral replication. A higher ratio indicates more active replication. So from this observation, we can say the viral is growing within the hyaline membrane.If you wonder why we use ratio instead of expression, see the Explanation section in here.
After knowing the viral is growing within the hyaline membrane, we want to explore more about the Xenium dataset. The annotation step doesn’t show here but bascially just annotate the cells base on their gene expression.
Here, we have a cell type called “H.I. Cells”, which means highly infected cells.
We also curious about whether there is a region have lots of “H.I. Cells” and is COVID-19 highly active in the region or not.
## Use can use Rstduio locally, You can use this function.
as.AnnData(vr2_merged_acute1, assay = "Assay7", file = "../workshop/data/ondisk/COVID19/vr2_merged_acute1_annotated_registered.zarr",
flip_coordinates = TRUE, create.ometiff = TRUE, name = "main", channel = "H&E")
# If you use Docker, use this function! But the function may not as flexible as the function above
# vrSpatialPlot(vr2_merged_acute1, assay = "Xenium",channel = "H&E", group.by = "CellType", interactive = TRUE)
🧠 Interpretation: We could see there indeed has a region that is full of “H.I. Cells”. As we zoom in, we could see there just like a battle zone of the cells. So now we curious about what’s the
S2_N / S2_orf1ab
ratio.
Before calculating the S2_N / S2_orf1ab
ratio, we need
to mark down this region. Here, we used an interactive annotation
interface to define infected zones. You can see the
video as a demo.
# Manually annotate infected zones
vr2_merged_acute1 <- annotateSpatialData(vr2_merged_acute1, assay = "Xenium_mol", label = "Region2",
use.image = TRUE, channel = "H&E", annotation_assay = "InfectedAnnotation")
# The functionwill give you GUI interface. I use video for demo here
Here, we integrated the infected region with “S2_N”, “S2_orf1ab” image and hyaline membrane image.
vrSpatialPlot(vr2_merged_acute1, assay = "Xenium_mol", group.by = "gene",
group.ids = c("S2_N", "S2_orf1ab"), n.tile = 500) |>
addSpatialLayer(vr2_merged_acute1, assay = "ROIAnnotation", group.by = "Region", alpha = 0.3, spatial = "main_reg",
colors = list(`Hyaline Membrane` = "blue")) |>
addSpatialLayer(vr2_merged_acute1, assay = "InfectedAnnotation", group.by = "Region2", alpha = 0.8,
colors = list(`Infected` = "yellow"))
## Warning: Duplicated `override.aes` is ignored.
## Duplicated `override.aes` is ignored.
## Warning: Removed 54 rows containing missing values or values outside the scale range
## (`geom_tile()`).
Now, we can compare the S2_N / S2_orf1ab
ratio of
infected region with hyaline membrane region.
# Compare abundance
s2_summary_infected <-
Metadata(vr2_merged_acute1, assay = "Xenium_mol") %>%
filter(gene %in% c("S2_N", "S2_orf1ab"),
Region2 == "Infected") %>%
summarise(S2_N = sum(gene == "S2_N"),
S2_orf1ab = sum(gene == "S2_orf1ab"),
ratio = sum(gene == "S2_N")/sum(gene == "S2_orf1ab")) %>%
as.matrix()
rbind(s2_summary_infected, s2_summary_hyaline)
## S2_N S2_orf1ab ratio
## [1,] 6437 2421 2.658819
## [2,] 50323 32769 1.535689
🧠 Interpretation: We could see the infected region has higher
S2_N / S2_orf1ab
ratio than hyaline membrane region. But how significant it is?
Here, we do the fish test to see how significant the infected region is higher than hyaline membrane region.
# check all abundances
S2_table <- matrix(c(s2_summary_hyaline[,1:2],
s2_summary_infected[,1:2]),
dimnames = list(Region = c("Hyaline", "Infected"),
S2 = c("N", "orf1ab")),
ncol = 2, byrow = TRUE)
fisher.test(S2_table, alternative = "two.sided")
##
## Fisher's Exact Test for Count Data
##
## data: S2_table
## p-value < 2.2e-16
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
## 0.5498699 0.6065638
## sample estimates:
## odds ratio
## 0.5775876
🧠 Interpretation: With p-value 2.2e-16, we could confidently say the infected region has higher ratio than hyaline membrane region.
Using a spatial graph, we locate clusters of high local expression (i.e., hotspots) of S2_N and ORF1ab.
vr2_merged_acute1 <- getSpatialNeighbors(vr2_merged_acute1, assay = "Xenium_mol",
group.by = "gene", group.ids = c("S2_N", "S2_orf1ab"),
method = "radius", radius = 50)
vr2_merged_acute1 <- getHotSpotAnalysis(vr2_merged_acute1, assay = "Xenium_mol",
features = "gene", graph.type = "radius")
vrSpatialPlot(vr2_merged_acute1, assay = "Xenium_mol",
group.by = "gene_hotspot_flag", group.ids = c("cold", "hot"),
alpha = 1, background.color = "white")
This pipeline illustrates how integrating multiple spatial data types can enhance biological insight:
Together, they provide a powerful framework for spatial biology and disease pathology studies.
Use a pre-built Docker image with VoltRon and all dependencies pre-installed.
Make sure Docker Desktop is installed.
Pull the image:
docker pull amanukyan1385/rstudio-voltron:main
docker run --rm -ti \
-e PASSWORD=yourpassword \
-p 8787:8787 \
-v Path/to/your/dataset:/home/rstudio/project \
amanukyan1385/rstudio-voltron:main
rstudio
yourpassword
# you can change your own
passwordYou can also use Docker Desktop GUI if you prefer volume mounting.
Download the complete dataset bundle used in this notebook:
Download all datasets (ZIP) - including datasets for other markdown