Gradient Neighborhoods: profiling the pancreatic islet microenvironment¶
This notebook demonstrates the gradient neighborhood workflow in celldega.nbhd. From a
region of interest (ROI) we build concentric rings that step outward into the surrounding
tissue and inward into the ROI, then use NeighborhoodCollection to measure how cell-type
composition and gene expression change with distance from the ROI boundary.
Here the ROI is the set of pancreatic islets in a Xenium human pancreas section. We:
- Load a pre-clustered Xenium AnnData (Leiden clusters + spatial coordinates).
- Identify the endocrine / islet cluster from hormone markers and trace it with an alpha shape (a multipolygon, one piece per islet).
- Build inward + outward gradient rings with
nbhd.calc_gradient(obs_name=...). - Quantify the cell-type proportion and mean hormone expression gradient per ring.
- Visualize the rings as static plots and as an interactive Landscape overlay.
The dataset is the public
Xenium_V1_human_Pancreas_FFPE_outs.h5ad
and the matching DegaFiles landscape (see the
Scanpy-Squidpy Xenium Pancreas
tutorial for how this AnnData was produced).
from pathlib import Path
from urllib.parse import quote
import geopandas as gpd
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import requests
import scanpy as sc
from scipy import sparse
from shapely.geometry import Point
import celldega as dega
1. Load the Xenium pancreas AnnData¶
REPO_ID = "broadinstitute/Celldega_Supporting_Data"
REVISION = "main"
CACHE_DIR = Path("data/celldega_supporting_data")
H5AD_PATH = "Xenium_V1_human_Pancreas_FFPE_outs.h5ad"
def download_repo_file(repo_id, repo_path, cache_dir, revision="main"):
cache_dir.mkdir(parents=True, exist_ok=True)
local_path = cache_dir / Path(repo_path).name
if local_path.exists():
return local_path
encoded_path = quote(repo_path)
url = f"https://huggingface.co/datasets/{repo_id}/resolve/{revision}/{encoded_path}"
with requests.get(url, stream=True, timeout=120) as response:
response.raise_for_status()
with local_path.open("wb") as handle:
for chunk in response.iter_content(chunk_size=1024 * 1024):
if chunk:
handle.write(chunk)
return local_path
local_h5ad = download_repo_file(REPO_ID, H5AD_PATH, CACHE_DIR, REVISION)
adata = sc.read_h5ad(local_h5ad)
adata
AnnData object with n_obs × n_vars = 122678 × 377
obs: 'cell_id', 'transcript_counts', 'control_probe_counts', 'control_codeword_counts', 'unassigned_codeword_counts', 'deprecated_codeword_counts', 'total_counts', 'cell_area', 'nucleus_area', 'region', 'z_level', 'nucleus_count', 'cell_labels', 'n_counts', 'leiden'
var: 'gene_ids', 'feature_types', 'genome', 'n_cells'
uns: 'leiden', 'leiden_colors', 'log1p', 'neighbors', 'pca', 'spatialdata_attrs', 'umap'
obsm: 'X_pca', 'X_umap', 'spatial'
varm: 'PCs'
obsp: 'connectivities', 'distances'
# Xenium coordinates are in microns; make sure they live in obsm["spatial"].
def ensure_spatial_coordinates(adata):
if "spatial" in adata.obsm:
return
for x_col, y_col in [("x_centroid", "y_centroid"), ("x_location", "y_location"), ("x", "y")]:
if x_col in adata.obs and y_col in adata.obs:
adata.obsm["spatial"] = adata.obs[[x_col, y_col]].to_numpy()
return
raise ValueError("The AnnData needs adata.obsm['spatial'] or x/y centroid columns.")
ensure_spatial_coordinates(adata)
spatial = adata.obsm["spatial"]
2. Identify the islet (endocrine) cluster¶
Pancreatic islets are marked by the endocrine hormones INS (β-cells), GCG (α-cells), SST (δ-cells), and PPY (PP-cells). We score each Leiden cluster by mean hormone expression and pick the clear winner. (In this dataset that is cluster 9 — but the score keeps the example robust if the clustering is ever re-run and the labels change.)
hormones = ["INS", "GCG", "SST", "PPY"]
marker_df = sc.get.obs_df(adata, keys=[*hormones, "leiden"])
endocrine_score = marker_df.groupby("leiden", observed=True)[hormones].mean().sum(axis=1)
islet_cluster = endocrine_score.idxmax()
print(f"Islet (endocrine) cluster: {islet_cluster}")
endocrine_score.sort_values(ascending=False).round(3)
Islet (endocrine) cluster: 9
leiden 9 4.350 3 0.111 6 0.090 5 0.083 0 0.071 10 0.068 8 0.067 7 0.037 2 0.034 1 0.029 4 0.016 dtype: float32
3. Trace the islets with an alpha shape¶
dega.nbhd.alpha_shape wraps the islet-cell point cloud in a concave hull. Because the islets
are scattered across the section, the result is a MultiPolygon — one piece per islet. The
inv_alpha parameter controls how tightly the hull hugs the points: a smaller value hugs the
points more tightly and breaks the islets into more, smaller pieces. We use inv_alpha=25 to keep
the individual islets well separated.
We then wrap the islet alpha shape as a NeighborhoodCollection — here a single neighborhood named
after the cluster. In a larger analysis this collection might hold one alpha-shape neighborhood per
cell-type cluster (e.g. from alpha_shape_cell_clusters); the gradient step below simply picks one
of them by name.
islet_mask = (adata.obs["leiden"] == islet_cluster).to_numpy()
islet_points = spatial[islet_mask]
islets = dega.nbhd.alpha_shape(islet_points, inv_alpha=25)
print(f"{islet_mask.sum()} islet cells -> {len(islets.geoms)} islet polygons")
# Wrap the islet alpha shape as a neighborhood named after the cluster (e.g. "9").
nbhd_alpha = dega.nbhd.NeighborhoodCollection(
gdf=gpd.GeoDataFrame({"name": [islet_cluster]}, geometry=[islets]),
nbhd_type="alpha_shape",
)
nbhd_alpha.obs
2717 islet cells -> 49 islet polygons
| neighborhood_id | name | neighborhood_type | method | area | area_um2 | centroid_x | centroid_y | |
|---|---|---|---|---|---|---|---|---|
| neighborhood_id | ||||||||
| 9 | 9 | 9 | alpha_shape | alpha_shape | 245231.981777 | 245231.981777 | 5046.796303 | 1420.907351 |
View the islet alpha shapes on the Landscape¶
Before computing the gradient, we can overlay the islet alpha shapes on the interactive Landscape.
Passing a NeighborhoodCollection to nbhd= is all that's needed — the widget reads the
micron_to_image_transform.csv from the DegaFiles base_url and converts the micron geometry to
pixels itself. Setting landscape_state="nbhd" reveals the neighborhood layer in the initial
view (the same way landscape_state="umap" opens on the UMAP).
base_url = (
"https://raw.githubusercontent.com/broadinstitute/"
"celldega_Xenium_human_Pancreas_FFPE/main/"
"Landscape_Xenium_V1_human_Pancreas_FFPE_outs_webp"
)
landscape_alpha = dega.viz.Landscape(
technology="Xenium",
base_url=base_url,
height=600,
adata=adata,
nbhd=nbhd_alpha, # the islet alpha shapes
landscape_state="nbhd", # show the neighborhoods on load
)
landscape_alpha
<celldega.viz.widget.Landscape object at 0x14683b440>
4. Build inward + outward gradient rings¶
nbhd.calc_gradient(obs_name=...) grows concentric rings around one neighborhood in the
collection and returns a new gradient NeighborhoodCollection — one ring per observation, ordered
inner-most to outer-most. We anchor on the islet neighborhood and step every 50 µm in both
directions, going out to 400 µm so the gradient reaches well into the surrounding tissue.
To stop the outward rings from overhanging into empty regions near the edges of the section, we
clip them to a tissue boundary. Passing clip_reference=spatial lets calc_gradient compute a
tissue alpha shape on the fly from all cell positions; clip_alpha can be coarse (a large value such
as 200) since we only need the overall tissue outline, not fine detail. The geometry is already in
microns, so we leave is_pixel_space=False (the default).
Inward erosion stops automatically once a small islet erodes away, so there are fewer inward rings than outward ones.
nbhd_grad = nbhd_alpha.calc_gradient(
obs_name=islet_cluster, # the neighborhood to anchor the gradient on
direction="both", # "outward", "inward", or "both"
bin_width=50, # ring width in microns
max_dist=400, # reach well outside the islets
clip_reference=spatial, # clip outward rings to a tissue alpha shape
clip_alpha=200, # coarse tissue outline is fine
)
ring_order = list(nbhd_grad.obs.index)
nbhd_grad.obs[["direction", "dist_start_um", "dist_end_um", "area_um2"]]
| direction | dist_start_um | dist_end_um | area_um2 | |
|---|---|---|---|---|
| neighborhood_id | ||||
| in (-50~-100) µm | inward | -100 | -50 | 5.207563e+03 |
| in (0~-50) µm | inward | -50 | 0 | 2.400244e+05 |
| out (+0~+50) µm | outward | 0 | 50 | 9.692854e+05 |
| out (+50~+100) µm | outward | 50 | 100 | 1.333964e+06 |
| out (+100~+150) µm | outward | 100 | 150 | 1.460667e+06 |
| out (+150~+200) µm | outward | 150 | 200 | 1.526653e+06 |
| out (+200~+250) µm | outward | 200 | 250 | 1.497781e+06 |
| out (+250~+300) µm | outward | 250 | 300 | 1.414923e+06 |
| out (+300~+350) µm | outward | 300 | 350 | 1.232888e+06 |
| out (+350~+400) µm | outward | 350 | 400 | 9.900265e+05 |
Static plot — islets and their gradient rings¶
fig, axes = plt.subplots(1, 2, figsize=(14, 5))
# Left: islet cells + alpha-shape outline.
axes[0].scatter(spatial[:, 0], spatial[:, 1], s=0.2, color="lightgray", label="all cells")
axes[0].scatter(islet_points[:, 0], islet_points[:, 1], s=0.6, color="crimson", label="islet cells")
gpd.GeoSeries([islets]).boundary.plot(ax=axes[0], color="black", linewidth=0.5)
axes[0].set_title(f"Islet cells (cluster {islet_cluster}) + alpha shape")
axes[0].legend(markerscale=8, loc="upper left")
# Right: gradient rings, colored inward (reds) -> outward (blues).
nbhd_grad.gdf.plot(ax=axes[1], color=nbhd_grad.gdf["color"], edgecolor="white", linewidth=0.2)
axes[1].set_title("Gradient rings (red = inward, blue = outward)")
for ax in axes:
ax.set_aspect("equal")
ax.set_xlim(2000, 3500) # zoom to an islet-rich region
ax.set_ylim(1800, 800)
ax.set_xlabel("x (µm)")
ax.set_ylabel("y (µm)")
fig.tight_layout()
5. Quantify the gradient with NeighborhoodCollection¶
Because the rings are a NeighborhoodCollection, the standard calc_nbhd_by_* methods summarize
each ring directly:
calc_nbhd_by_pop→ cell-type proportion per ring (here, the islet-cell fraction).calc_nbhd_by_gene→ mean expression per ring (here, the islet hormones).
We keep every ring (drop_missing=False) so the gradient axis stays complete.
nbhd_grad.calc_nbhd_by_pop(adata, category="leiden", output="proportion", min_cells=5, drop_missing=False)
Calculating NBP
nbhd_grad.calc_nbhd_by_gene(adata, by="cell", min_cells=5, drop_missing=False)
Calculating neighborhood-by-gene (cell-derived)
def modality_frame(modality):
X = modality.X.toarray() if sparse.issparse(modality.X) else np.asarray(modality.X)
return pd.DataFrame(X, index=modality.obs_names, columns=modality.var_names).reindex(ring_order)
proportion = modality_frame(nbhd_grad.mod["population"])
expression = modality_frame(nbhd_grad.mod["gene"])
summary = pd.DataFrame({
"direction": nbhd_grad.obs["direction"].reindex(ring_order),
"dist_start_um": nbhd_grad.obs["dist_start_um"].reindex(ring_order),
"islet_proportion": proportion[islet_cluster],
**{f"{g}_mean": expression[g] for g in hormones if g in expression.columns},
})
summary.round(3)
| direction | dist_start_um | islet_proportion | INS_mean | GCG_mean | SST_mean | PPY_mean | |
|---|---|---|---|---|---|---|---|
| neighborhood_id | |||||||
| in (-50~-100) µm | inward | -100 | 1.000 | 2.609 | 1.760 | 1.376 | 0.000 |
| in (0~-50) µm | inward | -50 | 0.840 | 1.820 | 1.617 | 0.789 | 0.051 |
| out (+0~+50) µm | outward | 0 | 0.001 | 0.153 | 0.168 | 0.056 | 0.010 |
| out (+50~+100) µm | outward | 50 | 0.001 | 0.032 | 0.028 | 0.013 | 0.004 |
| out (+100~+150) µm | outward | 100 | 0.001 | 0.018 | 0.015 | 0.007 | 0.002 |
| out (+150~+200) µm | outward | 150 | 0.001 | 0.013 | 0.012 | 0.008 | 0.002 |
| out (+200~+250) µm | outward | 200 | 0.000 | 0.009 | 0.007 | 0.004 | 0.004 |
| out (+250~+300) µm | outward | 250 | 0.000 | 0.007 | 0.007 | 0.004 | 0.002 |
| out (+300~+350) µm | outward | 300 | 0.000 | 0.005 | 0.006 | 0.006 | 0.002 |
| out (+350~+400) µm | outward | 350 | 0.000 | 0.006 | 0.005 | 0.002 | 0.002 |
Static plots — composition and expression vs. distance from the islet edge¶
x = nbhd_grad.obs["dist_start_um"].reindex(ring_order).astype(float).to_numpy()
fig, axes = plt.subplots(1, 2, figsize=(13, 4.5))
axes[0].plot(x, proportion[islet_cluster].to_numpy(), "o-", color="crimson")
axes[0].axvline(0, ls="--", color="gray")
axes[0].set_title("Cell-type proportion gradient")
axes[0].set_ylabel("islet-cell proportion")
for gene in ["INS", "GCG", "SST"]:
axes[1].plot(x, expression[gene].to_numpy(), "o-", label=gene)
axes[1].axvline(0, ls="--", color="gray")
axes[1].set_title("Hormone expression gradient")
axes[1].set_ylabel("mean expression")
axes[1].legend()
for ax in axes:
ax.set_xlabel("signed distance from islet edge (µm) (negative = inside)")
fig.tight_layout()
Both readouts fall off sharply right at the islet boundary (distance 0): islet cells and their hormones are essentially confined to the ROI, with a thin spillover into the first outward ring before the surrounding acinar tissue takes over.
6. Clustergram of the gradient gene feature space¶
The neighborhood-by-gene modality is a matrix of rings × genes, so we can cluster it as a
heatmap to see which genes share a gradient profile. We keep the 40 most variable genes across the
rings, transpose to genes × rings, and build a dega.clust.Matrix (row-z-scored) rendered as an
interactive Clustergram. Genes separate into an islet-enriched block (high inside, e.g. INS / GCG /
SST) and an acinar block (high outside), while the rings order into an inward vs. outward split.
gene_frame = modality_frame(nbhd_grad.mod["gene"])
top_variable_genes = gene_frame.var(axis=0).sort_values(ascending=False).head(40).index
gradient_gene_matrix = gene_frame[top_variable_genes].T # genes (rows) x rings (cols)
mat = dega.clust.Matrix(gradient_gene_matrix, row_entity="gene", col_entity="nbhd")
mat.norm(axis="row", by="zscore")
mat.clust()
cgm = dega.viz.Clustergram(matrix=mat, width=550, height=650)
<celldega.viz.widget.Clustergram object at 0x14682d5e0>
cgm
<celldega.viz.widget.Clustergram object at 0x14682d5e0>
7. Overlay the gradient rings on the Landscape¶
Finally we overlay the gradient NeighborhoodCollection itself. As above, passing nbhd=nbhd_grad
is all that's required — the widget converts the micron ring geometry to pixels using the DegaFiles
transform, each ring keeps its gradient color (reds inward, blues outward), and
landscape_state="nbhd" shows the rings in the initial view. Toggle the NBHD / CELL buttons
to switch between the rings and the cells.
landscape_grad = dega.viz.Landscape(
technology="Xenium",
base_url=base_url,
height=600,
adata=adata,
nbhd=nbhd_grad, # the gradient rings
landscape_state="nbhd", # show the rings on load
)
landscape_grad
<celldega.viz.widget.Landscape object at 0x145c90c50>