Neighborhood Module API Reference
Module for performing neighborhood analysis.
NeighborhoodCollection
Bases: CelldegaCollection
Neighborhood-level or spatial-region MuData collection.
Observations are neighborhoods or spatial regions. Feature modalities live in
mod and global observation relations live in relations/mdata.obsp.
Geometry is kept as a live GeoDataFrame in memory; durable geometry
storage can be layered on later with WKB columns or GeoParquet sidecars.
geometry
property
Neighborhood geometry. Alias of gdf (single source of truth).
__init__(obs=None, mod=None, mdata=None, gdf=None, nbhd_type=None, data_dir=None, source=None, name=None, meta=None, nbhd_col='name', geometry=None, relations=None, provenance=None, uns=None, memberships=None, transformation_matrix=None)
Build a neighborhood / spatial-region collection.
The observation axis (one row per neighborhood) is established from a
neighborhood GeoDataFrame (gdf — the usual path, produced by
alpha_shape / generate_hextile / etc.), from an explicit obs
table paired with geometry, or from a pre-built mdata. When built
from gdf, per-neighborhood area/area_um2 and centroid columns
are derived and the neighborhood-id column is normalized.
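As a rough illustration of what deriving area and centroid columns from a polygon involves, the sketch below applies the shoelace formula to a plain list of vertices. It is not Celldega's code: the constructor presumably relies on GeoPandas for this, and the helper name `polygon_area_and_centroid` and the rectangle are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_area_and_centroid(ring: List[Point]) -> Tuple[float, Point]:
    """Area and centroid of a simple polygon via the shoelace formula.

    `ring` lists the vertices in order, without repeating the first one.
    """
    twice_area = 0.0
    cx = cy = 0.0
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if twice_area == 0:
        raise ValueError("degenerate polygon with zero area")
    # Centroid uses the signed area, so vertex orientation cancels out.
    centroid = (cx / (3.0 * twice_area), cy / (3.0 * twice_area))
    return abs(twice_area) / 2.0, centroid


# Hypothetical neighborhood: a 20 x 10 micron rectangle.
area_um2, centroid = polygon_area_and_centroid([(0, 0), (20, 0), (20, 10), (0, 10)])
```

Here `area_um2` stands in for the per-neighborhood area column and `centroid` for the centroid columns described above.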
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| obs | DataFrame \| None | Pre-built neighborhood observation table (use with | None |
| mod | dict[str, AnnData] \| None | Feature-space modalities to attach up front. | None |
| mdata | MuData \| None | Pre-built | None |
| gdf | GeoDataFrame \| None | Neighborhood geometry; each row becomes an observation. Mutually exclusive with | None |
| nbhd_type | str \| None | Label for how the neighborhoods were made (e.g. | None |
| data_dir | str \| None | DegaFiles/instrument directory used as the default source for the transcript- and transform-loading methods. | None |
| source | str \| dict[str, Any] \| None | Source descriptor recorded in provenance. | None |
| name | str \| None | Optional collection name. | None |
| meta | dict[str, Any] \| None | Extra metadata merged into | None |
| nbhd_col | str | Column in | 'name' |
| geometry | GeoDataFrame \| None | Neighborhood geometry paired with an explicit | None |
| relations | dict[str, spmatrix] \| None | Square neighborhood-by-neighborhood matrices for | None |
| provenance | dict[str, Any] \| None | Free-form provenance metadata. | None |
| uns | dict[str, Any] \| None | Extra Celldega metadata. | None |
| memberships | dict[str, spmatrix] \| None | Membership matrices (e.g. cell-to-neighborhood); kept in memory only (not persisted by | None |
| transformation_matrix | Any \| None | Optional micron-to-pixel affine (see | None |
Raises:
| Type | Description |
|---|---|
| ValueError | If |
calc_gradient(obs_name, direction='both', bin_width=10, max_dist=50, nbhd_type='gradient', *, technology=None, scale_um_per_pixel=None, is_pixel_space=False, clip_boundary=None, clip_reference=None, clip_alpha=100, **kwargs)
Calculate a gradient collection around one neighborhood in this collection.
Picks the neighborhood identified by obs_name and grows fixed-width
bands outward from and/or inward into it (see
celldega.nbhd.gradient.calculate_gradient), returning a new
gradient NeighborhoodCollection — one neighborhood (observation) per
ring, ordered inner-most to outer-most. From there the usual
calc_nbhd_by_* methods summarize cell composition or expression per
ring, profiling how the tissue changes with distance from that
neighborhood's boundary. A gradient is therefore always anchored to a
concrete neighborhood (e.g. a per-cluster alpha shape) rather than a
loose geometry.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| obs_name | str | Identifier of the source neighborhood to anchor the gradient on, matched against the collection's observation index (and, as a fallback, the | required |
| direction | str | | 'both' |
| bin_width | float | Width of each ring in microns (default | 10 |
| max_dist | float | Maximum distance from the neighborhood boundary in microns (default | 50 |
| nbhd_type | str | Label recorded on the new collection (default | 'gradient' |
| technology | str \| None | Imaging platform used to look up | None |
| scale_um_per_pixel | float \| None | Microns per pixel; required (directly or via | None |
| is_pixel_space | bool | | False |
| clip_boundary | Any \| None | Optional precomputed tissue boundary to clip outward rings to (takes precedence over | None |
| clip_reference | Any \| None | Optional point cloud (cells) from which a tissue alpha shape is computed on the fly to clip outward rings. | None |
| clip_alpha | float | Inverse-alpha for the on-the-fly alpha shape (default | 100 |
| **kwargs | Any | Forwarded to the new | {} |
Returns:
| Type | Description |
|---|---|
| NeighborhoodCollection | A new |
| NeighborhoodCollection | rings around |
Raises:
| Type | Description |
|---|---|
| ValueError | If this collection has no geometry. |
| KeyError | If |
Examples:
>>> # nbhd holds one alpha-shape neighborhood per cell-type cluster
>>> grad_nbhd = nbhd.calc_gradient(obs_name="9", direction="both",
... bin_width=50, max_dist=200)
>>> grad_nbhd.calc_nbhd_by_gene(adata, by="cell")
>>> grad_nbhd.obs[["direction", "dist_start_um"]].head(3)
calc_nbhd_bordering(metric='border_ratio', key='bordering', category='leiden')
Calculate a neighborhood-by-neighborhood bordering relation.
Computes pairwise border relationships between neighborhoods and stores
the square matrix in relations[key] (mdata.obsp).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| metric | str | Border metric (e.g. | 'border_ratio' |
| key | str | Name for the relation in | 'bordering' |
| category | str | Neighborhood category recorded on the computed result. | 'leiden' |
Returns:
| Type | Description |
|---|---|
| spmatrix | The stored sparse relation matrix. |
Raises:
| Type | Description |
|---|---|
| ValueError | If geometry is not set. |
calc_nbhd_by_gene(adata=None, by='cell', modality_name=None, min_cells=1, data_dir=None, drop_missing=True)
Calculate a neighborhood-by-gene modality and attach it to self.mod.
Builds per-neighborhood gene expression — mean expression of contained
cells (by="cell") or transcript counts (by="cell-free").
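The by="cell" path can be pictured as a group-by-and-average over cells that have already been assigned to neighborhoods. The sketch below uses plain dictionaries in place of AnnData, and the helper name `mean_expression_by_nbhd`, the treatment of `min_cells` as an inclusive threshold, and the toy values are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict


def mean_expression_by_nbhd(
    cell_nbhd: Dict[str, str],
    cell_expr: Dict[str, Dict[str, float]],
    min_cells: int = 1,
) -> Dict[str, Dict[str, float]]:
    """Average per-gene expression over the cells contained in each neighborhood."""
    grouped = defaultdict(list)
    for cell, nbhd in cell_nbhd.items():
        grouped[nbhd].append(cell_expr[cell])
    result = {}
    for nbhd, rows in grouped.items():
        if len(rows) < min_cells:  # assumed: keep neighborhoods with >= min_cells cells
            continue
        genes = rows[0].keys()
        result[nbhd] = {g: sum(r[g] for r in rows) / len(rows) for g in genes}
    return result


# Hypothetical assignment of three cells to two neighborhoods.
cell_nbhd = {"c1": "A", "c2": "A", "c3": "B"}
cell_expr = {
    "c1": {"G1": 2.0, "G2": 0.0},
    "c2": {"G1": 4.0, "G2": 2.0},
    "c3": {"G1": 1.0, "G2": 1.0},
}
nbhd_by_gene = mean_expression_by_nbhd(cell_nbhd, cell_expr)
nbhd_by_gene_min2 = mean_expression_by_nbhd(cell_nbhd, cell_expr, min_cells=2)
```

The by="cell-free" path counts transcripts instead of averaging cells and is not covered by this sketch.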
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| adata | AnnData \| None | Cell-level | None |
| by | str | | 'cell' |
| modality_name | str \| None | Key for the modality; defaults to | None |
| min_cells | int | Minimum cells/transcripts for a neighborhood to be kept. | 1 |
| data_dir | str \| None | Transcript directory for | None |
| drop_missing | bool | When | True |
Returns:
| Type | Description |
|---|---|
| None | |
Raises:
| Type | Description |
|---|---|
| ValueError | If |
calc_nbhd_by_pop(adata, category='leiden', modality_name='population', output='proportion', min_cells=5, drop_missing=True)
Calculate a neighborhood-by-population modality and attach it to self.mod.
Spatially assigns cells to neighborhoods and, per neighborhood, counts
cells per category value to form a neighborhood (rows) by population
(columns) feature matrix.
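A minimal sketch of the counting step, assuming the spatial assignment of cells to neighborhoods has already happened. Plain dictionaries stand in for AnnData; the helper name `nbhd_by_pop`, the "count" value for `output`, and the inclusive reading of `min_cells` are assumptions rather than documented behavior.

```python
from collections import Counter, defaultdict
from typing import Dict


def nbhd_by_pop(
    cell_nbhd: Dict[str, str],
    cell_category: Dict[str, str],
    output: str = "proportion",
    min_cells: int = 5,
) -> Dict[str, Dict[str, float]]:
    """Neighborhood (rows) by population (columns) table from assigned cells."""
    counts = defaultdict(Counter)
    for cell, nbhd in cell_nbhd.items():
        counts[nbhd][cell_category[cell]] += 1
    populations = sorted({pop for ctr in counts.values() for pop in ctr})
    table = {}
    for nbhd, ctr in counts.items():
        total = sum(ctr.values())
        if total < min_cells:  # assumed: neighborhoods below the threshold are dropped
            continue
        if output == "proportion":
            table[nbhd] = {pop: ctr[pop] / total for pop in populations}
        else:  # hypothetical raw-count mode
            table[nbhd] = {pop: ctr[pop] for pop in populations}
    return table


# Hypothetical data: four cells in neighborhood "A", one in "B".
cell_nbhd = {"c1": "A", "c2": "A", "c3": "A", "c4": "A", "c5": "B"}
cell_category = {"c1": "T", "c2": "T", "c3": "B", "c4": "T", "c5": "B"}
pop_table = nbhd_by_pop(cell_nbhd, cell_category, min_cells=2)
```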
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| adata | AnnData | Cell-level | required |
| category | str | | 'leiden' |
| modality_name | str | Key for the modality in | 'population' |
| output | str | | 'proportion' |
| min_cells | int | Minimum cells for a neighborhood to be included. | 5 |
| drop_missing | bool | When | True |
Returns:
| Type | Description |
|---|---|
| None | |
calc_nbhd_overlap(metric='iou', key='overlap', category='leiden')
Calculate a neighborhood-by-neighborhood overlap relation.
Computes pairwise geometric overlap between neighborhoods and stores the
square matrix in relations[key] (mdata.obsp).
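To make the default 'iou' metric concrete, the sketch below computes intersection-over-union for every pair of neighborhoods and collects the values in a square matrix. Axis-aligned rectangles stand in for real neighborhood polygons and a nested list stands in for the sparse matrix; both simplifications, along with the helper names, are assumptions for illustration only.

```python
from typing import Dict, List, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def rect_area(r: Rect) -> float:
    return max(0.0, r[2] - r[0]) * max(0.0, r[3] - r[1])


def iou(a: Rect, b: Rect) -> float:
    """Intersection area divided by union area of two rectangles."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    inter = w * h if w > 0 and h > 0 else 0.0
    union = rect_area(a) + rect_area(b) - inter
    return inter / union if union > 0 else 0.0


def overlap_matrix(rects: Dict[str, Rect]) -> Tuple[List[str], List[List[float]]]:
    """Square neighborhood-by-neighborhood IoU matrix, rows and columns in sorted name order."""
    names = sorted(rects)
    matrix = [[iou(rects[i], rects[j]) for j in names] for i in names]
    return names, matrix


# Hypothetical neighborhoods: A and B overlap, C is far away.
names, matrix = overlap_matrix(
    {"A": (0, 0, 2, 2), "B": (1, 0, 3, 2), "C": (10, 10, 11, 11)}
)
```

The matrix is symmetric with ones on the diagonal, which matches the square shape expected of an entry in relations.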
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| metric | str | Overlap metric: | 'iou' |
| key | str | Name for the relation in | 'overlap' |
| category | str | Neighborhood category recorded on the computed result. | 'leiden' |
Returns:
| Type | Description |
|---|---|
| spmatrix | The stored sparse relation matrix. |
Raises:
| Type | Description |
|---|---|
| ValueError | If geometry is not set. |
calc_nbhd_transcript_assignment(data_dir=None)
Add per-neighborhood transcript-assignment columns to obs.
From transcripts.parquet in data_dir, adds three obs columns
(on the underlying MuData) for each neighborhood:
- total_transcripts: transcripts falling inside the neighborhood.
- unassigned_transcripts: those with cell_id == "UNASSIGNED".
- transcript_assignment_proportion: assigned / total (0.0 when the neighborhood has no transcripts).
Assumption: the transcript-to-cell assignment is not computed here —
it must already be present in the instrument data, with unassigned
transcripts marked by the "UNASSIGNED" sentinel (Xenium convention).
Only transcripts are needed — no adata or cell polygons.
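The three columns reduce to simple counting once each transcript is known to fall inside a neighborhood. The sketch below assumes that spatial lookup has already been done and works on plain (neighborhood, cell_id) pairs; the helper name `transcript_assignment` and the toy records are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

UNASSIGNED = "UNASSIGNED"  # sentinel described above (Xenium convention)


def transcript_assignment(
    nbhd_names: List[str],
    transcripts: Iterable[Tuple[str, str]],
) -> Dict[str, Dict[str, float]]:
    """Per-neighborhood totals, unassigned counts, and assigned proportion."""
    total = Counter()
    unassigned = Counter()
    for nbhd, cell_id in transcripts:
        total[nbhd] += 1
        if cell_id == UNASSIGNED:
            unassigned[nbhd] += 1
    summary = {}
    for nbhd in nbhd_names:
        n_total = total[nbhd]
        n_unassigned = unassigned[nbhd]
        # 0.0 when the neighborhood contains no transcripts at all.
        proportion = (n_total - n_unassigned) / n_total if n_total else 0.0
        summary[nbhd] = {
            "total_transcripts": n_total,
            "unassigned_transcripts": n_unassigned,
            "transcript_assignment_proportion": proportion,
        }
    return summary


# Hypothetical records: four transcripts in "A" (one unassigned), none in "B".
records = [("A", "cell_1"), ("A", "cell_2"), ("A", UNASSIGNED), ("A", "cell_1")]
summary = transcript_assignment(["A", "B"], records)
```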
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_dir | str \| None | Directory containing | None |
Returns:
| Type | Description |
|---|---|
| None | |
Raises:
| Type | Description |
|---|---|
| ValueError | If geometry or a usable |
from_gdf(gdf, nbhd_type='neighborhood', **kwargs)
classmethod
Create a NeighborhoodCollection from a neighborhood GeoDataFrame.
Convenience wrapper for NeighborhoodCollection(gdf=gdf,
nbhd_type=nbhd_type, **kwargs).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| gdf | GeoDataFrame | Neighborhood geometry; each row becomes an observation. | required |
| nbhd_type | str | Label for how the neighborhoods were made. | 'neighborhood' |
| **kwargs | Any | Forwarded to the constructor. | {} |
Returns:
| Type | Description |
|---|---|
| NeighborhoodCollection | A new |
load_transformation_matrix(data_dir=None)
Load the micron-to-pixel transformation matrix from DegaFiles.
Reads micron_to_image_transform.csv and stores it via
set_transformation_matrix. Later this matrix can instead be
supplied directly (e.g. from SpatialData).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_dir | str \| None | Directory containing the transform CSV; defaults to | None |
Returns:
| Type | Description |
|---|---|
| ndarray | The loaded matrix as a float |
Raises:
| Type | Description |
|---|---|
| ValueError | If no |
set_transformation_matrix(matrix)
Set the micron-to-pixel affine transformation matrix.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| matrix | Any | Affine mapping micron coordinates (the geometry's native space) to image/pixel space, as a | required |
Returns:
| Type | Description |
|---|---|
| ndarray | The stored matrix as a float |
| ndarray | |
to_pixel_gdf()
Return the neighborhood geometry ready for pixel-space visualization.
Adds a geometry_pixel column (micron geometry transformed to image
space via the stored transformation matrix) and leaves the original
micron geometry intact. The result can be passed straight to
Landscape(nbhd=...), which renders geometry_pixel directly when
present rather than applying its own transform.
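The micron-to-pixel step amounts to applying an affine matrix to every vertex. The sketch below assumes a 3x3 homogeneous matrix with the translation in the last column; that layout, the helper name `to_pixel`, and the example matrix are assumptions, since the stored matrix's exact shape is not stated here.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def to_pixel(points: List[Point], matrix: Sequence[Sequence[float]]) -> List[Point]:
    """Apply an assumed 3x3 homogeneous affine to micron coordinates."""
    out = []
    for x, y in points:
        px = matrix[0][0] * x + matrix[0][1] * y + matrix[0][2]
        py = matrix[1][0] * x + matrix[1][1] * y + matrix[1][2]
        out.append((px, py))
    return out


# Hypothetical transform: 2 pixels per micron plus a (10, 20) pixel offset.
micron_to_pixel = [
    [2.0, 0.0, 10.0],
    [0.0, 2.0, 20.0],
    [0.0, 0.0, 1.0],
]
geometry_micron = [(0.0, 0.0), (5.0, 5.0)]
geometry_pixel = to_pixel(geometry_micron, micron_to_pixel)
```

As in to_pixel_gdf, the micron coordinates are left untouched and the pixel-space copy is produced alongside them.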
Returns:
| Type | Description |
|---|---|
| GeoDataFrame | A copy of |
Raises:
| Type | Description |
|---|---|
| ValueError | If geometry or the transformation matrix is not set. |
alpha_shape_cell_clusters(adata, cat='cluster', alphas=(100, 150, 200, 250, 300, 350), meta_cluster=None)
Compute alpha shapes for each cluster in the cell metadata.
Parameters
adata : AnnData
AnnData object with cell metadata in obs and spatial coordinates in obsm["spatial"].
cat : str
Column name in adata.obs containing cluster/category labels.
alphas : Sequence[float]
List of inverse alpha values to compute shapes for.
meta_cluster : pd.DataFrame | None
Optional DataFrame with cluster metadata including 'color' column. If not provided, colors will be extracted from adata.uns[f'{cat}_colors'] if available, otherwise defaults to black.
Returns
gpd.GeoDataFrame
GeoDataFrame with alpha shapes for each cluster at each alpha value.
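The color fallback described for meta_cluster can be sketched as a three-step lookup. This is an illustration under assumptions: the helper `resolve_cluster_colors` is hypothetical, the palette list is assumed to be aligned with the category order (the usual scanpy convention), black is written as "#000000", and categories missing from the metadata are assumed to fall back to black.

```python
from typing import Dict, List, Optional


def resolve_cluster_colors(
    categories: List[str],
    meta_colors: Optional[Dict[str, str]] = None,
    uns_colors: Optional[List[str]] = None,
    default: str = "#000000",
) -> Dict[str, str]:
    """Pick a color per cluster: metadata first, then the stored palette, then black."""
    if meta_colors is not None:
        # Explicit cluster metadata wins when it is provided.
        return {cat: meta_colors.get(cat, default) for cat in categories}
    if uns_colors is not None:
        # Palette entries are assumed to follow the category order.
        return {
            cat: uns_colors[i] if i < len(uns_colors) else default
            for i, cat in enumerate(categories)
        }
    return {cat: default for cat in categories}


from_meta = resolve_cluster_colors(["0", "1"], meta_colors={"0": "#123456"})
from_uns = resolve_cluster_colors(["0", "1"], uns_colors=["#ff0000", "#00ff00"])
fallback = resolve_cluster_colors(["0"])
```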
calculate_gradient(source, direction='both', bin_width=10, max_dist=50, *, technology=None, scale_um_per_pixel=None, is_pixel_space=False, clip_boundary=None, clip_reference=None, clip_alpha=100, add_colors=True, nbhd_col='name')
Generate concentric gradient rings outward from and/or inward into an ROI.
Starting from the boundary of the merged source geometry, this builds
fixed-width bands every bin_width microns out to max_dist:
- Outward bands grow away from the ROI (positive distance). Use clip_boundary/clip_reference to stop them from running off the tissue.
- Inward bands erode into the ROI (negative distance) and stop automatically once the geometry erodes to nothing.
Each band is one row of the returned GeoDataFrame, ordered inner-most to
outer-most, with a signed distance so you can spatially join cells to bands
(gpd.sjoin) and correlate composition or expression against distance from
the ROI edge.
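The band layout implied by bin_width and max_dist can be sketched without any geometry as a list of signed distance intervals. The helper `gradient_bands` is hypothetical, "inward" is assumed to be a valid direction value alongside "both" and "outward", and the handling of a max_dist that is not a multiple of bin_width (a shorter last band) is a guess. The real function also stops inward bands early once the geometry erodes away, which this sketch ignores.

```python
import math
from typing import List, Tuple

Band = Tuple[str, float, float]  # (direction, dist_start_um, dist_end_um)


def gradient_bands(direction: str = "both", bin_width: float = 10, max_dist: float = 50) -> List[Band]:
    """Signed distance intervals, ordered inner-most to outer-most."""
    if direction not in ("both", "outward", "inward"):
        raise ValueError(f"unknown direction: {direction!r}")
    if bin_width <= 0 or max_dist <= 0:
        raise ValueError("bin_width and max_dist must be positive")
    n_bins = math.ceil(max_dist / bin_width)
    edges = [min(i * bin_width, max_dist) for i in range(n_bins + 1)]
    outward = [("outward", edges[i], edges[i + 1]) for i in range(n_bins)]
    inward = [("inward", -edges[i + 1], -edges[i]) for i in range(n_bins)]
    bands: List[Band] = []
    if direction in ("both", "inward"):
        bands.extend(reversed(inward))  # deepest inward band comes first
    if direction in ("both", "outward"):
        bands.extend(outward)
    return bands


bands = gradient_bands(direction="both", bin_width=10, max_dist=50)
```

With the defaults this yields ten intervals, five negative (inside the ROI) followed by five positive (outside it).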
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| source | GeoDataFrame \| GeoSeries \| BaseGeometry \| NeighborhoodCollection | The ROI. May be a | required |
| direction | str | | 'both' |
| bin_width | float | Width of each ring in microns (default | 10 |
| max_dist | float | Maximum distance from the ROI boundary in microns (default | 50 |
| technology | str \| None | Imaging platform (e.g. | None |
| scale_um_per_pixel | float \| None | Microns per pixel. Required (directly or via | None |
| is_pixel_space | bool | | False |
| clip_boundary | Any \| None | Optional precomputed tissue boundary ( | None |
| clip_reference | Any \| None | Optional point cloud ( | None |
| clip_alpha | float | Inverse-alpha value for the on-the-fly alpha shape (default | 100 |
| add_colors | bool | If | True |
| nbhd_col | str | Name of the band-identifier column in the output (default | 'name' |
Returns:
| Type | Description |
|---|---|
| GeoDataFrame | A |
| GeoDataFrame | outer-most, and columns: |
| GeoDataFrame | |
| GeoDataFrame | |
| GeoDataFrame | |
| GeoDataFrame | |
| GeoDataFrame | |
| GeoDataFrame | |
Raises:
| Type | Description |
|---|---|
| ValueError | If |
Examples:
Inward and outward micron-space rings straight into a collection-ready
GeoDataFrame:
>>> import celldega as dega
>>> gdf_rings = dega.nbhd.calculate_gradient(
... gdf_tumor, direction="both", bin_width=10, max_dist=50
... )
>>> gdf_rings[["name", "direction", "dist_start_um"]].head(3)
Outward-only rings from pixel-space geometry, clipped to a tissue alpha shape computed on the fly from cell centroids so the rings cannot run off the tissue:
>>> gdf_rings = dega.nbhd.calculate_gradient(
... gdf_tumor,
... direction="outward",
... technology="Xenium",
... is_pixel_space=True,
... clip_reference=gdf_cells,
... clip_alpha=100,
... )
Tag cells with the band they fall in for downstream gradient analysis:
>>> import geopandas as gpd
>>> joined = gpd.sjoin(
... gdf_cells, gdf_rings[["ring_range_um", "geometry"]],
... how="left", predicate="within",
... )
filter_alpha_shapes(gdf_alpha, alpha, min_area=0, clean_names=True)
Filter alpha shapes by a specific alpha value and optionally clean up names.
Alpha shapes computed by alpha_shape_cell_clusters have names in the format
{category}_{alpha} (e.g., "cluster_0_150"). This function filters to a specific
alpha value and removes the trailing _{alpha} suffix from names.
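The filtering and name-cleaning rules can be sketched on plain dictionaries instead of a GeoDataFrame. The helper `filter_alpha_rows` and the toy rows are hypothetical, and the suffix is built with `str(alpha)`, so an integer alpha is assumed to match how the names were written.

```python
from typing import Dict, List


def filter_alpha_rows(
    rows: List[Dict],
    alpha: float,
    min_area: float = 0,
    clean_names: bool = True,
) -> List[Dict]:
    """Keep rows for one inverse-alpha value and strip the trailing _{alpha} suffix."""
    suffix = f"_{alpha}"
    kept = []
    for row in rows:
        # Shapes with area <= min_area are excluded, as stated for min_area.
        if row["inv_alpha"] != alpha or row["area"] <= min_area:
            continue
        row = dict(row)  # copy so the input rows are not modified
        if clean_names and row["name"].endswith(suffix):
            row["name"] = row["name"][: -len(suffix)]
        kept.append(row)
    return kept


# Hypothetical alpha-shape rows.
rows = [
    {"name": "cluster_0_150", "inv_alpha": 150, "area": 500.0},
    {"name": "cluster_0_200", "inv_alpha": 200, "area": 800.0},
    {"name": "cluster_1_150", "inv_alpha": 150, "area": 0.0},
]
filtered = filter_alpha_rows(rows, alpha=150)
unclean = filter_alpha_rows(rows, alpha=150, clean_names=False)
```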
Parameters
gdf_alpha : gpd.GeoDataFrame
GeoDataFrame of alpha shapes with 'inv_alpha', 'area', 'name', and 'cat' columns.
Typically the output of alpha_shape_cell_clusters.
alpha : float
The inverse alpha value to filter for (must match values in 'inv_alpha' column).
min_area : float, default 0
Minimum area threshold. Shapes with area <= min_area are excluded.
clean_names : bool, default True
If True, removes the trailing _{alpha} suffix from the 'name' column,
leaving just the category name (e.g., "cluster_0" instead of "cluster_0_150").
Returns
gpd.GeoDataFrame
Filtered GeoDataFrame with optionally cleaned names.
Examples
>>> gdf_alpha = dega.nbhd.alpha_shape_cell_clusters(adata, cat="leiden")
>>> gdf_filtered = dega.nbhd.filter_alpha_shapes(gdf_alpha, alpha=150)
>>> # Names are now just category names without the alpha suffix
>>> print(gdf_filtered["name"].tolist()[:3])
['0', '1', '2']
generate_hextile(adata, diameter=100)
Generate a hexagonal grid over the bounding box of cell spatial coordinates.
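One way to lay hexagons over a bounding box is an offset-row grid, sketched below in plain Python. Several choices here are assumptions not confirmed by this reference: pointy-top orientation, diameter measured corner to corner, and the helper name `hex_grid`. The library's actual layout and hexagon count may differ.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def hex_grid(xs: Sequence[float], ys: Sequence[float], diameter: float = 100.0) -> List[Tuple[str, List[Point]]]:
    """Pointy-top hexagons covering the bounding box of the given coordinates."""
    r = diameter / 2.0          # assumed: diameter is the corner-to-corner distance
    dx = math.sqrt(3) * r       # horizontal spacing between centres in a row
    dy = 1.5 * r                # vertical spacing between rows
    xmin, xmax = min(xs), max(xs)
    ymin, ymax = min(ys), max(ys)
    n_cols = math.ceil((xmax - xmin) / dx) + 1
    n_rows = math.ceil((ymax - ymin) / dy) + 1
    hexes = []
    for row in range(n_rows):
        offset = dx / 2.0 if row % 2 else 0.0  # shift odd rows by half a cell
        for col in range(n_cols):
            cx = xmin + col * dx + offset
            cy = ymin + row * dy
            corners = [
                (
                    cx + r * math.cos(math.radians(60 * k + 30)),
                    cy + r * math.sin(math.radians(60 * k + 30)),
                )
                for k in range(6)
            ]
            hexes.append((f"hex_{len(hexes)}", corners))
    return hexes


# Hypothetical cell coordinates spanning a 300 x 200 micron box.
hexes = hex_grid(xs=[0.0, 300.0], ys=[0.0, 200.0], diameter=100.0)
```

Each entry pairs a name of the form hex_N with six corner points, mirroring the "name" and "geometry" columns of the returned GeoDataFrame.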
Parameters
adata : AnnData
AnnData object with spatial coordinates in obsm["spatial"].
diameter : float, default 100
Diameter of each hexagon in the same units as the spatial coordinates
(typically microns).
Returns
gpd.GeoDataFrame
GeoDataFrame with hexagon geometries covering the spatial extent.
Columns: "name" (hex_0, hex_1, ...), "geometry" (Polygon).
Examples
>>> gdf_hex = dega.nbhd.generate_hextile(adata, diameter=100)
>>> gdf_hex.shape
(1234, 2)
hextile_niche(gdf_hex, adata_hex, category='leiden', dissolve=True)
Create niche polygons from hextiles based on clustering results.
Takes hexagon geometries and assigns them to niches based on clustering (e.g., Leiden clustering of hexagon population distributions). Optionally dissolves adjacent hexagons of the same niche into unified polygons.
Parameters
gdf_hex : gpd.GeoDataFrame
GeoDataFrame of hexagon geometries. Must be indexed by hexagon name
(matching adata_hex.obs.index).
adata_hex : AnnData
AnnData object containing clustering results in obs[category].
The index must match the hexagon names in gdf_hex.
Colors can be provided in uns[f"{category}_colors"].
category : str, default "leiden"
Column name in adata_hex.obs containing the niche/cluster assignment.
dissolve : bool, default True
If True, dissolve adjacent hexagons of the same niche into unified
MultiPolygon geometries. If False, return individual hexagons with
their niche assignment.
Returns
gpd.GeoDataFrame
If dissolve=True: GeoDataFrame with dissolved niche polygons. Columns: "name", "cat", "geometry", "color", "area".
If dissolve=False: GeoDataFrame with individual hexagons and niche assignment. Columns: "name", "cat", "geometry", "color".
Examples
>>> # Generate hexagons and compute population distribution
>>> gdf_hex = dega.nbhd.generate_hextile(adata, diameter=100)
>>> adata_hex = dega.nbhd.calc_nbhd_by_pop(adata, gdf_hex, category="leiden")
>>> # Cluster hexagons by population similarity (e.g., using scanpy)
>>> import scanpy as sc
>>> sc.pp.pca(adata_hex)
>>> sc.pp.neighbors(adata_hex)
>>> sc.tl.leiden(adata_hex)
>>> # Create dissolved niche polygons
>>> gdf_niche = dega.nbhd.hextile_niche(gdf_hex, adata_hex, category="leiden")
>>> # Or keep individual hexagons with niche assignment
>>> gdf_hex_niche = dega.nbhd.hextile_niche(gdf_hex, adata_hex, dissolve=False)