Neighborhood Module API Reference

Module for performing neighborhood analysis.

NeighborhoodCollection

Bases: CelldegaCollection

Neighborhood-level or spatial-region MuData collection.

Observations are neighborhoods or spatial regions. Feature modalities live in mod and global observation relations live in relations/mdata.obsp. Geometry is kept as a live GeoDataFrame in memory; durable geometry storage can be layered on later with WKB columns or GeoParquet sidecars.

geometry property

Neighborhood geometry. Alias of gdf (single source of truth).

__init__(obs=None, mod=None, mdata=None, gdf=None, nbhd_type=None, data_dir=None, source=None, name=None, meta=None, nbhd_col='name', geometry=None, relations=None, provenance=None, uns=None, memberships=None, transformation_matrix=None)

Build a neighborhood / spatial-region collection.

The observation axis (one row per neighborhood) is established from a neighborhood GeoDataFrame (gdf — the usual path, produced by alpha_shape / generate_hextile / etc.), from an explicit obs table paired with geometry, or from a pre-built mdata. When built from gdf, per-neighborhood area/area_um2 and centroid columns are derived and the neighborhood-id column is normalized.

Parameters:

Name Type Description Default
obs DataFrame | None

Pre-built neighborhood observation table (use with geometry, not with gdf).

None
mod dict[str, AnnData] | None

Feature-space modalities to attach up front.

None
mdata MuData | None

Pre-built MuData to wrap (e.g. from read).

None
gdf GeoDataFrame | None

Neighborhood geometry; each row becomes an observation. Mutually exclusive with obs/geometry.

None
nbhd_type str | None

Label for how the neighborhoods were made (e.g. "hextile", "alpha_shape"); defaults to "neighborhood".

None
data_dir str | None

DegaFiles/instrument directory used as the default source for the transcript- and transform-loading methods.

None
source str | dict[str, Any] | None

Source descriptor recorded in provenance.

None
name str | None

Optional collection name.

None
meta dict[str, Any] | None

Extra metadata merged into uns["celldega"].

None
nbhd_col str

Column in gdf identifying each neighborhood (falls back to neighborhood_id / nbhd_id).

'name'
geometry GeoDataFrame | None

Neighborhood geometry paired with an explicit obs (alternative to gdf).

None
relations dict[str, spmatrix] | None

Square neighborhood-by-neighborhood matrices for mdata.obsp.

None
provenance dict[str, Any] | None

Free-form provenance metadata.

None
uns dict[str, Any] | None

Extra Celldega metadata.

None
memberships dict[str, spmatrix] | None

Membership matrices (e.g. cell-to-neighborhood); kept in memory only (not persisted by write).

None
transformation_matrix Any | None

Optional micron-to-pixel affine (see set_transformation_matrix).

None

Raises:

Type Description
ValueError

If gdf is combined with obs or geometry.

calc_gradient(obs_name, direction='both', bin_width=10, max_dist=50, nbhd_type='gradient', *, technology=None, scale_um_per_pixel=None, is_pixel_space=False, clip_boundary=None, clip_reference=None, clip_alpha=100, **kwargs)

Calculate a gradient collection around one neighborhood in this collection.

Picks the neighborhood identified by obs_name and grows fixed-width bands outward from and/or inward into it (see celldega.nbhd.gradient.calculate_gradient), returning a new gradient NeighborhoodCollection with one neighborhood (observation) per ring, ordered inner-most to outer-most. From there the usual calc_nbhd_by_* methods summarize cell composition or expression per ring, profiling how the tissue changes with distance from that neighborhood's boundary. A gradient is therefore always anchored to a concrete neighborhood (e.g. a per-cluster alpha shape) rather than a loose geometry.

Parameters:

Name Type Description Default
obs_name str

Identifier of the source neighborhood to anchor the gradient on — matched against the collection's observation index (and, as a fallback, the name column). Its geometry is the ROI.

required
direction str

"outward", "inward", or "both" (default).

'both'
bin_width float

Width of each ring in microns (default 10).

10
max_dist float

Maximum distance from the neighborhood boundary in microns (default 50).

50
nbhd_type str

Label recorded on the new collection (default "gradient").

'gradient'
technology str | None

Imaging platform used to look up scale_um_per_pixel for pixel-space geometry (e.g. "Xenium").

None
scale_um_per_pixel float | None

Microns per pixel; required (directly or via technology) when is_pixel_space=True.

None
is_pixel_space bool

True if this collection's geometry is in pixel units; False (default) if already in microns.

False
clip_boundary Any | None

Optional precomputed tissue boundary to clip outward rings to (takes precedence over clip_reference).

None
clip_reference Any | None

Optional point cloud (cells) from which a tissue alpha shape is computed on the fly to clip outward rings.

None
clip_alpha float

Inverse-alpha for the on-the-fly alpha shape (default 100).

100
**kwargs Any

Forwarded to the new NeighborhoodCollection (e.g. name, data_dir).

{}

Returns:

Type Description
NeighborhoodCollection

A new NeighborhoodCollection whose observations are the gradient rings around obs_name.

Raises:

Type Description
ValueError

If this collection has no geometry.

KeyError

If obs_name is not found.

Examples:

>>> # nbhd holds one alpha-shape neighborhood per cell-type cluster
>>> grad_nbhd = nbhd.calc_gradient(obs_name="9", direction="both",
...                                bin_width=50, max_dist=200)
>>> grad_nbhd.calc_nbhd_by_gene(adata, by="cell")
>>> grad_nbhd.obs[["direction", "dist_start_um"]].head(3)

calc_nbhd_bordering(metric='border_ratio', key='bordering', category='leiden')

Calculate a neighborhood-by-neighborhood bordering relation.

Computes pairwise border relationships between neighborhoods and stores the square matrix in relations[key] (mdata.obsp).

Parameters:

Name Type Description Default
metric str

Border metric (e.g. "border_ratio", "binary").

'border_ratio'
key str

Name for the relation in relations.

'bordering'
category str

Neighborhood category recorded on the computed result.

'leiden'

Returns:

Type Description
spmatrix

The stored sparse relation matrix.

Raises:

Type Description
ValueError

If geometry is not set.

calc_nbhd_by_gene(adata=None, by='cell', modality_name=None, min_cells=1, data_dir=None, drop_missing=True)

Calculate a neighborhood-by-gene modality and attach it to self.mod.

Builds per-neighborhood gene expression — mean expression of contained cells (by="cell") or transcript counts (by="cell-free").

Parameters:

Name Type Description Default
adata AnnData | None

Cell-level AnnData (required when by="cell"); needs spatial coordinates in obsm["spatial"].

None
by str

"cell" for cell-derived mean expression or "cell-free" for transcript counts.

'cell'
modality_name str | None

Key for the modality; defaults to "gene" (cell-derived) or "gene_cell_free" (transcript-derived).

None
min_cells int

Minimum cells/transcripts for a neighborhood to be kept.

1
data_dir str | None

Transcript directory for by="cell-free"; defaults to self.data_dir.

None
drop_missing bool

When True (default), neighborhoods with fewer than min_cells cells (or transcripts) are removed from the collection entirely. When False, the collection keeps all neighborhoods and the modality is attached with zero-filled rows for those that fall below min_cells.

True

Returns:

Type Description
None

None — the modality is attached to self.mod.

Raises:

Type Description
ValueError

If adata is missing for by="cell", or data_dir is missing for by="cell-free".
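
The by="cell" path described above reduces per-cell expression to one mean vector per neighborhood. The following is a minimal pure-Python sketch of that averaging step only, assuming the cell-to-neighborhood assignment is already known. The helper name mean_expression_sketch and its inputs are hypothetical and are not part of Celldega.

```python
def mean_expression_sketch(cell_nbhd, cell_expr, n_genes):
    """Hypothetical helper: average per-cell expression within each neighborhood.

    cell_nbhd: neighborhood id for each cell (same order as cell_expr).
    cell_expr: per-cell expression vectors, each of length n_genes.
    """
    sums, n_cells = {}, {}
    for nbhd, expr in zip(cell_nbhd, cell_expr):
        # Accumulate a running per-gene sum for this neighborhood.
        acc = sums.setdefault(nbhd, [0.0] * n_genes)
        for gene_idx, value in enumerate(expr):
            acc[gene_idx] += value
        n_cells[nbhd] = n_cells.get(nbhd, 0) + 1
    # Divide each per-gene sum by the number of contributing cells.
    return {nbhd: [total / n_cells[nbhd] for total in acc]
            for nbhd, acc in sums.items()}


# Three cells, two genes: two cells fall in "n0", one in "n1".
means = mean_expression_sketch(
    ["n0", "n0", "n1"],
    [[1.0, 0.0], [3.0, 2.0], [5.0, 5.0]],
    n_genes=2,
)
```

In the real method the result is attached as a modality rather than returned, and min_cells / drop_missing decide which neighborhoods survive.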

calc_nbhd_by_pop(adata, category='leiden', modality_name='population', output='proportion', min_cells=5, drop_missing=True)

Calculate a neighborhood-by-population modality and attach it to self.mod.

Spatially assigns cells to neighborhoods and, per neighborhood, counts cells per category value to form a neighborhood (rows) by population (columns) feature matrix.

Parameters:

Name Type Description Default
adata AnnData

Cell-level AnnData with spatial coordinates in obsm["spatial"] and category in obs.

required
category str

obs column naming the population/cell-type/cluster.

'leiden'
modality_name str

Key for the modality in self.mod.

'population'
output str

"proportion" (within-neighborhood fractions) or "counts".

'proportion'
min_cells int

Minimum cells for a neighborhood to be included.

5
drop_missing bool

When True (default), neighborhoods with fewer than min_cells cells are removed from the collection entirely so the observation axis only contains neighborhoods with data. When False, the collection keeps all neighborhoods and the modality is attached with zero-filled rows for those that fall below min_cells.

True

Returns:

Type Description
None

None — the modality is attached to self.mod[modality_name].
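
To make the interaction of output, min_cells and drop_missing concrete, here is a small pure-Python sketch of the counting step. It assumes cells have already been assigned to neighborhoods (the spatial assignment is skipped), and the helper name nbhd_by_pop_sketch is hypothetical. It is an illustration of the documented behavior, not Celldega's implementation.

```python
from collections import Counter


def nbhd_by_pop_sketch(assignments, nbhd_ids, output="proportion",
                       min_cells=5, drop_missing=True):
    """Hypothetical helper: count cells per population in each neighborhood.

    assignments: list of (neighborhood_id, population_label), one per cell.
    nbhd_ids: every neighborhood id in the collection, in order.
    """
    counts = {nbhd: Counter() for nbhd in nbhd_ids}
    for nbhd, label in assignments:
        counts[nbhd][label] += 1

    populations = sorted({label for _, label in assignments})
    table = {}
    for nbhd in nbhd_ids:
        total = sum(counts[nbhd].values())
        if total < min_cells:
            if drop_missing:
                continue  # neighborhood is removed entirely
            # Keep the neighborhood with a zero-filled row.
            table[nbhd] = {pop: 0.0 for pop in populations}
            continue
        if output == "proportion":
            table[nbhd] = {pop: counts[nbhd][pop] / total for pop in populations}
        else:  # "counts"
            table[nbhd] = {pop: float(counts[nbhd][pop]) for pop in populations}
    return table


cells = [("hex_0", "T"), ("hex_0", "T"), ("hex_0", "B"), ("hex_1", "B")]
props = nbhd_by_pop_sketch(cells, ["hex_0", "hex_1"], min_cells=2)
kept_all = nbhd_by_pop_sketch(cells, ["hex_0", "hex_1"], min_cells=2,
                              drop_missing=False)
```

With min_cells=2, hex_1 (one cell) is dropped from props but kept as an all-zero row in kept_all.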

calc_nbhd_overlap(metric='iou', key='overlap', category='leiden')

Calculate a neighborhood-by-neighborhood overlap relation.

Computes pairwise geometric overlap between neighborhoods and stores the square matrix in relations[key] (mdata.obsp).

Parameters:

Name Type Description Default
metric str

Overlap metric — "iou" (intersection over union), "ioa" (intersection over the row neighborhood's area), or "intersection" (raw intersection area).

'iou'
key str

Name for the relation in relations.

'overlap'
category str

Neighborhood category recorded on the computed result.

'leiden'

Returns:

Type Description
spmatrix

The stored sparse relation matrix.

Raises:

Type Description
ValueError

If geometry is not set.
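
The three metric options differ only in how the intersection area is normalized. The sketch below illustrates them on axis-aligned rectangles, which stand in for real neighborhood polygons so that the example needs no geometry library. All function names are hypothetical, and the sketch assumes both shapes have positive area.

```python
def rect_area(rect):
    """Area of an axis-aligned rectangle (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = rect
    return max(0.0, xmax - xmin) * max(0.0, ymax - ymin)


def rect_intersection_area(a, b):
    """Intersection area of two axis-aligned rectangles."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, width) * max(0.0, height)


def overlap_sketch(row, col, metric="iou"):
    """Hypothetical pairwise overlap for one (row, column) pair."""
    inter = rect_intersection_area(row, col)
    if metric == "intersection":
        return inter
    if metric == "ioa":
        # Normalized by the row shape only, so the matrix is asymmetric.
        return inter / rect_area(row)
    if metric == "iou":
        union = rect_area(row) + rect_area(col) - inter
        return inter / union
    raise ValueError(f"unknown metric: {metric!r}")


a = (0.0, 0.0, 2.0, 2.0)  # area 4
b = (1.0, 0.0, 4.0, 2.0)  # area 6, shares a 1 x 2 strip with a
iou_ab = overlap_sketch(a, b, "iou")
ioa_ab = overlap_sketch(a, b, "ioa")
ioa_ba = overlap_sketch(b, a, "ioa")
```

Note that "iou" and "intersection" are symmetric in the two neighborhoods, while "ioa" depends on which one is the row.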

calc_nbhd_transcript_assignment(data_dir=None)

Add per-neighborhood transcript-assignment columns to obs.

From transcripts.parquet in data_dir, adds three obs columns (on the underlying MuData) for each neighborhood:

  • total_transcripts — transcripts falling inside the neighborhood.
  • unassigned_transcripts — those with cell_id == "UNASSIGNED".
  • transcript_assignment_proportion — assigned / total (0.0 when the neighborhood has no transcripts).

Assumption: the transcript-to-cell assignment is not computed here — it must already be present in the instrument data, with unassigned transcripts marked by the "UNASSIGNED" sentinel (Xenium convention). Only transcripts are needed — no adata or cell polygons.

Parameters:

Name Type Description Default
data_dir str | None

Directory containing transcripts.parquet; defaults to self.data_dir.

None

Returns:

Type Description
None

None — the three columns are added to self.obs.

Raises:

Type Description
ValueError

If geometry or a usable data_dir is missing, or the transcripts lack a cell_id column. A complete absence of the "UNASSIGNED" sentinel only warns.
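
The three columns follow directly from the cell_id values of the transcripts inside one neighborhood. A minimal sketch of that arithmetic, assuming the transcripts have already been restricted to a single neighborhood (the helper name transcript_assignment_sketch is hypothetical):

```python
def transcript_assignment_sketch(cell_ids, sentinel="UNASSIGNED"):
    """Hypothetical helper: summarize assignment for one neighborhood.

    cell_ids: the cell_id value of every transcript inside the neighborhood.
    """
    total = len(cell_ids)
    unassigned = sum(1 for cell_id in cell_ids if cell_id == sentinel)
    # assigned / total, with 0.0 for a neighborhood that has no transcripts.
    proportion = (total - unassigned) / total if total else 0.0
    return {
        "total_transcripts": total,
        "unassigned_transcripts": unassigned,
        "transcript_assignment_proportion": proportion,
    }


stats = transcript_assignment_sketch(["c1", "UNASSIGNED", "c2", "c1"])
empty = transcript_assignment_sketch([])
```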

from_gdf(gdf, nbhd_type='neighborhood', **kwargs) classmethod

Create a NeighborhoodCollection from a neighborhood GeoDataFrame.

Convenience wrapper for NeighborhoodCollection(gdf=gdf, nbhd_type=nbhd_type, **kwargs).

Parameters:

Name Type Description Default
gdf GeoDataFrame

Neighborhood geometry; each row becomes an observation.

required
nbhd_type str

Label for how the neighborhoods were made.

'neighborhood'
**kwargs Any

Forwarded to the constructor.

{}

Returns:

Type Description
NeighborhoodCollection

A new NeighborhoodCollection.

load_transformation_matrix(data_dir=None)

Load the micron-to-pixel transformation matrix from DegaFiles.

Reads micron_to_image_transform.csv and stores it via set_transformation_matrix. Later this matrix can instead be supplied directly (e.g. from SpatialData).

Parameters:

Name Type Description Default
data_dir str | None

Directory containing the transform CSV; defaults to self.data_dir.

None

Returns:

Type Description
ndarray

The loaded matrix as a float ndarray.

Raises:

Type Description
ValueError

If no data_dir is available.

set_transformation_matrix(matrix)

Set the micron-to-pixel affine transformation matrix.

Parameters:

Name Type Description Default
matrix Any

Affine mapping micron coordinates (the geometry's native space) to image/pixel space, as a (2, 3) or (3, 3) array.

required

Returns:

Type Description
ndarray

The stored matrix as a float ndarray. It is also mirrored into uns so it round-trips through write/read.

to_pixel_gdf()

Return the neighborhood geometry ready for pixel-space visualization.

Adds a geometry_pixel column (micron geometry transformed to image space via the stored transformation matrix) and leaves the original micron geometry intact. The result can be passed straight to Landscape(nbhd=...), which renders geometry_pixel directly when present rather than applying its own transform.

Returns:

Type Description
GeoDataFrame

A copy of gdf with an added geometry_pixel column.

Raises:

Type Description
ValueError

If geometry or the transformation matrix is not set.
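
For intuition, the micron-to-pixel step amounts to applying a 2D affine to every coordinate. The sketch below does this for plain (x, y) points. It assumes the common row layout [[a, b, tx], [c, d, ty]] (with an optional [0, 0, 1] third row), which the source does not spell out, and the helper name and example matrix are hypothetical.

```python
def apply_affine_sketch(points_um, matrix):
    """Map (x, y) micron points to pixel space with a (2, 3) or (3, 3) affine.

    Assumes the row layout [[a, b, tx], [c, d, ty]] (plus [0, 0, 1] for 3x3);
    only the first two rows are used.
    """
    (a, b, tx), (c, d, ty) = matrix[0], matrix[1]
    return [(a * x + b * y + tx, c * x + d * y + ty) for x, y in points_um]


# Hypothetical transform: 4 pixels per micron, shifted by (10, 20) pixels.
matrix = [[4.0, 0.0, 10.0],
          [0.0, 4.0, 20.0],
          [0.0, 0.0, 1.0]]
pixels = apply_affine_sketch([(0.0, 0.0), (2.5, 1.0)], matrix)
```

Because only the first two rows are read, the same function accepts the (2, 3) form by passing matrix[:2].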

alpha_shape_cell_clusters(adata, cat='cluster', alphas=(100, 150, 200, 250, 300, 350), meta_cluster=None)

Compute alpha shapes for each cluster in the cell metadata.

Parameters

adata : AnnData
    AnnData object with cell metadata in obs and spatial coordinates in obsm["spatial"].
cat : str
    Column name in adata.obs containing cluster/category labels.
alphas : Sequence[float]
    Inverse alpha values to compute shapes for.
meta_cluster : pd.DataFrame | None
    Optional DataFrame with cluster metadata including a 'color' column. If not provided, colors are extracted from adata.uns[f'{cat}_colors'] if available; otherwise they default to black.

Returns

gpd.GeoDataFrame
    GeoDataFrame with alpha shapes for each cluster at each alpha value.

calculate_gradient(source, direction='both', bin_width=10, max_dist=50, *, technology=None, scale_um_per_pixel=None, is_pixel_space=False, clip_boundary=None, clip_reference=None, clip_alpha=100, add_colors=True, nbhd_col='name')

Generate concentric gradient rings outward from and/or inward into an ROI.

Starting from the boundary of the merged source geometry, this builds fixed-width bands every bin_width microns out to max_dist:

  • Outward bands grow away from the ROI (positive distance). Use clip_boundary/clip_reference to stop them from running off the tissue.
  • Inward bands erode into the ROI (negative distance) and stop automatically once the geometry erodes to nothing.

Each band is one row of the returned GeoDataFrame, ordered inner-most to outer-most, with a signed distance so you can spatially join cells to bands (gpd.sjoin) and correlate composition or expression against distance from the ROI edge.
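
The band layout described above can be sketched without any geometry by enumerating the signed distance edges alone. This is an illustration under stated assumptions, not the library's code: the helper name band_edges_sketch is hypothetical, a max_dist that is not a multiple of bin_width is simply floored, and the (start, end) orientation chosen for inward bands is a guess. Real inward bands can also stop early once the ROI erodes to nothing.

```python
def band_edges_sketch(direction="both", bin_width=10, max_dist=50):
    """Hypothetical helper: signed (start, end) edges, inner-most first."""
    if direction not in ("outward", "inward", "both"):
        raise ValueError(f"invalid direction: {direction!r}")
    n_bands = int(max_dist // bin_width)
    # Outward bands use positive distances from the ROI boundary.
    outward = [(i * bin_width, (i + 1) * bin_width) for i in range(n_bands)]
    # Inward bands use negative distances; the deepest band comes first.
    inward = [(-(i + 1) * bin_width, -i * bin_width)
              for i in reversed(range(n_bands))]
    bands = []
    if direction in ("inward", "both"):
        bands += inward
    if direction in ("outward", "both"):
        bands += outward
    return bands


edges = band_edges_sketch("both", bin_width=10, max_dist=50)
outward_only = band_edges_sketch("outward", bin_width=50, max_dist=200)
```

With the defaults this yields ten bands, running from (-50, -40) deep inside the ROI to (40, 50) at the outer limit.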

Parameters:

Name Type Description Default
source GeoDataFrame | GeoSeries | BaseGeometry | NeighborhoodCollection

The ROI. May be a GeoDataFrame/GeoSeries (all rows are dissolved into one shape), a NeighborhoodCollection (celldega.nbhd.collection.NeighborhoodCollection; its gdf is used), or a bare shapely (Multi)Polygon.

required
direction str

"outward", "inward", or "both" (default).

'both'
bin_width float

Width of each ring in microns (default 10).

10
max_dist float

Maximum distance from the ROI boundary in microns (default 50). Applied symmetrically to outward and inward bands.

50
technology str | None

Imaging platform (e.g. "Xenium") used to look up scale_um_per_pixel when the geometry is in pixel space. Ignored if scale_um_per_pixel is given.

None
scale_um_per_pixel float | None

Microns per pixel. Required (directly or via technology) when is_pixel_space=True.

None
is_pixel_space bool

True if source geometry is in pixel units; False (default) if it is already in microns — the natural space of a NeighborhoodCollection. Controls how micron ring widths are converted to the geometry's units and how areas are reported.

False
clip_boundary Any | None

Optional precomputed tissue boundary (GeoDataFrame/GeoSeries/geometry) to clip outward rings to. Takes precedence over clip_reference. Use this to pass your own whole-tissue alpha shape at an alpha of your choosing.

None
clip_reference Any | None

Optional point cloud (GeoDataFrame of cells, a GeoSeries, or an (N, 2) array) from which a tissue alpha shape is computed on the fly to clip outward rings. Must be in the same coordinate space as source.

None
clip_alpha float

Inverse-alpha value for the on-the-fly alpha shape (default 100). Larger values trace tissue boundaries more loosely.

100
add_colors bool

If True (default), add a color column (Blues for outward bands, Reds for inward) for visualization.

True
nbhd_col str

Name of the band-identifier column in the output (default "name"), matching what NeighborhoodCollection expects.

'name'

Returns:

Type Description
GeoDataFrame

A GeoDataFrame with one row per ring, ordered inner-most to outer-most, and columns:

  • name / ring_range_um: band label, e.g. "out (+0~+10) µm".
  • direction: "outward" or "inward".
  • dist_start_um / dist_end_um: signed band edges in microns.
  • area / area_um2 / area_px2: band areas.
  • color: hex color (when add_colors).
  • geometry: the band polygon, in the source coordinate space.

Raises:

Type Description
ValueError

If direction is invalid, or is_pixel_space=True without a resolvable scale_um_per_pixel.

Examples:

Inward and outward micron-space rings, returned as a collection-ready GeoDataFrame:

>>> import celldega as dega
>>> gdf_rings = dega.nbhd.calculate_gradient(
...     gdf_tumor, direction="both", bin_width=10, max_dist=50
... )
>>> gdf_rings[["name", "direction", "dist_start_um"]].head(3)

Outward-only rings from pixel-space geometry, clipped to a tissue alpha shape computed on the fly from cell centroids so the rings cannot run off the tissue:

>>> gdf_rings = dega.nbhd.calculate_gradient(
...     gdf_tumor,
...     direction="outward",
...     technology="Xenium",
...     is_pixel_space=True,
...     clip_reference=gdf_cells,
...     clip_alpha=100,
... )

Tag cells with the band they fall in for downstream gradient analysis:

>>> import geopandas as gpd
>>> joined = gpd.sjoin(
...     gdf_cells, gdf_rings[["ring_range_um", "geometry"]],
...     how="left", predicate="within",
... )

filter_alpha_shapes(gdf_alpha, alpha, min_area=0, clean_names=True)

Filter alpha shapes by a specific alpha value and optionally clean up names.

Alpha shapes computed by alpha_shape_cell_clusters have names in the format {category}_{alpha} (e.g., "cluster_0_150"). This function filters to a specific alpha value and removes the trailing _{alpha} suffix from names.

Parameters

gdf_alpha : gpd.GeoDataFrame
    GeoDataFrame of alpha shapes with 'inv_alpha', 'area', 'name', and 'cat' columns. Typically the output of alpha_shape_cell_clusters.
alpha : float
    The inverse alpha value to filter for (must match values in the 'inv_alpha' column).
min_area : float, default 0
    Minimum area threshold. Shapes with area <= min_area are excluded.
clean_names : bool, default True
    If True, removes the trailing _{alpha} suffix from the 'name' column, leaving just the category name (e.g., "cluster_0" instead of "cluster_0_150").

Returns

gpd.GeoDataFrame
    Filtered GeoDataFrame with optionally cleaned names.

Examples

>>> gdf_alpha = dega.nbhd.alpha_shape_cell_clusters(adata, cat="leiden")
>>> gdf_filtered = dega.nbhd.filter_alpha_shapes(gdf_alpha, alpha=150)
>>> # Names are now just category names without the alpha suffix
>>> print(gdf_filtered["name"].tolist()[:3])
['0', '1', '2']
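
The filtering and name-cleaning rules can be mimicked on plain dictionaries, which may help when reasoning about which shapes survive. This is a sketch of the documented behavior only: the helper name is hypothetical, rows stand in for GeoDataFrame records, and the suffix is assumed to be formatted exactly as f"_{alpha}".

```python
def filter_alpha_shapes_sketch(rows, alpha, min_area=0, clean_names=True):
    """Hypothetical helper: keep rows for one inverse alpha, then clean names."""
    suffix = f"_{alpha}"  # assumption: names end in the alpha as written here
    kept = []
    for row in rows:
        # Drop other alpha values and shapes with area <= min_area.
        if row["inv_alpha"] != alpha or row["area"] <= min_area:
            continue
        row = dict(row)  # copy so the input records are left untouched
        if clean_names and row["name"].endswith(suffix):
            row["name"] = row["name"][: -len(suffix)]
        kept.append(row)
    return kept


rows = [
    {"name": "0_150", "inv_alpha": 150, "area": 1200.0},
    {"name": "0_200", "inv_alpha": 200, "area": 1500.0},
    {"name": "1_150", "inv_alpha": 150, "area": 0.0},
]
filtered = filter_alpha_shapes_sketch(rows, alpha=150)
```

Here only the first record survives: the second has a different inverse alpha, and the third has an area that does not exceed min_area.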

generate_hextile(adata, diameter=100)

Generate a hexagonal grid over the bounding box of cell spatial coordinates.

Parameters

adata : AnnData
    AnnData object with spatial coordinates in obsm["spatial"].
diameter : float, default 100
    Diameter of each hexagon in the same units as the spatial coordinates (typically microns).

Returns

gpd.GeoDataFrame
    GeoDataFrame with hexagon geometries covering the spatial extent. Columns: "name" (hex_0, hex_1, ...), "geometry" (Polygon).

Examples

>>> gdf_hex = dega.nbhd.generate_hextile(adata, diameter=100)
>>> gdf_hex.shape
(1234, 2)
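
As a geometric aside, a single tile of such a grid can be described by six vertices around its center. The sketch below assumes that diameter is measured corner to corner (the circumscribed circle) and that the first vertex points along the x axis. The source states neither, and the helper name is hypothetical.

```python
import math


def hexagon_vertices_sketch(cx, cy, diameter=100):
    """Hypothetical helper: vertices of a regular hexagon centered at (cx, cy).

    Assumes diameter is the corner-to-corner distance, so every vertex sits
    at diameter / 2 from the center.
    """
    radius = diameter / 2
    return [
        (cx + radius * math.cos(math.radians(60 * k)),
         cy + radius * math.sin(math.radians(60 * k)))
        for k in range(6)
    ]


verts = hexagon_vertices_sketch(0.0, 0.0, diameter=100)
```

The resulting list can be passed to a polygon constructor such as shapely's Polygon to obtain one tile.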

hextile_niche(gdf_hex, adata_hex, category='leiden', dissolve=True)

Create niche polygons from hextiles based on clustering results.

Takes hexagon geometries and assigns them to niches based on clustering (e.g., Leiden clustering of hexagon population distributions). Optionally dissolves adjacent hexagons of the same niche into unified polygons.

Parameters

gdf_hex : gpd.GeoDataFrame
    GeoDataFrame of hexagon geometries. Must be indexed by hexagon name (matching adata_hex.obs.index).
adata_hex : AnnData
    AnnData object containing clustering results in obs[category]. The index must match the hexagon names in gdf_hex. Colors can be provided in uns[f"{category}_colors"].
category : str, default "leiden"
    Column name in adata_hex.obs containing the niche/cluster assignment.
dissolve : bool, default True
    If True, dissolve adjacent hexagons of the same niche into unified MultiPolygon geometries. If False, return individual hexagons with their niche assignment.

Returns

gpd.GeoDataFrame
    If dissolve=True: GeoDataFrame with dissolved niche polygons. Columns: "name", "cat", "geometry", "color", "area".
    If dissolve=False: GeoDataFrame with individual hexagons and niche assignment. Columns: "name", "cat", "geometry", "color".

Examples

>>> # Generate hexagons and compute population distribution
>>> gdf_hex = dega.nbhd.generate_hextile(adata, diameter=100)
>>> adata_hex = dega.nbhd.calc_nbhd_by_pop(adata, gdf_hex, category="leiden")

>>> # Cluster hexagons by population similarity (e.g., using scanpy)
>>> import scanpy as sc
>>> sc.pp.pca(adata_hex)
>>> sc.pp.neighbors(adata_hex)
>>> sc.tl.leiden(adata_hex)

>>> # Create dissolved niche polygons
>>> gdf_niche = dega.nbhd.hextile_niche(gdf_hex, adata_hex, category="leiden")

>>> # Or keep individual hexagons with niche assignment
>>> gdf_hex_niche = dega.nbhd.hextile_niche(gdf_hex, adata_hex, dissolve=False)