Algorithm Layer
Experimental - Not Reviewed
Content may be incomplete or inaccurate.
Purpose
The Algorithm Layer forms the foundation of StarryNight, providing Python functions that implement core image processing logic. It encapsulates scientific algorithms with minimal dependencies on infrastructure concerns, so they can be reused across different execution contexts and frameworks. Because dependencies are limited to essential utilities and templates, algorithms can be developed, tested, and validated against clear interfaces, which keeps the scientific logic focused and maintainable.
Responsibilities
- Implement standalone Python functions for image processing tasks
- Organize algorithms by workflow type in starrynight/src/starrynight/algorithms/
  - Core workflows: illum_calc, illum_apply, analysis, align, preprocess
  - Specialized workflows: segcheck, presegcheck, stitchcrop, inventory, index
  - Shared execution: cp.py provides common CellProfiler execution
- Maintain clear input/output contracts using standard Python types and cloudpathlib paths
- NOT responsible for: Command-line interfaces, workflow orchestration, execution management, or any infrastructure concerns
Key Design Decisions
- Functional Independence (Decision #2): Algorithm functions minimize dependencies, importing only necessary utilities from starrynight.utils, templates, and parsers. This controlled dependency approach balances practical code reuse with maintaining clear boundaries. Core algorithm logic remains independent, while common utilities for data formatting and template rendering are shared.
- Three-Function Pattern (Decision #7): CellProfiler-based algorithms follow a consistent pattern of LoadData generation (gen_*_load_data), pipeline generation (gen_*_cppipe), and execution (run_cp). The execution function is shared across all CellProfiler workflows, promoting code reuse. This standardization enables predictable integration while maintaining flexibility for non-CellProfiler algorithms.
Interfaces
Inputs
- File paths via cloudpathlib AnyPath (supporting local and cloud storage)
- Algorithm-specific parameters as primitive Python types
- Data structures using standard libraries (pandas/polars DataFrames, NumPy arrays)
Outputs
- Processed data files written to specified paths
- Return values as standard Python types or None for side-effect operations
- Generated artifacts (CSV files, .cppipe definitions, processed images)
Dependencies
- Internal dependencies: starrynight.utils (data formatting), starrynight.templates (template rendering), starrynight.parsers (file parsing)
- External dependencies: cloudpathlib, pandas/polars, NumPy, scikit-image, CellProfiler (for specific algorithms)
Patterns
The algorithm layer follows consistent patterns for implementing image processing workflows. Here's the typical structure found in starrynight/src/starrynight/algorithms/:
Three-Function Pattern
The pattern consists of three complementary functions that work together for CellProfiler workflows:
- LoadData Generation (gen_*_load_data): Reads input data, applies filtering and transformations, then writes a LoadData CSV file. These functions leverage shared utilities from starrynight.utils for consistent data formatting across workflows.
- Pipeline Generation (gen_*_cppipe): Configures CellProfiler modules, builds the processing pipeline, and saves the configuration. The template system from starrynight.templates ensures consistent pipeline generation while allowing workflow-specific customization.
- Execution (run_cp): A shared function that loads the pipeline, configures the execution context, and runs the processing. This common execution function is implemented in starrynight/src/starrynight/algorithms/cp.py and used by all CellProfiler-based algorithms.
This pattern enables consistent integration while maintaining flexibility for non-CellProfiler algorithms, which may need only a subset of these functions.
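The three steps can be sketched with the standard library alone. This is a minimal illustration, not StarryNight's implementation: the names gen_demo_load_data, gen_demo_cppipe, run_demo, and DEMO_TEMPLATE are hypothetical, the rendered file is a placeholder rather than a real .cppipe, and run_demo only counts the rows it would process instead of invoking CellProfiler. The FileName_<Image> and PathName_<Image> column names follow the convention used by CellProfiler's LoadData module.

```python
import csv
import tempfile
from pathlib import Path
from string import Template

# Hypothetical stand-in for a pipeline template; NOT a real .cppipe format.
DEMO_TEMPLATE = Template("# demo pipeline: $name\nmodules=$modules\n")


def gen_demo_load_data(images_dir: Path, out_csv: Path, channel: str) -> Path:
    """Step 1: list one channel's images and write a LoadData-style CSV."""
    images = sorted(images_dir.glob(f"*_{channel}.tiff"))
    with out_csv.open("w", newline="") as fh:
        writer = csv.writer(fh)
        # LoadData reads FileName_<Image> / PathName_<Image> columns.
        writer.writerow([f"FileName_{channel}", f"PathName_{channel}"])
        for image in images:
            writer.writerow([image.name, str(image.parent)])
    return out_csv


def gen_demo_cppipe(out_cppipe: Path, name: str, modules: list[str]) -> Path:
    """Step 2: render a pipeline definition from a template and save it."""
    text = DEMO_TEMPLATE.substitute(name=name, modules=",".join(modules))
    out_cppipe.write_text(text)
    return out_cppipe


def run_demo(load_data_csv: Path, cppipe: Path, out_dir: Path) -> int:
    """Step 3: stand-in for the shared execution function.

    A real implementation would launch CellProfiler. This sketch only
    validates its inputs and returns the number of image rows found.
    """
    if not cppipe.exists():
        raise FileNotFoundError(cppipe)
    out_dir.mkdir(parents=True, exist_ok=True)
    with load_data_csv.open(newline="") as fh:
        return sum(1 for _ in csv.DictReader(fh))


def demo() -> int:
    """Chain the three steps in a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        images = root / "images"
        images.mkdir()
        for name in ("A01_DNA.tiff", "A02_DNA.tiff", "A01_RNA.tiff"):
            (images / name).touch()
        csv_path = gen_demo_load_data(images, root / "load_data.csv", "DNA")
        # Module names here are placeholder strings for illustration.
        pipe_path = gen_demo_cppipe(root / "demo.cppipe", "demo", ["LoadData", "Measure"])
        return run_demo(csv_path, pipe_path, root / "out")
```

Each step communicates with the next only through files on disk, so any one of them can be replaced or run on its own.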
Naming Conventions
Functions follow predictable naming patterns:
- gen_*_load_data for LoadData generation functions
- gen_*_cppipe for pipeline generation functions
- run_* for execution functions
Path Handling
All paths use cloudpathlib's AnyPath for cloud/local compatibility, ensuring algorithms work seamlessly across different storage backends.
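As a rough sketch of what this looks like in practice, the function below accepts a local path or a cloud URI and treats both the same way. The helper write_summary is hypothetical and only illustrates the idea. The fallback import is an assumption added so the example runs without cloudpathlib installed; in that case only local paths work.

```python
try:
    # AnyPath returns a pathlib.Path for local paths and a CloudPath for cloud URIs.
    from cloudpathlib import AnyPath
except ImportError:
    # Local-only fallback so this sketch runs without cloudpathlib.
    from pathlib import Path as AnyPath


def write_summary(out_dir, lines):
    """Hypothetical helper: write lines to summary.txt under out_dir."""
    out_dir = AnyPath(out_dir)
    # mkdir, the / operator, and write_text behave alike on Path and CloudPath.
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / "summary.txt"
    out_path.write_text("\n".join(lines))
    return out_path
```

The function body never branches on where the data lives; the choice of backend is made once, when the path object is constructed.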
Utility Delegation
Common operations are delegated to starrynight.utils modules, promoting code reuse and consistent behavior across algorithms.
Separation of Concerns
Data preparation, pipeline configuration, and execution are kept as distinct functions, allowing flexible composition and testing of individual components.
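One consequence of this separation is that data preparation can be tested without touching the filesystem or CellProfiler. The sketch below assumes two hypothetical helpers, build_rows and rows_to_csv, which are not part of StarryNight; they only show how small, single-purpose functions lend themselves to direct assertions.

```python
import csv
import io


def build_rows(filenames, channel):
    """Data preparation only: a pure function with no file or process access."""
    return [
        {f"FileName_{channel}": name}
        for name in sorted(filenames)
        if name.endswith(f"_{channel}.tiff")
    ]


def rows_to_csv(rows):
    """Serialization only: turn prepared rows into CSV text."""
    if not rows:
        return ""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def test_build_rows_filters_by_channel():
    rows = build_rows(["b_DNA.tiff", "a_DNA.tiff", "a_RNA.tiff"], "DNA")
    assert rows == [{"FileName_DNA": "a_DNA.tiff"}, {"FileName_DNA": "b_DNA.tiff"}]


def test_rows_to_csv_has_header():
    text = rows_to_csv([{"FileName_DNA": "a_DNA.tiff"}])
    assert text == "FileName_DNA\na_DNA.tiff\n"
```

The two test functions follow pytest's naming convention, but they are plain functions and can also be called directly.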
Implementation Location
All algorithm implementations can be found in:
- Primary location:
starrynight/src/starrynight/algorithms/
- Tests:
starrynight/tests/algorithms/
Each algorithm file typically contains the relevant functions for its workflow, following the patterns described above.
See Also
- Next: CLI Layer