StarryNight Architecture Overview
Navigating the Architecture Documentation
This document provides a high-level structural map of the StarryNight system and serves as your main guide to the architecture documentation. Below is a recommended reading path based on your role and interests:
If you are a... | Start here | Then explore |
---|---|---|
New developer | This overview + Practical Integration | Layer-specific docs based on your focus |
Algorithm developer | Algorithm Layer | CLI Layer → Module Layer |
Module developer | Module Layer | Pipeline Layer |
Pipeline developer | Pipeline Layer | Execution Layer |
Configuration developer | Configuration Layer | Relevant layer docs |
Biologist/Non-technical user | Architecture for Biologists | Practical Integration |
For different perspectives on the architecture:
- Biological analogies: See Architecture for Biologists
- Concrete implementation examples: See Practical Integration Walkthrough
- Detailed layer explanations: Follow links in the Layer Overview section below
Architecture at a Glance
The architecture consists of conceptual layers and their implementation:
flowchart TB
%% Top container for all conceptual layers
subgraph "Conceptual Architecture"
A[Algorithm Layer<br>Algorithm Sets with<br>Data/Pipeline/Execution Functions] -->|exposed as| B[CLI Layer<br>Command Groups<br>& Commands]
B -.->|invoked by| C[Module Layer<br>Bilayer Specs + <br>Pipecraft Compute Graphs]
C -->|composed into| D[Pipeline Layer<br>Sequential/Parallel<br>Workflows]
D -->|translated to| E[Execution Layer<br>Snakemake Rules & <br>Container Execution]
end
%% Bottom container for implementation directories
subgraph "Code Implementation"
I1["/starrynight/algorithms/"]
I2["/starrynight/cli/"]
I3["/starrynight/modules/"]
I4["/starrynight/pipelines/"]
I5["/pipecraft/"]
I6["/pipecraft/backend/"]
end
%% Connect layers to implementation
A -.-> I1
B -.-> I2
C -.-> I3
D -.-> I4 & I5
E -.-> I6
%% Styling
classDef code fill:#f9f9f9,stroke:#333,stroke-width:1px
class A,B,C,D,E code
classDef impl fill:#fffdf0,stroke:#333,stroke-width:1px
class I1,I2,I3,I4,I5,I6 impl
Note: The Configuration Layer (in /starrynight/experiments/
) provides parameters across all layers but is not shown in this diagram for clarity.
Core Design Principles
- Separation of concerns: Each layer has distinct responsibilities
- Progressive composition: Build complex pipelines from simple components
- Containerized execution: Run code in reproducible environments
- Configuration inference: The Configuration Layer provides adaptive parameters across all other layers through smart defaults and data-driven inference
Layer Overview
StarryNight consists of six interconnected layers:
Layer | Primary Responsibility | Key Components | Details |
---|---|---|---|
Algorithm | Core image processing without dependencies | Functions for data prep, pipeline generation, execution | Details → |
CLI | Command-line interface | Command groups, parameter parsing/validation | Details → |
Module | Standardized components | Input/output specs, compute graph generation | Details → |
Pipeline | Workflow composition | Sequential/parallel execution paths | Details → |
Execution | Backend runtime | Snakemake rules, container management | Details → |
Configuration | Parameter inference | Experiment configs, adaptive settings | Details → |
Data and Control Flow
The StarryNight architecture transforms data through two primary phases: pipeline composition and runtime execution. While shown as linear flows for clarity, the architecture supports iterative scientific workflows where researchers can adjust parameters and re-run specific steps based on results.
Pipeline Composition Phase
This phase focuses on how configurations become executable workflows:
sequenceDiagram
participant Config as Configuration Layer
participant Module as Module Layer
participant Pipeline as Pipeline Layer
participant Execution as Execution Layer
Config->>Module: Supply module parameters
Module->>Module: Generate compute graphs
Module->>Pipeline: Pass module specifications
Pipeline->>Pipeline: Compose workflow
Pipeline->>Execution: Submit workflow
Execution->>Execution: Translate to Snakemake rules
Runtime Execution Phase
This phase shows how workflows execute and process data:
sequenceDiagram
participant Execution as Execution Layer
participant Runtime as Container Runtime
participant CLI as CLI Layer
participant Algorithm as Algorithm Layer
participant Storage as File System
Execution->>Runtime: Schedule container execution
Runtime->>CLI: Invoke CLI commands
CLI->>Algorithm: Call algorithm functions
Algorithm->>Storage: Write intermediate results
Runtime->>Runtime: Process data through pipeline
Runtime->>Storage: Write final outputs
Key Architecture Steps
The complete flow can be summarized in these key steps:
- Configuration defines parameters for all layers
- Modules generate compute graphs from configuration
- Pipeline composes modules into workflows
- Execution translates workflows to backend-specific formats
- Container runtime executes the workflows
- CLI layer processes commands and invokes algorithms
- Algorithm layer performs core processing functions
- Results are stored in configured locations
Implementation Organization
StarryNight is organized as a monorepo with four main packages:
StarryNight Core (/starrynight/src/starrynight/
)
The foundation of the platform providing specialized algorithms for microscopy image analysis:
- CLI Tools: Command-line interfaces for each algorithm
- Algorithms: Image processing algorithms for microscopy data
- Modules System: Standardized module structure for algorithm implementation
- Parsers: File path parsing and metadata extraction
- Utilities: Common functions for file handling, data transformation, etc.
PipeCraft (/pipecraft/src/pipecraft/
)
The pipeline compiler and execution framework:
- Pipeline Definition: Python API for defining computational workflows
- Node System: Individual processing steps as configurable nodes
- Backend Abstraction: Support for local, Docker, and AWS Batch execution
- Template System: Pre-defined templates for common backends
Conductor (Job orchestration and API)
Manages the execution environment:
- REST API: API for job management and monitoring
- Database: Storage for project configurations and job results
- Job Management: Scheduling, execution, and monitoring of jobs
- WebSockets: Real-time updates on job status
Canvas (Web UI for pipeline configuration)
The web-based user interface:
- Job Configuration: Interface for setting up and parameterizing workflows
- Run Monitoring: Real-time status updates and logs
- Results Visualization: Interactive exploration of outputs
- File Management: Interface for managing input and output files
Extension Points
StarryNight provides these key extension points:
- New algorithms:
/starrynight/algorithms/
- New CLI commands:
/starrynight/cli/
- New modules:
/starrynight/modules/
- New experiment types:
/starrynight/experiments/
- New backends:
/pipecraft/backend/
For examples of how to extend these components, see the Practical Integration document.
Key Terms
- Algorithm Set: Group of related processing functions
- Module: Standardized component with inputs/outputs
- Pipeline: Composed workflow of modules
- Execution Backend: System that runs pipelines
- Experiment: Configuration for a specific workflow
- Bilayers: Schema system for module specifications
- Compute Graph: Definition of operations and connections
For practical examples showing how these concepts work together, see Practical Integration