Project Context: python-sentinel-pipeline (Status: May 30, 2026 - ROI & Notification Enhancement Phase)
- Fixed ROI Transparency: Resolved a regression where dark pixels (value 0) were incorrectly treated as transparent, by removing the harmful `srcNodata=0` settings and ensuring proper `dstAlpha` handling in `gdal.Warp`.
- Multi-Tile ROI Mosaicing: Finalized the `roi_manager.py` refactor to group input products by orbit and date, enabling seamless mosaicing of ROIs that span multiple tiles.
- Enhanced Orbit Metadata: Updated `metadata_engine.py` to extract and store `relative_orbit` and `orbit_direction` for both Sentinel-1 and Sentinel-2, improving product traceability.
- Enhanced ROI Notifications:
  - Integrated `apprise` into `roi_manager.py` to send targeted alerts for ROI crops.
  - Implemented decoupled image generation: downscaled previews for Bluesky and full-size high-quality JPEGs for direct notifications (Signal, etc.).
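
The ROI transparency fix listed at the top of this phase can be sketched roughly as follows. This is a minimal illustration, not the code in `roi_manager.py`: the helper names and the cutline-based cropping are assumptions, and only the keyword names (`srcNodata`, `dstAlpha`, `cutlineDSName`, `cropToCutline`, `format`) are real `gdal.Warp` options.

```python
def build_roi_warp_kwargs(cutline_path, treat_zero_as_nodata=False):
    """Assemble keyword arguments for gdal.Warp when cropping an ROI.

    Hypothetical helper for illustration; the pipeline's real call may differ.
    """
    kwargs = {
        "format": "GTiff",
        "cutlineDSName": cutline_path,  # vector file holding the ROI polygon
        "cropToCutline": True,          # shrink the output extent to the ROI
        "dstAlpha": True,               # alpha band flags pixels with no source data
    }
    if treat_zero_as_nodata:
        # The regression: every pixel whose value is 0 (deep shadow, dark water)
        # is declared nodata and therefore ends up transparent.
        kwargs["srcNodata"] = 0
    return kwargs


def crop_roi(src_path, dst_path, cutline_path):
    """Run the warp. Requires the GDAL Python bindings."""
    from osgeo import gdal  # imported lazily so the helper above has no dependencies

    gdal.UseExceptions()
    return gdal.Warp(dst_path, src_path, **build_roi_warp_kwargs(cutline_path))
```

With `srcNodata` left unset, transparency comes only from the alpha band, so a valid pixel with value 0 stays opaque.
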
- Universal ROI Support:
  - Refactored ROI logic to be sensor-agnostic; any product (S1, S2, FUSED) can now be targeted for ROI cropping.
  - Hardened filename parsing and timestamp extraction using robust "pinch" logic to handle compound ROI names (e.g., `Ust-Luga`) and complex product types (e.g., `RADAR-BURN`).
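
One way to read the "pinch" idea is to anchor the filename from both ends and treat whatever is left in the middle as the product type. The sketch below assumes a hypothetical layout `<ROI>-<PRODUCT>-<YYYYMMDDTHHMMSS>.<ext>`, a caller-supplied list of known ROI names, and UTC timestamps; none of these details are confirmed by the pipeline itself.

```python
import re
from datetime import datetime, timezone

# Right-hand anchor: a compact timestamp at the very end of the stem (assumed format).
_TIMESTAMP_RE = re.compile(r"[-_](\d{8}T\d{6})$")


def parse_roi_filename(filename, known_rois):
    """Split a crop filename into (roi_name, product_type, timestamp)."""
    stem = filename.rsplit(".", 1)[0]

    # Pinch from the right: peel off the timestamp.
    match = _TIMESTAMP_RE.search(stem)
    if match is None:
        raise ValueError(f"no timestamp found in {filename!r}")
    timestamp = datetime.strptime(match.group(1), "%Y%m%dT%H%M%S")
    timestamp = timestamp.replace(tzinfo=timezone.utc)  # assumption: names are in UTC
    head = stem[: match.start()]

    # Pinch from the left: the longest known ROI name wins, so "Ust-Luga"
    # is preferred over a shorter ROI that happens to be called "Ust".
    for roi in sorted(known_rois, key=len, reverse=True):
        if head.startswith(roi + "-") or head.startswith(roi + "_"):
            # Whatever remains between the two anchors is the product type.
            return roi, head[len(roi) + 1 :], timestamp
    raise ValueError(f"no known ROI name found in {filename!r}")
```

Because neither anchor depends on splitting at hyphens, both `Ust-Luga` and `RADAR-BURN` survive intact.
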
- Centralized Metadata Architecture:
  - Refactored `metadata_engine.py` as the single source of truth for product identification, resolution mapping, and sidecar generation.
  - Simplified `rebuild_metadata.py` to delegate all heavy lifting to the engine, ensuring consistent metadata across the entire pipeline.
  - ROI crops now correctly inherit parent metadata such as `cloud_cover`.
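
Metadata inheritance for ROI crops might look something like the sketch below. The function and the `roi`, `parent_product`, and `product_id` keys are hypothetical; only `cloud_cover`, `relative_orbit`, and `orbit_direction` are field names taken from this document.

```python
# Fields copied from the parent product's sidecar (names from this document).
INHERITED_KEYS = ("cloud_cover", "relative_orbit", "orbit_direction")


def build_roi_sidecar(parent_meta, roi_name, inherited_keys=INHERITED_KEYS):
    """Return sidecar metadata for an ROI crop derived from a parent product."""
    # Copy only the keys the parent actually has; missing ones are skipped
    # rather than filled with placeholder values.
    sidecar = {key: parent_meta[key] for key in inherited_keys if key in parent_meta}
    sidecar["roi"] = roi_name                                # hypothetical key
    sidecar["parent_product"] = parent_meta.get("product_id")  # hypothetical key
    return sidecar
```

Keeping the inherited keys in one tuple means a single place to edit when another parent field should flow down to the crops.
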
- Viewer UI/UX Polish:
  - Streamlined ROI display with a dedicated "Regions of Interest" section and subheaders.
  - Added dual-mode sorting for ROIs ("By Product" vs "By ROI") with full persistence.
  - Corrected "Invalid Date" issues by fixing ISO timestamp formatting in sidecars.
  - Updated translations across all supported languages (FI, SE, EN, DE).
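
A common cause of "Invalid Date" in a browser is a timestamp that is not in the ISO 8601 form the JavaScript `Date` parser expects. The exact broken format is not recorded here, so the sketch below assumes a compact `YYYYMMDDTHHMMSS` input in UTC and a hypothetical helper name.

```python
from datetime import datetime, timezone


def to_sidecar_timestamp(compact):
    """Convert a compact timestamp such as '20260530T093015' to ISO 8601."""
    parsed = datetime.strptime(compact, "%Y%m%dT%H%M%S")
    parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: acquisition times are UTC
    # isoformat() emits '+00:00'; the shorter 'Z' suffix is equivalent.
    return parsed.isoformat().replace("+00:00", "Z")


print(to_sidecar_timestamp("20260530T093015"))  # 2026-05-30T09:30:15Z
```
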
- Download Progress Indicators:
  - Implemented a real-time ASCII progress bar in `copernicus/_class.py` for large Sentinel-1 downloads, providing immediate visual feedback on data volume and speed.
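
A progress bar of this kind can be built from a small pure function plus a carriage-return redraw. The sketch below is illustrative only: the function names, bar width, and MiB units are assumptions, not the code in `copernicus/_class.py`.

```python
import sys

MIB = 1024 * 1024


def render_progress(done_bytes, total_bytes, elapsed_s, width=30):
    """Return one line of ASCII progress: bar, percentage, volume, and speed."""
    fraction = min(done_bytes / total_bytes, 1.0) if total_bytes else 0.0
    filled = int(fraction * width)
    bar = "#" * filled + "-" * (width - filled)
    speed = done_bytes / MIB / elapsed_s if elapsed_s > 0 else 0.0
    return (
        f"[{bar}] {fraction:6.1%} "
        f"{done_bytes / MIB:.1f}/{total_bytes / MIB:.1f} MiB {speed:.1f} MiB/s"
    )


def print_progress(done_bytes, total_bytes, elapsed_s):
    """Redraw the bar in place; '\\r' returns the cursor to the line start."""
    sys.stderr.write("\r" + render_progress(done_bytes, total_bytes, elapsed_s))
    sys.stderr.flush()
```

Calling `print_progress` after each downloaded chunk keeps the display on a single terminal line, and writing to stderr keeps it out of any piped output.
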
("Performance & Reliability" Phase - April 2026)
- Multi-BBOX Support: Modified `functions.get_boxes()` to support semicolon-separated lists and JSON lists.
- Footprint Optimization: Implemented 10x mask downsampling and recursive hole-filling.
- Memory Safety: Introduced "Balanced Parallelism" limiting `ThreadPoolExecutor` and GDAL sub-processes.
- Alpha Integrity: Unified alpha mask generation across all S2 products.
- Viewer Enhancements: Robust `zIndex` management for Overlays and UI consistency improvements.
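
The two input styles mentioned for multi-BBOX support could be handled along these lines. This is a sketch under assumptions: `parse_boxes` is a hypothetical stand-in for `functions.get_boxes()`, and each box is assumed to be four numbers (west, south, east, north).

```python
import json


def parse_boxes(raw):
    """Parse bounding boxes from a semicolon-separated string or a JSON list."""
    raw = raw.strip()
    if not raw:
        return []
    if raw.startswith("["):
        data = json.loads(raw)
        # Accept a single box [w, s, e, n] as well as a list of boxes.
        if data and not isinstance(data[0], (list, tuple)):
            data = [data]
        boxes = [tuple(float(v) for v in box) for box in data]
    else:
        boxes = [
            tuple(float(v) for v in part.split(","))
            for part in raw.split(";")
            if part.strip()
        ]
    for box in boxes:
        if len(box) != 4:
            raise ValueError(f"expected 4 coordinates per box, got {box!r}")
    return boxes
```

For example, `parse_boxes("24.0,59.5,25.0,60.5; 28.0,59.0,29.0,60.0")` and the equivalent JSON list of two four-element arrays yield the same two tuples.
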
- Vectorization Scaling: Attempting to vectorize 10m masks with speckle noise is a CPU death trap. 100m downsampling with hole-filling provides identical UI utility for 1% of the compute cost.
- Sub-process Concurrency: When parallelizing tasks that invoke GDAL, the product `Finalizer_Workers * GDAL_NUM_THREADS` must be carefully managed, because peak memory scales with the total number of concurrent threads.
- Variable Interpolation: `python-dotenv` has limits with complex string interpolation. Manual parsing and fallback resolution are sometimes necessary for robust configuration management.
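
The concurrency lesson above amounts to keeping workers times GDAL threads within a fixed budget. A minimal sketch, assuming a CPU-count budget and hypothetical function names (`GDAL_NUM_THREADS` itself is a real GDAL configuration option):

```python
import os
from concurrent.futures import ThreadPoolExecutor


def plan_parallelism(cpu_budget, gdal_threads):
    """Return how many workers fit if each one may use gdal_threads threads."""
    if gdal_threads < 1:
        raise ValueError("gdal_threads must be at least 1")
    return max(1, cpu_budget // gdal_threads)


def run_finalizers(tasks, cpu_budget=None, gdal_threads=2):
    """Run zero-argument callables so that workers * gdal_threads <= cpu_budget."""
    cpu_budget = cpu_budget or os.cpu_count() or 1
    workers = plan_parallelism(cpu_budget, gdal_threads)
    # Exported through the environment so GDAL child processes inherit the cap.
    os.environ["GDAL_NUM_THREADS"] = str(gdal_threads)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda task: task(), tasks))
```

On an 8-core machine with `gdal_threads=2` this yields four workers, so raising one knob automatically lowers the other instead of multiplying the load.
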
- Docker Production: Finalize the Nginx + Pipeline production container suite. The overall design still needs to be worked out.
- Verification: Monitor the long-running queue for any remaining OOM edge cases.
The pipeline is now highly autonomous and resilient. Beyond searching, downloading, processing, and managing metadata, it now proactively predicts its next data intake, automatically generates targeted regional crops, and gracefully handles corrupted historical data.