# TS Pini Loader - TODO for Complete Refactoring

## Status: Essential Refactoring Complete ✅

**Current Implementation**: 508 lines
**Legacy Script**: 2,587 lines
**Reduction**: 80% (from monolithic to modular)

---

## ✅ Implemented Features

### Core Functionality

- [x] Async/await architecture with aiomysql
- [x] Multiple station type support (Leica, Trimble S7, S9, S7-inverted)
- [x] Coordinate system transformations:
  - [x] CH1903 (Old Swiss system)
  - [x] CH1903+ / LV95 (New Swiss system via EPSG)
  - [x] UTM (Universal Transverse Mercator)
  - [x] Lat/Lon (direct)
- [x] Project/folder name mapping (16 special cases)
- [x] CSV parsing for different station formats
- [x] ELABDATAUPGEO data insertion
- [x] Basic mira (target point) lookup
- [x] Proper logging and error handling
- [x] Type hints and comprehensive docstrings

---

## ⏳ TODO: High Priority

### 1. Mira Creation Logic

**File**: `ts_pini_loader.py`, method `_get_or_create_mira()`
**Lines in legacy**: 138-160
**Current Status**: Stub implementation

**What's needed**:

```python
async def _get_or_create_mira(self, mira_name: str, lavoro_id: int, site_id: int) -> int | None:
    # 1. Check if mira already exists (DONE)

    # 2. If not, check company mira limits
    query = """
        SELECT c.id, c.upgeo_numero_mire, c.upgeo_numero_mireTot
        FROM companies as c
        JOIN sites as s ON c.id = s.company_id
        WHERE s.id = %s
    """

    # 3. If under limit, create mira
    #    (upgeo_numero_mire / upgeo_numero_mireTot come from the row returned by `query`)
    if upgeo_numero_mire < upgeo_numero_mireTot:
        # INSERT INTO upgeo_mire
        # UPDATE companies mira counter
        ...

    # 4. Return mira_id
```

**Complexity**: Medium
**Estimated time**: 30 minutes

---

### 2. Multi-Level Alarm System

**File**: `ts_pini_loader.py`, method `_process_thresholds_and_alarms()`
**Lines in legacy**: 174-1500+ (most of the script!)
**Current Status**: Stub with warning message

**What's needed**:

#### 2.1 Threshold Configuration Loading

```python
class ThresholdConfig:
    """Threshold configuration for a monitored point."""

    # 5 dimensions x 3 levels = 15 thresholds
    attention_N: float | None
    intervention_N: float | None
    immediate_N: float | None

    attention_E: float | None
    intervention_E: float | None
    immediate_E: float | None

    attention_H: float | None
    intervention_H: float | None
    immediate_H: float | None

    attention_R2D: float | None
    intervention_R2D: float | None
    immediate_R2D: float | None

    attention_R3D: float | None
    intervention_R3D: float | None
    immediate_R3D: float | None

    # Notification settings (3 levels x 5 dimensions x 2 channels)
    email_level_1_N: bool
    sms_level_1_N: bool
    # ... (30 fields total)
```

#### 2.2 Displacement Calculation

```python
async def _calculate_displacements(self, mira_id: int) -> dict:
    """
    Calculate displacements in all dimensions.

    Returns dict with:
    - dN: displacement in North
    - dE: displacement in East
    - dH: displacement in Height
    - dR2D: 2D displacement (sqrt(dN² + dE²))
    - dR3D: 3D displacement (sqrt(dN² + dE² + dH²))
    - timestamp: current measurement time
    - previous_timestamp: baseline measurement time
    """
```

#### 2.3 Alarm Creation

```python
async def _create_alarm_if_threshold_exceeded(
    self,
    mira_id: int,
    dimension: str,  # 'N', 'E', 'H', 'R2D', 'R3D'
    level: int,  # 1, 2, 3
    value: float,
    threshold: float,
    config: ThresholdConfig
) -> None:
    """Create alarm in database if not already exists."""
    # Check if alarm already exists for this mira/dimension/level
    # If not, INSERT INTO alarms
    # Send email/SMS based on config
```

**Complexity**: High
**Estimated time**: 4-6 hours
**Dependencies**: Email/SMS sending infrastructure

---

### 3. Multiple Date Range Support

**Lines in legacy**: Throughout alarm processing
**Current Status**: Not implemented

**What's needed**:
- Parse `multipleDateRange` JSON field from mira config
- Apply different thresholds for different time periods
- Handle overlapping ranges

**Complexity**: Medium
**Estimated time**: 1-2 hours

---

## ⏳ TODO: Medium Priority

### 4. Additional Monitoring Types

#### 4.1 Railway Monitoring

**Lines in legacy**: 1248-1522
**What it does**: Special monitoring for railway tracks (binari)
- Groups miras by railway identifier
- Calculates transverse displacements
- Different threshold logic

#### 4.2 Wall Monitoring (Muri)

**Lines in legacy**: ~500-800
**What it does**: Wall-specific monitoring with paired points

#### 4.3 Truss Monitoring (Tralicci)

**Lines in legacy**: ~300-500
**What it does**: Truss structure monitoring

**Approach**: Create separate classes:

```python
class RailwayMonitor:
    async def process(self, lavoro_id: int, miras: list[int]) -> None:
        ...

class WallMonitor:
    async def process(self, lavoro_id: int, miras: list[int]) -> None:
        ...

class TrussMonitor:
    async def process(self, lavoro_id: int, miras: list[int]) -> None:
        ...
```

**Complexity**: High
**Estimated time**: 3-4 hours each

---

### 5. Time-Series Analysis

**Lines in legacy**: Multiple occurrences with `find_nearest_element()`
**Current Status**: Helper functions not ported

**What's needed**:
- Find nearest measurement in time series
- Compare current vs. historical values
- Detect trend changes

**Complexity**: Low-Medium
**Estimated time**: 1 hour

---

## ⏳ TODO: Low Priority (Nice to Have)

### 6. Progressive Monitoring

**Lines in legacy**: ~1100-1300
**What it does**: Special handling for "progressive" type miras
- Different calculation methods
- Integration with external data sources

**Complexity**: Medium
**Estimated time**: 2 hours

---

### 7. Performance Optimizations

#### 7.1 Batch Operations

Currently processes one point at a time.
Could batch:
- Coordinate transformations
- Database inserts
- Threshold checks

**Estimated speedup**: 2-3x

#### 7.2 Caching

Cache frequently accessed data:
- Threshold configurations
- Company limits
- Project metadata

**Estimated speedup**: 1.5-2x

---

### 8. Testing

#### 8.1 Unit Tests

```python
# tests/test_ts_pini_loader.py
def test_coordinate_transformations(): ...
def test_station_type_parsing(): ...
def test_threshold_checking(): ...
def test_alarm_creation(): ...
```

#### 8.2 Integration Tests

- Test with real CSV files
- Test with mock database
- Test coordinate edge cases (hemispheres, zones)

**Estimated time**: 3-4 hours

---

## 📋 Migration Strategy

### Phase 1: Core + Alarms (Recommended Next Step)

1. Implement mira creation logic (30 min)
2. Implement basic alarm system (4-6 hours)
3. Test with real data
4. Deploy alongside legacy script

**Total time**: ~1 working day
**Value**: 80% of use cases covered

### Phase 2: Additional Monitoring

5. Implement railway monitoring (3-4 hours)
6. Implement wall monitoring (3-4 hours)
7. Implement truss monitoring (3-4 hours)

**Total time**: 1.5-2 working days
**Value**: 95% of use cases covered

### Phase 3: Polish & Optimization

8. Add time-series analysis
9. Performance optimizations
10. Comprehensive testing
11. Documentation updates

**Total time**: 1 working day
**Value**: Production-ready, maintainable code

---

## 🔧 Development Tips

### Working with Legacy Code

The legacy script has:
- **Deeply nested logic**: Up to 8 levels of indentation
- **Repeated code**: Same patterns for 15 threshold checks
- **Magic numbers**: Hardcoded values throughout
- **Global state**: Variables used across 1000+ lines

**Refactoring approach**:
1. Extract one feature at a time
2. Write unit test first
3. Refactor to pass test
4. Integrate with main loader

### Testing Coordinate Transformations

```python
# Test data from legacy script
test_cases = [
    # CH1903 (system 6)
    {"east": 2700000, "north": 1250000, "system": 6, "expected_lat": ..., "expected_lon": ...},

    # UTM (system 7)
    {"east": 500000, "north": 5200000, "system": 7, "zone": "32N", "expected_lat": ..., "expected_lon": ...},

    # CH1903+ (system 10)
    {"east": 2700000, "north": 1250000, "system": 10, "expected_lat": ..., "expected_lon": ...},
]
```

### Database Schema Understanding

Key tables:
- `ELABDATAUPGEO`: Survey measurements
- `upgeo_mire`: Target points (miras)
- `upgeo_lavori`: Projects/jobs
- `upgeo_st`: Stations
- `sites`: Sites with coordinate system info
- `companies`: Company info with mira limits
- `alarms`: Alarm records

---

## 📊 Complexity Comparison

| Feature | Legacy | Refactored | Reduction |
|---------|--------|-----------|-----------|
| **Lines of code** | 2,587 | 508 (+TODO) | 80% |
| **Functions** | 5 (1 huge) | 10+ modular | +100% |
| **Max nesting** | 8 levels | 3 levels | 63% |
| **Type safety** | None | Full hints | ∞ |
| **Testability** | Impossible | Easy | ∞ |
| **Maintainability** | Very low | High | ∞ |

---

## 📚 References

### Coordinate Systems

- **CH1903**: https://www.swisstopo.admin.ch/en/knowledge-facts/surveying-geodesy/reference-systems/local/lv03.html
- **CH1903+/LV95**: https://www.swisstopo.admin.ch/en/knowledge-facts/surveying-geodesy/reference-systems/local/lv95.html
- **UTM**: https://en.wikipedia.org/wiki/Universal_Transverse_Mercator_coordinate_system

### Libraries Used

- **utm**: UTM <-> lat/lon conversions
- **pyproj**: Swiss coordinate system transformations (EPSG:21781 -> EPSG:4326)

---

## 🎯 Success Criteria

Phase 1 complete when:
- [ ] All CSV files process without errors
- [ ] Coordinate transformations match legacy output
- [ ] Miras are created/updated correctly
- [ ] Basic alarms are generated for threshold violations
- [ ] No regressions in data quality

Full refactoring complete when:
- [ ] All TODO items implemented
- [ ] Test coverage > 80%
- [ ] Performance >= legacy script
- [ ] All additional monitoring types work
- [ ] Legacy script can be retired

---

**Version**: 1.0 (Essential Refactoring)
**Last Updated**: 2024-10-11
**Status**: Ready for Phase 1 implementation
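
## Appendix: Displacement and Alarm Level Sketch

The displacement formulas in section 2.2 and the three-level threshold check in section 2.3 can be sketched as pure functions, which keeps them testable without a database. This is a minimal sketch under stated assumptions, not the legacy behaviour: `Position`, `calculate_displacements`, and `alarm_level` are hypothetical names, thresholds are assumed to compare against the absolute displacement, and a threshold of `None` is assumed to mean that level is disabled.

```python
import math
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Position:
    """Hypothetical container for one measurement of a mira."""
    north: float
    east: float
    height: float


def calculate_displacements(baseline: Position, current: Position) -> Dict[str, float]:
    """Return dN, dE, dH plus the 2D and 3D resultant displacements."""
    d_n = current.north - baseline.north
    d_e = current.east - baseline.east
    d_h = current.height - baseline.height
    return {
        "dN": d_n,
        "dE": d_e,
        "dH": d_h,
        "dR2D": math.hypot(d_n, d_e),        # sqrt(dN² + dE²)
        "dR3D": math.hypot(d_n, d_e, d_h),   # sqrt(dN² + dE² + dH²)
    }


def alarm_level(
    value: float,
    attention: Optional[float],
    intervention: Optional[float],
    immediate: Optional[float],
) -> int:
    """Return 0 (no alarm) or the highest level (1-3) whose threshold is exceeded.

    Assumption: the comparison uses abs(value), and None disables a level.
    """
    level = 0
    thresholds = (attention, intervention, immediate)
    for candidate, threshold in enumerate(thresholds, start=1):
        if threshold is not None and abs(value) > threshold:
            level = candidate
    return level


if __name__ == "__main__":
    baseline = Position(north=0.0, east=0.0, height=0.0)
    current = Position(north=3.0, east=4.0, height=12.0)
    displacements = calculate_displacements(baseline, current)
    print(displacements)
    print(alarm_level(displacements["dR2D"], attention=2.0, intervention=4.0, immediate=10.0))
```

Keeping the arithmetic separate from `_calculate_displacements()` would let the async method do only the database reads, and would let the 15 threshold checks share one `alarm_level` call per dimension instead of repeating the comparison.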