Sensor Data Processing System - Python Migration

Complete Python implementation of MATLAB sensor data processing modules for geotechnical monitoring systems.

Overview

This system processes data from various sensor types used in geotechnical monitoring:

  • RSN: Rockfall Safety Network sensors
  • Tilt: Inclinometers and tiltmeters
  • ATD: Extensometers, crackmeters, and displacement sensors

Data is loaded from a MySQL database, processed through a multi-stage pipeline (conversion, averaging, elaboration), and written back to the database.

Architecture

src/
├── main.py                 # Main orchestration script
├── common/                 # Shared utilities
│   ├── database.py        # Database connection management
│   ├── config.py          # Configuration and calibration loading
│   ├── logging_utils.py   # Logging setup
│   └── validators.py      # Data validation functions
├── rsn/                   # RSN module (COMPLETE)
│   ├── main.py           # RSN orchestration
│   ├── data_processing.py # Load and structure data
│   ├── conversion.py     # Raw to physical units
│   ├── averaging.py      # Gaussian smoothing
│   ├── elaboration.py    # Calculate angles and differentials
│   └── db_write.py       # Write to database
├── tilt/                  # Tilt module (COMPLETE)
│   ├── main.py           # Tilt orchestration
│   ├── data_processing.py # Load TLHR, BL, PL, KLHR data
│   ├── conversion.py     # Calibration application
│   ├── averaging.py      # Gaussian smoothing
│   ├── elaboration.py    # 3D displacement calculations
│   ├── db_write.py       # Write to database
│   └── geometry.py       # Geometric transformations
└── atd/                   # ATD module (COMPLETE)
    ├── main.py           # ATD orchestration
    ├── data_processing.py # Load RL, LL data
    ├── conversion.py     # Calibration and unit conversion
    ├── averaging.py      # Gaussian smoothing
    ├── elaboration.py    # Position calculations (star algorithm)
    └── db_write.py       # Write to database

Completion Status

RSN Module (100% Complete)

  • Data loading from RawDataView table
  • Conversion with calibration (gain/offset)
  • Gaussian smoothing (scipy)
  • Angle calculations and validations
  • Differential from reference files
  • Database write with ON DUPLICATE KEY UPDATE
  • Sensor types: RSN Link, RSN HR, Load Link, Trigger Link, Shock Sensor

Tilt Module (100% Complete)

  • Data loading for all tilt types
  • Conversion with XY common/separate gains
  • Gaussian smoothing
  • 3D displacement calculations
  • Global and local coordinates
  • Differential from reference files
  • Geometric functions (arot, asse_a/b, quaternions)
  • Database write for all types
  • Sensor types: TLHR, BL, PL, KLHR

ATD Module (100% Complete) 🎉

  • RL (Radial Link) - 3D acceleration + magnetometer
    • Data loading
    • Conversion with temperature compensation
    • Gaussian smoothing
    • Position calculation (star algorithm)
    • Database write
  • LL (Load Link) - Force sensors
    • Data loading
    • Conversion
    • Gaussian smoothing
    • Differential calculation
    • Database write
  • PL (Pressure Link)
    • Full pipeline implementation
    • Pressure measurement and differentials
  • 3DEL (3D Extensometer)
    • Full pipeline implementation
    • 3D displacement measurement (X, Y, Z)
    • Differentials from reference files
  • CrL/2DCrL/3DCrL (Crackmeters)
    • Full pipeline for 1D, 2D, and 3D crackmeters
    • Displacement measurement and differentials
  • PCL/PCLHR (Perimeter Cable Link)
    • Biaxial calculations (Y, Z axes)
    • Fixed bottom or fixed top configurations
    • Cumulative and local displacements
    • Roll and inclination angles
    • Reference-based differentials
  • TuL (Tube Link)
    • 3D biaxial calculations with correlation
    • Clockwise and counterclockwise computation
    • Y-axis correlation using Z angles
    • Node correction for incorrectly mounted sensors
    • Dual-direction differential averaging

Common Modules (100% Complete)

  • Database connection with context managers
  • Configuration and calibration loading
  • MATLAB-compatible logging
  • Temperature validation
  • Despiking (median filter)
  • Acceleration checks

Orchestration (100% Complete)

  • Main entry point (src/main.py)
  • Single chain processing
  • Multiple chain processing (sequential/parallel)
  • Auto sensor type detection
  • Multiprocessing support

Installation

Requirements

pip install numpy scipy mysql-connector-python pandas openpyxl python-dotenv

Or use uv (recommended):

uv sync

Python Version

Requires Python 3.9 or higher.

Database Configuration

  1. Copy the .env.example file to .env:

    cp .env.example .env
    
  2. Edit .env with your database credentials:

    DB_HOST=your_database_host
    DB_PORT=3306
    DB_NAME=your_database_name
    DB_USER=your_username
    DB_PASSWORD=your_password
    
  3. IMPORTANT: Never commit the .env file to version control! It's already in .gitignore.

Note: The old DB.txt configuration format (with Java JDBC driver) is deprecated. The Python implementation uses native MySQL connectors and doesn't require Java drivers.
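
As a minimal sketch of how the DB_* variables above might be read, the following uses only os.environ. The names EnvDbSettings and load_db_settings are hypothetical and are not the project's DatabaseConfig; if python-dotenv is installed, calling its load_dotenv() beforehand copies the values from .env into the environment.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvDbSettings:
    """Hypothetical settings holder (not the project's DatabaseConfig)."""
    host: str
    port: int
    name: str
    user: str
    password: str


def load_db_settings(env=None):
    """Build settings from a mapping of environment variables.

    Falls back to os.environ when no mapping is given. DB_PORT is optional
    and defaults to 3306, the value used in .env.example above.
    """
    env = os.environ if env is None else env
    required = ("DB_HOST", "DB_NAME", "DB_USER", "DB_PASSWORD")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise KeyError(f"missing database settings: {', '.join(missing)}")
    return EnvDbSettings(
        host=env["DB_HOST"],
        port=int(env.get("DB_PORT", "3306")),
        name=env["DB_NAME"],
        user=env["DB_USER"],
        password=env["DB_PASSWORD"],
    )
```

Failing early on a missing variable keeps a misconfigured .env from surfacing later as an opaque connection error.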

Usage

Single Chain Processing

Process a single chain with auto-detection:

python -m src.main CU001 A

Process with specific sensor type:

python -m src.main CU001 A --type rsn
python -m src.main CU002 B --type tilt
python -m src.main CU003 C --type atd

Multiple Chains

Sequential processing:

python -m src.main CU001 A CU001 B CU002 A

Parallel processing (faster for multiple chains):

python -m src.main CU001 A CU001 B CU002 A --parallel

With custom worker count:

python -m src.main CU001 A CU001 B CU002 A --parallel --workers 4

Mixed sensor types:

python -m src.main CU001 A rsn CU001 B tilt CU002 A atd --parallel
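
The positional arguments above can be read as repeated "unit chain [type]" groups. The sketch below shows one way such a list could be grouped. It is an illustration only: group_chain_args and KNOWN_TYPES are hypothetical names, and the actual parsing in src/main.py may differ.

```python
# Sensor type keywords taken from the --type examples above.
KNOWN_TYPES = {"rsn", "tilt", "atd"}


def group_chain_args(tokens):
    """Group tokens into (unit, chain, sensor_type) tuples.

    sensor_type is None when no type keyword follows the chain,
    which would correspond to auto-detection.
    """
    specs = []
    i = 0
    while i < len(tokens):
        if i + 1 >= len(tokens):
            raise ValueError(f"control unit {tokens[i]!r} has no chain")
        unit, chain = tokens[i], tokens[i + 1]
        i += 2
        sensor_type = None
        # An optional type keyword may follow each unit/chain pair.
        if i < len(tokens) and tokens[i].lower() in KNOWN_TYPES:
            sensor_type = tokens[i].lower()
            i += 1
        specs.append((unit, chain, sensor_type))
    return specs
```

For example, group_chain_args("CU001 A rsn CU001 B tilt".split()) yields two tuples, one per chain, each carrying its sensor type.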

Module-Specific Processing

Run individual modules:

# RSN module
python -m src.rsn.main CU001 A

# Tilt module
python -m src.tilt.main CU002 B

# ATD module
python -m src.atd.main CU003 C

Database Configuration

As described under Installation, connection settings are read from the .env file; the same variables can also be set directly in the environment:

DB_HOST=localhost
DB_PORT=3306
DB_NAME=sensor_data
DB_USER=your_username
DB_PASSWORD=your_password

Alternatively, modify src/common/database.py directly, but do not commit real credentials.

Data Pipeline

Each module follows the same 6-stage pipeline:

  1. Load: Query RawDataView table from MySQL
  2. Define: Structure data, handle NaN, despike, validate
  3. Convert: Apply calibration (gain * raw + offset)
  4. Average: Gaussian smoothing for noise reduction
  5. Elaborate: Calculate physical quantities (angles, displacements, forces)
  6. Write: Insert/update database with ON DUPLICATE KEY UPDATE

Key Technical Features

Data Processing

  • NumPy arrays: Efficient array operations
  • Gaussian smoothing: scipy.ndimage.gaussian_filter1d (sigma = n_points / 6)
  • Despiking: scipy.signal.medfilt for outlier removal
  • Forward fill: Out-of-range temperature readings are replaced with the last valid value
  • Scale wrapping: Handle ±32768 overflow in tilt sensors
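
The smoothing and despiking steps listed above can be sketched with the named SciPy functions. The helper names are hypothetical, kernel_size=5 is taken from the migration table later in this document, and n_points is assumed to be the smoothing window length. The unwrap_int16 helper is only one possible reading of "scale wrapping" (removing 16-bit wrap-around jumps), not necessarily what the tilt module does.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d
from scipy.signal import medfilt


def despike(series, kernel_size=5):
    """Median-filter a 1-D series to suppress isolated outliers."""
    return medfilt(np.asarray(series, dtype=float), kernel_size=kernel_size)


def smooth(series, n_points):
    """Gaussian smoothing with sigma = n_points / 6, as stated above."""
    return gaussian_filter1d(np.asarray(series, dtype=float), sigma=n_points / 6)


def unwrap_int16(raw):
    """Remove 16-bit wrap-around jumps (assumed meaning of scale wrapping).

    A step of more than 32768 counts between consecutive samples is
    treated as a wrap and compensated by one full range (65536 counts).
    """
    raw = np.asarray(raw, dtype=float)
    steps = np.diff(raw)
    shift = np.where(steps > 32768, -65536.0, np.where(steps < -32768, 65536.0, 0.0))
    return np.concatenate(([raw[0]], raw[1:] + np.cumsum(shift)))
```

Note that medfilt pads the ends of the series with zeros, so the first and last few samples of a short series may be biased toward zero.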

Database

  • Connection management: Context managers open and close connections safely
  • Batch writes: Efficient INSERT with ON DUPLICATE KEY UPDATE
  • Transactions: Automatic commit/rollback
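
A minimal sketch of the commit/rollback and batch-write pattern described above, written against the generic Python DB-API so it does not depend on a live server. The names transaction and build_upsert_sql are hypothetical, and the table and column names in the usage note are placeholders rather than the project's schema.

```python
from contextlib import contextmanager


@contextmanager
def transaction(conn):
    """Yield a cursor; commit on success, roll back on any exception."""
    cursor = conn.cursor()
    try:
        yield cursor
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        cursor.close()


def build_upsert_sql(table, columns, update_columns):
    """Build a MySQL INSERT ... ON DUPLICATE KEY UPDATE statement.

    Uses %s placeholders, the parameter style of mysql-connector-python.
    Identifiers are interpolated directly, so they must come from trusted
    code, never from user input.
    """
    cols = ", ".join(columns)
    placeholders = ", ".join(["%s"] * len(columns))
    updates = ", ".join(f"{c} = VALUES({c})" for c in update_columns)
    return (
        f"INSERT INTO {table} ({cols}) VALUES ({placeholders}) "
        f"ON DUPLICATE KEY UPDATE {updates}"
    )
```

With a real connection, a batch write could look like: with transaction(conn) as cur: cur.executemany(build_upsert_sql("results", ["node", "ts", "value"], ["value"]), rows). The VALUES() form is deprecated in recent MySQL 8 releases in favor of row aliases, but is still accepted.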

Calibration

  • Linear transformations: physical = raw * gain + offset
  • Temperature compensation: acc = raw * gain + (temp * coeff + offset)
  • Common/separate gains: Flexible XY gain handling for tilt sensors
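
The two calibration formulas above translate directly to NumPy. The function names are hypothetical, and the gain, offset, and coefficient values would come from the calibration data loaded by the project.

```python
import numpy as np


def linear_calibration(raw, gain, offset):
    """physical = raw * gain + offset"""
    return np.asarray(raw, dtype=float) * gain + offset


def temperature_compensated(raw, gain, temp, coeff, offset):
    """acc = raw * gain + (temp * coeff + offset)"""
    raw = np.asarray(raw, dtype=float)
    temp = np.asarray(temp, dtype=float)
    return raw * gain + (temp * coeff + offset)
```

Both functions broadcast, so gain and offset may be scalars or per-sensor arrays matching the shape of raw.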

Geometry (Tilt)

  • 3D transformations: Rotation matrices, quaternions
  • Biaxial calculations: asse_a, asse_b for sensor geometry
  • Local/global coordinates: Coordinate system transformations
  • Differentials: Relative measurements from reference files
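
As an illustration of the quaternion-based transformations mentioned above, the sketch below rotates a vector from a local frame into a global frame using a unit quaternion in (w, x, y, z) order. This is a generic textbook formulation with hypothetical function names; it is not the project's arot, asse_a, or asse_b.

```python
import numpy as np


def quat_from_axis_angle(axis, angle_rad):
    """Unit quaternion (w, x, y, z) for a rotation about the given axis."""
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    half = angle_rad / 2.0
    return np.concatenate(([np.cos(half)], np.sin(half) * axis))


def rotate_by_quat(q, v):
    """Rotate vector v by unit quaternion q.

    Uses v' = v + 2w(u x v) + 2 u x (u x v), where u is the vector part
    of q and w its scalar part.
    """
    q = np.asarray(q, dtype=float)
    v = np.asarray(v, dtype=float)
    w, u = q[0], q[1:]
    t = 2.0 * np.cross(u, v)
    return v + w * t + np.cross(u, t)
```

For instance, a quaternion built for a 90 degree rotation about the z axis maps the local x axis onto the global y axis.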

Star Algorithm (ATD)

  • Chain networks: Position calculation for connected sensors
  • Clockwise/counterclockwise: Bidirectional calculation with weighting
  • Known points: Fixed reference points for closed chains
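
The idea of computing a chain in both directions and weighting the results can be illustrated with a simple linear blend between two known end points. This is an assumption-laden sketch, not the project's star algorithm: the function name, the linear weights, and the representation of each link as a displacement vector are all hypothetical.

```python
import numpy as np


def blend_bidirectional(start, end, segments):
    """Blend forward and backward node positions along a chain.

    start, end: known positions of the first and last node.
    segments:   (n, d) array of displacement vectors between nodes.
    Returns an (n + 1, d) array of node positions. The forward solution
    is trusted fully at the start node and the backward solution at the
    end node, with linear weights in between.
    """
    start = np.asarray(start, dtype=float)
    end = np.asarray(end, dtype=float)
    segments = np.asarray(segments, dtype=float)
    n = len(segments)
    # Forward pass: accumulate segments from the known start point.
    forward = start + np.vstack([np.zeros_like(start), np.cumsum(segments, axis=0)])
    # Backward pass: subtract the remaining segments from the known end point.
    remaining = np.cumsum(segments[::-1], axis=0)[::-1]
    backward = end - np.vstack([remaining, np.zeros_like(end)])
    w = np.linspace(1.0, 0.0, n + 1)[:, None]
    return w * forward + (1.0 - w) * backward
```

When the segments close exactly on the end point, both passes agree and the blend changes nothing; otherwise the misclosure is spread linearly along the chain.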

Performance

  • Single chain: ~2-10 seconds depending on data volume
  • Parallel processing: Near-linear speedup with the number of workers, since chains are processed independently (bounded by CPU cores and database throughput)
  • Memory efficient: Streaming database queries, NumPy arrays
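
Because each chain is processed independently, sequential and parallel runs can share one code path. The sketch below uses the standard library's ProcessPoolExecutor; process_chains and the worker callable are hypothetical names, and the project's orchestration in src/main.py may be structured differently.

```python
from concurrent.futures import ProcessPoolExecutor


def process_chains(chains, worker, parallel=False, workers=None):
    """Run worker(unit, chain) for every chain and return results in order.

    chains:  iterable of (control_unit_id, chain) pairs.
    worker:  callable; for parallel runs it must be picklable, which in
             practice means a function defined at module level.
    workers: pool size; None lets the executor choose a default.
    """
    chains = list(chains)
    if not parallel:
        return [worker(unit, chain) for unit, chain in chains]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(worker, unit, chain) for unit, chain in chains]
        # result() re-raises any exception raised inside a worker process.
        return [future.result() for future in futures]
```

On platforms that start worker processes with spawn (Windows, macOS), the calling script needs an if __name__ == "__main__": guard around the entry point.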

Error Handling

  • Error flags: 0 = valid, 0.5 = corrected, 1 = invalid
  • Temperature validation: Forward fill for out-of-range values
  • Missing data: NaN handling with interpolation
  • Database errors: Automatic rollback and logging
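
The error flags and the forward fill described above can be combined in one pass. In this sketch the valid range (t_min, t_max) and the rule that a reading with no earlier valid value is flagged 1 are assumptions, and validate_temperature is a hypothetical name.

```python
import numpy as np


def validate_temperature(temps, t_min=-30.0, t_max=80.0):
    """Forward-fill invalid temperatures and return (values, flags).

    Flags follow the convention above: 0 = valid, 0.5 = corrected,
    1 = invalid. NaN and out-of-range readings are replaced by the last
    valid reading; if none exists yet, the value stays NaN and is
    flagged invalid.
    """
    values = np.asarray(temps, dtype=float).copy()
    flags = np.zeros(len(values))
    bad = np.isnan(values) | (values < t_min) | (values > t_max)
    last_valid = None
    for i in range(len(values)):
        if not bad[i]:
            last_valid = values[i]
        elif last_valid is not None:
            values[i] = last_valid
            flags[i] = 0.5
        else:
            values[i] = np.nan
            flags[i] = 1.0
    return values, flags
```

Returning the flags alongside the values lets later stages decide whether corrected samples should be used or excluded.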

Logging

Logs are written to:

  • Console: INFO level
  • File: logs/{control_unit_id}_{chain}_{module}_{timestamp}.log

Log format:

2025-10-13 14:30:15 - RSN - INFO - Processing RSN Link sensors
2025-10-13 14:30:17 - RSN - INFO - Loading raw data: 1500 records
2025-10-13 14:30:18 - RSN - INFO - Conversion completed
2025-10-13 14:30:19 - RSN - INFO - Elaboration completed
2025-10-13 14:30:20 - RSN - INFO - Database write: 1500 records
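
A logger producing lines in the format shown above can be configured with the standard logging module. This is a sketch under assumptions: setup_logger and log_file_path are hypothetical names (not the functions in logging_utils.py), and the timestamp layout in the file name is a guess, since only the {timestamp} placeholder is documented.

```python
import logging
from pathlib import Path

LOG_FORMAT = "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def log_file_path(control_unit_id, chain, module, now):
    """logs/{control_unit_id}_{chain}_{module}_{timestamp}.log

    The YYYYMMDD_HHMMSS timestamp layout is an assumption.
    """
    stamp = now.strftime("%Y%m%d_%H%M%S")
    return Path("logs") / f"{control_unit_id}_{chain}_{module}_{stamp}.log"


def setup_logger(name, log_path=None):
    """Return a logger that writes INFO and above to console and file."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if logger.handlers:
        # Already configured; avoid attaching duplicate handlers.
        return logger
    formatter = logging.Formatter(LOG_FORMAT, datefmt=DATE_FORMAT)
    console = logging.StreamHandler()
    console.setFormatter(formatter)
    logger.addHandler(console)
    if log_path is not None:
        Path(log_path).parent.mkdir(parents=True, exist_ok=True)
        file_handler = logging.FileHandler(log_path)
        file_handler.setFormatter(formatter)
        logger.addHandler(file_handler)
    return logger
```

Using the module name (for example "RSN") as the logger name is what places it in the second field of each log line.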

Validation

Python vs MATLAB Output Comparison

The system includes comprehensive validation tools to verify that the Python implementation produces equivalent results to the original MATLAB code.

Quick Start

Validate all sensors for a chain:

python -m src.validation.cli CU001 A

Validate specific sensor type:

python -m src.validation.cli CU001 A --type rsn
python -m src.validation.cli CU001 A --type tilt --tilt-subtype TLHR
python -m src.validation.cli CU001 A --type atd-rl

Validation Workflow

  1. Run MATLAB processing on your data first (if not already done)
  2. Run Python processing on the same raw data:
    python -m src.main CU001 A
    
  3. Run validation to compare outputs:
    python -m src.validation.cli CU001 A --output validation_report.txt
    

Advanced Usage

Compare specific dates (useful if MATLAB and Python run at different times):

python -m src.validation.cli CU001 A \
    --matlab-date 2025-10-12 \
    --python-date 2025-10-13

Custom tolerance thresholds:

python -m src.validation.cli CU001 A \
    --abs-tol 1e-8 \
    --rel-tol 1e-6 \
    --max-rel-tol 0.001

Include passing comparisons in report:

python -m src.validation.cli CU001 A --include-equivalent

Validation Metrics

The validator compares:

  • Max absolute difference: Largest absolute error between values
  • Max relative difference: Largest relative error (as percentage)
  • RMSE: Root mean square error across all values
  • Correlation: Pearson correlation coefficient
  • Data ranges: Min/max values from both implementations

Tolerance Levels

Default tolerances:

  • Absolute tolerance: 1e-6 (0.000001)
  • Relative tolerance: 1e-4 (0.01%)
  • Max acceptable relative difference: 0.01 (1%)

Results are classified as:

  • IDENTICAL: Exact match (bit-for-bit)
  • EQUIVALENT: Within tolerance (acceptable)
  • DIFFERENT: Exceeds tolerance (needs investigation)
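
The metrics and the three-way classification can be sketched as follows. How the validator actually combines its tolerances is not documented here, so this example assumes the NumPy allclose rule, |matlab - python| <= abs_tol + rel_tol * |matlab|, and leaves out the separate maximum relative tolerance. compare_outputs is a hypothetical name.

```python
import numpy as np


def compare_outputs(matlab, python, abs_tol=1e-6, rel_tol=1e-4):
    """Compare two equally shaped 1-D result arrays and classify them."""
    a = np.asarray(matlab, dtype=float)
    b = np.asarray(python, dtype=float)
    diff = np.abs(a - b)
    # Relative error is only defined where the reference value is non-zero.
    nonzero = a != 0
    max_rel = float((diff[nonzero] / np.abs(a[nonzero])).max()) if nonzero.any() else 0.0
    if np.array_equal(a, b):
        status = "IDENTICAL"
    elif np.allclose(b, a, rtol=rel_tol, atol=abs_tol):
        status = "EQUIVALENT"
    else:
        status = "DIFFERENT"
    return {
        "status": status,
        "max_abs_diff": float(diff.max()),
        "max_rel_diff": max_rel,  # fraction; multiply by 100 for percent
        "rmse": float(np.sqrt(np.mean(diff ** 2))),
        "correlation": float(np.corrcoef(a, b)[0, 1]),
    }
```

The correlation is undefined (NaN) for constant series, so a report generator would need to handle that case separately.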

Example Report

================================================================================
VALIDATION REPORT: Python vs MATLAB Output Comparison
================================================================================

SUMMARY:
  ✓ Identical:  2
  ✓ Equivalent: 8
  ✗ Different:  0
  ? Missing (MATLAB): 0
  ? Missing (Python): 0
  ! Errors: 0

✓✓✓ VALIDATION PASSED ✓✓✓

--------------------------------------------------------------------------------
DETAILED RESULTS:
--------------------------------------------------------------------------------

✓ X: EQUIVALENT (within tolerance)
  Max abs diff: 3.45e-07
  Max rel diff: 0.0023%
  RMSE: 1.12e-07
  Correlation: 0.999998

✓ Y: EQUIVALENT (within tolerance)
  Max abs diff: 2.89e-07
  Max rel diff: 0.0019%
  RMSE: 9.34e-08
  Correlation: 0.999999

Supported Sensor Types

Validation is available for all implemented sensor types:

  • RSN (Rockfall Safety Network)
  • Tilt (TLHR, BL, PL, KLHR)
  • ATD Radial Link (RL)
  • ATD Load Link (LL)
  • ATD Pressure Link (PL)
  • ATD 3D Extensometer (3DEL)
  • ATD Crackmeters (CrL, 2DCrL, 3DCrL)
  • ATD Perimeter Cable Link (PCL, PCLHR)
  • ATD Tube Link (TuL)

Testing

Run basic tests:

# Test database connection
python -c "from src.common.database import DatabaseConfig, DatabaseConnection; \
           conn = DatabaseConnection(DatabaseConfig()); print('DB OK')"

# Test single chain
python -m src.main TEST001 A --type rsn

Migration from MATLAB

Key differences from MATLAB code:

MATLAB                             Python
smoothdata(data, 'gaussian', N)    gaussian_filter1d(data, sigma=N/6)
filloutliers(data, 'linear')       medfilt(data, kernel_size=5)
xlsread(file, sheet)               pd.read_excel(file, sheet_name=sheet)
datestr(date, 'yyyy-mm-dd')        date.strftime('%Y-%m-%d')
fastinsert(conn, ...)              INSERT ... ON DUPLICATE KEY UPDATE

Future Work

Remaining ATD sensor types to implement:

  • WEL (Wire Extensometer)
  • SM (Settlement Marker)

Additional features:

  • Report generation (PDF/HTML)
  • Threshold checking and alerts
  • Web dashboard
  • REST API

Compatibility

This Python implementation is designed to be a complete replacement for the MATLAB modules in:

  • ATD/ (extensometers)
  • RSN/ (rockfall network)
  • Tilt/ (inclinometers)

It produces results equivalent to the MATLAB code, within the validation tolerances described above, while offering:

  • Better performance (NumPy/SciPy)
  • No MATLAB license required
  • Easier deployment (pip install)
  • Better error handling
  • Parallel processing support
  • Modern Python type hints

License

[Your License Here]

Contact

[Your Contact Info Here]
