Add comprehensive validation system and migrate to .env configuration

This commit includes:

1. Database Configuration Migration:
   - Migrated from DB.txt (Java JDBC) to .env (python-dotenv)
   - Added .env.example template with clear variable names
   - Updated database.py to use environment variables
   - Added python-dotenv>=1.0.0 to dependencies
   - Updated .gitignore to exclude sensitive files
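
A minimal sketch of how `database.py` might read its settings after this migration. The variable names (`DB_HOST`, `DB_PORT`, `DB_NAME`, `DB_USER`, `DB_PASSWORD`), the `DbConfig` class, and the `load_db_config` helper are hypothetical and not taken from the commit. With python-dotenv, a call to `load_dotenv()` would first copy the `.env` entries into the process environment; the sketch only covers the step after that and uses the standard library alone.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class DbConfig:
    """Connection settings (hypothetical field names)."""
    host: str
    port: int
    name: str
    user: str
    password: str


def load_db_config(env: Mapping[str, str] = os.environ) -> DbConfig:
    """Build a DbConfig from environment variables, failing on missing ones."""
    # Assumed variable names; the real .env.example may use different ones.
    required = ("DB_HOST", "DB_PORT", "DB_NAME", "DB_USER", "DB_PASSWORD")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise RuntimeError("Missing database settings: " + ", ".join(missing))
    return DbConfig(
        host=env["DB_HOST"],
        port=int(env["DB_PORT"]),
        name=env["DB_NAME"],
        user=env["DB_USER"],
        password=env["DB_PASSWORD"],
    )
```

Taking the environment as a parameter keeps the function easy to test with a plain dictionary instead of the real process environment.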

2. Validation System (1,294 lines):
   - comparator.py: Statistical comparison with RMSE, correlation, tolerances
   - db_extractor.py: Database queries for all sensor types
   - validator.py: High-level validation orchestration
   - cli.py: Command-line interface for validation
   - README.md: Comprehensive validation documentation

3. Validation Features:
   - Compare Python vs MATLAB outputs from database
   - Support for all sensor types (RSN, Tilt, ATD)
   - Statistical metrics: maximum absolute and relative difference, RMSE, correlation
   - Configurable tolerances (abs, rel, max)
   - Detailed validation reports
   - CLI and programmatic APIs
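
The metrics listed above can be sketched as follows. This is an illustration under stated assumptions, not the code in `comparator.py`: the function name `compare_series`, the default tolerances, and the pass rule (within the absolute tolerance or within the relative tolerance) are all hypothetical.

```python
import math
from typing import Dict, Sequence, Union


def compare_series(
    python_values: Sequence[float],
    matlab_values: Sequence[float],
    abs_tol: float = 1e-6,   # assumed default, not from the commit
    rel_tol: float = 1e-4,   # assumed default, not from the commit
) -> Dict[str, Union[float, bool]]:
    """Compare two equal-length series, using MATLAB as the reference."""
    if not python_values or len(python_values) != len(matlab_values):
        raise ValueError("series must be non-empty and of equal length")
    n = len(python_values)
    diffs = [p - m for p, m in zip(python_values, matlab_values)]

    max_abs = max(abs(d) for d in diffs)
    # Relative difference is undefined where the reference is zero; skip those.
    rels = [abs(d) / abs(m) for d, m in zip(diffs, matlab_values) if m != 0]
    max_rel = max(rels, default=0.0)
    rmse = math.sqrt(sum(d * d for d in diffs) / n)

    # Pearson correlation, computed by hand to stay within the stdlib.
    mean_p = sum(python_values) / n
    mean_m = sum(matlab_values) / n
    cov = sum((p - mean_p) * (m - mean_m)
              for p, m in zip(python_values, matlab_values))
    var_p = sum((p - mean_p) ** 2 for p in python_values)
    var_m = sum((m - mean_m) ** 2 for m in matlab_values)
    if var_p > 0 and var_m > 0:
        correlation = cov / math.sqrt(var_p * var_m)
    else:
        correlation = float("nan")  # undefined for a constant series

    return {
        "max_abs_diff": max_abs,
        "max_rel_diff": max_rel,
        "rmse": rmse,
        "correlation": correlation,
        # Assumed rule: pass if either tolerance is satisfied.
        "passed": max_abs <= abs_tol or max_rel <= rel_tol,
    }
```

A call such as `compare_series(python_out, matlab_out, abs_tol=1e-3)` would return all four metrics together with the pass flag, which is enough to build one row of a validation report.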

4. Examples and Documentation:
   - validate_example.sh: Bash script example
   - validate_example.py: Python programmatic example
   - Updated main README with validation section
   - Added validation workflow and troubleshooting guide

Benefits:
- No Java driver needed (native Python connectors)
- Secure .env configuration (excluded from git)
- Comprehensive validation against MATLAB
- Statistical confidence in migration accuracy
- Automated validation reports

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-13 15:34:13 +02:00
parent 876ef073fc
commit 23c53cf747
25 changed files with 7476 additions and 83 deletions

validate_example.sh (new executable file, 61 lines)

@@ -0,0 +1,61 @@
#!/bin/bash
# Example validation script
# Demonstrates how to run Python processing and validate against MATLAB output
set -e # Exit on error
# Configuration
CONTROL_UNIT="CU001"
CHAIN="A"
OUTPUT_DIR="validation_reports"
DATE=$(date +%Y-%m-%d_%H-%M-%S)
echo "========================================"
echo "Python vs MATLAB Validation Script"
echo "========================================"
echo "Control Unit: $CONTROL_UNIT"
echo "Chain: $CHAIN"
echo "Date: $DATE"
echo ""
# Create output directory
mkdir -p "$OUTPUT_DIR"
# Step 1: Run Python processing
echo "Step 1: Running Python processing..."
python -m src.main "$CONTROL_UNIT" "$CHAIN"
echo "✓ Python processing complete"
echo ""
# Give the database a moment to commit before reading the results back
sleep 2
# Step 2: Run validation for all sensor types
echo "Step 2: Running validation..."
REPORT_FILE="$OUTPUT_DIR/${CONTROL_UNIT}_${CHAIN}_validation_${DATE}.txt"
python -m src.validation.cli "$CONTROL_UNIT" "$CHAIN" \
    --output "$REPORT_FILE" \
    --include-equivalent
echo "✓ Validation complete"
echo ""
# Step 3: Display summary
echo "========================================"
echo "Validation Summary"
echo "========================================"
cat "$REPORT_FILE"
echo ""
echo "Full report saved to: $REPORT_FILE"
# Check if validation passed
if grep -q "VALIDATION PASSED" "$REPORT_FILE"; then
    echo "✓✓✓ SUCCESS: Python output matches MATLAB ✓✓✓"
    exit 0
else
    echo "✗✗✗ WARNING: Validation detected differences ✗✗✗"
    echo "Please review the report above for details."
    exit 1
fi
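
The final pass/fail check of the script can also be done from Python. The sketch below is not the `validate_example.py` added by this commit (its contents are not shown here); the helper name `report_passed` is hypothetical, and the only detail taken from the script is the `VALIDATION PASSED` marker it greps for.

```python
from pathlib import Path

# Same marker string the shell script searches for with grep.
PASS_MARKER = "VALIDATION PASSED"


def report_passed(report_path) -> bool:
    """Return True if the validation report contains the pass marker."""
    text = Path(report_path).read_text(encoding="utf-8")
    return PASS_MARKER in text
```

A caller could use it as `sys.exit(0 if report_passed(path) else 1)` to reproduce the script's exit codes.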