Commit Graph

6 Commits

Author SHA1 Message Date
fe2d173b0f Optimize consolidation fetching with GROUP BY and reduced limit
Changed consolidation_group_limit from 100k to 10k for faster queries.

Reverted to the GROUP BY approach for fetching consolidation keys:
- Uses the MySQL index (UnitName, ToolNameID, NodeNum, EventDate, EventTime) efficiently
- Including NodeNum in the GROUP BY ensures no key combination is lost
- GROUP BY queries run faster than large ORDER BY queries
- A smaller LIMIT makes each page faster to fetch

This matches the original optimization suggestion and should be faster than the ORDER BY approach.
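
The key-fetching idea described in this commit can be sketched roughly as follows. This is only an illustration, not the project's code: sqlite3 stands in for MySQL so the snippet runs anywhere, the function names are hypothetical, and the ORDER BY on the key columns is an assumption added here to keep LIMIT/OFFSET paging deterministic (the commit does not say how pages are ordered).

```python
import sqlite3

# Key columns as listed in the commit message's index description.
KEY_COLUMNS = ("UnitName", "ToolNameID", "NodeNum", "EventDate", "EventTime")


def fetch_consolidation_keys(conn, table, limit, offset=0):
    """Return one page of distinct consolidation keys (hypothetical helper)."""
    cols = ", ".join(KEY_COLUMNS)
    # GROUP BY collapses duplicate rows into one key per combination.
    # ORDER BY is an assumption of this sketch, used for stable paging.
    sql = (
        f"SELECT {cols} FROM {table} "
        f"GROUP BY {cols} ORDER BY {cols} LIMIT ? OFFSET ?"
    )
    return conn.execute(sql, (limit, offset)).fetchall()


def iter_key_pages(conn, table, limit):
    """Yield pages of keys until the table is exhausted."""
    offset = 0
    while True:
        page = fetch_consolidation_keys(conn, table, limit, offset)
        if not page:
            return
        yield page
        offset += len(page)
```

In this sketch the `limit` argument plays the role of consolidation_group_limit: a smaller value means more, but cheaper, queries.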

🤖 Generated with Claude Code

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2025-12-25 22:22:30 +01:00
bb27f749a0 Implement partition-based consolidation for ELABDATADISP
Changed consolidation strategy to leverage MySQL partitioning:
- Added get_table_partitions() to list all partitions
- Added fetch_consolidation_groups_from_partition() to read groups by consolidation key
- Each group (UnitName, ToolNameID, EventDate, EventTime) is fetched completely
- All nodes of same group are consolidated into single row with JSONB measurements
- Process partitions sequentially for predictable memory usage

Key benefits:
- Guaranteed complete consolidation (no fragmentation across batches)
- Deterministic behavior - same group always consolidated together
- Better memory efficiency with partition limits (100k groups per query)
- Clear audit trail of which partition each row came from

Tested with partition d3: 6960 input rows → 100 consolidated rows (69.6:1 ratio)
with groups containing 24-72 nodes each.
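
A minimal sketch of the consolidation step, assuming rows arrive as plain dicts: every row sharing (UnitName, ToolNameID, EventDate, EventTime) is merged into one output row whose measurements are keyed by NodeNum. The function name, the dict layout, and the use of a JSON string in place of a real JSONB column are assumptions made for illustration.

```python
import json
from collections import defaultdict

# Consolidation key as described in the commit message.
KEY_FIELDS = ("UnitName", "ToolNameID", "EventDate", "EventTime")


def consolidate_groups(rows):
    """Merge all node rows of each group into a single consolidated row."""
    groups = defaultdict(dict)
    for row in rows:
        key = tuple(row[field] for field in KEY_FIELDS)
        # Everything that is neither a key field nor NodeNum is treated
        # as a measurement of that node (an assumption of this sketch).
        groups[key][str(row["NodeNum"])] = {
            name: value
            for name, value in row.items()
            if name not in KEY_FIELDS and name != "NodeNum"
        }
    return [
        {
            **dict(zip(KEY_FIELDS, key)),
            # Serialized as JSON text; a real loader would hand this to JSONB.
            "measurements": json.dumps(nodes, sort_keys=True),
        }
        for key, nodes in groups.items()
    ]
```

Because each group is fetched completely before it is merged, a group is never split across two output rows, which is the fragmentation the commit aims to avoid.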

🤖 Generated with Claude Code

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2025-12-25 21:49:30 +01:00
b09cfcf9df fix: Add timeout settings and retry logic to MySQL connector
Configuration improvements:
- Set read_timeout=300 (5 minutes) to handle long queries
- Set write_timeout=300 (5 minutes) for writes
- Set max_allowed_packet=64MB to handle larger data transfers

Retry logic:
- Added retry mechanism with max 3 retries on fetch failure
- Auto-reconnect on connection loss before retry
- Better error messages showing retry attempts

This fixes the 'connection is lost' error that occurs during
long-running migrations by:
1. Giving MySQL queries more time to complete
2. Allowing larger packet sizes for bulk data
3. Automatically recovering from connection drops
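
The retry behaviour described above could look roughly like this. It is a sketch under stated assumptions: `fetch` and `reconnect` are hypothetical callables supplied by the caller, and the built-in ConnectionError stands in for whatever exception the real MySQL driver raises when the connection is lost.

```python
import time


def fetch_with_retry(fetch, reconnect, max_retries=3, delay=0.0):
    """Call fetch(); on connection loss, reconnect and retry up to max_retries."""
    attempt = 0
    while True:
        try:
            return fetch()
        except ConnectionError as exc:
            attempt += 1
            if attempt > max_retries:
                # All retries used up: surface a clear error to the caller.
                raise RuntimeError(
                    f"fetch failed after {max_retries} retries"
                ) from exc
            print(f"fetch failed ({exc}); retry {attempt}/{max_retries}")
            reconnect()        # re-establish the connection before retrying
            time.sleep(delay)  # optional pause between attempts
```

The timeout and packet-size settings themselves are connection options of the driver and are not shown here.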

Fixes: 'Connection is lost' error during full migration
2025-12-21 09:53:34 +01:00
e381618255 fix: Support both uppercase and lowercase table names in TABLE_CONFIGS
- TABLE_CONFIGS now accepts both 'RAWDATACOR' and 'rawdatacor' as keys
- TABLE_CONFIGS now accepts both 'ELABDATADISP' and 'elabdatadisp' as keys
- Reuse same config dict for both cases to avoid duplication

This allows FullMigrator to work correctly when initialized with
uppercase table names from the CLI while DataTransformer works
with lowercase names.
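
One way to accept both spellings without duplicating configuration is to register the same dict object under two keys, as sketched below. The config fields and the `get_table_config` helper are hypothetical; only the TABLE_CONFIGS name and the table names come from the commit message.

```python
# Hypothetical config contents; the real fields are not shown in the commit.
_RAWDATACOR_CONFIG = {"postgres_table": "rawdatacor"}
_ELABDATADISP_CONFIG = {"postgres_table": "elabdatadisp"}

# Both spellings point at the same dict object, so there is no duplication.
TABLE_CONFIGS = {
    "RAWDATACOR": _RAWDATACOR_CONFIG,
    "rawdatacor": _RAWDATACOR_CONFIG,
    "ELABDATADISP": _ELABDATADISP_CONFIG,
    "elabdatadisp": _ELABDATADISP_CONFIG,
}


def get_table_config(name):
    """Look up a table config, accepting upper- or lowercase names."""
    try:
        return TABLE_CONFIGS[name]
    except KeyError:
        raise ValueError(f"Unknown table: {name}") from None
```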

Fixes: 'Unknown table: RAWDATACOR' error during migration
2025-12-10 20:28:19 +01:00
410b253808 fix: Update Pydantic v2 configuration for .env loading
- Fix ConfigDict model_config for Pydantic v2.12+ compatibility
- Add env_file and env_file_encoding to all config classes
- Each config class now properly loads from .env with correct prefix

Fixes: ValidationError when loading settings from .env file
CLI now works correctly with 'uv run python main.py'
2025-12-10 20:11:12 +01:00
62577d3200 feat: Add MySQL to PostgreSQL migration tool with JSONB transformation
Implement comprehensive migration solution with:
- Full and incremental migration modes
- JSONB schema transformation for RAWDATACOR and ELABDATADISP tables
- Native PostgreSQL partitioning (2014-2031)
- Optimized GIN indexes for JSONB queries
- Rich logging with progress tracking
- Complete benchmark system for MySQL vs PostgreSQL comparison
- CLI with multiple commands (setup, migrate, benchmark)
- Configuration management via .env file
- Error handling and retry logic
- Batch processing for performance (configurable batch size)

Database transformations:
- RAWDATACOR: 16 Val columns + units → single JSONB measurements
- ELABDATADISP: 25+ measurement fields → structured JSONB with categories
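
As a rough illustration of the RAWDATACOR transformation, the sketch below folds 16 value/unit column pairs into one JSON document. The column names `Val0`..`Val15` and `Unit0`..`Unit15`, the output layout, and the function name are all assumptions; the commit only states that 16 Val columns plus units become a single JSONB measurements field.

```python
import json

VALUE_COUNT = 16  # number of Val columns, per the commit message


def to_measurements(row):
    """Fold Val/Unit column pairs of one row into a JSON measurements string."""
    measurements = {}
    for i in range(VALUE_COUNT):
        value = row.get(f"Val{i}")  # hypothetical column name
        if value is None:
            continue  # skip empty slots so the JSON stays compact
        measurements[str(i)] = {
            "value": value,
            "unit": row.get(f"Unit{i}"),  # hypothetical column name
        }
    # Serialized as JSON text; PostgreSQL would store it in a JSONB column.
    return json.dumps(measurements, sort_keys=True)
```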

🤖 Generated with Claude Code

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2025-12-10 19:57:11 +01:00