mysql2postgres

Files

alex 1430ef206f fix: Ensure complete node consolidation by ordering MySQL query by consolidation key

Root cause: Nodes 1-11 had IDs in the 132M+ range while nodes 12-22 had IDs in the
298-308 range, so keyset pagination by ID fetched them in batches thousands apart.
The two sets arrived as separate groups and were never unified into a single
consolidated row.

Solution: Order MySQL query by (UnitName, ToolNameID, EventDate, EventTime) instead
of by ID. This guarantees all rows for the same consolidation key arrive together,
ensuring they are grouped and consolidated into a single row with JSONB measurements
keyed by node number.
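
A minimal sketch of the grouping idea described above, assuming rows arrive as
dicts already sorted by the consolidation key. The column names NodeNum and Val0
and the function name are hypothetical, chosen only for illustration; the actual
code in this repository may differ.

```python
from itertools import groupby
from operator import itemgetter

# Consolidation key columns, as named in the commit message.
KEY_FIELDS = ("UnitName", "ToolNameID", "EventDate", "EventTime")


def consolidate_sorted_rows(rows):
    """Yield one consolidated dict per consolidation key.

    Relies on rows being sorted by KEY_FIELDS: groupby only merges
    consecutive rows that share a key, which is exactly what the
    ORDER BY guarantees.
    """
    key_of = itemgetter(*KEY_FIELDS)
    for key, group in groupby(rows, key=key_of):
        measurements = {}
        for row in group:
            # NodeNum / Val0 are hypothetical column names.
            measurements[str(row["NodeNum"])] = {"value": row["Val0"]}
        consolidated = dict(zip(KEY_FIELDS, key))
        consolidated["measurements"] = measurements  # stored as JSONB downstream
        yield consolidated
```

If the same rows were ordered by ID instead, rows sharing a key could be
non-adjacent and groupby would emit several partial groups for one key, which
matches the failure described in the root cause.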

Changes:
- fetch_consolidation_groups_from_partition(): Change from keyset pagination by ID
  to ORDER BY consolidation key. Simplify grouping logic, since ORDER BY already
  ensures that consecutive rows share the same key.
- full_migration.py: Add cleanup of partial partitions on resume. When resuming and a
  partition was started but not completed, delete its incomplete data before
  re-processing to avoid duplicates. Also recalculate total_rows_migrated from actual
  database count.
- config.py: Add postgres_pk field to TABLE_CONFIGS to specify correct primary key
  column names in PostgreSQL (id vs id_elab_data).
- Cleanup: Remove temporary test scripts used during debugging

Performance note: ORDER BY consolidation key needs an index to be fast. An index on
(UnitName, ToolNameID, EventDate, EventTime) was created with ALGORITHM=INPLACE
LOCK=NONE so that concurrent reads and writes are not blocked during the build.
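
For reference, a statement of that shape could be built as in the sketch below.
The table and index names passed in are placeholders, since the commit message
does not name them, and the helper itself is hypothetical.

```python
# Consolidation key columns, as named in the commit message.
KEY_COLUMNS = ("UnitName", "ToolNameID", "EventDate", "EventTime")


def build_online_index_ddl(table, index_name, columns=KEY_COLUMNS):
    """Return a MySQL ALTER TABLE statement that adds the index online."""
    cols = ", ".join(f"`{c}`" for c in columns)
    return (
        f"ALTER TABLE `{table}` ADD INDEX `{index_name}` ({cols}), "
        "ALGORITHM=INPLACE, LOCK=NONE"
    )
```

With ALGORITHM=INPLACE and LOCK=NONE, MySQL rejects the statement with an error
if it cannot honor the request, rather than silently falling back to a blocking
table copy.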

🤖 Generated with Claude Code

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

2025-12-26 18:22:23 +01:00

benchmark

fix: Add timeout settings and retry logic to MySQL connector

2025-12-21 09:53:34 +01:00

connectors

fix: Ensure complete node consolidation by ordering MySQL query by consolidation key

2025-12-26 18:22:23 +01:00

migrator

fix: Ensure complete node consolidation by ordering MySQL query by consolidation key

2025-12-26 18:22:23 +01:00

transformers

cleanup: Remove unnecessary debug logging from consolidation logic

2025-12-26 14:55:22 +01:00

utils

chore: Revert throughput reporting feature from progress tracker

2025-12-23 16:47:10 +01:00

__init__.py

feat: Add MySQL to PostgreSQL migration tool with JSONB transformation

2025-12-10 19:57:11 +01:00