clean docs
78
scripts/README.md
Normal file
@@ -0,0 +1,78 @@
# Migration Scripts

Utility scripts for managing the migration.

## sync_migration_state.py

Synchronizes the `migration_state` table with the data actually present in PostgreSQL.

### When to use

Use this script when `migration_state` is out of sync with the actual data, for example:

- After manual inserts into PostgreSQL
- After the state has been corrupted
- Before running an incremental migration on data that already exists

### How it works

For each table (`rawdatacor`, `elabdatadisp`):

1. Finds the row with `MAX(created_at)`, i.e. the most recently inserted row
2. Extracts the consolidation key from that row
3. Updates `migration_state._global` with that key

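The three steps above can be sketched in isolation. The following is only an illustration, not the script itself: it uses an in-memory SQLite database instead of PostgreSQL, the helper name `latest_consolidation_key` and the sample rows are hypothetical, and the table is reduced to the four columns the key needs.

```python
import sqlite3

KNOWN_TABLES = {"rawdatacor", "elabdatadisp"}


def latest_consolidation_key(conn, table_name):
    """Return the consolidation key of the most recently inserted row, or None."""
    # The table name is interpolated into the SQL, so only accept known tables.
    if table_name not in KNOWN_TABLES:
        raise ValueError(f"unexpected table: {table_name}")
    row = conn.execute(
        f"""
        SELECT unit_name, tool_name_id,
               date(event_timestamp), time(event_timestamp)
        FROM {table_name}
        ORDER BY created_at DESC
        LIMIT 1
        """
    ).fetchone()
    if row is None:
        return None
    keys = ("unit_name", "tool_name_id", "event_date", "event_time")
    return dict(zip(keys, row))


# Hypothetical sample data: two rows, the second inserted later.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rawdatacor ("
    "unit_name TEXT, tool_name_id TEXT, event_timestamp TEXT, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO rawdatacor VALUES (?, ?, ?, ?)",
    [
        ("ID0001", "DT0001", "2025-12-30 09:00:00", "2025-12-30 09:01:00"),
        ("ID0304", "DT0024", "2025-12-30 11:11:39", "2025-12-30 11:13:29"),
    ],
)
print(latest_consolidation_key(conn, "rawdatacor"))
```

The returned dictionary has the same four fields the script stores as `last_key`; writing it to `migration_state` is left to the project's own `StateManager`.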
### Usage

```bash
# Run from the project root
python scripts/sync_migration_state.py
```

### Output

```
Syncing migration_state with actual PostgreSQL data...
================================================================================

ELABDATADISP:
  Most recently inserted row (by created_at):
    created_at: 2025-12-30 11:58:24
    event_timestamp: 2025-12-30 14:58:24
    Consolidation key: (ID0290, DT0007, 2025-12-30, 14:58:24)
  ✓ Updated migration_state with this key

RAWDATACOR:
  Most recently inserted row (by created_at):
    created_at: 2025-12-30 11:13:29
    event_timestamp: 2025-12-30 11:11:39
    Consolidation key: (ID0304, DT0024, 2025-12-30, 11:11:39)
  ✓ Updated migration_state with this key

================================================================================
✓ Done! Incremental migration will now start from the correct position.
```

### Effects

After running this script:

- `migration_state._global` is updated with the last migrated key
- `python main.py migrate incremental` starts from the correct position
- No duplicates are created (inserts use `ON CONFLICT DO NOTHING`)

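To illustrate why `ON CONFLICT DO NOTHING` prevents duplicates when a row is migrated twice, here is a minimal sketch. It runs against in-memory SQLite (which accepts the same clause) rather than PostgreSQL, and the `target` table and its unique constraint are assumptions made for the example, not the project's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical target table with a unique constraint on the row identity.
conn.execute(
    "CREATE TABLE target ("
    "unit_name TEXT, tool_name_id TEXT, event_timestamp TEXT, "
    "UNIQUE (unit_name, tool_name_id, event_timestamp))"
)


def insert_row(conn, row):
    """Insert a row; if the unique key already exists, skip it silently."""
    conn.execute("INSERT INTO target VALUES (?, ?, ?) ON CONFLICT DO NOTHING", row)


row = ("ID0304", "DT0024", "2025-12-30 11:11:39")
insert_row(conn, row)
insert_row(conn, row)  # Re-running the same insert neither fails nor duplicates.

total = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
print(total)
```

The clause only has an effect when the table has a unique or primary-key constraint covering the conflicting columns; without one, the second insert would simply add a second row.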
### Caveats

- Automatically excludes corrupted data (`unit_name` values such as `[Ljava.lang.String;@...`)
- Uses `created_at`, not `event_timestamp`, to find the most recently inserted row
- Overwrites the existing global state

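The corrupted-data exclusion corresponds to the SQL filter `unit_name NOT LIKE '[L%'` used by the script. A rough Python equivalent, useful for checking values outside the database, might look like the sketch below; the function name and the sample values (including the suffix after `@`) are made up for illustration.

```python
def is_corrupted_unit_name(unit_name: str) -> bool:
    """Mirror the SQL filter `unit_name NOT LIKE '[L%'` on the Python side."""
    # A stringified Java array starts with "[L", e.g. "[Ljava.lang.String;@...".
    return unit_name.startswith("[L")


# Hypothetical sample values; the middle one imitates a corrupted entry.
names = ["ID0290", "[Ljava.lang.String;@1b2c3d", "ID0304"]
valid = [name for name in names if not is_corrupted_unit_name(name)]
print(valid)
```

Note that the check is a plain prefix match, so any legitimate `unit_name` that happened to start with `[L` would be excluded as well.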
### Verification

After running the script, check the state:

```sql
SELECT table_name, partition_name, last_key
FROM migration_state
WHERE partition_name = '_global'
ORDER BY table_name;
```

It should show the most recent keys for both tables.
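The same check can be automated. The sketch below is an assumption-laden example, not part of the project: `fetch_global_state` and `missing_tables` are hypothetical helpers, they accept any DB-API connection, and the demo uses in-memory SQLite with a simplified `migration_state` table whose `last_key` is stored as JSON text.

```python
import json
import sqlite3

EXPECTED_TABLES = ("elabdatadisp", "rawdatacor")


def fetch_global_state(conn):
    """Return {table_name: last_key} for the rows of the `_global` partition."""
    cur = conn.cursor()
    cur.execute(
        "SELECT table_name, last_key FROM migration_state "
        "WHERE partition_name = '_global' ORDER BY table_name"
    )
    return dict(cur.fetchall())


def missing_tables(state):
    """List the expected tables that have no global key recorded."""
    return [table for table in EXPECTED_TABLES if not state.get(table)]


# Demo with a simplified, hypothetical schema: only one table has been synced.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE migration_state (table_name TEXT, partition_name TEXT, last_key TEXT)"
)
conn.execute(
    "INSERT INTO migration_state VALUES (?, ?, ?)",
    ("rawdatacor", "_global", json.dumps({"unit_name": "ID0304"})),
)
state = fetch_global_state(conn)
print(missing_tables(state))
```

An empty list from `missing_tables` would mean both tables have a global key, which is the expected result after a successful sync.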
63
scripts/sync_migration_state.py
Executable file
@@ -0,0 +1,63 @@
#!/usr/bin/env python3
"""Sync migration_state with actual data in PostgreSQL tables."""

import sys
from pathlib import Path

# Make the project root importable wherever the repository is checked out
# (this file lives in scripts/, one level below the root).
sys.path.insert(0, str(Path(__file__).resolve().parent.parent))

from src.connectors.postgres_connector import PostgreSQLConnector
from src.migrator.state_manager import StateManager


def sync_table_state(table_name: str) -> None:
    """Sync migration_state for a table with its actual data."""
    with PostgreSQLConnector() as pg_conn:
        cursor = pg_conn.connection.cursor()

        # Find the row with MAX(created_at) - most recently inserted
        # Exclude corrupted data (Java strings)
        # Note: table_name is interpolated into the SQL, so pass trusted names only.
        cursor.execute(f"""
            SELECT unit_name, tool_name_id,
                   DATE(event_timestamp)::text as event_date,
                   event_timestamp::time::text as event_time,
                   created_at,
                   event_timestamp
            FROM {table_name}
            WHERE unit_name NOT LIKE '[L%'  -- Exclude corrupted Java strings
            ORDER BY created_at DESC
            LIMIT 1
        """)

        result = cursor.fetchone()
        if not result:
            print(f"No data found in {table_name}")
            return

        unit_name, tool_name_id, event_date, event_time, created_at, event_timestamp = result

        print(f"\n{table_name.upper()}:")
        print("  Most recently inserted row (by created_at):")
        print(f"    created_at: {created_at}")
        print(f"    event_timestamp: {event_timestamp}")
        print(f"    Consolidation key: ({unit_name}, {tool_name_id}, {event_date}, {event_time})")

        # Update global migration_state with this key
        state_mgr = StateManager(pg_conn, table_name, partition_name="_global")

        last_key = {
            "unit_name": unit_name,
            "tool_name_id": tool_name_id,
            "event_date": event_date,
            "event_time": event_time,
        }

        state_mgr.update_state(last_key=last_key)
        print("  ✓ Updated migration_state with this key")


if __name__ == "__main__":
    print("Syncing migration_state with actual PostgreSQL data...")
    print("=" * 80)

    sync_table_state("elabdatadisp")
    sync_table_state("rawdatacor")

    print("\n" + "=" * 80)
    print("✓ Done! Incremental migration will now start from the correct position.")