fix: Order rows by consolidation key to keep related nodes together in batches
When fetching rows for consolidation, the original keyset pagination ordered only by id, which caused nodes from the same (unit, tool, timestamp) to be split across multiple batches. This resulted in incomplete consolidation, with some nodes being missed.

Solution: order by the consolidation columns in addition to id:
- Primary: id (for keyset pagination)
- Secondary: UnitName, ToolNameID, EventDate, EventTime, NodeNum

This ensures all nodes with the same (unit, tool, timestamp) are grouped together in the same batch, allowing proper consolidation within the batch.

Fixes: nodes being lost during ELABDATADISP consolidation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
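For context, a minimal sketch of how a caller might iterate these batches and consolidate each group within a batch. The names `fetch_batch` and `consolidate_group`, and the dict-style rows, are assumptions for illustration and are not part of this change:

```python
# Hypothetical sketch (not part of this commit) of consuming the patched
# batch fetch. Rows are assumed to be dicts (e.g. from a DictCursor).
from itertools import groupby

CONSOLIDATION_KEY = ("UnitName", "ToolNameID", "EventDate", "EventTime")

def consolidate_group(key, nodes):
    """Placeholder for the actual ELABDATADISP consolidation step."""
    ...

def consolidate_all(connector, table, id_column="id", batch_size=1000):
    last_id = None
    while True:
        # Assumed signature mirroring the method changed below.
        rows = connector.fetch_batch(table, id_column, last_id, batch_size)
        if not rows:
            break
        # groupby merges consecutive rows sharing the consolidation key; the
        # combined ORDER BY added in this commit is what is relied on to keep
        # such rows adjacent within a batch.
        key_fn = lambda row: tuple(row[col] for col in CONSOLIDATION_KEY)
        for key, group in groupby(rows, key=key_fn):
            consolidate_group(key, list(group))
        last_id = rows[-1][id_column]
```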
@@ -257,11 +257,15 @@ class MySQLConnector:
         with self.connection.cursor() as cursor:
             # Use keyset pagination: fetch by id > last_id
             # This is much more efficient than OFFSET for large tables
+            # Order by id first for pagination, then by consolidation key to keep
+            # related nodes together in the same batch
+            order_clause = f"`{id_column}` ASC, `UnitName` ASC, `ToolNameID` ASC, `EventDate` ASC, `EventTime` ASC, `NodeNum` ASC"
+
             if last_id is None:
-                query = f"SELECT * FROM `{table}` ORDER BY `{id_column}` ASC LIMIT %s"
+                query = f"SELECT * FROM `{table}` ORDER BY {order_clause} LIMIT %s"
                 cursor.execute(query, (batch_size,))
             else:
-                query = f"SELECT * FROM `{table}` WHERE `{id_column}` > %s ORDER BY `{id_column}` ASC LIMIT %s"
+                query = f"SELECT * FROM `{table}` WHERE `{id_column}` > %s ORDER BY {order_clause} LIMIT %s"
                 cursor.execute(query, (last_id, batch_size))

             rows = cursor.fetchall()