mysql2postgres/test_generator_output.py
alex 79cd4f4559 fix: Fix duplicate group insertion in consolidation generator
Critical bug: current_group and current_key were initialized inside the
while loop, so they were reset on every batch iteration. When an incomplete
group spanned a batch boundary, it was:
1. Buffered at the end of batch N (in current_group)
2. Lost when the loop continued (the variables were re-initialized)
3. Re-fetched and yielded again in batch N+1

This caused the same consolidated record to be inserted many times.

Solution: Move current_group and current_key outside the while loop so they
persist across batch iterations. Incomplete groups now merge correctly
across batch boundaries without duplication.

Algorithm:
- Only yield groups when we're 100% certain they're complete
- A group is complete when the next key differs from current key
- At batch boundaries, incomplete groups stay buffered for next batch
- Resume always uses last_completed_key to avoid re-processing
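
The rules above can be illustrated with a minimal sketch. It is not the
repository's actual generator: fetch_batch, the (key, value) row shape, and
offset-based paging are assumptions made for illustration, rows are assumed
to arrive sorted by key, and resume handling via last_completed_key is left
out.

```python
from typing import Callable, Iterator, List, Optional, Tuple

Row = Tuple[str, int]  # hypothetical row shape: (group key, value)


def consolidate(
    fetch_batch: Callable[[int, int], List[Row]],  # hypothetical fetcher
    batch_size: int,
) -> Iterator[Tuple[str, List[int]]]:
    # State lives OUTSIDE the while loop so it survives batch boundaries.
    current_key: Optional[str] = None
    current_group: List[int] = []
    offset = 0

    while True:
        rows = fetch_batch(offset, batch_size)
        if not rows:
            break
        offset += len(rows)

        for key, value in rows:
            if current_key is not None and key != current_key:
                # The key changed, so the buffered group is complete.
                yield current_key, current_group
                current_group = []
            current_key = key
            current_group.append(value)
        # No yield here: the last group may continue in the next batch.

    # Input is exhausted, so the buffered group can no longer grow.
    if current_key is not None:
        yield current_key, current_group


# Example data: group "b" spans the boundary between batches 2 and 3.
rows = [("a", 1), ("a", 2), ("b", 3), ("b", 4), ("b", 5), ("c", 6)]


def fetch_batch(offset: int, size: int) -> List[Row]:
    return rows[offset:offset + size]


result = list(consolidate(fetch_batch, batch_size=2))
```

With batch_size=2, group "b" is split across two batches but is yielded
once, with all three values, because the buffer is only flushed when the key
changes or the input ends.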

This fixes the user's observation of 27 identical rows for the same
consolidated record.

🤖 Generated with Claude Code

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2025-12-27 10:26:39 +01:00
