Critical bug: current_group and current_key were initialized inside the
while loop, so they were reset on every batch iteration. When an incomplete
group spanned a batch boundary, it was:
1. Buffered at end of batch N (in local current_group)
2. LOST when the loop continued (variables re-initialized at the top of the loop)
3. Re-fetched and yielded again in batch N+1
This caused the same consolidated record to be inserted many times.
Solution: Move current_group and current_key OUTSIDE the while loop so they
persist across batch iterations. Incomplete groups now merge correctly
across batch boundaries without duplication.
Algorithm:
- Only yield groups when we're 100% certain they're complete
- A group is complete when the next key differs from current key
- At batch boundaries, incomplete groups stay buffered for next batch
- Resume always uses last_completed_key to avoid re-processing
This fixes the user's observation of 27 identical rows for the same
consolidated record.
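A minimal sketch of the grouping logic described above, not the actual
patched code. It assumes rows arrive as (key, value) pairs already sorted by
key, and that batches come from an in-memory iterable; the function name
consolidate_batches and the row shape are hypothetical. The resume path via
last_completed_key is not modeled here, and flushing the final group at end
of input is an assumption (no further rows can extend it).

```python
def consolidate_batches(batches):
    """Yield (key, values) for each group, merging groups that span batches.

    Hypothetical sketch: `batches` is an iterable of lists of (key, value)
    rows, sorted by key across the whole stream.
    """
    # State is initialized OUTSIDE the batch loop so that a partially built
    # group survives the transition from one batch to the next.
    current_key = None
    current_group = []

    batch_iter = iter(batches)
    while True:
        batch = next(batch_iter, None)
        if batch is None:
            break  # no more batches to fetch
        for key, value in batch:
            # A group is complete only once a different key is seen.
            if current_group and key != current_key:
                yield current_key, current_group
                current_group = []
            current_key = key
            current_group.append(value)
        # End of batch: nothing is yielded here; the incomplete group
        # stays buffered for the next iteration.

    # Assumption: at end of input the buffered group is complete.
    if current_group:
        yield current_key, current_group


# Group "b" spans the boundary between the two batches.
batches = [
    [("a", 1), ("a", 2), ("b", 3)],
    [("b", 4), ("c", 5)],
]
print(list(consolidate_batches(batches)))
# → [('a', [1, 2]), ('b', [3, 4]), ('c', [5])]
```

If current_key and current_group were instead assigned inside the while
loop, the partial "b" group from the first batch would be discarded at the
boundary, which is the failure mode this commit addresses.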
🤖 Generated with Claude Code
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>