feat: Implement node consolidation for ELABDATADISP table

Add consolidation logic to ELABDATADISP similar to RAWDATACOR:
- Group rows by (unit_name, tool_name_id, event_timestamp)
- Consolidate all nodes that share the same key into a single row
- Store node_num, state, calc_err in JSONB measurements keyed by node number

Changes:
1. Add _build_measurement_for_elabdatadisp_node() helper
   - Builds measurement object with state, calc_err, and measurement categories
   - Filters out empty categories to save space

2. Update transform_elabdatadisp_row() signature
   - Accept optional measurements parameter for consolidated rows
   - Build from single row if measurements not provided
   - Remove node_num, state, calc_err from returned columns (now in JSONB)
   - Keep only: id_elab_data, unit_name, tool_name_id, event_timestamp, measurements, created_at

3. Add consolidate_elabdatadisp_batch() method
   - Group rows by consolidation key
   - Build consolidated measurements with node numbers as keys
   - Use MAX(idElabData) of each group for checkpoint tracking (resume capability)
   - Use the row with MIN(idElabData) as the template for other fields

4. Update transform_batch() to support ELABDATADISP consolidation
   - Check consolidate flag for both tables
   - Call consolidate_elabdatadisp_batch() when needed
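
A minimal sketch of the grouping and consolidation steps listed above, not
the committed implementation. The function names build_node_measurement and
consolidate_batch, the input row keys (unit_name, tool_name_id,
event_timestamp, node_num, state, calc_err), and the "categories" layout are
all assumptions made for illustration; only idElabData and the output column
names come from this message. Storing MAX(idElabData) as id_elab_data is
also an assumption.

```python
from collections import defaultdict
from typing import Any

Row = dict[str, Any]


def build_node_measurement(row: Row) -> Row:
    """Build one node's JSONB entry: state, calc_err, non-empty categories."""
    measurement: Row = {
        "state": row.get("state"),
        "calc_err": row.get("calc_err", 0),
    }
    # Hypothetical layout: row["categories"] maps category name -> field dict.
    for category, fields in row.get("categories", {}).items():
        non_null = {k: v for k, v in fields.items() if v is not None}
        if non_null:  # drop empty categories to save space
            measurement[category] = non_null
    return measurement


def consolidate_batch(rows: list[Row]) -> list[Row]:
    """Merge rows sharing (unit_name, tool_name_id, event_timestamp)."""
    groups: dict[tuple, list[Row]] = defaultdict(list)
    for row in rows:
        key = (row["unit_name"], row["tool_name_id"], row["event_timestamp"])
        groups[key].append(row)

    consolidated: list[Row] = []
    for group in groups.values():
        # Row with the smallest source id is the template for shared fields.
        template = min(group, key=lambda r: r["idElabData"])
        consolidated.append({
            # Largest source id in the group, kept for checkpoint/resume.
            "id_elab_data": max(r["idElabData"] for r in group),
            "unit_name": template["unit_name"],
            "tool_name_id": template["tool_name_id"],
            "event_timestamp": template["event_timestamp"],
            # JSON object keys are strings, so node numbers are stringified.
            "measurements": {
                str(r["node_num"]): build_node_measurement(r) for r in group
            },
        })
    return consolidated
```

Keeping the largest source id per group means a resumed run can continue
from the last written id without re-reading rows that were already merged.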

Result: ELABDATADISP now consolidates at roughly 5-10 source rows per
output row, like RAWDATACOR, with all per-node data (node_num, state,
calc_err, measurements) keyed by node number in JSONB.
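
To illustrate the resulting layout, the sketch below expands one
consolidated row back into per-node records. It assumes the measurements
object uses stringified node numbers as keys and that each value carries
state and calc_err; iter_node_records is a hypothetical helper, not part of
this commit.

```python
from typing import Any, Iterator


def iter_node_records(row: dict[str, Any]) -> Iterator[dict[str, Any]]:
    """Yield one flat record per node stored in row["measurements"]."""
    # Sort numerically so node "10" comes after node "2".
    for node_key in sorted(row["measurements"], key=int):
        yield {
            "unit_name": row["unit_name"],
            "tool_name_id": row["tool_name_id"],
            "event_timestamp": row["event_timestamp"],
            "node_num": int(node_key),
            **row["measurements"][node_key],
        }
```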

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2025-12-25 18:41:54 +01:00
parent 4d72d2a42e
commit 693228c0da
2 changed files with 133 additions and 36 deletions


@@ -82,17 +82,14 @@ def create_elabdatadisp_schema() -> str:
 CREATE SEQUENCE IF NOT EXISTS elabdatadisp_id_seq;
 -- Create ELABDATADISP table with partitioning
+-- Note: node_num, state, and calc_err are stored in measurements JSONB, not as separate columns
 CREATE TABLE IF NOT EXISTS elabdatadisp (
 id_elab_data BIGINT NOT NULL DEFAULT nextval('elabdatadisp_id_seq'),
 unit_name VARCHAR(32),
 tool_name_id VARCHAR(32) NOT NULL,
-node_num INTEGER NOT NULL,
 event_timestamp TIMESTAMP NOT NULL,
-state VARCHAR(32),
-calc_err INTEGER DEFAULT 0,
 measurements JSONB,
-created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
-updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
+created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
 ) PARTITION BY RANGE (EXTRACT(YEAR FROM event_timestamp));
 -- Note: PostgreSQL doesn't support PRIMARY KEY or UNIQUE constraints
@@ -119,8 +116,8 @@ CREATE TABLE IF NOT EXISTS elabdatadisp_default
 # Add indexes
 sql += """
 -- Create indexes
-CREATE INDEX IF NOT EXISTS idx_unit_tool_node_datetime_elab
-ON elabdatadisp(unit_name, tool_name_id, node_num, event_timestamp);
+CREATE INDEX IF NOT EXISTS idx_unit_tool_datetime_elab
+ON elabdatadisp(unit_name, tool_name_id, event_timestamp);
 CREATE INDEX IF NOT EXISTS idx_unit_tool_elab
 ON elabdatadisp(unit_name, tool_name_id);