2025-12-30 15:33:32 +01:00
parent bcedae40fc
commit 03e39eb925
3 changed files with 60 additions and 10 deletions


@@ -19,6 +19,10 @@
- **Configuration**: Added `PROGRESS_LOG_INTERVAL` to control logging frequency
- **Configuration**: Added `BENCHMARK_OUTPUT_DIR` to specify benchmark results directory
- **Documentation**: Updated README.md, MIGRATION_WORKFLOW.md, QUICKSTART.md, and EXAMPLE_WORKFLOW.md to reflect the current implementation
- **Documentation**: Corrected index and partitioning documentation to reflect the actual PostgreSQL schema:
  - Uses `event_timestamp` (not separate `event_date`/`event_time`)
  - Primary key includes `event_year` for partitioning
  - Consolidation key is `UNIQUE (unit_name, tool_name_id, event_timestamp, event_year)`
### Removed
- **migration_state.json**: Replaced by PostgreSQL table


@@ -338,6 +338,33 @@ CREATE TABLE rawdatacor_2024 PARTITION OF rawdatacor
PostgreSQL automatically routes INSERTs to the correct partition based on `event_year`.
### Indexes in PostgreSQL
Both tables get the following constraints and indexes, created automatically:
**Primary Key** (must include the partition key on partitioned tables):
```sql
-- Must include partition key (event_year)
UNIQUE (id, event_year)
```
**Consolidation Key** (prevents duplicates):
```sql
-- Ensures one record per consolidation group
UNIQUE (unit_name, tool_name_id, event_timestamp, event_year)
```
**Query Optimization**:
```sql
-- Fast filtering by unit/tool
(unit_name, tool_name_id)
-- JSONB queries with GIN index
GIN (measurements)
```
**Note**: All indexes are automatically created on all partitions when you run `setup --create-schema`.
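To illustrate how the consolidation key prevents duplicates, here is a minimal sketch. It makes several assumptions: an in-memory SQLite database stands in for PostgreSQL, the column types and sample rows are invented for the example, and `ON CONFLICT ... DO NOTHING` is just one possible way to handle a conflicting insert (PostgreSQL accepts the same clause), not necessarily what the migration tool does.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; column types and sample
# values are illustrative, only the key definitions follow the docs above.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE rawdatacor (
        id INTEGER,
        unit_name TEXT NOT NULL,
        tool_name_id TEXT NOT NULL,
        event_timestamp TEXT NOT NULL,
        event_year INTEGER NOT NULL,
        measurements TEXT,
        UNIQUE (id, event_year),
        UNIQUE (unit_name, tool_name_id, event_timestamp, event_year)
    )
    """
)

# Rows that collide on the consolidation key are silently skipped.
INSERT_SQL = """
    INSERT INTO rawdatacor
        (id, unit_name, tool_name_id, event_timestamp, event_year, measurements)
    VALUES (?, ?, ?, ?, ?, ?)
    ON CONFLICT (unit_name, tool_name_id, event_timestamp, event_year) DO NOTHING
"""

rows = [
    (1, "unit_a", "tool_1", "2024-03-01 10:00:00", 2024, '{"0": 1.5}'),
    # Same unit, tool, timestamp and year as row 1: rejected as a duplicate.
    (2, "unit_a", "tool_1", "2024-03-01 10:00:00", 2024, '{"0": 9.9}'),
    (3, "unit_a", "tool_1", "2024-03-01 10:05:00", 2024, '{"0": 2.0}'),
]
conn.executemany(INSERT_SQL, rows)
conn.commit()

stored_ids = [r[0] for r in conn.execute("SELECT id FROM rawdatacor ORDER BY id")]
print(stored_ids)  # [1, 3]
```

Only the first row of each consolidation group is stored, so re-running the same batch leaves the table unchanged.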
---
## Summary


@@ -254,16 +254,25 @@ LIMIT 1000;
## Partitioning
Both tables are partitioned by year (RANGE partitioning on `EXTRACT(YEAR FROM event_date)`):
Both tables are partitioned by year using the `event_year` column:
```sql
-- Partitions created automatically for:
-- rawdatacor_2014, rawdatacor_2015, ..., rawdatacor_2031
-- elabdatadisp_2014, elabdatadisp_2015, ..., elabdatadisp_2031
-- Partitioning based on event_year (computed from event_timestamp at insert time)
CREATE TABLE rawdatacor_2024 PARTITION OF rawdatacor
FOR VALUES FROM (2024) TO (2025);
-- Partitioned query (automatic partition pruning)
SELECT * FROM rawdatacor
WHERE event_date >= '2024-01-01' AND event_date < '2024-12-31';
WHERE event_year = 2024;
-- PostgreSQL scans only rawdatacor_2024
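-- Or by event_timestamp: keep the event_year predicate as well, because
-- pruning works on the partition key (event_year), not on event_timestamp
SELECT * FROM rawdatacor
WHERE event_year = 2024
  AND event_timestamp >= '2024-01-01' AND event_timestamp < '2025-01-01';
-- PostgreSQL scans only rawdatacor_2024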
```
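Since `event_year` is computed from `event_timestamp` at insert time, client code needs a consistent way to derive it, and queries that filter on a timestamp range benefit from also constraining `event_year`, because PostgreSQL prunes partitions based on predicates over the partition key. The sketch below shows one way to do both. The helper names (`event_year_for`, `partition_name`, `years_in_range`) are hypothetical and not part of the project, and it assumes naive `datetime` values in a single time zone.

```python
from datetime import datetime


def event_year_for(event_timestamp: datetime) -> int:
    """Derive the partition key from the event timestamp."""
    return event_timestamp.year


def partition_name(table: str, event_timestamp: datetime) -> str:
    """Name of the yearly partition a row lands in, e.g. rawdatacor_2024."""
    return f"{table}_{event_year_for(event_timestamp)}"


def years_in_range(start: datetime, end: datetime) -> list[int]:
    """Years touched by the half-open range [start, end)."""
    if end <= start:
        return []
    # An end bound of exactly January 1st 00:00 does not touch that year.
    last = end.year if end > datetime(end.year, 1, 1) else end.year - 1
    return list(range(start.year, last + 1))


print(partition_name("rawdatacor", datetime(2024, 6, 15, 12, 30)))
print(years_in_range(datetime(2024, 1, 1), datetime(2025, 1, 1)))
```

The list returned by `years_in_range` can be passed as an extra `event_year IN (...)` or `event_year = ...` predicate alongside the `event_timestamp` range.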
@@ -271,18 +280,28 @@ WHERE event_date >= '2024-01-01' AND event_date < '2024-12-31';
### RAWDATACOR
```sql
idx_unit_tool_node_datetime -- (unit_name, tool_name_id, node_num, event_date, event_time)
idx_unit_tool -- (unit_name, tool_name_id)
idx_measurements_gin -- GIN index on measurements JSONB
idx_event_date -- (event_date)
-- Primary key (must include the partition key on partitioned tables)
rawdatacor_pkey -- UNIQUE (id, event_year)
-- Consolidation key (prevents duplicates)
rawdatacor_consolidation_key_unique -- UNIQUE (unit_name, tool_name_id, event_timestamp, event_year)
-- Query optimization
idx_rawdatacor_unit_tool -- (unit_name, tool_name_id)
idx_rawdatacor_measurements_gin -- GIN (measurements) for JSONB queries
```
### ELABDATADISP
```sql
idx_unit_tool_node_datetime -- (unit_name, tool_name_id, node_num, event_date, event_time)
idx_unit_tool -- (unit_name, tool_name_id)
idx_measurements_gin -- GIN index on measurements JSONB
idx_event_date -- (event_date)
-- Primary key (must include the partition key on partitioned tables)
elabdatadisp_pkey -- UNIQUE (id, event_year)
-- Consolidation key (prevents duplicates)
elabdatadisp_consolidation_key_unique -- UNIQUE (unit_name, tool_name_id, event_timestamp, event_year)
-- Query optimization
idx_elabdatadisp_unit_tool -- (unit_name, tool_name_id)
idx_elabdatadisp_measurements_gin -- GIN (measurements) for JSONB queries
```
## Benchmark