PostgreSQL Telemetry Database

PostgreSQL can be used as a telemetry database. Time-series telemetry data is stored in partitioned tables.

Activation

To enable PostgreSQL as a telemetry database, set the Terraform variable use_pgsql_for_telemetries to true when deploying the infrastructure. When deploying through the pipelines, this is done by setting the GitLab CI/CD variable USE_PGSQL_FOR_TELEMETRIES to true. Enabling it creates the infrastructure components required for the feature.

Database Hosting

It is up to the customer to choose where to host the PostgreSQL database for telemetries. Options include:

  • The same PostgreSQL server as the one used for Kamea's Management API
  • A separate PostgreSQL server dedicated to telemetries

The choice depends on scalability considerations. Using a separate server allows scaling telemetry storage independently from the API database, which can be beneficial for high-volume telemetry workloads.

By default, when using the provided pipelines and Terraform configuration, a new database is created in the same PostgreSQL server that hosts the Management API database.

Additionally, the following Terraform variables must be provided:

  • pgsql_telemetries_db_name: Name of the PostgreSQL database (default: "telemetries")
  • pgsql_telemetries_db_ca_b64: Base64-encoded CA certificate for the PostgreSQL database connection. This can be set via the GitLab environment variable PGSQL_TELEMETRIES_DB_CA_PATH, which should contain the path to the PEM certificate file. In the GitLab UI, create the variable with type File and paste the certificate content into it. The pipeline retrieves the file and base64-encodes it automatically before passing it to Terraform.
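For reference, the base64 conversion the pipeline performs is equivalent to the following Python sketch (the helper name is illustrative):

```python
import base64

def encode_ca_cert(pem_path: str) -> str:
    """Base64-encode a PEM certificate file, as the pipeline does
    before passing it to Terraform as pgsql_telemetries_db_ca_b64."""
    with open(pem_path, "rb") as f:
        return base64.b64encode(f.read()).decode("ascii")
```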

The API will automatically discover the PostgreSQL connector when the following environment variables are set (these are configured automatically by Terraform when use_pgsql_for_telemetries is enabled):

  • TELEM_DB_REQ_PGSQL_URL: URL of the PostgreSQL telemetry endpoint
  • TELEM_DB_REQ_PGSQL_SECRET: Secret key for authenticating API requests to the PostgreSQL connector

Schema Setup

Before using PostgreSQL for telemetries, the database schema must be initialized. A SQL script is provided at core/libs/ingestion/sql/init_telem_table.sql that creates:

  1. Partitioned Table: A parent telemetry table partitioned by timestamp range
  2. Monthly Partitions: Child partitions covering each month from January 2025 through January 2030
  3. Indexes: Indexes for common query patterns

Manual Schema Setup

To set up the schema manually:

  1. Connect to the PostgreSQL database using a client with appropriate permissions
  2. Execute the initialization script:
psql -h <host> -U <user> -d <database> -f core/libs/ingestion/sql/init_telem_table.sql

Alternatively, when using a different PostgreSQL client, copy and execute the contents of init_telem_table.sql.

The script creates:

  • Table Structure:

    • device_id (uuid): Device identifier
    • key (text): Telemetry key/metric name
    • timestamp (timestamptz): Timestamp of the reading
    • value_num (double precision): Numeric value (nullable)
    • value_string (text): String value (nullable)
    • value_bool (boolean): Boolean value (nullable)

Note

Each row must have exactly one value type set (enforced by a check constraint).

  • Partitioning: Monthly partitions named telemetry_YYYY_MM (e.g., telemetry_2025_01)
  • Indexes: Two composite indexes are created to optimize common query patterns:

    • telemetry_device_key_timestamp_idx on (device_id, key, timestamp): Optimizes queries filtering by a specific device, a specific telemetry key, and a time range.
    • telemetry_device_timestamp_idx on (device_id, timestamp): Optimizes queries filtering by device and time range without filtering by key.
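The one-value-per-row rule enforced by the check constraint can be expressed as a small validator. A Python sketch (the function name is illustrative):

```python
from typing import Optional

def is_valid_row(value_num: Optional[float],
                 value_string: Optional[str],
                 value_bool: Optional[bool]) -> bool:
    """Mirror of the table's check constraint: exactly one of the
    three value columns must be non-NULL."""
    set_count = sum(v is not None for v in (value_num, value_string, value_bool))
    return set_count == 1
```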

Creating Future Partitions

The initialization script creates partitions up to January 2030. For data beyond this date, additional partitions must be created manually:

CREATE TABLE telemetry_2030_02 PARTITION OF telemetry
FOR VALUES FROM ('2030-02-01') TO ('2030-03-01');
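Writing these statements by hand is error-prone, especially across year boundaries. A Python sketch that generates the DDL for any month, following the telemetry_YYYY_MM naming scheme (the helper name is illustrative):

```python
def partition_ddl(year: int, month: int) -> str:
    """Generate the CREATE TABLE statement for one monthly partition
    of the telemetry table."""
    # The upper bound is exclusive, so it is the first day of the next month.
    next_year, next_month = (year + 1, 1) if month == 12 else (year, month + 1)
    return (
        f"CREATE TABLE telemetry_{year}_{month:02d} PARTITION OF telemetry\n"
        f"FOR VALUES FROM ('{year}-{month:02d}-01') "
        f"TO ('{next_year}-{next_month:02d}-01');"
    )
```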

Granularity Parameter

When querying telemetries from PostgreSQL through Kamea, a granularity parameter is supported.

  • Format: Positive integer representing minutes
  • Behavior: Groups telemetry data into time buckets of the specified duration (in minutes) and returns aggregated values (average for numeric values)
  • Examples:

    • granularity: 30 groups data into 30-minute buckets
    • granularity: 60 groups data into 1-hour buckets
    • granularity: 1440 groups data into 1-day buckets (24 hours × 60 minutes)

When granularity is provided:

  • The query returns aggregated data points at the specified interval
  • Only numeric values (value_num) are aggregated using the AVG() function
  • The keys parameter must contain at least one element (granularity requires filtering by specific keys)
  • Timestamps in the response represent the start of each time bucket

When granularity is omitted:

  • The query returns raw telemetry data points as stored in the database
  • All value types (numeric, string, boolean) are returned as-is
  • No aggregation is performed
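The aggregation path can be sketched as a simplified in-memory illustration (the real aggregation runs in SQL via AVG(); function names here are illustrative):

```python
from datetime import datetime, timedelta, timezone

def bucket_start(ts: datetime, granularity_min: int) -> datetime:
    """Truncate a timestamp to the start of its granularity bucket."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    seconds = int((ts - epoch).total_seconds())
    return epoch + timedelta(seconds=seconds - seconds % (granularity_min * 60))

def aggregate(points, granularity_min):
    """Group (timestamp, value_num) points into buckets and average them.
    Returned keys are the bucket start timestamps, as in the API response."""
    buckets = {}
    for ts, value in points:
        buckets.setdefault(bucket_start(ts, granularity_min), []).append(value)
    return {start: sum(vs) / len(vs) for start, vs in sorted(buckets.items())}
```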

Azure Functions

The PostgreSQL telemetry feature uses Azure Functions to handle telemetry storage and retrieval. Two functions are deployed:

HTTP Trigger Function: Get Telemetries

The HTTP trigger function exposes a POST endpoint for querying telemetries from PostgreSQL.

Endpoint: /api/telemetries

Method: POST

Authentication: Requires the x-api-key header with the value set in the PGSQL_AF_MGMT_API_KEY environment variable.

Request Body:

{
    "deviceIds": ["device-uuid-1", "device-uuid-2"],
    "keys": ["temperature", "humidity"],
    "from": "2025-01-01T00:00:00Z",
    "to": "2025-01-31T23:59:59Z",
    "page": 0,
    "limit": 100,
    "granularity": 60
}

Parameters:

  • deviceIds (required): Array of device UUIDs
  • keys (optional): Array of telemetry keys to filter. If empty, returns all keys. Required when granularity is provided.
  • from (required): Start timestamp (ISO 8601 format)
  • to (optional): End timestamp (ISO 8601 format). Defaults to current time if omitted.
  • page (required): Page number (0-indexed)
  • limit (required): Number of results per page
  • granularity (optional): Aggregation interval in minutes. If omitted, returns raw telemetry data.

Response:

{
    "data": [
        {
            "deviceId": "device-uuid-1",
            "field": "temperature",
            "timestamp": "2025-01-01T00:00:00Z",
            "value": 22.5
        }
    ],
    "total": 150,
    "page": 0,
    "limit": 100
}

Example Request:

curl -X POST https://<function-app-name>.azurewebsites.net/api/telemetries \
  -H "Content-Type: application/json" \
  -H "x-api-key: <api-key>" \
  -d '{
    "deviceIds": ["123e4567-e89b-12d3-a456-426614174000"],
    "keys": ["temperature"],
    "from": "2025-01-01T00:00:00Z",
    "to": "2025-01-01T23:59:59Z",
    "page": 0,
    "limit": 100
  }'

Service Bus Trigger Function: Store Telemetries

The Service Bus trigger function listens to a Service Bus topic subscription and stores incoming telemetry messages to PostgreSQL.

Trigger: Service Bus Topic subscription

Topic: Configured via DATA_DECODED_TOPIC_NAME environment variable

Subscription: Configured via AF_PGSQL_SUBSCRIPTION environment variable

Message Format: The function expects messages in the FormattedMessage format:

{
    "deviceId": "device-uuid",
    "tenantId": "tenant-uuid",
    "body": {
        "temperature": 22.5,
        "humidity": 60
    },
    "deviceSentTimestamp": "2025-01-01T12:00:00Z",
    "platformReceivedTimestamp": "2025-01-01T12:00:01Z"
}

The function automatically:

  • Flattens nested telemetry values.
  • Converts values to the appropriate PostgreSQL column type (value_num, value_string, or value_bool).
  • Inserts telemetry data into the database.
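The flattening and type-conversion steps can be sketched as follows. This is an illustration only: the exact key scheme the function uses for nested values is not specified here, so dot-separated keys are an assumption, and the function names are illustrative:

```python
from typing import Any

def flatten(body: dict, prefix: str = "") -> dict:
    """Flatten nested telemetry values into flat keys.
    Assumes dot-separated keys, e.g. {"env": {"temp": 22.5}} -> {"env.temp": 22.5}."""
    out = {}
    for key, value in body.items():
        full = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            out.update(flatten(value, full))
        else:
            out[full] = value
    return out

def to_columns(value: Any) -> dict:
    """Map a telemetry value to the matching PostgreSQL column."""
    if isinstance(value, bool):  # check bool before number: bool is an int subtype
        return {"value_num": None, "value_string": None, "value_bool": value}
    if isinstance(value, (int, float)):
        return {"value_num": float(value), "value_string": None, "value_bool": None}
    return {"value_num": None, "value_string": str(value), "value_bool": None}
```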

Error Handling: If database insertion fails, the function throws an error, causing the Service Bus message to remain unacknowledged and be retried according to the subscription's retry policy.