Introduction to the Log_File_Schemas Table

The log_file_schemas table is designed to store and manage schema information related to log files for different companies. This table plays a crucial role in the system by maintaining structured representations of log file formats, enabling efficient parsing and analysis of log data.

Table Structure

The log_file_schemas table is structured to support versioning and company-specific schema management. Here’s a detailed breakdown of its columns:
Column Name       | Data Type   | Constraints                          | Description
------------------|-------------|--------------------------------------|----------------------------------------------------
id                | uuid        | primary key (composite)              | Unique identifier for the schema
revision          | integer     | primary key (composite), not null    | Version number of the schema
company_id        | uuid        | foreign key, not null                | Reference to the company owning the schema
sample_file_data  | jsonb       |                                      | JSON representation of the schema structure
sample_file_name  | text        | not null                             | Name of the sample file used to generate the schema
sample_file_id    | uuid        |                                      | Optional reference to the stored sample file
created_at        | timestamptz | not null, default current_timestamp  | Timestamp of schema creation
source_dataset_id | uuid        | foreign key                          | Reference to the source dataset used for the schema
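
For orientation, the table definition can be sketched in DDL as follows. This is a minimal reconstruction from the columns above, not the production definition; in particular, the companies table referenced by company_id is an assumed name (the datasets table is described later on this page):

    CREATE TABLE log_file_schemas (
      id                uuid        NOT NULL,
      revision          integer     NOT NULL,
      company_id        uuid        NOT NULL REFERENCES companies (id),  -- assumed target table
      sample_file_data  jsonb,
      sample_file_name  text        NOT NULL,
      sample_file_id    uuid,
      created_at        timestamptz NOT NULL DEFAULT current_timestamp,
      source_dataset_id uuid        REFERENCES datasets (id),
      PRIMARY KEY (id, revision)  -- composite key supports multiple revisions per schema
    );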

Usage and Functionality

The log_file_schemas table is designed to be flexible and support evolving schema definitions. Here are some key points about its usage:
  1. Versioning System: The table implements a revision system using a composite primary key (id, revision), allowing multiple versions of a schema to be stored and tracked over time (see the query sketch after this list).
  2. Company-Specific Schemas: Each schema is associated with a specific company through the company_id foreign key. This ensures that schemas are isolated between different companies using the system.
  3. JSON Schema Storage: The sample_file_data column uses the jsonb data type, allowing for flexible and efficient storage of schema structures in JSON format.
  4. Dataset Linkage: The source_dataset_id column links each schema to its source dataset, enabling traceability and data lineage.
  5. Access Control: Row-level security policies restrict access based on user roles and company affiliation (a policy sketch follows this list):
    • Only users with the ADMIN role can insert, update, or delete schemas.
    • All authenticated users from the same company can view (select) the schemas.
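
As a sketch of how the revision system is used in practice, the queries below read the latest version of a schema and append a new revision; the parameter placeholders ($1 through $4) stand in for application-supplied values:

    -- Fetch the newest revision of a given schema.
    SELECT *
    FROM log_file_schemas
    WHERE id = $1
    ORDER BY revision DESC
    LIMIT 1;

    -- Record a change by appending a new revision rather than updating in place.
    INSERT INTO log_file_schemas (id, revision, company_id, sample_file_data, sample_file_name)
    VALUES (
      $1,
      (SELECT COALESCE(MAX(revision), 0) + 1 FROM log_file_schemas WHERE id = $1),
      $2,
      $3,
      $4
    );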
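
The access rules above could be expressed as row-level security policies along these lines. This is a sketch only: current_company_id() and current_user_role() are hypothetical helper functions standing in for however the application resolves the caller's identity, and the deployed policies may differ:

    ALTER TABLE log_file_schemas ENABLE ROW LEVEL SECURITY;

    -- Any authenticated user may read schemas belonging to their own company.
    CREATE POLICY schemas_select ON log_file_schemas
      FOR SELECT
      USING (company_id = current_company_id());  -- hypothetical helper

    -- Only ADMIN users may write, and only within their own company.
    CREATE POLICY schemas_insert ON log_file_schemas
      FOR INSERT
      WITH CHECK (current_user_role() = 'ADMIN' AND company_id = current_company_id());

    CREATE POLICY schemas_update ON log_file_schemas
      FOR UPDATE
      USING (current_user_role() = 'ADMIN' AND company_id = current_company_id());

    CREATE POLICY schemas_delete ON log_file_schemas
      FOR DELETE
      USING (current_user_role() = 'ADMIN' AND company_id = current_company_id());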

Notes

  • The table structure has evolved over time, with columns being renamed and added to better suit the application’s needs.
  • A uniqueness rule is enforced through a trigger function rather than a plain unique constraint, because revisions of the same schema (sharing an id) may legitimately point at the same source_dataset_id, while two distinct schemas may not:
    CREATE OR REPLACE FUNCTION check_unique_source_dataset_id()
    RETURNS TRIGGER AS $$
    BEGIN
      -- Reject the row if a schema with a different id already claims this
      -- source_dataset_id. Other revisions of the same schema may share it,
      -- and rows with a NULL source_dataset_id always pass the check.
      IF EXISTS (
        SELECT 1 FROM log_file_schemas
        WHERE source_dataset_id = NEW.source_dataset_id
          AND id != NEW.id
      ) THEN
        RAISE EXCEPTION 'Duplicate source_dataset_id found';
      END IF;
      RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER enforce_unique_source_dataset_id
    BEFORE INSERT OR UPDATE ON log_file_schemas
    FOR EACH ROW EXECUTE FUNCTION check_unique_source_dataset_id();
    
  • The table is designed to work closely with the datasets table, allowing for a clear relationship between raw data and its corresponding schema (see the join sketch below).
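
For example, a lookup that resolves each dataset to the latest revision of its schema might look like the following; the id column on datasets is an assumption for illustration:

    -- Pair each dataset with the newest revision of its schema.
    -- DISTINCT ON keeps the first row per schema id after sorting by revision.
    SELECT DISTINCT ON (s.id)
           d.id AS dataset_id,
           s.id AS schema_id,
           s.revision,
           s.sample_file_data
    FROM datasets d
    JOIN log_file_schemas s ON s.source_dataset_id = d.id
    ORDER BY s.id, s.revision DESC;
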
By leveraging the log_file_schemas table, the Serial application can efficiently manage and version log file schemas for multiple companies, enabling robust log parsing and analysis capabilities while maintaining data isolation and access control.