Datasets

Introduction to the datasets Table

The datasets table stores the data collected during manufacturing processes. Its structure is flexible enough to hold different data types, from simple measurements to complex file attachments, while maintaining relationships with processes, companies, and components.

Table Structure

The datasets table is structured to capture a wide range of data attributes and metadata. Here’s a detailed breakdown of its columns:

Column Name	Data Type	Constraints	Description
id	uuid	primary key	Unique identifier for the dataset
name	text	not null	Name of the dataset
process_id	uuid	not null, foreign key	Reference to the associated process
data_type	text (enum)	not null	Type of data (e.g., PARAMETRIC_QUANTITATIVE, IMAGE, FILE)
company_id	uuid	not null, foreign key	Reference to the company owning the dataset
order	integer	nullable	Order of the dataset within its process
created_at	timestamp with time zone	not null, default now()	Timestamp of dataset creation
process_revision	integer	not null	Revision number of the associated process
lsl	double precision	nullable	Lower Specification Limit for quantitative data
usl	double precision	nullable	Upper Specification Limit for quantitative data
unit	text	nullable	Unit of measurement for quantitative data
child_component_id	uuid	nullable, foreign key	Reference to a child component (for LINK type datasets)
expected_value	text	nullable	Expected value for the dataset
is_active	boolean	not null, default true	Indicates if the dataset is currently active
prior_values	text[]	nullable	Array of previous values (for tracking history)
schema_id	uuid	nullable, foreign key	Reference to a log file schema (for structured data parsing)
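
The constraints in the table above can be checked on the client before a row is sent to the database. The following is a minimal sketch, not the application's actual validation code: the helper name missingRequiredColumns is hypothetical, and it assumes that id is generated by the database and that columns with a documented default (created_at, is_active) may be omitted.

```typescript
// Columns marked "not null" in the table above that have no documented default.
// Assumption: id is generated by the database, so it is not listed here.
const REQUIRED_COLUMNS = [
  "name",
  "process_id",
  "data_type",
  "company_id",
  "process_revision",
] as const;

type RequiredColumn = (typeof REQUIRED_COLUMNS)[number];

// Hypothetical helper: returns the required columns that are absent or null in a draft row.
function missingRequiredColumns(row: Record<string, unknown>): RequiredColumn[] {
  return REQUIRED_COLUMNS.filter(
    (col) => row[col] === undefined || row[col] === null
  );
}

const draft = {
  name: "Component Length",
  process_id: "process-uuid",
  data_type: "PARAMETRIC_QUANTITATIVE",
  company_id: "company-uuid",
  lsl: 10.5,
};

// Only process_revision is missing from this draft.
console.log(missingRequiredColumns(draft));
```

A check like this only mirrors the documented constraints; the database remains the source of truth for what an insert accepts.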

Usage and Functionality

The datasets table is designed to be versatile and accommodate various data collection needs in manufacturing processes. Here are some key points about its usage:

Multi-type Data Support: The data_type column allows for different types of data to be stored, including quantitative measurements, qualitative assessments, images, files, checkboxes, and pass/fail results. This flexibility enables the system to handle diverse data collection requirements.
Process Integration: Each dataset is associated with a specific process through the process_id and process_revision columns. This allows for version control of processes and ensures that datasets are always linked to the correct process version.
Specification Limits: For quantitative data types, the lsl (Lower Specification Limit) and usl (Upper Specification Limit) columns allow for the definition of acceptable ranges. This is crucial for quality control and automated pass/fail determinations.
Component Linking: The child_component_id column associates a dataset with a specific child component and is used by LINK type datasets. This is particularly useful for tracking data across complex assemblies or multi-level bills of materials.
Active Status Tracking: The is_active column allows for soft deletion or archiving of datasets without removing them from the database. This is useful for maintaining historical data while focusing queries on current, active datasets.
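
As an illustration of the specification-limit point above, an automated pass/fail determination could look like the sketch below. The function checkAgainstSpec and the SpecResult type are hypothetical names, and the sketch assumes that limits are inclusive and that a null limit leaves that side unbounded; the source does not state how the application treats either case.

```typescript
// Hypothetical helper types; not part of the documented schema.
type SpecResult = "PASS" | "FAIL" | "NO_LIMITS";

interface SpecLimits {
  lsl: number | null; // Lower Specification Limit
  usl: number | null; // Upper Specification Limit
}

// Assumptions: limits are inclusive, a null limit means that side is unbounded,
// and non-finite measurements (NaN, Infinity) always fail.
function checkAgainstSpec(value: number, limits: SpecLimits): SpecResult {
  if (!Number.isFinite(value)) return "FAIL";
  if (limits.lsl === null && limits.usl === null) return "NO_LIMITS";
  if (limits.lsl !== null && value < limits.lsl) return "FAIL";
  if (limits.usl !== null && value > limits.usl) return "FAIL";
  return "PASS";
}

const lengthLimits: SpecLimits = { lsl: 10.5, usl: 11.5 };
console.log(checkAgainstSpec(11.0, lengthLimits)); // PASS
console.log(checkAgainstSpec(12.0, lengthLimits)); // FAIL
```

Returning a separate NO_LIMITS result, rather than treating a dataset without limits as a pass, keeps "nothing to check" distinguishable from "checked and in range".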

Notes

The prior_values array suggests that the system can track historical changes to dataset values, which could be useful for trend analysis or auditing purposes.
The recent addition of the schema_id column indicates an evolution towards more structured data handling, potentially allowing for complex data parsing from log files or other structured data sources.

Example usage in TypeScript:

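// Only the data types named in this document are listed; the real enum
// likely has more members (e.g. for LINK, checkbox, and pass/fail datasets).
enum DataType {
  ParametricQuantitative = "PARAMETRIC_QUANTITATIVE",
  Image = "IMAGE",
  File = "FILE",
}

interface Dataset {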
  id: string;
  name: string;
  process_id: string;
  data_type: DataType;
  company_id: string;
  order: number | null;
  created_at: string; // ISO 8601 format
  process_revision: number;
  lsl: number | null;
  usl: number | null;
  unit: string | null;
  child_component_id: string | null;
  expected_value: string | null;
  is_active: boolean;
  prior_values: string[] | null;
  schema_id: string | null;
}

// Example of creating a new dataset
const newDataset: Partial<Dataset> = {
  name: "Component Length",
  process_id: "process-uuid",
  process_revision: 1, // placeholder value; the column is not null, so an insert must supply it
  data_type: DataType.ParametricQuantitative,
  company_id: "company-uuid",
  lsl: 10.5,
  usl: 11.5,
  unit: "cm",
};

The order column allows for customizable sequencing of datasets within a process, which can be important for defining the flow of data collection steps.
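
A short sketch of how the order and is_active columns might be combined when listing the datasets for a process follows. The function name activeInOrder is hypothetical, and placing datasets with a null order after the ordered ones is an assumption, since the source does not say how null values are sequenced.

```typescript
// Minimal shape needed for sequencing; a subset of the Dataset columns.
interface OrderedDataset {
  name: string;
  order: number | null;
  is_active: boolean;
}

// Hypothetical helper: keeps active datasets and sorts them by order.
// Assumption: a null order sorts last; ties fall back to the dataset name.
function activeInOrder<T extends OrderedDataset>(datasets: T[]): T[] {
  return datasets
    .filter((d) => d.is_active) // filter returns a new array, so the input is not mutated
    .sort((a, b) => {
      if (a.order === null && b.order === null) return a.name.localeCompare(b.name);
      if (a.order === null) return 1;
      if (b.order === null) return -1;
      return a.order - b.order || a.name.localeCompare(b.name);
    });
}

const steps = activeInOrder([
  { name: "Torque", order: 2, is_active: true },
  { name: "Photo", order: null, is_active: true },
  { name: "Length", order: 1, is_active: true },
  { name: "Legacy check", order: 0, is_active: false },
]);
console.log(steps.map((d) => d.name)); // Length, Torque, Photo
```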

By leveraging the datasets table, the Serial application can flexibly capture, validate, and analyze a wide range of manufacturing data types while maintaining clear associations with processes, companies, and components. This versatility is crucial for supporting diverse manufacturing scenarios and enabling robust quality control and traceability features.
