Back to Skills
antigravityDocument Processing

postgresql

Design a PostgreSQL-specific schema. Covers best-practices, data types, indexing, constraints, performance patterns, and advanced features

Documentation

PostgreSQL Table Design

Use this skill when

  • Designing a schema for PostgreSQL
  • Selecting data types and constraints
  • Planning indexes, partitions, or RLS policies
  • Reviewing tables for scale and maintainability

Do not use this skill when

  • You are targeting a non-PostgreSQL database
  • You only need query tuning without schema changes
  • You require a DB-agnostic modeling guide

Instructions

  1. Capture entities, access patterns, and scale targets (rows, QPS, retention).
  2. Choose data types and constraints that enforce invariants.
  3. Add indexes for real query paths and validate with EXPLAIN.
  4. Plan partitioning or RLS where required by scale or access control.
  5. Review migration impact and apply changes safely.

Safety

  • Avoid destructive DDL on production without backups and a rollback plan.
  • Use migrations and staging validation before applying schema changes.

Core Rules

  • Define a PRIMARY KEY for reference tables (users, orders, etc.). Not always needed for time-series/event/log data. When used, prefer BIGINT GENERATED ALWAYS AS IDENTITY; use UUID only when global uniqueness/opacity is needed.
  • Normalize first (to 3NF) to eliminate data redundancy and update anomalies; denormalize only for measured, high-ROI reads where join performance is proven problematic. Premature denormalization creates maintenance burden.
  • Add NOT NULL everywhere it’s semantically required; use DEFAULTs for common values.
  • Create indexes for access paths you actually query: PK/unique (auto), FK columns (manual!), frequent filters/sorts, and join keys.
  • Prefer TIMESTAMPTZ for event time; NUMERIC for money; TEXT for strings; BIGINT for integer values, DOUBLE PRECISION for floats (or NUMERIC for exact decimal arithmetic).

PostgreSQL “Gotchas”

  • Identifiers: unquoted → lowercased. Avoid quoted/mixed-case names. Convention: use snake_case for table/column names.
  • Unique + NULLs: UNIQUE allows multiple NULLs. Use UNIQUE (...) NULLS NOT DISTINCT (PG15+) to restrict to one NULL.
  • FK indexes: PostgreSQL does not auto-index FK columns. Add them.
  • No silent coercions: length/precision overflows error out (no truncation). Example: inserting 999 into NUMERIC(2,0) fails with error, unlike some databases that silently truncate or round.
  • Sequences/identity have gaps (normal; don't "fix"). Rollbacks, crashes, and concurrent transactions create gaps in ID sequences (1, 2, 5, 6...). This is expected behavior—don't try to make IDs consecutive.
  • Heap storage: no clustered PK by default (unlike SQL Server/MySQL InnoDB); CLUSTER is one-off reorganization, not maintained on subsequent inserts. Row order on disk is insertion order unless explicitly clustered.
  • MVCC: updates/deletes leave dead tuples; vacuum handles them—design to avoid hot wide-row churn.

Data Types

  • IDs: BIGINT GENERATED ALWAYS AS IDENTITY preferred (GENERATED BY DEFAULT also fine); UUID when merging/federating/used in a distributed system or for opaque IDs. Generate with uuidv7() (preferred if using PG18+) or gen_random_uuid() (if using an older PG version).
  • Integers: prefer BIGINT unless storage space is critical; INTEGER for smaller ranges; avoid SMALLINT unless constrained.
  • Floats: prefer DOUBLE PRECISION over REAL unless storage space is critical. Use NUMERIC for exact decimal arithmetic.
  • Strings: prefer TEXT; if length limits needed, use CHECK (LENGTH(col) <= n) instead of VARCHAR(n); avoid CHAR(n). Use BYTEA for binary data. Large strings/binary (>2KB default threshold) automatically stored in TOAST with compression. TOAST storage: PLAIN (no TOAST), EXTENDED (compress + out-of-line), EXTERNAL (out-of-line, no compress), MAIN (compress, keep in-line if possible). Default EXTENDED usually optimal. Control with ALTER TABLE tbl ALTER COLUMN col SET STORAGE strategy and ALTER TABLE tbl SET (toast_tuple_target = 4096) for threshold. Case-insensitive: for locale/accent handling use non-deterministic collations; for plain ASCII use expression indexes on LOWER(col) (preferred unless column needs case-insensitive PK/FK/UNIQUE) or CITEXT.
  • Money: NUMERIC(p,s) (never float).
  • Time: TIMESTAMPTZ for timestamps; DATE for date-only; INTERVAL for durations. Avoid TIMESTAMP (without timezone). Use now() for transaction start time, clock_timestamp() for current wall-clock time.
  • Booleans: BOOLEAN with NOT NULL constraint unless tri-state values are required.
  • Enums: CREATE TYPE ... AS ENUM for small, stable sets (e.g. US states, days of week). For business-logic-driven and evolving values (e.g. order statuses) → use TEXT (or INT) + CHECK or lookup table.
  • Arrays: TEXT[], INTEGER[], etc. Use for ordered lists where you query elements. Index with GIN for containment (@>, <@) and overlap (&&) queries. Access: arr[1] (1-ind