Data Modeling Best Practices for Modern Applications
Normalization vs denormalization, entity relationships, schema evolution, and the data modeling decisions that determine your application's future.
Your data model is the most consequential design decision in your application. It determines query performance, data integrity, schema evolution flexibility, and team productivity. A poorly designed data model can't be fixed by better application code or faster hardware — it requires a painful migration. Getting it right upfront saves months of rework later.
Start with the Domain, Not the Database
Model your domain entities first, then translate to database tables. Too many teams start by creating tables and then try to force their business logic to fit. Domain-Driven Design (DDD) techniques — bounded contexts, aggregates, value objects — produce better data models because they reflect how the business actually works.
Normalization: Guidelines, Not Rules
Third Normal Form (3NF) eliminates redundancy and ensures data integrity. But strict normalization creates joins that slow down reads. The pragmatic approach: normalize by default, denormalize intentionally for specific performance requirements. Document every denormalization decision and its trade-offs.
- Normalize reference data (countries, categories, permissions) — it's updated rarely and queried with simple joins
- Normalize transactional data (orders, payments, audit logs) — integrity is critical and you need consistent accounting
- Denormalize read-heavy aggregations (dashboard data, leaderboards, search indexes) — compute once, read many times
- Use materialized views for denormalized read models — keeps the source normalized while providing fast reads
Schema Evolution: Plan for Change
Your data model will change. New features require new tables, new columns, and new relationships. Design for evolution: use nullable columns for optional data (avoid NOT NULL on columns that might not always have values), prefer additive changes (new columns) over destructive changes (column renames), and version your schemas with migration tools.
-- Additive migration: always safe to apply
ALTER TABLE users ADD COLUMN preferences JSONB NOT NULL DEFAULT '{}';
ALTER TABLE users ADD COLUMN timezone TEXT DEFAULT 'UTC';
ALTER TABLE users ADD COLUMN locale TEXT DEFAULT 'en-US';
-- Create index for JSONB queries
CREATE INDEX idx_users_preferences ON users USING GIN (preferences);
-- Backfill existing users with defaults (do this in batches for large tables)
-- UPDATE users SET timezone = 'UTC' WHERE timezone IS NULL;Use JSONB columns for semi-structured data that varies between records (user preferences, feature flags, metadata). It gives you the flexibility of NoSQL within a relational database, with indexing and query support.
Data modeling is where experience pays the highest dividends. Take time upfront to understand the domain, model entities and relationships carefully, plan for schema evolution, and document your decisions. Future you — and future team members — will be grateful.
Sarah Chen
Cloud Infrastructure Architect