Real-Time Analytics Architecture: From Ingestion to Dashboard
Designing analytics systems that process millions of events per second with sub-second query latency.
Real-time analytics transforms decision-making. Instead of waiting for overnight batch reports, stakeholders see live dashboards showing sales, user behavior, system health, and business KPIs — updated every second. Building this requires a fundamentally different architecture than traditional batch analytics.
The Architecture Stack
A real-time analytics system has four layers: event collection (SDK/API → Kafka), stream processing (Kafka Streams / Flink for enrichment, filtering, aggregation), analytical storage (ClickHouse, Apache Druid, or BigQuery), and visualization (Grafana, Metabase, or custom dashboards). Each layer is independently scalable.
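To make the layering concrete, here is a minimal in-memory sketch of the first two layers feeding a tiny aggregate. It is only an illustration: the event fields, the bot check, and the function names are assumptions for the example, and plain Python generators stand in for Kafka topics and a Kafka Streams or Flink job.

```python
from collections import Counter
from typing import Iterable, Iterator

# Hypothetical raw events, standing in for what an SDK would publish to Kafka.
RAW_EVENTS = [
    {"event_type": "page_view", "page_path": "/pricing", "user_agent": "Mozilla/5.0"},
    {"event_type": "page_view", "page_path": "/pricing", "user_agent": "HealthCheckBot"},
    {"event_type": "purchase", "page_path": "/checkout", "user_agent": "Mozilla/5.0"},
    {"event_type": "page_view", "page_path": "/docs", "user_agent": "Mozilla/5.0"},
]


def collect(events: Iterable[dict]) -> Iterator[dict]:
    """Layer 1 (collection): yield events one at a time, like a topic consumer."""
    yield from events


def enrich(events: Iterable[dict]) -> Iterator[dict]:
    """Layer 2a (enrichment): add a derived field to each event."""
    for event in events:
        yield {**event, "is_bot": "bot" in event["user_agent"].lower()}


def drop_bots(events: Iterable[dict]) -> Iterator[dict]:
    """Layer 2b (filtering): discard events flagged as bot traffic."""
    return (event for event in events if not event["is_bot"])


def aggregate(events: Iterable[dict]) -> Counter:
    """Layer 2c (aggregation): count events per (event_type, page_path)."""
    return Counter((event["event_type"], event["page_path"]) for event in events)


def run_pipeline(raw: Iterable[dict]) -> Counter:
    # Each stage only depends on the output of the previous one, which is
    # what lets the real layers be scaled independently.
    return aggregate(drop_bots(enrich(collect(raw))))


page_counts = run_pipeline(RAW_EVENTS)
```

In a real deployment the resulting counts would be written to the analytical store and read by the dashboard layer; here they simply live in a `Counter`.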
Why ClickHouse for Analytics
For OLAP (analytical) queries such as aggregations, group-bys, and time-series analysis, columnar databases often outperform row-based databases by 10-100x, because a query reads only the columns it references rather than entire rows. ClickHouse is our default choice: it's open-source, handles petabyte-scale data, and executes most analytical queries in under 100ms. As an illustration, an aggregation that takes 30 seconds in PostgreSQL can return in around 50ms in ClickHouse.
-- ClickHouse: sub-100ms on billions of rows
SELECT
    toStartOfHour(timestamp) AS hour,
    page_path,
    countDistinct(session_id) AS unique_visitors,
    count() AS page_views,
    avg(time_on_page) AS avg_time_seconds
FROM events
WHERE
    event_type = 'page_view'
    AND timestamp >= now() - INTERVAL 24 HOUR
GROUP BY hour, page_path
ORDER BY hour DESC, page_views DESC
LIMIT 100;
Event Schema Design
The event schema is the foundation of your analytics system. Design it once, evolve it carefully. We follow the entity-event-property pattern: every event has a type (page_view, purchase, signup), an entity (user_id, session_id), a timestamp, and a set of properties (page_path, product_id, price).
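One way to express the entity-event-property pattern in code is a small typed record. The sketch below is an assumption about how such a schema might look, not a prescribed format: the `Event` class, its validation rules, and the `to_row` helper are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any


@dataclass(frozen=True)
class Event:
    """One analytics event: a type, the entities it belongs to, a time, properties."""

    event_type: str      # e.g. "page_view", "purchase", "signup"
    user_id: str         # entity
    session_id: str      # entity
    timestamp: datetime  # when the event happened (event time), timezone-aware
    properties: dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reject events that would be ambiguous once they reach storage.
        if not self.event_type:
            raise ValueError("event_type is required")
        if self.timestamp.tzinfo is None:
            raise ValueError("timestamp must be timezone-aware")

    def to_row(self) -> dict[str, Any]:
        """Flatten the event into a single row, with properties as columns."""
        return {
            "event_type": self.event_type,
            "user_id": self.user_id,
            "session_id": self.session_id,
            "timestamp": self.timestamp.isoformat(),
            **self.properties,
        }


purchase = Event(
    event_type="purchase",
    user_id="u-123",
    session_id="s-456",
    timestamp=datetime(2024, 1, 15, 12, 30, tzinfo=timezone.utc),
    properties={"product_id": "sku-42", "price": 19.99},
)
row = purchase.to_row()
```

Keeping the core fields fixed and pushing everything else into `properties` is what allows the schema to evolve carefully: new properties can be added without touching the fields every query depends on.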
Design your event schema for the queries you'll run, not the events you'll collect. If you need to answer 'what's the conversion rate by traffic source?', make sure traffic source is a property on your conversion events — don't force analysts to join across multiple event types.
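The conversion-rate question can be sketched as follows. The sample events, the `session_start` event type, and the `traffic_source` property name are made up for illustration; the point is only that because every event already carries its traffic source, the rate falls out of a single pass with no join.

```python
from collections import defaultdict

# Hypothetical events. Each one carries traffic_source as a property,
# including the conversion (purchase) events.
events = [
    {"event_type": "session_start", "session_id": "s1", "traffic_source": "organic"},
    {"event_type": "session_start", "session_id": "s2", "traffic_source": "organic"},
    {"event_type": "session_start", "session_id": "s3", "traffic_source": "paid"},
    {"event_type": "session_start", "session_id": "s4", "traffic_source": "paid"},
    {"event_type": "purchase", "session_id": "s1", "traffic_source": "organic"},
    {"event_type": "purchase", "session_id": "s3", "traffic_source": "paid"},
    {"event_type": "purchase", "session_id": "s4", "traffic_source": "paid"},
]


def conversion_rate_by_source(events: list[dict]) -> dict[str, float]:
    """Share of sessions per traffic source that contain at least one purchase."""
    sessions: dict[str, set] = defaultdict(set)
    conversions: dict[str, set] = defaultdict(set)
    for event in events:
        source = event["traffic_source"]
        if event["event_type"] == "session_start":
            sessions[source].add(event["session_id"])
        elif event["event_type"] == "purchase":
            conversions[source].add(event["session_id"])
    # Sets de-duplicate repeat purchases within the same session.
    return {
        source: len(conversions[source]) / len(session_ids)
        for source, session_ids in sessions.items()
    }


rates = conversion_rate_by_source(events)
```

If the purchase events lacked `traffic_source`, the same question would require looking up each purchase's session first, which is exactly the join the schema is meant to avoid.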
Handling Late-Arriving Data
In real-time systems, events don't always arrive in order. Mobile apps send batched events when connectivity is restored, third-party webhooks can be delayed, and distributed systems have clock skew. Your analytics system must handle late-arriving data gracefully: aggregate on event timestamps (not ingestion timestamps), and use windowed computations that can be updated retroactively when late events arrive.
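A minimal sketch of event-time windowing, assuming one-minute tumbling windows and an in-memory counter: each event is assigned to a window by its own timestamp, so an event that arrives late still updates the window it belongs to. The names `window_start` and `EventTimeCounter` are hypothetical. Production stream processors such as Flink additionally bound how late an event may be, which this sketch leaves out.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(minutes=1)  # assumed window size for the example
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def window_start(event_time: datetime) -> datetime:
    """Round an event timestamp down to the start of its tumbling window."""
    return event_time - (event_time - EPOCH) % WINDOW


class EventTimeCounter:
    """Counts events per window, keyed by event time rather than arrival time."""

    def __init__(self) -> None:
        self.counts: dict[datetime, int] = defaultdict(int)

    def add(self, event_time: datetime) -> datetime:
        # A late event lands in its original window, so that window's
        # count is corrected retroactively.
        start = window_start(event_time)
        self.counts[start] += 1
        return start


def ts(minute: int, second: int) -> datetime:
    """Helper for readable sample timestamps on a fixed, arbitrary day."""
    return datetime(2024, 1, 15, 12, minute, second, tzinfo=timezone.utc)


counter = EventTimeCounter()
# Arrival order differs from event order: the 12:00:55 event shows up
# after an event from the 12:01 window has already been processed.
for event_time in [ts(0, 5), ts(0, 40), ts(1, 10), ts(0, 55)]:
    counter.add(event_time)
```

Had the counter been keyed by ingestion time instead, the delayed event would have been counted in the 12:01 window, silently skewing both windows.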
Real-time analytics is a competitive advantage. Organizations that can see and act on data in real time make better decisions, respond to incidents faster, and understand their users more deeply. The technology stack is mature, the patterns are well-established, and the ROI is measurable.
David Kim
Embedded Systems Lead