Time-Series Databases: Storing and Querying Data That Changes Over Time
Your monitoring system stores CPU metrics in PostgreSQL. One metric per row: timestamp, host, metric name, value. After six months, the table has 50 billion rows. Simple queries like “show me CPU usage for host-42 over the last hour” take 30 seconds. You have indexes on timestamp and host. It does not matter. The table is too large and the access pattern is wrong for a general-purpose database.
You migrate to InfluxDB. The same query takes 50 milliseconds. The storage footprint drops by 10x due to compression. This is what a time-series database is designed for.
What makes time-series data different
Time-series data has specific characteristics that general-purpose databases handle poorly:
Append-only writes - New data points are always added, never updated. You do not change yesterday’s CPU reading.
Time-ordered access - Queries almost always include a time range. “Show me the last hour” or “show me data from January.”
High write throughput - Monitoring systems write millions of data points per second across thousands of metrics.
Bulk reads - Queries typically read many data points for a time range, not individual rows.
Automatic expiration - Old data is often less valuable. You want to keep 1-minute resolution for the last week, 1-hour resolution for the last year, and nothing older.
Compression - Sequential timestamps and slowly-changing values compress extremely well.
How time-series databases work
Column-oriented storage
Time-series databases store data in columns rather than rows. All timestamps are stored together, all values are stored together. This enables:
- Better compression - Sequential timestamps compress with delta encoding (store differences, not absolute values). Values that change slowly compress with run-length encoding.
- Faster range scans - Reading all values for a time range only reads the value column, not all columns.
Time-based partitioning
Data is automatically partitioned by time. Recent data is in hot partitions (fast storage, in memory). Old data is in cold partitions (slow storage, compressed). Queries for recent data are fast. Queries for old data are slower but still efficient because they only scan the relevant time partition.
Downsampling and retention policies
Time-series databases support automatic downsampling: aggregate 1-minute data into 1-hour data after 7 days, then delete the 1-minute data. This keeps storage manageable while preserving long-term trends.
graph TB subgraph ingestion["Data Ingestion"] S1["Server metrics 1M points/sec"] S2["IoT sensors 500K points/sec"] S3["App events 200K points/sec"] TSDB["Time-Series DB Write-optimized Column storage"] S1 --> TSDB S2 --> TSDB S3 --> TSDB end subgraph storage["Storage Tiers"] HOT["Hot tier Last 7 days Full resolution Fast SSD"] WARM["Warm tier Last 90 days Downsampled 1h Standard SSD"] COLD["Cold tier Last 2 years Downsampled 1d Object storage"] HOT --> WARM --> COLD end TSDB --> HOT style TSDB fill:#EEEDFE,stroke:#534AB7,color:#3C3489 style HOT fill:#E1F5EE,stroke:#0F6E56,color:#085041 style WARM fill:#FAEEDA,stroke:#854F0B,color:#633806 style COLD fill:#F1EFE8,stroke:#888780,color:#444441
The main time-series databases
InfluxDB
The most popular open-source time-series database. Uses a custom storage engine (TSM - Time-Structured Merge Tree). Flux query language for complex time-series operations. Built-in downsampling and retention policies.
Data model: Measurements (like tables), tags (indexed string metadata), fields (numeric values), timestamp.
Best for: Infrastructure monitoring, IoT, application metrics.
Prometheus
Pull-based monitoring system with a built-in time-series database. Scrapes metrics from targets at regular intervals. PromQL for querying. Designed for reliability - each Prometheus instance is independent with no clustering.
Data model: Metrics with labels (key-value pairs). All data is numeric.
Best for: Kubernetes monitoring, microservices metrics, alerting.
Limitation: Not designed for long-term storage. Use Thanos or Cortex to add horizontal scaling and long-term storage.
TimescaleDB
PostgreSQL extension that adds time-series optimizations. Automatic partitioning by time (hypertables). Continuous aggregates for automatic downsampling. Full SQL support.
Best for: Teams that know PostgreSQL and want time-series capabilities without learning a new database. Relational data with time-series components.
ClickHouse
Columnar database optimized for analytical queries. Not exclusively time-series, but excellent for time-series analytics at scale. Used for event analytics, user behavior analysis.
Best for: High-volume event analytics, ad-tech, user behavior analysis.
Apache Druid
Real-time analytics database. Ingests streaming data from Kafka. Sub-second queries on billions of rows. Used for interactive dashboards on large datasets.
Best for: Real-time analytics dashboards, ad-tech, user-facing analytics.
Where it breaks or gets interesting
The cardinality problem
In Prometheus and InfluxDB, high cardinality (many unique label/tag combinations) causes performance problems. Each unique combination of labels creates a separate time series. If you label metrics with user_id, you have one time series per user. With 10 million users, that is 10 million time series. Memory usage explodes.
Rule of thumb: use low-cardinality labels (host, region, service) not high-cardinality labels (user_id, request_id, session_id). For high-cardinality data, use a different storage system (Elasticsearch, ClickHouse).
Out-of-order writes
Time-series databases assume data arrives in roughly time order. If data arrives significantly out of order (late-arriving events from mobile devices, delayed batch imports), it can cause performance problems or data corruption.
Most databases have a configurable “out-of-order tolerance window.” Data arriving within the window is accepted. Data arriving outside the window is rejected or handled specially.
The write amplification of LSM trees
Many time-series databases use LSM (Log-Structured Merge) trees for write performance. Writes go to an in-memory buffer, then are flushed to disk as immutable files. Background compaction merges files. This gives excellent write throughput but causes write amplification (data is written multiple times during compaction).
Querying across multiple metrics
Time-series databases are optimized for querying one metric over time. Joining multiple metrics (correlating CPU usage with request latency) requires either a query language that supports joins (PromQL, Flux) or pulling data into a general-purpose database for analysis.
graph LR subgraph choose["Choosing a Time-Series DB"] INF["InfluxDB General purpose TSDB IoT, infrastructure"] PROM["Prometheus Kubernetes monitoring Pull-based scraping"] TS["TimescaleDB PostgreSQL users Relational + time-series"] CH["ClickHouse Event analytics High cardinality OK"] DR["Druid Real-time dashboards Streaming ingestion"] end style INF fill:#EEEDFE,stroke:#534AB7,color:#3C3489 style PROM fill:#E1F5EE,stroke:#0F6E56,color:#085041 style TS fill:#FAEEDA,stroke:#854F0B,color:#633806 style CH fill:#F1EFE8,stroke:#888780,color:#444441 style DR fill:#EEEDFE,stroke:#534AB7,color:#3C3489
Real-world systems
Netflix - Uses Atlas (their own time-series database) for infrastructure metrics. Handles billions of data points per minute. Designed for high cardinality (per-instance metrics across thousands of servers).
Uber - Uses M3 (open-sourced) for metrics. Built on top of InfluxDB’s storage engine with a distributed query layer.
Cloudflare - Uses ClickHouse for analytics. Processes trillions of DNS queries and HTTP requests. Sub-second queries on petabytes of data.
Datadog - Built their own time-series database optimized for their specific access patterns. Handles trillions of data points per day.
Grafana - Visualization layer that works with multiple time-series backends (Prometheus, InfluxDB, Graphite, Elasticsearch). The de facto standard for time-series dashboards.
How to apply it in practice
When to use a time-series database
Use a time-series database when:
- You are storing metrics, events, or sensor readings
- Your primary query pattern is “show me data for X over time range Y”
- You need automatic downsampling and retention policies
- Write throughput exceeds what a general-purpose database handles efficiently (roughly 100,000+ writes/second)
Use a general-purpose database when:
- You have time-series data but low volume (under 10,000 writes/second)
- You need to join time-series data with relational data frequently
- Your team knows SQL and does not want to learn a new query language
- TimescaleDB (PostgreSQL extension) gives you the best of both worlds
The monitoring stack
A typical production monitoring stack:
- Collection - Prometheus scrapes metrics from services, or services push to InfluxDB/StatsD
- Storage - Prometheus (short-term) + Thanos/Cortex (long-term), or InfluxDB, or TimescaleDB
- Visualization - Grafana dashboards
- Alerting - Prometheus Alertmanager or Grafana Alerting
Data model design
Good time-series data model:
- Use tags/labels for dimensions you filter and group by (host, region, service, environment)
- Use fields/values for the actual measurements (cpu_percent, memory_bytes, request_count)
- Keep tag cardinality low (avoid user_id, request_id as tags)
- Use consistent naming conventions (snake_case, units in the name:
memory_bytesnotmemory)
FAQ
Q: Can I use PostgreSQL for time-series data?
Yes, for moderate volumes. PostgreSQL with proper partitioning (partition by time range) and indexes handles millions of rows efficiently. TimescaleDB extends PostgreSQL with automatic time-based partitioning, continuous aggregates, and compression. For very high write throughput (millions of points per second) or very large datasets (trillions of points), a dedicated time-series database is more efficient.
Q: What is the difference between Prometheus and InfluxDB?
Prometheus is pull-based: it scrapes metrics from targets at regular intervals. InfluxDB is push-based: applications send metrics to InfluxDB. Prometheus is designed for reliability and simplicity - each instance is independent. InfluxDB supports clustering and long-term storage natively. Prometheus is the standard for Kubernetes monitoring. InfluxDB is more general-purpose. Many teams use both: Prometheus for Kubernetes metrics, InfluxDB for application and IoT metrics.
Q: How do you handle time zones in time-series data?
Store all timestamps in UTC. Convert to local time zones only at the display layer. Mixing time zones in storage causes confusion and bugs (especially around daylight saving time transitions). Most time-series databases store timestamps as Unix epoch (seconds or nanoseconds since 1970-01-01 00:00:00 UTC), which is inherently timezone-agnostic.
Interview questions
Q1: You are building a monitoring system for 10,000 servers, each reporting 100 metrics every 10 seconds. What is the write throughput and how do you design the storage?
Strong answer: 10,000 servers x 100 metrics x 6 writes/minute = 6 million writes per minute = 100,000 writes per second. This is beyond what a single PostgreSQL instance handles efficiently. Use a time-series database: InfluxDB or TimescaleDB. Design the data model: measurement name (e.g., system.cpu), tags for host and region (low cardinality), fields for the metric values. Use a retention policy: keep 1-second resolution for 7 days, 1-minute resolution for 90 days, 1-hour resolution for 2 years. Use a write buffer (Telegraf or Prometheus remote write) to batch writes and handle backpressure. For the query layer, use Grafana with pre-built dashboards. For alerting, use Prometheus Alertmanager or Grafana Alerting.
Q2: A developer wants to add user_id as a label to all Prometheus metrics. Why is this a problem?
Strong answer: Prometheus stores one time series per unique combination of metric name and labels. With user_id as a label and 1 million users, every metric becomes 1 million time series. Prometheus keeps all active time series in memory. 1 million time series per metric x 100 metrics = 100 million time series. At roughly 3KB per time series, that is 300GB of RAM just for the label index. Prometheus will run out of memory and crash. This is the high cardinality problem. The fix: do not use high-cardinality labels (user_id, request_id, session_id) in Prometheus. For per-user metrics, use a different system: store user events in ClickHouse or a data warehouse where high cardinality is handled differently (columnar storage, no in-memory label index).
Q3: How would you design a system to store and query IoT sensor data from 1 million devices, each sending a reading every minute?
Strong answer: 1 million devices x 1 reading/minute = 1 million writes per minute = ~17,000 writes per second. Use InfluxDB or TimescaleDB. Data model: measurement sensor_reading, tags for device_id (1 million unique values - this is high cardinality for Prometheus but fine for InfluxDB), device_type, location. Fields for the actual sensor values. For ingestion: use a message queue (Kafka or MQTT broker) to buffer writes and handle device reconnections. Write from the queue to the time-series database in batches. Retention policy: full resolution for 30 days, hourly aggregates for 1 year, daily aggregates forever. For queries: recent data (last 24 hours) is fast. Historical queries use the downsampled data. For anomaly detection: stream data through a processing layer (Flink or Spark Streaming) that computes rolling averages and alerts on deviations.