Key Takeaways
- Normalizing series identity into a separate metadata table and referencing it by a compact ID reduces time-series storage by about forty-two percent in our experiment. Instead of repeating dimension strings like device name, region, and location on every row, each row carries only a small integer key and the full dimension strings are stored once per unique series.
- High-cardinality fields like request IDs and session tokens should be kept out of series identity. When the number of unique dimension combinations approaches the number of rows, normalization gains collapse and both storage and indexing costs grow linearly.
- Storing series dimensions as flexible JSON (e.g., PostgreSQL jsonb) with targeted indexes avoids schema migrations as tags evolve, but requires a deliberate indexing policy to prevent index sprawl and type drift.
- Time partitioning allows O(1) data expiration and partition pruning, but creates a write hotspot on the current window. Adding a second axis (series identity) distributes writes and narrows read scans.
- Downsampling from five-second to one-hour resolution reduces row count by a factor of 720, retaining full resolution only for the window where it matters and serving older queries from pre-aggregated rollups.
Every time-series database makes a set of storage design decisions: how to lay out rows, when to compress, what to partition on. These decisions determine cost and query performance more than the choice of database itself. This article works through those fundamentals from first principles, using widely available tools like PostgreSQL and Apache Parquet to make each trade-off measurable.
What is Time-Series Data?
Time-series data is a sequence of measurements recorded over time. While a standard database record tracks current state (e.g., "Account balance is $50"), time-series data tracks the history of changes (e.g., "Balance at 10:00, 10:01, 10:02...").
Modern systems emit continuous signals of health and activity:
- Biometrics (Smartwatches record heart rate every five seconds to track workout intensity).
- Logistics (Ride-sharing apps track GPS coordinates to calculate fares and arrival times).
- Finance (Market tickers capture thousands of price updates per second).
- Infrastructure (Monitoring tools track CPU trends to predict resource exhaustion).
Cheap sensors and cheap storage allow nearly everything to be recorded as it happens, with full histories rather than snapshots. The core data point follows a repeated shape: timestamp, identifier, and value. This simplicity is deceptive, because at scale, the repetition becomes the problem. The following is an example stream for one series:

Table 1. Examples of dimensions and measurements
Here, a "series" is defined as a unique combination of identifying attributes. For example, {device=thermostat, location=living_room}.

Figure 1. Example of multiple "series" and datapoints
Each datapoint has three parts:
- Timestamp – when the measurement happened.
- Dimensions (or tags) – attributes that identify and group a series.
- Metrics (or fields) – measured values at that time.
A practical rule is to place stable identifiers in dimensions and to place changing measurements in metrics.
This split matters because queries use them differently: dimensions are used to filter and group (WHERE location='warehouse_1', GROUP BY device_id), while metrics are used for calculations (AVG(temperature_c), MAX(humidity_pct)).
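For example, a typical query filters and groups on dimensions and aggregates on metrics. The following is a sketch; the readings table and its columns are illustrative:
-- Hypothetical table: readings(ts, device_id, location, temperature_c, humidity_pct)
-- Dimensions filter and group; metrics are aggregated.
SELECT device_id,
       date_trunc('hour', ts) AS hour,
       AVG(temperature_c)     AS avg_temp_c,
       MAX(humidity_pct)      AS max_humidity_pct
FROM readings
WHERE location = 'warehouse_1'
  AND ts >= now() - interval '1 day'
GROUP BY device_id, date_trunc('hour', ts)
ORDER BY hour;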
Relational Storage: Flat vs. Normalized
To store time-series data in a relational database like PostgreSQL, we can either store all attributes in a single flat table or normalize identifiers into a separate registry.
Option A: Flat (Naive) Schema
A flat schema stores every dimension and metric in each row. This layout is simple to implement and query but forces high redundancy.
CREATE TABLE readings_flat (
    ts          timestamptz NOT NULL,
    device_id   text NOT NULL,
    location    text NOT NULL,
    region      text NOT NULL,
    metric_name text NOT NULL,
    value       double precision NOT NULL
);
CREATE INDEX idx_flat_device_ts ON readings_flat (device_id, ts);
Option B: Normalized Schema
Normalization moves stable identifiers into a series_dim table. Each measurement references a series_id instead of repeating strings.
CREATE TABLE series_dim (
    series_id serial PRIMARY KEY,
    device_id text NOT NULL,
    location  text NOT NULL,
    region    text NOT NULL,
    UNIQUE (device_id, location, region)
);
CREATE TABLE readings_normalized (
    series_id   integer REFERENCES series_dim(series_id),
    ts          timestamptz NOT NULL,
    metric_name text NOT NULL,
    value       double precision NOT NULL
);
CREATE INDEX idx_norm_series_ts ON readings_normalized (series_id, ts);
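Ingestion then has to resolve each point's dimensions to a series_id before writing the measurement. A minimal sketch for a single point follows (batch ingestion would typically cache this mapping; the device, location, and metric values are illustrative):
-- Resolve (or create) the series identity, then write the measurement.
WITH s AS (
    INSERT INTO series_dim (device_id, location, region)
    VALUES ('thermostat-42', 'living_room', 'us-east')
    ON CONFLICT (device_id, location, region)
        -- no-op update so RETURNING yields the existing series_id on conflict
        DO UPDATE SET device_id = EXCLUDED.device_id
    RETURNING series_id
)
INSERT INTO readings_normalized (series_id, ts, metric_name, value)
SELECT series_id, now(), 'temperature_c', 21.4
FROM s;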
Storage Overhead: Experimental Results
We measured the cost of repetition using a PostgreSQL 16 experiment with one thousand series and 2.8M rows. Normalization reduced total storage by approximately forty-two percent (289 MB saved).

Table 2. Storage overhead comparison of flat vs. normalized schemas
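Numbers like these can be reproduced with PostgreSQL's built-in size functions. The following is a sketch against the schemas above; the experiment's exact measurement methodology may differ:
-- Compare on-disk footprint of the two layouts (table + indexes + TOAST).
SELECT relname AS table_name,
       pg_size_pretty(pg_total_relation_size(oid)) AS total_size
FROM pg_class
WHERE relname IN ('readings_flat', 'readings_normalized', 'series_dim')
ORDER BY pg_total_relation_size(oid) DESC;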
The Cost Model
The efficiency gap is driven by what the dimension bytes are multiplied by. For flat, TotalBytes ≈ N_rows * (timestamp + metric + dimensions). For normalized, TotalBytes ≈ N_rows * (timestamp + metric + series_id) + N_series * dimensions.
Both schemas have total storage that is O(N_rows). The difference is the per-row cost: In a flat schema, every row carries the full dimension strings. In a normalized schema, every row carries only a small series ID, and the dimension strings are stored once per unique series. Since a series ID is much smaller than the full dimensions, and N_series ≪ N_rows in high-frequency monitoring, the normalized schema stores significantly less data for the same number of rows.
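An illustrative calculation makes the multiplication concrete (the byte sizes below are assumptions, not values measured in the experiment): suppose the dimension strings average 40 bytes per row, the timestamp plus value take 16 bytes, and a series_id takes 4 bytes. For 2.8M rows and one thousand series:
- Flat: 2,800,000 * (16 + 40) ≈ 157 MB of payload.
- Normalized: 2,800,000 * (16 + 4) + 1,000 * 40 ≈ 56 MB of payload.
The measured saving (about 42%) is smaller than this idealized ratio because both layouts also pay PostgreSQL's per-row header, the repeated metric_name string, and index overhead.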
Query Performance (Warm Cache)
Comparing flat and normalized reads provides some interesting insight. A range read shows identical performance: 0.74 ms (flat) vs. 0.74 ms (normalized). For the hourly average, normalized is faster: 215.51 ms (flat) vs. 164.41 ms (normalized). Storing dimensions once and referencing them by ID reduces footprint without adding query overhead.
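The hourly-average workload corresponds to a query of roughly this shape (a sketch; the exact benchmark queries are not reproduced here):
-- Hourly average per series over the dataset (normalized layout).
SELECT r.series_id,
       date_trunc('hour', r.ts) AS hour,
       AVG(r.value)             AS avg_value
FROM readings_normalized r
WHERE r.metric_name = 'temperature_c'
GROUP BY r.series_id, date_trunc('hour', r.ts);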
High Cardinality: When Normalization Breaks Down
Normalization helps when many rows share the same identity. The benefit shrinks when identity fields are high cardinality and almost unique per row.
Case A: Repeatable Dimensions (Normalization Helps)

Table 3. Example of repeated items in dimensions
Here, N_rows = 3 and N_series = 1. A normalized layout stores dimensions once in series_dim and reuses series_id for each point.
Case B: Event-Unique IDs (Normalization Benefit Shrinks)

Table 4. Example of high cardinality dimension
If request_id is part of series identity, N_series approaches N_rows, and the deduplication term in normalized storage becomes much less effective. The practical rule: keep stable dimensions in series identity, and keep per-event IDs such as request_id out of it unless query patterns require them.
External guidance reflects the same cardinality behavior. AWS CloudWatch defines each unique dimension combination as a separate metric stream. High-cardinality dimensions directly increase metric volume. CloudWatch Logs metric filters go further. They explicitly warn against dimensions like requestID and IPAddress and may disable filters that produce too many unique streams. InfluxDB's cardinality guidance identifies the same pattern: unique IDs, hashes, and random values in tags inflate series count and degrade both write and query performance.
Designing for Schema Evolution
A schema with fixed columns works until tags evolve. New tags force table changes, backfills, and index churn. For time-series systems, a practical pattern is to keep a series_dim registry keyed by series_id and store series attributes in a jsonb dimensions field. For example:
CREATE TABLE series_dim (
    series_id  bigserial PRIMARY KEY,
    dimensions jsonb NOT NULL
);
This approach keeps ingestion stable as tags change. The trade-off is that indexing becomes a policy decision rather than a one-time DDL task. A Generalized Inverted Index (GIN) on dimensions is useful for general containment and key-existence filters. Similar inverted-index ideas appear in InfluxDB v1 metadata indexing and later in its Time Series Index (TSI). Prometheus uses a similar TSDB index pattern that maps label matchers to series IDs before chunk reads (ref. Prometheus concepts, Prometheus storage, and the Prometheus project on GitHub). In addition, B-tree expression indexes are useful for a small set of hot attributes used in frequent queries (for example, dimensions->>'region').
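As a sketch of such a policy (the region key is illustrative):
-- General-purpose containment / key-existence filters, e.g.
--   WHERE dimensions @> '{"region": "eu-west"}'
CREATE INDEX idx_series_dimensions_gin
    ON series_dim USING gin (dimensions);

-- Hot attribute queried constantly: give it a dedicated B-tree expression index.
CREATE INDEX idx_series_region
    ON series_dim ((dimensions->>'region'));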
Even with this layout, a few issues recur over time:
- Tag growth – new keys keep appearing, but only some matter for filtering.
- Index sprawl – too many JSON attributes end up indexed, which increases write cost and storage.
- Type drift – the same key arrives as different types (e.g., "42" vs. 42) unless validated.
- Cardinality mistakes – volatile keys are accidentally treated as identity and inflate series count.
This design avoids frequent schema migrations and keeps common query paths fast.
Columnar Storage as the Next Lever
After series identity, cardinality, and schema evolution are under control, the next lever is storage layout. Columnar storage keeps each column together. For time-series workloads, this approach can recover part of the storage benefit of normalization even with a flatter logical model, because repeated dimension values compress well and queries scan fewer bytes. Consider the following example Parquet schema:
message readings {
    required int64 ts_utc_ms (TIMESTAMP(MILLIS,true));
    required binary series_id (UTF8);
    required binary metric_name (UTF8);
    required double value;
    optional group dimensions (MAP) {
        repeated group key_value {
            required binary key (UTF8);
            optional binary value (UTF8);
        }
    }
}
Columnar Storage Options and Query Engines
The Parquet schema above defines how data is laid out on disk, but Parquet is a file format, not a database. Choosing columnar storage also means choosing where those files live and what reads them.
Parquet on Object Storage
The simplest deployment is Parquet files in Amazon Simple Storage Service (S3) or a compatible object store such as MinIO or Google Cloud Storage (GCS). This approach decouples storage cost from compute; you pay commodity rates for retention and only pay for compute when you query.
Several engines can query Parquet directly from S3. DuckDB is an embedded analytical database well-suited for single-node analytics and CLI exploration. AWS Athena provides serverless, pay-per-query SQL directly from S3. Trino and Presto are federated SQL engines designed for interactive queries at scale, while Apache Spark is the standard choice for distributed batch processing over large Parquet datasets. ClickHouse can read Parquet via external table functions, and DataFrame libraries like Polars and Pandas (via PyArrow) support Parquet natively for ad hoc analysis in notebooks.
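As a concrete sketch of the DuckDB path (the bucket path is hypothetical, and S3 credentials must be configured separately, for example via DuckDB's secrets support):
-- DuckDB: query Parquet files directly from S3.
INSTALL httpfs;
LOAD httpfs;

SELECT series_id,
       date_trunc('hour', to_timestamp(ts_utc_ms / 1000)) AS hour,
       avg(value) AS avg_value
FROM read_parquet('s3://my-metrics-bucket/readings/*.parquet')
WHERE metric_name = 'temperature_c'
GROUP BY series_id, hour
ORDER BY series_id, hour;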
Note that PostgreSQL, which is used throughout this article to illustrate relational trade-offs, cannot natively query Parquet. The typical architecture separates tiers: PostgreSQL handles the hot window (low-latency writes and indexed lookups), while Parquet on S3 handles the cold window (scan-heavy analytical queries).
Apache Iceberg as an Open Table Format
Raw Parquet on S3 works, but at scale you lose schema evolution, Atomicity, Consistency, Isolation, and Durability (ACID) transactions, and automatic partition management. Apache Iceberg adds a metadata layer over Parquet files that solves these problems without replacing Parquet as the physical storage. For time-series, Iceberg is a strong default because it supports schema evolution without rewriting data, hidden partitioning by time, snapshot isolation, and built-in compaction tracking. Its biggest practical advantage is broad engine support. The same Iceberg table can be written by Spark or Flink and queried by Athena, Trino, DuckDB, ClickHouse, Snowflake, BigQuery, Dremio, StarRocks, or Polars without copying or converting data. S3 Tables provides a managed Iceberg-compatible store that handles compaction, snapshot expiration, and catalog integration automatically.
Generating Parquet Data at Scale
For moderate volumes, a periodic export (e.g., a nightly job that queries PostgreSQL and writes Parquet via PyArrow or DuckDB) is sufficient. At higher ingest rates, the standard pattern is streaming directly into Iceberg; Spark Structured Streaming and Apache Flink both have native Iceberg sink support, handling partitioning, schema enforcement, and automatic file sizing.
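For the periodic-export path, a nightly DuckDB job might look roughly like this (a sketch; the connection string, bucket path, and partitioning column are assumptions):
-- DuckDB: pull yesterday's rows from PostgreSQL and write day-partitioned Parquet.
INSTALL postgres; LOAD postgres;
INSTALL httpfs;   LOAD httpfs;   -- needed for s3:// targets; credentials configured separately

ATTACH 'host=localhost dbname=metrics user=etl' AS pg (TYPE postgres, READ_ONLY);

COPY (
    SELECT series_id, ts, ts::DATE AS day, metric_name, value
    FROM pg.public.readings_normalized
    WHERE ts >= current_date - INTERVAL 1 DAY
      AND ts <  current_date
) TO 's3://my-metrics-bucket/readings' (FORMAT parquet, PARTITION_BY (day));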
Parquet's compression advantages assume reasonably sized files. In practice, file sizing is one of the most common operational issues after adopting columnar storage on S3. If a pipeline flushes a new Parquet file every few seconds, the result is thousands of tiny files. Each carries its own footer metadata, and each S3 GET has a per-request cost. A query opening ten thousand one-kilobyte files is far slower and more expensive than one opening ten files that are one megabyte each. A common guideline is 128 MB to 1 GB per file for S3-based analytics, balancing parallelism against metadata overhead and request costs.
Comparison: Normalized Postgres vs. Columnar Parquet
We measured the storage requirements of our 2.8M row dataset using our most efficient normalized PostgreSQL layout against a flat Parquet file.

Table 5. Normalized PostgreSQL vs. columnar Parquet storage, with and without UUID columns
The baseline gap is driven almost entirely by compression. Parquet applies dictionary encoding to every column including timestamps, so repeated dimension strings, metric names, and even timestamps that recur across series collapse to small lookup tables with integer references. In this experiment, one thousand series and 2.8M rows meant heavy repetition in every column, which is why dictionary encoding alone gets the file down to under 1 MB. As timestamp cardinality grows with more series, finer resolution, and longer time ranges, dictionary encoding becomes less effective on that column and the ratio narrows. When a UUID column is added, each value is unique and dictionary encoding can no longer help at all. The file jumps to 100 MB, and the compression ratio drops from approximately 434 times to approximately 3.7 times, which is still a meaningful reduction.
Beyond compression ratios, columnar layout has broader advantages for time-series workloads:
Optimized storage footprint
Because columnar storage groups values for many rows together, the system can independently apply the most efficient compression, such as Delta, RLE, or Dictionary encoding, to each column based on its data type and pattern. This specialized compression collapses sequential timestamps and deduplicates stable dimensions, reducing data volume by orders of magnitude compared to uncompressed row storage.
High-efficiency analytical reads
Queries avoid I/O amplification by scanning only the required column segments. This selective projection allows for rapid aggregation over specific metrics without the cost of reading the entire row.
Cheaper storage
Storing Parquet files in commodity object storage (e.g., S3) allows cost-effective long-term retention, decoupling archival history from the high-performance SSDs required by active databases.
For further reading, see: Column-stores vs. Row-stores: How Different Are They Really? (SIGMOD 2008).
Wide vs. Narrow Schema for Multi-Metric Rows
When multiple metrics are emitted at the same timestamp, row shape changes both storage overhead and query complexity. Narrow (long-form) stores one row per metric:

Table 6. Narrow schema: one row per metric per timestamp, with series identity and timestamp repeated for each metric
Wide stores one row with multiple metric columns:

Table 7. Wide schema: one row per timestamp, with all metrics for that series stored as separate columns
Trade-offs:
- Narrow repeats (series_id, ts) for every metric row.
- Wide stores (series_id, ts) once for that timestamp.
- If a query needs multiple metrics together at the same timestamp, a narrow schema usually needs a pivot or self-join on (series_id, ts).
- A wide schema can return those metrics from a single row read, which keeps query logic simpler.
As metric count per timestamp grows, narrow schemas spend more storage on repeated keys instead of new information. This overhead appears in both data and indexes. Wide schemas avoid most of that repetition, as long as metrics are stable and co-emitted.
Example (narrow self-join):
SELECT t.series_id, t.ts, t.value AS temperature_c, h.value AS humidity_pct
FROM readings_narrow t
JOIN readings_narrow h
  ON h.series_id = t.series_id
 AND h.ts = t.ts
WHERE t.metric_name = 'temperature_c'
  AND h.metric_name = 'humidity_pct'
  AND t.series_id = 'A1'
  AND t.ts >= now() - interval '1 hour';
Wide query:
SELECT series_id, ts, temperature_c, humidity_pct
FROM readings_wide
WHERE series_id = 'A1' AND ts >= now() - interval '1 hour';
Practical rules:
- Use wide when metric sets are small, stable, and emitted together.
- Prefer wide when common queries need multiple metrics at the same timestamp in one result row.
- Use narrow when metrics are sparse, dynamic, or sampled independently.
- Expect wide schemas to accumulate nullable columns when metric sets evolve quickly (see the sketch below).
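For illustration, a wide table matching the earlier query might be declared as follows (a sketch; the third metric column is hypothetical and shows how nullable columns accumulate):
CREATE TABLE readings_wide (
    series_id     text        NOT NULL,
    ts            timestamptz NOT NULL,
    temperature_c double precision,   -- nullable: not every series reports it
    humidity_pct  double precision,   -- nullable
    pressure_hpa  double precision,   -- added later as the metric set evolves
    PRIMARY KEY (series_id, ts)
);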
Partitioning: Time First, Then Space
Everything so far has assumed data lives in a single table. As data grows, a monolithic table becomes harder to query efficiently and more expensive to maintain. Partitioning splits the dataset into smaller, independently manageable chunks. On a single database instance, this approach improves query pruning, simplifies retention, and reduces maintenance overhead. The same partition boundaries later become the natural unit for distributing data across nodes if a single instance is no longer enough.
Time Partitioning
Time is the natural partitioning axis. Time-series queries are almost always time-bounded ("last hour," or "last 7 days") and data has a built-in ordering that aligns with how it arrives.
Splitting data into time-ranged chunks gives you three things:
Partition pruning
A query for the last hour only touches the current partition. The planner eliminates non-matching partitions using metadata alone (no index scan and no I/O).
Cheap expiration
Dropping an old partition is an O(1) metadata operation. Deleting rows from a monolithic table requires scanning, marking, and vacuuming. On a large table, those operations can take minutes and cause I/O spikes.
Independent maintenance
Each partition can be vacuumed, reindexed, or analyzed independently. The hot (current) partition gets frequent maintenance; cold partitions are left alone. Indexes per partition are also smaller and shallower, so that lookups are fast and memory use is predictable.
In PostgreSQL, this approach is declarative:
CREATE TABLE readings (
    series_id   integer NOT NULL,
    ts          timestamptz NOT NULL,
    metric_name text NOT NULL,
    value       double precision NOT NULL
) PARTITION BY RANGE (ts);
CREATE TABLE readings_2026_02_01
    PARTITION OF readings
    FOR VALUES FROM ('2026-02-01') TO ('2026-02-02');
CREATE TABLE readings_2026_02_02
    PARTITION OF readings
    FOR VALUES FROM ('2026-02-02') TO ('2026-02-03');
Queries with a WHERE ts BETWEEN ... clause automatically hit only the relevant partitions. Old days can be dropped with DROP TABLE readings_2026_01_15. No row-level cleanup is needed.
Why Time Alone Creates a Hotspot
Time-series writes have a specific property: they cluster around "now". Every active series emits its latest point into the same current time window. If you partition only by time, the current partition absorbs all write traffic while historical partitions sit idle.
For a system ingesting 100K points per second across 10K series, the current daily partition handles the full 100K writes per second. The previous day's partition handles zero. This is a write hotspot by construction. The same problem appears on reads. A query like "give me device-42's data for the last hour" hits the current partition, but that partition contains every series for that hour. Without further organization, the query scans data for all 10K series to find the one it needs.
Two-Dimensional Partitioning: Time and Space
The fix is to partition on two axes: time and series identity. Each resulting partition covers a time range and a subset of series. This approach distributes writes across multiple partitions even within the current time window.
The "space" dimension is typically derived from series identity: a hash of series_id or a natural grouping like region or device type.

Figure 2. Visualization of two-dimensional time series partitioning
Now the 100K writes per second for the current day spread across three partitions instead of one. A query for a specific series touches only the partition that contains its hash bucket, within the relevant time range. For a deeper treatment of how this two-dimensional partitioning works in a managed system at scale, see Werner Vogels' writeup on Amazon Timestream's design.
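In PostgreSQL, the second axis can be expressed with sub-partitioning: each daily range partition is itself hash-partitioned on series_id. The following is a sketch using three hash buckets to match the figure, extending the readings table declared earlier:
-- The daily partition is itself partitioned by hash of series identity.
CREATE TABLE readings_2026_02_03
    PARTITION OF readings
    FOR VALUES FROM ('2026-02-03') TO ('2026-02-04')
    PARTITION BY HASH (series_id);

CREATE TABLE readings_2026_02_03_h0 PARTITION OF readings_2026_02_03
    FOR VALUES WITH (MODULUS 3, REMAINDER 0);
CREATE TABLE readings_2026_02_03_h1 PARTITION OF readings_2026_02_03
    FOR VALUES WITH (MODULUS 3, REMAINDER 1);
CREATE TABLE readings_2026_02_03_h2 PARTITION OF readings_2026_02_03
    FOR VALUES WITH (MODULUS 3, REMAINDER 2);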
Downsampling and Retention: Trading Resolution for Cost
Raw time-series data has a short useful life at full resolution. A five-second CPU reading matters during an active incident. A week later, the minute-level trend is enough. A year later, hourly averages are all anyone queries.
Downsampling reduces data resolution over time by pre-computing aggregates at coarser intervals. Retention policies define when to discard data entirely. Together, they control the growth curve of storage cost.
The Resolution Ladder
A typical policy looks like this figure:

Figure 3. The resolution ladder
Each step reduces row count by the ratio of the intervals. Going from five-second to one-minute resolution cuts rows by a factor of 12. Going from one-minute to one-hour resolution cuts rows by another factor of 60. A series producing 17,280 raw points per day becomes 1,440 at one-minute resolution and 24 at one-hour resolution, a 720-times reduction overall.
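A rollup step can be as simple as a scheduled aggregate insert (a sketch; the readings_1h target table and the scheduling mechanism are assumptions):
-- Roll the previous hour up from raw readings into an hourly table.
INSERT INTO readings_1h (series_id, ts_hour, metric_name, avg_value, max_value, min_value, sample_count)
SELECT series_id,
       date_trunc('hour', ts) AS ts_hour,
       metric_name,
       AVG(value), MAX(value), MIN(value), COUNT(*)
FROM readings
WHERE ts >= date_trunc('hour', now()) - interval '1 hour'
  AND ts <  date_trunc('hour', now())
GROUP BY series_id, date_trunc('hour', ts), metric_name;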
Cost Challenges of Dashboard Refreshes
With storage and retention under control, the remaining cost driver is read amplification, and dashboards are the most common source of it.
A dashboard query can be cheap in isolation. At scale, refresh traffic multiplies cost quickly. If:
U = concurrent users
Q = queries per dashboard view
R = refresh interval (seconds)
Then:
QPS ~= (U * Q) / R
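For example (illustrative numbers): 50 concurrent users on a dashboard with 12 queries per view, refreshing every 30 seconds, gives QPS ~= (50 * 12) / 30 = 20, or roughly 1.7 million queries per day even if nothing new is written.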
The ingest rate may stay flat, but read cost can still spike because the same time windows and aggregates are recomputed repeatedly.
Practical mitigations to consider:
- Cache results with short TTLs.
- Serve dashboard queries from pre-aggregated rollups, not raw rows.
- Use query-caching features in the dashboard/query layer when available (for example, Grafana with Amazon Timestream).
Conclusion
The storage decisions underneath a time-series database (how you model identity, what you partition on, when you downsample, and how you lay out columns) determine cost and performance more than the choice of database itself. These fundamentals apply whether you're running PostgreSQL, Parquet on S3, or a managed time-series service. Get these fundamentals right, and you will avoid the class of problems that show up as surprise bills and slow dashboards, regardless of which database you pick.