2018-01-15

An Introduction to Time Series Data

Walking through the basics of time-series data: timestamps, intervals, and the four value types you'll come across.

Timeseries, TypeScript, SkySpark

When I joined CoolPlanet, everything I touched was grounded in time-series data. Sensor readings, equipment states, billing intervals, schedules. It soon became my bread and butter.

The starting point is the structure: every reading has two parts, a timestamp and a value.
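To make that structure concrete, here is a minimal TypeScript sketch. The names `Reading`, `Series`, and `isChronological` are my own for illustration and don't come from any particular library, and the UTC offsets on the sample timestamps are an assumption.

```typescript
// A reading pairs a timestamp with a value. All names here are illustrative.
interface Reading<V> {
  // ISO 8601 string with an explicit UTC offset, so the moment is unambiguous.
  timestamp: string;
  value: V;
}

// A series is simply an ordered list of readings.
type Series<V> = Reading<V>[];

// Four temperature readings in °C; the +00:00 offset is assumed for the example.
const roomTemperature: Series<number> = [
  { timestamp: "2019-01-01T00:00:00+00:00", value: 7.7 },
  { timestamp: "2019-01-01T00:15:00+00:00", value: 8.1 },
  { timestamp: "2019-01-01T00:30:00+00:00", value: 7.6 },
  { timestamp: "2019-01-01T00:45:00+00:00", value: 7.1 },
];

// Returns true when every reading is strictly later than the one before it.
function isChronological<V>(series: Series<V>): boolean {
  for (let i = 1; i < series.length; i++) {
    const previous = Date.parse(series[i - 1].timestamp);
    const current = Date.parse(series[i].timestamp);
    if (current <= previous) return false;
  }
  return true;
}
```

Keeping the value type generic means the same shape can carry numbers, booleans, or strings without changing the timestamp handling.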

Timestamp

A timestamp pins a reading to an absolute moment in time. To do that unambiguously you need date, time, and timezone.
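A quick sketch of why the timezone part matters: the same date and time, anchored with two different UTC offsets, lands on two different moments. The variable names are mine, and a fixed offset is a simplification, since a real timezone also carries daylight-saving rules.

```typescript
// The same wall-clock date and time, without any timezone information.
const wallClock = "2019-01-01T00:00:00";

// Anchor it twice, with two different UTC offsets.
const inUtc = Date.parse(`${wallClock}+00:00`);
const inTokyo = Date.parse(`${wallClock}+09:00`); // Japan Standard Time is UTC+9

// Midnight in Tokyo happens before midnight UTC, so the two moments differ.
const HOUR_MS = 60 * 60 * 1000;
const hoursApart = (inUtc - inTokyo) / HOUR_MS; // 9
```

Without the offset, a reader of the data has to guess which of those moments was meant.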

I've worked with customer data from sites all over the world, and timezones have given me plenty of headaches over the years. The reasons are well outside the scope of this article, but Zain Rizvi's "Falsehoods programmers believe about time zones" is a good place to start.

Interval

The duration between consecutive readings is the interval. Same underlying signal, different intervals, different stories.

Interval sampler

Same week of temperature data, sampled at different intervals. Coarser intervals lose the daily swing.

Points sampled: 672 (over 7 days at 15 min intervals)

Timestamp           Temperature
2019-01-01 00:00    7.7 °C
2019-01-01 00:15    8.1 °C
2019-01-01 00:30    7.6 °C
2019-01-01 00:45    7.1 °C
(+ 668 more)
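Here is a rough sketch of the same idea in code. The signal is synthetic (a sine wave standing in for a daily temperature swing, not real sensor data), and `generateSeries`, `sampleEvery`, and `swing` are hypothetical helpers written for this example.

```typescript
interface Sample {
  timestamp: number; // epoch milliseconds
  value: number; // synthetic temperature in °C
}

const MINUTE_MS = 60 * 1000;

// Builds a synthetic series with a daily swing: coldest at 03:00, warmest at 15:00.
function generateSeries(start: number, days: number, intervalMinutes: number): Sample[] {
  const count = (days * 24 * 60) / intervalMinutes;
  const series: Sample[] = [];
  for (let i = 0; i < count; i++) {
    const hourOfDay = ((i * intervalMinutes) / 60) % 24;
    const value = 8 + 4 * Math.sin(((hourOfDay - 9) / 24) * 2 * Math.PI);
    series.push({ timestamp: start + i * intervalMinutes * MINUTE_MS, value });
  }
  return series;
}

// Keeps every `step`-th sample, which widens the interval by that factor.
function sampleEvery(series: Sample[], step: number): Sample[] {
  return series.filter((_, index) => index % step === 0);
}

// Difference between the highest and lowest value in the series.
function swing(series: Sample[]): number {
  const values = series.map((s) => s.value);
  return Math.max(...values) - Math.min(...values);
}

const fifteenMinute = generateSeries(Date.UTC(2019, 0, 1), 7, 15); // 672 points
const hourly = sampleEvery(fifteenMinute, 4); // 168 points
const daily = sampleEvery(fifteenMinute, 96); // 7 points, all taken at 00:00
```

With this synthetic signal, the 15 minute and hourly series both show a swing of about 8 °C, while the daily series only ever sees the midnight value and reports no swing at all.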

Value

Value types vary by source. The four you'll meet most often are numeric, boolean, period, and string. Each constrains what folding operations make sense; for the deep dive on folding, see Folding and Interpolation.
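One way to model the four types in TypeScript is a discriminated union. This is only a sketch: the `kind` tags and `formatValue` are names I made up, and representing a period as a start and end is an assumption, since sources differ on what a period value looks like.

```typescript
// One variant per value type; the `kind` field tells them apart.
type Value =
  | { kind: "numeric"; value: number; unit?: string }
  | { kind: "boolean"; value: boolean }
  // Assumption: a period is modelled here as a start and end timestamp.
  | { kind: "period"; start: string; end: string }
  | { kind: "string"; value: string };

// Renders any value as text; the switch is checked for exhaustiveness by the compiler.
function formatValue(v: Value): string {
  switch (v.kind) {
    case "numeric":
      return v.unit ? `${v.value} ${v.unit}` : String(v.value);
    case "boolean":
      return v.value ? "true" : "false";
    case "period":
      return `${v.start} to ${v.end}`;
    case "string":
      return v.value;
  }
}
```

The benefit of tagging each value is that code further down the pipeline can branch on `kind` instead of guessing from the raw data.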

Value type explorer

Numeric is the most common format. The value is a number, often with an associated unit. Example: room temperature.

Timestamp    Room Temperature
00:00        20.1 °C
00:05        20.4 °C
00:10        20.6 °C
00:15        20.9 °C
00:20        21.0 °C
00:25        20.7 °C
00:30        20.3 °C
00:35        19.8 °C
00:40        19.5 °C
00:45        19.6 °C
00:50        19.9 °C
00:55        20.2 °C
01:00        20.4 °C
Folding

Numeric data folds cleanly. Plenty of operations apply (sum, count, percentiles, standard deviation, and more); a few examples for the data above.

  • Average: 20.3 °C
  • Minimum: 19.5 °C
  • Maximum: 21.0 °C
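Those three folds can be reproduced with a few lines of TypeScript. This is a minimal sketch over the room temperature values listed above; `roundToOneDecimal` is a helper I added for display and the variable names are my own.

```typescript
// The thirteen room temperature readings from 00:00 to 01:00, in °C.
const temperatures: number[] = [
  20.1, 20.4, 20.6, 20.9, 21.0, 20.7, 20.3, 19.8, 19.5, 19.6, 19.9, 20.2, 20.4,
];

// Rounds to one decimal place, matching the precision of the readings.
function roundToOneDecimal(x: number): number {
  return Math.round(x * 10) / 10;
}

// Each fold collapses the whole list into a single number.
const total = temperatures.reduce((sum, t) => sum + t, 0);
const average = roundToOneDecimal(total / temperatures.length); // 20.3
const minimum = temperatures.reduce((lowest, t) => Math.min(lowest, t)); // 19.5
const maximum = temperatures.reduce((highest, t) => Math.max(highest, t)); // 21
```

Every fold here has the same shape: walk the readings once and carry a running result, which is why numeric data is so easy to aggregate.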

Why it matters

None of this is rocket science, but the distinctions earn their keep further down the pipeline. Storage layers fold differently depending on the value type. Charts have to render booleans and strings as steps, not lines. Knowing what kind of value you're holding is the first thing you reach for when something doesn't look right.