Folding and Interpolation: Aligning Time Series Data

Timeseries, Folding, Interpolation

Folding and interpolation are techniques for aligning time series data to a given interval.

Folding

Folding is also known as aggregating or rolling up.

Folding time series data is the process of taking many values within a specific time range and folding them into a single representative value for the entire time range. The fold function determines how a raw history is “folded” when the history is being rolled up.

Consider the following raw history, recorded at 1-minute intervals:

| timestamp | My Point |
| --- | --- |
| 2019-01-01 00:00 | 1 kWh |
| 2019-01-01 00:01 | 2 kWh |
| 2019-01-01 00:02 | 3 kWh |
| 2019-01-01 00:03 | 2 kWh |
| 2019-01-01 00:04 | 1 kWh |
| 2019-01-01 00:05 | 2 kWh |
| 2019-01-01 00:06 | 3 kWh |
| 2019-01-01 00:07 | 4 kWh |
| 2019-01-01 00:08 | 3 kWh |
| 2019-01-01 00:09 | 2 kWh |

If we roll this history up into 5-minute intervals, the result depends on how we fold the raw history. The table below shows the resulting values for the fold functions “sum”, “avg”, “min” and “max”.

| timestamp | Sum | Avg | Min | Max |
| --- | --- | --- | --- | --- |
| 2019-01-01 00:00 | 9 kWh | 1.8 kWh | 1 kWh | 3 kWh |
| 2019-01-01 00:05 | 14 kWh | 2.8 kWh | 2 kWh | 4 kWh |
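The roll-up above can be reproduced with a few lines of Python. This is only a minimal sketch: the `fold` helper and the `FOLDS` lookup are hypothetical names chosen for illustration, and it assumes each timestamp is floored to the start of its interval, which is what the table implies.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean

# Hypothetical lookup of fold functions; the names mirror the table columns.
FOLDS = {"sum": sum, "avg": mean, "min": min, "max": max}


def fold(history, interval, fold_name):
    """Roll (timestamp, value) pairs up into buckets of the given interval."""
    fn = FOLDS[fold_name]
    buckets = defaultdict(list)
    for ts, value in history:
        # Assumption: a value belongs to the interval that starts at or before it.
        offset = (ts - datetime.min) % interval
        buckets[ts - offset].append(value)
    return [(start, fn(values)) for start, values in sorted(buckets.items())]


# The raw 1-minute history from the example (values in kWh).
start = datetime(2019, 1, 1)
values = [1, 2, 3, 2, 1, 2, 3, 4, 3, 2]
raw = [(start + timedelta(minutes=i), v) for i, v in enumerate(values)]

for name in FOLDS:
    print(name, [v for _, v in fold(raw, timedelta(minutes=5), name)])
```

Each fold produces two values, one for the interval starting at 00:00 and one for the interval starting at 00:05, matching the corresponding column of the table.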

The available fold functions depend on the data type.

Interpolation

Interpolating time series data could be considered the inverse of folding time series data. With folding, many values are transformed into a single value. With interpolation, two or more values are transformed into many values.

Interpolation is one method used for filling gaps in data.

Consider the following history, which has data every 5 minutes but nothing in between:

| timestamp | My Point |
| --- | --- |
| 2019-01-01 00:00 | 5 kWh |
| 2019-01-01 00:01 | |
| 2019-01-01 00:02 | |
| 2019-01-01 00:03 | |
| 2019-01-01 00:04 | |
| 2019-01-01 00:05 | 10 kWh |
| 2019-01-01 00:06 | |
| 2019-01-01 00:07 | |
| 2019-01-01 00:08 | |
| 2019-01-01 00:09 | |
| 2019-01-01 00:10 | 5 kWh |

There are three approaches to interpolating data. The correct approach to use depends entirely on the data in question:

  1. Linear - the data is simply linearly interpolated between known values. Usually applied to sampled data such as temperature trends.

  2. Change of Value (COV) - all gaps in the data are filled with the last known value. Usually applied to set point trends or boolean data.

  3. Apportion - the size of the gap is determined and the known value is evenly apportioned across the gap. Usually applied to consumption data such as energy usage.

The table below demonstrates the three approaches:

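| timestamp | Original | Linear | COV | Apportion |
| --- | --- | --- | --- | --- |
| 2019-01-01 00:00 | 5 | 5 | 5 | 1 |
| 2019-01-01 00:01 | | 6 | 5 | 1 |
| 2019-01-01 00:02 | | 7 | 5 | 1 |
| 2019-01-01 00:03 | | 8 | 5 | 1 |
| 2019-01-01 00:04 | | 9 | 5 | 1 |
| 2019-01-01 00:05 | 10 | 10 | 10 | 2 |
| 2019-01-01 00:06 | | 9 | 10 | 2 |
| 2019-01-01 00:07 | | 8 | 10 | 2 |
| 2019-01-01 00:08 | | 7 | 10 | 2 |
| 2019-01-01 00:09 | | 6 | 10 | 2 |
| 2019-01-01 00:10 | 5 | 5 | 5 | 5 |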
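The three approaches can be sketched in plain Python. This is an illustrative sketch, not a reference implementation: the function names are hypothetical, timestamps are simplified to minute offsets from 00:00, and the apportion rule (each known value is spread forward over the slots up to the next known value, with the final value left untouched) is an assumption inferred from the table above.

```python
def _segments(known):
    """Yield consecutive pairs of known (minute, value) points in time order."""
    points = sorted(known.items())
    return list(zip(points, points[1:])), points[-1]


def linear(known):
    """Fill gaps by drawing a straight line between neighbouring known values."""
    pairs, (last_t, last_v) = _segments(known)
    out = {}
    for (t0, v0), (t1, v1) in pairs:
        for t in range(t0, t1):
            out[t] = v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    out[last_t] = last_v
    return out


def change_of_value(known):
    """Fill gaps by carrying the last known value forward."""
    pairs, (last_t, last_v) = _segments(known)
    out = {}
    for (t0, v0), (t1, _) in pairs:
        for t in range(t0, t1):
            out[t] = v0
    out[last_t] = last_v
    return out


def apportion(known):
    """Spread each known value evenly over the slots up to the next known value."""
    pairs, (last_t, last_v) = _segments(known)
    out = {}
    for (t0, v0), (t1, _) in pairs:
        slots = t1 - t0  # number of 1-minute slots sharing this value
        for t in range(t0, t1):
            out[t] = v0 / slots
    # Assumption: the final value has no following gap, so it is kept as-is.
    out[last_t] = last_v
    return out


# Known values from the example, keyed by minutes after 00:00.
known = {0: 5, 5: 10, 10: 5}

for fn in (linear, change_of_value, apportion):
    filled = fn(known)
    print(fn.__name__, [filled[t] for t in range(11)])
```

Run against the example history, the three functions reproduce the Linear, COV and Apportion columns respectively. Note that only apportion preserves the total within each filled gap, which is why it suits consumption data.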