Create rolling groups based on a temporal or integer column

Description

If you have a time series <t_0, t_1, …, t_n>, then by default the windows created will be:

  • (t_0 - period, t_0]
  • (t_1 - period, t_1]
  • …
  • (t_n - period, t_n]

whereas if you pass a non-default offset, then the windows will be:

  • (t_0 + offset, t_0 + offset + period]
  • (t_1 + offset, t_1 + offset + period]
  • …
  • (t_n + offset, t_n + offset + period]
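The window bounds above can be sketched in base R, without polars, to make the role of offset concrete. The helper window_bounds() below is hypothetical and purely illustrative; it is not part of the polars API, and it uses plain numbers in place of timestamps.

```r
# Hypothetical helper (not a polars function): compute the bounds of the
# window attached to each index value t, following the formulas above.
window_bounds <- function(t, period, offset = -period) {
  data.frame(
    t = t,
    lower = t + offset,          # excluded when closed = "right"
    upper = t + offset + period  # included when closed = "right"
  )
}

t <- c(1, 2, 4, 7)

# Default offset (-period): each window is (t - period, t]
default_windows <- window_bounds(t, period = 2)

# Non-default offset of 0: each window is (t, t + period]
shifted_windows <- window_bounds(t, period = 2, offset = 0)

print(default_windows)
print(shifted_windows)
```

With the default offset the upper bound equals t itself, so each window looks backwards from its own row; any other offset shifts the whole window relative to t.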

Usage

<Expr>$rolling(index_column, ..., period, offset = NULL, closed = "right")

Arguments

index_column Character. Name of the column used to group based on the time window. Often of type Date/Datetime. This column must be sorted in ascending order. In case of a rolling group by on indices, dtype needs to be one of UInt32, UInt64, Int32, Int64. Note that the first three get cast to Int64, so if performance matters use an Int64 column.
... These dots are for future extensions and must be empty.
period Length of the window - must be non-negative.
offset Offset of the window. Default is -period.
closed Define which sides of the range are closed (inclusive). One of the following: “right” (default), “left”, “both”, “none”.
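As a rough illustration of the closed argument, the sketch below compares closed = "right" with closed = "both" on an integer index column. The data frame df_int and its column names are made up for this example, and the period string "2i" (two index steps) is assumed to be accepted for integer index columns in your installed polars version.

```r
library("polars")

# Illustrative integer-indexed data; `idx` is stored as Int32 (R integer).
df_int <- pl$DataFrame(idx = 1:5, a = c(1, 2, 3, 4, 5))

out <- df_int$with_columns(
  # window (idx - 2, idx]: the lower bound is excluded
  sum_right = pl$col("a")$sum()$rolling(
    index_column = "idx", period = "2i", closed = "right"
  ),
  # window [idx - 2, idx]: both bounds are included
  sum_both = pl$col("a")$sum()$rolling(
    index_column = "idx", period = "2i", closed = "both"
  )
)
out
```

With closed = "both", each window also picks up the row sitting exactly on the lower bound, so sum_both is never smaller than sum_right for these positive values.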

Details

If you want to compute multiple aggregation statistics over the same dynamic window, consider using $rolling(), which can cache the window size computation.
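One way to read this note is that several aggregations can share a single window definition. The sketch below assumes the DataFrame-level <DataFrame>$rolling() method followed by $agg() is available in your polars version; it declares the window once and computes three statistics over it. The data is the same small series used in the Examples section.

```r
library("polars")

dates <- as.POSIXct(
  c(
    "2020-01-01 13:45:48", "2020-01-01 16:42:13", "2020-01-01 16:45:09",
    "2020-01-02 18:12:48", "2020-01-03 19:45:32", "2020-01-08 23:16:43"
  )
)
df <- pl$DataFrame(dt = dates, a = c(3, 7, 5, 9, 2, 1))

# Define the 2-day window once, then aggregate several statistics over it.
agg <- df$rolling(index_column = "dt", period = "2d")$agg(
  sum_a = pl$col("a")$sum(),
  min_a = pl$col("a")$min(),
  max_a = pl$col("a")$max()
)
agg
```

Whether this is faster than repeating the window in each expression depends on the polars version and query plan, so treat it as a pattern to try rather than a guaranteed optimization.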

Value

A polars expression

Examples

library("polars")

dates <- as.POSIXct(
  c(
    "2020-01-01 13:45:48", "2020-01-01 16:42:13", "2020-01-01 16:45:09",
    "2020-01-02 18:12:48", "2020-01-03 19:45:32", "2020-01-08 23:16:43"
  )
)
df <- pl$DataFrame(dt = dates, a = c(3, 7, 5, 9, 2, 1))

df$with_columns(
  sum_a = pl$col("a")$sum()$rolling(index_column = "dt", period = "2d"),
  min_a = pl$col("a")$min()$rolling(index_column = "dt", period = "2d"),
  max_a = pl$col("a")$max()$rolling(index_column = "dt", period = "2d")
)
#> shape: (6, 5)
#> ┌─────────────────────┬─────┬───────┬───────┬───────┐
#> │ dt                  ┆ a   ┆ sum_a ┆ min_a ┆ max_a │
#> │ ---                 ┆ --- ┆ ---   ┆ ---   ┆ ---   │
#> │ datetime[ms]        ┆ f64 ┆ f64   ┆ f64   ┆ f64   │
#> ╞═════════════════════╪═════╪═══════╪═══════╪═══════╡
#> │ 2020-01-01 13:45:48 ┆ 3.0 ┆ 3.0   ┆ 3.0   ┆ 3.0   │
#> │ 2020-01-01 16:42:13 ┆ 7.0 ┆ 10.0  ┆ 3.0   ┆ 7.0   │
#> │ 2020-01-01 16:45:09 ┆ 5.0 ┆ 15.0  ┆ 3.0   ┆ 7.0   │
#> │ 2020-01-02 18:12:48 ┆ 9.0 ┆ 24.0  ┆ 3.0   ┆ 9.0   │
#> │ 2020-01-03 19:45:32 ┆ 2.0 ┆ 11.0  ┆ 2.0   ┆ 9.0   │
#> │ 2020-01-08 23:16:43 ┆ 1.0 ┆ 1.0   ┆ 1.0   ┆ 1.0   │
#> └─────────────────────┴─────┴───────┴───────┴───────┘