Apply a rolling min based on another column
Description
Given a by column <t_0, t_1, …, t_n>, closed = “right” (the default) means the windows will be:
- (t_0 - window_size, t_0]
- (t_1 - window_size, t_1]
- …
- (t_n - window_size, t_n]
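The window definition above can be illustrated without polars. The following is a minimal base R sketch that lists which rows fall into each right-closed window. The variable names (by, window_size, members) are chosen for illustration only, and the code is not how polars computes its windows.

```r
# Five hourly timestamps, similar to the data used in the Examples section
by <- as.POSIXct("2001-01-01", tz = "UTC") + 3600 * (0:4)
window_size <- 2 * 3600  # a "2h" window, expressed in seconds

# For each t_i, collect the row positions whose timestamp lies in
# the half-open interval (t_i - window_size, t_i]
members <- lapply(seq_along(by), function(i) {
  which(by > by[i] - window_size & by <= by[i])
})

members
```

With a 2-hour window on hourly data, each window holds the current row and the row one hour earlier. The row exactly two hours earlier sits on the open left boundary and is excluded.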
Usage
<Expr>$rolling_min_by(
  by,
  window_size,
  ...,
  min_periods = 1,
  closed = c("right", "both", "left", "none")
)
Arguments
by
Should be Datetime, Date, UInt64, UInt32, Int64, or Int32 data type after conversion by as_polars_expr(). Note that the integer types require using “i” in window_size. Accepts expression input. Strings are parsed as column names.
window_size
The length of the window. Can be a dynamic temporal size indicated by a timedelta or the following string language:
- 1ns (1 nanosecond)
- 1us (1 microsecond)
- 1ms (1 millisecond)
- 1s (1 second)
- 1m (1 minute)
- 1h (1 hour)
- 1d (1 calendar day)
- 1w (1 calendar week)
- 1mo (1 calendar month)
- 1q (1 calendar quarter)
- 1y (1 calendar year)
- 1i (1 index count)
Or combine them: “3d12h4m25s” # 3 days, 12 hours, 4 minutes, and 25 seconds
By "calendar day", we mean the corresponding time on the next day (which may not be 24 hours, due to daylight saving time). Similarly for "calendar week", "calendar month", "calendar quarter", and "calendar year".
...
These dots are for future extensions and must be empty.
min_periods
The number of values in the window that should be non-null before computing a result. Default is 1.
closed
Define which sides of the interval are closed (inclusive). One of “right”, “both”, “left”, or “none”. Default is “right”.
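How closed and min_periods interact can be sketched in base R. The helper rolling_min_by_sketch() below is hypothetical and written only for illustration. It assumes a numeric by vector and a numeric window_size in the same unit, treats NA as the missing value, and makes no claim about how polars implements or optimizes the computation.

```r
# Hypothetical helper for illustration only; not the polars implementation.
rolling_min_by_sketch <- function(x, by, window_size,
                                  min_periods = 1,
                                  closed = c("right", "both", "left", "none")) {
  closed <- match.arg(closed)
  vapply(seq_along(by), function(i) {
    lower <- by[i] - window_size
    upper <- by[i]
    # The lower bound is inclusive for "left" and "both"
    above <- if (closed %in% c("left", "both")) by >= lower else by > lower
    # The upper bound is inclusive for "right" and "both"
    below <- if (closed %in% c("right", "both")) by <= upper else by < upper
    vals <- x[above & below]
    vals <- vals[!is.na(vals)]
    # Fewer non-missing values than required (at least one): no result
    if (length(vals) < max(min_periods, 1)) NA_real_ else min(vals)
  }, numeric(1))
}

by <- as.numeric(0:5)        # e.g. hours since the first observation
x <- c(5, 3, NA, 1, 2, 6)

res_right <- rolling_min_by_sketch(x, by, window_size = 2)
res_both  <- rolling_min_by_sketch(x, by, window_size = 2, closed = "both")
res_min2  <- rolling_min_by_sketch(x, by, window_size = 2, min_periods = 2)

res_right  # 5 3 3 1 1 2
res_both   # 5 3 3 1 1 1
res_min2   # NA 3 NA NA 1 2
```

In this sketch, closed = “both” also admits the row exactly window_size behind the current one, and min_periods = 2 blanks out every window that holds fewer than two non-missing values.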
Details
If you want to compute multiple aggregation statistics over the same dynamic window, consider using $rolling(); this method can cache the window size computation.
Value
A polars expression
Examples
library("polars")
df_temporal <- pl$select(
  index = 0:24,
  date = pl$datetime_range(
    as.POSIXct("2001-01-01"),
    as.POSIXct("2001-01-02"),
    "1h"
  )
)
# Compute the rolling min with the temporal windows closed on the right
# (default)
df_temporal$with_columns(
  rolling_row_min = pl$col("index")$rolling_min_by(
    "date",
    window_size = "2h"
  )
)
#> shape: (25, 3)
#> ┌───────┬─────────────────────┬─────────────────┐
#> │ index ┆ date                ┆ rolling_row_min │
#> │ ---   ┆ ---                 ┆ ---             │
#> │ i32   ┆ datetime[ms]        ┆ i32             │
#> ╞═══════╪═════════════════════╪═════════════════╡
#> │ 0     ┆ 2001-01-01 00:00:00 ┆ 0               │
#> │ 1     ┆ 2001-01-01 01:00:00 ┆ 0               │
#> │ 2     ┆ 2001-01-01 02:00:00 ┆ 1               │
#> │ 3     ┆ 2001-01-01 03:00:00 ┆ 2               │
#> │ 4     ┆ 2001-01-01 04:00:00 ┆ 3               │
#> │ …     ┆ …                   ┆ …               │
#> │ 20    ┆ 2001-01-01 20:00:00 ┆ 19              │
#> │ 21    ┆ 2001-01-01 21:00:00 ┆ 20              │
#> │ 22    ┆ 2001-01-01 22:00:00 ┆ 21              │
#> │ 23    ┆ 2001-01-01 23:00:00 ┆ 22              │
#> │ 24    ┆ 2001-01-02 00:00:00 ┆ 23              │
#> └───────┴─────────────────────┴─────────────────┘
# Compute the rolling min with the closure of windows on both sides
df_temporal$with_columns(
  rolling_row_min = pl$col("index")$rolling_min_by(
    "date",
    window_size = "2h",
    closed = "both"
  )
)
#> shape: (25, 3)
#> ┌───────┬─────────────────────┬─────────────────┐
#> │ index ┆ date                ┆ rolling_row_min │
#> │ ---   ┆ ---                 ┆ ---             │
#> │ i32   ┆ datetime[ms]        ┆ i32             │
#> ╞═══════╪═════════════════════╪═════════════════╡
#> │ 0     ┆ 2001-01-01 00:00:00 ┆ 0               │
#> │ 1     ┆ 2001-01-01 01:00:00 ┆ 0               │
#> │ 2     ┆ 2001-01-01 02:00:00 ┆ 0               │
#> │ 3     ┆ 2001-01-01 03:00:00 ┆ 1               │
#> │ 4     ┆ 2001-01-01 04:00:00 ┆ 2               │
#> │ …     ┆ …                   ┆ …               │
#> │ 20    ┆ 2001-01-01 20:00:00 ┆ 18              │
#> │ 21    ┆ 2001-01-01 21:00:00 ┆ 19              │
#> │ 22    ┆ 2001-01-01 22:00:00 ┆ 20              │
#> │ 23    ┆ 2001-01-01 23:00:00 ┆ 21              │
#> │ 24    ┆ 2001-01-02 00:00:00 ┆ 22              │
#> └───────┴─────────────────────┴─────────────────┘