Get a distinct integer ID for each run of identical values
Description
The ID starts at 0 and increases by one each time the value of the column changes.
Usage
<Expr>$rle_id()
Details
This is especially useful for defining a new group each time a column’s value changes, rather than one group per distinct value of that column.
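The difference between grouping by run and grouping by distinct value can be sketched as follows. This is a minimal illustration rather than a canonical recipe: the column `x`, the helper column name `run`, and the variable names `by_run` and `by_value` are made up for this example, and it assumes that `$group_by()`, `$agg()`, `$sort()`, and the expression methods `$first()`, `$len()`, and `$sum()` are available in the installed version of polars.

```r
library("polars")

# Hypothetical data: `a` has three runs (1 | 2 | 1, 1, 1) but only two distinct values.
df <- pl$DataFrame(
  a = c(1, 2, 1, 1, 1),
  x = c(10, 20, 30, 40, 50)
)

# Group by run: one group per stretch of consecutive identical values of `a`.
by_run <- df$with_columns(
  run = pl$col("a")$rle_id()
)$group_by("run")$agg(
  a = pl$col("a")$first(),
  n = pl$col("a")$len(),
  x_sum = pl$col("x")$sum()
)$sort("run")

# Group by value: one group per distinct value of `a`, regardless of position.
by_value <- df$group_by("a")$agg(
  n = pl$col("x")$len(),
  x_sum = pl$col("x")$sum()
)$sort("a")
```

Under these assumptions, `by_run` should contain three groups (the first and last both have `a` equal to 1 but stay separate), whereas `by_value` should contain two, with all four rows where `a` is 1 merged into a single group.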
Value
A polars expression
Examples
library("polars")
df <- pl$DataFrame(
  a = c(1, 2, 1, 1, 1),
  b = c("x", "x", NA, "y", "y")
)
df$with_columns(
  # Run IDs based on column "a" alone
  rle_id_a = pl$col("a")$rle_id(),
  # Run IDs based on the combination of "a" and "b"
  rle_id_ab = pl$struct("a", "b")$rle_id()
)
#> shape: (5, 4)
#> ┌─────┬──────┬──────────┬───────────┐
#> │ a   ┆ b    ┆ rle_id_a ┆ rle_id_ab │
#> │ --- ┆ ---  ┆ ---      ┆ ---       │
#> │ f64 ┆ str  ┆ u32      ┆ u32       │
#> ╞═════╪══════╪══════════╪═══════════╡
#> │ 1.0 ┆ x    ┆ 0        ┆ 0         │
#> │ 2.0 ┆ x    ┆ 1        ┆ 1         │
#> │ 1.0 ┆ null ┆ 2        ┆ 2         │
#> │ 1.0 ┆ y    ┆ 2        ┆ 3         │
#> │ 1.0 ┆ y    ┆ 2        ┆ 3         │
#> └─────┴──────┴──────────┴───────────┘