Bin continuous values into discrete categories
Description
Usage
<Expr>$cut(
  breaks,
  ...,
  labels = NULL,
  left_closed = FALSE,
  include_breaks = FALSE
)
Arguments
| breaks | List of unique cut points. |
| ... | These dots are for future extensions and must be empty. |
| labels | Names of the categories. The number of labels must be equal to the number of cut points plus one. |
| left_closed | Set the intervals to be left-closed instead of right-closed. |
| include_breaks | Include a column with the right endpoint of the bin each observation falls in. This will change the data type of the output from a Categorical to a Struct. |
Value
A polars expression
Examples
library("polars")
# Divide a column into three categories.
df <- pl$DataFrame(foo = -2:2)
df$with_columns(
  cut = pl$col("foo")$cut(c(-1, 1), labels = c("a", "b", "c"))
)
#> shape: (5, 2)
#> ┌─────┬─────┐
#> │ foo ┆ cut │
#> │ --- ┆ --- │
#> │ i32 ┆ cat │
#> ╞═════╪═════╡
#> │ -2  ┆ a   │
#> │ -1  ┆ a   │
#> │ 0   ┆ b   │
#> │ 1   ┆ b   │
#> │ 2   ┆ c   │
#> └─────┴─────┘
# Add both the category and the breakpoint.
df$with_columns(
  cut = pl$col("foo")$cut(c(-1, 1), include_breaks = TRUE)
)$unnest("cut")
#> shape: (5, 3)
#> ┌─────┬────────────┬────────────┐
#> │ foo ┆ breakpoint ┆ category   │
#> │ --- ┆ ---        ┆ ---        │
#> │ i32 ┆ f64        ┆ cat        │
#> ╞═════╪════════════╪════════════╡
#> │ -2  ┆ -1.0       ┆ (-inf, -1] │
#> │ -1  ┆ -1.0       ┆ (-inf, -1] │
#> │ 0   ┆ 1.0        ┆ (-1, 1]    │
#> │ 1   ┆ 1.0        ┆ (-1, 1]    │
#> │ 2   ┆ inf        ┆ (1, inf]   │
#> └─────┴────────────┴────────────┘
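
The examples above do not exercise left_closed. As a minimal sketch that reuses the same data and break points (the output column name cut_left is only illustrative), the following bins the values with left-closed intervals, so a value equal to a break point should fall into the bin that starts at that break rather than the one that ends there:

```r
library("polars")

df <- pl$DataFrame(foo = -2:2)

# With left_closed = TRUE each interval includes its left endpoint,
# so -1 is binned together with 0, and 1 together with 2.
out <- df$with_columns(
  cut_left = pl$col("foo")$cut(c(-1, 1), left_closed = TRUE)
)
out
```

Compare this with the first example, where the default right-closed intervals place -1 in the same bin as -2 and 1 in the same bin as 0.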