Bin continuous values into discrete categories based on their quantiles
Description
Usage
<Expr>$qcut(
  quantiles,
  ...,
  labels = NULL,
  left_closed = FALSE,
  allow_duplicates = FALSE,
  include_breaks = FALSE
)
Arguments
quantiles
Either a vector of quantile probabilities between 0 and 1, or a positive integer giving the number of bins with uniform probability.
...
These dots are for future extensions and must be empty.
labels
Names of the categories. The number of labels must equal the number of categories.
left_closed
If TRUE, the intervals are left-closed instead of right-closed.
allow_duplicates
If TRUE, duplicates in the resulting quantiles are dropped rather than raising an error. Duplicates can occur even with unique probabilities, depending on the data.
include_breaks
If TRUE, include a column with the right endpoint of the bin each observation falls in. This changes the output data type from Categorical to Struct.
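The effect of allow_duplicates can be sketched with data that contains many ties. This is a minimal illustration, not part of the original reference: the data frame df_ties and its values are hypothetical, chosen so that the 0.25, 0.5 and 0.75 quantiles should all fall on the same value.

```r
library(polars)

# Hypothetical data with many ties: four of the five values are identical,
# so the three quantiles requested by qcut(4) are expected to coincide.
df_ties <- pl$DataFrame(foo = c(1, 1, 1, 1, 2))

# Per the argument description, repeated breakpoints raise an error under
# the default allow_duplicates = FALSE. With TRUE they are dropped, which
# leaves fewer bins than the four requested.
out <- df_ties$with_columns(
  qcut = pl$col("foo")$qcut(4, allow_duplicates = TRUE)
)
out
```

No labels are passed here, because the number of bins that survive is data-dependent and labels must match the number of categories.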
Value
A polars expression
Examples
library("polars")
# Divide a column into three categories according to pre-defined quantile
# probabilities.
df <- pl$DataFrame(foo = -2:2)
df$with_columns(
  qcut = pl$col("foo")$qcut(c(0.25, 0.75), labels = c("a", "b", "c"))
)
#> shape: (5, 2)
#> ┌─────┬──────┐
#> │ foo ┆ qcut │
#> │ --- ┆ ---  │
#> │ i32 ┆ cat  │
#> ╞═════╪══════╡
#> │ -2  ┆ a    │
#> │ -1  ┆ a    │
#> │ 0   ┆ b    │
#> │ 1   ┆ b    │
#> │ 2   ┆ c    │
#> └─────┴──────┘
# Divide a column into two categories using uniform quantile probabilities.
df$with_columns(
  qcut = pl$col("foo")$qcut(2, labels = c("low", "high"), left_closed = TRUE)
)
#> shape: (5, 2)
#> ┌─────┬──────┐
#> │ foo ┆ qcut │
#> │ --- ┆ ---  │
#> │ i32 ┆ cat  │
#> ╞═════╪══════╡
#> │ -2  ┆ low  │
#> │ -1  ┆ low  │
#> │ 0   ┆ high │
#> │ 1   ┆ high │
#> │ 2   ┆ high │
#> └─────┴──────┘
# Add both the category and the breakpoint.
df$with_columns(
  qcut = pl$col("foo")$qcut(c(0.25, 0.75), include_breaks = TRUE)
)$unnest("qcut")
#> shape: (5, 3)
#> ┌─────┬────────────┬────────────┐
#> │ foo ┆ breakpoint ┆ category   │
#> │ --- ┆ ---        ┆ ---        │
#> │ i32 ┆ f64        ┆ cat        │
#> ╞═════╪════════════╪════════════╡
#> │ -2  ┆ -1.0       ┆ (-inf, -1] │
#> │ -1  ┆ -1.0       ┆ (-inf, -1] │
#> │ 0   ┆ 1.0        ┆ (-1, 1]    │
#> │ 1   ┆ 1.0        ┆ (-1, 1]    │
#> │ 2   ┆ inf        ┆ (1, inf]   │
#> └─────┴────────────┴────────────┘
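To see where the breakpoints in the examples come from, the quantiles can be computed directly. The sketch below is an addition for illustration only: it uses the separate `$quantile()` expression, and the choice of linear interpolation is an assumption about how qcut derives its breakpoints, not something stated on this page.

```r
library(polars)

df <- pl$DataFrame(foo = -2:2)

# Compute the 0.25 and 0.75 quantiles of "foo" directly.
# interpolation = "linear" is an assumption, chosen to mirror qcut.
breaks <- df$select(
  q25 = pl$col("foo")$quantile(0.25, interpolation = "linear"),
  q75 = pl$col("foo")$quantile(0.75, interpolation = "linear")
)
breaks
```

For this data the two values should correspond to the finite breakpoints that appear in the interval labels produced by `qcut(c(0.25, 0.75))`.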