Select and modify columns of a LazyFrame

Description

Select and perform operations on a subset of columns only. Columns that are not mentioned are dropped (like .() in data.table, and unlike dplyr::mutate()).

New variables cannot be used in subsequent expressions within the same $select() call. For instance, if you create a variable x, you can only refer to it in a later $select() or $with_columns() call.

Usage

<LazyFrame>$select(...)

Arguments

... <dynamic-dots> Name-value pairs of objects to be converted to polars expressions by the as_polars_expr() function. Character strings are parsed as column names; other non-expression inputs are parsed as literals. Each name is used as the name of the corresponding expression.

Value

A polars LazyFrame

Examples

library("polars")

# Pass the name of a column to select that column.
lf <- pl$LazyFrame(
  foo = 1:3,
  bar = 6:8,
  ham = letters[1:3]
)
lf$select("foo")$collect()
#> shape: (3, 1)
#> ┌─────┐
#> │ foo │
#> │ --- │
#> │ i32 │
#> ╞═════╡
#> │ 1   │
#> │ 2   │
#> │ 3   │
#> └─────┘
# Multiple columns can be selected by passing several column names.
lf$select("foo", "bar")$collect()
#> shape: (3, 2)
#> ┌─────┬─────┐
#> │ foo ┆ bar │
#> │ --- ┆ --- │
#> │ i32 ┆ i32 │
#> ╞═════╪═════╡
#> │ 1   ┆ 6   │
#> │ 2   ┆ 7   │
#> │ 3   ┆ 8   │
#> └─────┴─────┘
# Expressions are also accepted.
lf$select(pl$col("foo"), pl$col("bar") + 1)$collect()
#> shape: (3, 2)
#> ┌─────┬─────┐
#> │ foo ┆ bar │
#> │ --- ┆ --- │
#> │ i32 ┆ f64 │
#> ╞═════╪═════╡
#> │ 1   ┆ 7.0 │
#> │ 2   ┆ 8.0 │
#> │ 3   ┆ 9.0 │
#> └─────┴─────┘
# Name an expression (the name is used as the column name in the output)
lf$select(
  threshold = pl$when(pl$col("foo") > 2)$then(10)$otherwise(0)
)$collect()
#> shape: (3, 1)
#> ┌───────────┐
#> │ threshold │
#> │ ---       │
#> │ f64       │
#> ╞═══════════╡
#> │ 0.0       │
#> │ 0.0       │
#> │ 10.0      │
#> └───────────┘
# Expressions with multiple outputs can be automatically instantiated
# as Structs by setting the `POLARS_AUTO_STRUCTIFY` environment variable.
# (Experimental)
if (requireNamespace("withr", quietly = TRUE)) {
  withr::with_envvar(c(POLARS_AUTO_STRUCTIFY = "1"), {
    lf$select(
      is_odd = ((pl$col(pl$Int32) %% 2) == 1)$name$suffix("_is_odd"),
    )$collect()
  })
}
#> shape: (3, 1)
#> ┌──────────────┐
#> │ is_odd       │
#> │ ---          │
#> │ struct[2]    │
#> ╞══════════════╡
#> │ {true,false} │
#> │ {false,true} │
#> │ {true,false} │
#> └──────────────┘
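# A column created in a $select() call cannot be referenced within that same
# call, so chain a second call instead. This is a minimal sketch that reuses
# `lf` from above; `foo_x2` and `foo_x4` are illustrative names only.
out <- lf$select(foo_x2 = pl$col("foo") * 2)$with_columns(
  foo_x4 = pl$col("foo_x2") * 2
)$collect()
out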