Return the k largest rows
Description
Non-null elements are always preferred over null elements, regardless of
the value of reverse. The output is not guaranteed to be in any particular
order; call sort() after this function if you want the output to be sorted.
Usage
<LazyFrame>$top_k(k, ..., by, reverse = FALSE)
Arguments
| Argument | Description |
|----------|-------------|
| k | Number of rows to return. |
| ... | These dots are for future extensions and must be empty. |
| by | Column(s) used to determine the top rows. Accepts expression input. Strings are parsed as column names. |
| reverse | Consider the k smallest elements of the by column(s) (instead of the k largest). This can be specified per column by passing a logical vector with one value per column. |
Value
A polars LazyFrame
Examples
library("polars")
lf <- pl$LazyFrame(
a = c("a", "b", "a", "b", "b", "c"),
b = c(2, 1, 1, 3, 2, 1)
)
# Get the rows which contain the 4 largest values in column b.
lf$top_k(4, by = "b")$collect()
#> shape: (4, 2)
#> ┌─────┬─────┐
#> │ a   ┆ b   │
#> │ --- ┆ --- │
#> │ str ┆ f64 │
#> ╞═════╪═════╡
#> │ b   ┆ 3.0 │
#> │ a   ┆ 2.0 │
#> │ b   ┆ 2.0 │
#> │ b   ┆ 1.0 │
#> └─────┴─────┘
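Because top_k() does not guarantee any row order, a sort() can be chained when ordered output matters. The following is a minimal sketch that rebuilds the same lf as above; it assumes that sort() accepts a descending argument, as in current polars releases for R. The name top_sorted is only illustrative.

```r
library("polars")

lf <- pl$LazyFrame(
  a = c("a", "b", "a", "b", "b", "c"),
  b = c(2, 1, 1, 3, 2, 1)
)

# Keep the 4 rows with the largest b, then order them explicitly.
# Assumption: sort() takes `descending`; the relative order of rows
# with equal b remains unspecified.
top_sorted <- lf$top_k(4, by = "b")$sort("b", descending = TRUE)$collect()
top_sorted
```

Sorting after top_k() only orders the k rows that were kept, which is usually cheaper than sorting the whole frame first.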
# Get the rows which contain the 4 largest values when sorting on column a
# and b
lf$top_k(4, by = c("a", "b"))$collect()
#> shape: (4, 2)
#> ┌─────┬─────┐
#> │ a   ┆ b   │
#> │ --- ┆ --- │
#> │ str ┆ f64 │
#> ╞═════╪═════╡
#> │ c   ┆ 1.0 │
#> │ b   ┆ 3.0 │
#> │ b   ┆ 2.0 │
#> │ b   ┆ 1.0 │
#> └─────┴─────┘
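The reverse argument is not exercised by the examples above, so here is a hedged sketch of both forms described under Arguments. It assumes that a per-column reverse is passed as a logical vector with one value per by column; the names smallest and mixed are only illustrative.

```r
library("polars")

lf <- pl$LazyFrame(
  a = c("a", "b", "a", "b", "b", "c"),
  b = c(2, 1, 1, 3, 2, 1)
)

# reverse = TRUE: keep the 2 rows with the smallest values in column b.
smallest <- lf$top_k(2, by = "b", reverse = TRUE)$collect()
smallest

# Per-column reverse (assumed form: one logical per `by` column):
# prefer the largest a and, among rows with equal a, the smallest b.
mixed <- lf$top_k(3, by = c("a", "b"), reverse = c(FALSE, TRUE))$collect()
mixed
```

As with the other examples, the row order of both results is unspecified; chain sort() if a stable order is needed.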