Drop duplicate rows
Description
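Drop duplicate rows from a DataFrame, optionally considering only a subset of columns when deciding which rows are duplicates.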
Usage
<DataFrame>$unique(
  subset = NULL,
  ...,
  keep = c("any", "none", "first", "last"),
  maintain_order = FALSE
)
Arguments
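subset
  Column name(s) or selector(s) to consider when identifying duplicate
  rows. If NULL (default), use all columns.
...
  These dots are for future extensions and must be empty.
keep
  Which of the duplicate rows to keep. Must be one of:
  "any" (default): no guarantee about which of the duplicate rows is kept; this leaves more room for optimization.
  "none": drop every row that has a duplicate.
  "first": keep the first occurrence of each duplicated row.
  "last": keep the last occurrence of each duplicated row.
maintain_order
  Keep the same order as the original data. This is more expensive to
  compute. Setting this to TRUE prevents the operation from running
  on the streaming engine.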
Value
A polars DataFrame
Examples
library("polars")
df <- pl$DataFrame(
  foo = c(1, 2, 3, 1),
  bar = c("a", "a", "a", "a"),
  ham = c("b", "b", "b", "b"),
)
df$unique(maintain_order = TRUE)
#> shape: (3, 3)
#> ┌─────┬─────┬─────┐
#> │ foo ┆ bar ┆ ham │
#> │ --- ┆ --- ┆ --- │
#> │ f64 ┆ str ┆ str │
#> ╞═════╪═════╪═════╡
#> │ 1.0 ┆ a   ┆ b   │
#> │ 2.0 ┆ a   ┆ b   │
#> │ 3.0 ┆ a   ┆ b   │
#> └─────┴─────┴─────┘
df$unique(subset = c("bar", "ham"), maintain_order = TRUE)
#> shape: (1, 3)
#> ┌─────┬─────┬─────┐
#> │ foo ┆ bar ┆ ham │
#> │ --- ┆ --- ┆ --- │
#> │ f64 ┆ str ┆ str │
#> ╞═════╪═════╪═════╡
#> │ 1.0 ┆ a   ┆ b   │
#> └─────┴─────┴─────┘
df$unique(keep = "last", maintain_order = TRUE)
#> shape: (3, 3)
#> ┌─────┬─────┬─────┐
#> │ foo ┆ bar ┆ ham │
#> │ --- ┆ --- ┆ --- │
#> │ f64 ┆ str ┆ str │
#> ╞═════╪═════╪═════╡
#> │ 2.0 ┆ a   ┆ b   │
#> │ 3.0 ┆ a   ┆ b   │
#> │ 1.0 ┆ a   ┆ b   │
#> └─────┴─────┴─────┘
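The difference between keep = "none" and keep = "first" is easiest to see when duplicates are identified on a single column. The following is a minimal sketch, assuming the same data as above and using only the arguments shown in Usage; the variable names no_dups and first_only are chosen here for illustration.

```r
library("polars")

# Same data as in the examples above: foo = 1 appears twice.
df <- pl$DataFrame(
  foo = c(1, 2, 3, 1),
  bar = c("a", "a", "a", "a"),
  ham = c("b", "b", "b", "b")
)

# keep = "none": every row whose `foo` value occurs more than once is
# dropped, so only the rows with foo = 2 and foo = 3 remain.
no_dups <- df$unique(subset = "foo", keep = "none", maintain_order = TRUE)

# keep = "first": the first row of each group of duplicates is kept,
# so foo = 1, 2, 3 remain in their original order.
first_only <- df$unique(subset = "foo", keep = "first", maintain_order = TRUE)
```

maintain_order = TRUE is set in both calls so that the row order of the result is deterministic; without it the order of the returned rows is not guaranteed.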