Approximate count of unique values
Description
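Estimates the number of unique values using the HyperLogLog++ algorithm for cardinality estimation. The result is approximate and can differ slightly from the exact count returned by $n_unique().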
Usage
<Expr>$approx_n_unique()
Value
A polars expression
Examples
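The call that produced the first output is not shown on this page. A minimal sketch that yields a one-column result of this shape is given here; the input values c(1, 1, 2) and the names df and out are assumptions chosen for illustration, not taken from the original example.

```r
library(polars)

# Assumed input: a single column `n` holding two distinct values
df <- pl$DataFrame(n = c(1, 1, 2))

# Approximate number of unique values in column `n`
out <- df$select(pl$col("n")$approx_n_unique())
out
```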
#> shape: (1, 1)
#> ┌─────┐
#> │ n   │
#> │ --- │
#> │ u32 │
#> ╞═════╡
#> │ 2   │
#> └─────┘
df <- pl$DataFrame(n = 0:1000)
df$select(
  exact = pl$col("n")$n_unique(),
  approx = pl$col("n")$approx_n_unique()
)
#> shape: (1, 2)
#> ┌───────┬────────┐
#> │ exact ┆ approx │
#> │ ---   ┆ ---    │
#> │ u32   ┆ u32    │
#> ╞═══════╪════════╡
#> │ 1001  ┆ 1005   │
#> └───────┴────────┘
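The gap between the exact and approximate counts can be expressed as a relative error. The following is a small base R sketch using the two values printed in the output above; the variable names are illustrative and not part of polars.

```r
# Values copied from the printed output (exact vs. approximate count)
exact_count <- 1001
approx_count <- 1005

# Relative error of the estimate with respect to the exact count
rel_err <- abs(approx_count - exact_count) / exact_count

# Express as a percentage, rounded to two decimals
round(100 * rel_err, 2)
```

For these values the estimate is off by 4 out of 1001, or roughly 0.4%.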