
Clone a DataFrame

Description

$clone() is a cheap operation that does not copy the underlying data. A DataFrame is an environment object, and environment objects have reference semantics, so plain assignment (df2 <- df1) does not copy the DataFrame: both names refer to the same environment. Calling $clone() creates a new environment, which can be useful when dealing with attributes (see examples).
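The reference semantics described above can be reproduced with plain base R environments. The following is a minimal sketch that uses no polars functions. The names e1, e2 and e3 are illustrative, and the list2env() shallow copy is only an analogy for a clone; it is not a claim about how $clone() is implemented.

```r
# Base R only; illustrates reference semantics, not polars internals.
e1 <- new.env()
assign("x", 1, envir = e1)

# Plain assignment: e2 is another name for the same environment.
e2 <- e1

# Shallow copy: a new environment holding the same bindings.
e3 <- list2env(as.list(e1, all.names = TRUE), parent = parent.env(e1))

# Set an attribute through the alias e2.
attr(e2, "created_on") <- "2024-01-29"

identical(e1, e2)                # TRUE: same environment
attr(e1, "created_on")           # the attribute is visible through e1 too
is.null(attr(e3, "created_on"))  # TRUE: the separate environment is unaffected
```

Because e1 and e2 are the same environment, an attribute set through one name shows up through the other. Only the separately created environment e3 stays untouched, which is the behavior the attribute example for $clone() relies on.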

Usage

<DataFrame>$clone()

Value

A polars DataFrame

Examples

library("polars")

df1 <- as_polars_df(iris)

# Assigning does not copy the DataFrame (environment object), calling
# $clone() creates a new environment.
df2 <- df1
df3 <- df1$clone()
rlang::env_label(df1)
#> [1] "0x56109c214318"
rlang::env_label(df2)
#> [1] "0x56109c214318"
rlang::env_label(df3)
#> [1] "0x56109c6b7170"
# Cloning can be useful to add attributes to data used in a function without
# adding those attributes to the original object.

# Make a function to take a DataFrame, add an attribute, and return a
# DataFrame:
give_attr <- function(data) {
  attr(data, "created_on") <- "2024-01-29"
  data
}
df2 <- give_attr(df1)

# Problem: the original DataFrame also gets the attribute, although it shouldn't
attributes(df1)
#> $class
#> [1] "polars_data_frame" "polars_object"    
#> 
#> $created_on
#> [1] "2024-01-29"
# Use $clone() inside the function to avoid that
give_attr <- function(data) {
  data <- data$clone()
  attr(data, "created_on") <- "2024-01-29"
  data
}
df1 <- as_polars_df(iris)
df2 <- give_attr(df1)

# Now the original DataFrame doesn't get this attribute
attributes(df1)
#> $class
#> [1] "polars_data_frame" "polars_object"