
Serialize the DataFrame to a binary format

Description

Serialize the DataFrame to a binary format. Currently, this is the uncompressed Arrow IPC stream format, so other Apache Arrow implementations may be able to read it.
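
As a rough illustration of what "Arrow IPC stream format" means at the byte level, the sketch below checks the framing of the serialized bytes using base R only. It assumes the current format described above: in an Arrow IPC stream, each message begins with the continuation marker `ff ff ff ff` followed by a little-endian 32-bit metadata length, and the stream ends with the same marker followed by a zero length. The variable names are illustrative, and the check may stop holding if the serialization format changes.

```r
library("polars")

serialized <- pl$DataFrame(foo = 1:3)$serialize()

# Continuation marker that opens every encapsulated Arrow IPC message.
marker <- as.raw(c(0xff, 0xff, 0xff, 0xff))
starts_with_marker <- identical(serialized[1:4], marker)

# The next four bytes hold the metadata length as a little-endian int32.
metadata_len <- readBin(serialized[5:8], what = "integer", size = 4L, endian = "little")

# An end-of-stream marker is the continuation marker followed by a zero length.
ends_with_eos <- identical(tail(serialized, 8), c(marker, as.raw(rep(0, 4))))

starts_with_marker
metadata_len
ends_with_eos
```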

Usage

<DataFrame>$serialize()

pl$deserialize_df(data)

Arguments

data    A raw vector containing a serialized DataFrame.

Value

  • <DataFrame>$serialize() returns a raw vector containing the serialized DataFrame.
  • pl$deserialize_df() returns the deserialized DataFrame.

Examples

library("polars")

df <- pl$DataFrame(
  foo = 1:3,
  bar = 6:8,
)$cast(bar = pl$UInt8)

# Serialize the DataFrame to a binary format
serialized <- df$serialize()
serialized
#>   [1] ff ff ff ff 00 01 00 00 04 00 00 00 f2 ff ff ff 14 00 00 00 04 00 01 00 00
#>  [26] 00 0a 00 0b 00 08 00 0a 00 04 00 f2 ff ff ff 4c 00 00 00 10 00 00 00 00 00
#>  [51] 0a 00 0c 00 00 00 04 00 08 00 01 00 00 00 04 00 00 00 f4 ff ff ff 1c 00 00
#>  [76] 00 0c 00 00 00 08 00 0c 00 04 00 08 00 05 00 00 00 5b 30 2c 30 5d 00 00 00
#> [101] 09 00 00 00 5f 50 4c 5f 46 4c 41 47 53 00 00 00 02 00 00 00 48 00 00 00 04
#> [126] 00 00 00 ec ff ff ff 34 00 00 00 20 00 00 00 18 00 00 00 01 02 00 00 10 00
#> [151] 12 00 04 00 10 00 11 00 08 00 00 00 0c 00 00 00 00 00 f6 ff ff ff 08 00 00
#> [176] 00 00 00 06 00 08 00 04 00 03 00 00 00 62 61 72 00 ec ff ff ff 38 00 00 00
#> [201] 20 00 00 00 18 00 00 00 01 02 00 00 10 00 12 00 04 00 10 00 11 00 08 00 00
#> [226] 00 0c 00 00 00 00 00 f4 ff ff ff 20 00 00 00 01 00 00 00 08 00 09 00 04 00
#> [251] 08 00 03 00 00 00 66 6f 6f 00 00 00 00 00 ff ff ff ff b8 00 00 00 04 00 00
#> [276] 00 ec ff ff ff 80 00 00 00 00 00 00 00 14 00 00 00 04 00 03 00 0c 00 13 00
#> [301] 10 00 12 00 0c 00 04 00 e6 ff ff ff 03 00 00 00 00 00 00 00 60 00 00 00 14
#> [326] 00 00 00 00 00 00 00 00 00 0a 00 14 00 04 00 0c 00 10 00 04 00 00 00 00 00
#> [351] 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0c 00 00
#> [376] 00 00 00 00 00 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 40 00 00 00
#> [401] 00 00 00 00 03 00 00 00 00 00 00 00 00 00 00 00 02 00 00 00 03 00 00 00 00
#> [426] 00 00 00 00 00 00 00 00 00 00 00 03 00 00 00 00 00 00 00 00 00 00 00 00 00
#> [451] 00 00 00 00 00 00 01 00 00 00 02 00 00 00 03 00 00 00 00 00 00 00 00 00 00
#> [476] 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
#> [501] 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 06 07 08 00 00
#> [526] 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
#> [551] 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
#> [576] 00 00 00 00 00 00 00 00 00 ff ff ff ff 00 00 00 00
# The bytes can later be deserialized back to a DataFrame
pl$deserialize_df(serialized)
#> shape: (3, 2)
#> ┌─────┬─────┐
#> │ foo ┆ bar │
#> │ --- ┆ --- │
#> │ i32 ┆ u8  │
#> ╞═════╪═════╡
#> │ 1   ┆ 6   │
#> │ 2   ┆ 7   │
#> │ 3   ┆ 8   │
#> └─────┴─────┘
# Other Apache Arrow implementations, such as the arrow R package, may be able to read the serialized bytes.
if (requireNamespace("arrow", quietly = TRUE)) {
  arrow::read_ipc_stream(serialized, as_data_frame = FALSE)
}
#> Table
#> 3 rows x 2 columns
#> $foo <int32>
#> $bar <uint8>
#> 
#> See $metadata for additional Schema metadata
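
Because `$serialize()` returns an ordinary raw vector, the bytes can also be stored on disk and restored later. The following is a minimal sketch using base R's `writeBin()` and `readBin()`. The temporary file, its `.arrows` extension, and the variable names are assumptions made for illustration, and the comparison goes through `as.data.frame()` rather than relying on any further polars methods.

```r
library("polars")

df <- pl$DataFrame(foo = 1:3, bar = c("a", "b", "c"))

# Write the serialized bytes to a temporary file (illustrative path).
path <- tempfile(fileext = ".arrows")
writeBin(df$serialize(), path)

# Read the whole file back as a raw vector, then deserialize it.
bytes <- readBin(path, what = "raw", n = file.size(path))
restored <- pl$deserialize_df(bytes)

# Compare the original and restored data as base R data frames.
round_trip_ok <- identical(as.data.frame(df), as.data.frame(restored))
round_trip_ok

# Clean up the temporary file.
unlink(path)
```

Since the format is described as the current one rather than a fixed one, files written this way are probably best treated as short-lived caches and not as long-term storage.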