Evaluate the query in streaming mode and write to an Arrow IPC file
Description
This allows streaming results that are larger than RAM to be written to disk.
Usage
<LazyFrame>$sink_ipc(
path,
...,
compression = c("zstd", "lz4", "uncompressed"),
compat_level = c("newest", "oldest"),
maintain_order = TRUE,
type_coercion = TRUE,
`_type_check` = TRUE,
predicate_pushdown = TRUE,
projection_pushdown = TRUE,
simplify_expression = TRUE,
slice_pushdown = TRUE,
collapse_joins = TRUE,
no_optimization = FALSE,
storage_options = NULL,
retries = 2,
sync_on_close = c("none", "data", "all"),
mkdir = FALSE
)
Arguments
path
|
A character. File path to which the file should be written. |
…
|
These dots are for future extensions and must be empty. |
compression
|
Determines the compression algorithm. Must be one of "zstd" (good compression performance), "lz4" (fast compression/decompression), or "uncompressed".
|
compat_level
|
Determines the compatibility level when exporting Polars’ internal data
structures. With a newer compatibility level, Polars may export internal
data structures that other Arrow implementations cannot interpret. The
level can be specified as a name (e.g., “newest”) or as a scalar integer
(currently, 0 and 1 are supported).
|
maintain_order
|
Maintain the order in which data is processed. Setting this to
FALSE will be slightly faster.
|
type_coercion
|
A logical, indicates type coercion optimization. |
_type_check
|
For internal use only. |
predicate_pushdown
|
A logical, indicates predicate pushdown optimization. |
projection_pushdown
|
A logical, indicates projection pushdown optimization. |
simplify_expression
|
A logical, indicates simplify expression optimization. |
slice_pushdown
|
A logical, indicates slice pushdown optimization. |
collapse_joins
|
Collapse a join and filters into a faster join. |
no_optimization
|
A logical. If TRUE, turn off (certain) optimizations.
|
storage_options
|
Named vector containing options that indicate how to connect to a cloud
provider. The cloud providers currently supported are AWS, GCP, and
Azure; each provider has its own set of supported keys. If
storage_options is not provided, Polars will try to
infer the information from environment variables.
|
retries
|
Number of retries if accessing a cloud instance fails. |
sync_on_close
|
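Sync to disk before closing a file. Must be one of "none" (does not sync), "data" (syncs the file contents), or "all" (syncs the file contents and metadata).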
|
mkdir
|
Recursively create all the directories in the path. |
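As a rough illustration of how several of the arguments above combine, the sketch below writes to a file inside a directory that does not exist yet. The directory name `sink_ipc_demo/nested` and the variables `out_dir` and `out_file` are made up for this example; the argument values are taken from the choices listed in the usage.

```r
library(polars)

# Hypothetical output location: the nested directory does not exist yet.
out_dir <- file.path(tempdir(), "sink_ipc_demo", "nested")
out_file <- file.path(out_dir, "mtcars.arrow")

as_polars_lf(mtcars)$sink_ipc(
  out_file,
  compression = "lz4",     # favour speed over file size
  maintain_order = TRUE,   # keep rows in processing order
  sync_on_close = "data",  # sync file contents before closing
  mkdir = TRUE             # create the missing parent directories
)

# Read the file back lazily to check that it was written.
written <- pl$scan_ipc(out_file)$collect()
file.exists(out_file)
```

With `mkdir = FALSE` (the default), the same call would be expected to fail because the parent directory is missing.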
Value
Invisibly returns the input LazyFrame
Examples
library("polars")
tmpf <- tempfile(fileext = ".arrow")
as_polars_lf(mtcars)$sink_ipc(tmpf)
pl$scan_ipc(tmpf)$collect()
#> shape: (32, 11)
#> ┌──────┬─────┬───────┬───────┬───┬─────┬─────┬──────┬──────┐
#> │ mpg ┆ cyl ┆ disp ┆ hp ┆ … ┆ vs ┆ am ┆ gear ┆ carb │
#> │ --- ┆ --- ┆ --- ┆ --- ┆ ┆ --- ┆ --- ┆ --- ┆ --- │
#> │ f64 ┆ f64 ┆ f64 ┆ f64 ┆ ┆ f64 ┆ f64 ┆ f64 ┆ f64 │
#> ╞══════╪═════╪═══════╪═══════╪═══╪═════╪═════╪══════╪══════╡
#> │ 21.0 ┆ 6.0 ┆ 160.0 ┆ 110.0 ┆ … ┆ 0.0 ┆ 1.0 ┆ 4.0 ┆ 4.0 │
#> │ 21.0 ┆ 6.0 ┆ 160.0 ┆ 110.0 ┆ … ┆ 0.0 ┆ 1.0 ┆ 4.0 ┆ 4.0 │
#> │ 22.8 ┆ 4.0 ┆ 108.0 ┆ 93.0 ┆ … ┆ 1.0 ┆ 1.0 ┆ 4.0 ┆ 1.0 │
#> │ 21.4 ┆ 6.0 ┆ 258.0 ┆ 110.0 ┆ … ┆ 1.0 ┆ 0.0 ┆ 3.0 ┆ 1.0 │
#> │ 18.7 ┆ 8.0 ┆ 360.0 ┆ 175.0 ┆ … ┆ 0.0 ┆ 0.0 ┆ 3.0 ┆ 2.0 │
#> │ … ┆ … ┆ … ┆ … ┆ … ┆ … ┆ … ┆ … ┆ … │
#> │ 30.4 ┆ 4.0 ┆ 95.1 ┆ 113.0 ┆ … ┆ 1.0 ┆ 1.0 ┆ 5.0 ┆ 2.0 │
#> │ 15.8 ┆ 8.0 ┆ 351.0 ┆ 264.0 ┆ … ┆ 0.0 ┆ 1.0 ┆ 5.0 ┆ 4.0 │
#> │ 19.7 ┆ 6.0 ┆ 145.0 ┆ 175.0 ┆ … ┆ 0.0 ┆ 1.0 ┆ 5.0 ┆ 6.0 │
#> │ 15.0 ┆ 8.0 ┆ 301.0 ┆ 335.0 ┆ … ┆ 0.0 ┆ 1.0 ┆ 5.0 ┆ 8.0 │
#> │ 21.4 ┆ 4.0 ┆ 121.0 ┆ 109.0 ┆ … ┆ 1.0 ┆ 1.0 ┆ 4.0 ┆ 2.0 │
#> └──────┴─────┴───────┴───────┴───┴─────┴─────┴──────┴──────┘
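The example above sinks an unmodified LazyFrame. A slightly longer sketch, assuming the same `mtcars` data, shows the more typical pattern of building a lazy query first so that only the filtered and selected result is written to disk. The variable names `tmpf2` and `res` are illustrative only.

```r
library(polars)

tmpf2 <- tempfile(fileext = ".arrow")

# Build the query lazily; nothing is computed until sink_ipc() runs.
as_polars_lf(mtcars)$
  filter(pl$col("cyl") == 6)$
  select("mpg", "cyl", "hp")$
  sink_ipc(tmpf2)

# Scan the result back and inspect its size and columns.
res <- pl$scan_ipc(tmpf2)$collect()
res$height
res$columns
```

Because the filter and the column selection are part of the lazy plan, the full input never needs to be materialized as a DataFrame before the file is written.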