Replace all matching regex/literal substrings with a new string value

Description

Replace all matching regex/literal substrings with a new string value

Usage

<Expr>$str$replace_all(pattern, value, ..., literal = FALSE)

Arguments

pattern A character, or something that can be coerced to a string Expr, containing a valid regex pattern compatible with the regex crate.
value A character or a string Expr that will replace the matched substring.
... These dots are for future extensions and must be empty.
literal Logical. If TRUE, treat pattern as a literal string, not as a regular expression.

Details

To modify regular expression behaviour (such as case-sensitivity) with flags, use the inline (?iLmsuxU) syntax. See the regex crate’s section on grouping and flags for additional information about the use of inline expression modifiers.
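
As a rough illustration of an inline flag other than case-insensitivity, the sketch below uses the multi-line flag (?m) so that ^ matches at the start of every line rather than only at the start of the string. The data frame and its column names are made up for this example.

```r
library("polars")

# Hypothetical two-line string, invented for illustration.
df <- pl$DataFrame(text = "- a\n- b")

out <- df$with_columns(
  # Without (?m), `^` only matches at the very start of the string,
  # so only the first bullet is rewritten.
  single = pl$col("text")$str$replace_all("^- ", "* "),
  # With (?m), `^` also matches right after each newline,
  # so every bullet is rewritten.
  multi = pl$col("text")$str$replace_all("(?m)^- ", "* ")
)
out
```

The flag is written at the front of the pattern, in the same position as the (?i) flag.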

Value

A polars expression

Capture groups

The dollar sign ($) is a special character related to capture groups. To refer to a literal dollar sign, use $$ instead or set literal to TRUE.
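
The two ways of producing a literal dollar sign can be sketched as follows. This is a minimal example under assumed inputs: the price column and its values are invented for illustration.

```r
library("polars")

# Hypothetical price strings, made up for this example.
df <- pl$DataFrame(price = c("5 USD", "12 USD"))

out <- df$with_columns(
  # `$$` emits a literal dollar sign; `${1}` inserts the captured number.
  escaped = pl$col("price")$str$replace_all("(\\d+) USD", "$$${1}"),
  # With literal = TRUE, the pattern is matched verbatim and `$` in the
  # value is not treated as a capture group reference.
  literal = pl$col("price")$str$replace_all(" USD", " $", literal = TRUE)
)
out
```

Escaping with $$ keeps capture groups available in the same value string, whereas literal = TRUE turns off regex matching for the whole call.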

See Also

  • <Expr>$str$replace()

Examples

library("polars")

df <- pl$DataFrame(id = 1L:2L, text = c("abcabc", "123a123"))
df$with_columns(pl$col("text")$str$replace_all("a", "-"))
#> shape: (2, 2)
#> ┌─────┬─────────┐
#> │ id  ┆ text    │
#> │ --- ┆ ---     │
#> │ i32 ┆ str     │
#> ╞═════╪═════════╡
#> │ 1   ┆ -bc-bc  │
#> │ 2   ┆ 123-123 │
#> └─────┴─────────┘
# Capture groups are supported.
# Use `${1}` in the value string to refer to the first capture group in the pattern,
# `${2}` to refer to the second capture group, and so on.
# You can also use named capture groups.
df <- pl$DataFrame(word = c("hat", "hut"))
df$with_columns(
  positional = pl$col("word")$str$replace_all("h(.)t", "b${1}d"),
  named = pl$col("word")$str$replace_all("h(?<vowel>.)t", "b${vowel}d")
)
#> shape: (2, 3)
#> ┌──────┬────────────┬───────┐
#> │ word ┆ positional ┆ named │
#> │ ---  ┆ ---        ┆ ---   │
#> │ str  ┆ str        ┆ str   │
#> ╞══════╪════════════╪═══════╡
#> │ hat  ┆ bad        ┆ bad   │
#> │ hut  ┆ bud        ┆ bud   │
#> └──────┴────────────┴───────┘
# Apply case-insensitive string replacement using the `(?i)` flag.
df <- pl$DataFrame(
  city = rep("Philadelphia", 4),
  season = c("Spring", "Summer", "Autumn", "Winter"),
  weather = c("Rainy", "Sunny", "Cloudy", "Snowy")
)
df$with_columns(
  pl$col("weather")$str$replace_all(
    "(?i)foggy|rainy|cloudy|snowy", "Sunny"
  )
)
#> shape: (4, 3)
#> ┌──────────────┬────────┬─────────┐
#> │ city         ┆ season ┆ weather │
#> │ ---          ┆ ---    ┆ ---     │
#> │ str          ┆ str    ┆ str     │
#> ╞══════════════╪════════╪═════════╡
#> │ Philadelphia ┆ Spring ┆ Sunny   │
#> │ Philadelphia ┆ Summer ┆ Sunny   │
#> │ Philadelphia ┆ Autumn ┆ Sunny   │
#> │ Philadelphia ┆ Winter ┆ Sunny   │
#> └──────────────┴────────┴─────────┘