Replace all matching regex/literal substrings with a new string value
Description
Replace every substring that matches a regex or literal pattern with a new string value.
Usage
<Expr>$str$replace_all(pattern, value, ..., literal = FALSE)
Arguments
pattern
A character or something that can be coerced to a string Expr, giving a valid regex pattern compatible with the regex crate.
value
A character or a string Expr that replaces each matched substring.
...
These dots are for future extensions and must be empty.
literal
Logical. If TRUE, treat pattern as a literal string, not as a regular expression.
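As a rough illustration of the literal argument, the sketch below replaces the same pattern twice, once as a regex and once as a literal string. The data frame and the column names (version, as_regex, as_literal) are made up for this example.

```r
library("polars")

# Hypothetical data: version strings that use "." as a separator.
df <- pl$DataFrame(version = c("1.2.3", "10.0.1"))

result <- df$with_columns(
  # As a regex, "." matches any character, so every character is replaced.
  as_regex = pl$col("version")$str$replace_all(".", "-"),
  # With literal = TRUE, only actual dots are replaced.
  as_literal = pl$col("version")$str$replace_all(".", "-", literal = TRUE)
)
result
```

Note that literal comes after the dots in the signature, so it has to be passed by name.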
Details
To modify regular expression behaviour (such as case-sensitivity) with flags, use the inline (?iLmsuxU) syntax. See the regex crate’s section on grouping and flags for additional information about the use of inline expression modifiers.
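Inline flags other than (?i) work the same way. The following is a minimal sketch using the multi-line flag (?m), which lets ^ match at the start of every line rather than only at the start of the string. The data and the column names (notes, bulleted) are invented for illustration.

```r
library("polars")

# Hypothetical data: multi-line strings whose lines start with "- ".
df <- pl$DataFrame(notes = c("- one\n- two", "- three"))

result <- df$with_columns(
  # (?m) enables multi-line mode, so "^" anchors at each line start
  # and every leading "- " becomes "* ".
  bulleted = pl$col("notes")$str$replace_all("(?m)^- ", "* ")
)
result
```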
Value
A polars expression
Capture groups
The dollar sign ($) is a special character related to capture groups. To refer to a literal dollar sign, use $$ instead or set literal to TRUE.
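Both ways of producing a literal dollar sign can be sketched as follows. The data and the column names (price, escaped, literal) are assumptions made for this example, not part of the package documentation.

```r
library("polars")

# Hypothetical data: amounts prefixed with a currency code.
df <- pl$DataFrame(price = c("USD 5", "USD 12"))

result <- df$with_columns(
  # Regex mode: "$$" yields one literal dollar sign and "${1}" inserts
  # the first capture group (the digits).
  escaped = pl$col("price")$str$replace_all("USD (\\d+)", "$$${1}"),
  # Literal mode: neither the pattern nor the value is interpreted,
  # so a single "$" is inserted as-is.
  literal = pl$col("price")$str$replace_all("USD ", "$", literal = TRUE)
)
result
```

The two columns should hold the same strings. The regex form is the one to reach for when the replacement also needs capture groups, since literal = TRUE disables them.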
See Also
- <Expr>$str$replace()
Examples
library("polars")
df <- pl$DataFrame(id = 1L:2L, text = c("abcabc", "123a123"))
df$with_columns(pl$col("text")$str$replace_all("a", "-"))
#> shape: (2, 2)
#> ┌─────┬─────────┐
#> │ id  ┆ text    │
#> │ --- ┆ ---     │
#> │ i32 ┆ str     │
#> ╞═════╪═════════╡
#> │ 1   ┆ -bc-bc  │
#> │ 2   ┆ 123-123 │
#> └─────┴─────────┘
# Capture groups are supported.
# Use `${1}` in the value string to refer to the first capture group in the pattern,
# `${2}` to refer to the second capture group, and so on.
# You can also use named capture groups.
df <- pl$DataFrame(word = c("hat", "hut"))
df$with_columns(
  positional = pl$col("word")$str$replace_all("h(.)t", "b${1}d"),
  named = pl$col("word")$str$replace_all("h(?<vowel>.)t", "b${vowel}d")
)
#> shape: (2, 3)
#> ┌──────┬────────────┬───────┐
#> │ word ┆ positional ┆ named │
#> │ ---  ┆ ---        ┆ ---   │
#> │ str  ┆ str        ┆ str   │
#> ╞══════╪════════════╪═══════╡
#> │ hat  ┆ bad        ┆ bad   │
#> │ hut  ┆ bud        ┆ bud   │
#> └──────┴────────────┴───────┘
# Apply case-insensitive string replacement using the `(?i)` flag.
df <- pl$DataFrame(
  city = rep("Philadelphia", 4),
  season = c("Spring", "Summer", "Autumn", "Winter"),
  weather = c("Rainy", "Sunny", "Cloudy", "Snowy")
)
df$with_columns(
  pl$col("weather")$str$replace_all(
    "(?i)foggy|rainy|cloudy|snowy", "Sunny"
  )
)
#> shape: (4, 3)
#> ┌──────────────┬────────┬─────────┐
#> │ city         ┆ season ┆ weather │
#> │ ---          ┆ ---    ┆ ---     │
#> │ str          ┆ str    ┆ str     │
#> ╞══════════════╪════════╪═════════╡
#> │ Philadelphia ┆ Spring ┆ Sunny   │
#> │ Philadelphia ┆ Summer ┆ Sunny   │
#> │ Philadelphia ┆ Autumn ┆ Sunny   │
#> │ Philadelphia ┆ Winter ┆ Sunny   │
#> └──────────────┴────────┴─────────┘