Replace first matching regex/literal substring with a new string value

Description

Replace first matching regex/literal substring with a new string value

Usage

<Expr>$str$replace(pattern, value, ..., literal = FALSE, n = 1L)

Arguments

pattern A character value, or something that can be coerced to a string Expr, giving a valid regex pattern compatible with the Rust regex crate.
value A character value or a string Expr that will replace the matched substring.
... These dots are for future extensions and must be empty.
literal Logical. If TRUE, treat pattern as a literal string, not as a regular expression.
n The number of matches to replace. Regex replacement with n > 1 is not yet supported, so an error is raised if n > 1, pattern contains regex syntax, and literal = FALSE.
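As a rough illustration of how literal and n interact, the sketch below replaces a period in three ways. The data frame and the column names (regex, literal, literal_n2) are made up for this example, and the behaviour shown follows from the argument descriptions above rather than from a verified run.

```r
library("polars")

# Hypothetical example data; not part of the original documentation.
df <- pl$DataFrame(text = c("a.b.c", "1.2.3"))

out <- df$with_columns(
  # As a regex, "." matches any character, so the first character is replaced.
  regex = pl$col("text")$str$replace(".", "-"),
  # With literal = TRUE, "." matches only an actual period.
  literal = pl$col("text")$str$replace(".", "-", literal = TRUE),
  # n > 1 is accepted here because the pattern is treated as a literal.
  literal_n2 = pl$col("text")$str$replace(".", "-", literal = TRUE, n = 2L)
)
out
```

Calling the same expression with n = 2L and literal = FALSE on a pattern that contains regex syntax is the case described above as raising an error.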

Details

To modify regular expression behaviour (such as case-sensitivity) with flags, use the inline (?iLmsuxU) syntax. See the regex crate’s section on grouping and flags for additional information about the use of inline expression modifiers.

Value

A polars expression

Capture groups

The dollar sign ($) is a special character related to capture groups. To refer to a literal dollar sign, use $$ instead or set literal to TRUE.
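The two ways of producing a literal dollar sign can be sketched as follows. The price column and the output column names are assumptions made for illustration only.

```r
library("polars")

# Hypothetical example data; not part of the original documentation.
df <- pl$DataFrame(price = c("USD 10", "USD 25"))

out <- df$with_columns(
  # "$$" in the replacement value stands for a single literal "$".
  escaped = pl$col("price")$str$replace("USD ", "$$"),
  # With literal = TRUE, "$" in the value has no special meaning.
  literal = pl$col("price")$str$replace("USD ", "$", literal = TRUE)
)
out
```

Both new columns should contain the same strings, with the "USD " prefix replaced by a single dollar sign.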

See Also

  • <Expr>$str$replace_all()

Examples

library("polars")

df <- pl$DataFrame(id = 1L:2L, text = c("123abc", "abc456"))
df$with_columns(pl$col("text")$str$replace(r"(abc\b)", "ABC"))
#> shape: (2, 2)
#> ┌─────┬────────┐
#> │ id  ┆ text   │
#> │ --- ┆ ---    │
#> │ i32 ┆ str    │
#> ╞═════╪════════╡
#> │ 1   ┆ 123ABC │
#> │ 2   ┆ abc456 │
#> └─────┴────────┘
# Capture groups are supported.
# Use `${1}` in the value string to refer to the first capture group in the pattern,
# `${2}` to refer to the second capture group, and so on.
# You can also use named capture groups.
df <- pl$DataFrame(word = c("hat", "hut"))
df$with_columns(
  positional = pl$col("word")$str$replace("h(.)t", "b${1}d"),
  named = pl$col("word")$str$replace("h(?<vowel>.)t", "b${vowel}d")
)
#> shape: (2, 3)
#> ┌──────┬────────────┬───────┐
#> │ word ┆ positional ┆ named │
#> │ ---  ┆ ---        ┆ ---   │
#> │ str  ┆ str        ┆ str   │
#> ╞══════╪════════════╪═══════╡
#> │ hat  ┆ bad        ┆ bad   │
#> │ hut  ┆ bud        ┆ bud   │
#> └──────┴────────────┴───────┘
# Apply case-insensitive string replacement using the `(?i)` flag.
df <- pl$DataFrame(
  city = rep("Philadelphia", 4),
  season = c("Spring", "Summer", "Autumn", "Winter"),
  weather = c("Rainy", "Sunny", "Cloudy", "Snowy")
)
df$with_columns(
  pl$col("weather")$str$replace("(?i)foggy|rainy|cloudy|snowy", "Sunny")
)
#> shape: (4, 3)
#> ┌──────────────┬────────┬─────────┐
#> │ city         ┆ season ┆ weather │
#> │ ---          ┆ ---    ┆ ---     │
#> │ str          ┆ str    ┆ str     │
#> ╞══════════════╪════════╪═════════╡
#> │ Philadelphia ┆ Spring ┆ Sunny   │
#> │ Philadelphia ┆ Summer ┆ Sunny   │
#> │ Philadelphia ┆ Autumn ┆ Sunny   │
#> │ Philadelphia ┆ Winter ┆ Sunny   │
#> └──────────────┴────────┴─────────┘