Borbin the 🐱

Replace substring in substrings with Regular expressions (regex)

21 April, 2019


A regular expression defines a search pattern for strings. Replacing plain text matches is straightforward, but replacing substrings only inside other substrings is harder, especially when the text to replace also occurs outside those outer substrings. With cascaded match evaluators, such replacements become easy. For example, to replace quoted text within square brackets.
Please note the middle "text" in this example, which lies outside the brackets and must not be replaced.

from:
  ["text1"] "text" ["text2"]

to:
  Begin[Pre"1txet"Post]End "text" Begin[Pre"2txet"Post]End

First the text is split into the primary substrings delimited by square brackets.
Then these substrings are recomposed according to the business logic. In this example, each quoted string is reversed and enclosed in "Pre" / "Post", and each primary substring is enclosed in "Begin" / "End".

// requires: using System.Linq; using System.Text.RegularExpressions;
string inputText = "[\"text1\"] \"text\" [\"text2\"]";

string replacedText = Regex.Replace(
    inputText,
    @"(\[)(.*?)(\])",   // outer match: each [...] pair, split into '[', content, ']'
    arrayMatch =>
    {
        return
        "Begin" +
        arrayMatch.Groups[1].Value +
        // inner match: quoted text, searched only inside the brackets
        Regex.Replace(
            arrayMatch.Groups[2].Value,
            "(\".*?\")",
            contentMatch =>
            {
                return "Pre" + new string(contentMatch.Groups[1].Value.Reverse().ToArray()) + "Post";
            }) +
        arrayMatch.Groups[3].Value +
        "End";
    });
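
The cascade above can also be packaged as a small reusable helper. The following is only a sketch under assumptions: the names CascadedReplace, ReplaceInside and ReverseQuotedInBrackets are hypothetical and not part of the original code. It shows the core of the technique, an outer Regex.Replace whose evaluator runs an inner Regex.Replace on the matched text only.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, introduced only for illustration.
public static class CascadedReplace
{
    // Applies innerEvaluator to every innerPattern match,
    // but only inside text matched by outerPattern.
    public static string ReplaceInside(
        string input,
        string outerPattern,
        string innerPattern,
        MatchEvaluator innerEvaluator)
    {
        return Regex.Replace(
            input,
            outerPattern,
            outer => Regex.Replace(outer.Value, innerPattern, innerEvaluator));
    }

    // Reverses quoted text, but only within square brackets.
    public static string ReverseQuotedInBrackets(string input)
    {
        return ReplaceInside(
            input,
            @"\[.*?\]",   // outer: one [...] pair
            "\".*?\"",    // inner: one quoted string
            m =>
            {
                char[] chars = m.Value.ToCharArray();
                Array.Reverse(chars);
                return new string(chars);
            });
    }
}
```

Calling CascadedReplace.ReverseQuotedInBrackets on the input from the example reverses "text1" and "text2" but leaves the middle "text" alone, because the inner pattern never sees text outside the brackets.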


Another use case: take the target of a Markdown link and wrap the picture that follows it in an anchor pointing to the same target:

from:
  [Interactive Panorama title](link.htm)
  ![](bild.jpg)

to:
  [Interactive Panorama title](link.htm)
  <a href="link.htm">![](bild.jpg)</a>


Using PowerShell:

    [string]$content = [System.IO.File]::ReadAllText($FileName)

    # using '(?s)', the dot '.' matches the newline and allows for multiline replace 
    [string]$StringsRegex = "(?s)(?<section>\[Interactive Panorama.*?\]\((?<link>.*?)\).*?)(?<bild>\!\[\]\(.*?\))"

    # a script block as the replacement operand requires PowerShell 6.1 or later
    $updatedContent = $content -replace $StringsRegex, {
        $match = $_
        $section = $match.Groups["section"].Value
        $link = $match.Groups["link"].Value
        $bild = $match.Groups["bild"].Value

        # already replaced?
        if(-not $section.Contains("<"))
        {
            # insert link
            # '$section$bild' -> '$section<a href="$link">$bild</a>'
            "$section<a href=""$link"">$bild</a>"
        }
        else
        {
            # keep the match unchanged; returning nothing would delete it
            $match.Value
        }
    }

    # write the result only if something changed (case-sensitive comparison)
    if($updatedContent -cne $content)
    {
        [string]$outFile = $FileName + ".updated"
        [IO.File]::WriteAllText($outFile, $updatedContent)
    }
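
For comparison, the same replacement could look roughly like this in C#. This is a sketch, not a tested port of the script: the class name PanoramaLinker and the method name AddImageLinks are hypothetical, and RegexOptions.Singleline is used in place of the inline '(?s)' flag. As in the PowerShell version, a section that already contains "<" is treated as already linked and returned unchanged.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical class, introduced only for illustration.
public static class PanoramaLinker
{
    // Singleline lets '.' match newlines, so a match can span several lines.
    private static readonly Regex SectionRegex = new Regex(
        @"(?<section>\[Interactive Panorama.*?\]\((?<link>.*?)\).*?)(?<bild>!\[\]\(.*?\))",
        RegexOptions.Singleline);

    public static string AddImageLinks(string content)
    {
        return SectionRegex.Replace(content, match =>
        {
            string section = match.Groups["section"].Value;

            // Already replaced? Then keep the match as it is.
            if (section.Contains("<"))
            {
                return match.Value;
            }

            string link = match.Groups["link"].Value;
            string bild = match.Groups["bild"].Value;

            // Wrap the picture in an anchor that reuses the link target.
            return $"{section}<a href=\"{link}\">{bild}</a>";
        });
    }
}
```

Because the evaluator returns match.Value for sections that are already linked, running the method a second time on its own output leaves the text unchanged.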