Regex 101 Exercise I6 – Remove font directives from HTML


Remove all <font…> or </font> directives from an HTML string.

Comments (10)

  1. Maurits says:

    I’d run three regexes, myself…

    <font> # get rid of font tags with nothing after the "t" in font

    <font(\s|\r|\n).*?> # get rid of font tags with stuff after the "t" in font

    </font> # get rid of closing font tags

    But I suppose this would do equally well for most data:

    </?font.*?>

  2. kbiel says:

    </?font[^>]*>

    With the singleline and ignorecase options set, this will do the trick.
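As a rough illustration of the single-pattern approach, here is a minimal Python sketch. The thread itself refers to .NET-style options, so `re.IGNORECASE` is used here as an assumed equivalent of "ignorecase"; the function name `strip_font_tags` and the sample string are made up for the example.

```python
import re

# Pattern from the comment above. re.IGNORECASE stands in for the
# "ignorecase" option. [^>]* already matches newlines, so attributes
# that wrap onto the next line are handled without a dot-matches-all flag.
FONT_TAG = re.compile(r"</?font[^>]*>", re.IGNORECASE)

def strip_font_tags(html: str) -> str:
    """Return html with every opening and closing font tag removed."""
    return FONT_TAG.sub("", html)

# Hypothetical sample input with mixed case and a line break inside the tag.
sample = '<p><FONT\n face="Arial" size="2">Hello</font> <b>world</b></p>'
print(strip_font_tags(sample))  # <p>Hello <b>world</b></p>
```

Note that this pattern would also remove any other tag whose name merely starts with "font", which is the case the three-regex version in the first comment guards against by requiring whitespace after the "t".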

  3. Maurits says:

    I’d like to propose an exercise…

    Given plaintext containing URLs, make the URLs clickable by adding <a> tags around them.

  4. kbiel says:

    Pattern:

    \b(\w+://[-\w._%\d/#]+)\b

    Replace:

    <a href="$1">$1</a>

    Kind of loose, but do you really want to check for well-formed URLs with this too?

  5. kbiel says:

    Oops, missed ?, =, and #

    Correct pattern:

    \b(\w+://[-\w._%\d/#?=]+)\b
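For anyone who wants to try it, the pattern and replacement above can be sketched in Python, written out with explicit backslash escapes. Python's `re.sub` uses `\1` where the replacement string above uses `$1`; the function name `linkify` and the sample sentence are assumptions for the example.

```python
import re

# The corrected pattern from the comment above: a scheme, "://", then a
# run of characters from the allowed set, anchored by word boundaries.
URL = re.compile(r"\b(\w+://[-\w._%\d/#?=]+)\b")

def linkify(text: str) -> str:
    """Wrap each URL-like match in an <a> tag."""
    # \1 is Python's spelling of the $1 backreference.
    return URL.sub(r'<a href="\1">\1</a>', text)

print(linkify("See http://example.com/a?b=1 for details"))
# See <a href="http://example.com/a?b=1">http://example.com/a?b=1</a> for details
```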

  6. Maurits says:

    Semicolons and @ signs, too.

    Watch out for URLs that end in a \W-class character… those won’t have a \b on the end.

    Bonus points for auto-detecting punctuation on the end and not including it (for example: http://example.com.)

    Note both the period and the ) at the end of that last paragraph are not part of the URL.
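One possible way to handle both points is sketched below in Python, under some assumptions: the trailing word-boundary anchor is dropped so that a URL ending in "/" keeps its slash, the character class gains ";" and "@", and a replacement function trims sentence punctuation from the end of each match. The `TRAILING` set and the function name `linkify` are hypothetical choices, not anything taken from the thread.

```python
import re

# kbiel's character class plus ';' and '@'. \d is omitted because \w
# already covers digits. There is no trailing \b, so a final "/" is kept.
URL = re.compile(r"\b\w+://[-\w._%/#?=;@]+")

# Assumed set of characters treated as sentence punctuation when they
# end a match. Only characters in the class above can appear there.
TRAILING = ".;?"

def linkify(text: str) -> str:
    """Wrap URLs in <a> tags, leaving trailing punctuation outside the link."""
    def repl(match: re.Match) -> str:
        url = match.group(0)
        trimmed = url.rstrip(TRAILING)
        tail = url[len(trimmed):]  # punctuation put back after the link
        return f'<a href="{trimmed}">{trimmed}</a>{tail}'
    return URL.sub(repl, text)

print(linkify("(for example: http://example.com.)"))
# (for example: <a href="http://example.com">http://example.com</a>.)
print(linkify("Visit http://example.com/ now"))
# Visit <a href="http://example.com/">http://example.com/</a> now
```

The trade-off is that a URL which genuinely ends in one of the trimmed characters loses it, so the set is best kept small.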

  7. Maurits says:

    Hey, blogs.msdn.com got it right. I guess we could just cheat and look up the Community Server regex in the code. 🙂