Regex 101 Exercise I5 – Remove unapproved HTML tags from a string

When accepting HTML input from a user, allow the following tags:



<a href=…>
<b>
<i>
<u>

and remove any others.

Comments (16)

  1. Maurits says:

    This one looks nice and challenging…

  2. Sheva says:

    It seems this can do the trick:



  3. Maurits says:

    But that lets through any tag that contains a, b, i, or u…

    <script> (has i)

    <p onload="…"> (has a)

  4. kbiel says:



    Regex.Replace(InputString, @"(</?(?:u|i|b|a\s+href=""[^"">]*"")>)|</?[^>]*>", "$1")

  5. Maurits says:

    kbiel, that’s close, but as I read it, it strips </a> tags.
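
A quick way to check this reading is to run kbiel's pattern from comment 4 over a small sample. The sketch below uses Python's `re` module rather than .NET's `Regex.Replace`, purely for illustration. The pattern itself is unchanged, and the names `KBIEL` and `strip_tags` are hypothetical.

```python
import re

# kbiel's pattern from comment 4, unchanged. Group 1 captures an approved tag;
# the second alternative matches any other tag without capturing it.
KBIEL = re.compile(r'(</?(?:u|i|b|a\s+href="[^">]*")>)|</?[^>]*>')

def strip_tags(html):
    # Keep group 1 when it took part in the match, otherwise drop the tag.
    return KBIEL.sub(lambda m: m.group(1) or "", html)

print(strip_tags('<a href="x">link</a> <b>bold</b> <script>bad</script>'))
# prints: <a href="x">link <b>bold</b> bad
```

The opening `<a href="x">` survives, but `</a>` can only be matched by the second alternative, because the `a` branch insists on `\s+href=…` before the closing `>`.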

  6. Ryan Heath says:

    How about making the href part optional?
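
One possible reading of this suggestion, sketched in Python's `re` module for illustration: wrap the `\s+href="…"` part of kbiel's pattern from comment 4 in an optional group. This is an assumption about what was meant, not the pattern originally posted here, and the names `OPTIONAL_HREF` and `strip_tags_optional_href` are hypothetical.

```python
import re

# kbiel's pattern with the href part made optional via (?:...)?
OPTIONAL_HREF = re.compile(
    r'(</?(?:u|i|b|a(?:\s+href="[^">]*")?)>)|</?[^>]*>'
)

def strip_tags_optional_href(html):
    # Keep the captured approved tag, drop anything else that looks like a tag.
    return OPTIONAL_HREF.sub(lambda m: m.group(1) or "", html)

print(strip_tags_optional_href('<a href="x">link</a> <b>bold</b> <script>bad</script>'))
# prints: <a href="x">link</a> <b>bold</b> bad
```

With this change `</a>` is kept, but so is a bare `<a>`. An `<a>` tag carrying any attribute besides `href` is still removed as a whole, which can leave an orphaned `</a>` behind.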


  7. Jeno Laszlo says:

    Hi, the correct pattern is:


  8. Jeno Laszlo says:

    Sorry, to keep the <b>, <a>, <i> and <u>, the pattern is </?[^abiu/]{1}[^>]?>


  9. Ryan Heath says:


    Character negation is less extensible.

    What if you want to extend it to tags such as <table>, <td>, <tr>, etc.?

    I think kbiel’s expression is the best so far…
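
To make the limitation concrete, here is the negated-character-class pattern from comment 8 applied to a short sample. It is sketched in Python's `re` module for illustration; the names `NEGATED` and `strip_short_tags` are hypothetical.

```python
import re

# Pattern from comment 8: it only removes tags whose name starts with a
# character other than a, b, i, u and is at most two characters long.
NEGATED = re.compile(r'</?[^abiu/]{1}[^>]?>')

def strip_short_tags(html):
    return NEGATED.sub("", html)

print(strip_short_tags('<b>x</b><p>y</p><script>z</script><img src="q">'))
# prints: <b>x</b>y<script>z</script><img src="q">
```

`<p>` is removed, but `<script>` is too long to match and `<img>` starts with an excluded letter, so both pass through untouched. Adding a tag such as `<table>` to the approved list would mean excluding `t` as a first letter, which also lets every other tag beginning with `t` through.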

  10. Jeno Laszlo says:

    OK, the regex to keep <b>, <a>, <i>, <u>, <table>, <td> and <tr> is the following:


    This will keep all the attributes, like href, src, width, etc.

  11. Ryan Heath says:

    Close, but tags like <img> or <applet> are not excluded…
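
One way to tell `<i>` from `<img>` and `<a>` from `<applet>` is a negative lookahead over the approved names followed by a word boundary `\b`. The sketch below is in Python's `re` module for illustration and is not the pattern posted in this thread; the names `APPROVED` and `strip_unlisted` are hypothetical.

```python
import re

# Match any tag whose name is NOT one of the approved names. The \b after the
# alternation stops "a" from matching "applet" and "i" from matching "img".
# The optional slash sits inside the lookahead so closing tags are tested too.
APPROVED = re.compile(r'<(?!/?(?:a|b|i|u|table|td|tr)\b)[^>]*>')

def strip_unlisted(html):
    return APPROVED.sub("", html)

sample = '<i>x</i><img src="y"><applet>z</applet><td>c</td><a href="h">l</a>'
print(strip_unlisted(sample))
# prints: <i>x</i>z<td>c</td><a href="h">l</a>
```

Like the approach in comment 10, this keeps every attribute on an approved tag, so it does nothing about something like `onclick`. It also assumes attribute values never contain `>`.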

  12. Maurits says:

    Eric, is part of the exercise to eliminate "approved" tags with "unapproved" properties?

    For example, should this be stripped?

    <a href="…" onclick="…">…</a>

  13. Bill Brown says:

    Here’s what I made:

    Regex anyTag = new Regex(@"<[/]{0,1}\s*(?<tag>\w*)\s*(?<attr>.*?=['""].*?[""'])*?\s*[/]{0,1}>");

    Then I use a MatchEvaluator that uses two string[] containing the acceptable tags and attributes.
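
The same idea (match every tag, then decide in a callback) can be sketched in Python, where `re.sub` accepts a function much as `Regex.Replace` accepts a `MatchEvaluator`. This is a minimal illustration under simplifying assumptions, not Bill Brown's code. The patterns are simplified, and `ALLOWED_TAGS`, `ALLOWED_ATTRS`, `filter_tag` and `strip_unapproved` are hypothetical names.

```python
import re

# Assumed whitelists: approved tag names, and approved attributes per tag.
ALLOWED_TAGS = {"a", "b", "i", "u"}
ALLOWED_ATTRS = {"a": {"href"}}

# Simplified patterns: any tag, and any quoted name="value" pair inside it.
TAG = re.compile(r"<(/?)\s*(\w*)([^>]*)>")
ATTR = re.compile(r"""(\w+)\s*=\s*("[^"]*"|'[^']*')""")

def filter_tag(m):
    closing, name, rest = m.group(1), m.group(2).lower(), m.group(3)
    if name not in ALLOWED_TAGS:
        return ""                      # unapproved tag: drop it entirely
    if closing:
        return f"</{name}>"            # closing tags carry no attributes
    allowed = ALLOWED_ATTRS.get(name, set())
    kept = [f"{k.lower()}={v}" for k, v in ATTR.findall(rest)
            if k.lower() in allowed]
    return "<" + " ".join([name] + kept) + ">"

def strip_unapproved(html):
    return TAG.sub(filter_tag, html)

print(strip_unapproved('<a href="x" onclick="evil()">hi</a><script>alert(1)</script><B>ok</B>'))
# prints: <a href="x">hi</a>alert(1)<b>ok</b>
```

Moving the decision into a callback keeps the regex short and makes the whitelists easy to extend. The sketch still assumes attribute values never contain `>`, and it does not inspect what an `href` points to.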

  14. Jeno Laszlo says:

    Hi, I modified my regex to include <img>, <applet> and other tricky tags too. The


    is working fine for me, but if you are worried about XML data islands and other funky stuff, I suggest using the


    If anybody has any tips on how to make it shorter, or knows tricks to fool it, please let me know. I tried creating a named group for the tag list and reusing it, but it is not working for me.

  15. Jeno Laszlo says:

    Sorry guys, I found a bug in my code; these are the fixed versions:




  16. kbiel says:

    Good catch Maurits. This will do it: