What’s wrong with this code, part 8 – Email Address Validation

It’s time for another “What’s wrong with this code”.

Today’s example is really simple, and hopefully easy.  It’s a snippet of code I picked up from the net that’s intended to validate an email address (useful for helping to avoid SQL injection attacks, for example).


    /// <summary>
    /// Validate an email address provided by the caller.
    /// Taken from http://www.codeproject.com/aspnet/Valid_Email_Addresses.asp
    /// </summary>
    public static bool ValidateEmailAddress(string emailAddress)
    {
        string strRegex = @"^([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}" +
                          @"\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\" +
                          @".)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$";
        System.Text.RegularExpressions.Regex re = new System.Text.RegularExpressions.Regex(strRegex);
        if (re.IsMatch(emailAddress))
            return (true);
        else
            return (false);
    }

As always, my next post (Monday) will include the answers and kudos to all those who got it right (and who f

Comments (22)

  1. Denny says:

    Fast read: will it work with foo@bar.bin.domain.org?

    It looks like it only handles the foo@domain.org format.

    But that’s just from reading the regex quickly. Did I miss it or get it?

  2. Uwe Keim says:

    I’ve read in the O’Reilly regular expression book (the one with the owl) that it is impossible to do e-mail address validation, and that if you want to do it anyway, they print a regular expression that fills a full page.

  3. This routine considers foo@bar.bin.domain.org correct (I just checked with the test program I wrote).

  4. Adam Young says:

    You’re using a Hungarian-style coding standard (strRegex); the if…else is redundant – just return the result of IsMatch; perhaps the regex should be a constant (debatable).

  5. KCS says:

    The first thing I noticed is that the regex doesn’t handle apostrophes (and maybe some other characters) in the part that comes before the @ (these are valid per RFC 2822).

    The following bugs me (I see it often):

        if (re.IsMatch(emailAddress))
            return true;
        else
            return false;

    Why not just:

    return re.IsMatch(emailAddress);

  6. Bingo! Uwe nailed it.

    Adam, the code in question is literally cut and pasted from where I found it. And using Hungarian isn’t inherently a programming error, it’s a style thing :)

  7. Tad says:

    It doesn’t check to see if emailAddress is null. I imagine that’d be a problem.

  8. The above example is attempting to do some IP address validation, but failing miserably for things like:


    It would also have problems with its poor assumptions about the TLD length:


  9. philoserf says:

    Sometimes close enough is good enough.

    I can’t believe philoserf the idealist just said that. I must be getting worn down.

  10. Aaron,

    In this case, I think the up-to-3 character part of the regex isn’t an attempt to catch IP addresses, but instead to catch foo@bar.co.uk.

    But you’re right – the .museum and .info domains will have huge issues here.

  11. .info is actually OK with this RE, as is .name. .museum wouldn’t be, however. Also, apparently, noc@to, for example, is a valid address for people at the management of certain ccTLDs. Also, + is a valid and fairly common character for the before-the-at-sign portion.

    Oh, and that split regex is amazingly ugly; it’s much better to split on something closer to atom boundaries (for some reasonable level of "atom"), even though not all chunks are the same length.

  12. G. Man says:

    Hmmm, I thought this was ‘what’s wrong with the *code*’.

    And it is possible to do 100% accurate email address validation: you send an email to the address and require the customer to click a link. Just about all message boards do this nowadays.

  13. Larry,


    The {1,3} portions following the [0-9] are searching for IP address patterns. The {2,4} after the [a-zA-Z] pattern is looking for 2, 3, or 4 character TLDs. There are three copies of this [0-9] junk, and then the trailing pattern allows for the final quad.

    It’s the ([a-zA-Z0-9\-]+\.)+ pattern that allows for any number (one or more) of domain and subdomain labels.

    Great resource for testing .NET and client-side RegEx at http://www.regxlib.com. They also list 30 user-submitted variations on email validation patterns.

  14. Tad had my comment. A RegEx is good for part of an e-mail validation, but isn’t sufficient on its own.

  15. It was a pain in the ass to do, but I actually wrote an e-mail address format verifier in JavaScript once. I essentially did a line-by-line translation of the BNF in RFC 822 to JavaScript strings containing equivalent regular expressions. I then matched whatever I wanted to check against this regular expression. What is a pain is that I have to use the backslash character to escape a character in the regular expression, and because the regular expression is in a string, I have to escape every backslash. There is definitely a lot more to a correct e-mail address than what that example script uses. You can have extended characters, for example.

  16. Adi Oltean says:

    I tried to write my own validator a couple of weeks ago but it was too complicated. I didn’t want to cover the whole RFC 822/2822 standard, just a subset of it that can be easily covered by a regular expression.

    Fortunately I use .NET 2.0 which has a validator built-in. See the System.Net.MailAddress class (http://msdn2.microsoft.com/library/wx7kz7sd.aspx)
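As a rough illustration of the built-in route mentioned above, here is a minimal sketch that leans on the framework's parser instead of a hand-written regex. It assumes the class as it exists in released versions of .NET, `System.Net.Mail.MailAddress`; the wrapper class and method names are hypothetical and not from the original post.

```csharp
using System;
using System.Net.Mail;

public static class MailAddressSketch
{
    // Hypothetical helper name, used only for this illustration.
    public static bool CanParseAddress(string candidate)
    {
        // The constructor rejects null and empty input with argument
        // exceptions, so handle those cases up front.
        if (string.IsNullOrEmpty(candidate))
            return false;

        try
        {
            // The constructor throws FormatException when the string is
            // not in a form it recognizes as an e-mail address.
            new MailAddress(candidate);
            return true;
        }
        catch (FormatException)
        {
            return false;
        }
    }
}
```

Note that the constructor also accepts display-name forms such as `Some Name <someone@example.com>`, so this checks "can the framework parse it", not "is this a bare address". It says nothing about whether the mailbox exists.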

  17. Voytek says:


    1. The regular expression is interpreted on each call; it can be extracted to a static Regex field of the class.

    2. Additional options suggested are: RegexOptions.Compiled, RegexOptions.Singleline, RegexOptions.IgnorePatternWhitespace.

    3. Using the IgnorePatternWhitespace option, the pattern can be reformatted into a more readable form.

    4. It matches "aa@[11.11.11.dd" :-)

    5. The argument ‘emailAddress’ is not checked for null.
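Several of the points above can be sketched together: a Regex built once in a static field, `RegexOptions.Compiled`, a pattern laid out with `RegexOptions.IgnorePatternWhitespace`, and a null check before matching. The class name, method name, and the deliberately loose pattern below are placeholders for illustration; they are not the pattern from the post and are not a complete address grammar.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EmailFormatSketch
{
    // Built once when the class is first used, not on every call.
    // Placeholder pattern: "something, an at sign, something", nothing more.
    private static readonly Regex AddressShape = new Regex(
        @"^ [^@\s]+    # local part: one or more characters, no at sign or whitespace
            @          # the separator
            [^@\s]+ $  # domain part: same restriction",
        RegexOptions.Compiled | RegexOptions.IgnorePatternWhitespace);

    // Hypothetical method name, used only for this illustration.
    public static bool LooksLikeAddress(string emailAddress)
    {
        // Regex.IsMatch throws ArgumentNullException on null input,
        // so reject null explicitly.
        if (emailAddress == null)
            return false;

        return AddressShape.IsMatch(emailAddress);
    }
}
```

With IgnorePatternWhitespace, unescaped whitespace in the pattern is ignored and `#` starts a comment that runs to the end of the line, which is what allows the pattern to be split at meaningful boundaries.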

  18. Jerry Pisk says:

    Not only will it fail on long TLDs, but it will also fail on a fully specified domain name, with a dot at the end (i.e. user@microsoft.com.).

  19. Paul says:

    Someone’s already mentioned ‘ and +, but the regex also disallows various other valid username characters.

    Not only that, but it’s hell to read…

  20. Marton says:

    Well, first, the regexp has 2 useless groups of parentheses. Second, the ‘[‘ && ‘]’ pattern isn’t done/checked properly. Here’s a fixed version.

    #region static isEmailAddress()
    /// <summary>
    /// Returns true if the email address is valid.
    /// This will not work on every email address, since doing so
    /// is a very complicated matter.
    /// If using .NET v2, try ‘System.Net.MailAddress’.
    /// </summary>
    public static bool isEmailAddress(string sEmailAddress){
        if(sEmailAddress==null || sEmailAddress.Length==0)
            return false;
        return System.Text.RegularExpressions.Regex.IsMatch(
            @"^([a-zA-Z0-9_\-\.]+)@" +
            @"(" +
                @"\[?([0-9]{1,3}\.){3}" + // oddly [
            @"|" +
                @"([a-zA-Z0-9\-]+\.)+" +
            @")" +
            @"([a-zA-Z]{2,4}|[0-9]{1,3}\]?)" + // oddly: ]

  21. No, you don’t do that.