Regular expressions are different in JavaScript! And Unobtrusive Validation.

Today while trying to implement some new code, and cutting my teeth on JavaScript validation, I disbelievingly realized that ALL of our client-side validation had stopped working. Like a month ago. And nobody had noticed or complained! (As a brief reminder, I now work on the NuGet Gallery.)

Perhaps that just goes to show how little real value client-side form validation is providing our nuget.org users, who are a smart bunch. Especially when server-side validation is there (it's always going to be there) and it gets the job done. Anyway, regardless of the true value of client-side validation, I'm fully mentally and emotionally committed to doing some client-side validation for our new features. So let's move ahead!

The original problem turned out to be minor and easy to debug and fix, and didn't require me to understand the validation framework much at all. The script tag for jquery.validate.unobtrusive had accidentally gone missing when my compatriot made some innocent-looking performance enhancements to the site. And without that script tag, you'll never see any validation. So problem solved. The way the tag went missing, honestly, was just one of those accidental things that you would think wouldn't be so easy to do. If you look at our code (yes, it's so frickin' cool that I can link to our code since it's open source)

https://github.com/NuGet/NuGetGallery/blob/master/Website/App_Start/AppActivator.cs

you'll see that we're doing script bundling.

    var scriptBundle = new ScriptBundle("~/bundles/js")
        .Include("~/Scripts/jquery-{version}.js")
        .Include("~/Scripts/jquery.validate.js")
        .Include("~/Scripts/jquery.validate.unobtrusive.js");

    BundleTable.Bundles.Add(scriptBundle);

And it turned out that we were bundling a script that didn't exist (we only had jquery.validate.unobtrusive.min.js). Of course we never ever see an error even though we are bundling something that doesn't exist... Oh well. If you were thinking of playing with script bundling, you have now been warned. :)

But digressions, digressions. The real point of this post was that even after turning on unobtrusive validation, it became apparent that our email address and username validation on the user registration page were still not working. I could type in 'foo#blarg#blah' as an email address and all was well with the webby world. But why?

Well, let's see. What's supposed to happen is that we set a [RegularExpressionAttribute] on our property, like so:

    [Required]
    [StringLength(64)]
    [RegularExpression(@"(?i)[a-z0-9][a-z0-9_.-]+[a-z0-9]",
        ErrorMessage = "User names must start and end with a letter or number, and may only contain letters, numbers, underscores, periods, and hyphens in between.")]
    [Hint("Choose something unique so others will know which contributions are yours.")]
    public string Username { get; set; }

And this will cause some funky HTML to get generated, like so:

    <input class="text-box single-line"
           data-val="true"
           data-val-length="The field Username must be a string with a maximum length of 64."
           data-val-length-max="64"
           data-val-regex="User names must start and end with a letter or number, and may only contain letters, numbers, underscores, periods, and hyphens in between."
           data-val-regex-insensitive="True"
           data-val-regex-pattern="[a-z0-9][a-z0-9_.-]+[a-z0-9]"
           data-val-required="The Username field is required."
           id="Username" name="Username" type="text" value="" />

Which works.
And then there's a little adapter snippet in the unobtrusive validation script file which turns the data-val- attributes into a jquery.validation rule:

    $jQval.addMethod("regex", function (value, element, params) {
        var match;
        if (this.optional(element)) {
            return true;
        }

        match = new RegExp(params).exec(value);
        return (match && (match.index === 0) && (match[0].length === value.length));
    });

Luckily, the first thing I tried was running this in the JS debugger, and it turns out that 'new RegExp(params)' is what fails. And it fails because (as promised in the title) regular expressions are different in JavaScript from .NET.

In .NET, the (?i) syntax is legal as part of the regular expression and tells the regular expression parser that you want to build a case-insensitive regular expression. JavaScript's RegExp does not accept it and throws when asked to compile such a pattern.
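
To see the failure outside the validation library, here is a minimal sketch. The helper name tryCompile is made up for illustration; the only real API used is the RegExp constructor. In the engines I know of, the .NET-style pattern is rejected with a SyntaxError, while the same pattern without the prefix compiles fine.

```javascript
// Hypothetical helper: try to compile a pattern and report whether
// the JavaScript engine accepts it.
function tryCompile(pattern, flags) {
    try {
        return { ok: true, regex: new RegExp(pattern, flags) };
    } catch (e) {
        // A bare "(?i)" group is not valid JavaScript regex syntax.
        return { ok: false, error: e.name };
    }
}

// The pattern exactly as written in the .NET attribute.
var dotNetStyle = tryCompile("(?i)[a-z0-9][a-z0-9_.-]+[a-z0-9]");

// The same pattern with the prefix removed and the "i" flag passed instead.
var jsStyle = tryCompile("[a-z0-9][a-z0-9_.-]+[a-z0-9]", "i");

console.log(dotNetStyle.ok, dotNetStyle.error);
console.log(jsStyle.ok);
```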

In JavaScript, there are two ways of doing a case-insensitive regular expression. One is to use a regular expression literal and suffix it with 'i' like this: /[a-z0-9][a-z0-9_.-]+[a-z0-9]/i

The other is to pass a second parameter "i" to the new RegExp constructor.
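
A quick sketch of both forms side by side, using the username pattern from above. The variable names are just for illustration.

```javascript
var username = "FOOBAR";

// 1. Regular expression literal with the "i" flag as a suffix.
var fromLiteral = /[a-z0-9][a-z0-9_.-]+[a-z0-9]/i;

// 2. RegExp constructor with "i" as the second argument.
var fromConstructor = new RegExp("[a-z0-9][a-z0-9_.-]+[a-z0-9]", "i");

// For comparison: the same pattern without the flag is case sensitive.
var caseSensitive = new RegExp("[a-z0-9][a-z0-9_.-]+[a-z0-9]");

console.log(fromLiteral.test(username));
console.log(fromConstructor.test(username));
console.log(caseSensitive.test(username));
```

The constructor form is the one that matters here, since the validation script builds its regex from a string taken out of an HTML attribute.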

This leads to a GRRRRRR moment, where I privately curse the creator of the unobtrusive validation library for failing to consider this issue. It also leads to me spending about an hour trying to understand how to solve the problem myself.

This turns out to be wasted effort, because as is often the case, some bright person has come up with a clever answer on Stack Overflow.

https://stackoverflow.com/questions/4218836/regularexpressionattribute-how-to-make-it-not-case-sensitive-for-client-side-v

That said, I did spend this time coming up with another alternative approach, so might as well record it, and explain how it works. This one works not by introducing a new attribute or new annotation, but by overwriting the existing adapters in the unobtrusive validation pipeline.
The logical pipeline is this:

1. Data Model & Attributes (e.g. RegularExpressionAttribute) ->
2. DataAnnotationsModelValidatorProvider & Attribute Adapters (e.g. RegularExpressionAttributeAdapter) ->
3. HTML form + <script> tags (e.g. data-val-regex) ->
4. UnobtrusiveValidation Adapters (e.g. "regex" adapter snippet posted above) ->
5. jquery.validation rules - tada!
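
To make the hand-off from step 3 to step 4 a bit more concrete, here is a heavily simplified sketch of the naming convention as I understand it: the error message lives in data-val-<rule> and each parameter lives in data-val-<rule>-<param>. The function readRule and the plain attributes object are hypothetical stand-ins of mine, not code from the unobtrusive validation library, and the error message is shortened.

```javascript
// Hypothetical helper: collect the message and parameters for one rule
// from a plain object mapping attribute names to values.
function readRule(attributes, ruleName, paramNames) {
    var prefix = "data-val-" + ruleName;
    if (!(prefix in attributes)) {
        return null; // this rule is not present on the element
    }
    var params = {};
    paramNames.forEach(function (name) {
        params[name] = attributes[prefix + "-" + name];
    });
    return { name: ruleName, message: attributes[prefix], params: params };
}

// Stand-in for the attributes on the generated <input> element.
var attributes = {
    "data-val": "true",
    "data-val-regex": "User names must start and end with a letter or number.",
    "data-val-regex-pattern": "[a-z0-9][a-z0-9_.-]+[a-z0-9]",
    "data-val-required": "The Username field is required."
};

var regexRule = readRule(attributes, "regex", ["pattern"]);
console.log(regexRule.params.pattern);
```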

We can insert a hacky fix at step 2. in the pipeline:

    // Registration; this call belongs in application startup code.
    DataAnnotationsModelValidatorProvider.RegisterAdapter(
        typeof(RegularExpressionAttribute),
        typeof(FixedRegularExpressionAttributeAdapter));

    // Requires System.Linq for ToList() and First().
    class FixedRegularExpressionAttributeAdapter : RegularExpressionAttributeAdapter
    {
        public FixedRegularExpressionAttributeAdapter(
            ModelMetadata metadata, ControllerContext context, RegularExpressionAttribute attribute)
            : base(metadata, context, attribute)
        {
        }

        public override System.Collections.Generic.IEnumerable<ModelClientValidationRule> GetClientValidationRules()
        {
            // Materialize the rules so the edits below survive re-enumeration.
            var rules = base.GetClientValidationRules().ToList();
            var rule = rules.First();
            var pattern = (string)rule.ValidationParameters["pattern"];
            if (pattern.StartsWith("(?i)", StringComparison.Ordinal))
            {
                // Strip the .NET-only prefix and carry the flag separately.
                pattern = pattern.Substring(4);
                rule.ValidationParameters["pattern"] = pattern;
                rule.ValidationParameters["insensitive"] = true;
            }
            return rules;
        }
    }

This will rewrite our HTML to remove the leading (?i) from the data-val-regex-pattern attribute, and create a new attribute data-val-regex-insensitive="True".
And since unobtrusive validation is just a script file, we could then modify the code of its regex adapter to read that new attribute and pass "i" as the second argument to RegExp.
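
Here is a rough sketch of what the modified matching logic might look like, pulled out into a standalone function so it can run without jQuery. The name matchesWholeValue and the shape of params (an object with pattern and insensitive) are my assumptions; in the real script the adapter registration would also have to pass the insensitive parameter through to the method, which isn't shown here.

```javascript
// Hypothetical replacement for the body of the "regex" validation method.
// params is assumed to look like { pattern: "...", insensitive: "True" },
// where insensitive may be missing entirely.
function matchesWholeValue(value, params) {
    // The attribute value arrives as the string "True", not a boolean.
    var insensitive = String(params.insensitive).toLowerCase() === "true";
    var match = new RegExp(params.pattern, insensitive ? "i" : "").exec(value);
    // Same whole-string check as the original method.
    return !!match && match.index === 0 && match[0].length === value.length;
}

var pattern = "[a-z0-9][a-z0-9_.-]+[a-z0-9]";

console.log(matchesWholeValue("FooBar", { pattern: pattern, insensitive: "True" }));
console.log(matchesWholeValue("FooBar", { pattern: pattern }));
console.log(matchesWholeValue("foo#bar", { pattern: pattern, insensitive: "True" }));
```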

Well... it sort of gets the job done... but I don't actually feel like this is a very good solution to apply in practice. For one, pattern.StartsWith("(?i)") is hackiness. But for another, making changes to a script file like unobtrusive validation that we get as a NuGet package is a headache, because updating the NuGet package in the future will lose the changes. Not to mention I'm unclear whether licensing permits editing the script files.

The right thing to do instead seems to be to contribute back to unobtrusive validation, or at least file a bug there, and hope it gets fixed, assuming it's open source.

It appears that it is open source, but it took me an awfully long time to confirm that it lives in the ASP.NET CodePlex project

https://aspnetwebstack.codeplex.com/SourceControl/changeset/view/ea64fc86b54d#src/System.Web.Mvc/JavaScript/jquery.unobtrusive-ajax.js

where, for now, I've just filed a bug on the issue. In the meantime it makes more sense to me to work around it by using 'A-Za-z' in the regex than to incorporate a hacky, non-future-proof fix. Still, this was an interesting way to learn a bit more about these validation frameworks.
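
For the record, here is a small sketch of that workaround on the JavaScript side. The pattern spells out both cases, so it needs neither (?i) nor the "i" flag, and the same string should be usable in the .NET attribute. The function isValidUsername is a made-up name that mirrors the whole-string check from the library's regex method.

```javascript
// Explicit upper and lower case ranges instead of a case-insensitive flag.
var usernamePattern = "[A-Za-z0-9][A-Za-z0-9_.-]+[A-Za-z0-9]";

// Hypothetical helper mirroring the library's whole-string match check.
function isValidUsername(value) {
    var match = new RegExp(usernamePattern).exec(value);
    return !!match && match.index === 0 && match[0].length === value.length;
}

console.log(isValidUsername("FooBar"));
console.log(isValidUsername("foo#blarg#blah"));
console.log(isValidUsername("-foo"));
```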