Regular expressions (RegEx) are a powerful tool for finding text that matches specific patterns, but they are also complex and require care and attention to detail. There are many caveats to using RegEx. Ron has recently published a few excellent blog posts on the .NET RegEx engine [part 1 | part 2…
Tag: RegEx
Optimizing Regex Performance, Part 3 [Ron Petrusha]
Regular expressions in the .NET Framework support a number of grouping constructs, which allow a regular expression pattern to be grouped into one or more subexpressions. Grouping constructs are essential for creating backreferences, as well as for defining a subexpression to which a quantifier is applied.
The Performance Impact of Capturing Groups
The most commonly…
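As a rough illustration of the difference between capturing and non-capturing grouping constructs, the sketch below counts the groups that a pattern defines. The class name GroupingDemo, the helper GroupCount, and the sample pattern are assumptions made for this example and do not come from the post; Regex.GetGroupNumbers and RegexOptions.ExplicitCapture are standard .NET members.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupingDemo
{
    // Returns how many groups the pattern defines, counting group 0
    // (the entire match), which always exists.
    public static int GroupCount(string pattern, RegexOptions options)
    {
        return new Regex(pattern, options).GetGroupNumbers().Length;
    }

    public static void Main()
    {
        // (\w+\s?) is a capturing group, so the engine records what it matched.
        Console.WriteLine(GroupCount(@"\b(\w+\s?)+[.?!]", RegexOptions.None));            // 2

        // (?:...) applies the quantifier to the subexpression without capturing it.
        Console.WriteLine(GroupCount(@"\b(?:\w+\s?)+[.?!]", RegexOptions.None));          // 1

        // ExplicitCapture treats unnamed parentheses as non-capturing groups.
        Console.WriteLine(GroupCount(@"\b(\w+\s?)+[.?!]", RegexOptions.ExplicitCapture)); // 1
    }
}
```

When a group exists only so that a quantifier can apply to it, the non-capturing form avoids bookkeeping that the pattern never uses.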
Optimizing Regular Expression Performance, Part II: Taking Charge of Backtracking [Ron Petrusha]
One of the most powerful features of regular expressions in the .NET Framework — and of Nondeterministic Finite Automaton (NFA) regular expression engines generally — is their ability to execute in a non-linear manner. That is, instead of advancing one character at a time, they can return to a previous saved state in order to…
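To make the idea of returning to a saved state concrete, here is a minimal sketch that compares an ordinary quantifier with an atomic group, which forbids backtracking into the subexpression. The class name BacktrackingDemo, the helper Matches, and both patterns are assumptions chosen for illustration, not examples from the post.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BacktrackingDemo
{
    // Thin wrapper so the two patterns can be compared side by side.
    public static bool Matches(string pattern, string input)
    {
        return Regex.IsMatch(input, pattern);
    }

    public static void Main()
    {
        // a+ first consumes all of "aaa"; the engine then backtracks by one
        // character so that the trailing 'a' in the pattern can match.
        Console.WriteLine(Matches(@"^a+a$", "aaa"));     // True

        // The atomic group (?>a+) discards its saved states, so nothing is
        // given back to the trailing 'a' and the overall match fails.
        Console.WriteLine(Matches(@"^(?>a+)a$", "aaa")); // False
    }
}
```

The second result shows the trade-off: disabling backtracking can save work, but it also changes what the pattern accepts, so it is only safe where the given-back characters could never lead to a match.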
Optimizing Regular Expression Performance, Part I: Working with the Regex Class and Regex Objects [Ron Petrusha]
The .NET Framework’s regular expression implementation is a traditional Nondeterministic Finite Automaton (NFA) engine. Perhaps the most significant feature of NFA engines is that they place the responsibility for crafting efficient, high-performance regular expressions on the developer. (For more information about the .NET Framework’s implementation of an NFA engine, see Details of Regular Expression Behavior…
The RegexOptions.Compiled Flag and Slow Performance on 64-Bit .NET Framework 2.0 [Josh Free]
Developers using System.Text.RegularExpressions.Regex with the RegexOptions.Compiled flag may notice performance degradation in their 2.0 apps when running on 64-Bit .NET Framework 2.0. The performance problem occurs in the Regex(String pattern, RegexOptions options) constructor when instantiating very large, un-optimized regular expressions while specifying the RegexOptions.Compiled flag: private static Regex nonwords = new Regex(@"\b(" + @"a|aboard|about|above|absent|according\sto|across|after|against|ago|ahead\sof|ain't|all|along|alongside|" + @"also|although|am|amid|amidst|among|amongst|an|and|anti|anybody|anyone|anything|apart|apart\sfrom|are|"…
Regex Class Caching Changes between .NET Framework 1.1 and .NET Framework 2.0 [Josh Free]
The .NET Framework System.Text.RegularExpressions.Regex class maintains a cache of parsed regular expressions. The cache improves the performance of methods that create regular expressions, as the Regex class is able to avoid the cost of re-parsing and re-compiling existing regular expressions. The cache does not affect the performance of match operations on the same input string,…
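The following sketch shows two common ways of avoiding repeated parsing of the same pattern: holding on to a single Regex instance, and calling a static Regex method so that the class can consult its cache. It assumes the behavior of .NET Framework 2.0 and later, where, as far as I know, the cache serves patterns passed to the static methods. The class RegexCacheDemo, its method names, the sample pattern, and the cache size of 30 are all hypothetical choices for this example; Regex.CacheSize is the standard static property that sets the cache's capacity.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexCacheDemo
{
    // Parsed once when the class is initialized, then reused for every call.
    private static readonly Regex Digits = new Regex(@"\d+");

    // Counts the inputs containing a digit, using the shared instance.
    public static int CountWithInstance(string[] inputs)
    {
        int count = 0;
        foreach (string s in inputs)
        {
            if (Digits.IsMatch(s)) count++;
        }
        return count;
    }

    // Same count through the static method, which lets the Regex class
    // look the pattern up in its cache instead of parsing it again.
    public static int CountWithStaticMethod(string[] inputs)
    {
        int count = 0;
        foreach (string s in inputs)
        {
            if (Regex.IsMatch(s, @"\d+")) count++;
        }
        return count;
    }

    public static void Main()
    {
        // Arbitrary value for illustration; raise it only if an application
        // cycles through more distinct patterns than the cache can hold.
        Regex.CacheSize = 30;

        string[] inputs = { "abc", "a1", "42", "" };
        Console.WriteLine(CountWithInstance(inputs));     // 2
        Console.WriteLine(CountWithStaticMethod(inputs)); // 2
    }
}
```

Both approaches return the same results; they differ only in who keeps the parsed pattern alive, the calling code or the Regex class itself.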