Beginner’s tips on VS Find and Replace using Regular Expressions – Tutorial#1


Regular expressions are a great way to find what you’re looking for.  Unfortunately, they can also be quite daunting for a beginner who’s unsure where to start.  Hence I started this post to share tips and tricks for using regular expressions with Find & Replace in VS.


To start off with, we actually use our own regular expressions engine, so some of the syntax differs from what you may know from other tools.  If you’ve never touched regular expressions or wildcards, you may wonder what that “right arrow” button next to the “Find what:” combo box is for.  When the “Use:” option is selected, this button becomes enabled and provides a quick list of the supported syntax commands, depending on whether “Regular Expressions” or “Wildcards” is selected.  At the bottom of that list there is also a link to a help topic with the complete list of supported syntax.


Let’s start off with something simple.  In this mini-tutorial, I will use the following line on which to perform a simple Find, always from the beginning of the line:


Ros3s ar3 r3d.  Vi0l3ts ar3 blu3.


First, some simple Regular Expression syntax:


. – Matches any character except a line break (this includes other whitespace)


* – Matches zero or more occurrences of the preceding expression


+ – Matches one or more occurrences of the preceding expression


[] – Matches any one of the characters in [].  To specify a range, use a hyphen, as in [a-z]


Now, let’s get started!


A search with “R.+” looks for a string that begins with R and is followed by at least one character.  Therefore, the first find will highlight the entire line as the result.  This is because we allowed for at least one occurrence of any character (including whitespace) following an R.
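
If you’d like to try this outside the editor, here is a rough sketch in Python.  Keep in mind that Python’s re module is a different engine from the one in VS; I’m only assuming that ., + and [] behave the same way for this simple case, and the variable names are just for illustration.

```python
import re

# The sample line from this tutorial.
line = "Ros3s ar3 r3d.  Vi0l3ts ar3 blu3."

# "R.+" : an R followed by one or more of any character except a line break.
# "+" is greedy, so the match runs all the way to the end of the line.
first = re.search(r"R.+", line)
print(first.group())
```

Because the line starts with R, the first match covers the whole line, just like the highlight in the editor.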


We can exclude whitespace by specifying a range of characters.  For example, the search “R[a-z3]+” looks for a string that begins with R and is followed by at least one character that is either in the range a-z or is a 3.  The first match will be “Ros3s”, but the next match will be the “r3” in the second word, “ar3”.  (The lowercase r matches because Find ignores case unless “Match case” is checked.)
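
The same search can be sketched in Python for experimenting.  This is only an approximation of what the editor does: I’m assuming re.IGNORECASE stands in for leaving “Match case” unchecked, and Python’s engine is not the one VS uses.

```python
import re

line = "Ros3s ar3 r3d.  Vi0l3ts ar3 blu3."

# An R followed by one or more characters that are a-z or 3.
# IGNORECASE lets the pattern match a lowercase r as well.
matches = re.findall(r"R[a-z3]+", line, flags=re.IGNORECASE)
print(matches)  # ['Ros3s', 'r3', 'r3d', 'r3']
```

Notice that “r3” shows up twice, once for each “ar3”, because nothing in the pattern says the R has to start a word.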


What if we only wanted words that began with R?  We can use the following syntax:


< – Beginning of a word


> – End of a word


So a search for “<R[a-z3]+” looks for a word that begins with R and is followed by at least one character that is in the range a-z or is a 3.  It will return matches for “Ros3s” and “r3d”.  If you just wanted to find Ros3s, you can narrow the search with “<Ro[a-z3]+”, which looks for a word that begins with Ro and is followed by at least one character from the same set.
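
For comparison, here is a rough Python sketch of the word-anchored searches.  Python has no < or > syntax, so I’m assuming \b (a word boundary) is a close enough stand-in for < when it sits in front of a letter; re.IGNORECASE again stands in for leaving “Match case” unchecked.

```python
import re

line = "Ros3s ar3 r3d.  Vi0l3ts ar3 blu3."

# \b in front of R plays the role of "<" (beginning of a word).
word_matches = re.findall(r"\bR[a-z3]+", line, flags=re.IGNORECASE)
print(word_matches)  # ['Ros3s', 'r3d']

# Adding the "o" narrows the search to words that begin with "Ro".
narrowed = re.findall(r"\bRo[a-z3]+", line, flags=re.IGNORECASE)
print(narrowed)  # ['Ros3s']
```

The “r3” inside “ar3” no longer matches, because the r there is not at the beginning of a word.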


Hopefully, this post will help some newbies out there to start experimenting with regular expressions.  I will continue to post other tips later on, and also include some regular expressions that I have found really useful.


-Fiona


Comments (15)

  1. Raj Kaimal says:

    I was working with a word list in VS.NET and could not find a way to search for all words that were at least 4 characters long: \w{4,}

    Is this possible with VS.NET or VS.NET 2005? thanks

  2. Joku says:

    Perhaps the regex functionality could be replaced with the one the .NET Framework has, but with added support for escape chars and a few others in the "replace evaluator". I was a bit disappointed to find that Replace in the regex namespace only seems to support $0 $1 $2 etc. to get the () groups. Wouldn’t it be nice to also have escapes like \xFF etc.? Right now you are forced to pull up some sscli/reflector and copy-paste the escape char code into the replacement evaluating code.

  3. Joku says:

    Or was the name MatchEvaluator.. Whatever 🙂

  4. Jerry Pisk says:

    Why are you putting a comma between a range and the rest of the characters in the [] construct? That’s not how it’s done; to search a to z and 3 you search for [a-z3], not [a-z,3] (that will search for a to z, 3 and a comma). Unless of course your regular expression syntax is one you came up with instead of using one of the standard ones (or a de facto standard, such as Perl 5).

  5. Fiona says:

    Hey Jerry,

    Woops, had a typo. Thanks!

    -Fiona

  6. Fiona says:

    Hi Joku,

    Thanks for your suggestion. We are looking into that functionality for future versions.

    Regards,

    Fiona

  7. Fiona says:

    Hi Raj,

    We have a syntax, ^n, that matches exactly n occurrences of the preceding expression. So, to match any words that are exactly 4 characters long, you can use <[a-z]^4>.

    Now, if you want to match any word at least 4 characters long, you can use <[a-z]^4[a-z]*

    Hope this helps!

    -Fiona

  8. Hardbreaker says:

    Hi,

    I am trying to find any occurrences of "string" but NOT "include <string>" for example.

    How to achieve this with ONE line of regular expression ?

    Any tips ?

    Thanks

    HB

  9. Fiona says:

    Hi HB,

    That’s a great question, and rather tricky, since you cannot just use our "Match whole word" option since the <> do not count as alphanumeric characters.

    Here are the suggestions I have.

    If you are just interested in looking for each occurrence of string, and just want to know where they are, you can search for "(^string)|([ \t]string)" (without quotation marks). Note that this match will also highlight the space/tab at the beginning of the string.

    If you want to specifically replace only string (with no spaces attached), you can use tagged expressions (more info on tagged expressions in a future tutorial). Assume that you want to take all occurrences of string (but not <string>) and replace them with foobar. You can do the following:

    Find what: (^string)|({[ \t]}string)

    Replace with: \1foobar

    This makes the potential space/tab in front of string a tagged expression that we will keep in the replace.

    Hope this helps!

    Regards,

    Fiona

  10. I recently wrote a short article as an overview to why regular expressions should be in every developer’s
