NotNullOrEmpty: How to clean your code with regex in Visual Studio

Regex in Visual Studio is a really powerful way to replace code. It is similar to PHP or Perl regex, although some of the syntax differs, and you can use tagged expressions as well.

I was copying some fields from an edit page to a detail page. I did not need the validation messages, so I had to remove them all. That meant finding each message and removing it manually. I had multiple pages to deal with, so I needed something better than manual editing. Regex replace came to the rescue. Of course, I had to try different expressions to match what I wanted. I copied them here for your reference. Let's say you have code in an edit page like this:

    <div class="editor-label">
        @Html.LabelFor(model => model.ClaimTax.ER_SUTA_Amt)
    </div>
    <div class="editor-field">
        @Html.EditorFor(model => model.ClaimTax.ER_SUTA_Amt)
        @Html.ValidationMessageFor(model => model.ClaimTax.ER_SUTA_Amt)
    </div>

We are targeting the validation message, from the start of @Html to the closing parenthesis.
Syntax:

    \@Html.ValidationMessageFor\([^)]*\)

\ is the regex escape character. The @ has to be escaped because in Visual Studio it is an operator (minimal, zero or more).
[^)] means any single character except a right parenthesis.
[] Brackets match a single character from the set written inside them. [A-Z] matches a single character from A to Z. If you want to match more than one, you add a quantifier after the brackets.
For example, [A-Z]* matches zero or more characters from A to Z.
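
As a rough illustration of these building blocks, here is a minimal sketch in Python rather than in the Visual Studio dialog. It assumes Python's `re` module, where character classes and `*` behave the same way for these simple cases. The sample strings are made up for the example.

```python
import re

# [A-Z] matches exactly one uppercase letter per match.
single = re.findall(r"[A-Z]", "ER_SUTA_Amt")

# [A-Z]* matches zero or more uppercase letters; fullmatch checks the whole string.
all_upper = re.fullmatch(r"[A-Z]*", "SUTA") is not None
empty_ok = re.fullmatch(r"[A-Z]*", "") is not None

# [^)]* takes every character up to, but not including, the first ')'.
inside = re.match(r"[^)]*", "model => model.Name) trailing").group()

print(single)
print(all_upper, empty_ok)
print(inside)
```

Because `*` allows zero occurrences, the empty string also matches `[A-Z]*`, which is easy to overlook.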

In our example:
\@Html.ValidationMessageFor\([^)]* matches the start of the validation message call and everything inside it up to the right parenthesis. We want to select all of the syntax related to the validation message.

The final right parenthesis is also escaped, so it is matched literally. With this expression, you can target something like:

    @Html.ValidationMessageFor(model => model.ClaimTax.ER_SUTA_Amt)

and replace it with an empty string, as in my case, or with whatever you want.
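
The same cleanup can be sketched outside the editor. The following is a minimal Python version, assuming Python's `re` module instead of the Visual Studio dialog. In Python `@` is not an operator, so it needs no escape. The function name `strip_validation_messages` and the sample markup are made up for this example.

```python
import re

# Python counterpart of the find expression. The dot is escaped so that it
# only matches a literal '.'.
VALIDATION_CALL = re.compile(r"@Html\.ValidationMessageFor\([^)]*\)")

def strip_validation_messages(markup: str) -> str:
    """Replace every @Html.ValidationMessageFor(...) call with an empty string."""
    return VALIDATION_CALL.sub("", markup)

sample = (
    '<div class="editor-field">\n'
    "    @Html.EditorFor(model => model.ClaimTax.ER_SUTA_Amt)\n"
    "    @Html.ValidationMessageFor(model => model.ClaimTax.ER_SUTA_Amt)\n"
    "</div>\n"
)
cleaned = strip_validation_messages(sample)
print(cleaned)
```

The indentation and line break around the removed call stay in place, because the expression does not match them.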

Replace can also refer to tagged expressions, so you can reuse parts of the matched block in your replacement text.

Tagged expression support can be useful if you want to split a camel-case field name into words.

If you want to change this .NET MVC3 syntax:

    @Html.LabelFor(model => model.LastName)

to this:

    @Html.Label("Last Name")

You can set the find expression as shown here:

    \@Html.LabelFor\(model \=\> model\.{[A-Z][a-z]+}{[A-Z][a-z]+}

{} marks a tagged expression. We refer to those in the replace expression.
The first tagged expression captures the first word, which starts with an uppercase letter followed by one or more lowercase letters. The second expression is similar. The replace expression is:

    @Html.Label("\1 \2"
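
To see what the two tagged expressions capture, here is a small Python sketch of the same replace. It assumes Python's `re` syntax, where parentheses play the role of Visual Studio's `{}` and the replacement still uses `\1` and `\2`. The function name `label_for_to_label` is hypothetical.

```python
import re

# ( ... ) in Python corresponds to the { ... } tagged expressions above.
LABEL_FOR = re.compile(
    r"@Html\.LabelFor\(model => model\.([A-Z][a-z]+)([A-Z][a-z]+)"
)

def label_for_to_label(markup: str) -> str:
    """Rewrite @Html.LabelFor(model => model.LastName) as @Html.Label("Last Name")."""
    # The closing parenthesis is not part of the match, so it stays where it was.
    return LABEL_FOR.sub(r'@Html.Label("\1 \2"', markup)

print(label_for_to_label("@Html.LabelFor(model => model.LastName)"))
```

The replacement text ends after the closing quote on purpose: the right parenthesis of the original call is left untouched and closes the new call.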

If the field is something like "ClaimClientID", you need to change the find expression to this:

    \@Html.LabelFor\(model \=\> model\.{[A-Z][a-z]+}{[A-Z][a-z]+[^)]*}

It still targets two expressions, but the second one also takes the remaining characters up to the right parenthesis, even if they are uppercase letters or digits. For "ClaimClientID", \1 is "Claim" and \2 is "ClientID".

We don't need the right parenthesis at the end. Our find expression does not match it, so it stays in place. You can play with these to create your own powerful regex in the Visual Studio advanced replace menu.
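
The extended expression can be checked the same way. This is a minimal Python sketch, again assuming `re` syntax with `( )` standing in for `{ }`. The name `split_label` is made up for the example.

```python
import re

# The second group also takes everything up to the right parenthesis.
LABEL_FOR_LONG = re.compile(
    r"@Html\.LabelFor\(model => model\.([A-Z][a-z]+)([A-Z][a-z]+[^)]*)"
)

def split_label(markup: str) -> str:
    """Split the field name after its first word and keep the rest together."""
    return LABEL_FOR_LONG.sub(r'@Html.Label("\1 \2"', markup)

print(split_label("@Html.LabelFor(model => model.ClaimClientID)"))
print(split_label("@Html.LabelFor(model => model.LastName)"))
print(split_label("@Html.LabelFor(model => model.ID)"))
```

Only the first word boundary is split, so a name with three or more words still needs manual cleanup, and a name such as "ID" that does not start with a capitalized word is left unchanged.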

I copied the other options from the Visual Studio help:

Regular Expressions for Find and Replace

Expression	Syntax	Description	Example
Any character	.	Matches any single character except a line break.	a.o matches "aro" in "around" and "abo" in "about" but not "acro" in "across".
Zero or more	*	Matches zero or more occurrences of the preceding expression, and makes all possible matches.	a*b matches "b" in "bat" and "ab" in "about". e.*e matches the word "enterprise".
One or more	+	Matches at least one occurrence of the preceding expression.	ac+ matches words that contain the letter "a" and at least one instance of "c", such as "race", and "ace". a.+s matches the word "access".
Beginning of line	^	Anchors the match string to the beginning of a line.	^car matches the word "car" only when it appears as the first set of characters in a line of the editor.
End of line	$	Anchors the match string to the end of a line.	end$ matches the word "end" only when it appears as the last set of characters possible at the end of a line in the editor.
Beginning of word	<	Matches only when a word starts at this point in the text.	<in matches words such as "inside" and "into" that begin with the letters "in".
End of word	>	Matches only when a word ends at this point in the text.	ss> matches words such as "across" and "loss" that end with the letters "ss".
Line break	\n	Matches an operating system-independent line break. In a Replace expression, inserts a line break.	End\nBegin matches the word "End" and "Begin" only when "End" is the last string in a line and "Begin" is the first string in the next line. In a Replace expression, Begin\nEnd replaces the word "End" with "Begin" on the first line, inserts a line break, and then replaces the word "Begin" with the word "End".
Any one character in the set	[]	Matches any one of the characters in the []. To specify a range of characters, list the starting and ending characters separated by a dash (-), as in [a-z].	be[n-t] matches "bet" in "between", "ben" in "beneath", and "bes" in "beside" but not "bel" in "below".
Any one character not in the set	[^...]	Matches any character that is not in the set of characters that follows the ^.	be[^n-t] matches "bef" in "before", "beh" in "behind", and "bel" in "below", but not "ben" in "beneath".
Or	|	Matches either the expression before or the one after the OR symbol (|). Mostly used in a group.	(sponge|mud) bath matches "sponge bath" and "mud bath."
Escape	\	Matches the character that follows the backslash (\) as a literal. This lets you find the characters that are used in regular expression notation, such as { and ^.	\^ searches for the ^ character.
Tagged expression (or backreference)	{}	Matches text that is tagged with the enclosed expression.	zo{1} matches "zo1" in "Alonzo1" and "Gonzo1", but not "zo" in "zone".
C/C++ Identifier	:i	Shorthand for the expression ([a-zA-Z_$][a-zA-Z0-9_$]*).	Matches any possible C/C++ identifier.
Quoted string	:q	Shorthand for the expression (("[^"]*")|('[^']*')), which matches all characters that are enclosed in double or single quotation marks, and also the quotation marks themselves.	:q matches "test quote" and 'test quote' but not the 't of can't.
Space or Tab	:b	Matches either space or tab characters.	Public:bInterface matches the phrase "Public Interface" in text.
Integer	:z	Shorthand for the expression ([0-9]+), which matches any combination of numeric characters.	Matches any integer, such as "1", "234", "56", and so on.
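
To try a few of the shorthands from this table outside Visual Studio, one option is to expand them by hand. The sketch below is an assumption-laden illustration in Python: the expansions are taken from the table, written with non-capturing groups, and `expand_shorthand` is a hypothetical helper that only knows `:i`, `:q`, `:b` and `:z` and does no escaping of its own.

```python
import re

# Expansions taken from the table above, written in Python syntax.
SHORTHAND = {
    ":i": r"(?:[a-zA-Z_$][a-zA-Z0-9_$]*)",      # C/C++ identifier
    ":q": r"""(?:(?:"[^"]*")|(?:'[^']*'))""",   # quoted string
    ":b": r"[ \t]",                             # space or tab
    ":z": r"(?:[0-9]+)",                        # integer
}

def expand_shorthand(vs_pattern: str) -> str:
    """Replace the four shorthands above with their expansions (hypothetical helper)."""
    return re.sub(r":[iqbz]", lambda m: SHORTHAND[m.group()], vs_pattern)

phrase = re.search(expand_shorthand("Public:bInterface"), "Public Interface").group()
quotes = re.findall(expand_shorthand(":q"), "say \"test quote\" or 'test quote'")
numbers = re.findall(expand_shorthand(":z"), "1, 234 and 56")

print(phrase)
print(quotes)
print(numbers)
```

A function is used as the replacement so that the backslashes inside the expansions are inserted literally instead of being read as replacement escapes.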

The list of all regular expressions that are valid in Find and
Replace operations is longer than can be displayed in the Expression
Builder. Although the following regular expressions do not appear in the Expression
Builder, you can use them in the Find what or Replace with fields.

Expression	Syntax	Description	Example
Minimal, zero or more	@	Matches zero or more occurrences of the preceding expression, and matches as few characters as possible.	e.@e matches "ente" and "erprise" in "enterprise", but not the full word "enterprise".
Minimal, one or more	#	Matches one or more occurrences of the preceding expression, and matches as few characters as possible.	ac# matches words that contain the letter "a" and at least one instance of "c", such as "ace". a.#s matches "acces" in the word "access".
Repeat n times	^n	Matches n occurrences of the preceding expression.	[0-9]^4 matches any 4-digit sequence.
Grouping	()	Lets you group a set of expressions together. If you want to search for two different expressions in a single search, you can use the Grouping expression to combine them.	If you want to search for - [a-z][1-3] or - [1-10][a-z], you would combine them: ([a-z][1-3]) | ([1-10][a-z]).
nth tagged text	\n	In a Find or Replace expression, indicates the text that is matched by the nth tagged expression, where n is a number from 1 to 9. In a Replace expression, \0 inserts the complete matched text.	If you search for a{[0-9]} and replace with \1, all occurrences of "a" followed by a digit are replaced by the digit it follows. For example, "a1" is replaced by "1" and similarly "a2" is replaced by "2".
Right-justified field	\(w,n)	In a Replace expression, right-justifies the nth tagged expression in a field at least w characters wide.	If you search for a{[0-9]} and replace with \(10,1), the occurrences of "a" followed by a digit are replaced by the integer and right-justified by 10 spaces.
Left-justified field	\(-w,n)	In a Replace expression, left-justifies the nth tagged expression in a field at least w characters wide.	If you search for a{[0-9]} and replace with \(-10,1), the occurrences of "a" followed by a digit are replaced by the integer and left-justified by 10 spaces.
Prevent match	~(X)	Prevents a match when X appears at this point in the expression.	real~(ity) matches the "real" in "realty" and "really," but not the "real" in "reality."
Alphanumeric character	:a	Matches the expression ([a-zA-Z0-9]).	Matches any alphanumeric character, such as "a", "A", "w", "W", "5", and so on.
Alphabetic character	:c	Matches the expression ([a-zA-Z]).	Matches any alphabetical character, such as "a", "A", "w", "W", and so on.
Decimal digit	:d	Matches the expression ([0-9]).	Matches any digit, such as "4" and "6".
Hexadecimal digit	:h	Matches the expression ([0-9a-fA-F]+).	Matches any hexadecimal number, such as "1A", "ef", and "007".
Rational number	:n	Matches the expression (([0-9]+.[0-9]*)|([0-9]*.[0-9]+)|([0-9]+)).	Matches any rational number, such as "2007", "1.0", and ".9".
Alphabetic string	:w	Matches the expression ([a-zA-Z]+).	Matches any string that contains only alphabetical characters.
Escape	\e	Unicode U+001B.	Matches the "Escape" control character.
Bell	\g	Unicode U+0007.	Matches the "Bell" control character.
Backspace	\h	Unicode U+0008.	Matches the "Backspace" control character.
Tab	\t	Unicode U+0009.	Matches a tab character.
Unicode character	\x#### or \u####	Matches a character given by Unicode value where #### is hexadecimal digits. You can specify a character that is outside the Basic Multilingual Plane (that is, a surrogate) with the ISO 10646 code point or with two Unicode code points that give the values of the surrogate pair.	\u0065 matches the character "e".
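
Several of these operators have close counterparts in other regex flavors. The sketch below uses Python's `re` module and assumes the following rough mapping, which is my reading of the table and not something taken from the Visual Studio help: `@` becomes `*?`, `^n` becomes `{n}`, `~(X)` becomes `(?!X)`, and `\(w,n)` is imitated with a replacement function and `str.rjust`.

```python
import re

word = "enterprise"

# e.*e is greedy and takes the whole word; e.*?e (VS: e.@e) stops as early as it can.
greedy = re.search(r"e.*e", word).group()
minimal = re.search(r"e.*?e", word).group()
# Starting the search at the second "e" finds the other minimal match.
minimal_next = re.compile(r"e.*?e").search(word, 3).group()

# [0-9]{4} plays the role of [0-9]^4: exactly four digits in a row.
four_digits = re.findall(r"[0-9]{4}", "April 4, 2012")

# real(?!ity) plays the role of real~(ity).
no_ity = [w for w in ["realty", "really", "reality"] if re.search(r"real(?!ity)", w)]

# \(10,1) right-justifies tagged expression 1 in a field 10 wide; imitated with rjust.
justified = re.sub(r"a([0-9])", lambda m: m.group(1).rjust(10), "a1")

print(greedy, minimal, minimal_next)
print(four_digits)
print(no_ity)
print(repr(justified))
```

Python only reports non-overlapping matches, which is why the second minimal match is found by restarting the search at the second "e".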

Wednesday, April 4, 2012
