Visual Studio: Find / Replace using regular expression to replace - regex

I'm using Visual Studio's Find / Replace tool with regular expressions. The find pattern is Assert.IsTrue\(([^,;]*)\) *; and the replacement is Assert.IsTrue($1, "$1");. This looks for every Assert.IsTrue(...); whose parentheses contain anything except commas , and semicolons ;, and then repeats the content of the parentheses in quotes after a comma. So Assert.IsTrue(wtv); is replaced with Assert.IsTrue(wtv, "wtv");.
The problem arises when wtv contains quotes or line breaks. If I have
Assert.IsTrue("wtv" == "wtv") it will be replaced to
Assert.IsTrue("wtv" == "wtv", ""wtv" == "wtv"") and
Assert.IsTrue(wtv ||
wtv2)
will be replaced to
Assert.IsTrue(wtv ||
wtv2, "wtv ||
wtv2")
What I want is to drop the quotes and the line breaks (\r\n) from the quoted copy in the replacement, so that the results are
Assert.IsTrue("wtv" == "wtv", "wtv == wtv") and
Assert.IsTrue(wtv ||
wtv2, "wtv ||wtv2")

First, let me be clear that this doesn't really solve the problem; it's a nasty workaround, not a real solution. I'm posting it in case someone else needs a workaround as I did (I doubt it, but well). Since it isn't a real answer I won't mark it as accepted (unless someone explains that a real answer isn't possible), and new answers are always welcome.
What I did was replace the single group with a repeated pair of groups: ([^,;"\r\n]*) first matches anything that is not a comma, semicolon, quote or new-line character, then (["\r\n]*) matches any quotes or new-line characters, and this pair is repeated several times.
Because both groups use *, each one matches zero or more characters, so it is not a problem if a given group finds nothing. The pair is repeated so that the expression can contain more than one run of quotes or new-lines. The replacement then looks like
Assert.IsTrue($1$2$3..., "$1$3$5...");
where the first argument uses every group, and the quoted copy uses only the odd-numbered groups, since the even-numbered ones are either empty or contain quotes / new-lines.
I used 31 groups, so if the expression contains more than 15 runs of quotes / new-lines, it will not be found and replaced.
The find
Assert.IsTrue\(([^,;"\r\n]*)(["\r\n]*)([^,;"\r\n]*)(["\r\n]*)([^,;"\r\n]*)(["\r\n]*)([^,;"\r\n]*)(["\r\n]*)([^,;"\r\n]*)(["\r\n]*)([^,;"\r\n]*)(["\r\n]*)([^,;"\r\n]*)(["\r\n]*)([^,;"\r\n]*)(["\r\n]*)([^,;"\r\n]*)(["\r\n]*)([^,;"\r\n]*)(["\r\n]*)([^,;"\r\n]*)(["\r\n]*)([^,;"\r\n]*)(["\r\n]*)([^,;"\r\n]*)(["\r\n]*)([^,;"\r\n]*)(["\r\n]*)([^,;"\r\n]*)(["\r\n]*)([^,;"\r\n]*)\) *;
The replace
Assert.IsTrue($1$2$3$4$5$6$7$8$9$10$11$12$13$14$15$16$17$18$19$20$21$22$23$24$25$26$27$28$29$30$31, "$1$3$5$7$9$11$13$15$17$19$21$23$25$27$29$31");
This works for the examples I provided and for any expression with at most 15 runs of quotes / new-lines. If I come up with something better (since this is a really crappy solution), I'll add it here.
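For anyone who wants to experiment with this workaround outside Visual Studio, here is a rough Python sketch that builds the same 31-group pattern and replacement programmatically instead of typing them out. The names N, find, replace and add_message are made up for illustration, and Python writes group references as \g<n> where Visual Studio uses $n.

```python
import re

N = 31  # number of groups, as in the answer above (must be odd)
keep = r'([^,;"\r\n]*)'  # odd groups: text copied into the message
drop = r'(["\r\n]*)'     # even groups: quotes / line breaks to leave out

# Alternate keep/drop groups, starting and ending with a keep group.
groups = [keep if i % 2 == 1 else drop for i in range(1, N + 1)]
find = r"Assert\.IsTrue\(" + "".join(groups) + r"\) *;"

# First argument: every group. Quoted message: odd-numbered groups only.
all_groups = "".join(rf"\g<{i}>" for i in range(1, N + 1))
odd_groups = "".join(rf"\g<{i}>" for i in range(1, N + 1, 2))
replace = f'Assert.IsTrue({all_groups}, "{odd_groups}");'


def add_message(source):
    """Apply the workaround's find/replace to a piece of source text."""
    return re.sub(find, replace, source)


print(add_message('Assert.IsTrue("wtv" == "wtv");'))
```

Be aware that with this many adjacent starred groups, a long Assert.IsTrue call that does not match (for example one that already contains a comma) may make a backtracking engine very slow, so treat this as a sketch for short inputs only.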

Related

Notepad++ masschange using regular expressions

I'm having trouble performing a mass change in a huge logfile.
Apart from the file size, which is causing issues for Notepad++, I have a problem using more than 10 parameters in the replacement; with up to 9 it works fine.
I need to change numerical values in a file where these values are located within quotation marks and with leading and ending comma: ."123,456,789,012.999",
I used this exp to find and replace the format to:
,123456789012.999, (so that there are no quotation marks and no comma within the num.value)
The exp used to find is:
([,])(["])([0-9]+)([,])([0-9]+)([,])([0-9]+)([,])([0-9]+)([\.])([0-9]+)(["])([,])
and the exp to replace is:
\1\3\5\7\9\10\11\13
The problem is that parameters \11 and \13 are not working (the trailing characters, e.g. .999 in the example, do not appear in the changed values).
So the question is: is there a limit on the number of parameters?
It seems to me that it doesn't work above 10. For shorter numeric values, where I only need up to 9 parameters, the search and replacement strings work fine; for the example above the search works but the replacement doesn't, and the end of the changed value gets corrupted.
It also occurred to me that instead of using Notepad++ I could change the logfile directly on the Unix server; however, I had trouble building the correct Perl syntax. Could anyone help with that?
After having a little play myself, it looks like back-references \11-\99 are invalid in Notepad++ (which is not that surprising, since this is commonly omitted from regex languages). However, there are several things you can do to improve that regular expression in order to make this work.
Firstly, you should consider using fewer groups, or alternatively non-capturing groups. Do you really need to store 13 variables in that regex in order to do the replacement? Clearly not, since you're not even using half of them!
To put it simply, you could just remove some brackets from the regex:
[,]["]([0-9]+)[,]([0-9]+)[,]([0-9]+)[,]([0-9]+)[.]([0-9]+)["][,]
And replace with:
,\1\2\3\4.\5,
...But that's not all! Why are you using square brackets to say "match anything inside", if there's only one thing inside?? We can get rid of these, too:
,"([0-9]+),([0-9]+),([0-9]+),([0-9]+)\.([0-9]+)",
(Note I added a "\" before the ".", so that it matches a literal "." rather than "anything".)
Also, although this isn't a big deal, you can use "\d" instead of "[0-9]".
This makes your final, optimised regex:
,"(\d+),(\d+),(\d+),(\d+)\.(\d+)",
And replace with:
,\1\2\3\4.\5,
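If you want to sanity-check the simplified pattern outside Notepad++, a minimal Python sketch could look like this. The names FIND, REPLACE and strip_grouping are just illustrative, and the sample line is made up around the value from the question.

```python
import re

# The optimised find / replace pair from above, unchanged.
FIND = r',"(\d+),(\d+),(\d+),(\d+)\.(\d+)",'
REPLACE = r",\1\2\3\4.\5,"


def strip_grouping(line):
    """Drop the quotes and the thousands separators from a quoted value."""
    return re.sub(FIND, REPLACE, line)


print(strip_grouping('x,"123,456,789,012.999",y'))  # x,123456789012.999,y
```

Only five groups are referenced, so the replacement stays well within \1 to \9.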
Not sure whether regex groups have a limit, but you could use lookarounds to save 2 groups, and you could also merge some groups in your example. But first, let's get rid of some useless character classes:
(\.)(")([0-9]+)(,)([0-9]+)(,)([0-9]+)(,)([0-9]+)(\.)([0-9]+)(")(,)
We could merge those groups:
(\.)(")([0-9]+)(,)([0-9]+)(,)([0-9]+)(,)([0-9]+)(\.)([0-9]+)(")(,)
^^^^^^^^^^^^^^^^^^^^
We get:
(\.)(")([0-9]+)(,)([0-9]+)(,)([0-9]+)(,)([0-9]+\.[0-9]+)(")(,)
Let's add lookarounds:
(?<=\.)(")([0-9]+)(,)([0-9]+)(,)([0-9]+)(,)([0-9]+\.[0-9]+)(")(?=,)
The replacement would be \2\4\6\8.
If you have a fixed length of digits at all times, it's fairly simple to do what you have done. Even though your expression is poorly written, it does the job. If this is the case, look at Tom Lord's answer.
I played around with it a little bit myself, and I would probably use two expressions - makes it much easier. If you have to do it in one, this would work, but be pretty unsafe:
(?:"|(\d+),)|(\.\d+)"(?=,) replace by \1\2
Live demo: http://regex101.com/r/zL3fY5
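To see both what the single expression does and why it is unsafe, here is a small Python sketch. UNSAFE and strip_grouping_loose are made-up names, and a replacement function is used in place of \1\2 so that groups which did not take part in a match simply contribute nothing.

```python
import re

# The single "unsafe" expression from above.
UNSAFE = r'(?:"|(\d+),)|(\.\d+)"(?=,)'


def strip_grouping_loose(line):
    """Emulate the replacement \\1\\2, treating unmatched groups as empty."""
    return re.sub(
        UNSAFE,
        lambda m: (m.group(1) or "") + (m.group(2) or ""),
        line,
    )


# Works on the value from the question...
print(strip_grouping_loose(',"123,456,789,012.999",'))
# ...but it also rewrites unrelated quotes and digit-comma pairs.
print(strip_grouping_loose('say "hi", 1,2'))
```

The second call shows the danger: every quote and every comma directly after digits is removed, wherever it occurs in the line.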

Regex with comma separated numbers in Apex

As a preface, I realize there are other topics on regular expressions with comma separated numbers, but when I tried to use those solutions, they didn't work.
Basically, I am trying to create a regular expression to recognize comma separated numbers (in this case without spaces). Before trying to convert this into actual regex syntax, I realize that it should probably work something like this, where 'd' is a digit, ',' is a comma, and '+' is a Kleene plus:
((d+),)*(d+)
or
(d+)(,(d+))*
Here's the code I am using in an Apex validation to make sure that a certain field is a list of numbers separated by commas without spaces (note: I have tried several variations of this to no avail, but will only post one):
(\d+,)*(\d+)
For some reason this isn't working, but it seems to be the correct syntax of any digit 1 or more times followed by a single comma, and that entire expression can be repeated 0 or more times, and that entire repeated expression should always be followed by at least 1 digit.
This expression in practice does recognize all the accepted forms (ex: 100 or 100,200 etc.), but for some reason it also accepts answers like
'100,200,'
or
'100,200,,'
or
'100,,200'
I'm pretty stumped as to why this won't work as well as the previously given solutions which seem to do the same thing mine do. Thanks for any help in advance!
That's it:
^(\d+,)*\d+$
The anchors ^ and $ make the difference, because they force the whole string (not just a part of it) to match the pattern.
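The effect of the anchors can be reproduced with a quick Python sketch, under the assumption that the validation behaves like an unanchored search (which is what the accepted bad inputs suggest). The names UNANCHORED, ANCHORED, samples and accepted are illustrative only.

```python
import re

UNANCHORED = re.compile(r"(\d+,)*(\d+)")  # pattern from the question
ANCHORED = re.compile(r"^(\d+,)*\d+$")    # pattern from the answer

samples = ["100", "100,200", "100,200,", "100,200,,", "100,,200"]


def accepted(pattern, values):
    """Return the values in which the pattern finds a match."""
    return [v for v in values if pattern.search(v)]


print(accepted(UNANCHORED, samples))  # all five: a substring is enough
print(accepted(ANCHORED, samples))    # ['100', '100,200']
```

Without anchors, the "100" inside '100,,200' is already a match, which is why the malformed values slip through.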
You could also try a pattern like this, which uses a non-capturing group and still accepts a single number such as 100:
^(?:\d+,)*\d+$

How to read this command to remove all blanks at the end of a line

I happened across this page full of super useful and rather cryptic vim tips at http://rayninfo.co.uk/vimtips.html. I've tried a few of these, and I understand what is happening well enough to parse them in my head so that I can recreate them later. The ones I'm having a hard time wrapping my head around are the following two commands, which remove all spaces from the end of every line:
:%s= *$== : delete end of line blanks
:%s= \+$== : Same thing
I'm interpreting %s as string replacement on every line in the file, but after that I am getting lost in what looks like some gnarly variation of :s and regex. I'm used to seeing and using :s/regex/replacement. But the above is super confusing.
What do the above commands mean in English, step by step?
The regex delimiters don't have to be slashes, they can be other characters as well. This is handy if your search or replacement strings contain slashes. In this case I don't know why they use equal signs instead of slashes, but you can pretend that the equals are slashes:
:%s/ *$//
:%s/ \+$//
Does that make sense? The first one searches for a space followed by zero or more spaces, and the second one searches for one or more spaces. Each one is anchored at the end of the line with $. And then the replacement string is empty, so the spaces are deleted.
I understand your confusion, actually. If you look at :help :s you have to scroll down a few pages before you find this note:
*E146*
Instead of the '/' which surrounds the pattern and replacement string, you
can use any other character, but not an alphanumeric character, '\', '"' or
'|'. This is useful if you want to include a '/' in the search pattern or
replacement string. Example:
:s+/+//+
I do not know vim syntax, but it looks to me like these are sed-style substitution operators. In sed, the / (in s/REGEX/REPLACEMENT/) can be uniformly replaced with any other single character. Here it appears to be =. So if you mentally replace = with /, you'll get
:%s/ *$//
:%s/ \+$//
which should make more sense to you.
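If it helps to see the pattern itself at work outside Vim, here is a minimal Python sketch of the "one or more spaces at end of line" variant. The function name strip_trailing_spaces is made up, and re.MULTILINE is used so that $ matches at the end of every line, roughly the way :%s applies the substitution to each line of the buffer.

```python
import re


def strip_trailing_spaces(text):
    """Delete runs of spaces that sit at the end of a line."""
    return re.sub(r" +$", "", text, flags=re.MULTILINE)


print(repr(strip_trailing_spaces("a  \nb\n  c ")))  # 'a\nb\n  c'
```

Leading spaces are untouched, because the run of spaces has to reach the end of the line for $ to match.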

Visual Studio Find and Replace Regular Expressions help

I'd like to replace some assignment statements like:
int someNum = txtSomeNum.Text;
int anotherNum = txtAnotherNum.Text;
with
int someNum = Int32.Parse(txtSomeNum.Text);
int anotherNum = Int32.Parse(txtAnotherNum.Text);
Is there a good way to do this with Visual Studio's Find and Replace, using Regular Expressions? I'm not sure what the Regular expression would be.
I think in Visual Studio, you can mark expressions with curly braces {txtSomeNum.Text}. Then in the replacement, you can refer to it with \1. So the replacement line would be something like Int32.Parse(\1).
Update: via @Timothy003
VS 11 does away with the {} \1 syntax and uses () $1
Comprehensive guide
http://blog.goyello.com/2009/08/22/do-it-like-a-pro-%E2%80%93-visual-studio-find-and-replace/
This is what I was looking for:
Find: = {.*\.Text}
Replace: = Int32.Parse(\1)
Better regex for the original problem would be
find expr.: {:i\.Text}
replace expr.: Int32.Parse(\1)
Check out:
http://msdn.microsoft.com/en-us/library/2k3te2cs%28v=vs.100%29.aspx
for the definitive guide to regex in VS.
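For comparison, here is the same find/replace idea as a small Python sketch, using the () and backreference style that newer Visual Studio versions moved to. FIND, REPLACE and wrap_in_parse are illustrative names, and \w+ is only an approximation of the :i identifier shorthand.

```python
import re

# Capture an identifier followed by .Text on the right-hand side.
FIND = r"= (\w+\.Text);"
REPLACE = r"= Int32.Parse(\1);"


def wrap_in_parse(source):
    """Wrap every `= something.Text;` assignment in Int32.Parse(...)."""
    return re.sub(FIND, REPLACE, source)


code = "int someNum = txtSomeNum.Text;\nint anotherNum = txtAnotherNum.Text;"
print(wrap_in_parse(code))
```

Matching an identifier instead of .* keeps the capture from spilling over other text on the same line.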
I recently completed reformatting another programmer's C++ project from hell. He had completely and arbitrarily entered, or left out at random, spaces and tabs, indentation (or not), and an insane level of parentheses nesting, such that none of us used to coding standards of any type could even begin to read the code before I started. Used regex extensively to find and correct abnormal constructs. In a couple of hours, I was able to correct major problems in approximately 125,000 lines of code without actually looking at most of them. In one particular single find/replace I changed more than 22,000 lines of code in 125 files, total time under 10 seconds.
Particularly useful constructs in the regex:
:b+ == one or more blanks and/or tabs.
:i == matches a C-style variable name or keyword (i.e. while, if,
pick3, bNotImportant)
:Wh == a whitespace char.; not just blank or tab
:Sm == any of the arithmetic symbols (+, -, >, =, etc.)
:Pu == any punctuation mark
\n == line break (useful for finding where he had inserted 8 or 10 blank lines)
^ == matches start of line ($ to match end)
While it would have been nice to match some other regex standard (duh), I did find a number of the MS extensions extremely useful for searching a code base, such as not having to define 'identifier' hundreds of times as "[A-Za-z0-9]+", instead just using ":i".

how to eliminate dots from filenames, except for the file extension

I have a bunch of files that look like this:
A.File.With.Dots.Instead.Of.Spaces.Extension
Which I want to transform via a regex into:
A File With Dots Instead Of Spaces.Extension
It has to be in one regex (because I want to use it with Total Commander's batch rename tool).
Help me, regex gurus, you're my only hope.
Edit
Several people suggested two-step solutions. Two steps really make this problem trivial, and I was really hoping to find a one-step solution that would work in TC. I did, BTW, manage to find a one-step solution that works as long as there's an even number of dots in the file name. So I'm still hoping for a silver bullet expression (or a proof/explanation of why one is strictly impossible).
It appears Total Commander's regex library does not support lookaround expressions, so you're probably going to have to replace a number of dots at a time, until there are no dots left. Replace:
([^.]*)\.([^.]*)\.([^.]*)\.([^.]*)$
with
$1 $2 $3.$4
(Repeat the sequence and the number of backreferences for more efficiency. You can go up to $9, which may or may not be enough.)
It doesn't appear there is any way to do it with a single, definitive expression in Total Commander, sorry.
Basically:
/\.(?=.*?\.)//
will do it in pure regex terms. This means, replace any period that is followed by a string of characters (non-greedy) and then a period with nothing. This is a positive lookahead.
In PHP this is done as:
$output = preg_replace('/\.(?=.*?\.)/', '', $input);
Other languages vary but the principle is the same.
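For example, a rough Python equivalent might look like the sketch below. It substitutes a space rather than nothing, since the question asks for spaces, and the function name dots_to_spaces is made up for illustration.

```python
import re


def dots_to_spaces(name):
    """Replace every dot that still has another dot somewhere after it."""
    return re.sub(r"\.(?=.*?\.)", " ", name)


print(dots_to_spaces("A.File.With.Dots.Instead.Of.Spaces.Extension"))
# A File With Dots Instead Of Spaces.Extension
```

The last dot has no dot after it, so the lookahead fails there and the extension separator is kept.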
Here's one based on your almost-solution:
/\.([^.]*(\.[^.]+$)?)/\1/
This is, roughly, "any dot stuff, minus the dot, and maybe plus another dot stuff at the end of the line." I couldn't quite tell if you wanted the dots removed or turned to spaces - if the latter, change the substitution to " \1" (minus the quotes, of course).
[Edited to change the + to a *, as Helen's below.]
Or substitute all dots with space, then substitute [space][Extension] with .[Extension]
A.File.With.Dots.Instead.Of.Spaces.Extension
to
A File With Dots Instead Of Spaces Extension
to
A File With Dots Instead Of Spaces.Extension
Another pattern to find all dots but the last in a (windows) filename that I've found works for me in Mass File Renamer is:
(?!\.\w*$)\.
I don't know how useful that is to other users, but this page was an early search result and if that had been on here it would have saved me some time.
It excludes the result if it's followed by an uninterrupted sequence of alphanumeric characters leading to the end of the input (filename) but otherwise finds all instances of the dot character.
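Here is a short Python sketch of that negative-lookahead pattern, again replacing with a space. The function name spaces_except_extension is hypothetical. The second call shows a consequence of \w*: when the extension contains a non-alphanumeric character, the last dot no longer counts as the extension separator and is replaced too.

```python
import re


def spaces_except_extension(name):
    """Replace each dot unless only word characters follow it to the end."""
    return re.sub(r"(?!\.\w*$)\.", " ", name)


print(spaces_except_extension("A.File.With.Dots.Instead.Of.Spaces.Extension"))
print(spaces_except_extension("a.b.c-d"))  # the hyphen defeats the lookahead
```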
You can do that with Lookahead. However I don't know which kind of regex support you have.
/\.(?=.*\.)//
Which roughly translates to Any dot /\./ that has something and a dot afterwards. Obviously the last dot is the only one not complying. I leave out the "optionality" of something between dots, because the data looks like something will always be in between and the "optionality" has a performance cost.
Check:
http://www.regular-expressions.info/lookaround.html