I have a large C++ code base that I'm doing some refactoring on where a number of functions have become redundant, and hence should be removed. So I would like to replace
MyFunc(Param)
with
Param
where Param could be a literal value, variable, function call etc... From the online help I gathered that the search parameters should be
MyFunc/({+}/(
and the replace parameters simply
/1
But this gives me a syntax error in my pattern. I'm new to search and replace with regex under visual studio. Can the above be easily achieved? I've had a look at similar questions on this site, which suggest I'm roughly on the right track, but seem to be missing something.
Edit: If you can answer the above, how about if it is part of a class deference, e.g.
MyClass.MyFunc(Param)
or
MyClass->MyFunc(Param)
(FWIW, I also picked up a copy of VisualAssist in the hope it could do this but it doesn't appear to be able to handle this situation).
Second edit: Thanks to Joe for the correct response, but for anyone else using this approach, beware of some pitfalls,
MyFunc(MyArray[MyOtherFunc(x)])
ends up as
MyArray[MyOtherFunc(x])
and
MyFunc((SomeType)x)
ends up as
(SomeTypex)
Once you do a search to check what you get prior to doing a search and replace, make sure you keep modified files open in case you need to undo, and backup your source files before starting, this works well enough. Even with the pitfalls listed, still a huge time saver.
Try this instead:
Find = MyFunc\({[^\)]*}\)
Replace = \1
Your slashes are the wrong way around and the expression in the parenthesis ({+}) is invalid.
This won't work for parameters that contain function calls or other uses of parentheses - the balanced bracket matching problem isn't solveable using regular expressions.
Related
First of all, I use C# 4.0 to parse the code of a VB6 application.
I have some old VB6 code and about 500+ copies of it. And I use a regular expression to grab all kinds of global variables from the code. The code is described as "Yuck" and some poor victim still has to support this. So I'm hoping to help this poor sucker a bit by generating overviews of specific constants. (And yes, it should be rewritten but it ain't broke, so...)
This is a sample of a code line I need to match, in this case all boolean constants:
Public Const gDemo = False 'Is this a demo version
And this is the regular expression I use at this moment:
Public\s+Const\s+g(?'Name'[a-zA-Z][a-zA-Z0-9]*)\s+=\s+(?'Value'[0-9]*)
And I think it too is yuckie, since the * at the end of the boolean group. But if I don't use it, it will only return 'T' or 'F'. I want the whole word.
Is this the proper RegEx to use as solution or is there an even nicer-looking option?
FYI, I use similar regexs to find all string constants and all numeric constants. Those work just fine. And basically the same .BAS file is used for all 50 copies but with different values for all these variables. By parsing all files, we have a good overview of how every version is configured.
And again, yes, we need to rebuild the whole project from scratch since it becomes harder to maintain these days. But it works and we need the manpower for other tasks. It just needs the occasional tweaks...
You can use: Public\s+Const\s+g(?<Name>[a-zA-Z][a-zA-Z0-9]*)\s+=\s+(?<Value>False|True)
demo
I searched a lot, but never found an answer to my question, and I'm desperate.
I would like to get all dots ( '.' ) between parenthesis wherever they are, and with and undefined number of parenthesis. The problem is that I can just get the first dot, but I don't know how to get all in the same group.
I tried this : \((?:[^\.]*)([\.])(?:[^\.]*)*\)
But it just works if there's just one dot..
Any idea please ?
Try this:
(\(|(\.)|\))
example: http://regex101.com/r/jV5yI0
Part of the reason you're having problems finding an answer to this is that there really isn't one. In order to match an arbitrary number of parenethesis, or any other construct that, that requires a context free grammar. A regex simply isn't powerful enough
That being said there are some Regex engines out there that do support this type of matching. The support tends to be very Engine specific though (for example .Net does it with balancing groups). If you can tell us what engine you are using we may be able to provide an exact answer here
Suppose I want find all function calls in my listing (a vb.net listing), and I have the function name.
first I thought I could do a regular expression such as:
myfunc\( .* \)
That should work even if the function spans multiple lines, assuming that the dot is interpreted as including newlines (there is an option to do this in dot-net)
but then I realized that some of my arguments themselves could be function calls.
in other words:
myfunc( a,b,c,d(),e ),
which means that the parentheses don't match up.
so I thought that since the main function call usually is the first item on a line, I could do this:
^myfunc( .* \) $
The idea is that the function is the first item on a line (^) and the last paren is the last item on a line ($). but that doesn't work either.
What am I doing wrong?
You can't. By design, regular expressions cannot deal with recursion which is needed here.
For more information, you might want to read the first answer here: Can regular expressions be used to match nested patterns?
And yes, I know that some special "regular expressions" do allow for recursion. However, in most cases, this means that you are doing something horrible. It is much better to use something that can actually understand the syntax of your language.
This is not a direct answer to your question, but if you want to find all uses of your function you can use Visual Studio. Just right click on the function, and then select Find All References:
Visual Studio will show you the results. You can then double click on each line and Visual Studio will take you there.
I’m using a commercial application that has an option to use RegEx to validate field formatting. Normally this works quite well. However, today I’m faced with validating the following strings: quoted alphanumeric codes with simple arithmetic operators (+-/*). Apparently the issue is sometimes users add additional spaces (e.g. “ FLR01” instead of “FLR01”) or have other typos such as mismatched parenthesis that cause issues with downstream processing.
The first examples all had 5 codes being added:
"FLR01"+"FLR02"+"FLR03"+"FMD01"+"FMR05"
So I started going down the road of matching 5 alphanumeric characters quoted by strings:
"[0-9a-zA-Z]{5}"[+-*/]
However, the formulas quickly got harder and I don’t know how to get around the following complications:
I need to test for one of the four simple math operators (+-*/) between each code, but not after the last one.
There can be any number of codes being added together, not just five as in the example above.
Enclosed parenthesis are okay (“X”+”Y”)/”2”
Mismatched parenthesis are not okay.
No formula (e.g. a blank) is okay.
Valid:
"FLR01"+"FLR02"+"FLR03"+"FMD01"+"FMR05"
"0XT"+"1SEAL"+"1XT"+"23LSL"+"23NBL"
("LS400"+"LT400")*"LC430"/("EL414"+"EL414R"+"LC407"+"LC407R"+"LC410"+"LC410R"+"LC420"+"LC420R")
Invalid:
" FLR01" +"FLR02"
"FLR01"J"FLR02"
("FLR01"+"FLR02"
Is this not something you can easily do with RegExp? Based on Jeff’s answer to 230517, I suspect I’m failing at least the ‘matched pairing’ issue. Even a partial solution to the problem (e.g. flagging extra spaces, invalid operators) would likely be better than nothing, even if I can't solve the parenthesis issue. Suggestions welcomed!
Thanks,
Stephen
As you are aware you can't check for matching parentheses with regular expressions. You need something more powerful since regexes have no way of remembering state and counting the nested parentheses.
This is a simple enough syntax that you could hand code a simple parser which counts the parentheses, incrementing and decrementing a counter as it goes. You'd simply have to make sure the counter never goes negative.
As for the rest, how about this?
("[0-9a-zA-Z]+"([+\-*/]"[0-9a-zA-Z]+")*)?
You could also use this regular expression to check the parentheses. It wouldn't verify that they're nested properly but it would verify that the open and close parentheses show up in the right places. Add in the counter described above and you'd have a proper validator.
(\(*"[0-9a-zA-Z]+"\)*([+\-*/]\(*"[0-9a-zA-Z]+"\)*)*)?
You can easily use regex's to match your tokens (numbers, operators, etc), but you cannot match balanced parenthesis. This isn't too big of a problem though, as you just need to create a state machine that operates on the tokens you match. If you're not familiar with these, think of it as a flow chart within your program where you keep track of where you are, and where you can go. You can also have a look at the Wikipedia page.
We have a configuration file that lists a series of regular expressions used to exclude files for a tool we are building (it scans .class files). The developer has appended all of the individual regular expressions into a single one using the OR "|" operator like this:
rx1|rx2|rx3|rx4
My gut reaction is that there will be an expression that will screw this up and give us the wrong answer. He claims no; they are ORed together. I cannot come up with case to break this but still fee uneasy about the implementation.
Is this safe to do?
Not only is it safe, it's likely to yield better performance than separate regex matching.
Take the individual regex patterns and test them. If they work as expected then OR them together and each one will still get matched. Thus, you've increased the coverage using one regex rather than multiple regex patterns that have to be matched individually.
As long as they are valid regexes, it should be safe. Unclosed parentheses, brackets, braces, etc would be a problem. You could try to parse each piece before adding it to the main regex to verify they are complete.
Also, some engines have escapes that can toggle regex flags within the expression (like case sensitivity). I don't have enough experience to say if this carries over into the second part of the OR or not. Being a state machine, I'd think it wouldn't.
It's as safe as anything else in regular expressions!
As far as regexes go , Google code search provides regexes for searches so ... it's possible to have safe regexes
I don't see any possible problem too.
I guess by saying 'Safe' you mean that it will match as you needed (because I've never heard of RegEx security hole). Safe or not, we can't tell from this. You need to give us more detail like what the full regex is. Do you wrap it with group and allow multiple? Do you wrap it with start and end anchor?
If you want to match a few class file name make sure you use start and end anchor to be sure the matching is done from start til end. Like this "^(file1|file2)\.class$". Without start and end anchor, you may end up matching 'my_file1.class too'
The answer is that yes this is safe, and the reason why this is safe is that the '|' has the lowest precedence in regular expressions.
That is:
regexpa|regexpb|regexpc
is equivalent to
(regexpa)|(regexpb)|(regexpc)
with the obvious exception that the second would end up with positional matches whereas the first would not, however the two would match exactly the same input. Or to put it another way, using the Java parlance:
String.matches("regexpa|regexpb|regexpc");
is equivalent to
String.matches("regexpa") | String.matches("regexpb") | String.matches("regexpc");