This question already has answers here:
Can regular expressions be used to match nested patterns? [duplicate]
(11 answers)
Closed 6 years ago.
I have multiple C++ files and need to extract the bodies of the for-loops from them.
Is there an easy way to do this, maybe using grep? Assume there are no nested for-loops.
Without parsing the entire file, the answer is no.
The body of a for-loop is delimited by braces that can nest to arbitrary depth. That is a context-free construct and, as such, cannot be matched by a regular expression in the formal sense.
A more involved approach is to use grep to search for the beginning of a for-loop (for, followed by optional whitespace, followed by a left parenthesis) and then manually find the matching closing curly brace.
Unfortunately, parsing C++ in full is undecidable in general (template instantiation is Turing complete), so unless there's some cute flag to pass to your compiler, you're hosed.
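The search-then-scan approach described above can be sketched in a few lines of Python. This is only a rough illustration, not a C++ parser: the helper names (`extract_for_bodies`, `_match_close`) are made up for this example, and the scan assumes that no parentheses or braces appear inside string literals, character literals, or comments.

```python
import re

# "for", optional whitespace, then a left parenthesis.
FOR_HEAD = re.compile(r"\bfor\s*\(")


def _match_close(text, start, open_ch, close_ch):
    """Return the index of the delimiter that closes text[start], or -1."""
    depth = 0
    for i in range(start, len(text)):
        if text[i] == open_ch:
            depth += 1
        elif text[i] == close_ch:
            depth -= 1
            if depth == 0:
                return i
    return -1


def extract_for_bodies(source):
    """Collect the body of every for-loop found in source.

    Assumption: braces and parentheses never occur inside strings or comments.
    """
    bodies = []
    pos = 0
    while True:
        m = FOR_HEAD.search(source, pos)
        if m is None:
            break
        # m.end() - 1 is the "(" that opens the loop header.
        rpar = _match_close(source, m.end() - 1, "(", ")")
        if rpar == -1:
            break
        i = rpar + 1
        while i < len(source) and source[i].isspace():
            i += 1
        if i < len(source) and source[i] == "{":
            # Braced body: count braces until the matching "}".
            end = _match_close(source, i, "{", "}")
            if end == -1:
                break
            bodies.append(source[i + 1:end].strip())
        else:
            # Single-statement body: take everything up to the next ";".
            end = source.find(";", i)
            if end == -1:
                break
            bodies.append(source[i:end + 1].strip())
        pos = end + 1
    return bodies
```

Because the scan resumes after each body, loops nested inside another loop's body are not reported separately, which matches the "no nested for-loops" assumption in the question.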
Related
This question already has answers here:
How to find overlapping matches with a regexp?
(4 answers)
Closed 3 years ago.
I'm trying to extract the repeated pattern from a string.
For example with something like "112112112112" I would want to end up with "112".
I've been having problems where I either end up with "1" or "112112".
The patterns can be of any size.
Here's an example of the kind of expressions I've been playing around with.
^(.+)(?=\1)
The string contains repeated patterns of different sizes. If a size of 3 is desired, for instance, we'd use a quantifier for that, such as:
(.{3})(?=\1)
or
(.{3,5})(?=\1)
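As a small illustration in Python (the sample string is the one from the question), the snippet below shows why the greedy and lazy forms of the original attempt give "112112" and "1", how the fixed-width pattern behaves, and one possible alternative that is not from the answer above: anchoring a lazy group with `\1+` so that the repeated unit has to tile the whole string.

```python
import re

text = "112112112112"

# Greedy: the group grows as large as possible while a copy still follows it.
greedy = re.match(r"^(.+)(?=\1)", text).group(1)

# Lazy: the group stops at the first prefix that is followed by a copy of itself.
lazy = re.match(r"^(.+?)(?=\1)", text).group(1)

# Fixed width of 3: every 3-character chunk that is followed by a copy of itself.
fixed = re.findall(r"(.{3})(?=\1)", text)

# Alternative (an assumption about what is wanted): the shortest unit whose
# repetitions cover the entire string.
unit = re.fullmatch(r"(.+?)\1+", text).group(1)

print(greedy)  # 112112
print(lazy)    # 1
print(fixed)   # ['112', '112', '112']
print(unit)    # 112
```

The last pattern only matches when the string consists of nothing but whole repetitions of the unit, so it returns no match for inputs with a trailing partial repetition.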
This question already has answers here:
How do I match any character across multiple lines in a regular expression?
(26 answers)
Regex search with pattern containing (?:.|\s)*? takes increasingly long time
(1 answer)
Closed 3 years ago.
A few times I saw regex experts say that using (.|\n)*? is a really, really bad idea.
Well, I do understand that it's better to replace it with the .* and use the /s flag. But sometimes the flags are not available, for example, when using regex within a text editor or other software with limited regex functionality. Thus, using something like (.|\n)*? might be the only option for multi-line matching.
So, what are the reasons to always avoid (.|\n)*??
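For comparison, here is a small Python sketch of three ways to match across lines. A commonly cited concern with `(.|\n)*?` is that the alternation inside a quantified group gives a backtracking engine a separate choice to record for every character, whereas a character class such as `[\s\S]` matches any character, newlines included, in a single step and needs no flag. The sample text is made up, and actual performance depends on the regex engine, so no timings are claimed here.

```python
import re

text = "<a>\nline1\nline2\n</a>"

# Alternation inside a quantified group (the form under discussion).
alternation = re.search(r"<a>((?:.|\n)*?)</a>", text).group(1)

# Character class that matches any character, no flag required.
char_class = re.search(r"<a>([\s\S]*?)</a>", text).group(1)

# Dot with the DOTALL flag (the /s flag in other flavors).
dotall = re.search(r"<a>(.*?)</a>", text, re.DOTALL).group(1)

# All three capture the same multi-line content.
print(alternation == char_class == dotall)  # True
```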
This question already has answers here:
Regular expression for math operations with parentheses
(4 answers)
Closed 2 years ago.
I need help building a regular expression that accepts the basic arithmetic operations but also allows operations to be nested inside any number of parentheses.
So far I have this expression:
^([(]*(-)?\d+(\.\d+)?[)]?)([(]?[-+/*%^]?\d+(\.\d+)?[)]*)+
The problem is that the expression above accepts input with an unclosed or unopened parenthesis (parentheses must come in pairs).
Here are the tests I ran; the inputs in the red box should not be accepted:
http://regexr.com/38r4u
And I hope you can help me,
Thanks.
You cannot parse a recursive structure using a regex. Use a parser instead.
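To illustrate what "use a parser" can look like, here is a minimal recursive-descent validator in Python. It is a sketch under assumptions: the grammar (numbers, an optional leading minus, the operators from the question's character class, and parentheses) is inferred from the regex in the question, whitespace is ignored, and the function names are made up for this example.

```python
import re

# A number, or a single operator or parenthesis, with optional leading whitespace.
TOKEN = re.compile(r"\s*(\d+(?:\.\d+)?|[-+/*%^()])")
OPERATORS = set("+-*/%^")


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def is_valid_expression(text):
    """Return True if text is a well-formed arithmetic expression."""
    try:
        tokens = tokenize(text)
    except ValueError:
        return False
    pos = 0

    def operand():
        # operand := "-"? ( number | "(" expression ")" )
        nonlocal pos
        if pos < len(tokens) and tokens[pos] == "-":
            pos += 1
        if pos >= len(tokens):
            return False
        tok = tokens[pos]
        if tok == "(":
            pos += 1
            if not expression():
                return False
            if pos >= len(tokens) or tokens[pos] != ")":
                return False  # missing closing parenthesis
            pos += 1
            return True
        if tok[0].isdigit():
            pos += 1
            return True
        return False

    def expression():
        # expression := operand ( operator operand )*
        nonlocal pos
        if not operand():
            return False
        while pos < len(tokens) and tokens[pos] in OPERATORS:
            pos += 1
            if not operand():
                return False
        return True

    # Leftover tokens (for example a stray ")") make the input invalid.
    return expression() and pos == len(tokens)
```

The recursion in `operand` is what a plain regular expression lacks: each opening parenthesis starts a nested `expression` that must be closed before the outer one can continue.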
This question already has answers here:
Using Regex to generate Strings rather than match them
(12 answers)
Reverse regular expression, create string from regex
(1 answer)
Closed 9 years ago.
Or "How can I RegEx in reverse?"
Specifically, I want to take a regex such as wks[0-9][0-9][0-9]
and create a list such as wks001, wks002, wks003, etc.
I know the easiest way in this example would be to simply increment the number through addition, but say I want the pattern to be more sophisticated later on, such as [0-9abc]; then I'd like to use a more sophisticated tool.
Preferably this would use some Windows-capable scripting technology, such as VBScript or PowerShell, but I'm open to other alternatives. I guess I thought this might be something that is done all the time by random number generators and such and would be a programming staple, but I think I lack the understanding to phrase it correctly.
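One way to approach this, sketched here in Python rather than VBScript or PowerShell, is to treat the pattern as a sequence of positions, each with a finite set of allowed characters, and take the Cartesian product. This is a hypothetical sketch that only understands literal characters and simple bracket classes such as [0-9] or [0-9abc]; quantifiers, alternation, negated classes, and escapes are not handled, and the function names are invented for this example.

```python
import itertools
import re

# Either a bracket class like [0-9abc] or a single literal character.
PIECE = re.compile(r"\[([^\]]+)\]|(.)")


def expand_class(body):
    """Turn the inside of a bracket class, e.g. '0-9abc', into a list of characters."""
    chars = []
    i = 0
    while i < len(body):
        if i + 2 < len(body) and body[i + 1] == "-":
            # Range such as 0-9: include both endpoints.
            lo, hi = ord(body[i]), ord(body[i + 2])
            chars.extend(chr(c) for c in range(lo, hi + 1))
            i += 3
        else:
            chars.append(body[i])
            i += 1
    return chars


def expand(pattern):
    """Yield every string matched by a pattern of literals and simple classes."""
    choices = []
    for cls, literal in PIECE.findall(pattern):
        choices.append(expand_class(cls) if cls else [literal])
    for combo in itertools.product(*choices):
        yield "".join(combo)


names = list(expand("wks[0-9][0-9][0-9]"))
print(len(names))   # 1000
print(names[:3])    # ['wks000', 'wks001', 'wks002']
```

Because `expand` is a generator, large patterns can be consumed lazily instead of building the whole list in memory.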
This question already has answers here:
Does an algorithm exist which can determine whether one regular language matches any input another regular language matches?
(4 answers)
Closed 9 years ago.
I'd like to take a user-input regular expression and determine whether or not it will match any string, i.e. would it "reduce" to .+ or .*?
I suspect that since this exists, that my question will reduce to the halting problem, but I'd really like to be wrong about that.
I don't think what you want is similar to the halting problem, because regular expressions describe regular languages, which are recognized by finite automata. Considering that the alphabet and the automaton are finite, you can still use a naive algorithm that tries every word over the alphabet, up to a bounded length, and tests whether the regular expression recognizes it.
In practice, this method has awful complexity, but you don't have the "undefined" state you would have in the halting problem, since the inputs can be enumerated.
I don't actually know whether a better version of this naive algorithm exists, but I hope I answered your question about the similarity to the halting problem.
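The naive enumeration described above could look like the following Python sketch. The alphabet, the length bound, and the function name are all assumptions made for illustration. With a fixed bound the check can find a counterexample but cannot prove that none exists, and it presumes the user-supplied pattern is a true regular expression (no backreferences or similar extensions).

```python
import itertools
import re


def find_counterexample(pattern, alphabet="ab", max_len=4):
    """Return the first word (shortest first) that pattern does not fully match.

    Returns None if every word over alphabet up to max_len characters matches.
    """
    compiled = re.compile(pattern)
    for length in range(max_len + 1):
        for chars in itertools.product(alphabet, repeat=length):
            word = "".join(chars)
            if compiled.fullmatch(word) is None:
                return word
    return None


print(find_counterexample(r".*"))      # None
print(repr(find_counterexample(r".+")))  # '' (the empty string is rejected)
print(find_counterexample(r"a*b*"))    # ba
```

The number of candidate words grows exponentially with `max_len`, which is the "awful complexity" mentioned above.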