Regular Exp. in Eclipse, Find and Replace: Match everything between curly braces - regex

I would like to match everything between and including curly braces in Eclipse find and replace (it can be assumed that there are no inner curly braces, but any other character may appear, including all types of whitespace).
int SomeMethodName() {
// TODO Auto-generated method stub
return asdfasdf.rearoiula12123893;
}
Right now I am trying this and it only matches curly braces with nothing in them \{[.\s]*\}

The . inside a character class means a . literal, not a wildcard. You need something more like:
\{.*?\}
Depending on how Eclipse treats new line characters you might need to change it to:
\{(.|\r\n?|\n)*?\}
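As a quick check outside Eclipse, the same patterns can be tried with Python's re module. This is only a sketch: Eclipse's regex engine may differ in details, and the variable names are made up for illustration.

```python
import re

# The method from the question, as one multi-line string.
source = (
    "int SomeMethodName() {\n"
    "// TODO Auto-generated method stub\n"
    "return asdfasdf.rearoiula12123893;\n"
    "}\n"
)

# Inside [...] the dot is a literal '.', so this only allows dots and whitespace.
broken = re.search(r"\{[.\s]*\}", source)

# Lazy match with explicit newline alternatives, as suggested above.
fixed = re.search(r"\{(.|\r\n?|\n)*?\}", source)

print(broken)          # None
print(fixed.group(0))  # the braces and everything between them
```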

This should work. Tested using Regex Powertoy.
\{[\s\W\w]*\}
EDIT:
\{[\s\w\. /=(":);]*\} should stop at the nearest closing brace. The piece after the space has all the miscellaneous non-word characters, so you might have to add to that depending on the nature of what you're parsing (e.g. a weird String).

Related

regex: find a line somewhere after another line

I need a regular expression to find a specific line in a file that occurs somewhere after another line. for example, I may want to find the string "friend", but only when it occurs on a line after a line containing the string "hello". so for example:
hello there
how are you
my friend
should pass, but
how are you
my friend
hello
or
hello friend
how are you
should not pass.
The only thing I've thought of is something like hello[.\s]*\n[.\s]*friend, which does not work.
EDIT: I'm using a customized program that has a lot of limitations. I don't have access to switches or custom modes. I need a single regular expression that works for the standard python regex mode.
hello[.\s]*\n[.\s]*friend
First, note that a dot inside a character class matches a literal dot, not a "match all" character, so you really want alternation, not a character class, for this. But also note that a "match all" dot will also match spaces, so you don't even need alternation.
So overall, you really just need this:
hello.*?friend
Now comes the problem with matching across new-line chars. By default the "match all" dot does not match new-line chars. You can set a flag/modifier so that it does, but how you do that depends on what language you are using. In php or perl, you can use the s modifier, e.g.
php:
preg_match('~hello.*?friend~s',$content);
edit:
If you are trying to use regex in something like an editor (or otherwise can't add flags/modifiers), most editors have an option to flag it as such. If not, you can try alternation with newline chars like so:
hello(.|\r?\n)*friend
You need to include two newline characters.
hello(?:.*\n)+.*friend
This expects at least one newline character to be present in between.
I'm by no means a regex expert (particularly not in Python), but my RegexBuddy app thinks this will work:
(?s)hello.*\n+.*friend
The (?s) is an inline way of specifying the "Dot matches newline" option. It is needed so that .* can cross line breaks, not for the \n itself.
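Since the question asks for standard Python regex mode, the three suggested patterns can be compared directly against the three sample inputs. This is a small sketch; the names verdicts, passes, fails_order and fails_same_line are made up here.

```python
import re

# The three sample inputs from the question.
passes = "hello there\nhow are you\nmy friend"
fails_order = "how are you\nmy friend\nhello"
fails_same_line = "hello friend\nhow are you"

# The three patterns suggested in the answers above.
alternation = re.compile(r"hello(.|\r?\n)*friend")
needs_newline = re.compile(r"hello(?:.*\n)+.*friend")
inline_flag = re.compile(r"(?s)hello.*\n+.*friend")

def verdicts(pattern):
    """Return whether the pattern is found in each of the three samples."""
    samples = (passes, fails_order, fails_same_line)
    return [bool(pattern.search(text)) for text in samples]

print(verdicts(alternation))    # [True, False, True]
print(verdicts(needs_newline))  # [True, False, False]
print(verdicts(inline_flag))    # [True, False, False]
```

The alternation pattern accepts "hello friend" on a single line, because nothing in it forces a newline between the two words. The other two require at least one \n, so they reject that case.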

Vim regular expression to remove block of code for all the lines

In my code I want to remove a block of code that starts with a bracket and ends with a bracket. For example if I have this line
ENDPROGRAM { fprintf(sdfsdfsdfsd....) }
and after running the regex i want it to just end up with
ENDPROGRAM
I want to only delete code inside the bracket and the brackets themselves. I tried this command
:%s/\{[a-zA-Z0-0]*\}//g
but it says that pattern not found. Any suggestion?
ENDPROGRAM is just an example, I have like DIV MULT etc etc
Since you're using Vim, an alternative is to record a keyboard macro for this into a register, say register z.
Start recording with qz.
Search forward for ENDPROGRAM: /ENDPROGRAM[enter]
Scan forward for opening brace: f{
Delete to matching brace: d%
Finish recording q.
Now run the macro with @z, and then repeat with @@. Hold down your @ key to repeat rapidly.
For one-off jobs not involving tens of thousands of changes in numerous files, this kind of interactive approach works well. You visually confirm that the right thing is done in every place. The thing is that even if you fully automate it with regexes, you will still have to look at every change to confirm that the right thing was done before committing the code.
The first mistake in your regex is that the material between braces must only be letters and digits. (I'm assuming the 0-0 is a typo for 0-9). Note that you have other things between the braces such as spaces and parentheses. You want {.*}: an open brace, followed by zero or more characters, followed by a closing brace. If it so happens that you have variants, like ENDPROGRAM { abc } { def }, this regex will eat them too. The regex matches from the first open brace to the last closing one. Note also that the regex {[^}]*} will not work if the block contains nested interior braces; it stops at the first closing brace, not the last one, and so ENDPROGRAM { { x } } will turn to ENDPROGRAM }.
The second mistake is that you are running this on all lines using the % address. You only want to run this on lines that contain ENDPROGRAM, in other words:
:g/ENDPROGRAM/s/ {.*}//
"For all lines that contain a match for ENDPROGRAM, find a space followed by some bracketed text, and replace it with nothing." Or else:
:%s/ENDPROGRAM {.*}/ENDPROGRAM/
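The greedy and first-closing-brace behaviours described above are easy to see outside Vim. Here is a rough Python sketch; Python needs the braces escaped, and strip_blocks is a hypothetical helper that only approximates :g/ENDPROGRAM/s/ {.*}// by using a plain substring test for the keyword.

```python
import re

# Greedy: runs from the first '{' to the last '}' on the line.
greedy = re.sub(r" \{.*\}", "", "ENDPROGRAM { abc } { def }")

# Negated class: stops at the first '}', which breaks on nested braces.
nested = re.sub(r"\{[^}]*\}", "", "ENDPROGRAM { { x } }")

def strip_blocks(text, keyword):
    """Remove ' {...}' only on lines that contain the keyword."""
    out = []
    for line in text.splitlines():
        if keyword in line:
            line = re.sub(r" \{.*\}", "", line, count=1)
        out.append(line)
    return "\n".join(out)

print(repr(greedy))  # 'ENDPROGRAM'
print(repr(nested))  # 'ENDPROGRAM  }'
print(strip_blocks("ENDPROGRAM { fprintf(x) }\nDIV { keep }", "ENDPROGRAM"))
```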
THIS looks like a job for: (dum da dum duuuuum!)
TEXT OBJECTS!
Place the cursor anywhere within the braces. Type daB.
WOOOOOOOAAAH what just happened?!
aB is something called a "text object" in Vim. You could also have typed da{ or da} in this situation. A text object is a thing that Vim's operators can act on. d is one such operator. I'm sure you know others: c, y, etc.
Visual mode also works on text objects. Instead of daB, try vaB. This will select all the text in the braces, plus the braces themselves. Most text objects also have an "inner" variant, for example ciB would delete everything inside the braces, and enter insert mode, leaving the braces intact.
There are text objects to work with HTML/XML tags, objects for working with quoted strings, objects for sentences and paragraphs, objects for words and WORDS, and more. See the full list at :help text-objects.
When something is broken, start simple and work up to what you need. Do not worry about the :s command at first; instead, focus on getting the pattern (or regular expression) right.
Search for \{ and you will get an error message. Oops, that should be just {.
Add the character class: {[a-zA-Z0-0]*. Darn, that is not right, because you left out the space.
Next try: {[a-zA-Z0-0 ]*. Now we are getting somewhere, but we also want to match the parentheses and the dots: {[a-zA-Z0-0 ().]*.
Add the closing brace, realize that you really meant 0-9 instead of 0-0, and you are done: {[a-zA-Z0-9 ().]*}.
At this point, you can take advantage of the fact that :s uses the current search pattern by default, so all you need is :%s///.

Regular expression for everything between two characters

I'm trying to build a regular expression which allows me to remove tags in a string. These tags always look like this: {...}.
I've tried \{.*\} so far but unfortunately this won't work if these tags occur two times. For Example: {123} Hello {asdas}. The entire line would be deleted since it starts with a { and end with a }. So, how can I avoid this behaviour?
Thanks in advance.
\{[^}]*\}
would probably do it.
An opening bracket, followed by any number of anything besides a closing bracket, followed by a closing bracket.
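For illustration, here is the difference on the example from the question, sketched with Python's re module (the question does not say which language is in use, so that choice is an assumption):

```python
import re

text = "{123} Hello {asdas}"

# Greedy dot: swallows everything from the first '{' to the last '}'.
greedy = re.sub(r"\{.*\}", "", text)

# Negated class: each match ends at the nearest '}'.
per_tag = re.sub(r"\{[^}]*\}", "", text)

print(repr(greedy))   # ''
print(repr(per_tag))  # ' Hello '
```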

how to delete lines containing a string AND also NOT containing some other string?

I need some regular expression help.
Using the Firefox firebug extension, "css-usage", I was able to export a new css file where the utility prepended "UNUSED" to every class that was not referenced in the page.
I would like to now remove every style that contains the UNUSED styles, however there are some complexities with that. Namely, some tags are comma separated with other tags/selectors which may still be used so I don't want to delete any lines that have a comma in it. And secondly some styles are specified in a long multi-line block in curly braces, so I don't want to delete any lines that do not have a closing curly brace '}'.
I'm using a Mac, so any solution with sed, awk, or vi is acceptable. I would like to delete all lines in a CSS file that start with "UNUSED", contain no commas, and have a closing '}' curly brace.
To match UNUSED without commas and ensure that it ends with }:
UNUSED [^,]*}
To test if a string contains a character without capturing that character or moving the pointer forward, use a lookahead.
(?=.*[,])
Or, to make sure the string does not contain that character, use a negative lookahead.
(?!.*[,])
If you want to make sure that a string does not contain a character but does contain other characters, you can combine negative and positive lookaheads.
(?!.*[,])(?=.*[}])(?=.*UNUSED)
Finally, to actually select the entire string where this match occurs, use .* after the lookaheads.
^(?!.*[,])(?=.*[}])(?=.*UNUSED).*$
I am mostly familiar with .NET, but I believe many regex engines have options that allow you to specify whether ^ and $ will match the beginning/ending of the entire input string or the beginning/ending of a line. You'd want the second option.
Finally, you could use a regex replace to replace lines that match the given expression with an empty string (thus "deleting" those lines).
Use regex pattern
^UNUSED[^,]*}[^,]*$
with multi-line modifier.
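A minimal sketch of the lookahead approach, applied one line at a time in Python. The sample CSS and the drop_unused helper are made up for illustration. Testing each line separately avoids a pitfall of running a pattern such as [^,]* over the whole file: a negated class also matches newlines, so it can run across several lines.

```python
import re

# No comma, has a closing brace, and mentions UNUSED somewhere on the line.
UNUSED_LINE = re.compile(r"^(?!.*,)(?=.*\})(?=.*UNUSED).*$")

def drop_unused(css: str) -> str:
    """Return the CSS without the lines that match UNUSED_LINE."""
    kept = [line for line in css.splitlines() if not UNUSED_LINE.match(line)]
    return "\n".join(kept)

sample = (
    "UNUSED .a { color: red; }\n"
    "UNUSED .b, .c { color: blue; }\n"
    "UNUSED .d {\n"
    "  color: green;\n"
    "}\n"
    ".e { margin: 0; }"
)

print(drop_unused(sample))
```

Only the first line is dropped. The second has a comma, the third has no closing brace on the same line, and the rest do not mention UNUSED.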

Empty out all methods in a file in vim, regex?

I am using the vim editor and I wanted to empty out all methods in the class, for instance
class A {
public:
int getNumber() const {
return number;
}
void setNumber(int num) {
number = num;
}
};
I wanted to have a regex where I could substitute so that the result would be this
class A {
public:
int getNumber() const;
void setNumber(int num);
};
I know I can use %s for global substitution, but matching everything in the method body is what I am looking for.
Since braces may of course nest within your functions, it's unfortunately impossible to create a regular expression which could match the corresponding ending brace.
Fortunately, VIM does have the move-to-corresponding (which will move forward to the next brace/bracket/etc. and jump to the corresponding opening/closing one), using the %-key (shift-5 on most keyboard layouts I believe?), so basically, standing on the space before the opening brace, c%;<ESC> would replace the entire body with a semi-colon.
Finding the opening brace of a function is an entirely different matter, and I'll see if I can come up with something for you, but to get you going, execute it once and then just keep hitting . (period) to execute it again for every function.
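A jump to the matching brace can handle nesting where a plain regex cannot because it amounts to counting depth. Here is a minimal Python sketch of that idea. The find_matching_brace helper is hypothetical, and it ignores braces inside strings or comments.

```python
def find_matching_brace(text: str, open_pos: int) -> int:
    """Return the index of the '}' that closes the '{' at open_pos."""
    if text[open_pos] != "{":
        raise ValueError("open_pos must point at an opening brace")
    depth = 0
    for i in range(open_pos, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return i
    raise ValueError("unbalanced braces")

src = "int f() { if (x) { y(); } return 1; }"
start = src.index("{")
end = find_matching_brace(src, start)

# Replace the whole body with a semicolon, like c%;<ESC> in the answer above.
declaration = src[:start].rstrip() + ";" + src[end + 1:]
print(declaration)  # int f();
```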
I don't know that regular expressions can match over multiple lines. Peter may be right to recommend a macro. Something like this on the first method may do the trick:
[ESC]qa$d%i;[ESC]q
Then on the first line of the other methods, you can do this:
[ESC]@a
Here is how it works:
[ESC] - the escape key puts you in the correct mode
qa - save the next commands as a macro named "a"
$ - jump to the end of the line
d% - delete to the matching bracket
i; - insert a semicolon
[ESC] - the escape key puts you in the correct mode
q - stop recording the macro
@a - calls the macro again (on the line for the next method)
I hope this helps.
I think this is something that is best done using recorded macros and not regexes. Sorry I don't have time to create one for your case. To get started see:
:help q
Doing this with a regex for all possible inputs is impossible, as roe said. You would need a full parser for your language. But the regex below solves a subset of the problem that may be enough, as long as your text is VERY regular and adheres to some constraints:
Opening brace is the last character on the same line as the method header
Closing brace is indented the same amount as the opening line, is on its own line, and is the last character on the line
Line begins with int or void (you can expand this list to include all relevant types), ignoring leading whitespace
So this works on your sample input:
:%s/\v((^\s+)(int|void).{-})\s+\{$\_.{-}^\2\}$/\1;/
\v: "very magic" mode (avoids needing to backslash-escape everything)
((^\s+): (begin capture of \1); capture the level of indent for this line
(int|void): first word on the line (add to this list as needed)
.{-}): non-greedy match on this line (end capture of \1)
\s+\{$: one or more whitespace characters, then a {, then end of line
\_.{-}: non-greedy match everything, including newlines
^\2\}$: from the start of a line, match the number of spaces we captured above, then an ending brace at the end of the line
If you know how many spaces your method header lines are indented, you can plug this into the regex in place of (^\s+) to make it more fool-proof.
I guarantee you can easily think of possible inputs to make this regex fail.
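For comparison, a rough Python translation of the same idea. It rests on assumptions not in the original: the sample is re-indented with four spaces (the regex needs leading whitespace, which the sample above has lost), Vim's .{-} becomes *?, and \_. becomes . under re.DOTALL. The header part uses [^\n] so that it cannot spill onto the next line.

```python
import re

# The class from the question, re-indented with four spaces (an assumption).
source = (
    "class A {\n"
    "public:\n"
    "    int getNumber() const {\n"
    "        return number;\n"
    "    }\n"
    "    void setNumber(int num) {\n"
    "        number = num;\n"
    "    }\n"
    "};\n"
)

METHOD_BODY = re.compile(
    r"^(([ \t]+)(?:int|void)[^\n]*?)"  # \1: indented header, \2: the indent
    r"[ \t]+\{$"                        # opening brace ends the header line
    r".*?"                              # body, across lines (needs DOTALL)
    r"^\2\}$",                          # closing brace at the same indent
    re.MULTILINE | re.DOTALL,
)

emptied = METHOD_BODY.sub(r"\1;", source)
print(emptied)
```

The same constraints apply as for the Vim version: a body containing a line that is just a closing brace at the header's indent would end the match early.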
I would play with ctags to build the list of functions declared and defined in your header file. The list would contain the line of the inline definition, and not the regex to find the line.
Then, in reverse order, go to the line where each function is defined, search for its opening bracket, and apply d%. It should work fine as long as there aren't any unbalanced brackets in the comments of the function.
BTW, in order not to lose the definition, I have a :MOVEIMPL command that moves an inline function definition to the related .cpp.