I am using the vim editor and I wanted to empty out all methods in the class, for instance
class A {
public:
int getNumber() const {
return number;
}
void setNumber(int num) {
number = num;
}
};
I wanted to have a regex where I could substitute so that the result would be this
class A {
public:
int getNumber() const;
void setNumber(int num);
};
I know I can use %s for global substitution, but matching everything inside the method body is what I am looking for.
Since braces may of course nest within your functions, it's unfortunately impossible to create a regular expression which could match the corresponding ending brace.
Fortunately, Vim has a move-to-matching motion bound to the % key (Shift-5 on most keyboard layouts, I believe): it moves forward to the next brace, bracket, or parenthesis on the line and jumps to its matching partner. So, with the cursor on the space before the opening brace, c%;<ESC> replaces the entire body with a semicolon.
Finding the opening brace of a function is an entirely different matter, and I'll see if I can come up with something for you, but to get you going, execute it once and then just keep hitting . (period) to execute it again for every function.
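To illustrate why a depth counter succeeds where a single regular expression cannot, here is a minimal Python sketch that does the same job outside Vim. It is not part of the answer above: the function name strip_method_bodies is hypothetical, and the sketch assumes that the outermost braces are the class body and that braces never appear inside string literals or comments.

```python
def strip_method_bodies(source: str) -> str:
    """Replace every brace block at nesting depth 2 with ';'.

    Assumption: depth 1 is the class body, and braces never occur
    inside string literals or comments.
    """
    out = []
    depth = 0
    for ch in source:
        if ch == "{":
            depth += 1
            if depth == 2:
                # A method body starts here: drop the whitespace before
                # the brace and emit the semicolon instead.
                while out and out[-1] in " \t\n":
                    out.pop()
                out.append(";")
            if depth >= 2:
                continue
        elif ch == "}":
            depth -= 1
            if depth >= 1:
                # Closed a method body, or a block nested inside one.
                continue
        elif depth >= 2:
            continue  # inside a method body: skip the character
        out.append(ch)
    return "".join(out)


sample = (
    "class A {\n"
    "public:\n"
    "int getNumber() const {\n"
    "return number;\n"
    "}\n"
    "void setNumber(int num) {\n"
    "number = num;\n"
    "}\n"
    "};"
)
print(strip_method_bodies(sample))
```

Because the counter tracks nesting explicitly, an if block or loop inside a method is skipped along with the rest of the body, which is what the % motion does for you interactively.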
I don't know that regular expressions can match over multiple lines. Peter may be right to recommend a macro. Something like this on the first method may do the trick:
[ESC]qa$d%i;[ESC]q
Then on the first line of the other methods, you can do this:
[ESC]@a
Here is how it works:
[ESC] - the escape key puts you in normal mode
qa - record the following commands as a macro in register "a"
$ - jump to the end of the line (onto the opening brace)
d% - delete through the matching brace
i; - insert a semicolon
[ESC] - the escape key returns you to normal mode
q - stop recording the macro
@a - runs the macro again (on the first line of the next method)
I hope this helps.
I think this is something that is best done using recorded macros and not regexes. Sorry, I don't have time to create one for your case. To get started, see:
:help q
Doing this with a regex for all possible inputs is impossible, as roe said. You would need a full parser for your language. But the regex below solves a subset of the problem that may be enough, as long as your text is VERY regular and adheres to some constraints:
Opening brace is the last character on the same line as the method header
Closing brace is indented the same amount as the opening line, is on its own line, and is the last character on the line
Line begins with int or void (you can expand this list to include all relevant types), ignoring leading whitespace
So this works on your sample input:
:%s/\v((^\s+)(int|void).{-})\s+\{$\_.{-}^\2\}$/\1;/
\v: "very magic" mode (avoids needing to backslash-escape everything)
((^\s+): (begin capture of \1); capture the level of indent for this line
(int|void): first word on the line (add to this list as needed)
.{-}): non-greedy match on this line (end capture of \1)
\s+\{$: one or more whitespace characters, then a literal {, then end of line
\_.{-}: non-greedy match everything, including newlines
^\2\}$: from the start of a line, match the number of spaces we captured above, then an ending brace at the end of the line
If you know how many spaces your method header lines are indented, you can plug this into the regex in place of (^\s+) to make it more fool-proof.
I guarantee you can easily think of possible inputs to make this regex fail.
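For comparison, the same idea can be sketched with Python's re module. This is an assumed translation of the Vim pattern, not something from the answer itself: the names PATTERN and strip_bodies are made up for illustration, [ \t] stands in for \s because Python's \s also matches newlines, and the same three constraints on the input still apply.

```python
import re

PATTERN = re.compile(
    r"^(([ \t]+)(?:int|void)[^\n]*?)[ \t]+\{$"  # header line ending in '{'
    r".*?"                                       # body, across lines (non-greedy)
    r"^\2\}$",                                   # '}' alone at the same indent
    re.MULTILINE | re.DOTALL,
)


def strip_bodies(source: str) -> str:
    # Group 1 is the header without the brace; group 2 is its indent.
    return PATTERN.sub(r"\1;", source)


sample = (
    "class A {\n"
    "  public:\n"
    "    int getNumber() const {\n"
    "        return number;\n"
    "    }\n"
    "    void setNumber(int num) {\n"
    "        number = num;\n"
    "    }\n"
    "};\n"
)
print(strip_bodies(sample))
```

The backreference to the indent is what lets the pattern skip closing braces that are indented more deeply than the method header. It is also exactly what breaks as soon as the indentation is inconsistent.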
I would play with ctags to build the list of functions declared and defined in your header file. The list would contain the line of the inline definition, and not the regex to find the line.
Then, in reverse order, go to the line of each function definition, search for its opening bracket, and apply a d%. It should work fine as long as there aren't any unbalanced brackets in the comments of the function.
BTW, in order not to lose the definition, I have a :MOVEIMPL command that moves an inline function definition to the related .cpp.
Related
Assume I have an occurrence
GET.CUSTOMER:
and an occurrence
GET.ACCOUNT:
How should the regex be formulated if I want the above occurrence to be matched if and only if there is no occurrence of the word
RETURN
Between
GET.CUSTOMER:
BLOCK OF CODE
and
GET.ACCOUNT:
ANOTHER BLOCK OF CODE
For this to be generic, assume that an anchored colon is only allowed in function name, so there can be no colons "stuck" to a word other than the function's name. I.e
RANDOM.FUNCTION:
Is allowed, but
RANDOM.LINE.OF.CODE : MORE.CODE
Is not allowed, except in a string within quotes and apostrophes.
This matching will be used in a Vim syntax file, and not in actual code.
#EDIT
The question: Is the above even possible? Which regex expressions should I look into that might help me solve this?
The following will match GET.CUSTOMER: if it is followed by GET.ACCOUNT: with no RETURN in between the two. You might need to tweak this a bit; I've left out keyword boundary assertions and other fluff here. Also, as this is a multi-line match, it might be slow or break down if there are too many lines in between.
syntax match getCustomerBlockWithoutReturn
\ "\%#=1\%(GET\.CUSTOMER:\_.\{-}\%(RETURN\|GET\.ACCOUNT:\)\)\@>\%(GET\.ACCOUNT:\)\@<="
\ contains=getCustomer
syntax match getCustomer "GET\.CUSTOMER:" contained
hi link getCustomer Statement
The first getCustomerBlockWithoutReturn matches the whole block. getCustomer is contained in the former (the contained prevents matching outside of it) and performs the highlighting via the :highlight group. This is because you only want to highlight the word that starts the block, not the whole block itself.
The main challenge with this regular expression is that usually, backtracking tries really hard to find a match, and it would skip over GET.ACCOUNT: ... GET.CUSTOMER: parts just to find a RETURN and make the match, even if that spans multiple actual blocks.
By using the (obscure) whole pattern multi \@> (:help /\@>), we prevent backtracking and match a minimal area (via \{-}, and including newlines by using \_. instead of .) from GET.CUSTOMER: to either RETURN or GET.ACCOUNT:. A positive lookbehind (via \@<=, see :help /\@<=) then asserts that the match actually ends with GET.ACCOUNT:, i.e. that we have a block without a RETURN in it. (Note: At least in my Vim version 8.1.536, I had to force use of the older regexp engine via \%#=1; I have reported that bug to the Vim developers.)
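The same "stop at whichever marker comes first, then check which one it was" logic can be sketched outside Vim. The Python below is only an illustration under assumptions: it uses a negative lookahead on every character instead of Vim's \@>, it omits keyword boundaries just like the pattern above (so RETURNS would also count as RETURN), and the names BLOCK_WITHOUT_RETURN and customer_blocks_without_return are hypothetical.

```python
import re

BLOCK_WITHOUT_RETURN = re.compile(
    r"GET\.CUSTOMER:"
    r"(?:(?!RETURN|GET\.ACCOUNT:).)*"  # advance until either marker starts
    r"(?=GET\.ACCOUNT:)",              # succeed only if that marker is GET.ACCOUNT:
    re.DOTALL,                         # let '.' cross line breaks
)


def customer_blocks_without_return(text: str) -> list:
    """Return the start offsets of GET.CUSTOMER: blocks that contain no RETURN."""
    return [m.start() for m in BLOCK_WITHOUT_RETURN.finditer(text)]


text = (
    "GET.CUSTOMER:\n"
    "RETURN\n"
    "GET.ACCOUNT:\n"
    "X = 1\n"
    "GET.CUSTOMER:\n"
    "Y = 2\n"
    "GET.ACCOUNT:\n"
)
print(customer_blocks_without_return(text))
```

Only the second GET.CUSTOMER: is reported, because the scan from the first one hits RETURN before it reaches GET.ACCOUNT: and the repetition cannot be extended past it.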
I have a latex file in which I want to get rid of the last \\ before a \end{quoting}.
The section of the file I'm working on looks similar to this:
\myverse{some text \\
some more text \\}%
%
\myverse{again some text \\
this is my last line \\}%
\footnote{possibly some footnotes here}%
%
\end{quoting}
over several hundred lines, covering maybe 50 quoting environments.
I tried with :%s/\\\\}%\(\_.\{-}\)\\end{quoting}/}%\1\\end{quoting}/gc but unfortunately the non-greedy quantifier \{-} is still too greedy.
It matches from the second line of my example all the way to the end of the quoting environment; I guess the greedy quantifier would match up to the last \end{quoting} in the file. Is there any possibility of doing this with search and replace, or should I write a macro for this?
EDIT: my expected output would look something like this:
this is my last line }%
\footnote{possibly some footnotes here}%
%
\end{quoting}
(I should add that I've by now solved the task by writing a small macro, still I'm curious if it could also be done by search and replace.)
I think you're trying to match from the last occurrence of \\}% prior to \end{quoting}, up to the \end{quoting}, in which case you don't really want any character (\_.), you want "any character that isn't \\}%" (yes, I know that's not a single character, but that's basically it).
So, simply (ha!) change your pattern to use \%(\%(\\\\}%\)\@!\_.\)\{-} instead of \_.\{-}; this means that the matched stretch cannot contain another \\}% sequence, thus achieving your aims (as far as I can determine them).
This uses a negative zero-width look-ahead \@! to ensure that each "any character" match is only allowed at a position where the text we want to avoid does not start (other than that, anything still matches). See :help /zero-width for more of these.
I.e. your final command would be:
:%s/\\\\}%\(\%(\%(\\\\}%\)\@!\_.\)\{-}\)\\end{quoting}/}%\1\\end{quoting}/g
(I note your "expected" output does not contain the first few lines for some reason, were they just omitted or was the command supposed to remove them?)
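The same technique (a negative lookahead applied before every consumed character) exists in most regex engines. As a rough sketch in Python rather than Vim, with the made-up names LAST_BREAK and drop_last_break and the sample text from the question:

```python
import re

LAST_BREAK = re.compile(
    r"\\\\\}%"                # the literal text \\}%
    r"((?:(?!\\\\\}%).)*?)"   # anything that does not contain another \\}%
    r"\\end\{quoting\}",      # the literal text \end{quoting}
    re.DOTALL,                # let '.' cross line breaks
)


def drop_last_break(tex: str) -> str:
    # A function replacement avoids escape processing of the backslashes.
    return LAST_BREAK.sub(lambda m: "}%" + m.group(1) + r"\end{quoting}", tex)


tex = "\n".join([
    r"\myverse{some text \\",
    r"some more text \\}%",
    r"%",
    r"\myverse{again some text \\",
    r"this is my last line \\}%",
    r"\footnote{possibly some footnotes here}%",
    r"%",
    r"\end{quoting}",
])
print(drop_last_break(tex))
```

Starting from the first \\}% the match attempt fails, because the guarded repetition refuses to step over the second \\}%, so only the last one before \end{quoting} is rewritten.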
You’re on the right track using the non-greedy multi. The Vim help files state that,
"{-}" is the same as "*" but uses the shortest match first algorithm.
However, the very next line warns of the issue that you have encountered.
BUT: A match that starts earlier is preferred over a shorter match: "a{-}b" matches "aaab" in "xaaab".
To the best of my knowledge, your best solution would be to use the macro.
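The leftmost-match rule quoted above is not specific to Vim. As a small illustration (Python's *? plays the role of Vim's \{-} here), the help file's own example behaves the same way:

```python
import re

# Non-greedy 'a*?' still yields the match that starts earliest:
# the engine tries position 0 ('x'), fails, then succeeds from position 1.
m = re.search(r"a*?b", "xaaab")
print(m.group(), m.start())
```

The shorter candidate "b" is never considered, because a match starting at the first "a" is found before the engine ever tries the later positions.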
In my code I want to remove a block of code that starts with a bracket and ends with a bracket. For example if I have this line
ENDPROGRAM { fprintf(sdfsdfsdfsd....) }
and after running the regex I want it to just end up with
ENDPROGRAM
I want to only delete code inside the bracket and the brackets themselves. I tried this command
:%s/\{[a-zA-Z0-0]*\}//g
but it says that pattern not found. Any suggestion?
ENDPROGRAM is just an example, I have like DIV MULT etc etc
Since you're using Vim, an alternative is to record a keyboard macro for this into a register, say register z.
Start recording with qz.
Search forward for ENDPROGRAM: /ENDPROGRAM[enter]
Scan forward for opening brace: f{
Delete to matching brace: d%
Finish recording q.
Now run the macro with @z, and then repeat with @@. Hold down your @ key to repeat rapidly.
For one-off jobs not involving tens of thousands of changes in numerous files, this kind of interactive approach works well. You visually confirm that the right thing is done in every place. The thing is that even if you fully automate it with regexes, you will still have to look at every change to confirm that the right thing was done before committing the code.
The first mistake in your regex is that the material between braces must only be letters and digits. (I'm assuming the 0-0 is a typo for 0-9). Note that you have other things between the braces such as spaces and parentheses. You want {.*}: an open brace, followed by zero or more characters, followed by a closing brace. If it so happens that you have variants, like ENDPROGRAM { abc } { def }, this regex will eat them too. The regex matches from the first open brace to the last closing one. Note also that the regex {[^}]*} will not work if the block contains nested interior braces; it stops at the first closing brace, not the last one, and so ENDPROGRAM { { x } } will turn to ENDPROGRAM }.
The second mistake is that you are running this on all lines using the % address. You only want to run this on lines that contain ENDPROGRAM, in other words:
:g/ENDPROGRAM/s/ {.*}//
"For all lines that contain a match for ENDPROGRAM, find a space followed by some bracketed text, and replace it with nothing." Or else:
:%s/ENDPROGRAM {.*}/ENDPROGRAM/
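The difference between the greedy {.*} and the negated class {[^}]*} is easy to check outside Vim. The following Python sketch is only an illustration: braces are escaped because Python's regex syntax differs from Vim's, and strip_blocks is a hypothetical helper that mimics the :g/ENDPROGRAM/ filter.

```python
import re

# Greedy: runs from the first '{' to the last '}' on the line.
greedy = re.sub(r" \{.*\}", "", "ENDPROGRAM { abc } { def }")

# Negated class: stops at the first '}', so nested braces leave debris.
negated = re.sub(r" \{[^}]*\}", "", "ENDPROGRAM { { x } }")


def strip_blocks(lines, keyword="ENDPROGRAM"):
    """Apply the greedy substitution only to lines containing the keyword."""
    return [
        re.sub(r" \{.*\}", "", line) if keyword in line else line
        for line in lines
    ]


print(repr(greedy))
print(repr(negated))
print(strip_blocks(["ENDPROGRAM { fprintf(x) }", "other { keep }"]))
```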
THIS looks like a job for: (dum da dum duuuuum!)
TEXT OBJECTS!
Place the cursor anywhere within the braces. Type daB.
WOOOOOOOAAAH what just happened?!
aB is something called a "text object" in Vim. You could also have typed da{ or da} in this situation. A text object is a thing that Vim's operators can act on. d is one such operator. I'm sure you know others: c, y, etc.
Visual mode also works on text objects. Instead of daB, try vaB. This will select all the text in the braces, plus the braces themselves. Most text objects also have an "inner" variant, for example ciB would delete everything inside the braces, and enter insert mode, leaving the braces intact.
There are text objects to work with HTML/XML tags, objects for working with quoted strings, objects for sentences and paragraphs, objects for words and WORDS, and more. See the full list at :help text-objects.
When something is broken, start simple and work up to what you need. Do not worry about the :s command at first; instead, focus on getting the pattern (or regular expression) right.
Search for \{ and you will get an error message. Oops, that should be just {.
Add the character class: {[a-zA-Z0-0]*. Darn, that is not right, because you left out the space.
Next try: {[a-zA-Z0-0 ]*. Now we are getting somewhere, but we also want to match the parentheses and the dots: {[a-zA-Z0-0 ().]*.
Add the closing brace, realize that you really meant 0-9 instead of 0-0, and you are done: {[a-zA-Z0-9 ().]*}.
At this point, you can take advantage of the fact that :s uses the current search pattern by default, so all you need is :%s///.
I need some regular expression help.
Using the Firefox firebug extension, "css-usage", I was able to export a new css file where the utility prepended "UNUSED" to every class that was not referenced in the page.
I would like to now remove every style that contains the UNUSED styles, however there are some complexities with that. Namely, some tags are comma separated with other tags/selectors which may still be used so I don't want to delete any lines that have a comma in it. And secondly some styles are specified in a long multi-line block in curly braces, so I don't want to delete any lines that do not have a closing curly brace '}'.
I'm using a Mac, so any solution with sed, awk, or vi is acceptable. I would like to delete every line in a CSS file that starts with "UNUSED", contains no commas, and has a closing '}' curly brace.
To match UNUSED without commas and ensure that it ends with }:
UNUSED [^,]*}
To test if a string contains a character without capturing that character or moving the pointer forward, use a lookahead.
(?=.*[,])
Or, to make sure the string does not contain that character, use a negative lookahead.
(?!.*[,])
If you want to make sure that a string does not contain a character but does contain other characters, you can combine negative and positive lookaheads.
(?!.*[,])(?=.*[}])(?=.*UNUSED)
Finally, to actually select the entire string where this match occurs, use .* after the lookaheads.
^(?!.*[,])(?=.*[}])(?=.*UNUSED).*$
I am mostly familiar with .NET, but I believe many regex engines have options that allow you to specify whether ^ and $ will match the beginning/ending of the entire input string or the beginning/ending of a line. You'd want the second option.
Finally, you could use a regex replace to replace lines that match the given expression with an empty string (thus "deleting" those lines).
Use regex pattern
^UNUSED[^,]*}[^,]*$
with multi-line modifier.
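As a sketch of how that last pattern could be applied, here it is in Python, which ships with macOS developer tools just like sed and awk. Two details are my own assumptions rather than part of the answer: \n is added to the negated classes because [^,] would otherwise match across line breaks, and a trailing \n? removes the line ending together with the line. The name drop_unused_rules is hypothetical.

```python
import re

UNUSED_LINE = re.compile(r"^UNUSED[^,\n]*\}[^,\n]*$\n?", re.MULTILINE)


def drop_unused_rules(css: str) -> str:
    """Delete single-line UNUSED rules that have no comma and a closing brace."""
    return UNUSED_LINE.sub("", css)


css = (
    "UNUSED.foo { color: red; }\n"
    ".bar, UNUSED.baz { margin: 0; }\n"
    "UNUSED.a, .b { padding: 0; }\n"
    "UNUSED.multi {\n"
    "  color: blue;\n"
    "}\n"
    ".keep { display: none; }\n"
)
print(drop_unused_rules(css))
```

Only the first line is removed; the comma-separated selectors and the multi-line block are left for manual review, as the question requires.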
I would like to match everything between and including curly braces in Eclipse find and replace (it can be assumed that there are no inner curly braces, but any other character may appear, including all types of whitespace).
int SomeMethodName() {
// TODO Auto-generated method stub
return asdfasdf.rearoiula12123893;
}
Right now I am trying this and it only matches curly braces with nothing in them \{[.\s]*\}
The . inside a character class means a . literal, not a wildcard. You need something more like:
\{.*?\}
Depending on how Eclipse treats new line characters you might need to change it to:
\{(.|\r\n?|\n)*?\}
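Both points can be checked quickly outside Eclipse. The sketch below uses Python's re module as a stand-in, which is an assumption on my part since Eclipse uses Java's regex engine; in Python the DOTALL flag plays the role of the explicit newline alternation.

```python
import re

src = (
    "int SomeMethodName() {\n"
    "    // TODO Auto-generated method stub\n"
    "    return asdfasdf.rearoiula12123893;\n"
    "}\n"
)

# Inside a character class '.' is a literal dot, so this only matches
# braces that contain nothing but dots and whitespace.
broken = re.search(r"\{[.\s]*\}", src)

# Non-greedy wildcard with DOTALL: matches up to the nearest closing brace.
fixed = re.sub(r"\{.*?\}", "{}", src, flags=re.DOTALL)

print(broken)
print(fixed)
```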
This should work. Tested using Regex Powertoy here.
\{[\s\W\w]*\}
EDIT:
\{[\s\w\. /=(":);]*\} should stop at the nearest closing brace. The piece after the space lists the miscellaneous non-word characters, so you might have to add to it depending on the nature of what you're parsing (e.g. an unusual String).