Notepad++ regular expression: remove all text between curly brackets

function get_last_word($sentance){
$wordArr = explode(' ', $sentance);
$last_word = trim($wordArr[count($wordArr) - 1]);
runDebug( __FILE__, __FUNCTION__, __LINE__, "Sentance: $sentance. Last word:$last_word",4);
return $last_word;
}
I want to remove all text between the curly brackets.
The result should be:
function get_last_word($sentance){}
I have tried
{+.*}
but it only works when both curly brackets are on the same line.

Newer versions of Notepad++ support multi-line matching (I am now using 6.1.3).
In the Find/Replace dialog, next to the "Regular expression" radio button, there is a checkbox called ". matches newline", which lets . match across line breaks.
Then use \{.*?\} (a non-greedy match) to achieve what you want.
Beware that it does not balance nested braces for you. For example,
foo {
bar {
blabalbla
}
xxx {
yyy
}
}
will give you, after the first replacement,
foo {}
xxx {
yyy
}
}
(I believe there are other questions on SO about brace matching in regex that you may want to look at, though I wonder whether they will work in Notepad++.)
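The same behaviour can be reproduced outside Notepad++. The following is a minimal sketch in Python, assuming that re.DOTALL stands in for the ". matches newline" checkbox; the function name empty_braces_lazy is made up for illustration.

```python
import re

def empty_braces_lazy(text):
    """Replace every lazy {...} match with an empty pair of braces."""
    # re.DOTALL lets "." cross line breaks, like ". matches newline".
    return re.sub(r"\{.*?\}", "{}", text, flags=re.DOTALL)

if __name__ == "__main__":
    nested = "foo {\nbar {\nblabalbla\n}\nxxx {\nyyy\n}\n}"
    # The lazy match stops at the first closing brace, so the nested
    # sample is not collapsed into a single "foo {}".
    print(empty_braces_lazy(nested))
```

When every match is replaced, a flat block such as the function in the question collapses cleanly, while the nested sample ends up with an unmatched closing brace left over.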

You should be fine when you just replace \{[^{}]+\} with {}, repeatedly...
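For nested blocks, an alternative to regex replacement is to track brace depth directly. This is a different technique from the regex answers here, sketched in Python under the assumption that braces inside strings or comments do not need special handling; the function name is hypothetical.

```python
def empty_top_level_braces(text):
    """Drop everything inside top-level {...} blocks, keeping the braces."""
    out = []
    depth = 0
    for ch in text:
        if ch == "{":
            if depth == 0:
                out.append(ch)      # keep only the outermost opening brace
            depth += 1
        elif ch == "}":
            depth = max(depth - 1, 0)
            if depth == 0:
                out.append(ch)      # keep only the outermost closing brace
        elif depth == 0:
            out.append(ch)          # text outside any block is kept as-is
    return "".join(out)

if __name__ == "__main__":
    print(empty_top_level_braces("foo {\nbar {\nblabalbla\n}\nxxx {\nyyy\n}\n}"))
```

Because the depth counter only emits characters at depth zero, inner blocks disappear together with the rest of the body.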

Try
(?<=\{)[^}]+(?=\})
This will match anything that falls between { and }.

Related

What is a regex to change indentation of any lines that come between two unique strings, to match the indentation of the first and drop the second?

How can I write a regex to find xBxCxDExDExDE...xF with any number of xDE's up to a terminal character F, and replace it with xBxExExExE... where:
all occurrences of D get removed
E is .*\n
all E's are actually different strings that should still be there after the replace
x is .* (whatever the first line starts with before B)
all x's are actually the same exact string
Starting string:
        some text; // context
        some more text; // context
        strongify(self);
        if (self) {
            // Handle error
            if (error) {
                AlertWithError(error);
            } else {
                if (uploadingImage) {
                    [self logImageUploadWithInfo:uploadInfo startTime:startTime];
                }
                // Call optional protocol method on completion
                [(id<Thing>)controller doSomething];
            }
            // Signal
            [self done];
        }
        even more text; // context
        yadayadayada; // context
... where x is eight spaces {8}, B is strongify(self);\n, C is if (self) {\n, D is four spaces {4}, F is }, and E is everything after xD on each of the lines after C. Lines that should not be "found" by the regex pattern are marked with // context.
The regex should let me do a search and replace where the substitution is like $1$2$3, such that the result would be:
        some text; // context
        some more text; // context
        strongify(self);
        // Handle error
        if (error) {
            AlertWithError(error);
        } else {
            if (uploadingImage) {
                [self logImageUploadWithInfo:uploadInfo startTime:startTime];
            }
            // Call optional protocol method on completion
            [(id<Thing>)controller doSomething];
        }
        // Signal
        [self done];
        even more text; // context
        yadayadayada; // context
As long as it does what I need, I will accept any answer, even if it uses a totally different substitution string or a different (but 100% equivalent) conceptualization of the problem than xBxCxDExDExDE...xF -> xBxExExExE...
What I tried so far:
( *)(strongify\(self\);\n)(?: .*if ?\(self\) ?\{\n)((?: )(\1.*?\n)*)(?:\1\})
... but the problem with this is that (?: ) gets captured even though it's a non-capturing group, so $1$2$3 results in the lines after the first captured line all being indented one level too far.
This seems like a bug in regex. Why is it capturing a non-capturing group? Bonus points if you can explain why that makes sense or why it's good for enclosing capture groups to convert non-capturing groups into capturing groups.
I also tried not wrapping (?: )(\1.*?\n)* in a capture group but then we run into the problem that $3 now only gives [self done]; because capture groups aren't cumulative.
So it seems I'm looking for some kind of way to have a cumulative capture group.
Conceptually I would like to solve this by spawning a child regex that gets fed just the lines that need to have their indentation reduced, since in that case it becomes a simple multi-line regex.
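On the non-capturing group: (?: ) adds no numbered group of its own, but whatever it matches is still part of the text of any capturing group that encloses it, so the enclosing $3 includes those spaces. One way to get the "child regex" behaviour described above is a replacement callback. The sketch below uses Python's re module rather than the asker's tool; the pattern, the name unwrap_strongify and the four-space indent step are assumptions for illustration.

```python
import re

# Group 1: base indent, group 2: the strongify line, group 3: the body lines.
PATTERN = re.compile(
    r"^( *)(strongify\(self\);\n)"   # indent and the strongify line
    r"\1if ?\(self\) ?\{\n"          # the `if (self) {` line at the same indent
    r"((?:\1 {4}.*\n)*)"             # body lines, at least one level deeper
    r"\1\}\n",                       # closing brace back at the base indent
    re.MULTILINE,
)

def _dedent_body(match):
    indent, strongify_line, body = match.group(1), match.group(2), match.group(3)
    # The "child regex": strip one indent level from the start of each body line.
    dedented = re.sub(r"^ {4}", "", body, flags=re.MULTILINE)
    return indent + strongify_line + dedented

def unwrap_strongify(source):
    """Remove the `if (self) { ... }` wrapper and dedent its body by one level."""
    return PATTERN.sub(_dedent_body, source)
```

The outer pattern only has to locate the block; the per-line rewrite happens in the callback, so no cumulative capture group is needed.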

Regex, ignore matches that might occur inside a string

Say for example I have the test string:
this is text "This is a quote { containing } some characters" blah blah { inside }
I would like to match every pair of curly brackets and the text in between using the expression
\{[^{]*?\}
but ignore any matches that might occur inside of a string, namely the { containing } portion of the string, or even be able to match only { text } of the following test string
more text "text text { { { } " { text } words
Well, this works:
{[^}]*}(?=(?:[^"]*"[^"]*")*[^"]*$)
But I'm not sure that it's bulletproof. Breaking it down:
{[^}]*} gets the curly-brace content.
(?=(?:[^"]*"[^"]*")*[^"]*$) ensures that it's followed by an even number of ".
Note: This regex doesn't take escaped double quotes into account.
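The lookahead idea can be tried against both test strings from the question. This is a small sketch in Python, with a made-up function name; the pattern keeps the trailing [^"]* outside the repeated group so that unquoted text after the match is allowed, and, as noted above, it assumes there are no escaped quotes.

```python
import re

# A brace pair is accepted only if an even number of double quotes
# remains between it and the end of the string.
BRACES_OUTSIDE_QUOTES = re.compile(r'\{[^}]*\}(?=(?:[^"]*"[^"]*")*[^"]*$)')

def braces_outside_quotes(text):
    """Return every {...} group that is not inside a double-quoted string."""
    return BRACES_OUTSIDE_QUOTES.findall(text)

if __name__ == "__main__":
    print(braces_outside_quotes(
        'this is text "This is a quote { containing } some characters" blah blah { inside }'
    ))
    print(braces_outside_quotes('more text "text text { { { } " { text } words'))
```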

How to QRegExp "[propertyID="anything"] "?

I am parsing a file which contains following packets:
[propertyID="123000"] {
fillColor : #f3f1ed;
minSize : 5;
lineWidth : 3;
}
To scan just this [propertyID="123000"] fragment I have this QRegExp:
QRegExp("^\b\[propertyID=\"c+\"\]\b");
but that does not work. Here is example code that parses the file above:
QRegExp propertyIDExp = QRegExp("\\[propertyID=\".*\"]");
propertyIDExp.setMinimal(true);
QFile inputFile(fileName);
if (inputFile.open(QIODevice::ReadOnly))
{
    QTextStream in(&inputFile);
    while (!in.atEnd())
    {
        QString line = in.readLine();
        // the check below does not match when the line is, for instance,
        // [propertyID="123000"] {
        if (line.contains(propertyIDExp))
        {
            //.. further processing
        }
    }
    inputFile.close();
}
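QRegExp rx("\\[propertyID=\".+\"\\]");
rx.setMinimal(true);
You can use the dot (.), which matches any character (in QRegExp this includes newlines). Make the match non-greedy, or it will run to the last " on the same line. QRegExp does not support +? on individual quantifiers, so setMinimal(true) is used here; the +? form works in QRegularExpression.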
Use the following expression:
QRegExp("\\[propertyID=\"\\d+\"]");
In a Qt regex written as a C++ string literal, you need to escape regex special characters with double backslashes, and to match digits you can use the shorthand class \d. Also, the \b word boundary prevented your regex from matching, since there is no word boundary between the start of the string and [, nor between ] and a space (you could use \B instead).
To match anything in between quotes, use a negated character class:
QRegExp("\\[propertyID=\"[^\"]*\"]");
As an alternative, you can use lazy dot matching with the help of .* and QRegExp::setMinimal():
QRegExp rx("\\[propertyID=\".*\"]");
rx.setMinimal(true);
In Qt, . matches any character including a newline, so please be careful with this option.
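The negated character class approach is easy to check outside Qt. The sketch below uses Python's re module, not QRegExp, so the escaping differs from the C++ string literals above; the function name property_id is an assumption for illustration.

```python
import re

# [^"]* cannot run past the closing quote, so no lazy matching is needed.
PROPERTY_ID = re.compile(r'\[propertyID="([^"]*)"\]')

def property_id(line):
    """Return the quoted propertyID value from a line, or None if absent."""
    match = PROPERTY_ID.search(line)
    return match.group(1) if match else None

if __name__ == "__main__":
    print(property_id('[propertyID="123000"] {'))
    # A leading \b fails here: there is no word boundary between the
    # start of the string and "[".
    print(re.search(r'^\b\[propertyID', '[propertyID="123000"] {'))
```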

Regular expression for highlighting words in quotes in Qt5

I use the QHighlighter class and a regular expression to highlight words in quotes:
void Highlighter::highlightBlock(const QString &text)
{
    QRegExp expr("\"(.*?)\"");
    int index = expr.indexIn(text);
    while (index >= 0)
    {
        int length = expr.matchedLength();
        setFormat(index, length, Qt::red);
        index = expr.indexIn(text, index + length);
    }
}
It doesn't work. This one does:
"\".*\""
But it highlights more than necessary. Which regular expression is correct?
Just highlight everything between quotes:
QRegExp("\"([^\"]*)\"");
To highlight single words (run in a loop with an offset to match each word):
QRegExp("\"(\\w)*\"");
How to match words in quotes (single or double):
('|")[^\1]*?\1
Example:
http://regex101.com/r/iF5aA1
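The difference between the greedy pattern and the negated character class can be seen by listing the (start, length) pairs a highlighter would format. This is a sketch in Python rather than Qt, with a hypothetical function name; it only handles double quotes.

```python
import re

QUOTED = re.compile(r'"[^"]*"')   # one quoted run, never crossing a quote

def quoted_spans(text):
    """Return (start, length) for each double-quoted run in the text."""
    return [(m.start(), m.end() - m.start()) for m in QUOTED.finditer(text)]

if __name__ == "__main__":
    sample = 'say "hi" and "bye" now'
    print(quoted_spans(sample))
    # The greedy form runs from the first quote to the last one.
    print(re.search(r'".*"', sample).span())
```

Each pair corresponds to the index and length arguments that the loop in the question passes to setFormat.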

Why isn't my Perl script finding bad indentation with my regex match?

My work's coding standard uses this bracket indentation:
some declaration
	{
	stuff = other stuff;
	};
control structure, function, etc()
	{
	more stuff;
	for(some amount of time)
		{
		do something;
		}
	more and more stuff;
	}
I'm writing a perl script to detect incorrect indentation. Here's what I have in the body of a while(<some-file-handle>):
# $prev holds the previous line in the file
# $current holds the current in the file
if($prev =~ /^(\t*)[^;]+$/ and $current =~ /^(?<=!$1\t)[\{\}].+$/) {
print "$file # line ${.}: Bracket indentation incorrect\n";
}
Here, I'm trying to match:
$prev: A line not ended with a semi-colon, followed by...
$current: A line not having the number of leading tabs+1 of the previous line.
This doesn't seem to match anything, at the moment.
The $prev regex needs some modification.
It should be something like \t*, then .+, then not ending in a semicolon.
Also, $current should be something like:
anything ending in ; or { or }, not having the number of leading tabs+1 of the previous line.
EDIT
The Perl code to try the $prev regex:
#!/usr/bin/perl -l
open(FP,"example.cpp");
while(<FP>)
{
    if($_ =~ /^(\t*)[^;]+$/) {
        print "got the line: $_";
    }
}
close(FP);
//example.cpp
for(int i = 0;i<10;i++)
{
//not this;
//but this
}
//output
got the line: {
got the line: //but this
got the line: }
It did not detect the line with the for loop...
Am I missing something?
I see a couple of problems:
Your $prev regex matches only lines which do not have a ; anywhere, which breaks on lines like for (int x = 1; x < 10; x++).
If the indent of the opening { is incorrect, you will not detect that.
Try this instead; it only cares whether the line ends in ; or { (optionally followed by whitespace):
/^(\s*).*[^{;\s]\s*$/
Now you should change your strategy, so that if you see a line which does not end in { or ; you increment the indent counter.
If you see a line which ends in }; or } you decrement your indent counter.
Compare all lines against this:
/^\t{$counter}[^\s]/
So...
my $counter = 0;  # set once, before the loop
if ($curr !~ /^\t{$counter}[^\s]/) {
    # error detected
}
if ($curr =~ /\}\s*;?\s*$/) {
    $counter--;   # line ends in } or };
} elsif ($curr =~ /^(\s*).*[^{;\s]\s*$/) {
    $counter++;   # line ends in neither { nor ;
}
Sorry for not styling my code according to your standards... :)
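The counter strategy described above can be sketched end to end. The version below is in Python rather than Perl and makes several assumptions that are not in the answer: blank lines are skipped, the counter never drops below zero, and the function name is made up. It inherits the limitation that braces inside strings or comments are not recognised.

```python
import re

CLOSES_BLOCK = re.compile(r"\}\s*;?\s*$")   # line ends in } or };
OPENS_BLOCK = re.compile(r"[^{;\s]\s*$")    # line ends in neither { nor ;

def find_bad_indentation(lines):
    """Return 1-based numbers of lines whose leading tabs differ from the counter."""
    depth = 0
    bad = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue                        # assumption: ignore blank lines
        # Exactly `depth` tabs, then a non-whitespace character.
        if not re.match(r"\t{%d}\S" % depth, line):
            bad.append(number)
        if CLOSES_BLOCK.search(line):
            depth = max(depth - 1, 0)
        elif OPENS_BLOCK.search(line):
            depth += 1
    return bad

if __name__ == "__main__":
    sample = ["for(some amount of time)", "{", "\tdo something;", "\t}", "more stuff;"]
    print(find_bad_indentation(sample))
```

In the sample, the opening brace sits at the same level as the for line, so it is the line that gets reported.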
And you intend to only count tabs (not spaces) for indentation?
Writing this kind of checker is complicated. Just think about all the possible constructs that use braces that should not change indentation:
s{some}{thing}g
qw{ a b c }
grep { defined } @a
print "This is just a { provided to confuse";
print <<END;
This {
$is = not $code
}
END
But anyway, if the issues above aren't important to you, consider whether the semicolon is important at all in your regex. After all, writing
while($ok)
{
    sort { some_op($_) }
    grep { check($_) }
    my_func(
        map { $_->[0] } @list
    );
}
should be possible.
Have you considered looking at Perltidy?
Perltidy is a Perl script that reformats Perl code into set standards. Granted, what you have isn't part of the Perl standard, but you can probably tweak the curly braces via the configuration file Perltidy uses. If all else fails, you can hack through the code. After all, Perltidy is just a Perl script.
I haven't really used it, but it might be worth looking into. Your problem is trying to locate all the various edge cases and making sure you're handling them correctly. You can parse 100 programs only to find that the 101st reveals problems in your formatter. Perltidy has been used by thousands of people on millions of lines of code. If there is an issue, it has probably already been found.