Regex for matching every line enclosed in curly brackets - regex

I'm trying to match every single line within curly brackets, and I'm struggling to capture what I want. To give an example, if I have this text:
{
this is a line,
this = another line,
this is the third line!
this is, indeed, another line
},
round two: {
we're now on the second pair of brackets,
and this is the final line.
}
Then I want to match and capture a total of six lines:
this is a line,
this = another line,
this is the third line!
this is, indeed, another line
we're now on the second pair of brackets,
and this is the final line.
So far my current idea is trying to match "curly bracket" -> "anything" -> "line" -> "anything" -> "curly bracket", i.e. something like this:
{(?s)[^}]*(^([^}^\n]+)$)(?s)[^}]*}
But that only matches one line per pair of curly brackets, rather than every line.
How would I go about doing this? Thanks.
EDIT: Updated the example to include preceding text before one of the opening curly braces and varying whitespace.

Just match lines that don't contain a brace:
^[^{}\r\n]+$
The multiline flag must be set (/m). Alternatively, insert (?m) at the beginning of the regex.
The regex reads, "match the beginning of the line followed by one or more characters other than {, }, \r and \n, followed by the end of the line".
To exclude leading spaces in each matched line you can modify the regex slightly:
^\s*\K[^{}\r\n]+$
\K resets the starting point of the match, excluding any previously-consumed characters. \K is not available with all regex engines.
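For anyone trying this outside a regex tester, here is a minimal Python sketch of the same pattern. The sample string and the variable names are my own. Python's re module has no \K, so a capture group stands in for it, and I use [ \t]* rather than \s* (an assumption on my part) so the leading-whitespace part cannot run across line breaks. Note that the pattern keeps every brace-free line, so it relies on there being no such lines outside the brackets.

```python
import re

SAMPLE = """{
  this is a line,
this = another line,
    this is the third line!
this is, indeed, another line
},
round two: {
we're now on the second pair of brackets,
and this is the final line.
}"""

# Whole brace-free lines; re.MULTILINE makes ^ and $ work per line.
BRACE_FREE_LINE = re.compile(r"^[^{}\r\n]+$", re.MULTILINE)

# Stand-in for the \K variant: skip leading spaces/tabs, capture the rest.
TRIMMED_LINE = re.compile(r"^[ \t]*([^{}\r\n]+)$", re.MULTILINE)

lines = BRACE_FREE_LINE.findall(SAMPLE)
trimmed = TRIMMED_LINE.findall(SAMPLE)

for line in trimmed:
    print(line)
```

The lines containing { or } are skipped entirely, including "round two: {", because the character class has to cover the whole line between ^ and $.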

Assuming input is well formed:
([^{\n](?=[^{]+}))+
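A rough Python sketch of how this lookahead pattern behaves, on a shortened sample of my own. Because the capture group repeats, findall would only return the last character of each match, so the sketch reads group(0) instead.

```python
import re

SAMPLE = "{\nfirst line,\nsecond line\n},\nround two: {\nthird line.\n}"

# Each character must not be { or a newline, and must be followed
# by a } that appears before the next {.
INSIDE_BRACES = re.compile(r"([^{\n](?=[^{]+}))+")

# group(0) is the whole matched line; the capture group itself only
# holds the last character of the repetition.
inside = [m.group(0) for m in INSIDE_BRACES.finditer(SAMPLE)]
print(inside)
```

Text such as "round two: " is rejected because the next brace after it is an opening one, which is where the "well formed" assumption comes in.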

Related

Regex: Replace all occurrences of an attribute from an object/dictionary/json?

I have an input attribute key and I want to remove all its occurrences from a JSON/dictionary/object. Here's an example:
{
"$type":"NewRunner.SingleValueExpression",
"name":"ABC",
"age":23
"nestedJSON": {
"$type":"NewRunner.SingleValueExpression003",
"field3":"edvrvbte"
}
}
I want to remove "$type" attribute from everywhere in the given string and the output should be:
{
"name":"ABC",
"age":23
"nestedJSON": {
"field3":"edvrvbte"
}
}
How can I write a regex for the same? Can someone help me?
Ideally it would be like: string.replace("regexValue",replacement)
I am looking for writing the regex value.
I tried this:
\"\$type\":\".+?(?=abc)\",
and this as well:
\"\$type\":\"(?<=\[)(.*?)(?=\])\",
But I'm confused about what to write in the middle of \".+?(?=abc)\" to match any value.
Try this:
"\$type":[^,{}]*,[\r\n]*|,\s*"\$type":[^{},\r\n]*
"\$type":[^,{}]*,[\r\n]*
"\$type": match the string "$type":.
[^,{}]* matches zero or more characters other than a comma or a curly brace. Excluding the comma is important: it stops the match from running past the end of the value. The curly braces {} are excluded for the same reason, so the match never crosses an object boundary.
,[\r\n]* matches a literal , and zero or more newline characters.
|,\s*"\$type":[^{},\r\n]*
| is the alternation operator; it works like a Boolean OR.
,\s* matches a comma followed by zero or more whitespace characters.
"\$type": is the same as in the first alternative.
[^{},\r\n]* is like the class in the first alternative, but with \r and \n added and no trailing comma. This alternative handles the case where "$type":"NewRunner.SingleValueExpression" is the last member of the object, so no comma follows it. After the last member comes an optional newline and the closing curly brace }, which the match must not cross. \r and \n are excluded so that the newline after the value is kept; this is only cosmetic, but it leaves the closing curly brace on its own line.
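Here is a small Python sketch of the two alternatives at work, using two tiny inputs of my own: one where "$type" comes first and one where it comes last. The function name is hypothetical. Keep in mind the pattern assumes the value contains no commas or braces; for arbitrary JSON, parsing the document and deleting the key would be more robust.

```python
import json
import re

# First alternative: "$type" followed by a comma.
# Second alternative: "$type" as the last member, preceded by a comma.
TYPE_MEMBER = re.compile(r'"\$type":[^,{}]*,[\r\n]*|,\s*"\$type":[^{},\r\n]*')


def strip_type(text: str) -> str:
    """Remove every "$type" member from a JSON-like string."""
    return TYPE_MEMBER.sub("", text)


leading = '{\n"$type":"A",\n"name":"ABC"\n}'
trailing = '{\n"name":"ABC",\n"$type":"A"\n}'

print(strip_type(leading))
print(strip_type(trailing))
print(json.loads(strip_type(trailing)))
```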
You might use
\s*"\$type":[^{},\r\n]*,?
Explanation
\s* Match optional whitespace chars
"\$type": Match "$type":
[^{},\r\n]* Optionally match any char except { } , or a newline
,? Match an optional comma
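As a quick check, this is roughly how the shorter pattern could be applied in Python. The input is the question's sample with one change that is my own assumption: I added the comma after "age":23 that the sample is missing, so the result can be parsed. One caveat: if "$type" is the last member of an object, this pattern leaves the comma before it in place.

```python
import json
import re

TYPE_MEMBER = re.compile(r'\s*"\$type":[^{},\r\n]*,?')

doc = (
    '{\n'
    '"$type":"NewRunner.SingleValueExpression",\n'
    '"name":"ABC",\n'
    '"age":23,\n'
    '"nestedJSON": {\n'
    '"$type":"NewRunner.SingleValueExpression003",\n'
    '"field3":"edvrvbte"\n'
    '}\n'
    '}'
)

# Drop each "$type" member together with the whitespace before it
# and the comma after it.
cleaned = TYPE_MEMBER.sub("", doc)
print(cleaned)

parsed = json.loads(cleaned)
```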

Regex to change all past a certain pattern to Uppercase

I have an xml file that has a value like
JOBNAME="JBDSR14353_Some_other_Descriptor"
I am looking for an expression that will go through the file and change all of the characters in the quotes to Uppercase letters. Is there a Regex expression that will search for JOBNAME="Anything within the quotes" and change them to uppercase? Or a command that will find JOBNAME= and change all on that line to uppercase letters? I know that can just do a search for JOBNAME= and then use a VU command in vim to throw the line to uppercase store that to a macro and run that, but I was wondering if there was a way to get this done with a regex??
Here's an alternative with :substitute, as you had originally intended. This works better than @Zach's solution with gU_ when there's other text in the line:
:%s/JOBNAME="[^"]\+"/\U&/g
"[^"]\+" matches the quoted text (allowing only non-quote characters inside stops the match at the closing quote, which handles multiple quoted values in a line)
\U turns the remainder of the replacement uppercase
for simplicity, the entire match (&) is uppercased here, but one could have also used capture groups (\(...\)), or match limiting with \zs
You can use the :g command which executes a command on lines that match a pattern:
:g/JOBNAME/norm! gU_
This executes gU_, which uppercases the whole line, on every line that matches JOBNAME.
If there are other things on the same line that you don't want to capitalize, here is a solution for only the words in quotes:
:g/JOBNAME/norm! f"gU;
f" goes to the next quote. gU capitalizes with a motion. The motion used is ; which searches for the next " (repeats the last f command).
To do this with substitution you can use the \U atom which makes everything after it uppercase.
:%s/JOBNAME="\zs.*\ze"/\U&
\zs and \ze mark the start and end of the match and & is the whole match. This means that only the part between quotes is replaced. Note that .* is greedy, so if the line contains further quotes, the match extends to the last one.
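If the file has to be processed outside Vim, the same idea can be sketched in Python. This is only an illustration under my own assumptions (the function name and the sample line are made up): a replacement function uppercases the quoted value and leaves the rest of the line alone.

```python
import re

# Group 1: attribute name and opening quote, group 2: the value,
# group 3: the closing quote.
JOBNAME_VALUE = re.compile(r'(JOBNAME=")([^"]+)(")')


def uppercase_jobnames(text: str) -> str:
    """Uppercase only the quoted value of every JOBNAME attribute."""
    return JOBNAME_VALUE.sub(
        lambda m: m.group(1) + m.group(2).upper() + m.group(3), text
    )


line = '<JOB JOBNAME="JBDSR14353_Some_other_Descriptor" owner="someone"/>'
print(uppercase_jobnames(line))
```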

Regular Expression to search and replace in notepad ++

If I have a line of text that I want to remove from a text file in Notepad++, and it is always formatted like this
[text]:
except that the words in the text area change, what regular expression could I use to remove the whole section with the search and replace function in Notepad++?
To delete the entire line starting with [any text]: you can use: ^[\t ]*\[.*?\]:.*?\r\n
Explanation:
^ ... start search at beginning of a line (in this case).
[\t ]* ... find 0 or more tabs or spaces.
\[ ... find the opening square bracket as literal character.
.*? ... find 0 or more characters except the newline characters carriage return and line-feed, non-greedily, which means as few characters as possible to get a positive match, i.e. stop matching at the first occurrence of the following ] in the search expression.
\]: ... find the closing square bracket as literal character and a colon.
.*?\r\n ... find 0 or more characters except the new line characters and finally also the carriage return and line-feed terminating the line.
The search string ^[\t ]*\[.*?\]:.*?$ would also match the complete line, but without the line termination.
The replace string is an empty string for both search strings.
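To see the first search string in action outside Notepad++, here is a minimal Python sketch. The sample text is my own and uses \r\n line endings, since the pattern expects them; re.MULTILINE is needed so that ^ matches at the start of every line.

```python
import re

SECTION_LINE = re.compile(r"^[\t ]*\[.*?\]:.*?\r\n", re.MULTILINE)

text = "keep this\r\n[header]: some text\r\n\t[other]:\r\nkeep this too\r\n"

# Replace each matching line, including its line ending, with nothing.
cleaned = SECTION_LINE.sub("", text)
print(repr(cleaned))
```

For files with plain \n line endings, the trailing \r\n would have to be loosened, for example to \r?\n.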
If by removing the entire section, you mean remove the [text]: up to the next [otherText]:, you can try this:
\[text\]:((?!\[[^\]]*\]:).)*
Remember to set the flag for ". matches newline".
This regex basically first matches your section title. Then, it would start matching right after this title and for each character, it uses a negative lookahead to check if the string following this character looks like a section title. If it does the matching is terminated.
Note: Remember that this regex would replace all occurrences of the matched pattern. In other words, if you have more than one of that section, all of them are replaced.
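The character-by-character lookahead can be tried out with a short Python sketch. The document below is made up for illustration, and re.DOTALL plays the role of the ". matches newline" option.

```python
import re

# After the title, consume one character at a time as long as the
# text at that position does not look like another section title.
SECTION = re.compile(r"\[text\]:((?!\[[^\]]*\]:).)*", re.DOTALL)

doc = "[intro]:\nhello\n[text]:\nremove me\nand me\n[outro]:\nbye\n"

cleaned = SECTION.sub("", doc)
print(cleaned)
```

The match stops right before [outro]:, so the following section is left intact.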

Regex for this dashed pattern

Would anyone have a suggestion for a regex that takes a line that ends in:
,04-721-0G-00033-AU
and transforms that string into:
,04,721,0G,00033,AU
(that is, replaces all dashes after the last comma in the string with commas)
Keep in mind that there could be preceding parts of the string that have dashes and commas, so what I know for sure is that the part of the line I want manipulated starts with the last comma in the line, ends at EOL, and has the structure ,XX-XXX-XX-XXXXX-XX
Any suggestions?
Thanks.
Match: ,(?=[^,]*$)(\w{2})-(\w{3})-(\w{2})-(\w{5})-(\w{2})$
Replace by: ,$1,$2,$3,$4,$5
How it works:
,(?=[^,]*$) selects the last , of the line (literally: the , that is followed only by non-comma characters until the end of the line).
after that, we try to match your XX-XXX-XX-XXXXX-XX with
(\w{2})-(\w{3})-(\w{2})-(\w{5})-(\w{2})
make sure that the end of the line has been reached by matching $
Then you just rewrite:
the ,
each XX group, separated by a , instead of a -.
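Since no language was given, here is one possible rendering in Python, purely as a sketch. The function name and the sample line are my own, and Python writes the group references as \1 to \5 instead of $1 to $5.

```python
import re

LAST_FIELD = re.compile(
    r",(?=[^,]*$)(\w{2})-(\w{3})-(\w{2})-(\w{5})-(\w{2})$"
)


def split_last_field(line: str) -> str:
    """Turn the dashes of the final ,XX-XXX-XX-XXXXX-XX field into commas."""
    return LAST_FIELD.sub(r",\1,\2,\3,\4,\5", line)


print(split_last_field("ab-cd,x-y,04-721-0G-00033-AU"))
```

Dashes in the earlier fields are left alone, and a line whose last field does not have the expected shape is returned unchanged.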
Would this pattern do what you want?
-(?=[^,]{1,15}$)
Replace with ,
At each hyphen, a lookahead checks whether 1 to 15 non-comma characters remain before the end of the line; if so, the hyphen is replaced with a comma.
As no language is specified: for a multiline replace, you might want to add the m modifier, and in JS additionally the g modifier for a global replace.
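A quick way to try the lookahead version on several lines at once, sketched in Python with sample data of my own. re.MULTILINE corresponds to the m modifier, and re.sub already replaces every occurrence, so no g flag is needed. The approach relies on the last field having at most 15 characters after its first hyphen.

```python
import re

# A hyphen qualifies only if nothing but 1-15 non-comma characters
# separates it from the end of its line.
TRAILING_DASH = re.compile(r"-(?=[^,]{1,15}$)", re.MULTILINE)

text = "ab-cd,x-y,04-721-0G-00033-AU\nq-r,11-222-33-44444-55"

converted = TRAILING_DASH.sub(",", text)
print(converted)
```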

How to check if a line is blank using regex

I am trying to make simple regex that will check if a line is blank or not.
Cases:
" some" // not blank
" " //blank
"" // blank
The pattern you want is something like this in multiline mode:
^\s*$
Explanation:
^ is the beginning of string anchor.
$ is the end of string anchor.
\s is the whitespace character class.
* is zero-or-more repetition of.
In multiline mode, ^ and $ also match the beginning and end of the line.
References:
regular-expressions.info/Anchors, Character Classes, and Repetition.
A non-regex alternative:
You can also check if a given string line is "blank" (i.e. containing only whitespace) by trim()-ing it, then checking if the resulting string isEmpty().
In Java, this would be something like this:
if (line.trim().isEmpty()) {
// line is "blank"
}
The regex solution can also be simplified without anchors (because of how matches is defined in Java) as follows:
if (line.matches("\\s*")) {
// line is "blank"
}
API references
String String.trim()
Returns a copy of the string, with leading and trailing whitespace omitted.
boolean String.isEmpty()
Returns true if, and only if, length() is 0.
boolean String.matches(String regex)
Tells whether or not this (entire) string matches the given regular expression.
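For comparison, the same two checks could look like this in Python. This is a sketch with function names of my own choosing: re.fullmatch plays the role of Java's String.matches (the whole string has to match), and str.strip stands in for trim().

```python
import re


def is_blank_regex(line: str) -> bool:
    # fullmatch requires the pattern to cover the entire string,
    # so no ^ or $ anchors are needed.
    return re.fullmatch(r"\s*", line) is not None


def is_blank_strip(line: str) -> bool:
    # Non-regex alternative: a blank line is empty after stripping.
    return line.strip() == ""


for sample in [" some", " ", ""]:
    print(repr(sample), is_blank_regex(sample), is_blank_strip(sample))
```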
Actually in multiline mode a more correct answer is this:
/((\r\n|\n|\r)$)|(^(\r\n|\n|\r))|^\s*$/gm
The accepted answer, ^\s*$, does not match when the last line is blank (in multiline mode).
Try this:
^\s*$
Full credit to bchr02 for this answer. However, I had to modify it a bit to handle lines that have */ (end of comment) followed by an empty line. The regex was matching the non-empty line with */.
New: /(^(\r\n|\n|\r)$)|(^(\r\n|\n|\r))|^\s*$/gm
All I did was add ^ right after the first opening parenthesis, so that the first alternative is anchored to the start of a line.
The most portable regex would be ^[ \t\n]*$ to match an empty string (note that you would need to replace \t and \n with tab and newline accordingly) and [^ \n\t] to match a non-whitespace string.
It depends on what you mean by blank:
a line that contains only whitespace, or a line that contains nothing at all.
If you want to match a line that contains nothing, use '/^$/'.
Somehow none of the answers from here worked for me when I had strings which were filled just with spaces and occasionally strings having no content (just the line terminator), so I used this instead:
if (str.trim().isEmpty()) {
doSomethingWhenWhiteSpace();
}
Well... I tinkered around (using Notepad++) and this is the solution I found:
\n\s
\n for end of line (where you start matching) -- the caret would not be of help in my case as the beginning of the row is a string
\s takes any space till the next string
hope it helps
This regex will delete all empty spaces (blank) and empty lines and empty tabs from file
\n\s*