REGEX NOTEPAD ++ - regex

I have a list in this format
FIRSTTEXT:SECONDTEXT:RANDOMTEXT::::::::RANDOMNUMBERS:NUMBER:
but not all of the text is in this format. I want to keep only FIRSTTEXT:SECONDTEXT;
FIRSTTEXT and SECONDTEXT are in the same position throughout the document.
I have tried this:
Find what: (.+):(.+)
Replace with: \1:\2
However, it doesn't work.

You may use
Find What: ^(?:([^:\s]+:[^:\s]+).*|.*\R*)
Replace With: $1
Details
^ - start of a line
(?: - start of a non-capturing group:
([^:\s]+:[^:\s]+) - Group 1 ($1 refers to this value):
[^:\s]+ - 1+ chars other than whitespace and :
: - a colon
[^:\s]+ - 1+ chars other than whitespace and :
.* - 0+ chars other than any line break char, as many as possible
| - or
.* - 0+ chars other than any line break char, as many as possible
\R* - 0+ line break sequences
) - end of the non-capturing group.
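To try the same replacement outside Notepad++, here is a minimal Python sketch. It is an adaptation rather than the Notepad++ setup itself: Python's re module has no \R, so (?:\r?\n)* stands in for it, and the sample text is made up for illustration.

```python
import re

# Python's re has no \R, so (?:\r?\n)* is used as a stand-in
# (assumption: the input uses \n or \r\n line endings).
PATTERN = re.compile(r"^(?:([^:\s]+:[^:\s]+).*|.*(?:\r?\n)*)", re.MULTILINE)

# Hypothetical sample: two lines in the expected format, one that is not.
sample = (
    "FIRSTTEXT:SECONDTEXT:RANDOMTEXT::::::::12345:6:\n"
    "a line that is not in this format\n"
    "AAA:BBB:CCC::::::::777:8:"
)

# Group 1 is kept for lines in the expected format; other lines are removed
# together with their line break (an unmatched group expands to "" in
# Python 3.5+).
result = PATTERN.sub(r"\1", sample)
print(result)
```

The second alternative consumes the line break of a non-matching line, which is why such lines disappear completely instead of leaving blank lines behind.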

Regex for not only spaces

I am looking for a regex that rejects input consisting only of spaces when there is more than one space. A single blank space is allowed.
I have tried .*\S.* and .*[^ ].*, but I want to allow exactly one space while rejecting input made up of two or more spaces only.
You can use
pattern="\S*(?:\s\S*)?"
The pattern will be parsed as ^(?:\S*(?:\s\S*)?)$ and will match:
^ - start of string
(?: - start of a non-capturing group:
\S* - zero or more chars other than whitespace
(?:\s\S*)? - an optional sequence of a whitespace and zero or more non-whitespace chars
) - end of a non-capturing group
$ - end of string.
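The anchored behavior described above can be checked quickly in Python. This is only a sketch: re.fullmatch is used to imitate the implicit ^(?:...)$ wrapping, and the helper name is_valid is hypothetical.

```python
import re

PATTERN = re.compile(r"\S*(?:\s\S*)?")


def is_valid(value: str) -> bool:
    # fullmatch requires the whole string to match, which mirrors the
    # ^(?:...)$ wrapping described above.
    return PATTERN.fullmatch(value) is not None


for candidate in ["", " ", "ab", "a b", "  ", "a  b"]:
    print(repr(candidate), is_valid(candidate))
```

Note that the pattern allows at most one whitespace character in the whole value, so "a b" passes while "a  b" (two spaces) does not.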

RegEx multiple lines

I have this text:
text1 without brackets
text2 (with brackets)
and I need two groups in every line:
group#1: text1 without brackets
group#2:
group#1: text2
group#2: with brackets
Here is a link for this example: regexr.com
Thanks for help!
You may use
^(.*?)(?:\s*\(([^()]*)\))?$
See the regex demo.
Details
^ - start of string
(.*?) - Group 1: any 0+ chars, as few as possible
(?:\s*\(([^()]*)\))? - an optional sequence of patterns that is tried at least once:
\s* - 0+ whitespaces
\( - a ( char
([^()]*) - Group 2: 0+ chars other than ( and )
\) - a ) char
$ - end of the string.
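As a quick check of this pattern, here is a small Python sketch run against the two lines from the question. It assumes the multiline flag is on, so that ^ and $ match at line boundaries; the variable names are illustrative only.

```python
import re

# re.MULTILINE makes ^ and $ match at the start and end of each line.
PATTERN = re.compile(r"^(.*?)(?:\s*\(([^()]*)\))?$", re.MULTILINE)

sample = "text1 without brackets\ntext2 (with brackets)"

# One (group 1, group 2) tuple per line; group 2 is None when the line
# has no trailing bracketed part.
rows = [m.groups() for m in PATTERN.finditer(sample)]
for row in rows:
    print(row)
```

Because Group 1 is lazy and the optional group consumes the whitespace before the opening bracket, Group 1 for the second line is "text2" without a trailing space.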
Try pattern: ([^(\n]+)(?:\n|\(([^)]+))
Explanation:
([^(\n]+) - first capturing group: match one or more characters other than ( or \n so it will match everything until opening bracket or newline character
(?:...) - used in order to make use of alternation and not create second capturing group
\n|\(([^)]+) - match a newline, or an opening bracket ( followed by one or more characters other than the closing bracket ), which are stored in the second capturing group.
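For comparison, this second pattern can be run in Python in the same way. This is a sketch against the question's two lines only; the variable names are illustrative.

```python
import re

PATTERN = re.compile(r"([^(\n]+)(?:\n|\(([^)]+))")

sample = "text1 without brackets\ntext2 (with brackets)"

# Group 2 is None when the first group is followed by a newline
# instead of an opening bracket.
rows = [m.groups() for m in PATTERN.finditer(sample)]
for row in rows:
    print(row)
```

With this pattern the first group of the second line keeps the space before the bracket ("text2 "), and a final line without brackets needs a trailing newline to match.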
Demo

Right regexp for detect changes in mysql config

I need to catch all redefined variables in my.cnf
In my case, they look like
#basedir = /usr/local/mysql
basedir = /usr
So I need to extract all redefined parameters.
The criterion for a redefined parameter: the file contains both a line that starts with #param and a line that starts with param.
Please advise me on the correct regexp.
You may use
^\h*#\K([_$a-zA-Z0-9]+)(?=\s+=\s.+\R\h*\1\s)
See the regex demo
For the regex to work, use the m multiline modifier and read the file into memory as a single string (you can do it with -0777 options).
Pattern details
^ - start of a line
\h* - 0+ horizontal whitespaces
# - a # char
\K - match reset operator
([_$a-zA-Z0-9]+) - Group 1: any 1 or more ASCII letters, digits, _ and $
(?=\s+=\s.+\R\h*\1\s) - a positive lookahead that requires the match to be immediately followed with:
\s+ - 1+ whitespaces
= - a = char
\s - whitespace
.+ - 1+ chars other than line break chars
\R - a line break sequence
\h* - 0+ horizontal whitespaces
\1 - same value as in Group 1
\s - whitespace.
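The same idea can be sketched in Python, with some substitutions because Python's re module supports none of \K, \h or \R: [ \t] replaces \h, \r?\n replaces \R, and Group 1 is read directly instead of resetting the match with \K. The sample config below extends the question's two lines with hypothetical extra entries.

```python
import re

# Adapted pattern: [ \t] for \h, \r?\n for \R, and no \K (Group 1 is read).
PATTERN = re.compile(
    r"^[ \t]*#([_$a-zA-Z0-9]+)(?=\s+=\s.+\r?\n[ \t]*\1\s)",
    re.MULTILINE,
)

# The whole file is handled as one string; the last two lines are
# hypothetical additions for illustration.
config = (
    "#basedir = /usr/local/mysql\n"
    "basedir = /usr\n"
    "#datadir = /var/lib/mysql\n"
    "port = 3306\n"
)

redefined = PATTERN.findall(config)
print(redefined)
```

As with the original pattern, a parameter is reported only when the redefinition sits on the line directly after the commented-out one.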

Regex : Everything in group except white space

I have this regex right here :
^(#include rem\(\s*(.*)),\s*(.*)\)
That matches this string :
#include rem( padding-top, $alert-padding );
I want the group that captures $alert-padding to ignore the whitespace at the end. I tried:
^(#include rem\(\s*(.*)),\s*(/S)\)
replacing the .* with /S, but it doesn't match.
You can play around with the regex here :
https://regex101.com/r/9rouVU/1/
You may use \S+ to match 1 or more non-whitespace characters:
^(#include rem\(\s*(\S+))\s*,\s*(\S+)\s*\)
See the regex demo
Details:
^ - start of string
(#include rem\(\s*(\S+)) - Group 1 capturing:
#include rem\( - a literal substring #include rem(
\s* - 0+ whitespaces
(\S+) - Group 2 capturing 1+ non-whitespace symbols
\s*,\s* - 0+ whitespaces, , and again 0+ whitespaces
(\S+) - Group 3 capturing 1+ non-whitespace symbols
\s* - 0+ whitespaces
\) - a literal ).
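Here is a short Python sketch that runs this pattern against the string from the question, to show what each group ends up holding. The variable names are illustrative.

```python
import re

PATTERN = re.compile(r"^(#include rem\(\s*(\S+))\s*,\s*(\S+)\s*\)")

line = "#include rem( padding-top, $alert-padding );"

match = PATTERN.search(line)
groups = match.groups() if match else None
print(groups)
```

The first \S+ initially grabs the comma as well and then backtracks by one character so that the literal comma can match; the last group stops at the first whitespace, so the space before the closing bracket is not captured.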
You can make the match in the second group lazy and then match for further optional whitespace:
^(#include rem\(\s*(.*)),\s*(.*?)\s*\)

Find specific segments using regex

I've got a string which I want to split up into specific segments, but I can't match the correct segment of the string because of two occurrences of the same pattern.
My string:
#if(text.text isempty){<customer_comment>#cc{txt_without_comments}cc#</customer_comment>}else{#if(text.answer=='no'){<customer_comment>#{text.text}</customer_comment>}else{<answer>#{text.text}</answer>}endif#}endif#
I need to match: #if(text.text isempty){#cc{txt_without_comments}cc#}else{....}endif#
and not the nested dots in the else-block.
Here is my incomplete regex:
(?<match>(?<open>#if\((?<statement>[^)]*)\)\s*{)(?<ifblock>(.+?)(?:}else{)(?<elseblock>.*))(?<-open>)}endif#)
This regex is too greedy in the ifblock group: it is supposed to stop at the first }else{ pattern.
Edit:
This is the exact result I want to produce:
match: #if(text.text isempty){<customer_comment>#cc{txt_without_comments}cc#</customer_comment>}else{#if(text.answer=='no'){<customer_comment>#{text.text}</customer_comment>}else{<answer>#{text.text}</answer>}endif#}endif#
statement: text.text isempty
ifblock: <customer_comment>#cc{txt_without_comments}cc#</customer_comment>
elseblock: #if(text.answer=='no'){<customer_comment>#{text.text}</customer_comment>}else{<answer>#{text.text}</answer>}endif#
You are not using balancing groups correctly. Balancing groups must be used to push some values into the stack using a capture and removed from the stack with other captures, and then a conditional construct is necessary to check if the group stack is empty, and if it is not, fail the match to enforce backtracking.
So, if the regex is the only way for you to match these strings, use the following:
(?s)(?<match>#if\((?<statement>[^)]*)\)\s*{\s*(?<ifblock>.*?)\s*}\s*else\s*{\s*(?<elseblock>#if\s*\((?:(?!#if\s*\(|\}\s*endif#).|(?<a>)#if\s*\(|(?<-a>)\}\s*endif#)*(?(a)(?!)))\}\s*endif#)
See the regex demo. However, writing a custom parser might turn out a better approach here.
Pattern details:
(?s) - single line mode on (. matches newline)
(?<match> - start of the outer group "match"
#if\( - a literal char sequence #if(
(?<statement>[^)]*) - Group "statement" capturing 0+ chars other than )
\)\s*{\s* - ), 0+ whitespaces, {, 0+ whitespaces
(?<ifblock>.*?) - Group "ifblock" that captures any 0+ chars, as few as possible up to the first...
\s*}\s*else\s*{\s* - 0+ whitespaces, }, 0+ whitespaces, else, 0+ whitespaces, {, 0+ whitespaces
(?<elseblock>#if\s*\((?:(?!#if\s*\(|\}\s*endif#).|(?<a>)#if\s*\(|(?<-a>)\}\s*endif#)*(?(a)(?!))) - Group "elseblock" capturing:
#if\s*\( - #if, 0+ whitespaces, (
(?: - start of the alternation group, that is repeated 0+ times
(?!#if\s*\(|\}\s*endif#).| - any char that does not start the #if, 0+ whitespaces, ( sequence and does not start the }, 0+ whitespaces, endif# sequence, or...
(?<a>)#if\s*\(| - an empty capture is pushed onto the Group "a" stack, then #if, 0+ whitespaces and ( are matched, or...
(?<-a>)\}\s*endif# - a capture is popped from the Group "a" stack, then }, 0+ whitespaces, endif# are matched
)* - end of the alternation group
(?(a)(?!)) - conditional that fails the match if the Group "a" stack is not empty, i.e. if the nested #if and endif# are not balanced
\}\s*endif# - }, 0+ whitespaces, endif#
) - end of the outer "match" group.
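Balancing groups are specific to the .NET regex flavor, so the pattern above cannot be reused in engines such as Python's re. As an illustration of the custom-parser idea mentioned above, here is a minimal Python sketch that tracks nesting depth with a counter. The function name split_if_else and the token patterns are assumptions made for this example, and the sketch does not try to handle the delimiters appearing inside quoted text.

```python
import re

# Tokens that affect nesting: a nested "#if(", "}else{" and "}endif#".
TOKEN = re.compile(r"#if\s*\(|\}\s*else\s*\{|\}\s*endif#")
# Opening part of the outer block: "#if(<statement>){".
HEAD = re.compile(r"#if\(([^)]*)\)\s*\{")


def split_if_else(text):
    """Split an outer #if/else/endif block; return None if it is malformed."""
    head = HEAD.match(text)
    if head is None:
        return None
    depth = 0          # number of nested #if blocks currently open
    else_span = None   # span of the outer "}else{"
    for tok in TOKEN.finditer(text, head.end()):
        kind = tok.group()
        if kind.startswith("#if"):
            depth += 1
        elif kind.endswith("endif#"):
            if depth == 0:
                if else_span is None:
                    return None
                return {
                    "match": text[:tok.end()],
                    "statement": head.group(1),
                    "ifblock": text[head.end():else_span[0]].strip(),
                    "elseblock": text[else_span[1]:tok.start()].strip(),
                }
            depth -= 1
        elif depth == 0 and else_span is None:
            # First "}else{" at the outer level separates the two blocks.
            else_span = tok.span()
    return None


sample = (
    "#if(text.text isempty){<customer_comment>#cc{txt_without_comments}cc#"
    "</customer_comment>}else{#if(text.answer=='no'){<customer_comment>"
    "#{text.text}</customer_comment>}else{<answer>#{text.text}</answer>"
    "}endif#}endif#"
)

parts = split_if_else(sample)
for name, value in parts.items():
    print(name, ":", value)
```

The counter plays the role of the Group "a" stack: a nested #if( increments it, a }endif# decrements it, and only a }endif# seen at depth 0 closes the outer block.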