Regex handling "|" in Text

I got the following text:
Code = ABCD123 | Points = 30
Code = ABCD333 | Points = 44
In the end, I want to remove everything except the code. Expected output:
ABCD123
ABCD333
I actually tried it with
Code = | P.+
But I don't know how to get the "|" removed. Currently I am left with, for example, ABCD333 |.
I'm struggling there.

Assuming the code only consists of word characters, you may use the following:
^Code = (\w+).+$
...and replace with the contents of the first capturing group (written $1 or \1, depending on the tool).
If the code can be anything, you may use something like this instead:
^Code = (.+?)[ ]\|.+$
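The question does not name a tool, so as a minimal sketch, assuming Python's re module, both patterns can be tried like this. The variable names are only illustrative.

```python
import re

text = "Code = ABCD123 | Points = 30\nCode = ABCD333 | Points = 44"

# re.MULTILINE makes ^ and $ match at every line, not only at the string ends.
# \1 in the replacement stands for the content of group 1.
word_codes = re.sub(r"^Code = (\w+).+$", r"\1", text, flags=re.MULTILINE)

# Variant for codes that may contain non-word characters: stop at the " |".
any_codes = re.sub(r"^Code = (.+?)[ ]\|.+$", r"\1", text, flags=re.MULTILINE)

print(word_codes)
print(any_codes)
```

Both calls should leave only the codes, one per line.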

Find what: ^Code = (\w+).+
Replace with: $1
CHECK Wrap around
CHECK Regular expression
UNCHECK . matches newline
Replace all
^ # beginning of line
Code = # literally
(\w+) # group 1, 1 or more word characters
.+ # 1 or more of any character except newline
$1 # content of group 1


How can I delete the rest of the line after the second pipe character "|" for every line with python?

I am using Notepad++ and I want to get rid of everything after the second pipe character (including the second pipe character itself) on every line of my txt file.
Basically, the txt file has the following format:
3.1_1.wav|I like apples.|I like apples|I like bananas
3.1_2.wav|Isn't today a lovely day?|Right now it is 1 in the afternoon.|....
The result should be:
3.1_1.wav|I like apples.
3.1_2.wav|Isn't today a lovely day?
I have tried using \|.* but then everything after the first pipe character is matched.
In Notepad++ do this:
Find what: ^([^\|]*\|[^\|]*).*
Replace with: $1
Check "Regular expression", then click "Replace All".
^ - anchor at start of line
( - start group, can be referenced as $1
[^\|]* - scan over any character other than |
\| - scan over |
[^\|]* - scan over any character other than |
) - end group
.* - scan over everything until end of line
In the replacement, reference the captured group with $1.
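Since the title asks about Python, here is a minimal sketch of the same idea with Python's re module, plus a split-based alternative that avoids regex altogether. The sample lines come from the question; the variable names are only illustrative.

```python
import re

lines = [
    "3.1_1.wav|I like apples.|I like apples|I like bananas",
    "3.1_2.wav|Isn't today a lovely day?|Right now it is 1 in the afternoon.|....",
]

# Same pattern as in the Notepad++ answer; inside [...] the pipe needs no escaping.
keep_two_fields = re.compile(r"^([^|]*\|[^|]*).*")
trimmed = [keep_two_fields.sub(r"\1", line) for line in lines]

# Alternative without regex: keep the first two pipe-separated fields.
trimmed_split = ["|".join(line.split("|")[:2]) for line in lines]

for line in trimmed:
    print(line)
```

For a file, the same expression could be applied to each line while reading it; lines with fewer than two pipes are left unchanged by both variants.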
I'm not sure if this is the best way to do it, but try this:

Pyspark - Regex - Extract value from last brackets

I created the following regular expressions with the idea of extracting the last element in parentheses. With only one pair of parentheses it works fine, but with two pairs the first expression extracts the first element (which is wrong), and the second one extracts the last element together with its parentheses.
Do you know how to solve it?
from pyspark.sql.functions import col, regexp_extract

tmp = spark.createDataFrame(
    [(1, 'foo (123) oiashdj (hi)'),
     (2, 'bar oiashdj (hi)')],
    ['id', 'txt']
)
tmp = tmp.withColumn("old", regexp_extract(col("txt"), r"(?<=\().+?(?=\))", 0))
tmp = tmp.withColumn("new", regexp_extract(col("txt"), r"\(([^)]+)\)?$", 0))
| id|                 txt|old| new| needed
|  1|foo (123) oiashdj...|123|(hi)| hi
|  2|    bar oiashdj (hi)| hi|(hi)| hi
To extract the substring between parentheses with no other parentheses inside at the end of the string you may use
tmp = tmp.withColumn("new", regexp_extract(col("txt"), r"\(([^()]+)\)$", 1));
\( - matches (
([^()]+) - captures into Group 1 any 1+ chars other than ( and )
\) - a ) char
$ - at the end of the string.
The 1 argument tells the regexp_extract to extract Group 1 value.
See the regex demo online.
NOTE: To allow trailing whitespace, add \s* right before $: r"\(([^()]+)\)\s*$"
NOTE2: To match the last occurrence of such a substring in a longer string, with exactly the same code as above, use
The .* will grab all the text up to the end, and then backtracking will do the job.
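The pattern for NOTE2 is not shown in the text. A minimal sketch, assuming it is the same group prefixed with a greedy .* as the description suggests, and using Python's re module instead of Spark so it can be tried without a cluster (the constructs used here behave the same way in Java regex). The helper extract and the third sample row are hypothetical.

```python
import re

rows = [
    "foo (123) oiashdj (hi)",
    "bar oiashdj (hi)",
    "baz (first) then (last) tail",  # hypothetical row with text after the last ")"
]

# Anchored variant from the answer, with optional trailing whitespace.
anchored = re.compile(r"\(([^()]+)\)\s*$")

# Assumed NOTE2 variant: the greedy .* runs to the end, then backtracks to the last "(...)".
last_anywhere = re.compile(r".*\(([^()]+)\)", re.DOTALL)

def extract(pattern, text):
    """Return group 1, or an empty string when nothing matches (as regexp_extract does)."""
    match = pattern.search(text)
    return match.group(1) if match else ""

for row in rows:
    print(repr(extract(anchored, row)), repr(extract(last_anywhere, row)))
```

The anchored pattern only succeeds when the parentheses close the string, while the .* variant picks the last parenthesized group wherever it sits.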
This should work. Use it with the single line flag.

Looking for single occurrence between '{' and ':' in a large text

I'm new to the Regex world, so please be kind on the tantrums :-)
I would like to print only the first occurrence of a string between { and :.
Example in the following string:
({TRIGGER.VALUE}=0 and {Zabbix windows:zabbix[process,discoverer,avg,busy].avg(10m)}>75)
({TRIGGER.VALUE}=1 and {Zabbix windows:zabbix[process,discoverer,avg,busy].avg(10m)}>65)
I want it to output only Zabbix windows
how is that possible?
I tried {([a-zA-Z0-9 ]*): but it also prints the : and it prints the match twice.
Thanks for reading!
You may use grep with a PCRE regex (-P) and the -o option (which prints only the matches rather than the whole lines) to grab the text you need, then pipe the result to head -1 to keep only the first match:
s='({TRIGGER.VALUE}=0 and {Zabbix windows:zabbix[process,discoverer,avg,busy].avg(10m)}>75) or ({TRIGGER.VALUE}=1 and {Zabbix windows:zabbix[process,discoverer,avg,busy].avg(10m)}>65)'
echo "$s" | grep -oP '(?<={)[\w\s]+(?=:)' | head -1
See an online demo
Pattern details:
(?<={) - there must be a { immediately to the left of the current location
[\w\s]+ - 1+ word and/or whitespace chars
(?=:) - there must be a : immediately to the right of the current location.
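For comparison, a minimal sketch of the same lookaround pattern in Python, assuming the re module is an acceptable substitute for grep. re.search stops at the first match, so no equivalent of head -1 is needed. The brace is escaped here only for readability.

```python
import re

s = (
    "({TRIGGER.VALUE}=0 and {Zabbix windows:zabbix[process,discoverer,avg,busy].avg(10m)}>75) or "
    "({TRIGGER.VALUE}=1 and {Zabbix windows:zabbix[process,discoverer,avg,busy].avg(10m)}>65)"
)

# "{" must be directly to the left, ":" directly to the right; neither is part of the match.
between_brace_and_colon = re.compile(r"(?<=\{)[\w\s]+(?=:)")

match = between_brace_and_colon.search(s)
first = match.group(0) if match else None
print(first)
```

{TRIGGER.VALUE} is skipped because the text after its brace is never followed by a colon.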

Finding single escaped characters

I would like to replace some special characters in a given text while leaving escaped ones alone. Here is what I've tried.
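import re

_RE_SPECIAL_CHARS = re.compile(r"(?:[^#\\]|\\.)+#")
text = "ok#\\"  # i.e. ok#\ (a raw string literal cannot end in a single backslash)
search = _RE_SPECIAL_CHARS.search(text)
if search:
    print(_RE_SPECIAL_CHARS.sub("<star>", text))
else:
    print('<< NOTHING FOUND ! >>')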
This prints:
<star>\
What I need instead is ok<star>\
You can use lookbehind and just match the special character:
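Or you can capture the part before # in group 1, i.e. wrap it in parentheses as r"((?:[^#\\]|\\.)+)#", and replace with \1<star>. The replacement must be a raw string, otherwise "\1" is read as the control character \x01:
print(_RE_SPECIAL_CHARS.sub(r"\1<star>", text))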
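A minimal sketch of both suggestions side by side. The lookbehind pattern is not shown in the answer, so (?<!\\)# is an assumption about what was meant; the names unescaped_hash and captured_prefix are only illustrative.

```python
import re

text = "ok#\\"  # the characters o, k, # followed by a single backslash

# Variant 1 (assumed pattern): a "#" that is not directly preceded by a backslash.
unescaped_hash = re.compile(r"(?<!\\)#")

# Variant 2: capture everything before the "#" in group 1 and put it back.
captured_prefix = re.compile(r"((?:[^#\\]|\\.)+)#")

def replace_lookbehind(s):
    return unescaped_hash.sub("<star>", s)

def replace_captured(s):
    # Raw string, so that \1 is a group reference and not the character \x01.
    return captured_prefix.sub(r"\1<star>", s)

print(replace_lookbehind(text))
print(replace_captured(text))
```

The two variants are not fully equivalent. The lookbehind version treats any # after a backslash as escaped, even when that backslash is itself escaped, whereas the capturing version consumes escape pairs correctly but needs at least one character before the #.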

how to match each line wrapped by start/end tag?

I want to convert my blog from markdown to html. I used [crayon lang="cpp"]...[/crayon] to paste code. I want to get each line that is wrapped by [crayon][/crayon], and then add 4 spaces at the beginning of each of those lines. For example:
Some text
[crayon lang="bash"]
other text
[crayon lang="cpp"]
int main()
I want it to be:
Some text
other text
    int main()
I don't know how to do it by regex. Could anyone help me?
Here is what I've tried:
\[crayon.*?\]([\d\D]*?)\[\/crayon\] with \1 matches all the lines wrapped by [crayon][/crayon], but I can't add the spaces.
(?'st'\[crayon.*?\])^.*$(?'-st'\[/crayon\]) doesn't match
A (relatively) easy way would be to do it in two steps:
Insert 4 spaces at the start of each line, but only lines after '[crayon lang="..."]' and before '[/crayon]'
pattern : (?ms)^(?=(?:(?!\[crayon\b).)*\[/crayon])
replacement : '    ' (4 spaces)
Remove all '[crayon lang="..."]' and '[/crayon]'
pattern : \[/?crayon.*?][ \t]*(\r?\n|$)
replacement : '' (empty string)
A PHP demo:
$text = 'Some text
[crayon lang="bash"]
other text
[crayon lang="cpp"]
int main()
$text = preg_replace('#^(?=(?:(?!\[crayon\b).)*\[/crayon])#ms', '    ', $text);
$text = preg_replace('#\[/?crayon.*?][ \t]*(\r?\n|$)#', '', $text);
echo "$text\n";
which would print:
Some text
other text
    int main()
A quick explanation of the perhaps terse regex ^(?=(?:(?!\[crayon\b).)*\[/crayon]):
^ # match the start of a line
(?= # start positive look ahead
(?: # start group
(?!\[crayon\b). # match any char as long as it doesn't have `[crayon` in front of it
)* # end group and repeat it zero or more times
\[/crayon] # match '[/crayon]'
) # end positive look ahead
In plain English that would read:
match any start of a line, only if there's a [/crayon] ahead of this line-start, and in between this line-start and [/crayon] there cannot be a [crayon.
I have an idea; use it if you think it's OK.
1. Scan line by line:
a. Look for the pattern \[crayon.+\].
b. If the line does not match, write it out unchanged.
c. If the line matches, write nothing and start looking for the pattern \[\/crayon\].
d. Until that pattern is found, write every line with 4 spaces added at the beginning.
e. When the pattern from (c) is found, write nothing and start again from (a).
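The line-by-line steps above can be sketched in Python as follows. This is a minimal sketch that assumes each tag sits on a line of its own; the function name indent_crayon_blocks and the "ls -la" sample line are hypothetical.

```python
import re

OPEN_TAG = re.compile(r"^\[crayon\b[^\]]*\]\s*$")
CLOSE_TAG = re.compile(r"^\[/crayon\]\s*$")

def indent_crayon_blocks(text, indent="    "):
    """Drop the crayon tag lines and indent every line between them."""
    out = []
    inside = False
    for line in text.splitlines():
        if not inside and OPEN_TAG.match(line):
            inside = True              # steps a/c: opening tag found, write nothing
        elif inside and CLOSE_TAG.match(line):
            inside = False             # step e: closing tag found, write nothing
        elif inside:
            out.append(indent + line)  # step d: indent lines inside the block
        else:
            out.append(line)           # step b: copy other lines unchanged
    return "\n".join(out)

sample = (
    'Some text\n[crayon lang="bash"]\nls -la\n[/crayon]\n'
    'other text\n[crayon lang="cpp"]\nint main()\n[/crayon]'
)
print(indent_crayon_blocks(sample))
```

A small state flag keeps the logic easy to follow and avoids lookaheads that rescan the text from every line start.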
How about \[crayon.*?\]\n(.*\n)*?\[\/crayon\]\n. This way \1 can capture each individual line.
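One caveat: in most regex engines, Python's re included, a repeated group such as (.*\n)*? keeps only its last iteration, so \1 cannot address every line of the block. A minimal sketch, assuming Python, that captures the whole block body in one group and indents it in a replacement callback instead (the names BLOCK, indent_body and convert are hypothetical):

```python
import re

# Group 1 is the whole body of the block; the inner (?:...) group is non-capturing.
BLOCK = re.compile(r"\[crayon.*?\]\n((?:.*\n)*?)\[/crayon\]\n?")

def indent_body(match):
    """Prefix every line of the captured block body with 4 spaces."""
    body = match.group(1)
    return "".join("    " + line for line in body.splitlines(keepends=True))

def convert(text):
    return BLOCK.sub(indent_body, text)

sample = (
    'Some text\n[crayon lang="bash"]\nls -la\n[/crayon]\n'
    'other text\n[crayon lang="cpp"]\nint main()\n[/crayon]\n'
)
print(convert(sample), end="")
```

The sample input reuses the question's layout with a made-up "ls -la" line for the bash block.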