Correction in regular expression - regex

I have a string that contains a mix of words, with \r\n in a few places and \n in others.
This is a sample:
\r\nThis is an\nexample\nand I need\na\nsolution\r\nr\nOK\r\n
Now I need to match only This is an\nexample\nand I need\na\nsolution, keeping the \n characters inside it.
This is the expression I tried, but it is not working:
\r\n([\s\w]+)\r\n
This matches the complete string. Correction, please.

You don't want there to be \rs within the sentence you match? You could use a negative match:
\r\n([^\r]+)\r\n
I believe \s matches both \r and \n as whitespace characters, which is why your pattern runs past the inner \r\n.

Try using the non-greedy quantifier ?:
\r\n([\s\w]+?)\r\n

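A lazy match with the dot matching newlines should work. In Ruby the m flag does this; in Perl, PCRE and JavaScript the equivalent is the s flag: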
/\r\n(.+?)\r\n/m
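
For comparison, here is a minimal Python sketch of how the greedy original, the negated class, and the lazy quantifier behave on the sample string. Python's re module is an assumption, since the question does not name an engine, and the variable names are only illustrative.

```python
import re

# Sample string from the question, with real CR/LF characters.
sample = "\r\nThis is an\nexample\nand I need\na\nsolution\r\nr\nOK\r\n"

# Greedy original: [\s\w]+ also consumes the inner \r\n and only gives
# back the final \r\n, so the group runs almost to the end of the string.
greedy = re.search(r"\r\n([\s\w]+)\r\n", sample).group(1)

# Negated class: the group cannot contain \r, so it stops at the first \r\n.
negated = re.search(r"\r\n([^\r]+)\r\n", sample).group(1)

# Lazy quantifier: +? expands only until the first \r\n that follows.
lazy = re.search(r"\r\n([\s\w]+?)\r\n", sample).group(1)

print(repr(greedy))
print(repr(negated))
print(repr(lazy))
```

The negated class and the lazy quantifier capture the same text here; they differ only in how they get there.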

How to exclude character that has preceding character different than specified in regular expression?

With a regular expression I would like to get all characters between round brackets, but escaped \( and \) characters should also be included in the result.
Examples:
input: fo(ob)a)r
output: ob
input: foo(bar\(qwerty\))baz
output: bar\(qwerty\)
This is what I used for finding text between brackets:
(?<=\()([^\s\(\)]+)(?=\)), but I can't make exceptions for brackets preceded by \.
You could do something like this:
.*(?<!\\)\((.*?)(?<!\\)\)
Basically, it matches as many characters as possible until it sees an open parenthesis without a backslash (using a negative lookbehind), then groups the next matching characters until a closing parenthesis (still without a backslash).
Note that this regex may not work properly if you escape the backslashes.
Example : https://regex101.com/r/BqVKZp/1
This regex works for both your examples, without any lookaheads and lookbehinds:
\((.+[^\\])\)
The U (ungreedy) flag is needed, so that .+ stops at the first suitable closing bracket.
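
Both patterns can be tried against the two inputs from the question. This is a rough Python sketch, not the answerers' own test: Python's re module has no U flag, so the second pattern is written with a lazy .+? instead, which is an assumption about what the flag was meant to achieve.

```python
import re

# Raw strings so the backslashes in the inputs stay literal.
inputs = [r"fo(ob)a)r", r"foo(bar\(qwerty\))baz"]

# Lookbehind version: ( and ) only count as delimiters
# when they are not preceded by a backslash.
lookbehind = re.compile(r".*(?<!\\)\((.*?)(?<!\\)\)")

# Version without lookarounds; .+? stands in for the ungreedy .+
# that the U flag would give in PCRE (assumption, see lead-in).
no_lookaround = re.compile(r"\((.+?[^\\])\)")

results_lookbehind = [lookbehind.search(s).group(1) for s in inputs]
results_no_lookaround = [no_lookaround.search(s).group(1) for s in inputs]

print(results_lookbehind)
print(results_no_lookaround)
```

Note that the second pattern needs at least two characters between the brackets, because .+? and [^\\] each consume one.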

Notepad++ regular expressions and replace

I have a couple of sentences that need processing using regular expressions. They're in a text file and I'm opening it in Notepad++.
<tag>There are two tags here</tag>
<tag>How am i supposed to
feel when this is happening?</tag>
<tag>I'm not sure.
But oh well<tag>
Is it possible to use Notepad++'s regular expressions and replace functionality to produce an output like so:
<tag>There are two tags here</tag>
<tag>How am i supposed to feel when this is happening?</tag>
<tag>I'm not sure. But oh well<tag>
So that sentences that span over two or more lines are joined based on the fact that there is a > at the end of the sentence. Thanks.
Replace this:
[\r\n]+(?!<)
with a space
Explanation:
[\r\n]+ - matches 1+ occurrences of a \r or \n
(?!<) - negative lookahead to validate that the above match is not followed by an opening tag <
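
The same replacement can be sketched outside Notepad++. The Python snippet below is only an illustration and assumes the text uses plain \n line endings; with \r\n endings the + can backtrack and match the \r alone in front of a tag, so the result may differ there.

```python
import re

# Sample from the question, using \n line endings (an assumption).
text = (
    "<tag>There are two tags here</tag>\n"
    "<tag>How am i supposed to\n"
    "feel when this is happening?</tag>\n"
    "<tag>I'm not sure.\n"
    "But oh well<tag>"
)

# Replace every run of line breaks that is not followed by "<" with a space.
joined = re.sub(r"[\r\n]+(?!<)", " ", text)

print(joined)
```

Line breaks that are followed by an opening < are left alone, which is what keeps one sentence per line.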

Regex replace one value between comma separated values

I have a bunch of comma-separated CSV files.
I would like to replace exactly one value, the one between the third and fourth comma. I would love to do this with the Notepad++ 'Find in Files' and Replace functionality, which can use RegEx.
Each line in the files look like this:
03/11/2016,07:44:09,327575757,1,5434543,...
The value I would like to replace in each line is always the number 1, which should become another number.
It can't be a simple regex for e.g. ,1, as this could be somewhere else in the line, so it must be the one after the third and before the fourth comma...
Could anyone help me with the RegEx?
Thanks in advance!
Two more rows as example:
01/25/2016,15:22:55,276575950,1,103116561,10.111.0.111,ngd.itemversions,0.401,0.058,W10,0.052,143783065,,...
01/25/2016,15:23:07,276581704,1,126731239,10.111.0.111,ll.browse,7.133,1.589,W272,3.191,113273232,,...
You can use
^(?:[^,\n]*,){2}[^,\n]*\K,1,
Replace with any value you need, keeping the surrounding commas (for example ,2,), since both commas are part of the match.
The pattern explanation:
^ - start of a line
(?:[^,\n]*,){2} - 2 sequences of:
  [^,\n]* - zero or more characters other than , and \n (matched with the negated character class [^,\n]), followed by
  , - a literal comma
[^,\n]* - zero or more characters other than , and \n
\K - an operator that forces the regex engine to discard the whole text matched so far with the regex pattern
,1, - what we get in the match.
Note that \n inside the negated character classes will prevent overflowing to the next lines in the document.
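
To try the idea outside Notepad++, here is a rough Python equivalent. Python's built-in re module does not support \K, so this sketch keeps the first three fields in a capture group and re-inserts them instead; the extra test rows and the replacement value 9 are made up for illustration.

```python
import re

rows = "\n".join([
    "03/11/2016,07:44:09,327575757,1,5434543",
    "a,1,b,1,c",      # ",1," also appears earlier in the line
    "a,b,c,2,1,d",    # fourth field is not 1, so nothing changes
])

# Group 1 plays the role of the text before \K: the first three fields.
pattern = re.compile(r"^((?:[^,\n]*,){2}[^,\n]*),1,", re.MULTILINE)

# \g<1> re-inserts the prefix; 9 is a made-up replacement value.
fixed = pattern.sub(r"\g<1>,9,", rows)

print(fixed)
```

Only a 1 sitting exactly between the third and fourth comma is touched; the other occurrences of ,1, are left as they are.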
You can replace the value between the third and fourth commas using the following regex.
Regex: ^([^,]+,[^,]+,[^,]+),([^,]+)
Replacement: \1,value (I used XX for the demo). The leading ^ keeps Replace All from also touching the eighth, twelfth, ... field of longer rows.

NOTEPAD++ REGEX - I can't get what's in between two strings, I don't get it

I'm so close to understanding regex. I'm a bit stumped; I thought I understood lazy and greedy.
Here is my current regex: <g_n><!\[CDATA\[([^]]+)(?=]]><\/g_n>)
My current regex makes:
<g_n><![CDATA[xxxxxxxxxx]]></g_n>
match to:
<g_n><![CDATA[xxxxxxxxxx
But I want to make it match like this:
xxxxxxxxxx
You want
<g_n><!\[CDATA\[(.*?)]]></g_n>
then, if you want to replace it, use
\1
in the replacement box.
You're matching the whole string; the parentheses around .*? capture that part and put it in \1.
So the match will be the whole string, with \1 referring to the part you want.
To change the xxxxx:
Regex:
(<g_n><!\[CDATA\[)(?:.*?)(\]\]></g_n>)
Replacement:
\1WHAT YOU WANT TO CHANGE TO\2
It looks like you need to add escape slashes to the two closing square brackets, as they are literals from the string you're parsing.
<g_n><!\[CDATA\[.*+?\]\]><\/g_n>
                    ^ ^
Any square brackets not being escaped by backslashes will be treated as regex operational brackets, which in this case won't catch the input string.
EDIT: I think the +? is redundant.
\[.*\]\]> ...
should suffice, since .* means any character, any number of times.
Tested with Notepad++ 6.3.2:
find: (<g_n><!\[CDATA\[)([^]]+)(?=]]></g_n>)
replace: $1WhatYouWant
You can replace + with * in the pattern to also match an empty CDATA section:
<g_n><![CDATA[]]></g_n>
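
The capture-group idea from the answers above can be checked with a short script. This is a sketch in Python rather than Notepad++, so the replacement uses \g<1> where Notepad++ would use \1 or $1, and "new value" is just a placeholder.

```python
import re

line = "<g_n><![CDATA[xxxxxxxxxx]]></g_n>"

# Capture only the CDATA payload; the whole tag is still the overall match.
payload = re.search(r"<g_n><!\[CDATA\[(.*?)\]\]></g_n>", line).group(1)

# Keep the wrapper in groups 1 and 2 and swap the payload.
# "new value" is a placeholder replacement.
replaced = re.sub(
    r"(<g_n><!\[CDATA\[).*?(\]\]></g_n>)",
    r"\g<1>new value\g<2>",
    line,
)

print(payload)
print(replaced)
```

The overall match always spans the whole tag; it is the group, not the match, that isolates xxxxxxxxxx.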

How to find occurrences of same subsequent characters in a string with a regular expression?

How can I find occurrences of the same character repeated consecutively in a string with a regular expression or function?
Example:
I am leet and I have a three pizzas. That noob right there has only one pizza. Poor boy.
You can use a backreference:
/(.)\1/
Change \1 to \1+ if you want to find sequences of length two or more.
Note that the syntax can vary depending on the regular expression engine you are using.
Not sure which version of regex you're working with, but for egrep, this works:
egrep '(.)\1' < file
That will show all lines that have two of some character in a row. If you want just letters:
egrep '([A-Za-z])\1' < file
would work.
Like this in a Perl flavour. \w matches a word character, and \2 refers back to the second pair of parentheses.
m/((\w)\2+)/g
Google it:'double characters regex'
Here's a re-fiddle I made with your regex: http://refiddle.com/2fa
This should work: (.)\1+
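
As a quick check of the backreference approach, here is a small Python sketch run on the example sentence. Python is an assumption, since the question does not name a language; the second input string is made up to show the \1+ variant.

```python
import re

text = ("I am leet and I have a three pizzas. "
        "That noob right there has only one pizza. Poor boy.")

# ([A-Za-z]) captures one letter and \1 requires the same letter again,
# mirroring the letters-only egrep variant.
pairs = [m.group(0) for m in re.finditer(r"([A-Za-z])\1", text)]

# \1+ extends the match to runs longer than two characters.
runs = [m.group(0) for m in re.finditer(r"(.)\1+", "aaabccd")]

print(pairs)
print(runs)
```

finditer with group(0) is used here because findall would return only the captured single character rather than the doubled text.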