Regular Expressions: Remove x-leading spaces from lines [duplicate] - regex

This question already has answers here:
Notepad++ Regex - Issue with ^ anchor and repeating patterns
(2 answers)
Closed 5 years ago.
To remove, e.g. (exactly) 2 leading spaces from each line, I've tried to replace
"^  "
with
""
I tried that with our own text editor and with Notepad++. Both behave the same: each search resumes at the position where the previous match was replaced, so they actually remove 2n spaces from each line (n >= 0). Is this the expected behavior? Is my regular expression wrong for this task, or do our own text editor and Notepad++ behave incorrectly?

The issue here is that Notepad++ will keep replacing a pattern so long as it keeps finding matches. This means that replacing "^  " (a caret followed by two spaces) will keep stripping whitespace from the start of the line, so long as two or more leading spaces remain.
Try this as a workaround:
Find:
^  (.*)$
Replace:
$1
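
For comparison, here is a minimal Python sketch of both patterns. Using Python's re module is an assumption made for illustration (the question is about Notepad++), and the sample text is made up. re.sub scans the text in a single pass, and with re.MULTILINE the ^ anchor only matches at the start of a line, so both patterns should remove exactly two spaces per line there.

```python
import re

# Hypothetical sample: lines indented by 4, 2, and 0 spaces.
text = "    four\n  two\nzero"

# Plain pattern: "^" followed by exactly two spaces.
plain = re.sub(r"^  ", "", text, flags=re.MULTILINE)

# Workaround from the answer: also consume the rest of the line,
# so the next search can only begin on the following line.
workaround = re.sub(r"^  (.*)$", r"\1", text, flags=re.MULTILINE)

print(plain.splitlines())
print(plain == workaround)
```

In an engine that restarts the ^ check after each replacement, only the second form is safe, because it leaves nothing on the line to match again.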

Related

How to exclude a substring in a regular expression? [duplicate]

This question already has answers here:
What is the difference between .*? and .* regular expressions?
(3 answers)
What do 'lazy' and 'greedy' mean in the context of regular expressions?
(13 answers)
Closed 5 months ago.
There is a line of text:
Lorem ~Ipsum~ is simply ~dummy~ text ~of~ the printing...
To find all the words enclosed in ~~ I use
re.search(r'~([^~]*)~', text)
Now suppose the delimiter has to be ~~ instead of ~
([^\~]+) excludes the single character ~ from the text between the delimiters
How do I write a regular expression that excludes a sequence of characters instead of just one?
That is, ~~Lor~em~~ should return Lor~em
Newline characters must not be excluded, and the length of the found string cannot be 0
Use a non-greedy quantifier instead of a negated character set.
re.search(r'~~(.*?)~~', text, flags=re.DOTALL)
re.DOTALL makes . match newline characters.
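
A short sketch of the lazy-quantifier approach on a made-up sample string. One deviation from the answer, labeled as an assumption: .+? is used instead of .*? so that an empty ~~~~ pair does not yield a zero-length match, which the question rules out.

```python
import re

# Hypothetical sample: one match contains "~", the other spans two lines.
text = "~~Lor~em~~ and ~~two\nlines~~"

# Lazy quantifier: stop at the first closing "~~".
# ".+?" (assumption, not from the answer) requires at least one character.
pattern = re.compile(r"~~(.+?)~~", flags=re.DOTALL)

matches = pattern.findall(text)
print(matches)

# With ".*?" an empty pair produces an empty match; with ".+?" it does not.
empty_with_star = re.findall(r"~~(.*?)~~", "~~~~")
empty_with_plus = pattern.findall("~~~~")
```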

Removing escaped unicode sequence in a text file [duplicate]

This question already has an answer here:
Regex for matching Unicode pattern
(1 answer)
Closed 2 years ago.
I have a text file with lots of Unicode escape sequences (emojis, by the way), for instance
blablabla \uD83D\uDC4D\uD83C blablabla \uDFFC\uD83D\uDC4F\uD83C\uDFFD
I'd like to remove it all, and get
blablabla blablabla
Is there a regex that would clean these up, considering that I use Notepad++?
Thanks.
I would suggest:
\\u[0-9A-F]{4}\s?
\\ escapes the backslash so that it is matched literally, and u matches the literal u. [0-9A-F]{4} matches exactly 4 of these characters. Perhaps you should update it to also match length 2 sequences, depending on the actual text: \\u([0-9A-F]{4}|[0-9A-F]{2})\s?
The \s? matches zero or one whitespace character, so you don't end up with multiple consecutive whitespace characters.
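
The same pattern can be tried outside Notepad++; below is a minimal Python sketch (Python is an assumption here, chosen only to make the behavior easy to check). Note that the character class only covers uppercase hex digits, and that a trailing space can be left behind, so the final strip() is an extra step added for illustration.

```python
import re

# Raw string, so the backslashes stay literal, as they are in the text file.
text = r"blablabla \uD83D\uDC4D\uD83C blablabla \uDFFC\uD83D\uDC4F\uD83C\uDFFD"

# Literal backslash, "u", four uppercase hex digits, then an optional
# single whitespace character following the escape.
cleaned = re.sub(r"\\u[0-9A-F]{4}\s?", "", text)

# strip() removes the space left after the second "blablabla".
print(repr(cleaned.strip()))
```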

Sed syntax to extract all of text BEFORE last delimiter? [duplicate]

This question already has answers here:
Regular Expression, remove everything after last forward slash
(5 answers)
Closed 3 years ago.
I am trying to get the syntax right so that I can make scanning-client-container-0.2.tar look like scanning-client-container
I am using - as the delimiter, like so:
sed -e 's/-[^*]*$//'
with the result scanning, which is cut off too early
You can use a negated character class in your regex:
sed 's/-[^-]*$//' <<< 'scanning-client-container-0.2.tar'
scanning-client-container
RegEx Details:
-: Match a -
[^-]*: Match 0 or more characters that are not -
$: Match end
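
To see why the original attempt cut too early, here is a small Python sketch comparing the two patterns (Python's re is used only for illustration; the answer itself uses sed). The rpartition line is an extra non-regex alternative, not part of the answer.

```python
import re

name = "scanning-client-container-0.2.tar"

# Original attempt: [^*] also matches "-", so the match begins at the
# FIRST hyphen and runs to the end of the string.
too_early = re.sub(r"-[^*]*$", "", name)

# Fixed pattern: [^-] cannot cross a hyphen, so only the last "-..." goes.
fixed = re.sub(r"-[^-]*$", "", name)

# Non-regex alternative (assumption: the name contains at least one "-";
# otherwise rpartition returns an empty head).
head = name.rpartition("-")[0]

print(too_early, fixed, head)
```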

Use RegEx to find and transform characters to capital case [duplicate]

This question already has answers here:
Notepad++ and regex: how to UPPERCASE specific part of a string / find / replace
(2 answers)
Closed 4 years ago.
In Notepad++ I need to use a regex to transform all occurrences of
phone1_id, phone2_id, phone3_id
into
PHONE1_ID, PHONE2_ID, PHONE3_ID
This RegEx helps me find all those strings: phone\d+_id
but how can I transform them to capital case?
Ctrl+H
Find what: phone\d+_id
Replace with: \U$0
check Wrap around
check Regular expression
Replace all
Replacement:
\U : Change to uppercase
$0 : contains the whole match
Result for given example:
PHONE1_ID, PHONE2_ID, PHONE3_ID
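
Outside Notepad++, the \U replacement escape is not always available. Python's re module, for example, does not support it in replacement strings, so a sketch of the same transformation there would pass a function as the replacement instead:

```python
import re

text = "phone1_id, phone2_id, phone3_id"

# The replacement callable receives each match object;
# group(0) is the whole match, like $0 in Notepad++.
result = re.sub(r"phone\d+_id", lambda m: m.group(0).upper(), text)

print(result)
```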

Regex divide string by commas ignoring function syntax [duplicate]

This question already has answers here:
Split string delimited by comma without respect to commas in brackets
(3 answers)
Closed 4 years ago.
I need a regex that replaces only some of the commas in a string.
For example the string:
str1 = "a,b,12,func(a,b),8,bob,func(1,2))"
should be transformed as follows:
str1_transformed = "a;b;12;func(a,b);8;bob;func(1,2))"
I cannot substitute every "," with a ";" because the result would look like:
str1_wrong = "a;b;12;func(a;b);8;bob;func(1;2))"
How can I deal with it?
I looked at the following threads without success:
How can I Split(',') a string while ignore commas in between quotes?
Regular Expression for Comma Based Splitting Ignoring Commas inside Quotes
If you know that you won't have unbalanced or escaped brackets, the regex below works well:
,(?![^()]*\))
Breakdown:
, Match a comma
(?! Start of negative lookahead
[^()]*\) That means the comma just matched must not be followed by a closing bracket unless an opening bracket comes first
) End of lookahead
C# code:
using System.Text.RegularExpressions;
// Replace only the commas that are not inside parentheses.
Regex regex = new Regex(@",(?![^()]*\))");
string result = regex.Replace(@"a,b,12,func(a,b),8,bob,func(1,2))", ";");
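
The same lookahead carries over to other engines. A minimal Python sketch, using the string from the question, under the same assumption of no nested or escaped brackets:

```python
import re

str1 = "a,b,12,func(a,b),8,bob,func(1,2))"

# A comma that is NOT followed by ")" before any other bracket,
# i.e. a comma outside a (non-nested) pair of parentheses.
outside_comma = re.compile(r",(?![^()]*\))")

str1_transformed = outside_comma.sub(";", str1)
parts = outside_comma.split(str1)

print(str1_transformed)
print(parts)
```

With nested calls such as func(a,g(b),c) the lookahead is no longer reliable, and a small parser that tracks bracket depth would likely be the safer choice.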