Regex in "Find and Replace"; how to match \n (newline character)? - regex

I'm not sure whether I couldn't find the correct way or this is a bug.
I wanted to check some reference manual but there doesn't seem to be one.
In Jupyter's Find and Replace screen, there's an icon .* to check when I want to use regex.
Mostly it works fine, but if I try to match a line break (\n), it does not match it unless it is that very character. For example, I want to match every line that doesn't end with , and join that line to the next one. I'd match [^,]\n and replace with ,, which would remove the line break. I could try [^,]$, but replacing this wouldn't remove the line break.
How do I do this?

There are a lot of variants of the new-line character.
E.g.: \r or \n
Regex Pattern
Anyway, here is the pattern, using a negative lookbehind to check that there is no comma before the line break, and covering the variants of the new-line character.
(?<!,)(\r?(\n|\r))
Regex Demo
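For readers who want to try the idea outside an editor, here is a minimal Python sketch. It is not Jupyter's own engine (as far as I know the Find and Replace dialog runs in the browser, so lookbehind support there depends on the browser's JavaScript engine), and the helper name join_unfinished_lines is made up for illustration. The lookbehind is widened to (?<![,\r]) as an assumption on my part, so that the \n of a \r\n pair that follows a comma is not removed on its own.

```python
import re

# Hypothetical helper, not from the thread.
# Matches a line break (\n or \r\n) that is NOT preceded by a comma.
# "\r" is also excluded in the lookbehind so that, for ",\r\n",
# the engine cannot skip the "\r" and match the bare "\n".
UNFINISHED_BREAK = re.compile(r"(?<![,\r])\r?\n")


def join_unfinished_lines(text: str) -> str:
    """Join every line that does not end with a comma to the next line."""
    return UNFINISHED_BREAK.sub("", text)


sample = "a, b,\nc, d\ne, f,\r\ng\r\nh,\n"
print(repr(join_unfinished_lines(sample)))
# → 'a, b,\nc, de, f,\r\ngh,\n'
```

Because the lookbehind is zero-width, the character before the line break is never consumed, so nothing has to be put back in the replacement.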

Related

Remove trailing whitespace at the end of aspx file

I am trying to remove trailing whitespace including \r and \n at the end of aspx files by using Find and Replace using the pattern
\s+(?!.)
trying to replace whitespace followed by nothing with nothing.
The result is that everything will come on the same line.
Why?
I also tried \s+$ with the same result.
You may add a negative lookahead to the end of your current pattern:
(\s+\r?\n)+$(?!.)
This will ensure that only final lines with whitespace only are matched. See the demo here.
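A small Python sketch may help show why \s+(?!.) affects every line. Since . does not match \n by default, the lookahead succeeds in front of each \n, so the \r of every \r\n pair is matched, not just the whitespace at the very end. This is only an illustration in Python's re module, not the editor's engine, and strip_trailing_whitespace is a made-up name. The sketch uses \Z, which in Python matches only at the end of the whole string.

```python
import re


def strip_trailing_whitespace(text: str) -> str:
    """Remove whitespace (including \\r and \\n) at the very end of the text only."""
    return re.sub(r"\s+\Z", "", text)


page = "<p>a</p>\r\n<p>b</p>\r\n \r\n"

# Only the final run of whitespace is removed; inner line breaks stay.
print(repr(strip_trailing_whitespace(page)))
# → '<p>a</p>\r\n<p>b</p>'

# The pattern from the question: the "\r" before every "\n" is matched too,
# because "." does not match "\n" and so the lookahead succeeds there.
print(repr(re.sub(r"\s+(?!.)", "", "a\r\nb\r\n")))
# → 'a\nb'
```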

Replace Certain Line Breaks with Equivalent of Pressing delete key on Keyboard NotePad++ Regex

I'm using Notepad++ Find and Replace, and I have a regex, [^|]\r, that finds the end of the line that starts with 8778.
8778|44523|0||TENNESSEE|ADMINISTRATION||ROLL 169 BATCH 8|1947-09-22|0|OnBase
See Also 15990TT|
I want to basically merge that line with the one below it, so it becomes this:
8778|44523|0||TENNESSEE|ADMINISTRATION||ROLL 169 BATCH 8|1947-09-22|0|OnBase See Also 15990TT|
I've tried replacing the match with a blank space, but it grabs the last character on that line (an e in this case) and replaces that with a space as well, so the result is
8778|44523|0||TENNESSEE|ADMINISTRATION||ROLL 169 BATCH 8|1947-09-22|0|OnBas
See Also 15990TT|
Is there any way to make it essentially merge the two lines?
\r only matches a carriage return symbol. To match a line break, you need \R, which matches any line break sequence.
To keep a part of a pattern after replacement, capture that part with parentheses, and then use a backreference to that group.
So you may use
([^|\r])\R
Replace with $1, or with $1 followed by a space if you need to separate the joined lines with a space.
Details
([^|\r]) - Capturing group 1 ($1 is the backreference that refers to the group value from the replacement pattern): any char other than | and CR
\R - any line break char sequence, LF, CR or CRLF.
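The capture-and-backreference idea can be tried in Python as a rough sketch, using the sample record from the question. Python's re module has no \R, so (?:\r\n|\r|\n) is used as a stand-in, and \n is added to the excluded characters; both are my assumptions, not part of the answer. In Python the replacement backreference is written \1 rather than $1.

```python
import re

record = (
    "8778|44523|0||TENNESSEE|ADMINISTRATION||ROLL 169 BATCH 8|1947-09-22|0|OnBase\r\n"
    "See Also 15990TT|\r\n"
)

# Stand-in for \R (any line break sequence), which Python's re lacks.
LINE_BREAK = r"(?:\r\n|\r|\n)"

# The original attempt: the character before \r is consumed and lost,
# and the \n is left behind.
lossy = re.sub(r"[^|]\r", " ", record)

# Capture the preceding character and put it back with \1, then add a space.
merged = re.sub(r"([^|\r\n])" + LINE_BREAK, r"\1 ", record)

print(repr(lossy))
print(repr(merged))
```

The second line of the record ends with |, so its line break is left alone and only the first two lines are merged.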
See the regex demo and the Notepad++ demo with settings:
The issue is you're using [^|] to match anything that's not a pipe character before the carriage return, which, on replacement, will remove that character (hence why you're losing an e).
If it's imperative that you match only carriage returns that follow non-pipe characters, capture the preceding character ([^|])\r$ and then put it back in the replacement using $1.
You're also missing a \n in your regex, which is why the replacement isn't concatenating the two lines. So your search should be ([^|])\r\n$ and your replace should be $1.
Find
(\r\n)+
For "Replace" - don't put anything in (not even a space)

Delphi regular expression, ignoring LineFeed in matched text

I want to search for the following regular expression
^[ ]*,$
in the following text :
,[LF]
,[LF]
My problem is that Delphi finds the expression, but the matched text doesn't include the LF.
Effectively, I want to remove these lines from my source code.
I'm using TPerlRegEx with Delphi XE8.
In the example, [LF] is the line break ($0D $0A, that is, CR followed by LF).
I tested several flag combinations in TPerlRegExOptions.
This works perfectly in Sublime Text 3.
What am I missing?
If you are using a PCRE regex, you can match zero or more spaces at the start of a line, followed by a comma, followed by a newline sequence with
(?m)^[ ]*,\R
See the regex demo. Note that (?m) is a multiline modifier making ^ match a location at the beginning of a line (after \n). \R matches any newline sequence.
Add a ? after \R to also match the last line in the text that has no newline sequence at the end.
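As a hedged illustration outside Delphi, the same approach can be sketched in Python. Python's re module does not support \R, so an explicit alternation is used instead. The sketch also uses an end-of-text alternative (\Z) rather than an optional line break, so that a line such as ",abc" is not touched; that choice and the name drop_comma_only_lines are mine, not the answer's.

```python
import re

# ^ with re.MULTILINE matches at the start of each line.
# The line break (or the end of the text) is part of the match,
# so the whole line disappears, not just its content.
COMMA_ONLY_LINE = re.compile(r"^[ ]*,(?:\r\n|\r|\n|\Z)", re.MULTILINE)


def drop_comma_only_lines(source: str) -> str:
    """Delete lines that contain only optional spaces and a single comma."""
    return COMMA_ONLY_LINE.sub("", source)


code = "foo\r\n  ,\r\n,\r\nbar\r\n"
print(repr(drop_comma_only_lines(code)))
# → 'foo\r\nbar\r\n'
```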

Notepad++ delete lines ending with specific character

I want to delete all lines ending with |
I tried
.*[|;]
but it doesn't restrict the match to the end of the line
Use the following regex:
.*\|$
This says "any character any number of times (.*), followed by a pipe (\| - you have to escape it), and then the end of a line ($)".
If you want to find lines ending with either ; or |, use:
.*[\|;]$
You don't have to escape the pipe in this case, but I prefer to do so anyway.
In either case, make sure you're in "Regular expression" search mode with ". matches newline" unchecked.
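Note that .*\|$ matches the text of the line but not its line break, so replacing it with nothing leaves an empty line behind. A minimal Python sketch of deleting the whole line, line break included, could look like this. It illustrates the idea in Python's re module rather than Notepad++'s engine, and delete_pipe_ended_lines is a hypothetical name.

```python
import re

# With re.MULTILINE, ^ anchors at the start of each line.
# The trailing group consumes the line break (or the end of the text),
# so no empty line is left behind.
PIPE_ENDED_LINE = re.compile(r"^.*\|(?:\r?\n|\Z)", re.MULTILINE)


def delete_pipe_ended_lines(text: str) -> str:
    """Remove every line whose last character is a pipe."""
    return PIPE_ENDED_LINE.sub("", text)


sample = "keep me\ndrop me|\nkeep|this\nlast|"
print(repr(delete_pipe_ended_lines(sample)))
# → 'keep me\nkeep|this\n'
```

A pipe in the middle of a line ("keep|this") does not trigger a deletion, because the pipe must be followed directly by a line break or the end of the text.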

Removing repeated characters, including spaces, in one line

I currently have a string, say $line='55.25040882, 3,,,,,,', that I want to remove all whitespace and repeated commas and periods from. Currently, I have:
$line =~ s/[.,]{2,}//;
$line =~ s/\s{1,}//;
Which works, as I get '55.25040882,3', but when I try
$line =~ s/[.,\s]{2,}//;
It pulls out the ", " and leaves the ",,,,,,". I want to retain the first comma and just get rid of the whitespace.
Is there a way to elegantly do this with one line of regex? Please let me know if I need to provide additional information.
EDIT: Since there were so many solutions, I decided to update my question with the answer below:
$line =~ s/([.,])\1{1,}| |\t//g;
This removes all repeated periods and commas, removes all spaces and tabs, while retaining the \r and \n characters. There are so many ways to do this, but this is the one I settled for. Thanks so much!
This is mostly a critique of Rohit's answer, which seems to contain several misconceptions about character class syntax, especially the negation operator (^). Specifically:
[(^\n^\r)\s] matches ( or ^ or ) or any whitespace character, including linefeed (\n) and carriage return (\r). In fact, they're each specified twice (since \s matches them too), though the class still only consumes one character at a time.
^[\n\r]|\s matches a linefeed or carriage return at the beginning of the string, or any whitespace character anywhere (which makes the first part redundant, since any whitespace character includes linefeed and carriage return, and anywhere includes the beginning of the string).
Inside a character class, the caret (^) negates the meaning of everything that follows iff it appears immediately after the opening [; anywhere else, it's just a caret. All other metacharacters except \ lose their special meanings entirely inside character classes. (But the normally non-special characters, - and ], become special.)
Outside a character class, ^ is an anchor.
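These points about the caret are easy to check with a quick experiment. The following is a small Python sketch (Python's character class rules agree with Perl's on these points); the sample strings are made up.

```python
import re

# Inside a class, "(", ")" and a non-leading "^" are literal characters.
literal_members = re.findall(r"[(^\n^\r)\s]", "(^) x")
print(literal_members)  # → ['(', '^', ')', ' ']

# A caret right after "[" negates the class: anything except \n and \r.
negated = re.findall(r"[^\n\r]", "a\nb\r")
print(negated)  # → ['a', 'b']

# Outside a class, ^ is an anchor (start of each line with re.MULTILINE).
anchored = re.findall(r"^\w", "ab\ncd", flags=re.MULTILINE)
print(anchored)  # → ['a', 'c']
```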
Here's how I would write the regex:
$line =~ s/([.,])\1+|\h+//g;
Explanation:
Since you finally went with ([.,])\1{1,}, I assume you want to match repeated periods or repeated commas, not things like ., or ,.. Success with regexes means learning to look at text the way the regex engine does, and it's not intuitive. You'll help yourself a lot if you try to describe each problem the way the regex engine would, if it could speak.
{1,} is not incorrect, but why add all that clutter to your regex when + does the same thing?
\h matches horizontal whitespace, which includes spaces and tabs, but not linefeeds or carriage returns. (That only works in Perl, AFAIK. In Ruby/Oniguruma, \h matches a hex digit; in every other flavor I know of, it's a syntax error.)
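For comparison, here is a rough Python equivalent of that substitution. Python's re module has no \h, so [ \t] is used as an approximation of horizontal whitespace (an assumption; \h covers more characters in Perl), and clean_line is a made-up name.

```python
import re


def clean_line(line: str) -> str:
    """Remove runs of a repeated period or comma, plus spaces and tabs.

    ([.,])\\1+ matches a period or comma followed by one or more copies
    of the SAME character; [ \\t]+ stands in for Perl's \\h+.
    """
    return re.sub(r"([.,])\1+|[ \t]+", "", line)


print(clean_line("55.25040882, 3,,,,,,"))
# → 55.25040882,3
```

The single period and the first comma survive because neither is followed by a copy of itself, while line breaks are untouched because [ \t] does not include them.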
You can try using: -
my $line='55.25040...882, 3,,,,,,';
$line =~ s/[^\S\n\r]|[.,]{2,}//g; # Removes whitespace other than \n and \r, and runs of 2+ periods/commas
print $line;
OUTPUT: -
55.25040882,3
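[^\S\n\r]|[.,]{2,} -> This matches either [^\S\n\r] or [.,]{2,}
[.,]{2,} -> This matches a run of two or more consecutive , or . characters (in any mix), which is then replaced with nothing.
[^\S\n\r] -> This class excludes non-whitespace characters, linefeed, and carriage return, so it matches any whitespace character other than \n and \r.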
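The difference between [.,]{2,} and a backreference such as ([.,])\1+ only shows up on mixed runs like "., ". The following Python sketch, with made-up variable names and sample input, runs both variants side by side; the double-negative class [^\S\n\r] behaves the same way in Python as in Perl.

```python
import re

# Any run of 2+ periods/commas, in any mix.
mixed_run = re.compile(r"[^\S\n\r]|[.,]{2,}")
# Only runs of the SAME character repeated.
same_char_run = re.compile(r"[^\S\n\r]|([.,])\1+")

sample = "1.,2,, 3\n"

print(repr(mixed_run.sub("", sample)))      # → '123\n'
print(repr(same_char_run.sub("", sample)))  # → '1.,23\n'

# The input from the answer above, with the first pattern:
print(mixed_run.sub("", "55.25040...882, 3,,,,,,"))
# → 55.25040882,3
```

In both cases the trailing \n survives, since [^\S\n\r] matches whitespace other than line breaks.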