I want to search for the following regular expression
^[ ]*,$
in the following text :
,[LF]
,[LF]
My problem is that Delphi finds the expression, but the matched text doesn't include the LF.
Effectively, I want to remove these lines from my source code.
I'm using TPerlRegEx with Delphi XE8.
In the example, [LF] is the line break ($0D $0A).
I tested several combinations of flags in TPerlRegExOptions.
This works perfectly in Sublime Text 3.
What am I missing?
If you are using a PCRE regex, you can match zero or more spaces at the start of a line, followed by a comma, followed by a newline sequence with
(?m)^[ ]*,\R
See the regex demo. Note that (?m) is a multiline modifier making ^ match a location at the beginning of a line (after \n). \R matches any newline sequence.
Add a ? after \R to also match the last line in the text that has no newline sequence at the end.
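For readers who want to try the idea outside an editor, here is a minimal Python sketch of the same pattern. It is only an illustration, not TPerlRegEx code: Python's re module has no \R, so the newline sequence is spelled out as an alternation, and the name drop_comma_lines is made up for this example.

```python
import re

# Python's re has no \R, so the newline sequence is written out explicitly.
NEWLINE = r"(?:\r\n|\n|\r)"

# (?m) makes ^ match at the start of every line, as in the PCRE pattern.
COMMA_LINE = re.compile(r"(?m)^[ ]*," + NEWLINE)


def drop_comma_lines(text: str) -> str:
    """Remove lines that hold only optional spaces and a comma, line break included."""
    return COMMA_LINE.sub("", text)


if __name__ == "__main__":
    print(repr(drop_comma_lines("a\r\n,\r\n  ,\r\nb\r\n")))
```

Because the line break is part of the match, the whole line disappears instead of leaving an empty line behind.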
I'm using Notepad++ Find and Replace, and I have a regex, [^|]\r, that finds the end of the line that starts with 8778.
8778|44523|0||TENNESSEE|ADMINISTRATION||ROLL 169 BATCH 8|1947-09-22|0|OnBase
See Also 15990TT|
I want to basically merge that line with the one below it, so it becomes this:
8778|44523|0||TENNESSEE|ADMINISTRATION||ROLL 169 BATCH 8|1947-09-22|0|OnBase See Also 15990TT|
I've tried replacing with a blank space, but it grabs the last character on that line (an e in this case) and replaces that with a space, so the result is
8778|44523|0||TENNESSEE|ADMINISTRATION||ROLL 169 BATCH 8|1947-09-22|0|OnBas
See Also 15990TT|
Is there any way to make it essentially merge the two lines?
\r only matches a carriage return symbol. To match a line break, you need \R, which matches any line break sequence.
To keep a part of a pattern after replacement, capture that part with parentheses, and then use a backreference to that group.
So you may use
([^|\r])\R
Replace with $1, or with $1 followed by a space if you need to append a space.
Details
([^|\r]) - Capturing group 1 ($1 is the backreference that refers to the group value from the replacement pattern): any char other than | and CR
\R - any line break char sequence, LF, CR or CRLF.
See the regex demo and the Notepad++ demo.
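The capture-and-restore idea can be sketched in Python as follows. This is an assumption-laden illustration rather than Notepad++ behavior: Python's re lacks \R, so the line break is an explicit alternation, \n is added to the excluded characters so empty lines are left alone, and merge_broken_lines and its joiner parameter are hypothetical names.

```python
import re

# Stand-in for \R, which Python's re does not support.
LINE_BREAK = r"(?:\r\n|\n|\r)"

# Group 1 holds the last character of the line so it can be put back.
MERGE = re.compile(r"([^|\r\n])" + LINE_BREAK)


def merge_broken_lines(text: str, joiner: str = " ") -> str:
    """Join a line that does not end in | with the line that follows it."""
    return MERGE.sub(lambda m: m.group(1) + joiner, text)


if __name__ == "__main__":
    sample = "a|b|OnBase\r\nSee Also 15990TT|\r\n"
    print(repr(merge_broken_lines(sample)))
```

Only the line break is effectively replaced: the captured character is written back, so the trailing e of OnBase survives.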
The issue is you're using [^|] to match anything that's not a pipe character before the carriage return, which, on replacement, will remove that character (hence why you're losing an e).
If it's imperative that you match only carriage returns that follow non-pipe characters, capture the preceding character ([^|])\r$ and then put it back in the replacement using $1.
You're also missing a \n in your regex, which is why the replacement isn't concatenating the two lines. So your search should be ([^|])\r\n and your replace should be $1.
Find
(\r\n)+
For "Replace" - don't put anything in (not even a space)
regex difference between vscode and visual studio
starting with
line1
line2
find: ^(.+)$
replace: "$1",
In vscode it works as expected, resulting in
"line1",
"line2",
In Visual Studio, it doesn't seem to work, resulting in
"line1
",
"line2
",
Which one is correct? I assume vscode.
TL;DR: Use ^(.*[^\r\n]) to match a whole line without EOL characters.
According to the Docs:
Purpose | Expression | Example
Match any single character (except a line break) | . | a.o matches "aro" in "around" and "abo" in "about" but not "acro" in "across"
Anchor the match string to the end of a line | \r?$ | car\r?$ matches "car" only when it appears at the end of a line
Anchor the match string to the end of the file | $ | car$ matches "car" only when it appears at the end of the file
However, some of that doesn't seem to hold true for some reason (i.e., . does match a line break and .$ does match the end of any line). All of the following patterns will match from the beginning to the end of the line including EOL characters: ^.+, ^.+$, ^.+\r?$.
I have noticed this behavior in VS2017 before and I'm not sure why it happens but I was able to get around it using something like the following:
^(.*[^\r\n])
Note: You can also get rid of the capturing group and replace with "$0",.
In VSCode regex patterns, a dot . matches any char except line break chars.
In the .NET regex used in Visual Studio, a dot matches any char except the newline (LF) char.
This difference explains the results you get and you can't call them right or wrong, these are just regex engine differences.
Note you would not have noticed any difference between the two engines if you had used LF-only line endings, but Visual Studio in Windows uses CRLF endings by default.
In order to wrap a whole line with double quotes using .NET regex, just exclude both LF and CR (carriage return) symbols from matching by replacing the dot with a [^\r\n] negated character class:
^[^\r\n]+
Then replace with "$&", where $& refers to the whole match.
You may get rid of the capturing group in the VSCode regex and use the same replacement pattern as in .NET, too.
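The effect can be reproduced outside Visual Studio. The sketch below uses Python's re module as a stand-in, on the assumption that its dot behaves like the .NET one described above (it excludes only \n, so it happily matches \r). The function names are invented for this example.

```python
import re


def quote_with_dot(text: str) -> str:
    """Wrap each line using ^(.+)$; the dot also swallows the CR of a CRLF ending."""
    return re.sub(r"(?m)^(.+)$", r'"\1",', text)


def quote_without_eol(text: str) -> str:
    """Wrap each line using ^[^\\r\\n]+ so CR and LF stay outside the match."""
    return re.sub(r"(?m)^[^\r\n]+", r'"\g<0>",', text)


if __name__ == "__main__":
    crlf_text = "line1\r\nline2"
    print(repr(quote_with_dot(crlf_text)))
    print(repr(quote_without_eol(crlf_text)))
```

With CRLF input the first function puts the closing quote after the carriage return, which is what shows up as a quote on the next line; the second keeps the line ending untouched.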
I need to cut lines that have 6 or more characters, hyphen, then other characters or symbols. Hyphen and rest of line should be removed. Source text:
0402CS-2
0402CS-3
0402
7812-C
0603CS-1
0603CS-2
0603CS-3
As a result, I need this:
0402CS
0402CS
0402
7812-C
0603CS
0603CS
0603CS
To do that, I use Notepad++ regexp replace feature. Find pattern: ^([^\-]{6,})\-.+$ Replace pattern: \1
But there is no "multiline" option, so the symbols "^" and "$" don't match ONLY the beginning and end of the line, and the actual result is:
0402CS
0402CS
0402
7812 <-- that's wrong!
0603CS
0603CS
0603CS
Please advise me how to fix the find pattern. Or maybe there is another handy and powerful free text editor that can do this?
^([^\n\-]{6,})\-.+$
    ^^
Just add \n to the character class: because [^-] also matches a line break, the regex can run on into the line below and use it to complete a match.
See demo.
https://regex101.com/r/BHO93c/1
For the input
0402
7812-C
the regex treats both lines as one line and makes a match.
See demo if 0402 is not there.
https://regex101.com/r/BHO93c/2
That happens because the [^-] character class also matches a newline.
Add \n to it:
^([^\n-]{6,})-.+$
See the regex online demo (note the m multiline modifier (making ^ match the start of the line, and $ - the end of the line) and g modifier (enabling search for multiple occurrences) that is ON by default in Notepad++).
Note that escaping the hyphen is not necessary inside a character class when it is at the start/end of the class, and you never need to escape the hyphen outside the character class.
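A short Python sketch makes the difference between the two character classes visible. It assumes LF-only line endings and uses (?m) to mimic the multiline behavior described above. The function names are made up for illustration.

```python
import re


def cut_buggy(text: str) -> str:
    """Original pattern: [^-] also matches a line break, so a match can span two lines."""
    return re.sub(r"(?m)^([^-]{6,})-.+$", r"\1", text)


def cut_fixed(text: str) -> str:
    """Fixed pattern: \\n is excluded, so the six or more characters must sit on one line."""
    return re.sub(r"(?m)^([^\n-]{6,})-.+$", r"\1", text)


if __name__ == "__main__":
    source = "0402CS-2\n0402\n7812-C\n0603CS-1"
    print(cut_buggy(source))
    print("---")
    print(cut_fixed(source))
```

In the buggy version the run "0402", line break, "7812" counts as nine non-hyphen characters, which is why the -C is cut off.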
If I have a line of text that I want to remove from a text file in Notepad, and it is always formatted like this
[text]:
except that the words in the text area change, what regular expression could I create to remove the whole section with the search and replace function in Notepad?
To delete the entire line starting with [any text]: you can use: ^[\t ]*\[.*?\]:.*?\r\n
Explanation:
^ ... start search at beginning of a line (in this case).
[\t ]* ... find 0 or more tabs or spaces.
\[ ... find the opening square bracket as literal character.
.*? ... find 0 or more characters except the newline characters carriage return and line-feed, non-greedy, which means as few characters as possible to get a positive match, i.e. stop matching on the first occurrence of the following ] in the search expression.
\]: ... find the closing square bracket as literal character and a colon.
.*?\r\n ... find 0 or more characters except the new line characters and finally also the carriage return and line-feed terminating the line.
The search string ^[\t ]*\[.*?\]:.*?$ would also find the complete line, but without matching the line termination.
The replace string is an empty string for both search strings.
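As a rough check of the line-deleting expression, here is a Python sketch. It deviates from the editor pattern in one labeled way: \r\n is relaxed to \r?\n so LF-only files also work. The name drop_label_lines is hypothetical.

```python
import re

# Same structure as the search string above; \r?\n instead of \r\n is an
# adaptation for this sketch so both CRLF and LF endings are handled.
LABEL_LINE = re.compile(r"(?m)^[\t ]*\[.*?\]:.*?\r?\n")


def drop_label_lines(text: str) -> str:
    """Delete every line that starts with [any text]: including its line ending."""
    return LABEL_LINE.sub("", text)


if __name__ == "__main__":
    print(repr(drop_label_lines("[intro]:\r\nkeep\r\n  [x y]: tail\r\nend")))
```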
If by removing the entire section, you mean remove the [text]: up to the next [otherText]:, you can try this:
\[text\]:((?!\[[^\]]*\]:).)*
Remember to set the flag for ". matches newline".
This regex basically first matches your section title. Then, it would start matching right after this title and for each character, it uses a negative lookahead to check if the string following this character looks like a section title. If it does the matching is terminated.
Note: Remember that this regex would replace all occurrences of the matched pattern. In other words, if you have more than one of that section, they are both replaced.
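The lookahead-per-character technique can be tried in Python as sketched below. re.DOTALL plays the role of the ". matches newline" flag. The section names in the sample and the function name remove_text_section are invented for the example.

```python
import re

# After the literal title, consume one character at a time, but only while
# the text at that position does not look like another [title]: header.
SECTION = re.compile(r"\[text\]:((?!\[[^\]]*\]:).)*", re.DOTALL)


def remove_text_section(doc: str) -> str:
    """Remove every [text]: section up to the next [something]: header."""
    return SECTION.sub("", doc)


if __name__ == "__main__":
    sample = "[a]:\n1\n[text]:\nfoo\nbar\n[b]:\n2\n"
    print(repr(remove_text_section(sample)))
```

If the section is the last one in the file, the match simply runs to the end of the text.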
I want to remove trailing white spaces and tabs from my code without removing empty lines.
I tried:
\s+$
and:
([^\n]*)\s+\r\n
But they all removed empty lines too. I guess \s matches end-of-line characters too.
UPDATE (2016):
Nowadays I automate such code cleaning by using Sublime's TrailingSpaces package, with custom/user setting:
"trailing_spaces_trim_on_save": true
It highlights trailing white spaces and automatically trims them on save.
Try just removing trailing spaces and tabs:
[ \t]+$
To remove trailing whitespace while also preserving whitespace-only lines, you want the regex to only remove trailing whitespace after non-whitespace characters. So you need to first check for a non-whitespace character. This means that the non-whitespace character will be included in the match, so you need to include it in the replacement.
Regex: ([^ \t\r\n])[ \t]+$
Replacement: \1 or $1, depending on the IDE
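The two variants behave differently on whitespace-only lines, which the following Python sketch illustrates. It assumes LF line endings (Python's $ in multiline mode only matches before \n), and the function names are made up.

```python
import re


def trim_simple(text: str) -> str:
    """[ \\t]+$ : strips trailing blanks everywhere, emptying whitespace-only lines."""
    return re.sub(r"(?m)[ \t]+$", "", text)


def trim_keep_blank(text: str) -> str:
    """([^ \\t\\r\\n])[ \\t]+$ : strips only after a non-whitespace character."""
    return re.sub(r"(?m)([^ \t\r\n])[ \t]+$", r"\1", text)


if __name__ == "__main__":
    sample = "a  \n   \nb\t\n"
    print(repr(trim_simple(sample)))
    print(repr(trim_keep_blank(sample)))
```

Neither version removes a line; the second one additionally leaves the contents of whitespace-only lines as they were.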
The platform is not specified, but in C# (.NET) it would be:
Regular expression (presumes the multiline option - the example below uses it):
[ \t]+(\r?$)
Replacement:
$1
For an explanation of "\r?$", see Regular Expression Options, Multiline Mode (MSDN).
Code example
This will remove all trailing spaces and all trailing TABs in all lines:
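using System.Text.RegularExpressions;

string inputText = " Hello, World! \r\n" +
                   " Some other line\r\n" +
                   " The last line ";
// Multiline makes $ match at the end of every line; the optional \r is
// captured and written back so CRLF line endings stay intact.
string cleanedUpText = Regex.Replace(inputText,
                                     @"[ \t]+(\r?$)", @"$1",
                                     RegexOptions.Multiline);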
Regex to find trailing and leading whitespaces:
^[ \t]+|[ \t]+$
If using Visual Studio 2012 and later (which uses .NET regular expressions), you can remove trailing whitespace without removing blank lines by using the following regex
Replace (?([^\r\n])\s)+(\r?\n)
With $1
Some explanation
The reason you need the rather complicated expression is that the character class \s matches spaces, tabs and newline characters, so \s+ will match a group of lines containing only whitespace. It doesn't help adding a $ termination to this regex, because this will still match a group of lines containing only whitespace and newline characters.
You may also want to know (as I did) exactly what the (?([^\r\n])\s) expression means. This is an Alternation Construct, which effectively means match to the whitespace character class if it is not a carriage return or linefeed.
Alternation constructs normally have a true and false part,
(?( expression ) yes | no )
but in this case the false part is not specified.
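Python's re module has no expression-based conditional like this, so the sketch below swaps in a different but equivalent-in-intent technique: the negated class [^\S\r\n], which means "whitespace, but not CR or LF". The function name is hypothetical.

```python
import re

# [^\S\r\n] = any whitespace character except carriage return and line feed.
# This replaces the .NET construct (?([^\r\n])\s), which Python does not support.
TRAILING = re.compile(r"[^\S\r\n]+(\r?\n)")


def strip_trailing_keep_lines(text: str) -> str:
    """Remove trailing horizontal whitespace while keeping every line break."""
    return TRAILING.sub(r"\1", text)


if __name__ == "__main__":
    print(repr(strip_trailing_keep_lines("a \t\r\n\r\n  \r\nb\r\n")))
```

The captured line ending is written back unchanged, so the number of lines stays the same.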
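[ \t]+$ with an empty replace works. Note that | is a literal inside a character class, so [ |\t]+$ would also strip trailing pipe characters.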
\s+($) with a $1 replace also works, at least in Visual Studio Code...
To remove trailing white space while ignoring empty lines I use positive look-behind:
(?<=\S)\s+$
The look-behind is the way to go to exclude the non-whitespace (\S) from the match.
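One caveat is worth checking when trying the look-behind approach: in engines where \s also matches line breaks (Python's re is used here as an example), \s+ can reach past the end of the line and swallow a following blank line. The sketch below compares it with a [ \t] variant; both function names are invented for illustration, and LF line endings are assumed.

```python
import re


def trim_lookbehind_s(text: str) -> str:
    """(?<=\\S)\\s+$ : \\s may also consume line breaks after the trailing blanks."""
    return re.sub(r"(?m)(?<=\S)\s+$", "", text)


def trim_lookbehind_blank(text: str) -> str:
    """(?<=\\S)[ \\t]+$ : limited to spaces and tabs, so line breaks are never touched."""
    return re.sub(r"(?m)(?<=\S)[ \t]+$", "", text)


if __name__ == "__main__":
    sample = "a  \n\nb"
    print(repr(trim_lookbehind_s(sample)))
    print(repr(trim_lookbehind_blank(sample)))
```

Restricting the class to spaces and tabs keeps the benefit of the look-behind (no capture group needed) without risking the empty lines.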
To remove any blank trailing spaces use this:
\n|^\s+\n
I tested in the Atom and Xcode editors.
In Java:
String str = " hello world ";
// prints "hello world"
System.out.println(str.replaceAll("^(\\s+)|(\\s+)$", ""));
You can simply use it like this:
var regex = /( )/g;