How to check if a line is blank using regex

I am trying to write a simple regex that checks whether a line is blank or not.
Cases:
" some" // not blank
" " //blank
"" // blank

The pattern you want is something like this in multiline mode:
^\s*$
Explanation:
^ is the beginning of string anchor.
$ is the end of string anchor.
\s is the whitespace character class.
* is zero-or-more repetition of the preceding element.
In multiline mode, ^ and $ also match the beginning and end of the line.
References:
regular-expressions.info/Anchors, Character Classes, and Repetition.
A non-regex alternative:
You can also check if a given string line is "blank" (i.e. containing only whitespaces) by trim()-ing it, then checking if the resulting string isEmpty().
In Java, this would be something like this:
if (line.trim().isEmpty()) {
    // line is "blank"
}
The regex solution can also be simplified without anchors (because of how matches is defined in Java) as follows:
if (line.matches("\\s*")) {
    // line is "blank"
}
API references
String String.trim()
Returns a copy of the string, with leading and trailing whitespace omitted.
boolean String.isEmpty()
Returns true if, and only if, length() is 0.
boolean String.matches(String regex)
Tells whether or not this (entire) string matches the given regular expression.
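When the input is a whole block of text rather than a single line, the multiline-mode pattern can be applied with Pattern.MULTILINE. The following is only a minimal sketch: the class and method names are made up for illustration, and it deliberately uses [ \t]* instead of \s* (my assumption, not part of the answer above) because \s also matches line breaks, so a single match could otherwise run across several lines.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class BlankLineCounter {
    // Spaces and tabs only: unlike \s, this class cannot consume a line break,
    // so one match never spans two lines.
    private static final Pattern BLANK_LINE =
            Pattern.compile("^[ \\t]*$", Pattern.MULTILINE);

    // Counts the lines that are empty or contain only spaces and tabs.
    static int countBlankLines(String text) {
        Matcher m = BLANK_LINE.matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Four lines: "some", " ", "" and "more".
        String text = "some\n \n\nmore";
        System.out.println(countBlankLines(text));
    }
}
```

For a single line that has already been read, line.trim().isEmpty() or line.matches("\\s*") as shown above remain the simpler choice.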

Actually in multiline mode a more correct answer is this:
/((\r\n|\n|\r)$)|(^(\r\n|\n|\r))|^\s*$/gm
The accepted answer, ^\s*$, does not match when the last line is blank (in multiline mode).

Try this:
^\s*$

Full credit to bchr02 for this answer. However, I had to modify it a bit to handle lines that end with */ (end of comment) and are followed by an empty line. The regex was matching the non-empty line containing */.
New: /(^(\r\n|\n|\r)$)|(^(\r\n|\n|\r))|^\s*$/gm
All I did was add ^ right after the first opening parenthesis to anchor that alternative to the start of the line.

The most portable regex would be ^[ \t\n]*$ to match an empty string (note that you would need to replace \t and \n with tab and newline accordingly) and [^ \n\t] to match a non-whitespace string.

It depends on what you mean by blank:
a line consisting entirely of whitespace, or a line containing nothing at all.
If you want to match a line which contains nothing, then use '/^$/'.

Somehow none of the answers from here worked for me when I had strings which were filled just with spaces and occasionally strings having no content (just the line terminator), so I used this instead:
if (str.trim().isEmpty()) {
    doSomethingWhenWhiteSpace();
}

Well... I tinkered around (using Notepad++) and this is the solution I found
\n\s
\n for end of line (where you start matching) -- the caret would not be of help in my case as the beginning of the row is a string
\s takes any space till the next string
hope it helps

This regex will delete all blank spaces, empty lines, and empty tabs from a file
\n\s*

Related

Regex for matching every line enclosed in curly brackets

I'm trying to match every single line within curly brackets, and I'm struggling to capture what I want. To give an example, if I have this text:
{
this is a line,
this = another line,
this is the third line!
this is, indeed, another line
},
round two: {
we're now on the second pair of brackets,
and this is the final line.
}
Then I want to match and capture a total of six lines:
this is a line,
this = another line,
this is the third line!
this is, indeed, another line
we're now on the second pair of brackets,
and this is the final line.
So far my current idea is trying to match "curly bracket" -> "anything" -> "line" -> "anything" -> "curly bracket", i.e. something like this:
{(?s)[^}]*(^([^}^\n]+)$)(?s)[^}]*}
But that only matches one line per pair of curly brackets, rather than every line.
How would I go about doing this? Thanks.
EDIT: Updated the example to include preceding text before one of the opening curly braces and varying whitespace.
Just match lines that don't contain a brace:
^[^{}\r\n]+$
The multiline flag is to be set (/m). Alternatively, insert (?m) at the beginning of the regex.
The regex reads, "match the beginning of the line followed by one or more characters other than {, }, \r and \n, followed by the end of the line".
To exclude leading spaces in each matched line you can modify the regex slightly:
^\s*\K[^{}\r\n]+$
\K resets the starting point of the match, excluding any previously-consumed characters. \K is not available with all regex engines.
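As a rough illustration of the first pattern, here is a small Java sketch that collects every line containing no brace. The class and method names are hypothetical. Java's regex engine does not support \K, so the sketch uses the plain ^[^{}\r\n]+$ form. Note that, like the regex itself, it also returns any brace-free line that happens to sit outside a pair of brackets.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class BraceFreeLines {
    // MULTILINE makes ^ and $ match at the start and end of each line.
    private static final Pattern LINE =
            Pattern.compile("^[^{}\\r\\n]+$", Pattern.MULTILINE);

    // Returns every non-empty line that contains neither '{' nor '}'.
    static List<String> extract(String text) {
        List<String> lines = new ArrayList<>();
        Matcher m = LINE.matcher(text);
        while (m.find()) {
            lines.add(m.group());
        }
        return lines;
    }

    public static void main(String[] args) {
        String text = String.join("\n",
                "{",
                "this is a line,",
                "this = another line,",
                "},",
                "round two: {",
                "and this is the final line.",
                "}");
        for (String line : extract(text)) {
            System.out.println(line);
        }
    }
}
```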
Assuming input is well formed:
([^{\n](?=[^{]+}))+
See live demo

Cut lines using Notepad++ Regexp replace

I need to cut lines that have 6 or more characters, hyphen, then other characters or symbols. Hyphen and rest of line should be removed. Source text:
0402CS-2
0402CS-3
0402
7812-C
0603CS-1
0603CS-2
0603CS-3
As a result, I need this:
0402CS
0402CS
0402
7812-C
0603CS
0603CS
0603CS
To do that, I use Notepad++ regexp replace feature. Find pattern: ^([^\-]{6,})\-.+$ Replace pattern: \1
But there is no "multiline" option, so the symbols "^" and "$" don't match ONLY the beginning and end of a line, and this is the result I actually get:
0402CS
0402CS
0402
7812 <-- that's wrong!
0603CS
0603CS
0603CS
Please advise me how to fix the find pattern. Or maybe there is another handy and powerful free text editor that can do that?
^([^\n\-]{6,})\-.+$
    ^^
Just add \n to the character class: because [^-] also matches newlines, the regex can run on into the line below and use that line to complete a match.
See demo.
https://regex101.com/r/BHO93c/1
for the input
0402
7812-C the regex matches both lines as 1 line and makes a match.
See demo if 0402 is not there.
https://regex101.com/r/BHO93c/2
That happens because the [^-] character class also matches a newline.
Add \n to it:
^([^\n-]{6,})-.+$
See the regex online demo (note the m multiline modifier (making ^ match the start of the line, and $ - the end of the line) and g modifier (enabling search for multiple occurrences) that is ON by default in Notepad++).
Note that escaping the hyphen is not necessary inside a character class when it is at the start/end of the class, and you never need to escape the hyphen outside the character class.
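The same effect can be reproduced outside Notepad++. The sketch below uses Java's String.replaceAll with the (?m) inline flag. The class and method names are invented for illustration, and the "broken" variant is included only to show how the negated class swallows a line break.

```java
// Hypothetical helper, not part of any library.
class HyphenSuffixTrimmer {
    // Fixed pattern: \n is excluded from the class, so a match stays on one line.
    static String trimFixed(String text) {
        return text.replaceAll("(?m)^([^\\n-]{6,})-.+$", "$1");
    }

    // Original pattern: [^-] also matches '\n', so "0402\n7812" counts as
    // "6 or more non-hyphen characters" and the "-C" on the next line is cut.
    static String trimBroken(String text) {
        return text.replaceAll("(?m)^([^-]{6,})-.+$", "$1");
    }

    public static void main(String[] args) {
        String input = "0402CS-2\n0402CS-3\n0402\n7812-C\n0603CS-1";
        System.out.println(trimFixed(input));
        System.out.println("---");
        System.out.println(trimBroken(input));
    }
}
```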

Add to end of line that contains a specific word and starts with x

I would like to add some custom text to the end of all lines in my document opened in Notepad++ that start with 10 and contain a specific word (for example "frog").
So far, I managed to solve the first part.
Search: ^(10)$
Replace: \1;Batteries (to add ;Batteries to the end of the line)
What I need now is to edit this regex pattern to recognize only those lines that also contain a specific word.
For example:
Before: 1050;There is this frog in the lake
After: 1050;There is this frog in the lake;Batteries
You can use the regex to match your wanted lines:
(^(10).*?(frog).*)
the .*? is a lazy quantifier that matches as little as possible up to frog
and replace with:
$1;Batteries
Hope it helps,
You should allow any characters between the number and the end of line:
^10.*frog.*
And replacement will be $0;Batteries. You do not even need a $ anchor as .* matches till the end of a line since . matches any character but a line break char.
NOTE: There is no need to wrap the whole pattern with capturing parentheses, the $0 placeholder refers to the whole match value.
More details:
^ - start of a line
10 - a literal 10 text
.* - zero or more chars other than line break chars as many as possible
frog - a literal string
.* - zero or more chars other than line break chars as many as possible
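For readers who want to try this outside Notepad++, here is a minimal Java sketch of the same search and replace. The class and method names are hypothetical; Java's replaceAll also accepts $0 as a reference to the whole match, and (?m) makes ^ match at the start of every line.

```java
// Hypothetical helper, not part of any library.
class LineSuffixAppender {
    // Appends ";Batteries" to every line that starts with "10" and contains "frog".
    // '.' does not match line breaks by default, so each match stays on one line.
    static String appendToFrogLines(String text) {
        return text.replaceAll("(?m)^10.*frog.*", "$0;Batteries");
    }

    public static void main(String[] args) {
        String text = String.join("\n",
                "1050;There is this frog in the lake",
                "1051;No amphibians here",
                "2050;A frog, but the wrong prefix");
        System.out.println(appendToFrogLines(text));
    }
}
```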
try this
find with: (^(10).*(frog).*)
replace with: $1;Batteries
Use ^(10.*frog.*)$ as regex. Replace it with something like $1;Batteries

Regular Expression to search and replace in notepad ++

If I have a line of text that I want to remove from a text file in Notepad++ and it is always formatted like this
[text]:
except that the words in the text area change, what regular expression could I create to remove the whole section with the search-and-replace function in Notepad++?
To delete the entire line starting with [any text]: you can use: ^[\t ]*\[.*?\]:.*?\r\n
Explanation:
^ ... start search at beginning of a line (in this case).
[\t ]* ... find 0 or more tabs or spaces.
\[ ... find the opening square bracket as literal character.
.*? ... find 0 or more characters except the newline characters carriage return and line feed, non-greedily, which means as few characters as possible to get a positive match, i.e. stop matching at the first occurrence of the following ] in the search expression.
\]: ... find the closing square bracket as literal character and a colon.
.*?\r\n ... find 0 or more characters except the newline characters, and finally the carriage return and line feed terminating the line.
The search string ^[\t ]*\[.*?\]:.*?$ would also find the complete line, but without matching the line termination.
The replace string is for both search strings an empty string.
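The line-deleting search can be tried in Java as well. This is only a sketch with invented names, and it makes one change that is my own assumption rather than part of the answer: the line ending is written as \r?\n so that files with Unix line endings are handled too.

```java
// Hypothetical helper, not part of any library.
class SectionHeaderRemover {
    // Deletes every line of the form "[any text]:..." including its line ending.
    // \r?\n (instead of the \r\n used above) also accepts Unix line endings.
    static String removeHeaderLines(String text) {
        return text.replaceAll("(?m)^[\\t ]*\\[.*?\\]:.*?\\r?\\n", "");
    }

    public static void main(String[] args) {
        String text = "[header]:\nkeep me\n  [other]: trailing\nlast\n";
        System.out.print(removeHeaderLines(text));
    }
}
```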
If by removing the entire section, you mean remove the [text]: up to the next [otherText]:, you can try this:
\[text\]:((?!\[[^\]]*\]:).)*
Remember to set the flag for ". matches newline".
This regex basically first matches your section title. Then, it would start matching right after this title and for each character, it uses a negative lookahead to check if the string following this character looks like a section title. If it does the matching is terminated.
Note: Remember that this regex would replace all occurrences of the matched pattern. In other words, if you have more than one of that section, they are both replaced.

Regular expression to match characters at beginning of line only

I am trying to work on regular expressions. I have a mainframe file which has several fields. I have a flat file parser which distinguishes several types of records based on the first three letters of every line. How do I write a regular expression where the first three letters are 'CTR'.
Beginning of line or beginning of string?
Start and end of string
/^CTR.*$/
/ = delimiter
^ = start of string
CTR = literal CTR
$ = end of string
.* = zero or more of any character except newline
Start and end of line
/^CTR.*$/m
/ = delimiter
^ = start of line
CTR = literal CTR
$ = end of line
.* = zero or more of any character except newline
m = enables multi-line mode, this sets regex to treat every line as a string, so ^ and $ will match start and end of line
While in multi-line mode you can still match the start and end of the string with the \A and \Z anchors, which are not affected by the m flag
/\ACTR.*\Z/m
\A = means start of string
CTR = literal CTR
.* = zero or more of any character except newline
\Z = end of string
m = enables multi-line mode
As such, another way to match the start of the line would be like this:
/(\A|\r|\n|\r\n)CTR.*/
or
/(^|\r|\n|\r\n)CTR.*/
\r = carriage return / old Mac OS newline
\n = line-feed / Unix/Mac OS X newline
\r\n = windows newline
Note, if you are going to use the backslash \ in some program string that supports escaping, like the php double quotation marks "" then you need to escape them first
so to run \r\nCTR.* you would use it as "\\r\\nCTR.*"
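To make the multi-line case concrete, here is a small Java sketch that pulls every line starting with CTR out of a block of text. The class and method names and the sample records are made up for illustration. Pattern.MULTILINE plays the role of the m flag, and the doubled backslashes rule mentioned above applies to Java string literals as well.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class CtrRecordFinder {
    // With MULTILINE, ^ and $ match at the start and end of every line.
    private static final Pattern CTR_LINE =
            Pattern.compile("^CTR.*$", Pattern.MULTILINE);

    // Returns all lines of the text whose first three characters are "CTR".
    static List<String> findCtrLines(String text) {
        List<String> records = new ArrayList<>();
        Matcher m = CTR_LINE.matcher(text);
        while (m.find()) {
            records.add(m.group());
        }
        return records;
    }

    public static void main(String[] args) {
        String file = "CTR001 a\nHDR002 b\nCTR003 c\nXCTR no";
        for (String record : findCtrLines(file)) {
            System.out.println(record);
        }
    }
}
```

If the parser already reads one line at a time, line.startsWith("CTR") does the same job without a regex.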
^CTR
or
^CTR.*
edit:
To be more clear: ^CTR will match the start of a line followed by those characters. If all you want to do is test a line itself (and you already have the line in hand), then that is all you really need. But if this is the case, you may be better off using a prefab substr()-type function. I don't know what language you are using. But if you are trying to match and grab the whole line, you will need something like .* or .*$, depending on what language/regex function you are using.
Regex symbol to match at beginning of a line:
^
Add the string you're searching for (CTR) to the regex like this:
^CTR
That should be enough!
However, if you need to get the text from the whole line in your language of choice, add a "match anything" pattern .*:
^CTR.*
If you want to get crazy, use the end of line matcher
$
Add that to the growing regex pattern:
^CTR.*$
Note: Depending on how and where you're using regex, you might have to use a multi-line modifier to get it to match multiple lines. There could be a whole discussion on the best strategy for picking lines out of a file to process them, and some of the strategies would require this:
Multi-line flag m (this is specified in various ways in various languages/contexts)
/^CTR.*/gm
Example: we had to use m on regex101
Try ^CTR.*, which literally means start of line, CTR, anything.
This will be case-sensitive. How to make it case-insensitive depends on your programming language; alternatively, use ^[Cc][Tt][Rr].* if cross-environment case-insensitivity matters.
^CTR.*$
matches a line starting with CTR.
Not sure how to apply that to your file on your server, but typically, the regex to match the beginning of a string would be :
^CTR
The ^ means beginning of string / line
There are ambiguities in the question.
What is your input string? Is it the entire file? Or is it 1 line at a time? Some of the answers are assuming the latter. I want to answer the former.
What would you like to return from your regular expression? Just a true / false on whether a match was made? Or do you want to extract the entire line that begins with CTR? I'll assume you only want a true / false match.
To do this, we just need to determine if the CTR occurs at either the start of a file, or immediately following a new line.
/(?:^|\n)CTR/
(?i)^[ \r\n]*CTR
(?i) -- case insensitive -- Remove if case sensitive.
[ \r\n] -- ignore space and new lines
* -- 0 or more times the same
CTR -- the string the line must start with.