Regex to match only specific lines in a file

I have a file containing lines like:
13
13-55
some text 11
I want to create a regex that matches only the first two types of lines, but not the last one.
The regex I came up with is [0-9\-]+

You have to specify that you are testing from the beginning to the end of the string.
Try the following regex:
^[0-9-]+$

Try using anchors (^ and $ to denote beginning and end of string respectively) and use the multiline option (this one depends on the language/engine/environment of the regex).
^[0-9-]+$
Note, you can drop the backslash for the - if it's at the beginning or end of a character class.
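As a rough sketch of the anchored pattern in practice, assuming Python's re module and that the whole file has been read into one string (the variable names are made up for illustration):

```python
import re

# Sample text from the question; in practice this would be the file contents.
text = "13\n13-55\nsome text 11\n"

# re.MULTILINE makes ^ and $ match at the start and end of every line,
# not just at the start and end of the whole string.
line_pattern = re.compile(r"^[0-9-]+$", re.MULTILINE)

matching_lines = line_pattern.findall(text)
print(matching_lines)
```

Without the anchors, the same character class would also pick up the trailing 11 in the third line, which is the behaviour the question is trying to avoid.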

If you want to match lines that start with a number and contain only digits and hyphens:
^[0-9-]+$

Related

Is there a Regex code to combine the codes I have to lookabove and look between a line?

Let's say this is my test string:
XXX 3.14 QQQ
XXX 3.14 QQQ
YYY
I would like to have the second 3.14 as the exact match. I think this would require me to combine the next two code lines:
((?<=XXX ).*(?= QQQ)) which selects both 3.14.
.*\n(?=((.*\n){1})(YYY)) which selects the full second line.
However, when I use ((?<=XXX ).*(?= QQQ)).*\n(?=((.*\n){1})(YYY)) the exact match is the second "3.14" and "QQQ".
Any help with matching only the second 3.14 using these patterns would be very much appreciated.
Thank you
If this is really your string and not just a part of it, you could make use of \A, the very beginning of a string:
^(?<!\A)XXX(.+?)QQQ$
See a demo on regex101.com.
It might be easier to specify your programming language and to use the corresponding array location in the found matches (e.g. results[1]).
If you want to match the second number, you could also match the exact content instead of using lookarounds and use a single capturing group for the second value.
^XXX \d+\.\d+ QQQ\r?\nXXX (\d+\.\d+) QQQ\r?\nYYY\b
^ Start of string
XXX \d+\.\d+ QQQ match the first line
\r?\n Match a newline
XXX (\d+\.\d+) QQQ Match the second line and capture the number in a capture group
\r?\nYYY\b Match a newline, YYY and word boundary
Regex demo
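A minimal sketch of the capture-group approach, assuming Python's re module. The second number is changed to 2.71 here (not part of the original question) only so that it is obvious which line the capture came from:

```python
import re

# Test string modelled on the question; 2.71 is an illustrative substitute.
text = "XXX 3.14 QQQ\nXXX 2.71 QQQ\nYYY"

# No flags needed: ^ matches at the start of the string, and the
# three lines are matched literally, one after another.
pattern = re.compile(
    r"^XXX \d+\.\d+ QQQ\r?\nXXX (\d+\.\d+) QQQ\r?\nYYY\b"
)

match = pattern.search(text)
# Group 1 holds the number from the second line.
second_number = match.group(1) if match else None
print(second_number)
```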
You were on the right track; the following should work:
(?<=^XXX )[^\n]*(?= QQQ$)(?!.*^XXX [^\n]* QQQ$)
This matches everything (except a newline) between <start of line>XXX and QQQ<end of line>, provided it is not followed by another sequence of ^XXX [^\n]* QQQ$.
For the above regex to function appropriately you'll need to set the multiline flag (m) to let ^ and $ match the beginning and ending of a line, rather than the string. You also need to set the single line/dot all flag (s).
If you don't care about the XXX being adjacent to the start of the line and QQQ being adjacent to the end of the line you can leave out the ^ and $ anchors and don't have to set the multiline flag. This version would look like:
(?<=XXX )[^\n]*?(?= QQQ)(?!.*XXX [^\n]* QQQ)
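To see the lookaround version with both flags set, here is a small sketch assuming Python's re module, where the m flag is re.MULTILINE and the s flag is re.DOTALL. As an illustrative change, the second value is 2.71 so the two candidates can be told apart:

```python
import re

# Test string modelled on the question; 2.71 is an illustrative substitute.
text = "XXX 3.14 QQQ\nXXX 2.71 QQQ\nYYY"

# MULTILINE: ^ and $ match at line boundaries.
# DOTALL: the .* inside the negative lookahead may cross newlines,
# so it can "see" any later XXX ... QQQ line.
last_value = re.compile(
    r"(?<=^XXX )[^\n]*(?= QQQ$)(?!.*^XXX [^\n]* QQQ$)",
    re.MULTILINE | re.DOTALL,
)

matches = last_value.findall(text)
print(matches)
```

The first line is rejected because another XXX ... QQQ line follows it, so only the value from the last such line is returned.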

Regex match line containing string

I'm trying to create a regex that will select an entire line where it contains a matching string.
I can't seem to get it to work. Here is the expression:
^.*?(\bEventname 2\b).*$
You can see the test case and what I've tried here:
https://www.regex101.com/r/mT5rZ3/1
Here's what I use, and it works perfectly for me:
^.*substring.*$
This answer solves the question with 463 steps instead of 952 steps. Just ensure a new line at the end of the file.
.*Eventname 2.*\n
https://www.regex101.com/r/mT5rZ3/5
EDIT 4-9-2022
With .*Eventname 2.*\n? it also solves with 463 steps, but there is no need to ensure a new line at the end of the file.
If you are using PHP regex, . doesn't match newlines by default. So
.*(\bEventname 2\b).*
would be enough. If . matches newline you would need *? to make the dots non-greedy (so it just matches one line, instead of everything). You also need to be in multi-line mode to use ^ and $, but that shouldn't be necessary (since you only want to match one line anyway).
Try this:
(.*(?:Eventname 2).*)
Explanation:
( ... ) : groups and captures the line
(?:...) : groups without capturing the string that the line needs to contain
.* : any characters
You are using a string containing several lines. By default, the ^ and $ operators will match the beginning and end of the whole string. The m modifier will cause them to match the beginning and end of a line.
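A short sketch of the multiline modifier at work, assuming Python's re module. The log text below is invented for illustration and is not the data from the linked test case:

```python
import re

# Hypothetical multi-line input.
log = (
    "Eventname 1 started\n"
    "Eventname 2 started\n"
    "Eventname 3 started\n"
    "Eventname 2 finished\n"
)

# With re.MULTILINE, ^ and $ anchor to each line, and since . does not
# match a newline by default, each match is exactly one whole line.
pattern = re.compile(r"^.*\bEventname 2\b.*$", re.MULTILINE)

lines = pattern.findall(log)
print(lines)
```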

Add to end of line that contains a specific word and starts with x

I would like to add some custom text to the end of all lines in my document opened in Notepad++ that start with 10 and contain a specific word (for example "frog").
So far, I managed to solve the first part.
Search: ^(10)$
Replace: \1;Batteries (to add ;Batteries to the end of the line)
What I need now is to edit this regex pattern to recognize only those lines that also contain a specific word.
For example:
Before: 1050;There is this frog in the lake
After: 1050;There is this frog in the lake;Batteries
You can use the regex to match your wanted lines:
(^(10).*?(frog).*)
The .*? is a lazy quantifier that matches as little as possible up to frog.
Replace with:
$1;Battery
Hope it helps,
You should allow any characters between the number and the end of line:
^10.*frog.*
And replacement will be $0;Batteries. You do not even need a $ anchor as .* matches till the end of a line since . matches any character but a line break char.
NOTE: There is no need to wrap the whole pattern with capturing parentheses, the $0 placeholder refers to the whole match value.
More details:
^ - start of a line
10 - a literal 10 text
.* - zero or more chars other than line break chars as many as possible
frog - a literal string
.* - zero or more chars other than line break chars as many as possible
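Outside Notepad++, the same search and replace can be sketched with Python's re.sub. Two assumptions to note: Python writes the whole-match placeholder as \g<0> rather than $0, and the sample lines other than the first are invented for illustration:

```python
import re

# First line is from the question; the other two are hypothetical.
document = (
    "1050;There is this frog in the lake\n"
    "1051;No amphibians here\n"
    "2050;Another frog\n"
)

# re.MULTILINE lets ^ match at the start of every line.
# \g<0> re-inserts the entire matched line before the appended text.
updated = re.sub(
    r"^10.*frog.*", r"\g<0>;Batteries", document, flags=re.MULTILINE
)
print(updated)
```

Only the first line changes: the second starts with 10 but has no frog, and the third has a frog but does not start with 10.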
Try this:
find with: (^(10).*(frog).*)
replace with: $1;Battery
Use ^(10.*frog.*)$ as regex. Replace it with something like $1;Batteries

Regular expression to match last line break in file

In my quest to learn flex I'm having a scanner echo input adding line numbers.
After every line I display a counter and increment it.
Trouble is there is always a lone line number at the end of the display.
I need a regex that will ignore all line breaks except for the last one.
I tried [\n/<<EOF>>] to no avail.
Any thoughts?
I don't know what regex engine flex uses, but you can use this regex:
\z
Working demo
\z assert position at the very end of the string.
Matches the end of a string only. Unlike $, this is not affected by
multiline mode, and, in contrast to \Z, will not match before a
trailing newline at the end of a string.
If above regex doesn't work then you can use this one:
(?<=[\S\s])$
Working demo
Edit: since flex seems to work slightly different than other regex engines you could use this regex:
[\s\S]$
This gets the last character of each line. You can then iterate over all the lines until you reach the last one. Here is an online flex regex engine tool:
http://ryanswanson.com/regexp/#start
Try the regex below. It searches for a newline character at the end of the line.
\n$
Have you tried simply doing:
\n$
Debuggex Demo
The \n matches the newline, the $ matches end of string.
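The idea of "a newline followed by the end of the input" can be checked outside flex. This is only a sketch in Python's re module, not flex syntax: Python spells the absolute end-of-string anchor \Z, and the sample text is made up:

```python
import re

# Hypothetical two-line input ending in a newline.
text = "first line\nsecond line\n"

# \n\Z matches a newline only when it is the very last character;
# the newline after "first line" is skipped.
last_break = re.search(r"\n\Z", text)

position = last_break.start() if last_break else None
all_breaks = [m.start() for m in re.finditer(r"\n", text)]
print(position, all_breaks)
```

Whether flex itself accepts an equivalent anchor is a separate question; the sketch only shows what the pattern is meant to select.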

what can be the regex for the following string

I am doing this in groovy.
Input:
hip_abc_batch hip_ndnh_4_abc_copy_from_stgig abc_copy_from_stgig
hiv_daiv_batch hip_a_de_copy_from_staging abc_a_de_copy_from_staging
I want to get the last column. basically anything that starts with abc_.
I tried the following regex (it works for one line but not the other):
\abc_.*\
but that gives me everything after abc_batch
I am looking for a regex that will fetch me anything that starts with abc_
but I can not use \^abc_.*\ since the whole string does not start with abc_
It sounds like you're looking for "words" (i.e., sequences that don't include spaces) that begin with abc_. You might try:
/\babc_.*\b/
The \b means (in some regular expression flavors) "word boundary."
Try this:
/\s(abc_.*)$/m
Here is a commented version so you can understand how it works:
\s # match one whitespace character
(abc_.*) # capture a string that starts with "abc_" and is followed
# by any character zero or more times
$ # match the end of the string
Since the regular expression has the "m" switch it will be a multi-line expression. This allows the $ to match the end of each line rather than the end of the entire string itself.
You don't need to trim the whitespace, as the first capture group contains just the text. After a cursory scan of this tutorial I believe this is the way to grab the value of a capture group using Groovy:
matcher = (yourString =~ /\s(abc_.*)$/m)
// this is how you would extract the value from
// the matcher object
matcher[0][1]
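For comparison, here is the same pattern sketched in Python rather than Groovy, assuming the two input lines from the question are held in one string. With a single capture group, re.findall returns just the captured text for each line:

```python
import re

# The two input lines from the question.
text = (
    "hip_abc_batch hip_ndnh_4_abc_copy_from_stgig abc_copy_from_stgig\n"
    "hiv_daiv_batch hip_a_de_copy_from_staging abc_a_de_copy_from_staging"
)

# \s requires whitespace before abc_, which rules out abc_ in the middle
# of a word such as hip_abc_batch; re.MULTILINE lets $ match at the end
# of each line.
pattern = re.compile(r"\s(abc_.*)$", re.MULTILINE)

last_columns = pattern.findall(text)
print(last_columns)
```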
I think you are looking for this: \s(abc_[a-zA-Z_]*)$
If you are using Perl and you read all lines into one string, don't forget to set the m option on your regex (it stands for "Treat string as multiple lines").
Oh, and Regex Coach is your free friend.