I have a multiline text field and need to test if each line matches a pattern.
The field might look like this:
1xABCD
9xDEFGHIJK
7xAJDKSLD
2xA
The pattern is this: \dx\w.*
The number of lines is from 1 to X.
I was trying ^\d+x\w.*${1,} or \d+x\w.*\r\n{1,}
Thank you
You may use
^\d+x\w+(?:\r?\n\d+x\w+)*$
Details
^ - start of string
\d+x\w+ - 1+ digits, x and then 1+ word chars (letters, digits or _)
(?:\r?\n\d+x\w+)* - a non-capturing group ((?:...)) that matches 0 or more (*) occurrences of:
\r?\n - an optional CR and an LF symbol
\d+x\w+ - 1+ digits, x and then 1+ word chars (letters, digits or _)
$ - end of string.
See the regex demo (note that the text pasted into regex101.com has LF-only line endings).
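As a rough illustration, here is how the whole-field check might look in JavaScript. The helper name allLinesMatch and the sample strings are made up for this sketch; only the pattern comes from the answer above.

```javascript
// Pattern from the answer; no `m` flag, so ^ and $ anchor the whole string.
const LINES_RE = /^\d+x\w+(?:\r?\n\d+x\w+)*$/;

// Hypothetical helper: true only if every line has the form <digits>x<word chars>.
function allLinesMatch(text) {
  return LINES_RE.test(text);
}

// Sample field values (mixed LF and CRLF line endings on purpose).
const good = "1xABCD\n9xDEFGHIJK\r\n7xAJDKSLD\n2xA";
const bad = "1xABCD\nDEFGHIJK\n2xA";

console.log(allLinesMatch(good));     // true
console.log(allLinesMatch(bad));      // false
console.log(allLinesMatch("1xA\n"));  // false: a trailing line break is not allowed
```

If the field may end with a line break, trimming the value before testing is probably the simplest fix.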
Is it possible to match only the letters in the following string?
RO41 RNCB 0089 0957 6044 0001 FPS21098343
What I want: FPS
What I'm trying: [0-9]{4}\s*\S+\s+(\S+)
What I get: FPS21098343
Any help is much appreciated! Thanks.
You can try with this:
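// Avoid naming a variable String: that shadows the built-in String constructor.
const input = "0258 6044 0001 FPS21098343";
// One or more 4-digit groups, the letters (Group 1), then digits up to the end.
const re = /^(?:\d{4} )+ *([a-zA-Z]+)\d+$/;
const match = re.exec(input);
console.log(match);               // the match array, or null when nothing matches
console.log(match && match[1]);   // "FPS"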
You can match up to the first one or more letters in the following way:
^[^a-zA-Z]*([A-Za-z]+)
^.*?([A-Za-z]+)
^[\w\W]*?([A-Za-z]+)
(?s)^.*?([A-Za-z]+)
If the tool treats ^ as the start of a line, replace it with \A that always matches the start of string.
The point is to match
^ / \A - start of string
[^a-zA-Z]* - zero or more chars other than letters
([A-Za-z]+) - capture one or more letters into Group 1.
The .*? part matches any text (as short as possible) before the subsequent pattern(s). (?s) makes . match line break chars.
Replace A-Za-z in all the patterns with \p{L} to match any Unicode letters. Also, note that [^\p{L}] = \P{L}.
To match every run of consecutive letters anywhere in the string, you can simply use:
([a-zA-Z]+)
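A small JavaScript sketch of these patterns follows. The function names are invented for illustration, and JavaScript needs the `u` flag for \p{L}; other engines may differ.

```javascript
const input = "RO41 RNCB 0089 0957 6044 0001 FPS21098343";

// First run of letters: skip leading non-letters, capture the letters.
function firstLetters(text) {
  const m = /^[^a-zA-Z]*([A-Za-z]+)/.exec(text);
  return m ? m[1] : null;
}

// Every run of consecutive letters (global flag).
function allLetterRuns(text) {
  return text.match(/[a-zA-Z]+/g) || [];
}

// Unicode-aware variant: \p{L} and \P{L} require the `u` flag in JavaScript.
function firstUnicodeLetters(text) {
  const m = /^\P{L}*(\p{L}+)/u.exec(text);
  return m ? m[1] : null;
}

console.log(firstLetters(input));                  // "RO"
console.log(firstLetters("0001 FPS21098343"));     // "FPS"
console.log(allLetterRuns(input));                 // ["RO", "RNCB", "FPS"]
console.log(firstUnicodeLetters("42 \u00dcber"));  // "Über"
```

Note that on the full sample string the first run of letters is "RO", so these patterns only return "FPS" when the input starts at the digit groups.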
You could use a capture group to get FPS:
\b[0-9]{4}\s+\S+\s+([A-Z]+)
The pattern matches:
\b[0-9]{4} A word boundary to prevent a partial match, then match 4 digits
\s+\S+\s+ Match 1+ whitespace chars, 1+ non-whitespace chars, and 1+ whitespace chars again
([A-Z]+) Capture group 1, match 1+ chars A-Z
If the chars have to be followed by digits till the end of the string, you can add \d+$ to the pattern:
\b[0-9]{4}\s+\S+\s+([A-Z]+)\d+$
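For what it's worth, both patterns can be tried in JavaScript along these lines. The helper name lettersAfterDigits is hypothetical; the input is the string from the question.

```javascript
const input = "RO41 RNCB 0089 0957 6044 0001 FPS21098343";

// Group 1 holds the uppercase letters that follow "<4 digits> <field> ".
const loose = /\b[0-9]{4}\s+\S+\s+([A-Z]+)/;
// Same, but the letters must be followed by digits up to the end of the string.
const anchored = /\b[0-9]{4}\s+\S+\s+([A-Z]+)\d+$/;

// Hypothetical helper: returns Group 1 or null when there is no match.
function lettersAfterDigits(re, text) {
  const m = re.exec(text);
  return m ? m[1] : null;
}

console.log(lettersAfterDigits(loose, input));     // "FPS"
console.log(lettersAfterDigits(anchored, input));  // "FPS"
console.log(lettersAfterDigits(anchored, "6044 0001 FPS")); // null, no trailing digits
```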
I need to extract only a number (a float) from a text, but I can't get rid of the whitespace...
Update:
I have a problem with this method. As a rule, I only need to consider the digits and ',' between '- EUR' and 'Fee'.
You can use
- EUR\W*(.*?)\W*Fee
See the regex demo.
Variations of the regex that may work depending on the regex engine (the first needs \K support, the second needs variable-length lookbehind support):
- EUR\W*\K.*?(?=\W*Fee)
(?<=- EUR\W*).*?(?=\W*Fee)
Details:
- EUR - literal text
\W* - zero or more non-word chars
(.*?) - Group 1: any zero or more chars other than line break chars, as few as possible
\W* - zero or more non-word chars
Fee - literal text.
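A minimal JavaScript sketch of the Group 1 approach is below. The question's actual input is not shown, so the sample text, the helper name extractAmount, and the assumption that ',' is the decimal separator are all mine.

```javascript
// Hypothetical sample text; the original input from the question is not shown.
const text = "Total - EUR   12,50  Fee applied";

// Hypothetical helper: returns the amount as a number, or null if nothing matched.
function extractAmount(s) {
  const m = /- EUR\W*(.*?)\W*Fee/.exec(s);
  if (!m) return null;
  // Assumption: ',' is the decimal separator, so convert it for parseFloat.
  return parseFloat(m[1].replace(",", "."));
}

console.log(extractAmount(text));              // 12.5
console.log(extractAmount("no amount here"));  // null
```

The surrounding \W* parts are what strip the whitespace on both sides of the captured value.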
You could also match the number format in capture group 1
- EUR\b\D*(\d+(?:,\d+)?)\s+Fee\b
- EUR\b Match - EUR and a word boundary
\D* Match 0+ times any char except a digit
( Capture group 1
\d+(?:,\d+)? Match 1+ digits with an optional decimal part
) Close group 1
\s+Fee\b Match 1+ whitespace chars, Fee and a word boundary
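To show the difference between matching the number format and capturing whatever sits between the two markers, here is a short JavaScript comparison. The sample strings and the capture helper are invented for this sketch.

```javascript
// Strict pattern: Group 1 must look like digits with an optional ",digits" part.
const strict = /- EUR\b\D*(\d+(?:,\d+)?)\s+Fee\b/;
// Lazy pattern: Group 1 is whatever lies between "- EUR" and "Fee".
const loose = /- EUR\W*(.*?)\W*Fee/;

// Hypothetical helper: returns Group 1 or null.
function capture(re, s) {
  const m = re.exec(s);
  return m ? m[1] : null;
}

// Made-up inputs, one with a number and one without.
const withNumber = "Total - EUR   12,50  Fee applied";
const withoutNumber = "Total - EUR n/a Fee applied";

console.log(capture(strict, withNumber));     // "12,50"
console.log(capture(loose, withNumber));      // "12,50"
console.log(capture(strict, withoutNumber));  // null
console.log(capture(loose, withoutNumber));   // "n/a"
```

The strict version is probably the safer choice when the value is later parsed as a number.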
This is working; I removed the , from (.) in the test string.
Goal
I want to retrieve only the string "14" from this message with a Logstash grok filter:
3/03/0 EE 14 GFR 20 AAA XXXXX 50 3365.00
This is my grok code:
grok {
  match => {
    field1 => [
      "(?<number_extract>\d{0}\s\d{1,3}\s{1})"
    ]
  }
}
I would like to match just the first match "14" but my Grok filter returns all matches:
14 20 50
If you need to find the first occurrence of a number that consists of 1, 2 or 3 digits only, you may use
^(?:.*?\s)?(?<number_extract>\d{1,3})(?!\S)
Details
^ - start of string
(?:.*?\s)? - an optional substring of any 0+ chars other than line break chars as few as possible, and then a whitespace (this enables a match at the start of the string if it is there)
(?<number_extract>\d{1,3}) - 1 to 3 digits
(?!\S) - a negative lookahead that makes sure there is a whitespace or end of string immediately to the right (enables a match at the end of the string).
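The pattern can be sanity-checked outside Logstash. The sketch below uses JavaScript, whose named-group syntax happens to be the same; grok itself runs on a different regex engine (Oniguruma), so treat this only as an approximation. The helper name firstShortNumber is made up.

```javascript
const line = "3/03/0 EE 14 GFR 20 AAA XXXXX 50 3365.00";

// Same pattern as above; the named group mirrors the grok field name.
const re = /^(?:.*?\s)?(?<number_extract>\d{1,3})(?!\S)/;

// Hypothetical helper: first standalone 1-3 digit number, or null.
function firstShortNumber(text) {
  const m = re.exec(text);
  return m ? m.groups.number_extract : null;
}

console.log(firstShortNumber(line));       // "14"
console.log(firstShortNumber("7 abc"));    // "7"  (number at the start of the string)
console.log(firstShortNumber("1234 56"));  // "56" (the 4-digit number is skipped)
```

Because the pattern is anchored at ^, it can match at most once per message, which is why only the first number is returned.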
Alternative solution
If you know that the number you are looking for is after a date-like field and another field, and you want to force this pre-validation, you may use
^\d+/\d+/\d+\s+\S+\s+(?<number_extract>\d+)
See the regex demo
If you do not have to check if the first field is date-like, you may simply use
^\S+\s+\S+\s+(?<number_extract>\d+)
^(?:\S+\s+){2}(?<number_extract>\d+) // Equivalent
See the regex demo here.
Details
^ - start of string
\d+/\d+/\d+ - 1+ digits, /, 1+ digits, /, 1+ digits
\s+ - 1+ whitespaces
\S+ - 1+ chars other than whitespace
\s+ - 1+ whitespaces
(?<number_extract>\d+) - Capturing group "number_extract": 1+ digits.
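Here is a quick JavaScript check of the two positional variants, again only as an approximation of what grok would do. The helper name thirdField and the second sample line are assumptions for this sketch.

```javascript
const line = "3/03/0 EE 14 GFR 20 AAA XXXXX 50 3365.00";

// Variant that requires a date-like first field.
const withDate = /^\d+\/\d+\/\d+\s+\S+\s+(?<number_extract>\d+)/;
// Variant that accepts any two leading fields.
const anyTwoFields = /^(?:\S+\s+){2}(?<number_extract>\d+)/;

// Hypothetical helper: returns the named group or null.
function thirdField(re, text) {
  const m = re.exec(text);
  return m ? m.groups.number_extract : null;
}

console.log(thirdField(withDate, line));             // "14"
console.log(thirdField(anyTwoFields, line));         // "14"
console.log(thirdField(withDate, "abc EE 14"));      // null, first field is not date-like
console.log(thirdField(anyTwoFields, "abc EE 14"));  // "14"
```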
I am trying to create a regex that accepts the following strings:
proj_asdasd_000.gz.xml
proj_asdasd.gz.xml
Basically, the 2nd underscore is optional, and if any value follows it, it must be an integer.
This is the regex I am trying:
^proj([a-zA-z0-9]?)+_[a-zA-z]+(_[0-9]?)+\.[a-z]+.[a-z]
Any suggestion to make it accept the above mentioned strings?
You may use
^proj[a-zA-Z0-9]*_[a-zA-Z]+(?:_[0-9]+)?\.[a-z]+\.[a-z]+$
^proj[a-zA-Z0-9]*_[a-zA-Z]+(?:_[0-9]+)?(?:\.[a-z]+){2}$
See the regex demo
Details
^ - start of string
proj - a literal substring
[a-zA-Z0-9]* - 0 or more alphanumeric chars
_ - a _ char
[a-zA-Z]+ - 1+ ASCII letters
(?:_[0-9]+)? - an optional sequence of an underscore followed with 1+ digits
\.[a-z]+\.[a-z]+ = (?:\.[a-z]+){2} - two occurrences of . and 1+ lowercase ASCII letters
$ - end of string.
Notes:
[A-z] matches more than just ASCII letters: the range also covers [, \, ], ^, _ and `, which sit between Z and a in ASCII
([a-zA-z0-9]?)+ matches an optional character 1 or more times, which makes little sense. Either match a char 1 or more times with + or 0 or more times with *; there is no need for the parentheses
(_[0-9]?)+ matches 1 or more sequences of _ followed by a single optional digit (so, it matches _9___1_, for example). The quantifiers must be swapped to match an optional sequence of _ and 1+ digits.
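A short JavaScript sketch to check the fixed pattern and the [A-z] pitfall follows. The helper name isValidName and the two rejected sample names are made up for illustration.

```javascript
// Fixed pattern from the answer (the {2} variant).
const NAME_RE = /^proj[a-zA-Z0-9]*_[a-zA-Z]+(?:_[0-9]+)?(?:\.[a-z]+){2}$/;

// Hypothetical helper name.
function isValidName(name) {
  return NAME_RE.test(name);
}

console.log(isValidName("proj_asdasd_000.gz.xml"));  // true
console.log(isValidName("proj_asdasd.gz.xml"));      // true
console.log(isValidName("proj_asdasd_.gz.xml"));     // false: underscore without digits
console.log(isValidName("proj_asdasd_12a.gz.xml"));  // false: non-digit after the underscore

// The [A-z] range also covers the chars between Z and a, such as the underscore.
console.log(/[A-z]/.test("_"));     // true
console.log(/[A-Za-z]/.test("_"));  // false
```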
I have a regular expression that is allowing a string to be standalone, separated by hyphen and underscore.
I need help so the string only takes hyphen or underscore, but not both.
This is what I have so far.
^([a-z][a-z0-9]*)([-_]{1}[a-z0-9]+)*$
foo = passed
foo-bar = passed
foo_bar = passed
foo-bar-baz = passed
foo_bar_baz = passed
foo-bar_baz_qux = passed # but I don't want it to
foo_bar-baz-quz = passed # but I don't want it to
You may expand the pattern a bit and use a backreference to only match the same delimiter:
^[a-z][a-z0-9]*(?:([-_])[a-z0-9]+(?:\1[a-z0-9]+)*)?$
See the regex demo
Details:
^ - start of string
[a-z][a-z0-9]* - a letter followed with 0+ lowercase letters or digits
(?:([-_])[a-z0-9]+(?:\1[a-z0-9]+)*)? - an optional sequence of:
([-_]) - Capture group 1 matching either - or _
[a-z0-9]+ - 1+ lowercase letters or digits
(?:\1[a-z0-9]+)* - 0+ sequences of:
\1 - the same value as in Group 1
[a-z0-9]+ - 1 or more lowercase letters or digits
$ - end of string.
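The backreference behaviour can be verified against the test strings from the question with a small JavaScript sketch like this one. The helper name usesSingleDelimiter is hypothetical.

```javascript
// Pattern from the answer: \1 forces every later delimiter to equal the first one.
const SLUG_RE = /^[a-z][a-z0-9]*(?:([-_])[a-z0-9]+(?:\1[a-z0-9]+)*)?$/;

// Hypothetical helper name.
function usesSingleDelimiter(s) {
  return SLUG_RE.test(s);
}

const samples = [
  "foo", "foo-bar", "foo_bar", "foo-bar-baz", "foo_bar_baz",
  "foo-bar_baz_qux", "foo_bar-baz-quz",
];

// Prints each sample with its result; only the last two mix delimiters and fail.
for (const s of samples) {
  console.log(s, usesSingleDelimiter(s));
}
```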
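Here's a simpler alternative. Note that it is looser than the original pattern: it allows no digits, accepts uppercase letters, and permits leading, trailing, or repeated delimiters: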
^([a-zA-Z-]+|[a-zA-Z_]+)$
Break it down!
^ start at the beginning of the text
[a-zA-Z-]+ match anything a-z or A-Z or -
| OR operator
[a-zA-Z_]+ match anything a-z or A-Z or _
$ end at the end of the text
Here's an example on regexr!