I'm trying to match all the non-whitespace characters after a string with a regex. In this example, I want to match "b" without the whitespace and the slashes around it:
a: /b/
I tried using (?<=a:)([^\s\/]+) but it doesn't work.
You still need to account for / before b, not just for whitespace.
You may use a \K based regex (if your regex flavor is PCRE/Onigmo/Boost):
a:\s*\/\K[^\s\/]+
See the regex demo.
Also, if you are using a regex engine that supports variable-width (unbounded) lookbehind patterns, as .NET does, you may use
(?<=a:\s*\/)[^\s\/]+
See this regex demo.
Else, you need to capture your substring with parentheses:
a:\s*\/([^\s\/]+)
See this regex demo.
Details
a: - a literal a: string
\s* - 0+ whitespaces
\/ - a / char
\K - a match reset operator
[^\s\/]+ - 1+ chars other than whitespace and /.
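For engines with neither \K nor variable-width lookbehind, the capture-group variant is the portable one. Python's built-in re module is one such engine (no \K, fixed-width lookbehind only), so here is a minimal sketch of it there. The function name value_after_a is only an illustration, not something from the question.

```python
import re

# Capture-group variant: group 1 holds the text between "a:", optional
# whitespace and "/", up to the next whitespace or slash.
PATTERN = re.compile(r"a:\s*/([^\s/]+)")


def value_after_a(text):
    """Return the captured value, or None when the pattern is not found."""
    m = PATTERN.search(text)
    return m.group(1) if m else None


print(value_after_a("a: /b/"))
```

The slash does not need escaping in Python because the pattern is not wrapped in /.../ delimiters.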
I'm looking for a regex that matches either anything or nothing after a given string, but if something does follow, it must not start with a dot (.).
Is it possible to do this without a negative lookahead?
Here's an example regex:
.*\.(cpl)[^.].*
now the string:
C:\Windows\SysWOW64\control.exe mlcfg32.cpl sounds
this one is matched, but if there's only:
C:\Windows\SysWOW64\control.exe mlcfg32.cpl
it's not matched, because the [^.] class requires some character after cpl. If I add ? after the [^.], however, the dot is no longer excluded when something else follows, so this string is matched even though it shouldn't be:
C:\Windows\SysWOW64\control.exe mlcfg32.cpl. sounds
Can it be done without a negative lookahead, (?!...)?
You may use this regex:
.*\.cpl(?:[^.].*|$)
RegEx Demo
RegEx Breakdown:
.*: Match 0 or more of any character
\.cpl: Match .cpl
(?:[^.].*|$): Match end of string or a non-dot followed by any text
You can use
.*\.(cpl)(?:[^.].*)?$
See the regex demo. Details:
.* - zero or more chars other than line break chars as many as possible
\. - a dot
(cpl) - Group 1: cpl
(?:[^.].*)? - an optional non-capturing group that matches a char other than . char and then zero or more chars other than line break chars as many as possible
$ - end of string.
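To check both suggested patterns against the three command lines from the question, a quick sketch in Python (an assumption on my side, the question does not say which engine is used) could look like this:

```python
import re

# The two lookahead-free patterns suggested in the answers.
ALTERNATION = re.compile(r".*\.cpl(?:[^.].*|$)")
OPTIONAL_GROUP = re.compile(r".*\.(cpl)(?:[^.].*)?$")

samples = [
    r"C:\Windows\SysWOW64\control.exe mlcfg32.cpl sounds",   # should match
    r"C:\Windows\SysWOW64\control.exe mlcfg32.cpl",          # should match
    r"C:\Windows\SysWOW64\control.exe mlcfg32.cpl. sounds",  # should not match
]

alternation_results = [ALTERNATION.match(s) is not None for s in samples]
optional_results = [OPTIONAL_GROUP.match(s) is not None for s in samples]

for s, ok in zip(samples, optional_results):
    print(ok, s)
```

Both patterns accept the first two strings and reject the third, because after .cpl they allow either the end of the string or a character that is not a dot.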
I am trying to match the 2nd word after "Vores ref.:" using a positive lookbehind. It works in online testers like https://regexr.com/, but my tool, Alteryx, doesn't allow quantifiers like + in a lookbehind.
"ABC This is an example Vores ref.: 23244-2234 LW782837673 Test 2324324"
(?<=Vores\sref.:\s\d+-\d+\s+)\w+ correctly matches LW782837673 on regexr.com, but not in Alteryx.
How can I rewrite the lookbehind expression without quantifiers inside it and still get what I want?
You can use a replacement approach here using
.*?\bVores\s+ref\.:\s+\d+-\d+\s+(\w+).*
Replace with $1.
See the regex demo.
Details:
.*? - any 0+ chars other than line break chars, as few as possible
\bVores - whole word Vores
\s+ - one or more whitespaces
ref\.: - ref.: substring
\s+ - one or more whitespaces
\d+-\d+ - one or more digits, - and one or more digits
\s+ - one or more whitespaces
(\w+) - Capturing group 1: one or more word chars.
.* - any 0+ chars other than line break chars, as many as possible.
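The replacement approach above can be tried outside Alteryx as well. Below is a minimal sketch with Python's re.sub, used here only for illustration; note that Python writes the backreference as \1 where the answer uses $1.

```python
import re

# The pattern consumes the whole line, so replacing the match with group 1
# leaves only the captured word.
PATTERN = re.compile(r".*?\bVores\s+ref\.:\s+\d+-\d+\s+(\w+).*")

text = "ABC This is an example Vores ref.: 23244-2234 LW782837673 Test 2324324"
result = PATTERN.sub(r"\1", text)
print(result)

# When the pattern does not match, re.sub returns the input unchanged.
unchanged = PATTERN.sub(r"\1", "no reference here")
```

One thing to keep in mind with this approach: a line without a match comes back untouched, so it may need a separate check.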
You can use a capture group instead.
Note that the dot must be escaped as \. to match it literally.
\bVores\sref\.:\s\d+-\d+\s+(\w+)
The pattern matches:
\bVores\sref\.:\s\d+-\d+\s+ Your pattern turned into a match
(\w+) Capture group 1, match 1+ word characters
Regex demo
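As a rough illustration of reading the capture group instead of the whole match, here is a sketch in Python. The helper name word_after_ref is made up for this example; in Alteryx the group would be read through whatever the tool offers for that.

```python
import re

# Former lookbehind content is now matched normally; group 1 is the wanted word.
PATTERN = re.compile(r"\bVores\sref\.:\s\d+-\d+\s+(\w+)")


def word_after_ref(text):
    """Return the word following 'Vores ref.: <digits>-<digits>', or None."""
    m = PATTERN.search(text)
    return m.group(1) if m else None


sample = "ABC This is an example Vores ref.: 23244-2234 LW782837673 Test 2324324"
print(word_after_ref(sample))
```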
We have a regular expression containing (?!\s), but the regex engine we use does not support lookahead assertions. The complete regex is
^(?!\s)(.*)(\S)$
Can anyone please suggest an alternative way of achieving the same functionality without using a lookahead?
You may use
^\S(.*\S)?$
It will match
^ - start of string
\S - a non-whitespace char
(.*\S)? - 1 or 0 occurrences of
    .* - any 0+ chars other than line break chars, as many as possible
    \S - a non-whitespace char
$ - end of string.
See a regex demo using the Go regex engine (RE2) that does not allow lookaheads.
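Since Python's re module does support lookaheads, it can be used to compare the original pattern with the lookahead-free rewrite on a few inputs. This is only a sanity-check sketch, not a proof of equivalence; fullmatch is used so that a trailing newline is not silently accepted by $.

```python
import re

ORIGINAL = re.compile(r"^(?!\s)(.*)(\S)$")   # needs lookahead support
REWRITTEN = re.compile(r"^\S(.*\S)?$")       # works without lookaheads


def accepts(pattern, s):
    """True when the whole string is matched by the pattern."""
    return pattern.fullmatch(s) is not None


samples = ["a", "ab", "a b", " a", "a ", "", " ", "a\n"]
rewritten_results = {s: accepts(REWRITTEN, s) for s in samples}
agree = all(accepts(ORIGINAL, s) == accepts(REWRITTEN, s) for s in samples)
print(agree)
```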
I cannot figure out how to add two regex together, I have these requirements:
Letters and space ^[\p{L} ]+$
Cannot be whitespace ^[^\s]+$
I cannot figure out how to write one regex that will combine both? There is perhaps some other solution?
You may use
^(?! +$)[\p{L} ]+$
^(?!\s+$)[\p{L}\s]+$
^\s*\p{L}[\p{L}\s]*$
Details
^ - start of string
(?!\s+$) - a negative lookahead that fails the match if the string consists only of 1 or more whitespaces
[\p{L}\s]+ - 1+ letters or whitespaces
$ - end of string.
See the regex demo.
The ^\s*\p{L}[\p{L}\s]*$ is a regex that matches any 0+ whitespaces at the start of the string, then requires a letter that it consumes, and then any 0+ letters/whitespaces may follow.
See the regex demo.
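The lookahead-free variant can be sketched in Python too, with one caveat: the built-in re module has no \p{L}, so the class [^\W\d_] is used below as an approximation of "any letter". That substitution is my assumption and is not exactly the same set as \p{L}. The name is_letters_and_spaces is likewise only illustrative.

```python
import re

# Approximation of \p{L}: a word character that is neither a digit nor "_".
LETTER = r"[^\W\d_]"

# Mirrors ^\s*\p{L}[\p{L}\s]*$ : optional leading whitespace, one required
# letter, then any mix of letters and whitespace.
PATTERN = re.compile(rf"\s*{LETTER}(?:{LETTER}|\s)*")


def is_letters_and_spaces(s):
    """True for letters/whitespace only, with at least one letter present."""
    return PATTERN.fullmatch(s) is not None


for candidate in ["John Smith", "Élodie", "   ", "John3"]:
    print(repr(candidate), is_letters_and_spaces(candidate))
```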
Hello, I want to match this text with a regex:
(Parc Installé)
from this text:
31/1/2017 17:19:23,4245986,ct0001#Intotel.int,Parc Installé,100.100.30.100
I wrote this regex: ',[A-Za-zA-zÀ-ú+ \/\w+0-9._%+-]+,'
But the result is both 4245986 and Parc Installé.
How can I match only Parc Installé?
You may try a regex based on a lookahead that requires a comma followed by digits/dots up to the end of the string:
[^,]+(?=\s*,[\d.]+$)
See this regex demo
Details:
[^,]+ - 1 or more chars other than ,
(?=\s*,[\d.]+$) - a lookahead requiring
    \s* - zero or more whitespaces
    , - a comma
    [\d.]+ - 1+ digits or dots up to...
    $ - ... the end of string
To make it a bit more restrictive, you may replace the lookahead with (?=\s*,\d+(?:\.\d+){3}$) to require 4 sequences of dot-separated 1+ digits. See this regex demo.
If lookaheads are not supported (as is the case with the RE2 engine), you might want to use a capturing group based solution:
([^,]+)\s*,[\d.]+$
Here, the part within (...) will be captured into Group 1 and will be accessible via a backreference or a function like =REGEXEXTRACT in Google Sheets, which returns only the contents of a capturing group if one is present in the pattern.
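For comparison, here is a small sketch that runs the lookahead version, the stricter lookahead, and the capture-group version on the sample line. Python is used only as a convenient test bed; the question does not name the engine.

```python
import re

line = "31/1/2017 17:19:23,4245986,ct0001#Intotel.int,Parc Installé,100.100.30.100"

# Lookahead version: the whole match is the wanted field.
with_lookahead = re.search(r"[^,]+(?=\s*,[\d.]+$)", line).group(0)

# Stricter lookahead: requires four dot-separated digit runs at the end.
with_strict_lookahead = re.search(r"[^,]+(?=\s*,\d+(?:\.\d+){3}$)", line).group(0)

# Capture-group version for engines without lookaheads: read group 1.
with_group = re.search(r"([^,]+)\s*,[\d.]+$", line).group(1)

print(with_lookahead, with_strict_lookahead, with_group, sep=" | ")
```

All three return the same field here; the difference is only in whether the result is the whole match or Group 1.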