A regex for letters and space that cannot be a whitespace


I cannot figure out how to combine two regexes. I have these requirements:
Letters and space ^[\p{L} ]+$
Cannot be whitespace ^[^\s]+$
How can I write one regex that combines both? Or is there perhaps some other solution?

You may use
^(?! +$)[\p{L} ]+$
^(?!\s+$)[\p{L}\s]+$
^\s*\p{L}[\p{L}\s]*$
Details
^ - start of string
(?!\s+$) - a negative lookahead that fails the match if the string consists only of 1 or more whitespace characters
[\p{L}\s]+ - 1+ letters or whitespaces
$ - end of string.
See the regex demo.
The ^\s*\p{L}[\p{L}\s]*$ is a regex that matches any 0+ whitespaces at the start of the string, then requires a letter that it consumes, and then any 0+ letters/whitespaces may follow.
See the regex demo.
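
As a quick check of the lookahead variant, here is a minimal Python sketch. It rests on two assumptions that are not part of the answer: Python's built-in re module does not support \p{L}, so [^\W\d_] (a word character that is neither a digit nor an underscore) stands in for it as an approximation, and the helper name is_valid is made up for illustration.

```python
import re

# Approximation of ^(?!\s+$)[\p{L}\s]+$ for the stdlib `re` module.
# Assumption: `re` has no \p{L}, so [^\W\d_] is used as a stand-in for "a letter".
LETTERS_AND_SPACES = re.compile(r"(?!\s+$)(?:[^\W\d_]|\s)+")


def is_valid(text: str) -> bool:
    """True if text holds only letters/whitespace and is not whitespace-only."""
    return LETTERS_AND_SPACES.fullmatch(text) is not None


for sample in ["John Smith", "   ", "", "abc1"]:
    print(repr(sample), is_valid(sample))
```

In an engine that supports \p{L} (PCRE, .NET, Java), the original pattern can be used unchanged.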

Related

How to blacklist specific character, but also allow any other character or no character, without using negative lookahead

I'm trying to write a regex that matches a string followed by either nothing or anything else, except that whatever follows must not start with a dot (.).
Is it possible to do this without a negative lookahead?
Here's an example regex:
.*\.(cpl)[^.].*
Now the string:
C:\Windows\SysWOW64\control.exe mlcfg32.cpl sounds
This one is matched, but if there's only:
C:\Windows\SysWOW64\control.exe mlcfg32.cpl
it's not matched, because the [^.] that blacklists the dot requires some character after cpl. If I add ? after the [^.], however, the . is no longer blacklisted when something else follows, so this is matched even though it shouldn't be:
C:\Windows\SysWOW64\control.exe mlcfg32.cpl. sounds
Can it be done without using negative lookaheads, i.e. without (?!...)?

You may use this regex:
.*\.cpl(?:[^.].*|$)
RegEx Demo
RegEx Breakdown:
.*: Match 0 or more of any character
\.cpl: Match .cpl
(?:[^.].*|$): Match end of string or a non-dot followed by any text

You can use
.*\.(cpl)(?:[^.].*)?$
See the regex demo. Details:
.* - zero or more chars other than line break chars as many as possible
\. - a dot
(cpl) - Group 1: cpl
(?:[^.].*)? - an optional non-capturing group that matches a char other than . char and then zero or more chars other than line break chars as many as possible
$ - end of string.
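
To see how the optional group behaves on the three strings from the question, here is a small Python sketch. The function name is_cpl_command is hypothetical, and the capture group around cpl is dropped because the sketch only needs a yes/no answer.

```python
import re

# Same idea as .*\.(cpl)(?:[^.].*)?$ but without the capture group.
CPL = re.compile(r".*\.cpl(?:[^.].*)?$")


def is_cpl_command(command: str) -> bool:
    """True if .cpl ends the string or is followed by a non-dot character."""
    return CPL.match(command) is not None


samples = [
    r"C:\Windows\SysWOW64\control.exe mlcfg32.cpl sounds",
    r"C:\Windows\SysWOW64\control.exe mlcfg32.cpl",
    r"C:\Windows\SysWOW64\control.exe mlcfg32.cpl. sounds",
]
for s in samples:
    print(is_cpl_command(s), s)
```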

Regular expression using positive lookbehind not working in Alteryx

I am trying to match the 2nd word after "Vores ref.:" using a positive lookbehind. It works in online testers like https://regexr.com/, but my tool, Alteryx, doesn't allow quantifiers like + in a lookbehind.
"ABC This is an example Vores ref.: 23244-2234 LW782837673 Test 2324324"
(?<=Vores\sref.:\s\d+-\d+\s+)\w+ correctly matches LW782837673 on regexr.com but not in Alteryx.
How can I rewrite the lookbehind expression without using quantifiers inside it but still get what I want?

You can use a replacement approach here using
.*?\bVores\s+ref\.:\s+\d+-\d+\s+(\w+).*
Replace with $1.
See the regex demo.
Details:
.*? - any 0+ chars other than line break chars, as few as possible
\bVores - whole word Vores
\s+ - one or more whitespaces
ref\.: - ref.: substring
\s+ - one or more whitespaces
\d+-\d+ - one or more digits, - and one or more digits
\s+ - one or more whitespaces
(\w+) - Capturing group 1: one or more word chars.
.* - any 0+ chars other than line break chars, as many as possible.

You can use a capture group instead.
Note that the dot must be escaped as \. to match it literally.
\bVores\sref\.:\s\d+-\d+\s+(\w+)
The pattern matches:
\bVores\sref\.:\s\d+-\d+\s+ Your lookbehind pattern, turned into a regular match
(\w+) Capture group 1, match 1+ word characters
Regex demo
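
Both approaches can be tried outside Alteryx. The following Python sketch is only a stand-in for the Alteryx tool and makes no claim about its configuration; note that Python writes the group 1 backreference in the replacement as \1 rather than $1.

```python
import re

text = "ABC This is an example Vores ref.: 23244-2234 LW782837673 Test 2324324"

# Capture-group approach: the former lookbehind is matched normally,
# and the wanted word is read from group 1.
match = re.search(r"\bVores\sref\.:\s\d+-\d+\s+(\w+)", text)
second_word = match.group(1) if match else None

# Replacement approach: match the whole line and keep only group 1.
replaced = re.sub(r".*?\bVores\s+ref\.:\s+\d+-\d+\s+(\w+).*", r"\1", text)

print(second_word, replaced)
```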

Regex to capture everything after optional token

I have fields which contain data in the following possible formats (each line is a different possibility):
AAA - Something Here
AAA - Something Here - D
Something Here
Note that the first group of letters (AAA) can be of varying lengths.
What I am trying to capture is the "Something Here" or "Something Here - D" (if it exists) using PCRE, but I can't get the Regex to work properly for all three cases. I have tried:
- (.*) which works fine for cases 1 and 2 but obviously not 3;
(?<= - )(.*) which also works fine for cases 1 and 2;
(?! - )(.+)| - (.+) works for cases 2 and 3 but not 1.
I feel like I'm on the verge of it but I can't seem to crack it.
Thanks in advance for your help.
Edit: I realized that I was unclear in my requirements. If there is a trailing " - D" (the letter in the data is arbitrary but should only be a single character), that needs to be captured as well.

About the patterns that you tried:
- (.*) This pattern matches the first occurrence of - and then the rest of the line. It matches too much for the second example, as the .* also matches the second occurrence of -
(?<= - )(.*) This pattern matches the same as the first one but without the - , as it asserts that it should occur directly to the left
(?! - )(.+)| - (.+) This pattern uses a negative lookahead, (?! - ), which asserts that what is directly to the right is not - . As none of the examples start with - , the whole line is matched directly after the negative lookahead due to .+ and the second part after the alternation | is never evaluated
If the first group of letters can be of varying length, you could make the match more specific by matching 1 or more uppercase characters with [A-Z]+, or 1+ word characters with \w+.
For a broader match, you could match 1 or more non-whitespace characters using \S+
^(?:\S+\h-\h)?\K\S+(?:\h(?!-\h)\S+)*
Explanation
^ Start of string
(?:\S+\h-\h)? Optionally match the first group of non whitespace chars followed by - between horizontal whitespace chars
\K Clear the match buffer (Forget what is currently matched)
\S+ Match 1+ non whitespace characters
(?: Non capture group
\h(?!-\h) Match a horizontal whitespace char and assert what is directly to the right is not - followed by another horizontal whitespace char
\S+ Match 1+ non whitespace chars
)* Close the non-capturing group and repeat it 0+ times to match more "words" separated by spaces
Regex demo
Edit
To match an optional hyphen and trailing single character, you could add an optional non capturing group (?:-\h\S\h*)?$ and assert the end of the string if the pattern should match the whole string:
^(?:\S+\h-\h)?\K\S+(?:\h(?!-\h)\S+)*\h*(?:-\h\S\h*)?$
Regex demo
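
The question asks for PCRE, where \K and \h are available. For readers who want to test the same logic elsewhere, here is a hedged Python translation. It assumes a capture group in place of \K and [ \t] in place of \h (so only space and tab are covered), and the helper name extract is invented for this sketch.

```python
import re

H = r"[ \t]"  # assumption: stand-in for PCRE's \h, space and tab only

# Translation of ^(?:\S+\h-\h)?\K\S+(?:\h(?!-\h)\S+)*\h*(?:-\h\S\h*)?$
# with the part after \K wrapped in group 1.
PATTERN = re.compile(
    rf"^(?:\S+{H}-{H})?"          # optional leading token such as "AAA - "
    rf"(\S+(?:{H}(?!-{H})\S+)*"   # words, stopping before a " - " separator
    rf"{H}*(?:-{H}\S{H}*)?)$"     # optional trailing "- D" (single character)
)


def extract(field: str):
    """Return the part after the optional leading token, or None."""
    m = PATTERN.match(field)
    return m.group(1) if m else None


for field in ["AAA - Something Here", "AAA - Something Here - D", "Something Here"]:
    print(repr(extract(field)))
```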

You may use
^(?:.*? - )?\K.*?(?= - | *$)
^(?:.*?\h-\h)?\K.*?(?=\h-\h|\h*$)
See the regex demo
Details
^ - start of string
(?:.*? - )? - an optional non-capturing group matching any 0+ chars other than line break chars, as few as possible, up to and including the first space-hyphen-space
\K - match reset operator
.*? - any 0+ chars other than line break chars as few as possible
(?= - | *$) - a space-hyphen-space, or 0+ spaces up to the end of the string, must follow immediately on the right.
Note that \h matches any horizontal whitespace chars.

^(?:[A-Z]+ - \K)?.*\S
demo
Since "Something Here" can be anything, there's no reason to describe the optional trailing letter separately in the pattern. You don't need anything more complicated.
With this pattern I assume that you are not interested in the trailing spaces, which is why I ended it with \S. If you want to keep them, remove the \S and change the previous quantifier to +.
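
A rough Python equivalent of this shorter pattern, for checking it against the three sample lines. Python's re module has no \K, so this sketch assumes a capture group instead; the name extract_simple is hypothetical.

```python
import re

# ^(?:[A-Z]+ - \K)?.*\S rewritten with group 1 replacing \K.
SIMPLE = re.compile(r"^(?:[A-Z]+ - )?(.*\S)")


def extract_simple(field: str):
    """Return everything after the optional uppercase prefix, right-trimmed."""
    m = SIMPLE.match(field)
    return m.group(1) if m else None


for field in ["AAA - Something Here", "AAA - Something Here - D", "Something Here"]:
    print(repr(extract_simple(field)))
```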

How to grab the rest of the string too in the regex matches?

I have the following string:
this is a test string user:testuser,anotheruser hashtag:peach,phone,milk site:youtube.com,twitter.com flair:😂bobby😂
Currently the regex ([^:\s]+):([^:\s]+) matches all the filters with colon in between (user, hashtag, site, flair). How can I also grab the remaining "this is a test string" part as another match?
Demo:
https://regex101.com/r/L0T2GJ/11

You may add an alternative to match any 0+ chars as few as possible from the start of the string till the first key followed with a colon:
^.*?(?=\s+[^:\s]+:)|([^:\s]+):([^:\s]+)
^^^^^^^^^^^^^^^^^^^
See the regex demo
Details
^ - start of the string
.*? - any 0+ chars other than line break chars, as few as possible
(?=\s+[^:\s]+:) - the positive lookahead makes sure that, immediately to the right of the current position, there is
\s+ - 1+ whitespaces
[^:\s]+ - 1+ chars other than : and whitespace
: - a colon
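
One way to consume the matches of this alternation is sketched below in Python. The function name parse_query and the choice to split filter values on commas are assumptions made for illustration; the question only asks for the matches themselves.

```python
import re

PATTERN = re.compile(r"^.*?(?=\s+[^:\s]+:)|([^:\s]+):([^:\s]+)")


def parse_query(query: str):
    """Split a query into its leading free text and its key:value filters."""
    free_text = ""
    filters = {}
    for m in PATTERN.finditer(query):
        if m.group(1) is None:
            # First alternative matched: the text before the first filter.
            free_text = m.group(0)
        else:
            # Second alternative matched: a key:value filter.
            filters[m.group(1)] = m.group(2).split(",")
    return free_text, filters


query = ("this is a test string user:testuser,anotheruser "
         "hashtag:peach,phone,milk site:youtube.com,twitter.com flair:😂bobby😂")
free_text, filters = parse_query(query)
print(free_text)
print(filters)
```

Checking whether group 1 is None tells the two alternatives apart, because only the filter branch has capture groups.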

Regex : Match everything after first dash

I have a string that contains the rego number of a car, like
1FX9JE - 2012 Audi A3 Ambition Sportback MY12 Stronic
I would like to match everything except the rego number, that is, anything after the dash.
The regex I came up with (in PHP) is
\s.[^-]*$
It matches everything after the dash only if the string contains exactly one dash. For example: https://regex101.com/r/Jao8W0/1
However, if the string has more than one dash, the regex is not usable.
For example: https://regex101.com/r/Jao8W0/2
Is there any way to match everything after the first dash even when the string contains additional dashes after it?
Thank you

Try this Regex:
^[^-\r\n]+-\s*\K.*$
Click for Demo
Explanation:
^ - asserts the start of the string
[^-\r\n]+ - matches 1+ occurrences of any character that is neither a - nor a line break character (\r or \n)
-\s* - matches the first - in the string followed by 0+ whitespaces
\K - forgets everything matched so far
.* - matches 0+ occurrences of any character other than a line break
$ - asserts the end of the string
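
The question is about PHP, where \K works as shown. As a sketch of the same idea in an engine without \K, here is a Python version that assumes a capture group in its place; the helper name after_first_dash is made up.

```python
import re

# ^[^-\r\n]+-\s*\K.*$ with group 1 replacing \K.
AFTER_FIRST_DASH = re.compile(r"^[^-\r\n]+-\s*(.*)$")


def after_first_dash(text: str):
    """Return the text after the first dash, or None if there is no dash."""
    m = AFTER_FIRST_DASH.match(text)
    return m.group(1) if m else None


print(after_first_dash("1FX9JE - 2012 Audi A3 Ambition Sportback MY12 Stronic"))
print(after_first_dash("1FX9JE - 2012 Audi A3 - Sportback"))
```

Because [^-\r\n]+ cannot cross a dash, the match always splits at the first one, no matter how many dashes follow.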

If there is only one space after the dash, you can use this pattern:
(?<=\-\s)(.*)
Otherwise, if there may be more than one space, take group 1 from the match of:
(?<=\-)\s*(.*)
(?<=...) ensures that the given pattern matches, ending at the current position in the expression. The pattern must have a fixed width. It does not consume any characters.
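
The fixed-width requirement can be observed directly in Python's re module, which is used here only as one example of an engine with that restriction. The sample string with a second dash is invented for the sketch.

```python
import re

text = "1FX9JE - 2012 Audi A3 - Sportback"

# Fixed-width lookbehind: exactly one dash followed by one whitespace char.
one_space = re.search(r"(?<=-\s)(.*)", text).group(1)

# Lookbehind for the dash only; the whitespace is consumed outside of it,
# so the wanted text is read from group 1.
any_spaces = re.search(r"(?<=-)\s*(.*)", text).group(1)

# A quantifier inside the lookbehind makes it variable-width,
# which this engine refuses to compile.
try:
    re.compile(r"(?<=-\s*).*")
    variable_width_ok = True
except re.error:
    variable_width_ok = False

print(one_space, any_spaces, variable_width_ok, sep="\n")
```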
