Regex exclude character from the group - regex

I am trying to write this regex to match dots with a few rules
(\.+ *|([a-zA-ZÀ-ž]\.\d))(?=[^\d{1}(\.\d{1})])(?=[^.,])
But my regex also matches a few characters before and after the dot.
For example:
č.1 > match č.1 (incorrect, match should be only .)
St.M > match . (correct)
2.0 > no match (correct)
Do you have any idea how to "exclude" these other characters from the result and match only the dot?
Thanks for your help

You could shorten the pattern by using a positive lookbehind, (?<=...), which asserts that a character from the given ranges is directly to the left of the dot without making it part of the match.
(?<=[a-zA-ZÀ-ž])\.
Regex demo
As per the comments, the pattern with the positive lookahead added:
(?<=[a-zA-ZÀ-ž])\.+ *(?=[^.,])
Regex demo
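As a quick check, the lookbehind pattern can be run against the three samples from the question. This is a minimal sketch in Python; the question does not name a language, so the use of Python's re module is an assumption, and the name DOT_AFTER_LETTER is made up for illustration. The č is written as the escape \u010d so that the precomposed character is used.

```python
import re

# Lookbehind: a letter from the listed ranges must sit directly left of the dot.
# The letter is only asserted, so it never becomes part of the match.
DOT_AFTER_LETTER = re.compile(r"(?<=[a-zA-ZÀ-ž])\.")

samples = ["\u010d.1", "St.M", "2.0"]  # "\u010d" is the precomposed č

for text in samples:
    matches = [m.group() for m in DOT_AFTER_LETTER.finditer(text)]
    print(repr(text), "->", matches)
```

With these inputs only the dot itself is returned for č.1 and St.M, and 2.0 produces no match because a digit, not a letter, precedes the dot.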

Related

Regex positive lookahead multiple occurrence

I have below sample string
abc,com;def,med;ghi,com;jkl,med
I need to extract the substring that comes before each occurrence of the keyword ",com".
The final result I am looking for is something like:
abc,ghi
I have tried the positive lookahead regex below:
[\s\S]*?(?=com)
But this only fetches abc, not ghi.
What modification do I need to make to the above regex?
The character class [\s\S] matches any character, so it also matches the , and ;
Instead, you can match non-whitespace characters other than , and ; using a negated character class. That way you don't have to make the quantifier non-greedy either.
Then assert ,com to the right (followed by a word boundary to prevent a partial word match).
Instead of using a lookahead, you might also use a capture group:
([^\s,;]+),com\b
See a regex demo with the capture group values.
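Both variants described above can be tried on the sample string. The following is a small sketch in Python (the language is an assumption, since the question only mentions grep-like extraction); the variable names are illustrative.

```python
import re

text = "abc,com;def,med;ghi,com;jkl,med"

# Capture group: ",com" is matched but only group 1 is returned by findall.
captured = re.findall(r"([^\s,;]+),com\b", text)

# Lookahead variant: ",com" is only asserted, so the whole match is the value.
looked_ahead = re.findall(r"[^\s,;]+(?=,com\b)", text)

result = ",".join(captured)
print(result)        # abc,ghi
print(looked_ahead)  # ['abc', 'ghi']
```

The negated class cannot cross a , or ; so each match stays inside one field, which is why no non-greedy quantifier is needed.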

How to use a negative lookahead to prevent my regular expression from matching?

I'm using this regular expression: ^(\d+(?:\.\d+)?) that will match any decimal or integer numeric value that is followed by any character and will capture only the numeric part of it. For example this regex will match the following values and capture the numeric part of them:
10.5
10.5 Inches
10 Inches
However, it seems like my regex also matches the following value: 6" + 1.5". I want to update my regex so that it doesn't match this type of value, so it shouldn't match if there are multiple numeric values.
I tried doing a negative lookahead like this ^(\d+(?:\.\d+)?)(?!\d), but it doesn't seem to be working.
Converting my comment to answer so that solution is easy to find for future visitors.
You may use this regex:
^(\d+(?:\.\d+)?)\b(?!.*\d)
RegEx Demo
RegEx Breakdown:
^: Line start
(: Start a capture group
\d+: Match 1+ digits
(?:\.\d+)?: Optionally match dot and 1+ digits
): End capture group
\b: Word boundary
(?!.*\d): Negative lookahead to assert that no digit appears anywhere after this position on the line
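To see the difference between the attempt from the question and the pattern from the answer, both can be run over the sample values. This is a hedged sketch in Python (an assumption, as the question does not name a language); the helper leading_number and the constant names are hypothetical.

```python
import re

ORIGINAL = re.compile(r"^(\d+(?:\.\d+)?)(?!\d)")    # attempt from the question
FIXED = re.compile(r"^(\d+(?:\.\d+)?)\b(?!.*\d)")   # pattern from the answer

def leading_number(text, pattern=FIXED):
    """Return the captured number, or None when the pattern does not match."""
    m = pattern.search(text)
    return m.group(1) if m else None

for text in ["10.5", "10.5 Inches", "10 Inches", '6" + 1.5"']:
    print(repr(text), leading_number(text, ORIGINAL), leading_number(text))
```

The original attempt still captures 6 from 6" + 1.5" because (?!\d) only inspects the single character right after the number, whereas (?!.*\d) scans the rest of the line.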

Find match within a first match

I have the following string
abc123+InterestingValue+def456
I want to get only the InterestingValue. I am using this regex:
\+.*\+
but the output still includes the + characters.
Is there a way to search for a string between the + characters, then search again for anything that is not a + character?
Use lookarounds.
(?<=\+)[^+]*(?=\+)
DEMO
You can use a positive lookahead and a positive lookbehind. Basically, a positive lookbehind tells the engine "this pattern must appear immediately before the current position", and a positive lookahead tells the engine "this pattern must appear immediately after the current position". Neither of them consumes the text it checks, so that text is not part of the match.
A positive lookbehind is a group beginning with ?<= and a positive lookahead is a group beginning with ?=. Adding these to your existing expression would look like this:
(?<=\+).*(?=\+)
regex101
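The two lookaround patterns above behave the same on the sample string but differ once a third + is present. The sketch below uses Python (an assumption; the question does not name a language), and the input "a+b+c+d" is a made-up string for illustration.

```python
import re

text = "abc123+InterestingValue+def456"

negated = re.search(r"(?<=\+)[^+]*(?=\+)", text).group()
greedy = re.search(r"(?<=\+).*(?=\+)", text).group()
print(negated, greedy)  # InterestingValue InterestingValue

# Hypothetical input with three "+" signs: the greedy dot runs up to the
# last "+", while the negated class stops at the next one.
more = "a+b+c+d"
print(re.findall(r"(?<=\+)[^+]*(?=\+)", more))  # ['b', 'c']
print(re.findall(r"(?<=\+).*(?=\+)", more))     # ['b+c']
```

If the input may contain more than two + characters, the negated character class is the safer choice of the two.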
If it should be the first match, you can use a capture group with an anchor:
^[^+]*\+([^+]+)\+
^ Start of string
[^+]* Optionally match any char except + using a negated character class
\+ Match literally
([^+]+) Capture group 1, match 1+ chars other than +
\+ Match literally
Regex demo
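The anchored capture-group pattern can be wrapped in a small helper. This is a minimal sketch in Python (an assumption about the language); the function name first_between_plus and the second input are hypothetical.

```python
import re

FIRST_BETWEEN_PLUS = re.compile(r"^[^+]*\+([^+]+)\+")

def first_between_plus(text):
    """Return the text between the first two "+" characters, or None."""
    m = FIRST_BETWEEN_PLUS.match(text)
    return m.group(1) if m else None

print(first_between_plus("abc123+InterestingValue+def456"))  # InterestingValue
print(first_between_plus("abc123+only-one-plus"))            # None
```

Because the pattern is anchored with ^ and [^+]* cannot skip over a +, group 1 is always the first delimited value.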

Extracting days from a string Regex

I am trying to extract the days using regex groups in C# from the following string,
"RRULE:FREQ=MONTHLY;UNTIL=20211126T143000Z;INTERVAL=1;BYDAY=MO,TU,WE,TH,FR;BYSETPOS=-1"
I am new to regular expressions and have looked at various websites to try to write an expression. The expression I have so far is the following:
(?:BYDAY=)([A-Z,]*);
Which matches
MO,TU,WE,TH,FR;
as a whole, which I can then split on ',' to achieve what I want. I wanted to know if there is a way of doing this purely in regex.
If a quantifier in the lookbehind is supported, you might use:
(?<=BYDAY=[A-Z,]*)[A-Z]+
Explanation
(?<= Positive lookbehind, assert what is on the left is
BYDAY=[A-Z,]* match BYDAY= followed by 0 or more times A-Z or ,
) Close lookbehind
[A-Z]+ Match 1+ chars A-Z
.Net regex demo | C# demo by WiktorStribiżew
Alternatively, you can make use of the \G anchor to get iterative matches and capture each value in group 1:
(?:\G(?!^)|BYDAY=)([A-Z]+),?
Regex demo
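For comparison outside .NET: Python's built-in re module accepts only fixed-width lookbehinds and has no \G anchor, so neither pattern above can be used there as written. The sketch below swaps in a different technique, a capture followed by a second findall, which is not a single-regex solution; the function name byday_values is hypothetical.

```python
import re

rrule = "RRULE:FREQ=MONTHLY;UNTIL=20211126T143000Z;INTERVAL=1;BYDAY=MO,TU,WE,TH,FR;BYSETPOS=-1"

def byday_values(rule):
    """Return the day codes listed after BYDAY=, or an empty list."""
    # Step 1: capture the comma-separated list that follows BYDAY=.
    m = re.search(r"BYDAY=([A-Z,]*)", rule)
    # Step 2: pull each run of capital letters out of that list.
    return re.findall(r"[A-Z]+", m.group(1)) if m else []

print(byday_values(rrule))  # ['MO', 'TU', 'WE', 'TH', 'FR']
```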

Regex negative lookahead not working as expected

I have the following regex:
[a-zA-Z0-9. ]*(?!cs)
and the string
Hotfix H5.12.1.00.cs02_ADV_LCR
I want to match only until
Hotfix H5.12.1.00
But the regex matches until "cs02"
Shouldn't the negative lookahead have done the job?
You may consider using a tempered greedy token:
(?:(?!\.cs)[a-zA-Z0-9. ])*
See the regex demo.
This works whether or not .cs is present in the string, because the tempered greedy token matches zero or more characters from the [a-zA-Z0-9. ] class, one at a time, as long as none of them starts a .cs sequence.
You need to use a positive lookahead instead of a negative lookahead.
[a-zA-Z0-9. ]*(?=\.cs)
or
[a-zA-Z0-9. ]+(?=\.cs)
Note that your regex [a-zA-Z0-9. ]*(?!cs) is greedy: it consumes as many characters as it can and stops at the first position that is not followed by cs.
First, the pattern [a-zA-Z0-9. ]+ matches Hotfix H5.12.1.00.cs02 greedily, because it matches letters, digits, dots and spaces. Once it sees the underscore character, it stops matching, and both conditions are satisfied there:
_ is not matched by [a-zA-Z0-9. ]+
the text at that position starts with _, not cs, so (?!cs) succeeds
The same reasoning applies to the two further matches.
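The behaviour of the three patterns discussed above can be compared directly. This is a small sketch in Python (an assumption; the question does not name a language), and the second input without .cs is made up to show where the tempered greedy token and the positive lookahead differ.

```python
import re

text = "Hotfix H5.12.1.00.cs02_ADV_LCR"

original = re.compile(r"[a-zA-Z0-9. ]*(?!cs)")       # regex from the question
tempered = re.compile(r"(?:(?!\.cs)[a-zA-Z0-9. ])*")  # tempered greedy token
positive = re.compile(r"[a-zA-Z0-9. ]*(?=\.cs)")      # positive lookahead

print(original.match(text).group())  # Hotfix H5.12.1.00.cs02
print(tempered.match(text).group())  # Hotfix H5.12.1.00
print(positive.match(text).group())  # Hotfix H5.12.1.00

no_cs = "Hotfix H5.12.1.00_ADV"  # hypothetical input without ".cs"
print(tempered.match(no_cs).group())  # Hotfix H5.12.1.00
print(positive.match(no_cs))          # None
```

The positive lookahead requires .cs to be present, so it yields no match on the second input, while the tempered greedy token still returns the leading run of allowed characters.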