Using multiple conditions in regex - regex

I am attempting to create a regex that matches when two conditions are met:
URL snippet is present
After the snippet the number "1" must also be present (1 does not have to be
immediately after snippet)
Both conditions must be met for the regex to be true.
This is the regex that I have so far:
^https?:\/\/www\.website\.co\.uk\/brand\/
This matches the URL snippet. But I want the regex to include the second condition.
Therefore, if the second condition was included
This would match: http://www.website.co.uk/brand/AD/1/A_d.html
But this would not: http://www.website.co.uk/brand/
Any help on this would be great.

^https?:\/\/www\.website\.co\.uk\/brand\/.*1.*
A .* in a regex matches any run of characters other than newlines. By padding the 1 with .* on both sides, you match every string that starts with
http://www.website.co.uk/brand/
(or its https form) and has at least one 1 somewhere after that prefix.
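To check the behaviour described above, here is a minimal sketch using Python's re module (the choice of Python and the helper name has_brand_and_one are assumptions for illustration, not part of the original answer). The two URLs are the ones given in the question.

```python
import re

# Pattern copied from the answer; in Python's re, `\/` is just a literal `/`.
BRAND_WITH_ONE = re.compile(r"^https?:\/\/www\.website\.co\.uk\/brand\/.*1.*")

def has_brand_and_one(url: str) -> bool:
    """Return True when the URL starts with the brand prefix and a 1 follows it."""
    return BRAND_WITH_ONE.match(url) is not None

print(has_brand_and_one("http://www.website.co.uk/brand/AD/1/A_d.html"))  # True
print(has_brand_and_one("http://www.website.co.uk/brand/"))               # False
```

The trailing .* is not needed to decide whether there is a match; it only extends the matched text to the end of the line.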

Related

My regex appears to be complete, yet it's missing matches

I'm currently working on a regex that mostly works, but a few expected matches aren't being captured, even though they match when they're the only input. I'm hoping someone can point out what is probably an obvious error that I'm missing.
Specifically, the string kMad matches to [Kk\D]+ by itself, but not when it's part of the bigger string.
For reference:
Full Regex showing missing matches
Specific Regex showing matches
Expected matches by line
The non-matching occurrence, kMad10:31-18:5771, does not have four digits at the end, because the two digits after the last colon are already consumed by (\d{2}:\d{2}\-\d{2}:\d{2}). You can allow a range of digit counts in the part of the regex that follows it, for example \d{2,4} instead of \d{4}.
The new regex will be:
(?:\d{4}\-\d{2}\-\d{2}+\.)?(?:Line\d{1,3})?(OFF|ADO|([Kk\D]+)?(\d{2}:\d{2}\-\d{2}:\d{2})(\d{2,4}))
Regex101 Demo
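The effect of that change can be sketched in Python. Two assumptions are made here: the strict variant ending in \d{4} is reconstructed from the answer's description (the asker's full regex is not shown), and the possessive \d{2}+ is written as plain \d{2}, because Python versions before 3.11 reject possessive quantifiers. The helper build_pattern is hypothetical.

```python
import re

def build_pattern(tail: str) -> re.Pattern:
    """Assemble the answer's regex with a configurable trailing-digits part."""
    return re.compile(
        r"(?:\d{4}-\d{2}-\d{2}\.)?(?:Line\d{1,3})?"
        r"(OFF|ADO|([Kk\D]+)?(\d{2}:\d{2}-\d{2}:\d{2})(" + tail + r"))"
    )

strict = build_pattern(r"\d{4}")     # assumed original form: exactly four digits
relaxed = build_pattern(r"\d{2,4}")  # form suggested in the answer

sample = "kMad10:31-18:5771"

print(strict.search(sample))            # None
print(relaxed.search(sample).groups())
# ('kMad10:31-18:5771', 'kMad', '10:31-18:57', '71')
```

Only two digits remain after the time range has been consumed, so the strict variant cannot match anywhere in the sample string.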

regex positive lookahead with if/else condition

I am trying to write a regular expression that checks whether a pattern exists and, if it does, matches everything following it, and if (and only if) it does not, matches everything after another pattern.
example lines:
http://example.com/contact
www.example.com/contact
http://www.example.com/contact
expected output in all 3 cases: example
Here is the regular expression I expected would do the job:
(?(?<=www\.).+|(?<=http:\/\/).+)(?=\.com)
which I assumed would:
check if "www." is to be found
if yes, would match everything following it
if not, match everything following "http://"
restrict the match to everything before the occurrence of ".com"
For the first two lines, the expression worked well, but in the third line www.example is matched instead of just example. Does this mean that for some reason the else command is executed although the if condition is met?
How can I change the above expression so that it only applies the http:// lookbehind if the www. part was not found?
Converting my comment to answer.
You may use this regex:
^(?:https?://(?:www\.)?|www\.)\K\S+?(?=\.com(?:/|$))
RegEx Demo
RegEx Description:
^: Start
(?:https?://(?:www\.)?|www\.): Match http:// or https://, optionally followed by www., or match a bare www.
\K: Reset matched information
\S+?: Match 1+ non-space characters (lazy)
(?=\.com(?:/|$)): Lookahead asserting that .com comes next, followed by either a / or the end of the line
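The \K escape is a PCRE feature that Python's built-in re module does not provide, so the sketch below swaps it for a capture group. That substitution, and the helper name extract_name, are assumptions for illustration; the rest of the pattern is the one from the answer, applied to the three example lines from the question.

```python
import re

# Same prefix and lookahead as the answer; a capture group replaces \K.
HOST_NAME = re.compile(r"^(?:https?://(?:www\.)?|www\.)(\S+?)(?=\.com(?:/|$))")

def extract_name(line: str):
    """Return the text between the URL prefix and '.com', or None."""
    m = HOST_NAME.match(line)
    return m.group(1) if m else None

lines = [
    "http://example.com/contact",
    "www.example.com/contact",
    "http://www.example.com/contact",
]
names = [extract_name(line) for line in lines]
print(names)  # ['example', 'example', 'example']
```

Because the optional www\. is consumed as part of the prefix, it never ends up in the captured name, which is the case the conditional lookbehind in the question got wrong.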

How can I select lines containing "Post" in WinAutomation

When I use the following expression in Notepad++, it successfully selects lines that contain the word "post", but when I use it in WinAutomation it doesn't work :(
^.*post.*$
Can someone please tell me an alternative regex with which I can select lines that contain the word "post"?
Also, I am not sure whether this helps, but here is a sample expression that works in WinAutomation. I use it to parse URLs out of XML files. Is this a different regex format from the one above?
(?<=<loc>).*?(?=</loc>)
What you are trying to do with ^.*post.*$ is match any character any number of times, followed by the word post, followed by any character any number of times.
A logical alternative is to look ahead to check whether post is present and then select the whole line. The following regex does this.
Regex: (?=.*post)^.*$
Explanation:
(?=.*post) looks ahead for any number of characters followed by the word post.
^.*$ then matches the whole line, provided the preceding assertion succeeded.
Regex101 Demo
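As a rough illustration of how the lookahead version selects lines, here is a small Python sketch. It only shows the pattern's behaviour in Python's re module with the MULTILINE flag, which is an assumption; it says nothing about which syntax WinAutomation's engine accepts. The sample text is made up.

```python
import re

# MULTILINE lets ^ and $ anchor at every line, not just at the string ends.
LINE_WITH_POST = re.compile(r"(?=.*post)^.*$", re.MULTILINE)

sample_text = "my first post\nnothing to see\nanother post here"

matching_lines = LINE_WITH_POST.findall(sample_text)
print(matching_lines)  # ['my first post', 'another post here']
```

Since . does not cross newlines by default, the lookahead only inspects the current line, so a line without post is skipped even when a later line contains it.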

Regex to check URLs

I need to test for URLs with the below patterns:
https://cloudhusethelp.zendesk.com
https://cloudhusethelp.zendesk.com/
https://cloudhusethelp.zendesk.com/en-us
https://cloudhusethelp.zendesk.com/da
https://cloudhusethelp.zendesk.com/fr
https://cloudhusethelp.zendesk.com/aa
The regex used is https\:\/\/cloudhusethelp\.zendesk\.com\/[A-z][A-z]
So this matches the URL with two letters at the end. The URL can end with any language code or with none.
Should I write multiple regular expressions to cover the above cases, or can a single one do it?
Any help is appreciated.
You can definitely do it with a single expression:
https\:\/\/cloudhusethelp\.zendesk\.com(\/[A-Za-z]{2}(-[A-Za-z]{2})?)?
The part that differs from your expression is at the end:
(\/[A-Za-z]{2}(-[A-Za-z]{2})?)?
It is a nested optional expression that matches nothing, a slash and a pair of letters, or a slash and a pair of letters followed by a dash and another pair of letters.
Demo.
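A quick way to see what the single expression accepts is to run it over the URLs from the question. The sketch below assumes Python's re module and a hypothetical helper matched_part that returns the portion of the URL the pattern consumed.

```python
import re

# Pattern copied from the answer; `\:` and `\/` are literal ':' and '/' in Python.
HELP_URL = re.compile(
    r"https\:\/\/cloudhusethelp\.zendesk\.com(\/[A-Za-z]{2}(-[A-Za-z]{2})?)?"
)

def matched_part(url: str):
    """Return the part of the URL matched from its start, or None."""
    m = HELP_URL.match(url)
    return m.group(0) if m else None

base = "https://cloudhusethelp.zendesk.com"
print(matched_part(base))             # https://cloudhusethelp.zendesk.com
print(matched_part(base + "/"))       # https://cloudhusethelp.zendesk.com
print(matched_part(base + "/en-us"))  # https://cloudhusethelp.zendesk.com/en-us
print(matched_part(base + "/da"))     # https://cloudhusethelp.zendesk.com/da
```

Note that for the URL with only a trailing slash the pattern still matches, but the slash itself is not part of the matched text. If the whole URL must be consumed, the pattern would need anchoring and an explicit optional slash.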
The slash at the end is also optional, as the first example you provided doesn't have it.
https\:\/\/cloudhusethelp\.zendesk\.com(\/[A-z\-]{2}(\-[A-z\-]{2})?)?
demo

Regular Expression for email to check repetitive characters

I'm validating an email address with a regular expression. I would like to test for the following conditions:
a minimum of 3 characters in the name, the symbol #, a minimum of 3 characters in the first part of the domain, a dot, and no more than 3 repeated characters
I tried this regular expression and it's working fine for all cases except last one.
/^[A-Za-z0-9._%+-]{3,}\#[A-Za-z0-9.-]{3,}\.[A-Za-z]{2,4}$/
It's not checking for repeated characters (any character) after the dot (.)
Not Ok: test#test.ccccom, test#test.coooom
Ok : test#test.com
I don't know what is wrong with the last portion of my regex.
Any input will be appreciated.
You can use the following regex:
^(?!.*([A-Za-z0-9])\1{3})[A-Za-z0-9._%+-]{3,}\#[A-Za-z0-9-]{3,}\.[A-Za-z]{2,4}$
Changes made:
(?!.*([A-Za-z0-9])\1{3}) - This is a negative lookahead that makes sure no letter or digit appears more than three times in a row anywhere in the string.
The rest of the regex is unchanged, except for the removal of the . from the second character class.
RegEx Demo
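The negative lookahead can be exercised with a few inputs. This is a minimal sketch in Python's re module; the helper is_valid and the addresses with a stretched name part are made up for illustration, while test#test.com and test#test.ccccom come from the question.

```python
import re

# Pattern copied from the answer; `\#` is a literal '#' in Python's re.
EMAIL = re.compile(
    r"^(?!.*([A-Za-z0-9])\1{3})"
    r"[A-Za-z0-9._%+-]{3,}\#[A-Za-z0-9-]{3,}\.[A-Za-z]{2,4}$"
)

def is_valid(address: str) -> bool:
    """Return True when the address passes the answer's pattern."""
    return EMAIL.match(address) is not None

print(is_valid("test#test.com"))     # True
print(is_valid("teeest#test.com"))   # True: three e's in a row are allowed
print(is_valid("teeeest#test.com"))  # False: four e's in a row
print(is_valid("test#test.ccccom"))  # False
```

The last address would also fail without the lookahead, because its final part has six letters and the pattern allows at most four; the stretched name part is what isolates the lookahead's effect.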
If you want to disallow repeated characters after the last ., then you could use the following instead:
^[A-Za-z0-9._%+-]{3,}\#[A-Za-z0-9-]{3,}\.(?!([A-Za-z0-9])\1{3})[A-Za-z]{2,4}$
RegEx Demo
This won't allow more than three repeated characters after the last dot:
^[A-Za-z0-9._%+-]{3,}\#[A-Za-z0-9.-]{3,}\.(?:(?!(.)\1{3})[a-zA-Z]){2,4}$
DEMO
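The two patterns that restrict the check to the part after the last dot can be compared side by side. The following is a sketch in Python's re module; the names AFTER_DOT_ONCE and AFTER_DOT_EACH and the test addresses ending in .cccc and .ccom are assumptions chosen for illustration.

```python
import re

# Lookahead applied once, immediately after the last dot.
AFTER_DOT_ONCE = re.compile(
    r"^[A-Za-z0-9._%+-]{3,}\#[A-Za-z0-9-]{3,}\.(?!([A-Za-z0-9])\1{3})[A-Za-z]{2,4}$"
)
# Lookahead applied before every letter after the last dot.
AFTER_DOT_EACH = re.compile(
    r"^[A-Za-z0-9._%+-]{3,}\#[A-Za-z0-9.-]{3,}\.(?:(?!(.)\1{3})[a-zA-Z]){2,4}$"
)

addresses = ["test#test.com", "test#test.ccom", "test#test.cccc"]

results = {
    addr: (
        AFTER_DOT_ONCE.match(addr) is not None,
        AFTER_DOT_EACH.match(addr) is not None,
    )
    for addr in addresses
}
print(results)
# {'test#test.com': (True, True), 'test#test.ccom': (True, True),
#  'test#test.cccc': (False, False)}
```

Because the final part is limited to four letters, the only input either lookahead can reject on its own is one made of four identical letters; anything longer already fails the {2,4} length limit.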