Regex: match all but two dots

I'm trying to validate a system path. The path is valid if it begins with certain directories and doesn't contain two consecutive dots.
#valid path
/home/user/somedir/somedir_2
#evil path
/home/user/../somedir_2/
I know how to check for two dots in a path:
\.\. or [\.]{2}
But I really want to do something like that:
/home/user/<match everything but two dots>/somedir_2/
where the placeholder can be anything that does not contain two consecutive dots.
I have tried:
/home/user/[^\.{2}]*/somedir_2/
but without any success.

The specification isn't clear, but you can use negative lookahead to do something like this:
^(?!.*thatPattern)thisPattern$
The above would match strings that match thisPattern, but not if they contain a match of thatPattern.
Here's an example (as seen on rubular.com):
^(?!.*aa)[a-z]*$
This would match ^[a-z]*$, but not if it contains aa anywhere.
References
regular-expressions.info/Lookarounds

^/home/user/(?!.*\.\.).*$
will match your good pattern and reject your evil one.
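As a quick check, the pattern above can be exercised with Python's re module. This is only a minimal sketch: the helper name is_safe_path is made up for illustration, and the /home/user/ prefix is simply the one from the question.

```python
import re

# Prefix taken from the question; the negative lookahead rejects ".."
# anywhere after it.
SAFE_PATH = re.compile(r'^/home/user/(?!.*\.\.).*$')

def is_safe_path(path):
    """Hypothetical helper: True if path starts with /home/user/ and has no '..'."""
    return SAFE_PATH.match(path) is not None

print(is_safe_path('/home/user/somedir/somedir_2'))  # True
print(is_safe_path('/home/user/../somedir_2/'))      # False
```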

Related

Regex to match path containing one of two strings

RegEx to match one of two strings in the third segment, ie in pseudo code:
/content/au/(boomer or millenial)/...
Example matches
/content/au/boomer
/content/au/boomer/male/31
/content/au/millenial/female/29/M
/content/au/millenial/male/18/UM
Example non-matches
/content/au
/content/nz/millenial/male/18/UM
/content/au/genz/male
I've tried this, but to no avail:
^/content/au/(?![^/]*/(?:millenial|boomer))([^/]*)
Don't use a look ahead; just use the plain alternation millenial|boomer then a word-boundary:
^/content/au/(?:millenial|boomer)\b(?:/.*)?
See live demo.
You should probably spell millennial correctly too (two "n"s, not one).
What's with the negative lookahead? This is a simple, if not trivial, positive match.
^/content/au/(?:millenial|boomer)(?:/|$)
The final group says the match needs to be followed by a slash or nothing, so as to exclude paths which begin with one of the alternatives, but contain additional text.
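To see how the trailing (?:/|$) group behaves, here is a small Python sketch run against the sample paths from the question. The helper name in_segment is hypothetical, and the spelling "millenial" is kept as in the question.

```python
import re

# Third segment must be exactly one of the two alternatives: it has to be
# followed by a slash or by the end of the string.
SEGMENT = re.compile(r'^/content/au/(?:millenial|boomer)(?:/|$)')

def in_segment(path):
    """Hypothetical helper: True if the third path segment is an allowed value."""
    return SEGMENT.search(path) is not None

for path in ['/content/au/boomer',
             '/content/au/millenial/male/18/UM',
             '/content/au',
             '/content/au/genz/male',
             '/content/au/boomerang']:
    print(path, in_segment(path))
```

The last sample, /content/au/boomerang, is my own addition to show why the final group matters: without it, that path would be accepted.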
You can use the following regex:
content/au/(?:boomer|millenial)

Regex expression to exclude both prefix and suffix

I'm trying to build an expression which will match all text EXCLUDING text with prefix 'abc' AND suffix 'def' (text which only has the prefix OR the suffix is ok).
I've tried the following:
^(?!([a][b][c]])).*(?!([d][e][f])$), but it doesn't match text which meets only one of the criteria (i.e. abc.xxx fails, as well as xxx.pdf, though they should pass).
I understand the answer is related to 'look behind', but I'm still not quite sure how to achieve this behavior.
I've also tried the following:
^(?<!([a][b][c])).*(?!([d][e][f])$), but again, with no luck.
^((abc.*\.(?!def))|((?!abc).*\.def))$
I think there can be a simpler solution, but this one will work as you wanted it.
[a][b][c] can be simplified to abc, the same goes for def.
The first part of the pattern matches abc.*\. without def at the end.
The second part matches .*\.def without the prefix abc.
Keep it simple and combine it into a single lookahead to check both conditions:
^(?!abc.*def$).*
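A short Python sketch of the single-lookahead version, assuming single-line input strings. The helper name allowed is invented for the example.

```python
import re

# Reject only when the text both starts with "abc" AND ends with "def".
NOT_BOTH = re.compile(r'^(?!abc.*def$).*')

def allowed(text):
    """Hypothetical helper: False only for text with prefix abc and suffix def."""
    return NOT_BOTH.match(text) is not None

print(allowed('abc.xxx'))  # True  (prefix only)
print(allowed('xxx.def'))  # True  (suffix only)
print(allowed('abc.def'))  # False (both)
```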

Regex to match all URLs, excluding .css, .js resources

I'm looking for a regular expression that excludes URLs with extensions I don't want.
For example, resources ending with .css, .js, .font, .png, .jpg, etc. should be excluded.
Alternatively, I could put all such resources in the same folder and exclude URLs pointing to that folder, like:
.*\/(?!content\/media)\/.*
But that doesn't work! How can I improve this regex to match my criteria?
e.g.
Match:
http://www.myapp.com/xyzOranotherContextRoot/rest/user/get/123?some=par#/other
No match:
http://www.myapp.com/xyzOranotherContextRoot/content/media/css/main.css?7892843
The correct solution is:
^((?!\/content\/media\/).)*$
see: https://regex101.com/r/bD0iD9/4
Inspired by: Regular expression to match a line that doesn't contain a word?
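For what it's worth, here is the same pattern tried in Python against the two URLs from the question. The slashes don't need escaping in Python, and the helper name is_wanted is just an illustration.

```python
import re

# Each character is consumed only if "/content/media/" does not start there,
# so the whole string matches only when that folder appears nowhere in it.
NO_MEDIA = re.compile(r'^((?!/content/media/).)*$')

def is_wanted(url):
    """Hypothetical helper: True if the URL does not contain /content/media/."""
    return NO_MEDIA.match(url) is not None

good = 'http://www.myapp.com/xyzOranotherContextRoot/rest/user/get/123?some=par#/other'
bad = 'http://www.myapp.com/xyzOranotherContextRoot/content/media/css/main.css?7892843'
print(is_wanted(good))  # True
print(is_wanted(bad))   # False
```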
Two things:
First, the ?! negative lookahead doesn't remove any characters from the input. Add [^\/]+ before the trailing slash. Right now it is trying to match two consecutive slashes. For example:
.*\/(?!content\/media)[^\/]+\/.*
(edit) Second, the .*s at the beginning and end match too much. Try tightening those up, or adding more detail to content\/media. As it stands, content/media can be swallowed by one of the .*s and never be checked against the lookahead.
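Both points can be reproduced with a few lines of Python. This is a sketch only: the test path and the http://host prefix are made up for the demonstration.

```python
import re

original = re.compile(r'.*/(?!content/media)/.*')      # pattern from the question
patched = re.compile(r'.*/(?!content/media)[^/]+/.*')  # with [^/]+ added

media = '/root/content/media/css/main.css'  # made-up path for the demo

# First point: the lookahead consumes nothing, so the original pattern
# needs two consecutive slashes and finds none in a plain path.
print(original.search(media))  # None

# It does match once a scheme is present, but only because of the "//"
# in "http://", which is not the intended check.
print(original.search('http://host' + media) is not None)  # True

# Second point: the patched pattern still accepts the media path, because
# the leading .* can stop at a slash that is not followed by content/media.
print(patched.search(media) is not None)  # True
```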
Suggestions:
Use your original idea and test against the extensions: ^.*\.(?!css|js|font|png|jpeg)[a-z0-9]+$ (with the case-insensitive flag).
Instead of doing everything in one regular expression, use a regex that will pull out any URL (e.g., https?:\/\/\S+, perhaps?) and then test each one you find with String.indexOf: if(candidateURL.indexOf('content/media')==-1) { /*do something with the OK URL */ }
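The second suggestion (extract first, filter afterwards) could look roughly like this in Python, with the in operator standing in for String.indexOf. The function name wanted_urls and its excluded parameter are assumptions for the sketch, and the URL regex is deliberately as loose as the one suggested above.

```python
import re

# Loose URL matcher: scheme followed by any run of non-whitespace characters.
URL = re.compile(r'https?://\S+')

def wanted_urls(text, excluded='content/media'):
    """Hypothetical helper: URLs found in text that do not contain `excluded`."""
    return [url for url in URL.findall(text) if excluded not in url]

text = (
    'http://www.myapp.com/xyzOranotherContextRoot/rest/user/get/123?some=par#/other '
    'http://www.myapp.com/xyzOranotherContextRoot/content/media/css/main.css?7892843'
)
print(wanted_urls(text))
```

Splitting the job this way keeps the regex trivial and puts the exclusion rule in ordinary code, where it is easier to read and change.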

Regex to include one thing but exclude another

I've been having a lot of trouble finding how to write a regex to include certain URLs starting with a specified phrase while excluding another.
We want to include pages that start with:
/womens
/mens
/kids-clothing/boys
/kids-clothing/girls
/homeware
But we want to exclude anything that has /sXXXXXXX in the URL - where the X's are numbers.
I've written this so far to match the below URLs but it's behaving very oddly. Should I be using lookarounds or something?
\/(womens|mens|kids\-clothing\/boys|kids\-clothing\/boys|homeware).*[^s[0-9]+].*
/homeware/bathroom/s2522424/4-tier-pastel-pop-drawers-approx-91cm-x25cm-x-28cm
/homeware/bathroom/towels-and-bathmats
/homeware/bathroom/towels-and-bathmats/s2506420/boutique-luxury-towels
/homeware/bathroom/towels-and-bathmats?page=3&size=36&cols=4&sort=&id=/homeware/bathroom/towels-and-bathmats&priceRange[min]=1&priceRange[max]=14
/homeware/bathroom?page=3&size=36&cols=4&sort=&id=/homeware/bathroom&priceRange[min]=1&priceRange[max]=35
/homeware/bedroom
/homeware/bedroom/bedding-sets
/homeware/bedroom/bedding-sets/s2471012/striped-reversible-printed-duvet-set
/homeware/bedroom/bedding-sets/s2472706/check-printed-reversible-duvet-set
/homeware/bedroom/bedding-sets/s2475332/union-jack-duvet-set
/kids-clothing/boys/shop-by-age/toddler-3mnths-5yrs/s2520246/boys-lollipop-slogan-t-shirt
/kids-clothing/boys/shop-by-age/toddler-3mnths-5yrs/s2520253/boys-2-pack-dinosaur-t-shirts
/kids-clothing/girls/great-value/sale?page=1&size=36&cols=4&sort=price.asc&id=/kids-clothing/girls/great-value/sale&priceRange[min]=0.5&priceRange[max]=7
/kids-clothing/girls/mini-shops/ballet-outfits
/kids-clothing/girls/shop-by-age/baby--newborn-0-18mths
/kids-clothing/girls/shop-by-age/baby--newborn-0-18mths/s2484120/3-pack-frill-pants-pinks
/kids-clothing/girls/shop-by-age/baby--newborn-0-18mths/s2504431/3-pack-l-s-bodysuit
/mens/categories/tops?page=5&size=36&cols=4&sort=&id=/mens/categories/tops&priceRange[min]=2&priceRange[max]=22.5
/mens/categories/trousers-and-chinos
/mens/categories/trousers-and-chinos/s2438566/easy-essential-cuffed-jogging-bottoms
/mens/categories/trousers-and-chinos/s2438574/easy-essential-cuffed-jogging-bottoms
/mens/categories/trousers-and-chinos/s2458939/regatta-zip-off-lightweight-outdoor-trousers
You are on the right track. A negative lookahead will do it:
"^(?!.*\/s\d+)\/(womens|mens|kids\-clothing\/boys|kids\-clothing\/girls|homeware)\/.*"
The ^ anchors to the start of the string. The (?!.*\/s\d+) means that "/sXXXXXXX" can't appear anywhere in the string, and the rest of it matches your required starting tokens.
The reason [^s[0-9]+] didn't work is that a negated class such as [^xyz] matches exactly one character. In most regex flavors your expression is read as the class [^s[0-9] (any single character other than "s", "[" or a digit), repeated one or more times, followed by a literal "]". It would match e.g. "abc]", which has nothing to do with "/sXXXXXXX".
The reason you need to put your negative lookahead at the start of the string is so nothing is matched at all. If you put it after the \/(womens|mens|kids\-clothing\/boys|kids\-clothing\/girls|homeware)\/.*, you would still successfully match everything before the "/sXXXXXXX". i.e. for line 1 of your data, you would match "/homeware/bathroom/".
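Here is a small Python sketch of the lookahead-first pattern, checked against a few of the URLs from the question. The helper name is_listing_page is invented, and the backslash escapes for / and - are dropped because Python doesn't need them.

```python
import re

# The lookahead at the very start rejects any URL containing "/s" + digits;
# the rest requires one of the allowed section prefixes.
PAGE = re.compile(
    r'^(?!.*/s\d+)/(womens|mens|kids-clothing/boys|kids-clothing/girls|homeware)/.*'
)

def is_listing_page(url):
    """Hypothetical helper: True for allowed sections without an /sNNNNNNN part."""
    return PAGE.match(url) is not None

print(is_listing_page('/homeware/bathroom/towels-and-bathmats'))          # True
print(is_listing_page('/homeware/bathroom/s2522424/4-tier-pastel-pop-drawers-approx-91cm-x25cm-x-28cm'))  # False
print(is_listing_page('/kids-clothing/girls/mini-shops/ballet-outfits'))  # True
```

Note that, as written, the pattern requires a slash after the section name, so a bare /homeware would not match.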
Yes, you need a negative lookaround:
/^\/(womens|mens|kids\-clothing\/boys|kids\-clothing\/girls|homeware)(?:\/(?:(?!s\d+).)*)+$/gm
If you're comparing one line at a time you don't need the multiline (m) flag. It's probably behaving strangely because you had a character class (denoted by square brackets) nested inside more square brackets, which doesn't work; you can't nest character classes. This was tested and works on refiddle.

How to match a string that does not end in a certain substring?

How can I write a regular expression that matches strings that do not end with a certain substring?
In my project, all classes whose names don't end with certain strings, such as "controller" and "map", should inherit from a base class. How can I find them using a regular expression?
But using either of
public*.class[a-zA-Z]*(?<!controller|map)$
public*.class*.(?<!controller)$
there isn't any match!
Do a search for all filenames matching this:
(?<!controller|map|anythingelse)$
(Remove the |anythingelse if no other keywords, or append other keywords similarly.)
If you can't use negative lookbehinds (the (?<!..) bit), do a search for filenames that do not match this:
(?:controller|map)$
And if that still doesn't work (might not in some IDEs), remove the ?: part and it probably will - that just makes it a non-capturing group, but the difference here is fairly insignificant.
If you're using something where the full string must match, then you can just prefix either of the above with ^.* to do that.
Update:
In response to this:
But using either of
public*.class[a-zA-Z]*(?<!controller|map)$
public*.class*.(?<!controller)$
there isn't any match!
Not quite sure what you're attempting with the public/class stuff there, so try this:
public.*class.*(?<!controller|map)$
The . is a regex char that means "anything except newline", and the * means zero or more times.
If this isn't what you're after, edit the question with more details.
Depending on your regex implementation, you might be able to use a lookbehind for this task. This would look like
(?<!SomeText)$
This matches any lines NOT having "SomeText" at their end. If you cannot use that, the expression
^(?!.*SomeText$).*$
matches any line not ending with "SomeText" as well (including empty lines).
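If you happen to be testing this in Python, note that its re module only accepts fixed-width lookbehinds, so (?<!controller|map) is rejected there with a "look-behind requires fixed-width pattern" error. A sketch of two equivalent workarounds follows. The sample names are made up, and the match is case-sensitive unless you add re.IGNORECASE.

```python
import re

# Two fixed-width lookbehinds stand in for (?<!controller|map).
BY_LOOKBEHIND = re.compile(r'(?<!controller)(?<!map)$')

# The same condition expressed as a lookahead from the start of the string.
BY_LOOKAHEAD = re.compile(r'^(?!.*(?:controller|map)$).*$')

names = ['usercontroller', 'usermap', 'userservice']  # made-up sample names

print([n for n in names if BY_LOOKBEHIND.search(n)])  # ['userservice']
print([n for n in names if BY_LOOKAHEAD.match(n)])    # ['userservice']
```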
You could write a regex that contains two groups: the first consists of one or more characters before controller or map, and the second contains controller or map and is optional. The first group has to be lazy (.+?), otherwise it swallows the whole string and the second group never captures anything.
^(.+?)(controller|map)?$
With that you can match your string and, if the regex API you use has a group() method, check group(2): if it is empty, the string does not end with controller or map.
Check if the name does not match [a-zA-Z]*controller or [a-zA-Z]*map.
Finally, I did it this way:
public.*class.*[^(controller|map|spec)]$
It worked.
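A word of caution on that last pattern, as far as I can tell: square brackets define a set of single characters, so [^(controller|map|spec)] only checks that the final character is not one of ( c o n t r l e | m a p s ). It may give the right answer on a particular code base by coincidence. The Python sketch below uses made-up class declarations to show the difference.

```python
import re

# Pattern exactly as posted; the bracket expression is a negated character set.
ACCEPTED = re.compile(r'public.*class.*[^(controller|map|spec)]$')

def end_ok(line):
    """Hypothetical helper: True if the posted pattern matches the line."""
    return ACCEPTED.search(line) is not None

# Ends in "g", which is not in the set, so it matches.
print(end_ok('public class Bag'))          # True

# Does not end in controller, map or spec, but "o" is in the set,
# so the line is rejected anyway.
print(end_ok('public class ProductInfo'))  # False
```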