Match string does not contain substring with regex - regex

OK, I know this question is asked often, but I have not managed to get what I want.
I am looking for a regular expression that finds a pattern that does not contain a particular substring.
I want to match a URL that does not contain the b parameter.
http://www.website.com/a=789&c=146 > MATCH
http://www.website.com/a=789&b=412&c=146 > NOT MATCH
Currently, I have the following regex:
\bhttp:\/\/www\.website\.com\/((?!b=[0-9]+).)*\b
But I am going wrong with the \b: the regex matches the beginning of the string and stops when it finds b=, instead of not matching at all.
See: http://regex101.com/r/fN3zU5/3
Can someone help me please?

Just use a lookahead to check that whatever follows the URL is a space or the end of the line.
\bhttp:\/\/www\.website\.com\/(?:(?!b=[0-9]+).)*?\b(?= |$)

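Use this:
^http:\/\/www\.website\.com\/(?:(?!b=[0-9]+).)*$
\b only matches at word boundaries, so the match can simply end early at the boundary just before b=.
^ matches the start of the string and $ matches the end, which forces the whole URL to be checked.
You don't even need to make it that complicated. If you don't want URLs with the b parameter, a single lookahead after the slash is enough:
^http:\/\/www\.website\.com\/(?!.*b=).*$
Demo here: http://regex101.com/r/fN3zU5/5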

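import re
# Negative lookahead at the start: fail if "b=" occurs anywhere in the string.
pattern = re.compile(r"(?!.*?b=.*).*")
x = "http://www.website.com/a=789&b=412&c=146"
print(pattern.match(x))  # None, because x contains "b="
This looks ahead to check whether a "b=" is present. Because the lookahead is negative, the pattern does not match any string that contains it.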

Have you looked at this possibility?
http://regex101.com/r/fN3zU5/6
^http:\/\/www\.website\.com\/[ac\=\d&]*$
This only allows &, =, a, c and digits after the slash. The complete URL is matched, and it cannot contain a "b=" parameter.
If you have more parameters and don't want to list them all, you can instead disallow any 'b' in the parameter string:
^http:\/\/www\.website\.com\/[^b]*$
http://regex101.com/r/fN3zU5/7
^http:\/\/www\.website\.com\/(?!.*?b=.*?).*$ works too. Here only "b=" is rejected, at any position of the parameter string, so the letter "b" can still appear as the value of a parameter.
See
http://regex101.com/r/fN3zU5/8
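The three variants in this answer behave differently once a "b" shows up as a value. The following is a minimal Python sketch that runs them against the two URLs from the question plus a third URL that is not in the original and is added here only as an assumption for illustration.

```python
import re

PREFIX = r"^http://www\.website\.com/"

# The three patterns from the answer, written without the escaped slashes
# that Python does not need.
patterns = {
    "whitelist": re.compile(PREFIX + r"[ac=\d&]*$"),
    "no_b_at_all": re.compile(PREFIX + r"[^b]*$"),
    "no_b_param": re.compile(PREFIX + r"(?!.*?b=).*$"),
}

urls = [
    "http://www.website.com/a=789&c=146",        # from the question: MATCH
    "http://www.website.com/a=789&b=412&c=146",  # from the question: NOT MATCH
    "http://www.website.com/a=789&c=b",          # hypothetical: "b" as a value
]

# For each pattern, record which URLs it accepts.
results = {
    name: [bool(p.match(u)) for u in urls]
    for name, p in patterns.items()
}

for name, flags in results.items():
    print(name, flags)
```

Only the lookahead variant accepts the third URL, because it rejects the two-character sequence "b=" rather than the letter "b" itself.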

This is what you want. ^http:\/\/www\.website\.com\/(([^b]=[0-9]+).)*$

It's a simple pattern, not flexible, but it works:
http:\/\/www\.website\.com\/+a=+\w+&+c=+\w+

Related

Case analysis with REGEX

I have some data like
small_animal/Mouse
BigAnimal:Elephant
Not an animal.
What I want to get is:
Mouse
Elephant
Not an animal.
Thus, I need a regular expression that searches for / or : as follows: If one of these is found, take the text behind that character. If neither / nor : exists, take the whole string.
I tried a lot. For example this will work for mouse and elephant, but not for the third line:
(?<=:)[^:]*|(?<=/)[^/]*
And this will always give the full string ...
(?<=:)[^:]*|(?<=/)[^/]*|^.*$
My head is burning^^ Maybe, somebody can help? :) Thanks a lot!
EDIT:
@The fourth bird offered a nice solution for single characters. But what if I want to split on strings, for example
animal::Dog
Another123Cat
Not an animal.
How can I split on :: or 123?
You might use
^(?:[^:/]*[:/])?\K.+
^ Start of string
(?:[^:/]*[:/])? Optionally match any char except : or / till matching either : or /
\K Forget what is matched so far
.+ Match 1+ times any char
If you don't want to cross a newline, you can extend the character class with [^:/\r\n]*
Another option could be using an alternation
^[^:/]*[:/]\K.+|.+
Or perhaps making use of a SKIP FAIL approach by matching what you want to omit
^[^:/]*[:/](*SKIP)(*F)|.+
If you want to use multiple characters, you might also use
^(?:(?:(?!123|::|[:/]).)*+(?:123|::|[:/]))?\K.+
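The patterns above rely on \K and (*SKIP)(*F), which are PCRE features that Python's built-in re module does not support. As a rough sketch of the same idea in Python, one could split once on the first separator and keep the last piece. The function name text_after_separator is made up for this example.

```python
import re

# "123" and "::" come before the single characters so that "::" is
# consumed as one delimiter rather than as two separate colons.
SEPARATOR = re.compile(r"123|::|[:/]")


def text_after_separator(line: str) -> str:
    """Return the text after the first separator, or the whole line if none."""
    parts = SEPARATOR.split(line, maxsplit=1)
    return parts[-1]


samples = [
    "small_animal/Mouse",
    "BigAnimal:Elephant",
    "Not an animal.",
    "animal::Dog",
    "Another123Cat",
]
extracted = [text_after_separator(s) for s in samples]
print(extracted)
```

When no separator is present, split returns a single-element list, so taking the last element covers both cases without an explicit branch.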

Regex - Matching a part of a URL

I'm trying to use regular expression to match a part of the following url:
http://www.example.com/store/store.html?ptype=lst&id=370&3434323&root=nav_3&dir=desc&order=popularity
I want the Regex to find:
&3434323
Basically, it's meant to find any part of the query string that doesn't follow the variable=value form. So I need it to find the sections of the URL that don't have an equal sign in them, and match just that part.
I tried using:
&\w*+[^=_-]
But it returns: &3434323&. I need it to not return the next ampersand.
And it must be done in regex. Thanks in advance!
You can use this regex:
[?&][^=]+(&|$)
It looks for any string that doesn't contain the equal sign ([^=]+), starts with a question mark or an ampersand ([?&]), and ends with an ampersand or the end of the URL ((&|$)).
Please note that this will return &3434323&, so you'll have to strip the ampersands on both sides in your code. I assume that you're fine with that. If you really don't want the second ampersand, you can use a lookahead:
[?&][^=]+(?=&|$)
If you don't want even the first ampersand, you can use this regex, but not all regex engines support lookbehind:
(?<=\?|&)[^=]+(?=&|$)
Parsing query parameters can be tricky, but this may do the job:
((?:[?&])[^=&]+)(?=&|$)
It will not catch the ampersand at the end of the parameter, but it will include either the question mark or the ampersand at the beginning. It will match any parameter not in the form of a key-value pair.
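As an illustration, here is a small Python sketch that combines the lookbehind and lookahead ideas from the answers above and applies them to the URL from the question. The character class [^=&] is used so that a match cannot run across a parameter boundary. The name BARE_PARAM is an assumption made for this example.

```python
import re

# The lookbehind keeps the leading "?" or "&" out of the match; the
# lookahead requires the parameter to end at "&" or at the end of the string.
BARE_PARAM = re.compile(r"(?<=[?&])[^=&]+(?=&|$)")

url = (
    "http://www.example.com/store/store.html"
    "?ptype=lst&id=370&3434323&root=nav_3&dir=desc&order=popularity"
)

bare = BARE_PARAM.findall(url)
print(bare)  # ['3434323']
```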

How to get the queryparam vid from the url using regex

Please help me with this regex. I am trying to get the vid value from the following URL.
I tried the following, but I am not sure about it:
[\&]{1}vid[\=][\d]*
Is that correct?
Use vid=(\d+) to capture a numeric ID.
You can try your regex here:
https://regex101.com/r/dX3hD4/1
The trick here is to match between two patterns of interest -
"vid="
"&"
Anything you capture between that is what you're after.
Hence use this:
"http://gorid.com/api.jsp?acs=123&vid=432&skey=asdasd-asdas-adsasd".match("vid=([^&]*)&")[1]
We access index 1 of the match result because that is the first capture group, which contains the value.
In a JS/PHP type environment, you can match on something like this, which captures whatever sits between vid= and the following &:
vv = str.match(/vid=(.+?)&/)[1];
If the value is always numeric, replace (.+?) with (\d+?)
The regex you wrote will not work because you are including the characters &vid= in the return value. To make sure the regex engine checks for the string &vid= but does not include it in the result you will need to use a lookbehind:
(?<=&vid=)([^&\r\n]+)
We use a positive lookbehind to find &vid= and then grab everything from that point until the next & sign or the end of the line.
For your second request, if you wish to verify that the content of vid is a valid number you need to specify that all the characters following &vid= should be digits and also include a positive lookahead that makes sure the next character after the digits is a & sign. The corresponding regular expression then becomes:
(?<=&vid=)([^\D]+)(?=&)
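For comparison, the lookbehind approach can be sketched in Python against the sample URL used in one of the answers above. The second half uses the standard library's query-string parser instead of a regex, as an alternative for cases where a regex is not strictly required. The variable names are made up for this example.

```python
import re
from urllib.parse import parse_qs, urlparse

url = "http://gorid.com/api.jsp?acs=123&vid=432&skey=asdasd-asdas-adsasd"

# Lookbehind: "vid=" must be preceded by "?" or "&", but only the digits
# become part of the match. Python requires a fixed-width lookbehind,
# which "[?&]vid=" satisfies.
m = re.search(r"(?<=[?&]vid=)\d+", url)
vid_regex = m.group(0) if m else None

# Alternative without a regex: parse the query string and look up "vid".
vid_parsed = parse_qs(urlparse(url).query).get("vid", [None])[0]

print(vid_regex, vid_parsed)  # 432 432
```

Unlike the pattern with (?=&), neither version here requires another parameter to follow vid.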

Get all matches for a certain pattern using RegEx

I am not really a RegEx expert and hence asking a simple question.
I have a few parameters that I need to use which are in a particular pattern
For example
$$DATA_START_TIME
$$DATA_END_TIME
$$MIN_POID_ID_DLAY
$$MAX_POID_ID_DLAY
$$MIN_POID_ID_RELTM
$$MAX_POID_ID_RELTM
And these will be replaced at runtime in a string with their values (a SQL statement).
For example I have a simple query
select * from asdf where asdf.starttime = $$DATA_START_TIME and asdf.endtime = $$DATA_END_TIME
Now when I try to use the RegEx pattern
\$\$[^\W+]\w+$
I do not get all the matches (I only get the last one).
I am trying to test my usage here https://regex101.com/r/xR9dG0/2
If someone could correct my mistake, I would really appreciate it.
Thanks!
This will do the job:
\$\$\w+/g
Some clarification on why your regex behaves the way it does:
\$\$[^\W+]\w+$
An unescaped $ means end of string, so your pattern only matches something at the very end of the string. That's why you get only the last match.
The character class [^\W+] doesn't really make sense. A class starting with [^ negates the characters inside it, \W is the negation of a word character, and a + inside a class is a literal + sign. So you are saying: match one character that is not a non-word character and is not a + sign. I guess that was not what you wanted.
To match the name, \w+ alone will do. The global modifier /g ensures that matching does not stop at the first match.
This should work, based on what you said you wanted to match. It won't match $$lower_case_strings, if that's what you wanted. If not, add the "i" flag as well.
\${2}[A-Z_]+/g
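In Python there is no /g flag; re.findall and re.sub already process every match. Below is a minimal sketch using the query from the question. The replacement values in the dictionary are invented for illustration and are not from the original post.

```python
import re

query = (
    "select * from asdf where asdf.starttime = $$DATA_START_TIME "
    "and asdf.endtime = $$DATA_END_TIME"
)

# Two literal dollar signs followed by one or more word characters.
PLACEHOLDER = re.compile(r"\$\$\w+")

names = PLACEHOLDER.findall(query)
print(names)  # ['$$DATA_START_TIME', '$$DATA_END_TIME']

# Hypothetical runtime values, keyed by the full placeholder text.
values = {
    "$$DATA_START_TIME": "'2015-01-01 00:00:00'",
    "$$DATA_END_TIME": "'2015-01-02 00:00:00'",
}

# A function replacement looks up each placeholder as it is found.
filled = PLACEHOLDER.sub(lambda m: values[m.group(0)], query)
print(filled)
```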

Find first point with regex

I want a regex that returns only the characters before the first dot.
Example:
T420_02.DOMAIN.LOCAL
I want only T420_02
Please help me.
You can use the following regex: ^(.*?)(?=\.)
The captured group contains what you need (T420_02 in your example).
This simple expression should do what you need, assuming you want to match it at the beginning of the string:
^(.+?)\.
The capture group contains the string before (but not including) the dot.
Here's a fiddle: http://www.rexfiddle.net/s8l0bn3
Use regex pattern ^[^.]+(?=[.])
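As a quick check, here is a Python sketch of the last pattern applied to the example from the question, next to a plain string split for comparison. The variable names are made up for this example.

```python
import re

hostname = "T420_02.DOMAIN.LOCAL"

# One or more non-dot characters from the start, followed by a dot that is
# checked by the lookahead but not included in the match.
m = re.match(r"[^.]+(?=\.)", hostname)
first_label = m.group(0) if m else None
print(first_label)  # T420_02

# Without a regex: split once on the first dot and keep the left part.
first_label_split = hostname.split(".", 1)[0]
```

The two differ when the input has no dot at all: the regex finds no match, while the split returns the whole string.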