regex exclusion not working (need "in between but not including") - regex

Need regex to pull out the access token from the below:
{
"access_token": "APWsWZi4CfK1cejU2Fn8u2xFtFKS_sDD3XlD6AKoydYTelIIadE5rarE6V2M_LVBD3ak_1WvaL0mlKYyCrSqubsbZCSidCLHB9kepR2ffw-O0Z8aMug4e7AYQ_gs_eWSygnFjbbOvCROp6mzvaBXsTEjn1J9Rtvt5yUzP1XKcHp4dQnO04MlwryZGO0Fuov4sMWpeml-8vB7o7H4hkQnSbR1yLuG_I6mmetKZqBMKibP_C3PndvnaFJzAVODDe3bGiubKELOu6jcSEOIxZKO38F_jXSDsrwIVbyrwYriD1menbh6hN7oFWdQzYc0U-5fxnAlfPm1yHTboAPxDqgIHKVOw4Wq-Ns7zAl9ZB16omRDP0yxNIG0hSQ7mT8xnf8tpsB7v3KdiHgDVbEe7P0mwKwpkQHUGp8-0B7P7iCaXWQmylLPh43yr68",
"token_type": "Bearer",
"expires_in": 300
}
Using ([A-Z]|-)\w+ pulls out my string but also Bearer. I tried ([A-Z]|-)\w+(?!Bearer) and it made no difference. Any other suggestions?
To be perfectly clear: the "access_token": part can't be included. ONLY the token itself.

You may use this:
"access_token": "(.*)"

If you need a regex that matches only the token and nothing else, the only one I can think of that will work for you is
[\w-]{36,}
where 36 (just an example) is the minimum number of characters in the access token. But this is very hacky: another piece of information could easily be added later that would also match that regex.
This works because, in the JSON provided, the access token is by far the longest run of related characters. \w is shorthand for [a-zA-Z0-9_], and the token also contains hyphens, so the class needs to be [\w-]; [\w-]{36,} then matches a run of at least 36 such characters. Because " is not in the class, the match ends at the closing quote. In other words, as long as your number is larger than the longest word outside the token, it will only pick up the token.
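The real solution is to go with
"access_token": "([\w-]+)"
and reference the first capture group, if you have the ability to specify that. (The hyphen is in the class because the sample token contains hyphens, which \w alone does not match.)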
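As a rough illustration of the capture-group approach, here is a minimal Python sketch. The token is a shortened, made-up stand-in for the one in the question, and the variable names are only for this example. It also shows the JSON-parser route, which avoids the regex entirely when the tool at hand can parse JSON.

```python
import json
import re

# Shortened, made-up token for illustration; it keeps the hyphen and
# underscore characters that appear in the real token.
body = '''{
"access_token": "APWsWZi4CfK1_sDD3XlD6-O0Z8aMug4e7AYQ",
"token_type": "Bearer",
"expires_in": 300
}'''

# Regex route: the key is matched but only capture group 1 (the token) is kept.
match = re.search(r'"access_token":\s*"([\w-]+)"', body)
token_from_regex = match.group(1) if match else None

# Parser route: no regex needed when the body is valid JSON.
token_from_json = json.loads(body)["access_token"]

print(token_from_regex)
print(token_from_regex == token_from_json)
```

Whether group 1 can be selected depends on the tool running the regex; the sketch only shows what the group contains.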

Related

How to modify single key value pair with Charles proxy rewrite

I would like to modify the value of a single key in a larger json response body using Charles proxy re-write.
As an example, I want to change age from 20 to 30 (but can be any initial value):
{
"userId": "some_value_i_dont_want_to_touch",
"username": "Charlie",
"age": "20"
}
I do not wish to replace the entire json body as that is not practical for my situation; I just want to modify the value of a single key.
The regex "age":[\s\S]"(.*)" or "age":[\s\S]"[^"]+" highlights the text I want on sites like https://regex101.com/
I've tried a few variants of this as well, but none of them seem to actually change the value.
If I just enter the Match value "20" and the Replace value as "30" it works which tells me I have the url and path correct, but I'd like to avoid accidentally replacing the wrong thing.
Using the current full value without regex doesn't work either, so trying to do "age": "20" -> "age": "30" doesn't work, but maybe I need to escape something, though this would also limit me to cases where I know the value beforehand which is not ideal.
The error logs also indicate the request was re-written, but nothing changes
Ex: Running: Body: "age":[\s\S]"(.*)" -> "age": "30"
I am not clear on how the groups work (with the $) but maybe this has something to do with it.
I am probably just missing something obvious or going about this wrong.
Any insight would be appreciated.
For this question, the whitespace may not be needed at all: the raw response body is probably a compact text string with no spaces, and the whitespace only appears when we view it as JSON Text, where it is added to make it easier to read.
For others who are new to this like me, the following may help.
I just learned that [ needs to be escaped for the regex to work, for example:
Match:
"xxxx":\[] #tick regex
or you may use
^"xxxx.{0,}"
Here ^ anchors the match to the start, so this matches anything that starts with "xxxx.
The " at the end marks the end of the string; without it, the pattern would match everything that follows.
Replace:
"xxxx":\["new"]
Hope this helps!
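To show the idea of replacing only the value of one key, whether or not there is whitespace after the colon, here is a small Python sketch. It uses Python's re module purely for illustration; it does not reproduce Charles's own replacement syntax, and set_age is a hypothetical helper name.

```python
import re

# Compact body, as a proxy would likely see it, and a pretty-printed variant.
compact = '{"userId":"some_value_i_dont_want_to_touch","username":"Charlie","age":"20"}'
pretty = '{\n"userId": "some_value_i_dont_want_to_touch",\n"age": "20"\n}'

# Group 1 keeps the key, optional whitespace and opening quote;
# group 2 keeps the closing quote. Only the text between them is replaced.
AGE_PATTERN = re.compile(r'("age":\s*")[^"]*(")')

def set_age(text, new_age):
    """Return text with the value of "age" replaced, whatever it was before."""
    return AGE_PATTERN.sub(lambda m: m.group(1) + new_age + m.group(2), text)

print(set_age(compact, "30"))
print(set_age(pretty, "30"))
```

Using \s* instead of a single [\s\S] lets the same pattern cover both the compact and the pretty-printed form.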

Django Url pattern regex for tokens

I need to pass tokens like b'//x0eaa#abc.com//x00//xf0//x7f//xff//xff//xfd//x00' in my Django URL pattern. I am not able to find a regex that matches them, so I get a "Page not found" error.
My url will be like /api/users/0/"b'//x0eaa#abc.com//x00//xf0//x7f//xff//xff//xfd//x00'"/
I have tried with following regex
url(r'^api/users/(?P<username>[\w\-]+)/(?P<paging_state>[\w.%+-]+#[A-Za-z0-9.-]+\.[A-Za-z]{2,4})/$', views.getUserPagination),
Please pass the token in request header or body and then use accordingly in your view.
Considering there are some static predictable elements in your url like -
api/users/
/" before b
"/ at the end after '
So I can see the URL being matched in either of the two ways below, with the regex for each:
1. api/users/(set of word characters, digits or hyphens)/"(any character except newline)"/
REGEX: ^api\/users\/([\w\d\-]+)\/"(.*)"\/$
URL: url(r'^api\/users\/([\w\d\-]+)\/"(.*)"\/$', views.getUserPagination),
2. api/users/(set of word characters, digits or hyphens)/"(one character, b)'//(any number of word characters or digits)#(any number of word characters or digits).(any number of word characters or digits)(any number of word characters, digits or forward slashes)'"/
REGEX: ^api\/users\/([\w\d\-]+)\/"([a-g]'\/\/[\w\d]*#[\w\d]*\.[\w\d]*[\/\w\d]*')"\/$
URL: url(r'^api\/users\/([\w\d\-]+)\/"([a-g]'\/\/[\w\d]*#[\w\d]*\.[\w\d]*[\/\w\d]*')"\/$', views.getUserPagination),
You should be able to use either of the two. There are multiple ways to match the token part of your URL, so unless security is a big concern, you can go with the simplest approach, point 1.
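The first pattern can be checked outside Django with plain Python. This is only a sketch: it tests the regex against the example path as a string (without the leading slash, which Django strips before matching), and it says nothing about whether the raw # and quote characters survive the trip from a real client.

```python
import re

# Example path from the question, without the leading slash.
path = '''api/users/0/"b'//x0eaa#abc.com//x00//xf0//x7f//xff//xff//xfd//x00'"/'''

# Pattern from point 1: username, then anything between the double quotes.
pattern = re.compile(r'^api\/users\/([\w\d\-]+)\/"(.*)"\/$')

m = pattern.match(path)
username, paging_state = m.groups() if m else (None, None)

print(username)
print(paging_state)
```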

Regex HTTP Response Body Message

I use JMeter for REST testing.
I have made a HTTP Request, and this is the response data:
{"id":11,"name":"value","password":null,"status":"ACTIVE","lastIp":"0.0.0.0","lastLogin":null,"addedDate":1429090984000}
I need just the ID (which is 11) in
{"id":11,....
I use the REGEX below :
([0-9].+?)
It works perfectly, but it will be a problem if my ID has more than 2 digits. I would need to change the regex to:
([0-9][0-9].+?)
Is there a dynamic regex for my problem? Thank you for your attention.
Regards,
Stefio
If you want any integer between {"id": and , use the following Regular Expression:
{"id":(\d+),
However the smarter way of dealing with JSON data could be JSON Path Extractor (available via JMeter Plugins), going forward this option can be much easier to use against complex JSON.
See Using the XPath Extractor in JMeter guide (scroll down to "Parsing JSON") to learn more on syntax and use cases.
I suggest using the following regular expression:
"id":([^,]*),
This will first find "id": and then look for anything that is not a comma until it finds a comma. Note the character grouping is only around the value of the ID.
This will work for ANY length ID.
Edit:
The same concept works for almost any JSON data, for example where the value is quoted:
"key":"([^"]*)"
That regular expression will extract the value from given key, as long as value is quoted and does not contain quotes. It first finds "key": and then matches anything that is not a quote until the next quote.
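Both patterns can be tried against the response from the question. The sketch below uses Python's re module rather than JMeter's extractor, and extract_unquoted and extract_quoted are hypothetical helper names for this example.

```python
import re

response = ('{"id":11,"name":"value","password":null,"status":"ACTIVE",'
            '"lastIp":"0.0.0.0","lastLogin":null,"addedDate":1429090984000}')

def extract_unquoted(key, text):
    """Value of an unquoted field: everything after "key": up to the next comma."""
    m = re.search(r'"%s":([^,]*),' % re.escape(key), text)
    return m.group(1) if m else None

def extract_quoted(key, text):
    """Value of a quoted field: everything between the quotes after "key":."""
    m = re.search(r'"%s":"([^"]*)"' % re.escape(key), text)
    return m.group(1) if m else None

print(extract_unquoted("id", response))
print(extract_quoted("status", response))
print(extract_unquoted("id", '{"id":123456,"name":"x"}'))
```

Note that the unquoted pattern relies on a trailing comma, so it would not find a field that comes last in the object.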
You can use the quantifier like this:
([0-9]{2,}.+?)
It will catch 2 or more digits, and then any symbol, 1 or more times. If you want to allow no other characters after the digits, use * instead of +:
([0-9]{2,}.*?)

RegEx to cut out URL

I try to get an URL from a String of the following format:
RANDOMRUBBISHhttps://www.my-url.com/randomfirstname_randomlastnameRANDOMRUBBISH
I already tried some things, especially lookbehind/lookahead, which I had used successfully on another URL format (starts with https..., ends with .html; this was working).
But seems I'm too stupid to figure out the regex for the kind of string mentioned above. I just want the URL part from https.... to the end of the random last name. Is this even possible?
Any Ideas?
If you can guarantee that randomfirstname_randomlastname is all lowercase and RANDOMRUBBISH is all uppercase, you can use the character classes [a-z] and [A-Z]. How you apply them depends on the language the regex is for.
This example works in JavaScript:
var str = "RANDOMRUBBISHhttps://www.my-url.com/randomfirstname_randomlastnameRANDOMRUBBISH";
// [a-z_] includes the underscore so the match runs through the last name
var match = /https:\/\/www\.my-url\.com\/[a-z_]*/.exec(str);
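For comparison, the same idea in Python, under the same assumption that the name part is lowercase letters and underscores while the surrounding rubbish is uppercase:

```python
import re

text = "RANDOMRUBBISHhttps://www.my-url.com/randomfirstname_randomlastnameRANDOMRUBBISH"

# The lowercase-plus-underscore class stops at the first uppercase letter.
m = re.search(r'https://www\.my-url\.com/[a-z_]*', text)
url = m.group(0) if m else None

print(url)
```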

MFC: How do I construct a good regular expression that validates URLs?

Here's the regular expression I use, and I parse it using CAtlRegExp of MFC :
(((h|H?)(t|T?)(t|T?)(p|P?)(s|S?))://)?([a-zA-Z0-9]+[\.]+[a-zA-Z0-9]+[\.]+[a-zA-Z0-9])
It works fine except with one flaw. When URL is preceded by characters, it still accepts it as a URL.
ex inputs:
this is a link www.google.com (where I can just tokenize the spaces and validate each word)
is...www.google.com (this string still matches the RegEx above :( )
Please help...
Thanks...
Use the IgnoreCase flag instead of catering for each case.
Stick a ^ at the beginning if you want the start of the string to be the start of the URL
You're missing a lot of characters from possible, valid URLs.
You need to tell the regex to match only at the start and end of the string. I'm not sure how you do that in VC++; in most regex flavors you enclose the pattern in ^ and $. The ^ says "the start of the string" and the $ says "the end of the string." Note the + added after the final character class: without it, the anchored pattern allows only a single character after the last dot and rejects www.google.com.
^(((h|H?)(t|T?)(t|T?)(p|P?)(s|S?))\://)?([a-zA-Z0-9]+[\\.]+[a-zA-Z0-9]+[\\.]+[a-zA-Z0-9]+)$
The second is matching because the string still contains a valid URL.
How about using CUrl (that is, 'C-Url', in ATL, not curl as in libcurl) which can 'parse' urls with CUrl::CrackUrl . If that function returns FALSE you assume it's not a valid URL.
That said, decomposing a URL is complex enough to warrant a proper parser rather than regex-based decomposition. Cf. RFC 2396 etc. for an overview of the complexities.
Start the regex with ^ and end it with $ to have the regex match only if the entire string matches (if that's what you want):
^(((h|H?)(t|T?)(t|T?)(p|P?)(s|S?))\://)?([a-zA-Z0-9]+[\.]+[a-zA-Z0-9]+[\.]+[a-zA-Z0-9]+)$
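The effect of anchoring can be seen with the two inputs from the question. This is a simplified sketch in Python, not CAtlRegExp: it uses a case-insensitive flag instead of the per-letter alternatives, and the three-label host pattern is only a stand-in for the original.

```python
import re

# Simplified stand-in for the pattern above: optional scheme, then three
# dot-separated alphanumeric labels.
CORE = r'(https?://)?[a-z0-9]+\.[a-z0-9]+\.[a-z0-9]+'

unanchored = re.compile(CORE, re.IGNORECASE)
anchored = re.compile('^' + CORE + '$', re.IGNORECASE)

def is_url(text):
    """True only if the whole string matches the pattern."""
    return anchored.search(text) is not None

print(is_url("www.google.com"))
print(is_url("is...www.google.com"))
# Without anchors the pattern still finds the URL inside the longer string.
print(unanchored.search("is...www.google.com").group(0))
```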
What about this one: (((f|ht)tp://)[-a-zA-Z0-9#:%_\+.~#?&//=]+) ?
This Regular Expression has been tested to work for the following
http|https://host[:port]/[?][parameter=value]*
public static final String URL_PATTERN = "(https?|ftp)://(www\\.)?(((([a-zA-Z0-9.-]+\\.){1,}[a-zA-Z]{2,4}|localhost))|((\\d{1,3}\\.){3}(\\d{1,3})))(:(\\d+))?(/([a-zA-Z0-9-._~!$&'()*+,;=:#/]|%[0-9A-F]{2})*)?(\\?([a-zA-Z0-9-._~!$&'()*+,;=:/?#]|%[0-9A-F]{2})*)?(#([a-zA-Z0-9._-]|%[0-9A-F]{2})*)?";
PS. It also validates on localhost link.
(Thoroughly written by me :-))