How to modify single key value pair with Charles proxy rewrite - regex

I would like to modify the value of a single key in a larger json response body using Charles proxy re-write.
As an example, I want to change age from 20 to 30 (but can be any initial value):
{
"userId": "some_value_i_dont_want_to_touch",
"username": "Charlie",
"age": "20"
}
I do not wish to replace the entire json body as that is not practical for my situation; I just want to modify the value of a single key.
The regex "age":[\s\S]"(.*)" or "age":[\s\S]"[^"]+" highlights the text I want on sites like https://regex101.com/
I've tried a few variants of this as well, but none of them seem to actually change the value.
If I just enter the Match value "20" and the Replace value as "30" it works which tells me I have the url and path correct, but I'd like to avoid accidentally replacing the wrong thing.
Using the current full value without regex doesn't work either: "age": "20" -> "age": "30" has no effect. Maybe I need to escape something, but this approach would also limit me to cases where I know the value beforehand, which is not ideal.
The error logs also indicate the request was re-written, but nothing changes
Ex: Running: Body: "age":[\s\S]"(.*)" -> "age": "30"
I am not clear on how the groups work (with the $) but maybe this has something to do with it.
I am probably just missing something obvious or going about this wrong.
Any insight would be appreciated.

For this topic, perhaps no whitespace is needed in the pattern: the raw response body may be a compact text string with no spaces, and the whitespace only appears when we view it as formatted JSON text, which makes it easier to read.
For anyone else who is new to this, the following may help.
I just learned that [ needs to be escaped for the regex to work, e.g.:
Match:
"xxxx":\[] #tick regex
or you may use,
^"xxxx.{0,}"
Here ^ means the match must start with "xxxx".
The " at the end marks the end of the string; without it, the pattern would match everything that follows.
Replace:
"xxxx":\["new"]
Hope this helps!
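As a rough illustration of the whitespace point, here is a minimal Python sketch (not Charles itself). It assumes the raw body may arrive either compact or pretty-printed. The pattern uses \s* so that zero or more whitespace characters after the colon are accepted, and a capture group so that only the quoted value is swapped. The names AGE_PATTERN and set_age are made up for this example. In a Java-style replacement string, which Charles is assumed to use, the group would be referenced as $1.

```python
import re

# \s* accepts any amount of whitespace after the colon, including none.
# Group 1 captures the key and the original spacing so they can be kept.
AGE_PATTERN = re.compile(r'("age":\s*)"[^"]*"')


def set_age(body, new_age):
    """Replace only the quoted value of "age", whatever it currently is."""
    return AGE_PATTERN.sub(lambda m: m.group(1) + '"' + new_age + '"', body)


compact = '{"userId":"some_value_i_dont_want_to_touch","username":"Charlie","age":"20"}'
pretty = '{\n"userId": "some_value_i_dont_want_to_touch",\n"username": "Charlie",\n"age": "20"\n}'

# The pattern from the question requires exactly one character between
# the colon and the quote, so it finds nothing in the compact body.
question_pattern_on_compact = re.search(r'"age":[\s\S]"(.*)"', compact)

print(set_age(compact, "30"))
print(set_age(pretty, "30"))
```

If the body seen by the proxy is the compact form, this would explain why a pattern that highlights correctly on pretty-printed text in a regex tester changes nothing in practice.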

Related

Regex for capturing *only* the value for a specific key out of a JSON string

I have a JSON String, like "{"Key1": "Value1", "Key2": "Value2"}"
And I'm trying to specifically get Value1 (I don't want the quotes)
Basically I don't want to decode the entire JSON string, I just want to extract a single value as a string
The best I can do right now is get "Key1": "Value1"
There are a bunch of questions that are similar, and here are answers that are close:
https://stackoverflow.com/a/14350155/2415178
https://stackoverflow.com/a/33783638/2415178
However, each of these answers always include the key name as well
I've been fiddling with regexr (here's my test: https://regexr.com/57fln), but I've been at it for about half an hour now and I don't really want to keep trying, as every time I learn regex I end up forgetting it because of how infrequently I use it. I'm assuming it should be pretty simple, but my solutions that are close look really ugly
The RegEx that you link to,
/"Key1": "(.+?)"/i
is actually perfect - it's just that you need to get the output of the capture group, rather than of the whole RegEx, and the tool that you're using (RegExr) doesn't show this.
If you ask for just that group, then you'll have what you wanted, e.g. in JavaScript:
window.onload = () => {
  // [1] selects the first capture group: the value without the key or quotes
  document.write('{"Key1": "Value1", "Key2": "Value2"}'.match(/"Key1": "(.+?)"/i)[1]);
};
Note: the [1] is doing the magic here - if you used [0] you would get the whole matched string, which is what RegExr shows. [1] gives the 1st capture group (if one exists), [2] would give the second, etc.
However, I would recommend that you use a JSON parser instead, as there are a lot of things that this could break on - e.g. if the value has escaped quotes like "Key1": "Value\"1", etc.
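To make the difference between the whole match and the capture group concrete, and to show the escaped-quote problem, here is a small Python sketch. The string named tricky is a made-up variant of the example in which the value contains an escaped quote.

```python
import json
import re

raw = '{"Key1": "Value1", "Key2": "Value2"}'

match = re.search(r'"Key1": "(.+?)"', raw, re.IGNORECASE)
whole_match = match.group(0)   # key and value together, as RegExr displays it
value_only = match.group(1)    # first capture group: just the value

# Hypothetical input where the value contains an escaped quote.
tricky = '{"Key1": "Value\\"1", "Key2": "Value2"}'

# The lazy group stops at the first quote it meets, which is the escaped one.
regex_result = re.search(r'"Key1": "(.+?)"', tricky).group(1)

# A JSON parser understands the escape and returns the real value.
parsed_result = json.loads(tricky)["Key1"]

print(whole_match, value_only, regex_result, parsed_result)
```

The regex result on the tricky input is cut short at the backslash, which is the kind of breakage a parser avoids.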

What is the regular expression to extract word which is not preceded by any characters

For example,
ID is content ID,
I need a regex to extract the first ID. I tried using [/b]ID, but it is not working
You're not giving us a lot to go on, but here is an extremely general example:
(ID.+)
which matches ID and everything after, be it ID:, ID "123", or ID: "123". Note that if you have characters after ID, it will capture those too. Update your question and I will update my answer to accommodate it accordingly.
Here is a live example: http://rubular.com/r/enuYP1kze4.
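If the goal is only the first standalone ID, one possible reading is that the attempted [/b] was meant to be the word boundary \b. Under that assumption, a minimal Python sketch on the sample text looks like this; a search returns the first match, so nothing extra is needed to skip the second ID.

```python
import re

text = "ID is content ID,"

# search() scans left to right and returns the first match only.
first = re.search(r"\bID\b", text)

# match() only succeeds if the pattern matches at the very start of the text.
at_start = re.match(r"ID\b", text)

# For comparison, every standalone ID and where it starts.
all_positions = [m.start() for m in re.finditer(r"\bID\b", text)]

print(first.group(0), first.start(), all_positions)
```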

regex exclusion not working (need "in between but not including")

Need regex to pull out the access token from the below:
{
"access_token": "APWsWZi4CfK1cejU2Fn8u2xFtFKS_sDD3XlD6AKoydYTelIIadE5rarE6V2M_LVBD3ak_1WvaL0mlKYyCrSqubsbZCSidCLHB9kepR2ffw-O0Z8aMug4e7AYQ_gs_eWSygnFjbbOvCROp6mzvaBXsTEjn1J9Rtvt5yUzP1XKcHp4dQnO04MlwryZGO0Fuov4sMWpeml-8vB7o7H4hkQnSbR1yLuG_I6mmetKZqBMKibP_C3PndvnaFJzAVODDe3bGiubKELOu6jcSEOIxZKO38F_jXSDsrwIVbyrwYriD1menbh6hN7oFWdQzYc0U-5fxnAlfPm1yHTboAPxDqgIHKVOw4Wq-Ns7zAl9ZB16omRDP0yxNIG0hSQ7mT8xnf8tpsB7v3KdiHgDVbEe7P0mwKwpkQHUGp8-0B7P7iCaXWQmylLPh43yr68",
"token_type": "Bearer",
"expires_in": 300
}
Using ([A-Z]|-)\w+ pulls out my string but also Bearer. I tried ([A-Z]|-)\w+(?!Bearer) and it made no difference. Any other suggestions?
To be perfectly clear: the "access_token": part can't be included. ONLY the token itself.
You may use this:
"access_token": "(.*)"
If you need a regex that only matches the token and not anything else, the only regex I can think of that will work for you is
\w{36,}
where 36 (just an example) is the minimum number of characters in the access token. But this is very hacky. It is at least possible, if not likely, that another piece of information could get added that would also match that regex.
The reason this works is that, in the JSON provided, the access token is by far the longest string of related characters. \w is a shortcut for [a-zA-Z0-9_] and \w{36,} will match a string of at least 36 such characters. Because " is not included, the string terminates at that point. In other words, as long as your number is larger than the longest word outside of the token, it will only pick up the token. One caveat: the token shown also contains hyphens, which \w does not match, so the match stops at the first hyphen and returns only part of the token; [\w-]{36,} covers the hyphens as well.
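The real solution is to go with
"access_token": "([\w-]+)"
and reference the first capture group, if you have the ability to specify that. (The character class includes - because the token shown contains hyphens, which \w alone does not match.)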
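Here is a short Python sketch of the capture-group approach. The token below is a shortened, made-up stand-in for the one in the question. It keeps the two features that matter: an underscore and a hyphen. Because \w does not match a hyphen, the sketch compares a \w-only group with a class that also allows -, and with a JSON parser.

```python
import json
import re

# Shortened, hypothetical token; the real one is much longer.
body = """{
"access_token": "APWsWZi4CfK1-cejU2Fn8_u2xFtFKS",
"token_type": "Bearer",
"expires_in": 300
}"""

# \w+ stops at the hyphen, and the closing quote does not follow, so no match.
word_only = re.search(r'"access_token": "(\w+)"', body)

# Allowing hyphens in the class captures the whole token in group 1.
with_hyphen = re.search(r'"access_token": "([\w-]+)"', body).group(1)

# Where a JSON parser is available, it avoids the character-class question.
parsed = json.loads(body)["access_token"]

print(word_only, with_hyphen, parsed)
```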

Regex HTTP Response Body Message

I use JMeter for REST testing.
I have made a HTTP Request, and this is the response data:
{"id":11,"name":"value","password":null,"status":"ACTIVE","lastIp":"0.0.0.0","lastLogin":null,"addedDate":1429090984000}
I need just the ID (which is 11) in
{"id":11,....
I use the REGEX below :
([0-9].+?)
It works perfectly, but it will be a problem if my ID has more than 2 digits, because I would need to change the REGEX to:
([0-9][0-9].+?)
Is there a dynamic REGEX for my problem? Thank you for your attention.
Regards,
Stefio
If you want any integer between {"id": and , use the following Regular Expression:
{"id":(\d+),
However, the smarter way of dealing with JSON data could be the JSON Path Extractor (available via JMeter Plugins). Going forward, this option can be much easier to use against complex JSON.
See Using the XPath Extractor in JMeter guide (scroll down to "Parsing JSON") to learn more on syntax and use cases.
I suggest using the following regular expression:
"id":([^,]*),
This will first find "id": and then look for anything that is not a comma until it finds a comma. Note the character grouping is only around the value of the ID.
This will work for ANY length ID.
Edit:
The same concept works for almost any JSON data, for example where the value is quoted:
"key":"([^"]*)"
That regular expression will extract the value from given key, as long as value is quoted and does not contain quotes. It first finds "key": and then matches anything that is not a quote until the next quote.
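The patterns above can be tried outside JMeter with a small Python sketch. It runs them against the response from the question and against a made-up variant with a six-digit ID (long_id is an assumption added for illustration). The leading brace is escaped here to keep it a literal in Python; whether JMeter's regex engine needs that is not covered by this sketch.

```python
import re

response = ('{"id":11,"name":"value","password":null,"status":"ACTIVE",'
            '"lastIp":"0.0.0.0","lastLogin":null,"addedDate":1429090984000}')

# One or more digits between {"id": and the following comma.
digits = re.search(r'\{"id":(\d+),', response).group(1)

# Anything that is not a comma, up to the next comma.
up_to_comma = re.search(r'"id":([^,]*),', response).group(1)

# Quoted value: anything that is not a quote, up to the closing quote.
status = re.search(r'"status":"([^"]*)"', response).group(1)

# Hypothetical response with a longer ID.
long_id = response.replace('"id":11', '"id":123456')
digits_long = re.search(r'\{"id":(\d+),', long_id).group(1)

# The pattern from the question captures only two characters.
original_attempt = re.search(r'([0-9].+?)', long_id).group(1)

print(digits, up_to_comma, status, digits_long, original_attempt)
```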
You can use the quantifier like this:
([0-9]{2,}.+?)
It will match 2 or more digits, followed by any character one or more times (lazily, so as few as possible). If you want to allow no other characters after the digits, use * instead of +:
([0-9]{2,}.*?)

ElasticSearch Regexp Filter

I'm having problems correctly expressing a regexp for the ElasticSearch Regexp Filter. I'm trying to match on anything in "info-for/media" in the url field e.g. http://mydomain.co.uk/info-for/media/press-release-1. To try and get the regex right I'm using match_all for now, but this will eventually be match_phrase with the user's query string.
POST to localhost:9200/_search
{
"query" : {
"match_all" : { },
"filtered" : {
"filter" : {
"regexp": {
"url":".*info-for/media.*"
}
}
}
},
}
This returns 0 hits, but does parse correctly. .*info.* does get results containing the url, but unfortunately is too broad, e.g. matching any urls containing "information". As soon as I add the hyphen in "info-for" back in, I get 0 results again. No matter what combination of escape characters I try, I either get a parse exception, or no matches. Can anybody help explain what I'm doing wrong?
First, to the extent possible, try to never use regular expressions or wildcards that don't have a prefix. The way a search for .*foo.* is done, is that every single term in the index's dictionary is matched against the pattern, which in turn is constructed into an OR-query of the matching terms. This is O(n) in the number of unique terms in your corpus, with a subsequent search that is quite expensive as well.
This article has some more details about that: https://www.found.no/foundation/elasticsearch-from-the-bottom-up/
Secondly, your url is probably tokenized in a way that makes "info-for" and "media" separate terms in your index. Thus, there is no info-for/media-term in the dictionary for the regexp to match.
What you probably want to do is to index the path and the domain separately, with a path_hierarchy-tokenizer to generate the terms.
Here is an example that demonstrates how the tokens are generated: https://www.found.no/play/gist/ecf511d4102a806f350b#analysis
I.e. /foo/bar/baz generates the tokens /foo/bar/baz, /foo/bar, /foo and the domain foo.example.com is tokenized to foo.example.com, example.com, com
A search for anything below /foo/bar could then be a simple term filter matching path:/foo/bar. That's a massively more performant filter, which can also be cached.
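To show why the term filter works, here is a plain-Python sketch that imitates the token lists described above. It is only an illustration of the idea, not Elasticsearch code, and the helper names path_tokens and domain_tokens are made up for this example.

```python
def path_tokens(path):
    """Imitate the tokens described above: the full path and each parent prefix."""
    parts = [p for p in path.split("/") if p]
    return ["/" + "/".join(parts[:i]) for i in range(len(parts), 0, -1)]


def domain_tokens(domain):
    """Imitate the domain tokens described above: the full name, then each parent domain."""
    labels = domain.split(".")
    return [".".join(labels[i:]) for i in range(len(labels))]


print(path_tokens("/foo/bar/baz"))
print(domain_tokens("foo.example.com"))

# The URL path from the question yields /info-for/media as one exact token,
# so an exact term lookup finds it without scanning every term with a regexp.
question_path_tokens = path_tokens("/info-for/media/press-release-1")
print("/info-for/media" in question_path_tokens)
```

Because the prefix is stored as a term of its own, the lookup is an exact match rather than a pattern test against the whole term dictionary.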