Regex to get value from JSON field - regex

Considering the JSON below, I need to get the value of text from grand_total. The closest regular expression I have come up with is:
"grand_total":(.*?)\}
{
"data": [{
"grand_total": {
"digital": "4:41",
"hours": 4,
"minutes": 41,
"text": "4 hrs 41 mins",
"total_seconds": 16880.662732
}
}],
"end": "2019-09-04T02:59:59Z",
"start": "2019-09-03T03:00:00Z"
}

In s (dotall) mode, where . also matches newlines, an expression similar to the following might extract the desired value,
"grand_total":\s*{.*?"text"\s*:\s*"([^"]*)"
and you can likely refer to the captured value as $1.
If you wish to explore, simplify, or modify the expression, regex101.com explains each part of it in its top right panel and shows how it matches against sample inputs.
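As a rough illustration of the answer above, here is a minimal Python sketch. It assumes Python's re module, where re.DOTALL plays the role of s mode and group(1) corresponds to $1. The brace is escaped as \{ for clarity, and the json.loads lookup is added only for comparison; neither is part of the original answer.

```python
import json
import re

# Sample document from the question.
doc = """{
  "data": [{
    "grand_total": {
      "digital": "4:41",
      "hours": 4,
      "minutes": 41,
      "text": "4 hrs 41 mins",
      "total_seconds": 16880.662732
    }
  }],
  "end": "2019-09-04T02:59:59Z",
  "start": "2019-09-03T03:00:00Z"
}"""

# re.DOTALL makes '.' match newlines too, so '.*?' can span the lines
# between "grand_total" and "text".
pattern = re.compile(r'"grand_total":\s*\{.*?"text"\s*:\s*"([^"]*)"', re.DOTALL)
match = pattern.search(doc)
regex_text = match.group(1) if match else None

# The same value through a JSON parser, for comparison.
parsed_text = json.loads(doc)["data"][0]["grand_total"]["text"]

print(regex_text)
print(parsed_text)
```

Both routes give the same string for this input; the parser route does not depend on key order or whitespace.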

Related

Regex: How to select float value in JSON string

How can I select a float number in the JSON below?
I wish to select only the value of cummulativeQuoteQty, that is, 2221.98.
{"orderId": 431708286, "clientOrderId": "eYKLCLsatTklXnphSd6ffF", "transactTime": 1676486706186, "price": "0", "origQty": "22.3", "executedQty": "22.3", "cummulativeQuoteQty": "2221.98", "status": "FILLED", "timeInForce": "GTC", "type": "MARKET", "side": "SELL", "fills": [{"price": "1.389", "qty": "22.3", "commission": "0.00007653", "commissionAsset": "BNB"}], "isIsolated": false}
If you must use regex instead of a json parser:
(?<="cummulativeQuoteQty": ")[^"]+
See live demo.
Note that this approach is brittle with respect to formatting and is less readable than parsing the JSON, which most languages support, and getting the field value by name.
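To make the trade-off concrete, here is a small Python sketch of both routes. The input is an abbreviated copy of the order from the question (only a few fields kept), and the variable names are illustrative.

```python
import json
import re

# Abbreviated copy of the order JSON from the question.
raw = (
    '{"orderId": 431708286, "executedQty": "22.3", '
    '"cummulativeQuoteQty": "2221.98", "status": "FILLED"}'
)

# Regex route: the lookbehind requires exactly one space after the colon,
# so it stops matching if the producer changes its spacing.
m = re.search(r'(?<="cummulativeQuoteQty": ")[^"]+', raw)
via_regex = m.group(0) if m else None

# Parser route: look the field up by name, then convert the string.
via_json = json.loads(raw)["cummulativeQuoteQty"]
as_number = float(via_json)

print(via_regex, via_json, as_number)
```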

How to match a string exactly OR exact substring from beginning using Regular Expression

I'm trying to build a regex query for a database and it's got me stumped. If I have a string with a varying number of elements that has an ordered structure how can I find if it matches another string exactly OR some exact sub string when read from the left?
For example I have these strings
Canada.Ontario.Toronto.Downtown
Canada.Ontario
Canada.EasternCanada.Ontario.Toronto.Downtown
England.London
France.SouthFrance.Nice
They are structured by most general location to specific, left to right. However, the number of elements varies with some specifying a country.region.state and so on, and some just country.town. I need to match not only the words but the order.
So if I want to match "Canada.Ontario.Toronto.Downtown" I would want to both get #1 and #2 and nothing else. How would I do that? Basically running through the string and as soon as a different character comes up it's not a match but still allow a sub string that ends "early" to match like #2.
I've tried making groups and using "?" like (canada)?.?(Ontario)?.? etc but it doesn't seem to work in all situations since it can match nothing as well.
Edit as requested:
Mongodb Database Collection:
[
{
"_id": "doc1",
"context": "Canada.Ontario.Toronto.Downtown",
"useful_data": "Some Data"
},
{
"_id": "doc2",
"context": "Canada.Ontario",
"useful_data": "Some Data"
},
{
"_id": "doc3",
"context": "Canada.EasternCanada.Ontario.Toronto.Downtown",
"useful_data": "Some Data"
},
{
"_id": "doc4",
"context": "England.London",
"useful_data": "Some Data"
},
{
"_id": "doc5",
"context": "France.SouthFrance.Nice",
"useful_data": "Some Data"
},
{
"_id": "doc6",
"context": "",
"useful_data": "Some Data"
}
]
User provides "Canada", "Ontario", "Toronto", and "Downtown" values in that order and I need to use that to query doc1 and doc2 and no others. So I need a regex pattern to put in here: collection.find({"context": {$regex: <pattern here>}}). If it's not possible I'll just have to restructure the data and use different methods of finding those docs.
At each dot, start a nested optional group for the next term, and add start and end anchors:
^Canada(\.Ontario(\.Toronto(\.Downtown)?)?)?$
See live demo.
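Since the user supplies the terms at run time, the pattern above would probably be built programmatically. The following is a sketch in Python under that assumption; prefix_pattern is a hypothetical helper name, and the check uses Python's re module rather than MongoDB itself.

```python
import re

def prefix_pattern(terms):
    # Builds ^a(\.b(\.c)?)?$ from ["a", "b", "c"].
    if not terms:
        raise ValueError("need at least one term")
    escaped = [re.escape(t) for t in terms]
    tail = ""
    # Wrap from the innermost (last) term outward so the groups nest.
    for term in reversed(escaped[1:]):
        tail = r"(\." + term + tail + ")?"
    return "^" + escaped[0] + tail + "$"

pattern = prefix_pattern(["Canada", "Ontario", "Toronto", "Downtown"])

# The context values from the question's collection.
contexts = [
    "Canada.Ontario.Toronto.Downtown",
    "Canada.Ontario",
    "Canada.EasternCanada.Ontario.Toronto.Downtown",
    "England.London",
    "France.SouthFrance.Nice",
    "",
]
matched = [c for c in contexts if re.search(pattern, c)]

print(pattern)
print(matched)
```

The resulting string could then be passed as the $regex value in the find call from the question. Note that a shorter prefix such as "Canada" alone would also match, because every group after the first term is optional.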

How to extract value from dynamic generation

I am trying to use value from dynamic generation.
My payload looks like:
{
"payload": [
{
"questionDefinitionId": "jRs6zAh3GGt3G8tL9SzUrS8SiXyg6EirSElv3VRpX_Q=",
"questionText": "What was your childhood nickname?",
"languageCode": "en",
"questionNumber": 1,
"disabled": false
},
{
"questionDefinitionId": "pmyZ4excucJBuFvSPCr6yIvO74vZS8DUNPx0GYVR57E=",
"questionText": "What is your favorite team?",
"languageCode": "en",
"questionNumber": 2,
"disabled": false
},
{
"questionDefinitionId": "awE_x8cXHcc0uhJ7lgtjzX1NtgA0IQBBWu7iDbVqW-k=",
"questionText": "What is the name of your favorite childhood friend?",
"languageCode": "en",
"questionNumber": 1,
"disabled": false
},
{
The output is different every time it is generated.
I need to get jRs6zAh3GGt3G8tL9SzUrS8SiXyg6EirSElv3VRpX_Q=, which is the questionDefinitionId value for questionNumber 1, but the entries appear in a different order in the long JSON list every time.
Your payload appears to be a JSON object, so it makes much more sense to use the JSON Extractor; it will be much easier to implement, read, and support.
For example you can get questionDefinitionId attribute value where questionText is What was your childhood nickname? using == Filter Operator like:
$.payload[?(@.questionText == 'What was your childhood nickname?')].questionDefinitionId
If you want the questionDefinitionId where questionNumber is 1, amend the JSON Path Expression to look like:
$.payload[?(@.questionNumber == 1)].questionDefinitionId
However, in your example there are two questions with number 1, so this expression matches both.
See API Testing With JMeter and the JSON Extractor for more information on the concept.
Try this regex:
(?<="questionDefinitionId": ")(.+?)(?=")
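For readers outside JMeter, the same filtering idea can be sketched in plain Python. The payload in the question is truncated, so this sketch assumes only the three visible entries and closes the array itself. Note that the lookbehind regex above matches every questionDefinitionId and does not tie a match to a question number or text, which is why filtering on a field is shown here.

```python
import json

# Only the three entries visible in the question; the real list is longer.
payload_json = """{
  "payload": [
    {"questionDefinitionId": "jRs6zAh3GGt3G8tL9SzUrS8SiXyg6EirSElv3VRpX_Q=",
     "questionText": "What was your childhood nickname?",
     "languageCode": "en", "questionNumber": 1, "disabled": false},
    {"questionDefinitionId": "pmyZ4excucJBuFvSPCr6yIvO74vZS8DUNPx0GYVR57E=",
     "questionText": "What is your favorite team?",
     "languageCode": "en", "questionNumber": 2, "disabled": false},
    {"questionDefinitionId": "awE_x8cXHcc0uhJ7lgtjzX1NtgA0IQBBWu7iDbVqW-k=",
     "questionText": "What is the name of your favorite childhood friend?",
     "languageCode": "en", "questionNumber": 1, "disabled": false}
  ]
}"""

questions = json.loads(payload_json)["payload"]

# Filtering by number returns two ids here, because two entries share number 1.
ids_for_number_1 = [
    q["questionDefinitionId"] for q in questions if q["questionNumber"] == 1
]

# Filtering by question text is unambiguous in this sample.
id_for_nickname = next(
    q["questionDefinitionId"]
    for q in questions
    if q["questionText"] == "What was your childhood nickname?"
)

print(ids_for_number_1)
print(id_for_nickname)
```

Because the lookup is by field value, the position of the entry in the list does not matter.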

Regex to match text between two delimiters?

Here's an example of what I need to match, from a request that I have stored as text:
[{"id":"896","name":"TinyAuras","author_id":"654","author":"Kurisu</span></strong></span></a>","githubFolder":"https://github.com/xKurisu/TinyAuras/blob/master/TinyAuras.csproj","count":9,"countByChampion":{"":9,"total":9},"description":"(Beta) Aura/Buff/Debuff Tracker","udate":"1451971516","createdDays":375,"image":"https://cdn.joduska.me/forum/uploads/assemblydb/image-default.jpg","strudate":"2016-07-22 19:40","champions":null,"forum_link":"165574","assembly_compiles":true,"voted":false,"voted_champions":[]},
I want to select that link up to the last slash (basically the GitHub folder, not the actual csproj file).
I have a file full of thousands of those and I'm trying to extract all of those links and put them in a text file.
Here is what I have so far as a Perl regex:
(?<=githubFolder":").*(?=\/.+\.csproj")
but that ends up selecting more than I need after the first match. Any suggestions?
The issue is, I want everything right before this.csproj.
So in my example I want to extract:
https://github.com/xKurisu/TinyAuras/blob/master/
This regex:
"githubFolder":"([^"]*/)[^"/]*"
selects:
https://github.com/xKurisu/TinyAuras/blob/master/
in your example.
However, it would likely be better to use an actual json parser as Jim D.'s answer suggests so you won't have to worry about spacing and special characters.
While the accepted answer will likely get the job done here, I just want to point out that the old-school Linux tools are not easy to use to get 100% accurate results when working with JSON, and for that reason it would be best practice to use an actual JSON parser to extract your content.
One simple reason is that strings are JSON-encoded, so you will need to somehow decode them to ensure you get the correct result. Another is that JSON is not a regular language; it is context-free. You will need something more powerful than regular expressions in general.
One I am familiar with is jq, and the array of JSON objects can be parsed as the OP desires like this:
$ jq -r ' .[] | .githubFolder ' foo
https://github.com/xKurisu/TinyAuras/blob/master/TinyAuras.csproj
https://github.com/xKurisu/"GiantAuras"/blob/master/GiantAuras.csproj
$
where file foo is
[
{
"id": "896",
"name": "TinyAuras",
"author_id": "654",
"author": "Kurisu</span></strong></span></a>",
"githubFolder": "https://github.com/xKurisu/TinyAuras/blob/master/TinyAuras.csproj",
"count": 9,
"countByChampion": {
"": 9,
"total": 9
},
"description": "(Beta) Aura/Buff/Debuff Tracker",
"udate": "1451971516",
"createdDays": 375,
"image": "https://cdn.joduska.me/forum/uploads/assemblydb/image-default.jpg",
"strudate": "2016-07-22 19:40",
"champions": null,
"forum_link": "165574",
"assembly_compiles": true,
"voted": false,
"voted_champions": []
},
{
"id": "888",
"name": "\"GiantAuras\"",
"author_id": "666",
"author": "Astaire</span></strong></span></a>",
"githubFolder": "https://github.com/xKurisu/\"GiantAuras\"/blob/master/GiantAuras.csproj",
"count": 90,
"countByChampion": {
"": 777,
"total": 42
},
"description": "(Stable) Aura/Buff/Debuff Tracker",
"udate": "1451971517",
"createdDays": 399,
"image": "https://cdn.joduska.me/forum/uploads/assemblydb/image-default.jpg",
"strudate": "2016-07-22 19:40",
"champions": null,
"forum_link": "165574",
"assembly_compiles": true,
"voted": false,
"voted_champions": []
}
]
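The same point can be sketched in Python, assuming the standard json and re modules. The input is a two-record abbreviation of file foo above (only name and githubFolder kept), and folder_of is a hypothetical helper. The comparison runs the quoted-string regex from the earlier answer against compact JSON to show where the escaped quotes trip it up.

```python
import json
import re

# Abbreviated version of file foo: two records, two fields each.
records_json = r"""[
  {"name": "TinyAuras",
   "githubFolder": "https://github.com/xKurisu/TinyAuras/blob/master/TinyAuras.csproj"},
  {"name": "\"GiantAuras\"",
   "githubFolder": "https://github.com/xKurisu/\"GiantAuras\"/blob/master/GiantAuras.csproj"}
]"""

def folder_of(url):
    # Keep everything up to and including the last slash.
    return url[: url.rfind("/") + 1]

records = json.loads(records_json)
folders = [folder_of(r["githubFolder"]) for r in records]

# Re-serialize without spaces, which is the layout the regex expects.
compact = json.dumps(records, separators=(",", ":"))
regex_folders = re.findall(r'"githubFolder":"([^"]*/)[^"/]*"', compact)

print(folders)
print(regex_folders)
```

For the second record the parser returns the full folder, while the regex stops at the first escaped quote and yields only https://github.com/xKurisu/.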
Here is the regexp:
("githubFolder":".*)\/(.*\.csproj)
1. "githubFolder":"https://github.com/removed/removed/blob/master/stophere/this.csproj
1.1. Group: "githubFolder":"https://github.com/removed/removed/blob/master/stophere
1.2. Group: this.csproj
you can test it here: http://www.regexe.com
This pattern: (http|https):\/\/github\.com\/[\w\/]+\/ selects all directories that start with github.com in your example.
Try this RegEx:
githubFolder":"([a-zA-Z:\/.]+\/)
It will capture the link up to the last slash.

Replace specific part of a string

I'm currently in a pickle while modifying a document. Let's say, for example, I have this chunk of text:
"id": "EFM",
"type": "Casual",
"hasBeenAssigned": false,
"hasRandomAssigned": false
},
I currently have roughly 73 to 80 occurrences of:
"id" : "somethingdifferent",
Using a regular expression in Notepad++, how can I select the entire string:
"id" : "",
but only change the contents between the second set of quotes?
Edit
An oversight made me leave this information out:
"equipedOutfit": {
"id": "MkIV",
"type": "Outfit",
"hasBeenAssigned": false,
"hasRandonAssigned": false
},
"equipedWeapon": {
"id": "EFM",
"type": "Casual",
"hasBeenAssigned": false,
"hasRandonAssigned": false
},
The text I'm looking for is:
"id" : "EFM",
You can use a regex like this:
("id": ").*?"
With a replacement string:
$1whatever"
^^^^^^^^--- replace 'whatever' with whatever you want
Working demo
Update: as you updated your question, I'm updating the answer. If you only want to replace "id": "EFM", then you just have to search for that exact text and put in the replacement string you want.
"id":\s*"\K[^"]*
You can use \K here and replace with whatever you want. See demo.
https://regex101.com/r/sS2dM8/29
EDIT:
If you want only EFM then use
"id"\s*:\s*"\KEFM(?=")
Find what: ("id"\s?:\s?").*(")
Replace with: \1somethingdifferent\2
Options:
Regular expression, Wrap around
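Outside Notepad++, the capture-group variant of these answers can be sketched in Python with re.sub. Python's built-in re module does not support \K, so the sketch keeps the prefix and the closing quote in groups instead. The replacement value NEW and the variable names are placeholders, not taken from the answers.

```python
import re

# Excerpt from the question's edit.
text = """"equipedOutfit": {
    "id": "MkIV",
    "type": "Outfit",
    "hasBeenAssigned": false,
    "hasRandonAssigned": false
},
"equipedWeapon": {
    "id": "EFM",
    "type": "Casual",
    "hasBeenAssigned": false,
    "hasRandonAssigned": false
},"""

# Replace the value of every "id" field; groups 1 and 2 keep the
# surrounding key and quotes unchanged.
any_id = re.sub(r'("id"\s*:\s*")[^"]*(")', r"\g<1>NEW\g<2>", text)

# Replace only the id whose value is exactly EFM.
only_efm = re.sub(r'("id"\s*:\s*")EFM(")', r"\g<1>NEW\g<2>", text)

print(any_id)
print(only_efm)
```

Using [^"]* rather than .* keeps the match from running past the closing quote when several quoted values share a line.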