Regular Expression If 2nd parameter is Enrollment - regex

I have below response
{
"id": "3452",
"enrollable_id": "3452",
"enrollable_type": "Enrollment"
}
{
"id": "3453",
"enrollable_id": "3453",
"enrollable_type": "Task"
}
{
"id": "3454",
"enrollable_id": "3454",
"enrollable_type": "Enrollment"
}
{
"id": "3455",
"enrollable_id": "3455",
"enrollable_type": "Task"
}
I would like to get id [3452 and 3454] only if enrollable_type= Enrollment. This is for jmeter regex extractor so it would be great if I can just use one liner regex to fetch 3452 and 3454.

The RegEx you are looking for is:
_id":\s*"([^"]+(?=[^\0}]+_type":\s*"E))
Explanation
_id":\s*" Finds the place where the enrollable_id is
[^"]+(?= Matches the ID if:
[^\0}]+_type":\s* Finds the place where enrollable_type is
"E Checks if the enrollable type begins with an uppercase E
) End if
( ) Captures the ID
It's important to note that this RegEx matches only the objects whose type begins with E, and that the ID itself sits in the capture group. This means you will need to read each match's capture (group 1) rather than the whole match.
Disclaimer
The above RegEx contains backslashes, which you will need to escape if using the RegEx as a string literal.
This is the RegEx with all necessary-to-escape characters escaped:
_id":\\s*"([^"]+(?=[^\\0}]+_type":\\s*"E))
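As a quick check outside JMeter, here is a minimal JavaScript sketch that runs the pattern above against the sample from the question (flattened onto one line). The variable names are only illustrative, and JMeter's own regex engine may differ in details.

```javascript
// Sample response from the question, flattened onto one line.
const response = `{ "id": "3452", "enrollable_id": "3452", "enrollable_type": "Enrollment" } { "id": "3453", "enrollable_id": "3453", "enrollable_type": "Task" } { "id": "3454", "enrollable_id": "3454", "enrollable_type": "Enrollment" } { "id": "3455", "enrollable_id": "3455", "enrollable_type": "Task" }`;

// Same pattern as above, written as a regex literal with the global flag.
const enrollmentIdPattern = /_id":\s*"([^"]+(?=[^\0}]+_type":\s*"E))/g;

// Read capture group 1 from every match, not the full match text.
const enrollmentIds = [...response.matchAll(enrollmentIdPattern)].map((m) => m[1]);

console.log(enrollmentIds); // [ '3452', '3454' ]
```

The full match still includes the leading _id": " part, which is why only group 1 is collected.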

It's usually a bad idea to parse structured data with just a regex, but if you're intent on going this route then here you go:
"(\d+)"\s*,\s*(?="enrollable_type":\s*"Enrollment")
This assumes that enrollable_type always follows enrollable_id and that everything is quoted consistently, with a little allowance for variance in white space. You should be able to handle a little more variance if necessary, such as if you're unsure whether you can depend on keys or data being quoted (["']?). However, if you can't depend on the order of the properties (such as if the type comes before the id), then you should abandon using a regex.
Here's a sample working in JavaScript
const text = `{ "id": "3452", "enrollable_id": "3452", "enrollable_type": "Enrollment" } { "id": "3453", "enrollable_id": "3453", "enrollable_type": "Task" } { "id": "3454", "enrollable_id": "3454", "enrollable_type": "Enrollment" } { "id": "3455", "enrollable_id": "3455", "enrollable_type": "Task" }`;
const re = /"(\d+)"\s*,\s*(?="enrollable_type":\s*"Enrollment")/g;
var match;
while(match = re.exec(text)) {
console.log(match[1]);
}

Your response seems to be a JSON one (however it's malformed). If this is the case and it's really JSON - I would recommend going for JSON Extractor instead as regular expressions are fragile, sensitive to markup change, new lines, order of elements, etc. while JSON Extractor looks only into the content.
The relevant JSON Path query would be something like:
$..[?(@.enrollable_type == 'Enrollment')].enrollable_id
More information: JMeter's JSON Path Extractor Plugin - Advanced Usage Scenarios
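To illustrate what such a filter selects, here is a rough JavaScript equivalent. It assumes the response has first been made valid JSON by wrapping the objects in an array with commas between them; that wrapping and the variable names are assumptions for this sketch, not part of the question.

```javascript
// Assumption: the objects are wrapped in [ ... ] and comma-separated,
// which turns the malformed response into valid JSON.
const validJson = `[
  { "id": "3452", "enrollable_id": "3452", "enrollable_type": "Enrollment" },
  { "id": "3453", "enrollable_id": "3453", "enrollable_type": "Task" },
  { "id": "3454", "enrollable_id": "3454", "enrollable_type": "Enrollment" },
  { "id": "3455", "enrollable_id": "3455", "enrollable_type": "Task" }
]`;

const records = JSON.parse(validJson);

// Keep only the Enrollment records, then project their enrollable_id.
const enrollableIds = records
  .filter((record) => record.enrollable_type === "Enrollment")
  .map((record) => record.enrollable_id);

console.log(enrollableIds); // [ '3452', '3454' ]
```

Unlike the regex approaches, this does not depend on property order or white space.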

You can extract the data in two ways:
1. Using the JSON Extractor.
To extract data with the JSON Extractor, the response data must follow JSON syntax rules.
Use the following JSON Path in the JSON Extractor:
$..[?(@.enrollable_type=="Enrollment")].id
and set Match No. to -1.
2. Using the Regular Expression Extractor, with the following regex:
id": "(.+?)",\s*(.+?)\s*"enrollable_type": "Enrollment
Template: $1$2$3$4$
Match No.: -1
You can inspect the stored variables with a Debug Sampler.
More information
extract variables

Related

JMeter json path extractor and Regular expression combination

I want to extract sys_id for the employee_number does not starting with "C"
{
"items": [{
"sys_updated_on": "2021-01-15 15:04:04",
"sys_id": "60eaa1dc47870d9132f624846d434a",
"employee_number": "C89"
}, {
"sys_updated_on": "2017-12-08 09:26:49",
"sys_id": "c57058e8db8689ca52c4be13961974",
"employee_number": "983"
}, {
"sys_updated_on": "2016-04-08 13:25:00",
"sys_id": "fd413e848716119096ca2d0ebb358e",
"employee_number": "565"
}]
}
I tried multiple JSON Extractor expressions but no luck
$.[?(@.employee_number=~'\d+')].sys_id
$.[?(@.employee_number=~'[0-9]')].sys_id
I need a combination of JSON Path and regular expression, as the JSON contains many other fields and the JSON shown here is only a small part of it.
I also want to know how to combine a regular expression with a JSON path.
If you want to match only numbers in the employee_number - try out the following:
$.items[?(@.employee_number =~ /\d+/)].sys_id
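The intent of that filter can be sketched in plain JavaScript against the sample from the question. The anchored pattern ^\d+$ is an assumption made here so that only all-digit employee numbers pass; how the extractor anchors its own regex is not shown in this thread.

```javascript
// Sample payload from the question.
const payload = {
  items: [
    { sys_updated_on: "2021-01-15 15:04:04", sys_id: "60eaa1dc47870d9132f624846d434a", employee_number: "C89" },
    { sys_updated_on: "2017-12-08 09:26:49", sys_id: "c57058e8db8689ca52c4be13961974", employee_number: "983" },
    { sys_updated_on: "2016-04-08 13:25:00", sys_id: "fd413e848716119096ca2d0ebb358e", employee_number: "565" },
  ],
};

// Assumption: "numbers only" means the whole value consists of digits.
const digitsOnly = /^\d+$/;

// Keep items whose employee_number is numeric, then project sys_id.
const numericSysIds = payload.items
  .filter((item) => digitsOnly.test(item.employee_number))
  .map((item) => item.sys_id);

console.log(numericSysIds);
// [ 'c57058e8db8689ca52c4be13961974', 'fd413e848716119096ca2d0ebb358e' ]
```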
More information:
JsonPath Operators
How to Use the JSON Extractor For Testing
With regard to what you "also want to know": one post, one question. However, I'll give you a hint:

Find string in between in kibana elastic search with regex like in splunk

In splunk, we can filter out dynamic string in between two strings.
Say for example,
<TextileType>Shirt</TextileType>
<TextileType>Trousers</TextileType>
<TextileType>Shirt</TextileType>
<TextileType>Trousers</TextileType>
<TextileType>Shirt</TextileType>
The output I am expecting:
Shirt - 3
Trousers - 2
I am able to do this in splunk, easily.
How can I achieve this in Kibana ?
Tried many ways, but not able to do any regex as per my need.
Note: Here's the example json query, in which I need to add regex. In this example, I am just trying to search for "Shirt" manually, which I am expecting to get dynamically.
{
"query": {
"match": {
"text": {
"query": "Shirt",
"type": "phrase"
}
}
}
}
Considering data is in the sample index, you can use a wildcard search:
GET /sample/_search
{
"query": {
"wildcard":{
"column2":"*Shirt*"
}
}
}
Notice how it only returns results containing keyword Shirt
If you are looking to clean the data, you'd need to run it through a logstash pipeline to strip the XML tags and leave you with the text.
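For reference, the extraction and counting the question asks for can be sketched in JavaScript as below. This is not a Kibana or Logstash configuration, only an illustration of the "text between two tags" regex and the expected counts; the variable names are made up for the example.

```javascript
// Sample lines from the question.
const rawLines = [
  "<TextileType>Shirt</TextileType>",
  "<TextileType>Trousers</TextileType>",
  "<TextileType>Shirt</TextileType>",
  "<TextileType>Trousers</TextileType>",
  "<TextileType>Shirt</TextileType>",
].join("\n");

// Capture whatever sits between the opening and closing TextileType tags.
const textileTypePattern = /<TextileType>([^<]+)<\/TextileType>/g;

// Count how often each captured value occurs.
const textileCounts = {};
for (const match of rawLines.matchAll(textileTypePattern)) {
  const value = match[1];
  textileCounts[value] = (textileCounts[value] || 0) + 1;
}

console.log(textileCounts); // { Shirt: 3, Trousers: 2 }
```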

Nifi - Extracting Key Value pairs into new fields

With Nifi I am trying to use the ReplaceText processor to extract key value pairs.
The relevant part of the JSON file is the 'RuleName':
"winlog": {
"channel": "Microsoft-Windows-Sysmon/Operational",
"event_id": 3,
"api": "wineventlog",
"process": {
"pid": 1640,
"thread": {
"id": 4452
}
},
"version": 5,
"record_id": 521564887,
"computer_name": "SERVER001",
"event_data": {
"RuleName": "Technique=Commonly Used Port,Tactic=Command and Control,MitreRef=1043"
},
"provider_guid": "{5790385F-C22A-43E0-BF4C-06F5698FFBD9}",
"opcode": "Info",
"provider_name": "Microsoft-Windows-Sysmon",
"task": "Network connection detected (rule: NetworkConnect)",
"user": {
"identifier": "S-1-5-18",
"name": "SYSTEM",
"domain": "NT AUTHORITY",
"type": "Well Known Group"
}
},
Within the ReplaceText processor I have this configuration
ReplaceText
"winlog.event_data.RuleName":"MitreRef=(.*),Technique=(.*),Tactic=(.*),Alert=(.*)"
"MitreRef":"$1","Technique":"$2","Tactic":"$3","Alert":"$4"
The first problem is that the new fields MitreRef etc. are not created.
The second thing is that the fields may appear in any order in the original JSON, e.g.
"RuleName": "Technique=Commonly Used Port,Tactic=Command and Control,MitreRef=1043"
or,
MitreRef=1043,Tactic=Command and Control,Technique=Commonly Used Port
Any ideas on how to proceed?
Welcome to StackOverflow!
As your question is quite ambiguous, I'll try to guess what you aimed for.
Replacing string value of "RuleName" with JSON representation
I assume that you want to replace the entry
"RuleName": "Technique=Commonly Used Port,Tactic=Command and Control,MitreRef=1043"
with something along the lines of
"RuleName": {
"Technique": "Commonly Used Port",
"Tactic": "Command and Control",
"MitreRef": "1043"
}
In this case you can grab basically the whole line and assume you have three groups of characters, each consisting of
A number of characters that are not the equals sign: ([^=]+)
The equals sign =
A number of characters that are not the comma sign: ([^,]+)
These groups in turn are separated by a comma: ,
Based on these assumptions you can write the following RegEx inside the Search Value property of the ReplaceText processor:
"RuleName"\s*:\s*"([^=]+)=([^,]+),([^=]+)=([^,]+),([^=]+)=([^,]+)"
With this, you grab the whole line and build a group for every important data point.
Based on the groups you may set the Replacement Value to:
"RuleName": {
"${'$1'}": "${'$2'}",
"${'$3'}": "${'$4'}",
"${'$5'}": "${'$6'}"
}
Resulting in the above mentioned JSON object.
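To try the search pattern outside NiFi, here is a small JavaScript sketch. Note that JavaScript writes back-references as $1 rather than NiFi's ${'$1'}, and the replacement is put on one line here; both are adaptations for this illustration only.

```javascript
// The single-line entry from the question.
const ruleLine = '"RuleName": "Technique=Commonly Used Port,Tactic=Command and Control,MitreRef=1043"';

// Same expression as the Search Value above: three key=value pairs.
const threePairs = /"RuleName"\s*:\s*"([^=]+)=([^,]+),([^=]+)=([^,]+),([^=]+)=([^,]+)"/;

// JavaScript back-reference syntax; NiFi would use ${'$1'} and so on.
const rewrittenLine = ruleLine.replace(
  threePairs,
  '"RuleName": { "$1": "$2", "$3": "$4", "$5": "$6" }'
);

// Wrap in braces so the result can be parsed and inspected as JSON.
const parsedRule = JSON.parse(`{ ${rewrittenLine} }`);
console.log(parsedRule.RuleName);
// { Technique: 'Commonly Used Port', Tactic: 'Command and Control', MitreRef: '1043' }
```

Because the keys are captured rather than hard-coded, the same pattern also handles the pairs appearing in a different order.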
Some remarks
The RegEx assumes that the entry is on a single line and does NOT work when it is split across multiple lines, e.g.
"RuleName":
"Technique=Commonly Used Port,Tactic=Command and Control,MitreRef=1043"
The RegEx assumes there are exactly three "items" inside the value of RuleName and does NOT work with a different number of "items".
In case your JSON file can grow larger, you may want to avoid the Entire text evaluation mode, as it loads the content into a buffer and routes the FlowFile to the failure output if the file is too large. In that case I recommend the Line-by-Line mode.
Allowing a fourth additional value
In case there might be a fourth additional value, you may adjust the RegEx in the Search Value property.
You can add (,([^=]+)=([^,]+))? to the previous expression, which roughly translates to:
( )? - match what is in the brackets zero or one times
, - match the comma character
([^=]+)=([^,]+) - followed by the group of characters as explained above
The whole RegEx will look like this:
"RuleName"\s*:\s*"([^=]+)=([^,]+),([^=]+)=([^,]+),([^=]+)=([^,]+)(,([^=]+)=([^,]+))?"
To allow the new value to be used you have to adjust the replacement value as well.
You can use the Expression Language available in most NiFi processor properties to decide whether to add another item to the JSON object or not.
${'$7':isEmpty():ifElse(
'',
${literal(', "'):append(${'$8'}):append('": '):append('"'):append(${'$9'}):append('"')}
)}
This expression will look if the seventh RegEx group exists or not and either append an empty string or the found values.
With this modification included the whole replacement value will look like the following:
"RuleName": {
"${'$1'}": "${'$2'}",
"${'$3'}": "${'$4'}",
"${'$5'}": "${'$6'}"
${'$7':isEmpty():ifElse(
'',
${literal(', "'):append(${'$8'}):append('": '):append('"'):append(${'$9'}):append('"')}
)}
}
Regarding multiple occurrences
The ReplaceText processor replaces every occurrence where the RegEx matches. With the settings from the previous section, the following example input
{
"event_data": {
"RuleName": "Technique=Commonly Used Port,Tactic=Command and Control,MitreRef=1043,Foo=Bar"
},
"RuleName": "Technique=Commonly Used Port,Tactic=Command and Control,MitreRef=1043"
}
will result in the following:
{
"event_data": {
"RuleName": {
"Technique": "Commonly Used Port",
"Tactic": "Command and Control",
"MitreRef": "1043",
"Foo": "Bar"
}
},
"RuleName": {
"Technique": "Commonly Used Port",
"Tactic": "Command and Control",
"MitreRef": "1043"
}
}
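The optional fourth pair and the multiple-occurrence behaviour can be checked with a JavaScript sketch like the one below. The replacer function stands in for the NiFi Expression Language ifElse shown earlier; it is an analogue written for this illustration, not NiFi syntax, and expandRuleName is a made-up helper name.

```javascript
// Search pattern with the optional fourth pair; "g" replaces every occurrence.
const ruleNamePattern =
  /"RuleName"\s*:\s*"([^=]+)=([^,]+),([^=]+)=([^,]+),([^=]+)=([^,]+)(,([^=]+)=([^,]+))?"/g;

// Hypothetical helper: rewrites each RuleName string into a JSON object.
function expandRuleName(text) {
  return text.replace(ruleNamePattern, (...groups) => {
    const [, k1, v1, k2, v2, k3, v3, fourthPair, k4, v4] = groups;
    // Group 7 is undefined when there is no fourth pair.
    const extra = fourthPair ? `, "${k4}": "${v4}"` : "";
    return `"RuleName": { "${k1}": "${v1}", "${k2}": "${v2}", "${k3}": "${v3}"${extra} }`;
  });
}

// Example input from above, with one four-pair and one three-pair entry.
const exampleInput = `{
  "event_data": {
    "RuleName": "Technique=Commonly Used Port,Tactic=Command and Control,MitreRef=1043,Foo=Bar"
  },
  "RuleName": "Technique=Commonly Used Port,Tactic=Command and Control,MitreRef=1043"
}`;

const expanded = JSON.parse(expandRuleName(exampleInput));
console.log(expanded.event_data.RuleName.Foo); // Bar
console.log(Object.keys(expanded.RuleName)); // [ 'Technique', 'Tactic', 'MitreRef' ]
```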
Example template
You may download a template I created that includes the above processor from gist.

How to extract everything between 2 characters from JSON response?

I'm using the regex in Jmeter 2.8 to extract some values from JSON responses.
The response is like that:
{
"key": "prod",
"id": "p2301d",
"objects": [{
"id": "102955",
"key": "member",
...
}],
"features":"product_features"
}
I'm trying to get everything except the text between [{....}] with one regex.
I've tried this one "key":([^\[\{.*\}\],].+?) but I'm always getting the other values between [{...}] (in this example: member)
Do you have any clue?
Thanks.
You can try the custom JSON utilities for JMeter (JSON Path Assertion, JSON Path Extractor, JSON Formatter); in this case, the JSON Path Extractor.
1. Add ATLANTBH jmeter-components to JMeter: https://github.com/ATLANTBH/jmeter-components#installation-instructions.
2. Add a JSON Path Extractor (from the Post Processors component list) as a child of the sampler that returns the JSON response you want to process.
(I used a Dummy Sampler to emulate your response; you will have your original sampler.)
3. Add as many extractors as there are values you want to extract (3 in this case: "key", "id", "features").
4. Configure each extractor: define the variable name that stores the extracted value and the JSONPath query that extracts it:
for "key": $.key
for "id": $.id
for "features": $.features
5. Later in the script you can refer to the extracted values through JMeter variables (the variable name set in the "Name" field of the JSON Path Extractor): e.g. ${jsonKey}, ${jsonID}, ${jsonFeatures}.
It may not be the most optimal way, but it works.
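What the three JSONPath queries select can be sketched in JavaScript as follows. The sample keeps only the fields shown in the question (the elided ones are left out), and the names jsonKey, jsonID and jsonFeatures are example variable names, not anything the plugin defines.

```javascript
// Response from the question, limited to the fields that are shown there.
const jsonResponse = `{
  "key": "prod",
  "id": "p2301d",
  "objects": [{ "id": "102955", "key": "member" }],
  "features": "product_features"
}`;

const doc = JSON.parse(jsonResponse);

// $.key, $.id and $.features address top-level properties only,
// so the nested "key": "member" inside "objects" is never picked up.
const extracted = {
  jsonKey: doc.key,
  jsonID: doc.id,
  jsonFeatures: doc.features,
};

console.log(extracted);
// { jsonKey: 'prod', jsonID: 'p2301d', jsonFeatures: 'product_features' }
```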
My solution was to turn the JSON into an object so that I can extract just the value that I want, and not the values in the {...}.
Here is my code:
// Named "json" to avoid shadowing the built-in JSON object
var json={"itemType":"prod","id":"p2301d","version":"10","tags":[{"itemType":"member","id":"p2301e"},{"itemType":"other","id":"prod10450"}],"multiPrice":null,"prices":null};
// Transformation into an object:
var obj = eval(json);
// Write the content of obj.itemType ("prod") into the JMeter variable "itemtype":
vars.put("itemtype", obj.itemType);
For more information: http://www.havecomputerwillcode.com/blog/?p=500.
A general solution:
Regex: (\[{\n\s*(?:\s*"\w+"\s*:\s*[^,]+,)+\n\s*}\])
Explanation: your regex fails because it does not consume the white space at the start of each line; that white space must be matched before the key can match. You also don't need to escape the { character.

regular expressions: in a JSON array, get the id of the object that has status waiting

I have a JSON response that I want to parse with regular expressions that contains an array of objects like
...
{
"Id":"01",
"Subject":"Sub",
....
"Status":"Completed"
...
},
{
"Id":"02",
"Subject":"Sub",
....
"Status":"Waiting"
...
}
and I want to get the id of the object that has status waiting.
When I parse with "Id": "(.+?)",[\s\S]+?"Subject": "Sub",[\s\S]+?"Status": "Waiting"; it matches from "Waiting" to the first "Id" (backwards); certainly I want the Id of the object that is waiting.
Try this:
{\s*"Id":"(\d+)"[^}]+"Status":"Waiting"\s*}
Try this one:
(?s)"Id":\s*"([^"]+)[^}]*?"Status":\s*"Waiting"
It will work if there is no nested { } between properties Id and Status.
If you can use a Json Parser, please use that.
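A small JavaScript sketch of this pattern follows. JavaScript has no inline (?s) prefix, and [^}] already matches newlines, so the prefix is simply dropped here. The sample keeps only the fields shown in the question.

```javascript
// Two objects from the question, limited to the fields that are shown.
const tasks = `{
  "Id":"01",
  "Subject":"Sub",
  "Status":"Completed"
},
{
  "Id":"02",
  "Subject":"Sub",
  "Status":"Waiting"
}`;

// [^}]*? cannot cross a closing brace, so Id and Status must come from
// the same (un-nested) object.
const waitingPattern = /"Id":\s*"([^"]+)[^}]*?"Status":\s*"Waiting"/g;

const waitingIds = [...tasks.matchAll(waitingPattern)].map((m) => m[1]);

console.log(waitingIds); // [ '02' ]
```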
This will work as long as there are no nested brackets.
{[^{}]*Id":"(\d+)[^{}]*\s"Status":"Waiting"
See it here on Regexr
Your expression
"Id": "(.+?)",[\s\S]+?"Subject": "Sub",[\s\S]+?"Status": "Waiting"
fails at the second [\s\S]+? (the one after "Subject": "Sub",).
That part matches anything from the first "Sub", until it finds the first "Status": "Waiting", even if that is in a later object.