Regular Expressions - Greater than / less than conditionals allowed? - regex

I have a page that returns a list of buyables (i.e. number available) and their associated product ID:
{"buyable":1,"prodId":444123,"
I want to retrieve the product ID of the items on the page whose buyable number is greater than 1. Is it possible to do this within a regular expression?
EDIT: I have the following regular expression to grab the appropriate groups, but I can't find a good way to set up a conditional within the regular expression to filter out the non-"1" buyable items.
.*\"buyable\":([0-9]+),.*\"prodId\":([0-9]+),
or am I drastically overthinking this and I just need to use the below instead?
.*\"buyable\":([1-9]+),.*\"prodId\":([0-9]+),

I think you should use JSONPath to tackle this problem.
Anyway, if you want to use a regex, you could use the following, which matches any buyable value other than a single 1 (note that it also accepts 0):
.*\"buyable\":(?:[02-9]|[0-9]{2,}),.*\"prodId\":([0-9]+),
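A regex has no numeric comparison, so "greater than 1" has to be spelled out as digit patterns, or the comparison can be done in code after capturing the number. The sketch below shows both in Python. The sample string and the names `PAIR`, `GT_ONE`, and `prod_ids_greater_than` are made up for illustration and are not from the original page.

```python
import re

# Hypothetical sample data in the same shape as the page in the question.
text = (
    '{"buyable":1,"prodId":444123},'
    '{"buyable":2,"prodId":444124},'
    '{"buyable":10,"prodId":444125},'
    '{"buyable":0,"prodId":444126}'
)

# Option 1: capture both numbers and compare in code.
PAIR = re.compile(r'"buyable":(\d+),"prodId":(\d+)')

def prod_ids_greater_than(source, threshold=1):
    """Return product IDs whose buyable count exceeds the threshold."""
    return [int(pid) for n, pid in PAIR.findall(source) if int(n) > threshold]

# Option 2: encode "greater than 1" as digit patterns:
# a single digit 2-9, or two or more digits without a leading zero.
GT_ONE = re.compile(r'"buyable":(?:[2-9]|[1-9][0-9]+),"prodId":(\d+)')

ids_by_code = prod_ids_greater_than(text)
ids_by_regex = [int(pid) for pid in GT_ONE.findall(text)]
```

Comparing in code tends to be easier to adjust when the threshold changes, since the digit-pattern version has to be rewritten for every new bound.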

Related

repeated, arbitrary capture groups

Given a string, eg.:
static_string.name__john.id__6.foo__bar.final_string
but with an arbitrary number of label__value. components, how can I repeat the capture groups, split them into label & value, and also capture the terminating final_string ?
For the above I'd want [name, john, id, 6, foo, bar, final_string]
Is something like this possible when I don't know the number of label__value. components in advance?
This is for golang / RE2 if that matters.
Update: I don't have the luxury of doing this in a few lines of code, and would need to do this in a single regex. The regex is defined in a config file to an application I don't control, so a code based loop with conditionals etc is unfortunately not possible.
This depends entirely on what the thing you are putting the regex into expects.
This answer focuses on getting you the capture groups in a basic way, while trying to avoid any issues with RE2 and with the "thing" you are putting the regex into.
Note: you might find that final_string doesn't get the capture group index you expect with this method, but again, that depends on what you are putting the regex into.
A regular expression that matches zero or one key/value pair is:
^[^.]+(?:\.([^.]+?)__([^.]+))?(?:\.([^.]+))$
It matches the following examples:
static_string.final_string
static_string.name__john.final_string
To support one more key/value pair we repeat part of the regular expression:
Part repeated:
(?:\.([^.]+?)__([^.]+))?
So to support 2 key value pairs the regular expression is:
^[^.]+(?:\.([^.]+?)__([^.]+))?(?:\.([^.]+?)__([^.]+))?(?:\.([^.]+))$
This now supports the following additional example:
static_string.name__john.foo__bar.final_string
So if I expand that out to support 12 key value pairs the regular expression is:
^[^.]+(?:\.([^.]+?)__([^.]+))?(?:\.([^.]+?)__([^.]+))?(?:\.([^.]+?)__([^.]+))?(?:\.([^.]+?)__([^.]+))?(?:\.([^.]+?)__([^.]+))?(?:\.([^.]+?)__([^.]+))?(?:\.([^.]+?)__([^.]+))?(?:\.([^.]+?)__([^.]+))?(?:\.([^.]+?)__([^.]+))?(?:\.([^.]+?)__([^.]+))?(?:\.([^.]+?)__([^.]+))?(?:\.([^.]+?)__([^.]+))?(?:\.([^.]+))$
This supports the following additional examples:
static_string.name__john.id__6.foo__bar.final_string
static_string.name2_1b__john.id__6.foo__bar.final_string
static_string.name__john.id__6.foo__bar.name__john.id__6.foo__bar.name__john.id__6.foo__bar.name__john.id__6.foo__bar.final_string
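As a rough illustration of the repetition trick, the sketch below builds the same pattern programmatically and shows where the captured values end up. It uses Python's `re` module rather than Go/RE2, so it only demonstrates the shape of the pattern; the helper names `build_pattern` and `extract` are invented for this example.

```python
import re

# One optional key/value component, and the mandatory trailing component.
PAIR = r"(?:\.([^.]+?)__([^.]+))?"
TAIL = r"(?:\.([^.]+))$"

def build_pattern(max_pairs):
    """Repeat the optional pair group max_pairs times, as in the answer."""
    return re.compile(r"^[^.]+" + PAIR * max_pairs + TAIL)

def extract(text, max_pairs=12):
    """Return the non-empty captures in order, or None when nothing matches."""
    match = build_pattern(max_pairs).match(text)
    if match is None:
        return None
    # Unused optional groups are None in Python; drop them.
    return [g for g in match.groups() if g is not None]

result = extract("static_string.name__john.id__6.foo__bar.final_string")
group_count = build_pattern(12).groups
```

With 12 repetitions the pattern has 25 capture groups, and final_string always lands in the last one regardless of how many pairs were present. That is the index caveat mentioned in the note above.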

APIGEE - Regular Expression Not Working in Condition

I am trying to use a condition to catch the case in which the query string of a request contains two or more parameters from a specific list. In such a case I wish to raise an error.
Of course, I can use many "and" and "or" clauses, but that will get very messy very quickly as the size of the list of parameters increases. So instead, I opted to use a regex to test for this.
As an example, if the list of parameters is [Bird,Dog,Horse], then any request that has two or more of these parameters in its query string should be matched.
The regular expression I am using is:
/(.(Bird|Dog|Horse).){2}
I tested in various regex testers and it works.
However, when I put the condition:
request.querystring Matches "/(.(Bird|Dog|Horse).){2}"
I never get a match.
Am I missing some specific APIGEE regex rules? Maybe the "{2}" is not supported in APIGEE? Thank you very much!!
Adam
The problem was that I used "Matches" instead of "JavaRegex".
I had tried "JavaRegex" before, but it also didn't work. The second problem was the "/" at the beginning, which is not needed if you use "JavaRegex".
https://community.apigee.com/questions/65080/regular-expression-not-working-in-condition.html?childToView=65113#answer-65113
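For comparison, outside of Apigee the same rule ("two or more parameters from the list") can be checked without a regex by parsing the query string. This is only a sketch of the rule itself in Python, not Apigee configuration; `RESTRICTED` and `has_conflicting_params` are hypothetical names.

```python
from urllib.parse import parse_qs

# Parameters of which at most one may appear in a request.
RESTRICTED = {"Bird", "Dog", "Horse"}

def has_conflicting_params(query_string, restricted=RESTRICTED, limit=2):
    """Return True when at least `limit` restricted parameters are present."""
    # keep_blank_values=True so that "Bird=" still counts as present.
    present = set(parse_qs(query_string, keep_blank_values=True)) & set(restricted)
    return len(present) >= limit
```

Counting distinct parameter names this way also avoids false positives from values that merely contain one of the words.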

Custom sort with regular expressions?

I have two columns: the first with a list of names and the second with their rating. I want to use a custom sort, for which I use the following (found it here):
=sort(A33:B50;match(B33:B50;{"Great";"Good";"OK";"Bad"};0);true)
Which works, but my ratings are actually:
Great+
Great
Great-
Good+
Good
Good-
OK+
...
Is there any way where I can combine the formula above with regular expressions? Something along the lines of this:
=sort(A33:B50;match(B33:B50;{"Great*";"Good*";"OK*";"Bad*"};0);true)
Which doesn't really do anything. Checked out the regex formulas of Google sheets, but couldn't find any that would do the trick in this situation.
Cheers!
PS: A workaround would be
=sort(A33:B50;match(B33:B50;{"Great+";"Great";"Great-";"Good+";"Good";"Good-";"OK+";"OK";"OK-";"Bad+";"Bad";"Bad-"};0);true)
but I'm curious whether there's a less tedious way of doing this.
=sort(A1:B7;match(regexextract(B1:B7;"Great|Good|OK|Bad");{"Great";"Good";"OK";"Bad"};0);true)
The pipe | means OR in regex.
Change A1:B7 and B1:B7 to your ranges.
Edit
To sort Good+, Good, Good- separately, change the regex to "Great|Good\+|Good\-|Good|OK|Bad" and change the array to {"Great";"Good+";"Good";"Good-";"OK";"Bad"}.
This is counter-intuitive: the order in the regexextract is Good+|Good-|Good,
while the order in the array is {"Great";"Good+";"Good";"Good-";"OK";"Bad"}. Plain Good has to come last in the regex because otherwise it would already capture the Good- instances.
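The ordering point can be seen in a small Python sketch: with alternation, the first alternative that matches at a position wins, so the longer variants have to be listed before plain Good. The rows and the names `ORDER`, `PATTERN`, and `rank` are invented for illustration; this mirrors the Sheets formula rather than reproducing it.

```python
import re

# Desired sort order, as in the array of the formula.
ORDER = ["Great", "Good+", "Good", "Good-", "OK", "Bad"]

# Longer alternatives first, so "Good-" is not cut short to "Good".
PATTERN = re.compile(r"Great|Good\+|Good-|Good|OK|Bad")

def rank(rating):
    """Position of the rating's matched label in ORDER."""
    return ORDER.index(PATTERN.search(rating).group(0))

# Hypothetical name/rating rows.
rows = [("Ann", "OK+"), ("Bob", "Good-"), ("Cy", "Great"), ("Di", "Good+"), ("Ed", "Good")]
sorted_rows = sorted(rows, key=lambda row: rank(row[1]))

# What goes wrong when plain "Good" is listed first:
too_short = re.search(r"Good|Good-", "Good-").group(0)
```

Here `too_short` ends up as "Good", which is why the regex lists Good last even though the array lists it in the middle.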

How to extract multiple values with a regular expression in Jmeter

I am running tests with jmeter and I need to extract with a Regular Expression:
insertar?sIws2kyXGJJA_01==
insertar?sIws2kyXGJJA_02==
in the following String:
[\"EMBPAGE1_00010001\",\"**insertar?sIws2kyXGJJA_01==**\",1,100,\"%\",300,\"px\",0,\"center\",\"\",\"[\"EMBPAGE1_00010002\",\"**insertar?sIws2kyXGJJA_02==**\",1,100,\"%\",300,\"px\",0,\"center\",\"\",\"
Use a negative Match No. (for example -1), which extracts all matches.
UPD: g2 appears in my example because I extract two groups from each match.
In each match, g1 is the "uuid" and g2 is the second part, and I need the second part here.
That's why I use the $2$ template and g2. If each of your matches has only one group, you will most likely use the $1$ template, which places all matches into g1.
If you have one match group, you don't actually need the _gN suffix at all.
To better understand the variables created by the group extraction, add a "Debug PostProcessor" and inspect the output in the results tree view.
It is nice to know that control elements like "For each" understand these groups, can work with a prefix like regexUUID_, and walk through the matches. In most cases that is the next thing you do after extraction.
UPD2: a primitive version of the regexp for the question is (insertar\?sIws2kyXGJJA_\d*)==([^[]*)
With the template $1$$2$
you will have the first parts in the g1 group and the second parts in g2.
In the answer given by DMC, you need to add the regular expression extractor TWICE, with different Match No. values (1, 2), to retrieve both values. That is also correct, but here is a better approach to achieve the same.
Another Approach:
1. Capture Both Values:
You can use Template to capture both values at the same time and refer to them later by index.
Here, both values are captured using two groups and stored through the templates $1$ (the first group match) and $2$ (the second group match). By default, templates store the data in the order of the groups specified in the regular expression. (FYI, you can also change the order by reordering the templates, e.g. $2$ and then $1$.)
2. Retrieve Values:
Now, refer these values in your script by using the following syntax:
${insert_values_gn} (n refers to the group number)
e.g.:
${insert_values_g1} - refers to the value captured by the first group
${insert_values_g2} - refers to the value captured by the second group
To keep it simple, you can think of "insert_values" as a list of strings captured using multiple groups, with 'n' (1, 2, 3, etc.) as the index used to retrieve each value.
Note: using templates, any number of values can be retrieved with multiple groups and referred to by index, using a single regular expression extractor.
I'm sure there is a more efficient way but this worked:
\*\*(.*?)\*\*.*\"\*\*(.*?)\*\*
You can also use only \*\*(.*?)\*\*
It will match both of them anyway, so make sure you set the right 'Match No.' in JMeter if you only need one of the values:
The Match No. should be 1 for the first match and 2 for the second, I believe.
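To see what the shorter pattern returns for each match number, here is a small Python check run against the string from the question. It only exercises the regex itself, not JMeter; the variable names are made up, and it assumes the ** markers are literally present in the response, as that pattern does.

```python
import re

# The response string from the question, kept verbatim (raw string, so the
# backslashes stay as they are).
response = (
    r'[\"EMBPAGE1_00010001\",\"**insertar?sIws2kyXGJJA_01==**\",1,100,\"%\",300,'
    r'\"px\",0,\"center\",\"\",\"[\"EMBPAGE1_00010002\",'
    r'\"**insertar?sIws2kyXGJJA_02==**\",1,100,\"%\",300,\"px\",0,\"center\",\"\",\"'
)

# Non-greedy capture of whatever sits between a pair of ** markers.
PATTERN = re.compile(r"\*\*(.*?)\*\*")

matches = PATTERN.findall(response)
first_match = matches[0]   # corresponds to match number 1
second_match = matches[1]  # corresponds to match number 2
```

The non-greedy `.*?` is what keeps each capture from running on to the later pair of markers.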

Find the simplest regex query to match a set of examples

The online service Kimono provides a GUI for a user to select
page elements and then uses the selected elements to create a regex which will match those selections. This regex can then be used to extract information from the same page at different points in time. The service is useful because you don't have to write the regex yourself; instead you provide a set of example matches, which are then compiled into a regex. The company was acquired, so the service is no longer available.
However, the problem itself seems interesting, so my question is this: what algorithm is capable of turning a number of examples (both positive and negative are needed) from a large document into a regex which, when applied, will match those examples?
Regular expressions are typically implemented with NFAs and DFAs.
https://en.wikipedia.org/wiki/Nondeterministic_finite_automaton
https://en.wikipedia.org/wiki/Deterministic_finite_automaton
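The process of finding the smallest DFA that is equivalent to a particular DFA is known as minimization.
https://en.wikipedia.org/wiki/DFA_minimization
The minimized DFA then needs to be converted back into a regular expression.
https://cs.stackexchange.com/questions/2016/how-to-convert-finite-automata-to-regular-expressions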
https://cs.stackexchange.com/questions/2016/how-to-convert-finite-automata-to-regular-expressions
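A minimal sketch of the minimization step, under strong simplifying assumptions: the DFA is just a prefix tree built from the positive examples (so it accepts exactly those strings and does not generalize or use negative examples), and since that automaton is acyclic, equivalent states can be merged by comparing their outgoing behaviour bottom-up. The function names are invented for this illustration, and converting the result back to a regex is left out.

```python
def build_trie(words):
    """Build a prefix-tree DFA that accepts exactly the given words."""
    transitions = {0: {}}  # state -> {symbol: next state}
    accepting = set()
    for word in words:
        state = 0
        for symbol in word:
            if symbol not in transitions[state]:
                new_state = len(transitions)
                transitions[state][symbol] = new_state
                transitions[new_state] = {}
            state = transitions[state][symbol]
        accepting.add(state)
    return transitions, accepting


def minimize_acyclic(transitions, accepting, start=0):
    """Merge equivalent states of an acyclic DFA.

    Returns a map from each state to its representative and the number
    of states that remain after merging.
    """
    canonical = {}       # signature -> representative state
    representative = {}  # state -> representative state

    def visit(state):
        # Two states are equivalent when they agree on acceptance and
        # their outgoing symbols lead to equivalent states.
        signature = (
            state in accepting,
            tuple(sorted((sym, visit(nxt)) for sym, nxt in transitions[state].items())),
        )
        representative[state] = canonical.setdefault(signature, state)
        return representative[state]

    visit(start)
    return representative, len(canonical)


transitions, accepting = build_trie(["cat", "bat", "rat"])
representative, minimal_states = minimize_acyclic(transitions, accepting)
```

For these three words the shared suffix "at" collapses into one path, so the ten trie states reduce to four, which corresponds to a regex along the lines of [cbr]at.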