How to exclude the last character from the below reg expression response - regex

I have the text response below. I created a regular expression for it, but I want to exclude the last character from the captured value.
INPUT:
href=\"admin_task_detail?rid=d652a3dd-fff0-4174-a065-5158298493af&tid=4947358c-9f5a-4174-8699-7fa12f9ac3c8&v=5ecb5743a92e7\" >T4947-C3C8<\/a>","#data-order":"T4947-C3C8"},
Reg Exp which I tried
tid=(.+?)\"
Results:
Match[1][0]=tid=4947358c-9f5a-4174-8699-7fa12f9ac3c8&v=5ecb5743a92e7\"
Match[1][1]=4947358c-9f5a-4174-8699-7fa12f9ac3c8&v=5ecb5743a92e7\
The trailing backslash needs to be excluded from the captured value.

Just escape the backslash in the regex:
tid=(.+?)\\"

Are you absolutely sure that you need this:
tid=4947358c-9f5a-4174-8699-7fa12f9ac3c8&v=5ecb5743a92e7
and not this:
tid=4947358c-9f5a-4174-8699-7fa12f9ac3c8
?
Because these tid and v are different parameters of the query string and most probably you need only tid, not v.
In the first case the relevant regular expression would be tid=(.+?)\\
In the second case you will need tid=(.+?)&
The tricky point is that the backslash is a meta character, so it needs to be escaped with an extra backslash.
In the majority of cases (like yours) it's easier and better to use the Boundary Extractor, where you don't need to deal with regular expressions at all: it's sufficient to provide left and right boundaries in plain text. Check out The Boundary Extractor vs. the Regular Expression Extractor in JMeter article for more details.
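The two expressions can be tried outside JMeter. The following is a minimal sketch in JavaScript, used here only for illustration (JMeter runs a Java-based regex engine, so details may differ). The `between` helper is a hypothetical stand-in for the idea behind the Boundary Extractor, not its actual implementation.

```javascript
// Response fragment from the question; String.raw keeps the backslashes literal.
const response = String.raw`href=\"admin_task_detail?rid=d652a3dd-fff0-4174-a065-5158298493af&tid=4947358c-9f5a-4174-8699-7fa12f9ac3c8&v=5ecb5743a92e7\" >T4947-C3C8<\/a>","#data-order":"T4947-C3C8"},`;

// Case 1: everything after "tid=" up to the first backslash (tid plus v).
const tidAndV = response.match(/tid=(.+?)\\/)[1];

// Case 2: only the tid parameter, up to the next "&".
const tidOnly = response.match(/tid=(.+?)&/)[1];

// Hypothetical helper: return the text between the first occurrence of
// `left` and the next occurrence of `right`, or null if either is missing.
function between(text, left, right) {
  const start = text.indexOf(left);
  if (start === -1) return null;
  const from = start + left.length;
  const end = text.indexOf(right, from);
  return end === -1 ? null : text.slice(from, end);
}

console.log(tidAndV);
console.log(tidOnly);
console.log(between(response, 'tid=', '&'));
```

The boundary approach needs no escaping at all, which is the main reason it is easier to get right for a case like this.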

Related

How to write a regular expression to extract data from response in JMeter

I need to extract an ID from the response of an API call in JMeter. The response looks something like this:
"sessionId":"Edhgjhsbdjkkmsdsd-dlkmsdl.Mkhsdiufhskjndjsbsbd.iusdhfjsnkdnsd"
I only need to extract the last string between . and " i.e. iusdhfjsnkdnsd.
I've tried writing this regular expression:
"sessionId":"(.+?)\.(.+?)\.(.+?)"
I've set the template value to $3$ so only the third value is picked but it doesn't work. Nothing is captured using this regex.
You're almost there, all you need to do is to escape dots with backslashes like:
"sessionId":"(.+?)\.(.+?)\.(.+?)"
because the dot is a meta-character which means "any character", and if you want JMeter's regex engine to treat it as a literal dot you need to escape it properly.
You can also consider using the Boundary Extractor: all you need to do is set the "left" and "right" boundaries, and JMeter will capture everything in between.
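The effect of escaping the dots can be checked with a small sketch. JavaScript is used here only for illustration, on the assumption that it treats these particular patterns the same way as JMeter's engine; the variable names are made up for the example.

```javascript
// Sample response fragment from the question.
const body = '"sessionId":"Edhgjhsbdjkkmsdsd-dlkmsdl.Mkhsdiufhskjndjsbsbd.iusdhfjsnkdnsd"';

// Escaped dots: each \. matches only a literal dot, so the three groups
// line up with the three dot-separated parts of the session id.
const escaped = body.match(/"sessionId":"(.+?)\.(.+?)\.(.+?)"/);
const lastPart = escaped ? escaped[3] : null;

// Unescaped dots: each . matches any character, so the lazy groups grab
// single characters and group 3 ends up holding most of the value.
const unescaped = body.match(/"sessionId":"(.+?).(.+?).(.+?)"/);
const wrongPart = unescaped ? unescaped[3] : null;

console.log(lastPart);
console.log(wrongPart);
```

Group 3 here corresponds to the $3$ template in the Regular Expression Extractor.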

JMeter correlation for values with no left or right boundaries

I want to correlate an alphanumeric value, 81fe8bfe87576c3ecb22426f8e57847382917acf, returned as the response of a POST API request. The response has no left or right boundaries. I am using ^[a-zA-Z0-9]+$ as the regular expression, which the JMeter RegExp Tester confirms is correct, but according to the logs the Regular Expression Extractor fails to extract the value and store it in a variable.
Here is my Regular Expression Extractor to extract the alphanumeric value
I have already tried all the "Field to check" options and nothing works. I am not sure why it is not working, since the regular expression ^[a-zA-Z0-9]+$ is correct; maybe it is related to the empty left and right boundaries.
Would really appreciate any resolution provided.
Your ^[a-zA-Z0-9]+$ regex contains no capturing groups, but your template, $1$, retrieves Group 1 value from the match. Since the match has no Group 1, the value is not found.
There are two solutions:
Replace your ^[a-zA-Z0-9]+$ with ^([a-zA-Z0-9]+)$ and keep on using $1$ template.
Replace $1$ with $0$ so as to access the whole match value, Group 0, rather than Group 1 (that is missing in the original regex).
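The difference between Group 0 and Group 1 can be seen in a short sketch. It uses JavaScript purely for illustration, assuming that group numbering behaves the same way in JMeter's engine for these patterns.

```javascript
// The value returned by the API in the question.
const response = '81fe8bfe87576c3ecb22426f8e57847382917acf';

// No capturing group: only index 0 (the whole match) is populated.
const withoutGroup = response.match(/^[a-zA-Z0-9]+$/);

// One capturing group: index 1 now holds the captured value.
const withGroup = response.match(/^([a-zA-Z0-9]+)$/);

console.log(withoutGroup[0]); // whole match, what the $0$ template refers to
console.log(withoutGroup[1]); // undefined: there is no Group 1 to retrieve
console.log(withGroup[1]);    // captured value, what the $1$ template refers to
```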
You need to surround your regular expression with parentheses in order to have a capture group; see the Meta Characters chapter of the JMeter User Manual for more information.
Given that you need to extract only alphanumeric characters, you can simplify your regular expression to just (\w+). Note that \w also matches the underscore.
Given that you need the full response, you can just use the Boundary Extractor and leave both boundaries blank: JMeter will store the whole response in a JMeter Variable (this works for JMeter 5.2 or higher, see JMeter Bug 63775 for details).
If you need to store the whole response in a JMeter Variable and want to use the Regular Expression Extractor for this, the relevant regular expression would be (?s)(^.*)

Regex extract Json attributes name

I am looking to extract all the attribute names of a JSON string. I came up with an expression, but it doesn't work for some specific scenarios.
The expression I built is the following:
"([a-zA-Z0-9-]*)"(?::\s(?:"[a-zA-Z0-9-\s:]*")|(?:\s^null$)|(?:\s[0-9]+,))
It works fine for attributes like these:
{"dataAreaId": "cel", "CustomerAccount": "C101112", "AddressBrazilianCNPJOrCPF": "", "PartyType": "Organization"}
But it doesn't match the attributes in these:
{ "DeliveryAddressLongitude": 0,"AddressTimeZone": null,"FullPrimaryAddress": "7800 Avenue Aurtweuil Suite 28841\nBrossard QC J2Z 3P1\nCanada"}
I would really appreciate any guidance, as I am struggling with this.
Cheers
VIncent
With generated json you'd only have to match the word preceding a colon, right, while accounting for quotes? For example:
/("?)(\b\w+\b)\1:/gm
Edit:
/.../gm: g and m are flags that modify the behaviour of the expression, where g (global) means try to find all matches in the string and m (multiline) means make every line in the string anchorable by ^ and $; you actually don't need the m flag here, that was an oversight on my part.
Depending on regex flavour you'll use flags as seen above - after the second expression delimiter, as parameters for match functions, or as in-expression modifiers like (?m) (g usually has no inline form and is expressed through the matching function instead). I just find /.../flags a good shortcut to show an expression with flags.
\b is a word boundary that anchors a sequence of word characters by making sure there can't be a word character on both sides of it; if there is, the expression won't match. In this expression I just use it to make the engine fail bad strings a little bit quicker while accounting for the optional ". They aren't strictly needed for this expression when you use it only on well-formed JSON.
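A quick way to check the expression against the two samples from the question is sketched below. The function names are made up for the example, and the comparison with `JSON.parse` is an addition for illustration, not part of the answer above. `String.raw` is used so that the `\n` sequences stay as the two-character JSON escapes.

```javascript
const samples = [
  String.raw`{"dataAreaId": "cel", "CustomerAccount": "C101112", "AddressBrazilianCNPJOrCPF": "", "PartyType": "Organization"}`,
  String.raw`{ "DeliveryAddressLongitude": 0,"AddressTimeZone": null,"FullPrimaryAddress": "7800 Avenue Aurtweuil Suite 28841\nBrossard QC J2Z 3P1\nCanada"}`,
];

// Regex approach: group 2 is the word that precedes the colon.
function namesByRegex(json) {
  return Array.from(json.matchAll(/("?)(\b\w+\b)\1:/g), (m) => m[2]);
}

// Parser approach: top-level keys of the parsed object.
function namesByParser(json) {
  return Object.keys(JSON.parse(json));
}

for (const json of samples) {
  console.log(namesByRegex(json));
  console.log(namesByParser(json));
}
```

The regex also reports keys of nested objects and can be fooled by a string value that contains something like `"word":`, while `Object.keys` only lists top-level keys, so which one fits depends on the data.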

Regex to match text from multiple links

How to extract links which contain a certain word?
For e.g.:
https://www.test.com/text/1###https://www.test.com/text/word/2###https://www.test.com/text/text/word/3###https://www.test.com/3/text###https://www.test.com/word/3/text/text
How to search "word" from below regex?
((https:).*?(###))
The result should be like this
https://www.test.com/text/word/2
https://www.test.com/text/text/word/3
https://www.test.com/word/3/text/text
Let's try to build such regex. First we need to find the beginning of url:
/(https?:\/\/
We add ? after https for http urls.
Then we need to find any text except ###, so we need to add:
(?:(?!###).)*
which means - any amount of characters not starting a ### sequence.
Also we need to add word itself and previous sub-expression again, since word can be surrounded by any text:
word(?:(?!###).)*
Note that this sub-expression does not skip the last character before ###: the lookahead only rejects a position where ### itself starts, so the whole link is consumed. If you also want to state explicitly that the match must end at a separator, add a lookahead:
(?=###|$)
which means - followed by ### or end of string. The final expression will look like:
/(https?:\/\/(?:(?!###).)*word(?:(?!###).)*(?=###|$))/g
But I believe it's better to just split the text by ### and then check for the needed word with String.prototype.includes.
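As a minimal sketch, the URL prefix and the two tempered sub-expressions can be combined and compared with the split-based alternative on the sample string from the question. The variable names are made up for the example.

```javascript
const str = 'https://www.test.com/text/1###https://www.test.com/text/word/2###https://www.test.com/text/text/word/3###https://www.test.com/3/text###https://www.test.com/word/3/text/text';

// Prefix, then any characters that do not start "###", then "word",
// then again any characters that do not start "###".
const pattern = /https?:\/\/(?:(?!###).)*word(?:(?!###).)*/g;
const byRegex = str.match(pattern) || [];

// Alternative: split on the separator and keep the links containing "word".
const bySplit = str.split('###').filter((url) => url.includes('word'));

console.log(byRegex);
console.log(bySplit);
```

Both approaches return the same three links for this input; the split version is arguably easier to read and to change later.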
If the word has to be a part of the pathname, you might use filter in combination with URL and check if the parts of the pathname contain word.
let str = 'https://www.test.com/text/1###https://www.test.com/text/word/2###https://www.test.com/text/text/word/3###https://www.test.com/3/text###https://www.test.com/word/3/text/text';
let filteredUrls = str.split("###")
  .filter(s =>
    new URL(s).pathname
      .split('/')
      .includes('word')
  );
console.log(filteredUrls);
If you want to use regex only and possessive quantifiers are supported (The javascript tag has been removed) you might use:
https?://[^#w]*(?:#(?!##)|w(?!ord)|[^#w]*)++word.*?(?=###|$)
Previous answer
You are probably looking for this regular expression:
https://www.test.com/(text/)*word/\d+(/text)*
Here is how you can use it in a JavaScript context (every slash / is escaped with a backslash: \/):
var str = 'https://www.test.com/text/1###https://www.test.com/text/word/2###https://www.test.com/text/text/word/3###https://www.test.com/3/text###https://www.test.com/word/3/text/text';
var urls = str.match(/https:\/\/www\.test\.com\/(text\/)*word\/\d+(\/text)*/g);
console.log(urls);
In the array you get exactly the elements you wanted.
Update after the question was edited and the author added a comment
If you need to take only the words from your example string, then you have to use a slightly more complex regular expression:
var str = 'https://www.test.com/text/1###https://www.test.com/text/word/2###https://www.test.com/text/text/word/3###https://www.test.com/3/text###https://www.test.com/word/3/text/text';
var urls = str.match(/(?<=\/)\w+(?=\/\d+\/\w)|(?<=(\w\/\w+\/))\w+(?=\/\d)/g);
console.log(urls);
Explanation
The regular expression is /(?<=(\w\/\w+\/))\w+(?=\/\d)|(?<=\/)\w+(?=\/\d+\/\w)/g, delimited by /.../ and with the g flag, which makes the search return every occurrence of the pattern.
The regular expression has two alternatives ...|...
The first one (?<=\/)\w+(?=\/\d+\/\w) captures cases when the searched word is directly behind the slash (?<=\/) and before more words behind the number (?=\/\d+\/\w).
https://www.test.com/word/3/text/text
The second alternative (?<=(\w\/\w+\/))\w+(?=\/\d) captures cases where the word is preceded by other words following the domain (?<=(\w\/\w+\/)) (in fact two slashes separated by alphanumeric characters) and the searched word is immediately before the slash followed by the number (?=\/\d).
https://www.test.com/text/word/2
https://www.test.com/text/text/word/3
All slashes must be escaped: \/.
The construction (?<=...) means lookbehind in regular expressions and (?=...) means lookahead in regular expressions.
Note 1. At the time of writing, the above example only works in the Chrome browser, because:
(...) now lookbehind is part of the ECMAScript 2018 specification. As of this writing (late 2018), Google's Chrome browser is the only popular JavaScript implementation that supports lookbehind. So if cross-browser compatibility matters, you can't use lookbehind in JavaScript.
Note 2. Lookbehind, even where it is supported, must contain a fixed-length expression in most regular expression engines. The example above does not respect that restriction; it is still valid and works in Google Chrome's JavaScript engine, the JGsoft engine and the .NET framework RegEx classes.
Note 3. The lookbehind syntax or its poorer \K replacement are widely supported by many regular expression engines used in a large group of programming languages.
More explanation about regular expressions which I used you can find for example here.
You may first split by ### then check whether /word/ exists in each element:
var s = 'https://www.test.com/text/1###https://www.test.com/text/word/2###https://www.test.com/text/text/word/3###https://www.test.com/3/text###https://www.test.com/word/3/text/text';
var result = [];
s.split(/###/).forEach(function(el) {
  if (el.includes('/word/'))
    result.push(el);
});
// or else by using filter
// result = s.split(/###/).filter(el => el.includes('/word/'))
console.log(result);

Regular Expression - Starting and ending with, and contains specific string in the middle

I would like to generate a regex with the following condition:
The string "EVENT" is contained within a xml tag called "SHEM-HAKOVETZ".
For example, the following string should be a match:
<SHEM-HAKOVETZ>104000514813450EVENTS0001dfd0.DAT</SHEM-HAKOVETZ>
I think you want something like this ^<SHEM-HAKOVETZ>.*EVENT.*<\/SHEM-HAKOVETZ>$
Regular expression
^<SHEM-HAKOVETZ>.*EVENT.*<\/SHEM-HAKOVETZ>$
Parts of the regular expression
^ From the beginning of the line
<SHEM-HAKOVETZ> Starting tag
.* Any character - zero or more
EVENT Middle part
.* Any character - zero or more
<\/SHEM-HAKOVETZ>$ Ending tag, up to the end of the line
Here is the working regex.
If you want to match this line, you could use this regex:
<SHEM-HAKOVETZ>.*EVENTS.*(?=<\/SHEM-HAKOVETZ>)
However, I would not recommend using regex on XML-based data, because there may be problems with whitespace handling in XML (see this article for more information). I would suggest using an actual XML parser (and then applying the regex to be sure about your results).
Here is a solution to only match the "value" part ignoring the XML tags:
(?<=<SHEM-HAKOVETZ>)(?:.*EVENTS.*)(?=<\/SHEM-HAKOVETZ>)
You can check it out in action at: https://regex101.com/r/4XiRch/1
It uses lookbehind and lookahead to make sure it only matches when the tags are correct, while the match itself contains only the content, which is convenient for further processing.
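Both variants can be tried with a short sketch, shown here in JavaScript on the assumption of an engine that supports lookbehind (ES2018 or later). The second input line is a made-up counter-example, not taken from the question.

```javascript
const line = '<SHEM-HAKOVETZ>104000514813450EVENTS0001dfd0.DAT</SHEM-HAKOVETZ>';
// Hypothetical line without "EVENT", used only to show a non-match.
const other = '<SHEM-HAKOVETZ>104000514813450REPORT0001.DAT</SHEM-HAKOVETZ>';

// Whole-line check: the tag pair must contain EVENT somewhere in between.
const wholeLine = /^<SHEM-HAKOVETZ>.*EVENT.*<\/SHEM-HAKOVETZ>$/;

// Content-only match: the tags are asserted by lookarounds, not consumed.
const contentOnly = /(?<=<SHEM-HAKOVETZ>).*EVENT.*(?=<\/SHEM-HAKOVETZ>)/;

const found = line.match(contentOnly);
const content = found ? found[0] : null;

console.log(wholeLine.test(line));
console.log(wholeLine.test(other));
console.log(content);
```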