how to write regex to get the first level group - regex

I have a string like this
env1,env2[data1,data2],env3
I want a regex to get the first groups
env1
env2[data1,data2]
env3
then I can write another regex(pcre) to parse data1,data2.
But I don't know how to parse the first level.
#Simon
Yes.
I want to grab env2[data1,data2] first.

If you need to capture env + number + [ + everything that is not ] + ], you may use this regex:
(env\d+\[[^\]]+\])
https://regex101.com/r/fE9nF8/1
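The regex above only matches items that carry a bracketed part. To get all first-level items, the bracketed part can be made optional. The following Python snippet is a minimal sketch of that idea; it assumes brackets are never nested and every item starts with env plus a number, and the function name first_level_groups is made up for illustration.

```python
import re

# Assumption: items look like envN or envN[...], and brackets are not nested.
FIRST_LEVEL = re.compile(r"env\d+(?:\[[^\]]*\])?")

def first_level_groups(text):
    """Return every first-level item, with its bracketed part if present."""
    # No capturing groups in the pattern, so findall returns whole matches.
    return FIRST_LEVEL.findall(text)

groups = first_level_groups("env1,env2[data1,data2],env3")
print(groups)  # ['env1', 'env2[data1,data2]', 'env3']
```

Each returned item can then be passed to a second regex that parses data1,data2.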

Related

Extract id from URL using regex including underscores and alphanumeric characters [duplicate]

I am using a data analysis package that exposes a Regex function for string parsing. I am trying to parse a response from a website that is in the format...
key1=val1&key2=val2&key3=val3 ...
[There is the possibility that the keys and values may be percent encoded, but the current return values are not, the current return values are tokens and other info that are alphanumeric].
I understand this data to be www-form-urlencoded, or alternatively it might be known as query string format.
The object is to extract the value for a given key, if the order of the keys cannot be relied upon. For example, I might know that one of the keys I should receive is "token", so what regex pattern can I use to extract the value for the key "token"? I have searched for this but cannot find anything that does what I need, but if there is a duplicate question, apologies in advance.
In Alteryx, you may use Tokenize with a regex containing a capturing group around the part you need to extract:
The Tokenize Method allows you to specify a regular expression to match on and that part of the string is parsed into separate columns (or rows). When using the Tokenize method, you want to match to the whole token, and if you have a marked group, only that part is returned.
The last sentence of that description is the key point: if the pattern contains a capturing group, only that group is returned rather than the whole match.
Thus, you may use
(?:^|[?&])token=([^&]*)
where token can be replaced with any key whose value you want to extract.
See the regex demo.
Details
(?:^|[?&]) - the start of a string, ? or & (if the string is just a plain key-value pair string, you may omit ? and use (?:^|&) or (?<![^&]))
token - the key
= - an equal sign
([^&]*) - Group 1 (this will get extracted): 0 or more chars other than & (if you do not want to extract empty values, replace * with + quantifier).
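Outside Alteryx, the same pattern can be tried in any regex engine. Here is a small Python sketch, assuming the input is a plain key=value&key=value string; the helper name extract_value and the sample data are invented for illustration.

```python
import re

def extract_value(query, key):
    """Return the value for key in a key1=val1&key2=val2 string, or None."""
    # re.escape guards against keys that contain regex metacharacters.
    pattern = r"(?:^|[?&])" + re.escape(key) + r"=([^&]*)"
    match = re.search(pattern, query)
    # Group 1 holds only the value, mirroring what Tokenize would return.
    return match.group(1) if match else None

print(extract_value("key1=val1&token=abc123&key3=val3", "token"))  # abc123
```

The (?:^|[?&]) prefix is what keeps a key such as mytoken from matching when token is requested.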

Multiple replace regex in one Apache-NiFi statement

I have a csv in following format.
id,mobile
1,02146477474
2,08585377474
3,07646474637
4,02158789566
5,04578599525
I want to add a new column containing just the leading three digits for specific prefixes, and the string NOT_VALID for all other numbers. So the result should be:
id,number,provider
1,02146477474,021
2,08585377474,085
3,07646474637,NOT_VALID
4,02158789566,021
5,04578599525,NOT_VALID
I can use the following regex for a single replacement, but I would like to handle all possible conversions in one step, using the UpdateRecord processor.
${field.value:replaceFirst('085[0-9]+','085')}
When I use something like this:
${field.value:replaceFirst('085[0-9]+','085'):or(${field.value:replaceFirst('086[0-9]+','086')})}
This replaces all values with false.
NiFi uses Java regex.
Since you are using record processing, this should work for you:
${field.value:replaceFirst('^(021|085)?.*','$1')}
The group (021|085), made optional by ?, captures 021 or 085 at the beginning of the string (^), and .* consumes the rest of the value.
The replacement $1 is the content of the first group.
PS: Sites like https://regex101.com/ help in understanding regex.
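To check the logic outside NiFi, here is a rough Python sketch of the same optional-group idea. It is not NiFi expression language, and the function name provider is made up. Since the optional group is empty for numbers with other prefixes, the NOT_VALID fallback from the question is supplied explicitly here.

```python
import re

def provider(mobile):
    """Return the leading 021 or 085 prefix, or NOT_VALID for anything else."""
    # The group is optional, so the match always succeeds;
    # group(1) is None when the number starts with neither prefix.
    match = re.match(r"^(021|085)?", mobile)
    return match.group(1) or "NOT_VALID"

for number in ["02146477474", "08585377474", "07646474637"]:
    print(number, provider(number))
```

With the sample rows from the question, this gives 021, 085 and NOT_VALID respectively.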

Simple regex to replace first part of URL

Given
http://localhost:3000/something
http://www.domainname.com/something
https://domainname.com/something
How do I select whatever is before the /something and replace it with staticpages?
The input URL is the result of a request.referer, but since you can't render request.referer (and I don't want a redirect_to), I'm trying to manually construct the appropriate template using controller/action where action is always the route, and I just need to replace the domain with the controller staticpages.
You could use a regex like this:
(https?://)(.*?)(/.*)
Working demo
As you can see in the Substitution section of the demo, you can use the capturing groups and concatenate the parts you want to generate the needed URLs.
The idea of the regex is to capture the strings before and after the domain and use \1 + staticpages + \3 as the replacement.
If you want to change the protocol to ftp, you can play with the capturing group indexes and use this replacement string:
ftp://\2\3
So, you would have:
ftp://localhost:3000/something
ftp://www.domainname.com/something
ftp://domainname.com/something
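The substitutions described above can be tried with a short Python sketch. The helper name swap_host is invented for illustration, and Python's \1 replacement syntax is assumed here; other languages may write group references differently.

```python
import re

# Group 1: protocol, group 2: host (lazy), group 3: path starting at the first slash.
URL_PATTERN = re.compile(r"(https?://)(.*?)(/.*)")

def swap_host(url, replacement=r"\1staticpages\3"):
    """Rebuild the URL from the captured groups using the given replacement."""
    return URL_PATTERN.sub(replacement, url, count=1)

print(swap_host("http://localhost:3000/something"))                    # http://staticpages/something
print(swap_host("https://domainname.com/something", r"ftp://\2\3"))    # ftp://domainname.com/something
print(swap_host("http://www.domainname.com/something", r"staticpages\3"))  # staticpages/something
```

The last call drops the protocol as well, which may be closer to the controller/action path the question is after.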

How to make a regex to replace the value of a key in a json file

I want to make a regex so I can do a "Search/Replace"
over a JSON file with many objects.
Every object has a key named "resource"
containing a URL.
Take a look at these examples:
"resource":"http://www.img/qwer/123/image.jpg"
"resource":"io.nl.info/221/elephant.gif"
"resource":"simgur.com/icon.png"
I want to make a regex to replace the whole url with
a string like this: img/filename.format.
This way, the result would be:
"resource":"img/image.jpg"
"resource":"img/elephant.gif"
"resource":"img/icon.png"
I'm just starting with regular expressions and I'm
completely lost. I was thinking that one valid idea would
be to write something starting with this pattern "resource":"
and ending with the last five characters. But I don't even know how to try
that.
How could I write the regular expression?
Thanks in advance!
Try this:
Find: "resource":\s*"[^"]+?([^\/"]+)"
Replace: "resource":"img/\1"
Using [^"]+? ensures the match won't roll off the end of the current entry and gobble up too much input, and it's reluctant (with the added ?) so the group gets the whole image file name (instead of just the last character). The replacement ends with a closing quote because the Find pattern consumes the one in the input.
Edit:
I added optional whitespace after the key, which your pastebin has.
See a live demo of this regex with your pastebin.
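For anyone who wants to check the find/replace pair outside an editor, here is a minimal Python sketch. The function name shorten_resources is made up, and the replacement includes the closing quote because the pattern consumes the one in the input.

```python
import re

# Lazy [^"]+? gives back characters until the group can hold the text after the last slash.
PATTERN = re.compile(r'"resource":\s*"[^"]+?([^/"]+)"')

def shorten_resources(text):
    """Replace each resource URL with img/<file name>."""
    return PATTERN.sub(r'"resource":"img/\1"', text)

sample = '"resource":"http://www.img/qwer/123/image.jpg"'
print(shorten_resources(sample))  # "resource":"img/image.jpg"
```

Any whitespace between the colon and the value is dropped by this replacement; capture it in its own group if it should be preserved.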
Regex
.*\/
Debuggex Demo
This will find the text you want to replace; replace it with img/. If you want to match the whole entry, use the following regex instead:
("resource":").*\/
Debuggex Demo
Then replace with $1img/, which puts back group 1 followed by the img/ part.
Let me know if there are any questions
Note: Since you have JSON, I personally would just parse it to an object, iterate over the objects, and change each resource independently rather than looking for a magic bullet.
If your JSON is an array of objects containing a resource field, I would do it in three steps: convert to an object, find the resources and replace them, convert back to a string (optional).
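// '<your json>' is a placeholder for the JSON text (an array of objects).
var tmp = JSON.parse('<your json>');
for (var i = 0; i < tmp.length; ++i) {
  if (typeof tmp[i].resource === 'string') {
    // Drop everything up to and including the last slash, then prepend img/.
    tmp[i].resource = tmp[i].resource.replace(/^.*\//, 'img/');
  }
}
tmp = JSON.stringify(tmp);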

Regex Assistance for a url filepath

Can someone assist in creating a Regex for the following situation:
I have about 2000 records on which I need to do a search/replace for a known item in each record that looks like this:
<li>View Product Information</li>
The FILEPATH and FILE are variable, but the surrounding HTML is always the same. Can someone assist with what kind of Regex I would substitute for the "FILEPATH/FILE" part of the search?
You may match the constant parts and use grouping to put them back:
(<li>View Product Information</li>)
Then replace the match with $1your_replacement$2, where $1 is the first matching group and $2 the second (if using Python, for instance, you would call Match.group(1) and Match.group(2)).
You would have to escape \ chars if you're using Java instead.
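As a rough illustration of the two-group approach in Python: the markup in the question appears to have lost its inner tags, so the link structure below (an a element with an href inside the li) is purely an assumption, as are the function name replace_path and the sample paths.

```python
import re

# Hypothetical markup: <li><a href="FILEPATH/FILE">View Product Information</a></li>
# Group 1 and group 2 are the constant parts around the variable path.
PATTERN = re.compile(r'(<li><a href=")[^"]*(">View Product Information</a></li>)')

def replace_path(html, new_path):
    """Swap the variable path while putting both constant parts back."""
    # A function replacement avoids backslash escaping issues in new_path.
    return PATTERN.sub(lambda m: m.group(1) + new_path + m.group(2), html)

record = '<li><a href="docs/old/file.pdf">View Product Information</a></li>'
print(replace_path(record, "new/path.pdf"))
```

If the real markup differs, only the two constant groups need to change; the technique stays the same.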