Regex string with optional suffix in Reqtify - regex

I am trying to filter the following lines in Reqtify:
Li.success tc_BT_Cancel_From_Pause_State_a4_2016-01-14_16h40m16s.log
Li.success tc_BT_Cancel_From_Pause_State_a5_2016-01-14_16h40m23s.log
Li.success tc_BT_Cancel_Init_BtControlStop_2016-01-14_16h40m23s.log
With a first regex ^Li.\w+\stc_(\w+)_20 I manage to extract
BT_Cancel_From_Pause_State_a4
BT_Cancel_From_Pause_State_a5
BT_Cancel_Init_BtControlStop
But my goal is to strip off the _a* suffix.
I already tried an additional expression, (.+)(_a\d)?, but the result is unchanged.
Same for (.+)(_a\d|).
Does anyone have an idea how to strip this optional part off?
The final list should be:
BT_Cancel_From_Pause_State
BT_Cancel_From_Pause_State
BT_Cancel_Init_BtControlStop
Thanks,
Chris

This should do the trick :
^Li.\w+\stc_(\w+?)(_a\d)?_20
Group 1 is still the name you want, now without the suffix because \w+? is lazy; group 2 holds the optional _aX.
If you don't want to capture the second part, you can make that group non-capturing:
^Li.\w+\stc_(\w+?)(?:_a\d)?_20
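To see why the lazy quantifier matters, here is a minimal Python sketch. Using Python's re module in place of Reqtify's regex engine is an assumption, and the helper name extract_names is hypothetical. With a greedy \w+ the first group swallows the _aX suffix before the optional group gets a chance to match; with \w+? it stops as early as possible and leaves the suffix to the optional group.

```python
import re

LINES = [
    "Li.success tc_BT_Cancel_From_Pause_State_a4_2016-01-14_16h40m16s.log",
    "Li.success tc_BT_Cancel_From_Pause_State_a5_2016-01-14_16h40m23s.log",
    "Li.success tc_BT_Cancel_Init_BtControlStop_2016-01-14_16h40m23s.log",
]

# Greedy first group: \w+ also matches "_a4", so the optional group stays empty.
GREEDY = re.compile(r"^Li.\w+\stc_(\w+)(?:_a\d)?_20")
# Lazy first group: \w+? grows one character at a time until the rest matches.
LAZY = re.compile(r"^Li.\w+\stc_(\w+?)(?:_a\d)?_20")


def extract_names(pattern, lines):
    """Return group 1 for every line the pattern matches (hypothetical helper)."""
    names = []
    for line in lines:
        match = pattern.match(line)
        if match:
            names.append(match.group(1))
    return names


greedy_names = extract_names(GREEDY, LINES)
lazy_names = extract_names(LAZY, LINES)
```

Here lazy_names holds the three names without the suffix, while greedy_names still ends in _a4 and _a5 for the first two lines.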

I propose this:
^Li.\w+\stc_(\w+?)(?:_a\d)?_20

Related

Regex to remove everything after -i- (with -i-)

I am trying to find a solution to my problem.
Input: prd-abcd-efgh-i-0dflnk55f5d45df
Output: prd-abcd-efgh
Splunk query I tried: index=aws-* (host=prd-abcd-efgh*) | rex field=host "^(?<host>[^.]+)"| dedup host | stats count by host,methodPath
I want to remove everything that comes after "-i-" (including the "-i-" itself) using a simple regex. I tried the regex "^(?<host>[^.]+)" listed here:
https://answers.splunk.com/answers/77101/extracting-selected-hosts-with-regex-regex-hosts-with-exceptions.html
Please help me to solve it.
replace(host, "(?<=-i-).*", "")
Example here: https://regex101.com/r/blcCcQ/2
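Here (?<=-i-) is a lookbehind, so the match starts right after -i- and the -i- itself stays in the result. To remove the -i- as well, match -i-.* instead.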
I have no knowledge of Splunk, but the normal way to do that would be to match the part you don't want and replace it with an empty string.
The regex for doing that could be:
-i-.*
Then replace the match with an empty string.
Something simple like this should work:
([a-z-]+)-i-.+
The first capture group will return only the part preceding -i-.
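The approaches above can be compared outside Splunk with a short Python sketch. Python's re module standing in for Splunk's replace and rex is an assumption, and the function names are hypothetical.

```python
import re

HOST = "prd-abcd-efgh-i-0dflnk55f5d45df"


def strip_by_replace(host):
    # Match "-i-" and everything after it, then replace the match with nothing.
    return re.sub(r"-i-.*", "", host)


def strip_by_capture(host):
    # Capture the part before "-i-"; fall back to the input if nothing matches.
    match = re.match(r"([a-z-]+)-i-.+", host)
    return match.group(1) if match else host


def strip_with_lookbehind(host):
    # The lookbehind only asserts that "-i-" precedes the match, so "-i-" is kept.
    return re.sub(r"(?<=-i-).*", "", host)


replaced = strip_by_replace(HOST)
captured = strip_by_capture(HOST)
lookbehind = strip_with_lookbehind(HOST)
```

The first two give prd-abcd-efgh; the lookbehind variant leaves the trailing -i- in place.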

URL matching regex

I need some help with URL matching regex. I read the regex syntax documentation but it's so complex.
I'm trying to create a URL list for a checkout funnel, how would I set up regex for the following?
https://shop.mysite.ca/[unique ID]/checkouts/[unique ID 2]
OR
https://shop.mysite.ca/[unique ID]/checkouts/[unique ID 2]?step=contact_information
This is what I have so far, though I'm not sure how to add the optional parameter "step=contact_information":
/^(https:\/\/shop.mysite.ca\/)([\da-z]+)(\/checkouts\/)([\da-z]+)$/
You can use a "?" to make a group either appear 0 or 1 times, making it optional.
/^(https:\/\/shop.mysite.ca\/)([\da-z]+)(\/checkouts\/)([\da-z]+)(\?step=contact_information)?$/
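As a quick check of the optional group, here is a minimal Python sketch. The sample IDs are made up, the dots in the host name are escaped so that they only match a literal dot, and the helper name is_checkout_url is hypothetical.

```python
import re

# Same idea as above: the trailing (...)? makes the query string optional.
CHECKOUT_URL = re.compile(
    r"^https://shop\.mysite\.ca/([\da-z]+)/checkouts/([\da-z]+)"
    r"(\?step=contact_information)?$"
)


def is_checkout_url(url):
    """Return True if the URL is a checkout URL, with or without the step parameter."""
    return CHECKOUT_URL.match(url) is not None


plain = is_checkout_url("https://shop.mysite.ca/123abc/checkouts/456def")
with_step = is_checkout_url(
    "https://shop.mysite.ca/123abc/checkouts/456def?step=contact_information"
)
other_step = is_checkout_url(
    "https://shop.mysite.ca/123abc/checkouts/456def?step=payment"
)
```

Both funnel URLs match, while a different step value does not, because the anchor $ requires the string to end right after the optional group.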
This should work:
^(https:\/\/shop.mysite.ca\/)([\da-z]+)(\/checkouts\/)([\da-z]+)((\?step=contact_information)*)$
Edit: I forgot the ? and used * instead. The other solution by #thomas is a bit better, I think.
Try it:
^(https:\/\/shop.mysite.ca\/)(\d+.)(\/checkouts)(\/)(\d+.)($|\?\w.+$)
If the unique ID can contain non-numeric characters:
^(https:\/\/shop.mysite.ca\/)([\da-zA-Z]+.)(\/checkouts)(\/)([\da-zA-Z]+.)($|\?\w.+$)

Optional question mark - regexp

I have a problem creating the correct regular expression.
Here is what I have so far:
https://regex101.com/r/d0epRo/2
I need to add one more parameter to these links, so I have to determine whether there is a question mark or not. Therefore ? should be optional, but I can't get it to work.
These do not work: (\?|) (\?)? (\??).
http://www.polskieszlaki.pl and http://www.polskieszlaki.pl/wawel.htm should be matched but aren't.
I have no further ideas. Help please.
I think what you want is this regex:
a[\s]+href="[^mailto][\S]+polskieszlaki\.pl(?:(.*))?(?:(\?)(.*))?\"
This (?: ... ) means "do not capture"
If you are just trying to retrieve the query parameters try:
a[\s]+href="[^mailto][\S]+polskieszlaki\.pl(.*)(?:\?(?<param>.*))\"
You can then extract the param group
Or, in a simpler form without the named and non-capturing groups:
a[\s]+href="[^mailto][\S]+polskieszlaki\.pl(.*)(\?(.*))\"
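For illustration, here is a minimal Python sketch of an optional query string. It is not the pattern from the regex101 link, which is not shown here; the sample HTML, the third link, and the helper name find_links are assumptions. Python's re module spells named groups (?P<param>...), and the character class [^"?]* is used instead of .* so that the part before the query cannot swallow the question mark.

```python
import re

HTML = (
    '<a href="http://www.polskieszlaki.pl">a</a> '
    '<a href="http://www.polskieszlaki.pl/wawel.htm">b</a> '
    '<a href="http://www.polskieszlaki.pl/foto.htm?id=5">c</a>'
)

# Group 1: the URL up to (not including) any "?".
# Group "param": the query string, only present when a "?" follows.
LINK = re.compile(r'href="([^"?]*polskieszlaki\.pl[^"?]*)(?:\?(?P<param>[^"]*))?"')


def find_links(html):
    """Return (url_without_query, query_or_None) for each matching link."""
    return [(m.group(1), m.group("param")) for m in LINK.finditer(html)]


links = find_links(HTML)
```

All three links match; the param group is None for the first two and id=5 for the third, so the caller can decide whether to append ?name=value or &name=value.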

Need regex to strip away remaining part of a path

I am trying to write a regex which will strip away the rest of the path after a particular folder name.
If Input is:
/Repository/Framework/PITA/branches/ChangePack-6a7B6/core/src/Pita.x86.Interfaces/IDemoReader.cs
Output should be:
/Repository/Framework/PITA/branches/ChangePack-6a7B6
Some constraints:
ChangePack- will be followed by a change pack id, which is a mix of digits and letters (a-z or A-Z only) in any order. There is no limit on the length of the change pack id.
ChangePack- is a constant. It will always be there.
And the text before the ChangePack can also change. Like it can also be:
/Repository/Demo1/Demo2/4.3//PITA/branches/ChangePack-6a7B6/core/src/Pita.x86.Interfaces
My regex-fu is bad. What I have come up with so far is:
^(.*?)\-6a7B6
I need to make this generic.
Any help will be much appreciated.
The regex below does the trick:
^(.*?ChangePack-[\w]+)
Input:
/Repository/Framework/PITA/branches/ChangePack-6a7B6/core/src/Pita.x86.Interfaces/IDemoReader.cs
/Repository/Demo1/Demo2/4.3//PITA/branches/ChangePack-6a7B6/core/src/Pita.x86.Interfaces
Output:
/Repository/Framework/PITA/branches/ChangePack-6a7B6
/Repository/Demo1/Demo2/4.3//PITA/branches/ChangePack-6a7B6
Check out the live regex demo here.
^(.*?ChangePack-[a-zA-Z0-9]+)
Try this. Instead of replacing, grab the match as $1 or \1. See the demo:
https://regex101.com/r/iY3eK8/17
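A minimal Python sketch of grabbing group 1, using the two sample paths from the question. The function name trim_to_change_pack is hypothetical, and returning None when ChangePack- is missing is an assumption about the desired behaviour.

```python
import re

# Lazy .*? stops at the first "ChangePack-"; the id is letters and digits only.
CHANGE_PACK = re.compile(r"^(.*?ChangePack-[a-zA-Z0-9]+)")


def trim_to_change_pack(path):
    """Return the path up to and including the ChangePack folder, or None."""
    match = CHANGE_PACK.match(path)
    return match.group(1) if match else None


first = trim_to_change_pack(
    "/Repository/Framework/PITA/branches/ChangePack-6a7B6/core/src/"
    "Pita.x86.Interfaces/IDemoReader.cs"
)
second = trim_to_change_pack(
    "/Repository/Demo1/Demo2/4.3//PITA/branches/ChangePack-6a7B6/core/src/"
    "Pita.x86.Interfaces"
)
```

The id match stops at the next / because a slash is not in [a-zA-Z0-9].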
Will you always have '/Repository/Framework/PITA/branches/' at the beginning? If so, this will do the trick:
/Repository/Framework/PITA/branches/\w+-\w*
Instead of a regex you could use the split and join functions. Example in Python:
path = "/a/b/c/d/e"
folders = path.split("/")
newpath = "/".join(folders[:3])  # keeps the leading empty string plus the first two folders
print(newpath) #prints "/a/b"
If you really want regex, try something like ^.*\/folder\/ where folder is the name of the directory you want to match.

Remove entire querystring using regex

I have the below regex, but how can I remove the querystring entirely if it is present:
^~/(.*)/restaurant/(.*)
eg. the url
/seattle/restaurant/sushi?page=2
or
/seattle/restaurant/sushi?somethingelse=something
or
/seattle/restaurant/sushi
should just return seattle and restaurant and sushi and remove any querystring if it is present.
(sorry for reposting a similar question, but I couldn't get the answer to work in my previous question).
thanks
Thomas
This regex:
(/[^?]+).*
Should match the initial section of your URL and put it in a group.
So it will match /seattle/restaurant/sushi and put the value in a group.
You can use something like this: (/.*?/restaurant[^?]+).* if you only want to handle URLs with the word restaurant as the second segment between the slashes.
Edit: Something like this should yield 3 groups: /(.*?)/(restaurant)/([^?]+).*. Group 1 is seattle, group 2 is restaurant and group 3 is sushi. If there is a ? after the last /, the regex discards the ? and everything that follows.
You should change your final (.*) to match "anything but a question mark", like this:
^~/(.*)/restaurant/([^?]*)
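A minimal Python sketch of the [^?] idea, run against the three URLs from the question. The leading ~ of the original pattern is dropped here on the assumption that the URLs are tested as plain paths, and the function name split_restaurant_url is hypothetical.

```python
import re

# [^?]+ matches up to, but not including, the first "?", so the query string
# never ends up in the last group.
RESTAURANT_URL = re.compile(r"^/(.*?)/(restaurant)/([^?]+)")


def split_restaurant_url(url):
    """Return (city, 'restaurant', name) without the query string, or None."""
    match = RESTAURANT_URL.match(url)
    return match.groups() if match else None


with_page = split_restaurant_url("/seattle/restaurant/sushi?page=2")
with_other = split_restaurant_url("/seattle/restaurant/sushi?somethingelse=something")
without_query = split_restaurant_url("/seattle/restaurant/sushi")
```

All three calls return the same three groups, whether or not a query string is present.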