Regex to remove everything after -i- (with -i-) - regex

I am trying to find a solution to my problem.
Input: prd-abcd-efgh-i-0dflnk55f5d45df
Output: prd-abcd-efgh
Tried Splunk Query : index=aws-* (host=prd-abcd-efgh*) | rex field=host "^(?<host>[^.]+)"| dedup host | stats count by host,methodPath
I want to remove everything that comes after "-i-" (including "-i-" itself) using a simple regex. I tried the regex "^(?<host>[^.]+)" listed here:
https://answers.splunk.com/answers/77101/extracting-selected-hosts-with-regex-regex-hosts-with-exceptions.html
Please help me to solve it.

replace(host, "(?<=-i-).*", "")
Example here: https://regex101.com/r/blcCcQ/2
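Here (?<=-i-) is a lookbehind, so the match starts right after -i- and -i- itself is kept in the result. To remove -i- as well, match -i-.* instead.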

I have no knowledge of Splunk, but the normal way to do that would be to match the part you don't want and replace it with an empty string.
The regex for doing that could be:
-i-.*
Then replace the match with an empty string.

Something simple like this should work:
([a-z-]+)-i-.+
The first capture group will return only the part preceding -i-.
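As a rough comparison of the three suggestions above, here is a minimal Python sketch. It uses Python's re module rather than Splunk, so it only illustrates how the patterns behave on the sample host; the variable names are made up for this example.

```python
import re

host = "prd-abcd-efgh-i-0dflnk55f5d45df"

# Match "-i-" and everything after it, then replace the match with nothing.
trimmed = re.sub(r"-i-.*", "", host)

# The lookbehind variant removes only what follows "-i-", so "-i-" itself stays.
lookbehind = re.sub(r"(?<=-i-).*", "", host)

# The capture-group variant returns the part before "-i-" directly.
match = re.match(r"([a-z-]+)-i-.+", host)
captured = match.group(1) if match else host

print(trimmed)     # prd-abcd-efgh
print(lookbehind)  # prd-abcd-efgh-i-
print(captured)    # prd-abcd-efgh
```

Only the first and third variants produce the output asked for in the question.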

Related

splunk: Get the first three numbers from ip address

I'm trying to get the first three sets of numbers of an IP address which is in this format: 10.10.10.10
Desired value would be 10.10.10
Try this regex: ^(.+)(?=\.\d+$)
Next time, please post what you have tried along with how you plan to reach the solution.
Regex to match a valid IPv4 address:
/^(([01]?\d?\d|2[0-4]\d|25[0-5])\.){3}([01]?\d?\d|2[0-4]\d|25[0-5])$/
Regex to match the first three blocks of a valid IPv4 address (because of the ^ and $ anchors, the input must consist of exactly three blocks):
/^(([01]?\d?\d|2[0-4]\d|25[0-5])\.){2}([01]?\d?\d|2[0-4]\d|25[0-5])$/
or, if it is acceptable for the match to include the dot after the third block:
/^(([01]?\d?\d|2[0-4]\d|25[0-5])\.){3}$/
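To check how these anchored patterns behave, here is a small Python sketch. The names OCTET, FULL_IPV4, THREE_BLOCKS and is_ipv4 are hypothetical helpers for this example, and the groups are written as non-capturing, which differs slightly from the patterns above.

```python
import re

# One block of an IPv4 address: 0-199 (optionally zero-padded), 200-249, or 250-255.
OCTET = r"(?:[01]?\d?\d|2[0-4]\d|25[0-5])"

# Four blocks separated by dots, anchored at both ends.
FULL_IPV4 = re.compile(rf"^(?:{OCTET}\.){{3}}{OCTET}$")

# Exactly three blocks, anchored at both ends.
THREE_BLOCKS = re.compile(rf"^(?:{OCTET}\.){{2}}{OCTET}$")


def is_ipv4(text):
    """Return True if text is a dotted IPv4 address with blocks in 0-255."""
    return FULL_IPV4.match(text) is not None


print(is_ipv4("10.10.10.10"))                         # True
print(is_ipv4("256.1.1.1"))                           # False
print(THREE_BLOCKS.match("10.10.10") is not None)     # True
print(THREE_BLOCKS.match("10.10.10.10") is not None)  # False
```

The last line shows that the anchored three-block pattern validates a three-block string but does not extract 10.10.10 out of 10.10.10.10.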
I was able to get it this way:
rex field=IP "(?<first_three>\d+\.\d+\.\d+)\.\d+"
Another way to do it:
..| rex field=ip_addr "(?<split_ip>.+)\.[0-9]+"
Where,
ip_addr - field name
split_ip - variable under which the split IP address will be stored
Example:
Splunk Query:
| stats count | eval ip = "115.124.35.123" | rex field=ip "(?<split_ip>.+)\.[0-9]+" | table split_ip
Output:
115.124.35
The following works for me:
rex field=_raw "(?<ip_address>^\d+\.\d+\.\d+\.\d+)"|timechart count by ip_address
Use the regex below:
^(?P<result>.+(?=\.\d+))
https://regex101.com/r/bO4tY5/3
https://regex101.com/ is a super useful tool for this kind of stuff. It lets you write your regex and test it for different strings in real time.
Once you've got what you need, stick it into your Splunk search query with the rex command.
To answer your exact problem:
The regex code, where MY_FIELD_NAME_HERE is the name of the extracted field:
(?<MY_FIELD_NAME_HERE>\d+\.\d+\.\d+)\.\d+
The regex with examples from regex101:
https://regex101.com/r/qTTf4e/2
The command required for the Splunk query language, where ORIGINAL_FIELD is your original field holding 10.10.10.10 and MY_FIELD_NAME_HERE is the extracted field:
... | rex field="ORIGINAL_FIELD" "(?<MY_FIELD_NAME_HERE>\d+\.\d+\.\d+)\.\d+"
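If you want to try the pattern outside Splunk, a minimal Python sketch could look like this. Note that Python's re module spells named groups (?P<name>...) rather than (?<name>...); the function name first_three_octets and the group name first_three are assumptions for this example.

```python
import re

# Capture three dot-separated number groups, followed by a fourth that is not captured.
PATTERN = re.compile(r"(?P<first_three>\d+\.\d+\.\d+)\.\d+")


def first_three_octets(ip):
    """Return the first three blocks of a dotted address, or None if there is no match."""
    m = PATTERN.search(ip)
    return m.group("first_three") if m else None


print(first_three_octets("10.10.10.10"))     # 10.10.10
print(first_three_octets("115.124.35.123"))  # 115.124.35
```

This pattern does not validate that each block is in the range 0-255; it only splits off the last block.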

Regex string with optional suffix in Reqtify

I am trying to filter the following lines in Reqtify:
Li.success tc_BT_Cancel_From_Pause_State_a4_2016-01-14_16h40m16s.log
Li.success tc_BT_Cancel_From_Pause_State_a5_2016-01-14_16h40m23s.log
Li.success tc_BT_Cancel_Init_BtControlStop_2016-01-14_16h40m23s.log
With a first regex ^Li.\w+\stc_(\w+)_20 I manage to extract
BT_Cancel_From_Pause_State_a4
BT_Cancel_From_Pause_State_a5
BT_Cancel_Init_BtControlStop
But my goal is to strip off the _a* suffix.
I already tried in an additional expression (.+)(_a\d)? but the result is unchanged.
Same for (.+)(_a\d|).
Does anyone have an idea how to strip this optional part off?
The final list should be:
BT_Cancel_From_Pause_State
BT_Cancel_From_Pause_State
BT_Cancel_Init_BtControlStop
Thanks,
Chris
This should do the trick:
^Li.\w+\stc_(\w+?)(_a\d)?_20
Group 1 still holds the name, now without the suffix; group 2 will be the optional _aX.
If you don't want to capture the second part, you can make it a non-capturing group:
^Li.\w+\stc_(\w+?)(?:_a\d)?_20
I propose this:
^Li.\w+\stc_(\w+?)(?:_a\d)?_20
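Here is a short Python sketch of why the lazy quantifier matters. It runs the pattern through Python's re module, not Reqtify, so treat it as an illustration of the regex only; base_name is a hypothetical helper.

```python
import re

lines = [
    "Li.success tc_BT_Cancel_From_Pause_State_a4_2016-01-14_16h40m16s.log",
    "Li.success tc_BT_Cancel_From_Pause_State_a5_2016-01-14_16h40m23s.log",
    "Li.success tc_BT_Cancel_Init_BtControlStop_2016-01-14_16h40m23s.log",
]

# Lazy \w+? stops as early as possible, so the optional _aX group gets a chance to match.
PATTERN = re.compile(r"^Li.\w+\stc_(\w+?)(?:_a\d)?_20")


def base_name(line):
    """Return the test name without the optional _aX suffix, or None if the line does not match."""
    m = PATTERN.match(line)
    return m.group(1) if m else None


names = [base_name(line) for line in lines]
print(names)

# With a greedy (.+), the first group swallows the suffix and the optional group matches nothing.
greedy = re.match(r"(.+)(_a\d)?", "BT_Cancel_From_Pause_State_a4")
print(greedy.group(1), greedy.group(2))  # BT_Cancel_From_Pause_State_a4 None
```

The second part shows why the attempts in the question left the result unchanged: a greedy group followed by an optional group never has to give anything back.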

regex for finding only between brackets

Given the regex and text below:
regex - #\{.*\}
text - "abc #{:abc :cde} dont-match #{:xyz :wqt} do-not do-not-not"
I would like to get only #{:abc :cde} #{:xyz :wqt} in the result. However, the above also gives me dont-match in the result. Any ideas how I should modify the regex?
#\{.*?\}
Make your * non-greedy. Or simply use
#\{[^}]*\}
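The difference between the three patterns can be seen with a quick Python sketch (the question does not say which regex engine is used, so Python's re module is an assumption here):

```python
import re

text = "abc #{:abc :cde} dont-match #{:xyz :wqt} do-not do-not-not"

# Greedy: .* runs to the last "}" in the text, so everything in between is included.
greedy = re.findall(r"#\{.*\}", text)

# Lazy: .*? stops at the first "}" after each "#{".
lazy = re.findall(r"#\{.*?\}", text)

# Negated class: [^}]* cannot cross a "}", so no backtracking is needed.
negated = re.findall(r"#\{[^}]*\}", text)

print(greedy)   # ['#{:abc :cde} dont-match #{:xyz :wqt}']
print(lazy)     # ['#{:abc :cde}', '#{:xyz :wqt}']
print(negated)  # ['#{:abc :cde}', '#{:xyz :wqt}']
```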

Need regex to strip away remaining part of a path

I am trying to write a regex which will strip away the rest of the path after a particular folder name.
If the input is:
/Repository/Framework/PITA/branches/ChangePack-6a7B6/core/src/Pita.x86.Interfaces/IDemoReader.cs
Output should be:
/Repository/Framework/PITA/branches/ChangePack-6a7B6
Some constraints:
ChangePack- will be followed by the change pack ID, which is a mix of digits and letters (a-z or A-Z) in any order. There is no limit on the length of the change pack ID.
ChangePack- is a constant. It will always be there.
The text before ChangePack can also change. For example, it can also be:
/Repository/Demo1/Demo2/4.3//PITA/branches/ChangePack-6a7B6/core/src/Pita.x86.Interfaces
My regex-fu is bad. What I have come up with till now is:
^(.*?)\-6a7B6
I need to make this generic.
Any help will be much appreciated.
The regex below can do the trick.
^(.*?ChangePack-[\w]+)
Input:
/Repository/Framework/PITA/branches/ChangePack-6a7B6/core/src/Pita.x86.Interfaces/IDemoReader.cs
/Repository/Demo1/Demo2/4.3//PITA/branches/ChangePack-6a7B6/core/src/Pita.x86.Interfaces
Output:
/Repository/Framework/PITA/branches/ChangePack-6a7B6
/Repository/Demo1/Demo2/4.3//PITA/branches/ChangePack-6a7B6
^(.*?ChangePack-[a-zA-Z0-9]+)
Try this. Instead of replacing, grab the match via $1 or \1. See demo.
https://regex101.com/r/iY3eK8/17
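A minimal Python sketch of the "grab group 1" approach, using the two sample paths from the question. The function name strip_after_changepack is hypothetical, and returning the path unchanged when nothing matches is a design choice for this example, not something the question specifies.

```python
import re

# Lazily match up to "ChangePack-", then take the alphanumeric change pack ID.
PATTERN = re.compile(r"^(.*?ChangePack-[a-zA-Z0-9]+)")


def strip_after_changepack(path):
    """Return the path up to and including the ChangePack folder."""
    m = PATTERN.match(path)
    return m.group(1) if m else path


paths = [
    "/Repository/Framework/PITA/branches/ChangePack-6a7B6/core/src/Pita.x86.Interfaces/IDemoReader.cs",
    "/Repository/Demo1/Demo2/4.3//PITA/branches/ChangePack-6a7B6/core/src/Pita.x86.Interfaces",
]
for p in paths:
    print(strip_after_changepack(p))
```

Using [a-zA-Z0-9]+ instead of \w+ keeps underscores out of the ID, which matches the constraint that the ID contains only digits and letters.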
Will you always have '/Repository/Framework/PITA/branches/' at the beginning? If so, this will do the trick:
/Repository/Framework/PITA/branches/\w+-\w*
Instead of regex you could use split and join functions. Example in Python:
path = "/a/b/c/d/e"
folders = path.split("/")  # ['', 'a', 'b', 'c', 'd', 'e']; the leading '' comes from the initial "/"
newpath = "/".join(folders[:3])  # keeps the first two folders, trims off everything from the third folder on
print(newpath)  # prints "/a/b"
If you really want regex, try something like ^.*\/folder\/ where folder is the name of the directory you want to match.

Get website regex from a website link

If I have a website link like: www.google.com/en/my-page/anotherpage
how can I use a regex to get /en/my-page? I am using this regex in IIS.
So far I have come up with something like this:
^(?:\\.|[^/\\])*/((?:\\.|[^/\\])*)/
but it is returning /en/my-page/ and I want it to return /en/my-page
In grep, your regex returns the string "www.google.com/en/". You can simply use the following regex if a positive lookbehind is not mandatory:
(/[^/]+)+
You could use a look-ahead assertion to get rid of the last slash:
/\/.*(?=\/)/
This one should suit your needs:
^[^/]+(/.*)/[^/]+$
The output you're looking for is in the first captured group.
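For reference, here is how the capture-group and look-ahead suggestions behave in Python. This is only a sketch: the regex flavor used by IIS may differ from Python's re module, and the variable names are made up for this example.

```python
import re

url = "www.google.com/en/my-page/anotherpage"

# Capture-group approach: group 1 is everything between the host and the last segment.
m = re.match(r"^[^/]+(/.*)/[^/]+$", url)
captured = m.group(1) if m else None

# Look-ahead approach: match greedily from the first "/", but stop before the last "/".
m2 = re.search(r"/.*(?=/)", url)
lookahead = m2.group(0) if m2 else None

print(captured)   # /en/my-page
print(lookahead)  # /en/my-page
```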