Regex in Splunk - value starts with a number and has commas in between

I am trying to write a regex to extract the number so that I can calculate the sum.
Below are the events:
abre0001.pxm: 55 records processed as of 2022-07-28 00:55:51.829407
abre0001.pxm: 23,555 records processed as of 2022-07-28 00:55:51.829407
abcd0001.pxm: 23,45,555 records processed as of 2022-07-28 00:55:52.543170
I want to extract the fields 55, 23,555, and 23,45,555 from each event and calculate the sum. However, I am unable to extract the number with a comma in it. I am able to get just the entries with only digits. Below is the regex used.
index="" source="" sourcetype="r" "ab*0001.pxm"
| rex field=_raw "pxm:\s+(?<value>/d+)/s"
| convert rmcomma(value)
| stats sum(value) as total_entries
The value field does not capture the numbers that contain a comma. It only extracts 55; the rest of the entries are blank. I am not sure what needs to be specified explicitly here.

| rex field=_raw "pxm:\s+(?<value>[\d,]+)\s"
| eval value=replace(value,",","")
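The d and s are now escaped with backslashes (\d and \s instead of /d and /s), and "," is added to the character class so that the named capture group "value" can contain commas as well as digits.
You then need to remove the commas, since they are not numeric characters and the value cannot be summed while it contains them.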
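To check the pattern outside Splunk, here is a minimal Python sketch run against the three sample events from the question. It is only an illustration: Python's `re` module writes named groups as `(?P<value>...)` rather than `(?<value>...)`, and the helper name `extract_values` is made up for this example.

```python
import re

# Sample events copied from the question.
events = [
    "abre0001.pxm: 55 records processed as of 2022-07-28 00:55:51.829407",
    "abre0001.pxm: 23,555 records processed as of 2022-07-28 00:55:51.829407",
    "abcd0001.pxm: 23,45,555 records processed as of 2022-07-28 00:55:52.543170",
]

# Same pattern as the rex above; only the named-group syntax differs.
pattern = re.compile(r"pxm:\s+(?P<value>[\d,]+)\s")


def extract_values(lines):
    """Return the captured numbers as ints, with the commas removed."""
    values = []
    for line in lines:
        match = pattern.search(line)
        if match:
            values.append(int(match.group("value").replace(",", "")))
    return values


values = extract_values(events)
total_entries = sum(values)
print(values, total_entries)
```

The three captures come out as 55, 23555 and 2345555 once the commas are stripped, which is what the stats sum needs.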

Related

Regex to capture group name on first line, then the corresponding value on the next?

I'm after a regex that will capture the name of the mailbox on the first line, then the corresponding value as a 'count' group after the carriage return. Below is a sample; there are roughly 14 mail addresses in total:
Protection.admin#test.com
705
Go.live#test.com
14
ABCpremier#test.com
29
reassuredtest.com
20
lifetest#test.com
8
I have used Rubular and I'm able to capture the values of the 'count' group there, but when I use the same regex in Splunk I only capture the first value, 705. I believe it's falling foul of the '.' in 'Go.live':
[a-z\.]+#test.com\r\n](?<count>[^\n\r]+)
Would someone kindly assist me with a query that would cycle through and capture the mailbox name as one capture group, then the count value that follows it?
To get more than the first match from a regex in Splunk you must use the max_match option to the rex command. Once all of the mailbox names and counts are extracted, they're paired up, split into separate events, and then broken apart again. If you try to split the events without pairing up the names and counts then you'll lose the association between the name and its count.
Here's a run-anywhere example query. Note that I had to fix the fourth email address.
| makeresults | eval _raw="Protection.admin#test.com
705
Go.live#test.com
14
ABCpremier#test.com
29
reassured#test.com
20
lifetest#test.com
8"
```Commands above create demo data. Delete IRL```
```Extract the mailbox names and counts```
| rex max_match=0 "(?<mailbox>[^#]+)#[\s\S]+?(?<count>\d+)"
```Combine each mailbox name with its count```
| eval results=mvzip(mailbox,count)
```Remove line ends```
| eval results=trim(results,"
")
```Separate each name/count pair into their own events```
| mvexpand results
```Break out the mailbox and count values into separate fields again```
| eval results=split(results,",")
| eval mailbox=mvindex(results, 0), count=mvindex(results, 1)
```Display the results```
| table mailbox count
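The pairing logic can also be tried outside Splunk. The following is a rough Python sketch, assuming the same demo data and the same pattern as the rex above (with Python's `(?P<name>...)` group syntax). `finditer` plays the role of max_match=0, and `strip()` stands in for the trim step, because every match after the first starts with the line break left over from the previous count.

```python
import re

# Demo data from the query above (fourth address already corrected).
raw = """Protection.admin#test.com
705
Go.live#test.com
14
ABCpremier#test.com
29
reassured#test.com
20
lifetest#test.com
8"""

# Same pattern as the rex command, in Python's named-group syntax.
pattern = re.compile(r"(?P<mailbox>[^#]+)#[\s\S]+?(?P<count>\d+)")

# Each match already holds one mailbox together with its count,
# so the association between the two is never lost.
pairs = [
    (m.group("mailbox").strip(), int(m.group("count")))
    for m in pattern.finditer(raw)
]

for mailbox, count in pairs:
    print(mailbox, count)
```

As in the Splunk query, the mailbox group holds only the part before the #.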

In Splunk, how to find the number of words based on a pipe separator, add a value to it, and assign it to a new field

Hi, I have an event like the one shown below:
Today's Greeting Messag=Hello|myname|name|is|Alice|myName|is|bob"}
How can I count the number of words between message= and "}? I have a | delimiter that should help me get the count of words in between. But I want to add a specific number to every count.
For example, for the above log I will get 8 words as the count, based on the | separator.
I would then like to add some number to it, like 8+2, and store the value in a new Splunk field.
This will help in checking whether any event crosses a threshold for that value, so that I can trigger an alarm.
Could someone please help me with this?
You can try the following, which splits the string into a multivalue field with one value per word, then counts the number of values in that field. You can then add whatever number you need to that count.
| eval msg_mv=split(msg,"|")
| eval words = mvcount(msg_mv)-1
| eval final_count = words+2
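The same split-and-count idea as a small Python sketch. It assumes the msg value already holds only the text between the = and the closing "} (how the field is extracted is not shown in the question), and the function name `padded_word_count` is made up for this example. The count is taken directly, without the -1 used in the query above, so the sample gives the 8 words the question expects; adjust the offset if your extracted field carries an extra element.

```python
def padded_word_count(msg, extra=2, sep="|"):
    """Count the separator-delimited words in msg and add a fixed offset."""
    words = msg.split(sep)
    return len(words) + extra


# Assumed field value: the part of the sample event between "=" and '"}'.
msg = "Hello|myname|name|is|Alice|myName|is|bob"

words = len(msg.split("|"))
final_count = padded_word_count(msg)
print(words, final_count)
```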

Reg exp search in notes/comments/description data in PostgreSQL 10.7

I have a scenario that I am not able to handle in version 10.7. Basically, I have a data column in which I need to find a regular expression pattern inside data that is in the form of notes/comments/description.
For example, Data in the column : The SSN number is 760-56-6289
In the above data, 760-56-6289 is the actual SSN, which I need to find across all schemas/tables/columns using the defined regular expression pattern. There can also be text before or after the actual SSN value.
Could you please let me know how to achieve this in PostgreSQL 10.7?
Please let me know if you need more information for the same.
demo:db<>fiddle
SELECT
(regexp_matches(mycolumn, '^.*([\d]{3}-[\d]{2}-[\d]{4}).*$'))[1]
FROM mytable
The RegEx means:
Start of text: ^
arbitrary number of characters: .*
group of your number: (...)
3 digit characters: [\d]{3}
- character
2 digits: [\d]{2}
- character
4 digits: [\d]{4}
arbitrary number of characters: .*
end of text: $
regexp_matches() returns all captured groups as an array. Since there is only one group here, the array contains only one value. This is your number, which you can get with the index [1].
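To see the pattern at work without a database, here is a short Python sketch using the same regular expression. It is only an approximation of the PostgreSQL behaviour: Python's `re` engine is not the PostgreSQL engine, the helper name `find_ssn` is made up, and `group(1)` plays the role of the [1] index.

```python
import re

# Same pattern as in the SELECT above.
SSN_PATTERN = re.compile(r"^.*([\d]{3}-[\d]{2}-[\d]{4}).*$")


def find_ssn(text):
    """Return the SSN-shaped substring of text, or None if there is none."""
    match = SSN_PATTERN.search(text)
    return match.group(1) if match else None


found = find_ssn("The SSN number is 760-56-6289")
missing = find_ssn("No number in this note")
print(found, missing)
```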

SPLUNK: extract and rank values from single event

Hi, I have a field which has repeated groups of "<id>:<Flag>:<Rank>:<weight>:<quantity>" separated by colons.
Example, 2113:X:1:2.92400000:14100.00000:613:X:7:2.92800:96300.00000:1132:L:2:2.92750000:14300.00000
I want to extract the id corresponding to the highest weight. In the above example I would get
Id Weight
613 2.928
With my regex knowledge, I can only handle a single repetition of the group, not more than that.
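As a way to pin down the expected result, the logic can be sketched in Python without any regex: split on the colon, cut the values into groups of five, and keep the group with the largest weight. This is an illustration of the desired output only, not a Splunk query, and the function name `id_with_highest_weight` is made up for this example.

```python
def id_with_highest_weight(field, group_size=5):
    """Return (id, weight) for the group with the largest weight.

    Each group is <id>:<Flag>:<Rank>:<weight>:<quantity>, so the id is at
    position 0 and the weight at position 3 within the group.
    """
    parts = field.split(":")
    groups = [parts[i:i + group_size] for i in range(0, len(parts), group_size)]
    best = max(groups, key=lambda group: float(group[3]))
    return best[0], float(best[3])


# Example value from the question.
field = (
    "2113:X:1:2.92400000:14100.00000:"
    "613:X:7:2.92800:96300.00000:"
    "1132:L:2:2.92750000:14300.00000"
)

best_id, best_weight = id_with_highest_weight(field)
print(best_id, best_weight)
```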

extract number from string in Oracle

I am trying to extract a specific text from an Outlook subject line. This is required to calculate turn around time for each order entered in SAP. I have a subject line as below
SO# 3032641559 FW: Attached new PO 4500958640- 13563 TYCO LJ
My final output should be like this: 3032641559
I have been able to do this in MS Excel with a formula like this:
=IFERROR(INT(MID([#[Normalized_Subject]],SEARCH(30,[#[Normalized_Subject]]),10)),"Not Found")
In the above formula, [#[Normalized_Subject]] is the name of the column in which the SO number exists. I have been asked to do this in Oracle, but I am very new to it. Your help on this would be greatly appreciated.
Note: in the above subject line the number 30 is common in every subject line.
The last parameter of REGEXP_SUBSTR() indicates the sub-expression you want to pick. In this case you can't just match 30 then some more numbers as the second set of digits might have a 30. So, it's safer to match the following, where x are more digits.
SO# 30xxxxxx
As a regular expression this becomes:
SO#\s30\d+
where \s matches a whitespace character, \d matches a digit, and the + means you want to match as many digits as there are. To use the sub-expression argument you need to have sub-expressions, i.e. create groups where you want to split the string:
(SO#\s)(30\d+)
Put this in the function call and you have it:
regexp_substr(str, '(SO#\s)(30\d+)', 1, 1, 'i', 2)
SQL Fiddle
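The grouping can be checked quickly in Python with the subject line from the question. This is a sketch rather than Oracle code: `group(2)` corresponds to the final sub-expression argument of REGEXP_SUBSTR(), `re.IGNORECASE` to the 'i' flag, and the function name `extract_so_number` is made up for this example.

```python
import re

# Same two groups as in the REGEXP_SUBSTR() call.
SO_PATTERN = re.compile(r"(SO#\s)(30\d+)", re.IGNORECASE)


def extract_so_number(subject):
    """Return the digits following 'SO# ' when they start with 30, else None."""
    match = SO_PATTERN.search(subject)
    return match.group(2) if match else None


subject = "SO# 3032641559 FW: Attached new PO 4500958640- 13563 TYCO LJ"
so_number = extract_so_number(subject)
print(so_number)
```

Anchoring on the SO# prefix is what keeps the PO number later in the subject from being picked up, even if it happened to contain 30.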