SPLUNK: extract and rank values from single event - regex

Hi, I have a field that contains repeated groups of "<id>:<Flag>:<Rank>:<weight>:<quantity>", separated by colons.
Example: 2113:X:1:2.92400000:14100.00000:613:X:7:2.92800:96300.00000:1132:L:2:2.92750000:14300.00000
I want to extract the id corresponding to the highest weight. In the above example I would get:
Id Weight
613 2.928
With my regex knowledge, I can only handle a single occurrence of the group in an event, not several.
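The logic being asked for can be sketched outside Splunk. The Python below is only an illustration, not SPL: it matches every group in the field and keeps the one with the largest weight. The pattern assumes the flag is a single letter and that ids and ranks are plain integers, which holds for the example but is a guess about the real data. The function name is made up for this sketch. In SPL, the "match every repetition" step would most likely rely on the max_match option of rex.

```python
import re

# One group is <id>:<Flag>:<Rank>:<weight>:<quantity>; groups follow each
# other, joined by the same colon separator.
# Assumption: the flag is a single letter, id and rank are integers.
GROUP = re.compile(r"(\d+):([A-Za-z]):(\d+):([\d.]+):([\d.]+)")


def id_with_highest_weight(field):
    """Return (id, weight) for the group with the largest weight, or None."""
    groups = GROUP.findall(field)  # list of 5-tuples, one per repetition
    if not groups:
        return None
    best = max(groups, key=lambda g: float(g[3]))  # g[3] is the weight
    return best[0], float(best[3])


sample = (
    "2113:X:1:2.92400000:14100.00000:613:X:7:2.92800:96300.00000:"
    "1132:L:2:2.92750000:14300.00000"
)
print(id_with_highest_weight(sample))  # ('613', 2.928)
```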

Related

Regex in splunk - starting with number and has comma in between

I am trying to write a regex to extract the number so that I can calculate the sum.
Below is the event:
abre0001.pxm: 55 records processed as of 2022-07-28 00:55:51.829407
abre0001.pxm: 23,555 records processed as of 2022-07-28 00:55:51.829407
abcd0001.pxm: 23,45,555 records processed as of 2022-07-28 00:55:52.543170
I want to extract the fields 55, 23,555, and 23,45,555 from each event and calculate the sum. However, I am unable to extract the number with a comma in it. I am able to get just the entries with only digits. Below is the regex used.
index="" source="" sourcetype="r" "ab*0001.pxm"
| rex field=_raw "pxm:\s+(?<value>/d+)/s"
| convert rmcomma(value)
| stats sum(value) as total_entries
The value field does not capture numbers that contain a comma. Only 55 is extracted; the other entries are blank. I am not sure what needs to be specified explicitly here.
| rex field=_raw "pxm:\s+(?<value>[\d,]+)\s"
| eval value=replace(value,",","")
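Here \d and \s are escaped with backslashes (the original used forward slashes), and "," is added to the character class so that the named capture group "value" can contain commas.
You then need to remove the commas, since they are not numeric characters.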
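To check the pattern outside Splunk, here is a small Python sketch of the same two steps (capture digits and commas, then strip the commas) followed by the sum the question asks for. It reuses the three sample events from the question. The function name is only illustrative, and Python's regex engine is not the one Splunk uses, although this pattern behaves the same in both.

```python
import re

# Same pattern as the rex above, with an unnamed capture group.
VALUE = re.compile(r"pxm:\s+([\d,]+)\s")


def total_records(events):
    """Sum the record counts, ignoring events where the pattern does not match."""
    total = 0
    for event in events:
        match = VALUE.search(event)
        if match:
            # Remove thousands separators before converting to a number.
            total += int(match.group(1).replace(",", ""))
    return total


events = [
    "abre0001.pxm: 55 records processed as of 2022-07-28 00:55:51.829407",
    "abre0001.pxm: 23,555 records processed as of 2022-07-28 00:55:51.829407",
    "abcd0001.pxm: 23,45,555 records processed as of 2022-07-28 00:55:52.543170",
]
print(total_records(events))  # 2369165
```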

Google Sheets ArrayFormula to get INITIALS of arbitrary length name

Sample sheet.
As the title says, given a column where each cell holds an arbitrary number of words of arbitrary length, I want a single ArrayFormula that returns the first letters of all words in each cell.
I have tried two methods, seen in sample sheet.
1) Using SPLIT and ARRAYFORMULA, I can get the initials for one cell but cannot extend the formula down the column.
2) Using two REGEXEXTRACT calls, I can get the first two initials and extend the formula down the column.
But is it possible to get the initials for an arbitrary number of words for the whole column using ArrayFormula?
Is it possible to use REGEXEXTRACT to return the first letters of many words?
This replaces every word with its captured first letter:
=ARRAYFORMULA(UPPER(REGEXREPLACE(A1:A6,"(\w)\S*\s?","$1")))
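The same substitution can be tried in Python to see what the pattern does to each cell. This is only a sketch: the sample names are made up, and Python's \w is Unicode-aware while the engine behind Sheets may treat \w differently for non-ASCII letters.

```python
import re


def initials(name):
    """Replace each word (plus one trailing space) with its first character."""
    # (\w) captures the first character, \S* eats the rest of the word,
    # \s? eats the single separator that follows it, if any.
    return re.sub(r"(\w)\S*\s?", r"\1", name).upper()


# Hypothetical column values standing in for A1:A6.
names = ["John Ronald Reuel Tolkien", "ada lovelace", "Plato"]
print([initials(n) for n in names])  # ['JRRT', 'AL', 'P']
```

Because the replacement works on one string at a time and never needs to split it, it can be applied to a whole range, which is what lets the formula sit inside ARRAYFORMULA.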

PostgreSQL - finding string using regular expression

Within Postgres, I want to search a column for a string (an account number). I have a log table with a parameters column that stores parameters passed in from the application. The column holds a paragraph of text, and one of the parameters stored in it is the account number.
The position of the account number is not consistent in the text and some rows in this table have nothing in the column (since no parameters are passed on certain screens). The account number has the following format: L1234567899. So for the account number, the first character is a letter and then it is followed by ten digits.
I am looking for a way to extract the account number alone from this column so I can use it in a view for a report.
So far what I have tried is getting it into an array, but since the position changes, I cannot count on it being in the same place.
select foo from regexp_split_to_array(
(select param from log_table where id = 9088), E'\\s+') as foo
You can use regexp_match() to achieve that result.
(regexp_match(foo,'[A-Z][0-9]{10}'))[1]
Use substring to pull out the match group.
select substring ('column text' from '[A-Z]\d{10}')
Reference: PostgreSQL regular expression capture group in select
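For a quick check of the pattern away from the database, the same search can be sketched in Python. The parameter text below is invented for illustration, and the function name is not from either answer. Like the SQL versions, it returns nothing for rows where the column is empty or has no account number. Note that the pattern is not anchored, so it would also match inside a longer run of digits.

```python
import re

# One uppercase letter followed by exactly ten digits, anywhere in the text.
ACCOUNT = re.compile(r"[A-Z]\d{10}")


def account_number(params):
    """Return the first account number found in params, or None."""
    if not params:  # covers rows where the column is NULL or empty
        return None
    match = ACCOUNT.search(params)
    return match.group(0) if match else None


# Hypothetical contents of the parameters column.
print(account_number("screen=billing acct=L1234567899 user=42"))  # L1234567899
print(account_number(None))  # None
```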

OpenRefine: split a cell based on a string of 5 digits (postal code)

I am new to OpenRefine and GREL.
In an address row, I am trying to extract the city and the postal code.
The row typically contains: 12 rue du Paradis 75012 Paris
I'd like to split this row starting from the 5-digit number (75012). After that I could easily extract the city.
In the command "Split into several columns", what Regular expression would you put (or is it another command)?
Thanks!
The 'split into several columns' takes a regular expression as an argument to specify the separator to be used when doing the split. This is probably not what you need in this case - since there isn't a common expression for the separator.
Instead you would probably be better off using the "Add column based on this column" option and then using the 'match' function to create the new column. 'match' takes a regular expression as an argument and returns the text captured by its groups, so you can use it to do pattern matching in a string. In this case, for example, you could use something like:
value.match(/.*\s+(\d{5})\s+(.*)/)
This would capture the 5 digit number and the city in an array:
["75012","Paris"]
You could then use this to create the values you want in the new column, or in two new columns. E.g.:
value.match(/.*\s+(\d{5})\s+(.*)/)[0]
will get the number, and the same expression with [1] will get the city.
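To see how the pattern splits the sample address, here is an equivalent sketch in Python rather than GREL. re.fullmatch is used because GREL's match needs the whole value to fit the pattern. The function name is only for this illustration.

```python
import re


def postal_code_and_city(address):
    """Return [postal_code, city], or None if the address does not fit."""
    # The leading .* is greedy, so the LAST whitespace-delimited run of
    # exactly five digits is taken as the postal code.
    match = re.fullmatch(r".*\s+(\d{5})\s+(.*)", address)
    return list(match.groups()) if match else None


parts = postal_code_and_city("12 rue du Paradis 75012 Paris")
print(parts)     # ['75012', 'Paris']
print(parts[0])  # 75012
```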

extract number from string in Oracle

I am trying to extract a specific text from an Outlook subject line. This is required to calculate turn around time for each order entered in SAP. I have a subject line as below
SO# 3032641559 FW: Attached new PO 4500958640- 13563 TYCO LJ
My final output should be like this: 3032641559
I have been able to do this in MS excel with the formulas like this
=IFERROR(INT(MID([#[Normalized_Subject]],SEARCH(30,[#[Normalized_Subject]]),10)),"Not Found")
In the above formula, [#[Normalized_Subject]] is the name of the column in which the SO number exists. I have been asked to do this in Oracle, but I am very new to it. Your help on this would be greatly appreciated.
Note: in the above subject line the number 30 is common in every subject line.
The last parameter of REGEXP_SUBSTR() indicates the sub-expression you want to pick. In this case you can't just match 30 then some more numbers as the second set of digits might have a 30. So, it's safer to match the following, where x are more digits.
SO# 30xxxxxx
As a regular expression this becomes:
SO#\s30\d+
where \s indicates a space, \d indicates a numeric character, and the + means you want to match as many digits as there are. To return only the number, we can use the sub-expression argument; for that the pattern needs sub-expressions, i.e. groups marking where you want to split the string:
(SO#\s)(30\d+)
Put this in the function call and you have it:
regexp_substr(str, '(SO#\s)(30\d+)', 1, 1, 'i', 2)
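The grouping can be checked outside Oracle with a short Python sketch. Here group(2) plays the role of the last argument of REGEXP_SUBSTR, and re.IGNORECASE stands in for the 'i' match parameter. The function name is only illustrative.

```python
import re


def sales_order(subject):
    """Return the SO number that follows 'SO# ', or None if there is none."""
    # Group 1 is the 'SO# ' prefix, group 2 is the number starting with 30.
    match = re.search(r"(SO#\s)(30\d+)", subject, re.IGNORECASE)
    return match.group(2) if match else None


subject = "SO# 3032641559 FW: Attached new PO 4500958640- 13563 TYCO LJ"
print(sales_order(subject))  # 3032641559
```

Anchoring on the SO# prefix is what keeps the PO number, or any later digits that happen to contain 30, from being picked up.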