Regex to extract multiple pieces across multiple lines - regex

I am working on making basic Zabbix items for Wazuh. It's not meant to replace Wazuh, but our techs live in Zabbix, and this provides an alert in Zabbix so they know something happened and can go check Wazuh.
The issue is that Wazuh alerts are multi-line alerts and we need 2 pieces of information.
From the example below, we would like to get:
(server)
(level 10) -> 'High amount of POST requests in a small period of time (likely bot).'
I use the following regex:
([\r\n].*?)(?:=?\r|\n)(.*?(?:(level 10.*)).*)
This matches on level 10, and I can use group 1 to get the host name (server), but I am unable to get the second part. I can create an item for each rule level (1-10, for example) and get the host name, but I cannot get the alert text itself. I read that I need to create an individual item for each piece, but I found that Zabbix does not always grab the right piece from the alert: the level 10 text may be captured by one item while the host name comes from a different log entry.
Is there a way to capture all of these in one item using regex in Zabbix?
Thank you. I appreciate all your help.
** Alert 1646336311.8104996: - web,appsec,attack,pci_dss_6.5,pci_dss_11.4,gdpr_IV_35.7.d,nist_800_53_SA.11,nist_800_53_SI.4,tsc_CC6.6,tsc_CC7.1,tsc_CC8.1,tsc_CC6.1,tsc_CC6.8,tsc_CC7.2,tsc_CC7.3,
2022 Mar 03 19:38:31 (server) any->/var/log/nginx/access.log
Rule: 31533 (level 10) -> 'High amount of POST requests in a small period of time (likely bot).'

If I understand correctly, you can use JavaScript in Preprocessing:
var value1 = value.replace(/.*1.*/g, 'Mobile"');
var value2 = value.replace(/.*2.*/g, 'Mobile"');
var finishvalue = value1.concat(value2);
return finishvalue;
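As a more concrete sketch of the same idea, the function below pulls the host name, level and description out of one alert with a single regex, so that all three pieces come from the same log entry. The function name extractWazuhAlert and the " | " output format are made up for illustration, and the pattern assumes the alert always has the timestamp/host line directly followed by the Rule line, as in the sample in the question. In a Zabbix JavaScript preprocessing step, only the function body would be used, with `value` supplied by Zabbix.

```javascript
// Hypothetical helper: extract host, level and description from one Wazuh alert.
// Assumes the "(host)" line is immediately followed by the "Rule:" line.
function extractWazuhAlert(value) {
    // Group 1: host name inside the parentheses after the timestamp.
    // Group 2: rule level. Group 3: quoted rule description.
    var pattern = /^\d{4} \w{3} \d{1,2} [\d:]{8} \(([^)]+)\).*\r?\nRule: \d+ \(level (\d+)\) -> '(.*)'/m;
    var m = pattern.exec(value);
    if (!m) {
        // No complete alert found in this value.
        return '';
    }
    return m[1] + ' | level ' + m[2] + ' | ' + m[3];
}
```

Because the host and the rule text are captured by one match, they cannot come from two different log entries.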

Related

Find All String Occurrences, Except The Last One Found, and Remove Them

I am using Google Docs to open Walmart receipts that I email to myself. The Walmart store that I use 99.9% of the time seems to have made a firmware update to the Ingenico POS terminal that makes it display a running SUBTOTAL after each item is identified by the scanner. Here are some images to support my question.
The POS terminal looks like this:
The second image is the electronic receipt, which I email to myself from their iOS app. It is presumably taken from the POS terminal, because it has the extra running SUBTOTAL lines after each item, like the POS terminal screen shows. It has been doing this for a few months, and management has given me no reason to believe that it will be corrected any time soon.
The final image is my actual paper receipt. This is printed from the register; it's the one that you walk out with and show the greeter/exit person so they can check your buggy and the items you've purchased.
Note that it does not show the extra SUBTOTAL.
I open the electronic receipt in a Google Document and the automatic OCR spits out the text of the receipt. It does a pretty darn good job; I'd say it's 95%+ accurate with these receipts. I apply a very crude little regex that reformats these electronic receipts so that I can enter them into a database and use that data for my family's budgeting, taxes, and so forth. That has been working very well for me. I would like to automate that process further, but that's for a different question some day, perhaps.
Right now, that little crude regex no longer formats the receipt into something usable for me.
What I would like to do is to remove the extra SUBTOTALS from the (broken) electronic receipt but leave the last SUBTOTAL alone. I highlighted the last SUBTOTAL on the receipt, which is always there, and should remain.
I have seen two other questions that are similar but I could not apply them to my situation. One of them was:
Remove all occurrences except the last one
What have I tried?
The following regex works in the online tester at regex101.com:
\nSUBTOTAL\t\d{1,3}(?:[.,]\d{3})*(?:[.,]\d{2})
It took me a while to come up with that regex from searching around. Essentially, I want it to find every SUBTOTAL literal with a preceding new-line and a decimal amount (from 0.01 to 999.99), and replace each match with a new-line. Then my other regex can work on the result like it did before the firmware update to the POS terminal.
The regex correctly identifies every SUBTOTAL (including the last one) on the regex101.com site. I can apply a substitution of "\n" and I am back to seeing receipt data I can work with, but there are two issues:
1) I can't replicate this using Google Apps Script.
Here is my example:
function myFunction() {
var body = DocumentApp.getActiveDocument().getBody();
var newText = body.getText()
.match('\nSUBTOTAL\t\d{1,3}(?:[.,]\d{3})*(?:[.,]\d{2})')[1]
.replace(/%/mgi, "%\n");
body.clear();
body.setText(newText);
}
2) If I were to get the above code to work, I still have the issue of wanting to leave the last SUBTOTAL intact.
Here is a Google Doc that I have set up to experiment with:
https://docs.google.com/document/d/11bOJp2rmWJkvPG1FCAGsQ_n7MqTmsEdhDQtDXDY-52s/edit?usp=sharing
I use this regular expression:
// JavaScript Syntax
'/\nSUBTOTAL\s\d{1,3}\.\d{2}| SUBTOTAL\n\d{1,3}\.\d{2}/g'
I also made a script for Google Docs. You can use this Google Doc and see the results.
function deleting_subs() {
var body = DocumentApp.getActiveDocument().getBody();
var newText = body.getText();
var out = newText.replace(/\nSUBTOTAL\s\d{1,3}\.\d{2}| SUBTOTAL\n\d{1,3}\.\d{2}/g, '');
// This makes the resulting text more readable.
out = out.replace(/R /g, 'R\n');
body.clear();
body.setText(out);
}
To run the script, open the Google Doc and click:
Add-ons.
Del_subs -> Deleting Subs.
Tip: After running the add-on (Deleting Subs), undo the edit to the document so that other users can get back to the previous version of the text.
Hope this helps.
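For the second part of the question (keeping the last SUBTOTAL), one possible approach is to count the matches first and then skip the final one during replacement. This is only a sketch: the function name removeAllButLastSubtotal is made up, and the pattern assumes each running subtotal sits on its own line as "SUBTOTAL", one whitespace character, and an amount like 12.34. In Apps Script it could be applied to the string returned by body.getText().

```javascript
// Hypothetical helper: remove every running SUBTOTAL line except the last one.
function removeAllButLastSubtotal(text) {
    // A newline, the literal SUBTOTAL, one whitespace character, then an amount such as 3.00.
    var pattern = /\nSUBTOTAL\s\d{1,3}\.\d{2}/g;
    var matches = text.match(pattern);
    if (!matches || matches.length < 2) {
        // Zero or one SUBTOTAL: nothing to remove.
        return text;
    }
    var toRemove = matches.length - 1;
    return text.replace(pattern, function (match) {
        if (toRemove > 0) {
            toRemove--;
            return '';   // drop this running subtotal
        }
        return match;    // keep the final subtotal untouched
    });
}
```

Since each match starts with the preceding new-line and does not include the following one, dropping a match leaves the surrounding lines intact.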

Stream Analytics Output

I have a project that uses an event hub to receive data, which is sent every second. The data is received by a website using SignalR, and this is all working fine. I have been storing the data in blob storage via a Stream Analytics job, but this is really slow to access, and with the amount of data I am receiving from just 6 devices, it will get even slower as this increases. I need to access the data to display historical data in graphs on the website, which is then topped up with the live data coming in.
I don't really need to store the data every second, so I thought about storing it only every 30 seconds instead, but in a SQL DB. What I am trying to do is still receive the data every second but store it only every 30 seconds. I have tried a tumbling window, but from what I can see, this just dumps everything every 30 seconds instead of single entries.
Am I misunderstanding the Tumbling, Sliding and Hopping windows? I am guessing I cannot use them in this way. If that is the case, I am guessing the only way to do it would be to have the output DB as an input, so I can cross-reference the timestamp with the current time?
Unless anyone has any other ideas? Any help would be appreciated.
Thanks
Am I misunderstanding the Tumbling, Sliding and Hopping windows?
You are correct that this will put all events within the Tumbling/Sliding/Hopping window together. However, windows are only valid within a GROUP BY clause, which requires an aggregate function over the group.
There is an aggregate function, Collect(), which creates an array of the events within a group.
I think this should be possible if you group the events within a 30-second tumbling window using Collect() and then, in the next step, CROSS APPLY each record, which should output all events received within the 30 seconds.
WITH Grouper AS (
SELECT Collect() AS records
FROM Input TIMESTAMP BY time
GROUP BY TumblingWindow(second, 30)
)
SELECT
record.ArrayValue.FieldA AS FieldA,
record.ArrayValue.FieldB AS FieldB
INTO Output
FROM Grouper
CROSS APPLY GetArrayElements(Grouper.records) AS record
If you are trying to aggregate 30 entries into one summary row every 30 seconds then a tumbling window is a good choice. Something like the following should work:
SELECT System.TimeStamp AS OutTime, TollId, COUNT(*) as cnt, sum(TollCharge) as TollCharge
FROM Input TIMESTAMP BY EntryTime
GROUP BY TollId, TumblingWindow(second, 30)
Thanks for the response. I have been speaking to my contact at Microsoft and he suggested something similar; I had also found something like that in various examples online. What I actually want to do is update the database with the data only every 30 seconds: I receive an event, store it, and do not store another one until 30 seconds have passed. I am not sure how I can do that with an ASA job, to be honest, as I need a record of the last time the data was stored. I already have a connection to the event hub from my website, so in the receiver I am going to perform a simple check and store the data from there.
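The "simple check" in the receiver could look roughly like the sketch below. It is written in JavaScript purely for illustration; the names createStoreThrottle and shouldStore, the per-device key, and the use of millisecond timestamps are all assumptions, not part of the actual project.

```javascript
// Hypothetical throttle: allow one stored event per device per interval.
function createStoreThrottle(intervalMs) {
    var lastStored = new Map(); // deviceId -> time (ms) of the last stored event
    return function shouldStore(deviceId, eventTimeMs) {
        var last = lastStored.get(deviceId);
        if (last === undefined || eventTimeMs - last >= intervalMs) {
            // First event for this device, or the interval has elapsed: store it.
            lastStored.set(deviceId, eventTimeMs);
            return true;
        }
        // Still inside the interval: forward to the live view but skip the database.
        return false;
    };
}
```

Every event can still be pushed to the live graphs each second; only the write to the database is gated by the returned function.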

Regular Expression, Creating an alarm to check for a string that appears less than 5 times.

I think it is going to be harder to explain this than to get a solution. I am using SiteScope to monitor a webpage. I need to check the webpage for the string div class='proxy'. We are using software that automatically checks a group of computers. It then creates a dashboard with each computer and its status. We always have 5 computers in this group. We want an alarm that goes off when a computer disappears. The SiteScope monitor uses regex to search for content on the page. There are no other identifying marks we can search for except div class='proxy', which is created for each computer. Of course, in the source code the 5 div classes are not sequential, so (div class='proxy'){5} does not return a happy response.
What we want:
If div class='proxy' appears 5 times in the document, return true.
If div class='proxy' appears fewer than 5 times in the document, return false.
Like I said, the hard part was going to be explaining the issue.
Have you tried something like: (div class='proxy'.?){5}?
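Note that `.?` allows at most one character between occurrences, so the gap between the divs needs something that can span arbitrary text, including line breaks. A sketch of that idea in JavaScript regex syntax follows; SiteScope uses its own Perl-style regex handling, so whether the same pattern and newline behaviour carry over there is an assumption, and the function names are made up for illustration.

```javascript
// Matches when div class='proxy' occurs at least 5 times, with any text
// (including newlines) between occurrences. [\s\S]*? is a lazy "anything" gap.
var atLeastFiveProxies = /(?:div class='proxy'[\s\S]*?){5}/;

// Hypothetical helper: true when the page contains five or more proxy divs.
function hasFiveProxies(html) {
    return atLeastFiveProxies.test(html);
}

// Hypothetical helper: the exact number of occurrences, for comparison.
function countProxies(html) {
    return (html.match(/div class='proxy'/g) || []).length;
}
```

The pattern succeeds as soon as a fifth occurrence is found, so it answers "5 or more" rather than "exactly 5", which is enough to detect a computer disappearing from a group of five.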

Is there a limit to how long a filename URL statement can be?

I am now on design number three, I think, of a program that submits a series of stock tickers and metrics to Yahoo Finance. I don't need to go into too much detail about what it does, as I have most of it up and running now apart from one remaining issue.
The Yahoo Finance site lists about 2700 stock tickers on the NASDAQ alone. I anticipated that submitting all of these in one filename URL statement might fall over for some reason, so I set an initial string length of 500 tickers and built some nested macros to iterate through in 500-ticker blocks until everything I wanted had been extracted.
However, during development of the code it seems that if I build a string with more than about 200 tickers in it, I get an error telling me that SSL Support cannot be run, and the code falls over.
Does anyone have any idea why this is? In an ideal world I would like to be able to do this in one pass, where all 2700 stock tickers are pulled down. If this isn't possible, it would be great if someone could explain why not.
Thanks

How to use Refunds and Adjustments via SDK (QBFC)?

I have been integrating my application with QuickBooks using the SDK QBFC. I have invoices working successfully. The issue I have come across, and am struggling to find resources for, is with credits. When sending the request to create an invoice with a negative value I get this message:
"Transaction amount must be positive."
I have tried using a negative quantity and rate, hoping it would work out the amount as a negative, but then I got this:
"You can't use negative rates on inventory items, use neg quantity instead"
So, I have come to the realization that I need to use Refunds and Adjustments in QuickBooks, and I cannot find any examples to follow.
I have worked this out. Use the same structure you would use to create the request for an invoice, but replace:
Dim invAdd As IInvoiceAdd
Set invAdd = msgSetRq.AppendInvoiceAddRq
With:
Dim refundAdd As ICreditMemoAdd
Set refundAdd = msgSetRq.AppendCreditMemoAddRq