AWS Metrics Filter pattern Extraction - amazon-web-services

I have awsService.log logs being sent to CloudWatch and I want to create a metric filter to extract the error value.
Example:
06/13/2020 07:35:33 : 578 : 3 : error occurs
05/13/2020 07:35:33 : 3 : 3 : error occurs
The error value I would like to extract is : 3
I tried many regex expressions like * : * : 3 : but it does not work.
Any help would be appreciated.

Unfortunately no complex patterning (such as Regex) is currently supported with Metric Filters.
According to the documentation you have 3 choices:
Trying to match based on an exact string ([": 3 :"])
Using JSON metric filters (not possible for your example as it requires JSON)
Filtering based on this being a space-separated event ([date, time, separator1, int1, separator2, int2=3, ...])
Regarding extracting the error value: Metric Filters provide a count for every time the event occurs; they don't extract values from the matched event itself.
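For what it's worth, option 3 could be wired up with boto3 roughly like this - a minimal sketch, assuming a hypothetical log group, filter name and metric names, and reusing the space-delimited pattern from the list above:

import boto3

logs = boto3.client("logs")

# Space-delimited pattern from option 3: the 6th whitespace-separated
# token (int2) must equal 3. All names here are hypothetical placeholders.
logs.put_metric_filter(
    logGroupName="awsService-log-group",
    filterName="error-value-3",
    filterPattern="[date, time, separator1, int1, separator2, int2 = 3, ...]",
    metricTransformations=[
        {
            "metricName": "ErrorValue3Occurrences",
            "metricNamespace": "AwsService",
            "metricValue": "1",  # counts matching events rather than extracting int2
        }
    ],
)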

Related

Fluentd Parsing

Hi, I'm trying to parse a single-line log using Fluentd. Here is the log I'm trying to parse:
F2:4200000000000000,F3:000000,F4:000000060000,F6:000000000000,F7:000000000,F8..........etc
This should parse into something like this:
{ "F2" : "4200000000000000", "F3" : "000000", "F4" : "000000060000" ............etc }
I tried to use regex, but it's confusing and makes me write multiple regexes for different keys and values. Is there an easier way to achieve this?
EDIT1: Heya! I will make this more detailed. I'm currently tailing logs with Fluentd into Elasticsearch+Kibana. Here is an unparsed example log that Fluentd is sending to Elasticsearch:
21/09/02 16:36:09.927238: 1 frSMS:0:13995:#HTF4J::141:141:msg0210,00000000000000000,000000,000000,007232,00,#,F2:00000000000000000,F3:002000,F4:000000820000,F6:Random message and strings,F7:.......etc
Elasticsearch received message:
{"message":"frSMS:0:13995:#HTF4J::141:141:msg0210,00000000000000000,000000,000000,007232,00,#,F2:00000000000000000,F3:002000,F4:000000820000,F6:Random
digits and chars,F7:.......etc"}
This log has only the message key, so I can't index and create dashboards using just the whole message field. What I'm trying to achieve is to capture only the useful fields, add a key to values that have no key, and make indexing easier.
Expected output:
{"logdate" : "21/09/02 16:36:09.927238",
"source" : "frSMS",
"UID" : "#HTF4J",
"statuscode" : "msg0210",
"F2": "00000000000000000",
"F3": "randomchar314516",.....}
I used the regex parser plugin to parse it like this, but it was too overwhelming. Here is what I did so far:
^(?<logDate>\d{2}.\d{2}.\d{2}\s\d{2}:\d{2}:\d{2}.\d{6}\b)....(?<source>fr[A-Z]{3,4}|to[A-Z]{3,4}\b).(?<status>\d\b).(?<dummyfield>\d{5}\b).(?<HUID>.[A-Z]{5}\b)..(?<d1>\d{3}\b).(?<d2>\d{3}\b).(?<msgcode>msg\d{4}\b).(?<dummyfield1>\d{16}\b).(?<dummyfield2>\d{6}\b).(?<dummyfield3>\d{6,7}\b).(?<dummyfield4>\d{6}\b).(?<dummyfield5>\d{2}\b)...
Which results in:
{"logDate": "21/09/02 16:36:09.205706",
"source": "toSMS",
"status": "0",
"dummyfield": "13995",
"UID": "#HTFAA",
"d1": "156",
"d2": "156",
"msgcode": "msg0210",
"dummyfield1": "0000000000000000",
"dummyfield2": "002000",
"dummyfield3": "2000000",
"dummyfield4": "00",
"dummyfield5": "2000000",
"dummyfield6": "867202"}
This only applies to the example log, and it has useless fields like field1, dummyfield, dummyfield1, etc.
Other logs have the useful values and keys (date, source, msgcode, UID, F1, F2 fields) like I showcased in the expected output. The not-useful fields are not static (they can be absent, or have fewer or more digits and chars), so they trigger the pattern-not-matched error.
So the questions are:
How do I capture the useful fields I mentioned using regex?
How do I capture the F1, F2, F3... fields that have different value
patterns, like mixed chars and strings?
PS: I wrapped the regex I wrote in an HTML snippet so the <> capturing fields don't get deleted.
Regex pattern to use:
(F[\d]+):([\d]+)
This pattern will catch all the 'F' keys with whatever digits come after - yes, even if it's F105 it still works. The whole 'F105' will be stored as the first group of each regex match.
The right part of the pattern will catch the value: all the digits following ':' up until any character that is not a digit (i.e. ',', 'F', etc.), and will store it as the second group of the match.
Use
Depending on your coding language, you will have to iterate over the regex matches and extract group 1 and group 2 respectively.
Python example:
import re

log = 'F2:4200000000000000,F3:000000,F4:000000060000,F6:000000000000,F7:000000000,F105:9726450'
pattern = r'(F[\d]+):([\d]+)'  # group 1 = field name, group 2 = digit-only value

log_dict = {}
for match in re.finditer(pattern, log):
    log_dict[match.group(1)] = match.group(2)

print(log_dict)
Output
{'F2': '4200000000000000', 'F3': '000000', 'F4': '000000060000', 'F6': '000000000000', 'F7': '000000000', 'F105': '9726450'}
Assuming the logdate will be static (pattern-wise), you can skip over the useless values with ".+" in the regex and collect the useful values by their patterns. So the regex will be like this:
(?<logdate>\d{2}.\d{2}.\d{2}\s\d{2}:\d{2}:\d{2}.\d{6}\b).+(?<source>fr[A-Z]{3,4}|to[A-Z]{3,4}).+(?<UID>#[A-Z0-9]{5}).+(?<statuscode>msg\d{4})
And the output will be like:
{"logdate" : "21/09/02 16:36:09.927238", "source" : "frSMS",
"UID" : "#HTF4J","statuscode" : "msg0210"}
And I'm working on getting F2,F3,FN keys and values.
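To show how the header regex and the F-field pattern could be combined outside Fluentd, here is a minimal Python sketch. Two assumptions of mine: Python spells named groups as (?P<name>...), and the F-field value pattern is loosened to [^,]+ so mixed values like F6:Random message and strings are captured as well.

import re

log = ("21/09/02 16:36:09.927238: 1 frSMS:0:13995:#HTF4J::141:141:msg0210,"
       "00000000000000000,000000,000000,007232,00,#,"
       "F2:00000000000000000,F3:002000,F4:000000820000,"
       "F6:Random message and strings,F7:000000000")

# Header fields: logdate, source, UID, statuscode (same groups as above).
header = re.search(
    r"(?P<logdate>\d{2}.\d{2}.\d{2}\s\d{2}:\d{2}:\d{2}.\d{6}\b).+"
    r"(?P<source>fr[A-Z]{3,4}|to[A-Z]{3,4}).+"
    r"(?P<UID>#[A-Z0-9]{5}).+"
    r"(?P<statuscode>msg\d{4})",
    log,
)
record = header.groupdict() if header else {}

# F-fields: the value runs up to the next comma, so letters and spaces are allowed.
for match in re.finditer(r"(F\d+):([^,]+)", log):
    record[match.group(1)] = match.group(2)

print(record)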

Custom transform not getting applied in wrangler in Google Cloud Data Fusion

I am trying the following custom transform in a wrangler in Google Cloud Data Fusion:
set-column column (parse-as-json :column 2 ) ? column =^ "[" : (parse-as-json :column 1 )
I want to parse column as JSON to a depth of 2 if it is an array, which means if it starts with a square bracket ([), otherwise to a depth of 1. I am not sure if the colon in the parse-as-json directive is causing an issue here.
If I change it to the following, it works fine:
set-column column 'a' ? column =^ "[" : 'b'
I have also tried escaping the colon in the parse-as-json directive with a backslash; it still didn't work. What am I doing wrong here? Please suggest.
We don't currently support nested directives (i.e. set-column with parse-as-json).
You can try first making a copy of the column, then parsing one copy with depth 1 and the clone with depth 2. Finally you can use set-column to pick whichever column is correct.
For example, let's say the original column is called 'body', and depth 2 produces null when the value doesn't start with "["; then you can do something like this:
copy body body_clone
parse-as-json body 1
parse-as-json body_clone 2
set-column final_result !body_clone_field ? body_field : body_clone_field

Parsing a name from a complex string in Tableau

I have a series of values in Tableau that are long strings intermixed with letters and numbers. I am unable to control the data output, but would like to parse the names from these strings. They follow this format:
Potato 1TByte 4.5 NFA
Board 256GByte 553 NCA
Launch 4 512GByte 4.5 NFA
Launch 4S 512GByte 4.5 NCA
From each of these, I am attempting to capture the following:
"Potato"
"Board"
"Launch 4"
"Launch 4S"
Each string follows the same format: the name, followed by size, followed by some extra information we don't really care about.
I've tried to put together some text parsing strings, but am coming up short, and am still trying to learn regular expressions.
The Tableau calculated field I was trying to work with was something like the following:
LEFT([String], FIND([String], "Byte") - 2)
The issue is that the text and numbers preceding "Byte" can be anywhere from 2 to 4 characters, and I need a way to identify that length.
Any help would be greatly appreciated!
One option which uses a regex replacement:
REGEXP_REPLACE('Launch 4 512GByte 4.5 NFA', ' \d+[A-Z]Byte .*$', '')
This strips off everything from the Byte term to the right, leaving us with only the product name.
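If it helps to verify the pattern outside Tableau, the same regex behaves identically in Python's re module - a quick sketch using the sample strings from the question:

import re

# Same pattern as the Tableau REGEXP_REPLACE above.
pattern = r" \d+[A-Z]Byte .*$"

samples = [
    "Potato 1TByte 4.5 NFA",
    "Board 256GByte 553 NCA",
    "Launch 4 512GByte 4.5 NFA",
    "Launch 4S 512GByte 4.5 NCA",
]

for s in samples:
    print(re.sub(pattern, "", s))
# Potato
# Board
# Launch 4
# Launch 4S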
You could try the following - this seems to work (see the screenshot of the Tableau output). Below are the formulas for the various derived columns you see in the screenshot (your source column is called [Name]):
Step1 = LEFT([Name],FIND([Name],"Byte")-1)
Step2 = LEN([Step1])-LEN(REPLACE([Step1]," ",""))
Step3 = FINDNTH([Step1]," ",[Step2])
Step4 = LEFT([Step1],[Step3]-1)
And of course you can nest all of these into a single calculated field - I kept them as separate columns for easier understanding (see the sketch below).
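For reference, a sketch of what that single nested calculated field could look like - this just substitutes Step1-Step3 inline and is not verified in Tableau:

LEFT(
  LEFT([Name], FIND([Name], "Byte") - 1),
  FINDNTH(
    LEFT([Name], FIND([Name], "Byte") - 1),
    " ",
    LEN(LEFT([Name], FIND([Name], "Byte") - 1))
      - LEN(REPLACE(LEFT([Name], FIND([Name], "Byte") - 1), " ", ""))
  ) - 1
)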

Where condition in geode

I am new to Geode.
I am adding an entry like below:
gfsh>put --key=('id':'11') --value=('firstname':'Amaresh','lastname':'Dhal') --region=region
Result : true
Key Class : java.lang.String
Key : ('id':'11')
Value Class : java.lang.String
Old Value : <NULL>
When I query like this:
gfsh>query --query="select * from /region"
Result : true
startCount : 0
endCount : 20
Rows : 9
Result
-----------------------------------------
('firstname':'A2','lastname':'D2')
HI
Amaresh
Amaresh
('firstname':'A1','lastname':'D1')
World
World
('firstname':'Amaresh','lastname':'Dhal')
Hello
NEXT_STEP_NAME : END
When I try to query like below, I am not getting the value:
gfsh>query --query="select * from /region r where r.id='11'"
Result : true
startCount : 0
endCount : 20
Rows : 0
NEXT_STEP_NAME : END
Of course I can use the get command, but I want to use a where condition. Where am I going wrong? It gives no output.
Thanks
In Geode the key is not "just another column". In fact, the basic query syntax implicitly queries only the fields of the value. However, you can include the key in your query using this syntax:
select value.lastname, value.firstname from /region.entries where key.id=11
Also, it is fairly common practice to include the id field in your value class even though it is not strictly required.
What Randy said is exactly right, the "key" is not another column. The exact format of the query should be:
gfsh>query --query="select * from /Address.entries where key=2"
What you are looking for here is getting all the "entries" on the region "Address" and then querying the key.
To check which entries you want to query, you can run this query:
gfsh>query --query="select * from /Address.entries"
You can always use the get command to fetch the data pertaining to a specific key.
get --key=<KEY_NAME> --region=<REGION_NAME>
Example:
get --key=1 --region=Address
Reference: https://gemfire.docs.pivotal.io/910/geode/tools_modules/gfsh/command-pages/get.html
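For the region in the question, where both keys and values were put as plain strings, a query over the entries along these lines (a sketch, not run against that exact data) would at least show the keys alongside the values:
gfsh>query --query="select e.key, e.value from /region.entries e"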

AWQL - how can i use a regular expressions or something similar?

I am querying the AdWords API via the following AWQL query (which works fine):
SELECT AccountDescriptiveName, CampaignId, CampaignName, AdGroupId, AdGroupName, KeywordText, KeywordMatchType, MaxCpc, Impressions, Clicks, Cost, Conversions, ConversionsManyPerClick, ConversionValue
FROM KEYWORDS_PERFORMANCE_REPORT
WHERE CampaignStatus IN ['ACTIVE', 'PAUSED']
AND AdGroupStatus IN ['ENABLED', 'PAUSED']
AND Status IN ['ACTIVE', 'PAUSED']
AND AdNetworkType1 IN ['SEARCH'] AND Impressions > 0
DURING 20140501,20140531
Now I want to exclude some campaigns:
We have a convention for our new campaigns that the campaign name begins with three numbers followed by an underscore, e.g. "100_brand_all".
So I want to get only these new campaigns.
I tried lots of different variations of STARTS_WITH, but only exact strings work - I need a pattern to match!
I already read https://developers.google.com/adwords/api/docs/guides/awql?hl=en and, following its content, it should be possible to use a WHERE expression like this:
CampaignName STARTS_WITH ['0','1','2','3']
But that doesn't work!
Any other ideas how I can achieve this?
Well, why don't you run a campaign performance report first, then process that (get the campaign IDs you want or don't want), and then use those in "CampaignId IN [campaign ids here]" or "CampaignId NOT_IN [campaign ids]"?
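A rough sketch of that two-step approach in Python - the report download itself is left out since it depends on your client library, and the sample rows are made up; the point is filtering campaign names client-side and splicing the surviving IDs into the IN clause:

import re

# Hypothetical rows pulled from a CAMPAIGN_PERFORMANCE_REPORT (CampaignId, CampaignName).
rows = [
    {"CampaignId": 111, "CampaignName": "100_brand_all"},
    {"CampaignId": 222, "CampaignName": "legacy_campaign"},
    {"CampaignId": 333, "CampaignName": "205_nonbrand"},
]

# Keep only campaigns whose name starts with three digits and an underscore.
new_ids = [r["CampaignId"] for r in rows if re.match(r"\d{3}_", r["CampaignName"])]

# Splice the surviving IDs into the keyword report's WHERE clause.
where_clause = "CampaignId IN [{}]".format(",".join(str(i) for i in new_ids))
print(where_clause)  # -> CampaignId IN [111,333]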