Group By after parsing a message in AWS cloudwatch insights - amazon-web-services

I have log messages like the one below (the log also contains many other JSON formats that are unrelated to this):
request body to the server {'sender': '65ddd20eac244AAe619383e4d8cb558834', 'message': 'hello'}
I would like to group these messages by sender (an alphanumeric value), which is enclosed in the JSON.

CloudWatch Logs Insights query:
fields @message |
filter @message like 'request body to the server' |
parse @message "'sender': '*', 'message'" as sender |
stats count(*) by sender
Query results:
-------------------------------------------------
| sender | count(*) |
|------------------------------------|----------|
| 65ddd20eac244AAe619383e4d8cb558834 | 4 |
| 55ddd20eac244AAe619383e4d8cb558834 | 3 |
-------------------------------------------------
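To make the parse and stats steps concrete, here is a minimal Python sketch that mimics them on a few in-memory log lines. It is only an illustration, not how CloudWatch implements the query: the helper names glob_to_regex and count_by_sender and the sample data are hypothetical, and each * in the glob is assumed to behave like a non-greedy capture.

```python
import re
from collections import Counter


def glob_to_regex(pattern):
    # Treat every * as a non-greedy capture group and the rest as literal text.
    parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile("(.*?)".join(parts))


def count_by_sender(messages):
    # Rough equivalent of: filter ... | parse ... as sender | stats count(*) by sender
    extractor = glob_to_regex("'sender': '*', 'message'")
    counts = Counter()
    for msg in messages:
        if "request body to the server" not in msg:
            continue  # corresponds to the filter step
        match = extractor.search(msg)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample data modelled on the log line in the question.
sample_messages = [
    "request body to the server {'sender': '65ddd20eac244AAe619383e4d8cb558834', 'message': 'hello'}",
    "request body to the server {'sender': '65ddd20eac244AAe619383e4d8cb558834', 'message': 'bye'}",
    "request body to the server {'sender': '55ddd20eac244AAe619383e4d8cb558834', 'message': 'hi'}",
    '{"level": "info", "detail": "unrelated JSON"}',
]
sender_counts = count_by_sender(sample_messages)
```

The unrelated JSON line is dropped by the filter step, so it never reaches the extraction.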

You can use filter.
fields @timestamp, @message
| filter @message like "65ddd20eac244AAe619383e4d8cb558834"
| sort @timestamp desc
| limit 20
This returns the 20 most recent messages that contain 65ddd20eac244AAe619383e4d8cb558834, i.e. those sent by that sender.
Update:
Suppose the JSON log format is this:
{
"sender": "65ddd20eac244AAe619383e4d8cb558835",
"message": "Hi"
}
Now I want to count the number of messages from 65ddd20eac244AAe619383e4d8cb558835.
How many messages are coming from each user?
You can run this query:
# Keep only messages that contain sender, to skip the default Lambda logs
filter @message like "sender" |
stats count(sender) by sender
If you want to see the messages as well, modify the query a bit:
filter @message like "sender" |
stats count(*) by sender, message
Here @message refers to the whole log event, whereas message refers to the message field of the JSON object.
count_distinct
Returns the number of unique values for the field. If the field has
very high cardinality (contains many unique values), the value
returned by count_distinct is just an approximation.
How many distinct users are there in the selected interval?
This lists the number of distinct users per 3-hour interval:
stats count_distinct(sender) as distinct_sender by bin(3hr) as interval
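As a rough illustration of what count_distinct combined with bin computes, the Python sketch below groups events into fixed 3-hour buckets and counts the unique senders in each. The function name, the (epoch seconds, sender) event shape, and the sample data are assumptions made for this example. Unlike count_distinct on high-cardinality fields, this version is exact.

```python
from collections import defaultdict

BIN_SECONDS = 3 * 60 * 60  # 3-hour buckets, as in bin(3hr)


def distinct_senders_per_bin(events, bin_seconds=BIN_SECONDS):
    # events: iterable of (epoch_seconds, sender) pairs (hypothetical shape).
    senders_by_bin = defaultdict(set)
    for timestamp, sender in events:
        bin_start = timestamp - (timestamp % bin_seconds)  # round down to bucket start
        senders_by_bin[bin_start].add(sender)
    return {start: len(senders) for start, senders in sorted(senders_by_bin.items())}


# Hypothetical events: two senders in the first bucket, one in the second.
events = [
    (0, "a"), (3600, "b"), (7200, "a"),
    (10800, "a"), (12000, "a"),
]
distinct_counts = distinct_senders_per_bin(events)
```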

Related

AWS Log Insights query with string contains and return string value as alias

I have below query to get data from cloudwatch log :
fields @timestamp, @user, @fileName, @fileType, strcontains(@message,'downloaded') or strcontains(@message,'unauthorized') as status
| parse @message /(?<@user>(?<=User\s).*(?=\shas))/
| parse @message /(?<@fileName>(?<=file\s).+(?=,))/
| parse @message /(?<@fileType>(?<=type\s).+(?="))/
I'm having trouble producing the status column value.
If strcontains(@message,'downloaded') is true, I want the status column to show 'Downloaded'; if strcontains(@message,'unauthorized') is true, I want it to show 'Unauthorized'.
Can someone suggest how to improve the query to get the desired results?
Any help is appreciated.

Parse message in Log Insight

I want to parse this message :
[2021-08-30T14:01:01.443908+00:00] technical.INFO: Webhook
"239dfb55-c8f3-4ae2-8974-22dadb7417ba" (wallet.create) has been
handle.
To have :
UUID (here : 239dfb55-c8f3-4ae2-8974-22dadb7417ba)
The words in brackets (here: wallet.create)
I can get the UUID but not the terms in brackets.
I think my regex is correct but, it doesn't work on Log Insight :(
My query :
fields @message
| filter @message like /technical.INFO: Webhook "/
| parse @message /(?<webhookId>\b[0-9a-f]{8}\b-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-\b[0-9a-f]{12}\b)/
| parse @message /(?<@endpt_get>\(([^)]+)\)/
| sort @timestamp desc
| limit 5
My regex for word in brackets :
https://regex101.com/r/ewSm6O/1
If I comment out this line of my query:
parse @message /(?<@endpt_get>\(([^)]+)\)/
I get the correct result.
With that line in place, the query returns nothing.
Could you please help me?
If your log messages will all have this same format, you can use a glob pattern instead of a regex (for something complex like this, that may be easier):
fields @message, @timestamp
| parse @message "technical.INFO: Webhook \"*\" (*) has been handle" as uuid, term_to_catch
| sort @timestamp desc
| display @timestamp, uuid, term_to_catch
If some sections of the message (like technical.INFO) can change, you can replace them with * and capture them in a dummy variable that you then ignore:
| parse @message "*: Webhook \"*\" (*) has been handle" as type, uuid, term_to_catch
| display @timestamp, uuid, term_to_catch
Alternatively, if you insist on your regex, the reason is most likely that you are not storing the parsed results in their own variables, so they overwrite each other:
| parse @message /your*regex/ as uuid
| parse @message /your*second.regex/ as term_to_catch
That may get you what you need as well.
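For checking the extraction logic offline, the following Python sketch pulls both values out of the sample message with a single regex. It is an assumption-laden illustration, not a Logs Insights query: Python spells named groups as (?P<name>...) rather than (?<name>...), and the variable names are borrowed from the glob example above. Separately, the second pattern as written in the question appears to contain one more opening parenthesis than closing ones, which may also be worth checking.

```python
import re

# The sample log line from the question, joined onto one line.
message = (
    '[2021-08-30T14:01:01.443908+00:00] technical.INFO: Webhook '
    '"239dfb55-c8f3-4ae2-8974-22dadb7417ba" (wallet.create) has been handle.'
)

# One named group for the UUID in quotes, one for the text inside the brackets.
webhook_pattern = re.compile(
    r'Webhook "(?P<uuid>[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})" '
    r'\((?P<term_to_catch>[^)]+)\)'
)

match = webhook_pattern.search(message)
uuid = match.group("uuid") if match else None
term_to_catch = match.group("term_to_catch") if match else None
```

Every opening parenthesis in this pattern has a matching close, with the literal brackets escaped as \( and \).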

CloudWatch Insights queries for error logs in Lambda

I want to add CloudWatch custom dashboard for the Lambda's error logs. I want the metric with only logs which are reflecting ERRORs in Lambda function. I tried with following query in log insights but it is not working:
fields @timestamp, @message
| sort @timestamp desc
| filter @message like ERROR
| limit 20
Also I tried to create filter but it is showing me There are no metrics in this namespace for the region "Europe (London)"
I managed to solve this issue with:
fields @message
| parse @message "[*] *" as loggingType, loggingMessage
| fields @message | filter @message like /Error/
| display loggingMessage
| limit 500

How to split Cloudwatch field by its value in insights query

I'm trying to create an AWS dashboard visualization that displays the counts of cache hits vs. misses over a period of time. To do this, I'm setting up a log type dashboard with an insights query on the log. To be as simple as possible, my log is either:
{"cache.hit", true} or {"cache.hit", false}.
I would like for my dashboard to track both possibilities on the same graph, but it seems like I can't without breaking my log up into distinct rows for these values. For example, if my logs were simply:
{"cache.hit.true", true} or {"cache.hit.false", true}, then I could create 2 separate graphs to track these values independently in the dashboard, but that's not as clean.
To get them on one dash, I've tried this, but all it does is display the two fields, and the values for both display fields are the same, when they definitely shouldn't be:
fields @timestamp, @message, cache.hit as cache_hits
| filter cache_hits IN [0, 1]
| display cache_hits = 0 as in_cache_false
| display cache_hits = 1 as in_cache_true
| stat count (in_cache_true), count(in_cache_false) by bin(30s)
| sort @timestamp desc
| limit 20
This query below extracts out the cache hits and cache misses and then works out the cache hit percentage.
fields @timestamp, @message
| filter @message like /cache.hit/
| fields strcontains(@message, "true") as @CacheHit,
strcontains(@message, "false") as @CacheMiss
| stats sum(@CacheHit) as CacheHits, sum(@CacheMiss) as CacheMisses, sum(@CacheHit) / (sum(@CacheMiss) + sum(@CacheHit)) * 100 as HitPercentage by bin(30s)
| sort @timestamp desc
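The arithmetic behind that query can be sketched in Python as follows. This is only an illustration under simplifying assumptions: the function name and sample lines are hypothetical, the 30-second binning is left out, and a plain substring test stands in for the /cache.hit/ regex (where the dot would match any character).

```python
def cache_hit_stats(messages):
    # Keep only cache log lines, then flag hits and misses by substring,
    # mirroring strcontains(@message, "true") / strcontains(@message, "false").
    relevant = [m for m in messages if "cache.hit" in m]
    hits = sum(1 for m in relevant if "true" in m)
    misses = sum(1 for m in relevant if "false" in m)
    total = hits + misses
    hit_percentage = hits / total * 100 if total else None  # avoid dividing by zero
    return hits, misses, hit_percentage


# Hypothetical sample lines in the shape described in the question.
sample_logs = [
    '{"cache.hit", true}',
    '{"cache.hit", true}',
    '{"cache.hit", false}',
    '{"cache.hit", true}',
    'some unrelated line',
]
hits, misses, hit_percentage = cache_hit_stats(sample_logs)
```

Because each message contributes a 1 or a 0 to each flag, summing the flags gives the counts that the query plots on one graph.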

How to get additional lines of context in a CloudWatch Insights query?

I typically run a query like
fields @timestamp, @message
| filter @message like /ERROR/
| sort @timestamp desc
| limit 20
Is there any way to get additional lines of context around the messages containing "ERROR"? Similar to the A, B, and C flags with grep?
Example
For example, if I have a given log with the following lines
DEBUG Line 1
DEBUG Line 2
ERROR message
DEBUG Line 3
DEBUG Line 4
Currently I get the following result
ERROR message
But I would like to get more context lines like
DEBUG Line 2
ERROR message
DEBUG Line 3
with the option to get more lines of context if I want.
You can also include @logStream in the query; in the results it is shown as a link to the exact position of the match in the respective log stream:
fields @timestamp, @message, @logStream
| filter @message like /ERROR/
| sort @timestamp desc
| limit 20
That adds a @logStream column as the right-most column of the results.
Clicking the link takes you to the matching log line and highlights it. I like to open this in a new tab and look around the highlighted line for context.
I found that the most useful solution is to run your query for errors, take the request ID from the "requestId" field, and open a second browser tab. In the second tab, search on that request ID.
Example:
fields @timestamp, @message
| filter @requestId like /fcd09029-0e22-4f57-826e-a64ccb385330/
| sort @timestamp asc
| limit 500
With the above query you get all the log messages, in the correct order, for the request where the error occurred. This example works out of the box with Lambda. If you push logs to CloudWatch in a different way and there is no requestId, I would suggest creating a requestId per request, or another identifier that is more useful for your use case, and pushing that with each log event.
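If the ordered log lines have already been retrieved (for example with a query like the one above), grep-style context can be added locally. The Python sketch below is one possible way to do that and is not a Logs Insights feature: the function name and its before/after parameters are hypothetical, loosely modelled on grep's -B and -A flags.

```python
def with_context(lines, needle="ERROR", before=1, after=1):
    # Collect the indices of matching lines plus their surrounding context.
    keep = set()
    for index, line in enumerate(lines):
        if needle in line:
            start = max(0, index - before)
            end = min(len(lines), index + after + 1)
            keep.update(range(start, end))
    # Return the kept lines in their original order, without duplicates.
    return [lines[i] for i in sorted(keep)]


# The example log from the question.
log_lines = [
    "DEBUG Line 1",
    "DEBUG Line 2",
    "ERROR message",
    "DEBUG Line 3",
    "DEBUG Line 4",
]
context = with_context(log_lines, before=1, after=1)
```

Raising before and after widens the window, which corresponds to the "option to get more lines of context" asked for in the question.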