Parse message in Log Insight - amazon-web-services

I want to parse this message:
[2021-08-30T14:01:01.443908+00:00] technical.INFO: Webhook
"239dfb55-c8f3-4ae2-8974-22dadb7417ba" (wallet.create) has been
handle.
I want to extract:
The UUID (here: 239dfb55-c8f3-4ae2-8974-22dadb7417ba)
The term in parentheses (here: wallet.create)
I can get the UUID but not the term in parentheses.
I think my regex is correct, but it doesn't work in Logs Insights.
My query:
fields @message
| filter @message like /technical.INFO: Webhook "/
| parse @message /(?<webhookId>\b[0-9a-f]{8}\b-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-\b[0-9a-f]{12}\b)/
| parse @message /(?<@endpt_get>\(([^)]+)\)/
| sort @timestamp desc
| limit 5
My regex for the term in parentheses:
https://regex101.com/r/ewSm6O/1
If I comment out this line of my query:
parse @message /(?<@endpt_get>\(([^)]+)\)/
I get the expected result.
With that line left in, the query returns nothing.
Could you please help me?

If your log messages all share this format, you can use a glob pattern instead of a regex, which may be easier for something this complex:
fields @message, @timestamp
| parse @message "technical.INFO: Webhook \"*\" (*) has been handle" as uuid, term_to_catch
| sort @timestamp desc
| display @timestamp, uuid, term_to_catch
If some parts of the message (such as technical.INFO) can change, replace them with * and capture them into a dummy variable that you then ignore:
| parse @message "*: Webhook \"*\" (*) has been handle" as type, uuid, term_to_catch
| display @timestamp, uuid, term_to_catch
Alternatively, if you insist on your regex, the most likely reason it fails is that you are not storing the parsed results in their own variables, so they overwrite each other. Something like this
| parse @message /your*regex/ as uuid
| parse @message /your*second.regex/ as term_to_catch
may get what you need as well.
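To sanity-check the two captures outside Logs Insights, here is a minimal Python sketch. It assumes Python's re module, whose named-group syntax ((?P<name>...)) differs from the (?<name>...) form used in the query, so a pattern that passes here is not guaranteed to be accepted by Logs Insights. The names parse_webhook, webhook_id and event are illustrative, not taken from the question.

```python
import re

# The sample message from the question, joined into a single line.
MESSAGE = (
    '[2021-08-30T14:01:01.443908+00:00] technical.INFO: Webhook '
    '"239dfb55-c8f3-4ae2-8974-22dadb7417ba" (wallet.create) has been handle.'
)

# One pattern with two named groups, so neither capture depends on the other.
PATTERN = re.compile(
    r'Webhook "(?P<webhook_id>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})" '
    r'\((?P<event>[^)]+)\)'
)


def parse_webhook(message):
    """Return (webhook_id, event), or None if the message does not match."""
    match = PATTERN.search(message)
    if match is None:
        return None
    return match.group("webhook_id"), match.group("event")


print(parse_webhook(MESSAGE))
```

Naming each capture explicitly mirrors the advice above: each extracted value gets its own field.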

Related

AWS Cloudwatch Log Insights - replace string function

How do I use AWS Cloudwatch Log Insights' replace function?
The docs do not give working examples.
Given logs which contain paths such as /api/lumberjack/123/axe/456/fashion
I am trying:
fields message
| parse message "path=* " as path
| fields replace(path, /[0123456789]+/, 'ID') as uniqpath
| stats count(*) by uniqpath
I expect results like:
uniqpath | count
/api/lumberjack/ID/axe/ID/fashion | 12
/api/lumberjack/ID/beardedness | 44
But instead it complains "Invalid arguments, received: (path) but expected: (str: string,searchValue: string,replaceValue: string)"
The replace function accepts fields as input for the first argument.
What is not supported is the second argument. You are passing a regex which is not recognized as a string.
I have not found a way to convert the regex to a string, but at least you can pass the field name path as the first parameter. I tested this by replacing the regex with a plain string.
Query:
fields @message
| parse @message "path=*" as path
| fields replace(path, 'lumberjack', 'ID') as uniqpath
| stats count(*) by uniqpath
The error message is pretty self-explanatory. The replace function expects an input of type string for the first argument. You provided a fieldname path which is not acceptable.
EDIT: To my surprise, the replace function accepts path as the first argument, which is not mentioned in the doc. See Omar's answer above.
I haven't tested it, but I once built something similar, so yours should look something like the following:
fields @message
| parse @message "path=* " as path
| parse path /(?<@endpt>(\/[0123456789]+\/?))/
| fields replace(path, coalesce(@endpt, ''), '/{id}/') as uniqpath
| stats count(*) by uniqpath
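If the substitution cannot be done inside the query, one fallback is to normalize the paths client-side after exporting the parsed path values. A minimal Python sketch, assuming the paths are already available as a list of strings; the function names and the sample values are made up for illustration:

```python
import re
from collections import Counter


def normalize_path(path):
    """Replace every run of digits in the path with the literal 'ID'."""
    return re.sub(r"[0-9]+", "ID", path)


def count_unique_paths(paths):
    """Count how often each normalized path occurs."""
    return Counter(normalize_path(p) for p in paths)


# Hypothetical sample data standing in for exported query results.
paths = [
    "/api/lumberjack/123/axe/456/fashion",
    "/api/lumberjack/7/axe/89/fashion",
    "/api/lumberjack/42/beardedness",
]

for uniqpath, count in count_unique_paths(paths).most_common():
    print(uniqpath, count)
```

This replaces all numeric segments in one pass, which is what the regex-based replace in the question was aiming for.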

What is the best way to search log in AWS cloudwatch

I'm running AWS Lambda, and I need to find some information in the CloudWatch logs.
What I am doing seems too inefficient, but I don't know how else to do it.
I would like to know a more efficient way.
What I am doing is:
I have some ids
1111-1111-111
2222-2222-222
3333-3333-333
...
Search for specific messages by id in the AWS Logs Insights console
fields @timestamp, @logStream, @message
| filter @message like /myId/
| sort @timestamp desc
| parse @message '"myId" : "*"' as my_id
| filter my_id like /1111-1111-111/
Download the result as a CSV file.
Parse @logStream with Python
import csv

with open('1111-1111-111.csv') as csvfile:
    reader = csv.DictReader(csvfile, delimiter=',')
    # Print each row's log stream name, prefixed with its row index
    for i, row in enumerate(reader):
        print(str(i) + ": " + row['@logStream'])
Get the log streams and search again in the Logs Insights console
2021/06/05/[$LATEST]1111111111111
2021/06/05/[$LATEST]1111111111111
2021/06/05/[$LATEST]1111111111111
2021/06/05/[$LATEST]2222222222222
2021/06/05/[$LATEST]3333333333333
2021/06/05/[$LATEST]3333333333333
2021/06/05/[$LATEST]3333333333333
2021/06/05/[$LATEST]3333333333333
2021/06/05/[$LATEST]4444444444444
...
Search again with logStreams and get what I really want.
fields @timestamp, @logStream, @message
| filter @logStream='2021/06/05/[$LATEST]1111111111111'
| filter @message like /file_name/
| parse @message "'file_name': '*'" as file_name
After getting file_name, I have to search again inside the file for myId, because the same log stream can show up for more than one id, so I can't be sure of the match.
Doing this manually is too hard. Doing it with boto3 is also hard for me, because I'm not familiar with how the boto3 logs client waits for a query and returns its results. I also think there must be a better way.
Could you suggest a better workflow?
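On the boto3 part specifically: a query is started with start_query and then polled with get_query_results until its status is Complete. The sketch below is one possible way to wrap that, not a definitive implementation. The helper names run_insights_query and build_id_query, the polling defaults, and the log group name in the usage note are assumptions for illustration. The client is passed in as a parameter and is expected to behave like boto3.client("logs").

```python
import time


def run_insights_query(logs_client, log_group, query, start_time, end_time,
                       poll_seconds=1.0, max_polls=60):
    """Start a Logs Insights query and poll until it finishes.

    logs_client is expected to behave like boto3.client("logs").
    start_time and end_time are epoch seconds.
    Returns one dict per result row, mapping field name to value.
    """
    query_id = logs_client.start_query(
        logGroupName=log_group,
        startTime=start_time,
        endTime=end_time,
        queryString=query,
    )["queryId"]

    for _ in range(max_polls):
        response = logs_client.get_query_results(queryId=query_id)
        status = response["status"]
        if status == "Complete":
            # Each row is a list of {"field": ..., "value": ...} cells.
            return [
                {cell["field"]: cell["value"] for cell in row}
                for row in response["results"]
            ]
        if status in ("Failed", "Cancelled", "Timeout"):
            raise RuntimeError(f"query {query_id} ended with status {status}")
        time.sleep(poll_seconds)
    raise TimeoutError(f"query {query_id} did not complete in time")


def build_id_query(my_id):
    """Build the first query from the question for one id (no escaping is done)."""
    return (
        "fields @timestamp, @logStream, @message\n"
        "| filter @message like /myId/\n"
        "| parse @message '\"myId\" : \"*\"' as my_id\n"
        f"| filter my_id like /{my_id}/\n"
        "| sort @timestamp desc"
    )
```

With AWS credentials configured, a call could look like run_insights_query(boto3.client("logs"), "/aws/lambda/my-function", build_id_query("1111-1111-111"), start, end), where the log group name is a placeholder. Looping over the ids and feeding each row's @logStream into the second query would replace the CSV download step.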

AWS Logs Insights parse regexp always empty

I have a log string:
F, [2021-02-24T09:06:30.428708 #9] FATAL -- : [3c25b3e6-fa19-48c8-93c7-5661dc2ec338]
ActionController::RoutingError (No route matches [GET] "/api/jsonws/invoke"):
I want to extract the path /api/jsonws/invoke as a parsed field with this request:
fields @timestamp, @message
| limit 300
| parse @message /.*No route matches [[A-Z]{3,7}] "(?<path>.*)".*/
I expect to see /api/jsonws/invoke in the output in column path, but instead my path column in the output is always empty.
I've tested the regular expression with an online tool and it seems to work as I expect. I'm also sure that there are matching logs in the output.
Is there any mistake in my Logs Insights query?
The regex didn't work out, so I ended up doing this:
fields @timestamp, @message
| parse @message "(No route matches [*] \"*\"):" as method, path
| filter ispresent(path)
| stats count(*) as count by path, method
| sort count desc
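For comparison, here is a minimal Python sketch of a regex that extracts the same two fields with the literal square brackets escaped. It uses Python's re module, so it only illustrates the pattern. Whether Logs Insights accepts the equivalent (?<name>...) form has not been verified here, and the names ROUTE_ERROR and parse_route_error are made up for the example.

```python
import re

# The log string from the question, joined into a single line.
LOG_LINE = (
    'F, [2021-02-24T09:06:30.428708 #9] FATAL -- : '
    '[3c25b3e6-fa19-48c8-93c7-5661dc2ec338] '
    'ActionController::RoutingError (No route matches [GET] "/api/jsonws/invoke"):'
)

# \[ and \] match literal brackets; [^"]* stops at the closing quote.
ROUTE_ERROR = re.compile(
    r'No route matches \[(?P<method>[A-Z]{3,7})\] "(?P<path>[^"]*)"'
)


def parse_route_error(line):
    """Return (method, path) for a routing error line, else None."""
    match = ROUTE_ERROR.search(line)
    if match is None:
        return None
    return match.group("method"), match.group("path")


print(parse_route_error(LOG_LINE))
```

Using [^"]* instead of .* keeps the path capture from running past the closing quote.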

Group By after parsing a message in AWS cloudwatch insights

I have messages like the one below (there are also many other JSON-formatted messages that are unrelated to this one):
request body to the server {'sender': '65ddd20eac244AAe619383e4d8cb558834', 'message': 'hello'}
I would like to group these messages by sender (an alphanumeric value), which is enclosed in the JSON.
CloudWatch Logs Insights query:
fields @message |
filter @message like 'request body to the server' |
parse @message "'sender': '*', 'message'" as sender |
stats count(*) by sender
Query results:
-------------------------------------------------
| sender | count(*) |
|------------------------------------|----------|
| 65ddd20eac244AAe619383e4d8cb558834 | 4 |
| 55ddd20eac244AAe619383e4d8cb558834 | 3 |
-------------------------------------------------
You can use filter.
fields @timestamp, @message
| filter @message like "65ddd20eac244AAe619383e4d8cb558834"
| sort @timestamp desc
| limit 20
This returns up to 20 messages sent by 65ddd20eac244AAe619383e4d8cb558834.
Update:
Suppose the JSON log format is this:
{
"sender": "65ddd20eac244AAe619383e4d8cb558835",
"message": "Hi"
}
Now I want to count the number of messages from 65ddd20eac244AAe619383e4d8cb558835.
How many messages come from each user?
You can simply run this query:
# Keep only messages that contain "sender", to exclude Lambda's default logs
filter @message like "sender"
| stats count(sender) by sender
If you want to see the messages as well, modify the query slightly:
filter @message like "sender"
| stats count(*) by sender, message
Here @message refers to the whole log event, whereas message refers to the message field of the JSON object.
count_distinct
Returns the number of unique values for the field. If the field has
very high cardinality (contains many unique values), the value
returned by count_distinct is just an approximation.
How many distinct users are there in the selected interval?
This query returns the number of distinct users per 3-hour interval:
stats count_distinct(sender) as distinct_sender by bin(3hr) as interval
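To check the grouping logic on exported messages, the same parse-then-count step can be reproduced locally. This is a minimal Python sketch, assuming the messages are available as plain strings in the single-quoted format shown in the question. The name count_by_sender and the sample data are hypothetical.

```python
import re
from collections import Counter

# Matches the single-quoted 'sender' value from the question's message format.
SENDER = re.compile(r"'sender': '([^']*)'")


def count_by_sender(messages):
    """Count matching messages per sender, skipping unrelated log lines."""
    counts = Counter()
    for message in messages:
        if "request body to the server" not in message:
            continue
        match = SENDER.search(message)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample data; only the first two lines should be counted.
messages = [
    "request body to the server {'sender': 'abc123', 'message': 'hello'}",
    "request body to the server {'sender': 'abc123', 'message': 'bye'}",
    "START RequestId: 0000 Version: $LATEST",
]
print(count_by_sender(messages))
```

The substring check plays the role of the filter command, and the Counter plays the role of stats count(*) by sender.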

How to get additional lines of context in a CloudWatch Insights query?

I typically run a query like
fields @timestamp, @message
| filter @message like /ERROR/
| sort @timestamp desc
| limit 20
Is there any way to get additional lines of context around the messages containing "ERROR"? Similar to the A, B, and C flags with grep?
Example
For example, if I have a given log with the following lines
DEBUG Line 1
DEBUG Line 2
ERROR message
DEBUG Line 3
DEBUG Line 4
Currently I get the following result
ERROR message
But I would like to get more context lines like
DEBUG Line 2
ERROR message
DEBUG Line 3
with the option to get more lines of context if I want.
You can also select @logStream, which shows up in the results as a link to the exact spot of the match in the respective log stream:
fields @timestamp, @message, @logStream
| filter @message like /ERROR/
| sort @timestamp desc
| limit 20
That adds a right-most column containing the log stream link.
Clicking the link takes you to the matching log line and highlights it. I like to open it in a new tab and look around the highlighted line for context.
I found that the most useful solution is to run your query for errors, take the request id from the @requestId field, and open a second browser tab. In the second tab, search for that request id.
Example:
fields @timestamp, @message
| filter @requestId like /fcd09029-0e22-4f57-826e-a64ccb385330/
| sort @timestamp asc
| limit 500
With the above query you get all the log messages, in the correct order, for the request where the error occurred. This example works out of the box with Lambda. If you push logs to CloudWatch in a different way and there is no requestId, I would suggest creating a requestId per request, or another identifier that is more useful for your use case, and pushing it with each log event.
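If the log lines have been downloaded, grep-style context can also be computed client-side. A minimal Python sketch, assuming the events are already in chronological order in a list of strings. The function name with_context and its before/after parameters are made up for illustration, loosely mirroring grep's -B and -A flags.

```python
def with_context(lines, needle, before=1, after=1):
    """Return lines containing needle plus surrounding context lines.

    Order is preserved and overlapping context windows are merged,
    so no line is returned twice.
    """
    keep = set()
    for index, line in enumerate(lines):
        if needle in line:
            start = max(0, index - before)
            stop = min(len(lines), index + after + 1)
            keep.update(range(start, stop))
    return [lines[i] for i in sorted(keep)]


# The example log from the question.
lines = [
    "DEBUG Line 1",
    "DEBUG Line 2",
    "ERROR message",
    "DEBUG Line 3",
    "DEBUG Line 4",
]
print(with_context(lines, "ERROR"))
# prints ['DEBUG Line 2', 'ERROR message', 'DEBUG Line 3']
```

Raising before and after widens the window, for example with_context(lines, "ERROR", before=2, after=2) returns all five lines.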