CloudWatch Logs prepending timestamp to each line - amazon-web-services

We have the CloudWatch Logs agent set up, and the streamed logs have a timestamp prepended to the beginning of each line, which we can see after export:
2017-05-23T04:36:02.473Z "message"
Is there any configuration in the CloudWatch Logs agent setup that keeps this timestamp from being prepended to each log entry?
Alternatively, is there a way to export only the messages of the CloudWatch log events? We don't want the timestamp in our exported logs.
Thanks

Assume that you are able to retrieve those logs using your Lambda function (Python 3.x). Then you can use a regular expression to identify the timestamp and write a function to strip it from the event log:
^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z\t
The above will match a timestamp such as 2019-10-10T22:11:00.123Z followed by a tab.
Here is a simple Python function:
import re

def strip(eventLog):
    # Remove a leading "2019-10-10T22:11:00.123Z<TAB>" style timestamp
    timestamp = r'^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z\t'
    result = re.sub(timestamp, "", eventLog)
    return result
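For example, assuming the exported log lines end up in a plain-text file with one event per line (the file name below is hypothetical), the function can be applied like this:

# Apply the function to a hypothetical exported log file, one event per line
with open("exported-logs.txt") as export_file:
    for line in export_file:
        print(strip(line), end="")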

I don't think it's possible. I needed the exact same behavior you are asking for, and it looks like it's not possible unless you implement a man-in-the-middle processor that removes the timestamp from every log message, as suggested in the other answer.
Checking the CloudWatch Logs client API in the first place, you are required to send a timestamp with every log event you put into CloudWatch Logs (API reference).
The export-logs-to-S3 task API also has no parameter to control this behavior (API reference).
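To illustrate the first point, here is a minimal boto3 sketch (the log group and stream names are hypothetical): PutLogEvents takes a list of events, and each event must carry a timestamp in epoch milliseconds, so the timestamp can't simply be left out at ingestion time.

import time
import boto3

logs = boto3.client("logs")
logs.put_log_events(
    logGroupName="/my/app",        # hypothetical log group
    logStreamName="my-stream",     # hypothetical log stream
    logEvents=[
        # "timestamp" (epoch milliseconds) is required on every event
        {"timestamp": int(time.time() * 1000), "message": "hello world"},
    ],
)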

Related

How to get CloudWatch Logs for the last 30 minutes at the command prompt?

How can I get the last 30 minutes of AWS CloudWatch logs that were inserted into a specific log stream using the AWS CLI?
Can you describe what you already tried yourself and what you ran into? Looking at the AWS CLI command reference, it seems you should be able to run "aws logs get-log-events --log-group-name <name of the group> --log-stream-name <name of the stream> --start-time <timestamp>" to get a list of events starting at the given UNIX timestamp; calculating that timestamp should be fairly trivial.
Addition, based on your comment: you'll need to look into the AWS concept of pagination. Most/many AWS API calls (which the CLI also makes for you) retrieve a size/length-limited set of data and return a token if there is more data present. You can then make a subsequent call passing that token, which tells the service to return data starting at that token. Repeat this process until you no longer get new data back, at which point you know you have iterated over the full dataset.
For this specific CLI command, there is a flag:
--next-token (string)
The token for the next set of items to return. (You received this token from a previous call.)
Hope this helps?
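If you'd rather script it, here is a rough boto3 equivalent (the log group and stream names are hypothetical) that computes the epoch-millisecond timestamp for 30 minutes ago and follows the pagination token; note that get_log_events keeps returning the same token once the stream is exhausted, which is the stop condition:

import time
import boto3

logs = boto3.client("logs")

# Epoch milliseconds for "30 minutes ago"
start_time = int((time.time() - 30 * 60) * 1000)

kwargs = {
    "logGroupName": "/aws/lambda/my-function",  # hypothetical
    "logStreamName": "my-stream",               # hypothetical
    "startTime": start_time,
    "startFromHead": True,
}

prev_token = None
while True:
    response = logs.get_log_events(**kwargs)
    for event in response["events"]:
        print(event["message"])
    token = response["nextForwardToken"]
    if token == prev_token:  # same token again means no more events
        break
    prev_token = token
    kwargs["nextToken"] = token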

How to apply Path Patterns in GCP Eventarc for BigQuery service's jobCompleted method?

I am developing a solution where a cloud function calls a BigQuery procedure and, upon successful completion of this stored proc, triggers another cloud function. For this I am using the Audit Logs "jobservice.jobcompleted" method. The problem with this approach is that it triggers the cloud function for every job completed in BigQuery, irrespective of dataset and procedure.
Is there any way to add a path pattern to the filter so that it triggers only for a specific query's completion and not for all of them?
My query starts something like: CALL storedProc() ...
Also, when I tried to create a 2nd gen function from the console, I tried an Eventarc trigger. But to my surprise the BigQuery event provider doesn't have an event for jobCompleted.
Now I'm wondering whether it's possible to trigger based on the job-complete event at all.
Update: I have changed my logic to use the google.cloud.bigquery.v2.TableService.InsertTable method, so that after inserting a record into a table an audit log message is written that I can use to trigger the next service. This insert statement is the last statement in the BigQuery procedure.
After running the procedure, the insert statement does insert the data, but the resource name comes out as projects/<project_name>/jobs.
I was expecting something like projects/<project_name>/tables/<table_name> so that I can apply a path pattern to the resource name.
Do I need to use a different protoPayload.method?
Try creating a log sink for the job-completed entries, filtered on a unique principal email (a dedicated service account), and use Pub/Sub as the sink destination.
Then use the published Pub/Sub event to run the destination service.
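As a rough illustration, a sink filter along these lines could do it; the service-account email is hypothetical and the field names assume the legacy BigQuery audit log format, so adjust them to what you actually see in Logs Explorer:

resource.type="bigquery_resource"
protoPayload.methodName="jobservice.jobcompleted"
protoPayload.authenticationInfo.principalEmail="proc-runner@my-project.iam.gserviceaccount.com"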

How can I trigger an alert based on log output?

I am using GCP and want to create an alert after not seeing a certain pattern in the output logs of a process.
As an example, my CLI process will output "YYYY-MM-DD HH:MM:SS Successfully checked X" every second.
I want to know when this fails (indicated by no log output). I am collecting logs using the normal GCP log collector.
Can this be done?
I am creating the alerts via the UI at:
https://console.cloud.google.com/monitoring/alerting/policies/create
You can create an alert based on a log-based metric. For that, create a log-based metric in Cloud Logging with the log filter that you want.
Then create an alert: aggregate the metric per minute and alert when the value drops below 60.
You won't get an alert for each individual missing message, but on a per-minute basis you will get an alert when the expected count isn't reached.
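For example, the log-based metric could count the success lines with a filter along these lines (the resource type is an assumption and depends on where the process runs):

resource.type="gce_instance"
textPayload:"Successfully checked"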

Is there an API to send notifications based on job outputs?

I know there are APIs to configure a notification when a job fails or finishes.
But what if, say, I run a Hive query that counts the number of rows in a table, and I want to send emails to the concerned parties if the returned result is zero? How can I do that?
Thanks.
You may want to look at Airflow and Qubole's operator for Airflow. We use Airflow to orchestrate all jobs run with Qubole, and in some cases non-Qubole environments. We use the DataDog API to report success/failure of each task (Qubole or non-Qubole). DataDog could be replaced here by Airflow's email operator. Airflow also has chat operators (like Slack).
There is no direct API for triggering a notification based on the results of a query.
However, there is a way to do this using Qubole. Create a workflow in Qubole with the following steps:
1. Your query (any query), writing its output to a particular location on S3.
2. A shell script that reads the result from S3 and fails the job based on any criteria; for instance, in your case, fail the job if the result returns 0 rows (see the sketch after this list).
Then schedule this workflow using the "Scheduler" API and have it notify on failure.
You can also use the "Sendmail" shell command to send mail based on the result in step 2 above.
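As a sketch of step 2, something like the following Python script could stand in for the shell script; the bucket and key are hypothetical, and boto3 is assumed to be available on the node running the step:

import sys
import boto3

BUCKET = "my-qubole-results"      # hypothetical bucket written to in step 1
KEY = "hive/output/part-00000"    # hypothetical result object

s3 = boto3.client("s3")
body = s3.get_object(Bucket=BUCKET, Key=KEY)["Body"].read().decode("utf-8")
rows = [line for line in body.splitlines() if line.strip()]

if not rows:
    # A non-zero exit fails the workflow step, which makes the scheduler
    # send its notify-on-failure email
    sys.exit("Query returned 0 rows - failing the job")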

Filter AWS CloudWatch Lambda logs

I have a Lambda function and its logs in CloudWatch (log group and log stream). Is it possible to filter (in the CloudWatch Management Console) all logs that contain "error"? For example, logs containing "Process exited before completing request".
In Log Groups there is a button "Search Events"; you must click on it first.
It then "changes" to "Filter Streams".
Now you just type your filter and select the start date and time.
This is kind of a side issue, but it was relevant for us. (I posted this in another answer on Stack Overflow but thought it would be relevant to this conversation too.)
We've noticed that tailing and searching logs gets really slow once a log group has a lot of log streams in it, for example when an AWS Lambda function has had a lot of invocations. This is because "tail"-type utilities and searches need to connect to each log stream. Log events expire and get deleted according to the retention policy you set on the log group itself, but the log streams never get cleaned up. I made a few little utility scripts to help with that:
https://github.com/four43/aws-cloudwatch-log-clean
Hopefully that saves you some agony while waiting for those logs to get searched.
You can also use CloudWatch Logs Insights (https://aws.amazon.com/about-aws/whats-new/2018/11/announcing-amazon-cloudwatch-logs-insights-fast-interactive-log-analytics/), an AWS extension to CloudWatch Logs that gives you a pretty powerful query and analytics tool. However, it can be slow; some of my queries take up to a minute. That's okay if you really need the data.
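For example, an Insights query like the following pulls the most recent events containing "error" from the selected log group:

fields @timestamp, @message
| filter @message like /error/
| sort @timestamp desc
| limit 50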
You could also use a tool I created called SenseLogs. It downloads CloudWatch data to your browser, where you can run the kind of queries you are asking about. You can either do a full-text search for "error" or, if your log data is structured (JSON), use a JavaScript-like expression language to filter by field, e.g.:
error == 'critical'
Posting an update, as CloudWatch has changed since 2016:
In the Log Groups view there is now a "Search all" button for a full-text search.
Then just type your search term.