Trigger alert when expected associated event is not present - google-cloud-platform

I have two events that should appear in the logs to represent the completion of a transaction. I need to set up an alert when a transaction starts but no completion log entry is found.
So far I have created two user-defined log-based metrics:
order_form_starts denotes the submission of an order.
order_form_sents denotes the successful transmission of items of an order.
This entry should be present 1 or more times per ..._start.
I'm able to query the events individually, but I'm not sure how to express an alert that should be triggered for the following "BAD" scenarios:
Some examples of possibilities:
GOOD: start(order_id=1), end(order_id=1)
GOOD: start(order_id=1), end(order_id=1), end(order_id=1)
BAD: start(order_id=1)
BAD: start(order_id=1), end(order_id=2)
Start Event:
fetch global::logging.googleapis.com/user/order_form_starts
| group_by [resource.project_id], row_count()
End Event:
fetch global::logging.googleapis.com/user/order_form_sents
| group_by [resource.project_id], row_count()

I don't think you can express it that simply. If I had to implement this, I would create a Cloud Function that queries the logs for the two events on a recurring schedule and writes the result as a gauge metric to Cloud Monitoring.
From there you can define a threshold on that metric and create an alert.
You might also want to instrument your transactions with Google Cloud Trace, so you can keep track of each transaction end to end.
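A rough, untested sketch of such a Cloud Function in Python; the project ID, the log payload fields (jsonPayload.event, jsonPayload.order_id), the lookback window, and the custom metric name are all assumptions to adapt to your logs:

import time
from datetime import datetime, timedelta, timezone

from google.cloud import logging as gcl
from google.cloud import monitoring_v3

PROJECT_ID = "my-project"  # assumption: your project ID

def report_unmatched_orders(request):
    log_client = gcl.Client(project=PROJECT_ID)
    cutoff = (datetime.now(timezone.utc) - timedelta(minutes=30)).strftime("%Y-%m-%dT%H:%M:%SZ")

    def order_ids(event_name):
        # Assumes the log entries carry jsonPayload.event and jsonPayload.order_id.
        entries = log_client.list_entries(
            filter_=f'jsonPayload.event="{event_name}" AND timestamp>="{cutoff}"'
        )
        return {e.payload.get("order_id") for e in entries}

    # Starts with no matching end are the "BAD" transactions. Note that very
    # recent starts may simply not have finished yet; in practice you would
    # exclude a grace window at the end of the interval.
    unmatched = order_ids("order_form_start") - order_ids("order_form_sent")

    # Write the count as a gauge so an alerting policy can threshold on it.
    mon = monitoring_v3.MetricServiceClient()
    series = monitoring_v3.TimeSeries()
    series.metric.type = "custom.googleapis.com/orders/unmatched_starts"
    series.resource.type = "global"
    series.resource.labels["project_id"] = PROJECT_ID
    interval = monitoring_v3.TimeInterval({"end_time": {"seconds": int(time.time())}})
    series.points = [monitoring_v3.Point(
        {"interval": interval, "value": {"int64_value": len(unmatched)}}
    )]
    mon.create_time_series(name=f"projects/{PROJECT_ID}", time_series=[series])
    return f"{len(unmatched)} unmatched starts"

Run it from Cloud Scheduler every few minutes and alert when the gauge rises above zero.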

Related

Dynamically Create Cloud Run or Function determined by attribute in pub/sub

I am trying to get a Cloud Run service or Cloud Function to start and pull out messages that match its defined ID. For example, if a message with attribute ID 1 is put into the topic, the Cloud Run instance with ID 1 will take it out; it's important that all messages with attribute 1 go to the same instance.
I understand I could use filters on the subscriptions, but I would like to be able to easily change the number of possible IDs, e.g. if I only put messages in the topic with IDs ranging between 0 and 4, then only five instances would be started.
How would I go about creating something like this? Does Pub/Sub support this sort of functionality?
I know I could create X topics and then put each message into its own topic, but that seems like an inefficient way of doing this when the attribute system exists.
This was not possible. Instead, I had to wait for all the data for the desired function to be ready before starting the function; I could not have it continually poll and pick up only the matching messages.
The best approach was therefore to put the data into Firestore; then, when all the data was ready for the next layer to process, I would publish a Pub/Sub message containing a message ID, and this message ID would determine which data the function processes.
The function would then query Firestore for documents whose properties include the message ID it was given (see the sketch below).
I could not find any other workable approach that would give me the result I desired.
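A hypothetical Python sketch of that dispatch pattern, as a Pub/Sub-triggered Cloud Function; the collection and field names ("work_items", "batch_id") and the process() helper are assumptions:

import base64
import json

from google.cloud import firestore

db = firestore.Client()

def process(item):
    # Placeholder for the application-specific work on one document.
    print(item)

def handle_message(event, context):
    # Pub/Sub delivers the payload base64-encoded in event["data"].
    message = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    batch_id = message["id"]

    # Pull every document tagged with this batch's ID.
    docs = db.collection("work_items").where("batch_id", "==", batch_id).stream()
    for doc in docs:
        process(doc.to_dict())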

How can I trigger an alert based on log output?

I am using GCP and want to create an alert after not seeing a certain pattern in the output logs of a process.
As an example, my CLI process will output "YYYY-MM-DD HH:MM:SS Successfully checked X" every second.
I want to know when this fails (indicated by no log output). I am collecting logs using the normal GCP log collector.
Can this be done?
I am creating the alerts via the UI at:
https://console.cloud.google.com/monitoring/alerting/policies/create
You can create an alert based on a log-based metric. To do that, create a log-based metric in Cloud Logging with the log filter that you want.
Then create an alert: aggregate the metric per minute and alert when the value falls below 60.
You won't get an alert for each individual missing message, but on a per-minute basis you will get an alert whenever the expected count isn't reached.
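If you prefer to script the first step, a minimal Python sketch using the Cloud Logging client library; the metric name and the filter string matching the heartbeat line are assumptions:

from google.cloud import logging

client = logging.Client()
metric = client.metric(
    "successful_checks",  # hypothetical metric name
    filter_='textPayload:"Successfully checked"',  # match the once-per-second log line
    description="Counts the per-second success log lines",
)
metric.create()

The alerting policy then aggregates this counter with a one-minute alignment and fires when the value drops below 60.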

Is there a way to easily get only the log entries for a specific AWS Lambda execution?

Lambda obviously tracks executions, since you can see data points in the Lambda Monitoring tab.
Lambda also saves the logs in log groups; however, execution environments are reused when launches happen within a short interval (say, 5 minutes apart), so the output from multiple executions gets written to the same log stream.
This makes logs a lot harder to follow, especially given other limitations (the CloudWatch web console is super slow and cumbersome to navigate, and aws logs get-log-events has a 1 MB/10k-message limit that makes it cumbersome to use).
Is there some way to only get Lambda log entries for a specific Lambda execution?
You can filter by the RequestId. Most loggers will include this in the log, and it is automatically included in the START, END, and REPORT entries.
My current approach is to use CloudWatch Logs Insights to query for the specific logs that I'm looking for. Here is the sample query:
fields @timestamp, @message
| filter @requestId = '5a89df1a-bd71-43dd-b8dd-a2989ab615b1'
| sort @timestamp
| limit 10000
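If you need to run the same query outside the console, an untested boto3 sketch; the log group name, time window, and request ID are placeholders:

import time

import boto3

logs = boto3.client("logs")

def logs_for_request(log_group, request_id):
    query = (
        "fields @timestamp, @message "
        f"| filter @requestId = '{request_id}' "
        "| sort @timestamp | limit 10000"
    )
    start = logs.start_query(
        logGroupName=log_group,
        startTime=int(time.time()) - 3600,  # look back one hour
        endTime=int(time.time()),
        queryString=query,
    )
    # Insights queries run asynchronously, so poll until the query finishes.
    while True:
        result = logs.get_query_results(queryId=start["queryId"])
        if result["status"] in ("Complete", "Failed", "Cancelled"):
            return result["results"]
        time.sleep(1)

# Example usage with placeholder values:
rows = logs_for_request("/aws/lambda/my-function", "5a89df1a-bd71-43dd-b8dd-a2989ab615b1")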

Scheduling a reminder email in AWS step function (through events/SES) based on Dynamo DB attributes

I have a step function with 3 Lambdas; the last Lambda writes an entry to DynamoDB with a timestamp, status = "unpaid" (this is updated to "paid" automatically by another workflow), and an email address, then closes the execution. Now I want to schedule reminders for any entry in DynamoDB that is unpaid: a first reminder after 7 days, a second after 14 days, and a third and final reminder on the 19th day, all sent via email. So the questions are:
Is there any way to do this scheduling per Step Function execution (one that can monitor that particular entry in DynamoDB for 7, 14, and 19 days and send reminders accordingly as long as the status is "unpaid")?
If yes, would it be too much overhead, since there could be millions of transactions?
The second way I was considering was to build another scheduler Lambda sequence: the first Lambda parses through the whole table searching for entries due for a reminder (7, 14, or 19 days old); the second Lambda takes the list from the first and prepares the appropriate reminder (first, second, or third) in a loop; and the third Lambda sends the reminders through SES.
Is there a better or easier way to do this?
I know we can trigger Step Functions or Lambdas through CloudWatch Events, and we also have crons we could use, but they didn't suit the use case well.
Any help here is appreciated.
DynamoDB does not have built-in functionality for delayed notifications based on logic; you would need to design this flow yourself. Luckily, AWS has all the tools you need.
I believe the best option would be to create a CloudWatch Events/EventBridge rule when the item is written to DynamoDB (either from your application or from a Lambda triggered via DynamoDB Streams).
This event would be scheduled for 7 days' time; when it fires, you check whether the entry has been paid. If it has not been paid, you schedule the next event and send out the notification; if it has been paid, you simply exit the Lambda function. This then repeats for the next two time periods.
You could further enhance this with DynamoDB Streams: when the table is updated, a Lambda is triggered to detect whether the status has changed from "unpaid", and if so it removes the scheduled event so it never has to fire.
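As a rough, untested boto3 sketch of the scheduling step (the rule name, cron expression, and Lambda ARN are placeholders, and the reminder Lambda also needs a resource-based policy allowing events.amazonaws.com to invoke it):

import json

import boto3

events = boto3.client("events")

def schedule_reminder(order_id, reminder_lambda_arn, cron_expression):
    rule_name = f"reminder-{order_id}"
    # One rule per order; a cron pinned to a single date acts as a one-shot.
    events.put_rule(
        Name=rule_name,
        ScheduleExpression=cron_expression,  # e.g. "cron(0 9 28 6 ? 2024)"
        State="ENABLED",
    )
    events.put_targets(
        Rule=rule_name,
        Targets=[{
            "Id": "reminder",
            "Arn": reminder_lambda_arn,
            # The target Lambda re-checks DynamoDB: if still unpaid, it sends
            # the email via SES and schedules the next reminder; otherwise it
            # deletes this rule.
            "Input": json.dumps({"order_id": order_id}),
        }],
    )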

How do you run functions in parallel?

My desire is to retrieve x number of records from a database based on some custom select statement; the output will be an array of JSON data. I then want to pass each element of the array into another Lambda function, in parallel.
So if 1000 records are returned, 1000 Lambda functions need to be executed in parallel (I'll increase my account limit to whatever I need). If 30 out of the 1000 fail, the main task that retrieved the records needs to know about it.
I'm struggling to put together this simple flow.
I currently use JavaScript and AWS Aurora. I'm not looking for Node.js/JavaScript code that retrieves the data, just the AWS Step Functions configuration and how to build an array within each function.
Thank you.
if 1000 records are returned, 1000 lambda functions need to be executed in parallel
What you are trying to achieve is not supported by Step Functions. A state machine's tasks cannot be modified based on the input they receive; for instance, a Parallel state cannot be configured to add or remove branches based on the number of items in an array input.
You should probably consider using an SQS Lambda trigger instead. The records retrieved from the DB can be added to an SQS queue, which will then trigger a Lambda function for each item received, as in the sketch at the end of this answer.
If 30 out of 1000 fail, the main task that was retrieving the records needs to know about it.
There are various ways to achieve this. SQS won't delete an item from the queue if the Lambda returns an error, and you can configure a DLQ and RedrivePolicy based on your requirements. Or you may want to come up with a custom solution that keeps a count of failing Lambdas and notifies the service that fetches the records from the DB.
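For the producing side, an untested Python sketch (the queue URL and record shape are assumptions); each queued record then invokes the worker Lambda through the SQS event source mapping:

import json

import boto3

sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/records"  # placeholder

def enqueue_records(records):
    # send_message_batch accepts at most 10 entries per call, so chunk the list.
    for i in range(0, len(records), 10):
        chunk = records[i:i + 10]
        sqs.send_message_batch(
            QueueUrl=QUEUE_URL,
            Entries=[
                {"Id": str(i + j), "MessageBody": json.dumps(rec)}
                for j, rec in enumerate(chunk)
            ],
        )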