Which functions should I use to read AWS Lambda logs - amazon-web-services

Once my lambda run is finished, I am getting this payload as a result:
{
"version": "1.0",
"timestamp": "2020-09-30T19:20:03.360Z",
"requestContext": {
"requestId": "2de65baf-f630-48a7-881a-ce3145f1127d",
"functionArn": "arn:aws:lambda:us-east-2:044739556748:function:puppeteer:$LATEST",
"condition": "Success",
"approximateInvokeCount": 1
},
"responseContext": {
"statusCode": 200,
"executedVersion": "$LATEST"
}
}
I would like to read the logs of my run from CloudWatch, and also the memory usage which I can see in the Lambda monitoring tab.
How can I do it via the SDK? Which functions should I use?
I am using Node.js.

You need to discover the log stream name that has been assigned to the Lambda function invocation. This is available inside the Lambda function's context.
exports.handler = async (event, context) => {
  console.log('context', context);
};
Results in the following log:
context { callbackWaitsForEmptyEventLoop: [Getter/Setter],
succeed: [Function],
fail: [Function],
done: [Function],
functionVersion: '$LATEST',
functionName: 'test-log',
memoryLimitInMB: '128',
logGroupName: '/aws/lambda/test-log',
logStreamName: '2020/10/03/[$LATEST]f123a3c1bca123df8c12e7c12c8fe13e',
clientContext: undefined,
identity: undefined,
invokedFunctionArn: 'arn:aws:lambda:us-east-1:123456781234:function:test-log',
awsRequestId: 'e1234567-6b7c-4477-ac3d-74bc62b97bb2',
getRemainingTimeInMillis: [Function: getRemainingTimeInMillis] }
So, the CloudWatch Logs stream name is available in context.logStreamName. I'm not aware of an API to map a Lambda request ID to a log stream name after the fact, so you may need to return this or somehow persist the mapping.
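Once you have the group and stream names, a minimal sketch of pulling the invocation's log output with the AWS SDK for JavaScript (v2) could look like this; the region and the example group/stream values come from the context shown above:
// Sketch: read a single invocation's log events, given the names captured from context.
const AWS = require('aws-sdk');
const cloudwatchlogs = new AWS.CloudWatchLogs({ region: 'us-east-1' });

async function readInvocationLogs(logGroupName, logStreamName) {
  const { events } = await cloudwatchlogs
    .getLogEvents({
      logGroupName,        // e.g. '/aws/lambda/test-log'
      logStreamName,       // e.g. '2020/10/03/[$LATEST]f123a3c1bca123df8c12e7c12c8fe13e'
      startFromHead: true, // oldest events first
    })
    .promise();
  return events.map((e) => e.message);
}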

Finding the logs of a specific request ID can be done via the CloudWatch Logs API.
You can use the filterLogEvents API to extract (using a filter pattern) the relevant START and REPORT logs and gather the memory usage information (the response also gives you the log stream name for future use).
If you want to gather all the logs of a specific invocation, you will need to pair up the START and REPORT logs and then query for all the logs in the timeframe between them on that specific log stream.
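A minimal sketch of the filterLogEvents call, assuming the AWS SDK for JavaScript (v2) and the function's default log group name (both are assumptions, not from the original answer):
// Sketch: find the REPORT line for a given request ID; it contains "Max Memory Used".
const AWS = require('aws-sdk');
const cloudwatchlogs = new AWS.CloudWatchLogs({ region: 'us-east-2' });

async function getReportLine(requestId) {
  const { events } = await cloudwatchlogs
    .filterLogEvents({
      logGroupName: '/aws/lambda/puppeteer',             // default group for the function
      filterPattern: `"REPORT RequestId: ${requestId}"`, // quoted term match on the REPORT line
    })
    .promise();
  // Each event also carries logStreamName, which you can reuse to fetch the full invocation logs.
  return events.length ? events[0].message : null;
}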

Related

Dynamically scheduling events in AWS EventBridge from a Lambda

I have the following two Lambda functions:
setNotification(date, text):
const AWS = require("aws-sdk");
const eventbridge = new AWS.EventBridge();

exports.lambdaHandler = async (event, context) => {
  const params = {
    Entries: [
      {
        Source: "com.aws.message-event-lambda",
        EventBusName: "",
        DetailType: "message",
        Detail: JSON.stringify({
          title: event.detail.title,
          text: event.detail.text,
        }),
      },
    ],
  };
  await eventbridge.putEvents(params).promise();
};
sendNotification(text)
Currently I am using EventBridge to trigger the sendNotification function from the setNotification function, but it triggers the function immediately.
How can I trigger the sendNotification function at a specific date defined by the setNotification function?
Currently I see the following 2 options:
Create code inside the setNotification function that creates a scheduled rule on the EventBridge
Stop using EventBridge and use step functions.
I would like to know which is the correct approach between these two, or if there is a better approach which I haven't found.
I figured it out: you need a different architecture, with a Lambda function invoked by a cron expression on EventBridge that checks a DB for due entries and then sends the notifications (a sketch is below).
More information on scheduling systems on AWS in the following link:
https://aws.amazon.com/blogs/architecture/serverless-scheduling-with-amazon-eventbridge-aws-lambda-and-amazon-dynamodb/
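A minimal sketch of that polling Lambda, assuming a hypothetical DynamoDB table named Notifications with sendAt (epoch milliseconds) and text attributes; the table, attribute, and event names are illustrative:
// Sketch: Lambda invoked by an EventBridge cron/rate rule (e.g. rate(1 minute)).
const AWS = require("aws-sdk");
const dynamodb = new AWS.DynamoDB.DocumentClient();
const eventbridge = new AWS.EventBridge();

exports.lambdaHandler = async () => {
  // Find notifications that are due. A scan keeps the sketch simple;
  // a query against an index on sendAt would scale better.
  const { Items } = await dynamodb
    .scan({
      TableName: "Notifications",
      FilterExpression: "sendAt <= :now",
      ExpressionAttributeValues: { ":now": Date.now() },
    })
    .promise();

  for (const item of Items) {
    // Emit the event that the sendNotification function's rule listens for.
    await eventbridge
      .putEvents({
        Entries: [
          {
            Source: "com.aws.message-event-lambda",
            DetailType: "send-notification",
            Detail: JSON.stringify({ text: item.text }),
          },
        ],
      })
      .promise();
  }
};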

Handling nested JSON messages with AWS IoT Core rules and AWS Lambda

We are using an AWS IoT Rule to forward all messages from things to a Lambda function and appending some properties on the way, using this query:
SELECT *, topic(2) AS DeviceId, timestamp() AS RoutedAt FROM 'devices/+/message'
The message sent to the topic is a nested JSON:
{
version: 1,
type: "string",
payload: {
property1: "foo",
nestedPayload: {
nestedProperty: "bar"
}
}
}
When we use the same query for another rule and route the messages into an S3 bucket instead of a Lambda, the resulting JSON files in the bucket are as expected:
{
DeviceId: "test",
RoutedAt:1618311374770,
version: 1,
type: "string",
payload: {
property1: "foo",
nestedPayload: {
nestedProperty: "bar"
}
}
}
But when routing into a lambda function, the properties of the "nestedPayload" are pulled up one level:
{
DeviceId: "test",
RoutedAt:1618311374770,
version: 1,
type: "string",
payload: {
property1: "foo",
nestedProperty: "bar"
}
}
However, when debugging the Lambda locally using VS Code and providing a JSON file (in other words, not connecting to AWS IoT Core), the JSON structure is as expected. This is why I am assuming the error is not with the JSON serializer/deserializer, but with the rule.
Did anyone experience the same issue?
It turns out, the issue was with the SQL version of the rule.
We created the rule routing to the Lambda using CDK, which by default set the version to "2015-10-08". The rule routing to S3, which didn't show the error, was created manually and used version "2016-03-23". Updating the rule routing to the Lambda to also use "2016-03-23" fixed the issue (a CDK sketch is below).
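A minimal CDK sketch (JavaScript, CDK v2, L1 construct) of pinning the SQL version on the rule; the construct ID and the referenced Lambda (messageHandlerFunction) are illustrative and not from the original answer:
// Sketch: inside your Stack class, define the IoT rule with an explicit SQL version.
const iot = require('aws-cdk-lib/aws-iot');

new iot.CfnTopicRule(this, 'MessageToLambdaRule', {
  topicRulePayload: {
    awsIotSqlVersion: '2016-03-23', // the default 2015-10-08 flattened the nested payload
    sql: "SELECT *, topic(2) AS DeviceId, timestamp() AS RoutedAt FROM 'devices/+/message'",
    actions: [
      { lambda: { functionArn: messageHandlerFunction.functionArn } },
    ],
  },
});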

Where can I log & debug Velocity Template Language (VTL) in AWS AppSync?

Is there any easy way to log or debug VTL coming from Request Mapping Template & Response Mapping Template rather than sending Queries & Mutations to debug & log?
Also, is there any Playground to check & play with VTL just like we can do with JavaScript in Web Console?
Can we work with AWS AppSync offline & check if everything written in VTL works as expected?
A super nasty way to log and debug is to use $util.validate in the response mapping template:
$util.validate(false, $util.time.nowISO8601().substring(0, 10) )
Here's how I logged a value in my VTL resolver:
Add a "$util.error" statement in your request or response template and then make the graphql call.
For example, I wanted to see what arguments were passed as input into my resolver, so I added the $util.error statement at the beginning of my template, which then looked like this:
$util.error("Test Error", $util.toJson($ctx))
{
"version" : "2017-02-28",
"operation" : "PutItem",
"key": {
"id": $util.dynamodb.toDynamoDBJson($ctx.arguments.user.id)
},
"attributeValues": {
"name": $util.dynamodb.toDynamoDBJson($ctx.arguments.user.name)
}
}
Then from the "Queries" section of the AWS AppSync console, I ran the following mutation:
mutation MyMutation {
addUser(user: {id: "002", name:"Rick Sanchez"}) {
id
name
}
}
This displayed the log results from my resolver as follows:
{
"data": null,
"errors": [
{
"path": [
"addUser"
],
"data": null,
"errorType": "{\"arguments\":{\"user\":{\"id\":\"002\",\"name\":\"Rick Sanchez\"}},\"identity\":null,\"source\":null,\"result\":null,\"request\":{\"headers\":{\"x-forwarded-for\":\"112.133.236.59, 130.176.75.151\",\"sec-ch-ua-mobile\":\"?0\",\"cloudfront-viewer-country\":\"IN\",\"cloudfront-is-tablet-viewer\":\"false\",\"via\":\"2.0 a691085135305af276cea0859fd6b129.cloudfront.net (CloudFront)\",\"cloudfront-forwarded-proto\":\"https\",\"origin\":\"https://console.aws.amazon.com\",\"content-length\":\"223\",\"accept-language\":\"en-GB,en;q=0.9,en-US;q=0.8\",\"host\":\"raxua52myfaotgiqzkto2rzqdy.appsync-api.us-east-1.amazonaws.com\",\"x-forwarded-proto\":\"https\",\"user-agent\":\"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36 Edg/87.0.664.66\",\"accept\":\"*/*\",\"cloudfront-is-mobile-viewer\":\"false\",\"cloudfront-is-smarttv-viewer\":\"false\",\"accept-encoding\":\"gzip, deflate, br\",\"referer\":\"https://console.aws.amazon.com/\",\"x-api-key\":\"api-key-has-been-edited-out\",\"content-type\":\"application/json\",\"sec-fetch-mode\":\"cors\",\"x-amz-cf-id\":\"AvTMLvtxRq9M8J8XntvkDj322SZa06Fjtyhpf_fSXd-GmHs2UeomDg==\",\"x-amzn-trace-id\":\"Root=1-5fee036a-13f9ff472ba6a1211d499b8b\",\"sec-fetch-dest\":\"empty\",\"x-amz-user-agent\":\"AWS-Console-AppSync/\",\"cloudfront-is-desktop-viewer\":\"true\",\"sec-fetch-site\":\"cross-site\",\"sec-ch-ua\":\"\\\"Chromium\\\";v=\\\"87\\\", \\\" Not;A Brand\\\";v=\\\"99\\\", \\\"Microsoft Edge\\\";v=\\\"87\\\"\",\"x-forwarded-port\":\"443\"}},\"info\":{\"fieldName\":\"addUser\",\"parentTypeName\":\"Mutation\",\"variables\":{}},\"error\":null,\"prev\":null,\"stash\":{},\"outErrors\":[]}",
"errorInfo": null,
"locations": [
{
"line": 9,
"column": 3,
"sourceName": null
}
],
"message": "Test Error"
}
]
}
The answers to each of your 3 questions are as follows:
To unit test request/response mapping templates, you could use the method described in this blog post (https://mechanicalrock.github.io/2020/04/27/ensuring-resolvers-aren't-rejected.html).
A Playground for VTL experimentation exists in the AWS AppSync console where you can edit and test the VTL for your resolvers.
The Amplify framework has a mock functionality which mocks AppSync, the AppSync VTL environment and DynamoDB (using DynamoDB Local). This would allow you to perform e2e tests locally.
It looks like you are looking for the new VTL logging utility:
$util.log.info(Object) : Void
Documentation:
https://docs.aws.amazon.com/appsync/latest/devguide/utility-helpers-in-util.html
When I realized how much of a pain it was to debug VTL, I created a Lambda (Node.js) that logged the contents of my VTL template.
// my nodejs based debug lambda -- very basic
exports.handler = (event, context, callback) => {
  const origin = context.request || 'oops';
  if (context && context.prev) {
    // invoked mid-pipeline: log the previous function's result and the stash
    console.log('--------with context----------------');
    console.log({ prev: context.prev.result, context, origin });
    console.log({ stash: context.stash });
    console.log('--------END: with context----------------');
    // pass the previous function's result straight through and stop here
    return callback(null, context.prev.result);
  }
  console.log('inside - LOGGING_DEBUGGER');
  console.log({ event, context: context || null, origin });
  callback(null, event);
};
This lambda helped me debug many issues inside my pipeline resolvers. However, I forgot if I used it as a direct lambda or with request+response templates.
To use it, I put values that I wanted to debug into $ctx.stash in my other pipeline functions. Then in my pipeline, I added the "debugger" function after this step -- in case there was an issue where my pipeline would blow up before a fatal error occurred.
Check the $util.log.info(Object) : Void from the CloudWatch logging utils.
PS: you need to turn on logging to Amazon CloudWatch Logs and set the field resolver log level to ALL; more details here.

Regex filtering of messages in SNS

Is there a way to filter messages based on Regex or substring in AWS SNS?
AWS Documentation for filtering messages mentions three types of filtering for strings:
Exact matching (whitelisting)
Anything-but matching (blacklisting)
Prefix matching
I want to filter out messages based on substrings in the messages. For example,
I have an S3 event that sends a message to SNS when a new object is added to S3; the contents of the message are as below:
{
"Records": [
{
"s3": {
"bucket": {
"name": "images-bucket"
},
"object": {
"key": "some-key/more-key/filteringText/additionaldata.png"
}
}
}
]
}
I want to keep the messages only if filteringText is present in the key field.
Note: The entire message is sent as text by the S3 notification service, so Records is not a JSON object but a string.
From what I've seen in the documentation, you can't do regex matches or substrings, but you can match prefixes and create your own attributes in the MessageAttributes field.
To do this, I send the S3 event to a simple Lambda that adds MessageAttributes and then sends to SNS.
In effect, S3 -> Lambda -> SNS -> other consumers (with filtering).
The Lambda can do something like this (where you'll have to programmatically decide when to add the attribute):
const AWS = require('aws-sdk');
const sns = new AWS.SNS();

// "payload" is the original S3 event and SNS_ARN is the topic ARN (defined elsewhere).
let messageAttributes = {
  myfilterkey: { DataType: "String", StringValue: "filteringText" }
};
let params = {
  Message: JSON.stringify(payload),
  MessageAttributes: messageAttributes,
  MessageStructure: 'json',
  TargetArn: SNS_ARN
};
await sns.publish(params).promise();
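For the "decide when" part, a hypothetical helper that inspects the S3 object key (field names taken from the event shown in the question) could look like this:
// Hypothetical helper: decide whether to add the attribute based on the S3 object key.
function buildMessageAttributes(s3Event) {
  const key = s3Event.Records[0].s3.object.key;
  if (!key.includes('filteringText')) return {};
  return { myfilterkey: { DataType: 'String', StringValue: 'filteringText' } };
}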
Then in SNS you can filter:
{"myfilterkey": ["filtertext"]}
It seems a little convoluted to put the Lambda in there, but I like the idea of being able to plug and unplug consumers from SNS on the fly and use filtering to determine who gets what.
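If you prefer to set the filter policy programmatically rather than in the console, here is a minimal sketch (the subscription ARN is illustrative):
// Sketch: attach the filter policy to an existing SNS subscription.
const AWS = require('aws-sdk');
const sns = new AWS.SNS();

async function attachFilterPolicy(subscriptionArn) {
  await sns.setSubscriptionAttributes({
    SubscriptionArn: subscriptionArn, // arn:aws:sns:<region>:<account>:<topic>:<subscription-id>
    AttributeName: 'FilterPolicy',
    AttributeValue: JSON.stringify({ myfilterkey: ['filteringText'] }),
  }).promise();
}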

How do I index the transformed log records into AWS Elasticsearch?

TLDR
The lambda function is not able to index the firehose logs into the AWS managed ES due to an "encoding problem".
Actual Error Response
I do not get any error when I base64-encode a single logEvent from a Firehose record and send the collected records to the AWS managed ES.
See the next section for more details.
The base64-encoded compressed payload is being sent to ES because the resulting JSON transformation is too big for ES to index - see this ES link.
I get the following error from the AWS managed ES:
{
"deliveryStreamARN": "arn:aws:firehose:us-west-2:*:deliverystream/*",
"destination": "arn:aws:es:us-west-2:*:domain/*",
"deliveryStreamVersionId": 1,
"message": "The data could not be decoded as UTF-8",
"errorCode": "InvalidEncodingException",
"processor": "arn:aws:lambda:us-west-2:*:function:*"
}
If the output record is not compressed, the body size is too long (at as little as 14 MB). Without compression and a simple base64-encoded payload, I get the following error in the Lambda logs:
{
"type": "mapper_parsing_exception",
"reason": "failed to parse",
"caused_by": {
"type": "not_x_content_exception",
"reason": "Compressor detection can only be called on some xcontent bytes or compressed xcontent bytes"
}
}
Description
I have CloudWatch logs that are buffered by size/interval and fed into a Kinesis Firehose. The Firehose transports the logs to a Lambda function, which transforms each log into a JSON record and should then send it over to the AWS managed Elasticsearch cluster.
The lambda function gets the following JSON structure:
{
"invocationId": "cf1306b5-2d3c-4886-b7be-b5bcf0a66ef3",
"deliveryStreamArn": "arn:aws:firehose:...",
"region": "us-west-2",
"records": [{
"recordId": "49577998431243709525183749876652374166077260049460232194000000",
"approximateArrivalTimestamp": 1508197563377,
"data": "some_compressed_data_in_base_64_encoding"
}]
}
The Lambda function then extracts .records[].data, base64-decodes it, and decompresses it, which results in the following JSON (a sketch of this decode step follows the example):
{
"messageType": "DATA_MESSAGE",
"owner": "aws_account_number",
"logGroup": "some_cloudwatch_log_group_name",
"logStream": "i-0221b6ec01af47bfb",
"subscriptionFilters": [
"cloudwatch_log_subscription_filter_name"
],
"logEvents": [
{
"id": "33633929427703365813575134502195362621356131219229245440",
"timestamp": 1508197557000,
"message": "Oct 16 23:45:57 some_log_entry_1"
},
{
"id": "33633929427703365813575134502195362621356131219229245441",
"timestamp": 1508197557000,
"message": "Oct 16 23:45:57 some_log_entry_2"
},
{
"id": "33633929427703365813575134502195362621356131219229245442",
"timestamp": 1508197557000,
"message": "Oct 16 23:45:57 some_log_entry_3"
}
]
}
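For reference, a minimal sketch of that decode step, assuming the standard gzip compression that CloudWatch Logs applies to subscription data delivered via Firehose:
// Sketch: decode one Firehose record into the CloudWatch Logs subscription payload.
const zlib = require('zlib');

function decodeFirehoseRecord(record) {
  const compressed = Buffer.from(record.data, 'base64'); // "data" is base64-encoded
  const decompressed = zlib.gunzipSync(compressed);      // CloudWatch Logs payloads are gzipped
  return JSON.parse(decompressed.toString('utf8'));      // { messageType, logGroup, logEvents, ... }
}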
Each item from .logEvents[] gets transformed into a JSON structure where the keys are the desired columns when searching the logs within Kibana - something like this:
{
'journalctl_host': 'ip-172-11-11-111',
'process': 'haproxy',
'pid': 15507,
'client_ip': '172.11.11.111',
'client_port': 3924,
'frontend_name': 'http-web',
'backend_name': 'server',
'server_name': 'server-3',
'time_duration': 10,
'status_code': 200,
'bytes_read': 79,
'#timestamp': '1900-10-16T23:46:01.0Z',
'tags': ['haproxy'],
'message': 'HEAD / HTTP/1.1'
}
The transformed JSON records are collected into an array, which gets zlib-compressed and base64-encoded into a string that is then wrapped into a new JSON payload as the final Lambda result:
{
"records": [
{
"recordId": "49577998431243709525183749876652374166077260049460232194000000",
"result": "Ok",
"data": "base64_encoded_zlib_compressed_array_of_transformed_logs"
}
]}
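A minimal sketch of that re-encode step, using gzip from Node's zlib module (the exact compression format is an assumption, since the post only says "zlib compressed"):
// Sketch: compress the array of transformed logs and wrap it as a Firehose output record.
const zlib = require('zlib');

function buildResultRecord(recordId, transformedLogs) {
  const compressed = zlib.gzipSync(Buffer.from(JSON.stringify(transformedLogs), 'utf8'));
  return {
    recordId,
    result: 'Ok',
    data: compressed.toString('base64'),
  };
}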
CloudWatch configuration
13 log entries (~4 KB) can get transformed into about 635 KB.
I have also decreased the thresholds for the awslogs agent, hoping that the size of the logs being sent to the Lambda function would be smaller:
buffer_duration = 10
batch_count = 10
batch_size = 500
Unfortunately, when there is a burst, the spike can be upwards of 2800 lines, with a size upwards of 1 MB.
When the resulting payload from the Lambda function is "too big" (~13 MB of transformed logs), an error is logged in the Lambda CloudWatch logs - "body size is too long". There doesn't seem to be any indication where this error is coming from or whether there is a size limit on the Lambda function's response payload.
So, the AWS support folks have told me that the following limitations can't be mitigated to solve this flow:
the Lambda payload size
the compressed Firehose payload coming into the Lambda, which is directly proportional to the Lambda output.
Instead, I have modified the architecture to the following:
Cloudwatch logs are backed up in S3 via Firehose.
S3 events are processed by the lambda function.
The Lambda function returns a success code if it transforms the logs and is able to successfully bulk index them into ES.
If the lambda function fails, a Dead Letter Queue (AWS SQS) is configured with a cloudwatch alarm. A sample cloudformation snippet can be found here.
If there are SQS messages, one could manually invoke the Lambda function with those messages (a sketch is below) or set up an AWS Batch job to process the SQS messages with the Lambda function. However, one should be careful that the Lambda function doesn't fail over again into the DLQ. Check the Lambda CloudWatch logs to see why that message was not processed and was sent to the DLQ.
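A minimal sketch of manually redriving the DLQ into the processing Lambda; the queue URL and function name are illustrative:
// Sketch: pull messages off the DLQ and re-invoke the indexing Lambda with them.
const AWS = require('aws-sdk');
const sqs = new AWS.SQS();
const lambda = new AWS.Lambda();

const QUEUE_URL = 'https://sqs.us-west-2.amazonaws.com/123456789012/log-indexer-dlq'; // illustrative

async function redriveDlq() {
  const { Messages = [] } = await sqs
    .receiveMessage({ QueueUrl: QUEUE_URL, MaxNumberOfMessages: 10 })
    .promise();

  for (const msg of Messages) {
    // Re-invoke the Lambda with the original event stored in the message body.
    const result = await lambda
      .invoke({ FunctionName: 'log-indexer', Payload: msg.Body })
      .promise();
    if (!result.FunctionError) {
      // Only delete the message when the invocation did not report a function error.
      await sqs.deleteMessage({ QueueUrl: QUEUE_URL, ReceiptHandle: msg.ReceiptHandle }).promise();
    }
  }
}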