Invoke AWS Lambda with AWS X-Ray locally

Is there a way to invoke lambda with X-Ray by using sam invoke local?
Following the idea PaulMaddox mentioned,
I have tried the steps below, though I don't know whether I've misunderstood:
Run an X-Ray daemon locally (0.0.0.0:2000) by following the documentation.
In my Lambda's template.yaml, set the environment variable AWS_XRAY_DAEMON_ADDRESS: 0.0.0.0:2000.
Invoke the function, but still get the error Missing AWS Lambda trace data for X-Ray. Expected _X_AMZN_TRACE_ID to be set.
Here is the relevant part of my template.yaml, where I use an environment variable to set AWS_XRAY_DAEMON_ADDRESS.
It would be nice if you could provide more information.

I'm not too familiar with SAM, but...
You need to set the _X_AMZN_TRACE_ID environment variable. Currently, the X-Ray Node SDK works by cross-communicating between the Lambda runtime's start-up code and the user code.
Lambda starts the segment in its start-up code, records information such as timing and exceptions, and sends the segment to the X-Ray service. It then forwards the trace ID/parent ID/sampling decision to the user code by setting the _X_AMZN_TRACE_ID environment variable. This allows the SDK to create a separate subsegment, inferring a connection to the original segment, which gets "woven" into the original on the service end without the two ever being directly related. Both are sent out of band, asynchronously from each other.
The _X_AMZN_TRACE_ID variable follows the same format as the tracing header discussed here: https://docs.aws.amazon.com/xray/latest/devguide/xray-concepts.html#xray-concepts-tracingheader
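For example, a complete header value, using the sample IDs from that page, looks like:
Root=1-5759e988-bd862e3fe1be46a994272793;Parent=53995c3f42cd8ad8;Sampled=1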
If you want to send traces through the Daemon to the X-Ray service, you'll need to figure out how to get SAM to construct this Lambda segment initially and set the _X_AMZN_TRACE_ID prior to importing the SDK.
Since the SDK auto-detects the presence of Lambda (which, as I understand it, SAM mimics), you'll have to set the _X_AMZN_TRACE_ID variable before importing the SDK. This is something of a catch-22, because you need to import the SDK (in non-Lambda mode) to construct the Lambda segment before you can populate _X_AMZN_TRACE_ID.
The problem lies here: https://github.com/aws/aws-xray-sdk-node/blob/master/packages/core/lib/aws-xray.js#L361
If you flip the SDK into LOG_ERROR mode (ignoring the Lambda errors), create and send the Lambda segment (manually create a segment, load the generated ID/parent ID/sampling decision into _X_AMZN_TRACE_ID, then close the segment), and then clear the module cache and re-import the SDK afterwards, that should work.
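For concreteness, here is roughly what that dance looks like, sketched with the Python X-Ray SDK purely for illustration (the answer targets the Node SDK, where "clear the module cache" means clearing require.cache before re-importing); the segment name is made up:

import os

# Step 1: flip the SDK into LOG_ERROR mode so running outside Lambda
# doesn't raise on the missing context
os.environ["AWS_XRAY_CONTEXT_MISSING"] = "LOG_ERROR"

from aws_xray_sdk.core import xray_recorder

# Step 2: manually create the segment Lambda would normally have created
segment = xray_recorder.begin_segment("local-lambda")

# Step 3: load the generated IDs/sampling decision into _X_AMZN_TRACE_ID
os.environ["_X_AMZN_TRACE_ID"] = (
    "Root={};Parent={};Sampled=1".format(segment.trace_id, segment.id)
)

# Step 4: close the segment, then clear the module cache and re-import
# the SDK so the user code starts from the populated environment
xray_recorder.end_segment()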
Otherwise, I suspect it may take some work on the SAM end to have this built in. But hopefully this workaround works.

Related

Finding an equivalent of environment variables in AWS Lambda Layers?

I am writing a serverless app on AWS.
I have broken up the app into many CloudFormation stacks. I am using CDK (in Python) to create the CF stacks to deploy the app.
A core requirement of my lambda functions, of course, is the ability to log events. To handle this (and all message passing in the app), I have created a custom EventBridge bus in one of my stacks. The name of the event bus is an output of the stack.
Because the logging functionality will be common to many lambda functions, it seems appropriate to put the logging functionality into a lambda layer. I can then have all of my lambda functions implement that layer, and logging will be automagically available to all of my lambda functions.
The problem is that my logging code needs to know which EventBridge event bus to write events to. I don't want to hardcode this, because I may deploy a 'dev' and/or 'test' and/or 'prod' version of the stack simultaneously and the logging layer in each environment needs to log to its environment's logging bus.
Because a lambda layer runs in the context of the parent lambda function, I can't set an environment variable on the layer.
I could set a 'LoggingEventBus' environment variable on all the lambda functions that use this layer. But that seems repetitive and error prone. (Although it's the best solution I can come up with at the moment.)
I could store the event bus name in the SSM Parameter Store. But good Lord, looking up that parameter every time any function in my app wants to log anything would just be ridiculous. (And I'm back to having to figure out a way for the logger to determine whether it's in dev/test/prod so it can look up the right bus.)
I'm actually considering some build process during 'cdk synth' that modifies the source code of the logging layer and does a string replacement of the event bus name, so that the name actually is hardcoded by the time the code is deployed. But that solution has all kinds of danger signals around it.
Ideally, there would be some kind of 'environment variable' for the layer itself. But no such thing exists, and I acknowledge that such a feature would not be compatible with how most language runtimes work.
How have others solved this? Is there a 'right' answer for putting a setting in a lambda layer?
You are right that there is no concept of environment variables for Lambda Layers.
I could set a 'LoggingEventBus' environment variable on all the lambda functions that use this layer. But that seems repetitive and error prone.
We also use Layers for some logic (involving SNS topics, etc.) that is common to multiple Lambda functions, and we do it very much like you suggested above.
The only difference is that we set just the stage in the environment variables, not the values of our SNS topics (or, in your case, the EventBridge bus name), and then let the Layer decide which SNS topic to use based on the stage set in the env.
In our Layer code, we use a config like this:
{
    "Prod": {
        "SNSTopicARN": "ProdARN",
        "OtherConfig": "ProdValue"
    },
    "Dev": {
        "SNSTopicARN": "DevARN",
        "OtherConfig": "DevValue"
    }
}
Then in the Layer code, we have:
import os
config = config_json.get(os.environ["STAGE"])  # config_json is loaded from the JSON above
Not that error-prone, I guess :)
And we can add any number of extra config variables to the Layer without touching the Lambda functions' deployment.
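For illustration, here is a minimal CDK (Python) sketch of that wiring, assuming it runs inside a Stack class and that logging_layer is the shared layer defined elsewhere; the names are hypothetical:

from aws_cdk import aws_lambda as _lambda

# Hypothetical function: the layer bundles the config shown above,
# so the function only needs to know which stage it belongs to
fn = _lambda.Function(
    self, "MyFunction",
    runtime=_lambda.Runtime.PYTHON_3_9,
    handler="index.handler",
    code=_lambda.Code.from_asset("src"),
    layers=[logging_layer],          # shared logging layer
    environment={"STAGE": "Prod"},   # the only per-function setting
)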

Can an AWS Lambda modify a json file on itself?

I have an AWS Lambda function which has an array in a .json file. The thing is that I want to modify that .json, but after the run the json remains exactly the same as before the run.
The logs I placed there make me think it actually is being modified, but I wonder if a lambda goes back to its original definition before each run.
To be honest, the information I need to hold in that json will always be just a small number of settings, but they have to be easy to modify without making a deploy, and I'm trying to avoid using a DB or an S3 bucket.
Regards,
Daniel
You're not going to be able to do this. Lambda stores the deployment package (i.e. the .zip or .jar file you used to deploy) and uses that package for the next instance it spins up. This new instance may or may not be the one that just ran.
The easiest way will be to store this in an S3 bucket. Be aware, though, that just as in multi-threaded programming, you may have many processes (Lambda instances) running at the same time, so resource contention is something to be aware of.
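A minimal sketch of that approach with boto3 (the bucket and key names here are made up):

import json
import boto3

s3 = boto3.client("s3")
BUCKET, KEY = "my-app-settings", "settings.json"  # hypothetical names

def load_settings():
    # Fetch the current settings object from S3
    body = s3.get_object(Bucket=BUCKET, Key=KEY)["Body"].read()
    return json.loads(body)

def save_settings(settings):
    # Beware: concurrent Lambda instances can race on this write
    s3.put_object(Bucket=BUCKET, Key=KEY, Body=json.dumps(settings))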
I want you to consider the following behaviour of a Lambda function: say you spin one instance up, and then send a second message to the Lambda. If the first invocation finished before you sent the second message, the same instance will process it.
That is why you see the file change: you are on the same instance, with the same files.
I would suggest loading the json into memory and not changing the file directly.
That will solve your problem.
AWS Lambda images are immutable. You need to deploy a new state file (the json with the array) or use some kind of storage for it.

Lambda AWS X-Ray. Python SDK - Deactivate Locally

I have a Flask app running as an AWS Lambda Function deployed with Zappa and would like to activate X-Ray to get more information for the different functions.
Activating X-Ray with Zappa was easy enough - it only requires adding this line in the zappa-settings.json:
"xray_tracing": true
Further, I installed the AWS X-Ray Python SDK and added a few decorators to some functions, like this:
@xray_recorder.capture()
When I deploy this as a Lambda function, it all works well. The problem is using the system locally, both when running tests and when running Flask on a local server instead of as a Lambda function.
When I use any of the functions that are decorated either in a test or through the local server, the following exception is thrown:
aws_xray_sdk.core.exceptions.exceptions.SegmentNotFoundException: cannot find the current segment/subsegment, please make sure you have a segment open
Which of course makes sense, because AWS Lambda handles the creation of segments.
Are there any good ways to deactivate capturing locally? This would be useful e.g. for running unit tests locally on functions that I would like to watch in X-Ray.
One of the feature requests for this SDK is to have a "disabled" global flag so everything becomes a no-op: https://github.com/aws/aws-xray-sdk-python/issues/26.
However, it still depends on what you are testing against. It's good practice to test what will actually run on Lambda. You can set some environment variables so the SDK thinks it is running on Lambda.
You can see the SDK is looking for two env vars in https://github.com/aws/aws-xray-sdk-python/blob/master/aws_xray_sdk/core/lambda_launcher.py. One is LAMBDA_TASK_ROOT, which, when set, tells the SDK to switch to lambda mode. The other is _X_AMZN_TRACE_ID, which contains the tracing context normally passed by the lambda container.
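For example, in your test setup, before anything imports the SDK (the trace ID below is just the documented sample value):

import os

# Any non-empty value flips the SDK into lambda mode
os.environ["LAMBDA_TASK_ROOT"] = "true"
# Tracing context in the same format Lambda itself would pass
os.environ["_X_AMZN_TRACE_ID"] = (
    "Root=1-5759e988-bd862e3fe1be46a994272793;"
    "Parent=53995c3f42cd8ad8;Sampled=1"
)

from aws_xray_sdk.core import xray_recorder  # import only after the env is set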
If you just want to test non-X-Ray code, you can set AWS_XRAY_CONTEXT_MISSING to LOG_ERROR so the SDK doesn't complain about the missing context and simply gives up capturing wrapped functions. This exercises much less of the code path than mimicking Lambda behaviour. Ideally, the local Lambda testing tool would be X-Ray friendly. Are you using https://github.com/awslabs/aws-sam-cli? There is already an open issue for this feature: https://github.com/awslabs/aws-sam-cli/issues/217
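Or, for that simpler option, again before the SDK is imported:

import os

# Log an error instead of raising SegmentNotFoundException when no
# segment is open; wrapped functions are simply not captured
os.environ["AWS_XRAY_CONTEXT_MISSING"] = "LOG_ERROR"

from aws_xray_sdk.core import xray_recorder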

Why might an AWS Lambda function run in docker-lambda, but crashes on AWS?

There is a set of Docker images available called docker-lambda that aims to replicate the AWS Lambda environment as closely as possible. It's third-party, but it's somewhat endorsed by Amazon and used in their SAM Local offering. I wrote a Lambda function that invokes an ELF binary, and it works when I run it in a docker-lambda container with the same runtime and kernel as the AWS Lambda environment. When I run the same function on AWS Lambda, it mysteriously crashes with no output. I've given the function more RAM than the process could possibly use, and the dependencies are clearly there. I know that it's not really possible to debug this without access to the binaries, but are there any common/possible reasons why this might be happening?
Additionally, are there any techniques to debug something like this? When running in Docker, I can add a --cap-add SYS_PTRACE flag to invoke a command using strace and see what might be causing an error. I'm not aware of an equivalent debugging technique on AWS Lambda.
I'm willing to bet that the AWS Lambda execution environment isn't playing nicely with that third-party image. I've had no problems with Amazon's amazon/aws-lambda-python:3.8 image.
The only other thing I can think of is tweaking the timeout of your Lambda.

Automatically instrumenting Java application using AWS X-Ray

I am trying to achieve automatic instrumentation of all calls made by AWS SDKs for Java using X-Ray.
The X-Ray SDK for Java automatically instruments all AWS SDK clients when you include the AWS SDK Instrumentor submodule in your build dependencies.
(from the documentation)
I have added these to my POM
aws-xray-recorder-sdk-core
aws-xray-recorder-sdk-aws-sdk
aws-xray-recorder-sdk-spring
aws-xray-recorder-sdk-aws-sdk-instrumentor
and am using e.g. aws-java-sdk-ssm and aws-java-sdk-sqs.
I expected to only have to add the X-Ray packages to my POM and provide adequate IAM policies.
However, when I start my application I get exceptions such as these:
com.amazonaws.xray.exceptions.SegmentNotFoundException: Failed to begin subsegment named 'AWSSimpleSystemsManagement': segment cannot be found.
I tried wrapping the SSM call in a manual segment, and that worked, but then the next call from another AWS SDK immediately threw a similar exception.
How do I achieve the automatic instrumentation mentioned in the documentation? Am I misunderstanding something?
It depends on how you make AWS SDK calls in your application. If you have added the X-Ray servlet to your Spring application per https://docs.aws.amazon.com/xray/latest/devguide/xray-sdk-java-filters.html, then each time your application receives a request, the X-Ray servlet filter will open a segment and store it in the thread serving that request. Any AWS SDK calls you make as part of that request/response cycle will pick up that segment as the parent.
The error you got means that the X-Ray instrumentor tried to record the AWS API call as a subsegment but could not find a parent (that is, which request the call belongs to).
Depending on your use case, you might want to explicitly instrument certain AWS SDK clients and leave others plain, for example if some of those clients make calls in a background worker.