Finding an equivalent of environment variables in AWS Lambda Layers? - amazon-web-services

I am writing a serverless app on AWS.
I have broken up the app into many CloudFormation stacks. I am using CDK (in Python) to create the CF stacks to deploy the app.
A core requirement of my lambda functions, of course, is the ability to log events. To handle this (and all message passing in the app), I have created a custom EventBridge bus in one of my stacks. The name of the event bus is an output of the stack.
Because the logging functionality will be common to many lambda functions, it seems appropriate to put the logging functionality into a lambda layer. I can then have all of my lambda functions implement that layer, and logging will be automagically available to all of my lambda functions.
The problem is that my logging code needs to know which EventBridge event bus to write events to. I don't want to hardcode this, because I may deploy a 'dev' and/or 'test' and/or 'prod' version of the stack simultaneously and the logging layer in each environment needs to log to its environment's logging bus.
Because a lambda layer runs in the context of the parent lambda function, I can't set an environment variable on the layer.
I could set a 'LoggingEventBus' environment variable on all the lambda functions that use this layer. But that seems repetitive and error prone. (Although it's the best solution I can come up with at the moment.)
I could store the event bus in an SMS parameter store. But good Lord, lookup up that parameter every time any function in my app wants to log anything would just be ridiculous. (And I'm back to having to figure out a way for the logger to determine if it's in dev/test/prod to look up the right bus.)
I'm actually considering some build process during 'cdk synth' that modifies the source code of the logging layer and does a string replacement of the event bus name, so that the name actually is hardcoded by the time the code is deployed. But that solution has all kinds of danger signals around it.
Ideally, there would be some kind of 'environment variable' for the layer itself. But no such thing exists, and I acknowledge that such a feature would not be compatible with how most language runtimes work.
How have others solved this? Is there a 'right' answer for putting a setting in a lambda layer?

You are right that there is no concept of environment variables for Lambda Layers.
I could set a 'LoggingEventBus' environment variable on all the
lambda functions that use this layer. But that seems repetitive and
error prone.
We also use Layers for some logic (involving SNS Topics etc) which is common to multiple Lambda functions, and we do it very similar to what you suggested above.
Only difference is that we just set the stage in environment variables, not the values of our SNS Topics (or in your case the EventBridge name). And then let the Layer decide which SNS Topic to use based on the stage set in env.
In our Layer code, we use a config like this -
{
"Prod": {
"SNSTopicARN": "ProdARN",
"OtherConfig": "ProdValue"
},
"Dev": {
"SNSTopicARN": "DevARN",
"OtherConfig": "DevValue"
}
}
Then in the Layer Code, we have
config = configJson.get(env.STAGE)
Not the error-prone I guess :)
And we can add any more number of config variables in the Layer without touching the Lambda functions deployment.

Related

Using CDK to Create a Step Function With Dependencies on Other AWS Resources (Like a Lambda) Owned By Different Projects

We're using AWS Step Functions in our application. We have one step function we're creating with the use of the CDK as part of a deployment of Application A from Repository A. That step function needs to include a lambda function as one of the steps. The problem we're having is that this lambda function is created and maintained independently in a different repository (Repository B). We're not sure the best way to connect one AWS resource (AWS Lambda) with another AWS resource (AWS Step Functions) when the creation of those two resources is happening independently in two different places.
We'd like to not manually create the lambda or step function (or both) in each environment. It's time consuming, prone to error and we're going to have a lot of these situations occur.
Our best thought at the moment is that we could maybe have Application A create the step function, but have it create and reference an empty lambda. Initially the step function won't be fully functional of course, but then when we deploy Application B it could look for that empty lambda function and upload new code to it.
And, so that we don't have an issue where deploying Application B first results in non-working code. We can also handle the opposite condition: Application B could create the lambda function before uploading the code to it if it doesn't already exist. Application A could then look to see if the lambda function already exists when creating the step function and just reference the lambda function in the step function directly.
Concerns with this approach:
This is extra work and adds a lot of complexity to the deployment, so more potential for failure
I'm not sure I can easily look up a lambda function like this anyway (I guess it would have to be by name since we couldn't know what the ARN would be when we're writing the code). But then we have issues if the name changes too, so maybe there's a pre-defined ID or something we could use to look it up instead.
Potential for code failing in production. If when deploying to QA for testing we deploy Application A, then Application B, we really only know that scenario works. If, then, when going to production we deploy them in the opposite order it might break.
What are some good options for this kind of thing because I can't think of anything great. My best idea involves not using lambda at all but instead having the step function step be queueing something up in SQS, then Application B can just read from that queue no problem. It feels like this is a common enough scenario though that there must be some clean way to do it with lambda and I wouldn't want my decisions on what service type I can use in AWS be stymied by deployment feasibility.
Thanks
You can easily include an existing Lambda function in a new CDK-created Step Function. Use the Function.fromFunctionArn static method to get a read-only reference to the Lambda using its ARN. The CDK uses the ARN to add the necessary lambda:InvokeFunction permissions to the Step Functions' assumed role.
import { aws_stepfunctions_tasks as tasks } from 'aws-cdk-lib';
const importedLambdaTask = new tasks.LambdaInvoke(this, 'ImportedLambdaTask', {
lambdaFunction: lambda.Function.fromFunctionArn(
this,
'ImportedFunc',
'arn:aws:lambda:us-east-1:123456789012:function:My-Lambda5C096DFA-RLhGGzBJSnMN'
),
resultPath: '$.importedLambdaTask',
});
If you prefer not to hard code the Lambda ARN int the CDK stack, save the ARN to a SSM Parameter Store Parameter. Then import it into the stack by name and pass it to fromFunctionArn:
const lambdaArnParam = ssm.StringParameter.fromStringParameterName(
this,
'ArnFromParamStore',
'lambda-arn-saved-as-ssm-param'
);
Edit: Optionally add a Trigger construct to your CDK Application A to confirm the existence of the Application B Lambda dependency before deploying. Triggers are a newish CDK feature that let you run Lambda code during deployments. The Trigger Function should return an error if it cannot find the external Lambda, thereby causing Application A's deployment to fail.

What is the proper way to build many Lambda functions and updated them later?

I want to make a bot that makes other bots on Telegram platform. I want to use AWS infrastructure, look like their Lamdba functions are perfect fit, pay for them only when they are active. In my concept, each bot equal to one lambda function, and they all share the same codebase.
At the starting point, I thought to make each new Lambda function programmatically, but this will bring me problems later I think, like need to attach many services programmatically via AWS SDK: Gateway API, DynamoDB. But the main problem, how I will update the codebase for these 1000+ functions later? I think that bash script is a bad idea here.
So, I moved forward and found SAM (AWS Serverless Application Model) and CloudFormatting, which should help me I guess. But I can't understand the concept. I can make a stack with all the required resources, but how will I make new bots from this one stack? Or should I build a template and make new stacks for each new bot programmatically via AWS SDK from this template?
Next, how to update them later? For example, I want to update all bots that have version 1.1 to version 1.2. How I will replace them? Should I make a new stack or can I update older ones? I don't see any options in UI of CloudFormatting or any related methods in AWS SDK for that.
Thanks
But the main problem, how I will update the codebase for these 1000+ functions later?
You don't. You use lambda alias. This allows you to fully decouple your lambda versions from your clients. This works because you are using an alias of your function in your client's code (or api gateway). The alias is fixed and does not change.
However, alias is like a pointer - it can point to different versions of your lambda function. Therefore, when you publish a new lambda version you just point alias to it. Its fully transparent from your clients and their alias does not require any change.
I agree with #Marcin. Also it would be worth checking serverless? Seems like you are still experimenting so most likely you are deploying using bash scripts with AWS SDK/SAM commands. This is fine but once you start getting the gist of what your architecture looks like, I think you will appreciate what serverless can offer. You can deploy/teardown cloudformation stacks in matter of seconds. Also you can use serverless-offline so that you can have a local build of your AWS lambda architecture on your local machine.
All this has saved me hours of grunt work.

Lambda AWS X-Ray. Python SDK - Deactivate Locally

I have a Flask app running as an AWS Lambda Function deployed with Zappa and would like to activate X-Ray to get more information for the different functions.
Activating X-Ray with Zappa was easy enough - it only requires adding this line in the zappa-settings.json:
"xray_tracing": true
Further, I installed the AWS X-Ray Python SDK and added a few decorators to some functions, like this:
#xray_recorder.capture()
When I deploy this as a Lambda function, it all works well. The problem is using the system locally, both when running tests and when running the Flask in a local server instead of as a lambda function.
When I use any of the functions that are decorated either in a test or through the local server, the following exception is thrown:
aws_xray_sdk.core.exceptions.exceptions.SegmentNotFoundException: cannot find the current segment/subsegment, please make sure you have a segment open
Which of course makes sense, because AWS Lambda handles the creation of segments.
Are there any good ways to deactivate capturing locally? This would be useful e.g. for running unit tests locally on functions that I would like to watch in X-Ray.
One of the feature request of this SDK is to have a "disabled" global flag so everything becomes no-ops https://github.com/aws/aws-xray-sdk-python/issues/26.
However, it still depends on what you are testing against. It's good practice to test what actually will be run on Lambda. You can set some environment variables so the SDK thinks it is running on Lambda.
You can see the SDK is looking for two env vars https://github.com/aws/aws-xray-sdk-python/blob/master/aws_xray_sdk/core/lambda_launcher.py. One is LAMBDA_TASK_ROOT set to true so it knows to switch to lambda-mode. The other one is _X_AMZN_TRACE_ID which contains the tracing context normally passed by lambda container.
If you just want to test non-XRay code you can set AWS_XRAY_CONTEXT_MISSING to LOG_ERROR so the SDK doesn't complain on context missing and simply give up capturing wrapped functions. This will run much less code path than mimic lambda behaviors. Ideally it would be better for the lambda local testing tool to be X-Ray friendly. Are you using https://github.com/awslabs/aws-sam-cli? There is already an open issue for this feature https://github.com/awslabs/aws-sam-cli/issues/217

Invoke AWS Lambda with AWS X-Ray locally

Is there a way to invoke lambda with X-Ray by using sam invoke local?
According to the idea which PaulMaddox mentioned,
I have tried the step below, and I don't know whether I misunderstood :
Run a X-Ray Daemon locally (0.0.0.0:2000) by following the document
In my lambda's template.yaml set the ENV AWS_XRAY_DAEMON_ADDRESS: 0.0.0.0:2000
Invoke the function, but still got error Missing AWS Lambda trace data for X-Ray. Expected _X_AMZN_TRACE_ID to be set
Here is part of template.yaml setting, I used the environment variable to set AWS_XRAY_DAEMON_ADDRESS
It would be nice if you could provide more information.
I'm not too familiar with SAM, but...
You need to set the _X_AMZN_TRACE_ID environment variable. Currently, the X-Ray Node SDK works by cross communicating between the Lambda runtime start-up code and the user code.
Lambda starts the segment in their start-up code, records information such as timing and exceptions, and sends the segment to the X-Ray service. Then, it forwards the trace ID/parent ID/sampling decision to the user code via setting the _X_AMZN_TRACE_ID environment variable. This allows the SDK to create a separate subsegment, inferring a connection to the original segment, which gets "weaved" into the original on the service end without actually being directly related. Both are sent out of band, asynchronously from each other.
The _X_AMZN_TRACE_ID variable fits the same format as the tracing header here as discussed here: https://docs.aws.amazon.com/xray/latest/devguide/xray-concepts.html#xray-concepts-tracingheader
If you want to send traces through the Daemon to the X-Ray service, you'll need to figure out how to get SAM to construct this Lambda segment initially and set the _X_AMZN_TRACE_ID prior to importing the SDK.
Since the SDK auto-detects the presence of Lambda (which as I understand, SAM mimics), you'll have to set the _X_AMZN_TRACE_ID variable before importing in the SDK. Which, is kind of a catch-22, because you need to import the SDK (in the non-Lambda mode) to construct the Lambda segment before you can populate the _X_AMZN_TRACE_ID.
The problem lies here: https://github.com/aws/aws-xray-sdk-node/blob/master/packages/core/lib/aws-xray.js#L361
If you flip the SDK into LOG_ERROR mode (ignoring the Lambda errors), create and send the Lambda segment (just manually create a segment, load the generated ID/Parent ID/Sampling into _X_AMZN_TRACE_ID then close the segment) and then clear cache/re-import the SDK after, then that should work.
Otherwise, I suspect there may be some work on the SAM end to have this built in. But, hopefully this work around works.

Force Discard AWS Lambda Container

How to manually forcefully discard a aws lambda function in the cluster using aws console or aws cli for development and testing purposes ?
If you redeploy the function it'll terminate all existing containers. It could be as simple as assigning the current date/time to the description of the Lambda function and redeploying. This will allow you to redeploy as many times as you need because something is unique and it will tear down all existing containers each time you do the deployment.
With that said, Lambda functions are supposed to be stateless. You should keep that in mind when you write your code (eg. avoid using global variables, use random file names if creating something temp, etc). From the sounds of things, I think you might have an issue with your design if you require the Lambda container to be torn down.
If you're using the UI, then a simple way to do this is to add or alter an environment variable on the function configuration page.
When you click "Save" the function will be reloaded.
Note: this won't work if you're using the versioned functions feature.