Force Discard AWS Lambda Container

How can I manually force AWS to discard a Lambda function's container, using the AWS console or the AWS CLI, for development and testing purposes?

If you redeploy the function it'll terminate all existing containers. It could be as simple as assigning the current date/time to the description of the Lambda function and redeploying. This will allow you to redeploy as many times as you need because something is unique and it will tear down all existing containers each time you do the deployment.
With that said, Lambda functions are supposed to be stateless. Keep that in mind when you write your code (e.g. avoid global variables, use random file names for temporary files, etc.). From the sound of things, your design might have an issue if it requires the Lambda container to be torn down.
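The redeploy trick described above can be scripted. This is only a minimal sketch, assuming boto3 and a hypothetical function name "my-function"; force_cold_start is an illustrative helper, not an AWS API. It writes a timestamp into the function description via update_function_configuration, which is the kind of configuration change the answer describes.

```python
from datetime import datetime, timezone


def force_cold_start(lambda_client, function_name):
    """Stamp the current UTC time into the description to trigger a config update.

    `lambda_client` is expected to be a boto3 Lambda client (or any object
    exposing the same `update_function_configuration` method).
    """
    stamp = datetime.now(timezone.utc).isoformat()
    return lambda_client.update_function_configuration(
        FunctionName=function_name,
        Description=f"Redeployed at {stamp}",
    )


# Usage (assumes boto3 is installed and AWS credentials are configured;
# "my-function" is a placeholder name):
#   import boto3
#   force_cold_start(boto3.client("lambda"), "my-function")
```

Passing the client in as an argument keeps the helper easy to try out with a stub before pointing it at a real account.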

If you're using the UI, then a simple way to do this is to add or alter an environment variable on the function configuration page.
When you click "Save" the function will be reloaded.
Note: this won't work if you're using the versioned functions feature.

Related

Using CDK to Create a Step Function With Dependencies on Other AWS Resources (Like a Lambda) Owned By Different Projects

We're using AWS Step Functions in our application. We have one step function we're creating with the use of the CDK as part of a deployment of Application A from Repository A. That step function needs to include a lambda function as one of the steps. The problem we're having is that this lambda function is created and maintained independently in a different repository (Repository B). We're not sure the best way to connect one AWS resource (AWS Lambda) with another AWS resource (AWS Step Functions) when the creation of those two resources is happening independently in two different places.
We'd like to not manually create the lambda or step function (or both) in each environment. It's time consuming, prone to error and we're going to have a lot of these situations occur.
Our best thought at the moment is that we could maybe have Application A create the step function, but have it create and reference an empty lambda. Initially the step function won't be fully functional of course, but then when we deploy Application B it could look for that empty lambda function and upload new code to it.
To avoid the case where deploying Application B first results in non-working code, we can also handle the opposite condition: Application B could create the lambda function, if it doesn't already exist, before uploading the code to it. Application A could then check whether the lambda function already exists when creating the step function and just reference it in the step function directly.
Concerns with this approach:
This is extra work and adds a lot of complexity to the deployment, so more potential for failure
I'm not sure I can easily look up a lambda function like this anyway (I guess it would have to be by name since we couldn't know what the ARN would be when we're writing the code). But then we have issues if the name changes too, so maybe there's a pre-defined ID or something we could use to look it up instead.
Potential for code failing in production. If when deploying to QA for testing we deploy Application A, then Application B, we really only know that scenario works. If, then, when going to production we deploy them in the opposite order it might break.
What are some good options for this kind of thing? I can't think of anything great. My best idea involves not using lambda at all and instead having the step function step queue something up in SQS, so that Application B can just read from that queue. It feels like this is a common enough scenario, though, that there must be some clean way to do it with lambda, and I wouldn't want my choice of AWS service to be stymied by deployment feasibility.
Thanks
You can easily include an existing Lambda function in a new CDK-created Step Function. Use the Function.fromFunctionArn static method to get a read-only reference to the Lambda using its ARN. The CDK uses the ARN to add the necessary lambda:InvokeFunction permissions to the Step Functions' assumed role.
import { aws_lambda as lambda, aws_stepfunctions_tasks as tasks } from 'aws-cdk-lib';
// Wrap the externally owned Lambda in a Step Functions task.
const importedLambdaTask = new tasks.LambdaInvoke(this, 'ImportedLambdaTask', {
  // Read-only reference to the existing function, looked up by its ARN.
  lambdaFunction: lambda.Function.fromFunctionArn(
    this,
    'ImportedFunc',
    'arn:aws:lambda:us-east-1:123456789012:function:My-Lambda5C096DFA-RLhGGzBJSnMN'
  ),
  resultPath: '$.importedLambdaTask',
});
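If you prefer not to hard-code the Lambda ARN in the CDK stack, save the ARN to an SSM Parameter Store parameter. Then import it into the stack by name and pass it to fromFunctionArn:
import { aws_ssm as ssm } from 'aws-cdk-lib';
const lambdaArnParam = ssm.StringParameter.fromStringParameterName(
  this,
  'ArnFromParamStore',
  'lambda-arn-saved-as-ssm-param'
);
// Pass the parameter's value wherever the hard-coded ARN was used ('ImportedFuncFromParam' is an illustrative construct ID).
const importedFunc = lambda.Function.fromFunctionArn(this, 'ImportedFuncFromParam', lambdaArnParam.stringValue);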
Edit: Optionally add a Trigger construct to your CDK Application A to confirm the existence of the Application B Lambda dependency before deploying. Triggers are a newish CDK feature that let you run Lambda code during deployments. The Trigger Function should return an error if it cannot find the external Lambda, thereby causing Application A's deployment to fail.
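The check that such a Trigger Function performs could look roughly like this. It is only a sketch, assuming a Python handler and boto3's get_function call; make_handler and the function name "app-b-function" are hypothetical. If the function is missing, boto3 raises an error, and letting that error propagate is what makes the invocation (and so the deployment) fail.

```python
def make_handler(lambda_client, required_function_name):
    """Build a Lambda-style handler that fails if the dependency is missing."""

    def handler(event, context):
        # With a boto3 client, get_function raises ResourceNotFoundException
        # when the function does not exist. It is deliberately not caught here,
        # so the handler invocation fails.
        response = lambda_client.get_function(FunctionName=required_function_name)
        return {"found": response["Configuration"]["FunctionArn"]}

    return handler


# Usage inside the trigger's code (assumes boto3 and a placeholder name):
#   import boto3
#   handler = make_handler(boto3.client("lambda"), "app-b-function")
```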

Finding an equivalent of environment variables in AWS Lambda Layers?

I am writing a serverless app on AWS.
I have broken up the app into many CloudFormation stacks. I am using CDK (in Python) to create the CF stacks to deploy the app.
A core requirement of my lambda functions, of course, is the ability to log events. To handle this (and all message passing in the app), I have created a custom EventBridge bus in one of my stacks. The name of the event bus is an output of the stack.
Because the logging functionality will be common to many lambda functions, it seems appropriate to put the logging functionality into a lambda layer. I can then have all of my lambda functions implement that layer, and logging will be automagically available to all of my lambda functions.
The problem is that my logging code needs to know which EventBridge event bus to write events to. I don't want to hardcode this, because I may deploy a 'dev' and/or 'test' and/or 'prod' version of the stack simultaneously and the logging layer in each environment needs to log to its environment's logging bus.
Because a lambda layer runs in the context of the parent lambda function, I can't set an environment variable on the layer.
I could set a 'LoggingEventBus' environment variable on all the lambda functions that use this layer. But that seems repetitive and error prone. (Although it's the best solution I can come up with at the moment.)
I could store the event bus name in SSM Parameter Store. But good Lord, looking up that parameter every time any function in my app wants to log anything would just be ridiculous. (And I'm back to having to figure out a way for the logger to determine whether it's in dev/test/prod to look up the right bus.)
I'm actually considering some build process during 'cdk synth' that modifies the source code of the logging layer and does a string replacement of the event bus name, so that the name actually is hardcoded by the time the code is deployed. But that solution has all kinds of danger signals around it.
Ideally, there would be some kind of 'environment variable' for the layer itself. But no such thing exists, and I acknowledge that such a feature would not be compatible with how most language runtimes work.
How have others solved this? Is there a 'right' answer for putting a setting in a lambda layer?
You are right that there is no concept of environment variables for Lambda Layers.
"I could set a 'LoggingEventBus' environment variable on all the lambda functions that use this layer. But that seems repetitive and error prone."
We also use Layers for some logic (involving SNS Topics etc) which is common to multiple Lambda functions, and we do it very similar to what you suggested above.
The only difference is that we set just the stage in the environment variables, not the values of our SNS Topics (or in your case the EventBridge bus name), and then let the Layer decide which SNS Topic to use based on the stage set in the environment.
In our Layer code, we use a config like this -
{
  "Prod": {
    "SNSTopicARN": "ProdARN",
    "OtherConfig": "ProdValue"
  },
  "Dev": {
    "SNSTopicARN": "DevARN",
    "OtherConfig": "DevValue"
  }
}
Then in the Layer Code, we have
config = configJson.get(env.STAGE)
Not that error-prone, I guess :)
And we can add any number of further config variables to the Layer without touching the deployment of the Lambda functions.
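In Python, the stage-based lookup described in this answer might be sketched as follows. The STAGE variable name comes from the answer; the CONFIG dict mirrors the JSON above with its placeholder values, and load_stage_config is a hypothetical helper name.

```python
import os

# Per-stage settings shipped inside the layer; the values are placeholders.
CONFIG = {
    "Prod": {"SNSTopicARN": "ProdARN", "OtherConfig": "ProdValue"},
    "Dev": {"SNSTopicARN": "DevARN", "OtherConfig": "DevValue"},
}


def load_stage_config(environ=os.environ):
    """Return the settings for the stage named in the STAGE environment variable."""
    stage = environ.get("STAGE")
    if stage not in CONFIG:
        # Fail loudly rather than silently logging to the wrong environment.
        raise KeyError(f"Unknown or missing STAGE: {stage!r}")
    return CONFIG[stage]


# Usage in the layer code:
#   topic_arn = load_stage_config()["SNSTopicARN"]
```

Each Lambda function then only needs the single STAGE variable, and new settings can be added to CONFIG without changing the functions.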

What is the proper way to build many Lambda functions and update them later?

I want to make a bot that makes other bots on the Telegram platform. I want to use AWS infrastructure, and it looks like Lambda functions are a perfect fit, since I pay for them only when they are active. In my concept, each bot equals one Lambda function, and they all share the same codebase.
At first I thought of creating each new Lambda function programmatically, but I think this will cause problems later, such as needing to attach many services programmatically via the AWS SDK: API Gateway, DynamoDB. But the main problem is: how will I update the codebase for these 1000+ functions later? I think a bash script is a bad idea here.
So I moved on and found SAM (AWS Serverless Application Model) and CloudFormation, which should help me, I guess. But I can't understand the concept. I can make a stack with all the required resources, but how will I make new bots from this one stack? Or should I build a template and create a new stack for each new bot programmatically via the AWS SDK from this template?
Next, how do I update them later? For example, I want to update all bots that have version 1.1 to version 1.2. How will I replace them? Should I make a new stack, or can I update the older ones? I don't see any options in the CloudFormation UI or any related methods in the AWS SDK for that.
Thanks
"But the main problem is: how will I update the codebase for these 1000+ functions later?"
You don't. You use a Lambda alias. This allows you to fully decouple your Lambda versions from your clients. It works because your client code (or API Gateway) refers to an alias of your function. The alias is fixed and does not change.
However, an alias is like a pointer: it can point to different versions of your Lambda function. Therefore, when you publish a new Lambda version, you just point the alias to it. This is fully transparent to your clients, and their alias reference does not require any change.
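Repointing an alias after a release could be scripted along these lines. It is only a sketch, assuming boto3's publish_version and update_alias calls; release_new_version, "bot-function" and "live" are hypothetical names.

```python
def release_new_version(lambda_client, function_name, alias_name):
    """Publish the function's current code as a new version and repoint the alias.

    `lambda_client` is expected to be a boto3 Lambda client (or a stand-in with
    the same `publish_version` and `update_alias` methods).
    """
    published = lambda_client.publish_version(FunctionName=function_name)
    lambda_client.update_alias(
        FunctionName=function_name,
        Name=alias_name,
        FunctionVersion=published["Version"],
    )
    return published["Version"]


# Usage (assumes boto3, configured credentials, and an existing alias):
#   import boto3
#   release_new_version(boto3.client("lambda"), "bot-function", "live")
```

Clients that invoke the alias pick up the new version without any change on their side.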
I agree with @Marcin. It would also be worth checking out the Serverless Framework. It seems like you are still experimenting, so most likely you are deploying using bash scripts with AWS SDK/SAM commands. This is fine, but once you start getting the gist of what your architecture looks like, I think you will appreciate what Serverless can offer. You can deploy and tear down CloudFormation stacks in a matter of seconds. You can also use serverless-offline to run a local build of your AWS Lambda architecture on your own machine.
All this has saved me hours of grunt work.

Alternative to AWS lambda when deployment package is greater than 250MB?

When I want to launch some code serverless, I use AWS Lambda. However, this time my deployment package is greater than 250MB.
So I can't deploy it on a Lambda...
I want to know what are the alternatives in this case?
I'd question your architecture. If you are running into problems with how AWS has designed a service (i.e. the 250 MB maximum package size for Lambda), it's likely you are using the service in a way it wasn't intended to be used.
An anti-pattern I often see is people stuffing all their code into one function. Similar to how you'd deploy all your code to a single server. This is not really the use case for AWS lambda.
Does your function do one thing? If not, refactor it out into different functions doing different things. This may help remove dependencies when you split into multiple functions.
Another thing to look at is whether you can write the function in a different language (another reason to keep functions small). I once had a Lambda function in Python that went over 250 MB. When I solved the same problem with Node.js, my function size dropped to 20 MB.
One thing you can do is download the dependencies from an S3 bucket to the /tmp folder before running the function logic, and then add that folder to the Python path. This gives you an extra 512 MB by default, although you need to account for the download time on the invocations that have to fetch the dependencies.
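The /tmp approach might be sketched like this, assuming the dependencies are packaged as a zip archive. load_extra_dependencies and fetch_archive are hypothetical names; fetch_archive stands in for whatever performs the download, for example a thin wrapper around boto3's S3 download_file.

```python
import sys
import zipfile
from pathlib import Path


def load_extra_dependencies(fetch_archive, target_dir="/tmp/deps"):
    """Unpack a dependency archive into target_dir once and add it to sys.path.

    `fetch_archive(dest_path)` is a caller-supplied callable that writes the
    zip file to dest_path.
    """
    target = Path(target_dir)
    if not target.is_dir():  # nothing cached in this container yet
        archive = Path(str(target) + ".zip")
        archive.parent.mkdir(parents=True, exist_ok=True)
        fetch_archive(str(archive))
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
        archive.unlink()  # free the space taken by the archive itself
    if str(target) not in sys.path:
        sys.path.insert(0, str(target))
    return str(target)


# Usage at the top of the handler module (bucket and key are placeholders):
#   import boto3
#   s3 = boto3.client("s3")
#   load_extra_dependencies(lambda dest: s3.download_file("my-bucket", "deps.zip", dest))
```

Because the directory check skips the download when the container is reused, only invocations on a fresh container pay the download cost.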

Clearing out tmp folder from AWS Lambda

Hi I have an AWS Lambda environment where the temp directory is now full and I get the following:
java.lang.RuntimeException: java.nio.file.FileSystemException: /tmp/out3786803744412914689: No space left on device
It's serverless so I cannot simply log into the box and delete the contents of the directory.
Is there any way to fix this other than deploying a code change to clear out the temp folder on restart?
When an AWS Lambda function is triggered, a temporary container is created. The Lambda function is then run within the container.
If the Lambda function is triggered many times, multiple containers could be created. For example, if the function takes 5 seconds to run and 10 invocations arrive every second, then around 50 containers might be provisioned at once.
Also, once a function has completed executing, the container might be kept around and used again if the Lambda function is triggered again.
So, there is no single 'server' that is used for the Lambda function. It might be many, or it might be one that is reused.
It is recommended that functions delete their temporary files from /tmp before ending execution. This way, the space will be available for the next execution.
Conversely, you might want to intentionally keep some data in the container for the next execution to act like a cache. For example, if the function downloads some reference data, it will not need to re-download the data the next time if the container is reused.
Bottom line: Program the function to clean-up after itself.
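One way to make a function clean up after itself, sketched in Python (the question uses Java, where a try/finally block that deletes the files would play the same role). The handler below is hypothetical, and the 1 KB write is a stand-in for real intermediate output.

```python
import tempfile
from pathlib import Path


def handler(event, context):
    """Do scratch work in a private directory that is removed on exit."""
    # TemporaryDirectory deletes the directory and everything in it when the
    # with-block ends, even if the body raises. It is created under the
    # system temp location, which is typically /tmp on Lambda.
    with tempfile.TemporaryDirectory() as workdir:
        scratch = Path(workdir) / "out.bin"
        scratch.write_bytes(b"x" * 1024)  # stand-in for real intermediate output
        return {"bytes_written": scratch.stat().st_size, "workdir": workdir}
```

Files written this way never outlive the invocation, so a reused container starts each run with the space free again.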
To add to @John Rotenstein's answer: our Lambdas download a large ML model and move it to /tmp at the start of the invocation.
In Python we do something along the lines of:
import os
# Download only when this container does not already have the model in /tmp.
if not os.path.isdir(f'/tmp/{self.model}'):
    self.download_model()
For our use case this is better than clearing the /tmp dir at the end of the Lambda run, as it reduces the number of calls and downloads required to/from S3, giving a performance boost for warm starts. It also means the Lambdas finish quicker, as they don't need to clean up. The caveat here is that our model is static, so we don't need to worry about cache invalidation. If you need to load frequently changing data, then of course clear the /tmp dir.
You could potentially build a Lambda shell into your Lambda function using (or emulating) the lambdash project on GitHub.
That would allow you to invoke the Lambda with a specific set of parameters that trigger the shell feature and execute whatever shell command you pass to it, e.g. "rm /tmp/*". I would personally only consider doing this in development environments, not in production.
That said, the 'proper' answer is @John Rotenstein's answer.
I believe you could delete the contents of the /tmp folder since it would be isolated to your instance, meaning, everything in the /tmp folder was created by your lambda.
You could also offload this data to some type of storage if it's still relevant, for example:
S3
DynamoDB
Redis