I am trying to pass a Unix timestamp as a parameter in an API GET request to another system to grab data. The parameter needs to be the last time the AWS Lambda function ran. I need to somehow store the last time the Lambda function ran, maybe in an S3 bucket, and also recover that timestamp so I can pass that value along into the next run.
Anyone have any ideas on how to do something like this?
Lambda does not store any last run time between invocations (especially as it's possible there could be concurrent invocations of your Lambda at the same time).
Depending on the use case, if you want your Lambda to both read and write, DynamoDB will probably be your best choice, although you should be aware of the following:
Reads and writes consume capacity units; if you're read- and write-heavy you will need to consider pricing, and enable autoscaling if your load varies.
By default, reads are eventually consistent; if your reads must reflect the most recent write, you will want to use strongly consistent reads.
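A minimal sketch of the DynamoDB approach in Python with boto3, assuming a hypothetical table named lambda-state with a string partition key "id":

```python
import time

import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("lambda-state")  # assumed table, partition key "id"

def get_last_run():
    # Strongly consistent read so we always see the latest write.
    response = table.get_item(
        Key={"id": "my-function-last-run"},
        ConsistentRead=True,
    )
    item = response.get("Item")
    return int(item["timestamp"]) if item else None

def set_last_run():
    table.put_item(Item={"id": "my-function-last-run",
                         "timestamp": int(time.time())})
```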
As an alternative, you could store the value in Parameter Store. It is limited to 1,000 operations per second, so if you are not making frequent requests this provides a very simple implementation.
If you do not need the information within the Lambda itself, you can get it by filtering the CloudWatch Logs that your Lambda produces. This would not be advisable inside the Lambda itself, as it would take longer than either of the above options.
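For completeness, a sketch of reading the last event time from outside the Lambda; the /aws/lambda/&lt;name&gt; log group convention is standard, but treat this as an illustration only:

```python
import boto3

logs = boto3.client("logs")

def last_run_from_logs(function_name):
    # Lambda log groups follow the /aws/lambda/<name> convention.
    streams = logs.describe_log_streams(
        logGroupName=f"/aws/lambda/{function_name}",
        orderBy="LastEventTime",
        descending=True,
        limit=1,
    )["logStreams"]
    if not streams:
        return None
    ts = streams[0].get("lastEventTimestamp")  # milliseconds since the epoch
    return ts / 1000 if ts else None
```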
A quick database you could access from an AWS Lambda function is AWS Systems Manager Parameter Store.
You can store simple information such as configuration settings, URLs to databases and even... the last execution time!
IAM permissions can be used to limit access to specific parameters.
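A minimal sketch of that idea, assuming a hypothetical parameter named /my-app/last-run:

```python
import time

import boto3

ssm = boto3.client("ssm")
PARAM = "/my-app/last-run"  # hypothetical parameter name

def handler(event, context):
    try:
        last_run = int(ssm.get_parameter(Name=PARAM)["Parameter"]["Value"])
    except ssm.exceptions.ParameterNotFound:
        last_run = None  # first run ever

    # ... call the external API, passing last_run as the timestamp ...

    # Record this run's time for the next invocation.
    ssm.put_parameter(Name=PARAM, Value=str(int(time.time())),
                      Type="String", Overwrite=True)
```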
AWS Lambda publishes many metrics for each function to CloudWatch, including invocation time. Are you using boto3 or some other SDK?
Related
We are developing a serverless application. The application has "users" that get added, "groups" that have "permissions" on different "resources". To check whether a user has permission to perform an action on a resource, we will need to do some calculations. (We are using DynamoDB.)
Basically, before every action we will need to check whether the user has permission to perform that particular action on the given resource. I was thinking we could have a Lambda function that checks the cache and, if the value is not in the cache, hits the DB, does the calculation, writes the result to the cache, and returns it.
What kind of cache would be best to use here? We are going to be calling this internally from the backend itself.
Is API Gateway still the way to go?
How about ElastiCache for this purpose? Can we use it without having to configure a VPC? We are trying to avoid using a VPC in our application.
Any better ways?
They are all good options!
ElastiCache is designed for caching data. API Gateway can also cache results.
An alternative is to keep the data "inside" the AWS Lambda function by using global variables. The values remain present the next time the Lambda function is invoked, so you could cache results together with an expiry time. Note, however, that Lambda might launch multiple execution environments if the function is invoked frequently (even in parallel), or discard them if it is not run for some time. Therefore, you might end up with multiple caches.
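A sketch of that pattern in Python; the TTL value and the DB lookup are assumptions:

```python
import time

# Module-level state persists across invocations while the same
# execution environment stays warm; each environment has its own copy.
_cache = {}
TTL_SECONDS = 300  # assumed expiry

def get_permission(user, resource):
    key = (user, resource)
    entry = _cache.get(key)
    if entry and entry["expires"] > time.time():
        return entry["value"]  # cache hit

    value = load_permission_from_db(user, resource)  # hypothetical DB lookup
    _cache[key] = {"value": value, "expires": time.time() + TTL_SECONDS}
    return value

def load_permission_from_db(user, resource):
    # Placeholder for the real DynamoDB query and permission calculation.
    raise NotImplementedError
```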
I'd say the simplest option would be API Gateway's cache.
Where is the permissions map (user <-> resource) stored?
This AWS blog post might be interesting (it's about caching in the Lambda execution environment's memory), because you could use a DynamoDB table for that.
Is there a way to force AWS to execute a Lambda request coming from an API Gateway resource in a certain execution environment? We're in a use case where we use one codebase with various models that are 100-300 MB, so on their own they are small enough to fit in the ephemeral storage, but too big to play well together.
Currently, a second invocation with a different model will use the existing (warmed-up) Lambda function and run out of storage.
I'm hoping to attach something like a parameter to the request that forces Lambda to create parallel versions of the same function for each of the models, so that we don't run over the 512 MB limit and can optimize the cold-boot times, ideally without duplicating the function and having to maintain it in multiple places.
I've tried to investigate Step Functions, but I'm not sure whether parameter-based conditionality is an option there. AWS suggests using EFS to circumvent the ephemeral storage limits, but from what I can find, using EFS will be a lot slower than reading from the ephemeral /tmp directory.
To my knowledge: no. You cannot control the execution environments. The only thing you can do is limit the concurrent executions.
So you never know whether a single Lambda execution environment is serving all your events triggered from API Gateway or several are running in parallel. You also have no control over which execution environment serves the next request.
If your issue is the /tmp directory limit for AWS Lambda, why not try EFS?
The situation:
Currently, I am using an Amazon SQS queue that triggers a Lambda function to process new messages as they arrive. Messages are moved to a DLQ (Dead-Letter Queue) when processing fails.
In order to seed the SQS queue, I use a cron job that runs every day and inserts the available jobs into the queue.
I want to issue a summarizing alert/email once all the new jobs the cron job inserted for the day have been processed, with details about how many jobs succeeded, how many failed, and how many were issued in total that day.
The problem:
As the Lambda functions run separately, and I want to keep it that way, I was wondering what would be the best service for storing the temporary count values (at least two of the three counts are needed: total, succeeded, and failed)?
I was thinking about DynamoDB, but any database seems like overkill for this and won't be cost-effective either. S3 also doesn't seem to be the most practical or preferred option for this type of solution. I could also use SQS (as its "storage" is somewhat suited to cases with relatively small data such as this) with a "count" message that every Lambda function updates, but knowing which Lambda function was the last one would require checking the whole queue, which seems like over-complicating things.
Any other AWS service that comes up to mind?
Here is a good listing of Storage Options in the AWS Cloud (from 2013, but it includes some of the options still available today).
AWS Systems Manager Parameter Store can be used as a 'mini-database'.
It requires AWS credentials to access (which would be available to the Lambda functions or whatever code you are running to perform this check) but has no operational cost.
From PutParameter - AWS Systems Manager:
Parameter Store offers a standard tier and an advanced tier for parameters. Standard parameters have a content size limit of 4 KB and can't be configured to use parameter policies. You can create a maximum of 10,000 standard parameters for each Region in an AWS account. Standard parameters are offered at no additional cost.
You could run into problems if multiple processes try to update the parameters simultaneously, but hopefully your use-case is pretty simple.
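A sketch of a counter kept in Parameter Store, with a hypothetical parameter name. Note that the get-then-put below is not atomic, so concurrent Lambdas can lose increments:

```python
import boto3

ssm = boto3.client("ssm")
PARAM = "/jobs/processed-count"  # hypothetical parameter name

def increment_count():
    # NOTE: this get-then-put is not atomic; two concurrent Lambdas
    # can read the same value and one increment will be lost.
    try:
        count = int(ssm.get_parameter(Name=PARAM)["Parameter"]["Value"])
    except ssm.exceptions.ParameterNotFound:
        count = 0
    ssm.put_parameter(Name=PARAM, Value=str(count + 1),
                      Type="String", Overwrite=True)
    return count + 1
```

If lost updates matter, a DynamoDB atomic counter (UpdateItem with an ADD update expression) avoids the race at the cost of a table.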
First, I have a question about the way Lambda works:
If it's only triggered by one SQS queue and that queue now contains 100 messages, would it sequentially create and tear down 100 Lambda processes, or would it process them in parallel?
My second question is the main one:
The job of my Lambda is to request an access token (for an external service) that expires every hour and, using it, perform some action on that external service.
Now, I want to be able to cache that token and only request a new one every hour, instead of on every invocation of the Lambda.
Given the nature of how Lambda works, is there a way of doing this through code?
How can I make sure all Lambda processes use the same access token?
(I know I can create a new Redis instance and make them all point to it, but I'm looking for a "simpler" solution.)
You can stuff the token in the SSM Parameter Store, and you can encrypt the value. Lambdas can check the last-modified date on the parameter to see when expiration is pending and renew the token. No Redis instance to maintain, and the value is encrypted.
https://docs.aws.amazon.com/systems-manager/latest/userguide/systems-manager-paramstore.html
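A sketch of that approach, assuming a hypothetical SecureString parameter /external/api-token that already exists, and a one-hour token lifetime:

```python
import datetime

import boto3

ssm = boto3.client("ssm")
PARAM = "/external/api-token"  # hypothetical parameter name

def get_token():
    # Assumes the parameter was created beforehand.
    param = ssm.get_parameter(Name=PARAM, WithDecryption=True)["Parameter"]
    age = datetime.datetime.now(datetime.timezone.utc) - param["LastModifiedDate"]
    if age < datetime.timedelta(minutes=55):  # renew a little before the hour
        return param["Value"]

    token = fetch_new_token()  # hypothetical call to the external service
    ssm.put_parameter(Name=PARAM, Value=token,
                      Type="SecureString", Overwrite=True)
    return token

def fetch_new_token():
    # Placeholder for the real token request.
    raise NotImplementedError
```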
You could also use DynamoDB for this. It has lower overhead than Redis since it's serverless. If you have a lot of concurrent Lambdas, it may be preferable to SSM, because you may run into rate limiting on the SSM API. It is a little more work because you have to set up a DynamoDB table.
Another option would be to have a “parent” Lambda function that gets the API token and calls the “worker” Lambdas and passes the token as a parameter.
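A sketch of the parent/worker pattern; the function name and payload shape are assumptions:

```python
import json

import boto3

lambda_client = boto3.client("lambda")

def parent_handler(event, context):
    token = fetch_new_token()  # hypothetical token request
    for job in event["jobs"]:  # assumed payload shape
        # Asynchronous invoke; each worker reads the token from its payload.
        lambda_client.invoke(
            FunctionName="worker-function",  # assumed worker name
            InvocationType="Event",
            Payload=json.dumps({"token": token, "job": job}),
        )

def fetch_new_token():
    # Placeholder for the real token request.
    raise NotImplementedError
```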
Is there a way to dynamically create scheduled Lambda calls in AWS? I have to create many scheduled Lambda calls. I am aware of CloudWatch rules, but there is a limit on how many you can create. I also heard about Cronally, but they have not launched yet, and I'd rather do something like this on my own. I do not see an obvious solution without trade-offs, but does the 'easy way' exist, or does it all depend on the particular application?
The CloudWatch Events docs say the limit of 50 rules per account can be raised on request, so they might be able to raise it high enough for your needs.
Alternatively, you could have one rule that fires a single "scheduler" Lambda function every minute. That scheduler can contain a schedule of which functions get fired at which times and invoke the other Lambda functions accordingly. You could even store the schedule in a DynamoDB table or an S3 bucket, so you don't need to update the Lambda function itself to change the schedule.
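A minimal sketch of such a scheduler, assuming a hypothetical DynamoDB table named schedules whose items hold a function name and an HH:MM minute:

```python
import json
from datetime import datetime, timezone

import boto3

dynamodb = boto3.resource("dynamodb")
lambda_client = boto3.client("lambda")
table = dynamodb.Table("schedules")  # assumed table: {"function", "minute"}

def handler(event, context):
    # Fired once a minute by a single CloudWatch Events rule.
    now = datetime.now(timezone.utc).strftime("%H:%M")
    # Scan pagination is omitted for brevity.
    for item in table.scan()["Items"]:
        if item["minute"] == now:  # e.g. "14:30"
            lambda_client.invoke(
                FunctionName=item["function"],
                InvocationType="Event",  # fire and forget
                Payload=json.dumps({"scheduled": True}),
            )
```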