The benefits of this feature are not clear to me (I didn't find any good documentation):
Is it just faster when you reuse the same zip for many Lambda functions, because you upload it only once and then give each Lambda function the S3 link URL?
If you use an S3 link, will all your Lambda functions be updated automatically with the latest code when you re-upload the zip file? In other words, is the zip file on S3 a "reference" that is read at each call to a Lambda function?
Thank you.
EDIT:
I have been asked "Why do you want the same code for multiple Lambda functions anyway?"
Because I use AWS Lambda with AWS API Gateway so I have 1 project with all my handlers which are actual "endpoints" for my RESTful API.
EDIT #2:
I confirm that uploading a modified version of the zip file to S3 does not change the behavior of the existing Lambda functions.
If an AWS guy reads this message, it would be great to have a kind of batch update feature that updates a set of selected Lambda functions from 1 zip file on S3 in 1 click (or even an "automatic update" feature that detects when the file has been updated ;-))
Let's say you have 50 handlers in 1 project, then you modify something global impacting all of them, currently you have to go through all your lambda functions and update the zip file manually...
The code is imported from the zip to Lambda. It is exactly the same as uploading the zip file through the Lambda console or API. However, if your Lambda function is big (they say >10MB), they recommend uploading to S3 and then using the S3 import functionality because that is more stable than directly uploading from the Lambda page. Other than that, there is no benefit.
So for question 1: no. Why do you want the same code for multiple Lambda functions anyway?
Question 2: If you overwrite the zip you will not update the Lambda function code.
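Since overwriting the zip does not touch the functions, the "batch update" the asker wants has to be scripted. Here is a minimal Python sketch of that idea, assuming boto3 and its Lambda update_function_code call with the S3Bucket/S3Key parameters. The helper name, function names, bucket, and key are all hypothetical, and the client is passed in so the helper can be tried without AWS access.

```python
def update_functions_from_s3(lambda_client, function_names, bucket, key):
    """Point every listed function at the same zip on S3.

    Returns {function_name: CodeSha256} on success, or an "error: ..." string
    for a function whose update failed.
    """
    results = {}
    for name in function_names:
        try:
            response = lambda_client.update_function_code(
                FunctionName=name, S3Bucket=bucket, S3Key=key
            )
            results[name] = response.get("CodeSha256")
        except Exception as exc:  # keep going so one failure doesn't block the rest
            results[name] = f"error: {exc}"
    return results


# Usage (hypothetical names; requires boto3 and AWS credentials):
#   import boto3
#   client = boto3.client("lambda")
#   update_functions_from_s3(client, ["getUser", "listUsers"],
#                            "my-deploy-bucket", "api/handlers.zip")
```

Each function still gets its own copy of the code at update time, so this has to be re-run after every upload.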
To add to other people's use cases, having the ability to update a Lambda function from S3 is extremely useful within an automated deployment / CI process.
The instructions under New Deployment Options for AWS Lambda include a simple Lambda function that can be used to copy a ZIP file from S3 to Lambda itself, as well as instructions for triggering its execution when a new file is uploaded.
As an example of how easy this can make development and deployment, my current workflow is:
I update my Node lambda application on my local machine, and git commit it to a remote repository.
A Jenkins instance picks up the commit, pulls down the appropriate files, adds them into a ZIP file and uploads this to an S3 bucket.
The LambdaDeployment function then automatically deploys this new version for me, without me needing to even leave my development environment.
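For illustration, a deploy-on-upload function could look roughly like the sketch below. This is not the function from the linked post, just a Python approximation assuming boto3 and an S3 "object created" trigger. The naming convention (zip file name equals Lambda function name) is a made-up assumption, as are the helper names.

```python
import os
from urllib.parse import unquote_plus


def function_name_for_key(key):
    """Hypothetical convention: 'builds/my-api.zip' deploys to the function 'my-api'."""
    return os.path.splitext(os.path.basename(key))[0]


def deployments_from_event(event):
    """Yield (function_name, bucket, key) for every .zip named in an S3 event."""
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = unquote_plus(record["s3"]["object"]["key"])  # keys arrive URL-encoded
        if key.endswith(".zip"):
            yield function_name_for_key(key), bucket, key


def handler(event, context, lambda_client=None):
    """Lambda entry point; lambda_client is injectable for local testing."""
    if lambda_client is None:
        import boto3  # bundled with the Lambda Python runtime
        lambda_client = boto3.client("lambda")
    deployed = []
    for name, bucket, key in deployments_from_event(event):
        lambda_client.update_function_code(
            FunctionName=name, S3Bucket=bucket, S3Key=key
        )
        deployed.append(name)
    return deployed
```

The deploying function's role would need permission to update the target functions and to read the bucket.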
To answer what I think is the essence of your question, AWS allows you to use S3 as the origin for your Lambda zip file because uploading large files via your browser can sometimes time out. Also, storing your code on S3 lets you keep it in a central place rather than on your computer, and I'm sure there is a CodeCommit tie-in there as well.
Using the S3 method of uploading your code to Lambda also allows you to upload larger files (AWS has a 10MB limit when uploading via web browser).
#!/bin/bash
cd /your/workspace
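# Zip up the new code, excluding git metadata, bin directories and old zips
# (patterns are quoted so the shell does not expand them before zip sees them)
zip -FSr yourzipfile.zip . -x '*.git*' '*bin/*' '*.zip'
# Update the Lambda function code from the local zip file (copying the zip to the
# S3 bucket used as the CloudFormation CodeUri source is not shown here)
aws lambda update-function-code --function-name arn:aws:lambda:us-west-2:YOURID:function:YOURFUNCTIONNAME --zip-file fileb://yourzipfile.zip
This depends on the AWS CLI being installed and a profile being configured: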
aws --profile yourProfileName configure
I am working on a requirement where I do a multipart upload of a CSV file from an on-prem server to an S3 bucket.
To achieve this, I use AWS Lambda to create a presigned URL and upload the CSV file with that URL. Once the file is in S3, I want it to be moved to an AWS RDS Oracle DB. Initially I was planning to use AWS Lambda for this.
So once the file is in S3, it triggers a Lambda (S3 event) and the Lambda pushes the file to RDS. The problem with this approach is the file size (600 MB).
I am looking for another way: whenever a file is uploaded to S3, it should trigger some AWS service, and that service should push the CSV file to RDS. I have gone through AWS DMS and Data Pipeline, but I could not find any way to automate this migration.
I need to automate this migration on every S3 upload, in a way that is also cost effective.
Set up S3 integration for RDS and build stored procedures (SPROCs) to help automate the load. Details found here.
UPDATE:
It looks like you don't even need to create a SPROC. You can just use the RDS procedure as outlined here.
You would then create an event-driven Lambda function that is triggered on a given S3 event (for example an object PUT, POST or COPY) and receives the S3 metadata needed to access the object. Here is a simple Python example of what that Lambda and its config might look like.
You would then use the metadata passed in the trigger event, as outlined in the Python example, to build your procedure call dynamically and execute it. You can also add any follow-on workflow logic your requirements call for (TASK_ID fetch and operational handling, monitoring, and so on) to the same Lambda function, or separate those concerns into additional Lambdas. Hope this helps!
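To make the "build the procedure call from the event" step concrete, here is a rough sketch. It assumes the procedure meant is rdsadmin.rdsadmin_s3_tasks.download_from_s3 from the RDS for Oracle S3 integration, and that DATA_PUMP_DIR is an acceptable target directory. The helper name and bind names are made up. It only builds the statement and bind values; executing them is left to whichever Oracle driver you use.

```python
from urllib.parse import unquote_plus

# Calls the RDS for Oracle S3 integration download procedure, which returns a task ID.
DOWNLOAD_SQL = (
    "SELECT rdsadmin.rdsadmin_s3_tasks.download_from_s3("
    "p_bucket_name => :bucket, "
    "p_s3_prefix => :prefix, "
    "p_directory_name => :directory) AS task_id FROM dual"
)


def build_download_call(event, directory="DATA_PUMP_DIR"):
    """Return (sql, binds) for the first object named in an S3 event.

    Bind variables are used instead of string formatting so that an odd
    object key cannot alter the SQL statement.
    """
    s3 = event["Records"][0]["s3"]
    binds = {
        "bucket": s3["bucket"]["name"],
        "prefix": unquote_plus(s3["object"]["key"]),  # keys arrive URL-encoded
        "directory": directory,
    }
    return DOWNLOAD_SQL, binds


# Usage inside the Lambda handler (assumes an open DB-API cursor on the RDS instance):
#   sql, binds = build_download_call(event)
#   cursor.execute(sql, binds)
#   task_id = cursor.fetchone()[0]
```

Because the download runs inside RDS, the Lambda only issues the call and never has to hold the 600 MB file itself.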
I have a lot of images that are already uploaded to an AWS S3 bucket. I want to edit and replace those images, and I want to do it on the AWS side, using AWS Lambda.
I can already do the job from my local PC, but it takes a very long time, so I want to do it on the server.
Is it possible?
Unfortunately, directly editing a file in S3 is not supported. Check out the thread. To work around this, you need to download the file to a server or local machine, edit it, and then re-upload it to the S3 bucket. You can also enable versioning on the bucket.
For node js you can use Jimp
For java: ImageIO
For python: Pillow
or you can use any technology to edit it and later upload it using aws-sdk.
For lambda function you can use serverless framework - https://serverless.com/
I made some YouTube videos a while back on how to get started with aws-lambda and serverless:
https://www.youtube.com/watch?v=uXZCNnzSMkI
You can trigger a Lambda using the AWS SDK.
Write a Lambda to process a single image and deploy it.
Then locally use the AWS SDK to list the images in the bucket and invoke the Lambda (asynchronously) for each file using invoke. I would also save somewhere which files have been processed so you can continue if something fails.
Note that the default limit for Lambda is 1000 concurrent executions, so to avoid reaching the limit you can send messages to an SQS queue (which then triggers the Lambda) or just retry when invoke throws an error.
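The local driver script described above might look something like this in Python. It is a sketch assuming boto3's list_objects_v2 paginator and asynchronous invoke. The function name, payload shape, and the in-memory processed set are illustrative choices. Persisting the set between runs (a file, a database table) is left out.

```python
import json


def invoke_for_each_image(s3_client, lambda_client, bucket, function_name, processed):
    """Asynchronously invoke function_name once per object not yet in processed.

    processed is a set of keys handled in earlier runs; it is updated in place
    and returned so the caller can save it and resume after a failure.
    """
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        for obj in page.get("Contents", []):  # empty pages have no "Contents"
            key = obj["Key"]
            if key in processed:
                continue  # already handled in an earlier run
            lambda_client.invoke(
                FunctionName=function_name,
                InvocationType="Event",  # asynchronous: does not wait for the result
                Payload=json.dumps({"bucket": bucket, "key": key}).encode("utf-8"),
            )
            processed.add(key)
    return processed


# Usage (hypothetical names; requires boto3 and AWS credentials):
#   import boto3
#   done = invoke_for_each_image(boto3.client("s3"), boto3.client("lambda"),
#                                "my-images", "process-image", set())
```

A key is marked as processed once the invoke call is accepted, not when the image has actually been rewritten, so the worker Lambda should log or record its own failures.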
Objective:
Whenever an object is stored in the bucket, trigger a batch job (aws batch) and pass the uploaded file url as an environment variable
Situation:
I currently have everything set up. I've got the s3 bucket with cloudwatch triggering batch jobs, but I am unable to get the full file url or to set environment variables.
I have followed the following tutorial: https://docs.aws.amazon.com/batch/latest/userguide/batch-cwe-target.html "To create an AWS Batch target that uses the input transformer".
The job is created and processed in AWS batch, and under the job details, i can see the parameters received are:
S3bucket: mybucket
S3key: view-0001/custom/2019-08-07T09:40:04.989384.json
But the environment variables have not changed, and the file URL does not contain all the other parameters such as access and expiration tokens.
I have also not found any information about what other variables can be used in the input transformer. If anyone has a link to a manual, it would be welcome.
Also, according to the AWS CLI documentation, it is possible to set environment variables when submitting a job, so I guess it should be possible here as well? https://docs.aws.amazon.com/cli/latest/reference/batch/submit-job.html
So the question is, how to submit a job with the file url as an environment variable?
You could accomplish this by triggering a Lambda function off the bucket and generating a pre-signed URL in the Lambda function and starting a Batch job from the Lambda function.
However, a better approach would be to simply access the file within the Batch function using the bucket and key. You could use the AWS SDK for your language or simply use awscli. For example you could download the file:
aws s3 cp s3://$BUCKET/$KEY /tmp/file.json
On the other hand, if you need a pre-signed URL outside of the Batch function, you could generate one with the AWS SDK or awscli:
aws s3 presign s3://$BUCKET/$KEY
With either of these approaches to accessing the file from within the Batch job, you will need to give the instance role of your Batch compute environment IAM access to your S3 bucket.
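For the first option (a Lambda that starts the Batch job), a minimal Python sketch could look like this. It assumes boto3's Batch submit_job with containerOverrides and S3 generate_presigned_url. The environment variable names, the "job-" prefix, and the one-hour expiry are arbitrary illustrative choices.

```python
import re


def submit_job_for_object(batch_client, s3_client, bucket, key, job_queue, job_definition):
    """Submit a Batch job with the object's location passed as environment variables."""
    url = s3_client.generate_presigned_url(
        "get_object", Params={"Bucket": bucket, "Key": key}, ExpiresIn=3600
    )
    # Replace characters outside letters, digits, hyphens and underscores, then
    # cap the length at 128 characters to keep the job name conservative.
    job_name = ("job-" + re.sub(r"[^A-Za-z0-9_-]", "-", key))[:128]
    response = batch_client.submit_job(
        jobName=job_name,
        jobQueue=job_queue,
        jobDefinition=job_definition,
        containerOverrides={
            "environment": [
                {"name": "S3_BUCKET", "value": bucket},
                {"name": "S3_KEY", "value": key},
                {"name": "FILE_URL", "value": url},
            ]
        },
    )
    return response["jobId"]


# Usage from an S3-triggered Lambda (hypothetical queue and definition names):
#   import boto3
#   submit_job_for_object(boto3.client("batch"), boto3.client("s3"),
#                         bucket, key, "my-queue", "my-job-definition")
```

If the job only needs the bucket and key, the presigned URL can be dropped and the job can read the object with its own role.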
We have the following workflow at my work:
Download the data from AWS s3 bucket to the workspace:
aws s3 cp --only-show-errors s3://bucket1
Unzip the data
unzip -q "/workspace/folder1/data.zip" -d "/workspace/folder2"
Run a java command
java -Xmx1024m -jar param1 etc...
Sync the archive back to the s3 target bucket
aws s3 sync --include #{archive.location} s3://bucket
As you can see, downloading the data from the S3 bucket, unzipping it, running a Java operation on it and copying the result back to S3 costs a lot of time and resources.
Hence, we are planning to unzip directly in the S3 target bucket and run the Java operation there. Would it be possible to run the Java operation directly in the S3 bucket? If yes, could you please provide some insights?
It's not possible to run the Java 'in S3', but what you can do is move your Java code to an AWS Lambda function, so that all the work is done 'in the cloud', i.e., there is no need to download to a local machine, process and re-upload.
Without knowing the details of your requirements, I would consider setting up an S3 notification that fires each time a new file is PUT into a particular location, and an AWS Lambda function that is invoked with the details of that new file and writes its results to a different bucket/location.
I have done similar things (though not with java) and have found it rock solid way of processing files.
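As a rough illustration of that setup, here is a Python sketch (not Java, and with no real processing step) of a Lambda that reacts to a new zip, unpacks it in memory and writes the members to another bucket. It assumes boto3; the target bucket name and the "unzipped/" prefix are hypothetical.

```python
import io
import zipfile
from urllib.parse import unquote_plus


def extract_members(zip_bytes):
    """Return {member_name: bytes} for every file (not directory) in a zip archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return {
            info.filename: archive.read(info)
            for info in archive.infolist()
            if not info.is_dir()
        }


def handler(event, context, s3_client=None, target_bucket="my-output-bucket"):
    """Triggered by an S3 notification; s3_client is injectable for local testing."""
    if s3_client is None:
        import boto3  # bundled with the Lambda Python runtime
        s3_client = boto3.client("s3")
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    key = unquote_plus(record["object"]["key"])  # keys arrive URL-encoded
    body = s3_client.get_object(Bucket=bucket, Key=key)["Body"].read()
    written = []
    for name, data in extract_members(body).items():
        # The real per-file processing would go here, before the upload.
        out_key = "unzipped/" + name
        s3_client.put_object(Bucket=target_bucket, Key=out_key, Body=data)
        written.append(out_key)
    return written
```

This reads the whole archive into memory, so it only suits archives that fit within the function's memory and time limits.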
No.
You cannot run code on S3.
S3 is an object store, which doesn't provide any execution environment. To modify a file, you need to download it, modify it and upload it back to S3.
If you need to do operations on files, you can look into using Amazon Elastic File System (EFS), which you can mount on your EC2 instance to do the operations as required.
I'm a newbie with AWS Lambda functions. I used a script with the AWS CLI to create a Lambda function in Node.js. This script has a config file called config.json. After creating the function, I can see the code in the AWS Lambda console, and here comes my doubt. The code has this line:
var config = require('./config.json');
So, where is this "./config.json" file actually stored? Would I be able to edit the contents of config.json after deploying the Lambda function?
Thanks in advance.
So, where this ./config.json file is actually stored?
It should be stored in the same directory as your Lambda handler function. They should be bundled in a zip file and deployed to AWS. If you didn't deploy it that way then that file doesn't currently exist.
If your Lambda function consists of multiple files you will have to bundle your files and deploy it to AWS as a zip file.
You cannot edit the source of external libraries/files via the AWS Lambda web console. You can only edit the source of the Lambda function handler via the web console.
Your files are placed into the directory specified in the environment variable LAMBDA_TASK_ROOT. You can read this via nodejs as process.env.LAMBDA_TASK_ROOT.
The code you deploy, including the config.json file, is read-only, but if you do wish to modify files on the server, you may do so under /tmp. Mind that those changes will only be visible to that single container, for its lifecycle (4m30s - 4hrs). By default, Lambda scales the number of containers up and down automatically, between 0 and your account's concurrency limit.
Global variables are also retained across invocations, so if you read config.json into a global variable, then modify that variable, those changes will persist throughout the lifecycle of the underlying container(s). This can be useful for caching information across invocations, for instance.
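The same caching idea, sketched in Python rather than Node.js: module-level state plays the role of the global variable, and LAMBDA_TASK_ROOT locates the deployed config.json. The function names and the "invocations" counter are only illustrative.

```python
import json
import os

_config = None  # module-level state survives for as long as the container stays warm


def get_config():
    """Load config.json from the deployment directory once per container."""
    global _config
    if _config is None:
        root = os.environ.get("LAMBDA_TASK_ROOT", ".")
        with open(os.path.join(root, "config.json")) as fh:
            _config = json.load(fh)
    return _config


def handler(event, context):
    config = get_config()
    # This change lives only in memory, in this container; the deployed
    # config.json on disk is read-only and is never modified.
    config["invocations"] = config.get("invocations", 0) + 1
    return config["invocations"]
```

Two invocations served by the same warm container would see the counter increase, while a fresh container starts again from the values in the deployed file.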