Situation:
I have a zip file with a JS function in a S3 bucket
The file properties say:
Link:
https://s3.just-an-example-region-for-this-post-1.amazonaws.com/my-bucket/server/func-helloworld.zip
When I create a new Lambda function and chose Upload a .ZIP from Amazon S3 and continue I get:
Trouble uploading file: Invalid S3 URL.
The zip file is accessible for everyone. I can download the zip file.
I can't find a good example on how this link should look like.
I found this: https://forums.aws.amazon.com/thread.jspa?messageID=468968񲟨
But I don't understand where to get my file in a format like mentioned in the thread.
That was fun...
S3 and Lambda need to be in the same region.
I thought just downloading a file from S3 would work no matter which region. Doesn't. Now I know.
I tried it step by step via the web console. Now that I read the CLI docs it says it everywhere... damn. Should have tried the CLI first.
Related
I'm trying to implement a backup mechanism to S3 bucket in my code.
Each time a condition is met I need to upload an entire directory contents to an S3 bucket.
I am using this code example:
https://github.com/aws/aws-sdk-go/tree/c20265cfc5e05297cb245e5c7db54eed1468beb8/example/service/s3/sync
Which creates an iterator of the directory content's and then use s3manager.Upload.UploadWithIterator to upload them.
Everything works, however I noticed it uploads all files and overwrites existing files on the bucket even if they weren't modified since last backup, I only want to upload the delta between each backup.
I know aws cli has the command aws s3 sync <dir> <bucket> which does exactly what I need, however I couldn't find anything equivalent on aws-sdk documentation.
Appreciate the help, thank you!
There is no such feature in aws-sdk. You could instrument it yourself for each file to check the hash of both objects before upload. Or use a community solution https://www.npmjs.com/package/s3-sync-client
I have a repository on Github. It has a zip file with size at 75MB. In my AWS Lambda, I need to pull into the zip files, unzip, and upload into AWS S3 bucket.
What is the best to transport the zip file over to the lambda? I have came up with the following ideas, but I have not find a sounding solution.
User AWS Lambda Layer to store the zip file, but here is 50 MB limit for layer.
Inside the lambda, Git clone the repo, and do unzip, then do analytic work. This will cause a lambda timeout in my case, because my process involved to upload each unzipped file into s3 bucket.
I have an S3 bucket with a bunch of zip files. I want to decompress the zip files and for each decompressed item, I want to create an $file.gz and save it to another S3 bucket. I was thinking of creating a Glue job for it but I don't know how to begin with. Any leads?
Eventually, I would like to terraform my solution and it should be triggered whenever there are new files in the S3 bucket,
Would a Lambda function or any other service be more suited for this?
From an architectural point of view, it depends on the file size of your ZIP files - if the process takes less than 15 minutes, then you can use Lambda functions.
If more, you will hit the current 15 minute Lambda timeout so you'll need to go ahead with a different solution.
However, for your use case of triggering on new files, S3 triggers will allow you to trigger a Lambda function when there are files created/deleted from the bucket.
I would recommend to segregate the ZIP files into their own bucket otherwise you'll also be paying for checking to see if any file uploaded is in your specific "folder" as the Lambda will be triggered for the entire bucket (it'll be negligible but still worth pointing out). If segregated, you'll know that any file uploaded is a ZIP file.
Your Lambda can then download the file from S3 using download_file (example provided by Boto3 documentation), unzip it using zipfile & eventually GZIP compress the file using gzip.
You can then upload the output file to the new bucket using upload_object(example provided by Boto3 documentation) & then delete the original file from the original bucket using delete_object.
Terraforming the above should also be relatively simple as you'll mostly be using the aws_lambda_function & aws_s3_bucket resources.
Make sure your Lambda has the correct execution role with the appropriate IAM policies to access both S3 buckets & you should be good to go.
I am using use an API (https://scihub.copernicus.eu/userguide/OpenSearchAPI) to download a large number (100+) of large files (~5GB each) and I want to store these files on an AWS s3 bucket.
My first iteration was to download the files locally and use AWS CLI to move them to an S3 bucket: aws s3 cp <local file> s3://<mybucket>, and this works.
To avoid downloading locally I used an ec2 instance and basically did the same from there. The problem however is that the files are quite large so I'd prefer to not even have to store the files and use my ec2 instance to kind of stream the files to my S3 bucket.
Is this possible?
You can use a byte array to populate an Amazon S3 bucket. For example, assume you are using the AWS SDK for Java V2. You can put an object into a bucket like this:
PutObjectRequest putOb = PutObjectRequest.builder()
.bucket(bucketName)
.key(objectKey)
.metadata(metadata)
.build();
PutObjectResponse response = s3.putObject(putOb,
RequestBody.fromBytes(getObjectFile(objectPath)));
Notice RequestBody.fromBytes method. Full example here:
https://github.com/awsdocs/aws-doc-sdk-examples/blob/master/javav2/example_code/s3/src/main/java/com/example/s3/PutObject.java
One thing to note however. If your files are really large, you may want to consider uploading in parts. See this example:
https://github.com/awsdocs/aws-doc-sdk-examples/blob/master/javav2/example_code/s3/src/main/java/com/example/s3/S3ObjectOperations.java
This feature is not clear to me about the benefits (I didn't find any good documentation):
Is it just faster in the case you reuse the same zip for many lambda functions because you upload only 1 time and you just give the S3 link URL to each lambda function?
If you use an S3 link, will all your lambda functions be updated with the latest code automatically when you re-upload the zip file, meaning is the zip file on S3 a "reference" to use at each call to a lambda function?
Thank you.
EDIT:
I have been asked "Why do you want the same code for multiple Lambda functions anyway?"
Because I use AWS Lambda with AWS API Gateway so I have 1 project with all my handlers which are actual "endpoints" for my RESTful API.
EDIT #2:
I confirm that uploading a modified version of the zip file on S3 doesn't change the existing lambda functions result.
If an AWS guy reads this message, that would be great to have a kind of batch update feature that updates a set of selected lambda functions with 1 zip file on S3 in 1 click (or even an "automatic update" feature that detects when the file has been updated ;-))
Let's say you have 50 handlers in 1 project, then you modify something global impacting all of them, currently you have to go through all your lambda functions and update the zip file manually...
The code is imported from the zip to Lambda. It is exactly the same as uploading the zip file through the Lambda console or API. However, if your Lambda function is big (they say >10MB), they recommend uploading to S3 and then using the S3 import functionality because that is more stable than directly uploading from the Lambda page. Other than that, there is no benefit.
So for question 1: no. Why do you want the same code for multiple Lambda functions anyway?
Question 2: If you overwrite the zip you will not update the Lambda function code.
To add to other people's use cases, having the ability to update a Lambda function from S3 is extremely useful within an automated deployment / CI process.
The instructions under New Deployment Options for AWS Lambda include a simple Lambda function that can be used to copy a ZIP file from S3 to Lambda itself, as well as instructions for triggering its execution when a new file is uploaded.
As an example of how easy this can make development and deployment, my current workflow is:
I update my Node lambda application on my local machine, and git commit it to a remote repository.
A Jenkins instance picks up the commit, pulls down the appropriate files, adds them into a ZIP file and uploads this to an S3 bucket.
The LambdaDeployment function then automatically deploys this new version for me, without me needing to even leave my development environment.
To answer what I think is the essence of your question, AWS allows you to use S3 as the origin for your Lambda zip file because sometimes uploading large files via your browser can timeout. Also, storing your code on S3 allows you to store it centrally, rather than on your computer and I'm sure there is a CodeCommit tie-in there as well.
Using the S3 method of uploading your code to Lambda also allows you to upload larger files (AWS has a 10MB limit when uploading via web browser).
#!/bin/bash
cd /your/workspace
#zips up the new code
zip -FSr yourzipfile.zip . -x *.git* *bin/\* *.zip
#Updates function code of lambda and pushes new zip file to s3bucket for cloudformation lambda:codeuri source
aws lambda update-function-code --function-name arn:aws:lambda:us-west-2:YOURID:function:YOURFUNCTIONNAME --zip-file file://yourzipfile.zip
Depends on aws-cli install and aws profile setup
aws --profile yourProfileName configure