I have a foo lambda that executes some code by reading some files.
I only want to run the lambda after I upload the 10 required files, which is the tricky part.
10 files are uploaded to an S3 bucket via a Bitbucket pipeline
??? (need to wait for all new CSVs to be uploaded)
Execute foo lambda
If I use an S3 upload trigger it will not work, because it will invoke the lambda 10 times, once for each file upload...
The 10 files already exist in the S3 bucket; I just replace them.
Any ideas how to run only the foo lambda ONCE after the 10 files are uploaded?
The AWS Lambda function will be triggered for every object created in the Amazon S3 bucket.
There is no capability to ask for the Lambda function to run only after 10 files are uploaded.
You will need to add custom code to the Lambda function to determine whether it is 'ready' to trigger your work (and the definition of 'ready' is up to you!).
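Since "ready" is up to you, here is one hedged sketch in Python/boto3. Because the 10 files already exist and are only replaced, counting objects is not enough; this version treats the batch as ready when all 10 CSVs have a recent LastModified. The bucket name, prefix, and 15-minute freshness window are assumptions, not anything from the original setup:

```python
from datetime import datetime, timedelta, timezone

# Hypothetical values -- replace with your own bucket, prefix, and file count.
BUCKET = "my-bucket"
PREFIX = "incoming/"
EXPECTED = 10

def batch_ready(objects, expected=EXPECTED, window=timedelta(minutes=15), now=None):
    """True once `expected` CSVs were all (re)uploaded within `window`."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - window
    fresh = [o for o in objects
             if o["Key"].endswith(".csv") and o["LastModified"] > cutoff]
    return len(fresh) >= expected

def handler(event, context):
    import boto3  # imported here so batch_ready stays testable without AWS
    s3 = boto3.client("s3")
    resp = s3.list_objects_v2(Bucket=BUCKET, Prefix=PREFIX)
    if batch_ready(resp.get("Contents", [])):
        # Run foo's real work here -- only the upload that completes
        # the batch reaches this branch, so it fires once per batch.
        pass
```

The handler still runs on every upload; it just returns early until the event that completes the batch arrives.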
Related
I have a Step Function State Machine that executes some lambdas in a flow.
I only want to run the State Machine after I upload the 10 required files for my flow, which is the tricky part.
10 files are manually uploaded to an S3 bucket
S3 upload triggers a notification to EventBridge (need to wait for all new CSVs to be uploaded)
EventBridge starts the State Machine
I think my current flow will not work because it will invoke the state machine 10 times, once for each file upload...
I know how to use S3 file upload to trigger a state machine like in this example but I don't see how to make sure that ALL the files are uploaded before triggering the state machine ONCE.
Is this possible to achieve? Any ideas?
I think my current flow will not work because it will call the state machine 10 times for each file upload
That's right. Every time an object is uploaded, the S3 event will trigger EventBridge, which will invoke the step function.
To achieve your desired workflow, swap out the EventBridge rule for a Lambda function. Have the Lambda function check the number of objects in the bucket, and if the condition is met (all 10 files have been uploaded), invoke the Step Function directly.
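A minimal sketch of that replacement Lambda in Python/boto3; the state machine ARN and the count of 10 are placeholders. It counts the CSV objects in the bucket and starts the execution only on the upload that completes the set:

```python
# Hypothetical values -- replace with your own state machine ARN and file count.
STATE_MACHINE_ARN = "arn:aws:states:us-east-1:123456789012:stateMachine:my-flow"
EXPECTED = 10

def all_files_present(keys, expected=EXPECTED):
    """True once the listing contains the expected number of CSV files."""
    return sum(1 for k in keys if k.endswith(".csv")) >= expected

def handler(event, context):
    import boto3  # imported here so all_files_present stays testable without AWS
    bucket = event["Records"][0]["s3"]["bucket"]["name"]
    resp = boto3.client("s3").list_objects_v2(Bucket=bucket)
    if all_files_present([o["Key"] for o in resp.get("Contents", [])]):
        # Only the upload that completes the set reaches this branch,
        # so the state machine starts exactly once per batch.
        boto3.client("stepfunctions").start_execution(StateMachineArn=STATE_MACHINE_ARN)
```

The Lambda's execution role needs `states:StartExecution` on the state machine and `s3:ListBucket` on the bucket.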
I have a Lambda function with a trigger on PUT in the root of an S3 bucket. My function processes any file put in the bucket, then moves the file to either a processed or a failed sub-folder depending on success. I want to be able to move files in bulk back to the root (easy to do via the Management Console), but how do I simulate sending an S3 trigger for each file I move so I can re-process it? I see how to do it manually via the console, but I would rather write a script that takes a list of files and simulates a trigger for each one. Can I do this using serverless invoke?
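One way to script this, sketched in Python/boto3 rather than serverless invoke: build a minimal S3 PUT payload per file and invoke the function with it directly. The event shape below is only the subset most handlers read, and the function/bucket names are placeholders:

```python
import json

def s3_put_event(bucket, key):
    """Build a minimal S3 ObjectCreated:Put payload of the shape a handler receives."""
    return {"Records": [{
        "eventSource": "aws:s3",
        "eventName": "ObjectCreated:Put",
        "s3": {"bucket": {"name": bucket}, "object": {"key": key}},
    }]}

def reprocess(function_name, bucket, keys):
    import boto3  # imported here so s3_put_event stays testable without AWS
    client = boto3.client("lambda")
    for key in keys:
        # "Event" = asynchronous invoke, so the script doesn't wait on each file.
        client.invoke(FunctionName=function_name,
                      InvocationType="Event",
                      Payload=json.dumps(s3_put_event(bucket, key)))
```

If your handler reads more fields from the real event (timestamps, ETags, region), add them to the payload builder.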
I am new to some AWS services.
Our source team uploads 2 files to an S3 bucket at some interval. They run one pipeline and upload a file to S3. Then they run another process and upload another file to S3. Both their processes run in parallel, so the files are not uploaded in a specific order.
We need to trigger our Lambda function only when both files have been uploaded to S3.
We tried triggering the Lambda from S3 events and from SNS, but it triggers the Lambda twice because of the 2 S3 events.
What is the best approach to handle this? Any suggestion would be appreciated.
I have multiple related files being uploaded to an S3 bucket as a group, which I want to process using AWS Lambda. For example, inventory.txt, orders.txt, and order_details.txt are received externally into one folder in the S3 bucket. These are part of one batch. Someone else will send the same files in another folder in the same bucket.
I want to process these files (cleanse, combine, etc.) together as a batch, so all 3 files at the same time.
I have dabbled with Lambda on the S3 create-object event, but it gets triggered for each file being uploaded. I want the Lambda to trigger once for the 3 files (and for the additional 3 files in another directory if applicable).
After the upload process completes, make it create a trigger file (a dummy file). For example, your process uploads orders.txt and inventory.txt into s3://my-bucket/today_date/.
Make it create s3://my-bucket/today_date/today_date.complete after the copy is complete.
Add an S3 event trigger to your Lambda function to execute when a .complete file is uploaded to S3, then process the rest of the files in the Lambda and delete the .complete file. The process repeats for the next day.
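A sketch of that trigger-file handler in Python/boto3, assuming the S3 event trigger is configured with a `.complete` suffix filter; the actual cleanse/combine step is left as a placeholder:

```python
from urllib.parse import unquote_plus

def data_keys(keys, complete_key):
    """Files in the same folder as the .complete marker, excluding the marker itself."""
    prefix = complete_key.rsplit("/", 1)[0] + "/"
    return [k for k in keys if k.startswith(prefix) and k != complete_key]

def handler(event, context):
    import boto3  # imported here so data_keys stays testable without AWS
    s3 = boto3.client("s3")
    rec = event["Records"][0]["s3"]
    bucket = rec["bucket"]["name"]
    complete_key = unquote_plus(rec["object"]["key"])  # e.g. today_date/today_date.complete
    prefix = complete_key.rsplit("/", 1)[0] + "/"
    listing = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
    for key in data_keys([o["Key"] for o in listing.get("Contents", [])], complete_key):
        pass  # cleanse/combine each data file here
    s3.delete_object(Bucket=bucket, Key=complete_key)  # reset for the next batch
```

The suffix filter guarantees the handler fires once per batch, on the marker file, not on each data file.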
The benefits of this feature are not clear to me (I didn't find any good documentation):
Is it just faster in the case where you reuse the same zip for many Lambda functions, because you upload it only once and just give the S3 link URL to each Lambda function?
If you use an S3 link, will all your lambda functions be updated with the latest code automatically when you re-upload the zip file, meaning is the zip file on S3 a "reference" to use at each call to a lambda function?
Thank you.
EDIT:
I have been asked "Why do you want the same code for multiple Lambda functions anyway?"
Because I use AWS Lambda with AWS API Gateway so I have 1 project with all my handlers which are actual "endpoints" for my RESTful API.
EDIT #2:
I confirm that uploading a modified version of the zip file to S3 doesn't change the existing Lambda functions' code.
If an AWS guy reads this message, it would be great to have a kind of batch-update feature that updates a set of selected Lambda functions with 1 zip file on S3 in 1 click (or even an "automatic update" feature that detects when the file has been updated ;-))
Let's say you have 50 handlers in 1 project, then you modify something global impacting all of them; currently you have to go through all your Lambda functions and update the zip file manually...
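Until such a feature exists, the loop can be scripted. A sketch in Python/boto3; the client is passed in as a parameter (so it can be faked in tests), and the function names, bucket, and key are whatever your project uses:

```python
def update_all(client, function_names, bucket, key):
    """Point every listed Lambda function at the same deployment zip already on S3."""
    updated = []
    for name in function_names:
        client.update_function_code(FunctionName=name, S3Bucket=bucket, S3Key=key)
        updated.append(name)
    return updated

# Real use, assuming boto3 credentials are configured:
#   import boto3
#   update_all(boto3.client("lambda"), ["handler-a", "handler-b"], "my-bucket", "api.zip")
```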
The code is imported from the zip to Lambda. It is exactly the same as uploading the zip file through the Lambda console or API. However, if your Lambda function is big (they say >10MB), they recommend uploading to S3 and then using the S3 import functionality because that is more stable than directly uploading from the Lambda page. Other than that, there is no benefit.
So for question 1: no. Why do you want the same code for multiple Lambda functions anyway?
Question 2: If you overwrite the zip you will not update the Lambda function code.
To add to other people's use cases, having the ability to update a Lambda function from S3 is extremely useful within an automated deployment / CI process.
The instructions under New Deployment Options for AWS Lambda include a simple Lambda function that can be used to copy a ZIP file from S3 to Lambda itself, as well as instructions for triggering its execution when a new file is uploaded.
As an example of how easy this can make development and deployment, my current workflow is:
I update my Node lambda application on my local machine, and git commit it to a remote repository.
A Jenkins instance picks up the commit, pulls down the appropriate files, adds them into a ZIP file and uploads this to an S3 bucket.
The LambdaDeployment function then automatically deploys this new version for me, without me needing to even leave my development environment.
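The deployment function itself can be very small. A sketch in Python/boto3; the target function name is a placeholder, and the event is the standard S3 ObjectCreated notification for the uploaded zip:

```python
def code_location(event):
    """Pull (bucket, key) of the uploaded deployment zip out of the S3 event."""
    rec = event["Records"][0]["s3"]
    return rec["bucket"]["name"], rec["object"]["key"]

def handler(event, context):
    import boto3  # imported here so code_location stays testable without AWS
    bucket, key = code_location(event)
    boto3.client("lambda").update_function_code(
        FunctionName="my-target-function",  # placeholder: the function being deployed
        S3Bucket=bucket,
        S3Key=key,
    )
```

Its role needs `lambda:UpdateFunctionCode` on the target function, and the target region must match the bucket's region.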
To answer what I think is the essence of your question, AWS allows you to use S3 as the origin for your Lambda zip file because sometimes uploading large files via your browser can timeout. Also, storing your code on S3 allows you to store it centrally, rather than on your computer and I'm sure there is a CodeCommit tie-in there as well.
Using the S3 method of uploading your code to Lambda also allows you to upload larger files (AWS has a 10MB limit when uploading via web browser).
#!/bin/bash
cd /your/workspace
# Zip up the new code, excluding git files, binaries, and old zips
zip -FSr yourzipfile.zip . -x *.git* *bin/\* *.zip
# Push the new zip to the S3 bucket that CloudFormation's lambda CodeUri points at
aws s3 cp yourzipfile.zip s3://YOURBUCKETNAME/yourzipfile.zip
# Update the Lambda's function code from the local zip
aws lambda update-function-code --function-name arn:aws:lambda:us-west-2:YOURID:function:YOURFUNCTIONNAME --zip-file file://yourzipfile.zip
This depends on the AWS CLI being installed and an AWS profile being set up:
aws --profile yourProfileName configure