Is it possible to specify multiple prefixes and suffixes for an S3 Lambda event trigger for object uploads?

I have a lambda function that needs to be invoked when an object is uploaded to a bucket. It needs to be invoked only if the object is uploaded under prefixA/ or prefixB/.
But the trigger configuration only supports a single prefix and a single suffix.
Is it possible to specify multiple prefixes and suffixes?

I have just figured out that I can create two separate triggers, each with its own prefix, i.e. prefixA/ and prefixB/. So I can specify as many prefixes as I want by using separate triggers.
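For reference, a minimal sketch of that setup using boto3; the bucket name and function ARN are placeholders, and note that this call replaces the bucket's entire existing notification configuration:
import boto3

s3 = boto3.client('s3')
fn_arn = 'arn:aws:lambda:us-east-1:123456789012:function:my-function'  # placeholder

# One LambdaFunctionConfiguration per prefix; S3 allows multiple
# configurations as long as their filters do not overlap.
s3.put_bucket_notification_configuration(
    Bucket='my-bucket',
    NotificationConfiguration={
        'LambdaFunctionConfigurations': [
            {
                'Id': 'trigger-prefixA',
                'LambdaFunctionArn': fn_arn,
                'Events': ['s3:ObjectCreated:*'],
                'Filter': {'Key': {'FilterRules': [
                    {'Name': 'prefix', 'Value': 'prefixA/'},
                ]}},
            },
            {
                'Id': 'trigger-prefixB',
                'LambdaFunctionArn': fn_arn,
                'Events': ['s3:ObjectCreated:*'],
                'Filter': {'Key': {'FilterRules': [
                    {'Name': 'prefix', 'Value': 'prefixB/'},
                ]}},
            },
        ]
    },
)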

Related

Negation of a folder in S3 Event Notification

I have an S3 bucket with an event notification set up. The trigger of the event is All object create events and it basically runs a lambda function.
Inside the bucket, I have a bunch of folders, one for each operator in our system.
Inside each operator's folder, there might be a folder called /exports/.
Objective
For objects created within the /exports/ folder (under the operator folder), I do not want to trigger the event.
Is that possible with S3?
No, not supported afaik. You have three options:
Use two buckets, one for the dropped items and one for the exports, where each bucket would have a BLACC/ prefix, a CUCC/ prefix, etc.
Restructure your prefixes within this one bucket to have import/BLACC/, import/CUCC/, export/BLACC/, and export/CUCC/, and configure the upload trigger on the import/ prefix.
Modify your Lambda function to treat objects with exports/ in their key as a no-op (see the sketch below).
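A minimal sketch of the third option, assuming the operator folders sit at the top level of the key (e.g. BLACC/exports/file.csv); process is a hypothetical stand-in for the existing logic:
from urllib.parse import unquote_plus

def lambda_handler(event, context):
    for record in event['Records']:
        # Object keys in S3 event records are URL-encoded.
        key = unquote_plus(record['s3']['object']['key'])
        if '/exports/' in key:
            continue  # no-op for anything under an operator's exports/ folder
        process(key)  # hypothetical: your existing processing logic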

Add multiple suffixes to AWS Lambda?

I am configuring a Lambda function that should take .png and .pdf files from my bucket.
How can I provide the Lambda config with multiple suffixes?
Here is what I want: notifications for both .png and .pdf objects.
Please advise how to do this.
As can be seen from the tooltip description, you can't set multiple suffixes in the AWS console:
Enter a single optional suffix to limit the notifications to objects with keys that end with matching characters.
What you can do is create multiple triggers and define each suffix in a separate S3 trigger.
The documentation has some sample XML configuration showing multiple, non-overlapping suffix/prefix options, but I think it is not possible to set them in the web console.
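Programmatically, the multiple-trigger approach might look like this with boto3 (bucket name and function ARN are placeholders; the call replaces any existing notification configuration):
import boto3

s3 = boto3.client('s3')
fn_arn = 'arn:aws:lambda:us-east-1:123456789012:function:my-function'  # placeholder

s3.put_bucket_notification_configuration(
    Bucket='my-bucket',
    NotificationConfiguration={
        'LambdaFunctionConfigurations': [
            {
                'Id': 'trigger-' + suffix.lstrip('.'),
                'LambdaFunctionArn': fn_arn,
                'Events': ['s3:ObjectCreated:*'],
                'Filter': {'Key': {'FilterRules': [
                    {'Name': 'suffix', 'Value': suffix},
                ]}},
            }
            for suffix in ('.png', '.pdf')  # one trigger per suffix
        ]
    },
)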

AWS - Configure Trigger to detect only directory creation and not file creation

I am setting up a Lambda function to be triggered only when a directory is created in S3, not when a file is created.
Example: {bucket-name}/a/b/c/d/
a, b, c, and d are directories inside the bucket.
I want the Lambda function to be triggered when the key "d" (d is a directory, not a file) gets created.
Based on my research:
Only definite prefixes can be specified; you cannot use a wildcard like {bucket-name}/*/.
There is no specific filter in triggers to check for directory creation; file and directory creation are both treated as the same put-object operation.
I want to trigger only on directory creation at a certain depth. In this example, I do not want to trigger during the creation of a, b, or c; I need to trigger only when d (at the deeper level) is created. Can this be done while setting up a Lambda trigger?
S3 isn't a file system - it is an object store. However, keys that end with a trailing "/" are generally treated as folders, so perhaps that is a way to check.
So I would have my lambda check to see if the object key had a trailing "/", and treat that as the folder creation.
Note that you can create file objects with a trailing "/"; you just can't do that via the console. If you have control over key creation you should be able to avoid that.
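A sketch of that check, including the depth test from the question; handle_folder_created is a hypothetical helper:
def lambda_handler(event, context):
    for record in event['Records']:
        key = record['s3']['object']['key']
        # Folder markers are zero-byte objects whose key ends with "/".
        if not key.endswith('/'):
            continue  # a regular file; ignore it
        # For a key like "a/b/c/d/", count the path segments to react
        # only at the desired depth.
        depth = len(key.rstrip('/').split('/'))
        if depth == 4:
            handle_folder_created(key)  # hypothetical helper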
Edit:
To address the comment that you want the lambda to only trigger when a "folder" is created, not for every file added, this is not currently supported. Unless you are dealing with billions of files, I would not worry too much about the lambda costs. A function that takes 250ms to run with 256MB of RAM will cost you less than $5 per million objects.
Edit, July 2022:
You can accomplish this by adding an event notification on the bucket and putting "/" for the suffix. You will only get notified when a "folder" is created. (And I should also note that the console for S3 now allows creation of "folders")

AWS Lambda function getting called repeatedly

I have written a Lambda function which gets invoked automatically when a file comes into my S3 bucket.
I perform certain validations on this file, modify it, and put the file back at the same location.
Due to this "put", my lambda is called again, and the process goes on until my lambda execution times out.
Is there any way to trigger this lambda only once?
I found an approach where I can store the file name in DynamoDB and check it in the lambda function, but is there another approach that avoids using DynamoDB?
You have a couple of options:
You can put the file at a different location in S3 and delete the original.
You can add a metadata field to the S3 object when you update it, then check for the presence of that field so you know whether you have processed it already (see the sketch below). This might not work perfectly, since S3 does not always provide the most recent data on reads after updates.
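A rough sketch of the metadata approach; the marker name "processed" is arbitrary, and copy_object is used here to rewrite the object in place with new metadata:
import boto3

s3 = boto3.client('s3')

def lambda_handler(event, context):
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        key = record['s3']['object']['key']
        head = s3.head_object(Bucket=bucket, Key=key)
        if head['Metadata'].get('processed') == 'true':
            continue  # already handled; this stops the recursion
        # ... validate and modify the file here ...
        s3.copy_object(
            Bucket=bucket,
            Key=key,
            CopySource={'Bucket': bucket, 'Key': key},
            Metadata={'processed': 'true'},
            MetadataDirective='REPLACE',  # required to change metadata on a self-copy
        )
        # The copy fires one more event, but the marker short-circuits it.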
AWS allows different types of S3 event triggers. You can try playing with s3:ObjectCreated:Put vs s3:ObjectCreated:Post.
You can upload your files in a folder, say
s3://bucket-name/notvalidated
and store the validated in another folder, say
s3://bucket-name/validated.
Update your S3 event notification to invoke your Lambda function whenever there is an ObjectCreated (All) event under the notvalidated/ prefix.
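A sketch of a handler wired to that layout; validate_and_fix is a hypothetical stand-in for the validation logic:
import boto3

s3 = boto3.client('s3')

def lambda_handler(event, context):
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        key = record['s3']['object']['key']  # e.g. notvalidated/report.csv
        body = s3.get_object(Bucket=bucket, Key=key)['Body'].read()
        fixed = validate_and_fix(body)  # hypothetical: your validation logic
        # Writing under validated/ cannot re-trigger this function,
        # because the trigger is filtered on the notvalidated/ prefix.
        s3.put_object(
            Bucket=bucket,
            Key=key.replace('notvalidated/', 'validated/', 1),
            Body=fixed,
        )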
The second answer (Put vs Post) does not seem to be correct - there is not really a concept of an update in S3 in terms of POST or PUT; the request that overwrites an object is the same kind as the one that initially created it. See here for details on the available S3 events.
I had this exact problem last year - I was doing an image resize on PUT, and every time a file was overwritten, the function would be triggered again. My recommended solution would be to have two folders in your S3 bucket - one for the original file and one for the finalized file. You could then create the Lambda trigger with a prefix filter so it only fires for files in the original folder.
The events are triggered in S3 based on whether the object was created via Put, Post, Copy, or CompleteMultipartUpload - all of these operations correspond to ObjectCreated, per the AWS documentation:
https://docs.aws.amazon.com/AmazonS3/latest/dev/NotificationHowTo.html
The best solution is to restrict your S3 object-created event to a particular bucket location (prefix), so that only changes in that location trigger the Lambda function.
You can then write the modified object to some other location that is not configured to trigger the Lambda function.
Hope it helps!

Can I parameterize AWS lambda functions differently for staging and release resources?

I have a Lambda function invoked by S3 put events, which in turn needs to process the objects and write to a database on RDS. I want to test things out in my staging stack, which means I have a separate bucket, different database endpoint on RDS, and separate IAM roles.
I know how to configure the lambda function's event source and IAM stuff manually (in the Console), and I've read about lambda aliases and versions, but I don't see any support for providing operational parameters (like the name of the destination database) on a per-alias basis. So when I make a change to the function, right now it looks like I need a separate copy of the function for staging and production, and I would have to keep them in sync manually. All of the logic in the code would be the same, and while I get the source bucket and key as a parameter to the function when it's invoked, I don't currently have a way to pass in the destination stuff.
For the destination DB information, I could have a switch statement in the function body that checks the originating S3 bucket and makes a decision, but I hate making every function have to keep that mapping internally. That wouldn't work for the DB credentials or IAM policies, though.
I suppose I could automate all or most of this with the SDK. Has anyone set something like this up for a continuous integration-style deployment with Lambda, or is there a simpler way to do it that I've missed?
I found a workaround using Lambda function aliases. Given the context object, I can get the invoked_function_arn property, which has the alias (if any) at the end.
arn_string = context.invoked_function_arn
alias = arn_string.split(':')[-1]
Then I just use the alias as an index into a dict in my config.py module, and I'm good to go.
config[alias].host
config[alias].database
One thing I'm not crazy about is that I have to invoke my function from an alias every time, and now I can't use aliases for any other purpose without affecting this scheme. It would be nice to have explicit support for user parameters in the context object.
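Putting it together, a minimal sketch of the scheme; the hosts, database names, and connect helper are placeholders:
# config.py
from collections import namedtuple

DbConfig = namedtuple('DbConfig', ['host', 'database'])

config = {
    'staging': DbConfig(host='staging-db.example.com', database='app_staging'),
    'prod': DbConfig(host='prod-db.example.com', database='app_prod'),
}

# handler.py
from config import config

def lambda_handler(event, context):
    arn_string = context.invoked_function_arn
    alias = arn_string.split(':')[-1]
    # An unaliased invocation ends with the function name rather than
    # an alias, so fall back to a default configuration.
    db = config.get(alias, config['staging'])
    connect(db.host, db.database)  # hypothetical: open your DB connection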