Is it possible for S3 notifications to SQS to fail? - amazon-web-services

I have set up an S3 bucket to publish a message for each PUT and POST action. Files get uploaded to that bucket using the CLI. It does work, but out of 4 files pushed sequentially, only one triggers a message. I am not sure this has always happened, but it is happening consistently now. Note that it does not happen when I upload files manually (i.e. I always get one message per file).
I have made sure that there is no downstream system processing the messages (as a confirmation, I still see the original message triggered after the first file).
Is there any reason to believe that this AWS feature is not reliable? Since this is unlikely, what could be the problem here?

As suggested by Michael in the comments, the problem was that the bucket only listened for s3:ObjectCreated:Put. All files but the first were being uploaded as multipart uploads, which do not trigger that event.
I modified the bucket to trigger messages on s3:ObjectCreated:* and it now works as expected.
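For anyone scripting this rather than using the console, a minimal sketch of the same fix with boto3 (bucket name and queue ARN are placeholders; note that this call replaces the bucket's entire notification configuration):

```
import boto3

s3 = boto3.client("s3")

s3.put_bucket_notification_configuration(
    Bucket="my-bucket",  # placeholder
    NotificationConfiguration={
        "QueueConfigurations": [
            {
                "QueueArn": "arn:aws:sqs:us-east-1:123456789012:my-queue",  # placeholder
                # Covers Put, Post, Copy and CompleteMultipartUpload alike
                "Events": ["s3:ObjectCreated:*"],
            }
        ]
    },
)
```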

Inspired by RaySF answer, I've fixed the issue directly in the AWS console.
Sign in to the AWS console
S3
Find your bucket and click on it
Properties tab
Events
Edit the related event
Change from PUT to All object create events

S3 change last modified date

I have an S3 bucket that sends event notifications for new objects to SQS. Event notifications are filtered to one folder.
I want to simulate uploading a large number of files at the same time, and the problem is that I need the upload to be fast. The fastest option I found was to upload to another folder in the same S3 bucket and then move the folder into the one with the trigger, but that still copies the files one by one.
Another thing I tried is:
disable event notification
copy files into the target folder
enable event notification
copy each file into itself (which causes the last modified date change and triggers an event notification)
Is there something faster? Or can we change the last modified date and trigger an event notification without copying?
I'm aware I can generate SQS events programmatically, but I want to do some real testing.
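For reference, the self-copy step described above could look roughly like this with boto3 (bucket and prefix names are placeholders). An in-place copy has to change something, hence MetadataDirective="REPLACE"; each copy emits an s3:ObjectCreated:Copy event:

```
import boto3

s3 = boto3.client("s3")
bucket = "my-bucket"        # placeholder
prefix = "watched-folder/"  # placeholder: the prefix the notification filters on

paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
    for obj in page.get("Contents", []):
        s3.copy_object(
            Bucket=bucket,
            Key=obj["Key"],
            CopySource={"Bucket": bucket, "Key": obj["Key"]},
            MetadataDirective="REPLACE",  # forces a new LastModified / object version
        )
```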

AWS CloudWatch Log Trigger for Lambda

I have a problem in AWS regarding CloudWatch Log Triggers.
I have two Lambda functions. One (business-lambda) gets triggered when I upload a file to an S3 bucket. The other (log-lambda) is triggered whenever business-lambda encounters an invalid file and writes an ERROR log entry. I implemented this with a CloudWatch Logs trigger using the filter "?ERROR", with log-lambda subscribed to the log group of business-lambda.
Everything works fine as long as I upload one file at a time or at a maximum of ~3 files at a time.
But when I upload e.g. 10 invalid files at a time the log-lambda doesn't get triggered for all of the files. Instead it only gets triggered for 4-5 of them.
Is there some kind of "Cloudwatch-log-trigger/second" limit?
I found a solution - luk2302 made the correct suggestion in their comment.
In the log-lambda code I was only processing the first entry of an incoming log event. But CloudWatch batches several error-log entries from business-lambda into a single invocation of log-lambda, which I had not taken into account in the log-lambda code.
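A minimal sketch of what the corrected handler might look like, iterating over every entry in the batch (the per-entry processing is a placeholder):

```
import base64
import gzip
import json

def handler(event, context):
    # CloudWatch Logs subscriptions deliver a gzipped, base64-encoded payload
    payload = json.loads(gzip.decompress(base64.b64decode(event["awslogs"]["data"])))
    for log_event in payload["logEvents"]:
        message = log_event["message"]
        # ... process every ERROR entry here, not just payload["logEvents"][0]
        print(message)
```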
Thanks to everybody for their time!

AWS S3 folder put event notification

I've written a function in Python that uploads a folder and its contents to S3. Now I would like S3 to generate an event (so I can send it to a Lambda function). S3 only generates events at the object level; folders on S3 are just a visualization layer, meaning S3 has no internal representation of folders and keys with the same root are simply grouped together. That said, so far I've come up with three approaches that revolve around the idea of a 'poison pill'.
Send a special file at the end of the folder upload process; its creation sends an event to Lambda, which can open the file and read custom directives to act on. This approach seems quite flexible, but it poses serious security concerns (I know ACLs are in place for this reason, but I'm not sure they are enough) and generates some overhead for downloading/uploading/deleting the file from/to local memory.
Map an event to the target Lambdas and fire it directly. The difference with the first approach is simply that in this case I'm not really creating a file on S3, I'm just making S3 believe so. I would use CloudWatch to fire custom S3-object-created events with the name of the folder for Lambda to pick up. This approach feels a little more hacky than the other two, and when I researched the matter it seemed like it shouldn't be possible to generate "mock" events on AWS (i.e. Trigger S3 create event). To my understanding, however, the function put_events should do the trick.
Use SQS to put the folder name into a message that can later be consumed by Lambda. This has some advantages over the other two approaches, since SQS now has a FIFO variant that allows for exactly-once delivery, reprocessing of failures (via a dead-letter queue), etc., but it adds a non-trivial amount of complexity compared to the other approaches.
At this point I'm trying to opt for the most 'correct' approach, and in order to do so I'm trying to weigh pros and cons to make an informed decision, which led me to some questions:
Is there another way I'm missing that does not involve client notification? (All the aforementioned approaches rely on the client sending the notification in one way or another, which is not very "cloudy".)
Is there a substantial difference between approaches 2 and 3, considering that both rely on sending the information in and out of a stream (CloudWatch and SQS respectively)?
Have you considered using the prefix option of the S3 bucket event? I tested it and it worked fine. In my S3 bucket I created two folders, test1 and test2. On the S3 event I added the prefix test1; with that in place, every time a put/copy operation happens under that prefix, the Lambda is triggered.
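A rough sketch of that prefix filter configured with boto3 (bucket name, Lambda ARN and the test1/ prefix are placeholders):

```
import boto3

s3 = boto3.client("s3")

s3.put_bucket_notification_configuration(
    Bucket="my-bucket",  # placeholder
    NotificationConfiguration={
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": "arn:aws:lambda:us-east-1:123456789012:function:my-fn",  # placeholder
                "Events": ["s3:ObjectCreated:*"],
                # Only keys starting with test1/ will fire the function
                "Filter": {"Key": {"FilterRules": [{"Name": "prefix", "Value": "test1/"}]}},
            }
        ]
    },
)
```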
I think your question nets down to "how can I trigger a Lambda function after I have uploaded a folder full of files to S3?"
Unless you have some information a priori server-side that you can use to determine when the folder upload has completed, the client is going to have to tell you.
Options I would consider:
change your client to publish a message to SNS or to SQS upon the completion of uploading to S3. That message can then trigger your Lambda function.
after the last file has been uploaded to folder images/dogs/, upload a zero-sized object whose key is the same as the folder (images/dogs/). This is a 'sentinel file'. Use an S3 event trigger with suffix of / to detect the upload of that 'folder' object and trigger your Lambda.
I prefer the 1st option. It achieves the end goal without resulting in extraneous S3 objects. With SNS you can also configure multiple downstream processes in response to the ‘finished upload’ message (a fan out) if needed.
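A rough sketch of the 1st option, assuming a client written with boto3 (bucket, folder, topic ARN and the file list are placeholders):

```
import json
import boto3

s3 = boto3.client("s3")
sns = boto3.client("sns")

bucket = "my-bucket"        # placeholder
folder = "images/dogs/"     # placeholder
files = ["a.jpg", "b.jpg"]  # placeholder list of local files

for name in files:
    s3.upload_file(name, bucket, folder + name)

# Only after the last upload succeeds, tell downstream consumers the folder is complete
sns.publish(
    TopicArn="arn:aws:sns:us-east-1:123456789012:folder-uploads",  # placeholder
    Message=json.dumps({"bucket": bucket, "prefix": folder, "count": len(files)}),
)
```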

AWS Lambda function getting called repeatedly

I have written a Lambda function which gets invoked automatically when a file comes into my S3 bucket.
I perform certain validations on this file, make particular modifications, and put the file back at the same location.
Due to this "put", my Lambda is called again and the process goes on until my Lambda execution times out.
Is there any way to trigger this lambda only once?
I found an approach where I can store the file name in DynamoDB and apply a check in the Lambda function, but is there any other approach where the use of DynamoDB can be avoided?
You have a couple options:
You can put the file in a different location in S3 and delete the original.
You can add a metadata field to the S3 object when you update it, then check for the presence of that field so you know whether you have processed it already. This might not work perfectly, since S3 does not always provide the most recent data on reads after updates.
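A minimal sketch of the metadata-flag idea with boto3 (the "processed" key name is an arbitrary placeholder, not an S3 convention):

```
import boto3

s3 = boto3.client("s3")

def already_processed(bucket, key):
    head = s3.head_object(Bucket=bucket, Key=key)
    return head.get("Metadata", {}).get("processed") == "true"

def mark_processed(bucket, key):
    # Re-writing the object with REPLACE keeps the body but swaps the metadata;
    # note this copy itself emits another ObjectCreated event, which the flag absorbs.
    s3.copy_object(
        Bucket=bucket,
        Key=key,
        CopySource={"Bucket": bucket, "Key": key},
        Metadata={"processed": "true"},
        MetadataDirective="REPLACE",
    )
```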
AWS allows different types of S3 event triggers. You can try playing with s3:ObjectCreated:Put vs s3:ObjectCreated:Post.
You can upload your files to a folder, say
s3://bucket-name/notvalidated
and store the validated files in another folder, say
s3://bucket-name/validated.
Update your S3 event notification to invoke your Lambda function whenever there is an ObjectCreated(All) event under the notvalidated/ prefix.
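A minimal sketch of a handler under that layout, assuming the trigger is filtered to the notvalidated/ prefix (the validation itself is a placeholder):

```
import urllib.parse
import boto3

s3 = boto3.client("s3")

def handler(event, context):
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event records
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()

        # ... validate / modify `body` here ...

        # Writing under validated/ does not re-trigger this function,
        # because the trigger only listens on the notvalidated/ prefix.
        new_key = key.replace("notvalidated/", "validated/", 1)
        s3.put_object(Bucket=bucket, Key=new_key, Body=body)
```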
The second answer does not seem to be correct (put vs post) - there is not really a concept of update in S3 in terms of POST or PUT. The request to update an object will be the same as the initial POST of the object. See here for details on the available S3 events.
I had this exact problem last year: I was doing an image resize on PUT, and every time a file was overwritten the function would be triggered again. My recommended solution would be to have two folders in your S3 bucket, one for the original file and one for the finalized file. You could then create the Lambda trigger with a prefix filter so it only fires for files in the original folder.
Events are triggered in S3 when an object is created via Put, Post, Copy, or CompleteMultipartUpload; all of these operations correspond to ObjectCreated as per the AWS documentation.
https://docs.aws.amazon.com/AmazonS3/latest/dev/NotificationHowTo.html
The best solution is to restrict your S3 object-created event to a particular bucket location (prefix), so that only changes in that location trigger the Lambda function.
You can then write the modified file to some other location that is not configured to trigger the Lambda function when an object is created there.
Hope it helps!

AWS Lambda: error creating the event source mapping: Configuration is ambiguously defined

There was an error creating the event source mapping: Configuration is ambiguously defined. Cannot have overlapping suffixes in two rules if the prefixes are overlapping for the same event type.
I created the event from the GUI console 6-7 days ago and it was working fine. The next day the event was just missing; I couldn't see it anymore in the Lambda console GUI, but every S3 object still seemed to trigger the Lambda function. If I can't see it, that is not good, so I deleted the Lambda function and waited 5-10 seconds before creating a new one. Now I receive the error above when I try to create the event source.
When I click "Submit", the event sources tab says "You do not have any event sources for this function" and the Lambda does not get triggered, which means the entire application flow is now broken :(
The problem is almost the same as https://forums.aws.amazon.com/thread.jspa?messageID=670712&#670712, but somehow I can't reply to that thread, so I created a new thread here instead. Has anyone encountered this issue?
In fact, I tried to respond to the existing AWS forum thread https://forums.aws.amazon.com/thread.jspa?messageID=670712&#670712,
but I keep getting this funny error: "Your message quota has been reached. Please try again later." I wasn't even posting anything, so how could I have used up my quota?
What I suspect is your S3 bucket may still be "linked" to the lambda function.
Maybe check your S3 bucket for events and remove them there, then try creating the lambda events again?
i.e. S3 bucket-> properties-> Events
After 6 years, it's nice to see people still benefiting from this answer.
Here is a shameless plug for a YouTube video I uploaded on 2022-12-13.
https://www.youtube.com/watch?v=rjpOU7jbgEs
The issue must be that the S3 bucket is already linked with the suffix/prefix you are trying to use. Remove the link in S3 and try again.
When you set up a Lambda function and add a trigger related to S3, the notification is recorded in the Properties section of that S3 bucket.
The error above occurs when the earlier Lambda function has been deleted and you try to set up the same kind of trigger again: the S3 notification is not removed when you delete the Lambda function.
Go to S3 bucket > Properties > Event notifications,
delete the old setting, and then set up the trigger again on the new Lambda function.
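A hedged sketch of the same cleanup with boto3 (bucket name is a placeholder; passing an empty configuration removes all of the bucket's event notifications, so use with care):

```
import boto3

s3 = boto3.client("s3")
bucket = "my-bucket"  # placeholder

# See which notifications are still attached to the bucket
print(s3.get_bucket_notification_configuration(Bucket=bucket))

# Remove them all, then recreate the trigger from the new Lambda function
s3.put_bucket_notification_configuration(Bucket=bucket, NotificationConfiguration={})
```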
Here is a link to a youtube video profiling this issue and demonstrating the solution:
https://www.youtube.com/watch?v=1Tfmc9nEtbU
Just as Ridwaan Manuel said, you must remove the events by going to S3 bucket -> Properties -> Events, as the video shows.
Steps to reproduce this issue:
Create a bucket and create a folder called “example/”
Create Lambda Function
Add S3 trigger to the lambda using the bucket from (1) with default settings
Save the trigger
Click Save and notice error
Refresh the page and notice that the triggers disappeared
Add the same bucket again and notice the ambiguous reference error