There is a use case I'm working on and I'm not quite sure how it can be solved. The main goal is to upload an image from a React Native app to an Amazon S3 bucket through an AWS Lambda function (behind API Gateway), so that Amazon Rekognition can be run against it and another image in Amazon S3, selected by some values sent to the Lambda.
Since the image could be too large, I have to use presigned URLs. That means I make a request to the Lambda, which returns a presigned S3 URL to the client, and the client then uploads the image straight to the bucket. But then, how can I use the Rekognition face service within the AWS Lambda?
I know I can trigger a Lambda after an S3 upload, so I could make the Rekognition request right after the user's HTTP request to the presigned URL completes. But how can I get the Rekognition response from that triggered Lambda back to the original user?
I've thought about SNS, but sending some text message to the user after an image upload instead of a message in the app seems odd.
Thank you in advance and apologies for the long read
You're on the right track with SNS, maybe not with the service, but with the principle.
This is a problem of handling requests asynchronously and then informing the user of the server's decision. To start, your async process will need to store the result of the facial recognition somewhere. If you're already using a database (SQL or NoSQL), that would seem to be the place to do this.
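A minimal sketch of that storage step, assuming Python and boto3-style clients that are passed in as parameters. The choice of `compare_faces`, the reference image location, and the attribute names `upload_key`, `matched` and `similarity` are assumptions made for illustration, not something the question specifies.

```python
from decimal import Decimal
from urllib.parse import unquote_plus


def handle_upload(event, rekognition, table, ref_bucket, ref_key, threshold=90.0):
    """Compare each uploaded image with a reference image and store the outcome.

    `rekognition` is expected to behave like boto3.client("rekognition") and
    `table` like a boto3 DynamoDB Table resource; both are injected so the
    function can be exercised without AWS access.
    """
    results = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])

        response = rekognition.compare_faces(
            SourceImage={"S3Object": {"Bucket": bucket, "Name": key}},
            TargetImage={"S3Object": {"Bucket": ref_bucket, "Name": ref_key}},
            SimilarityThreshold=threshold,
        )
        matches = response.get("FaceMatches", [])
        best = max((m["Similarity"] for m in matches), default=0.0)

        item = {
            "upload_key": key,  # hypothetical partition key
            "matched": bool(matches),
            # The DynamoDB resource API rejects floats, so store a Decimal.
            "similarity": Decimal(str(best)),
        }
        table.put_item(Item=item)
        results.append(item)
    return results
```

Keying the stored item by the object key means the client, which already knows the key it uploaded to, can look the result up later without any extra identifier.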
Then you have to get the information to the user. Since your user is running a mobile application, there are only two ways of doing this. Either the user will have to poll the back-end service in order to retrieve the result of the async process, or your back-end will need to push the result to the device. Polling the service is straightforward and is usable depending on the load you expect from your application and the duration of the asynchronous process. You can also use long polling to reduce the number of requests, but this doesn't fix the issue (too many users spamming your service waiting for the result) itself.
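The polling option can be sketched as a loop with exponential backoff. This is an illustration in Python rather than the app's own language, and `fetch` is a hypothetical callable standing in for whatever HTTP request returns the stored result (or `None` while it is not ready).

```python
import time


def poll_for_result(fetch, max_attempts=6, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call `fetch()` until it returns a non-None result or attempts run out.

    The delay doubles after every empty response (capped at `max_delay`),
    which keeps waiting clients from hitting the back end at a fixed rate.
    """
    delay = base_delay
    for attempt in range(max_attempts):
        result = fetch()
        if result is not None:
            return result
        # No point sleeping after the final attempt.
        if attempt < max_attempts - 1:
            sleep(delay)
            delay = min(delay * 2, max_delay)
    return None
```

Backoff reduces the request rate per client, but as noted above it does not remove the underlying cost of many clients waiting at once.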
If you want to notify the users, you will have to create a notification mechanism that is not based on polling a service. You could for example make use of WebSockets, configure your devices to have an MQTT connection (e.g., with AWS IoT) or use another cloud-based notification service that allows you to push messages to the device. You also do not have to include all the information in the message you push to your devices. The pushed message can be a trigger for the device to retrieve the result from the back-end service e.g., using an HTTP API.
There are close to 100,000 devices that are generating logs (total of 10-20 TB a day) which I would like them to directly upload to kinesis. How do I control access? IAM only lets me create a max of 1000 users per account (I know we can request user limit increase), but would like to know what is a better way to do this.
One requirement is, I would like to be able to grant/revoke access to kinesis per device.
Since you have IoT Core already, I think that I would first try to leverage it for logging. This will let you take advantage of the certificate-based authorization that's built into IoT Core, and I know that you can hook an IoT topic into a Kinesis stream.
If you feel that this would be too much volume (and perhaps too expensive based on the number of messages and rules), then I'd provide my devices with temporary security credentials that let them write to Kinesis and nothing else.
You would generate these credentials on a per-device basis (as far as I can tell, there are no quotas on the number of credentials per account), using a scheduled job, either in Lambda or on ECS. This job would iterate through your devices and generate a set of credentials for each. It would then either publish these credentials to the device via IoT Core, or update the device shadow.
The device could then use these credentials to create a Kinesis client to publish log messages. The device would have to create a new client whenever it receives new credentials.
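One way the scheduled job might mint per-device credentials is with STS `assume_role` and a session policy. This is a sketch under assumptions: the role, the stream ARN, and the `device-<id>` session naming are hypothetical, and `sts` is any object that behaves like `boto3.client("sts")`.

```python
import json


def kinesis_write_policy(stream_arn):
    """Session policy that narrows the role to writes on one stream."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["kinesis:PutRecord", "kinesis:PutRecords"],
                "Resource": stream_arn,
            }
        ],
    }


def issue_device_credentials(sts, device_id, role_arn, stream_arn, ttl_seconds=3600):
    """Return short-lived credentials for one device.

    The role behind `role_arn` must already allow the Kinesis actions; a
    session policy can only restrict what the role grants, never extend it.
    """
    response = sts.assume_role(
        RoleArn=role_arn,
        RoleSessionName=f"device-{device_id}",  # hypothetical naming scheme
        DurationSeconds=ttl_seconds,
        Policy=json.dumps(kinesis_write_policy(stream_arn)),
    )
    creds = response["Credentials"]
    return {
        "access_key_id": creds["AccessKeyId"],
        "secret_access_key": creds["SecretAccessKey"],
        "session_token": creds["SessionToken"],
        "expires_at": creds["Expiration"].isoformat(),
    }
```

With this shape, revoking a device amounts to leaving it out of the next run, so its credentials lapse at `expires_at`; a shorter `ttl_seconds` tightens that window at the cost of more frequent refreshes.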
As an alternative, if your devices maintain logfiles internally, you could use a similar approach to trigger uploading those files to S3. In that case, rather than publishing temporary credentials, the scheduled task would generate a pre-signed URL for each device and publish it to that device, which would use it to upload its accumulated logs. You'd then need a downstream process to consume the files once they land in S3.
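A sketch of the pre-signed URL step, assuming a boto3-style S3 client is passed in. The `logs/<device>/<batch>.log` key layout and the `batch_id` parameter are hypothetical.

```python
def presigned_log_upload_url(s3, bucket, device_id, batch_id, expires_in=900):
    """Return (key, url) for a one-off PUT of a device's accumulated logs.

    `s3` is expected to behave like boto3.client("s3").
    """
    # Hypothetical key layout: one prefix per device, one object per batch.
    key = f"logs/{device_id}/{batch_id}.log"
    url = s3.generate_presigned_url(
        "put_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,
    )
    return key, url
```

The device then sends an HTTP PUT with the log file as the request body to `url` before it expires; it needs no AWS credentials of its own for that request.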
I have a Node.js backend that sends images to a secondary API for transformations, and those images then appear in an S3 bucket. The problem is that the secondary API doesn't inform my API when the file is created in the bucket.
Is there some sort of long polling available in S3? Spamming GET requests doesn't feel right (and will also get expensive).
I'm considering adding a trigger on new files in s3, that will invoke a lambda that will put a message into some sort of pub/sub message broker and then I could just subscribe to it but this seems a bit too complicated?
From the S3 notification docs you can be notified via:
Amazon Simple Notification Service (Amazon SNS) topic
Amazon Simple Queue Service (Amazon SQS) queue
AWS Lambda
The relative benefits of each one are up to you, but don't poll S3 for changes. Use one of these to be notified of the changes. You can choose to be notified only of new objects or only of deleted objects.
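If the SQS destination is chosen, the backend receives the S3 event as JSON in the message body. A minimal sketch of extracting the new objects from one body, written in Python for illustration (the function name is hypothetical; the field names follow the S3 event message structure):

```python
import json
from urllib.parse import unquote_plus


def created_objects(sqs_message_body):
    """Return (bucket, key) pairs for object-created records in one SQS message body."""
    payload = json.loads(sqs_message_body)
    found = []
    # S3 sends a test message without "Records" when notifications are configured.
    for record in payload.get("Records", []):
        if record.get("eventName", "").startswith("ObjectCreated:"):
            bucket = record["s3"]["bucket"]["name"]
            # Keys are URL-encoded in the event, e.g. spaces arrive as "+".
            key = unquote_plus(record["s3"]["object"]["key"])
            found.append((bucket, key))
    return found
```

The body would typically come from an SQS receive call that uses long polling, which gives the "wait until something happens" behaviour without repeatedly querying S3.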
I currently have a setup where my mobile front-end uploads an image to AWS S3. The S3 upload triggers an AWS Lambda function that starts an AWS Step Functions state machine, which performs various jobs and actions.
I am looking for the best (and most time-efficient) way to get the output at the end of the state machine back to the mobile device.
One way is to monitor the executionARN of the state machine and, when it is completed, fetch the data. This seems to be the case with awslabs lambda-refarch-imagerecognition implementation here. However, my front-end is on mobile and I would rather not have to send and receive many requests to check if the state machine is finished.
Another possible solution is to refactor the process so that the s3 upload is a stand-alone event and, once it has been successful, make an API request to an AWS API-gateway that triggers the step-function. The API POST request will then return the response. The problem here is that the app must wait for the s3 response to proceed with starting the state-machine.
Is there a better way to perform this sequence and receive a response? Ideally, the S3 upload would return the full response from the state machine. This way there is one request (image upload) and one response.
I would use Amazon SNS with push notifications, since you say you want to avoid making "many requests" and polling for responses.
Amazon SNS allows you to publish to a specific topic.
Anything subscribed to the topic receives a message (a stateless update) whenever one is published to it.
The mobile front-end you mention would receive the message as a push notification from the SNS endpoint or topic.
The publish could be triggered when the state machine completes, so the mobile device gets a timely update via a push notification.
This would avoid polling for a response.
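As a rough sketch, the final step of the state machine could be a small function that publishes to the device's SNS platform endpoint. It assumes a boto3-style SNS client is passed in, that the device already has a registered endpoint ARN, and that the platform is GCM/FCM; the payload fields are hypothetical.

```python
import json


def push_result(sns, endpoint_arn, execution_arn, output):
    """Publish a small 'result ready' push message to one device endpoint.

    `sns` is expected to behave like boto3.client("sns"); `endpoint_arn` is
    the platform endpoint registered for the device.
    """
    payload = {"executionArn": execution_arn, "output": output}
    message = {
        # "default" is required when MessageStructure is "json".
        "default": json.dumps(payload),
        # Platform-specific entries are themselves JSON-encoded strings.
        "GCM": json.dumps({"data": payload}),
    }
    return sns.publish(
        TargetArn=endpoint_arn,
        Message=json.dumps(message),
        MessageStructure="json",
    )
```

If the output is large, the push could carry only the execution ARN and let the app fetch the full result with a single follow-up request.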
Is there a way to get notified when an upload to an S3 bucket is completed? The requirement is that I need to provide a link to users once a video has finished uploading to the bucket. At the moment I provide the link 30 minutes after the upload starts by default, whether the video takes 5 minutes or 40 minutes to upload. So is there any API that reports that the upload has been completed?
Notifications can be triggered in Amazon S3 when any of the following occur:
s3:ObjectCreated:*
s3:ObjectCreated:Put
s3:ObjectCreated:Post
s3:ObjectCreated:Copy
s3:ObjectCreated:CompleteMultipartUpload
s3:ObjectRemoved:*
s3:ObjectRemoved:Delete
s3:ObjectRemoved:DeleteMarkerCreated
s3:ReducedRedundancyLostObject
Notifications can be sent via three destinations:
Amazon Simple Notification Service (SNS), which in turn can send notifications via email, HTTP/S endpoint, SMS, mobile push notification
Amazon Simple Queue Service (SQS)
AWS Lambda (not currently available in all regions)
See: Configuring Amazon S3 Event Notifications
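For illustration, the event types and an SQS destination from the lists above could be wired together as follows. This is a sketch assuming a boto3-style S3 client; the queue ARN, the `.mp4` suffix filter, and the function names are hypothetical.

```python
def upload_complete_notification(queue_arn, suffix=".mp4"):
    """Build a notification configuration for finished uploads of video files."""
    return {
        "QueueConfigurations": [
            {
                "QueueArn": queue_arn,
                # Multipart uploads (typical for large videos) only raise the
                # CompleteMultipartUpload event once all parts are assembled.
                "Events": [
                    "s3:ObjectCreated:Put",
                    "s3:ObjectCreated:Post",
                    "s3:ObjectCreated:CompleteMultipartUpload",
                ],
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "suffix", "Value": suffix}]}
                },
            }
        ]
    }


def enable_notifications(s3, bucket, queue_arn):
    """Apply the configuration; `s3` should behave like boto3.client("s3")."""
    s3.put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=upload_complete_notification(queue_arn),
    )
```

Note that this call replaces any notification configuration already on the bucket, and the queue's access policy has to allow S3 to send messages to it.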
The most appropriate choice depends on your programming preference and how your app is written:
Use SNS to push to an HTTP endpoint to trigger some code in your app
Write some code to periodically check an SQS queue
Write a Lambda function in Node.js or Java
Once triggered, your code would then need to identify who uploaded the video, retrieve their user details, then send them an email notification. This would be easiest if you control the key (filename) of the object being uploaded, since this will assist in determining the user to notify.
You can use AWS Lambda to post a message to Amazon SNS (or notify you any other way) when a file is uploaded to S3.
Set up an S3 trigger for your Lambda function. See this tutorial: http://docs.aws.amazon.com/lambda/latest/dg/walkthrough-s3-events-adminuser.html
Inside your Lambda function, send out your notification. You can use SNS, SES, SQS, etc.
There is no direct method that tells you whether an upload to an S3 bucket is complete. Here is a simple approach that I settled on after a lot of research, and it is working correctly.
Follow this link and read the size of the file every 30 seconds or so, as your requirements dictate. When the size has not changed for two consecutive readings, check it once more to be sure, because network congestion could also explain an unchanged size across two consecutive readings.
Looking to build a mobile application that records a session of data. The data needs to be cleansed and then uploaded into an incoming S3 bucket. An event on this bucket then triggers a Lambda function to process the data, and the result is placed into an outgoing S3 bucket. The result is a file whose contents are a single word describing the outcome of the processing. This result then needs to be returned to the device.
I'm looking to architect this using as many AWS services as possible. There does also need to be historical data available to the user(device), to see their previous results.
At the moment, I have the following ideas:
AWS Cognito to authenticate device
Mobile device will process and cleanse data, and again using Cognito authentication, place payload packet into S3 incoming bucket, with the DeviceID making up part of the filename
Process remaining as is with Lambda function, with output being text file, again using DeviceID naming convention
Event trigger on outgoing S3 bucket, with another Lambda function to store the result into DynamoDB. Once stored, send a push notification to the device with the latest result (status)
A small EC2 instance with a custom Node.js admin app to search DynamoDB and view all results, and potentially intercept results (like a workflow) before sent to user. Even possible to trigger final notification to user from admin console
Device application will use AWS SDK to read DynamoDB results historically
Future may incorporate Elastic MapReduce to perform complex queries on results
Solution seems fairly sound, I'm still getting up to speed on all the available AWS services, so not sure if I'm missing anything glaringly obvious.
I'd recommend using Amazon API Gateway in combination with AWS Lambda instead of deploying a standalone instance.
I'd also recommend using SNS for mobile push notifications, since this aligns well with the rest of your architecture.
S3 doesn't seem to be adding much value to this flow unless you are trying to decouple the processing of the input from the ingestion. You can invoke AWS Lambda directly from your mobile app and write the result into DynamoDB directly from the Lambda function.
If you do use S3, I'd recommend using the Cognito ID rather than the device ID as part of the key. The benefits are twofold: you can enable fine-grained access control so that one user can't access another user's S3 objects or rows in DynamoDB, and if a user has multiple devices and you use authenticated users, the user sees the same data as they move between devices.
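The fine-grained access control mentioned above is usually expressed with the Cognito identity policy variable in the IAM role given to authenticated users. A sketch of such a policy, built as a Python dict for illustration; the bucket name, table ARN, chosen actions, and the assumption that the table's partition key is the Cognito identity ID are all hypothetical.

```python
def per_user_policy(bucket, table_arn):
    """IAM policy limiting each Cognito identity to its own S3 prefix and table rows."""
    # IAM substitutes this variable with the caller's Cognito identity ID.
    sub = "${cognito-identity.amazonaws.com:sub}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                # Objects must live under a prefix equal to the identity ID.
                "Resource": f"arn:aws:s3:::{bucket}/{sub}/*",
            },
            {
                "Effect": "Allow",
                "Action": ["dynamodb:GetItem", "dynamodb:Query"],
                "Resource": table_arn,
                # Assumes the table's partition key holds the identity ID.
                "Condition": {
                    "ForAllValues:StringEquals": {"dynamodb:LeadingKeys": [sub]}
                },
            },
        ],
    }
```

Because the restriction is tied to the identity rather than the device, the same policy keeps working when a user signs in on a second device.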