How do I notify a user that a Lambda function has completed?

AWS Lambda makes it possible to run code in response to events, such as a file being uploaded to S3. However, the Lambda callback notifies the event invoker, not the user who initiated the event.
Consider the following scenario:
A user uploads a file to S3
That file is processed
User receives notification that the processing is complete
How would you do this with AWS lambda?

When uploading the file, add the email address or other identifier to the object as Object User-Defined Metadata.
When uploading an object, you can also assign metadata to the object. You provide this optional information as a name-value (key-value) pair when you send a PUT or POST request to create the object. When uploading objects using the REST API, the optional user-defined metadata names must begin with "x-amz-meta-" to distinguish them from other HTTP headers. When you retrieve the object using the REST API, this prefix is returned. When uploading objects using the SOAP API, the prefix is not required. When you retrieve the object using the SOAP API, the prefix is removed, regardless of which API you used to upload the object.
When the Lambda function completes the file processing, it can read that same metadata, and send an appropriate notification to the user.
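A minimal sketch of that round trip in Python, assuming a boto3-style S3 client is passed in; the bucket, key, and the `notify-email` metadata name are all illustrative:

```python
NOTIFY_KEY = "notify-email"  # stored by S3 as the x-amz-meta-notify-email header

def upload_with_notify_address(s3, bucket, key, body, email):
    """At upload time, attach the user's address as user-defined metadata."""
    s3.put_object(Bucket=bucket, Key=key, Body=body,
                  Metadata={NOTIFY_KEY: email})

def notify_address_for(s3, bucket, key):
    """Inside the Lambda function, read the address back after processing."""
    head = s3.head_object(Bucket=bucket, Key=key)
    # boto3 strips the x-amz-meta- prefix and returns plain metadata keys.
    return head["Metadata"].get(NOTIFY_KEY)
```

The Lambda function would then pass the returned address to whatever notification channel you use (SES, SNS, a push service, etc.).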

Related

Best practices of uploading a file to S3 and metadata to RDS?

Context
I'm building a mock service to learn AWS. I want a user to be able to upload a sound file (which other users can listen to). To do this I need the sound file to be uploaded to S3 and metadata such as the file name, the name of the uploader, its length, and the S3 ID to RDS. It is preferable that the user uploads directly to S3 with a signed URL instead of doubling the data transferred by first uploading it to my server and from there to S3.
Optimally this would be transactional, but from what I have gathered there is no built-in functionality for that. To implement this while minimizing the risk of the file being successfully uploaded to S3 without its metadata reaching RDS (and vice versa), my best guess is as follows:
My solution
With words:
First, I attempt to upload the file to S3 with a key (a UUID) that I generate locally or server-side. If this succeeds, I make a request to my API to upload the metadata, including the key, to RDS. If that is unsuccessful, I remove the object from S3.
With code:
uuid = get_uuid_from_server();
s3Client.putObject({.., key: uuid, ..}, function (err, data) {
    if (err) {
        reject(err);
    } else {
        resolve(data);
        // Upload metadata to RDS through an API call to the EC2 server.
        // Remove the S3 object with key `uuid` if that call is unsuccessful.
    }
});
As I'm learning, my approaches are seldom best practice, but I was unable to find any good information on this particular problem. Is my approach/solution above in line with best practices?
Bonus question: is it beneficial for security purposes to generate the file's key (uuid) server-side instead of client-side?
Here are two approaches you can pick from, assuming the client is a web browser or mobile app.
1. Use your server as a proxy to S3.
Your server acts as a proxy between your clients and S3. You have full control of the upload flow: you can restrict the supported file types and inspect file contents (for example, to make sure the file is a valid sound file) before uploading to S3.
2. Use your server to create pre-signed upload URLs
In this approach, your client first asks the server to create one or more (for multipart upload) pre-signed URLs. Clients then upload to S3 using those URLs. Your server can save those URLs to keep track of uploads later.
To be notified when the upload finishes successfully or unsuccessfully, you can either
(1) Ask clients to call another API, e.g. /ack, after the upload finishes for a particular signed URL. If this API is not called after some time, e.g. 1 hour, you can check with S3 and delete the file accordingly. You can do this because you stored the signed URL in your DB at the start of the upload.
or
(2) Make use of S3 events. You can configure an ObjectCreated event in S3, which fires whenever an object is created, send all such events to an SQS queue, and have your server process each event from there. This way, you do not rely on clients to update your server after an upload finishes: S3 will notify your server of every successful upload.
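The event-driven option can be sketched with a small Python helper that would run in whatever worker drains the SQS queue. It assumes the standard S3 event notification message shape (a `Records` array with `eventName`, `s3.bucket.name`, and `s3.object.key`):

```python
import json

def parse_s3_events(sqs_body):
    """Extract (eventName, bucket, key) tuples from an S3 event
    notification message pulled off an SQS queue."""
    event = json.loads(sqs_body)
    return [
        (r["eventName"], r["s3"]["bucket"]["name"], r["s3"]["object"]["key"])
        for r in event.get("Records", [])
    ]
```

Each tuple tells your server which upload finished, so it can write the corresponding metadata row to RDS without trusting the client.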

How to associate an uploaded file with signed url to the original request

I have a cloud function that generates a signed url for user's request, and another function that processes the uploaded file and updates the database.
I'm not sure how to associate the uploaded file to the original request.
So, the first function
receives a request containing a userId.
generates a requestId and the signedUrl
records the requestId in a database for the userId
sends the signedUrl to the client
A second cloud function is triggered when the file is uploaded. How can this function associate the uploaded file with the requestId generated in the first function?
I thought of 2 approaches, but both feel wrong to me:
1.
Encode the requestId into the file name. ⇒ This feels brittle as it relies on naming convention.
2.
Rely on the client: pass the requestId with the signedUrl to the client, and have the client notify the server when the file is uploaded. ⇒ This just has too many failure points.
Is there a proper way to handle this?
The naming convention thing seems fine. An alternative might be to include a mandatory bit of custom object metadata in the signed URL. For example, include this header in your signature:
x-goog-meta-my-request-id:someRequestId
That would require the user uploading the object to set that header, which will cause the object to have the custom metadata value "my-request-id" set to "someRequestId". That value will then be visible in the object metadata sent along with the object finalized message.
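Assuming the google-cloud-storage Python client, the idea might be sketched as below. The `headers` argument to `generate_signed_url` bakes the metadata header into the V4 signature, so an upload that omits the header fails; the names here are illustrative:

```python
from datetime import timedelta

def request_id_header(request_id):
    """Header the uploader must send for the signature to validate;
    GCS stores it as custom object metadata "my-request-id"."""
    return {"x-goog-meta-my-request-id": request_id}

def signed_upload_url(blob, request_id):
    # `blob` is a google-cloud-storage Blob for the target object.
    return blob.generate_signed_url(
        version="v4",
        method="PUT",
        expiration=timedelta(minutes=15),
        headers=request_id_header(request_id),
    )
```

The second cloud function then reads `my-request-id` from the finalized object's metadata and looks up the matching request in the database.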

How to check an AWS S3 key for existence with the AWS CPP SDK?

I use the AWS S3 C++ SDK and have the following scenario:
I get some information sent from a client to my server (client wants to download from S3)
With the information sent, I create an S3 key
I want to check whether that key exists (has a file) on S3
I create a presigned URL that allows the client to download a file from S3
Send URL to client
Client downloads the file
Before I execute step 4, I want to check whether the key really exists on S3; the client can't download a file that does not exist anyway.
I have an AWS::S3Client object. Do I really need to create a TransferManager for this or is there a simple way to handle this with the client object?
The client itself has no connection to S3, so I can't check there; the server has to do all the work.
I found a working solution:
auto client = Aws::MakeShared<Aws::S3::S3Client>("client", getCredentials(), getClientConfig());
Aws::S3::Model::HeadObjectRequest request;
request.WithBucket(<bucketname>).WithKey(<s3key>);
const auto response = client->HeadObject(request);
response.IsSuccess(); // true if the key exists on S3
Issue an authenticated HTTP HEAD request against the object. You can use:
HeadObject
HeadObjectAsync
To quote:
The HEAD operation retrieves metadata from an object without returning the object itself. This operation is useful if you're only interested in an object's metadata. To use HEAD, you must have READ access to the object.
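For comparison, the same HEAD-based check in Python might look like this sketch, with a boto3 S3 client passed in; note that only a 404 means "missing": other errors (e.g. 403 when you lack permission) are re-raised:

```python
def key_exists(s3, bucket, key):
    """HEAD the object: True on success, False if S3 reports it not found."""
    try:
        s3.head_object(Bucket=bucket, Key=key)
        return True
    except s3.exceptions.ClientError as error:
        if error.response["Error"]["Code"] in ("404", "NotFound"):
            return False
        raise  # e.g. 403: lacking permission is not the same as "missing"
```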

ObjectCreated:Post on S3 Console upload?

My S3 Lambda Event listener is only seeing ObjectCreated:Put events when a file is uploaded via the S3 console. This is both for new files and overwriting existing files. Is this the expected behavior?
It seems like a new file upload should generate ObjectCreated:Post in keeping with the POST == Create, PUT == Update norm.
S3 has 4 APIs for object creation:
PUT is used for requests that send only the raw object bytes in the HTTP request body. It is the most common API used for creation of objects up to 5 GB in size.
POST uses specially-crafted HTML forms with attributes, authentication, and a file all as part of a multipart/form-data HTTP request body.
Copy is used where the source bytes come from an existing object already stored in S3 (which incidentally also uses HTTP PUT on the wire, but is its own event type). The Copy API is also used any time you edit the metadata of an existing object: once stored in S3, objects and their metadata are completely immutable. The console allows you to "edit" metadata, but it accomplishes this by copying the object on top of itself (which is a safe operation in S3, even when bucket versioning is not enabled, because the old object is untouched until the new object creation has succeeded) while supplying revised metadata. S3 does not support move or rename -- these are done with a copy followed by a delete. The maximum size of object that can be copied with the Copy API is 5 GB.
Multipart, which is mandatory for creating objects exceeding 5 GB and recommended for multi-megabyte objects. Multipart can be used for objects of any size, but each part (other than the last) must be at least 5 MiB in size, so it is not typically used for smaller uploads. This API also allows safe retrying of any parts that failed, uploading parts in parallel, and has multiple integrity checks to prevent any defects from appearing in the object that S3 reassembles. Multipart is also used to copy large objects.
The console communicates with S3 using the standard public APIs, the same as the SDKs use, and uses either PUT or multipart, depending on the object size, and Copy for editing object metadata, as mentioned above.
For best results, always use the s3:ObjectCreated:* event, unless you have a specific reason not to.
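For reference, a bucket notification configuration subscribing a Lambda function to all object-created events might look like the following sketch (the Id and function ARN are illustrative):

```json
{
  "LambdaFunctionConfigurations": [
    {
      "Id": "notify-on-any-create",
      "LambdaFunctionArn": "arn:aws:lambda:us-east-1:123456789012:function:process-upload",
      "Events": ["s3:ObjectCreated:*"]
    }
  ]
}
```

With the wildcard, the function sees Put, Post, Copy, and CompleteMultipartUpload events alike, so the console's choice of upload API no longer matters.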

AWS S3 Event - Client Identification

I'm looking to allow multiple clients to upload files to an S3 bucket (or buckets). The S3 create event would trigger a notification that would add a message to an SNS topic. This works, but I'm having issues deciding how to identify which client uploaded the file. I could get this to work by explicitly checking the uploaded file's subfolder/S3 name, but I'd much rather automatically add the client identifier as an attribute to the SNS message.
Is this possible? My other thought is using a Lambda function as a middle man to add the attribute and pass it along to the SNS Topic, but again I'd like to do it without the Lambda function if possible.
The Event Message Structure sent from S3 to SNS includes a field:
"userIdentity":{
"principalId":"Amazon-customer-ID-of-the-user-who-caused-the-event"
},
However, this also depends upon the credentials that were used when the object was uploaded:
If users have their individual AWS credentials, then the Access Key will be provided
If you are using a pre-signed URL to permit the upload, then the Access Key will belong to the one used in the pre-signed URL and your application (which generated the pre-signed URL) would be responsible for tracking the user who requested the upload
If you are generating temporary credentials for each client (e.g. by calling AssumeRole), then the Role's ID will be returned
(I didn't test all the above cases, so please do test them to confirm the definition of Amazon-customer-ID-of-the-user-who-caused-the-event.)
If your goal is to put your own client identifier in the message, then the best method would be:
Configure the event notification to trigger a Lambda function
Your Lambda function uses the above identifier to determine which user identifier within your application triggered the notification (presumably consulting a database of application user information)
The Lambda function sends the message to SNS or to whichever system you wish to receive the message (SNS might not be required if you send directly)
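That Lambda middle-man can be sketched as follows, assuming a boto3 SNS client and the standard S3 event shape; mapping the raw principalId to your own application's client identifier (the database lookup mentioned above) is left as the `clientId` attribute value here:

```python
def to_sns_messages(s3_event):
    """Map each S3 record to a (message, attributes) pair, carrying the
    uploader's principalId as an SNS message attribute."""
    out = []
    for record in s3_event.get("Records", []):
        principal = record["userIdentity"]["principalId"]
        message = f'{record["s3"]["bucket"]["name"]}/{record["s3"]["object"]["key"]} uploaded'
        attributes = {"clientId": {"DataType": "String", "StringValue": principal}}
        out.append((message, attributes))
    return out

def handler(event, context, sns, topic_arn):
    # `sns` is a boto3 SNS client; in a real Lambda you would look up the
    # principalId in your user database before publishing.
    for message, attributes in to_sns_messages(event):
        sns.publish(TopicArn=topic_arn, Message=message,
                    MessageAttributes=attributes)
```

Subscribers on the topic can then filter or route on the `clientId` attribute without parsing the message body.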
You can add user-defined metadata to your files before you upload the file like below:
private final static String CLIENT_ID = "client-id";
ObjectMetadata meta = new ObjectMetadata();
meta.addUserMetadata(CLIENT_ID, "testid");
s3Client.putObject(<bucket>, <objectKey>, <inputstream of the file>, meta);
Then when downloading the S3 files:
ObjectMetadata meta = s3Client.getObjectMetadata(<bucket>, <objectKey>);
String clientId = meta.getUserMetaDataOf(CLIENT_ID);
Hope this is what you are looking for.