AWS S3 Multipart Upload via API Gateway or Lambda - amazon-web-services

I'm trying to create a reusable large-file serverless upload service in AWS (we host a number of sites). What I would like to do is to set up an API Gateway in AWS and use CORS to control which sites can upload, allowing the sites to use client-side code. Here is what I've tried and the roadblocks I've run into. Wondering if anybody has any suggested workarounds?
Calling S3 directly from client-side code would require exposing authentication credentials in the browser, which seems bad
API Gateway does not appear to support calling the S3 multipart API through its AWS Service integration type (the URL is fixed to the generic S3 service URL, and IAM isn't supported in the HTTP integration type)
Using Lambda to call the multipart API won't work, because a Lambda invoke request payload is capped at 6 MB, and base64-encoding the 5 MB minimum part size pushes the data well past that limit
I could implement my own partial-upload functionality in Lambda, storing the chunks in S3, but I can't figure out how to merge them together within Lambda's memory and /tmp storage limits (PassThrough streams do not appear to work with the AWS SDK)
Any ideas? Are any of these worth digging into? Or is serverless a no-go for this use case?
So, after further follow-up with Amazon, it's sort of possible to use pre-signed URLs with the multipart API, but it's not very practical. The steps involved would include the following:
Create a new file, and split it into parts.
Generate a presigned URL to initiate the multipart upload.
Use the presigned URL to initiate the upload.
Generate a presigned URL for each part, using a part number.
Use the URLs to send the PutPart requests. Keep track of the ETag that is returned for each part number.
Combine all of the parts and corresponding ETags to form the request body.
Generate a presigned URL to complete the multipart upload.
Complete the multipart upload by sending the request with the presigned complete multipart upload URL.
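For anyone digging into this, a rough boto3 sketch of the presigning half of those steps might look like the following (bucket, key, and part count are placeholders; the client still has to send the actual HTTP requests and collect the ETags):

import boto3

s3 = boto3.client("s3")
bucket, key = "my-upload-bucket", "large-file.bin"  # placeholders

# Initiate step: doing it server-side is simplest; you could instead presign
# "create_multipart_upload" and let the client POST it, then parse the UploadId
# out of the XML response.
upload_id = s3.create_multipart_upload(Bucket=bucket, Key=key)["UploadId"]

# Per-part step: presign one URL per part. The client PUTs each 5 MB+ chunk to
# its URL and records the ETag response header for that part number.
part_urls = [
    s3.generate_presigned_url(
        "upload_part",
        Params={"Bucket": bucket, "Key": key, "UploadId": upload_id, "PartNumber": n},
        ExpiresIn=3600,
    )
    for n in range(1, 4)  # three parts in this example
]

# Completion step: the client POSTs the standard CompleteMultipartUpload XML
# body (part numbers plus ETags) to this URL.
complete_url = s3.generate_presigned_url(
    "complete_multipart_upload",
    Params={"Bucket": bucket, "Key": key, "UploadId": upload_id},
    ExpiresIn=3600,
)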
I'll accept Angelo's answer, since it pointed in this direction, which technically seems possible.

You might be able to use presigned URLs for the upload. In this case the client would hit your API, which would do whatever validation is necessary and then generate a presigned URL to S3 that is returned to the client. The client then uploads directly to S3.
You can see some information here: https://sanderknape.com/2017/08/using-pre-signed-urls-upload-file-private-s3-bucket/
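As a rough sketch of what that API-side Lambda could do (names here are hypothetical, using Python/boto3):

import json
import boto3

s3 = boto3.client("s3")
BUCKET = "my-upload-bucket"  # hypothetical bucket name

def handler(event, context):
    # Validate the caller here (API Gateway authorizer, CORS origin check, etc.),
    # then hand back a short-lived URL the browser can PUT the file to directly.
    key = json.loads(event["body"])["filename"]
    url = s3.generate_presigned_url(
        "put_object",
        Params={"Bucket": BUCKET, "Key": key},
        ExpiresIn=300,
    )
    return {"statusCode": 200, "body": json.dumps({"uploadUrl": url})}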

Related

Differentiate between a website endpoint and a REST API endpoint for AWS S3

I have an input provided by a user that would be used as the endpoint URL for bucket operations on an S3 bucket.
Is there a way to tell whether the URL is a REST API endpoint or a website endpoint?
I did read: https://docs.aws.amazon.com/AmazonS3/latest/userguide/WebsiteEndpoints.html
which mentions "Supports only GET and HEAD requests on objects" for a website endpoint.
However, I have come across cases where other operations worked even with a website endpoint.
I am using Python boto3 for these APIs.
With only S3 you can't upload any data; it's static website hosting, without any logic or ability to process data. You're just storing files that the browser renders.
If you need logic, I'd suggest adding some Lambda functions behind a REST endpoint. For your needs, you'll probably stay within the free tier.
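On the original question of telling the two endpoint types apart, a simple (non-exhaustive) heuristic based on the documented hostname patterns is to check for the s3-website marker in the host, e.g.:

from urllib.parse import urlparse

def is_website_endpoint(url: str) -> bool:
    # Website endpoints look like <bucket>.s3-website-<region>.amazonaws.com
    # (or <bucket>.s3-website.<region>.amazonaws.com); REST endpoints use
    # s3.<region>.amazonaws.com or <bucket>.s3.<region>.amazonaws.com.
    host = urlparse(url).hostname or ""
    return ".s3-website" in host

print(is_website_endpoint("http://mybucket.s3-website-us-east-1.amazonaws.com"))  # True
print(is_website_endpoint("https://mybucket.s3.us-east-1.amazonaws.com"))         # False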

Secure AWS S3 objects (control access with an authorizer like JWT, as a web app normally would)

I need to secure my S3 bucket objects. In my web application I'm using the aws-sdk to upload media to an S3 bucket and get back an HTTP link to access that object. This HTTP link is public by default, and I want to secure it so that only authorized users can access the media. AWS S3 lets me make the object private, but then no one with the link can access the object.
This link will be accessed from a mobile app where I don't want to use the aws-sdk; instead I want to execute some logic on the AWS side whenever someone tries to access the HTTP link for the object.
What I would like to happen is that before the user gets access to the S3 object, some authorizer code executes (like a JWT token authorizer), and depending on the result the user is granted or denied access.
I'm currently looking into Amazon API Gateway. I believe it can be accessed as an HTTP link, and AWS Lambda could be used to secure it (that's where I would run my JWT authorizer). These APIs would then access S3 internally.
Could someone point me in the right direction, if this is at all possible?
If I could send along the same JWT token issued by my web application with the request to Amazon API Gateway, that would be great.
I would make the bucket private and place a CloudFront distribution in front of it, using an Origin Access Identity (OAI) to allow only CloudFront to access the S3 bucket directly.
Then, to provide security, I would use either CloudFront signed cookies or Lambda@Edge with custom JWT token validation.
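As an illustration only (not exact production code), a viewer-request Lambda@Edge handler doing a JWT check could look roughly like this, assuming the PyJWT library is packaged with the function and the token arrives in an Authorization header:

import jwt  # PyJWT, assumed to be bundled with the deployment package

SECRET = "replace-with-your-signing-key"  # or verify against your issuer's JWKS

def handler(event, context):
    request = event["Records"][0]["cf"]["request"]
    headers = request["headers"]
    auth = headers.get("authorization", [{}])[0].get("value", "")
    token = auth[len("Bearer "):] if auth.startswith("Bearer ") else ""
    try:
        jwt.decode(token, SECRET, algorithms=["HS256"])
    except jwt.InvalidTokenError:
        # Deny access: short-circuit with a 401 instead of forwarding to the origin.
        return {"status": "401", "statusDescription": "Unauthorized"}
    return request  # valid token: let CloudFront fetch the object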
The easiest solution to expose private objects in an S3 bucket is to create a pre-signed URL. Pre-signed URLs use the permissions of the service (or role) that pre-signs them to determine access and are only valid for a limited duration. They can also be used to upload an object directly to S3 instead of having to proxy the upload through a Lambda function.
For download functionality and a smooth user experience, you can, for example, have a Lambda function that generates a pre-signed URL and returns it as an HTTP 302 response, which should instruct the browser to automatically download the file from the new URL.
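A minimal sketch of that redirect pattern with Python/boto3 behind API Gateway (bucket name and event shape are assumptions for a Lambda proxy integration):

import boto3

s3 = boto3.client("s3")
BUCKET = "my-media-bucket"  # hypothetical

def handler(event, context):
    # After your authorizer has approved the request, presign a short-lived GET URL
    # and redirect the browser to it.
    key = event["pathParameters"]["key"]
    url = s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": BUCKET, "Key": key},
        ExpiresIn=60,
    )
    return {"statusCode": 302, "headers": {"Location": url}}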
(Edit)
Following on what I've stated in the comments on this answer, if you're proxying the upload/download of the objects through services such as API Gateway or Lambda, you will be severely limited in the size of files that you can upload to S3. The payload size limit on API Gateway is 10 MB, and for requests to Lambda the payload is capped at 6 MB for synchronous invocations. If you want to upload something larger than 10 MB, you will need to upload directly to S3, for which pre-signed URLs are the safest solution.
I know I am a bit late here, but I wanted to give my opinion in case someone has the same problem.
Your mobile app should communicate with a server app (backend) for authentication and authorization. Let's say you deploy your server app in an AWS VPC. It is then simple to manage file access by creating a policy that allows only your server app (by IP, or VPC) to access the bucket. The authorization part is managed in your application.
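As one example of such a policy (hypothetical bucket and VPC IDs; note the aws:SourceVpc condition only applies to requests arriving through an S3 VPC endpoint), applied with boto3:

import json
import boto3

policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "DenyAccessFromOutsideOurVpc",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        "Resource": ["arn:aws:s3:::my-media-bucket", "arn:aws:s3:::my-media-bucket/*"],
        # Careful: this also blocks console/CLI access from outside the VPC
        # unless you add explicit exceptions.
        "Condition": {"StringNotEquals": {"aws:SourceVpc": "vpc-0123456789abcdef0"}},
    }],
}

boto3.client("s3").put_bucket_policy(Bucket="my-media-bucket", Policy=json.dumps(policy))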

How to secure HLS streaming using AWS for mobile devices?

We have some videos in an S3 bucket. They've been transcoded with AWS Elastic Transcoder into .m3u8 / .ts files.
We want the users to be able to stream these videos on both a web app and a mobile app.
Now, we want to secure this streaming, so our videos won't get pirated.
So, our proposed solution is as follows:
Prevent public access to the S3 bucket
Create a CloudFront distribution with the bucket as the origin
Only enable access to this CDN using pre-signed URLs/cookies
For the web app: use a pre-signed cookie (set by an endpoint at our backend that requires authentication), so that it works well with HLS (since the player needs to fetch a new segment every few seconds)
But now we don't know what to do for our mobile app. We can't use pre-signed cookies since there's no browser, and we can't use pre-signed URLs, since we would need a signed URL for each segment we fetch. Any suggestions and solutions are welcome.
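For reference, the signed-cookie side for the web app can be sketched roughly as follows (a custom-policy signer using the cryptography library and a hypothetical CloudFront key pair; CloudFront expects an RSA-SHA1 signature and its own base64 alphabet):

import base64
import json
import time
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import padding

def _cf_b64(data: bytes) -> str:
    # CloudFront's URL-safe base64 variant: '+' -> '-', '=' -> '_', '/' -> '~'
    return base64.b64encode(data).decode().translate(str.maketrans("+=/", "-_~"))

def signed_cookies(resource: str, key_pair_id: str, pem: bytes, ttl: int = 3600) -> dict:
    # Wildcard resource (e.g. https://dxxxx.cloudfront.net/videos/*) covers all segments.
    policy = json.dumps({
        "Statement": [{
            "Resource": resource,
            "Condition": {"DateLessThan": {"AWS:EpochTime": int(time.time()) + ttl}},
        }]
    }, separators=(",", ":"))
    key = serialization.load_pem_private_key(pem, password=None)
    signature = key.sign(policy.encode(), padding.PKCS1v15(), hashes.SHA1())
    return {
        "CloudFront-Policy": _cf_b64(policy.encode()),
        "CloudFront-Signature": _cf_b64(signature),
        "CloudFront-Key-Pair-Id": key_pair_id,
    }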
For our similar use-case:
We used CloudFront URLs and not S3 signed URLs, because an S3 signed URL is valid at the object level, not the folder level.
For paid videos, security and access were managed by Lambda@Edge on viewer requests.
Although we used OAuth and a database inside that Lambda, we surprisingly didn't face any bottlenecks on Lambda@Edge. For the future, we considered using Redis for seamless access validation inside Lambda@Edge.

How can a CloudFront distribution use an AWS KMS key to GET an S3 image encrypted at rest?

I would like to use AWS's Server Side Encryption (SSE) with the AWS Key Management Service (KMS) to encrypt data at rest in S3. (See this AWS blog post detailing SSE-KMS.)
However, I also have the requirement that I use CloudFront presigned URLs.
How can I set up a CloudFront distribution to use a key in AWS KMS to decrypt and use S3 objects encrypted at rest?
(This Boto3 issue seems to be from someone looking for the same answers as me, but with no results).
This was previously not possible, because CloudFront didn't support it and because (as I mentioned in comments on John's answer, which was on the right track) there was no way to roll your own solution with Lambda@Edge: the X-Amz-Cf-Id request header, generated on the back side of CloudFront and visible only to S3, not to the trigger invocation, would invalidate any signature you tried to add to the request inside a Lambda@Edge trigger, because signing of all X-Amz-* headers is mandatory.
But the X-Amz-Cf-Id header value is now exposed to a Lambda@Edge trigger function in the event structure, not with the other request headers but as a simple string attribute, at event.Records[0].cf.config.requestId.
With that value in hand, you can use the execution role credentials and the built-in SDK in the Lambda@Edge environment to generate a signature and add the necessary headers (including an Authorization header with the derived credential identifier and freshly generated signature) to the request.
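Conceptually, that signing step could look something like the sketch below (Python with botocore's SigV4 helpers; the region and the exact header list are assumptions, and the AWS blog post linked below walks through the complete, supported solution):

import boto3
from botocore.auth import S3SigV4Auth
from botocore.awsrequest import AWSRequest

def handler(event, context):
    record = event["Records"][0]["cf"]
    request = record["request"]
    host = request["origin"]["s3"]["domainName"]

    # Build a signable request that includes the x-amz-cf-id value CloudFront
    # will attach on its way to S3 (exposed at cf.config.requestId).
    aws_req = AWSRequest(
        method=request["method"],
        url=f"https://{host}{request['uri']}",
        headers={"host": host, "x-amz-cf-id": record["config"]["requestId"]},
    )
    creds = boto3.Session().get_credentials()  # the trigger's execution role
    S3SigV4Auth(creds, "s3", "us-east-1").add_auth(aws_req)  # bucket region assumed

    # Copy the generated SigV4 headers onto the CloudFront origin request.
    for name in ("authorization", "x-amz-date", "x-amz-content-sha256", "x-amz-security-token"):
        if name in aws_req.headers:
            request["headers"][name] = [{"key": name, "value": aws_req.headers[name]}]
    return request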
This setup does not use an Origin Access Identity (OAI), because the Lambda@Edge trigger's IAM execution role is used instead of an OAI to persuade S3 that the request is authorized.
Achraf Souk has published an official AWS blog post explaining the solution from start to finish.
https://aws.amazon.com/blogs/networking-and-content-delivery/serving-sse-kms-encrypted-content-from-s3-using-cloudfront/
Use S3 presigned URLs. This AWS article discusses how to generate URLs using Java, but this is easily ported to another language.
Server-Side Encryption with AWS Key Management Service (SSE-KMS)
The following setup works for us:
In your application, generate a signed URL that CloudFront can validate (https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/private-content-signed-urls.html).
Instead of using an OAI, you create a Lambda@Edge origin request function as per: https://aws.amazon.com/blogs/networking-and-content-delivery/serving-sse-kms-encrypted-content-from-s3-using-cloudfront/
Please note that if your bucket name contains a '.' (ours did), there's a bug in the JS code that can be mitigated with something like:
// Infer the region from the host header; counting from the end of the hostname
// handles bucket names that contain dots, which break the original index.
// const region = options.host.split('.')[2];
const hostArr = options.host.split('.');
const region = hostArr[hostArr.length - 3];
Lastly, we added an origin-response Lambda@Edge function to strip headers that we do not want exposed, especially X-Amz-Server-Side-Encryption-Aws-Kms-Key-Id, which includes the AWS account ID.
Finally, I'd like to comment on the statement above that Lambda@Edge response bodies are limited to 1 MB: this only applies to content generated (or modified, if you include the body) by the Lambda function.
When using the Lambda@Edge function above, the response from the S3 origin has no such limit; we are serving objects much larger than 1 MB (typically 100+ MB).

Upload file to S3 with custom response to client

I am trying to upload a file to S3 and then have Lambda generate an ID and date.
I then want to return this data back to the client.
I want to avoid generating the ID and date on the client for security reasons.
Currently, I am trying to use API Gateway to invoke a Lambda function that uploads to S3. However, I am having problems setting this up. I know that this is not a preferred method.
Is there another way to do this without writing my own web server? (I would like to use Lambda.)
If not, how can I configure my API Gateway method to support file upload to Lambda?
You have a couple of options here:
Use API Gateway as an AWS Service Proxy to S3
Use API Gateway to invoke a Lambda function, which uses the AWS SDK to upload to S3
In either case, you will need to base64 encode the file content before calling API Gateway, and POST it in the request body.
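For option 2, a minimal sketch of the Lambda side (hypothetical names, assuming a proxy integration where API Gateway delivers the body base64-encoded):

import base64
import json
import boto3

s3 = boto3.client("s3")
BUCKET = "my-upload-bucket"  # hypothetical

def handler(event, context):
    # API Gateway passes the (binary) body base64-encoded; decode and store it,
    # then return whatever metadata the client should receive.
    data = base64.b64decode(event["body"])
    key = event["queryStringParameters"]["filename"]
    s3.put_object(Bucket=BUCKET, Key=key, Body=data)
    return {"statusCode": 200, "body": json.dumps({"key": key, "size": len(data)})}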
We don't currently have any documentation on this exact use case but I would refer you to the S3 API and AWS SDK docs for more information. If you have any specific questions we'd be glad to help.
Thanks,
Ryan