I am looking for a quick/cheap way to implement access management for a CDN that serves images.
Please check the diagram
context
A user typically requests a large number of small images, one at a time, and latency should stay below 200 ms.
access management
1. A user updates the permission of an image in the DB.
2. A permission hook is triggered and updates the image's permission file (adding the user id to the file).
3. The S3 bucket is updated with the new permission file.
4. The CloudFront cache is invalidated for this specific file.
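As a concrete illustration, here is a minimal Python/boto3 sketch of what steps 2-4 of that hook could look like. The bucket name, distribution id and permission-file layout are assumptions for illustration only, not part of the question:

```python
import json
import time

import boto3

s3 = boto3.client("s3")
cloudfront = boto3.client("cloudfront")

# Hypothetical names, adjust to your setup.
PERMISSIONS_BUCKET = "my-cdn-permissions"
DISTRIBUTION_ID = "E1234567890ABC"

def on_permission_change(image_id, allowed_user_ids):
    """Hypothetical permission hook: rewrite the permission file and
    invalidate the cached copy for this image only."""
    key = f"permissions/{image_id}.json"

    # Upload the new permission file (a plain list of allowed user ids).
    s3.put_object(
        Bucket=PERMISSIONS_BUCKET,
        Key=key,
        Body=json.dumps({"allowed": allowed_user_ids}).encode("utf-8"),
        ContentType="application/json",
    )

    # Invalidate only this permission file in CloudFront so the edge
    # Lambda sees the fresh version on the next request.
    cloudfront.create_invalidation(
        DistributionId=DISTRIBUTION_ID,
        InvalidationBatch={
            "Paths": {"Quantity": 1, "Items": [f"/{key}"]},
            "CallerReference": f"perm-update-{image_id}-{int(time.time())}",
        },
    )
```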
image request
1. A user sends an HTTP request to collect an image:
   METHOD: GET
   URL: URL/image-id.png
   HEADER: Authorization (containing the user id)
2. The edge Lambda intercepts the request, validates the token, uses the image id to retrieve the image's permission file, and finally verifies that the file contains the user id.
3. If any step in point 2 fails, the Lambda returns an error message; otherwise it forwards the request to the CDN.
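And a minimal sketch of the edge Lambda in point 2, assuming Lambda@Edge on the Python runtime and the same hypothetical permission-file layout as above. Real token validation (e.g. verifying a JWT) is reduced here to reading the header value:

```python
import json

import boto3

s3 = boto3.client("s3")

# Hypothetical bucket, same layout as the permission-hook sketch above.
PERMISSIONS_BUCKET = "my-cdn-permissions"

def _deny(status, message):
    """Short-circuit the request at the edge with an error response."""
    return {"status": str(status), "statusDescription": message, "body": message}

def handler(event, context):
    request = event["Records"][0]["cf"]["request"]
    headers = request["headers"]

    # 1. Validate the token. Here we only check that the header exists and
    #    treat its value as the user id; a real implementation would verify
    #    a signed token instead.
    auth = headers.get("authorization")
    if not auth:
        return _deny(401, "Missing Authorization header")
    user_id = auth[0]["value"]

    # 2. Derive the image id from the URI, e.g. "/image-id.png" -> "image-id".
    image_id = request["uri"].lstrip("/").rsplit(".", 1)[0]

    # 3. Load the image's permission file and check membership.
    try:
        obj = s3.get_object(
            Bucket=PERMISSIONS_BUCKET,
            Key=f"permissions/{image_id}.json",
        )
        allowed = json.loads(obj["Body"].read())["allowed"]
    except s3.exceptions.NoSuchKey:
        return _deny(403, "No permission file for this image")

    if user_id not in allowed:
        return _deny(403, "Forbidden")

    # Returning the request object lets CloudFront continue as normal
    # (cache lookup, then origin fetch on a miss).
    return request
```

One thing to keep in mind: an S3 round trip per viewer request from the edge may eat into the 200 ms budget, so caching permission files in the function's execution environment between invocations could help.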
What do you guys think? Is it a terrible idea?
Related
I am refactoring a project from a third-party company where they added two different Lambda@Edge functions triggered by CloudFront.
Basically, the flow is the following:
When a user requests an S3 file from the web app, CloudFront fires an event which calls Lambda@Edge.
There are two Lambdas: one for counting downloads per user and another one to restrict access.
The problem is that the solution is not working and is missing a download count check.
What is the execution workflow for Lambda@Edge functions attached to the same event? I am thinking of placing all the logic inside one Lambda, as I am afraid that the count could happen before access is denied. However, I also have to take into consideration that Lambda@Edge has an execution time limitation.
The documentation is available here.
When a user requests a file, there is a viewer request. If the file is in the cache, then a viewer response follows; there is no origin request. For this reason, you should authenticate your users on the viewer request.
When the file isn't in the cache, there is an origin request. This is when the file is downloaded from S3.
You could have the logic in a single Lambda@Edge function, but you could also:
Authenticate users on Viewer Request.
Count downloads on Viewer Response. A Viewer Response event will be triggered regardless of whether there is a cache hit or not, but not when the origin returns an HTTP status code of 400 or higher.
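To make the split concrete, here is a minimal sketch of the two handlers in Python. The DynamoDB table name and the way the user id is extracted are assumptions for illustration only:

```python
import boto3

dynamodb = boto3.client("dynamodb")
DOWNLOADS_TABLE = "download-counts"   # hypothetical table name

def viewer_request_handler(event, context):
    """Attached to the Viewer Request event: authenticate, or block early."""
    request = event["Records"][0]["cf"]["request"]
    if "authorization" not in request["headers"]:
        # Denying here returns an error status, which the counter
        # below skips via its status check.
        return {"status": "401", "statusDescription": "Unauthorized", "body": "Unauthorized"}
    return request

def viewer_response_handler(event, context):
    """Attached to the Viewer Response event: count the download.
    This fires on cache hits and misses alike."""
    cf = event["Records"][0]["cf"]
    request, response = cf["request"], cf["response"]

    auth = request["headers"].get("authorization")
    if auth and int(response["status"]) < 400:
        user_id = auth[0]["value"]  # simplistic; a real token would be decoded
        dynamodb.update_item(
            TableName=DOWNLOADS_TABLE,
            Key={"userId": {"S": user_id}},
            UpdateExpression="ADD downloads :one",
            ExpressionAttributeValues={":one": {"N": "1"}},
        )
    return response
```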
I have a lambda function that is triggered whenever a user uploads an image to an S3 bucket. I'm trying to write the generated url of that image to a DynamoDB database along with the email of the user who uploaded said image, which should be the user that is currently logged in.
I've gotten these attributes before by doing
event.request.userAttributes.email
But that was done in a Cognito-triggered post-confirmation Lambda function, so that information was stored in the event parameter of the handler function. In this scenario, I'm not sure if that information is sent along in the event. Any idea how I'd get access to information like that? I've been reading up on JWT ID tokens, but I haven't figured out how to access them or whether that's the correct and safe approach.
I am afraid you will have to handle it yourself. One option which you may like is to use custom object metadata to store the information about the uploading user:
https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingMetadata.html
You can then retrieve the metadata together with the object and continue from that point.
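For example, a minimal Python/boto3 sketch, assuming a hypothetical uploader-email metadata key: the uploading side attaches the metadata, and the S3-triggered Lambda fetches it back with a HEAD request (the S3 event itself only carries the bucket and key):

```python
from urllib.parse import unquote_plus

import boto3

s3 = boto3.client("s3")

# At upload time (e.g. in your API backend), attach the user's email
# as custom object metadata. The key name is illustrative.
def upload_image(bucket, key, body, user_email):
    s3.put_object(
        Bucket=bucket,
        Key=key,
        Body=body,
        Metadata={"uploader-email": user_email},  # stored as x-amz-meta-uploader-email
    )

# In the S3-triggered Lambda, fetch the metadata back for the new object.
def handler(event, context):
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    key = unquote_plus(record["object"]["key"])  # S3 event keys are URL-encoded

    head = s3.head_object(Bucket=bucket, Key=key)
    uploader_email = head["Metadata"].get("uploader-email", "unknown")

    url = f"https://{bucket}.s3.amazonaws.com/{key}"
    # ... write (url, uploader_email) to DynamoDB as in the original plan
    return {"url": url, "email": uploader_email}
```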
I am building an app which lets users upload pictures and share them with their friends only.
1. I am using Spring RESTful services to upload content directly to S3.
2. The user is authorized using OAuth.
3. To upload an image, an authorized user's JS client invokes POST /images/{userid}.
4. To download the image, the client has to invoke GET /images/{userid}/{imageid}.
Each user's content is stored in S3 under their own folder name, and the download service described in point 4 is the one that has to be invoked. Unfortunately, this means I cannot assign this URL as the source of an image tag (<img src="">), because the authorization token has to be sent with the GET request. I cannot make the contents of a user's folder public, because only the user and their friends are allowed to see the images. The current service will soon become a bottleneck, and I would like to avoid that.
What is the recommended architecture/design to solve this problem?
Instead of having a service that loads and returns the entire image file from S3, you should have a service that simply generates an S3 presigned URL. Then the URL retrieved from that service can be used in the <img src=""> tags on your site. This will be much more performant since the web browser will ultimately download the image directly from S3, while also still being secure.
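A minimal sketch of such a service with Python/boto3; the bucket name and the per-user key layout are assumptions based on the question:

```python
import boto3

s3 = boto3.client("s3")

def get_image_url(user_id, image_id, bucket="my-user-images"):
    """Hypothetical endpoint helper: return a short-lived URL the browser
    can use directly in an <img src=""> tag."""
    # Authorization (is the caller allowed to see this image?) would happen
    # here, before the URL is ever generated.
    return s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": f"{user_id}/{image_id}"},
        ExpiresIn=300,  # the URL stops working after 5 minutes
    )
```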
The flow for downloading the images would be like this:
1. The user invokes a GET request to download an image.
2. At the server end:
   Authenticate the user.
   Query the DB for metadata.
   Create a time-based auth token.
   Create an image URL (S3-based) and append the auth token created in the previous step.
3. At the client end (the user's browser), redirect the user to the new URL (this URL is effectively the S3 location plus the auth token).
4. The direct request now comes in with the image URL plus the auth token; the token is authenticated and the image is then shown to the user.
The above URL will not be valid for long, but your images stay secure. Because the auth token is time-based, it also covers cases such as someone making the image private/public, removing a friend, deleted images, copy-pasting of the image URL, etc.
The main security concern with direct JS browser uploads to S3 is that S3 credentials would end up stored on the client side.
To mitigate this risk, the S3 documentation recommends using short-lived keys generated by an intermediate server:
A file is selected for upload by the user in their web browser.
The user’s browser makes a request to your server, which produces a temporary signature with which to sign the upload request.
The temporary signed request is returned to the browser in JSON format.
The browser then uploads the file directly to Amazon S3 using the signed request supplied by your server.
The problem with this flow is that I don't see how it helps in the case of public uploads.
Suppose my upload page is publicly available. That means the server API endpoint that generates the short-lived key needs to be public as well. A malicious user could then just find the address of the API endpoint and hit it every time they want to upload something. The server has no way of knowing if the request came from a real user on the upload page or from any other place.
Yeah, I could check the domain on the request coming in to the API and validate it, but the domain can be easily spoofed (when the request is not coming from a browser client).
Is this whole thing even a concern? The main risk is someone abusing my S3 account and uploading stuff to it. Are there other concerns that I need to know about? Can this be mitigated somehow?
Suppose my upload page is publicly available. That means the server API endpoint that generates the short-lived key needs to be public as well. A malicious user could then just find the address of the API endpoint and hit it every time they want to upload something. The server has no way of knowing if the request came from a real user on the upload page or from any other place.
If that concerns you, you would require your users to log in to your website somehow, and serve the API endpoint behind the same server-side authentication service that handles your login process. Then only authenticated users would be able to upload files.
You might also want to look into S3 pre-signed URLs.
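A minimal Python/boto3 sketch of that combination: the signing endpoint sits behind your existing authentication, and it hands back a pre-signed PUT URL (the auth check, bucket name and key layout below are all hypothetical):

```python
import boto3

s3 = boto3.client("s3")
UPLOAD_BUCKET = "my-public-uploads"   # hypothetical bucket

def create_upload_url(current_user, filename):
    """Hypothetical endpoint handler: only authenticated users get a
    short-lived upload URL, so the signing endpoint is not open to the world."""
    if current_user is None:
        raise PermissionError("login required")  # however your auth layer signals this

    return s3.generate_presigned_url(
        "put_object",
        Params={"Bucket": UPLOAD_BUCKET, "Key": f"uploads/{current_user.id}/{filename}"},
        ExpiresIn=600,  # the URL, and therefore the upload window, expires quickly
    )
```

The browser then performs an HTTP PUT of the file directly to the returned URL, so the file bytes never pass through your server.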
I'm writing an app that handles large file uploads (e.g. 10 GB). I want to use direct upload to S3 (pre-signed URL) and give that possibility to my web users. My steps are:
1. I'm creating an IAM user with only "PUT" permission.
2. I'm creating an upload policy on the server side, including the max file size, the file content type, and the policy expiration time (e.g. 3 hours).
3. The web user uploads the file using an HTML form with that policy and the pre-signed URL.
4. I'm checking the file headers on the server side after a successful upload.
Now, I'm wondering about the downsides and security issues of this approach. Are there any?
Thank you.
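For what it's worth, here is a minimal Python/boto3 sketch of what step 2 could look like under those assumptions; the content type and limits are placeholders, not recommendations:

```python
import boto3

s3 = boto3.client("s3")

def create_upload_policy(bucket, key):
    """Sketch of step 2: a browser-upload policy limiting size, content type
    and lifetime."""
    return s3.generate_presigned_post(
        Bucket=bucket,
        Key=key,
        Fields={"Content-Type": "application/octet-stream"},
        Conditions=[
            {"Content-Type": "application/octet-stream"},  # pin the content type
            ["content-length-range", 1, 10 * 1024 ** 3],   # 1 byte .. 10 GB
        ],
        ExpiresIn=3 * 60 * 60,  # policy valid for 3 hours, as in the question
    )

# The returned dict contains "url" and "fields"; the browser posts the file
# as multipart/form-data with those fields plus the file itself.
```

One caveat worth checking: a single PUT or form POST to S3 is limited to 5 GB, so files of the 10 GB size mentioned above would need S3 multipart upload (or a lower limit in the policy).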