If I set up my app to generate pre-signed URLs for access to S3 media (so that I can set the files to be private, unless accessed via a logged in user) then would I be right in saying that, if someone has access to the URL (within the expiry time) they can see the file, despite it being "private"?
So if someone was to send the URL to someone else, then it's not really private any more.
I guess there's no other way but this just seems odd to me.
Yes, you are correct that a signed URL can be "shared" because it is valid until it expires (or until the credentials that signed it expire or are otherwise invalidated, whichever comes first).
One common solution is for your application to generate signed URLs as the page is being rendered, using very short expiration times.
Another is for the link to the secured content to actually be a link back to the application, which verifies the user's authority to access the object, and then returns an HTTP redirect to a freshly-generated signed URL with a short expiration time (e.g. 5 seconds).
HTTP/1.1 302 Found
Location: https://example-bucket.s3.amazonaws.com/...?X-Amz-...
Signed URLs cannot be tampered with using currently feasible compute capabilities, so it is impractical to the point of impossibility for a signed URL to be modified by a malicious user.
Note also that a signed URL (for either S3 or CloudFront) only needs to be not-yet-expired when the download starts. The time required for the download to actually finish can be arbitrarily long, and the download will not be interrupted.
There is no ready-made service for the following option, but using a combination of CloudFront Lambda#Edge triggers and DynamoDB, it is possible to create a genuinely single-use URL, which consists of a randomly generated "token" stored in the Dynamo table and associated with the target object. When the URL is accessed, you use a DynamoDB conditional update in the Lambda trigger to update the (e.g.) "view_count" value from 0 to 1. If the token isn't in the table or the view count isn't 0, the conditional update fails, so access is denied; otherwise CloudFront allows the request to proceed -- exactly once. CloudFront accesses the S3 content using an Origin Access Identity, which all happens behind the scenes, so nothing related to the actual authentication of the request between CloudFront and S3 is accessible to the user. (For cryptographic-quality random token generation, you can also use KMS's GenerateRandom API action.)
There are a number of alternative approaches, including other uses of Lambda#Edge triggers to do things like inspect a request for an application-provided cookie and then querying the application server to authenticate the user.
CloudFront also supports signed cookies that it parses and interprets, itself, but these provide wildcard-based access to all your assets matching a specific URL and path (e.g. /images/*) and there is nothing to prevent a user from sharing their cookies, so these are probably not useful for your use case.
CloudFront signed URLs do support the option of allowing access only if the signed URL is used from a specific source (client) IP address, but this has potential problems in there is no assurance that a 1:1 correlation exists between users and IP addresses. Many users can be behind the same address (particularly in corporate network environments) or a single user's address can change at any moment.
The complexity of the possible implementations varies wildly, and what you need depends in part on how secure you need for your content to be. In many cases, more extreme solutions accomplish little more than discouraging honest users, because the user can still download the resource and share it via other means.
That would still be a separate user requesting content. For a separate user, the certificate would not longer be valid.
Source: https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/private-content-signed-urls.html
Related
I have a resource called Sites.
I am planning to have an endpoint as follows:
/tenant/:tenantId/sites/:siteId
The endpoint is to return a site’s tree which will vary based on the userId extracted from the JWT token.
Since it will vary based on the user requesting it, should the endpoint have userId in the URI- may be as a query parameter?
How should caching work in this case?
The sites tree returned by this endpoint will also change based on updates in another resource (i.e users/groups)
Should the cache be discarded for all users whenever there is a change in the sites resource itself or when there is a change in groups?
I am using API gateway so will need to purge cache through client cache control header when any of the resources are updated.
Since, the data will vary on the user requesting it, the endpoint should have the userId in the URI - it could be simply a path parameter similar to the tenantId and siteId
caching can be done on the basis of If-modified-since header to indicate if the data has changed or not.
The If-Modified-Since HTTP header indicates the time for which a browser first downloaded a resource from the server. This helps determine whether or not the resource has changed since the last time it was accessed.
From a security point of view if a user only can access his own sites the user id should not be on the path (or query param), because if you do that, any user can modify the URL in its browser and try to access the other user sites. To avoid that the URL should not have any userId (you can replace it with something like /me) and the service that will handle the request should extract the id information from the token
I don't know if you are using an in-memory cache of distributed cache and if sites/users/groups are different services (deployed on different servers) or if they live in the same application, anyway, when any of the resources that cache depends on are modified, you should invalidate the cache for that users
I am planning to use CloudFront as the server cache for my site, secured by singed URLs. The reason for using signed URLs is to allow access to the content only to authenticated users.
My webapp, however, also needs to use the caching on the client side. Now since the signed URL will only be valid for a short time and then a new one will be generated, this will break the caching on the client side. Although the client will receive the same resource, it will have a new signed URL and the client browser will not be able to get it from the cache.
One of the reasons why I want to use signed URLs with short validity for long existing resources is having control over the transmitted data. In the best case these resources are cached at the client. If not, they are cached on CloudFront and CF will deliver them and spare the resources of my web app server. But I want to prevent an attacker to download the resource huge number of times from CF and cause me additional costs.
Is there a way to secure access to CloudFront resources using some other means than signed URLs? For example a good thing would be a signed cookie. The client would ask for a resource on a webapp URL and the webapp would return a redirect to a long-term CF URL for that resource but retrieving the resource would be only possible with a signed cookie with a short validity. The client would still see the long URL and could cache the resource but the resource would only be available short time for the download.
I do not want to mess with IP addresses since they are unreliable, often there could be many users behind one IP etc.
Is there something like that to overcome the local caching limitation of the signed URLs?
If there is nothing really confidential in the resources, i would probably not bother with signing them myself. But since you have that requirement, you can use signed cookies.
These can both be limited in time, but also in scope. So you can give access to at specific subset of URLs.
"Instead of using cookies for authorization, server operators might
wish to consider entangling designation and authorization by treating
URLs as capabilities. Instead of storing secrets in cookies, this
approach stores secrets in URLs, requiring the remote entity to
supply the secret itself. Although this approach is not a panacea,
judicious application of these principles can lead to more robust
security." A. Barth
https://www.rfc-editor.org/rfc/rfc6265
What is meant by storing secrets in URLs? How would this be done in practice?
One technique that I believe fits this description is requiring clients to request URLs that are signed with HMAC. Amazon Web Services offers this technique for some operations, and I have seen it implemented in internal APIs of web companies as well. It would be possible to sign URLs server side with this or a similar technique and deliver them securely to the client (over HTTPS) embedded in HTML or in responses to XMLHttpRequests against an API.
As an alternative to session cookies, I'm not sure what advantage such a technique would offer. However, in some situations, it is convenient or often the best way to solve a problem. For example, I've used similar techniques when:
Cross Domain
You need to give the browser access to a URL that is on another domain, so cookies are not useful, and you have the capability to sign a URL server side to give access, either on a redirect or with a long enough expiration that the browser has time to load the URL.
Examples: Downloading files from S3. Progressive playback of video from CloudFront.
Closed Source Limitations
You can't control what the browser or other client is sending, aside from the URL, because you are working with a closed source plugin of some kind and can't change its behavior. Again you sign the URL server side so that all the client has to do is GET the URL.
Examples: Loading video captioning and/or sprite files via WEBVTT, into a closed-source Flash video player. Sending a payload along with a federated single sign-on callback URL, when you need to ensure that the payload can't be changed in transit.
Credential-less Task Worker
You are sending a URL to something other than a browser, and that something needs to access the resource at that URL, and on top of that you don't want to give it actual credentials.
Example: You are running a queue consumer or task-based worker daemon or maybe an AWS Lambda function, which needs to download a file, process it, and send an email. Simply pre-sign all the URLs it will use, with a reasonable expiration, so that it can perform all the requests it needs to without any additional credentials.
I've read aws docs about using s3 + cloudfront + signed URL architecture to securely serve private content to public users. However it seems not secure enough to me. Let's me describe in steps:
Step 1: user logs in to my website.
Step 2: user clicks download (pdf, images, etc.)
Step 3: my web server will generate signed URL (expiry time: 30 secs), redirect user to the signed url and the downloading process happens.
Step 4: now, even though it's timed out after 30 secs, there is still a chance that any malicious snipper on my network will be able to catch the signed url and download my user's private content.
Any thought for this?
The risks you anticipate exist no matter what mechanism you use to "secure" anything on the web, if you aren't also using HTTPS to encrypt your users' interactions with the web site.
Without encryption, the login information, or perhaps cookies conveying the user's authentication state are also being sent in cleartext, and anything the user downloads can be directly captured without need for the signed link... making concern about capturing a download link via sniffing seem somewhat uninteresting compared to the more significant risk of general and overall insecurity that exists in such a setup.
On the other hand, if your site is using SSL, then when you deliver the signed URL to the user, there's a reasonable expectation that it will be hidden from snooping by the encryption... and, similarly, if the link to S3 also uses HTTPS, the SSL on that new connection is established before the browser transmits any information over the wire that would be discoverable by sniffing.
So, although it seems correct that there are potential security issues involved with this mechanism, I would suggest that a valid overall approach to security for user interactions should reduce the implications of any S3 signed URL-specific concerns down to a level comparable to any other mechanism allowing a browser to request a resource based on possession of a set of credentials.
I have an app that lets users post and share files, and currently it's my server that serves these files, but as data grows, so I'm investigating using Amazon S3. However, I use dynamic rules for what is public and what is private between certain users etc, so the server is the only possible arbiter, i.e. permissions cannot be decided on the app/client end.
Simplistically, I guess I can let my server GET data from S3, then send them back to the app. But obviously then I'm paying for bandwidth twice not to mention making my server do unnecessary work.
This seems like a fairly common problem, so I wonder how do people typically solve this problem? (Like I read that Dropbox stores its data on S3.)
We have an application with pretty much the same requirements, and there's a really good solution available. S3 supports signed, expiring S3 URLs to access objects. If you have a private S3 object that you, but not others, can access, you can create such a URL. If you give that URL to someone else, he or she can use it to fetch the object that they normally have no access to.
So the solution for your use case is:
User does a GET to the URL on your web site
Your code verifies that the user should be able to see the object (via your application's custom, dynamic rules)
The web site returns a redirect response to a signed S3 URL that expires soon, say in 5 minutes
The user's web browser does a GET to that signed S3 URL. Since it's properly signed and hasn't yet expired, S3 returns the contents of the object directly to the user's browser.
The data goes from S3 to the user without ever traveling back out through your web site. Only users your application has authorized can get the data. And if a user bookmarks or shares the URL it won't work once the expiration time has passed.