REST API and caching - amazon-web-services

I have a resource called Sites.
I am planning to have an endpoint as follows:
/tenant/:tenantId/sites/:siteId
The endpoint is to return a site's tree, which will vary based on the userId extracted from the JWT.
Since it will vary based on the user requesting it, should the endpoint have the userId in the URI, maybe as a query parameter?
How should caching work in this case?
The sites tree returned by this endpoint will also change based on updates to another resource (i.e. users/groups)
Should the cache be discarded for all users whenever there is a change in the sites resource itself or when there is a change in groups?
I am using API Gateway, so I will need to purge the cache through the client Cache-Control header when any of the resources are updated.

Since the data will vary based on the user requesting it, the endpoint should have the userId in the URI; it could simply be a path parameter, similar to the tenantId and siteId.
Caching can be done on the basis of the If-Modified-Since header, which indicates whether the data has changed or not.
If-Modified-Since is a conditional request header: the client sends back the Last-Modified timestamp it received with its cached copy, and the server returns the full resource only if it has changed since then, answering 304 Not Modified otherwise.
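A minimal sketch of the server side of this conditional-request flow, using only the Python standard library; the payload, port, and last-modified timestamp are placeholders:

from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder: when the sites tree for this resource last changed.
LAST_MODIFIED = datetime(2024, 1, 1, tzinfo=timezone.utc)

class SitesHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        ims = self.headers.get("If-Modified-Since")
        if ims:
            try:
                # Unchanged since the client's copy: reply 304, no body.
                if LAST_MODIFIED <= parsedate_to_datetime(ims):
                    self.send_response(304)
                    self.end_headers()
                    return
            except (TypeError, ValueError):
                pass  # malformed date: fall through to a full response
        body = b'{"sites": []}'  # placeholder payload
        self.send_response(200)
        self.send_header("Last-Modified", format_datetime(LAST_MODIFIED, usegmt=True))
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("", 8080), SitesHandler).serve_forever()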

From a security point of view, if a user can only access their own sites, the userId should not be in the path (or a query param): any user could modify the URL in their browser and try to access another user's sites. To avoid that, the URL should not contain any userId (you can replace it with something like /me), and the service that handles the request should extract the id from the token.
I don't know if you are using an in-memory cache or a distributed cache, nor whether sites/users/groups are different services (deployed on different servers) or live in the same application. Either way, when any of the resources the cache depends on is modified, you should invalidate the cache for the affected users.
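A minimal in-process sketch of that invalidation, assuming the cache key is (tenantId, siteId, userId) and a hypothetical build_site_tree loader:

# Per-user cache: each user's view of a site tree is cached separately.
cache: dict[tuple[str, str, str], dict] = {}

def get_site_tree(tenant_id: str, site_id: str, user_id: str) -> dict:
    key = (tenant_id, site_id, user_id)
    if key not in cache:
        cache[key] = build_site_tree(tenant_id, site_id, user_id)  # hypothetical loader
    return cache[key]

def on_site_updated(tenant_id: str, site_id: str) -> None:
    # A site change affects every user's view of that site.
    for key in [k for k in cache if k[:2] == (tenant_id, site_id)]:
        del cache[key]

def on_group_changed(tenant_id: str, member_user_ids: list[str]) -> None:
    # A group change only affects the users whose membership changed.
    affected = set(member_user_ids)
    for key in [k for k in cache if k[0] == tenant_id and k[2] in affected]:
        del cache[key]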

Related

Private Media Using Google Cloud CDN

I have an application that stores media in a Google Cloud Storage bucket. Since there is a lot of media, I needed the CDN service to reduce latency when loading it. However, when I use the CDN, the media is public and accessible to anyone, even unauthenticated. Is there any way to cache the media and at the same time keep it private behind an authentication token?
I tried many approaches following the documentation, setting the cache mode to honor the origin's Cache-Control header together with the authorization token, but once the media is cached it is accessible without the authentication token.
Can anybody help me?
Is it possible to include the auth header as part of the cache key? This would require the origin server to have auth functionality.
This would enable the object to be put into cache when the key is present. Also, only a request with the auth header can retrieve the object from cache.
If a user attempts to get the file without the key, or with an incorrect key, it will be a cache miss. The origin server should then fail to authenticate the header and not allow the object to be delivered.
To add details to the link shared by John Hanley: using Signed URLs will let the content be cached while keeping access to your stored media private. You can follow this documentation for a more detailed procedure.
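For reference, a sketch of signing a Cloud CDN URL in Python, following the Expires/KeyName/Signature scheme described in the Cloud CDN documentation; the URL, key name, and signing key below are placeholders:

import base64
import hashlib
import hmac
import time

def sign_cdn_url(url: str, key_name: str, base64_key: str, ttl_seconds: int) -> str:
    # Cloud CDN signed URLs append Expires, KeyName and a base64url-encoded
    # HMAC-SHA1 Signature computed over the URL up to that point.
    expires = int(time.time()) + ttl_seconds
    separator = "&" if "?" in url else "?"
    to_sign = f"{url}{separator}Expires={expires}&KeyName={key_name}"
    key = base64.urlsafe_b64decode(base64_key)
    digest = hmac.new(key, to_sign.encode("utf-8"), hashlib.sha1).digest()
    signature = base64.urlsafe_b64encode(digest).decode("utf-8")
    return f"{to_sign}&Signature={signature}"

# Placeholder key name and base64url-encoded 128-bit signing key:
print(sign_cdn_url("https://cdn.example.com/videos/intro.mp4",
                   "my-signing-key", "nZtRohdNF9m3cKM24IcK4w==", 3600))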

Allowing temporary access to DynamoDB

I have a client app that connects to an Elastic Beanstalk server app. Some of my users need to register. But when the registration form loads it needs to get some data from DynamoDB so the user can choose between a few options.
The problem is that I set up my server in a way that any request to the server that is not authenticated (no auth tokens previously obtained by the client app from Cognito) gets denied. Of course, if a person is going to register they are not authenticated, which means they do not have access to the information from DynamoDB they need to register. It is only a couple of pieces of information I need, so it is very frustrating.
What I have thought about to solve this:
Option 1: Put a long string of characters in the client app that gets sent to the server when a request is made for ONLY the couple of pieces of information I need. The server would have that same string stored somewhere and would compare the two. If they match, it returns the info requested. As I said, this would be done only for the 2 pieces of info I need; everything else would still be secure.
Option 2: Leave the two routes in my API that lead to the pieces of info public (I know, it is a bad idea).
What would be the best way to go about this?
Assuming you're using Cognito, there is also the concept of an anonymous guest user, which can have its own role assigned.
You can treat the anonymous guest user like a regular Cognito user; however, you would scope its permissions down to the minimum it requires to perform these operations.
Alternatively, with option 2, the API could call a Lambda that simply reads the data and returns the necessary information. You would possibly want to look at caching the results as well, to avoid your API Gateway endpoint being abused.
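A minimal sketch of such a Lambda, assuming boto3 and a hypothetical RegistrationOptions table; only the public signup options are reachable through this route:

import json

import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("RegistrationOptions")  # hypothetical table name

def handler(event, context):
    # Read only the public item the registration form needs; nothing
    # else in the table is exposed through this route.
    resp = table.get_item(Key={"pk": "signup-options"})  # hypothetical key
    return {
        "statusCode": 200,
        "headers": {
            "Content-Type": "application/json",
            # Let clients and API Gateway cache briefly to blunt abuse.
            "Cache-Control": "public, max-age=300",
        },
        "body": json.dumps(resp.get("Item", {}), default=str),
    }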

Amazon S3 pre-signed URLs

If I set up my app to generate pre-signed URLs for access to S3 media (so that I can set the files to be private, unless accessed via a logged-in user), would I be right in saying that, if someone has access to the URL (within the expiry time), they can see the file, despite it being "private"?
So if someone was to send the URL to someone else, then it's not really private any more.
I guess there's no other way but this just seems odd to me.
Yes, you are correct that a signed URL can be "shared" because it is valid until it expires (or until the credentials that signed it expire or are otherwise invalidated, whichever comes first).
One common solution is for your application to generate signed URLs as the page is being rendered, using very short expiration times.
Another is for the link to the secured content to actually be a link back to the application, which verifies the user's authority to access the object, and then returns an HTTP redirect to a freshly-generated signed URL with a short expiration time (e.g. 5 seconds).
HTTP/1.1 302 Found
Location: https://example-bucket.s3.amazonaws.com/...?X-Amz-...
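A minimal sketch of that redirect handler, assuming boto3; the bucket and key names are placeholders, and the application is assumed to have verified the user's session before this point:

import boto3

s3 = boto3.client("s3")

def media_redirect(bucket: str, key: str) -> dict:
    # Answer with a 302 to a freshly signed URL that expires almost
    # immediately; an in-flight download continues past expiry.
    url = s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=5,  # seconds
    )
    return {"statusCode": 302, "headers": {"Location": url}}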
Signed URLs cannot be tampered with using currently feasible compute capabilities, so it is impractical to the point of impossibility for a signed URL to be modified by a malicious user.
Note also that a signed URL (for either S3 or CloudFront) only needs to be not-yet-expired when the download starts. The time required for the download to actually finish can be arbitrarily long, and the download will not be interrupted.
There is no ready-made service for the following option, but using a combination of CloudFront Lambda@Edge triggers and DynamoDB, it is possible to create a genuinely single-use URL, which consists of a randomly generated "token" stored in the DynamoDB table and associated with the target object. When the URL is accessed, you use a DynamoDB conditional update in the Lambda trigger to update the (e.g.) "view_count" value from 0 to 1. If the token isn't in the table or the view count isn't 0, the conditional update fails, so access is denied; otherwise CloudFront allows the request to proceed -- exactly once. CloudFront accesses the S3 content using an Origin Access Identity, which all happens behind the scenes, so nothing related to the actual authentication of the request between CloudFront and S3 is accessible to the user. (For cryptographic-quality random token generation, you can also use KMS's GenerateRandom API action.)
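A sketch of the conditional update at the heart of that scheme, assuming boto3 and a hypothetical single_use_tokens table keyed by token:

import boto3
from botocore.exceptions import ClientError

table = boto3.resource("dynamodb").Table("single_use_tokens")  # hypothetical

def consume_token(token: str) -> bool:
    # Atomically flip view_count from 0 to 1; False means deny access.
    try:
        table.update_item(
            Key={"token": token},
            UpdateExpression="SET view_count = :one",
            ConditionExpression="view_count = :zero",
            ExpressionAttributeValues={":one": 1, ":zero": 0},
        )
        return True
    except ClientError as e:
        if e.response["Error"]["Code"] == "ConditionalCheckFailedException":
            return False  # token missing or already used
        raise

Because the update is conditional, two concurrent requests for the same token cannot both succeed; DynamoDB rejects the second with ConditionalCheckFailedException.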
There are a number of alternative approaches, including other uses of Lambda@Edge triggers to do things like inspecting a request for an application-provided cookie and then querying the application server to authenticate the user.
CloudFront also supports signed cookies that it parses and interprets itself, but these provide wildcard-based access to all your assets matching a specific URL and path (e.g. /images/*), and there is nothing to prevent a user from sharing their cookies, so these are probably not useful for your use case.
CloudFront signed URLs do support the option of allowing access only if the signed URL is used from a specific source (client) IP address, but this has potential problems in that there is no assurance of a 1:1 correlation between users and IP addresses. Many users can be behind the same address (particularly in corporate network environments), or a single user's address can change at any moment.
The complexity of the possible implementations varies wildly, and what you need depends in part on how secure you need for your content to be. In many cases, more extreme solutions accomplish little more than discouraging honest users, because the user can still download the resource and share it via other means.
That would still be a separate user requesting content. For a separate user, the signed URL would no longer be valid.
Source: https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/private-content-signed-urls.html

Storing a session using cookies or http session variables for a scalable solution?

I have one web app which, during the login process, stores the userId in an HTTP session variable (after confirmation, of course!). I'm not using any session variables other than this one to retrieve information about the user. I don't know yet whether this is the most scalable solution for me. Does my server reserve any memory for this? Is it better to use cookies instead?
If you are using multiple application servers (now or in the future), I believe the HTTP session variable is dependent on the server the user is on (correct me if I'm wrong), so in this case you can use a "sticky session" solution that locks the user to a particular server (e.g. EC2's Load Balancers offer this: http://aws.amazon.com/about-aws/whats-new/2010/04/08/support-for-session-stickiness-in-elastic-load-balancing/ ).
I recommend using a cookie (assuming my logic above is right), but you should make sure you have some sort of security measure on it so users can't change their cookie and gain access to another user's account. For example, you could hash some string with a secret key and the user ID, and check it server-side to confirm the cookie has not been tampered with.
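A minimal sketch of that tamper check in Python, using an HMAC (a keyed hash) over the user ID; the secret is a placeholder:

import hashlib
import hmac

SECRET_KEY = b"replace-with-a-real-secret"  # placeholder

def make_cookie(user_id: str) -> str:
    mac = hmac.new(SECRET_KEY, user_id.encode(), hashlib.sha256).hexdigest()
    return f"{user_id}.{mac}"

def verify_cookie(value: str) -> str | None:
    user_id, _, mac = value.rpartition(".")
    expected = hmac.new(SECRET_KEY, user_id.encode(), hashlib.sha256).hexdigest()
    return user_id if hmac.compare_digest(mac, expected) else None

hmac.compare_digest is used so the comparison doesn't leak timing information.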

Set session for different domains from the same server?

Does anyone know whether I can set a session value for the current domain, and use this session for another domain?
For example:
when I set a session on domain www.aabc.com, I wish this session to also work on domain www.ccc.com -- e.g. I click a button on www.aabc.com that redirects the browser to www.ccc.com?
You can only set cookies for your own domain (and other sites on your domain, like subdomains, if I remember correctly).
This is (mainly?) for security reasons: otherwise, anyone could set cookies for any website... I let you imagine the mess ^^
(The only way to set cookies for another domain seems to be by exploiting a browser security hole -- see http://en.wikipedia.org/wiki/Cross-site_cooking for instance; so, in normal cases, it's not possible -- happily.)
I had to set this up at my last job. The way it was handled was through some hand-waving and semi-secure hash passing.
Basically, each site, site A and site B, has an identical gateway set up on its domain. The gateway accepts a user ID, a timestamp, a redirect URL, and a hash. The hash is computed from a shared key, the timestamp, and the user ID.
Site A generates the hash and sends all of the information listed above to the gateway at site B. Site B then hashes the received user ID and timestamp with the shared key.
If the generated hash matches the received hash, the gateway logs the user in, loads their session from a shared memory table or memcached pool, and redirects the user to the received redirect URL.
Lastly, the timestamp is used to determine an expiration time for the passed hash (e.g. the hash is only valid for x time). Something around 2.5 minutes is what we used for our TTL (to account for network lag and perhaps a refresh or two). (A sketch of this handoff appears after the key points below.)
The key points here are:
Having a shared resource where sessions can be serialized
Using a shared key to create and confirm hashes (if you're going to use md5, do multiple passes)
Only allow the hash to be valid for a small, but reasonable amount of time.
This requires control of both domains.
Hope that was helpful.
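A minimal sketch of that handoff in Python, using HMAC-SHA256 in place of the multi-pass md5 mentioned above; the shared key and TTL are placeholders:

import hashlib
import hmac
import time

SHARED_KEY = b"key-known-to-both-sites"  # placeholder
TTL_SECONDS = 150  # ~2.5 minutes, as in the answer above

def make_handoff(user_id: str) -> dict:
    # Site A builds this and sends it to site B's gateway.
    ts = str(int(time.time()))
    mac = hmac.new(SHARED_KEY, f"{user_id}:{ts}".encode(), hashlib.sha256).hexdigest()
    return {"user_id": user_id, "timestamp": ts, "hash": mac}

def verify_handoff(user_id: str, timestamp: str, received_hash: str) -> bool:
    # Site B recomputes the hash and enforces the TTL.
    if time.time() - int(timestamp) > TTL_SECONDS:
        return False  # expired
    expected = hmac.new(SHARED_KEY, f"{user_id}:{timestamp}".encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, received_hash)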
You cannot access both domains' sessions directly; however, there are legitimate solutions for passing session data between two sites that you control. For data that you don't mind being tampered with, you can simply have a page on domain abc.com load a 1px by 1px "image" on xyz.com and pass the appropriate data in the query string. This is very insecure, so make sure the user can't break anything by tampering with it.
Another option is to use a common store of some kind. If the two domains have access to the same database, this can be a table that domain abc.com stores a record in, passing the id of that record to domain xyz.com. This is a more appropriate approach if you're trying to pass login information. Just make sure you obfuscate the ids so a user can't guess another record's id.
Another approach to the common-store method, if the two domains are on different servers or cannot access the same database, is to implement some sort of cache store service that stores information for a time and is accessible by both domains. Domain abc.com passes in some data, the service passes back an ID, and domain abc.com sends that ID to domain xyz.com, which then asks the service for the data. Again, if you develop this service yourself, make sure you obfuscate the ids.