AWS S3 Presigned URL Policies - amazon-web-services

Currently working on a project that is essentially an online gallery stored on AWS S3. The current flow is that the frontend webpage sends a request to an API service hosted on EC2, which returns a presigned URL. The service will be written in Go, using aws-sdk-go-v2.
The URL is time-limited and set up for a PUT object request. Unfortunately, I haven't figured out how to restrict the other aspects of the file that will be uploaded. Ideally, the URL should be limited in what it can accept, i.e. images only.
My searches have turned up mixed results: some say it's possible, some say it isn't, and others don't mention my use case at all.
There are plenty of answers about setting a POST/PUT policy, but they're either for the V1 SDK or for a different language entirely. This answer here even lists a few libraries that do it, but I don't want to resort to those yet, since they require access keys to be placed in the code or somewhere in the environment (I'm trying to reap the benefits of the EC2 IAM role handling credentials automatically).
Does anyone know if this is still possible on V2 of the SDK? I'd like to stay on V2 for the sake of consistency.
EDIT: Almost forgot. I saw that it was possible for S3 to follow a link (i.e. trigger a callback) upon upload completion as well. Is this still possible? It would be great as verification that something was uploaded, so that it could be logged.

You can validate the filename through your backend API before returning the presigned PUT URL. A less secure, but still helpful, option is to also validate the file content in the frontend client.
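For the backend-validation idea, here is a rough Go sketch with aws-sdk-go-v2 (the bucket name, the /upload-url route, the filename query parameter, and the allowed-extension list are placeholders made up for illustration):

package main

import (
    "context"
    "encoding/json"
    "net/http"
    "path"
    "time"

    "github.com/aws/aws-sdk-go-v2/aws"
    "github.com/aws/aws-sdk-go-v2/config"
    "github.com/aws/aws-sdk-go-v2/service/s3"
)

// Allowed extensions mapped to the Content-Type the uploader must use.
var allowed = map[string]string{".jpg": "image/jpeg", ".png": "image/png", ".gif": "image/gif"}

func main() {
    // On EC2 this picks up the instance role automatically; no access keys in code.
    cfg, err := config.LoadDefaultConfig(context.Background())
    if err != nil {
        panic(err)
    }
    presigner := s3.NewPresignClient(s3.NewFromConfig(cfg))

    http.HandleFunc("/upload-url", func(w http.ResponseWriter, r *http.Request) {
        filename := r.URL.Query().Get("filename") // sanitize this in real code
        contentType, ok := allowed[path.Ext(filename)]
        if !ok {
            http.Error(w, "only jpg/png/gif uploads are allowed", http.StatusBadRequest)
            return
        }
        // Setting ContentType makes it part of the signature, so a PUT that sends a
        // different Content-Type header should be rejected by S3 (worth verifying
        // against your own bucket before relying on it).
        req, err := presigner.PresignPutObject(r.Context(), &s3.PutObjectInput{
            Bucket:      aws.String("my-gallery-bucket"), // placeholder
            Key:         aws.String("uploads/" + filename),
            ContentType: aws.String(contentType),
        }, s3.WithPresignExpires(15*time.Minute))
        if err != nil {
            http.Error(w, "could not presign", http.StatusInternalServerError)
            return
        }
        json.NewEncoder(w).Encode(map[string]string{"url": req.URL})
    })
    http.ListenAndServe(":8080", nil)
}

The frontend then PUTs the file to the returned URL with the matching Content-Type header.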

I unfortunately have not discovered a way to restrict uploads via the Content-Type, but I have found a few tricks that might help you.
First, while this is a bit of a sledgehammer when you might want a scalpel, you can apply a bucket policy to restrict uploads by filename. This example from the AWS Knowledge Center only allows a few image extensions.
{
  "Version": "2012-10-17",
  "Id": "Policy1464968545158",
  "Statement": [
    {
      "Sid": "Stmt1464968483619",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::111111111111:user/exampleuser"
      },
      "Action": "s3:PutObject",
      "Resource": [
        "arn:aws:s3:::DOC-EXAMPLE-BUCKET/*.jpg",
        "arn:aws:s3:::DOC-EXAMPLE-BUCKET/*.png",
        "arn:aws:s3:::DOC-EXAMPLE-BUCKET/*.gif"
      ]
    },
    {
      "Sid": "Stmt1464968483619",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:PutObject",
      "NotResource": [
        "arn:aws:s3:::DOC-EXAMPLE-BUCKET/*.jpg",
        "arn:aws:s3:::DOC-EXAMPLE-BUCKET/*.png",
        "arn:aws:s3:::DOC-EXAMPLE-BUCKET/*.gif"
      ]
    }
  ]
}
Second, you can automatically trigger a Lambda function after the file is uploaded. With this method, you can inspect anything you want about the file after it is uploaded. This is generally my preferred approach, but the problem with it is that there's no immediate feedback to the client that the file was invalid.
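As a rough sketch of such a Lambda in Go (using the aws-lambda-go library; the logging below is just one example of what you could do with the event, and it also covers the question's edit about being notified when an upload completes):

package main

import (
    "context"
    "log"

    "github.com/aws/aws-lambda-go/events"
    "github.com/aws/aws-lambda-go/lambda"
)

// Invoked by an s3:ObjectCreated:* notification configured on the bucket.
func handler(ctx context.Context, evt events.S3Event) error {
    for _, rec := range evt.Records {
        // Log the upload; you could also fetch the object here and delete it
        // if it fails whatever validation you need.
        log.Printf("uploaded: s3://%s/%s (%d bytes)",
            rec.S3.Bucket.Name, rec.S3.Object.Key, rec.S3.Object.Size)
    }
    return nil
}

func main() {
    lambda.Start(handler)
}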
My last option requires the client to do some more work, and it's what I used most recently. If you are uploading very large files, S3 requires you to use multipart uploads. This isn't multipart like an HTML form; this is breaking the file up into chunks and uploading each of them. If your file is sufficiently small (I think the limit is 5 GB), you can just do a single part. Once all the parts are uploaded, you must make another call to your server that finalizes the upload with S3. This is where you can add some validation of the file and respond to the client while they're still on the upload page.
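A rough sketch of what that finalize call could check on the backend, again with aws-sdk-go-v2 (the bucket/key plumbing and the allowed types are assumptions; for a multipart upload this is also where the backend would complete the upload before validating). Note that Content-Type is still whatever the uploader claimed, so sniffing the object's first bytes would be stricter:

import (
    "context"
    "fmt"

    "github.com/aws/aws-sdk-go-v2/aws"
    "github.com/aws/aws-sdk-go-v2/service/s3"
)

// finalizeUpload is called once the client reports that the upload finished.
func finalizeUpload(ctx context.Context, client *s3.Client, bucket, key string) error {
    head, err := client.HeadObject(ctx, &s3.HeadObjectInput{
        Bucket: aws.String(bucket),
        Key:    aws.String(key),
    })
    if err != nil {
        return err
    }
    ct := aws.ToString(head.ContentType)
    if ct != "image/jpeg" && ct != "image/png" && ct != "image/gif" {
        // Not an image: remove it and report the failure while the client is
        // still on the upload page.
        _, _ = client.DeleteObject(ctx, &s3.DeleteObjectInput{
            Bucket: aws.String(bucket),
            Key:    aws.String(key),
        })
        return fmt.Errorf("rejected %q: unexpected content type %q", key, ct)
    }
    return nil
}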

Related

AWS S3 Bucket Policy for CORS

I am trying to figure out how to follow these instructions to set up an S3 bucket on AWS. I have been trying most of the year but still can't make sense of the AWS documentation. I think this github repo readme may have been prepared at a time when the AWS S3 interface looked different (the form for setting S3 bucket permissions no longer shows a CORS setting).
I asked this question earlier this year, and I have tried using the upvoted answer to make a bucket policy, entered precisely as shown in that answer:
[
  {
    "AllowedHeaders": [
      "*"
    ],
    "AllowedMethods": [
      "POST",
      "GET",
      "PUT",
      "DELETE",
      "HEAD"
    ],
    "AllowedOrigins": [
      "*"
    ],
    "ExposeHeaders": []
  }
]
When I try this, I get an error that says:
Ln 1, Col 0: Data Type Mismatch: The text does not match the expected JSON data type Object. Learn more
The Learn More link goes to a page that suggests I can resolve this error by updating the text to use the supported data type. I do not know what that means in this context. I don't know how to find out which condition keys require which type of data.
Resolving the error
Update the text to use the supported data type.
For example, the Version global condition key requires a String data type. If you provide a date or an integer, the data type won't match.
Can anyone help with current instructions for how to create the CORS permissions that are consistent with the spirit of the instructions in the github readme for the repo?
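For what it's worth, the JSON above is a CORS configuration rather than a bucket policy, which is why the bucket policy editor reports a data type mismatch (a policy must be a JSON object, while a CORS configuration is an array of rules). It belongs in the bucket's CORS settings (Permissions -> Cross-origin resource sharing in the current console), or it can be applied through the API. A rough sketch of applying the same rules with aws-sdk-go-v2, with the bucket name as a placeholder:

package main

import (
    "context"

    "github.com/aws/aws-sdk-go-v2/aws"
    "github.com/aws/aws-sdk-go-v2/config"
    "github.com/aws/aws-sdk-go-v2/service/s3"
    "github.com/aws/aws-sdk-go-v2/service/s3/types"
)

func main() {
    cfg, err := config.LoadDefaultConfig(context.Background())
    if err != nil {
        panic(err)
    }
    client := s3.NewFromConfig(cfg)

    // Same rules as the JSON in the question, applied as a CORS configuration
    // rather than as a bucket policy ("DOC-EXAMPLE-BUCKET" is a placeholder).
    _, err = client.PutBucketCors(context.Background(), &s3.PutBucketCorsInput{
        Bucket: aws.String("DOC-EXAMPLE-BUCKET"),
        CORSConfiguration: &types.CORSConfiguration{
            CORSRules: []types.CORSRule{{
                AllowedHeaders: []string{"*"},
                AllowedMethods: []string{"POST", "GET", "PUT", "DELETE", "HEAD"},
                AllowedOrigins: []string{"*"},
                ExposeHeaders:  []string{},
            }},
        },
    })
    if err != nil {
        panic(err)
    }
}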

What's the most efficient way to determine the minimum AWS permissions necessary for a Terraform configuration?

I have a Terraform configuration targeting deployment on AWS. It applies beautifully when using an IAM user that has permission to do anything (i.e. {actions: ["*"], resources: ["*"]}).
In pursuit of automating the application of this Terraform configuration, I want to determine the minimum set of permissions necessary to apply the configuration initially and effect subsequent changes. I specifically want to avoid giving overbroad permissions in policy, e.g. {actions: ["s3:*"], resources: ["*"]}.
So far, I'm simply running terraform apply until an error occurs. I look at the output or at the terraform log output to see which API call failed and then add it to the deployment user policy. EC2 and S3 are particularly frustrating because the names of the actions don't necessarily align with the API method names. I'm several hours into this with no easy way to tell how far along I am.
Is there a more efficient way to do this?
It'd be really nice if Terraform advised me what permission/action I need but that's a product enhancement best left to Hashicorp.
Here is another approach, similar to what was said above, but without getting into CloudTrail -
Give full permissions to your IAM user.
Run TF_LOG=trace terraform apply --auto-approve &> log.log
Run cat log.log | grep "DEBUG: Request"
You will get a list of all AWS Actions used.
While I still believe that such a super-strict policy will be a continuous pain and will likely kill productivity (though that might depend on the project), there is now a tool for this.
iamlive uses the Client Side Monitoring feature of the AWS SDK to create a minimal policy based on the executed API calls. As Terraform uses the AWS SDK, this works here as well.
In contrast to my previous (and accepted) answer, iamlive should even get the actual IAM actions right, which do not necessarily match the API calls 1:1 (and which would be logged by CloudTrail).
For this to work with terraform, you should do export AWS_CSM_ENABLED=true
Here is an efficient way I followed: allow all permissions (*) for each service Terraform touches first, then deny the ones that are not required.
For example
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowSpecifics",
      "Action": [
        "ec2:*",
        "rds:*",
        "s3:*",
        "sns:*",
        "sqs:*",
        "iam:*",
        "elasticloadbalancing:*",
        "autoscaling:*",
        "cloudwatch:*",
        "cloudfront:*",
        "route53:*",
        "ecr:*",
        "logs:*",
        "ecs:*",
        "application-autoscaling:*",
        "events:*",
        "elasticache:*",
        "es:*",
        "kms:*",
        "dynamodb:*"
      ],
      "Effect": "Allow",
      "Resource": "*"
    },
    {
      "Sid": "DenySpecifics",
      "Action": [
        "iam:*User*",
        "iam:*Login*",
        "iam:*Group*",
        "iam:*Provider*",
        "aws-portal:*",
        "budgets:*",
        "config:*",
        "directconnect:*",
        "aws-marketplace:*",
        "aws-marketplace-management:*",
        "ec2:*ReservedInstances*"
      ],
      "Effect": "Deny",
      "Resource": "*"
    }
  ]
}
You can easily adjust the list in the Deny section if Terraform doesn't need, or your company doesn't use, certain AWS services.
EDIT Feb 2021: there is a better way using iamlive and client side monitoring. Please see my other answer.
As I guess there's no perfect solution, treat this answer a bit as the result of my brainstorming. At least for the initial permission setup, I could imagine the following:
Allow everything first and then process the CloudTrail logs to see which API calls were made in a terraform apply / destroy cycle.
Afterwards, you update the IAM policy to include exactly these calls.
AWS itself now provides tooling for tracking minimum permissions: https://aws.amazon.com/blogs/security/iam-access-analyzer-makes-it-easier-to-implement-least-privilege-permissions-by-generating-iam-policies-based-on-access-activity/.
If you wanted to be picky about the minimum viable permission principle, you could use CloudFormation StackSets to deploy different roles with minimum permissions, so Terraform could assume them on each module call via different providers, i.e. if you have a module that deploys ASGs, LBs and EC2 instances, then:
include those actions in a role where the workload lives
add a terraform aws provider block that assumes that role
use that provider block within the module call.
The burden is managing possibly quite a few Terraform roles, but it can be worth it if you want to be picky or have customer requirements to shrink down the Terraform user's permissions.
You could also download the CloudTrail event history for the last X days (up to 90) and run the following:
cat event_history.json <(echo "]}") | jq '[.Records[] | .eventName] | unique'
The echo is needed because the file gets truncated (for an unknown reason) when downloaded from CloudTrail's page, as you can see below:
> jsonlint event_history.json
Error: Parse error on line 1:
...iam.amazonaws.com"}}
-----------------------^
Expecting ',', ']', got 'EOF'
at Object.parseError (/usr/local/Cellar/jsonlint/1.6.0/libexec/lib/node_modules/jsonlint/lib/jsonlint.js:55:11)
at Object.parse (/usr/local/Cellar/jsonlint/1.6.0/libexec/lib/node_modules/jsonlint/lib/jsonlint.js:132:22)
at parse (/usr/local/Cellar/jsonlint/1.6.0/libexec/lib/node_modules/jsonlint/lib/cli.js:82:14)
at main (/usr/local/Cellar/jsonlint/1.6.0/libexec/lib/node_modules/jsonlint/lib/cli.js:136:14)
at Object.<anonymous> (/usr/local/Cellar/jsonlint/1.6.0/libexec/lib/node_modules/jsonlint/lib/cli.js:178:1)
at Module._compile (node:internal/modules/cjs/loader:1097:14)
at Object.Module._extensions..js (node:internal/modules/cjs/loader:1149:10)
at Module.load (node:internal/modules/cjs/loader:975:32)
at Function.Module._load (node:internal/modules/cjs/loader:822:12)
at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:77:12)
As an addition to any of the TF_LOG=trace / iamlive / CloudTrail approaches suggested before, please also note that to capture the complete set of actions required to manage a configuration (create/update/delete resources), you would need to actually apply three configurations:
The original one, to capture the actions required to create resources.
A mutated one with as many resource arguments changed as possible, to capture the actions required to update resources in place.
An empty one (applied last) or terraform destroy, to capture the actions required to delete resources.
While configurations 1 and 3 are commonly considered, configuration 2 is sometimes overlooked, and it can be tedious to prepare. Without it, the resulting policy will be missing the permissions Terraform needs to modify resources in place rather than deleting and recreating them.
Here is an extension on AvnerSo's answer:
cat log.log | ack -o "(?<=DEBUG: Request )[^ ]*" | sort -u
This command outputs every unique AWS request that Terraform has logged.
The "(?<=DEBUG: Request )[^ ]*" pattern performs a negative lookahead to find the first word after the match.
The -o flag only shows the match in the output.
sort -u selects the unique values from the list and sorts them.
Another option in addition to the previous answers is:
give broad permissions "s3:*", ... as explained earlier
Check the AWS Access Advisor tab in the AWS console for the permissions that were actually used, and then trim down your permissions accordingly

Amazon S3 Policy Prevent Creation of Root-level Folders

We're looking to prevent our S3 users from creating new folders in our bucket's root. In other words, they must use existing folders within the bucket to upload or modify files. They may choose to create subfolders in these existing folders if they'd like.
Note: Using S3 policies. Users choose any existing folder. They do not have assigned folders.
I know S3 treats both files and folders as objects so I'm not sure this can even be done, but I believe in the community's potential.
Here's what I want:
Bucket-name: test-bucket
Action: Create folder in test-bucket's root.
Desired Result: Denied
Action: Upload random file in test-bucket's root.
Desired Result: Denied
Action: Upload file "file1" in test-bucket's existing "folder1" folder (test-bucket/folder1/file1).
Desired Result: Success
Action: Create folder "sub-folder1" in test-bucket's existing "folder1" folder (test-bucket/folder1/sub-folder1/).
Desired Result: Success
There is a flaw in your conceptual model.
I know S3 treats both files and folders as objects
That isn't correct.
Here's the correct version:
The S3 service and API have no concept of folders.
S3 objects are not in any real sense hierarchical.
The S3 console is the only entity with a concept of folders.
The S3 service and API support a concept of prefix, delimiter, and common prefixes.
When a List Objects request to the API is accompanied by a specified prefix, only objects with keys beginning with that prefix are returned, regardless of any / in their object key after the prefix.
When a List Objects request is accompanied by a prefix and also by a delimiter (usually /), the API only returns objects whose keys match the given prefix and have no subsequent / (after the prefix specified in the request) in the key. These are analogous to the "files in the folder."
The keys of any objects that match the given prefix but do have a subsequent / (after the specified prefix) are coalesced down to a unique list of prefixes, truncated at that next /. These are the common prefixes, analogous to the "folders in the folder."
But, in fact, nothing is really "in" anything else.
The console creates an illusion of folders by reading the common prefixes from a list objects request to the API, and showing these as folders.
The console furthers the illusion by allowing you to "create a folder" -- but it is not in fact a folder, and it isn't even needed. It is simply an empty object with a key whose last character is /. This object is not needed for normal operation of S3, but is created as a convenience, so that you can navigate "into" an "empty folder" and upload a file "into" the empty folder.
However, what's really happening is this:
Console: "create folder foo in the root of the bucket"
API: PUT /foo/
Content-Length: 0
Console: "click folder foo"
API: GET /?prefix=foo/&delimiter=/
Console: "upload file bar.txt inside folder foo"
API: PUT /foo/bar.txt
Now... if you take an empty bucket, and use the API (not the console), you can simply PUT /foo/bar.txt and you get the exact same net result in the console -- you see a folder named "foo" containing "bar.txt." The folder is displayed because there's an object with the prefix foo/. Delete the object and the folder vanishes.
Conversely, if you did it from the top with the console, once you deleted "bar.txt" there would still be folder "foo" because that's really just an empty object whose sole purpose is to cause a folder to appear in the console navigation when there are no other objects with that common prefix.
So, no... S3 does not treat both files and folders as objects. The S3 console creates objects that spoof folders, strictly as an aid to navigation, and the magic here is that the object's key ends with /. On the other hand, if those empty objects aren't there, the console still displays objects as though they were in folders.
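To make the prefix/delimiter behaviour concrete, here is a small Go sketch (aws-sdk-go-v2; the bucket and prefix are taken from the example in the question) of roughly the request the console makes when you "open" folder1/. CommonPrefixes are what it draws as folders, Contents are what it draws as files:

package main

import (
    "context"
    "fmt"

    "github.com/aws/aws-sdk-go-v2/aws"
    "github.com/aws/aws-sdk-go-v2/config"
    "github.com/aws/aws-sdk-go-v2/service/s3"
)

func main() {
    cfg, err := config.LoadDefaultConfig(context.Background())
    if err != nil {
        panic(err)
    }
    client := s3.NewFromConfig(cfg)

    // List one "level" of test-bucket under folder1/.
    out, err := client.ListObjectsV2(context.Background(), &s3.ListObjectsV2Input{
        Bucket:    aws.String("test-bucket"),
        Prefix:    aws.String("folder1/"),
        Delimiter: aws.String("/"),
    })
    if err != nil {
        panic(err)
    }
    for _, cp := range out.CommonPrefixes {
        fmt.Println("folder:", aws.ToString(cp.Prefix)) // e.g. folder1/sub-folder1/
    }
    for _, obj := range out.Contents {
        fmt.Println("file:  ", aws.ToString(obj.Key)) // e.g. folder1/file1
    }
}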
You then see the problem that develops. The S3 service can't be asked to test against something of which it is unaware and which in fact need not exist.
So, it is not technically possible to do exactly what you asked; however, there appears to be a limited workaround. The primary limitation is that you can't specify that "the folder must exist," but you can specify "the object key prefix must match a predefined set of patterns."
The relevant part of the bucket or user policy might look something like this...
"Action": "s3:PutObject",
"Resource": [
"arn:aws:s3:::examplebucket/taxdocuments/*",
"arn:aws:s3:::examplebucket/personnel/*",
"arn:aws:s3:::examplebucket/unicorns/*"
...
],
A user impacted by this policy would be able to create any object beginning with taxdocuments/ or personnel/ or unicorns/ in the "examplebucket" bucket, and would not be able to create an object without one of those prefixes. Beyond that, they can create console folders "in" folders "in" folders all day long, as long as one of these prefixes is at the beginning of every fake folder's object key.
The limitation of course is that making another folder eligible for access requires modifying the policy.
This might work also, but proceed with caution:
"Resource": "arn:aws:s3:::examplebucket/?*/?*",
Intuitively it seems like this might work, but the flaw here -- assuming the ?*/?* is valid (it seems to be) and that ? does not match 0 characters the way * does -- is that this allows a user to create a new (pseudo-)folder in the root as long as they simultaneously create something inside it with a name at least one character long, using the API -- that is, creating an object with key pics/cat.jpg "creates" the "pics" folder if it's not already there, as explained above. From the console, this should prevent creation of new folders in the root, but from the API it would impose no such restriction.
Thank you for your elaborate response @Michael. You are absolutely correct in saying that API and CLI calls can still create root-level folders as long as something is created inside them. Console and S3-browser access works as intended. It is a compromise, but it is the closest we can get to what we wanted. Here is the bucket policy I'm using:
{
  "Version": "2012-10-17",
  "Id": "Policy1486492608325",
  "Statement": [
    {
      "Sid": "Stmt1486492495770",
      "Effect": "Allow",
      "Principal": "*",
      "Action": [
        "s3:DeleteObject",
        "s3:Get*",
        "s3:List*",
        "s3:PutObject"
      ],
      "Resource": [
        "arn:aws:s3:::test-storage",
        "arn:aws:s3:::test-storage/*"
      ]
    },
    {
      "Sid": "Stmt1486492534643",
      "Effect": "Deny",
      "Principal": "*",
      "Action": [
        "s3:DeleteObject",
        "s3:PutObject"
      ],
      "NotResource": "arn:aws:s3:::test-storage/?*/?*"
    }
  ]
}

Preventing a user from even knowing about other users (folders) on AWS S3

I have a question about writing IAM policies on AWS S3 that was partially answered here, in this nice post by Jim Scharf:
https://aws.amazon.com/blogs/security/writing-iam-policies-grant-access-to-user-specific-folders-in-an-amazon-s3-bucket/
Taking Jim's post as a starting point, what I am trying to achieve is preventing a user from even knowing about the existence of other users that have access to the same bucket while using the S3 console. Jim's solution, as well as others I've found, restricts a given user from accessing the contents inside another user's folder. But none of the solutions provide what I call a "partial listing" for a user "u", i.e., not even displaying the folders whose contents "u" is not allowed to access.
The following post is also very similar to my question:
How to set up S3 Policies for multiple IAM users such that each individual only has access to their personal bucket folder?
But unlike the setup in that post, I need to resemble a file system structure that has an "intermediate" home folder between the bucket name and the user-specific folder (just like in Jim's post):
mybucket/home/user1
mybucket/home/user2
What I've done so far is as follows:
Created a single bucket as well as a number of "folders" inside it
Created a number of users that I've grouped together in a group. Each user has a corresponding folder inside the bucket with the same name
Set up an IAM policy that I have attached to the group, as follows:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowUsersToSeeBucketListInTheConsole",
      "Action": ["s3:ListAllMyBuckets", "s3:GetBucketLocation"],
      "Effect": "Allow",
      "Resource": ["arn:aws:s3:::*"]
    },
    {
      "Sid": "AllowRootAndHomeListingOfMyBucket",
      "Action": ["s3:ListBucket"],
      "Effect": "Allow",
      "Resource": ["arn:aws:s3:::mybucket"],
      "Condition": {"StringEquals": {"s3:prefix": ["", "home/"], "s3:delimiter": ["/"]}}
    },
    {
      "Sid": "AllowListingOfUserFolder",
      "Action": ["s3:ListBucket"],
      "Effect": "Allow",
      "Resource": ["arn:aws:s3:::mybucket"],
      "Condition": {"StringLike": {"s3:prefix": ["home/${aws:username}/*"]}}
    }
  ]
}
I understand that the Sid "AllowRootAndHomeListingOfMyBucket" above grants permission to "ListBucket" when the prefix is "home/", which in effect allows any user "u" in the group to list the entire set of folders "f" inside the "home" folder, regardless of whether "u" has access to a given element of "f" or not. What I don't know is whether there is any cleverly designed "Deny" rule that would restrict even listing those folders whose contents "u" is not allowed to see.
According to Bob Kinney this was not possible back in 2013:
IAM Policy to list specific folders inside a S3 bucket for an user
However I am not sure if things have evolved in the meantime. Any suggestion is greatly appreciated.
Thank you.
No, this isn't possible, because what the policy allows is not what you can see, but rather what you can ask to see. And asking S3 to see object lists is done with prefix and delimiter.
When navigating a bucket, behind the scenes, the console asks for these things:
Click on bucket: List the root of the bucket (prefix empty string with delimiter /) -- returns all common prefixes ("folders") and all objects in the root up to one / deep. (It isn't shown, but folder names actually end with / when you create folders using the console -- that's the sole reason the console shows them as folders -- the hidden / at the end of what is actually an ordinary empty object).
Click on home: List all at prefix home/ with delimiter / -- returns all common prefixes and objects under home/ up to one more / -- so, this returns home/david/, home/emily/, home/genevieve/, home/james/, etc.
Click on david: List all at prefix home/david/ with delimiter / ... you perhaps get the idea.
Note how these three clicks correspond with the allowed actions in the blog post -- list the root, list home, list the user's specific directory under home.
Lacking permission to list other users' home directories, you can see that they exist, but you can't drill down into them.
Reiterating... policies control what you can ask for, not what you can see. In order to navigate to your own home directory, you have to be able to list the home directories, or you can't navigate to yours. That's the fundamental reason why this can't be done with the console -- there is no policy you can write that prevents users from seeing other entries within a single level of the /-delimited hierarchy that they are allowed to see, because the permission is applied to the request, not the response.

Simple example to restrict access to Cloudfront(S3) files from some users but not others

I'm just getting started with permissions on AWS S3 and Cloudfront so please take it easy on me.
Two main questions:
I'd like to allow access to some users (e.g., those that are logged in) but not others. I assume I need to be using ACLs instead of a bucket policy, since the former is more customizable in that you can identify the user in the URL with query parameters. First of all, is this correct? Can someone point me to the plainest-English description of how to do this on a file-by-file, user-by-user basis? The documentation on ACLs confuses the heck out of me.
I'd also like to restrict access such that people can only view content on my-site.com and not your-site.com. Unfortunately, the example bucket policy from the S3 documentation for this has no effect on access to my demo bucket (see the code below, slightly adapted from the AWS docs). Moreover, if I first need to focus on allowing user-by-user access, do I even want to be defining a bucket policy?
I realize I'm not even touching on how to make this work in the context of CloudFront (the ultimate goal), but any thoughts on questions 1 and 2 would be greatly appreciated, and mentioning CloudFront would be a bonus at this point.
{
  "Version": "2008-10-17",
  "Id": "http referer policy example",
  "Statement": [
    {
      "Sid": "AllowPublicRead",
      "Effect": "Allow",
      "Principal": {
        "AWS": "*"
      },
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-bucket/*",
      "Condition": {
        "StringLike": {
          "aws:Referer": [
            "https://mysite.com/*",
            "https://www.mysite.com/*"
          ]
        }
      }
    }
  ]
}
To restrict access at the CDN and serve what CloudFront calls "private content", you need to use the API to generate signed URLs, and you can define the expiration of each URL. More information is here.
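If your backend happens to be in Go, here is a rough sketch using the CloudFront URL signer that ships alongside aws-sdk-go-v2 (github.com/aws/aws-sdk-go-v2/feature/cloudfront/sign). The key ID, key file, distribution domain, and object path below are placeholders, and the exact signer API is worth double-checking against that package's docs:

package main

import (
    "crypto/x509"
    "encoding/pem"
    "fmt"
    "os"
    "time"

    "github.com/aws/aws-sdk-go-v2/feature/cloudfront/sign"
)

func main() {
    // Load the CloudFront key pair's private key (path is a placeholder;
    // this assumes a PKCS#1 RSA key).
    raw, err := os.ReadFile("cf-private-key.pem")
    if err != nil {
        panic(err)
    }
    block, _ := pem.Decode(raw)
    if block == nil {
        panic("no PEM data found")
    }
    privKey, err := x509.ParsePKCS1PrivateKey(block.Bytes)
    if err != nil {
        panic(err)
    }

    // "K2EXAMPLEKEYID" stands in for your CloudFront key ID.
    signer := sign.NewURLSigner("K2EXAMPLEKEYID", privKey)

    // Canned-policy signed URL, valid for one hour.
    signedURL, err := signer.Sign("https://d111111abcdef8.cloudfront.net/private/cat.jpg",
        time.Now().Add(1*time.Hour))
    if err != nil {
        panic(err)
    }
    fmt.Println(signedURL)
}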
You can use an Origin Access Identity (as explained here) to prevent the content from being served outside CloudFront.
I thought I had some code from a past project to share, but I don't. At least I was able to dig into my bookmarks and find one of the references that helped me through the process, and there is another post here on Stack Overflow that mentions the same reference. See below for the link to the reference and to the post.
http://improve.dk/how-to-set-up-and-serve-private-content-using-s3/
Cloudfront private content + signed urls architecture
Well, it is two years old, so you might have to change it a little bit here and there, but you'll get the idea.