I am trying to figure out how to follow these instructions to set up an S3 bucket on AWS. I have been trying for most of the year but still can't make sense of the AWS documentation. I think this GitHub repo's README may have been written when the AWS S3 interface looked different (there is no longer a CORS setting in the bucket permissions form).
I asked this question earlier this year, and I have tried using the upvoted answer to create a bucket policy, which is exactly as shown in that answer:
[
{
"AllowedHeaders": [
"*"
],
"AllowedMethods": [
"POST",
"GET",
"PUT",
"DELETE",
"HEAD"
],
"AllowedOrigins": [
"*"
],
"ExposeHeaders": []
}
]
When I try this, I get an error that says:
Ln 1, Col 0 Data Type Mismatch: The text does not match the expected JSON data type Object. Learn more
The Learn More link goes to a page that suggests I can resolve this error by updating the text to use the supported data type. I do not know what that means in this context, and I don't know how to find out which condition keys require which type of data.
Resolving the error
Update the text to use the supported data type.
For example, the Version global condition key requires a String data
type. If you provide a date or an integer, the data type won't match.
Can anyone help with current instructions for creating the CORS permissions, consistent with the spirit of the instructions in the repo's GitHub README?
I'm currently working on a project that is essentially an online gallery stored on AWS S3. The process in question is that the frontend webpage sends a request to an API service hosted on EC2, which then returns a presigned URL. The service will be written in Go, using aws-sdk-go-v2.
The URL is time-limited and set up for a PUT object request. Unfortunately, I haven't yet figured out how to limit other aspects of the file that will be uploaded. Ideally, the URL should be limited in what it can accept, i.e. images only.
My searches have come up with mixed results: some say it's possible, some say it's not possible, and some just don't mention what I'm trying to do at all.
There are plenty of answers about setting a POST/PUT policy, but they're either for the V1 SDK or for a different language altogether. This answer here even has a few libraries that do it, but I don't want to resort to those yet, since they require access keys to be placed in the code or somewhere in the environment (I'm trying to reap the benefits of EC2's IAM role automating that part).
Anyone know if this is still a thing on V2 of the SDK? I kinda want to keep it on V2 of the SDK just for the sake of consistency.
EDIT: Almost forgot. I saw that it was also possible for S3 to follow a link upon upload completion. Is this still possible? It would be great as a verification that something was uploaded, so that it could be logged.
You can validate the filename through your backend API before returning the presigned PUT URL. A less secure, but still useful, option is to validate the file content in the frontend client.
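As a rough illustration of that first suggestion, here is an untested sketch using aws-sdk-go-v2; the function name, the allowed extensions, and the 15-minute expiry are all assumptions:
// Untested sketch: reject non-image filenames server-side, then presign a PUT URL.
package upload

import (
	"context"
	"errors"
	"path/filepath"
	"strings"
	"time"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/service/s3"
)

var allowedExtensions = map[string]bool{".jpg": true, ".jpeg": true, ".png": true, ".gif": true}

// PresignImageUpload checks the requested key's extension before returning a time-limited PUT URL.
func PresignImageUpload(ctx context.Context, client *s3.Client, bucket, key string) (string, error) {
	if !allowedExtensions[strings.ToLower(filepath.Ext(key))] {
		return "", errors.New("only image files are allowed")
	}
	presigner := s3.NewPresignClient(client)
	req, err := presigner.PresignPutObject(ctx, &s3.PutObjectInput{
		Bucket: aws.String(bucket),
		Key:    aws.String(key),
	}, s3.WithPresignExpires(15*time.Minute))
	if err != nil {
		return "", err
	}
	return req.URL, nil
}
Note that this only restricts the key name the URL can write to; it says nothing about the actual bytes the client uploads.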
I unfortunately have not discovered a way to restrict uploads via the Content-Type, but I have found a few tricks that might help you.
First, while this is a bit of a sledgehammer when you might want a scalpel, you can apply a bucket policy to restrict uploads by filename. This example from the AWS Knowledge Center only allows a few image file extensions.
{
"Version": "2012-10-17",
"Id": "Policy1464968545158",
"Statement": [
{
"Sid": "Stmt1464968483619",
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::111111111111:user/exampleuser"
},
"Action": "s3:PutObject",
"Resource": [
"arn:aws:s3:::DOC-EXAMPLE-BUCKET/*.jpg",
"arn:aws:s3:::DOC-EXAMPLE-BUCKET/*.png",
"arn:aws:s3:::DOC-EXAMPLE-BUCKET/*.gif"
]
},
{
"Sid": "Stmt1464968483619",
"Effect": "Deny",
"Principal": "*",
"Action": "s3:PutObject",
"NotResource": [
"arn:aws:s3:::DOC-EXAMPLE-BUCKET/*.jpg",
"arn:aws:s3:::DOC-EXAMPLE-BUCKET/*.png",
"arn:aws:s3:::DOC-EXAMPLE-BUCKET/*.gif"
]
}
]
}
Second, you can automatically trigger a Lambda function after the file is uploaded. With this method, you can inspect anything you want about the file once it lands in the bucket. This is generally my preferred approach, but the drawback is that there's no immediate feedback to the client that the file was invalid.
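If you go this route, a rough, untested sketch of such a function in Go (using the aws-lambda-go library; the image check and the log-only handling are just placeholders) might look like:
// Untested sketch: a Go Lambda triggered by s3:ObjectCreated:* events.
// It only logs suspicious keys; deleting the object or notifying the client is left out.
package main

import (
	"context"
	"log"
	"strings"

	"github.com/aws/aws-lambda-go/events"
	"github.com/aws/aws-lambda-go/lambda"
)

func isImageKey(key string) bool {
	k := strings.ToLower(key)
	return strings.HasSuffix(k, ".jpg") || strings.HasSuffix(k, ".jpeg") ||
		strings.HasSuffix(k, ".png") || strings.HasSuffix(k, ".gif")
}

func handler(ctx context.Context, evt events.S3Event) error {
	for _, rec := range evt.Records {
		if !isImageKey(rec.S3.Object.Key) {
			log.Printf("non-image upload detected: s3://%s/%s", rec.S3.Bucket.Name, rec.S3.Object.Key)
			// here you could delete the object or record it for later review
		}
	}
	return nil
}

func main() {
	lambda.Start(handler)
}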
My last option requires the client to do some more work, and it's what I used most recently. If you are uploading very large files, S3 requires you to use multipart uploads. This isn't multipart like an HTML form; it means breaking the file up into chunks and uploading each of them. If your file is sufficiently small (I believe the limit is 5 GB), you can do it as a single part. Once all the parts are uploaded, the client must make another call to your server to finalize the upload with S3. This is where you can add some validation of the file and respond to the client while they're still on the upload page.
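For that finalize step, one option (an untested sketch, under the assumption that the multipart upload has already been completed and that the stored Content-Type is a good enough signal) is to inspect the object with aws-sdk-go-v2 and delete it if it does not look like an image:
// Untested sketch: after CompleteMultipartUpload, inspect the object and
// remove it if it does not look like an image. Names are illustrative.
package upload

import (
	"context"
	"errors"
	"strings"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/service/s3"
)

// ValidateFinishedUpload deletes the object and returns an error when the
// stored Content-Type is not image/*.
func ValidateFinishedUpload(ctx context.Context, client *s3.Client, bucket, key string) error {
	head, err := client.HeadObject(ctx, &s3.HeadObjectInput{
		Bucket: aws.String(bucket),
		Key:    aws.String(key),
	})
	if err != nil {
		return err
	}
	if head.ContentType == nil || !strings.HasPrefix(*head.ContentType, "image/") {
		_, _ = client.DeleteObject(ctx, &s3.DeleteObjectInput{
			Bucket: aws.String(bucket),
			Key:    aws.String(key),
		})
		return errors.New("uploaded object is not an image")
	}
	return nil
}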
I am new to AWS and I have followed this tutorial: https://docs.aws.amazon.com/apigateway/latest/developerguide/integrating-api-with-aws-services-s3.html. From the TEST console, I am now able to read my object stored on S3, which is the following .json file:
[
{
"important": "yes",
"name": "john",
"type": "male"
},
{
"important": "yes",
"name": "sarah",
"type": "female"
},
{
"important": "no",
"name": "maxim",
"type": "male"
}
]
Now, what I am trying to achieve is to pass query parameters. I have added type in the Method Request and added a URL Query String Parameter named type with a method.request.querystring.type mapping in the Integration Request.
When I test, a query string of type=male is not taken into account; I still get all 3 elements instead of the 2 male ones.
Any idea why this is happening?
For information, the resource tree is the following (I am using the AWS Service integration type to create the GET method, as explained in the AWS tutorial):
/
/{folder}
/{item}
GET
In case anyone is interested in the answer, I have been able to solve my problem.
The full, detailed solution would require a tutorial, but here are the main steps. The difficulty lies in the many moving parts, so it is important to test each of them independently to make progress (quite basic, you will tell me).
Make sure the SQL query against your S3 data is correct. To check this, you can go to your S3 bucket, click on your file, and choose "Query with S3 Select" from the Actions menu.
Make sure that your Lambda function works, i.e. check that it builds and passes the correct SQL query when run from a test event.
Set up the API query strings in the Method Request panel and set up the mapping template in the Integration Request panel, using the application/json content type (for me it looked like this: "TypeL1":"$input.params('typeL1')").
Good luck !
I'm creating data sources / datasets in code (boto3), but these don't show up in the console.
Even though the datasets are listed with list_data_sets, they don't seem to be available in the console.
I need to be able to create all the necessary datasets in code and then be able to use these to create new analyses/ dashboards in the console.
I'm using the Standard Edition of QuickSight.
Can this be done? Or, can it only be done in the Enterprise Edition? Or, not at all?
Thanks
According to the QuickSight pricing page, "APIs" are not available in the Standard Edition. Exactly what that means, I have no idea.
But, assuming it is possible to call create-data-set, one important thing to remember is that dataset permissions are necessary for users to be able to view the datasets.
According to the boto3 docs, these permissions should follow this schema:
Permissions=[
{
'Principal': 'string',
'Actions': [
'string',
]
},
]
In my code, I use the following to share with the all-users group (note the group principal, replace AWS_REGION and ACCOUNT_ID with your values)
Permissions= [
{
'Principal': 'arn:aws:quicksight:AWS_REGION:ACCOUNT_ID:group/default/all-users',
'Actions': [
'quicksight:DescribeDataSet',
'quicksight:DescribeDataSetPermissions',
'quicksight:PassDataSet',
'quicksight:DescribeIngestion',
'quicksight:ListIngestions'
]
}
],
I believe the same can be done for individual users, with an ARN resource of user/default/user.name instead of group/default/all-users.
For data sources, the set of permissions that I use is
'Actions': [
'quicksight:DescribeDataSource',
'quicksight:DescribeDataSourcePermissions',
'quicksight:UpdateDataSource',
'quicksight:UpdateDataSourcePermissions',
'quicksight:DeleteDataSource',
'quicksight:PassDataSource'
]
I am trying the Agora Cloud Recording API to record into an AWS S3 bucket. The calls appear to go through fine. When I stop the recording, I get a success message; I have reproduced part of it here:
{
insertId: "5d66423d00012ad9d6d02f2b"
labels: {
clone_id:
"00c61b117c803f45c35dbd46759dc85f8607177c3234b870987ba6be86fec0380c162a"
}
textPayload: "Stop cloud recording success. FileList :
01ce51a4a640ecrrrrhxabd9e9d823f08_tdeaon_20197121758.m3u8, uploading
status: backuped"
timestamp: "2019-08-28T08:58:37.076505Z"
}
It shows the upload status 'backuped'. As per the Agora documentation, this means the files were uploaded to Agora's cloud backup; within 5 minutes they are then supposed to be uploaded to my AWS S3 bucket.
I am not seeing this file in my AWS bucket. I have tested the bucket's secret key; the same key works fine for another application. I have also verified the CORS settings.
Please suggest how I could debug further.
Make sure you are entering your S3 credentials correctly within the storageConfig settings:
"storageConfig":{
"vendor":{{StorageVendor}},
"region":{{StorageRegion}},
"bucket":"{{Bucket}}",
"accessKey":"{{AccessKey}}",
"secretKey":"{{SecretKey}}"
}
Agora offers a Postman collection to make testing easier: https://documenter.getpostman.com/view/6319646/SVSLr9AM?version=latest
I had faced this issue due to a wrong uid.
The recording uid needs to be a random id that will be used for joining by the recording client, which 'resides' in the cloud. I had passed my main client's uid instead.
Two other causes I had faced:
S3 credentials
S3 CORS settings: go to the S3 bucket's Permissions tab and set the allowed CORS headers.
EDIT:
It could be something like this on the S3 side:
[
{
"AllowedHeaders": [
"Authorization",
"*"
],
"AllowedMethods": [
"HEAD",
"POST"
],
"AllowedOrigins": [
"*"
],
"ExposeHeaders": [
"ETag",
"x-amz-meta-custom-header",
"x-amz-storage-class"
],
"MaxAgeSeconds": 5000
}
]
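If you would rather set this from code than from the console, a rough (untested) sketch with aws-sdk-go-v2 could look like the following; the bucket name is a placeholder:
// Untested sketch: apply the CORS rules above programmatically instead of via the console.
package main

import (
	"context"
	"log"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/s3"
	"github.com/aws/aws-sdk-go-v2/service/s3/types"
)

func main() {
	ctx := context.Background()
	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		log.Fatal(err)
	}
	client := s3.NewFromConfig(cfg)

	_, err = client.PutBucketCors(ctx, &s3.PutBucketCorsInput{
		Bucket: aws.String("my-bucket"), // placeholder bucket name
		CORSConfiguration: &types.CORSConfiguration{
			CORSRules: []types.CORSRule{{
				AllowedHeaders: []string{"Authorization", "*"},
				AllowedMethods: []string{"HEAD", "POST"},
				AllowedOrigins: []string{"*"},
				ExposeHeaders:  []string{"ETag"},
			}},
		},
	})
	if err != nil {
		log.Fatal(err)
	}
}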
I have a Terraform configuration targeting deployment on AWS. It applies beautifully when using an IAM user that has permission to do anything (i.e. {actions: ["*"], resources: ["*"]}).
In pursuit of automating the application of this Terraform configuration, I want to determine the minimum set of permissions necessary to apply the configuration initially and effect subsequent changes. I specifically want to avoid giving overbroad permissions in policy, e.g. {actions: ["s3:*"], resources: ["*"]}.
So far, I'm simply running terraform apply until an error occurs. I look at the output, or at the Terraform log output, to see which API call failed, and then add it to the deployment user's policy. EC2 and S3 are particularly frustrating because the action names don't necessarily align with the API method names. I'm several hours into this with no easy way to tell how far along I am.
Is there a more efficient way to do this?
It'd be really nice if Terraform advised me which permissions/actions I need, but that's a product enhancement best left to HashiCorp.
Here is another approach, similar to what was said above, but without getting into CloudTrail:
Give full permissions to your IAM user.
Run TF_LOG=trace terraform apply --auto-approve &> log.log
Run cat log.log | grep "DEBUG: Request"
You will get a list of all AWS Actions used.
While I still believe that such a super-strict policy will be a continuous pain and will likely kill productivity (though that might depend on the project), there is now a tool for this.
iamlive uses the Client Side Monitoring feature of the AWS SDK to create a minimal policy based on the executed API calls. As Terraform uses the AWS SDK, this works here as well.
In contrast to my previous (and accepted) answer, iamlive should even get the actual IAM actions right, which do not necessarily match the API calls 1:1 (and which is what CloudTrail would log).
For this to work with Terraform, you should run export AWS_CSM_ENABLED=true
An efficient way I followed:
The way I deal with this is to allow all permissions (*) for the services Terraform needs first, then deny some of them if they are not required.
For example:
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "AllowSpecifics",
"Action": [
"ec2:*",
"rds:*",
"s3:*",
"sns:*",
"sqs:*",
"iam:*",
"elasticloadbalancing:*",
"autoscaling:*",
"cloudwatch:*",
"cloudfront:*",
"route53:*",
"ecr:*",
"logs:*",
"ecs:*",
"application-autoscaling:*",
"logs:*",
"events:*",
"elasticache:*",
"es:*",
"kms:*",
"dynamodb:*"
],
"Effect": "Allow",
"Resource": "*"
},
{
"Sid": "DenySpecifics",
"Action": [
"iam:*User*",
"iam:*Login*",
"iam:*Group*",
"iam:*Provider*",
"aws-portal:*",
"budgets:*",
"config:*",
"directconnect:*",
"aws-marketplace:*",
"aws-marketplace-management:*",
"ec2:*ReservedInstances*"
],
"Effect": "Deny",
"Resource": "*"
}
]
}
You can easily adjust the list in the Deny section if Terraform doesn't need, or your company doesn't use, some of these AWS services.
EDIT Feb 2021: there is a better way using iamlive and client side monitoring. Please see my other answer.
As I guess there's no perfect solution, treat this answer a bit as the result of my brainstorming. At least for the initial permission setup, I could imagine the following:
Allow everything first and then process the CloudTrail logs to see which API calls were made in a terraform apply / destroy cycle.
Afterwards, you update the IAM policy to include exactly these calls.
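For example, a small throwaway program can list the unique API calls from such an export; a rough sketch, assuming a local event_history.json file in CloudTrail's usual Records format:
// Untested sketch: print the unique eventName values from a CloudTrail export.
// Assumes a local file event_history.json shaped like {"Records":[{"eventName":"..."}]}.
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"os"
	"sort"
)

type trail struct {
	Records []struct {
		EventName string `json:"eventName"`
	} `json:"Records"`
}

func main() {
	data, err := os.ReadFile("event_history.json")
	if err != nil {
		log.Fatal(err)
	}
	var t trail
	if err := json.Unmarshal(data, &t); err != nil {
		log.Fatal(err)
	}
	seen := map[string]bool{}
	for _, r := range t.Records {
		seen[r.EventName] = true
	}
	names := make([]string, 0, len(seen))
	for n := range seen {
		names = append(names, n)
	}
	sort.Strings(names)
	for _, n := range names {
		fmt.Println(n)
	}
}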
Tracking of minimum permissions is now provided by AWS itself: https://aws.amazon.com/blogs/security/iam-access-analyzer-makes-it-easier-to-implement-least-privilege-permissions-by-generating-iam-policies-based-on-access-activity/.
If you want to be picky about the minimum-viable-permission principle, you could use CloudFormation StackSets to deploy different roles with minimum permissions, so Terraform can assume them on each module call via different providers. For example, if you have a module that deploys ASGs, LBs and EC2 instances, then:
include those actions in a role where the workload lives
add a Terraform AWS provider block that assumes that role
use that provider block within the module call.
The burden is managing possibly quite a few Terraform roles, but as I said, it can be worth it if you want to be picky or have customer requirements to shrink down the Terraform user's permissions.
You could also download the CloudTrail event history for the last X days (up to 90) and run the following:
cat event_history.json <(echo "]}") | jq '[.Records[] | .eventName] | unique'
The echo is needed because the file gets truncated (for an unknown reason) when it is downloaded from the CloudTrail page, as you can see below:
> jsonlint event_history.json
Error: Parse error on line 1:
...iam.amazonaws.com"}}
-----------------------^
Expecting ',', ']', got 'EOF'
at Object.parseError (/usr/local/Cellar/jsonlint/1.6.0/libexec/lib/node_modules/jsonlint/lib/jsonlint.js:55:11)
at Object.parse (/usr/local/Cellar/jsonlint/1.6.0/libexec/lib/node_modules/jsonlint/lib/jsonlint.js:132:22)
at parse (/usr/local/Cellar/jsonlint/1.6.0/libexec/lib/node_modules/jsonlint/lib/cli.js:82:14)
at main (/usr/local/Cellar/jsonlint/1.6.0/libexec/lib/node_modules/jsonlint/lib/cli.js:136:14)
at Object.<anonymous> (/usr/local/Cellar/jsonlint/1.6.0/libexec/lib/node_modules/jsonlint/lib/cli.js:178:1)
at Module._compile (node:internal/modules/cjs/loader:1097:14)
at Object.Module._extensions..js (node:internal/modules/cjs/loader:1149:10)
at Module.load (node:internal/modules/cjs/loader:975:32)
at Function.Module._load (node:internal/modules/cjs/loader:822:12)
at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:77:12)
As an addition to any of the TF_LOG=trace / iamlive / CloudTrail approaches suggested before, please also note that to capture the complete set of actions required to manage a configuration (create/update/delete resources), one would actually need to apply three configurations:
The original one, to capture the actions required to create resources.
A mutated one, with as many resource arguments changed as possible, to capture the actions required to update resources in place.
An empty one (applied last), or terraform destroy, to capture the actions required to delete resources.
While configurations 1 and 3 are commonly considered, configuration 2 is sometimes overlooked, and it can be a tedious one to prepare. Without it, Terraform will fail on changes that modify resources in place instead of deleting and recreating them.
Here is an extension on AvnerSo's answer:
cat log.log | ack -o "(?<=DEBUG: Request )[^ ]*" | sort -u
This command outputs every unique AWS request that Terraform has logged.
The "(?<=DEBUG: Request )[^ ]*" pattern performs a negative lookahead to find the first word after the match.
The -o flag only shows the match in the output.
sort -u selects the unique values from the list and sorts them.
Another option in addition to the previous answers is:
Give broad permissions ("s3:*", ...) as explained earlier.
Check the Access Advisor tab in the AWS console for the permissions actually used, and then trim down your policy accordingly.