GCS limit bucket access to an existing service account - google-cloud-platform

Usually there is a Compute Engine default service account that is created automatically by GCP. This account is used, for example, by VM agents to access different resources across GCP, and by default it has the roles/editor role.
Suppose I want to create a GCS bucket that can only be accessed by this default service account and no one else. I've looked into ACLs and tried to add an ACL to the bucket with this default service account's email, but it didn't really work.
I realized that I can still access the bucket and its objects from other accounts that have, for example, storage bucket read and storage object read permissions, and I'm not sure what I did wrong (maybe some default ACLs are present?).
My questions are:
Is it possible to limit access to just that default account? In that case who will not be able to access it?
What would be the best way to do it? (would appreciate a lot an example using Storage API)
There are still roles such as roles/storage.admin, and no matter what ACLs are put on the bucket, I could still access it if I had this role (or a higher role such as Owner), right?
Thanks!

I recommend that you not use ACLs (and Google recommends the same). It's better to switch the bucket to a uniform IAM policy (uniform bucket-level access).
There are two downsides of ACLs:
Newly created files don't get the ACL automatically, so you need to set it every time you create a new file.
It's difficult to know who has and who hasn't access with ACLs. The IAM service is better for auditing.
When you switch to uniform IAM access, the Owner, Viewer, and Editor roles no longer have access to the bucket (roles/storage.admin isn't included in these primitive roles). That alone can remove all the unwanted access in one click. Otherwise, as John said, remove all the IAM permissions on the bucket and on the project that grant access to the bucket, except for your service account.
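For example, a minimal sketch of that approach with gsutil (the bucket name and project number are placeholders; the default Compute Engine service account normally has the form PROJECT_NUMBER-compute@developer.gserviceaccount.com):
# Switch the bucket to uniform bucket-level access (disables ACLs)
gsutil ubla set on gs://my-bucket
# Grant only the default service account object access on the bucket
gsutil iam ch serviceAccount:PROJECT_NUMBER-compute@developer.gserviceaccount.com:roles/storage.objectAdmin gs://my-bucket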

You can control access to buckets and objects using Cloud IAM and ACLs.
For example, grant the service account WRITE access (R: READ, W: WRITE, O: OWNER) to the bucket using ACLs:
gsutil acl ch -u service-account@project.iam.gserviceaccount.com:W gs://my-bucket
To remove access of service account from the bucket:
gsutil acl ch -d service-account@project.iam.gserviceaccount.com gs://my-bucket
If there are identities with roles such as roles/storage.admin in IAM at the project level, they will have access to all the GCS resources of the project. You might have to change those permissions to avoid them having access.
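To find and remove such project-level grants, something like the following gcloud commands could be used (the project ID, member, and role here are illustrative):
# List project-level IAM bindings to see who has roles/storage.admin or similar
gcloud projects get-iam-policy my-project
# Remove an unwanted project-level grant
gcloud projects remove-iam-policy-binding my-project \
    --member="user:someone@example.com" \
    --role="roles/storage.admin"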

Related

Limiting S3 bucket access to users within one account

I have an IAM user that has full S3 access (i.e. can perform any S3 actions on any S3 resource within the AWS account). This user has created a bucket and put some files in it. The bucket has a policy which just contains an Allow rule that grants access to a different IAM user, in the same AWS account. Public access is turned off for the bucket.
Should the first user be able to access objects in this bucket? If so, is that because they created the bucket, or because they're in the account that owns the bucket? Is it possible to limit access to a bucket for users within the same AWS account?
S3 is one of the few services with resource policies, in this case they are called bucket policies.
A user in the same account has access to a (S3) resource if
nothing explicitly denies the access AND
either the bucket policy grants access OR the user / entity has a policy attached that grants access
If you wanted to restrict a bucket to a single user / entity you would
need to write a bucket policy with a Deny statement that applies to every user except the target one AND
either add a statement to the bucket policy or a policy attached to the user / entity granting access to the bucket.
The standard doc for understanding policy evaluation logic is the IAM "Policy evaluation logic" page in the AWS documentation. There are other, more complicated ways to achieve your goal, e.g. permissions boundaries and SCPs, but they are probably overkill in your situation.
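As a rough sketch of that idea (not a drop-in policy: the bucket name, account ID, and user name are illustrative, and a broad Deny like this will also lock out admins and the root user unless they are excluded too):
cat > policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowOnlyAlice",
      "Effect": "Allow",
      "Principal": {"AWS": "arn:aws:iam::111122223333:user/alice"},
      "Action": "s3:*",
      "Resource": ["arn:aws:s3:::my-bucket", "arn:aws:s3:::my-bucket/*"]
    },
    {
      "Sid": "DenyEveryoneElse",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": ["arn:aws:s3:::my-bucket", "arn:aws:s3:::my-bucket/*"],
      "Condition": {
        "ArnNotEquals": {"aws:PrincipalArn": "arn:aws:iam::111122223333:user/alice"}
      }
    }
  ]
}
EOF
aws s3api put-bucket-policy --bucket my-bucket --policy file://policy.json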

AWS S3 bucket policy: permissions based on account id prefix

I have a centralized CloudTrail bucket which contains the CloudTrail logs of multiple accounts. Is it possible to write a bucket policy which allows account 123456789112 to only download logs from Awslogs/123456789112, account 456789012345 to only download logs from Awslogs/456789012345, etc.? I don't want to hardcode this for each account since I have a lot of accounts. Is there a way to do this?
AWS IAM policies (and bucket policies) support a few policy variables that you can use as dynamic values, such as aws:SourceIp; however, account ID is not one of them. There is an aws:userid variable, but it's the account ID only for the root user; for other principals like an IAM user/role it is the principal's unique ID. Technically, if you used the AWS root user to access this bucket, you could use the userid variable in the Resource element to achieve what you want, but it is strongly recommended not to use the root user for such everyday tasks (an AWS recommendation).
There are also policy condition keys like aws:PrincipalAccount, but without a relevant policy variable these cannot be used to dynamically compare the requesting account ID with the resource. There is no other IAM feature that could be used to achieve this.
I don't know your exact environment but a few things to consider:
I'd recommend explicitly listing the allowed principal ARNs anyway, because even if you have many accounts, you should allow only specific IAM users/roles to read the bucket to follow the least-privilege principle. Granting access based on account ID would allow all users/roles in that account to read these files, not just specific services (unless this is the objective).
since this is cross-account access (a principal in account A wants to read from a bucket located in account B), you will need to allow this access on both sides, both in the requester's IAM policy and in the target bucket's policy. Just a heads up. There is more info on this in the AWS docs.
I would consider using Terraform to simplify the management of these resources
Hope this helps, let me know if you have more questions!
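For instance, a sketch of a bucket policy that explicitly lists one statement per account (the bucket name, account IDs, and role names are illustrative; in practice you would generate these statements from your account list):
cat > cloudtrail-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Account123456789112ReadsOwnLogs",
      "Effect": "Allow",
      "Principal": {"AWS": "arn:aws:iam::123456789112:role/log-reader"},
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-cloudtrail-bucket/Awslogs/123456789112/*"
    },
    {
      "Sid": "Account456789012345ReadsOwnLogs",
      "Effect": "Allow",
      "Principal": {"AWS": "arn:aws:iam::456789012345:role/log-reader"},
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-cloudtrail-bucket/Awslogs/456789012345/*"
    }
  ]
}
EOF
aws s3api put-bucket-policy --bucket my-cloudtrail-bucket --policy file://cloudtrail-policy.json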

spark s3 access without configuring keys and with only IAM role

I have an HDP cluster on AWS, and I also have one S3 bucket (in another account); my Hadoop version is Hadoop 3.1.1.3.0.1.0-187.
Now I want to read from that S3 bucket (which is in a different account), process the data, then write the result to my S3 bucket (same account as the cluster).
But as the HDP guide here tells, I can configure only one set of keys, for either my account or the other account.
In my case I want to configure keys for two accounts, so how do I do that?
For security reasons, the other account cannot change its bucket policy to add an IAM role which is created in my account, hence I tried to access it like below:
Configured the keys of the other account
Added the IAM role (which has an access policy for my bucket) from my account
But I still got the error below when I tried to write to my account's S3 bucket from Spark:
com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 400, AWS Service: Amazon S3
What you need is to use the EC2 instance profile role. It is an IAM role that is attached to your instance: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use_switch-role-ec2_instance-profiles.html
You first create a role with permissions that allow S3 access. Then you attach that role to your HDP cluster (an EC2 Auto Scaling group and EMR can both do that). No IAM access key configuration is needed on your side, although AWS still does that for you in the background. This is the S3 "outbound" access part.
The 2nd step is to set up the bucket policy to allow cross-account access: https://docs.aws.amazon.com/AmazonS3/latest/dev/example-walkthroughs-managing-access-example2.html
You will need to do this for each bucket in your different accounts. This is basically the "inbound" s3 access permission part.
You will encounter a 400 if any part of your access (i.e., your instance profile role's permissions, the S3 bucket ACL, the bucket policy, the public access block settings, etc.) is denied in the permission chain. There are many more layers on the "inbound" side. So, to start getting things working, if you are not an IAM expert, try to start with a very open policy (use the '*' wildcard) and then narrow things down.
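As a sketch of the "inbound" part, the bucket owner could attach a policy like the following, run from the account that owns the bucket (the account ID, role name, and bucket name are illustrative):
cat > cross-account-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowHdpClusterInstanceProfileRole",
      "Effect": "Allow",
      "Principal": {"AWS": "arn:aws:iam::111111111111:role/hdp-cluster-role"},
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": ["arn:aws:s3:::other-account-bucket", "arn:aws:s3:::other-account-bucket/*"]
    }
  ]
}
EOF
aws s3api put-bucket-policy --bucket other-account-bucket --policy file://cross-account-policy.json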
If I've understood right
you want your EC2 VMs to access an S3 bucket to which the IAM role doesn't have access
you have a set of AWS login details for the external S3 bucket (access key and secret)
HDP3 has a default auth chain of, in order:
per-bucket secrets. fs.s3a.bucket.NAME.access.key, fs.s3a.bucket.NAME.secret.key
config-wide secrets fs.s3a.access.key, fs.s3a.secret.key
env vars AWS_ACCESS_KEY and AWS_SECRET_KEY
the IAM Role (it does an HTTP GET to the 169.254.169.254 metadata server, which serves up a new set of IAM role credentials at least once an hour)
What you need to try here is to set up some per-bucket secrets for only the external source (either in a JCEKS file on all nodes, in core-site.xml, or in the Spark defaults). For example, if the external bucket was s3a://external, you'd have:
spark.hadoop.fs.s3a.bucket.external.access.key AKAISOMETHING
spark.hadoop.fs.s3a.bucket.external.secret.key SECRETSOMETHING
HDP3/Hadoop 3 can handle more than one secret in the same JCEKS file without problems (HADOOP-14507). Older versions let you put username:secret in the URI, but that's such a security troublespot (everything logs those URIs as they aren't viewed as sensitive) that the feature has been cut from Hadoop now. Stick to the JCEKS file with a per-bucket secret, falling back to the IAM role for your own data.
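A sketch of creating such a JCEKS store with the Hadoop credential CLI (the provider path, key values, and namenode address are illustrative):
# Store only the external bucket's keys in a JCEKS file on HDFS
hadoop credential create fs.s3a.bucket.external.access.key \
    -value AKIAEXAMPLE \
    -provider jceks://hdfs@namenode:8020/user/hadoop/external-s3.jceks
hadoop credential create fs.s3a.bucket.external.secret.key \
    -value exampleSecretKey \
    -provider jceks://hdfs@namenode:8020/user/hadoop/external-s3.jceks
# Then point the jobs at the store, e.g. in spark-defaults.conf:
# spark.hadoop.hadoop.security.credential.provider.path jceks://hdfs@namenode:8020/user/hadoop/external-s3.jceks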
Note you can fiddle with the authentication list for ordering and behaviour: if you add the TemporaryAWSCredentialsProvider then it'll support session keys as well, which is often handy.
<property>
  <name>fs.s3a.aws.credentials.provider</name>
  <value>
    org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider,
    org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider,
    com.amazonaws.auth.EnvironmentVariableCredentialsProvider,
    org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider
  </value>
</property>

Aws IAM Roles vs Bucket Policies

I have been reading a number of docs and watched number of videos, but I am still very confused about IAM Roles and Bucket policies. Here is what confuses me:
1) I create a bucket. At that time I can make it public or keep it private. If I make it public, then anyone, or any Application, can "see" the objects in the bucket. I think the permissions can be set to add/delete/get/list objects in the bucket. If this is the case, then why do I ever need to add any IAM Role for S3 buckets, or, add any Bucket policy (???)
2) At the time I create a bucket, can I give very specific permissions to only certain users/applications/EC2 instances etc to all or part of the bucket? e.g. App1 on EC2-X can access subfolder A in bucket B1.
3) Coming to IAM Roles, an EC2 role that gives full S3 access- what does it mean? Full access to any bucket? How can I restrict an app running on an EC2 to only certain buckets, with only certain restricted permissions (see #2) above)? Do all Apps on the EC2 have full access to all buckets? At the time of creating a bucket, can the permissions be so set that an IAM Role can be overruled?
4) Finally, what do Bucket Policies do in addition to the above IAM Roles? e.g. is 'AllowS3FullAccess' a "Bucket Policy" or an "IAM Policy"? Why differentiate between types of policies? Policies are just that: they define some permissions/rules on some objects/resources, as I see it.
Thanks for any clarifications.
- a newcomer to AWS
I think you are confusing permissions for resources with IAM entities.
i) There are resources (S3 bucket, EC2 instances etc.) owned by the AWS account and these resources can be accessed by IAM users, IAM roles or other AWS Services (can be from same or different account)
ii) We manage who can access and their permission level with policies
iii) Policies can be identity based (attached to IAM user/group/role) or resource based (attached to S3 bucket, SNS topic)
iv) Resource based policy will have a Principal element but the identity based policies will not have that (because the attached IAM entity is the Principal)
v) Permissions start from default deny, allow overrides the default deny and an explicit deny overrides any allow
vi) Final access will be determined by combination of all policies
To answer your questions:
1> We cannot add (or attach) an IAM role to an S3 bucket. If you want your bucket to be public (which is not recommended, but is needed to some extent if it's used for a static website), then you can keep it public.
2> It is not possible while creating the bucket. You have to do it after creating the bucket via IAM and/or S3 bucket policy
3> If an IAM role has AmazonS3FullAccess, the role can (Effect: Allow) call any S3 API (s3:*) on any S3 resource (Resource: "*") in your account (cross-account access additionally requires permissions on the other account's side).
If multiple applications run on an instance with an IAM role attached and are using credentials provided by the role, their permission will be same.
4> I don't know where you got the reference AllowS3FullAccess but we cannot confirm unless we know the exact JSON. If it is attached to a bucket or has the Principal element, it is a bucket policy.
You can use IAM and Bucket policies based on your need. Usually bucket policies are used for cross account access or if you want to manage S3 permission policies in a single place.
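To illustrate the difference for question #3, an identity-based policy scoped to just subfolder A of bucket B1 (instead of AmazonS3FullAccess) could look roughly like this, attached inline to the instance's role (the bucket, prefix, role, and policy names are illustrative):
cat > limited-s3-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ListOnlySubfolderA",
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::B1",
      "Condition": {"StringLike": {"s3:prefix": ["A/*"]}}
    },
    {
      "Sid": "ReadWriteOnlySubfolderA",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:::B1/A/*"
    }
  ]
}
EOF
aws iam put-role-policy --role-name EC2-X-role --policy-name limited-s3 \
    --policy-document file://limited-s3-policy.json
Note there is no Principal element here, since the role the policy is attached to is the principal (point iv above).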

AccessDeniedException: 403 Forbidden on GCS using owner account

I have tried to access files in a bucket and I keep getting access denied on the files. I can see them in the GCS console but cannot access them through it, and cannot access them through gsutil either, running the command below.
gsutil cp gs://my-bucket/folder-a/folder-b/mypdf.pdf files/
But all this returns is AccessDeniedException: 403 Forbidden
I can list all the files and such but not actually access them. I've tried adding my user to the ACL but that still had no effect. All the files were uploaded from a VM through a FUSE mount, which worked perfectly, and then all access was just lost.
I've checked these posts but none seem to have a solution that's helped me:
Can't access resource as OWNER despite the fact I'm the owner
gsutil copy returning "AccessDeniedException: 403 Insufficient Permission" from GCE
gsutil cors set command returns 403 AccessDeniedException
Although this is quite an old question, I had a similar issue recently. After trying many options suggested here without success, I carefully re-examined my script and discovered I was getting the error as a result of a mistake in my bucket address gs://my-bucket. I fixed it and it worked perfectly!
This is quite possible. Owning a bucket grants FULL_CONTROL permission to that bucket, which includes the ability to list objects within that bucket. However, bucket permissions do not automatically imply any sort of object permissions, which means that if some other account is uploading objects and sets ACLs to be something like "private," the owner of the bucket won't have access to it (although the bucket owner can delete the object, even if they can't read it, as deleting objects is a bucket permission).
I'm not familiar with the default FUSE settings, but if I had to guess, you're using your project's system account to upload the objects, and they're set to private. That's fine. The easiest way to test that would be to run gsutil from a GCE host, where the default credentials will be the system account. If that works, you could use gsutil to switch the ACLs to something more permissive, like "project-private."
The command to do that would be:
gsutil acl set -R project-private gs://myBucketName/
tl;dr The Owner (basic) role has only a subset of the GCS permissions present in the Storage Admin (predefined) role—notably, Owners cannot access bucket metadata, list/read objects, etc. You would need to grant the Storage Admin (or another, less privileged) role to provide the needed permissions.
NOTE: This explanation applies to GCS buckets using uniform bucket-level access.
In my case, I had enabled uniform bucket-level access on an existing bucket, and found I could no longer list objects, despite being an Owner of its GCP project.
This seemed to contradict how GCP IAM permissions are inherited— organization → folder → project → resource / GCS bucket—since I expected to have Owner access at the bucket level as well.
But as it turns out, the Owner permissions were being inherited as expected; they were simply insufficient for listing GCS objects.
The Storage Admin role has the following permissions which are not present in the Owner role: [1]
storage.buckets.get
storage.buckets.getIamPolicy
storage.buckets.setIamPolicy
storage.buckets.update
storage.multipartUploads.abort
storage.multipartUploads.create
storage.multipartUploads.list
storage.multipartUploads.listParts
storage.objects.create
storage.objects.delete
storage.objects.get
storage.objects.getIamPolicy
storage.objects.list
storage.objects.setIamPolicy
storage.objects.update
This explained the seemingly strange behavior. And indeed, after granting the Storage Admin role (whereby my user was both Owner and Storage Admin), I was able to access the GCS bucket.
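For reference, granting Storage Admin could look like the following (the project ID, member, and bucket name are illustrative):
# Project-wide grant
gcloud projects add-iam-policy-binding my-project \
    --member="user:me@example.com" \
    --role="roles/storage.admin"
# Or scope it to a single bucket
gsutil iam ch user:me@example.com:roles/storage.admin gs://my-bucket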
Footnotes
Though the documentation page Understanding roles omits the list of permissions for Owner (and other basic roles), it's possible to see this information in the GCP console:
Go to "IAM & Admin"
Go to "Roles"
Filter for "Owner"
Go to "Owner"
(See list of permissions)