How do I use athena workgroups to restrict access of a user to a particular database?
For example, I have a user "readonly" who should not be able to run SELECT queries on the default database. Is this possible?
The way to restrict users from querying tables is to use IAM permissions. The permissions model in Athena is unfortunately more complicated than in an isolated data warehouse or RDBMS, since Athena is a part of a larger ecosystem that also includes S3 and Glue.
There is no specific permission for running SELECT. You can control whether users are allowed to run queries at all through the athena:StartQueryExecution action, but you can't control what kind of queries they run.
Instead you need to think in terms of access to data, and access to the catalog.
To restrict reading, you restrict the user's access to the data on S3. Even if a user is allowed to run a SELECT query, they will get an error if they don't have the s3:GetObject and s3:ListBucket permissions for the objects under the table's prefix.
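As a minimal sketch, here is an inline IAM policy that blocks the data behind the default database, assuming (hypothetically) that its tables live under s3://my-data-lake/default/ and that the user is called readonly:

```python
import json
import boto3

iam = boto3.client("iam")

# Hypothetical bucket and prefix: adjust them to wherever the tables of
# the database you want to lock down are actually stored.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyReadOfDefaultDbData",
            "Effect": "Deny",  # an explicit Deny always wins over any Allow
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::my-data-lake/default/*",
        }
    ],
}

iam.put_user_policy(
    UserName="readonly",
    PolicyName="deny-default-db-data",
    PolicyDocument=json.dumps(policy),
)
```

With this in place, a SELECT against any table stored under that prefix fails with an access denied error, even though the query itself is allowed to start.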
You can also restrict a user's access to the catalog objects, i.e. the databases and tables, but that does not restrict their access to the data itself. Think of it more as a restriction on creating, updating, and dropping databases and tables. Even if you restrict which databases and tables a user can see in the catalog, if they have permission to read the data they can read it directly from S3, skipping Athena entirely.
You can find the documentation on how to control access to catalog objects here: https://docs.aws.amazon.com/athena/latest/ug/fine-grained-access-to-glue-resources.html
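As a rough sketch of what such a policy could look like (region, account ID, and database name are hypothetical), denying the Glue metadata calls Athena makes when it resolves tables in the default database:

```python
# Attach this the same way as the S3 policy above (iam.put_user_policy).
glue_catalog_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyDefaultDbCatalogMetadata",
            "Effect": "Deny",
            "Action": [
                "glue:GetDatabase",
                "glue:GetTable",
                "glue:GetTables",
                "glue:GetPartition",
                "glue:GetPartitions",
            ],
            "Resource": [
                "arn:aws:glue:us-east-1:111122223333:database/default",
                "arn:aws:glue:us-east-1:111122223333:table/default/*",
            ],
        }
    ],
}
```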
Workgroups in Athena can't be used to control access to data, nor to the catalog.
Related
There are some users in the Redshift data warehouse who have read and edit permissions. What query should I run to remove their edit permissions so that they can only run SELECT queries?
The SQL command you are looking for is REVOKE - https://docs.aws.amazon.com/redshift/latest/dg/r_REVOKE.html
Now, to get the exact command(s) you want to run, a few more pieces of information will be needed. But first a quick overview of Redshift permissions. In many databases, Redshift included, users don't (in general) carry permissions themselves; the objects carry access information about which users/groups can perform which actions. So there is no 'make this user read-only always' command. REVOKE can act on databases, schemas, tables, and views, but these actions only apply to currently existing objects. Objects created in the future will have the access rights assigned by the creating user's default ACL.
Now to the questions: are all these users part of a group, and are they the only members of that group? If so, you will likely want to apply the REVOKE to the group. Is this restriction only for existing tables, or do you also want to stop them from creating new tables (even temp tables) in any schema or database? This affects which object types you need to revoke rights on. Have you or your DBAs changed the default ACLs on the database? These may need to be updated to prevent write access being granted on future objects in the database.
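Purely as an illustration, here is a sketch of the kind of statements involved, assuming (hypothetically) that the users are the only members of a group called analysts, the objects live in the public schema, and the database is called mydb; it runs the SQL through psycopg2, since Redshift speaks the PostgreSQL wire protocol:

```python
import psycopg2

# Hypothetical connection details, group, schema, and database names.
conn = psycopg2.connect(
    host="my-cluster.abc123.us-east-1.redshift.amazonaws.com",
    port=5439,
    dbname="mydb",
    user="admin",
    password="REPLACE_ME",
)
conn.autocommit = True
cur = conn.cursor()

# Strip write rights on everything that already exists in the schema,
# leaving SELECT untouched.
cur.execute("REVOKE INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public FROM GROUP analysts;")

# Stop the group from creating new objects (including temp tables).
cur.execute("REVOKE CREATE ON SCHEMA public FROM GROUP analysts;")
cur.execute("REVOKE CREATE, TEMP ON DATABASE mydb FROM GROUP analysts;")

# Default ACLs apply per creating user, so repeat with FOR USER <creator>
# as needed to stop future tables from handing write access back.
cur.execute(
    "ALTER DEFAULT PRIVILEGES IN SCHEMA public "
    "REVOKE INSERT, UPDATE, DELETE ON TABLES FROM GROUP analysts;"
)

cur.close()
conn.close()
```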
Is it possible to create a project in BigQuery to store data and another to query the data? If yes, what rights should be given to the project querying the data to access the data stored by the other project?
The idea would be to have better control of costs.
Yes you can do that!
You have to give the roles/bigquery.dataViewer role (at least) to the account that will be querying the data. What that account is depends on the use case. If you are going to query from the BigQuery UI you have to give such permissions to the email account with which you log in to the GCP console, but you can also give them to particular users or service accounts for programmatic access.
Here you have the documentation referring to BQ permissions and how to grant them.
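As a sketch of the dataset-level grant with the BigQuery Python client, using hypothetical project, dataset, and service-account names:

```python
from google.cloud import bigquery

# "data-project" holds the dataset; the querying project uses the
# service account below to run its jobs. All names are made up.
client = bigquery.Client(project="data-project")
dataset = client.get_dataset("data-project.analytics")

entries = list(dataset.access_entries)
entries.append(
    bigquery.AccessEntry(
        role="READER",  # dataset-level equivalent of roles/bigquery.dataViewer
        entity_type="userByEmail",
        entity_id="query-runner@query-project.iam.gserviceaccount.com",
    )
)
dataset.access_entries = entries
client.update_dataset(dataset, ["access_entries"])
```

Whichever account runs the query also needs permission to create query jobs (e.g. roles/bigquery.jobUser) in the project the job runs in, and that project is the one billed for the query, which is what gives you the cost separation.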
Problem: I have a project in BigQuery where all my data is stored. Within this project I created multiple datasets containing different views. Now I want to use different service accounts to query the different datasets containing different views via grafana (if that matters). These users should only be able to query the views (and therefore a specific dataset) meant for them.
What I tried: I granted BigQuery User, Viewer or Editor permissions (I tried all of them) at the dataset level (and also BigQuery Metadata Viewer at the project level). When I query a view, I receive the error:
User does not have bigquery.jobs.create permission in project xy.
Questions: It is not clear to me whether granting the bigquery.jobs.create permission at the project level will allow the user to query all datasets instead of only the one I want them to access.
Is there any way to allow the user to create jobs only on a single dataset?
Update October 2021
I've just seen that this question went unanswered back then but still gets a lot of views. I believe the possibilities have changed a bit since I asked the question, so here is how I'm handling it now:
I give the respective service account the role roles/bigquery.jobUser at the project level. This allows it to create jobs in general; however, since I don't grant any other permissions yet, it cannot query any data.
Then I give the role roles/bigquery.dataViewer at the dataset level. That makes it possible for the service account to query only the dataset I granted the permission on.
It is also possible to grant roles/bigquery.dataViewer at the table level, which will restrict access to only that specific table.
In case you want the service account not only to query (view) the data, but also to insert or change it for example, replace roles/bigquery.dataViewer with a role that has the necessary permissions (or assign that role in addition).
How to grant the permissions:
On dataset level
On table or view level
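For the table/view-level case, a sketch with the BigQuery Python client (all names are hypothetical; the service account is assumed to already have roles/bigquery.jobUser on the project):

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project")

# Hypothetical Grafana service account and view.
member = "serviceAccount:grafana-sales@my-project.iam.gserviceaccount.com"
view_ref = "my-project.sales_views.daily_revenue"

# Grant roles/bigquery.dataViewer on just this one view.
policy = client.get_iam_policy(view_ref)
policy.bindings.append(
    {"role": "roles/bigquery.dataViewer", "members": {member}}
)
client.set_iam_policy(view_ref, policy)
```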
We had the same problem; we solved it by creating a custom role and assigning the custom role to the particular dataset.
You can grant bigquery.user role to a specific dataset as indicated in this guide. The bigquery.user role contains the bigquery.jobs.create permission as well as other basic permissions related to querying datasets. You can check the full list of permissions for this role in this list.
As suggested above, you can also create custom roles having only the exact permissions you want by following this piece of documentation.
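If you go the custom-role route, here is a rough sketch using the IAM REST API via the Google API discovery client; the role ID, permission list, and project name are illustrative assumptions, not a recommendation:

```python
from googleapiclient import discovery

# Uses Application Default Credentials.
iam_service = discovery.build("iam", "v1")

iam_service.projects().roles().create(
    parent="projects/my-project",
    body={
        "roleId": "restrictedBqQuerier",
        "role": {
            "title": "Restricted BigQuery querier",
            "description": "Can run query jobs and read data it has been granted.",
            "includedPermissions": [
                "bigquery.jobs.create",
                "bigquery.datasets.get",
                "bigquery.tables.get",
                "bigquery.tables.getData",
            ],
            "stage": "GA",
        },
    },
).execute()
```

The custom role can then be assigned at the dataset level, just like the predefined roles above.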
I need to hide some fields of a table from a certain group of users.
I thought about creating a view that allows me to mask those fields. However, once the permissions are set to grant access only to the view, the queries fail because the user also needs access to the table that is queried under the view.
Is there a way (or condition) that allows me to grant access to the view but deny access to the table used in the view?
Column-level permissions will be supported as part of AWS Lake Formation. It is still in preview, but you can request access.
As Lake Formation is only available in a specific region while in beta, for now you need to duplicate the data, or wait.
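Once you do have Lake Formation available, the column-level grant would look roughly like this with boto3 (database, table, column, and role names are made up for the example):

```python
import boto3

lf = boto3.client("lakeformation")

# Grants SELECT on only the columns this group of users is allowed to
# see; everything else in the table stays hidden from them.
lf.grant_permissions(
    Principal={
        "DataLakePrincipalIdentifier": "arn:aws:iam::111122223333:role/masked-readers"
    },
    Resource={
        "TableWithColumns": {
            "DatabaseName": "sales",
            "Name": "orders",
            "ColumnNames": ["order_id", "order_date", "total"],
        }
    },
    Permissions=["SELECT"],
)
```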
Is there any way of controlling access to DynamoDB data based on data in other tables? By way of comparison, Firebase Realtime Database rules have access to a snapshot of the entire database when being evaluated, so rules like this are possible:
".write": "root.child('allow_writes').val() === true"
But all my reading of the AWS permissions structure hasn't given me any clue how to achieve the same thing. There are variables that can be tested based on the current authenticated user, and some variables based on the current request, but no way I can see of referencing other data within the database.
AWS doesn't support this case; your only option would be to put the access control in your application.
You can control table-, item-, or attribute-level data access in DynamoDB using IAM policy variables. Frustratingly, AWS doesn't even seem to publish a list of available policy variables. Typically it boils down to using the Cognito sub or AWS userid, which the majority of people don't want to use as partition keys in their tables.
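For completeness, here is a sketch of what is expressible: the documented pattern of keying access off the caller's Cognito identity rather than off data in other tables. Table, attribute, and role names are hypothetical:

```python
import json
import boto3

iam = boto3.client("iam")

# Items are only visible to the Cognito identity whose id matches the
# partition key, and only two attributes can be read.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["dynamodb:GetItem", "dynamodb:Query"],
            "Resource": "arn:aws:dynamodb:us-east-1:111122223333:table/UserData",
            "Condition": {
                "ForAllValues:StringEquals": {
                    "dynamodb:LeadingKeys": ["${cognito-identity.amazonaws.com:sub}"],
                    "dynamodb:Attributes": ["user_id", "profile"],
                },
                "StringEqualsIfExists": {"dynamodb:Select": "SPECIFIC_ATTRIBUTES"},
            },
        }
    ],
}

iam.put_role_policy(
    RoleName="cognito-authenticated-users",
    PolicyName="per-user-dynamodb-access",
    PolicyDocument=json.dumps(policy),
)
```

Note that the conditions can only reference the request and the caller's identity, not the contents of another table, which is exactly the limitation the question runs into.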