Scheduled query failure in GCP with 'The caller does not have permission' error - google-cloud-platform

I created a Python script similar to the [BQ tutorial on scheduled queries][1]. The service account has been set using os.environ. When executing with BigQuery Admin and other similar roles (Data User, Data Transfer Agent, Data Viewer, etc.), the scheduled query creation fails with
status = StatusCode.PERMISSION_DENIED
details = "The caller does not have permission"
The lowest permission level it accepts is Project Owner. As this is a service account, I was hoping a lower permission level could be applied, e.g. BigQuery Admin, since all I need the service account to do is remotely create scheduled queries. Even the how-to guide says it should work. Can anyone suggest any other combination of permissions that will allow this to work?
[1]: https://cloud.google.com/bigquery/docs/scheduling-queries#set_up_scheduled_queries
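For context, here is a minimal sketch of the kind of creation call my script makes, following the tutorial's sample (the project, dataset, and query below are placeholders, not my real values):

from google.cloud import bigquery_datatransfer

transfer_client = bigquery_datatransfer.DataTransferServiceClient()

# Placeholder project; credentials come from GOOGLE_APPLICATION_CREDENTIALS.
parent = transfer_client.common_project_path("your-project-id")

transfer_config = bigquery_datatransfer.TransferConfig(
    destination_dataset_id="your_dataset",
    display_name="Example scheduled query",
    data_source_id="scheduled_query",
    params={
        "query": "SELECT CURRENT_TIMESTAMP() AS ts",
        "destination_table_name_template": "scheduled_results",
        "write_disposition": "WRITE_TRUNCATE",
    },
    schedule="every 24 hours",
)

# This is the call that fails with PERMISSION_DENIED for the service account.
transfer_config = transfer_client.create_transfer_config(
    parent=parent,
    transfer_config=transfer_config,
)
print("Created scheduled query:", transfer_config.name)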

Related

GCP logging: Find all resources (recently) used by a specific user

This is part of my journey to get a clear overview of which users/service accounts are in my GCP project and when they last logged in.
End goal: to be able to clean up users/service accounts that haven't been active on GCP for a long time.
First question:
How can I find in the logs when a specific user used resources, so I can determine when this person last logged in?
You need the audit logs, and to see them you can run the following query in Cloud Logging:
protoPayload.@type="type.googleapis.com/google.cloud.audit.AuditLog"
protoPayload.authenticationInfo.principalEmail="your_user_name_email_or_your_service_account_email"
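If you want to run the same lookup from code rather than the console, a minimal sketch with the google-cloud-logging client library could look like this (my addition; the project ID and email are placeholders):

from google.cloud import logging

client = logging.Client(project="your-project-id")

# Same filter as above, expressed as a string for the API.
log_filter = (
    'protoPayload.@type="type.googleapis.com/google.cloud.audit.AuditLog" '
    'AND protoPayload.authenticationInfo.principalEmail="user@example.com"'
)

# Newest first, so the first entry returned is the user's most recent activity.
for entry in client.list_entries(filter_=log_filter, order_by=logging.DESCENDING):
    print(entry.timestamp, entry.log_name)
    break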
You can also check the Activity logs and filter on a user:
https://console.cloud.google.com/home/activity
Related questions + answers:
Pull "last access" information on projects from Google Cloud Platform (GCP)
IAM users and last login date in google cloud
How to list, find, or search iam policies across services (APIs), resource types, and projects in google cloud platform (GCP)?
There is also the newly added Log Analytics, which allows you to use SQL to query your logs.
Your logging buckets _Default and _Required need to be upgraded to be able to use Log Analytics:
https://cloud.google.com/logging/docs/buckets#upgrade-bucket
After that you can, for example, use the console to run SQL on your logs:
https://console.cloud.google.com/logs/analytics
Unfortunately, at the moment you can only query the logs that were created after you've switched on Log Analytics.
Example query in the Log Analytics:
SELECT
  timestamp,
  proto_payload.audit_log.authentication_info.principal_email,
  auth_info.resource,
  auth_info.permission,
  auth_info.granted
FROM
  `logs__Default_US._AllLogs`
LEFT JOIN
  UNNEST(proto_payload.audit_log.authorization_info) AS auth_info
WHERE
  timestamp > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
  AND proto_payload.type = "type.googleapis.com/google.cloud.audit.AuditLog"
  AND proto_payload.audit_log.authentication_info.principal_email IN ("name_of_your_user")
ORDER BY
  timestamp

How to show and change user in Scheduled Queries

Some of the scheduled queries in Google Cloud Platform suddenly don't run anymore, with the message "Access Denied: ... User does not have bigquery.tables.get permission for table..."
First, is it possible to see under which user the scheduled query is running?
Second, is it possible to change the user?
Thanks, Silvan
I always use service accounts for command-line execution...
If you can use the bq CLI, look at --service_account and --service_account_credential_file.
If you still want to use the scheduled query, there is some documentation on the service account at https://cloud.google.com/bigquery/docs/scheduling-queries (per above)
This can also be done (for a normal non-service account user) via the console as per the instructions at: https://cloud.google.com/bigquery/docs/scheduling-queries#update_scheduled_query_credentials
"To refresh the existing credentials on a scheduled query:
Find and view the status of a scheduled query.
Click the MORE button and select Update credentials."
Although this thread is two years old, it is still relevant, so I will guide you through troubleshooting this issue below:
Cause:
This issue happens when the user running the query does not have the required permissions, for example because the permissions were removed or the scheduled query's user was updated.
Step 1 - Check which user is running the query:
Head to GCP - BigQuery - Scheduled Queries.
On the scheduled queries screen, click the display name of the query that needs to be checked and head to Configuration. There you will find the user that currently runs the query.
Step 2 - Understand the permissions needed to run the query:
As specified on Google Cloud's website, you need three permissions:
bigquery.transfers.update and, on the dataset, bigquery.datasets.get and bigquery.datasets.update
Step 3 - Check the running user's permissions:
From the GCP menu, head to IAM & Admin - IAM.
There you will find the permissions assigned to different users. Verify the permissions possessed by the user running the query.
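As a programmatic complement (my addition, not part of the original answer): if the scheduled query runs under your own credentials, you can check the project-level permission from Step 2 with the Resource Manager API. Note this only tests the caller's own permissions, and the dataset-level permissions must be checked on the dataset itself; the project ID is a placeholder:

from google.cloud import resourcemanager_v3

client = resourcemanager_v3.ProjectsClient()

# The response echoes back only the subset of permissions the caller holds.
response = client.test_iam_permissions(
    resource="projects/your-project-id",
    permissions=["bigquery.transfers.update"],
)
print("Granted:", list(response.permissions))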
Now we can solve this issue in two different ways:
Step 4 - Edit the current user's roles, or update the scheduler's credentials with an email that has the required permissions:
Option 1: Edit the current user's roles: On the IAM screen you can click "Edit principal" next to a user to add, remove, or update roles (remember to add a role that complies with the permissions required in Step 2).
Option 2: Update credentials (as @coderintherye suggested in another answer): Head to GCP - BigQuery - Scheduled Queries, select the query you want to troubleshoot, head to MORE (in the top-right corner of the screen) - Update credentials - and finally choose a mail. WARNING: That mail will now be the user that runs the query, so make sure it has the permissions needed as mentioned in Step 2.
To change a scheduled query from a user to a service account, you need to:
Make sure that the service account is from the same project as the project where you are running your scheduled query.
Both you as a user and the service account should have the appropriate permissions:
https://cloud.google.com/bigquery/docs/scheduling-queries#required_permissions
You can run a command from the CLI or Python code to make the change from user to service account:
CLI:
bq update \
  --transfer_config \
  --update_credentials \
  --service_account_name=abcdef-test-sa@abcdef-test.iam.gserviceaccount.com \
  projects/862514312345/locations/us/transferConfigs/5dd12f12-0000-122f-bc38-089e0820fe38
Python:
from google.cloud import bigquery_datatransfer
from google.protobuf import field_mask_pb2

transfer_client = bigquery_datatransfer.DataTransferServiceClient()

service_account_name = "email address of your service account"
transfer_config_name = "projects/SOME_NUMBER/locations/EUROPE_OR_US/transferConfigs/A_LONG_ALPHANUMERIC_ID"

# Reference the existing scheduled query by its resource name.
transfer_config = bigquery_datatransfer.TransferConfig(name=transfer_config_name)

# The field mask restricts the update to the credentials only.
transfer_config = transfer_client.update_transfer_config(
    {
        "transfer_config": transfer_config,
        "update_mask": field_mask_pb2.FieldMask(paths=["service_account_name"]),
        "service_account_name": service_account_name,
    }
)

print("Updated config: '{}'".format(transfer_config.name))
See also here for code examples:
https://cloud.google.com/bigquery/docs/scheduling-queries#update_scheduled_query_credentials
bq update --transfer_config --update_credentials --service_account_name=<service_account> <resource_name>
service_account = the service account ID that you wish to use as a credential.
resource_name = the resource name of the scheduled query, which you can see in the configuration section of the scheduled query's detail page.

How can I grant individual permissions in Google Cloud Platform for BigQuery users using python

I need to set up very fine-grained access control for user accounts in GCP using a Python script.
I know that via the UI/gcloud util I can give it the role roles/bigquery.user, but it has a lot of other permissions I don't want this service account to have.
How can I grant individual permissions via Python scripts?
Go to your BigQuery console, click the arrow at the right of a dataset, and then click Share dataset.
Then add the e-mail of the user there.
You can choose one of the 3 available roles: Viewer/Owner/Editor.
Do this in every dataset for every user.
Update: how to do it via a Python script
You can do it with a Python script following this small tutorial.
The code will be something like:
from google.cloud import bigquery

client = bigquery.Client()
dataset = client.get_dataset(client.dataset('dataset1'))

# Grant dataset-level READER access to a single user.
entry = bigquery.AccessEntry(
    role='READER',
    entity_type='userByEmail',
    entity_id='user1@example.com')

assert entry not in dataset.access_entries

entries = list(dataset.access_entries)
entries.append(entry)
dataset.access_entries = entries

dataset = client.update_dataset(dataset, ['access_entries'])  # API request

assert entry in dataset.access_entries

Spotify spark-bigquery connector issue while using BigQuerySelect method from Dataproc cluster

I am new to BigQuery on GCP, and to access BigQuery data we are using the Spotify spark-bigquery connector as provided here.
We are able to use sqlContext.bigQueryTable("project_id:dataset.table") and it is working.
When we use sqlContext.bigQuerySelect("SELECT * FROM [project_id:dataset.table]") it gives the error:
The user xyz-compute@developer.gserviceaccount.com does not have permission to query table.
We have done the necessary settings w.r.t. the JSON key file and location, but don't have any clue where it is taking these user account details from.
Please provide help regarding its cause and how to fix it in code.
This error indicates that the service account you are using (xyz-compute@developer.gserviceaccount.com) doesn't have enough IAM permissions. Go to your IAM settings and make sure it has at least the BigQuery Data Viewer role.
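If you prefer to grant that role from a script rather than the console, a minimal sketch with the Resource Manager client could look like this (my illustration; the project ID is a placeholder, and note that this read-modify-write overwrites any concurrent policy edits):

from google.cloud import resourcemanager_v3
from google.iam.v1 import policy_pb2

client = resourcemanager_v3.ProjectsClient()
resource = "projects/your-project-id"

# Read-modify-write: fetch the current policy, append a binding, write it back.
policy = client.get_iam_policy(request={"resource": resource})
policy.bindings.append(
    policy_pb2.Binding(
        role="roles/bigquery.dataViewer",
        members=["serviceAccount:xyz-compute@developer.gserviceaccount.com"],
    )
)
client.set_iam_policy(request={"resource": resource, "policy": policy})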

User creation timestamp in Amazon Redshift

We have an audit, and for that purpose I need a user report with the user creation timestamp of all Redshift users.
Reference link: How to get user creation timestamp in Amazon Redshift
My prod cluster has around 98 users. However, the system table 'stl_userlog' with action='create' does not return any records.
It'd be really great if I could get a workaround for this. Thanks in advance.
Audit logging is not enabled by default in Amazon Redshift.
For the user activity log, you must enable the enable_user_activity_logging database parameter. If you enable only the audit logging feature, but not the associated parameter, the database audit logs will log information only for the connection log and user log, not for the user activity log. The enable_user_activity_logging parameter is disabled (false) by default, but you can set it to true to enable the user activity log. Refer to the link below for more information.
GOTO: Default Parameter Values
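For reference, a minimal sketch of enabling that parameter with boto3 (my addition, not from the original answer; the parameter group name is a placeholder and must be a custom group attached to your cluster, since the default group cannot be modified):

import boto3

redshift = boto3.client("redshift")

# Depending on the parameter, the change may require a cluster reboot to apply.
redshift.modify_cluster_parameter_group(
    ParameterGroupName="my-cluster-parameter-group",
    Parameters=[
        {
            "ParameterName": "enable_user_activity_logging",
            "ParameterValue": "true",
        }
    ],
)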
If I’ve made a bad assumption please comment and I’ll refocus my answer.