GCP logging: Find all resources (recently) used by a specific user - google-cloud-platform

This is part of my journey to get a clear overview of which users/service accounts are in my GCP project and when they last logged in.
End goal: to be able to clean up users/service accounts that haven't used GCP in a long time.
First question:
How can I find in the logs when a specific user used resources, so I can determine when this person last logged in?

You need the Audit Logs; to see them, you can run the following query in Cloud Logging:
protoPayload.@type="type.googleapis.com/google.cloud.audit.AuditLog"
protoPayload.authenticationInfo.principalEmail="your_user_name_email_or_your_service_account_email"
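If you prefer the command line, the same filter can, for example, be run with gcloud (a sketch; the project ID and time window are placeholders, the e-mail is the same placeholder as above):
gcloud logging read \
'
protoPayload.@type="type.googleapis.com/google.cloud.audit.AuditLog"
protoPayload.authenticationInfo.principalEmail="your_user_name_email_or_your_service_account_email"
' \
--project=YOUR_PROJECT_ID --freshness=30d --limit=100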
You can also check the Activity logs and filter on a user:
https://console.cloud.google.com/home/activity
Related questions + answers:
Pull "last access" information on projects from Google Cloud Platform (GCP)
IAM users and last login date in google cloud
How to list, find, or search iam policies across services (APIs), resource types, and projects in google cloud platform (GCP)?

There is now also Log Analytics, which allows you to use SQL to query your logs.
Your logging buckets _Default and _Required need to be upgraded to be able to use Log Analytics:
https://cloud.google.com/logging/docs/buckets#upgrade-bucket
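If you prefer the CLI, the upgrade can, as far as I know, also be done with gcloud (shown here for the _Default bucket in the global location; adjust to your setup):
gcloud logging buckets update _Default \
  --location=global \
  --enable-analytics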
After that you can, for example, use the console to run SQL on your logs:
https://console.cloud.google.com/logs/analytics
Unfortunately, at the moment you can only query the logs that were created after you've switched on Log Analytics.
Example query in the Log Analytics:
SELECT
  timestamp,
  proto_payload.audit_log.authentication_info.principal_email,
  auth_info.resource,
  auth_info.permission,
  auth_info.granted
FROM
  `logs__Default_US._AllLogs`
  LEFT JOIN UNNEST(proto_payload.audit_log.authorization_info) AS auth_info
WHERE
  timestamp > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
  AND proto_payload.type = "type.googleapis.com/google.cloud.audit.AuditLog"
  AND proto_payload.audit_log.authentication_info.principal_email IN ("name_of_your_user")
ORDER BY
  timestamp

Related

Schedule query failure in GCP with 'The caller does not have permission' error

So I created a Python script similar to the [BQ tutorial on scheduled queries][1]. The service account has been set using os.environ. When executing with BigQuery Admin and other similar permissions (Data User, Data Transfer Agent, Data Viewer, etc.), the scheduled query creation fails with
status = StatusCode.PERMISSION_DENIED
details = "The caller does not have permission"
The lowest permission level it accepts is 'Project Owner'. As this is a service account, I was hoping a lower permission level could be applied, e.g. BigQuery Admin, as all I need the service account to do is remotely create scheduled queries. Even the how-to guide says it should work. Can anyone provide some input on whether there is any other combination of permissions that will allow this to work?
[1]: https://cloud.google.com/bigquery/docs/scheduling-queries#set_up_scheduled_queries

Pull "last access" information on projects from Google Cloud Platform (GCP)

I have a large number of projects to handle on Google Cloud Platform. To clean them up, I want to pull a list of all projects including information on usage, so I can filter and identify e.g. outdated projects.
Especially information on "last access" would help a lot. I haven't found a way yet to pull a datetime value giving me the last "data access" or "configuration" activity.
Any idea on how to perform such a query? Even alternative ways of determining recent activity within projects would help. Most used resources are BigQuery, ComputeEngine, Buckets.
Thanks!
You can achieve this through either Audit Logs or Cloud Asset Inventory (the latter is better for your case).
You will have the ability to view activity at the project level or at the folder/organization level.
Edit: including the Cloud Asset Inventory query that @nordlicht.22 created to solve the issue:
gcloud asset search-all-resources \
  --scope='projects/{ProjectID}' \
  --query='updateTime > 1643155200' \
  --order-by='createTime DESC' \
  --limit='1'
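Since the question is about a large number of projects, the same search can presumably also be scoped to a folder or organization instead of a single project, for example (the organization ID is a placeholder):
gcloud asset search-all-resources \
  --scope='organizations/{OrganizationID}' \
  --query='updateTime > 1643155200' \
  --order-by='updateTime DESC' \
  --limit='10'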

Log Buckets from Google

Is it possible to download a Log Storage (log bucket) from Google Cloud Platform, specifically the one created by default? If someone knows how, could they explain how to do it?
A possible solution is to select the required logs, restrict them to a time period of one day, and then download them in JSON or CSV format.
Step 1 - From the Logging console, go to advanced filtering mode.
Step 2 - Choose the log type with a filtering query, for example:
resource.type="audited_resource"
logName="projects/xxxxxxxx/logs/cloudaudit.googleapis.com%2Fdata_access"
resource.type="audited_resource"
logName="organizations/xxxxxxxx/logs/cloudaudit.googleapis.com%2Fpolicy"
Step 3 - Download the results in JSON or CSV format.
If you generate a huge number of audit logs per day, the above will not work out. In that case you need to export the logs to Cloud Storage or BigQuery for further analysis. Please note that Cloud Logging doesn't charge for exporting logs, but destination charges might apply.
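Such an export is configured through a log sink; a minimal sketch for a Cloud Storage destination (the sink name, bucket, and filter are placeholders) could look like this:
gcloud logging sinks create my-audit-sink \
  storage.googleapis.com/my-audit-log-bucket \
  --log-filter='logName:"cloudaudit.googleapis.com"'
After creating the sink, grant its writer identity write access to the destination bucket, otherwise nothing gets exported.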
Another option: you can use the following gcloud command to download the logs.
gcloud logging read "logName : projects/Your_Project/logs/cloudaudit.googleapis.com%2Factivity" --project=Project_ID --freshness=1d >> test.txt

How to show and change user in Scheduled Queries

Some of the scheduled queries in Google Cloud Platform suddenly don't run anymore, with the message "Access Denied: ... User does not have bigquery.tables.get permission for table..."
First, is it possible to see under which user the scheduled query is running?
Second, is it possible to change the user?
Thanks, Silvan
I always use service accounts for command line execution...
If you can use the bq CLI, look at --service_account and --service_account_credential_file.
If you still want to use the scheduled query, there is some documentation on using a service account at https://cloud.google.com/bigquery/docs/scheduling-queries (per above).
This can also be done (for a normal non-service account user) via the console as per the instructions at: https://cloud.google.com/bigquery/docs/scheduling-queries#update_scheduled_query_credentials
"To refresh the existing credentials on a scheduled query:
Find and view the status of a scheduled query.
Click the MORE button and select Update credentials."
Although this thread is 2 years old, it is still relevant, so I will guide you through troubleshooting this issue below.
Cause:
This issue happens when the user running the query does not have the required permissions. This could have been caused by a removal or change of the permissions of the scheduled query's user.
Step 1 - Checking which user is running the query:
Head to GCP - BigQuery - Scheduled Queries
Once on the scheduled queries screen, click on the display name of the query that needs to be checked and head to Configuration. There you will find the user that currently runs the query.
Step 2 - Understanding the permissions that are needed for running the query:
As specified on Google Cloud's website, you need 3 permissions:
bigquery.transfers.update and, on the dataset, bigquery.datasets.get and bigquery.datasets.update
Step 3 - Check running user's permissions:
From the GCP menu head to IAM & Admin - IAM
There you will find the permissions assigned to different users. Verify the permissions possessed by the user running the query.
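If you prefer the command line, the roles bound to a specific user can, for example, be listed with gcloud (the project ID and e-mail are placeholders):
gcloud projects get-iam-policy YOUR_PROJECT_ID \
  --flatten='bindings[].members' \
  --format='table(bindings.role)' \
  --filter='bindings.members:user:someone@example.com'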
Now we can solve this issue in 2 different ways:
Step 4 - Edit current user's roles or update the scheduler's credentials with an email that has the required permissions:
Option 1: Edit the current user's roles: on the IAM screen you can click "Edit principal" next to a user to add, remove, or update roles (remember to add a role that complies with the permissions required in Step 2; a CLI sketch for this follows after these options).
Option 2: Update the credentials (as @coderintherye suggested in another answer): head to GCP - BigQuery - Scheduled Queries, select the query you want to troubleshoot, head to MORE (in the top-right corner of the screen) - Update credentials, and finally choose a mail. WARNING: that mail will now be the user that runs the query, so make sure it has the permissions needed as mentioned in Step 2.
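As mentioned in Option 1, a role can also be granted from the CLI. A sketch using a role that contains the permissions from Step 2 (roles/bigquery.admin; the project ID and e-mail are placeholders):
gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
  --member='user:someone@example.com' \
  --role='roles/bigquery.admin'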
To change a scheduled query from a user to a service account, you need to:
Make sure that the service account is from the same project as the project where you are running your scheduled query.
Both you as a user and the service account should have the appropriate permissions:
https://cloud.google.com/bigquery/docs/scheduling-queries#required_permissions
You can then run a command from the CLI or Python code to make the change from user to service account:
CLI:
bq update \
  --transfer_config \
  --update_credentials \
  --service_account_name=abcdef-test-sa@abcdef-test.iam.gserviceaccount.com \
  projects/862514312345/locations/us/transferConfigs/5dd12f12-0000-122f-bc38-089e0820fe38
Python:
from google.cloud import bigquery_datatransfer
from google.protobuf import field_mask_pb2

transfer_client = bigquery_datatransfer.DataTransferServiceClient()

# The service account that should run the scheduled query from now on
service_account_name = "email address of your service account"
# Resource name of the scheduled query (transfer config) to update
transfer_config_name = "projects/SOME_NUMBER/locations/EUROPE_OR_US/transferConfigs/A_LONG_ALPHANUMERIC_ID"

transfer_config = bigquery_datatransfer.TransferConfig(name=transfer_config_name)

# Only the service_account_name field is changed, as declared in the field mask
transfer_config = transfer_client.update_transfer_config(
    {
        "transfer_config": transfer_config,
        "update_mask": field_mask_pb2.FieldMask(paths=["service_account_name"]),
        "service_account_name": service_account_name,
    }
)
print("Updated config: '{}'".format(transfer_config.name))
See also here for code examples:
https://cloud.google.com/bigquery/docs/scheduling-queries#update_scheduled_query_credentials
bq update --transfer_config --update_credentials --service_account_name=<service_account> <resource_name>
<service_account> = the service account ID that you wish to use as the credential.
<resource_name> = the resource name of the scheduled query, which you can see in the configuration section of the scheduled query's detail page.
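If you don't know the resource name, the scheduled queries in a location can, as far as I know, also be listed from the CLI; the output includes each transfer config's full resource name (the location is a placeholder):
bq ls --transfer_config --transfer_location=us --format=prettyjson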

Get the BigQuery Table creator and Google Storage Bucket Creator Details

I am trying to identify the users who created tables in BigQuery.
Is there any command line or API that would provide this information? I know that the audit logs do provide this information, but I was looking for a command line tool that could do the job, so that I could wrap it in a shell script and run it against all the tables at one time. The same goes for Google Storage buckets. I did try
gsutil iam get gs://my-bkt and looked for the "role": "roles/storage.admin" entry, but I do not find the admin role on all buckets. Any help?
This is a use case for audit logs. BigQuery tables don't report metadata about the original resource creator, so scanning via tables.list or inspecting the ACLs doesn't really expose who created the resource, only who currently has access.
What's the use case? You could certainly export the audit logs back into BigQuery and query for table creation events going forward, but that's not exactly the same.
You can find this out using Audit Logs. You can access them either via the Console/Logs Explorer or using the gcloud tool from the CLI.
The log filter that you're interested in is this one:
resource.type = ("bigquery_project" OR "bigquery_dataset")
logName="projects/YOUR_PROJECT/logs/cloudaudit.googleapis.com%2Factivity"
protoPayload.methodName = "google.cloud.bigquery.v2.TableService.InsertTable"
protoPayload.resourceName = "projects/YOUR_PROJECT/datasets/curb_tracking/tables/YOUR_TABLE"
If you want to run it from the command line, you'd do something like this:
gcloud logging read \
'
resource.type = ("bigquery_project" OR "bigquery_dataset")
logName="projects/YOUR_PROJECT/logs/cloudaudit.googleapis.com%2Factivity"
protoPayload.methodName = "google.cloud.bigquery.v2.TableService.InsertTable"
protoPayload.resourceName = "projects/YOUR_PROJECT/datasets/curb_tracking/tables/YOUR_TABLE"
'\
--limit 10
You can then post-process the output to find out who created the table. Look for the principalEmail field.
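For the Cloud Storage part of the question, a similar filter on the activity audit log should work; as far as I know, the relevant method name for bucket creation is storage.buckets.create (project and bucket names are placeholders):
gcloud logging read \
'
resource.type="gcs_bucket"
logName="projects/YOUR_PROJECT/logs/cloudaudit.googleapis.com%2Factivity"
protoPayload.methodName="storage.buckets.create"
protoPayload.resourceName="projects/_/buckets/YOUR_BUCKET"
' \
--limit 10
Again, the principalEmail field in the output shows who created the bucket.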