I've been storing analytics in a BigQuery dataset for over 1.5 years now, and have hooked up Data Studio and other tools to analyse the data. However, I very rarely look at this data. Now I logged in to check it, and it's just completely gone. There is no trace of the dataset, and no audit log anywhere showing what happened. I've tracked down when it disappeared via the billing history, and it seems that it was mysteriously deleted in November last year.
My question to the community is: Is there any hope that I can find out what happened? I'm thinking audit logs etc. Does BigQuery have any table-level logging? For how long does GCP store these things? I understand the data itself is probably gone for good since it was deleted so long ago; I'm just trying to understand whether we were hacked in some way.
I mean, ~1 TB of data can't just disappear without leaving any traces?
Usually, Cloud Audit Logging is used for this:
Cloud Audit Logging maintains two audit logs for each project and organization: Admin Activity and Data Access. Google Cloud Platform services write audit log entries to these logs to help you answer the questions of "who did what, where, and when?" within your Google Cloud Platform projects.
Admin Activity logs contain log entries for API calls or other administrative actions that modify the configuration or metadata of resources. They are always enabled. There is no charge for your Admin Activity audit logs.
Data Access audit logs record API calls that create, modify, or read user-provided data. To view the logs, you must have the IAM roles Logging/Private Logs Viewer or Project/Owner. ... BigQuery Data Access logs are enabled by default and cannot be disabled. They do not count against your logs allotment and cannot result in extra logs charges.
The problem for you is the retention period for Data Access logs: 30 days (Premium Tier) or 7 days (Basic Tier). Of course, for longer retention, you can export audit log entries and keep them for as long as you wish. So if you did not do this, you have lost these entries, and your only option is to contact Support, I think.
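If you do set up an export going forward, a minimal sketch with the google-cloud-logging Python client might look like this; it routes BigQuery audit entries into a BigQuery dataset so they outlive the default retention window (the project, sink, and dataset names are illustrative assumptions):

```python
# Minimal sketch, assuming the google-cloud-logging client library
# (pip install google-cloud-logging). Project, sink, and dataset names
# below are illustrative assumptions, not real resources.
from google.cloud import logging

client = logging.Client(project="my-analytics-project")

# Route BigQuery audit log entries to a BigQuery dataset so they outlive
# the default Data Access retention window.
sink = client.sink(
    "bigquery-audit-export",
    filter_='resource.type="bigquery_resource" logName:"cloudaudit.googleapis.com"',
    destination=(
        "bigquery.googleapis.com/projects/my-analytics-project/datasets/audit_logs"
    ),
)

if not sink.exists():
    sink.create()
    print("Sink created:", sink.name)
```

Note that after creating the sink you still have to grant its writer identity write access on the destination dataset, otherwise no entries will arrive.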
Related
For a monitoring project I'm using the Logs Router to send log data to a BigQuery table, so I can then query the BigQuery table from Cloud Functions. Would it be possible to directly query Log Explorer from Cloud Functions? (i.e. not having to replicate my logs to BigQuery?)
Thanks
Yes, of course you can. You even have client libraries for that. However, keep in mind that, by default, your logs are kept for only 30 days. That could be enough, or not, depending on your use case.
You can create a custom log bucket with a different retention period, or sink the logs into BigQuery.
The main advantage of BigQuery is the capacity to join the log data with other data in BigQuery, to perform powerful analytics computations. But it still depends on your use case.
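As a rough illustration of the "query the logs directly" option, here is a minimal sketch of an HTTP Cloud Function (Python runtime) using the google-cloud-logging client library; the function name, project ID, and filter are assumptions you would adapt:

```python
# Minimal sketch of an HTTP Cloud Function (Python runtime) that queries
# Cloud Logging directly instead of a BigQuery copy of the logs. Assumes
# google-cloud-logging is in requirements.txt; the project ID and filter
# are illustrative assumptions.
from google.cloud import logging

def read_recent_errors(request):
    client = logging.Client(project="my-monitoring-project")
    entries = client.list_entries(
        filter_='severity>=ERROR timestamp>="2024-01-01T00:00:00Z"',
        order_by=logging.DESCENDING,
        max_results=50,
    )
    lines = [f"{entry.timestamp} {entry.log_name}" for entry in entries]
    return "\n".join(lines)
```

Keep in mind the function will only see entries still inside the log bucket's retention period, which is the 30-day caveat above.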
Context: As a new intern at a firm, one of my responsibilities is to maintain a clean and ordered QuickSight Analysis and Datasets list.
There are a lot of existing analysis reports and dashboards on the firm's Amazon QuickSight account, dating back several years. There is a concern about deleting the old reports and their supporting datasets, which take up a lot of SPICE storage, because of the thought that someone might still be using or accessing them. Is there a way to see the stats of each report: how many people accessed it, how many times it was used over the last month, etc., which could help decide which analysis reports/datasets can be deleted? Please help.
This AWS blog post -- Using administrative dashboards for a centralized view of Amazon QuickSight objects -- discusses how BI administrators can use a QuickSight dashboard, Lambda functions, and other AWS services to create a centralized view of groups, users, and object access permission information, and to audit abnormal access.
It is mainly security focused, but it gives you the idea of how to find the relevant information about access to QuickSight objects in the AWS CloudTrail events.
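As a rough sketch of that idea, assuming boto3 and CloudTrail management events are available in your account, you could count recent QuickSight events per event name to get a feel for which objects are still being touched (the region, the time window, and the interpretation of individual event names are assumptions on my part):

```python
# Minimal sketch, assuming boto3 and CloudTrail enabled in the account.
# Counts recent CloudTrail events coming from QuickSight per event name,
# as a rough proxy for which objects are still being accessed.
from collections import Counter
from datetime import datetime, timedelta, timezone

import boto3

cloudtrail = boto3.client("cloudtrail", region_name="us-east-1")  # adjust region

start = datetime.now(timezone.utc) - timedelta(days=30)  # illustrative window
paginator = cloudtrail.get_paginator("lookup_events")

counts = Counter()
for page in paginator.paginate(
    LookupAttributes=[{"AttributeKey": "EventSource",
                       "AttributeValue": "quicksight.amazonaws.com"}],
    StartTime=start,
):
    for event in page["Events"]:
        counts[event["EventName"]] += 1

for name, count in counts.most_common():
    print(f"{name}: {count}")
```

CloudTrail's LookupEvents API only goes back 90 days, so for longer-term usage stats you would persist these events somewhere, which is essentially what the blog post's pipeline does.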
For the past month, something on my Google Cloud project has been running out of quota and breaking, but I cannot figure out what, nor how to fix it. My project has billing enabled, and the quota page does not show any quotas that are near or past their limits.
Trying to fetch data from GCS throws: FatalError: Expect status [200] from Google Storage. But got status 429.
Looking into it, 429 means "too many requests", but I'm not sure why that's an issue. Isn't Google Cloud supposed to scale and charge me for my requests? The Google Cloud interface is also broken: when I try browsing my files, it just shows a red "Internal error encountered" and doesn't let me download my files from GCS.
It seems to fix itself at midnight when the quotas reset, but again, I don't see any quotas that are near or past their limits, so I'm not sure if this is some secret hidden quota that I can't see and can't pay to raise.
The way I managed to fix this is by:
1. Enabling Cloud Audit Logs (IAM -> Audit Logs -> Google Cloud Storage -> enable all 3)
2. Waiting a day
3. Going to Logging and viewing all storage accesses
It turns out one of my apps was misbehaving and querying GCS thousands of times per day. Fixing that bug resolved the quota issue.
I still have no idea:
What quota limit was being hit (it did not show anywhere)
Why it was hitting a limit instead of charging my billing
Why it broke in such strange ways instead of just telling me
But yes, some secret underlying API was out of quota because one of my apps was accidentally hitting GCS too much.
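For anyone chasing the same thing: once the Data Access logs are on (step 1 above), a minimal sketch like this, using the google-cloud-logging Python client, can surface which caller and method are hammering GCS; the project ID and time window are illustrative assumptions:

```python
# Minimal sketch, assuming Data Access audit logs were enabled for Cloud
# Storage and the google-cloud-logging client library is installed.
# Counts data-access entries per caller and method so the noisy app stands out.
from collections import Counter
from google.cloud import logging

client = logging.Client(project="my-project")

log_filter = (
    'resource.type="gcs_bucket" '
    'logName:"cloudaudit.googleapis.com%2Fdata_access" '
    'timestamp>="2024-05-01T00:00:00Z"'  # adjust to the window you care about
)

counts = Counter()
for entry in client.list_entries(filter_=log_filter):
    payload = entry.payload
    caller = payload.get("authenticationInfo", {}).get("principalEmail", "unknown")
    counts[(caller, payload.get("methodName"))] += 1

for (caller, method), n in counts.most_common(20):
    print(f"{n:6d}  {caller}  {method}")
```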
I have a certain image in my Google Cloud Storage bucket, but it is gone without us knowing how it was deleted. When I checked the activity logs, all I can see is "updated". The way I see it, "updated" will be the status whether the file is deleted or created. So how can I check this? I want to know at least when it was deleted.
In order to see audit logs for Cloud Storage you have to first enable them.
To enable them in the console, go to IAM & Admin -> Audit Logs, and then by selecting Google Cloud Storage you will see, on the right side of the screen under LOG TYPE, the different log types you can enable or disable audit logging for. Please refer to this documentation where the procedure is described.
After the Cloud Storage logs are enabled, you can see all the activity by going in the console to ACTIVITY and, under Activity types, selecting Data access. After this you will be able to see the operations on the bucket (provided you have the right permissions to see them).
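Once the Data Access logs are enabled, a minimal sketch with the google-cloud-logging Python client could list the object deletions for the bucket, including when they happened and who performed them; the project and bucket names are illustrative assumptions, and only deletions after logging was enabled will show up:

```python
# Minimal sketch, assuming Data Access audit logs are already enabled for
# Cloud Storage and google-cloud-logging is installed. Lists object-delete
# entries for one bucket; names are illustrative assumptions.
from google.cloud import logging

client = logging.Client(project="my-project")

log_filter = (
    'resource.type="gcs_bucket" '
    'resource.labels.bucket_name="my-images-bucket" '
    'protoPayload.methodName="storage.objects.delete"'
)

for entry in client.list_entries(filter_=log_filter):
    payload = entry.payload
    print(entry.timestamp,
          payload.get("authenticationInfo", {}).get("principalEmail"),
          payload.get("resourceName"))
```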
I am using GCS to store images for my Android application.
I was searching the Google Cloud Platform console, but haven't found network usage or anything else that will show me how many people uploaded/downloaded how many files/bytes of data. They tell you how they calculate the price, using network and Class A/B operations, but I can't find a place to track this data myself.
You have to export these logs to BigQuery; you can't find them in the GCP interface.
Storage logs are generated once a day and contain the storage usage for the previous day. They are typically created before 10:00 am PST.
Usage logs are generated hourly when there is activity to report in the monitored bucket. Usage logs are typically created 15 minutes after the end of the hour.
https://cloud.google.com/storage/docs/access-logs
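If it helps, here is a minimal sketch, assuming the google-cloud-storage Python client, of turning usage/storage logging on for a bucket; the bucket and project names are illustrative assumptions, and the delivered CSV files can then be loaded into BigQuery for analysis as described in the doc above:

```python
# Minimal sketch, assuming the google-cloud-storage client library
# (pip install google-cloud-storage). Bucket and project names are
# illustrative assumptions.
from google.cloud import storage

client = storage.Client(project="my-project")

log_bucket = client.bucket("my-usage-logs")      # must already exist
monitored = client.get_bucket("my-app-images")   # the bucket whose usage you want to track

# Deliver hourly usage logs and daily storage logs to the log bucket.
monitored.enable_logging(log_bucket.name, object_prefix="usage_log")
monitored.patch()  # persist the new logging configuration on the bucket

print(monitored.get_logging())
```

Per the same doc, you also have to grant Cloud Storage's analytics service account write access on the log bucket before any log objects will be delivered.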