I am using GCS to store images for my Android application.
I searched the Google Cloud Platform console but haven't found network usage or anything that shows how many people uploaded/downloaded how many files/bytes of data. They explain how the price is calculated, using network and Class A/B operations, but I can't find a place to track this data myself.
You have to export these logs to BigQuery; you can't view them directly in the GCP console interface.
Storage logs are generated once a day and contain the storage usage for the previous day. They are typically created before 10:00 am PST.
Usage logs are generated hourly when there is activity to report in the monitored bucket. Usage logs are typically created 15 minutes after the end of the hour.
https://cloud.google.com/storage/docs/access-logs
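As a rough sketch of the first step, enabling usage logging on a bucket can be done with the Python storage client (the bucket and prefix names below are hypothetical); the hourly/daily CSVs it produces can then be loaded into BigQuery for analysis:

```python
from google.cloud import storage

client = storage.Client()

# Source bucket whose traffic you want to measure (hypothetical name).
bucket = client.get_bucket("my-app-images")

# Write hourly usage CSVs and daily storage CSVs into a separate log bucket.
# The log bucket must already exist and grant write access to the
# cloud-storage-analytics@google.com group (see the access-logs doc above).
bucket.enable_logging("my-usage-logs-bucket", object_prefix="my-app-images")
bucket.patch()  # persist the logging configuration
```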
Related
Google Cloud Build allows us to store logs in GCS, in Cloud Logging, in both, or in neither. I just want to know what difference it makes to store logs in GCS versus Cloud Logging. Some things you could highlight are:
Advantages and disadvantages, or typical scenarios when I should use one over the other.
Differences in pricing: which one may cost more than the other.
They are two different places with different features.
Cloud Logging buckets allow you to
Store data for different retention periods
Query the data from Cloud Logging (and soon via SQL expression)
Sink the logs from different projects in a single place (I wrote an article on that)
Use Cloud Monitoring / Cloud Alerting features
Cloud Storage sink allows you to
Store data for different retention periods, with different storage class costs
Sink the logs from different projects in a single bucket
Move, copy, and manage your files as you wish.
Ingest the log files into a third-party tool (like Splunk)
IMO, the main difference is the query capability and the Cloud Monitoring and Cloud Alerting integration, which are things you lose when you store logs in Cloud Storage.
However, you gain file management with Cloud Storage.
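To illustrate that query difference, here is a minimal sketch (the project, bucket, and build ID are hypothetical) of fetching the same build's logs from Cloud Logging versus from a GCS logs bucket:

```python
from google.cloud import logging as cloud_logging
from google.cloud import storage

project_id = "my-project"        # hypothetical project
build_id = "1234-abcd"           # hypothetical build ID
logs_bucket = "my-build-logs"    # hypothetical GCS logs bucket

# Option 1: query Cloud Logging with a filter
# (this is the capability you lose with GCS-only logs).
log_client = cloud_logging.Client(project=project_id)
log_filter = f'resource.type="build" AND resource.labels.build_id="{build_id}"'
for entry in log_client.list_entries(filter_=log_filter):
    print(entry.payload)

# Option 2: download the flat log file Cloud Build writes to the bucket
# (Cloud Build names it log-<BUILD_ID>.txt).
storage_client = storage.Client(project=project_id)
blob = storage_client.bucket(logs_bucket).blob(f"log-{build_id}.txt")
print(blob.download_as_text())
```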
The main difference is retention (i.e. how long the logs are kept before being removed).
Build logs have default retention periods (some are configurable, some are not); you can read more in this doc.
If you store Build logs in GCS you have more flexibility in terms of how long you want to retain them.
Cost-wise, it will depend on your build configs: the more logs your builds generate, the more you will pay, so this really depends on your specific build config.
It's quite a broad question and the answer depends on many factors. As @boredabdel mentioned, one of the differences is the retention period.
Another difference is that Google Cloud Storage costs vary depending on whether you use a Region, Dual-Region, or Multi-Region location and on the storage class. It's good practice to check estimated costs using the Google Cloud Pricing Calculator. You would also need the proper Cloud Storage permissions: if you use the default Google-created Cloud Storage bucket you need the Project Viewer role, but for a user-specified Cloud Storage bucket you need the Storage Object Viewer role.
In addition, you can also store build artifacts in GCS, as mentioned in the Storing build artifacts documentation.
In short, when choosing between Cloud Logging and Cloud Storage it mainly depends on how long you want to keep those logs (as mentioned by @boredabdel), which permissions you have or need, and how important those logs are.
Storing and managing build logs
Viewing build results
Configure and manage log buckets
I'm using GCP Stackdriver custom metrics and have created a few dashboard graphs to show the traffic on the system. The problem is that the graphing system keeps the data only for a few weeks, not forever.
From the Stackdriver documentation:
See Quotas and limits for limits on the number of custom metrics and
the number of active time series, and for the data retention period.
If you wish to keep your metric data beyond the retention period, you
must manually copy the data to another location, such as Cloud Storage
or BigQuery.
Say we decide to use Cloud Storage as the container for storing the data long-term.
Questions:
How does this "manual data copy" work? Do I just write the same data into two places (Google Cloud Storage and Stackdriver)?
How does Stackdriver connect to that storage and generate graphs from it?
You can use Stackdriver's Logs Export feature to export your logs into any of three sinks: Google Cloud Storage, BigQuery, or a Pub/Sub topic. Here are the instructions on how to export Stackdriver logs. You are not writing logs in two places in real time; you are exporting logs based on the filters you set.
One thing to keep in mind is that you will not be able to use Stackdriver graphs or alerting tools with the exported logs.
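For context, a minimal sketch of creating such an export sink with the Python Cloud Logging client (the project, sink name, filter, and destination bucket are hypothetical):

```python
from google.cloud import logging as cloud_logging

client = cloud_logging.Client(project="my-project")  # hypothetical project

# Destination can be a GCS bucket, a BigQuery dataset, or a Pub/Sub topic,
# e.g. "bigquery.googleapis.com/projects/my-project/datasets/exported_logs".
sink = client.sink(
    "my-export-sink",                      # hypothetical sink name
    filter_='resource.type="gcs_bucket"',  # only entries matching this filter are exported
    destination="storage.googleapis.com/my-log-archive-bucket",
)
sink.create()  # the sink's writer identity must be granted access to the destination
```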
In addition, if you export logs into BigQuery, you can plug in a Data Studio graph to see your metrics.
You can also do this with a Cloud Storage export, but it's less immediate and less convenient.
I suggest this guide on creating a pipeline to export metrics to BigQuery for long-term storage and analytics:
https://cloud.google.com/solutions/stackdriver-monitoring-metric-export
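Regarding the "manual copy" part: here is a minimal sketch of reading a custom metric's recent time series with the Monitoring client, which you could then write to BigQuery or Cloud Storage yourself (the project and metric names are hypothetical):

```python
import time
from google.cloud import monitoring_v3

project_id = "my-project"  # hypothetical project
client = monitoring_v3.MetricServiceClient()

# Read the last hour of points for a custom metric.
now = int(time.time())
interval = monitoring_v3.TimeInterval(
    {"start_time": {"seconds": now - 3600}, "end_time": {"seconds": now}}
)
series = client.list_time_series(
    request={
        "name": f"projects/{project_id}",
        "filter": 'metric.type = "custom.googleapis.com/my_traffic_metric"',  # hypothetical metric
        "interval": interval,
        "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
    }
)
for ts in series:
    for point in ts.points:
        # Each point would then be appended to a BigQuery table or a GCS object.
        print(point.interval.end_time, point.value.double_value)
```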
I tried to find out how to get the node-hour usage of my Google Cloud ML Prediction API, but didn't find anything. Is there a way to know the usage other than looking at the bills?
Here is the API Documentation I referred to.
The documentation page you referenced is part of the Cloud Machine Learning Engine API documentation:
An API to enable creating and using machine learning models.
That API is for using the product itself; it doesn't contain billing information for the product.
For billing info you want to look at Cloud Billing and its API:
With this API, you can do any of the following.
For billing accounts:
...
Get billing information for a project.
However, from just a quick glance at the docs (I haven't used it yet), the API itself doesn't appear to directly provide the particular info you're looking for. Possible ways to get that info appear to be:
using Billing Reports:
The Cloud Billing Reports page lets you view your Google Cloud
Platform (GCP) usage costs at a glance and discover and analyze
trends. The Reports page displays a chart that plots usage costs for
all projects linked to a billing account. To help you view the cost
trends that are important to you, you can select a data range, specify
a time range, configure the chart filters, and group by project,
product, or SKU.
Billing reports can help you answer questions like these:
How is my current month's GCP spending trending?
Export Billing Data to a File:
To access a detailed breakdown of your charges, you can export your
daily usage and cost estimates automatically to a CSV or JSON file
stored in a Google Cloud Storage bucket you specify. You can then
access the data via the Cloud Storage API, CLI tool, or Google Cloud
Platform Console.
Export Billing Data to BigQuery:
Billing export to BigQuery enables you to export your daily usage and
cost estimates automatically throughout the day to a BigQuery dataset
you specify. You can then access your billing data from BigQuery. You
can also use this export method to export data to a JSON file.
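If you go the BigQuery export route, here is a hedged sketch of summing usage and cost per SKU for the ML Engine service; the dataset and table names are placeholders for whatever your billing export actually created:

```python
from google.cloud import bigquery

client = bigquery.Client()

# The table name is whatever your billing export created,
# e.g. `my-project.billing.gcp_billing_export_v1_XXXXXX` (placeholder here).
query = """
SELECT
  sku.description AS sku,
  SUM(usage.amount) AS usage_amount,
  SUM(cost) AS total_cost
FROM `my-project.billing.gcp_billing_export_v1_XXXXXX`
WHERE service.description LIKE '%Machine Learning%'
GROUP BY sku
ORDER BY total_cost DESC
"""
for row in client.query(query):
    print(row.sku, row.usage_amount, row.total_cost)
```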
I've stored analytics in a BigQuery dataset for over 1.5 years now, and have hooked up Data Studio and other tools to analyse the data. However, I very rarely look at this data. Now I logged in to check it, and it's just completely gone. There is no trace of the dataset, and no audit log anywhere showing what happened. I've tracked down when it disappeared via the billing history, and it seems it was mysteriously deleted in November last year.
My question to the community is: Is there any hope that I can find out what happened? I'm thinking audit logs etc. Does BigQuery have any table-level logging? For how long does GCP store these things? I understand the data is probably deleted since it was last seen so long ago, I'm just trying to understand if we were hacked in some way.
I mean, ~1 TB of data can't just disappear without leaving any traces?
Usually, Cloud Audit Logging is used for this.
Cloud Audit Logging maintains two audit logs for each project and organization: Admin Activity and Data Access. Google Cloud Platform services write audit log entries to these logs to help you answer the questions of "who did what, where, and when?" within your Google Cloud Platform projects.
Admin Activity logs contain log entries for API calls or other administrative actions that modify the configuration or metadata of resources. They are always enabled. There is no charge for your Admin Activity audit logs
Data Access audit logs record API calls that create, modify, or read user-provided data. To view the logs, you must have the IAM roles Logging/Private Logs Viewer or Project/Owner. ... BigQuery Data Access logs are enabled by default and cannot be disabled. They do not count against your logs allotment and cannot result in extra logs charges.
The problem for you is the retention period for Data Access logs: 30 days (Premium Tier) or 7 days (Basic Tier). Of course, for longer retention, you can export audit log entries and keep them for as long as you wish. So if you did not do this, those entries are lost and your only option, I think, is to contact Support.
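If the deletion had happened within the retention window, a sketch of searching the audit logs for a dataset delete call might look like this (the project name is hypothetical, and the exact methodName can differ between the legacy and newer BigQuery audit log formats):

```python
from google.cloud import logging as cloud_logging

client = cloud_logging.Client(project="my-project")  # hypothetical project

# Look for BigQuery dataset deletions in the audit logs.
# (methodName differs between legacy "datasetservice.delete" entries and the
# newer BigQueryAuditMetadata format, so the filter may need adjusting.)
log_filter = (
    'resource.type="bigquery_resource" '
    'AND protoPayload.methodName="datasetservice.delete" '
    'AND timestamp >= "2018-11-01T00:00:00Z"'
)
for entry in client.list_entries(filter_=log_filter):
    # Audit entries carry the caller identity and the deleted resource in their payload.
    print(entry.timestamp, entry.payload)
```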
My current Google Cloud billing dashboard shows me the charges for this month, for all my Google Services: Storage, BigQuery, etc.
How do I get this information programmatically? The Cloud Billing API only gives me billing details, but that is not what I want.
I also tried looking in Google Cloud Storage for the billing file they periodically update, but it only contains information from two days ago.
What should I do?
If you enable 'Billing Export' in the 'Billing' tab, a new JSON or CSV file will be exported every day into Google Cloud Storage. It will have that day's usage and cost information. I have had files there since the beginning of my project.
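A small sketch of pulling those daily export files programmatically with the Python storage client; the bucket name and report prefix are whatever you configured in Billing Export (the ones below are hypothetical):

```python
import json
from google.cloud import storage

client = storage.Client()
bucket = client.bucket("my-billing-export-bucket")  # hypothetical bucket from Billing Export

# One JSON (or CSV) file is written per day under the prefix you configured.
for blob in bucket.list_blobs(prefix="my-billing-report-"):  # hypothetical report prefix
    print(blob.name)
    records = json.loads(blob.download_as_text())  # use csv.reader instead for CSV exports
```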