Google Stackdriver custom metrics - data retention period - google-cloud-platform

I'm using GCP Stackdriver custom metrics and created a few dashboard graphs to show the traffic on the system. The problem is that the system keeps the data for only a few weeks, not forever.
From the Stackdriver documentation:
See Quotas and limits for limits on the number of custom metrics and
the number of active time series, and for the data retention period.
If you wish to keep your metric data beyond the retention period, you
must manually copy the data to another location, such as Cloud Storage
or BigQuery.
Let's say we decide to use Cloud Storage as the container to store the data for the long term.
Questions:
How does this "manual data copy" work? Do I just write the same data to two places (Cloud Storage and Stackdriver)?
How does Stackdriver connect to the storage and generate graphs from it?

You can use Stackdriver's Logs Export feature to export your logs to any of three sinks: Google Cloud Storage, BigQuery, or a Pub/Sub topic. Here are the instructions on how to export Stackdriver logs. You are not writing logs to two places in real time; instead, you export logs based on the filters you set.
One thing to keep in mind: you will not be able to use Stackdriver graphs or alerting tools with the exported logs.
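For example, a sink that sends matching entries to a Cloud Storage bucket can be created with the google-cloud-logging Python client. This is only a minimal sketch; the sink name, bucket, and filter below are placeholders, not values from the question.

from google.cloud import logging

client = logging.Client()

# Destination can be a Cloud Storage bucket, a BigQuery dataset, or a Pub/Sub topic.
destination = "storage.googleapis.com/my-exported-logs-bucket"  # hypothetical bucket

sink = client.sink(
    "my-export-sink",  # hypothetical sink name
    filter_='resource.type="gce_instance" AND severity>=WARNING',  # hypothetical filter
    destination=destination,
)

if not sink.exists():
    sink.create()  # from now on, matching log entries are written to the destination

Note that after the sink is created you still have to grant its writer identity permission to write to the destination bucket or dataset.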

In addition, if you export logs to BigQuery, you can plug in a Data Studio chart to visualize your metrics.
You can also do this with a Cloud Storage export, but it's less immediate and less convenient.
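For illustration, once the logs land in BigQuery you can query the exported table directly (this is the same data a Data Studio chart would sit on top of). A hedged sketch with the google-cloud-bigquery client; the project, dataset, and table names are hypothetical, and the exported tables are created per log name, date-sharded or partitioned depending on the sink options.

from google.cloud import bigquery

client = bigquery.Client()

# Exported log tables expose fields such as timestamp, severity, and textPayload.
query = """
    SELECT timestamp, severity, textPayload
    FROM `my-project.exported_logs.syslog_20240101`
    WHERE severity = 'ERROR'
    ORDER BY timestamp DESC
    LIMIT 100
"""

for row in client.query(query).result():
    print(row.timestamp, row.severity, row.textPayload)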

I suggest this guide on creating a pipeline to export metrics to BigQuery for long-term storage and analytics.
https://cloud.google.com/solutions/stackdriver-monitoring-metric-export
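The core of that pipeline is reading time series from the Cloud Monitoring API and writing the points to BigQuery. A minimal sketch of the read step, assuming a hypothetical project id and custom metric type:

import time
from google.cloud import monitoring_v3

project_id = "my-project"  # hypothetical project
client = monitoring_v3.MetricServiceClient()

# Read the last hour of data for one custom metric.
now = int(time.time())
interval = monitoring_v3.TimeInterval(
    {"end_time": {"seconds": now}, "start_time": {"seconds": now - 3600}}
)

results = client.list_time_series(
    request={
        "name": f"projects/{project_id}",
        "filter": 'metric.type="custom.googleapis.com/my_metric"',  # hypothetical metric
        "interval": interval,
        "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
    }
)

# Each point would then be written to BigQuery (or Cloud Storage) for long-term storage.
for series in results:
    for point in series.points:
        print(point.interval.end_time, point.value)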

Related

Difference between storing Google Cloud Build logs in Cloud Logging vs storing them in GCS

Google Cloud Build lets us store logs in GCS, Cloud Logging, both, or neither. I just want to know what difference it makes to store logs in GCS vs Cloud Logging. Some of the things you could highlight are:
Advantages and disadvantages, or typical scenarios when I should use one over the other.
Differences in pricing, i.e., which one may cost more than the other.
They are 2 different places with different features.
Cloud Logging buckets allow you to:
Store data for different retention periods
Query the data from Cloud Logging (and soon via SQL expressions)
Sink the logs from different projects into a single place (I wrote an article on that)
Use Cloud Monitoring / Cloud Alerting features
A Cloud Storage sink allows you to:
Store data for different retention periods, with different storage class costs
Sink the logs from different projects into a single bucket
Move, copy, and manage your files as you wish
Ingest the log files into a third-party tool (like Splunk)
IMO, the main differences are the query capability and the Cloud Monitoring and Alerting integration, which you lose when you store logs in Cloud Storage.
However, you gain file management with Cloud Storage.
The main difference is retention (aka how long we keep the logs before we remove them).
Build logs have default retention periods (some are configurable, some are not); you can read more in this doc.
If you store Build logs in GCS, you have more flexibility in how long you want to retain them.
Cost-wise, it depends on your build configs: the more logs your builds generate, the more you will pay.
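If the logs go to a user-specified GCS bucket, retention can be handled with a lifecycle rule. A hedged sketch with the google-cloud-storage client; the bucket name and the 365-day value are just examples:

from google.cloud import storage

client = storage.Client()
bucket = client.get_bucket("my-build-logs-bucket")  # hypothetical bucket

# Automatically delete log objects once they are older than 365 days.
bucket.add_lifecycle_delete_rule(age=365)
bucket.patch()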
It's quite a broad question and depends on many factors. As #boredabdel mentioned, one of the differences is the retention period.
Another difference is that Google Cloud Storage costs vary depending on whether you use Regional, Dual-Region, or Multi-Region storage and on the storage class. It's good practice to check estimated costs using the Google Cloud Pricing Calculator. You would also need the proper Cloud Storage permissions: if you use the default Google-created Cloud Storage bucket you need the Project Viewer role, but for a user-specified Cloud Storage bucket you need the Storage Object Viewer role.
In addition, you can also store build artifacts in GCS, as mentioned in the Storing build artifacts documentation.
In short, it mainly depends on how long you want to keep those logs (as mentioned by #boredabdel), which permissions you have or can get, and how important those logs are, when choosing between Cloud Logging and Cloud Storage.
Storing and managing build logs
Viewing build results
Configure and manage log buckets

Can I query Logs Explorer directly from a cloud function?

For a monitoring project I'm using the Logs Router to send log data to a BigQuery table so I can then query the BigQuery table from Cloud Functions. Would it be possible to query Logs Explorer directly from Cloud Functions (i.e., not having to replicate my logs to BigQuery)?
Thanks
Yes, of course you can. You even have client libraries for that. However, keep in mind that, by default, your logs are kept for only 30 days. That may or may not be enough, depending on your use case.
You can create a custom log bucket with a different retention period, or sink the logs into BigQuery.
The main advantage of BigQuery is the ability to join the log data with other data in BigQuery to perform powerful analytics. But it still depends on your use case.
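As a rough illustration of the client-library route, a Cloud Function could call the Cloud Logging API directly. The function name, filter, and HTTP trigger shape below are assumptions, not something from the question:

from google.cloud import logging

def query_recent_errors(request):
    # Query Cloud Logging directly, without replicating the logs to BigQuery.
    client = logging.Client()
    entries = client.list_entries(
        filter_='severity>=ERROR AND timestamp>="2024-01-01T00:00:00Z"',  # hypothetical filter
        order_by=logging.DESCENDING,
        page_size=50,
    )
    return "\n".join(str(entry.payload) for entry in entries)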

How many pipelines can I create within Cloud Data Fusion?

I can't find limit information about Cloud Data Fusion.
Does anyone know how many data pipelines I can create with Cloud Data Fusion by default? (link/source needed)
You can create as many pipelines as you want, as long as you are not hitting the quotas of the resources used in the pipelines. For example, if your pipeline uses BigQuery, Compute Engine, etc., and one of those hits a quota, then you will not be able to create a new pipeline. See Data Fusion Quotas and limits for reference.

Google Cloud Datastore Billing

gcloud datastore export --namespaces="(default)" gs://${BUCKET}
Will Google charge us for Datastore read operations when we do Datastore exports? We'd like to run nightly backups, but we don't want to get charged an arm and a leg.
Yes. The cost may not be huge unless your table contains lots and lots of entities.
Refer to the table for pricing details. https://cloud.google.com/datastore/pricing
Source:
Export and import operations are charged for entity reads and writes at the rates shown in the table above. If you cancel an export or import, you will be charged for operations performed up until the time that the cancel request has propagated through Cloud Datastore.
https://cloud.google.com/datastore/pricing

Google Cloud Storage network usage

I am using GCS to store images for my Android application.
I searched the Google Cloud Platform console but haven't found network usage or anything that shows how many people uploaded/downloaded how many files/bytes of data. They tell you how they calculate the price, using network and Class A/B operations, but I can't find a place to track this data myself.
You have to export these logs to BigQuery; you can't find them in the GCP interface.
Storage logs are generated once a day and contain the storage usage for the previous day. They are typically created before 10:00 am PST.
Usage logs are generated hourly when there is activity to report in the monitored bucket. Usage logs are typically created 15 minutes after the end of the hour.
https://cloud.google.com/storage/docs/access-logs
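For reference, usage and storage logging has to be enabled per bucket before those logs exist at all. A hedged sketch with the google-cloud-storage client; both bucket names are placeholders:

from google.cloud import storage

client = storage.Client()

# Bucket whose traffic you want to measure, plus a separate bucket that receives the logs.
monitored = client.get_bucket("my-app-images-bucket")
monitored.enable_logging("my-access-logs-bucket", object_prefix="usage")
monitored.patch()

The hourly usage logs and daily storage logs then land in the log bucket as CSV objects, which you can load into BigQuery for analysis.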