Azure VMSS CustomScript extension execution monitoring with Azure Monitor

In some instances, when executing the custom script extension on a Linux/Windows VMSS, the execution fails, perhaps due to a timeout, invalid file URIs, or an invalid storage access token. Is there a way, using Azure Monitor, to capture this failure event so that I can trigger operational activities such as sending emails to the ops team?

The VMExtensionProvisioningError failure events will show up in the activity log. You can send the activity log to a Log Analytics workspace to enable the features of Azure Monitor Logs.
Activity log data in a Log Analytics workspace is stored in a table called AzureActivity that you can retrieve with a log query in Log Analytics. The structure of this table varies depending on the category of the log entry. For a description of the table properties, see the Azure Monitor data reference.
For example, you can filter:
AzureActivity
| where * contains "VMExtensionProvisioningError"
Add additional filters as required.
You can then set up your log alert based on this query.
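If you want to test the query programmatically before wiring up the alert, here is a minimal Python sketch using the azure-monitor-query client library; the workspace ID is a placeholder, and a real alert rule would be configured in Azure Monitor rather than in code.

from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.monitor.query import LogsQueryClient

# Placeholder; replace with your Log Analytics workspace ID.
WORKSPACE_ID = "<your-workspace-id>"

# Same filter as above, projected onto a few useful columns.
QUERY = """
AzureActivity
| where * contains "VMExtensionProvisioningError"
| project TimeGenerated, ResourceGroup, OperationNameValue, ActivityStatusValue
"""

client = LogsQueryClient(DefaultAzureCredential())
response = client.query_workspace(WORKSPACE_ID, QUERY, timespan=timedelta(hours=1))

# Print any matching failure events; a log alert rule would
# instead fire on a non-zero result count.
for table in response.tables:
    for row in table.rows:
        print(dict(zip(table.columns, row)))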

Related

How do I write a Stackdriver log query that samples 5% of all logs returned?

I keep reading that I can write a log query to sample a percentage of logs but I have found zero examples.
https://cloud.google.com/blog/products/gcp/preventing-log-waste-with-stackdriver-logging
You can also choose to sample certain messages so that only a percentage of the messages appear in Stackdriver Logs Viewer
How do I get 10% of all GCE load balancer logs with a log query? I know I can configure this on the backend, but I don't want that. I want to get 100% of logs in Stackdriver and create a Pub/Sub log sink with a log query that only captures 10% of them and sends those sampled logs somewhere else.
I suspect you'll want to create a Pub/Sub sink via the Log Router. See Configure and manage sinks.
Using Google's logging query language, you can use the sample function to filter (include) only a percentage of logs.
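As a hedged sketch of those two suggestions combined, the following Python snippet (using the google-cloud-logging client library) creates a Pub/Sub sink whose filter keeps roughly 10% of HTTP load balancer log entries via the sample function; the sink name, project, and topic are assumptions you would replace with your own.

from google.cloud import logging

client = logging.Client()

# sample(insertId, 0.10) keeps ~10% of matching entries, chosen
# deterministically by a hash of each entry's insertId.
FILTER = 'resource.type="http_load_balancer" AND sample(insertId, 0.10)'

# Hypothetical sink name and Pub/Sub topic; replace with your own.
sink = client.sink(
    "lb-logs-10pct-sample",
    filter_=FILTER,
    destination="pubsub.googleapis.com/projects/my-project/topics/sampled-lb-logs",
)
sink.create()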

Monitoring Alert for Cloud Build failure on master

I would like to receive a notification on my Notification Channel every time in Cloud Build a Build on master fails.
Now there were mentions of using Log Viewer but it seems like there is no immediate way of accessing the branch.
Is there another way where I can create a Monitoring Alert/a Metric which is specific to master?
An easy solution might be to define a logs-based metric and link an alerting policy to it.
1. Configure Slack alerting in Notification channels of GCP.
2. Define your logging metric trigger in Logs-based Metrics. Make a Counter with Units 1 and filter using the logging query language:
resource.type="build"
severity=ERROR
or
resource.type="build"
textPayload=~"^ERROR:"
3. Create an Alerting Policy with the metric you've just defined and link its trigger to the Slack notification channel you configured in step 1.
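If you'd rather create the logs-based metric from step 2 programmatically, here is a minimal Python sketch using the google-cloud-logging client library; the metric name is a hypothetical choice, and the filter mirrors the first query above.

from google.cloud import logging

client = logging.Client()

# Counter metric over failed build log entries; the metric name is
# illustrative, and the filter matches the one shown above.
metric = client.metric(
    "cloud-build-failures",
    filter_='resource.type="build" AND severity=ERROR',
    description="Counts Cloud Build log entries with severity ERROR.",
)
metric.create()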
You can create Cloud Build notifications that send updates to desired channels, such as Slack or your SMTP server. Cloud Build also publishes messages to a Pub/Sub topic when your build's state changes, such as when your build is created or when your build transitions to a working state.
I just went through the pain of trying to get the official GCP Slack integration via Cloud Run working. It was too cumbersome and didn't let me customize what I wanted.
The best solution I see is to set Cloud Build up to send Pub/Sub messages to the cloud-builds topic. With that, you can use the repo below, which I just made public, to filter on the specific branch you want by looking at the data_json['substitutions']['BRANCH_NAME'] field.
https://github.com/Ucnt/gcp-cloud-build-slack-notifier
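As a rough illustration of that approach, a minimal Cloud Function subscribed to the cloud-builds topic might look like the Python sketch below; the notify() helper is a hypothetical stand-in for your Slack webhook call, and the set of failure statuses is an assumption.

import base64
import json

def on_cloud_build_message(event, context):
    """Background Cloud Function triggered by the cloud-builds Pub/Sub topic."""
    build = json.loads(base64.b64decode(event["data"]).decode("utf-8"))

    branch = build.get("substitutions", {}).get("BRANCH_NAME")
    status = build.get("status")

    # Only alert on failed builds of master; the status list is an assumption.
    if branch == "master" and status in ("FAILURE", "TIMEOUT", "INTERNAL_ERROR"):
        notify(f"Build {build.get('id')} on master finished with status {status}")

def notify(message):
    # Hypothetical placeholder; post to your Slack webhook here.
    print(message)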

How to build tenant level metrics counter for GCP dataflow jobs?

Currently I am trying to create custom metrics for a GCP Dataflow job using Apache Beam Metrics, and I wanted to check whether we can track/group counters by tenant. For instance, we have events generated by multiple tenants, and all the events are processed in a single Dataflow job (which writes to Bigtable); I want to add a metrics filter to group them by tenant so we could see elementsAdded at the tenant level.
This is not currently possible with Beam Metrics; they don't have the ability to set extra metric fields such as tenant.
However, you can use the Cloud Monitoring API directly from your pipeline code to export data into Cloud Monitoring with any schema you'd like.
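As a hedged sketch of that idea, the Python snippet below writes a point to a custom metric with a tenant label using the google-cloud-monitoring client library; the metric type, label name, project ID, and value are all illustrative assumptions.

import time

from google.cloud import monitoring_v3

client = monitoring_v3.MetricServiceClient()
project_name = "projects/my-project"  # hypothetical project ID

# The custom metric type and the "tenant" label are assumed names;
# any label you attach becomes a field you can filter and group by.
series = monitoring_v3.TimeSeries()
series.metric.type = "custom.googleapis.com/pipeline/elements_added"
series.metric.labels["tenant"] = "tenant-a"
series.resource.type = "global"
series.resource.labels["project_id"] = "my-project"

now = time.time()
interval = monitoring_v3.TimeInterval(
    {"end_time": {"seconds": int(now), "nanos": int((now - int(now)) * 1e9)}}
)
series.points = [
    monitoring_v3.Point({"interval": interval, "value": {"int64_value": 128}})
]

client.create_time_series(name=project_name, time_series=[series])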

What is the log mechanism in AWS CloudWatch?

I have recently started learning about AWS CloudWatch, and I want to understand the concept of creating logs, so I went through a lot of links like
https://aws.amazon.com/answers/logging/centralized-logging/
I understand that we can create log groups and that logs are basically for tracking activity, but is there anything more to it? When do the logs get created?
Any help would be highly appreciated!
You can get more details about log groups and CloudWatch Logs concepts here.
The following is an extract from that page:
Log Events
A log event is a record of some activity recorded by the application or resource being monitored. The log event record that CloudWatch Logs understands contains two properties: the timestamp of when the event occurred, and the raw event message. Event messages must be UTF-8 encoded.
Log Streams
A log stream is a sequence of log events that share the same source. More specifically, a log stream is generally intended to represent the sequence of events coming from the application instance or resource being monitored. For example, a log stream may be associated with an Apache access log on a specific host. When you no longer need a log stream, you can delete it using the aws logs delete-log-stream command. In addition, AWS may delete empty log streams that are over 2 months old.
Log Groups
Log groups define groups of log streams that share the same retention, monitoring, and access control settings. Each log stream has to belong to one log group. For example, if you have a separate log stream for the Apache access logs from each host, you could group those log streams into a single log group called MyWebsite.com/Apache/access_log.
And to answer your question "When do the logs get created?": that is completely dependent on your application. However, whenever they are created, they get streamed to CloudWatch log streams (provided you have installed the CloudWatch agent and are streaming that particular log).
The advantage of using CloudWatch is that you can retain logs even after your EC2 instance is terminated, and you don't need to SSH into the resource to check the logs; you can simply view them from the AWS Console.
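To make the group/stream/event relationship concrete, here is a small Python sketch using boto3 that creates the example log group and stream from the extract above and writes one log event into it; the group, stream, and message are illustrative only.

import time

import boto3

logs = boto3.client("logs")

GROUP = "MyWebsite.com/Apache/access_log"  # log group from the example above
STREAM = "host-1"                          # one stream per host

logs.create_log_group(logGroupName=GROUP)
logs.create_log_stream(logGroupName=GROUP, logStreamName=STREAM)

# A log event is just a timestamp (ms since epoch) plus a raw UTF-8 message.
logs.put_log_events(
    logGroupName=GROUP,
    logStreamName=STREAM,
    logEvents=[
        {"timestamp": int(time.time() * 1000), "message": "GET /index.html 200"}
    ],
)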

Pinpoint session duration

Is there any way to display users' session duration on the Amazon Pinpoint dashboard?
All Pinpoint events have a startTimestamp tag which shows the time for that event, but I could not find session length in the dashboard.
Unfortunately, this is not currently supported. You can get this data by exporting the metrics to a Redshift cluster and querying the analytics data using SQL (or by exporting to S3 and processing it with EMR).
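As a very rough sketch of the Redshift route using Python and the Redshift Data API: the table name (pinpoint_events), the column names, and the cluster details below are all assumptions about how you've flattened the exported events into Redshift, not a documented Pinpoint schema.

import boto3

redshift = boto3.client("redshift-data")

# Assumed schema: one row per event with millisecond timestamps; the
# _session.stop event type carries both start and stop timestamps.
SQL = """
SELECT session_id,
       (stop_timestamp - start_timestamp) / 1000.0 AS session_seconds
FROM pinpoint_events
WHERE event_type = '_session.stop'
"""

redshift.execute_statement(
    ClusterIdentifier="my-cluster",  # hypothetical cluster name
    Database="analytics",            # hypothetical database
    DbUser="admin",                  # hypothetical DB user
    Sql=SQL,
)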