I have several GCE instances in two zones: asia-southeast1-b and us-east4-c. The Stackdriver agent is already installed on all of them. In Metrics Explorer, I can't find asia-southeast1-b in the CPU load metric.
But CPU usage is OK.
What's wrong with this?
Can you execute this command inside the VMs deployed in asia-southeast1-b:
grep collectd /var/log/{syslog,messages} | tail
This will show if there is any error with the agent.
To my understanding, this metric (CPU load) is collected by the Stackdriver agent and then sent to Monitoring.
Let’s see if we can understand what is happening:
Is there a problem with Stackdriver Agent gathering that metric?
Or is there a problem in Monitoring API while ingesting it?
Let me ask you some questions:
Are you using a different operating system on the instances in asia-southeast1-b compared to the ones running in us-east4-c?
Which version of the Stackdriver agent are you running?
This link will let you determine which version you have installed. [1]
Did you make any changes in the configuration of the Stackdriver agent? The file is located in /etc/stackdriver/collectd.conf
Best regards,
[1] https://cloud.google.com/monitoring/agent/install-agent#agent-version
I fixed this error by adding the Monitoring Metric Writer permission to the service account.
https://stackoverflow.com/a/45068262/380774
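For anyone hitting the same thing, that permission can be granted from the command line. A minimal sketch, with hypothetical project and service-account names, echoed as a dry run so nothing changes until you remove the `echo`:

```shell
# Hypothetical names -- substitute your own project and service account.
PROJECT_ID="my-project"
SA_EMAIL="my-agent-sa@my-project.iam.gserviceaccount.com"

# Grant the Monitoring Metric Writer role so the agent can write metrics.
# Echoed as a dry run; drop the leading 'echo' to actually apply it.
echo gcloud projects add-iam-policy-binding "$PROJECT_ID" \
  --member="serviceAccount:$SA_EMAIL" \
  --role="roles/monitoring.metricWriter"
```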
I am planning to migrate our app to AWS Fargate, and I want to set up logging for it and store all the logs in CloudWatch. I see that we have two options in Fargate: use the default awslogs log driver, or use AWS FireLens to gather logs. I read the AWS documentation but unfortunately still can't figure out which option to use and when, precisely. Also, can someone advise on the cost side: how do the awslogs driver and AWS FireLens compare when sending logs to CloudWatch in the same account? (I am looking for an easy, efficient, and cost-effective option.)
Is it fair to say that, in general, you use AWS FireLens when you want to send log data to non-AWS tools like the Elastic Stack or Datadog, and the awslogs driver when sending logs to CloudWatch?
Can someone please advise?
Using the Fargate launch type and want to use CloudWatch: you have to use the awslogs driver in your task definition. You can find more information about CloudWatch pricing here. CloudWatch has a free tier, and anything after the free-tier cap (metrics, dashboards, alarms, logs, events, etc.) has a different pricing calculation. For example, the first 10,000 metrics cost $0.30 each in most regions, the next 240,000 cost $0.10 each, and events are priced at $1.00 per million.
Using Fargate and don't want to use CloudWatch: use AWS FireLens to push container logs to a third-party log-storage system. The cost of the third-party system comes into play here. Datadog, AppDynamics, and others usually offer tiered packages (Free/Premium/Enterprise, etc.). Unlike CloudWatch, each package gives you different capabilities; for example, on the free tier in Datadog you do not have alerts. Also, non-AWS-native monitoring tools are usually priced per host/CPU core for a specific number of hours.
FireLens also makes sense if you want to ship to CloudWatch but want to do upfront filtering, rather than sending everything to CloudWatch.
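The metric pricing quoted above is tiered, so a quick sketch of the arithmetic may help (the two per-metric rates are the figures quoted above; real CloudWatch prices vary by region, so treat this as illustration only):

```shell
# Tiered cost sketch: $0.30 each for the first 10,000 metrics,
# $0.10 each for the next 240,000 (rates as quoted above).
metric_cost() {
  awk -v n="$1" 'BEGIN {
    first = (n < 10000 ? n : 10000) * 0.30
    rest  = (n > 10000 ? n - 10000 : 0) * 0.10
    printf "%.2f\n", first + rest
  }'
}

metric_cost 15000   # 10,000 * 0.30 + 5,000 * 0.10 = 3500.00
```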
If you simply want all the log output from your ECS Fargate tasks to go to AWS CloudWatch Logs, then use the awslogs driver. This basically works "out of the box" with no further configuration needed on your part. This is the easiest solution. The only additional cost will be the cost of CloudWatch Logs, detailed in the "Logs" tab here.
If you want to send logs to some other logging service, like Splunk, then use the Firelens driver, and provide a Firelens configuration file that tells Firelens where to send your logs. There is no added cost for using the Firelens driver, but of course there is the added cost of whatever target services you configure Firelens to send your logs to.
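To make the two options concrete, here is roughly what each looks like in a container's logConfiguration inside the task definition. The log group, stream prefix, and region are hypothetical placeholders, and the FireLens options assume the Fluent Bit cloudwatch output plugin.

With the awslogs driver:

```json
{
  "logConfiguration": {
    "logDriver": "awslogs",
    "options": {
      "awslogs-group": "/ecs/my-app",
      "awslogs-region": "us-east-1",
      "awslogs-stream-prefix": "my-app"
    }
  }
}
```

With FireLens, still targeting CloudWatch:

```json
{
  "logConfiguration": {
    "logDriver": "awsfirelens",
    "options": {
      "Name": "cloudwatch",
      "region": "us-east-1",
      "log_group_name": "/ecs/my-app",
      "log_stream_prefix": "my-app-"
    }
  }
}
```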
I want to monitor free disk space on an EC2 instance using CloudWatch but can't find any good tutorials. The instance runs Ubuntu Linux 18.
Any help?
Start here: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Install-CloudWatch-Agent.html - this covers installing the CloudWatch agent.
Then https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-Agent-Configuration-File-Details.html - this shows how to configure the agent. You can get it to send various metrics to CloudWatch, including disk usage.
Once you have the metrics in CloudWatch, you can set up alarms on them, also in CloudWatch.
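As a concrete starting point, the disk section of the agent's configuration file might look like this (a sketch based on the configuration-file reference above; the mount point and collection interval are illustrative):

```json
{
  "metrics": {
    "metrics_collected": {
      "disk": {
        "measurement": ["used_percent", "free"],
        "resources": ["/"],
        "metrics_collection_interval": 60
      }
    }
  }
}
```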
I'm trying to set up a Google Cloud Monitoring dashboard for my GCE instances, but I'm experiencing some difficulties when trying to filter.
I have several GCE instances, and some are not running (they are backups), but they are still displayed in Cloud Monitoring.
I would like to monitor 3 metrics (for now): CPU, memory, and disk usage.
CPU wasn't a problem, as I could just filter by GCE instance name:
But if I try to do the same for memory and disk usage, I don't have the option to filter as I did for CPU. I tried several different approaches, like filtering by "metadata labels:name", "label", "zone", etc. - all result in "no data available for selected timeframe" (without the filter, data is displayed). I feel like I'm missing something trivial:
What am I doing wrong? How can I filter by instance name? Do I need to activate some logger on Google Cloud? Thank you very much in advance!
Use the Cloud Monitoring agent to gather system and application metrics (disk, CPU, network and process) from VM instances and send them to Monitoring.
Install the Monitoring Agent
Use the Cloud Logging agent to gather logs from VM instances and send them to Cloud Logging.
Install the Logging Agent
My question is about configuration when monitoring AWS metrics with Stackdriver.
I tried the steps below, but the alert policy is not firing.
How do I make an alert policy work with group settings?
I don't want single-instance monitoring; I want group settings.
I completed the Stackdriver monitoring setup for my AWS accounts using role settings. Next, I configured a group-based alert policy on the metrics below:
load average > 5
disk usage > 80%
The targets are several EC2 instances, and they are in a group.
After completing these settings, I ran a stress test.
I looked at the metrics, and the graph exceeded the threshold.
But no alert fired, and no incident was opened.
Details below.
Alert Policy Creation
Go to [Alerting / Policies / TARGET POLICY]
Click [Add Condition], then select [Metric Threshold]
RESOURCE TYPE is Instance (EC2)
APPLIES TO is Group
Select the group. This group includes the EC2 instances.
CONDITION TRIGGERS IF: Any Member Violates
IF METRIC is [CPU Load Average (past 1m)]
CONDITION is above
THRESHOLD is 5 load
FOR is 1 minute
Enter a name and click [Save Policy]
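For reference, the UI steps above map roughly onto this alert-policy body in the Cloud Monitoring v3 API. The group ID, display names, and exact metric type here are illustrative assumptions, not taken from the setup above:

```json
{
  "displayName": "EC2 load average > 5",
  "combiner": "OR",
  "conditions": [
    {
      "displayName": "Any member violates",
      "conditionThreshold": {
        "filter": "metric.type=\"agent.googleapis.com/cpu/load_1m\" AND group.id=\"1234567890\"",
        "comparison": "COMPARISON_GT",
        "thresholdValue": 5,
        "duration": "60s"
      }
    }
  ]
}
```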
Stress Test
SSH to the target instances.
Run the stress test.
Confirm that the load average rose above 5.
But no alert fired.
Checking Stackdriver
Confirmed on the alert settings page that the load average reached 5.
But no incident was opened.
Other Settings I Tried
For GCP instances, alerts work correctly with both group and single-instance settings.
For AWS instances, alerts work in the single-instance configuration, but not with group settings.
Version info
stackdriver
stackdriver-agent version: stackdriver-agent.x86_64 5.5.2-366.amzn1
aws
OS: Amazon Linux
VERSION: 2016.03
ID_LIKE: rhel fedora
For more detail, please ask in the comments.
If the agent wasn't configured correctly and is sending metrics to the wrong project, that could lead to the behavior described: it works for single instances but not for a group of instances. It may still appear to work for GCP instances because monitoring GCE instances requires zero setup. The misconfiguration causes any alerts that use group filters to not work.
https://cloud.google.com/monitoring/agent/troubleshooting#verify-project
"If you are using an Amazon EC2 VM instance, or if you are using private-key credentials on your Google Compute Engine instance, then the credentials could be invalid or they could be from the wrong project. For AWS accounts, the project used by the agent must be the AWS connector project, typically named "AWS Link..."."
The instructions at https://cloud.google.com/monitoring/agent/troubleshooting#verify-running help verify that the agent is sending metrics correctly.
Suppose I have an EC2 instance, which I understand is a VM instance. If I enable CloudWatch for this EC2 instance, is the monitoring capability offered by CloudWatch added into my EC2 instance, or does it just run in the hypervisor, like Xen?
Thanks.
CloudWatch monitoring is always enabled by default for every EC2 instance at 5-min granularity. What you can enable is detailed monitoring which means you get 1-min observation granularity and aggregate metrics. Default monitoring at 5-min level is free, but detailed monitoring costs money.
Out-of-the-box CloudWatch metrics are measured at hypervisor level and you do not need to do anything to turn them on. See more info on what metrics are available here: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-cloudwatch.html
Things like memory utilization and disk space can't be measured at hypervisor level so CloudWatch distributes a simple package with scripts that can be installed on the instance (Linux or Windows.) Those scripts report the data as custom metrics which also costs money. See http://docs.aws.amazon.com/AmazonCloudWatch/latest/DeveloperGuide/mon-scripts.html
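Note that the Perl scripts linked above have since been superseded by the unified CloudWatch agent, which covers the same in-guest metrics. A minimal sketch of its memory section, using field names from the agent's configuration-file reference (the interval is illustrative):

```json
{
  "metrics": {
    "metrics_collected": {
      "mem": {
        "measurement": ["mem_used_percent"],
        "metrics_collection_interval": 60
      }
    }
  }
}
```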
It is monitored at the hypervisor layer. Amazon generally will not look inside the instance at the VM layer, so they can't monitor some features, such as memory usage, from outside the VM.