Setup "Stackdriver Kubernetes Monitoring" for AWS - amazon-web-services

Google Cloud Platform announced "Stackdriver Kubernetes Monitoring" at Kubecon 2018. It looks awesome.
I am an AWS user running a few Kubernetes clusters and immediately had envy, until I saw that it also supported AWS and "on prem".
Stackdriver Kubernetes Engine Monitoring
This is where I am getting a bit lost.
I cannot find any documentation for helping me deploy the agents onto my Kubernetes clusters. The closest example I could find was here: Manual installation of Stackdriver support, but the agents are polling for "internal" GCP metadata services.
E0512 05:14:12 7f47b6ff5700 environment.cc:100 Exception: Host not found (authoritative): 'http://metadata.google.internal./computeMetadata/v1/instance/attributes/cluster-name'
I'm not sure the Stackdriver dashboard has "Stackdriver Kubernetes Monitoring" turned on. I don't seem to have the same interface as the demo on YouTube here
I'm not sure if this is something which will get turned on when I configure the agents correctly, or something I'm missing.
I think I might be missing some "getting started" documentation which takes me through the setup.

You can use a Stackdriver partner service, Blue Medora BindPlane, to monitor AWS Kubernetes or almost anything else in AWS for that matter or on-premise. Here's an article from Google Docs about the partnership: About Blue Medora; you can signup for BindPlane through the Google Cloud Platform Marketplace.
It looks like BindPlane is handling deprecated Stackdriver monitoring agents. Google Cloud: Transition guide for deprecated third-party integrations

As per this article, currently Stackdriver Kubernetes Monitoring beta release only supports Kubernetes version v1.10.2 clusters running on Google Cloud Platform's Kubernetes Engine. To track when this feature will be available in AWS, I suggest creating a feature request in Public Issue Tracker.

Stackdriver monitoring of Amazon EKS, Azure AKS, and general purpose Kubernetes running on non--GCP hosted VMs is available if you enable the BindPlane option for Stackdriver.
https://cloud.google.com/stackdriver/blue-medora

Related

What are advantages of using Prometheus over Google Cloud Monitoring for SLO based monitoring?

I am working on creating monitoring based on SLO. So far I have been using Google Cloud Monitoring solutions like Dashboards, Alerting and Uptime Checks.
I have noticed GCP has now a Managed Service for Prometheus.
My question is what would be the advantage of using Prometheus(not only Google managed one)for monitoring. Is there anything that could be achieved with Prometheus that I could not achive with Google Cloud Monitoring?
Managed service for prometheus is a managed and automatically scalable prometheus endpoint. You can request the metrics with PromQL language instead of MQL (Monitoring Query Language).
What's the advantage? If you deploy an application instrumented with Open Telemetry (for example), you don't have to change anything. On Kubernetes (GKE), the managed collector do the job for you. Else you have to configure the collector to use Managed Service for Prometheus.
If you build an app from scratch, and you want it portable, Open Telemetry and Prometheus are standard tools to instrument your app.
If not, use Cloud Monitoring!
Important note
That feature is very new and, for now, only the metrics sinks with Managed Service for Prometheus can be query with PromQL. The other metrics must be requested by MQL. It could change in the future.
So, for now, if you can use built in Cloud Monitoring metrics, it's a better solution.

Unable to register external kubernetes cluster with GKE

I'm trying to set up a multi-cloud deployment using GKE as a single plain of glass for cluster management. Unfortunately, I can't see "Register cluster" option within GKE. I can create a cluster, I can delete a cluster, I can deploy a workload to a cluster, but the option with registering the new cluster is not available for me.
I'm not using the free tier and I'm not within an Organisation also.
Could somebody help me to figure out why it is so? I could not find the solution digging through GCP documentation.
Thank you in advance
I think what you are looking for is Anthos. It has a unified user interface and in the Anthos for operations section of the documentation it says:
Single pane of glass visibility across all clusters ...
But the link to the documentation to register a cluster gives me a 404.... I would suggest reaching out to Google Cloud Support to see if they can help you.
edit: It turns out that you need to be an Anthos customer to access the both the feature and the documentation for the feature.

GKE - Stackdriver Kubernetes Monitoring

Following the steps provided in this documentation.
I was looking into better monitoring of our GKE cluster and so thought I'd try out the beta kubernetes Stackdriver monitoring. My cluster version is 1.11.7 (later than the suggested 1.11.2) and I created the cluster with the --enable-stackdriver-kubernetes flag.
In the cluster details Stackdriver logging and monitoring is listed as 'Enabled v2(beta)' however in the stackdriver resources menu the 'kubernetes beta' option will simply not appear as shown here.
I have also confirmed fluentd, heapster and metadata-agent pods are running within the cluster as suggested by the docs.
Any possible suggestions are much appreciated.
I managed to resolve this issue:
Firstly the 'Kubernetes Beta' option appeared in Stackdriver appeared without me making any changes to the cluster(Slightly annoying)
I gave the clusters service account the appropriate monitoring and logging roles.

Update cluster stackdriver version in google cloud

I have this situation with my kubernetes cluster and I would like to rollback to the stable version of the google cloud, anybody knows how to do that?
This is the new logging and monitoring option described at https://cloud.google.com/monitoring/kubernetes-engine/
As far as I know there isn't a downgrade option. If you create a new cluster, you can choose which logging/monitoring integration to use.

How to integrate on premise logs with GCP stackdriver

I am evaluating stackdriver from GCP for logging across multiple micro services.
Some of these services are deployed on premise and some of them are on AWS/GCP.
Our services are either .NET or nodejs based apps and we are invested in winston for nodejs and nlog in .net.
I was looking # integrating our on-premise nodejs application with stackdriver logging. Looking # https://cloud.google.com/logging/docs/setup/nodejs the documentation it seems that there we need to install the agent for any machine other than the google compute instances. Is this correct?
if we need to install the agent then is there any way where I can test the logging during my development? The development environment is either a windows 10/mac.
There's a new option for ingesting logs (and metrics) with Stackdriver as most of the non-google environment agents look like they are being deprecated. https://cloud.google.com/stackdriver/docs/deprecations/third-party-apps
A Google post on logging on-prem resources with stackdriver and Blue Medora
https://cloud.google.com/solutions/logging-on-premises-resources-with-stackdriver-and-blue-medora
for logs you still need to install an agent on each box to collect the logs, it's a BindPlane agent not a Google agent.
For node.js, you can use the #google-cloud/logging-winston and #google-cloud/logging-bunyan modules from anywhere (on-prem, AWS, GCP, etc.). You will need to provide projectId and auth credentials manually if not running on GCP. Instructions on how to set these up is available in the linked pages.
When running on GCP we figure out the exact environment (App Engine, Compute Engine, etc.) automatically and the logs should up under those resources in the Logging UI. If you are going to use the modules from your development machines, we will report the logs against the 'global' resource by default. You can customize this by passing a specific resource descriptor yourself.
Let us know if you run into any trouble.
I tried setting this up on my local k8s cluster. By following this: https://kubernetes.io/docs/tasks/debug-application-cluster/logging-stackdriver/
But i couldnt get it to work, the fluentd-gcp-v2.0-qhqzt keeps crashing.
Also, the page mentions that there are multiple issues with stackdriver logging if you DONT use it on google GKE. See the screenshot.
I think google is trying to lock you in into GKE.