How to enable observability for NebulaGraph in AWS?

I would like to learn how to enable Grafana to monitor the metrics of NebulaGraph on AWS. I walked through https://aws-quickstart.github.io/quickstart-vesoft-nebula-graph-cloud/ but still could not find clues on how to do it.
I noticed there is a NebulaGraph Dashboard out there, but our monitoring infrastructure is Grafana based. Could I wire them together?

In my experience, if you already have a monitoring stack built on Grafana and, most likely, Prometheus, you can use the NebulaGraph tool stats-exporter to export the monitoring metrics into your system. The instructions are in its README.
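For example, once stats-exporter is up, wiring it into an existing Prometheus is just one more scrape job. A minimal sketch, assuming the exporter listens on port 9100 (check its README for the actual default and flags):

```yaml
# prometheus.yml -- minimal scrape job for nebula-stats-exporter.
# The host and port below are placeholders, not verified defaults.
scrape_configs:
  - job_name: 'nebula-stats-exporter'
    scrape_interval: 15s
    static_configs:
      - targets: ['<exporter-host>:9100']  # address of the exporter
```

From there, the metrics are plain Prometheus series, so Grafana dashboards work against them like any other Prometheus data source.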

Related

What are advantages of using Prometheus over Google Cloud Monitoring for SLO based monitoring?

I am working on creating monitoring based on SLO. So far I have been using Google Cloud Monitoring solutions like Dashboards, Alerting and Uptime Checks.
I have noticed that GCP now has a Managed Service for Prometheus.
My question is: what would be the advantage of using Prometheus (not only the Google-managed one) for monitoring? Is there anything that could be achieved with Prometheus that I could not achieve with Google Cloud Monitoring?
Managed Service for Prometheus is a managed and automatically scalable Prometheus endpoint. You can query the metrics with the PromQL language instead of MQL (Monitoring Query Language).
What's the advantage? If you deploy an application instrumented with OpenTelemetry (for example), you don't have to change anything. On Kubernetes (GKE), the managed collector does the job for you. Otherwise, you have to configure the collector to use Managed Service for Prometheus.
If you build an app from scratch and you want it portable, OpenTelemetry and Prometheus are standard tools to instrument your app.
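For instance, a Node.js app can expose a Prometheus metrics endpoint with very little code. Here is a minimal sketch using the prom-client library; the metric name and port are arbitrary examples:

```javascript
// Expose a Prometheus-format /metrics endpoint from a Node.js app.
// The metric name and port are illustrative, not from a real system.
const http = require('http');
const client = require('prom-client');

const register = new client.Registry();
client.collectDefaultMetrics({ register }); // CPU, memory, event loop, etc.

const requests = new client.Counter({
  name: 'app_requests_total', // example metric name
  help: 'Total HTTP requests handled',
  registers: [register],
});

http.createServer(async (req, res) => {
  if (req.url === '/metrics') {
    // Prometheus (or GKE's managed collector) scrapes this endpoint.
    res.setHeader('Content-Type', register.contentType);
    res.end(await register.metrics());
  } else {
    requests.inc();
    res.end('ok');
  }
}).listen(8080);
```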
If not, use Cloud Monitoring!
Important note
This feature is very new and, for now, only the metrics ingested through Managed Service for Prometheus can be queried with PromQL; the other metrics must be queried with MQL. This could change in the future.
So, for now, if you can use the built-in Cloud Monitoring metrics, that is the better solution.

GCP | How can I see all the running virtual machines in a project?

I wrote some code to automate the training procedure on our company's VM instances.
As you probably know, sometimes GCP can't provide you with a machine at that moment (an 'out of resource' exception), so I'd like to monitor which of my machines successfully turned on and which did not.
If there is some way to show this in BigQuery, that would be great.
Thanks.
Using the Cloud Monitoring (Stackdriver) functionality is a good way to monitor all your VMs.
Here is a detailed guide to implement Monitoring on a Compute Engine Instance.
Hope you find it useful.
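If you only need a point-in-time view of which machines came up, the gcloud CLI can also list them directly; for example:

```
gcloud compute instances list --filter="status=RUNNING"
```

Drop the filter to see every instance in the project along with its current status.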
You can use Google Cloud's activity logs too:
Activity logging is enabled by default for all Compute Engine projects.
You can see your project's activity logs through the Logs Viewer in the Google Cloud Console: in the Cloud Console, go to the Logging page. When in the Logs Viewer, select and filter your resource type from the first drop-down list. From the All logs drop-down list, select compute.googleapis.com/activity_log to see Compute Engine activity logs.
Here is the official documentation.
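As a concrete sketch, the corresponding advanced filter in the Logs Viewer looks roughly like this (PROJECT_ID is a placeholder; %2F is the URL-encoded slash in the log name):

```
resource.type="gce_instance"
logName="projects/PROJECT_ID/logs/compute.googleapis.com%2Factivity_log"
```

You can also export these logs to BigQuery with a log sink, which would cover the BigQuery part of the question.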

Metric Registrar in Cloud Foundry

Does Metric Registrar work in Cloud Foundry without Pivotal?
I have open source Cloud Foundry and I need to get custom metrics from an app. I installed the Metric Registrar community plugin for CF, registered my application with an endpoint, and also defined the log format. Unfortunately, I see no traffic on the registered endpoint.
If open source Cloud Foundry does not support Metric Registrar, is there any other way to get support for custom app metrics?
Does Metric Registrar work in Cloud Foundry without Pivotal?
The Metric Registrar is part of the VMware Tanzu Application Service product; it's not part of the open source Cloud Foundry project. It's a value-add feature for those using the paid product.
If open source Cloud Foundry does not support Metric Registrar, is there any other way to get support for custom app metrics?
You don't strictly need the Metric Registrar to do this. The Metric Registrar's main purpose is to take metrics from your apps and inject them into the Loggregator log/metric stream. This is convenient if you have other software that is already consuming log & metric streams from Loggregator.
You don't have to do that though, as there are other ways to export metrics from your app.
If you want them to go through Loggregator, you could emit structured log messages (perhaps JSON?) via STDOUT that contain your metrics. Those will, like your other log messages, go out through Loggregator. You would then just need something ingesting your logs, identifying the structured messages, and parsing out your metrics. This is similar to what Metric Registrar does; you're just parsing out the structured log entries after they leave the platform.
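A minimal sketch of that idea in Node.js; the field names are made up, and whatever consumes your Loggregator stream must be taught to recognize them:

```javascript
// Emit a structured "metric" line to STDOUT; Loggregator forwards it
// like any other app log. All field names here are illustrative.
function emitMetric(name, value, tags = {}) {
  console.log(JSON.stringify({
    type: 'metric',   // marker your log consumer filters on
    name,             // e.g. 'orders_processed'
    value,
    tags,
    ts: Date.now(),   // timestamp in milliseconds
  }));
}

emitMetric('orders_processed', 42, { region: 'us-east' });
```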
If you have an ELK stack or similar running, you can probably make this solution work easily enough. ELK can ingest your logs & structured log metrics, then you can search/filter through the metrics and create dashboards.
Another option is to run Prometheus/Grafana. You just need to make sure your app has a Prometheus Exposition metrics endpoint (this is super easy with Java/Spring Boot & Spring Boot Actuator, but can be done in any language). Point Prometheus at your app and it will be able to scrape metrics from it, and you can use Grafana to view them. None of this goes through Loggregator.
If you're looking for a solution that's more automatic, you could run an APM agent (NewRelic, DataDog, AppDynamics, Dynatrace, etc..) with your apps. These will capture metrics directly from the process and export them to a SaaS platform where you can monitor/review them.
There are probably other options as well. This is just what comes to mind as I write this up.

Unable to autoscale GCP instances using custom memory metrics

I am trying to autoscale GCP instances based on memory metrics, but I am unable to find how this can be done. I have tried to set this up through "Stackdriver monitoring metrics" but had no luck. Can someone explain how this can be done?
A similar problem was posted on the Google forum, but there is no proper answer there either:
https://groups.google.com/forum/#!topic/gce-discussion/X6LA0-8mFak
You need to install the Stackdriver Monitoring Agent by following this documentation.
Once it is installed, you will get more options for configuring your autoscaler on your instance group page.
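Once the agent is reporting memory data, the autoscaling policy itself can also be set from the CLI. A sketch, assuming the agent's memory metric name (verify the current metric path and flags against the gcloud docs):

```
gcloud compute instance-groups managed set-autoscaling my-group \
  --zone us-central1-a \
  --max-num-replicas 10 \
  --custom-metric-utilization \
      metric=agent.googleapis.com/memory/percent_used,utilization-target=60,utilization-target-type=GAUGE
```

Here my-group, the zone, and the 60% target are placeholders for your own values.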

How to integrate on-premises logs with GCP Stackdriver

I am evaluating Stackdriver from GCP for logging across multiple microservices.
Some of these services are deployed on premises and some are on AWS/GCP.
Our services are either .NET or Node.js based apps, and we are invested in winston for Node.js and NLog in .NET.
I was looking at integrating our on-premises Node.js application with Stackdriver Logging. Looking at the documentation at https://cloud.google.com/logging/docs/setup/nodejs, it seems that we need to install the agent for any machine other than Google Compute instances. Is this correct?
If we need to install the agent, is there any way I can test the logging during development? The development environment is either Windows 10 or Mac.
There's a new option for ingesting logs (and metrics) with Stackdriver, as most of the non-Google environment agents look like they are being deprecated: https://cloud.google.com/stackdriver/docs/deprecations/third-party-apps
Here is a Google post on logging on-premises resources with Stackdriver and Blue Medora:
https://cloud.google.com/solutions/logging-on-premises-resources-with-stackdriver-and-blue-medora
For logs, you still need to install an agent on each box to collect them; it's a BindPlane agent, not a Google agent.
For Node.js, you can use the @google-cloud/logging-winston and @google-cloud/logging-bunyan modules from anywhere (on-prem, AWS, GCP, etc.). You will need to provide the projectId and auth credentials manually if not running on GCP. Instructions on how to set these up are available in the linked pages.
When running on GCP, we figure out the exact environment (App Engine, Compute Engine, etc.) automatically, and the logs should show up under those resources in the Logging UI. If you are going to use the modules from your development machines, we will report the logs against the 'global' resource by default. You can customize this by passing a specific resource descriptor yourself.
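To make the manual-credentials case concrete, a minimal winston setup with the Stackdriver transport might look like this; the projectId and key path are placeholders for your own values:

```javascript
// Ship winston logs to Stackdriver (Cloud Logging) from anywhere.
// On GCP the credentials below are detected automatically.
const winston = require('winston');
const { LoggingWinston } = require('@google-cloud/logging-winston');

const stackdriver = new LoggingWinston({
  projectId: 'your-project-id',                  // placeholder
  keyFilename: '/path/to/service-account.json',  // placeholder
});

const logger = winston.createLogger({
  level: 'info',
  transports: [
    new winston.transports.Console(), // local output while developing
    stackdriver,                      // sends entries to Cloud Logging
  ],
});

logger.info('hello from an on-prem service');
```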
Let us know if you run into any trouble.
I tried setting this up on my local k8s cluster by following this: https://kubernetes.io/docs/tasks/debug-application-cluster/logging-stackdriver/
But I couldn't get it to work; the fluentd-gcp-v2.0-qhqzt pod keeps crashing.
Also, the page mentions that there are multiple issues with Stackdriver logging if you DON'T use it on Google GKE.
I think Google is trying to lock you into GKE.