Google Cloud Logging assigns ERROR severity to all Python logging.info calls - google-cloud-platform

It's my first time using Google Cloud Platform, so please bear with me!
I've built a scheduled workflow that simply runs a Batch job. The job runs Python code and uses the standard logging library for logging. When the job is executed, I can correctly see all the entries in Cloud Logging, but all the entries have severity ERROR although they're all INFO.
One possible reason I've been thinking about is that I haven't used the setup_logging function as described in the documentation here. The thing is, I didn't want to run the Cloud Logging setup when I run the code locally.
The questions I have are:
why does logging "work" (in the sense that logs end up in Cloud Logging) even if I did not use the setup_logging function? What is its real role?
why do my INFO entries show up with ERROR severity?
if that snippet solves this issue, should I include an if statement in my code that detects whether I am running the code locally and skips the Cloud Logging setup step?

According to the documentation, you have to run a small setup step to send logs to Cloud Logging with the correct metadata.
Once that setup is done, you can keep using the Python standard logging library.
Once installed, this library includes logging handlers to connect
Python's standard logging module to Logging, as well as an API client
library to access Cloud Logging manually.
# Imports the Cloud Logging client library
import google.cloud.logging
# Instantiates a client
client = google.cloud.logging.Client()
# Retrieves a Cloud Logging handler based on the environment
# you're running in and integrates the handler with the
# Python logging module. By default this captures all logs
# at INFO level and higher
client.setup_logging()
Then you can use the Python standard library to add logs to Cloud Logging.
# Imports Python standard library logging
import logging
# The data to log
text = "Hello, world!"
# Emits the data using the standard logging module
logging.warning(text)
why does logging "work" (in the sense that logs end up in Cloud Logging) even if I did not use the setup_logging function? What is its real role?
Without the setup, the logs still reach Cloud Logging only because Batch captures the job's console output (stdout/stderr) and forwards it as plain text entries. Python's default logging handler writes to stderr, and stderr output is labelled ERROR, so the severity is not the one you set in code. With the setup, the entries go through a Cloud Logging handler and keep their real severity, so it's better to use the setup.
why do my INFO entries show up with ERROR severity?
For the same reason explained above: the severity is inferred from the output stream rather than from the Python log level.
if that snippet solves this issue, should I include an if statement in my code that detects whether I am running the
code locally and skips the Cloud Logging setup step?
I don't think you need an if statement for when you run the code locally. In that case, the logs should still be printed to the console even if the setup is present.
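That said, if you'd rather skip the Cloud Logging setup entirely when running locally, a minimal sketch of the guard you describe could look like this. It assumes BATCH_TASK_INDEX is one of the environment variables Batch sets on its runners; if you don't want to rely on that, use any flag you set yourself in the job definition instead.
# Hypothetical guard: only wire up Cloud Logging when the code is running on Batch.
# BATCH_TASK_INDEX is assumed to be set by Batch on its runners; swap in your own
# environment variable if you prefer an explicit flag.
import logging
import os

if os.getenv("BATCH_TASK_INDEX") is not None:
    import google.cloud.logging

    client = google.cloud.logging.Client()
    client.setup_logging()  # attach the Cloud Logging handler to the root logger
else:
    # Local run: plain console logging at INFO level
    logging.basicConfig(level=logging.INFO)

logging.info("Job started")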

Related

How can I push text log files into Cloud Logging?

I have an application (Automation Anywhere A360) that, whenever I want to log something, writes it to a txt/csv file. I run a process in Automation Anywhere that executes concurrently on 10 bot runners (Windows VMs), so each bot runner logs what is going on locally.
My intention is that instead of having separate log files for each bot runner, I'd like to have a centralized place where I store all the logs (i.e. Cloud Logging).
I know this can be accomplished using Python, Java, etc. However, if I invoke a Python script every time I need to log something to Cloud Logging, it does the job but takes around 2-3 seconds, most of which is spent connecting to the GCP client and authenticating. I think this is a bit slow.
How would you tackle this?
The solution I was looking for is something like this: it is named BindPlane, and it can collect log data from on-premises and hybrid infrastructure and send it to the GCP monitoring/logging stack.
To whom it may (still) concern: you could use fluentd to forward logs to Pub/Sub and from there to a Cloud Logging bucket.
https://flugel.it/infrastructure-as-code/how-to-setup-fluentd-to-retrieve-logs-send-them-to-gcp-pub-sub-to-finally-push-them-to-elasticsearch/
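If you do end up scripting it in Python after all, most of the 2-3 seconds mentioned in the question is the cost of creating a new client (and authenticating) on every invocation. A rough sketch of a forwarder that creates one client, reuses it, and commits a whole file as a single batch could look like the following; the log name and file path are made up for illustration.
# Sketch: reuse one Cloud Logging client and send file lines as a single batch,
# instead of paying the client/authentication cost on every log call.
import google.cloud.logging

client = google.cloud.logging.Client()           # created once, reused for every file
cloud_logger = client.logger("bot-runner-logs")  # hypothetical log name

def forward_file(path):
    batch = cloud_logger.batch()
    with open(path) as f:
        for line in f:
            batch.log_text(line.rstrip("\n"), severity="INFO")
    batch.commit()  # one API call for the whole file

forward_file("C:/bots/run.log")  # hypothetical path on a bot runner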

Profiler doesn't send metrics to CodeGuru

I have a Lambda function that is integrated with CodeGuru, and I want to do some profiling. I added the required layer and environment variables, changed the runtime to Java 8 (Corretto), and also added the required permission to the role used to execute the Lambda.
In CloudWatch I can see logs from the profiler. The last collected log is
INFO: Attempting to report profile data: start=2020-12-15T13:14:14.639Z end=2020-12-15T13:20:05.534Z force=false memoryRefresh=false numberOfTimesSampled=28
I've seen in other examples that I should have more logs with information about success or failure, like this:
PM software.amazon.codeguruprofilerjavaagent.ProfilingCommand submitProfilingData
INFO: Successfully reported profile
but that didn't happen; there is no further information. What could be the reason for this?

Logging jobs on a Google Cloud VM

I am using a Google Cloud virtual machine to run several Python scripts scheduled via cron, and I am looking for some way to check that they ran.
When I look in my logs I see nothing, so I guess simply running a .py file is not logged? Is there a way to turn on logging at this level? What are the usual approaches for such things?
The technology for recording log information in GCP is called Stackdriver. You have a couple of choices for how to log within your application. The first is to instrument your code with the Stackdriver APIs, which explicitly write data to the Stackdriver subsystem. Here are the docs for that and here is a further recipe.
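For that first approach, a minimal sketch of what a cron-driven Python script could do with the client library is shown below; the script name in the messages is just an example.
# Sketch: a cron-driven script that records its own start and finish in Cloud Logging
# through the Python client library (google-cloud-logging installed on the VM).
import logging
import google.cloud.logging

client = google.cloud.logging.Client()
client.setup_logging()  # route the standard logging module to Cloud Logging

logging.info("nightly_job.py started")   # "nightly_job.py" is a made-up script name
# ... the actual work of the script goes here ...
logging.info("nightly_job.py finished")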
A second option is to install the Stackdriver Logging agent on your Compute Engine instance. This will then allow you to tap into other sources of logging output, such as the local syslog.

How can I export a Stackdriver log to a file for local processing?

All I know is that we can fetch logs using the Stackdriver Logging or Monitoring services. But where are these logs fetched from?
If I knew where these logs are fetched from, there would be no need to make API calls or use another service to see my logs. I could simply download them and use my own code to process them.
Is there any way to do this?
Stackdriver Logging has a capability called "exporting". Here is a link to the documentation. At a high level, exporting means that when a new message is written to a log, a copy of that message is exported. The targets of the export (called sinks) can be:
Cloud Storage
BigQuery
Pub/Sub
From your description, if you set up Cloud Storage as a sink, then you will have new files written to your Cloud Storage bucket that you can then retrieve and process.
If you don't wish to use exports of new log entries, you can use either the API or gcloud to read the current logs. Be aware that logs held in GCP (within Stackdriver) expire after a retention period (30 days). See gcloud logging read.
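If reading the existing entries through the API is enough, a small sketch with the Python client library could look like this; the filter string and output file name are only examples.
# Sketch: read recent log entries through the API and dump them to a local file.
import google.cloud.logging

client = google.cloud.logging.Client()

# Example filter; adjust severity/timestamp (or add a resource type) to match your logs.
entries = client.list_entries(
    filter_='severity>=INFO AND timestamp>="2020-01-01T00:00:00Z"'
)

with open("exported_logs.txt", "w") as out:
    for entry in entries:
        out.write(f"{entry.timestamp} {entry.severity} {entry.payload}\n")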

Where can I find request and response logs for Spark?

I have just started using the Spark framework and am experimenting with a local server on macOS.
The documentation says that to enable debug logs I simply need to add a dependency.
I've added a dependency and can observe logs in the console.
The question is: where are the log files located?
If you are following the Spark example here, you are only enabling slf4j-simple logging. By default, this only logs items to the console. You can change this programmatically (Class information here) or by adding a properties file to the classpath, as seen in this discussion. Beyond this you will likely want to implement a logging framework like log4j or logback, as slf4j is designed to act as a facade over an existing logging implementation.