SDK Gcloud logging timestamp filter - google-cloud-platform

I am having a problem trying to filter my logs down to a specific period of time. Everything besides timestamps works fine when I write a command. At the moment this works:
gcloud logging read "resource.type=XXX logName=projects/YYY/logs/appengine.googleapis.com%2Fnginx.health_check" > test.txt
Other flags like --limit or --freshness work without problems, but the moment I try to restrict the output to a period of time the command stops working. I get this message:
The file name, directory name, or volume label syntax is incorrect.
I've tried many things, and this is the command that at least gives me an error:
gcloud logging read "resource.type=XXX logName=projects/YYY/logs/appengine.googleapis.com%2Fnginx.health_check timestamp='2020-01-22T14:02:41.41Z'"
Please help me with the correct syntax for specifying timestamps so that I can get a period of time as a result.

I got it! Escaping the comparison operators with ^ and doubling the inner quotes for the Windows command line did the trick:
gcloud logging read "resource.type=XXX logName=projects/YYY/logs/appengine.googleapis.com%2Fnginx.health_check timestamp^>=""2020-01-21T11:00:00Z"" timestamp^<=""2020-01-22T11:00:00Z""" >t.txt
I found it here: Find logs between timestamps using stackdriver CLI
Thank you Braulio Baron for your help!

The syntax error is in the timestamp: the quotes around the date need to be escaped with \:
gcloud logging read "resource.type=gae_app AND logName=projects/xxxx/logs/syslog AND timestamp=\"2020-01-22T14:02:41.41Z\""
For more detail, have a look at gcloud logging read.
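For example, a range query from a bash-style shell would look roughly like this (the resource type and --limit value are just placeholders; the escaping follows the answer above):
gcloud logging read "resource.type=gae_app AND timestamp>=\"2020-01-21T11:00:00Z\" AND timestamp<=\"2020-01-22T11:00:00Z\"" --limit=50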

Related

gcloud logging read not returning anything, no errors either

I'm trying to retrieve all the logs within a specific time window; here is my command:
gcloud logging read 'timestamp<="2022-08-15T12:50:00Z" AND timestamp>="2022-08-15T13:20:00Z"' --bucket=pre-prod --project=non-prod --location=global --view=_AllLogs --freshness=3d
I've specified the freshness, but it just returns nothing, no errors either.
Does anyone know why this is not working? I can see the logs in the GCP console for the dates above, but logging read is just not working.
I had the timestamps the wrong way around and needed to swap the greater-than and less-than symbols.
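In other words, the corrected command should look roughly like this, with >= on the earlier timestamp and <= on the later one:
gcloud logging read 'timestamp>="2022-08-15T12:50:00Z" AND timestamp<="2022-08-15T13:20:00Z"' --bucket=pre-prod --project=non-prod --location=global --view=_AllLogs --freshness=3d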

Log Buckets from Google

Is it possible to download a Log Storage (log bucket) from Google Cloud Platform, specifically the one created by default? If someone knows how, could they explain how to do it?
One possible solution is to select the required logs, fetch them for a time period of one day, and download them in JSON or CSV format.
Step 1 - From the Logging console, go to advanced filtering mode.
Step 2 - To choose the log type, use a filtering query, for example:
resource.type="audited_resource"
logName="projects/xxxxxxxx/logs/cloudaudit.googleapis.com%2Fdata_access"
resource.type="audited_resource"
logName="organizations/xxxxxxxx/logs/cloudaudit.googleapis.com%2Fpolicy"
Step 3 - Download them in JSON or CSV format.
If you generate a huge number of audit logs per day, the above will not work out. In that case you need to export the logs to Cloud Storage or BigQuery for further analysis. Please note that Cloud Logging doesn't charge to export logs, but destination charges might apply.
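As a rough sketch of that export route, a log sink to a Cloud Storage bucket can be created with gcloud; the sink name, bucket, and filter below are placeholders:
gcloud logging sinks create my-audit-sink storage.googleapis.com/my-audit-log-bucket --log-filter='logName="projects/xxxxxxxx/logs/cloudaudit.googleapis.com%2Fdata_access"'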
Another option is to use the following gcloud command to download the logs:
gcloud logging read "logName : projects/Your_Project/logs/cloudaudit.googleapis.com%2Factivity" --project=Project_ID --freshness=1d >> test.txt
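If JSON output is preferred, gcloud's global --format=json flag can be added, for example:
gcloud logging read "logName : projects/Your_Project/logs/cloudaudit.googleapis.com%2Factivity" --project=Project_ID --freshness=1d --format=json > audit.json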

Unable to show Time to Response (TTR) values on GCP Dashboard

I need your help: I need to create a dashboard in GCP that shows TTR time and TTR response by fetching the logs from GCP Logging that I am writing with a script, but I am unable to achieve it.
Below is the command I am using:
gcloud logging write logging/user/TTR4 '{"Gremblin_correlation_exec_id": "correlation_id","SenerioName": "Senerio1","ServiceName": "Service1","SubsystemName": "subsystem1","TTRTime": 500,"EndTimestamp": "2020-11-30 06:06:56+00:00","Node_ipfirst": "10.128.0.55:80","node_ipsecound": "10.128.0.6:80","starttimestamp": "2020-11-30 05:58:08+00:00" }' --payload-type=json
I am getting the JSON data, but I am not able to show it on a dashboard, for example TTRTime above 500, filtered by ServiceName and SubsystemName.
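For reference, a sketch of the kind of Logging filter that should match those entries; the project ID is a placeholder and the URL-encoded log name is an assumption about how the log name above ends up stored (with --payload-type=json the fields appear under jsonPayload):
gcloud logging read 'logName="projects/YOUR_PROJECT/logs/logging%2Fuser%2FTTR4" AND jsonPayload.TTRTime>500 AND jsonPayload.ServiceName="Service1" AND jsonPayload.SubsystemName="subsystem1"' --freshness=1d
A filter along these lines could also back a log-based metric, which is one way to get such values onto a dashboard.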

Where are the EMR logs that are placed in S3 located on the EC2 instance running the script?

The question: Imagine I run a very simple Python script on EMR - assert 1 == 2. This script will fail with an AssertionError. The log that contains the traceback with that AssertionError will be placed (if logs are enabled) in an S3 bucket that I specified on setup, and I can then read it once those logs get dropped into S3. However, where do those logs exist before they get dropped into S3?
I presume they would exist on the EC2 instance that the particular script ran on. Let's say I'm already connected to that EC2 instance and the EMR step that the script ran on had the ID s-EXAMPLE. If I do:
[n1c9#mycomputer cwd]# gzip -d /mnt/var/log/hadoop/steps/s-EXAMPLE/stderr.gz
[n1c9#mycomputer cwd]# cat /mnt/var/log/hadoop/steps/s-EXAMPLE/stderr
Then I'll get output with the typical 20/01/22 17:32:50 INFO Client: Application report for application_1 (state: ACCEPTED) lines that you can see in the stderr log file accessible on EMR.
So my question is: Where is the log (stdout) to see the actual AssertionError that was raised? It gets placed in my S3 bucket indicated for logging about 5-7 minutes after the script fails/completes, so where does it exist in EC2 before that? I ask because getting to these error logs before they are placed on S3 would save me a lot of time - basically 5 minutes each time I write a script that fails, which is more often than I'd like to admit!
What I've tried so far: I've tried checking the stdout on the EC2 machine at the paths in the code sample above, but the stdout file is always empty.
What I'm struggling to understand is how that stdout file can be empty if there's an AssertionError traceback available on S3 minutes later (am I misunderstanding how this process works?). I also tried looking in some of the temp folders that PySpark builds, but had no luck with those either. Additionally, I've printed the outputs of the consoles for the EC2 instances running on EMR, both core and master, but none of them seem to have the relevant information I'm after.
I also looked through some of the EMR methods for boto3 and tried the describe_step method documented here: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/emr.html#EMR.Client.describe_step - which, for failed steps, has a FailureDetails JSON dict in the response. Unfortunately, this only includes a LogFile key which links to the stderr.gz file on S3 (even if that file doesn't exist yet) and a Message key which contains a generic Exception in thread.. message, not the stdout. Am I misunderstanding something about the existence of those logs?
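For what it's worth, the equivalent check from the AWS CLI looks roughly like this (cluster and step IDs are placeholders) and returns the same FailureDetails:
aws emr describe-step --cluster-id j-EXAMPLE --step-id s-EXAMPLE --query 'Step.Status.FailureDetails'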
Please feel free to let me know if you need any more information!
It is quite normal with log-collecting agents that the actual log files don't grow; the agents just intercept stdout to do what they need.
Most probably, when you configure S3 for the logs, the agent is set up either to read and delete your actual log file, or to symlink the log file somewhere else, so that the file is never actually written when a process opens it for writing.
Maybe try checking if there is a symlink there:
find -L / -samefile /mnt/var/log/hadoop/steps/s-EXAMPLE/stderr
But it could be something other than a symlink that achieves the same logic, and I didn't find anything in the AWS docs, so most probably it is not intended that you have both S3 and local files at the same time, and you may not find it.
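A generic diagnostic (not EMR-specific, and assuming lsof is installed on the instance) is to check whether some agent process is holding the file open:
lsof /mnt/var/log/hadoop/steps/s-EXAMPLE/stderr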
If you want to be able to check your logs more frequently, you may want to think about installing a third-party log collector (Logstash, Beats, rsyslog, Fluentd) and shipping the logs to SolarWinds Loggly or logz.io, or setting up an ELK stack (Elasticsearch, Logstash, Kibana).
You can check this article from Loggly, or create a free account on logz.io and look at the many free shippers they support.

DataFlow gcloud CLI - "Template metadata was too large"

I've honed my transformations in DataPrep, and am now trying to run the DataFlow job directly using gcloud CLI.
I've exported my template and template metadata file, and am trying to run them using gcloud dataflow jobs run and passing in the input & output locations as parameters.
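Roughly like this, with the job name, bucket paths, and parameter names as placeholders (the real parameter names come from the exported metadata file):
gcloud dataflow jobs run my-dataprep-job --gcs-location=gs://my-bucket/templates/my-template --region=us-central1 --parameters=input=gs://my-bucket/input,output=gs://my-bucket/output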
I'm getting the error:
Template metadata regex '[ \t\n\x0B\f\r]*\{[ \t\n\x0B\f\r]*((.|\r|\n)*".*"[ \t\n\x0B\f\r]*:[ \t\n\x0B\f\r]*".*"(.|\r|\n)*){17}[ \t\n\x0B\f\r]*\}[ \t\n\x0B\f\r]*' was too large. Max size is 1000 but was 1187.
I've not specified this at the command line, so I know it's getting it from the metadata file - which is straight from DataPrep, unedited by me.
I have 17 input locations - one containing source data, all the others are lookups. There is a regex for each one, plus one extra.
If it runs when triggered from DataPrep but won't run via the CLI, am I missing something?
I'd suspect the root cause is a limitation in gcloud that is not present in the Dataflow API or Dataprep. The best thing to do in this case is to open a new Cloud Dataflow issue in the public tracker and provide details there.