Wrong event time in CloudWatch log events - amazon-web-services

Found the solution after searching, but leaving this here in case somebody runs into a similar confusion. See the resolution at the end.
I'm trying to figure out why the AWS CloudWatch Logs service fails to pick up the right timestamp for my log events. Currently all my events are saved under the time 2017-01-01, no matter what the actual timestamp in the event is.
I'm feeding in the log from syslog, where Docker saves the logged events, and I configured Docker to write the timestamp in this format:
170105/103242 (%y%m%d/%H%M%S)
I configured the awslogs service with this parameter:
datetime_format = %y%m%d/%H%M%S
I restarted the service and hit the server, but when I look at the log entries in CloudWatch, even entries that start with the timestamp 170105/103242 are saved as events belonging to 2017-01-01. That date contains all events between 01-01 and 01-05.
When I look at awslogs.log I can see the following lines:
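To rule out the format string itself, here is a minimal sketch that parses the sample timestamp with Python's `datetime.strptime`, whose directives `datetime_format` follows according to the awslogs documentation. The helper name and the message text after the timestamp are made up for illustration.

```python
from datetime import datetime

DATETIME_FORMAT = "%y%m%d/%H%M%S"  # value from the awslogs config above
SAMPLE_LINE = "170105/103242 some log message"  # message text is invented


def parse_leading_timestamp(line, fmt=DATETIME_FORMAT):
    """Parse the timestamp at the start of a log line (hypothetical helper)."""
    stamp = line.split(" ", 1)[0]
    return datetime.strptime(stamp, fmt)


parsed = parse_leading_timestamp(SAMPLE_LINE)
print(parsed.isoformat())
```

If this parses to the expected date, the format string is fine and the problem is more likely in which configuration the agent actually loads.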
2017-01-05 11:05:28,633 - cwlogs.push - INFO - 29223 - MainThread - Missing or invalid value for use_gzip_http_content_encoding config. Defaulting to using gzip encoding.
2017-01-05 11:05:28,633 - cwlogs.push - INFO - 29223 - MainThread - Using default logging configuration.
This makes me think that the agent isn't actually reading or using datetime_format, but I don't understand why it ends up using the defaults. I tried putting
use_gzip_http_content_encoding = true
under the general settings, but that doesn't change the output.
I am running out of ideas. Has anyone managed to configure awslogs so that datetime_format is actually used correctly?
Edit:
I'm currently hacking more console logs to local python2.7 push.py to see what is going on :)
RESOLVED:
OK, the problem was that I came into this project after the initial setup had been created, and I was under the impression that the logger was configured to use the .conf file at:
/etc/awslogs/awslogs.conf
which was dynamically populated.
The environment had a script that passed this location to awslogs-agent-setup.py, which was meant to make the agent read its configuration from there.
However, this script didn't do what it was supposed to, and when the service started it actually read the config from
/var/awslogs/etc/awslogs.conf
which contained the default values.
So the actual resolution was to change the datetime_format parameter in the default config and forget about the config I thought the service was using.

Add logging to /var/awslogs/lib/python2.7/site-packages/cwlogs/push.py and see how the actual config parameters are interpreted.
You will probably find that the service is actually using the configuration file at the default location:
/var/awslogs/etc/awslogs.conf
so you have to edit the configuration values there for them to take effect.
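A quick way to compare the two files is to print the `datetime_format` each one defines. The sketch below assumes both files are plain INI files that Python's `configparser` can read; the function name is hypothetical and the two paths are the ones mentioned in the question.

```python
import configparser

# Both locations come from the question; which one the agent reads depends on the setup.
CANDIDATE_PATHS = ["/etc/awslogs/awslogs.conf", "/var/awslogs/etc/awslogs.conf"]


def datetime_formats(path):
    """Return {section: datetime_format} for one awslogs-style INI file."""
    parser = configparser.ConfigParser(interpolation=None)  # keep '%' literal
    parser.read(path)  # files that do not exist are silently skipped
    return {
        section: parser.get(section, "datetime_format")
        for section in parser.sections()
        if parser.has_option(section, "datetime_format")
    }


if __name__ == "__main__":
    for path in CANDIDATE_PATHS:
        print(path, datetime_formats(path))
```

If the two files report different values, the one that matches the behaviour you see in CloudWatch is probably the one the agent is reading.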

Related

Attempting to put a CloudWatch Agent onto an On-Premise server. Issues with cwagent-otel-collector

As said in the title, I am attempting to put a CloudWatch Agent (CW agent) on my On-Premise-Server (OPS).
After running this line of code that I got from the AWS User Guide to start the CW agent:
& $Env:ProgramFiles\Amazon\AmazonCloudWatchAgent\amazon-cloudwatch-agent-ctl.ps1 -m ec2 -a start
I got this error:
****** processing cwagent-otel-collector ******
cwagent-otel-collector will not be started as it has not been configured yet.
****** processing amazon-cloudwatch-agent ******
AmazonCloudWatchAgent has been started
I did not know what this meant, so I searched and found that someone else with this issue had not created a config file.
I did create a config file (named config.json by default) using the configuration wizard and I am still having the issue.
I have tried looking into a number of pages on that user guide, but nothing has resolved the issue.
Thank you in advance for any assistance you can provide.
This message is informational, not an error.
The CloudWatch agent is bundled with the AWS OpenTelemetry collector agent. They're actually two agents. The CloudWatch agent and the OTel collector have separate configuration files. If you provide a config for one and not the other, only the configured one will start. This is expected behavior.
Thank you for taking the time to answer. I have since resolved the issue (recently).
Everything from the command I was using to the path where the file resided was incorrect.
Starting over and going through all the steps again with background information helped.
The first installation combined with learning everything for the first time produced the issue.
To anyone having this issue: when you hit a wall like this, I recommend starting over. I know it is not what anyone wants to do, but in the end it saved time.

How can I start Envoy from a dumped config generated by /config_dump?

While debugging Envoy, I tried to start it from a dumped config file, but couldn't figure out how.
I dumped the config using the Envoy admin API endpoint '/config_dump':
curl -X POST http://127.0.0.1:15000/config_dump -o envoy.config
But Envoy won't start from it; there are errors:
envoy --config-path envoy.config
...
[2019-12-22 12:40:50.313][194][critical][main] [external/envoy/source/server/server.cc:98] error initializing configuration 'envoy.config': Protobuf message (type envoy.config.bootstrap.v2.Bootstrap reason INVALID_ARGUMENT:configs: Cannot find field.) has unknown fields
[2019-12-22 12:40:50.313][194][info][main] [external/envoy/source/server/server.cc:607] exiting Protobuf message (type envoy.config.bootstrap.v2.Bootstrap reason INVALID_ARGUMENT:configs: Cannot find field.) has unknown fields
The dumped config is not intended to be used to start up the server. You start a server with a Bootstrap Config, but if you look closely at the output of the /config_dump endpoint, it actually contains 5 or more separate config dumps. My local Envoy (1.12.2) shows config dumps for:
Bootstrap Config
Clusters
Listeners
ScopedRoutes
Routes
Secrets
You can read more about the output structure in the config dump docs, but the short version is that it's a totally different structure from a bootstrap config.
If you do take the output of /config_dump and strip it down to just the bootstrap config field, you can indeed start the server with it.
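Stripping the dump down can be scripted. This is a rough sketch that assumes the dump is JSON with a top-level `configs` list in which one entry carries a `bootstrap` field; the exact layout can differ between Envoy versions, so check your own dump first. The function names and the output file name are hypothetical.

```python
import json


def extract_bootstrap(config_dump):
    """Return the bootstrap config from a parsed /config_dump document."""
    for entry in config_dump.get("configs", []):
        # Assumption: the bootstrap entry is the one with a "bootstrap" field.
        if isinstance(entry, dict) and "bootstrap" in entry:
            return entry["bootstrap"]
    raise ValueError("no bootstrap section found in config dump")


def write_bootstrap(dump_path, out_path):
    """Read a saved config dump and write only its bootstrap part."""
    with open(dump_path) as src:
        bootstrap = extract_bootstrap(json.load(src))
    with open(out_path, "w") as dst:
        json.dump(bootstrap, dst, indent=2)
```

For example, `write_bootstrap("envoy.config", "bootstrap.json")` would produce a file to pass to `envoy --config-path`.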

Vora Manager 1.3 log rotation

Is there any log rotation in Vora 1.3? After 2 months of running Vora 1.3 I realized I'm almost out of disk space on my nodes because /var/log/vora-manager is about 46 GB. So I had to stop it, delete the logs and restart.
But maybe I missed some setting?
Edit 1: The log file is supposed to be stored in /var/log/vora/vora-manager, not the folder I mentioned above, but I still saw a huge log file there. The file /var/log/vora-manager is also mentioned on line 178 of the control.py script that is supposed to start a vora-manager worker.
You are right: the vora-manager log file is not written into the standard /var/log/vora directory; instead it is written to /var/log/vora-manager. This has been corrected in Vora 1.4.
The logs should be rotating based on the vora_manager_log_max_file_size variable which is also set in Ambari.
Something must have gone wrong when Vora tried to rotate the logs. I suggest you search through your log file for the following line and see if it is followed by some kind of error:
vora.vora-manager-master: [c.740b0d26] : Running['sudo', '-i', '-u',
'root', '/usr/sbin/logrotate', '/etc/logrotate.d/vora-manager-master']
You can also change the verbosity of the logger by setting the vora_manager_log_level config variable in Ambari from INFO to WARNING. Beware that this will hide the log rotation messages.
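For a log file of that size, searching by hand is tedious. The sketch below streams the file and prints each logrotate invocation together with the few lines after it, so any error that follows is easy to spot. The function name and the default of five following lines are arbitrary choices, not anything Vora defines.

```python
def lines_following(lines, needle, count=5):
    """Yield each line containing `needle` plus the `count` lines after it."""
    remaining = 0
    for line in lines:
        if needle in line:
            remaining = count + 1  # the match itself plus `count` followers
        if remaining > 0:
            yield line
            remaining -= 1


# Example usage (path taken from the question):
# with open("/var/log/vora-manager", errors="replace") as log:
#     for line in lines_following(log, "/usr/sbin/logrotate"):
#         print(line, end="")
```

Because it iterates line by line, it never loads the whole file into memory.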

Cloudwatch logs - No Event Data after time elapses

I've looked on the AWS forums and elsewhere but haven't found a solution. I have a lambda function that, when invoked, creates a log stream which populates with log events. After about 12 hours or so, the log stream is still present, but when I open it, I see the following:
The link explains how to start sending event data, but I already have this set up and I am sending event data; it just disappears after a certain time period.
I'm guessing there is some setting somewhere (either for max storage allowed or for whether logs get purged), but if there is, I haven't found it.
Another reason for missing data in the log stream might be a corrupted agent-state file. First check your logs:
vim /var/log/awslogs.log
If you find something like "Caught exception: An error occurred (InvalidSequenceTokenException) when calling the PutLogEvents operation: The given sequenceToken is invalid. The next expected sequenceToken is:" you can regenerate the agent-state file as follows:
sudo service awslogs stop
sudo rm /var/lib/awslogs/agent-state
sudo service awslogs start
TL;DR: Just use the CLI. See Update 2 below.
This is really bizarre but I can replicate it...
I un-checked the "Expire Events After" box, and lo and behold I was able to open older log streams. What seems REALLY odd is that if I choose to display the "Stored Bytes" data, many of the files are listed at 0 bytes even though they have log events:
Update 1:
This solution no longer works, as I can only view the log events in the first two log streams. What's more, the Stored Bytes column now displays different (and more accurate) data:
This leads me to believe that AWS made some kind of update.
UPDATE 2:
Just use the CLI. I've verified that I can retrieve log events from the CLI that I cannot retrieve via the web console.
First install the CLI (if you haven't already) and use the following command:
aws logs get-log-events --log-group-name NAME-OF-LOGGROUP --log-stream-name LOG-STREAM-NAME
Be sure to escape special characters such as /, [, $ etc. in the names.
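The same call is available from Python through boto3's CloudWatch Logs client, which helps when a stream has more events than one response returns. This is a sketch only: the function name is hypothetical, the client is passed in so nothing here depends on credentials, and it relies on `get_log_events` returning the same `nextForwardToken` once the end of the stream is reached.

```python
def fetch_all_events(logs_client, group, stream):
    """Collect every event in one log stream by following nextForwardToken."""
    kwargs = {"logGroupName": group, "logStreamName": stream, "startFromHead": True}
    events = []
    while True:
        page = logs_client.get_log_events(**kwargs)
        events.extend(page["events"])
        token = page.get("nextForwardToken")
        if token is None or token == kwargs.get("nextToken"):
            return events  # the same token twice means the end of the stream
        kwargs["nextToken"] = token


# Example usage (requires boto3 and AWS credentials):
# import boto3
# client = boto3.client("logs")
# for event in fetch_all_events(client, "NAME-OF-LOGGROUP", "LOG-STREAM-NAME"):
#     print(event["timestamp"], event["message"])
```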

Python - logging requests summary stats using locust

I'm using locust
http://docs.locust.io/en/latest/index.html
to simulate a bunch of web users doing random site visits and file downloads. The logging option is set by specifying
locust ... --logfile </path/to/log/file>...
But this only logs a subset of internal events and print statements in the code. It does not log the request stats, which are printed to the console (if you use the --no-web option) or shown in the UI (if you don't).
How can you capture the request stats in the log file?
Try setting the log level. From what I just read in the source, it defaults to INFO.
In your case I would run
locust ... --logfile </path/to/log/file> --loglevel DEBUG
Information from source:
help="Choose between DEBUG/INFO/WARNING/ERROR/CRITICAL. Default is INFO."
The stats you see on the console are a result of logging through the console_logger. See https://github.com/locustio/locust/blob/master/locust/log.py#L50
You can add your custom FileHandler to the console_logger and get those stats in a file.
import logging

# Attach a file handler to the logger that Locust uses for the console stats
console_logger = logging.getLogger("console_logger")
fh = logging.FileHandler(filename="stats.log")
fh.setFormatter(logging.Formatter('%(message)s'))
console_logger.addHandler(fh)