I have two log files with multi-line log statements. Both of them have the same datetime format at the beginning of each log statement. The configuration looks like this:
state_file = /var/lib/awslogs/agent-state
[/opt/logdir/log1.0]
datetime_format = %Y-%m-%d %H:%M:%S
file = /opt/logdir/log1.0
log_stream_name = /opt/logdir/logs/log1.0
initial_position = start_of_file
multi_line_start_pattern = {datetime_format}
log_group_name = my.log.group
[/opt/logdir/log2-console.log]
datetime_format = %Y-%m-%d %H:%M:%S
file = /opt/logdir/log2-console.log
log_stream_name = /opt/logdir/log2-console.log
initial_position = start_of_file
multi_line_start_pattern = {datetime_format}
log_group_name = my.log.group
The CloudWatch Logs agent is sending the log1.0 logs correctly to my log group on CloudWatch; however, it's not sending the logs for log2-console.log.
awslogs.log says:
2016-11-15 08:11:41,308 - cwlogs.push.batch - WARNING - 3593 - Thread-4 - Skip event: {'timestamp': 1479196444000, 'start_position': 42330916L, 'end_position': 42331504L}, reason: timestamp is more than 2 hours in future.
2016-11-15 08:11:41,308 - cwlogs.push.batch - WARNING - 3593 - Thread-4 - Skip event: {'timestamp': 1479196451000, 'start_position': 42331504L, 'end_position': 42332092L}, reason: timestamp is more than 2 hours in future.
The server time is correct, though. Another weird thing is that the line numbers mentioned in start_position and end_position don't exist in the actual log file being pushed.
Anyone else experiencing this issue?
I was able to fix this.
The state of awslogs was broken. The state is stored in a sqlite database in /var/awslogs/state/agent-state. You can access it via
sudo sqlite3 /var/awslogs/state/agent-state
sudo is needed to have write access.
List all streams with
select * from stream_state;
Look up your log stream and note the source_id which is part of a json data structure in the v column.
Then, list all records with this source_id (in my case it was 7675f84405fcb8fe5b6bb14eaa0c4bfd) in the push_state table
select * from push_state where k="7675f84405fcb8fe5b6bb14eaa0c4bfd";
The resulting record has a JSON data structure in the v column which contains a batch_timestamp, and this batch_timestamp seems to be wrong. It was in the past, and any newer log entries (more than 2 hours ahead of it) were not processed anymore.
The solution is to update this record. Copy the v column, replace the batch_timestamp with the current timestamp, and update it with something like
update push_state set v='... insert new value here ...' where k='7675f84405fcb8fe5b6bb14eaa0c4bfd';
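If you prefer to script the same fix, here is a minimal Python sketch (assuming, as described above, that batch_timestamp is stored as epoch milliseconds inside the JSON in the v column; the source_id is just the example value from my case):

import json
import sqlite3
import time

DB_PATH = "/var/awslogs/state/agent-state"
SOURCE_ID = "7675f84405fcb8fe5b6bb14eaa0c4bfd"  # replace with your own source_id

conn = sqlite3.connect(DB_PATH)
(v,) = conn.execute("SELECT v FROM push_state WHERE k = ?", (SOURCE_ID,)).fetchone()
state = json.loads(v)
# Reset the stale batch_timestamp to "now", in epoch milliseconds
# (the same unit as the event timestamps in the warnings above).
state["batch_timestamp"] = int(time.time() * 1000)
conn.execute("UPDATE push_state SET v = ? WHERE k = ?", (json.dumps(state), SOURCE_ID))
conn.commit()
conn.close()

Run it with sudo so it can write to the database.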
Restart the service with
sudo /etc/init.d/awslogs restart
I hope it works for you!
We had the same issue, and the following steps fixed it.
If log groups are not updating with the latest events, run these steps:
1. Stop the awslogs service.
2. Delete the file /var/awslogs/state/agent-state.
3. Update the /var/awslogs/etc/awslogs.conf configuration from hostname to instance ID, e.g.:
log_stream_name = {hostname} to log_stream_name = {instance_id}
4. Start the awslogs service.
I was able to resolve this issue on Amazon Linux by:
sudo yum reinstall awslogs
sudo service awslogs restart
This method retained my config files in /var/awslogs/, though you may wish to back them up before a reinstall.
Note: In my troubleshooting, I had also deleted my Log Group via the AWS Console. The restart fully reloaded all historical logs, but at the present timestamp, which is of less value. I'm unsure whether deleting the Log Group was necessary for this method to work. You might want to look at setting the initial_position config to end_of_file before you restart.
I found the reason: the time zone in my Docker container was inconsistent with the time zone of my host machine. After setting the two time zones to be consistent, the problem was solved.
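A quick way to verify this is to print the UTC time both on the host and inside the container; if they disagree by more than the agent's two-hour window, events get skipped. A minimal Python check:

# Run this both on the host and inside the container;
# the two outputs should agree to within a few seconds.
from datetime import datetime, timezone

print(datetime.now(timezone.utc).isoformat())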
I have
datetime_format = "%Y-%m-%dT%H:%M:%S.%f%z"
in /etc/awslogs/awslogs.conf
And I have a log like this:
{
"level": "info",
"ts": "2023-01-08T21:46:03.381067Z",
"caller": "bot/bot.go:172",
"msg": "Creating test subscription declined",
"user_id": "0394c017-2a94-416c-940c-31b1aadb12ee"
}
However, the timestamp is not parsed, and I see this warning in the logs:
2023-01-08 21:46:03,423 - cwlogs.push.reader - WARNING - 9500 - Thread-4 - Fall back to previous event time: {'timestamp': 1673211877689, 'start_position': 6469L, 'end_position': 6640L}, previousEventTime: 1673211877689, reason: timestamp could not be parsed from message.
Update: I tried removing the level field:
{
"ts": "2023-01-08T23:15:00.518545Z",
"caller": "bot/bot.go:172",
"msg": "Creating test subscription declined",
"user_id": "0394c017-2a94-416c-940c-31b1aadb12ee"
}
and it still doesn't work.
There are 2 different formats of CloudWatch Logs agent configuration:
https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/AgentReference.html. This one is deprecated, as mentioned in the alert section of the page.
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-Agent-Configuration-File-Details.html. This is the configuration for the new unified CloudWatch agent, and it doesn't have a datetime_format parameter to configure; instead it has timestamp_format.
Since you have mentioned datetime_format, I'm assuming you are using the old agent. In that case, %z refers to a UTC offset in the form +HHMM or -HHMM (+0000, -0400, +1030), as per the linked documentation [1 above]. Your timestamp doesn't have an offset, hence your format should be %Y-%m-%dT%H:%M:%S.%fZ, where the Z, like the T, just represents a literal character. Also, specify the time_zone as UTC.
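As a quick sanity check of that format (plain Python 3 locally, not the agent itself), the suggested string parses the sample ts value:

from datetime import datetime

# The trailing Z is matched as a literal character here, not as a %z offset.
fmt = "%Y-%m-%dT%H:%M:%S.%fZ"
print(datetime.strptime("2023-01-08T21:46:03.381067Z", fmt))
# -> 2023-01-08 21:46:03.381067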
I am trying to use AWS CloudWatch to maintain the application logs on an Ubuntu EC2 instance. I have installed the awslogs agent using the following command, as suggested in their documentation, to monitor the file application.log and push any new entries in the file to CloudWatch.
Setup command - sudo python3 ./awslogs-agent-setup.py --region ap-south-1
It was working fine for a day when I tested it out after setting it up, but it stopped working the next day. I can see that changes in the log file are being detected by the agent, as there is an entry in the awslogs.log file as soon as there is a new entry in application.log. However, the same updates are not being pushed/reflected in the CloudWatch console.
What might have gone wrong here?
Entry in /var/log/awslogs.log
2020-02-27 12:19:03,376 - cwlogs.push.reader - WARNING - 1388 - Thread-4 - Fall back to previous event time: {'end_position': 10483213, 'timestamp': 1582261391000, 'start_position': 10483151}, previousEventTime: 1582261391000, reason: timestamp could not be parsed from message.
2020-02-27 12:19:07,437 - cwlogs.push.publisher - INFO - 1388 - Thread-3 - Log group: branchpayout-python-pilot, log stream: ip-172-27-99-136_application.log, queue size: 0, Publish batch: {'fallback_events_count': 2, 'source_id': 'c0bd7124acf1c35ede963da6b8ec9882', 'num_of_events': 2, 'first_event': {'end_position': 10483151, 'timestamp': 1582261391000, 'start_position': 10482278}, 'skipped_events_count': 0, 'batch_size_in_bytes': 985, 'last_event': {'end_position': 10483213, 'timestamp': 1582261391000, 'start_position': 10483151}}
Configuration in /var/awslogs/etc/awslogs.conf
[/home/ubuntu/application-name/application.log]
file = /home/ubuntu/application-name/application.log
datetime_format = %Y-%m-%d %H:%M:%S,%f
log_stream_name = {hostname}_application.log
buffer_duration = 5000
log_group_name = branchpayout-python-pilot
initial_position = end_of_file
multi_line_start_pattern = {datetime_format}
Check your log format and update your awslogs.conf accordingly.
For me, the nginx access log format in access.log was "%d/%b/%Y:%H:%M:%S %z", hence my config file contains:
datetime_format = %d/%b/%Y:%H:%M:%S %z
Below are some examples:
Nginx error.log        2017/08/12 05:04:00           %Y/%m/%d %H:%M:%S
Nginx access.log       12/Aug/2017:06:19:17 +0900    %d/%b/%Y:%H:%M:%S %z
php-fpm error.log      12-Aug-2017 05:24:38          %d-%b-%Y %H:%M:%S
php-fpm www-error.log  10-Aug-2017 23:40:46 UTC      %d-%b-%Y %H:%M:%S
messages               Aug 12 06:13:36               %b %d %H:%M:%S
secure                 Aug 11 04:03:33               %b %d %H:%M:%S
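If you want to sanity-check a format before touching the agent config, each sample above can be parsed locally with Python 3's strptime, since the old agent's datetime_format uses the same directives (note that %z needs Python 3):

from datetime import datetime

# (sample timestamp, datetime_format) pairs from the table above
samples = [
    ("2017/08/12 05:04:00",        "%Y/%m/%d %H:%M:%S"),     # nginx error.log
    ("12/Aug/2017:06:19:17 +0900", "%d/%b/%Y:%H:%M:%S %z"),  # nginx access.log
    ("12-Aug-2017 05:24:38",       "%d-%b-%Y %H:%M:%S"),     # php-fpm error.log
    ("Aug 12 06:13:36",            "%b %d %H:%M:%S"),        # messages / secure
]
for value, fmt in samples:
    print(datetime.strptime(value, fmt))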
I have been battling this for hours and it's driving me nuts. I installed the log agent and set it up correctly.
I can access the instance via this command: eb ssh
However, when I run the command sudo service awslogs restart, I get weird errors like:
2017-06-12 16:31:41,899 - cwlogs.push.publisher - WARNING - 31909 - Thread-7 - Caught exception: An error occurred (UnrecognizedClientException) when calling the PutLogEvents operation: The security token included in the request is invalid.
2017-06-12 16:31:41,899 - cwlogs.threads - ERROR - 31909 - Thread-7 - Exception caught in <EventBatchPublisher(Thread-7, started daemon 140242458298112)>
Traceback (most recent call last):
I have changed the credentials multiple times, all to no avail.
Also, I get this error in the awslogs.log file:
2017-06-12 16:31:40,862 - cwlogs.push.reader - WARNING - 31909 - Thread-8 - Fall back to previous event time: {'timestamp': 1497246644000, 'start_position': 7142L, 'end_position': 7246L}, previousEventTime: 1497246644000, reason: timestamp could not be parsed from message.
I am using the following format:
[/var/log/tomcat8/catalina.out]
datetime_format = %d-%b-%Y %H:%M:%S
file = /var/log/tomcat8/catalina.out
buffer_duration = 5000
log_stream_name = {instance_id}
initial_position = start_of_file
log_group_name = Catalina
Any help at this point will be appreciated.
Kindly prepend "sudo" to the "aws configure" command, i.e. run sudo aws configure. The awslogs service runs as root, so credentials configured without sudo end up under your login user's home directory, where the agent may not look for them.
I configured the AWS CloudWatch Logs agent on my Linux instance. In the config file, I set it to keep track of 3 log files:
[general]
state_file = /var/lib/awslogs/agent-state
[plugins]
cwlogs = cwlogs
[default]
region = us-west-1
[/var/log/cron]
file = /var/log/cron
log_group_name = /var/log/cron
log_stream_name = {instance_id}
datetime_format = %b %d %H:%M:%S
[/var/log/messages]
file = /var/log/messages
log_group_name = /var/log/messages
log_stream_name = {instance_id}
datetime_format = %b %d %H:%M:%S
[/var/log/test.log]
file = /var/log/test.log
log_group_name = /var/log/test.log
log_stream_name = {instance_id}
datetime_format = %b %d %H:%M:%S
However, in my console I'm only seeing logs showing up from messages. The permissions for the 3 files I'm trying to keep track of are -rw-------.
Does anybody know why this might be happening? I'm echoing test logs into each individual file and only the ones inserted into messages are showing up.
EDIT: Here is my awslogs.log:
2016-08-25 17:58:31,227 - cwlogs.push - INFO - 631 - MainThread - Missing or invalid value for use_gzip_http_content_encoding config. Defaulting to using gzip encoding.
2016-08-25 17:58:31,228 - cwlogs.push - INFO - 631 - MainThread - Using default logging configuration.
2016-08-25 17:58:31,234 - cwlogs.push.stream - INFO - 631 - Thread-1 - Starting publisher for [d4a8beb9b6b4535cac41dc75f252df59, /var/log/messages]
2016-08-25 17:58:31,234 - cwlogs.push.stream - INFO - 631 - Thread-1 - Starting reader for [d4a8beb9b6b4535cac41dc75f252df59, /var/log/messages]
2016-08-25 17:58:31,235 - cwlogs.push.reader - INFO - 631 - Thread-4 - Replay events end at 52578.
2016-08-25 17:58:31,235 - cwlogs.push.reader - INFO - 631 - Thread-4 - Start reading file from 52284.
2016-08-25 17:58:32,308 - cwlogs.push.publisher - WARNING - 631 - Thread-2 - Caught exception: An error occurred (DataAlreadyAcceptedException) when calling the PutLogEvents operation: The given batch of log events has already been accepted. The next batch can be sent with sequenceToken: 49561203985967314162297491311273568778757530964511949634
It's possible your agent state file is corrupted because you kept making changes to the configuration. There are two ways to fix this:
Option 1: Use a new name for your configuration block header.
That is, change [/var/log/cron] to [/something/else].
Option 2: Delete the agent state file after stopping the service.
sudo service awslogs stop
sudo rm /var/lib/awslogs/agent-state
sudo service awslogs start
Please note that Option 2 may initially cause duplicate logs to be pushed to CloudWatch as a new state file is created.
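If you want to confirm that the state file is the problem before deleting it, you can take a read-only peek at what the agent thinks it is tracking. A minimal Python sketch, assuming the SQLite layout described in an earlier answer (a stream_state table inside the agent-state database):

import sqlite3

# Open the agent's state DB read-only so nothing is modified.
conn = sqlite3.connect("file:/var/lib/awslogs/agent-state?mode=ro", uri=True)
for row in conn.execute("SELECT * FROM stream_state"):
    print(row)
conn.close()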
We have recently set up the AWS logs agent on one of our test servers. Our log files usually contain multi-line events. For example, one of our log events is:
[10-Jun-2016 07:30:16 UTC] SQS Post Response: Array
(
[Status] => 200
[ResponseBody] => <?xml version="1.0"?><SendMessageResponse xmlns="http://queue.amazonaws.com/doc/2009-02-01/"><SendMessageResult><MessageId>053c7sdf5-1e23-wa9d-99d8-2a0cf9eewe7a</MessageId><MD5OfMessageBody>8e542d2c2a1325a85eeb9sdfwersd58f</MD5OfMessageBody></SendMessageResult><ResponseMetadata><RequestId>4esdfr30-c39b-526b-bds2-14e4gju18af</RequestId></ResponseMetadata></SendMessageResponse>
)
The log agent reference documentation says to use 'multi_line_start_pattern' option for such logs. Our AWS Log agent config is as follows:
[httpd_info.log]
file = /var/log/httpd/info.log*
log_stream_name = info.log
initial_position = start_of_file
log_group_name = test.server.name
multi_line_start_pattern = '(\[)+\d{2}-[a-zA-Z]{3}+-\d{4}'
However, the log agent's reporting breaks on the aforementioned and similar events. They are reported to CloudWatch Logs as follows:
Event 1:
[10-Jun-2016 11:21:26 UTC] SQS Post Response: Array
Event 2:
( [Status] => 200 [ResponseBody] => <?xml version="1.0"?><SendMessageResponse xmlns="http://queue.amazonaws.com/doc/2009-02-01/"><SendMessageResult><MessageId>053c7sdf5-1e23-wa9d-99d8-2a0cf9eewe7a</MessageId><MD5OfMessageBody>8e542d2c2a1325a85eeb9sdfwersd58f</MD5OfMessageBody></SendMessageResult><ResponseMetadata><RequestId>4esdfr30-c39b-526b-bds2-14e4gju18af</RequestId></ResponseMetadata></SendMessageResponse>
Event 3:
)
This happens despite the fact that it's only a single event. Any clue what's going on here?
I think all you need to do is add the following to your awslogs.conf:
datetime_format = %d-%b-%Y %H:%M:%S UTC
time_zone = UTC
multi_line_start_pattern = {datetime_format}
http://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/AgentReference.html
multi_line_start_pattern
Specifies the pattern for identifying the start of a log message. A log message is made of a line that matches the pattern and any following lines that don't match the pattern. The valid values are a regular expression or {datetime_format}. When using {datetime_format}, the datetime_format option should be specified. The default value is '^[^\s]', so any line that begins with a non-whitespace character closes the previous log message and starts a new one.
If that datetime format doesn't work, you will need to update your regex to actually match your specific datetime. I don't think the one you listed above actually works for your given format.
You could try this for instance:
\[\d{2}-\w{3}-\d{4}\s\d{2}:\d{2}:\d{2}\s\w+\]
does match
[10-Jun-2016 11:21:26 UTC]
See here: http://www.regexpal.com/?fam=96811
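To double-check locally, here is a quick Python sketch (the old agent is Python-based, so Python's re module is a reasonable proxy; that it tests the pattern at the start of each line is an assumption):

import re

# The suggested pattern, with the brackets escaped so they match literally.
pattern = re.compile(r"\[\d{2}-\w{3}-\d{4}\s\d{2}:\d{2}:\d{2}\s\w+\]")

lines = [
    "[10-Jun-2016 11:21:26 UTC] SQS Post Response: Array",
    "(",
    "    [Status] => 200",
    ")",
]
for line in lines:
    # Only the first line matches, so the rest should fold into the same event.
    print(bool(pattern.match(line)), repr(line))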
Once that's done, restart the service and check whether it's parsing correctly.
$ sudo service awslogs restart