Difference between "start_of_file" and "end_of_file" in AWS CloudWatch agent configuration - amazon-web-services

I am trying to set up the AWS CloudWatch Logs agent on one of the nodes in our cluster and am unable to find the difference between start_of_file and end_of_file for the initial_position configuration.
I created a log file test1234.log and provided the below log configuration in the awslogs.conf file [/var/awslogs/etc/awslogs.conf]:
[test1234_log]
datetime_format = %Y-%m-%d %H:%M:%S
file = /var/xxx/log/test1234.log
buffer_duration = 5000
log_stream_name = test1234_log_stream
initial_position = start_of_file
log_group_name = xxx-test
After providing this information I started the agent and found that the log stream test_1234 was created, but when I changed it to end_of_file I found that the log stream was not getting created.
I am unable to find the difference between start_of_file and end_of_file, and in which scenarios to use which. Kindly help.

That setting lets you specify whether to consume the log file from the beginning or to start from the end. It only applies the very first time you start the agent: once started, the agent saves its own pointer into the file and continues from that location if/when it is restarted.
You may want to choose "end_of_file" if you don't care about any old data at the time you install the agent for the very first time. If you'd like to upload all the data already accumulated in the file, then choose "start_of_file". The only downside of "start_of_file" is that the agent might take a while to upload the whole file and catch up to the tail.
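To make the difference concrete, here is a minimal Python sketch (an illustration only, not the agent's actual code) of what happens on the very first read of a file; the path is the one from the question, and ship_to_cloudwatch is a hypothetical stand-in for the agent's upload step:
import io

def ship_to_cloudwatch(line):
    # Hypothetical placeholder for the agent's upload; just print for illustration.
    print("would upload:", line.rstrip())

def tail_from(path, initial_position):
    with open(path) as f:
        if initial_position == "end_of_file":
            f.seek(0, io.SEEK_END)  # skip the existing backlog; only lines written after this point are consumed
        # with "start_of_file" we stay at offset 0 and upload everything already in the file
        for line in f:
            ship_to_cloudwatch(line)

tail_from("/var/xxx/log/test1234.log", "start_of_file")
With end_of_file and a file that receives no new writes, nothing is uploaded, which would also explain why no log stream appeared in your end_of_file test until fresh lines arrive.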

Related

Where are the EMR logs that are placed in S3 located on the EC2 instance running the script?

The question: Imagine I run a very simple Python script on EMR - assert 1 == 2. This script will fail with an AssertionError. The log that contains the traceback containing that AssertionError will be placed (if logs are enabled) in an S3 bucket that I specified on setup, and then I can read the log containing the AssertionError when those logs get dropped into S3. However, where do those logs exist before they get dropped into S3?
I presume they would exist on the EC2 instance that the particular script ran on. Let's say I'm already connected to that EC2 instance and the EMR step that the script ran on had the ID s-EXAMPLE. If I do:
[n1c9#mycomputer cwd]# gzip -d /mnt/var/log/hadoop/steps/s-EXAMPLE/stderr.gz
[n1c9#mycomputer cwd]# cat /mnt/var/log/hadoop/steps/s-EXAMPLE/stderr
Then I'll get output with the typical 20/01/22 17:32:50 INFO Client: Application report for application_1 (state: ACCEPTED) lines that you can see in the stderr log file accessible on EMR.
So my question is: Where is the log (stdout) to see the actual AssertionError that was raised? It gets placed in my S3 bucket indicated for logging about 5-7 minutes after the script fails/completes, so where does it exist in EC2 before that? I ask because getting to these error logs before they are placed on S3 would save me a lot of time - basically 5 minutes each time I write a script that fails, which is more often than I'd like to admit!
What I've tried so far: I've tried checking the stdout on the EC2 machine in the paths in the code sample above, but the stdout file is always empty.
What I'm struggling to understand is how that stdout file can be empty if there's an AssertionError traceback available on S3 minutes later (am I misunderstanding how this process works?). I also tried looking in some of the temp folders that PySpark builds, but had no luck with those either. Additionally, I've printed the outputs of the consoles for the EC2 instances running on EMR, both core and master, but none of them seem to have the relevant information I'm after.
I also looked through some of the EMR methods for boto3 and tried the describe_step method documented here: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/emr.html#EMR.Client.describe_step - which, for failed steps, has a FailureDetails JSON dict in the response. Unfortunately, this only includes a LogFile key which links to the stderr.gz file on S3 (even if that file doesn't exist yet) and a Message key which contains a generic Exception in thread.. message, not the stdout. Am I misunderstanding something about the existence of those logs?
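For reference, a minimal boto3 sketch of the describe_step call described above (the cluster ID and region are placeholders; the step ID reuses the s-EXAMPLE example):
import boto3

emr = boto3.client("emr", region_name="us-east-1")  # example region

response = emr.describe_step(ClusterId="j-EXAMPLE", StepId="s-EXAMPLE")
status = response["Step"]["Status"]
print(status["State"])  # e.g. PENDING, RUNNING, COMPLETED, FAILED
# For failed steps, FailureDetails carries only the S3 LogFile path and a
# generic Message, not the stdout traceback described above.
print(status.get("FailureDetails", {}))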
Please feel free to let me know if you need any more information!
It is quite normal with log-collecting agents that the actual log files don't grow; the agent just intercepts stdout to do what it needs.
Most probably, when you configure S3 for the logs, the agent is set up to either read and delete your actual log file, or perhaps to symlink the log file somewhere else, so that the file is never actually written to when a process opens it for writing.
Maybe try checking if there is any symlink there:
find -L / -samefile /mnt/var/log/hadoop/steps/s-EXAMPLE/stderr
But it could be something other than a symlink that achieves the same logic, and I didn't find anything in the AWS docs, so most probably it is not intended that you have both S3 and local files at the same time, and you may not find it.
If you want to be able to check your logs more frequently, you may want to think about installing a third-party log collector (Logstash, Beats, rsyslog, Fluentd) and shipping logs to SolarWinds Loggly or logz.io, or setting up an ELK stack (Elasticsearch, Logstash, Kibana).
You can check this article from Loggly, or create a free account on logz.io and check the many free shippers that they support.

AWS-Logs, ElasticSearch : Specific logs not showing up in ElasticSearch, working only for select logs

I am streaming AWSLogs to CloudWatch and, from there, streaming them to an Elasticsearch domain. I can see in the overview that I have a huge number of searchable documents, but I am not able to find the documents from many different log streams when I search for them in Kibana. It works for only 2-3 out of 35 log streams. I can see all the streams in CloudWatch Logs and, on the right side, can also see that I am streaming to the Elasticsearch instance. I will explain how I am doing it; maybe someone has an idea of what I am doing wrong. Thank you.
Installed the AWSLogs service from here.
Commands:
curl https://s3.amazonaws.com/aws-cloudwatch/downloads/latest/awslogs-agent-setup.py -O
sudo python ./awslogs-agent-setup.py --region OUR_REGION
Once that's done, I added my log files in /var/awslogs/etc/awslogs.conf in the following manner:
[/var/www/html/var/log/exception.log]
datetime_format = %d/%b/%Y:%H:%M:%S
file = /var/www/html/var/log/exception.log
buffer_duration = 5000
log_stream_name = staging.1c.APP_NAME.exception.log
initial_position = end_of_file
log_group_name = staging.1c.APP_NAME.exception.log
After that, I logged into Kibana, and in index-patterns, defined an index-pattern as cwl-*.
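For anyone debugging a similar setup, one way to check which log groups actually reached the Elasticsearch domain is a terms aggregation over the @log_group field that the standard CloudWatch-to-Elasticsearch subscription adds to each document (the field name and the endpoint below are assumptions, not taken from the setup above); a rough Python sketch:
import json
import requests  # plain HTTP for brevity; a locked-down domain would need SigV4 signing

ES_ENDPOINT = "https://search-example.eu-west-1.es.amazonaws.com"  # placeholder endpoint

query = {
    "size": 0,
    "aggs": {"groups": {"terms": {"field": "@log_group.keyword", "size": 50}}},
}
resp = requests.post(ES_ENDPOINT + "/cwl-*/_search",
                     headers={"Content-Type": "application/json"},
                     data=json.dumps(query))
for bucket in resp.json()["aggregations"]["groups"]["buckets"]:
    print(bucket["key"], bucket["doc_count"])
If only 2-3 groups show up here, the documents never reached the domain and the problem is upstream of Kibana.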

How to add file_fingerprint_lines option as a --log-opt option in docker run command for Docker AWS log driver

I'm running my Docker containers on CoreOS AWS instances and have enabled the awslogs log driver for the containers. Given below is my docker container run command:
docker run --log-driver=awslogs --log-opt awslogs-region=ap-southeast-1 --log-opt awslogs-group=stagingUrlMapperLogs --log-opt awslogs-datetime-format='\[%%b %%d, %%Y %%H:%%M:%%S\]' --log-opt tag="{{.Name}}/{{.ID}}" --net=host --name url-mapper url-mapper-example:latest
The issue is that after some random period of time (1-2 days), no new log events are recorded on the AWS CloudWatch side. After doing some research I came across this issue reported on the AWS Developer Forums. It says that adding the file_fingerprint_lines option to the CloudWatch config will solve the issue. But I didn't find any resources explaining how to set the file_fingerprint_lines option with the docker run command.
Note - I'm running my servers in an AWS Auto Scaling group which is connected to a launch configuration, so each time I scale up, new servers spin up with the container running on them.
"But I didn't find any resources exaplaining how to set the file_fingerprint_lines command with docker run command."
I think that you have to set it in the CloudWatch Logs agent configuration file:
From the Amazon CloudWatch docs:
file_fingerprint_lines
Specifies the range of lines for identifying a file. The valid values
are one number or two dash delimited numbers, such as '1', '2-5'. The
default value is '1' so the first line is used to calculate
fingerprint. Fingerprint lines are not sent to CloudWatch Logs unless
all the specified lines are available
But, I think that the interesting point comes here:
What kinds of file rotations are supported?
The following file rotation mechanisms are supported:
Renaming existing log files with a numerical suffix, then re-creating
the original empty log file. For example, /var/log/syslog.log is
renamed /var/log/syslog.log.1. If /var/log/syslog.log.1 already exists
from a previous rotation, it is renamed /var/log/syslog.log.2.
Truncating the original log file in place after creating a copy. For
example, /var/log/syslog.log is copied to /var/log/syslog.log.1 and
/var/log/syslog.log is truncated. There might be data loss for this
case, so be careful about using this file rotation mechanism.
Creating a new file with a common pattern as the old one. For example,
/var/log/syslog.log.2014-01-01 remains and
/var/log/syslog.log.2014-01-02 is created.
The fingerprint (source ID) of the file is calculated by hashing the
log stream key and the first line of file content. To override this
behavior, the file_fingerprint_lines option can be used. When file
rotation happens, the new file is supposed to have new content and the
old file is not supposed to have content appended; the agent pushes
the new file after it finishes reading the old file.
And, how to override it:
You can have more than one [logstream] section, but each must have a
unique name within the configuration file, e.g., [logstream1],
[logstream2], and so on. The [logstream] value along with the first
line of data in the log file, define the log file's identity.
[general]
state_file = value
logging_config_file = value
use_gzip_http_content_encoding = [true | false]
[logstream1]
log_group_name = value
log_stream_name = value
datetime_format = value
time_zone = [LOCAL|UTC]
file = value
file_fingerprint_lines = integer | integer-integer
multi_line_start_pattern = regex | {datetime_format}
initial_position = [start_of_file | end_of_file]
encoding = [ascii|utf_8|..]
buffer_duration = integer
batch_count = integer
batch_size = integer
[logstream2]
...
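To see why the default of hashing only the first line can misfire when a rotated file starts with the same line as its predecessor, here is a rough Python sketch of the fingerprinting idea the documentation describes above; it illustrates the concept only and is not the agent's actual code:
import hashlib

def fingerprint(log_stream_key, path, fingerprint_lines=1):
    # Hash the log stream key plus the first N lines of the file, mirroring the
    # description above (N corresponds to file_fingerprint_lines).
    with open(path, "rb") as f:
        head = b"".join(f.readline() for _ in range(fingerprint_lines))
    return hashlib.md5(log_stream_key.encode() + head).hexdigest()

# If a freshly rotated file happens to begin with the same first line as the old
# one, a one-line fingerprint collides and the file looks "already read"; widening
# the range (e.g. file_fingerprint_lines = 1-3) makes the identity more specific.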

Create a log stream for each log file in CloudWatch Logs

I use the AWS CloudWatch Logs agent to push my application logs to AWS CloudWatch.
In the CloudWatch Logs config file inside my EC2 instance, I have this entry:
[/scripts/application]
datetime_format = %Y-%m-%d %H:%M:%S
file = /workingdir/customer/logfiles/*.log
buffer_duration = 5000
log_stream_name = {instance_id}
initial_position = start_of_file
log_group_name = /scripts/application
According to this configuration, all log files in the workingdir directory are sent to CloudWatch Logs in the same stream, where the name is the instance ID.
My question is: I want to create a separate log stream for each log file, so that reading the logs is faster and easier to parse. In other words, every time I have a new log file, a new log stream should be created automatically.
I thought of doing that with a shell script in a cron job, but then I'd have to change many other configurations in the architecture, so I'm looking for a way to do it in the config file. The documentation says:
log_stream_name
Specifies the destination log stream. You can use a literal string or
predefined variables ({instance_id}, {hostname}, {ip_address}), or
combination of both to define a log stream name. A log stream is
created automatically if it doesn't already exist.
The names of the log files can't be 100% predictable, but they always have this structure:
CustomerName-YYYY-mm-dd.log
Also, another problem is that:
A running agent must be stopped and restarted for configuration
changes to take effect.
How can I set the log stream in this case?
Any ideas, suggestions, or workarounds are very much appreciated.
I know it's been almost two years now, but I wanted to do the exact same thing and couldn't find a way to get it to work.
I resorted to AWS Support, who confirmed this cannot be done. We're limited to the options offered in the documentation, just as you posted. You can, however, have the log group name contain the log file path up to the final dot:
log_group_name – Optional. Specifies what to use as the log group name
in CloudWatch Logs. Allowed characters include a-z, A-Z, 0-9, '_'
(underscore), '-' (hyphen), '/' (forward slash), and '.' (period).
We recommend that you specify this field to prevent confusion. If you
omit this field, the file path up to the final dot is used as the log
group name. For example, if the file path is
/tmp/TestLogFile.log.2017-07-11-14, the log group name is
/tmp/TestLogFile.log.
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-Agent-Configuration-File-Details.html
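Since the agent can't expand a wildcard into one stream per file on its own, the cron-job idea from the question is roughly what remains. Below is a minimal sketch of that workaround (my own assumption of how it could look, not a supported AWS feature): it appends one stanza per CustomerName-YYYY-mm-dd.log file to the agent config and relies on the agent being restarted afterwards, since configuration changes only take effect on restart.
import glob
import os

LOG_DIR = "/workingdir/customer/logfiles"    # directory from the question
CONF_PATH = "/var/awslogs/etc/awslogs.conf"  # assumed location of the agent config

with open(CONF_PATH) as conf:
    existing = conf.read()

new_sections = []
for path in sorted(glob.glob(os.path.join(LOG_DIR, "*.log"))):
    name = os.path.basename(path)            # e.g. CustomerName-2019-01-01.log
    if "[" + name + "]" in existing:
        continue                             # a stanza for this file already exists
    new_sections.append(
        "\n[" + name + "]\n"
        "datetime_format = %Y-%m-%d %H:%M:%S\n"
        "file = " + path + "\n"
        "buffer_duration = 5000\n"
        "log_stream_name = " + name + "\n"
        "initial_position = start_of_file\n"
        "log_group_name = /scripts/application\n"
    )

if new_sections:
    with open(CONF_PATH, "a") as conf:
        conf.writelines(new_sections)
    # Restart the agent here (e.g. sudo service awslogs restart) so the new
    # stanzas are picked up; a cron entry running this script keeps it current.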

Sitecore purge/write log immediately

Is there a way to purge the log in Sitecore such that logs are written immediately? It's for production debugging.
Also, scrolling through the log files, there are a number of log files, e.g. log.date.txt and log.date.time.txt. Which one is the latest, i.e. the one with or without the time?
You can use the following module for a production server if you have remote access there:
https://marketplace.sitecore.net/Modules/S/Sitecore_Log_Analyzer.aspx
Another option is to use this module:
https://marketplace.sitecore.net/Modules/S/Sitecore_ScriptLogger.aspx
The log with no timestamp in the file name is the first one for that day.
A new log file is created each time the application pool restarts.
If you haven't changed any of the default log4net settings, then the initial log file will be in the format log.yyyyMMdd.txt; each subsequent restart will cause a new file to be generated in the format log.yyyyMMdd.HHmmss.txt.
The latest log file for the day will be the file with the latest timestamp.