Managing/deleting/rotating/streaming Elastic Beanstalk Logs

I am using Amazon EB for the first time. I've set up a Rails app running on Linux and Puma.
So far, I've been viewing logs through the eb logs command. I know that EB can be set to rotate the logs to S3 or stream them to CloudWatch.
My question revolves around the deletion of the various log files.
Will the various logs, such as puma.log, be deleted automatically, or must I do it myself?
If I set up log rotation to S3, will the log files on the EC2 instance be deleted (and a fresh copy created in their place) when they get rotated to S3? Or do they just keep growing indefinitely?
If I stream them to CloudWatch, will the same copy of the log be kept on the EC2 instance and grow indefinitely?
I've googled around but can't seem to find any notion of "log management" or "log deletion" in the docs or on SO.

I'm using Beanstalk on a LAMP project and I can answer a few of your questions.
You have to set up your log rotation policy, at least for your app logs. Check whether your base image already rotates these logs for you; on Linux the config should be in /etc/logrotate.conf.
When you use S3 logs with Beanstalk, it already tails and deletes the logs after 15 minutes: http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/using-features.logging.html#health-logs-s3location
The same copy of the log will be kept on your EC2 instance; your log rotation policy in /etc/logrotate.conf is what will delete it. awslogs keeps some metadata to track which chunk of the logs has already been processed, so it does not create duplicates.
If you want an example of how to use CloudWatch Logs with Elastic Beanstalk, check: http://www.albertsola.pro/store-aws-beanstalk-symfony-and-apache-logs-in-cloudwatch-logs/
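As a concrete example of such a rotation policy, here is a minimal .ebextensions sketch that installs a logrotate rule for the Puma log (the config file name and the log path are assumptions; adjust them to your platform):

.ebextensions/logrotate-puma.config

files:
  "/etc/logrotate.d/puma":
    mode: "000644"
    owner: root
    group: root
    content: |
      # Rotate the Puma log daily, keep 7 compressed copies, and
      # truncate in place so Puma keeps writing to the same file.
      /var/app/current/log/puma.log {
        daily
        rotate 7
        compress
        missingok
        notifempty
        copytruncate
      }

copytruncate is used here so the app does not need to be signaled to reopen its log file after each rotation.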

Related

If you are using AWS to autoscale spot instances of your application, how do you handle logging?

Looking into adding autoscaling of a portion of our application using Amazon Simple Queue Service (SQS), which would launch EC2 on-demand or spot instances based on queue backlog.
One question I had is: how do you deal with collecting logs from autoscaled instances? New instances are spun up from an image, but they are shut down when their work is complete. Currently, if there is an issue with one of our services that causes it to crash, we have a system to automatically restart the service, and the logs and core dump files are there to review. If we switch to an autoscaling system, where new instances are spun up, how do you get logs and core dump files when there is a failure? Particularly if the instance has been spun down.
Good practice is to ship these logs and aggregate them somewhere else, and there are many services, such as DataDog and Rapid7, which will do this for you at a cost.
AWS, however, provides CloudWatch Logs, which gives you a central place to store and view logs. It also lets you give users access to logs in the AWS console without them having to SSH onto a server.
Shipping your logs to CloudWatch Logs requires installing the CloudWatch agent on your server and specifying in the config which logs to ship.
You could install the CloudWatch agent once and create an AMI of that server to use in your autoscaling group, or install and configure the CloudWatch agent in user data every time a server is spun up.
All the information you need to get started can be found here:
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Install-CloudWatch-Agent.html
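For illustration, a minimal agent config that ships a single application log might look like this (the file path and log group name are placeholders; on Linux the agent reads its JSON config from /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json by default):

{
  "logs": {
    "logs_collected": {
      "files": {
        "collect_list": [
          {
            "file_path": "/var/log/myapp/app.log",
            "log_group_name": "myapp-autoscaling-logs",
            "log_stream_name": "{instance_id}"
          }
        ]
      }
    }
  }
}

Since each stream is named after the instance ID, log data from autoscaled instances stays in CloudWatch after those instances are terminated.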

How to keep logs in AWS if application restarts?

I run a Spring Boot application in AWS with Docker. Sometimes Amazon has to restart the underlying hardware. The Environment Health of the instance in Beanstalk then goes Degraded, then Warning, and it restarts.
I want my app logs from the last 7 days, but the instance was restarted due to unforeseen AWS hardware issues, so I lost that information. How can I avoid this and make AWS save all my logs even after a restart?
It is true that archiving logs to S3 would work for the most part, but you may want to consider installing and configuring the CloudWatch Logs agent: http://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/QuickStartEC2Instance.html
This will stream logs directly to CloudWatch and preserve them after the instance is terminated. You could also consider numerous other solutions for this, such as Sumo Logic, ELK, or Splunk.
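On Beanstalk specifically, instance log streaming to CloudWatch Logs can also be switched on with an option setting. A minimal .ebextensions sketch (RetentionInDays: 7 matches the seven days you asked about, and DeleteOnTerminate: false keeps the log groups after instances or the environment are gone):

.ebextensions/log-streaming.config

option_settings:
  aws:elasticbeanstalk:cloudwatch:logs:
    StreamLogs: true
    DeleteOnTerminate: false
    RetentionInDays: 7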
You should always build solutions so as to be ready even when hardware crashes. One possible solution is to send log files to an S3 bucket as they are rotated; you can create a cron job to do this, as sketched below.
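A rough sketch of that cron approach via .ebextensions (the bucket name, log path, and script name are placeholders for illustration; the instance profile also needs s3:PutObject on the bucket):

.ebextensions/logs-to-s3-cron.config

files:
  "/etc/cron.daily/app-logs-to-s3":
    mode: "000755"
    owner: root
    group: root
    content: |
      #!/bin/sh
      # Copy already-rotated (compressed) app logs to S3 once a day.
      # Placeholders: bucket name and log directory.
      INSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
      aws s3 cp /var/app/current/log/ "s3://my-app-log-archive/$INSTANCE_ID/" \
        --recursive --exclude "*" --include "*.gz"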

Customizing ec2 log rotation to S3 on elastic beanstalk

I have an AWS Elastic Beanstalk environment with some Amazon Linux instances running a Tomcat 8 server. I have enabled log rotation from the Beanstalk console, and I can see the logs getting published to S3 every hour.
I would like to reduce the frequency of the rotation from 1 hour to maybe 12 hours (or something customizable that I can decide later; if customization is limited, I can fall back to daily). The only related pointers I've found in the documentation are that the logrotate configuration is at /etc/logrotate.elasticbeanstalk.hourly/ and that the cron job runs hourly, as defined in /etc/cron.hourly/.
The default logrotate configuration for Tomcat is set to rotate at size 10 MB, but the force flag in the cron task basically ignores this and ends up rotating the log file much sooner (I don't have a whole lot of traffic). Too many log files make it very annoying to do any sort of debugging later.
How can I go about changing the logrotate configuration and overriding the cron job? Is the recommended option to overwrite these config files via a script in the .ebextensions folder?
When an instance is terminated (replaced by another one during rolling updates, or for any other reason), does Elastic Beanstalk automatically back up the pending logs to S3, or do we lose them? What changes should I make (or avoid) in the above configuration to ensure that all logs are uploaded to S3?
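As a rough illustration of the .ebextensions route asked about, one sketch is to move the Beanstalk-generated logrotate cron script from hourly to daily (the script name below is an assumption and varies by platform and application; list /etc/cron.hourly/ on a running instance to find the real one):

.ebextensions/logrotate-daily.config

container_commands:
  01_move_logrotate_cron_to_daily:
    # Hypothetical script name -- verify on an instance first.
    command: "mv /etc/cron.hourly/cron.logrotate.elasticbeanstalk.tomcat8.conf /etc/cron.daily/ || true"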

Easier way to access ElasticBeanstalk EC2 Log files

I am programming a Jersey service on Tomcat via Elastic Beanstalk with a load balancer. I am finding that getting each EC2 instance's catalina files from S3 is very cumbersome. Currently I need to determine the EC2 instance(s), then work my way to each of the S3 locations, download the files, and only then can I diagnose.
The snapshot doesn't help: due to the number of requests that come in, it doesn't hold enough info, and by the time I get the snapshot, the entry of interest has "rolled" off it.
Two questions:
1) Is there an easier approach to log files via AWS? (Increasing the time before rotation, which I don't believe is supported as of now, scripts, etc.)
2) Is there any software or scripts to access all the logs under the load balancer? I basically want to say "give me all logs for this Beanstalk environment" and have it get all logs for that day from all servers behind that load balancer, up or down. The clincher is "down": the problem becomes more complex when the load balancer takes down an instance right when the issue occurs.
Thanks!
As an immediate solution to your problem, you can follow the approach suggested in this answer: essentially, you can modify the logrotate configuration to rotate at a bigger log size using ebextensions, as sketched below.
Then snapshot logs should work for you.
Let me know if you need more clarifications on this approach.
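A hedged example of such an override (the config file name and log path below are assumptions based on the Tomcat setup described above; note the hourly cron's force flag can still rotate sooner than the size threshold unless that cron entry is also changed):

.ebextensions/logrotate-size.config

files:
  "/etc/logrotate.elasticbeanstalk.hourly/logrotate.elasticbeanstalk.tomcat8.conf":
    mode: "000644"
    owner: root
    group: root
    content: |
      # Rotate only once the log reaches 100 MB, keep 5 compressed copies.
      /var/log/tomcat8/catalina.out {
        size 100M
        rotate 5
        missingok
        compress
        notifempty
        copytruncate
      }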
AWS has released CloudWatch Logs just last week, which enables you to monitor and troubleshoot your systems and applications using your existing system, application, and custom log files:
You can send your existing system, application, and custom log files to CloudWatch Logs and monitor these logs in near real-time. [...] you can store your logs using highly durable, low-cost storage for later access.
See the introductory blog post Store and Monitor OS & Application Log Files with Amazon CloudWatch for an illustrated walkthrough, which already touches on using Elastic Beanstalk with CloudWatch Logs; this is further detailed in Using AWS Elastic Beanstalk with Amazon CloudWatch Logs.

Enable log file rotation to S3

I have enabled this option.
The problem is: if I don't press the snapshot logs button, the logs do not go to S3.
Is there any method by which the logs get published to S3 each day?
Or how does the log file rotation option work?
If you are using the default instance profile with Elastic Beanstalk, then AWS automatically creates the permission to rotate the logs to S3.
If you are using a custom instance profile, you have to grant Elastic Beanstalk permission to rotate logs to Amazon S3 yourself (a policy sketch follows below).
The logs are rotated every 15 minutes.
AWS Elastic Beanstalk: Working with Logs
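For the custom-profile case, here is a minimal sketch of the S3 statement to attach to the instance profile, modeled on the managed web-tier policy (the elasticbeanstalk-* bucket pattern matches the buckets the service creates; adjust the resource if you rotate to your own bucket):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "BucketAccess",
      "Effect": "Allow",
      "Action": ["s3:Get*", "s3:List*", "s3:PutObject"],
      "Resource": [
        "arn:aws:s3:::elasticbeanstalk-*",
        "arn:aws:s3:::elasticbeanstalk-*/*"
      ]
    }
  ]
}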
For a more robust mechanism to push your logs to S3 from any EC2 server instance, you can pair logrotate with S3. I've put all the details in this post as a reference, which should let you achieve exactly what you're describing.
Hope that helps.
NOTICE: if you want to rotate custom log files then, depending on your container, you need to add links to your custom log files in the proper places. For example, consider a Ruby on Rails deployment: if you want to store custom information, e.g. some monitoring via the Oink gem in an oink.log file, add the proper link in /var/app/support/logs using .ebextensions:
.ebextensions/XXXlog.config
files:
  "/var/app/support/logs/oink.log":
    mode: "120400"
    content: "/var/app/current/log/oink.log"
After deployment, this will create the symlink:
/var/app/support/logs/oink.log -> /var/app/current/log/oink.log
I'm not sure why mode 120400 is used; I took it from the example on the Amazon AWS doc page http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/customize-containers-ec2.html (it seems the 120xxx prefix marks a symlink in the Unix filesystem).
This log file rotation is good for archival purposes, but the logs are difficult to search and consolidate when you need them most.
Consider using services like Splunk or Loggly.