How to get docker app logs to S3 bucket

Is there any way to stream/push Docker app logs to an S3 bucket?
I know of the following 2 ways:
Configure CloudWatch logs/streams - all logs (both info and error logs) get merged together in this approach.
Configure Graylog2 to collect every log message and push it to an S3 bucket - this requires maintaining the Graylog2 app.
I am looking for an easy way to push Docker app/error logs to an S3 bucket.
Thanks

A possible solution, though it's hard to tell for your case, is to run Logstash in a separate container and have your app send its logs to Logstash. Since Logstash's logging framework is based on Log4j 2, it will likely be familiar to you. A plugin already exists for Logstash that pushes to S3 on your behalf.
You can configure your existing Log4j 2 setup to emit to a port that Logstash is listening on.
If even this is considered too much maintenance for you, your best option is probably just setting up a cron job that syncs the log directory to S3 (e.g. with aws s3 sync; rsync itself cannot talk to S3 directly).
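As a rough illustration of that pipeline, here is a minimal Logstash sketch using the logstash-output-s3 plugin; the port, region, bucket, and prefix are placeholders, and it assumes your Log4j 2 SocketAppender sends one JSON object per line to the TCP input:
# logstash.conf - illustrative sketch; adjust port, region, bucket, and prefix
input {
  tcp {
    port  => 5000          # the port your Log4j 2 SocketAppender points at
    codec => json_lines    # assumes the appender writes JSON lines
  }
}
output {
  s3 {
    region            => "us-east-1"     # bucket region (placeholder)
    bucket            => "my-app-logs"   # hypothetical bucket name
    prefix            => "docker-app"
    encoding          => "gzip"
    rotation_strategy => "time"
    time_file         => 15              # upload a new object roughly every 15 minutes
  }
}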

Related

Using AWS ELK stack for capturing logs

I am new to AWS and was experimenting with some of its services. I managed to create a Python application running inside an EC2 instance. The application creates a log file with the analysis data.
I want to connect this log file with AWS's Elasticsearch and Kibana service to begin running analytics on it.
Can someone point me to the best way of streaming my EC2 app's logs to the AWS Elasticsearch service?
You have multiple options to deal with this problem; in the case of AWS:
Install the AWS CloudWatch Logs agent
Start the log agent with your log file
Stream the CloudWatch log group to a Lambda function (see the sketch after this list)
The Lambda pushes the logs to ELK
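For the CloudWatch-to-Lambda step, the wiring is a CloudWatch Logs subscription filter. A hedged CLI sketch; the log group name, filter name, and Lambda ARN are placeholders:
# the Lambda must first grant invoke permission to the CloudWatch Logs service principal
aws logs put-subscription-filter \
  --log-group-name "/my/app/log-group" \
  --filter-name "stream-to-elk" \
  --filter-pattern "" \
  --destination-arn "arn:aws:lambda:us-east-1:123456789012:function:cwl-to-elk"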
But I would go with the approach below, as it does not need a Lambda or a log group, and the logs are sent to ELK directly:
Filebeat
Logagent (Node.js-based package)
Filebeat is part of the Elastic Stack, meaning it works seamlessly with Logstash, Elasticsearch, and Kibana. Whether you want to transform or enrich your logs and files with Logstash, fiddle with some analytics in Elasticsearch, or build and share dashboards in Kibana, Filebeat makes it easy to ship your data to where it matters most.
All you need to do is specify the application log files:
paths:
- /app/log/*.log
- /app/log/*/*.log
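To put those paths in context, a minimal filebeat.yml sketch; the Elasticsearch endpoint is a hypothetical AWS-hosted domain URL, and the paths mirror the ones above:
# filebeat.yml - illustrative sketch
filebeat.inputs:
  - type: log
    enabled: true
    paths:
      - /app/log/*.log
      - /app/log/*/*.log
output.elasticsearch:
  # hypothetical AWS Elasticsearch Service endpoint (HTTPS on port 443)
  hosts: ["https://my-domain.us-east-1.es.amazonaws.com:443"]
  protocol: "https"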
Logagent is a modern, open-source, light-weight log shipper. It is like Filebeat and Logstash in one, without the JVM memory footprint. It comes with out-of-the-box and extensible log parsing, on-disk buffering, secure transport, and bulk indexing to Elasticsearch, Sematext Logs, and other destinations. Its low memory footprint and low CPU overhead make it suitable for deploying on edge nodes and devices, while its ability to parse and structure logs makes it a great Logstash alternative.
sudo npm i -g @sematext/logagent
See: shipping-data-to-aws-elasticsearch-with-logagent

How can I configure Elastic Beanstalk to show me only the relevant log file(s)?

I'm an application developer with very limited knowledge of infrastructure. At my last job we frequently deployed Java web services (built as WAR files) to Elastic Beanstalk, and much of the infrastructure had already been set up before I ever started there, so I got to focus primarily on the code and not how things were tied together. One feature of Elastic Beanstalk that often came in handy was the button to "Request Logs," where you can select either the "Last 100 Lines" or the "Full Logs." What I was used to seeing when I clicked this button was the log generated by my web service, directly.
Now, at the new job, the infrastructure requirements are a little different, as we have to Dockerize everything before deploying it. I've been trying to stand up a Spring Boot web app inside a Docker container in Elastic Beanstalk, and have been running into trouble with that. And I also noticed a bizarre difference in behavior when I went to "Request Logs." Now when I choose one of those options, instead of dropping me into the relevant log file directly, it downloads a ZIP file containing the entire /var/log directory, with quite a number of disparate and irrelevant log files in there. I understand that there's no way for Amazon to know, necessarily, that I don't care about X log file but do care about Y log file, but was surprised that the behavior is different from what I was used to. I'm assuming this means the EB configuration at the last job was set up in a specific way to filter the one relevant log file, or something like that.
Is there some way to configure an Elastic Beanstalk application to only return one specific log file when you "Request Logs," rather than a ZIP file of the /var/log directory? Is this done with ebextensions or something like that? How can I do this?
Not too sure about the Beanstalk console, but using the EB CLI, if you enable CloudWatch log streaming (note that storing logs in CloudWatch incurs cost) for your Beanstalk instances, you can run:
eb logs --stream --log-group <CloudWatch logGroup name>
The above command gives you the logs for your instance specific to the log group you specified. In order for it to work, you first need to enable CloudWatch log streaming:
eb logs --cloudwatch-logs enable
As an aside, to determine which log groups your environment presently has, perform:
aws logs describe-log-groups --region <region> | grep <beanstalk environment name>
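For example (illustrative only; the environment name and log group are placeholders, and on the Docker platform the container's stdout/stderr typically ends up under /var/log/eb-docker/):
eb logs --stream --log-group "/aws/elasticbeanstalk/my-env/var/log/eb-docker/containers/eb-current-app/stdouterr.log"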

How to keep logs in AWS if application restarts?

I run a Spring Boot application in AWS with Docker. Sometimes Amazon has to restart the underlying hardware. The Environment Health of the instance in Beanstalk then goes to Degraded, then Warning, and the instance restarts.
I want my app logs from the last 7 days, but the instance was restarted due to unforeseen AWS hardware issues, so I lost that information. How can I avoid this and make AWS keep all my logs even after a restart?
It is true that archiving logs to S3 would work for the most part but you may want to consider installing and configuring the CloudWatch Logs agent - http://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/QuickStartEC2Instance.html
This streams logs directly to CloudWatch, so they are preserved even if the instance is terminated. You could also consider numerous other solutions for this, such as Sumo Logic, ELK, Splunk, etc.
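The classic awslogs agent is configured with an INI file; a minimal illustrative entry (the file path, log group name, and datetime format are placeholders):
# /etc/awslogs/awslogs.conf - illustrative entry
[/var/log/myapp/app.log]
file = /var/log/myapp/app.log
log_group_name = my-app-logs
log_stream_name = {instance_id}
datetime_format = %Y-%m-%d %H:%M:%S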
You should always build solutions so that you are ready even when hardware crashes. One possible approach is to send log files to an S3 bucket as they are rotated. You can create a cron job to do this.
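A hedged sketch of such a cron entry; the log path and bucket name are placeholders, and it assumes the AWS CLI is installed and the instance role allows writes to the bucket:
# /etc/cron.d/ship-logs - illustrative; syncs rotated logs to S3 every hour as root
0 * * * * root /usr/bin/aws s3 sync /var/log/myapp/ s3://my-app-log-archive/$(hostname)/ --exclude "*" --include "*.log*"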

AWS ECS container logs design pattern

I have a classic Scala app; it produces three different logs in these locations:
/var/log/myapp/log1/mylog.log
/var/log/myapp/log2/another.log
/var/log/myapp/log3/anotherone.log
I containerized the app and it is working fine; I can get those logs via a Docker volume mount.
Now the app/container will be deployed in AWS ECS with an Auto Scaling group. In this case multiple containers may run on one single ECS host.
I would like to use CloudWatch to monitor my application logs.
One solution could be to put the AWS logs agent inside my application container.
Is there any better way to get those application logs from the container to CloudWatch Logs?
Help is very much appreciated.
When using docker, the recommended approach is to not log to files, but to send logs to stdout and stderr. Doing so prevents the logs from being written to the container's filesystem, and (depending on the logging driver in use), allows you to view the logs using the docker logs / docker container logs subcommand.
Many applications have a configuration option to log to stdout/stderr, but if that's not an option, you can create a symlink to redirect output; for example, the official NGINX image on Docker Hub uses this approach.
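If the app's logging configuration can't be changed, a hedged Dockerfile fragment using the same symlink trick as the NGINX image, with the paths from the question (which log goes to stdout vs. stderr is an arbitrary choice here):
# Dockerfile fragment - redirect the app's file logs to the container's stdout/stderr
RUN mkdir -p /var/log/myapp/log1 /var/log/myapp/log2 /var/log/myapp/log3 \
 && ln -sf /dev/stdout /var/log/myapp/log1/mylog.log \
 && ln -sf /dev/stdout /var/log/myapp/log2/another.log \
 && ln -sf /dev/stderr /var/log/myapp/log3/anotherone.log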
Docker supports logging drivers, which allow you to send logs to (among others) AWS CloudWatch. After you modify your image to log to stdout/stderr, you can configure the awslogs logging driver.
More information about logging in Docker can be found in the "logging" section of the documentation.
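With ECS, the awslogs driver is set per container in the task definition's logConfiguration; a minimal JSON fragment (the log group, region, and stream prefix are placeholders):
"logConfiguration": {
  "logDriver": "awslogs",
  "options": {
    "awslogs-group": "/ecs/myapp",
    "awslogs-region": "us-east-1",
    "awslogs-stream-prefix": "myapp"
  }
}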
You don't need a log agent if you can change the code.
You can directly publish custom metric data to CloudWatch, as described on this page: https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/examples-cloudwatch-publish-custom-metrics.html
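A minimal sketch along the lines of that linked SDK example (AWS SDK for Java v1; the namespace and metric name are hypothetical):
import com.amazonaws.services.cloudwatch.AmazonCloudWatch;
import com.amazonaws.services.cloudwatch.AmazonCloudWatchClientBuilder;
import com.amazonaws.services.cloudwatch.model.MetricDatum;
import com.amazonaws.services.cloudwatch.model.PutMetricDataRequest;
import com.amazonaws.services.cloudwatch.model.StandardUnit;

public class PublishErrorMetric {
    public static void main(String[] args) {
        // default client picks up region/credentials from the environment or the instance role
        AmazonCloudWatch cw = AmazonCloudWatchClientBuilder.defaultClient();

        // hypothetical metric: count one application error occurrence
        MetricDatum datum = new MetricDatum()
                .withMetricName("ApplicationErrors")
                .withUnit(StandardUnit.Count)
                .withValue(1.0);

        PutMetricDataRequest request = new PutMetricDataRequest()
                .withNamespace("MyApp")   // hypothetical namespace
                .withMetricData(datum);

        cw.putMetricData(request);
    }
}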

Enable log file rotation to s3

I have enabled this option.
Problem is:
If I don't press the "Snapshot Logs" button, the logs do not go to S3.
Is there any method by which the logs get published to S3 each day?
Or how does the log file rotation option work?
If you are using default instance profile with Elastic Beanstalk, then AWS automatically creates permission to rotate the logs to S3.
If you are using custom instance profile, you have to grant Elastic Beanstalk permission to rotate logs to Amazon S3.
The logs are rotated every 15 minutes.
AWS Elastic Beanstalk: Working with Logs
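For reference, a hedged sketch of the sort of statement a custom instance profile needs, modeled on the bucket access granted by the managed AWSElasticBeanstalkWebTier policy (the resource pattern matches the Elastic Beanstalk-managed buckets):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowLogPublicationToS3",
      "Effect": "Allow",
      "Action": ["s3:PutObject"],
      "Resource": "arn:aws:s3:::elasticbeanstalk-*/*"
    }
  ]
}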
For a more robust mechanism to push your logs to S3 from any EC2 server instance, you can pair logrotate with S3. I've put all the details in this post as a reference, which should let you achieve exactly what you're describing.
Hope that helps.
NOTICE: if you want to rotate custom log files then, depending on your container, you need to add links to your custom log files in the proper places. For example, consider a Ruby on Rails deployment: if you want to store custom information, e.g. some monitoring data from the Oink gem in an oink.log file, add the proper link in /var/app/support/logs using .ebextensions:
.ebextensions/XXXlog.config
files:
  "/var/app/support/logs/oink.log" :
    mode: "120400"
    content: "/var/app/current/log/oink.log"
This, after deploy, will create symlink:
/var/app/support/logs/oink.log -> /var/app/current/log/oink.log
I'm not sure why permissions 120400 are used; I took it from the example on the Amazon AWS doc page http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/customize-containers-ec2.html (the leading 120 in the octal mode is how Unix filesystems mark a symlink).
This log file rotation is good for archival purposes, but the archives are difficult to search and consolidate when you need them most.
Consider using services like Splunk or Loggly.