Best approach for sending logs from ECS Fargate into Elasticsearch - amazon-web-services

We have a setup with multiple containers running Node.js services (node:11-alpine Docker image) deployed on AWS ECS Fargate.
We already have a running Elasticsearch instance collecting logs from a non-Fargate application. I would like to send the logs from the Fargate containers to this Elasticsearch instance, but I'm having a hard time figuring out the best approach.
1)
It seems one way is to stream the logs from CloudWatch --> Lambda --> Elasticsearch. That seems a bit overkill - isn't there a simpler way to do this?
2)
I was hoping I could run a Logstash Docker instance that could collect the logs from the containers, but I am not sure if this is possible when running Fargate?
3)
Should I install something like Filebeat on each container and let that send the logs?
Any help is appreciated.

It seems one way is to stream the logs from CloudWatch --> Lambda --> Elasticsearch. That seems a bit overkill - isn't there a simpler way to do this?
If you're looking for an AWS-managed solution, that is one of the ways. You don't actually need to write the Lambda function yourself; AWS does it for you. You do, however, bear the cost of both Lambda and CloudWatch.
There is another solution recommended by AWS: run Fluent Bit as a sidecar container to export logs directly to Elasticsearch/OpenSearch from the other containers running within a service. With this solution you save money by not using CloudWatch, and it is also more resilient to log loss on failure.
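To make the sidecar approach concrete, here is a minimal sketch of the relevant containerDefinitions fragment for an ECS task definition, expressed as a Python dict as you would pass it to boto3's register_task_definition. The image tag, Elasticsearch host, and index name are placeholders, not values from the question.

```python
# Sketch of an ECS task definition fragment wiring a Fluent Bit sidecar
# (via FireLens) to ship a container's logs straight to Elasticsearch.
# Host, Port, and Index are placeholders for your own cluster.
container_definitions = [
    {
        "name": "log_router",
        "image": "public.ecr.aws/aws-observability/aws-for-fluent-bit:stable",
        "essential": True,
        # Marks this container as the FireLens log router for the task
        "firelensConfiguration": {"type": "fluentbit"},
    },
    {
        "name": "app",
        "image": "node:11-alpine",
        "essential": True,
        "logConfiguration": {
            # Route this container's stdout/stderr through the sidecar
            "logDriver": "awsfirelens",
            "options": {
                "Name": "es",                      # Fluent Bit Elasticsearch output plugin
                "Host": "my-es.example.internal",  # placeholder endpoint
                "Port": "9200",
                "Index": "fargate-logs",
                "tls": "On",
            },
        },
    },
]
```

With this in place, the app container needs no log agent of its own; ECS hands its output to the sidecar, which forwards it to the configured Elasticsearch endpoint.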
I was hoping I could run a Logstash Docker instance that could collect the logs from the containers, but I am not sure if this is possible when running Fargate?
Yes, that is possible if you run the Logstash container as a sidecar alongside the other containers in the task.
Should I install something like FileBeat on each container and let that send the logs?
You can use Fluent Bit, Filebeat, Fluentd, Functionbeat, or Logstash as you like.
Note: if you're running your own log-exporter container such as Logstash or Fluent Bit, don't also enable CloudWatch logging; you're not going to use it, and disabling it saves money.

Related

What's the proper way to forward ECS service logs to AWS CloudWatch?

So my understanding is that when I deploy a new service to ECS using AWS Copilot, logs are forwarded to CloudWatch automatically by default.
Copilot creates log groups for each service, I can see that in CloudWatch Logs.
However, according to the AWS docs, logging can also be implemented using Copilot sidecars and AWS FireLens, which uses Fluentd or Fluent Bit to collect logs and then forwards them to CloudWatch.
I don't understand why this is necessary. Why create a sidecar for logging to CloudWatch when logging seems to work automatically, without any sidecar?
https://aws.github.io/copilot-cli/docs/developing/sidecars/
There is an example here for logging via FireLens. What's the benefit of doing this over the logging mechanism that just works by default?
Thanks in advance!
AWS Copilot builds an image for your application that already has an agent configured to forward logs to CloudWatch. However, you might want to deploy other images to ECS that don't have this agent installed. For example, if you wanted to deploy an nginx container to ECS, you might choose to use a sidecar to forward logs instead of customizing the nginx image.

Logging for AWS ECS FARGATE

I'm trying to use AWS FireLens for our Fargate container logs, using the sidecar method. Everything is working as expected. However, I have one concern: are the logs in the source container rotated? In other words, should I be worried about the disk space in our containers running out because of the logs?
I ask because after enabling the CloudWatch agent in the source container, I see no change in disk_used_percent at all, so I couldn't reach a conclusion. Can someone help me with information on this?

What is the replacement with AWS infrastructure for Celery and Redis?

My application is written in the Django framework and uses Celery and Redis for asynchronous tasks. I would like to autoscale workers according to load (the number of messages in the queue). For this, I would like to make use of the different options provided by AWS.
What is the replacement with AWS infrastructure for Celery and Redis?
ElastiCache is Amazon's managed service for an in-memory data store. ElastiCache lets you provision a service that provides high-performance data-store functionality using either Redis or Memcached as a base.
I have not personally used Celery, but I know of it as a message queue like RabbitMQ. In that case, the likely managed-service alternative would be Amazon MQ. Amazon MQ uses Apache ActiveMQ under the hood, but the API layer should abstract away most of the differences for you.
If you wanted to, you could probably get away with running Celery on AWS and not use the Amazon MQ service. You could simply perform the install process on an EC2 instance or with Elastic Beanstalk, or even run it in a Linux container on something like ECS or Fargate.
If you were to use EC2, you could probably even get away with using an existing community Marketplace AMI with Celery already provisioned, presumably configurable with cloud-init data.
Here are blog posts and other questions from people setting up Celery-based queue up on AWS:
Using EC2
Using ElasticBeanstalk
Using Fargate
Using ECS
Hope this helps! If you need any additional information or support on this question, feel free to reach out!
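If you do go the "run Celery yourself, backed by ElastiCache Redis" route, the wiring is essentially just a broker URL. A minimal sketch, where the endpoint hostname is a placeholder (in practice you'd copy it from the ElastiCache console or the DescribeCacheClusters API):

```python
# Sketch: building the Redis broker URL that Celery expects, from an
# ElastiCache endpoint. The hostname below is a placeholder, not a real cluster.
def redis_broker_url(endpoint: str, port: int = 6379, db: int = 0) -> str:
    """Return a redis:// broker URL for the given ElastiCache endpoint."""
    return f"redis://{endpoint}:{port}/{db}"

broker = redis_broker_url("my-cache.abc123.use1.cache.amazonaws.com")
# You would then hand this to Celery, e.g.:
#   app = Celery("tasks", broker=broker)
```

The application code stays unchanged; only the broker location moves from a self-hosted Redis to the managed endpoint.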

On AWS, run an AWS CLI command daily

I have an AWS CLI invocation (in this case, to launch a configured EMR cluster to do some steps and then shut down) but I'm not sure how to go about running it daily.
I guess one way to do it is an EC2 micro instance running a cron job, or an ECS task in a micro instance that launches the command, but that all seems like it might be overkill. It looks like there's also a way to do it in Lambda, but from what I can tell it'd be kludgy.
This doesn't have to be a good long-term solution, something that's suitable until I can do it right (Data Pipelines) would work just fine.
Suggestions?
If it is not a strict requirement to use the AWS CLI, you can use one of the AWS SDKs instead from a scheduled Lambda function:
Schedule a CloudWatch Events rule using a cron expression
When triggered, the rule invokes a Lambda function
Implement the Lambda function so it calls EMR using one of the supported SDKs (e.g. the EMR class in the AWS JavaScript SDK)
Make sure you have the IAM configuration in place
A full example is available in Schedule AWS Lambda Functions Using CloudWatch Events
Kludgy? Yes, some configuration is needed. However, if you take into account the amount of work required to launch EC2 / ECS (and to make sure it re-launches in the event of failure), I'd say it evens out.
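As a rough illustration of the Lambda side (in Python with boto3 rather than the JavaScript SDK mentioned above), a handler might call EMR's RunJobFlow with a transient cluster that shuts itself down after its steps. The cluster name, release label, and instance types below are placeholders, and the Steps list is left for your own work:

```python
# Sketch of a Lambda handler that launches a transient EMR cluster, intended
# to be triggered daily by a CloudWatch Events cron rule such as
# cron(0 6 * * ? *). All names and sizes here are placeholders.

def build_job_flow_params(name: str = "nightly-cluster") -> dict:
    """Parameters for the EMR RunJobFlow call, trimmed to the essentials."""
    return {
        "Name": name,
        "ReleaseLabel": "emr-5.36.0",              # placeholder EMR release
        "Instances": {
            "InstanceCount": 3,
            "MasterInstanceType": "m5.xlarge",
            "SlaveInstanceType": "m5.xlarge",
            # Terminate the cluster automatically once all steps finish
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        "Steps": [],                                # your EMR steps go here
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }

def handler(event, context):
    import boto3  # imported inside the handler so the sketch stays self-contained
    emr = boto3.client("emr")
    response = emr.run_job_flow(**build_job_flow_params())
    return response["JobFlowId"]
```

The Lambda's execution role needs permission to call elasticmapreduce:RunJobFlow and to pass the two EMR roles.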
I'm not sure about the whole task you are doing, but to avoid running it manually, and to avoid setting up yet more resources in AWS (as you mentioned), I would create a simple job in a Continuous Integration (CI) server like Jenkins, Bamboo, or CircleCI (the list can go on). I would assume you might already have a CI server running, so why not use it?

AWS ECS custom CloudWatch metrics

I'm looking for a way to publish custom metrics over the StatsD protocol from Amazon Elastic Container Service. I've found documentation on how to set up the Amazon CloudWatch agent on EC2, and that works well. However, I'm failing to find the correct configuration for the Dockerfile. Quite probably some set of custom IAM permissions will also be required.
Is it possible to have Docker containers working from AWS ECS with custom metrics using StatsD reporting to AWS CloudWatch?
Rather than building your own container, you can use the one provided by Amazon. This article explains how, including a link to an example daemon service task configuration.
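For reference, the application side of this is small: the CloudWatch agent, when configured with a statsd section, listens on UDP port 8125 by default, and each metric is a one-line datagram. A minimal sketch (metric names are placeholders; in ECS the agent typically runs as a sidecar or daemon task reachable from the app container):

```python
import socket

# Minimal StatsD emitter sketch. The StatsD wire format is "name:value|type",
# e.g. "requests:1|c" for a counter; the agent's default listen port is 8125.

def statsd_line(name: str, value: int, metric_type: str = "c") -> str:
    """Format one StatsD datagram, e.g. 'requests:1|c'."""
    return f"{name}:{value}|{metric_type}"

def send_metric(name: str, value: int,
                host: str = "127.0.0.1", port: int = 8125) -> None:
    """Fire-and-forget a metric over UDP to the CloudWatch agent."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(statsd_line(name, value).encode("ascii"), (host, port))
    finally:
        sock.close()
```

The IAM side is on the agent's task role (cloudwatch:PutMetricData), not on the application container, since only the agent talks to the CloudWatch API.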