My application is written in the Django framework and uses Celery and Redis for asynchronous tasks. I would like to autoscale workers according to load or the number of messages in the queue. For this, I would like to use the options AWS provides.
What are the AWS equivalents of Celery and Redis?
ElastiCache is Amazon's managed service for in-memory data stores. ElastiCache lets you provision a high-performance data store using either Redis or Memcached as the engine.
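As a rough illustration of pointing an existing Django/Celery setup at ElastiCache, the settings fragment below builds the broker URLs from environment variables. The hostname is a made-up placeholder, and the `CELERY_` prefix assumes the Celery app is configured with `app.config_from_object("django.conf:settings", namespace="CELERY")`; adjust both to your project.

```python
import os

# Hypothetical ElastiCache endpoint; replace with the primary endpoint of your own cluster.
REDIS_HOST = os.environ.get("REDIS_HOST", "my-cache.example.cache.amazonaws.com")
REDIS_PORT = int(os.environ.get("REDIS_PORT", "6379"))

# Read by Celery when the app loads Django settings with namespace="CELERY" (assumption).
CELERY_BROKER_URL = f"redis://{REDIS_HOST}:{REDIS_PORT}/0"      # database 0 for the task queue
CELERY_RESULT_BACKEND = f"redis://{REDIS_HOST}:{REDIS_PORT}/1"  # database 1 for task results
```

Nothing else in the Celery code should need to change, since ElastiCache speaks the normal Redis protocol.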
I have not personally used Celery, but I know of it as a task queue that works with a message broker like RabbitMQ. In this case, the likely managed alternative for the broker would be Amazon MQ. Amazon MQ uses Apache ActiveMQ under the hood, but the API layer should abstract away most of the differences for you.
If you wanted to, you could probably get away with running Celery on AWS without using the Amazon MQ service. You could simply perform the install process on an EC2 instance or with Elastic Beanstalk, or even run it in a Linux container on something like ECS or Fargate.
If you were to use EC2, you could probably even get away with using an existing community Marketplace AMI with Celery already provisioned and presumably configurable with cloud-init data.
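For the autoscaling part of the question, the core of any approach is turning a queue length into a worker count. Here is a minimal sketch of that calculation; the function name, the `tasks_per_worker` ratio and the min/max limits are all assumptions to tune for your workload. With a Redis broker, Celery keeps its default queue in a Redis list named `celery`, so the length can be read with `LLEN celery` and passed in.

```python
import math


def desired_worker_count(queue_length, tasks_per_worker=10, min_workers=1, max_workers=20):
    """Return how many workers to run for a given backlog (hypothetical helper)."""
    if tasks_per_worker <= 0:
        raise ValueError("tasks_per_worker must be positive")
    # One worker per `tasks_per_worker` queued messages, rounded up.
    needed = math.ceil(queue_length / tasks_per_worker)
    # Clamp so the fleet never drops below the floor or exceeds the ceiling.
    return max(min_workers, min(max_workers, needed))
```

The result could then be applied as the desired count of an ECS service or an Auto Scaling group, or the queue length could be published as a custom metric and left to a scaling policy.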
Here are blog posts and other questions from people setting up a Celery-based queue on AWS:
Using EC2
Using ElasticBeanstalk
Using Fargate
Using ECS
Hope this helps! If you need any additional information or support for this question, feel free to reach out.
Related
I am a newbie in AWS and totally confused about deployment. Here I have
React for the front end, Node.js for the API, MongoDB for the database and Redis for the session store.
Can I use 1 EC2 for every service, or
divide every service onto a different EC2?
Can I use an Elastic Beanstalk environment?
Which is the better option for scaling and updating without downtime in the future?
Can I use 1 EC2 for every service?
It depends on your case, but the best way to utilize the underlying EC2 instance is to run multiple services on a single instance for the Node.js and front-end apps, since container-based Node.js applications benefit most from this. In this case, ECS blue-green deployment with dynamic container ports can help you scale with zero downtime.
Divide every service as different EC2
For a Node.js-based application this approach does not help you much. For Redis and MongoDB it makes sense if you are planning for clustering and replicas. These services also need persistent storage, so the storage stays on each instance. My suggestion is to run Redis and MongoDB in daemon mode and the application in replica mode, since it is the application that will go through blue-green deployment, not Redis or the database.
ECS provides two service scheduling strategies for such cases:
REPLICA—
The replica scheduling strategy places and maintains the desired
number of tasks across your cluster. By default, the service scheduler
spreads tasks across Availability Zones. You can use task placement
strategies and constraints to customize task placement decisions. For
more information, see Replica.
DAEMON—
The daemon scheduling strategy deploys exactly one task on each active
container instance that meets all of the task placement constraints
that you specify in your cluster. When using this strategy, there is
no need to specify a desired number of tasks, a task placement
strategy, or use Service Auto Scaling policies. For more information
ecs_services
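To make the difference between the two strategies concrete, here is a small sketch that builds the arguments you would pass to an ECS `create_service` call (for example through boto3). The helper name and the service and task definition names are invented for illustration. The point is that REPLICA needs a desired count while DAEMON does not.

```python
def service_params(name, task_definition, strategy, desired_count=None):
    """Build keyword arguments for an ECS create_service call (hypothetical helper)."""
    params = {
        "serviceName": name,
        "taskDefinition": task_definition,
        "schedulingStrategy": strategy,
    }
    if strategy == "REPLICA":
        # REPLICA keeps a fixed number of tasks spread across the cluster.
        if desired_count is None:
            raise ValueError("REPLICA services need a desired_count")
        params["desiredCount"] = desired_count
    elif strategy != "DAEMON":
        # DAEMON runs one task per container instance, so no count is set.
        raise ValueError(f"unknown scheduling strategy: {strategy}")
    return params


# Application in replica mode, data stores in daemon mode (names are placeholders).
api_service = service_params("node-api", "node-api:1", "REPLICA", desired_count=3)
redis_service = service_params("redis", "redis:1", "DAEMON")
```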
We have a setup with multiple containers running Node.js services (node:11-alpine Docker image) deployed in AWS ECS Fargate.
We already have a running ElasticSearch instance collecting logs from non-Fargate applications. I would like to pass the logs from the Fargate containers into this ElasticSearch instance, but I am having a hard time figuring out the best approach.
1)
It seems one way is to stream the logs from CloudWatch --> Lambda --> ElasticSearch. It seems a bit overkill - isn't there another way to do this?
2)
I was hoping I could run a Logstash Docker instance that could collect the logs from the containers, but I am not sure if this is possible when running Fargate?
3)
Should I install something like FileBeat on each container and let that send the logs?
Any help is appreciated.
It seems one way is to stream the logs from CloudWatch --> Lambda --> ElasticSearch. It seems a bit overkill - isn't there another way to do this?
If you're looking for an AWS-based managed solution, that is one of the ways. You don't really need to write a Lambda function, AWS does it for you. Although, you have to bear the cost of Lambda and CloudWatch.
There is another solution that is recommended by AWS and that is the use of fluent-bit as a sidecar container to export logs directly to Elasticsearch/OpenSearch from other containers running within a service. Using this solution, you save money by not using AWS CloudWatch. This solution also provides better results with regard to the loss of logs upon failure.
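As a sketch of what the sidecar setup can look like, the snippet below builds the `containerDefinitions` part of an ECS task definition using FireLens, which is the ECS mechanism for routing container logs through Fluent Bit. The function name, the application image, the Elasticsearch host and the index name are placeholders, and the Fluent Bit image tag is only an assumption; check the current recommended image before using it.

```python
import json


def firelens_container_definitions(app_image, es_host, es_port=443, index="app-logs"):
    """Return container definitions for an app plus a Fluent Bit log router (illustrative)."""
    log_router = {
        "name": "log_router",
        "image": "amazon/aws-for-fluent-bit:latest",  # assumed image name and tag
        "essential": True,
        # Marks this container as the FireLens log router for the task.
        "firelensConfiguration": {"type": "fluentbit"},
    }
    app = {
        "name": "app",
        "image": app_image,
        "essential": True,
        "logConfiguration": {
            # stdout/stderr of this container is handed to the log router.
            "logDriver": "awsfirelens",
            # Passed to Fluent Bit's "es" output plugin; ECS expects string values.
            "options": {
                "Name": "es",
                "Host": es_host,
                "Port": str(es_port),
                "Index": index,
                "tls": "On",
            },
        },
    }
    return [log_router, app]


definitions = firelens_container_definitions("node:11-alpine", "es.example.internal")
task_definition_json = json.dumps({"containerDefinitions": definitions}, indent=2)
```

Because the app container uses the `awsfirelens` driver instead of `awslogs`, nothing is written to CloudWatch for it.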
I was hoping I could run a Logstash docker instance that could collect the logs from the containers but I am not sure if this is possible when running Fargate?
Yes, that is possible if you run that container alongside the other containers in the same task.
Should I install something like FileBeat on each container and let that send the logs?
You can use Fluent Bit, Filebeat, Fluentd, Functionbeat, or Logstash as you like.
Note: If you're thinking of running your own logs exporter container like Logstash, Fluent Bit, etc, don't enable CloudWatch logging to save money as you're not going to use that.
I have a Django web app. I am planning to deploy it on AWS.
I am using Celery and the RabbitMQ queue manager for my application.
I have read about the AWS services.
I have two options:
1) AWS Elastic Beanstalk or
2) Create a Linux EC2 instance and install PostgreSQL, Celery, RabbitMQ etc.
So which is better to use?
AWS EC2 is always a better option as it gives you complete access to the OS and direct access to the attached data storage. This will help you manage your application in a much more efficient way. Also, an EC2 instance is not limited to hosting a single application; it can host as many applications as you require (depending on the capacity/instance type of the server). This will let you tweak the web server proxy as well.
In case of Beanstalk you do not get similar options, you have to manage the applications with the options that are available to you.
To summarise:
In case you want complete control of your application, use EC2.
If you are looking for a managed service wherein not much control is required you can opt for Beanstalk. Personally I would like to have the entire control over my application ;)
I'm using AWS DynamoDB, Lambda, Elasticsearch and ElastiCache (Redis). I want to bring all these services offline for local development. I wonder, is there a Docker container for all these services?
Perhaps! There's a (set of) Docker containers that claim they provide local implementations of popular AWS services: localstack.
Edit: For lambda specific things there's also Docker Lambda!
I've never actually used these Docker containers, but have wanted to. (For my own development I try to use commodity services instead of vendor-specific ones. So MongoDB instead of DynamoDB, and sure, we might use ElastiCache to run our Redis cluster, but that just means in local development we can use Redis directly. Having said that, that's not everyone's cup of tea, and it may not be possible for some things.)
We use docker for most AWS Services for local development except for AWS Lambda.
We use the service containers as below:
MySQL for RDS MySQL
Redis for ElastiCache
ElasticSearch for AWS ElasticSearch
fake-s3 for S3
ActiveMQ for mocking SQS and SNS topics (The implementation for SNS topics is a bit ugly, but abstracted out in one place with some if-else statements)
Most of our services make use of docker-compose to start the dependent containers. We've included these containers on our build server too, to run our integration tests.
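One way to wire an application to these containers, sketched here with invented names, is a small lookup that returns the local container endpoint by default and lets an environment variable override it in real environments. The ports are the usual defaults for each tool and are assumptions; use whatever your docker-compose file actually publishes.

```python
import os

# Local stand-ins for the AWS services listed above (ports are assumed defaults).
LOCAL_ENDPOINTS = {
    "mysql": "mysql://localhost:3306",       # RDS MySQL
    "redis": "redis://localhost:6379",       # ElastiCache
    "elasticsearch": "http://localhost:9200",
    "s3": "http://localhost:4567",           # fake-s3
    "activemq": "tcp://localhost:61616",     # mock for SQS/SNS
}


def endpoint_for(service, environ=None):
    """Return the endpoint for a service, preferring e.g. REDIS_ENDPOINT if it is set."""
    environ = os.environ if environ is None else environ
    override = environ.get(f"{service.upper()}_ENDPOINT")
    return override or LOCAL_ENDPOINTS[service]
```

In local development and on the build server nothing is set, so the containers are used; in a deployed environment the variables point at the managed services.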
In addition, most of the containers we are using needed some modifications to the original Dockerfile. So we had to push our changes to our own Docker repository, which we maintain using ECS.
For Lambda, we do not use a docker container as we start our own HTTP server locally to test and invoke the lambda function.
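A local HTTP wrapper around a Lambda handler can be quite small. The sketch below is not the setup described above, just one possible shape using only the Python standard library; `handler` is a stand-in for your real Lambda function and the port is arbitrary.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def handler(event, context):
    """Hypothetical Lambda handler under test; echoes the event back."""
    return {"statusCode": 200, "body": json.dumps({"echo": event})}


class LambdaProxy(BaseHTTPRequestHandler):
    """Turns each POST body into an event and invokes the handler with it."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        raw = self.rfile.read(length) if length else b"{}"
        result = handler(json.loads(raw), None)  # no Lambda context object locally
        payload = json.dumps(result).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def make_server(port=9001):
    """Create the local server; call serve_forever() on the result to start it."""
    return HTTPServer(("127.0.0.1", port), LambdaProxy)
```

Running `make_server().serve_forever()` and posting JSON to `http://127.0.0.1:9001/` then exercises the function much as a test client would.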
Been using this setup for over a year without any issues. You may also want to refer to this blog from IFTTT to get some more ideas around DNS resolution and how to make this effort better.
I'm developing a prototype IoT application which does the following
Receive/Store data from sensors.
Web application with a web-based IDE for users to deploy simple JavaScript/Python scripts which get executed in Docker containers.
Data from the sensors gets streamed to these containers.
User programs can use this data to do analytics, monitoring etc.
The logs of these programs are outputted to the user on the webapp
Current Architecture and Services
Using one AWS EC2 instance. I chose EC2 because I was trying to figure out the architecture.
Stack is Node.js, RabbitMQ, Express, MySQL, MongoDB and Docker
I'm not interested in using AWS IoT services like AWS IoT and Greengrass
I've ruled out Heroku since I'm using other AWS services.
Questions and Concerns
My goal is prototype development for a Beta release to a set of 50 users
(hopefully someone else will help/work on a production release)
As far as possible, I don't want to spend a lot of time migrating between services since developing the product is key. Should I stick with EC2 or move to Beanstalk?
If I stick with EC2, what is the best way to handle small-medium traffic? Use one large EC2 machine or many small micro instances?
What is a good way to manage containers? Is it worth using Swarm for container management? What if I have to use multiple instances?
I also have small scripts that hold status information about the sensors, which is needed by the web app and other services. If I move to multiple instances, how can I make these scripts available to multiple machines?
The above question also holds good for servers, message buses, databases etc.
My goal is certainly not production release. I want to complete the product, show I have users who are interested and of course, show that the product works!
Any help in this regard will be really appreciated!
If you want to manage Docker containers with the least hassle in AWS, you can use the Amazon ECS service to deploy your containers, or else go with Beanstalk. Also, you don't need to use Swarm in AWS; ECS will work for you.
It's always better to scale out rather than scale up, using small to medium size EC2 instances. However, the challenge you will face here is managing and scaling the underlying EC2 instances as well as your Docker containers. This pushes you toward large EC2 instances, so that you can set EC2 scaling aside and focus on Docker scaling (which will add costs for you).
Another alternative for the web application part is the AWS Lambda and API Gateway stack with the Serverless Framework, which needs the least operational overhead and comes with DevOps tools.
You may keep your web app on Heroku and run your IoT server in AWS EC2 or AWS Lambda. Heroku itself runs on AWS, so this split setup will not affect performance. You can ease the inconvenience of "sitting on two chairs" by writing a Terraform script which provisions both the EC2 instance and the Heroku app and ties them together.
Alternatively, you can use Dockhero add-on to run your IoT server in a Docker container alongside your Heroku app.
PS: I'm a Dockhero maintainer