Large number of Docker containers and AWS resources

My application allows users to create multiple Node-RED instances using Docker containers; each container handles one instance, and the number of containers could reach 65,000. So:
How can I configure host resources to handle that number of containers?
If my host memory is 16 GB, how do I check whether it can handle all those instances?
If not, should I increase the memory size, or should I create another AWS instance and use it?

You don't. 16 GB / 65,000 ≈ 0.25 MB per container; there is no way a Node.js process will start, run, and do anything useful in that. So you are never going to run 65,000 containers on a single machine.
As I said in the comment on your previous question, you need to spend some time running tests to determine how much memory/CPU your typical use case needs, and then set resource limits on each container when it starts to cap it at those values. How to set resource limits is covered in the Docker docs.
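As a minimal sketch, assuming the official nodered/node-red image and purely illustrative limit values (derive real numbers from your own load tests):

```sh
# Run one Node-RED instance with hard resource caps.
# --memory sets the RAM ceiling; --cpus limits CPU share.
# The container name and the limit values are assumptions.
docker run -d --name nodered-user42 \
  --memory=256m \
  --cpus=0.25 \
  nodered/node-red
```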
If your workload is variable, then you may need different sizes of container depending on the workload.
You then do the basic maths: (CPU in the box) / (%CPU per container) and (memory in the box) / (memory per container); whichever gives the smaller number is the maximum number of containers you can run. (You will also need to include some overhead for monitoring/management and other housekeeping tasks.)
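For instance, a back-of-the-envelope calculation for the memory side, where the 2 GB host overhead and 256 MB per-container figures are assumptions standing in for your measured values:

```sh
# Max containers by memory: (host RAM - overhead) / RAM per container.
HOST_MEM_MB=$((16 * 1024))
OVERHEAD_MB=$((2 * 1024))
PER_CONTAINER_MB=256
echo $(( (HOST_MEM_MB - OVERHEAD_MB) / PER_CONTAINER_MB ))   # prints 56
```

That is 56 containers, not 65,000; repeat the same division for CPU and take the smaller result.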
It's up to you to decide what approach to sizing/choosing cloud infrastructure to support that work load and what will be economically viable.
(Or you could outsource this to people who have already done a bunch of this, e.g. FlowForge Inc. [full disclosure: I am lead developer at FlowForge].)

Related

Why not get 2 nano instances instead of 1 micro instance on AWS?

I'm choosing instances to run microservices on an AWS EKS cluster.
Reading about it in this article and taking a look at the AWS docs, it seems that choosing many small instances instead of one larger instance results in a better deal.
There seems to be no downside to taking, for instance, 2 t3.nano instances (2 vCPU / 0.5 GiB each) vs. 1 t3.micro (2 vCPU / 1 GiB). The price and the total memory are the same, but the total vCPU count grows the more instances you get.
I assume there are some processes running on each machine by default, but I found no place mentioning their impact on the machine's resources or usage. Is it negligible? Is there any advantage to taking one big instance instead?
The issue is whether or not your computing task can be completed on the smaller instances, and also that there is an overhead involved in instance-to-instance communication that isn't present in intra-instance communication.
So, it is all about fitting your solution onto the instances and your requirements.
There is no right answer to this question. It depends on your specific workload, and you have to try out both approaches to find out what works best for your case; each has advantages and disadvantages.
For example, if the OS takes 200 MB on each instance, you will be left with only about 600 MB usable across both nano instances combined, vs. about 800 MB on the single micro instance.
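Roughly, using the AWS-listed sizes (512 MB per t3.nano, 1024 MB per t3.micro) and the illustrative 200 MB OS figure from above:

```sh
# Usable memory after a fixed per-instance OS overhead.
echo "2x t3.nano:  $(( 2 * (512 - 200) )) MB"   # prints 624 MB
echo "1x t3.micro: $(( 1024 - 200 )) MB"        # prints 824 MB
```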
When the cluster scales out, initializing 2 nano instances might take roughly twice as long as initializing one micro instance to provide the same additional capacity for handling the extra load.
Also, as noted by Cargo23, inter-instance communication might increase the latency of your application.

Containerized Monitoring

I am monitoring an EC2 instance using Prometheus, Node Exporter, and Grafana, each of which runs in its own container.
I thought that putting each of the 3 monitoring tools in its own container would make the system easier to set up, and that the system would run faster. Is it true that the system would run faster?
To start all 3 monitoring tools, I have a docker-compose file that brings up all 3 containers at once, along the lines of the sketch below. Would it be beneficial to run all 3 monitoring tools in one container instead of in separate containers?
Here is the current system architecture: [diagram omitted]
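For reference, a minimal docker-compose sketch of such a setup; the image tags and host ports are assumptions (the tools' defaults), not the asker's actual file:

```sh
# Write a minimal compose file for the three monitoring containers
# and start them together.
cat > docker-compose.yml <<'EOF'
services:
  prometheus:
    image: prom/prometheus
    ports: ["9090:9090"]
  node-exporter:
    image: prom/node-exporter
    ports: ["9100:9100"]
  grafana:
    image: grafana/grafana
    ports: ["3000:3000"]
EOF
docker compose up -d
```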
It is probably not a good idea to try to combine the processes into a single container.
There are various reasons:
- Containers aren't like VMs; there's no significant overhead in running X processes in X containers.
- In fact, keeping the processes in separate images (and thus running them in separate containers) permits each to be maintained (e.g. patched) distinctly...
- ...and it permits you to potentially secure each process distinctly.
- It's considered good practice to run one process per container.
- Keeping the processes in their own containers permits you to, e.g., run one Prometheus container and one Grafana container serving multiple monitored containers...
- ...and, following on from this, it permits more flexibility in relocating containers too, e.g. potentially dropping Grafana in favour of a hosted Grafana service.

How does Google Cloud Run spin up instantly

So, I really like the idea of serverless. I came across Google Cloud Functions and Google Cloud Run.
Google Cloud Functions are individual functions, so from a broad perspective I assume Google must be securely running them on one huge Node.js server that contains all the functions of all its customers and fulfils each request via a unique URL. Google takes care of the cost of this one big server and charges users for every hit their function gets, so it's pay-per-use. That makes sense.
But when it comes to Cloud Run, I fail to understand how it works. Obviously the container must not always be running, because then Google would simply charge on a monthly basis instead of per hit, just like a normal VM where a Docker image is deployed. But no, in reality they charge per hit, which means they spin up the container when a request arrives. So I don't understand how they spin it up so fast. Users have the flexibility of running any sort of environment, which means the container could contain literally anything, maybe a full-fledged Linux OS. How does it load up that environment so quickly and fulfil the request? Maybe it maintains the state of the machine and shuts it down when not in use, but even then, restoring that state should take a decent amount of time.
So how does Google really do it? How is it able to spin up a customer's container in practically no time?
The idea of fast-starting sandboxed containers (which run on their own kernel for security reasons) has been around for a pretty long time. For example, Intel Clear Containers and Firecracker provide fast startup through various optimizations.
As you can imagine, implementing something like this would require optimizations at many layers (scheduling, traffic serving, autoscaling, image caching...).
Without giving away Google's secrets, we can probably talk about image storage and caching: just like VMs use an initramfs to pre-cache the state of the VM instead of reading all the files from disk and following the full boot sequence, similar tricks can be done with containers.
Google uses a similar solution for Cloud Run, called gVisor. It's a user-space virtualization technique (not an actual VMM or hypervisor). To run containers in a Linux-like environment, gVisor doesn't need to boot a Linux kernel from scratch, because gVisor reimplements the Linux kernel interface in Go!
You'll find many optimizations on other serverless platforms across most cloud providers (such as keeping a container instance around, or predictively scheduling instances before the load arrives). I recommend reading the "Peeking Behind the Curtains of Serverless Platforms" paper to get an idea of the problems in this space and what cloud providers are trying to optimize for speed and cost.
You have to decouple the containers from the VMs. The second link from Dustin is great, because if you understand the principles of Kubernetes (and even more so if you have a look at Knative), it's easy to translate them to Cloud Run.
You have a pool of resources (Nodes in Kubernetes; in fact, VMs with CPU and memory), and on these resources you can run containers: 1, 2, maybe 1000 per VM; you don't know and you don't care. The power of the container is its ability to be packaged with all the dependencies it needs. I say "packaged" because your container isn't an OS; it only contains the dependencies needed to interact with the host OS.
To prevent any problems between containers from different projects/customers, each container runs inside a sandbox (gVisor, the first link from Dustin).
So there is no VM to start and stop, and no VM to create when you deploy a Cloud Run service; it's only a start of your container on existing resources. This is also why you need a stateless container, without disks attached to it.
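To make that concrete, deploying a stateless container to Cloud Run is one command; the service, project, and image names below are made up for illustration:

```sh
# Deploy a prebuilt container image as a Cloud Run service.
gcloud run deploy my-service \
  --image gcr.io/my-project/my-image \
  --region us-central1 \
  --allow-unauthenticated
```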
Do you want 3 "secrets"?
It's exactly the same thing with Cloud Functions! Your code is packaged into a container and deployed exactly as it is with Cloud Run.
The underlying platform that manages Cloud Functions and Cloud Run is the same; that's why their behavior and features are so similar! Cloud Functions takes longer to deploy because Google needs to build the container for you, while with Cloud Run the container is already built.
Your Compute Engine instance is also managed as a container on Google's infrastructure! More generally, everything is a container at Google!

App Deployment Container Size vs Quantity

I am learning how to deploy a Meteor app on Galaxy, and I am really confused by all this container stuff.
I am trying to understand when it would be better to scale an app by increasing container size rather than by adding more containers.
If I had a lightweight chat-room website, for example, why would I ever need to upgrade the container size if I can just add more small containers? In the end, isn't the sum of processing power what matters?
2 × 0.5 containers = 1 × 1 container
The cost of doing it either way is the same.
Also, if a user modifies the database while using the app in one container, won't the instances of the app running in other containers take a while to notice the change? If users in different containers were chatting together, that would be a problem, wouldn't it? How would you avoid it?
The only way I can make sense of this is:
Either a lack of CPU and RAM, or the capacity to handle parallel requests, is going to create a need to scale.
If the app receives too much traffic you get more containers.
If the app uses too much CPU and RAM you get a bigger container.
But how can an app ever get too big to fit in one container? Won't the CPU and RAM used by the app depend on how many users are using that instance of the app? Couldn't you solve the problem by adding more containers, spreading out the users, and decreasing per-container CPU and RAM usage that way?
And why would you need more containers to handle more requests? Won't a bigger container also handle more requests?
The question you are asking is too broad to answer definitively. In your case, both strategies will work if implemented effectively: increasing container size (vertical scaling) and adding more containers (horizontal scaling).
But horizontal scaling is usually the better option. When you launch a cluster of containers, they run behind an AWS Elastic Load Balancer, and if you enable sticky sessions there will be no problem with chat rooms.
Read this
http://docs.aws.amazon.com/ElasticLoadBalancing/latest/DeveloperGuide/elb-sticky-sessions.html
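(That link covers the Classic Load Balancer; as a sketch, enabling sticky sessions on a newer Application Load Balancer target group looks like this, with a placeholder ARN:)

```sh
# Enable duration-based (load-balancer-generated cookie) stickiness.
aws elbv2 modify-target-group-attributes \
  --target-group-arn arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/chat-app/0123456789abcdef \
  --attributes Key=stickiness.enabled,Value=true \
               Key=stickiness.type,Value=lb_cookie \
               Key=stickiness.lb_cookie.duration_seconds,Value=86400
```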
These are also quite good to read:
http://docs.aws.amazon.com/AmazonECS/latest/developerguide/cloudwatch_alarm_autoscaling.html
https://aws.amazon.com/blogs/compute/powering-your-amazon-ecs-clusters-with-spot-fleet/
Then there is the question of the database. I assume you will be using a single parent database for your app, so all the containers will be reading from the same DB; you don't need to worry about changes applied from one container being visible from another. If proper DB optimization is in place, there will be no issue.

AWS autoscaling an existing instance

This question has conceptual and practical parts.
Conceptually, I'd like to know whether using the autoscaling functionality is equivalent to simply increasing the compute power by a factor of the number of added instances.
Practically... how does this work? I have one running instance, with its database sitting on an LVM volume composed of multiple EBS volumes, and similarly for all website data. Judging from the load on the instance, I either need to upgrade to a more powerful instance or introduce autoscaling. Is each scaled-out instance a copy of the running server? If so, how is the database (etc.) kept consistent?
I've read through the AWS documentation and still haven't got the full picture. I could set up an autoscaling group, which would probably clear my doubts, but I am very leery of doing this with a production server.
Any nudges in the right direction would be welcome.
Normally, if you have a solution that uses a database and several machines, the database is typically not on any of the machines but is instead hosted separately, with each worker machine pointing to the same database. If you are on the AWS platform already, DynamoDB and RDS are both good options for this.
In theory, for some applications, upgrading the size of the single machine will give you the same power as adding several smaller machines. But increasing the size of the single machine, while usually the easiest thing to do at first, should not be considered autoscaling and has its own drawbacks. Here are some things to consider:
Using multiple machines instead of one big one gives you some fault tolerance: one or more machines can go down, and if your solution is properly designed, new machines will spin up to replace them.
Increasing the size of a single-machine solution means you are probably paying too much. If you size that single machine big enough to handle peak workloads, then at other times (maybe most of the time) you are paying for a bigger machine than you need. If you set up your autoscaling solution properly, more machines come online in response to increasing demand and then terminate when that demand decreases; you only pay for the power you need, when you need it. A sketch of creating such a group follows below.
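For a flavour of the moving parts, creating an Auto Scaling group with the AWS CLI might look like this; the group name, launch template, and subnet IDs are placeholders:

```sh
# Create an Auto Scaling group that keeps 2-10 instances running,
# launched from a pre-existing launch template.
aws autoscaling create-auto-scaling-group \
  --auto-scaling-group-name web-asg \
  --launch-template LaunchTemplateName=web-template \
  --min-size 2 --max-size 10 --desired-capacity 2 \
  --vpc-zone-identifier "subnet-aaaa1111,subnet-bbbb2222"
```

You would pair this with scaling policies (e.g. CloudWatch alarms on CPU) so capacity actually tracks demand.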
When your solution is designed in this manner, you need to think of all of the worker machines as ephemeral: likely to disappear at any time. So you need to build your solution differently. Besides using a hosted database (like DynamoDB or AWS RDS), you also should not store any data on the machines in your auto-scaling group that doesn't also live somewhere else. For example, if part of your app allows users to upload images, you don't store them on the instances; you store them in S3. The same applies to any other new data that comes in.
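For instance, a hypothetical upload handler would push the file straight to a bucket rather than keeping it on local disk; the bucket name and paths here are made up:

```sh
# Copy a user upload to S3 instead of relying on instance-local storage.
aws s3 cp /tmp/uploads/avatar.png s3://my-app-user-uploads/user42/avatar.png
```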
You need to be able to figuratively 'pull the plug' at any instant on any of the machines in your ASG without losing data.
Ultimately, a properly set up auto-scaling solution will likely serve you better. But without doubt it is simpler to just "buy a bigger machine", and the extra money you spend running that bigger machine may be more than offset by the time and effort you don't have to spend re-architecting your solution to run properly in an autoscaling environment. The unique requirements of your solution will ultimately decide which approach is better.