I have been benchmarking RethinkDB's performance on an i3.metal EC2 instance, with all 8 NVMe SSDs in a RAID 0 array, and am finding the performance lacking.
The setup I have is 3 clients subscribed to the changes on a table. One client progressively increases its writes per second, and the writes get sent to the other two clients via a changefeed.
This setup reaches about 1.5k writes per second before it becomes erratic and starts to lag. With no clients and no changefeed subscriptions, I can get about 2.5k writes per second.
There were published benchmarks showing RethinkDB handling roughly 8k reads/writes per node on inferior hardware. I am just wondering: is there some special way to configure the i3.metal to get better performance? Or is there an entirely different cloud service that will provide better RethinkDB performance?
Related
I have a newbie question here; I'm new to cloud and Linux. I'm using Google Cloud now and wondering, when choosing a machine configuration:
What if my machine is too slow? Will it make the app crash, or just slow it down?
How fast should my VM be? The image below shows
the last 6 hours of CPU usage for a Python script I'm running. It's obviously using less than 2% of the CPU most of the time, but there is a small spike. Should I care about the spike? Also, how high should my CPU usage get before I upgrade? If a script I'm running uses 50-60% of the CPU most of the time, I assume I'm safe, but what's the maximum before you would upgrade?
What if my machine is too slow? Will it make the app crash, or just slow it down?
It depends.
Some applications will just respond more slowly. Some will fail if they have timeout restrictions. Some will begin to thrash, which means the app suddenly becomes very, very slow.
A general rule, which varies among architects, is to never consume more than 80% of any resource. I use a 50% rule so that my service can handle burst traffic or denial-of-service attempts.
Based on your graph, your service is fine. The spike is probably normal system processing. If the spike went to 100%, I would be concerned.
Once your service consumes more than 50% of a resource (CPU, memory, disk I/O, etc.), it is time to upgrade that resource.
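The headroom rule above is easy to make concrete. A minimal sketch, where `resources_over_budget` and the resource names are hypothetical, just to show the check:

```python
def resources_over_budget(usage, budget=0.5):
    """Return the resources whose utilisation exceeds the headroom budget.

    `usage` maps a resource name to its utilisation as a fraction (0.0-1.0).
    The 0.5 default mirrors the 50% rule above; swap in 0.8 for the 80% rule.
    """
    return [name for name, frac in usage.items() if frac > budget]

resources_over_budget({"cpu": 0.62, "memory": 0.31, "disk_io": 0.55})
# -> ["cpu", "disk_io"]
```

Under the 50% rule, CPU and disk I/O in that example are due for an upgrade even though neither is anywhere near 100%.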
Also, consider that there are other services that you might want to add. Examples are load balancers, Cloud Storage, CDNs, firewalls such as Cloud Armor, etc. Those types of services tend to offload requirements from your service and make it more resilient, available and performant. The biggest plus is that your service is usually faster for the end user. Some of those services are so cheap that I almost always deploy them.
You should choose a machine family based on your needs. Check the link below for details and recommendations.
https://cloud.google.com/compute/docs/machine-types
If CPU is your concern, you should create a managed instance group that automatically scales based on CPU usage. A target of 80-85% is usually a good maximum CPU value. Check the link below for details.
https://cloud.google.com/compute/docs/autoscaler/scaling-cpu
You should also consider the availability your workload needs in order to keep costs efficient. See the link below for other useful info.
https://cloud.google.com/compute/docs/choose-compute-deployment-option
I have a REST API web server, built in .NET Core, that has data-heavy APIs.
It is hosted on AWS EC2. I have noticed that the average response time for certain APIs is ~4 seconds, and if I turn up the EC2 specs, the response time drops to a few milliseconds. I guess this is expected. What I don't understand is that even when I load test the APIs on a lower-end instance, the server never crosses 50% CPU or memory utilization. So what is the correct technical explanation for the APIs performing faster, if the lower-end instance never reaches 100% CPU or memory utilization?
There is no simple answer; there are so many EC2 variations that you first need to figure out what is slowing down your API.
When you 'turn up' your EC2 instance, you are getting some combination of more memory, faster CPU, faster disk and more network bandwidth, and we can't tell which of those 'more' features is improving your performance. Different instance classes are optimized for different problems.
It could be as simple as the better network bandwidth, or it could be that your application is disk-bound and the better instance you chose is optimized for I/O performance.
Knowing which resource your instance is lacking would help you decide which instance type to upgrade to. Or, as you have found, you can just upgrade to something 'bigger' and be happy with the performance (at the tradeoff of a higher cost).
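One cheap way to narrow this down, assuming you can reproduce a slow call locally: compare wall-clock time against CPU time for the handler. A rough sketch, not a profiler; `profile_call` is a hypothetical helper:

```python
import time

def profile_call(fn, *args, **kwargs):
    """Run fn once and report wall time vs CPU time.

    A large gap between the two means the call spent most of its time
    waiting (network, disk, database), so a faster CPU alone won't help
    much; similar values mean the call is genuinely CPU-bound.
    """
    wall0, cpu0 = time.perf_counter(), time.process_time()
    fn(*args, **kwargs)
    wall = time.perf_counter() - wall0
    cpu = time.process_time() - cpu0
    verdict = "mostly waiting on I/O" if cpu < 0.5 * wall else "mostly CPU-bound"
    return wall, cpu, verdict

# A handler that waits on a remote service behaves like a sleep:
wall, cpu, verdict = profile_call(time.sleep, 0.2)
# verdict -> "mostly waiting on I/O"
```

If the verdict is "mostly waiting on I/O", the speedup you saw from the bigger instance likely came from its network or EBS bandwidth rather than its CPUs.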
A c5.2xlarge instance has 8 vCPU. If I run os.cpu_count() (Python) or std::thread::hardware_concurrency() (C++) they each report 8 on this instance. I assume the underlying hardware is probably a much bigger machine, but they are telling me what I have available to me, and that seems useful and correct.
However, if my ECS task requests only 2048 CPU units (2 vCPU), it will still see 8 from the queries above on a c5.2xlarge machine. My understanding is that Docker will limit my task to using only "2 vCPU worth" of CPU if other busy tasks are running, but it lets me see the whole instance.
It seems like this would lead to tasks creating too many threads/processes.
For example, if I'm running 2048 CPU tasks on a c5.18xlarge instance, each task will think it has 72 cores available. They will all create way too many threads/processes overall; it will work but be inefficient.
What is the best practice here? Should programs somehow know their ECS task reservation? And create threads/processes according to that? That seems good except then you might be under-using an instance if it's not full of busy tasks. So I'm just not sure what's optimal there.
I guess the root issue is that Docker will throttle the total amount of CPU used, but it cannot adjust the number of threads/processes you create. And using too many or too few threads/processes is inefficient.
See the discussion of CPU usage in the ECS docs.
See also this long blog post: https://goldmann.pl/blog/2014/09/11/resource-management-in-docker/
There is a huge difference between virtualization technologies and containers, and a clear understanding of both will help. That said, an application should be configurable if you want to deploy it in different environments.
I would suggest adding an optional config value that tells the application it may only use a certain number of CPU cores. If that value is not provided, it falls back to auto-detection.
Once you have this option, you can provide it when defining the ECS task, which will fix the problem you are facing.
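A sketch of that fallback chain in Python, assuming a hypothetical `APP_CPU_LIMIT` environment variable as the explicit config; the files read are the standard cgroup v2 and v1 quota locations a container would see:

```python
import math
import os

def available_cpus(env_var="APP_CPU_LIMIT"):
    """Number of CPUs the app should size its thread/process pools to.

    Order: explicit config (hypothetical APP_CPU_LIMIT env var) >
    cgroup CPU quota (what the container was actually given) >
    os.cpu_count() (the whole instance, as a last resort).
    """
    explicit = os.environ.get(env_var)
    if explicit:
        return int(explicit)
    # cgroup v2: cpu.max holds "<quota> <period>", or "max <period>" if unlimited
    try:
        with open("/sys/fs/cgroup/cpu.max") as f:
            quota, period = f.read().split()
        if quota != "max":
            return max(1, math.ceil(int(quota) / int(period)))
    except OSError:
        pass
    # cgroup v1: separate quota/period files; quota is -1 when unlimited
    try:
        with open("/sys/fs/cgroup/cpu/cpu.cfs_quota_us") as f:
            quota = int(f.read())
        with open("/sys/fs/cgroup/cpu/cpu.cfs_period_us") as f:
            period = int(f.read())
        if quota > 0:
            return max(1, math.ceil(quota / period))
    except OSError:
        pass
    return os.cpu_count() or 1
```

With a 2048-unit (2 vCPU) ECS task on a c5.18xlarge, the cgroup branch would report 2 instead of 72, so a worker pool sized from this value matches what Docker will actually let the task use.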
I've written a simple batch file that starts Apache and sends a curl request to my server at start-up. I am using Windows Server 2016 on an n-4 Compute Engine instance.
I've noticed that the two identical machines require vastly different start-up times. One sends the message in just 40s; the other takes almost 80s. In the console both appear to start at the same time, but the reality is different, since the slower one is inaccessible via Remote Desktop tools for 80s.
The second machine was created from a disk image of the first one. What factors contribute to the start-up time? Where should I trim the fat?
The delay could occur if the instances are in different regions, or if the second instance runs some additional memory-intensive applications or has extra customizations. The boot disk type for the instance also contributes to the boot time. Are you getting any information from the logs about this delay during start-up? You could also compare traceroute results on both instances to see if there is a delay at some point in the network.
At the moment, I have a single c4.large (3.75GB RAM, 2 vCPU) instance in my workers cluster, currently running 21 tasks for 16 services. These tasks range from image processing to data transformation, with most sending HTTP requests too. As you can see, the instance is quite well utilised.
My question is, how do I know how many tasks to place on an instance? I am placing up to 8 tasks for a service, but I'm unsure whether this results in a speed increase, given that they share the same underlying instance. How do I find the optimal placement?
Should I put many chefs in my kitchen, or will just two get the food out to customers faster?
We typically run lots of smaller-sized servers in our clusters, like 4-6 t2.smalls for our workers, with 6-7 tasks on each. The main reason for this is not to speed up processing but to reduce the blast radius of a server going down.
We've quite often seen a server simply fail an instance health check and have AWS take it down. Having the workers spread out reduces the effect on the system.
I agree with the other answers' 80% rule. But you never want a single host for any kind of critical application; if it goes down, you're screwed. I also think it's better to use larger servers because of their increased network performance. You should look into a host with enhanced networking, especially since you say you have a lot of HTTP work.
Another thing to consider is disk I/O. If you pile too many tasks onto a host and there is a failure, ECS will try to schedule them all somewhere else. I have had servers crash because too many tasks were scheduled at once and burned through disk credits.