ec2 instance running locust.io issues - amazon-web-services

I'm trying to run a locust.io load test on an EC2 instance, a t2.micro. I start 50 concurrent users, and initially everything works fine, with the CPU load reaching ~15%. After an hour or so, though, the network out metric drops by about 80%.
Any idea why this is happening? It's certainly not due to CPU credits. Maybe I reached the network limits for a t2 micro instance?
Thanks

Are you sure it's not a CPU credit issue? Can you check your cpu credits over that same time period to see how they look?
Or better yet, run the same test on a non-t2 instance, one that isn't limited in its CPU usage.
A t2.micro starts draining its CPU credit balance once usage goes above about 10% of CPU.
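To see whether credits could plausibly run out during a test, the arithmetic can be sketched roughly like this. The defaults (6 credits earned per hour for a t2.micro, and one credit being one vCPU at 100% for one minute) are assumptions to check against the current AWS documentation, and the function names are made up for illustration.

```python
def net_credits_per_hour(utilization_pct, earned_per_hour=6, vcpus=1):
    """Net CPU credit change per hour; negative means the balance is draining.

    Assumes one credit = one vCPU at 100% for one minute, so an hour at
    utilization_pct costs utilization_pct * 60 / 100 credits per vCPU.
    """
    spent = utilization_pct * vcpus * 60 / 100
    return earned_per_hour - spent


def hours_until_empty(balance, utilization_pct, earned_per_hour=6, vcpus=1):
    """Hours until the credit balance reaches zero, or None if it never drains."""
    net = net_credits_per_hour(utilization_pct, earned_per_hour, vcpus)
    if net >= 0:
        return None
    return balance / -net


if __name__ == "__main__":
    # Example: sustained 15% CPU, starting from a hypothetical balance of 30.
    print(net_credits_per_hour(15))
    print(hours_until_empty(30, 15))
```

How long the instance lasts depends heavily on the starting balance, so the CPUCreditBalance metric for the test window is still the thing to look at.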

Related

AWS EC2 Performance explanation

I have a REST API web server, built in .NetCore, that has data heavy APIs.
It is hosted on AWS EC2. I have noticed that the average response time for certain APIs is ~4 seconds, and if I move to a larger EC2 instance type, the response time goes down to a few milliseconds. I guess this is expected. What I don't understand is that even when I load test the APIs on the lower-end instance, the server never crosses 50% utilization of memory or CPU. So what is the correct technical explanation for the APIs performing faster on the larger instance if the lower-end one never reaches 100% utilization of memory or CPU?
There is no simple answer. There are so many EC2 variations that you first need to figure out what is slowing down your API.
When you 'turn up' your EC2 instance, you are getting some combination of more memory, faster CPU, faster disk and more network bandwidth, and we can't tell which of those is improving your performance. Different instance classes are optimized for different problems.
It could be as simple as the better network bandwidth, or it could be that your application is disk-bound and the bigger instance you chose is optimized for I/O performance.
Knowing which resource your instance is short on will help you decide which type of instance to upgrade to. Or, as you have found out, you can just upgrade to something 'bigger' and be happy with the performance (at the cost of a higher bill).
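One rough way to start narrowing it down is to compare wall-clock time with CPU time for a slow operation: if the CPU time is a small fraction of the wall time, the code is mostly waiting (disk, network, a database) rather than computing. The sketch below is in Python purely for brevity (the API in question is .NET Core), and the helper names are invented for the example.

```python
import time


def wall_and_cpu_seconds(fn, *args, **kwargs):
    """Run fn and return (result, wall_seconds, cpu_seconds) for this process."""
    start_wall = time.perf_counter()
    start_cpu = time.process_time()
    result = fn(*args, **kwargs)
    cpu = time.process_time() - start_cpu
    wall = time.perf_counter() - start_wall
    return result, wall, cpu


def waiting_fraction(wall, cpu):
    """Share of wall time not spent on CPU (0.0 = CPU-bound, near 1.0 = waiting)."""
    if wall <= 0:
        return 0.0
    return max(0.0, 1.0 - cpu / wall)


if __name__ == "__main__":
    # time.sleep stands in for a call that waits on I/O.
    _, wall, cpu = wall_and_cpu_seconds(time.sleep, 0.2)
    print(f"wall={wall:.3f}s cpu={cpu:.3f}s waiting={waiting_fraction(wall, cpu):.0%}")
```

A high waiting fraction would point toward disk or network limits rather than CPU, which fits a server that is slow while sitting below 50% CPU.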

How to minimize Google Cloud launch latency

I have a persistent server that unpredictably receives new data from users, needing about 10 GPU instances to crank at the problem for about 5 minutes, and I send the answer back to the users. The server itself is a cheap always-persistent single CPU Google Cloud instance. When a user request comes in, my code launches my 10 created but stopped Google Cloud GPU instances with
gcloud compute instances start (instance list)
In the rare case that the stopped instances don't exist (sometimes they get wiped), that's detected and they're recreated with
gcloud beta compute instances create (...)
This system all works fine. My only complaint is that even with created but stopped instances, the launch time before my GPU code finally starts to run is about 5 minutes. Most of this is just the time for the instance itself to launch its Ubuntu host and call my code. Once Ubuntu is running, the delay to start the GPU is only about 10 seconds.
How can I reduce this 5 minute delay? I imagine most of it comes from Google having to copy over the 4GB of instance data to the target machine, but the startup time of (vanilla) Ubuntu adds probably 1 more minute. I'm not even sure if I could quantify these two numbers independently. I can only measure the combined 3-7 minute delay from the launch until my code starts responding.
I don't think Ubuntu OS startup time is the major contributor to the startup latency, since I timed an actual machine on my desk with the same Ubuntu and the same GPU from power-on, and it began running my GPU code in 46 seconds.
My goal is to get results back to my users as soon as possible, and that 5 minute startup delay is a bottleneck.
Would making a smaller instance SIZE of say 2GB help? What else can I do to reduce the latency?
2GB is large. That's a heckuva big image. You should be able to cut that down to 100MB, perhaps using Alpine instead of Ubuntu.
Copying 4GB of data is also less than ideal. Given that, I suspect the solution will be more of an architecture change than a code change.
But if you want to take a whack at everything which is NOT about your 4GB of data, there is a capability to prepare a custom image for your VMs. If you can build a slim custom image that will help.
There are good resources for learning more. The two I would start with are:
- Improve GCE Boot Times with Custom Images
- Three steps to Compute Engine startup-time bliss: Google Cloud Performance Atlas
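Before changing the image, it may help to put numbers on where the 5 minutes go. Here is a minimal sketch, assuming the instance's IP is known and that the GPU worker listens on some TCP port (both are assumptions, not something stated in the question). It won't separate disk provisioning from kernel boot, but it does separate "VM reachable" from "my code responding".

```python
import socket
import time


def wait_until(predicate, timeout=600.0, interval=1.0,
               clock=time.monotonic, sleep=time.sleep):
    """Poll predicate() until it is true; return elapsed seconds, or None on timeout."""
    start = clock()
    while True:
        if predicate():
            return clock() - start
        if clock() - start >= timeout:
            return None
        sleep(interval)


def port_open(host, port, timeout=2.0):
    """True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Hypothetical usage, started right after `gcloud compute instances start` returns:
#   t_ssh = wait_until(lambda: port_open(instance_ip, 22))        # OS is up
#   t_app = wait_until(lambda: port_open(instance_ip, app_port))  # worker is up
```

If most of the time passes before port 22 opens, the image and boot path are the place to look; if it passes afterwards, the startup scripts and the worker's own initialization are.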

aws rds 100% cpu 2 vcores

I currently use a T2.Micro RDS instance with SQL Express.
Because a heavy-load application is running, there are times when a single visitor's request takes 30 seconds to complete. This pushes the RDS instance to 100% CPU. As a result, for any other visitor who comes to the website at the same time, while the CPU is at 100%, the website takes much longer to respond.
T2.micro has 1 vCPU.
I'm thinking of upgrading to T2.medium, which has 2 vCPUs.
The question is, if I have 2 vCPUs, will I avoid the bottleneck?
For example, the first visitor with a 30 second request uses vCPU #1, and a second visitor comes at the same time. Will he be using vCPU #2? Will that help my situation?
Also, I did not see any option in AWS RDS to see what CPU that is. Is there an option to choose a faster vCPU somehow?
Thank you.
The operating system's scheduler automatically handles the distribution of running threads across all the available cores, to get as much work done as possible in the least amount of time.
So, yes, a multi-core machine should improve performance as long as more than one query is running. If a single, CPU-intensive, long-running query -- and nothing else -- is running on a 2-core machine, the maximum CPU utilization you'd probably see would be about 50%... but as long as there is more than one query running, each of them will be running on one of the cores at a time, and the system can actually move a thread among the available cores as the workload shifts, to put it on the optimum core.
A t2.micro is a very small server, but t2 is a good value proposition. With all the t2-class machines, you aren't allowed to run 100% CPU continuously, regardless of the number of cores, unless you have a sufficient CPU credit balance available. This is why the t2 is so inexpensive. You need to keep an eye on this metric as well. CPU credits are earned automatically over time, and spent by using CPU. A second motivation for upscaling a t2 machine is that larger t2 instances earn these credits at a faster rate than smaller ones.
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/t2-instances.html
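The "about 50%" figure above is just the ratio of busy cores to total cores. A tiny sketch of that arithmetic (the function name is made up, and it assumes each thread can keep exactly one core fully busy):

```python
def max_utilization_pct(busy_threads, cores):
    """Upper bound on whole-machine CPU % with busy_threads CPU-bound threads."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    # Extra threads beyond the core count cannot add more CPU time.
    return 100.0 * min(busy_threads, cores) / cores


if __name__ == "__main__":
    print(max_utilization_pct(1, 2))  # one long query on a 2-core machine
    print(max_utilization_pct(2, 2))  # two concurrent queries
```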

AWS Elasticache CPU usage exceeding 100%

We have been using AWS Elasticache for our applications. We had initially set a CPU alarm threshold of 22% (4 core node, so effectively about 90% of one core), which is based on the recommended thresholds. But we often see the CPU utilization going well over 25%, to values like 28% or 34%.
What I am trying to understand is how this is theoretically possible, considering Redis is single-threaded. The only explanation I can think of is a maintenance operation running on other cores, which could bump the CPU usage above 25%. Even if the cluster is highly loaded, it should cap CPU usage at 25% and probably start timing out for clients. Can someone help me understand under what scenarios the CPU usage of a single-threaded Redis instance can exceed 100% of one core?
The Redis event loop is single-threaded, but the Redis process itself is not. There are a couple of extra threads to offload some I/O-bound operations. These threads should not consume much CPU.
However, Redis also forks child processes to take care of heavy duty operations like AOF rewrite or RDB save. Each forked process generally consumes 100% of a CPU core (except if the operation is slowed down by I/Os), on top of the Redis event loop consumption.
If you find the CPU consumption regularly high, it may be due to a wrong AOF and RDB configuration (i.e. the Redis instance rewrites the AOF or generates a dump too frequently).
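One way to check whether a spike lines up with a background save is to look at the persistence section of the INFO output while CPU is high. The sketch below only parses INFO text that you have already fetched with whatever client you use. The sample string is fabricated for the example, and I'm assuming INFO is available on your ElastiCache node.

```python
def parse_info(text):
    """Parse Redis INFO output ("key:value" lines, "#" section headers) into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def background_save_running(fields):
    """True if an RDB save or an AOF rewrite child is currently active."""
    return (fields.get("rdb_bgsave_in_progress") == "1"
            or fields.get("aof_rewrite_in_progress") == "1")


# Fabricated sample of the persistence section, for illustration only.
SAMPLE_INFO = """# Persistence
loading:0
rdb_bgsave_in_progress:1
aof_rewrite_in_progress:0
"""

if __name__ == "__main__":
    print(background_save_running(parse_info(SAMPLE_INFO)))
```

If the flag is set during the periods above 25%, the forked child is the likely source of the extra CPU.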

Beanstalk app CPU spikes

We are running a Docker container on AWS Elastic Beanstalk. It ran fine for a few weeks but suddenly started to experience very sharp CPU spikes (from ~5% to ~60% in a matter of minutes), which sometimes drop back down quickly and sometimes stay high long enough to trigger an autoscaling event and spin up a few extra instances (which are terminated some time later, when the CPU spike dies down).
The funny thing is, I wanted to investigate the problem today, so I SSHed into every instance (4 in total) and ran top on all of them, trying to locate the CPU-consuming process, and was surprised to discover that all instances have ~15% CPU busy (system + user combined), while the EBS monitoring page still shows the servers at 60% CPU.
I've measured these figures for the better part of an hour, making sure the monitored CPU load stays high, while the top command still shows low values.
I've also tried measuring CPU for a while using the advice found here - https://askubuntu.com/questions/22021/how-to-log-cpu-load, and got the same very low CPU stats when querying the server directly.
My question is: is it possible that the AWS monitoring system is not showing me accurate data? Is there any way to verify the data displayed on the monitoring page?
Any help would be appreciated.
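For cross-checking from inside the instance, here is a minimal sketch that computes busy CPU from two /proc/stat samples and counts everything that is not idle or iowait (including irq, softirq and steal), not just user + system. It is Linux-only, the helper names are invented for the example, and it is not meant to reproduce how AWS computes its own metric.

```python
import time


def parse_cpu_line(line):
    """Return the counters from the aggregate "cpu" line of /proc/stat as ints."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(x) for x in parts[1:]]


def busy_pct(before, after):
    """Percent of CPU time between two samples that was not idle or iowait."""
    deltas = [b - a for a, b in zip(before, after)]
    # Fields: user nice system idle iowait irq softirq steal (guest is part of user).
    total = sum(deltas[:8])
    idle = deltas[3] + (deltas[4] if len(deltas) > 4 else 0)
    if total <= 0:
        return 0.0
    return 100.0 * (total - idle) / total


def sample_busy_pct(interval=5.0, path="/proc/stat"):
    """Sample /proc/stat twice, interval seconds apart, and return busy percent."""
    with open(path) as f:
        before = parse_cpu_line(f.readline())
    time.sleep(interval)
    with open(path) as f:
        after = parse_cpu_line(f.readline())
    return busy_pct(before, after)


if __name__ == "__main__":
    print(f"{sample_busy_pct():.1f}% busy")
```

Logging this every minute next to the monitoring graph would at least show whether the two views disagree consistently or only during the spikes.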