I am trying to load test a web service with 1000 users using JMeter. I can see CPU usage is around 30% once all 1000 users are injected, but the maximum response time is around 12 seconds.
My question is: if the CPU is not 100% utilized, shouldn't the maximum response time be no more than a few seconds?
It is good that you are monitoring CPU usage on the application server side. However, the devil may live somewhere else: for instance, the application can run short of available RAM, swap intensively, or hit network or disk IO limits, so you should consider these metrics as well. The aforementioned ones (and more) can be monitored using the JMeter PerfMon Plugin.
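If installing the PerfMon server agent is not an option on a given host, here is a minimal sketch of sampling the same kind of metrics on the application server (assuming Python and the third-party psutil package are available there; the interval and output format are arbitrary choices):

```python
import time

import psutil  # third-party: pip install psutil

INTERVAL_S = 5  # sampling interval; pick whatever resolution you need


def sample():
    mem = psutil.virtual_memory()
    swap = psutil.swap_memory()
    disk = psutil.disk_io_counters()
    net = psutil.net_io_counters()
    return {
        "cpu_pct": psutil.cpu_percent(interval=None),
        "ram_used_pct": mem.percent,
        "swap_used_pct": swap.percent,
        "disk_read_bytes": disk.read_bytes,
        "disk_write_bytes": disk.write_bytes,
        "net_sent_bytes": net.bytes_sent,
        "net_recv_bytes": net.bytes_recv,
    }


if __name__ == "__main__":
    # One line per interval; correlate the timestamps with your JMeter results.
    while True:
        print(int(time.time()), sample())
        time.sleep(INTERVAL_S)
```

The IO counters are cumulative since boot, so diff consecutive samples if you want per-interval rates.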
Basically the same as point 1, but applied to the JMeter side of things. JMeter tests are very resource intensive, and if JMeter lacks resources it will send requests much more slowly. So make sure you monitor baseline OS health metrics on the JMeter machine(s) as well. Also, 1000 users is quite a high load, so double-check that your test follows JMeter Best Practices.
It may be a bottleneck in your application, i.e. it isn't capable of providing a good response time for 1000 concurrent users. Use a relevant profiler tool to detect the longest-running functions and investigate the root cause.
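For example, if the backend happens to be Python, here is a minimal profiling sketch using the standard-library cProfile (handle_request below is a hypothetical stand-in for whatever serves one request):

```python
import cProfile
import pstats


def handle_request():
    # Hypothetical stand-in for the code path that serves one request.
    return sum(i * i for i in range(200_000))


profiler = cProfile.Profile()
profiler.enable()
for _ in range(100):
    handle_request()
profiler.disable()

# Show the 10 most expensive functions by cumulative time.
pstats.Stats(profiler).sort_stats("cumulative").print_stats(10)
```

Equivalent profilers exist for other stacks (e.g. VisualVM for Java, dotnet-trace for .NET); the point is to find the functions where the 12 seconds are actually being spent.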
I built a website for sending SMS, but at some point the SMS report (success and fail) becomes very slow.
I tried to find the problem in the query, but I could not find it.
Here is the workflow I experienced:
I sent 200,000 SMS and the report (pending to success or fail) worked properly: about 300 SMS reports per second were updated from pending to success or fail.
After sending about 200,000 SMS, the report gets messed up, although sending SMS still works fine.
The report speed drops to about 1 SMS report updated per second.
So I checked the AWS statistics to see how the server is doing.
Currently, I am using 4 CPUs and 16 GB of memory.
At some point, CPU usage gets close to 100% and network traffic is very high.
Is it a problem with the query that I wrote, or should I increase the CPU, RAM, and SSD?
I would love to know whether the problem is in my code (the query) or not,
and if it is due to the high volume of SMS traffic, whether the CPU and RAM need to be increased.
Thank you
This is a non-trivial question.
Looking at the graphs of system performance, you can see a few brief spikes where CPU usage gets quite high, but outside of those spikes the overall CPU usage isn't bad.
So, first, I'd look at the code. How does it work? Are there obvious places where the CPU may spike?
Without seeing the code, or even knowing what language it's in, there's not much more I can do to help.
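If you want a cheap starting point before reaching for a full profiler, here is a minimal sketch of timing the report-update path to see whether the database query itself is what eats the CPU time (Python shown for illustration; the cursor and SQL below are hypothetical placeholders for your own stack):

```python
import logging
import time
from contextlib import contextmanager

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("report-timing")


@contextmanager
def timed(label, slow_threshold_s=0.5):
    """Log how long a block takes; warn when it crosses the threshold."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        level = logging.WARNING if elapsed >= slow_threshold_s else logging.INFO
        log.log(level, "%s took %.3f s", label, elapsed)


# Hypothetical usage around the status-update query:
# with timed("update delivery status"):
#     cursor.execute(
#         "UPDATE sms_report SET status = %s WHERE message_id = %s",
#         (status, message_id),
#     )
```

If the per-row update itself turns out to be slow once you pass 200,000 rows, that points at the query and its indexes rather than at the instance size.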
I have a newbie question here. I'm new to cloud and Linux; I'm using Google Cloud now and am wondering about choosing a machine config.
What if my machine is too slow? Will it make the app crash, or just slow it down?
How fast should my VM be? The image below shows the last 6 hours of CPU usage for a Python script I'm running. It's obviously using less than 2% of the CPU most of the time, but there's a small spike. Should I care about the spike? Also, how high should my CPU usage get before I upgrade? If a script I'm running uses 50-60% of the CPU most of the time, I assume I'm safe, but what's the maximum before you upgrade?
What if my machine is too slow? Will it make the app crash, or just slow it down?
It depends.
Some applications will just respond more slowly. Some will fail if they have timeout restrictions. Some applications will begin to thrash, which means the app suddenly becomes very, very slow.
A general rule, which varies among architects, is to never consume more than 80% of any resource. I use a 50% rule so that my service can handle burst traffic or denial-of-service attempts.
Based on your graph, your service is fine. The spike is probably normal system processing. If the spike went to 100%, I would be concerned.
Once your service consumes more than 50% of a resource (CPU, memory, disk I/O, etc) then it is time to upgrade that resource.
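As a rough illustration of that rule of thumb, here is a minimal sketch that flags resources which have crossed the chosen ceiling (assuming the third-party psutil package; the 50% figure is just the rule above, swap in 80% if you prefer):

```python
import psutil  # third-party: pip install psutil

CEILING_PCT = 50.0  # per the 50% rule of thumb above


def resources_over_ceiling():
    """Return (resource, percent_used) pairs above the ceiling."""
    usage = {
        "cpu": psutil.cpu_percent(interval=1),
        "memory": psutil.virtual_memory().percent,
        "disk /": psutil.disk_usage("/").percent,
    }
    return [(name, pct) for name, pct in usage.items() if pct > CEILING_PCT]


if __name__ == "__main__":
    for name, pct in resources_over_ceiling():
        print(f"{name} is at {pct:.1f}% - time to look at upgrading this resource")
```

A point-in-time check like this only means something if you run it during your busiest periods, which is why the monitoring graphs you already have are the better source.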
Also, consider that there are other services that you might want to add. Examples are load balancers, Cloud Storage, CDNs, firewalls such as Cloud Armor, etc. Those types of services tend to offload requirements from your service and make your service more resilient, available and performant. The biggest plus is your service is usually faster for the end user. Some of those services are so cheap, that I almost always deploy them.
You should choose a machine family based on your needs. Check the link below for details and recommendations.
https://cloud.google.com/compute/docs/machine-types
If CPU is your concern, you should create a managed instance group that automatically scales based on CPU usage. Usually 80-85% is a good target CPU utilization value. Check the link below for details.
https://cloud.google.com/compute/docs/autoscaler/scaling-cpu
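As a rough illustration of what CPU-based autoscaling does (it adds or removes instances to keep the group's average CPU near the target), here is a toy calculation; it only shows the idea, not the actual GCP autoscaler logic:

```python
import math


def recommended_instances(current_instances, avg_cpu_utilization, target_utilization=0.85):
    """Estimate how many instances keep average CPU at the target utilization."""
    needed = current_instances * avg_cpu_utilization / target_utilization
    return max(1, math.ceil(needed))


# e.g. 3 instances averaging 95% CPU against an 85% target -> scale out to 4.
print(recommended_instances(3, 0.95))  # 4
```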
You should also consider the availability your workload needs in order to keep costs efficient. See the link below for other useful info.
https://cloud.google.com/compute/docs/choose-compute-deployment-option
I am running a load test with 1 million threads through JMeter and want to view the resource utilization on the server. For that, I am using the PerfMon plugin. When the report is generated, it shows 100% CPU usage, but when I view my AWS server dashboard, CPU utilization was only 1.5% at its maximum point.
Any thoughts?
I can only think of an incorrect configuration of the PerfMon Metrics Collector; given proper configuration it should be precise enough and capture CPU usage (as well as other metrics) in real time.
Demo:
For the above demo I used s-tui on the AWS side, sha1sum to stress the CPU, and pretty much the default configuration of the PerfMon Metrics Collector.
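If you want a quick cross-check without extra tools, here is a minimal sketch that burns CPU on every core (plain Python; similar in spirit to the sha1sum trick), so you can verify that PerfMon and the AWS CloudWatch CPU graph both show the load:

```python
import hashlib
import multiprocessing
import os


def burn(seed: bytes):
    # Hash in a tight loop to keep one core busy.
    digest = seed
    while True:
        digest = hashlib.sha1(digest).digest()


if __name__ == "__main__":
    workers = [
        multiprocessing.Process(target=burn, args=(os.urandom(16),), daemon=True)
        for _ in range(multiprocessing.cpu_count())
    ]
    for w in workers:
        w.start()
    input("Burning CPU on all cores; press Enter to stop...\n")
```

If CloudWatch shows the load but PerfMon does not (or vice versa), you know which side's configuration to fix. Also keep in mind that CloudWatch's default granularity is a multi-minute average, which can flatten short spikes.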
More information just in case: How to Monitor Your Server Health & Performance During a JMeter Load Test
I have a REST API web server, built in .NET Core, that has data-heavy APIs.
This is hosted on AWS EC2. I have noticed that the average response time for certain APIs is ~4 seconds, and if I turn up the AWS EC2 specs, the response time goes down to a few milliseconds. I guess this is expected; what I don't understand is that even when I load test the APIs on a lower-end instance, the server never crosses 50% utilization of memory/CPU. So what is the correct technical explanation for the APIs performing faster if the lower-end instance never reaches 100% utilization of memory/CPU?
There is no simple answer; there are so many EC2 variations that you first need to figure out what is slowing down your API.
When you 'turn up' your EC2 instance, you are getting some combination of more memory, faster CPU, faster disk and more network bandwidth, and we can't tell which of those 'more' features is improving your performance. Different instance classes are optimized for different problems.
It could be as simple as the better network bandwidth, or it could be that your application is disk-bound and the better instance you chose is optimized for I/O performance.
Knowing which feature your instance is lacking would help you decide which type of instance to upgrade to; or, as you have found out, you can just upgrade to something 'bigger' and be happy with the performance (at the tradeoff of it being more expensive).
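One cheap way to start narrowing it down from the client side is to separate time-to-first-byte (roughly server/database processing plus one round trip) from total transfer time (dominated by payload size and network bandwidth for data-heavy responses). A minimal sketch, with Python standing in for any HTTP client and a hypothetical endpoint URL:

```python
import time

import requests  # third-party: pip install requests

URL = "https://example.com/api/heavy-report"  # hypothetical data-heavy endpoint

start = time.perf_counter()
with requests.get(URL, stream=True, timeout=60) as resp:
    resp.raise_for_status()
    first_byte = None
    total_bytes = 0
    for chunk in resp.iter_content(chunk_size=64 * 1024):
        if first_byte is None:
            first_byte = time.perf_counter() - start
        total_bytes += len(chunk)
total = time.perf_counter() - start

print(f"time to first byte: {first_byte:.3f} s")
print(f"total time:         {total:.3f} s for {total_bytes} bytes")
```

A large gap between the two numbers points at payload size and network bandwidth; a large time to first byte points at the application or its database, which a .NET profiler (or disk metrics in CloudWatch) can then break down further.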
I have been doing load testing at my company for a long time, but throughput has never exceeded 500 transactions per minute. I have a more challenging problem right now.
Problem:
My company will start a campaign and ask its customers a question, and the first correct answer will be rewarded. Analysts expect 100,000 requests per second at the global maximum (that doesn't seem realistic to me, but this can be negotiated).
Resources:
JMeter
2 different service requests
5 slave machines with 8 GB RAM each
80 Mbps internet connection
3.0 GHz CPUs
A master computer with the same specs as the slaves.
Question:
How can I simulate this scenario, and is it possible at all? What are the limitations? What should the load model look like? Are there any alternatives?
Any comment is appreciated.
Your load test always needs to represent real usage of the application by real users, so first of all carefully implement your test scenario to mimic a real human using a real browser, with all its behaviour like:
cookies
headers
embedded resources (proper handling of images, scripts, styles, fonts, etc.)
cache
think times
etc.
Make sure your test follows JMeter Best Practices (see the launch sketch after this list), i.e.:
being run in non-GUI mode
all listeners are disabled
JVM settings are optimised for maximum performance
etc.
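A minimal sketch of such a non-GUI launch, driven from Python for consistency with the other examples in this thread (the test plan name, result file names and heap size are assumptions; JVM_ARGS is the standard environment variable the jmeter startup script reads for extra JVM options):

```python
import os
import subprocess

env = dict(os.environ)
# Give the JMeter JVM a fixed, generous heap; tune this to your engine's RAM.
env["JVM_ARGS"] = "-Xms4g -Xmx4g"

subprocess.run(
    [
        "jmeter",                        # assumes jmeter is on the PATH
        "-n",                            # non-GUI mode
        "-t", "campaign_test_plan.jmx",  # hypothetical test plan
        "-l", "results.jtl",             # raw results file
        "-e", "-o", "report",            # generate the HTML dashboard at the end
    ],
    env=env,
    check=True,
)
```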
Once done, you need to set up monitoring of your JMeter engines' health metrics such as CPU, RAM, swap usage, network and disk IO, JVM stats, etc., in order to see whether there is headroom to continue. The JMeter PerfMon Plugin is very handy here, as its results can be correlated with the test metrics.
Start your test from 1 virtual user and gradually increase the load until you reach the target throughput, your application under test dies, or the JMeter engine(s) run out of resources, whichever comes first. Depending on the outcome you will either report success, report a defect, or need to request more hosts to use as JMeter engines / upgrade the existing hardware.
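To get a feel for the numbers before ramping up, here is a small sizing sketch (Little's Law; the 100,000 requests/second target and the 5 engines come from the question, while the 200 ms response time is purely an assumption you should replace with a measured value):

```python
import math

TARGET_RPS = 100_000           # analysts' global maximum from the question
ENGINES = 5                    # JMeter slaves available
ASSUMED_RESPONSE_TIME_S = 0.2  # assumption; measure the real value first

# Little's Law: concurrent users ~= throughput * response time
concurrent_users = math.ceil(TARGET_RPS * ASSUMED_RESPONSE_TIME_S)
users_per_engine = math.ceil(concurrent_users / ENGINES)

print(f"~{concurrent_users} concurrent virtual users in total")
print(f"~{users_per_engine} virtual users per JMeter engine")
```

Even under that optimistic response time this works out to thousands of threads per 8 GB engine, and the 80 Mbps uplink may saturate long before the thread count does, so be prepared to ask for more engines and more bandwidth.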