I have a test where users log in, enter a search keyword in the search field, get the results, and finally log out.
Now I want to test concurrency using JMeter, so this is what I came up with:
Test plan
Thread group
+ Login request
+ Synchronizing Timer
+ Search string
+ Synchronizing Timer
+ Logout
I have set the number of threads to 10 and the Synchronizing Timer's group size to 5. So when I run the test, will I get a concurrency of 5 users? Will the remaining 5 users be simultaneous users?
I also have dependent requests when the login page loads. So, to achieve concurrency on login, I added all of those requests to a Transaction Controller and added the Synchronizing Timer as a child of the Transaction Controller. Please let me know if I am doing this right.
Also, please let me know if there is another way to achieve concurrency for a specific action (e.g. 5 users hitting the login button at the same time).
First off, you should distinguish between 'concurrent' and 'simultaneous'. They are normally very similar terms, but in load testing they have different meanings: simultaneous means two or more requests at the same time, while concurrent means two or more threads (scripts) running in parallel.
So, what you are talking about is trying to configure JMeter to simulate multiple simultaneous requests. But actually, there's a much, much better approach than this. Instead of focusing on trying to hit the same request at the same time, which is fiddly in JMeter, you should set up your test to be a realistic representation of the sort of load you want your application to support. If you do that well, using random wait times, throughput controllers and a realistic number of threads, then you will automatically be testing concurrency and, at the same time, running genuine, valid and useful performance tests too.
So, basically: drop the Synchronizing Timer, use a Constant Throughput Timer instead, configure wait times, and then calculate the number of threads needed to generate the desired load.
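As a rough sizing sketch (Little's Law), with made-up placeholder numbers: the thread count you need is the target request rate multiplied by the time one iteration takes.

# Back-of-the-envelope thread-count calculation; all numbers are placeholders.
target_rate = 10.0        # desired requests per second
avg_response_time = 0.5   # seconds per request, measured from a baseline run
think_time = 2.0          # seconds of timer/wait between requests
iteration_time = avg_response_time + think_time

threads = target_rate * iteration_time
print(threads)            # 25.0 -> start with ~25 threads and let a
                          # Constant Throughput Timer cap the rate
                          # (note: that timer is configured in samples per minute)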
The added bonus of this approach is that you will be much less likely to raise false negatives. For example, if you hit your server with 5 simultaneous login requests, you might find that this call is single-threaded and the response times increase. But maybe this doesn't matter; maybe the chances of two login calls arriving at the same time are so small that it is not worth spending time changing the code. This is a very, very important concept in load testing - perhaps the most important: you must have realistic objectives. Without them you could be running tests, finding false bugs and generally wasting time forever.
Related
Development analysis revealed that a certain issue occurred because two concurrent user requests were executed on the server. However, when the same scenario was run with a JMeter script using two threads (users), the issue was not reproduced, even though both threads were synchronized with a Synchronizing Timer on the save-method request and the listeners showed that both threads had the same response time for that particular request.
What could be the cause of this observation, and are there suggestions for improving the test so it can prove or disprove this claim more conclusively?
I can come up with the following assumptions:
The analysis result is not correct
The analysis result is correct but the issue is intermittent or Heisenbug-like
You're not sending "the same" request with JMeter; maybe the payload is incorrect or you're missing a header. If it's possible to obtain a network footprint of the issue, in the form of e.g. a .pcap or .har file, you could compare it with the network footprint produced by JMeter
Concurrent user actions and simultaneous user actions are two different things. Please see the details in this article.
The diagrams in the article illustrate the difference.
JMeter simulates concurrent user actions by default. What you really need is to simulate simultaneous user actions, which you can achieve by adding a Synchronizing Timer to your test plan.
The purpose of the SyncTimer is to block threads until X number of threads have been blocked, and then they are all released at once. A SyncTimer can thus create large instant loads at various points of the test plan.
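For reference, here is a minimal placement sketch in the same outline style as the question; the Synchronizing Timer goes as a child of the sampler you want to fire simultaneously, with its "Number of Simulated Users to Group by" set to the group size (the values below are illustrative).
Test plan
Thread group (10 threads)
+ Login request
+ + Synchronizing Timer (group by 5)
+ Search request
+ Logout request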
I have a Django view whose response can be cached; however, the cache needs to be recycled on every 100th HTTP request to the view.
I cannot use interval-based caching here, since the number of requests will vary with traffic.
How would I implement this? Are there other nice methods besides maintaining a counter (in the DB)?
Here are some ideas / feedback:
You're going to have to centralize something if you need the count to be exact - the Redis idea in the linked solution looks OK if you can't put it in the main DB. If Redis is in your stack, I'd use that (see the sketch at the end of this answer). If the 100 requests can be counted per user and you're using sessions, you could attach a counter to the session.
implementing a counter that counts requests with django
Not centralizing the counter outside the web server would mean your app has to be, and stay, single-threaded to keep the count in memory, and the count would reset whenever the server restarts. Not a great idea IMO...
If you really can't make it work with anything else, you could hack something like a request counter onto your load balancer (...if the load balancer is a single machine you control, and you're comfortable doing that) and pass it as a header for Django to read.
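Here is a minimal sketch of the Redis variant, assuming redis-py; the key names, connection settings and the expensive_render() helper are all placeholders for your own setup.

import redis
from django.http import HttpResponse

r = redis.Redis(host="localhost", port=6379, db=0)   # adjust to your setup

def expensive_render():
    # stand-in for whatever expensive work the real view does
    return "rendered page"

def my_view(request):
    # INCR is atomic, so every worker process shares one exact counter.
    count = r.incr("my_view:request_counter")
    body = r.get("my_view:cached_body")
    if body is None or count % 100 == 0:
        # cache is empty, or this is the 100th request: rebuild and re-cache
        body = expensive_render()
        r.set("my_view:cached_body", body)
    return HttpResponse(body)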
We are experiencing very high response latency with Redis, to the point of not getting any output from the info command through redis-cli.
This server handles requests from around 200 concurrent processes, but it does not store much data (at least as far as we know). When the server is responsive, the info command reports memory usage of around 20 - 30 MB.
When running top on the server during periods of high response latency, CPU usage hovers around 95 - 100%.
What are some possible causes for this kind of behavior?
It is difficult to propose an explanation based only on the provided data, but here is my guess. I suppose that you have already checked the obvious latency sources (the ones linked to persistence), that no Redis command is hogging the CPU in the slow log, and that the size of the job data pickled by Python-rq is not huge.
According to the documentation, Python-rq inserts the jobs into Redis as hash objects and lets Redis expire the related keys (500 seconds seems to be the default value) to get rid of finished jobs. If you have some serious throughput, at some point you will have many items in Redis waiting to be expired, and their number will be high compared to the number of pending jobs.
You can check this point by looking at the number of items to be expired in the result of the INFO command.
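For example, with redis-py (assuming default connection settings), the keyspace section of INFO reports how many keys currently have a TTL pending:

import redis

r = redis.Redis()
keyspace = r.info("keyspace")   # e.g. {'db0': {'keys': 120000, 'expires': 115000, ...}}
for db, stats in keyspace.items():
    print(db, "total keys:", stats["keys"], "keys with a TTL:", stats["expires"])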
Redis expiration is based on a lazy mechanism (applied when a key is accessed) and an active mechanism based on key sampling, which is run in the event loop (in pseudo-background mode, every 100 ms). The point is that while the active expiration mechanism is running, no Redis command can be processed.
To avoid impacting the performance of client applications too much, only a limited number of keys are processed each time the active mechanism is triggered (by default, 10 keys). However, if more than 25% of the sampled keys are found to be expired, it tries to expire more keys and loops. This is how this probabilistic algorithm automatically adapts its activity to the number of keys Redis has to expire.
When many keys are to be expired, this adaptive algorithm can significantly impact the performance of Redis, though. You can find more information here.
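A simplified Python sketch of the cycle described above (this is not the Redis source code; the constants just follow the description given here):

import random

SAMPLE_SIZE = 10          # keys sampled per pass
CONTINUE_RATIO = 0.25     # keep looping while more than 25% of the sample was expired

def active_expire_cycle(volatile_keys, now):
    # volatile_keys: dict of key -> expire_at timestamp, for keys that have a TTL.
    # The whole loop runs inside the event loop, so no command is served meanwhile.
    while volatile_keys:
        sample = random.sample(list(volatile_keys), min(SAMPLE_SIZE, len(volatile_keys)))
        expired = [k for k in sample if volatile_keys[k] <= now]
        for k in expired:
            del volatile_keys[k]          # Redis deletes the key here
        if len(expired) <= CONTINUE_RATIO * len(sample):
            break                         # few expired keys found: stop until the next 100 ms tick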
My suggestion would be to prevent Python-rq from delegating item cleanup to Redis via key expiration. This is a poor design for a queuing system anyway.
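One way to do that with Python-rq is to pass result_ttl when enqueuing, so finished jobs are not left in Redis waiting to be expired. A short sketch, where my_task stands in for your real job function (in practice it must live in a module the worker can import):

from redis import Redis
from rq import Queue

def my_task(x, y):          # stand-in for your real job function
    return x + y

q = Queue(connection=Redis())
# result_ttl=0 discards the job result as soon as the job finishes,
# so nothing is left behind in Redis for the expiry cycle to clean up.
q.enqueue(my_task, 1, 2, result_ttl=0)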
I don't think reducing the TTL is the right way to avoid the CPU usage caused by Redis expiring keys.
Didier makes a good point that the current architecture of Python-rq delegates job cleanup to Redis by using the key-expiry feature, and, as he says, this is not the best way (it is only used when result_ttl is greater than 0).
The problem should then arise when you have a set of keys/jobs whose expiration dates are close to one another, which can happen when you have bursts of job creation.
But Python-rq sets the expiry on a key when the job has finished in a worker, so this doesn't quite add up, because the keys should be spread out over time, with enough distance between them to avoid this situation.
I've only heard about tools like Celery, but I don't know whether it fits my needs or is the best solution available.
Imagine a game like Travian. We initiate a building and have to wait N seconds until the construction is finished. When and how should we complete the construction?
Solution 1: check whether there is an active construction every time a page loads. If queries like that take some time, we can make them asynchronous. If any construction has finished, complete it.
However, this way we are constantly waiting for the user to reload the page. Sure, we can use a cron job to check for constructions to complete from time to time, but cron jobs run at most once a minute, and constructions / attacks etc. must be executed as precisely as possible.
The solution above works, but it has some cons. What are better and RELIABLE ways to perform actions like the ones I mentioned?
Moreover, let's assume that resources need to be regenerated at a rate of X per hour and we need to regenerate them very precisely and fairly often. How can I achieve this without waiting for the page to be refreshed?
Finally, the solution has to work on Webfaction or any other shared hosting. I've heard that Celery doesn't work on Webfaction, or am I mistaken?
Yes, Celery has periodic tasks with second-level resolution:
http://celery.readthedocs.org/en/latest/userguide/periodic-tasks.html
You can also run tasks at specific times with Celery's crontab schedules:
http://celery.readthedocs.org/en/latest/userguide/periodic-tasks.html#crontab-schedules
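For example, a beat schedule in the old-style Celery settings shown in the linked docs, with one seconds-based entry and one crontab entry; the task paths here are hypothetical:

from datetime import timedelta
from celery.schedules import crontab

CELERYBEAT_SCHEDULE = {
    'finish-due-constructions': {
        'task': 'game.tasks.finish_constructions',   # hypothetical task path
        'schedule': timedelta(seconds=30),            # runs every 30 seconds
    },
    'nightly-cleanup': {
        'task': 'game.tasks.cleanup',
        'schedule': crontab(hour=3, minute=0),        # runs daily at 03:00
    },
}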
Also, if you need to check the resource count, I think that is a common part of every request, so your response should look like:
{
    "header": {"resources": {"wood": 1, "stone": 500}},
    "data": {... your real data should be here ...}
}
You need to add a header to the response that contains common information like resource counts, unread messages etc., and handle it properly on the client.
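A rough sketch of that idea with Django's JsonResponse (available from Django 1.7 on); get_resources() is a placeholder for however you compute the counts:

from django.http import JsonResponse

def get_resources(user):
    # placeholder: look up the player's current resource counts
    return {"wood": 1, "stone": 500}

def respond(request, data):
    return JsonResponse({
        "header": {
            "resources": get_resources(request.user),
            "unread_messages": 0,
        },
        "data": data,
    })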
To improve it you can use nginx + ssl + memcache backend.
I am running Django on Apache. I have several client computers which call urllib2.urlopen() and send over some data, which my server processes and immediately sends back a reply for. However, while testing this I found a very tricky issue. I had one client repeatedly send the same data to be processed. The first time it takes around ~20 seconds, the second time it takes about 40 seconds, and the third time I get a 504 (gateway timeout) error. If I try to send data some more, 504 errors randomly pop up. I am pretty sure this is an issue with Postgres, as the function that processes the information makes many database calls; however, I do not know why the performance of Postgres would decline so much. I have tried several database optimization tricks, including this one (http://stackoverflow.com/questions/1125504/django-persistent-database-connection), to no avail.
Thanks in advance.
Edit: The requests are not coming in concurrently. They come in back to back, and each one involves a lot of SELECTs and JOINs, with a few INSERTs and UPDATEs as well. The Apache error logs show that it is just a simple timeout, where the function that processes the client-posted data takes over 90 seconds.
If it's really Postgres, then you should turn on logging of slow statements in the Postgres configuration to find out which statement exactly is taking so much time.
This can be done by setting the configuration property log_min_duration_statement.
Details are in the manual:
http://www.postgresql.org/docs/current/static/runtime-config-logging.html#GUC-LOG-MIN-DURATION-STATEMENT
You say the function makes "many database calls", so I'd start with a very low value, or even 0 to log the duration of all statements; then you should be able to identify the slow ones.
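A quick way to check the current threshold from a Django shell (the value itself is changed in postgresql.conf, followed by a configuration reload):

from django.db import connection

cur = connection.cursor()
cur.execute("SHOW log_min_duration_statement;")
print(cur.fetchone()[0])   # '-1' means slow-statement logging is disabled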
It could also be a locking issue. Maybe the first call does not end its transaction properly and subsequent calls run into a timeout while waiting for a resource.
You can verify this by checking the system view pg_locks after the first call.
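For example, from a Django shell while the slow request is running (or directly in psql), ungranted entries in pg_locks point to a blocked resource:

from django.db import connection

cur = connection.cursor()
cur.execute("SELECT locktype, relation::regclass, mode, pid FROM pg_locks WHERE NOT granted")
for row in cur.fetchall():
    print(row)   # any rows here mean a backend is waiting on a lock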
Have you checked the Apache error logs? Have you set Django's DEBUG = True or ADMINS = ('email@addr.com',) so you can get a detailed error report about what the actual cause of the issue is? If so, how about pasting some of that information here?
Why are you certain that it's Postgres? Have you done any diagnostics to come to that conclusion? If so, please let us know.
Are you running Apache with mod_wsgi? How many processes and threads have you allocated to your Django application?
Also, 20 seconds to process the first transaction is a huge amount of time. Perhaps you could show us the view code that is causing the timeout; we may be able to help there.
I sincerely doubt that Postgres alone is causing the issue. It probably has something to do with the application code or the server configuration.