Concurrency Issue Not Reproducible through a JMeter Script

Development analysis revealed that a certain issue occurred because two concurrent user requests were executed on the server. However, when the same scenario was replayed using a JMeter script with two threads (users), the issue could not be reproduced, even though both threads were synchronized with a Synchronizing Timer on the save-method request and the listeners indicated that both threads had identical response times for that request.
What could be the potential cause of this observation, and are there suggestions for improving the test so that this claim can be proved or disproved more conclusively?

I can come up with the following assumptions:
The analysis result is not correct
The analysis result is correct but the issue is intermittent or Heisenbug-like
You're not sending "the same" request from JMeter; maybe the payload is incorrect or you're missing a header. If it's possible to obtain a network footprint of the issue, e.g. as a .pcap or .har file, you could compare it with the network footprint produced by JMeter (a sketch of such a comparison is shown below).
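
For the .har comparison, something like the following would work; a minimal sketch, assuming the capture has been saved as real_traffic.har (the file name is hypothetical):

    import json

    # Load a HAR capture of the real browser traffic (hypothetical file name).
    with open("real_traffic.har") as f:
        har = json.load(f)

    # Print the method, URL and headers of each request so they can be
    # compared line by line with what JMeter's View Results Tree shows it sent.
    for entry in har["log"]["entries"]:
        req = entry["request"]
        print(req["method"], req["url"])
        for header in req["headers"]:
            print("   ", header["name"], ":", header["value"])
        print()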

Concurrent user actions and simultaneous user actions are two different things. Please see the details in this article; the diagrams there illustrate the difference.
JMeter simulates concurrent user actions by default. What you really need is to simulate simultaneous user actions, which you can achieve by adding a Synchronizing Timer to your test plan.
The purpose of the SyncTimer is to block threads until X number of threads have been blocked, and then they are all released at once. A SyncTimer can thus create large instant loads at various points of the test plan.
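
Conceptually the timer behaves like a barrier. A minimal Python sketch of the same idea, purely illustrative (this is not how JMeter itself is implemented):

    import threading

    # Each thread blocks at the barrier until 5 threads have arrived, then
    # all are released at once, producing a truly simultaneous burst.
    barrier = threading.Barrier(5)

    def user(n):
        barrier.wait()  # corresponds to the Synchronizing Timer
        print(f"user {n} fires the save request now")

    threads = [threading.Thread(target=user, args=(i,)) for i in range(5)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()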

Related

Django(2.11) simultaneous (within 10ms) identical HTTP requests

Consider a POST/PUT REST API (using DRF).
If the server receives request1 and, within a couple of milliseconds, request2 that is identical to request1 in every respect (a duplicate request), is there a way to prevent request2 from being executed using some Django mechanism? Or should I deal with it manually by keeping some state?
Any inputs would be much appreciated.
There isn't anything out of the box, so you would need to write something yourself. A piece of custom middleware (https://docs.djangoproject.com/en/3.0/topics/http/middleware/) would probably be best, as it would run over all of the requests. You would need to capture and examine the requests, so you'd need fast storage of some sort, such as an in-memory store. A sketch of this idea follows below.
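
A minimal sketch of such a middleware, assuming Django's cache framework is backed by a shared store such as Redis or memcached (the class name and time window are made up for illustration):

    import hashlib

    from django.core.cache import cache
    from django.http import HttpResponse

    class DeduplicateRequestMiddleware:
        # Hypothetical middleware: rejects a request whose method, path and
        # body exactly match one seen within the last WINDOW seconds.
        WINDOW = 2  # seconds; assumed, tune to your duplicate pattern

        def __init__(self, get_response):
            self.get_response = get_response

        def __call__(self, request):
            if request.method in ("POST", "PUT"):
                fingerprint = hashlib.sha256(
                    request.method.encode() + request.path.encode() + request.body
                ).hexdigest()
                # cache.add() succeeds only if the key is not already present,
                # so the first request wins and the duplicate gets a 409.
                if not cache.add("dedup:" + fingerprint, 1, self.WINDOW):
                    return HttpResponse(status=409)
            return self.get_response(request)

It would then be registered in the MIDDLEWARE setting. Note that on Django's local-memory cache backend cache.add() is only atomic within one process, so a shared backend is needed when running multiple workers.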
You could also look into Python's asyncio library: https://docs.python.org/3/library/asyncio-sync.html
Another possible solution would be a FIFO message queue configured to support de-duplication based on content. This would turn the request into a deferred process, though, so it may not be suitable for your needs.
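
For example, with an SQS FIFO queue via boto3 (sketch only; the queue URL is hypothetical, and the queue is assumed to have ContentBasedDeduplication enabled, which makes SQS drop messages whose body repeats within the 5-minute deduplication interval):

    import boto3

    sqs = boto3.client("sqs")

    # Hypothetical queue; must be a .fifo queue created with content-based
    # deduplication enabled.
    queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/requests.fifo"

    sqs.send_message(
        QueueUrl=queue_url,
        MessageBody='{"action": "save", "payload": "..."}',  # the request to defer
        MessageGroupId="api-requests",  # required for FIFO queues
    )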

SWF Activity is not completing even though the computation has finished

I'm testing a new SWF workflow, and I've got some activity that makes a RESTful call out to another service. Problem is, I can see through logging that the actual call takes less than a second to complete, but the Activity always times out in SWF (START_TO_CLOSE of 5 mins). Being more specific, the RESTful call is a list call, and when I limit the batch size to a small number, the Activity completes and moves on very quickly. But at some seemingly arbitrary threshold, it chokes completely.
Does anyone have any insight into this? I've read that SWF calls have a size limitation of 1 MB; does anyone know how to find the size of the data my workers are trying to pass to SWF?
After some remote debugging, it turns out the response from the task is too big and the activity is failing silently. The failure occurs when the framework tries to report the response back to SWF, and the SDK calls RespondActivityTaskCompleted. That API has a length restriction on the internal result param:
Length Constraints: Maximum length of 32768.
The validation failure raises an exception that is swallowed internally, so the Activity fails silently and eventually times out.
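
To answer the question about finding the size of the data being passed: one simple check is to measure the serialized result before the worker responds. A minimal sketch (the result value is a hypothetical stand-in for whatever the activity returns):

    import json

    # Hypothetical stand-in for the activity worker's return value.
    result = {"items": list(range(10000))}

    payload = json.dumps(result)
    print(len(payload.encode("utf-8")), "bytes; the SWF result limit is 32768")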
I wouldn't recommend using activity input and output parameters for passing large data sets. SWF is an orchestration technology, not a data-passing one. The standard workarounds are:
Storing the result in a separate store (S3, for example) and passing a reference to it; see the sketch after this list.
Caching the result locally on a machine and routing all subsequent activities to the same host so they have access to the cached result. See the fileprocessing sample for the details of the routing approach.
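
A minimal sketch of the S3 approach in Python with boto3 (the bucket name and helper names are made up; the real activity code would pass only the returned key through SWF):

    import json
    import uuid

    import boto3

    s3 = boto3.client("s3")
    BUCKET = "workflow-results"  # hypothetical bucket

    def store_result(payload):
        # Upload the large result to S3 and return its key; the key is
        # tiny, so it passes easily through SWF's 32 KB result limit.
        key = "results/" + str(uuid.uuid4()) + ".json"
        s3.put_object(Bucket=BUCKET, Key=key, Body=json.dumps(payload).encode())
        return key

    def load_result(key):
        # A downstream activity resolves the reference back into the data.
        body = s3.get_object(Bucket=BUCKET, Key=key)["Body"].read()
        return json.loads(body)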
BTW, have you checked out Cadence, which is an open-source version of SWF with much better client-side libraries?

The rate of control plane requests made by this account is too high

I'm using AWS DynamoDB and it keeps giving me the following error when I try to create a table via https://www.npmjs.org/package/dynamodb:
The rate of control plane requests made by this account is too high
Does anyone know what the reason is?
Thanks
Could you share your code that is calling the create? And does this happen every time, or only sometimes? If you can get insight into whether the CreateTable API call is failing, or a DescribeTable API call is failing, that would be helpful too. If you can log the request ids of all of the requests you're making, and share them on this post, we (the DynamoDB folks) can see if we can get more details on our side.
This error may occur when you create, update, or delete many tables simultaneously (that is, when you issue many control plane API calls at once). This is easy to do in Node.js because of its non-blocking programming model. The error may also happen if you call CreateTable and then immediately call DescribeTable (though this typically doesn't happen).
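
The usual fix is to serialize the control plane calls and wait for each table to become ACTIVE before issuing the next one. The question uses a Node.js client, but the idea is the same in any SDK; a minimal Python/boto3 sketch with hypothetical table names:

    import boto3

    dynamodb = boto3.client("dynamodb")

    # Create tables one at a time instead of firing all CreateTable calls
    # concurrently.
    for name in ["users", "sessions", "events"]:
        dynamodb.create_table(
            TableName=name,
            AttributeDefinitions=[{"AttributeName": "id", "AttributeType": "S"}],
            KeySchema=[{"AttributeName": "id", "KeyType": "HASH"}],
            BillingMode="PAY_PER_REQUEST",
        )
        # Block until the table is ACTIVE before the next control plane
        # request; the waiter polls DescribeTable with backoff.
        dynamodb.get_waiter("table_exists").wait(TableName=name)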

How to generate Concurrent User load in Jmeter

I have a test where users log in, enter a search keyword in the search field, get the results, and finally log out.
Now I want to test concurrency using JMeter, so this is what I came up with:
Test plan
Thread group
+ Login request
+ Synchronizing Controller
+ Search string
+ Synchronizing Controller
+ Logout
I have set the number of threads to 10 and the Synchronizing Controller to 5. So when I run the test, will I get a concurrency of 5 users? Will the remaining 5 users be simultaneous users?
Also, I have dependent requests when the login page loads. So to achieve concurrency on login, I have added all the requests to a Transaction Controller and added the Synchronizing Controller as a child of the Transaction Controller. Please let me know if I am doing it right.
Also, please let me know if there is another way to achieve concurrency for a specific action (e.g. 5 users hitting the login button at the same time).
First off, you should try to distinguish between 'concurrent' and 'simultaneous'. They are normally very similar terms but in load testing they have different meanings. Simultaneous means two or more requests at the same time. Concurrent is two or more threads (scripts) running in parallel.
So what you are talking about is trying to configure JMeter to simulate multiple simultaneous requests. But actually, there's a much better approach than this. Instead of focusing on trying to hit the same request at the same time, which is fiddly in JMeter, you should set up your test to be a realistic representation of the sort of load you want your application to support. If you do that well, using random wait times, throughput controllers, and a realistic number of threads, then you will automatically be testing concurrency while also running genuine, valid, and useful performance tests.
So, basically, drop the Synchronizing Timer, use a Constant Throughput Timer instead, configure wait times, and then calculate the correct number of threads to generate the desired load. A rough calculation is sketched below.
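
The thread count follows from Little's Law: threads = target throughput x average iteration time. A minimal sketch with assumed example numbers (not values from the question):

    # Little's Law: threads = throughput (req/s) * average iteration time (s).
    target_throughput = 10.0  # requests per second to sustain (assumed)
    avg_response_time = 0.8   # seconds per request, as measured (assumed)
    think_time = 2.0          # seconds of simulated user wait time (assumed)

    iteration_time = avg_response_time + think_time
    threads_needed = target_throughput * iteration_time
    print(f"Threads needed: {threads_needed:.0f}")  # -> 28

Note that JMeter's Constant Throughput Timer is configured in samples per minute, so a 10 req/s target would be entered as 600.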
The added bonus of this approach is that you will be much less likely to raise false positives. For example, if you hit your server with 5 simultaneous login requests, you might find that this call is single-threaded and that response times increase. But maybe this doesn't matter; maybe the chances of two login calls arriving at the same time are so small that it is not worth spending time changing the code. This is a very important concept in load testing, perhaps the most important: you must have realistic objectives. Without them you could be running tests, finding false bugs, and generally wasting time forever.

Django/Postgres performance worsening after repeatedly processing the same query

I am running Django on Apache. I have several client computers which call urllib2.urlopen() and send over some data, which my server processes before immediately sending back a reply. However, while testing this I found a very tricky issue. I have one client repeatedly send the same data to be processed. The first time, it takes around 20 seconds; the second time, about 40 seconds; the third time I get a 504 (Gateway Timeout) error, and if I try to send the data some more, further 504 errors randomly pop up. I am pretty sure this is an issue with Postgres, as the function that processes the information makes many database calls, but I do not know why Postgres performance would decline so much. I have tried several database optimization tricks, including this one (http://stackoverflow.com/questions/1125504/django-persistent-database-connection), to no avail.
Thanks in advance.
Edit: The requests are not coming in concurrently; they come in back to back. Each query involves a lot of SELECTs and JOINs, and there are a few INSERTs and UPDATEs as well. The Apache error logs show that it is just a simple timeout, where the function that processes the client-posted data takes over 90 seconds.
If it's really Postgres, then you should turn on the logging of slow statements in the Postgres configuration to find out exactly which statement is taking so much time.
This can be done by setting the configuration parameter log_min_duration_statement.
Details are in the manual:
http://www.postgresql.org/docs/current/static/runtime-config-logging.html#GUC-LOG-MIN-DURATION-STATEMENT
You say the function makes "many database calls", so I'd start with a very low threshold, or even 0 to log the duration of all statements; then you might be able to identify the slow ones.
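
As a complementary check from the Django side, the ORM records every statement and its duration when DEBUG = True; a minimal sketch (process_client_data and payload stand in for whatever view helper does the work):

    from django.db import connection, reset_queries

    # Requires DEBUG = True so Django records each SQL statement.
    payload = {"example": "data"}  # hypothetical input
    reset_queries()
    process_client_data(payload)   # hypothetical helper under test

    slowest = sorted(connection.queries, key=lambda q: float(q["time"]), reverse=True)
    for q in slowest[:10]:
        print(q["time"], q["sql"][:120])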
It could also be a locking issue. Maybe the first call does not end its transaction properly, and subsequent calls run into a timeout while waiting for a resource.
You can verify this by checking the system view pg_locks after the first call.
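
For instance, ungranted locks still waiting after the first call would show up like this; a minimal sketch with hypothetical connection settings:

    import psycopg2

    conn = psycopg2.connect(dbname="mydb", user="postgres")  # hypothetical credentials
    with conn.cursor() as cur:
        # Any row returned here is a backend still waiting on a lock.
        cur.execute("SELECT pid, locktype, mode, granted FROM pg_locks WHERE NOT granted")
        for row in cur.fetchall():
            print(row)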
Have you checked the Apache error logs? Have you set Django DEBUG = True or ADMINS = ('email#addr.com',) so you can get a detailed error report about the actual cause of the issue? If so, how about pasting some of that information here?
Why are you certain that it's postgres? Have you done diagnostics to come to that conclusion? If so, please let us know.
Are you running apache with mod_wsgi? How many processes and threads have you allocated to your django application?
Also, 20 seconds to process the first transaction is a huge amount of time. Perhaps you could show us the view code that is causing the timeout. We may be able to help there.
I sincerely doubt that Postgres alone is causing the issue. It probably has something to do with the application code or the server configuration.