Django JMeter API testing fails due to HTTP connection timeout - django

JMeter config:
Web server:
Server name or IP: 120.0.0.1
Port number: 9000
Timeouts:
Connect: blank
Response: blank
Method: GET
Number of threads (users): 500
Ramp-up period: 1
Loop count: 1 ("Forever" checkbox not checked)
Out of the 500 users, 70 fail.
Here is the code:
from django.http import JsonResponse
from rest_framework.decorators import api_view

# Country and CountrySerializer come from the app's models and serializers (imports not shown in the question)

@api_view(['GET'])
def country_list(request):
    # country = cache.get('country')
    try:
        countryData = Country.objects.values('country_id', 'country_name').all().order_by('country_name')
        # countryData = Country.objects.extra(select={'name': 'country_name', 'id': 'country_id'}).values('id', 'name').all().order_by('country_name')[:5]
        serializer = CountrySerializer(countryData, many=True)
        # cache.set('country', serializer.data, 30)
        return JsonResponse({'data': serializer.data, 'error': 0})
    except (KeyError, Country.DoesNotExist):
        return JsonResponse({'error': 1})
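(Aside: the commented-out cache.get / cache.set calls in this view hint at one way to take load off the database under 500 concurrent users. The following is only a sketch of re-enabling them, assuming a cache backend is configured in settings.CACHES and reusing the 30-second TTL from the original comment; Country and CountrySerializer are the app's own model and serializer.)

from django.core.cache import cache
from django.http import JsonResponse
from rest_framework.decorators import api_view

@api_view(['GET'])
def country_list(request):
    cached = cache.get('country')  # serve the serialized list from cache when available
    if cached is not None:
        return JsonResponse({'data': cached, 'error': 0})
    try:
        countryData = Country.objects.values('country_id', 'country_name').order_by('country_name')
        serializer = CountrySerializer(countryData, many=True)
        cache.set('country', serializer.data, 30)  # keep the result for 30 seconds, as in the commented-out code
        return JsonResponse({'data': serializer.data, 'error': 0})
    except (KeyError, Country.DoesNotExist):
        return JsonResponse({'error': 1})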
and the response is:
Thread Name: country 1-169
Sample Start: 2017-05-26 15:43:44 IST
Load time: 21014
Connect Time: 0
Latency: 0
Size in bytes: 2015
Sent bytes:0
Headers size in bytes: 0
Body size in bytes: 2015
Sample Count: 1
Error Count: 1
Data type ("text"|"bin"|""): text
Response code: Non HTTP response code: java.net.ConnectException
Response message: Non HTTP response message: Connection timed out: connect
HTTPSampleResult fields:
ContentType:
DataEncoding: null

So it looks like you have found the bottleneck in your application. However, it is hard to say what the maximum number of users is that can be served with a reasonable response time, and at which point the errors start occurring.
I would recommend the following amendments:
Make sure you run JMeter and the Django application on different hosts (JMeter is quite resource intensive, and this way you will avoid mutual interference).
Change Ramp-up to something above 1 second (e.g. 500 seconds, so JMeter adds one user each second). The idea is to increase the load gradually; this way you will be able to correlate the increasing response time with the increasing load and determine the saturation and failure points exactly. See JMeter Ramp-Up - The Ultimate Guide for more details.
Change Loop Count to -1 so your requests loop forever.
Set the desired test duration in the "Scheduler" section of the Thread Group.
Monitor OS resource consumption on the Django server, for example with the JMeter PerfMon Plugin.
So when the failure occurs you should know at least the following:
how many concurrent users there were when it happened (you can view this using the Active Threads Over Time listener or the HTML Reporting Dashboard)
whether the failure is caused by a lack of resources such as CPU or RAM. If not, further steps would be examining your backend configuration (e.g. whether the web/database server configuration is suitable for that many connections), checking logs for any suspicious entries, using Python profiling tools to find the reason for the slowness or failure, etc.
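For the last point, a quick way to see where the view spends its time is the standard library's cProfile. The following is only a sketch, meant to be run from python manage.py shell so Django settings are loaded; the import path of the view and the URL are assumptions:

import cProfile
import pstats

from django.test import RequestFactory

from myapp.views import country_list  # assumed import path for the view from the question

request = RequestFactory().get('/countries/')  # hypothetical URL, only used to build the request

profiler = cProfile.Profile()
profiler.enable()
country_list(request)  # run the view once under the profiler
profiler.disable()

# show the 20 most expensive calls by cumulative time
pstats.Stats(profiler).sort_stats('cumulative').print_stats(20)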

Related

Django's infinite streaming response logs 500 in apache logs

I have a Django+Apache server, and there is a view with an infinite streaming response:
from json import dumps
from traceback import print_exc

from django.http import HttpResponse, StreamingHttpResponse

# data_stream and ERROR_WITH_CLASSNAME are defined elsewhere in the project

def my_view(request):
    try:
        return StreamingHttpResponse(map(
            lambda x: f"{dumps(x)}\n",
            data_stream(...)  # yields dicts forever, every couple of seconds
        ))
    except Exception as e:
        print_exc()
        return HttpResponse(dumps({
            "success": False,
            "reason": ERROR_WITH_CLASSNAME.format(e.__class__.__name__)
        }), status=500, content_type="application/json")
When the client closes the connection to the server, there is no cleanup to be done. data_stream will yield one more message which won't get delivered. No harm is done if that message is yielded and not received, as there are no side effects; the overhead from processing that extra message is negligible on our end.
However, after that last message fails to deliver, Apache logs a 500 response code (for 100% of such requests). It's not getting caught by the except block, because print_exc doesn't get called (there are no entries in the error log), so I'm guessing this is Apache failing to deliver the response from Django and switching to 500 itself.
These 500 errors are triggering false-positive alerts in our monitoring system, and it's difficult to differentiate an error due to a client disconnect from an error in the data_stream logic.
Can I override this behavior to log a different status code in the case of a client disconnect?
From what I understand about StreamingHttpResponse, any exceptions raised inside it are not propagated further. This has to do with how the WSGI server works. If you start handling an exception and steal the control, the server will not be able to finish the HTTP response. So the error is handled by the server and printed in the terminal. If you attach a debugger and watch how the exception is handled, you will be able to find the line in wsgiref/handlers.py where your exception is absorbed and taken care of.
I think it is in this file: https://github.com/python/cpython/blob/main/Lib/wsgiref/handlers.py
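One workaround that stays at the application level is to wrap the stream in a generator of your own: per PEP 3333 the WSGI server calls close() on the response iterable when the client goes away, which raises GeneratorExit inside a generator, so you can at least log the disconnect yourself and keep it out of your error monitoring. This does not change the status code Apache writes to its access log; it is only a sketch reusing the names from the question:

import logging
from json import dumps

from django.http import StreamingHttpResponse

logger = logging.getLogger(__name__)


def _stream_body(source):
    # source is the infinite iterator (data_stream(...) in the question)
    try:
        for item in source:
            yield f"{dumps(item)}\n"
    except GeneratorExit:
        # raised when the WSGI server closes the iterable after a client disconnect
        logger.info("client disconnected from stream")
        raise


def my_view(request):
    return StreamingHttpResponse(_stream_body(data_stream(...)))

Because the response body is now a real generator rather than a map object, the server's close() call actually reaches it.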

Sagemaker Batch Transform Error "Model container failed to respond to ping; Ensure /ping endpoint is implemented and responds with an HTTP 200 status"

My task is to do large scale inference via Sagemaker Batch Transform.
I have been following the tutorial: bring your own container, https://github.com/aws/amazon-sagemaker-examples/blob/master/advanced_functionality/scikit_bring_your_own/scikit_bring_your_own.ipynb
I have encountered many problems and solved them by searching Stack Overflow. However, there is one problem that still causes trouble.
When I run the same code on the same dataset using 20 EC2 instances simultaneously, sometimes I get the error "Model container failed to respond to ping; Please ensure /ping endpoint is implemented and responds with an HTTP 200 status", and sometimes I don't.
What I find most frustrating is that I already do essentially nothing in /ping (see the code below):
#app.route("/ping", methods=["GET"])
def ping():
"""Determine if the container is working and healthy. In this sample container, we declare it healthy if we can load the model successfully."""
# health = ScoringService.get_model() is not None # You can insert a health check here
# status = 200 if health else 404
status = 200
return flask.Response(response="\n", status=status, mimetype="text/csv")
How could the error still happen?
I read in some posts (e.g., How can I add a health check to a Sagemaker Endpoint?) that the ping response should be returned within a 2-second timeout.
How can I increase the ping response timeout? And in general, what can I do to prevent the error from happening?
A quick clarification: SageMaker Batch Transform and creating a real-time endpoint are two different facets. For Batch Transform you do not use a persistent endpoint; rather, you create a transformer object that can perform inference on a large set of data. An example of the Bring Your Own approach with Batch can be seen here.
Regardless of Batch or Real-Time, the /ping check must be passed. Make sure that you are loading your model in this route; generally, if your model is not loaded properly, this leads to that health check error being emitted. Here's another BYOC example; in the predictor.py you can see me loading the model in the /ping route.
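For reference, re-enabling the commented-out lines from the question's own ping() is roughly what that looks like. ScoringService.get_model() is assumed to lazily load and cache the model, as in the scikit_bring_your_own sample, so a failed model load shows up as a 404 on the health check instead of a timeout:

import flask

app = flask.Flask(__name__)


@app.route("/ping", methods=["GET"])
def ping():
    # ScoringService.get_model() is assumed to load and cache the model,
    # following the scikit_bring_your_own sample container.
    health = ScoringService.get_model() is not None
    status = 200 if health else 404
    return flask.Response(response="\n", status=status, mimetype="text/csv")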
Lastly, I am not sure what you mean by 20 instances simultaneously. Are you backing the endpoint with an instance count of 20?

I have tested my AWS server (8 GB RAM) on which my Moodle site is hosted for 1000 users using JMeter, I am getting 0% error, what could be the issue?

My Moodle site is hosted on an AWS server with 8 GB RAM. I carried out various tests on the server using JMeter (NFT), testing from 15 up to almost 1000 users; however, I am still not getting any errors (less than 0.3%). I am using the scripts provided by Moodle itself. What could be the issue? Is there an issue with the script? I have attached a screenshot which shows the report of the 1000-user test for reference.
If you're happy with the number of errors and the response times (the maximum response time is more than 1 hour, which is kind of too much for me) you can stop here and report the results.
However, I doubt that a real user will be happy to wait 1 hour to see the login page, so I would rather define some realistic pass/fail criteria, for example expecting the response time to be no more than 5 seconds. In that case you would have > 60% failures, if that is what you're trying to detect.
You can consider using the following test elements:
Set reasonable response timeouts using HTTP Request Defaults, so that any request which lasts longer than 5 seconds is terminated and marked as failed.
Or use a Duration Assertion; in this case JMeter will wait for the response and mark it as failed if the response time exceeds the defined duration.
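If you would rather apply the 5-second criterion after a run, a small Python sketch over a JMeter CSV results file can count how many samples exceed it. This assumes the default CSV output with its standard elapsed column and a results.jtl file name, both of which are assumptions here:

import csv

THRESHOLD_MS = 5000  # the 5-second pass/fail criterion discussed above

with open("results.jtl", newline="") as jtl:  # assumed results file name
    rows = list(csv.DictReader(jtl))

slow = [row for row in rows if int(row["elapsed"]) > THRESHOLD_MS]
print(f"{len(slow)} of {len(rows)} samples ({len(slow) / len(rows):.1%}) exceeded {THRESHOLD_MS} ms")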

Why is the response time 100 times slower than processing the request on the server?

I have a Compute Engine server in the us-east1-b zone:
n1-highmem-4 (4 vCPUs, 26 GB memory) with a 50 GB SSD, and everything looks normal in the monitoring graphs.
We are using this server as a Rails-based RESTful API.
The problem is that when we send a request to the server, it takes a very long time to receive the response.
Here is our server log:
As you can see, it took 00:01 seconds to respond to the request,
and here is the response received by Postman:
As you can see, X-Runtime is 0.036319 as expected, but we received the response in 50374 ms, which means almost 1 minute after the server responded!
I hope this answer can help people with the same problem.
Passenger's highly optimized load balancer assumes that Ruby apps can handle 1 (or a thread-limited number of) concurrent connection(s). This is usually the case and results in optimal load balancing, but endpoints that deal with SSE/WebSockets can handle many more concurrent connections, so the assumption leads to degraded performance there.
You can use the force max concurrent requests per process configuration option to override this. The example below shows how to set the concurrency to unlimited for /special_websocket_endpoint:
server {
    listen 80;
    server_name www.example.com;
    root /webapps/my_app/public;
    passenger_enabled on;

    # Use default concurrency for the app. But for the endpoint
    # /special_websocket_endpoint, force a different concurrency.
    location /special_websocket_endpoint {
        passenger_app_group_name foo_websocket;
        passenger_force_max_concurrent_requests_per_process 0;
    }
}
In Passenger 5.0.21 and below this option was not available yet. In those versions there is a workaround for Ruby apps: enter the code below into config.ru to set the concurrency (for the entire app).

High load on jetty

I'm running load tests on my MBP. The load is injected using Gatling.
My web server is Jetty 9.2.6.
Under heavy load, the number of threads remains constant at 300, but the number of open sockets grows from 0 to 4000+, which triggers a "too many open files" error at the OS level.
What does it mean?
Any ideas on how to improve the situation?
Here is the output of the Jetty statistics:
Statistics:
Statistics gathering started 643791ms ago
Requests:
Total requests: 56084
Active requests: 1
Max active requests: 195
Total requests time: 36775697
Mean request time: 655.7369791202325
Max request time: 12638
Request time standard deviation: 1028.5144674112403
Dispatches:
Total dispatched: 56084
Active dispatched: 1
Max active dispatched: 195
Total dispatched time: 36775697
Mean dispatched time: 655.7369791202325
Max dispatched time: 12638
Dispatched time standard deviation: 1028.5144648655212
Total requests suspended: 0
Total requests expired: 0
Total requests resumed: 0
Responses:
1xx responses: 0
2xx responses: 55644
3xx responses: 0
4xx responses: 0
5xx responses: 439
Bytes sent total: 281222714
Connections:
org.eclipse.jetty.server.ServerConnector#243883582
Protocols:http/1.1
Statistics gathering started 643784ms ago
Total connections: 8788
Current connections open: 1
Max concurrent connections open: 4847
Mean connection duration: 77316.87629452601
Max connection duration: 152694
Connection duration standard deviation: 36153.705226514794
Total messages in: 56083
Total messages out: 56083
Memory:
Heap memory usage: 1317618808 bytes
Non-heap memory usage: 127525912 bytes
Some advice:
Don't have the client load and the server load on the same machine (and don't cheat by putting the load on 2 different VMs on a single physical machine).
Use multiple client machines, not just 1 (when the Jetty developers test load characteristics, we use at least a 10:1 ratio of client machines to server machines).
Don't test with loopback, virtual network interfaces, localhost, etc. Use a real network interface.
Understand how your load client manages its HTTP version and connections (such as keep-alive or HTTP/1.1 close), and make sure you read the response body content, close the response content/streams, and finally disconnect the connection (see the sketch after this list).
Don't test with unrealistic load scenarios. Real-world usage of your server will be a majority of HTTP/1.1 pipelined connections with multiple requests per physical connection: some on fast networks, some on slow networks, some even on unreliable networks (think mobile).
Raw speed, serving the same content, all on unique connections, is ultimately a fascinating number and can produce impressive results, but it is also completely pointless and proves nothing about how your application's performance on Jetty will behave in real-world scenarios.
Finally, be sure you are testing load in realistic ways.
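To illustrate the point about reading and closing responses in the load client, here is a minimal Python sketch with a placeholder URL; it is not Gatling configuration, just the behavior a well-behaved client should show:

import requests

URL = "http://test-server.example.com:8080/resource"  # placeholder; point it at a real network interface, not loopback


def run_requests(n=100):
    # one Session = one persistent (keep-alive) connection carrying multiple requests
    with requests.Session() as session:
        for _ in range(n):
            response = session.get(URL, timeout=5)
            _ = response.content  # read the full body so the connection can be reused
            response.close()      # release the connection back to the pool
    # leaving the with-block closes the session, i.e. disconnects


if __name__ == "__main__":
    run_requests()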