package main

import (
    "io"
    "net/http"
)

// hello writes a fixed greeting to every request.
func hello(w http.ResponseWriter, r *http.Request) {
    io.WriteString(w, "Hello world!\n")
}

func main() {
    http.HandleFunc("/", hello)
    http.ListenAndServe(":8000", nil)
}
I've got a couple of incredibly basic HTTP servers, and all of them are exhibiting this problem.
$ ab -c 1000 -n 10000 http://127.0.0.1:8000/
This is ApacheBench, Version 2.3 <$Revision: 1604373 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking 127.0.0.1 (be patient)
Completed 1000 requests
Completed 2000 requests
Completed 3000 requests
Completed 4000 requests
Completed 5000 requests
apr_socket_recv: Connection refused (61)
Total of 5112 requests completed
With a smaller concurrency value, things still fall over. For me, the issue usually shows up somewhere around the 5,000-6,000 request mark:
$ ab -c 10 -n 10000 http://127.0.0.1:8000/
This is ApacheBench, Version 2.3 <$Revision: 1604373 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking 127.0.0.1 (be patient)
Completed 1000 requests
Completed 2000 requests
Completed 3000 requests
Completed 4000 requests
Completed 5000 requests
Completed 6000 requests
apr_socket_recv: Operation timed out (60)
Total of 6277 requests completed
And in fact, you can drop concurrency entirely and the problem still (sometimes) happens:
$ ab -c 1 -n 10000 http://127.0.0.1:8000/
This is ApacheBench, Version 2.3 <$Revision: 1604373 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking 127.0.0.1 (be patient)
Completed 1000 requests
Completed 2000 requests
Completed 3000 requests
Completed 4000 requests
Completed 5000 requests
Completed 6000 requests
apr_socket_recv: Operation timed out (60)
Total of 6278 requests completed
I can't help but wonder if I'm hitting some kind of operating-system limit somewhere. How would I tell, and how would I mitigate it?
In short, you're running out of ports.
The default ephemeral port range on OS X is 49152-65535, which is only 16,384 ports. Since each ab request is HTTP/1.0 (without keep-alive in your first examples), each new request takes another port.
As each connection closes, its port sits in a wait state tied to the TCP "Maximum Segment Lifetime" (MSL), which is configured as 15 seconds on OS X. So if you use more than 16,384 ports within that window, you're effectively going to be throttled by the OS on further connections. Depending on which process runs out of ports first, you will get connection errors from the server, or hangs from ab.
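If you want to check these limits on your own machine, OS X exposes both settings through sysctl; something like the following should print them (exact values may differ between releases):
$ sysctl net.inet.ip.portrange.first net.inet.ip.portrange.last
$ sysctl net.inet.tcp.msl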
You can mitigate this by using an HTTP/1.1-capable load generator like wrk, or by using the keep-alive (-k) option for ab, so that connections are reused according to the tool's concurrency settings.
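For example, either of the following should exercise the server without burning a new port per request (the connection and thread counts here are just illustrative):
$ ab -k -c 1000 -n 10000 http://127.0.0.1:8000/
$ wrk -t 4 -c 100 -d 10s http://127.0.0.1:8000/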
Now, the server code you're benchmarking does so little that the load generator is taxed just as much as the server itself, with the local OS and network stack likely making a good contribution too. If you want to benchmark an HTTP server, it's better to have it do some meaningful work and to drive it from multiple clients that are not running on the same machine.
Related
For some reason, download traffic from a virtual machine on GCP (Google Cloud Platform) running Debian 9 is limited to about 50 KB/s. Upload seems to be fine, in line with my local upload link.
It is the same with scp or HTTPS downloads. Any suggestions on what might be wrong, or where to look?
Machine type: n1-standard-1 (1 vCPU, 3.75 GB memory)
CPU platform: Intel Skylake
Zone: europe-west4-a
Network interfaces: Premium tier
Thanks,
Mihaelus
Simple test:
wget https://hrcki.primasystems.si/Nova/assets/download.test.html
Output:
--2018-10-18 15:21:00--  https://hrcki.primasystems.si/Nova/assets/download.test.html
Resolving hrcki.primasystems.si (hrcki.primasystems.si)... 35.204.252.248
Connecting to hrcki.primasystems.si (hrcki.primasystems.si)|35.204.252.248|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 541422592 (516M) [text/html]
Saving to: 'download.test.html.1'
 0% [                                        ] 1,073,152   48.7K/s  eta 2h 59m
It's always good to minimize variables when trying to diagnose. So, while it is unlikely that the use of HTTP is why things are this slow, you might consider using netperf or iperf3 to measure TCP bulk-transfer performance between your VM in GCP and your local system. You can do that either "by hand" or via PerfKit Benchmarker: https://cloud.google.com/blog/products/networking/perfkit-benchmarker-for-evaluating-cloud-network-performance
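If you go the "by hand" route, an iperf3 run might look roughly like this (the VM address is a placeholder, and iperf3's default port 5201 must be reachable through any firewalls):
$ iperf3 -s                      # on the GCP VM
$ iperf3 -c <vm-external-ip>     # on the local system: measures the upload direction
$ iperf3 -c <vm-external-ip> -R  # -R reverses the test, measuring the download direction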
It can be helpful to have packet traces, from both ends when possible, to look at. You want the packet traces to be started before the test; it is important to see the packets used to establish the TCP connection(s). They do not need to be "full packet" traces, and often you don't want them to be. Capturing just the first 96 bytes of each packet is sufficient for this sort of investigation.
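With tcpdump, for instance, something along these lines on each end would do; the interface name and host filter are placeholders you would adapt:
$ sudo tcpdump -i eth0 -s 96 -w transfer.pcap host <other-end-ip>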
You might also consider taking snapshots of the network statistics offered by the OSes running in your GCP VM and on your local system. For example, if running *nix, take a snapshot of the netstat -s output before and after the test. And perhaps run a traceroute from each end towards the other.
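Assuming a *nix shell on both ends, that could be as simple as:
$ netstat -s > netstat.before.txt
$ # ... run the transfer test ...
$ netstat -s > netstat.after.txt
$ diff netstat.before.txt netstat.after.txt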
Network statistics and packet traces, along with as many details about the two endpoints as possible are among the sorts of things support organizations are likely to request when looking to help resolve an issue of this sort.
I have a VPS on which I run my Telegram bot. The bot handles about 10 messages a second. Normally my CPU load is between 50 and 70 and it works fine. Sometimes, like now, the CPU load drops to 0% or 1% (as if there's nothing to do, while there's a queue of 10K messages pending) and it doesn't accept any new requests. Even if I enter my domain address it won't open my home page (maybe it opens after 60 seconds). What could the problem be, and how can I solve it?
By the way, I contacted the company I bought my server from, and they said there are no connection issues. They seem to be right, because at the same time I can open the netstat page or the Kloxo control panel without any problem.
Web server: Apache
PHP: 5.6
DNS server: BIND
Thanks
I'm trying to optimize a single-core, 1 GB RAM DigitalOcean VPS to handle more requests per second. After some tweaking (workers, gzip, etc.) it serves about 15 requests per second. I don't have anything to compare it with, but I think this number could be higher.
The stack works like this:
VPS -> Docker container -> nginx (ssl) -> Varnish -> nginx -> uwsgi (Django)
I'm aware of the fact that this is a long chain and that Docker might cause some overhead. However, almost all requests can be handled by Varnish.
These are my test results:
ab -kc 100 -n 1000 https://mydomain | grep 'Requests per second'
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Completed 500 requests
Completed 600 requests
Completed 700 requests
Completed 800 requests
Completed 900 requests
Completed 1000 requests
Finished 1000 requests
Requests per second: 18.87 [#/sec] (mean)
I actually have three questions:
Am I correct that 18.87 requests per second is low?
For a simple Varnished Django blog app, what would be an adequate value (indication)?
I already applied the recommended tweaks (tuned for my system) from this tutorial. What more can I tweak, and how do I figure out where the bottlenecks are?
First, a note about Docker: it is not meant to run multiple processes in a single container. Docker is not a replacement for a VM; it simply allows you to run processes in isolation. So the Docker diagram should be:
VPS -> docker nginx container -> docker varnish container -> docker django container
To make life with multiple Docker containers simpler, I would recommend using Docker Compose. It is not perfect, but it's an excellent start.
There is an old but still fundamentally relevant blog post about this. Note that some of its suggestions are no longer relevant (like nsenter, since the docker exec command is now available), but most of it is still correct.
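As a rough illustration of the Compose approach (this is only a sketch; the image names, build path, and ports are placeholders, not taken from your setup):
# docker-compose.yml (sketch)
version: "2"
services:
  nginx:
    image: nginx
    ports:
      - "443:443"
    depends_on:
      - varnish
  varnish:
    image: varnish
    depends_on:
      - django
  django:
    build: ./django-app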
As for your performance issues: yes, 18 requests per second is pretty low. However, the issue probably has nothing to do with nginx; it is most likely in your Django application, and possibly in Varnish (though that is very unlikely).
To debug performance issues in Django, I would recommend using django-debug-toolbar. Most issues in Django are caused by unnecessary SQL queries, and you can see them easily in the debug toolbar. To solve most of them you can use select_related() and prefetch_related(). For more detailed analysis, I would also recommend profiling your application; cProfile is a great start. Some IDEs, like PyCharm, include built-in profilers, so it's pretty easy to profile your application and see which functions take most of the time so you can optimize them. Finally, you can use third-party tools to profile your application: even a free New Relic account will give you quite a bit of information. Alternatively, you can use Opbeat, which is the new cool kid on the block.
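To illustrate the select_related() point, here is a minimal sketch; the models are invented for the example and are not from your app:
# Hypothetical models, for illustration only.
from django.db import models

class Author(models.Model):
    name = models.CharField(max_length=100)

class Post(models.Model):
    author = models.ForeignKey(Author, on_delete=models.CASCADE)
    title = models.CharField(max_length=200)

# N+1 pattern: one query for the posts, plus one extra query per post for its author.
for post in Post.objects.all():
    print(post.author.name)

# With select_related(), the author rows are fetched in the same query via a JOIN.
for post in Post.objects.select_related("author"):
    print(post.author.name)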
I am using Hunchentoot for a web app that is meant to be a high-traffic, database-driven app; it also depends on the WebSocket protocol and HTTP AJAX requests.
When I benchmark my app with ApacheBench, as in
ab -c 50 -n 1000
I get a "connection reset" message. Up to a concurrency of 40 the test completes, but beyond that it does not. How can one increase the max-thread-count of Hunchentoot?
Also, what are realistic concurrency and requests-per-unit-time numbers that I should plan for with a high-traffic web app, for example one on the scale of Reddit or Twitter?
You can pass in a number to the taskmaster with the :max-thread-count keyword.
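For example (a minimal sketch; the port and the thread counts are arbitrary, and hunchentoot:easy-acceptor is assumed as the acceptor class):
;; Start Hunchentoot with an explicit taskmaster that allows more worker threads.
(hunchentoot:start
 (make-instance 'hunchentoot:easy-acceptor
                :port 8080
                :taskmaster (make-instance
                             'hunchentoot:one-thread-per-connection-taskmaster
                             :max-thread-count 200
                             :max-accept-count 220)))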
OK, so I created a very simple WAR which serves a simple Hello World .jsp. With all the HTML it's about 200 bytes.
I deployed it on my server running Jetty 7.5.x on JDK 6u27.
On my client computer I created a simple JMeter test plan with: Thread Group, HTTP Request, Response Assertion, Summary Report. The client is also running JDK 6u27.
I set the thread group to 5 threads running for 60 seconds and got 5,800 requests/sec.
Then I set up 10 threads and got 6,800 requests/sec.
The moment I disable Keep-Alive on the HTTP Request sampler in JMeter, I get lots of big pauses, on the client side I suppose; it doesn't seem like the server is receiving anything. I get fewer pauses at 5 threads, or barely any, but at 10 threads it hangs pretty much all the time.
What does this mean exactly?
Keep in mind I'm technically building a REST service, and I was getting the same issue there, so I thought maybe I was doing something funky in my service, until I figured out it's a Keep-Alive issue, since it happens even on a static web app. In reality I will have one client request and one server response; the client will not keep the connection open.
My guess is that since Keep-Alive is what allows HTTP connection (and thereby socket) reuse, you are running out of available ephemeral port numbers: there are only 64k port numbers, and since connections must have unique client/server port combinations (and the server port is fixed), you can quickly go through them. Now, if ports were reusable as soon as the connection was closed by one side, it would not matter; however, per the TCP spec, both sides MUST wait a configurable amount of time (by default, 2 minutes) before reuse is considered safe.
For more details you can read a TCP book (like the Stevens book); the above is a simplification.
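If you want to see this happening on the client while the non-keep-alive test runs, counting sockets stuck in TIME_WAIT is a quick check (output format varies a bit between OSes):
$ netstat -an | grep -c TIME_WAIT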