Broken Pipe in Django dev server - What does this actually mean? - django

I can't quite figure out a pattern to why/when I see it. Cheers.

By "pipe" it means the TCP connection between the server and the browser. By "broken" it means closed.
You'll see broken pipes when somebody closes their browser window, hits stop, or sometimes just from timing out because something else breaks the connection.
The confusing thing is that the Python process likely won't notice that the connection is closed until it tries to write to it, which can be well after the connection actually closed.
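You can see this delayed failure outside of Django with plain sockets. A minimal sketch (standalone Python, not Django code): the "browser" end closes, and the server only errors on a later write. Depending on OS and timing you may get ECONNRESET instead of EPIPE, so both are caught here:

```python
import socket

# Set up a throwaway server and a client connected to it.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
client = socket.create_connection(server.getsockname())
conn, _ = server.accept()

client.close()  # the "browser" goes away; the server gets no notification

try:
    # The first write usually succeeds (it only fills the kernel buffer);
    # only a subsequent write surfaces the broken pipe.
    for _ in range(100):
        conn.sendall(b"x" * 65536)
except (BrokenPipeError, ConnectionResetError) as exc:
    print("write failed only after the close:", exc)
```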

I get this error when a browser closes a connection (it can time out, or be closed manually). Normally it happens when I send too many connections to runserver at once (e.g. when I'm serving static media and loading a heavy page for the first time).
Django's runserver should not be used in production, and it doesn't handle concurrent connections with any grace. If this happens a lot, you can consider using something like django_cpserver or gunicorn in development, but you don't get as much debug information out of them in the console.
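If you want to try gunicorn locally, the invocation is a one-liner: assuming your project package is named `myproject` (a placeholder), `pip install gunicorn` followed by `gunicorn myproject.wsgi` serves the same WSGI application that runserver does; static files then need to be served separately.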

Related

Where to even begin investigating an issue causing a database crash: remaining connection slots are reserved for non-replication superuser connections

Occasionally our Postgres database crashes, and it can only be fixed by restarting the server. We have tried increasing max_connections and Django's CONN_MAX_AGE. I am also trying to learn how to set up PgBouncer. However, I am convinced the underlying issue must be something else that is fixable.
I am trying to find what that issue is. The problem is that I don't know where to begin looking. Here are some pieces of information:
The errors are always OperationalError: FATAL: remaining connection slots are reserved for non-replication superuser connections and OperationalError: could not write to hash-join temporary file: No space left on device. I think this is caused by opening too many database connections, but I have never managed to catch it happening live so that I could inspect pg_stat_activity and see which connections were actually active.
Looking at the error log, mostly the same URL shows up. I've checked the nginx log and it's listed on many different lines, meaning the request is being made multiple times at once rather than Django logging the same error multiple times. All of these requests get the response 499 Client Closed Request. Sprinkled among the requests for that URL are, of course, requests from other users trying to access our site.
I should mention that the logic the server processes when the URL in question is requested is pretty simple and I see nothing suspicious that could cause a database crash. However, for some reason, the page loads slowly in production.
I know this is very vague and very little to work with, but I am not used to sysadmin work; I only studied this kind of thing in college, and so far I have only worked as a developer.
Those two problems are mostly independent.
Running out of connection slots won't crash the database. It is just a sign that you either don't use a connection pool or you have a connection leak, i.e. you forget to close transactions in your code.
Running out of space will crash your database if the condition persists.
I assume that the following happens in your system:
Because someone forgot a couple of join conditions or for some other reason, some of your queries take a very long time.
They also produce a lot of (perhaps intermediate) results that are cached in temporary files, which eventually fill up the disk. The out-of-space condition is cleared as soon as the query fails, but it can crash the database.
Because these queries take long and block a database session, your application keeps starting new sessions until it reaches the limit.
Solution:
Find and fix those runaway queries. As a stop-gap, you can set statement_timeout to terminate all statements that take too long (see the settings sketch after this list).
Use a connection pool with an upper limit so that you don't run out of connections.
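Both fixes can be driven from Django's side. A minimal settings sketch, assuming PostgreSQL with psycopg2; the database name, user, 60-second connection age, and 30-second timeout are placeholder values to tune for your workload:

```python
# settings.py (sketch)
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "mydb",    # placeholder
        "USER": "myuser",  # placeholder
        # Reuse connections between requests instead of opening a new one
        # per request; note this is persistent connections, not a full pool.
        "CONN_MAX_AGE": 60,
        "OPTIONS": {
            # Ask Postgres to kill any statement running longer than 30s.
            "options": "-c statement_timeout=30000",  # milliseconds
        },
    }
}
```

For a hard upper bound on server connections, front the database with PgBouncer (which you mentioned) and set its pool size safely below Postgres's max_connections.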

how does django channels know that a client has disconnected?

I fear that a client might lose its connection abruptly and get disconnected without leaving any info for the backend. If this happens, will the socket stay active?
I have found the answer in an issue on the django channels repository:
Using websocket auto-ping to periodically assert clients are still connected to sockets, and cutting them loose if not. This is a feature that's already in daphne master if you want to try it, and which is coming in the next release; the ping interval and time-till-considered-dead are both configurable. This solves the (more common) clients-disconnecting-uncleanly issue.
Storing the timestamp of the last seen action from a client in a database, and then pruning ones you haven't heard of in a while. This is the only approach that will also get round disconnect not being sent because a Daphne server was killed, but it's more specialised, so Channels can't implement it directly.
Link to the issue on github
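A minimal sketch of the second, "last seen" approach. ClientPresence is a hypothetical model with channel_name and last_seen fields; a periodic task (cron, Celery beat) would then prune rows older than your chosen threshold:

```python
# consumers.py (sketch)
from channels.db import database_sync_to_async
from channels.generic.websocket import AsyncWebsocketConsumer
from django.utils import timezone

from myapp.models import ClientPresence  # hypothetical model


class PresenceConsumer(AsyncWebsocketConsumer):
    async def connect(self):
        await self.accept()
        await self.touch()

    async def receive(self, text_data=None, bytes_data=None):
        # Any inbound frame counts as proof of life.
        await self.touch()

    async def disconnect(self, close_code):
        # Clean departure: remove the presence row immediately.
        await self.forget()

    @database_sync_to_async
    def touch(self):
        ClientPresence.objects.update_or_create(
            channel_name=self.channel_name,
            defaults={"last_seen": timezone.now()},
        )

    @database_sync_to_async
    def forget(self):
        ClientPresence.objects.filter(channel_name=self.channel_name).delete()
```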

Propagate http abort/close from nginx to uwsgi / Django

I have a Django web application, and I was wondering if it is possible to have nginx propagate the abort/close to uwsgi/Django.
Basically, I know that nginx is aware of the premature abort/close because uwsgi_ignore_client_abort defaults to "off", and you get 499 errors in your nginx logs when requests are aborted/closed before the response is sent. Once uwsgi finishes processing the request, it throws an "IO Error" when it goes to return the response to nginx.
Setting uwsgi_ignore_client_abort to "on" just makes nginx unaware of the abort/close, and removes the uwsgi "IO Errors", because uwsgi can still write back to nginx.
My use case is that I have an application where people page through some ajax results very quickly, so if they quickly page through, I abort the pending ajax request for the page they skipped; this keeps the client clean and efficient. But it does nothing for the server side (uwsgi/Django), which still has to process every single request even if nothing will be waiting for the response.
Now obviously there may be certain pages, where I don't want the request to be prematurely aborted for any reason. But I use celery for long running requests that may fall into that category.
So is this possible? uwsgi's harakiri setting makes me think that it is at some level... I just can't figure out how to do it.
My use case is that I have an application where people page through some ajax results very quickly, so if they quickly page through, I abort the pending ajax request for the page they skipped; this keeps the client clean and efficient.
Aborting an AJAX request on the client side is done through XMLHttpRequest.abort(). If the request has not yet been sent out when abort() is called, then the request won't go out. But if the request has been sent, the server won't know that the request has been aborted. The connection won't be closed, there won't be any message sent to the server, nothing. If you want the server to know that a request is no longer needed, you basically need to come up with a way to identify requests so that when you make the initial request you get an identifier for it. Then, through another AJAX request you could tell the server that an earlier request should be cancelled. (If you search questions about abort() like this one and search for "server" you'll find explanations saying the same.)
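A minimal sketch of that identifier scheme on the Django side, using the cache as a shared cancellation flag; the view names, the expensive_chunks generator, and the 499 status code are all illustrative choices, not a fixed API:

```python
# views.py (sketch)
from django.core.cache import cache
from django.http import HttpResponse, JsonResponse


def cancel(request, request_id):
    # The client calls this from its abort path to flag the earlier request.
    cache.set(f"cancelled:{request_id}", True, timeout=300)
    return HttpResponse(status=204)


def search(request, request_id):
    results = []
    for chunk in expensive_chunks(request.GET.get("q", "")):  # hypothetical
        # Check the flag at each safe stopping point.
        if cache.get(f"cancelled:{request_id}"):
            return HttpResponse(status=499)
        results.extend(chunk)
    return JsonResponse({"results": results})
```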
Note that uwsgi_ignore_client_abort is something that deals with connection closures at the TCP level. That's a different thing from aborting an AJAX request. There is generally no action you can take in JavaScript that will entail closing a TCP connection. The browser optimizes the creation and destruction of connections to suit its needs. Just now, I did this:
I used lsof to check whether any process had a connection to example.com. There were none. (lsof is a *nix utility that allows listing open files. Network connections are "files" in *nix.)
I opened a page to example.com in Chrome. lsof showed the connection and the process that opened it.
Then I closed the page.
I polled with lsof to see if the connection I had identified earlier was still open. It stayed open for about a minute after I closed the page, even though there was no real need to keep the connection open.
And no amount of fiddling with uwsgi settings will make it aware of aborts performed through XMLHttpRequest.abort().
The use-case scenario you gave was one where users were paging fast through some results. I can see two possibilities for the description given in the question:
The user waits for a refresh before paging further. For instance, Alice is looking through a list of user names sorted alphabetically for user "Zeno", and each time a new page is shown, she sees the name is not there and pages down. In this case, there's nothing to abort, because the user's action depends on the request having been handled first. (The user has to see the new page before making a decision.)
The user just pages down without waiting for a refresh. Alice again is looking for "Zeno", but she figures it's going to be on the last page, so click, click, click she goes. In this case, you can debounce the requests made to the server. When the next-page button is pressed, increment the number of the page that should be shown to the user, but don't send the request right away. Instead, wait for a small delay after the user stops clicking, and then send a single request with the final page number; you make one request instead of a dozen. Here is an example of a debounce performed for a DataTables search.
Now obviously there may be certain pages, where I don't want the request to be prematurely aborted for any reason.
This is precisely the problem with handling it one way or the other.
Obviously, you may not want to continue spending system resources processing a connection that has since been aborted, e.g., an expensive search operation.
But then maybe the connection was important enough that it still has to be processed even if the client has disconnected.
E.g., the very same expensive search operation, but one that's actually not client-specific, and will be cached by nginx for all subsequent clients, too.
Or maybe an operation that modifies the state of your application: you clearly wouldn't want your application to end up in an inconsistent state!
As mentioned, the problem is with uWSGI, not with nginx. However, you cannot have uWSGI automatically decide what your intention was without revealing that intention to uWSGI yourself.
And how exactly would you reveal your intention in your code? A whole bunch of programming languages don't really support multithreaded and/or asynchronous programming models, which makes cancelling operations entirely non-trivial.
As such, there is no magic solution here. Even concurrency-friendly languages like Go have issues around the WithCancel context: you may have to pass it around in every function call that could possibly block, making the code very ugly.
Are you already doing that kind of context passing in Django? If not, then the solution is ugly but very simple: any time you can safely abort the request, check whether the client is still connected with uwsgi.is_connected(uwsgi.connection_fd()):
http://lists.unbit.it/pipermail/uwsgi/2013-February/005362.html
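Here is a minimal sketch of that check inside a view. The uwsgi module only exists inside a uWSGI worker, so the import is guarded; expensive_chunks is a hypothetical generator standing in for your real unit of work:

```python
# views.py (sketch)
try:
    import uwsgi  # only importable when running under uWSGI
except ImportError:
    uwsgi = None

from django.http import HttpResponse


def expensive_view(request):
    parts = []
    for chunk in expensive_chunks(request):  # hypothetical unit of work
        # At each point where aborting is safe, ask uWSGI whether the
        # client on the other end of this request is still connected.
        if uwsgi is not None and not uwsgi.is_connected(uwsgi.connection_fd()):
            return HttpResponse(status=499)  # nobody is waiting; stop early
        parts.append(chunk)
    return HttpResponse("".join(parts))
```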

Do I need to close the db connection in a management command? [django]

According to this snippet (http://djangosnippets.org/snippets/926/), the connection is closed in handle(). But it is kind of old code.
In Django 1.4, must we close the connection ourselves? I looked through the Django code but I can't find the code that closes the connection.
If Django closes the connection, where does that happen?
Thank you.
As the snippet stated:
# Close the DB connection. This is required as a workaround for an
# edge case in MySQL: if the same connection is used to
# create tables, load data, and query, the query can return
# incorrect results.
From Django:
So, yes, if you do something to deliberately create lots of connections, lots of connections will be created. However, Django closes its connection to the database at the end of each request/response cycle, so there is only one connection in operation per thread or process handling requests and responses. If you're not using the HTTP layer, it's still only one connection per thread of execution, and you are in complete control of the number of threads you create.
https://code.djangoproject.com/ticket/9878
First a disclaimer: I'm not an expert (far from it).
Anyway, the snippet you reference refers to two tickets: the Django ticket suggests that closing the connection has been included in the code for loading fixtures (see lines 55-60 in django/core/management/commands/loaddata.py). The MySQL ticket suggests nothing much has changed on their side.
Anyway, I think it depends on what you are trying to do. Most importantly, the "connection closing" fix seems to be a MySQL-only thing. If you use any other DB, the bug described in the snippet's comments should not occur.
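For completeness, here is what the snippet's workaround looks like in a management command. A minimal sketch, only needed for the MySQL edge case described above:

```python
# myapp/management/commands/load_and_query.py (sketch)
from django.core.management.base import BaseCommand
from django.db import connection


class Command(BaseCommand):
    help = "Create tables and load data, then reopen the connection before querying."

    def handle(self, *args, **options):
        # ... create tables and load data on the current connection ...

        # Close explicitly so the next query runs on a fresh connection,
        # working around the MySQL stale-results edge case.
        connection.close()

        # ... queries issued after this point open a new connection ...
```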

How can I simulate a hung web service?

I'm trying to test failure modes for software that interacts with a web service, and I've already had reported issues where problems occur if the software doesn't get a timely response (e.g., it's waiting a minute or longer). I'd like to simulate this so that I can track down and fix the issues myself, but unplugging the network connection doesn't do the trick, because that fails immediately with no route to the host.
What I'd like to know is, is there a simple way I can make a CGI script that accepts a connection but just sits there, keeping the connection alive for several minutes, without doing a while (true) {} type of loop?
How about letting the script sleep for some (very long) time?
I don't know what language you are using for your scripting, but in .NET you could do something like Thread.Sleep(6000); (the argument is in milliseconds).
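In Python, a CGI script that stalls is just a sleep before any output. A minimal sketch, with an arbitrary five-minute stall:

```python
#!/usr/bin/env python
# hang.cgi (sketch): accept the request, then sit on the open connection.
import sys
import time

time.sleep(300)  # hold the connection for 5 minutes without busy-waiting

sys.stdout.write("Content-Type: text/plain\r\n\r\n")
sys.stdout.write("Responding after a long stall.\n")
```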
Fiddler is excellent for this sort of thing. You can simulate slow connections and, if you want, have it "break" when a request comes in so you can simulate a response that never returns.
Go get it from here...
http://www.fiddlertool.com/fiddler/
You will have to idle in some way, since the connection will be closed as soon as your CGI script returns.
If your network equipment supports throttling you might want to limit outgoing traffic to something ridiculously low.