Do I need close db connection in command [django]

Do I need close db connection in command [django] - django

According to this (http://djangosnippets.org/snippets/926/) snippet, connection closed in handle. But it is kind of old code.
In django 1.4, Do we must close connection? I looked through django code but I cann't find code that closes connection.
If django closes connection where is it?
Thank you.

As the snippet stated:
# Close the DB connection. This is required as a workaround for an
# edge case in MySQL: if the same connection is used to
# create tables, load data, and query, the query can return
# incorrect results.
From Django:
So, yes, if you do something to deliberately create lots of connections,
lot of connections will be created. However, Django closes its connection to the
database at the end of each request/response cycle, so there is only one connection
in operation per thread or process handling requests and responses. If you're not
using the HTTP layer, it's still only one connection per thread of execution and
you are in complete control of the number of threads you create.
https://code.djangoproject.com/ticket/9878

First a disclaimer: I'm not an expert (far from it).
Anyway, the snippet you reference refers to to two tickets: the Django ticket suggests closing the connection has been included in the code for loading fixtures (see lines 55-60 in django/core/management/commands/loaddata.py). The MySQL ticket suggests nothing on their side has changed much.
Anyway, I think it depends on what you are trying to do. And most importantly, the "connection closing" fix seems a MySQL only thing. If you use any other DB, the bug suggested by the comments in the snippet should not occur.

Related

Where to even begin investigating issue causing database crash: remaining connection slots are reserved for non-replication superuser connections

Occasionally our Postgres database crashes and it can only be solved by restarting the server. We have tried incrementing the max connections and Django's CONN_MAX_AGE. Also, I am trying to learn how to set up PgBouncer. However, I am convinced the underlying issue must be something else which is fixable.
I am trying to find what that issue is. The problem is I wouldn't know where or what to begin to look at. Here are some pieces of information:
The errors are always OperationalError: FATAL: remaining connection slots are reserved for non-replication superuser connections and OperationalError: could not write to hash-join temporary file: No space left on device. I think this is caused by opening too many database connections, but I have never managed to catch this going down live so that I could inspect pg_stat_activity and see what actual connections were active.
Looking at the error log, the same URL shows up for the most part. I've checked the nginx log and it's listed in many different lines, meaning the request is being made multiple times at once rather than Django logging the same error multiple times. All these requests are responded with 499 Client Closed Request. In addition to this same URL, there are of course sprinkled requests of other users trying to access our site.
I should mention that the logic the server processes when the URL in question is requested is pretty simple and I see nothing suspicious that could cause a database crash. However, for some reason, the page loads slowly in production.
I know this is very vague and very little to work with, but I am not used to working sysadmin, I only studied this kind of thing in college and so far I've only worked as a developer.

Those two problems are mostly independent.
Running out of connection slots won't crash the database. It just is a sign that you either don't use a connection pool or you have a connection leak, i.e. you forget to close transactions in your code.
Running out of space will crash your database if the condition persists.
I assume that the following happens in your system:
Because someone forgot a couple of join conditions or for some other reason, some of your queries take a very long time.
They also priduce a lot of (perhaps intermediate) results that are cached in temporary files that eventually fill up the disk. This out of space condition is cleared as soon as the query fails, but it can crash the database.
Because these queries take long and block a database session, your application keeps starting new sessions until it reaches the limit.
Solution:
Find and fix thise runaway queries. As a stop-gap, you can set statement_timeout to terminate all statements that take too long.
Use a connection pool with an upper limit so that you don't run out of connections.

how does django channels knows that a client has disconnected?

I fear that a client loses connection abruptly and get disconnected without leaving any info to the backend. If this happens, will the socket keep active?

I have found the answer in a issue on the django channels repository:
Using websocket auto-ping to periodically assert clients are still connected to sockets, and cutting them loose if not. This is a feature that's already in daphne master if you want to try it, and which is coming in the next release; the ping interval and time-till-considered-dead are both configurable. This solves the (more common) clients-disconnecting-uncleanly issue.
Storing the timestamp of the last seen action from a client in a database, and then pruning ones you haven't heard of in a while. This is the only approach that will also get round disconnect not being sent because a Daphne server was killed, but it's more specialised, so Channels can't implement it directly.
Link to the issue on github

Auto failover multiple connections to mirror database when principal goes down

I have a principal database (server_A), mirror database (server_B), and a witness database (server_C). The databases are set up for automatic failover, that is, when server_A goes down or fails over, server_B assumes the role of the new principal database. The database quorum is set up correctly to the best of my knowledge.
I have written an application in c++ to connect to the database and get a value to ensure a true connection. The application detects when a failure occurs on the GetValue call and attempts to reconnect when the error occurs.
The issue is this:
When I have MULTIPLE connections to the database (two threads connected, once connected, it will get a value in a loop), when the failover occurs (stopping sql server on server A so server B will take over as principal), I detect the connection failure and destroy my connection and attempt to reconnect using the same connection string:
"Driver={SQL Native Client};Server=tcp:Server_A;Failover_Partner=tcp:Server_B;Database=SomeDatabase;Uid=SomeUser;Pwd=SomePassword;"
** NOTE **
I have verified that the failover has taken place by monitoring the databases.
Even though, the connection to the database has been properly disposed of, I cannot reconnect to the database until I restart the application, OR if I bring server_A back online (now acting as the mirror database) and then failover server_B (shutting down sql server) making server A the principal database again, the application can reconnect without having to completely close out.
Though I could manipulate the connection string to make server_B the new principal and server_A the new Failover_Partner, this is not an ideal solution as many more connections will be utilized.
Keep in mind, this ONLY happens with multiple connections to the database. If I run the application with only one connection, all is fine and I can reconnect just fine when the failover occurs.
EDIT: If I connect in the beginning with multiple threads, all is fine. When I shutdown SQL Server, and therefore a failover occurs, I can reconnect only when I go through and delete ALL objects and re-instantiate new objects. Also, I am using SQL Native Client 11.0 (ODBC). Thoughts?

A lot of what you're describing is consistent with the issue described in KB 2605597 "Time-out error when a mirrored database connection is created by the .NET Framework data provider for SQLClient."
The KB describes problems when the connection timeout is set to 15 seconds, I have anecdotally heard of similar problems when the connection timeout is set to 0 (which isn't a good idea for other reasons, mentioning just in case).
This hotfix is applied to the application servers. If you want to rule this out as a possible cause, you could test raising the timeout (like it says in the workaround sections of the post) to make sure it's not the issue.
Later thought: The other thing I notice that is unusual is that you're specifying the TCP protocol in the connection string and the failover partner name. It's not clear to me from the documentation that it's supported in the failover partner name. You might want to try removing that and specifying the network attribute instead. (Recommended here.)
I do understand that you believe the issue isn't these things due to the single / multiple connections issue you've tested out.
However, I think you're better off simplifying the connection string so it's as consistent as possible with the published examples and making sure it's not the issues that people have commonly hit with this first. (The retry issue happens when there is latency, which can make it somewhat sporadic.)

Ok I have found the answer.
I had to modify the hosts file because my application did not reside in the same domain as the databases. Therefore when trying to fail over, I could not reach the database with the instance name (which is what the failover partner was cached as). I changed the hosts file to resolve the instance name to the ip address of the machine and it all works now.

Getting "idle in transaction" for postgresql with django

We are using Django 1.3.1 and Postgres 9.1
I have a view which just fires multiple selects to get data from the database.
In Django documents it is mentioned, that when a request is completed then ROLLBACK is issued if only select statements were fired during a call to a view. But, I am seeing lot of "idle in transaction" in the log, especially when I have more than 200 requests. I don't see any commit or rollback statements in the postgres log.
What could be the problem? How should I handle this issue?

First, I would check out the related post What does it mean when a PostgreSQL process is “idle in transaction”? which covers some related ground.
One cause of "Idle in transaction" can be developers or sysadmins who
have entered "BEGIN;" in psql and forgot to "commit" or "rollback".
I've been there. :)
However, you mentioned your problem is related to have a lot of
concurrent connections. It sounds like investigating the "locks" tip
from the post above may be helpful to you.
A couple more suggestions: this problem may be secondary. The primary
problem might be that 200 connections is more than your hardware and
tuning can comfortably handle, so everything gets slow, and when things
get slow, more things are waiting for other things to finish.
If you don't have a reverse proxy like Nginx in front of your web app,
considering adding one. It can run on the same host without additional
hardware. The reverse proxy will serve to regulate the number of
connections to the backend Django web server, and thus the number of
database connections-- I've been here before with having too many
database connections and this is how I solved it!
With Apache's prefork model, there is 1=1 correspondence between the
number of Apache workers and the number of database connections,
assuming something like Apache::DBI is in use. Imagine someone connects
to the web server over a slow connection. The web and database server
take care of the request relatively quickly, but then the request is
held open on the web server unnecessarily long as the content is
dribbled back to the client. Meanwhile, the database connection slot is
tied up.
By adding a reverse proxy, the backend server can quickly delivery a
repliy back to the reverse proxy and then free the backend worker and
database slot.. The reverse proxy is then responsible for getting the
content back to the client, possibly holding open it's own connection
for longer. You may have 200 connections to the reverse proxy up front,
but you'll need far fewer workers and db slots on the backend.
If you graph the db slots with MRTG or similar, you'll see how many
slots you are actually using, and can tune down max_connections in
PostgreSQL, freeing those resources for other things.
You might also look at pg_top to
help monitor what your database is up to.

I understand this is an older question, but this article may describe the problem of idle transactions in django.
Essentially, Django's TransactionMiddleware will not explicitly COMMIT a transaction if it is not marked dirty (usually triggered by writing data). Yet, it still BEGINs a transaction for all queries even if they're read only. So, pg is left waiting to see if any more commands are coming and you get idle transactions.
The linked article shows a small modification to the transaction middleware to always commit (basically remove the condition that checks if the transaction is_dirty). I'll be trying this fix in a production environment shortly.

Multiple permanent connection for Django?

My website is using Django, and now I want to port part of the logic to a Redis, so I need a Redis connection for my views.py code, obvious I can't write connect to redis code in views.py because it might be called multiple times, so I need to put the connect somewhere in the django, perhaps middleware?
But I don't want to make this complicated, just the same place where the MySQL database connected, I want to add a global object for Redis connection. Perhaps later for XMPP conenction and ZeroMQ.
How to do this?
ANy idea is appreciated. Thanks in advance :)

in typical Django server settings multiple requests will be handled by the same worker process.
You can simply put a global variable to hold the connection on top of views.py and use the conenction in each view function/class, the connection will be established when the worker process starts and closes when the worker process got recycled. It's semi-permanent connection but good enough.
MySQL connection works the same way in Django. It's not each db connection per request but per worker process life-span

It isn't obvious you want to do that. It isn't obvious why you would want to do that.
So why not connect in views.py? To use a single "global connection" will mean adding locking/serialization code to ensure that your connection is safe to use amongst many calls to your views. I actually create and connect right in the method in my various and sundry views.py files. Sometimes I connect to one instance or another. I've seen no performance issues and also don't have to worry about concurrency safety. I let Redis figure that out.
Another aspect of a global shared connection is degraded performance - you'll have one page view waiting on another's to finish before it can run. By allowing each to have it's own connection you avoid one view slowing down another while waiting on access to a global connector.
Consider this: if your queries are so small and fast that you don't expect to see a performance hit from serializing every page that accesses Redis, then you won't see any performance degradation from a connection per page as you connect, query, and close. I highly doubt that the cost of setting up the connection is significantly more than serializing all page accesses that connect to Redis.
So my suggestion is to just try it. If and only if you see an issue should you worry about implementing something you will probably not need.

There is a great piece of code for this already. http://github.com/andymccurdy/redis-py

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js