I'm developing on Mac OS X 10.8. Lately I've been noticing that when I've been running the dev server for a while and then hit Ctrl-C to exit, the process continues to run in the background. I have to do a ps to find the process and kill it, or it won't let me use the same address:port again.
I didn't have to do that in earlier versions of Django (I'm currently running 1.7.3 on this project). Seems a bit messy, but I don't know of another way to stop the dev server and free the port/resources?
Rgds,
Ross.
The Django development server has multiple threads, so when the main process is closed there may still be some threads running in the background. This happens when a request is still being processed (a long-running request can hang, or if you're using websockets or something similar, an open connection can prevent the thread from closing).
Check that all of your requests have finished before closing your server and it should shut down properly.
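As a minimal sketch (outside Django, purely to illustrate the point), a single non-daemon background thread is enough to keep a Python process alive after the main thread is done; the function name here is just a placeholder for a hung request handler or open connection:

import threading
import time

def long_running_request():
    # stands in for a hung request or an open websocket connection
    while True:
        time.sleep(1)

# daemon=False (the default) means the interpreter waits for this thread,
# so the process lingers - still holding the listening port - until killed.
t = threading.Thread(target=long_running_request, daemon=False)
t.start()

print("main thread finished, but the process stays alive because of the worker thread")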
Introduction
I encountered this very interesting issue this week; better to start with some facts:
pylibmc is not thread safe: when used as the Django memcached backend, multiple Django instances started directly in the shell crash when hit with concurrent requests.
if deployed with nginx + uWSGI, this pylibmc problem magically disappears.
switching the Django cache backend to python-memcached also solves the problem, but this question isn't about that.
Elaboration
Starting with the first fact, this is how I reproduced the pylibmc issue:
The failure of pylibmc
I have a Django app that does a lot of memcached reading and writing, and the deployment strategy is to start multiple Django processes in the shell, bound to different ports (8001, 8002), and use nginx to do the load balancing.
I ran two separate load tests against these two Django instances using locust, and this is what happened:
In the above screenshot they both crashed and reported exactly the same issue, something like this:
Assertion "ptr->query_id == query_id +1" failed for function "memcached_get_by_key" likely for "Programmer error, the query_id was not incremented.", at libmemcached/get.cc:107
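For what it's worth, the same class of failure can be provoked outside Django by sharing a single pylibmc client between threads; a rough sketch (the server address and key names are placeholders, and the exact crash depends on timing and library versions):

import threading
import pylibmc

# One client shared by every thread - the unsafe pattern.
client = pylibmc.Client(["127.0.0.1"], binary=True)

def hammer(worker_id):
    for i in range(10000):
        key = "key-%d-%d" % (worker_id, i)
        client.set(key, "value")
        client.get(key)

threads = [threading.Thread(target=hammer, args=(n,)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# With enough concurrency this tends to die with libmemcached assertions
# similar to the query_id error quoted above.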
uWSGI to the rescue
So from the above case we learned that multi-threaded concurrent requests to memcached via pylibmc can cause issues, yet this somehow doesn't bother uWSGI with multiple worker processes.
To prove that, I started uWSGI with the following settings included:
master = true
processes = 2
This tells uWSGI to start two worker processes. I then told nginx to serve any Django static files and route non-static requests to uWSGI, to see what happens. With the server started, I launched the same locust test against Django on localhost, making sure there were enough requests per second to cause concurrent requests against memcached. Here's the result:
In the uWSGI console there's no sign of dead worker processes, and no worker has been re-spawned, but looking at the upper part of the screenshot there certainly were concurrent requests (5.6 req/s).
The question
I'm extremely curious about how uWSGI makes this go away, and I couldn't work it out from their documentation. To recap, the question is:
How does uWSGI manage worker processes so that multi-threaded memcached requests don't cause Django to crash?
In fact I'm not even sure whether it's the way uWSGI manages worker processes that avoids this issue, or some other magic that comes with uWSGI that's doing the trick. I've seen something called a memcached router in their documentation that I didn't quite understand; is that related?
Isn't it because you actually have two separate processes managed by uWSGI? As you are setting the processes option instead of the workers option, you should actually have multiple uWSGI processes (I'm assuming a master + two workers because of the config you used). Each of those processes will have its own loaded pylibmc, so there is no state sharing between threads (you haven't configured threads on uWSGI after all).
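To illustrate that explanation with a rough sketch (not the actual uWSGI internals): when each worker is a separate process, every worker constructs its own pylibmc client, so nothing libmemcached-related is shared - roughly what processes = 2 with no threads configured gives you.

import multiprocessing
import pylibmc

def worker(worker_id):
    # Each worker process creates its own client after the fork, so there is
    # no shared libmemcached state between workers.
    client = pylibmc.Client(["127.0.0.1"], binary=True)
    for i in range(1000):
        client.set("worker-%d-%d" % (worker_id, i), "value")

if __name__ == "__main__":
    procs = [multiprocessing.Process(target=worker, args=(n,)) for n in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()

(If you did want threads within a worker, pylibmc documents pooling helpers such as ThreadMappedPool for that, but that's beyond this question.)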
I am getting intermittent errors -
child process XXX still did not exit, sending a SIGTERM.. and then a SIGKILL. It occurs intermittently and the web page hangs.
I was not using a daemon process, but now I am and the problem still exists.
I also get some "Error opening file for reading: Permission Denied" errors.
Please can someone help?
I am new to this forum, so sorry if that has been answered before.
If you were not using daemon mode of mod_wsgi, that would imply that Apache must have been restarted at the time that initial message was displayed.
What is occurring is that when trying to do a restart, Apache sends a SIGTERM to its child processes. If they do not exit of their own accord it will send SIGTERM again at 1 second intervals and finally send a SIGKILL after 3 seconds. The message is warning you of the latter, i.e. that it force killed the process.
The issue now is what is causing the process to not shutdown promptly. There could be various reasons for this.
Using an extension module for Python which doesn't work properly in sub-interpreters, which is deadlocking and hanging the process, preventing it from shutting down. See http://code.google.com/p/modwsgi/wiki/ApplicationIssues#Python_Simplified_GIL_State_API
Use of background threads in the Python web application which have not properly been marked as daemon threads, with the result that they block process shutdown (see the sketch below).
Your web application is simply trying to do too much on process shutdown somehow and not completing within the time limit.
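For the background-thread case, a minimal sketch of how such a thread should be marked so it does not hold up interpreter shutdown (the function name is illustrative, not from the original application):

import threading
import time

def poll_something():
    # hypothetical periodic job running alongside the WSGI application
    while True:
        time.sleep(30)

# A daemon thread is not waited for at shutdown, so it cannot block the
# Apache/mod_wsgi child process from exiting within the timeout.
poller = threading.Thread(target=poll_something)
poller.daemon = True
poller.start()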
Even if using daemon mode you will likely see this behaviour as it implements a similar shutdown timeout, albeit that the timeout is configurable for daemon mode.
Anyway, force use of the main Python interpreter as explained in the documentation link above.
As to the permissions issue, read:
http://code.google.com/p/modwsgi/wiki/ApplicationIssues#Access_Rights_Of_Apache_User
http://code.google.com/p/modwsgi/wiki/ApplicationIssues#Application_Working_Directory
In short, ensure the access permissions of the files/directories you need to access are correct, and ensure you are always using absolute path names when accessing the file system.
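As an example of the absolute path advice (the file name here is just a placeholder), anchor file access to the module's own location rather than to whatever working directory Apache happens to use:

import os

# Directory containing this application module.
BASE_DIR = os.path.dirname(os.path.abspath(__file__))

# Build absolute paths instead of relying on the current working directory,
# which under Apache/mod_wsgi is usually not your project directory.
config_path = os.path.join(BASE_DIR, "settings.ini")

with open(config_path) as f:
    config_data = f.read()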
I am using Gunicorn to power a Django application on a remote server (Ubuntu), to which I connect by ssh. Once Gunicorn has started, the status log pops up showing you what is going on and such. However, when I close my ssh session and reconnect later on, I can't seem to reopen the process without killing Gunicorn and rebooting the server.
Not sure if I understand your issue correctly...
When running django/gunicorn usually it is helpful to use some tools to control the processes. One really good option to do so is the use of supervisord:
http://docs.gunicorn.org/en/latest/deploy.html#supervisor
If you just want to run the processes directly and be able to (dis)connect, screen is generally a good option.
It allows you to disconnect an ssh session while leaving your (virtual) terminals running.
Just re-ssh to your server and reconnect using:
screen -xr
I can't quite figure out a pattern to why/when I see it. Cheers.
By "pipe" it means the TCP connection between the server and the browser. By "broken" it means closed.
You'll see broken pipes when somebody closes their browser window, hits stop, or sometimes just from timing out because something else breaks the connection.
The confusing thing is that the python process likely won't notice that the connection is closed until it tries to write to it, which could be well after the connection closes.
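A small sketch of that last point, using plain sockets (it is timing-dependent, so treat it as an illustration rather than a guaranteed reproduction):

import socket
import time

# Server socket bound to an ephemeral port on localhost.
srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)

# The "browser" connects, then immediately goes away.
cli = socket.socket()
cli.connect(srv.getsockname())
conn, _ = srv.accept()
cli.close()
time.sleep(0.1)  # give the FIN a moment to arrive

try:
    conn.sendall(b"x" * 65536)  # the first write often still succeeds
    time.sleep(0.1)
    conn.sendall(b"x" * 65536)  # a later write finally surfaces the broken pipe
except (BrokenPipeError, ConnectionResetError):
    print("only noticed the closed connection when writing to it")
finally:
    conn.close()
    srv.close()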
I get this error when a browser closes a connection (it can time out, or be closed manually). Normally it happens when I send too many connections to runserver at once (e.g. when I'm serving static media and loading a heavy page for the first time).
Django's runserver should not be used in production, and it doesn't handle concurrent connections with any grace. If this happens a lot, you can consider using something like django_cpserver or gunicorn in development, but you don't get as much debug information out of them in the console.
Possible Duplicate:
How to restart Coldfusion Application Server when application is timeout?
Currently I have a ColdFusion application that causes server issues. After 1-2 days the server stops responding until a manual restart is done.
I know that I have to find out what is going wrong in my scripts, and I have spent a lot of time on that over several weeks.
But in the meantime I would like a script that automatically restarts the ColdFusion service when it is stuck.
I don't have much knowledge of batch scripting etc., but I guess the test would be a request to a .cfm page, and if no response arrives before a timeout, the service gets restarted?
Has anyone ever come across a script like this?
Config: Win 2k8 Server R2 - Coldfusion 9(.0.0)
Thank you
Two things here
The real way is to fix the issue, and you can do that with FusionReactor - http://www.fusion-reactor.com/fr/ It will help you monitor, restart and self-heal as needed.
You could create a batch file, and create a Scheduled Task in Windows that runs it.
Using Net Start / Net Stop Commands
net stop "Macromedia JRun CFusion Server"
net start "Macromedia JRun CFusion Server"
Though this may not always work, so I have a batch file:
c:\JRun4\uninstall\KillJRun.exe
net start "Macromedia JRun CFusion Server"
Which works for me.
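If you want the kind of health check the original question describes, a rough sketch (meant to be run from a Windows Scheduled Task; the probe URL is a hypothetical .cfm page, and the service name is taken from the batch file above) could look like this:

import subprocess
import urllib.request

HEALTH_URL = "http://localhost/healthcheck.cfm"   # hypothetical probe page
SERVICE = "Macromedia JRun CFusion Server"        # service name used above

def coldfusion_responding(timeout=30):
    try:
        with urllib.request.urlopen(HEALTH_URL, timeout=timeout) as resp:
            return resp.status == 200
    except Exception:
        return False

if not coldfusion_responding():
    # Mirror the batch file: stop the service (KillJRun.exe could be used here
    # instead if the service hangs rather than stops), then start it again.
    subprocess.call(["net", "stop", SERVICE])
    subprocess.call(["net", "start", SERVICE])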
Your best bet is to use Pingdom or another server monitoring tool. When the server goes down (responds with a 503 Service Unavailable error) you may be able to have Pingdom send a request to a PHP script on the server that calls a batch file. I am not sure if Pingdom supports pinging another server if one is down, but you could have Pingdom email an inbox that your PHP script checks every few minutes.
This may end up being more work than figuring out what is wrong with your script though.
Edit: You may want to look at this question. This will only work if the service has stopped, whereas usually when a script crashes ColdFusion it is hanging. If you run the script that crashes the server and then look at the service, and it says stopped, then this may work for you.
The other thing that I would check is the JVM memory. Often crashes are due to processing large amounts of data from files or the database, and the JVM doesn't have the memory to do so.
Nope. It cannot be restarted automatically when your CF service/server is hanging. The only way is to restart it on a Windows schedule.
You could also use Nagios plus plugins to fire a restart script when the service hangs. But following the previous advice and finding out what the problem is remains your best bet.