BeautifulSoup inside a Django view makes WSGI time out - django

For some strange reason, when I instantiate a BeautifulSoup object inside a Django view, the WSGI request times out. Any help is appreciated; I have been banging my head against the wall for hours and cannot find the root of this problem.
The view:
def index(request):
    soup = BeautifulSoup('<b>Bold</b>')  # Removing this line solves the problem
    return HttpResponse('Hello')
The error message in Apache log:
[wsgi:error] [pid 4014] [client 127.0.0.1:50892] Timeout when reading response headers from daemon process 'test.local': /htdocs/test/test/wsgi.py
Update: This seems to be a bug in BeautifulSoup; however, there is no solution.

Various third-party Python packages that use C extension modules, including scipy, numpy and BeautifulSoup, will only work in the main Python interpreter and cannot be used in the sub-interpreters that mod_wsgi uses by default. You can find more detail at the link below.
http://code.google.com/p/modwsgi/wiki/ApplicationIssues#Python_Simplified_GIL_State_API
You can solve this by adding the following line to your Apache configuration file:
WSGIApplicationGroup %{GLOBAL}
If you are running multiple WSGI applications on the same server, you will want to start investigating daemon mode, because some frameworks, Django included, do not allow multiple instances to run in the same interpreter. Use daemon mode so that each application runs in its own process, and force each one to run in the main interpreter of its respective daemon process group, as sketched below.
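For reference, a minimal daemon-mode sketch for the Apache configuration; the process group name 'test.local' and the wsgi.py path are reused from the error message above, while the process and thread counts are assumptions rather than values from the question:
# Give the application its own daemon process group...
WSGIDaemonProcess test.local processes=2 threads=15
WSGIProcessGroup test.local
# ...and force it to run in the main (first) interpreter of that group.
WSGIApplicationGroup %{GLOBAL}
WSGIScriptAlias / /htdocs/test/test/wsgi.py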

Related

Django Webfaction 'Timeout when reading response headers from daemon process'

My Django app on my production server hosted on Webfaction was working fine until I just tried to restart it after pushing a change to the settings.py file. I ran
apache2/bin/restart
as usual. Then I tried to access my app on my browser, and I got a 504 Gateway timeout. I looked into the mod_wsgi logs and saw this:
[Thu Nov 03 23:46:53.605625 2016] [wsgi:error] [pid 8027:tid 139641332168448]
[client 127.0.0.1:34570] Timeout when reading response headers from daemon
process 'myapp' : /home/<me>/webapps/<myapp>/<ProjectName>/<myapp>/wsgi.py
What does this mean and how do I fix it? The only thing I changed in the settings.py file was moving some variable names around. I can still successfully interact with the app with
python2.7 manage.py shell
But I can't get to it on the web, nor use the API.
EDIT: Here's my wsgi.py file:
import os
from django.core.wsgi import get_wsgi_application
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "<myapp>.settings")
application = get_wsgi_application()
Python C extension modules, like numpy, are known to cause timeouts when used under mod_wsgi. There's a clear explanation of the problem (direct from the author of mod_wsgi) available at https://serverfault.com/a/514251/109598
If that sounds like it might be the cause of your problem, then the solution is probably simple - add the following to your httpd.conf:
WSGIApplicationGroup %{GLOBAL}
Be sure to restart your Apache instance after making that change.
Try increasing the Timeout directive in httpd.conf; it defaults to 60 seconds in Apache 2.4. For example:
Timeout 600
Here is how I was able to find the root cause of my issue.
python manage.py showmigrations
My app could not reach the database server, so it would eventually time out. Running manage.py, I could see the error message on the console.
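For anyone wanting to confirm the same root cause, a minimal sketch (not part of the original answer, and assuming a reasonably recent Django) for checking database reachability from python manage.py shell:
from django.db import connection

try:
    # Raises an OperationalError if the default database cannot be reached.
    connection.ensure_connection()
    print("database reachable")
except Exception as exc:
    print("database problem:", exc)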
In my case (Python 3.6), the mimetypes module caused this problem. I did not investigate it further, but removing a call to mimetypes.guess_type solved the problem. The call was made in the related Django view function.
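For context, this is the kind of call being described; a hypothetical illustration, not the answerer's actual view code:
import mimetypes

# Guess a Content-Type from a file name, e.g. ('application/pdf', None).
content_type, encoding = mimetypes.guess_type("report.pdf")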
I hit the same problem because the home directory of the user the WSGI process was running under had become unavailable at some point during a server upgrade.
This might help someone.
Thank you to lenhhoxung, who mentioned upgrading server capacity in the comments on one of the other solutions. I had been successfully running a demo site on an AWS EC2 Nano instance for a long time, but for some reason it suddenly started erroring out on one page that performs some complex computations. I upgraded to an AWS EC2 Micro instance and the problem was solved. I think this is worthy of its own answer here, considering it took a good chunk of a day. All credit to lenhhoxung, though!

Does Python 2.7 write errors to a log file by default?

In OpenShift I am used to being able to view errors thrown by Python by using:
rhc tail -f app-root/logs/python.log [appname]
I have recently created a local development environment and want to be able to access the same functionality, i.e. view any errors thrown by Python.
I can do this for Apache errors:
sudo tail -100 /var/log/apache2/error.log
Is there something similar that can be viewed for errors in my Python application?
Environment
Linux Mint 17 Cinnamon
Python 2.7.6
UPDATE
Slight update to this scenario: I didn't note that I am using mod_wsgi, which I now believe passes Python errors to the Apache log:
https://code.google.com/p/modwsgi/wiki/DebuggingTechniques
To get more detailed error messages, as the above document states, you can adjust the LogLevel setting in /etc/apache2/apache2.conf to:
LogLevel info
It seems to catch errors such as calling a function that has not been defined, etc.
So, as I currently understand it, recreating logging features within the application does not seem necessary for my scenario.
[Thu Dec 04 13:00:44.899950 2014] [:error] [pid 6200] [client 127.0.0.1:48228] NameError: name 'undefined_function()' is not defined
No. Your logging on cloud services (whether from Apache or other tools) is a configuration of your server environment. If you've moved to a local environment that does not automatically support that, you won't have it.
Python does have a logging module (which is rather complete and fairly complex), but it is an infrastructure for programs to log to files (or other destinations). It is not an automatic funneling of all your Python errors to a designated log file. Generally, Python error messages (sys.stderr) appear in the console in which you ran the program (or in your debugging console, if you run it from within an IDE).
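If you do want the application to write its own log file locally, a minimal sketch using the standard logging module (the file path is an arbitrary example and must be writable by the process):
import logging

logging.basicConfig(
    filename="/var/log/myapp/python.log",  # example path, not a default
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)

log = logging.getLogger(__name__)
log.error("something went wrong")  # ends up in the file above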

Differentiate between Django dev server and apache

easy question:
In my settings file, I want to set a constant depending on whether I am running from the dev server or Apache.
Any elegant way of doing this?
I am running with mod_wsgi
Use:
try:
    from mod_wsgi import version as MOD_WSGI_VERSION
except ImportError:
    MOD_WSGI_VERSION = None

if MOD_WSGI_VERSION:
    pass  # running under mod_wsgi
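In a settings file that could look like the following sketch; the constant name RUNNING_UNDER_APACHE is an assumption for illustration, not part of the original answer:
try:
    from mod_wsgi import version as MOD_WSGI_VERSION  # only importable under mod_wsgi
except ImportError:
    MOD_WSGI_VERSION = None

# True when served by Apache/mod_wsgi, False under the development server.
RUNNING_UNDER_APACHE = MOD_WSGI_VERSION is not None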

Deploying gevent in Django with mod_wsgi under Apache

It is working fine under Django's runserver with the monkey patch:
if __name__ == "__main__":
    import gevent
    from gevent import monkey
    monkey.patch_all()
    execute_manager(settings)
However, in production we are using Apache with mod_wsgi and a WSGI file. Putting the above in the WSGI file has no effect. It seems that when the WSGI file is called, it is not run as __main__, but removing the if also does nothing.
I found gevent.wsgi.WSGIHandler() and tried to replace django.core.handlers.wsgi with it, but it requires request and application as parameters, which I don't have in my wsgi file.
This is what my wsgi file looks like:
import os,sys
import django.core.handlers.wsgi
from gevent import wsgi
sys.path.append('/app/src')
sys.path.append('/app/src/webInterface')
os.environ['DJANGO_SETTINGS_MODULE'] = 'WebInterface.settings'
#application = django.core.handlers.wsgi.WSGIHandler()
application = wsgi.WSGIHandler()
You are correct that __name__ is not '__main__' under mod_wsgi. Even without the if, where in the WSGI script file did you place the monkey patch call? You don't show it in the WSGI script file you posted.
Overall, using gevent monkey patching under mod_wsgi is probably a bad idea anyway. Using gevent usually gives people a false sense of security that they no longer have to deal with thread locking, because greenlets order execution to some degree, so for simple cases locking isn't needed. It is most definitely a bad idea to rely on that under mod_wsgi, because the request handler threads are still real threads, not greenlets: they are created as external threads using Apache's thread APIs. You therefore still very much need to handle thread locking properly, for example as sketched below.
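A generic sketch, not code from the question, of protecting shared module-level state with an ordinary lock:
import threading

_counter = 0
_counter_lock = threading.Lock()

def record_hit():
    # Request handler threads under mod_wsgi are real threads, so any
    # shared state must be guarded by a real lock.
    global _counter
    with _counter_lock:
        _counter += 1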
One last thing. You may want to add to your question what you are trying to achieve in doing this, because your attempts at replacing the application with the WSGIHandler from gevent make no sense.

Why is mod_wsgi not able to write data? IOError: failed to write data

What could be causing this error:
$ sudo tail -n 100 /var/log/apache2/error.log
[Wed Dec 29 15:20:03 2010] [error] [client 220.181.108.181] mod_wsgi (pid=20343): Exception occurred processing WSGI script '/home/username/public_html/idm.wsgi'.
[Wed Dec 29 15:20:03 2010] [error] [client 220.181.108.181] IOError: failed to write data
Here is the WSGI script:
$ cat public_html/idm.wsgi
import os
import sys
sys.path.append('/home/username/public_html/IDM_app/')
os.environ['DJANGO_SETTINGS_MODULE'] = 'settings'
import django.core.handlers.wsgi
application = django.core.handlers.wsgi.WSGIHandler()
Why would Django not be able to write data?
I'm running Django 1.2.4
That error, without any sort of Python traceback, may be a variation of the issue described in:
http://code.google.com/p/modwsgi/issues/detail?id=29&can=1
That is, it occurs when the HTTP client connection is lost before the full response can be written back by the web server. It can manifest as a 'client closed connection', 'failed to write data' or 'failed to flush data' IOError in the Apache error log only. I.e., it is not seen by the WSGI application, because the data is written after the WSGI application has returned, so no exception can be thrown back to the application for it to do anything with.
The question is whether you get an error message from Django if you configure errors to be emailed to you. If you do, then something is instead going wrong inside Django itself.
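For reference, error emails in Django are enabled by setting ADMINS with DEBUG turned off; a minimal settings.py sketch with placeholder addresses, not settings from the question:
# settings.py (placeholder values)
DEBUG = False
ADMINS = [("Site Admin", "admin@example.com")]
SERVER_EMAIL = "django@example.com"  # From: address used for error mails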
I have the same problem in an application that uses a lot of AJAX calls (mod_wsgi 3.3). Is there any known solution for this? I thought about just ignoring the exception, but that is normally not a very good idea.
UPDATE
Actually, this can be due to several things, but the most probable cause is that you are using the write callback instead of yielding your output.
I believe this will help:
http://groups.google.com/group/modwsgi/browse_thread/thread/c9cc1307bc10cfff
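To illustrate the difference, a generic WSGI sketch (not code from the linked thread) that returns the body instead of using the write() callable:
def application(environ, start_response):
    # Preferred: return (or yield) the response body rather than pushing it
    # through the write() callable that start_response() returns.
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Hello, world\n']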
I found the same problem with my Python web app on DigitalOcean, and after carefully checking the log file I discovered that it was a problem with my MySQL database!
The problem was that the server was running out of memory (RAM).
So check the related questions about MySQL and memory, and solve the problem there.
Hope it will help.
I'm wagering it is a permissions issue. Try making the target directory/file universally writable. Then make the file owned by your www-data group (or whatever your Apache user is), make it group-writable, and make sure nothing in that folder is sensitive, because this could be a security problem.