Apache + Django 1.3 memory issues

I've been running a medium-sized Django 1.1 site without issues for about two years, on a Linux server with 2GB of memory. I had to upgrade to Django 1.3 (on the same server) in order to run a specific app, and of course to take advantage of the new Django features! However, I've been experiencing terrible memory issues ever since :(
I've noticed that for every hit there is a huge increase in memory usage. This can't be due to expensive requests, because the memory usage is high (e.g. 40MB) even for very simple views.
I'm using mod_wsgi, and I'm not running Django in debug mode...
Even with a few tens of hits, the memory gets filled, the server starts swapping, and eventually it dies... A temporary solution is to force Apache restarts and reloads every time memory fills up.
But I have to find where the leak is. Is it Django or Apache? Could it be that the default configuration (I've followed the how-to on the Django and mod_wsgi page), along with the Apache configuration, creates the problem?
Any advice on how I should configure Apache + mod_wsgi options is more than welcome!
Cheers,
N.L.

Try using some memory profiling/analysis tools.
At least for me, Dowser was a great help.
http://www.aminus.net/wiki/Dowser
I ended up integrating it more closely with Django:
https://github.com/munhitsu/django-dowser
Good luck!
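
For reference, wiring django-dowser into a project only takes a couple of lines. A minimal sketch, assuming the Django 1.3-era URL syntax (check the django-dowser README in case the app or URL names have changed):

# settings.py -- register the profiling app
INSTALLED_APPS += (
    'django_dowser',
)

# urls.py -- expose the memory-profiling views (Django 1.3-era syntax)
from django.conf.urls.defaults import patterns, include

urlpatterns += patterns('',
    (r'^dowser/', include('django_dowser.urls')),
)

Then browse to /dowser/ while exercising the site and watch which object counts keep growing between requests.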

Related

Django bulk_upsert saves queries to memory causing memory errors

For our Django web server we have quite limited resources, which means we have to be careful with the amount of memory we use. One part of our web server is a cron job (using Celery and RabbitMQ) that parses a ~130MB CSV file into our Postgres database. The CSV file is saved to disk and then read row by row using Python's csv module. Because the CSV file is basically a feed, we use bulk_upsert from the custom Postgres manager of django-postgres-extra to upsert our data and override existing entries. Recently we started experiencing memory errors, and we eventually found out they were caused by Django.
Running mem_top() showed us that Django was storing massive upsert queries (INSERT ... ON CONFLICT DO) including their metadata in memory. Each bulk_upsert of 15,000 rows would add 40MB of memory used by Python, leading to a total of 1GB used by the time the job finished, as we upsert 750,000 rows in total. Apparently Django does not release the query from memory after it's finished. Running the cron job without the upsert call leads to a max memory usage of 80MB, of which 60MB is the default for Celery.
We tried running gc.collect() and django.db.reset_queries(), but the queries are still stored in memory. Our DEBUG setting is set to False, and CONN_MAX_AGE is not set either. Currently we're out of clues as to where to look to fix this issue, and we can't run our cron jobs now. Do you know of any last resorts to try to resolve this issue?
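
For context, this is roughly how the diagnostics mentioned above (mem_top, gc.collect, reset_queries) are wired into our chunked upsert loop; upsert_chunk is a hypothetical stand-in for our actual bulk_upsert wrapper:

import gc
import logging

from django.db import reset_queries
from mem_top import mem_top  # pip install mem_top

logger = logging.getLogger(__name__)

def upsert_feed(rows, chunk_size=15000):
    # Upsert the CSV feed in chunks, logging memory hot spots as we go.
    chunk = []
    for row in rows:
        chunk.append(row)
        if len(chunk) >= chunk_size:
            upsert_chunk(chunk)      # hypothetical: wraps bulk_upsert()
            chunk = []
            reset_queries()          # drop Django's stored query log
            gc.collect()             # force a collection pass
            logger.debug(mem_top())  # log the top memory consumers
    if chunk:
        upsert_chunk(chunk)

Even with reset_queries() and gc.collect() in the loop, memory kept growing; the mem_top() output between chunks is what showed us the stored queries.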
Some more meta info regarding our server:
django==2.1.3
django-elasticsearch-dsl==0.5.1
elasticsearch-dsl==6.1.0
psycopg2-binary==2.7.5
gunicorn==19.9.0
celery==4.3.0
django-celery-beat==1.5.0
django-postgres-extra==1.22
Thank you very much in advance!
Today I found the solution to our issue, so I thought it'd be great to share. It turned out that the issue was a combination of Django and Sentry (which we only use on our production server). Django would log the query, and Sentry would then catch this log and keep it in memory for some reason. As each raw SQL query was about 40MB, this ate a lot of memory. For now, we have turned Sentry off on our cron job server and are looking into a way to clear the logs kept by Sentry.
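
If turning Sentry off entirely is too blunt for someone in the same spot, a gentler option is to stop Sentry from retaining query breadcrumbs at all. A sketch assuming the newer sentry-sdk client (the DSN is a placeholder, and the 'query' category name should be verified against your SDK version; the older raven client has different knobs):

import sentry_sdk
from sentry_sdk.integrations.django import DjangoIntegration

def drop_sql_breadcrumbs(crumb, hint):
    # Discard SQL breadcrumbs so huge INSERT ... ON CONFLICT statements
    # are never kept in memory; pass everything else through unchanged.
    if crumb.get('category') == 'query':
        return None
    return crumb

sentry_sdk.init(
    dsn='https://examplePublicKey@sentry.example.com/1',  # placeholder DSN
    integrations=[DjangoIntegration()],
    max_breadcrumbs=10,  # cap how many breadcrumbs are retained
    before_breadcrumb=drop_sql_breadcrumbs,
)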

Django/Python 2.7 on Google App Engine: performance fine in production, extremely slow on localhost

I am working on a new application which uses Django/Python 2.7 (an e-commerce app built in-house, with a pretty heavy-duty API).
As my title says, the app runs on Google App Engine.
If I make any minor (or major) changes to CSS/HTML/JS/Python and do a refresh on localhost, the refresh time is painfully slow: anywhere from 5 to 15 seconds.
Is this something with my computer? Something with the app itself? Something with memory usage? On my boss's computer it is not as slow, but not much quicker either.
Where could I start looking for bottlenecks? Usually it's the opposite case: fast on localhost, and slow in production...
Any advice on the matter is appreciated.
It's not really possible to answer your question without more specifics, but it does make sense for the app to run faster on Google App Engine than on the built-in single-threaded dev server. I would start by installing Django Debug Toolbar and seeing what that shows you.
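
For what it's worth, a minimal settings sketch for enabling the toolbar locally, using the Django/Python 2.7-era setting names (newer releases use MIDDLEWARE and a urls.py entry instead):

# settings.py -- django-debug-toolbar, old-style configuration
DEBUG = True

INSTALLED_APPS += ('debug_toolbar',)

MIDDLEWARE_CLASSES += (
    'debug_toolbar.middleware.DebugToolbarMiddleware',
)

# The toolbar only renders for requests coming from these IPs
INTERNAL_IPS = ('127.0.0.1',)

The SQL and timing panels should make it obvious whether the slowness is in your code, your queries, or the dev server itself.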

What is fastest as a Django production server: twisted.web2 vs. Apache mod_wsgi?

I want to deploy my Django project. Which of these two deployment methodologies is better for performance:
Django-On-Twisted
apache mod_wsgi
I know that mod_wsgi is recommended by the Django developers, but I feel Twisted is more efficient when running multiple Django instances.
As has been said, the server deployment setup won't be the bottleneck at this stage; however, I still feel there's definitely value in picking and learning something now which you're more likely to continue using in the future.
This recent benchmark generated a lot of discussion:
http://nichol.as/benchmark-of-python-web-servers
Read the comments as well as the numbers in order to get a feel for how benchmarks never show the full picture.
For a web server, Nginx is a no-brainer IMO.
For a WSGI server, I like uWSGI because it seems performant and I get the feeling it has much of the community behind it. uWSGI is well supported by Nginx.
Hope that helps :> Let us know what you go for.
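
One thing worth keeping in mind: whichever server you pick, the application-side contract is the same WSGI callable, so you can switch between mod_wsgi, uWSGI, and friends without touching app code. A Django 1.x-era sketch (the settings module name is hypothetical):

# wsgi.py -- the entry point that mod_wsgi or uWSGI loads
import os
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'mysite.settings')

import django.core.handlers.wsgi
application = django.core.handlers.wsgi.WSGIHandler()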

Haystack's RealTimeSearchIndex causes Django to hang on data entry

I'm using django-haystack and a Xapian backend with real-time indexing (haystack.indexes.RealTimeSearchIndex) of model data, and it works fine on my Ubuntu server. However, it causes Django to hang on data entry when I deploy the app on a RHEL5 server.
Everything is hunky-dory if I switch to a standard SearchIndex.
Running ./manage.py rebuild_index manually works fine too.
The major differences between the two setups would be the versions of Python (2.4.3 vs 2.6.4) and of Xapian (1.0.4-1 vs 1.0.15).
Any suggestions on what may be the problem?
Nothing interesting appears in the logs, and I've tried different databases (MySQL, SQLite3) and deployment methods (mod_python, mod_wsgi) with no luck yet.
I have noted the warning in the Haystack docs stating that RealTimeSearchIndex is only handled gracefully with a Solr backend; however, I'm running a very low-traffic site with only occasional writes, so I'm fine with some CPU overhead on writes.
Installing xapian-core and xapian-bindings from source solved the problem.
I initially used the RPM packages provided here.
Please note this from the author of xapian-haystack:
Because Xapian does not support simultaneous WritableDatabase connections, it is strongly recommended that users take care when using RealTimeSearchIndex to either set WSGIDaemonProcess processes=1 or use some other way of ensuring that there are not multiple attempts to write to the indexes. Alternatively, use SearchIndex and a cronjob to reindex content at set time intervals (sample cronjob can be found here http://gist.github.com/216247) or derive your own SearchIndex to implement some other form of keeping your indexes up to date.
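
To illustrate the quoted alternative, here is a sketch of a plain (non-realtime) index using the Haystack 1.x-era API, with Note and myapp as hypothetical names; pair it with a cronjob running ./manage.py update_index at whatever interval the site can tolerate:

# search_indexes.py -- plain SearchIndex, updated by cron instead of on save
from haystack import indexes, site
from myapp.models import Note  # hypothetical model

class NoteIndex(indexes.SearchIndex):
    text = indexes.CharField(document=True, use_template=True)

site.register(Note, NoteIndex)

Since only the cronjob process writes to the Xapian database, the simultaneous-WritableDatabase problem goes away.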

Upgrading the JRE used by ColdFusion

I have a ColdFusion 8.1 application. It gets heavy use, and I see jrun.exe using very high memory in Task Manager. This is a 32-bit Windows 2003 server. When JRun gets to around a gig of memory usage, ColdFusion will stop responding at some point. The logs are a little vague, but I start to see garbage collection and heap errors in the ColdFusion log. I assume the JRE is running out of memory.
I have the max JVM heap set to 1.2GB. After some experimenting, this seemed to be the largest amount I could allocate and still have ColdFusion start OK. I realize that going to 64-bit might solve the problem, but that is not an option at this time.
I am considering upgrading the JRE (it is at v6.x, dated pre-2008, though I don't know the exact version; I am using the JRE that came with ColdFusion 8.1). Has anyone gone through this? I assume it's just a matter of installing the new JRE and pointing ColdFusion to the new JRE directory in the ColdFusion server settings.
tia
don
It's EXTREMELY easy to do.
1) Download the Java SE Development Kit and install it like normal.
2) Open up the jvm.config for CF in a text editor; it's located in c:\coldfusion8\runtime\bin.
3) Comment out the existing java.home line by putting a "#" at the beginning of the line, then add a new java.home line below it pointing to your JVM installation.
As an example, the java.home line in my jvm.config looks like this:
java.home=C:/Program Files/Java/jdk1.6.0_11/jre
4) Restart the CF services.
As a bonus, you can run JavaRa and free up some space by deleting all the old versions of the JRE.
Adobe has a Knowledge Base that covers issues like this. Check out http://www.adobe.com/go/2d547983 for instructions.
Sean Corfield has an article that provides some info on using Java 6 with ColdFusion 8 here:
http://corfield.org/blog/index.cfm/do/blog.entry/entry/Java_6_and_ColdFusion_8
As long as you install 1.6.0_10 or greater, you should be fine. You might check out ColdFusionBloggers.org from time to time in case other JVM issues come to light in the future.
You didn't specify whether or not you were using the stand-alone server instance or a multi-server configuration. If you're getting a heavy volume of traffic and have a dual core machine with a lot of physical memory, I would consider looking into the multi-server set-up for CF8 and putting together a cluster with load balancing. This will help to distribute your traffic across several instances of CF8 and, assuming you have a beefy server, make better use of the physical resources that you have available.
-Rick
Consider moving to Java 7. Java 7 has the G1 garbage collector, which is better at memory deallocation.
If you are having out-of-memory issues, it could be because:
- functions are not using var or local scope
- <cfdump> is used in a production system
- sessions are too large or are not set to expire in a reasonable amount of time
- queries are way too large; SELECT * can cause that
- an excessive number of Query of Queries
- the site is connecting to a slow database, so resources are held until the DB returns data
- the DSN has the data buffer set to more than 64k