I have a site where it takes a few seconds to generate a page, due to having a crappy server. People visiting it will spam the refresh button. The problem is that the threads that are already loading the page don't stop, so you end up having 2-3 things generating the same page, all but one being discarded. Is there any way to check whether a page load is no longer necessary so I can abort?
Not that I'm aware of. Even if you could detect it from the user's browser, I don't think Apache can be notified, and even if it could, interrupting a running Django thread isn't that easy a solution. Have you looked into caching your pages? That should give you the biggest performance improvement, especially if generating pages is your bottleneck.
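As a rough illustration of the caching idea, here is a minimal sketch of an in-process page cache that also makes concurrent requests for the same page wait for a single render instead of each generating it. The names PageCache, get_or_render and slow_render are made up for this example, and it only helps within one process; with several Apache processes you would want a shared cache, which Django's own cache framework (for instance the cache_page decorator) can provide.

```python
import threading
import time


class PageCache:
    """Tiny in-process cache: one render per key, shared by concurrent callers."""

    def __init__(self, ttl_seconds=30.0):
        self.ttl = ttl_seconds
        self._entries = {}               # key -> (expires_at, html)
        self._locks = {}                 # key -> lock guarding the render of that key
        self._guard = threading.Lock()   # protects the _locks dict itself

    def _lock_for(self, key):
        with self._guard:
            return self._locks.setdefault(key, threading.Lock())

    def get_or_render(self, key, render):
        entry = self._entries.get(key)
        if entry and entry[0] > time.monotonic():
            return entry[1]              # fresh copy, no work needed
        with self._lock_for(key):
            # Re-check: another thread may have rendered while we waited.
            entry = self._entries.get(key)
            if entry and entry[0] > time.monotonic():
                return entry[1]
            html = render()
            self._entries[key] = (time.monotonic() + self.ttl, html)
            return html


# Demo: three "refreshes" arrive at once, but the slow page is built only once.
render_calls = []


def slow_render():
    """Hypothetical stand-in for an expensive view."""
    render_calls.append(1)
    time.sleep(0.2)
    return "<h1>dashboard</h1>"


cache = PageCache(ttl_seconds=30)
results = []
threads = [
    threading.Thread(
        target=lambda: results.append(cache.get_or_render("/home", slow_render))
    )
    for _ in range(3)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

This does not abort anything; the extra requests simply block until the first render finishes and then reuse its result.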
Related
I have to ask a more or less non-typical SO question and hope you don't mind. Right now I am developing my very first web application. I did set up an AJAX function that requests some data from a third party API and populates my html containers with the returned data.
Right now I query one single object and populate 3 html containers with around 15 lines of Javascript code. When I activate the process/function by clicking a button on my frontend, it needs around 6-7 seconds until the html content is updated.
Is this a reasonable time? The user experience honestly will be more than bad considering that I will have to query and manipulate far more data (I build a one-site dashboard related to soccer data).
There might be very controversial answers to that question, but what would be a fair enough time for the process to run using standard infrastructure? 1-2 seconds? (I will deploy the app on heroku or digitalocean and will implement a proper caching environment to handle "regular visitors").
Right now
I use a virtualenv and django server for the development
and a demo server from the third party API which might be slowed down for whatever reason (how to check this?)
which might affect the current time needed (there will be many more variables obv.).
Looking forward to your answers.
I personally think (and probably a lot of people would agree) that 6-7 secs is a significant delay for rendering a small page. The cause of this issue might not come from Django directly. Check the following:
I use a virtualenv and django server for the development
You may be running the Django dev server; a production server might make things a bit faster (use django-debug-toolbar to find what is causing the delay).
Add DB indexes in your models where they are needed.
a demo server from the third party API which might be slowed down for whatever reason
Use the 'Network' tab in Chrome developer tools to watch how long that third-party call takes. It might not be visible there if you call the API in your views.py. In that case, add some timing code there to measure how long the call takes to return.
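The timing code mentioned above could look something like this minimal sketch. The helper timed and the function fetch_team_stats are hypothetical names for this example; the latter just sleeps to stand in for the real third-party API call made from the view.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, sink=None):
    """Measure the wall-clock time of the enclosed block and report it."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        if sink is not None:
            sink[label] = elapsed        # keep the number for later inspection
        print(f"{label}: {elapsed:.3f}s")


def fetch_team_stats():
    """Hypothetical stand-in for the third-party API call."""
    time.sleep(0.05)
    return {"team": "example"}


timings = {}
with timed("third-party API", timings):
    data = fetch_team_stats()
```

Wrapping the API call and the rest of the view separately should show quickly whether the 6-7 seconds are spent waiting on the demo server or in your own code.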
Why this (hopefully) isn't a broad question:
I've been looking at the Django source code on syndication. I understand functionally what these feeds are and what they do but I'm not sure how the magic happens.
Actual question:
What is Django doing under the hood to send these changes out across the wire? Is Django just creating an object (like an XML file) the Client reads and not even using the network? What mechanism is employed to ensure users get those updates in a 'reasonable' amount of time - is it a combination of the browser (or some other software) knowing to go look for updates while Django diligently adds data to a file, or does Django do most of the work?
There's no magic, and Django does not do anything to even try to ensure clients get updates in any particular amount of time.
Feeds, like almost everything on the web, are an entirely pull-based mechanism. Feed readers are responsible for periodically requesting updates from the server.
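To illustrate the pull model, here is a small self-contained sketch. It is not how Django's syndication framework is implemented; render_feed and poll are made-up names, and a plain function call stands in for the HTTP request a feed reader would make. The point is only that the feed document is built when it is requested, and that noticing new items is entirely the reader's job.

```python
import xml.etree.ElementTree as ET

posts = ["First post"]   # stand-in for whatever the site stores in its database


def render_feed():
    """Server side: build the feed XML from current data each time it is requested."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "Example feed"
    for title in posts:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
    return ET.tostring(rss, encoding="unicode")


def poll(seen):
    """Reader side: fetch the feed, parse it, and return items not seen before."""
    root = ET.fromstring(render_feed())
    titles = [el.text for el in root.findall("./channel/item/title")]
    new_items = [t for t in titles if t not in seen]
    seen.update(new_items)
    return new_items


seen_titles = set()
first_poll = poll(seen_titles)      # reader checks for the first time
posts.append("Second post")         # the site publishes something; nothing is pushed
second_poll = poll(seen_titles)     # reader only finds out on its next poll
```

How soon a subscriber sees "Second post" therefore depends on how often their reader polls, not on anything the server does.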
I am using Django with Apache mod_wsgi. My site has dynamic data on every page, and all of the media (css, images, js) is stored in Amazon S3 buckets linked via "http://bucket.domain.com/images/*.jpg" inside the markup. My question is, can Varnish still help me speed up my web server?
I am trying to knock down all the stumbling blocks here. Is there something else I should be looking at? I have made a query profiler for my code, and each page renders in about 0.120 CPU seconds, which seems quick enough, but when I use ab -c 5 -n 100 http://mysite.com/ the results are only Requests per second: 12.70 [#/sec] (mean) . . .
I realize that there are lots of variables at play, but I am looking for some guidance on things I can do and thought Varnish might be the answer.
UPDATE
here is a screenshot of my profiler
The only way you can improve your performance is if you measure what is slowing you down. Though it's not the best profiler in the world, Django has good integration with the hotshot profiler (described here) and you can figure out what is taking those 0.120 cpu seconds.
Are you using 2 CPUs? If that's the case, then perhaps the limitation is in the DB when you use ab? I only say that because 0.120 * 12.70 is about 1.5, which suggests roughly 0.5 seconds is spent waiting for something. This could also be I/O or something similar.
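The arithmetic behind that hunch can be spelled out as a quick sketch. The numbers come from the question (0.120 CPU seconds per page, 12.70 requests per second, ab run with -c 5); the second calculation assumes ab kept all 5 connections busy for the whole run, in which case the mean time per request is the concurrency divided by the request rate.

```python
cpu_seconds_per_request = 0.120   # from the question's profiler
requests_per_second = 12.70       # from the ab output
concurrency = 5                   # ab -c 5

# CPU seconds consumed per second of wall-clock time.
cpu_busy = cpu_seconds_per_request * requests_per_second      # about 1.52

# Mean wall-clock time each request spent in the system.
mean_latency = concurrency / requests_per_second              # about 0.39 s

# Portion of that time not accounted for by the measured CPU work.
unaccounted = mean_latency - cpu_seconds_per_request          # about 0.27 s

print(f"CPU busy per wall second: {cpu_busy:.2f}")
print(f"mean latency per request: {mean_latency:.2f}s")
print(f"time not spent on CPU:    {unaccounted:.2f}s")
```

Under those assumptions, each request spends more time waiting (on the database, I/O, or a free worker) than it spends on the measured CPU work, which is why profiling is the next step.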
Adding another layer such as Varnish for no reason is generally not a good idea. The only case where something like Varnish would help is if you have slow clients with poor connections holding onto threads, but the ab test is not hitting this condition, and frankly it's not a large enough issue to warrant the extra layer.
Now, the next topic is caching, which Varnish can help with. Are your pages customized for each user, or can they be static for long periods of time? Oftentimes pages are static except for a simple login status display; in this case consider offloading that login status to JavaScript with cookies. If you are able to cache entire pages then they would be extremely fast in ab. However, the next problem is that ab is not really a good benchmark of your site, since users aren't going to just sit at one page and hit F5 repeatedly.
A few things to think about before you go installing varnish:
First off, have you enabled the page caching middleware within Django?
Are etags set up and working properly?
Is your database server performing optimally?
Have you considered setting up caching using memcached within your code for common results? (particularly front pages and static pages displayed to non-logged-in users)
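For reference, the first and last items in that checklist might look roughly like the settings fragment below. This is a sketch assuming a reasonably recent Django release (older versions used MIDDLEWARE_CLASSES and different memcached backend names); the memcached address, the timeout and the key prefix are placeholder values, not recommendations.

```python
# settings.py (fragment)

MIDDLEWARE = [
    "django.middleware.cache.UpdateCacheMiddleware",     # must be first in the list
    "django.middleware.http.ConditionalGetMiddleware",   # handles ETag / Last-Modified
    "django.middleware.common.CommonMiddleware",
    "django.middleware.cache.FetchFromCacheMiddleware",  # must be last in the list
]

CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.memcached.PyMemcacheCache",
        "LOCATION": "127.0.0.1:11211",   # assumed local memcached instance
    }
}

CACHE_MIDDLEWARE_SECONDS = 60             # how long a cached page is served
CACHE_MIDDLEWARE_KEY_PREFIX = "mysite"    # hypothetical prefix to avoid key clashes
```

With the per-site cache enabled like this, anonymous GET requests for public pages are answered from memcached without running the view at all.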
Except for query heavy dynamic pages that absolutely must display substantially different data for each user, .12 seconds seems like a long time to serve a page. Look at how you can work caching into your app to improve that performance. If you have a page that is largely static other than a username or something similar, cache the computed part of the page.
When Django's caching is working properly, ab on public pages should be lightning fast. If you aren't using any of the other features of Apache, consider using something lighter and faster like lighttpd or nginx.
Eric Florenzano has put together a pretty useful project called django-newcache that implements some advanced caching behavior. If you're running into the limitations of the built-in caching, consider it. http://github.com/ericflo/django-newcache
I'm trying to deploy my first django site through mod_wsgi (on a VPS that also serves PHP pages). Once the first django page is loaded the site runs pretty quick, but loading up that first page is excruciating - at least 15 seconds, sometimes 30 seconds+.
During the first page load, memory (384MB) is maxed out and other tasks also slow to a crawl. I'm pretty new to Django so I'm not quite sure how to solve this. Unfortunately running Django through its own server (as opposed to one that also serves PHP) isn't really feasible.
Any suggestions appreciated.
As answered on #django, this is likely because embedded mode is being used rather than daemon mode. The dangers of this are described at:
http://blog.dscpl.com.au/2009/03/load-spikes-and-excessive-memory-usage.html
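For anyone landing here, switching mod_wsgi to daemon mode is done in the Apache configuration. The fragment below is only a sketch: the process group name "mysite", the script path and the process/thread counts are placeholders that would need adjusting for the actual VPS, especially with only 384MB of memory.

```apache
# Run the Django app in its own small pool of daemon processes
# instead of inside every Apache worker (embedded mode).
WSGIDaemonProcess mysite processes=2 threads=15 display-name=%{GROUP}
WSGIProcessGroup mysite
WSGIApplicationGroup %{GLOBAL}

# Hypothetical path to the project's WSGI script.
WSGIScriptAlias / /srv/mysite/mysite/wsgi.py
```

This keeps the Django code out of the Apache processes that serve the PHP pages, so only the daemon processes carry its memory footprint.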
I'm building a site with django that lets users move content around between a bunch of photo services. As you can imagine the application does a lot of api hits.
For example: a user connects picasa, flickr, photobucket, and facebook to their account. Now we need to pull content from 4 different apis to keep this user's data up to date.
Right now I have a function that updates each api, and I run them all simultaneously via threading. (All the apis that are not enabled return false on the second line, so it's not much overhead to run them all.)
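For readers who want a picture of that setup, here is a minimal sketch of what it might look like. The updater functions, the shape of the user dict and run_all are all invented for this example; the real updaters would call the respective APIs.

```python
import threading


def update_picasa(user):
    if "picasa" not in user["services"]:
        return False                      # service not enabled: bail out early
    return ["picasa-photo-1"]             # placeholder for real API results


def update_flickr(user):
    if "flickr" not in user["services"]:
        return False
    return ["flickr-photo-1", "flickr-photo-2"]


def update_facebook(user):
    if "facebook" not in user["services"]:
        return False
    return ["facebook-photo-1"]


def run_all(user, updaters):
    """Run every updater in its own thread and collect the results by name."""
    results = {}
    lock = threading.Lock()

    def worker(name, fn):
        outcome = fn(user)
        with lock:
            results[name] = outcome

    threads = [
        threading.Thread(target=worker, args=(name, fn))
        for name, fn in updaters.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()                          # wait until every API has answered
    return results


UPDATERS = {
    "picasa": update_picasa,
    "flickr": update_flickr,
    "facebook": update_facebook,
}
example_user = {"services": {"picasa", "flickr"}}
update_results = run_all(example_user, UPDATERS)
```

Because the threads run side by side, the total time is roughly that of the slowest API rather than the sum of all of them.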
Here is my question:
What is the best strategy for keeping content up to date using these APIs?
I have two ideas that might work:
Update the apis periodically (like a cron job) and whatever we have at the time is what the user gets.
benefits:
It's easy and simple to implement.
We'll always have pretty good data when a user loads their first page.
pitfalls:
we have to do api hits all the time for users that are not active, which wastes a lot of bandwidth
It will probably make the api providers unhappy
Trigger the updates when the user logs in (on a pageload)
benefits:
we save a bunch of bandwidth and run less risk of pissing off the api providers
doesn't require NEARLY the amount of resources on our servers
pitfalls:
we either have to do the update asynchronously (and won't have anything on first login) or...
the first page will take a very long time to load because we're getting all the api data (I've measured 26 seconds this way)
edit: the design is very light; it has only two images, an external css file, and two external javascript files.
Also, the 26 seconds number comes from the firebug network monitor running on a machine which was on the same LAN as the server
Personally, I would opt for the second method you mention. The first time you log in, you can query each of the services asynchronously, showing the user some kind of activity/status bar while the processes are running. You can then populate the page as you get the results back from each of the services.
You can then cache the results of those calls per user so that you don't have to call the apis each time.
That lightens the load on your servers, loads your page fast, and provides the user with some indication of activity (along with incremental updates to the page as their content loads). I think those add up to the best user experience you can provide.
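A rough sketch of that approach, under some assumptions: fetch_service stands in for the real API calls (it only sleeps), SERVICES maps made-up service names to simulated latencies, and user_cache is a plain dict where a real site would use its cache backend or database. In a web app the results would be delivered to the page via AJAX rather than a Python generator, but the flow is the same.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

SERVICES = {"flickr": 0.15, "picasa": 0.05, "facebook": 0.10}  # name -> fake latency
user_cache = {}   # user_id -> {service: result}


def fetch_service(name, delay):
    """Hypothetical stand-in for one slow photo-service API call."""
    time.sleep(delay)
    return f"{name}: 3 new photos"


def refresh(user_id):
    """Yield (service, result) as each service answers, then cache the full set."""
    collected = {}
    with ThreadPoolExecutor(max_workers=len(SERVICES)) as pool:
        futures = {
            pool.submit(fetch_service, name, delay): name
            for name, delay in SERVICES.items()
        }
        for future in as_completed(futures):
            name = futures[future]
            collected[name] = future.result()
            yield name, collected[name]   # the page can show this part right away
    user_cache[user_id] = collected       # later page loads can reuse this


def load_page(user_id):
    """Serve cached results if present, otherwise trigger a refresh."""
    if user_id in user_cache:
        return user_cache[user_id]
    return dict(refresh(user_id))


arrival_order = [name for name, _ in refresh("user-42")]
cached = load_page("user-42")             # answered from the per-user cache
```

The faster services typically show up first, so the user sees content arriving instead of staring at a blank page for the full duration of the slowest call.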