I have been visiting some sites hosted on GAE and I found them to be very slow.
Pretty much all of them take longer than usual to load.
Time: (in seconds) [ YSlow ]
9.9 giftag.com
3.1 hotskills.net
1.9 jeeyo.net
1.5 appspot.com
Is it that App Engine Cloud is too slow, Bigtable is too slow ... or what?
You're using the YSlow plugin to measure this, and YSlow tells you why the site is slow (the cunning name is the clue). For example, in the case of giftag.com, YSlow reports that:
This page has 9 external JavaScript scripts. Try combining them into one.
This page has 3 external stylesheets. Try combining them into one.
This page has 13 external background images. Try combining them with CSS sprites.
So it gets an 'E' grade for that, and that's going to kill the perceived load performance of the site.
None of this has anything to do with App Engine.
YSlow has nothing to do with the speed of the web app on the server side, since it is a purely client-side measurement (CSS, JavaScript, browser rendering, image loading, etc.). On the other hand, an application can be slow on App Engine if it doesn't get many hits: with low traffic, App Engine doesn't keep the Python runtime environment cached, so requests hit a cold start, and that can make a significant difference to the performance of low-traffic applications.
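If cold starts are the culprit, one common mitigation at the time was to handle App Engine's warmup requests so an instance is primed before real traffic reaches it. A minimal sketch, assuming the Python runtime with webapp2; the handler body is only illustrative:

    # Keep-warm sketch for the App Engine Python runtime (webapp2 assumed).
    # Warmup requests must also be enabled in app.yaml:
    #   inbound_services:
    #   - warmup
    import webapp2

    class WarmupHandler(webapp2.RequestHandler):
        def get(self):
            # Import heavy modules and prime in-process caches here so the
            # first real user request does not pay the cold-start cost.
            self.response.set_status(200)

    app = webapp2.WSGIApplication([('/_ah/warmup', WarmupHandler)])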
Analysis: Google App Engine alluring, will be hard to escape
GAE's data access is on the order of seconds, compared to a database where access is measured in milliseconds. The difference is that BigTable scales to millions of concurrent accesses due to its inherent isolation level of Read Uncommitted and its relaxed consistency.
No RDBMS can compete with that and still give consistency guarantees. To be honest, you don't really want it to, because for some applications you want strong guarantees over scalability.
We need our Sitecore web application to process 60-80 web requests per second. We are using Sitecore 7.0. We have tried a 1 web server + 1 database server deployment, but it only processes 20-25 requests per second; the web server queues up all the other requests in memory, and as we increase the load, memory fills up. (We applied all the recommended Sitecore performance enhancements.) We need 4X the performance to reach the goal :).
Will it be possible to achieve this goal by upgrading the existing server, or do we have to add more web servers to the production environment?
Note: We are using Lucene indexing as well.
Here are some things you can consider without changing the overall architecture of your deployment:
CDN to offload media and static asset requests
This leaves your content delivery server available to handle important content queries and display logic.
Example: www.cloudflare.com
Configure and use Sitecore's built-in caching
This is from the guide:
Investigation and configuration of the Sitecore caches is broken down into multiple tasks, so that each task is more focused and simplified. The focus is on configuration and tuning of the Sitecore database caches (prefetch, data, and item caches). For configuration of the output rendering caching properties, the customer should be made aware of both the Sitecore Cache Configuration Reference and the Sitecore Presentation Component Reference, which explain how to properly enable these caches and the properties used to expire them.
Check out the Sitecore Tuning Guide
Find Slow Queries or Controls
It sounds like your application follows Sitecore best practices, but I leave this note in for anyone that might find this answer. Use Sitecore's built-in Debug mode to identify the slowest running controls and sublayouts. Additionally, if you have Analytics set up there is a "Slow Pages" report that might give you some information on where your application is slowing down.
Those things being said, if you're prepared to provision additional servers and set up a load-balanced environment then read on.
Separate Content Delivery and Content Management
To me the first logical step before load-balancing content delivery servers is to separate the content management from the equation. This is pretty easy and the Scaling Guide walks you through getting the HistoryEngine set up to keep those Lucene indexes up to date.
Set up Load Balancer with 2 or more Content Delivery servers
Once you've done the first step, this can be as easy as cloning your content delivery server and adding it to your load balancer "pool". There are a couple of things to consider here. Does your web application allow users to log in? If so, you'll need to worry about sticky sessions or machine keys. Does your web application use file media instead of blob media? I haven't had to deal with this, but I understand that's another consideration.
Scale your SQL solution
I've seen applications with up to four load balanced content delivery servers and the SQL Server did not have a problem - I think this will be unique to each case depending on a lot of factors: horsepower and tuning of SQL Server, content model of your application, complexity of your queries, caching configuration on content delivery servers, etc. Again, the Scaling Guide covers SQL Mirroring and Failover, so that is going to be your first stop on getting that going.
Finally, I would say contact Sitecore. These guys have probably seen more of what's gone right and what's gone wrong with installations and could get you on the right path. Good luck!
This answer is written from a Sitecore developer's perspective:
Bottom line: You need to figure out exactly where your performance bottleneck is. That is going to take some digging, but will be very worthwhile. You should definitely be able to serve 60-80 requests/s without any trouble... but of course that makes a lot of assumptions about the nature of your site and the requests.
For my site, I found Sitecore's caching implementation to be sub-par... I created some very simple and aggressive application-specific caches in my app and this made all the difference in the world. For instance, we have 900+ "Partner" items where our sites' advertisements live... and simply putting all these objects in an array in the Application object sped up page requests significantly. Finding an object in a Hashtable indexed by its Item.Name or ID is going to be a lot faster than Sitecore.Context.Database.GetItem("/itempath") or a SelectItems() call (at least, that's my experience). If your architecture and data set will allow this strategy, we've had good experience with it.
Another thing to watch out for is XSLT renderings. Personally, I avoid them completely in favor of ASP.NET UserControls. The XSLT rendering is just slow. As much as 10x slower than a native UserControl rendering the same HTML. So if you have a few of these... replace with some custom code and you'll see a world of difference.
Our website now has about 200K images stored in the Sitecore database, and it runs more slowly than before. Will this large number of images stored in the database slow down the whole site?
If yes, how can I improve our image storage?
Thanks very much; our Sitecore version is 6.2.
Have you considered setting up a CDN for your static assets? That would reduce load on your site and should speed it up.
Otherwise you might look at optimising the databases. Have a look at the Sitecore Optimisation Guide http://sdn.sitecore.net/upload/sitecore6/64/cms_tuning_guide_sc60-64-a4.pdf
In general, it depends on whether the front end or the back end is slow.
If you experience issues even when the site is not loaded with a huge number of requests, then you should probably upgrade hardware.
If it's caused by high website load - there are two rather simple options:
1) Use a dedicated image server for Sitecore: http://pentialized.dk/2011/01/02/dedicated-image-server-in-sitecore-part-2/
2) Integrate the Media Library with a CDN; CloudFront is really simple and powerful, see the example here: http://herskind.co.uk/blog/2012/04/using-cloudfront-for-sitecore-media-content
I need some guidance in the realm of server architecture for Django.
My current Django-based web app stats (reached in two weeks - run on one VPS w/ Apache, mod_wsgi, mysql):
10,000 users total
20 avg requests/user/day
200,000 requests/day
8,000 users access site daily
Where the app could reach (where I'd be panicking - this assumes approx linear growth):
200,000 users total
20 avg requests/user/day
4,000,000 requests/day
160,000 users access site daily
The issue here is really just handling page requests. I only store short strings of text-based data, so DB size shouldn't be an issue.
What sort of server architecture should I be setting up from a hardware and software perspective? I need to think about caching, load balancing, multiple processing servers, multiple DB servers, etc, but don't know where to start.
Your projected growth of ~45 requests per second (roughly 4,000,000 requests averaged over a day) really isn't that intensive. I think a standard nginx load balancer in front of your web servers will handle everything. If your DB access isn't very intense, you will probably do fine with just one DB machine.
I really think the most important thing is not to do any premature optimization. Deal with issues as they come, or else you may end up wasting a lot of time.
There are tons of caching, multiple server configurations, and load balancing tutorials.
Google is a good place to start.
Growing traffic is a standard problem; there is no lack of tutorials on these things.
I am using Django with Apache mod_wsgi. My site has dynamic data on every page, and all of the media (CSS, images, JS) is stored in Amazon S3 buckets and linked via "http://bucket.domain.com/images/*.jpg" inside the markup. My question is: can Varnish still help me speed up my web server?
I am trying to knock down all the stumbling blocks here. Is there something else I should be looking at? I have built a query profiler for my code, and each page renders in about 0.120 CPU seconds, which seems quick enough, but when I use ab -c 5 -n 100 http://mysite.com/ the results are only Requests per second: 12.70 [#/sec] (mean).
I realize that there are lots of variables at play, but I am looking for some guidance on things I can do and thought Varnish might be the answer.
UPDATE
here is a screenshot of my profiler
The only way to improve your performance is to measure what is slowing you down. Though it's not the best profiler in the world, Django has good integration with the hotshot profiler (described here), and with it you can figure out what is taking those 0.120 CPU seconds.
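If you want something quick without following the linked article, here is a rough sketch of the same idea as a middleware, using cProfile instead of hotshot. The class name and the "prof" query parameter are invented for illustration; add the class to MIDDLEWARE_CLASSES and append ?prof to a URL to see the profile.

    # Rough request-profiling sketch (old-style Django middleware API of this
    # era, Python 2). Profiles a single view call and dumps the stats into
    # the response body instead of the normal page.
    import cProfile
    import pstats
    import StringIO

    class ProfilingMiddleware(object):
        def process_view(self, request, view_func, view_args, view_kwargs):
            if 'prof' not in request.GET:
                return None  # fall through to normal view handling
            profiler = cProfile.Profile()
            response = profiler.runcall(view_func, request, *view_args, **view_kwargs)
            out = StringIO.StringIO()
            pstats.Stats(profiler, stream=out).sort_stats('cumulative').print_stats(30)
            response.content = '<pre>%s</pre>' % out.getvalue()
            return response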
Are you using 2 CPUs? If that's the case, then perhaps the limitation is in the DB when you use ab. I only say that because 0.120 * 12.70 is about 1.5 CPU-seconds per second, which on two CPUs leaves about 0.5 seconds per second waiting for something. That could also be IO or something similar.
Adding another layer for no reason, such as Varnish, is generally not a good idea. The only case where something like Varnish would help is if slow clients with poor connections are holding onto threads, but the ab test does not hit that condition, and frankly it's not a large enough issue to warrant the extra layer.
Now, the next topic is caching, which Varnish can help with. Are your pages customized for each user, or can they be static for long periods of time? Oftentimes pages are static except for a simple login status display; in that case, consider offloading the login status to JavaScript with cookies. If you are able to cache entire pages, then they will be extremely fast in ab. However, the next problem is that ab is not really a good benchmark of your site, since users aren't going to just sit at one page and hit F5 repeatedly.
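If your pages really can be served identically to every anonymous user, Django's per-view cache decorator is the simplest way to test that theory. A minimal sketch; the view name, response body and five-minute timeout are arbitrary:

    # Minimal per-view cache sketch.
    from django.http import HttpResponse
    from django.views.decorators.cache import cache_page

    @cache_page(60 * 5)
    def home(request):
        # Render as usual; the decorator stores the whole response in the
        # configured cache backend and re-serves it until the timeout expires.
        return HttpResponse('hello')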
A few things to think about before you go installing varnish:
First off, have you enabled the page caching middleware within Django?
Are etags set up and working properly?
Is your database server performing optimally?
Have you considered setting up caching using memcached within your code for common results? (particularly front pages and static pages displayed to non-logged-in users; see the configuration sketch after this list)
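A minimal settings sketch for the per-site cache middleware backed by memcached, using the older CACHE_BACKEND style from the Django versions of this era (newer releases use the CACHES dictionary instead); the host, timeout and key prefix are placeholders:

    # Per-site cache plus memcached, older-style Django settings.
    CACHE_BACKEND = 'memcached://127.0.0.1:11211/'
    CACHE_MIDDLEWARE_SECONDS = 300
    CACHE_MIDDLEWARE_KEY_PREFIX = 'mysite'

    MIDDLEWARE_CLASSES = (
        'django.middleware.cache.UpdateCacheMiddleware',     # must come first
        'django.middleware.common.CommonMiddleware',
        'django.middleware.cache.FetchFromCacheMiddleware',  # must come last
    )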
Except for query-heavy dynamic pages that absolutely must display substantially different data for each user, 0.12 seconds seems like a long time to serve a page. Look at how you can work caching into your app to improve that performance. If you have a page that is largely static other than a username or something similar, cache the computed part of the page, as sketched below.
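A minimal sketch of that fragment-caching idea using Django's low-level cache API; the cache key and the compute_expensive_sidebar() helper are hypothetical:

    # Cache an expensive, widely shared page fragment.
    from django.core.cache import cache

    def get_sidebar_html():
        html = cache.get('sidebar_html')
        if html is None:
            html = compute_expensive_sidebar()  # hypothetical slow queries/rendering
            cache.set('sidebar_html', html, 60 * 10)  # keep for ten minutes
        return html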
When Django's caching is working properly, ab on public pages should be lightning fast. If you aren't using any of the other features of Apache, consider using something lighter and faster like lighttpd or nginx.
Eric Florenzano has put together a pretty useful project called django-newcache that implements some advanced caching behavior. If you're running into the limitations of the built-in caching, consider it. http://github.com/ericflo/django-newcache
I'm planning to deploy a Django-powered site, but I feel confused about the choice of web servers, which includes Apache, lighttpd, nginx and others.
I've read some articles about the performance of each of these choices, but it seems no one agrees. So I'm wondering: why not test the performance myself?
I can't find information about the best approach to performance testing web servers. So my questions are:
Is there any easy approach to test the performance without the production site?
Or can I have a method to simulate the heavy traffic to have a fair test?
How can I keep my test fair and close to production situation?
After the test, I want to figure out:
Why some people say nginx performs better when serving static files.
The CPU and memory needs of each web server.
My best choice.
Tools like ab are commonly used to test how much load you can handle from a battering of requests at once; alongside cacti/munin/your system monitoring tool of choice, you can generate data on system load and requests/sec. The problem is that many people benchmarking don't realise that they need to make many different kinds of requests, since different parts of your code take varying amounts of time to execute. Profiling and benchmarking your code, and not just the requests, is also important; plenty of folk have already done this for Django, and benchrun is not a bad tool either.
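One cheap way to benchmark the code rather than the requests is to drive a view through Django's test client in-process, which takes the web server and the network out of the measurement. A sketch only; the '/' URL and the repetition count are arbitrary, and it assumes DJANGO_SETTINGS_MODULE points at your settings:

    # Time a single view in-process with Django's test client.
    import timeit

    from django.test.client import Client

    def hit_homepage():
        Client().get('/')

    if __name__ == '__main__':
        total = timeit.timeit(hit_homepage, number=100)
        print('average per request: %.4f seconds' % (total / 100))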
The other issue is how many HTTP requests each page view takes. The fewer requests there are, and the quicker they can be processed, the more traffic a website can sustain: the quicker you can finish and close connections, the quicker you can allocate resources to new ones.
In terms of the general speed of web servers, it goes without saying that a proxy server (running as a reverse proxy at your end) will always serve static content faster than the web server itself. As for Apache vs nginx with regard to your Django app, it seems that mod_python is indeed faster than nginx/lighty + FastCGI, but that's no surprise, because CGI, regardless of any speed-ups, is still slow. Executing and caching code at the web server and letting it manage that is always faster (mod_perl vs CGI, mod_php vs CGI, etc.) if you do it right.
Apache JMeter is an excellent tool for stress-testing web applications. It can be used with any web server, not just Apache.
You need to set up the web server + website of your choice on a machine somewhere, preferably a physical machine with similar hardware specs to the one you will eventually be deploying to.
You then need to use a load testing framework, for example The Grinder (free), to simulate many users using your site at the same time.
The load testing framework should be on separate machine(s) and you should monitor the network and CPU usage of those machines as well to make sure that the limiting factor of your testing is in fact the web server and not your load injectors.
Other than that, it's just about altering the content and monitoring response times, throughput, memory and CPU use, etc., to see how they change depending on which web server you use and what sort of content you are hosting.