Web page load time tracking sites

I want to track and analyze web page load times on my users' systems.
I ran across this article http://www.panalysis.com/tracking-webpage-load-times.php that uses Google Analytics to track pages, but it's too coarse for my needs.
Are there any sites out there that specifically let you track page load times using a JavaScript snippet you embed in your web pages?
Ideally the snippet would look like this:
var startTime = new Date();
// code to load the tracker
window.onload = function() {
    loadTimeTracker.sendData(<customer id>, document.location.pathname, new Date() - startTime);
};

Are you looking to send data back to a server? If not, there are lots of tools to track this sort of thing. I know the Firebug extension for Firefox does.
If you are looking to send the data back to a server, doing it purely with JavaScript will have some drawbacks, because it won't include page rendering times, only the time from when the first line of JavaScript was received until the page finished loading, which may skew your data.

Jiffy is pretty nice and open source.

Gomez has a service that tracks how long your website takes to load. It doesn't use any JavaScript as far as I know.
Another good resource is http://webpagetest.org/. It allows you to test the load time manually, but offers a lot of analysis of your page. Latency, time to first byte, assets, DNS lookups, etc. Great resource.

Enable trace on your page; that might be a good starting point for analyzing how much time your page takes to load.

Related

Sitecore performance enhancements

We need our Sitecore web application to process 60-80 web requests per second. We are using Sitecore 7.0. We have tried a 1 web server + 1 database server deployment, but it only processes 20-25 requests per second; the web server queues up all the other requests in memory, and as we increase the load, memory fills up. (We did all of the recommended Sitecore performance enhancements.) We need 4X performance to reach the goal :).
Will it be possible to achieve this goal by upgrading the existing server, or do we have to add more web servers in the production environment?
Note: We are using Lucene indexing as well.
Here are some things you can consider without changing the overall architecture of your deployment:
CDN to offload media and static asset requests
This leaves your content delivery server available to handle important content queries and display logic.
Example: www.cloudflare.com
Configure and use Sitecore's built-in caching
This is from the guide:
Investigation and configuration of the Sitecore Caches is broken down into multiple tasks. This way each task is more focused and simplified. The focus is on configuration and tuning of the Sitecore Database Caches (prefetch, data, and item caches). For configuration of the output rendering caching properties, the customer should be made aware of both the Sitecore Cache Configuration Reference and the Sitecore Presentation Component Reference, which explain how to properly enable these caches and set the properties that expire them.
Check out the Sitecore Tuning Guide
Find Slow Queries or Controls
It sounds like your application follows Sitecore best practices, but I leave this note in for anyone that might find this answer. Use Sitecore's built-in Debug mode to identify the slowest running controls and sublayouts. Additionally, if you have Analytics set up there is a "Slow Pages" report that might give you some information on where your application is slowing down.
Those things being said, if you're prepared to provision additional servers and set up a load-balanced environment then read on.
Separate Content Delivery and Content Management
To me the first logical step before load-balancing content delivery servers is to separate the content management from the equation. This is pretty easy and the Scaling Guide walks you through getting the HistoryEngine set up to keep those Lucene indexes up to date.
Set up Load Balancer with 2 or more Content Delivery servers
Once you've done the first step this can be as easy as cloning your content delivery server and adding it to your load balancer "pool". There are a couple of things to consider here. Does your web application allow users to log in? If so, you'll need to worry about sticky sessions or machine keys. Does your web application use file media instead of blob media? I haven't had to deal with this, but I understand that's another consideration.
Scale your SQL solution
I've seen applications with up to four load balanced content delivery servers and the SQL Server did not have a problem - I think this will be unique to each case depending on a lot of factors: horsepower and tuning of SQL Server, content model of your application, complexity of your queries, caching configuration on content delivery servers, etc. Again, the Scaling Guide covers SQL Mirroring and Failover, so that is going to be your first stop on getting that going.
Finally, I would say contact Sitecore. These guys have probably seen more of what's gone right and what's gone wrong with installations and could get you on the right path. Good luck!
This answer written from a Sitecore developer perspective:
Bottom line: You need to figure out exactly where your performance bottleneck is. That is going to take some digging, but will be very worthwhile. You should definitely be able to serve 60-80 requests/s without any trouble... but of course that makes a lot of assumptions about the nature of your site and the requests.
For my site, I found Sitecore's caching implementation to be sub-par... I created some very simple and aggressive application-specific caches in my app and this made all the difference in the world. For instance, we have 900+ "Partner" items where our sites' advertisements live... and simply putting all these objects in an array in the Application object sped up page requests significantly. Finding an object in a Hashtable indexed by its Item.Name or ID is going to be a lot faster than Sitecore.Context.Database.GetItem("/itempath") or a SelectItems() call (at least, that's my experience). If your architecture and data set will allow this strategy, we've had good experience with it.
Another thing to watch out for is XSLT renderings. Personally, I avoid them completely in favor of ASP.NET UserControls. The XSLT rendering is just slow. As much as 10x slower than a native UserControl rendering the same HTML. So if you have a few of these... replace with some custom code and you'll see a world of difference.

Using JSON file to improve caching - good idea?

I'm a beginner to caching. I'm currently working on a small project with Django and will be implementing caching later via memcached.
I have a page with a video on it and the video has a bunch of comments. The only content on the page that is likely to change regularly is the comments and the "You are logged in as.../You are not logged in..." message.
I was thinking I could create a JSON file that serves the username and most recent comments, including it in the head with <script src="videojson.js"></script>. That way I could populate the HTML via JavaScript instead of caching the whole page on a per-user basis.
Is this a suitable approach, or is the caching system smarter than I give it credit for?
How is the JavaScript going to get the JSON object? Are you going to serve it from a Django view that the JavaScript calls? And in that view you will just pull it out of memcached if available, and from the DB if not?
That seems reasonable, assuming your JSON isn't very big. If your comments change a lot and you have to spend a lot of time querying the DB, building the JSON object, and saving it to memcached every time a new comment is written, it won't work well. But if you only fill the cache when your JSON expires, and you don't care about having the latest and greatest comments on there instantly, it should work.
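A minimal sketch of that pattern, assuming a current Django release; the Comment model, field names, and cache key format here are my own assumptions, not anything from your project:

# Hedged sketch: serve the comments as JSON, reading from the cache when
# possible and falling back to the database on a miss.
from django.core.cache import cache
from django.http import JsonResponse

from myapp.models import Comment  # hypothetical app and model


def video_comments_json(request, video_id):
    cache_key = "video-comments-%s" % video_id
    payload = cache.get(cache_key)
    if payload is None:
        # Cache miss: query the database and rebuild the comment payload
        comments = (Comment.objects.filter(video_id=video_id)
                    .order_by("-created")[:20])
        payload = {
            "comments": [{"author": c.author.username, "text": c.text}
                         for c in comments],
        }
        cache.set(cache_key, payload, 60)  # keep for 60 seconds
    # The username is per-request, so it is added outside the cached payload
    payload["username"] = (request.user.username
                           if request.user.is_authenticated else None)
    return JsonResponse(payload)

The cached part (comments) is shared by everyone, and only the tiny per-user bit (the username) is computed per request, which is exactly the split you described.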
One thing to point out is that if you aren't getting that much traffic now, you might be adding a level of complexity that won't give you much return on your time spent. But if you are using this to learn how to do caching then it is a good exercise.
Hope that helps

Django Performance

I am using Django with Apache and mod_wsgi. My site has dynamic data on every page, and all of the media (CSS, images, JS) is stored on Amazon S3 buckets linked via "http://bucket.domain.com/images/*.jpg" inside the markup. My question is: can Varnish still help me speed up my web server?
I am trying to knock down all the stumbling blocks here. Is there something else I should be looking at? I have run a query profiler on my code and each page renders in about 0.120 CPU seconds, which seems quick enough, but when I use ab -c 5 -n 100 http://mysite.com/ the results are only Requests per second: 12.70 [#/sec] (mean).
I realize that there are lots of variables at play, but I am looking for some guidance on things I can do and thought Varnish might be the answer.
UPDATE
Here is a screenshot of my profiler:
The only way you can improve your performance is if you measure what is slowing you down. Though it's not the best profiler in the world, Django has good integration with the hotshot profiler (described here) and you can figure out what is taking those 0.120 cpu seconds.
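If wiring up hotshot is awkward, here is a rough alternative sketch using the standard-library cProfile module; the view function and its arguments are placeholders for whatever you actually want to measure:

# Rough profiling sketch; call profile_view(my_view, request) from a test
# or a temporary URL handler to see where the CPU time goes.
import cProfile
import io
import pstats

def profile_view(view_func, *args, **kwargs):
    profiler = cProfile.Profile()
    response = profiler.runcall(view_func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
    stats.print_stats(20)  # top 20 entries by cumulative time
    print(stream.getvalue())
    return response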
Are you using 2 CPUs? If that's the case then perhaps the limitation is in the DB when you use ab. I only say that because 0.120 * 12.70 is roughly 1.5 CPU-seconds of work per wall-clock second, which on 2 CPUs leaves about 0.5 seconds of capacity spent waiting for something. This could also be IO or something else.
Adding another layer for no reason, such as Varnish, is generally not a good idea. The only case where something like Varnish would help is if you have slow clients with poor connections holding onto threads, but the ab test is not hitting this condition, and frankly it's not a large enough issue to warrant the extra layer.
Now, the next topic is caching, which Varnish can help with. Are your pages customized for each user, or can they be static for long periods of time? Often pages are static except for a simple login status screen -- in this case consider offloading that login status to JavaScript with cookies. If you are able to cache entire pages then they would be extremely fast in ab. However, the next problem is that ab is not really a good benchmark of your site, since users aren't going to just sit at one page and hit F5 repeatedly.
A few things to think about before you go installing varnish:
First off, have you enabled the page caching middleware within Django?
Are etags set up and working properly?
Is your database server performing optimally?
Have you considered setting up caching using memcached within your code for common results? (particularly front pages and static pages displayed to non-logged-in users)
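For reference, a minimal sketch of what the per-site cache middleware and a memcached-backed helper can look like in a current Django release; the key names, timeouts, and the expensive_query helper are made up for illustration:

# settings.py (sketch): memcached backend plus the per-site cache middleware
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.memcached.PyMemcacheCache",
        "LOCATION": "127.0.0.1:11211",
    }
}
MIDDLEWARE = [
    "django.middleware.cache.UpdateCacheMiddleware",     # must come first
    "django.middleware.common.CommonMiddleware",
    "django.middleware.cache.FetchFromCacheMiddleware",  # must come last
]
CACHE_MIDDLEWARE_SECONDS = 300

# elsewhere (sketch): cache a common, expensive result for anonymous pages
from django.core.cache import cache

def front_page_items():
    items = cache.get("front-page-items")
    if items is None:
        items = expensive_query()        # hypothetical slow DB query
        cache.set("front-page-items", items, 300)
    return items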
Except for query-heavy dynamic pages that absolutely must display substantially different data for each user, 0.12 seconds seems like a long time to serve a page. Look at how you can work caching into your app to improve that performance. If you have a page that is largely static other than a username or something similar, cache the computed part of the page.
When Django's caching is working properly, ab on public pages should be lightning fast. If you aren't using any of the other features of Apache, consider using something lighter and faster like lighttpd or nginx.
Eric Florenzano has put together a pretty useful project called django-newcache that implements some advanced caching behavior. If you're running into the limitations of the built-in caching, consider it. http://github.com/ericflo/django-newcache

Accessing World of Warcraft data from the web

I'm aware of the WoW add-on programming community, but what I can't find any documentation on is an API for accessing WoW's databases from the web. I see third-party sites like WoWHeroes.com and Wowhead use game data (item and character databases), so I know it's possible. But I can't figure out where to begin. Is there a web service I can use, or are they doing some sort of under-the-hood work that requires running the WoW client in their server environment?
Sites like Wowhead and WoWHeroes use client-run add-ons from players which collect data. The data is then posted to their website. There is no way to access WoW's database directly. Your best bet is to hit the armory and extract the XML returned from your searches; the armory pages are just an XSL transform applied to the XML data returned.
Blizzard has recently (8/15/2011) published draft documentation for their RESTful APIs at the following location:
http://blizzard.github.com/api-wow-docs/
The APIs cover information about characters, items, auctions, guilds, PVP, etc.
Requests to the API are currently throttled to 3,000 per day for anonymous usage, but there is a process for registering applications that have a legitimate need for more access.
Update (January 2019): The new Blizzard Battle.net Developer Portal is here:
https://develop.battle.net/
Request throttling limits and authentication rules have changed.
Characters can be mined from the armory; the pages are XML.
Items are mined from the local installation's game files; that's how Wowhead does it, at least.
It's actually really easy to get item data from the WoW Armory!
For example:
http://www.wowarmory.com/item-info.xml?i=33135
View the source of the page (not via Google Chrome, which displays transformed XML via XSLT) and you'll see the XML data!
You can use search listing pages to retrieve all blue gems, for example, then use an XML parser to retrieve the data.
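For illustration only, here is a rough sketch of pulling that item XML apart. The wowarmory.com endpoint has long since been retired, and it was reportedly picky about the User-Agent header, hence the spoofed browser header; no specific element names are assumed:

# Sketch: fetch the item XML shown above and print any element that
# carries a "name" attribute. The URL no longer works today.
import urllib.request
import xml.etree.ElementTree as ET

url = "http://www.wowarmory.com/item-info.xml?i=33135"
req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
with urllib.request.urlopen(req) as resp:
    tree = ET.parse(resp)

for elem in tree.iter():
    if "name" in elem.attrib:
        print(elem.tag, elem.attrib["name"])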
They are parsing the Armory information from www.wowarmory.com. There is no official Blizzard API for accessing it, but there is an open source PHP solution available (http://phparmory.sourceforge.net/)
Maybe a little late to the party, but for future reference check out the WoW API Documentation at http://blizzard.github.com/api-wow-docs/
Scraping HTML and XML is now pretty much obsolete and also discouraged by Blizzard.
Sites like those actually get the data from the Armory. If you pull up any item, guild, character, etc. and do 'View Source' on the page you will see the XML data coming back. Here is a quick C# example of how to get the data.
These third-party sites collect data from players. I think the collection is based on WoW add-ons, or each player submits information manually.
The next option is scraping the WoW site and parsing the information from the web pages (HTML).
This is probably the wrong site for your question, but you're thinking of the WoW Armory XML stuff. There is no official WoW API. People just do HTTP requests and get the XML to do number-crunching stuff. Try googling around; there are some libs out there in different languages that are already written for you. I know there are implementations in PHP and Ruby. I was working on one in .NET a while back until I got distracted. Here's an article which kind of sums this all up.
http://www.wow.com/2008/02/11/mashing-up-wow-data-when-we-can-get-it-in-outside-applications/
Wowhead and other sites generally rely on data gathered by users with a wow add-in.
Wowhead also has a way for other sites to reference that data in hover pop-ups, so their content gets reused on a number of sites.
Powered by Wowhead
For actual in-game data collection:
cosmos.exe is what Thottbot, for example, uses. It probably uses some form of Windows hack (DLL injection or something) or sniffs packets to determine what items have dropped and so on (it intercepts traffic from the WoW server to your client and decodes it). It saves this data on the user's computer and then uploads it to a web server for storage. I don't know if any development libraries were created for this sort of thing.

Speeding up page loads while using external API's

I'm building a site with Django that lets users move content around between a bunch of photo services. As you can imagine, the application does a lot of API hits.
For example: a user connects Picasa, Flickr, Photobucket, and Facebook to their account. Now we need to pull content from four different APIs to keep this user's data up to date.
Right now I have a function that updates each API, and I run them all simultaneously via threading; a rough sketch follows below. (All the APIs that are not enabled return False on the second line, so it's not much overhead to run them all.)
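Roughly, that fan-out looks like this; the update_* functions here are just placeholder stand-ins for the real per-service updaters:

# Sketch: run every per-service update in its own thread and wait for all.
import threading

# Placeholders standing in for the real per-API update functions.
def update_picasa(user): pass
def update_flickr(user): pass
def update_photobucket(user): pass
def update_facebook(user): pass

def update_all_services(user):
    updaters = [update_picasa, update_flickr, update_photobucket, update_facebook]
    threads = [threading.Thread(target=fn, args=(user,)) for fn in updaters]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # block until every service update has finished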
Here is my question:
What is the best strategy for keeping content up to date using these APIs?
I have two ideas that might work:
Update the APIs periodically (like a cron job), and whatever we have at the time is what the user gets.
benefits:
It's easy and simple to implement.
We'll always have pretty good data when a user loads their first page.
pitfalls:
we have to do API hits all the time for users that are not active, which wastes a lot of bandwidth
It will probably make the API providers unhappy
Trigger the updates when the user logs in (on a page load)
benefits:
we save a bunch of bandwidth and run less risk of pissing off the API providers
doesn't require NEARLY the amount of resources on our servers
pitfalls:
we either have to do the update asynchronously (and won't have anything on first login), or...
the first page will take a very long time to load because we're getting all the API data (I've measured 26 seconds this way)
Edit: the design is very light; it has only two images, an external CSS file, and two external JavaScript files.
Also, the 26-second number comes from the Firebug network monitor running on a machine that was on the same LAN as the server.
Personally, I would opt for the second method you mention. The first time you log in, you can query each of the services asynchronously, showing the user some kind of activity/status bar while the processes are running. You can then populate the page as you get the results back from each of the services.
You can then cache the results of those calls per user so that you don't have to call the APIs each time.
That lightens the load on your servers, loads your page fast, and provides the user with some indication of activity (along with incremental updates to the page as their content loads). I think those add up to the best user experience you can provide.
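A rough sketch of that combination, where login kicks off a background refresh and page views just read whatever is already cached; fetch_all_services is a hypothetical helper, and the key format and timeout are arbitrary:

# Sketch: refresh a user's aggregated photo-service data in the background
# and serve the cached copy in the meantime.
import threading
from django.core.cache import cache

def refresh_user_data(user):
    data = fetch_all_services(user)  # hypothetical: hits every external API
    cache.set("photo-data-%s" % user.id, data, 60 * 60)  # keep for an hour

def on_login(user):
    # Kick off the refresh without blocking the login request
    threading.Thread(target=refresh_user_data, args=(user,), daemon=True).start()

def cached_user_data(user):
    # Page views read the last refreshed copy; None means "still loading"
    return cache.get("photo-data-%s" % user.id)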