Can a CDN be beneficial over a DEDICATED SERVER when there seems to be no CDN in our region? - amazon-web-services

I am considering trying out Amazon's CloudFront CDN, which uses their S3 service for file storage and serves data from the edge server closest to the browser. However, we have a dedicated server in South Africa (Johannesburg, to be exact), so my question is this:
Amazon's CloudFront seems to give you the option to have your base server in the EU, America, or Japan, but nowhere near South Africa - I guess the EU is the closest? So, will I still benefit from using the CDN to serve static files (CSS, images, JavaScript - when not using Google's AJAX API - and media files) rather than serving them from the same dedicated server? Bear in mind that although we have a dedicated server, it is STILL a shared hosting environment, as we host multiple clients' websites on the server.
Secondly, if I DO use a CDN like Amazon's CloudFront, can I still benefit from caching my content and using far-future Expires headers, compression, etc.?
Many thanks

Well, the CDN would take load off your main dedicated server. This is quite advantageous, because it means that you can downgrade your server to something a little cheaper.
It also provides many, many more levels of redundancy than a small-to-medium business could ever hope to achieve (especially with only a single server).
The fact that a CDN has mirrors all over the world is only one feature of such a network. It's a good one, but it is by no means the only one.
Really, you have to think about what you are hoping to achieve by using the CDN, and what you are using it for.
If your application relies on response time and nothing else matters, then it might be worth keeping everything on your dedicated server because of its proximity to your users. That said, applications that require quick response times often only require them for a small subsection of their overall functionality, so you could offload just the non-time-critical elements to the CDN and still get the best of both worlds.
In short, CDNs have many functions. They save businesses time and money, and provide infrastructure that is otherwise unattainable for the large majority of developers and administrators.

I don't think it matters where your server is. You're still uploading stuff to Amazon (although, yes, you'd probably want to use the EU servers so uploads are quicker from South Africa), but the idea of a CDN is that it has endpoints around the world, so you don't have to worry about having servers in those locations.
In short, yes: a CDN such as Amazon's CloudFront is what you want for serving static content quickly.

The main idea is that the closer a server is to the user (and the better the route to it), the better the download speed. A CDN is a network of multiple, geographically spread servers. This means that, unlike with a single dedicated server, static content is served from the closest point of presence, which allows better speeds in different regions. And yes - you can use caching headers with a CDN.
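To make that last point concrete: the far-future headers are typically set on the objects themselves when you upload them to S3, and CloudFront then passes them through to browsers. A minimal sketch, assuming the boto3 library and a hypothetical bucket and file:

```python
# Sketch: upload a static asset to S3 with far-future caching headers,
# assuming the boto3 library and configured AWS credentials; CloudFront
# passes these headers through to the browser. Bucket/key names are
# placeholders.
import mimetypes
from datetime import datetime, timedelta

import boto3

s3 = boto3.client("s3")

def upload_static(path, bucket, key):
    content_type = mimetypes.guess_type(path)[0] or "application/octet-stream"
    s3.upload_file(
        path,
        bucket,
        key,
        ExtraArgs={
            "ContentType": content_type,
            "CacheControl": "public, max-age=31536000",          # one year
            "Expires": datetime.utcnow() + timedelta(days=365),  # far-future Expires
        },
    )

upload_static("static/css/site.css", "my-static-bucket", "css/site.css")
```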

Related

AWS S3 + CDN Speed, significantly slower

Ok,
So I've been playing around with Amazon Web Services, as I am trying to speed up my website and save resources on my server by using AWS S3 & CloudFront.
I ran a page speed test initially, and the page loaded in 1.89ms. I then put all assets into an S3 bucket, made that bucket available to CloudFront, and then used the CloudFront URL on the page.
When I ran the page speed test again using this tool, with all their server options, I got these results:
Server with lowest speed: 3.66ms
Server with highest speed: 5.41ms
As you can see, there is a big increase in load time here. Have I missed something, is it configured wrong? I thought the CDN was supposed to make the page load faster.
Have I missed something, is it configured wrong? I thought the CDN was supposed to make the page load faster.
Maybe, in answer to both of those statements.
CDNs provide two things:
Ability to scale horizontally as request volume increases by keeping copies of objects
Ability to improve delivery experience for users who are 'far away' (either in network or geographic terms) from the origin, either by having content closer to the user, or having a better path to the origin
By introducing a CDN, there are (again two) things you need to keep in mind:
Firstly, the CDN generally holds a copy of your content. If the CDN is "cold", there is unlikely to be any acceleration, especially when the test user is close to the origin.
Secondly, you're changing the infrastructure to add an additional 'hop' into the route. If the cache is cold, the origin isn't busy, and you're already close to it, you'll almost always see an increase in latency, not a decrease.
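You can see the cold-cache effect for yourself by timing a first request against a repeat request. A rough sketch using only the Python standard library; the CloudFront URL is a placeholder:

```python
# Sketch: compare a first (likely cold-cache) request against a repeat
# request through the CDN, standard library only. The URL is a
# placeholder for a real CloudFront distribution.
import time
import urllib.request

URL = "https://dxxxxxxxxxxxx.cloudfront.net/css/site.css"  # placeholder

def fetch_ms(url):
    start = time.perf_counter()
    with urllib.request.urlopen(url) as resp:
        resp.read()
    return (time.perf_counter() - start) * 1000

first = fetch_ms(URL)   # may miss the edge cache and go back to the origin
second = fetch_ms(URL)  # should come from the edge if the object is cacheable
print(f"first: {first:.1f} ms, repeat: {second:.1f} ms")
```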
Whether a CDN is right for you or not depends on the type of traffic you're seeing.
If I'm a distributor based in the US with customers all over the world, then using a CDN can help me improve performance for those users, even for dynamic, completely uncacheable content.
If I'm a distributor based in Northern Virginia with customers only in Northern Virginia, and I only see one request an hour, then I'm likely to have no visible performance improvement - because the cache is likely not to be kept populated, the network path isn't preferable, and I don't have to deal with scale.
Generally, yes, a CDN is faster. But 1.89ms is scorchingly fast; you probably won't beat that, certainly not under any kind of load.
Don't over-optimize here. Whatever you're doing, you have bigger fish to fry than 1.77ms in load time based on only three samples and no load testing.
I've seen the same thing. I initially moved some low-traffic static sites to S3/CloudFront to improve performance, but found even a small Linux EC2 instance running nginx would give better response times in my use cases.
For a high-traffic, geographically dispersed set of clients, S3/CloudFront would probably outperform it.
BTW: I suspect you don't mean 1.89ms, but instead 1.89 seconds, correct?

How to prepare Django for a possible slashdotting?

I would like to prepare my website for a possible influx in traffic. This is my first time using Django as a framework, so I'm unsure of the modifications that should be made to ensure that I'm ready and won't go down. What are some of the common things one can do to prepare a Django website for production-level traffic?
I'm also wondering what to expect in terms of traffic numbers. I'm currently hosted at Webfaction with 600GB/month of traffic. Will this quickly run out? Are there statistics on how big 'slashdotted' events are?
Use memcache and caching middleware (see the settings sketch after the links below).
Be sure to offload serving statics.
Use a CDN for statics. This doesn't directly affect Django, but will reduce your network traffic.
Beyond that, read up on what others are using:
Scaling Django Web Apps By Mike Malone
Instagram Architecture
DISQUS Architecture
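For the first point above, a minimal sketch of the relevant settings.py entries, assuming memcached running on localhost (the backend path shown is from recent Django releases; older versions spell it differently):

```python
# Sketch: site-wide caching in settings.py, assuming memcached running
# on localhost. The backend path is from recent Django releases; older
# versions spell it differently. Middleware order matters.
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.memcached.PyMemcacheCache",
        "LOCATION": "127.0.0.1:11211",
    }
}

MIDDLEWARE = [
    "django.middleware.cache.UpdateCacheMiddleware",     # must come first
    "django.middleware.common.CommonMiddleware",
    "django.middleware.cache.FetchFromCacheMiddleware",  # must come last
]

CACHE_MIDDLEWARE_SECONDS = 300  # cache whole pages for five minutes
```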
Since you are at Webfaction you have an easy answer for handling your statics:
Create a Static-only application. (Not the Static CGI/PHP app)
Add it under your current website.
Put all of your statics under it (or symlink to them, which is what I do).
This will serve all statics through their nginx frontend -- blindingly fast.
Regarding your bandwidth allocation:
You don't say what type of content you are offering. If it is anything even slightly vanilla, you are unlikely to approach 600GB/mo. I have one customer who offers adult-oriented videos teaching tantric sex techniques, and their video bandwidth (for both free & member-only videos) is about 400-450GB/mo. The HTML portion of the site (with tons of images) runs about 50-60GB/mo.

What configurations need to be set for a LAMP server for heavy traffic?

I was contracted to make a Groupon-clone website for my client. It was done in PHP with MySQL, and I plan to host it on an Amazon EC2 server. My client warned me that he will be email blasting about 10k customers, so my site needs to be able to handle the surge of clicks from those emails. I have two questions:
1) Which Amazon server instance should I choose? Right now I am on a Small instance; I wonder if I should upgrade to a Large instance for the week of the email blast.
2) What are the configurations that need to be set for a LAMP server? For example, do the Amazon server, Apache, PHP, or MySQL have maximum-connection limits that I should adjust?
Thanks
Technically, putting the static pages, the PHP, and the DB on the same instance isn't the best route to take if you want a highly scalable system. That said, if the budget is low and high availability isn't a concern, then you may get away with it in practice.
One option, as you say, is to re-launch the server on a larger instance size for the period you expect heavy traffic. Often this works well enough. Your problem is that you don't know the exact shape of the traffic that will come. A certain percentage will be at their computers when the email arrives and will go straight to the site; the rest will trickle in over time. Having your client send the email while the majority of the users are in bed would help you somewhat, if that's possible, by avoiding the surge.
If we take the case of, say, 2,000 users hitting your site in 10 minutes, I doubt a site that hasn't been optimised would cope; there's very likely to be a silly bottleneck in there. The DB is often the problem; a good-sized in-memory cache often helps.
All this said, there are a number of architectural designs and features provided by the likes of Amazon and GAE that, with a correctly designed back-end, let you worry very little about scalability; it is handled for you for the most part.
If you split the database away from the web server, you would be able to put the web server instances behind an elastic load balancer and have that scale instances by demand. There also exist standard patterns for scaling databases, though there isn't any particular feature to help you with that, apart from database instances.
You might want to try Amazon Mechanical Turk, which is basically lots of people who'll perform often-trivial tasks (like navigating to a web page, clicking on this, etc.) for a usually very small fee. It's not a bad way to simulate real traffic.
That said, you'd probably have to repeat this several times, so you're better off with a load testing tool. And remember, you can't load test a time-slicing instance with another time-slicing instance...
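If you want a crude surge simulation before reaching for a full load-testing tool, something like this sketch works; it just fires concurrent requests from one machine, and the URL and counts are placeholders (run it from a machine other than the one under test):

```python
# Sketch: crude surge simulation with concurrent requests.
# The URL and counts are placeholders; run from a machine other
# than the one being tested.
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

URL = "http://example.com/deal-page"  # placeholder
CONCURRENCY = 50
REQUESTS = 500

def hit(_):
    """Fetch URL once; return elapsed seconds, or None on failure."""
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(URL, timeout=10) as resp:
            resp.read()
        return time.perf_counter() - start
    except Exception:
        return None

with ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
    results = list(pool.map(hit, range(REQUESTS)))

ok = [r for r in results if r is not None]
if ok:
    print(f"{len(ok)}/{REQUESTS} succeeded, average {sum(ok) / len(ok):.2f}s")
else:
    print("all requests failed")
```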

Migrate hosted LAMP site to AWS

Is there an easy way to migrate a hosted LAMP site to Amazon Web Services? I have hobby sites and sites for family members where we're spending far too much per month compared to what we would be paying on AWS.
Typical el cheapo example of what I'd like to move over to AWS:
GoDaddy domain
site hosted at 1&1 or MochaHost
a handful of PHP files within a certain directory structure
a small MySQL database
.htaccess file for URL rewriting and the like
The tutorials I've found online necessitate PuTTY, Linux commands, etc. While these aren't the most cumbersome hurdles imaginable, it seems overly complicated. What's the easiest way to do this?
The ideal solution would be something like what you do to set up a web host: point GoDaddy to it, upload files, import database, done. (Bonus points for phpMyAdmin being already installed but certainly not necessary.)
It would seem the Amazon AWS Marketplace now has a solution for your problem:
https://aws.amazon.com/marketplace/pp/B0078UIFF2/ref=gtw_msl_title/182-2227858-3810327?ie=UTF8&pf_rd_r=1RMV12H8SJEKSDPC569Y&pf_rd_m=A33KC2ESLMUT5Y&pf_rd_t=101&pf_rd_i=awsmp-gateway-1&pf_rd_p=1362852262&pf_rd_s=right-3
Or from their own site
http://www.turnkeylinux.org/lampstack
A full LAMP stack including PHPMyAdmin with no setup required.
As for your site and database migration itself (which should require no more than file copies and a database backup/restore) the only way to make this less cumbersome is to have someone else do it for you...
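For the database half of that, the backup/restore is just a dump and re-import. A minimal sketch driving the standard MySQL tools from Python; every name, host, and password here is a placeholder:

```python
# Sketch: dump a MySQL database on the old host, restore it on the new
# one. Assumes the mysqldump/mysql client tools are installed; every
# name, host, and password below is a placeholder.
import subprocess

DB = "mysite_db"
DUMP_FILE = "mysite_db.sql"

# Dump from the current host
with open(DUMP_FILE, "w") as out:
    subprocess.run(
        ["mysqldump", "-u", "olduser", "-pOLD_PASSWORD", DB],
        stdout=out,
        check=True,  # raise if mysqldump fails
    )

# Restore into the new host (an EC2 instance or an RDS endpoint)
with open(DUMP_FILE) as dump:
    subprocess.run(
        ["mysql", "-h", "new-db.example.com", "-u", "newuser", "-pNEW_PASSWORD", DB],
        stdin=dump,
        check=True,
    )
```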
Dinah,
Working at a web development company, I've experienced an unreal number of hosting companies. I've also been very closely involved with investigating cloud hosting solutions for sites on the LAMP and Windows stacks.
You've quoted GoDaddy, 1&1, and MochaHost for micro-sized Linux sites, so I'm guessing you're using a benchmark of $2 - $4 per month, per site. It sounds like you have a "few" sites (5ish?) and need at least one database.
I've yet to see any tool that will move more than the most basic (i.e. file only, no db) websites into Cloud hosting. As most people are suggesting, there isn't much you can do to avoid the initial environment setup. (You should factor your time in too. If you spend 10 hours doing this, you could bill clients 10 x $hourly-rate and have just bought the hosting for your friends and family.)
When you look at AWS (or anyone) remember these things:
Compute cycles are only where it starts. When you buy hosting from traditional ISPs, they are selling you cycles, disk space AND database hosting. Their default allowances for cycles, database size, and traffic are also typically much higher before you are stopped or charged for "overage", or over-usage.
Factor in the cost of your 1 database, and consider how likely it will be that you need more. The database hosting charges can increase Cloud costs very quickly.
While you are likely to need few CCs (compute cycles) for your basic sites, the free-tier maximums are still pretty low. Anticipate breaking past the free tier and being charged monthly.
Disk space is also billed. Factor in your costs of CCs, DB, and HDD by using their pricing estimator: http://calculator.s3.amazonaws.com/calc5.html
If your friends and family want to have access to the system, they won't get it unless you use a hosting company that allows "white labeling" and provides a way to split your main account into smaller mini-hosting accounts. These can even be set up to give self-admin and direct billing options if you went with a host like www.rackspace.com. The problem is you don't sound like you want to bill anyone, and their minimum account is likely way too big for your needs.
Remember that GoDaddy (and others) frequently give away a year of hosting with even simple domain registrations. Before I got my own servers I used to take HUGE advantage of these. I've probably been given 40+ free hosting accounts, etc. in my lifetime as a client. (I still register a ton of domains through them. I also resell their hosting.)
If you aren't already, consider the use of CMS systems that support portaling (one instance, many websites under different domains). While I personally prefer DotNetNuke I'm sure that one of its LAMP stack competitors can do the same for you. This will keep you using only one database and simplify your needs further.
I hope this helps you make a well-educated choice. I think it'll be a fine line between benefits and costs. Only knowing the exact size of every site, every database, and the typical traffic would allow this to be determined in advance. Database count and traffic will be your main "enemies". Optimize files to reduce both disk-space needs AND your traffic levels in terms of data transferred.
Best of luck.
Actually, it depends upon your server architecture whether you want to migrate the whole of your LAMP stack to Amazon EC2, or use different Amazon Web Services for different server components, like Amazon S3 for storage and Amazon RDS for the MySQL database, and so on.
If you are going with LAMP on EC2: this tutorial will at least give you a head start.
Either way, you still have to go through the essential steps of setting up the AMI and installing LAMP over SSH.

I've got a django site with a good deal of javascript but my clients have terrible connectivity - how to optimize?

We're hosting a django service for some clients using really really poor and intermittent connectivity. Satellite and GPRS connectivity in parts of Africa that haven't benefited from the recent fiber cables making landfall.
I've consolidated the JavaScript files and used minified versions, tried to clean up the stylesheets, and whatnot...
Like a good Django implementer, I'm letting Apache serve all the static content like CSS, JS, and other static media. I've enabled the Apache modules deflate (for gzip) and expires to try to minimize retransmission of the JavaScript packages (mainly jQuery's huge cost). I've also enabled Django's gzip middleware (but that doesn't seem to do much in combination with Apache's deflate).
Main question - what else is there to do to optimize bandwidth utilization?
Are there Django optimizations in headers or whatnot to make sure that "already seen" data will not travel over the network?
The Django caching framework seems to be tailored towards server optimization (minimizing database hits) - how does that translate to actual bandwidth utilization?
What other tweaks are there on the Apache side to make sure the browser won't try to get data it already has?
Some of your optimizations are important for wringing better performance out of your server, but don't confuse them with optimizing bandwidth utilization. In other words, gzip/deflate are relevant, but Apache serving static content is not (even though it is important).
Now, for your problem you need to look at three things: how much data is being sent, how many connections are required to get the data, and how good are the connections.
You mostly have the first area covered by using deflate/gzip, expires, minification of JavaScript, etc., so I can only add one or two things you might not know about. First, you should upgrade to Django 1.1, if you haven't already, because it has better support for ETag/Expires headers on your Django views. You probably already have those headers working properly for static data from Apache, but if you're using an older Django they (probably) aren't being set properly on your dynamic views.
For the next area, the number of connections, you need to consolidate your JavaScript and CSS files into as few files as possible to reduce the number of connections. Consolidating your image files into a single "sprite" image can also be very helpful. There are a few Django projects to handle this aspect: django-compress, django-media-bundler (which is the only one that will create image sprites), and you can also see this SO answer.
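If you'd rather not adopt one of those projects, the consolidation step itself is small enough to script. A sketch that concatenates scripts into one bundle (file names are placeholders; minification would still be a separate step):

```python
# Sketch: concatenate several JS files into one bundle to cut the
# number of HTTP connections per page. File names are placeholders;
# minify the resulting bundle as a separate step.
SOURCES = ["js/jquery.plugins.js", "js/site.js", "js/forms.js"]
BUNDLE = "js/bundle.js"

with open(BUNDLE, "w") as out:
    for path in SOURCES:
        with open(path) as src:
            out.write(f"/* {path} */\n")  # marker to ease debugging
            out.write(src.read())
            out.write(";\n")              # guard against a missing trailing semicolon
```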
For the last area, how good the connections are, you should look at a global CDN as suggested by Alex, or at the very least host your site with an ISP closer to your users. This could be tough for Africa, which in my experience can't even get decent connectivity to European ISPs (at least southern Africa... northern Africa might be better).
You could delegate jQuery to a CDN which may have better connectivity with Africa, e.g., Google's (and it's a free service!-). Beyond that, I recommend anything ever written (or spoken on video, there's a lot of that!-) by Steve Souders -- while his talks, books, and essays are invaluable to EVERY web developer, I think they're particularly precious to ones serving a low-bandwidth audience (e.g., one of his tips in his latest books and talks is that a substantial fraction of the world's browsers do NOT get compression benefits from deflate or gzip -- it's not so much about the browsers themselves as about proxies and firewalls doing things wrong, so "manual compression" is STILL important then!).
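To illustrate what "manual compression" can mean in the simplest case, here is a deliberately naive sketch that strips blank lines and whole-line comments from a bundle; a real minifier (jsmin, YUI Compressor, etc.) handles strings, block comments, and identifier renaming properly:

```python
# Deliberately naive "manual compression" sketch: strip blank lines and
# whole-line // comments from a JS bundle. A real minifier (jsmin, YUI
# Compressor, etc.) also handles block comments, strings, and renaming.
def naive_minify(source: str) -> str:
    kept = []
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("//"):
            continue  # drop blank lines and whole-line comments
        kept.append(stripped)
    return "\n".join(kept)

with open("js/bundle.js") as f:            # placeholder paths
    slimmed = naive_minify(f.read())
with open("js/bundle.min.js", "w") as f:
    f.write(slimmed)
```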
This is definitely not an area I've had a lot of experience in, but looking into Django's ConditionalGetMiddleware might prove useful. I think it might help you solve the first of your bullet points.
EDIT: This might be a good place to start: http://docs.djangoproject.com/en/dev/topics/conditional-view-processing/
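From that page, the decorator form looks roughly like this; the last-modified lookup below is a hypothetical stand-in for a real model query:

```python
# Sketch: conditional view processing so repeat visitors get a 304 Not
# Modified instead of the full page. The lookup function is a
# hypothetical stand-in for a real model query.
from datetime import datetime

from django.http import HttpResponse
from django.views.decorators.http import condition

def page_last_modified(request):
    # Hypothetical: return when the underlying data last changed, e.g.
    # Entry.objects.latest("published").published
    return datetime(2009, 11, 1)

@condition(last_modified_func=page_last_modified)
def front_page(request):
    # Django compares If-Modified-Since against page_last_modified() and
    # answers 304 without running this body when the client copy is fresh.
    return HttpResponse("<html><body>expensive page here</body></html>")
```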