Amount of traffic t2.micro can handle - amazon-web-services

How can I estimate how many page views per second or in parallel an instance like t2.micro can handle? I know this varies depending on database queries, template processing and such, but I need some conservative estimates or real world examples just for a point of reference.

You're going to have issues if you try and apply a typical VPS type thought process to AWS. One of the strengths of AWS is that you have elasticity. Put up some instances, then add more when demand increases. Auto Scaling Groups mixed with an Elastic Load Balancer helps greatly with automatically dealing with demand (though it's not going to handle unexpected spike traffic very well, you'll just have to have a lot of standby instances ready if you want to deal with that).
One reason why you don't want to have t1.micros directly serving up requests is because slow clients can take up sockets that could be used to connect to the database. That's why you let ELB handle the clients instead so you don't have to deal with that. Also how many clients you can handle will be very much based on the number of available sockets, available resources, what type of web server you have installed, etc. etc.
If you're serving up static files then just use S3, potentially mixed with CloudFront to deal with that. For dealing with simple API calls that do CRUD operations on a database just use Lambda Functions with API Gateway. Since both Lambda and API Gateway will scale up with demand you won't really have to worry about the page views issue as much.
You're simply going to have an extremely difficult time finding a direct answer to your question just due to the way AWS works and how folks utilize it.

Related

Small instance and autoscale or big instance?

I'm about to get an Amazone ec2 account for my 4 e-commerce websites. Sometimes there is a big traffic or i have to make a lot of queries to generate some products feeds.
But there is something i can't figure out about this system : Why there is different instances specs ?
I mean if i pay per hour, then why don't i get the smallest instance spec, and if needed the auto-scale will make 2 or 100 instances to deal with the traffic.
So can someone advice or explain why should i or shouldn't do what i said above ?
And why there is different instance sizes ?
Thank you
Not all applications are same, the number of requests these web applications handle quite different, basically Amazon AWS provides various flavors of virtual servers http://aws.amazon.com/ec2/instance-types/, some are best suitable for E commerce websites with predictable bursts, blogs with less usage, Compute intensive apps, Media Streaming Servers which requires higher RAM etc.
You decide the best suitable type and size of the virtual server for your application to begin with and then scale up and scale down based on the load. AWS provides reference use-cases and best practices here http://aws.amazon.com/architecture/ to architect elastic, scalable and fault tolerant web services.

EC2 Architecture design for Website

I have a site that I will be launching soon. Not entirely sure how heavy the traffic will get.
I am using Django+Nginx+Gunicorn+Mysql. There will be support for SSL/HTTPS.
As a starting point, I am thinking of having two micro instances balanced by Elastic Load Balancing.
The MySql database will be on one of the instances. If traffic gets heavy, I might move static files to a CDN. The micro instances serve as front-end servers responsible for only churning out HTML/JSON and serving static files. Static files are mainly CSS/js and several images (not many). I foresee database will be read-heavy and less writes.
Questions:
Assuming the traffic rises to 100k page views per day, will the 2 micro instances suffice?
Do I have to move the database to a separate instance? And what instance type would be good?
What if the traffic is only 1k page views per day?
How many gunicorn processes to run on a micro instance?
In general, what type of metrics will help me determine what kind and how many instances I would need? What is the methodology to decide what kind of architecture I would need?
Thanks a lot!
Completely dependant on how dynamic the site is planning to be. Do users generate content towards the service or is it mostly static? If the former you're going to get a lot from putting stuff like avatars, images etc. into S3 and putting that on Cloudfront. Same with your static files... keeping your servers stateless will allow you scale with ease.
At 100k page views a day you will definitely struggle with just micros... you really should only use those in a development environment and aren't meant to handle stuff like serving users. I'd use at a minimum a single small instance in-front of a Load Balancer, may sound strange but you will be able to throw in another instance when things get busy without having to mess with Route 53 or potentially having your site fail. The stateless stuff is quite important now as user-generated assets may only be reference able from one instance and not the other.
At 1k page views I'd still use a small for web serving and another small for MySQL. You can look into RDS which is great if you're doing this solo, forget about needing to upgrade versions and stuff like maintenance, backups etc.
You will also be able to one-click spin up read replicas for peak. Look into the Amazon CLI as well to help automate those tasks. Cronjobs will make it a cinch if you're time stressed otherwise Opsworks, Cloudformation and Auto-Scaling will all help with the above.
Oh and just as a comparison, an Application server of mine running Apache, PHP with APC to serve our users starts to struggle with about 80 concurrent users. Runs on a small EC2 Instance with a Small RDS (which sits at about 15% at the same time as the Application Server is going downhill)
Probably not. Micro instances are not designed for heavy production loads. They use a burstable CPU profile. They can run at 2 ECU for a couple of minutes, and then they get locked at 0.1-0.2 ECU. I tend to like c1.medium, but small may be enough for you.
Maybe, as long as they are spread out during the day and not all in a short window.
1-2 per core. Micro only has 1 core.
Every application is different. The best thing to do is run your own benchmarks using tools like ab (Apache Bench)
Following the AWS best practices architecture diagram is always a good start.
http://media.amazonwebservices.com/architecturecenter/AWS_ac_ra_web_01.pdf
I strongly advise you to store all your files on Amazon S3, and use a Route 53 DNS (or any other DNS if you want) in front of it to distribute the files, because later on if you decide to use CloudFront CDN it will be very easy to change. And, just to mention using CloudFront as CDN will increase your cost only a little bit, not a huge thing.
Doesn't matter the scenario, if we're talking a about production, you should definitely go for separate instances, at least 1 EC2 for web and 1 EC2/RDS for database.
If you are geek and like to get into the nitty gritty details, create your own infrastructure and feel free to use any automation tool (puppet, chef) or not. Or if you just want to collect the profit, or have scarce resources to take care of everything, you should try Elastic Beanstalk (http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/create_deploy_Python_django.html)
Anyway, going to create your own infrastructure or choose elastic beanstalk, always execute stress tests to have a better overview of your capacity planning needs. After you choose your initial environment, stress it using apache bench, siege or whatever other tool you may like.
Hope this helps.
I would suggest to use small instances instead of micro as micro instances often stop responding on heavy load and then it requires a stop-start. Use s3 for static files which helps in faster loading and have a look over cloud front.
The region for instance also helps in serving requests and if you target any specific region, create the instance selecting that region.
Create the database in new instance and attach ebs volume to that instance. Automate backup script to copy database files and store in ebs to avoid any issues. The instance selected here can be iops for faster processing over standard. Aws services provide lot of flexibility but you need to have scripts running to scale up and down the servers as per the timings.
Spot instance can help in future as they come cheap in case you are scaling up.

Realistically, how do I setup Amazon AWS to make it auto-scale?

I have had an EC2 instance working just fine for months (still developing, app not live yet), but I just realized I don't even know how to make my EC2 instance scale up / down depending on traffic.
The sheer number of services offered by Amazon is overwhelming, and I'm very confused.
Initially, I though I'd just have one instance, and Amazon would transparently allocate resources or create identical instances to handle traffic but it seems my impression was wrong.
My question is: can someone please tell me (in simple words, bullet list or point me to a tutorial) how to make my instance automatically grows to handle 100,000 simultaneous users then automatically goes back when the surge is done?
Assuming this is possible, can I do this via the AWS control panel? If so, how?
All I can see is micro, small, medium, etc.. instances. Each one has limited resources and it's not clear how to automatically setup the instance so that Amazon dynamically allocate additional resources to handle traffic spikes (or even gradually go up to keep with natural traffic growth for that matter).
Side question May I assume that Amazon auto-handle DDOS attacks when scaling up? (meaning rogue traffic would eventually stopped/slowed down by Amazon and scaling would only affect legitimate traffic spike). I realize this side question may be really stupid, keep in mind I didn't take my coffee yet :)
This article details how to auto scale using load balancers and EC2: http://kkpradeeban.blogspot.com/2011/01/auto-scaling-with-amazon-ec2.html
For scalability you may also want to look into this article on implementing a pub/sub system for distributed systems: http://www.infoq.com/articles/AmazonPubSub
You can't automatically change the instance type (m1.small, m1.large, etc.) in response to changing load. You can, however, have AWS automatically create new instances as your load increases, and tear them down when load subsides.
I believe this article will help you: http://aws.amazon.com/autoscaling/.

need some guidance on usage of Amazon AWS

every once in a while i read/hear about AWS and now i tried reading the docs.
But such docs seem to be written for people who already know which AWS they need to use and only search for how it can be used.
So, for myself, to understand AWS better i try to sketch a hypothetical Webapplication with a few questions.
The apps purpose is to modify content like videos or images. So a user has some kind of webinterface where he can upload his files, do some settings and a server grabs the file and modifies it (e.g. reencoding). The Service also extracts the audio track of a video and trys to index the spoken words so the customer can search within his videos. (well its just hypothetical)
So my questions:
given my own domain 'oneofmydomains.com' is it possible to host the complete webinterface on AWS? i thought about using GWT to create the interface and just deliver the JS/images via AWS, but which one, simple storage? what about some kind of index.html, is there an EC2 instance needed to host a webserver which has to run 24/7 causing costs?
now the user has the interface with a login form, is it possible to manage logins with an AWS? here i also think about an EC2 instance hosting a database, but it would also cause costs and im not sure if there is a better way?
the user has logged in and uploads a file. which storage solution could be used to save the customers original and modified content?
now the user wants to browse the status of his uploads, this means i need some kind of ACL, so that the customer only sees his own files. do i need to use a database (e.g. EC2) for this, or does amazon provide some kind of ACL, so the GWT webinterface will be secure without any EC2?
the customers files are reencoded and the audio track is indexed. so he wants to search for a video. Which service could be used to create and maintain the index for each customer?
hope someone can give a few answers so i understand AWS better on how one could use it
thx!
Amazon AWS offers a whole ecosystem of services which should cover all aspects of a given architecture, from hosting to data storage, or messaging, etc. Whether they're the best fit for purpose will have to be decided on a case by case basis. Seeing as your question is quite broad I'll just cover some of the basics of what AWS has to offer and what the different types of services are for:
EC2 (Elastic Cloud Computing)
Amazon's cloud solution, which is basically the same as older virtual machine technology but the 'cloud' offers additional knots and bots such as automated provisioning, scaling, billing etc.
you pay for what your use (by hour), for the basic (single CPU, 1.7GB ram) would prob cost you just under $3 a day if you run it 24/7 (on a windows instance that is)
there's a number of different OS to choose from including linux and windows, linux instances are cheaper to run without the license cost associated with windows
once you're set up the server to be the way you want, including any server updates/patches, you can create your own AMI (Amazon machine image) which you can then use to bring up another identical instance
however, if all your html are baked into the image it'll make updates difficult, so normal approach is to include a service (windows service for instance) which will pull the latest deployment package from a storage (see S3 later) service and update the site at start up and at intervals
there's the Elastic Load Balancer (which has its own cost but only one is needed in most cases) which you can put in front of all your web servers
there's also the Cloud Watch (again, extra cost) service which you can enable on a per instance basis to help you monitor the CPU, network in/out, etc. of your running instance
you can set up AutoScalers which can automatically bring up or terminate instances based on some metric, e.g. terminate 1 instance at a time if average CPU utilization is less than 50% for 5 mins, bring up 1 instance at a time if average CPU goes beyond 70% for 5 mins
you can use the instances as web servers, use them to run a DB, or a Memcache cluster, etc. choice is yours
typically, I wouldn't recommend having Amazon instances talk to a DB outside of Amazon because of the round trip is much longer, the usual approach is to use SimpleDB (see below) as the database
the AmazonSDK contains enough classes to help you write some custom monitor/scaling service if you ever need to, but the AWS console allows you to do most of your configuration anyway
SimpleDB
Amazon's non-relational, key-value data store, compared to a traditional database you tend to pay a penalty on per query performance but get high scalability without having to do any extra work.
you pay for usage, i.e. how much work it takes to execute your query
extremely scalable by default, Amazon scales up SimpleDB instances based on traffic without you having to do anything, AND any control for that matter
data are partitioned in to 'domains' (equivalent to a table in normal SQL DB)
data are non-relational, if you need a relational model then check out Amazon RDB, I don't have any experience with it so not the best person to comment on it..
you can execute SQL like query against the database still, usually through some plugin or tool, Amazon doesn't provide a front end for this at the moment
be aware of 'eventual consistency', data are duplicated on multiple instances after Amazon scales up your database, and synchronization is not guaranteed when you do an update so it's possible (though highly unlikely) to update some data then read it back straight away and get the old data back
there's 'Consistent Read' and 'Conditional Update' mechanisms available to guard against the eventual consistency problem, if you're developing in .Net, I suggest using SimpleSavant client to talk to SimpleDB
S3 (Simple Storage Service)
Amazon's storage service, again, extremely scalable, and safe too - when you save a file on S3 it's replicated across multiple nodes so you get some DR ability straight away.
you only pay for data transfer
files are stored against a key
you create 'buckets' to hold your files, and each bucket has a unique url (unique across all of Amazon, and therefore S3 accounts)
CloudBerry S3 Explorer is the best UI client I've used in Windows
using the AmazonSDK you can write your own repository layer which utilizes S3
Sorry if this is a bit long winded, but that's the 3 most popular web services that Amazon provides and should cover all the requirements you've mentioned. We've been using Amazon AWS for some time now and there's still some kinks and bugs there but it's generally moving forward and pretty stable.
One downside to using something like aws is being vendor locked-in, whilst you could run your services outside of amazon and in your own datacenter or moving files out of S3 (at a cost though), getting out of SimpleDB will likely to represent the bulk of the work during migration.

Creating External Monitoring for a web app

The company I work for built and hosts a web app used by our customers and I am interested in creating some kind of external monitoring page (similar to trust.salesforce.com) that users can go to to see the current state of our servers/app. I know there are tons of different 'monitoring' services out there but I want to create the service myself, to have complete control and customization. Obviously, the service would have to be hosted in a different location and data center than the app itself. One thing I am concerned about is that if I just choose a different host in a different location, if that host goes down for any reason (power failure, server failure, or even ISP failure) the monitoring software is down. For this reason, I am thinking of hosting the monitoring app on an amazon EC2 instance. With their elastic IP feature, if for some reason the data center or point where the instance is running fails, I can just create a duplicate instance with the same data (but in a different location) and everything would work fine still.
Does this sound like a feasible plan? For even more security, I was thinking of creating 2 instances in different locations and monitoring from both of them. If one instance fails, the other would still be up. Obviously, one instance has to act as the actual web host for the monitoring page. Is it possible programatically for one instance to switch the elastic IP over to itself if it detects the other instance has failed for any reason?
I know there's a lot of different things involved in this question, I'm just looking for feedback regarding ANY of it...
If you've made it this far, thanks for taking the time to read this!
What you are talking about is a complicated solution for a complicated issue. I think you are on the right track with using something like Amazon's EC2 to reduce the chance of your monitoring app of going down. Also, you could develop it yourself but there are a great deal of free monitoring solutions out there like Nagios that will do everything you are asking for and is highly extensible so you can spend your time making it look and feel like you want while leaving the more complicated portions under the hood to software that is tried and tested. The worst thing would be for you to have a bug in your software that shows something as up when it is actually down. Based off of what you are talking about doing, I would assume that would be a huge issue.
Instead of using an elastic ip - which is only assigned to one instance, consider using the Elastic Load Balancer http://aws.amazon.com/elasticloadbalancing/ which then can route over instances in any of the availability zones. This way AWS manages taking instances in/out of the pool if they become unavailable for some reason and you do not have to spend time 'moving' the Elastic IP around. It is then easy to assign your monitoring cname to the ELB hostname.
I think RandomBen's idea of using Nagios on your instances is a good one because then you do not have to recreate all the functionality in Nagios. You then spend development time setting up the system and customizing the look and feel to your needs.
Also, if you can use MySQL, you should consider using RDS http://aws.amazon.com/rds/ although you will need to pay transfer fees if you have servers outside of a region accessing the RDS in another region.