Creating External Monitoring for a web app - web-services

The company I work for built and hosts a web app used by our customers and I am interested in creating some kind of external monitoring page (similar to trust.salesforce.com) that users can go to to see the current state of our servers/app. I know there are tons of different 'monitoring' services out there but I want to create the service myself, to have complete control and customization. Obviously, the service would have to be hosted in a different location and data center than the app itself. One thing I am concerned about is that if I just choose a different host in a different location, if that host goes down for any reason (power failure, server failure, or even ISP failure) the monitoring software is down. For this reason, I am thinking of hosting the monitoring app on an amazon EC2 instance. With their elastic IP feature, if for some reason the data center or point where the instance is running fails, I can just create a duplicate instance with the same data (but in a different location) and everything would work fine still.
Does this sound like a feasible plan? For even more security, I was thinking of creating 2 instances in different locations and monitoring from both of them. If one instance fails, the other would still be up. Obviously, one instance has to act as the actual web host for the monitoring page. Is it possible programatically for one instance to switch the elastic IP over to itself if it detects the other instance has failed for any reason?
I know there's a lot of different things involved in this question, I'm just looking for feedback regarding ANY of it...
If you've made it this far, thanks for taking the time to read this!

What you are talking about is a complicated solution for a complicated issue. I think you are on the right track with using something like Amazon's EC2 to reduce the chance of your monitoring app of going down. Also, you could develop it yourself but there are a great deal of free monitoring solutions out there like Nagios that will do everything you are asking for and is highly extensible so you can spend your time making it look and feel like you want while leaving the more complicated portions under the hood to software that is tried and tested. The worst thing would be for you to have a bug in your software that shows something as up when it is actually down. Based off of what you are talking about doing, I would assume that would be a huge issue.

Instead of using an elastic ip - which is only assigned to one instance, consider using the Elastic Load Balancer http://aws.amazon.com/elasticloadbalancing/ which then can route over instances in any of the availability zones. This way AWS manages taking instances in/out of the pool if they become unavailable for some reason and you do not have to spend time 'moving' the Elastic IP around. It is then easy to assign your monitoring cname to the ELB hostname.
I think RandomBen's idea of using Nagios on your instances is a good one because then you do not have to recreate all the functionality in Nagios. You then spend development time setting up the system and customizing the look and feel to your needs.
Also, if you can use MySQL, you should consider using RDS http://aws.amazon.com/rds/ although you will need to pay transfer fees if you have servers outside of a region accessing the RDS in another region.

Related

Optimizing latency between application server (EC2) and RDS

here's how the story goes.
We started transforming a monolith, single-machine, e-commerce application (Apache/PHP) to cloud infrastructure. Obviously, the application and the database (MySQL) were on the same machine.
We decided to move to AWS. And as the first step of transformation, we decided to split the database and application. Hosting application on a c4.xlarge machine. And hosting database to RDS Aurora MySQL on a db.r5.large machine, with default options.
This setup performed well. Especially the database performance went up high.
Unfortunately, when the traffic spiked up, we started experiencing long response times. Looked like RDS, although being really fast for executing queries, wasn't returning results fast enough over the network to the EC2 machine.
So that was our conclusion after an in-depth analysis of the setup including Apache/MySQL/PHP tuning parameters. The delayed response time was definitely due to the network latency between EC2 and RDS/Aurora machine, both machines being in the same region.
Before adding additional resources (ex: ElastiCache etc) we'd first like to look into any default configuration we can play around with to solve this problem.
What do you think we missed there?
One of the bigest strength with the cloud is the scalability and you should always design your application to utilise it and it sounds like your RDS instance is getting chocked due to nr of request more than the process time for the queries. So rather go more small instances with load balancing than one big doing all the job. And with Load Balancers you will get away from a singel point of failure due to you can have replicas of your database and they can even be placed in different AZ.
Here is a blogpost you can read on the topic:
https://aws.amazon.com/blogs/database/scaling-your-amazon-rds-instance-vertically-and-horizontally/
Good luck in your aws journey.
The Best answer to your question is using read replicas, but remember only your read requests could be sent to your read replicas so you would need to design your application that way
Also for some cost savings, you should try aurora serverless
One more option is passing traffic between ec2 and rds through a private network rather than using the public internet to connect your ec2 to rds that can be one of the mistakes that might be happening

EC2, Webserver, and MySQL

I am about to launch an iOS app that will be communicating with my custom REST API. Right now I am running a single EC2 t2.micro instance running an Apache web server with MySQLi. Before I go ahead and launch it for the public, I want to hear what proper steps should be taken regarding the following.
Should I run two separate EC2 instances? One only for the web server and the other to handle only the database?
How should I approach setting up the database? Should I still use MySQLi or should I start using Amazon's RDS?
In relationship to number two, when the database and/or web server runs out of space, how is this issue handled so that it seamlessly adds space to allow the database/web server to continue growth? I also read something regarding auto-scale.
I will be expecting many requests per minute to my web server and want to take precaution.
The answer to these questions largely depends on the requirements of your application, your budget, and on what you decide to manage vs. what you'd prefer to allow AWS to manage. However, I'll answer these as best I can.
1) Yes. Separating the database from the web server (that is, 2 different EC2 instances) makes sense for a lot of reasons. This will allow you to tailor resources like memory, CPU, etc. to each layer of your application separately. You do not want your web and database competing for the same resources. Additionally, an issue that forces you to take down one (web or database) will not force you to also take down the other. If your database lives on one of the web servers and you need to perform maintenance, your app will effectively become offline, since down goes your database as you perform updates. Also, ideally you would protect your database server within a private subnet in your VPC. If you have the web and database on the same server, they will both be in a public subnet, since you're web will require access to an internet gateway.
2) Depends. If you want to maintain total control of the database server, than use an EC2 instance where you retain operating system control. If you want to take advantage of features like Multi-AZ for high availability or allowing AWS to manage things like updates for you, RDS can be a great option. Cost also plays a role. For things like read-replicas and Multi-AZ, you will pay more, but you are purchasing performance and high availability. Thus, depends on your requirements. You can find the features of RDS here: RDS Product Details
3) For anything running on an EC2 instance (database or web) or if you decide to use RDS, you may provision and attach additional storage volumes as necessary. The type of storage you select will depend on the performance requirements, your budget, and the kind of workload you expect your database to face. Amazon provides the storage options available to you as well as a section for adding more storage here: RDS Storage Options
If you are worried about too many requests overwhelming your EC2 t2.micro instance, consider creating an ELB load balancer and setting up an auto-scaling group which will allow you to expand your capacity as necessary while distributing traffic such that no one server gets overwhelmed.

EC2 Architecture design for Website

I have a site that I will be launching soon. Not entirely sure how heavy the traffic will get.
I am using Django+Nginx+Gunicorn+Mysql. There will be support for SSL/HTTPS.
As a starting point, I am thinking of having two micro instances balanced by Elastic Load Balancing.
The MySql database will be on one of the instances. If traffic gets heavy, I might move static files to a CDN. The micro instances serve as front-end servers responsible for only churning out HTML/JSON and serving static files. Static files are mainly CSS/js and several images (not many). I foresee database will be read-heavy and less writes.
Questions:
Assuming the traffic rises to 100k page views per day, will the 2 micro instances suffice?
Do I have to move the database to a separate instance? And what instance type would be good?
What if the traffic is only 1k page views per day?
How many gunicorn processes to run on a micro instance?
In general, what type of metrics will help me determine what kind and how many instances I would need? What is the methodology to decide what kind of architecture I would need?
Thanks a lot!
Completely dependant on how dynamic the site is planning to be. Do users generate content towards the service or is it mostly static? If the former you're going to get a lot from putting stuff like avatars, images etc. into S3 and putting that on Cloudfront. Same with your static files... keeping your servers stateless will allow you scale with ease.
At 100k page views a day you will definitely struggle with just micros... you really should only use those in a development environment and aren't meant to handle stuff like serving users. I'd use at a minimum a single small instance in-front of a Load Balancer, may sound strange but you will be able to throw in another instance when things get busy without having to mess with Route 53 or potentially having your site fail. The stateless stuff is quite important now as user-generated assets may only be reference able from one instance and not the other.
At 1k page views I'd still use a small for web serving and another small for MySQL. You can look into RDS which is great if you're doing this solo, forget about needing to upgrade versions and stuff like maintenance, backups etc.
You will also be able to one-click spin up read replicas for peak. Look into the Amazon CLI as well to help automate those tasks. Cronjobs will make it a cinch if you're time stressed otherwise Opsworks, Cloudformation and Auto-Scaling will all help with the above.
Oh and just as a comparison, an Application server of mine running Apache, PHP with APC to serve our users starts to struggle with about 80 concurrent users. Runs on a small EC2 Instance with a Small RDS (which sits at about 15% at the same time as the Application Server is going downhill)
Probably not. Micro instances are not designed for heavy production loads. They use a burstable CPU profile. They can run at 2 ECU for a couple of minutes, and then they get locked at 0.1-0.2 ECU. I tend to like c1.medium, but small may be enough for you.
Maybe, as long as they are spread out during the day and not all in a short window.
1-2 per core. Micro only has 1 core.
Every application is different. The best thing to do is run your own benchmarks using tools like ab (Apache Bench)
Following the AWS best practices architecture diagram is always a good start.
http://media.amazonwebservices.com/architecturecenter/AWS_ac_ra_web_01.pdf
I strongly advise you to store all your files on Amazon S3, and use a Route 53 DNS (or any other DNS if you want) in front of it to distribute the files, because later on if you decide to use CloudFront CDN it will be very easy to change. And, just to mention using CloudFront as CDN will increase your cost only a little bit, not a huge thing.
Doesn't matter the scenario, if we're talking a about production, you should definitely go for separate instances, at least 1 EC2 for web and 1 EC2/RDS for database.
If you are geek and like to get into the nitty gritty details, create your own infrastructure and feel free to use any automation tool (puppet, chef) or not. Or if you just want to collect the profit, or have scarce resources to take care of everything, you should try Elastic Beanstalk (http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/create_deploy_Python_django.html)
Anyway, going to create your own infrastructure or choose elastic beanstalk, always execute stress tests to have a better overview of your capacity planning needs. After you choose your initial environment, stress it using apache bench, siege or whatever other tool you may like.
Hope this helps.
I would suggest to use small instances instead of micro as micro instances often stop responding on heavy load and then it requires a stop-start. Use s3 for static files which helps in faster loading and have a look over cloud front.
The region for instance also helps in serving requests and if you target any specific region, create the instance selecting that region.
Create the database in new instance and attach ebs volume to that instance. Automate backup script to copy database files and store in ebs to avoid any issues. The instance selected here can be iops for faster processing over standard. Aws services provide lot of flexibility but you need to have scripts running to scale up and down the servers as per the timings.
Spot instance can help in future as they come cheap in case you are scaling up.

Realistically, how do I setup Amazon AWS to make it auto-scale?

I have had an EC2 instance working just fine for months (still developing, app not live yet), but I just realized I don't even know how to make my EC2 instance scale up / down depending on traffic.
The sheer number of services offered by Amazon is overwhelming, and I'm very confused.
Initially, I though I'd just have one instance, and Amazon would transparently allocate resources or create identical instances to handle traffic but it seems my impression was wrong.
My question is: can someone please tell me (in simple words, bullet list or point me to a tutorial) how to make my instance automatically grows to handle 100,000 simultaneous users then automatically goes back when the surge is done?
Assuming this is possible, can I do this via the AWS control panel? If so, how?
All I can see is micro, small, medium, etc.. instances. Each one has limited resources and it's not clear how to automatically setup the instance so that Amazon dynamically allocate additional resources to handle traffic spikes (or even gradually go up to keep with natural traffic growth for that matter).
Side question May I assume that Amazon auto-handle DDOS attacks when scaling up? (meaning rogue traffic would eventually stopped/slowed down by Amazon and scaling would only affect legitimate traffic spike). I realize this side question may be really stupid, keep in mind I didn't take my coffee yet :)
This article details how to auto scale using load balancers and EC2: http://kkpradeeban.blogspot.com/2011/01/auto-scaling-with-amazon-ec2.html
For scalability you may also want to look into this article on implementing a pub/sub system for distributed systems: http://www.infoq.com/articles/AmazonPubSub
You can't automatically change the instance type (m1.small, m1.large, etc.) in response to changing load. You can, however, have AWS automatically create new instances as your load increases, and tear them down when load subsides.
I believe this article will help you: http://aws.amazon.com/autoscaling/.

How to manage and connect to dynamic IPs of EC2 instances?

When writing a web app with Django or such, what's the best way to connect to dynamic EC2 instances, such as a cluster of Redis or memcache instances? IP addresses change between reboots, etc. Elastic IPs are limited to 5 by default - what are some other options for auto-discovering/auto-updating which machines are available?
Late answer, but use Boto: http://boto.cloudhackers.com/en/latest/index.html
You can use security groups, tags, and other means to hit the EC2 API and pick the instances/IPs for each thing (DB Server, caching server, etc.) at load-time. We do this with great success in deployment, and are moving that way with our Django settings.py, as well.
One method that I heard mentioned recently in an AWS webinar was to store this sort of information in SimpleDB. Essentially, you would use SimpleDB as the central configuration location, and each instance that you launch would register its IP etc. with this configuration, so you would always have a complete description of all of your instances in one place. I haven't seen this in practice so I don't know what the best practices would be exactly, but the idea sounds reasonable. I suppose you could use SNS or something to signal all the other instances whenever the configuration changes, so everyone could refresh their in-memory cache of the configuration.
I don't know the AWS administrative APIs yet really, but there's probably an API call to list your EC2 instances, at which point you could use some sort of custom protocol to ping each of them and ask it what it is -- part of the memcache cluster, Redis, etc.
I'm having a similar problem and didn't found a solution yet because we also need to map Load Balancers addresses.
For your problem, there are two good alternatives:
If you are not using EC2 micro instances or load balancers, you should definitely use Amazon Virtual Private Cloud, because it lets you control instances IPs and routing tables (check all limitations before using this service).
If you are only using EC2 instances, you could write a script that uses the EC2 API tools to run the command ec2-describe-instances to find all instances and their public/private IPs. Then, the script could parameterize instances names to hosts and update /etc/hosts. Finally, you should put the script in the crontab of every computer/instance that need to access the EC2 instances (see ec2-describe-instances).
If you want to stay with EC2 instances (I'm in the same boat, I've read that you can do such things with their VPC or use an S3 bucket or something like that.) but with EC2, I'm in the middle of writing stuff like this...it's all really simple up till the part where you need to contact the server with a server from your data center or something. The way I'm doing it currently is using the API to create the instance and start it...then once its ready, I contact the server to execute a powershell script that I have on the server....the powershell renames the computer and reboots it...that takes care of needing the hostname and MAC for our data center firewalls. I haven't found a way yet to remotely rename a computer.
As far as knowing the IP, the elastic IPs are the way to go. They say you're only allowed 5 and gotta apply for more but we've been regularly requesting more and they give em to us..we're up to like 15 now and they haven't complained yet.
Another option if you dont' want to do all the computer renaming and such...you could use DHCP and set your computer up so when it boots it gets the computer name and everything from DHCP....I'm not sure how to do this exactly, I've come across very smart people telling me that's the way to do it during my research for Amazon.
I would definitely recommend that you get into the Amazon API...I've been working with it for less than a month and I can do all kinds of crazy things. My code can detect areas of our system that are getting stressed, spin up 10 amazon servers all configured to act as whatever needs stress relief, and be ready to send jobs to all in less than 7 minutes. Brings a tear to my eye.
The documentation is very complete...the API itself is a work of art and a joy to program against...I've very much enjoyed working with it. (and no, i dont' work for them lol)
Do it the traditional way: with DNS. This is what it was built for, so use it! When a machine boots, have it ask for the domain name(s) related to its function, and use that for your configuration. If it stops responding, re-resolve the DNS (or just do that periodically anyway).
I think route53 and the elastic load balancing stuff can be used to do this, if you want to stick to Amazon solutions.