I've been trying to wrap my head around the best solution for hosting development sites for our company lately.
To be completely frank, I'm new to AWS and its architecture, so more than anything I just want to know if I should keep learning about it, or find another, more suitable solution.
Right now we have a dedicated server which hosts our own website, our intranet, and a lot of websites we've developed for clients.
Our own website and the intranet aren't an issue; however, I'm not quite sure about the websites we've produced for our clients.
There are about 100 of them right now. These sites are only used pre-launch, so our clients can populate them with content. As soon as the content is done we host the website somewhere else, and the copy on our development server is no longer used at all. We keep it there, though, so that if the client wants a new template or function we can show it there before sending it to production.
This means the development sites have almost zero traffic, with perhaps at most 5 or so people adding content to them at any given time (5 people for all 100 sites, not 5 per site).
These sites need to be available at all times, and should always feel snappy.
These are not static sites, they all require a database connection.
Is AWS (EC2, or any other kind of instance, Lightsail?) a valid solution for hosting these sites? Or should I just downgrade our current dedicated server to a VPS, and only worry about hosting our main site on AWS?
I'll put this in an answer because it's too long for a comment, but it's just advice.
If you move those sites to AWS you're likely to end up paying (significantly) more than you do now. You can use the Simple Monthly Calculator to get an idea.
To clarify, AWS is cost-effective for certain workloads. It can scale automatically when needed, so you don't have to provision for peak traffic all the time, and it's easy to work with, so it takes fewer people and you don't have to pay a big ops team. It is cost-effective for small teams that want to run production workloads with little operational overhead, up to big teams that are not yet big enough to build their own cloud.
Your sites are development sites that just sit there and see very little activity, which probably puts them below the threshold where AWS is cost-effective.
You should clarify why you want to move. If the reason is that you want as close to 100% uptime as possible, then AWS is a good choice. But it will cost you, both in the bill paid to Amazon and in the price of learning to set up such infrastructure. If cost is a primary concern, you might want to think it over.
That said, if your requirements for the next year or more are predictable enough and you have someone who knows what they are doing in AWS, there are ways to lower the cost, so it might be worth it. But without further detail it's hard for anyone to give you a definitive answer.
However. You also asked if you should keep learning AWS. Yes. Yes, you should. If not AWS, one of the other major clouds. Cloud and serverless[1] are the future of much of this industry. For some that is very much the present. Up to you if you start with those dev sites or something else.
[1] "Serverless" is as misleading a name as NoSQL. It doesn't mean no servers.
Edit:
You can find a list of EC2 (Elastic Compute Cloud) instance types here. That's CPU and RAM. Realistically, the cheapest instance is about $8 per month. You also need storage, which is called EBS (Elastic Block Store). There are multiple types of that too; you probably want GP2 (General Purpose SSD).
I assume you also have one or more databases behind those sites. You can either set up the database(s) on EC2 instance(s) yourself, or use RDS (Relational Database Service). Again, there are multiple choices there. You probably don't want Multi-AZ for dev. In short, Multi-AZ means two RDS instances, so that if one crashes the other takes over, but it's also double the price. You pay for storage there too.
And, depending on how you set things up you might pay for traffic. You pay for traffic between zones, but if you put everything in the same zone traffic is free.
Storage and traffic are pretty cheap though.
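To make that concrete, here is a rough boto3 sketch of the pieces described above: one small EC2 instance with a GP2 EBS volume and a single-AZ RDS instance. This is my illustration, not part of the original answer; the AMI ID, region, sizes, and credentials are all placeholder assumptions.

```python
# Rough sketch only: small EC2 box + single-AZ RDS, as discussed above.
import boto3

ec2 = boto3.client("ec2", region_name="eu-west-1")
rds = boto3.client("rds", region_name="eu-west-1")

# One small web box with a GP2 (General Purpose SSD) root volume.
ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder AMI
    InstanceType="t3.micro",          # roughly the ~$8/month class
    MinCount=1,
    MaxCount=1,
    BlockDeviceMappings=[{
        "DeviceName": "/dev/xvda",
        "Ebs": {"VolumeSize": 20, "VolumeType": "gp2"},
    }],
)

# Single-AZ RDS: MultiAZ=False avoids paying for the standby instance,
# which is fine for dev sites.
rds.create_db_instance(
    DBInstanceIdentifier="dev-sites-db",
    DBInstanceClass="db.t3.micro",
    Engine="mysql",
    AllocatedStorage=20,
    MasterUsername="admin",
    MasterUserPassword="change-me",
    MultiAZ=False,
)
```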
This is only the most basic of the basics. As I said, it can get complicated. It's probably worth it, but if you don't know AWS you might end up paying more than you should. Take it slow and keep reading.
I am new to microservices and I am keen to use this architecture. I am interested to know what architectural structure should be used for systems with multiple customer interfaces, where customer systems may use one or many of the available services. Here is a simple illustration of a couple of ways I think it would be used:
An example of this type of system could be:
Company with multiple staff using the system for product quotes
using the products, quotes and users microservices
Company with a website to display products
using the products microservice
Company with multiple staff using the system for their own quotes
using the quotes and users microservices
Each of these companies would have their own custom-built interface, only displaying the relevant services.
As in the illustrations, all quotes, products and users could be stored locally to the microservices, using unique references to identify each company's records. I don't know if this is advisable, as the data could grow fast and become difficult to manage.
Alternatively, I could store data such as users and quotes locally in each client system and only reference the microservices for generic data. Here the microservices would just handle common logic and return results. This feels somewhat illogical and problematic to me.
I've not been able to find anything online to explain the best course of action for this scenario and would be grateful for any experienced feedback on this.
I am afraid you will not find many useful recipes or patterns for microservice architectures yet. I think the relative quiet around your question is because it doesn't have enough detail for anybody to readily grasp. I will make a WAG (wild guess):
From first principles, you have the concept of a quote, which would have to interrogate the product service to get a price and other details. It might need to access users to produce commission information, and customers for things like discounts and lead times. Similar concepts may be used in different applications; for example inventory, catalog, and ordering (slightly different from quoting).
The idea in microservices is to reduce the overlap between these concepts by dispatching the common operations as their own (micro) services, and constructing the aggregate services in terms of them. Just because something exists as a service does not mean it has to be publicly available. It can be private to just these services.
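As a minimal sketch of that composition (my own illustration, with made-up URLs and field names): a quote service owning quoting logic, but asking a private product service for price and details over HTTP.

```python
# Hypothetical sketch: a quote service composing a private product service.
import requests

PRODUCT_SERVICE = "http://products.internal:8000"  # private, not public

def build_quote(product_id: str, quantity: int, discount: float = 0.0) -> dict:
    # Ask the product service for price/details rather than duplicating them.
    resp = requests.get(f"{PRODUCT_SERVICE}/products/{product_id}", timeout=2)
    resp.raise_for_status()
    product = resp.json()
    unit_price = product["price"] * (1 - discount)
    return {
        "product_id": product_id,
        "description": product["description"],
        "quantity": quantity,
        "total": round(unit_price * quantity, 2),
    }
```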
When you have strained your system into these single-function services, the resulting system will communicate more, but it can be deployed more flexibly. For example, more resources and/or redundancy might be applied to the product service if it is overtaxed by requests from many services. In the end, infrastructure like a service mesh helps isolate the implementation of these microservices from these sorts of deployment considerations.
Don’t be misled into thinking there is a free lunch. Microservice architectures require more upfront work in defining the service boundaries. A failure in this critical area can yield much worse problems than a poorly scaled monolithic app. Even when you have defined your services well, you might find they rely upon external services that are not as well considered. The only solace there is that it is much easier to insulate yourself from these if you have already insulated the rest of your system from its parts.
After much research, following various online courses, video tutorials and some documentation provided by Netflix, I have come to understand that the first structure in the diagram is the best solution.
Each service should be able to effectively function independently, with the exception of referencing other services for additional information. Each service should in effect be able to be picked up and put into another system without any need to be aware of anything beyond the API layer of the architecture.
I hope this is of some use to someone trying to get to grips with this architecture.
This is my first attempt at looking into cloud hosting and I'm feeling like a complete idiot. I have always had my own dedicated server, which I would remote into and install/manage everything myself. So this cloud thing is completely new to me. I just can't seem to grasp basic things... like how I would get Tomcat and PostgreSQL installed in a way that they could talk to each other, or get my domain and SSL cert on there, etc.
If I could just get a feel for where I should start, then I could probably calculate my costs and jump into the free trial where hopefully things will click for me.
Here are my basic, high-level requirements...
My web app running in Tomcat over HTTPS
Let's say approximately 1,000 page views per day
PostgreSQL supporting my web app.
Let's say approximately 10GB database storage
Throughout the day, a fairly steady stream of inbound SFTP data (~ 100MB per day)
The processing load on the app server side should be fairly light. The heavy lifting will be on the DB side, sorting through and processing lots of data.
I'm having trouble figuring out which options I would install and calculating costs. If someone could help me get started by saying something like "You would start with a std-xyz-med server, install ABC located here at http://blahblah, then install XYZ located at http://XYZ.... etc.. etc. You can expect to pay somewhere around $100-$200 per month"....
Thoughts?
I would be eternally grateful. It seems like they should have some free sales support channel to ask someone at Google about this, but I don't see it.
Thank You!
I'll try to give you some tips on where to start looking.
I will be referring to some products; here are the links.
If you want to stick to your old ways, you can always spin up an instance on Compute Engine and set it up the same way you did before; these are just regular virtual machines. For some use cases this is completely valid.
You can split different components of your stack to different products:
For example, if your app is fine with PostgreSQL, you can spin up a fully managed database in Cloud SQL, which might make it easier to manage backups or have several apps access the same DB.
Alternatively, have a look at the different DB offerings to see if any of them matches your needed workload better. Perhaps have a look at BigQuery?
If you want to turn your app into a microservice, which is then easier to autoscale and is more fault tolerant, have a look at App Engine. That way you don't need to manage a virtual machine. The docs here will lead you through some easy-to-follow examples of how to set up SSL.
For the services to talk to each other, refer to docs of the individual components. It's usually very simple.
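For example, a Python app connecting to a Cloud SQL PostgreSQL instance might look roughly like this. The /cloudsql unix-socket path is the pattern the Cloud SQL docs describe for App Engine; the project, instance, and credential names here are made up.

```python
# Illustrative sketch: connect to Cloud SQL Postgres via its unix socket.
import psycopg2

conn = psycopg2.connect(
    dbname="myapp",
    user="myapp-user",
    password="change-me",
    # Socket directory Cloud SQL exposes: /cloudsql/PROJECT:REGION:INSTANCE
    host="/cloudsql/my-project:us-central1:my-instance",
)
with conn, conn.cursor() as cur:
    cur.execute("SELECT version();")
    print(cur.fetchone())
```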
With pricing, try https://cloud.google.com/products/calculator/
Things like BigQuery have different pricing models: you don't pay for server uptime, but for the amount of data stored and processed by your queries.
Is there someone who has experience hosting Django applications on ep.io?
What are the pros/cons of it?
I'm currently using ep.io. I'm still developing my app, but I have an app deployed and running.
When you use a service like this, you go into it knowing that it isn't going to be the perfect solution for every case. Knowing the pros and cons beforehand will help set your expectations so that you aren't disappointed later on.
ep.io is still very young, and I believe still in beta, and isn't available to the general public. To be totally fair to them, it is still a work in progress, and some of these pros and cons may change as they roll out new features. I will try to come back and update this post as new versions become available and my experience with the service continues.
So far I am really pleased with what they have, they took the most annoying part of developing an application and made it better. If you have a simple blog app, it should be a breeze to deploy it, and probably not cost that much to host.
Pros:
Server management: You don't have to worry about your server setup at all; it handles everything for you. With a VPS you would need to keep the server up to date with security patches and all that fun stuff; with this you don't worry about anything, they take care of all that for you.
Deployment: It makes deploying an app and having it up and running really quick. Deploying a new version of an app is a piece of cake; I just need to run one, maybe two, commands and it handles everything for me.
Pricing: You are only charged for what you use, so if you have a very low-traffic website, it might not cost you anything at all.
Scaling: They handle scaling and load balancing for you out of the box, no need for you to worry about that. You still need to write your application so that it can scale efficiently, but if you do, they will handle the rest.
Background tasks: They have support for cron jobs as well as background workers using Celery.
Customer support: I had a few questions, sent them an email, and had an answer really fast. They have been great, so much better than I would have expected. If you run your own VPS, you really don't have anyone to talk to, so this is a major plus.
Cons:
DB access: You don't have direct access to the database. You can get to the psql shell, but you can't connect an external GUI client. This makes some things a little more difficult or slow, but you can still use the Django admin or fixtures to do a lot of things.
Limited services available: It currently only supports PostgreSQL and Redis, so if you want to use MySQL, Memcached, MongoDB, etc., you are out of luck.
Low-level C libs: You can't install any dependency you want. Similar to Google App Engine, they have some of the common C libs installed already, and if you want something different that isn't already installed you will need to contact them to get it added. http://www.ep.io/docs/runtime/#python-libraries
Email: You can't send or receive email, which means you will need to depend on a third party for that. That is probably good practice anyway, but it means more money.
File system: You have a more limited file system available to you, and because of the distributed nature of the system you need to be very careful when working with files. You can't (unless I missed it) connect to your account via (S)FTP to upload files; you need to use the ep.io command-line tool and either do an rsync or push a repo to get files up there.
Update: for more info see my blog post on my experiences with ep.io : http://kencochrane.net/blog/2011/04/my-experiences-with-epio/
Update: ep.io closed down on May 31st, 2012.
We are finishing up our web application and planning for deployment. A very important aspect of deploying to production is monitoring the health of the system. Having a small team of developers/support makes it critical for us to get early notifications of potential problems and resolve them before they impact users.
Using Nagios seems like a good option, but I wanted to get more opinions on the best monitoring tools/practices for web applications in general, and specifically for a Django app. I would also welcome recommendations on what should be monitored aside from the obvious: CPU, memory, disk space, and database connectivity.
Our web app is written in Django; we are running on Linux (Ubuntu) under Apache + FastCGI with a PostgreSQL database.
EDIT
We have a completely virtualized environment under Linode.
EDIT
We are using django-logging, so we have a way to separate info, errors, critical issues, etc.
Nagios is good; it's also good to have system tests (Selenium) running regularly.
Edit: Hyperic and Groundwork also look interesting.
There is probably a test-suite system that can keep pressure-testing everything for you as well. I can't remember the name off the top of my head; maybe someone can mention one below.
Other things I like to do:
The best motto for infrastructure is always fix, detect, repair. Get it up, get to the root of it, and cure/prevent it if you can.
Since a system exists at many levels, we should test at many levels:
Edit: Have all errors or warnings posted directly to your case manager via email. That way you can track occurrences in one place.
1) Connection: Monitor your internet connectivity from the server and from the outside. Log this somewhere.
2) Server: Monitor all the processes you need to, to ensure they are running and not pinning the server. Use an HP server or something equivalent with hardware-failure notification, which it can do at the BIOS level. Notify and log any failures.
3) Software: Identify the key software that always needs to be running. Set performance levels, if any, and then monitor them. Nagios should be able to help with this; on Windows it can be a bit more work. When an exception occurs, you should be able to run a script from it to restart processes automatically (see the sketch after this list). My dream system would let me interact with servers via SMS when the server sees an exception, so I can either permit the action or let it happen automatically unless I cancel by SMS. One day...
4) Remote power: Ensure remote power-reset capabilities are in your hands. You might want to schedule weekly reboots if you ever use Windows for anything.
5) Business-logic testing: Have regularly running scripts that test the workflows of your system. Selenium can probably achieve some of this, but I like logging the results as well, so I can say "this ran at this time and these files had errors". If possible anywhere, have the system monitor itself through your scripts.
6) Backups: Make a backup that you can set and forget. If you can get things into virtual machines, that would be ideal, as you can scale, move, or deploy any part of your infrastructure anywhere. I have had instances where I moved a dead server onto my laptop and let it run in VMware while I fixed the problem.
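As a rough illustration of item 3, a watchdog along these lines could check key processes and restart them. The process names and restart commands are assumptions for your stack, and psutil is a third-party library you would need to install.

```python
# Hypothetical watchdog: check key processes, restart any that are missing.
import subprocess
import psutil  # third-party: pip install psutil

# Assumed process names and restart commands; adjust for your stack.
REQUIRED = {
    "apache2": ["service", "apache2", "start"],
    "postgres": ["service", "postgresql", "start"],
}

def ensure_running():
    alive = {p.info["name"] for p in psutil.process_iter(["name"])}
    for name, restart_cmd in REQUIRED.items():
        if not any(name in proc for proc in alive if proc):
            # Log and restart; Nagios could run this as an event handler.
            print(f"{name} not running, restarting...")
            subprocess.run(restart_cmd, check=False)

if __name__ == "__main__":
    ensure_running()
```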
Monitoring the number of connections to your Web server and your database is another good thing to track. Chances are if one shoots through the roof, something is starving for resources and the site is about to go down.
Also make sure you have a regular request to a URL that is a reasonable end-to-end test of the system. If your site supports search, then have Nagios execute a search; that should exercise the search index, the web server, and the database server.
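A sketch of what such a check could look like as a Nagios-style plugin (exit code 0 = OK, 2 = CRITICAL). The URL and the expected marker text are placeholders, not anything from the original answer.

```python
# Illustrative end-to-end check: hit the search URL and verify the response.
import sys
import requests

def check_search(base_url: str) -> int:
    try:
        resp = requests.get(f"{base_url}/search", params={"q": "test"}, timeout=10)
    except requests.RequestException as exc:
        print(f"CRITICAL - search request failed: {exc}")
        return 2
    if resp.status_code != 200 or "results" not in resp.text.lower():
        print(f"CRITICAL - unexpected response ({resp.status_code})")
        return 2
    print(f"OK - search responded in {resp.elapsed.total_seconds():.2f}s")
    return 0

if __name__ == "__main__":
    sys.exit(check_search("https://www.example.com"))
```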
Also, make sure your application sends you an email any time your users see an error or there is an unhandled exception. That way you know how the application is failing in the field.
If I had to pick one type of testing, it would be to test the end-user functionality of the system. The important thing to consider is the user. While testing things like database availability, server uptime, etc. is important, testing workflows through your system via a remote UI testing system covers all these bases. If you know that the critical parts of your system are available to the end user, then you know your system is probably OK.
Identify the important work-flows in your system. For example, if you wrote an eCommerce site you might identify a work-flow of "search for a product, put product in shopping cart, and purchase product".
Prioritize the work-flows, and build out higher-priority tests first. You can always add additional tests after you roll out to production.
Build UI tests using one of the available UI testing frameworks. There are a number of free and commercial UI testing frameworks that can be run in an automated fashion. Build a core set of tests first that address critical work-flows.
Set up at least one remote location from which to run tests. You want to test every aspect of your system, which means testing it remotely. Is the internet connection up? Is the web server running? Is the connection to the database server working? Etc. If you test remotely, you make sure your system is available to the outside world, which means it is most likely working end-to-end. You can also run these tests internally, but I think it is critical to run them externally.
Make sure your solution includes both reporting and notification. If one of your critical workflow tests fails, you want someone to know about it to fix the problem ASAP. If a non-critical task fails, perhaps you only want reporting so that you can fix problems out of band.
This end-user testing should not eliminate monitoring of systems in your data center, but I want to reiterate that end-user testing is the most important type of testing you can do for a web application.
Ahhh, monitoring. How I love thee and your vibrations at 3am.
Essentially, you need a way to inspect the internal state of your application, both at a specific moment, as well as over spans of time (the latter is very important for detecting problems before they occur). Another way to think of it is as glorified unit-testing.
We have our own (very nice) monitoring system, so I can't comment on Nagios or other apps. Our use case is similar to yours, though (cgi app on apache).
Add a logging.monitor() type method, which will log information to disk. This should support, at the least, logging simple numbers and dicts of numbers (the key=>value association can be incredibly handy).
Have a process that scrapes the monitoring logs and stores them into a database.
Have a process that takes the database information, checks it against rules, and sends out alerts. Keep in mind that some things can be flaky; just because you got a 404 once doesn't mean the app is down.
Have a way to mute alerts (very useful for maintenance or to read your email).
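A bare-bones sketch of the first step, purely illustrative (the log path and metric names are my assumptions): append key=>value samples to a log that a separate process can later scrape into a database.

```python
# Hypothetical monitor() helper: log named numbers (or dicts of numbers)
# to disk with a timestamp, one JSON record per line.
import json
import time

MONITOR_LOG = "/var/log/myapp/monitor.log"  # placeholder path

def monitor(name, value):
    """Log a named number or dict of numbers for later scraping."""
    record = {"ts": time.time(), "name": name, "value": value}
    with open(MONITOR_LOG, "a") as fh:
        fh.write(json.dumps(record) + "\n")

# Example calls from request-handling code:
monitor("response_status", {"200": 1})
monitor("query_latency_ms", 42.7)
```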
That's all pretty high level. The important thing is that you have a history of the state of the application over time. From this, you can then create rules (perhaps just raw SQL queries you put into a config somewhere) that say "if the queries per second doubled, send a SlashDotted alert" or "if 50% of responses are 404s, send an alert". It also bedazzles management, because you can quantify any comment about whether the app is up, down, fast, or slow.
Things to monitor include (others have probably mentioned these as well): HTTP status, port accessibility, HTTP load, database load, open connections, query latency, server accessibility (SSH, ping), queries per second, number of worker processes, error percentage, and error rate.
Simple end-to-end tests are also very handy, though they can be brittle. It's best to keep them simple, but you should have one that touches the core pieces of the app (caching, database, authentication).
I use Munin and Monit, and have been very happy with both of them.
Internal logging is fine and dandy, but when your whole app goes down or your box/environment crashes, you need an outside check too. http://www.pingdom.com/ has been very reliable for me.
My only other advice is that I wouldn't spend too much time on this. My best example is Twitter: how much energy did they put into making the system able to half-die, instead of investing that time and energy into throwing more hardware at it and scaling it out?
Chances are what ends up taking you down, your logging and health systems will have missed anyway.
The single most important way to monitor any online site is to monitor externally. The goal should be to monitor your site in a way that most closely reflects how your users use the site. In 99% of cases, as soon as you know that your site is down externally, it's relatively easy to find the root cause. The most important thing is to know as soon as possible that your customers are unable to load your site.
This generally means using an external performance-monitoring service. They vary from the very low end (mon.itor.us, Pingdom) to the high end (Webmetrics, Gomez, Keynote), and as always, you get what you pay for. Things to look for when shopping around for a monitoring service include:
The size and distribution of the monitoring network
Whether or not the monitoring solution is able to monitor your site using a real browser (otherwise you aren't testing your site the way a real user would)
The scripting language (to script the transactions against your site)
The support department, to help you along the way, and provide expertise on how to monitor correctly
Good luck!
Web monitoring by IP Patrol or SiteSentry has been useful for us. The second is a bit like SiteConfidence but slightly prettier, lol.
Have you thought about monitoring the functionality as well? A script (either in a scripting language like Perl or Python, or using some tool like WebTest) that talks to your application and runs through some important steps like logging in, making a purchase, etc. is very nice to have; for example, something like the sketch below.
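Here is a rough Python version with requests; the URLs, form fields, and the monitoring account are all placeholder assumptions, not the answer's actual tooling.

```python
# Illustrative functional check: log in and walk one purchase workflow.
import requests

def check_purchase_flow(base="https://www.example.com"):
    s = requests.Session()
    # Step 1: log in with a dedicated monitoring account.
    r = s.post(f"{base}/login",
               data={"user": "monitor", "password": "secret"}, timeout=10)
    r.raise_for_status()
    # Step 2: add a known product to the cart.
    r = s.post(f"{base}/cart/add", data={"product_id": "42"}, timeout=10)
    r.raise_for_status()
    # Step 3: confirm the cart page shows the product.
    r = s.get(f"{base}/cart", timeout=10)
    assert "product 42" in r.text.lower(), "cart did not contain test product"

if __name__ == "__main__":
    check_purchase_flow()
    print("OK - purchase workflow completed")
```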
Aside from what to monitor, which has already been answered, you need to make sure that, whatever system you use, you get only one notification for an error that happens on every request. Otherwise your inbox will run out of memory :) Plus, it's plain annoying...
Divide the standby shifts among the support/dev team, so one person does not have to be on call every single evening. That will wear people down. Monitoring is a good thing, but everyone needs to get a chance to have a life once in a while. Your cellphone buzzing at 2AM for a few nights will get very old pretty soon, trust me. And not every developer is used to 24/7 support, so you need to find the balance between using monitoring and abusing monitoring.
Basically, have distinct escalation levels, and if the sky is not falling, define a "serenity now" window at night where smaller escalation levels don't go out.
I've been using Nagios + CruiseControl + Selenium for running high-level tests on mission-critical web applications. I got burned pretty hard by a simple jQuery error that stopped users from proceeding through an online signup form.
http://www.agileatwork.com/the-holy-trinity-of-web-2-0-application-monitoring/
You can take a look at AlertGrid. This web application allows you to filter and forward alerts to your team (worldwide). It also has a nice ability to monitor whether something did not happen.
To paraphrase Richard Levasseur: ah, monitoring tools, how your imperfections frustrate me. There doesn't seem to be a perfect tool out there. Nagios is pretty easy to set up, but the UI is kind of old-fashioned, and you have to have a daemon running on each monitored server. Zenoss has a much nicer UI, including trend graphs of resource usage, but it uses SNMP, so you need some familiarity with that to get it working properly, and the documentation is not the best: there are hundreds of pages, but it's really hard to find just the info you need to get started.
Friends of mine have also recommended Cacti and Hyperic, but I don't have personal experience with those.
One last thing: one of the other answers suggested running a tool that stresses your site. I wouldn't recommend doing that on your live site unless you have a reliable quiet period when nobody is hitting it; even then you might bring it down unexpectedly. Much better to have a staging server where you can run load tests before putting changes into production.
One of our clients uses Techout (www.techout.com) and is very pleased with the service.
There is no charge for alerts, no matter what kind or how many, and they offer email, voicemail and SMS alerts -- and if something major happens, a phone call from a live person to help you out.
It's all based on service -- you don't install the software and you have a consultant who works with you to determine the best approach for your business. It's one of the most convenient web application monitoring services because they take care of everything.
I would just add that you can somewhat predict error likelihood based on the history of past errors and how they were fixed. With smaller-scale internal testing, if you graph the frequency and severity of problems that have been corrected so far, you'll have an overview of the predictable new problems. If everything has been running error-free for some time, then the two likely sources of trouble are recent changes and scalability issues.
From the above, it sounds like scalability is your only worry, but I mention the past-error-frequency test because the teams I've been on invariably think they got the last error fixed and there are no more. Until there is.
Changing the subject a little bit: something I really think is useful, and that changed a lot about how I monitor my apps, is logging JavaScript exceptions somewhere. There's a very nice implementation that logs them directly from users' browsers to Google Analytics.
This is a must for JavaScript-centric web applications, and it can give you results based directly on users' browsers, which can surface very unexpected errors (IE and mobile browsers are a pain).
Disclaimer: my post below.
http://www.directperformance.com.br/en/javascript-debug-simples-com-google-analytics
For the internet presence monitoring, I would suggest the service that I am working on: Sucuri NBIM (Network-based integrity monitor).
It does availability and integrity checks, looking for changes in your internet presence (sites, DNS, WHOIS, headers, etc.) and loss of connectivity. It is free, and you can try it out here.