I am new to AWS and recently set up a free t3.micro instance. My goal is stable hosting of an Angular application with two Spring Boot services. I got everything working, but after a while the Spring Boot services are no longer reachable. When I redeploy a service, it runs again. The Spring Boot services are packaged as JARs, and after deployment each one is started as a plain Java process.
I thought AWS guarantees permanent availability out of the box. Do I need some more setup, such as autoscaling, to achieve the desired uptime of the services, or is the t3.micro instance not sufficiently performant, so that I need to upgrade to a stronger instance to avoid the problem?
It depends :)
I think you did the right thing by starting with a small instance type and avoiding over-provisioning in the first place. T3 instance types are generally beneficial for 'burst' usage scenarios, i.e. your application sporadically needs a compute spike but not a persistent one. T3 instance types work with a credit-based system, where your instance 'earns' credits while it is idle, and that buffer is available in times of need (but only until it is consumed entirely). After that you need to wait for some time and earn the credits back.
For your current problem, I think a first approach is to get an idea of the current usage by going through the 'Monitoring' tab on the EC2 instance details page. This will help you understand whether your needs are more compute related or I/O related, and then you can choose an appropriate instance type from:
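To make the credit mechanism more concrete, here is a minimal sketch that simulates a credit balance hour by hour. The default numbers (12 credits earned per hour, a cap of 288 credits, 2 vCPUs) are my assumptions for a t3.micro in standard mode, so check them against the current AWS documentation. The function name and the usage pattern are made up for illustration.

```python
def simulate_credit_balance(hourly_cpu_percent, earn_per_hour=12.0,
                            max_balance=288.0, vcpus=2, start_balance=0.0):
    """Return the credit balance after each hour (standard, non-unlimited mode).

    hourly_cpu_percent: average CPU utilization (0-100) for each hour.
    The defaults are assumed t3.micro values, not authoritative figures.
    """
    balance = start_balance
    history = []
    for cpu_percent in hourly_cpu_percent:
        # One credit = one vCPU running at 100% for one minute.
        spent = cpu_percent * vcpus * 60 / 100
        balance = balance + earn_per_hour - spent
        # The balance cannot drop below zero (the instance is throttled
        # instead) and cannot grow beyond the accrual cap.
        balance = max(0.0, min(max_balance, balance))
        history.append(balance)
    return history


if __name__ == "__main__":
    # Hypothetical day: 20 mostly idle hours, then a 4-hour spike at 60% CPU.
    usage = [2] * 20 + [60] * 4
    for hour, balance in enumerate(simulate_credit_balance(usage), start=1):
        print(f"hour {hour:2d}: {balance:6.1f} credits")
```

With these assumed numbers, a sustained spike drains the buffer within a few hours, which would match the symptom of services becoming unresponsive "after a while".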
https://aws.amazon.com/ec2/instance-types
A next step could be to profile your application to better understand its memory and compute utilisation. AWS does guarantee availability/durability of resources, but how you consume those resources is an application concern, which AWS neither guarantees nor controls.
As for your ideas around autoscaling and availability, it again depends on what your needs are in terms of cost, tolerance for outages in AWS data centres, etc. For a reliable production setup you could consider them, but they are not really important at this stage.
Related
I have a web service running on several EC2 boxes. Based on the Cloudwatch latency metric, I'd like to scale up additional boxes. But, given that it takes several minutes to spin up an EC2 from an AMI (with startup code to download the latest application JAR and apply OS patches), is there a way to have a "cold" server that could instantly be turned on/off?
Not by using AutoScaling. At least not instantly in the way you describe. You could make it much faster, however, by building your own modified AMI that already contains the JAR and the latest OS patches. These AMIs can be generated as part of your build pipeline. In that case, your only real wait time is for the OS and services to start, similar to a "cold" server.
Packer is a tool commonly used for such use cases.
Alternatively, you can manage it yourself by keeping servers switched off and starting them with custom Lambda scripts that get triggered by CloudWatch alarms. But since stopped servers aren't exactly free either (you still pay for their EBS storage), I would recommend against that for cost reasons.
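For completeness, the do-it-yourself variant could look roughly like the sketch below. It assumes a boto3 EC2 client (`start_instances` is a real boto3 call), while the instance ID, the function names and the way the handler is wired to an alarm are hypothetical.

```python
def start_standby_instances(ec2_client, instance_ids):
    """Start stopped instances and return the IDs EC2 reports as starting.

    ec2_client is expected to behave like boto3.client("ec2").
    """
    if not instance_ids:
        return []
    response = ec2_client.start_instances(InstanceIds=list(instance_ids))
    return [item["InstanceId"] for item in response.get("StartingInstances", [])]


def lambda_handler(event, context):
    # boto3 is imported here so the helper above can be tested without it.
    import boto3

    # Hypothetical ID of a pre-configured, stopped "cold" server.
    standby_ids = ["i-0123456789abcdef0"]
    started = start_standby_instances(boto3.client("ec2"), standby_ids)
    return {"started": started}
```

The handler would still need an alarm (for example on latency) routed to it, and the started instances still have to boot and register with the load balancer, so this is faster than a fresh launch but not instant.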
Before you venture into auto scaling your infrastructure and spend time and effort on it, perhaps you should do a little analysis of the traffic pattern day over day, week over week and month over month, and see if it's even necessary. Try answering some of these questions.
What was the highest traffic your app ever handled? How did the servers fare under that traffic? How was the user response time?
When does your traffic ramp up or hit peak? Some apps get traffic during business hours while others in the evening.
What is your current throughput? For example, say you handle 1k requests/min with two EC2 hosts averaging 20% CPU. If requests triple to 3k requests/min, do you see around 60% - 70% average CPU? If so, that is a good indication that your app's usage is fairly predictable and can scale linearly by adding more hosts. But if you've never seen a traffic burst like that, there is no point in over-provisioning.
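The back-of-the-envelope reasoning in the last question can be written down as a small sketch. It assumes CPU scales linearly with request rate, which is exactly the assumption you would want to verify with real measurements. The function names and the 70% ceiling are mine.

```python
import math


def projected_cpu(current_rpm, current_cpu_percent, target_rpm):
    """Per-host CPU at target_rpm, assuming CPU grows linearly with traffic."""
    return current_cpu_percent * target_rpm / current_rpm


def hosts_needed(current_hosts, current_rpm, current_cpu_percent,
                 target_rpm, max_cpu_percent=70):
    """Hosts required to keep average CPU at or below max_cpu_percent."""
    per_host = projected_cpu(current_rpm, current_cpu_percent, target_rpm)
    total_load = current_hosts * per_host  # in "host-percent"
    return max(1, math.ceil(total_load / max_cpu_percent))


if __name__ == "__main__":
    # Two hosts at 20% CPU while serving 1k requests/min.
    print(projected_cpu(1000, 20, 3000))     # projected per-host CPU at 3k/min
    print(hosts_needed(2, 1000, 20, 3000))   # hosts needed at 3k/min
    print(hosts_needed(2, 1000, 20, 10000))  # hosts needed at 10k/min
```

If the measured CPU at 3k requests/min is far from the projection, the app does not scale linearly and adding hosts may not help as much as expected.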
Unless you have a Zynga-like application that can see a large amount of traffic all at once, better understanding your traffic pattern and throwing in an additional host as insurance could be enough. I'm making these assumptions as I don't know the nature of your business.
If you do want to auto scale anyway, one solution would be to containerize your application with Docker or create your own AMI, as others have suggested. Still, it will take a few minutes to boot them up. The next option is to keep hosts on standby and add them to your load balancers using scripts (or Lambda functions) that watch metrics you define (I'm assuming your app is running behind load balancers).
Good luck.
I was looking for a way to autoscale a mesos or DCOS EC2 cluster dynamically. An example scenario would be if cluster CPU usage is above x % for x minutes spin up new instances, if memory is above X % for x minutes, spin up new instances.
Ideally the instance type should be dynamically determined by the type and amount of resources needed. I saw this project:
https://github.com/thefactory/autoscale-python
which I suppose can be run as a mesos marathon task itself to handle that, but I was wondering if there is a built in utility in mesos or a generic way to do that on EC2 or GCE. Thanks!
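To pin down the rule I have in mind ("above x % for x minutes"), here is a minimal sketch of the decision logic only. It assumes one utilization sample per minute collected elsewhere. The function names and thresholds are hypothetical, and nothing here talks to Mesos or EC2.

```python
def breached_for(samples, threshold_percent, minutes):
    """True if the last `minutes` one-minute samples all exceed the threshold."""
    if minutes <= 0 or len(samples) < minutes:
        return False
    return all(sample > threshold_percent for sample in samples[-minutes:])


def should_scale_up(cpu_samples, mem_samples,
                    cpu_threshold=80, mem_threshold=80, minutes=5):
    """Scale up when either CPU or memory stays above its threshold."""
    return (breached_for(cpu_samples, cpu_threshold, minutes)
            or breached_for(mem_samples, mem_threshold, minutes))


if __name__ == "__main__":
    cpu = [40, 85, 90, 88, 92, 95]  # cluster CPU %, one sample per minute
    mem = [50, 55, 52, 60, 58, 61]  # cluster memory %
    print(should_scale_up(cpu, mem))
```

Requiring every sample in the window to exceed the threshold avoids reacting to a single short spike.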
As recently as last week, I had a talk with Mesosphere people, and can confirm that this functionality is not yet built-in to DCOS. I didn't get a feeling that this is something on their cards anytime soon as they kept referring me to Application autoscaling instead of Infra autoscaling (which is what we want here).
The Python script from thefactory (Tendril) no longer seems to be maintained, as evidenced by a PR that has been hanging there for 12 months now.
I'm currently looking at Netflix Fenzo, which also seems to have been written to enable autoscaling of the infrastructure, among other things (http://techblog.netflix.com/2015/08/fenzo-oss-scheduler-for-apache-mesos.html).
I will try to post back once I get a better idea of Fenzo and how it integrates with DC/OS.
Curious if this is possible:
We have a web application that MOST of the time works just fine on our single small instance. However, when multiple customers run intense queries simultaneously (we are a cloud scheduling service), our instance bogs down to nearly 80% CPU load and becomes pretty unresponsive.
Is there a way to have AWS fire up another small instance (or a few) quickly, only for the times when it is operating under this intense load? BUT, the real question is how this works when we have very frequent programming updates to our application. Do we have to manually create a new image every time we upload a code change?
Thanks
You should never be running anything important on a single EC2 instance. Instances can--and do--go offline randomly. Always use an autoscaling (AS) group that spans multiple availability zones. An AS group will automatically bring new instances online when you hit a certain trigger (in your case, CPU utilization). And then it will scale down the instances when traffic subsides. Autoscaling is the heart and soul of AWS and if you're not using it, you might as well be using a cheaper (and more durable) VPS host.
No, you don't want to be creating a new AMI for each code release. Ideally you should use a base AMI (like one of Amazon's official ones) and then have it auto-provision at boot. You can use the "user data" field when you launch an instance from an AMI to bootstrap this process. It can be anything from a simple bash script that pulls from your Git repo to something as sophisticated as Puppet or Chef.
The only time I create custom AMIs is when the provisioning process just takes too long. However, that can almost always be solved by storing the needed files in S3.
I have had an EC2 instance working just fine for months (still developing, app not live yet), but I just realized I don't even know how to make my EC2 instance scale up / down depending on traffic.
The sheer number of services offered by Amazon is overwhelming, and I'm very confused.
Initially, I thought I'd just have one instance, and Amazon would transparently allocate resources or create identical instances to handle traffic, but it seems my impression was wrong.
My question is: can someone please tell me (in simple words, as a bullet list, or by pointing me to a tutorial) how to make my instance automatically grow to handle 100,000 simultaneous users and then automatically shrink back when the surge is over?
Assuming this is possible, can I do this via the AWS control panel? If so, how?
All I can see is micro, small, medium, etc. instances. Each one has limited resources, and it's not clear how to set up the instance so that Amazon dynamically allocates additional resources to handle traffic spikes (or even gradually scales up to keep pace with natural traffic growth, for that matter).
Side question: may I assume that Amazon automatically handles DDoS attacks when scaling up? (Meaning rogue traffic would eventually be stopped or slowed down by Amazon, and scaling would only respond to legitimate traffic spikes.) I realize this side question may be really stupid; keep in mind I haven't had my coffee yet :)
This article details how to auto scale using load balancers and EC2: http://kkpradeeban.blogspot.com/2011/01/auto-scaling-with-amazon-ec2.html
For scalability you may also want to look into this article on implementing a pub/sub system for distributed systems: http://www.infoq.com/articles/AmazonPubSub
You can't automatically change the instance type (m1.small, m1.large, etc.) in response to changing load. You can, however, have AWS automatically create new instances as your load increases, and tear them down when load subsides.
I believe this article will help you: http://aws.amazon.com/autoscaling/.
I do have a free micro instance on AWS and quite often my CPU is throttled making it very hard to use.
I do want to know if there is any way to test a bigger instance so I see which one would be ok.
Side questions:
Can I go back to the free micro if I want?
Can I limit the cost of the testing, or get an estimate of it? I don't want to end up with a surprise bill as a result of the testing.
You can of course launch a new instance of a larger size, run your tests, then terminate the instance. It will not affect your running micro instance in any way at all.
AWS publishes their pricing data, so you can either calculate the cost manually or use the cost calculator: http://calculator.s3.amazonaws.com/calc5.html
There is no way to "cap" your AWS spend.
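The manual calculation is simple enough to sketch. The hourly rate below is a made-up placeholder rather than a real AWS price, so look up the actual on-demand rate for the instance type and region you want to test. Storage and data transfer are not included.

```python
import math


def estimate_test_cost(hourly_rate_usd, hours, instance_count=1):
    """Compute cost only: rate x hours x number of instances."""
    return round(hourly_rate_usd * hours * instance_count, 2)


def max_hours_for_budget(budget_usd, hourly_rate_usd, instance_count=1):
    """Whole hours of testing that fit into a self-imposed budget."""
    return math.floor(budget_usd / (hourly_rate_usd * instance_count))


if __name__ == "__main__":
    rate = 0.25  # hypothetical USD per hour, NOT a real AWS price
    print(estimate_test_cost(rate, hours=8))
    print(max_hours_for_budget(5, rate))
```

Since there is no hard cap, the practical safeguard is to terminate the test instance as soon as you are done.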
Mike Ryan's answer is correct as such, but there might be a better way to achieve your goal, because it is possible to upgrade your Amazon EC2 t1.micro instance in place. This process (and the few constraints) are summarized in Eric Hammond's article Moving an EC2 Instance to a Larger (or Smaller) Instance Type:
When you discover that the entry level t1.micro instance size is
simply not cutting it for your growing application needs, you may want
to try upgrading it to a larger instance type, perhaps an m1.small or
even a c1.medium.
Instead of starting a new instance and having to configure it from
scratch, you may be able to simply resize the existing instance by
asking Amazon move it to better hardware for you. Of course, since
this is AWS, you don’t have to actually talk to anybody—just type a
few commands and the job is done automatically.
Eric describes how to achieve this via the command line, but the same can be done via the AWS Management Console if you prefer: the instance menu features a corresponding command, Change Instance Type (only enabled when the instance is stopped).
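If you would rather script the resize, the stop / change type / start sequence might look like the following sketch. It assumes an EBS-backed instance and a boto3 EC2 client; the four client calls used are real boto3 methods, while the function name and the instance ID in the usage comment are hypothetical.

```python
def resize_instance(ec2_client, instance_id, new_type):
    """Stop an EBS-backed instance, change its type, and start it again.

    ec2_client is expected to behave like boto3.client("ec2").
    """
    # The instance type can only be changed while the instance is stopped.
    ec2_client.stop_instances(InstanceIds=[instance_id])
    ec2_client.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    ec2_client.modify_instance_attribute(
        InstanceId=instance_id,
        InstanceType={"Value": new_type},
    )
    ec2_client.start_instances(InstanceIds=[instance_id])


# Usage (requires boto3 and AWS credentials; the ID is a placeholder):
#   import boto3
#   resize_instance(boto3.client("ec2"), "i-0123456789abcdef0", "m1.small")
```

Going back to the micro instance is the same call with the original type. Note that the public IP address may change across the stop/start unless an Elastic IP is attached.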
Alternatively you might also want to get acquainted with the ease of duplicating an EBS-Backed EC2 instance by means of an Amazon Machine Image (AMI), which allows you to start any number of exact duplicates of your current instance - this process is outlined in Creating Amazon EBS-Backed AMIs Using the Console for example.