Which Amazon Web Services instance should I use for a standalone job? - amazon-web-services

I am running an ns3 network simulator experiment (https://www.nsnam.org/) on a c4.4xlarge compute-optimized instance.
My goal is just to borrow a good Ubuntu computer to run a simulation that takes a long time (~2 days). It is standalone and doesn't have to be a server.
However, I noticed that it is quite slow with only one instance. What should I do to make the most of the service in this case?

If the job you wish to run is too slow, then you have three choices:
Have patience, or
Use a more powerful EC2 instance, or
Use more EC2 instances
If the simulator is able to run across multiple VMs in parallel, take a look at Amazon EC2 Spot Instances. They are low-priced instances that are made available when there is spare capacity on EC2, at discounts of up to 90% off the On-Demand price. The downside is that a rise in the Spot Price could cause the instances to be terminated with only two minutes' notice.
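If your simulation can checkpoint or tolerate interruption, a Spot request is easy to script. Here is a minimal boto3 sketch, assuming the us-east-1 region; the AMI is resolved from Amazon's public SSM alias for Amazon Linux 2, so substitute an Ubuntu AMI ID for an ns3 workload:

    import boto3

    REGION = "us-east-1"  # placeholder region
    ec2 = boto3.client("ec2", region_name=REGION)
    ssm = boto3.client("ssm", region_name=REGION)

    # Resolve the latest Amazon Linux 2 AMI via AWS's public SSM parameter.
    ami = ssm.get_parameter(
        Name="/aws/service/ami-amazon-linux-latest/amzn2-ami-hvm-x86_64-gp2"
    )["Parameter"]["Value"]

    # Ask for Spot capacity instead of On-Demand; the instance can be
    # reclaimed with two minutes' notice if the Spot price rises.
    resp = ec2.run_instances(
        ImageId=ami,
        InstanceType="c4.4xlarge",
        MinCount=1,
        MaxCount=1,
        InstanceMarketOptions={
            "MarketType": "spot",
            "SpotOptions": {"SpotInstanceType": "one-time"},
        },
    )
    print(resp["Instances"][0]["InstanceId"])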

Related

AWS Autoscaling Group EC2 instances go down during cron jobs

I tried Auto Scaling groups and, alternatively, just a bunch of EC2 instances tied together by a load balancer. Both configurations work fine at first glance.
But when an EC2 instance is part of an Auto Scaling group, it sometimes goes down. Actually it happens very often, almost once a day. And they go down in a "hard reset" way: the EC2 monitoring graphs show that CPU usage goes up to 100%, then the instance becomes unresponsive, and then it is terminated by the Auto Scaling group.
And it has nothing to do with my processes on these instances.
When the instance is not part of an Auto Scaling group, it can work without the CPU usage spikes for years.
The "hard resets" on Auto Scaling group instances are breaking my cron jobs. As much as I like Auto Scaling groups, I cannot use them.
Is there a standard way to deal with the "hard resets"?
PS.
The cron jobs run PHP scripts on Ubuntu in my case. I set things up so that only one instance runs the job.
It sounds like you have a health check that is failing when your cron job is running, and as a result the instance is being taken out of service.
If you look at the ASG, there should be a reason listed for why the instance was taken out. This will usually be a health check failure, but there could be other reasons as well.
There are a couple things you can do to fix this.
First, determine why your cron job is taking 100% of the CPU and how long it generally takes.
Review your health check settings. Are you using HTTP or TCP? What is the interval, and how many checks have to fail before it is taken out of service?
Between those two items, you should be able to adjust the health checks so that the instance isn't taken out of service while the cron job runs. It is also possible that the instance is genuinely failing; typically this would be because it runs out of memory. If that is the case, you may want to consider moving to a larger instance type and/or enabling swap.
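If the health check in question is on a Classic Load Balancer, the tuning could look something like this boto3 sketch (the load balancer name, check endpoint, and thresholds are hypothetical; the idea is to loosen the interval and unhealthy threshold so a slow cron window doesn't trip them):

    import boto3

    elb = boto3.client("elb", region_name="us-east-1")

    # Loosen the health check so brief 100%-CPU windows during cron runs
    # don't mark the instance unhealthy (all values are illustrative).
    elb.configure_health_check(
        LoadBalancerName="my-load-balancer",  # hypothetical name
        HealthCheck={
            "Target": "HTTP:80/health",  # hypothetical check endpoint
            "Interval": 30,              # seconds between checks
            "Timeout": 10,               # seconds to wait for a response
            "UnhealthyThreshold": 10,    # failures before marking unhealthy
            "HealthyThreshold": 2,       # successes before marking healthy
        },
    )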
I once had a similar issue; in that situation it was the system auto-update running. The system (a Windows server) was downloading a big update and it consumed 100% of the CPU for hours. My suggestion is to monitor which service is running at that moment (even if the OS is Linux), and also check for any scheduled tasks (since it looks like it runs periodically). Other than that, try keeping the task list open during the event and see what is going on.

ECS Service auto scaling and auto scaling group

I've decided to start playing with the AWS ECS service, and created a cluster and one service. My issue is that I want to connect it to an AWS Auto Scaling group.
I followed this guide.
The guide works; my issue is that it's a total waste of money.
The guide says that I need to add a machine when the total amount of CPU units that my services reserve is above 75%, but in reality my services always reserve 100%,
because I don't want to waste money. Also, it's pretty useless to put 3 Node.js tasks on a 2-CPU machine; there is no hard limit anyway.
I have been breaking my head over this for a few days now, and I have no idea how to make them work together properly.
EDIT:
Currently this is what happens:
CPU goes above 75%; Service Auto Scaling creates 2 new tasks on the same server, which means that I now have 1 instance with 4 tasks
Instance reservation is now at 100%, so the Auto Scaling group creates a new instance
Once the new instance is created, Service Auto Scaling removes 2 tasks from the old instance and adds 2 new tasks to the new instance
Is it just me, or does this whole process look like a waste of time? Is this how it really should be, or did I (probably) do something wrong?
I think you are missing a few insights.
For ECS autoscaling to work properly, you also have to set up scaling at the ECS Service level.
Then, the scaling flow would look like this:
The ECS Service reaches 100% used CPU and 100% reserved CPU
The ECS Service scales by starting an additional task, bringing the total reserved CPU to 200%
The Auto Scaling group sees there is more reserved capacity than available capacity and launches a new machine.
In addition, you can perfectly well run multiple Node.js tasks on a 2-CPU machine. Especially in a microservice environment, these Node.js services can be quite small (128 CPU units, for example) and still run perfectly fine together on the same host machine.
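As a rough sketch of the service-level piece with boto3 (the cluster and service names are hypothetical and the 75% target is just an example):

    import boto3

    aas = boto3.client("application-autoscaling", region_name="us-east-1")
    SERVICE = "service/my-cluster/my-service"  # hypothetical cluster/service

    # Register the service's desired task count as a scalable dimension.
    aas.register_scalable_target(
        ServiceNamespace="ecs",
        ResourceId=SERVICE,
        ScalableDimension="ecs:service:DesiredCount",
        MinCapacity=1,
        MaxCapacity=10,
    )

    # Target tracking: add/remove tasks to keep average service CPU near 75%.
    aas.put_scaling_policy(
        PolicyName="cpu-target-tracking",
        ServiceNamespace="ecs",
        ResourceId=SERVICE,
        ScalableDimension="ecs:service:DesiredCount",
        PolicyType="TargetTrackingScaling",
        TargetTrackingScalingPolicyConfiguration={
            "TargetValue": 75.0,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
            },
        },
    )

With task-level scaling in place, the Auto Scaling group only has to react to the cluster's reserved capacity, which is the flow described above.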
Eventually, I figured out that what I want to do is not possible.
It is not possible to create a resource-optimized cluster with ECS the way you can in Kubernetes
(unless, of course, you write some magic with Lambda).
Service auto scaling and Auto Scaling groups don't work together. You can, however, make it work perfectly with Fargate, but it's expensive; the main issue is that you don't have a trigger for cluster reservation above 100%.

finding best deployment locations in aws regions

Given that we are on the AWS platform, we need to subscribe to different sources of data located around the world. How can we efficiently determine which region has the lowest latency to some target IP (not our browser)?
There is a service called cloudping which pings from your current browser to AWS regions, but that is not useful here for obvious reasons.
Is there any tool similar to cloudping that would let us specify which IP we want to ping?
And a secondary question: I suppose it is possible to spawn instances using the AWS API; does Amazon charge significant fees if I have a script that spawns a compute instance, does some short work, terminates it, and does this in every single region?
Worst case, we could spawn instances in all regions for a short amount of time and ping all the destinations we are interested in, but that would be a lot of work for something rather simple... My assumption is that even within one region you might end up with some instances having significantly better latency than others; a script could spawn instances until the best one is found and terminate the others...
UPDATE
It seems it is rather easy to spawn instances and execute commands on them; it shouldn't be hard to terminate them either. Here is a good tool for this. Now the question is: will AWS punish me with bills, and isn't there already a solution for this?
You can certainly launch and terminate Amazon EC2 instances in any region you wish. Amazon will not "punish" you -- the system will simply charge the normal cost for the resources you use.
If you launch an Amazon EC2 instance with the Amazon Linux AMI, the instance will be charged per second, so the cost will be very low. For example, you could use a t2.micro instance for a few cents per hour (charged per second).
You could then run your own timing test from each region. However, you could probably predict the best performance simply based upon the location of the region (US East, US West, Frankfurt, Sydney, etc).
Also, please note that ping is not a reliable measure of how your actual application will perform. To obtain the best measure, you should run an application in each region that connects to the 'source of data' you are trying to use, and measure performance as it would be experienced by your actual application. You might find that the remote service has higher latency than the network, meaning that location would have only a minor impact on performance.
If you use somebody else's timing or somebody else's tool, it will not be as accurate as measuring your actual application doing "real" work.
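That said, if you do want the launch-test-terminate loop, a bare-bones boto3 sketch might look like this (the target IP is a placeholder, ping is used for brevity despite the caveat above, and in a real run you would upload the result somewhere such as S3 before the instance shuts down):

    import boto3

    TARGET_IP = "203.0.113.10"  # placeholder for your data source
    AMI_PARAM = "/aws/service/ami-amazon-linux-latest/amzn2-ami-hvm-x86_64-gp2"

    ec2_us = boto3.client("ec2", region_name="us-east-1")
    regions = [r["RegionName"] for r in ec2_us.describe_regions()["Regions"]]

    for region in regions:
        ec2 = boto3.client("ec2", region_name=region)
        ssm = boto3.client("ssm", region_name=region)
        ami = ssm.get_parameter(Name=AMI_PARAM)["Parameter"]["Value"]

        # Run the timing test at boot, then shut down; with the shutdown
        # behavior below, shutting down terminates (and stops billing).
        user_data = (
            "#!/bin/bash\n"
            f"ping -c 20 {TARGET_IP} > /tmp/latency.txt\n"
            "# upload /tmp/latency.txt to S3 here before shutting down\n"
            "shutdown -h now\n"
        )
        resp = ec2.run_instances(
            ImageId=ami,
            InstanceType="t2.micro",
            MinCount=1,
            MaxCount=1,
            UserData=user_data,
            InstanceInitiatedShutdownBehavior="terminate",
        )
        print(region, resp["Instances"][0]["InstanceId"])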

Is it a bad idea to create a large amount of AWS EC2 instances for automated integration tests with Vagrant?

I've created an automated integration test that uses Vagrant to create an environment with a MySQL database, runs my application under test, asserts against the database, and finally destroys the Vagrant box. I want it to run automatically when somebody commits code, so I want to create a Jenkins job. The problem is we don't have a dedicated machine for running Vagrant. One of the options I was offered is to use AWS. So I finally got "vagrant up --provider=aws" to work. It created an instance, but I started to think that we'll have a large number of integration tests constantly creating and destroying EC2 instances. Question: is it a bad idea to constantly create and destroy EC2 instances?
The downside of this approach is, obviously, the cost. In AWS EC2 you pay for each hour that your instance is running, and whenever you launch an instance you are charged for it even if you terminate it immediately (so if you launch an instance 100 times and stop it after each test, you are still paying for 100 hours). You do still get some free hours in the free tier (I suspect it is 750 per month, so it really depends on how many tests you run and how frequently you run them).
The prices differ for each region and platform, and you can save a lot by purchasing reservations: http://aws.amazon.com/ec2/pricing/
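To make the hour-rounding point concrete, here is a quick back-of-the-envelope comparison (the hourly rate is a made-up placeholder; check current EC2 pricing, and note that Linux instances are now billed per second, as mentioned in the answer above):

    # Back-of-the-envelope: 100 test runs per day, ~5 minutes each.
    HOURLY_RATE = 0.02  # USD; placeholder, check current EC2 pricing
    launches_per_day = 100
    test_minutes = 5

    # Per-hour billing rounds every launch up to a full hour.
    print(f"hourly billing:     ${launches_per_day * HOURLY_RATE:.2f}/day")

    # Per-second billing (Linux AMIs) charges only the actual runtime.
    cost = launches_per_day * (test_minutes / 60) * HOURLY_RATE
    print(f"per-second billing: ${cost:.2f}/day")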
Try taking a look at this SO question - you may find what you need there:
How to combine Vagrant with Jenkins for the perfect Continuous Integration Environment?

Is it possible to auto scale with amazon web services, with ever changing AMI's?

Curious if this is possible:
We have a web application that at MOST times works just fine with our single small instance. However, when multiple customers run intense queries simultaneously (we are a cloud scheduling service), our instance bogs down to near 80% CPU load and becomes pretty unresponsive.
Is there a way to have AWS fire up another small instance (or a few), quickly, only for the times when it is operating under this intense load? BUT, the real question is how this works when we push very frequent programming updates to our application. Do we have to manually create a new image every time we upload a code change?
Thanks
You should never be running anything important on a single EC2 instance. Instances can--and do--go offline randomly. Always use an Auto Scaling (AS) group that spans multiple Availability Zones. An AS group will automatically bring new instances online when you hit a certain trigger (in your case, CPU utilization), and then scale the instances back down when traffic subsides. Autoscaling is the heart and soul of AWS, and if you're not using it, you might as well be using a cheaper (and more durable) VPS host.
No, you don't want to be creating a new AMI for each code release. Ideally you should use a base AMI (like one of Amazon's official ones) and have it provision itself at boot. You can use the "user data" field when you launch an instance to bootstrap this process. It can be anything from a simple bash script that pulls from your Git repo to something as sophisticated as Puppet or Chef.
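A minimal sketch of that bootstrap pattern with boto3 (the repo URL, start script, and instance type are hypothetical stand-ins; in a real Auto Scaling setup the same user data would go into the launch configuration or launch template):

    import boto3

    REGION = "us-east-1"  # placeholder
    ec2 = boto3.client("ec2", region_name=REGION)
    ssm = boto3.client("ssm", region_name=REGION)

    # Base AMI: latest Amazon Linux 2, resolved via AWS's public SSM alias.
    ami = ssm.get_parameter(
        Name="/aws/service/ami-amazon-linux-latest/amzn2-ami-hvm-x86_64-gp2"
    )["Parameter"]["Value"]

    # Runs once at first boot: pull the latest code instead of baking it
    # into a custom AMI (repo URL and start script are hypothetical).
    user_data = """#!/bin/bash
    yum install -y git
    git clone https://github.com/example/app.git /opt/app
    /opt/app/start.sh
    """

    ec2.run_instances(
        ImageId=ami,
        InstanceType="t2.small",
        MinCount=1,
        MaxCount=1,
        UserData=user_data,  # boto3 base64-encodes this for you
    )

Keeping the AMI generic this way means a code change only requires replacing instances, not rebuilding images.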
The only time I create custom AMIs is if the provisioning process just takes too long. However, that can almost always be solved by storing the needed files in S3.