EC2 spot pricing graph stopped working - amazon-web-services

A long time ago there was the most useful spot price comparison graph I have ever used, but it stopped working, as far as I know because the creator ran out of time to maintain it. The website, ec2price.com, is still up and the code is on GitHub. Does anyone know if anyone has replicated it, or is there any way to do it myself? As I said, it was really useful for deciding which spot instance to choose.

You can see this information in the EC2 console by browsing to Spot Requests and selecting the Pricing History button.
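If you want to pull the same data programmatically, for example to rebuild a graph like ec2price.com yourself, the DescribeSpotPriceHistory API exposes the raw price series. A minimal boto3 sketch, where the region, instance types and time window are just examples:

    import datetime
    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")  # example region

    # Page through the last 7 days of Linux spot prices for two example types.
    paginator = ec2.get_paginator("describe_spot_price_history")
    pages = paginator.paginate(
        InstanceTypes=["c4.large", "c4.xlarge"],
        ProductDescriptions=["Linux/UNIX"],
        StartTime=datetime.datetime.utcnow() - datetime.timedelta(days=7),
        EndTime=datetime.datetime.utcnow(),
    )

    for page in pages:
        for point in page["SpotPriceHistory"]:
            print(point["Timestamp"], point["AvailabilityZone"],
                  point["InstanceType"], point["SpotPrice"])

Feed those points into any plotting library and you have the comparison graph back.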
If you want to select the cheapest EC2 instance type automatically, you can create a spot fleet request: select all the instance types you might want to use and an allocation strategy of lowestPrice. Deploy this to a VPC with a subnet in every availability zone in your region to get the lowest price possible.

Besides the code, somebody has to pay to maintain a server that keeps polling the pricing information.
Check out How Spot Fleet Works. Spot Fleet is a better approach than price monitoring: you can make a request based on pricing for a whole fleet of instance types rather than limiting yourself to one specific type, and you can launch instances from that fleet based on a maximum instance price or a maximum per-vCPU price.
If your batch application is spot-ready, then after you submit your bid for a fleet of different instance types and set the maximum per-vCPU price, Spot Fleet will automatically launch the available instances with the best price. So you don't need to compete for a limited pool of popular instances (for example, c4.* spot capacity is scarce in most regions).
This is a win-win for both AWS and the customer, as AWS can spread usage onto under-utilized instance types. IMHO, there is no point in continually raising your bid for a particular type once that capacity is exhausted, while there are still many idle instances of alternative types and zones up for grabs.
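Both answers above point to Spot Fleet. As a rough boto3 sketch of what such a request can look like, using the lowestPrice allocation strategy and WeightedCapacity to stand in for the per-vCPU price idea (the IAM role ARN, AMI, subnets and prices are placeholders):

    import boto3

    ec2 = boto3.client("ec2")

    response = ec2.request_spot_fleet(
        SpotFleetRequestConfig={
            "IamFleetRole": "arn:aws:iam::123456789012:role/my-fleet-role",  # placeholder
            "AllocationStrategy": "lowestPrice",
            "TargetCapacity": 8,       # e.g. 8 vCPUs worth of capacity
            "SpotPrice": "0.10",       # example max price per weighted unit
            "LaunchSpecifications": [
                {
                    "ImageId": "ami-12345678",       # placeholder AMI
                    "InstanceType": "c4.large",
                    "WeightedCapacity": 2.0,         # 2 vCPUs
                    "SubnetId": "subnet-aaaa1111",   # placeholder subnet
                },
                {
                    "ImageId": "ami-12345678",
                    "InstanceType": "c4.xlarge",
                    "WeightedCapacity": 4.0,         # 4 vCPUs
                    "SubnetId": "subnet-bbbb2222",
                },
            ],
        }
    )
    print(response["SpotFleetRequestId"])

The fleet then fills the 8 units of target capacity from whichever of the listed types is cheapest at the time.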

Related

How to use existing On-Demand instance in spot fleet?

I'm trying to reduce my expenses and want to start using AWS's spot pricing service. I'm completely new to it, but as I understand it, I can have instances running for certain amounts of time depending on the price, and they will eventually stop running under certain conditions. That's fine. I'm also aware you can have spot fleets, and in these fleets you can have an On-Demand instance for when the spot instance is interrupted.
I currently have an On-Demand instance that hosts an Elastic Beanstalk application (it's an API). Is there a way to use this instance inside the spot fleet, so that when there's an available spot instance it services my Elastic Beanstalk application, and when the spot instance is interrupted it just goes back to using my current On-Demand instance until another spot instance is available?
Sadly, spot fleets don't work like this. If your spot instance gets terminated, no on-demand replacement is going to be created for you automatically. If it worked like this, everyone would be using spot instances in my view.
The on-demand portion of your spot fleet is separate from the spot portion. This way your application will always run at a minimum capacity (without spot). When spot capacity is available, your spot instances will run alongside your on-demand instances, giving you more computational power for cheap, which is very beneficial for any heavy processing application (e.g. batch image processing).
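To illustrate that split, the Spot Fleet request config lets you declare an on-demand baseline next to the spot capacity. A hedged fragment with example values (OnDemandTargetCapacity assumes a reasonably recent API version):

    # Fragment of a boto3 Spot Fleet request config; values are placeholders.
    spot_fleet_config = {
        "IamFleetRole": "arn:aws:iam::123456789012:role/my-fleet-role",
        "TargetCapacity": 10,          # total units you want running
        "OnDemandTargetCapacity": 2,   # baseline that is always on-demand
        "AllocationStrategy": "lowestPrice",
        # ... plus LaunchSpecifications or LaunchTemplateConfigs ...
    }

The two on-demand units stay up regardless of the spot market; the remaining eight are filled with spot capacity when it is available.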
Details of how spot fleet and spot instances work are in How Spot Fleet works and How Spot Instances work docs.
Nevertheless, if you would like to have such a replacement provisioned, you would have to develop a custom solution for that.
There's a third-party solution called Spot.io that not only replaces the spot instance with an on-demand instance in a scenario like the one you describe, but also has an algorithm that anticipates the interruption event and stands up an on-demand instance, having it ready before the interruption occurs.

Reduce price on AWS (EC2 and spot instances)

I have a queue of jobs and running AWS EC2 instances which process the jobs. We have an Auto Scaling group for each c4.* instance type, in both spot and on-demand versions.
Each instance has a power value equal to its number of CPUs (for example, c4.large has power=2 since it has 2 CPUs).
The exact power we need is simply calculated from the number of jobs in the queue.
I would like to implement an algorithm which would periodically check the number of jobs in the queue and change the desired capacity of the particular Auto Scaling groups via the AWS SDK, to save as much money as possible while maintaining enough total power to keep the jobs processed.
Specifically:
I prefer spot instances to on-demand since they are cheaper
EC2 instances are charged per hour, so we would like to turn off an instance only in the very last minutes of its one-hour uptime.
We would like to replace on-demand instances with spot instances when possible. So, at minute 55 increase the spot group, at minute 58 check that the new spot instance is running, and if so, decrease the on-demand group.
We would like to replace spot instances with on-demand ones if the bid gets too high: just turn off the spot one and turn on an on-demand one.
It seems the problem is really difficult to handle. Does anybody have experience with this, or a similar solution implemented?
You could certainly write your own code to do this, effectively telling your Auto Scaling groups when to add/remove instances.
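As a sketch of that approach, the desired capacity of each group can be adjusted through the Auto Scaling API; the group names and the power calculation below are placeholders:

    import boto3

    autoscaling = boto3.client("autoscaling")

    def set_group_size(group_name, desired_instances):
        """Set the desired capacity of one Auto Scaling group."""
        autoscaling.set_desired_capacity(
            AutoScalingGroupName=group_name,
            DesiredCapacity=desired_instances,
            HonorCooldown=False,
        )

    # Example: the queue currently needs 16 "power" (CPUs); prefer the spot
    # group and leave the on-demand group empty.
    needed_power = 16                    # would come from your queue length
    spot_instances = needed_power // 2   # c4.large has 2 CPUs
    set_group_size("c4-large-spot", spot_instances)    # placeholder group name
    set_group_size("c4-large-ondemand", 0)             # placeholder group name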
Also, please note that a good strategy for lowering costs with Spot Instances is to appreciate that the price for a spot instance varies by:
Region
Availability Zone
Instance Type
So, if the spot price for a c4.xlarge goes up in one AZ, it might still be the same cost in another AZ. Also, the price of a c4.2xlarge might then be lower than a c4.xlarge, with twice the power.
Therefore, you should aim to diversify your spot instances across multiple AZs and multiple instance types. This means that spot price changes will impact only a small portion of your fleet rather than all of it at once.
You could use Spot Fleet to assist with this, or even third-party products such as SpotInst.
It's also worth looking at AWS Batch (not currently available in every region), which is designed to intelligently provide capacity for batch jobs.
Autoscaling groups allow you to use alarms and metrics that are defined outside of the autoscaling group.
If you are using SQS, you should be able to set up an alarm on your SQS queue (for example on its ApproximateNumberOfMessagesVisible metric) and use that to scale your scaling group up and down.
If you are using a custom queue system, you can push metrics to cloudwatch to create a similar alarm.
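For the custom queue case, publishing a backlog metric that an alarm can watch could look roughly like this (the namespace, metric name and value are made up):

    import boto3

    cloudwatch = boto3.client("cloudwatch")

    # Publish the current queue depth so a CloudWatch alarm can scale the group.
    cloudwatch.put_metric_data(
        Namespace="MyJobQueue",               # made-up namespace
        MetricData=[
            {
                "MetricName": "JobsWaiting",  # made-up metric name
                "Value": 42,                  # current number of queued jobs
                "Unit": "Count",
            }
        ],
    )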
You can control how often scaling actions occur, but it may be difficult to get the run time to exactly one hour.

Request EC2 spot block with automated bidding using Boto3

Correct me if I'm wrong, but there seem to be some inconsistencies between creating spot block requests using the EC2 console and the AWS SDK (namely Boto3). When requesting a spot block using the AWS Management Console, the only pricing option is "Use automated bidding".
However, when doing the same via Boto3, the SpotPrice parameter is marked as required, with no indication that it might represent, say, a percentage of the on-demand price.
Is there any option to use automated bidding programmatically without hard-coding on-demand prices in the requests?
The console is merely trying to present a simplified process. I think it is simply setting SpotPrice to the on-demand price. That's a much cleaner interface than asking for a different price per selected Instance Type.
You always only pay the current Spot Price. Bidding is always automatic up to your Spot price, which represents the maximum you are willing to pay.
If you want to make an equivalent bid without hard-coding the On-Demand price, you could use the AWS Price List API, which is really just some downloadable JSON/CSV files with pricing information. Pricing doesn't change very often, so you could cache that information and refresh it occasionally.
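As a hedged sketch of looking up an On-Demand price through the Price List API with boto3 (the pricing endpoint only exists in a few regions, e.g. us-east-1, the filter values are examples, and the returned JSON is deeply nested, so the parsing below is simplified):

    import json
    import boto3

    pricing = boto3.client("pricing", region_name="us-east-1")  # pricing endpoint region

    response = pricing.get_products(
        ServiceCode="AmazonEC2",
        Filters=[
            {"Type": "TERM_MATCH", "Field": "instanceType", "Value": "c4.large"},
            {"Type": "TERM_MATCH", "Field": "location", "Value": "US East (N. Virginia)"},
            {"Type": "TERM_MATCH", "Field": "operatingSystem", "Value": "Linux"},
            {"Type": "TERM_MATCH", "Field": "tenancy", "Value": "Shared"},
            {"Type": "TERM_MATCH", "Field": "preInstalledSw", "Value": "NA"},
        ],
        MaxResults=1,
    )

    # Each PriceList entry is a JSON string describing one product and its terms.
    product = json.loads(response["PriceList"][0])
    on_demand_term = next(iter(product["terms"]["OnDemand"].values()))
    dimension = next(iter(on_demand_term["priceDimensions"].values()))
    print(dimension["pricePerUnit"]["USD"])  # hourly On-Demand price as a string

You could then pass that value (or a fraction of it) as SpotPrice in your request.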
That's because you chose to reserve for a duration (a spot block); automated bidding is the only way you can do this.
IMHO, you should consider your spot requirements before jumping into reserving for a duration. If your application is spot-instance ready, then you should specify a fleet of instance types with your desired maximum price. Because AWS Spot always has under-utilized spare instances, in practice this minimizes interruptions without even needing the reservation.
Perhaps due to c4.* pricing, and because many people have moved from c3.* to c4.*, c3.* spot pricing seems to be at an all-time low (e.g. us-east-1 shows a price below $0.02).

EC2 instance billing type programmatically

I have a reservation for the t2.micro instance type in the us-east-1d Availability Zone, and I also have multiple instances running, including one with that type and zone.
As a result, I expect the billing for this one instance to take the reservation into account. The problem is that I haven't found any link between the reservation and the actual instances.
I've used aws ec2 describe-instances and aws ec2 describe-reserved-instances CLI commands but I wasn't able to find any link.
Is it possible to determine which billing approach will be used for each instance using the Amazon SDK?
So, for example, I would see that some instance is linked to some reserved instance (reservation).
Is it possible to determine which billing approach will be used for each instance using the Amazon SDK?
No, it isn't.
The "link" between reserved instances and running instances is not something the EC2 operational infrastructure knows about. It's all done after the fact, in billing.
Each hour, for a given instance type and availability zone placement, you're billed for your reserved instances (depending on the terms of the reservation, this happens whether you have this many instances running or not, though in some cases, the amount billed here is $0 since you've already paid). Then, if the number of running on-demand instances for that type and placement exceeds the number of reserved instances for that type and placement, the difference for that hour is billed at the on-demand rate.
So if you had purchased one reserved instance matching a certain spec and during a given hour you had two such instances running, it's not really the case that one of your instances "is the reserved instance" and the other one isn't. If you stop either one, then the next hour, the reserved instance pricing applies to the one that remains running... but EC2 can't tell you which is which and, in fact, the billing logic is such that it doesn't matter.
There isn't really a link between reservations and specific instances. Think of it more like a discount that gets applied to your bill, after you have incurred some instance charges.
You can use the Reserved Instance Utilization Report to see how your reservations have been applied to the instance hours you have been charged for.
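If you want that utilization view programmatically rather than in the console report, the Cost Explorer API (if it is enabled on the account) exposes the same numbers; a hedged boto3 sketch with example dates:

    import boto3

    ce = boto3.client("ce")  # Cost Explorer

    response = ce.get_reservation_utilization(
        TimePeriod={"Start": "2017-06-01", "End": "2017-07-01"},  # example month
        Granularity="MONTHLY",
    )

    for period in response["UtilizationsByTime"]:
        print(period["TimePeriod"], period["Total"]["UtilizationPercentage"])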

How to autoscale EMR task instances

I am using EMR with task instance groups as spot instances. I want to maintain a minimum number of task instances at all times.
That means, whenever EMR terminates task instances because the spot price goes higher than the bid we set, my application should launch another task instance with a slightly higher bid price.
My research:
Use CloudWatch to be notified when a threshold is breached, and auto-scale task instances. But from what I have studied, there is no concept of auto-scaling in EMR.
Use CloudWatch to notify SQS when a threshold is breached, and have a service that is always consuming from the queue and expands the task instances.
Questions
Is there any auto-scaling available in EMR? If it is available, then my effort reduces to just setting the threshold and the corresponding action to expand the task instances.
If you have any other approach to solving this problem, please suggest it.
How Spot Prices Work
When an Amazon EC2 instance is launched with a spot price (including when launched from Amazon EMR), the instance will start if the current spot price is below the provided bid price. If the spot price rises above the bid price, the instance is terminated. Instances are only charged the current spot price.
Therefore, the logic of launching a new spot instance with a "little higher bid price" is not necessary. The instance will always be charged the current spot price, so simply bid as high as you are willing to pay for a spot instance. You will either pay less than the spot price (great!) or your instance will be terminated because the price has gone higher than you are willing to pay (in which case you don't want to pay a "little higher" for the instance).
If you wish to "maintain minimum number of task instances" at all times, then either pay the normal EMR charge (which means the instances won't be terminated) or bid a particularly large price for the spot instances, such as 2 x the normal price. Yes, you might occasionally pay more for instances, but on average your price will be quite low.
If you wish to be particularly sneaky, you could bid up to the normal price for the EC2 instances then, if instances are terminated, launch more task nodes without using spot pricing. That way, your instances won't be terminated and you won't pay more than the normal EC2 price. However, you would have to terminate and replace those instances when the spot price drops, otherwise you are paying too much. That's why it might be better just to provide a high bid price on your spot instances.
Bottom line: Use spot pricing, but bid a high price. You'll get a good price most of the time.
AWS EMR does not have an autoscaling option available, but as a workaround you can integrate autoscaling yourself using AWS SQS. This is a rough picture of what you can put together:
Launch your EMR cluster using spot instances.
Set up an SQS queue and create three triggers: one for the CPU threshold, a second for the EC2 spot instance termination notice, and a third for changing the spot instance bid prices.
So if CPU usage increases, SQS will trigger an event to launch a new instance into the cluster; if there is a spot instance termination notice, SQS will trigger the launch of another instance to balance the load and send an event to change the bid price so that another spot instance can be launched. (This is just a rough sketch, but I guess you will understand the logic.)
Here is a guide to Auto Scaling based on an SQS queue:
https://docs.aws.amazon.com/autoscaling/latest/userguide/as-using-sqs-queue.html
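As a rough sketch of the first trigger, a CloudWatch alarm on the queue depth that notifies an SNS topic your worker service listens on; the queue name, topic ARN and threshold are placeholders:

    import boto3

    cloudwatch = boto3.client("cloudwatch")

    cloudwatch.put_metric_alarm(
        AlarmName="emr-queue-backlog",                      # placeholder name
        Namespace="AWS/SQS",
        MetricName="ApproximateNumberOfMessagesVisible",
        Dimensions=[{"Name": "QueueName", "Value": "my-emr-jobs"}],  # placeholder queue
        Statistic="Average",
        Period=300,
        EvaluationPeriods=1,
        Threshold=100,                                      # example backlog threshold
        ComparisonOperator="GreaterThanThreshold",
        AlarmActions=["arn:aws:sns:us-east-1:123456789012:scale-emr"],  # placeholder topic
    )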
As has been correctly pointed out, the EMR API provides all the necessary ingredients to 1) collect monitoring data, and 2) programmatically scale the cluster up and down.
Basically, there are two main options to implement autoscaling for EMR clusters:
Autoscaling Loop: A process that is running on a server and continuously monitors the cluster for its current load. Performance metrics (memory, CPU, I/O, etc) can be collected in regular intervals and stored in a database. Autoscaling rules are evaluated against the performance metrics, and the cluster's task nodes are scaled up or down if required.
Event-Based Autoscaling: Using CloudWatch metrics (e.g., metrics for EMR or EC2), you can programmatically define triggers that are fired under certain conditions (for instance, add nodes if average CPUUtilization of all nodes exceeds 80%).
Both options have their pros and cons. The main advantage of option 2 is that it is a serverless approach (it does not require running your own server). Option 1, on the other hand, does require a server, but in return gives you more control to customize the logic of your scaling rules. It also allows you to keep searchable records of the history of scaling decisions.
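Whichever option drives the decision, the actual resize call is small. A hedged boto3 sketch of growing or shrinking a task instance group (the instance group ID is a placeholder):

    import boto3

    emr = boto3.client("emr")

    def resize_task_group(instance_group_id, target_count):
        """Ask EMR to scale a task instance group to target_count nodes."""
        emr.modify_instance_groups(
            InstanceGroups=[
                {
                    "InstanceGroupId": instance_group_id,  # from list_instance_groups
                    "InstanceCount": target_count,
                }
            ]
        )

    resize_task_group("ig-EXAMPLE12345", 10)  # placeholder group id and size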
You could take a look at Themis, an EMR autoscaling framework developed at Atlassian. Themis implements the autoscaling loop as discussed in option 1 above. Current features include proactive as well as reactive autoscaling, support for spot/on-demand task nodes, it comes with a Web UI, and the tool is very easy to configure.
I have had a similar problem, and I wanted to share one possible alternative. I have written a Java tool to dynamically resize an EMR cluster during the processing. It might help you. Check it out at:
http://www.lopakalogic.com/articles/hadoop-articles/dynamically-resize-emr/
The source code is available on GitHub.