Cost to train on AWS? - amazon-web-services

I'm coming from academia where I had HPC clusters at my disposal. Now I'm trying to deploy something on AWS.
I'm trying to budget for what it would cost, $-wise, to train some standard neural nets on standard data sets so I have an idea what other training will cost. Even ballparkish estimates are appreciated.
I know you can request faster or more sets of GPUs, so I also don't know the spread of speed vs. cost either; any insight here is also appreciated.
What would it cost to train ResNet-50 (or really any smallish ResNet) on CIFAR-10, a relatively small net on a small data set? (say, 100 epochs with reasonable batch size)

I do not know anything about ResNet or CIFAR, but as far as AWS EC2 pricing goes, it depends on the instance family, the instance type, and whether you reserve capacity:
On-Demand Instances: Highest cost. Ideal for prototyping and short-lived environments. Pricing
Reserved Instances: Discounts are based on the term of the reservation (1 year or 3 years). There is also a No-Upfront option, which does not provide the maximum savings but still reduces cost significantly. Pricing
Spot Instances: The cheapest option of all, but your application must be designed to handle interruptions, because AWS can terminate your instance with little or no notice. A recent announcement from AWS adds support for a termination notification for certain types of Spot Instances, which you may want to investigate. Pricing
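To turn these options into a ballpark figure, the estimate boils down to training hours multiplied by the hourly rate. Below is a minimal sketch of that arithmetic. The hourly rates and the seconds-per-epoch figure are placeholders I made up for illustration, not real AWS prices or measured ResNet timings; substitute the current rates from the pricing pages and a timing you measure on a short trial run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PricingOption:
    name: str
    hourly_rate_usd: float  # PLACEHOLDER value, not a real AWS price


def training_cost(epochs: int, seconds_per_epoch: float,
                  option: PricingOption, num_instances: int = 1) -> float:
    """Estimated cost in USD: wall-clock hours x hourly rate x instance count."""
    hours = epochs * seconds_per_epoch / 3600
    return hours * option.hourly_rate_usd * num_instances


# Hypothetical rates chosen only to show the relative ordering of the options.
OPTIONS = [
    PricingOption("on-demand", 3.00),
    PricingOption("reserved (no upfront)", 2.00),
    PricingOption("spot", 1.00),
]


def compare(epochs: int, seconds_per_epoch: float) -> dict:
    """Return the estimated cost per pricing option, rounded to cents."""
    return {
        option.name: round(training_cost(epochs, seconds_per_epoch, option), 2)
        for option in OPTIONS
    }


if __name__ == "__main__":
    # 100 epochs at an assumed 36 seconds per epoch is exactly one hour.
    for name, cost in compare(epochs=100, seconds_per_epoch=36).items():
        print(f"{name}: ${cost:.2f}")
```

Timing a single epoch on the instance type you are considering and plugging it in is probably the quickest way to get a number you can trust.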

Related

How do you implement cloud solutions without incurring costs during development?

I am completely new to the implementation of cloud solutions. I've just started taking AWS training courses.
But I already have a very fundamental question about the flow of development in cloud projects:
How do you go about developing solutions without incurring costs? I know that there are free tiers, but in practice you need a lot of unfree elements. Especially when working with infrastructure-as-code approaches (e.g. CloudFormation), it can happen that every time you try out the templates, costs can be incurred immediately.
Is there maybe something like a sandbox mode or how else do you go about it in practice?
Outside of the AWS Free Tier you will be billed for creating services.
The best way to keep costs as low as possible is to combine the lowest-priced settings (such as instance class) with removing resources you're not using once you're done. This will still cost something, but many resources are now moving to per-second billing (where you normally have to pay for at least the first minute), so the cost is kept low.
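As a rough illustration of per-second billing with a one-minute minimum, here is a small sketch. The hourly rate passed in is a placeholder, and the 60-second minimum is an assumption taken from the description above; check the billing rules for the specific service you use.

```python
import math


def billed_seconds(runtime_seconds: float, minimum_seconds: int = 60) -> int:
    """Seconds you pay for: rounded up to whole seconds, never below the minimum."""
    return max(minimum_seconds, math.ceil(runtime_seconds))


def run_cost(runtime_seconds: float, hourly_rate_usd: float) -> float:
    """Cost in USD of one short-lived resource at a given (placeholder) hourly rate."""
    return billed_seconds(runtime_seconds) * hourly_rate_usd / 3600


if __name__ == "__main__":
    # A 10-second test deployment is billed as a full minute.
    print(billed_seconds(10))
    # Thirty minutes at an assumed $0.10/hour.
    print(f"${run_cost(1800, 0.10):.4f}")
```

The practical point is that tearing a test stack down after a few minutes costs a small fraction of leaving it up for an hour.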
Additionally, when dealing with some services (such as EC2, ECS, Fargate and ECR) you can make use of Spot capacity and sometimes pay as little as 10% of the original price, which helps to reduce the cost of these resources.
To ensure you can recreate resources when you want them, use infrastructure as code to roll them out again as needed (CloudFormation and Terraform are great offerings for this).
Finally, be on the lookout for AWS conferences. They are a great way to pick up AWS credits for attending, which will offset your bill for most AWS services.

I want AWS Spot pricing for a long-running job. Is a spot request of one instance the best way to achieve this?

I have a multi-day analysis problem that I am running on a 72 cpu c5n EC2 instance. To get spot pricing, I made my code interruption-resilient and am launching a spot request of one instance. It works great, but this seems like overkill given that Spot can handle thousands of instances. Is this the correct way to solve my problem or am I using a sledgehammer to squash a fly?
I've tried normal EC2 launching, which works great, except that it is four times the price. I don't know of any other way to approach this except for these two ways. I thought about Fargate or containers or something, but I am running a 72 cpu c5n node, and those other options won't let me use that kind of horsepower (that I know of, hence my question).
Thanks!
Amazon EC2 Spot Instances are an excellent way to get cheaper compute (up to 90% discount). The only downside is that the instances might be stopped/terminated (your choice) if there is insufficient capacity.
Some strategies to improve your chance of obtaining spot instances:
Use instances across different Instance Types and Availability Zones because they each have different availability pools (EC2 Spot Fleet can assist with this)
Use resources on weekends and in evenings (even in different regions!) because these tend to be times of lower usage
Use Spot Instances with a specified duration (also known as Spot blocks), but this is at a higher price and a maximum duration of 6 hours
If your software permits it, you could split your load between multiple instances to get the job done faster and to be more resilient against any stoppages of your Spot instances.
Hopefully your application is taking advantage of all the CPUs, otherwise you'd be better-off with smaller instances.
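For the interruption-resilient part, a minimal sketch of reacting to a Spot interruption notice is shown below. The metadata path and the JSON shape of the notice follow the EC2 Spot documentation as I understand it; `fetch_notice` and `save_checkpoint` are hypothetical callables you would supply yourself (on a real instance, `fetch_notice` would do an HTTP GET against the metadata URL and return `None` on a 404, and with IMDSv2 it also needs a session token).

```python
import json
from datetime import datetime, timezone
from typing import Callable, Iterable, Optional

# Instance metadata path for Spot interruption notices.
# It returns HTTP 404 until a notice has been issued for the instance.
INSTANCE_ACTION_URL = "http://169.254.169.254/latest/meta-data/spot/instance-action"


def seconds_until_interruption(body: Optional[str], now: datetime) -> Optional[float]:
    """Parse a notice such as {"action": "terminate", "time": "2017-09-18T08:22:00Z"}.

    Returns the seconds remaining, or None when there is no notice.
    """
    if not body:
        return None
    notice = json.loads(body)
    when = datetime.strptime(notice["time"], "%Y-%m-%dT%H:%M:%SZ")
    return (when.replace(tzinfo=timezone.utc) - now).total_seconds()


def run_with_checkpoints(work_items: Iterable[Callable[[], None]],
                         fetch_notice: Callable[[], Optional[str]],
                         save_checkpoint: Callable[[int], None]) -> int:
    """Run work items in order; checkpoint and stop once a notice appears.

    Returns the number of items completed, so a restart can resume from there.
    """
    done = 0
    for item in work_items:
        if fetch_notice() is not None:
            save_checkpoint(done)
            return done
        item()
        done += 1
    return done
```

Checking between work items keeps the sketch simple; if a single item can run longer than the notice period, a background poller would be the safer design.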

How do you measure the energy consumption of an EC2 instance?

I have been asked to give a report on the energy consumption of an m5.12xlarge instance. I am wondering how to come up with a factual, approximate number here. Has anyone come across this issue before?
An important concept of the Cloud is that server utilization is often much higher.
In a normal data center, a server is running 24 hours but is only used a few hours a day, and often not fully utilized in those hours.
In the Cloud, when an instance is not required it can be turned off, which means that the capacity is available for somebody else to use. Instance size can also be selected so that it is big enough for the desired workload, without having to get large servers for a potential future workload.
Thus, there is significantly less wastage in the cloud compared to on-premises servers. I mention this because somebody has presumably asked you to measure the environmental impact of a choice of computing infrastructure, and it's not really accurate to directly compare on-premises vs the Cloud since on-premises is typically over-provisioned and under-utilized.
A good article on this topic: Cloud Computing, Server Utilization, & the Environment | AWS News Blog

Choosing the right EC2 instance type?

I'm trying to determine if it makes sense to switch our hosting to EC2 from a dedicated dreamhost server, and if so, what EC2 instance type I should choose to get a good idea of the cost prior to switching. I would like to go low and then bump up if need be.
Current Usage:
dedicated server with 4 GB RAM and 4 CPUs
average disk usage: 783 MB
average bandwidth: 8.5 GB
This is really all the info I get from our dreamhost control panel, so hopefully it's enough to provide some recommendations on where to start.
Using the calculator located here, I'm leaning towards a t2.xlarge. Is that too much? not enough?
It is not possible for anyone to recommend the 'correct' instance type. This is because it depends on the operation of your particular application. It might be CPU-intensive, RAM-intensive, network-heavy, highly parallel, etc.
Some applications might need to handle occasional spikes of traffic, whereas other applications might be relatively consistent in their load.
The correct way to determine your 'best' instance type is to run tests that simulate the expected application load. If you can create an automated test, then you could run it against many different instance types and compare the performance vs cost.
Also, many applications are designed to be able to run across multiple instances, so it would be better to test various quantities of servers as well as their instance type.
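One way to compare such test runs is to normalize each configuration to a cost per unit of work. The sketch below assumes you have measured sustained throughput for each configuration; the configuration names, hourly rates, and throughput figures are invented placeholders, not real instance types or prices.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Benchmark:
    config_name: str            # hypothetical label for an instance type
    instance_count: int
    hourly_rate_usd: float      # placeholder price per instance
    requests_per_second: float  # measured for the whole group of instances


def cost_per_million_requests(b: Benchmark) -> float:
    """USD spent to serve one million requests at the measured throughput."""
    hourly_cost = b.hourly_rate_usd * b.instance_count
    requests_per_hour = b.requests_per_second * 3600
    return hourly_cost / requests_per_hour * 1_000_000


def rank(benchmarks: List[Benchmark]) -> List[Benchmark]:
    """Cheapest configuration per unit of work first."""
    return sorted(benchmarks, key=cost_per_million_requests)


if __name__ == "__main__":
    results = [
        Benchmark("one-large", 1, 0.36, 100.0),
        Benchmark("two-small", 2, 0.09, 100.0),
    ]
    for b in rank(results):
        print(f"{b.config_name}: ${cost_per_million_requests(b):.2f} per 1M requests")
```

Throughput is only one possible yardstick; latency under load or memory headroom may matter more for a given application.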
You might also consider using Amazon EC2 Auto Scaling, which gives the ability to automatically add/remove servers based upon workload. This means that you could use much more powerful instances, but automatically turn some of them off during less-used periods. This affects the cost calculation because the more-powerful instances are more expensive, but you won't be using them all the time.
Then, you could also consider using Amazon EC2 Spot Instances, which can be up to 90% less cost but might be terminated when the demand for such instances is higher. You can also combine On-Demand and Spot Instances to give additional capacity at a lower cost.
(Spot and Auto Scaling are only really applicable if you are using more than one instance to host your application.)
And finally, if your application only requires one instance, you could also consider using Amazon Lightsail that combines the price for instance type and network bandwidth to make the price more predictable.
Bottom line: It depends!
One final word: Most companies consider switching to AWS not purely on a cost basis ("if it makes sense to switch our hosting to EC2 from a dedicated dreamhost server"), but rather on the breadth of features that AWS offers that are not available in a traditional server hosting service. If all you need is "a server", it's probably easiest to consider Amazon Lightsail or keep whatever is currently working for you. The cost saving with AWS won't be dramatic (or it might not even be cheaper!), but it will offer you a lot more capabilities if you ever grow beyond just requiring "a server".

Pricing and operation concerns for Amazon EC2

My knowledge regarding servers is limited and I'm trying to learn more. I'm currently looking into EC2 and I have a question regarding their 'hours of runtime' for a single instance.
Say I go with an m1.medium instance which is $0.120 per hour. Is there any kind of underlying meaning to that? Or is it literally, if my server is working on something 24/7 for a month (31 days) that I'll be billed at $89.28 (24 * .120 * 31)? If I have an unusually high period of activity I don't want to receive a $1000 bill because I didn't fully understand the server pricing.
Also, would 2 m1.small instances perform about the same as 1 m1.medium instance, or is the relationship not entirely linear?
Thanks
Yes, $89.28 is what you would be billed for a month of m1.medium instance usage. But you should also be aware of:
Data transfer costs (for example, if you host a web application, the amount of data served to your end users is billed)
Storage costs, as your instance needs some storage; the same applies to backups (for example, snapshots are billed for the space they use)
You might also be billed for other services (such as EMR), but only if you use them, so there is no need to worry about that from the start.
Refer to the EC2 pricing page or the pricing calculator.
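The arithmetic from the question, extended with the extra line items listed above, can be sketched as follows. The $0.120 hourly rate comes from the question; the data transfer and storage rates in the example are placeholders, not real AWS prices. `Decimal` is used so the money arithmetic stays exact.

```python
from decimal import Decimal


def instance_cost(hourly_rate: Decimal, hours_per_day: int = 24, days: int = 31) -> Decimal:
    """Instance charge for running continuously over the given number of days."""
    return hourly_rate * hours_per_day * days


def monthly_estimate(hourly_rate: Decimal,
                     data_out_gb: Decimal, data_rate_per_gb: Decimal,
                     storage_gb: Decimal, storage_rate_per_gb_month: Decimal,
                     days: int = 31) -> Decimal:
    """Instance charge plus data transfer and storage, all in USD."""
    return (instance_cost(hourly_rate, days=days)
            + data_out_gb * data_rate_per_gb
            + storage_gb * storage_rate_per_gb_month)


if __name__ == "__main__":
    # The figure from the question: 24 * 0.120 * 31.
    print(instance_cost(Decimal("0.120")))
    # Adding assumed 10 GB of outbound data and 20 GB of storage,
    # both at a placeholder $0.10 per GB.
    print(monthly_estimate(Decimal("0.120"),
                           Decimal("10"), Decimal("0.10"),
                           Decimal("20"), Decimal("0.10")))
```

Note that the instance charge depends on hours running, not on how busy the server is, so a spike in activity mainly shows up in the data transfer line.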
If you worry about unexpected bills, set up a billing alert. You'll be notified if your bill exceeds your expectations.
As for performance, 2 m1.small instances are roughly equal to 1 m1.medium only in terms of CPU, but performance often depends on IO, architecture (32-bit vs 64-bit) and other factors. I had a use case where a t1.micro instance outperformed an m1.medium.