I am trying to calculate what GCP Cloud Run would cost if I run the service for a month. In the attached picture, you can see that the calculator did not add the cost for the total number of requests. Cloud Run charges 0.40 USD per million requests, so I think I need to add the request cost on top of the figure it calculated, which makes the estimate pretty misleading. For example, the UI gives no option to enter the number of requests when "CPU is always allocated" is selected. I know that the warm instances (2 instances) should be running 24/7 for 30 days. If we assume 730 hours per month, that is 1460 instance-hours (5256000 seconds), which will incur the following charges:
5256000 * 0.00002160 = 113.5296 USD for the CPU cost. Here 0.00002160 is the price per vCPU-second.
5256000 * 0.00000240 = 12.61 USD for the memory cost. Here 0.00000240 is the price for the memory GiB second
So now if we deduct 13402800 - 5256000 = 8146800 we get 8146800 seconds for the CPU and for the memory we would get 268056000 - 5256000 = 267530400. So price would come down to this:
CPU = 8146800 * 0.00002160 = 175.97
Memory = 267530400 * 0.00000240 = 642.07
Total would be = 175.97 + 642.07 + 113.52 + 12.61 = 944.17 + 4 (1 million request is 0.4 USD = 10 million * 0.4 = 4.00 USD) = 948.17
I also tried to calculate this way:
CPU cost = 24 * 30 * 0.00002160 * 3600 * 60 = 3359.23
Memory cost = 24 * 30 * 0.00000240 * 3600 * 60 = 373.24
Total = 3732.47 USD
I have looked into this answer on StackOverflow but I think it is a wrong calculation.
Can someone break down this cost that matches the output shown by the GCP pricing calculator?
The estimator is quite stupid. After a few tests, I understood how it is configured.
Here are some details:
100 (peak) - 2 (min) = 98 -> the number of instances that can scale up and down. Arbitrarily, the calculator assumes they are up 50% of the time and down 50% of the time. It therefore counts 49 instances as running full time over the month, on average.
In addition to those 49, the 2 minimum instances are ALWAYS on. Therefore, the total number of instances to consider always on during the month is 51.
51 * 730 * 3600 = 134,028,000 -> about 134 million, the number of CPU seconds shown by the calculator.
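The arithmetic above can be sketched as follows. This is only a reconstruction of the calculator's apparent behaviour as inferred from the tests described here, not documented logic; the function name and constants are made up for illustration.

```python
SECONDS_PER_HOUR = 3600
HOURS_PER_MONTH = 730  # month length the calculator appears to use


def estimated_instance_seconds(min_instances: int, peak_instances: int) -> float:
    """Instance-seconds per month, assuming (as inferred above) that the
    instances between min and peak are up 50% of the time."""
    burst_average = (peak_instances - min_instances) / 2  # 98 / 2 = 49
    always_on = min_instances + burst_average             # 2 + 49 = 51
    return always_on * HOURS_PER_MONTH * SECONDS_PER_HOUR


print(f"{estimated_instance_seconds(2, 100):,.0f}")  # 134,028,000
```

Multiplying that figure by the per-second vCPU and memory prices (scaled by the vCPU count and GiB per instance) should give the compute part of the estimate.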
Now, your second way to calculate:
CPU cost = 24 * 30 * 0.00002160 * 3600 * 60 = 3359.23
Have a close look at the numbers used:
24: number of hours per day
30: number of days per month
0.0000...: CPU cost
3600: number of seconds per hour
60: ???? What's that? The number of instances per month? The number of seconds per minute? The number of minutes per hour? (The last two are already taken into account in the 3600.)
Final word: when you talk about numbers, take care with them. You dropped several zeros, which makes it difficult to understand your issue.
I don't know if I answered your question. In any case, it's difficult to know the exact cost of a pay-as-you-use product. You can know the max cost by setting a max number of instances, since you will never go above that threshold. But if you don't have a clear view of your traffic and the number of requests (and you also forgot the egress cost), a precise estimate is impossible.
I have 3 EC2 instances which were created by Elastic Beanstalk. Their current CPU credit balances are as follows:
And this is the monitoring page in Elastic Beanstalk:
Why is "Sum CPUCreditBalance" equal to 1.8K?
As you can see from the first picture, the CPU credit balances of the 3 EC2 instances are all below 120. 120 * 3 = 360 is far smaller than 1.8K = 1800.
How is 1.8K calculated?
Here are the options I used when creating Sum CPUCreditBalance:
It is the sum of all data points (CPU Credit Balance) in the graph.
Roughly calculating data points: 11x20 + 7x50 + 110x11 = 1780
SUM() isn't a meaningful aggregation of a sampled statistic like CPU Credit Balance. You're adding up all the values from the samples recorded in the time range, and that provides no useful information for this type of measurement.
SUM() only makes sense when the metric itself is a raw count of things per sampling period, such as the number of HTTP requests or errors.
Sum -- All values submitted for the matching metric added together. This statistic can be useful for determining the total volume of a metric.
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/cloudwatch_concepts.html#Statistic
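To make the difference concrete, here is a small sketch using made-up samples that mirror the rough tally above (11 points near 20, 7 near 50, 11 near 110). The sample values are assumptions read off the graph, not real CloudWatch data.

```python
from statistics import mean

# Hypothetical CPUCreditBalance samples across the graph's time range
samples = [20] * 11 + [50] * 7 + [110] * 11

total = sum(samples)      # what the "Sum" statistic adds up
average = mean(samples)   # a meaningful summary for a sampled balance

print(total)              # 1780
print(round(average, 1))  # 61.4
```

The sum grows with the number of samples in the range, which is why it lands near 1.8K even though no instance ever holds more than about 120 credits.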
I am using a T2.medium instance. A third of the day I am doing intensive statistical calculations and figured that the remaining 2/3 of the time I would "earn" credits at a rate of 24 per hour.
But that is not happening. This is my usage the last two days:
And this is my credit account:
I hadn't used it for (more than) a day until yesterday at 6 pm. I then used it intensively for five hours. After that, I would expect my "account" to accumulate 24 credits per hour, but for 9-10 hours almost nothing happens, then it accumulates as expected for 9 hours and then goes flat again.
I am unable to figure out what is going on or whether it is a fault. Does anyone have a good explanation?
EDIT: I have included a week of activity below. I still can't figure out the algorithm:
Update: The rules used to calculate t2 CPU credit balances appear to have changed such that the issue prompting this question should no longer have an impact.
Based on customer feedback, we’ve updated T2 instances with a new CPU Credit allocation policy that is the same as or better than the previous policy in all cases.
...
Now, earned CPU Credits do not expire until the instance is terminated or stopped. A T2 instance can still earn up to the same maximum level allowed by the instance size. The CPUCreditBalance will now increase anytime the current CPUCreditUsage is below the baseline and can grow to the maximum allowed for the instance size
https://forums.aws.amazon.com/ann.jspa?annID=5196
h/t: Last Week in AWS for the update.
The original answer follows.
This question has caused me quite a bit of mental anguish over the last few hours, because the graphs almost make sense, based on what I know about t2 instances. Almost, but not quite, and I couldn't put my finger on the problem. That's the worst kind. Particularly being a huge fan of the value proposition offered by t2 machines.
But I did finally figure out what's going on here.
There's one concept of CPU credits the documentation doesn't seem to explain, but the math works out, and the explanation holds up nicely under real-world observations:
The most recently earned CPU credits are spent first, not last.
Does order matter? It does.
For testing, I used a t2.micro (primarily because I had an idle one that had been running for several days, and needed something to do, and I didn't want the extra "initial" credits of a new instance to cloud up the observations) but all instance types in the t2 class have similar behavior.
By way of background: in the t2 class, CPU credits are earned at different rates, but CPU credits are used at the same rate for all instance types in the class:
A CPU Credit provides the performance of a full CPU core for one minute.
The t2.micro and t2.small have only one core, so they can burn up to 1 credit per minute or 60 credits per hour, at 100% CPU utilization. The t2.medium and t2.large are dual core, so they can burn up to 2 credits per minute, or 120 credits per hour, at 100% CPU utilization on both cores.
If 1 credit = 100% of 1 core for 1 minute, then 1 credit is also equal to 20% of 1 core for 5 minutes. Since the Cloudwatch graph interval is in 5 minute increments, I set up the following test:
On a t2.micro that has been running for several weeks with essentially no load, I installed lookbusy, a handy utility that allows you to make a machine "look busy" with parameters you specify -- e.g, keep the CPU at 20% utilization.
$ screen -S eat_cpu
$ ./lookbusy -v -c 20 -r fixed
This does exactly what you'd expect, burning 1 CPU credit every 5 minutes. The "CPU Credit Usage" graph confirms this, showing 1 credit being used every 5 minutes. (The CPU Utilization graph, and top, both confirm the 20%.)
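The credit arithmetic used here follows directly from the quoted definition (one credit is one full core for one minute). A minimal sketch, with a hypothetical helper name:

```python
def credits_used(utilization: float, cores: int, minutes: float) -> float:
    """CPU credits consumed: one credit is 100% of one core for one minute."""
    return utilization * cores * minutes


print(credits_used(0.20, 1, 5))   # 20% of one core for 5 minutes
print(credits_used(1.00, 2, 60))  # both cores of a dual-core t2 flat out for an hour
```

The first call comes to 1 credit per 5-minute CloudWatch interval, and the second to the 120 credits per hour mentioned above.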
But what's happening to my credit balance? It's being depleted by 1 credit every 5 minutes. That seems wrong, doesn't it? I mean, yes, I just said that's how many I'm using, but... I'm also supposed to be earning 6 credits per hour, so I should only be depleting my balance by a net of 0.5 credits every 5 minutes, right?
Hold on... checking the numbers, again: I'm earning 6 per hour, spending 12 per hour, so, yes... that seems like it should be a net decrease of only 6 per hour, not 12... right? Clearly, something doesn't add up the way I expected, because my balance is definitely going down by 12 per hour, and my CPU is definitely only running at 20%.
I seem to be earning no credits to offset my usage. How is that possible?
Unless...
Unused earned credits from a given 5 minute interval expire 24 hours after they are earned
Well, 24 hours ago, my instance was completely idle. During that hour, I earned 6 credits that I... didn't (?) use. Am I not using them now? Shouldn't I be?
any expired credits are removed from the CPU credit balance at that time, before any newly earned credits are added
Crud. Could this be related? This hour, I earned 6 new credits. But right before that, I lost 6 credits from 24 hours ago. Then I spent 12 credits this hour... so my balance went down by 6, up by 6, and down by another 12. Well, that explains the -12 change for the hour, but...
Can that be the reason?
I'm a voracious reader of documentation, so I knew about the expiring credits aspect... but I assumed all along that this was nothing more than the reason an idle instance hovers near its maximum balance, and did not have any other significance. How could it? If I have less than the maximum (6 x 24 = 144 for a t2.micro) then how could I have credits that need to expire?
If my credits from 24 hours ago are always counting against me, wouldn't my balance tend toward zero, regardless of what I do?
Unless...
After tossing and turning most of the night while contemplating sliding around piles of imaginary tokens (representing CPU credits) on an imaginary table top (representing time)... I realized that the "expiration" rule would cause exactly the behavior we observe if, counter-intuitively, credits are not spent in the order in which they are earned (FIFO), but rather in the reverse order (LIFO).
Following that line of reasoning, the explanation for what my 20% CPU test is actually doing is this, where the first hour of my test was "hour 0" --
     | spends 6+6 credits  | expire 6 credits
test | earned this many    | earned this many
hour | hours before hour 0 | hours before hour 0
-----+---------------------+--------------------
   0 | -1, -2              | -24
   1 | -3, -4              | -23
   2 | -5, -6              | -22
   3 | -7, -8              | -21
   4 | -9, -10             | -20
   5 | -11, -12            | -19
   6 | -13, -14            | -18
   7 | -15, -16            | -17
And they meet in the middle.
Is this genuine, or am I guessing? I'm not guessing, and here's the evidence:
After 8 hours, my CPU credit usage graph remains solid, still holding steady at 1 credit per 5 minutes, but after the same 8 hours, my CPU credit balance finally begins to deplete at the (slower) rate I originally expected: 0.5 credits every 5 minutes.
Apparently, as I worked backward in time, spending previously earned credits "newest first," I caught up with my old credits that were about to expire, finally reaching the point where I was using them before they had a chance to expire. Now, I have no credits that are 24 hours old, and so no credits are expiring -- so I am no longer losing credits before new credits are earned. I am now able to keep the 6 that I earn per hour, because I used up the old ones, decreasing the net impact to my credit balance to the expected level.
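The bookkeeping in the table can be replayed with a small simulation. This is a sketch of the model described in this answer, not of AWS's actual (undocumented) accounting. It assumes hourly buckets of 6 credits, a t2.micro that sat idle for the 24 hours before the test, and, as the table does, that spending draws on the newest pre-test buckets while credits earned during the test are kept.

```python
from collections import deque

EARN_PER_HOUR = 6     # t2.micro earn rate
SPEND_PER_HOUR = 12   # 20% of one core for 60 minutes
EXPIRY_HOURS = 24


def hourly_balance_changes(hours: int) -> list:
    """Net change in credit balance for each hour of the 20% CPU test."""
    # Pre-test buckets, oldest on the left: credits earned at hours -24 .. -1
    old = deque([EARN_PER_HOUR] * EXPIRY_HOURS)
    changes = []
    for _ in range(hours):
        # The oldest unused bucket reaches 24 hours and expires
        expired = old.popleft() if old else 0
        # Spend the newest pre-test buckets first; once they are gone,
        # the remainder is covered by credits earned during the test
        to_spend = SPEND_PER_HOUR
        while to_spend > 0 and old:
            to_spend -= old.pop()
        changes.append(EARN_PER_HOUR - SPEND_PER_HOUR - expired)
    return changes


print(hourly_balance_changes(10))
# [-12, -12, -12, -12, -12, -12, -12, -12, -6, -6]
```

Under these assumptions the balance falls by 12 per hour for the first 8 hours and by 6 per hour afterwards, which is the pattern seen in the graphs.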
This explains the only reservation I had about the graphs in the question: why, when utilization drops off, does it take so long for the balance to rebound?
The TL;DR answer is this: the balance doesn't rebound immediately, after a burst of heavy utilization, because you still have unused credits from 24 hours prior, which are canceling out your newly-earned credits, until you reach the point in time when you don't have any 24-hour-old unused credits. When that happens, your credit balance increases again.
Leave the instance completely idle for 24 hours and you will eventually see the balance steadily (for the most part) rise to the maximum again, as expected. Anything less than 24 hours completely idle will cause your balance to remain perpetually somewhere below the max.
My test script eventually depleted my credit balance almost all the way down. When I killed the process eating the CPU, the credit balance began to recover immediately, at the expected rate of 6 credits per hour.
Conversely, when I took a different machine that had seen low utilization for 24 hours, and ran its CPU to 100% for a few minutes, then took it back to idle, the credits did not begin to accumulate immediately... being offset by old, expiring ones.
Quotes are from http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/t2-instances.html.
I have 6 instances of type m3.large.elasticsearch and storage type instance.
I don't really understand this: what do Average, Minimum, Maximum, etc. mean here?
I am not getting any logs into my cluster right now although it shows FreeStorageSpace as 14.95GB here:
But my FreeStorageSpace graph for "Minimum" has reached zero!
What is happening here?
I was also confused by this. Minimum means the free space on a single data node: the one with the least free space. Sum means the size for the entire cluster (the sum of the free space on all data nodes). I got this info from the following link:
http://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/es-managedomains.html
We ran into the same confusion. Avg, Min, Max spreads the calculation across all nodes and Sum combines the Free/Used space for the whole cluster.
We had assumed that Average FreeStorageSpace means average free storage space of the whole cluster and set an alarm keeping the following calculation in mind:
Per day index = 1 TB
Max days to keep indices = 10
Hence we had an average utilization of 10 TB at any point in time. Assuming we will go 2x, i.e. 20 TB, our actual storage need as per https://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/sizing-domains.html#aes-bp-storage with a replication factor of 2 is:
(20 * 2 * 1.1 / 0.95 / 0.8) = 57.89 =~ 60 TB
So we provisioned 18 x 3.8 TB instances =~ 68 TB to accommodate 2x = 60 TB.
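The sizing arithmetic above, written out as a sketch. The parameter names are illustrative; the default factors (10% indexing overhead, 5% reserved by the OS, 20% service overhead) are the ones implied by the 1.1, 0.95 and 0.8 in the formula.

```python
def required_storage_tb(
    source_tb: float,
    copies: int,
    indexing_overhead: float = 0.10,
    os_reserved: float = 0.05,
    service_overhead: float = 0.20,
) -> float:
    """Minimum cluster storage for the given source data, counting all copies."""
    return (
        source_tb * copies * (1 + indexing_overhead)
        / (1 - os_reserved)
        / (1 - service_overhead)
    )


needed = required_storage_tb(20, copies=2)
provisioned = 18 * 3.8

print(round(needed, 2))       # 57.89
print(round(provisioned, 1))  # 68.4
```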
So we had set an alarm that if we go below 8 TB free storage - it means we have hit our 2x limit and should scale up. Hence we set the alarm
FreeStorageSpace <= 8388608.00 for 4 datapoints within 5 minutes + Statistic=Average + Duration=1minute
FreeStorageSpace is in MB hence - 8 TB = 8388608 MB.
But we immediately got alerted because our average free space per node was below 8 TB.
After realizing that to get accurate storage you need to do FreeStorageSpace sum for 1 min - we set the alarm as
FreeStorageSpace <= 8388608.00 for 4 datapoints within 5 minutes + Statistic=Sum + Duration=1minute
The above calculation checked out and we were able to set the right alarms.
The same applies for ClusterUsedSpace calculation.
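A quick sketch of why the Average alarm fired immediately while the Sum alarm behaves as intended. The per-node free space values are made up for illustration (18 nodes with 3 TB free each).

```python
TB_IN_MB = 1024 * 1024
THRESHOLD_MB = 8 * TB_IN_MB  # 8388608, the alarm threshold used above

# Hypothetical FreeStorageSpace reported by each of the 18 data nodes, in MB
free_per_node_mb = [3 * TB_IN_MB] * 18

average_mb = sum(free_per_node_mb) / len(free_per_node_mb)  # per-node view
total_mb = sum(free_per_node_mb)                            # whole-cluster view

print(average_mb <= THRESHOLD_MB)  # True: alarm fires although the cluster has 54 TB free
print(total_mb <= THRESHOLD_MB)    # False: no alarm until cluster-wide free space drops to 8 TB
```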
You should also track the actual free space percent using Cloudwatch Math:
How should I interpret the AWS EC2 CloudWatch NetworkIn and NetworkOut metrics?
What does the Statistic: Average in the chart refer to?
The docs state that "the units for the Amazon EC2 NetworkIn metric are Bytes because NetworkIn tracks the number of bytes that an instance receives on all network interfaces".
When viewing the chart below, Network In (Bytes), with Statistic: Average and a Period: 5 Minutes (note that the time window is zoomed in to around five hours, not one week), it is not immediately obvious how the average is calculated.
Instance i-aaaa1111 (orange) at 15.29: 2664263.8
If I change Statistic to “Sum”, I get this:
The same instance (i-aaaa1111), now at 15.31: 13321319
It turns out 13321319/5 = 2664263.8, suggesting that incoming network traffic during those five minutes was, on average, 2664263.8 Bytes/minute.
=> 2664263.8/60 ≈ 44404.4 Bytes/second
=> 44404.4/1024 ≈ 43.4KB/s
=> 43.4*8 ≈ 350Kbps
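The same conversion chain as a small sketch, assuming (as the observation above suggests) that Average is the period's Sum divided by the number of minutes in the period. The function name is hypothetical.

```python
def network_rates(sum_bytes: float, period_minutes: float) -> dict:
    """Convert a NetworkIn 'Sum' for one period into per-minute and per-second rates."""
    bytes_per_minute = sum_bytes / period_minutes
    bytes_per_second = bytes_per_minute / 60
    kib_per_second = bytes_per_second / 1024
    return {
        "bytes_per_minute": bytes_per_minute,
        "bytes_per_second": bytes_per_second,
        "kib_per_second": kib_per_second,
        "kbit_per_second": kib_per_second * 8,
    }


rates = network_rates(13321319, period_minutes=5)
for name, value in rates.items():
    print(f"{name}: {value:.1f}")
```

For the 5-minute sum of 13321319 bytes this gives about 2664263.8 bytes/minute, 44404.4 bytes/second, 43.4 KB/s and roughly 347 Kbps.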
I tested this by repeatedly copying a large file from one instance to another, transferring at an average speed of 30.1MB/s. The CloudWatch metric was 1916943925 Bytes (Average) => around 30.5MB/s
The metric, "Network In (Bytes)", refers to bytes/minute.
It appears in my case that the average is computed over the period specified. In other words: for '15 Minutes', it divides the sum of bytes for the 15-minute period by 15, for '5 Minutes', it divides the sum for the 5-minute period by 5.
Here is why I believe this: I used this chart to debug an upload where rsync was reporting ~710kB/sec (~727,000 bytes / sec) when I expected a faster upload. After selecting lots of different sum values in the EC2 plot, I determined that the sums were correct numbers of bytes for the period specified (selecting a 15 minute period tripled the sum compared to a 5 minute period). Then viewing the average and selecting different periods shows that I get the same value of ~45,000,000 when I select a period of "5 Minutes", "15 Minutes", or "1 Hour".
45,000,000 (bytes/???) / 730,000 (bytes/sec) is approximately 60, so ??? is a minute (60 seconds). In fact, ~45,000,000 / 1024 / 60 = ~730 kB/sec and this is within 3% of what rsync was reporting.
Incidentally, my 'bug' was user error - I had failed to pass the '-z' option to rsync and therefore was not getting the compression boost I expected.