How to estimate Google Cloud Run costs? [closed] - google-cloud-platform

I have built my machine learning model and now, as a next step, I wish to make it publicly available. The plan is:
To build the UI using Streamlit.
To package the application as a Docker image.
To deploy the Docker image to the Google Cloud Run environment.
To start the Docker container in the Google Cloud Run environment.
But the problem is that I cannot estimate what the costs of starting and running the container in Google Cloud Run will be (I'm still a beginner). How do I estimate the worst-case scenario, i.e. the maximum cost it could generate? There are CPU, memory, requests and networking properties in the Google Cloud Run pricing table, but:
How can I know exactly how much of these resources my application will take?
Could it happen that, if the application is publicly available and the requests exceed the free tier quota, I get an astronomical bill?
Can I set limits that my bill cannot exceed?

The billing is quite simple: you pay for the CPU time and the memory time that you allocate (and, to a small extent, the number of requests).
The current worst case (sketched in code after the list below) is: (CPU cost per instance per second + memory cost per instance per second) * 3600 * 24 * 30 * 1000, where:
3600 converts seconds to hours
24 converts hours to days
30 converts days to a month
1000 is the default max instances parameter
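Here is a rough sketch of that calculation in Python. The unit prices are placeholders for illustration only; check the current Cloud Run pricing page for your region and tier before relying on them.

```python
# Rough worst-case monthly cost for a Cloud Run service that is busy 24/7.
# The unit prices below are illustrative placeholders -- check the current
# Cloud Run pricing page for your region before relying on them.

CPU_PRICE_PER_VCPU_SECOND = 0.000024   # USD, example rate
MEM_PRICE_PER_GIB_SECOND = 0.0000025   # USD, example rate

def worst_case_monthly_cost(vcpus, memory_gib, max_instances=1000):
    """(CPU cost/s + memory cost/s) * seconds in a month * max instances."""
    cost_per_instance_second = (vcpus * CPU_PRICE_PER_VCPU_SECOND
                                + memory_gib * MEM_PRICE_PER_GIB_SECOND)
    seconds_per_month = 3600 * 24 * 30
    return cost_per_instance_second * seconds_per_month * max_instances

# Default limit of 1000 instances vs. a service capped at 3 instances,
# each with 1 vCPU and 512 MiB of memory.
print(worst_case_monthly_cost(1, 0.5))                    # ~ $65,000 per month
print(worst_case_monthly_cost(1, 0.5, max_instances=3))   # ~ $196 per month
```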
If you want to limit the cost, limit the max instances parameter. You will also limit the scalability of the application, but it prevents an astronomical bill; it's a matter of tradeoffs.
About billing, you can also set a budget alert on your project.
As for your last question, about the quantity of resources used by the application: it's hard to say! You have to know your application. The metrics graphs can give you some inputs.
However, a Cloud Run instance is able to process up to 80 concurrent requests. Thus the metrics you see when processing only 1 request at a time can change dramatically with 80 requests, especially the memory consumption!
You can also play with this concurrency parameter to limit the size of one instance. But if you reduce it, don't set the max instances parameter too low, or some requests won't be served (see the sketch below).
Matter of tradeoffs
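To make the concurrency/max-instances tradeoff concrete, here is a small sketch; the instance memory size and base footprint below are assumed values for illustration, not measurements.

```python
# Rough capacity check when tuning concurrency vs. max instances.
# The numbers are illustrative assumptions, not measured values.

def capacity(max_instances, concurrency):
    """Upper bound on requests served at the same time."""
    return max_instances * concurrency

def memory_headroom_mib(instance_memory_mib, base_footprint_mib, concurrency):
    """Approximate memory budget per in-flight request on one instance."""
    return (instance_memory_mib - base_footprint_mib) / concurrency

print(capacity(max_instances=3, concurrency=80))   # 240 concurrent requests at most
print(capacity(max_instances=3, concurrency=10))   # 30 -- extra requests queue or fail
print(memory_headroom_mib(512, 200, 80))           # ~3.9 MiB per request
print(memory_headroom_mib(512, 200, 10))           # ~31 MiB per request
```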

Related

GCP - Rolling update max unavailable [closed]

I am trying to understand the reasoning behind the GCP error message.
To give you the context,
I have 3 instances running, 1 instance per zone, using a managed instance group.
I want to do an update, and I would like to do it one by one, so max unavailable should be 1. However, GCP does not seem to like it.
How do I achieve high availability here if I set max unavailable to 3?
The reasoning behind the error is that when you initiate an update to a regional MIG, the Updater always updates instances proportionally and evenly across each zone, as described in the official documentation. If you set the maximum number of unavailable instances lower than the number of zones, the update cannot be spread proportionally and evenly across zones.
Now, as you said, it does not make much sense from the high availability standpoint; but this is because you are keeping the instance names when replacing them, which forces the replacement method to be RECREATE instead of SUBSTITUTE. The maximum surge for the RECREATE method has to be 0, because the original VM must be terminated before the new one is created in order to reuse the same name.
On the other hand, the SUBSTITUTE method allows configuring a maximum surge that is applied during the update, creating new VMs with different names before terminating the old ones, and thus always having VMs available.
The recommendation, then, is to use the SUBSTITUTE method to achieve high availability during your rolling updates; if for some reason you need to preserve the instance names, you can achieve high availability by running more than 1 VM per zone.
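A conceptual sketch of that constraint in Python; this just mirrors the reasoning above (even spreading of the unavailable budget across zones), not the actual updater implementation.

```python
# Conceptual sketch of the constraint described above -- not the actual
# updater implementation, just the arithmetic behind the error.

ZONES = 3          # one instance per zone
MAX_UNAVAILABLE = 1

# The updater spreads the unavailable budget evenly across zones:
per_zone_budget = MAX_UNAVAILABLE / ZONES
print(per_zone_budget)   # ~0.33 of a VM per zone -- not a whole VM, hence the error

# RECREATE (keeping instance names) forces maxSurge = 0, so the only way to
# touch every zone is to allow at least one unavailable VM per zone:
print(ZONES * 1)         # maxUnavailable must be >= 3 in this setup

# SUBSTITUTE can surge instead: new VMs (with new names) come up before old
# ones are deleted, so maxUnavailable can stay at 0 and availability is kept:
MAX_SURGE = 3
print(MAX_SURGE >= ZONES)   # True -- a surge VM can cover each zone during the update
```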
I don't think that's really achievable in your context, since there is only 1 instance per zone. In a managed instance group it would not be highly available if 33% of your instances were unavailable; once the update completes, high availability is restored.
I would suggest giving a good read to [1] in order to properly understand how MIG availability is defined on GCP; essentially, if you had 2 2 2 (two instances per zone), you could run the update and still keep 2 2 2 available.
Also please check [2], as it is a concrete example of the 33% statement above.
[1] https://cloud.google.com/compute/docs/instance-groups/regional-migs#provisioning_a_regional_managed_instance_group_in_three_or_more_zones
[2] https://cloud.google.com/compute/docs/instance-groups/regional-migs#:~:text=Use%20the%20following%20table%20to%20determine%20the%20minimum%20recommended%20size%20for%20your%20group%3A

AWS RDS MariaDB capacity planning [closed]

Customer has provisioned following for AWS RDS MariaDB instance:
Instance type: db.m5.large, vCPUs: 2, RAM: 8 GB, Multi AZ: No, Replication: No, Storage: 100 GB, Type: General purpose SSD
We are not sure what the basis for provisioning this instance was. The questions are:
What factors should be considered for capacity planning?
Is this a typical production-grade database configuration?
Since the customer has provisioned the instance, we should account for the customer's reasoning and consider the factors that led them to this plan. However, there are factors which can help you in capacity planning (a rough worked example follows the list), i.e.:
Whether the transaction size is static or dynamic.
If it is dynamic, what the maximum transaction size could be.
How much network bandwidth each transaction is going to consume.
Whether the number of transactions will grow over time (it is supposed to grow anyway).
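For example, here is a rough sketch of how those factors turn into numbers. Every input below is a hypothetical placeholder; replace them with your own measurements.

```python
# Back-of-the-envelope throughput estimate from the factors above.
# All inputs are hypothetical -- replace them with your own measurements.

def monthly_bandwidth_gb(transactions_per_second, avg_transaction_kb):
    """Network volume generated by the transaction load, in GiB per month."""
    seconds_per_month = 3600 * 24 * 30
    return transactions_per_second * avg_transaction_kb * seconds_per_month / 1024 / 1024

def with_growth(value, monthly_growth_rate, months):
    """Project the same figure forward, since the load is supposed to grow."""
    return value * (1 + monthly_growth_rate) ** months

gb_now = monthly_bandwidth_gb(transactions_per_second=50, avg_transaction_kb=8)
print(round(gb_now))                          # ~989 GiB/month today
print(round(with_growth(gb_now, 0.05, 12)))   # ~1776 GiB/month after a year at 5% monthly growth
```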
Whether this is a production-grade database configuration is anyway a subjective question and can be debated.
The AWS Pricing Calculator is a good place to start for most of the factors which should be considered.
Build the system on your laptop. If it scales well enough there, get an RDS of similar specs.
If it clearly does not scale well enough, get an RDS one size up. Then work on optimizing things.
In some situations, you may have a slow query that just needs a better index. Keep in mind that "throwing hardware at a problem" is rarely optimal.
It is impossible to answer this question without knowing the exact specifics of the workload. However, it is unusual to have only 8GB of RAM for a 100GB database size. This puts you, optimistically, at about 5% ratio between buffer pool (cache) and data size, so unless the amount of hot data is surprisingly small in the specific workload that is intended, you will probably want at least double that amount of memory.
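A quick sketch of that cache-to-data ratio, assuming the common rule of thumb that the InnoDB buffer pool is set to roughly 75% of instance RAM (check your actual parameter group):

```python
# Quick check of the cache-to-data ratio mentioned above.
# Assumes the common rule of thumb that the InnoDB buffer pool gets
# roughly 75% of instance RAM; adjust for your actual parameter group.

def buffer_pool_ratio(ram_gb, data_gb, buffer_pool_fraction=0.75):
    """Fraction of the data set that can be held in the buffer pool."""
    return (ram_gb * buffer_pool_fraction) / data_gb

print(round(buffer_pool_ratio(8, 100), 3))    # ~0.06 -> about 5-6% of the data cached
print(round(buffer_pool_ratio(16, 100), 3))   # ~0.12 with double the RAM
```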

GCP Autoscale Down Protection

So I have a set of long running tasks that have to be run on Compute Engine and have to scale. Each task takes approximately 3 hours. So in order to handle this I thought about using:
https://cloud.google.com/solutions/using-cloud-pub-sub-long-running-tasks
architecture. And while it works fine, there is one huge problem: on scale down, I'd really like to avoid it scaling down a task that is currently running! I'd potentially lose 3 hours' worth of processing.
Is there a way to ensure that scaling down doesn't remove a VM with a long-running task / long uptime?
EDIT: A few people have asked me to elaborate on my task. It's similar to what's described in the link above: many long-running tasks that need to be run on a GPU. There is a chunk of data that needs to be processed; it takes about 4 hours (video encoding), and once completed it outputs to a bucket. Well, it can take anywhere from 1 to 6 hours depending on the length of the video. Just like the architecture above, it would be nice to have the cluster scale up based on queue size. But when scaling down, I'd like to ensure that it's not scaling down currently running tasks, which is what is currently happening. Being GPU-bound doesn't allow me to use the CPU metric.
I think you should probably add more details about what kind of task you are running. However, as John Hanley suggested, it is worth taking a look at Cloud Tasks, and also at the documentation that talks about the scaling risks.
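One detail of the Pub/Sub pattern described in the question that's worth keeping in mind: if a worker only acknowledges a message after the work completes, a task killed by scale-in is redelivered and retried rather than lost outright (the compute time is still wasted, which is the questioner's concern). A minimal sketch of such a worker, assuming the google-cloud-pubsub v2 client, with the subscription name and process_video() helper as hypothetical placeholders:

```python
# Sketch of a pull worker that acks only after the job finishes.
# Project, subscription, and process_video() are hypothetical placeholders.
import threading
from google.cloud import pubsub_v1

PROJECT_ID = "my-project"           # placeholder
SUBSCRIPTION_ID = "encoding-tasks"  # placeholder

subscriber = pubsub_v1.SubscriberClient()
sub_path = subscriber.subscription_path(PROJECT_ID, SUBSCRIPTION_ID)

def process_video(payload: bytes) -> None:
    """Hypothetical placeholder for the hours-long GPU encoding job."""
    ...

def handle_one_task():
    response = subscriber.pull(request={"subscription": sub_path, "max_messages": 1})
    for received in response.received_messages:
        done = threading.Event()

        def keep_lease_alive():
            # Re-extend the ack deadline every few minutes while the job runs,
            # otherwise Pub/Sub redelivers the message to another worker.
            while not done.wait(timeout=300):
                subscriber.modify_ack_deadline(request={
                    "subscription": sub_path,
                    "ack_ids": [received.ack_id],
                    "ack_deadline_seconds": 600,
                })

        threading.Thread(target=keep_lease_alive, daemon=True).start()
        try:
            process_video(received.message.data)   # hours of GPU work
        finally:
            done.set()
        # Ack only on success; an unacked message comes back after the deadline.
        subscriber.acknowledge(request={"subscription": sub_path,
                                        "ack_ids": [received.ack_id]})
```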

Which EC2 instance size should I use to serve 10K users [closed]

I'm modeling costs for a REST API for a mobile e-commerce application and want to determine the appropriate instance size and number of instances.
Our config:
- Apache
- Slim framework PHP
Our Estimation:
- Users/day: 10,000
- Page views/user: 10
- Total number of users: 500,000
- Total number of products: 36,000
That's an extremely difficult question to provide a concrete answer to, primarily because the most appropriate instance type is going to be based on the application requirements. Is the application memory intensive (use the R3 series)? Is it processing intensive (use the C4 series)? If it's a general application that is not particularly memory or processor intensive, you can stick with the M4 series, and if the web application really doesn't do much of anything besides serve up pages, maybe some database access, then you can go with the T2 series.
Some things to keep in mind:
The T2 series instances don't give you 100% of the processor. You are given a percentage of the processor (the base performance) plus credits to use if your application spikes. When you run out of credits, you are dropped back down to base performance. (A small sketch of how the credit math works follows the list below.)
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/t2-instances.html#t2-instances-cpu-credits
t2.nano --> 5%
t2.micro --> 10%
t2.small --> 20%
t2.medium --> 40%
t2.large --> 60%
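To see what those baseline percentages mean in practice, here is a small sketch; one CPU credit corresponds to one vCPU running at 100% for one minute, and the utilisation figures below are examples.

```python
# How T2 baseline percentages relate to CPU credits (one credit = one vCPU
# running at 100% for one minute). Credits are earned at the baseline rate;
# the percentages below are the aggregate baselines listed above.

BASELINE = {
    "t2.nano":   0.05,
    "t2.micro":  0.10,
    "t2.small":  0.20,
    "t2.medium": 0.40,
    "t2.large":  0.60,
}

def credits_earned_per_hour(instance_type):
    """60 vCPU-minutes per hour scaled by the baseline percentage."""
    return BASELINE[instance_type] * 60

def credit_balance_change_per_hour(instance_type, avg_cpu_utilisation, vcpus=1):
    """Positive = building credits, negative = burning them down."""
    spent = avg_cpu_utilisation * vcpus * 60
    return credits_earned_per_hour(instance_type) - spent

print(credits_earned_per_hour("t2.micro"))                # 6 credits/hour
print(credit_balance_change_per_hour("t2.micro", 0.05))   # +3: sustainable load
print(credit_balance_change_per_hour("t2.micro", 0.30))   # -12: will fall back to baseline
```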
Each of the instances in each of the EBS-backed series (excluding T2) offers a different maximum throughput to the EBS volume.
https://aws.amazon.com/ec2/instance-types/
If I had to wager a guess, for 100,000 page views per day, assuming the web application does not do very much other than generate the pages and maybe some DB access, I would think a t2.large would suffice, with the option of moving up to an m4.large, the smallest M4 instance.
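For a sense of scale, that traffic estimate translates into a fairly modest request rate; the peak factor and API-calls-per-page below are assumptions for illustration.

```python
# Translating the traffic estimate above into requests per second.
# The API-calls-per-page figure and peak factor are assumptions.

def requests_per_second(users_per_day, pages_per_user, api_calls_per_page=1,
                        peak_factor=1.0):
    daily_requests = users_per_day * pages_per_user * api_calls_per_page
    return daily_requests * peak_factor / (24 * 3600)

print(round(requests_per_second(10_000, 10), 2))                  # ~1.16 req/s average
print(round(requests_per_second(10_000, 10, peak_factor=10), 1))  # ~11.6 req/s at a 10x peak
print(round(requests_per_second(10_000, 10, api_calls_per_page=5,
                                peak_factor=10), 1))              # ~57.9 req/s if each page makes 5 API calls
```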
But, this all defeats the wonders of AWS. Just spin up an instance and try it for a few days. If you notice it's failing, figure out why (processes taking too long, out of memory errors, etc.), shut down the instance and move up to the next instance.
Also, AWS allows you to easily build fault tolerance into your architecture and to scale OUT, so if you end up needing 4 processors and 16 GB of memory (1 x m4.xlarge instance), you may do just as well to have 2 x m4.large instances (2 processors and 8 GB of memory each) behind a load balancer. Now you have 2 instances with the same total specs and roughly the same cost (I think it's marginally cheaper, actually).
You can see instance pricing here:
https://aws.amazon.com/ec2/pricing/
You can also put together your (almost) entire AWS architecture costs using this calculator:
http://calculator.s3.amazonaws.com/index.html

Can the way a site is coded affect how much we spend on hosting? [closed]

Our website is an eCommerce store trading in ethically sourced loose diamonds. We do not get much traffic, and yet our Amazon bill is huge ($300/month for 1,500 unique visits). Is this normal?
I do know that we pull data from another source into our database twice a day and that the files are large. Does it make sense to just use regular hosting for this process and keep Amazon just for our site?
Most of the cost is for Amazon Elastic Compute Cloud. About 20% is for RDS service.
I am wondering if:
(a) our developers have done something which leads to this kind of usage OR
(b) Amazon is just really expensive
Is there a paid-for service which we can use to ensure our site is optimised for its hosting, in terms of cost, usage and speed?
It should probably cost you around 30-50 dollars a month; 300 seems higher than necessary.
For 1,500 visitors, you can most likely get away with using an m1.small instance.
I'd say check out the AWS Trusted Advisor service, which will tell you about your utilization and where you can optimize your usage, but you can only get that with AWS Business support ($100/month). However, considering you're way over what is expected, it might be worth looking into.
Trusted advisor will inform you of quite a few things:
cost optimization
security
fault tolerance
performance
I've generally found it to be one of the most useful additions to my AWS infrastructure.
Additionally, if you were to sign up for Business support, not only do you get Trusted Advisor, but you can also ask questions directly to the support staff via chat, email, or phone. That would also be quite useful to help you pinpoint your problem areas.