Request Limit Per Second on GCP Load Balancer in front of Storage Bucket website

I want to know the limit of requests per second for the Load Balancer on Google Cloud Platform. I couldn't find this information in the documentation.
My project is a static website hosted in a Storage Bucket behind the Load Balancer, with CDN enabled.
The website will be promoted in a campaign on a television channel, and the estimate is 100k requests per second for 5 minutes.
Could anyone help me with this information? Is it necessary to ask Support to pre-warm the load balancer before the campaign starts?

From the front page of GCP Load Balancing:
https://cloud.google.com/load-balancing/
Cloud Load Balancing is built on the same frontend-serving
infrastructure that powers Google. It supports 1 million+ queries per
second with consistent high performance and low latency. Traffic
enters Cloud Load Balancing through 80+ distinct global load balancing
locations, maximizing the distance traveled on Google's fast private
network backbone.
This seems to say that 1 million+ requests per second are fully supported.
However, with all that said, I wouldn't wait for "the day" before testing. See if you can rehearse with a suitable load. Given that this sounds like a finite event with high visibility (television), I'm sure you don't want to wait for the event only to find out that something was wrong in the setup or the theory. From the perspective of "can a load balancer handle 100K requests per second", the answer appears to be yes.
If you are (or are asking on behalf of) a GCP customer, Google has Technical Account Managers associated with accounts who can be brought into the planning loop, especially if there are questions of the "can we do this" kind. One should always be cautious about sudden high-volume demands on GCP resources. Again, it does no harm to pre-warn Google, through a Technical Account Manager, of large resource requests. For example, if you needed an extra 5000 Compute Engine instances, you might be constrained in which regions are available to you, given finite existing capacity. Google, just like other public cloud providers, has to schedule and balance resources in its regions. Timing is also very important. If you need a sudden burst of resources and the time you need them happens to coincide with an event such as Black Friday (US) or Singles Day (China), special preparation may be needed.
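As a rough sanity check on the numbers in this thread, the arithmetic can be written down as a small sketch. The function name and the idea of a "headroom factor" are illustrative assumptions, and the 1 million figure is only the marketing number quoted above, not a guaranteed quota.

```python
def campaign_totals(rps: int, duration_s: int, advertised_qps: int) -> dict:
    """Return the total request count for the campaign and how far the
    expected rate sits below the advertised queries-per-second figure."""
    return {
        "total_requests": rps * duration_s,
        "headroom_factor": advertised_qps / rps,
    }


# 100k requests per second for 5 minutes, against the quoted "1 million+" figure.
totals = campaign_totals(rps=100_000, duration_s=5 * 60, advertised_qps=1_000_000)
print(totals)  # {'total_requests': 30000000, 'headroom_factor': 10.0}
```

The same total (30 million requests in 5 minutes) is a useful target when sizing a rehearsal load test.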

Related

How much RAM do I need in cloud hosting for a 100 KB website?

I have a simple one-page website that fetches trending videos from YouTube using the YouTube API, and the size of the website is just 100 KB. The website is built with HTML, CSS and PHP. I want to host it on a good cloud hosting service. If I get 10,000 daily visitors to my website, are 1 GB of RAM and a 1-core CPU sufficient?

Nobody can answer this question for you because every application is different, and much of it depends on the patterns of your particular user base.
The only way to know the requirements is to deploy the system and then simulate user traffic. Monitor the system to identify stress points, which could be RAM, CPU or Network. You can then adjust the size of the instance accordingly, and even change Instance Type to obtain a different mix of RAM and CPU.
Alternatively, just deploy something and monitor it closely. Then, adjust things based on usage patterns you see. This is "testing in production".
You could also consider using Amazon EC2 Auto Scaling, which can automatically launch new instances to handle an increased load. This way, the resources vary based on usage. However, this design would require a Load Balancer in front of the instances.
Then, if you want to get really fancy, you could simply host a static web page from Amazon S3 and have the page make API calls to a backend hosted in AWS Lambda. The Lambda function will automatically scale by running multiple functions in parallel. This way, you do not require any Amazon EC2 instances and you only pay for resources when somebody actually uses the website. It would be the cheapest architecture to run. However, you would need to rewrite your web page and back-end code to fit this architecture.
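Before simulating traffic, it can help to turn "10,000 daily visitors" into a request rate to aim the simulation at. The sketch below is only back-of-the-envelope arithmetic: the requests-per-visit value and the peak factor are hypothetical assumptions, not measurements, and the result says nothing by itself about RAM or CPU needs.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rps(daily_visitors: int, requests_per_visit: int) -> float:
    """Average requests per second if visits were spread evenly over a day."""
    return daily_visitors * requests_per_visit / SECONDS_PER_DAY


def peak_rps(daily_visitors: int, requests_per_visit: int, peak_factor: float) -> float:
    """Scale the average by an assumed peak-to-average ratio."""
    return average_rps(daily_visitors, requests_per_visit) * peak_factor


# Hypothetical inputs: 5 requests per visit, peaks 10x the daily average.
avg = average_rps(10_000, 5)
peak = peak_rps(10_000, 5, 10.0)
print(f"average: {avg:.2f} req/s, assumed peak: {peak:.2f} req/s")
```

Under these assumptions the load is well under ten requests per second even at peak, which is the rate a load test would need to reproduce and exceed.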

Cloud Run Web Hosting limitation

I'm considering Cloud Run for web hosting rather than a more complex Compute Engine setup.
I just want to build an API with Node.js. I heard that automatic load balancing is also available. If so, is there any problem with concurrent traffic from 1 million people without any configuration? (The database server is somewhere else and is serverless, like CockroachDB.)
Or do I have to configure various complicated settings, as with AWS EC2 or GCE?

For such traffic, the out-of-the-box configuration must be fine-tuned.
First, the concurrency parameter on Cloud Run. This parameter indicates how many concurrent requests each instance can handle. The default value is 80, and you can set it as high as 1000 concurrent requests per instance.
Of course, if you handle 1000 concurrent requests per instance (or even fewer), you will likely need more CPU and memory. You can adjust those parameters as well.
You also have to change the max instances limit. By default, you are limited to 1000.
If you set 1000 concurrent requests and 1000 instances, you can handle 1 million concurrent requests.
However, that leaves little margin, and an instance handling 1000 concurrent requests can struggle even with maximum CPU and memory.
You can request more than 1000 instances with a quota increase request.
You can also optimise differently, especially if your 1 million users aren't all in the same country/Google Cloud region. If so, you can deploy an HTTPS load balancer in front of your Cloud Run service and deploy the service in all the regions where your users are. (The Cloud Run services deployed in different regions must have the same name.)
That way, it's not a single service that has to absorb 1 million users, but several, in different regions. In addition, the HTTPS load balancer routes each request to the closest region, so you optimise latency and reduce egress/cross-region traffic.
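The capacity arithmetic in this answer (concurrency per instance times number of instances) can be sketched as follows. The helper names are illustrative, the concurrency and instance figures are the ones quoted in the answer, and the even four-region split is a hypothetical example.

```python
import math


def max_concurrent_requests(concurrency: int, max_instances: int) -> int:
    """Upper bound on simultaneous requests for one Cloud Run service."""
    return concurrency * max_instances


def instances_needed(target_concurrent: int, concurrency: int) -> int:
    """Instances required to hold a target number of concurrent requests."""
    return math.ceil(target_concurrent / concurrency)


# Figures quoted in the answer: concurrency 80 by default, up to 1000.
print(max_concurrent_requests(1000, 1000))  # 1000000
print(instances_needed(1_000_000, 80))      # 12500
print(instances_needed(1_000_000, 1000))    # 1000

# Hypothetical: users split evenly across 4 regions behind the load balancer.
print(instances_needed(1_000_000 // 4, 1000))  # 250
```

With the default concurrency of 80, one million concurrent requests would need far more instances than the limit mentioned above, which is why the answer suggests raising concurrency, requesting more quota, or spreading the load across regions.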

Azure VM Inbound Throttling to VMs?

We have 2 Elastic VMs (Linux, currently DS2V2) behind an Azure Load Balancer. We are doing HTTP POSTs from our local LAN to the Load Balancer, but we seem to be getting throttled. We have tried: changing the size of the VMs, no difference; adding additional premium SSDs, again no difference; running multiple threads on our end, again no difference.
What we did do, though, was have the Elastic engine ingest all of the log files from the Linux boxes, and the indexing rate jumped pretty high while it was ingesting them. So we are assuming that it's not really the Linux Elastic boxes that are throttling us.
We do have Kibana installed on the boxes, and as a base line, we're just using the "Cluster Indexing Rate" for both our local posts to the box, and the local ingestion of the log files.
We do understand that yes, there is going to be some latency and overhead since we are now involving the internet, but not the rates we are currently getting. (We have a 1G pipe to the internet, it's nowhere near capacity, so we can rule out at least getting out of our company).
The question is, where else can we look to determine where we might be getting throttled?

As for the performance being "MUCH slower", that is a somewhat subjective question and hard to pin down. I can only provide some information about what may affect it.
Azure Compute requests may be throttled per subscription and per region. If you get an API throttling error, you could refer to this document to troubleshoot throttling issues and for best practices to avoid being throttled.
Some factors, such as the CPU and storage limits that differ between Azure VM sizes, may affect how well the VM can process incoming data. You may change to a size with more CPU and a premium SSD disk. You could also move the Azure resources to another region that is closer to your location. You could refer to this article.

AWS Network out

Our web application has 5 pages (Signin, Dashboard, Map, Devices, Notification)
We have done the load test for this application, and load test script does the following:
Signin and go to Dashboard page
Click Map
Click Devices
Click Notification
We have a basic free plan in AWS.
While performing the load test, up to about 100 users, we didn't get any errors. Please see the image below. NetworkIn and CPUUtilization seem to be normal, but NetworkOut showed 846K.
But when we reached around 114 users, we started getting errors on the Map page (highlighted in red). During that time, it seems only NetworkOut was high. Please see the image below.
We want to know what the optimal value for NetworkOut is. If this number is high, is there any way to reduce it?
Please let me know if you need more information. Thanks in advance for your help.

You are using a t2.micro instance.
This instance type has limitations on CPU that means it is good for bursty workloads, but sustained loads will consume all the available CPU credits. Thus, it might perform poorly under sustained loads over long periods.
The instance also has limited network bandwidth that might impact the throughput of the server. While all Amazon EC2 instances have limited allocations of bandwidth, the t2.micro and t2.nano have particularly low bandwidth allocations. You can see this when copying data to/from the instance and it might be impacting your workloads during testing.
The t2 family, especially at the low-end, is not a good choice for production workloads. It is great for workloads that are sometimes high, but not consistently high. It is also particularly low-cost, but please realise that there are trade-offs for such a low cost.
See:
Amazon EC2 T2 Instances – Amazon Web Services (AWS)
CPU Credits and Baseline Performance for Burstable Performance Instances - Amazon Elastic Compute Cloud
Unlimited Mode for Burstable Performance Instances - Amazon Elastic Compute Cloud
That said, the network throughput showing on the graphs is a result of your application. While the t2 might be limiting the throughput, it is not responsible for the spike on the graph. For that, you will need to investigate the resources being used by the application(s) themselves.
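The CPU credit behaviour described above can be illustrated with a small calculation. The figures used here (a t2.micro earning 6 credits per hour with a maximum balance of 144, and one credit equal to one vCPU at 100% for one minute) are taken from the AWS burstable-instance documentation as I understand it and should be treated as assumptions to verify; the function itself is only a sketch.

```python
def hours_until_credits_exhausted(
    balance: float,
    earn_per_hour: float,
    utilization_percent: int,
    vcpus: int = 1,
) -> float:
    """Estimate how long an instance can sustain a CPU utilization level
    before its credit balance reaches zero.

    One credit is assumed to be one vCPU at 100% for one minute, so a fully
    busy vCPU spends 60 credits per hour.
    """
    spend_per_hour = 60 * utilization_percent * vcpus / 100
    net_drain = spend_per_hour - earn_per_hour
    if net_drain <= 0:
        # At or below the baseline the balance never runs out.
        return float("inf")
    return balance / net_drain


# Assumed t2.micro figures: full balance of 144 credits, 6 credits earned per hour.
print(hours_until_credits_exhausted(144, 6, 100))  # about 2.67 hours at 100% CPU
print(hours_until_credits_exhausted(144, 6, 10))   # inf (10% is the baseline)
```

A load test that runs longer than the first figure could therefore see the instance slow down for reasons unrelated to the application.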

NetworkOut simply refers to the volume of outgoing traffic from the instance. To reduce NetworkOut, you need to reduce the traffic sent from this instance. So you may need to check which of the steps (Click Map, Click Devices, Click Notification) is sending the most traffic out of the instance. The number may not depend only on the number of users, but on a combination of the number of users and the application module.

AWS Server Configuration

I'm new to AWS. For one project we need to purchase a server on AWS, and I don't know what configuration is required. Our website will be similar to https://www.justdial.com/ and at least 1000 users will be online on the website at any time. What configuration would be best at minimum cost? These are the details of what we want:
> • 1 - Elastic IP
> • 1 - Load Balancer
> • 2 - Webserver + autoscaling
> • 1 - Database SQL
> • 1 - S3 storage backup
> • CDN
If anything else is missing, please guide me.

Go for a micro-service architecture and create a Lambda function for every service. You can use a private RDS instance for security. With a Lambda-based serverless approach you are charged per API request. Since requests drop to close to zero during the night, you won't be charged for that period. AWS Lambda automatically balances load and keeps the service available to everyone with minimal CPU and memory usage. You won't need your own load balancer, as AWS handles that by default.
Based on your requirements, using a VM won't be a good idea: most of the items you list, such as the load balancer and web server autoscaling, come at no extra cost with serverless Lambda, and using RDS will keep your database cost to a minimum compared with owning a VM and scaling VM resources.

It really depends on your application. If all you do is return static pages, you might be fine with the smallest instance and CDN like CloudFront. If every request is dynamic and takes massive computations, you need some strong servers.
I suggest you start with some reasonable settings (e.g. t3.medium) and then load-test it to figure out what you really need. There are many tools for that. You basically need something that will generate a lot of requests to your servers and track errors, latency and total response time. If any of those metrics come back insufficient (this also depends on your needs), add more resources. Make sure to leave room for traffic spikes.
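Whatever load-testing tool is used, its output boils down to the metrics mentioned above. A minimal sketch of summarising such results follows; the sample format (latency in seconds plus a success flag), the function name and the sample values are all hypothetical, and the percentile uses the simple nearest-rank method.

```python
def summarize(samples: list[tuple[float, bool]], percentile: int = 95) -> dict:
    """Summarise load-test samples given as (latency_seconds, succeeded) pairs."""
    if not samples:
        raise ValueError("no samples to summarise")
    latencies = sorted(latency for latency, _ in samples)
    errors = sum(1 for _, ok in samples if not ok)
    # Nearest-rank percentile, computed with integer arithmetic (ceiling division).
    rank = (percentile * len(latencies) + 99) // 100
    return {
        "requests": len(samples),
        "error_rate": errors / len(samples),
        "percentile_latency_s": latencies[rank - 1],
        "max_latency_s": latencies[-1],
    }


# Hypothetical results: 18 fast requests, one slow one, one slow failure.
results = [(0.1, True)] * 18 + [(0.5, True), (2.0, False)]
report = summarize(results)
print(report)
# {'requests': 20, 'error_rate': 0.05, 'percentile_latency_s': 0.5, 'max_latency_s': 2.0}
```

If the error rate or the percentile latency exceeds what the site can tolerate, that is the signal to add resources and run the test again.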
