CloudRun is going over max instance count (of 1) - google-cloud-platform

I have a cloud run service (receiving no traffic) where I set the max instances and min instances to 1 so that it's always running.
When deploying a new instance, the instance count jumps to 3. This is a problem (I make some requests on instance start that hits a 429 if two instances are simultaneously making these requests).
Why is CloudRun instance count going over my max?
I can confirm my settings are correct and looking at the logs there are two new instances that start up.
PS: Cloudrun does have this message, which makes me think what I'm trying to do isn't possible. I just figured it would be because of downtime instead of extra instances.
Revisions using a maximum number of instances of 3 or less might experience unexpected downtime.

Your scenario seems to fit one described in the documentation, in which a new deployment for a Cloud Run service might temporarily create additional instances:
When you deploy a new revision, Cloud Run gradually migrates traffic from the old revision to the new one. Because maximum instance limits are set for each revision, you may temporarily exceed the specified limit during the period after deployment.
Additional instances might be created to handle traffic migration from the previous revision to a new one. Cloud Run offers settings that can alter how migration occurs between revisions. One of these settings is used to instantly serve all new traffic on the new revision. You can test if using this setting helps reduce the number of instances that are created. I tested one of the provided sample services and created multiple revisions, which did not exceed 1 active instance.

I read that it is a cloud run problem you need 10 instances and then the error disappears.
I understood that it is a bug in golang that does not know how to do it with two instances, but with 10 instances it should be fixed.
https://github.com/ahmetb/cloud-run-faq/issues/54

Related

Google Cloud cancel replacing instances in group

How is it possible to cancel replacing instances in instance group without website going offline? We have a managed instance group of compute engine instances, we start replace operation with maximum unavailable instances set to 0, if new instance for some reason doesn't become healthy there is an option to remove instance. However it removes all instances making website to go down until a new instance is created. Is it supposed to happen?
This seems to be expected behavior. Please have a look at rolling update in order to update your instance. Striving for 0 downtime could be achievable. To make your servers more responsive and less disruptive, you may consider the following two strategies:
Max surge. This will create instances above the target size in order to speed up the update process.
Opportunistic update. “An opportunistic update is only applied when new instances are created by the managed instance group”

Updating an AWS ECS Service

I have a service running on AWS EC2 Container Service (ECS). My setup is a relatively simple one. It operates with a single task definition and the following details:
Desired capacity set at 2
Minimum healthy set at 50%
Maximum available set at 200%
Tasks run with 80% CPU and memory reservations
Initially, I am able to get the necessary EC2 instances registered to the cluster that holds the service without a problem. The associated task then starts running on the two instances. As expected – given the CPU and memory reservations – the tasks take up almost the entirety of the EC2 instances' resources.
Sometimes, I want the task to use a new version of the application it is running. In order to make this happen, I create a revision of the task, de-register the previous revision, and then update the service. Note that I have set the minimum healthy percentage to require 2 * 0.50 = 1 instance running at all times and the maximum healthy percentage to permit up to 2 * 2.00 = 4 instances running.
Accordingly, I expected 1 of the de-registered task instances to be drained and taken offline so that 1 instance of the new revision of the task could be brought online. Then the process would repeat itself, bringing the deployment to a successful state.
Unfortunately, the cluster does nothing. In the events log, it tells me that it cannot place the new tasks, even though the process I have described above would permit it to do so.
How can I get the cluster to perform the behavior that I am expecting? I have only been able to get it to do so when I manually register another EC2 instance to the cluster and then tear it down after the update is complete (which is not desirable).
I have faced the same issue where the tasks used to get stuck and had no space to place them. Below snippet from AWS doc on updating a service helped me to make the below decision.
If your service has a desired number of four tasks and a maximum
percent value of 200%, the scheduler may start four new tasks before
stopping the four older tasks (provided that the cluster resources
required to do this are available). The default value for maximum
percent is 200%.
We should have the cluster resources available / container instances available to have the new tasks get started so they can start and the older one can drain.
These are the things i do
Before doing a service update add like 20% capacity to your cluster. You can use the ASG (Autoscaling group) commandline and from the desired capacity add 20% to your cluster. This way you will have some additional instance during deployment.
Once you have the instance the new tasks will start spinning up quickly and the older one will start draining.
But does this mean i will have extra container instances ?
Yes, during the deployment you will add some instances but as the older tasks drain they will hang around. The way to remove them is
Create a MemoryReservationLow alarm (~70% threshold in your case) for like 25 mins (longer duration to be sure that we have over commissioned). As the reservation will go low once you have those extra server not being used they can be removed.
I have seen this before. If your port mapping is attempting to map a static host port to the container within the task, you need more cluster instances.
Also this could be because there is not enough available memory to meet the memory (soft or hard) limit requested by the container within the task.

How do I set the instance count of a Elastic Beanstalk environment to 0?

In other words I would like to temporarily turn off an environment (and its associated billing costs) but not delete it entirely.
It seems if I set the [Configuration > Web Tier > Scaling > Minimum instance count] to 0 along with the related "Maximum instance count", AWS rejects those settings as invalid. Ditto for questionable values like 0.1.
Any ideas for temporarily taking an Elastic Beanstalk environment out of service?
You can achieve this by defining time periods in your environment.
from your Elastic Load blanancer dashboard navigate through: Configuration -> Capacity -> Time-based scaling -> Scheduled actions.
You may choose to use multiple scheduled actions to start and stop your environment.
i.e. One to start your environment and another to stop it, or scale out during high load periods and back in after.
In the background, Elastic Beanstalk will terminate and recreate the EC2. This means you'll lose any data on the instances local storage.
I have used this to turn my environments off when they are no longer needed. For example, I have a workload that crawls looking for new content; and I pause this crawling after it has once completed. I also use this feature to spin down my nonproduction environments outside of working hours.
I prefer this method to others as it is supported by both single instance and load-balanced environments.
How about eb scale 0 using the Elastic Beanstalk CLI tools?
I don't believe this is possible. I any case, you would still have a load balancer in front for which you pay, so you're not really reducing the costs for this environment.
It makes more sense to delete the environment completely create a script that will deploy a version from your application versions to a new environment.
I now use the ParkMyCloud service to suspend Elastic Beanstalk environments (and also RDS database instances) that I'm not using, both manually and in scheduled fashion.
Although ParkMyCloud is non-free, it's saved me a lot more money than I've spent using it. It's worked quite well over the last few years so I'm comfortable recommending it.

Elastic Beanstalk rolling update timeout not honored

I am trying to achieve zero down time redeploys on AWS elastic beanstalk.
I basically have two instances on my environment coupled with Jenkins for CI (Using Tomcat).
What I am trying to achieve is each time I trigger a redeploy from Jenkins, only one instance of the environment is redeployed, then have a timeout to allow the new instance to load the application and then redeploy the second instance.
In order to achieve that timeout I am setting both the "Pause time" and "Command timeout" but unfortunately its if though this limit is not honored.
The first instance is redeployed but after around 1 minute the second instance is redeployed regardless of the timeout value I set.
Have anyone archived this? any insights on how to achieve it?
"Pause Time" relates to environmental configuration made to instances. "Command timeouts" relates to commands executed to building the environment (for example if you've customised the container). Neither have anything to do with rolling application updates or zero downtime deployments. The documentation around this stuff is confusing and fragmented.
For zero downtime application deployments, AWS EB give you two options:
Batch application updates
Running two environments and cutting over
Option 1 feels like a lot less work but in my testing hasn't been truly zero downtime. There is a HARDCODED timeout value where traffic will be routed to instances after 1 minute regardless of whether the load balancer healthcheck passes or not.
But if you still want to go ahead then running two instances and setting a batch size of 50% or 1 should give you want you want.

Is it possible to auto scale with amazon web services, with ever changing AMI's?

Curious if this is possible:
We have a web application that at MOST times, works just fine with our single small instance. However, when we get multiple customers running simultaneously intense queries (we are a cloud scheduling service); our instance bogs way down to near 80% cpu load and becomes pretty unresponsive.
Is there a way to have AWS fire up another small instance (or a few), quickly, only for the times that its operating under this intense load? BUT, the real question is how does this work when we have very frequent programming updates to our application? Do we have to manually create a new image everytime we upload a code change???
Thanks
You should never be running anything important on a single EC2 instance. Instances can--and do--go offline randomly. Always use an autoscaling (AS) group that spans multiple availability zones. An AS group will automatically bring new instances online when you hit a certain trigger (in your case, CPU utilization). And then it will scale down the instances when traffic subsides. Autoscaling is the heart and soul of AWS and if you're not using it, you might as well be using a cheaper (and more durable) VPS host.
No, you don't want to be creating a new AMI for each code release. Ideally you should use a base AMI (like one of Amazon's official ones) and then have it auto-provision at boot. You can use the "user data" field when you launch an AMI to bootstrap this process. It can be as simple as a bash script that pulls from your Git repo to as something as sophisticated as Puppet or Chef.
The only time I create custom AMI's is if the provisioning process just takes too long. However that can almost always be solved by storing the needed files in S3.