Why do my #AWS EC2 spot instance requests get stuck in pending-evaluation status? - amazon-web-services

When I try to launch an EC2 spot instance, the instance almost immediately goes into status = pending-evaluation and stays there indefinitely.
My bid price is far above the current spot price, and I have no trouble launching dedicated instances.
Why is this happening? Has anyone had a similar problem?

Can't answer the "why" but regarding "has anyone had a similar problem" - yes, many people have had this same issue over many years.
Search the AWS support forums for "pending-evaluation" and you'll have a lot of threads to read up on: https://forums.aws.amazon.com/search.jspa?mbtc=70747e76394e723be38d774e355fd542ab723cc7d767e298a5626f55fd475590&threadID=&q=%22pending-evaluation%22&objID=&userID=&dateRange=all&numResults=30&rankBy=9
Notably, quite a few of them have had responses from AWS support stating something along the lines of "we found an issue with your account, it should be fixed now". However, many posts have remained unanswered, so unless you're paying for support it seems you might need luck on your side to get your issue resolved.
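Before opening a support case, it can help to read the status code and message attached to each spot request. Here is a minimal sketch, assuming boto3 is installed and credentials are configured. `summarize_spot_requests`, `stuck_requests` and `fetch_spot_requests` are hypothetical helper names; the response keys follow the shape returned by EC2's DescribeSpotInstanceRequests call.

```python
from typing import Any, Dict, List


def summarize_spot_requests(response: Dict[str, Any]) -> List[Dict[str, str]]:
    """Pull id, state, and status code/message out of a
    DescribeSpotInstanceRequests response dictionary."""
    summary = []
    for req in response.get("SpotInstanceRequests", []):
        status = req.get("Status", {})
        summary.append({
            "id": req.get("SpotInstanceRequestId", ""),
            "state": req.get("State", ""),
            "code": status.get("Code", ""),
            "message": status.get("Message", ""),
        })
    return summary


def stuck_requests(summary: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only the requests whose status code is pending-evaluation."""
    return [r for r in summary if r["code"] == "pending-evaluation"]


def fetch_spot_requests(region_name: str) -> Dict[str, Any]:
    """Hypothetical wrapper: query EC2 for the account's spot requests."""
    import boto3  # imported lazily so the helpers above work without it

    ec2 = boto3.client("ec2", region_name=region_name)
    return ec2.describe_spot_instance_requests()
```

Calling `stuck_requests(summarize_spot_requests(fetch_spot_requests("us-east-1")))` would list the requests still waiting, together with the message AWS attached to them. The region name here is only an example.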

Related

Notebook instance cannot be reached for several days

As described in the title, I have been unable to reach a notebook instance for 3 days in a row. I also run into this issue occasionally.
Any time I try, it shows
does not have enough resources available to fulfill the request. Try a
different zone, or try again later.
And it never became available at all! Any suggestions? I thought about moving the instance to another region, but I am afraid the same issue could occur in other regions too, so that is not a solid solution.
I would suggest that you try creating the instance in another region/zone.
It would also be useful to know which zone you are trying to create the instance in.
Zones like asia-south1-a/b/c (Mumbai) are always used at full capacity and rarely have enough resources for creating new instances. There's another post describing a similar situation, with a more detailed answer that may solve your specific case or offer a workaround. Please let me know if this solves your issue.

Sorry, you've reached your maximum limit of Lightsail Instances : 2

In a normal account (no AWS Free Tier), when attempting to create more than two Lightsail instances in the same region I get
CreateInstances [eu-west-2]
Sorry, you've reached your maximum limit of Lightsail Instances : 2.
If you're new to Lightsail, please try again later. If the issue
persists, please contact Customer Support.
Thing is, the Service Quotas page says that the Number of instances per Region is 20.
I can see that I can request an increase in this limit, and I could create the instance in a different location (I've tested it and that's allowed), but I want all services/products in the same region, so that's not an option for me.
Shouldn't I be allowed 20 per region? What am I missing here?
As stated in the error message, considering I'm new to Lightsail (less than one month of usage), I will "try again later" and see if that solves it.
Following John Rotenstein's suggestion, I went to AWS's Contact Page and, under Billing/Account Support, raised a case on 2020-10-03 with the following text
Hi AWS support team
In a normal account (no AWS Free Tier), when attempting to create more than two Lightsail instances in the same region I get
> "CreateInstances [eu-west-2]
> Sorry, you've reached your maximum limit of Lightsail Instances : 2. If you're new to Lightsail, please try again later. If the issue persists, please contact Customer Support."
Thing is, in the Service Quotas page can read that the Number of instances per Region is 20. Shouldn't I be allowed 20 per region? What am I missing here?
As stated in the error message, considering I'm new to Lightsail (less than one month of usage), I "tried again" after 8 hours and then after 21 hours but the problem remained and hence the question.
Attentively
Tiago Peres
and one day later I received a response that included
Thank you for reaching us regarding this matter, and we apologize for any inconvenience. In order to reach a resolution to this matter, I have engaged our Service Team to dive deep into this request.
Rest assured, I have shared the necessary details to make sure that the investigation is completed as effectively as possible, if there's any information missing from your end I will be reaching you directly.
Since I understand how important this is for you, I will be requesting periodical updates in order to ensure a prompt resolution. Once we have received information, we will be reaching back to you.
The problem is now solved. The limit of Lightsail instances has been updated successfully to 20 in the EU (London) region.
In my case, I submitted the support ticket asking for more instances. After a few NEXT buttons, the support guy over the chat told me:
"please check again".
He said there is a validation process for your account, after submitting the request for more instances. It takes a few minutes.
I just tried the same again, and it worked.

ZONE_RESOURCE_POOL_EXHAUSTED for DataFlow & DataPrep

Alright team...Dataprep running into BigQuery. I cannot for the life of me find out why I have the ZONE_RESOURCE_POOL_EXHAUSTED issue for the past 5 hours. The night before, everything was going great, but today, I am having some serious issues.
Can anyone give any insight into how to change the resource pool for Dataflow jobs with regard to Dataprep? I can't even get a basic column transform to push through.
Looking forward to anyone helping me with this because, honestly, this issue is one of those "just change this and maybe that will fix it and if not, maybe a few weeks and it'll work".
Here is the issue in screenshot: https://i.stack.imgur.com/Qi4Dg.png
UPDATE:
I believe part of my issue may be related to GCP Compute incident 18012, especially since it's a us-central based issue with the creation of instances.
The incident you mentioned was actually resolved on November 5th and was only affecting the us-central1-a zone. Seeing that your question was posted on November 10th and other users in the comments got the error in the us-central1-b zone, the error is not related to the incident you linked.
As the error message suggests, this is a resource availability issue. These scenarios are rare and are usually resolved quickly. If this ever happens in the future, using Compute Engine instances in other regions/zones will solve the issue. To do so using Dataprep, as mentioned in the comment, after the job is launched from Dataprep, you can re-run the job from Dataflow while specifying the region/zone you would like to run the job in.
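As a rough illustration of re-running the job in another zone, the sketch below builds a request body for launching a Dataflow template through the REST API. This assumes the Dataprep job is re-run from its Dataflow template. `template_launch_body` is a hypothetical helper, and the field names (`jobName`, `parameters`, `environment.workerZone`) are taken from the Dataflow template launch request as I understand it, so check them against the current API reference before relying on them. The region itself is not in the body; it would be part of the request URL.

```python
from typing import Any, Dict, Mapping, Optional


def template_launch_body(
    job_name: str,
    parameters: Mapping[str, str],
    worker_zone: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a launch request body, optionally pinning workers to a zone.

    The keys used here are assumptions based on the Dataflow REST API.
    """
    body: Dict[str, Any] = {
        "jobName": job_name,
        "parameters": dict(parameters),
    }
    if worker_zone:
        # Ask Dataflow to place worker VMs in a zone that still has capacity.
        body["environment"] = {"workerZone": worker_zone}
    return body
```

For example, `template_launch_body("rerun-job", {}, worker_zone="us-east1-b")` yields a body that asks for workers in us-east1-b instead of the exhausted zone. Both the job name and the zone are placeholders.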

not have enough resources available to fulfil the request try a different zone

not have enough resources available to fulfill the request try a different zone
All of my machines in a different zone
have the same issue and cannot run.
"Starting VM instance "home-1" failed.
Error:
The zone 'projects/extreme-pixel-208800/zones/us-west1-b' does not have enough resources available to fulfill the request. Try a different zone, or try again later."
I am having the same issue. I emailed Google and found out that this has nothing to do with quota. However, you can try reducing your instance's requirements (e.g. less RAM, fewer CPUs or GPUs). It might work if you are lucky.
Secondly, if you email Google again, you will get a message based on the following template.
Good day! This is XX from Google Cloud Platform Support and I'll be
glad to help you from here. First, my apologies that you’re
experiencing this issue. Rest assured that the team is working hard to
resolve it.
Our goal is to make sure that there are available resources in all
zones. This type of issue is rare, when a situation like this occurs
or is about to occur, our team is notified immediately and the issue
is investigated.
We recommend deploying and balancing your workload across multiple
zones or regions to reduce the likelihood of an outage. Please review
our documentation [1] which outlines how to build resilient and
scalable architectures on Google Cloud Platform.
Again, we want to offer our sincerest apologies. We are working hard
to resolve this and make this an exceptionally rare event. I'll be
keeping this case open for one (1) business day in case you have
additional question related to this matter, otherwise you may
disregard this email for this ticket to automatically close.
All the best,
XXXX Google Cloud Platform Support
[1] https://cloud.google.com/solutions/scalable-and-resilient-apps
So, if you ask me how long you can expect to wait and when this issue is likely to happen:
I waited 1.5 to 3 days on average.
On weekends (Friday to Sunday), during daytime EST, GCP is more likely to have unavailable resources.
Usually, when one instance has this issue, the others do too. For me, trying different regions was a waste of time. (But maybe I was just unlucky.)
The error message "The zone 'projects/[...]' does not have enough resources available to fulfill the request. Try a different zone, or try again later." always refers to a shortage of resources in a zone.
Google recommends spreading your workload across different zones to reduce the impact of these issues on your workload. Otherwise, there isn't much else to do other than wait or try another zone/region.
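The "wait or try another zone" advice can be automated with a small fallback loop. This is only a sketch: `create_in_first_available_zone` and `ZoneExhausted` are hypothetical names, and the `create` callable stands in for whatever actually starts the VM (a client library call or a gcloud invocation) and is expected to raise `ZoneExhausted` when it sees the "does not have enough resources" error.

```python
import time
from typing import Callable, Iterable, Optional


class ZoneExhausted(Exception):
    """Raised by the caller's create function when a zone has no capacity."""


def create_in_first_available_zone(
    create: Callable[[str], None],
    zones: Iterable[str],
    attempts: int = 3,
    delay_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Try each zone in order; after a full failed pass, wait and retry.

    Returns the zone that worked, or None if every attempt failed.
    """
    zone_list = list(zones)
    for attempt in range(attempts):
        for zone in zone_list:
            try:
                create(zone)
                return zone
            except ZoneExhausted:
                continue  # this zone is full, move on to the next one
        if attempt < attempts - 1:
            sleep(delay_seconds)  # every zone was full, so wait before retrying
    return None
```

Given the multi-day waits reported above, a long `delay_seconds` (or a scheduled job that calls this once in a while) is probably more realistic than a tight retry loop.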
I faced this issue yesterday [01/Aug/2020], when my GCP free credit ran out, and the steps below helped me work around it.
I was in the asia-south-c zone and moved to a US zone.
Go to Google Cloud Platform >>> Compute Engine.
Go to Snapshots >>> create a snapshot >>> select your Compute Engine instance.
Once the snapshot is complete, click on it.
You end up under "snapshot details". There, at the top, click create instance. Here you are basically creating an instance with a copy of your disk.
Select your new zone, remember to attach GPUs and reapply all previous settings, and give the instance a new name.
Click create. That's it: your instance should now be running in your new zone.
There is no need to worry about losing your configuration either.
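The console steps above have a rough command-line equivalent. The sketch below only builds the gcloud command lines and does not run them. `migration_commands` is a hypothetical helper, and it assumes the boot disk has the same name as the instance (the default when an instance is created without naming its disk). GPU and machine-type settings are not covered here and would need extra flags on the final command.

```python
from typing import List


def migration_commands(
    instance: str,
    source_zone: str,
    target_zone: str,
    snapshot: str,
    new_instance: str,
) -> List[List[str]]:
    """Return gcloud commands that snapshot a disk and recreate the VM
    in another zone. Assumes the boot disk is named after the instance."""
    new_disk = f"{new_instance}-disk"
    return [
        # 1. Snapshot the existing boot disk.
        ["gcloud", "compute", "disks", "snapshot", instance,
         f"--zone={source_zone}", f"--snapshot-names={snapshot}"],
        # 2. Create a new disk from the snapshot in the target zone.
        ["gcloud", "compute", "disks", "create", new_disk,
         f"--source-snapshot={snapshot}", f"--zone={target_zone}"],
        # 3. Create a new instance that boots from that disk.
        ["gcloud", "compute", "instances", "create", new_instance,
         f"--zone={target_zone}", f"--disk=name={new_disk},boot=yes"],
    ]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` in order, once the names and zones have been reviewed.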

Cannot Extend GPU Quota on Google Cloud

I am using Google Cloud for development and training of deep neural networks. I've reached the limits of what I can do with CPUs and now need to create an instance with one or more GPUs.
I've followed the instructions from multiple sources. As the instance was being created I received a notification that my quota for my region (us-west1) was zero and to request an increase.
I did so and received the confirmation email within minutes. However, when I then attempted to recreate the instance I was again met with the quota increase error.
I submitted another request (same region) but heard nothing.
I tried in a different region, again requesting a quota increase, but heard nothing. I did this 6 times and -- as you might have guessed -- neither received a confirmation email nor was I able to create my instance.
I tried the hack of using Chrome in Incognito mode, but no joy.
This was an issue a few months ago, at least judging from the S/O and Google forum posts. I would think that by now it would be fixed.
Any help would be much appreciated as I'm totally stuck
NB: Cross-posted to the gce-discussion forum
I think you should contact Google Cloud Platform Support for this kind of issue.
Open a case asking why your quota increase has not been applied. I am sure they will solve this within a few days, or at least tell you why your request was declined.
Note that, quoting the official documentation, "Free Trial accounts do not receive GPU quota by default."
Disclaimer: I work for Google Cloud Support.
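To see whether a quota increase was actually applied, one option is to look at the region's quota list directly. Here is a minimal sketch, assuming the input is the JSON printed by `gcloud compute regions describe us-west1 --format=json`, which includes a `quotas` list with `metric`, `limit` and `usage` entries. `gpu_quotas` is a hypothetical helper name.

```python
import json
from typing import Any, Dict, List


def gpu_quotas(region_json: str) -> List[Dict[str, Any]]:
    """Return the quota entries whose metric name mentions GPUs."""
    region = json.loads(region_json)
    return [
        quota
        for quota in region.get("quotas", [])
        if "GPU" in quota.get("metric", "")
    ]
```

If every returned entry still shows a limit of 0, the increase has not been applied to that region, which is worth quoting when opening the support case.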