My EC2 instance had 8 GB of storage, which was not enough, so I decided to expand it to 15 GB (I am using the free tier). I waited for the process to finish, but I lost all access to the instance. The connection times out.
I waited several more hours, but nothing changed. The status checks pass, and when I take an instance screenshot I see that it is sitting at the login screen.
I couldn't fix the issue. I tried restarting the instance multiple times and checked the network settings; everything was fine there. I then swapped the instance's volume and concluded that the original volume was probably corrupted, because SSH worked fine on the same instance with a different volume. I ended up setting up a new instance.
Bit panicky here because I can't troubleshoot the error on a production site and it appears to be completely down.
GCP - Compute Engine VM - N1-standard on the US-West-3C zone running a Bitnami Multisite Wordpress deployment
About 2 hours ago my VM stopped responding (as far as I could tell with monitoring tools) and I was unable to SSH into it or connect in any way. I've experienced this occasionally in the past so my process was to grab a snapshot and restart the VM. I did manage to get the snapshot, however it stopped the VM by itself and I'm now stuck where I can't restart the VM.
The error I'm getting is:
Failed to start name-of-vm: A n1-standard-1 VM instance is currently unavailable in the us-west3-c zone. Alternatively, you can try your request again with a different VM hardware configuration or at a later time. For more information, see the troubleshooting documentation.
I tried changing my configuration (it used to be a custom VM) but that didn't do anything.
Searching for similar errors I've found threads about certain Zones running out of resources, but as far as I can tell this error doesn't specifically say 'run out of resources' and the status of the US-West-3C zone is fine. I can't imagine it would run out in a way where it can't even start a measly n1 vm.
Unfortunately due to some mismanagement this project isn't umbrella'd in our Google Workspace/Organization so I can't request technical support for it.
Any assistance or help pointing to some resources would be greatly appreciated.
"Currently unavailable" in a specific zone can also mean that the zone has run out of resources for that machine type.
You can try restoring the snapshot you created onto a different machine type, such as an e2-standard or n2-standard configuration.
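As a rough sketch of that suggestion, the snippet below only builds the gcloud commands for creating a disk from the snapshot and booting a new VM from it on an alternative machine type. Nothing is executed. The disk name, VM name, snapshot name, and the specific types e2-standard-2 and n2-standard-2 are placeholders I picked for illustration, and whether a given type is actually available in your zone is something you would have to check.

```python
# Sketch: build (not run) gcloud commands that restore a snapshot onto a
# different machine type. All resource names here are hypothetical.

def restore_commands(snapshot, zone, machine_type,
                     disk="restored-disk", vm="restored-vm"):
    """Return two gcloud invocations as argument lists."""
    create_disk = [
        "gcloud", "compute", "disks", "create", disk,
        f"--source-snapshot={snapshot}",
        f"--zone={zone}",
    ]
    create_vm = [
        "gcloud", "compute", "instances", "create", vm,
        f"--machine-type={machine_type}",
        f"--zone={zone}",
        f"--disk=name={disk},boot=yes",
    ]
    return [create_disk, create_vm]


def candidate_plans(snapshot, zone,
                    machine_types=("e2-standard-2", "n2-standard-2")):
    """Map each fallback machine type (assumed examples) to its commands."""
    return {mt: restore_commands(snapshot, zone, mt) for mt in machine_types}


if __name__ == "__main__":
    for commands in candidate_plans("my-snapshot", "us-west3-c").values():
        for command in commands:
            print(" ".join(command))
```

Printing the commands first lets you review them before pasting one plan into a shell, and try the next type if the first one reports the same availability error.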
I have created an environment in AWS Cloud9 with a Python Lambda function.
This was working fine and for several days I was adding functionality.
However one day the environment failed to open. After several minutes of loading it displayed an error message:
This is taking longer than expected.
If you think there might be an issue, contact AWS Support.
It might be caused by VPC configuration issues.
Please check documentation:
https://docs.aws.amazon.com/cloud9/latest/user-guide/vpc-settings.html?icmpid=docs_ac9_console
I looked at the suggested link, but I don't think the VPC is the issue. I didn't make any changes to it. Moreover I am able to make new environments and open them.
Any ideas how to solve this?
Turns out the problem was the default t2.micro (1 GiB RAM) instance that is used to run Cloud9. I was probably running out of memory. Moving my environment to t2.small (2 GiB RAM) solved the problem.
Documentation on moving environments:
https://docs.aws.amazon.com/cloud9/latest/user-guide/move-environment.html
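If you can still get a shell on the instance, a quick way to test the out-of-memory theory is to look at /proc/meminfo. The sketch below parses that file's text and flags low available memory. The 10% threshold is an arbitrary assumption of mine, not a Cloud9 recommendation, and it relies on the MemAvailable field that Linux kernels have exposed since 3.14.

```python
# Sketch: flag memory pressure from /proc/meminfo-style text.
# The 10% default threshold is an arbitrary assumption.

def parse_meminfo(text):
    """Parse 'Key:   12345 kB' lines into a dict of integer kB values."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def memory_pressure(text, threshold=0.10):
    """Return True when available memory is below `threshold` of total."""
    info = parse_meminfo(text)
    return info["MemAvailable"] / info["MemTotal"] < threshold


# Usage on the instance itself:
#   with open("/proc/meminfo") as f:
#       print(memory_pressure(f.read()))
```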
I had the error message:
"This is taking longer than expected. The delay may be caused by high CPU usage in your environment, or your T2 or T3 instance is running out of burstable CPU capacity credits, or there are VPC configuration issues."
What solved it for me was attaching an internet gateway to a VPC and giving that VPC a public subnet.
I found this link useful, particularly the section that states the
VPC requirements for AWS Cloud9: https://docs.aws.amazon.com/cloud9/latest/user-guide/vpc-settings.html?icmpid=docs_ac9_console
I agree with the answer above but just to expand with details on what I did:
Created a VPC and attached an Internet Gateway to it.
Created a route table and associated it with the subnet.
Gave the route table one route for the subnet and another route to the Internet Gateway, which makes the subnet public.
This solved my problem.
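To verify the result of the steps above, you can check whether the subnet's route table really has a default route through an internet gateway. This is a minimal sketch that works on a dict shaped like one entry of the RouteTables list returned by boto3's describe_route_tables; the sample IDs are made up.

```python
# Sketch: decide whether a route table makes its subnet public, i.e. has an
# active 0.0.0.0/0 route whose target is an internet gateway (igw-...).
# Input is assumed to be shaped like a boto3 describe_route_tables entry.

def has_internet_route(route_table):
    for route in route_table.get("Routes", []):
        if (route.get("DestinationCidrBlock") == "0.0.0.0/0"
                and route.get("GatewayId", "").startswith("igw-")
                and route.get("State", "active") == "active"):
            return True
    return False


if __name__ == "__main__":
    # Hypothetical route table with a local route and a default route.
    table = {
        "Routes": [
            {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local",
             "State": "active"},
            {"DestinationCidrBlock": "0.0.0.0/0", "GatewayId": "igw-0abc",
             "State": "active"},
        ]
    }
    print(has_internet_route(table))
```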
My solution was different:
I changed the Region from N. Virginia to Ohio and that fixed the problem. However, it could have been a matter of timing, with N. Virginia having a problem at that moment.
There might be some hung processes that are eating up memory.
Reboot the instance and try reloading the environment.
Issue : after a given period of time (usually the time it takes for the initial status checks to complete) I can no longer access my EC2 instance via SSH. More specifically during the initial period, I have normal access to my instance via SSH, then it drops, and the machine becomes completely unreachable, even when trying to ping it.
I have double checked Security Group, VPC settings etc. but don't think that can be the issue as at one point in time I can access the machine.
The issue occurs on "vanilla" instances from very basic standard AMIs as well as with AMIs I run on other AWS accounts. I have tried various instance types / sizes, but the issue occurs again and again.
Any ideas welcome! Thanks in advance
Dan
There are many things that can cause this. Specifically, if you're supplying user data during startup it could encounter an issue and eliminate your ability to SSH if it is modifying the file system and mounts or changing permissions.
If you can save the underlying system volume you can remount it and check /var/log/boot.log and /var/log/cloud-init-output.log
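Once the volume is mounted on a rescue instance, something like the following can help skim those two logs for obvious trouble. It is only a sketch: the keyword list is my own guess at what is worth flagging, and the mount point is whatever path you attached the volume under.

```python
# Sketch: scan boot logs on a remounted system volume for suspicious lines.
# The keyword pattern is an assumption, not an exhaustive error catalogue.
import re
from pathlib import Path

PATTERN = re.compile(r"error|fail|traceback|permission denied", re.IGNORECASE)
LOG_FILES = ("var/log/boot.log", "var/log/cloud-init-output.log")


def suspicious_lines(text):
    """Return (line_number, line) pairs that match the keyword pattern."""
    return [(number, line)
            for number, line in enumerate(text.splitlines(), 1)
            if PATTERN.search(line)]


def scan_mounted_volume(mount_point):
    """Scan the known log files under `mount_point`, skipping missing ones."""
    results = {}
    for relative in LOG_FILES:
        path = Path(mount_point) / relative
        if path.exists():
            text = path.read_text(errors="replace")
            results[str(path)] = suspicious_lines(text)
    return results


# Usage, assuming the volume is mounted at the hypothetical path /mnt/rescue:
#   for path, hits in scan_mounted_volume("/mnt/rescue").items():
#       for number, line in hits:
#           print(f"{path}:{number}: {line}")
```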
I have a PHP application deployed to Amazon Elastic Beanstalk. I've noticed that every time I push my code changes to Elastic Beanstalk via git aws.push, the deployed application doesn't pick up the changes. I checked the events log of my Beanstalk environment and noticed that every time Beanstalk issues:
Deploying new version to instance(s)
it's always followed by:
The following instances have not responded in the allowed command timeout time (they might still finish eventually on their own):
[i-d5xxxxx]
The same thing happens when I try to request snapshot logs. The Beanstalk issues:
requestEnvironmentInfo is starting
then after a few minutes it's again followed by:
The following instances have not responded in the allowed command timeout time (they might still finish eventually on their own): [i-d5xxxxx].
I had this problem a few times. It seems to affect only particular instances. So it can be solved by terminating the EC2 instance (done via the EC2 page on the Management Console). Thereafter, Elastic Beanstalk will detect that there are 0 healthy instances and automatically launch a new one.
If this is a production environment with only 1 instance and you want minimal downtime:
Configure the minimum instance count to 2, and Beanstalk will launch another instance for you.
Terminate the problematic instance via the EC2 tab. Beanstalk will launch a replacement because the minimum is 2.
Configure the minimum instance count back to 1, and Beanstalk will remove one of your two instances.
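The same sequence can be expressed as data for scripting. The sketch below only builds the option settings and the ordered plan; the comments name the boto3 calls you would pass them to. It assumes the environment's maximum instance count is already at least 2, and the instance ID in the usage line is hypothetical.

```python
# Sketch: the "scale to 2, terminate, scale back to 1" sequence as a plan.
# Nothing here talks to AWS; it only prepares the arguments.

def min_size_setting(count):
    """Option settings that set the Auto Scaling group's minimum size."""
    return [{
        "Namespace": "aws:autoscaling:asg",
        "OptionName": "MinSize",
        "Value": str(count),
    }]


def replacement_plan(bad_instance_id):
    """Ordered steps; wait for the environment to settle between them."""
    return [
        # boto3: elasticbeanstalk.update_environment(
        #            EnvironmentName=..., OptionSettings=<settings>)
        ("update_environment", min_size_setting(2)),
        # boto3: ec2.terminate_instances(InstanceIds=[<instance id>])
        ("terminate_instance", bad_instance_id),
        ("update_environment", min_size_setting(1)),
    ]


if __name__ == "__main__":
    for action, argument in replacement_plan("i-0123456789abcdef0"):
        print(action, argument)
```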
By default, Elastic Beanstalk "throws a timeout exception" after 8 minutes (480 seconds, as defined in the settings) if your commands do not complete in time.
You can set a higher timeout, up to 30 minutes (1800 seconds).
{
"Namespace": "aws:elasticbeanstalk:command",
"OptionName": "Timeout",
"Value": "1800"
}
Read here: http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/command-options.html
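If you prefer to generate that option setting rather than type it by hand, a minimal sketch looks like this. It produces a JSON array that could be saved to a file (options.json is a name I made up) and passed to aws elasticbeanstalk update-environment with --option-settings file://options.json.

```python
# Sketch: generate the Timeout option setting shown above as JSON.
import json


def timeout_option(seconds):
    """Build the option setting dict; the API expects the value as a string."""
    if seconds <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    return {
        "Namespace": "aws:elasticbeanstalk:command",
        "OptionName": "Timeout",
        "Value": str(seconds),
    }


def options_json(seconds):
    """Serialize a one-element list of option settings."""
    return json.dumps([timeout_option(seconds)], indent=2)


if __name__ == "__main__":
    print(options_json(1800))
```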
I had the same issue here (single t1.micro instance).
I solved the problem by rebooting the EC2 instance via the EC2 page on the Management Console (and not from the EB page).
Beanstalk deployment (and other features like Get Logs) works by sending commands to instances over SQS. An SQS client is deployed to the instances and checks the queue about every 20 seconds (see /var/log/cfn-hup.log):
2018-05-30 10:42:38,605 [DEBUG] Receiving messages for queue https://sqs.us-east-2.amazonaws.com/124386531466/93b60687a33e19...
If the SQS client crashes or has network problems on t1/t2 instances, it will not be able to receive commands from Beanstalk, and the deployment will time out. Rebooting the instance restarts the SQS client, so it can receive commands again.
An easier way to fix the SQS client is to restart the cfn-hup service:
sudo service cfn-hup restart
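Before restarting, you can confirm the client has actually stopped polling by checking when the last "Receiving messages for queue" line was written. This is a sketch that assumes the log format shown above and that the log timestamps are in the same timezone as the clock you compare against. The two-minute tolerance is my own choice, picked to be comfortably longer than the roughly 20-second polling interval.

```python
# Sketch: detect a stalled cfn-hup poller from /var/log/cfn-hup.log text.
# Assumes timestamps like "2018-05-30 10:42:38,605" at the start of a line.
import re
from datetime import datetime, timedelta

STAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} ")


def last_poll_time(log_text):
    """Return the timestamp of the last queue poll, or None if none found."""
    latest = None
    for line in log_text.splitlines():
        match = STAMP.match(line)
        if match and "Receiving messages for queue" in line:
            latest = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
    return latest


def poller_looks_stalled(log_text, now, max_gap=timedelta(minutes=2)):
    """True when there is no poll at all or the last one is older than max_gap."""
    latest = last_poll_time(log_text)
    return latest is None or now - latest > max_gap


# Usage on the instance:
#   with open("/var/log/cfn-hup.log") as f:
#       print(poller_looks_stalled(f.read(), datetime.now()))
```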
In the case of deployment, an alternative to shutting down the EC2 instances and waiting for Elastic Beanstalk to react, or messing about with minimum and maximum instances, is to simply perform a Rebuild environment on the target environment.
If a previous deployment failed due to timeout then the new version will still be registered against the environment, but due to the timeout it will not appear to be operational (in my experience the instance appears to still be running the old version).
Rebuilding the environment seems to reset things with the new version being used.
Obviously there's the downside with that of a period of downtime.
I think the correct way to deal with this is to figure out the cause of the timeout by doing what this answer suggests.
chongzixin's answer is what needs to be done if you need this fixed ASAP before investigating the reason for a timeout.
However, if you do need to increase timeout, see the following:
Add configuration files to your source code in a folder named .ebextensions and deploy it in your application source bundle.
Example:
option_settings:
  "aws:elasticbeanstalk:command":
    Timeout: 2400
*Timeout is the length of time, in seconds, before the command times out.
Reference: https://serverfault.com/a/747800/496353
"Restart App Server(s)" from the "Actions" menu in Elastic Beanstalk management dashboard followed by eb deploy fixes it for me.
After two days of checking random issues, I restarted both EC2 instances one after another so that there would be no downtime. The site worked fine at first, but after a while it started returning 504 errors.
When I checked the HTTP server, nginx was down and an "Out of HDD space" error had been thrown. I increased the disk size, Elastic Beanstalk created new instances, and the issue was fixed.
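A full disk is easy to rule out early. Here is a minimal sketch using Python's standard shutil.disk_usage; the 90% threshold is an arbitrary assumption on my part.

```python
# Sketch: report disk usage for a path and flag a nearly full filesystem.
import shutil


def disk_report(path="/"):
    """Return total/used/free bytes and the percentage used for `path`."""
    usage = shutil.disk_usage(path)
    percent = round(100 * usage.used / usage.total, 1) if usage.total else 0.0
    return {
        "total": usage.total,
        "used": usage.used,
        "free": usage.free,
        "percent_used": percent,
    }


def is_nearly_full(report, threshold=90.0):
    """True when the used percentage reaches the (assumed) threshold."""
    return report["percent_used"] >= threshold


if __name__ == "__main__":
    report = disk_report("/")
    print(report, "NEARLY FULL" if is_nearly_full(report) else "ok")
```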
For me, the problem was my VPC security group rules. According to the docs, you need to allow outbound traffic on port 123 for NTP to work. I had the port closed, so the clock was drifting, and the EC2 instances were becoming unresponsive to commands from the Elastic Beanstalk environment: deployments took forever (only to time out), log requests failed, and so on.
Thank you @Logan Pickup for the hint in your comment.
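To check a security group for this, you can inspect its egress rules for UDP port 123, which is what NTP uses. The sketch below works on a list shaped like the IpPermissionsEgress field from boto3's describe_security_groups. It deliberately ignores destination CIDR ranges, so treat a True result as "not obviously blocked" rather than proof.

```python
# Sketch: does a security group's egress rule list permit NTP (UDP 123)?
# Input is assumed to be shaped like boto3's IpPermissionsEgress entries.
# Destination ranges are not checked; that is a known simplification.

NTP_PORT = 123


def allows_outbound_ntp(egress_rules):
    for rule in egress_rules:
        protocol = rule.get("IpProtocol")
        if protocol == "-1":  # "-1" means all protocols and ports
            return True
        if (protocol == "udp"
                and rule.get("FromPort", 0) <= NTP_PORT <= rule.get("ToPort", 65535)):
            return True
    return False


if __name__ == "__main__":
    # Hypothetical group that only allows outbound HTTPS.
    rules = [{"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443}]
    print(allows_outbound_ntp(rules))
```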