My EC2 Ubuntu instance (ID: xxx) was terminated after failing its EC2 instance status checks and was relaunched as a new instance (yyy). I want to find out the reason behind the termination. I looked in CloudTrail -> Event History -> Event Name (Lookup Attributes) -> TerminateInstances, but that didn't help me find a reason for this termination. I also tried the CLI command aws ec2 describe-instances --instance-ids xxx, but all I got back was an empty array:
{ "Reservations": [] }
Any help in finding a way to get the logs would be great.
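One thing that may be worth checking is which CloudTrail events name the terminated instance as a resource, and who issued them. The following is only a sketch: it assumes the JSON printed by aws cloudtrail lookup-events has already been parsed, and it filters that output locally. The function name events_for_instance and all sample values (instance IDs, user names, timestamps) are made up for illustration.

```python
import json

def events_for_instance(lookup_output, instance_id):
    """Return (EventTime, EventName, Username) for events that list instance_id.

    lookup_output is the parsed JSON from `aws cloudtrail lookup-events`.
    Events are returned in the order they appear in the output.
    """
    matches = []
    for event in lookup_output.get("Events", []):
        # Each event carries a list of the resources it touched.
        names = {r.get("ResourceName") for r in event.get("Resources", [])}
        if instance_id in names:
            matches.append(
                (event.get("EventTime"), event.get("EventName"), event.get("Username"))
            )
    return matches

# Hypothetical sample, trimmed to the fields used above.
sample_events = json.loads("""
{
  "Events": [
    {"EventName": "TerminateInstances",
     "EventTime": "2020-01-01T10:05:00+00:00",
     "Username": "example-caller",
     "Resources": [{"ResourceType": "AWS::EC2::Instance", "ResourceName": "i-xxx"}]},
    {"EventName": "RunInstances",
     "EventTime": "2020-01-01T10:06:00+00:00",
     "Username": "example-caller",
     "Resources": [{"ResourceType": "AWS::EC2::Instance", "ResourceName": "i-yyy"}]}
  ]
}
""")

for when, name, who in events_for_instance(sample_events, "i-xxx"):
    print(when, name, who)
```

The Username on a TerminateInstances event shows which principal made the call, which can hint at whether a person or a service replaced the instance.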
I've hit a dead-end on my debugging and I'd appreciate any insight y'all could provide.
I set up an auto scaling group (ASG) of size 1 using launch templates in a public subnet. The ASG is configured with a creation policy and an update policy. The instance user data invokes cfn-signal after cfn-init has completed.
During my initial deployment, CloudFormation pauses for the signals, but ultimately times out and a rollback occurs. I set a pause time of 10 minutes in the creation policy, which should be more than enough time. To debug, I deploy again, ssh into the instance during the pause after the EC2 instance checks have passed, and check the logs with
sudo grep -ni 'error\|failure' $(sudo find /var/log -name cfn\* -or -name cloud-init\*)
, and I find no errors. cfn-init.log shows cfn-signal transmitting a SUCCESS status, and cfn-wire.log shows the associated HTTP response with a 200 status. I then attempt to manually send the signal, and stdout shows the following, indicating that signal has already been sent:
[DEBUG] Signaling resource <ASG_LOGICAL_ID> in stack <STACK_NAME> with unique ID <INSTANCE_ID> and status SUCCESS
ValidationError: Signal with ID <INSTANCE_ID> for resource <ASG_LOGICAL_ID> already exists. Signals may only be updated with a FAILURE status.
The CloudFormation stack event logs in the AWS Management Console show that the ASG is still in CREATE_IN_PROGRESS. The transmission of the success signal is not causing the transition to CREATE_COMPLETE.
To test the update policy, I commented out the creation policy, deployed, changed the key pair name in the launch template to another working one so I could trigger the update policy, and deployed again. The instance in the ASG gets brought down (minimum instances in service is 0 in the auto scaling rolling update policy), and its replacement gets brought up. At this point, CloudFormation pauses and waits for the signal, and I repeat my debugging steps above only to encounter the same results.
Here are more details that may be of use:
Creation Policy was configured like so:
"CreationPolicy" : {
  "AutoScalingCreationPolicy" : {
    "MinSuccessfulInstancesPercent" : 100
  },
  "ResourceSignal" : {
    "Count" : 1,
    "Timeout" : "PT10M"
  }
}
Instance role has a principal policy with the following actions allowed:
cloudformation:DescribeStackResource
cloudformation:SignalResource
The instance is configured to be a NAT instance and it does work. I tested it as suggested here: https://docs.aws.amazon.com/vpc/latest/userguide/VPC_NAT_Instance.html#nat-test-configuration
Logical ID of ASG is passed to resource option of cfn-signal
Logical ID of launch template is passed to resource option of cfn-init
Amazon Linux 2 AMI
Made sure to install aws-cfn-bootstrap
In the launch template, I specified an external network interface to be used as the default network interface of the sole instance. Here are details about that: https://docs.aws.amazon.com/autoscaling/ec2/userguide/create-launch-template.html#change-network-interface
I tried to detach one of my IAM roles from my instance (still running) and got a response indicating successful detachment.
Afterwards I tried to attach a new IAM role to the very same instance, but this message occurred: The association <AssociationId> is not the active association.
After using aws ec2 describe-iam-instance-profile-associations to check the IAM instance profile associations, I found that the state was disassociating. When I rechecked the associations on a later day, it was still stuck at disassociating.
Then I tried aws ec2 associate-iam-instance-profile to associate my instance with a new role, but all I got was another association stuck at associating.
I also tried replace-iam-instance-profile-association and got the same message: The association <AssociationId> is not the active association.
Rebooting the instance did not work either.
Any solutions?
Thanks.
I've fixed this issue by launching a new instance based on the EBS snapshot of the problematic instance, which is the last thing I wanna do.
Anyway, this could be considered as a workaround. :(
It really sucks that you have to pay to create AWS technical support cases.
Found an easy solution for this!
Hope this helps some people finding this.
After getting stuck in the "disassociating" or "associating" state, use the AWS CLI to find the associations that cause the problem (they will be stuck in the state "disassociating" or "associating"):
aws ec2 describe-iam-instance-profile-associations
After finding them use:
aws ec2 disassociate-iam-instance-profile --association-id iip-assoc-xxxxxx
to remove them. It's not quite intuitive, but you can actually remove the ones in the state "disassociating". After that you can add a new role/instance profile.
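As a rough illustration of the two steps above, here is a minimal sketch that picks the stuck associations out of the JSON printed by aws ec2 describe-iam-instance-profile-associations and prints the matching disassociate command for each. It assumes the CLI output has been parsed already; the function name stuck_association_ids and every ID in the sample are hypothetical.

```python
import json

# States in which an association is neither fully attached nor fully removed.
STUCK_STATES = {"associating", "disassociating"}

def stuck_association_ids(describe_output, instance_id=None):
    """Return AssociationIds whose State is 'associating' or 'disassociating'.

    describe_output is the parsed JSON from
    `aws ec2 describe-iam-instance-profile-associations`.
    If instance_id is given, only that instance's associations are considered.
    """
    ids = []
    for assoc in describe_output.get("IamInstanceProfileAssociations", []):
        if instance_id is not None and assoc.get("InstanceId") != instance_id:
            continue
        if assoc.get("State") in STUCK_STATES:
            ids.append(assoc["AssociationId"])
    return ids

# Hypothetical sample, trimmed to the fields used above.
sample_associations = json.loads("""
{
  "IamInstanceProfileAssociations": [
    {"AssociationId": "iip-assoc-aaa", "InstanceId": "i-111", "State": "disassociating"},
    {"AssociationId": "iip-assoc-bbb", "InstanceId": "i-111", "State": "associating"},
    {"AssociationId": "iip-assoc-ccc", "InstanceId": "i-222", "State": "associated"}
  ]
}
""")

for assoc_id in stuck_association_ids(sample_associations, "i-111"):
    # Print the command from the answer rather than running it.
    print("aws ec2 disassociate-iam-instance-profile --association-id", assoc_id)
```

Printing the commands instead of executing them leaves a chance to review which associations would be removed.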
Error: Unable to detach, there are no existing instance profile associations.
This error appears while you are trying to add a role to an EC2 instance.
Debug and verify:
run > aws iam list-instance-profiles
command output :
{
"InstanceProfiles": []
}
run > aws iam list-instance-profiles-for-role --role-name Your-Role-Name
command output :
{
"InstanceProfiles": []
}
Solution :
run > aws iam create-instance-profile --instance-profile-name profile-name-sameas-role-name
run > aws iam add-role-to-instance-profile --instance-profile-name profile-name-sameas-role-name --role-name role-name
Done !!
Go back to the EC2 dashboard and try to add the IAM role again. This time it should work.
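The check-then-create logic above could be sketched roughly as follows. This assumes the JSON from aws iam list-instance-profiles-for-role has been parsed, and it only builds the two commands from the answer instead of running them. The function name commands_to_create_profile and the role name in the sample are hypothetical.

```python
def commands_to_create_profile(role_name, list_for_role_output, profile_name=None):
    """Return the CLI commands needed when the role has no instance profile.

    list_for_role_output is the parsed JSON from
    `aws iam list-instance-profiles-for-role --role-name <role_name>`.
    An empty list is returned if the role already has an instance profile.
    """
    if list_for_role_output.get("InstanceProfiles"):
        return []
    # Default to naming the profile after the role, as the answer does.
    profile_name = profile_name or role_name
    return [
        ["aws", "iam", "create-instance-profile",
         "--instance-profile-name", profile_name],
        ["aws", "iam", "add-role-to-instance-profile",
         "--instance-profile-name", profile_name,
         "--role-name", role_name],
    ]

# Hypothetical role with no instance profile, matching the empty output shown above.
for command in commands_to_create_profile("Your-Role-Name", {"InstanceProfiles": []}):
    print(" ".join(command))
```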
This example: https://aws.amazon.com/premiumsupport/knowledge-center/stop-start-ec2-instances/
does not seem to work. I followed the example and the pipeline is always canceled. No logs are created, even though I did set up logging. The only "error message" I could find is:
Error Message: Unable to create resource for #Ec2Instance_2017-06-07T09:58:49 due to: No subnets found for the default VPC 'vpc-f7dxxxx'. Please specify a subnet. (Service: AmazonEC2; Status Code: 400; Error Code: MissingInput; Request ID: ebeeae6d-9537-4627-8a56-e832999a1940)
All I am trying to do is execute an aws ec2 start-instances AWS CLI command as outlined in the example. The instances do exist; they are in a "stopped" state. Has anyone been successful in setting up a pipeline to start and stop existing instances? How did you do it? Thanks for the help.
Yes, that was it. After you finish going through the example, you need to look at the pipeline and edit it. Look for the EC2Resource area and click on it, then add a subnet. Place the micro instance in the same subnet as the EC2 instances you need to start or stop. The example does not address this.
I am working on an EMR template with autoscaling.
While a static EMR setup with instance groups works fine, I cannot attach an
AWS::ApplicationAutoScaling::ScalableTarget
To troubleshoot, I split my template into two separate ones. In the first I create a normal EMR cluster (which works fine). In the second I have a ScalableTarget definition, which fails to attach with this error:
11:29:34 UTC+0100 CREATE_FAILED AWS::ApplicationAutoScaling::ScalableTarget AutoscalingTarget EMR instance group doesn't exist: Failed to find Cluster XXXXXXX
Funny thing is that this cluster DOES exist.
I also had a look at IAM roles but everything seems to be ok there...
Can anyone advise on this matter?
Has anyone gotten autoscaling for an instance group to work via CloudFormation?
I have already tried this and raised a request with AWS. This autoscaling feature is not yet available through CloudFormation. For now I use CloudFormation to create the custom EMR security groups, S3 buckets, etc., and in the Outputs tab I add the command-line command (aws emr create-cluster......). After the stack is created, I query that output and use the result to launch the cluster.
Autoscaling can actually be enabled at cluster launch by using --auto-scaling-role. If we use CloudFormation for EMR, the autoscaling feature is not available because it launches the cluster without "--auto-scaling-role".
I hope this can be useful...
The problem I am trying to solve is how to make my code, running within an EC2 instance that is part of a load-balanced AWS cluster, aware of how many other EC2 instances are within the same cluster/load balancer.
I have the following code which, when given the name of a LoadBalancer, can tell me how many EC2 instances are associated with that LoadBalancer.
// Fetch every load balancer visible to the client, then report the matching one.
DescribeLoadBalancersResult dlbr = loadBalancingClient.describeLoadBalancers();
List<LoadBalancerDescription> lbds = dlbr.getLoadBalancerDescriptions();
for (LoadBalancerDescription lbd : lbds) {
    if (lbd.getDNSName().equalsIgnoreCase("MyLoadBalancer")) {
        System.out.println(lbd.getDNSName() + " has " + lbd.getInstances().size() + " instances");
    }
}
which works fine and prints out the load balancer name and the number of instances it has associated with it.
However, I want to see if I can get this info without having to provide the load balancer name. In our setup an EC2 instance will only ever be associated with one load balancer, so is there any way to go back from the EC2 instance to the load balancer?
I figure I can go down the route of getting all load balancers from all regions and iterating through them until I find the one that contains my EC2 instance, but I figured there might be an easier way?
An interesting challenge -- I would have to wrangle with the code myself to think this through, but my gut first response would be to use the AWS CLI here, and to just invoke it from within your Java/C#.
You can make this call:
aws elb describe-load-balancers
And get all manner of information about any and all ELBs, and could simply --query filter that by the instance ID of the instance making the call anyway -- in order to find out what other friends the instance has joined to its same ELB. Just call the internal instance metadata to get that ID:
http://169.254.169.254/latest/meta-data/instance-id
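A minimal sketch of that lookup, assuming the JSON from aws elb describe-load-balancers has been parsed and the instance ID has already been read from the metadata URL above. The function names and all sample values are made up for illustration; nothing here calls AWS.

```python
import json

def load_balancers_for_instance(describe_output, instance_id):
    """Return the load balancer descriptions that list instance_id."""
    return [
        lb for lb in describe_output.get("LoadBalancerDescriptions", [])
        if any(i.get("InstanceId") == instance_id for i in lb.get("Instances", []))
    ]

def peer_instance_ids(describe_output, instance_id):
    """Return the other instances sharing a load balancer with instance_id."""
    peers = set()
    for lb in load_balancers_for_instance(describe_output, instance_id):
        peers.update(i["InstanceId"] for i in lb["Instances"])
    peers.discard(instance_id)  # the caller is not its own peer
    return sorted(peers)

# Hypothetical sample, trimmed to the fields used above.
sample_elbs = json.loads("""
{
  "LoadBalancerDescriptions": [
    {"LoadBalancerName": "MyLoadBalancer",
     "Instances": [{"InstanceId": "i-aaa"}, {"InstanceId": "i-bbb"}, {"InstanceId": "i-ccc"}]},
    {"LoadBalancerName": "OtherLoadBalancer",
     "Instances": [{"InstanceId": "i-ddd"}]}
  ]
}
""")

my_id = "i-bbb"  # would come from the instance metadata endpoint
for lb in load_balancers_for_instance(sample_elbs, my_id):
    print(lb["LoadBalancerName"], "has", len(lb["Instances"]), "instances")
print("peers:", peer_instance_ids(sample_elbs, my_id))
```

This still scans every load balancer in the output, so it matches the "iterate until found" idea from the question rather than avoiding it.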
Or another fun way to go would be to bootstrap your instance AMIs so that when they are spawned and joined to an ELB, they register themselves in a SimpleDB or DynamoDB table. We do this all the time as a way of keeping current inventories of websites, or software installed, etc. So this way you would have a list, which you could then keep trimmed by checking for "running" status.
EDIT - 4/13/2015
@MayoMan I have had to make use of this as well in some current work -- to identify healthy instances attached to an ELB in an auto-scaling group and then act upon them. I've found 'jq' to be a really helpful command-line tool. You could also make these calls directly against an ELB, but here it's describing an ASG:
aws autoscaling describe-auto-scaling-groups --auto-scaling-group-names <ASG Name> | jq -r '.AutoScalingGroups[0].Instances[0].HealthStatus'
Or to list the InstanceIds themselves:
aws autoscaling describe-auto-scaling-groups --auto-scaling-group-names <ASG Name> | jq -r '.AutoScalingGroups[0].Instances[].InstanceId'
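If jq is not available, the same filtering can be sketched in a few lines of Python over the parsed JSON from aws autoscaling describe-auto-scaling-groups. This is only an illustration: the function name healthy_instance_ids and the sample data are hypothetical, and it looks at the first group in the output, like the jq filters above.

```python
import json

def healthy_instance_ids(describe_output, group_index=0):
    """Return InstanceIds in one ASG whose HealthStatus is 'Healthy'."""
    groups = describe_output.get("AutoScalingGroups", [])
    if group_index >= len(groups):
        return []
    return [
        inst["InstanceId"]
        for inst in groups[group_index].get("Instances", [])
        # Compare case-insensitively in case the casing differs.
        if str(inst.get("HealthStatus", "")).lower() == "healthy"
    ]

# Hypothetical sample, trimmed to the fields used above.
sample_asg = json.loads("""
{
  "AutoScalingGroups": [
    {"AutoScalingGroupName": "example-asg",
     "Instances": [
       {"InstanceId": "i-aaa", "HealthStatus": "Healthy"},
       {"InstanceId": "i-bbb", "HealthStatus": "Unhealthy"},
       {"InstanceId": "i-ccc", "HealthStatus": "Healthy"}
     ]}
  ]
}
""")

print(healthy_instance_ids(sample_asg))
```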