AWS CloudFormation cfn-init error - amazon-web-services

I'm trying to create a cluster structure using a CloudFormation script. I'm using a Bitnami Drupal AMI.
But the script produces the error below:
WaitCondition received failed message: 'Failed to run cfn-init' for uniqueId: i-4d121a16
Then I connected to the instance and checked cfn-init.log:
bitnami@ip-10-58-51-235:/var/log$ cat cfn-init.log
2013-01-01 19:18:23,128 [INFO] Starting new HTTP connection (1): 169.254.169.254
2013-01-01 22:52:17,607 [INFO] Starting new HTTP connection (1): 169.254.169.254
2013-01-01 22:52:17,621 [INFO] Starting new HTTPS connection (1): cloudformation-waitcondition-eu-west-1.s3.amazonaws.com
My AWS Console events are shown below:
Logical ID PhysicalID Status Reason
mycluster arn:aws:cloudformation:eu-west-1:318730... CREATE_FAILED The following resource(s) failed to create: [WaitCondition]
WaitCondition vgdrobe-WaitCondition-UWEKNUJ8TMT6 CREATE_FAILED WaitCondition received failed message: 'Failed to run cfn-init' for uniqueId: i-59272f1c
WaitCondition vgdrb-WaitCondition-MML6Y6E47WTB CREATE_IN_PROGRESS
WebServerGroup vgdrb-WebServerGroup-1OBJYOSQX8093 CREATE_COMPLETE
I couldn't find the problem. Could you help me, please?
Kind regards...

Is it possible that the cfn-init scripts are not part of the image?
According to the docs, they're installed as part of the Amazon Linux image, but may not be included by default in other images.

Expanding on what Chris suggested, I had to add the following to get cfn-init working in my CloudFormation template:
"apt-get -y install python-setuptools\n",
"easy_install https://s3.amazonaws.com/cloudformation-examples/aws-cfn-bootstrap-1.0.tar.gz\n",
I found some really useful info about this here: http://smart421.wordpress.com/2012/12/14/aws-cloudformation-and-chef-on-centos/
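In case it's useful, here is a rough sketch of the whole sequence those UserData lines drive on a Debian/Ubuntu-based image such as the Bitnami one; the stack name, resource name, WaitConditionHandle URL, and helper path are placeholders, not values from the question:
# Install the CloudFormation helper scripts (they are not preinstalled on non-Amazon-Linux AMIs)
apt-get -y install python-setuptools
easy_install https://s3.amazonaws.com/cloudformation-examples/aws-cfn-bootstrap-1.0.tar.gz
# Run cfn-init against your stack and resource, then signal the WaitCondition with the result
/usr/local/bin/cfn-init -s my-stack-name -r LaunchConfig --region eu-west-1
/usr/local/bin/cfn-signal -e $? 'https://my-wait-condition-handle-url'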
I also found that errors in my UserData caused cfn-init to fail. I got around this by using VS2012 to validate the template before uploading it (as suggested here: How can I quickly and effectively debug CloudFormation templates?).
Hope that helps.

Related

before install: CodeDeploy agent was not able to receive the lifecycle event. Check the CodeDeploy agent logs on your host and make sure the agent is

I have set up a pipeline, but I get the following error during deployment:
before install CodeDeploy agent was not able to receive the lifecycle event. Check the CodeDeploy agent logs on your host and make sure the agent is running and can connect to the CodeDeploy server.
The CodeDeploy agent is running, but I do not know what the problem is. I checked the CodeDeploy agent logs:
[ec2-user@ip-172-31-255-11 ~]$ sudo cat /var/log/aws/codedeploy-agent/codedeploy-agent.log
2022-09-27 00:00:02 INFO [codedeploy-agent(3694)]: [Aws::CodeDeployCommand::Client 200 45.14352 0 retries] poll_host_command(host_identifier:"arn:aws:ec2:us-east-1:632547665100:instance/i-01d3b4303d7c9c948")
2022-09-27 00:00:03 INFO [codedeploy-agent(3694)]: Version file found in /opt/codedeploy-agent/.version with agent version OFFICIAL_1.4.0-2218_rpm.
2022-09-27 00:00:03 INFO [codedeploy-agent(3694)]: Version file found in /opt/codedeploy-agent/.version with agent version OFFICIAL_1.4.0-2218_rpm.
I was also unlucky enough to hit this problem today.
Please use this guide and look at the CodeDeploy agent logs on your compute platform instance (probably EC2).
In my case, it turned out that I did not have an AppSpec file added to the project.
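As a quick sanity check (a sketch; the bundle name is a placeholder), you can confirm that the revision archive actually contains an appspec.yml at its root and then follow the agent log while retrying the deployment:
# appspec.yml must sit at the root of the revision bundle
unzip -l my-revision-bundle.zip | grep -i appspec
# Follow the agent log during the next deployment attempt
sudo tail -f /var/log/aws/codedeploy-agent/codedeploy-agent.log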

AWS EKS - Failure creating load balancer controller

I am trying to create an application load balancer controller on my EKS cluster by following this link.
When I run these steps (after making the necessary changes to the downloaded YAML file):
curl -o v2_1_2_full.yaml https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/v2.1.2/docs/install/v2_1_2_full.yaml
kubectl apply -f v2_1_2_full.yaml
I get this output
customresourcedefinition.apiextensions.k8s.io/targetgroupbindings.elbv2.k8s.aws configured
mutatingwebhookconfiguration.admissionregistration.k8s.io/aws-load-balancer-webhook configured
role.rbac.authorization.k8s.io/aws-load-balancer-controller-leader-election-role unchanged
clusterrole.rbac.authorization.k8s.io/aws-load-balancer-controller-role configured
rolebinding.rbac.authorization.k8s.io/aws-load-balancer-controller-leader-election-rolebinding unchanged
clusterrolebinding.rbac.authorization.k8s.io/aws-load-balancer-controller-rolebinding unchanged
service/aws-load-balancer-webhook-service unchanged
deployment.apps/aws-load-balancer-controller unchanged
validatingwebhookconfiguration.admissionregistration.k8s.io/aws-load-balancer-webhook configured
Error from server (InternalError): error when creating "v2_1_2_full.yaml": Internal error occurred: failed calling webhook "webhook.cert-manager.io": Post https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=10s: no endpoints available for service "cert-manager-webhook"
Error from server (InternalError): error when creating "v2_1_2_full.yaml": Internal error occurred: failed calling webhook "webhook.cert-manager.io": Post https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=10s: no endpoints available for service "cert-manager-webhook"
The load balancer controller doesn't appear to start up because of this and never gets to the ready state.
Does anyone have any suggestions on how to resolve this issue?
It turns out the taints on my node group prevented the cert-manager pods from starting on any node.
These commands helped debug and led me to a fix for this issue:
kubectl get po -n cert-manager
kubectl describe po <pod id> -n cert-manager
My solution was to create another nodeGroup with no taints specified. This allowed the cert-manager to run.
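If you want to confirm the same root cause on your own cluster before changing node groups, here is a short sketch of the checks that would show it (assuming kubectl already points at the EKS cluster):
# List the taints configured on each node
kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
# Recent events in the cert-manager namespace usually show the scheduling failure caused by untolerated taints
kubectl get events -n cert-manager --sort-by=.lastTimestamp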

Elastic Beanstalk - Command failed on instance. An unexpected error has occurred [ErrorCode: 0000000001]

This is my first time trying to deploy a Django app to Elastic Beanstalk. The application uses Django Channels.
These are my config files:
option_settings:
  aws:elasticbeanstalk:container:python:
    WSGIPath: "dashboard/dashboard/wsgi.py"
  aws:elasticbeanstalk:application:environment:
    DJANGO_SETTINGS_MODULE: "dashboard/dashboard/settings.py"
    PYTHONPATH: /opt/python/current/app/dashboard:$PYTHONPATH
  aws:elbv2:listener:80:
    DefaultProcess: http
    ListenerEnabled: 'true'
    Protocol: HTTP
    Rules: ws
  aws:elbv2:listenerrule:ws:
    PathPatterns: /websockets/*
    Process: websocket
    Priority: 1
  aws:elasticbeanstalk:environment:process:http:
    Port: '80'
    Protocol: HTTP
  aws:elasticbeanstalk:environment:process:websocket:
    Port: '5000'
    Protocol: HTTP
container_commands:
  00_pip_upgrade:
    command: "source /opt/python/run/venv/bin/activate && pip install --upgrade pip"
    ignoreErrors: false
  01_migrate:
    command: "django-admin.py migrate"
    leader_only: true
  02_collectstatic:
    command: "django-admin.py collectstatic --noinput"
  03_wsgipass:
    command: 'echo "WSGIPassAuthorization On" >> ../wsgi.conf'
When I run eb create django-env I get the following logs:
Creating application version archive "app-200617_112710".
Uploading: [##################################################] 100% Done...
Environment details for: django-env
Application name: dashboard
Region: us-west-2
Deployed Version: app-200617_112710
Environment ID: e-rdgipdg4z3
Platform: arn:aws:elasticbeanstalk:us-west-2::platform/Python 3.7 running on 64bit Amazon Linux 2/3.0.2
Tier: WebServer-Standard-1.0
CNAME: UNKNOWN
Updated: 2020-06-17 10:27:48.898000+00:00
Printing Status:
2020-06-17 10:27:47 INFO createEnvironment is starting.
2020-06-17 10:27:49 INFO Using elasticbeanstalk-us-west-2-041741961231 as Amazon S3 storage bucket for environment data.
2020-06-17 10:28:10 INFO Created security group named: sg-0942435ec637ad173
2020-06-17 10:28:25 INFO Created load balancer named: awseb-e-r-AWSEBLoa-19UYXEUG5IA4F
2020-06-17 10:28:25 INFO Created security group named: awseb-e-rdgipdg4z3-stack-AWSEBSecurityGroup-17RVV1ZT14855
2020-06-17 10:28:25 INFO Created Auto Scaling launch configuration named: awseb-e-rdgipdg4z3-stack-AWSEBAutoScalingLaunchConfiguration-H5E4G2YJ3LEC
2020-06-17 10:29:30 INFO Created Auto Scaling group named: awseb-e-rdgipdg4z3-stack-AWSEBAutoScalingGroup-1I2C273N6RN8S
2020-06-17 10:29:30 INFO Waiting for EC2 instances to launch. This may take a few minutes.
2020-06-17 10:29:30 INFO Created Auto Scaling group policy named: arn:aws:autoscaling:us-west-2:041741961231:scalingPolicy:8d4c8dcf-d77d-4d18-92d8-67f8a2c1cd9e:autoScalingGroupName/awseb-e-rdgipdg4z3-stack-AWSEBAutoScalingGroup-1I2C273N6RN8S:policyName/awseb-e-rdgipdg4z3-stack-AWSEBAutoScalingScaleDownPolicy-1JAUAII3SCELN
2020-06-17 10:29:30 INFO Created Auto Scaling group policy named: arn:aws:autoscaling:us-west-2:041741961231:scalingPolicy:0c3d9c2c-bc65-44ed-8a22-2f9bef538ba7:autoScalingGroupName/awseb-e-rdgipdg4z3-stack-AWSEBAutoScalingGroup-1I2C273N6RN8S:policyName/awseb-e-rdgipdg4z3-stack-AWSEBAutoScalingScaleUpPolicy-XI8Z22SYWQKR
2020-06-17 10:29:30 INFO Created CloudWatch alarm named: awseb-e-rdgipdg4z3-stack-AWSEBCloudwatchAlarmHigh-572C6W1QYGIC
2020-06-17 10:29:30 INFO Created CloudWatch alarm named: awseb-e-rdgipdg4z3-stack-AWSEBCloudwatchAlarmLow-1RTNBIHPHISRO
2020-06-17 10:33:05 ERROR [Instance: i-01576cfe5918af1c3] Command failed on instance. An unexpected error has occurred [ErrorCode: 0000000001].
2020-06-17 10:33:05 INFO Command execution completed on all instances. Summary: [Successful: 0, Failed: 1].
2020-06-17 10:34:07 ERROR Create environment operation is complete, but with errors. For more information, see troubleshooting documentation.
ERROR: ServiceError - Create environment operation is complete, but with errors. For more information, see troubleshooting documentation.
The error is extremely vague, and I have no clue as to what I'm doing wrong.
I had a similar issue. I used psycopg2-binary instead of psycopg2 and created a new environment. The health status is now OK.
Since this is getting some attention, I suggest you check your Elastic Beanstalk logs in the AWS console, since the error is completely generic and can be anything. I suggest checking mainly the command execution and activity logs.
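If you use the EB CLI, one way to pull those logs without opening the console (a sketch, assuming your current directory is bound to the failing environment) is:
# Fetch the full log bundle for the environment, including the deployment/engine logs
eb logs --all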
In my case, it was because I had the following listed in requirements.txt, and they failed to install on EC2:
mkl-fft==1.1.0
mkl-random==1.1.0
mkl-service==2.3.0
pypiwin32==223
pywin32==228
Removing those from requirements.txt fixed the issue.
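One way to avoid shipping platform-specific packages in the first place (a sketch; the filter simply mirrors the package names listed above) is to exclude them when freezing on your development machine:
# Regenerate requirements.txt without the Windows/MKL-only packages that break the Linux pip install
pip freeze | grep -viE 'mkl-|pywin32|pypiwin32' > requirements.txt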
It is most likely a connection error. Make sure the instance can access the internet and that you have VPC endpoints for SQS/CloudFormation/CloudWatch/S3/elasticbeanstalk/elasticbeanstalk-health. Also make sure the security groups for these endpoints allow access from your instance.
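To see which endpoints already exist in the environment's VPC (a sketch; the VPC ID is a placeholder), something like this works with the AWS CLI:
# List the interface/gateway endpoints attached to the VPC
aws ec2 describe-vpc-endpoints --filters Name=vpc-id,Values=vpc-0123456789abcdef0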

Kubectl command throwing error: Unable to connect to the server: getting credentials: exec: exit status 2

I am doing a lab setup of EKS/kubectl, and after the cluster build completes, I run the following:
> kubectl get node
And I get the following error:
Unable to connect to the server: getting credentials: exec: exit status 2
Moreover, I am sure it is a configuration issue. For example, here is the output of:
kubectl version
usage: aws [options] <command> <subcommand> [<subcommand> ...] [parameters]
To see help text, you can run:
aws help
aws <command> help
aws <command> <subcommand> help
aws: error: argument operation: Invalid choice, valid choices are:
create-cluster | delete-cluster
describe-cluster | describe-update
list-clusters | list-updates
update-cluster-config | update-cluster-version
update-kubeconfig | wait
help
Client Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.1", GitCommit:"d224476cd0730baca2b6e357d144171ed74192d6", GitTreeState:"clean", BuildDate:"2020-01-14T21:04:32Z", GoVersion:"go1.13.5", Compiler:"gc", Platform:"darwin/amd64"}
Unable to connect to the server: getting credentials: exec: exit status 2
Please advise next steps for troubleshooting.
Please delete the cache folder present at
~/.aws/cli/cache
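A minimal sketch of that cleanup (the cached STS credentials will simply be re-created on the next CLI call):
rm -rf ~/.aws/cli/cache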
For me, running kubectl get nodes or kubectl cluster-info gave the following error:
Unable to connect to the server: getting credentials: exec: executable kubelogin not found
It looks like you are trying to use a client-go credential plugin that is not installed.
To learn more about this feature, consult the documentation available at:
https://kubernetes.io/docs/reference/access-authn-authz/authentication/#client-go-credential-plugins
I did the following to resolve this.
Deleted all of the contents inside ~/.kube/. In my case, it's a Windows machine, so it's C:\Users\nis\.kube. Here nis is the user name that I logged in as.
Ran the get credentials command as follows.
az aks get-credentials --resource-group terraform-aks-dev --name terraform-aks-dev-aks-cluster --admin
Note the --admin at the end. Without it, I was getting the same error.
Now the above two commands are working.
Reference: https://blog.baeke.info/2021/06/03/a-quick-look-at-azure-kubelogin/
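If you would rather install the missing plugin than use the --admin credentials, one way to do it (assuming the Azure CLI is already installed) is:
# Installs kubectl and kubelogin alongside the Azure CLI
az aks install-cli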
Did you have the kubectl configuration file ready?
Normally we put it under ~/.kube/config, and the file includes the cluster endpoint, certificate, contexts, admin users, and so on.
For further details, read this document: https://docs.aws.amazon.com/eks/latest/userguide/create-kubeconfig.html
In my case, as I am using Azure (not AWS), I had to install "kubelogin", which resolved the issue.
"kubelogin" is a client-go credential (exec) plugin implementing Azure authentication. This plugin provides features that are not available in kubectl. It is supported on kubectl v1.11+.
Can you check your ~/.kube/config file?
Assuming you have started a local cluster using minikube, if your config is available you should not be getting the server error.
Sample config file
apiVersion: v1
clusters:
- cluster:
    certificate-authority: /Users/singhvi/.minikube/ca.crt
    server: https://127.0.0.1:32772
  name: minikube
contexts:
- context:
    cluster: minikube
    user: minikube
  name: minikube
current-context: minikube
kind: Config
preferences: {}
users:
- name: minikube
  user:
    client-certificate: /Users/singhvi/.minikube/profiles/minikube/client.crt
    client-key: /Users/singhvi/.minikube/profiles/minikube/client.key
You need to update/recreate your local kubeconfig. In my case, I deleted the whole ~/.kube/config and followed this tutorial:
https://docs.aws.amazon.com/eks/latest/userguide/create-kubeconfig.html
Make sure you have installed the AWS CLI.
https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html
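A condensed sketch of that recreate flow for EKS (the region and cluster name are placeholders):
# Remove the stale kubeconfig and rebuild it from the cluster definition
rm ~/.kube/config
aws eks update-kubeconfig --region us-east-1 --name my-cluster
kubectl get nodes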
I had the same problem. The issue was that in my .aws/credentials file there were multiple users, and the user that had permissions on the EKS cluster (admin_test) wasn't the default user. So in my case, I made the "admin_test" user my default user in the CLI using an environment variable:
export AWS_PROFILE='admin_test'
After that, I checked the default user with the command:
aws sts get-caller-identity
Finally, I was able to get the nodes with the kubectl get nodes command.
Reference: https://docs.aws.amazon.com/eks/latest/userguide/create-kubeconfig.html
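An alternative to changing the default profile (a sketch; the cluster and profile names are placeholders) is to pass the profile directly when writing the kubeconfig:
# Use the IAM identity that created the cluster just for this kubeconfig entry
aws eks update-kubeconfig --name my-cluster --profile admin_test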
In EKS you can retrieve your kubectl credentials using the following command:
% aws eks update-kubeconfig --name cluster_name
Updated context arn:aws:eks:eu-west-1:xxx:cluster/cluster_name in /Users/theofpa/.kube/config
You can retrieve your cluster name using:
% aws eks list-clusters
{
    "clusters": [
        "cluster_name"
    ]
}
I had the same error and solved it by upgrading my awscli to the latest version.
Removing and adding the ~/.aws/credentials file worked to resolve this issue for me.
rm ~/.aws/credentials
touch ~/.aws/credentials
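After recreating the empty file, you still need to repopulate it; a minimal sketch using the interactive flow:
# Prompts for the access key, secret key, region, and output format, then writes them to ~/.aws/credentials and ~/.aws/config
aws configure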

Using the command line to create AWS Elastic Beanstalk fails

I tried to set up EB for the worker tier by using the following command:
eb create -t worker
But I received the following error:
2015-11-04 16:44:01 UTC+0800 ERROR Stack named 'awseb-e-wh4epksrzi-stack' aborted operation. Current state: 'CREATE_FAILED' Reason: The following resource(s) failed to create: [AWSEBWorkerCronLeaderRegistry, AWSEBSecurityGroup].
2015-11-04 16:43:58 UTC+0800 ERROR Creating security group named: sg-7ba1f41e failed Reason: Resource creation cancelled
Is there something specific needed to run this from the command line?
I found the eb command line buggy. Try using the web console; it's much more reliable.
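Whichever tool you use, the underlying CloudFormation stack events usually say why AWSEBSecurityGroup was cancelled; a sketch using the stack name from the error above:
# Show only the failed events for the Beanstalk-managed stack
aws cloudformation describe-stack-events --stack-name awseb-e-wh4epksrzi-stack --query 'StackEvents[?ResourceStatus==`CREATE_FAILED`]'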