I've got the following CloudFormation template. The stack gets created and the EC2 instance launches, and I can SSH in, but it's not installing the packages.
I'm not sure where it's failing. I'm using Ubuntu. I can't find where cfn-init is installed on my instance. Or is it only installed on Amazon Linux AMIs?
How do I go about troubleshooting this?
{
"Parameters" : {
"ShinyKey": {
"Description": "Key pair for instance.",
"Type": "AWS::EC2::KeyPair::KeyName"
}
},
"Resources": {
"Ec2Instance" : {
"Metadata": {
"AWS::CloudFormation::Init": {
"config": {
"packages": {
"apt": {
"r-base-dev": [],
"libcurl4-openssl-dev": [],
"git": []
}
}
}
}
},
"Type" : "AWS::EC2::Instance",
"Properties": {
"ImageId": "ami-9eaa1cf6",
"InstanceType": "t2.micro",
"KeyName": {"Ref": "ShinyKey"},
"SecurityGroups": [{"Ref": "InstanceSecurityGroup"}],
"Tags": [{
"Key": "Name",
"Value": "R-Shiny-Server"
}],
"UserData": {
"Fn::Base64": {
"Fn::Join": [
"",
[
"#!/bin/bash\n",
"/usr/local/bin/cfn-init --region ",
{
"Ref": "AWS::Region"
},
" -s ",
{
"Ref": "AWS::StackName"
},
" -r Ec2Instance\n"
]
]
}
}
}
},
"InstanceSecurityGroup" : {
"Type" : "AWS::EC2::SecurityGroup",
"Properties": {
"GroupDescription" : "Enable SSH access via port 22, and ports 3838 and 80 for Shiny",
"SecurityGroupIngress" : [
{ "IpProtocol" : "tcp", "FromPort" : "22", "ToPort" : "22", "CidrIp" : "0.0.0.0/0" },
{ "IpProtocol" : "tcp", "FromPort" : "80", "ToPort" : "80", "CidrIp" : "0.0.0.0/0" },
{ "IpProtocol" : "tcp", "FromPort" : "3838", "ToPort" : "3838", "CidrIp" : "0.0.0.0/0" }
]
}
}
}
}
The issue with the template above is that cfn-init is not installed in the Ubuntu AMI, so the call to cfn-init in your user-data script will return "command not found" and do nothing.
The CloudFormation helper scripts are installed automatically only on the latest Amazon Linux AMI, as noted in the documentation. For Ubuntu you need to install them manually, which you can do by adding a line to your existing user-data script right after "#!/bin/bash\n",:
"apt-get update && apt-get install pip && pip install https://s3.amazonaws.com/cloudformation-examples/aws-cfn-bootstrap-latest.tar.gz\n",
After this, the call to /usr/local/bin/cfn-init in the next line should run correctly.
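For reference, here is a minimal sketch of the resulting UserData block, assuming the same stack, region, and Ec2Instance resource names used above:
"UserData": {
  "Fn::Base64": {
    "Fn::Join": ["", [
      "#!/bin/bash\n",
      "apt-get update && apt-get -y install python-pip && pip install https://s3.amazonaws.com/cloudformation-examples/aws-cfn-bootstrap-latest.tar.gz\n",
      "/usr/local/bin/cfn-init --region ", { "Ref": "AWS::Region" },
      " -s ", { "Ref": "AWS::StackName" },
      " -r Ec2Instance\n"
    ]]
  }
}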
You need to include the AWS::CloudFormation::Init key in the Metadata property of the instance. The cfn-init script uses this metadata to determine which actions to take on boot.
In your sample code, though, it does not look like you are actually installing any packages, so I am not sure which packages you are expecting to be present.
Below is the CloudFormation template for the EC2 instance, from here:
"EC2Instance":{
"Type": "AWS::EC2::Instance",
"Properties":{
"ImageId": "ami-05958d7635caa4d04",
"InstanceType": "t2.micro",
"SubnetId": { "Ref": "SubnetId"},
"KeyName": { "Ref": "KeyName"},
"SecurityGroupIds": [ { "Ref": "EC2InstanceSecurityGroup"} ],
"IamInstanceProfile": { "Ref" : "EC2InstanceProfile"},
"UserData":{
"Fn::Base64": { "Fn::Join": ["", [
"#!/bin/bash\n",
"echo ECS_CLUSTER=", { "Ref": "EcsCluster" }, " >> /etc/ecs/ecs.config\n",
"groupadd -g 1000 jenkins\n",
"useradd -u 1000 -g jenkins jenkins\n",
"mkdir -p /ecs/jenkins_home\n",
"chown -R jenkins:jenkins /ecs/jenkins_home\n"
] ] }
},
"Tags": [ { "Key": "Name", "Value": { "Fn::Join": ["", [ { "Ref": "AWS::StackName"}, "-instance" ] ]} }]
}
},
To troubleshoot Jenkins running in EC2, we would like to use the CloudWatch agent.
However, the documentation only gives pointers on using the CloudWatch agent on an already-created stack.
How do I install the CloudWatch agent as a resource within the existing template (above)?
Below is the CloudFormation template that creates an Elastic Load Balancer as the public-facing endpoint for Jenkins (the jenkins:ecs Docker container) running in a VPC subnet:
{
"AWSTemplateFormatVersion": "2010-09-09",
"Description": "Jenkins Stack",
"Parameters":{
"VpcId": {
"Type": "AWS::EC2::VPC::Id",
"Description": "The target VPC Id"
},
"SubnetId": {
"Type": "AWS::EC2::Subnet::Id",
"Description": "The target subnet Id"
},
"KeyName": {
"Type": "String",
"Description": "The key pair that is allowed SSH access"
}
},
"Resources":{
"EC2Instance":{
"Type": "AWS::EC2::Instance",
"Properties":{
"ImageId": "ami-05958d7635caa4d04",
"InstanceType": "t2.micro",
"SubnetId": { "Ref": "SubnetId"},
"KeyName": { "Ref": "KeyName"},
"SecurityGroupIds": [ { "Ref": "EC2InstanceSecurityGroup"} ],
"IamInstanceProfile": { "Ref" : "EC2InstanceProfile"},
"UserData":{
"Fn::Base64": { "Fn::Join": ["", [
"#!/bin/bash\n",
"echo ECS_CLUSTER=", { "Ref": "EcsCluster" }, " >> /etc/ecs/ecs.config\n",
"groupadd -g 1000 jenkins\n",
"useradd -u 1000 -g jenkins jenkins\n",
"mkdir -p /ecs/jenkins_home\n",
"chown -R jenkins:jenkins /ecs/jenkins_home\n"
] ] }
},
"Tags": [ { "Key": "Name", "Value": { "Fn::Join": ["", [ { "Ref": "AWS::StackName"}, "-instance" ] ]} }]
}
},
"EC2InstanceSecurityGroup":{
"Type": "AWS::EC2::SecurityGroup",
"Properties": {
"GroupDescription": { "Fn::Join": ["", [ { "Ref": "AWS::StackName" }, " ingress security group" ] ] },
"VpcId": { "Ref": "VpcId" },
"SecurityGroupIngress": [
{
"IpProtocol": "tcp",
"FromPort": "8080",
"ToPort": "8080",
"SourceSecurityGroupId": { "Ref": "ElbSecurityGroup"}
},
{
"IpProtocol": "tcp",
"FromPort": "22",
"ToPort": "22",
"CidrIp": "0.0.0.0/0"
}
],
"Tags": [ { "Key": "Name", "Value": { "Fn::Join": ["", [ { "Ref": "AWS::StackName" }, "-ec2-sg" ] ] } } ]
}
},
"EC2InstanceProfile": {
"Type": "AWS::IAM::InstanceProfile",
"Properties": {
"Path": "/",
"Roles": [ { "Ref": "EC2InstanceRole" } ]
}
},
"EC2InstanceRole": {
"Type": "AWS::IAM::Role",
"Properties": {
"AssumeRolePolicyDocument":{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": { "Service": [ "ec2.amazonaws.com" ] },
"Action": [ "sts:AssumeRole" ]
}
]
},
"Path": "/",
"ManagedPolicyArns": [ "arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceforEC2Role" ]
}
},
"ElbSecurityGroup": {
"Type": "AWS::EC2::SecurityGroup",
"Properties": {
"GroupDescription": { "Fn::Join": ["", [ { "Ref": "AWS::StackName" }, " ELB ingress security group" ] ] },
"VpcId": { "Ref": "VpcId"},
"SecurityGroupIngress": [
{
"IpProtocol": "tcp",
"FromPort": "80",
"ToPort": "80",
"CidrIp": "0.0.0.0/0"
}
],
"Tags": [ { "Key": "Name", "Value": { "Fn::Join": ["", [ { "Ref": "AWS::StackName" }, "-elb-sg" ] ] } } ]
}
},
"ElasticLoadBalancer": {
"Type": "AWS::ElasticLoadBalancing::LoadBalancer",
"Properties": {
"CrossZone": "false",
"SecurityGroups": [ { "Ref": "ElbSecurityGroup" } ],
"Listeners": [
{
"LoadBalancerPort": "80",
"InstancePort": "8080",
"Protocol": "http"
}
],
"Instances": [ { "Ref": "EC2Instance"} ],
"Subnets": [ { "Ref": "SubnetId"} ]
}
},
"EcsCluster": {
"Type": "AWS::ECS::Cluster"
},
"EcsTaskDefinition": {
"Type": "AWS::ECS::TaskDefinition",
"Properties": {
"ContainerDefinitions": [
{
"Name": "jenkins",
"Image": "somedockeracct/jenkins:ecs",
"Memory": 500,
"PortMappings": [
{
"ContainerPort": 8080,
"HostPort": 8080
},
{
"ContainerPort": 50000,
"HostPort": 50000
}
],
"MountPoints": [
{
"SourceVolume": "docker",
"ContainerPath": "/var/run/docker.sock"
},
{
"SourceVolume": "jenkins_home",
"ContainerPath": "/var/jenkins_home"
}
]
}
],
"Volumes": [
{
"Name": "jenkins_home",
"Host": { "SourcePath": "/ecs/jenkins_home" }
},
{
"Name": "docker",
"Host": { "SourcePath": "/var/run/docker.sock" }
}
]
}
},
"EcsService": {
"Type": "AWS::ECS::Service",
"Properties": {
"Cluster": { "Ref": "EcsCluster" },
"TaskDefinition": { "Ref": "EcsTaskDefinition" },
"DesiredCount": 1
}
}
},
"Outputs":{
"ElbDomainName": {
"Description": "Public DNS name of Elastic Load Balancer",
"Value": {
"Fn::GetAtt": [
"ElasticLoadBalancer",
"DNSName"
]
}
},
"EC2InstanceDomainName": {
"Description": "Public DNS name of EC2 instance",
"Value": {
"Fn::GetAtt": [
"EC2Instance",
"PublicDnsName"
]
}
}
}
}
where the Dockerfile of the Jenkins master (jenkins:ecs) is:
FROM jenkins/jenkins:2.190.2
MAINTAINER Developer team <devteam@abc.com>
# Suppress apt installation warnings
# https://serverfault.com/a/227194/220043
ENV DEBIAN_FRONTEND=noninteractive
# Official Jenkins image does not include sudo, change to root user
USER root
# Used to set the docker group ID
# Set to 497 by default, which is the groupID used by AWS Linux ECS instance
ARG DOCKER_GID=497
# Create Docker Group with GID
# Set default value of 497 if DOCKER_GID set to blank string by Docker compose
RUN groupadd -g ${DOCKER_GID:-497} docker
# Install base packages for docker, docker-compose & ansible
# apt-key adv --keyserver keyserver.ubuntu.com --recv-keys AA8E81B4331F7F50 && \
RUN apt-get update -y && \
apt-get -y install bc \
gawk \
libffi-dev \
musl-dev \
apt-transport-https \
curl \
python3 \
python3-dev \
python3-setuptools \
gcc \
make \
libssl-dev \
python3-pip
# Used at build time but not runtime
ARG DOCKER_VERSION=18.06.1~ce~3-0~debian
# Install the latest Docker CE binaries and add user `jenkins` to the docker group
RUN apt-get update && \
apt-get -y install apt-transport-https \
ca-certificates \
curl \
gnupg-agent \
software-properties-common && \
curl -fsSL https://download.docker.com/linux/$(. /etc/os-release; echo "$ID")/gpg > /tmp/dkey; apt-key add /tmp/dkey && \
add-apt-repository \
"deb [arch=amd64] https://download.docker.com/linux/$(. /etc/os-release; echo "$ID") \
$(lsb_release -cs) \
stable" && \
apt-get update && \
apt-get -y install docker-ce=${DOCKER_VERSION:-18.06.1~ce~3-0~debian} && \
# docker-ce-cli=${DOCKER_VERSION:-18.06.1~ce~3-0~debian} \
# containerd.io && \
usermod -aG docker jenkins
ARG DOCKER_COMPOSE=1.24.1
# Install docker compose
RUN curl -L "https://github.com/docker/compose/releases/download/${DOCKER_COMPOSE:-1.24.1}/docker-compose-$(uname -s)-$(uname -m)" \
-o /usr/local/bin/docker-compose && \
chmod +x /usr/local/bin/docker-compose && \
pip3 install ansible boto3
# Change to jenkins user
USER jenkins
# Add jenkins plugin
COPY plugins.txt /usr/share/jenkins/plugins.txt
RUN /usr/local/bin/install-plugins.sh < /usr/share/jenkins/plugins.txt
The Jenkins master Docker container runs on the EC2 instance (the Docker host).
In this scenario, the ELB is not used for load balancing but to expose Jenkins publicly. Currently the ELB is connected over HTTP.
How do I enable a secure HTTPS connection to Jenkins via the ELB?
Which component is responsible for ensuring the secure connection: the ELB or Jenkins?
Try setting up an SSL certificate on the ELB using AWS ACM. Once that is done, create a secure HTTPS listener on the ELB and forward that traffic to the HTTP port of Jenkins using the steps mentioned below:
AWS ACM
Navigate to AWS ACM and click on "Request a Certificate" button.
Select "Request a public certificate" option
Add domain names for which the certificate is required
Choose "DNS validation" for certificate validation
Add tags (optional), then review and confirm.
If you are using AWS Route 53 for your DNS, there is a button that automatically creates the CNAME entries for your certificate in Route 53. If you are using any other DNS provider, make sure you create the CNAME records as specified by ACM.
After the CNAME record is verified on the DNS, your ACM certificate status will change from "Pending" to "Issued".
AWS ELB - Classic Load Balancer
Add a new listener to the load balancer with the details below (a CloudFormation sketch follows the list):
Load Balancer Protocol: HTTPS(Secure HTTP)
Load Balancer Port: 443
Instance Protocol: HTTP
Instance Port: 8080 (or whichever other port you have configured Jenkins on)
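In the template above, that translates to adding an HTTPS entry to the ElasticLoadBalancer listeners. A minimal sketch, assuming the ACM certificate ARN is passed in as a new parameter named CertificateArn (not part of the original template):
"ElasticLoadBalancer": {
  "Type": "AWS::ElasticLoadBalancing::LoadBalancer",
  "Properties": {
    "CrossZone": "false",
    "SecurityGroups": [ { "Ref": "ElbSecurityGroup" } ],
    "Listeners": [
      { "LoadBalancerPort": "80", "InstancePort": "8080", "Protocol": "HTTP" },
      { "LoadBalancerPort": "443", "InstancePort": "8080", "Protocol": "HTTPS",
        "SSLCertificateId": { "Ref": "CertificateArn" } }
    ],
    "Instances": [ { "Ref": "EC2Instance" } ],
    "Subnets": [ { "Ref": "SubnetId" } ]
  }
}
The ElbSecurityGroup would also need an ingress rule for port 443, mirroring the existing port 80 rule.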
AWS ELB - Application Load Balancer
If you are using an ALB instead, create a target group with Target type: Instance, Protocol: HTTP, Port: 8080, and point an HTTPS listener at it (a sketch follows).
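A rough sketch of those two resources, assuming an AWS::ElasticLoadBalancingV2::LoadBalancer resource named ApplicationLoadBalancer and the same CertificateArn parameter as above (neither exists in the original template):
"JenkinsTargetGroup": {
  "Type": "AWS::ElasticLoadBalancingV2::TargetGroup",
  "Properties": {
    "VpcId": { "Ref": "VpcId" },
    "TargetType": "instance",
    "Protocol": "HTTP",
    "Port": 8080,
    "Targets": [ { "Id": { "Ref": "EC2Instance" } } ]
  }
},
"HttpsListener": {
  "Type": "AWS::ElasticLoadBalancingV2::Listener",
  "Properties": {
    "LoadBalancerArn": { "Ref": "ApplicationLoadBalancer" },
    "Protocol": "HTTPS",
    "Port": 443,
    "Certificates": [ { "CertificateArn": { "Ref": "CertificateArn" } } ],
    "DefaultActions": [ { "Type": "forward", "TargetGroupArn": { "Ref": "JenkinsTargetGroup" } } ]
  }
}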
If you face any issues with health checks, ensure that your security groups allow traffic on port 8080 from the ELB/ALB.
Another option is to install nginx on the server and use the configuration below for the server block (in this case, change 8080 to 80 in the ELB/ALB configuration mentioned above):
server {
listen 80;
server_name xxx.xxx.xxx;
location / {
proxy_pass http://localhost:8080;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection 'upgrade';
proxy_set_header Host $host;
proxy_cache_bypass $http_upgrade;
}
}
https://docs.aws.amazon.com/acm/latest/userguide/gs-acm-request-public.html
https://docs.aws.amazon.com/elasticloadbalancing/latest/classic/elb-listener-config.html
https://docs.aws.amazon.com/elasticloadbalancing/latest/application/create-https-listener.html
Here is what I have to do:
I have to create an instance using an EC2 CloudFormation template.
After installing certain packages, I would like to reboot the instance through the CloudFormation template itself.
Once the instance has rebooted, I have to complete the execution of the remaining script.
Please suggest how this could be done.
This is my current template:
{
"AWSTemplateFormatVersion" : "2010-09-09",
"Description" : "",
"Parameters": {
"VPCID": {
"Description": "The VPC for this instance",
"Type": "AWS::EC2::VPC::Id",
},
"SubnetID": {
"Description": "The Subnet for this instance",
"Type": "AWS::EC2::Subnet::Id",
},
"AllowedCIDR": {
"Description": "IP address range (in CIDR notation) of the client that will be allowed to connect to the cluster using SSH e.g., 203.0.113.5/32",
"AllowedPattern": "(\\d{1,3})\\.(\\d{1,3})\\.(\\d{1,3})\\.(\\d{1,3})/(\\d{1,2})",
"Type": "String",
"MinLength": "9",
"MaxLength": "18",
"Default": "10.0.0.0/16",
"ConstraintDescription": "must be a valid CIDR range of the form x.x.x.x/x"
},
"SSHKeyName": {
"Description": "The EC2 Key Pair to allow SSH access to the instance",
"Type": "AWS::EC2::KeyPair::KeyName",
},
"TypeOfInstance": {
"Type": "String",
"Default": "t2.medium",
"Description": "Enter t2.medium, t2.large, m3.large, m4.large, m4.xlarge, etc.",
"ConstraintDescription": "Must be a valid EC2 instance type."
}
},
"Resources": {
"Ec2Instance": {
"Type": "AWS::EC2::Instance",
"Properties": {
"SecurityGroupIds": [
{
"Ref": "InstanceSecurityGroup"
}
],
"KeyName": {
"Ref": "SSHKeyName"
},
"ImageId": "ami-a8d369c0",
"SubnetId": { "Ref": "SubnetID" },
"InstanceType": { "Ref": "TypeOfInstance" },
"UserData" : { "Fn::Base64" : { "Fn::Join" : ["", [
"#!/bin/bash -xe\n",
"touch /tmp/testfile\n",
"yum -y install rng-tools\n",
"systemctl start rngd\n",
"systemctl enable rngd\n",
"yum update -y \n",
"echo \"################### Install Packages #######################\"\n",
"reboot \n",
"echo \"################### Install Remaining packages and configuration #######################\"\n",
]]}}
},
"InstanceSecurityGroup": {
"Type": "AWS::EC2::SecurityGroup",
"Properties": {
"GroupDescription": "Enable SSH access via port 22",
"VpcId" : {
"Ref" : "VPCID"
},
"GroupName": "my-securitygroup",
"SecurityGroupIngress": [
{
"IpProtocol": "tcp",
"FromPort": "22",
"ToPort": "22",
"CidrIp": "0.0.0.0/0"
}
]
}
}
}
}
Since there is no way for CloudFormation to resume a user-data script midway after the instance has been stopped and restarted, here is one workaround I can think of.
Save some kind of flag on the instance just before the reboot (for example, cfn-userdata-script-continue). Download the remaining part of your script to the instance and save it to a pre-defined location.
After the reboot, check for the existence of this flag. If the flag exists, navigate to the location where you saved the partial script, run the script, and delete the cfn-userdata-script-continue flag.
You could also use a scheduled task on the EC2 instance to complete the remaining steps. For example, on Windows you can set a task to run once after the reboot; a Linux sketch of the same idea follows.
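A minimal sketch of this approach for the template above, written as the plain user-data script (it would be embedded in the Fn::Join array as in the template); the file paths, the cron entry, and the stage-2 script name are illustrative, not part of the original:
#!/bin/bash -xe
# --- stage 1: runs on first boot via UserData ---
yum -y install rng-tools
systemctl start rngd
systemctl enable rngd
yum -y update
echo "################### Install Packages #######################"

# Write the remaining steps to a second-stage script (illustrative path)
cat > /usr/local/bin/stage2.sh <<'EOF'
#!/bin/bash -xe
if [ -f /var/lib/cfn-userdata-script-continue ]; then
  echo "################### Install Remaining packages and configuration #######################"
  rm -f /var/lib/cfn-userdata-script-continue   # remove the flag so this runs only once
  rm -f /etc/cron.d/stage2                      # remove the scheduled task as well
fi
EOF
chmod +x /usr/local/bin/stage2.sh

# Drop the flag and schedule stage 2 to run once after the reboot
touch /var/lib/cfn-userdata-script-continue
echo "@reboot root /usr/local/bin/stage2.sh" > /etc/cron.d/stage2

reboot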
I am trying to create an RDS cluster and an Aurora instance using the CloudFormation template below:
{
"AWSTemplateFormatVersion" : "2010-09-09",
"Description" : "example setup",
"Parameters" : {
"DBInstanceIdentifier" : {
"Type": "String",
"Description": "Name for the DB instance."
},
"DBUser" : {
"Type": "String",
"Description": "Master user"
},
"DBPassword" : {
"Type": "String",
"Description": "Pass"
},
"DBModel" : {
"Type": "String",
"Description": "Instance model to be used for the DB."
}
},
"Resources": {
"RDSCluster": {
"Type": "AWS::RDS::DBCluster",
"Properties": {
"MasterUsername": { "Ref" : "DBUser" },
"MasterUserPassword": { "Ref" : "DBPassword" },
"Engine": "aurora",
"DBClusterParameterGroupName": "default.aurora5.6",
"VpcSecurityGroupIds": [{"Fn::GetAtt" : [ "DBFromSiteSecurityGroup" , "GroupId" ]}]
}
},
"AuroraInstance": {
"Type": "AWS::RDS::DBInstance",
"Properties": {
"DBInstanceIdentifier": { "Ref" : "DBInstanceIdentifier" },
"DBParameterGroupName": "default.aurora5.6",
"Engine": "aurora",
"DBClusterIdentifier": {
"Ref": "RDSCluster"
},
"PubliclyAccessible": "true",
"DBInstanceClass": { "Ref" : "DBModel" }
}
},
"DBFromSiteSecurityGroup" : {
"Type" : "AWS::EC2::SecurityGroup",
"Properties" : {
"GroupDescription" : "Enable MySQL",
"SecurityGroupIngress" : [
{"IpProtocol" : "tcp", "FromPort" : "3306", "ToPort" : "3306", "CidrIp" : "195.171.102.98/32"}
]
}
},
"DBFromSiteSecurityGroupIngress1" : {
"Type" : "AWS::EC2::SecurityGroupIngress",
"Properties" : {
"GroupName" : { "Ref" : "DBFromSiteSecurityGroup" },
"IpProtocol" : "tcp",
"ToPort" : "3306",
"FromPort" : "3306",
"SourceSecurityGroupName" : { "Ref" : "DBFromSiteSecurityGroup" }
}
}
}
}
The DBModel parameter I am passing is "db.t2.medium". The cluster is created successfully in the CloudFormation console; however, the AWS::RDS::DBInstance creation fails with the following error:
"DeletionPolicy:Snapshot cannot be specified for a cluster instance, use deletion policy on the cluster instead."
What's weirder is that when I try to run the same CF template in, say, the EU (London) region, it works fine! Is there something wrong with the EU (Ireland) region and Aurora?
From AWS Support
This is a known issue and has been reported by other customers as well. The service team is currently working on the fix for this but there is no ETA as to when that would be pushed.
The workaround in the meantime is to specify a DeletionPolicy with the value 'Delete' inside the DB instance resource definition that is failing to create. [1]
An example below:
"Resources": {
"Database1": {
"DeletionPolicy": "Delete",
"Properties": {...},
"Type": "AWS::RDS::DBInstance"
}
}
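Applied to the template in the question, that would mean adding the attribute to the AuroraInstance resource, for example:
"AuroraInstance": {
  "Type": "AWS::RDS::DBInstance",
  "DeletionPolicy": "Delete",
  "Properties": {
    "DBInstanceIdentifier": { "Ref": "DBInstanceIdentifier" },
    "DBParameterGroupName": "default.aurora5.6",
    "Engine": "aurora",
    "DBClusterIdentifier": { "Ref": "RDSCluster" },
    "PubliclyAccessible": "true",
    "DBInstanceClass": { "Ref": "DBModel" }
  }
}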
References:
[1] DeletionPolicy - http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-attribute-deletionpolicy.html#w2ab2c19c23c11c17
An update from AWS Support:
When creating an Amazon Aurora DBInstance in a DB Cluster using AWS
CloudFormation, CloudFormation applies a default Deletion policy of
“Delete”, if a deletion policy is not specified. If a Deletion Policy
of “Snapshot” is specified for an Amazon Aurora DBInstance,
CloudFormation returns an error, because instances in a DB Cluster
cannot be snapshotted individually; Snapshotting must be done at the
DB Cluster level.
As part of a recent deployment, we inadvertently changed the default
deletion policy for an Amazon Aurora DBInstance to “Snapshot”. This
caused our template validation to fail. To remedy this, CloudFormation
is reverting the value of default DeletionPolicy for Amazon Aurora
DBInstances to “Delete”. This fix will be completed by 21st July,
2017. Until this fix is completely rolled out, customers can explicitly override our incorrect defaults by specifying a deletion
policy of “Delete” for Amazon Aurora DBInstances.
We have corrected the gap in our testing that led to this situation,
and will continue to improve our testing to prevent recurrences. We
recognize how critical it is for us to preserve existing behavior for
our customers, and apologize for this inconvenience.
I am creating an AWS EC2 instance in a VPC with internet access using CloudFormation. I am able to create the EC2 instance as expected based on the JSON, but it seems the instance state goes to stopped soon after the instance is created. I was expecting the instance to be up and in the running state as soon as it was created.
Has anyone faced this problem?
I am able to go to the AWS console and manually bring the instance to the running state successfully, though.
Here is the JSON for the EC2 instance:
"PublicEC2Instance": {
"Type": "AWS::EC2::Instance",
"Properties": {
"ImageId": {
"Fn::FindInMap": ["AWSRegionArch2AMI", {
"Ref": "AWS::Region"
},
"64"
]
},
"InstanceType": {
"Ref": "InstanceType"
},
"KeyName": {
"Ref": "KeyPair"
},
"BlockDeviceMappings": [{
"DeviceName": "/dev/sda1",
"Ebs": {
"VolumeSize": "8"
}
}, {
"DeviceName": "/dev/sdm",
"Ebs": {
"VolumeSize": "8"
}
}],
"Tags": [{
"Key": "Name",
"Value": "Sample-PublicEC2"
}],
"UserData": {
"Fn::Base64": {
"Ref": "WebServerPort"
}
},
"NetworkInterfaces": [{
"AssociatePublicIpAddress": "true",
"DeleteOnTermination": "true",
"DeviceIndex": "0",
"SubnetId": {
"Ref": "PublicSubnet"
},
"GroupSet": [{
"Ref": "PublicSecurityGroup"
}]
}]
}
}
The UserData in your template looks invalid: it Base64-encodes only the WebServerPort value rather than a startup script. It's possible that the instance startup aborts on invalid data. Try removing this property and creating the stack again.
If this doesn't solve the problem, you can try looking at the console output of the stopped instance for more information. See Getting Console Output and Rebooting Instances for instructions on how to do this using the AWS Management Console.
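If the intention was to feed WebServerPort into a startup script, a minimal sketch of a valid UserData block might look like this (the script body is illustrative only):
"UserData": {
  "Fn::Base64": {
    "Fn::Join": ["", [
      "#!/bin/bash -xe\n",
      "echo \"Configuring web server on port ", { "Ref": "WebServerPort" }, "\"\n"
    ]]
  }
}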