I have an experiment I'd like to run 100 different times, each with a command line flag set to a different integer value. Each experiment will output the result to a text file. Experiments take about 2 hours each and are independent of each other.
I currently have a Docker image that can run the experiment when provided the command line flag.
I am curious if there is a way to write a script that can launch 100 AWS instances (one for each possible flag value), run the Docker image, and then output the result to a shared text file somewhere. Is this possible? I am very inexperienced with AWS so I'm not sure if this is the proper tool or what steps would be required (besides building the Docker image).
Thanks.
You could do this using Vagrant with the vagrant-aws plugin to spin up the instances, and either the Docker provisioner or the Ansible provisioner to pull your images and run your containers. For example:
.
├── playbook.yml
└── Vagrantfile
The Vagrantfile:
# -*- mode: ruby -*-
# vi: set ft=ruby :
VAGRANTFILE_API_VERSION = "2"
Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
  N = 100
  (1..N).each do |server_id|
    config.vm.box = "dummy"
    config.ssh.forward_agent = true
    config.vm.define "server#{server_id}" do |server|
      server.vm.provider :aws do |aws, override|
        aws.access_key_id = ENV["AWS_ACCESS_KEY_ID"]
        aws.secret_access_key = ENV["AWS_SECRET_ACCESS_KEY"]
        aws.instance_type = "t2.micro"
        aws.block_device_mapping = [
          {
            "DeviceName" => "/dev/sda1",
            "Ebs.VolumeSize" => 30
          }
        ]
        aws.tags = {
          "Name" => "node#{server_id}.example.com",
          "Environment" => "stage"
        }
        aws.subnet_id = "subnet-d65893b0"
        aws.security_groups = [
          "sg-deadbeef"
        ]
        aws.region = "eu-west-1"
        aws.region_config "eu-west-1" do |region|
          region.ami = "ami-0635ad49b5839867c"
          region.keypair_name = "ubuntu"
        end
        aws.monitoring = true
        aws.associate_public_ip = false
        aws.ssh_host_attribute = :private_ip_address
        override.ssh.username = "ubuntu"
        override.ssh.private_key_path = ENV["HOME"] + "/.ssh/id_rsa"
        override.ssh.forward_agent = true
      end
      # Provision only once, from the last machine, against all hosts
      if server_id == N
        server.vm.provision :ansible do |ansible|
          ansible.limit = "all"
          ansible.playbook = "playbook.yml"
          ansible.compatibility_mode = "2.0"
          ansible.raw_ssh_args = "-o ForwardAgent=yes"
          ansible.extra_vars = {
            "ansible_python_interpreter": "/usr/bin/python3"
          }
        end
      end
    end
  end
end
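Note: this example uses the Ansible parallel execution approach from the Vagrant Tips & Tricks: the Ansible provisioner is attached only to the last machine and runs once against all hosts.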
The ansible playbook.yml:
- hosts: all
  pre_tasks:
    - name: get instance facts
      local_action:
        module: ec2_instance_facts
        filters:
          private-dns-name: '{{ ansible_fqdn }}'
          "tag:Environment": stage
      register: _ec2_instance_facts
    - name: add route53 entry
      local_action:
        module: route53
        state: present
        private_zone: yes
        zone: 'example.com'
        record: '{{ _ec2_instance_facts.instances[0].tags["Name"] }}'
        type: A
        ttl: 7200
        value: '{{ _ec2_instance_facts.instances[0].private_ip_address }}'
        wait: yes
        overwrite: yes
  tasks:
    - name: install build requirements
      apt:
        name: ['python3-pip', 'python3-socks', 'git']
        state: present
        update_cache: yes
      become: true
    - name: apt install docker requirements
      apt:
        name: ['apt-transport-https', 'ca-certificates', 'curl', 'gnupg-agent', 'software-properties-common']
        state: present
      become: true
    - name: add docker apt key
      apt_key:
        url: https://download.docker.com/linux/ubuntu/gpg
        state: present
      become: true
    - name: add docker apt repository
      apt_repository:
        repo: 'deb [arch=amd64] https://download.docker.com/linux/ubuntu bionic stable'
        state: present
      become: true
    - name: apt install docker-ce
      apt:
        name: ['docker-ce', 'docker-ce-cli', 'containerd.io']
        state: present
        update_cache: yes
      become: true
    - name: get docker-compose
      get_url:
        url: 'https://github.com/docker/compose/releases/download/1.24.1/docker-compose-{{ ansible_system }}-{{ ansible_userspace_architecture }}'
        dest: /usr/local/bin/docker-compose
        mode: '0755'
      become: true
    - name: pip install docker and boto3
      pip:
        name: ['boto3', 'docker', 'docker-compose']
        executable: pip3
    - name: create docker config directory
      file:
        path: /etc/docker
        state: directory
      become: true
    - name: copy docker daemon.json
      copy:
        content: |
          {
            "group": "docker",
            "log-driver": "journald",
            "live-restore": true,
            "experimental": true,
            "insecure-registries" : [],
            "features": { "buildkit": true }
          }
        dest: /etc/docker/daemon.json
      become: true
    - name: enable docker service
      service:
        name: docker
        enabled: yes
      become: true
    - name: add ubuntu user to docker group
      user:
        name: ubuntu
        groups: docker
        append: yes
      become: true
    - name: restart docker daemon
      systemd:
        state: restarted
        daemon_reload: yes
        name: docker
        no_block: yes
      become: true
    # pull your images then run your containers
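The playbook leaves the last step (pulling the image and running one container per instance with a different flag value) open. A minimal sketch of how the per-instance command could be built is shown here. The image name, the flag name `--seed`, and the results directory are hypothetical placeholders, and the sketch assumes the image writes its result file under `/results` inside the container.

```python
import shlex

# Hypothetical names: replace with your own image, flag, and output directory.
IMAGE = "example/experiment:latest"
FLAG = "--seed"
RESULTS_DIR = "/home/ubuntu/results"


def docker_command(server_id: int) -> str:
    """Build the shell command that runs one experiment on one instance."""
    argv = [
        "docker", "run", "--rm",
        # Bind-mount a host directory so the result file outlives the container.
        "-v", f"{RESULTS_DIR}:/results",
        IMAGE,
        FLAG, str(server_id),
    ]
    return " ".join(shlex.quote(arg) for arg in argv)


if __name__ == "__main__":
    # One command per machine, matching (1..N) in the Vagrantfile.
    for server_id in range(1, 101):
        print(f"server{server_id}: {docker_command(server_id)}")
```

Each printed line could then be run on the matching host, for example from an Ansible task that derives the number from the host name.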
The only approach I can think of is using AWS SSM to run the commands, but you would still need to spin up 100 instances, which I would not consider a good approach.
Below is the set of commands you can use:
Launch an instance using the CloudFormation template below; run it in a loop to create multiple instances:
---
Resources:
  MyInstance:
    Type: AWS::EC2::Instance
    Properties:
      AvailabilityZone: <availabilityZone>
      ImageId: <amiID>
      InstanceType: t2.micro
      KeyName: <KeyName>
Use the command below to get the instance ID:
aws ec2 describe-instances --filters 'Name=tag:Name,Values=EC2' --query 'Reservations[*].Instances[*].InstanceId' --output text
Using that instance ID, run the command below:
aws ssm send-command --instance-ids "<instanceID>" --document-name "AWS-RunShellScript" --comment "<COMMENT>" --parameters commands='sudo yum update -y' --output text
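To repeat the two commands above for many instances, the loop could be scripted roughly as follows. This is only a sketch: it builds the `aws ssm send-command` argument lists without executing them, and the script name `run_experiment.sh`, the `--flag` option, and the bucket `example-results-bucket` are hypothetical. It also assumes the instances already have the SSM agent and an instance role that permits SSM and S3 access.

```python
import json
import shlex
from typing import List


def parse_instance_ids(describe_output: str) -> List[str]:
    """Split the `--output text` result of describe-instances into IDs."""
    return describe_output.split()


def send_command_argv(instance_id: str, flag_value: int) -> List[str]:
    """Build one `aws ssm send-command` call for one experiment."""
    result = f"/tmp/result_{flag_value}.txt"
    commands = [
        # Hypothetical experiment entry point and flag name.
        f"./run_experiment.sh --flag {flag_value} > {result}",
        # Hypothetical bucket; collects every result in one place.
        f"aws s3 cp {result} s3://example-results-bucket/result_{flag_value}.txt",
    ]
    return [
        "aws", "ssm", "send-command",
        "--instance-ids", instance_id,
        "--document-name", "AWS-RunShellScript",
        "--comment", f"experiment {flag_value}",
        "--parameters", json.dumps({"commands": commands}),
        "--output", "text",
    ]


if __name__ == "__main__":
    # Example IDs standing in for real describe-instances output.
    ids = parse_instance_ids("i-0aaa\ti-0bbb\ni-0ccc\n")
    for flag_value, instance_id in enumerate(ids, start=1):
        print(shlex.join(send_command_argv(instance_id, flag_value)))
```

Printing the commands first makes it easy to inspect them before passing each list to `subprocess.run`.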
I don't think Docker will be of any help here, as it would complicate things for you because of the SSM agent installation. Your best bet is to run the commands one by one and store the final output in S3.
I am trying to provide initial configuration and software installation to a newly created AWS EC2 instance by using Ansible. If I run my playbooks independently it works just as I want. However, if I try to automate it into a single playbook by using two imports, it doesn't work (probably because the dynamic inventory can't get the newly created IP address?)...
Running together:
[WARNING]: Could not match supplied host pattern, ignoring:
aws_region_eu_central_1
PLAY [variables from dynamic inventory] ****************************************
skipping: no hosts matched
Running separately:
TASK [Gathering Facts] *********************************************************
[WARNING]: Platform linux on host XX.XX.XX.XX is using the discovered Python
interpreter at /usr/bin/python, but future installation of another Python
interpreter could change the meaning of that path. See https://docs.ansible.com
/ansible/2.10/reference_appendices/interpreter_discovery.html for more
information.
ok: [XX.XX.XX.XX]
This is my main playbook:
- import_playbook: server-setup.yml
- import_playbook: server-configuration.yml
server-setup.yml:
---
# variables from dynamic inventory
- name: variables from dynamic inventory
  remote_user: ec2-user
  hosts: localhost
  roles:
    - ec2-instance
server-configuration.yml:
---
# variables from dynamic inventory
- name: variables from dynamic inventory
  remote_user: ec2-user
  become: true
  become_method: sudo
  become_user: root
  ignore_unreachable: true
  hosts: aws_region_eu_central_1
  gather_facts: false
  pre_tasks:
    - pause:
        minutes: 5
  roles:
    - { role: epel, sudo: true }
    - { role: nodejs, sudo: true }
This is my ansible.cfg file:
[defaults]
inventory = test_aws_ec2.yaml
private_key_file = master-key.pem
enable_plugins = aws_ec2
host_key_checking = False
pipelining = True
log_path = ansible.log
roles_path = /roles
forks = 1000
and finally my hosts.ini:
[local]
localhost ansible_python_interpreter=/usr/local/bin/python3
Objective of my effort: create an EKS node with a custom AMI (Ubuntu).
Issue statement: on creating aws_eks_node_group along with launch_template, I get this error:
Error: error waiting for EKS Node Group (qa-svr-centinela-eks-cluster01:qa-svr-centinela-nodegroup01) creation: AsgInstanceLaunchFailures: Could not launch On-Demand Instances. Unsupported - The requested configuration is currently not supported. Please check the documentation for supported configurations. Launching EC2 instance failed.. Resource IDs: [eks-82bb24f0-2d7e-ba9d-a80a-bb9653cde0c6]
Research so far: according to AWS, we can now use custom AMIs for EKS.
The custom Ubuntu image I am using is built with Packer. I was encrypting the boot volume with an external AWS KMS key, so at first I thought the AMI encryption might be causing the problem, and I removed the encryption from the Packer code.
But it didn't resolve the issue. Maybe I am not thinking in the right direction?
Any help is much appreciated. Thanks.
The Terraform code I used is in the post below.
I am attempting to create an EKS node group with a launch template, but I am running into an error.
packer code
source "amazon-ebs" "ubuntu18" {
ami_name = "pxx3"
ami_virtualization_type = "hvm"
tags = {
"cc" = "sxx1"
"Name" = "packerxx3"
}
region = "us-west-2"
instance_type = "t3.small"
# AWS Ubuntu AMI
source_ami = "ami-0ac73f33a1888c64a"
associate_public_ip_address = true
ebs_optimized = true
# public subnet
subnet_id = "subnet-xx"
vpc_id = "vpc-xx"
communicator = "ssh"
ssh_username = "ubuntu"
}
build {
sources = [
"source.amazon-ebs.ubuntu18"
]
provisioner "ansible" {
playbook_file = "./ubuntu.yml"
}
}
ubuntu.yml - only used for installing a few libraries
---
- hosts: default
gather_facts: no
become: yes
tasks:
- name: create the license key for new relic agent
shell: |
curl -s https://download.newrelic.com/infrastructure_agent/gpg/newrelic-infra.gpg | apt-key add - && \
printf "deb [arch=amd64] https://download.newrelic.com/infrastructure_agent/linux/apt bionic main" | tee -a /etc/apt/sources.list.d/newrelic-infra.list
- name: check sources.list
shell: |
cat /etc/apt/sources.list.d/newrelic-infra.list
- name: apt-get update
apt: update_cache=yes force_apt_get=yes
- name: install new relic agent
package:
name: newrelic-infra
state: present
- name: update apt-get repo and cache
apt: update_cache=yes force_apt_get=yes
- name: apt-get upgrade
apt: upgrade=dist force_apt_get=yes
- name: install essential softwares
package:
name: "{{ item }}"
state: latest
loop:
- software-properties-common
- vim
- nano
- glibc-source
- groff
- less
- traceroute
- whois
- telnet
- dnsutils
- git
- mlocate
- htop
- zip
- unzip
- curl
- ruby-full
- wget
ignore_errors: yes
- name: Add the ansible PPA to your system’s sources list
apt_repository:
repo: ppa:ansible/ansible
state: present
mode: 0666
- name: Add the deadsnakes PPA to your system’s sources list
apt_repository:
repo: ppa:deadsnakes/ppa
state: present
mode: 0666
- name: install softwares
package:
name: "{{ item }}"
state: present
loop:
- ansible
- python3.8
- python3-winrm
ignore_errors: yes
- name: install AWS CLI
shell: |
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
./aws/install
aws_eks_node_group configuration.
resource "aws_eks_node_group" "nodegrp" {
cluster_name = aws_eks_cluster.eks.name
node_group_name = "xyz-nodegroup01"
node_role_arn = aws_iam_role.eksnode.arn
subnet_ids = [data.aws_subnet.tf_subnet_private01.id, data.aws_subnet.tf_subnet_private02.id]
scaling_config {
desired_size = 2
max_size = 2
min_size = 2
}
depends_on = [
aws_iam_role_policy_attachment.nodepolicy01,
aws_iam_role_policy_attachment.nodepolicy02,
aws_iam_role_policy_attachment.nodepolicy03
]
launch_template {
id = aws_launch_template.eks.id
version = aws_launch_template.eks.latest_version
}
}
aws_launch_template configuration.
resource "aws_launch_template" "eks" {
name = "${var.env}-launch-template"
update_default_version = true
block_device_mappings {
device_name = "/dev/sda1"
ebs {
volume_size = 50
}
}
credit_specification {
cpu_credits = "standard"
}
ebs_optimized = true
# AMI generated with packer (is private)
image_id = "ami-0ac71233a184566453"
instance_type = "t3.micro"
key_name = "xyz"
network_interfaces {
associate_public_ip_address = false
}
}
I have used Ansible to create 1 AWS EC2 instance using the examples in the Ansible ec2 documentation. I can successfully create the instance with a tag. Then I temporarily add it to my local inventory group using add_host.
After doing this, I am having trouble configuring the newly created instance. In my Ansible play, I would like to specify the instance by its tag name, e.g. hosts: <tag_name_here>, but I am getting an error.
Here is what I have done so far:
My directory layout is
inventory/
staging/
hosts
group_vars/
all/
all.yml
site.yml
My inventory/staging/hosts file is
[local]
localhost ansible_connection=local ansible_python_interpreter=/home/localuser/ansible_ec2/.venv/bin/python
My inventory/staging/group_vars/all/all.yml file is
---
ami_image: xxxxx
subnet_id: xxxx
region: xxxxx
launched_tag: tag_Name_NginxDemo
Here is my Ansible playbook site.yml
- name: Launch instance
  hosts: localhost
  gather_facts: no
  tasks:
    - ec2:
        key_name: key-nginx
        group: web_sg
        instance_type: t2.micro
        image: "{{ ami_image }}"
        wait: true
        region: "{{ region }}"
        vpc_subnet_id: "{{ subnet_id }}"
        assign_public_ip: yes
        instance_tags:
          Name: NginxDemo
        exact_count: 1
        count_tag:
          Name: NginxDemo
      register: ec2
    - name: Add EC2 instance to inventory group
      add_host:
        hostname: "{{ item.public_ip }}"
        groupname: tag_Name_NginxDemo
        ansible_user: centos_user
        ansible_become: yes
      with_items: "{{ ec2.instances }}"
- name: Configure EC2 instance in launched group
  hosts: tag_Name_NginxDemo
  become: True
  gather_facts: no
  tasks:
    - ping:
I run this playbook with
$ cd /home/localuser/ansible_ec2
$ source .venv/bin/activate
$ ansible-playbook -i inventory/staging site.yml -vvv
and this creates the EC2 instance - the 1st play works correctly. However, the 2nd play gives the following error
TASK [.....] ******************************************************************
The authenticity of host 'xx.xxx.xxx.xx (xx.xxx.xxx.xx)' can't be established.
ECDSA key fingerprint is XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX.
Are you sure you want to continue connecting (yes/no)? yes
fatal: [xx.xxx.xxx.xx]: FAILED! => {"changed": false, "module_stderr":
"Shared connection to xx.xxx.xxx.xx closed.\r\n", "module_stdout": "/bin/sh:
1: /usr/bin/python: not found\r\n", "msg": "MODULE FAILURE", "rc": 127}
I followed the instructions from this SO question to create the task with add_host, and from here to set gather_facts: False, but this still does not allow the play to run correctly.
How can I target the EC2 host using the tag name?
EDIT:
Additional info
This is the only playbook I have run to this point. I see this message requires Python but I cannot install Python on the instance as I cannot connect to it in my play Configure EC2 instance in launched group...if I could make that connection, then I could install Python (if this is the problem). Though, I'm not sure how to connect to the instance.
EDIT 2:
Here is my Python info on the localhost where I am running Ansible
I am running Ansible inside a Python venv.
Here is my python inside the venv
$ python --version
Python 2.7.15rc1
$ which python
~/ansible_ec2/.venv/bin/python
Here are my details about Ansible that I installed inside the Python venv
ansible 2.6.2
config file = /home/localuser/ansible_ec2/ansible.cfg
configured module search path = [u'/home/localuser/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
ansible python module location = /home/localuser/ansible_ec2/.venv/local/lib/python2.7/site-packages/ansible
executable location = /home/localuser/ansible_ec2/.venv/bin/ansible
python version = 2.7.15rc1 (default, xxxx, xxxx) [GCC 7.3.0]
OK, so after a lot of searching, I found one possible workaround here. Basically, this workaround uses the lineinfile module to add the new EC2 instance details to the hosts file permanently, not just in memory for the plays that follow the add_host task. I followed this suggestion very closely and the approach worked for me. I did not need to use the add_host module.
EDIT:
The task I added with the lineinfile module was
- name: Add EC2 instance to inventory group
  lineinfile: line="{{ item.public_ip }} ansible_python_interpreter=/usr/bin/python3" insertafter=EOF dest=./inventory/staging/hosts
  with_items: "{{ ec2.instances }}"
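For readers unfamiliar with lineinfile, the effect of that task (make sure a given line is present, appending it at the end of the file if it is missing) can be sketched in plain Python. This is an illustration of the idea, not the module's implementation; the IP address below is a placeholder from the documentation range.

```python
import tempfile
from pathlib import Path


def ensure_line(path: Path, line: str) -> bool:
    """Append `line` unless an identical line exists. Return True if changed."""
    text = path.read_text() if path.exists() else ""
    if line in text.splitlines():
        return False  # already present, nothing to do (idempotent)
    # Start on a fresh line if the file does not end with a newline.
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    with path.open("a") as handle:
        handle.write(f"{prefix}{line}\n")
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        hosts = Path(tmp) / "hosts"
        hosts.write_text("[local]\nlocalhost ansible_connection=local\n")
        # Placeholder address standing in for item.public_ip.
        entry = "203.0.113.10 ansible_python_interpreter=/usr/bin/python3"
        print(ensure_line(hosts, entry))  # first call appends the line
        print(ensure_line(hosts, entry))  # second call changes nothing
        print(hosts.read_text())
```

The `ansible_python_interpreter=/usr/bin/python3` part of the line is what addresses the `/usr/bin/python: not found` failure, since it tells Ansible which interpreter to use on the new host.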
I am trying to create an AMI using ansible-playbook. I have already exported the AWS access key and secret key into my environment, and the Ansible version is 2.0.0.
- hosts: localhost
tasks:
- name: create ami
ec2_ami:
region: "ap-southeast-1"
instance_id: "i-c2xxxx"
name: "jmicro"
wait: yes
register: ami
But when I run the command ansible-playbook create_ami.yml, I get this error:
ERROR: Syntax Error while loading YAML script, create_ami.yml
Note: The error may actually appear before this position: line 5, column 1
ec2_ami:
region: "ap-southeast-1"
Is there something wrong with my YAML script? When I run:
# ansible localhost -m ec2_ami -a "instance_id=i-c2xxxx region=ap-southeast-1 wait=yes name=jmicro"
it succeeds!
There are some strange characters in front of tasks. I don't know what they are, but when I copy your code block into my editor and try to remove the whitespace with Backspace, it actually deletes the characters to the right of the cursor.
YAML only allows spaces for indenting lines, and I think that is the issue here. Also, tasks needs to be at the same level as hosts. Other than that, your definition looks OK to me.
- hosts: localhost
  tasks:
    - name: create ami
      ec2_ami:
        region: "ap-southeast-1"
        instance_id: "i-c2xxxx"
        name: "jmicro"
        wait: yes
      register: ami
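To track down this kind of problem in your own file, a small check for indentation characters other than plain spaces can help. The sketch below is an assumption about what the strange characters might be (tabs, non-breaking spaces, zero-width characters); the answer above does not identify them.

```python
import unicodedata

# Invisible characters that str.isspace() does not report as whitespace.
EXTRA_INVISIBLE = "\u200b\ufeff"


def find_bad_indentation(text: str):
    """Return (line_number, character_name) for each line whose leading
    whitespace contains something other than a plain space."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        for ch in line:
            if ch == " ":
                continue  # plain spaces are the only valid YAML indentation
            if ch.isspace() or ch in EXTRA_INVISIBLE:
                # Fall back to repr() for characters without a Unicode name.
                problems.append((number, unicodedata.name(ch, repr(ch))))
            break  # stop at the first non-space character on the line
    return problems


if __name__ == "__main__":
    sample = "- hosts: localhost\n\u00a0\u00a0tasks:\n  - name: create ami\n"
    for number, name in find_bad_indentation(sample):
        print(f"line {number}: indentation contains {name}")
```

Running it against the playbook text, for example `find_bad_indentation(open("create_ami.yml", encoding="utf-8").read())`, lists the lines that need retyping with ordinary spaces.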
I am using ec2.py dynamic inventory for provisioning with ansible.
I have placed the ec2.py in /etc/ansible/hosts file and marked it executable.
I also have the ec2.ini file in /etc/ansible/hosts.
[ec2]
regions = us-west-2
regions_exclude = us-gov-west-1,cn-north-1
destination_variable = public_dns_name
vpc_destination_variable = ip_address
route53 = False
all_instances = True
all_rds_instances = False
cache_path = ~/.ansible/tmp
cache_max_age = 0
nested_groups = False
group_by_instance_id = True
group_by_region = True
group_by_availability_zone = True
group_by_ami_id = True
group_by_instance_type = True
group_by_key_pair = True
group_by_vpc_id = True
group_by_security_group = True
group_by_tag_keys = True
group_by_tag_none = True
group_by_route53_names = True
group_by_rds_engine = True
group_by_rds_parameter_group = True
Above is my ec2.ini file
---
- hosts: localhost
connection: local
gather_facts: yes
vars_files:
- ../group_vars/dev_vpc
- ../group_vars/dev_sg
- ../hosts_vars/ec2_info
vars:
instance_type: t2.micro
tasks:
- name: Provisioning EC2 instance
local_action:
module: ec2
region: "{{ region }}"
key_name: "{{ key }}"
instance_type: "{{ instance_type }}"
image: "{{ ami_id }}"
wait: yes
group_id: ["{{ sg_npm }}", "{{sg_ssh}}"]
vpc_subnet_id: "{{ PublicSubnet }}"
source_dest_check: false
instance_tags: '{"Name": "EC2", "Environment": "Development"}'
register: ec2
- name: associate new EIP for the instance
local_action:
module: ec2_eip
region: "{{ region }}"
instance_id: "{{ item.id }}"
with_items: ec2.instances
- name: Waiting for NPM Server to come-up
local_action:
module: wait_for
host: "{{ ec2 }}"
state: started
delay: 5
timeout: 200
- include: ec2-configure.yml
Now the configuring script is as follows
- name: Configure EC2 server
  hosts: tag_Name_EC2
  user: ec2-user
  sudo: True
  gather_facts: True
  tasks:
    - name: Install nodejs related packages
      yum: name={{ item }} enablerepo=epel state=present
      with_items:
        - nodejs
        - npm
However, when the configure script is called, it finds no hosts.
If I run ec2-configure.yml on its own while the EC2 server is up and running, it is able to find the server and configure it.
I added the wait_for to make sure that the instance is in the running state before ec2-configure.yml is called.
I would appreciate it if anyone could point out my error. Thanks.
After some research I learned that the dynamic inventory doesn't refresh between playbook calls; it is only refreshed if you execute the playbooks separately.
However, I was able to resolve the issue by using the add_host module.
- name: Add Server to inventory
  local_action: add_host hostname={{ item.public_ip }} groupname=webserver
  with_items: webserver.instances
With Ansible 2.0+, you can refresh the dynamic inventory in the middle of the playbook with a task like this:
- meta: refresh_inventory
To extend this a bit: if you are having problems with the cache in your playbook, you can use it like this:
- name: Refresh the ec2.py cache
  shell: "./inventory/ec2.py --refresh-cache"
  changed_when: no
- name: Refresh inventory
  meta: refresh_inventory
where ./inventory is the path to your dynamic inventory; adjust it accordingly.
Hope this will help you.
The Configure EC2 server play can't find any hosts from the EC2 dynamic inventory because the new instance was added in the first play of the playbook, during the same execution. The group tag_Name_EC2 didn't exist when the inventory was read and thus can't be found.
When you run the same playbook again, Configure EC2 server should find the group.
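As background on where a group name like tag_Name_EC2 comes from: the ec2.py inventory script derives one group per instance tag. The sketch below approximates that naming under the script's default settings as I understand them (characters other than letters, digits, and underscores are replaced with underscores); treat it as an illustration rather than the script's exact code.

```python
import re


def to_safe(word: str) -> str:
    """Replace every character that is not a letter, digit, or underscore."""
    return re.sub(r"[^A-Za-z0-9_]", "_", word)


def tag_group(key: str, value: str) -> str:
    """Approximate the inventory group name generated for one instance tag."""
    return to_safe(f"tag_{key}={value}")


if __name__ == "__main__":
    # Tags taken from the instance_tags used in the question.
    for key, value in {"Name": "EC2", "Environment": "Development"}.items():
        print(tag_group(key, value))
```

Because the group is computed from the tags of instances that already exist when the inventory script runs, an instance launched later in the same run has no group until the inventory is read again.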
We have used the following workaround to guide users in this kind of situation.
First, provision the instance:
tasks:
  - name: Provisioning EC2 instance
    local_action:
      module: ec2
      ...
    register: ec2
Then add a new play before ec2-configure.yml. The play uses the ec2 variable that was registered in Provisioning EC2 instance, and it fails and exits the playbook if any instances were launched:
- name: Stop and request a re-run if any instances were launched
  hosts: localhost
  gather_facts: no
  tasks:
    - name: Stop if instances were launched
      fail: msg="Re-run the playbook to load group variables from EC2 dynamic inventory for the just launched instances!"
      when: ec2.changed
- include: ec2-configure.yml
You can also refresh the cache:
ec2.py --refresh-cache
Or, if you're using ec2.py as the Ansible hosts file:
/etc/ansible/hosts --refresh-cache