Adding Domain Nameserver into Google Container Optimized OS - google-cloud-platform

I would like to prepend our own domain nameserver in COS. How should I do it?
Is it enough to create the following in /etc/dhcp/dhclient.conf:
prepend domain-name-servers <domain ip>;
I have added the above configuration but I am still not able to use my domain in the COS VM instance. Is there something I've missed?
How do I restart the network adapter in COS without resetting/rebooting?

COS uses "cloud-init". If you want to add dns server as configs like this to COS, you'd use cloud-init as a way to configure your instance when it boots up. The cloud-init tool expects its configuration in the value of the user-data key of the instance metadata. For more information1
To pass the configurations of cloud-init to the instance, you need to create your instance with the flag: --metadata-from-file user-data=[filename], or add the user-data=[filename] key value pair to the instance from the console, where the file would be stored on an external location like cloud storage, to which you'd provide the URL. It's also possible to just copy the config into the value section when setting the metadata. Example configurations to specify name servers and domains can be found in the following link.
By replacing the yaml config value in metadata (but keeping the "user-data" key) with the following config, you can configure resolv.conf to use custom name servers and get the instance to use those name servers for address resolution.
As an example you can create a file called cloud-config-resolv containing the following:
#cloud-config
write_files:
- path: /etc/systemd/resolved.conf
  permissions: "0644"
  owner: root:root
  content: |
    # This is my custom resolv.conf!
    [Resolve]
    # Replace 8.8.8.8 with your own name server's IP
    DNS=8.8.8.8
runcmd:
- ['systemctl', 'restart', 'systemd-resolved']
You can then run the following command to add [Your-IP] to the resolv.conf.
gcloud compute instances create instance-name \
--image-family cos-stable \
--image-project cos-cloud \
--metadata-from-file user-data=cloud-config-resolv \
--zone us-central1-a
I haven't confirmed that it will persist after 24 hours, when the DHCP lease is renewed and any changes are cleared, but the file does persist through network daemon restarts and VM restarts.
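To verify the name server was picked up after boot, you can check the written file and the resolver status from inside the instance. This is just a quick sanity check reusing the instance name and zone from the example above; depending on the systemd version in your COS image, the status command may be resolvectl status instead of systemd-resolve --status:
# Confirm the custom DNS entry was written and is in use
gcloud compute ssh instance-name --zone us-central1-a \
  --command 'grep ^DNS /etc/systemd/resolved.conf; systemd-resolve --status | grep -A2 "DNS Servers"'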

Related

Ansible Dynamic inventory with static group with dynamic children

I am sure many who work with Terraform and Ansible or just Ansible on a daily basis must have come across this question.
Some background:
I create my infrastructure on AWS using Terraform and configure my machines using Ansible. My inventory file contains hardcoded public IP addresses with some variables. As the business demands, I create and destroy my machines very often.
My question:
I don't want to update my inventory file with new public IP addresses every time I destroy and create my instances. So my fundamental requirement is: every time I destroy a machine I should be able to run my Terraform script to recreate the machines, and when I run my Ansible playbook, Ansible should be able to pick up the right target machines and run the playbook. I need to know what to describe in my inventory file to achieve this automation. Domain names (www.fooexample.com) and static public IP addresses in the inventory file are not an option in my case. I have seen scripts that do it with what looks like a hostname (webserver1).
There are forums that talk about using the ec2.py option, but ec2.py gets all the public IP addresses associated with the account, and as you can imagine I only want to target some of the machines with my playbook, not all of them.
Any help regarding this would be appreciated.
Thanks in Advance
I do something similar in GCP but the concept should apply to AWS.
Starting with Ansible 2.7 there is a new inventory plugin architecture and some inventory plugins to replace the dynamic inventory scripts (such as ec2.py and gcp.py). The AWS plugin documentation is at https://docs.ansible.com/ansible/2.9/plugins/inventory/aws_ec2.html.
First, you need to tag the groups of hosts you want to target in AWS. You should be able to handle this with Terraform (such as Service = Web).
Next, enable the aws_ec2 plugin in ansible.cfg by adding:
[inventory]
enable_plugins = aws_ec2
Now, convert over to using the new plugin instead of ec2.py. This means creating an aws_ec2.yaml file based on the documentation. An example might look like:
plugin: aws_ec2
regions:
  - us-east-1
keyed_groups:
  - prefix: tag
    key: tags
# Set individual variables with compose
compose:
  ansible_host: public_ip_address
The key parts here are the keyed_groups and compose sections. This will give you the public IP addresses as the hosts to connect to in the inventory, and groups you can limit to with -l or --limit.
Assuming you had some instances in us-east-1 tagged with Service = Web, you could target them like:
ansible -i aws_ec2.yaml -m ping -l tag_Service_Web
This would target just those tagged hosts on their public IP address. Any dynamic scaling you do (such as increasing the count in Terraform for that resource) will be picked up by the inventory plugin on next run.
You can also use the tag in playbooks. If you had a playbook that you always targeted at these hosts you can set hosts: tag_Service_Web in the playbook.
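If you want to double-check which groups the plugin generated (for example, that the tag-based group name really comes out as tag_Service_Web), you can dump the inventory graph; this is just a sanity check, not something the setup requires:
# Show all generated groups and the hosts in them
ansible-inventory -i aws_ec2.yaml --graph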
Bonus:
I've been experimenting with an Ansible Pull model that automates some of this bootstrapping. The idea is to combine cloud-init with a special script to bootstrap the playbook for that host automatically.
Example script that cloud-init kicks off:
#!/bin/bash
set -euo pipefail

lock_files=(
  /var/lib/dpkg/lock
  /var/lib/apt/lists/lock
  /var/lib/dpkg/lock-frontend
  /var/cache/apt/archives/lock
  /var/lib/apt/daily_lock
)

export ANSIBLE_HOST_PATTERN_MISMATCH="ignore"
export PATH="/tmp/ansible-venv/bin:$PATH"

# Wait for any apt/dpkg locks (e.g. from unattended upgrades) to be released
for file in "${lock_files[@]}"; do
  while fuser "$file" >/dev/null 2>&1; do
    echo "Waiting for lock $file to be available..."
    sleep 5
  done
done

apt-get update -qy
apt-get install --no-install-recommends -qy virtualenv python-virtualenv python-nacl python-wheel python-bcrypt

virtualenv -p /usr/bin/python --system-site-packages /tmp/ansible-venv
pip install ansible==2.7.10 apache-libcloud==2.3.0 jmespath==0.9.3

ansible-pull myplaybook.yaml \
  -U git@github.com:myorg/infrastructure.git \
  -i gcp_compute.yaml \
  --private-key /tmp/ansible-keys/infrastructure_ssh_deploy_key \
  --vault-password-file /tmp/ansible-keys/vault \
  -d /tmp/ansible-infrastructure \
  --accept-host-key
This script is a bit simplified from my actual one (leaving out some domain specific authentication and key providing stuff). But you can adapt it to AWS by doing something like bootstrapping keys from S3 or KMS or another boot time configuration service. I find that ansible-pull works well when the playbook only takes a minute or two to run and doesn't have any dependencies on external inventory (like references to other groups such as to gather IP addresses).
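For an AWS adaptation of the key-providing step mentioned above, the bootstrap might pull the deploy key and vault password from a private S3 bucket before ansible-pull runs. This is only a sketch: the bucket name and key paths are made up, and it assumes the instance profile grants read access to that bucket:
# Fetch Ansible secrets from a private S3 bucket (hypothetical bucket/paths)
mkdir -p /tmp/ansible-keys
aws s3 cp s3://my-bootstrap-bucket/ansible-keys/infrastructure_ssh_deploy_key /tmp/ansible-keys/
aws s3 cp s3://my-bootstrap-bucket/ansible-keys/vault /tmp/ansible-keys/
chmod 600 /tmp/ansible-keys/infrastructure_ssh_deploy_key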

Ansible GCP IAP tunnel

I’m trying to connect to a GCP compute instance through IAP. I have a service account with permissions.
I have tried the following
A basic Ansible ping, ansible -vvvv GCP -m ping, which errors because the host name is not found, since I do not have an external IP.
I have set ssh_executable=wrapper.sh as described here.
Option 2 is almost working, but regexing the commands is hacky.
Is there a native ansible solution?
Edit: The gcp_compute dynamic inventory does work for pinging instances but it does not work for managing the instances.
Ansible does NOT support package or system management while tunneling through IAP.
For those who are still looking for a solution to use IAP SSH with Ansible on an internal IP: I've made some changes to the scripts listed here.
My main problem was that I had to add --zone as an option, as gcloud wouldn't automatically detect it when run through Ansible.
As I didn't want to call the CLI and add more wait time, I've opted to use group_vars to set my SSH options. This also allows me to specify other options to the gcloud compute ssh command.
Here are the contents of the files needed for setup:
ansible.cfg
[inventory]
enable_plugins = gcp_compute
[defaults]
inventory = misc/inventory.gcp.yml
interpreter_python = /usr/bin/python
[ssh_connection]
# Enabling pipelining reduces the number of SSH operations required
# to execute a module on the remote server.
# This can result in a significant performance improvement
# when enabled.
pipelining = True
scp_if_ssh = False
ssh_executable = misc/gcp-ssh-wrapper.sh
ssh_args = None
misc/gcp-ssh-wrapper.sh
#!/bin/bash
# This is a wrapper script allowing us to use GCP's IAP SSH option to connect
# to our servers.

# Ansible passes a large number of SSH parameters along with the hostname as the
# second to last argument and the command as the last. We will pop the last two
# arguments off of the list and then pass all of the other SSH flags through
# without modification:
host="${@: -2: 1}"
cmd="${@: -1: 1}"

# Unfortunately ansible has hardcoded ssh options, so we need to filter these out.
# It's an ugly hack, but for now we'll only accept the options starting with '--'
declare -a opts
for ssh_arg in "${@: 1: $# -3}" ; do
  if [[ "${ssh_arg}" == --* ]] ; then
    opts+="${ssh_arg} "
  fi
done

exec gcloud compute ssh $opts "${host}" -- -C "${cmd}"
group_vars/all.yml
---
ansible_ssh_args: --tunnel-through-iap --zone={{ zone }} --no-user-output-enabled --quiet
As you can see, by using the ansible_ssh_args from the group_vars, we can now pass the zone as it's already known through the inventory.
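With ansible.cfg, the wrapper script, and group_vars/all.yml in place, a quick connectivity test could look like the following. This assumes the gcp_compute inventory file referenced in ansible.cfg exists and that your account has the IAP-secured Tunnel User role:
# Ping every host in the dynamic inventory through the IAP tunnel
ansible all -m ping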
If you also want to be able to copy files through gcloud commands, you can use the following configuration:
ansible.cfg
[ssh_connection]
# Enabling pipelining reduces the number of SSH operations required to
# execute a module on the remote server. This can result in a significant
# performance improvement when enabled.
pipelining = True
ssh_executable = misc/gcp-ssh-wrapper.sh
ssh_args = None
# Tell ansible to use SCP for file transfers when connection is set to SSH
scp_if_ssh = True
scp_executable = misc/gcp-scp-wrapper.sh
misc/gcp-scp-wrapper.sh
#!/bin/bash
# This is a wrapper script allowing us to use GCP's IAP option to connect
# to our servers.

# Ansible passes a large number of SSH parameters along with the hostname as the
# second to last argument and the command as the last. We will pop the last two
# arguments off of the list and then pass all of the other SSH flags through
# without modification:
host="${@: -2: 1}"
cmd="${@: -1: 1}"

# Unfortunately ansible has hardcoded scp options, so we need to filter these out.
# It's an ugly hack, but for now we'll only accept the options starting with '--'
declare -a opts
for scp_arg in "${@: 1: $# -3}" ; do
  if [[ "${scp_arg}" == --* ]] ; then
    opts+="${scp_arg} "
  fi
done

# Remove [] around our host, as gcloud scp doesn't understand this syntax
cmd=$(echo "${cmd}" | tr -d '[]')

exec gcloud compute scp $opts "${host}" "${cmd}"
group_vars/all.yml
---
ansible_ssh_args: --tunnel-through-iap --zone={{ zone }} --no-user-output-enabled --quiet
ansible_scp_extra_args: --tunnel-through-iap --zone={{ zone }} --quiet
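To exercise the SCP wrapper as well, you could push a file with the copy module. The source and destination paths below are placeholders:
# Copy a local file to all hosts through the gcloud compute scp wrapper
ansible all -m copy -a "src=./files/motd dest=/tmp/motd"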
The gce dynamic inventory does not work unless all the hosts in the inventory are publicly accessible. For private IPs, the tunnel is not invoked when Ansible commands are executed. The gce dynamic inventory will return the inventory, but you can't actually send commands if you are behind a tunnel with private IPs only. The only workaround I could find is to point the ssh executable at a custom script which calls the gcloud wrapper.
(Converting my comment into an answer as requested by the OP)
Ansible has a native gce dynamic inventory plugin that you should use to connect to your instances.
To make lotjuh's answer work I had to also update my inventory.gcp.yml file to have the following
plugin: gcp_compute
projects:
  - myproject
auth_kind: application
hostnames:
  - name
Without the hostnames: - name I was getting gcloud ssh errors since it tried to ssh into the instances using their host IP.
This approach also requires that the project be set in the gcloud config with gcloud config set project myproject
Not a direct answer to the OP, but after banging my head on how to keep my project safe (via IAP) and still let Ansible work at a reasonable speed, I've ended up with a mix of IAP and OS Login. This continues to use the dynamic inventory if needed.
I use IAP and no public IPs on my VMs. I've enabled OS Login project wide and created a small "ansible-server" VM inside the project (this is a WIP, as in the end a VPC-peered project should run Ansible via CI/CD, but that is another story).
Inside the VM I've set up the identity of a dedicated service account via
gcloud auth activate-service-account name@project.iam.gserviceaccount.com --key-file=/path/to/sa/json/key
Then I've created a pair of SSH keys.
I've enabled the S.A. to log in by exporting the public key via
gcloud compute os-login ssh-keys add --key-file ~/.ssh/my-sa-public-key
I run all my playbooks from within the VM, passing the -u switch to ansible. This is blazing fast and lets me revoke any permission via IAM, avoiding floating SSH keys abandoned in project or VM metadata.
So the flow now is:
I use IAP to login from my workstation into the ansible VM inside the project
I clone the ansible repo inside the VM
I run ansible impersonating the S.A.
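Putting the commands above together, the bootstrap inside the ansible VM might look like the following sketch. The service account name, key paths, and the sa_... username are placeholders; the actual username is the one reported by the os-login command:
# Impersonate the dedicated service account inside the ansible VM
gcloud auth activate-service-account ansible-sa@myproject.iam.gserviceaccount.com --key-file=/path/to/sa/json/key
# Create an SSH key pair and register the public key with OS Login
ssh-keygen -t rsa -f ~/.ssh/my-sa-key -C ansible-sa
gcloud compute os-login ssh-keys add --key-file ~/.ssh/my-sa-key.pub
# Run playbooks as the OS Login user returned above (something like sa_1234567890)
ansible-playbook -u sa_1234567890 --private-key ~/.ssh/my-sa-key site.yml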
Caveats:
To get the correct username to pass to ansible (via -u), record the username returned by the os-login command above (it appears in the output of the added key; in my case it was something like sa_[0-9]*).
Be sure the S.A. has both the Service Account User and OS Admin Login IAM roles, or the SSH will fail.
Of course this means you have to keep a VM inside the project dedicated to Ansible, and that you need to clone the Ansible code into the VM. In my case I mitigate the "issue" by switching the VM on/off on demand, and I use the same public key to grant read-only access to the Ansible repo (in my case on Bitbucket).

struggle with credentials file

I'm about to deploy a Docker container on AWS with a credentials file formatted like this:
[default]
aws_access_key_id = KEY
aws_secret_access_key = KEY
region=eu-west-2
vpc-id=vpc-bb1b7fd3
and located in ~/.aws/credentials
When I execute the command docker-machine create --driver amazonec2 app
I get:
Couldn't determine your account Default VPC ID : "AuthFailure: AWS was not able to validate the provided access credentials\n\tstatus code: 401, request id: faf606d9-b12e-4a9e-a6c5-18eb609ffc45"
Error setting machine configuration from flags provided: amazonec2 driver requires either the --amazonec2-subnet-id or --amazonec2-vpc-id option or an AWS Account with a default vpc-id
The default VPC ID is already defined. Can anyone help resolve this or point me in the right direction?
The command I'm using:
docker-machine create --driver amazonec2 --amazonec2-access-key AKIAyyy --amazonec2-secret-key AKIAxxx --amazonec2-region eu-west-2 --amazonec2-vpc-id vpc-bb1b7fd3 flask_app
and when I try to use the credentials file located in my file system:
docker-machine create --driver amazonec2 flask_app
where vpc-bb1b7fd3 was generated by AWS by default, hence must be valid, and the time is correct too. I also tried swapping the keys in case I had somehow mixed them up, but they're OK too. The output from sudo ntpdate ntp.ubuntu.com was identical to the machine's system time.
Error says: Error with pre-create check: "AuthFailure: AWS was not able to validate the provided access credentials\n\tstatus code: 401, request id: 9d642d91-cd93-4104-b9fb-2a42b1249e3b"
Tried:
On Stack Exchange a very similar problem was solved by restarting the Docker daemon, because Docker's clock stops syncing with the computer's time when the computer sleeps and is woken again. I restarted the Docker daemon with no change; still the same error.
Problem solved by downloading rootkey.csv from AWS and moving it into ~/.aws.
The Docker instance is now uploaded onto AWS.
The issue is not with the keys, so there are two possible reasons:
Your system time is wrong
Invalid VPC ID
You should check your computer's clock; maybe it's wrong, even though it is set to update "automatically from the internet." Running the following may fix the computer's clock:
sudo ntpdate ntp.ubuntu.com
Or run the equivalent for your OS.
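On systemd-based Linux distributions, for example, you could instead enable NTP synchronization and check the clock status (an additional suggestion, not from the original answer):
# Enable NTP sync and verify the system clock
sudo timedatectl set-ntp true
timedatectl status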
AWS was not able to validate the provided access credentials
The second reason suggests you are missing some flags in your command. If fixing the time does not work, then please update the question with the command.
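You can also sanity-check the credentials and the default VPC directly with the AWS CLI, assuming it is installed and configured with the same keys:
# Confirm the credentials are accepted by AWS
aws sts get-caller-identity
# Confirm a default VPC exists in the target region
aws ec2 describe-vpcs --filters Name=isDefault,Values=true --region eu-west-2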
VPC ID
We determine your default VPC ID at the start of a command. In some cases, either because your account does not have a default vpc, or you don't want to use the default one, you can specify a vpc with the --amazonec2-vpc-id flag.
Login to the AWS console. Go to Services -> VPC -> Your VPCs. Locate the VPC ID you want from the VPC column. Go to Services -> VPC -> Subnets. Examine the Availability Zone column to verify that zone a exists and matches your VPC ID. For example, us-east1-a is in the a availability zone. If the a zone is not present, you can create a new subnet in that zone or specify a different zone when you create the machine.
To create a machine with a non-default VPC-ID:
docker-machine create --driver amazonec2 --amazonec2-access-key AKI******* --amazonec2-secret-key 8T93C********* --amazonec2-vpc-id vpc-****** aws02
This example assumes the VPC ID was found in the a availability zone. Use the --amazonec2-zone flag to specify a zone other than the a zone. For example, --amazonec2-zone c signifies us-east1-c.
docker-machine-with-aws-driver-amazon-web-services-from-docker-documentation

GCP VMs + ssh/config file

Hi guys.
GCP offers multiple ways of SSH-ing in: gcloud, Cloud Shell, and the local machine Cloud SDK.
While all these options are great and I have been using them, I normally prefer using .ssh/config to shorten the process of logging in to machines.
For example, for EC2, you just add:
Host $name
    HostName $hostname
    User $username
    IdentityFile $pathtoFile
Is there any way to replicate this for GCP VMs?
Thanks
According to This Doc
If you have already connected to an instance through the gcloud tool, your keys are already generated and applied to your project or instance. The key files are available in the following locations:
Linux and macOS
Public key: $HOME/.ssh/google_compute_engine.pub
Private key: $HOME/.ssh/google_compute_engine
Windows
Public key: C:\Users\[USERNAME]\.ssh\google_compute_engine.pub
Private key: C:\Users\[USERNAME]\.ssh\google_compute_engine
You can use the key with the typical -i option or in your .ssh/config file.
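For example, an entry using the gcloud-generated key could be appended to your SSH config like this; the host alias, external IP, and username below are placeholders to replace with your own:
# Append a host entry to ~/.ssh/config (placeholder values)
cat >> ~/.ssh/config <<'EOF'
Host my-gce-vm
    HostName 203.0.113.10
    User your_gcloud_username
    IdentityFile ~/.ssh/google_compute_engine
EOF
After that, ssh my-gce-vm should work just like the EC2 example in the question.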
Or simply do
ssh-add ~/.ssh/google_compute_engine
to add the identity to your ssh agent.
PS> I've seen people create an alias for the ssh command, something like
alias gce='gcloud compute ssh'
If you want to SSH to different instances of a google cloud project (from a mac or Linux), do the following:
Step 1. Install SSH keys without password
Use the following command to generate the keys on your mac
ssh-keygen -t rsa -f ~/.ssh/<private-key-name> -C <your gcloud username>
For example, private-key-name can be bpa-ssh-key. It will create two files with the following names in the ~/.ssh directory:
bpa-ssh-key
bpa-ssh-key.pub
Step 2. Update the public key on your GCP project
Go to the Google Cloud Console, choose your project, then
VM Instances -> Metadata -> SSH Keys -> Edit -> Add Item
Cut and paste the contents of bpa-ssh-key.pub (from your Mac) here and then save.
Reset the VM Instance if it is running
Step 3. Edit the config file under ~/.ssh on your Mac
Edit ~/.ssh/config to add the following lines if they are not already present:
Host *
    PubKeyAuthentication yes
    IdentityFile ~/.ssh/bpa-ssh-key
Step 4. SSHing to GCP Instance
ssh username@gcloud-externalip
It should create an SSH shell on the gcloud instance without asking for a password (since you created the RSA/SSH keys without a passphrase).
Since metadata is common across all instances in the same project, you can seamlessly SSH into any of the instances by using the respective external IP of the gcloud instance.

Changing ssh keypair when creating an ec2 instance with chef

I am bringing up an EC2 instance based on an in-house AMI that used a different SSH key for authentication than the one I'd like to use on the instance I create using knife (in the example I call it original-pem-for-ami.pem):
knife ec2 server create -I ami-0123456 -f m2.xlarge \
--ssh-user username --groups sg-1234 \
--identity-file ~/.ssh/original-pem-for-ami.pem \
--node-name solr1 --hint ec2 -a public_ip_address \
--ssh-key name-of-key-i-want-to-use-to-login-to-new-instance
When I run this command the server comes up correctly, the correct security group is assigned, etc., but I can only connect to it using:
ssh -i ~/.ssh/original-pem-for-ami.pem username#assigned-ec2-public-dns-name
Is there a way to make the new instance use the key associated with the named keypair name-of-key-i-want-to-use-to-login-to-new-instance? I thought using --ssh-key name-of-key-i-want-to-use-to-login-to-new-instance would do this.
Check which version of knife-ec2 you have. --ssh-key is correct in 0.12, but before that (0.11 and earlier) I think it was something different. Also make sure this works through the normal AWS tools; it is possible the AMI wasn't prepared correctly and uses a hardwired key.
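To check which knife-ec2 version is installed, you can list the gem (a suggested check; use chef gem instead of gem if knife comes from ChefDK):
# Show the installed knife-ec2 plugin version
gem list knife-ec2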