Ansible GCP IAP tunnel - google-cloud-platform

I’m trying to connect to a GCP compute instance through IAP. I have a service account with permissions.
I have tried the following:
1. A basic Ansible ping, ansible -vvvv GCP -m ping, which fails because the host name is not found, since the instance has no external IP.
2. Setting ssh_executable=wrapper.sh like here.
Number 2 is almost working, but parsing the commands with regexes is hacky.
Is there a native Ansible solution?
Edit: The gcp_compute dynamic inventory does work for pinging instances but it does not work for managing the instances.
Ansible does NOT support package or system management while tunneling through IAP.

For those who are still looking for a solution to use IAP SSH with Ansible on an internal IP: I've made some changes to the scripts listed here.
My main problem was that I had to add --zone as an option, as gcloud wouldn't automatically detect the zone when run through Ansible.
As I didn't want to call the CLI to look it up, which would add more wait time, I've opted for using group_vars to set my SSH options. This also allows me to specify other options to the gcloud compute ssh command.
Here are the contents of the files needed for setup:
ansible.cfg
[inventory]
enable_plugins = gcp_compute
[defaults]
inventory = misc/inventory.gcp.yml
interpreter_python = /usr/bin/python
[ssh_connection]
# Enabling pipelining reduces the number of SSH operations required
# to execute a module on the remote server.
# This can result in a significant performance improvement
# when enabled.
pipelining = True
scp_if_ssh = False
ssh_executable = misc/gcp-ssh-wrapper.sh
ssh_args = None
misc/gcp-ssh-wrapper.sh
#!/bin/bash
# This is a wrapper script allowing to use GCP's IAP SSH option to connect
# to our servers.
# Ansible passes a large number of SSH parameters along with the hostname as the
# second to last argument and the command as the last. We will pop the last two
# arguments off of the list and then filter the remaining SSH flags:
host="${@: -2: 1}"
cmd="${@: -1: 1}"
# Unfortunately ansible has hardcoded ssh options, so we need to filter these out
# It's an ugly hack, but for now we'll only accept the options starting with '--'
declare -a opts
for ssh_arg in "${@: 1: $# -3}" ; do
    if [[ "${ssh_arg}" == --* ]] ; then
        # Keep each accepted option as its own array element
        opts+=("${ssh_arg}")
    fi
done
exec gcloud compute ssh "${opts[@]}" "${host}" -- -C "${cmd}"
group_vars/all.yml
---
ansible_ssh_args: --tunnel-through-iap --zone={{ zone }} --no-user-output-enabled --quiet
As you can see, by using the ansible_ssh_args from the group_vars, we can now pass the zone as it's already known through the inventory.
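To see what the filtering step in the wrapper does, here is a minimal sketch that runs the same idea against a made-up argument list. The function name filter_gcloud_opts, the instance name and the sample arguments are all assumptions for illustration; they are not captured from a real Ansible run, and the sketch simply treats everything before the host as candidate options.

```shell
#!/bin/bash
# Hypothetical helper mirroring the wrapper's filter: print only the
# arguments that start with '--' (the gcloud flags from ansible_ssh_args).
filter_gcloud_opts() {
    local arg
    for arg in "$@"; do
        if [[ "${arg}" == --* ]]; then
            printf '%s\n' "${arg}"
        fi
    done
}

# Illustrative argument list in roughly the shape Ansible hands to
# ssh_executable: ssh options, then the host, then the remote command.
sample_args=(
    -o ControlMaster=auto
    --tunnel-through-iap
    --zone=europe-west1-b
    -o ConnectTimeout=10
    my-instance
    "/bin/sh -c 'echo ok'"
)

# The last two elements are the host and the command, as in the wrapper.
host="${sample_args[@]: -2:1}"
cmd="${sample_args[@]: -1:1}"
# Everything before them is filtered down to the '--' options.
kept="$(filter_gcloud_opts "${sample_args[@]:0:${#sample_args[@]}-2}")"

printf 'host: %s\n' "${host}"
printf 'cmd:  %s\n' "${cmd}"
printf 'gcloud opts:\n%s\n' "${kept}"
```

Only the two gcloud flags survive the filter; the -o options that Ansible adds on its own are dropped, which is why the extra flags have to come in through ansible_ssh_args.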
If you also want to be able to copy files through gcloud commands, you can use the following configuration:
ansible.cfg
[ssh_connection]
# Enabling pipelining reduces the number of SSH operations required to
# execute a module on the remote server. This can result in a significant
# performance improvement when enabled.
pipelining = True
ssh_executable = misc/gcp-ssh-wrapper.sh
ssh_args = None
# Tell ansible to use SCP for file transfers when connection is set to SSH
scp_if_ssh = True
scp_executable = misc/gcp-scp-wrapper.sh
misc/gcp-scp-wrapper.sh
#!/bin/bash
# This is a wrapper script allowing to use GCP's IAP option to connect
# to our servers.
# Ansible passes a large number of SSH parameters along with the hostname as the
# second to last argument and the command as the last. We will pop the last two
# arguments off of the list and then filter the remaining flags:
host="${@: -2: 1}"
cmd="${@: -1: 1}"
# Unfortunately ansible has hardcoded scp options, so we need to filter these out
# It's an ugly hack, but for now we'll only accept the options starting with '--'
declare -a opts
for scp_arg in "${@: 1: $# -3}" ; do
    if [[ "${scp_arg}" == --* ]] ; then
        # Keep each accepted option as its own array element
        opts+=("${scp_arg}")
    fi
done
# Remove [] around our host, as gcloud scp doesn't understand this syntax
# (the brackets are quoted so the shell cannot expand them as a glob)
cmd=$(echo "${cmd}" | tr -d '[]')
exec gcloud compute scp "${opts[@]}" "${host}" "${cmd}"
group_vars/all.yml
---
ansible_ssh_args: --tunnel-through-iap --zone={{ zone }} --no-user-output-enabled --quiet
ansible_scp_extra_args: --tunnel-through-iap --zone={{ zone }} --quiet
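The bracket removal in the scp wrapper is easy to check on its own. A small sketch, assuming a made-up target string (the instance name and path are placeholders) and a hypothetical helper named strip_brackets:

```shell
#!/bin/bash
# Hypothetical helper: delete every '[' and ']' from the given string,
# the same way the scp wrapper cleans up its last argument.
strip_brackets() {
    printf '%s' "$1" | tr -d '[]'
}

# Illustrative value in the bracketed form that gcloud does not understand.
remote_path="[my-instance]:/tmp/example.txt"
cleaned="$(strip_brackets "${remote_path}")"
printf '%s\n' "${cleaned}"
```

Note that this deletes every bracket in the string, so a remote path that legitimately contains [ or ] would be altered as well.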

The gce dynamic inventory does not work unless all the inventory hosts are publicly accessible. For private IPs, the tunnel is not invoked when Ansible commands are executed. The gce dynamic inventory will return the inventory, but you can't actually send commands if the hosts are behind a tunnel and have private IPs only. The only workaround I could find is to have the ssh binary point at a custom script which calls the gcloud wrapper.

(Converting my comment as an answer as requested by OP)
Ansible has a native gce dynamic inventory plugin that you should use to connect to your instances.

To make lotjuh's answer work I also had to update my inventory.gcp.yml file to have the following:
plugin: gcp_compute
projects:
  - myproject
auth_kind: application
hostnames:
  - name
Without the hostnames: - name entry I was getting gcloud ssh errors, since it tried to ssh into the instances using their host IP.
This approach also requires that the project be set in the gcloud config with gcloud config set project myproject

Not a direct answer to the OP, but after racking my brain over how to keep my project safe (via IAP) and let Ansible work at a reasonable speed, I've ended up with a mix of IAP and OS Login. This continues to use the dynamic inventory if needed.
I use IAP and no public IPs on my VMs. I've enabled OS Login project-wide and created a small "ansible-server" VM internal to the project (this is a WIP, as in the end a VPC-paired project should run Ansible via CI/CD, but that is another story).
Inside the VM I've set up the identity of a dedicated service account via
gcloud auth activate-service-account name@project.iam.gserviceaccount.com --key-file=/path/to/sa/json/key
Then I've created a pair of SSH keys.
I've enabled the S.A. to log in by exporting the public key via
gcloud compute os-login ssh-keys add --key-file ~/.ssh/my-sa-public-key
I run all my playbooks from within the VM, passing the -u switch to ansible. This is blazing fast and lets me revoke any permission via IAM, avoiding floating SSH keys abandoned in project or VM metadata.
So the flow now is:
1. I use IAP to log in from my workstation to the ansible VM inside the project.
2. I clone the ansible repo inside the VM.
3. I run ansible impersonating the S.A.
Caveats:
- To get the correct username to pass to ansible (via -u), record the username provided by the previous os-login command (it appears in the output of the added key; in my case it was something like sa_[0-9]*).
- Be sure the S.A. has both the Service Account User and Compute OS Admin Login IAM roles, or the SSH connection will fail.
- Of course, this means you have to keep a VM inside the project dedicated to ansible, and you need to clone the ansible code into the VM. In my case, I mitigate the "issue" by switching the VM on and off on demand, and I use the same public key to grant read-only access to the ansible repo (in my case on Bitbucket).

Related

Adding Domain Nameserver into Google Container Optimized OS

I would like to prepend our own domain name server in COS. How should I do it?
Is it enough to create the following in /etc/dhcp/dhclient.conf:
prepend domain-name-servers <domain ip>;
I have added the above configuration but I am still not able to use my domain in the COS VM instance. Is there something I've missed?
How do I restart the network adapter in COS without resetting/rebooting?

COS uses "cloud-init". If you want to add a DNS server or similar configuration to COS, you'd use cloud-init to configure your instance when it boots up. The cloud-init tool expects its configuration in the value of the user-data key of the instance metadata.
To pass the cloud-init configuration to the instance, create the instance with the flag --metadata-from-file user-data=[filename], or add the user-data=[filename] key-value pair to the instance from the console, where the file would be stored on an external location like Cloud Storage, to which you'd provide the URL. It's also possible to just copy the config into the value section when setting the metadata. Example configurations to specify name servers and domains can be found in the following link.
By replacing the YAML config value in the metadata (but keeping the "user-data" key) with the following config, you can configure resolved.conf to use custom name servers and get the instance to use those name servers for address resolution.
As an example you can create a file called cloud-config-resolv containing the following:
#cloud-config
write_files:
  - path: /etc/systemd/resolved.conf
    permissions: "0644"
    owner: root:root
    content: |
      # This is my custom resolved.conf!
      # Replace 8.8.8.8 with the IP of your own name server
      [Resolve]
      DNS=8.8.8.8
runcmd:
  - ['systemctl', 'restart', 'systemd-resolved']
You can then run the following command to add [Your-IP] to the resolv.conf.
gcloud compute instances create instance-name \
--image-family cos-stable \
--image-project cos-cloud \
--metadata-from-file user-data=cloud-config-resolv \
--zone us-central1-a
I have not confirmed that it will persist after 24 hours, since the DHCP lease is renewed and any changes are cleared. But the file does persist through network daemon restarts and VM restarts.
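If you need the same config for different name servers, the file can be generated instead of edited by hand. A minimal sketch, assuming a DNS_IP variable (a name made up for this example, with 10.0.0.2 as a placeholder default) and a temporary file in place of cloud-config-resolv:

```shell
#!/bin/bash
# DNS_IP is a placeholder; set it to the IP of your own name server.
DNS_IP="${DNS_IP:-10.0.0.2}"

# Write the cloud-config to a temporary file so nothing is overwritten.
config_file="$(mktemp)"
cat > "${config_file}" <<EOF
#cloud-config
write_files:
  - path: /etc/systemd/resolved.conf
    permissions: "0644"
    owner: root:root
    content: |
      [Resolve]
      DNS=${DNS_IP}
runcmd:
  - ['systemctl', 'restart', 'systemd-resolved']
EOF

# Show the result; this is the file to pass as user-data.
cat "${config_file}"
```

The generated path would then take the place of cloud-config-resolv in --metadata-from-file user-data=... in the create command above.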

Unable to SSH/gcloud into default Google Deep Learning VM

I created a new Google Deep Learning VM, keeping all the defaults except for requesting no GPU:
The VM instance was successfully launched:
But I cannot SSH into it:
Same issue when attempting to use with gcloud (using the command provided when clicking on the instance's arrow down button at the right of SSH):
ssh: connect to host 34.105.108.43 port 22: Connection timed out
ERROR: (gcloud.beta.compute.ssh) [/usr/bin/ssh] exited with return code [255].
Why?
VM instance details:

It turns out that the browser-based SSH client and the browser-based gcloud client were disabled by my organization, which is why I couldn't access the VM. The reason I was given is that to allow browser-based SSH, one would have to expose the VMs to the entire web, because Google does not provide a list of the IPs they use for browser-based SSH.
So instead one can SSH into a GCP VM via one's local SSH client by first uploading one's SSH key using the GCP web console. See https://cloud.google.com/compute/docs/instances/connecting-advanced#linux-macos for the documentation on how to use one's local SSH client with GCP.
Since the documentation can be a bit tedious to parse, here are the commands I ran on my local Ubuntu 18.04 LTS x64 machine to upload my SSH key and connect to the VM:
If you haven't installed gcloud yet:
# https://cloud.google.com/sdk/docs/install#linux (<- go there to get the latest gcloud URL to download via curl):
sudo apt-get install -y curl
curl -O https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-sdk-310.0.0-linux-x86_64.tar.gz
tar -xvf google-cloud-sdk-310.0.0-linux-x86_64.tar.gz
./google-cloud-sdk/install.sh
./google-cloud-sdk/bin/gcloud init
Once gcloud is installed:
# Connect to gcloud
gcloud auth login
# Retrieve one's GCP "username"
gcloud compute os-login describe-profile
# The output will be "name: '[some large number, which is the username]'"
# Create a new SSH key
ssh-keygen -t rsa -f ~/.ssh/gcp001 -C USERNAME
chmod 400 ~/.ssh/gcp001
# if you want to view the public key: nano ~/.ssh/gcp001.pub
gcloud compute os-login ssh-keys add --key-file ~/.ssh/gcp001.pub
gcloud compute ssh --project PROJECT_ID --zone ZONE VM_NAME
# Note that PROJECT_ID can be viewed when running `gcloud auth login`,
# which will output "Your current project has been set to: [PROJECT_ID]".

In order to connect to the VM Instance you will have to follow the guide from GCP and then set up the role with the necessary authorization under IAM & Admin.

Please do:
sudo gcloud compute config-ssh
gcloud auth login
Log in to your Gmail account and accept the Google Cloud access request.
Then set the project, if not yet done:
gcloud config set project YOUR-PROJECT-ID
Run gcloud compute ssh with the options you need.
If you still have a problem, remove this file:
rm .ssh/google_compute_engine
Run gcloud compute ssh again and the issue should be solved!

Ansible Dynamic inventory with static group with dynamic children

I am sure many who work with Terraform and Ansible or just Ansible on a daily basis must have come across this question.
Some background:
I create my infrastructure on AWS using Terraform and configure my machines using Ansible. My inventory file contains hardcoded public IP addresses with some variables. As the business demands, I create and destroy my machines very often.
My question:
I don't want to update my inventory file with new public IP addresses every time I destroy and create my instances. So my fundamental requirement is this: every time I destroy my machines, I should be able to run my Terraform script to recreate them, and when I run my Ansible playbook, Ansible should be able to pick up the right target machines and run the playbook. I need to know what to describe in my inventory file to achieve this automation. A domain name (www.fooexample.com) or static public IP addresses in the inventory file are not an option in my case. I have seen scripts that do it with what looks like a hostname (webserver1).
There are forums that talk about using the ec2.py option, but ec2.py gets all the public IP addresses associated with the account, and I only want to target some of the machines with my playbook, not all of them.
Any help regarding this would be appreciated.
Thanks in Advance

I do something similar in GCP but the concept should apply to AWS.
Starting with Ansible 2.7 there is a new inventory plugin architecture and some inventory plugins to replace the dynamic inventory scripts (such as ec2.py and gcp.py). The AWS plugin documentation is at https://docs.ansible.com/ansible/2.9/plugins/inventory/aws_ec2.html.
First, you need to tag the groups of hosts you want to target in AWS. You should be able to handle this with Terraform (such as Service = Web).
Next, enable the aws_ec2 plugin in ansible.cfg by adding:
[inventory]
enable_plugins = aws_ec2
Now, convert over to using the new plugin instead of ec2.py. This means creating an aws_ec2.yaml file based on the documentation. An example might look like:
plugin: aws_ec2
regions:
  - us-east-1
keyed_groups:
  - prefix: tag
    key: tags
# Set individual variables with compose
compose:
  ansible_host: public_ip_address
The key parts here are the keyed_groups and compose section. This will give you the public IP addresses as the host to connect to in inventory and groups you can limit to with -l or --limit.
Considering you had some instances in us-east-1 tagged with Service = Web you could target them like:
ansible all -i aws_ec2.yaml -m ping -l tag_Service_Web
This would target just those tagged hosts on their public IP address. Any dynamic scaling you do (such as increasing the count in Terraform for that resource) will be picked up by the inventory plugin on next run.
You can also use the tag in playbooks. If you had a playbook that you always targeted at these hosts you can set hosts: tag_Service_Web in the playbook.
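The group name used above follows from the keyed_groups entry: the prefix, the tag key and the tag value joined with underscores. A tiny sketch of that naming, assuming a hypothetical helper called keyed_group_name; it only does the concatenation and does not model how Ansible treats characters such as dashes in tag values, which depends on your Ansible settings.

```shell
#!/bin/bash
# Hypothetical helper: build "<prefix>_<tag key>_<tag value>", the shape of
# the group names produced by keyed_groups with prefix "tag" and key "tags".
keyed_group_name() {
    local prefix="$1" tag_key="$2" tag_value="$3"
    printf '%s_%s_%s\n' "${prefix}" "${tag_key}" "${tag_value}"
}

# The tag from the example above: Service = Web
group="$(keyed_group_name tag Service Web)"
printf '%s\n' "${group}"
```

The printed name is what you would pass to -l or put in hosts: in a playbook.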
Bonus:
I've been experimenting with an Ansible Pull model that automates some of this bootstrapping. The idea is to combine cloud-init with a special script to bootstrap the playbook for that host automatically.
Example script that cloud-init kicks off:
#!/bin/bash
set -euo pipefail
lock_files=(
    /var/lib/dpkg/lock
    /var/lib/apt/lists/lock
    /var/lib/dpkg/lock-frontend
    /var/cache/apt/archives/lock
    /var/lib/apt/daily_lock
)
export ANSIBLE_HOST_PATTERN_MISMATCH="ignore"
export PATH="/tmp/ansible-venv/bin:$PATH"
# Wait until no other process holds any of the apt/dpkg locks
for file in "${lock_files[@]}"; do
    while fuser "$file" >/dev/null 2>&1; do
        echo "Waiting for lock $file to be available..."
        sleep 5
    done
done
apt-get update -qy
apt-get install --no-install-recommends -qy virtualenv python-virtualenv python-nacl python-wheel python-bcrypt
# Install Ansible into a virtualenv (its bin directory is already on PATH)
virtualenv -p /usr/bin/python --system-site-packages /tmp/ansible-venv
pip install ansible==2.7.10 apache-libcloud==2.3.0 jmespath==0.9.3
ansible-pull myplaybook.yaml \
    -U git@github.com:myorg/infrastructure.git \
    -i gcp_compute.yaml \
    --private-key /tmp/ansible-keys/infrastructure_ssh_deploy_key \
    --vault-password-file /tmp/ansible-keys/vault \
    -d /tmp/ansible-infrastructure \
    --accept-host-key
This script is a bit simplified from my actual one (leaving out some domain specific authentication and key providing stuff). But you can adapt it to AWS by doing something like bootstrapping keys from S3 or KMS or another boot time configuration service. I find that ansible-pull works well when the playbook only takes a minute or two to run and doesn't have any dependencies on external inventory (like references to other groups such as to gather IP addresses).
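One thing to be aware of: the lock loop in the script waits forever if a lock is never released. Below is a minimal sketch of a bounded variant. The helper name wait_while and the attempt and delay values are assumptions for illustration, not part of my actual script.

```shell
#!/bin/bash
# Hypothetical helper: wait_while ATTEMPTS DELAY CMD [ARGS...]
# Re-runs CMD until it fails (meaning the resource is free) or until
# ATTEMPTS tries are used up. Returns 0 if the resource became free,
# 1 if it was still busy after the last attempt.
wait_while() {
    local attempts="$1" delay="$2"
    shift 2
    local i
    for ((i = 0; i < attempts; i++)); do
        if ! "$@" >/dev/null 2>&1; then
            return 0
        fi
        sleep "${delay}"
    done
    return 1
}

# Demo with stand-in commands: 'false' behaves like a free lock,
# 'true' behaves like a lock that is never released.
if wait_while 3 0 false; then
    echo "lock is free"
fi
if ! wait_while 2 0 true; then
    echo "gave up waiting"
fi
```

In the bootstrap script the call would look something like wait_while 60 5 fuser "$file", followed by an explicit failure if it returns non-zero.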

Connect to particular GCP account

I have been using the GCP console to connect to a cloud instance and want to switch to using SSH through PowerShell, as that seems to keep sessions alive longer. Transferring my public key through Cloud Shell into the authorized_keys file seems to be temporary, since once Cloud Shell disconnects, the file doesn't persist. I've tried using os-login, but that generates a completely different user from the one I've been using through Cloud Shell (Cloud Shell creates a user: myname, while gcloud creates a user: myname_domain_com). Is there a way to continue using the same profile created by Cloud Shell when logging in through gcloud? I am using the same email and account, myname@domain.com, in both the console and gcloud. The alternative is to start all over from gcloud, and that would be a pain.

If you want to SSH to different instances of a Google Cloud project (from a Mac or Linux), do the following:
Step 1. Install SSH keys without password
Use the following command to generate the keys on your Mac
ssh-keygen -t rsa -f ~/.ssh/<private-key-name> -C <your gcloud username>
For example, private-key-name can be bpa-ssh-key. It will create two files with the following names in the ~/.ssh directory
bpa-ssh-key
bpa-ssh-key.pub
Step 2. Update the public key on your GCP project
Go to the Google Cloud Console, choose your project, then
VMInstances->Metadata->SSH Keys->Edit->Add Item
Cut and paste the contents of bpa-ssh-key.pub (from your Mac) here and then save
Reset the VM instance if it is running
Step 3. Edit config file under ~/.ssh on your Mac
Edit ~/.ssh/config to add the following lines if not present already
Host *
    PubKeyAuthentication yes
    IdentityFile ~/.ssh/bpa-ssh-key
Step 4. SSHing to GCP Instance
ssh username@gcloud-externalip
It should open an SSH shell on the gcloud instance without asking for a password (since you have created the RSA/SSH keys without a password).
Since metadata is common across all instances under the same project, you can seamlessly SSH into any of the instances by choosing the respective external IP of the gcloud instance.

GCP VMs + ssh/config file

guys.
GCP offers multiple ways of SSH-ing: gcloud, Cloud Shell, and the Cloud SDK on a local machine.
While all these options are great and I have been using them, I normally prefer using .ssh/config to shorten the process of logging in to machines.
For example, for EC2, you just add:
Host $name
    HostName $hostname
    User $username
    IdentityFile $pathtoFile
Is there any way to replicate this for GCP VMs?
Thanks

According to this doc:
If you have already connected to an instance through the gcloud tool, your keys are already generated and applied to your project or instance. The key files are available in the following locations:
Linux and macOS
Public key: $HOME/.ssh/google_compute_engine.pub
Private key: $HOME/.ssh/google_compute_engine
Windows
Public key: C:\Users\[USERNAME]\.ssh\google_compute_engine.pub
Private key: C:\Users\[USERNAME]\.ssh\google_compute_engine
You can use the key with the usual -i option or in your .ssh/config file.
Or simply do
ssh-add ~/.ssh/google_compute_engine
to add the identity to your ssh agent.
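To get the same kind of .ssh/config entry as in the EC2 example, the block can be printed with a few lines of shell. This is only a sketch: print_host_entry is a hypothetical helper, and the alias, IP address (a documentation-only address) and user name are placeholders you would replace with your VM's external IP and your login name.

```shell
#!/bin/bash
# Hypothetical helper: print an ssh_config Host block for one VM.
# Arguments: alias, external IP or hostname, login user, private key path.
print_host_entry() {
    local alias_name="$1" host_name="$2" user="$3" key_file="$4"
    cat <<EOF
Host ${alias_name}
    HostName ${host_name}
    User ${user}
    IdentityFile ${key_file}
EOF
}

# Placeholder values; the key path is the one gcloud generates on Linux/macOS.
entry="$(print_host_entry my-gcp-vm 203.0.113.10 myname '~/.ssh/google_compute_engine')"
printf '%s\n' "${entry}"
```

Appending the printed block to ~/.ssh/config should let you connect with ssh my-gcp-vm, as long as the VM's external IP stays the same.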
PS> I've seen people create an alias for the ssh command, something like
alias gce='gcloud compute ssh'

If you want to SSH to different instances of a google cloud project (from a mac or Linux), do the following:
Step 1. Install SSH keys without password
Use the following command to generate the keys on your mac
ssh-keygen -t rsa -f ~/.ssh/<private-key-name> -C <your gcloud username>
For example, private-key-name can be bpa-ssh-key. It will create two files with the following names in the ~/.ssh directory
bpa-ssh-key
bpa-ssh-key.pub
Step 2. Update the public key on your GCP project
Goto Google Cloud Console, choose your project, then
VMInstances->Metadata->SSH Keys->Edit->Add Item
Cut and paste the contents of the bpa-ssh-key.pub (from your mac) here and then save
Reset the VM Instance if it is running
Step 3. Edit config file under ~/.ssh on your mac
Edit the ~/.ssh/config to add the following lines if not present already
Host *
    PubKeyAuthentication yes
    IdentityFile ~/.ssh/bpa-ssh-key
Step 4. SSHing to GCP Instance
ssh username@gcloud-externalip
It should open an SSH shell on the gcloud instance without asking for a password (since you have created the RSA/SSH keys without a password).
Since metadata is common across all instances under the same project, you can seamlessly SSH into any of the instances by choosing the respective external IP of the gcloud instance.
