How do I use a cloud-init Ubuntu image with Rancher on vSphere, with a network without DHCP?

Long story short: using a network without DHCP to deploy a new cluster from Rancher to vSphere causes a timeout on "waiting for ssh".
I am using a network protocol profile and vApp settings to set the static IP on the nodes.
I followed this guide:
https://www.virtualthoughts.co.uk/2020/03/29/rancher-vsphere-network-protocol-profiles-and-static-ip-addresses-for-k8s-nodes/
But when I disable cloud-init's initial network configuration, the nodes are never assigned the static IP from the vApp settings. Without disabling the initial configuration, the first boot takes around 2 minutes (because it waits for DHCP and fails), but it DOES apply the static IP from the vApp afterwards - unfortunately, the 2 minutes spent waiting for DHCP is enough for Rancher to time out waiting for SSH.
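For context, the usual way to disable cloud-init's initial network configuration is a drop-in file baked into the template image; a minimal sketch (the file name is just a convention):
# /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg
network: {config: disabled}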

It would appear that my issue was that the network assigned to the nodes was unable to provide IPv6 through DHCP, and this caused netplan not to apply the IPv4 static IPs - without throwing an error.
I found it through:
journalctl -b -u systemd-networkd
After I updated the netplan configuration with:
link-local: [ipv4]
The nodes now get their static IPs correctly.
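For illustration, a minimal netplan sketch with that key in place; the interface name (ens192) and the addresses are placeholders, the real values come from the vApp properties:
# /etc/netplan/50-cloud-init.yaml
network:
  version: 2
  ethernets:
    ens192:
      dhcp4: false
      link-local: [ ipv4 ]
      addresses: [ 192.168.10.20/24 ]
      gateway4: 192.168.10.1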

Related

Can't access Google Cloud Compute Instance External IP (boot fedora35)

I have a Node.js app running inside a VM instance.
I created a firewall rule opening port tcp:5000.
It works locally (inside vm) and I am able to verify the connection via
sudo wget http://localhost:5000
this gives status 200 ok.
When I replace localhost with the VM's external IP address, I get a connection timeout.
I have tried most things I could find on the internet, but now I am tired. It's my first time interacting with GCP, so I guess it's expected.
The network tags on the VM instance do not match your firewall rule inventory-controller-port. The Compute Engine instance has these network tags: http-server, https-server. To resolve this issue, these are your options.
Update the firewall rule inventory-controller-port to target the network tags http-server and https-server. For more information, check this documentation about updating firewall rules.
Alternatively, add the network tags targeted by the firewall rule inventory-controller-port to your existing VM. To know more, check this documentation about adding tags to an existing VM.
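For reference, both options can be done with gcloud; the VM name, zone, and the rule's existing target tag below are placeholders:
# Option 1: retarget the existing rule at the VM's current network tags
gcloud compute firewall-rules update inventory-controller-port --target-tags=http-server,https-server
# Option 2: add the rule's target tag to the VM instead
gcloud compute instances add-tags YOUR_VM_NAME --zone=YOUR_ZONE --tags=TAG_FROM_RULE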

gcloud VM - This site can’t be reached

I have created an n1-standard-1 (1 vCPU, 3.75 GB memory) VM and installed LAMP on it with a static IP address. When I try to hit the static IP address in a browser, it says "This site can't be reached". However, I have checked the firewall rules and port 80 is open.
Below is the output of gcloud compute firewall-rules list command -
And the output of telnet is as -
Is there anything else I need to do to open port 80 and 443?
Please help, thank you!!
This could be the VM's configuration. You'll want to check that the machine is actually listening on that port. You may have installed LAMP, but are the services started, for instance? The best way to check is to SSH into the system and curl localhost. If the curl fails, you know the services are not listening on that port.
After that, check that you can access the system from within the VPC: for example, from another system in the same VPC, run curl <machine>. If that doesn't work, you may find the system is only listening on 127.0.0.1 or has other settings blocking connections from other machines.
If those steps succeed, then your firewall rules are indeed to blame - check that your system is in the correct VPC (the default network you listed above).
Finally, you haven't specified how you assigned the static IP address, but make sure that the address is created and assigned to that instance.
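A rough sketch of those checks from the command line; INTERNAL_IP is a placeholder, and the service name depends on your distribution (a LAMP stack may run apache2 or httpd):
# On the VM itself: is anything listening on ports 80/443, and does it answer locally?
sudo systemctl status apache2    # or httpd, depending on the distro
sudo ss -tlnp | grep -E ':80|:443'
curl -I http://localhost
# From another VM in the same VPC:
curl -I http://INTERNAL_IP
# And confirm the firewall rules on the network the VM is attached to:
gcloud compute firewall-rules list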

Unable to bind docker container to secondary interface

I have an EC2 instance (running CentOS 7) with two network interfaces on it. The primary is ens5 and the secondary was attached as eth0. What I'm attempting to do is bind my docker container to eth0, so that both incoming and outgoing traffic is associated with the IP address of eth0.
I have a couple of external ports exposed. The first thing I tried in my docker run command was just -p eth0_ip:port:port. The container started up successfully, and I was able to hit it from the host on the IP of eth0, but when making requests from other EC2 instances in the same VPC, the requests timed out. Using tcpdump, I was able to confirm that external requests are making it to the instance, but they aren't making it to the container.
I also attempted to create a new Docker network associated with the IP address of eth0 and then set the --network flag in my run command, but I was greeted with the exact same failure.
Any help would be greatly appreciated!
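For clarity, a sketch of the attempt described above; 10.0.1.25 stands in for eth0's address, and the port and image name are placeholders:
# Publish the port only on the secondary interface's address
docker run -d -p 10.0.1.25:8080:8080 my-image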

AWS EC2 - Ubuntu instance, SSH connect to host operation timed out

I am new to setting up virtual machines. I created my first Ubuntu instance using AWS EC2. Everything seemed to check out until I tried connecting to it with ssh, as per instructions.
To provide some context, my app is called "smpapp" and my computer runs macOS High Sierra. Naturally, my smpapp.pem file was saved to ~/Downloads. First, I opened up the Terminal and set my working directory to Downloads with cd ~/Downloads. Then I entered chmod 400 smpapp.pem, which didn't return any error, so I assume it was a success. Then I entered ssh -i "smpapp.pem" ubuntu@ec2-XX-XX-XXX-XXX.us-east-2.compute.amazonaws.com (omitting the public DNS numbers with Xs). It took a while to process before spitting out: ssh: connect to host ec2-XX-XX-XXX-XXX.us-east-2.compute.amazonaws.com port 22: Operation timed out.
Can someone explain the general problem to me and how I can fix it (methodically and in layman's terms)?
Could be a few things:
Does your EC2 instance have a public IP? (If not, you might have to attach an Elastic IP or put it in a public subnet.)
Is the security group attached to the EC2 instance allowing connections to port 22? (See the sketch after this answer.)
Is the ACL on the subnet allowing public connections to the subnet?
Is your VPC configured to route traffic through your IGW?
Amazon offers step-by-step instructions for determining the issue; it could be any of the above not being configured properly. You can find step-by-step instructions on what to do in the official Amazon docs here.
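As a sketch of the security group check mentioned above, using the AWS CLI (the group ID and source address are placeholders):
# Show what the instance's security group currently allows
aws ec2 describe-security-groups --group-ids sg-0123456789abcdef0 --query 'SecurityGroups[].IpPermissions'
# Allow SSH from your own address if the rule is missing
aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 --protocol tcp --port 22 --cidr 203.0.113.7/32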

zookeeper installation on multiple AWS EC2 instances

I am new to ZooKeeper and AWS EC2. I am trying to install ZooKeeper on 3 EC2 instances.
As per the ZooKeeper documentation, I have installed ZooKeeper on all 3 instances, created zoo.cfg and added the below configuration:
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/opt/zookeeper/data
clientPort=2181
server.1=localhost:2888:3888
server.2=<public ip of ec2 instance 2>:2889:3889
server.3=<public ip of ec2 instance 3>:2890:3890
I have also created the myid file on all 3 instances as /opt/zookeeper/data/myid,
as per the guidelines.
I have a couple of queries, as below:
Whenever I start the ZooKeeper server on an instance, it starts in standalone mode (as per the logs).
Is the above configuration really going to let the nodes connect to each other? Ports 2889:3889 and 2890:3890 - what are these ports all about? Do I need to configure them on the EC2 machines, or should I use some other ports instead?
Do I need to create a security group to open these connections? I am not sure how to do that for an EC2 instance.
How can I confirm that all 3 ZooKeeper servers have started and can communicate with each other?
The ZooKeeper configuration is designed such that you can install the exact same configuration file on all servers in the cluster without modification. This makes ops a bit simpler. The component that specifies the configuration for the local node is the myid file.
The configuration you've defined is not one that can be shared across all servers. All of the servers in your server list should be binding to a private IP address that is accessible to other nodes in the network. You're seeing your server start in standalone mode because you're binding to localhost. So, the problem is the other servers in the cluster can't see localhost.
Your configuration should look more like:
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/opt/zookeeper/data
clientPort=2181
server.1=<private ip of ec2 instance 1>:2888:3888
server.2=<private ip of ec2 instance 2>:2888:3888
server.3=<private ip of ec2 instance 3>:2888:3888
The two ports listed in each server definition are respectively the quorum and election ports used by ZooKeeper nodes to communicate with one another internally. There's usually no need to modify these ports, and you should try to keep them the same across servers for consistency.
Additionally, as I said you should be able to share that exact same configuration file across all instances. The only thing that should have to change is the myid file.
You probably will need to create a security group and open up the client port to be available for clients and the quorum/election ports to be accessible by other ZooKeeper servers.
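A sketch of that with the AWS CLI, assuming all three instances share one security group (the group ID is a placeholder; referencing the group itself keeps the quorum/election ports internal to the group):
# Open the client (2181), quorum (2888) and election (3888) ports within the group
for PORT in 2181 2888 3888; do
  aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 \
      --protocol tcp --port "$PORT" --source-group sg-0123456789abcdef0
done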
Finally, you might want to look into a UI to help manage the cluster. Netflix makes a decent UI that will give you a view of your cluster and also help with cleaning up old logs and storing snapshots to S3 (ZooKeeper takes snapshots but does not delete old transaction logs, so your disk will eventually fill up if they're not properly removed). Once it's configured correctly, you should be able to see the ZooKeeper servers connecting to each other in the logs as well.
EDIT
@czerasz notes that starting from version 3.4.0 you can use the autopurge.snapRetainCount and autopurge.purgeInterval directives to keep your snapshots clean.
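In zoo.cfg that would look something like this (the values shown are just an example):
autopurge.snapRetainCount=3
autopurge.purgeInterval=1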
@chomp notes that some users have had to use 0.0.0.0 for the local server IP to get the ZooKeeper configuration to work on EC2. In other words, replace <private ip of ec2 instance 1> with 0.0.0.0 in the configuration file on instance 1. This is counter to the way ZooKeeper configuration files are designed but may be necessary on EC2.
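That is, on instance 1 only, the server list would become:
server.1=0.0.0.0:2888:3888
server.2=<private ip of ec2 instance 2>:2888:3888
server.3=<private ip of ec2 instance 3>:2888:3888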
Adding additional info regarding ZooKeeper clustering inside Amazon's VPC.
The solution using the VPC IP addresses should be the preferred one; using '0.0.0.0' should be your last option.
If you are using Docker on your EC2 instance, '0.0.0.0' will not work properly with ZooKeeper 3.5.X after a node restart.
The issue lies in how '0.0.0.0' is resolved, how the ensemble shares node addresses, and the SID order (if you start your nodes in descending order, this issue may not occur).
So far the only working solution is to upgrade to version 3.6.2+.