consul agent join - i/o timeout - amazon-web-services

I have 3 Consul Servers I have created within AWS. They were created with Terraform and are joined as part of a cluster.
There is a security group created as part of that Terraform which allowed inbound TCP/UDP on 8300, 8301, 8302, 8400, 8500.
I have installed the consul agent on a new Ubuntu 16.04 instance.
I collected the private IP of one of the Consul servers and tried to join it from the client:
consul agent -join 172.1.1.1:8301 -data-dir /tmp/consul
Result:
==> Starting Consul agent...
==> Joining cluster...
==> 1 error(s) occurred:
* Failed to join 172.1.1.1: dial tcp 172.1.1.1:8301: i/o timeout
I can't see what is missing here that is stopping the client from joining.

There isn't enough data in the question. What do you mean by "collected the private IP"? Was it the server's private IP assigned by the subnet, or is the IP you listed actually one of the "TaggedAddresses" from Consul itself, which is created when you are not running Consul on the host network? So you really need to share some of your Consul server configuration too.
Secondly, if it is the server's private IP, please make sure that there is no issue with the NACL or ephemeral ports. You will find more information at the following link from Amazon's official documentation:
http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_ACLs.html#VPC_ACLs_Ephemeral_Ports
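If it helps to narrow this down, here is a quick sketch of the checks I would run first; the IP and port are taken from your question, and the Consul config path is only a guess at a common default:
# is the Serf LAN port reachable over TCP and UDP from the client?
nc -vz 172.1.1.1 8301
nc -vzu 172.1.1.1 8301
# on the server: what address is Consul actually advertising to the cluster?
consul members
grep -E 'bind_addr|advertise_addr' /etc/consul.d/*.json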

Related

Unable to run kubectl command on fully private EKS Cluster

I am having a very hard time deploying and accessing a fully private EKS cluster.
My Issue:
I have deployed a fully private cluster and I am not able to run kubectl commands, even from a machine in the cluster's VPC. Also, whenever I try to create nodes I get the message "waiting for at least one node to join the cluster", and after 25 minutes it times out.
My Requirements:
I have a fully private VPC called HSCN with 2 private and 2 public subnets. Even though there are public subnets in it, it is still fully private and has no access to outside networks. Then I have another VPC called internet, with 2 private and 2 public subnets. This VPC has access to the internet and is used to access machines in the HSCN VPC (the fully private VPC). In short, it is serving as a gateway. These two VPCs are connected through a VPC Peering connection.
Now, I want to create a fully private cluster in the private subnets of the HSCN VPC. I am following this GUIDE, but I think it is not meant for beginners like me; still, I am doing my best to understand it.
The first requirement it lists is to create a repo, which I think I don't need for now as I am not going to create pods yet.
The 2nd requirement is to create VPC endpoints. When creating an EKS cluster this is supposedly taken care of automatically by EKS, and I can confirm that EKS is creating these endpoints automatically. But I have also created them manually, and I am still not able to run kubectl commands or deploy self-managed nodes.
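(For reference, a manually created interface endpoint looks roughly like this with the AWS CLI; the IDs are placeholders and the exact list of services is only my reading of the guide, so treat them as assumptions:)
# one interface endpoint per service the cluster/nodes need (ecr.api, ecr.dkr, ec2, sts, ...), plus an S3 gateway endpoint
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-xxxxxxxx \
  --vpc-endpoint-type Interface \
  --service-name com.amazonaws.<region>.ecr.api \
  --subnet-ids subnet-aaaaaaaa subnet-bbbbbbbb \
  --security-group-ids sg-xxxxxxxx \
  --private-dns-enabled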
I ran a number of commands to check if anything is wrong with accessing the server address.
nmap -p 443 1E9057EC8C316E£D"#JY$J&G%1C94A.gr7.eu-west-*.eks.amazonaws.com
Starting Nmap 7.80 ( https://nmap.org ) at 2022-09-09 11:11 UTC
Nmap scan report for 1E9057EC8C316E£D"#JY$J&G%1C94A.gr7.eu-west-*.eks.amazonaws.com (192.168.*.*)
Host is up (0.00031s latency).
Other addresses for 1E9057EC8C316E£D"#JY$J&G%1C94A.gr7.eu-west-*.eks.amazonaws.com (not scanned): 192.168.*.*
rDNS record for 192.168.*.*: ip-192-168-*-*.eu-west-*.compute.internal
PORT STATE SERVICE
443/tcp open https
Nmap done: 1 IP address (1 host up) scanned in 0.04 seconds
Another command is
nslookup 1E9057EC8C316E£D"#JY$J&G%1C94A.gr7.eu-west-*.eks.amazonaws.com
Server: 127.0.0.53
Address: 127.0.0.53#53
Non-authoritative answer:
Name: 1E9057EC8C316E£D"#JY$J&G%1C94A.gr7.eu-west-*.eks.amazonaws.com
Address: 192.168.*.*
Name: 1E9057EC8C316E£D"#JY$J&G%1C94A.gr7.eu-west-*.eks.amazonaws.com
Address: 192.168.*.*
And another is
telnet 1E9057EC8C316E£D"#JY$J&G%1C94A.gr7.eu-west-*.eks.amazonaws.com 443
Trying 192.168.*.*...
Connected to 1E9057EC8C316E£D"#JY$J&G%1C94A.gr7.eu-west-*.eks.amazonaws.com
Escape character is '^]'.
^CConnection closed by foreign host.
It is clear that I can access the API server endpoint from my machine, which is in the same VPC as the API server.
But still, when I run a kubectl command I get this output:
Unable to connect to the server: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Can anyone suggest what exactly I need to do?
Thanks
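One way to narrow the gap between "telnet works" and "kubectl times out" is to run kubectl with verbose logging and check for a proxy in the path; a minimal sketch, assuming nothing beyond a working kubeconfig:
kubectl get nodes -v=6          # prints the exact URL kubectl dials and the response or timeout
kubectl config view --minify    # shows which cluster endpoint the current context points at
env | grep -i _proxy            # a stray HTTPS_PROXY / missing NO_PROXY often causes exactly this timeout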

Unable to access external mongo server from a pod but able to connect from EC2 instance

I am trying to connect to a MongoDB instance which is running on an external server from a pod running in a k8s cluster. I have a VPC peering setup between the two VPCs and I am perfectly able to connect to the MongoDB server from the nodes, but when I try from a running pod, it fails. From a traceroute, it looks like traffic to the private IP is not being routed outside of the pod network.
Is there anything else which needs to be configured on pod networking side?
Taking a wild guess here, I believe your podCidr is conflicting with one of the CIDRs of your VPC. For example:
192.168.0.0/16 (podCidr) -> 192.168.1.0/24 (VPC cidr)
# Pod is thinking it needs to talk to another pod in the cluster
# instead of a server
You can see your podCidr with this command (clusterCIDR field):
$ kubectl -n kube-system get cm kube-proxy -o=yaml
Another place where things could be misconfigured is your overlay network, where the pods are not getting a pod IP address.
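If you want to confirm whether the ranges overlap, here is a rough sketch of the comparison (the VPC ID is a placeholder):
# pod network CIDR as seen by kube-proxy
kubectl -n kube-system get cm kube-proxy -o yaml | grep clusterCIDR
# CIDR of the VPC your Mongo server lives in
aws ec2 describe-vpcs --vpc-ids vpc-xxxxxxxx --query 'Vpcs[].CidrBlock'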
This is actually working fine. I was testing the connectivity with telnet from a pod, and since telnet was not returning anything after the successful connection, it seemed that there was some network issue. After testing this with a simple HTTP server and monitoring connections, I saw that it all worked fine.
The IP addresses of the podCidr are overlapping with the VPC CIDR in which your Mongo server is residing, and hence kube-router prefers the internal route table first, which is working as designed.
You either need to reconfigure your VPC network with a different range or change the Kube network.

InfluxDB localmachine not able to recognize other machines having public IP

I want to launch a cluster with one machine inside my home network and the other having a public IP.
I configured the file /etc/default/influxdb by adding the following line:
INFLUXD_OPTS="-join <Public-IP>:8091"
I followed the official documentation for InfluxDB cluster settings.
I added rules for ports 8086 and 8091 to the security groups. I am able to telnet to those ports.
show servers
name: data_nodes
----------------
id http_addr tcp_addr
1 localhost:8086 localhost:8088
name: meta_nodes
----------------
id http_addr tcp_addr
1 localhost:8091 localhost:8088
How do I launch a cluster with one machine in my home network and the other machine in the AWS cloud with a public IP?
The AWS machines cannot reach your localhost. You must use domain names that are fully resolvable by every member of the cluster. Even once that's working, a cluster connected over the public internet is likely to fail due to latency issues.
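As a sketch of what that means for the setting shown in the question (the hostnames are placeholders, and this assumes the old 0.x-era clustering flags, which vary by version): each machine joins via a name the other machine can actually resolve, rather than localhost or a private address.
# /etc/default/influxdb on the home machine
INFLUXD_OPTS="-join aws-node.example.com:8091"
# /etc/default/influxdb on the AWS machine
INFLUXD_OPTS="-join home-node.example.com:8091"
The localhost entries in your show servers output are the symptom: the node is advertising a name the other side cannot use.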

zookeeper installation on multiple AWS EC2instances

I am new to ZooKeeper and AWS EC2. I am trying to install ZooKeeper on 3 EC2 instances.
As per the ZooKeeper documentation, I have installed ZooKeeper on all 3 instances, created zoo.conf and added the configuration below:
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/opt/zookeeper/data
clientPort=2181
server.1=localhost:2888:3888
server.2=<public ip of ec2 instance 2>:2889:3889
server.3=<public ip of ec2 instance 3>:2890:3890
I have also created the myid file on all 3 instances as /opt/zookeeper/data/myid, as per the guidelines.
I have a couple of queries, as below:
Whenever I start the ZooKeeper server on an instance, it starts in standalone mode (as per the logs).
Is the above configuration really going to make the servers connect to each other? What are the ports 2889:3889 and 2890:3890 all about? Do I need to configure them on the EC2 machines, or should I use some other ports?
Do I need to create a security group to open these connections? I am not sure how to do that for an EC2 instance.
How do I confirm that all 3 ZooKeeper servers have started and that they can communicate with each other?
The ZooKeeper configuration is designed such that you can install the exact same configuration file on all servers in the cluster without modification. This makes ops a bit simpler. The component that specifies the configuration for the local node is the myid file.
The configuration you've defined is not one that can be shared across all servers. All of the servers in your server list should be binding to a private IP address that is accessible to other nodes in the network. You're seeing your server start in standalone mode because you're binding to localhost. So, the problem is the other servers in the cluster can't see localhost.
Your configuration should look more like:
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/opt/zookeeper/data
clientPort=2181
server.1=<private ip of ec2 instance 1>:2888:3888
server.2=<private ip of ec2 instance 2>:2888:3888
server.3=<private ip of ec2 instance 3>:2888:3888
The two ports listed in each server definition are respectively the quorum and election ports used by ZooKeeper nodes to communicate with one another internally. There's usually no need to modify these ports, and you should try to keep them the same across servers for consistency.
Additionally, as I said you should be able to share that exact same configuration file across all instances. The only thing that should have to change is the myid file.
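For example, with the shared configuration above, the only per-host step is something like:
# on instance 1 (write 2 and 3 on the other two instances)
echo 1 > /opt/zookeeper/data/myid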
You probably will need to create a security group and open up the client port to be available for clients and the quorum/election ports to be accessible by other ZooKeeper servers.
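A sketch of those rules with the AWS CLI, assuming all three instances share one security group (the group ID and client CIDR are placeholders):
# client port, open to wherever your clients live
aws ec2 authorize-security-group-ingress --group-id sg-xxxxxxxx --protocol tcp --port 2181 --cidr 10.0.0.0/16
# quorum and election ports, open only to other members of the same group
aws ec2 authorize-security-group-ingress --group-id sg-xxxxxxxx --protocol tcp --port 2888 --source-group sg-xxxxxxxx
aws ec2 authorize-security-group-ingress --group-id sg-xxxxxxxx --protocol tcp --port 3888 --source-group sg-xxxxxxxx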
Finally, you might want to look into a UI to help manage the cluster. Netflix makes a decent UI that will give you a view of your cluster and also help with cleaning up old logs and storing snapshots to S3 (ZooKeeper takes snapshots but does not delete old transaction logs, so your disk will eventually fill up if they're not properly removed). But once it's configured correctly, you should be able to see the ZooKeeper servers connecting to each other in the logs as well.
EDIT
#czerasz notes that starting from version 3.4.0 you can use the autopurge.snapRetainCount and autopurge.purgeInterval directives to keep your snapshots clean.
#chomp notes that some users have had to use 0.0.0.0 for the local server IP to get the ZooKeeper configuration to work on EC2. In other words, replace <private ip of ec2 instance 1> with 0.0.0.0 in the configuration file on instance 1. This is counter to the way ZooKeeper configuration files are designed but may be necessary on EC2.
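So, as a concrete sketch, the server list on instance 1 would become:
server.1=0.0.0.0:2888:3888
server.2=<private ip of ec2 instance 2>:2888:3888
server.3=<private ip of ec2 instance 3>:2888:3888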
Adding additional info regarding ZooKeeper clustering inside Amazon's VPC.
The solution using the VPC IP address should be the preferred one; using '0.0.0.0' should be your last option.
If you are using Docker on your EC2 instance, '0.0.0.0' will not work properly with ZooKeeper 3.5.x after a node restart.
The issue lies in resolving '0.0.0.0', the ensemble's sharing of node addresses, and SID order (if you start your nodes in descending order, this issue may not occur).
So far the only working solution is to upgrade to version 3.6.2 or later.

How to connect hornetq on AWS VPC from another vm on AWS

I have 2 VMs on AWS. On the first VM I have HornetQ and an application that sends messages to HornetQ. On the other VM I have an application that is a consumer of HornetQ.
The consumer fails to pull messages from HornetQ, and I can't understand why. HornetQ is running, and I opened the ports to any IP.
I tried to connect to HornetQ with JConsole (from my local computer) and failed, so I can't see whether HornetQ has any consumers/producers.
I've tried to change the 'bind' configurations to 0.0.0.0, but when I restarted HornetQ they were automatically changed back to what I have as the server IP in config.properties.
Any suggestions as to what might be the problem preventing my application from connecting to HornetQ?
Thanks!
These are the things you need to check for connectivity between VMs in a VPC.
The security group of the instance has both ingress and egress configuration settings, unlike the traditional EC2 security group [now EC2-Classic]. Check the egress from your consumer and the ingress to the server.
If the instances are in different subnets, you need to check the network ACL as well; however, the default setting is to allow.
Check whether iptables / an OS-level firewall is blocking the traffic.
With respect to the failed connectivity from your local machine to HornetQ: you need to place the instance in a public subnet and configure the instance's security group accordingly; only then will the app / VM be accessible from the public internet.
I have assumed that both instances are in the same VPC. However, the title of the post sounds slightly misleading; if they are in 2 different VPCs altogether, then the concept of VPC Peering also comes in.
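For the security-group and firewall checks above, a quick sketch from the consumer VM; the IP is a placeholder, and 5445 is only HornetQ's usual default Netty acceptor port, so substitute whatever your acceptor actually binds:
# can the consumer open a TCP connection to the broker at all?
nc -vz 10.0.1.10 5445
# is a local firewall rule dropping it?
sudo iptables -L -n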