Unable to run kubectl command on fully private EKS Cluster - amazon-web-services

I am having a very hard time to deploy and access fully private eks cluster.
My Issue:
I have deployed a fully private cluster and I am not able to run kubectl command even from the machine in cluster's VPC. Also, whenever I try to create nodes I get message waiting for at least one node to join the cluster and then after 25 minutes it time's out.
My Requirements:
I have a fully private VPC called HSCN with 2 private and 2 public subnets. Even through there are public subnets in it but still it is fully private and has no access to outside network. Then, I have another VPC called internet with 2 private and 2 public subnets. This VPC has access to internet and is used to access machines in the HSCN vpc(fully private vpc). In short, it is serving as a gateway. These both VPC are connected through VPCPeering Connetion.
Now, I want to create a fully private cluster in the private subnet of the hscn vpc. I am following this GUIDE but I think this guide is not meant for beginners like me but still I am doing my best to understand it.
The first requirement it says to create a repo which I think I don't need for now as I am not goind to create pod.
The 2nd requirement require us to create VPC endpoints. If we are creating an EKS CLUSTER then it is automatically taken care by eks. I can confirm that eks is creating these endpoint automatically. But I have created manually and still I am not able to run kubectl commands and deploy self-manged nodes.
I ran a number of commands to check if anything is wrong with accessing the server address.
nmap -p 443 1E9057EC8C316E£D"#JY$J&G%1C94A.gr7.eu-west-*.eks.amazonaws.com
Starting Nmap 7.80 ( https://nmap.org ) at 2022-09-09 11:11 UTC
Nmap scan report for 1E9057EC8C316E£D"#JY$J&G%1C94A.gr7.eu-west-*.eks.amazonaws.com (192.168.*.*)
Host is up (0.00031s latency).
Other addresses for 1E9057EC8C316E£D"#JY$J&G%1C94A.gr7.eu-west-*.eks.amazonaws.com (not scanned): 192.168.*.*
rDNS record for 192.168.*.*: ip-192-168-*-*.eu-west-*.compute.internal
PORT STATE SERVICE
443/tcp open https
Nmap done: 1 IP address (1 host up) scanned in 0.04 seconds
Another command is
nslookup 1E9057EC8C316E£D"#JY$J&G%1C94A.gr7.eu-west-*.eks.amazonaws.com
Server: 127.0.0.53
Address: 127.0.0.53#53
Non-authoritative answer:
Name: 1E9057EC8C316E£D"#JY$J&G%1C94A.gr7.eu-west-*.eks.amazonaws.com
Address: 192.168.*.*
Name: 1E9057EC8C316E£D"#JY$J&G%1C94A.gr7.eu-west-*.eks.amazonaws.com
Address: 192.168.*.*
And another is
telnet 1E9057EC8C316E£D"#JY$J&G%1C94A.gr7.eu-west-*.eks.amazonaws.com 443
Trying 192.168.*.*...
Connected to 1E9057EC8C316E£D"#JY$J&G%1C94A.gr7.eu-west-*.eks.amazonaws.com
Escape character is '^]'.
^CConnection closed by foreign hos
It is clear that I can access the api server endpoints from my machine which is in the same vpc as the api server.
But still when I run the kubectl command I am getting this output
Unable to connect to the server: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Can anyone suggest me what exactly I need to do?
Thanks

Related

Cant connect to AWS EC2

I have a ec2 instance with public ip in public subnet Ubuntu 20.04, everything seems open- But I cant ping or ssh to the instance:
NACL ( I/b & O/b) : 100 - all/all 0.0.0.0/0
Route Table- 0.0.0.0/0 igw
SG: 8080, 443, 22 - 0/0
I have this " open address " hyperlink in the ec2 instance console next to public-IP and public DNS - when I try opening it does not open- can't reach this page - tries https://IP.
Putty times out, also cmd ssh:
ssh -i "pem-file.pem" ubuntu#IP
ssh: connect to host 'IP' port 22: Connection timed out
When an SSH connection times-out, it is normally an indication that network traffic is not getting to the Amazon EC2 instance.
Things to check:
The instance is running Linux
The instance is launched in a public subnet, which is defined as having a Route Table entry to points to an Internet Gateway
The instance has a public IP address, which you are using for the connection
The Network Access Control Lists (NACLs) are set to their default "Allow All" values
A Security Group associated with the instance that permits inbound access on port 22 (SSH) either from your IP address, or from the Internet (0.0.0.0/0)
Your corporate network permits an outbound SSH connection (try alternate networks, eg home vs work vs tethered to your phone)
See also: Troubleshooting connecting to your instance - Amazon Elastic Compute Cloud
If you continue to have problems, then use ssh -vvv ... to activate debugging, and add the output to your Question.
Once you have SSH working, then try to get 443 working.
Do not use Ping to test connectivity because that requires additional rules in the Security Group, and all it tests is whether Ping is working.
Reachability Analyzer is an easy solution. It will analyze the requested path and direct you to the problem.
1. VPC > Reachability Analyzer
2. Create and analyze path
In this case, I would check the path from the Internet GW to the instance on port 22
3. Find the problem
Once the analysis is completed you can find the issue. In my case it's a routing table with no route to the internet GW
4. Fix the problem
Let's add the needed route
5. Verify the path again
Rerun the analysis again
6. SSH is working
ssh -i "my_key.pem" ec2-user#ec2-900-227-116-41.compute-1.amazonaws.com
__| __|_ )
_| ( / Amazon Linux 2022 AMI
___|\___|___| Preview
http://aws.amazon.com/linux/amazon-linux-2022
Last login: Wed Dec 1 09:18:54 2021 from 84.110.59.182
[ec2-user#ip-264-31-83-228 ~]$

Unable to access external mongo server from a pod but able to connect from EC2 instance

I am trying to connect to a MongoDB instance which is running on an external server from a pod running in a k8s cluster. I do have a VPC peering setup between two VPCs and I am perfectly able to connect to MongoDB server from nodes but when I try from a running pod, it fails. On trying traceroute, I think the private IP is not being resolved outside of the pod network.
Is there anything else which needs to be configured on pod networking side?
Taking a wild guess here, I believe your podCidr is conflicting with one of the Cidrs on your VPC. For example:
192.168.0.0/16 <podCidr) -> 192.168.1.0/24 (VPC cidr)
# Pod is thinking it needs to talk to another pod in the cluster
# instead of a server
You can see your podCidr with this command (clusterCIDR field):
$ kubectl -n kube-system get cm kube-proxy -o=yaml
Another aspect where things could be misconfigured could be your overlay network, where the pods are not getting pod IP address.
This is working fine. I was testing the connectivity with telnet from a pod and since telnet was not returning anything after the successful connection, it seemed that there was some network issue. After testing this with a simplehttp server and monitoring connections, I saw that it all worked fine.
The IP Addresses of podCidr is overlapping with VPC Cidr in which your mongo server is residing and hence Kube Router is preferring the internal route table 1st which is working as designed.
You wither need to reconfigure your VPC Network with a different network or the Kube Network.

Cannot connect to EC2 - ssh: connect to host port 22: Connection refused

I am currently overseas and I am trying to connect to my EC2 instance through ssh but I am getting the error ssh: connect to host ec2-34-207-64-42.compute-1.amazonaws.com port 22: Connection refused
I turned on my vpn to New York but still nothing changes. What reasons could there be for not being able to connect to this instance?
The instance is still running and serving the website but I am not able to connect through ssh. Is this a problem with the wifi where I am staying or with the instance itself?
My debugging steps to EC2 connection time out
Double check the security group access for port 22
Make sure you have your current IP on there and update to be sure it hasn't changed
Make sure the key pair you're attempting to use corresponds to the one attached to your EC2
Make sure your key pair on your local machine is chmod'ed correctly. I believe it's chmod 600 keypair.pem check this
Make sure you're in either your .ssh folder on your host OR correctly referencing it: HOME/.ssh/key.pem
Last weird totally wishy washy checks:
reboot instance
assign elastic IP and access that
switch from using the IP to Public DNS
add a : at the end of user#ip:
Totally mystical debugging sets for 6 though. That's part of the "my code doesn't work - don't know why. My code does work - don't know why." Category
Note:
If you access your EC2 while you are connected to a VPN, do know that your IP changes! So enable incoming traffic from your VPN's IP on your EC2 security group.
In AWS, navigate to Services > EC2.
Under Resources, select Running Instances.
Highlight your instance and click Connect.
In Terminal, cd into the directory containing your key and copy the command in step 3 under "To access your instance."
In Terminal, run: ssh -vvv -i [MyEC2Key].pem ec2-user#xx.xx.xx.xx(xx.xx.xx.xx = your EC2 Public IP) OR run the command in the example under step 4.
Just check if your public ip that you get when you are on VPN is configured as a source address in the SG inbound entry that opens up port 22.
You can check your ip using https://www.google.co.in/search?q=whats+my+ip, when connected to your VPN.
I tried everything in this and several other answers, also in some aws youtube videos. Lost perhaps five hours over a few sessions trying to solve it and now finally..
I was getting the exact same error message as the OP. I even rented another EC2 instance in a nearer data centre for twenty minutes to see if that was it.
Then I thought it might be the router or internet provider in the guest house where I am staying. Had already noticed that some non-mainstream news sites had been blocked - and that was it!
You can check if the router is blocking port 22:
https://superuser.com/questions/1336054/how-to-detect-if-a-network-is-blocking-outgoing-ports
cardamom#neptune $ time nmap -p 22 portquiz.net
Starting Nmap 7.70 ( https://nmap.org ) at 2021-02-03 20:43 CET
Nmap scan report for portquiz.net (27.39.379.385)
Host is up (0.028s latency).
rDNS record for 27.39.379.385: ec2-27-39-379-385.eu-west-3.compute.amazonaws.com
PORT STATE SERVICE
22/tcp closed ssh
Nmap done: 1 IP address (1 host up) scanned in 0.19 seconds
real 0m0,212s
user 0m0,034s
sys 0m0,017s
Then, the question of why someone would want to block the ssh port 22 is addressed in at length here:
https://serverfault.com/questions/25545/why-block-port-22-outbound
Had the same problem after creating some instances on a new VPC. (If internet SSH worked before this solution may not work for you)
When creating a new VPC, make sure you create an internet gateway (VPC -> Internet Gateways)
And also make sure that your VPC's routing table (VPC -> Route Tables) has an entry which redirects all IPs (or just your IP) to the internet gateway you just created.
For me, it was because of this:
NOT ec2-user#xx.xx.xx.xx
BUT THIS =>>> ubuntu#xx.xx.xx.xx
Watch the image of EC2 instance!
Instead of
ssh -i "key.pem" ubuntu#ec2-161-smth.com
use
ssh -i "key.pem" ec2-user#ec2-161-smth.com

consul agent join - i/o timeout

I have 3 Consul Servers I have created within AWS. They were created with Terraform and are joined as part of a cluster.
There is a security group created as part of that Terraform which allowed inbound TCP/UDP on 8300, 8301, 8302, 8400, 8500.
I have installed the consul agent on a new Ubuntu 16.04 instance.
I collect the private IP of one of the Consul servers and try to join it from the client:
consul agent -join 172.1.1.1:8301 -data-dir /tmp/consul
Result:
==> Starting Consul agent...
==> Joining cluster...
==> 1 error(s) occurred:
* Failed to join 172.1.1.1: dial tcp 172.1.1.1:8301: i/o timeout
I can't see what is missing here that is stopping the client from joining.
Not enough data in the question. What do you mean you collected the private IP, was it the server's private IP assigned by the subnet, or is the IP you listed actually a "TaggedAddresses" from the consul itself, which is created if you are not running consul on the host network. So clearly, you need to share some of your consul server configuration too.
Secondly, if it the server's private IP only, please make sure that there is no issue in the NACL or ephemeral ports. You will find more information on the following link from amazon's official documentation:
http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_ACLs.html#VPC_ACLs_Ephemeral_Ports

InfluxDB localmachine not able to recognize other machines having public IP

I want to launch a cluster with one machine inside my home network and other having public IP.
I configured the file /etc/default/influxdb/ by adding the following line:
INFLUXD_OPTS="-join <Public-IP>:8091"
I followed official documentation of influxdb cluster settings.
I added rules for port 8086,8091 to the security groups. I am able to do telnet to that port.
show servers
name: data_nodes
----------------
id http_addr tcp_addr
1 localhost:8086 localhost:8088
name: meta_nodes
----------------
id http_addr tcp_addr
1 localhost:8091 localhost:8088
How to launch a cluster with one machine in my home network and other machine in aws cloud having public IP?
The AWS machines cannot reach your localhost. You must use domain names that are fully resolvable by every member of the cluster. Even once that's working, a cluster connected over the public internet is likely to fail due to latency issues.