"kubectl get " cli response time too high on k8s cluster - kubectl

I had two small k8s clusters (one with flannel and one with calico).
The exact same steps were used to install both clusters, the only difference being the choice of Pod network at install time (one uses flannel, one uses calico).
The issue was that the "kubectl get all" command had very different response times on the two clusters: it took roughly a minute to respond on the calico cluster, while the flannel cluster responded instantly.
I was sure the issue was not due to the Pod network choice, as I had no problems spinning up pods etc. on either cluster; both were working as expected.
Time on the flannel-based k8s cluster: 0m0.167s
$ time kubectl get all
NAME READY STATUS RESTARTS AGE
pod/nginx-6db489d4b7-h2mvv 1/1 Running 0 17m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 10d
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/nginx 1/1 1 1 17m
NAME DESIRED CURRENT READY AGE
replicaset.apps/nginx-6db489d4b7 1 1 1 17m
real 0m0.167s
user 0m0.100s
sys 0m0.028s
Time on the calico-based k8s cluster: the command hangs and responds only after nearly a minute, 0m59.294s
$ time kubectl get all
NAME READY STATUS RESTARTS AGE
pod/nginx-6db489d4b7-b8c2g 1/1 Running 0 11m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 20m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/nginx 1/1 1 1 11m
NAME DESIRED CURRENT READY AGE
replicaset.apps/nginx-6db489d4b7 1 1 1 11m
real 0m59.294s
user 0m0.316s
sys 0m0.072s
At cluster install time, I made sure to run the commands below as a regular user, as instructed by kubeadm:
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
$ ll .kube/config
-rw------- 1 ubuntu ubuntu 5455 Jan 23 10:17 .kube/config
I tried setting the KUBECONFIG environment variable:
export KUBECONFIG=$HOME/.kube/config
Still, this did not fix the kubectl response time.

After spending a lot of time looking in the wrong direction, I identified that the cache and http-cache folders under $HOME/.kube did not have the correct ownership.
Once the permissions on those two folders (cache and http-cache) under $HOME/.kube were fixed as well, the kubectl get response time was back to normal.
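For reference, the fix was simply re-running chown for those cache directories too, a minimal sketch assuming the same regular user that owns the kubeconfig:
$ sudo chown -R $(id -u):$(id -g) $HOME/.kube/cache $HOME/.kube/http-cache
After that, the directory listing looks like this: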
ubuntu@k8s-calico-master-1:~/.kube$ ll
total 24
drwxrwxr-x 4 ubuntu ubuntu 4096 Jan 23 10:18 ./
drwxr-xr-x 5 ubuntu ubuntu 4096 Jan 23 10:17 ../
drwxr-x--- 3 ubuntu ubuntu 4096 Jan 23 10:18 cache/
-rw------- 1 ubuntu ubuntu 5455 Jan 23 10:17 config
drwxr-x--- 3 ubuntu ubuntu 4096 Jan 23 10:18 http-cache/
ubuntu@k8s-calico-master-1:~/.kube$ cd cache/
ubuntu@k8s-calico-master-1:~/.kube/cache$ ll
total 12
drwxr-x--- 3 ubuntu ubuntu 4096 Jan 23 10:18 ./
drwxrwxr-x 4 ubuntu ubuntu 4096 Jan 23 10:18 ../
drwxr-x--- 3 ubuntu ubuntu 4096 Jan 23 10:18 discovery/
ubuntu@k8s-calico-master-1:~/.kube$ cd http-cache/
ubuntu@k8s-calico-master-1:~/.kube/http-cache$ ll
total 164
drwxr-x--- 3 ubuntu ubuntu 4096 Jan 23 10:18 ./
drwxrwxr-x 4 ubuntu ubuntu 4096 Jan 23 10:18 ../
drwxr-x--- 2 ubuntu ubuntu 4096 Jan 23 10:18 .diskv-temp/
-rw-rw---- 1 ubuntu ubuntu 813 Jan 23 10:18 f436dd33b3ceee24aa367363c323688e
ubuntu@k8s-calico-master-1:~/.kube/http-cache$ time kubectl get all
NAME READY STATUS RESTARTS AGE
pod/nginx-6db489d4b7-b8c2g 1/1 Running 0 52m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 61m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/nginx 1/1 1 1 52m
NAME DESIRED CURRENT READY AGE
replicaset.apps/nginx-6db489d4b7 1 1 1 52m
real 0m0.104s
user 0m0.104s
sys 0m0.024s

Related

How to deploy older ingress-nginx-controller or specify version with minikube?

I am trying to deploy a specific version of the ingress controller with minikube and kubernetes v1.13, but from what I see it is only possible to have the latest version of ingress-nginx-controller deployed.
I expect the ingress-nginx-controller-#####-#### pod to come back online and run with the nginx-ingress image version I point to in the deployment's details.
After editing the ingress-nginx-controller deployment via kubectl edit and changing the image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller tag from 0.32.0 to 0.24.1, the pod restarts and goes into CrashLoopBackOff state.
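For context, the edit itself was along these lines (a sketch; the deployment sits in kube-system here because that is where the addon installed it):
$ kubectl -n kube-system edit deployment ingress-nginx-controller
# in the editor, the container image tag was changed to:
#   image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.24.1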
Running kubectl describe on the pod, it seems to complain about the node not having free ports:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 5m8s (x2 over 5m8s) default-scheduler 0/1 nodes are available: 1 node(s) didn't have free ports for the requested pod ports.
Normal Scheduled 4m54s default-scheduler Successfully assigned kube-system/ingress-nginx-controller-6c4b64d58c-s5ddz to minikube
After searching for similar cases, I tried the following:
I checked with ss but see no port 80 or 443 in use on the host:
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
LISTEN 0 32 192.168.122.1:53 0.0.0.0:*
LISTEN 0 4096 127.0.0.53%lo:53 0.0.0.0:*
LISTEN 0 5 127.0.0.1:631 0.0.0.0:*
LISTEN 0 5 [::1]:631 [::]:*
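(The listener check above was run with something like the following; the exact flags are my assumption:)
$ sudo ss -ltnp
# or, filtering for the ingress ports only:
$ sudo ss -ltnp | grep -E ':(80|443) '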
No pods seem to be in Terminating status:
NAME READY STATUS RESTARTS AGE
coredns-86c58d9df4-7s55r 1/1 Running 1 3h14m
coredns-86c58d9df4-rtssn 1/1 Running 1 3h14m
etcd-minikube 1/1 Running 1 3h13m
ingress-nginx-admission-create-gpfml 0/1 Completed 0 47m
ingress-nginx-admission-patch-z96hd 0/1 Completed 0 47m
ingress-nginx-controller-6c4b64d58c-s5ddz 0/1 CrashLoopBackOff 9 24m
kube-apiserver-minikube 1/1 Running 0 145m
kube-controller-manager-minikube 1/1 Running 0 145m
kube-proxy-pmwxr 1/1 Running 0 144m
kube-scheduler-minikube 1/1 Running 0 145m
storage-provisioner 1/1 Running 2 3h14m
I did not create any yaml file or custom deployment; I just installed minikube and enabled the ingress addon.
How can I use a different nginx-ingress-controller version?
The nginx-ingress-controller version is tied to the minikube version.
First I tried previous versions. Unfortunately, the available Minikube v1.3 uses nginx 0.25.0 and Minikube v1.2 uses nginx 0.23.0.
So the only way I found to run nginx 0.24.1 in Minikube was to build the binary myself from minikube v1.4; here is the step-by-step:
Download the minikube 1.4 repository and extract it:
$ wget https://github.com/kubernetes/minikube/archive/v1.4.0.tar.gz
$ tar -xvzf v1.4.0.tar.gz
Then, cd into the newly created minikube-1.4.0 folder and edit the file deploy/addons/ingress/ingress-dp.yaml.tmpl, changing the image version to 0.24.1 as below:
spec:
serviceAccountName: ingress-nginx
containers:
- name: controller
image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.24.1
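Equivalently, the change can be scripted with sed; a sketch that assumes the image line is the only one matching this pattern in the template:
$ sed -i 's|nginx-ingress-controller:.*|nginx-ingress-controller:0.24.1|' deploy/addons/ingress/ingress-dp.yaml.tmpl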
In order to build, you have to download a Go distribution from the official site: https://golang.org/dl/
Then follow the steps at https://golang.org/doc/install to install it. If you are running 64-bit Linux, you can use the commands below:
$ wget https://dl.google.com/go/go1.14.4.linux-amd64.tar.gz
$ sudo tar -C /usr/local -xzf go1.14.4.linux-amd64.tar.gz
$ export PATH=$PATH:/usr/local/go/bin
Then from the Minikube 1.4.0 folder, run make:
/minikube-1.4.0$ ls
CHANGELOG.md CONTRIBUTING.md go.mod images Makefile OWNERS SECURITY_CONTACTS test.sh
cmd deploy go.sum installers netlify.toml pkg site third_party
code-of-conduct.md docs hack LICENSE README.md test translations
/minikube-1.4.0$ make
It may take a few minutes to download all dependencies. Then copy the freshly built binary to /usr/local/bin and start minikube:
/minikube-1.4.0$ cd out/
/minikube-1.4.0/out$ ls
minikube minikube-linux-amd64
$ sudo cp minikube-linux-amd64 /usr/local/bin/minikube
$ minikube version
minikube version: v1.4.0
$ minikube start --vm-driver=kvm2 --kubernetes-version 1.13.12
NOTE: if you get an error about kvm2 driver when starting minikube, run the following command:
$ curl -LO https://storage.googleapis.com/minikube/releases/latest/docker-machine-driver-kvm2 && sudo install docker-machine-driver-kvm2 /usr/local/bin/
This version comes with ingress enabled by default. Let's check the deployment status:
$ minikube addons list | grep ingress
- ingress: enabled
$ kubectl describe deploy nginx-ingress-controller -n kube-system | grep Image:
Image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.24.1
$ kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-54ff9cd656-d95w5 1/1 Running 0 2m14s
coredns-54ff9cd656-tnvnw 1/1 Running 0 2m14s
etcd-minikube 1/1 Running 0 78s
kube-addon-manager-minikube 1/1 Running 0 71s
kube-apiserver-minikube 1/1 Running 0 71s
kube-controller-manager-minikube 1/1 Running 0 78s
kube-proxy-wj2d6 1/1 Running 0 2m14s
kube-scheduler-minikube 1/1 Running 0 87s
nginx-ingress-controller-f98c6df-5h2l7 1/1 Running 0 2m9s
storage-provisioner 1/1 Running 0 2m8s
As you can see, the pod nginx-ingress-controller-f98c6df-5h2l7 is in the Running state.
If you have any questions, let me know in the comments.

docker socket at /var/run/docker.sock with AWS

I have an issue where a few tools, Portainer for example, can't find the docker socket on AWS.
I have some setup scripts that were run to set up various containers.
On MacOS, it works without problems.
On a CentOS box, no problem as well.
On CentOS / AWS, containers cannot connect to the docker socket.
I am talking about a local unsecured connection to /var/run/docker.sock
What could be different on AWS?
I can see the socket:
➜ run ls -ld /var/run/docker*
drwxr-xr-x 8 root root 200 Nov 27 14:04 /var/run/docker
-rw-r--r-- 1 root root 4 Nov 27 14:03 /var/run/docker.pid
srw-rw-r-- 1 root docker 0 Nov 27 14:03 /var/run/docker.sock
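For reference, the containers reach the daemon through the usual bind-mount of the socket, roughly like this (illustrative image and flags, not my exact setup script):
$ docker run -d -p 9000:9000 \
    -v /var/run/docker.sock:/var/run/docker.sock \
    portainer/portainer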

Minikube with Virtualbox or KVM using lots of CPU on Centos 7

I've installed minikube as per the kubernetes instructions.
After starting it, and waiting a while, I noticed that it is using a lot of CPU, even though I have nothing particular running in it.
top shows this:
%Cpu(s): 0.3 us, 7.1 sy, 0.5 ni, 92.1 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 32521856 total, 2259992 free, 9882020 used, 20379844 buff/cache
KiB Swap: 2097144 total, 616108 free, 1481036 used. 20583844 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
4847 root 20 0 3741112 91216 37492 S 52.5 0.3 9:57.15 VBoxHeadless
lscpu shows this:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1
Vendor ID: AuthenticAMD
CPU family: 21
Model: 2
Model name: AMD Opteron(tm) Processor 3365
I see the same effect if I use KVM instead of VirtualBox.
kubectl get services
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 20m
I installed metrics-server and it outputs this:
kubectl top node minikube
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
minikube 334m 16% 1378Mi 76%
kubectl top pods --all-namespaces
NAMESPACE NAME CPU(cores) MEMORY(bytes)
default hello-minikube-56cdb79778-rkdc2 0m 3Mi
kafka-data-consistency zookeeper-84fb4cd6f6-sg7rf 1m 36Mi
kube-system coredns-fb8b8dccf-2nrl4 4m 15Mi
kube-system coredns-fb8b8dccf-g6llp 4m 8Mi
kube-system etcd-minikube 38m 41Mi
kube-system kube-addon-manager-minikube 31m 6Mi
kube-system kube-apiserver-minikube 59m 186Mi
kube-system kube-controller-manager-minikube 22m 41Mi
kube-system kube-proxy-m2fdb 2m 17Mi
kube-system kube-scheduler-minikube 2m 11Mi
kube-system kubernetes-dashboard-79dd6bfc48-7l887 1m 25Mi
kube-system metrics-server-cfb4b47f6-q64fb 2m 13Mi
kube-system storage-provisioner 0m 23Mi
Questions:
1) is it possible to find out why it is using so much CPU? (note that I am generating no load, and none of my containers are processing any data)
2) is that normal?
Are you sure nothing is running? What happens if you type kubectl get pods --all-namespaces? By default kubectl only displays the pods in the default namespace (thus excluding the pods in system namespaces such as kube-system).
Also, while I am no CPU expert, this seems like a reasonable consumption for the hardware you have.
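For example (standard kubectl; -o wide also shows which node each pod is scheduled on):
$ kubectl get pods --all-namespaces -o wide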
In response to question 1):
You can ssh into minikube and from there you can run top to see the processes which are running:
minikube ssh
top
There is a lot of docker and kubelet stuff running:
top - 21:43:10 up 8:27, 1 user, load average: 10.98, 12.00, 11.46
Tasks: 148 total, 1 running, 147 sleeping, 0 stopped, 0 zombie
%Cpu0 : 15.7/15.7 31[|||||||||||||||||||||||||||||||| ]
%Cpu1 : 6.0/10.0 16[|||||||||||||||| ]
GiB Mem : 92.2/1.9 [ ]
GiB Swap: 0.0/0.0 [ ]
11842 docker 20 0 24.5m 3.1m 0.7 0.2 0:00.71 R `- top
1948 root 20 0 480.2m 77.0m 8.6 4.1 27:45.44 S `- /usr/bin/dockerd -H tcp://0.0.0.0:2376 -H unix:///var/run/docker.sock --tlsverify --tlscacert /etc/docker/ca+
...
3176 root 20 0 10.1g 48.4m 2.0 2.6 17:45.61 S `- etcd --advertise-client-urls=https://192.168.39.197:2379 --cert-file=/var/lib/minikube/certs/etc+
The two processes with roughly 27 and 17 minutes of processor time (dockerd and etcd) are the culprits.
In response to question 2): No idea, but it could be. See the answer from @alassane-ndiaye.

Running VirtualBox in Concourse Task

I'm trying to build vagrant boxes with concourse. I'm using the concourse/buildbox-ci image, which is used in concourse's own build pipeline to build the concourse-lite vagrant box.
Before running packer I'm creating the virtualbox devices so they match the host's devices. Nevertheless the packer build fails with:
==> virtualbox-iso: Error starting VM: VBoxManage error: VBoxManage: error: The virtual machine 'packer-virtualbox-iso-1488205144' has terminated unexpectedly during startup with exit code 1 (0x1)
==> virtualbox-iso: VBoxManage: error: Details: code NS_ERROR_FAILURE (0x80004005), component MachineWrap, interface IMachine
Has somebody got this working?
Is the concourse hetzner worker configuration accessible anywhere?
Additional configuration info:
in the concourse job container:
# ls -al /dev/vboxdrv /dev/vboxdrvu /dev/vboxnetctl
crw------- 1 root root 10, 53 Feb 27 14:19 /dev/vboxdrv
crw------- 1 root root 10, 52 Feb 27 14:19 /dev/vboxdrvu
crw------- 1 root root 10, 51 Feb 27 14:19 /dev/vboxnetctl
on the worker host:
# ls -al /dev/vbox*
crw------- 1 root root 10, 53 Feb 24 09:40 /dev/vboxdrv
crw------- 1 root root 10, 52 Feb 24 09:40 /dev/vboxdrvu
crw------- 1 root root 10, 51 Feb 24 09:40 /dev/vboxnetctl
concourse job:
jobs:
- name: mpf
serial_groups: [build]
plan:
- get: vagrant
trigger: true
- get: version
resource: version-mpf
- task: build
privileged: true
file: vagrant/ci/tasks/build.yml
tags: [vm-builder]
params:
TEMPLATE_FILE: virtualbox-mpf.json
vagrant/ci/scripts/build.sh:
#!/bin/bash -ex
mknod -m 0600 /dev/vboxdrv c 10 53
mknod -m 0600 /dev/vboxdrvu c 10 52
mknod -m 0600 /dev/vboxnetctl c 10 51
for name in $(VBoxManage list hostonlyifs | grep '^Name:' | awk '{print $NF}'); do
VBoxManage hostonlyif remove $name
done
VERSION=$(cat version/version)
packer build -var "version=${VERSION}" vagrant/packer/${TEMPLATE_FILE}
vagrant/ci/tasks/build.yml:
---
platform: linux
image_resource:
type: docker-image
source: {repository: concourse/buildbox-ci}
inputs:
- name: vagrant
- name: version
outputs:
- name: build
run:
path: vagrant/ci/scripts/build.sh
Unfortunately the Hetzner worker configuration is basically just us periodically upgrading VirtualBox and fixing things when it falls over. (edit: we also make sure to use the same OS distro in the host and in the container - in our case, Arch Linux).
Make sure your VirtualBox version matches the version in the container - down to the patch version.
The device IDs (10,53 and 10,52 and 10,51) also must match those found on the host - these vary from version to version of VirtualBox.
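A quick sanity check for both points, assuming VBoxManage is available on the worker host and inside the task container:
# on the worker host
$ VBoxManage --version
$ ls -l /dev/vboxdrv /dev/vboxdrvu /dev/vboxnetctl
# inside the concourse task container
$ VBoxManage --version
$ ls -l /dev/vboxdrv /dev/vboxdrvu /dev/vboxnetctl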
We also make sure to use a special backend that does not perform any network namespacing, which is important if you're spinning up VMs that need a host-only network.
This whole thing's tricky. :/

AWS Block Device names don't match CentOS symlinks

On AWS EC2, the block devices are identified as /dev/sda, /dev/sdf and /dev/sdg, but inside the EC2 CentOS instance, when I do ll /dev/sd* it gives the following:
lrwxrwxrwx. 1 root root 4 Feb 17 03:10 /dev/sda -> xvde
lrwxrwxrwx. 1 root root 4 Feb 17 03:10 /dev/sdj -> xvdj
lrwxrwxrwx. 1 root root 4 Feb 17 03:10 /dev/sdk -> xvdk
lrwxrwxrwx. 1 root root 5 Feb 17 03:10 /dev/sdk1 -> xvdk1
When I run ec2-describe-instances --aws-access-key xxxxxx<MyKey>xxx --aws-secret-key xxxxxx<MyKey>xxx --region us-east-1 `curl -s http://169.254.169.254/latest/meta-data/instance-id` | grep -i BLOCKDEVICE, the output is as follows:
/dev/sda
/dev/sdf
/dev/sdg
I am wondering how to map the two views: the block devices shown in the AWS console and the block devices seen inside the EC2 instance?
Thanks,
This is a device mapping alias problem. You can see more details with a solution here:
https://forums.aws.amazon.com/message.jspa?messageID=255240
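From inside the instance, the EC2 instance metadata service can also be used to cross-reference the console's device names; a sketch, and the exact mapping keys (ami, ebs1, ...) vary per instance:
# list the block-device-mapping keys the console assigned
$ curl -s http://169.254.169.254/latest/meta-data/block-device-mapping/
# show the device name behind a given key, e.g. the root volume
$ curl -s http://169.254.169.254/latest/meta-data/block-device-mapping/ami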
Make sure you take backups of everything before making any changes!