How to list only master nodes from kubectl output? - kubectl

$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
ip<IP>.ec2.internal Ready master 300d v1.15.3
ip<IP>.ec2.internal Ready node 180d v1.15.3
ip<IP>.ec2.internal Ready master 300d v1.15.3
ip<IP>.ec2.internal Ready node 300d v1.15.3
ip<IP>.ec2.internal Ready node 300d v1.15.3
ip<IP>.ec2.internal Ready,SchedulingDisabled node 180d v1.15.3
ip<IP>.ec2.internal Ready node 180d v1.15.3
ip<IP>.ec2.internal Ready master 300d v1.15.3
ip<IP>.ec2.internal Ready node 300d v1.15.3
What I want is output containing only the node names (the first column), and only for the nodes that are masters. I tried the script way:
#!/bin/bash
kubectl get nodes --selector=node-role.kubernetes.io/master > nodelist.txt
cat nodelist.txt
while read -r f1 _   # read only the first whitespace-separated field into f1
do
  echo "$f1"
done < nodelist.txt
But I would like a method using kubectl --custom-columns or JSON filtering instead, please suggest one.

You can also use labels and jsonpath to select anything you need from the kubectl get nodes -o json output:
kubectl get nodes -l node-role.kubernetes.io/master -o 'jsonpath={.items[*].metadata.name}'
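If you prefer one name per line instead of a space-separated list, a range expression in the jsonpath works as well (a small variation on the command above):
kubectl get nodes -l node-role.kubernetes.io/master -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}'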
By the way, you can use the Kubernetes kubectl Cheat Sheet if you get lost at any point. It has the most frequently used commands.

I did not try this but it should give you the output you want.
kubectl get nodes | grep master | awk '{print $1, $3}'
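Note that grep master would also match a node whose name happens to contain "master"; if that is a concern, you can let awk filter on the ROLES column instead (a sketch, assuming the column layout shown in the question):
kubectl get nodes --no-headers | awk '$3 == "master" {print $1}'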

But I would like a method using kubectl --custom-columns or JSON filtering instead, please suggest one.
Yes, you can use --custom-columns to show only the name:
kubectl get nodes -o custom-columns=NAME:.metadata.name
NAME
my-node
In addition, you can omit the headers using --no-headers
kubectl get nodes -o custom-columns=NAME:.metadata.name --no-headers
my-node
Using the selector you provided, to only show master nodes, the full command is this:
kubectl get nodes --selector=node-role.kubernetes.io/master -o custom-columns=NAME:.metadata.name --no-headers
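If you still want to drive this from a script like your original one, you can simply loop over that output; a minimal sketch:
#!/bin/bash
# print each master node name on its own line (same selector as above)
for node in $(kubectl get nodes \
    --selector=node-role.kubernetes.io/master \
    -o custom-columns=NAME:.metadata.name --no-headers); do
  echo "$node"
done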

Related

Attaching volume to already existing folder deleted all the data in digital ocean

I bought a 100 GB volume and accidentally mounted it over my /var folder.
Now all the previous data is gone.
The command I ran: mount -o discard,defaults,noatime /dev/disk/by-id/scsi-0DO_Volume_volume-sgp1-01 /var
How can I undo this last action in DigitalOcean?
Unmounting the volume brings back all of the data; mounting over /var only hid the existing files, it did not delete them.
Get the volume's mount point with df if you don't already know it: sudo df --human-readable --print-type
Unmount the volume with umount: sudo umount --verbose /mnt/use_your_mount_point
If it says "device is busy", kill the processes holding it open:
Find them with sudo lsof +f -- /mnt/use_your_mount_point
Kill each process found: ps -ef | grep <pid> and/or kill -9 <pid>
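Put together, the recovery looks roughly like this (using /var as the mount point, since that is where the volume was mounted in the question; the PID comes from the lsof output):
sudo df --human-readable --print-type      # confirm the volume's mount point
sudo lsof +f -- /var                       # list any processes keeping it busy
# sudo kill -9 <pid>                       # kill them first if umount says "device is busy"
sudo umount --verbose /var                 # unmount; the original /var data is visible again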

ssh tunnel script hangs forever on beanstalk deployment

I'm attempting to create an ssh tunnel when deploying an application to AWS Beanstalk. I want the tunnel to run as a background process that is always connected after the application deploys. The script hangs forever during the deployment and I can't see why.
"/home/ec2-user/eclair-ssh-tunnel.sh":
mode: "000500" # u+rx
owner: root
group: root
content: |
cd /root
eval $(ssh-agent -s)
DISPLAY=":0.0" SSH_ASKPASS="./askpass_script" ssh-add eclair-test-key </dev/null
# we want this command to keep running in the backgriund
# so we add & at then end
nohup ssh -L 48682:localhost:8080 ubuntu#[host...] -N &
and here is the output I'm getting from /var/log/eb-activity.log:
[2019-06-14T14:53:23.268Z] INFO [15615] - [Application update suredbits-api-root-0.37.0-testnet-ssh-tunnel-fix-port-9#30/AppDeployStage1/AppDeployPostHook/01_eclair-ssh-tunnel.sh] : Starting activity...
The ssh tunnel is spawned, and I can find it by doing:
[ec2-user@ip-172-31-25-154 ~]$ ps aux | grep 48682
root 16047 0.0 0.0 175560 6704 ? S 14:53 0:00 ssh -L 48682:localhost:8080 ubuntu@ec2-34-221-186-19.us-west-2.compute.amazonaws.com -N
If I kill that process, the deployment continues as expected, which indicates that the bug is in the tunnel script. I can't seem to find out where though.
You need to add the -n option to ssh when running it in the background, so it does not try to read from stdin.
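So the tunnel line in the script would become something like this (host placeholder as in the question):
# -n stops ssh from reading stdin, so the deployment hook can finish
nohup ssh -n -L 48682:localhost:8080 ubuntu@[host...] -N &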

Horizontal Pod Autoscaler (HPA): Current utilization: <unknown> with custom namespace

UPDATE: I'm deploying on AWS cloud with the help of kops.
I'm in the process of applying HPA to one of my Kubernetes deployments.
While testing the sample app, which I deployed in the default namespace, I can see the metrics being exposed as below (current utilisation is 0%):
$ kubectl run busybox --image=busybox --port 8080 -- sh -c "while true; do { echo -e 'HTTP/1.1 200 OK\r\n'; \
env | grep HOSTNAME | sed 's/.*=//g'; } | nc -l -p 8080; done"
$ kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
busybox Deployment/busybox 0%/20% 1 4 1 14m
But when I deploy to a custom namespace (example: test), the current utilisation shows <unknown>:
$ kubectl get hpa --namespace test
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
busybox Deployment/busybox <unknown>/20% 1 4 1 25m
Can someone please suggest what's wrong here?
For the future, you need to meet a few conditions for HPA to work. You need to have a metrics server or Heapster running on your cluster. What is also important is to set resources on a per-namespace basis.
You did not say in what environment your cluster is running, but in GKE a default CPU request (100m) is already set for you, whereas you need to specify it for new namespaces:
Please note that if some of the pod’s containers do not have the relevant resource request set, CPU utilization for the pod will not be defined and the autoscaler will not take any action for that metric.
In your case I am not sure why it does work after a redeploy, as there is not enough information. But for the future remember to:
1) keep the object that you want to scale and the HPA in the same namespace
2) set CPU resources per namespace, or simply add --requests=cpu=value so the HPA will be able to scale based on them.
UPDATE:
for your particular case:
1) kubectl run busybox --image=busybox --port 8080 -n test --requests=cpu=200m -- sh -c "while true; do { echo -e 'HTTP/1.1 200 OK\r\n'; \
env | grep HOSTNAME | sed 's/.*=//g'; } | nc -l -p 8080; done"
2) kubectl autoscale deployment busybox --cpu-percent=50 --min=1 --max=10 -n test
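If you would rather set a default CPU request once per namespace instead of passing --requests on every pod, a LimitRange is one way to do it; a minimal sketch, reusing the test namespace and the 200m value from the commands above:
cat <<EOF | kubectl apply -n test -f -
apiVersion: v1
kind: LimitRange
metadata:
  name: default-cpu-request
spec:
  limits:
  - type: Container
    defaultRequest:
      cpu: 200m
EOF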
Try running the commands below in the namespace where you experience this issue and see if you get any pointers.
kubectl get --raw /apis/metrics.k8s.io/ - This should return valid JSON
Also, do a kubectl describe hpa name_of_hpa_deployment - This may indicate if there are any issues with your hpa deployment in that namespace.

CoreOS fleetctl list-machines not showing 3 machines

I am following the DigitalOcean tutorial on CoreOS (https://www.digitalocean.com/community/tutorials/how-to-create-flexible-services-for-a-coreos-cluster-with-fleet-unit-files). When I do a fleetctl list-machines command on node 1 and node 2, I am not able to see all 3 machines listed, just the one for its own node. The following is what I see:
core@coreos-1 ~ $ fleetctl list-machines
MACHINE IP METADATA
XXXX... 10.abc.de.fgh -
I logged onto my 3rd node and noticed that when I do a fleetctl list-machines I get the following error:
core@coreos-3 ~ $ fleetctl list-machines
Error retrieving list of active machines: googleapi: Error 503: fleet server unable to communicate with etcd
What should I do to find out what is the problem and how to resolve this? I have tried rebooting and other things mentioned but nothing is helping.
What happened was that I had an etcd dependency in my unit file, like the following:
# Dependency ordering
After=etcd.service
I think I needed etcd2 instead.
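The corrected dependency block would then look something like this (assuming etcd2 is the unit actually running on the cluster):
# Dependency ordering
After=etcd2.service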
So I did the following as directed:
sudo systemctl stop fleet.service fleet.socket etcd
sudo systemctl start etcd2
sudo systemctl reset-failed
I had to clean up on the instance that had the stale job entries when I queried for them:
core@coreos1 ~ $ etcdctl ls /_coreos.com/fleet/job
/_coreos.com/fleet/job/apache.1.service
/_coreos.com/fleet/job/apache@.service
/_coreos.com/fleet/job/apache@80.service
/_coreos.com/fleet/job/apache@9999.service
/_coreos.com/fleet/job/apache-discovery.1.service
/_coreos.com/fleet/job/apache-discovery@.service
/_coreos.com/fleet/job/apache-discovery@80.service
/_coreos.com/fleet/job/apache-discovery@9999.service
by issuing
etcdctl ls /_coreos.com/fleet/job/apache.1.service
etcdctl rm --recursive /_coreos.com/fleet/job/apache-discovery.1.service
Then I started fleet
sudo systemctl start fleet
And when I did a fleetctl list-machines again it showed all my instances connected.

Kubeadm: why does my node not show up even though kubelet says it joined?

I am setting up a Kubernetes deployment using auto-scaling groups and Terraform. The kube master node is behind an ELB to get some reliability in case of something going wrong. The ELB has the health check set to tcp 6443, and tcp listeners for 8080, 6443, and 9898. All of the instances and the load balancer belong to a security group that allows all traffic between members of the group, plus public traffic from the NAT Gateway address. I created my AMI using the following script (from the getting started guide)...
# curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
# cat <<EOF > /etc/apt/sources.list.d/kubernetes.list
deb http://apt.kubernetes.io/ kubernetes-xenial main
EOF
# apt-get update
# # Install docker if you don't have it already.
# apt-get install -y docker.io
# apt-get install -y kubelet kubeadm kubectl kubernetes-cni
I use the following user data scripts...
kube master
#!/bin/bash
rm -rf /etc/kubernetes/*
rm -rf /var/lib/kubelet/*
kubeadm init \
--external-etcd-endpoints=http://${etcd_elb}:2379 \
--token=${token} \
--use-kubernetes-version=${k8s_version} \
--api-external-dns-names=kmaster.${master_elb_dns} \
--cloud-provider=aws
until kubectl cluster-info
do
sleep 1
done
kubectl apply -f https://git.io/weave-kube
kube node
#!/bin/bash
rm -rf /etc/kubernetes/*
rm -rf /var/lib/kubelet/*
until kubeadm join --token=${token} kmaster.${master_elb_dns}
do
sleep 1
done
Everything seems to work properly. The master comes up and responds to kubectl commands, with pods for discovery, dns, weave, controller-manager, api-server, and scheduler. kubeadm has the following output on the node...
Running pre-flight checks
<util/tokens> validating provided token
<node/discovery> created cluster info discovery client, requesting info from "http://kmaster.jenkins.learnvest.net:9898/cluster-info/v1/?token-id=eb31c0"
<node/discovery> failed to request cluster info, will try again: [Get http://kmaster.jenkins.learnvest.net:9898/cluster-info/v1/?token-id=eb31c0: EOF]
<node/discovery> cluster info object received, verifying signature using given token
<node/discovery> cluster info signature and contents are valid, will use API endpoints [https://10.253.129.106:6443]
<node/bootstrap> trying to connect to endpoint https://10.253.129.106:6443
<node/bootstrap> detected server version v1.4.4
<node/bootstrap> successfully established connection with endpoint https://10.253.129.106:6443
<node/csr> created API client to obtain unique certificate for this node, generating keys and certificate signing request
<node/csr> received signed certificate from the API server:
Issuer: CN=kubernetes | Subject: CN=system:node:ip-10-253-130-44 | CA: false
Not before: 2016-10-27 18:46:00 +0000 UTC Not After: 2017-10-27 18:46:00 +0000 UTC
<node/csr> generating kubelet configuration
<util/kubeconfig> created "/etc/kubernetes/kubelet.conf"
Node join complete:
* Certificate signing request sent to master and response
received.
* Kubelet informed of new secure connection details.
Run 'kubectl get nodes' on the master to see this machine join.
Unfortunately, running kubectl get nodes on the master only returns itself as a node. The only interesting thing I see in /var/log/syslog is
Oct 27 21:19:28 ip-10-252-39-25 kubelet[19972]: E1027 21:19:28.198736 19972 eviction_manager.go:162] eviction manager: unexpected err: failed GetNode: node 'ip-10-253-130-44' not found
Oct 27 21:19:31 ip-10-252-39-25 kubelet[19972]: E1027 21:19:31.778521 19972 kubelet_node_status.go:301] Error updating node status, will retry: error getting node "ip-10-253-130-44": nodes "ip-10-253-130-44" not found
I am really not sure where to look...
The hostnames of the two machines (master and node) should be different. You can check them by running cat /etc/hostname. If they do happen to be the same, edit that file to make them different and then do a sudo reboot to apply the changes. Otherwise kubeadm will not be able to differentiate between the two machines, and they will show up as a single node in kubectl get nodes.
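For example (hostnamectl is an alternative to editing /etc/hostname by hand on systemd-based hosts; the new name is just an example):
cat /etc/hostname                      # check on each machine -- the names must differ
sudo hostnamectl set-hostname kube-node-1
sudo reboot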
Yes, I faced the same problem.
I resolved it by:
1) killall kubelet
2) running the kubeadm join command again
3) starting the kubelet service
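As commands, reusing the placeholders from the node script in the question, that is roughly:
sudo killall kubelet
sudo kubeadm join --token=${token} kmaster.${master_elb_dns}
sudo systemctl start kubelet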