Kubernetes - How to list all pods running in a particular instance group? - kubectl

How can I list, using the command line, all the pods running on the nodes of a particular instance group?
For example, suppose I have instance group "Foo" with three nodes, N1, N2, and N3, where pods A and B run on N1, pods C, D, and E run on N2, and pod F runs on N3. How can I, using kops/kubectl, make a query with input "Foo" and get output "A, B, C, D, E, F"?
I know that you can query a particular node and list all the pods on it, but I want to query an instance group with many nodes and get all the pods across all of its nodes, with no namespace constraint.
Thanks!

One way would be to assign labels to the nodes in your InstanceGroup, perhaps with the name of the instance group (for simplicity), and then use a combination of kubectl commands to query.
So in your InstanceGroup spec, set nodeLabels to ig: Foo (see https://github.com/kubernetes/kops/blob/master/docs/instance_groups.md#adding-taints-or-labels-to-an-instance-group for details) and then run:
kubectl get pods -o wide --all-namespaces | grep -F -f <(kubectl get nodes -l ig=Foo --no-headers | awk '{print $1}')
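If you would rather avoid the grep, a roughly equivalent approach (a minimal sketch, assuming the ig=Foo node label from above is in place) is to loop over the labelled nodes and filter pods with a field selector on spec.nodeName:
# list every pod, in any namespace, scheduled on a node labelled ig=Foo
for node in $(kubectl get nodes -l ig=Foo -o name | cut -d/ -f2); do
  kubectl get pods --all-namespaces --field-selector spec.nodeName="$node" --no-headers
done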

Related

Delete attempt of Kubernetes resource reports not found, even though it can be listed with "kubectl get"

I am running a Kubeflow pipeline on a single-node Rancher K3s cluster. Katib is deployed to create training jobs (Kind: TFJob) along with experiments (a CRD).
I can list the experiment resources with kubectl get experiments -n <namespace>. However, when trying to delete one using kubectl delete experiment exp_name -n namespace, the API server returns NotFound.
kubectl version is 1.22.12
kubeflow 1.6
How can a (any) resource be deleted when it is listed by "kubectl get", but a direct kubectl delete says the resource cannot be found?
Hopefully there is a general answer applicable to any resource.
Example:
kc get experiments -n <namespace>
NAME        TYPE      STATUS   AGE
mnist-e2e   Running   True     21h
kc delete experiment mnist-e2e -n namespace
Error from server (NotFound): experiments.kubeflow.org "mnist-e2e" not found
I have tried these methods, but all involve the use of the resource name (mnist-e2e) and result in "NotFound".
I tried patching the manifest to empty the finalizers list:
kubectl patch experiment mnist-e2e \
-n namespace \
-p '{"metadata":{"finalizers":[]}}' \
--type=merge
I tried dumping a manifest of the "orphaned" resource and then deleting using that manifest:
kubectl get experiment mnist-e2e -n namespace -o yaml > exp.yaml
kubectl delete -f exp.yaml
Delete attempts from the Kubeflow UI Experiments (AutoML) page fail.
Thanks

Automated script to delete kubernetes pvc by labels AGE

I have a total of 2030 PVCs and I want to delete 2000 of them and keep the remaining 30.
Those 30 PVCs are the latest and are less than 2 days old, which is why I do not want to delete them. The other 2000 PVCs are all more than 2 days old.
I want to create a script that runs automatically and deletes the PVCs which are more than 2 days old.
Some examples of my PVCs:
NAME                      STATUS   VOLUME    CAPACITY   ACCESS MODES   STORAGECLASS   AGE
data-14353-postgresql-0   Bound    pvc-1a6   8Gi        RWO            gp2            2d15h
data-14354-postgresql-0   Bound    pvc-2d6   8Gi        RWO            gp2            16d
data-14358-postgresql-0   Bound    pvc-9dc   8Gi        RWO            gp2            127m
data-14367-postgresql-0   Bound    pvc-2eb   8Gi        RWO            gp2            65h
data-14370-postgresql-0   Bound    pvc-90d   8Gi        RWO            gp2            56d
Now, as you can see, I have mixed AGE values.
They can be deleted using:
kubectl delete pvc
but this will delete all of them, and I do not want to delete them all!
What command or age-based selection can I add so that the command deletes all PVCs except those that are less than 2 days old?
TL;DR:
kubectl get pvc --sort-by=.metadata.creationTimestamp --no-headers | tac | cut -d ' ' -f 1 | sed -n '31,$p' | xargs kubectl delete pvc
Let's explain it piece by piece:
kubectl get pvc --sort-by=.metadata.creationTimestamp --no-headers:
This part lists the PVCs sorted by creation timestamp, oldest first, without the header row
tac
Reverses the output, so the newest PVCs come first
cut -d ' ' -f 1
Gets the first column, which holds the PVC names
sed -n '31,$p'
Prints everything from line 31 onward, i.e. skips the first 30 lines, which are your 30 PVCs that are less than 2 days old
p - Print out the pattern space (to the standard output). This command is usually only used in conjunction with the -n command-line option.
-n - Suppresses the automatic printing of the pattern space, so only the lines explicitly selected with p are printed.
xargs kubectl delete pvc
Deletes the remaining (older) PVCs.
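If you would rather key off the age itself instead of counting lines, a minimal sketch (assuming GNU date and xargs, and a 2-day cutoff) compares each PVC's creationTimestamp against a cutoff and deletes only the older ones:
# RFC 3339 timestamps sort lexically, so a plain string comparison is enough
cutoff=$(date -u -d '2 days ago' +%Y-%m-%dT%H:%M:%SZ)
kubectl get pvc -o jsonpath='{range .items[*]}{.metadata.name} {.metadata.creationTimestamp}{"\n"}{end}' \
  | awk -v c="$cutoff" '$2 < c {print $1}' \
  | xargs -r kubectl delete pvc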
There is no easy way to do this with just a kubectl command; it is only easy to operate that way when you can make good use of labels.
But you can do this programmatically, by using a programming or scripting language against the Kubernetes API server.
Perhaps, in this case, you can manually add a label to the 30 PVCs that you want to keep, and then delete the PVCs that do not have this label.
If you have managed to label the PVCs that you want to keep, you can use e.g.
kubectl delete pvc -l '!keep'
(the quotes keep the shell from interpreting the !). See more about how to add labels and select resources using labels in Labels and Selectors.
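For completeness, a minimal sketch of that labelling approach, reusing the sort-and-keep-30 idea from the first answer (the label name keep is just an example):
# label the 30 newest PVCs so they survive the cleanup
kubectl get pvc --sort-by=.metadata.creationTimestamp --no-headers | tac | head -n 30 \
  | awk '{print $1}' | xargs -I{} kubectl label pvc {} keep=true
# then delete every PVC that does not carry the keep label
kubectl delete pvc -l '!keep'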

issues with filter wildcards on gcloud sdk (ubuntu)

I'm attempting to parse the external IP addresses of GCP compute instances in an instance group, then separate them with commas, to inject into a configuration file for a piece of software.
I've created a command that does this successfully on my Mac (10.14.6):
gcloud compute instances list --filter="name :(name-of-instance*)" \
--format="get(networkInterfaces[0].accessConfigs[0].natIP)" \
| tr '\n' ',' | sed s/.$//
which immediately outputs a list:
x.x.x.x,y.y.y.y,z.z.z.z
This command is then put into a bash script that will run on a compute instance (running Ubuntu 16.04 LTS).
However, when I try it on a test instance (Ubuntu 16.04 LTS), the previous command with the wildcard * does not output anything.
I've tested this by removing the wildcard and specifying the full name of one of the instances, and it does output the external IP of that instance correctly:
gcloud compute instances list --filter="name :(name-of-instance-full)" \
--format="get(networkInterfaces[0].accessConfigs[0].natIP)" \
| tr '\n' ',' | sed s/.$//
I've tried several filter expressions, including name : (instanceName*), name ~ ^instanceName*, and name = instanceName* (wildcards are not permitted for the = expression, so it fails everywhere).
I cannot tell if this is a bug in the gcloud SDK or if I'm missing something about how filters work on GCP compute instances.
Expected result on Ubuntu 16.04 LTS when using the wildcard:
x.x.x.x,y.y.y.y,z.z.z.z (same as on Mac)
Actual result when using the wildcard: no output.
Agreed with amanda's answer, but as per the GCP documentation, : (colon) support has been deprecated and will be dropped shortly.
So instead, as per the example here, the following should work:
gcloud compute instances list --filter="name~instance-name" \
--format="get(networkInterfaces[0].accessConfigs[0].natIP)" \
| tr '\n' ',' | sed s/.$//
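A small usage note on the ~ operator: it is a regular-expression match, so anchoring the pattern (the instance-name prefix below is hypothetical) avoids matching the prefix in the middle of other instance names:
gcloud compute instances list --filter="name~^instance-name"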
It turns out the space between the key and the : operator (and the parentheses around the pattern) needed to be removed to make this work on Ubuntu:
gcloud compute instances list --filter="name:instance-name*" \
--format="get(networkInterfaces[0].accessConfigs[0].natIP)" \
| tr '\n' ',' | sed s/.$//
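For what it's worth, the comma-joining can also be done with paste instead of tr and sed; a minimal sketch assuming the same hypothetical instance-name prefix:
gcloud compute instances list --filter="name:instance-name*" \
  --format="get(networkInterfaces[0].accessConfigs[0].natIP)" \
  | paste -sd, -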

Kubectl : No resource found

I’ve installed ICP4Data successfully. I am pretty green with respect to ICP4D and Kubernetes. I’m trying to use the kubectl command to list the pods in ICP4D, but “kubectl get pods” returns “No resource found”. Am I missing something?
ICP4D uses the 'zen' namespace to logically separate its assets and resources from the core native ICP/Kube platform. In the default installation of ICP4D there are no pods deployed in the 'default' namespace, hence you get "no resources found": if you don't provide a namespace when getting pods, kubectl assumes the default namespace.
To list the pods from the zen namespace:
kubectl get pods -n zen
To list all the namespaces available to you, try:
kubectl get namespaces
To list pods from all namespaces, append --all-namespaces:
kubectl get pods --all-namespaces
This should list all the pods from zen, kube-system, and possibly others.
Please try adding the namespace to the command as well. In the case of ICP4D, try kubectl get pods -n zen.
On the other hand, you could switch your namespace to zen first by running
kubectl config set-context --current --namespace=zen
Then you will be able to see all the information by running, without the -n argument,
kubectl get pods
Check which namespace you are currently in (a quick way to do that is sketched after the next command).
To find out which namespace your pod was created in, you can run this command:
kubectl get pods --all-namespaces
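And, as a minimal sketch for the "which namespace am I in" check, kubectl can print the namespace of the current context directly:
kubectl config view --minify --output 'jsonpath={..namespace}'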
Also, just to add: since I was in the default namespace and wanted to get the logs of a pod in another namespace, just doing
kubectl logs -f <pod_name>
gave the output "Error from server (NotFound): pods "pod_name" not found".
So I specified the namespace as well:
kubectl logs -f <pod_name> -n <namespace>

Fetching the complete information about a VM instance on google cloud compute

I am running a VM instance on a google cloud compute project.
I would like to fetch information about the instance into a text file so that I can pass it to another developer and he will be able to spin up a similar instance in his own Google Cloud project.
In other words, the information I'm looking for includes (among others): the operating system selected for the instance, the number of GPUs and their type, the instance's zone, the disk size, the disk type (SSD or other), the number of CPUs, etc.
I'm using the gcloud SDK to start or stop the instance. I tried running gcloud compute instances describe, but the information that I retrieve this way does not include everything I'm looking for.
There is a question with a somewhat similar title, but the OP of that question is looking for different info.
Thanks!
I have a bash script that may be useful for you. The script looks up the instance details using the gcloud command and creates a file named all-details.txt containing the information that you are looking for.
The only thing is that you need to provide the instance name and the zone, since they are mandatory arguments for the gcloud commands:
#!/bin/bash
echo "instance name: $1"
echo "zone: $2"
# pull machine type, CPU platform, GPU and disk details from the instance,
# plus image, licenses and size from its boot disk, then merge everything into one file
sudo gcloud compute instances describe "$1" --zone "$2" \
  | grep -E 'cpuPlatform|machineType|guestAccelerators|acceleratorCount|acceleratorType|disks|type' > details1.txt \
  && sudo gcloud compute disks describe "$1" --zone "$2" \
  | grep -E 'licenses|sourceImage|sizeGb' > details2.txt \
  && cat details1.txt details2.txt > all-details.txt \
  && rm details1.txt details2.txt
Then you just have to run the script with the two arguments, like: sudo bash script.sh <instance-name> <zone>
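As an alternative to grepping the describe output, gcloud's format projections can keep just the fields of interest; a minimal sketch, with a hypothetical instance name and zone:
gcloud compute instances describe my-instance --zone us-central1-a \
  --format="yaml(machineType, cpuPlatform, guestAccelerators, disks, zone)" > all-details.yaml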