How to get all pods without jobs

Is it possible to retrieve all pods while excluding the ones created by Jobs?
kubectl get pods
pod1 1/1 Running 1 28d
pod2 1/1 Running 1 28d
pod3 0/1 Completed 0 30m
pod4 0/1 Completed 0 30m
I don't want to see the Job pods, only the other pods.
I don't want to filter them based on the "Running" state, because I would like to verify that all the deployments I am trying to install are actually "deployed".
Based on that, I wanted to use the following command, but it also picks up the Job pods I am trying to exclude:
kubectl wait --for=condition=Ready pods --all --timeout=600s

Add a special label (e.g. kind=pod) to your Job pods, then exclude them with kubectl get pods -l 'kind!=pod'.
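If the pods were created by the Kubernetes Job controller, they already carry the automatically added job-name label, so a label selector can exclude them without adding anything custom. A minimal sketch (adapt it to your setup):
# Exclude every pod that carries the job-name label (the Job controller adds it to its pods)
kubectl get pods -l '!job-name'
# The same selector works with kubectl wait
kubectl wait --for=condition=Ready pods -l '!job-name' --timeout=600s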

If using a bit of scripting is OK... this one-liner should return the names of all "non-Job" pods in all namespaces:
for p in `kubectl get pods --all-namespaces -o=jsonpath="{range .items[*]}{.metadata.name}{';'}{.metadata.ownerReferences[?(@.kind != 'Job')].name}{'\n'}{end}"`; do v_owner_name=$(echo $p | cut -d';' -f2); if [ ! -z "$v_owner_name" ]; then v_pod_name=$(echo $p | cut -d';' -f1); echo $v_pod_name; fi; done
Using the above as a foundation, the following aims to return all "non-Job" pods in Ready status:
for p in `kubectl get pods --all-namespaces -o=jsonpath="{range .items[*]}{.metadata.name}{';'}{'Ready='}{.status.conditions[?(@.type == 'Ready')].status}{';'}{.metadata.ownerReferences[?(@.kind != 'Job')].name}{'\n'}{end}"`; do v_owner_name=$(echo $p | cut -d';' -f3); if [ ! -z "$v_owner_name" ]; then v_pod_name=$(echo $p | cut -d';' -f1,2); echo $v_pod_name; fi; done
This doc explains (to some degree) the JSONPath support in kubectl.
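If jq is available, a possibly more readable equivalent of the first one-liner (a sketch, not part of the original answer) is:
kubectl get pods --all-namespaces -o json \
  | jq -r '.items[]
           | select((.metadata.ownerReferences // []) | all(.kind != "Job"))
           | "\(.metadata.namespace)/\(.metadata.name)"'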

If your question is:
I would like to verify if all the deployments I am trying to install are
"deployed"
then this is not the right way of checking Pod status in Kubernetes. Check the replicas and readyReplicas fields of your Deployment instead:
kubectl get deployment <deployment-name> -o json | jq -r '.status | { desired: .replicas, ready: .readyReplicas }'
Output:
{
  "desired": 1,
  "ready": 1
}
Here I am using the jq utility (it's very handy) to parse the output.
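If the goal is simply to wait until every Deployment has finished rolling out, kubectl can also do that directly; a sketch (the timeout value is arbitrary):
# Wait for every Deployment in the current namespace to become Available
kubectl wait --for=condition=Available deployment --all --timeout=600s
# Or watch a single rollout
kubectl rollout status deployment/<deployment-name> --timeout=600s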

How to display constant values using custom-columns format of kubectl?

I have multiple clusters and I want to check which ingresses do not specify an explicit certificate. Right now I use the following command:
~$ k config get-contexts -o name | grep -E 'app(5|3)41.+-admin' | xargs -n1 -I {} kubectl --context {} get ingress -A -o 'custom-columns=NS:{.metadata.namespace},NAME:{.metadata.name},CERT:{.spec.tls.*.secretName}' | grep '<none>'
argocd argo-cd-argocd-server <none>
argocd argo-cd-argocd-server <none>
reference-app reference-app-netcore-ingress <none>
argocd argo-cd-argocd-server <none>
argocd argo-cd-argocd-server <none>
test-ingress my-nginx <none>
~$
I want to improve the output by including the context name, but I can't figure out how to modify the custom-columns format to do that.
The command below will not yield the exact desired output, but it will be close. Using jsonpath, it's possible:
kubectl config get-contexts -o name | xargs -n1 -I {} kubectl get ingress -A -o jsonpath="{range .items[*]}{} {.metadata.namespace} {.metadata.name} {.spec.tls.*.secretName}{'\n'}{end}" --context {}
If the exact output is needed, then the kubectl output has to be post-processed in a bash loop. Example:
kubectl config get-contexts -o name | while read -r context; do kubectl get ingress -A -o 'custom-columns=NS:{.metadata.namespace},NAME:{.metadata.name},CERT:{.spec.tls.*.secretName}' --context "$context" | awk -v con="$context" 'NR==1{$0=$0 FS "CONTEXT"} NR>1{$0=$0 FS con} 1'; done | column -t
NS NAME CERT CONTEXT
default tls-example-ingress testsecret-tls kubernetes-admin-istio-demo.local@istio-demo.local
default tls-example-ingress1 testsecret-tls kubernetes-admin-istio-demo.local@istio-demo.local
default tls-example-ingress2 <none> kubernetes-admin-istio-demo.local@istio-demo.local
To perform post-processing around the header and context, the awk command was used. Here are some details about it:
Command:
awk -v con="$context" 'NR==1{$0=$0 FS "CONTEXT"} NR>1{$0=$0 FS con} 1'
-v con="$context": This creates a variable called con inside awk to store the value of the bash variable $context.
NR==1: Here NR is the record number (in this case the line number) and $0 means the current record/line.
NR==1{$0=$0 FS "CONTEXT"}: This means, on the 1st line, reset the line to itself followed by FS (default is a space) followed by the string "CONTEXT".
Similarly, NR>1{$0=$0 FS con} means, from the 2nd line onwards, append FS followed by con to the line.
The 1 at the end tells awk to print the line.
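To see the awk step in isolation, here is a minimal sketch with made-up input:
printf 'NS NAME CERT\ndefault my-nginx <none>\n' \
  | awk -v con="my-context" 'NR==1{$0=$0 FS "CONTEXT"} NR>1{$0=$0 FS con} 1' \
  | column -t
# NS       NAME      CERT    CONTEXT
# default  my-nginx  <none>  my-context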

Wrong Yarn node label mapping with AWS EMR machine types

Does anyone have experience with YARN node labels on AWS EMR? If so, please share your thoughts. We want to run all the Spark executors on Task (Spot) machines and all the Spark ApplicationMasters/drivers on Core (On-Demand) machines. Previously we were running the Spark executors and the Spark driver all on the CORE machines (On-Demand).
In order to achieve this, we create the "TASK" YARN node label as part of a custom AWS EMR bootstrap action, and map that same "TASK" label to any Spot instance when it is registered with AWS EMR in a separate bootstrap action. As "CORE" is the default YARN node label expression, we simply map it to On-Demand instances upon registration of the node in the bootstrap action.
We are using the "spark.yarn.executor.nodeLabelExpression": "TASK" Spark conf to launch Spark executors on Task nodes.
The problem we are facing is a wrong mapping of the YARN node labels to the appropriate machines: for a short duration (around 1-2 minutes) the "TASK" label is mapped to On-Demand instances and the "CORE" label is mapped to Spot instances. During this short window of wrong labeling, YARN launches Spark executors on On-Demand instances and Spark drivers on Spot instances.
This wrong mapping of labels to machine types persists until the bootstrap actions are complete; after that, the mapping automatically resolves to its correct state.
The script we are running as part of the bootstrap action:
This script runs on every new machine to assign a label to that machine. It is run as a background process, because yarn is only available after all custom bootstrap actions have completed.
#!/usr/bin/env bash
set -ex

function waitTillYarnComesUp() {
    IS_YARN_EXIST=$(which yarn | grep -i yarn | wc -l)
    while [ "$IS_YARN_EXIST" != '1' ]
    do
        echo "Yarn not exist"
        sleep 15
        IS_YARN_EXIST=$(which yarn | grep -i yarn | wc -l)
    done
    echo "Yarn exist.."
}

function waitTillTaskLabelSyncs() {
    LABEL_EXIST=$(yarn cluster --list-node-labels | grep -i TASK | wc -l)
    while [ "$LABEL_EXIST" -eq 0 ]
    do
        sleep 15
        LABEL_EXIST=$(yarn cluster --list-node-labels | grep -i TASK | wc -l)
    done
}

function getHostInstanceTypeAndApplyLabel() {
    HOST_IP=$(curl http://169.254.169.254/latest/meta-data/local-hostname)
    echo "host ip is ${HOST_IP}"
    INSTANCE_TYPE=$(curl http://169.254.169.254/latest/meta-data/instance-life-cycle)
    echo "instance type is ${INSTANCE_TYPE}"
    PORT_NUMBER=8041
    spot="spot"
    onDemand="on-demand"
    if [ "$INSTANCE_TYPE" == "$spot" ]; then
        yarn rmadmin -replaceLabelsOnNode "${HOST_IP}:${PORT_NUMBER}=TASK"
    elif [ "$INSTANCE_TYPE" == "$onDemand" ]; then
        yarn rmadmin -replaceLabelsOnNode "${HOST_IP}:${PORT_NUMBER}=CORE"
    fi
}

waitTillYarnComesUp
# holding for resource manager sync
sleep 100
waitTillTaskLabelSyncs
getHostInstanceTypeAndApplyLabel
exit 0
yarn rmadmin -addToClusterNodeLabels "TASK(exclusive=false)"
This command is run on the master instance to create the new TASK YARN node label at cluster creation time.
Does anyone have a clue how to prevent this wrong mapping of labels?
I would like to propose the following:
Create every node with some default label, like LABEL_PENDING. You can do this using the EMR classifications.
In the bootstrap script, identify whether the current node is an On-Demand or a Spot instance.
After that, on every node, change LABEL_PENDING in /etc/hadoop/conf/yarn-site.xml to ON_DEMAND or SPOT.
On the master node, add the 3 labels to YARN: LABEL_PENDING, ON_DEMAND, and SPOT (see the sketch right after this list).
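A minimal sketch of that last step, run once on the master after YARN is up (marking the partitions non-exclusive is an assumption):
yarn rmadmin -addToClusterNodeLabels "LABEL_PENDING(exclusive=false),ON_DEMAND(exclusive=false),SPOT(exclusive=false)"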
Example of EMR Classifications:
[
  {
    "classification": "yarn-site",
    "properties": {
      "yarn.node-labels.enabled": "true",
      "yarn.node-labels.am.default-node-label-expression": "ON_DEMAND",
      "yarn.nodemanager.node-labels.provider.configured-node-partition": "LABEL_PENDING"
    },
    "configurations": []
  },
  {
    "classification": "capacity-scheduler",
    "properties": {
      "yarn.scheduler.capacity.root.accessible-node-labels.ON_DEMAND.capacity": "100",
      "yarn.scheduler.capacity.root.accessible-node-labels.SPOT.capacity": "100",
      "yarn.scheduler.capacity.root.default.accessible-node-labels.ON_DEMAND.capacity": "100",
      "yarn.scheduler.capacity.root.default.accessible-node-labels.SPOT.capacity": "100"
    },
    "configurations": []
  },
  {
    "classification": "spark-defaults",
    "properties": {
      "spark.yarn.am.nodeLabelExpression": "ON_DEMAND",
      "spark.yarn.executor.nodeLabelExpression": "SPOT"
    },
    "configurations": []
  }
]
Example of the additional part to your bootstrap script
yarnNodeLabelConfig="yarn.nodemanager.node-labels.provider.configured-node-partition"
yarnSiteXml="/etc/hadoop/conf/yarn-site.xml"

function waitForYarnConfIsReady() {
    while [[ ! -e $yarnSiteXml ]]; do
        sleep 2
    done
    IS_CONF_PRESENT_IN_FILE=$(grep $yarnNodeLabelConfig $yarnSiteXml | wc -l)
    while [[ $IS_CONF_PRESENT_IN_FILE != "1" ]]
    do
        echo "Yarn conf file doesn't have properties"
        sleep 2
        IS_CONF_PRESENT_IN_FILE=$(grep $yarnNodeLabelConfig $yarnSiteXml | wc -l)
    done
}

function updateLabelInYarnConf() {
    INSTANCE_TYPE=$(curl http://169.254.169.254/latest/meta-data/instance-life-cycle)
    echo "Instance type is $INSTANCE_TYPE"
    if [[ $INSTANCE_TYPE == "spot" ]]; then
        sudo sed -i 's/>LABEL_PENDING</>SPOT</' $yarnSiteXml
    elif [[ $INSTANCE_TYPE == "on-demand" ]]; then
        sudo sed -i 's/>LABEL_PENDING</>ON_DEMAND</' $yarnSiteXml
    fi
}

waitForYarnConfIsReady
updateLabelInYarnConf
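To verify the result after the nodes have joined, a quick check (not part of the original answer) on the master is:
# Should list LABEL_PENDING, ON_DEMAND and SPOT; registered nodes should end up under ON_DEMAND or SPOT
yarn cluster --list-node-labels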

kubectl jsonpath command to list out failed jobs and namespaces

The command below gives a list of failed jobs:
kubectl get jobs -o=jsonpath='{.items[?(@.status.failed==1)].metadata.name}' --all-namespaces
job-3764289372 abc-23145263524 xyz-6745096523
I need to list out the jobs and their namespaces. Is it possible to do this with jsonpath?
Something like below?
NAMESPACE NAME
dev-namespace job-3764289372
namespace-123 abc-23145263524
I think it's not possible using plain jsonpath alone; some post-processing is needed (solution 2). However, if you can use go-template (solution 1), then it can be done without any post-processing.
k get job
NAME COMPLETIONS DURATION AGE
job1 1/1 2m36s 3h27m
job2 0/1 3h23m 3h23m #failed
job3 0/1 3h23m 3h23m #failed
Solution-1: using go-template:
The go-template below will print the namespace and name of each failed job.
kubectl get job -A -o go-template='{{range $i, $p := .items}}{{range .status.conditions}}{{if (eq .type "Failed")}}{{$p.metadata.namespace}} {{$p.metadata.name}}{{"\n"}}{{end}}{{end}}{{end}}'
default job2
default job3
Solution-2: using jsonpath with awk.
kubectl get job -A -o jsonpath='{range .items[*]}{.metadata.namespace} {.metadata.name} {.status.conditions[*].type}{"\n"}{end}' | awk 'BEGIN{print "namespace","name"} $NF=="Failed"{print $1,$2}'
namespace name
default job2
default job3
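If jq is acceptable, a third option (a sketch, assuming jq is installed) is:
kubectl get jobs -A -o json \
  | jq -r '.items[]
           | select(any(.status.conditions[]?; .type == "Failed" and .status == "True"))
           | "\(.metadata.namespace)\t\(.metadata.name)"'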

Jenkinsfile to automatically deploy to EKS

How do I pass my AWS credentials when I am running a Jenkins job?
Taking this as an example: https://github.com/PaulMaddox/amazon-eks-kubectl
$ docker run -v ~/.aws:/home/kubectl/.aws -e CLUSTER=demo maddox/kubectl get services
The above works on my laptop, but I want to pass the AWS credentials in the Jenkinsfile. I have AWS configured in my Jenkins --> Credentials. I also have a Bitbucket repo which contains a Jenkinsfile and a yaml file for the "service" and "deployment".
The way I do it now is to run kubectl create -f filename.yaml, which deploys to EKS. I just want to do the same thing but automate it with a Jenkinsfile; suggestions on how to do it either with kubectl or with helm are welcome.
In your Jenkinsfile you should include a section similar to this:
stage('Deploy on Dev') {
    node('master') {
        withEnv(["KUBECONFIG=${JENKINS_HOME}/.kube/dev-config", "IMAGE=${ACCOUNT}.dkr.ecr.us-east-1.amazonaws.com/${ECR_REPO_NAME}:${IMAGETAG}"]) {
            sh "sed -i 's|IMAGE|${IMAGE}|g' k8s/deployment.yaml"
            sh "sed -i 's|ACCOUNT|${ACCOUNT}|g' k8s/service.yaml"
            sh "sed -i 's|ENVIRONMENT|dev|g' k8s/*.yaml"
            sh "sed -i 's|BUILD_NUMBER|01|g' k8s/*.yaml"
            sh "kubectl apply -f k8s"
            DEPLOYMENT = sh (
                script: 'cat k8s/deployment.yaml | yq -r .metadata.name',
                returnStdout: true
            ).trim()
            echo "Creating k8s resources..."
            sleep 180
            DESIRED = sh (
                script: "kubectl get deployment/$DEPLOYMENT | awk '{print \$2}' | grep -v DESIRED",
                returnStdout: true
            ).trim()
            CURRENT = sh (
                script: "kubectl get deployment/$DEPLOYMENT | awk '{print \$3}' | grep -v CURRENT",
                returnStdout: true
            ).trim()
            if (DESIRED.equals(CURRENT)) {
                currentBuild.result = "SUCCESS"
                return
            } else {
                error("Deployment Unsuccessful.")
                currentBuild.result = "FAILURE"
                return
            }
        }
    }
}
which will be responsible for automating the deployment process.
I hope it helps.
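Regarding the credentials part of the question: if the AWS keys stored in Jenkins are exposed to the build environment (for example through a credentials binding), the shell steps can stay very simple. A sketch, where the cluster name and region are placeholders:
# Point kubectl at the EKS cluster using the AWS credentials from the environment
aws eks update-kubeconfig --name demo --region us-east-1
kubectl apply -f k8s/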

Amazon RDS - Online only when needed?

I had a question about Amazon RDS. I only need the database online for about 2 hours a day, but I am dealing with quite a large database at around 1 GB.
I have two main questions:
Can I automate bringing my RDS database online and offline via scripts to save money?
When I take the RDS instance offline to stop the "work hours" counter from running and billing me, will it still have the same content when I bring it back online (i.e. will all my data stay there, or will it have to be a blank DB)? If so, is there any way around this other than backing up to S3 and reimporting it every time?
If you wish to do this programmatically:
Snapshot the RDS instance using rds-create-db-snapshot: http://docs.aws.amazon.com/AmazonRDS/latest/CommandLineReference/CLIReference-cmd-CopyDBSnapshot.html
Delete the running instance using rds-delete-db-instance: http://docs.aws.amazon.com/AmazonRDS/latest/CommandLineReference/CLIReference-cmd-DeleteDBInstance.html
Restore the database from the snapshot using rds-restore-db-instance-from-db-snapshot: http://docs.aws.amazon.com/AmazonRDS/latest/CommandLineReference/CLIReference-cmd-RestoreDBInstanceFromDBSnapshot.html
You may also do all of this manually from the AWS Web Console, if you prefer.
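Roughly the same flow with the current aws CLI looks like this (the instance and snapshot identifiers are placeholders):
aws rds create-db-snapshot --db-instance-identifier mydb --db-snapshot-identifier mydb-pause
aws rds delete-db-instance --db-instance-identifier mydb --skip-final-snapshot
aws rds restore-db-instance-from-db-snapshot --db-instance-identifier mydb --db-snapshot-identifier mydb-pause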
You can start EC2* instances using shell scripts, so I guess that you can for RDS as well.
(see http://docs.aws.amazon.com/AmazonRDS....html)
But unlike EC2*, you cannot "stop" an RDS instance without "destroying" it. You need to create a DB snapshot when terminating your database. You will use this DB snapshot when re-starting the database.
*EC2: Elastic Compute Cloud, i.e. renting a virtual server.
Here's a script that will stop/start/reboot an RDS instance
#!/bin/bash
# usage ./startStop.sh lhdevices start
INSTANCE="$1"
ACTION="$2"

# export vars to run RDS CLI
export JAVA_HOME=/usr;
export AWS_RDS_HOME=/home/mysql/RDSCli-1.15.001;
export PATH=$PATH:/home/mysql/RDSCli-1.15.001/bin;
export EC2_REGION=us-east-1;
export AWS_CREDENTIAL_FILE=/home/mysql/RDSCli-1.15.001/keysLightaria.txt;

if [ $# -ne 2 ]
then
    echo "Usage: $0 {MySQL-Instance Name} {Action either start, stop or reboot}"
    echo ""
    exit 1
fi

shopt -s nocasematch
if [[ $ACTION == 'start' ]]
then
    echo "This will $ACTION a MySQL Instance"
    rds-restore-db-instance-from-db-snapshot lhdevices \
        --db-snapshot-identifier dbStart --availability-zone us-east-1a \
        --db-instance-class db.m1.small
    echo "Sleeping while instance is created"
    sleep 10m
    echo "waking..."
    rds-modify-db-instance lhdevices --db-security-groups kfarrell
    echo "Sleeping while instance is modified for security group name"
    sleep 5m
    echo "waking..."
elif [[ $ACTION == 'stop' ]]
then
    echo "This will $ACTION a MySQL Instance"
    yes | rds-delete-db-snapshot dbStart
    echo "Sleeping while deleting old snapshot "
    sleep 10m
    #rds-create-db-snapshot lhdevices --db-snapshot-identifier dbStart
    # echo "Sleeping while creating new snapshot "
    # sleep 10m
    # echo "waking...."
    #rds-delete-db-instance lhdevices --force --skip-final-snapshot
    rds-delete-db-instance lhdevices --force --final-db-snapshot-identifier dbStart
    echo "Sleeping while instance is deleted"
    sleep 10m
    echo "waking...."
elif [[ $ACTION == 'reboot' ]]
then
    echo "This will $ACTION a MySQL Instance"
    rds-reboot-db-instance lhdevices ;
    echo "Sleeping while Instance is rebooted"
    sleep 5m
    echo "waking...."
else
    echo "Did not recognize command: $ACTION"
    echo "Usage: $0 {MySQL-Instance Name} {Action either start, stop or reboot}"
fi
shopt -u nocasematch
Amazon recently updated their CLI to include a way to start and stop RDS instances. stop-db-instance and start-db-instance detail the steps needed to perform these operations.
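For example (the instance identifier is a placeholder):
aws rds stop-db-instance --db-instance-identifier mydb
aws rds start-db-instance --db-instance-identifier mydb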