In my helm chart, there's a pre-delete job that removes some extra resources when doing helm delete. If the deployment goes well, there's no problem with it.
However, when errors happen, such as ImagePullBackOff or an unbound PVC, the pre-delete job still tries to execute and goes into an error state as well, so the helm delete times out.
I understand there's a helm delete --no-hooks option, but I can't change the delete button in the UI to make that happen, as it's provided by a third party.
Is there anything I can do in my chart so that helm delete automatically doesn't wait for the pre-delete job if the job fails?
You can try to write your pre-delete hook Job in a way that it always reports success, no matter what happens during the execution of its main operation.
Example:
$ cat success.sh
ls sdfsf || exit 0
$ cat success2.sh
set +e
ls
ls sdfsf
exit 0
The scripts success.sh and success2.sh always return 0 (success), even though the ls sdfsf command inside them returns 2 ("No such file or directory").
# following command also has exit code 0
$ ls sfsdf || echo -n ''
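Applied to the chart, the hook's container can run a small wrapper like the one below instead of the cleanup command directly (a sketch; cleanup.sh is a placeholder for your actual cleanup logic):
#!/bin/sh
# A sketch of a wrapper for the pre-delete hook container.
# Any failure of the real cleanup is logged, but the Job always
# reports success, so helm delete never waits on a failed hook.
if ! ./cleanup.sh; then
    echo "cleanup failed, ignoring so the delete can proceed" >&2
fi
exit 0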
I am using CDK to deploy a CloudFormation stack to AWS. It has a cdk diff command that tells me what changed in this deployment. If nothing has changed, it just shows There were no differences for each stack included in the CDK project.
I have a requirement to run a different command based on whether the cdk deployment requires a change. How can I know from a script whether it requires a change? I have checked that the cdk diff return code is 0 both with and without changes. What is the right way to know whether the change set will change anything?
The --fail flag will cause cdk diff to exit with exit code 1 in case of a diff. Add conditional logic to handle the exit code cases:
cdk diff --fail && echo "no diffs found" || echo "diffs found"
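For example, to run a different command depending on the outcome (a sketch; the two scripts are placeholders for your own commands):
# Exit code 0 means no differences; non-zero (with --fail) means the
# change set would modify something.
if cdk diff --fail; then
    ./run-when-unchanged.sh   # placeholder: nothing to deploy
else
    ./run-when-changed.sh     # placeholder: changes detected
fi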
I have a workflow that gets triggered every day in the morning at 07:15 AM. I want to get an email from Informatica to my ID when the workflow doesn't get triggered within 3 minutes of the start time.
You have two options -
The easiest would be to create another workflow (scheduled at 7:18 AM) with a command task that checks for a file.
After the command task, put a condition on the task link (link status <> 0, i.e. the command task failed) and then add an email task.
Add a touch command as a pre-session command to the main workflow (see the one-liner after the script below).
The new workflow will look like this:
start --> cmd task --> |--link status <> 0--> email task
The command task script will be like this:
#!/bin/sh
# the indicator file exists: the main workflow started, so clean up and succeed
if [ -r /somedir/ind.txt ]; then
    rm /somedir/ind.txt
    exit 0
else
    # no indicator file: the main workflow did not start
    exit 1
fi
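The pre-session command on the main workflow only needs to create the indicator file, for example (using the /somedir/ind.txt path from the script above):
touch /somedir/ind.txt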
Now, at run time, at 7:15 the main workflow will start and create the file; the second workflow will detect it and do nothing. If the file doesn't exist, it will send the mail.
The second option: create a cron script that starts around 7:18 AM and checks whether the file exists. If the file is present, it deletes it and does nothing; if the file is absent, it sends the mail. Your command file should be like this:
#!/bin/sh
# the indicator file exists: the main workflow started, so clean up and succeed
if [ -r /somedir/ind.txt ]; then
    rm /somedir/ind.txt
    exit 0
else
    # no indicator file: the main workflow did not start, so send the alert mail
    mail -s <...some command...>
    exit 1
fi
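The corresponding crontab entry could look like this (a sketch; /somedir/check_ind.sh is a hypothetical location for the script above):
# run the check script at 07:18 every day
18 7 * * * /somedir/check_ind.sh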
As per my understanding of the docs, the -R flag should do exactly this, but for me the command kubectl rollout status -R -f k8s/services fails with error: rollout status is only supported on individual resources and resource collections - 3 resources were found.
In the k8s/services directory I have 3 service manifests. What is a resource collection, mentioned in the error message, if not 3 services for example? What should be in the directory when using -R?
kubectl rollout status --help:
Show the status of the rollout.
By default 'rollout status' will watch the status of the latest rollout until it's done. If you don't want to wait for
the rollout to finish then you can use --watch=false. Note that if a new rollout starts in-between, then 'rollout
status' will continue watching the latest revision. If you want to pin to a specific revision and abort if it is rolled
over by another revision, use --revision=N where N is the revision you need to watch for.
Examples:
# Watch the rollout status of a deployment
kubectl rollout status deployment/nginx
Options:
-f, --filename=[]: Filename, directory, or URL to files identifying the resource to get from a server.
-k, --kustomize='': Process the kustomization directory. This flag can't be used together with -f or -R.
-R, --recursive=false: Process the directory used in -f, --filename recursively. Useful when you want to manage
related manifests organized within the same directory.
--revision=0: Pin to a specific revision for showing its status. Defaults to 0 (last revision).
--timeout=0s: The length of time to wait before ending watch, zero means never. Any other values should contain a
corresponding time unit (e.g. 1s, 2m, 3h).
-w, --watch=true: Watch the status of the rollout until it's done.
Usage:
kubectl rollout status (TYPE NAME | TYPE/NAME) [flags] [options]
Use "kubectl options" for a list of global command-line options (applies to all commands).
I have tested with kubectl versions 1.14 and 1.15.
It means that it found 3 services, but you can only watch the rollout status of one specific service at a time, like:
kubectl rollout status -f k8s/services/<svc-name>.yaml
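If you need to cover every manifest in the directory, you can loop over the files and check them one by one (a sketch, assuming each file contains a single resource that rollout status supports):
# Check each manifest one at a time instead of passing the whole directory.
for f in k8s/services/*.yaml; do
    kubectl rollout status -f "$f" --timeout=120s
done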
You don't need -R when all the YAML files sit directly under k8s/services.
Take a look at why the -R flag was added in this issue.
I'm looking for a way to wait for a Job to finish execution successfully once it's deployed.
The Job is deployed from Azure DevOps through CD onto K8S on AWS. It runs one-time incremental database migrations using Fluent migrations each time it's deployed. I need to read the pod.status.phase field.
If the field is "Succeeded", then the CD will continue. If it's "Failed", the CD stops.
Anyone have an idea how to achieve this?
I think the best approach is to use the kubectl wait command:
Wait for a specific condition on one or many resources.
The command takes multiple resources and waits until the specified
condition is seen in the Status field of every given resource.
It will only return when the Job is completed (or the timeout is reached):
kubectl wait --for=condition=complete job/myjob --timeout=60s
If you don't set a --timeout, the default wait is 30 seconds.
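In a CD step, the exit code of kubectl wait can drive the decision directly (a sketch; job/db-migrations is a placeholder name):
# Fails the step if the Job does not reach the Complete condition in time.
# Note: with only the "complete" condition, a failed Job typically only
# surfaces here when the timeout expires.
if kubectl wait --for=condition=complete job/db-migrations --timeout=300s; then
    echo "Job succeeded"
else
    echo "Job failed or timed out" >&2
    exit 1
fi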
Note: kubectl wait was introduced in Kubernetes v1.11.0. If you are using an older version, you can create some logic using kubectl get with --field-selector:
kubectl get pod --field-selector=status.phase=Succeeded
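For those older versions, the same check can be turned into a small polling loop (a sketch; db-migrations is a placeholder for the Job's pod name prefix):
while true; do
    # a Succeeded pod with the Job's name prefix => done
    if kubectl get pods --field-selector=status.phase=Succeeded | grep -q db-migrations; then
        echo "Job pod succeeded"
        break
    fi
    # a Failed pod with the Job's name prefix => stop the pipeline
    if kubectl get pods --field-selector=status.phase=Failed | grep -q db-migrations; then
        echo "Job pod failed" >&2
        exit 1
    fi
    sleep 5
done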
We can check the Pod status using the K8S REST API.
In order to connect to the API, we need to get a token:
https://kubernetes.io/docs/tasks/administer-cluster/access-cluster-api/#without-kubectl-proxy
# Check all possible clusters, as your .KUBECONFIG may have multiple contexts:
kubectl config view -o jsonpath='{"Cluster name\tServer\n"}{range .clusters[*]}{.name}{"\t"}{.cluster.server}{"\n"}{end}'
# Select the name of the cluster you want to interact with from the above output:
export CLUSTER_NAME="some_server_name"
# Point to the API server, referring to the cluster name
APISERVER=$(kubectl config view -o jsonpath="{.clusters[?(@.name==\"$CLUSTER_NAME\")].cluster.server}")
# Get the token value
TOKEN=$(kubectl get secrets -o jsonpath="{.items[?(@.metadata.annotations['kubernetes\.io/service-account\.name']=='default')].data.token}" | base64 -d)
From the above code, we have acquired the TOKEN and the APISERVER address.
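Before wiring these values into the release, a quick sanity check is to call the API root with the token:
# should return the APIVersions object if the token and address are valid
curl -X GET $APISERVER/api --header "Authorization: Bearer $TOKEN" --insecure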
On Azure DevOps, on your target Release, in the Agent Job, we can add a Bash task:
#name of K8S Job object we are waiting to finish
JOB_NAME=name-of-db-job
APISERVER=set-api-server-from-previous-code
TOKEN=set-token-from-previous-code
#log APISERVER and JOB_NAME for troubleshooting
echo API Server: $APISERVER
echo JOB NAME: $JOB_NAME
#keep calling the API until the status is Succeeded or Failed.
while true; do
    #read all pods and query for the pod containing JOB_NAME using jq.
    #note that you should not have other pods with names similar to the job name, otherwise you will get multiple results. This script does not expect multiple results.
    res=$(curl -X GET $APISERVER/api/v1/namespaces/default/pods/ --header "Authorization: Bearer $TOKEN" --insecure | jq -r --arg JOB_NAME "$JOB_NAME" '.items[] | select(.metadata.name | contains($JOB_NAME)) | .status.phase')
    #jq -r strips the JSON quotes so the phase can be compared as a plain string
    if [ "$res" = "Succeeded" ]; then
        echo Succeeded
        exit 0
    elif [ "$res" = "Failed" ]; then
        echo Failed
        exit 1
    else
        echo "$res"
    fi
    sleep 2
done
If Failed, the script exits with code 1 and the CD stops (if configured that way).
If Succeeded, it exits with code 0 and the CD continues.
In the final setup:
- The script is part of the artifact and I'm using it inside a Bash task in the Agent Job.
- I have placed JOB_NAME into the task's environment variables so it can be reused for multiple DB migrations.
- The token and API server address are in a variable group at the global level.
TODO:
curl does not exit with a non-zero code when the URL is invalid or the request fails; it needs the --fail flag, but even with --fail the line above still exits 0, because the pipeline's exit status comes from the last command (see the sketch after this list).
"Unknown" Pod status should be handled as well
I'm using 'eb deploy' in my continuous integration script. I'm having 2 problems with it:
It always returns return code 0, even if there is an error. This breaks my deploy pipeline, because there is no way to detect an error.
It displays output only after the command has finished.
Is there any way to make 'eb deploy' to work as any normal script and return proper error codes?
This is a known issue reported upstream here. You can fix it by using grep in a pretty straightforward way. Instead of:
eb deploy
Use grep to look for the success string. This will return a non-zero status (i.e. failure) if it can't be found:
eb deploy | tee /dev/tty | grep "update completed successfully"
Note how I used tee to make sure that the output can still be seen on the continuous integration portal (in my case circleci).
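If you want an explicit failure message in the CI log as well, the same pipeline can be wrapped in a conditional (a sketch):
# grep -q only sets the exit status; tee still shows the full eb output.
if eb deploy | tee /dev/tty | grep -q "update completed successfully"; then
    echo "Deploy succeeded"
else
    echo "Deploy failed" >&2
    exit 1
fi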