Redeploy latest image without using `kubectl apply -f file` - kubectl

We have a deployment on a cluster and we want to tell it to pull the latest image and redeploy. Running kubectl apply -f deployment.yml does nothing, because that file hasn't actually changed. How do I just tell the cluster to use a newer image version?

As per the documentation:
Note: A Deployment’s rollout is triggered if and only if the Deployment’s pod template (that is, .spec.template) is changed, for example if the labels or container images of the template are updated. Other updates, such as scaling the Deployment, do not trigger a rollout.
Please consider using:
kubectl patch deployment my-nginx --patch '{"spec": {"template": {"spec": {"containers": [{"name": "nginx","image": "nginx:1.7.9"}]}}}}'
kubectl set image deployment.v1.apps/nginx-deployment nginx=nginx:1.9.1 --record
kubectl edit deploy/<your_deployment> --record
Documentation about Updating a Deployment and Container Images.
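Whichever option you use, you can watch the resulting rollout (deployment name taken from the patch example above) with:
kubectl rollout status deployment/my-nginx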
Also note, as per best practices:
You should avoid using the :latest tag when deploying containers in production as it is harder to track which version of the image is running and more difficult to roll back properly.
However, if you would like to always force-pull a new image, you can use one of these options:
- set the imagePullPolicy of the container to Always.
- omit the imagePullPolicy and use :latest as the tag for the image to use.
- omit the imagePullPolicy and the tag for the image to use.
- enable the AlwaysPullImages admission controller.
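Another option worth noting (assuming kubectl v1.15 or newer): kubectl rollout restart forces a fresh rollout without changing the manifest, and with imagePullPolicy: Always the new pods will re-pull the image:
kubectl rollout restart deployment/<your_deployment>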
Hope this helps.

Related

My GKE pods stopped with error "no command specified: CreateContainerError"

Everything was OK and the nodes were fine for months, but suddenly some pods stopped with an error.
I tried deleting the pods and nodes, but the issue remains.
Try the possible solutions below to resolve your issue:
Solution 1:
Check for a malformed character in your Dockerfile that could cause the container to crash.
The first thing to check when you encounter CreateContainerError is that you have a valid ENTRYPOINT in the Dockerfile used to build your container image. If you don't have access to the Dockerfile, you can instead configure your pod object by setting a valid command in the command attribute of the container spec.
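For example, a rough sketch using kubectl patch (hypothetical deployment name and command path):
kubectl patch deployment my-app --type='json' \
  -p='[{"op": "add", "path": "/spec/template/spec/containers/0/command", "value": ["/bin/my-entrypoint"]}]'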
Refer to Troubleshooting the container runtime, the similar SO1 and SO2 threads, and also this similar GitHub link for more information.
Solution 2:
The kubectl describe pod <podname> command provides detailed information about a pod. Its output gives you clues; if you see Insufficient CPU, follow the solution below.
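For instance (hypothetical pod and node names), you can inspect the pod's events and the node's allocatable resources:
kubectl describe pod my-pod -n my-namespace
kubectl describe node my-node | grep -A 7 Allocatable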
The solution is to either:
1) Upgrade the boot disk: if using a pd-standard disk, it's recommended to upgrade to pd-balanced or pd-ssd.
2) Increase the disk size.
3) Use a node pool with a machine type that has more CPU cores.
See Adjust worker, scheduler, triggerer and web server scale and performance parameters for more information.
If you still have the issue, you can then update the GKE version of your cluster by manually upgrading the control plane to one of the fixed versions.
Also check whether you have updated your tooling in the last year to use the new kubectl authentication plugin coming in GKE v1.26.
Solution 3:
If you have a pipeline on GitLab that deploys an image to a GKE cluster, check the version of the GitLab runner that handles your pipeline's jobs.
It turns out that every image built through a GitLab runner running an old version causes this issue at container start. Simply deactivate the old runners, keep only runners on the latest version in the pool, and replay all pipelines.
Also check whether the GitLab CI script uses an old Docker image such as docker:19.03.5-dind; updating to docker:dind helps Kubernetes start the pod again.

Command To Delete All Image Versions of Artifact Registry Except Latest?

I have a GitHub Action that pushes my image to Artifact Registry. These are the steps that authenticate and then push it to my Google Cloud Artifact Registry:
- name: Configure Docker Client
  run: |-
    gcloud auth configure-docker --quiet
    gcloud auth configure-docker $GOOGLE_ARTIFACT_HOST_URL --quiet
- name: Push Docker Image to Artifact Registry
  run: |-
    docker tag $IMAGE_NAME:latest $GOOGLE_ARTIFACT_HOST_URL/$PROJECT_ID/images/$IMAGE_NAME:$GIT_TAG
    docker push $GOOGLE_ARTIFACT_HOST_URL/$PROJECT_ID/images/$IMAGE_NAME:$GIT_TAG
Where $GIT_TAG is always 'latest'
I want to add one more command that then purges all except the latest version. In the screenshot below, you can see there are 2 images.
I would like to remove the one from 3 days ago, as it's not the one with the tag 'latest'.
Is there a terminal command to do this?
You can first list the container images, filtered for your specific criteria:
gcloud artifacts docker images list --include-tags
Once you have identified the images to be deleted, you can use the following command to delete an Artifact Registry container image:
gcloud artifacts docker images delete IMAGE [--async] [--delete-tags] [GCLOUD_WIDE_FLAG …]
A valid container image, which can be referenced by tag or digest, has the format
LOCATION-docker.pkg.dev/PROJECT-ID/REPOSITORY-ID/IMAGE:tag
This command can fail for the following reasons:
- Trying to delete an image by digest when the image is still tagged. Add --delete-tags to delete the digest and the tags.
- Trying to delete an image by tag when the image has other tags. Add --delete-tags to delete all tags.
- A valid repository format was not provided.
- The specified image does not exist.
- The active account does not have permission to delete images.
It is always recommended to check and reconfirm any deletion operations so you don’t lose any useful artifacts and data items.
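As a rough sketch of a cleanup step (reusing the variables from the workflow in the question; the filter and the version field name may need adjusting for your gcloud version, so the delete is only echoed here as a dry run):
REPO="$GOOGLE_ARTIFACT_HOST_URL/$PROJECT_ID/images/$IMAGE_NAME"
# list the digest of every version that is NOT tagged 'latest', then delete each one
gcloud artifacts docker images list "$REPO" --include-tags \
  --filter="-tags:latest" --format="get(version)" | while read -r digest; do
    echo gcloud artifacts docker images delete "$REPO@$digest" --delete-tags --quiet
done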
Also check this helpful document for Artifact Registry image deletion guidelines and some useful information on Managing Images.
As guillaume blaquiere mentioned, you may have a look at this link, which may help you.

Tagging multi-platform images in ECR creates untagged manifests

I've started using docker buildx to tag and push mutli-platform images to ECR. However, ECR appears to apply the tag to the parent manifest, and leaves each related manifest as untagged. ECR does appear to prevent deletion of the child manifests, but it makes managing cleanup of orphaned untagged images complicated.
Is there a way to tag these child manifests in some way?
For example, consider this push:
docker buildx build --platform "linux/amd64,linux/arm64" --tag 1234567890.dkr.ecr.eu-west-1.amazonaws.com/my-service/my-image:1.0 --push .
Inspecting the image:
docker buildx imagetools inspect 1234567890.dkr.ecr.eu-west-1.amazonaws.com/my-service/my-image:1.0
Shows:
Name:      1234567890.dkr.ecr.eu-west-1.amazonaws.com/my-service/my-image:1.0
MediaType: application/vnd.docker.distribution.manifest.list.v2+json
Digest:    sha256:4221ad469d6a18abda617a0041fd7c87234ebb1a9f4ee952232a1287de73e12e
Manifests:
  Name:      1234567890.dkr.ecr.eu-west-1.amazonaws.com/my-service/my-image:1.0@sha256:c1b0c04c84b025357052eb513427c8b22606445cbd2840d904613b56fa8283f3
  MediaType: application/vnd.docker.distribution.manifest.v2+json
  Platform:  linux/amd64
  Name:      1234567890.dkr.ecr.eu-west-1.amazonaws.com/my-service/my-image:1.0@sha256:828414cad2266836d9025e9a6af58d6bf3e6212e2095993070977909ee8aee4b
  MediaType: application/vnd.docker.distribution.manifest.v2+json
  Platform:  linux/arm64
However, ECR shows the 2 child images as untagged
I'm running into the same problem. So far my solution seems a little easier than some of the other suggestions, but I still don't like it.
After doing the initial:
docker buildx build --platform "linux/amd64,linux/arm64" --tag 1234567890.dkr.ecr.eu-west-1.amazonaws.com/my-service/my-image:1.0 --push .
I follow up with:
docker buildx build --platform "linux/amd64" --tag 1234567890.dkr.ecr.eu-west-1.amazonaws.com/my-service/my-image:1.0-amd --push .
docker buildx build --platform "linux/arm64" --tag 1234567890.dkr.ecr.eu-west-1.amazonaws.com/my-service/my-image:1.0-arm --push .
This gets me the parallel build speed of building multiple platforms at the same time, and gets me the images tagged in ECR. Thanks to the cached build info it is pretty quick; it appears to just push the tags and that is it. In a test I just did, the buildx time for the first command was 0.5 seconds and the second one took 0.7 seconds.
That said, I'm not wild about this solution, and found this question while looking for a better one.
There are several ways to tag the image, but they all involve pushing the platform specific manifest with the desired tag. With docker, you can pull the image, retag it, and push it, but the downside is you'll have to pull every layer.
A much faster option is to only transfer the manifest JSON with registry API calls. You could do this with curl, but auth becomes complicated. There are several tools for working directly with registries, including Google's crane, Red Hat's skopeo, and my own regclient. Regclient includes the regctl command, which would implement this like:
image=1234567890.dkr.ecr.eu-west-1.amazonaws.com/my-service/my-image
tag=1.0
regctl image copy \
${image}@$(regctl image digest --platform linux/amd64 $image:$tag) \
${image}:${tag}-linux-amd64
regctl image copy \
${image}@$(regctl image digest --platform linux/arm64 $image:$tag) \
${image}:${tag}-linux-arm64
You could also script an automated fix to this, listing all tags in the registry, pulling the manifest list for the tags that don't already have the platform, and running the image copy to retag each platform's manifest. But it's probably easier and faster to script your buildx job to include something like regctl after buildx pushes the image.
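For example, a rough sketch of wiring the retag step into the build job (same image, tag, and regctl commands as above; adjust the platform list to match your build):
image=1234567890.dkr.ecr.eu-west-1.amazonaws.com/my-service/my-image
tag=1.0
docker buildx build --platform "linux/amd64,linux/arm64" --tag ${image}:${tag} --push .
# retag each platform-specific manifest right after the multi-arch push
for platform in linux/amd64 linux/arm64; do
  regctl image copy \
    ${image}@$(regctl image digest --platform ${platform} ${image}:${tag}) \
    ${image}:${tag}-${platform//\//-}
done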
Note that if you use a cred helper for logging into ECR, regctl can use this with the local command. If you want to run regctl as a container, and you are specifically using ecr-login, use the alpine version of the images, since they include the helper binary.
In addition to what Brandon mentioned above on using regctl, here's the command for skopeo if you're looking to use it with the ECR credential helper (https://github.com/awslabs/amazon-ecr-credential-helper):
skopeo copy \
docker://1234567890.dkr.ecr.us-west-2.amazonaws.com/stackoverflow@sha256:1badbc699ed4a1785295baa110a125b0cdee8d854312fe462d996452b41e7755 \
docker://1234567890.dkr.ecr.us-west-2.amazonaws.com/stackoverflow:1.0-linux-arm64
https://github.com/containers/skopeo
Paavan Mistry, AWS Containers DA

Elastic Kubernetes Service AWS deployment process to avoid downtime

It's been a month since I started working on AWS EKS, and up till now I have successfully deployed my code.
The steps which I follow for deployment are given below:
Create the image from the Docker terminal.
Tag and push it to AWS ECR.
Create the deployment file "project.json" and the service file "project-svc.json".
Save the above files in the "kubectl/bin" path and deploy them with the following commands.
"kubectl apply -f projectname.json" and "kubectl apply -f projectname-svc.json".
So if I want to deploy the same project again with a change, I push the new image to ECR and delete the existing deployment with "kubectl delete -f projectname.json" (without deleting the existing service), then deploy it again with "kubectl apply -f projectname.json".
Now, my concern is that after I delete the existing deployment there is downtime until I apply or create the deployment again. How do I avoid this? I don't want the downtime; that is actually the reason I started using EKS.
One more thing: the deployment process is a bit long too. I know I'm missing something; can anybody guide me properly, please?
The project is on .NET Core, and if there is a simpler way to deploy using Visual Studio, please guide me on that as well.
Thank You in advance!
There is actually no need to delete your deployment. You just need to update the desired state (the deployment configuration) and let K8s do its magic and apply the needed changes, such as deploying a new version of your container.
If you have a single instance of your container, you will experience a short down time while changes are applied. If your application supports multiple replicas (HA), you can enjoy the rolling upgrade feature.
Start by reading the official Kubernetes documentation on Performing a Rolling Update.
You only need to use delete/apply if you are changing the ConfigMap attached to the Deployment (and only if you have one).
If the only change you make is the deployment's "image", you should use the "set image" command.
kubectl lets you change the actual deployment image and performs the rolling update all by itself; with 3+ pods you have a minimal chance of downtime.
Even more, if you use the --record flag, you can roll back to your previous image with no effort, because it keeps track of the changes.
You can also specify the context, with no need to jump between contexts.
You can go like this:
kubectl set image deployment DEPLOYMENT_NAME DEPLOYMENT_NAME=IMAGE_NAME --record -n NAMESPACE
Or, specifying the cluster:
kubectl set image deployment DEPLOYMENT_NAME DEPLOYMENT_NAME=IMAGE_NAME_ECR -n NAMESPACE --cluster EKS_CLUSTER_NPROD --user EKS_CLUSTER --record
For example:
kubectl set image deployment nginx-dep nginx-dep=ecr12345/nginx:latest -n nginx --cluster eu-central-123-prod --user eu-central-123-prod --record
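You can then watch the rollout complete (and confirm the pods were replaced without downtime) with:
kubectl rollout status deployment/nginx-dep -n nginx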
The --record flag is what lets you track all the changes; if you want to roll back, just do:
kubectl rollout undo deployment.v1.apps/nginx-dep
More documentation about it here:
Updating a deployment
https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#updating-a-deployment
Roll Back Deployment
https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#rolling-back-a-deployment

Upgrading JupyterHub helm release w/ new docker image, but old image is being used?

I have a JupyterHub notebook server, and I am running on managed Kubernetes on AWS (EKS). My Docker repository is AWS ECR.
I am iteratively developing my docker image for testing.
My workflow is:
Update the docker image
Update docker image tag in helm release config config.yaml
Upgrade helm release helm upgrade jhub jupyterhub/jupyterhub --version=0.7.0 --values config.yaml
Test the changes to docker image
However, the old docker image is still being used?
How must I change my development workflow, so that I can simply update docker image, and test?
Additional info:
Helm chart (public): https://jupyterhub.github.io/helm-chart/
Edit:
Additional troubleshooting steps taken:
Tried deleting the helm release and re-installing:
helm delete --purge jhub && helm upgrade --install jhub jupyterhub/jupyterhub --namespace jhub --version=0.7.0 --values config.yaml
Tried deleting the helm release AND namespace, and re-installing:
helm delete --purge jhub && kubectl delete namespace jhub && helm upgrade --install jhub jupyterhub/jupyterhub --namespace jhub --version=0.7.0 --values config.yaml
Also tried overriding imagePullPolicy value to Always (per Mostafa's suggestion in his answer)
hub:
  imagePullPolicy: Always
None of these work. The old, original docker image is still being used.
What is strange is that when I inspect the docker images currently present in my kubernetes cluster, I see the new docker image, but it is not the one being used.
kubectl get pods --all-namespaces -o jsonpath="{.items[*].spec.containers[*].image}"
# output:
...
<AWS_ACCOUNT>.dkr.ecr.eu-west-1.amazonaws.com/<REPO>:NEW_TAG # <-- not actually being used in jupyterhub
...
Edit(2):
I checked one of my pod descriptions and saw a strange event message:
Normal Pulled 32m kubelet, <<REDACTED>> Container image "<AWS_ACCOUNT>.dkr.ecr.eu-west-1.amazonaws.com/<REPO>:NEW_TAG" already present on machine
The image referred to above is my new image, which I just uploaded to the image repo. It should be impossible for the image to already be downloaded on the cluster. Somehow the hash is the same for both the original image and the new image, or is this a bug?
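To compare the digest the pod is actually running with the digest of NEW_TAG in ECR (same placeholders as above):
kubectl get pods -n jhub -o jsonpath='{.items[*].status.containerStatuses[*].imageID}'
aws ecr describe-images --repository-name <REPO> --image-ids imageTag=NEW_TAG --query 'imageDetails[0].imageDigest'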
The docker image might not be updated because imagePullPolicy is set to IfNotPresent, which means the following according to the kubernetes documentation:
The default pull policy is IfNotPresent which causes the Kubelet to skip pulling an image if it already exists. If you would like to always force a pull, you can do one of the following:
set the imagePullPolicy of the container to Always.
omit the imagePullPolicy and use :latest as the tag for the image to use.
omit the imagePullPolicy and the tag for the image to use.
enable the AlwaysPullImages admission controller
In your case you can set the value of imagePullPolicy to Always inside config.yaml while deploying the new chart, in order to make it pull the newest docker image of your code:
# Add this in your config.yaml (check whether hub: already exists to avoid overriding it)
hub:
  imagePullPolicy: Always
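After updating config.yaml, re-run the upgrade from the question so the change takes effect:
helm upgrade jhub jupyterhub/jupyterhub --version=0.7.0 --values config.yaml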