Error: UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress - django

I am using the helm upgrade xyz --install command and my releases were failing due to other Helm issues, so no successful release has been made yet.
Then, while the above command was in progress, I pressed Ctrl+C. Since then, whenever I try helm upgrade again, it shows Error: UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress.
When I run helm history xyz it shows Error: release: not found. Now I don't know how to roll back the previous operation so that I can try helm upgrade again.
I also tried --force (helm upgrade xyz --install --force) but it still says some operation is in progress.
So how can I roll back the previous operation when I don't have any successful release?

The solution is to use helm rollback to restore your previous revision:
helm rollback <name> <revision>

Your previously installed/upgraded releases are stuck in the pending-upgrade status. These releases are not shown when you list them. Try using helm status [release_name]; it shows the current state of the release. Uninstall the releases whose status is pending.
Then you can reinstall the charts without any issue.
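For example, a minimal sequence (release name, namespace, and chart path are placeholders; drop -n if you are using the default namespace):
helm status xyz -n my-namespace
helm uninstall xyz -n my-namespace
helm upgrade xyz ./my-chart --install -n my-namespace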

OP:
Found the issue.
I was not passing a namespace to the helm delete command, so it was using the default namespace. Once I passed the namespace, it worked:

helm -n namespace history myapp
helm -n namespace rollback myapp 7
Here 7 is just the revision I rolled back to. After this you can proceed with your regular upgrade.
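If you are not sure which namespace the stuck release lives in, Helm 3 can list pending releases across all namespaces, which helps find the one to roll back or uninstall:
helm list --all-namespaces --pending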

In my case, I had to get the status of that release using
helm status <release-name>
Then I saw that the status for that release is pending-upgrade. I simply uninstalled that release using
helm uninstall <release-name>
and ran the install command again.

Related

How can I run beta gcloud component like "gcloud beta artifacts docker images scan" within Cloud Build?

I am trying to include the Container Analysis API in a Cloud Build pipeline. This is a beta component, and on the command line I need to install it first:
gcloud components install beta local-extract
then I can run the on-demand container analysis (if the container image is present locally):
gcloud beta artifacts docker images scan ubuntu:latest
My question is how I can use components like beta local-extract within Cloud Build?
I tried to add a first step to install the missing component:
## Update components
- name: 'gcr.io/cloud-builders/gcloud'
  args: ['components', 'install', 'beta', 'local-extract', '-q']
  id: Update component
but as soon as I move to the next step the installed component is gone (since each step runs in its own container).
I also tried to install the component and then run the scan in a single step (chaining with & or ;), but it is failing:
## Run vulnerability scan
- name: 'gcr.io/cloud-builders/gcloud'
  args: ['components', 'install', 'beta', 'local-extract', '-q', ';', 'gcloud', 'beta', 'artifacts', 'docker', 'images', 'scan', 'ubuntu:latest', '--location=europe']
  id: Run vulnerability scan
and I get:
Already have image (with digest): gcr.io/cloud-builders/gcloud
ERROR: (gcloud.components.install) unrecognized arguments:
;
gcloud
beta
artifacts
docker
images
scan
ubuntu:latest
--location=europe (did you mean '--project'?)
To search the help text of gcloud commands, run:
gcloud help -- SEARCH_TERMS
so my questions are:
how can I run "gcloud beta artifacts docker images scan ubuntu:latest" within Cloud Build?
bonus: from the previous command, how can I get the "scan" output value that I will need to pass as a parameter to my next step? (I guess it should be something with --format)
You should try the cloud-sdk docker image:
https://github.com/GoogleCloudPlatform/cloud-sdk-docker
The Cloud Build team (implicitly?) recommends it:
https://github.com/GoogleCloudPlatform/cloud-builders/tree/master/gcloud
With the cloud-sdk-docker container you can change the entrypoint to bash and pipe gcloud commands together. Here is an (ugly) example:
https://github.com/GoogleCloudPlatform/functions-framework-cpp/blob/d3a40821ff0c7716bfc5d2ca1037bcce4750f2d6/ci/build-examples.yaml#L419-L432
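A minimal sketch of that approach applied to your pipeline (the image path is the official cloud-sdk image; the tag and scan arguments are up to you):
## Run vulnerability scan
- name: 'gcr.io/google.com/cloudsdktool/cloud-sdk'
  entrypoint: 'bash'
  args:
    - '-c'
    - |
      # the install line may be unnecessary (or may need the OS package manager)
      # depending on which cloud-sdk image variant you pick
      gcloud components install beta local-extract --quiet
      gcloud beta artifacts docker images scan ubuntu:latest --location=europe
  id: Run vulnerability scan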
As to your bonus question. Yes, --format=value(the.name.of.the.field) is probably what you want. The trick is to know the name of the field. I usually start with --format=json on my development workstation to figure out the name.
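For instance, start with --format=json to see the full structure, then switch to value() once you know the field (the field name in the second command is purely illustrative):
gcloud beta artifacts docker images scan ubuntu:latest --location=europe --format=json
gcloud beta artifacts docker images scan ubuntu:latest --location=europe --format='value(response.scan)'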
The problem comes from Cloud Build. It caches some often-used images, and if you want to use a brand-new feature of the gcloud CLI, the cached image can be too old.
I performed a test tonight: the cached version is 326, while 328 has just been released. So the cached version is 2 weeks old, maybe too old for your feature. It could be worse in your region!
The solution is to explicitly request the latest version:
Go to gcr.io/cloud-builders/gcloud
Copy the latest version
Paste the full version name into the step of your Cloud Build pipeline.
The side effect is a longer build: because this latest image isn't cached, it has to be downloaded by Cloud Build.
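For example (the tag below is a placeholder; copy the actual latest tag from the registry):
- name: 'gcr.io/cloud-builders/gcloud:<latest-tag>'
  args: ['beta', 'artifacts', 'docker', 'images', 'scan', 'ubuntu:latest', '--location=europe']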

Is there a way to pass github repository credentials storing helm charts at install?

I've followed this guide: https://dev.to/jamiemagee/how-to-host-your-helm-chart-repository-on-github-3kd to set up a private GitHub Helm chart repository. Everything is working fine: the GitHub Actions scripts are testing the docs and releases are being made, but at install time it fails with a 404.
➜ ~ helm install --devel -f thor.job-export.values.yaml job-export sample/thor --debug
install.go:172: [debug] Original chart version: ""
install.go:174: [debug] setting version to >0.0.0-0
Error: failed to fetch https://github.com/myuser/helm-charts/releases/download/thor-0.1.4/thor-0.1.4.tgz : 404 Not Found
My understanding is that helm cannot fetch the *.tgz files because it is not using my GitHub credentials.
I've tried to add the private github repo with this command:
helm repo add sample 'https://mytoken#raw.githubusercontent.com/myuser/helm-charts/gh-pages/'
That does seem able to gather the chart info shown in the error above.
Is there a way to pass the GitHub credentials at chart install time? I tried this:
helm install -f thor.job-export.values.yaml --password mytoken --username myuser job-export sample/thor --debug
But it fails with the same error... So, is there any way to pass the GitHub repo user and password to helm at chart install time?

How to configure SNI passthrough for Istio Egress

I'm trying to follow the instructions described at https://istio.io/docs/tasks/traffic-management/egress/wildcard-egress-hosts/#setup-egress-gateway-with-sni-proxy but the actual command
cat <<EOF | istioctl manifest generate --set values.global.istioNamespace=istio-system -f - > ./istio-egressgateway-with-sni-proxy.yaml
.
.
.
fails. I opened an istio issue but there is no concrete workaround so far: https://github.com/istio/istio/issues/21379
As I mentioned in comments
Based on this github istio issue, I would say it's currently only possible to do through helm, and it should be possible via istioctl in the 1.5 version. So the workaround for now would be to use helm instead of istioctl, or wait for the 1.5 release, which might actually fix that.
and what @Vinay B added,
The workaround right now, as @jt97 suggested, is to use helm 2.x to generate the yaml.
Refer to https://archive.istio.io/v1.2/docs/tasks/traffic-management/egress/wildcard-egress-hosts/#setup-egress-gateway-with-sni-proxy as an example
is actually the only workaround for now.
If you're looking for information on when it's going to be available via istioctl, follow this github issue, which is currently open and added to the 1.5 milestone, so there is a chance it will be available when 1.5 comes out, which is March 5th.
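A rough sketch of the helm 2.x approach, modeled on the archived 1.2 task (the chart path comes from the Istio release archive, and the values block here is abbreviated; take the full values for the SNI-proxy egress gateway from the archived docs):
cat <<EOF | helm template install/kubernetes/helm/istio --name istio-egressgateway-with-sni-proxy --namespace istio-system --set global.istioNamespace=istio-system -f - > ./istio-egressgateway-with-sni-proxy.yaml
gateways:
  enabled: true
  istio-egressgateway-with-sni-proxy:
    enabled: true
EOF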

WSO2 helm pattern-1 failing with configmaps "apim-conf" already exists

I have followed the steps mentioned here: https://github.com/wso2/kubernetes-apim/tree/master/helm/pattern-1. I am encountering an issue when I execute:
helm install --name wso2am ~/git/src/github.com/wso2/kubernetes-apim/helm/pattern-1/apim-with-analytics
I receive the following error:
Error: release wso2am failed: configmaps "apim-conf" already exists
This happens the first time I run the helm install command.
I've deleted the configmaps (kubectl delete configmaps apim-conf) and the release (helm del --purge wso2am), and when I try it again I get the same error.
Any assistance on how to get past this issue would be appreciated.
The issue was that there was a second copy of apim-conf.yaml in the chart, named apim-conf.yaml_old. This caused helm to attempt to install apim-conf twice. Removing the duplicate resolved it.
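One quick way to spot this kind of duplicate before installing is to render the chart locally and count how many times the ConfigMap is emitted (the chart path is the one from the question):
helm template ~/git/src/github.com/wso2/kubernetes-apim/helm/pattern-1/apim-with-analytics | grep -c 'name: apim-conf'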
You can check the configmaps in the wso2 namespace by using the following command.
kubectl get configmaps -n wso2
Then you can remove the configmap apim-conf as follows.
kubectl delete configmap apim-conf -n wso2

Cloud Composer GKE Node upgrade results in Airflow task randomly failing

The problem:
I have a managed Cloud Composer environment running under a 1.9.7-gke.6 Kubernetes cluster master.
I tried to upgrade it (as well as the default-pool nodes) to 1.10.7-gke.1, since an upgrade was available.
Since then, Airflow has been acting randomly. Tasks that were working properly are failing for no given reason. This makes Airflow unusable, since the scheduling becomes unreliable.
Here is an example of a task that runs every 15 minutes and for which the behavior is very visible right after the upgrade:
airflow_tree_view
Hovering over a failing task only shows an Operator: null message (null_operator). Also, there are no logs at all for that task.
I have been able to reproduce the situation with another Composer environment in order to ensure that the upgrade is the cause of the dysfunction.
What I have tried so far:
I assumed the upgrade might have broken either the scheduler or Celery (Cloud Composer defaults to the CeleryExecutor).
I tried restarting the scheduler with the following command:
kubectl get deployment airflow-scheduler -o yaml | kubectl replace --force -f -
I also tried to restart Celery from inside the workers, with
kubectl exec -it airflow-worker-799dc94759-7vck4 -- sudo celery multi restart 1
Celery restarts, but it doesn't fix the issue.
So I tried to restart Airflow completely, the same way I did with airflow-scheduler.
None of these fixed the issue.
Side note: I can't access Flower to monitor Celery when following this tutorial (Google Cloud - Connecting to Flower). Connecting to localhost:5555 stays in a 'waiting' state forever. I don't know if it is related.
Let me know if I'm missing something!
1.10.7-gke.2 is available now [1]. Can you further upgrade to 1.10.7-gke.2 to see if the issue persists?
[1] https://cloud.google.com/kubernetes-engine/release-notes
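A sketch of how such an upgrade could be run with gcloud (cluster name, zone, and node pool are placeholders for the GKE cluster backing your Composer environment):
gcloud container clusters upgrade CLUSTER_NAME --zone ZONE --master --cluster-version 1.10.7-gke.2
gcloud container clusters upgrade CLUSTER_NAME --zone ZONE --node-pool default-pool --cluster-version 1.10.7-gke.2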