Istio default ingress-gateway got deleted - istio

I am doing chaos testing on all Istio core components (Pilot, Mixer, Citadel) and the default objects/resources. I am manually deleting the components and documenting the behavior, which will help when something actually breaks in production.
I have deleted the ingress-gateway service. It also deleted the egress pods, which I didn't expect.
Since I am going to delete all the default objects one by one, is there a better or cleaner way to recreate core objects? For example, how would I recreate the ingress and egress services?

In my opinion, the best way to re-create lost or deleted Istio components is with Helm (the package manager for Kubernetes).
helm upgrade <your-release-name> <repo-name>/<chart-name> --reuse-values --force
You can also keep track of changes to your Istio installation (aka the Istio release) and simply restore it to its last working version using the following commands:
helm history <release_name>
helm rollback --force [RELEASE] [REVISION]
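For example, assuming the chart was installed as a release named istio from a repo added as istio.io (the release name, repo name and revision number here are illustrative), the sequence could look like this:
helm upgrade istio istio.io/istio --reuse-values --force
helm history istio
helm rollback --force istio 3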
Eventually you can always go back to the Istio installation directory and re-apply the piece of the manifest corresponding to the deleted object. For example, for Istio v1.1.1, the istio-ingressgateway Service object is declared inside 'istio-1.1.1/install/kubernetes/istio-demo.yaml'. Additionally, these manifest files can be generated with the helm template command directly from the source code repository.
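As a rough sketch of that last approach, assuming Helm 2 and the chart layout shipped inside the 1.1.x release archive (paths and names are illustrative):
helm template install/kubernetes/helm/istio --name istio --namespace istio-system > istio-generated.yaml
# then locate the istio-ingressgateway Service in istio-generated.yaml and kubectl apply that piece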

Related

My GKE pods stopped with error "no command specified: CreateContainerError"

Everything was OK and the nodes were fine for months, but suddenly some pods stopped with an error.
I tried deleting the pods and nodes, but the same issue persists.
Try the possible solutions below to resolve your issue:
Solution 1 :
Check for a malformed character in your Dockerfile that could cause the build to crash.
The first thing to do when you encounter CreateContainerError is to check that you have a valid ENTRYPOINT in the Dockerfile used to build your container image. However, if you don't have access to the Dockerfile, you can configure your pod object by using a valid command in the command attribute of the object.
Another workaround is to not specify any workerConfig explicitly, which makes the workers inherit all configs from the master.
Refer to Troubleshooting the container runtime, the similar questions SO1 and SO2, and also check this similar GitHub issue for more information.
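As a minimal sketch of the command-attribute workaround described above (the pod name, image and command are purely illustrative):
apiVersion: v1
kind: Pod
metadata:
  name: example-pod            # hypothetical name
spec:
  containers:
    - name: app
      image: your-registry/your-image:latest   # image whose Dockerfile lacks a valid ENTRYPOINT/CMD
      command: ["/bin/sh", "-c"]               # explicit command supplied via the pod spec
      args: ["exec /app/start-server"]         # hypothetical start script inside the image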
Solution 2 :
The kubectl describe pod <podname> command provides detailed information about each of the pods that make up your Kubernetes infrastructure. With its help you can check for clues; if you see Insufficient CPU, follow the solution below.
The solution is to either:
1) Upgrade the boot disk: if using a pd-standard disk, it's recommended to upgrade to pd-balanced or pd-ssd.
2) Increase the disk size.
3) Use a node pool with a machine type that has more CPU cores.
See Adjust worker, scheduler, triggerer and web server scale and performance parameters for more information.
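To confirm CPU pressure before resizing anything, you can inspect the pod events (pod and namespace names are illustrative):
kubectl describe pod my-pod -n my-namespace
# in the Events section, look for scheduling failures such as:
#   0/3 nodes are available: 3 Insufficient cpu.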
If you still have the issue, you can then update the GKE version for your cluster by manually upgrading the control plane to one of the fixed versions.
Also check whether you have updated your setup in the last year to use the new kubectl authentication plugin coming in GKE v1.26.
Solution 3 :
If you have a pipeline on GitLab that deploys an image to a GKE cluster, check the version of the GitLab runner that handles the jobs of your pipeline.
It turns out that every image built by a GitLab runner running an old version causes this issue at container start. Simply deactivate those runners, leave only runners on the latest version in the pool, and replay all pipelines.
Also check whether the GitLab CI script uses an old Docker image like docker:19.03.5-dind; updating it to docker:dind helps Kubernetes start the pod again.
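A minimal sketch of that image bump in .gitlab-ci.yml (the job name and script are illustrative; $CI_REGISTRY_IMAGE and $CI_COMMIT_SHORT_SHA are standard GitLab CI variables):
build-image:
  image: docker:dind        # was docker:19.03.5-dind
  services:
    - docker:dind
  script:
    - docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA .
    - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA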

How to update Istio configuration after installation?

Every document I found only tells you how to enable/disable a feature while installing a new Istio instance. But I think in a lot of cases, people need to update the Istio configuration.
In Accessing External Services, for instance, it says I need to provide <flags-you-used-to-install-Istio>, but what if I don't know how the instance was installed?
In Address auto allocation, for instance, it doesn't mention a way to update the configuration. Does that imply this feature has to be enabled in a fresh installation?
Why is there no istioctl update command?
The confusion totally makes sense, as it would at least be nice for this to be called out somewhere.
Basically, there is no update command for the same reason that there is no kubectl update command. What istioctl does is generate YAML output that represents, in a declarative way, how your application should be running, and then it applies that to the cluster and Kubernetes handles it.
So istioctl install with the same values will produce the same output, and when it is applied to Kubernetes, nothing will be updated if there were no changes.
I will rephrase your questions to be more precise; I believe the context is the same:
How do I find Istio installation configuration
Prior to installation, you should have generated the manifest. This can be done with
istioctl manifest generate <flags-you-use-to-install-Istio> > $HOME/istio-manifest.yaml
With this manifest you can inspect what is being installed, and track changes to the manifest over time.
This will also capture any changes to the underlying charts (if installed with Helm). Just add the -f flag to the command:
istioctl manifest generate -f path/to/manifest.yaml > $HOME/istio-manifest.yaml
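One way to spot drift over time, assuming you keep the originally generated manifest around (the temporary file name is illustrative):
istioctl manifest generate <flags-you-use-to-install-Istio> > /tmp/istio-manifest-new.yaml
diff $HOME/istio-manifest.yaml /tmp/istio-manifest-new.yaml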
If there is no manifest available, you can check the IstioOperator custom resource, but Istio must have been installed with the operator for it to be available.
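For example, if the CR exists you may be able to dump it directly; the istio-system namespace is an assumption, adjust it to wherever the CR lives in your cluster:
kubectl get istiooperators -n istio-system -o yaml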
If neither of the above is available, you are out of luck. This is not an optimal situation, but it is what we get.
How do I customize Istio installation
Using IstioOperator
You can pass new configuration, in YAML format, to istioctl install
echo '
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  components:
    pilot:
      k8s:
        resources:
          requests:
            cpu: 1000m # override from default 500m
            memory: 4096Mi # ... default 2048Mi
        hpaSpec:
          maxReplicas: 10 # ... default 5
          minReplicas: 2 # ... default 1
' | istioctl install -f -
The above example adjusts the resources and horizontal pod autoscaling settings for Pilot.
Any other configuration (ServiceEntry, DestinationRule, etc.) is deployed like any other resource with kubectl apply.
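For instance, a ServiceEntry for an external host is applied just like any other Kubernetes resource (the name and host below are illustrative):
kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: httpbin-ext
spec:
  hosts:
    - httpbin.org
  location: MESH_EXTERNAL
  resolution: DNS
  ports:
    - number: 443
      name: https
      protocol: TLS
EOF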
Why is there no istioctl update command
Because of #2: changes to Istio are applied using istioctl install.
If you want to upgrade Istio to a newer version, there are instructions available in the docs.
Good brother, I registered an account just to reply. I had been searching for a long time for how to update Istio, such as the global mesh configuration. After seeing your post and the answer below, I finally have an answer.
My previous approach was to create two configurations, one for istiod and one for the ingress. When I ran istioctl install -f istiod.yaml, my ingress would be deleted, which bothered me.
Once I saw this post, I got it.
I merged the two files into one. The following is my file; it can be updated without deleting my ingress configuration:
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  profile: minimal
  meshConfig:
    accessLogFile: /dev/stdout
    accessLogEncoding: TEXT
    enableTracing: true
    defaultConfig:
      tracing:
        zipkin:
          address: jaeger-collector.istio-system:9411
        sampling: 100
  components:
    ingressGateways:
      - name: ingressgateway
        namespace: istio-ingress
        enabled: true
        label:
          # Set a unique label for the gateway. This is required to ensure Gateways
          # can select this workload
          istio: ingressgateway
  values:
    gateways:
      istio-ingressgateway:
        # Enable gateway injection
        injectionTemplate: gateway
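Applying the merged file in a single istioctl install run then keeps both components in sync (the file name is illustrative):
istioctl install -f istio-merged.yaml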
Thank you very much, this post solved my troubles

AWS cloudformation: How to run cfn-nag locally in Windows

I have a CloudFormation template with all the resources and details for the project.
I have cfn-lint set up locally and it is running perfectly fine. However, when I push code changes, the build fails at the deployment stage because cfn-nag reports some simple issues that could be fixed.
I'm using a Windows machine and I need a way to run cfn-nag locally, so that I can check it just like cfn-lint and fix the findings locally instead of waiting 40 minutes for the build to reach the deployment stage.
I referred to several posts online and found the two below helpful:
https://stelligent.com/2018/03/23/validating-aws-cloudformation-templates-with-cfn_nag-and-mu/
https://github.com/stelligent/cfn_nag
What is the difference between cfn-nag and cfn-lint, and why is lint not failing on what cfn-nag is complaining about?
The above links have some instructions for Ruby and Brew, but I'm using Node.js and felt lost. Please help.
CFN-Nag looks for patterns in AWS CloudFormation templates that may indicate insecure infrastructure,
Ex:
IAM rules that are too permissive (wildcards),
Security group rules that are too permissive (wildcards),
Access logs that aren’t enabled,
Encryption that isn’t enabled,
CFN-Lint scans the AWS CloudFormation template by processing a collection of Rules, where every rule handles a specific function check or validation of the template. It validates against AWS CloudFormation Resource specification.
This collection of rules can be extended with custom rules using the --append-rules argument.
Ex: whitespace, alignment (YAML), type checks, valid values for resource properties, and other best practices.
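To illustrate the two tools side by side (the template path and rules directory are made up):
cfn-lint --template template.yaml --append-rules ./custom_rules/
cfn_nag_scan --input-path template.yaml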
Those two links you provided above have all the information needed, just not presented for a Node.js developer on a Windows machine.
Step 1: Pull the docker image stelligent/cfn-nag.
Step 2: Add a script for cfn-nag to your package.json.
Ex:
"scripts" : {
"cfn:nag": "cfn-nag"
}
If you're using docker-compose.yml, add the cfn-nag image details to it like below:
cfn-nag:
  image: "stelligent/cfn-nag"
  volumes:
    - ./path_of_cfn_file_to_copy:/path_to_copy_to
  command: ${COMMAND:-/path_to_copy_to/cfn_file}
Then just set the script in package.json to run via docker-compose:
"cfn:nag": "docker-compose run --rm cfn-nag"

Istio install with updated configs doesn't delete Prometheus ServiceMonitor objects

I have Istio (version 1.16.3) configured with an external Prometheus and I have the Prometheus ServiceMonitor objects configured using the built in Prometheus operator based on the discussion in this issue: https://github.com/istio/istio/issues/21187
For the most part this works fine, except that I noticed that the kubernetes-services-secure-monitor and kubernetes-pods-secure-monitor were also created, and this resulted in Prometheus throwing certificate-not-found errors, as expected, because I have not set these up.
"level=error ts=2020-07-06T03:43:33.464Z caller=manager.go:188 component="scrape manager" msg="error creating new scrape pool" err="error creating HTTP client: unable to load specified CA cert /etc/prometheus/secrets/istio.prometheus/root-cert.pem: open /etc/prometheus/secrets/istio.prometheus/root-cert.pem: no such file or directory" scrape_pool=istio-system/kubernetes-pods-secure-monitor/0
I also noticed that the service monitor creation can be disabled by using the Values.prometheus.provisionPrometheusCert flag as per this:
istio/manifests/charts/istio-telemetry/prometheusOperator/templates/servicemonitors.yaml
{{- if .Values.prometheus.provisionPrometheusCert }}
However, re-applying the config using istioctl install did not delete those ServiceMonitors.
Does istioctl install command not delete/prune existing resources?
Here is my full configuration:
apiVersion: install.istio.io/v1alpha1
kind: IstioControlPlane
metadata:
  namespace: istio-system
  name: istio-controlplane
  labels:
    istio-injection: enabled
spec:
  profile: default
  addonComponents:
    prometheus:
      enabled: false
    prometheusOperator:
      enabled: true
    grafana:
      enabled: false
    kiali:
      enabled: true
      namespace: staging
    tracing:
      enabled: false
  values:
    global:
      proxy:
        logLevel: warning
      mountMtlsCerts: false
      prometheusNamespace: monitoring
      tracer:
        zipkin:
          address: jaeger-collector.staging:9411
    prometheusOperator:
      createPrometheusResource: false
    prometheus:
      security:
        enabled: false
      provisionPrometheusCert: false
Two separate concerns: Upgrade to a new version of Istio and updates to the config.
Upgrade
As far as I know, there were a lot of issues when upgrading Istio from older versions to 1.4, 1.5 and 1.6, but now that istioctl upgrade has arrived you shouldn't be worried about upgrading your cluster.
The istioctl upgrade command performs an upgrade of Istio. Before performing the upgrade, it checks that the Istio installation meets the upgrade eligibility criteria. Also, it alerts the user if it detects any changes in the profile default values between Istio versions.
Additionally, Istio 1.6 will support a new upgrade model to safely canary-deploy new versions of Istio. In this new model, proxies will associate with a specific control plane that they use. This allows a new version to deploy to the cluster with less risk: no proxies connect to the new version until the user explicitly chooses to. This allows gradually migrating workloads to the new control plane, while monitoring changes using Istio telemetry to investigate any issues.
Related documentation about that is here and here.
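A rough sketch of both paths, assuming the IstioOperator file used for the original install is still available (the file name and revision label are illustrative):
# in-place upgrade with the newer istioctl binary
istioctl upgrade -f $HOME/istio-operator-config.yaml
# or canary: install the new control plane under a revision and migrate workloads gradually
istioctl install --set revision=canary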
Update
As I mentioned in the comments, the two things I found which might help are:
Istio operator logs
If something goes wrong with your update, it will appear in the Istio operator logs and the update will fail.
You can observe the changes that the controller makes in the cluster in response to IstioOperator CR updates by checking the operator controller logs:
$ kubectl logs -f -n istio-operator $(kubectl get pods -n istio-operator -lname=istio-operator -o jsonpath='{.items[0].metadata.name}')
istioctl verify-install
Verify a successful installation
You can check if the Istio installation succeeded using the verify-install command which compares the installation on your cluster to a manifest you specify.
If you didn’t generate your manifest prior to deployment, run the following command to generate it now:
$ istioctl manifest generate <your original installation options> > $HOME/generated-manifest.yaml
Then run the following verify-install command to see if the installation was successful:
$ istioctl verify-install -f $HOME/generated-manifest.yaml
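If the stale ServiceMonitors from the question are simply left behind after re-applying the config, one pragmatic cleanup is to delete them by hand (namespace taken from the error message above):
kubectl delete servicemonitor kubernetes-services-secure-monitor kubernetes-pods-secure-monitor -n istio-system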
Hope you find this useful.

Elastic Kubernetes Service AWS Deployment process to avoid down time

It's been a month since I started working on AWS EKS, and up till now I have successfully deployed my code.
The steps I follow for deployment are given below:
Create the image from the Docker terminal.
Tag it and push it to AWS ECR.
Create the deployment file "project.json" and the service file "project-svc.json".
Save the above files in the "kubectl/bin" path and deploy them with the following commands.
"kubectl apply -f projectname.json" and "kubectl apply -f projectname-svc.json".
So if I want to deploy the same project again with a change, I push the new image to ECR, delete the existing deployment using "kubectl delete -f projectname.json" (without deleting the existing service), and deploy it again using "kubectl apply -f projectname.json".
Now, I'm confused because after I delete the existing deployment there is downtime until I apply or create the deployment again. So, how do I avoid this? I don't want the downtime; that is actually the reason why I started using EKS.
And the deployment process is a bit long too. I know I'm missing something; can anybody guide me properly, please?
The project is on .NET Core, and if there is any simplified way to do the deployment using Visual Studio, please guide me on that as well.
Thank You in advance!
There is actually no need to delete your deployment. You just need to update the desired state (the deployment configuration) and let Kubernetes do its magic and apply the needed changes, like deploying a new version of your container.
If you have a single instance of your container, you will experience a short downtime while the changes are applied. If your application supports multiple replicas (HA), you can enjoy the rolling update feature.
Start by reading the official Kubernetes documentation on Performing a Rolling Update.
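As a minimal sketch of a zero-downtime setup (the deployment name, labels and image are hypothetical), a rolling update strategy with more than one replica lets Kubernetes replace pods gradually when you re-apply the file with a new image tag:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: project                # hypothetical name
spec:
  replicas: 3                  # more than one replica avoids downtime during the rollout
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0        # never take a pod down before its replacement is ready
      maxSurge: 1
  selector:
    matchLabels:
      app: project
  template:
    metadata:
      labels:
        app: project
    spec:
      containers:
        - name: project
          image: 123456789012.dkr.ecr.eu-west-1.amazonaws.com/project:v2   # push a new ECR tag and re-apply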
You only need to use delete/apply if you are changing the ConfigMap attached to the Deployment (and only if you have one).
If the only change you make is the image of the deployment, you should use the set image command.
kubectl lets you change the actual deployment image and performs the rolling update all by itself, and with 3+ pods you have the minimum chance of downtime.
Even better, if you use the --record flag, you can roll back to your previous image with no effort, because it keeps track of the changes.
You can also specify the context, with no need to jump between contexts.
You can go like this:
kubectl set image deployment DEPLOYMENT_NAME CONTAINER_NAME=IMAGE_NAME --record -n NAMESPACE
Or, specifying the cluster:
kubectl set image deployment DEPLOYMENT_NAME CONTAINER_NAME=IMAGE_NAME_ECR -n NAMESPACE --cluster EKS_CLUSTER_NPROD --user EKS_CLUSTER --record
For example (here the container name matches the deployment name):
kubectl set image deployment nginx-dep nginx-dep=ecr12345/nginx:latest -n nginx --cluster eu-central-123-prod --user eu-central-123-prod --record
The --record flag is what lets you track all the changes; if you want to roll back, just do:
kubectl rollout undo deployment.v1.apps/nginx-dep
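To watch the rollout and inspect the recorded revisions (using the same example deployment and namespace as above):
kubectl rollout status deployment/nginx-dep -n nginx
kubectl rollout history deployment/nginx-dep -n nginx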
More documentation about it here:
Updating a deployment
https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#updating-a-deployment
Roll Back Deployment
https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#rolling-back-a-deployment