Istio: service-to-service communication is not happening as expected

I have one service with a sidecar and one service without a sidecar.
I am making a POST request from the service with the sidecar to the service without a sidecar, which also runs on HTTPS.
I am getting the error below in the logs:
org.springframework.web.client.ResourceAccessException: I/O error on POST request for "": Unrecognized SSL message, plaintext connection?; nested exception is javax.net.ssl.SSLException: Unrecognized SSL message, plaintext connection?
Am I missing something here that would cause this error?

You probably need to disable mTLS for the service without a sidecar using a DestinationRule, something like:
kubectl apply -f - <<EOF
apiVersion: "networking.istio.io/v1alpha3"
kind: DestinationRule
metadata:
  name: default
  namespace: foo
spec:
  host: bar.foo.svc.cluster.local
  trafficPolicy:
    tls:
      mode: DISABLE
EOF
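If only some ports of the sidecar-less service speak TLS themselves, the same can be scoped per port instead of disabling TLS for the whole host. A sketch using portLevelSettings (the port number 8443 is an assumption, not from the question):

```yaml
apiVersion: "networking.istio.io/v1alpha3"
kind: DestinationRule
metadata:
  name: default
  namespace: foo
spec:
  host: bar.foo.svc.cluster.local
  trafficPolicy:
    portLevelSettings:
    - port:
        number: 8443   # hypothetical HTTPS port served by the app itself
      tls:
        mode: DISABLE  # don't originate Istio mTLS; let the app's own TLS pass through
```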

Related

Istio: DestinationRule for a legacy service outside the mesh

I have a k8s cluster with Istio deployed in the istio-system namespace, and sidecar injection enabled by default in another namespace called mesh-apps. I also have a second legacy namespace which contains certain applications that do their own TLS termination. I am trying to set up mTLS access between services running inside the mesh-apps namespace and those running inside legacy.
For this purpose, I have done the following:
Created a secret in the mesh-apps namespace containing the client cert, key and CAcert to be used to connect with an application in legacy via mTLS.
Mounted these at a well-defined location inside a pod (the sleep pod in Istio samples actually) running in mesh-apps.
Deployed an app inside legacy and exposed it using a ClusterIP service called mymtls-app on port 8443.
Created the following destination rule in the mesh-apps namespace, hoping that this enables mTLS access from mesh-apps to legacy.
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: originate-mtls
spec:
  host: mymtls-app.legacy.svc.cluster.local
  trafficPolicy:
    portLevelSettings:
    - port:
        number: 8443
      tls:
        mode: MUTUAL
        clientCertificate: /etc/sleep/tls/server.cert
        privateKey: /etc/sleep/tls/server.key
        caCertificates: /etc/sleep/tls/ca.pem
        sni: mymtls-app.legacy.svc.cluster.local
Now when I run the following command from inside the sleep pod, I would have expected the above DestinationRule to take effect:
kubectl exec sleep-37893-foobar -c sleep -- curl http://mymtls-app.legacy.svc.cluster.local:8443/hello
But instead I just get the error:
Client sent an HTTP request to an HTTPS server.
If I add https in the URL, then this is the error:
curl: (56) OpenSSL SSL_read: error:14094412:SSL routines:ssl3_read_bytes:sslv3 alert bad certificate, errno 0
command terminated with exit code 56
I figured out my own mistake. I needed to correctly mount the certificate, private key, and CA chain in the sidecar, not in the app container. In order to mount them in the sidecar, I performed the following actions:
Created a secret with the cert, private key and CA chain.
kubectl create secret generic sleep-secret -n mesh-apps \
  --from-file=server.key=/home/johndoe/certs_mtls/client.key \
  --from-file=server.cert=/home/johndoe/certs_mtls/client.crt \
  --from-file=ca.pem=/home/johndoe/certs_mtls/server_ca.pem
Modified the deployment manifest for the sleep container thus:
template:
  metadata:
    annotations:
      sidecar.istio.io/userVolumeMount: '[{"name": "secret-volume", "mountPath": "/etc/sleep/tls", "readonly": true}]'
      sidecar.istio.io/userVolume: '[{"name": "secret-volume", "secret": {"secretName": "sleep-secret"}}]'
Actually I had already created the secret earlier, but it was mounted in the app container (sleep) instead of the sidecar, in this way:
spec:
  volumes:
  - name: <secret_volume_name>
    secret:
      secretName: <secret_name>
      optional: true
  containers:
  - name: ...
    volumeMounts:
    - mountPath: ...
      name: <secret_volume_name>

Google Cloud Run custom domains do not work with web sockets

I successfully deployed a simple Voila dashboard using Google Cloud Run for Anthos. However, since I created the deployment using a GitLab CI pipeline, by default the service was assigned a long and obscure domain name (e.g. http://sudoku.dashboards-19751688-sudoku.k8s.proteinsolver.org/).
I followed the instructions in mapping custom domains to map a shorter custom domain to the service described above (e.g. http://sudoku.k8s.proteinsolver.org). However, while the static assets load fine from this new custom domain, the interactive dashboard does not load, and the javascript console is populated with errors:
default.js:64 WebSocket connection to 'wss://sudoku.k8s.proteinsolver.org/api/kernels/5bcab8b9-11d5-4de0-8a64-399e35258aa1/channels?session_id=7a0eed38-77bb-40e8-ad77-d05632b5fa1b' failed: Error during WebSocket handshake: Unexpected response code: 503
_createSocket # scheduler.production.min.js:10
[...]
Is there a way to get web sockets to work with custom domains? Am I doing something wrong?
TL;DR, the following YAML needs to be applied to make websockets work:
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: allowconnect-cluster-local-gateway
  namespace: gke-system
spec:
  workloadSelector:
    labels:
      app: cluster-local-gateway
  configPatches:
  - applyTo: NETWORK_FILTER
    match:
      listener:
        portNumber: 80
        filterChain:
          filter:
            name: "envoy.http_connection_manager"
    patch:
      operation: MERGE
      value:
        typed_config:
          "@type": "type.googleapis.com/envoy.config.filter.network.http_connection_manager.v2.HttpConnectionManager"
          http2_protocol_options:
            allow_connect: true
Here is the explanation.
For the custom domain feature, the request path is
client ---> istio-ingress envoy pods ---> cluster-local-gateway envoy pods ---> user's application.
Websocket requests specifically require the cluster-local-gateway Envoy pods to support the extended CONNECT feature.
The EnvoyFilter YAML enables the extended CONNECT feature by setting allow_connect to true within the cluster-local-gateway pods.
I tried it myself, and it works for me.
I don't know anything about your GitLab CI pipeline. By default, Knative (Cloud Run for Anthos) assigns external domain names like {name}.{namespace}.example.com, where example.com can be customized based on your domain.
You can find this domain in the Cloud Console or with kubectl get ksvc.
First, check whether this domain works correctly with websockets. If it does, it's indeed a "custom domain" issue. (If you are not sure, please edit your title/question to not mention "custom domains".)
Also, you need to explicitly mark your container port as h2c on Knative for websockets to work. See the ports section below, specifically name: h2c:
apiVersion: serving.knative.dev/v1alpha1
kind: Service
metadata:
  name: hello
spec:
  template:
    spec:
      containers:
      - image: gcr.io/google-samples/hello-app:1.0
        ports:
        - name: h2c
          containerPort: 8080
I also see that the response code to your requests is HTTP 503, likely indicating a server error. Please check your application’s logs.

What's the purpose of the `VirtualService` in this example?

I am looking at this example of Istio, where they are creating a ServiceEntry and a VirtualService to access the external service, but I don't understand why they are creating a VirtualService as well.
So, this is the ServiceEntry:
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: edition-cnn-com
spec:
  hosts:
  - edition.cnn.com
  ports:
  - number: 80
    name: http-port
    protocol: HTTP
  - number: 443
    name: https-port
    protocol: HTTPS
  resolution: DNS
With just this object, if I try to curl edition.cnn.com, I get 200:
/ # curl edition.cnn.com -IL 2>/dev/null | grep HTTP
HTTP/1.1 301 Moved Permanently
HTTP/1.1 200 OK
While I can't access other services:
/ # curl google.com -IL
HTTP/1.1 502 Bad Gateway
location: http://google.com/
date: Fri, 10 Jan 2020 10:12:45 GMT
server: envoy
transfer-encoding: chunked
But in the example they create this VirtualService as well.
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: edition-cnn-com
spec:
  hosts:
  - edition.cnn.com
  tls:
  - match:
    - port: 443
      sni_hosts:
      - edition.cnn.com
    route:
    - destination:
        host: edition.cnn.com
        port:
          number: 443
      weight: 100
What's the purpose of the VirtualService in this scenario?
The VirtualService object is basically an abstract Pilot resource that modifies the Envoy configuration.
So creating a VirtualService is a way of modifying Envoy, and its main purpose is to answer the question: "for a given name, how do I route to backends?"
A VirtualService can also be bound to a Gateway.
In your case, the lack of a VirtualService means Envoy is left with the default/global configuration, which was enough for this case to work correctly.
So the Gateway used was most likely the default one, with the same protocol and port you requested with curl, all of which matched your ServiceEntry's requirements for connectivity.
This is also mentioned in the Istio documentation:
Virtual services, along with destination rules, are the key building blocks of Istio’s traffic routing functionality. A virtual service lets you configure how requests are routed to a service within an Istio service mesh, building on the basic connectivity and discovery provided by Istio and your platform. Each virtual service consists of a set of routing rules that are evaluated in order, letting Istio match each given request to the virtual service to a specific real destination within the mesh. Your mesh can require multiple virtual services or none depending on your use case.
You can use a VirtualService to add things like a timeout to the connection, as in this example.
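As an illustration, here is a sketch of such a timeout for plain-HTTP calls to the same external host (the 10s value and the resource name are assumptions, not from the example):

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: edition-cnn-com-timeout   # hypothetical name
spec:
  hosts:
  - edition.cnn.com
  http:
  - timeout: 10s        # fail requests that take longer than 10 seconds
    route:
    - destination:
        host: edition.cnn.com
        port:
          number: 80
```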
You can check the routes for your service with the following command from the Istio documentation: istioctl proxy-config routes <pod-name[.namespace]>
For bookinfo productpage demo app it is:
istioctl pc routes $(kubectl get pod -l app=productpage -o jsonpath='{.items[0].metadata.name}') --name 9080 -o json
This way You can check how routes look without VirtualService object.
Hope this helps you in understanding Istio.
The VirtualService is not really doing anything, but as the docs say:
creating a VirtualService with a default route for every service, right from the start, is generally considered a best practice in Istio
The ServiceEntry adds the CNN site as an entry to Istio’s internal service registry, so auto-discovered services in the mesh can route to these manually specified services.
Usually that's used to allow monitoring and other Istio features of external services from the start, whereas the VirtualService allows the proper routing of requests (basically traffic management).
This page in the docs gives a bit more background info on using ServiceEntries and VirtualServices, but basically the ServiceEntry makes sure your mesh knows about the service and can monitor it, and the VirtualService controls what traffic is going to the service, which in this case is all of it.

Why does GCE Load Balancer behave differently through the domain name and the IP address?

A backend service happens to be returning Status 404 on the health check path of the Load Balancer. When I browse to the Load Balancer's domain name, I get "Error: Server Error/ The server encountered a temporary error", and the logs show
"type.googleapis.com/google.cloud.loadbalancing.type.LoadBalancerLogEntry"
statusDetails: "failed_to_pick_backend", which makes sense.
When I browse to the Load Balancer's static IP, my browser shows the 404 error message which the underlying Kubernetes pod returned; in other words, the Load Balancer passed on the request despite the failed health check.
Why these two different behaviors?
[Edit]
Here is the yaml for the Ingress that created the Load Balancer:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: ingress1
spec:
  rules:
  - host: example.com
    http:
      paths:
      - backend:
          serviceName: myservice
          servicePort: 80
I did a "deep dive" into that and managed to reproduce the situation on my GKE cluster, so now I can tell that there are a few things combined here.
A backend service happens to be returning Status 404 on the health check path of the Load Balancer.
There could be 2 options (it is not clear from the description you have provided).
The first is something like:
"Error: Server Error
The server encountered a temporary error and could not complete your request.
Please try again in 30 seconds."
This one you are getting from the LoadBalancer in case the HealthCheck failed for the pod. The official documentation on the GKE Ingress object says that
a Service exposed through an Ingress must respond to health checks from the load balancer.
Any container that is the final destination of load-balanced traffic must do one of the following to indicate that it is healthy:
Serve a response with an HTTP 200 status to GET requests on the / path.
Configure an HTTP readiness probe. Serve a response with an HTTP 200 status to GET requests on the path specified by the readiness probe. The Service exposed through an Ingress must point to the same container port on which the readiness probe is enabled.
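A minimal sketch of such a readiness probe (the /healthz path, port 8080, and the image name are assumptions; the probed path must return HTTP 200):

```yaml
containers:
- name: myservice            # hypothetical container backing the Service
  image: example/myservice   # placeholder image
  ports:
  - containerPort: 8080
  readinessProbe:
    httpGet:
      path: /healthz         # the load balancer's health check will use this path
      port: 8080
```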
You need to fix the HealthCheck handling. You can check the Load Balancer details by visiting GCP console - Network Services - Load Balancing.
"404 Not Found -- nginx/1.17.6"
This one is clear. That is the response returned by the endpoint that myservice is sending requests to. It looks like something is misconfigured there. My guess is that the pod merely can't serve that request properly. It can be an nginx web-server issue, etc. Please check the configuration to find out why the pod can't serve the request.
While playing with the setup I found an image that allows you to check whether a request has reached the pod, along with the request headers.
so it is possible to create a pod like:
apiVersion: v1
kind: Pod
metadata:
  annotations:
    run: fake-web
  name: fake-default-knp
  # namespace: kube-system
spec:
  containers:
  - image: mendhak/http-https-echo
    imagePullPolicy: IfNotPresent
    name: fake-web
    ports:
    - containerPort: 8080
      protocol: TCP
to be able to see all the headers that were in incoming requests (kubectl logs -f fake-default-knp).
When I browse to the Load Balancer's Static IP, my browser shows the 404 Error Message which the underlying Kubernetes Pod returned.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: ingress1
spec:
  rules:
  - host: example.com
    http:
      paths:
      - backend:
          serviceName: myservice
          servicePort: 80
Upon creation of such an Ingress object, there will be at least 2 backends in the GKE cluster:
- the backend you specified upon Ingress creation (the myservice one)
- the default one (created upon cluster creation)
kubectl get pods -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP
l7-default-backend-xyz 1/1 Running 0 20d 10.52.0.7
Please note that myservice serves only requests that have the Host header set to example.com. The rest of the requests are sent to the "default backend". That is the reason why you receive the "default backend - 404" error message upon browsing to the LoadBalancer's IP address.
Technically, there is a default-http-backend service that has l7-default-backend-xyz as an endpoint.
kubectl get svc -n kube-system -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
default-http-backend NodePort 10.0.6.134 <none> 80:31806/TCP 20d k8s-app=glbc
kubectl get ep -n kube-system
NAME ENDPOINTS AGE
default-http-backend 10.52.0.7:8080 20d
Again, that's the "object" that returns the "default backend - 404" error for requests with a "Host" header not equal to the one you specified in the Ingress.
Hope that it sheds a light on the issue :)
EDIT:
"myservice serves only requests that have the Host header set to example.com." So you are saying that requests go to the LB only when there is a Host header?
Not exactly. The LB receives all the requests and routes them according to the "Host" header value. Requests with the example.com Host header are served by the myservice backend.
To put it simply, the logic is as follows:
- a request arrives;
- the system checks the Host header (to determine the user's backend);
- the request is served if there is a suitable user backend (according to the Ingress config) and that backend is healthy; otherwise "Error: Server Error The server encountered a temporary error and could not complete your request. Please try again in 30 seconds." is thrown if the backend is in a non-healthy state;
- if the request's Host header doesn't match any host in the Ingress spec, the request is sent to the l7-default-backend-xyz backend (not the one mentioned in the Ingress config). That backend replies with the "default backend - 404" error.
Hope that makes it clear.
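If the goal is for myservice to also answer requests addressed to the bare IP (i.e. with no matching Host header), a catch-all rule without a host field can be added. A sketch based on the Ingress spec above:

```yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: ingress1
spec:
  rules:
  - host: example.com
    http:
      paths:
      - backend:
          serviceName: myservice
          servicePort: 80
  - http:                      # no host: matches requests with any Host header
      paths:
      - backend:
          serviceName: myservice
          servicePort: 80
```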

Istio: authn tls-check and external services

In my Istio mesh I have configured mTLS, and I consume some services that are external to the mesh and to the cluster. I can connect to them just fine by creating a trafficPolicy with TLS disabled, but no matter what I do I cannot get authn tls-check to be happy: it always displays CONFLICT with the server in mTLS and the client in HTTP.
From what I understand, the "server" in this case is external to the mesh, and I can't seem to create a policy that applies to it to tell Istio that this server is not using mTLS (obviously, as it's outside the mesh). Has anybody been able to set things up so that you have a service external to your mTLS mesh and authn tls-check displays OK with mTLS disabled for both server and client?
You should create a ServiceEntry for your external service with protocol HTTP; then you should be able to call it. You don't need to set a trafficPolicy.
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: external-svc-myservice
spec:
  hosts:
  - myservice.com
  location: MESH_EXTERNAL
  ports:
  - number: 80
    name: http
    protocol: HTTP
  resolution: DNS