I'm trying to set up GKE Gateway with an HTTPS listener using a wildcard certificate managed via Certificate Manager.
The problem I'm facing is not in provisioning the certificate, which was done successfully following the DNS Authorization tutorial and this answer. I've provisioned a wildcard certificate, which gcloud certificate-manager certificates describe <cert-name> shows as ACTIVE and AUTHORIZED on both the domain and its wildcard subdomain. I've also provisioned the associated Certificate Map and Map Entry (all via Terraform) and created a global static IP address and a wildcard A record for it in Cloud DNS.
However, when I try to use this cert and address in the GKE Gateway resource, the resource gets "stuck" (never reaches SYNC phase), and there's no HTTPS GCLB provisioned as seen via gcloud or Cloud Console.
Here's the config I was trying to use for it:
kind: Gateway
apiVersion: gateway.networking.k8s.io/v1beta1
metadata:
  name: external-https
  annotations:
    networking.gke.io/certmap: gke-gateway
spec:
  gatewayClassName: gke-l7-gxlb
  listeners:
  - name: http
    protocol: HTTP
    port: 80
    allowedRoutes:
      kinds:
      - kind: HTTPRoute
  - name: https
    protocol: HTTPS
    port: 443
    allowedRoutes:
      kinds:
      - kind: HTTPRoute
  addresses:
  - type: NamedAddress
    value: gke-gateway
I've tried several variations of this config, including an explicit IPAddress and omitting allowedRoutes, but none of them work. I can only see the initial ADD and UPDATE events in the output of kubectl describe gateway external-https, and as far as I know there are no logs for it anywhere (as I understand it, the GKE Gateway Controller is part of the GKE control plane and exposes no logging to customers).
The only time I was able to make either internal or external Gateway to work is when using HTTP protocol, i.e. without certificates. Hence, I think this has to do with HTTPS, and probably more specifically with linking to the managed wildcard certificate.
Additionally, I should mention that my attempts at deploying the Gateway fail most of the time (i.e. the resource gets "stuck" in the same way), even when reusing a previously-working HTTP config. I'm not sure what the source of this flakiness is (apart from maybe some internal quota), but I imagine this is fully expected, as the service is still in Beta.
Has anyone been able to actually provision a Gateway with HTTPS listener and wildcard certs, and how?
I've seen something strange where I've been able to have an nginx-ingress with an injected sidecar (i.e. part of the mesh) successfully route traffic that it receives into a cluster based on a k8s ingress definition, and then apply Istio traffic routing to route traffic as desired internally, but this only works when the traffic is being sent to the k8s services via port 80, and only when that is a port that is NOT served by the associated k8s service. This tells me my success is likely some kind of hack.
I'm asking if anyone can point out where I'm going wrong and/or why this is working. (I need to use the nginx ingress here, I can't switch to using istio-ingressgateway for this.)
My configuration / ability to reproduce this is documented in full on this github project: https://github.com/bob-walters/nginx-istio which I've created to provide a way to repeat this setup.
My setup:
a standard Istio installation in a k8s cluster (docker desktop) with the namespaces configured to do automatic sidecar injection.
an nginx-ingress deployment (file) with injected istio sidecar.
configured the nginx-ingress with these values in order to ensure that the sidecar would not try to handle inbound traffic but should permit the outbound traffic to go to the sidecar:
podAnnotations:
  traffic.sidecar.istio.io/includeInboundPorts: ""
  traffic.sidecar.istio.io/excludeInboundPorts: "80,443"
A set of (demo) services based on podinfo representing the different services that I want to route between via Istio virtual services. Each serving on port 9898 with type: ClusterIP (i.e. only accessible via ingress)
A k8s Ingress definition (file) for the nginx-ingress which carries out the routing for some fictitious hostnames to the different podinfo deployments. The ingress includes the following specific annotations:
The annotation nginx.ingress.kubernetes.io/service-upstream: "true" is set in order to ensure that the nginx-ingress uses the cluster IP address, rather than individual pod IP addresses, when forwarding traffic.
The annotation nginx.ingress.kubernetes.io/upstream-vhost: nginx-cache-v2.whitelabel-dev.svc.cluster.local is NOT set. Many articles will indicate that you should typically set this in combination with the above, but setting this has the effect of altering the Host header to the value specified, and Istio routes based on the Host header, so setting this would require that all Istio routing rules be specified in terms of those hostnames and not the original hostnames. More details on this can be found at: https://github.com/kubernetes/ingress-nginx/issues/3171
Finally: a Virtual Service (file) for one of the hosts (same hostname given in the ingress definition) which is meant to apply once the traffic reaches the Istio cluster, and carries out routing based on a cookie header. (It's doing weighted service shifting with a cookie to pin user sessions.)
Here's the oddity:
The Istio traffic management seems to apply correctly if the target port of the ingress is 80. If it's 9898 (as you would expect because that is the service's available port), the Istio traffic management doesn't seem to apply at all.
This is what I'm seeing as I try varying the port numbers:
| Target Port of Ingress Rule | K8s Service Port | Virtual Service Port | Result |
|---|---|---|---|
| 80 | 9898 | not set | virtual service works as desired |
| 9898 | 9898 | not set | routes to K8s Service. Virtual service has no effect |
| 8080 | 9898 | not set | fails: timeout/502 while attempting to invoke service |
| 9898 | 9898 | 9898 | routes to K8s Service. Virtual service has no effect |
| 443 | 9898 | not set | fails: timeout/502 while attempting to invoke service |
I'm really confused as to why this is not working with port 9898, but is working for port 80, especially given that K8s reports my ingress definition as invalid. My understanding of the routing is that the inbound traffic would go to the 'controller' container in the nginx-ingress service, bypassing the istio proxy as long as it comes in on ports 80 or 443. The outbound traffic should all be going through the proxy destined for the ClusterIP addresses of the k8s services, but with the 'Host' header still containing the original requested host. Thus Istio should be able to handle its routing responsibilities based on Host + Port, and does so, but only if I am routing into the mesh with port 80.
Any help greatly appreciated!
I struggled with this some more and eventually got it working.
There are some specific (non-intuitive) things that have to be correctly lined up for virtual services to work with traffic handled by an nginx-ingress. The details are at the README.md at https://github.com/bob-walters/nginx-istio
I am new to Istio and have a question about configuring a RequestAuthentication policy. The policy uses a jwksUri that points to an external URI, and it is applied in the istio-system namespace. As soon as I apply this policy and run
>istioctl proxy-status
the LDS of the ingress gateway the policy applies to is marked STALE. If I remove the policy, the gateway goes back to the SYNCED state. It seems the jwksUri is not reachable because we are behind a company proxy. I created a ServiceEntry to access the external JWKS URI, something like this:
kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: jwksexternal
spec:
  hosts:
  - authorization.company.com
  ports:
  - number: 443
    name: https
    protocol: HTTPS
  resolution: DNS
  location: MESH_EXTERNAL
EOF
I also tried creating one more ServiceEntry, following the "Configuring traffic to external proxy" section of this documentation: https://istio.io/latest/docs/tasks/traffic-management/egress/http-proxy/
But this is not working. How should I configure the company proxy in Istio?
Edit: these are the logs from istiod (please note that https://authorization.company.com/jwk is an external URL):
2021-06-02T14:35:39.423938Z error model Failed to fetch public key from "https://authorization.company.com/jwk": Get "https://authorization.company.com/jwk": dial tcp: lookup authorization.company.com on 10.X.0.X:53: no such host
2021-06-02T14:35:39.423987Z error Failed to fetch jwt public key from "https://authorization.company.com/jwk": Get "https://authorization.company.com/jwk": dial tcp: lookup authorization.company.com on 10.X.0.X:53: no such host
2021-06-02T14:35:39.424917Z info ads LDS: PUSH for node:istio-ingressgateway-5b69b5448c-8wbt4.istio-system resources:1 size:4.5kB
2021-06-02T14:35:39.433976Z warn ads ADS:LDS: ACK ERROR router~10.X.48.X~istio-ingressgateway-5b69b5448c-8wbt4.istio-system~istio-system.svc.cluster.local-105 Internal:Error adding/updating listener(s) 0.0.0.0_8443: Provider 'origins-0' in jwt_authn config has invalid local jwks: Jwks RSA [n] or [e] field is missing or has a parse error
I have not been able to find a workaround for this issue. For now I have embedded the JWKS in the JWT rules, but that has a problem: whenever the public keys get rotated, the JWT rules fail. This is a proxy issue, but I'm not sure how to bypass it.
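For readers unfamiliar with the embedded-JWKS workaround mentioned above, it might look roughly like the following sketch. The resource name, selector label, issuer and key material are placeholders chosen for illustration (assumptions, not values from the actual setup). In a RequestAuthentication `jwtRules` entry, `jwks` is the inline alternative to `jwksUri`.

```yaml
# Sketch only: name, selector label, issuer and key values are placeholders.
apiVersion: security.istio.io/v1beta1
kind: RequestAuthentication
metadata:
  name: jwt-embedded-jwks            # hypothetical name
  namespace: istio-system
spec:
  selector:
    matchLabels:
      istio: ingressgateway          # assumed label on the ingress gateway pods
  jwtRules:
  - issuer: "https://authorization.company.com"   # assumed issuer value
    # Inline key set instead of jwksUri, so nothing has to be fetched
    # from the external URL. It must be updated by hand on key rotation.
    jwks: |
      {"keys": [{"kty": "RSA", "kid": "<key-id>", "n": "<modulus>", "e": "AQAB"}]}
```

This avoids the fetch that fails in the logs, at the cost of the manual update on every key rotation that the question already points out.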
By default, Istio allows traffic to external systems.
See https://istio.io/latest/docs/tasks/traffic-management/egress/egress-control/#change-to-the-blocking-by-default-policy
So if the problem is that the JWKS URL can't be accessed, it is most likely not because of Istio and a ServiceEntry won't help. I guess the problem will be somewhere else, not in Istio.
We would like to use Istio for achieving blocking of egress access from applications and to have an allow-list/block-list of IP Addresses and CIDR blocks. Are there any solutions possible using Istio?
-Renjith
We would like to use Istio for achieving blocking of egress access from applications
I think you could use REGISTRY_ONLY outboundTrafficPolicy.mode for that.
Istio has an installation option, meshConfig.outboundTrafficPolicy.mode, that configures the sidecar handling of external services, that is, those services that are not defined in Istio’s internal service registry. If this option is set to ALLOW_ANY, the Istio proxy lets calls to unknown services pass through. If the option is set to REGISTRY_ONLY, then the Istio proxy blocks any host without an HTTP service or service entry defined within the mesh. ALLOW_ANY is the default value, allowing you to start evaluating Istio quickly, without controlling access to external services. You can then decide to configure access to external services later.
More about that here and here.
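As a minimal sketch of the option quoted above, assuming an IstioOperator-based installation: the first document switches the mesh to REGISTRY_ONLY, and the second registers one external host so that it stays reachable. The ServiceEntry name and the host api.example.com are hypothetical.

```yaml
# Sketch only: block egress to anything not in the service registry.
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  meshConfig:
    outboundTrafficPolicy:
      mode: REGISTRY_ONLY
---
# Allow-list entry for one external host (hypothetical name and host).
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: allowed-external-api
spec:
  hosts:
  - api.example.com
  ports:
  - number: 443
    name: tls
    protocol: TLS
  resolution: DNS
  location: MESH_EXTERNAL
```

Note that this gives a host-based allow list; it does not by itself cover the IP/CIDR part of the question.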
and to have an allow-list/block-list of IP Addresses and CIDR blocks.
AFAIK the only way to create an allow/block list in Istio is with an AuthorizationPolicy or an EnvoyFilter.
I have found a few examples that use an AuthorizationPolicy with the egress gateway, for example here.
They just changed the AuthorizationPolicy selector label from app: istio-ingressgateway to app: istio-egressgateway.
spec:
  selector:
    matchLabels:
      app: istio-egressgateway
I looked for an example using IPs/CIDR blocks but couldn't find one, so I'm not sure whether that will work with the egress gateway.
Additional resources:
https://istio.io/latest/docs/tasks/security/authorization/authz-ingress/#ip-based-allow-list-and-deny-list
Istio authorization policy not applying on child gateway
https://istio.io/latest/docs/reference/config/security/authorization-policy/#Source
https://github.com/salrashid123/istio_helloworld#egress-rules
I followed this tutorial https://cloud.google.com/storage/docs/hosting-static-website
But I am not able to reach the site over HTTPS because of ERR_SSL_VERSION_OR_CIPHER_MISMATCH / SSL_ERROR_NO_CYPHER_OVERLAP, depending on the browser.
I use a managed certificate provided by Google, but no browser seems to be compatible with it. I use the GCP default SSL policy, and I also tried creating one for testing with a minimum requirement of TLS 1.0, but nothing changed.
Yes, when using a Google-managed certificate it sometimes takes time to propagate to the associated domain, so in future you could use the curl or dig command to verify it. Sometimes it takes up to 24 hours, which is the maximum.
Please verify the following points:
verify that your website points to the frontend of the LB
check the state of the Google-managed cert attached to the LB frontend
verify that the frontend is using HTTPS and the backend is using HTTP
verify your SSL cert
I'm sure it's a problem with the DNS server. If your config is correct, you have to wait a few hours more and redeploy again.
In my case, I was setting up a subdomain with a different IP from the one used by the domain.
My managed certificate was something like this:
apiVersion: networking.gke.io/v1beta1
kind: ManagedCertificate
metadata:
  name: my-certificate
spec:
  domains:
  - www.sub.example.com
My ingress was fine:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
  annotations:
    kubernetes.io/ingress.global-static-ip-name: st-ext-ip-prod
    networking.gke.io/v1beta1.FrontendConfig: ssl-redirect
    networking.gke.io/managed-certificates: my-certificate
spec: ...
The problem was the configuration in my DNS server. In my certificate, I was using the domain starting with www, but on my DNS server I didn't have the CNAME to support www.sub:
# A DOMAIN-IP
www CNAME example.com
sub A SUBDOMAIN-IP
www.sub CNAME sub.example.com
After adding that configuration (the CNAME for www.sub), I had to wait about 5 hours (it could take more).
I had to redeploy everything from the beginning, and finally the ERR_SSL_VERSION_OR_CIPHER_MISMATCH issue did not come back.
I have a couple of questions
When we make changes to ingress resource, are there any cases where we have to delete the resource and re-create it again or is kubectl apply -f <file_name> sufficient?
When I add the host attribute without www i.e. (my-domain.in), I am not able to access my application but with www i.e. (www.my-domain.in) it works, what's the difference?
Below is my ingress resource
When I have the host set to my-domain.in, I am unable to access my application, but when I set the host to www.my-domain.in I can access it.
My domain is on a different provider, and I have added a CNAME (www) pointing to the DNS name of my ALB.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: eks-learning-ingress
  namespace: production
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/certificate-arn: arn:aws:a982529496:cerd878ef678df
  labels:
    app: eks-learning-ingress
spec:
  rules:
  - host: my-domain.in   # does not work
    http:
      paths:
      - path: /*
        backend:
          serviceName: eks-learning-service
          servicePort: 80
First answering your question 1:
When we make changes to ingress resource, are there any cases where we have to delete the resource and re-create it again or is kubectl apply -f sufficient?
In theory, kubectl apply is sufficient and is the correct way: it will report either ingress unchanged or ingress configured.
Another valid, documented option is kubectl edit ingress INGRESS_NAME, which saves and applies the change when you close the editor, provided the result is valid.
I say "in theory" because bugs happen, so we can't fully rule out having to delete and re-create the resource, but a bug is the worst-case scenario.
Now the blurrier question 2:
When I add the host attribute without www i.e. (my-domain.in), I am not able to access my application but with www i.e. (www.my-domain.in) it works, what's the difference?
To troubleshoot it we need to isolate the processes, like in a chain we have to find which link is broken. One by one:
Endpoint > Domain Provider> Cloud Provider > Ingress > Service > Pod.
DNS Resolution (Domain Provider)
DNS Resolution (Cloud Provider)
Kubernetes Ingress (Ingress > Service > Pod)
DNS Resolution
Domain Provider:
To the Internet, who answers for my-domain.in is your Domain Provider.
What are the rules for my-domain.in and its subdomains (like www.my-domain.in or admin.my-domain.in)?
You said "domain is on a different provider and I have added CNAME (www) pointing to DNS name of my ALB."
Are my-domain.in and www.my-domain.in each being directed to the ALB address distinctly?
How does it handle subdomains? How is the request passed on to your cloud?
Cloud Provider:
Assume for now that the cloud provider is receiving the requests correctly and distinctly.
Does your ALB have generic or specific rules for subdomains or path requests?
Test with another host, a different VM with a web server.
Check ALB Troubleshooting Page
Kubernetes Ingress
Usually we would start troubleshooting from this part, but since you mentioned it works with www.my-domain.in, we can presume that your service, deployment and even ingress structure are working correctly.
You can check the Types of Ingress Docs to get a few examples of how it should work.
Bottom Line: I believe your DNS has a route for www.my-domain.in, but the root domain has no route to your cloud provider. That's why it only works when you enable the ingress for www.
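Once the root domain also resolves to the ALB, the ingress could serve both names. The following is only a sketch, assuming the DNS side is already fixed; many DNS providers do not allow a CNAME on the root (apex) name, so the root usually needs an A record or a provider-specific alias-style record instead. The other annotations and labels from the question carry over unchanged and are omitted here for brevity.

```yaml
# Sketch only: assumes both names already resolve to the ALB.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: eks-learning-ingress
  namespace: production
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
spec:
  rules:
  - host: my-domain.in              # root domain; needs its own DNS record
    http:
      paths:
      - path: /*
        backend:
          serviceName: eks-learning-service
          servicePort: 80
  - host: www.my-domain.in          # already reachable through the www CNAME
    http:
      paths:
      - path: /*
        backend:
          serviceName: eks-learning-service
          servicePort: 80
```

The ingress rule alone does not make the root domain reachable; the request still has to arrive at the ALB first, which is the DNS part of the chain described above.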