Istio traffic management with nginx-ingress working but only for port 80 - istio

I've seen something strange where I've been able to have an nginx-ingress with an injected sidecar (i.e. part of the mesh) successfully route traffic that it receives into a cluster based on a k8s ingress definition, and then apply Istio traffic routing to route traffic as desired internally, but this only works when the traffic is being sent to the k8s services via port 80, and only when that is a port that is NOT served by the associated k8s service. This tells me my success is likely some kind of hack.
I'm asking if anyone can point out where I'm going wrong and/or why this is working. (I need to use the nginx ingress here, I can't switch to using istio-ingressgateway for this.)
My configuration / ability to reproduce this is documented in full on this github project: https://github.com/bob-walters/nginx-istio which I've created to provide a way to repeat this setup.
My setup:
a standard Istio installation in a k8s cluster (docker desktop) with the namespaces configured to do automatic sidecar injection.
an nginx-ingress deployment (file) with injected istio sidecar.
configured the nginx-ingress with these values in order to ensure that the sidecar would not try to handle inbound traffic but should permit the outbound traffic to go to the sidecar:
podAnnotations:
traffic.sidecar.istio.io/includeInboundPorts: ""
traffic.sidecar.istio.io/excludeInboundPorts: "80,443"
A set of (demo) services based on podinfo representing the different services that I want to route between via Istio virtual services. Each serving on port 9898 with type: ClusterIP (i.e. only accessible via ingress)
A k8s Ingress definition (file) for the nginx-ingress which carries out the routing for some fictitious hostnames to the different podinfo deployments. The ingress includes the following specific annotations:
The annotation nginx.ingress.kubernetes.io/service-upstream: "true" is set in order to ensure that the nginx-ingress uses the cluster IP address, rather than individual pod IP addresses, when forwarding traffic.
The annotation nginx.ingress.kubernetes.io/upstream-vhost: nginx-cache-v2.whitelabel-dev.svc.cluster.local is NOT set. Many articles will indicate that you should typically set this in combination with the above, but setting this has the effect of altering the Host header to the value specified, and Istio routes based on the Host header, so setting this would require that all Istio routing rules be specified in terms of those hostnames and not the original hostnames. More details on this can be found at: https://github.com/kubernetes/ingress-nginx/issues/3171
Finally: a Virtual Service (file) for one of the hosts (same hostname given in the ingress definition) which is meant to apply once the traffic reaches the Istio cluster, and carries out routing based on a cookie header. (It's doing weighted service shifting with a cookie to pin user sessions.)
Here's the oddity:
The Istio traffic management seems to apply correctly if the target port of the ingress is 80. If it's 9898 (as you would expect because that is the service's available port), the Istio traffic management doesn't seem to apply at all.
This is what I'm seeing as I try varying the port numbers:
Target Port of Ingress Rule
K8s Service Port
Virtual service Port
Result
80
9898
not set
virtual service works as desired
9898
9898
not set
routes to K8s Service. Virtual service has no effect
8080
9898
not set
fails: timeout/502 while attempting to invoke service
9898
9898
9898
routes to K8s Service. Virtual service has no effect
443
9898
not set
fails: timeout/502 while attempting to invoke service
I'm really confused as to why this is not working with port 9898, but is working for port 80, especially given that K8s reports my ingress definition as invalid. My understanding of the routing is that the inbound traffic would go to the 'controller' container in the nginx-ingress service, bypassing the istio proxy as long as it comes in on ports 80 or 443. The outbound traffic should all be going through the proxy destined for the ClusterIP addresses of the k8s services, but with the 'Host' header still containing the original requested host. Thus Istio should be able to handle its routing responsibilities based on Host + Port, and does so, but only if I am routing into the mesh with port 80.
Any help greatly appreciated!

I struggled with this some more and eventually got it working.
There are some specific (non-intuitive) things that have to be correctly lined up for virtual services to work with traffic handled by an nginx-ingress. The details are at the README.md at https://github.com/bob-walters/nginx-istio

Related

How does GCP internal load balancer + GKE service work? (It works, but I do not know why)

E.g. an istio service
istio-ingressgateway LoadBalancer 10.103.19.83 10.160.32.41 15021:30943/TCP,80:32609/TCP,443:30341/TCP,3306:30682/TCP,15443:30302/TCP
Which resulted in a TCP internal load balancer. The front end is ports 15021, 80, 443, 3306, and 15443.
The backend is basically the instance group of the cluster.
How does the load balancer know 443 at the front end will forward to 30341 at backend? As far as I know, TCP load balancer is doing port forwarding? How/Where does the magic happening
The LoadBalancer Service type is an extension of the NodePort type, which is an extension of the ClusterIP type. A nodePort just opens up a port in the range 30000-32767 on each worker node and uses a label selector to identify which Pods to send the traffic to.
This means that internal clients call the Service by using the internal IP address of a node along with the TCP port specified by nodePort. The request is forwarded to one of the member Pods on the TCP port specified by the targetPort field.
Here’s an example
When a Service is created in kubernetes, a corresponding Endpoints object is created along with it. It also applies to LoadBalancer service type.
If you create a simple nginx deployment e.g. by running:
kubectl apply -f https://k8s.io/examples/application/deployment.yaml
and then expose it as a LoadBalancer service:
kubectl apply -f https://k8s.io/examples/application/deployment.yaml
apart from the service itself, you will also see the lb-nginx Endpoints object. You can inspect its details:
kubectl get ep lb-nginx -o yaml
As you can see it keeps track of all exposed pods (being part of a Deployment in this case) so that corresponding iptables rules, which are responsible for forwarding the traffic to a particular pod, can be up-to-date all the time, even if number of them or their ip chages.
You can e.g. scale your deployment to 5 replicas:
kubectl scale deployment nginx-deployment --replicas=5
and inspect the Endpoints object again:
kubectl get ep lb-nginx -o yaml
and you will see that right after your 5 pods are up and running it immediately gets updated as well.
As you can see in subsets section of the yaml:
subsets:
- addresses:
- ip: 10.12.0.3
nodeName: gke-gke-default-pool-75259266-oauz
targetRef:
kind: Pod
name: nginx-deployment-66b6c48dd5-dw9mt
namespace: default
resourceVersion: "22394113"
uid: 8d7e1d3e-64e2-4891-b567-61ee48f61ed1
apart from the ip address of the Pod it maintains information about the node on which it is running.
Let's go back for a moment to the Service:
kubectl get svc lb-nginx -o yaml
As you can see LoadBalancer service apart from its external IP address has its ClusterIP as every other Service (well, almost every as headless services don't have ClusterIP):
spec:
clusterIP: 10.16.6.236
clusterIPs:
- 10.16.6.236
externalTrafficPolicy: Cluster
ports:
- nodePort: 31935
port: 80
protocol: TCP
targetPort: 80
So as you can imagine this external IP is somehow mapped to the cluster ip so it route the traffic further to the respective endpoints in the cluster. How exactly this mapping is done doesn't really matter as it is done by the cloud provider and such implementation details are not part of publicly shared knowledge. The only thing you need to know is that when your cloud provider provisions an external load balancer to satisfy your request defined in a Service of LoadBalancer type, apart from creating an external load balancer it takes care of the mapping between this external IP and some standard port assigned to it and a kubernetes service which has all the information needed to route the traffic further to the respective pods. In case you wonder how exactly this is done on GCP side i.e. mapping/binding between the external (or internal) loadbalancer and kubernetes LoadBalancer service, I'm affraid such implementation details are not publicly revealed.

Egress Blocking Based on IP Address

We would like to use Istio for achieving blocking of egress access from applications and to have an allow-list/block-list of IP Addresses and CIDR blocks. Are there any solutions possible using Istio?
-Renjith
We would like to use Istio for achieving blocking of egress access from applications
I think you could use REGISTRY_ONLY outboundTrafficPolicy.mode for that.
Istio has an installation option, meshConfig.outboundTrafficPolicy.mode, that configures the sidecar handling of external services, that is, those services that are not defined in Istio’s internal service registry. If this option is set to ALLOW_ANY, the Istio proxy lets calls to unknown services pass through. If the option is set to REGISTRY_ONLY, then the Istio proxy blocks any host without an HTTP service or service entry defined within the mesh. ALLOW_ANY is the default value, allowing you to start evaluating Istio quickly, without controlling access to external services. You can then decide to configure access to external services later.
More about that here and here.
and to have an allow-list/block-list of IP Addresses and CIDR blocks.
AFAIK the only way to create an allow/block list in istio is with AuthorizationPolicy or EnvoyFilter.
I have found few examples where they used AuthorizationPolicy with egress gateway, for example here.
They just changed the AuthorizationPolicy label from app: istio-ingressgateway to app: istio-egressgateway.
spec:
selector:
matchLabels:
app: istio-egressgateway
I was looking for any example with ip/cidr, but I couldn't find anything, so I'm not sure if that's gonna work with the egress gateway.
Additional resources:
https://istio.io/latest/docs/tasks/security/authorization/authz-ingress/#ip-based-allow-list-and-deny-list
Istio authorization policy not applying on child gateway
https://istio.io/latest/docs/reference/config/security/authorization-policy/#Source
https://github.com/salrashid123/istio_helloworld#egress-rules

Istio ingress gateway : domain name and port forwarding

I have set up an Istio service mesh. It works fine as I want so far. From outside I can only access with the port number like http://www.mytest.com:41333. What do I have to do to forward 80 to 41333 so that I can access it with http://www.mytest.com
Here is my Gateway :
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
name: mytest-gateway
spec:
selector:
istio: ingressgateway # use istio default controller
servers:
- port:
number: 80
name: http
protocol: HTTP
hosts:
- "www.mytest.com"
Not sure what to do...
I assume your istio ingress gateway service type is NodePort, if you istio ingress gateway is NodePort then you have to use http://www.mytest.com:41333.
If you want to use http://www.mytest.com then you would have to change it to LoadBalancer.
You can check if your istio ingress gateway is NodePort with
kubectl get svc -n istio-system
And check istio ingress gateway type.
NodePort: Exposes the Service on each Node's IP at a static port (the NodePort). A ClusterIP Service, to which the NodePort Service routes, is automatically created. You'll be able to contact the NodePort Service, from outside the cluster, by requesting NodeIP:NodePort.
LoadBalancer: Exposes the Service externally using a cloud provider's load balancer. NodePort and ClusterIP Services, to which the external load balancer routes, are automatically created.
As mentioned in istio documentation
If the EXTERNAL-IP value is (or perpetually ), your environment does not provide an external load balancer for the ingress gateway. In this case, you can access the gateway using the service’s node port.
If you use cloud like aws you can configure Istio with AWS Load Balancer with appropriate annotations.
On cloud providers which support external load balancers, setting the type field to LoadBalancer provisions a load balancer for your Service. The actual creation of the load balancer happens asynchronously, and information about the provisioned balancer is published in the Service's .status.loadBalancer
If it´s on premise, like minikube, then you could take a look at metalLB
MetalLB is a load-balancer implementation for bare metal Kubernetes clusters, using standard routing protocols.
Kubernetes does not offer an implementation of network load-balancers (Services of type LoadBalancer) for bare metal clusters. The implementations of Network LB that Kubernetes does ship with are all glue code that calls out to various IaaS platforms (GCP, AWS, Azure…). If you’re not running on a supported IaaS platform (GCP, AWS, Azure…), LoadBalancers will remain in the “pending” state indefinitely when created.
Bare metal cluster operators are left with two lesser tools to bring user traffic into their clusters, “NodePort” and “externalIPs” services. Both of these options have significant downsides for production use, which makes bare metal clusters second class citizens in the Kubernetes ecosystem.
MetalLB aims to redress this imbalance by offering a Network LB implementation that integrates with standard network equipment, so that external services on bare metal clusters also “just work” as much as possible.
You can read more about it in below link:
https://medium.com/#emirmujic/istio-and-metallb-on-minikube-242281b1134b

How to expose a Kubernetes service on AWS using `service.spec.externalIPs` and not `--type=LoadBalancer`?

I've deployed a Kubernetes cluster on AWS using kops and I'm able to expose my pods using a service with --type=LoadBalancer:
kubectl run sample-nginx --image=nginx --replicas=2 --port=80
kubectl expose deployment sample-nginx --port=80 --type=LoadBalancer
However, I cannot get it to work by specifying service.spec.externalIPs with the public IP of my master node.
I've allowed ingress traffic the specified port and used https://kubernetes.io/docs/concepts/services-networking/service/#external-ips as documentation.
Can anyone clarify how to expose a service on AWS without using the cloud provider's native load balancer?
If you want to avoid using Loadbalancer then you case use NodePort type of service.
NodePort exposes service on each Node’s IP at a static port (the NodePort).
ClusterIP service that NodePort service routes is created along. You will be able to reach the NodePort service, from outside by requesting:
<NodeIP>:<NodePort>
That means that if you access any node with that port you will be able to reach your service. It worth to remember that NodePorts are high-numbered ports (30 000 - 32767)
Coming back specifically to AWS here is theirs official document how to expose a services along with NodePort explained.
Do note very important inforamation there about enabling the ports:
Note: Before you access NodeIP:NodePort from an outside cluster, you must enable the security group of the nodes to allow
incoming traffic through your service port.
Let me know if this helps.

how is cluster IP in kubernetes-aws configured?

I am very new to kubernetes and have just got a stock kubernetes v.1.3.5 cluster up on AWS using kube-up. So far, I have been playing around with kubernetes in understanding it's mechanics (nodes, pods, svc and stuff). Based on my initial (or maybe crude) understanding , I had few questions:
1) How does routing to cluster IP work here (i.e in kube-aws) ? I see that the services have IPs in the range 10.0.0.0/16. I did a deployment with rc=3 of stock nginx and then attached a service to it with Node Port exposed. All works great! I can connect to the service from my dev machine. This nginx service has a cluster IP of 10.0.33.71:1321. Now, if I ssh into one of the minions(or nodes or VMS) and do a "telnet 10.0.33.71 1321", it connects as expected. But I am clueless how this works, I couldn't find any routes related to 10.0.0.0/16 in the VPC setup by kubernetes. What exactly happens under the hood here that results in a successful connection for app like telnet? However, If I ssh into the master node and do "telnet 10.0.33.71 1321", it does not connect. Why does it fail to connect from master?
2) There is a cbr0 interface inside each node. Each minion node has cbr0 configured as 10.244.x.0/24 and master has cbr0 as 10.246.0.0/24.
I can ping to any of the 10.244.x.x pods from any of the nodes(including master). But I am not able to ping 10.246.0.1 (cbr0 inside master node) from any of the minion nodes. What could be happening here?
Here's the routes set up by kubernetes in aws. VPC.
Destination Target
172.20.0.0/16 local
0.0.0.0/0 igw-<hex value>
10.244.0.0/24 eni-<hex value> / i-<hex value>
10.244.1.0/24 eni-<hex value> / i-<hex value>
10.244.2.0/24 eni-<hex value> / i-<hex value>
10.244.3.0/24 eni-<hex value> / i-<hex value>
10.244.4.0/24 eni-<hex value> / i-<hex value>
10.246.0.0/24 eni-<hex value> / i-<hex value>
Mark Betz (SRE at Olark) presents Kubernetes networking in three articles:
pods
services:
ingress
For a pod, you are looking at:
You find:
etho0: a "physical network interface"
docker0/cbr0: a bridge for connecting two ethernet segments no matter their protocol.
veth0, 1, 2: Virtual Network Interface, one per container.
docker0 is the default Gateway of veth0. It is named cbr0 for "custom bridge".
Kubernetes starts containers by sharing the same veth0, which means each container must expose different ports.
pause: a special container started in "pause", to detect SIGTERM sent to a pod, and forward it to the containers.
node: an host
cluster: a group of nodes
router/gateway
The last element is where things start to be more complex:
Kubernetes assigns an overall address space for the bridges on each node, and then assigns the bridges addresses within that space, based on the node the bridge is built on.
Secondly, it adds routing rules to the gateway at 10.100.0.1 telling it how packets destined for each bridge should be routed, i.e. which node’s eth0 the bridge can be reached through.
Such a combination of virtual network interfaces, bridges, and routing rules is usually called an overlay network.
When a pod contacts another pod, it goes through a service.
Why?
Pod networking in a cluster is neat stuff, but by itself it is insufficient to enable the creation of durable systems. That’s because pods in Kubernetes are ephemeral.
You can use a pod IP address as an endpoint but there is no guarantee that the address won’t change the next time the pod is recreated, which might happen for any number of reasons.
That means: you need a reverse-proxy/dynamic load-balancer. And it better be resilient.
A service is a type of kubernetes resource that causes a proxy to be configured to forward requests to a set of pods.
The set of pods that will receive traffic is determined by the selector, which matches labels assigned to the pods when they were created
That service uses its own network. By default, its type is "ClusterIP"; it has its own IP.
Here is the communication path between two pods:
It uses a kube-proxy.
This proxy uses itself a netfilter.
netfilter is a rules-based packet processing engine.
It runs in kernel space and gets a look at every packet at various points in its life cycle.
It matches packets against rules and when it finds a rule that matches it takes the specified action.
Among the many actions it can take is redirecting the packet to another destination.
In this mode, kube-proxy:
opens a port (10400 in the example above) on the local host interface to listen for requests to the test-service,
inserts netfilter rules to reroute packets destined for the service IP to its own port, and
forwards those requests to a pod on port 8080.
That is how a request to 10.3.241.152:80 magically becomes a request to 10.0.2.2:8080.
Given the capabilities of netfilter all that’s required to make this all work for any service is for kube-proxy to open a port and insert the correct netfilter rules for that service, which it does in response to notifications from the master api server of changes in the cluster.
But:
There’s one more little twist to the tale.
I mentioned above that user space proxying is expensive due to marshaling packets.
In kubernetes 1.2, kube-proxy gained the ability to run in iptables mode.
In this mode, kube-proxy mostly ceases to be a proxy for inter-cluster connections, and instead delegates to netfilter the work of detecting packets bound for service IPs and redirecting them to pods, all of which happens in kernel space.
In this mode kube-proxy’s job is more or less limited to keeping netfilter rules in sync.
The network schema becomes:
However, this is not a good fit for external (public facing) communication, which needs an external fixed IP.
You have dedicated services for that: nodePort and LoadBalancer:
A service of type NodePort is a ClusterIP service with an additional capability: it is reachable at the IP address of the node as well as at the assigned cluster IP on the services network.
The way this is accomplished is pretty straightforward:
When kubernetes creates a NodePort service, kube-proxy allocates a port in the range 30000–32767 and opens this port on the eth0 interface of every node (thus the name “NodePort”).
Connections to this port are forwarded to the service’s cluster IP.
You get:
A Loadalancer is more advancer, and allows to expose services using stand ports.
See the mapping here:
$ kubectl get svc service-test
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
openvpn 10.3.241.52 35.184.97.156 80:32213/TCP 5m
However:
Services of type LoadBalancer have some limitations.
You cannot configure the lb to terminate https traffic.
You can’t do virtual hosts or path-based routing, so you can’t use a single load balancer to proxy to multiple services in any practically useful way.
These limitations led to the addition in version 1.2 of a separate kubernetes resource for configuring load balancers, called an Ingress.
The Ingress API supports TLS termination, virtual hosts, and path-based routing. It can easily set up a load balancer to handle multiple backend services.
The implementation follows a basic kubernetes pattern: a resource type and a controller to manage that type.
The resource in this case is an Ingress, which comprises a request for networking resources
For instance:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: test-ingress
annotations:
kubernetes.io/ingress.class: "gce"
spec:
tls:
- secretName: my-ssl-secret
rules:
- host: testhost.com
http:
paths:
- path: /*
backend:
serviceName: service-test
servicePort: 80
The ingress controller is responsible for satisfying this request by driving resources in the environment to the necessary state.
When using an Ingress you create your services as type NodePort and let the ingress controller figure out how to get traffic to the nodes.
There are ingress controller implementations for GCE load balancers, AWS elastic load balancers, and for popular proxies such as NGiNX and HAproxy.