Istio 0.8. Details about LoadBalancerSettings.ConsistentHashLB

Could someone provide more details or examples about how this load balancing algorithm works?
https://istio.io/docs/reference/config/istio.networking.v1alpha3/#LoadBalancerSettings.ConsistentHashLB
Consistent hashing (ketama hash) based load balancer for even load
distribution/redistribution when the connection pool changes. This
load balancing policy is applicable only for HTTP-based connections. A
user specified HTTP header is used as the key with xxHash hashing.

The LoadBalancerSettings.ConsistentHashLB setting translates to an Envoy configuration, and there's more detail in Envoy's Load Balancing docs:
The ring/modulo hash load balancer implements consistent hashing to upstream hosts. The algorithm is based on mapping all hosts onto a circle such that the addition or removal of a host from the host set changes only affect 1/N requests. This technique is also commonly known as “ketama” hashing.
It's a hash algorithm that decreases the impact of servers being added to or removed from Envoy's balancing pool (e.g. the servers behind a VirtualService).
Without such an algorithm, adding a single server to the pool causes hashes to map to different servers:
We wrote ketama to replace how our memcached clients mapped keys to servers...whenever we added or removed servers from the pool, everything
hashed to different servers, which effectively wiped the entire cache.
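To make the "everything moves" problem concrete, here is a small, self-contained sketch (plain Python, nothing Istio- or Envoy-specific; md5 stands in for the xxHash/ketama hashing mentioned above) contrasting naive modulo hashing with a hash ring when one server is added to a pool of three:

import hashlib
from bisect import bisect

def h(key: str) -> int:
    # Stable integer hash; md5 is used here purely for illustration.
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

def modulo_assign(keys, servers):
    # Naive scheme: key -> servers[hash % N]; most keys move whenever N changes.
    return {k: servers[h(k) % len(servers)] for k in keys}

def ring_assign(keys, servers, vnodes=100):
    # Consistent hashing: place each server at many points on a circle and send
    # each key to the first server point clockwise of the key's own hash.
    ring = sorted((h(f"{s}#{i}"), s) for s in servers for i in range(vnodes))
    points = [p for p, _ in ring]
    return {k: ring[bisect(points, h(k)) % len(ring)][1] for k in keys}

keys = [f"user-{i}" for i in range(10000)]
before, after = ["s1", "s2", "s3"], ["s1", "s2", "s3", "s4"]

for name, assign in [("modulo", modulo_assign), ("ring", ring_assign)]:
    a, b = assign(keys, before), assign(keys, after)
    moved = sum(a[k] != b[k] for k in keys)
    print(f"{name}: {moved / len(keys):.0%} of keys changed server after adding one host")
# Typical result: modulo remaps roughly 75% of keys, the ring only about 25% (roughly 1/N).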
Back to Istio - Envoy's docs again note:
A consistent hashing load balancer is only effective when protocol routing is used that specifies a value to hash on.
which means: specify a header to generate the hash from. From the Istio docs:
httpHeader | string | REQUIRED. The name of the HTTP request header that will be used to obtain the hash key. If the request header is not present, the load balancer will use a random number as the hash, effectively making the load balancing policy random.
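Putting it together, a minimal DestinationRule sketch (the service and header names are just examples; the field name follows the 0.8 / v1alpha3 docs quoted above, later releases renamed it to httpHeaderName):

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: reviews-consistent-hash
spec:
  host: reviews                 # example destination service
  trafficPolicy:
    loadBalancer:
      consistentHash:
        httpHeader: x-user-id   # requests carrying the same header value hash to the same pod

Requests with the same x-user-id keep landing on the same upstream pod, and when a pod is added or removed only roughly 1/N of the keys move.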

Related

Using Istio circuit breakers in production

I'd like to start using the http2MaxRequests circuit breaker in my Kubernetes based Istio service mesh, but the semantics are befuddling to me:
on the destination side, it does what I would expect: each Envoy will accept at most the specified number of concurrent requests, and will then reject new requests. Importantly, the configuration is based on the concurrency supported by each instance of my application server; when I scale the number of instances up or down, I don't need to change this configuration.
on the source side, things are a lot more messy. The http2MaxRequests setting specifies the number of concurrent requests from each source Envoy to the entire set of destination Envoys. I simply don't understand how this can possibly work in production!
First off, different source applications have wildly different concurrency models; we have an Nginx proxy that supports a massive amount of concurrent requests per instance, and we have internal services that support a handful. Those source applications will themselves be scaled very differently, with e.g. Nginx having very few instances that make lots of requests. I would need a DestinationRule for each source application pertaining to each destination service.
Even then, the http2MaxRequests setting limits the concurrent requests to the entire destination service, not to each individual instance (endpoint). This fundamentally breaks the notion of autoscaling, the entire point of which is to vary the capacity as needed.
What I would love to see is a way to enforce at the source side a destination-instance concurrency limit; that is, to tie the limit not to the entire Envoy cluster, but to each Envoy endpoint instead. That would make it actually useful for me, as I could then base the configuration on the concurrency supported by each instance of my service, which is a constant factor.
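For reference, this is the shape of DestinationRule I'm talking about (the host name is a placeholder; http2MaxRequests is the setting in question):

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: my-service-circuit-breaker
spec:
  host: my-service.default.svc.cluster.local   # placeholder destination service
  trafficPolicy:
    connectionPool:
      http:
        http2MaxRequests: 100   # destination side: cap per Envoy instance; source side: cap from each source Envoy to the whole destination service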

Traefik Best Practices/Capabilities For Dynamic Vanity Domain Certificates

I'm looking for guidance on the proper tools/tech to accomplish what I assume is a fairly common need.
If there exists a web service, https://www.ExampleSaasWebService.com/, and customers can add vanity domains/subdomains to white-label or resell the service and replace the domain name with their own, there needs to be a reverse proxy to terminate the vanity domains' TLS traffic and route it to the statically defined (HTTPS) back-end service on the non-vanity original domain (there is essentially one "back-end" server somewhere else on the internet, not on the local network, that accepts all incoming traffic no matter the incoming domain). Essentially:
"Customer A" could setup an A/CNAME record to VanityProxy.ExampleSaasWebService.com (the host running Traefik) from example.customerA.com.
"Customer B" could setup an A/CNAME record to VanityProxy.ExampleSaasWebService.com (the host running Traefik) from customerB.com and www.customerB.com.
etc...
I (surprisingly) haven't found anything that does this out of the box, but looking at Traefik (2.x) I'm seeing some promising capabilities and it seems like the most capable tool to accomplish this. Primarily because of the Let's Encrypt integration and the ability to reconfigure without a restart of the service.
I initially considered AWS's native certificate management and load balancing, but I see there is a limit of ~25 certificates per load balancer which seems like a non-starter. Presumably there could be thousands of vanity domains in place at any time.
Some of my Traefik specific questions:
Am I correct in understanding that you can get away without explicitly listing every vanity domain to produce TLS certificates for in the config files? Can they be determined on the fly and provisioned from Let's Encrypt based on the SNI/headers of the incoming requests?
E.g. If a request comes to www.customerZ.com and there is not yet a certificate for that domain name, one can be generated on the fly?
I found this note on the OnDemand flag in the v1.6 docs, but I'm struggling to find the equivalent documentation in the (2.x) docs.
Using AWS services, how can I easily share "state" (config/dynamic certificates that have already been created) between multiple servers to share the load? My initial thought was EFS, but I see a shared EFS file system may not work because file-change watch notifications don't work on NFS-mounted file systems?
It seemed like it would make sense to provision an AWS NLB (with a static IP and an associated DNS record) that delivered requests to a fleet of 1 or more of these Traefik proxies with a universal configuration/state that was safely persisted and kept in sync.
Like I mentioned above, this seems like a common/generic need. Is there a configuration file sample or project that might be a good starting point that I overlooked? I'm brand new to Traefik.
When routing requests to the back-end service, will the original Host name still be identifiable somewhere in the headers? I assume it can't remain in the Host header, as the back-end receives requests on an HTTPS hostname as well.
I will continue to experiment and post any findings back here, but I'm sure someone has setup something like this already -- so just looking to not reinvent the wheel.
I managed to do this with Caddy. It's very important that you configure the ask, interval and burst options to avoid possible DDoS attacks.
Here's a simple reverse proxy example:
# https://caddyserver.com/docs/caddyfile/options#on-demand-tls
{
    # General Options
    debug
    on_demand_tls {
        # Caddy calls this with "?domain=<name>"; return 200 if the domain is allowed to request a TLS certificate
        ask "http://localhost:5000/ask/"
        interval 300s
        burst 1
    }
}

# TODO: use env vars for domain name? https://caddyserver.com/docs/caddyfile-tutorial#environment-variables
qrepes.app {
    reverse_proxy localhost:5000
}

:443 {
    reverse_proxy localhost:5000
    tls {
        on_demand
    }
}
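The ask URL just has to answer with HTTP 200 for domains you are willing to issue certificates for, and anything else otherwise. A minimal sketch of such an endpoint (Flask; the hard-coded allow list stands in for a lookup against your customer/vanity-domain database, and it listens on the same localhost:5000 the Caddyfile above proxies to):

# Caddy calls GET /ask/?domain=<name> and only proceeds with on-demand TLS on HTTP 200.
from flask import Flask, request

app = Flask(__name__)

# Stand-in for a real lookup against the customer/vanity-domain database.
ALLOWED_DOMAINS = {"example.customerA.com", "customerB.com", "www.customerB.com"}

@app.route("/ask/")
def ask():
    domain = request.args.get("domain", "")
    if domain in ALLOWED_DOMAINS:
        return "ok", 200       # Caddy may request a certificate for this domain
    return "denied", 404       # any non-200 response blocks issuance

if __name__ == "__main__":
    app.run(port=5000)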

Openshift roundrobin request across all pods

We want round-robin of requests across all pods deployed in OpenShift.
I have configured the below annotations in the Route config, but the sequence of calls across the pods is random:
haproxy.router.openshift.io/balance: roundrobin
haproxy.router.openshift.io/disable_cookies: 'true'
We have spun up 3 pods. We want requests to follow the sequence
pod1,pod2,pod3,pod1,pod2,pod3,pod1....
But the real behaviour after setting the above annotations is random, like:
pod1,pod1,pod2,pod2,pod3,pod1,pod2,pod2.... which is incorrect.
Do we need any additional OpenShift configuration to make it a perfect round-robin?
If you want to access pod1, pod2, pod3 in order, then you should use leastconn on the same pod group.
leastconn: The server with the lowest number of connections receives the connection. Round-robin is performed within groups of servers of the same load to ensure that all servers will be used. Use of this algorithm is recommended where very long sessions are expected, such as LDAP, SQL, TSE, etc., but it is not very well suited for protocols using short sessions such as HTTP. This algorithm is dynamic, which means that server weights may be adjusted on the fly for slow starts for instance.
HAProxy's roundrobin would distribute the requests equally, but it might not preserve the server access order within the group.
roundrobin: Each server is used in turns, according to their weights. This is the smoothest and fairest algorithm when the server's processing time remains equally distributed. This algorithm is dynamic, which means that server weights may be adjusted on the fly for slow starts for instance. It is limited by design to 4095 active servers per backend. Note that in some large farms, when a server becomes up after having been down for a very short time, it may sometimes take a few hundred requests for it to be re-integrated into the farm and start receiving traffic. This is normal, though very rare. It is indicated here in case you would have the chance to observe it, so that you don't worry.
Refer to the HAProxy balance (algorithm) documentation for details of the balance algorithm options.
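A sketch of the Route with that change (the route and service names are placeholders):

apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: my-app                                      # placeholder route name
  annotations:
    haproxy.router.openshift.io/balance: leastconn
    haproxy.router.openshift.io/disable_cookies: 'true'
spec:
  to:
    kind: Service
    name: my-app                                    # placeholder backing service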

designing for failure when polling/retrying/queueing is not possible

I'm designing a website/web service to be hosted in the cloud (specifically AWS although that's mostly irrelevant), and I'm spending a lot of time thinking about "designing for failure". I want my system to seamlessly handle node failures, i.e. without any significant user impact or engineer intervention.
In most cases, it's easy to see how to handle sudden node failure. If my app has an API handled by 4 servers behind a load balancer, polled by AJAX or an iPhone app, the poller can simply detect the failed TCP/IP transmission and retry... assuming the load balancer behaves correctly, it will hit a healthy instance.
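Roughly the retry loop I have in mind on the client side (just a sketch; the endpoint is hypothetical and the requests library stands in for whatever the AJAX or iPhone client actually uses):

# Client-side retry sketch: if one request fails (e.g. a node died mid-transmission),
# retry with backoff and let the load balancer route the retry to a healthy instance.
import time
import requests

def fetch_with_retry(url, attempts=3, backoff=0.5):
    for i in range(attempts):
        try:
            resp = requests.get(url, timeout=5)
            resp.raise_for_status()
            return resp.json()
        except requests.RequestException:
            if i == attempts - 1:
                raise                      # out of retries, surface the failure
            time.sleep(backoff * 2 ** i)   # exponential backoff before the next attempt

# data = fetch_with_retry("https://api.example.com/items")   # hypothetical endpoint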
If the app is more processing-oriented, a queue service like SQS can be used to allow stateless nodes to pick up where the failed nodes left off.
The difficulty I see is with "points of entry", where no retry/polling is possible because the application hasn't been loaded yet, and where a failure means the app never starts. For example, the index.html on a webpage... if a node fails while transmitting that file, the user's browser will likely hang and not automatically retry (they will need to refresh).
The Load Balancer is also a single "point of entry/failure". However, in this case it appears we can solve the problem by creating multiple Load Balancers, and load balancing them using DNS Load Balancing as described here: http://blog.rightscale.com/2012/10/23/dns-load-balancing-and-using-multiple-load-balancers-in-the-cloud/
Is this a solution that would work for the simpler index.html case? Overall, how can we create redundancy where polling/retrying/queuing is not possible?
EDIT: Another idea is to have the index.html hosted statically on a CDN, S3, etc (where resource availability is more dependable), although that prevents using dynamic content. Dynamic content could be added if the page populates itself using JS, but that adds a dependency on JS as well as latency for the user.

Web service provider routing

I am looking to implement a service (web/windows, .net) that maintains a list of available services and can provide an endpoint based on the nature or type of request. The requester can then pass the actual work request to the provided endpoint. The actual work requests can contain very large chunks (from 10MB up to and possibly exceeding a GB) of data.
The WCF routing service sounds like a perfect fit, but turns out not to be, because it requires the actual work request to pass through it, creating a bottleneck at the routing service (the whole point is to get a system that is able to scale out). If I had smaller messages, WCF routing would be a no-brainer.
Is there anything out there that fits the bill? Preferably .NET/windows based?
Do you mean because the requests block for work?
You could use a OneWay OperationContract to create async services so as not to block the request pool.
using System.ServiceModel;

[ServiceContract]
interface IMyContract
{
    // IsOneWay = true: the client does not wait for a reply, so the request pool is not blocked.
    [OperationContract(IsOneWay = true)]
    void DoWork();
}
Update
I think I understand your question better now: you are looking to distribute load to different servers to avoid request bottlenecks due to heavy traffic load (preferably distributed based on content).
I'd say that WCF Routing is indeed ideal for this. One of the features that you can leverage is the fail-over functionality. You can actually define multiple backup endpoints, and in the case where one fails, it will automatically move over to the next. There's a good introduction to how this works here.
There's also a good article here that talks about load balancing with WCF using the same principles. It provides two solutions for a round-robin filter implementation that allows you to load balance the service requests (even though at the beginning he says his general answer to whether it supports load balancing is no, for implementation reasons).
If you are worried about all requests routing via the one server and it still becoming a bottleneck, then think of web load balancers. It's the same scenario. Sitting in the middle forwarding packets doesn't require much work, and they have no problem handling huge volumes of traffic. I don't think this is an issue.