I have a 3-node Kafka cluster (ZooKeeper is also installed on the same 3 nodes). I'm not sure whether I should deploy an AWS NLB in front of my brokers. I have 3 producers; even if they are spread evenly across all 3 brokers, the clients still decide which partition (and therefore which broker) each message goes to. I don't know what benefit I would get from an AWS NLB, or what its drawbacks are.
I also researched this and didn't find much help out there. I ended up putting an NLB, with a TCP target group, in front of my brokers, and here's why:
It saves some DNS headaches. I have a CNAME pointing at the NLB's A record, and that's what I use for my bootstrap server value. I can scale horizontally seamlessly by just adding the new broker to the NLB target group (via CloudFormation). I'm not tied down to any IPs in our AWS environment now because of the DNS records. I also use a Route 53 private zone for the ZooKeeper nodes, so the brokers only point to the overall A record that all those nodes share.
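For illustration, here's a minimal producer sketch using the kafka-python client; the CNAME and topic name are hypothetical stand-ins for your own. Note that the bootstrap address is only used for the initial metadata request; after that, clients connect directly to whatever listeners the brokers advertise.

```python
# Minimal sketch, assuming a hypothetical CNAME (kafka.internal.example.com)
# that resolves to the NLB, and the default Kafka port 9092.
from kafka import KafkaProducer

producer = KafkaProducer(bootstrap_servers="kafka.internal.example.com:9092")

# Bootstrap is only the initial discovery step; subsequent connections go
# straight to each broker's advertised listener.
producer.send("example-topic", b"hello")
producer.flush()
```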
It's easy to monitor broker health with the built-in CloudWatch monitoring.
I read about the benefit of SSL offloading with an ELB, but I don't really consider that a benefit, because the client-to-broker comms would still be non-SSL. I'm not doing that, but I thought I'd list it.
I haven't done any benchmark testing with the NLB yet, but I'm not too concerned. IMO, the simplified DNS made it worth it.
Cheers
EDIT: Proxy protocol will not work with Kafka, so if you want the ability to restrict traffic by source IP in your security groups, you'll have to use target type 'instance' rather than 'ip' for your NLB target group targets.
https://aws.amazon.com/premiumsupport/knowledge-center/security-group-load-balancer/
A lesson learned using the NLB name from a target:
https://aws.amazon.com/premiumsupport/knowledge-center/target-connection-fails-load-balancer/
For this issue, I just switched my --bootstrap-server to 'localhost' when running on any broker target.
We are looking to move our blog platform to a separate EC2 server (running Nginx) for better performance and scalability.
Scenario is:
Web request (www.example.com) -> Load Balancer/Route -> Current EC2 Server
Blog request (www.example.com/blog) -> Load Balancer/Route -> New Separate EC2 Server for blog
Please help me decide which is the best option to use in this case:
HAProxy
ALB - AWS
Any other solution?
Also, is it possible to have the load balancer or routing mechanism in a different AWS region? We are currently hosted in AWS.
HAProxy
You would have to set this up on an EC2 server and manage everything yourself. You would be responsible for scaling this correctly to handle all the traffic it gets. You would be responsible for deploying it to multiple availability zones to provide high availability. You would be responsible for installing all security updates on the operating system.
ALB - AWS
Amazon will automatically scale this out to handle any amount of traffic you get. Amazon will handle all security patches of the underlying system. Amazon provides free SSL certificates for ALBs. Amazon will deploy this automatically across multiple availability zones to provide high availability.
Any other solution?
I think AWS Global Accelerator would work here as well, but you would have to weigh the differences between Global Accelerator and ALB to decide which fits your use case and budget the best.
You could also look at placing a CDN in front of everything, like CloudFront or Cloudflare.
Also, is it possible to have the load balancer or routing mechanism in a different AWS region?
AWS Global Accelerator would be the thing to look at if load balancing across regions is a concern for you. Given the details you have provided, however, I'm not sure why you would want this.
Probably what you really need is a CDN in front of your websites, with or without the ALB.
Scenario is:
Web request (www.example.com) -> Load Balancer/Route -> Current EC2 Server
Blog request (www.example.com/blog) -> Load Balancer/Route -> New Separate EC2 Server for blog
In my view you can use an ALB deployed across multiple AZs for high availability, for the following reasons:
An AWS ALB allows you to route traffic based on various attributes, and the path in the URL is one of them:
https://docs.aws.amazon.com/elasticloadbalancing/latest/application/load-balancer-listeners.html#rule-condition-types
With an AWS ALB you can have two target groups of instances handling traffic: one for the first path (www.example.com) and a second target group for the other path (www.example.com/blog); a boto3 sketch of such a rule follows this list.
ALB also supports SNI (which allows a single ALB to serve multiple certificates for multiple domains), so all you need to do is set up a single HTTPS listener and upload your certificates: https://aws.amazon.com/blogs/aws/new-application-load-balancer-sni/
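As a rough illustration of the path-based rule described above, here's a boto3 sketch; the listener and target group ARNs are placeholders:

```python
# Sketch of path-based routing with boto3; all ARNs are placeholders.
import boto3

elbv2 = boto3.client("elbv2")

# Send /blog* to the blog target group; everything else falls through to the
# listener's default action (the main site's target group).
elbv2.create_rule(
    ListenerArn="arn:aws:elasticloadbalancing:...:listener/app/my-alb/...",
    Priority=10,
    Conditions=[{"Field": "path-pattern", "Values": ["/blog*"]}],
    Actions=[{
        "Type": "forward",
        "TargetGroupArn": "arn:aws:elasticloadbalancing:...:targetgroup/blog/...",
    }],
)
```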
I have answered [something similar]; it might help you too.
This is my opinion; take it as that. I am sure a lot of people won't agree.
If your project is small or personal, you can go with HAProxy (cheap: USD 4 or less if you get a t3a as a spot instance), or free if you run it inside one of your existing EC2 instances, maybe using Docker.
If your project is not personal or not small, go with ALB (expensive, but simpler and better integrated with other AWS services).
HAProxy can handle tons of connections, but you have to do more things by yourself. ALB can also handle tons of connections and AWS will do most of the work.
I think HAProxy is more suitable for personal/small projects, because if your project doesn't grow, then you don't have to touch HAProxy. It is set-and-forget, the same as an ALB, but costs less.
You usually won't worry about Availability Zones or disaster tolerance in a personal project, so HAProxy should be easy to configure.
Another consideration: AWS offers a free tier on ALB, so if your project will run for less than a year, ALB is the way to go.
If you are learning, then the ALB should be considered, because real clients usually prefer to stick with AWS for everything; HAProxy is your call and also your risk (you'd just be reducing cost for a company that usually pays a lot more for your salary, so it's not worth the risk).
I set up a Fargate cluster on AWS. My cluster has the following services:
server-A (port 3000)
server-B (port 4000)
Each service is in the same VPC and has the same security group (any ports, any source, any destination). The VPC is isolated from the internet.
Now, I want server-A to send an HTTP query to server-B. I would assume that, as in Docker Swarm, there is a private DNS that maps each service name to its private IP, and that it would be as simple as sending the query to http://server-B:4000. However, server-A gets a timeout, which means it can't reach server-B.
I've read in the documentation that I can put the 2 containers in the same service, each container listening on a different port, so that, thanks to the loopback interface, server-A could query http://127.0.0.1:4000 and server-B would respond, and vice versa.
However, I want to be able to scale server-A and server-B independently, so I think it makes sense to keep each server independent of the other by having 2 services.
I've read that, for 2 tasks to talk to each other, I need to set up a load balancer. Coming from the world of Docker Swarm, it was so easy to query the services by their service name, and behind the scenes the request was forwarded to one of the containers in that service. But it doesn't seem to work like that on AWS Fargate.
Questions:
How can server-A talk to server-B?
As services sometimes redeploy, their private IPs change, so it makes no sense to query by IP; querying by hostname seems the most natural way.
Do I need to set up any kind of internal DNS?
Thanks for your help; I am really lost on this simple setup.
After searching, I found out it was because I had not enabled "Service Discovery" during service creation, so no private DNS was created. Here is some additional documentation that explains the exact steps:
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/create-service-discovery.html
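For reference, here's a rough boto3 sketch of the same steps; all names, IDs, and ARNs below are placeholders:

```python
# Rough sketch of wiring up ECS Service Discovery with boto3.
import boto3

sd = boto3.client("servicediscovery")
ecs = boto3.client("ecs")

# 1. Create a private DNS namespace in the VPC (e.g. "local").
#    This call is asynchronous; you look up the namespace ID afterwards.
sd.create_private_dns_namespace(Name="local", Vpc="vpc-0123456789abcdef0")

# 2. Create a discovery service for server-B; this is what creates the
#    private DNS records (server-b.local).
registry = sd.create_service(
    Name="server-b",
    DnsConfig={
        "NamespaceId": "ns-example",  # placeholder for the namespace ID
        "DnsRecords": [{"Type": "A", "TTL": 10}],
    },
)

# 3. Attach the registry to the ECS service at creation time.
ecs.create_service(
    cluster="my-cluster",
    serviceName="server-b",
    taskDefinition="server-b:1",
    desiredCount=2,
    launchType="FARGATE",
    serviceRegistries=[{"registryArn": registry["Service"]["Arn"]}],
    networkConfiguration={
        "awsvpcConfiguration": {
            "subnets": ["subnet-0123456789abcdef0"],
            "securityGroups": ["sg-0123456789abcdef0"],
        }
    },
)
```

With that in place, server-A should be able to reach server-B at http://server-b.local:4000.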
I want to know the major differences between the Amazon Application Load Balancer (ALB) and the Classic Load Balancer (CLB). I have searched for this, but the results only gave examples, such as that a Classic Load Balancer serves only the same content, while an Application Load Balancer can serve different content grouped by target groups.
The ALB has some features (e.g., host-based routing, path-based routing), but my question is why we would use an ALB instead of a Classic Load Balancer. Please provide use cases for both.
ALB
The Application Load Balancer (Elastic Load Balancer v2) lets you direct traffic to EC2 instances based on more complex criteria/rules, specifically URL paths. You can have users trying to access "/signup" go to one group of instances and users trying to access "/homepage" go to another set of instances. In comparison to the classic ELB, an ALB inspects traffic at the application level (OSI layer 7). It's at this level that URL paths can be parsed, for example.
ELB/Classic
The Classic Load Balancer routes traffic uniformly using information at the TCP level (OSI layer 4, Transport). It will either send requests to each instance "round-robin" style, or utilize sticky sessions and send each user/client to the same instance they initially landed on.
Why ALB over ELB?
You could use an ALB if you decided to architect your system in such a way that each path had its own set of instances or its own service. So /signup, /login, /info, etc. all go through one load balancer that is pinned to your domain name https://mysite.com, but a different set of EC2 instances services each path. ALBs only support HTTP/HTTPS. If your system uses another protocol, you would have to use an ELB/Classic Load Balancer. WebSockets and HTTP/2 are currently only supported on the ALB.
Another reason you might choose the ALB over the ELB is that there are some newer features that have not yet been added to the ELB, or may never be added. As Michael points out below, AWS WAF is not supported on the Classic Load Balancer but is on the ALB. I expand on other features further down.
Why ELB over ALB?
Architecturally speaking, it's much simpler to send every request to the same set of instances and then internally, within your application, delegate requests for certain paths to certain functions/classes/methods, etc. This is essentially the monolith design most applications start out as. For most workloads, dedicating traffic to certain instance groups (the ALB approach) would be a waste of EC2 power. You would have some instances doing lots of work and others barely used.
Similar to the ALB, there are features of the classic ELB that have not yet arrived on the ALB. I expand on that below.
Update - More on Feature Differences
From a product perspective, they differ in other ways that aren't really related to how they operate; rather, some features are present on one but not yet on the other.
HTTP to HTTPS Redirection - For example, in an ALB each target group (the group of instances you're assigning a specific route) can currently only handle one protocol, so implementing HTTP to HTTPS redirects requires a minimum of two instances. With the ELB you can handle HTTP to HTTPS redirection on a single instance. I imagine the ALB will have this feature soon.
https://forums.aws.amazon.com/thread.jspa?threadID=247546
Multiple SSL Certificates on One Load Balancer - With an ALB you can assign it multiple SSL certificates for different domains. This is not possible on a Classic Load Balancer, though the feature has been requested. For a Classic Load Balancer you can use a wildcard certificate, but that is not the same thing. An ALB makes use of SNI (Server Name Indication) to make this possible, whereas this has not been added to the classic ELB feature set.
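For example, attaching extra certificates to an existing ALB HTTPS listener is a single API call; here's a sketch with placeholder ARNs:

```python
# Sketch: attaching additional certificates to an ALB HTTPS listener
# (SNI picks the right one per hostname); all ARNs are placeholders.
import boto3

elbv2 = boto3.client("elbv2")
elbv2.add_listener_certificates(
    ListenerArn="arn:aws:elasticloadbalancing:...:listener/app/my-alb/...",
    Certificates=[
        {"CertificateArn": "arn:aws:acm:...:certificate/site-a"},
        {"CertificateArn": "arn:aws:acm:...:certificate/site-b"},
    ],
)
```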
We're using Redis to collect events from our web application (pub/sub based) behind AWS ELB.
We're looking for a solution that will allow us to scale up and provide high availability across the different servers. We do not wish to have these two servers in a Redis cluster; our plan is to monitor them using CloudWatch and switch between them if necessary.
We tried a simple test of placing two Redis servers behind the ELB, telnetting to the ELB DNS name, and watching what happens with 'redis-cli monitor', but we see nothing. (When trying the same without the ELB, it works fine.)
Any suggestions?
Thanks
I came across this while looking for a similar question, but disagree with the accepted answer. Even though this is pretty old, hopefully it will help someone in the future.
It's more appropriate for your question here to use DNS failover with a Redis replication auto-failover configuration. DNS failover provides groups of availability (if you need that level of scale), and the replication group provides cache uptime.
http://docs.aws.amazon.com/Route53/latest/DeveloperGuide/dns-failover-configuring.html
Active-passive failover should provide the high-availability solution you're looking for:
Active-passive failover: Use this failover configuration when you want a primary group of resources to be available the majority of the time and you want a secondary group of resources to be on standby in case all of the primary resources become unavailable. When responding to queries, Amazon Route 53 includes only the healthy primary resources. If all of the primary resources are unhealthy, Amazon Route 53 begins to include only the healthy secondary resources in response to DNS queries.
After you set up the DNS, you would point it at the ElastiCache Redis failover group's URL and add multiple groups for higher availability during a failover operation.
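As a sketch of what that DNS setup could look like with boto3 (the zone ID, record name, endpoints, and health check ID are all placeholders):

```python
# Sketch of an active-passive failover record pair in Route 53.
import boto3

r53 = boto3.client("route53")

def failover_record(role, endpoint, health_check_id=None):
    record = {
        "Name": "redis.example.com",
        "Type": "CNAME",
        "TTL": 60,
        "SetIdentifier": role.lower(),
        "Failover": role,  # "PRIMARY" or "SECONDARY"
        "ResourceRecords": [{"Value": endpoint}],
    }
    if health_check_id:
        record["HealthCheckId"] = health_check_id
    return {"Action": "UPSERT", "ResourceRecordSet": record}

r53.change_resource_record_sets(
    HostedZoneId="Z0000000000000",
    ChangeBatch={"Changes": [
        # Route 53 serves the primary while its health check passes,
        # then fails over to the standby group.
        failover_record("PRIMARY",
                        "primary-group.xxxxxx.use1.cache.amazonaws.com",
                        "hc-primary-id"),
        failover_record("SECONDARY",
                        "standby-group.xxxxxx.use1.cache.amazonaws.com"),
    ]},
)
```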
However, you might need to set up your application to write to and read from different endpoints to maximize the architecture's scalability.
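A minimal sketch of that split, assuming hypothetical ElastiCache primary and reader endpoints:

```python
# Sketch: write to the primary endpoint, read from the reader endpoint.
# Both hostnames are placeholders; reads may lag slightly behind writes
# because replication is asynchronous.
import redis

writer = redis.Redis(host="my-group.xxxxxx.ng.0001.use1.cache.amazonaws.com", port=6379)
reader = redis.Redis(host="my-group-ro.xxxxxx.ng.0001.use1.cache.amazonaws.com", port=6379)

writer.set("page:home:hits", 1)
print(reader.get("page:home:hits"))
```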
Sources:
http://docs.aws.amazon.com/AmazonElastiCache/latest/UserGuide/Replication.html
http://docs.aws.amazon.com/AmazonElastiCache/latest/UserGuide/AutoFailover.html
Placing a pair of independent Redis nodes behind a load balancer is likely not what you want. What will happen is that the ELB will try to balance connections across the instances, sending half to one and half to the other. This means that commands issued over one connection may not be seen by another, and no data is shared. So client A could publish a message, and client B, being subscribed to the other server, won't see the message.
For PUBSUB behind an ELB you have a secondary problem: the ELB will close an idle connection. So if you subscribe to a channel that isn't busy, the ELB will close your connection. As I recall, the maximum you can set this to is 60s, meaning that if you don't publish a message every single minute, your clients will be disconnected.
How much of a problem that is depends on your client library; frankly, in my experience most don't handle it well, in that they are unaware of the need to re-subscribe upon re-establishing the connection, meaning you would have to code that yourself.
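A minimal sketch of what that self-coded re-subscribe loop might look like with redis-py (host and channel are placeholders):

```python
# Sketch of a subscriber loop that re-subscribes after a dropped
# connection (e.g. an ELB idle timeout).
import time
import redis

def subscribe_forever(host, channel):
    while True:
        try:
            client = redis.Redis(host=host, port=6379)
            pubsub = client.pubsub()
            pubsub.subscribe(channel)  # must be re-issued on every new connection
            for message in pubsub.listen():
                if message["type"] == "message":
                    print(message["data"])
        except redis.ConnectionError:
            time.sleep(1)  # back off briefly, then reconnect and re-subscribe
```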
That said, a Sentinel + Redis solution would be quite ideal, if your client has proper Sentinel support. In this scenario, your client asks the Sentinels for the master to talk to, and on a connection failure it repeats this process. This would handle the setup you describe without the problems of being behind an ELB.
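With redis-py, the Sentinel flow looks roughly like this (the sentinel addresses and the service name "mymaster" are placeholders for your own deployment):

```python
# Sketch using redis-py's Sentinel support.
from redis.sentinel import Sentinel

sentinel = Sentinel([("sentinel-1", 26379),
                     ("sentinel-2", 26379),
                     ("sentinel-3", 26379)])

master = sentinel.master_for("mymaster")   # re-queries the sentinels on failover
replica = sentinel.slave_for("mymaster")   # for reads

master.set("key", "value")
print(replica.get("key"))
```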
Assuming you are running in VPC:
did you register the EC2 instances with the ELB?
did you add the correct security group setting to the ELB (allowing inbound port 23)?
did you add an ELB listener that maps port 23 on the ELB to port 23 on the instances?
did you set sensible ELB health checks (e.g. TCP on port 23) so that ELB thinks the EC2 instances are healthy?
If the ELB thinks the servers behind it are not healthy then ELB will not send them any traffic.
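To make the checklist concrete, here's a rough boto3 sketch; names, subnets, security groups, and instance IDs are placeholders, and it keeps this answer's port 23, though Redis itself normally listens on 6379:

```python
# Sketch of the checklist above using boto3's Classic ELB client.
import boto3

elb = boto3.client("elb")

# Listener: map port 23 on the ELB to port 23 on the instances.
elb.create_load_balancer(
    LoadBalancerName="redis-elb",
    Listeners=[{"Protocol": "TCP", "LoadBalancerPort": 23,
                "InstanceProtocol": "TCP", "InstancePort": 23}],
    Subnets=["subnet-0123456789abcdef0"],
    SecurityGroups=["sg-0123456789abcdef0"],  # must allow inbound on port 23
)

# Health check: TCP on the same port, so the ELB marks the instances healthy.
elb.configure_health_check(
    LoadBalancerName="redis-elb",
    HealthCheck={"Target": "TCP:23", "Interval": 30, "Timeout": 5,
                 "UnhealthyThreshold": 2, "HealthyThreshold": 2},
)

# Register the EC2 instances with the ELB.
elb.register_instances_with_load_balancer(
    LoadBalancerName="redis-elb",
    Instances=[{"InstanceId": "i-0123456789abcdef0"},
               {"InstanceId": "i-0123456789abcdef1"}],
)
```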
I am preparing a system of EC2 workers on AWS that use Firebase as a queue of tasks they should work on.
My Node.js app that reads the queue and works on the tasks is done and working, and I would like to properly set up a firewall (EC2 Security Group) that allows my machines to connect only to my Firebase.
Each rule of that Security Group contains:
protocol
port range
and destination (IP address with mask, so it supports whole subnets)
My question is: how can I set up this rule for Firebase? I suppose that the IP address of my Firebase is dynamic (it resolves to different IPs from different instances). Is there a list of possible addresses, or how would you address this issue? Could some kind of proxy be a solution that would not slow down my Firebase drastically?
Since using Node to interact with Firebase generates outbound traffic, the default security group should work fine (you don't need to allow any inbound traffic).
If you want to lock it down further for whatever reason, it's a bit tricky. As you noticed, there are a bunch of IP addresses serving Firebase. You could get a list of them all with "dig -t A firebaseio.com" and add all of them to your firewall rules. That would work today, but there could be new servers added next week and you'd be broken. To be a bit more general, you could perhaps allow all of 75.126.*.*, but that is probably overly permissive and could still break if new Firebase servers were added in a different data center or something.
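If you did want to snapshot the current addresses programmatically, here's a quick sketch of the same idea as the dig command (with the caveat above that the list can change at any time):

```python
# Sketch: snapshot the A records behind firebaseio.com, the same idea as
# "dig -t A firebaseio.com"; remember this list can change at any time.
import socket

hostname = "firebaseio.com"
_, _, addresses = socket.gethostbyname_ex(hostname)
for ip in sorted(addresses):
    print(f"{ip}/32")  # candidate CIDR for an outbound security group rule
```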
FWIW, I wouldn't worry about it. Blocking inbound traffic is generally much more important than blocking outbound (since to generate outbound traffic, somebody has to have already managed to run software on the box).