ZooKeeper installation on multiple AWS EC2 instances - amazon-web-services

I am new to ZooKeeper and AWS EC2. I am trying to install ZooKeeper on 3 EC2 instances.
As per the ZooKeeper documentation, I have installed ZooKeeper on all 3 instances, created zoo.cfg, and added the configuration below:
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/opt/zookeeper/data
clientPort=2181
server.1=localhost:2888:3888
server.2=<public ip of ec2 instance 2>:2889:3889
server.3=<public ip of ec2 instance 3>:2890:3890
I have also created the myid file on all 3 instances at /opt/zookeeper/data/myid, as per the guidelines.
I have a couple of questions:
Whenever I start the ZooKeeper server on an instance, it starts in standalone mode (as per the logs).
Will the above configuration really make the servers connect to each other? What are the ports 2889:3889 and 2890:3890 about? Do I need to configure them on the EC2 machines, or should I use some other ports instead?
Do I need to create a security group to open these connections? I am not sure how to do that for an EC2 instance.
How do I confirm that all 3 ZooKeeper servers have started and can communicate with each other?

The ZooKeeper configuration is designed such that you can install the exact same configuration file on all servers in the cluster without modification. This makes ops a bit simpler. The component that specifies the configuration for the local node is the myid file.
The configuration you've defined is not one that can be shared across all servers. All of the servers in your server list should be binding to a private IP address that is accessible to other nodes in the network. You're seeing your server start in standalone mode because you're binding to localhost. So, the problem is the other servers in the cluster can't see localhost.
Your configuration should look more like:
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/opt/zookeeper/data
clientPort=2181
server.1=<private ip of ec2 instance 1>:2888:3888
server.2=<private ip of ec2 instance 2>:2888:3888
server.3=<private ip of ec2 instance 3>:2888:3888
The two ports listed in each server definition are respectively the quorum and election ports used by ZooKeeper nodes to communicate with one another internally. There's usually no need to modify these ports, and you should try to keep them the same across servers for consistency.
Additionally, as I said you should be able to share that exact same configuration file across all instances. The only thing that should have to change is the myid file.
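For example, assuming the server IDs from the configuration above, the myid file on each instance contains just that instance's own ID:
# on instance 1 (server.1)
echo 1 > /opt/zookeeper/data/myid
# on instance 2 (server.2)
echo 2 > /opt/zookeeper/data/myid
# on instance 3 (server.3)
echo 3 > /opt/zookeeper/data/myid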
You will probably need to create a security group that opens the client port (2181) to clients and the quorum/election ports (2888/3888) to the other ZooKeeper servers.
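If you prefer the CLI over the console, a minimal sketch might look like the following (the security group ID and VPC CIDR are placeholders for your own setup):
# allow the client port and the quorum/election ports within the VPC
aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 --protocol tcp --port 2181 --cidr 10.0.0.0/16
aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 --protocol tcp --port 2888 --cidr 10.0.0.0/16
aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 --protocol tcp --port 3888 --cidr 10.0.0.0/16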
Finally, you might want to look into a UI to help manage the cluster. Netflix makes a decent UI that will give you a view of your cluster and also help with cleaning up old logs and storing snapshots to S3 (ZooKeeper takes snapshots but does not delete old transaction logs, so your disk will eventually fill up if they're not properly removed). But once it's configured correctly, you should be able to see the ZooKeeper servers connecting to each other in the logs as well.
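Beyond the logs, a quick way to confirm the ensemble has formed is to ask each node for its mode (one node should report leader and the others follower; standalone means the ensemble isn't formed):
# run on each instance
bin/zkServer.sh status
# or query the stat four-letter-word command (may need to be whitelisted on 3.5+)
echo stat | nc localhost 2181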
EDIT
@czerasz notes that starting from version 3.4.0 you can use the autopurge.snapRetainCount and autopurge.purgeInterval directives to keep your snapshots clean.
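For example, something like this in zoo.cfg keeps the last 3 snapshots and purges every 24 hours:
autopurge.snapRetainCount=3
autopurge.purgeInterval=24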
@chomp notes that some users have had to use 0.0.0.0 for the local server IP to get the ZooKeeper configuration to work on EC2. In other words, replace <private ip of ec2 instance 1> with 0.0.0.0 in the configuration file on instance 1. This is counter to the way ZooKeeper configuration files are designed but may be necessary on EC2.
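That is, the server list in the configuration file on instance 1 would become:
server.1=0.0.0.0:2888:3888
server.2=<private ip of ec2 instance 2>:2888:3888
server.3=<private ip of ec2 instance 3>:2888:3888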

Adding some additional info regarding ZooKeeper clustering inside Amazon's VPC.
The solution using the VPC IP addresses should be the preferred one; using '0.0.0.0' should be your last option.
If you are using Docker on your EC2 instances, '0.0.0.0' will not work properly with ZooKeeper 3.5.x after a node restart.
The issue lies in how '0.0.0.0' is resolved and in the ensemble's sharing of node addresses and SID order (if you start your nodes in descending order, this issue may not occur).
So far the only working solution is to upgrade to version 3.6.2 or later.

Related

How to debug RDS connectivity

I have an Elastic Beanstalk application running for several years using an RDS database (now attached to the EB itself but set up separately).
This has been running without issues for ages. There's a security group attached to the load balancer that allows traffic on port 5432 (PostgreSQL).
Now I've set up a new environment which is identical, but since I want to use Amazon Linux 2, I cannot clone the existing environment (a cloned environment works as well, BTW). So I've carefully set up everything the same way - I've verified that the SGs are the same, that all environment properties are set, and that the VPC and subnets are identical.
However, as soon as my EB instances try to call RDS, the call just gets stuck and times out, producing an HTTP 504 for the calling clients.
I've used the AWS Reachability Analyzer to analyze the path from the EB's EC2 instance to the ENI used by the RDS database instance, and that came out fine - there is reachability, at least VPC and SG-wise. Still, I cannot get the database calls to work.
How would one go about debugging this? What could cause a PostgreSQL connection with valid parameters, from an instance that is confirmed to reach the RDS ENI, to fail for this new environment, while the existing, old EB application still runs fine?
The only differences in configuration are (new vs original):
Node 14 on Amazon Linux 2 vs Node 10 on original Amazon Linux ("v1")
Application load balancer vs classic load balancer
Some Nginx tweaks from the old version were removed, as they weren't compatible or applicable
If the path is reachable, what could cause the RDS connectivity to break, when all DB connection params are also verified to be valid?
Edit: What I've now found is that RDS is attached to subnet A, and an EB having an instance in subnet A can connect to it, but not an instance in subnet B. With old EBs and classic load balancers, a single AZ/subnet could be used, but now at least two must be chosen.
So I suspect my route tables are somehow off. What could cause a host in 10.0.1.x not to reach a host in 10.0.2.x if they're both in the same VPC comprising these two subnets, and Reachability Analyzer thinks there is a reachable path? I cannot find anywhere that these two subnets are isolated.
Check the server connection information:
nslookup myexampledb.xxxx.us-east-1.rds.amazonaws.com
Verify the information:
telnet <RDS endpoint> <port number>
nc -zv <RDS endpoint> <port number>
Note: remember to replace the endpoint/port with the endpoint and port available in your database settings.
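Regarding the route-table suspicion in the edit: it can also help to compare what each subnet actually routes through. A rough sketch with the AWS CLI, assuming placeholder subnet and VPC IDs:
aws ec2 describe-route-tables --filters Name=association.subnet-id,Values=subnet-aaaa1111
aws ec2 describe-route-tables --filters Name=association.subnet-id,Values=subnet-bbbb2222
# subnets with no explicit association fall back to the VPC's main route table
aws ec2 describe-route-tables --filters Name=vpc-id,Values=vpc-0123456789abcdef0 Name=association.main,Values=true
Both subnets in the same VPC should show a "local" route covering the entire VPC CIDR; if that is present, the problem is more likely in a security group or NACL than in routing.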

AWS ECS Task can't connect to RDS Database

I'm a newer AWS user and today I got stuck while working on a sample project. I successfully created a Docker container that runs a simple R script that connects to my AWS RDS MySQL database and creates & writes some basic files to it. I built a public ECR repository, pushed my Docker image there, and built an ECS cluster & task, choosing Fargate and using the container image from my repository. My task ran and I could see the R code being executed when I went through the logs, but it was never able to connect to the SQL database and exited afterwards.
I've had to whitelist my own IP address in the security group for the RDS Database so that I can connect to it, so I'm aware I probably have to do that for my ECS task to establish that connection too. But won't that IP address constantly change because I won't have a static IP for the Fargate Server that is executing my task? I'm trying to stay on the free tier so I'm not sure I want to setup an elastic IP address for this server.
These 2 articles seem close to, if not the same as, the issue I'm having, but I can't figure out a solution. I haven't found any other info.
https://aws.amazon.com/premiumsupport/knowledge-center/ecs-fargate-task-database-connection/
https://aws.amazon.com/premiumsupport/knowledge-center/ecs-fargate-static-elastic-ip-address/
The end goal is to get this sample project running successfully on a fixed schedule, and then to run actual scripts on there to help automate things and make my life easier, so this sample project is a first step towards that. Any help or info on these questions would be appreciated!
Yes, your task is ephemeral (whether you launch it manually or as part of an ECS service) and its private/public IP address may change over time if it gets replaced. The way to make the connectivity rules stick is to assign a security group to the task (which might allow inbound access on a specific port you need, I assume, and outbound access to everything) and assign another security group to the RDS DB that has inbound access on port 3306 from the security group you assigned to the task. This is the trick: the SG will not change, and you are telling RDS to allow access to ALL traffic coming from that SG. I see the first article you posted doesn't talk about this part (it should).
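A minimal sketch of that RDS rule with the AWS CLI, assuming placeholder group IDs for the RDS security group and the task security group:
# allow MySQL traffic into the RDS security group from the task's security group
aws ec2 authorize-security-group-ingress --group-id sg-0aaaaaaaaaaaaaaa1 --protocol tcp --port 3306 --source-group sg-0bbbbbbbbbbbbbbb2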

Unable to access external mongo server from a pod but able to connect from EC2 instance

I am trying to connect to a MongoDB instance which is running on an external server from a pod running in a k8s cluster. I do have a VPC peering setup between two VPCs and I am perfectly able to connect to MongoDB server from nodes but when I try from a running pod, it fails. On trying traceroute, I think the private IP is not being resolved outside of the pod network.
Is there anything else which needs to be configured on pod networking side?
Taking a wild guess here, I believe your podCidr is conflicting with one of the CIDRs of your VPC. For example:
192.168.0.0/16 (podCidr) -> 192.168.1.0/24 (VPC cidr)
# Pod is thinking it needs to talk to another pod in the cluster
# instead of a server
You can see your podCidr with this command (clusterCIDR field):
$ kubectl -n kube-system get cm kube-proxy -o=yaml
Another place where things could be misconfigured is your overlay network, where the pods are not getting pod IP addresses.
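To confirm or rule out the overlap, you can compare the two ranges directly; a short sketch, assuming a placeholder VPC ID for the VPC (or peered VPC) your Mongo server lives in:
# pod network range
kubectl -n kube-system get cm kube-proxy -o=yaml | grep clusterCIDR
# VPC range the Mongo server lives in
aws ec2 describe-vpcs --vpc-ids vpc-0123456789abcdef0 --query 'Vpcs[].CidrBlock' --output text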
This is working fine. I was testing the connectivity with telnet from a pod, and since telnet was not returning anything after the successful connection, it seemed that there was some network issue. After testing this with a simple HTTP server and monitoring connections, I saw that it all worked fine.
The IP addresses of the podCidr are overlapping with the VPC CIDR in which your Mongo server is residing, and hence Kube Router prefers the internal route table first, which is working as designed.
You either need to reconfigure your VPC network with a different range or change the Kube network.

Connect to Neptune on AWS from local machine

I am trying to connect to a Neptune DB on an AWS instance from my local machine in the office, like connecting to RDS from the office. Is it possible to connect to a Neptune DB from a local machine? Is Neptune DB publicly available? Is there any way a developer can connect to a Neptune DB from the office?
Neptune does not support public endpoints (endpoints that are accessible from outside the VPC). However, there are a few architectural options with which you can access your Neptune instance from outside your VPC. All of them have the same theme: set up a proxy (an EC2 machine, or an ALB, or something similar, or a combination of these) that resides inside your VPC, and make that proxy accessible from outside your VPC.
It seems like you want to talk to your instance purely for development purposes. The easiest option for that would be to spin up an ALB, and create a target group that points to your instance's IP.
Brief Steps (These are intentionally not in detail, please refer to AWS Docs for detailed instructions):
dig +short <your cluster endpoint>
This would give you the current master's IP address.
Create an ALB (See AWS Docs on how to do this).
Make your ALB's target group point to the IP address obtained in Step 1. By the end of this step, you should have an ALB listening on PORT-A that forwards requests to IP:PORT, where IP is your database IP (from Step 1) and PORT is your database port (the default is 8182).
Create a security group that allows inbound traffic from everywhere, i.e. an inbound TCP rule for 0.0.0.0/0 on PORT-A.
Attach the security group to your ALB
Now from your developer boxes, you can connect to your ALB endpoint at PORT-A, which would internally forward the request to your Neptune instance.
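For reference, a rough CLI sketch of these steps, assuming the aws CLI with placeholder ARNs, subnet, VPC, and security-group IDs (you may prefer doing the same thing in the console):
NEPTUNE_IP=$(dig +short <your cluster endpoint> | head -1)
aws elbv2 create-target-group --name neptune-tg --protocol HTTPS --port 8182 --target-type ip --vpc-id vpc-0123456789abcdef0
aws elbv2 register-targets --target-group-arn <target-group-arn> --targets Id=$NEPTUNE_IP,Port=8182
aws elbv2 create-load-balancer --name neptune-alb --type application --subnets subnet-aaaa1111 subnet-bbbb2222 --security-groups sg-0123456789abcdef0
# PORT-A here is 80; the listener forwards to the Neptune IP on 8182
aws elbv2 create-listener --load-balancer-arn <alb-arn> --protocol HTTP --port 80 --default-actions Type=forward,TargetGroupArn=<target-group-arn>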
Do check out the ALB docs for details on how to create it and the concepts around it. If you need me to elaborate on any of the steps, feel free to ask.
NOTE: This is not a recommended solution for a production setup. IPs used by Neptune instances are bound to change with failovers and host replacements. Use this solution only for testing purposes. If you want a similar setup for production, feel free to ask a question and we can discuss options.
As already mentioned, you can't access it directly from outside your VPC.
The following link describes another solution using an SSH tunnel: connecting-to-aws-neptune-from-local-environment.
I find it much easier for testing and development purposes.
You can create the SSH tunnel with PuTTY as well.
Reference: https://github.com/M-Thirumal/aws-cloud-tutorial/blob/main/neptune/connect_from_local.md
Connect to AWS Neptune from the local system
There are many ways to connect to Amazon Neptune from outside of the VPC, such as setting up a load balancer or VPC peering.
Amazon Neptune DB clusters can only be created in an Amazon Virtual Private Cloud (VPC). One way to connect to Amazon Neptune from outside of the VPC is to set up an Amazon EC2 instance as a proxy server within the same VPC. With this approach, you will also want to set up an SSH tunnel to securely forward traffic to the VPC.
Part 1: Set up an EC2 proxy server.
Launch an Amazon EC2 instance located in the same region as your Neptune cluster. In terms of configuration, Ubuntu can be used. Since this is a proxy server, you can choose the lowest resource settings.
Make sure the EC2 instance is in the same VPC as your Neptune cluster. To find the VPC for your Neptune cluster, check the console under Neptune > Subnet groups. The instance's security group needs to be able to send and receive on port 22 for SSH and port 8182 for Neptune. See below for an example security group setup.
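For example (your local IP and the exact source/destination for 8182 will differ in your setup):
Inbound:  TCP 22   from <your local machine's public IP>/32   (SSH)
Outbound: TCP 8182 to the Neptune cluster's security group or the VPC CIDR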
Lastly, make sure you save the key-pair file (.pem) and note the directory for use in the next step.
Part 2: Set up an SSH tunnel.
This step can vary depending on whether you are running Windows or macOS.
Modify your hosts file to map localhost to your Neptune endpoint.
Windows: Open the hosts file as an Administrator (C:\Windows\System32\drivers\etc\hosts)
MacOS: Open Terminal and type in the command: sudo nano /etc/hosts
Add the following line to the hosts file, replacing the text with your Neptune endpoint address.
127.0.0.1 localhost YourNeptuneEndpoint
Open Command Prompt as an Administrator on Windows, or Terminal on macOS, and run the following command. On Windows, you may need to run SSH from C:\Users\YourUsername\
ssh -i path/to/keypairfilename.pem ec2-user@yourec2instanceendpoint -N -L 8182:YourNeptuneEndpoint:8182
The -N flag is set to prevent an interactive bash session with EC2 and to forward ports only. On the first successful connection, you will be asked whether you want to continue connecting; type yes and press Enter.
To test the success of your local graph-notebook connection to Amazon Neptune, open a browser and navigate to:
https://YourNeptuneEndpoint:8182/status
You should see a report, similar to the one below, indicating the status and details of your specific cluster:
{
  "status": "healthy",
  "startTime": "Wed Nov 04 23:24:44 UTC 2020",
  "dbEngineVersion": "1.0.3.0.R1",
  "role": "writer",
  "gremlin": {
    "version": "tinkerpop-3.4.3"
  },
  "sparql": {
    "version": "sparql-1.1"
  },
  "labMode": {
    "ObjectIndex": "disabled",
    "DFEQueryEngine": "disabled",
    "ReadWriteConflictDetection": "enabled"
  }
}
Close Connection
When you're ready to close the connection, use Ctrl+D to exit.
Hi, you can connect to Neptune DB by using the Gremlin Console on your local machine.
USE THIS LINK to set up your local Gremlin server; it works for me with Gremlin version 3.3.2.
You only have to update remote.yaml with your URL and port.
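As a rough sketch, the relevant parts of conf/remote.yaml might look like the following (the endpoint is a placeholder; pick the serializer that matches your Gremlin Console version):
hosts: [your-neptune-endpoint.cluster-xxxxxxxxxxxx.us-east-1.neptune.amazonaws.com]
port: 8182
connectionPool: { enableSsl: true }
serializer: { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV3d0, config: { serializeResultToString: true } }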

How to connect hornetq on AWS VPC from another vm on AWS

I have 2 VMs on AWS. On the first VM I have hornet and application that send messages to hornet. On another VM I have application that is a consumer of hornet.
The consumer fails to pull messages from HornetQ, and I can't understand why. HornetQ is running, and I opened the ports to any IP.
I tried to connect to HornetQ with JConsole (from my local computer) and failed, so I can't see whether HornetQ has any consumers/producers.
I've tried changing the 'bind' configurations to 0.0.0.0, but when I restarted HornetQ they were automatically changed back to what I have as the server IP in config.properties.
Any suggestions as to what might be the problem that keeps my application from connecting to HornetQ?
Thanks!
These are the things you need to check for connectivity between VMs in a VPC:
The Security Group of the instance has both ingress and egress configuration settings, unlike the traditional EC2 Security Group [now EC2-Classic]. Check the egress from your consumer and the ingress to the server.
If the instances are in different subnets, you need to check the network ACLs as well; the default setting, however, is to allow.
Check whether iptables or an OS-level firewall is blocking the traffic (a quick check follows below).
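For a quick check from the consumer VM (5445 is HornetQ's default Netty acceptor port; replace it with whatever port you configured):
nc -zv <hornetq server private IP> 5445
# and on the HornetQ server, confirm the acceptor is listening on the expected interface
sudo netstat -tlnp | grep 5445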
With respect to the failed connectivity from your local machine to HornetQ: you need to place the instance in a public subnet and configure the instance's SG accordingly; only then would the app/VM be accessible from the public internet.
I have assumed that both instances are in the same VPC. However, the title of the post sounds slightly misleading: if they are in 2 different VPCs altogether, then the concept of VPC Peering also comes in.