I'm trying to set up a Mesos cluster in AWS VPC.
I've set up 3 Ubuntu machines with Mesos according to this tutorial.
The problem is that inside the web UI I see the message No master is currently leading ..., and when I ssh into any of the 3 master machines and run echo srvr | nc localhost 2182, I get This ZooKeeper instance is not currently serving requests on all of them.
I also can't ping between the 3 servers.
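For reference, this is roughly how I check from one box whether the ensemble has formed a quorum (the IPs below are placeholders for my three masters):

# Placeholder private IPs; substitute your three masters.
for h in 10.0.0.11 10.0.0.12 10.0.0.13; do
  echo "--- $h ---"
  echo srvr | nc "$h" 2182    # reports "Mode: leader" or "Mode: follower" once a quorum exists
done

Since the machines can't even ping each other, every one of these nc calls timing out would point at the VPC security group rules rather than at ZooKeeper itself.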
I'm unable to deploy the simplest docker-compose file to an Elastic Beanstalk environment configured with an Application Load Balancer for high availability.
This is the docker-compose file:
version: "3.9"
services:
  demo:
    image: nginxdemos/hello
    ports:
      - "80:80"
    restart: always
This is the ALB configuration:
EB chain of events:
1. Creating CloudWatch alarms and log groups
2. Creating security groups
   - For the load balancer: allow incoming traffic from the internet to my two listeners on ports 80/443
   - For the EC2 machines: allow incoming traffic to the process port from the first security group created
3. Create auto scaling groups
4. Create Application Load Balancer
5. Create EC2 instance
Approx. 10 minutes after creating the EC2 instance (#5), I get the following log:
Environment health has transitioned from Pending to Severe. ELB processes are not healthy on all instances. Initialization in progress (running for 12 minutes). None of the instances are sending data. 50.0 % of the requests to the ELB are failing with HTTP 5xx. Insufficient request rate (2.0 requests/min) to determine application health (6 minutes ago). ELB health is failing or not available for all instances.
Looking at the Target Group, it indicates 0 healthy instances (based on the default health checks).
When SSH'ing into the instance, I see that the docker service is not even started and my application is not running, which explains why the instance is unhealthy.
However, what am I supposed to do differently? Based on my understanding, this looks like a bug in the flow initiated by Elastic Beanstalk, as the flow seems to wait for the instances to become healthy before starting my application (otherwise, why wasn't the application started in the 10 minutes after the EC2 instance was created?).
It doesn't seem like an application issue, because the docker service was not even started.
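This is what I ran on the instance to confirm that (the log path is the one used by the Amazon Linux 2 Docker platform, so treat it as an assumption):

# Is the Docker daemon up at all?
sudo systemctl status docker
# What has the Beanstalk engine done so far? (bootstrap/deployment log on Amazon Linux 2)
sudo tail -n 50 /var/log/eb-engine.log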
Appreciate your help.
I tried to replicate your issue using your docker-compose.yml with Docker running on the 64bit Amazon Linux 2/3.4.12 platform. For the test I created a zip file containing only the docker-compose.yml.
Everything worked as expected and no issues were found.
The only thing I can suggest is to double-check your files. Also, there is no reason to open 443 as you don't have https at all.
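In case it helps, this is a minimal sketch of how I packaged and deployed the test bundle (application and environment names are placeholders; the EB CLI must be installed):

# The source bundle only needs the compose file on the Docker platform.
zip app.zip docker-compose.yml
# Deploy with the EB CLI (or upload app.zip through the console instead).
eb init -p docker my-demo
eb create my-demo-env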
I have 2 AWS virtual machine instances with 2 public IPv4 addresses, A.B.C.D and X.Y.Z.W.
I installed Docker on both machines and launched Docker Swarm with node A.B.C.D as manager and X.Y.Z.W as worker. When I initialized the swarm, I used A.B.C.D as the advertise address, like so:
docker swarm init --advertise-addr A.B.C.D
The Swarm initialized successfully
The problem occurred when I created a service from the image jwilder/whoami and exposed the service on port 8000:
docker service create -d -p 8000:8000 jwilder/whoami
I expected to be able to access the service on port 8000 from both nodes, according to the Swarm routing mesh documentation. In fact, I can only access the service from one node, the one the container is actually running on.
I also tried this experiment on Azure virtual machines and it also failed, so I guess this is a problem with Swarm on these cloud providers, maybe some networking misconfiguration.
Does anyone know how to fix this? Thanks in advance :D
One main problem that you did not mention is the Security Groups. You expose port 8000, but by default the Security Groups only open port 22 for SSH. Please check the SG and make sure you open the necessary ports.
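A sketch with the aws CLI (the group ID is a placeholder, and I'm assuming both nodes share one security group). The routing mesh needs the swarm control and overlay ports open between the nodes, plus the published port:

SG=sg-0123456789abcdef0   # placeholder: the security group both instances use
# 2377/tcp cluster management, 7946/tcp+udp node discovery, 4789/udp overlay (VXLAN)
aws ec2 authorize-security-group-ingress --group-id "$SG" --protocol tcp --port 2377 --source-group "$SG"
aws ec2 authorize-security-group-ingress --group-id "$SG" --protocol tcp --port 7946 --source-group "$SG"
aws ec2 authorize-security-group-ingress --group-id "$SG" --protocol udp --port 7946 --source-group "$SG"
aws ec2 authorize-security-group-ingress --group-id "$SG" --protocol udp --port 4789 --source-group "$SG"
# finally, the published service port, open to the world
aws ec2 authorize-security-group-ingress --group-id "$SG" --protocol tcp --port 8000 --cidr 0.0.0.0/0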
Just spun up an EC2 Ubuntu instance on AWS, installed Docker, pulled my test springboot image and ran it on the host. I can't access the app via a browser, but when I curl on the host it responds with a valid HTTP response. Is there a network or firewall setting that I should be looking at?
ubuntu@ip-172-31-4-157:~$ sudo docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
ea9879c1b38c parikshit123/docker-spring-boot:firsttry "java -jar docker-sp…" 20 minutes ago Up 20 minutes 0.0.0.0:8085->8085/tcp frosty_sammet
ubuntu@ip-172-31-4-157:~$ curl localhost:8085/test/hello
Hello from Mitalubuntu@ip-172-31-4-157:~$
Just figured it out.
By default, AWS EC2 instances have inbound TCP traffic blocked except for what the security group allows, which out of the box is only SSH on port 22. I learned that the port has to be opened explicitly. I added a security group rule and it worked. Now I can access the endpoint via browser. Bingo!
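For anyone else hitting this, the single inbound rule that fixed it for me looked roughly like this (the group ID is a placeholder):

aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 --protocol tcp --port 8085 --cidr 0.0.0.0/0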
I used the CloudFormation template provided by Docker for AWS setup & prerequisites to set up a docker swarm.
I created a REST service using Tibco BusinessWorks Container Edition and deployed it into the swarm by creating a docker service.
docker service create --name aka-swarm-demo --publish 8087:8085 akamatibco/docker_swarm_demo:part1
The service starts successfully but the CloudWatch logs show the below exception:
I have tried passing the JVM environment variable in the Dockerfile as :
ENV JAVA_OPTS="-Dbw.rest.docApi.port=7778"
but it doesn't help.
The interesting fact is at the end the log says:
com.tibco.thor.frwk.Application - TIBCO-THOR-FRWK-300006: Started BW Application [SFDemo:1.0]
So I tried to access the application using curl:
curl -X GET --header 'Accept: application/json' 'URL of AWS load balancer : port which I exposed while creating the service/resource URI'
But I am getting the below message:
The REST service works fine when I do docker run.
I have checked the Security Groups of the manager and load-balancer. The load-balancer has inbound open to all traffic and for the manager I opened HTTP connections.
I am not able to figure out what I have missed. Can anyone please help?
As mentioned in Deploy services to swarm, if you read along, you will find the following:
PUBLISH A SERVICE’S PORTS DIRECTLY ON THE SWARM NODE
Using the routing mesh may not be the right choice for your application if you need to make routing decisions based on application state or you need total control of the process for routing requests to your service’s tasks. To publish a service’s port directly on the node where it is running, use the mode=host option to the --publish flag.
Note: If you publish a service's ports directly on the swarm node using mode=host and also set published=<PORT> this creates an implicit limitation that you can only run one task for that service on a given swarm node. In addition, if you use mode=host and you do not use the --mode=global flag on docker service create, it will be difficult to know which nodes are running the service in order to route work to them.
Publishing ports for services works differently than for regular containers. The problem was that the image does not expose the port after running service create --publish, and hence the swarm routing layer cannot reach the REST service. To resolve this, use mode=host.
So I used the below command to create a service:
docker service create --name tuesday --publish mode=host,target=8085,published=8087 akamatibco/docker_swarm_demo:part1
Which eventually removed the exception.
Also make sure to configure the firewall settings of your load balancer to allow communication through the desired protocols, so that you can reach the applications deployed inside the container.
In my case it was the HTTP protocol; enabling port 8087 on the load balancer served the purpose.
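A final sanity check I'd suggest, mirroring the curl from the question (host and resource path are placeholders):

# With mode=host the published port is only open on nodes actually running a task,
# so the check must target such a node (or the load balancer must route to it).
curl -H 'Accept: application/json' 'http://<node-or-LB-DNS>:8087/<resource URI>'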
I have set up the same version of redis on my amazon ec2 ubuntu instance and also on my home computer running ubuntu. I have set my security group in ec2 to make port 6379 publicly accessible. I have added the line
slaveof ec2-xx-xxx-xxx-xx.us-west-2.compute.amazonaws.com 6379
where ec2-xx-xxx-xxx-xx.us-west-2.compute.amazonaws.com is the public dns of my ec2 instance
to my redis configuration file on my own computer (the slave). Now when I run redis on both the master (amazon ec2) and the slave (my computer at home) from the command line, and I set a new redis key on the master, I get no update on the slave. The slave returns nil, as if no such key exists.
What's wrong? Aren't the master and the slave connected? Or is there a different way to connect to the master through the public ip/dns?
Please note that I have also tried
slaveof ubuntu@ec2-xx-xxx-xxx-xx.us-west-2.compute.amazonaws.com 6379
where ubuntu is the user through which I have logged in to the amazon ec2 instance.
But this does not work either. I have not set any authentication restrictions, so the slave does not require any password to connect to the master. I have searched online, but there is rarely any detailed material on redis replication and its error handling.
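For reference, this is how I've been checking both ends (redis-cli's info replication reports the role and link status; the hostname is the same public DNS as above):

# From the slave box: can I reach the master at all, and does it see a slave?
redis-cli -h ec2-xx-xxx-xxx-xx.us-west-2.compute.amazonaws.com -p 6379 ping
redis-cli -h ec2-xx-xxx-xxx-xx.us-west-2.compute.amazonaws.com -p 6379 info replication   # expect role:master
# On the slave itself:
redis-cli info replication   # expect role:slave and master_link_status:up

If the remote ping times out even with port 6379 open in the security group, the master's bind setting in redis.conf (which defaults to 127.0.0.1) would be my next suspect.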