Kafka on AWS ECS: how to handle advertised.host without a known instance?

I'm trying to get Kafka running in an AWS ECS container. I already have this setup working fine in my local Docker environment, using the spotify/kafka image.
To get this working locally, I needed to ensure the ADVERTISED_HOST environment variable was set. ADVERTISED_HOST needed to be set to the container's external IP; otherwise, when I tried to connect, I just got connection refused.
My local docker-compose.yaml has this for the kafka container:
kafka:
  image: spotify/kafka
  hostname: kafka
  environment:
    - ADVERTISED_HOST=192.168.0.70
    - ADVERTISED_PORT=9092
  ports:
    - "9092:9092"
    - "2181:2181"
  restart: always
Now the problem is, I don't know what the IP is going to be, as I don't know which instance this will run on. So how do I set that environment variable?

Your entrypoint script will need to call the EC2 Metadata Service on startup (in this case http://169.254.169.254/latest/meta-data/local-hostname) to get the external-to-docker hostname and set that variable.
Sample:
[ec2-user ~]$ curl http://169.254.169.254/latest/meta-data/local-hostname
ip-10-251-50-12.ec2.internal
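As a rough sketch (the wrapper script name is made up, and it assumes curl is available in the image and that the wrapped command is the image's normal startup command), an entrypoint wrapper could look like this:

#!/bin/sh
# entrypoint-wrapper.sh (illustrative name): resolve the instance's
# externally reachable hostname from the EC2 instance metadata service
# and expose it to Kafka before handing off to the image's normal startup.
set -e

# 169.254.169.254 is the instance metadata endpoint; local-hostname
# returns something like ip-10-251-50-12.ec2.internal
ADVERTISED_HOST="$(curl -s http://169.254.169.254/latest/meta-data/local-hostname)"
export ADVERTISED_HOST
export ADVERTISED_PORT="${ADVERTISED_PORT:-9092}"

# Hand off to whatever the image would normally run.
exec "$@"

You would then point the container's entrypoint at this wrapper (for example via ENTRYPOINT in a small derived image) and keep the original start command as its arguments.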

Related

How to address another container in the same task definition in AWS ECS on Fargate?

I have an MQTT application which consists of a broker and multiple clients. The broker and each client run in their own container. Locally I am using Docker compose to set up my application:
services:
  broker:
    image: mqtt-broker:latest
    container_name: broker
    ports:
      - "1883:1883"
    networks:
      - engine-net
  db:
    image: database-client:latest
    container_name: vehicle-engine-db
    networks:
      - engine-net
    restart: on-failure

networks:
  engine-net:
    external: false
    name: engine-net
The application inside my clients is written in C++ and uses the Paho library. I use the async_client to connect to the broker. It takes two arguments, namely:
mqtt::async_client cli(server_address, client_id);
Here, server_address is the IP and port of the broker, and client_id is the "name" of the client that is connecting. With the compose file, I can simply use the service name given in the file to address the other containers in the network (here "broker:1883" does the trick). My containers work, and now I want to deploy to AWS Fargate.
In the task definition, I add my containers and give them names (the same names as the services in the Docker Compose file). However, the client does not seem to be able to connect to the broker, and the deployment fails. I am quite sure that it cannot connect because it cannot resolve the broker IP.
AWS Fargate uses network mode awsvpc which - to my understanding - puts all containers of a task into the same VPC subnet. Therefore, automatic name resolution like in Docker compose would make sense to me.
Has anybody encountered the same problem? How can I resolve it?
Per the documentation, containers in the same Fargate task can address each other on 127.0.0.1 at their respective ports.
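For the Paho client above, that means the broker address changes from the Compose service name to the loopback address when both containers run in the same Fargate task; a one-line sketch (connection string shown for illustration only):

// On Fargate (awsvpc mode), sibling containers in the same task share localhost.
mqtt::async_client cli("tcp://127.0.0.1:1883", client_id);

Making the address an environment variable instead of a hardcoded string lets the same image use "tcp://broker:1883" locally and "tcp://127.0.0.1:1883" on Fargate.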

Fargate task stops about 10s after it starts with no log output

My Fargate task keeps stopping after it starts and doesn't output any logs (the awslogs driver is selected).
The container does start up and stay running when I run it with Docker locally.
Docker-compose file:
version: '2'
services:
  asterisk:
    build: .
    container_name: asterisk
    restart: always
    ports:
      - 10000-10099:10000-10099/udp
      - 5060:5060/udp
Dockerfile:
FROM debian:10.7
RUN {stuff-that-works-is-here}
# Keep Asterisk running in the foreground
ENTRYPOINT ["asterisk", "-f"]
# SIP port
EXPOSE 5060:5060/udp
# RTP ports
EXPOSE 10000-10099:10000-10099/udp
My task execution role has full CloudWatch access for debugging.
Click on the ECS task instance and expand the container section; the error should be shown there.
The awslogs driver alone is not enough.
Unfortunately, Fargate doesn't create the log group for you unless you tell it to.
See Creating a log group at https://docs.aws.amazon.com/AmazonECS/latest/developerguide/using_awslogs.html
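For example, you can create the group up front with the AWS CLI; the group name and region below are placeholders and must match the awslogs-group and awslogs-region in your task definition's log configuration:

aws logs create-log-group --log-group-name /ecs/asterisk --region us-east-1

Alternatively, setting the awslogs-create-group log option to "true" in the task definition lets the driver create the group itself, provided the task execution role has the logs:CreateLogGroup permission.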
I had a similar problem, and the cause was the health check.
ECS doesn't have health checks for UDP, so when you open a UDP port and deploy with Docker (docker compose), it creates a health check pointing to a TCP port; since there were no open TCP ports in that range, the container kept restarting because of the failing health check.
I had to add a custom resource to my docker-compose file:
x-aws-cloudformation:
  Resources:
    AsteriskUDP5060TargetGroup:
      Type: "AWS::ElasticLoadBalancingV2::TargetGroup"
      Properties:
        HealthCheckProtocol: TCP
        HealthCheckPort: 8088
Basically I have a health check for a UDP port pointing to a TCP port. It's a "hack" to bypass this problem when deploying with Docker.

Can't connect to elasticache redis from docker container running in EC2

As part of my CI process, I am creating a docker-machine EC2 instance and running 2 Docker containers inside of it via docker-compose. The server container's test script attempts to connect to an AWS ElastiCache Redis instance within the same VPC as the EC2 instance. When the test script is run, I get the following error:
1) Storage
check cache connection
should return seeded value in redis:
Error: Timeout of 2000ms exceeded. For async tests and hooks, ensure "done()" is called; if returning a Promise, ensure it resolves. (/usr/src/app/test/scripts/test.js)
at listOnTimeout (internal/timers.js:549:17)
at processTimers (internal/timers.js:492:7)
Update: I can connect via redis-cli from the EC2 itself:
redis-cli -c -h ***.cache.amazonaws.com -p 6379 ping
> PONG
It looks like I can't connect to my Redis instance because my Docker container is using an IP that is not within the same VPC as my ElastiCache instance. How can I set up my Docker config to use the same IP as the host machine while building my containers from remote images? Any help would be appreciated.
Relevant section of my docker-compose.yml:
version: '3.8'
services:
  server:
    build:
      context: ./
      dockerfile: Dockerfile
    image: docker.pkg.github.com/$GITHUB_REPOSITORY/$REPOSITORY_NAME-server:github_ci_$GITHUB_RUN_NUMBER
    container_name: $REPOSITORY_NAME-server
    command: npm run dev
    ports:
      - "8080:8080"
      - "6379:6379"
    env_file: ./.env
Server container Dockerfile:
FROM node:12-alpine
# create app dir
WORKDIR /usr/src/app
# install dependencies
COPY package*.json ./
RUN npm install
# bundle app source
COPY . .
EXPOSE 8080 6379
CMD ["npm", "run", "dev"]
Elasticache redis SG inbound rules:
EC2 SG inbound rules:
I solved the problem through extensive trial and error. The major hint that pointed me in the right direction was found in the Docker docs:
By default, the container is assigned an IP address for every Docker network it connects to. The IP address is assigned from the pool assigned to the network...
ElastiCache instances are only accessible internally, from their respective VPC. Based on my config, the Docker container and the EC2 instance were running on two different IP addresses, but only the EC2 IP was whitelisted to connect to ElastiCache.
I had to bind the Docker container to the host EC2 network in my docker-compose.yml by setting the container's network_mode to "host":
version: '3.8'
services:
  server:
    image: docker.pkg.github.com/$GITHUB_REPOSITORY/$REPOSITORY_NAME-server:github_ci_$GITHUB_RUN_NUMBER
    container_name: $REPOSITORY_NAME-server
    command: npm run dev
    ports:
      - "8080:8080"
      - "6379:6379"
    network_mode: "host"
    env_file: ./.env
...

Run multiple task definitions using docker-compose.yml and ecs-params.yml files in AWS ECS with different launch types and volume mounting

I have 4 Docker images which I want to run on ECS. For my local system I use a docker-compose file with multiple services.
I want to do something similar with Docker Compose on ECS.
I want my database image to run on EC2 and the rest on Fargate, host the database volume on EC2, and make sure each container can communicate with the others using their names.
How do I configure my docker-compose.yml and ecs-params.yml files?
My sample docker-compose.yml file
version: '2.1'
services:
  first:
    image: first:latest
    ports:
      - "2222:2222"
    depends_on:
      database:
        condition: service_healthy
  second:
    image: second:latest
    ports:
      - "8090:8090"
    depends_on:
      database:
        condition: service_healthy
  third:
    image: third:latest
    ports:
      - "3333:3333"
  database:
    image: database
    environment:
      MYSQL_ROOT_PASSWORD: abcde
      MYSQL_PASSWORD: abcde
      MYSQL_USER: user
    ports:
      - "3306:3306"
    volumes:
      - ./datadir/mysql:/var/lib/mysql
    healthcheck:
      test: ["CMD", "mysqladmin", "ping", "-h", "localhost"]
      timeout: 5s
      retries: 5
I don't see how you connect the containers to each other. depends_on just tells Docker Compose the order to use when starting containers. You may have the actual connection details hardcoded inside the containers, which is not good.
With the Docker Compose file you shared, containers can reach each other using their service names from the Compose file. For example, the third container can use the database domain name to reach the database container.
So if you have such names hardcoded in your containers, it will work. However, people usually configure connection points (URLs) as environment variables in the Docker Compose file, so that nothing is hardcoded, as sketched below.
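A minimal sketch of what that could look like in the Compose file above (the variable names DATABASE_HOST and DATABASE_PORT are made up; your application would read them instead of a hardcoded hostname):

first:
  image: first:latest
  environment:
    DATABASE_HOST: database   # resolves to the "database" service on the Compose network
    DATABASE_PORT: "3306"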
Hosting the DB volume on EC2 can be a bad idea. EC2 has two kinds of storage: EBS volumes and instance store. Instance store is destroyed when the EC2 instance is terminated, while data on EBS volumes is preserved. So you would really be relying on EBS storage (not the EC2 instance itself), or on S3, which is not suitable for your need here.
Hosting a DB in a container is generally a bad idea; you can find the same advice in the descriptions of many DB images on Docker Hub. Instead you can use MySQL as a managed service via AWS RDS.
The problem you have now has nothing to do with AWS or ECS. Once you have Docker Compose running fine locally, you will get the same behavior on the ECS side. You can see an example of configuration via a Compose file here.
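If you do go the ecs-cli route, a minimal ecs-params.yml sketch might look roughly like this (subnets, security group, and task sizes are placeholders; note that ecs-cli compose applies one launch type per project, so splitting the database onto EC2 and the rest onto Fargate would mean two separate compose projects/task definitions):

version: 1
task_definition:
  task_execution_role: ecsTaskExecutionRole
  ecs_network_mode: awsvpc
  task_size:
    cpu_limit: 512
    mem_limit: 2GB
run_params:
  network_configuration:
    awsvpc_configuration:
      subnets:
        - subnet-xxxxxxxx
      security_groups:
        - sg-xxxxxxxx
      assign_public_ip: ENABLED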

A service in ECS fails to find another service within the same network

I have a very simple docker-compose for locust (python package for load testing). It starts a 'master' service and a 'slave' service. Everything works perfectly locally but when I deploy it to AWS ECS a 'slave' can't find a master.
services:
  my-master:
    image: chapkovski/locust
    ports:
      - "80:80"
    env_file:
      - .env
    environment:
      LOCUST_MODE: master
  my-slave:
    image: chapkovski/locust
    env_file:
      - .env
    environment:
      LOCUST_MODE: slave
      LOCUST_MASTER_HOST: my-master
So apparently, when I am on ECS, the my-slave service can no longer refer to my-master by name. What's wrong here?
Everything works perfectly locally but when I deploy it to AWS ECS
a 'slave' can't find a master.
I assume the slave needs to access the master; both must be in the same task definition to address each other like this, or you can explore service discovery. For example:
"links": [
"master"
]
links
Type: string array
Required: no
The link parameter allows containers to communicate with each other without the need for port mappings. It is only supported if the network mode of a task definition is set to bridge. The name:internalName construct is analogous to name:alias in Docker links.
Note: This parameter is not supported for Windows containers or tasks using the awsvpc network mode.
Important: Containers that are collocated on a single container instance may be able to communicate with each other without requiring links or host port mappings. Network isolation is achieved on the container instance using security groups and VPC settings.
"links": ["name:internalName", ...]
container_definition_network
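Putting it together, a trimmed containerDefinitions sketch for the task above might look like this (image names, ports, and environment values are taken from the question; links only work with "networkMode": "bridge"):

"containerDefinitions": [
  {
    "name": "my-master",
    "image": "chapkovski/locust",
    "portMappings": [{ "containerPort": 80 }],
    "environment": [
      { "name": "LOCUST_MODE", "value": "master" }
    ]
  },
  {
    "name": "my-slave",
    "image": "chapkovski/locust",
    "links": ["my-master"],
    "environment": [
      { "name": "LOCUST_MODE", "value": "slave" },
      { "name": "LOCUST_MASTER_HOST", "value": "my-master" }
    ]
  }
]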