I have an issue where /var/lib/docker/overlay2/ and /var/lib/docker/volumes/ keep filling up my disk.
Why?
I am running TeamCity agents on AWS ECS task containers in docker-in-docker mode.
This generates volumes that cannot be deleted because they are all in the in-use state. Well, at least I think that's the cause.
Note
I only have 1 container running on the system.
It is an amazon-linux machine.
I don't want to stop docker because it detaches the agent from the server.
What I've tried
Tried docker system prune -af
Tried docker volume prune -af
Tried docker rmi
None of these reclaim the space because the volumes are in use.
The only thing I can do is stop docker and delete /var/lib/docker and restart docker which I don't want to do.
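For what it's worth, something like the sketch below should show which containers are still holding the volumes, without stopping the daemon (the volume name is a placeholder):
docker system df -v                            # per-volume disk usage, including how many containers reference each volume
docker ps -a --filter volume=<volume-name>     # list the container(s) that mount a given volume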
Please help me out.
Related
Currently I have a fresh Ubuntu image with just docker installed on it. I tested this image to make sure my docker container works and everything is working fine.
I create a golden AMI of this. I create a launch template and add below to the User Data.
#!/bin/bash
sudo docker pull command
sudo docker run command
My issue is that the commands will not run. From the testing I have done, it seems like the Docker service is not running when these commands are executed. When I SSH onto the EC2 instance and check the Docker service, it is running, and I can run the Docker commands manually and they work.
Any idea why the Docker service wouldn't be running when the instance boots up with the Auto Scaling group?
I have tried adding a bash file and running that through the User Data but same thing.
I even added a sleep to see if the docker service would come up before running the commands but still the same thing.
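For reference, a sketch of the kind of user data I mean, with an explicit wait instead of a fixed sleep (the image name is just a placeholder):
#!/bin/bash
# poll until the Docker daemon actually responds before running anything
until sudo docker info > /dev/null 2>&1; do sleep 5; done
sudo docker pull <image>
sudo docker run -d <image>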
Any help would be greatly appreciated.
I haven't been able to determine if there is a time sync process (such as ntpd or chronyd) running on the docker swarm I've deployed to AWS using Docker Community Edition (CE) for AWS.
I've ssh'd to a swarm manager, but ps doesn't show much, and I don't see anything in /etc or /etc/conf.d that looks relevant.
I don't really have a good understanding of CloudFormation, but I can see that the created instances running the Docker nodes used the AMI image Moby Linux 18.09.2-ce-aws1 stable (ami-0f4fb04ea796afb9a). I created a new instance with that AMI so I could SSH there. Still no indication of a time sync process with ps or in /etc.
I suppose one of the swarm control containers that is running may deal with sync'ing time (maybe docker4x/l4controller-aws:18.09.2-ce-aws1)? Or maybe the cloudformation template installed one on the instances? But I don't know how to verify that.
So, can anyone tell me whether there is a time sync process running (and where)?
And if not, I feel there should be, so how might I start one up?
You can verify the resources that are created by the CloudFormation template Docker-no-vpc.tmpl from the link you provided.
Second, do you think ntpd has anything to do with Docker Swarm itself, or should it be installed on the underlying EC2 instance?
SSH to your EC2 instance and verify the status of the service; normally all AWS AMIs have ntpd installed.
Or you can just type this to check:
ntpd
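A slightly more thorough check, assuming the instance uses systemd, could be:
systemctl status ntpd chronyd    # is either daemon running?
ntpq -p                          # peer status, if ntpd is installed
chronyc tracking                 # sync status, if chrony is installed
timedatectl                      # reports whether the system clock is synchronized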
If you do not find it, you can install it yourself, or run Docker Swarm with a custom AMI.
UCP requires that the system clocks on all the machines in a UCP cluster be in sync, or else it can start having issues checking the status of the different nodes in the cluster. To ensure that the clocks in a cluster are synced, you can use NTP to set each machine's clock.
First, on each machine in the cluster, install NTP. For example, to install NTP on an Ubuntu distribution, run:
sudo apt-get update && sudo apt-get install ntp
On CentOS and RHEL, run:
sudo yum install ntp
what-does-clock-skew-detected-mean
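After installing, you would normally also make sure the service is enabled and started, for example (assuming systemd):
sudo systemctl enable --now ntp    # the unit is named ntpd on CentOS/RHEL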
Last thing, do you really need the stack that is created by cloudformation?
EC2 instances + Auto Scaling groups
IAM profiles
DynamoDB Tables
SQS Queue
VPC + subnets and security groups
ELB
CloudWatch Log Group
I know CloudFormation eases our lives, but if you do not know the template (what resources will be created), do not run it; otherwise you will bear a hefty cost at the end of the month.
I would also suggest exploring AWS ECS and EKS; these are services specifically designed for Docker containers.
I have set up a master and slave configuration of Jenkins on AWS ECS. I have written a job that builds Docker images and pushes them to ECR. Each time the job builds, it takes roughly the same amount of time, approximately 10 minutes. My Jenkins master is running in a container, and I have used the Amazon EC2 Container Service Plugin to configure the slave. I have mounted the Docker socket file so that the slave node has access to the Docker daemon, but it is not using the image layers cached on the ECS instance. Each build starts from scratch.
Overview of each build:
Your docker build is probably not using Docker's layer caching mechanism. Please refer to this for build caching.
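As a rough illustration (the ECR repository URI here is just a placeholder), pulling the previously pushed image and pointing the build at it lets Docker reuse its layers even on a freshly started slave:
docker pull <account>.dkr.ecr.<region>.amazonaws.com/my-app:latest || true    # ignore the failure on the very first build
docker build --cache-from <account>.dkr.ecr.<region>.amazonaws.com/my-app:latest -t my-app:latest .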
In my application, I need to create many Docker machines on a cloud computing service (AWS-EC2 for now but could be changed), then deploy many containers on those machines. I am using docker-machine to provision them on AWS, using a command like
docker-machine create --driver amazonec2 --amazonec2-ssh-keypath <path-to-pem> <machine-name>
The problem is that it takes a lot of time, about 6 minutes, to create one such machine. So overall it takes hours just to create the machines for my deployment. Is there any way to create multiple machines at once with docker-machine? Or any way to speed up the provisioning of a machine? All machines have the same configuration, just different EC2 instances with different names.
I think running multiple docker-machine create commands in the background might work, but I fear it might corrupt the configuration (machine list, internal settings, etc.) of docker-machine. Even if I can run multiple such commands safely, I don't know how to check when and if they have completed successfully.
P.S: I understand that AWS supports Docker containers and it might be faster to create instances this way. But their "service model" of computation does not fit the needs of my application.
Running simultaneous docker-machine create commands should be fine - it depends on the driver, but I do this with Azure. The Azure driver has some logic to create the subnet, availability sets etc. and will re-use them if they're already there. Assuming AWS does something similar, you should be fine if you create the first machine and wait for it to complete, then start the rest in parallel.
As to knowing when they're done, it would be cool to run Docker Machine in a Docker container. That way you can create each machine with a docker run command, and use docker ps to see how they're doing. Assuming you want to manage them from the host though, you'd need your containers to share the host's Docker Machine state - mounting a volume and setting MACHINE_STORAGE_PATH to the volume on the host should do it.
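A rough sketch of that idea, assuming some image that bundles the docker-machine binary (the image name and machine name are placeholders, and AWS credentials still need to be passed in):
docker run -d \
  -v /root/.docker/machine:/root/.docker/machine \
  -e MACHINE_STORAGE_PATH=/root/.docker/machine \
  -e AWS_ACCESS_KEY_ID -e AWS_SECRET_ACCESS_KEY \
  <docker-machine-image> \
  docker-machine create --driver amazonec2 <machine-name>
docker ps    # containers still running are machines still being created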
TL;DR: this will run the docker-machine create command 9 times (asynchronously) to create multiple machines. Adjust as required:
for i in `seq 1 9` ; do docker-machine create --driver virtualbox worker$i & done
Creates 9x machines...
worker1
worker2
worker3
worker4
worker5
worker6
worker7
worker8
worker9
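To know when they have finished, you can wait for the background jobs and then check the machine list:
wait                 # blocks until all the backgrounded docker-machine create commands exit
docker-machine ls    # every machine should now show up as Running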
EB was complaining that my build was timing out, so I ssh'd into an instance and decided to run docker build myself to see what was happening. Every step, even something as simple as a mkdir takes ages to run. Even a WORKDIR stalls for at least a minute or two before executing.
On my local machine these are instant. What is going on?
Same issue here with an Ubuntu machine running on AWS. It turns out the key to the solution was switching from the devicemapper to the aufs storage backend.
First, run the following command to figure out which storage backend you currently use:
docker info | grep Storage
If it says devicemapper, you probably found the reason for the slowness.
Here is the procedure for switching to the aufs backend on Ubuntu, taken from here:
sudo apt-get install -y -q linux-image-extra-$(uname -r)
sudo service docker restart
Note that you will have to rebuild all your existing images / containers, as they will be wiped when you switch to aufs.
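If Docker does not pick aufs up on its own after the restart, one option (this part is my assumption, not from the linked procedure) is to set the storage driver explicitly in /etc/docker/daemon.json and restart again:
sudo tee /etc/docker/daemon.json <<'EOF'
{ "storage-driver": "aufs" }
EOF
sudo service docker restart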
Sorry to hear you are facing this issue.
Elastic Beanstalk environment creation involves creating a lot of resources such as an Auto Scaling group, EC2 instances, security groups, an Elastic Load Balancer, etc. After that, the software is installed on your Beanstalk instances. I am assuming you are only talking about the slowness of the software installation (docker build) on Beanstalk.
If you just run mkdir that should not be very slow. It should be reasonably fast.
However, if you think that docker build overall is running very slowly, that could be because of IO-intensive operations.
One thing you can try is using EBS provisioned IOPS with Elastic Beanstalk.
Read more about SSD instances here.
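As one example of what that could look like (the environment name and values are placeholders, and the option names are to the best of my knowledge), the root volume type and IOPS can be set through the aws:autoscaling:launchconfiguration namespace:
aws elasticbeanstalk update-environment --environment-name <env-name> \
  --option-settings \
    Namespace=aws:autoscaling:launchconfiguration,OptionName=RootVolumeType,Value=io1 \
    Namespace=aws:autoscaling:launchconfiguration,OptionName=RootVolumeIOPS,Value=1000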
Can you try launching a new environment with SSD instances and see if docker build is still slow? If you can show an example Dockerfile that takes a long time to build, I can try it out.