Can not unmount a device using "umount" in Docker - build

I don't know why but, umount is not working in docker.
umount: loop3/: must be superuser to umount
Let me share one more thing that is
It creates loop3 under /mnt/loop3 in real machine. Which is most unexpected thing for me, because promises pure virtual environment.
Why? Any solution?
Scenario :
I created docker ubuntu:13.04 to create cross compilation environment.
Docker Linux machine (ubuntu):
Linux 626089eadfeb 3.10.45-1-lts #1 SMP Fri Jun 27 06:44:23 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
Linux Machine (rch Linux):
Linux localhost 3.10.45-1-lts #1 SMP Fri Jun 27 06:44:23 UTC 2014 x86_64 GNU/Linux
Docker Info
Client version: 1.0.1
Client API version: 1.12
Go version (client): go1.3
Git commit (client): 990021a
Server version: 1.0.1
Server API version: 1.12
Go version (server): go1.3
Git commit (server): 990021a

I found the solution :
In default docker run it's not a real Operating system as we expect. It doesn't have permissions to access the devices. So we have to use --privileged While running a docker.
By default, Docker containers are "unprivileged" and cannot, for example, run a Docker daemon inside a Docker container. This is because by default a container is not allowed to access any devices, but a "privileged" container is given access to all devices.
When the operator executes docker run --privileged, Docker will enable to access to all devices on the host as well as set some configuration in AppArmor to allow the container nearly all the same access to the host as processes running outside containers on the host.

In my case it was the mount command of linux but it is also valid for unmount.
The previous solution is valid but my scenario could not execute the command 'docker run' since it was being used in real time.
The command I wanted to mount :
mount -o remount,size=5G /dev/shm
Solution
Verification in the docker :
[docker]$ df -h
Filesystem Size Used Avail Use% Mounted on
shm 64M 4.0K 64M 1% /dev/shm
[docker]$ exit
We look for the ID of our container :
$ docker ps
CONTAINER ID IMAGE
<container_id> nameimage
Let's memorize that beginning of the ID
All the following commands must be executed with sudo (use in the shortest possible time) :
# sudo -i
We go to the folder that contains the docker :
# By default
cd /var/lib/docker/containers/
And we open the folder that begins on .
cd <container_id>
We execute our command with the necessary parameters :
mount -o remount,size=5G shm
(Note: I do not remember if the command must be executed to show the file system)
Finally we enter the Docker and verify that the values ​​have been updated correctly:
[docker]$ df -h
Filesystem Size Used Avail Use% Mounted on
shm 5.0G 0 5.0G 0% /dev/shm
Sources
https://github.com/docker/cli/issues/1278

In my case it was a problem during starting docker service.
A cause was not having enough space left on the hard drive.
After cleaning up the disk and rebooting - docker service started successfully.

Please don't follow the accepted answer --privileged give too much permission consider using --cap-add=CAP_SYS_ADMIN that guard many many function but less than privileged:
Perform a range of system administration operations
including: quotactl(2), mount(2), umount(2),
Here we want the mount(2)/umount(2).
If you want to be safe you can cap-add and in the entrypoint drop this capability with prtcl (it depend on the language you use for C/python it is quite easy, maybe prctl has binding for your language).

Related

Issues with Docker Swarm running TeamCity using rexray/ebs for drive persistence in AWS EBS

I'm quite new to Docker but have started thinking about production set-ups, hence needing to crack the challenge of data persistence when using Docker Swarm. I decided to start by creating my deployment infrastructure (TeamCity for builds and NuGet plus the "registry" [https://hub.docker.com/_/registry/] for storing images).
I've started with TeamCity. Obvious this needs data persistence in order to work. I am able to run TeamCity in a container with an EBS drive and everything looks like it is working just fine - TeamCity is working through the set-up steps and my TeamCity drives appear in AWS EBS, but then the worker node TeamCity gets allocated to shuts down and the install process stops.
Here are all the steps I'm following:
Phase 1 - Machine Setup:
Create one AWS instance for master
Create two AWS instances for workers
All are 64-bit Ubuntu t2.mircro instances
Create three elastic IPs for convenience and assign them to the above machines.
Install Docker on all nodes using this: https://docs.docker.com/install/linux/docker-ce/ubuntu/
Install Docker Machine on all nodes using this: https://docs.docker.com/machine/install-machine/
Install Docker Compose on all nodes using this: https://docs.docker.com/compose/install/
Phase 2 - Configure Docker Remote on the Master:
$ sudo docker run -p 2375:2375 --rm -d -v /var/run/docker.sock:/var/run/docker.sock jarkt/docker-remote-api
Phase 3 - install the rexray/ebs plugin on all machines:
$ sudo docker plugin install --grant-all-permissions rexray/ebs REXRAY_PREEMPT=true EBS_ACCESSKEY=XXX EBS_SECRETKEY=YYY
[I lifted the correct values from AWS for XXX and YYY]
I test this using:
$ sudo docker volume create --driver=rexray/ebs --name=delete --opt=size=2
$ sudo docker volume rm delete
All three nodes are able to create and delete drives in AWS EBS with no issue.
Phase 4 - Setup the swarm:
Run this on the master:
$ sudo docker swarm init --advertise-addr eth0:2377
This gives the command to run on each of the workers, which looks like this:
$ sudo docker swarm join --token XXX 1.2.3.4:2377
These execute fine on the worker machines.
Phase 5 - Set up visualisation using Remote Powershell on my local machine:
$ $env:DOCKER_HOST="{master IP address}:2375"
$ docker stack deploy --with-registry-auth -c viz.yml viz
viz.yml looks like this:
version: '3.1'
services:
viz:
image: dockersamples/visualizer
volumes:
- "/var/run/docker.sock:/var/run/docker.sock"
ports:
- "8080:8080"
deploy:
placement:
constraints:
- node.role==manager
This works fine and allows me to visualise my swarm.
Phase 6 - Install TeamCity using Remote Powershell on my local machine:
$ docker stack deploy --with-registry-auth -c docker-compose.yml infra
docker-compose.yml looks like this:
version: '3'
services:
teamcity:
image: jetbrains/teamcity-server:2017.1.2
volumes:
- teamcity-server-datadir:/data/teamcity_server/datadir
- teamcity-server-logs:/opt/teamcity/logs
ports:
- "80:8111"
volumes:
teamcity-server-datadir:
driver: rexray/ebs
teamcity-server-logs:
driver: rexray/ebs
[Incorporating NGINX as a proxy is a later step on my to do list.]
I can see both the required drives appear in AWS EBS and the container appear in my swarm visualisation.
However, after a while of seeing the progress screen in TeamCity the worker machine containing the TeamCity instance shuts down and the process abruptly ends.
I'm at a loss as to what to do next. I'm not even sure where to look for logs.
Any help gratefully received!
Cheers,
Steve.
I found a way to get logs for my service. First do this to list the services the stack creates:
$ sudo docker service ls
Then do this to see logs for the service:
$ sudo docker service logs --details {service name}
Now I just need to wade through the logs and see what went wrong...
Update
I found the following error in the logs:
infra_teamcity.1.bhiwz74gnuio#ip-172-31-18-103 | [2018-05-14 17:38:56,849] ERROR - r.configs.dsl.DslPluginManager - DSL plugin compilation failed
infra_teamcity.1.bhiwz74gnuio#ip-172-31-18-103 | exit code: 1
infra_teamcity.1.bhiwz74gnuio#ip-172-31-18-103 | stdout: #
infra_teamcity.1.bhiwz74gnuio#ip-172-31-18-103 | # There is insufficient memory for the Java Runtime Environment to continue.
infra_teamcity.1.bhiwz74gnuio#ip-172-31-18-103 | # Native memory allocation (mmap) failed to map 42012672 bytes for committing reserved memory.
infra_teamcity.1.bhiwz74gnuio#ip-172-31-18-103 | # An error report file with more information is saved as:
infra_teamcity.1.bhiwz74gnuio#ip-172-31-18-103 | # /opt/teamcity/bin/hs_err_pid125.log
infra_teamcity.1.bhiwz74gnuio#ip-172-31-18-103 |
infra_teamcity.1.bhiwz74gnuio#ip-172-31-18-103 | stderr: Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00000000e2dfe000, 42012672, 0) failed; error='Cannot allocate memory' (errno=12)
Which is making me think this is a memory problem. I'm going to try this again with a better AWS instance and see how I get on.
Update 2
Using a larger AWS instance solved the issue. :)
I then discovered that rexray/ebs doesn't like it when a container switches between hosts in my swarm - it duplicates the EBS volumes so that it keeps one per machine. My solution to this was to use an EFS drive in AWS and mount it to each possible host. I then updated the fstab file so that the drive is remounted on every reboot. Job done. Now to look into using a reverse proxy...

Running Docker container randomly disappears on AWS EC2 Ubuntu

I'm running Docker on a t2.micro AWS EC2 instance with Ubuntu.
I'm running several containers. One of my long-running containers (always the same) just disappeared after running about 2-5 days for the third time right now. It is just gone with no sign of a crash.
The machine has not been restarted (uptime says 15 days).
I do not use the --rm flag: docker run -d --name mycontainer myimage.
There is no exited zombie of this container when running docker ps -a.
There is no log, i.e. docker logs mycontainer does not find any container.
There is no log entry in journalctl -u docker.service within the time frame
where the container disappears. However, there are some other log entries
regarding another container (let's call it othercontainer) which are
occuring repeatedly about every 6 minutes (it's a cronjob, don't know if relevant):
could not remove cluster networks: This node is not a swarm manager. Use
"docker swarm init" or "docker swarm join" to connect this node to swarm
and try again
Handler for GET /v1.24/networks/othercontainer_default returned error:
network othercontainer_default not found
Firewalld running: false
Even if there would be e.g. an out-of-memory issue or if my application just exits, I would still have an exited Docker container zombie in the ps -a overview, probably with exist status 0 or != 0, right?
I also don't want to --restart automatically, I just want to see the exited container.
Where can I look for more details to trace the issue?
Versions:
OS: Ubuntu 16.04.2 LTS (Kernel: 4.4.0-1013-aws)
Docker: Docker version 17.03.1-ce, build c6d412e
Thanks to a hint to look at dmesg or maybe the general journalctl I think I finally found the issue.
Somehow, one of the cronjobs has been running docker system prune -f at its end every 5 minutes. This command basically seems to remove everything unused and non-running.
I didn't know about this command before but certainly this has to be the way how my exited containers got removed without me knowing how it happened.

gdb in docker container returns "ptrace: Operation not permitted."

I've checked /proc/sys/kernel/yama/ptrace_scope in the container and on the host - both report the value as zero but when attached to pid one gdb reports
Reading symbols from /opt/my-web-proxy/bin/my-web-proxy...done.
Attaching to program: /opt/my-web-proxy/bin/my-web-proxy, process 1
ptrace: Operation not permitted.
I've also tried attached to the container with the privileged flag
docker exec --privileged -it mywebproxy_my-proxy_1 /bin/bash
Host OS is Fedora 25 with docker from their repos and container is a official centos6.8
I discovered the answer - the container needs to be started with strace capabilities
Adding this to my docker-compose.yml file allows GDB to work
cap_add:
- SYS_PTRACE
Or it can also be passed on the docker command line with --cap-add=SYS_PTRACE

Sharing directories in a Docker container both with a Dockerfile and after the container is running

Sharing data between a running docker container and my host (on AWS) seems overly complicated. From the docker documentation it seems as if I need to specify volumes when I start the container.
I found this: https://github.com/synack/docker-rsync
But this watches recursively to copy only from the host machine to the docker container
I'm looking for a way to create (preferably in a Dockerfile) a folder visible on my host machine on AWS where I can scp files into that folder and they will be visible on my docker container. I am also looking for my docker image to be able to write to that folder so if the container is stopped I won't lose those files.
As a side note I already declared in my Dockerfile to
VOLUME /Training-master
but I don't know how to access it from my machine and when I stopped the container I lost the data.
Does anyone know how to do this or can they point me in the right direction?
What you are looking for is provided by docker run time options. Documented here: http://docs.docker.com/engine/userguide/dockervolumes/#mount-a-host-directory-as-a-data-volume
At the end of it, its clearly mentioned
Note: The host directory is, by its nature, host-dependent.
For this reason, you can’t mount a host directory from Dockerfile
because built images should be portable. A host directory wouldn’t
be available on all potential hosts.
Like Raghav said a drive cannot be created and shared from a Dockerfile because of image portability.
But after you create the image you can run this command and this will create a shared folder between host and docker. Be careful because you can overview a directory in the docker container if it has the same name as an existing folder:
$ sudo docker run -itd -v /home/ubuntu/Sharing dockeruser/imageID:version bash
/home/ubuntu/Sharing -- Path to sharing folder on host computer
/Share -- Path to sharing folder in my container
dockeruser/imageID:version -- the name of your container
-v -- specifies you are creating a volume
-d -- daemonizes the containe, puts it in the background
bash -- the command for the container to execute
Just for reference for Windows users:
1) You can mount a host folder into a container by
docker run -ti -v C:\local_folder\:c:\container_folder container1
2) Alternatively, you can create a volume:
docker volume create --name temp_volume
See the absolute path of the volume by:
docker volume inspect temp_volume
The mountpoint is the absolute path of the volume. You can add/remove files from that path. Then you can mount it to the container by:
docker run -ti -v temp_volume:c:\tmploc container1
Notice that both host and container are Windows machines.

Vagrant Rsync Error before provisioning

So I'm having some adventures with the vagrant-aws plugin, and I'm now stuck on the issue of syncing folders. This is necessary to provision the machines, which is the ultimate goal. However, running vagrant provision on my machine yields
[root#vagrant-puppet-minimal vagrant]# vagrant provision
[default] Rsyncing folder: /home/vagrant/ => /vagrant
The following SSH command responded with a non-zero exit status.
Vagrant assumes that this means the command failed!
mkdir -p '/vagrant'
I'm almost positive the error is caused because ssh-ing manually and running that command yields 'permission denied' (obviously, a non-root user is trying to make a directory in the root directory). I tried ssh-ing as root but it seems like bad practice. (and amazon doesn't like it) How can I change the folder to be rsynced with vagrant-aws? I can't seem to find the setting for that. Thanks!
Most likely you are running into the known vagrant-aws issue #72: Failing with EC2 Amazon Linux Images.
Edit 3 (Feb 2014): Vagrant 1.4.0 (released Dec 2013) and later versions now support the boolean configuration parameter config.ssh.pty. Set the parameter to true to force Vagrant to use a PTY for provisioning. Vagrant creator Mitchell Hashimoto points out that you must not set config.ssh.pty on the global config, you must set it on the node config directly.
This new setting should fix the problem, and you shouldn't need the workarounds listed below anymore. (But note that I haven't tested it myself yet.) See Vagrant's CHANGELOG for details -- unfortunately the config.ssh.pty option is not yet documented under SSH Settings in the Vagrant docs.
Edit 2: Bad news. It looks as if even a boothook will not be "faster" to run (to update /etc/sudoers.d/ for !requiretty) than Vagrant is trying to rsync. During my testing today I started seeing sporadic "mkdir -p /vagrant" errors again when running vagrant up --no-provision. So we're back to the previous point where the most reliable fix seems to be a custom AMI image that already includes the applied patch to /etc/sudoers.d.
Edit: Looks like I found a more reliable way to fix the problem. Use a boothook to perform the fix. I manually confirmed that a script passed as a boothook is executed before Vagrant's rsync phase starts. So far it has been working reliably for me, and I don't need to create a custom AMI image.
Extra tip: And if you are relying on cloud-config, too, you can create a Mime Multi Part Archive to combine the boothook and the cloud-config. You can get the latest version of the write-mime-multipart helper script from GitHub.
Usage sketch:
$ cd /tmp
$ wget https://raw.github.com/lovelysystems/cloud-init/master/tools/write-mime-multipart
$ chmod +x write-mime-multipart
$ cat boothook.sh
#!/bin/bash
SUDOERS_FILE=/etc/sudoers.d/999-vagrant-cloud-init-requiretty
echo "Defaults:ec2-user !requiretty" > $SUDOERS_FILE
echo "Defaults:root !requiretty" >> $SUDOERS_FILE
chmod 440 $SUDOERS_FILE
$ cat cloud-config
#cloud-config
packages:
- puppet
- git
- python-boto
$ ./write-mime-multipart boothook.sh cloud-config > combined.txt
You can then pass the contents of 'combined.txt' to aws.user_data, for instance via:
aws.user_data = File.read("/tmp/combined.txt")
Sorry for not mentioning this earlier, but I am literally troubleshooting this right now myself. :)
Original answer (see above for a better approach)
TL;DR: The most reliable fix is to "patch" a stock Amazon Linux AMI image, save it and then use the customized AMI image in your Vagrantfile. See below for details.
Background
A potential workaround is described (and linked in the bug report above) at https://github.com/mitchellh/vagrant-aws/pull/70/files. In a nutshell, add the following to your Vagrantfile:
aws.user_data = "#!/bin/bash\necho 'Defaults:ec2-user !requiretty' > /etc/sudoers.d/999-vagrant-cloud-init-requiretty && chmod 440 /etc/sudoers.d/999-vagrant-cloud-init-requiretty\nyum install -y puppet\n"
Most importantly this will configure the OS to not require a tty for user ec2-user, which seems to be the root of the problem. I /think/ that the additional installation of the puppet package is not required for the actual fix (although Vagrant may use Puppet for provisioning the machine later, depending on how you configured Vagrant).
My experience with the described workaround
I have tried this workaround but Vagrant still occasionally fails with the same error. It might be a "race condition" where Vagrant happens to run its rsync phase faster than cloud-init (which is what aws.user_data is passing information to) can prepare the workaround for #72 on the machine for Vagrant. If Vagrant is faster you will see the same error; if cloud-init is faster it works.
What will work (but requires more effort on your side)
What definitely works is to run the command on a stock Amazon Linux AMI image, and then save the modified image (= create an image snapshot) as a custom AMI image of yours.
# Start an EC2 instance with a stock Amazon Linux AMI image and ssh-connect to it
$ sudo su - root
$ echo 'Defaults:ec2-user !requiretty' > /etc/sudoers.d/999-vagrant-cloud-init-requiretty
$ chmod 440 /etc/sudoers.d/999-vagrant-cloud-init-requiretty
# Note: Installing puppet is mentioned in the #72 bug report but I /think/ you do not need it
# to fix the described Vagrant problem.
$ yum install -y puppet
You must then use this custom AMI image in your Vagrantfile instead of the stock Amazon one. The obvious drawback is that you are not using a stock Amazon AMI image anymore -- whether this is a concern for you or not depends on your requirements.
What I tried but didn't work out
For the record: I also tried to pass a cloud-config to aws.user_data that included a bootcmd to set !requiretty in the same way as the embedded shell script above. According to the cloud-init docs bootcmd is run "very early" in the startup cycle for an EC2 instance -- the idea being that bootcmd instructions would be run earlier than Vagrant would try to run its rsync phase. But unfortunately I discovered that the bootcmd feature is not implemented in the outdated cloud-init version of current Amazon's Linux AMIs (e.g. ami-05355a6c has cloud-init 0.5.15-69.amzn1 but bootcmd was only introduced in 0.6.1).