How to make Dockerfile COPY work for Windows absolute paths?

How do I make my COPY command work for absolute paths on Windows? I tried git-bash, cmd and powershell consoles to build with docker build -t custom-maven-image .
# Dockerfile
FROM maven:3-openjdk-11-slim
# these are three versions of copy command I tried
COPY C:/Users/myuser/.m2 /root/.m2
COPY /C/Users/myuser/.m2 /root/.m2
COPY /c/Users/myuser/.m2 /root/.m2
What I get is an error:
...
#5 ERROR: "/C/Users/myuser/.m2" not found: not found
UPDATE:
Thanks @Jeremy for the bug references. I now see that the docs clearly say:
COPY obeys the following rules:
The path must be inside the context of the build; you cannot
COPY ../something /something, because the first step of a docker build
is to send the context directory (and subdirectories) to the docker
daemon.

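All the resources need to be inside the directory from which you run the build (the build context), which is usually where your Dockerfile is. You can't use an absolute path from elsewhere. Think of it from the build's perspective: the build does not happen in your shell but in the docker daemon, which only sees the files that were sent to it as the context. The Dockerfile can run commands, but the resources it copies need to be part of that context.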
https://github.com/moby/moby/issues/4592
https://github.com/docker/compose/issues/4857
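One possible workaround, sketched here in Python, is to stage a copy of the outside directory inside the build context before running docker build, so that COPY can use a relative path. The helper name stage_into_context and the staged folder name m2 are hypothetical; this is a sketch under those assumptions, not something Docker provides.

```python
import shutil
from pathlib import Path


def stage_into_context(source: Path, context: Path, name: str) -> Path:
    """Copy `source` (a directory outside the build context) to `context/name`.

    Returns the staged path, which a Dockerfile in `context` can then
    reference with a relative COPY, e.g. `COPY m2 /root/.m2` when name == "m2".
    """
    target = context / name
    # Start from a clean copy so stale files from an earlier build do not linger.
    if target.exists():
        shutil.rmtree(target)
    shutil.copytree(source, target)
    return target
```

For the question above, this would be called as stage_into_context(Path.home() / ".m2", Path("."), "m2") before docker build -t custom-maven-image . is run from the same directory.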

Related

Dockerfile COPY command doesn't make my file available when I run the image

I am using Linux Ubuntu 20.04, Pycharm Pro, Python 3.9, Docker (installed a couple weeks ago, don't remember ver).
I have a Python project in the path:
/home/crusty.user/PythonProjects/NoLegals
There are 4 files in this path: main.py, utils.py, NoLegals_Config.csv, Dockerfile
The csv file acts as a config to tell the python project which parts of the research to do, or not to. It reads a line, with a Y or N. Pretty simple. It works great in Linux and in windows.
From the path above, I run sudo docker build -t nolegals .
Everything runs successfully.
When I try to run the image (sudo docker run nolegals), it fails when it gets to the csv file with the error:
FileNotFoundError: [Errno 2] No such file or directory: '/home/crusty.user/PythonProjects/NoLegals/NoLegals_Config.csv'
In my Dockerfile I have:
WORKDIR /NoLegals
Further down I have:
COPY NoLegals_Config.csv /
COPY main.py /
COPY utils.py /
There's a bunch of other stuff for setting up the environment, libraries, etc., all of which runs successfully on the build. Also, I don't get a failure regarding the path of the csv file during build. I've been digging around and I've learned that it might have something to do with not being able to find the csv file within the Docker image when it builds, but it finds main.py and utils.py just fine. Following a suggestion to fix the problem, there is a line of code in main.py that builds the location of the csv file dynamically, but this too has failed. The path the error prints is also the correct path to the csv file.
#this works in linux, just not in the Dockerfile
import os
filename = r'NoLegals_Config.csv'
filepath = os.path.join(os.getcwd(), filename)
print(filepath)
I've tried LOTS of different things in that COPY NoLegals_config.csv / line, but to no avail. I appreciate any suggestions.
I've tried various forms of the COPY. Previous to the one listed was using the syntax:
COPY <source-path> <destination-Path>
COPY NoLegals_Config.csv /
COPY <full path of source> </NoLegals/NoLegals_Config.csv>
I've tried some other things that I can't recall.
What I ended up doing was fully qualifying the COPY statements.
WORKDIR /NoLegals
COPY NoLegals_Config.csv /NoLegals/NoLegals_Config.csv
COPY main.py /NoLegals/main.py
COPY utils.py /NoLegals/utils.py
CMD ["python", "/NoLegals/main.py"]
Is there a simpler way I could have written this?
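On the Python side, one way to stop depending on the working directory is to resolve the csv path relative to the script file instead of os.getcwd(). The following is a minimal sketch; the helper name config_path is hypothetical and not part of the original project.

```python
from pathlib import Path


def config_path(script_file: str, filename: str = "NoLegals_Config.csv") -> Path:
    """Return the path of `filename` located in the same directory as `script_file`."""
    return Path(script_file).parent / filename


# Inside main.py this would be called as: config_path(__file__)
```

On the Dockerfile side, relative COPY destinations are resolved against WORKDIR, so with WORKDIR /NoLegals set first, a line such as COPY main.py . places the file in /NoLegals without repeating the directory.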

How to use docker to test multiple compiler versions

What is the idiomatic way to write a docker file for building against many different versions of the same compiler?
I have a project which tests against a wide-range of versions of different compilers like gcc and clang as part of a CI job. At some point, the agents for the CI tasks were updated/changed, resulting in newer jobs failing -- and so I've started looking into dockerizing these builds to try to guarantee better reliability and stability.
However, I'm having some difficulty understanding what a proper and idiomatic approach is to producing build images like this without causing a large amount of duplication caused by layers.
For example, let's say I want to build using the following toolset:
gcc 4.8, 4.9, 5.1, ... (various versions)
cmake (latest)
ninja-build
I could write something like:
# syntax=docker/dockerfile:1.3-labs
# Parameterizing here possible, but would cause bloat from duplicated
# layers defined after this
FROM gcc:4.8
ENV DEBIAN_FRONTEND=noninteractive
# Set the work directory
WORKDIR /home/dev
COPY . /home/dev/
# Install tools (cmake, ninja, etc)
# this will cause bloat if the FROM layer changes
RUN <<EOF
apt update
apt install -y cmake ninja-build
rm -rf /var/lib/apt/lists/*
EOF
# Default command is to use CMake
CMD ["cmake"]
However, the installation of tools like ninja-build and cmake occurs after the base image, which changes per compiler version. Since these layers are built off of a different parent layer, this would (as far as I'm aware) result in layer duplication for each different compiler version that is used.
One alternative to avoid this duplication could hypothetically be using a smaller base image like alpine with separate installations of the compiler instead. The tools could be installed first so the layers remain shared, and only the compiler changes as the last layer -- however this presents its own difficulties, since it's often the case that certain compiler versions may require custom steps, such as installing certain keyrings.
What is the idiomatic way of accomplishing this? Would this typically be done through multiple docker files, or a single docker file with parameters? Any examples would be greatly appreciated.
I would separate the parts of preparing the compiler and doing the calculation, so the source doesn't become part of the docker container.
Prepare Compiler
For preparing the compiler I would take the ARG approach but without copying the data into the container. If you want fast retries and have enough resources, you could spin up multiple instances at the same time.
ARG COMPILER=gcc:4.8
FROM ${COMPILER}
ENV DEBIAN_FRONTEND=noninteractive
# Install tools (cmake, ninja, etc)
# this will cause bloat if the FROM layer changes
RUN <<EOF
apt update
apt install -y cmake ninja-build
rm -rf /var/lib/apt/lists/*
EOF
# Set the work directory
VOLUME /src
WORKDIR /src
CMD ["cmake"]
Build it
Here you have a few options. You could either prepare a volume with the sources or use bind mounts together with docker run like this:
#bash style
for compiler in gcc:4.9 gcc:4.8 gcc:5.1
do
docker build -t mytag-${compiler} --build-arg COMPILER=${compiler} .
# place to clean the target folder
docker run -v $(pwd)/src:/src mytag-${compiler}
done
And because the source is not part of the docker image you don't have bloat. You can also have two mounts, one for a readonly source tree and one for the output files.
Note: If you remove the CMake command you could also spin up the docker containers in parallel and use docker exec to start the build. The downside of this is that you have to take care of out of source builds to avoid clashes on the output folder.
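The bash loop above can also be sketched in Python by generating the docker commands as argument lists. The function name build_and_run_commands is hypothetical; the tag scheme and the /src mount mirror the loop above.

```python
from typing import List


def build_and_run_commands(compilers: List[str], src_dir: str) -> List[List[str]]:
    """Return the docker build and docker run argument lists for each compiler image."""
    commands = []
    for compiler in compilers:
        tag = f"mytag-{compiler}"  # e.g. "mytag-gcc:4.9", same scheme as the bash loop
        commands.append(
            ["docker", "build", "-t", tag, "--build-arg", f"COMPILER={compiler}", "."]
        )
        # Bind-mount the sources so they never become part of the image.
        commands.append(["docker", "run", "-v", f"{src_dir}:/src", tag])
    return commands
```

Each returned list can be passed to subprocess.run(cmd, check=True), which stops the loop at the first failing compiler.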
Put an ARG before the FROM and then use the ARG as the FROM image, like so:
ARG COMPILER=gcc:4.8
FROM ${COMPILER}
# rest goes here
then you
docker build . -t test/clang-8 --build-arg COMPILER=clang-8
or similar.
If you want to automate just make a list of compilers and a bash script looping over the lines in your file, and paste the lines as inputs to the tag and COMPILER build args.
As for CMake, I'd just do:
RUN wget -qO- "https://cmake.org/files/v3.23/cmake-3.23.1-linux-"$(uname -m)".tar.gz" | tar --strip-components=1 -xz -C /usr/local
When copying, I find it cleaner to do
WORKDIR /app/build
COPY . .
As far as I know, there is no way to do that easily and safely. You could use a RUN --mount=type=cache, but the documentation clearly says that:
Contents of the cache directories persist between builder invocations without invalidating the instruction cache. Cache mounts should only be used for better performance. Your build should work with any contents of the cache directory as another build may overwrite the files or GC may clean it if more storage space is needed.
I have not tried it but I guess the layers are duplicated anyway, you just save time, assuming the cache is not emptied.
The other possible solution you have is similar to the one you mention in the question: starting with the tools installation and then customizing it with the gcc image. Instead of starting with an alpine image, you could start FROM scratch. scratch is basically the empty image, you could COPY the files generated by
RUN <<EOF
apt update
apt install -y cmake ninja-build
rm -rf /var/lib/apt/lists/*
EOF
Then you COPY the entire gcc filesystem. However, I am not sure it will work because the order of the initial layers is now reversed. This means that some files that were in the upper layer (coming from tools) now are in the lower layer and could be overwritten. In the comments, I asked you for a working Dockerfile because I wanted to try this out before answering. If you want, you can try this method and let us know. Anyway, the first step is extracting the files created from the tools layer.
How to extract changes from a layer?
Let's consider this Dockerfile and build it with docker build -t test .:
FROM debian:10
RUN apt update && apt install -y cmake && ( echo "test" > test.txt )
RUN echo "new test" > test.txt
Now that we have built the test image, we should find 3 new layers. You mainly have 2 ways to extract the changes from each layer:
The first is to run docker inspect on the image and then find the IDs of the layers in the /var/lib/docker folder, assuming you are on Linux. Each layer has a diff subfolder containing the changes. Actually, I think it is more complex than this, which is why I would opt for...
skopeo: you can install it with apt install skopeo, and it is a very useful tool for operating on docker images. The command you are interested in is copy, which extracts the layers of an image and exports them as .tar files:
skopeo copy docker-daemon:{image_name}:latest "dir:/home/test_img"
where image_name is test in this case.
Extracting layer content with Skopeo
In the specified folder, you should find some tar files and a configuration file (look at the skopeo copy command output and you will know which one is that). Then extract each {layer}.tar in a different folder and you are done.
Note: to find the layer containing your tools just open the configuration file (maybe using jq because it is json) and take the diff_id that corresponds to the RUN instruction you find in the history property. You should understand it once you open the JSON configuration. This is unnecessary if you have a small image that has, for example, debian as parent image and a single RUN instruction containing the tools you want to install.
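The lookup described in the note can be sketched in Python as follows. In the image configuration JSON, history entries flagged with empty_layer (for example ENV or CMD) have no diff_id of their own, so the remaining entries line up in order with rootfs.diff_ids. The function name diff_ids_by_instruction is hypothetical.

```python
from typing import Dict, List


def diff_ids_by_instruction(config: Dict) -> List[Dict[str, str]]:
    """Pair each layer-producing history entry with its diff_id, in order."""
    diff_ids = iter(config["rootfs"]["diff_ids"])
    pairs = []
    for entry in config["history"]:
        # Metadata-only instructions are recorded in history but create no layer.
        if entry.get("empty_layer", False):
            continue
        pairs.append(
            {"created_by": entry.get("created_by", ""), "diff_id": next(diff_ids)}
        )
    return pairs
```

The config argument is the parsed configuration file, e.g. the result of json.load on the file that skopeo wrote. The entry whose created_by contains the apt install command is the tools layer.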
Get GCC image content
Now that we have the tool layer content, we need to extract the gcc filesystem. We don't need skopeo for this one; docker export is enough:
create a container from gcc (with the tag you need):
docker create --name gcc4.8 gcc:4.8
export it as tar:
docker export -o gcc4.8.tar gcc4.8
finally extract the tar file.
Putting all together
The final Dockerfile could be something like:
FROM scratch
COPY ./tools_layer/ /
COPY ./gcc_4.x/ /
In this way, the tools layer is always reused (unless you change the content of that folder, of course), but you can parameterize the gcc_4.x with the ARG instruction for example.
Read carefully: none of this is tested, and you might encounter 2 issues:
The gcc image overwrites some files you have changed in the tools layer. You can check whether this happens by computing the diff between the gcc layer folder and the tools layer folder. If it does, the only option is to keep track of those files and add them in the Dockerfile after the COPY ./gcc ... with another COPY.
When a file is removed in an upper layer, docker marks it with a whiteout entry, a file whose name starts with .wh. (not sure if it is different with skopeo). If in the tools layer you delete a file that exists in the gcc layer, then that file will not be deleted using the above Dockerfile (the COPY ./gcc ... instruction would simply put the file back). In this case too, you would need to add an additional RUN rm ... instruction.
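Both checks can be sketched in Python, assuming the two layers have already been extracted into plain folders. The function names are hypothetical, and the .wh. prefix is the whiteout naming used in image layer tarballs.

```python
from pathlib import Path
from typing import List, Set


def relative_files(root: Path) -> Set[Path]:
    """Return every file (or symlink) under `root`, as a path relative to `root`."""
    return {
        p.relative_to(root)
        for p in root.rglob("*")
        if p.is_file() or p.is_symlink()
    }


def overlapping_files(tools_dir: Path, gcc_dir: Path) -> List[Path]:
    """Files present in both trees, i.e. those the later COPY would overwrite."""
    return sorted(relative_files(tools_dir) & relative_files(gcc_dir))


def whiteout_markers(layer_dir: Path) -> List[Path]:
    """Whiteout entries (names starting with .wh.) found in an extracted layer."""
    return sorted(p.relative_to(layer_dir) for p in layer_dir.rglob(".wh.*"))
```

Every path returned by overlapping_files is a candidate for an extra COPY after the gcc one, and every whiteout marker points at a file that needs an explicit RUN rm.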
This is probably not the correct approach if you have a more complex image than the one you are showing us. In my opinion, you could give this a try and just see if this works out with a single Dockerfile. Obviously, if you have many compilers, each one having its own tools set, the maintainability of this approach could be a real burden. Instead, if the Dockerfile is more or less linear for all the compilers, this might be good (after all, you do not do this every day).
Now the question is: is avoiding layer replication so important that you are willing to complicate the image-building process this much?

Github Actions path does not update

Right now, I'm trying to build a tool from source and use it to build a C++ project. I'm able to extract the tar file (gcc-arm-none-eabi). But, when I try to add it to path (using $GITHUB_PATH, not add-path), the path doesn't apply on my next action and I can't build the file. The error states that it can't find the gcc-arm-none-eabi toolset, which means that it didn't go to path.
Here's the script for the entrypoint of the first action (make is run in the next action to allow for path to apply)
echo "Downloading ARM Toolchain"
# The one from apt isn't updated so I have to build from source
curl -L https://developer.arm.com/-/media/Files/downloads/gnu-rm/10-2020q4/gcc-arm-none-eabi-10-2020-q4-major-x86_64-linux.tar.bz2 -o gcc-arm-none-eabi.tar.bz2
tar -xjf gcc-arm-none-eabi.tar.bz2
echo "/github/workspace/gcc-arm-none-eabi-10-2020-q4-major/bin" >> $GITHUB_PATH
I can't even debug by seeing what's in the path because running echo $(PATH) just says that PATH cannot be found. What should I do?
First, PATH is not a command so if you want to print its value, it would be something like echo "${PATH}" or echo "$PATH"
Then, if you want to add a value to an existing environment variable, it would be something like
export PATH="${PATH}:/github/workspace/gcc-arm-none-eabi-10-2020-q4-major/bin"
EDIT: it seems export is not a valid way to add something to the path in GitHub Actions, whereas the $GITHUB_PATH approach in the question appears to be correct. For more details: https://docs.github.com/en/free-pro-team@latest/actions/reference/workflow-commands-for-github-actions#adding-a-system-path . Thanks to Benjamin W. for pointing this out in the comments.
Finally, I think it would be a better fit to use a docker image that already contains that kind of dependency (you could easily write your own Dockerfile if such an image doesn't already exist). GitHub Actions is designed to use docker (or OCI container) images that contain the dependencies you need to perform your build actions. You should take a look here: https://docs.github.com/en/free-pro-team@latest/actions/creating-actions/dockerfile-support-for-github-actions
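For reference, the $GITHUB_PATH mechanism used in the question works by appending one directory per line to the file that the GITHUB_PATH environment variable points to. The change applies to later steps of the same job, not to the step that writes it. A minimal Python sketch of that append follows; the function name add_to_github_path is hypothetical.

```python
import os
from typing import Mapping


def add_to_github_path(directory: str, env: Mapping[str, str] = os.environ) -> None:
    """Append `directory` as a new line to the file named by GITHUB_PATH."""
    path_file = env["GITHUB_PATH"]  # set by the runner; KeyError outside of a workflow
    with open(path_file, "a", encoding="utf-8") as handle:
        handle.write(directory + "\n")
```

Printing os.environ["PATH"] in a later step is then a simple way to check whether the directory was picked up.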

Docker: cd into directory using regular expressions

I have to build a Docker image, and in the Dockerfile I have to move into a directory whose name contains a dynamic id, e.g. myfolder12345, which can become myfolder56789 in another build. As I don't know what this id will be each time I do the build, I have tried to use regular expressions to achieve that.
I've tried with
WORKDIR myfolder*
but the current directory remains /.
How can this be solved?
If you know, during the build of the docker image, the name of the folder, use a variable:
WORKDIR ${workdir}
And set the variable value during the build.
Having a non-deterministic content in a docker image is a bad idea.
You should rename the folder first, then the WORKDIR will be easier:
# Supposing there is only one folder123 at each build
RUN mv /path/to/folder* /path/to/folder
WORKDIR /path/to/folder
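If the folder is produced by a script rather than by a shell one-liner, the same rename can be sketched in Python. It assumes, like the comment above, that exactly one matching folder exists, and it fails loudly otherwise. The function names are hypothetical.

```python
from pathlib import Path


def find_single_dir(parent: Path, pattern: str) -> Path:
    """Return the only directory in `parent` matching the glob `pattern`."""
    matches = [p for p in parent.glob(pattern) if p.is_dir()]
    if len(matches) != 1:
        raise ValueError(
            f"expected exactly one directory matching {pattern!r}, found {len(matches)}"
        )
    return matches[0]


def normalize_dir(parent: Path, pattern: str, fixed_name: str) -> Path:
    """Rename the single matching directory to `fixed_name` and return the new path."""
    target = parent / fixed_name
    find_single_dir(parent, pattern).rename(target)
    return target
```

After normalize_dir(Path("/path/to"), "folder*", "folder") has run, a fixed WORKDIR /path/to/folder works in every build.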
how about trying the regex syntax?
WORKDIR workdir[\d]+
Let me know if that works for you.

Using %{buildroot} in a SPEC file

I'm creating a simple RPM installer, I just have to copy files to a directory structure I create in the %install process.
The %install process is fine, I create the following folder /opt/company/application/ with the command mkdir -p %{buildroot}/opt/company/%{name} and then I proceed to copy the files and subdirectories from my package. I've tried to install it and it works.
The doubt I have comes when uninstalling. I want to remove the folder /opt/company/application/ and I thought you're supposed to use %{buildroot} everywhere when referencing the install location, because my understanding is that the user might have a different structure and you can't assume that rmdir /opt/company/%{name}/ will work. Using that command in the %postun section successfully deletes the directories, whereas using rmdir ${buildroot}/opt/company/%{name} doesn't delete the folders.
My question is, shouldn't you be using ${buildroot} in the %postun in order to get the proper install location? If that's not the case, why?
Don't worry about it. If you claim the directory as your own in the %files section, RPM will handle it for you.
FYI, %{buildroot} probably won't exist on the target machine.