Can we mkdir + rename and remain atomic?

I am designing a build system that needs to be careful about manipulating the filesystem in an atomic manner.
I have run into a situation where I have a temporary directory that contains files and now I want to move it to a more permanent home. I know how to rename atomically, but this does not work if the full destination path does not exist. If I didn't care about being atomic, the answer would be a simple mkdir -p ... && mv .... Is it possible to rename the temporary directory atomically in a way that will create the parent directories? I cannot do so ahead of time, because the parent directory names are computed.
This will be run inside of a Node.js program, so something from the npm ecosystem would be desirable, but it is trivial to call a command line program or C / C++ library to do the work.
Example:
/tmp/mytempbuild - a directory containing files
/myproject/build/dev/0.2.2 - the desired destination, whose parent may not exist
/myproject/build/master - an older, existing directory that must not be changed
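A minimal sketch of the usual two-step approach, under stated assumptions rather than a definitive answer: creating the missing parents is not atomic, but it is idempotent and only ever adds empty directories, so the one step that needs to be atomic is the final rename. The helper name publish_directory is hypothetical. It assumes C++17 and a POSIX system, where std::filesystem::rename maps to rename(2), and it assumes the temporary directory lives on the same filesystem as the destination. If /tmp is a separate filesystem the rename fails with EXDEV, so the staging directory would have to be created somewhere under /myproject instead.

```cpp
#include <cassert>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: move `staged` to `dest`, creating dest's parents first.
// Returns an empty error_code on success.
std::error_code publish_directory(const fs::path& staged, const fs::path& dest) {
    assert(!staged.empty() && !dest.empty());
    std::error_code ec;

    // Step 1: create missing parents (like `mkdir -p`). Not atomic, but it only
    // adds empty directories and never touches existing siblings.
    const fs::path parent = dest.parent_path();
    if (!parent.empty()) {
        fs::create_directories(parent, ec);
        if (ec) return ec;
    }

    // Step 2: the switch itself. rename(2) is atomic when both paths are on the
    // same filesystem; it fails if `dest` is an existing non-empty directory.
    fs::rename(staged, dest, ec);
    return ec;
}
```

With the example paths, the call would be publish_directory("/tmp/mytempbuild", "/myproject/build/dev/0.2.2"). Observers see either no 0.2.2 directory or the complete one, and /myproject/build/master is never touched. A crash between the two steps can leave an empty /myproject/build/dev behind, which is harmless.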

Related

Why is the main() function of my program named "test" not getting called? [duplicate]

When running scripts in bash, I have to write ./ in the beginning:
$ ./manage.py syncdb
If I don't, I get an error message:
$ manage.py syncdb
-bash: manage.py: command not found
What is the reason for this? I thought . is an alias for current folder, and therefore these two calls should be equivalent.
I also don't understand why I don't need ./ when running applications, such as:
user:/home/user$ cd /usr/bin
user:/usr/bin$ git
(which runs without ./)

Because on Unix, usually, the current directory is not in $PATH.
When you type a command the shell looks up a list of directories, as specified by the PATH variable. The current directory is not in that list.
The reason for not having the current directory on that list is security.
Let's say you're root and go into another user's directory and type sl instead of ls. If the current directory is in PATH, the shell will try to execute the sl program in that directory (since there is no other sl program). That sl program might be malicious.
It works with ./ because POSIX specifies that a command name that contains a / will be used as a filename directly, suppressing a search in $PATH. You could have used the full path for the exact same effect, but ./ is shorter and easier to write.
EDIT
That sl part was just an example. The directories in PATH are searched sequentially and when a match is made that program is executed. So, depending on how PATH looks, typing a normal command may or may not be enough to run the program in the current directory.

When bash interprets the command line, it looks for commands in locations described in the environment variable $PATH. To see it type:
echo $PATH
You will have some paths separated by colons. As you will see the current path . is usually not in $PATH. So Bash cannot find your command if it is in the current directory. You can change it by having:
PATH=$PATH:.
This line appends the current directory to $PATH so you can do:
manage.py syncdb
It is not recommended, as it is a security issue, plus you can get weird behaviours, since . varies with the directory you are in :)
Avoid:
PATH=.:$PATH
As you can “mask” some standard commands and open the door to a security breach :)
Just my two cents.

Your script, when it is in your home directory, will not be found when the shell searches the directories in the $PATH environment variable, because your home directory is not one of them.
The ./ says 'look in the current directory for my script rather than looking at all the directories specified in $PATH'.

When you include the './' you are essentially giving the "full path" to the executable bash script, so your shell does not need to check your PATH variable. Without the './' your shell will look in your PATH variable (which you can see by running echo $PATH) to see if the command you typed lives in any of the folders on your PATH. If it doesn't (as is the case with manage.py), it says it can't find the file. It is considered bad practice to include the current directory on your PATH, which is explained reasonably well here: http://www.faqs.org/faqs/unix-faq/faq/part2/section-13.html

On *nix, unlike Windows, the current directory is usually not in your $PATH variable. So the current directory is not searched when executing commands. You don't need ./ for running applications because these applications are in your $PATH; most likely they are in /bin or /usr/bin.

This question already has some awesome answers, but I wanted to add that, if your executable is on the PATH, and you get very different outputs when you run
./executable
to the ones you get if you run
executable
(let's say you run into error messages with the one and not the other), then the problem could be that you have two different versions of the executable on your machine: one on the path, and the other not.
Check this by running
which executable
and
whereis executable
It fixed my issues...I had three versions of the executable, only one of which was compiled correctly for the environment.

Rationale for the / POSIX PATH rule
The rule was mentioned at: Why do you need ./ (dot-slash) before executable or script name to run it in bash? but I would like to explain why I think that is a good design in more detail.
First, an explicit full version of the rule is:
if the path contains / (e.g. ./someprog, /bin/someprog, ./bin/someprog): CWD is used and PATH isn't
if the path does not contain / (e.g. someprog): PATH is used and CWD isn't
Now, suppose that running:
someprog
would search:
relative to CWD first
relative to PATH after
Then, if you wanted to run /bin/someprog from your distro, and you did:
someprog
it would sometimes work, but at other times it would fail, because you might be in a directory that contains another unrelated someprog program.
Therefore, you would soon learn that this is not reliable, and you would end up always using absolute paths when you want to use PATH, therefore defeating the purpose of PATH.
This is also why having relative paths in your PATH is a really bad idea. I'm looking at you, node_modules/bin.
Conversely, suppose that running:
./someprog
Would search:
relative to PATH first
relative to CWD after
Then, if you just downloaded a script someprog from a git repository and wanted to run it from CWD, you would never be sure that this is the actual program that would run, because maybe your distro has a:
/bin/someprog
which is in your PATH from some package you installed after drinking too much after Christmas last year.
Therefore, once again, you would be forced to always run local scripts relative to CWD with full paths to know what you are running:
"$(pwd)/someprog"
which would be extremely annoying as well.
Another rule that you might be tempted to come up with would be:
relative paths use only PATH, absolute paths only CWD
but once again this forces users to always use absolute paths for non-PATH scripts with "$(pwd)/someprog".
The / path search rule offers a simple-to-remember solution to the above problem:
slash: don't use PATH
no slash: only use PATH
which makes it super easy to always know what you are running, by relying on the fact that files in the current directory can be expressed either as ./somefile or somefile, and so it gives special meaning to one of them.
Sometimes it is slightly annoying that you cannot search for some/prog relative to PATH, but I don't see a saner solution to this.
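The rule above can be sketched as a small lookup function. This is an illustration of the rule as described here, not how bash is actually implemented. The helper name resolve_command is hypothetical, it assumes C++17 on a POSIX system, and it skips empty PATH entries as a simplification.

```cpp
#include <cassert>
#include <filesystem>
#include <optional>
#include <sstream>
#include <string>
#include <system_error>
#include <unistd.h>  // access(), X_OK (POSIX)

namespace fs = std::filesystem;

// Hypothetical helper mimicking the lookup rule described above.
// `path_var` is the colon-separated value of PATH.
std::optional<fs::path> resolve_command(const std::string& name,
                                        const std::string& path_var) {
    assert(!name.empty());
    auto is_executable_file = [](const fs::path& p) {
        std::error_code ec;
        return fs::is_regular_file(p, ec) && access(p.c_str(), X_OK) == 0;
    };

    // Slash: the name is used as a path (relative to CWD unless absolute);
    // PATH is ignored.
    if (name.find('/') != std::string::npos) {
        if (is_executable_file(name)) return fs::path(name);
        return std::nullopt;
    }

    // No slash: only the directories listed in PATH are searched, in order.
    std::istringstream entries(path_var);
    std::string dir;
    while (std::getline(entries, dir, ':')) {
        if (dir.empty()) continue;  // simplification made for this sketch
        const fs::path candidate = fs::path(dir) / name;
        if (is_executable_file(candidate)) return candidate;
    }
    return std::nullopt;
}
```

Calling resolve_command("someprog", "/usr/bin:/bin") never looks in the current directory, while resolve_command("./someprog", "/usr/bin:/bin") never looks in PATH, which is exactly the asymmetry the rule relies on.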

When the script is not in the PATH, it's required to do so. For more info read http://www.tldp.org/LDP/Bash-Beginners-Guide/html/sect_02_01.html

Everyone has given great answers to the question, and yes, this only applies when running the script from the current directory, unless you use the absolute path. See my samples below.
Also, ./ (dot-slash) made sense to me once I ran the command from the child folder tmp2 (/tmp/tmp2), where it needs ../ (double-dot-slash) instead.
SAMPLE:
[fifi#ip-172-31-17-12 tmp]$ ./StackO.sh
Hello Stack Overflow
[fifi#ip-172-31-17-12 tmp]$ /tmp/StackO.sh
Hello Stack Overflow
[fifi#ip-172-31-17-12 tmp]$ mkdir tmp2
[fifi#ip-172-31-17-12 tmp]$ cd tmp2/
[fifi#ip-172-31-17-12 tmp2]$ ../StackO.sh
Hello Stack Overflow

Accessing .in files from a different directory

Suppose that I add a program to PATH that is dependent on a file named "test.in". I programmed this in C++, so I used ifstream fin("test.in") without specifying the directory. Now if I were to run this program from a different directory, would the program be able to access the file "test.in"?

Firstly, this has nothing to do with the file extension, which is merely a convention given as part of the filename.
Secondly, you were always using a relative path. Even when you were running your program "from the same directory" as test.in, you were reliant on the "working directory" of your shell context being the same as the directory in which the executable and the file reside.
This is not always the case.
For example:
~/myProject:# ls
test.in
program
~/myProject:# ./program
This is okay, because your shell is at ~/myProject, and so is test.in.
However, if you'd written:
~/myProject:# cd ..
~:# ./myProject/program
…then your test.in file wouldn't be found, as it does not exist in ~. It exists in ~/myProject. It doesn't matter that the executable itself is also found in ~/myProject.
This is actually desirable behaviour, as it allows flexibility from the shell. Ideally you would allow support for piping/redirecting the file to the process instead (program < test.in — now there are no assumptions baked into your code at all!), but we can save that for another day.
For now, you seem to be concerned about what happens if you move the executable away. Don't worry: just use this feature!
~:# mv myProject/program .
~:# cd myProject
~/myProject:# ../program
Your working directory is the directory in which test.in resides, so it will be found via the relative path given in your program code.
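To make the point above concrete, here is a small sketch, assuming C++17. Both helper names are hypothetical. The first shows where a relative name like "test.in" is looked up at the moment of the call. The second opens the file relative to an explicit base directory, which is one way to stop depending on the working directory if that is what you want.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: where would a relative name such as "test.in" be looked
// up right now? Relative paths resolve against the process's working directory,
// not against the directory that contains the executable.
fs::path effective_location(const fs::path& name) {
    return name.is_absolute() ? name : fs::current_path() / name;
}

// Hypothetical helper: open an input file relative to an explicit base
// directory, so the result does not depend on where the program was started.
std::ifstream open_input(const fs::path& base_dir, const std::string& name) {
    assert(!name.empty());
    return std::ifstream(base_dir / name);
}
```

Printing effective_location("test.in") before opening the file is a quick way to see why a run from ~ fails while a run from ~/myProject succeeds.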

C++ system mkdir with path

Ran into a little snag here. I'm trying to make a directory inside of another directory using a variable directory name created by the function in use. Basically I want to store any created accounts in a directory named accounts that is separate from everything else. Here is what I have for my function:
system(("mkdir -p /home/user/Program/accounts"+accname).c_str());
The problem I am running into is that it creates the directory in Program as accounts(accname) instead of in accounts with accname being the directory.
Example with accname = tim would currently look like accountstim inside of Program instead of tim inside of accounts.

You're passing the -p flag, which will create all directories that you don't already have, so you're on the right track.
You'll need to add another slash to get a new directory. Without this extra slash, anything at the end of the string becomes part of the accounts directory, and not the name of a new directory:
system(("mkdir -p /home/user/Program/accounts/"+accname).c_str()); // note the slash after accounts!
That would solve your problem, but I advise against using the system function.
EDIT: Using mkdir only applies if you are running a POSIX system or another system that supplies a mkdir function. If you're on Windows, I don't know how that would be done.
It's advisable to use the mkdir system call instead. If you're only creating one directory, the mkdir function call should be relatively straightforward. If you are running Linux, you can read about it in the mkdir(2) man page.
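As an illustration of avoiding system(), here is a minimal sketch that swaps in C++17's std::filesystem::create_directories in place of the POSIX mkdir call mentioned above, since it also creates missing parents like mkdir -p does. The helper name make_account_dir is hypothetical, and accname is assumed to be a plain, non-empty name with no leading slash.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: create <accounts_root>/<accname>, including missing
// parents, without going through a shell.
// Returns true if the directory exists afterwards.
bool make_account_dir(const fs::path& accounts_root, const std::string& accname) {
    assert(!accname.empty());
    // operator/ inserts the separator, so "accounts" and "tim" cannot fuse
    // into "accountstim".
    const fs::path dir = accounts_root / accname;
    std::error_code ec;
    fs::create_directories(dir, ec);  // like `mkdir -p`; no error if it exists
    return !ec && fs::is_directory(dir);
}
```

For the example in the question, make_account_dir("/home/user/Program/accounts", "tim") would create tim inside accounts. Because no shell is involved, spaces or shell metacharacters in accname are not interpreted as commands.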

Using %{buildroot} in a SPEC file

I'm creating a simple RPM installer, I just have to copy files to a directory structure I create in the %install process.
The %install process is fine, I create the following folder /opt/company/application/ with the command mkdir -p %{buildroot}/opt/company/%{name} and then I proceed to copy the files and subdirectories from my package. I've tried to install it and it works.
The doubt I have comes when uninstalling. I want to remove the folder /opt/company/application/ and I thought you're supposed to use %{buildroot} anywhere when referencing the install location, because my understanding is that the user might have a different structure and you can't assume that rmdir /opt/company/%{name}/ will work. Using that command in the %postun section successfully deletes the directories, whereas using rmdir ${buildroot}/opt/company/%{name} doesn't delete the folders.
My question is, shouldn't you be using ${buildroot} in the %postun in order to get the proper install location? If that's not the case, why?

Don't worry about it. If you claim the directory as your own in the %files section, RPM will handle it for you.
FYI, %{buildroot} probably won't exist on the target machine.

How to install programs on Linux from a makefile? [duplicate]

Possible Duplicate:
What should Linux/Unix 'make install' consist of?
I'm making a program that can be invoked from the command line, like ./prog arg1 arg2. I was wondering, how can I make it so that I can run it from anywhere on the system? I know that I could put prog into /usr/bin/, but what if my program needs resources from its install directory (that can be wherever the user downloaded it)?

Put the directory in which your program resides into the PATH environment variable, or move your program into one of the directories already in PATH (this usually requires superuser permission, which I gather you don't have, for then you wouldn't ask this question).
To add a directory to the front of the search path and have the shell refresh its command hash table in tcsh, say
setenv PATH /my/directory:$PATH
rehash
In bash, I think, it's
PATH=/my/directory:$PATH
export PATH
(no need to rehash). Note that the above commands put your directory at the top of the search path, i.e. these will be searched before any other. Thus, if your program is called "gcc", then your program will be executed rather than the GNU C compiler. Alternatively, you can add your directory to the end of the search path, in which case your program will only be picked up if no other program of the same name is found in any of the other directories in the search path.

You probably also want to become familiar with the Linux Filesystem Hierarchy: the standard definition for "what goes where". Here's more information:
https://superuser.com/questions/90479/what-is-the-conventional-install-location-for-applications-in-linux
Environment variables can be defined globally ("for everybody", e.g. /etc/profile), or locally ("per user", e.g. ~/.bashrc). Here's a good summary of some of your options:
https://wiki.archlinux.org/index.php/Environment_Variables

When you execute a program using prog arg1 arg2, it's thanks to your shell, which searches the $PATH environment variable for the folders where programs are. (Try env | grep PATH to see those folders.)
You need either to add a new directory to this variable (export PATH="/new/directory/path/:$PATH" if under bash, setenv PATH "/new/directory/path/:$PATH" if with tcsh) or to copy your program and all the files it needs to execute into one of the PATH folders.

There are two ways of dealing with this (and Makefiles have nothing to do with them)
Your installer could just put the files where it wants them, so your program doesn't have to search -- it can use hardcoded paths. Or you could put the path to the data directory into yet another file, which would be hardcoded (like /etc/programname.config).
You put all your stuff into one directory (often something like /opt/programname). You can hardcode that too, of course, or your program can readlink() the /proc/pid/exe file (/proc/self/exe for the running process) to get the path to the executable, with a good chance of success but no guarantee. In particular, it works if, for example, a symlink points from /usr/bin/programname to your /opt/programname/bin/programname or whatever, but it won't work if that's a hardlink. From there you should be able to reach your data files.
I prefer the second solution, but that's just me. The first solution works well with package managers, and it's less overkill if you don't really have a lot of data files.
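The /proc lookup from the second option can be sketched like this. It is Linux-only, it assumes C++17, and it uses std::filesystem::read_symlink in place of a raw readlink() call. Both helper names and the <prefix>/bin layout are assumptions for illustration.

```cpp
#include <cassert>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Linux-only sketch: directory containing the running executable, or an empty
// path if /proc/self/exe cannot be read.
fs::path executable_dir() {
    std::error_code ec;
    const fs::path exe = fs::read_symlink("/proc/self/exe", ec);
    return ec ? fs::path{} : exe.parent_path();
}

// Hypothetical layout: <prefix>/bin/programname, with data files under <prefix>.
// Returns <prefix>/<relative>, or an empty path if the lookup failed.
fs::path data_path(const fs::path& relative) {
    assert(relative.is_relative());  // an absolute argument would replace the prefix
    const fs::path bin = executable_dir();
    return bin.empty() ? fs::path{} : bin.parent_path() / relative;
}
```

With the program installed as /opt/programname/bin/programname and started through a symlink in /usr/bin, data_path("share/config.txt") would point at /opt/programname/share/config.txt. Callers should still handle the empty-path case, since the lookup is not guaranteed to work.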
