Is there a "watch" / "monitor" / "guard" program for Makefile dependencies? - c++

I've recently been spoiled by using nodemon in a terminal window, to run my Node.js program whenever I save a change.
I would like to do something similar with some C++ code I have. My actual project has lots of source files, but if we assume the following example, I would like to run make automatically whenever I save a change to sample.dat, program.c or header.h.
test: program sample.dat
    ./program < sample.dat

program: program.c header.h
    gcc program.c -o program
Is there an existing solution which does this?
(Without firing up an IDE. I know lots of IDEs can do a project rebuild when you change files.)

If you are on a platform that supports inotifywait (to my knowledge, only Linux; but since you asked about Make, it seems there's a good chance you're on Linux; for OS X, see this question), you can do something like this:
inotifywait --exclude '.*\.swp|.*\.o|.*~' --event MODIFY -q -m -r . |
while read line; do
    make
done
Breaking that down:
inotifywait
Listen for file system events.
--exclude '.*\.swp|.*\.o|.*~'
Exclude files that end in .swp, .o or ~ (you'll probably want to add to this list).
--event MODIFY
Only react to 'modify' events (a file's contents were written); for each event, inotifywait prints a line naming the file, which is what drives the loop below.
-q
Do not print startup messages (so make is not prematurely invoked).
-m
Listen continuously.
-r .
Listen recursively on the current directory.
The output is then piped into a simple loop which invokes make for every line read.
Tailor it to your needs. You may find inotifywait --help and the manpage helpful.
Here is a more detailed script. I haven't tested it much, so use with discernment. It is meant to keep the build from happening again and again needlessly, such as when switching branches in Git.
#!/bin/sh
datestampFormat="%Y%m%d%H%M%S"
lastrun=$(date +$datestampFormat)

inotifywait --exclude '.*\.swp|.*\.o|.*~' \
        --event MODIFY \
        --timefmt $datestampFormat \
        --format %T \
        -q -m -r . |
while read modified; do
    if [ $modified -gt $lastrun ]; then
        make
        lastrun=$(date +$datestampFormat)
    fi
done
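To use it, save the script under any name (watch-and-make.sh below is just an example), make it executable, and leave it running in a terminal at the root of the project:
chmod +x watch-and-make.sh
./watch-and-make.sh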

Related

Dynamically-created 'zip' command not excluding directories properly

I'm the author of a utility that makes compressing projects using zip a bit easier, especially when you have to compress regularly, such as when updating projects submitted to an application store (like Chrome's Web Store).
I'm attempting to make quite a few improvements, but have run into an issue, described below.
A Quick Overview
My utility's command format is similar to command OPTIONS DEST DIR1 {DIR2 DIR3 DIR4...}. It works by running zip -r DEST.zip DIR1, a fairly simple process. The benefit of my utility, however, is the ability to use a predetermined file (think .gitignore) to ignore specific files/directories, or files/directories which match a pattern.
It's pretty simple -- if the "ignorefile" exists in a target directory (DIR1, DIR2, DIR3, etc), my utility will add exclusions to the zip -r DEST.zip DIR1 command using the pattern -x some_file or -x some_dir/*.
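For illustration only, here is a stripped-down sketch of how such a command string might be assembled in sh; the ignore-file name .zipignore and the variable names are made up for this sketch, not taken from the actual utility:

#!/bin/sh
dest="foo.zip"
dir="project"
cmd="zip -r $dest $dir"
# append one -x exclusion per pattern listed in the ignore file, if present
if [ -f "$dir/.zipignore" ]; then
    while read -r pattern; do
        cmd="$cmd -x $dir/$pattern"
    done < "$dir/.zipignore"
fi
# the assembled string is what eventually gets executed,
# which is where the trouble described below starts
echo "$cmd"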
The Issue
I am running into an issue with directory exclusion, however, and I can't quite figure out why (probably because I am still quite the sh novice). I'll run through some examples:
Let's say that I want to ignore two things in my project directory: .git/* and .gitignore. Running command foo.zip project_dir builds the following command:
zip -r foo.zip project -x project/.git/\* -x project/.gitignore
Woohoo! Success! Well... not quite.
In this example, .gitignore is not added to the compressed output file, foo.zip. The directory .git/*, however, along with all of its subdirectories and files, is added to the compressed output file.
Manually running the command:
zip -r foo.zip project_dir -x project/.git/\* -x project/.gitignore
This works as expected, of course, so naturally I am pretty puzzled as to why my identical, but dynamically built, command does not work.
Attempted Resolutions
I have attempted a few different methods of resolving this to no avail:
Removing -x project/.git/\* from the command, and instead adding each subdirectory and file within that directory, such as -x project/.git/config -x project/.git/HEAD, etc (including children of subdirectories)
Removing the backslash before the asterisk, so that the resulting exclusion option within the command is -x project/.git/*
Bashing my head on the keyboard in angst (I'm really surprised this didn't work, it usually does)
Some notes
My utility uses /bin/sh; I would prefer to keep it that way for maximum compatibility.
I am aware of the git archive feature; my use of .git/* and .gitignore above is simply an example. My utility is not dependent on git, nor is it used exclusively for projects which are git repositories.
I suspected the problem would be in the evaluation of the generated command, since you said the same command worked correctly when executed directly.
So, as the comment section says, I think you already found the correct solution. This happens because when you run that variable directly, things like globs can be expanded by the shell instead of being passed to the command, and the arguments may be messed up, depending on the situation.
Yes, in that case:
eval $COMMAND
is the way to go.
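To see why the dynamically built string behaves differently from the same command typed at the prompt, compare plain expansion with eval; a minimal sketch (the variable name cmd is arbitrary):

cmd='zip -r foo.zip project -x project/.git/\* -x project/.gitignore'

# Plain expansion: the shell word-splits $cmd, but it does not re-parse
# quotes or backslashes, so zip receives the literal pattern project/.git/\*
$cmd

# eval re-parses the string, so the escaped glob reaches zip exactly as if
# the command had been typed interactively
eval "$cmd"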

Copy and Rename Multiple Files with Regular Expressions in bash

I've got a file structure that looks like:
A/
    2098765.1ext
    2098765.2ext
    2098765.3ext
    2098765.4ext
    12345.1ext
    12345.2ext
    12345.3ext
    12345.4ext
B/
    2056789.1ext
    2056789.2ext
    2056789.3ext
    2056789.4ext
    54321.1ext
    54321.2ext
    54321.3ext
    54321.4ext
I need to rename all the files that begin with 20 to start with 10; i.e., I need to rename B/2022222.1ext to B/1022222.1ext
I've seen many of the other questions regarding renaming multiple files, but couldn't seem to make it work for my case. Just to see if I can figure out what I'm doing before I actually try to do the copy/renaming, I've done:
for file in "*/20?????.*"; do
echo "{$file/20/10}";
done
but all I get is
{*/20?????.*/20/10}
Can someone show me how to do this?
You just have a little bit of incorrect syntax is all:
for file in */20?????.*; do mv $file ${file/20/10}; done
Remove the quotes from the argument to in. Otherwise, the filename expansion does not occur.
The $ in the substitution should go before the brace.
Here is a solution which uses the find command:
find . -name '20*' | while read oldname; do echo mv "$oldname" "${oldname/20/10}"; done
This command does not actually do your bidding, it only prints out what should be done. Review the output and if you are happy, remove the echo command and run it for real.
Just to add to Explosion Pill's answer.
On OS X though, you must say
mv "${file}" "${file_expression}"
or the mv command does not recognize it.
Brace expansions like:
{*/20?????.*/20/10}
can't be surrounded by quotes.
Instead, try doing (with Perl rename):
rename 's|/20|/10|' */20?????.*
You can do this using the Perl tool rename from the shell prompt. (There are other tools with the same name which may or may not be able to do this, so be careful.)
If you want to do a dry run to make sure you don't clobber any files, add the -n switch to the command.
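For example, a dry run mirroring the command above (still assuming the Perl rename is the one installed) would be:
rename -n 's|/20|/10|' */20?????.*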
note
If you run the following command (Linux):
$ file $(readlink -f $(type -p rename))
and you have a result like
.../rename: Perl script, ASCII text executable
then this seems to be the right tool =)
This seems to be the default rename command on Ubuntu.
To make it the default on Debian and derivatives like Ubuntu:
sudo update-alternatives --set rename /path/to/rename
The glob behavior of * is suppressed in double quotes. Try:
for file in */20?????.*; do
    echo "${file/20/10}";
done

Is there a build tool based on inotify-like mechanism

In relatively big projects which use plain old make, even building the project when nothing has changed takes a few tens of seconds, especially with many executions of make -C, each of which adds new-process overhead.
The obvious solution to this problem is a build tool based on an inotify-like feature of the OS: it would watch for file changes and, based on that list, compile only the affected files.
Is there such machinery out there? Bonus points for open source projects.
You mean like Tup?
From the home page:
"Tup is a file-based build system - it inputs a list of file changes and a directed acyclic graph (DAG), then processes the DAG to execute the appropriate commands required to update dependent files. The DAG is stored in an SQLite database. By default, the list of file changes is generated by scanning the filesystem. Alternatively, the list can be provided up front by running the included file monitor daemon."
I am just wondering if it is stat()ing the files that takes so long. To check this, here is a small SystemTap script I wrote to measure the time it takes to stat() files:
# count-calls.stp
global calls, times
probe kernel.function(@1) {
    times[probefunc()] = gettimeofday_ns()
}
probe kernel.function(@1).return {
    now = gettimeofday_ns()
    delta = now - times[probefunc()]
    calls[probefunc()] <<< delta
}
And then use it like this:
$ stap -c "make -rC ~/src/prj -j8 -k" ~/tmp/count-calls.stp sys_newstat
make: Entering directory `/home/user/src/prj'
make: Nothing to be done for `all'.
make: Leaving directory `/home/user/src/prj'
calls["sys_newstat"] #count=8318 #min=684 #max=910667 #sum=26952500 #avg=3240
The project I ran it upon has 4593 source files and it takes ~27msec (26952500nsec above) for make to stat all the files along with the corresponding .d files. I am using non-recursive make though.
If you're using OSX, you can use fswatch
https://github.com/alandipert/fswatch
Here's how to use fswatch to watch for changes to a file and then run make if it detects any:
fswatch -o anyFile | xargs -n1 -I{} make
You can run fswatch from inside a makefile like this:
watch: $(FILE)
    fswatch -o $^ | xargs -n1 -I{} make
(Of course, $(FILE) is defined inside the makefile.)
make can now watch for changes in the file like this:
> make watch
You can watch a different file by overriding the variable:
> make watch FILE=anotherFile
Install inotify-tools and write a few lines of bash to invoke make when certain directories are updated.
As a side note, recursive make scales badly and is error prone. Prefer non-recursive make.
The change-dependency you describe is already part of Make, but Make is flexible enough that it can be used in an inefficient way. If the slowness really is caused by the recursion (the make -C commands), which it probably is, then you should reduce the recursion. (You could try adding your own conditional logic to decide whether to execute make -C, but that would be a very inelegant solution.)
Roughly speaking, if your makefiles look like this
# main makefile
foo:
    make -C bar baz
and this
# makefile in bar/
baz: quartz
    do something
you can change them to this:
# main makefile
foo: bar/quartz
    cd bar && do something
There are many details to get right, but now if bar/quartz has not been changed, the foo rule will not run.

Simple and efficient distribution of C++/Boost source code (amalgamation)

My job mostly consists of engineering analysis, but I find myself distributing code more and more frequently among my colleagues. A big pain is that not every user is proficient in the intricacies of compiling source code, and I cannot distribute executables.
I've been working with C++ using Boost, and the problem is that I cannot request every sysadmin of every network to install the libraries. Instead, I want to distribute a single source file (or as few as possible) so that the user can g++ source.c -o program.
So, the question is: can you pack the Boost libraries with your code, and end up with a single file? I am talking about the Boost libraries which are "headers only" or "templates only".
As an inspiration, please look at the distribution of SQLite or the Lemon Parser Generator; the author amalgamates everything into a single source file which is trivial to compile.
Thank you.
Edit:
A related question on SO covers the Windows environment; I work in Linux.
There is a utility that comes with Boost called bcp, which can scan your source and extract any Boost header files that are used from the Boost source tree. I've set up a script that does this extraction into our source tree, so that we can package the source that we need along with our code. It will also copy the Boost source files for a couple of Boost libraries that we use which are not header-only; these are then compiled directly into our applications.
This is done once, and then anybody who uses the code doesn't even need to know that it depends on Boost. Here is what we use. It will also build bjam and bcp, if they haven't been built already.
#!/bin/sh
BOOST_SRC=.../boost_1_43_0
DEST_DIR=../src/boost
TOOLSET=
if ( test `uname` = "Darwin") then
    TOOLSET="--toolset=darwin"
fi

# make bcp if necessary
if ( ! test -x $BOOST_SRC/dist/bin/bcp ) then
    if ( test -x $BOOST_SRC/tools/jam/*/bin.*/bjam ) then
        BJAM=$BOOST_SRC/tools/jam/*/bin.*/bjam
    else
        echo "### Building bjam"
        pushd $BOOST_SRC/tools/jam
        ./build_dist.sh
        popd
        if ( test -x $BOOST_SRC/tools/jam/*/bin.*/bjam ) then
            BJAM=$BOOST_SRC/tools/jam/*/bin.*/bjam
        fi
    fi
    echo "BJAM: $BJAM"
    pushd $BOOST_SRC/tools/bcp
    echo "### Building bcp"
    echo "$BJAM $TOOLSET"
    $BJAM $TOOLSET
    if [ $? != "0" ]; then
        exit 1;
    fi
    popd
fi

if ( ! test -x $BOOST_SRC/dist/bin/bcp ) then
    echo "### Couldn't find bcp"
    exit 1;
fi

mkdir -p $DEST_DIR

echo "### Copying boost source"
MAKEFILEAM=$DEST_DIR/libs/Makefile.am
rm -f $MAKEFILEAM

# Signals
# copy source libraries
mkdir -p $DEST_DIR/libs/signals/src
cp $BOOST_SRC/libs/signals/src/* $DEST_DIR/libs/signals/src/.
echo -n "boost_sources += " >> $MAKEFILEAM
for f in `ls $DEST_DIR/libs/signals/src | fgrep .cpp`; do
    echo -n "boost/libs/signals/src/$f " >> $MAKEFILEAM
done
echo >> $MAKEFILEAM

echo "### Extracting boost includes"
$BOOST_SRC/dist/bin/bcp --scan --boost=$BOOST_SRC ../src/*/*.[Ch] ../src/boost/libs/*/src/*.cpp ../src/smart_assert/smart_assert/priv/fwd/*.hpp $DEST_DIR
if [ $? != "0" ]; then
    echo "### bcp failed"
    rm -rf $DEST_DIR
    exit 1;
fi
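For a project that only needs header-only Boost libraries, the core of the above boils down to a single bcp invocation; a sketch with example paths (adjust them to your own tree):

# copy just the Boost headers that the listed sources actually use
mkdir -p boost-subset
~/boost_1_43_0/dist/bin/bcp --scan --boost=$HOME/boost_1_43_0 src/*.cpp src/*.h boost-subset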
Have you considered just writing a build script for a build system like SCons?
You could write a Python script to download Boost, unpack it, compile the needed files (you can even run bjam if needed), and compile your own code.
The only dependencies your colleagues will need are Python and SCons.
Run the preprocessor on your code and save the output. If you started with one main.cpp with a bunch of includes in it, you will end up with one file where all of the includes have been pulled in. If you have multiple .cpp files, you will have to concatenate them together and then run the preprocessor on the concatenated file; this should work as long as you don't have any duplicate global symbol names.
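A minimal sketch of that approach (the file names are placeholders); note that -E also expands the system and Boost headers into the output, which is one reason the next suggestion filters what gets combined:

cat main.cpp util.cpp > combined.cpp     # concatenate your translation units
g++ -E combined.cpp -o amalgamated.cpp   # preprocess: every #include is pulled in
g++ amalgamated.cpp -o program           # the single file still compiles on its own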
For a more portable method, do what SQLite does and write your own script to combine and concatenate the files you created plus Boost, without pulling in the system includes. See mksqlite3c.tcl in the SQLite code:
http://www2.sqlite.org/src/finfo?name=tool/mksqlite3c.tcl
Why not just check in all the necessary files to SVN, and send your co-workers the URL of the repository? Then they can check out the code whenever they want to, do an 'svn up' any time they want to update to the latest version, etc.
If you're on a Debian-derived variety of Linux, problems like this shouldn't come up: let the packaging system and policy manual do the work. Just make it clear that libboost-dev (or whatever the package is) is a build dependency of your code and needs to be installed beforehand, and then /usr/include/boost should be right there where your code expects to find it. If you're using a more recent version of Boost than the distro ships, it's probably worth figuring out how to package it yourself and work within the existing packaging/dependencies framework rather than reinventing another one.
I'm not familiar enough with .rpm-based distros to comment on how things work there. But knowing I can easily set up exactly the build environment I need is, for me, one of the biggest advantages of Debian-based development over Windows.

Unix: fast 'remove directory' for cleaning up daily builds

Is there a faster way to remove a directory than simply submitting
rm -r -f *directory*
? I am asking this because our daily cross-platform builds are really huge (e.g. 4 GB per build), so the hard disks on some of the machines frequently run out of space.
This is particularly the case for our AIX and Solaris platforms.
Maybe there are 'special' commands for directory removal on these platforms?
PASTE-EDIT (moved my own separate answer into the question):
I am generally wondering why 'rm -r -f' is so slow. Doesn't 'rm' just need to modify the '..' or '.' files to de-allocate filesystem entries?
something like
mv *directory* /dev/null
would be nice.
For deleting a directory from a filesystem, rm is your fastest option.
On Linux, we sometimes do our builds (a few GB) in a ramdisk, and it has a really impressive delete speed :) You could also try different filesystems, but on AIX/Solaris you may not have many options...
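On Linux, the ramdisk variant can be as simple as mounting a tmpfs over a build directory; a sketch (the size and mount point here are arbitrary, and root privileges are needed):

mkdir -p /mnt/rambuild
sudo mount -t tmpfs -o size=6g tmpfs /mnt/rambuild   # contents live in RAM and vanish on unmount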
If your goal is to have the directory $dir empty now, you can rename it, and delete it later from a background/cron job:
mv "$dir" "$dir.old"
mkdir "$dir"
# later
rm -r -f "$dir.old"
Another trick is to create a separate filesystem for $dir; when you want to delete it, you simply re-create the filesystem. Something like this:
# initialization
mkfs.something /dev/device
mount /dev/device "$dir"
# when you want to delete it:
umount "$dir"
# re-init
mkfs.something /dev/device
mount /dev/device "$dir"
I forgot the source of this trick but it works:
EMPTYDIR=$(mktemp -d)
rsync -r --delete $EMPTYDIR/ dir_to_be_emptied/
On AIX at least, you should be using LVM, the Logical Volume Manager. All our systems bundle all the physical hard drives into a single volume group and then create one big honkin' file system out of that.
That way, you can add physical devices to your machine at will and increase the size of your file system to whatever you need.
One other solution I've seen is to allocate a trash directory on each file system and use a combination of mv and a find cron job to tackle the space problem.
Basically, have a cron job that runs every ten minutes and executes:
rm -rf /trash/*
rm -rf /filesys1/trash/*
rm -rf /filesys2/trash/*
Then, when you want your specific directory on that file system recycled, use something like:
mv /filesys1/overnight /filesys1/trash/overnight
and, within the next ten minutes your disk space will start being recovered. The filesys1/overnight directory will immediately be available for use even before the trashed version has started being deleted.
It's important that the trash directory be on the same filesystem as the directory you want to get rid of, otherwise you have a massive copy/delete operation on your hands rather than a relatively quick move.
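A quick way to check that the trash directory really is on the same filesystem, so the mv stays a cheap rename rather than a copy, is to compare mount points, for example:

df -P /filesys1/overnight /filesys1/trash   # both lines should show the same filesystem and mount point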
rm -r directory works by recursing depth-first down through directory, deleting files, and deleting the directories on the way back up. It has to, since you cannot delete a directory that is not empty.
Long, boring details: each file system object is represented by an inode, and the file system keeps a flat, file-system-wide array of inodes.[1] If you just deleted a directory without first deleting its children, the children would remain allocated, but without any pointers to them. (fsck checks for that kind of thing when it runs, since it represents file system damage.)
[1] That may not be strictly true for every file system out there, and there may be a file system that works the way you describe. It would possibly require something like a garbage collector. However, all the common ones I know of act like fs objects are owned by inodes, and directories are lists of name/inode number pairs.
If rm -rf is slow, perhaps you are using a "sync" option or similar, which is writing to the disk too often. On Linux ext3 with normal options, rm -rf is very quick.
One option for fast removal which would work on Linux and presumably also on various Unixen is to use a loop device, something like:
hole temp.img $[5*1024*1024*1024] # create a 5Gb "hole" file
mkfs.ext3 temp.img
mkdir -p mnt-temp
sudo mount temp.img mnt-temp -o loop
The "hole" program is one I wrote myself to create a large empty file using a "hole" rather than allocated blocks on the disk, which is much faster and doesn't use any disk space until you really need it. http://sam.nipl.net/coding/c-examples/hole.c
I just noticed that GNU coreutils contains a similar program "truncate", so if you have that you can use this to create the image:
truncate --size=$[5*1024*1024*1024] temp.img
Now you can use the mounted image under mnt-temp for temporary storage, for your build. When you are done with it, do this to remove it:
sudo umount mnt-temp
rm temp.img
rmdir mnt-temp
I think you will find that removing a single large file is much quicker than removing lots of little files!
If you don't care to compile my "hole.c" program, you can use dd, but this is much slower:
dd if=/dev/zero of=temp.img bs=1024 count=$[5*1024*1024] # create a 5Gb allocated file
I think that actually there is nothing faster than "rm -rf", as you quoted, to delete your directories.
To avoid doing it manually over and over, you can cron a daily script that recursively deletes all the build directories under your build root directory if they're "old enough", with something like:
find <buildRootDir>/* -prune -mtime +4 -exec rm -rf {} \;
(here mtime +4 means "any file older than 4 days")
Another way would be to configure your builder (if it allows such things) to overwrite the previous build with the current one.
I was looking into this as well.
I had a dir with 600,000+ files.
rm * would fail, because there are too many entries.
find . -exec rm {} \; was nicer, but it was only deleting about 750 files every 5 seconds (I was checking the rm rate from another shell).
So instead I wrote a short script to rm many files at once, which got up to about 1,000 files every 5 seconds. The idea is to put as many files into one rm command as you can, to increase the efficiency.
#!/usr/bin/ksh
string="";
count=0;
for i in $(cat filelist); do
    string="$string $i";
    count=$(($count + 1));
    if [[ $count -eq 40 ]]; then
        count=0;
        rm $string
        string="";
    fi
done
# remove any leftover batch of fewer than 40 files
if [[ -n $string ]]; then
    rm $string
fi
On Solaris, this is the fastest way I have found.
find /dir/to/clean -type f|xargs rm
If you have files with odd paths, use
find /dir/to/clean -type f|while read line; do echo "$line";done|xargs rm
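If GNU findutils happens to be installed (it is not part of stock Solaris), a null-delimited pipeline is a more robust way to handle odd file names; a sketch:

find /dir/to/clean -type f -print0 | xargs -0 rm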
Use
perl -e 'for(<*>){((stat)[9]<(unlink))}'
Please refer to the link below:
http://www.slashroot.in/which-is-the-fastest-method-to-delete-files-in-linux
I needed to delete 700 GB from dozens of directories on a 1 TB AWS EBS disk (ext3) before copying the remainder to a new 200 GB XFS volume. rm was taking hours and leaving that volume at 100% wa (I/O wait). Since disk I/O and server time are not free, I instead mounted an empty volume over each directory to be deleted, which hides its contents immediately; this took only a fraction of a second per directory.
where /dev/sdb is an empty volume of any size:
directory_to_delete=/ebs/var/tmp/
mount /dev/sdb $directory_to_delete
nohup rsync -avh /ebs/ /ebs2/
I coded a small Java application, RdPro (Recursive Directory Purge tool), which is faster than rm. It can also remove target directories the user specifies under a root. It works on both Linux/Unix and Windows, and has both a command-line version and a GUI version.
https://github.com/mhisoft/rdpro
I had to delete more than 300,000 files on Windows. I had Cygwin installed. Luckily I had all the primary directories in a database, so I created a for loop over those entries and deleted each one using rm -rf.
I just use find ./ -delete in the folder to empty; it deleted 620,000 directories (100 GB in total) in around 10 minutes.
Source: a comment on this site: https://www.slashroot.in/comment/1286#comment-1286