ocamldep: Argument list too long - ocaml

I'm having the infamous Argument list too long error when running ocamldep with a large number of input files.
This issue only happens when using a native Windows OCaml compiler + Cygwin: in this configuration, ARG_MAX returns 32000, an awfully low value, which I believe is a limit imposed by Windows itself, not Cygwin.
The usual solution for this kind of error is to use xargs, if we can split the arguments, such as when using ls or rm, but I don't know how it could work with ocamldep. Since it expects all files to be given in the command line at the same time (to properly compute dependencies), and there seems to be no option to give the list of files in a file (as in -f filelist.txt), is there a way to avoid this issue?

You can use xargs, since you do not need to list all the files for ocamldep at once.
For example, in the OCaml compiler's source code, to find the dependencies of GraphicsX11 on Graphics, you do not need to list graphics.ml:
$ cd otherlibs/graphics
$ ocamldep graphicsX11.ml
graphicsX11.cmo : graphics.cmi graphicsX11.cmi
graphicsX11.cmx : graphics.cmx graphicsX11.cmi
However, you must make the dependees accessible to ocamldep, otherwise they are ignored. GraphicsX11 actually depends on Hashtbl too, which is found once you add the -I dir option:
# Seek dependencies over stdlib modules too:
$ ocamldep -I ../../stdlib/ graphicsX11.ml
graphicsX11.cmo : ../../stdlib/hashtbl.cmi graphics.cmi graphicsX11.cmi
graphicsX11.cmx : ../../stdlib/hashtbl.cmx graphics.cmx graphicsX11.cmi
(Personally, I have never seen an OCaml project with more than 32000 ml/mli files. Wow.)
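A minimal sketch of the xargs approach, assuming the file names contain no whitespace and that every source directory is reachable through -I. The function name run_in_batches and the batch size of 500 are made up for illustration; if paths are long, xargs -s (a limit in characters) may be a better fit than -n (a limit in arguments).

```shell
#!/bin/sh
# run_in_batches BATCH_SIZE CMD [ARGS...]
# Hypothetical helper: reads file names from stdin and runs
# "CMD ARGS file1 ... fileN" once per batch of at most BATCH_SIZE names,
# so no single command line grows without bound.
run_in_batches() {
    batch=$1
    shift
    xargs -n "$batch" "$@"
}

# Intended use (not executed here, since it needs ocamldep on the PATH):
#   find . -name '*.ml' -o -name '*.mli' | run_in_batches 500 ocamldep -I . > .depend
```

Each batch is analysed on its own, so dependencies on modules outside the batch are only reported when their directories are passed with -I, as described above.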


Unable to run basic linux commands on Makefile

I'm trying to run basic Linux commands inside a Makefile, but it fails, stating that the file is not found (the file does exist).
Can anyone suggest how to rectify this error?
.PHONY: Checking
${INFO} "Checking"
ls -ltr ****/scripts.py
ls: cannot access '****/scripts.py': No such file or directory
I'm not sure what you expect ****/scripts.py to do, but in /bin/sh it means the same thing as */scripts.py, and presumably that doesn't exist, hence the error message you're seeing.
I guess you are hoping to use some fancy shell feature that does deep searching if you give multiple * characters. However, make always uses /bin/sh, which is typically a POSIX standard shell; it never uses your specific shell (imagine what a disaster that would be for portability!). In POSIX, filename globbing is quite simple and doesn't support multiple * to mean search subdirectories.
You either have to use find or similar instead:
ls -ltr `find . -name scripts.py`
Or specifically set your shell to whatever you want (note this causes your makefile to be non-portable to systems which don't have your shell):
SHELL := /bin/myshell
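As a sketch of the find route that avoids backticks (and so also copes with an empty result or names containing spaces), the search and the listing can be combined with -exec. The function name list_named is made up for illustration; the find command inside it uses only POSIX options, so it should behave the same under the /bin/sh that make invokes.

```shell
#!/bin/sh
# list_named NAME [DIR]
# Hypothetical helper: long-list every regular file called NAME at any
# depth below DIR (default: the current directory).
list_named() {
    find "${2:-.}" -type f -name "$1" -exec ls -ltr {} +
}

list_named scripts.py .
```

Inside a recipe the same thing would be written directly as find . -type f -name scripts.py -exec ls -ltr {} +.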

Dynamically-created 'zip' command not excluding directories properly

I'm the author of a utility that makes compressing projects using zip a bit easier, especially when you have to compress regularly, such as for updating projects submitted to an application store (like Chrome's Web Store).
I'm attempting to make quite a few improvements, but have run into an issue, described below.
A Quick Overview
My utility's command format is similar to command OPTIONS DEST DIR1 {DIR2 DIR3 DIR4...}. It works by running zip -r DEST.zip DIR1; a fairly simple process. The benefit to my utility, however, is the ability to use a predetermined file (think .gitignore) to ignore specific files/directories, or files/directories which match a pattern.
It's pretty simple -- if the "ignorefile" exists in a target directory (DIR1, DIR2, DIR3, etc), my utility will add exclusions to the zip -r DEST.zip DIR1 command using the pattern -x some_file or -x some_dir/*.
The Issue
I am running into an issue with directory exclusion, however, and I can't quite figure out why (this is probably because I am still quite the sh novice). I'll run through some examples:
Let's say that I want to ignore two things in my project directory: .git/* and .gitignore. Running command foo.zip project_dir builds the following command:
zip -r foo.zip project -x project/.git/\* -x project/.gitignore
Woohoo! Success! Well... not quite.
In this example, .gitignore is not added to the compressed output file, foo.zip. However, the .git directory and all of its subdirectories (and files) are added to the compressed output file.
Manually running the command:
zip -r foo.zip project_dir -x project/.git/\* -x project/.gitignore
Works as expected, of course, so naturally I am pretty puzzled as to why my identical, but dynamically-built command, does not work.
Attempted Resolutions
I have attempted a few different methods of resolving this to no avail:
Removing -x project/.git/\* from the command, and instead adding each subdirectory and file within that directory, such as -x project/.git/config -x project/.git/HEAD, etc (including children of subdirectories)
Removing the backslash before the asterisk, so that the resulting exclusion option within the command is -x project/.git/*
Bashing my head on the keyboard in angst (I'm really surprised this didn't work, it usually does)
Some notes
My utility uses /bin/sh; I would prefer to keep it that way for maximum compatibility.
I am aware of the git archive feature -- my use of .git/* and .gitignore in the above example is simply as an example; my utility is not dependent on git nor is used exclusively for projects which are git repositories.
I suspected the problem would be in the evaluation of the generated command, since you said the same command when executed directly did right.
So as the comment section says, I think you already found the correct solution. This happens because if you run that variable directly, some things like globs can be expanded directly, instead of passed to the command. And arguments may be messed up, depending on the situation.
Yes, in that case:
is the way to go.
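To illustrate the difference described above, here is a small sketch under stated assumptions: the utility is assumed to keep the generated command in a string variable, show_args is a made-up stand-in for zip that only prints its arguments, and eval is one common way to re-parse such a string (whether it is the exact fix from the comments is an assumption). The third variant avoids the string entirely and stays within plain /bin/sh.

```shell
#!/bin/sh
# show_args stands in for zip: it prints each argument it receives in brackets.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# The command as a string, the way a generator script might build it.
cmd='show_args -x project/.git/\*'

# 1. Plain expansion: the value is split on whitespace, but the backslash is
#    not treated as quoting, so the command receives it as a literal character.
$cmd

# 2. eval: the string is parsed again as shell source, so the backslash
#    quotes the asterisk and is removed before the command sees it.
eval "$cmd"

# 3. No string at all: grow the positional parameters one exclusion at a
#    time and run them as they are, with no second round of parsing.
set -- show_args
set -- "$@" -x 'project/.git/*'
"$@"
```

Variants 2 and 3 hand the command the same pattern that typing the command at the prompt would; variant 1 does not, which matches the symptom in the question.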

How to compile a jocaml file with ocamlbuild and include a package?

How can I compile a jocaml source file which needs the cryptokit package (successfully compiled with the companion ocaml) with the ocamlbuild tool?
When I execute the command ocamlbuild -pkg cryptokit -use-jocaml a.native I get this error:
Warning: tag "package" does not expect a parameter, but is used with parameter "cryptokit"
+ jocamlopt -I /prefix/lib/ocaml -I /prefix/lib/ocaml/site-lib/cryptokit -I /prefix/lib/ocaml/site-lib/num /prefix/lib/ocaml/unix.cmxa /prefix/lib/ocaml/nums.cmxa /prefix/lib/ocaml/site-lib/cryptokit/cryptokit.cmxa a.cmx -o a.native
File "_none_", line 1:
Error: Files /prefix/lib/ocaml/unix.cmxa
and /prefix/lib/ocaml/unix.cmxa
both define a module named Unix
Command exited with code 2.
Compilation unsuccessful after building 4 targets (3 cached) in 00:00:00.
Essentially, the ocaml Unix module clashes with itself.
This error only pops when I include Cryptokit (with -pkg cryptokit) probably because Cryptokit requires Unix. a.ml can in fact be empty and still reproduce the error.
I tried to add the -use-ocamlfind flag but as it also uses ocamlfind to get the compiler, it selects the ocaml compiler instead of the jocaml one.
By executing sequentially the same commands as ocamlbuild (displayed by -verbose 1), I found that when I execute the last one without /.../unix.cmxa there is no more clash, but the wrong Unix module is loaded: it's the one from ocaml and not from jocaml, so it crashes completely when I use any jocaml feature in a.ml:
jocamlopt -I /prefix/lib/ocaml -I /prefix/lib/ocaml/site-lib/cryptokit -I /prefix/lib/ocaml/site-lib/num /prefix/lib/ocaml/nums.cmxa /prefix/lib/ocaml/site-lib/cryptokit/cryptokit.cmxa a.cmx -o a.native
However, when I also remove the -I /prefix/lib/ocaml part, then it compiles successfully:
jocamlopt -I /prefix/lib/ocaml/site-lib/cryptokit -I /prefix/lib/ocaml/site-lib/num /prefix/lib/ocaml/nums.cmxa /prefix/lib/ocaml/site-lib/cryptokit/cryptokit.cmxa a.cmx -o a.native
To summarize, I got it to work by executing manually a modification of the last command, but I would like to get ocamlbuild working.
I think this error has to do with the fact that Cryptokit requires the Unix module: as I compiled it with ocaml and not jocaml, at the linking stage it tries to link with the ocaml stdlib one (which needs to be included) and not the jocaml stdlib one (which is implicitly included as part of the stdlib).
I had no idea there were active users of the ocamlbuild+JOCaml combination! Out of curiosity, would you say a bit more about what you are using JOCaml+cryptokit for?
I don't know much about Cryptokit or JOCaml, but it looks like your main problem is not related to ocamlbuild. If I understand correctly, (1) Cryptokit needs Unix and (2) JOCaml needs to use its own variant of Unix. If this is correct, compiling Cryptokit against ocaml's Unix and expecting it to work when linked with a JOCaml program that itself requires JOCaml's Unix is bound to create a lot of trouble. If this works in your case, it must be because either the part of Cryptokit you use doesn't actually require Unix, or the JOCaml program you are testing with does not actually require JOCaml's Unix. In the long run, it would probably be best to compile Cryptokit with JOCaml directly (I don't know how comfortable you are with the OCaml ecosystem in general, but I would personally try to build an OPAM switch where ocaml{c,opt} are aliases for jocaml{c,opt} and build programs from that).
Regarding the ocamlbuild specific part, it's hard to give any accurate advice without a tarball to be able to reproduce your setup and experiment with it. But I would try one of the two following options:
You can use -use-ocamlfind and teach ocamlfind to use jocaml instead of ocaml by using the OCAMLFIND_COMMANDS environment variable (see man ocamlfind)
You can avoid -use-ocamlfind entirely and instead call ocamlfind as a command-line tool to get the location of the cryptokit library (ocamlfind query cryptokit). You would then not use -pkg cryptokit but pass the path yourself (with -lflags and -cflags or by modifying your myocamlbuild.ml configuration file).
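A hedged sketch of the first option: the name=command format below is how I recall OCAMLFIND_COMMANDS from the ocamlfind man page, and the mapping onto jocamlc and jocamlopt is an assumption about how the JOCaml tools are named on your system, so check man ocamlfind before relying on it. The build itself is only echoed, since I could not test this setup.

```shell
#!/bin/sh
# Assumed format: a whitespace-separated list of name=command pairs that
# tells ocamlfind which executables to run for "ocamlc" and "ocamlopt".
OCAMLFIND_COMMANDS='ocamlc=jocamlc ocamlopt=jocamlopt'
export OCAMLFIND_COMMANDS

# Dry run: print the build command instead of running it.
echo "would run: ocamlbuild -use-ocamlfind -pkg cryptokit a.native"
```

With the variable exported, any ocamlfind ocamlopt call made by ocamlbuild should end up invoking jocamlopt, which is what the first option is after.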
Elaborating on the -use-ocamlfind option as suggested by gasche, I got it to work with the addition of a small nasty hack: removing "unix" from the requires field of the META file of the cryptokit package. It works because jocaml links everything with threads and unix by default (a real solution would have been to disable this behavior, but it seems a lot harder). So the working compilation command is:
ocamlbuild -use-ocamlfind -use-jocaml -pkg cryptokit a.ml
I think it is possible to generalize this to any package that uses either unix or threads when compiling with jocaml. A subsidiary question is whether it is possible to do this dynamically with a _tags or myocamlbuild.ml file (note: comment if this remark needs to be moved).

using OCaml Batteries Included as a vanilla cma

I am a bit frustrated and confused by the OCaml Batteries Included concept and the way most tutorials I could find proceed. Before I get to use "productivity" tools like GODI or replace invocations of ocamlc with ocamlfind batteries/ocamlc (which is, at this point, too magical for me) I was hoping to be able to simply use OCaml Batteries Included core set of libraries like any other library. To that end I downloaded the latest source from git (head hash: 9f94ecb) and did a make all. I noticed that I got three .cma libraries at ./_build/src/ together with 102 .cmi files in the same directory. So I assumed that compiling with the -I switch pointing to that directory and linking with one of the three .cma libraries found there would be enough without needing to "install" the Batteries or use the platform tools. To test that, I set out to produce an executable for the following simple program I found somewhere:
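(* file euler001.ml *)
open BatEnum
open BatPervasives
let main () =
  (* the range line is missing from the post; 1 -- 999 is assumed from Project Euler problem 1 *)
  (1 -- 999)
  |> BatEnum.filter (fun i -> i mod 3 = 0 || i mod 5 = 0)
  |> BatEnum.reduce (+)
  |> BatInt.print stdout
let _ = main ()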
I was able to compile it with:
ocamlc -c -I ../batteries-included/_build/src/ euler001.ml
but when I tried to link with:
ocamlc -o euler001 unix.cma nums.cma ../batteries-included/_build/src/batteries.cma euler001.cmo
I got:
File "_none_", line 1, characters 0-1:
Error: Error while linking ../batteries-included/_build/src/batteries.cma(BatBigarray):
The external function `caml_ba_reshape' is not available
The nums.cma and unix.cma I added at the command line because the linker complained about missing references to undefined global Big_int and (when that was added) to Unix. But after these two modules were added on the linker invocation I received the last message (on the missing external function 'caml_ba_reshape') which proved blocking for me. So I would like to ask:
how does one proceed in this particular case?
how does one proceed in the general case (i.e. when the linker complains about a missing external function)
is it viable to use Batteries Included in this fashion? Before I rely on platform tools I want to have the assurance that I can use the underlying artifacts (cma and cmi/mli files) with the standard OCaml compiler and linker if I run into problems.
caml_ba_reshape is, as you could guess from the name but I agree it's not obvious, a primitive of the Bigarray module. You should add bigarray.cma in your compilation command, before batteries.cma which depends on it.
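Spelled out, the link line from the question with bigarray.cma placed ahead of batteries.cma would look like the output of the sketch below. The function link_command is made up for illustration and only prints the command, so it can be inspected without an OCaml installation; the file names are the ones from the question.

```shell
#!/bin/sh
# link_command BATTERIES_DIR
# Hypothetical helper: print the ocamlc link line, listing each library
# before the archives that depend on it (bigarray.cma before batteries.cma).
link_command() {
    printf 'ocamlc -o euler001 unix.cma nums.cma bigarray.cma %s/batteries.cma euler001.cmo\n' "$1"
}

link_command ../batteries-included/_build/src
```

To actually link, run the printed line in the directory that contains euler001.cmo.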
There is a reason why it is advised to use ocamlfind, which is precisely used to abstract over those dependencies. I don't think you are supposed to use ocamlfind batteries/ocamlc, but rather ocamlfind ocamlc -package batteries. If you insist on using the compiler without such support, then indeed you have to compile manually -- I understand your frustration, but I hope you also understand that it is intrinsic to any sufficiently sophisticated OCaml library, and that it comes only from your self-imposed constraints.
how does one proceed in the general case (i.e. when the linker complains about a missing external function)
You have to know or guess where the primitive comes from. Looking at the META file provided by the library, which is used to inform ocamlfind of the dependencies, may help you. You can use the tool ocamlobjinfo to know which primitive a .cma provides, if you want to check your assumption. (Or better, use ocamlfind to spit the correct compile command, see below.)
is it viable to use Batteries Included in this fashion?
Compiling "by hand" is reasonable if you insist. Working only in the source repository, without installing the library, is not. It's easy to keep doing what you do after an install, just replace your -I ... by the chosen install path.
Before I rely on platform tools I want to have the assurance that I can use the underlying artifacts (cma and cmi/mli files) with the standard OCaml compiler and linker if I run into problems.
ocamlfind is not (only) a platform tool. It is the way to use third-party ocaml libraries, period. It should be a standard on any ocaml-using platform. That it does not come with INRIA's distribution is an historical detail.
You can ask ocamlfind to show you its invocation of the bare compilers:
% ocamlfind ocamlc -linkpkg -package batteries t.ml -o test -verbose
Effective set of compiler predicates:
+ ocamlc.opt -o test -verbose -I /usr/local/lib/ocaml/3.12.1/batteries /usr/lib/ocaml/unix.cma /usr/lib/ocaml/nums.cma /usr/lib/ocaml/bigarray.cma /usr/lib/ocaml/str.cma /usr/local/lib/ocaml/3.12.1/batteries/batteries.cma t.ml
I don't want to throw stones at you. The landscape of OCaml tools, beside the minimal nutshell of what's provided by the source distribution, is quite sparse and lacks a coherent point of entry. With time I've grown used to those tools and it's quite natural to use them, but I understand there is some cost of entry that we should try to lower.
PS: any advice on how to improve batteries documentation is warmly welcome. Patches to add things to the documentation or fix it are even better. batteries-devel#lists.forge.ocamlcore.org is the place to go.

Is there a build tool based on inotify-like mechanism

In relatively big projects that use plain old make, even building the project when nothing has changed takes a few tens of seconds, especially with many executions of make -C, each of which has the overhead of a new process.
The obvious solution to this problem is a build tool based on an inotify-like feature of the OS. It would watch for changes to files and, based on the list of changed files, compile only those.
Is there such machinery out there? Bonus points for open source projects.
You mean like Tup:
From the home page:
"Tup is a file-based build system - it inputs a list of file changes and a directed acyclic graph (DAG), then processes the DAG to execute the appropriate commands required to update dependent files. The DAG is stored in an SQLite database. By default, the list of file changes is generated by scanning the filesystem. Alternatively, the list can be provided up front by running the included file monitor daemon."
I am just wondering if it is stat()ing the files that takes so long. To check this here is a small systemtap script I wrote to measure the time it takes to stat() files:
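# count-calls.stp
global calls, times
# On entry to the function named by the first script argument, record the time.
probe kernel.function(@1) {
    times[probefunc()] = gettimeofday_ns()
}
# On return, add the elapsed time to the per-function statistics.
probe kernel.function(@1).return {
    now = gettimeofday_ns()
    delta = now - times[probefunc()]
    calls[probefunc()] <<< delta
}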
And then use it like this:
$ stap -c "make -rC ~/src/prj -j8 -k" ~/tmp/count-calls.stp sys_newstat
make: Entering directory `/home/user/src/prj'
make: Nothing to be done for `all'.
make: Leaving directory `/home/user/src/prj'
calls["sys_newstat"] @count=8318 @min=684 @max=910667 @sum=26952500 @avg=3240
The project I ran it upon has 4593 source files and it takes ~27msec (26952500nsec above) for make to stat all the files along with the corresponding .d files. I am using non-recursive make though.
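The figures in that output line can be cross-checked with a few lines of shell arithmetic. This is only a sanity check on the numbers quoted above; the variable names are made up.

```shell
#!/bin/sh
# Values copied from the systemtap output above.
sum_ns=26952500
count=8318

# Average time per stat() call, in nanoseconds (integer division).
avg_ns=$((sum_ns / count))

# Total time in milliseconds, rounded to the nearest millisecond.
total_ms=$(( (sum_ns + 500000) / 1000000 ))

echo "average per call: ${avg_ns} ns"
echo "total: about ${total_ms} ms"
```

So the stat() calls account for roughly 27 ms across the whole run, which suggests they are not where tens of seconds would go.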
If you're using OSX, you can use fswatch
Here's how to use fswatch to watch for changes to a file and then run make when it detects any:
fswatch -o anyFile | xargs -n1 -I{} make
You can run fswatch from inside a makefile like this:
watch: $(FILE)
fswatch -o $^ | xargs -n1 -I{} make
(Of course, $(FILE) is defined inside the makefile.)
make can now watch for changes in the file like this:
> make watch
You can watch another file by overriding FILE on the command line:
> make watch FILE=anotherFile
Install inotify-tools and write a few lines of bash to invoke make when certain directories are updated.
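A minimal sketch of such a loop, assuming inotifywait from inotify-tools is on the PATH. The function name watch_and_build and the choice of events are illustrative; the script is plain sh, so it works under bash as well.

```shell
#!/bin/sh
# watch_and_build DIR [BUILD_CMD...]
# Hypothetical helper: block until something under DIR changes, then run
# BUILD_CMD (default: make), and repeat. The loop ends when inotifywait
# exits with a non-zero status, for example if DIR does not exist.
watch_and_build() {
    dir=$1
    shift
    [ "$#" -gt 0 ] || set -- make
    while inotifywait -q -r -e modify,create,delete,move "$dir"; do
        "$@"
    done
}

# Intended use (not executed here, because it would block):
#   watch_and_build src make -j8
```

Note that make still decides what to rebuild; the watcher only decides when to call it, so this does not remove the cost of a no-op make run.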
As a side note, recursive make scales badly and is error prone. Prefer non-recursive make.
The change-dependency you describe is already part of Make, but Make is flexible enough that it can be used in an inefficient way. If the slowness really is caused by the recursion (make -C commands) -- which it probably is -- then you should reduce the recursion. (You could try putting in your own conditional logic to decide whether to execute make -C, but that would be a very inelegant solution.)
Roughly speaking, if your makefiles look like this
# main makefile
foo:
	make -C bar baz
and this
# makefile in bar/
baz: quartz
	do something
you can change them to this:
# main makefile
foo: bar/quartz
	cd bar && do something
There are many details to get right, but now if bar/quartz has not been changed, the foo rule will not run.