Building CLI scripts in Clojure


What are the common/standard ways to build CLI scripts in Clojure?
In my view such a method should include the following characteristics:
A way of easily dealing with arguments and stdin/stdout/stderr.
Fast startup (ideally with some sort of JIT); otherwise one loses the point of hacking things together in one's shell.
An easy way of including one-off dependencies without setting up a project (maybe by installing them globally).
Ideally, providing a simple example of the solution usage would be much appreciated. Somewhat equivalent to:
#!/bin/bash
echo "$@"
cat /dev/stdin
Note: I'm aware that this question was asked in some form previously here. But that question is incomplete, and the answers neither reach a consensus nor cover a significant proportion of the solutions that seem to exist.

Now that there is new CLI tooling it is possible to create a standalone Clojure script without using third party tools. Once you've got the clj command line tool installed, a script like the one below should just work.
In terms of the original question, this can be as good as any Clojure/JVM CLI program at dealing with command line arguments and system input/output, depending on what libraries you :require. I haven't benchmarked it, so I won't comment on performance, but if it worries you then please experiment yourself to see if the startup time is acceptable to you. I would say this scores highly on dependency management though, as the script is entirely standalone (apart from the clj tool, which is now the recommended way to run Clojure anyway).
File: ~/bin/script.sh
#!/bin/sh
"exec" "clj" "-Sdeps" "{:deps,{hiccup,{:mvn/version,\"1.0.5\"}}}" "$0" "$@"
(ns my-script
  (:require
    [clojure.string :as str]
    [hiccup.core :as hiccup]))
(println
  (hiccup/html
    [:div
      [:span "Command line args: " (str/join ", " *command-line-args*)]
      [:span "Stdin: " (read-line)]]))
Then ensure it is executable:
$ chmod +x ~/bin/script.sh
And run it:
$ echo "stdin" | script.sh command line args
<div><span>Command line args: command, line, args</span><span>Stdin: stdin</span></div>
NB. This is primarily a shell script which treats the strings on the "exec" line as a command to execute. That execution runs the clj command line tool with the given arguments; clj then reads the same file, evaluates those strings as plain strings (without side effects), and proceeds to evaluate the Clojure code below.
Note also that dependencies are specified as a map passed to clj on that same "exec" line. You can read more about how that works on the Clojure website. The tokens in the dependency map are separated by commas, which Clojure treats as whitespace but which most shells do not.
Thanks to the good folk on the #tools-deps channel of the "clojurians" Slack group whence this solution came.
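For the Clojure code to see the script's arguments, the "exec" line has to forward them with "$@". The following is only a minimal sketch of the difference between "$@" and "$#" in plain sh; show_args, forward_all and forward_count are hypothetical helper names, not part of the script above.

```shell
#!/bin/sh
# Print each received argument on its own line, in brackets.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# "$@" keeps every argument as a separate word, even if it contains spaces.
forward_all() {
    show_args "$@"
}

# "$#" expands to a single word: the number of arguments.
forward_count() {
    show_args "$#"
}

forward_all command "line args"
forward_count command "line args"
```

The first call passes two separate arguments through unchanged, while the second passes only the count.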

An option would be Planck, which runs on macOS and Linux. It uses self-hosted ClojureScript, has fast startup and targets JavaScriptCore.
It has a nice SDK and mimics some things from Clojure which you do not have in ClojureScript, e.g. planck.io resembles clojure.java.io. It supports loading dependencies via tools.deps.alpha/deps.edn.
Echoing stdin is as easy as:
(require '[planck.core :refer [*in* slurp]])
(print (slurp *in*))
and printing the command line arguments:
(println *command-line-args*)
...
$ echo "foo" | planck stdin.cljs 1 2 3
foo
(1 2 3)
An example of a standalone script, i.e. not a project, with dependencies: the tree command line tool in Planck.
One caveat is that Planck doesn't support using npm dependencies. So if you need those, go for Lumo which targets NodeJS.
A third option would be joker which is a Clojure interpreter written in Go.

I know you asked for non project creating methods to accomplish this but as this specific issue has been on my mind for quite some time I figured I would throw in another alternative.
TLDR: jump to the "Creating an Executable CLI Command" section below
Background
I had pretty much the same list of requirements as you do a while back and landed on creating executable jar files. I'm not talking about executable via java -jar myfile.jar, but rather self-contained uber-jars which you can execute directly as you would with any other binary file.
If you read the zip file specification (which jar files adhere to, as a jar file is a zip file), it turns out this is actually possible. The short version is that you need to:
build a fat jar with the stuff you need
insert a bash / bat / shell script into the binary jar content at the beginning of your file
chmod +x the uber jar file (or if on windows, check the executable box)
rewrite the jar file meta data records so that the inserted script text does not invalidate the zip file internal offsets
It should be noted that this is actually supported by the zip file specification. This is how self extracting zip files etc work and the resulting fat jar (after the above process) is still a valid jar file and a valid zip archive. All relevant commands such as java -jar still work and the file is now also executable directly from the command line.
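The steps above can be sketched in shell. This is not the tooling described in this answer, which does the offset rewriting in Clojure; it is a rough illustration that assumes Info-ZIP's zip is installed for the offset fix-up (its -A option adjusts a self-extracting archive). The function names prepend_launcher and realign_offsets are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: write a small launcher stub followed by the jar bytes,
# then mark the result executable.
# Usage: prepend_launcher input.jar output-file
prepend_launcher() {
    jar=$1
    out=$2
    {
        # The stub hands this very file to the JVM and never returns.
        printf '%s\n' '#!/bin/sh'
        printf '%s\n' 'exec java -jar "$0" "$@"'
        cat "$jar"
    } > "$out"
    chmod +x "$out"
}

# Assumption: Info-ZIP's zip is on the PATH. Its -A option re-aligns the
# archive's internal offsets after data has been prepended.
realign_offsets() {
    zip -A "$1"
}
```

A typical use would be prepend_launcher target/app-standalone.jar target/app followed by realign_offsets target/app, where both paths are placeholders.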
In addition, following the above pattern it is also possible to add support for things like the drip jvm launcher which greatly accelerates the startup times of your cli scripts.
As it turns out, when I started looking into this about a year ago, a library for the last point of rewriting the jar file meta data did not exist. Not just in Clojure but on the JVM as a whole. This still blows my mind: the central deployment unit of all languages on the JVM is the jar file and there was no library out there that actually read the internals of jar files. Internals as in the actual zip file structure, not just what Java's ZipFile and friends do.
Furthermore, I could not find a library for clojure which dealt with the kind of binary structure the zip file specification required in a clean way.
Solution:
octet has what I consider the cleanest interface of the available binary libraries for clojure, so I wrote a pull request for octet adding support for the features required by the zip file specification.
I then created a new library clj-zip-meta which reads and interprets the zip file meta data and is capable of the offset rewriting described in the last point above.
I then created a pull request to an existing clojure lib lein-binplus to add support for the zip meta rewriting implemented by clj-zip-meta and also add support for custom preamble scripts to be able to create real executable jars without the need for java -jar.
After all this I created a leiningen template cli-cmd to support creating cli command projects which support all the above bells and whistles and has a well structured command line parsing setup...or what I considered well structured : ). Comments welcomed.
Creating an Executable CLI Command
So with all that, you can create a new command line clojure app with leiningen and run it using:
~> lein new cli-cmd mycmd
~> cd mycmd
~> lein bin
Compiling mycmd.core
Compiling mycmd.core
Created /home/mbjarland/tmp/clj-cmd/mycmd/target/mycmd-0.1.0-SNAPSHOT.jar
Created /home/mbjarland/tmp/clj-cmd/mycmd/target/mycmd-0.1.0-SNAPSHOT-standalone.jar
Creating standalone executable: /home/mbjarland/tmp/clj-cmd/mycmd/target/mycmd
Re-aligning zip offsets
~> target/mycmd
---- debug output, remove for production code ----
options {:port 80, :hostname "localhost", :verbosity 0}
arguments []
errors nil
summary
-p, --port PORT 80 Port number
-H, --hostname HOST localhost Remote host
--detach Detach from controlling process
-v Verbosity level; may be specified multiple times to increase value
-h, --help
--------------------------------------------------
This is my program. There are many like it, but this one is mine.
Usage: mycmd [options] action
Options:
-p, --port PORT 80 Port number
-H, --hostname HOST localhost Remote host
--detach Detach from controlling process
-v Verbosity level; may be specified multiple times to increase value
-h, --help
Actions:
start Start a new server
stop Stop an existing server
status Print a server's status
Please refer to the manual page for more information.
Error: invalid action '' specified!
Where the output from the command is just the boilerplate sample command line parsing I've added to the leiningen template.
The custom preamble script is located at boot/jar-preamble.sh and it has support for drip. In other words, if you have drip on your path, the generated executable will use it, otherwise it will fall back to standard java -jar way of launching the uber jar internally.
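The drip fallback can be pictured roughly as follows. This is not the template's actual boot/jar-preamble.sh, only a minimal sketch of the idea; choose_launcher is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical preamble logic: prefer drip when it is on the PATH,
# otherwise fall back to plain java.
choose_launcher() {
    if command -v drip >/dev/null 2>&1; then
        echo drip
    else
        echo java
    fi
}

# A real preamble would then hand over to the chosen launcher, e.g.:
#   exec "$(choose_launcher)" -jar "$0" "$@"
echo "launcher: $(choose_launcher)"
```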
The source for the command line parsing and the code for the cli app live under the src directory as per normal.
If you feel like hacking, it is possible to change the preamble script and re-run lein bin and the new preamble will be inserted into your executable by the build process.
Also it should be noted that this method still does java -jar under the covers so you do need java on your path.
Anyway, long-winded explanation, but hopefully it will be of some use to somebody with this problem.

Consider Lumo, a ClojureScript environment which was specially designed for scripting.
Note that while it supports both ClojureScript (JAR) and NPM dependencies, the dependency support is still under development.

I write a number of Clojure (JVM) scripts, and use the CLI-matic library https://github.com/l3nz/cli-matic/ to abstract most of the boilerplate that goes with command-line parsing, creation and maintenance of help, errors, etc.

Related

List of Jetty9 modules

Is there a list of available Jetty 9 modules somewhere?
Just a simple table "this is the name, this is what it does, and here are links" type.
I have searched the Eclipse site and used search engines for some time now, without any usable result. Is it really that much of a secret what jetty modules exist, and what they do?

Use the command line.
$ cd /path/to/mybase
$ java -jar /path/to/jetty-home/start.jar --list-modules
Some modules are dynamic/virtual (dependent on your environment).
Some are 3rd party (jsp, jolokia, gcloud, etc).
Of the remaining few, you have the module information itself.
For example: rewrite provides the rewrite behaviors described in the doc, http is the HTTP server connector, etc.
Going from module to doc is a 1::n scenario, while going from doc to module is a 1::1 scenario.
If you want to know what they do, look at the module definition (aka ${jetty.home}/modules/${name}.mod).
They might have properties (documented in module)
They might have libs (obvious in module)
They might have xml (see standard XML configuration behaviors in Jetty doc)
They might have a non-Eclipse license (documented in module)
They might have a dependent module (documented in module)
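Since each module is described by a .mod file, the modules directory itself can be browsed from the shell. A minimal sketch, assuming the directory layout mentioned above; list_module_names and show_module are hypothetical helper names.

```shell
#!/bin/sh
# List module names found in a modules directory by stripping ".mod".
# The directory argument is an assumption, e.g. "$JETTY_HOME/modules".
list_module_names() {
    for f in "$1"/*.mod; do
        [ -e "$f" ] || continue   # skip the unexpanded pattern when nothing matches
        basename "$f" .mod
    done
}

# Print one module definition so its properties, libs, xml and
# dependencies can be read directly.
show_module() {
    cat "$1/$2.mod"
}
```

For example, list_module_names "$JETTY_HOME/modules" followed by show_module "$JETTY_HOME/modules" http, where JETTY_HOME is assumed to point at the Jetty installation.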
The result of enabling a module is simply a command line along the lines of --module=http.
The combination of enabled modules (via the combination of ini files) is a longer command line + server classpath + xml load order.
You can see this via ...
$ cd /path/to/mybase
$ java -jar /path/to/jetty-home/start.jar --list-config

Dynamically-created 'zip' command not excluding directories properly

I'm the author of a utility that makes compressing projects using zip a bit easier, especially when you have to compress regularly, such as for updating projects submitted to an application store (like Chrome's Web Store).
I'm attempting to make quite a few improvements, but have run into an issue, described below.
A Quick Overview
My utility's command format is similar to command OPTIONS DEST DIR1 {DIR2 DIR3 DIR4...}. It works by running zip -r DEST.zip DIR1; a fairly simple process. The benefit to my utility, however, is the ability to use a predetermined file (think .gitignore) to ignore specific files/directories, or files/directories which match a pattern.
It's pretty simple -- if the "ignorefile" exists in a target directory (DIR1, DIR2, DIR3, etc), my utility will add exclusions to the zip -r DEST.zip DIR1 command using the pattern -x some_file or -x some_dir/*.
The Issue
I am running into an issue with directory exclusion, however, and I can't quite figure out why (this is probably because I am still quite the sh novice). I'll run through some examples:
Let's say that I want to ignore two things in my project directory: .git/* and .gitignore. Running command foo.zip project_dir builds the following command:
zip -r foo.zip project -x project/.git/\* -x project/.gitignore
Woohoo! Success! Well... not quite.
In this example, .gitignore is not added to the compressed output file, foo.zip. The directory, .git/*, and all of its subdirectories (and files) are added to the compressed output file.
Manually running the command:
zip -r foo.zip project_dir -x project/.git/\* -x project/.gitignore
Works as expected, of course, so naturally I am pretty puzzled as to why my identical, but dynamically-built command, does not work.
Attempted Resolutions
I have attempted a few different methods of resolving this to no avail:
Removing -x project/.git/\* from the command, and instead adding each subdirectory and file within that directory, such as -x project/.git/config -x project/.git/HEAD, etc (including children of subdirectories)
Removing the backslash before the asterisk, so that the resulting exclusion option within the command is -x project/.git/*
Bashing my head on the keyboard in angst (I'm really surprised this didn't work, it usually does)
Some notes
My utility uses /bin/sh; I would prefer to keep it that way for maximum compatibility.
I am aware of the git archive feature -- my use of .git/* and .gitignore in the above example is simply as an example; my utility is not dependent on git nor is used exclusively for projects which are git repositories.

I suspected the problem would be in the evaluation of the generated command, since you said the same command worked when executed directly.
So, as the comment section says, I think you have already found the correct solution. When the command is run by expanding the variable directly, the shell only splits the string into words and expands globs; it does not process quotes or backslashes a second time. The backslash in project/.git/\* is therefore passed to zip literally, and other arguments may be mangled too, depending on the situation.
Yes, in that case:
eval "$COMMAND"
is the way to go.
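The difference can be seen without zip at all. In this small sketch, show_args is a hypothetical stand-in for zip that just prints the arguments it receives, and the command string is built the way the question describes.

```shell
#!/bin/sh
# Stand-in for zip: print each argument it receives on its own line.
show_args() {
    for arg in "$@"; do
        printf '%s\n' "$arg"
    done
}

# A dynamically built command string, with the glob escaped as in the question.
COMMAND='show_args -x project/.git/\*'

echo "--- plain expansion ---"
$COMMAND            # the backslash reaches the command literally
echo "--- eval ---"
eval "$COMMAND"     # the string is re-parsed, so \* becomes a plain *
```

With plain expansion the last argument still contains the backslash; with eval it arrives as project/.git/*, which is what the manually typed command passes to zip.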

Chaining Hadoop MapReduce with Pipes (C++)

Does anyone know how to chain two MapReduce jobs with the Pipes API?
I have already chained two MapReduce jobs in a previous project with Java, but today I need to use C++. Unfortunately, I haven't seen any examples in C++.
Has someone already done it? Is it impossible?

Use Oozie Workflow. It allows you to use Pipes along with usual MapReduce jobs.

I finally managed to make Hadoop Pipes work. Here are the steps to get the wordcount examples in src/examples/pipes/impl/ working.
I have a working Hadoop 1.0.4 cluster, configured following the steps described in the documentation.
To write a Pipes job I had to include the Pipes library that is already compiled in the initial package. It can be found in the C++ folder for both 32-bit and 64-bit architectures. However, I had to recompile it, which can be done with the following steps:
# cd /src/c++/utils
# ./configure
# make install
# cd /src/c++/pipes
# ./configure
# make install
These commands compile the library for our architecture and create an 'install' directory in /src/c++ containing the compiled files.
Moreover, I had to add the -lssl and -lcrypto link flags to compile my program. Without them I encountered an authentication exception at run time.
Thanks to these steps I was able to run wordcount-simple, which can be found in the src/examples/pipes/impl/ directory.
However, to run the more complex example wordcount-nopipe, I had to do a few more things. Due to the implementation of the record reader and record writer, we read from and write to the local file system directly. That's why we have to specify our input and output paths with file://. Moreover, we have to use a dedicated InputFormat component. Thus, to launch this job I had to use the following command:
# bin/hadoop pipes -D hadoop.pipes.java.recordreader=false -D hadoop.pipes.java.recordwriter=false -libjars hadoop-1.0.4/build/hadoop-test-1.0.4.jar -inputformat org.apache.hadoop.mapred.pipes.WordCountInputFormat -input file:///input/file -output file:///tmp/output -program wordcount-nopipe
Furthermore, if we look at org.apache.hadoop.mapred.pipes.Submitter.java in version 1.0.4, the current implementation disables the ability to specify a non-Java record reader if you use the -inputformat option.
Thus you have to comment out the line setIsJavaRecordReader(job, true); and recompile the core sources to take this change into account (http://web.archiveorange.com/archive/v/RNVYmvP08OiqufSh0cjR).
if (results.hasOption("-inputformat")) {
  setIsJavaRecordReader(job, true);
  job.setInputFormat(getClass(results, "-inputformat", job, InputFormat.class));
}
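On the original chaining question: besides a workflow engine, one simple possibility is a driver script that runs two Pipes jobs back to back, feeding the first job's output directory to the second. This is only a sketch of that idea, not something taken from the answers above. All paths and program names are placeholders, the default launcher path is an assumption, and real jobs may need further -D options like those shown earlier.

```shell
#!/bin/sh
# HADOOP is the launcher to call; the default path is an assumption.
HADOOP=${HADOOP:-bin/hadoop}

# Run two Pipes jobs in sequence: the first job's output directory
# becomes the second job's input. The second job only starts if the
# first one exits successfully.
# Usage: chain_pipes_jobs INPUT INTERMEDIATE OUTPUT PROGRAM1 PROGRAM2
chain_pipes_jobs() {
    input=$1; intermediate=$2; output=$3; prog1=$4; prog2=$5
    "$HADOOP" pipes -input "$input" -output "$intermediate" -program "$prog1" &&
    "$HADOOP" pipes -input "$intermediate" -output "$output" -program "$prog2"
}
```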

cannot load a new clojure library

I'm trying out Clojure on my second day and I understand almost nothing yet. I am working with Programming Clojure, 2nd ed., and I am stuck with libraries.
I have Leiningen and have the REPL running. The book first tells the reader to run a simple
(require 'clojure.java.io)
which works just fine (I get a nil). Then it wants to load a file called introduction.clj by running another simple
(require 'examples.introduction)
where I get an error message
FileNotFoundException Could not locate clojure/java/introduction__init.class
or clojure/java/introduction.clj on classpath: clojure.lang.RT.load (RT.java:432)
I downloaded the introduction.clj file and looked for where I should place it. The error and the book say the command will search my classpath, but I have no idea where or what that is (after 1h of searching and reading I still don't get it, sorry). I ran a few commands and got many classpaths listed (none of which contain a clojure/java/io.clj).
So I tried another approach - find the io.clj file on my disk and simply copy the file there and run it with a command
(require 'clojure.java.introduction)
This doesn't seem to work either. By the way, the io.clj file I found was in "C:\Program Files\clojure\src\clj\clojure\java". I tried running several other .clj files from the java folder as well from the clojure folder, like javadoc.clj or inspector.clj and all seem to work just fine with the above mentioned command. Only the new file doesn't seem to load this way.
Any help appreciated :)

Clojure runs on the Java Virtual Machine, so you will need to learn a bit about PATH and CLASSPATH concepts:
See: http://docs.oracle.com/javase/tutorial/essential/environment/paths.html
Regarding the error message, the Clojure runtime is expecting to find introduction.clj in the directory clojure\java\example\introduction.clj (not where it really should be - see below).
The convention for Clojure namespaces is that the last component is the file name, while any previous components are parent directories. So
clojure.java.introduction
would have to be in the directory (relative to your source "root" or classpath)
clojure\java\introduction.clj
(The lein REPL automatically adds your source root to the classpath).
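The naming convention is mechanical enough to write down as a tiny helper. A minimal sketch, using forward slashes as on Unix-like systems; ns_to_path is a hypothetical name. It also applies the rule that dashes in a namespace become underscores in the file name.

```shell
#!/bin/sh
# Map a Clojure namespace to the source file path expected relative to a
# classpath root: dots become directory separators, dashes become underscores.
ns_to_path() {
    printf '%s.clj\n' "$(printf '%s' "$1" | sed -e 's|\.|/|g' -e 's|-|_|g')"
}

ns_to_path examples.introduction       # prints examples/introduction.clj
ns_to_path clojure.java.introduction   # prints clojure/java/introduction.clj
```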
Another concept you need to understand is where the "root" of your source code is located. For Leiningen (the build tool you are using) the default is either "src" or "src/main/clojure", as documented in the Leiningen sample project file on GitHub.
Finally, if you get really stuck, it seems the complete project for the book is available on GitHub.
Looking at the project, I see that you should actually be placing the file under src\examples\introduction.clj

Are you reading the book "Programming Clojure"?
I have encountered the same problem. It can be solved as follows:
If you start Clojure with java:
I work on Windows; the clojure.jar is placed in D:\backup\clojure-1.5.1, and the source code of the book "Programming Clojure" is placed in D:\study\clojure\shcloj-code\code. You should first delete the user.clj file in the folder D:\study\clojure\shcloj-code\code.
java -cp d:\backup\clojure-1.5.1\clojure-1.5.1.jar;d:\study\clojure\shcloj-code\code clojure.main -r
If you work on Linux, replace the ";" with ":".
If you start Clojure with lein:
You should first cd to the D:\study\clojure\shcloj-code\code folder, and then
lein repl
You should also delete the user.clj file in the folder D:\study\clojure\shcloj-code\code.

How to bundle C/C++ code with C-shell-script?

I have a C shell script that calls two
C programs - one after the another
with some file handling before,
in-between and afterwards.
Now, as such I have three different files - one C shell script and 2 .c files.
I need to give this script to other users. The problem is that I have to distribute three files - which the users must keep in the same folder and then execute the script.
Is there some better way to do this?
[I know I can make one C code file out of those two... but I will still be left with a shell script and a C code. Actually, the two C codes do entirely different things... so I want them to be separate]

Sounds like you're worried that your users aren't savvy enough to figure out how to resolve issues like "command not found" errors and the like. If you absolutely MUST hide the "complexity" of a collection of files, you could have your script create the other files. In most other circumstances I would suggest that this approach is only going to increase your support workload, since semi-experienced users are less likely to know how to troubleshoot the process.
If you choose to rely on the presence of a compiler on the system that you are running on, you can store the C code as a collection of echo "$STRING" >> file.c commands to create your two C files, which you then compile and use.
If you would rather use pre-compiled programs, the same basic process can be used, except that you use xxd both to generate the strings in your script and to reverse the conversion, giving you working binaries. Note: Remember to chmod the binary so that it is executable.

Use the shar command to create a self-extracting archive.
Or, better yet, use unzipsfx with the AUTORUN option.
This provides users with ONE file, and only ONE command to execute (as opposed to one for untarring and one for execution).
NOTE: The unzip command to run should use the "-n" option; that way only the first run extracts the files and subsequent runs skip the extraction.

Use a zip or tar file? And you do realize that .c files aren't executable, you need to compile & link them first?

You can include the c code inside the shell script as a here document:
#!/bin/bash
cat > code.c << EOF
line #1
line #2
...
EOF
# compile
# execute
If you want to get fancy, you can test for the existence of the executables and skip compiling them if they exist.
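A slightly fuller sketch of this approach, with the "skip if already built" check included. The C program is a placeholder, write_source and build_if_missing are hypothetical names, and the compiler defaults to cc unless CC is set (an assumption about the target system).

```shell
#!/bin/sh
# Write the embedded C source. Quoting 'EOF' stops the shell from
# expanding $ or backticks inside the C code.
write_source() {
    cat > "$1" << 'EOF'
#include <stdio.h>
int main(void) { printf("hello from embedded C\n"); return 0; }
EOF
}

# Compile only when the executable is missing.
# Usage: build_if_missing EXECUTABLE SOURCE
build_if_missing() {
    exe=$1; src=$2
    if [ -x "$exe" ]; then
        return 0
    fi
    "${CC:-cc}" -o "$exe" "$src"
}
```

A script would then call write_source code.c, build_if_missing ./prog code.c, and finally run ./prog, repeating the pattern for the second C program.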
If you are doing much shell programming, the rest of the Advanced Bash-Scripting Guide is worth looking at as well.
