How to make namespace alias available after initial compile time? - clojure

I'm working on a Clojure application that needs to be able to read Clojure code from the shell command line. The application has a -main function that reads a string passed on the command line containing a single Clojure form, and passes this string to the Clojure function load-string, which parses the string and executes the code. This runs successfully using both lein run and using an uberjar.
I've found that the code passed at the command line can include fully-qualified namespace names if the namespaces were required at the top of the file containing -main. For example, if the source file begins with
(ns popco.core.popco
  (:require [popco.core.reporters :as rpt]))
then the string I pass on the command line can reference my function ticker by popco.core.reporters/ticker. However, I can't use the alias rpt. If I refer to ticker by rpt/ticker, I get an exception: java.lang.RuntimeException: No such namespace: rpt.
I'm guessing that this is because the aliases are only available at compile time, and only during one particular compilation. Since load-string compiles the string of code after compilation of the source file has completed, rpt is no longer available as an alias.
A solution that allows me to use namespace aliases is to duplicate the requires at the top of the file (which I want for use when running in the repl) within -main. However, there are several namespaces that I might need to include, and duplicating code is undesirable, of course.
Is there another solution? Some way to define aliases once and have them available both when running in the repl and when running code from the command line?
(EDIT: The analysis in terms of time of compilation above can't quite be correct--or maybe I don't understand Clojure compilation. (Well, in fact I don't understand Clojure compilation!) The solution mentioned two paragraphs above works using a require statement in the original source code inside -main, not only with a string passed to load-string. So that require is getting compiled when -main is compiled, I suppose, which would be during the same pass as the top of the source file, I assume. Yet somehow code inside -main is in scope of the alias from the top of the file if the code was typed into the definition of -main, but not if it's brought in via load-string. Yet what's brought in via load-string is in the scope of the alias when (require '[popco.core.reporters :as rpt]) is literally in the source code for -main. Why?)

The alias function, which is the mechanism require uses to create namespace aliases via the :as key, mutates the current namespace, that is, whichever namespace is current at the time alias runs.
This is why a require embedded in your definition of -main creates the alias in the way you expect: it mutates your runtime namespace to include the given alias. Emphatically, this does not necessarily alter the popco.core.popco namespace! In fact, that is unlikely to be the current namespace at runtime, when -main is invoked. (At compile time it would be, which is why a require in the ns macro mutates the namespace being defined.)
The key here is to realize that the ns at runtime is not always the ns which defined your -main function, and if you want to mutate the evaluation environment at runtime, you need to either switch to the namespace you are mutating, or mutate the namespace you are in.
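One way to "switch to the namespace you are mutating" is to bind *ns* around the call to load-string. The following is a minimal sketch, not the asker's actual code: the namespace example.main and the helper eval-in-this-ns are hypothetical names, and clojure.string stands in for popco.core.reporters.

```clojure
(ns example.main
  (:require [clojure.string :as str]))

(defn eval-in-this-ns
  "Reads and evaluates the string `code` with *ns* bound to example.main,
  so aliases declared in the ns form above (such as str) resolve.
  Hypothetical helper, for illustration only."
  [code]
  (binding [*ns* (the-ns 'example.main)]
    (load-string code)))

(defn -main [& args]
  ;; The first command-line argument is expected to hold one Clojure form.
  (println (eval-in-this-ns (first args))))
```

With this in place, a call such as (eval-in-this-ns "(str/upper-case \"ticker\")") should resolve the str alias regardless of which namespace happens to be current when -main runs, because symbol resolution inside load-string uses the bound value of *ns*.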

Related

Is there a standard way to ensure that a piece of code is executed at global scope?

I have some code I want to execute at global scope. So, I can use a global variable in a compilation unit like this:
int execute_global_code();
namespace {
int dummy = execute_global_code();
}
The thing is that if this compilation unit ends up in a static library (or a shared one with -fvisibility=hidden), the linker may decide to eliminate dummy, as it isn't used, and with it my global code execution.
So, I know that I can use concrete solutions based on the specific context: specific compiler (pragma include), compilation unit location (attribute visibility default), surrounding code (say, make a dummy use of dummy in my code).
The question is, is there a standard way to ensure execute_global_code will be executed that can fit in a single macro and will work regardless of the compilation unit placement (executable or lib)? That is, only standard C++ and no user code outside of that macro (such as a dummy use of dummy in main()).
The issue is that the linker will use all object files for linking a binary given to it directly, but for static libraries it will only pull those object files which define a symbol that is currently undefined.
That means that if all the object files in a static library contain only such self-registering code (or other code that is not referenced from the binary being linked), nothing from the entire static library will be used!
This is true for all modern compilers. There is no platform-independent solution.
One way to circumvent this with CMake, without touching the source code, is the doctest_force_link_static_lib_in_target function; it works only if no precompiled headers are used. Example usage:
doctest_force_link_static_lib_in_target(exe_name lib_name)
There are some compiler-specific ways to do this as grek40 has already pointed out in a comment.

ClojureScript split one namespace into multiple files

I've read this thread, but it seems like there are no load and load-file in ClojureScript. Is it possible to separate a single namespace over multiple files?
The reason I want to do that is because I'm using Om and I want to separate components into different files. I can do it using separate namespaces, but then I will have to write the same requires in the beginning of each file and also the only way to call those components in the main file is like that:
(:require [some-project.sidebar :as sidebar])
...
(om/build sidebar/sidebar app-state)
i.e. I have to specify namespace before each component's name, which doesn't look pretty. Any ideas on how to improve it? I'm new to Clojure and ClojureScript, so maybe I'm missing something obvious?
There are a few things to note here:
You can use :refer in your :require to import unqualified vars into a namespace. This is OK if there are only a few, but it can quickly get unwieldy if you try to do it for everything.
Clojure applications are often structured in a tree like fashion where a main namespace requires sub namespaces, and so on, so you won't necessarily be importing the same namespaces into every namespace.
Even if it was possible to split a namespace across multiple files, it wouldn't be idiomatic Clojure. One file = one namespace is the norm.
If you want, you can def vars from one namespace into another to make one 'master' namespace to use in your other namespaces.
If you want to minimise the number of imports you have to do, make fewer namespaces and make them bigger.
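As a rough sketch of the :refer and 'master' namespace points above, the following uses clojure.string as a stand-in for a component namespace such as some-project.sidebar. The namespace example.app and the names joined and upper are assumptions made for illustration.

```clojure
(ns example.app
  (:require [clojure.string :as str :refer [join]]))

;; :refer lets the listed vars be used without a namespace prefix.
(def joined (join ", " ["sidebar" "header"]))

;; A plain def copies the function value into this namespace, so other
;; namespaces that require example.app can call example.app/upper
;; without requiring clojure.string themselves.
(def upper str/upper-case)
```

Note that a def like this copies only the value, not the original var's metadata (docstring, arglists), which is one reason the approach is usually kept to a handful of names.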

Difference in Clojure between use and require

I recently started learning Clojure and I'm having a bit of difficulty wrapping my head around namespaces. As the creator of Clojure said, newcomers often struggle to get the concept right. I don't clearly understand the difference between (use ...) and (require ...). For example playing around in the REPL if I say (use 'clojure.contrib.str-utils2) I get warnings about functions in clojure.core namespace being replaced by the ones in clojure.contrib.str-utils2, but that doesn't happen when I use (require 'clojure.contrib.str-utils2). I'm not sure that I will always want to replace what's in clojure.core, so can someone point some best practices for importing external stuff and managing namespaces in Clojure?
Oh and also, when should I use :use and :require? Only inside (ns ....)?
Thanks in advance.
The answer lies in the docstrings:
user> (doc use)
-------------------------
clojure.core/use
([& args])
Like 'require, but also refers to each lib's namespace using
clojure.core/refer. Use :use in the ns macro in preference to calling
this directly.
'use accepts additional options in libspecs: :exclude, :only, :rename.
The arguments and semantics for :exclude, :only, and :rename are the same
as those documented for clojure.core/refer.
nil
And the long one for require:
user> (doc require)
-------------------------
clojure.core/require
([& args])
Loads libs, skipping any that are already loaded. Each argument is
either a libspec that identifies a lib, a prefix list that identifies
multiple libs whose names share a common prefix, or a flag that modifies
how all the identified libs are loaded. Use :require in the ns macro
in preference to calling this directly.
Libs
A 'lib' is a named set of resources in classpath whose contents define a
library of Clojure code. Lib names are symbols and each lib is associated
with a Clojure namespace and a Java package that share its name. A lib's
name also locates its root directory within classpath using Java's
package name to classpath-relative path mapping. All resources in a lib
should be contained in the directory structure under its root directory.
All definitions a lib makes should be in its associated namespace.
'require loads a lib by loading its root resource. The root resource path
is derived from the lib name in the following manner:
Consider a lib named by the symbol 'x.y.z; it has the root directory
<classpath>/x/y/, and its root resource is <classpath>/x/y/z.clj. The root
resource should contain code to create the lib's namespace (usually by using
the ns macro) and load any additional lib resources.
Libspecs
A libspec is a lib name or a vector containing a lib name followed by
options expressed as sequential keywords and arguments.
Recognized options: :as
:as takes a symbol as its argument and makes that symbol an alias to the
lib's namespace in the current namespace.
Prefix Lists
It's common for Clojure code to depend on several libs whose names have
the same prefix. When specifying libs, prefix lists can be used to reduce
repetition. A prefix list contains the shared prefix followed by libspecs
with the shared prefix removed from the lib names. After removing the
prefix, the names that remain must not contain any periods.
Flags
A flag is a keyword.
Recognized flags: :reload, :reload-all, :verbose
:reload forces loading of all the identified libs even if they are
already loaded
:reload-all implies :reload and also forces loading of all libs that the
identified libs directly or indirectly load via require or use
:verbose triggers printing information about each load, alias, and refer
Example:
The following would load the libraries clojure.zip and clojure.set
abbreviated as 's'.
(require '(clojure zip [set :as s]))
nil
They both do the same thing, but use goes the extra step and creates mappings for the stuff in the require'd namespace in the current namespace. That way, rather than doing some.namespace/name you're just referring to it as name. While this is convenient sometimes, it's better to use require or select the individual vars that you want rather than pull in the entire namespace. Otherwise, you could have issues with shadowing (where one var is preferred over another of the same name).
If you don't want to use require, but you know what vars you want out of the namespace, you can do this:
(ns whatever
  (:use [some.namespace :only [vars you want]]))
If you don't know which vars you're going to need, or if you need a lot, it's better to use require. Even when you require, you don't always have to type the totally qualified name. You can do this:
(ns whatever
  (:require [some.namespace :as sn]))
and then you can use vars from some.namespace like this: (sn/somefunction arg1 arg2)
And to answer your last question: try to only use :require and :use inside of (ns ...). It's much cleaner this way. Don't call use and require outside of (ns ...) unless you have a pretty good reason for it.
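The difference between an alias and a referred var can be inspected directly with ns-aliases and ns-refers. This is a small sketch assuming a scratch namespace named example.imports (a made-up name), with clojure.set and clojure.string as the required libs.

```clojure
(ns example.imports
  (:require [clojure.set :as s]
            [clojure.string :refer [upper-case]]))

;; :as adds an entry to the namespace's alias table.
(contains? (ns-aliases 'example.imports) 's)           ; s is an alias for clojure.set

;; :refer (like use with :only) adds a mapping for the var itself.
(contains? (ns-refers 'example.imports) 'upper-case)

(s/union #{1} #{2})   ; call through the alias
(upper-case "abc")    ; call the referred var without a prefix
```

A bare use would add a mapping for every public var of the lib in the same way, which is where the replacement warnings mentioned in the question come from.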

difference between use and require

Can anyone explain the difference between use and require, both when used directly and as :use and :require in the ns macro?
require loads libs (that aren't already loaded), use does the same plus it refers to their namespaces with clojure.core/refer (so you also get the possibility of using :exclude etc like with clojure.core/refer). Both are recommended for use in ns rather than directly.
It's idiomatic to include external functions with require and refer. You avoid namespace conflicts, you only include functions you actually use/need, and you explicitly declare each function's location:
(ns project.core
  (:require [ring.middleware.reload :refer [wrap-reload]]))
I do not have to invoke this function by prefixing it with its namespace:
(wrap-reload) ; works
If you don't use refer you'll need to prefix it with the namespace:
(ring.middleware.reload/wrap-reload) ; works if you don't use refer in your require
If you choose use instead, (pretty much) always use only:
(ns project.core
  (:use [ring.middleware.reload :only [wrap-reload]]))
Otherwise you're including everything, making it both an unnecessarily large operation and very confusing for other programmers to find where the functions live.
Also, I highly recommend this blog as a resource for learning more about Clojure namespaces.
use certainly makes things easier by not requiring you to spell out the namespace every time you want to call a function, though it can also make a mess of things by creating namespace conflicts. A good middle ground between use and require is to only use the functions from a namespace that you actually need.
for instance:
(use '[clojure.contrib.duck-streams :only (writer reader)])
or even better, specify it at the top of the file in the namespace definition:
(ns com.me.project
  (:use [clojure.contrib.test-is :only (deftest is run-tests)]))
As has been mentioned, the big difference is that with (require 'foo) you refer to names in the lib's namespace like so: (foo/bar ...). If you do (use 'foo), they are referred into your current namespace (whatever that may be, and provided there are no conflicts) and you can call them like (bar ...).

Namespace Specification In Absence of Ambiguity

Why do some languages, like C++ and Python, require the namespace of an object be specified even when no ambiguity exists? I understand that there are backdoors to this, like using namespace x in C++, or from x import * in Python. However, I can't understand the rationale behind not wanting the language to just "do the right thing" when only one accessible namespace contains a given identifier and no ambiguity exists. To me it's just unnecessary verbosity and a violation of DRY, since you're being forced to specify something the compiler already knows.
For example:
import foo # Contains someFunction().
someFunction() # imported from foo. No ambiguity. Works.
Vs.
import foo # Contains someFunction()
import bar # Contains someFunction() also.
# foo.someFunction or bar.someFunction? Should be an error only because
# ambiguity exists.
someFunction()
One reason is to protect against accidentally introducing a conflict when you change the code (or for an external module/library, when someone else changes it) later on. For example, in Python you can write
from foo import *
from bar import *
without conflicts if you know that modules foo and bar don't have any variables with the same names. But what if in later versions both foo and bar include variables named rofl? Then bar.rofl will cover up foo.rofl without you knowing about it.
I also like to be able to look up to the top of the file and see exactly what names are being imported and where they're coming from (I'm talking about Python, of course, but the same reasoning could apply for C++).
Python takes the view that 'explicit is better than implicit'.
(type import this into a python interpreter)
Also, say I'm reading someone's code. Perhaps it's your code; perhaps it's my code from six months ago. I see a reference to bar(). Where did the function come from? I could look through the file for a def bar(), but if I don't find it, what then? If Python is automatically finding the first bar() available through an import, then I have to search through each file imported to find it. What a pain! And what if the function-finding recurses through the import hierarchy?
I'd rather see zomg.bar(); that tells me where the function is from, and ensures I always get the same one if code changes (unless I change the zomg module).
The problem is about abstraction and reuse: you don't really know that there will not be any future ambiguity.
For example, it's very common to set up different libraries in a project just to discover that they all have their own string class implementation, called "string".
Your compiler will then complain that there is ambiguity if the libraries are not encapsulated in separate namespaces.
It's then a delightful pleasure to dodge this kind of ambiguity by specifying which implementation (like the standard std::string one) you want to use at each specific instruction or context (read: scope).
And if you think that it's obvious in a particular context (read: in a particular function or .cpp file in C++, or .py file in Python, but NEVER in C++ header files), you just have to express yourself and say that "it should be obvious" by adding a using namespace directive (or import *). Until the compiler complains because it is not.
If you use using in specific scopes, you don't break the DRY rule at all.
There have been languages where the compiler tried to "do the right thing" - Algol and PL/I come to mind. The reason they are not around anymore is that compilers are very bad at doing the right thing, but very good at doing the wrong one, given half a chance!
The ideal this rule strives for is to make creating reusable components easy - and if you reuse your component, you just don't know which symbols will be defined in other namespaces the client uses. So the rule forces you to make your intention clear with respect to further definitions you don't know about yet.
However, this ideal has not been reached for C++, mainly because of Koenig lookup.
Is it really the right thing?
What if I have two types ::bat and ::foo::bar
I want to reference the bat type but accidentally hit the r key instead of t (they're right next to each other).
Is it "the right thing" for the compiler to then go searching through every namespace to find ::foo::bar without giving me even a warning?
Or what if I use "bar" as shorthand for the "::foo::bar" type all over my codebase.
Then one day I include a library which defines a ::bar datatype. Suddenly an ambiguity exists where there was none before. And suddenly, "the right thing" has become wrong.
The right thing for the compiler to do in this case would be to assume I meant the type I actually wrote. If I write bar with no namespace prefix, it should assume I'm referring to a type bar in the global namespace. But if it does that in our hypothetical scenario, it'll change what type my code references without even alerting me.
Alternatively, it could give me an error, but come on, that'd just be ridiculous, because even with the current language rules, there should be no ambiguity here, since one of the types is hidden away in a namespace I didn't specify, so it shouldn't be considered.
Another problem is that the compiler may not know what other types exist. In C++, the order of definitions matters.
In C#, types can be defined in separate assemblies, and referenced in your code. How does the compiler know that another type with the same name doesn't exist in another assembly, just in a different namespace? How does it know that one won't be added to another assembly later on?
The right thing is to do what gives the programmer the fewest nasty surprises. Second-guessing the programmer based on incomplete data is generally not the right thing to do.
Most languages give you several tools to avoid having to specify the namespace.
In C++, you have "using namespace foo", as well as typedefs. If you don't want to repeat the namespace prefix, then don't. Use the tools made available by the language so you don't have to.
This all depends on your definition of "right thing". Is it the right thing for the compiler to guess your intention if there's only one match?
There are arguments for both sides.
Interesting question. In the case of C++, as I see it, provided the compiler flagged an error as soon as there was a conflict, the only problem this could cause would be:
Auto-lookup of all C++ namespaces would remove the ability to hide the names of internal parts of library code.
Library code often contains parts (types, functions, global variables) that are never intended to be visible to the "outside world." C++ has unnamed namespaces for exactly this reason -- to avoid "internal parts" clogging up the global namespace, even when those library namespaces are explicitly imported with using namespace xyz;.
Example: Suppose C++ did do auto-lookup, and a particular implementation of the C++ Standard Library contained an internal helper function, std::helper_func(). Suppose a user Joe develops an application containing a function joe::helper_func() using a different library implementation that does not contain std::helper_func(), and calls his own method using unqualified calls to helper_func(). Now Joe's code will compile fine in his environment, but any other user who tries to compile that code using the first library implementation will hit compiler error messages. So the first thing required to make Joe's code portable is to either insert the appropriate using declarations/directives or use fully qualified identifiers. In other words, auto-lookup buys nothing for portable code.
Admittedly, this doesn't seem like a problem that's likely to come up very often. But since typing explicit using declarations/directives (e.g. using namespace std;) is not a big deal for most people, solves this problem completely, and would be required for portable development anyway, using them (heh) seems like a sensible way to do things.
NOTE: As Klaim pointed out, you would never in any circumstances want to rely on auto-lookup inside a header file, as this would immediately prevent your module from being used at the same time as any module containing a conflicting name. (This is just a logical extension of why you don't do using namespace xyz; inside headers in C++ as it stands.)