Clojure methods ending in * - clojure

What do methods ending in * tend to have in common? I've seen a few, but have no idea if this is an established naming convention.

In general I've seen this used to distinguish functions that do the same thing but with different signatures, especially in situations where overloads would create conflicting semantics. For example, list* could not be expressed as an overload of list because they are using variable arity in different ways.
In many cases (but not all), the * form is called by the non-* version.
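For example, the core pair list and list* shows the different uses of variable arity (behavior per their docstrings):

```clojure
(list 1 2 3)           ;=> (1 2 3)
(list* 1 2 [3 4])      ;=> (1 2 3 4)  ; the last argument is treated as a seq of the remaining elements
(apply list 1 2 [3 4]) ;=> (1 2 3 4)  ; list* is essentially this
```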

Apart from what other answers have mentioned, this convention is also used where the non-* versions are macros, and these macros emit code that calls the * versions. Even in clojure.core, let and fn are macros whose expansions call the let* and fn* special forms respectively. Another example is Korma, where the non-* forms (where, delete, update, etc.) are macros and the * forms (where*, delete*, etc.) are functions.
The reason for using this pattern is that in some cases it is not feasible to use the macro version of the API (short of using eval, since you don't have the information at compile time); in such cases you can use the *-based functions.
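A minimal sketch of the pattern (a hypothetical where/where* pair for illustration, not Korma's actual implementation):

```clojure
;; where* is a plain function: it can be composed, passed to reduce,
;; or called with clauses computed at runtime
(defn where* [query clause]
  (update query :clauses conj clause))

;; where is a thin macro wrapper that captures its clause as data
(defmacro where [query clause]
  `(where* ~query '~clause))

;; the macro is convenient when the clause is known at compile time...
(where {:clauses []} (= :id 1))

;; ...but only where* works when the clauses arrive at runtime:
(reduce where* {:clauses []} '[(= :id 1) (= :name "x")])
```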

Related

Are there any more useful use-cases of functors?

I am trying to understand cases that require using functors. Most of the answers on Stack Overflow and other websites emphasize being able to define different adders or multipliers when discussing the benefits of functors.
Can the use of functors go beyond them? What are some other uses of functors?
More often than not, functors are used with other API calls that need some kind of function object. For example, sorting vectors of user-defined objects for which operator< (or the desired ordering) is not defined.
There are some cases where a set of functors may prove useful. One such case comes when you have several algorithms which functionally do the same thing, but achieve varying levels of accuracy. This happens a lot with some numeric optimization problems: given the general form of a matrix, we might use a different technique to find the solution of a linear equation (e.g., sparse vs. dense problem matrices can employ different algorithms to invert the matrix).
In particular, you should consider functors versus lambdas. In modern versions of C++, there really isn't a need to specify a functor unless you're implementing a function/method that needs a functor (or lambda) as an argument. There are some cases to consider: Do you need a unit-test? Is the functor itself a prototype of future functionality? etc.
ADDENDUM: The key thing to consider is that the choice of functor vs. lambda ultimately boils down to a design decision. As @t.niese noted in the comments, you could just use functions in combination with template arguments. In addition to the considerations above, consider whether you can make a compile-time or run-time assessment of the needed functionality.
Additionally, as you make design decisions, you may want to consider "Is there a need for this function to be used outside of this specific context?" If the answer is no, that's a compelling argument to choose a lambda over a free function. With regard to functors specifically, this was an important pattern from before the addition of lambdas to the standard. Typically they're defined in a somewhat private context (frequently in the implementation files; thus, once compiled into a library, they are hidden from users of the API). Now, with lambdas, you can simply define them within another function or even as a function argument, instead of pre-defining them before the point of use.

How does Clojure's optimiser work, and where is it?

I am new to Clojure, but not to lisp. A few of the design decisions look strange to me - specifically requiring a vector for function parameters and explicitly requesting tail calls using recur.
Translating lists to vectors (and vice versa) is a standard operation for an optimiser. Tail calls can be converted to iteration by rewriting to equivalent Clojure before compiling to byte code. The [] and recur syntax suggest that neither of these optimisations is present in the current implementation.
I would like a pointer to where in the implementation I can find any/all source-to-source transformation passes. I don't speak Java very well so am struggling to navigate the codebase.
If there isn't any optimisation before function-by-function translation to the JVM's byte code, I'd be interested in the design rationale for this. Perhaps to achieve faster compilation?
Thank you.
There is no explicit optimizer package in the compiler code. Any optimizations are done "inline". Some can be enabled or disabled via compiler flags.
Observe that literal vectors for function parameters are a syntactic choice for how functions are represented in source code. Whether they are represented as vectors, lists, or anything else does not affect runtime, so there is nothing there to optimize.
Regarding automatic recur, Rich Hickey explained his decision here:
When speaking about general TCO, we are not just talking about recursive self-calls, but also tail calls to other functions. Full TCO in the latter case is not possible on the JVM at present whilst preserving Java calling conventions (i.e. without interpreting or inserting a trampoline etc).
While making self tail-calls into jumps would be easy (after all, that's what recur does), doing so implicitly would create the wrong expectations for those coming from, e.g. Scheme, which has full TCO. So, instead we have an explicit recur construct.
Essentially it boils down to the difference between a mere optimization and a semantic promise. Until I can make it a promise, I'd rather not have partial TCO.
Some people even prefer 'recur' to the redundant restatement of the function name. In addition, recur can enforce tail-call position.
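In practice, a self tail-call is written with recur, which the compiler turns into a jump, so the loop runs in constant stack space:

```clojure
(defn sum-to [n]
  (loop [i 1, acc 0]
    (if (> i n)
      acc
      (recur (inc i) (+ acc i)))))  ; recur must be in tail position

(sum-to 1000000) ;=> 500000500000, no StackOverflowError
```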
specifically requiring a vector for function parameters
Most other lisps build structures out of syntactic lists. For an associative "map", for example, you build a list of lists. For a "vector", you make a list. For a conditional switch-like expression, you make a list of lists of lists. Lots of lists, lots of parentheses.
Clojure has made it an explicit goal to make the syntax of lisp more readable and less redundant. A map, set, list, and vector each have their own syntax delimiters, so they jump out at the eye, while also providing specific functionality that you'd otherwise have to request with a function call if they were all lists. Beyond these structural primitives, other forms like cond minimize parentheses by taking flat test/result pairs rather than wrapping each pair in yet another grouped parenthesis. This philosophy is widespread throughout the language and its core library, so the code is more readable and elegant.
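The literal syntax side by side makes the point:

```clojure
[1 2 3]        ; vector
{:a 1 :b 2}    ; map
#{1 2 3}       ; set
'(1 2 3)       ; list

;; cond takes flat test/result pairs, with no extra wrapping parentheses:
(defn sign [x]
  (cond
    (pos? x) :positive
    (neg? x) :negative
    :else    :zero))
```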
Function parameters as a vector are just part of this syntax. It's not about whether the language can convert a list to a vector easily, it's about how the language requires the placement of function parameters in a function definition -- and it does so by explicitly requiring a vector. And in fact, you can see this clearly in the source for defn:
https://github.com/clojure/clojure/blob/clojure-1.7.0/src/clj/clojure/core.clj#L296
It's just a requirement for how a function is written, that's all.

What are 'if ,define, lambda' in scheme?

We have a course whose project is to implement a micro-scheme interpreter in C++. In my implementation, I treat 'if', 'define', 'lambda' as procedures, so it is valid in my implementation to eval 'if', 'define' or 'lambda', and it is also fine to write expressions like '(apply define (quote (a 1)))', which will bind 'a' to 1.
But I find in racket and in mit-scheme, 'if', 'define', 'lambda' are not evaluable. For example,
It seems that they are not procedures, but I cannot figure out what they are and how they are implemented.
Can someone explain these to me?
In the terminology of Lisp, expressions to be evaluated are forms. Compound forms (those which use list syntax) are divided into special forms (headed by special operators like let), macro forms, and function call forms.
The Scheme report doesn't use this terminology. It calls functions "procedures". Scheme's special forms are called "syntax". Macros are "derived expression types", individually introduced as "library syntax". (The motivation for this may be a conscious decision to blend into the CS academic mainstream by scrubbing some unfamiliar Lisp terminology: Algol has procedures and a BNF-defined syntax, Scheme has procedures and a BNF-defined syntax. That ticks off some sort of familiarity checkbox.)
Special forms (or "syntax") are recognized by interpreters and compilers as a set of special cases. The interpreter or compiler may handle these forms via function-like bindings in some internal table keyed on symbols, but it's not the program-visible binding namespace.
Setting up these associations in the regular namespace isn't necessarily wrong, but it could be problematic. If you want both a compiler and an interpreter, but let has only one top-level binding, that will be an issue: who gets to install their procedure into that binding, the interpreter or the compiler? (One way to resolve that is simple: make the binding values cons pairs: the car can be the interpreter function, the cdr the compiler function. But then these bindings are no longer procedures that you can apply.)
Exposing these bindings to the application is problematic anyway, because the semantics are so different between interpretation and compilation. If your implementation is interpreted, then calling the define binding as a function is possible; it has the effect of performing the definition. But in a compiled implementation, code depending on this won't work; define will be a function that doesn't actually define anything, but rather compiles: it calculates and returns a compiled fragment written in some intermediate representation.
About your implementation, the fact that (apply define (quote (a 1))) works in your implementation raises a bit of a red flag. Either you've made the environment parameter of the function optional, or it doesn't take one. Functions implementing special operators (or "syntax") need an environment parameter, not just the piece of syntax. (At least if we are developing a lexically scoped Scheme or Lisp!)
The fact that (apply define (quote (a 1))) works also suggests that your define function is taking a and 1 as arguments. While that is workable, the usual approach for these kinds of syntax procedures is to take the whole form as one argument (and a lexical environment as another argument). If such a function can be called, the invocation looks something like (apply define (list '(define a 1) (null-environment 5))). The procedure itself will do any necessary destructuring on the syntax, and checking for validity: are there too many or too few parameters, and so on.

What's the convention for using an asterisk at the end of a function name in Clojure and other Lisp dialects?

Note that I'm not talking about ear muffs in symbol names, an issue that is discussed at Conventions, Style, and Usage for Clojure Constants? and How is the `*var-name*` naming-convention used in clojure?. I'm talking strictly about instances where there is some function named foo that then calls a function foo*.
In Clojure it basically means "foo* is like foo, but somehow different, and you probably want foo". In other words, it means that the author of that code couldn't come up with a better name for the second function, so they just slapped a star on it.
Mathematicians and Haskellers use apostrophes to indicate similar objects (values or functions) - similar but not quite the same, objects that relate to each other. For instance, function foo could perform a calculation one way, and foo' would produce the same result with a different approach. Perhaps it is unimaginative naming, but it has roots in mathematics.
Lisps generally (for no particularly strong reason) have discarded apostrophes in symbol names, and * kind of resembles an apostrophe. Clojure 1.3 will finally fix that by allowing apostrophes in names!
If I understand your question correctly, I've seen instances where foo* was used to show that the function is equivalent to another in theory, but uses different semantics. Take for instance the lamina library, which defines things like map*, filter*, take* for its core type, channels. Channels are similar enough to seqs that the names of these functions make sense, but they are not compatible enough that they should be "equal" per se.
Another use case I've seen for foo* style is for functions which call out to a helper function with an extra parameter. The fact function, for instance, might delegate to fact* which accepts another parameter, the accumulator, if written recursively. You don't necessarily want to expose in fact that there's an extra argument, because calling (fact 5 100) isn't going to compute for you the factorial of 5--exposing that extra parameter is an error.
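A sketch of that helper pattern:

```clojure
;; fact* carries the accumulator; it is an implementation detail
(defn fact* [n acc]
  (if (zero? n)
    acc
    (recur (dec n) (* acc n))))

;; fact is the public entry point and hides the extra argument
(defn fact [n]
  (fact* n 1))

(fact 5) ;=> 120
```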
I've also seen the same style for macros. The macro foo expands into a function call to foo*.
a normal let binding (let ((...))) creates separate variables in parallel
a let star binding (let* ((...))) creates variables sequentially, so that later variables can be computed from earlier ones, like so:
(let* ((x 10) (y (+ x 5))) y)
I could be slightly off base, but see LET versus LET* in Common Lisp for more detail.
EDIT: I'm not sure about how this reflects in Clojure, I've only started reading Programming Clojure so I don't know yet

Why and how should I use namespaces in C++?

I have never used namespaces for my code before. (Other than for using STL functions)
Other than for avoiding name conflicts, is there any other reason to use namespaces?
Do I have to enclose both declarations and definitions in namespace scope?
One reason that's often overlooked: simply by changing a single line of code to select one namespace over another, you can select an alternative set of functions/variables/types/constants - such as another version of a protocol, single-threaded versus multi-threaded support, or OS support for platform X or Y - then compile and run. A similar effect might be achieved by including a header with different declarations, or with #defines and #ifdefs, but that crudely affects the entire translation unit, and linking different versions together can give undefined behaviour. With namespaces, you can make selections via using namespace that apply only within the active namespace, or via a namespace alias so they apply only where that alias is used; either way, the names resolve to distinct linker symbols, so different selections can be combined without undefined behaviour. This can be used in a way similar to template policies, but the effect is more implicit, automatic and pervasive - a very powerful language feature.
UPDATE: addressing marcv81's comment...
Why not use an interface with two implementations?
"interface + implementations" is conceptually what choosing a namespace to alias above is doing, but if you mean specifically runtime polymorphism and virtual dispatch:
the resultant library or executable doesn't need to contain all implementations and constantly direct calls to the selected one at runtime
as only one implementation is incorporated, the compiler can use myriad optimisations including inlining and dead code elimination, and constants differing between the "implementations" can be used for e.g. sizes of arrays - allowing automatic memory allocation instead of slower dynamic allocation
different namespaces have to support the same semantics of usage, but aren't bound to support the exact same set of function signatures as is the case for virtual dispatch
with namespaces you can supply custom non-member functions and templates: that's impossible with virtual dispatch (and non-member functions help with symmetric operator overloading - e.g. supporting 22 + my_type as well as my_type + 22)
different namespaces can specify different types to be used for certain purposes (e.g. a hash function might return a 32 bit value in one namespace, but a 64 bit value in another), but a virtual interface needs to have unifying static types, which means clumsy and high-overhead indirection like boost::any or boost::variant or a worst case selection where high-order bits are sometimes meaningless
virtual dispatch often involves compromises between fat interfaces and clumsy error handling: with namespaces there's the option to simply not provide functionality in namespaces where it makes no sense, giving a compile-time enforcement of necessary client porting effort
Here is a good reason (apart from the obvious stated by you).
Since namespaces can be discontiguous and spread across translation units, they can also be used to separate interface from implementation details.
Definitions of names in a namespace can be provided either in the same namespace or in any of the enclosing namespaces (with fully qualified names).
It can help you for a better comprehension.
eg:
std::func     <- all functions/classes from the C++ standard library
lib1::func    <- all functions/classes from a specific library
module1::func <- all functions/classes for a module of your system
You can also think of it as module in your system.
It can also be useful for writing documentation (e.g. you can easily document a namespace entity in Doxygen).
Aren't name collisions enough of a reason? ADL subtleties, especially with operator overloads, are another.
Enclosing both declarations and definitions in the namespace is the easiest way. You can also prefix names with the namespace, e.g. my_namespace::name, when defining them outside the namespace block.
You can think of namespaces as logically separated units of your application. Suppose we have two different classes, each in its own file; when you notice that these classes share enough to be categorized under one heading, that's a strong reason to group them in a namespace.
Answer: If you ever want to overload the new, placement new, or delete functions you're going to want to do them in a namespace. No one wants to be forced to use your version of new if they don't require the things you require.
Yes