Does Frege perform tail call optimization?

Are tail calls optimised in Frege? I know that there is no TCO in Java, nor in languages that compile to JVM bytecode, like Clojure and Scala. What about Frege?

Frege does Tail Recursion Optimization by simply generating while loops.
General tail calls are handled "by the way" through laziness. If the compiler sees a tail call to a suspect function that is known to be (indirectly) recursive, a lazy result (a thunk) is returned. Thus, the real burden of calling that function lies with the caller. This way, stacks whose depth depends on the data are avoided.
That being said, the static stack depth is by nature deeper in a functional language than in Java. Hence, some programs will need to be given a bigger stack (e.g. with -Xss1m).
There are pathological cases where big thunks are built and, when they are evaluated, a stack overflow happens. A notorious example is the foldl function (same problem as in Haskell). Hence, the standard left fold in Frege is fold, which is tail recursive and strict in the accumulator and thus works in constant stack space (like Haskell's foldl').
The following program should not stack overflow but print "false" after 2 or 3s:
module Test
-- inline (odd)
where
even 0 = true
even 1 = false
even n = odd (pred n)
odd n = even (pred n)
main args = println (even 123_456_789)
This works as follows: println must have a value to print, so it tries to evaluate (even n). But all it gets is a thunk to (odd (pred n)). Hence it tries to evaluate this thunk, which yields another thunk to (even (pred (pred n))). even must evaluate (pred (pred n)) to see if the argument was 0 or 1, before returning another thunk (odd (pred (n-2))), where n-2 is already evaluated.
This way, all the calling (at JVM level) is done from within println. At no time does even actually invoke odd, or vice versa.
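The mechanism described above can be imitated outside Frege. The following Python sketch is only an analogy, not what the Frege compiler emits: a zero-argument function stands in for a thunk, and the hypothetical driver force plays the role of println by evaluating thunks in a loop, so even and odd never call each other directly.

```python
def even(n):
    if n == 0:
        return True
    if n == 1:
        return False
    return lambda: odd(n - 1)    # the tail call is replaced by a thunk

def odd(n):
    return lambda: even(n - 1)   # likewise: return a thunk, do not call

def force(value):
    # Keep evaluating until something that is not a thunk comes back.
    # All calls happen from this loop, so the stack depth stays constant.
    while callable(value):
        value = value()
    return value

print(force(even(100_001)))   # False
```

The loop in force is the only place where a call happens, which is why the depth of the stack no longer depends on n.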
If one uncomments the inline directive, one gets a tail recursive version of even, and the result is obtained ten times faster.
Needless to say, this clumsy algorithm is only for demonstration - normally one would check for even-ness with a bit operation.
Here is another version, that is pathological and will stack overflow:
even 0 = true
even 1 = false
even n = not . odd $ n
odd = even . pred
The problem here is that not is the tail call and it is strict in its argument (i.e., to negate something, you must first have that something). Hence, when even n is computed, not must fully evaluate odd n which, in turn, must fully evaluate even (pred n), and thus it will take 2*n stack frames.
Unfortunately, this is not going to change, even if the JVM should have proper tail calls one day. The reason is the recursion in the argument of a strict function.

Related

Evaluation of arguments in function called by macro

Macros do not evaluate their arguments until explicitly told to do so, however functions do. In the following code:
(defmacro foo [xs]
(println xs (type xs)) ;; unquoted list
(blah xs))
(defn blah [xs] ;; xs is unquoted list, yet not evaluated
(println xs)
xs)
(foo (+ 1 2 3))
It seems that blah does not evaluate xs, since we still have the entire list: (+ 1 2 3) bound to xs in the body of blah.
I have basically just memorized this interaction between helper functions within macros and their evaluation of arguments, but to be honest it goes against what my instincts are (that xs would be evaluated before entering the body since function arguments are always evaluated).
My thinking was basically: "ok, in this macro body I have xs as the unevaluated list, but if I call a function with xs from within the macro it should evaluate that list".
Clearly I have an embarrassingly fundamental misunderstanding here of how things work. What am I missing in my interpretation? How does the evaluation actually occur?
EDIT
I've thought on this a bit more and it seems to me that maybe viewing macro arguments as "implicitly quoted" would solve some confusion on my part.
I think I just got mixed up in the various terminologies, but given that quoted forms are synonymous with unevaluated forms, and given macro arguments are unevaluated, they are implicitly quoted.
So in my above examples, saying xs is unquoted is somewhat misleading. For example, this macro:
(defmacro bluh [xs]
`(+ 1 2 ~xs))
Is basically the same as the below macro (excluding namespacing on the symbols). Resolving xs in the call to list gives back an unevaluated (quoted?) list.
(defmacro bleh [xs]
(list '+ '1 '2 xs)) ;; xs resolves to a quoted list (or effectively quoted)
Calling bleh (or bluh) is the same as saying:
(list '+ '1 '2 '(+ 1 2 3))
;; => (+ 1 2 (+ 1 2 3))
If xs did not resolve to a quoted list, then we would end up with:
(list '+ '1 '2 (+ 1 2 3))
;; => (+ 1 2 6)
So, in short, macro arguments are quoted.
I think part of my confusion came from thinking about syntax-quoted forms as templates with slots filled in, e.g. (+ 1 2 ~xs) I would mentally expand to (+ 1 2 (+ 1 2 3)), and seeing that (+ 1 2 3) was not quoted in that expansion, I found it confusing that function calls using xs (blah in the first example above) would not evaluate immediately to 6.
The template metaphor is helpful, but if I instead look at it as a shortcut for (list '+ '1 '2 xs) it becomes obvious that xs must be a quoted list, otherwise the expansion would include 6 and not the entire list.
I'm not sure why I found this so confusing... have I got this right or did I just go down the wrong path entirely?

[This answer is an attempt to explain why macros and functions which don't evaluate their arguments are different things. I believe this applies to macros in Clojure but I am not an expert on Clojure. It's also much too long, sorry.]
I think you are confused between what Lisp calls macros and a construct which modern Lisps don't have but which used to be called FEXPRs.
There are two interesting, different, things you might want:
functions which, when called, do not immediately evaluate their arguments;
syntax transformers, which are called macros in Lisp.
I'll deal with them in order.
Functions which do not immediately evaluate their arguments
In a conventional Lisp, a form like (f x y ...), where f is a function, will:
(1) establish that f is a function and not some special thing;
(2) get the function corresponding to f and evaluate x, y, & the rest of the arguments in some order specified by the language (which may be 'in an unspecified order');
(3) call f with the results of evaluating the arguments.
Step (1) is needed initially because f might be a special thing (like, say if, or quote), and it might be that the function definition is retrieved in (1) as well: all of this, as well as the order that things happen in in (2) is something the language needs to define (or, in the case of Scheme say, leave explicitly undefined).
This ordering, and in particular the ordering of (2) & (3) is known as applicative order or eager evaluation (I'll call it applicative order below).
But there are other possibilities. One such is that the arguments are not evaluated: the function is called, and only when the values of the arguments are needed are they evaluated. There are two approaches to doing this.
The first approach is to define the language so that all functions work this way. This is called lazy evaluation or normal order evaluation (I'll call it normal order below). In a normal order language function arguments are evaluated, by magic, at the point they are needed. If they are never needed then they may never be evaluated at all. So in such a language (I am inventing the syntax for function definition here so as not to commit CL or Clojure or anything else):
(def foo (x y z)
(if x y z))
Only one of y or z will be evaluated in a call to foo.
In a normal order language you don't need to explicitly care about when things get evaluated: the language makes sure that they are evaluated by the time they're needed.
Normal order languages seem like they'd be an obvious win, but they tend to be quite hard to work with, I think. There are two problems, one obvious and one less so:
side-effects happen in a less predictable order than they do in applicative order languages and may not happen at all, so people used to writing in an imperative style (which is most people) find them hard to deal with;
even side-effect-free code can behave differently than in an applicative order language.
The side-effect problem could be treated as a non-problem: we all know that code with side-effects is bad, right, so who cares about that? But even without side-effects things are different. For instance here's a definition of the Y combinator in a normal order language (this is kind of a very austere, normal order subset of Scheme):
(define Y
((λ (y)
(λ (f)
(f ((y y) f))))
(λ (y)
(λ (f)
(f ((y y) f))))))
If you try to use this version of Y in an applicative order language -- like ordinary Scheme -- it will loop for ever. Here's the applicative order version of Y:
(define Y
((λ (y)
(λ (f)
(f (λ (x)
(((y y) f) x)))))
(λ (y)
(λ (f)
(f (λ (x)
(((y y) f) x)))))))
You can see it's kind of the same, but there are extra λs in there which essentially 'lazify' the evaluation to stop it looping.
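Python is an applicative order language, so the difference between the two definitions can be tried out there. This is a sketch under that assumption; the helper names lazy_half and eager_half are made up for illustration and correspond to the repeated (λ (y) ...) term in each definition.

```python
# Normal-order definition transcribed literally: under eager evaluation the
# argument y(y)(f) is computed before f is ever called, so it never returns.
def lazy_half(y):
    return lambda f: f(y(y)(f))

# Applicative-order definition: the extra lambda delays y(y)(f) until the
# recursive function is actually applied to an argument x.
def eager_half(y):
    return lambda f: f(lambda x: y(y)(f)(x))

Y_normal = lazy_half(lazy_half)
Y_applicative = eager_half(eager_half)

factorial = Y_applicative(lambda self: lambda n: 1 if n == 0 else n * self(n - 1))
print(factorial(10))   # 3628800

try:
    Y_normal(lambda self: lambda n: 1 if n == 0 else n * self(n - 1))
except RecursionError:
    print("the normal-order Y does not terminate under eager evaluation")
```

In Python the endless loop shows up as a RecursionError rather than as a program that hangs, because the interpreter caps the call depth.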
The second approach to normal order evaluation is to have a language which is mostly applicative order but in which there is some special mechanism for defining functions which don't evaluate their arguments. In this case there often would need to be some special mechanism for saying, in the body of the function, 'now I want the value of this argument'. Historically such things were called FEXPRs, and they existed in some very old Lisp implementations: Lisp 1.5 had them, and I think that both MACLISP & InterLisp had them as well.
In an applicative order language with FEXPRs, you need somehow to be able to say 'now I want to evaluate this thing', and I think this is the problem you are running up against: at what point does the thing decide to evaluate the arguments? Well, in a really old Lisp which is purely dynamically scoped there's a disgusting hack to do this: when defining a FEXPR you can just pass in the source of the argument and then, when you want its value, you just call EVAL on it. That's just a terrible implementation because it means that FEXPRs can never really be compiled properly, and you have to use dynamic scope so variables can never really be compiled away. But this is how some (all?) early implementations did it.
But this implementation of FEXPRs allows an amazing hack: if you have a FEXPR which has been given the source of its arguments, and you know that this is how FEXPRs work, then, well, it can manipulate that source before calling EVAL on it: it can call EVAL on something derived from the source instead. And, in fact, the 'source' it gets given doesn't even need to be strictly legal Lisp at all: it can be something which the FEXPR knows how to manipulate to make something that is. That means you can, all of a sudden, extend the syntax of the language in pretty general ways. But the cost of being able to do that is that you can't compile any of this: the syntax you construct has to be interpreted at runtime, and the transformation happens each time the FEXPR is called.
Syntax transformers: macros
So, rather than use FEXPRs, you can do something else: you could change the way that evaluation works so that, before anything else happens, there is a stage during which the code is walked over and possibly transformed into some other code (simpler code, perhaps). And this need happen only once: once the code has been transformed, then the resulting thing can be stashed somewhere, and the transformation doesn't need to happen again. So the process now looks like this:
code is read in and structure built from it;
this initial structure is possibly transformed into other structure;
(the resulting structure is possibly compiled);
the resulting structure, or the result of compiling it is evaluated, probably many times.
So now the process of evaluation is divided into several 'times', which don't overlap (or don't overlap for a particular definition):
read time is when the initial structure is built;
macroexpansion time is when it is transformed;
compile time (which may not happen) is when the resulting thing is compiled;
evaluation time is when it is evaluated.
Well, compilers for all languages probably do something like this: before actually turning your source code into something that the machine understands they will do all sorts of source-to-source transformations. But these things are in the guts of the compiler and are operating on some representation of the source which is idiosyncratic to that compiler and not defined by the language.
Lisp opens this process to users. The language has two features which make this possible:
the structure that is created from source code once it has been read is defined by the language and the language has a rich set of tools for manipulating this structure;
the structure created is rather 'low commitment' or austere -- it does not particularly predispose you to any interpretation in many cases.
As an example of the second point, consider (in "my.file"): that's a function call of a function called in, right? Well, maybe: (with-open-file (in "my.file") ...) almost certainly is not a function call, but binding in to a filehandle.
Because of these two features of the language (and in fact some others I won't go into) Lisp can do a wonderful thing: it can let users of the language write these syntax-transforming functions -- macros -- in portable Lisp.
The only thing that remains is to decide how these macros should be notated in source code. And the answer is the same way as functions are: when you define some macro m you use it just as (m ...) (some Lisps support more general things, such as CL's symbol macros). At macroexpansion time -- after the program is read but before it is (compiled and) run -- the system walks over the structure of the program looking for things which have macro definitions: when it finds them it calls the function corresponding to the macro with the source code specified by its arguments, and the macro returns some other chunk of source code, which gets walked in turn until there are no macros left (and yes, macros can expand to code involving other macros, and even to code involving themselves). Once this process is complete then the resulting code can be (compiled and) run.
So although macros look like function calls in the code, they are not just functions which don't evaluate their arguments, like FEXPRs were: instead they are functions which take a bit of Lisp source code and return another bit of Lisp source code: they're syntax transformers, or functions which operate on source code (syntax) and return other source code. Macros run at macroexpansion time which is properly before evaluation time (see above).
So, in fact macros are functions, written in Lisp, and the functions they call evaluate their arguments perfectly conventionally: everything is perfectly ordinary. But the arguments to macros are programs (or the syntax of programs represented as Lisp objects of some kind) and their results are (the syntax of) other programs. Macros are functions at the meta-level, if you like. So a macro is a function which computes (parts of) programs: those programs may later themselves be run (perhaps much later, perhaps never) at which point the evaluation rules will be applied to them. But at the point a macro is called what it's dealing with is just the syntax of programs, not evaluating parts of that syntax.
So, I think your mental model is that macros are something like FEXPRs in which case the 'how does the argument get evaluated' question is an obvious thing to ask. But they're not: they're functions which compute programs, and they run properly before the program they compute is run.
Sorry this answer has been so long and rambling.
What happened to FEXPRs?
FEXPRs were always pretty problematic. For instance what should (apply f ...) do? Since f might be a FEXPR, but this can't generally be known until runtime it's quite hard to know what the right thing to do is.
So I think that two things happened:
in the cases where people really wanted normal order languages, they implemented those, and for those languages the evaluation rules dealt with the problems FEXPRs were trying to deal with;
in applicative order languages then if you want to not evaluate some argument you now do it by explicitly saying that using constructs such as delay to construct a 'promise' and force to force evaluation of a promise -- because the semantics of the languages improved it became possible to implement promises entirely in the language (CL does not have promises, but implementing them is essentially trivial).
Is the history I've described correct?
I don't know: I think it may be but it may also be a rational reconstruction. I certainly, in very old programs in very old Lisps, have seen FEXPRs being used the way I describe. I think Kent Pitman's paper, Special Forms in Lisp may have some of the history: I've read it in the past but had forgotten about it until just now.

A macro definition is a definition of a function that transforms code. The inputs to the macro function are the forms in the macro call. The return value of the macro function will be treated as code inserted where the macro form was. Clojure code is made of Clojure data structures (mostly lists, vectors, and maps).
In your foo macro, you define the macro function to return whatever blah did to your code. Since blah is (almost) the identity function, it just returns whatever was its input.
What is happening in your case is the following:
The string "(foo (+ 1 2 3))" is read, producing a nested list with two symbols and three integers: (foo (+ 1 2 3)).
The foo symbol is resolved to the macro foo.
The macro function foo is invoked with its argument xs bound to the list (+ 1 2 3).
The macro function (prints and then) calls the function blah with the list.
blah (prints and then) returns that list.
The macro function returns the list.
The macro is thus “expanded” to (+ 1 2 3).
The symbol + is resolved to the addition function.
The addition function is called with three arguments.
The addition function returns their sum.
If you wanted the macro foo to expand to a call to blah, you need to return such a form. Clojure provides a templating convenience syntax using backquote, so that you do not have to use list etc. to build the code:
(defmacro foo [xs]
`(blah ~xs))
which is like:
(defmacro foo [xs]
(list 'blah xs))
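The steps listed above (read, expand, evaluate) can be modelled in a few lines. The following Python sketch is a toy, not how Clojure is implemented: forms are nested tuples, and the tables MACROS and FUNCTIONS and the functions macroexpand and evaluate are all invented for this illustration. It shows that a macro is an ordinary function whose argument happens to be a piece of code.

```python
FUNCTIONS = {"+": lambda *args: sum(args)}

def blah(xs):
    # An ordinary function. Its argument was "evaluated" only in the sense that
    # the variable xs was looked up, and the value of xs is a form (a tuple).
    return xs

MACROS = {
    # foo returns whatever blah returns, i.e. the argument form itself.
    "foo": lambda xs: blah(xs),
    # bleh builds a new form around the argument form, like (list '+ '1 '2 xs).
    "bleh": lambda xs: ("+", 1, 2, xs),
}

def macroexpand(form):
    # Expand macro calls at the head of the form, then inside the arguments.
    while isinstance(form, tuple) and form[0] in MACROS:
        form = MACROS[form[0]](*form[1:])
    if isinstance(form, tuple):
        return (form[0],) + tuple(macroexpand(arg) for arg in form[1:])
    return form

def evaluate(form):
    if isinstance(form, tuple):
        args = [evaluate(arg) for arg in form[1:]]   # applicative order
        return FUNCTIONS[form[0]](*args)
    return form   # numbers evaluate to themselves

call = ("foo", ("+", 1, 2, 3))
print(macroexpand(call))                        # ('+', 1, 2, 3)
print(evaluate(macroexpand(call)))              # 6
print(macroexpand(("bleh", ("+", 1, 2, 3))))    # ('+', 1, 2, ('+', 1, 2, 3))
```

Nothing inside macroexpand ever computes 6; the addition only happens when evaluate later walks the expanded form.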

Iterate produces StackOverflow errors

So I just started out with Frege and Haskell as well. I have experience with functional languages, since I was using Clojure for a couple of years now.
The first thing I wanted to try out is my usual approach at the Fibonacci numbers.
next_fib (a, b) = (b, a + b)
fibs = map fst $ iterate next_fib (0, 1)
fib x = head $ drop x fibs
This is how it turned out in Frege. It works, but for very high arguments to fib, e.g. (fib 4000), it throws StackOverflow errors. This surprised me, because the same functions in Clojure would work just fine. Is this a Frege bug or am I getting the whole lazy evaluation thing wrong?

You probably don't "get the whole lazy evaluation thing wrong", but you're bitten twice by too lazy evaluation in this case.
And although GHC essentially works exactly the same as Frege in this regard, the outcome is different and seemingly unfavorable for Frege.
But the reason that Haskell can get away with really big thunks [see below], while Frege aborts early with a stack overflow, is the way the runtime systems manage heap and stack. The Haskell RTS is flexible and can devote huge portions of the available memory to the stack if the need arises, whereas Frege's runtime system is the JVM, which usually starts out with a tiny stack, just enough to accommodate a call depth of a few hundred. As you have observed, giving the JVM enough stack space makes the thing work, exactly like it would in GHC.
Because of the ever scarce stack space in the JVM, we have developed some techniques in Frege to avoid unwanted and unneeded laziness. Two of them will be explained below. In the end, in Frege you are forced to control bad effects of laziness early, while the GHC developer can happily code away without having to take notice.
To understand the following, we need to introduce the concept "thunk". A thunk is first and foremost some yet to be evaluated expression. For example, since tuples are lazy, an expression like
(b, a+b)
is compiled to an application of the tuple constructor (,) to b and {a+b}, where the notation { e }, for the sake of this discussion, means some implementation dependent representation of a thunk that promises to compute the expression e when evaluated. In addition, a thunk memoizes its result upon evaluation, so whenever the same thunk is evaluated again, it just returns the precomputed result. (This is only possible in a pure functional language, of course.)
For example, in Frege, to represent thunks there is a class Delayed<X> that implements Callable<X> and arranges for memoization of the result.
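To make the idea of a memoizing thunk concrete, here is a small Python sketch. It is not Frege's Delayed class; the class name Thunk and its method force are assumptions made for this illustration only.

```python
class Thunk:
    """A delayed computation that runs at most once."""
    _UNSET = object()

    def __init__(self, compute):
        self._compute = compute
        self._value = Thunk._UNSET

    def force(self):
        if self._value is Thunk._UNSET:
            self._value = self._compute()
            self._compute = None   # the closure is no longer needed
        return self._value

log = []

def add():
    log.append("run")   # record every real evaluation
    return 0 + 1

t = Thunk(add)                             # plays the role of {0+1}
print(t.force(), t.force(), len(log))      # 1 1 1
```

The second call to force returns the stored value without running add again, which is the memoization described above.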
We shall now investigate what the result of
next_fib (next_fib (0, 1))
is. The inner application results in:
(1, {0+1})
and then the outer one computes from that:
({0+1}, {1+{0+1}})
We see here that thunks can get nested in other thunks, and this is the problem here, since every application of next_fib will result in a tuple that will have as its elements thunks that have thunks of the previous iteration nested inside them.
Now consider what is happening when the thunk for the 4000th fib-number gets evaluated, which happens, for instance, when you print it. It will have to perform an addition, but the numbers to add are actually both thunks, which must be evaluated before the addition can take place. In this way, each nested thunk means an invocation of that thunk's evaluation method, unless the thunk is already evaluated. Hence, to print the 4000th number, we need a stack depth of at least 4000 in the case when no other thunk of this series was evaluated before.
So the first measure was to replace the lazy tuple constructor with the strict one:
(b; a+b)
It doesn't build thunks but computes the arguments right away. This is not available in Haskell; to do the same there you need to say something like:
let c = a+b in b `seq` c `seq` (b,c)
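The difference between the lazy and the strict tuple can be reproduced in Python as a rough analogy. In this sketch the helper delay (a made-up name) builds a memoizing thunk as a closure, next_fib_lazy mimics the lazy tuple and next_fib_strict mimics the strict one. It assumes CPython's default recursion limit, which plays the part of the small JVM stack.

```python
def delay(compute):
    """Wrap a zero-argument function in a memoizing thunk."""
    cache = []
    def force():
        if not cache:
            cache.append(compute())
        return cache[0]
    return force

def next_fib_lazy(pair):
    a, b = pair                            # a and b are thunks
    return (b, delay(lambda: a() + b()))   # the sum is postponed

def next_fib_strict(pair):
    a, b = pair                            # a and b are plain integers
    return (b, a + b)                      # the sum is computed right away

def nth(step, start, n):
    pair = start
    for _ in range(n):
        pair = step(pair)
    return pair[0]

print(len(str(nth(next_fib_strict, (0, 1), 4000))))   # 836

lazy_result = nth(next_fib_lazy, (delay(lambda: 0), delay(lambda: 1)), 4000)
try:
    lazy_result()   # forcing walks through thousands of nested thunks
except RecursionError:
    print("forcing the nested thunks exhausted the stack")
```

Building the lazy result is cheap and uses no stack; it is only the final force that needs a call depth proportional to the number of iterations.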
But this was not the end of the story. It turned out that the computation fib 4000 still overflowed the stack.
The reason is the implementation of iterate that goes like this:
iterate f x = x : iterate f (f x)
This builds an infinite list
[ x, f x, f (f x), f (f (f x)), ...]
Needless to say, all the terms except the first one are thunks!
This is normally not a problem when the list elements are evaluated in sequential order, because when, for example, the 3rd term
{f {f x}}
gets evaluated, the inner thunk is already evaluated and returns the result right away. In general, we need only enough stack depth to reach the first previously evaluated term. Here is a demo straight from the frege online REPL at try.frege-lang.org
frege> next (a,b) = (b; a+b) :: (Integer, Integer)
function next :: (Integer,Integer) -> (Integer,Integer)
frege> fibs = map fst $ iterate next (0,1)
function fibs :: [Integer]
frege> fib = (fibs!!)
function fib :: Int -> Integer
frege> map (length . show . fib) [0,500 ..]
[1,105,209,314,418,523,627,732,836,941,1045,1150,...]
frege> fib 4000
39909473435004422792081248094960912600792...
Here, with the map, we force evaluation of every 500th number (as far as the REPL demands output, it will only print initial portions of infinite lists), and compute the length of the decimal representation of each number (just so as not to display the large resulting numbers). This, in turn forces evaluation of the 500 preceding numbers, but this is ok, as there is enough stack space for that. Once that is done, we can even compute fib 4000! Because now, all the thunks up to 6000 are already evaluated.
But we can do even better with a slightly better version of iterate, which uses the head strict constructor (!:):
a !: as = a `seq` (a:as)
This evaluates the head of the list right away, which is appropriate in our case.
With the two changes, we get a program whose stack demand does not depend on the argument of fib anymore. Here is the proof:
frege> iterate' f x = x !: iterate' f (f x)
function iterate' :: (a->a) -> a -> [a]
frege> fibs2 = map fst $ iterate' next (0,1)
function fibs2 :: [Integer]
frege> (length . show . (fibs2 !!)) 4000
836
frege> (length . show . (fibs2 !!)) 8000
1672
frege> (length . show . (fibs2 !!)) 16000
3344
frege> (length . show . (fibs2 !!)) 32000
6688
frege> (length . show . (fibs2 !!)) 64000
13375
frege> (length . show . (fibs2 !!)) 128000
java.lang.OutOfMemoryError: Java heap space
Well, we'd need more heap space now to keep more than 100.000 huge numbers. But notice that there was no stack problem anymore to compute 32.000 new numbers in the last step.
We could get rid of the heap space problem with a simple tail recursive definition that doesn't need to keep all those numbers:
fib :: Int -> Integer
fib n = go n 0 1 where
go :: Int -> Integer -> Integer -> Integer
go 0 !a !b = a
go n !a !b = go (n-1) b (a+b)
I guess this would be even faster than traversing the list.
Unlike(?) in Clojure, direct list access is O(n), and long lists consume lots of space. Therefore, if you need to cache something and have an upper limit, you had better use arrays. Here are 2 ways to construct an array of 10000 fibs:
frege> zzz = arrayFromList $ take 10000 $ map fst $ iterate (\(a,b) -> (b; a+b)) (0n,1)
function zzz :: JArray Integer
frege> elemAt zzz 4000
39909473435004422792081248094960912600792570982820257 ...
This works because the intermediate list should never exist as a whole. And once created, access is O(1).
And there is also a special function for creating caches like that:
yyy = arrayCache f 10000 where
f 0 a = 0n
f 1 a = 1n
f n a = elemAt a (n-1) + elemAt a (n-2)
fib = elemAt yyy
This avoids even the intermediate list, all the tuples, and so on.
This way, you can keep your good habit of preferring combinators over explicit recursion. Please give it a try.

Using lazy-seq without blowing the stack: is it possible to combine laziness with tail recursion?

To learn Clojure, I'm solving the problems at 4clojure. I'm currently cutting my teeth on question 164, where you are to enumerate (part of) the language a DFA accepts. An interesting condition is that the language may be infinite, so the solution has to be lazy (in that case, the test cases for the solution (take 2000 ....
I have a solution that works on my machine, but when I submit it on the website, it blows the stack (if I increase the amount of acceptable strings to be determined from 2000 to 20000, I also blow the stack locally, so it's a deficiency of my solution).
My solution[1] is:
(fn [dfa]
(let [start-state (dfa :start)
accept-states (dfa :accepts)
transitions (dfa :transitions)]
(letfn [
(accept-state? [state] (contains? accept-states state))
(follow-transitions-from [state prefix]
(lazy-seq (mapcat
(fn [pair] (enumerate-language (val pair) (str prefix (key pair))))
(transitions state))))
(enumerate-language [state prefix]
(if (accept-state? state)
(cons prefix (follow-transitions-from state prefix))
(follow-transitions-from state prefix)))
]
(enumerate-language start-state ""))
)
)
it accepts the DFA
'{:states #{q0 q1 q2 q3}
:alphabet #{a b c}
:start q0
:accepts #{q1 q2 q3}
:transitions {q0 {a q1}
q1 {b q2}
q2 {c q3}}}
and returns the language that DFA accepts (#{a ab abc}). However, when determining the first 2000 accepted strings of DFA
(take 2000 (f '{:states #{q0 q1}
:alphabet #{0 1}
:start q0
:accepts #{q0}
:transitions {q0 {0 q0, 1 q1}
q1 {0 q1, 1 q0}}}))
it blows the stack. Obviously I should restructure the solution to be tail recursive, but I don't see how that is possible. In particular, I don't see how it is even possible to combine laziness with tail-recursiveness (via either recur or trampoline). The lazy-seq function creates a closure, so using recur inside lazy-seq would use the closure as the recursion point. When using lazy-seq inside recur, the lazy-seq is always evaluated, because recur issues a function call that needs to evaluate its arguments.
When using trampoline, I don't see how I can iteratively construct a list whose elements can be lazily evaluated. As I have used it and seen it used, trampoline can only return a value when it finally finishes (i.e. one of the trampolining functions does not return a function).
Other solutions are considered out of scope
I consider a different kind of solution to this 4Clojure problem out of scope of this question. I'm currently working on a solution using iterate, where each step only calculates the strings the 'next step' (following transitions from the current state) accepts, so it doesn't recurse at all. You then only keep track of current states and the strings that got you into that state (which are the prefixes for the next states). What's proving difficult in that case is detecting when a DFA that accepts a finite language will no longer return any results. I haven't yet devised a proper stop-criterion for the take-while surrounding the iterate, but I'm pretty sure I'll manage to get this solution to work. For this question, I'm interested in the fundamental question: can laziness and tail-recursiveness be combined or is that fundamentally impossible?
[1] Note that there are some restrictions on the site, like not being able to use def and defn, which may explain some peculiarities of my code.

When using lazy-seq just make a regular function call instead of using recur. The laziness avoids the recursive stack consumption for which recur is otherwise used.
For example, a simplified version of repeat:
(defn repeat [x]
(lazy-seq (cons x (repeat x))))
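Why the plain recursive call is harmless here can be seen in a small model. This Python sketch is an analogy and not Clojure's implementation of lazy-seq: a lazy sequence is represented as a pair of a head and a zero-argument function producing the rest, and take is a hypothetical consumer.

```python
def repeat(x):
    # The "recursive call" sits inside a lambda, so it is not executed here.
    return (x, lambda: repeat(x))

def take(n, seq):
    out = []
    while n > 0:
        head, rest = seq
        out.append(head)
        seq = rest()   # repeat has long since returned; only now is it called again
        n -= 1
    return out

print(len(take(100_000, repeat("x"))))   # 100000
```

Each call to repeat returns before the next one starts, so the stack never holds more than one repeat frame however many elements are consumed.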

The problem is that you are building something that looks like:
(mapcat f (mapcat f (mapcat f ...)))
Which is fine in principle, but the elements on the far right of this list don't get realized for a long time, and by the time you do realize them, they have a huge stack of lazy sequences that need to be forced in order to get a single element.
If you don't mind a spoiler, you can see my solution at https://gist.github.com/3124087. I'm doing two things differently than you are, and both are important:
Traversing the tree breadth-first. You don't want to get "stuck" in a loop from q0 to q0 if that's a non-accepting state. It looks like that's not a problem for the particular test case you're failing because of the order the transitions are passed to you, but the next test case after this does have that characteristic.
Using doall to force a sequence that I'm building lazily. Because I know many concats will build a very large stack, and I also know that the sequence will never be infinite, I force the whole thing as I build it, to prevent the layering of lazy sequences that causes the stack overflow.
Edit: In general you cannot combine lazy sequences with tail recursion. You can have one function that uses both of them, perhaps recurring when there's more work to be done before adding a single element, and lazy-recurring when there is a new element, but most of the time they have opposite goals and attempting to combine them incautiously will lead only to pain, and no particular improvements.

Overflow while using recur in clojure

I have a simple prime number calculator in clojure (an inefficient algorithm, but I'm just trying to understand the behavior of recur for now). The code is:
(defn divisible [x,y] (= 0 (mod x y)))
(defn naive-primes [primes candidates]
(if (seq candidates)
(recur (conj primes (first candidates))
(remove (fn [x] (divisible x (first candidates))) candidates))
primes)
)
This works as long as I am not trying to find too many numbers. For example
(print (sort (naive-primes [] (range 2 2000))))
works. For anything requiring more recursion, I get an overflow error.
(print (sort (naive-primes [] (range 2 20000))))
will not work. In general, whether I use recur or call naive-primes again without the attempt at TCO doesn't appear to make any difference. Why am I getting errors for large recursions while using recur?

recur always uses tail recursion, regardless of whether you are recurring to a loop or a function head. The issue is the calls to remove. remove is lazy: it calls first to get the next element from the underlying seq and checks whether that element should be kept. If the underlying seq was itself created by a call to remove, that triggers another call to first, and so on. Once the same seq has been wrapped in remove 20000 times, a single call to first on the outermost seq needs about 20000 nested calls to first, and none of those calls can be tail recursive. Hence the stack overflow error.
Changing (remove ...) to (doall (remove ...)) fixes the problem, since it prevents the unbounded stacking of remove calls (each one gets fully applied immediately and returns a realized seq rather than an unevaluated lazy seq). I think this method only ever keeps one candidates list in memory at one time, though I am not positive about this. If so, it isn't too space inefficient, and a bit of testing shows that it isn't actually much slower.
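As a sketch, the fix applied to the code from the question might look like the following. The only behavioral change is the doall; the rename to divisible? and the let binding p are stylistic choices of this example, not part of the original.

```clojure
(defn divisible? [x y]
  (zero? (mod x y)))

(defn naive-primes [primes candidates]
  (if (seq candidates)
    (let [p (first candidates)]
      (recur (conj primes p)
             ;; doall realizes the filtered seq right away, so the next
             ;; round wraps a realized seq instead of another lazy layer.
             (doall (remove #(divisible? % p) candidates))))
    primes))
```

With this change, (naive-primes [] (range 2 20000)) should run to completion instead of overflowing, since each round starts from an already realized candidates seq.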

Clojure warning/error on tail call optimization failure

In Scala 2.8.x, a new annotation (@tailrec) has been added that gives a compile-time error if the compiler cannot perform a tail-call optimization on the annotated method.
Is there some similar facility in Clojure with respect to loop/recur?
EDIT:
After reading the first answer to my question (thanks, Bozhidar Batsov) and further searching in the Clojure docs, I came across this:
(recur exprs*)
Evaluates the exprs in order, then, in parallel, rebinds the bindings of the recursion point to the values of the exprs. If the recursion point was a fn method, then it rebinds the params. If the recursion point was a loop, then it rebinds the loop bindings. Execution then jumps back to the recursion point. The recur expression must match the arity of the recursion point exactly. In particular, if the recursion point was the top of a variadic fn method, there is no gathering of rest args - a single seq (or null) should be passed. recur in other than a tail position is an error.
Note that recur is the only non-stack-consuming looping construct in Clojure. There is no tail-call optimization and the use of self-calls for looping of unknown bounds is discouraged. recur is functional and its use in tail-position is verified by the compiler [emphasis is mine].
(def factorial
  (fn [n]
    (loop [cnt n acc 1]
      (if (zero? cnt)
        acc
        (recur (dec cnt) (* acc cnt))))))

Actually the situation in Scala w.r.t. Tail Call Optimisation is the same as in Clojure: it is possible to perform it in simple situations, such as self-recursion, but not in general situations, such as calling an arbitrary function in tail position.
This is due to the way the JVM works -- for TCO to work on the JVM, the JVM itself would have to support it, which it currently doesn't (though this might change when JDK7 is released).
See e.g. this blog entry for a discussion of TCO and trampolining in Scala. Clojure has exactly the same features to facilitate non-stack-consuming (= tail-call-optimised) recursion; this includes throwing a compile-time error when user code tries to call recur in non-tail position.
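For the general case that recur cannot cover, such as mutual recursion, Clojure's counterpart to the trampolining mentioned above is clojure.core/trampoline. A minimal sketch follows; my-even? and my-odd? are hypothetical names chosen to avoid shadowing the core functions.

```clojure
(declare my-odd?)

;; Instead of calling the other function directly, each function
;; returns a zero-argument fn (a thunk) that makes the call later.
(defn my-even? [n]
  (if (zero? n)
    true
    #(my-odd? (dec n))))

(defn my-odd? [n]
  (if (zero? n)
    false
    #(my-even? (dec n))))

;; trampoline keeps calling the returned thunks in a loop until it
;; gets a non-fn value, so the stack depth stays constant.
(trampoline my-even? 1000000)
```

Unlike recur, nothing checks at compile time that the thunks are returned from tail position, and the approach does not work as-is when the final result is itself a function.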

There is no tail-call optimization when you use loop/recur AFAIK. A quote from the official docs:
In the absence of mutable local variables, looping and iteration must take a different form than in languages with built-in for or while constructs that are controlled by changing state. In functional languages looping and iteration are replaced/implemented via recursive function calls. Many such languages guarantee that function calls made in tail position do not consume stack space, and thus recursive loops utilize constant space. Since Clojure uses the Java calling conventions, it cannot, and does not, make the same tail call optimization guarantees. Instead, it provides the recur special operator, which does constant-space recursive looping by rebinding and jumping to the nearest enclosing loop or function frame. While not as general as tail-call-optimization, it allows most of the same elegant constructs, and offers the advantage of checking that calls to recur can only happen in a tail position.
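
The "function frame" case from the quote can be sketched as follows: recur targets the enclosing fn arity directly, with no loop form. sum-to is a hypothetical example function, not something from the docs.

```clojure
(defn sum-to
  ;; One-argument arity: an ordinary call that seeds the accumulator.
  ([n] (sum-to n 0))
  ;; Two-argument arity: recur rebinds n and acc and jumps back to
  ;; the top of this arity, so no new stack frame is created.
  ([n acc]
   (if (zero? n)
     acc
     (recur (dec n) (+ acc n)))))

(sum-to 1000000)
```

Replacing recur with a direct call to (sum-to (dec n) (+ acc n)) would compute the same value for small n, but each call would consume a stack frame and a large n could overflow the stack.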
