OCaml used in demonstrations? - ocaml

I'm looking for examples of usages in OCaml to demonstrate simple properties or theorems.
An example may be, given an ocaml definition of binary trees, demonstrate that the maximum number of nodes is 2^(h+1)-1.
I have founds such kind of examples for binary trees and graphs but nothing else.. any suggestion or links?

If you are speaking of proofs written on paper, the usages are essentially the same that with other languages: an informal reasoning based on a reasonable (but not formalized) model of the semantics of the program. To handle your case I would write two functions size and height, and prove by inductive reasoning on the tree that size h <= pow 2 (height h + 1) - 1, using an induction hypothesis on the two subtrees -- I can make this explanation more detailed but prefer to let you do it yourself if you wish.
If you want more formal proofs, there are several approaches.
Proof techniques based on hoare logics have been adapted to functional programming languages. See for example the 2008 work of Régis-Gianas and Pottier, A Hoare logic for call-by-value functional programs. They provide a formal basis for what can be used, still in hand-written proofs, to give a more rigorous (because down-to-the-metal) proof of your claim. It could also be used in a theorem prover but I'm not sure this approach has been fully worked out yet.
Another natural approach would be to write your program directly in the Coq proof assistant, whose programming language is mostly a purely functional subset of OCaml, and use its facilities for proving. This is not exactly like writing in OCaml, but quite close; then you can either mirror the implementation in OCaml directly, or use the extraction facility of Coq to get honest-looking OCaml code that has been "compiled" from the Coq program. This approach has been used to formalize the implementation of the balanced binary trees present in the OCaml standard library, and the two implementations (the OCaml one and the Coq one) are sufficiently synchronized that you can transfer results to prove some OCaml-side changes correct.
In the same vein, there are some attempts to design languages for certified programming that may be more convenient, on some domains, than a general theorem prover such as Coq. Why3 is such a "software verification platform": it defines a programming languages (not very far from OCaml) and a specification language on top of it. You can formulate assertions about your program and verify them using a variety of techniques such as general proof assistants (eg. Coq) or more automated theorem provers (SMT solvers). Why3 strives to support verification of classic imperative-style algorithm implementations, but also support a functional programming style, so it may be an interesting choice for experiments with certified programming if you don't want to go to full-Coq (for example if you're not interested in ensuring termination of your algorithms, which can be inconvenient in Coq).
Finally, there has been work on the following technique: read your OCaml program and automatically produce a "Coq description" out of it, that you can properties about with the guarantee that what you prove correct also holds in the OCaml implementation. This has been the main result of Arthur Charguéraud's 2010 PhD thesis, where the "Coq description" is based on the technique of "Characteristic fomrulae". He has been able to prove correct ML implementations of relatively sophisticated algorithms such as Union-Find or examples out of the Chris Okasaki's excellent "Purely Functional Data Structures" book.
(I frequently mention the Coq proof assistant; other tools such as Isabelle and Agda are equally suitable, but Coq is closer in syntax to the OCaml language, so is probably a good choice if you want to re-implement ML programs to prove them formally correct.)

Related

Unit testing a new language

Does anyone know if there's standardized process for unit testing a new language.
I mean any new language will have basic flow control like IF, CASE etc.
How does one normally test the language itself?
Unit testing is one strategy to achieve a goal: verify that a piece of software meets a stated specification. Let's assume you are more interested in the goal, instead of exclusively using unit testing to achieve it.
The question of verifying that a language meets a specification or exhibits specific desirable qualities is profound. The earliest work led to type theory, in which one usually extends the language with new syntax and rules to allow one to talk about well-typed programs: programs that obey these new rules.
Hand-in-hand with these extensions are mathematical proofs demonstrating that any well-typed program will exhibit various desired qualities. For example, perhaps a well-typed program will never attempt to perform integer arithmetic on a string, or try to access an out-of-bounds element of an array.
By requiring programs to be well-typed before allowing them to execute, one can effectively extend these guarantees from well-typed programs to the language itself.
Type systems can be classified by the kinds of rules they include, which in turn determine their expressive power. For example, most typed languages in common use can verify my first case above, but not the second. With the added power comes greater complexity: their type verification algorithms are correspondingly harder to write, reason about, etc.
If you want to learn more, I suggest you read this book, which will take you from the foundations of functional programming up through the common type system families.
You could lookup what other languages do for testing. When I was developing a language I was thinking about doing something like Python. They have tests written in python itself.
You could lookup their tests. These are some of then: grammar, types, exceptions and so on.
Offcourse, there is a lot of useful stuff there if you are looking for examples, so I recomend that you dig in :).

Can I do Aspect Oriented Programming in OCaml?

Whether this question is a wide range or not I would like to ask :
Is it possible to implement aspect-oriented programming (AOP) features into OCaml language?
It is interesting to observe that, in contrast to the traditional
concept of crosscutting in the OO setting where aspects typically
crosscut several classes, the majority of the applications of aspects
in functional programming only involve a single function in the
pointcut. We believe the realisation of this difference as concluded
by this paper is important to both the functional and AOP commu- nity.
There is a pressing need to properly interpret and develop of the
concept of ‘crosscutting’ in the functional setting before functional AOP spreads its wings.
[emphasis mine]
What Does Aspect-Oriented Programming Mean for Functional Programmers? (PDF)
Regardless, there are direct attempts/translations of AOP to OCaml or ML systems. From my comment, I don't find these convincing, and believe that proper use of modules and functors can do a lot to capture the demarcation of concerns. Those direct attempts are,
Aspectual Caml
PolyAml(PDF)
Aspect ML

What are the differences between Clojure, Scheme/Racket and Common Lisp?

I know they are dialects of the same family of language called lisp, but what exactly are the differences? Could you give an overview, if possible, covering topics such as syntax, characteristics, features and resources.
They all have a lot in common:
Dynamic languages
Strongly typed
Compiled
Lisp-style syntax, i.e. code is written as a Lisp data structures (forms) with the most common pattern being function calls like: (function-name arg1 arg2)
Powerful macro systems that allow you to treat code as data and generate arbitrary code at runtime (often used to either "extend the language" with new syntax or create DSLs)
Often used in functional programming style, although have the ability to accommodate other paradigms
Emphasis in interactive development with a REPL (i.e. you interactively develop in a running instance of the code)
Common Lisp distinctive features:
A powerful OOP subsystem (Common Lisp Object System)
Probably the best compiler (Common Lisp is the fastest Lisp according to http://benchmarksgame.alioth.debian.org/u64q/which-programs-are-fastest.html although there isn't much in it.....)
Clojure distinctive features:
Largest library ecosystem, since you can directly use any Java libraries
Vectors [] and maps {} used as standard in addition to the standard lists () - in addition to the general usefullness of vectors and maps some believe this is a innovation which makes generally more readable
Greater emphasis on immutability and lazy functional programming, somewhat inspired by Haskell
Strong concurrency capabilities supported by software transactional memory at the language level (worth watching: http://www.infoq.com/presentations/Value-Identity-State-Rich-Hickey)
Scheme distinctive features:
Arguably the simplest and easiest to learn Lisp
Hygienic macros (see http://en.wikipedia.org/wiki/Hygienic_macro) - elegantly avoids the problems with accidental symbol capture in macro expansions
The people above missed a few things
Common Lisp has vectors and hash tables as well. The difference is that Common Lisp uses #() for vectors and no syntax for hash tables. Scheme has vectors, I believe
Common Lisp has reader macros, which allow you to use new brackets (as does Racket, a descendant of Scheme).
Scheme and Clojure have hygienic macros, as opposed to Common Lisp's unhygienic ones
All of the languages are either modern or have extensive renovation projects. Common Lisp has gotten extensive libraries in the past five years (thanks mostly to Quicklisp), Scheme has some modern implementations (Racket, Chicken, Chez Scheme, etc.), and Clojure was created relatively recently
Common Lisp has a built-in OO system, though it's quite different from other OO systems you might have used. Notably, it is not enforced--you don't have to write OO code.
The languages have somewhat different design philosophies. Scheme was designed as a minimal dialect for understanding the Actor Model; it later became used for pedagogy. Common Lisp was designed to unify the myriad Lisp dialects that had sprung up. Clojure was designed for concurrency. As a result, Scheme has a reputation of being minimal and elegant, Common Lisp of being powerful and paradigm-agnostic (functional, OO, whatever), and Clojure of favoring functional programming.
Don't forget about Lisp-1 and Lisp-2 differences.
Scheme and Clojure are Lisp-1:
That means both variables and functions names resides in same namespace.
Common Lisp is Lisp-2:
Function and variables has different namespaces (in fact, CL has many namespaces).
Gimp is written in Scheme :)
In fact allot of software some folks think might be written in C++ was probably done under the Lisp umbrella, its hard to pick out the golden apples out of the bunch. The fact is C++ was not always popular, it only seems to be popular today because of a history of updates. For the lesser half of the century C++ didn't even utilize multithreading, it was where Python is today a cesspool of useless untested buggy glue code. Fasterforward a little and now we are seeing a rise in functional programming, its more like adapt or die. I think Java has it right as far as the adapt part is concerned.
Scheme was designed to simplify the Lisp language, that was its only intent except it never really caught on. I think Clojure does something similar its meant to simplify Scheme for the JVM nothing more. Its just like every other JVM language just there to inflate the user experience, only to simplify writting boilerplate in Java land.

Lisp dialect and comparison to Java/C#

Now I'm generally in Java/C# (love both of them, can't really say I'm dedicated to one).
And I've recently been discussing the differences between F# and C# with a friend, when he surprised me saying: "So.. F# sounds a lot like lisp, but with way less 'Swiss-army knife' feel to it."
Now, I was partly ashamed of saying this but I have no idea what lisp was.
After some searching, I saw that lisp is very interesting, but got stumped by the multiple dialects and running environments.
Here is what I know:
I know of 3 dialects:
Common Lisp (I have the Practical Common Lisp book in my bookmarks.
Scheme (a more "theoretical" version of CL)
Clojure. Seems to be a version of CL that runs on JVM.
The basic idea of lisp seems to be about using code as data.
What I want to know:
What is the running environment for different dialects? How do they work/get installed (by this I mean is it a runtime like Java Virtual Machine, or if it requires something else, or if it's supported generally by the OS (as in compiled)). And how to get them (if something is to be gotten)
What is the better dialect to learn (I want the dialect not to be a "learning language" but one you can fully use afterwards without regret of not learning some other one, for example one should first learn C++ before trying out Visual C++, if you know what I mean)
What are the main advantages of lisp in general (I've seen many pages about that saying it's faster in development and execution, but they were all pretty vague about the details)
Can it be generally used for general purpose, or is it concentrated on AI? (By this I mean if, for example, one could make a full console app with it, and then implement OpenGL just as easily and make a game. Learning a language specialized on something precise is worthwhile, but not at the moment for me)
I would also be very happy about any additional details you guys can give me! (Links are appreciated too! E-Books and whatnot.)
Edit: all of the answers here were very useful. As such, I gave them all a +1 to rep, but chose the more concrete one as best. Thank you all.
I also learnt Java and C# intensively before coming to Lisp so hopefully can share some useful perspectives.
Firstly, all Lisps are great and you should definitely consider learning one. There's a famous quote by Eric Raymond:
"Lisp is worth learning for the profound enlightenment experience you
will have when you finally get it; that experience will make you a
better programmer for the rest of your days, even if you never
actually use Lisp itself a lot."
Reasons that Lisps are particularly interesting and powerful are:
Homoiconicity - in Lisp "code is data" - the language itself is written in Lisp data structures. In itself this is interesting, but where it gets really powerful is when you start using this for code generation and advanced macros. Some believe that this features is a key reason why Lisp can help you be more productive than anyone else (short Paul Graham essay)
Interactice development at the REPL - a few other languages also have this, but it is particularly idiomatic and deep-rooted in Lisp culture. It's remarkably productive and liberating to develop while altering a live running program. Recent examples that caught my eye include music hacking with overtone and editing a live game simulation.
Dynamic typing - opinion is more divide on whether this is an advantage or not (I'm personally neutral) but many people thing that dynamically typed langauges give you a productivity advantage, at least in terms of building things quickly. YMMV.
My personal recommendation for a Lisp to learn nowadays would be Clojure. Clojure has a few distinct advantages that make it stand out:
Modern language design - Clojure "refines" Lisp in a number of ways. For example, Clojure adds some new syntax for vectors [] and hashmaps {} in addition to lists (). Purists may disapprove, but I personally believe these find of innovations make the language much nicer to use and read.
Functional first and foremost - all the Lisps are good as functional languages, however Clojure takes it much further. All the standard library is written in terms of pure functions. All data structures are immutable. Mutable state is strictly limited. Lazy sequences (including infinite sequences) are supported. In some senses it feels a bit more like Haskell than the other Lisps.
Concurrency - Clojure has a unique approach to managing concurrency, supported by a very good STM implementation. Worth watching this excellent video for a much deeper explanation.
Runs on the JVM - whatever you think of Java, the JVM is a great platform with extremely good GC, JIT compilation, cross platform portability etc. This can be a barrier to entry for some, but anyone used to Java or C# should quickly feel at home.
Library ecosystem - since Clojure runs on the JVM, it can use Java libraries extremely easily. Calling a Java API from Clojure is trivial - it's just like any other function call with a syntax of (.methodName someObject arg1 arg2). With the availability of the huge Java library ecosystem (mostly open source) Clojure basically leapfrogs all the "niche" languages in terms of practical usefulness
In terms of applications, Clojure is designed to be a fully general purpose langauge so can be used in any field - certainly not limited to AI. I know of people using it in startups, using it for big data processing, even writing games.
Finally on the performance point: you are basically always going to pay a slight performance penalty for using higher level language constructs. However Clojure in my experience is "close enough" to Java or C# that you won't notice the difference for general purpose development. It helps that Clojure is always compiled and that you can use optional type hints to get the performance benefits of static typing.
The flawed benchmarks (as of early 2012) put Clojure within a factor of 2-3 of the speed of statically typed languages like Java, Scala and C#, a little bit behind Common Lisp and a little bit ahead of Scheme (Racket).
Lisp, as you've discovered, is not one language; it's a family of languages that have certain features in common.
There are two primary dialects of Lisp: Common Lisp and Scheme. Each of those two dialects has many implementations, each with their own features. However, both Common Lisp and Scheme are standardized, and the standards define a certain baseline of features which you can expect any implementation to have.
Scheme is a minimalistic language with a very small standard library. It is used primarily by students and theoreticians. Common Lisp has many more language features and a much larger standard library, including a powerful object system, and has been used in large production systems.
Clojure is another minor, more recent dialect. If you want to understand Lisp, you're better off first learning either Common Lisp or Scheme.
My recommendation is to learn Scheme first; it's a purer expression of the ideas that Lisp is made of, and will help you understand the essence of the language. In many ways, Lisp is completely different from Java and other imperative languages; however, what you learn from it will make you a better programmer in those languages. You can easily learn Common Lisp after you know Scheme.
The advantage of Lisp is, simply put, that it's more powerful than other languages. All Lisp code is Lisp data and can be manipulated as such; this allows you to do really cool things with metaprogramming that simply can't be done in other languages, because they don't give you direct access to the data structures that comprise your code. (The reason Lisp can do this and they can't is intimately related to its strange-looking syntax. Every compiler or interpreter, after reading the source code, must translate it into abstract syntax trees. Unlike other languages, Lisp's syntax is a direct representation of the ASTs that Lisp code is translated into, so you know what those trees look like and can manipulate them directly.) The most commonly used metaprogramming feature is macros; Lisp macros can literally translate a bit of source code into anything you can program. You can't do that with, say, C macros.
The "faster in development and execution" thing may have been a reference to one specific feature which most Lisp implementations provide: the read-eval-print loop. You can type an expression into a prompt and the interpreter will evaluate it and print the result. This is wonderful both for learning the language and for debugging or otherwise investigating code.
Lisp is dynamically typed (though statically typed flavors do exist). Most implementations of Lisp run on their own virtual machine; however, many can also be compiled to machine code. Clojure was written specifically to target the JVM; it can also target .NET and JavaScript.
Though originally created for AI research, Lisp is by no means exclusively for AI. The main reason why it's not more popular in mainstream production environments (apart from the self-perpetuating dominance of Java and C#) is library support. Common Lisp has many good libraries out there (Scheme less so), but it pales in comparison to the vast amount of library support available for Java or Python.
If you want to get started, I recommend downloading Racket, a highly popular implementation of Scheme. It has everything you need, including a simple-but-very-powerful IDE with a read-eval-print loop, right out of the box. Though originally developed as a teaching language, it comes with a very large standard library more characteristic of Common Lisp than of Scheme. As a result, it's seeing use in real production environments.
Runtime Environments
Common Lisp and Scheme generally have their own unique runtime environments. There are some variants of Scheme (Chicken and Gambit) which can be translated to C and then linked with their environments so as to be able to be deployed as stand alone executable programs. Clojure runs in the JVM, and there is also a CLR port, but its not clear to me that the CLR port is current with the JVM. Clojure also has Clojurescript, which targets a Javascript runtime.
Which is Better to Learn First
I don't think that question has a good answer. Its up to you. Although if you have experience with the JVM, Clojure might be a bit smoother to start with.
What is Better about Lisp
That's a question liable to start a flame war. I don't have much lisp experience. I started learning Clojure a few months ago in earnest, have looked at Common Lisp and Scheme on and off over the years.
What I like is their dynamic natures. You need to change a function at runtime while your program is running? No problem! Like any power tool, you have to be careful not to chop your bits off when using this.
The power and expressiveness is addicting too. I am able to do some things with little effort that I know I could not achieve in Java, or I know would require a lot more work. Specifically, I was able to put together a description of a data structure - and though the use of macros, delay evaluation of parts of the data until the right time. If I had done that in Java, I would not have been able to nest the declarations like I did because they would have evaluated in the wrong order. Pain would have ensued.
I also like Clojure's view of functional programming, although I have to say it requires work to adjust.
Is Lisp General Purpose
Yes.
--
Mark Volkman has a really good article on Clojure. Many basics are there. One thing that I did in the beginning was to just fire up a repl and experiment when I needed to figure something out programmatically. e.g. explore an API or do some calculations. After a short period of time with that I started working on more building up levels of effort, and I have a project that I'm working on right now that involves Clojure.
There isn't a bad book about Clojure that has been written. The Stuart Sierra book is being updated; and the Oreilly book is about to come out soon, so you might want to wait. The Joy of Clojure is good, but I don't think its a good starter book.
For Common Lisp, I highly recommend the Land of Lisp.
For Scheme, there are several classics including The Little Schemer and SICP.
Oh, and this: http://www.infoq.com/presentations/Are-We-There-Yet-Rich-Hickey (maybe one of the most important talks you'll ever watch), and this http://www.infoq.com/presentations/hickey-clojure (IIRC, really good intro to Clojure).
common lisp
Common Lisp is both compiled and interpreted. Deployments (in Windows) can be done by an exe with DLLs. Or by a precompiled bytecode. Or by installing a Lisp system on the target device and executing the source against it.
Common Lisp is a fully usable industrial language with an active community and libraries for many different tasks.
Lisps are generally faster for development and due to the abstraction capabilities, better at developing higher level concepts. It's hard to explain. Ruby vs. C is an example of this sort of thing. All Lisps carry this capacity IMO.
Common Lisp is a general purpose language. I don't know offhand if modern Common Lisp implementations directly support executing assembly, so it may be difficult to write drivers or use compiler-unsupported CPU instructions.
I like Common Lisp, but Clojure and Racket are not to be sneezed at either. Clojure in particular represents a very interesting track, in my opinion.
For e-books, you can get On Lisp by Graham and Gentle Introduction to Symbolic Computation. Possibly others but those are the ones I can recall.

References Needed for Implementing an Interpreter in C/C++

I find myself attached to a project to integerate an interpreter into an existing application. The language to be interpreted is a derivative of Lisp, with application-specific builtins. Individual 'programs' will be run batch-style in the application.
I'm surprised that over the years I've written a couple of compilers, and several data-language translators/parsers, but I've never actually written an interpreter before. The prototype is pretty far along, implemented as a syntax tree walker, in C++. I can probably influence the architecture beyond the prototype, but not the implementation language (C++). So, constraints:
implementation will be in C++
parsing will probably be handled with a yacc/bison grammar (it is now)
suggestions of full VM/Interpreter ecologies like NekoVM and LLVM are probably not practical for this project. Self-contained is better, even if this sounds like NIH.
What I'm really looking for is reading material on the fundamentals of implementing interpreters. I did some browsing of SO, and another site known as Lambda the Ultimate, though they are more oriented toward programming language theory.
Some of the tidbits I've gathered so far:
Lisp in Small Pieces, by Christian Queinnec. The person recommending it said it "goes from the trivial interpreter to more advanced techniques and finishes presenting bytecode and 'Scheme to C' compilers."
NekoVM. As I've mentioned above, I doubt that we'd be allowed to incorporate an entire VM framework to support this project.
Structure and Interpretation of Computer Programs. Originally I suggested that this might be overkill, but having worked through a healthy chunk, I agree with #JBF. Very informative, and mind-expanding.
On Lisp by Paul Graham. I've read this, and while it is an informative introduction to Lisp principles, is not enough to jump-start constructing an interpreter.
Parrot Implementation. This seems like a fun read. Not sure it will provide me with the fundamentals.
Scheme from Scratch. Peter Michaux is attacking various implementations of Scheme, from a quick-and-dirty Scheme interpreter written in C (for use as a bootstrap in later projects) to compiled Scheme code. Very interesting so far.
Language Implementation Patterns: Create Your Own Domain-Specific and General Programming Languages, recommended in the comment thread for Books On Creating Interpreted Languages. The book contains two chapters devoted to the practice of building interpreters, so I'm adding it to my reading queue.
New (and yet Old, i.e. 1979): Writing Interactive Compilers and Interpreters by P. J. Brown. This is long out of print, but is interesting in providing an outline of the various tasks associated with the implementation of a Basic interpreter. I've seen mixed reviews for this one but as it is cheap (I have it on order used for around $3.50) I'll give it a spin.
So how about it? Is there a good book that takes the neophyte by the hand and shows how to build an interpreter in C/C++ for a Lisp-like language? Do you have a preference for syntax-tree walkers or bytecode interpreters?
To answer #JBF:
the current prototype is an interpreter, and it makes sense to me as we're accepting a path to an arbitrary code file and executing it in our application environment. The builtins are used to affect our in-memory data representation.
it should not be hideously slow. The current tree walker seems acceptable.
The language is based on Lisp, but is not Lisp, so no standards compliance required.
As mentioned above, it's unlikely that we'll be allowed to add a full external VM/interpreter project to solve this problem.
To the other posters, I'll be checking out your citations as well. Thanks, all!
Short answer:
The fundamental reading list for a lisp interpreter is SICP. I would not at all call it overkill, if you feel you are overqualified for the first parts of the book jump to chapter 4 and start interpreting away (although I feel this would be a loss since chapters 1-3 really are that good!).
Add LISP in Small Pieces (LISP from now on), chapters 1-3. Especially chapter 3 if you need to implement any non-trivial control forms.
See this post by Jens Axel Søgaard on a minimal self-hosting Scheme: http://www.scheme.dk/blog/2006/12/self-evaluating-evaluator.html .
A slightly longer answer:
It is hard to give advice without knowing what you require from your interpreter.
does it really really need to be an interpreter, or do you actually need to be able to execute lisp code?
does it need to be fast?
does it need standards compliance? Common Lisp? R5RS? R6RS? Any SFRIs you need?
If you need anything more fancy than a simple syntax tree walker I would strongly recommend embedding a fast scheme subsystem. Gambit scheme comes to mind: http://dynamo.iro.umontreal.ca/~gambit/wiki/index.php/Main_Page .
If that is not an option chapter 5 in SICP and chapters 5-- in LISP target compilation for faster execution.
For faster interpretation I would take a look at the most recent JavaScript interpreters/compilers. There seem to be a lot of thought going into fast JavaScript execution, and you can probably learn from them. V8 cites two important papers: http://code.google.com/apis/v8/design.html and squirrelfish cites a couple: http://webkit.org/blog/189/announcing-squirrelfish/ .
There is also the canonical scheme papers: http://library.readscheme.org/page1.html for the RABBIT compiler.
If I engage in a bit of premature speculation, memory management might be the tough nut to crack. Nils M Holm has published a book "Scheme 9 from empty space" http://www.t3x.org/s9fes/ which includes a simple stop-the-world mark and sweep garbage collector. Source included.
John Rose (of newer JVM fame) has written a paper on integrating Scheme to C: http://library.readscheme.org/servlets/cite.ss?pattern=AcmDL-Ros-92 .
Yes on SICP.
I've done this task several times and here's what I'd do if I were you:
Design your memory model first. You'll want a GC system of some kind. It's WAAAAY easier to do this first than to bolt it on later.
Design your data structures. In my implementations, I've had a basic cons box with a number of base types: atom, string, number, list, bool, primitive-function.
Design your VM and be sure to keep the API clean. My last implementation had this as a top-level API (forgive the formatting - SO is pooching my preview)
ConsBoxFactory &GetConsBoxFactory() { return mConsFactory; }
AtomFactory &GetAtomFactory() { return mAtomFactory; }
Environment &GetEnvironment() { return mEnvironment; }
t_ConsBox *Read(iostream &stm);
t_ConsBox *Eval(t_ConsBox *box);
void Print(basic_ostream<char> &stm, t_ConsBox *box);
void RunProgram(char *program);
void RunProgram(iostream &stm);
RunProgram isn't needed - it's implemented in terms of Read, Eval, and Print. REPL is a common pattern for interpreters, especially LISP.
A ConsBoxFactory is available to make new cons boxes and to operate on them. An AtomFactory is used so that equivalent symbolic atoms map to exactly one object. An Environment is used to maintain the binding of symbols to cons boxes.
Most of your work should go into these three steps. Then you will find that your client code and support code starts to look very much like LISP too:
t_ConsBox *ConsBoxFactory::Cadr(t_ConsBox *list)
{
return Car(Cdr(list));
}
You can write the parser in yacc/lex, but why bother? Lisp is an incredibly simple grammar and scanner/recursive-descent parser pair for it is about two hours of work. The worst part is writing predicates to identify the tokens (ie, IsString, IsNumber, IsQuotedExpr, etc) and then writing routines to convert the tokens into cons boxes.
Make it easy to write glue into and out of C code and make it easy to debug issues when things go wrong.
The Kamin Interpreters from Samuel Kamin's book Programming Languages, An Interpreter-Based Approach, translated to C++ by Timothy Budd. I'm not sure how useful the bare source code will be, as it was meant to go with the book, but it's a fine book that covers the basics of implementing Lisp in a lower-level language, including garbage collection, etc. (That's not the focus of the book, which is programming languages in general, but it is covered.)
Lisp in Small Pieces goes into more depth, but that's both good and bad for your case. There's a lot of material on compiling and such that won't be relevant to you, and its simpler interpreters are in Scheme, not C++.
SICP is good, definitely. Not overkill, but of course writing interpreters is only a small fraction of the book.
The JScheme suggestion is a good one, too (and it incorporates some code by me), but won't help you with things like GC.
I might flesh this out with more suggestions later.
Edit: A few people have said they learned from my awklisp. This is admittedly kind of a weird suggestion, but it's very small, readable, actually usable, and unlike other tiny-yet-readable toy Lisps it implements its own garbage collector and data representation instead of relying on an underlying high-level implementation language to provide them.
Check out JScheme from Peter Norvig. I found this amazingly simple to understand and port to C++. Uh, dunno about using scheme as a scripting language though - teaching it to jnrs is cumbersome and feels dated (helloooo 1980's).
I would like to extend my recommendation for Programming Languages: Application and Interpretation. If you want to write an interpreter, that book takes you there in a very short path. If you read through writing the code you read and doing the exercise you end up with a bunch of similar interpreters but different (one is eager, the other is lazy, one is dynamic, the other has some typing, one has dynamic scope, the other has lexical scope, etc).