What is Knuth's WEB?

I have been trying to figure out what Donald Knuth's WEB is, but the descriptions I find are conflicting. From what I can glean from the web page, it is something like doxygen, yet all of the sources I am reading insist that it is a programming language. However, it does not look like any programming language I have ever seen.
So what exactly is WEB? Is there some set of documentation that explains it?

A surprising question, since a quick search turns this up easily:
From the Wikipedia page at https://en.wikipedia.org/wiki/WEB:
WEB is a computer programming system created by Donald E. Knuth as the
first implementation of what he called "literate programming": the
idea that one could create software as works of literature, by
embedding source code inside descriptive text, rather than the reverse
(as is common practice in most programming languages), in an order
that is convenient for exposition to human readers, rather than in the
order demanded by the compiler.
WEB consists of two secondary programs: TANGLE, which produces
compilable Pascal code from the source texts, and WEAVE, which
produces nicely-formatted, printable documentation using TeX.
CWEB is a version of WEB for the C programming language, while noweb
is a separate literate programming tool, which is inspired by WEB (as
reflected in the name) and which is language agnostic.
The most significant programs written in WEB are TeX and Metafont.
Modern TeX distributions use another program Web2C to convert WEB
source to C.
More info in the highly recommended book from the author:
Literate Programming (Center for the Study of Language and Information - Lecture Notes) Paperback – June 1, 1992
ISBN-13: 978-0937073803 ISBN-10: 0937073806
Check the reviews for the book at Amazon.com, or better yet, buy the book and start reading.

Related

Natural Language Programming vs. Literate Programming

I can't see a difference between natural language programming and literate programming. If anyone can explain it, I would be grateful.
Natural language programming is a system for expressing instructions to a computer in a form approximating a language humans write or speak. NLP syntax structure usually resembles human-language sentence structure, in a form that might sound stilted to a native speaker, but which tends to read almost like the real language. Many NLP implementations are focused on querying data stores rather than writing programs, but actual programming implementations also exist.
Literate programming is a system for simultaneously writing programs and writing about programs. Unlike NLP, the code portions of a literate program are written in traditional programming languages. The classic examples, for which the name was coined, are Donald Knuth's writings on the TeX typesetting system. Published as his Computers and Typesetting series, the printed books are the result of processing his TeX literate program with a tool that extracts and formats only the descriptive portions. Similarly, the compilable source code is the result of processing the same literate program with a tool that extracts and reorganizes the code portions.
A literate program is an explanation of the program logic in a natural language, such as English, interspersed with snippets of macros and traditional source code. Macros in a literate source file are simply title-like or explanatory phrases in a human language that describe the abstractions created while solving the programming problem, and that hide chunks of code or lower-level macros. These macros are similar to the pseudocode algorithms typically used in teaching computer science. These arbitrary explanatory phrases become precise new operators, created on the fly by the programmer, forming a meta-language on top of the underlying programming language.
An example is shown at the following link:
http://en.literateprograms.org/Insertion_sort_%28C%29
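As a rough, hedged illustration of the same idea (this uses noweb-style chunk notation rather than Knuth's original Pascal-based WEB markup, and the chunk names are made up), a literate source interleaves prose with named chunks of code:

@ We sort with a simple insertion sort. The outer loop grows the sorted prefix one element at a time.

<<insertion sort>>=
for (int i = 1; i < n; ++i) {
    <<insert a[i] into the sorted prefix>>
}

@ Inserting means shifting larger elements to the right until the new element's slot opens up.

<<insert a[i] into the sorted prefix>>=
int key = a[i];
int j = i - 1;
while (j >= 0 && a[j] > key) { a[j + 1] = a[j]; --j; }
a[j + 1] = key;

Running the tangler (TANGLE in WEB, notangle in noweb) expands the chunk references into one compilable file in whatever order the compiler needs; running the weaver (WEAVE or noweave) typesets the prose and code together for reading.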
Natural language processing deals with processing natural text, which can be a sentence in English, French, or any other language. This processing can involve steps like tokenization, part-of-speech tagging, stemming, lemmatization, sentiment analysis, etc. Basically, it deals with extracting the meaning of a sentence with the help of programming.
This link gives an introduction to NLP:
http://www.youtube.com/watch?v=nfoudtpBV68

Resources for becoming more proficient with C++?

I am an undergraduate student, and I was recently hired for a co-op for this summer and next fall. From what I understand, I will be writing a lot of C++, which I am comfortable with, but I don't have a huge amount of experience actually writing it.
I wanted to take the time to really dive into C++ and become much more proficient with it. Does anyone have any particular resources that I could use to get some more experience before I start my job?
TopCoder was featured on This Developer's Life recently, and the algorithm challenges are a great way to hone your skills in various languages, including C++.
Getting involved in Open Source projects is a regular suggestion.
Good books from many authors are listed conveniently in many places including Amazon.com
To really deeply understand the language and learn some best practices, I would recommend reading Herb Sutter's books, in the following order:
C++ Coding Standards
Exceptional C++
More Exceptional C++
Exceptional C++ Style
The first one is very useful if you don't have experience with industrial development in C++.
Read The C++ Programming Language by Stroustrup cover to cover, and then read it again and again. Then, as you work with C++ programs, keep referring back to that book to remind yourself of the details of various topics.
It is, by far, the best book on C++ available, and since the author invented C++, you will get a good sense of the intentions and rationales of the various language features.
Once you have mastered the contents of that book, the next step is to familiarise yourself with the C++03 standard and the upcoming C++0x standard. Reading a language standard can be hard, but once you get used to the style, standards are invaluable resources.
It is very important to be familiar with the authoritative resources and not get led astray by possibly incorrect or outdated information out on the web. I've seen programmers make really dumb mistakes because they just searched on Google to find out how some language feature works.
You need to practice and read manuals and tutorials for C++.
You can also look at open source code available on the internet and study it.
Everybody works hard for what they learn and apply.
Good luck!

Are there any C++ style and/or standard example files available?

While there are lots of questions about coding style, beautifying, and enforcement, I haven't found any example C++ files that are used as a quick reference for style. The file should be one or two pages long and exemplify a given coding standard/style.
For example, the Google C++ Style Guide is a great reference, but I think a one to two page piece of code written in their style pinned to a wall would be more useful in day-to-day use.
Do any of these already exist?
I think Bjarne's JSF AV C++ Coding Standards (doc) is pretty awesome. Although it's a long read and it forbids things you can safely use in everyday applications, it's a good document.
This is because it explains the reasons and gives examples of when it's OK to break the rules and why; it's a veritable tome of knowledge and experience.
Instant C++ level up! :)
I'm not sure if you would consider it 'style' (I would), but Scott Meyers's Effective C++ is a great book for learning good practices in C++. In my opinion, if you exercise these practices, your 'style' will follow.
Note that it contains lots of examples to reinforce his lessons. That being said, there isn't a large code sample that you can just browse to look at style. Again, I think that with good practices in place, the style will follow.

create my own programming language [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicates:
References Needed for Implementing an Interpreter in C/C++
How to create a language these days?
Learning to write a compiler
I know some C++, am very good at PHP, a pro at CSS and HTML, and okay at JavaScript. So I was thinking about how C++ was created. I mean, how can a computer understand what code means? How can it read it? So, is it possible for me to create my own language, and how?
If you're interested in compiler design ("how can a computer understand what code means"), I highly recommend the Dragon Book. I used it while in college and went as far as to create a programming language myself.
"Every now and then I feel a temptation to design a programming
language but then I just lie down until it goes away." — L. Peter
Deutsch
EDIT (for those who crave context):
"[L. Peter Deutsch] also wrote the PDP-1 Lisp 1.5 implementation, Basic PDP-1 LISP, 'while still in short pants' between the age of 12-15 years old."
If you want to understand how the computer understands the code, you might want to learn some assembly language. It's a much lower-level language and will give you a better feel for the kinds of simple instructions that really get executed. You should also be able to get a feel for how one implements higher level constructs like loops with only conditional jumps.
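As a small self-contained illustration (the variable names are mine), here is the same loop written at a high level and then again using only a test and jumps, which is roughly the shape the machine code takes:

#include <cstdio>

int main() {
    // A high-level loop...
    int sum = 0;
    for (int i = 0; i < 5; ++i)
        sum += i;

    // ...and roughly how it looks lowered to a conditional branch plus a
    // backwards jump, which is all most instruction sets really give you.
    int sum2 = 0, i = 0;
loop_test:
    if (i >= 5) goto loop_done;   // conditional branch out of the loop
    sum2 += i;
    ++i;
    goto loop_test;               // unconditional jump back to the test
loop_done:
    std::printf("%d %d\n", sum, sum2);   // both print 10
    return 0;
}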
For an even lower-level understanding, you'll need to study up on electronics. Digital logic shows you how you can take electronic "gates" and implement a generic CPU that can understand the machine code generated from the assembly language code.
For really low-level stuff, you can study materials science, which can teach you how to actually make the gates work at the atomic level.
You sound like a resourceful person. You'll want to hunt down books and/or websites on these topics tailored to your level of understanding and that focus on what you're interested in the most. A fairly complete understanding of all of this comes with a BS degree in computer science or computer engineering, but many things are quite understandable to a motivated person in your position.
Yes, it's possible to create your own language. Take a look at compiler compilers, or at the source code of some scripting languages if you dare. Some useful tools are yacc, bison, and lex.
Others have mentioned the dragon book. We used a book that I think was called "compiler theory and practice" back in my university days.
It's not necessary to learn assembler to write a language. For example, JavaScript runs in something called an interpreter, which is an application that executes JavaScript files. In this case, the interpreter is usually built into the browser.
The easiest starting project might be to write a simple text-based calculator, i.e. take a text file, run through it, and perform the calculations. You could write that in C++ very easily.
My first language for a college project was a language defined in BNF that was given to us. We then had to write a parser which parsed it into a tree structure in memory, and then into something called three-address code (which is assembler-like). You could quite easily turn three-address code into real assembler, or write an interpreter for it.
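To make that concrete, here is a hedged sketch of what three-address code can look like; the struct layout and the tiny instruction set are my own invention for illustration, not anything from that course:

#include <cstdio>
#include <string>
#include <vector>

// One three-address instruction: at most one operation and three names per line.
struct ThreeAddr {
    std::string op, dst, a, b;   // e.g. {"add", "t1", "a", "b"} means t1 = a + b
};

int main() {
    // x = (a + b) * c, lowered by hand into three-address form:
    std::vector<ThreeAddr> code = {
        {"add",  "t1", "a",  "b"},
        {"mul",  "t2", "t1", "c"},
        {"copy", "x",  "t2", ""},
    };
    for (const auto &ins : code)
        std::printf("%-5s %-3s %-3s %-3s\n", ins.op.c_str(), ins.dst.c_str(),
                    ins.a.c_str(), ins.b.c_str());
    return 0;
}

Because each line does at most one thing, it is easy to map onto real assembler or to interpret directly.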
Yup! It's definitely possible. Others have mentioned the Dragon Book, but there is also plenty of information online. LLVM, for example, has a tutorial on implementing a programming language: http://llvm.org/docs/tutorial/
I really recommend Programming Language Pragmatics. It's a great book that takes you all the way from what a language is through how compilers work and creating your own. It's a bit more accessible than the Dragon Book and explains how things work before jumping in headfirst.
It is possible. You should learn about compilers and/or interpreters: what they are for and how they are made.
Start learning ASM and reading up on how bytecode works, and you might have a chance :)
If you know C -- it sounds like you do -- grab a used copy of this ancient book:
http://www.amazon.com/Craft-Take-Charge-Programming-Book-Disk/dp/0078818826
In it there's a chapter where the author creates a "C" interpreter, in C. It's not academically serious like the Dragon book would be, but I remember it being pretty simple, very practical and easy to follow, and since you're just getting started, it would be an awesome introduction to the ideas of a "grammar" for languages, and "tokenizing" a program.
It would be a perfect place for you to start. Also, at $0.01 for a used copy, cheaper than the Dragon Book. ;)
Start with creating a parser. Read up on EBNF grammars. This will answer your question about how the computer can read code. This is a very advanced topic, so don't expect too much of yourself, but have fun. Some resources I've used for this are bison, flex, and PLY.
Yes! Getting interested in compilers was my hook into professional CS (previously I'd been on a route to EE, and only formally switched sides in college), it's a great way to learn a TON about a wide range of computer science topics. You're a bit younger (I was in high school when I started fooling around with parsers and interpreters), but there's a whole lot more information at your fingertips these days.
Start small: Design the tiniest language you can possibly think of -- start with nothing more than a simple math calculator that allows variable assignment and substitution. When you get adventurous, try adding "if" or loops. Forget arcane tools like lex and yacc, try writing a simple recursive descent parser by hand, maybe convert to simple bytecodes and write an interpreter for it (avoid all the hard parts of understanding assembly for a particular machine, register allocation, etc.). You'll learn a tremendous amount just with this project.
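A hedged sketch of what that first step can look like (expressions with + - * / and parentheses only, no variable assignment yet, no error handling; every name here is made up for illustration):

#include <cctype>
#include <cstdio>
#include <string>

// A minimal hand-written recursive descent calculator.
struct Parser {
    std::string src;
    size_t pos = 0;

    void skip() { while (pos < src.size() && std::isspace((unsigned char)src[pos])) ++pos; }
    bool eat(char c) { skip(); if (pos < src.size() && src[pos] == c) { ++pos; return true; } return false; }

    // expr := term (('+' | '-') term)*
    double expr() {
        double v = term();
        for (;;) {
            if (eat('+')) v += term();
            else if (eat('-')) v -= term();
            else return v;
        }
    }
    // term := factor (('*' | '/') factor)*
    double term() {
        double v = factor();
        for (;;) {
            if (eat('*')) v *= factor();
            else if (eat('/')) v /= factor();
            else return v;
        }
    }
    // factor := number | '(' expr ')'
    double factor() {
        if (eat('(')) { double v = expr(); eat(')'); return v; }
        skip();
        size_t start = pos;
        while (pos < src.size() && (std::isdigit((unsigned char)src[pos]) || src[pos] == '.')) ++pos;
        return std::stod(src.substr(start, pos - start));
    }
};

int main() {
    Parser p;
    p.src = "1 + 2 * (3 + 4)";
    std::printf("%g\n", p.expr());   // prints 15
    return 0;
}

Adding variable assignment is then mostly a matter of keeping a map from names to values and teaching factor() to look names up.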
Like others, I recommend the Dragon book (1986 edition, I don't like the new one, frankly).
I'll add that for your other projects I recommend using C or C++ and ditching PHP, not because I'm a language bigot, but just because I think that working through the difficulties in C/C++ will teach you a lot more about underlying machine architecture and compiler issues.
(Note: if you were a professional, the advice would be NOT to create a new language. That's almost never the right solution. But as a project for learning and exploration, it's fantastic.)
If you want a really general (but very well written) introduction to this topic -- computing fundamentals -- I highly recommend a book titled Code by Charles Petzold. He explains a number of topics you are interested in, and from there you can decide what you want to create yourself.
Check out this book: The Elements of Computing Systems: Building a Modern Computer from First Principles. It takes you step by step through several aspects of designing a computer language, a compiler, a VM, the assembler, and the computer itself. I think this could help you answer some of your questions.

References Needed for Implementing an Interpreter in C/C++

I find myself attached to a project to integrate an interpreter into an existing application. The language to be interpreted is a derivative of Lisp, with application-specific builtins. Individual 'programs' will be run batch-style in the application.
I'm surprised that over the years I've written a couple of compilers, and several data-language translators/parsers, but I've never actually written an interpreter before. The prototype is pretty far along, implemented as a syntax tree walker, in C++. I can probably influence the architecture beyond the prototype, but not the implementation language (C++). So, constraints:
implementation will be in C++
parsing will probably be handled with a yacc/bison grammar (it is now)
suggestions of full VM/Interpreter ecologies like NekoVM and LLVM are probably not practical for this project. Self-contained is better, even if this sounds like NIH.
What I'm really looking for is reading material on the fundamentals of implementing interpreters. I did some browsing of SO, and another site known as Lambda the Ultimate, though they are more oriented toward programming language theory.
Some of the tidbits I've gathered so far:
Lisp in Small Pieces, by Christian Queinnec. The person recommending it said it "goes from the trivial interpreter to more advanced techniques and finishes presenting bytecode and 'Scheme to C' compilers."
NekoVM. As I've mentioned above, I doubt that we'd be allowed to incorporate an entire VM framework to support this project.
Structure and Interpretation of Computer Programs. Originally I suggested that this might be overkill, but having worked through a healthy chunk, I agree with #JBF. Very informative, and mind-expanding.
On Lisp by Paul Graham. I've read this, and while it is an informative introduction to Lisp principles, it is not enough to jump-start constructing an interpreter.
Parrot Implementation. This seems like a fun read. Not sure it will provide me with the fundamentals.
Scheme from Scratch. Peter Michaux is attacking various implementations of Scheme, from a quick-and-dirty Scheme interpreter written in C (for use as a bootstrap in later projects) to compiled Scheme code. Very interesting so far.
Language Implementation Patterns: Create Your Own Domain-Specific and General Programming Languages, recommended in the comment thread for Books On Creating Interpreted Languages. The book contains two chapters devoted to the practice of building interpreters, so I'm adding it to my reading queue.
New (and yet Old, i.e. 1979): Writing Interactive Compilers and Interpreters by P. J. Brown. This is long out of print, but is interesting in providing an outline of the various tasks associated with the implementation of a Basic interpreter. I've seen mixed reviews for this one but as it is cheap (I have it on order used for around $3.50) I'll give it a spin.
So how about it? Is there a good book that takes the neophyte by the hand and shows how to build an interpreter in C/C++ for a Lisp-like language? Do you have a preference for syntax-tree walkers or bytecode interpreters?
To answer #JBF:
the current prototype is an interpreter, and it makes sense to me as we're accepting a path to an arbitrary code file and executing it in our application environment. The builtins are used to affect our in-memory data representation.
it should not be hideously slow. The current tree walker seems acceptable.
The language is based on Lisp, but is not Lisp, so no standards compliance required.
As mentioned above, it's unlikely that we'll be allowed to add a full external VM/interpreter project to solve this problem.
To the other posters, I'll be checking out your citations as well. Thanks, all!
Short answer:
The fundamental reading list for a Lisp interpreter is SICP. I would not at all call it overkill; if you feel you are overqualified for the first parts of the book, jump to chapter 4 and start interpreting away (although I feel this would be a loss, since chapters 1-3 really are that good!).
Add LISP in Small Pieces (LISP from now on), chapters 1-3. Especially chapter 3 if you need to implement any non-trivial control forms.
See this post by Jens Axel Søgaard on a minimal self-hosting Scheme: http://www.scheme.dk/blog/2006/12/self-evaluating-evaluator.html .
A slightly longer answer:
It is hard to give advice without knowing what you require from your interpreter.
does it really really need to be an interpreter, or do you actually need to be able to execute lisp code?
does it need to be fast?
does it need standards compliance? Common Lisp? R5RS? R6RS? Any SRFIs you need?
If you need anything more fancy than a simple syntax tree walker I would strongly recommend embedding a fast scheme subsystem. Gambit scheme comes to mind: http://dynamo.iro.umontreal.ca/~gambit/wiki/index.php/Main_Page .
If that is not an option, chapter 5 in SICP and chapters 5 onward in LISP target compilation for faster execution.
For faster interpretation I would take a look at the most recent JavaScript interpreters/compilers. There seems to be a lot of thought going into fast JavaScript execution, and you can probably learn from them. V8 cites two important papers: http://code.google.com/apis/v8/design.html and SquirrelFish cites a couple: http://webkit.org/blog/189/announcing-squirrelfish/ .
There are also the canonical Scheme papers: http://library.readscheme.org/page1.html for the RABBIT compiler.
If I engage in a bit of premature speculation, memory management might be the tough nut to crack. Nils M Holm has published a book "Scheme 9 from empty space" http://www.t3x.org/s9fes/ which includes a simple stop-the-world mark and sweep garbage collector. Source included.
John Rose (of newer JVM fame) has written a paper on integrating Scheme to C: http://library.readscheme.org/servlets/cite.ss?pattern=AcmDL-Ros-92 .
Yes on SICP.
I've done this task several times and here's what I'd do if I were you:
Design your memory model first. You'll want a GC system of some kind. It's WAAAAY easier to do this first than to bolt it on later.
Design your data structures. In my implementations, I've had a basic cons box with a number of base types: atom, string, number, list, bool, primitive-function. (A toy sketch of such a cell appears further down.)
Design your VM and be sure to keep the API clean. My last implementation had this as a top-level API (forgive the formatting - SO is pooching my preview)
ConsBoxFactory &GetConsBoxFactory() { return mConsFactory; }   // makes new cons boxes and operates on them
AtomFactory &GetAtomFactory() { return mAtomFactory; }         // maps equivalent symbolic atoms to a single object
Environment &GetEnvironment() { return mEnvironment; }         // maintains the binding of symbols to cons boxes
t_ConsBox *Read(iostream &stm);                                // parse one expression from a stream
t_ConsBox *Eval(t_ConsBox *box);                               // evaluate an expression tree
void Print(basic_ostream<char> &stm, t_ConsBox *box);         // write a value back out
void RunProgram(char *program);                                // convenience wrappers built from Read/Eval/Print
void RunProgram(iostream &stm);
RunProgram isn't needed - it's implemented in terms of Read, Eval, and Print. REPL is a common pattern for interpreters, especially LISP.
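In other words, something along these lines, reusing the names from the API above (this fragment leans on those declarations rather than standing alone, and the idea that Read returns a null pointer at end of input is my assumption, not part of the original):

void RunProgram(iostream &stm)
{
    // Read-Eval-Print until the stream is exhausted.
    while (t_ConsBox *expr = Read(stm))   // assumed: Read returns null at end of input
    {
        t_ConsBox *result = Eval(expr);   // evaluate in the current Environment
        Print(cout, result);              // print the result of each top-level form
    }
}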
A ConsBoxFactory is available to make new cons boxes and to operate on them. An AtomFactory is used so that equivalent symbolic atoms map to exactly one object. An Environment is used to maintain the binding of symbols to cons boxes.
Most of your work should go into these three steps. Then you will find that your client code and support code starts to look very much like LISP too:
t_ConsBox *ConsBoxFactory::Cadr(t_ConsBox *list)
{
return Car(Cdr(list));
}
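To make the cons-box idea from the second step concrete, here is a self-contained toy sketch of a tagged cell. The layout and names are mine, not the answer's actual t_ConsBox, and there is deliberately no GC, which is exactly why the memory model should come first:

#include <cstdio>
#include <string>

// A toy tagged cell: one struct that can be a number, a symbol, a pair, or nil.
struct Cell {
    enum Kind { Number, Symbol, Pair, Nil } kind = Nil;
    double num = 0;                        // used when kind == Number
    std::string sym;                       // used when kind == Symbol
    Cell *car = nullptr, *cdr = nullptr;   // used when kind == Pair
};

Cell *MakeNumber(double v)   { Cell *c = new Cell; c->kind = Cell::Number; c->num = v; return c; }
Cell *Cons(Cell *a, Cell *d) { Cell *c = new Cell; c->kind = Cell::Pair; c->car = a; c->cdr = d; return c; }

int main() {
    // Build the list (1 2 3) from three pairs, then walk and print it.
    Cell *list = Cons(MakeNumber(1), Cons(MakeNumber(2), Cons(MakeNumber(3), nullptr)));
    std::printf("(");
    for (Cell *c = list; c != nullptr; c = c->cdr)
        std::printf("%g ", c->car->num);
    std::printf(")\n");    // prints "(1 2 3 )"
    return 0;              // cells are deliberately leaked: no memory model yet
}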
You can write the parser in yacc/lex, but why bother? Lisp has an incredibly simple grammar, and a scanner/recursive-descent parser pair for it is about two hours of work. The worst part is writing predicates to identify the tokens (i.e., IsString, IsNumber, IsQuotedExpr, etc.) and then writing routines to convert the tokens into cons boxes.
Make it easy to write glue into and out of C code and make it easy to debug issues when things go wrong.
The Kamin Interpreters from Samuel Kamin's book Programming Languages, An Interpreter-Based Approach, translated to C++ by Timothy Budd. I'm not sure how useful the bare source code will be, as it was meant to go with the book, but it's a fine book that covers the basics of implementing Lisp in a lower-level language, including garbage collection, etc. (That's not the focus of the book, which is programming languages in general, but it is covered.)
Lisp in Small Pieces goes into more depth, but that's both good and bad for your case. There's a lot of material on compiling and such that won't be relevant to you, and its simpler interpreters are in Scheme, not C++.
SICP is good, definitely. Not overkill, but of course writing interpreters is only a small fraction of the book.
The JScheme suggestion is a good one, too (and it incorporates some code by me), but won't help you with things like GC.
I might flesh this out with more suggestions later.
Edit: A few people have said they learned from my awklisp. This is admittedly kind of a weird suggestion, but it's very small, readable, actually usable, and unlike other tiny-yet-readable toy Lisps it implements its own garbage collector and data representation instead of relying on an underlying high-level implementation language to provide them.
Check out JScheme from Peter Norvig. I found this amazingly simple to understand and port to C++. Uh, I don't know about using Scheme as a scripting language, though; teaching it to juniors is cumbersome and feels dated (helloooo 1980s).
I would like to extend my recommendation for Programming Languages: Application and Interpretation. If you want to write an interpreter, that book takes you there by a very short path. If you read through it, writing the code as you go and doing the exercises, you end up with a bunch of similar but different interpreters (one is eager, the other lazy; one is dynamic, the other has some typing; one has dynamic scope, the other lexical scope; etc.).