Arenas where core.logic dominates [soft] - clojure

Community Wiki
I don't care about the reputation points, I just want good answers. Feel free to remark this question as community wiki.
Context
I'm been working through The Reasoned Schemer, and have found the following observations:
Logic programming is very interesting.
Logic programming is sometimes counter-intuitive
Logic programming is often "inefficient" (or at least the code I write).
It seems like in going from
Assembly -> C++, I "give up" control of writing my own machine code
C++ -> Clojure, I give up control of memory management
Clojure -> core.logic/prolog/minikanren, I lose partial control of how computations are done
Question:
Besides (1) solving logic puzzles and (2) type inference, what are the domains of problems that logic programming dominates?
Thanks!

Constraint logic programming can be really useful for solving various scheduling, resource allocation, and other nontrivial constraint satisfaction / combinatorial optimization problems. All you have is declarative: the constraints (e.g. only one aircraft can be on the runway at a time), and maybe something you want to minimize/maximize (throughput/waiting).
There are various well-known flavors of that in Prolog, including CLP(FD), which works in finite integer domain, and CLP(R), which works in real domain. At least CLP(FD) seems to be in core.logic's immediate roadmap.
I believe such Prolog-derived solutions are actively being used in air traffic control and other logistics tasks, although it's hard to get precise info what technologies exactly such mission- and life-critical companies are using under the hood.

Research in artificial intelligence, and in particular cognitive robotics and other application of logic-based knowledge representation, are areas where Prolog is used a lot for its close relation to logic theory. This relation is very useful because it basically brings theory to life. Theorems can be proven on paper, and then implemented almost trivially in prolog and executed, and the executing programs have the proven properties. This allows for programs that are "correct by construction", which is the opposite of first writing programs and then trying to prove properties about them (as is done in formal methods, using, e.g., model checking).
The semantic web is another place where logic programming plays a growing role.

Related

Immutable Data Structure - Application maintenance

I have been reading about Immutable data structures and understood that change detection has been made easy . And quite often, I hear that it makes the application maintenance simpler and provides an easy to understand programming model.
I need help to understand the way it simplifies the job.
The Clojure community has embraced immutability and it is an eye opener. The best I can do is send you to the source: Rich Hickey's essay on State and his talk The Value of Values. Rich explains how separating the concept of a variable into three distinct concepts: identity, state, and value helps you model your system and reason about it.
It boils down to this: in your programming model, you should only allow things to change if they change in the system you are trying to model. Otherwise you are adding moving parts (mutable variables and objects) to a model that doesn't need them. This makes it harder to understand the model (specially as time evolves) but has little or no benefit.
Even though reading helps, the only way to grok this is to program in a language that takes immutability as a default until you realize how most of the systems you model actually have only a handful of things that change instead of pages and pages of mutable variables.
Immutability is certainly more embraced in functional languages than in imperative ones, even if you can have a Java programming style that limits mutability (see this for immutability in Java). That said, I will just comment on [functional/immutability] and [object/mutability].
I'm Clojure fan and find functional programming really powerful, but...
May be I spent too much time with C++ & Java and not enough with Lisp & Clojure, but I reckon that the simpler maintenance argument has yet to be proven by facts. I'm not sure there are reliable surveys on the actual cost of maintenance in big production systems with data on the technology used and associated costs.
Certainly, in terms of LOC, language like Clojure are really more focused and concise than Java. Hence you can say that less code leads to less maintenance, but I think functional style gives really more compact code that needs a very focused attention to fully understand what a function is doing comparing to imperative style which is more verbose but kind of straightforward. One big advantage of functional programming associated with immutability, is the ability to isolate a function and experiment with it without the need to drag a heavy context of satellite objects or build a bunch of mocks, which is very often the case with OO languages. Putting aside the experimentation, a pure function won't modify its arguments, which ease the fear to break unintentionally some piece of code outside the scope of the function.
But, putting aside the merits of functional/immutability over oop/mutability, in terms of maintenance, my experience leads me to think that it's not the technology which is the main issue, but the design, code quality and evolution of this code over time even when the initial one was of good quality. By "good", I mean that the code is respectful of style conventions (like basic naming), managed complexity, and has a sensible test harness, in a continuous (or at least automated) build environment.
Then, the question becomes: is there a paradigm (functional/immutability, object-oriented/mutability) that enforced a better design and better code. My feeling is that functional languages are the land of computer science passionates, OTOH OOP is more mainstream. Isn't it because OOP is easier to apprehend or is ot just a matter of education? but then, in order to maintain a system in the long run, should one go for a "clever" functional environment with few people able to tackle it, or some mainstream OO technology - with its unsafeness or permissiveness - but lots of people having some knowledge in it?
Certainly the solution is to choose the right technologies (plural) with the right, motivated people...

Design patterns commonly used for RTOS (VXworks)

Can anyone help me on design patterns commonly used for RTOS?
In VXworks, which pattern is more preferable?
Can we ignore the second sentence in your question? It is meaningless, and perhaps points to a misunderstanding of design patterns. The first part is interesting however. That said, I would generalise it to cover real-time systems rather than RTOS.
Many of the most familiar patterns are mechanistic, but in real-time systems higher-level architectural patterns are also important.
Bruce Powell Douglass is probably the foremost author on the subject of patterns for real time systems. If you want a flavour of what he has to say on the subject then read this article on Embedded.com (it is part three of a series of three; be sure to read the first two as well, since they also touch on the subject, (1) (2)). You could also do worst than to visit Embedded.com and enter "design patterns" into the search box, there are a number of articles on specific patterns and general articles on the subject.
While I think you are being far to specific in requesting patterns for "RTOS(VxWorks)", patterns I have used specifically with VxWorks are the Facade and Adapter patterns. Partly to provide an OO API, and also to provide a level of RTOS agnostic abstraction. The resulting classes were then implemented for Segger emBOS (to allow us to run a smaller, lower cost, royalty free RTOS), and both Windows and Linux to allow test, debug and simulation of the code in a richer environment with more powerful tools.
A non-exhaustive list of many patterns is provided on Wikipedia, many of which will be applicable to real-time systems. The listed concurrency patterns are most obviously relevant.
As Mike DeSimone commented, way too generic. However, here are couple things to keep in mind for a RTOS (not just VxWorks).
Avoid doing too much in the ISR. If possible pass on some of the processing to a waiting task.
Keep multithreading optimal. Too much and you have context switching overhead. Too little and your problem solution may be complicated.
Another important aspect is keeping the RTOS predictable and understandable for the user. Typically you see fixed-priority schedulers that do not try to be fair or adaptive, but rather do exactly as told and if you mess up with priorities and starve some task, so be it. Time to complete kernel operations tend to be short and predictable, often documented with their worst-case execution times.

Questions to answer before proposing to use a new language?

What are the technical questions I simply must have answers for before I approach someone about introducing a new language?
I'm looking for the list of technical questions that without a really good answer, I should not even waste anyone's time by proposing that we use language X.
PS: (def X clojure)
A crash course in politics for engineers...
Despite all the mission-statement baloney meant to sound noble and emphasize community support, the real purpose of every business is return on investment or, equivalently, maximizing shareholder value. If it's a government agency, it's kind of still the same question but the legal owners will have no direct influence and instead you will have proxy owners, such as higher agencies or powerful individual officials.
Decisions, however, are almost always made by agents, and so the principal-agent problem (also called the agency dilemma) appears; the agents (the management) will make a decision in their interest, and not necessarily according to the shareholder's interest as is theoretically required. In a government agency this is almost 100% of the consideration.
Sadly, this stirs in all the Dilbert and Parkinson's Law complexities.
The best you can conclude is that decisions will be justified on the basis of risk, cost, and benefit, but will tend to be made on the basis of what credit and blame is in store for the agent and understood by the agent, which is a narrow risk consideration of questionable value to the principal but at least an identifiable one.
So, we should now apply this to the language question. Your manager is likely to avoid threats, risks, scandals, and controversies. His application of the principals's concerns will be mainly through the constraints of budgets and expectations. Here are some examples that should be mostly self-explanatory.
If you want to use Java or PHP:
Everyone is doing it this way
This is the industry-standard approach for this type of problem
This is the low-risk approach
Similar systems have been done many times in Java/PHP
(That's the "no one ever got fired for buying IBM" argument.)
If you want to use Ruby:
Ruby is in the Tiobe top-ten (not quite an industry standard, so this is the best you can do)
PHP and Java are higher-cost technologies (he probably has a budget as an attempt to mitigate the principal-agent problem)
PHP and Java are going to be out of fashion "soon" (maybe not, but phrased as a "risk of appearing to stupidly use old tech', and implying the lack of later credit and recognition)
Ruby is an advanced language with powerful abstractions for cost-effective development (a weak argument for the agent, but offers the possibility of credit. The least effective of all the arguments.)
If you want to use Clojure:
You better prototype the system on weekends and evenings and present it as a solved problem.
Emphasize parallel Java / Clojure development ("if necessary the entire system can be written in Clojure Java")
Make all the Java arguments and then say something about "the best of both worlds"
Productivity with a language is neither the only factor, nor a simple scalar in itself. Important questions include:
How easy is it to learn the language, if it's not already familiar to people on the team?
How easy is it to become expert at the language?
Does the team have access to one or more language experts who have the bandwidth to do the necessary mentoring?
Are good learning materials (books, blogs, tutorials) and support channels (fora, IRC, mailing lists) available?
Does the language (or some framework in that language) allow a competent programmer to write the software faster than what you're using now?
How maintainable is the language? How readable is the syntax to a competent programmer encountering someone else's code for the first time? (Think of APL and Perl.)
Is the language somehow better applicable to your problem domain than what you're using now (e.g., functional languages for distributed computing)?
How well does the language/platform meet business needs not related to development speed (e.g., performance, scalability)?
What are the available tools like, and what do they cost? Is there a debugger available? An IDE? Refactoring and unit test support built into the IDE? Build management and deployment tools?
So much depends on what you're currently using, what you're switching to and why that it's difficult to answer. But these are always important:
What can I do if I choose a new language that I could not do before?
What could I do faster than I can currently with the new language?
How will the rest of the team cope with the introduction of the new language?
If I left, could someone else new to the language pick up where I left off without too many problems?
What is the business case?
It comes down to ROI (Return On Investment).
It is not only about an individual's productivity but:
the whole team
impact on product lifecycle
maintainability
etc.
How easy is it to pick up? I find this is not that important.
Does it have IDE support? Pretty important, but you can work without it.
Is there a debugger available? I think this is the most important question I would ask. Once you have a working debugger, you can usually get anything done.
We hired a team this year and decided to use Clojure as our weapon of choice. The team's background was primarily Java-based but also a wide variety of other languages for hobby work.
The criteria we considered were:
Can we leverage the Java/JVM background of the team and integrate with an existing Java-based product?
Can we achieve performance on par with Java?
Can we build thread-safe concurrent maintainable programs?
Can we leverage a higher level of abstraction
Can we hire/train people to work in the language?
Can we maintain a large codebase in the language?
Are sufficient tools available to work effectively in the language?
Is there an active community of people growing the language and libs?
We seriously considered Groovy, Scala, and Clojure. I really enjoy Groovy for lightweight apps but I had serious questions about performance. Scala and Clojure both have lots to offer on all of the points above. In the end, our problem domain involves a lot of symbolic manipulation and we felt that Clojure would be a better match but I suspect Scala also would work well.
What will your new language offer that an existing language doesn't already?
We have languages that do just about everything in every way today. So before introducing a new language, make sure there isn't one already existing that does everything your new language does. And make sure you know exactly what features your new language will offer that aren't offered in the same combination or at all by other languages.
Unless of course you're just doing this for your own education - in which case forget this question and have at it!
How will this improve my productivity?
If this cannot be answered pack up and go home.
What's the point? / Why?
How will it make my job easier?
Q1: Can I hire people with these skills?
Q2: When I call our outsourcing partner account managers, and ask how much would a typical fixed-cost project cost, if done in the usual way, or done using language X, is the multiplier more than 1?
Q3: Does everyone else in my department also have a favorite language that does about the same job as my favorite language, and should their favorite languages be used as well? What are the practical consequences of this?
A good question to ask is what is the size of the community around the language/framework. For instance, ruby/rails has a significant community around it, which would make me more comfortable that I would not be "the first kid on the block" to have to deal with a particular problem.
Why limit yourself to one language? Figure out which problems are solved best by which language and offer up services. If the bandwidth between the services is too high, then migrate the problematic services together based on which language solves both best.

Is Communicating Sequential Processes ever used in large multi threaded C++ programs?

I'm currently writing a large multi threaded C++ program (> 50K LOC).
As such I've been motivated to read up alot on various techniques for handling multi-threaded code. One theory I've found to be quite cool is:
http://en.wikipedia.org/wiki/Communicating_sequential_processes
And it's invented by a slightly famous guy, who's made other non-trivial contributions to concurrent programming.
However, is CSP used in practice? Can anyone point to any large application written in a CSP style?
Thanks!
CSP, as a process calculus, is fundamentally a theoretical thing that enables us to formalize and study some aspects of a parallel program.
If you instead want a theory that enables you to build distributed programs, then you should take a look to parallel structured programming.
Parallel structural programming is the base of the current HPC (high-performance computing) research and provides to you a methodology about how to approach and design parallel programs (essentially, flowcharts of communicating computing nodes) and runtime systems to implements them.
A central idea in parallel structured programming is that of algorithmic skeleton, developed initially by Murray Cole. A skeleton is a thing like a parallel design pattern with a cost model associated and (usually) a run-time system that supports it. A skeleton models, study and supports a class of parallel algorithms that have a certain "shape".
As a notable example, mapreduce (made popular by Google) is just a kind of skeleton that address data parallelism, where a computation can be described by a map phase (apply a function f to all elements that compose the input data), and a reduce phase (take all the transformed items and "combine" them using an associative operator +).
I found the idea of parallel structured programming both theoretical sound and practical useful, so I'll suggest to give a look to it.
A word about multi-threading: since skeletons addresses massive parallelism, usually they are implemented in distributed memory instead of shared. Intel has developed a tool, TBB, which address multi-threading and (partially) follows the parallel structured programming framework. It is a C++ library, so probably you can just start using it in your projects.
Yes and no. The basic idea of CSP is used quite a bit. For example, thread-safe queues in one form or another are frequently used as the primary (often only) communication mechanism to build a pipeline out of individual processes (threads).
Hoare being Hoare, however, there's quite a bit more to his original theory than that. He invented a notation for talking about the processes, defined a specific set of signals that can be sent between the processes, and so on. The notation has since been refined in various ways, quite a bit of work put into proving various aspects, and so on.
Application of that relatively formal model of CSP (as opposed to just the general idea) is much less common. It's been used in a few systems where high reliability was considered extremely important, but few programmers appear interested in learning (yet another) formal design notation.
When I've designed systems like this, I've generally used an approach that's less rigorous, but (at least to me) rather easier to understand: a fairly simple diagram, with boxes representing the processes, and arrows representing the lines of communication. I doubt I could really offer much in the way of a proof about most of the designs (and I'll admit I haven't designed a really huge system this way), but it's worked reasonably well nonetheless.
Take a look at the website for a company called Verum. Their ASD technology is based on CSP and is used by companies like Philips Healthcare, Ericsson and NXP semiconductors to build software for all kinds of high-tech equipment and applications.
So to answer your question: Yes, CSP is used on large software projects in real-life.
Full disclosure: I do freelance work for Verum
Answering a very old question, yet it seems important that one
There is Go where CSPs are a fundamental part of the language. In the FAQ to Go, the authors write:
Concurrency and multi-threaded programming have a reputation for difficulty. We believe this is due partly to complex designs such as pthreads and partly to overemphasis on low-level details such as mutexes, condition variables, and memory barriers. Higher-level interfaces enable much simpler code, even if there are still mutexes and such under the covers.
One of the most successful models for providing high-level linguistic support for concurrency comes from Hoare's Communicating Sequential Processes, or CSP. Occam and Erlang are two well known languages that stem from CSP. Go's concurrency primitives derive from a different part of the family tree whose main contribution is the powerful notion of channels as first class objects. Experience with several earlier languages has shown that the CSP model fits well into a procedural language framework.
Projects implemented in Go are:
Docker
Google's download server
Many more
This style is ubiquitous on Unix where many tools are designed to process from standard in to standard out. I don't have any first hand knowledge of large systems that are build that way, but I've seen many small once-off systems that are
for instance this simple command line uses (at least) 3 processes.
cat list-1 list-2 list-3 | sort | uniq > final.list
This system is only moderately sized, but I wrote a protocol processor that strips away and interprets successive layers of protocol in a message that used a style very similar to this. It was an event driven system using something akin to cooperative threading, but I could've used multithreading fairly easily with a couple of added tweaks.
The program is proprietary (unfortunately) so I can't show off the source code.
In my opinion, this style is useful for some things, but usually best mixed with some other techniques. Often there is a core part of your program that represents a processing bottleneck, and applying various concurrency increasing techniques there is likely to yield the biggest gains.
Microsoft had a technology called ActiveMovie (if I remember correctly) that did sequential processing on audio and video streams. Data got passed from one filter to another to go from input to output format (and source/sink). Maybe that's a practical example??
The Wikipedia article looks to me like a lot of funny symbols used to represent somewhat pedestrian concepts. For very large or extensible programs, the formalism can be very important to check how the (sub)processes are allowed to interact.
For a 50,000 line class program, you're probably better off architecting it as you see fit.
In general, following ideas such as these is a good idea in terms of performance. Persistent threads that process data in stages will tend not to contend, and exploit data locality well. Also, it is easy to throttle the threads to avoid data piling up as a fast stage feeds a slow stage: just block the fast one if its output buffer grows too big.
A little bit off-topic but for my thesis I used a tool framework called TERRA/LUNA which aims for software development for Embedded Control Systems but is used heavily for all sorts of software development at my institute (so only academical use here).
TERRA is a graphical CSP and software architecture editor and LUNA is both the name for a C++ library for CSP based constructs and the plugin you'll find in TERRA to generate C++ code from your CSP models.
It becomes very handy in combination with FDR3 (a CSP refinement checker) to detect any sort of (dead/life/etc) lock or even profiling.

Which concurrent programming concepts do hiring managers expect developers to understand?

When I hire developers for general mid-to-senior web app development positions, I generally expect them to understand core concurrent programming concepts such as liveness vs. safety, race conditions, thread synchronization and deadlocks. I'm not sure whether to consider topics like fork/join, wait/notify, lock ordering, memory model basics (just the basics) and so forth to be part of what every reasonably seasoned developer ought to know, or whether these are topics that are more for semi-specialists (i.e. developers who have made a conscious decision to know more than the average developer about concurrent programming).
I'd be curious to hear your thoughts.
I tend to think that at this point in time concurrent programming at any serious level of depth is still a specialist skill. Many will claim to know about it through study, but many will also make an almighty mess of it when they come to apply it.
In addition to the considerations listed, I would also look at resource implications and the various overheads of using processes, threads and fibers. In some contexts, e.g. mobile devices, excessive multithreading can have serious performance implications. This can lead to portability issues with multithreaded code.
I guess if I was interviewing a candidate in this situation, I would work with a real world example rather than hitting on more general topics which can be quoted back verbatim from a text book. I say this having done a fair bit of multithreaded work myself and remembering how badly I screwed up the first couple of times. Many can talk the talk... ;)
I know all these topics, but I studied them. I also know many competent senior programmers that don't know these. So unless you expect these programmers to be using those concepts actively, there is no reason to turn down a perfectly good candidate because they don't understand every aspect of concurrency
The real question is:
In what ways does it matter to the code they will be developing?
You should know which concepts the development position you're hiring for needs to know to be able to work on the projects that they will be responsible for.
As with anything in the programming world.. The devil is in the details, and you can't know everything. Would you expect them to know Perl if you were hiring for a Java position?
Also, concurrency, at this stage, while well described in generalized theory, is heavily implementation and platform dependent. Concurrency in Perl on an AIX box is not the same game as concurrency in a C++ Winforms app. They can have all the theory in the world under their belts, but if it's required for the job, then they should have intimate knowledge of the platform they are expected to use it on as well.
I interview folks for concurrency-related positions frequently and I look for three general aspects:
General understanding of core concepts like the ones you list (language-independent)
Specific understanding of Java concurrency libraries and primitives (specific to the work they'd be doing)
Ability to design the solution to a concurrent problem in a reasonable way.
I consider #1 a requirement (for my positions). I consider #2 a nice to have. If they understand it and can describe it in terms of pthreads or whatever other library, it's no biggie to learn the latest Java concurrency libraries (the concepts are the hard part). And #3 tends to separate the hires from the maybe-hires.
Per your question, I wouldn't consider fork/join to be known by almost anyone, esp someone applying for a web app developer position. I would look for developers to have experience with some (but not all) of those topics. Most developers I've interviewed have not used the Java 5+ concurrency libs at all but they can typically describe things like data race or deadlock.