A question was posted about chained comparison operators and how they are interpreted in different languages.
Chaining comparison operators means that (x < y < z) would be interpreted as ((x < y) && (y < z)) instead of as ((x < y) < z).
The comments on that question show that Python, Perl 6, and Mathematica support chaining comparison operators, but what other languages support this feature and why is it not more common?
A quick look at the Python documentation shows that this feature has been since at least 1996. Is there a reason more languages have not added this syntax?
A statically typed language would have problems with type conversion, but are there other reasons this is not more common?
It should be more common, but I suspect it is not because it makes parsing languages more complex.
Benefits:
Upholds the principle of least surprise
Reads like math is taught
Reduces cognitive load (see previous 2 points)
Drawbacks:
Grammar is more complex for the language
Special case syntactic sugar
As to why not, my guesses are:
Language author(s) didn't think of it
Is on the 'nice to have' list
Was decided that it wasn't useful enough to justify implementing
The benefit is too small to justify complicating the language.
You don't need it that often, and it is easy to get the same effect cleanly with a few characters more.
Scheme (and probably most other Lisp family languages) supports multiple comparison efficiently within its grammar:
(< x y z)
This can be considered an ordinary function application of the < function with three arguments. See 6.2.5 Numerical Operations in the specification.
Clojure supports chained comparison too.
Chained comparison is a feature of BCPL, since the late 1960s.
I think ICON is the original language to have this, and in ICON it falls out of the way that booleans are handled as special 'fail' tags with all other values being treated as true.
Related
I have just started reading into propositional calculus/logic, and have realised that a lot of the foundational concepts in this area are similar to those encountered in programming. That is, propositions/sentences in logic (due to the fact that they have truth-values), are similar to Booleans in programming. Furthermore, most of the sentential connectives are reflected in the logical operators adopted in programming. For instance, in C++, !, &&, and || behave in the same way that ¬, ∧, and ∨ behave, respectively, in logic.
My question is whether there are (say, C++) "equivalents" for the logical operators of implication (->) and equivalence (<->)? If so, what are they? If not, what is the reason they have they not been adopted? So, although I cannot think of any reason why they would be adopted, I would like to know if there is a specific reason that they are omitted.
Implication can be replaced with <=, and equivalence is ==.
However, unlike && and ||, those don't convert their arguments to bool (since they're not inherently boolean operators).
Traditional if > then relationship in pseudo code:
if (x>y) {
then print "x is greater than y."
}
There are also relational databases.
Or just visual if>then tables. A visual table representation.
There are also tree or hierarchical structure if>then programming aids.
I'm looking for any and all alternatives and flavors of if>then constructs, but preferably practical ones. Since most humans are better at using and remembering visual constructs (tables vs raw code) than symbolic constructs, I'm looking for the most intuitive way to theoretically construct an if>then rule engine, graphically.
Note: I'm not trying to implement this, I'm just trying to get an idea of what could theoretically be done.
I hope I've interpreted the question correctly.
Everything eventually boils down to comparisons, its just a matter of breaking up these comparisons in manageable chunks for humans. There are many techniques to reduce if-thens, or at least transform them into something easier to understand.
One example would be polymorphism. This frees the programmer from one instance of if/then (basically a switch statement). Another example is maps. The implementation of a map uses if/thens, but one might pre-populate the map with all the data and use one logical piece of code instead of using if/then to differentiate. This moves to a data-driven approach. Another example is SQL; it is just a language, a higher level construct, that enables us to express conditions and constraints differently. How you choose to express these conditions is dependent on the problem domain. Some problems work well with traditional procedural programming, some with logic programming, declarative programming etc. If there are many levels of nested if-thens, a state machine approach might work well. Aspect-oriented programming tries to solve the problem of duplicated code in modules that doesn't belong specifically to any one module; a concern that "cross-cuts".
I would do some reading on Programming Paradigms. Do lots of research and if you run into a recurring problem, see if another approach allows you to reduce the amount of if-thens. Most times someone else has run into the same problem and come up with a solution.
Your question is a bit broad and we could ramble from logical gates to mathematical functions. I'm going to focus on this particular bit:
"I'm looking for the most intuitive way to theoretically construct an if>then rule engine, graphically".
First, two caveats:
The best representation depends on the number of possible rules. What works for 3-4 rules probably won't work for 30-40.
I'm going to pretend that else conditions don't exist.
If "X then Y" boils down to: one condition and one instruction whose execution depends on the condition. Let's pretend X -> Y means that "If X is true then Y is executed". Let's create two sets: one is C that contains all the possible conditions. The other one is I which contains all the possible instructions.
With this is mind, X ∈ C and Y ∈ I. In your specific case, can Y ∈ C (can Y be a condition)? If so, you have nested ifs.
Nested ifs can be represented as chains of conditions joined by and operators:
if (x > 3) {
if (y > 5) {
# do something
}
}
Can be written as:
if (x > 3 and y > 5) {
# do something
}
If you're only thinking about code then the latter can become problematic when you have many nested conditions, but when you go graphical, nesting (probably using tree-like structures) can look cluttered while chaining usually looks like a sequence of instructions (which I think is better).
If you don't consider nesting (chaining) in your rules, then connecting elements (boxes, circles, etc) from X -> Y is trivial way to work. The representation of this depends on how graphical you want to get (see the links below for some examples).
If you're considering nesting then three random ideas come to my mind:
Venn Diagrams: Visually attractive, useless for more than 3-4 conditions. They have a good fit with database representations. See: http://share.mheroin.com/image/3i3l1y0S2F39
Flowcharts: Highly functional and easy to read, not too cumbersome to create. Can get out of hand with 10+ elements. See: http://share.mheroin.com/image/2g071j3U1u29
Tables: As you mentioned, tables are a decent way to represent conditionals as long as you can restrain the set of applicable rules. This is an example taken from iTunes: http://share.mheroin.com/image/390y2G18123q. The "Match [all/any] of the following rules" works as a replacement for if/else.
Or are we all sticking to our taught "&&, ||, !" way?
Any thoughts in why we should use one or the other?
I'm just wondering because several answers state thate code should be as natural as possible, but I haven't seen a lot of code with "and, or, not" while this is more natural.
I like the idea of the not operator because it is more visible than the ! operator. For example:
if (!foo.bar()) { ... }
if (not foo.bar()) { ... }
I suggest that the second one is more visible and readable. I don't think the same argument necessarily applies to the and and or forms, though.
"What's in a name? That which we call &&, || or !
By any other name would smell as sweet."
In other words, natural depends on what you are used to.
Those were not supported in the old days. And even now you need to give a special switch to some compilers to enable these keywords. That's probably because old code base may have had some functions || variables named "and" "or" "not".
One problem with using them (for me anyway) is that in MSVC you have to include iso646.h or use the (mostly unusable) /Za switch.
The main problem I have with them is the Catch-22 that they're not commonly used, so they require my brain to actively process the meaning, where the old-fashioned operators are more or less ingrained (kind of like the difference between reading a learned language vs. your native language).
Though I'm sure I'd overcome that issue if their use became more universal. If that happened, then I'd have the problem that some boolean operators have keywords while others don't, so if alternate keywords were used, you might see expressions like:
if ((x not_eq y) and (y == z) or (z <= something)) {...}
when it seems to me they should have alternate tokens for all the (at least comparison) operators:
if ((x not_eq y) and (y eq z) or (z lt_eq something)) {...}
This is because the reason the alternate keywords (and digraphs and trigraphs) were provided was not to make the expressions more readable - it was because historically there have been (and maybe still are) keyboards and/or codepages in some localities that do not have certain punctuation characters. For example, the invariant part of the ISO 646 codepage (surprise) is missing the '|', '^' and '~' characters among others.
Although I've been programming C++ from quite some time, I did not know that the keywords "and" "or" and "not" were allowed, and I've never seen it used.
I searched through my C++ book, and I found a small section mentioning alternative representation for the normal operators "&&", "||" and "!", where it explains those are available for people with non-standard keyboards that do not have the "&!|" symbols.
A bit like trigraphs in C.
Basically, I would be confused by their use, and I think I would not be the only one.
Using a representation which is non-standard, should really have a good reason to be used.
And if used, it should be used consistently in the code, and described in the coding standard.
The digraph and trigraph operators were actually designed more for systems that didn't carry the standard ASCII character set - such as IBM mainframes (which use EBCDIC). In the olden days of mechanical printers, there was this thing called a "48-character print chain" which, as its name implied, only carried 48 characters. A-Z (uppercase), 0-9 and a handful of symbols. Since one of the missing symbols was an underscore (which rendered as a space), this could make working with languages like C and PL/1 a real fun activity (is this 2 words or one word with an underscore???).
Conventional C/C++ is coded with the symbols and not the digraphs. Although I have been known to #define "NOT", since it makes the meaning of a boolean expression more obvious, and it's visually harder to miss than a skinny little "!".
I wish I could use || and && in normal speech. People try very hard to misunderstand when I say "and" or "or"...
I personally like operators to look like operators. It's all maths, and unless you start using "add" and "subtract" operators too it starts to look a little inconsistent.
I think some languages suit the word-style and some suit the symbols if only because it's what people are used to and it works. If it ain't broke, don't fix it.
There is also the question of precedence, which seems to be one of the reasons for introducing the new operators, but who can be bothered to learn more rules than they need to?
In cases where I program with names directly mapped to the real world, I tend to use 'and' and 'or', for example:
if(isMale or isBoy and age < 40){}
It's nice to use 'em in Eclipse+gcc, as they are highlighted. But then, the code doesn't compile with some compilers :-(
Using these operators is harmful. Notice that and and or are logical operators whereas the similar-looking xor is a bitwise operator. Thus, arguments to and and or are normalized to 0 and 1, whereas those to xor aren't.
Imagine something like
char *p, *q; // Set somehow
if(p and q) { ... } // Both non-NULL
if(p or q) { ... } // At least one non-NULL
if(p xor q) { ... } // Exactly one non-NULL
Bzzzt, you have a bug. In the last case you're testing whether at least one of the bits in the pointers is different, which probably isn't what you thought you were doing because then you would have written p != q.
This example is not hypothetical. I was working together with a student one time and he was fond of these literate operators. His code failed every now and then for reasons that he couldn't explain. When he asked me, I could zero in on the problem because I knew that C++ doesn't have a logical xor operator, and that line struck me as very odd.
BTW the way to write a logical xor in C++ is
!a != !b
I like the idea, but don't use them. I'm so used to the old way that it provides no advantage to me doing it either way. Same holds true for the rest of our group, however, I do have concerns that we might want to switch to help avoid future programmers from stumbling over the old symbols.
So to summarize: it's not used a lot because of following combination
old code where it was not used
habit (more standard)
taste (more math-like)
Thanks for your thoughts
Three lispy homoiconic languages, Dylan, Julia and Seph all moved away from leading parenthesis - so a hypothetical function call in Common Lisp that would look like:
(print hello world)
Would look like the following hypothetical function call
print(hello world)
in the three languages mentioned above.
Were Clojure to go down this path - what would it have to sacrifice to get there?
Reasoning:
Apart from the amazing lazy functional data structures in Clojure, and the improved syntax for maps and seqs, the language support for concurrency, the JVM platform, the tooling and the awesome community - the distinctive thing about it being 'a LISP' is leading parenthesis giving homoiconicity which gives macros providing syntax abstraction.
But if you don't need leading parentheses - why have them? The only arguments I can think of for keeping them are
(1) reusing tool support in emacs
(2) prompting people to 'think in LISP' and not try and treat it as another procedural language)
(Credit to andrew cooke's answer, who provided the link to Wheeler's and Gloria's "Readable Lisp S-expressions Project")
The link above is a project intended to provide a readable syntax for all languages based on s-expressions, including Scheme and Clojure. The conclusion is that it can be done: there's a way to have readable Lisp without the parentheses.
Basically what David Wheeler's project does is add syntactic sugar to Lisp-like languages to provide more modern syntax, in a way that doesn't break Lisp's support for domain-specific languages. The enhancements are optional and backwards-compatible, so you can include as much or as little of it as you want and mix them with existing code.
This project defines three new expression types:
Curly-infix-expressions. (+ 1 2 3) becomes {1 + 2 + 3} at every place you want to use infix operators of any arity. (There is a special case that needs to be handled with care if the inline expression uses several operators, like {1 + 2 * 3} - although {1 + {2 * 3} } works as expected).
Neoteric-expressions. (f x y) becomes f(x y) (requires that no space is placed between the function name and its parameters)
Sweet-expressions. Opening and closing parens can be replaced with (optional) python-like semantic indentation. Sweet-expressions can be freely mixed with traditional parentheses s-expressions.
The result is Lisp-compliant but much more readable code. An example of how the new syntactic sugar enhances readability:
(define (gcd_ a b)
(let (r (% b a))
(if (= r 0) a (gcd_ r a))))
(define-macro (my-gcd)
(apply gcd_ (args) 2))
becomes:
define gcd_(a b)
let r {b % a}
if {r = 0} a gcd_(r a)
define-macro my-gcd()
apply gcd_ (args) 2
Note how the syntax is compatible with macros, which was a problem with previous projects that intended to improve Lisp syntax (as described by Wheeler and Gloria). Because it's just sugar, the final form of each new expression is a s-expression, transformed by the language reader before macros are processed - so macros don't need any special treatment. Thus the "readable Lisp" project preserves homoiconicity, the property that allows Lisp to represent code as data within the language, which is what allows it to be a powerful meta-programming environment.
Just moving the parentheses one atom in for function calls wouldn't be enough to satisfy anybody; people will be complaining about lack of infix operators, begin/end blocks etc. Plus you'd probably have to introduce commas / delimiters in all sorts of places.
Give them that and macros will be much harder just to write correctly (and it would probably be even harder to write macros that look and act nicely with all the new syntax you've introduced by then). And macros are not something that's a nice feature you can ignore or make a whole lot more annoying; the whole language (like any other Lisp) is built right on top of them. Most of the "user-visible" stuff that's in clojure.core, including let, def, defn etc are macros.
Writing macros would become much more difficult because the structure would no longer be simple you would need another way to encode where expressions start and stop using some syntactic symbol to mark the start and end of expressions to you can write code that generates expressions perhaps you could solve this problem by adding something like a ( to mark the start of the expression...
On a completely different angle, it is well worth watching this video on the difference between familiar and easy making lisps syntax more familiar wont make it any easier for people to learn and may make it misleading if it looks to much like something it is not.
even If you completely disagree, that video is well worth the hour.
you wouldn't need to sacrifice anything. there's a very carefully thought-out approach by david wheeler that's completely transparent and backwards compatible, with full support for macros etc.
You would have Mathematica. Mathematica accepts function calls as f[x, y, z], but when you investigate it further, you find that f[x, y, z] is actually syntactic sugar for a list in which the Head (element 0) is f and the Rest (elements 1 through N) is {x, y, z}. You can construct function calls from lists that look a lot like S-expressions and you can decompose unevaluated functions into lists (Hold prevents evaluation, much like quote in Lisp).
There may be semantic differences between Mathematica and Lisp/Scheme/Clojure, but Mathematica's syntax is a demonstration that you can move the left parenthesis over by one atom and still interpret it sensibly, build code with macros, etc.
Syntax is pretty easy to convert with a preprocessor (much easier than semantics). You could probably get the syntax you want through some clever subclassing of Clojure's LispReader. There's even a project that seeks to solve Clojure's off-putting parentheses by replacing them all with square brackets. (It boggles my mind that this is considered a big deal.)
I used to code C/C#/Java/Pascal so I emphasize with the feeling that Lisp code is a bit alien. However that feeling only lasts a few weeks - after a fairly short amount of time the Lisp style will feel very natural and you'll start berating other languages for their "irregular" syntax :-)
There is a very good reason for Lisp syntax. Leading parentheses make code logically simpler to parse and read, by collecting both a function and the expressions that make up it's arguments in a single form.
And when you manipulate code / use macros, it is these forms that matter: these are the building blocks of all your code. So it fundamentally makes sense to put the parentheses in a place that exactly delimits these forms, rather than arbitrarily leaving the first element outside the form.
The "code is data" philosophy is what makes reliable metaprogramming possible. Compare with Perl / Ruby, which have complex syntax, their approaches to metaprogramming are only reliable in very confined circumstances. Lisp metaprogramming is so perfectly reliable that the core of the language depends on it. The reason for this is the uniform syntax shared by code and data (the property of homoiconicity). S-expressions are the way this uniformity is realized.
That said, there are circumstances in Clojure where implied parenthesis is possible:
For example the following three expressions are equivalent:
(first (rest (rest [1 2 3 4 5 6 7 8 9])))
(-> [1 2 3 4 5 6 7 8 9] (rest) (rest) (first))
(-> [1 2 3 4 5 6 7 8 9] rest rest first)
Notice that in the third the -> macro is able in this context to infer parentheses and thereby leave them out. There are many special case scenarios like this. In general Clojure's design errs on the side of less parentheses. See for example the controversial decision of leaving out parenthesis in cond. Clojure is very principled and consistent about this choice.
Interestingly enough - there is an alternate Racket Syntax:
#foo{blah blah blah}
reads as
(foo "blah blah blah")
Were Clojure to go down this path - what would it have to sacrifice to get there?
The sacrifice would the feeling/mental-modal of writing code by creating List of List of List.. or more technically "writing the AST directly".
NOTE: There are other things as well that will be scarified as mentioned in other answers.
Or are we all sticking to our taught "&&, ||, !" way?
Any thoughts in why we should use one or the other?
I'm just wondering because several answers state thate code should be as natural as possible, but I haven't seen a lot of code with "and, or, not" while this is more natural.
I like the idea of the not operator because it is more visible than the ! operator. For example:
if (!foo.bar()) { ... }
if (not foo.bar()) { ... }
I suggest that the second one is more visible and readable. I don't think the same argument necessarily applies to the and and or forms, though.
"What's in a name? That which we call &&, || or !
By any other name would smell as sweet."
In other words, natural depends on what you are used to.
Those were not supported in the old days. And even now you need to give a special switch to some compilers to enable these keywords. That's probably because old code base may have had some functions || variables named "and" "or" "not".
One problem with using them (for me anyway) is that in MSVC you have to include iso646.h or use the (mostly unusable) /Za switch.
The main problem I have with them is the Catch-22 that they're not commonly used, so they require my brain to actively process the meaning, where the old-fashioned operators are more or less ingrained (kind of like the difference between reading a learned language vs. your native language).
Though I'm sure I'd overcome that issue if their use became more universal. If that happened, then I'd have the problem that some boolean operators have keywords while others don't, so if alternate keywords were used, you might see expressions like:
if ((x not_eq y) and (y == z) or (z <= something)) {...}
when it seems to me they should have alternate tokens for all the (at least comparison) operators:
if ((x not_eq y) and (y eq z) or (z lt_eq something)) {...}
This is because the reason the alternate keywords (and digraphs and trigraphs) were provided was not to make the expressions more readable - it was because historically there have been (and maybe still are) keyboards and/or codepages in some localities that do not have certain punctuation characters. For example, the invariant part of the ISO 646 codepage (surprise) is missing the '|', '^' and '~' characters among others.
Although I've been programming C++ from quite some time, I did not know that the keywords "and" "or" and "not" were allowed, and I've never seen it used.
I searched through my C++ book, and I found a small section mentioning alternative representation for the normal operators "&&", "||" and "!", where it explains those are available for people with non-standard keyboards that do not have the "&!|" symbols.
A bit like trigraphs in C.
Basically, I would be confused by their use, and I think I would not be the only one.
Using a representation which is non-standard, should really have a good reason to be used.
And if used, it should be used consistently in the code, and described in the coding standard.
The digraph and trigraph operators were actually designed more for systems that didn't carry the standard ASCII character set - such as IBM mainframes (which use EBCDIC). In the olden days of mechanical printers, there was this thing called a "48-character print chain" which, as its name implied, only carried 48 characters. A-Z (uppercase), 0-9 and a handful of symbols. Since one of the missing symbols was an underscore (which rendered as a space), this could make working with languages like C and PL/1 a real fun activity (is this 2 words or one word with an underscore???).
Conventional C/C++ is coded with the symbols and not the digraphs. Although I have been known to #define "NOT", since it makes the meaning of a boolean expression more obvious, and it's visually harder to miss than a skinny little "!".
I wish I could use || and && in normal speech. People try very hard to misunderstand when I say "and" or "or"...
I personally like operators to look like operators. It's all maths, and unless you start using "add" and "subtract" operators too it starts to look a little inconsistent.
I think some languages suit the word-style and some suit the symbols if only because it's what people are used to and it works. If it ain't broke, don't fix it.
There is also the question of precedence, which seems to be one of the reasons for introducing the new operators, but who can be bothered to learn more rules than they need to?
In cases where I program with names directly mapped to the real world, I tend to use 'and' and 'or', for example:
if(isMale or isBoy and age < 40){}
It's nice to use 'em in Eclipse+gcc, as they are highlighted. But then, the code doesn't compile with some compilers :-(
Using these operators is harmful. Notice that and and or are logical operators whereas the similar-looking xor is a bitwise operator. Thus, arguments to and and or are normalized to 0 and 1, whereas those to xor aren't.
Imagine something like
char *p, *q; // Set somehow
if(p and q) { ... } // Both non-NULL
if(p or q) { ... } // At least one non-NULL
if(p xor q) { ... } // Exactly one non-NULL
Bzzzt, you have a bug. In the last case you're testing whether at least one of the bits in the pointers is different, which probably isn't what you thought you were doing because then you would have written p != q.
This example is not hypothetical. I was working together with a student one time and he was fond of these literate operators. His code failed every now and then for reasons that he couldn't explain. When he asked me, I could zero in on the problem because I knew that C++ doesn't have a logical xor operator, and that line struck me as very odd.
BTW the way to write a logical xor in C++ is
!a != !b
I like the idea, but don't use them. I'm so used to the old way that it provides no advantage to me doing it either way. Same holds true for the rest of our group, however, I do have concerns that we might want to switch to help avoid future programmers from stumbling over the old symbols.
So to summarize: it's not used a lot because of following combination
old code where it was not used
habit (more standard)
taste (more math-like)
Thanks for your thoughts