Defining how long a list can be in Haskell - list

So I'm new to Haskell and i'm trying to define a list which is a Max of 4 elements long.
so far I have type IntL = [Int,Int,Int,Int]
but I was thinking there must be a better/proper way of doing this.
Is there?

This is problematic in Haskell because phantom types encoding sizes need proper compiler support (otherwise it's pretty annoying to use), and type nats in GHC appeared somewhat recently.
That being said libraries exist, just to give you an idea.
Alternatively, just use a tuple.

it might look stupid and it certainly does not scale but what about
data Max4 a
= Empty
| One a
| Two a a
| Three a a a
| Four a a a a
with type IntL = Max4 Int? It's basic, you should be able to understand it and you can learn a lot by implementing operations on it.

Basic Haskell Types are not so powerful as to encode the maximum length of a list. In order to do that, you must rely on extensions such as GADTs and Phantom Types and yet it is not straightforward as well.
If you are really a newbie, I advice you to learn other basic concepts like Monads, IO and other idioms.
This site is a very good reading for an initial approach to Haskell:
http://learnyouahaskell.com

Related

Should function composition and piping be tested?

In F# (and most of the functional languages) some codes are extremely short as follows:
let f = getNames
>> Observable.flatmap ObservableJson.jsonArrayToObservableObjects<string>
or :
let jsonArrayToObservableObjects<'t> =
JsonConvert.DeserializeObject<'t[]>
>> Observable.ToObservable
And the simplest property-based test I ended up for the latter function is :
testList "ObservableJson" [
testProperty "Should convert an Observable of `json` array to Observable of single F# objects" <| fun _ ->
//--Arrange--
let (array , json) = createAJsonArrayOfString stringArray
//--Act--
let actual = jsonArray
|> ObservableJson.jsonArrayToObservableObjects<string>
|> Observable.ToArray
|> Observable.Wait
//--Assert--
Expect.sequenceEqual actual sArray
]
Regardless of the arrange part, the test is more than the function under test, so it's harder to read than the function under test!
What would be the value of testing when it's harder to read than the production code?
On the other hand:
I wonder whether the functions which are a composition of multiple functions are safe to not to be tested?
Should they be tested at integration and acceptance level?
And what if they are short but do complex operations?
Depends upon your definition of what 'functional programming' is. Or even more precise - upon how close you wanna stay to the origin of functional programming - math with both broad and narrow meanings.
Let's take something relevant to the programming. Say, mappings theory. Your question could be translated in such a way: having a bijection from A to B, and a bijection from B to C, should I prove that composition of those two is a bijection as well? The answer is twofold: you definitely should, and you do it only once: your prove is generic enough to cover all possible cases.
Falling back into programming, it means that pipe-lining has to be tested (proved) only once - and I guess it was before deploy to the production. Since that your job as a programmer to create functions (mappings) of such a quality, that, being composed with a pipeline operator or whatever else, the desired properties are preserved. Once again, it's better to stick with generic arguments rather than write tons of similar tests.
So, finally, we come down to a much more valuable question: how one can guarantee that some operation preserve some property? It turns out that the easiest way to acknowledge such a fact is to deal with types like Monoid from the great Haskell: for example, Monoind is there to represent any associative binary operation A -> A -> A together with some identity-element of type A. Having such a generic containers is extremely profitable and the best known way of being explicit in what and how exactly your code is designed to do.
Personally I would NOT test it.
In fact having less need for testing and instead relying more on stricter compiler rules, side effect free functions, immutability etc. is one major reason why I prefer F# over C#.
of course I continue (unit)testing of "custom logic" ... e.g. algorithmic code

Why would I ever want to use Maybe instead of a List?

Seeing as the Maybe type is isomorphic to the set of null and singleton lists, why would anyone ever want to use the Maybe type when I could just use lists to accomodate absence?
Because if you match a list against the patterns [] and [x] that's not an exhaustive match and you'll get a warning about that, forcing you to either add another case that'll never get called or to ignore the warning.
Matching a Maybe against Nothing and Just x however is exhaustive. So you'll only get a warning if you fail to match one of those cases.
If you choose your types such that they can only represent values that you may actually produce, you can rely on non-exhaustiveness warnings to tell you about bugs in your code where you forget to check for a given a case. If you choose more "permissive" types, you'll always have to think about whether a warning represents an actual bug or just an impossible case.
You should strive to have accurate types. Maybe expresses that there is exactly one value or that there is none. Many imperative languages represent the "none" case by the value null.
If you chose a list instead of Maybe, all your functions would be faced with the possibility that they get a list with more than one member. Probably many of them would only be defined for one value, and would have to fail on a pattern match. By using Maybe, you avoid a class of runtime errors entirely.
Building on existing (and correct) answers, I'll mention a typeclass based answer.
Different types convey different intentions - returning a Maybe a represents a computation with the possiblity of failing while [a] could represent non-determinism (or, in simpler terms, multiple possible return values).
This plays into the fact that different types have different instances for typeclasses - and these instances cater to the underlying essence the type conveys. Take Alternative and its operator (<|>) which represents what it means to combine (or choose) between arguments given.
Maybe a Combining computations that can fail just means taking the first that is not Nothing
[a] Combining two computations that each had multiple return values just means concatenating together all possible values.
Then, depending on which types your functions use, (<|>) would behave differently. Of course, you could argue that you don't need (<|>) or anything like that, but then you are missing out on one of Haskell's main strengths: it's many high-level combinator libraries.
As a general rule, we like our types to be as snug fitting and intuitive as possible. That way, we are not fighting the standard libraries and our code is more readable.
Lisp, Scheme, Python, Ruby, JavaScript, etc., manage to get along with just one type each, which you could represent in Haskell with a big sum type. Every function handling a JavaScript (or whatever) value must be prepared to receive a number, a string, a function, a piece of the document object model, etc., and throw an exception if it gets something unexpected. People who program in typed languages like Haskell prefer to limit the number of unexpected things that can occur. They also like to express ideas using types, making types useful (and machine-checked) documentation. The closer the types come to representing the intended meaning, the more useful they are.
Because there are an infinite number of possible lists, and a finite number of possible values for the Maybe type. It perfectly represents one thing or the absence of something without any other possibility.
Several answers have mentioned exhaustiveness as a factor here. I think it is a factor, but not the biggest one, because there is a way to consistently treat lists as if they were Maybes, which the listToMaybe function illustrates:
listToMaybe :: [a] -> Maybe a
listToMaybe [] = Nothing
listToMaybe (a:_) = Just a
That's an exhaustive pattern match, which rules out any straightforward errors.
The factor I'd highlight as bigger is that by using the type that more precisely models the behavior of your code, you eliminate potential behaviors that would be possible if you used a more general alternative. Say for example you have some context in your code where you uses a type of the form a -> [b], though the only correct alternatives (given your program's specification) are empty or singleton lists. Try as hard as you may to enforce the convention that this context should obey that rule, it's still possible that you'll mess up and:
Somehow a function used in that context will produce a list of two or more items;
And somehow a function that uses the results produced in that context will observe whether the lists have two or more items, and behave incorrectly in that case.
Example: some code that expects there to be no more than one value will blindly print the contents of the list and thus print multiple items when only one was supposed to be.
But if you use Maybe, then there really must be either one value or none, and the compiler enforces this.
Even though isomorphic, e.g. QuickCheck will run slower because of the increase in search space.

Where does the k prefix for constants come from?

it's a pretty common practice that constants are prefixed with k (e.g. k_pi). But what does the k mean?
Is it simply that c already meant char?
It's a historical oddity, still common practice among teams who like to blindly apply coding standards that they don't understand.
Long ago, most commercial programming languages were weakly typed; automatic type checking, which we take for granted now, was still mostly an academic topic. This meant that is was easy to write code with category errors; it would compile and run, but go wrong in ways that were hard to diagnose. To reduce these errors, a chap called Simonyi suggested that you begin each variable name with a tag to indicate its (conceptual) type, making it easier to spot when they were misused. Since he was Hungarian, the practise became known as "Hungarian notation".
Some time later, as typed languages (particularly C) became more popular, some idiots heard that this was a good idea, but didn't understand its purpose. They proposed adding redundant tags to each variable, to indicate its declared type. The only use for them is to make it easier to check the type of a variable; unless someone has changed the type and forgotten to update the tag, in which case they are actively harmful.
The second (useless) form was easier to describe and enforce, so it was blindly adopted by many, many teams; decades later, you still see it used, and even advocated, from time to time.
"c" was the tag for type "char", so it couldn't also be used for "const"; so "k" was chosen, since that's the first letter of "konstant" in German, and is widely used for constants in mathematics.
I haven't seen it that much, but maybe it comes from certain languages' (the germanic ones in particular) spelling of the word constant - konstant.
Don't use Hungarian Notation. If you want constants to stand out, make them all caps.
As a side note: there are a lot of things in the Google Coding Standards that are poor practice (in terms of code readability). That is what happens when you design a coding standard by committee.
It means the value is k-onstant.
I think mathematical convention was the precedent. k is used in maths all the time as just some constant.
K stands for konstant, a wordplay on constant. It relates to Coding Styles.
It's just a matter of preference, some people and projects use them which means they also embrace the Hungarian notation, many don't. That's not that important.
If you're unsure what a prefix or style might mean, always check if the project has a coding style reference and read that.
Actually, whenever I define constants in typescript, I do something like this -
NODE_ENV = 'production';
But recently, I saw that the k prefix is being used in the Flutter SDK. It makes sense to me to keep using the k prefix cuz' it helps your editor/IDE in searching out constants in your codebase.
It's a convention, probably from math. But there are other suggestions for constant too, for example Kernighan and Ritchie in their book "The C language" suggest writing constants' name in capital letters (e.g. #define MAX 55).
I think, it means coefficient (as k in math means)

Is FC++ used by any open source projects?

The FC++ library provides an interesting approach to supporting functional programming concepts in C++.
A short example from the FAQ:
take (5, map (odd, enumFrom(1)))
FC++ seems to take a lot of inspiration from Haskell, to the extent of reusing many function names from the Haskell prelude.
I've seen a recent article about it, and it's been briefly mentioned in some answers on stackoverflow, but I can't find any usage of it out in the wild.
Are there any open source projects actively using FC++? Or any history of projects which used it in the past? Or does anyone have personal experience with it?
There's a Customers section on the web site, but the only active link is to another library by the same authors (LC++).
As background: I'm looking to write low latency audio plugins using existing C++ APIs, and I'm looking for tooling which allows me to write concise code in a functional style. For this project I wan't to use a C++ library rather than using a separate language, to avoid introducing FFI bindings (because of the complexity) or garbage collection (to keep the upper bound on latency in the sub-millisecond range).
I'm aware that the STL and Boost libraries already provide support from many FP concepts--this may well be a more practical approach. I'm also aware of other promising approaches for code generation of audio DSP code from functional languages, such as the FAUST project or the Haskell synthesizer package.
This isn't an answer to your question proper, but my experience with embedding of functional style into imperative languages has been horrid. While the code can be almost as concise, it retains the complexity of reasoning found in imperative languages.
The complexity of the embedding usually requires the most intimate knowledge of the details and corner cases of the language. This greatly increases the cost of abstraction, as these things must always be taken into careful consideration. And with a cost of abstraction so high, it is easier just to put a side-effectful function in a lazy stream generator and then die of subtle bugs.
An example from FC++:
struct Insert : public CFunType<int,List<int>,List<int> > {
List<int> operator()( int x, const List<int>& l ) const {
if( null(l) || (x > head(l)) )
return cons( x, l );
else
return cons( head(l), curry2(Insert(),x,tail(l)) );
}
};
struct Isort : public CFunType<List<int>,List<int> > {
List<int> operator()( const List<int>& l ) const {
return foldr( Insert(), List<int>(), l );
}
};
I believe this is trying to express the following Haskell code:
-- transliterated, and generalized
insert :: (Ord a) => a -> [a] -> [a]
insert x [] = [x]
insert x (a:as) | x > a = x:a:as
| otherwise = a:insert x as
isort :: (Ord a) => [a] -> [a]
isort = foldr insert []
I will leave you to judge the complexity of the approach as your program grows.
I consider code generation a much more attractive approach. You can restrict yourself to a miniscule subset of your target language, making it easy to port to a different target language. The cost of abstraction in a honest functional language is nearly zero, since, after all, they were designed for that (just as abstracting over imperative code in an imperative language is fairly cheap).
I'm the primary original developer of FC++, but I haven't worked on it in more than six years. I have not kept up with C++/boost much in that time, so I don't know how FC++ compares now. The new C++ standard (and implementations like VC++) has a bit of stuff like lambda and type inference help that makes some of what is in there moot. Nevertheless, there might be useful bits still, like the lazy list types and the Haskell-like (and similarly named) combinators. So I guess try it and see.
(Since you mentioned real-time, I should mention that the lists use reference counting, so if you 'discard' a long list there may be a non-trivial wait in the destructor as all the cells' ref-counts go to zero. I think typically in streaming scenarios with infinite streams/lists this is a non-issue, since you're typically just tailing into the stream and only deallocating things one node at a time as you stream.)

Expression Evaluation in C++

I'm writing some excel-like C++ console app for homework.
My app should be able to accept formulas for it's cells, for example it should evaluate something like this:
Sum(tablename\fieldname[recordnumber], fieldname[recordnumber], ...)
tablename\fieldname[recordnumber] points to a cell in another table,
fieldname[recordnumber] points to a cell in current table
or
Sin(fieldname[recordnumber])
or
anotherfieldname[recordnumber]
or
"10" // (simply a number)
something like that.
functions are Sum, Ave, Sin, Cos, Tan, Cot, Mul, Div, Pow, Log (10), Ln, Mod
It's pathetic, I know, but it's my homework :'(
So does anyone know a trick to evaluate something like this?
Ok, nice homework question by the way.
It really depends on how heavy you want this to be. You can create a full expression parser (which is fun but also time consuming).
In order to do that, you need to describe the full grammar and write a frontend (have a look at lex and yacc or flexx and bison.
But as I see your question you can limit yourself to three subcases:
a simple value
a lookup (possibly to an other table)
a function which inputs are lookups
I think a little OO design can helps you out here.
I'm not sure if you have to deal with real time refresh and circular dependency checks. Else they can be tricky too.
For the parsing, I'd look at Recursive descent parsing. Then have a table that maps all possible function names to function pointers:
struct FunctionTableEntry {
string name;
double (*f)(double);
};
You should write a parser. Parser should take the expression i.e., each line and should identify the command and construct the parse tree. This is the first phase. In the second phase you can evaluate the tree by substituting the data for each elements of the command.
Previous responders have hit it on the head: you need to parse the cell contents, and interpret them.
StackOverflow already has a whole slew of questions on building compilers and interperters where you can find pointers to resources. Some of them are:
Learning to write a compiler (#1669 people!)
Learning Resources on Parsers, Interpreters, and Compilers
What are good resources on compilation?
References Needed for Implementing an Interpreter in C/C++
...
and so on.
Aside: I never have the energy to link them all together, or even try to build a comprehensive list.
I guess you cannot use yacc/lex (or the like) so you have to parse "manually":
Iterate over the string and divide it into its parts. What a part is depends on you grammar (syntax). That way you can find the function names and the parameters. The difficulty of this depends on the complexity of your syntax.
Maybe you should read a bit about lexical analysis.