Compare two values of two "lists" against each other and count instances where values match - sml

To start, let me share that I'm a complete novice at formal programming and have decided to start learning after many years of working in IT from the administrative side. So I am starting from scratch and hopefully unlearning any bad habits I've picked up over the years.
To help myself out, I've started a course online that has some homework. I've promised myself to not cheat, but I'm struggling on this problem and would like some help. I did see a similar post that was answered, but the answer doesn't make sense to me, so I'm asking again.
Here is the assignment ...
" Write a function "number_in_months" that takes a list of dates and a list of months (i.e., an int list) and returns the number of dates in the list of dates that are in any of the months in the list of months. Assume the list of months has no number repeated. Hint: Use your answer to the previous problem. "
You'll notice that it refers to the previous problem. I've shared it below.
" Write a function "number_in_month" that takes a list of dates and a month (i.e., an int) and returns how many dates in the list are in the given month. "
I was able to solve the previous problem with this code.
fun number_in_month (dates : (int * int * int) list , month : int ) =
if null dates
then 0
else
if ( #2 (hd dates) = month )
then 1 + number_in_month (tl dates , month)
else number_in_month (tl dates , month)
In my REPL val test2 = number_in_month ([(2012,2,28),(2013,12,1)],2) = 1 tests true
So I've re-read the material and re-watched the videos a few times and still haven't gotten to an understanding on how to solve the problem. I can get it all to type check correctly but then get an uncaught exception EMPTY
I can recurse the "dates" list against the hd of the "months" list with no problem but to move on to the next value in the "months" list is killing me. So far I have tried many ways, but I'm stumped, and I'm feeling not just a little stupid. ;-)
fun number_in_months (dates : (int * int * int) list , months : int list ) =
if null dates
then 0
else
if (#2 (hd dates) = (hd months))
then 1 + number_in_months ( tl dates , months )
else
????
It may be that I solved the first problem in such a way that it is throwing me off with the second. I'm open to any ideas, guidance, or clues. All will be appreciated.

The more idiomatic way to handle empty lists and accessing elements of tuples in SML is pattern matching. If we consider your number_in_month function, we'd look at something like the following. If the list of dates is empty, it doesn't matter what month we're looking for, so we represent it with _.
Otherwise we consider the first tuple (but we only need to bind a name to the second element representing the month) and bind a name to the tail of the list. We count the tail of the list, and then if the current one is a match, add 1.
fun number_in_month([], _) = 0
| number_in_month(((_, mon, _)::other_dates), month) =
let
val count_remaining = number_in_month(other_dates, month)
in
(if mon = month then 1 else 0) + count_remaining
end;
For your numbers_in_months it seems you want to map over the list of months. List.map provides exactly this, and can directly use your prior work.
fun numbers_in_months(dates, months) =
List.map (fn m => number_in_month(dates, m)) months;
numbers_in_months([(2012, 2, 28), (2013, 12, 1)], [4, 12, 2]);
Returns:
[0, 1, 1]
map can be implemented trivially, even without the niceties of pattern-matching.
fun map f lst =
if null lst then []
else f (hd lst) :: map f (tl lst)
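The assignment itself asks for a single count rather than a list of per-month counts. One way to get there, sketched below, is to add up the list that the map step produces. The use of foldl and the layout are my own choices for illustration, and the sketch relies on the assignment's guarantee that the months list contains no repeats.

```sml
(* Pattern-matching version of number_in_month, as in the answer above. *)
fun number_in_month ([], _) = 0
  | number_in_month ((_, mon, _) :: other_dates, month) =
      (if mon = month then 1 else 0) + number_in_month (other_dates, month)

(* Per-month counts, as in numbers_in_months above. *)
fun numbers_in_months (dates, months) =
  List.map (fn m => number_in_month (dates, m)) months

(* Sketch: sum the per-month counts into the single total the assignment wants. *)
fun number_in_months (dates, months) =
  foldl (op +) 0 (numbers_in_months (dates, months))

val total = number_in_months ([(2012, 2, 28), (2013, 12, 1)], [4, 12, 2])
(* total = 2 *)
```

If a month appeared twice in the list, the dates in that month would be counted twice, which is why the "no number repeated" assumption matters.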

Related

Haskell function that returns a list of elements in a list with more than given amount of occurrences

I tried writing a function that, as in the title, takes two arguments: a number specifying how many times an element must occur, and the list we are working on. I wrote a function that counts the number of appearances of a given element in a list and tried using it in my main function, but I cannot comprehend how if/else and indentation work in Haskell; fixing errors is so much harder than in other languages. I think I'm missing an else branch, but even so I don't know what to put in there.
count el list = count el list 0
where count el list output
| list==[] = output
| head(list)==el = count el (tail(list)) output+1
| otherwise = count el (tail(list)) output
moreThan :: Eq a => Int -> [a] -> [a]
moreThan a [] = []
moreThan a list = moreThan a list output i
where moreThan a list [] 0
if i == length (list)
then output
else if elem (list!!i) output
then moreThan a list output i+1
else if (count (list!!i) list) >= a
then moreThan a list (output ++ [list!!i]) i+1
All I get right now is
parse error (possibly incorrect indentation or mismatched brackets)
You just forgot the = sign and some brackets, and the final else case. But also you switched the order of the internal function declaration and call:
moreThan :: Eq a => Int -> [a] -> [a]
moreThan a [] = []
moreThan a list = go a list [] 0    -- call
  where
    go a list output i =            -- declaration, with the = sign
      if i == length (list)
        then output
        else if elem (list!!i) output
          then go a list output (i+1)    -- (i+1) !
          else if (count (list!!i) list) >= a
            then go a list (output ++ [list!!i]) (i+1)    -- (i+1) !
            else
              undefined
I did rename your internal function as go, as is the custom.
As to how to go about fixing errors in general, just read the error messages, slowly, and carefully -- they usually say what went wrong and where.
That takes care of the syntax issues that you asked about.
As to what to put in the missing else clause, you've just dealt with this issue in the line above it -- you include the ith element in the output if its count in the list is greater than or equal to the given parameter, a. What to do else, we say in the else clause.
And that is, most probably, to not include that element in the output:
then go a list (output ++ [list!!i]) (i+1)
else ---------------------
undefined
So, just keep the output as it is, there, instead of the outlined part, and put that line instead of the undefined.
More importantly, accessing list elements via an index is an anti-pattern, it is much better to "slide along" by taking a tail at each recursive step, and always deal with the head element only, like you do in your count code (but preferably using the pattern matching, not those functions directly). That way our code becomes linear instead of quadratic as it is now.
Will Ness's answer is correct. I just wanted to offer some general advice for Haskell and some tips for improving your code.
First, I would always avoid using guards. The syntax is quite inconsistent with Haskell's usual fare, and guards aren't composable in the same way that other Haskell syntax is. If I were you, I'd stick to using let, if/then/else, and pattern matching.
Secondly, an if statement in Haskell is very often not the right answer. In many cases, it's better to avoid using if statements entirely (or at least as much as possible). For example, a more readable version of count would look like this:
count el list = go list 0 where
  go [] output = output
  go (x:xs) output = go xs (if x == el
                              then 1 + output
                              else output)
However, this code is still flawed because it is not properly strict in output. For example, consider the evaluation of the expression count 1 [1, 1, 1, 1], which proceeds as follows:
count 1 [1, 1, 1, 1]
go [1, 1, 1, 1] 0
go [1, 1, 1] (1 + 0)
go [1, 1] (1 + (1 + 0))
go [1] (1 + (1 + (1 + 0)))
go [] (1 + (1 + (1 + (1 + 0))))
(1 + (1 + (1 + (1 + 0))))
(1 + (1 + (1 + 1)))
(1 + (1 + 2))
(1 + 3)
4
Notice the ballooning space usage of this evaluation. We need to force go to make sure output is evaluated before it makes a recursive call. We can do this using seq. The expression seq a b is evaluated as follows: first, a is evaluated to weak head normal form (that is, up to its outermost constructor). Then, seq a b evaluates to b. For numbers, weak head normal form is the same as being fully evaluated.
So the code should in fact be
count el list = go list 0 where
  go [] output = output
  go (x:xs) output =
    let new_output = if x == el
                       then 1 + output
                       else output
    in seq new_output (go xs new_output)
Using this definition, we can again trace the execution:
go [1, 1, 1, 1] 0
go [1, 1, 1] 1
go [1, 1] 2
go [1] 3
go [] 4
4
which is a more efficient way to evaluate the expression. Without using library functions, this is basically as good as it gets for writing the count function.
But we're actually using a very common pattern - a pattern so common, there is a higher-order function named for it. We're using foldl' (which must be imported from Data.List using the statement import Data.List (foldl')). This function has the following definition:
foldl' :: (b -> a -> b) -> b -> [a] -> b
foldl' f = go where
  go output [] = output
  go output (x:xs) =
    let new_output = f output x
    in seq new_output (go new_output xs)
So we can further rewrite our count function as
count el list = foldl' f 0 list where
  f output x = if x == el
                 then 1 + output
                 else output
This is good, but we can actually improve even further on this code by breaking up the count step into two parts.
count el list should be the number of times el occurs in list. We can break this computation up into two conceptual steps. First, construct the list list', which consists of all the elements in list which are equal to el. Then, compute the length of list'.
In code:
count el list = length (filter (el ==) list)
This is, in my view, the most readable version yet. And it is also just as efficient as the foldl' version of count because of laziness. Here, Haskell's length function takes care of finding the optimal way to do the counting part of count, while the filter (el ==) takes care of the part of the loop where we check whether to increment output. In general, if you're iterating over a list and have an if P x statement, you can very often replace this with a call to filter P.
We can rewrite this one more time in "point-free style" as
count el = length . filter (el ==)
which is most likely how the function would be written in a library. . refers to function composition. The meaning of this is as follows:
To apply the function count el to a list, we first filter the list to keep only the elements that are equal to el, and then take the length.
Incidentally, the filter function is exactly what we need to write moreThan compactly:
moreThan a list = filter occursOften list where
  occursOften x = count x list >= a
Moral of the story: use higher-order functions whenever possible.
Whenever you solve a list problem in Haskell, the first tool you should reach for is functions defined in Data.List, especially map, foldl'/foldr, filter, and concatMap. Most list problems come down to map/fold/filter. These should be your go-to replacement for loops. If you're replacing a nested loop, you should use concatMap.
in a functional way, ;)
moreThan n xs = nub $ concat [ x | x <- ( group(sort(xs))), length x > n ]
... or in a fancy way, lol
moreThan n xs = map head [ x | x <- ( group(sort(xs))), length x > n ]
...
mt1 n xs = [ head x | x <- ( group(sort(xs))), length x > n ]

Iterate through list of tuples with strings and ints

So I have this function
getSales :: (Num p, Eq t) => t -> [(t, p)] -> p
getSales day [] = 0
getSales day ((x,y):xs) | (day == x) = y + (getSales day xs) | otherwise = getSales day xs
So basically if I did getSales "Mon" storelog
and storelog was storelog = [("Mon",50),("Fri",20), ("Tue",20),("Fri",10),("Wed",25),("Fri",30)]
it would return 50. But now I want to be able to iterate through a tuple like this
sales = [("Amazon",[("Mon",30),("Wed",100),("Sat",200)]), ("Etsy",[("Mon",50),("Tue",20),("Wed",25),("Fri",30)]), ("Ebay",[("Tue",60),("Wed",100),("Thu",30)]), ("Etsy",[("Tue",100),("Thu",50),("Sat",20),("Tue",10)])]
With the company name given and then I use getSales to find the sales for the day asked.
sumSales:: (Num p)=> String -> String -> [(String,[(String,p)])] -> p
This is the function I have for the iteration for the company name given and such but I am having a real hard time understanding how to iterate through the tuple for it to find "Amazon" for example and then pass in the list of the sales for the days.
You've actually already implemented most of the logic needed in your getSales function. I repeat it below for reference, tidied up so that the guards of the recursive case sit on separate lines; this is much more readable and is how it would normally be written in practice:
getSales :: (Num p, Eq t) => t -> [(t, p)] -> p
getSales day [] = 0
getSales day ((x,y):xs)
  | (day == x) = y + (getSales day xs)
  | otherwise = getSales day xs
For your sumSales function, you just need to repeat this pattern. The only important differences are:
the function takes two lookup keys (a company name and a day) instead of one, in addition to the list
the second element of each tuple, denoted by y above, is no longer simply a number but a list of tuples, so we can't simply add it to the result of the recursive call. However, it is a list of tuples each consisting of a day name and a number, which is exactly the input you've already dealt with in getSales. So the solution is to make use of that function you've already written.
Taking account of the above, we get:
sumSales :: (Num p) => String -> String -> [(String, [(String, p)])] -> p
sumSales company day [] = 0
sumSales company day ((x, y) : xs)
  | company == x = getSales day y + sumSales company day xs
  | otherwise = sumSales company day xs
which will work as I believe you intend.
However, I don't particularly like these two functions. They are very "noisy", with a lot of details to process before you can understand what the functions do - and further, the important parts of the two functions are more or less the same. In particular, one common way in functional programming to write functions like these which "summarise" some information about a list is to use a "fold". So you could rewrite the functions in this way, which uses foldr to abstract away the common recursion pattern in both:
getSales day = foldr (\(x, y) sum -> if x == day then y + sum else sum) 0
sumSales company day = foldr (\(x, sales) sum -> if x == company then getSales day sales + sum else sum) 0
That is, in my opinion, better - but still not great. There's still repetition across the functions which we're using to fold in each case - both simply check a condition and use it to either add a new term or not.
The Haskell standard library - in fact, even the Prelude, the set of functions/types/classes available by default in every Haskell program, without any imports - already includes some functions for dealing with these things. In particular, there is sum for adding a list of numbers, filter for reducing a list to just those elements fulfilling a particular condition, and map for getting a new list from an old one by applying a function to each element. map, filter and the various folds are really the "bread and butter" of functional programming with lists, so I would advise becoming familiar and comfortable with them.
Using these common tools, we can write the two functions in much more readily understandable form:
getSales day = sum . map snd . filter (\(x, _) -> x == day)
sumSales company day = sum . map (getSales day . snd) . filter (\(x, _) -> x == company)
(One could argue I've taken these too far in terms of being "point free" - although I could have gone further and written eg \(x, _) -> x == day as (== day) . fst. The "best" way to write a function, in terms of readability, is very much a matter of opinion. I just wanted to show you that there are other ways to write these functions, and other similar ones you may need in future, that will likely make more sense to you when you come back to look at them later.)

number_in_month exercise (Getting EQUALOP error in SML; one function works, the other does not)

(* Write a function number_in_month that takes a list
of dates and a month (i.e., an int) and
returns how many dates in the list are in the given month.*)
fun number_in_month(datelist : (int*int*int) list, month : int) =
if null(tl (datelist))
then if #2(hd (datelist)) = month then 1 else 0
else if #2(hd (datelist)) = month
then 1 + number_in_month(tl datelist, month)
else number_in_month(tl datelist, month)
(* Write a function number_in_months that takes a list of dates and a list of months
(i.e., an int list) and returns the number of dates in the list of dates that are
in any of the months in the list of months. Assume the list of months
has no number repeated. Hint: Use your answer to the previous problem. *)
fun number_in_months(datelist : (int*int*int) list, monthlist : int list)
if null(tl (monthlist))
then number_in_month(datelist, hd monthlist)
else number_in_month(datelist, hd monthlist)
+ number_in_months(datelist, tl monthlist)
The second function gives me this error when I try to compile it:
hw1.sml:42.5 Error: syntax error: inserting EQUALOP
[unexpected exception: Compile]
uncaught exception Compile [Compile: "syntax error"]
raised at: ../compiler/Parse/main/smlfile.sml:19.24-19.46
../compiler/TopLevel/interact/evalloop.sml:45.54
../compiler/TopLevel/interact/evalloop.sml:306.20-306.23
../compiler/TopLevel/interact/interact.sml:65.13-65.16
-
"syntax error: inserting EQUALOP" means that SML is expecting a = character.
The error messages from SML/NJ are one of the things that haven't improved one bit over the past twenty years. They often report what the parser does in order to try to recover from an error rather than what the error might be.
List recursion (and most everything else) is much nicer to write with pattern matching than with conditionals and selectors:
fun number_in_month ([], _) = 0
| number_in_month ((_, m, _)::ds, m') = (if m = m' then 1 else 0) + number_in_month(ds, m');
fun number_in_months (_, []) = 0
| number_in_months (ds, m::ms) = number_in_month(ds, m) + number_in_months(ds, ms);
This also lets SML let you know when you have forgotten a case, for instance the case of the empty list (which you forgot about).
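To make the forgotten case concrete: the question's number_in_month calls tl before checking whether the list is empty, so an empty date list raises Empty, while the pattern-matching version returns 0. A small sketch follows; the name number_in_month_orig is mine, used only so both versions can sit side by side.

```sml
(* The question's version, renamed so it can coexist with the rewrite. *)
fun number_in_month_orig (datelist : (int * int * int) list, month : int) =
  if null (tl datelist)
  then if #2 (hd datelist) = month then 1 else 0
  else if #2 (hd datelist) = month
  then 1 + number_in_month_orig (tl datelist, month)
  else number_in_month_orig (tl datelist, month)

(* The pattern-matching version from above. *)
fun number_in_month ([], _) = 0
  | number_in_month ((_, m, _) :: ds, m') =
      (if m = m' then 1 else 0) + number_in_month (ds, m')

(* tl [] raises Empty, so the original fails on an empty list. *)
val orig_raises = (number_in_month_orig ([], 2); false) handle Empty => true

(* The rewrite handles the empty list through its first clause. *)
val fixed_result = number_in_month ([], 2)
```

Here orig_raises evaluates to true and fixed_result to 0. The same gap exists in number_in_months, which calls tl monthlist on a possibly empty month list.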
Answer: Forgot the = sign. This is correct:
fun number_in_months(datelist : (int*int*int) list, monthlist : int list) =
if null(tl (monthlist))
then number_in_month(datelist, hd monthlist)
else number_in_month(datelist, hd monthlist)
+ number_in_months(datelist, tl monthlist)

number_in_month exercise (How to return nothing instead of an empty list in SML)

I am doing a programming assignment with SML. One of the functions requires me to return a list of triples of ints ((int * int * int) list) built from two other lists. The function goes through the dates and months to see if any of them coincide; if they do, it adds the date to the list. Here is the code for that.
fun dates_in_month (dates : (int * int * int) list, month : int) =
if null dates
then []
else
if #2 (hd dates) = month
then (hd dates) :: dates_in_month(tl dates, month)
else dates_in_month(tl dates, month)
fun dates_in_months (dates : (int * int * int) list, months : int list) =
if null months orelse null dates
then []
else
dates_in_month(dates, hd months) ::
dates_in_months(dates, tl months)
Using this code works to a point, however the function returns an (int * int * int) list list, instead of a (int * int * int) list. I think the problem lies with the
then [] statement. Any help would be appreciated.
The problem is not the then [], the problem lies here:
dates_in_month(dates, hd months) ::
dates_in_months(dates, tl months)
Here you take the result of dates_in_month(dates, hd months), which is a list, and use it as the first argument to ::. As you know, h :: t produces a list whose first element is h. So in this case you create a list whose first element is a list. That is you're creating a list of lists.
Since you don't want that, you shouldn't use ::. You can use @, which takes two lists as its operands and concatenates them. So while [1,2] :: [3,4] :: [] would produce [[1,2], [3,4]], [1,2] @ [3,4] @ [] will produce [1,2,3,4], which is what you want.
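Putting that together, here is a sketch of dates_in_months with @ (list append) in place of ::, keeping the rest of the question's code as it was. The sample dates are made up for illustration.

```sml
fun dates_in_month (dates : (int * int * int) list, month : int) =
  if null dates
  then []
  else if #2 (hd dates) = month
  then hd dates :: dates_in_month (tl dates, month)
  else dates_in_month (tl dates, month)

(* @ concatenates the two lists, so the result stays a flat list of dates. *)
fun dates_in_months (dates : (int * int * int) list, months : int list) =
  if null months orelse null dates
  then []
  else dates_in_month (dates, hd months) @ dates_in_months (dates, tl months)

val result =
  dates_in_months ([(2012, 2, 28), (2013, 12, 1), (2011, 3, 31)], [2, 3])
(* result = [(2012,2,28), (2011,3,31)] : (int * int * int) list *)
```

Note that the dates come out grouped by the order of the months list, not in their original order.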

How to get the oldest date from a list of dates in SML?

I'm having some troubles with this assignment. This is what the professor is asking for:
Write a function oldest that takes a list of dates and evaluates to an
(int*int*int) option. It evaluates to NONE if the list has no dates
and SOME d if the date d is the oldest date in the list.
I know how to create the function and have some idea on how to work with the list of dates, but I don't know how to "store" the oldest value to compare it with the tail of the list of dates. This is what I submitted (it doesn't work, it always retrieves the first date, but I would really love to know the answer)
fun oldest (datelist : (int * int * int) list) =
if null datelist
then NONE
else if null (tl datelist) then
SOME (hd datelist)
else let val date = if is_older (hd datelist, hd (tl datelist)) then SOME (hd datelist) else SOME (hd (tl datelist))
in oldest(tl datelist)
end
One way of keeping a value across recursive calls is to pass it along in an argument. As you can't change the signature of the original function, the most common solution is to have a helper function which takes this extra argument, and possibly others as well.
Such a helper function could take the head of the list and compare it with the extra argument, using the oldest of the two in the recursive call on the tail of the list. Then when the list is empty, you just return this extra argument, as it must be the oldest.
fun oldestOfTwo (d1, d2) = (* return the oldest/minimum of the two dates *)
  if is_older (d1, d2) then d1 else d2 (* is_older is the helper from the question *)
fun oldest [] = NONE
  | oldest (d::ds) =
    let
      fun oldest' max [] = SOME max
        | oldest' max (d::ds) =
            oldest' (oldestOfTwo (max, d)) ds
    in
      oldest' d ds
    end
Another solution could be to take out the two first elements of the list, putting back the oldest of the two, thus in each recursive call you remove one element of the list, and at some point there will only be one element in the list, which must be the oldest.
fun oldest [] = NONE
| oldest [d] = SOME d
| oldest (d1 :: d2 :: ds) = oldest (oldestOfTwo (d1, d2) :: ds)
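For completeness, here is a self-contained sketch that can be pasted into a REPL. The question does not show is_older, so the definition below is an assumption (true when the first date comes strictly before the second). The accumulator idea is expressed with the Basis function foldl rather than a hand-written helper, which is a swap of my own and not what either solution above does.

```sml
(* Assumed helper: true when the first date is strictly before the second.
   Dates are (year, month, day) triples. *)
fun is_older ((y1, m1, d1) : int * int * int, (y2, m2, d2) : int * int * int) =
  y1 < y2 orelse
  (y1 = y2 andalso (m1 < m2 orelse (m1 = m2 andalso d1 < d2)))

fun oldestOfTwo (a, b) = if is_older (a, b) then a else b

(* foldl passes (element, accumulator) to oldestOfTwo, so the accumulator
   always holds the oldest date seen so far. *)
fun oldest [] = NONE
  | oldest (d :: ds) = SOME (foldl oldestOfTwo d ds)

val example = oldest [(2012, 2, 28), (2011, 3, 31), (2011, 4, 28)]
(* example = SOME (2011,3,31) *)
```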