Looping through List and Adding to Tuple - list

I'm trying to create a function that loops through an array of Strings and adds each word to a new list of tuples that counts how many times each word occurs in a block of text. In an OO language, this is simple: create a KV pair for each word and the number of times it occurs. I'm trying to translate that code into Haskell, but I don't think it's that straightforward.
countWords:: [String] -> [(String, Int)]
I know I need to create a list of tuples but I'm not sure how to loop through the list passed to the function using recursion.

A pretty direct translation of what you seem to be saying you'd do in OO would be to “loop” for each word through the list recursively and either update the entry that has it already, or append it as a new one:
registerWord :: String -> [(String, Int)] -> [(String, Int)]
registerWord w ((w',c):ws)
  | w==w' = (w,c+1) : ws
  | otherwise = (w',c) : registerWord w ws
registerWord w [] = [(w,1)]
Then do that for every given word, each time updating the register. This is easily done with a fold:
countWords :: [String] -> [(String, Int)]
countWords = foldr registerWord []
This list-inserting is awkward though, and inefficient (both in FP and OO), namely O(n²). A much nicer approach is to think functional-modularly: you effectively want to group equal words together. For this, you first need to sort them, so equal words are actually adjacent. Then you need to replace each group of duplicates with a single example and the count. This gives a nice functional pipeline:
import Data.List (group, sort)
countWords :: [String] -> [(String, Int)]
countWords = map (\gp@(w:_) -> (w, length gp)) . group . sort
Incidentally, there's nothing in this function that requires the keys to be “words” / strings, so you might as well generalise the signature to
countWords :: Ord a => [a] -> [(a, Int)]
(The other, inefficient approach would be even more general, requiring only Eq.)
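To make the comparison concrete, here is a minimal sketch that puts both approaches side by side with the generalised signatures. The names `registerItem`, `countEq` and `countOrd` are made up for this illustration and do not come from the answer itself.

```haskell
import Data.List (group, sort)

-- Eq-only version: walk the register, bump an existing entry or append a new one.
-- Quadratic in the worst case, because every item may traverse the whole register.
registerItem :: Eq a => a -> [(a, Int)] -> [(a, Int)]
registerItem x [] = [(x, 1)]
registerItem x ((y, c) : rest)
  | x == y    = (y, c + 1) : rest
  | otherwise = (y, c) : registerItem x rest

countEq :: Eq a => [a] -> [(a, Int)]
countEq = foldr registerItem []

-- Ord version: sort so that equal items are adjacent, group them, count each group.
-- group never produces empty lists, so the (w:_) pattern always matches.
countOrd :: Ord a => [a] -> [(a, Int)]
countOrd = map (\gp@(w:_) -> (w, length gp)) . group . sort
```

The two functions return the same counts, but only `countOrd` returns them in sorted key order; `countEq` lists the keys in the order in which the fold first registers them.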

Related

Haskell List with tuples which can be extended - like a Dictionary

I am a beginner in Haskell and trying to learn it, so please excuse my obliviousness.
I am currently trying to implement a Telephone book, which is a List of tuples [(Name, Number)] (Both are Strings).
type TelephoneBook = [(String),(String)] (?)
However, I have no clue how I can extend this list by another tuple.
For example: [("Fred", "47/273")] and now I want to add another tuple.
I was trying to understand how the module dictionary works to see how I can extend this List and I stumbled upon "data" and "type".
An idea I had was to create a several types of this TelephonBook:
let a = TelephoneBook ("Fred","42/2321")
but that is just a simple idea... I am kinda lost on how to extend this list by another tuple, taking into account that once something is defined it can't be altered (or can it).
(Please don't give the solution to the Problem but simply an idea on how to start or what I should Research further)
The (:) operator prepends elements to lists. For example:
> ("John", "555-1212") : [("Fred", "42/2321")]
[("John","555-1212"),("Fred","42/2321")]
Because you're asking to extend a list:
I have to disappoint you, that's not possible in Haskell. You can only construct a new list out of one element and another list.
The list type in Haskell is defined similarly to:
-- 1 2 3 4
data [a] = a : [a] | []
-- 1: if you encounter the type [a]
-- 3: it is either
-- 2: an element `e` and a list `l` forming the new list `e:l`
-- 4: or an empty List `[]`
-- so the types of the constructors are:
-- (:) :: a -> [a] -> [a]
-- [] :: [a]
So having a new element and a list you can construct a new one, using (:)!
type Entry = (String, String)
type Book = [Entry]
addEntry :: Entry -> Book -> Book
addEntry e b = e : b -- this works, because book is just a list
-- without type aliases: (this is the same, but maybe slightly less nice to read)
addEntry' :: (String, String) -> [(String, String)] -> [(String, String)]
addEntry' e b = e : b
-- or even simpler:
addEntry'' = (:)
The type keyword in Haskell has to be understood as a type alias, so it's just another name for something, the representation in Haskell is the same.
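As a small illustration of "constructing a new list instead of changing the old one", the sketch below builds two books and looks a name up. `emptyBook` and `findNumber` are hypothetical helpers added for this example; `lookup` is the standard Prelude function for association lists.

```haskell
type Entry = (String, String)
type Book = [Entry]

-- A book with no entries yet.
emptyBook :: Book
emptyBook = []

-- Prepending an entry yields a new book; the old book is left untouched.
addEntry :: Entry -> Book -> Book
addEntry = (:)

-- Look up the number stored for a name, if there is one.
findNumber :: String -> Book -> Maybe String
findNumber = lookup

-- Two separate values: book2 shares book1 as its tail, book1 itself is unchanged.
book1, book2 :: Book
book1 = addEntry ("Fred", "47/273") emptyBook
book2 = addEntry ("John", "555-1212") book1
```

After these definitions `book1` still contains only Fred, while `book2` contains John followed by Fred.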

How to add tuples to list after reading from a text file in Haskell

I am trying to create a program in Haskell that reads text from a text file and adds them to a list.
My idea is:
type x = [(String, Integer)]
where the String is each word from the text and Integer is how many times that word occurs in the text. So I want to create a tuple of those values and add it to a list. I then want to print the contents of the list.
I know how to read a text file in Haskell, but am unsure as to what to do next. I am new to programming in Haskell and have predominantly been programming in Java which is very different.
EDIT:
This is what I have so far from the suggestions. I am able to write to an output text file with the text received from the file and make it lower case. The issue I am having is with the other functions, because the compiler says:
Test.hs:14:59: Not in scope: ‘group’
Here is the code:
import System.IO
import Data.Char(toLower)
main = do
    contents <- readFile "testFile.txt"
    let lowContents = map toLower contents
    let outStr = countWords (lowContents)
    let finalStr = sortOccurrences (outStr)
    print outStr
-- Counts all the words
countWords :: String -> [(String, Int)]
countWords fileContents = countOccurrences (toWords fileContents)
-- Split words
toWords :: String -> [String]
toWords s = words s
-- Counts, how often each string in the given list appears
countOccurrences :: [String] -> [(String, Int)]
countOccurrences xs = map (\xs -> (head xs, length xs)) . group . sortOccurrences xs
-- Sort list in order of occurrences.
sortOccurrences :: [(String, Int)] -> [(String, Int)]
sortOccurrences sort = sortBy sort (comparing snd)
Please can anyone help me with this.
Haskell features a fairly expressive type system (much more so than Java), so it's a good idea to consider this issue purely in terms of types, in a top-down fashion. You mentioned that you already know how to read a text file in Haskell, so I'll assume you know how to get a String which holds the file contents.
The function you'd like to define is something like this. For now, we'll set the definition to undefined so that the code typechecks (but yields an exception at runtime):
countWords :: String -> [(String, Int)]
countWords fileContents = undefined
Your function maps a String (the file contents) to a list of tuples, each of which associates some word with the count of how often that word appeared in the input. This suggests that one part of the solution will be a function which splits a string into a list of words, so that you can then process that list to count the words. I.e. you'll want something like this:
-- Splits a string into a list of words
toWords :: String -> [String]
toWords s = undefined
-- Counts, how often each string in the given list appears
countOccurrences :: [String] -> [(String, Int)]
countOccurrences xs = undefined
With these at hand, you can actually define the original function:
countWords :: String -> [(String, Int)]
countWords fileContents = countOccurrences (toWords fileContents)
You have now nicely decomposed the problem into two sub-problems.
Another nice aspect of this type-driven programming is that Hoogle can be told to go look for functions of a given type. For instance, consider the type of the toWords function we sketched earlier:
toWords :: String -> [String]
toWords s = undefined
Feeding this to Hoogle reveals a nice function: words which seems to do just what we want! So we can define
toWords :: String -> [String]
toWords s = words s
The only thing missing is coming up with an appropriate definition for countOccurrences. Alas, searching for this type on Hoogle doesn't show any ready-made solutions. However, there are three functions which will be useful for coming up with our own definition: sort, group and map (sort and group live in Data.List, so they need an import).
The sort function does what the name suggests: it sorts a list of things:
λ: sort [1,1,1,2,2,1,1,3,3]
[1,1,1,1,1,2,2,3,3]
The group function groups consecutive(!) equal elements, yielding a list of lists. E.g.
λ: group [1,1,1,1,1,2,2,3,3]
[[1,1,1,1,1],[2,2],[3,3]]
The map function can be used to turn the list of lists produced by group into a list of tuples, giving the length of each group:
λ: map (\xs -> (head xs, length xs)) [[1,1,1,1,1],[2,2],[3,3]]
[(1,5),(2,2),(3,2)]
Composing these three functions allows you to define
countOccurrences :: [String] -> [(String, Int)]
countOccurrences xs = map (\xs -> (head xs, length xs)) . group . sort $ xs
Now you have all the pieces in place. Your countWords is defined in terms of toWords and countOccurrences, each of which has a proper definition.
The nice thing about this type-driven approach is that writing down the function signatures will help both your thinking and the compiler (which catches you when you violate assumptions). You also automatically decompose the problem into smaller problems, each of which you can test independently in ghci.
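Putting the pieces together, a compiling version of the module could look roughly like the sketch below. The "Not in scope: 'group'" error from the question goes away once `Data.List` is imported; `sortBy` and `comparing` need imports as well. Lower-casing inside `countWords` is an assumption carried over from the asker's `main`, not something this answer prescribes.

```haskell
import Data.Char (toLower)
import Data.List (group, sort, sortBy)
import Data.Ord (comparing)

-- Split the text into whitespace-separated words.
toWords :: String -> [String]
toWords = words

-- Count how often each string in the given list appears.
-- group never yields an empty list, so head is safe here.
countOccurrences :: [String] -> [(String, Int)]
countOccurrences = map (\ws -> (head ws, length ws)) . group . sort

-- Order the (word, count) pairs by ascending count.
sortOccurrences :: [(String, Int)] -> [(String, Int)]
sortOccurrences = sortBy (comparing snd)

-- Lower-case the text, split it into words, then count them.
countWords :: String -> [(String, Int)]
countWords = countOccurrences . toWords . map toLower
```

Each function can be loaded into ghci and tried on a short string before wiring it up to `readFile` in `main`.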
Data.Map is the easiest way to do this.
import qualified Data.Map as M
-- assuming you already have your list of words:
listOfWords :: [String]
-- you can generate your list of tuples with this
listOfTuples :: [(String, Integer)]
listOfTuples = M.toList . M.fromListWith (+) $ zip listOfWords (repeat 1)

Grouping a list into lists of n elements in Haskell

Is there an operation on lists in library that makes groups of n elements? For example: n=3
groupInto 3 [1,2,3,4,5,6,7,8,9] = [[1,2,3],[4,5,6],[7,8,9]]
If not, how do I do it?
A quick search on Hoogle showed that there is no such function in the standard libraries. On the other hand, another answer points out that there is one in the split package, called chunksOf.
However, you can do it on your own
group :: Int -> [a] -> [[a]]
group _ [] = []
group n l
  | n > 0 = (take n l) : (group n (drop n l))
  | otherwise = error "Negative or zero n"
Of course, some parentheses can be removed; I left them here to make it easier to understand what the code does:
The base case is simple: whenever the list is empty, simply return the empty list.
The recursive case tests first if n is positive. If n is 0 or lower we would enter an infinite loop and we don't want that. Then we split the list into two parts using take and drop: take gives back the first n elements while drop returns the other ones. Then, we add the first n elements to the list obtained by applying our function to the other elements in the original list.
This function, among other similar ones, can be found in the popular split package.
> import Data.List.Split
> chunksOf 3 [1,2,3,4,5,6,7,8,9]
[[1,2,3],[4,5,6],[7,8,9]]
You can write one yourself, as Mihai pointed out. But I would use the splitAt function since it doesn't require two passes on the input list like the take-drop combination does:
chunks :: Int -> [a] -> [[a]]
chunks _ [] = []
chunks n xs =
    let (ys, zs) = splitAt n xs
    in  ys : chunks n zs
This is a common pattern - generating a list from a seed value (which in this case is your input list) by repeated iteration. This pattern is captured in the unfoldr function. We can use it with a slightly modified version of splitAt (thanks Will Ness for the more concise version):
chunks n = takeWhile (not . null) . unfoldr (Just . splitAt n)
That is, using unfoldr we generate chunks of n elements while at the same time we shorten the input list by n elements, and we generate these chunks until we get the empty list -- at this point the initial input is completely consumed.
Of course, as the others have pointed out, you should use the already existing function from the split package. But it's always good to familiarize yourself with the list processing functions in the standard Haskell libraries.
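For comparison, here is a self-contained sketch of the two hand-written variants. The names `chunksRec` and `chunksUnfold` are invented for this example, and the explicit guard against a non-positive `n` in the recursive version is an addition: without it, the `splitAt` recursion above would not terminate for `n <= 0` on a non-empty list.

```haskell
import Data.List (unfoldr)

-- Explicit recursion with splitAt; rejects n <= 0 up front.
chunksRec :: Int -> [a] -> [[a]]
chunksRec n xs
  | n <= 0    = error "chunksRec: n must be positive"
  | null xs   = []
  | otherwise = let (ys, zs) = splitAt n xs
                in  ys : chunksRec n zs

-- unfoldr keeps splitting forever; takeWhile cuts the stream off at the
-- first empty chunk, which appears once the input is used up.
chunksUnfold :: Int -> [a] -> [[a]]
chunksUnfold n = takeWhile (not . null) . unfoldr (Just . splitAt n)
```

Both versions leave a shorter final chunk when the length of the list is not a multiple of `n`.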
This is often called "chunk" and is one of the most frequently mentioned list operations that is not in base. The split package provides such an operation, though; copying and pasting from the Haddock documentation:
> chunksOf 3 ['a'..'z']
["abc","def","ghi","jkl","mno","pqr","stu","vwx","yz"]
Additionally, against my wishes, hoogle only searches a small set of libraries (those provided with GHC or perhaps HP), but you can explicitly add packages to the search using +PKG_NAME - hoogle with Int -> [a] -> [[a]] +split gets what you want. Some people use Hayoo for this reason.

Haskell - Convert x number of tuples into a list [duplicate]

I have a question about tuples and lists in Haskell. I know how to add input into a tuple a specific number of times. Now I want to add tuples into a list an unknown number of times; it's up to the user to decide how many tuples they want to add.
How do I add tuples into a list x number of times when I don't know X beforehand?
There's a lot of things you could possibly mean. For example, if you want a few copies of a single value, you can use replicate, defined in the Prelude:
replicate :: Int -> a -> [a]
replicate 0 x = []
replicate n x | n < 0     = undefined
              | otherwise = x : replicate (n-1) x
In ghci:
Prelude> replicate 4 ("Haskell", 2)
[("Haskell",2),("Haskell",2),("Haskell",2),("Haskell",2)]
Alternately, perhaps you actually want to do some IO to determine the list. Then a simple loop will do:
getListFromUser = do
    putStrLn "keep going?"
    s <- getLine
    case s of
        'y':_ -> do
            putStrLn "enter a value"
            v <- readLn
            vs <- getListFromUser
            return (v:vs)
        _ -> return []
In ghci:
*Main> getListFromUser :: IO [(String, Int)]
keep going?
y
enter a value
("Haskell",2)
keep going?
y
enter a value
("Prolog",4)
keep going?
n
[("Haskell",2),("Prolog",4)]
Of course, this is a particularly crappy user interface -- I'm sure you can come up with a dozen ways to improve it! But the pattern, at least, should shine through: you can use values like [] and functions like : to construct lists. There are many, many other higher-level functions for constructing and manipulating lists, as well.
P.S. There's nothing particularly special about lists of tuples (as compared to lists of other things); the above functions display that by never mentioning them. =)
Sorry, you can't¹. There are fundamental differences between tuples and lists:
A tuple always has a finite number of elements, which is known at compile time. Tuples with different numbers of elements are actually different types.
Lists can have as many elements as they want. The number of elements in a list doesn't need to be known at compile time.
A tuple can have elements of arbitrary types. Since the way you can use tuples always ensures that there is no type mismatch, this is safe.
On the other hand, all elements of a list have to have the same type. Haskell is a statically-typed language; that basically means that all types are known at compile time.
For these reasons, you can't. If it is not known how many elements will go into the tuple, you can't give it a type.
I guess that the input you get from your user is actually a string like "(1,2,3)". Try to turn this directly into a list, without making it a tuple first. You can use pattern matching for this, but here is a slightly sneaky approach. I just remove the opening and closing parentheses from the string and replace them with brackets, and voilà, it becomes a list.
tuplishToList :: String -> [Int]
tuplishToList str = read ('[' : tail (init str) ++ "]")
Edit
Sorry, I did not see your latest comment. What you are trying to do is not that difficult. I use these simple functions for the task:
words str splits str into a list of words that were separated by whitespace. The output is a list of Strings. Caution: This only works if the string inside your tuple contains no whitespace. Implementing a better solution is left as an exercise to the reader.
map f lst applies f to each element of lst
read is a magic function that makes a data type from a String. It only works if the expected output type is known in advance. If you really want to understand how that works, consider implementing read for your specific use case.
And here you go:
tuplish2List :: String -> [(String,Int)]
tuplish2List str = map read (words str)
1 As some others may point out, it may be possible using templates and other hacks, but I don't consider that a real solution.
When doing functional programming, it is often better to think about composition of operations instead of individual steps. So instead of thinking about it like adding tuples one at a time to a list, we can approach it by first dividing the input into a list of strings, and then converting each string into a tuple.
Assuming the tuples are written each on one line, we can split the input using lines, and then use read to parse each tuple. To make it work on the entire list, we use map.
main = do input <- getContents
          let tuples = map read (lines input) :: [(String, Integer)]
          print tuples
Let's try it.
$ runghc Tuples.hs
("Hello", 2)
("Haskell", 4)
Here, I press Ctrl+D (or Ctrl+Z on Windows) to send EOF to the program, and it prints the result.
[("Hello",2),("Haskell",4)]
If you want something more interactive, you will probably have to do your own recursion. See Daniel Wagner's answer for an example of that.
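One caveat with `read` is that it aborts the whole program on the first malformed line. A more forgiving variant, sketched here as an assumption rather than as part of the answer above, uses `readMaybe` from `Text.Read` and simply skips lines that do not parse. The name `parseTuples` is hypothetical.

```haskell
import Data.Maybe (mapMaybe)
import Text.Read (readMaybe)

-- Parse one tuple per line; lines that are not valid (String, Int)
-- tuples are dropped instead of crashing the program.
parseTuples :: String -> [(String, Int)]
parseTuples = mapMaybe readMaybe . lines
```

In `main`, this could replace `map read (lines input)`, with the type annotation moved into the signature of `parseTuples`.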
One simple solution to this would be to use a list comprehension, as so (done in GHCi):
Prelude> let fstMap tuplist = [fst x | x <- tuplist]
Prelude> fstMap [("String1",1),("String2",2),("String3",3)]
["String1","String2","String3"]
Prelude> :t fstMap
fstMap :: [(t, b)] -> [t]
This will work for an arbitrary number of tuples - as many as the user wants to use.
To use this in your code, you would just write:
fstMap :: Eq a => [(a,b)] -> [a]
fstMap tuplist = [fst x | x <- tuplist]
The example I gave is just one possible solution. As the name implies, of course, you can just write:
fstMap' :: Eq a => [(a,b)] -> [a]
fstMap' = map fst
This is an even simpler solution.
I'm guessing that, since this is for a class, and you've been studying Haskell for < 1 week, you don't actually need to do any input/output. That's a bit more advanced than you probably are, yet. So:
As others have said, map fst will take a list of tuples, of arbitrary length, and return the first elements. You say you know how to do that. Fine.
But how do the tuples get into the list in the first place? Well, if you have a list of tuples and want to add another, (:) does the trick. Like so:
oldList = [("first", 1), ("second", 2)]
newList = ("third", 2) : oldList
You can do that as many times as you like. And if you don't have a list of tuples yet, your list is [].
Does that do everything that you need? If not, what specifically is it missing?
Edit: With the corrected type:
Eq a => [(a, b)]
That's not the type of a function. It's the type of a list of tuples. Just have the user type yourFunctionName followed by [ ("String1", val1), ("String2", val2), ... ("LastString", lastVal)] at the prompt.
