Haskell comparing characters in a word

I'm trying to write a function that counts how many characters of one word also appear in another.
The characters are non-repeating A-Z, and the position of the letters doesn't matter.
INPUT target = "ABCDE", attempt = "CDXYZ"
OUTPUT match = 2 letters (C & D)
I've managed to write a function that checks for one character in a word, but I have no clue how to make it check every character.
import Data.List
--check if a char appears in a [char]
checkForCharInAWord :: [Char] -> Char -> Bool
checkForCharInAWord wrd lt = elem lt wrd
compareChars :: [Char] -> [Char] -> Int
compareChars = ?????
I would also like to know how to count the matching characters when the words contain repeated characters; the position of the letter still doesn't matter. For example:
INPUT target = "AAAB", attempt = "ABBB" OUTPUT match = 2 letters (A & B)
INPUT target = "AAAB", attempt = "AABB" OUTPUT match = 3 letters (A, A & B)
And finally, how to count the matching characters in a word when position is also taken into consideration:
INPUT target = "NICE", attempt = "NEAR" OUTPUT match = 1 letter (N) -- correct letter in the correct position
INPUT target = "BBBB", attempt = "BABA" OUTPUT match = 2 letters (B, B) -- correct letter in the correct position
In each case I need just a simple Int output: a similarity rating between 0 and the number of letters in the target word.
The method should be flexible enough to allow words of different lengths (however, the target and attempt words will always be of equal length).
Thanks!

The function below seems to work for point 3. It takes two parameters (w1 and w2), the two words you want to compare, and pairs up the corresponding letters with the built-in zip function (where xs = zip w1 w2). A list comprehension with a guard then goes through this list of 2-tuples and compares the two elements of each tuple: if they match, it adds a 1 to the resulting list, otherwise it adds nothing. Finally, it sums all elements of that list (sum [1 | x<-xs, fst x == snd x]).
match :: [Char] -> [Char] -> Int
match w1 w2 = sum [1 | x<-xs, fst x == snd x]
  where xs = zip w1 w2
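Points 1 and 2 can probably be handled by a single function as well. The following is only a sketch, not a definitive solution: the names matchAnyPosition and matchSamePosition are made up for illustration, and it assumes that a repeated letter should count once for every pairing it can form (so "AAAB" against "AABB" gives 3). It relies on the list difference operator (\\) from Data.List, which removes one occurrence from its left argument for each element of its right argument.

```haskell
import Data.List ((\\))

-- Points 1 and 2 (hypothetical name): position is ignored.
-- target \\ attempt drops one letter of target for every letter of attempt
-- that can be paired with it, so the number of dropped letters is the score.
matchAnyPosition :: String -> String -> Int
matchAnyPosition target attempt = length target - length (target \\ attempt)

-- Point 3 (hypothetical name): a letter counts only when it sits at the
-- same index in both words.
matchSamePosition :: String -> String -> Int
matchSamePosition target attempt =
  length (filter id (zipWith (==) target attempt))
```

With the examples from the question, matchAnyPosition "ABCDE" "CDXYZ" gives 2, matchAnyPosition "AAAB" "AABB" gives 3, and matchSamePosition "BBBB" "BABA" gives 2.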

Related

What is the idiomatic Haskell way to check if one list contains all the values in another list?

My question is similar to How can i use haskell to check if a list contains the values in a tuple, but in my case, the list with values which must all be in my original list is the alphabet. It feels really messy to have 26 calls to elem, all ANDed together.
Is there any more concise way to check if a given list contains all of the elements in another list, where the other list is 26 items long?

I like the answer by user1984, but it leaves the handling of non-alphabetic characters somewhat implicit. Here's one that's almost as simple, but doesn't need extra code to handle non-alphabetic characters. The basic idea is to go in the other direction: their answer builds up a set, whereas this one tears one down by deleting one character at a time. It is also O(n), like theirs.
import qualified Data.Set as S
isPangram :: String -> Bool
isPangram = S.null . foldr S.delete (S.fromList ['a'..'z'])
If you want case insensitivity, you can import Data.Char and change S.delete to (S.delete . toLower).
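Spelled out, the case-insensitive variant might look like the sketch below. The name isPangramCI is only chosen here to keep it apart from the version above; everything else follows the change just described.

```haskell
import qualified Data.Set as S
import Data.Char (toLower)

-- Same teardown idea as above, but every character is lowercased
-- before it is deleted from the set of still-missing letters.
isPangramCI :: String -> Bool
isPangramCI = S.null . foldr (S.delete . toLower) (S.fromList ['a'..'z'])
```

For instance, isPangramCI "The quick brown fox jumps over the lazy dog" should be True even though the sentence starts with a capital T.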

This is based on Daniel Wagner's answer, but it finishes early if all the letters are found, whereas his walks the whole list, then works backwards deleting elements, then gives an answer when it gets back to the beginning.
import qualified Data.Set as S
import Data.Set (Set)
letters :: Set Char
letters = S.fromList ['a'..'z']
isPangram :: String -> Bool
isPangram xs0 = go xs0 letters
  where
    go :: String -> Set Char -> Bool
    go [] letters = S.null letters
    go (c : cs) letters =
      S.null letters || go cs (S.delete c letters)
You may be wondering why I match on the string first and check on letters under the match, instead of the other way around. Well, that's only because it's fun to rewrite this using foldr:
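isPangram :: String -> Bool
isPangram xs0 = foldr step S.null xs0 letters
  where
    -- step receives the current character, the continuation for the rest
    -- of the string, and the set of letters still missing
    step :: Char -> (Set Char -> Bool) -> Set Char -> Bool
    step c r letters =
      S.null letters || r (S.delete c letters)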
For something like the alphabet, Data.Set is not actually the most efficient way. It's better to use something like IntSet, which is much more compact and significantly faster.
import qualified Data.IntSet as S
import Data.IntSet (IntSet)
letters :: IntSet
letters = S.fromList (map fromEnum ['a'..'z'])
isPangram :: String -> Bool
isPangram xs0 = foldr step S.null xs0 letters
  where
    step :: Char -> (IntSet -> Bool) -> IntSet -> Bool
    step c r letters =
      S.null letters || r (S.delete (fromEnum c) letters)

My understanding is that you are trying to check whether a string is a pangram, meaning it contains all the [insert your preferred language] alphabet.
The following code uses Data.Set. A set data structure can't contain duplicate elements by definition. We use this property to count the number of elements in the set after adding all the characters from the string to it. If that count equals the number of letters in the alphabet we are working with, the string contains all the characters of the alphabet, so it's a pangram.
Note that this code does not take care of the difference between lower and upper case letters and also counts white space and punctuation. This would give you wrong answers if there are any of these cases. With some tweaking you can make it work though.
The time complexity of this solution is n * log 26 which is just n in Big O, as Daniel Wagner pointed out. The reason is that while Haskell's default set is a balanced tree, the number of elements in the set never goes above 26, so it becomes linear. Using a hashing implementation of the set may result in some optimization for very large strings. The space complexity is constant since the additional space we are using never exceeds 26 characters.
import qualified Data.Set as S
isPangram :: [Char] -> Bool
isPangram s = length (S.fromList s) == 26
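One possible version of that tweaking is sketched below, assuming the alphabet is the 26 ASCII letters. The name isPangramLoose is hypothetical; the functions isAscii, isAlpha and toLower come from Data.Char.

```haskell
import qualified Data.Set as S
import Data.Char (isAscii, isAlpha, toLower)

-- Keep only ASCII letters, fold upper case into lower case, and then
-- count the distinct letters that remain.
isPangramLoose :: String -> Bool
isPangramLoose s = S.size letters == 26
  where
    letters = S.fromList [toLower c | c <- s, isAscii c, isAlpha c]
```

White space, digits and punctuation are dropped before counting, so they can no longer push the set size to 26 by accident.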

To answer your question literally,
allPresent :: Eq a => [a] -> [a] -> Bool
allPresent alphabet xs = foldr (\a r -> elem a xs && r) True alphabet
is concise but inefficient code that calls elem as many times as there are letters in your alphabet and "AND"s the results.
For example, for a three-letter alphabet a, b, and c, it will be equivalent to an &&-chain of three calls to elem and a True:
allPresent [a, b, c] xs
==
elem a xs && elem b xs && elem c xs && True
It will stop as soon as the first missing letter is detected, returning False. But it will potentially comb through the whole of xs anew for each letter of the alphabet, which is inefficient.

If you are in fact looking for a pangram, you could try this solution, which I came up with.
import Data.Char ( toLower )
isPangram :: String -> Bool
isPangram text = all isInText ['a'..'z'] where
  isInText x = x `elem` map toLower text

How to count the number of recurring character repetitions in a char list?

My goal is to take a char list like:
['a'; 'a'; 'a'; 'a'; 'a'; 'b'; 'b'; 'b'; 'a'; 'd'; 'd'; 'd'; 'd']
Count the number of repeated characters and transform it into a (int * char) list like this:
[(5, 'a'); (3, 'b'); (1, 'a'); (4, 'd')]
I am completely lost and also am very very new to OCaml. Here is the code I have rn:
let to_run_length (lst : char list) : (int * char) list =
match lst with
| [] -> []
| h :: t ->
let count = int 0 in
while t <> [] do
if h = t then
count := count + 1;
done;
I am struggling on how to check the list like you would an array in C or Python. I am not allowed to use fold functions or map or anything like that.
Edit: Updated code, yielding an exception on List.nth:
let rec to_run_length (lst : char list) : (int * char) list =
let n = ref 0 in
match lst with
| [] -> []
| h :: t ->
if h = List.nth t 0 then n := !n + 1 ;
(!n, h) :: to_run_length t ;;
Edit: Added nested match resulting in a function that doesn't work... but no errors!
let rec to_run_length (lst : char list) : (int * char) list =
match lst with
| [] -> []
| h :: t ->
match to_run_length t with
| [] -> []
| (n, c) :: tail ->
if h <> c then to_run_length t
else (n + 1, c) :: tail ;;
Final Edit: Finally got the code running perfect!
let rec to_run_length (lst : char list) : (int * char) list =
match lst with
| [] -> []
| h :: t ->
match to_run_length t with
| (n, c) :: tail when h = c -> (n + 1, h) :: tail
| tail -> (1, h) :: tail ;;

One way to answer your question is to point out that a list in OCaml isn't like an array in C or Python. There is no (constant-time) way to index an OCaml list like you can an array.
If you want to code in an imperative style, you can treat an OCaml list like a list in C, i.e., a linked structure that can be traversed in one direction from beginning to end.
To make this work you would indeed have a while statement that continues only as long as the list is non-empty. At each step you examine the head of the list and update your output accordingly. Then replace the list with the tail of the list.
For this you would want to use references for holding the input and output. (As a side comment, where you have int 0 you almost certainly wanted ref 0. I.e., you want to use a reference. There is no predefined OCaml function or operator named int.)
However, the usual reason to study OCaml is to learn functional style. In that case you should be thinking of a recursive function that will compute the value you want.
For that you need a base case and a way to reduce a non-base case to a smaller case that can be solved recursively. A pretty good base case is an empty list. The desired output for this input is (presumably) also an empty list.
Now assume (by recursion hypothesis) you have a function that works, and you are given a non-empty list. You can call your function on the tail of the list, and it (by hypothesis) gives you a run-length encoded version of the tail. What do you need to do to this result to add one more character to the front? That's what you would have to figure out.
Update
Your code is getting closer, as you say.
You need to ask yourself how to add a new character to the beginning of the encoded value. In your code you have this, for example:
. . .
match to_run_length t with
| [] -> []
. . .
This says to return an empty encoding if the tail is empty. But that doesn't make sense. You know for a fact that there's a character in the input (namely, h). You should be returning some kind of result that includes h.
In general if the returned list starts with h, you want to add 1 to the count of the first group. Otherwise you want to add a new group to the front of the returned list.

List of tuples in haskell

Given a list of tuples asd :: [(Char, Char)], I want to write a function that takes in a string and returns the same string, with characters matching the first element of tuples in asd replaced with the corresponding second element.
For example, with asd = [('a', 'b'), ('c', 'd')], with input "ac", it should return "bd".
I also want to write a function that does the reverse, which when given input "bd" should return "ac".
I have a solution, but I am not allowed to use list comprehensions or recursion.
This is my solution:
xyz :: String -> String
xyz x = concat (map y x) where y ys = [a | (b,a) <- asd, ys == b]
zyx :: String -> String
zyx x = concat (map y x) where y ys = [a | (a,b) <- asd, ys == b]
How can I write this without recursion or list comprehensions?

Pending clarification regarding exact specification in comments, I am assuming that you want to replace characters while keeping non-matching as-is.
To apply a function to each element of a list (a String is just [Char]), use map :: (a -> b) -> [a] -> [b]. Hence, you will need to write a map on the input string like so
xyz :: String -> String
xyz = map replace
This replace function should run through the list of tuples asd, and find the first tuple that matches. (I'm making an assumption here as it wasn't specified how to handle having more than one matching tuple.) To do this, we make use of filter :: (a -> Bool) -> [a] -> [a] to find which tuples are matching.
replace :: Char -> Char
replace c = case filter ((== c) . fst) asd of
  ((_, c'):_) -> c' -- found a replacement
  [] -> c -- no match, don't replace character
where ((== c) . fst) compares the first element of the tuple to the character c.
Lastly, to implement the reverse function, simply do the same thing but with the replacement looking up by the second element instead, like so
zyx :: String -> String
zyx = map replace'
replace' :: Char -> Char
replace' c = case filter ((== c) . snd) asd of
  ((c', _):_) -> c' -- found a replacement
  [] -> c -- no match, don't replace character
Reference links for fst and snd.
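As an alternative, the Prelude function lookup already does the "find the first matching tuple" step, so the filter and case can probably be collapsed. The sketch below assumes the example table asd from the question and the same behaviour as above (first match wins, unmatched characters are kept); fromMaybe and swap come from Data.Maybe and Data.Tuple.

```haskell
import Data.Maybe (fromMaybe)
import Data.Tuple (swap)

-- Example table taken from the question.
asd :: [(Char, Char)]
asd = [('a', 'b'), ('c', 'd')]

-- Forward direction: look each character up by the first tuple element,
-- keeping the character itself when there is no entry for it.
xyz :: String -> String
xyz = map (\c -> fromMaybe c (lookup c asd))

-- Reverse direction: swap every tuple first, then do the same lookup.
zyx :: String -> String
zyx = map (\c -> fromMaybe c (lookup c (map swap asd)))
```

Here xyz "ac" gives "bd" and zyx "bd" gives "ac", with neither recursion nor a list comprehension written by hand.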

Partitioning a String into more pieces with separating char in Haskell

I have the following homework:
Define a function split :: Char -> String -> [String] that splits a string, which consists of substrings separated by a separator, into a list of strings.
Examples:
split '#' "foo##goo" = ["foo","","goo"]
split '#' "#" = ["",""]
I have written the following function:
split :: Char -> String -> [String]
split c "" = [""]
split a "a" = ["",""]
split c st = takeWhile (/=c) st : split c tail((dropWhile (/=c) st))
It does not compile, and I can't see why.
takeWhile collects all the characters which are not c into the result, then tail drops the c that was found, and we recursively apply split to the rest of the string, obtained with dropWhile. The : should build a list of "lists", since strings are lists of chars in Haskell. Where is the gap in my thinking?
Update:
I have updated my program to the following:
my_tail :: [a]->[a]
my_tail [] = []
my_tail xs = tail xs
split :: Char -> String -> [String]
split c "" = [""]
split a "a" = ["",""]
split c st = takeWhile (/=c) st ++ split c (my_tail(dropWhile (/=c) st))
I still get an error, the following:
Why is the expected type [String] and then [Char]?

The reason this does not compile is that Haskell sees your last clause as:
split c st = takeWhile (/=c) st : split c tail ((dropWhile (/=c) st))
It thus thinks that you apply three parameters to split: c, tail and ((dropWhile (/=c) st)). You should use brackets here, like:
split c st = takeWhile (/=c) st : split c (tail (dropWhile (/=c) st))
But that will not fully fix the problem. For example if we try to run your testcase, we see:
Prelude> split '#' "foo##goo"
["foo","","goo"*** Exception: Prelude.tail: empty list
tail :: [a] -> [a] is a "non-total" function. For the empty list, tail will error. Indeed:
Prelude> tail []
*** Exception: Prelude.tail: empty list
Eventually, the list will run out of characters, and then tail will raise an error. We might want to use span :: (a -> Bool) -> [a] -> ([a], [a]) here, and use pattern matching to determine if there is still some element that needs to be processed, like:
split :: Eq a => a -> [a] -> [[a]]
split _ [] = [[]]
split c txt = pf : rst
    where rst | (_:sf1) <- sf = split c sf1
              | otherwise = []
          (pf,sf) = span (c /=) txt
Here span (c /=) txt splits the non-empty list txt into two parts: pf (prefix) is the longest prefix of items that are not equal to c, and sf (suffix) holds the remaining elements.
Regardless of whether sf is empty or not, we emit the prefix pf. Then we inspect the suffix. We know that either sf is empty (we reached the end of the list), or that the first element of sf is equal to c. We thus use a pattern guard to check whether sf matches the (_:sf1) pattern, which happens if sf is non-empty. In that case we bind sf1 to the tail of sf, and we recurse on that tail. In case sf is empty, we can stop, and thus return [].
For example:
Prelude> split '#' "foo##goo"
["foo","","goo"]
Prelude> split '#' "#"
["",""]

Frequency table in Haskell with list comprehension only, find frequency of characters in a String

I am new to Haskell, trying to learn some stuff and pass the task that I was given. I would like to find the number of characters in a String but without importing Haskell modules.
I need to implement a frequency table and I would like to understand more about programming in Haskell and how I can do it.
I have my FreqTable as a tuple with the character and the number of occurrences of the 'char' in a String.
type FreqTable = [(Char, Int)]
I have been searching for a solution for a couple of days, spending long hours trying to find some working examples.
My function, or rather the function in the task, is declared as follows:
fTable :: String -> FreqTable
I know that the correct answer can be:
map (\x -> (head x, length x)) $ group $ sort
or
map (head &&& length) . group . sort
or
[ (x,c) | x <- ['A'..'z'], let c = (length . filter (==x)), c>0 ]
I can get this to work exactly with my list but I found this as an optional solution. With the above list comprehension I am getting an error that I can't solve at the moment:
Couldn't match expected type ‘String -> FreqTable’
with actual type ‘[(Char, [Char] -> Int)]’
In the expression:
[(x, c) |
x <- ['A' .. 'z'], let c = (length . filter (== x)), c > 0]
In an equation for ‘fTable’:
fTable
= [(x, c) |
x <- ['A' .. 'z'], let c = (length . filter (== x)), c > 0]
Can someone please share and explain a nice and simple way of checking the frequency of characters without importing Data.List or Map?

You haven't included what you should be filtering and taking the length of:
[ (x,c) | x <- ['A'..'z'], let c = (length . filter (==x)), c>0 ]
--                                 ^_____________________^
--                                 this is a function from a String -> Int
--                                 you want the count, an Int
--                                 The function needs to be applied to a String
The string to apply it to is the argument to fTable:
fTable :: String -> FreqTable
fTable text = [ (x,c) | x <- ['A'..'z'], let c = (length . filter (==x)) text, c>0 ]
--            ^--------------------------------------------------------------------^

The list: ['A'..'z'] is this string:
"ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz"
so you are iterating over both upper and lower case letters (and some symbols.) That's why you have a tuple, e.g., for both 'A' and 'a'.
If you want to perform a case-insensitive count, you have to perform a case-insensitive comparison instead of straight equality.
import Data.Char
ciEquals :: Char -> Char -> Bool
ciEquals a b = toLower a == toLower b
Then:
fTable text = [ (x,c) | x <- ['A'..'Z']
              , let c = (length . filter (ciEquals x)) text
              , c > 0 ]
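If the table should cover whatever characters actually occur in the text, rather than a fixed range such as ['A'..'z'], one option that needs only the Prelude is sketched below. It is just an illustration under that assumption: the helper count is a made-up name, and the comparison is case-sensitive.

```haskell
type FreqTable = [(Char, Int)]

-- Pair every character with its index and keep it only at its first
-- occurrence, i.e. when it does not appear among the i characters before it.
fTable :: String -> FreqTable
fTable text =
  [ (x, count x) | (i, x) <- zip [0 ..] text, x `notElem` take i text ]
  where
    -- hypothetical helper: number of occurrences of x in the whole text
    count x = length [ y | y <- text, y == x ]
```

The entries come out in order of first appearance, so fTable "hello" yields [('h',1),('e',1),('l',2),('o',1)]. It rescans the text for every character, which is fine for short strings but quadratic in general.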
