Process a string using foldr where '#' means deleting the previous character

I need to process a string using foldr where '#' means deleting the previous character. For example:
>backspace "abc#d##c"
"ac"
>backspace "#####"
""
It needs to be done using foldr through one pass of the list, without using reverse and/or (++).
Here is what I have so far:
backspace :: String -> String
backspace xs = foldr func [] xs where
    func c cs | c /= '#' = c:cs
              | otherwise = cs
But it just filters the '#' out of the string. I thought about deleting the last element of the current answer every time c == '#', and came up with this:
backspace :: String -> String
backspace xs = foldr func [] xs where
    func c cs | c /= '#' = c:cs
              | cs /= [] = init cs
              | otherwise = cs
but it does not work properly:
ghci> backspace "abc#d##c"
"abc"

You can use an (Int, String) pair as the state for your foldr, where the Int is the number of backspaces that still have to be applied and the String is the result constructed so far.
This means that you can work with:
backspace :: String -> String
backspace = snd . foldr func (0, [])
  where func '#' (n, cs) = (n+1, cs)
        func c (n, cs)
          | n > 0 = … -- (1)
          | otherwise = … -- (2)
If we see a character that is not a '#' while n > 0, we need to remove that character, so we ignore c and decrement n. If n == 0, we can add c to the String.
I leave filling in the … parts as an exercise.
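As for why the init attempt in the question returns "abc": foldr hands func the already-processed text to the right of c, so init cs removes a character that comes after the '#', not the one before it. The sketch below lists every intermediate accumulator with scanr; the names backspaceInit, step and accumulators are made up for this illustration.

```haskell
-- The second attempt from the question, unchanged apart from layout and names.
backspaceInit :: String -> String
backspaceInit = foldr step []

step :: Char -> String -> String
step c cs
  | c /= '#'  = c : cs
  | cs /= []  = init cs   -- drops the LAST character of the text to the right
  | otherwise = cs

-- Every intermediate accumulator of the fold. The last element is the
-- initial [], so read the list from right to left to follow the fold.
accumulators :: String -> [String]
accumulators = scanr step []
```

For "abc#d##c" this gives ["abc","bc","c","","d","","","c",""]: the two rightmost '#' characters only ever see the text to their right, so the 'd' and the first 'c' to their left are never deleted.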

Related

Haskell: How to carry a value from one function to another

I have a function 'one' that creates a string of length i,
fillWithEmpty :: Int -> String
fillWithEmpty i =
    if i == 0 then "." else "." ++ fillWithEmpty(i - 1)
I then want the system to remember the length i, so that it can replace the character at a given position e in that string of length i with 'S':
replaceWithS :: String -> Int -> String
replaceWithS i e=
    if i == e then "S" else "." ++ replaceWithS(i - 1)
Any help would be appreciated. Thanks
You can use explicit recursion to walk over the list. Each recursive call decrements the index, and when the index reaches 0 you emit an 'S' instead of the character of the given string, so:
replaceWithS :: String -> Int -> String
replaceWithS "" _ = ""
replaceWithS (_:xs) 0 = … : …
replaceWithS (x:xs) i = … : replaceWithS … …
Here x is the head of the string (its first character), and xs is a list with the remaining characters. You still need to fill in the … parts.
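The same recursion shape (peel off the head, decrement the index, stop at 0) can be seen on a different, simpler task. The sketch below uses a hypothetical helper charAt, which is not part of the question, to look up the character at a given index; it deliberately does not fill in the exercise.

```haskell
-- Hypothetical helper for illustration: the character at index i, if any.
charAt :: String -> Int -> Maybe Char
charAt ""       _ = Nothing               -- ran out of characters
charAt (x : _)  0 = Just x                -- index reached 0: this is the one
charAt (_ : xs) i = charAt xs (i - 1)     -- skip the head, decrement the index
```

The clause order matters here: the empty-string case comes first so that an index past the end yields Nothing instead of a pattern-match failure.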

How do you split a string into lists unless it is inside quotation marks ("") in Ocaml?

I'm reading an input file of several lines. Each line has the following format:
Greeting "hello"
Greeting " Good morning"
Sit
Smile
Question "How are you?"
My current code reads each line into a string list. I then process it with this function, which is supposed to break it into a string list list:
let rec process (l : string list) (acc : string list list) : string list list =
  match l with
  | [] -> acc
  | hd :: tl -> String.split_on_char ' ' hd :: (process tl acc)
Unfortunately, this does not work, since it also splits on spaces inside quotation marks. Can anyone think of the right way to do this, possibly using map, fold_left, etc.? This would be my expected output:
[["Greeting"; "\"hello\""]; ["Greeting"; "\" Good morning\""]; ["Sit"]]
and so on. Thank you!
You want a real (but very simple) lexical analysis. IMHO this is beyond what you can do with simple string splitting.
A scanner takes a stream of characters and returns the next token it sees. You can make a string into a stream by having an index that traverses the string.
Here is a scanner that is roughly what you would want:
let rec scan s offset =
  let slen = String.length s in
  if offset >= slen then
    None
  else if s.[offset] = ' ' then
    scan s (offset + 1)
  else if s.[offset] = '"' then
    let rec qlook loff =
      if loff >= slen then
        (* Unterminated quotation *)
        let tok = String.sub s offset (slen - offset) in
        Some (tok, slen)
      else if s.[loff] = '"' then
        let tok = String.sub s offset (loff - offset + 1) in
        Some (tok, loff + 1)
      else qlook (loff + 1)
    in
    qlook (offset + 1)
  else
    let rec wlook loff =
      if loff >= slen then
        let tok = String.sub s offset (slen - offset) in
        Some (tok, slen)
      else if s.[loff] = ' ' || s.[loff] = '"' then
        let tok = String.sub s offset (loff - offset) in
        Some (tok, loff)
      else
        wlook (loff + 1)
    in
    wlook (offset + 1)
It also handles a few cases that you didn't specify. An unclosed quotation is returned as a single token that runs to the end of the string, and something like abc"def ghi" is split into the two tokens abc and "def ghi".
The scanner returns None at the end of the string, or Some (token, offset), i.e., the next token and the offset to continue scanning.
A recursive function to break up a string would look something like this:
let split s =
  let rec isplit accum offset =
    match scan s offset with
    | None -> List.rev accum
    | Some (tok, offset') -> isplit (tok :: accum) offset'
  in
  isplit [] 0
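For comparison, the same "scanner returns the next token plus where to continue" idea can be sketched in Haskell, the language used by most other snippets on this page. This is only an illustration under assumptions: scanToken and splitTokens are made-up names, and the remaining input string plays the role of the offset. Data.List.unfoldr then does the job of the isplit loop.

```haskell
import Data.List (unfoldr)

-- Return the next token together with the unscanned remainder,
-- or Nothing once only spaces (or nothing) are left.
scanToken :: String -> Maybe (String, String)
scanToken s = case dropWhile (== ' ') s of
  "" -> Nothing
  '"' : cs -> case break (== '"') cs of
    (body, _ : rest) -> Just ('"' : body ++ "\"", rest)  -- closed quotation
    (body, [])       -> Just ('"' : body, "")            -- unterminated quotation
  cs -> Just (break (\c -> c == ' ' || c == '"') cs)     -- plain word

-- Keep calling the scanner until it reports the end of the input.
splitTokens :: String -> [String]
splitTokens = unfoldr scanToken
```

As in the OCaml scanner, the quotation marks stay part of the token, so the line Greeting " Good morning" yields the two tokens Greeting and " Good morning".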
This can be visualized as a state machine. You have two main states: looking for ' ' and looking for '"'. Processing strings directly is ugly because you can't pattern match on them, so the first thing I did was turn the string into a char list. Implementing the two states then becomes simple:
let split s =
  let rec split_space acc word = function
    | [] -> List.rev (List.rev word::acc)
    | ' '::xs -> split_space (List.rev word::acc) [] xs
    | '"'::xs -> find_quote acc ('"'::word) xs
    | x::xs -> split_space acc (x::word) xs
  and find_quote acc word = function
    | [] -> List.rev (List.rev word::acc)
    | '"'::xs -> split_space acc ('"'::word) xs
    | x::xs -> find_quote acc (x::word) xs
  in
  split_space [] [] s
;;
# split ['a';'b';' ';'"';'c';' ';'d';'"';' ';'e'];;
- : char list list = [['a'; 'b']; ['"'; 'c'; ' '; 'd'; '"']; ['e']]
Now if you want to do it with strings, that's left to you. The idea would be the same. Or you can just turn the char list list into a string list at the end.
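Since a Haskell String already is a list of characters, the two-state machine can be written directly on strings there. The sketch below is an illustration only: the names splitQuoted, outside, inside and emit are made up, and, unlike the OCaml version above, it drops empty words (for example between two consecutive spaces), which is an assumption about the desired behaviour.

```haskell
-- Two states: 'outside' a quotation a space ends the current word,
-- 'inside' a quotation spaces are kept. 'word' is built in reverse.
splitQuoted :: String -> [String]
splitQuoted = outside []
  where
    outside word []         = emit word []
    outside word (' ' : cs) = emit word (outside [] cs)
    outside word ('"' : cs) = inside ('"' : word) cs
    outside word (c : cs)   = outside (c : word) cs

    inside word []          = emit word []              -- unclosed quotation
    inside word ('"' : cs)  = outside ('"' : word) cs   -- closing quote found
    inside word (c : cs)    = inside (c : word) cs

    -- Put the finished word in front of the rest; skip empty words.
    emit []   rest = rest
    emit word rest = reverse word : rest
```

The input ab "c d" e from the toplevel session above comes out as the three words ab, "c d" and e.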

Partitioning a String into more pieces with separating char in Haskell

I have the following homework:
Define a function split :: Char -> String -> [String] that splits a string, which consists of substrings separated by a separator, into a list of strings.
Examples:
split '#' "foo##goo" = ["foo","","goo"]
split '#' "#" = ["",""]
I have written the following function:
split :: Char -> String -> [String]
split c "" = [""]
split a "a" = ["",""]
split c st = takeWhile (/=c) st : split c tail((dropWhile (/=c) st))
It does not compile, and I can't see why.
TakeWhile adds all the characters which are not c to the result, then tail drops that c that was found already, and we recursively apply split to the rest of the string, gotten with dropWhile. The : should make a list of "lists" as strings are lists of chars in Haskell. Where is the gap in my thinking?
Update:
I have updated my program to the following:
my_tail :: [a]->[a]
my_tail [] = []
my_tail xs = tail xs
split :: Char -> String -> [String]
split c "" = [""]
split a "a" = ["",""]
split c st = takeWhile (/=c) st ++ split c (my_tail(dropWhile (/=c) st))
I still get an error, the following:
Why is the expected type [String] and then [Char]?
The reason this does not compile is that Haskell sees your last clause as:
split c st = takeWhile (/=c) st : split c tail ((dropWhile (/=c) st))
It thus thinks that you apply split to three arguments: c, tail and ((dropWhile (/=c) st)). You should use parentheses here, like:
split c st = takeWhile (/=c) st : split c (tail (dropWhile (/=c) st))
But that will not fully fix the problem. For example if we try to run your testcase, we see:
Prelude> split '#' "foo##goo"
["foo","","goo"*** Exception: Prelude.tail: empty list
tail :: [a] -> [a] is a "non-total" function. For the empty list, tail will error. Indeed:
Prelude> tail []
*** Exception: Prelude.tail: empty list
Eventually, the list will run out of characters, and then tail will raise an error. We might want to use span :: (a -> Bool) -> [a] -> ([a], [a]) here, and use pattern matching to determine if there is still some element that needs to be processed, like:
split :: Eq a => a -> [a] -> [[a]]
split _ [] = [[]]
split c txt = pf : rst
    where rst | (_:sf1) <- sf = split c sf1
              | otherwise = []
          (pf,sf) = span (c /=) txt
Here span (c /=) txt splits the non-empty list txt into two parts: pf (the prefix) is the longest prefix of items that are not equal to c, and sf (the suffix) contains the remaining elements.
Regardless of whether sf is empty or not, we emit the prefix pf. Then we inspect the suffix. We know that either sf is empty (we reached the end of the list), or the first element of sf is equal to c. We thus use a pattern guard to check whether sf matches the (_:sf1) pattern, which happens if sf is non-empty. In that case we bind sf1 to the tail of sf, and we recurse on that tail. In case sf is empty, we can stop, and thus return [].
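To look at the span step and the pattern guard in isolation, here is a small sketch that performs just one split. The name splitOnce is hypothetical and only serves the illustration; it reports with a Maybe whether a separator was found.

```haskell
-- One step of the splitting: the prefix before the first separator, and
-- Just the text after that separator (Nothing if there is no separator).
splitOnce :: Eq a => a -> [a] -> ([a], Maybe [a])
splitOnce c txt
  | (_ : sf1) <- sf = (pf, Just sf1)   -- sf starts with the separator: drop it
  | otherwise       = (pf, Nothing)    -- sf is empty: end of the list reached
  where
    (pf, sf) = span (c /=) txt
```

For instance, splitOnce '#' "foo##goo" is ("foo", Just "#goo"), while splitOnce '#' "goo" is ("goo", Nothing).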
For example:
Prelude> split '#' "foo##goo"
["foo","","goo"]
Prelude> split '#' "#"
["",""]
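The updated version in the question replaces (:) with (++), which is most likely where the [String] versus [Char] complaint comes from: (++) needs two lists of the same type, but takeWhile returns a String and split returns a [String]. A small sketch of the difference, with made-up names consed and appended:

```haskell
-- (:)  :: a   -> [a] -> [a]   prepends ONE element to a list
-- (++) :: [a] -> [a] -> [a]   joins two lists of the SAME type

consed :: [String]
consed = "foo" : ["", "goo"]        -- a String in front of a [String]

appended :: [String]
appended = ["foo"] ++ ["", "goo"]   -- both operands are already [String]

-- "foo" ++ ["", "goo"] is rejected by the type checker:
-- the left operand is a String ([Char]), the right one a [String].
```

Both definitions evaluate to ["foo","","goo"], so for this task (:) is the operator that matches the types.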

A Haskell function that takes two strings and filters the second according to the first

The objective is: Using foldr, define a function remove which takes two strings as its arguments and removes every letter from the second list that occurs in the first list. For example, remove "first" "second" = "econd".
If this function took a single character and a string, I would do:
remove a xs = foldr (\x acc -> if x /= a then x : acc else acc) [] xs
But I can't figure out how I am supposed to do this with two strings. Thank you!
remove xs ys = foldr (\x acc -> if elem x xs then acc else x : acc) [] ys
yep.
remove :: String -> String -> String
remove xs ys = foldr (condCons) "" ys
  where
    condCons l rs | l `notElem` xs = l : rs
                  | otherwise = rs
It is also allowed to drop the 'ys' parameter:
remove :: String -> String -> String
remove xs = foldr (condCons) ""
  where
    condCons l rs | l `notElem` xs = l : rs
                  | otherwise = rs
Basically, condCons takes a character l and a string rs. If l is not an element of xs, it is consed onto rs; otherwise rs is left unchanged. foldr takes condCons, the initial string "", and the second argument ys. l takes on each character of ys from right to left, building a new string from "" using the condCons binary operator.
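One way to sanity-check such a fold is to compare it with filter, since keeping or dropping each element is exactly what filter does. The sketch below uses the made-up names removeFold and removeFilter; the filter version is only a reference for comparison, not a substitute for the foldr that the exercise asks for.

```haskell
-- The fold from the answer above.
removeFold :: String -> String -> String
removeFold xs = foldr condCons ""
  where
    condCons l rs
      | l `notElem` xs = l : rs   -- keep characters that are not in xs
      | otherwise      = rs       -- drop the others

-- Reference version: the same behaviour written with filter.
removeFilter :: String -> String -> String
removeFilter xs = filter (`notElem` xs)
```

Both give "econd" for the arguments "first" and "second" from the question.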

Trying to get first word from character list

I have a character list [#"h", #"i", #" ", #"h", #"i"] from which I want to get the first word (the first character sequence before a space).
I've written a function which gives me this warning:
stdIn:13.1-13.42 Warning: type vars not generalized because of value
restriction are instantiated to dummy types (X1,X2,...)
Here is my code:
fun next [] = ([], [])
  | next (hd::tl) = if(not(ord(hd) >= 97 andalso ord(hd) <= 122)) then ([], (hd::tl))
    else
      let
        fun getword [] = [] | getword (hd::tl) = if(ord(hd) >= 97 andalso ord(hd) <= 122) then [hd]#getword tl else [];
      in
        next (getword (hd::tl))
      end;
EDIT:
Expected input and output
next [#"h", #"i", #" ", #"h", #"i"] => ([#"h", #"i"], [#" ", #"h", #"i"])
Can anybody help me with this solution? Thanks!
This functionality already exists within the standard library:
val nexts = String.tokens Char.isSpace
val nexts_test = nexts "hi hi hi" = ["hi", "hi", "hi"]
But if you were to build such a function anyway, it seems that you return ([], []) sometimes and a single list at other times. Normally in a recursive function, you can build the result by doing e.g. c :: recursive_f cs, but this is assuming your function returns a single list. If, instead, it returns a tuple, you suddenly have to unpack this tuple using e.g. pattern matching in a let-expression:
let val (x, y) = recursive_f cs
in (c :: x, y + ...) end
Or you could use an extra argument inside a helper function (since the extra argument would change the type of the function) to store the word you're extracting, instead. A consequence of doing that is that you end up with the word in reverse and have to reverse it back when you're done recursing.
fun isLegal c = ord c >= 97 andalso ord c <= 122 (* Only lowercase ASCII letters *)
(* But why not use one of the following:
   fun isLegal c = Char.isAlpha c
   fun isLegal c = not (Char.isSpace c) *)
fun next input =
    let fun extract (c::cs) word =
              if isLegal c
              then extract cs (c::word)
              else (rev word, c::cs)
          | extract [] word = (rev word, [])
    in extract input [] end
val next_test_1 =
    let val (w, r) = next (explode "hello world")
    in (implode w, implode r) = ("hello", " world")
    end
val next_test_2 = next [] = ([], [])
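The accumulator technique is not specific to Standard ML. As a sketch in Haskell (used here only to match the other snippets on this page), the same helper looks as follows; nextWord and extract are illustrative names, and isAsciiLower from Data.Char stands in for the 97..122 range check.

```haskell
import Data.Char (isAsciiLower)

-- Split off the leading run of lowercase ASCII letters.
-- 'word' collects that run in reverse, so it is reversed once at the end.
nextWord :: String -> (String, String)
nextWord input = extract input []
  where
    extract (c : cs) word
      | isAsciiLower c = extract cs (c : word)
      | otherwise      = (reverse word, c : cs)
    extract [] word    = (reverse word, [])
```

As in the SML tests, nextWord "hello world" is ("hello", " world"). In Haskell the whole function could also be written as span isAsciiLower, which avoids the reversal.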