Using binary trees in Haskell - Trying to use paths [closed]

Closed 4 years ago.
I'm struggling with this problem. A binary tree can be used as a database: the leaves of the tree are either ND, indicating no data, or Data d, where d is a data item.
data Btree a = ND | Data a | Branch (Btree a) (Btree a)
data Dir = L | R
type Path = [Dir]
So one can give a path to a leaf by giving a list such as [L,R,L] which
indicates the leaf one arrives at by moving left, right, left from the root of the tree (there may be no such leaf).
What I'm trying to do is define a function, say extract, where
extract :: Path -> Btree a -> Error a
which, given a path and a binary tree, outputs the data at the end of the path and gives an error value when the path does not lead to any data. Any help would be appreciated.

It might help you get started to look at which cases you need to handle. You have two arguments, each of which has a small number of constructors:
The Path argument can be empty ([]) or not (_:_).
The Btree a argument can be empty (ND), have data (Data _), or be an interior node (Branch _ _).
Further, the return value isn't always an error; it could be a value of type a! The return value you probably have in mind is something like Either e a (where e is your error type, usually String but it could be an enumerated type like your Dir type) or even Maybe a if you just want to signal "no data found" with Nothing.
With that in mind, you have 6 cases for which your function should be defined:
extract :: Path -> Btree a -> Maybe a
extract [] ND = ...
extract [] (Data d) = ...
extract [] (Branch left right) = ...
extract (d:ds) ND = ...
extract (d:ds) (Data dat) = ...
extract (d:ds) (Branch left right) = ...
Now you have six simpler functions to define. Some are easy, some are less so, but this should give you somewhere to start, or at least refine what you are asking about.
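One possible way to fill in those cases, as a minimal sketch rather than the expected solution. It assumes the Maybe a return type suggested above (not the Error a type from the question), and it assumes that a path stopping at a Branch, or continuing past a leaf, counts as "no data". The exampleTree value is hypothetical and only there for experimenting.

```haskell
data Btree a = ND | Data a | Branch (Btree a) (Btree a)
data Dir = L | R
type Path = [Dir]

-- Follow the path from the root; succeed only if it ends exactly on a Data leaf.
extract :: Path -> Btree a -> Maybe a
extract []     ND           = Nothing       -- path ends on an empty leaf
extract []     (Data d)     = Just d        -- path ends on a data item
extract []     (Branch _ _) = Nothing       -- path stops at an interior node (assumption)
extract (_:_)  ND           = Nothing       -- path continues past a leaf
extract (_:_)  (Data _)     = Nothing       -- path continues past a leaf
extract (L:ds) (Branch l _) = extract ds l  -- descend into the left subtree
extract (R:ds) (Branch _ r) = extract ds r  -- descend into the right subtree

-- Hypothetical example tree for trying the function in GHCi.
exampleTree :: Btree Int
exampleTree = Branch (Branch (Data 1) ND) (Data 2)
```

The last case is split in two so that the direction at the head of the path picks the subtree. With Either String a instead of Maybe a, each Nothing would become a Left carrying a message and each Just a Right.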

Related

How can I change this if-then-else construction to a construction that uses pattern matching or/and guards?

I've been given the following exercise (one out of several that link together to pretty print a table and make selections in it):
Write a function select :: Field → Field → Table → Table that given a column name and a field value, selects only those rows from the table that have the given field value in the given column. If the given column is not present in the table then the table should be returned unchanged. (Hint: use the functions (!!), elemIndex, filter and maybe.)
And ended up with this solution:
select :: Field -> Field -> Table -> Table
select column value table@(header:rows) =
    let i = fromMaybe (-1) (elemIndex column header)
    in if i == (-1) then table
       else header : filter (\r -> r !! i == value) rows
While it seems to be perfectly correct in its function - it works - I've been told that if-then-else constructions such as these are 'bad form' and should be avoided using guards (as should my use of fromMaybe using pattern matching).
How would I go about changing this into a 'better' style with pattern matching/guards?
One simple improvement that I immediately see when looking at your code is that it seems pointless to use fromMaybe to convert Nothing to -1, then simply do a check (whether with an if or with guards doesn't matter) on whether that value is -1. Why not check against Nothing in the first place?
I guess you might have been led this way by similar functions in other languages, where -1 is returned as an artificial value (sometimes called a "sentinel value") to mean "the element wasn't found". In Haskell, Nothing communicates this much better.
Further, Nothing can be pattern matched on, so rather than using if or guards, you can use a case expression. This turns your function into:
select :: Field -> Field -> Table -> Table
select column value table@(header:rows) =
    case elemIndex column header of
        Nothing -> table
        Just i  -> header : filter (\r -> r !! i == value) rows
This to my mind is much better than your original.
I still feel it could be improved further - in particular (!!) is rarely used in idiomatic Haskell code as it risks crashing if the index is out of bounds. It's also quite inefficient compared to the array-indexing operators of other languages (because Haskell lists are linked-lists, not arrays, and to get say the 100th element it has to run through the previous 99 rather than being able to do direct "random access"). However I don't see how you can really avoid that given what you have to work with here.
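One way to at least remove the crash risk (though not the linear-time lookup) is to index through a total helper. The sketch below is only an illustration: atMay is a hypothetical helper written here, not a Prelude function, and the Field, Row and Table synonyms are assumptions about how the exercise defines them. The extra equation for the empty table is also an addition, since the original pattern does not cover it.

```haskell
import Data.List (elemIndex)

-- Assumed type synonyms; the exercise may define them differently.
type Field = String
type Row   = [Field]
type Table = [Row]

-- Hypothetical helper: like (!!), but returns Nothing when the index is out of range.
atMay :: [a] -> Int -> Maybe a
atMay xs i
  | i < 0     = Nothing
  | otherwise = case drop i xs of
      (x:_) -> Just x
      []    -> Nothing

select :: Field -> Field -> Table -> Table
select _ _ [] = []                     -- an empty table has nothing to select from
select column value table@(header:rows) =
  case elemIndex column header of
    Nothing -> table                   -- column not present: return the table unchanged
    Just i  -> header : filter (\r -> r `atMay` i == Just value) rows
```

A row that is too short to have column i is simply filtered out instead of raising an exception.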
Replace
let i = fromMaybe (-1) (elemIndex column header)
in if i == (-1) then ... else ...
with
case elemIndex column header of
    Nothing -> ...    -- no index found
    Just i  -> ...    -- index i found
Your original code suffers from "boolean blindness": you already have a useful value of type Maybe Int which tells you:
if there is no index (Nothing)
if there is an index, and what it is (Just i)
The last part ("and what it is") is important! In your code you work hard to lose that information, reducing everything to a boolean telling you only:
if there is no index (True)
if there is an index, and nothing else (False)
Don't use booleans unless you really have to. Many languages must use if someBooleanCondition to conditionally branch. Haskell does not have to: we have case and pattern matching that combines branching and extracting information from each case (the i above).

How to group by elements of tuple in Scala and map to list of new elements?

I am going through some Scala exercises in order to better understand higher-order functions. I have the following problem, which I can't work out how to solve. I have the following list:
val input = List((1,"a"), (1,"b"), (1,"c"), (2,"d"), (2,"y"), (2,"e"), (3, "u"), (3,"a"), (3,"d"))
I want to create a Map from this list which maps each letter encountered on the input list to a list of all the numbers which were previously in the same tuple as that letter. Example output:
Map(a -> List(1,3), b -> List(1), c -> List(1), d -> List(2,3), y -> List(2), e -> List(2), u -> List(3))
What I have tried so far is:
val output = input.groupBy(_._2)
However this gives as output:
Map(e -> List((2,e)), y -> List((2,y)), u -> List((3,u)), a -> List((1,a), (3,a)), b -> List((1,b)), c -> List((1,c)), d -> List((2,d), (3,d)))
Could someone help me understand how I could go about solving this? I appreciate any help, as I am new to functional programming.
As @sinanspd said:
input
  .groupBy {
    case (number, letter) => letter
  } map {
    case (key, list) => key -> list.map {
      case (number, letter) => number
    }
  }
// Or equivalently
input.groupBy(_._2).view.mapValues(list => list.map(_._1)).toMap
// Or even simpler if you have access to Scala 2.13
input.groupMap(_._2)(_._1)
Every time you want to transform some collection of values where the transformation is one to one, you just need a map, so in this case, you want to transform the values of the Map returned by the groupBy, which are Lists, and then you want to transform every element of those Lists to just return the first component of the tuple. So it is a map inside another map.
Also, the Scaladoc is your friend. There are usually many useful methods, like mapValues or groupMap.
Here is a more detailed explanation of my comment. This is what you need to do:
val output = input.groupBy(_._2).mapValues(_.map(_._1))
It is important to remember that in functional programming we love pure functions, meaning we avoid shady side effects and like to stick to the principle of one function, one purpose.
This allows us to chain these pure functions and reason about them linearly. Why am I telling you this? While your instinct might be to look for or implement a single function to do this, don't.
In this case, we first groupBy the key, and then map each value where we take the value List and map it to extract the _._1 of the tuple.
This is the best I could do to make the result sorted too, as I see the other responses are not sorted. Maybe someone can improve on this:
import scala.collection.immutable.ListMap
ListMap(input.groupBy(_._2).mapValues(_.map(_._1)).toSeq.sortBy(_._1):_*)

check possible english words in long random string (C++)

Given a random string:
KUHPVIBQKVOSHWHXBPOFUXHRPVLLDDAPPLEWPREDDVVIDWQRBHBGLLBBPKQUNRVOHQEIRLWOKKRDD
How do I check whether the random string contains possible English words?
What's the most efficient way of searching for all possible English words embedded in this string?
I have already downloaded an English dictionary text file.
I would like to compare the string against the dictionary file to find the possible words.
Can anyone give some hints on how to do this?
I recommend the brute force approach. After getting this method working, you can optimize later.
The brute force algorithm:
For each word in the dictionary,
search the string for that word.
Other methods may take longer. You will have to ask yourself, "is spending time making this algorithm more efficient worthwhile?"
For infrequent uses, the answer would be no. As an answer to an Online Judge, maybe you will need to improve the efficiency. If you have a lot of strings like this, then maybe you should optimize the algorithm.
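The brute-force loop above can be sketched in a few lines. This version is in Haskell (the language of the first question on this page) rather than C++, and it is only an illustration: the name wordsIn is made up here, the dictionary is assumed to be already loaded as a list of words, and upper-casing both sides is an assumption made so that a lower-case word list matches the all-caps string.

```haskell
import Data.Char (toUpper)
import Data.List (isInfixOf)

-- For each word in the dictionary, search the text for that word.
-- Returns the (upper-cased) words that occur as contiguous substrings,
-- in dictionary order.
wordsIn :: [String] -> String -> [String]
wordsIn dictionary text = filter (`isInfixOf` haystack) (map upper dictionary)
  where
    upper    = map toUpper
    haystack = upper text
```

The structure carries over directly to C++: an outer loop over the dictionary and one substring search (for example std::string::find) per word.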
You can build a DAG from the words in your dictionary and use this to search for hits. For example, if your dictionary contains the words
auto
autobahn
austria
This would lead to a graph like this
a -> u -> t -> o -> 'hit'
     |         |
     |         |-> b -> a -> h -> n -> 'hit'
     |
     |-> s -> t -> r -> i -> a -> 'hit'
Based on this data structure (here is a library for this) you can start feeding letters starting from each position in your random string until there is no edge to follow or until you obtain a hit.
Since the DAG is not updated, this can be done in parallel by starting at different positions in your random string.
Here is how to build such a search structure:
// Inserts keys into a simple dawg.
dawgdic::DawgBuilder dawg_builder;
dawg_builder.Insert("auto");
dawg_builder.Insert("autobahn");
dawg_builder.Insert("austria");
// Finishes building a simple dawg.
dawgdic::Dawg dawg;
dawg_builder.Finish(&dawg);
// Builds a dictionary from a simple dawg.
dawgdic::Dictionary dic;
dawgdic::DictionaryBuilder::Build(dawg, &dic);
// Checks if a key exists or not.
if (dic.Contains("auto"))
  std::cout << "auto: found" << std::endl;
// Finds a key and gets its associated record.
if (dic.Find("august") < 0)
  std::cout << "august: not found" << std::endl;
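The letter-by-letter walk described above can also be sketched without a library. The version below is a rough illustration in Haskell (the language of the first question on this page), using a plain trie rather than a minimized DAWG; it does not use or mirror the dawgdic API, and all names in it (Trie, insert, prefixHits, allHits) are made up for the example. Unlike the description, it keeps walking after a hit so that longer words such as autobahn are reported too.

```haskell
import qualified Data.Map as Map
import Data.List (tails)

-- A trie node: whether a word ends here, plus children keyed by the next letter.
data Trie = Trie Bool (Map.Map Char Trie)

emptyTrie :: Trie
emptyTrie = Trie False Map.empty

-- Add one word to the trie.
insert :: String -> Trie -> Trie
insert []     (Trie _ kids)   = Trie True kids
insert (c:cs) (Trie end kids) = Trie end (Map.insert c (insert cs child) kids)
  where child = Map.findWithDefault emptyTrie c kids

-- All dictionary words that are prefixes of the input, shortest first.
-- The walk stops as soon as there is no edge for the next letter.
prefixHits :: Trie -> String -> [String]
prefixHits = go []
  where
    go acc (Trie end kids) rest =
      [reverse acc | end] ++ case rest of
        (c:cs) | Just child <- Map.lookup c kids -> go (c:acc) child cs
        _                                        -> []

-- Start a walk at every position of the text.
allHits :: Trie -> String -> [String]
allHits trie = concatMap (prefixHits trie) . tails
```

Building the trie from a word list is a fold, for example foldr insert emptyTrie ["auto", "autobahn", "austria"]. Because the trie is never modified during the search, the walks from different start positions are independent of each other.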

Pattern matching on complex data structures ocaml

I am completely new to OCaml, so I am having some trouble with the basics. For my program, I have to match a nucleotide with its complement (G -> C, C -> G, A -> T, T -> A) in order to find the other half of the double helix. The general idea is that DNA is composed of two complementary helices, each of which is a sequence of nucleotides. Currently, I am trying to compute the other half of the double helix.
So far, I have represented the nucleotides with an enumeration and I've represented the DNA with a nucleotide list which corresponds to one helix.
type nucleotide =
| G
| C
| A
| T
type helix = nucleotide list
let rec complementary_helix (x:helix): helix =
| [G] -> [C]
| [C] -> [G]
| [A] -> [T]
| [T] -> [A]
end
I know something is missing here, but I don't know how to go about it. Can somebody steer me in the right direction?
You're basically just missing List.map:
let complement = function
| G -> C
| C -> G
| A -> T
| T -> A
let complementary_helix (x: helix) : helix =
List.map complement x
(For what it's worth, it's not necessary to specify types. OCaml will infer the types. It's good style to specify them for documentation but maybe not if they're obvious.)
Edit
OK, I guess this is a homework problem in which you're supposed to use recursion to solve the problem.
The way to think of recursion is that you want to solve a little piece of the problem, which gives you a smaller problem to solve. You pass the smaller problem to yourself (either before or after you solve your little piece). You also need to know when the problem has gotten so small there's no more work to do on it.
In your case, the little piece would be to translate one nucleotide to its complement. You're doing that semi-OK (you have lists where you would really just want to work on single nucleotides). But you're not passing the remainder of the problem to yourself to solve recursively. You're also not checking whether the problem is so small there's nothing to do.
For functions on lists, around 99% of the time you're going to make the problem smaller by splitting the list into the head (a single element) and the tail (a list that's smaller by one). That will work for you here.
Edit 2
As an example of how list recursion looks, here's a function that adds up all the integers in a list:
let rec sum l =
match l with
| [] -> 0
| head :: tail -> head + sum tail
This has all the parts I described. The match is used both to tell when the problem is trivial (when the list is empty) and to split the list into the head and the tail. Assuming you had the sum for the tail (which you can get recursively), the answer is pretty obvious. You just need to add the head onto this sum. You just need to ask yourself (almost always): if I had the answer for the tail of the list, what would I need to do to combine it with the head?

Algorithm to print to screen path(s) through a text maze

For my C++ assignment, I'm trying to search through a chunk of text from a text file (streamed into my vector vec), beginning at the second character from the top on the left. It's for a text maze, where my program is ultimately supposed to print out the characters along a path through it.
An example of a maze would be like:
###############
Sbcde####efebyj
####hijk#m#####
#######lmi#####
###############
###############
###############
###############
###############
###############
###############
###############
###############
###############
###############
Where '#' is an unwalkable wall and you always begin on the left at the second top character. Alphabetical characters represent walkable squares. Exit(s) are ALWAYS on the right. The maze is always a 15x15 size in a maze.text file. Alphabetical characters repeat within the same maze, but not directly beside each other.
What I'm trying to do here is: if a square next to the current one has an alphabetical character, add it to the vector vec, and repeat this process until I get to the end of the maze. Eventually I am supposed to make this more complicated by printing to the screen multiple paths that exist in some mazes.
So far I have this for the algorithm itself, which I know is wrong:
void pathcheck()
{
if (isalpha(vec.at(x)) && !(find(visited.begin(), visited.end(), (vec.at(x))) != visited.end()) )
{
path.push_back(vec.at(x));
visited.push_back(vec.at(x));
pathcheck(vec.at(x++));
pathcheck(vec.at(x--));
pathcheck(vec.at(x + 16));
pathcheck(vec.at(x - 16));
}
}
visited is my vector keeping track of the visited squares.
How would I update this so it actually works, and eventually so I can manage more than one path (i.e. if there were 2 paths, the program would print to the screen both of them)? I recall being told that I may need another vector/array that keeps track of squares that I've already visited/checked, but then how would I implement that here exactly?
You're on the right track. When it comes to mazes, the typical method of solving is through either a depth-first search (the most efficient solution for finding some path) or a breadth-first search (less efficient, but guaranteed to find the optimal, i.e. shortest, path). Since you seem to want to do an exhaustive search, these choices are basically interchangeable. I suggest you read up on them:
http://en.wikipedia.org/wiki/Depth-first_search
http://en.wikipedia.org/wiki/Breadth-first_search
Basically, you will need to parse your maze and represent it as a graph (where each non "#" is a node and each link is a walkable path). Then, you keep a list of partial paths (i.e. a list of nodes, in the order you visited them, for example, [S, b, c] is the partial path starting from S and ending at c). The main idea of DFS and BFS is that you have a list of partial paths, and one by one you remove items from the list, generate all possible partial paths leading from that partial path, then place them in the list and repeat. The main difference between DFS and BFS is that DFS implements this list as a stack (i.e. new items have greatest priority) and BFS uses a queue (i.e. new items have lowest priority).
So, for your maze using DFS it would work like this:
1. Initial node is S, so your initial path is just [S]. Push [S] into your stack ([ [S] ]).
2. Pop the first item (in this case, [S]).
3. Make a list of all possible nodes you can reach in 1 move from the current node (in your case, just b).
4. For each node from step 3, remove any nodes that are part of your current partial path. This will prevent loops. (i.e. for partial path [S, b], from b we can travel to c and to S, but S is already part of our partial path so returning is pointless)
5. If one of the nodes from step 4 is the goal node, add it to your partial path to create a completed path. Print the path.
6. For each node from step 4 that IS NOT the goal node, generate a new partial path and push it into the stack (i.e. for [S], we generate [S, b] and push it into the stack, which now should look like [ [S, b] ])
7. Repeat steps 2 through 6 until the stack is empty, meaning you have traversed every possible path from the starting node.
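The steps above can be sketched roughly as follows. This is an illustration in Haskell (the language of the first question on this page), not the assignment's C++ solution. It assumes the maze is held as a list of rows, that squares are identified by (row, column) position rather than by letter (so repeated letters stay distinct), and that the caller supplies the start position and a goal test; the names Pos, walkable, neighbours, allPaths and label are all made up for the example.

```haskell
type Pos = (Int, Int)   -- (row, column)

-- A square is walkable if it lies inside the grid and is not a wall.
walkable :: [String] -> Pos -> Bool
walkable maze (r, c) =
  r >= 0 && r < length maze && c >= 0 && c < length (maze !! r) && maze !! r !! c /= '#'

-- Squares reachable in one move (right, down, up, left).
neighbours :: [String] -> Pos -> [Pos]
neighbours maze (r, c) =
  filter (walkable maze) [(r, c + 1), (r + 1, c), (r - 1, c), (r, c - 1)]

-- Depth-first search with an explicit stack of partial paths.
-- Each partial path is stored with its newest position first.
allPaths :: [String] -> Pos -> (Pos -> Bool) -> [[Pos]]
allPaths maze start isGoal = go [[start]]               -- step 1
  where
    go []           = []                                -- stack empty: done (step 7)
    go ([] : stack) = go stack                          -- cannot happen; keeps go total
    go (path@(here : _) : stack) =                      -- step 2: pop
      let nexts    = filter (`notElem` path) (neighbours maze here)  -- steps 3 and 4
          finished = [reverse (n : path) | n <- nexts, isGoal n]     -- step 5
          pending  = [n : path | n <- nexts, not (isGoal n)]         -- step 6: push
      in finished ++ go (pending ++ stack)

-- Turn a path of positions back into the letters along it.
label :: [String] -> [Pos] -> String
label maze = map (\(r, c) -> maze !! r !! c)
```

For the 15x15 mazes in the question, the start would be (1, 0) and the goal test something like \(_, c) -> c == 14. Replacing pending ++ stack with stack ++ pending turns the stack into a queue, which gives the breadth-first variant.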
NOTE: in your example there are duplicate letters (for example, three "e"s). For your case, maybe make a simple "Node" class that includes a variable to hold the letter. That way each "e" will have its own instance, and the pointers will have different values, letting you easily tell them apart. I don't know C++ exactly, but in pseudocode:
class Node:
    method Constructor(label):
        myLabel = label
        links = list()
    method addLink(node):
        links.add(node)
You could read every character in the file and if it is not "#", create a new instance of Node for that character and add all the adjacent nodes.
EDIT: I've spent the last 3 years as a Python developer and I've gotten a bit spoiled. Look at the following code.
s = "foo"
s == "foo"
In Python, that assertion is true. "==" in Python compares the string's content. What I forgot from my days as a Java developer is that in some languages "==" on strings compares references rather than content. In Java, and in C++ when the strings are C-style char pointers, such a comparison can be false even when the contents match, because the strings live at different places in memory. (C++ std::string values and plain char values are compared by content with ==.)
My point is that because the comparison works on addresses there, you COULD forgo making a Node class and just compare pointers to the characters (using ==, NOT using strcmp()!), BUT this code could be a bit confusing to read and must be documented.
Overall, I'd use some sort of Node class just because it's fairly simple to implement and results in more readable code AND only requires parsing your maze once!
Good Luck