Print out all words in a Trie implemented with a map

I have a TrieNode class defined as follows:
class TrieNode {
public:
map<char, TrieNode*> children;
bool isLeaf = false; // if node represents end of word
int wordCount = 0; // How many times the word appears
TrieNode();
};
I'm trying to print out all of the words in the trie (preferably in alphabetical order, although I'd settle for anything at this point). I've been trying to implement a recursive solution, but I haven't been able to make a decent start.
EDIT: I should mention that all the other questions I've looked at for how to print all words in a trie store children as an array, rather than a map.

Here's a depth-first recursive traversal.
It would be best not to use raw pointers, but I did it here because you asked and I like you.
I did not delete the child nodes allocated by AddTrie, because I just wanted to demonstrate the traversal, rather than write an entire implementation.
So, you need to add code to delete these if you use this.
#include <iostream>
#include <map>
#include <string>
class TrieNode {
public:
std::map<char, TrieNode*> children;
bool isLeaf = false; // if node represents end of word
int wordCount = 0; // How many times the word appears
TrieNode() {}
};
void AddTrie(TrieNode& trie, const char* word) {
auto c = *(word++);
auto next = trie.children[c];
if(!next) { trie.children[c] = next = new TrieNode; }
if(*word) { AddTrie(*next, word); }
else { next->isLeaf = true; }
}
void DumpTrie(const TrieNode& trie, std::string word={}) {
for(const auto& child : trie.children) {
const auto next_word = word + child.first;
if(child.second->isLeaf) { std::cout << next_word << '\n'; }
DumpTrie(*child.second, next_word);
} }
int main() {
TrieNode trie;
AddTrie(trie, "goodbye");
AddTrie(trie, "hello");
AddTrie(trie, "good");
AddTrie(trie, "goodyear");
DumpTrie(trie);
}
Output
good
goodbye
goodyear
hello
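The answer above notes that raw pointers are not ideal and that the nodes are never deleted. As a minimal sketch of one alternative, the variant below stores children in std::unique_ptr so the trie frees itself, and collects the words into a vector instead of printing them. The names OwningTrieNode, addWord and collectWords are made up for this illustration and are not part of the original code.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical variant of TrieNode: children are owned by unique_ptr,
// so the whole trie is released when the root goes out of scope.
struct OwningTrieNode {
    std::map<char, std::unique_ptr<OwningTrieNode>> children;
    bool isLeaf = false;  // true if a word ends at this node
};

// Insert a word, creating nodes as needed.
void addWord(OwningTrieNode& root, const std::string& word) {
    OwningTrieNode* node = &root;
    for (char c : word) {
        auto& child = node->children[c];  // inserts a null pointer if absent
        if (!child) child = std::make_unique<OwningTrieNode>();
        node = child.get();
    }
    node->isLeaf = true;
}

// Depth-first traversal. std::map iterates its keys in sorted order,
// so the words are appended in sorted (for lowercase ASCII: alphabetical) order.
void collectWords(const OwningTrieNode& node, std::string& prefix,
                  std::vector<std::string>& out) {
    if (node.isLeaf) out.push_back(prefix);
    for (const auto& kv : node.children) {
        prefix.push_back(kv.first);
        collectWords(*kv.second, prefix, out);
        prefix.pop_back();
    }
}
```

Reusing one prefix string with push_back/pop_back avoids building a new string at every level of the recursion; the caller passes an empty string and an empty vector.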

I assume you are using a map instead of a 26-slot array in each node to waste less memory? A std::map has a construction and per-entry cost of its own, though, so you might want to use one shared map for all nodes instead of storing one in each node.

Related

How to get an element (struct) in an array by a value in the struct

Let's say I have this struct containing an integer.
struct Element
{
int number;
Element(int number)
{
this->number = number;
}
};
And I'm gonna create a vector containing many Element structs.
std::vector<Element> array;
Pretend that all the Element structs inside array have been initialized and have their number variable set.
My question is how can I instantly get an element based on the variable number?
It is very possible to do it with a for loop, but I'm currently focusing on optimization and trying to avoid as many for loops as possible.
I want it to be as instant as getting by index:
Element wanted_element = array[wanted_number]
There must be some kind of overloading stuff, but I don't really know what operators or stuff to overload.
Any help is appreciated :)

With operator== overloaded for Element, std::find can do the search:
#include <iostream>
#include <vector>
#include <algorithm>
#include <ctime> // clock()
struct Element
{
int number;
Element(int number)
{
this->number = number;
}
bool operator == (const Element& el) const
{
return number == el.number;
}
};
int main()
{
std::vector<Element> array;
std::vector<int> test;
for(int i=0;i<100;i++)
{
auto t = clock();
test.push_back(t);
array.push_back(Element(t));
}
auto valToFind = test[test.size()/2];
std::cout << "value to find: "<<valToFind<<std::endl;
Element toFind(valToFind);
auto it = std::find(array.begin(),array.end(),toFind);
if(it != array.end())
std::cout<<"found:" << it->number <<std::endl;
return 0;
}
The performance of the above method depends on the position of the searched value in the array. Values that do not exist and values near the end take the longest, while the first element is found quickest.
If you need to optimize search time, you can use another data structure instead of vector. For example, std::map is simple to use here and gives O(log n) lookups regardless of position (so it beats the vector version for elements near the end):
#include <iostream>
#include <vector>
#include <algorithm>
#include <ctime> // clock()
#include <map>
struct Element
{
int number;
Element(){ number = -1; }
Element(int number)
{
this->number = number;
}
};
int main()
{
std::map<int,Element> mp;
std::vector<int> test;
for(int i=0;i<100;i++)
{
auto t = clock();
test.push_back(t);
mp[t]=Element(t);
}
auto valToFind = test[test.size()/2];
std::cout << "value to find: "<<valToFind<<std::endl;
auto it = mp.find(valToFind);
if(it != mp.end())
std::cout<<"found:" << it->second.number <<std::endl;
return 0;
}
If you have to use a vector, you can still keep a map next to it to track its elements in the same way, at the cost of extra memory and extra updates to the map whenever the vector changes.
Anything you invent that works will end up looking like hashing or a tree. std::unordered_map uses hashing, while std::map is typically implemented as a red-black tree.
If the range of values is very limited, say 0 to 1000 only, then simply saving each element's index in a second vector is enough:
vec[number] = indexOfVector;
Element found = array[vec[number]];
If the range is unrestricted and you don't want to use map or unordered_map, you can still put a direct-mapped cache in front of std::find. On average, simple caching should reduce the total time spent on repeated searches (how often do you search for the same item?).
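The idea of keeping a lookup structure next to the vector could look roughly like the sketch below. IndexedElements is a hypothetical wrapper invented for this example, and it assumes each number occurs at most once; lookups are O(1) on average thanks to std::unordered_map.

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

struct Element {
    int number;
    explicit Element(int n) : number(n) {}
};

// Hypothetical wrapper: a vector plus a hash index from number to position.
// Assumes each number occurs at most once; a repeated number keeps its first position.
class IndexedElements {
public:
    void add(const Element& e) {
        index_.emplace(e.number, items_.size());  // no-op if the number is already indexed
        items_.push_back(e);
    }

    // Returns a pointer to the element with this number, or nullptr if absent.
    const Element* find(int number) const {
        auto it = index_.find(number);
        return it == index_.end() ? nullptr : &items_[it->second];
    }

private:
    std::vector<Element> items_;
    std::unordered_map<int, std::size_t> index_;
};
```

The index stores positions rather than pointers, so it stays valid when the vector reallocates. Erasing from the middle of the vector would shift positions and require rebuilding or patching the index, which this sketch does not handle.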

Comparing stack data structure c++

I have the following program. What is the best and most efficient way to check whether two stacks are equal by the values they contain?
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
struct StackNode
{
int data;
StackNode* next;
};
StackNode* newNode(int data)
{
StackNode* stackNode = new StackNode; // one node; new[sizeof(StackNode)] would allocate a whole array
stackNode->data = data;
stackNode->next = NULL;
return stackNode;
}
int isEmpty( StackNode *root)
{
return !root;
}
void push( StackNode** root, int data)
{
StackNode* stackNode = newNode(data);
stackNode->next = *root;
*root = stackNode;
}
int pop( StackNode** root)
{
if (isEmpty(*root))
return INT_MIN;
StackNode* temp = *root;
*root = (*root)->next;
int popped = temp->data;
delete temp; // memory obtained with new must be released with delete, not free
return popped;
}
int peek(StackNode* root)
{
if (isEmpty(root))
return INT_MIN;
return root->data;
}
bool AreEqual(StackNode** lhs, StackNode** rhs)
{
////// ?
}
int main()
{
StackNode* root = NULL;
StackNode * r2 = NULL;
push(&root, 10);
push(&root, 20);
push(&root, 30);
push(&r2, 123);
push(&r2, 1231213);
AreEqual(&root, &r2);
}
If the stacks contain the same numbers but in a different order, the method should still return true. I would be very thankful if you could give me some directions for this task. Thanks in advance.

Considering
if the stacks contains equivalent numbers but in different order , then the method should return true
I think the optimal solution would be via sort and compare.
Extract the data from each stack
k1 = {data of lhs} -> O(n)
k2 = {data of rhs} -> O(n)
Sort the two arrays
k1_sorted = sort(k1) -> O(n log(n))
k2_sorted = sort(k2) -> O(n log(n))
Now you can compare the two sorted arrays in O(n). Keep in mind the possible repeated numbers in k1_sorted and k2_sorted.
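The steps above can be sketched in C++ as follows. This is only an illustration: toVector and sameValues are names chosen here, and the functions take plain node pointers rather than the StackNode** used by AreEqual in the question. The stacks are read without being modified.

```cpp
#include <algorithm>
#include <vector>

struct StackNode {
    int data;
    StackNode* next;
};

// Copy the values of a linked-list stack into a vector without modifying the stack.
std::vector<int> toVector(const StackNode* top) {
    std::vector<int> values;
    for (const StackNode* n = top; n != nullptr; n = n->next) {
        values.push_back(n->data);
    }
    return values;
}

// True if both stacks hold the same values with the same multiplicities, in any order.
bool sameValues(const StackNode* lhs, const StackNode* rhs) {
    std::vector<int> a = toVector(lhs);
    std::vector<int> b = toVector(rhs);
    if (a.size() != b.size()) return false;  // cheap early exit
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());
    return a == b;  // element-wise comparison, so duplicates are respected
}
```

Comparing the sorted vectors with operator== takes care of repeated numbers automatically, since every position has to match.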

As well as the sort method, you could take a counting approach:
#include <string>
#include <iostream>
#include <vector>
#include <map>
/**
* Returns true iff the two vectors contain the same elements
* (including number of duplicates) in any order.
*
* Alternatively std::sort both vectors and just compare them
* for equality, which may or may not be faster.
*/
bool sortOfEquivalent(const std::vector<int>& lhs, const std::vector<int>& rhs)
{
std::map<int, std::pair<int,int>> accumulator;
for (const auto x : lhs) {
accumulator[x].first++;
}
for (const auto x : rhs) {
accumulator[x].second++;
if (accumulator[x].second > accumulator[x].first) {
// Can bail early here; the RHS already has
// more x's than the LHS does
return false;
}
}
for (const auto& y : accumulator) {
if (y.second.first != y.second.second)
return false;
}
return true;
}
int main()
{
std::vector<int> lhs{3,5,5,7,1};
std::vector<int> rhs{1,2,3,4,5,6,7};
std::cout << sortOfEquivalent(lhs, rhs);
}
Depending on your data, this may or may not be faster than the sorting method. It also may or may not take less storage than the sorting method.
Also in reality you'd probably take a reference to accumulator[x] in that second loop rather than looking up the element three times.
However, you can only apply this solution to your situation if you treat your stack as not-a-stack, i.e. using its underlying data store (forward iteration is required). This may or may not be permitted.

On the outside your functions handle a stack, but the actual structure you implement the stack in is a simple linked list.
And comparing two linked lists is done by comparing each element one by one, stopping when either list runs out or you find a difference in the elements.

incrementing the value in map using insert c++

I have the following problem: I want to count the occurrences of each word in a file. I'm using a map<string,Count>, so the key is the string object representing the word, and the value is an object that keeps the count for that word:
class Count {
int i;
public:
Count() : i(0) {}
void operator++(int) { i++; } // Post-increment
int& val() { return i; }
};
The problem is that I want to use insert() instead of the operator[]. Here is the code.
typedef map<string, Count> WordMap;
typedef WordMap::iterator WMIter;
int main( ) {
ifstream in("D://C++ projects//ReadF.txt");
WordMap wordmap;
string word;
WMIter it;
while (in >> word){
// wordmap[word]++; // not that way
if((it= wordmap.find(word)) != wordmap.end()){ //if the word already exists
wordmap.insert(make_pair(word, (*it).second++); // how do I increment the value ?
}else{
...
}
for (WMIter w = wordmap.begin();
w != wordmap.end(); w++)
cout << (*w).first << ": "
<< (*w).second.val() << endl;
}

Could you refactor so as not to use find but simply attempt the insert?
insert always returns a std::pair<iterator, bool>. The bool is false if the key was already present, and in that case the iterator points to the existing element. So we can go through the iterator and increment the value:
// On successful insertion, we get a count of 1 for that word
// (this assumes the mapped type can be built from 1, e.g. map<string, int>):
auto result_pair = wordmap.insert( { word, 1 } );
// Increment the count if the word is already there:
if (!result_pair.second)
result_pair.first->second++;
It was my first time posting. I'm learning C++ and welcome feedback on my idea.
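For the Count class from the question, which has no constructor taking an int, a variant of the same idea might look like the sketch below. countWord is a hypothetical helper name. It inserts a zero-initialized Count and then always increments through the returned iterator, which works whether or not the word was already present.

```cpp
#include <map>
#include <string>
#include <utility>

class Count {
    int i;
public:
    Count() : i(0) {}
    void operator++(int) { i++; }  // post-increment, as in the question
    int& val() { return i; }
};

using WordMap = std::map<std::string, Count>;

// Record one occurrence of word using insert() only.
// insert() leaves an existing entry untouched and returns an iterator to it,
// so incrementing through the iterator covers new and existing words alike.
void countWord(WordMap& wordmap, const std::string& word) {
    auto result = wordmap.insert(std::make_pair(word, Count()));
    result.first->second++;
}
```

In the reading loop this would be called as countWord(wordmap, word); for each word extracted from the file.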

The problem is that I want to use insert() instead of the operator[]
...why? std::map::insert does not modify an existing entry. operator[] is the right tool for this job.
If you really want to use insert (please don't), one way is to erase the existing entry first, if present (shown here as if the count were a plain int):
if((it= wordmap.find(word)) != wordmap.end())
{
const auto curr = it->second; // current number of occurrences
wordmap.erase(word);
wordmap.insert(make_pair(word, curr + 1));
}

Depth-first search

I have a suffix tree; each node of this tree is a struct:
struct state {
int len, link;
map<char,int> next; };
state st[100000];
I need to run a DFS from each node and collect all the strings I can reach from it, but I don't know how to do that.
This is my dfs function
void getNext(int node){
for(map<char,int>::iterator it = st[node].next.begin();it != st[node].next.end();it++){
getNext(it->second);
}
}
It would be perfect if I could build something like
map<int,vector<string> >
where the int is a node of my tree and the vector holds the strings I can reach from it.
Update: now it works:
void createSuffices(int node){//, map<int, vector<string> > &suffices) {
if (suffices[sz - 1].size() == 0 && (node == sz - 1)) {
// node is a leaf
// add a vector for this node containing just
// one element: the empty string
//suffices[node] = new vector<string>
//suffices.add(node, new vector<string>({""}));
vector<string> r;
r.push_back(string());
suffices[node] = r;
} else {
// node is not a leaf
// create the vector that will be built up
vector<string> v;
// loop over each child
for(map<char,int>::iterator it = st[node].next.begin();it != st[node].next.end();it++){
createSuffices(it->second);
vector<string> t = suffices[it->second];
for(int i = 0; i < t.size(); i ++){
v.push_back(string(1,it->first) + t[i]);
}
}
suffices[node] = v;
}
}

You can pass the map<int, vector<string>> along with your depth-first search. When a recursive call returns from a certain node n, you know that all suffixes from that node are ready. My C++ skills are too limited, so I'll write it in pseudo-code (the map has to be passed by reference so that every call fills the same map):
void createSuffices(int node, map<int, vector<string>>& suffices) {
if (st[node].next.empty()) {
// node is a leaf
// add a vector for this node containing just
// one element: the empty string
suffices.add(node, new vector<string>({""}));
} else {
// node is not a leaf
// create the vector that will be built up
vector<string> v;
// loop over each child
foreach pair<char, int> p in st[node].next {
// handle the child
createSuffices(p.second, suffices);
// prepend the character to all suffices of the child
foreach string suffix in suffices(p.second) {
v.add(concatenate(p.first, suffix));
}
}
// add the created vector to the suffix map
suffices.add(node, v);
}
}
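A compilable C++ rendering of the pseudo-code above might look like the following sketch. The names State, nodes and collectSuffixes are chosen for this example; the node array is passed in as a vector rather than read from the global st, and the structure is assumed to be a tree. If nodes can be reached along several paths (as in a suffix automaton), the results stay correct but nodes are recomputed, so a check for an already-filled entry would be worth adding.

```cpp
#include <map>
#include <string>
#include <vector>

struct State {
    std::map<char, int> next;  // edge label -> index of the child node
};

// For every node reachable from `node`, store all strings that can be read
// on paths from that node down to a leaf.
void collectSuffixes(const std::vector<State>& nodes, int node,
                     std::map<int, std::vector<std::string>>& suffixes) {
    std::vector<std::string> result;
    if (nodes[node].next.empty()) {
        result.push_back("");  // a leaf contributes only the empty string
    } else {
        for (const auto& edge : nodes[node].next) {
            collectSuffixes(nodes, edge.second, suffixes);
            // Prepend the edge character to every suffix of the child.
            for (const std::string& s : suffixes[edge.second]) {
                result.push_back(std::string(1, edge.first) + s);
            }
        }
    }
    suffixes[node] = result;
}
```

Calling collectSuffixes(nodes, 0, suffixes) once from the root fills the entries for every node below it, which matches the map<int, vector<string>> shape asked for in the question.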

Filling map with 2 keys from a string. Character and frequency c++

I am new to maps, so I am a little unsure of the best way to do this. The task relates to compression with Huffman coding. Here's what I have.
#include <map>
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
typedef map<char,int> huffmanMap;
void getFreq(string file, map<char, int> map)
{
map.clear();
for (string::iterator i = file.begin(); i != file.end(); ++i) {
++map[*i];
}
}
above is one method I found online but was unable to print anything
int main()
{
map<char, int> huffmanMap;
string fileline;
ifstream myfile;
myfile.open("text.txt",ios::out);
while(!myfile.eof()) {
getline(myfile, fileline); //get the line and put it in the fileline string
}
myfile.close();
I read in a from a text file to populate string fileline.
for (int i=0; i<fileline.length(); i++) {
char t = fileline[i];
huffmanMap[i]? huffmanMap[i]++ : huffmanMap[i]=1;
}
here is a second method I tried for populating the map, the char values are incorrect, symbols and smileyfaces..
getFreq(fileline,huffmanMap);
huffmanMap::iterator position;
for (position = huffmanMap.begin(); position != huffmanMap.end(); position++) {
cout << "key: \"" << position->first << endl;
cout << "value: " << position->second << endl;
}
This is how I tried to print map
system("pause");
return 0;
}
When I run my getFreq method, the program crashes. I don't get any compiler errors with either approach. With the second method, the char values are nonsense. Note that I have not had both methods running at the same time; I just included them both to show what I have tried.
Any insight would be appreciated. Thanks. Be lenient, I'm a beginner ;)

Your code is all over the place, it's not very coherent so difficult to understand the flow.
Here are some low-lights:
This is wrong: myfile.open("text.txt",ios::out); - why would you open an input stream with out flag? it should simply be:
string fileline;
ifstream myfile("text.txt");
while(getline(myfile, fileline)) {
// now use fileline.
}
In the while loop, what you want to do is to iterate over the content and add it to your map? So now the code looks like:
string fileline;
ifstream myfile("text.txt");
while(getline(myfile, fileline)) {
getFreq(fileline, huffmanMap);
}
Next fix, this is wrong: you have a typedef and a variable of the same name!
typedef map<char,int> huffmanMap;
map<char, int> huffmanMap;
Use sensible naming
typedef map<char,int> huffmanMap_Type;
huffmanMap_Type huffmanMap;
Next fix: your getFreq method signature is wrong. You are passing the map by value (i.e. as a copy) rather than by reference, so your modification inside the function is made to a copy, not the original!
wrong: void getFreq(string file, map<char, int> map)
correct: void getFreq(string file, huffmanMap_Type& map)
Next: why clear() in the above method? What if there is more than one line? No need for that surely?
That's enough for now, clean up your code and update your question if there are more issues.

One fix and One improvement.
Fix is : make second parameter in getFreq reference:
void getFreq(string file, map<char, int> & map); //notice `&`
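Improvement is : just write
huffmanMap[t]++;
instead of
huffmanMap[i]? huffmanMap[i]++ : huffmanMap[i]=1;
Note that the key should be the character t, not the loop index i; indexing by i is what produces the nonsense characters in your output. As for the conditional: by writing huffmanMap[i]? you're checking whether the count is zero or not. If it is zero, you make it one, which is the same as incrementing, because operator[] value-initializes a missing entry to 0.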
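Putting the fixes from the answers together, a corrected getFreq could look like the minimal sketch below. It keeps the function name from the question, takes the map by reference, and, following the earlier suggestion, does not clear the map so that counts accumulate over several lines.

```cpp
#include <map>
#include <string>

// Add the character counts of `text` to `freq`.
// `freq` is taken by reference so the caller's map is updated, and it is not
// cleared, so counts accumulate across several calls (e.g. one per line).
void getFreq(const std::string& text, std::map<char, int>& freq) {
    for (char c : text) {
        ++freq[c];  // operator[] inserts a zero count for a new character
    }
}
```

In the reading loop this would be used as while (getline(myfile, fileline)) getFreq(fileline, huffmanMap); with huffmanMap declared as a map<char, int> variable.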

Here is an answer using language features from C++20.
But first, you were asking about getting the count (frequency) of letters in a text.
There is a nearly universal approach for this: we can use std::unordered_map, which is described in the C++ reference.
It is std::unordered_map's very convenient index operator [] that makes counting so simple. This operator returns a reference to the value that is mapped to a key. So it searches for the key and then returns the value. If the key does not exist, it inserts a new key/value pair and returns a reference to the value.
So, either way, a reference to the value is returned, and this can be incremented. Example:
With a "std::unordered_map<char, int> mymap{};" and the text "aba", the following happens with the index operator:
mymap['a'] will search for an 'a' in the map. It is not found, so a new entry for 'a' with the corresponding value 0 is created, and a reference to that value is returned. If we now increment it, we increment the counter value. So mymap['a']++ will insert a new key/value pair 'a'/0 and then increment it, resulting in 'a'/1.
For 'b' the same as above will happen.
For the next 'a', an entry will be found in the map, and so a reference to the value (1) is returned. This is incremented and will then be 2.
And so on.
By using some modern language elements, a whole file can be read and its characters counted with one simple for-loop:
for (const char c : rng::istream_view<char>(ifs)) counter[c]++;
Note that istream_view<char> extracts with operator>>, so whitespace characters are skipped by default.
Additional information:
For building a Huffman tree, we can use a min-heap, which can easily be implemented with the existing std::priority_queue.
With 4 lines of code, the complete Huffman tree can be built.
At the end, we put the result in a code book, again a std::unordered_map, and show the result to the user.
This could then, for example, be implemented like this:
#include <iostream>
#include <fstream>
#include <string>
#include <unordered_map>
#include <algorithm>
#include <queue>
#include <ranges>
#include <vector>
#include <utility>
namespace rng = std::ranges; // Abbreviation for the ranges namespace
using namespace std::string_literals; // And we want to use string literals
// The Node of the Huffman tree
struct Node {
char letter{ '\0' }; // The letter that we want to encode
std::size_t frequency{}; // The letters frequency in the source text
Node* left{}, *right{}; // The pointers to the children of this node
};
// Some abbreviations to reduce typing work and make code more readable
using Counter = std::unordered_map<char, std::size_t>;
struct Comp { bool operator ()(const Node* n1, const Node* n2) { return n1->frequency > n2->frequency; } };
using MinHeap = std::priority_queue<Node*, std::vector<Node*>, Comp>;
using CodeBook = std::unordered_map<char, std::string>;
// Traverse the Huffman tree and build the code book
void buildCodeBook(Node* root, std::string code, CodeBook& cb) {
if (root == nullptr) return;
if (root->letter != '\0') cb[root->letter] = code;
buildCodeBook(root->left, code + "0"s, cb);
buildCodeBook(root->right, code + "1"s, cb);
}
// Get the top most two Elements from the Min-Heap
std::pair<Node*, Node*> getFrom(MinHeap& mh) {
Node* left{ mh.top() }; mh.pop();
Node* right{ mh.top() }; mh.pop();
return { left, right };
}
// Demo function
int main() {
if (std::ifstream ifs{ "r:\\lorem.txt" }; ifs) {
// Define the most important resulting work products
Counter counter{};
MinHeap minHeap{};
CodeBook codeBook{};
// Read complete text from source file and count all characters ---------
for (const char c : rng::istream_view<char>(ifs)) counter[c]++;
// Build the Huffman tree -----------------------------------------------
// First, create a min heap, based on the letters' frequency
for (const auto& p : counter) minHeap.push(new Node{p.first, p.second});
// Compress the nodes
while (minHeap.size() > 1u) {
auto [left, right] = getFrom(minHeap);
minHeap.push(new Node{ '\0', left->frequency + right->frequency, left, right });
}
// And last but not least, generate the code book -----------------------
buildCodeBook(minHeap.top(), {}, codeBook);
// And, as debug output, show the code book -----------------------------
for (const auto& [letter, code] : codeBook) std::cout << '\'' << letter << "': " << code << '\n';
}
else std::cerr << "\n\n***Error: Could not open source text file\n\n";
}
You may notice that we use new to allocate memory, but we do not delete it afterwards.
We could now add the delete statements at the appropriate positions, but I will show you a modified solution using smart pointers.
Please see here:
#include <iostream>
#include <fstream>
#include <string>
#include <unordered_map>
#include <map>
#include <algorithm>
#include <queue>
#include <ranges>
#include <vector>
#include <utility>
#include <memory>
namespace rng = std::ranges; // Abbreviation for the ranges namespace
using namespace std::string_literals; // And we want to use string literals
struct Node; // Forward declaration
using UPtrNode = std::unique_ptr<Node>; // Using smart pointer for memory management
// The Node of the Huffman tree
struct Node {
char letter{ '\0' }; // The letter that we want to encode
std::size_t frequency{}; // The letters frequency in the source text
UPtrNode left{}, right{}; // The pointers to the children of this node
};
// Some abbreviations to reduce typing work and make code more readable
using Counter = std::unordered_map<char, std::size_t>;
struct CompareNode { bool operator ()(const UPtrNode& n1, const UPtrNode& n2) { return n1->frequency > n2->frequency; } };
using MinHeap = std::priority_queue<UPtrNode, std::vector<UPtrNode>, CompareNode>;
using CodeBook = std::map<Counter::key_type, std::string>;
// Traverse the Huffman tree and build the code book
void buildCodeBook(UPtrNode&& root, std::string code, CodeBook& cb) {
if (root == nullptr) return;
if (root->letter != '\0') cb[root->letter] = code;
buildCodeBook(std::move(root->left), code + "0"s, cb);
buildCodeBook(std::move(root->right), code + "1"s, cb);
}
// Get the top most two Elements from the Min-Heap
std::pair<UPtrNode, UPtrNode> getFrom(MinHeap& mh) {
UPtrNode left = std::move(const_cast<UPtrNode&>(mh.top()));mh.pop();
UPtrNode right = std::move(const_cast<UPtrNode&>(mh.top()));mh.pop();
return { std::move(left), std::move(right) };
}
// Demo function
int main() {
if (std::ifstream ifs{ "r:\\lorem.txt" }; ifs) {
// Define the most important resulting work products
Counter counter{};
MinHeap minHeap{};
CodeBook codeBook{};
// Read complete text from source file and count all characters ---------
for (const char c : rng::istream_view<char>(ifs)) counter[c]++;
// Build the Huffman tree -----------------------------------------------
// First, create a min heap, based on the letters' frequency
for (const auto& p : counter) minHeap.push(std::make_unique<Node>(Node{ p.first, p.second }));
// Compress the nodes
while (minHeap.size() > 1u) {
auto [left, right] = getFrom(minHeap);
minHeap.push(std::make_unique<Node>(Node{ '\0', left->frequency + right->frequency, std::move(left), std::move(right) }));
}
// And last but not least, generate the code book -----------------------
buildCodeBook(std::move(const_cast<UPtrNode&>(minHeap.top())), {}, codeBook);
// And, as debug output, show the code book -----------------------------
for (std::size_t k{}; const auto & [letter, code] : codeBook) std::cout << ++k << "\t'" << letter << "': " << code << '\n';
}
else std::cerr << "\n\n***Error: Could not open source text file\n\n";
}
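One caveat about the counting loop used in both programs: rng::istream_view<char> reads with formatted extraction, so spaces and newlines never reach the counter. If whitespace should be encoded as well, a possible alternative is to iterate over the stream buffer directly, as in the sketch below. countAllChars is a name invented for this example; the Counter alias mirrors the one used above.

```cpp
#include <cstddef>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>
#include <unordered_map>

using Counter = std::unordered_map<char, std::size_t>;

// Count every character in the stream, whitespace included.
// std::istreambuf_iterator reads unformatted characters, so nothing is skipped.
Counter countAllChars(std::istream& in) {
    Counter counter;
    for (std::istreambuf_iterator<char> it(in), end; it != end; ++it) {
        ++counter[*it];
    }
    return counter;
}

// Convenience overload for in-memory text (handy for trying it out).
Counter countAllChars(const std::string& text) {
    std::istringstream in(text);
    return countAllChars(in);
}
```

With a file, the first overload would be called as countAllChars(ifs) in place of the range-based for loop; the rest of the program can stay as it is.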
