How would I find a partial word match in C++?

I'm having trouble with a scrabble-like game I'm making. My goal was to have 5 random letters that possibly match at least one word in a dictionary I made so the game can start. I've done this but since the 5 letters are random there's probably a slim chance it will match one of the words in the dictionary. I've done test runs and got random letters but no match every time (it works hardcoded however). I figure if I can find a partial match, I can keep the letter(s) that do form part of a word and get different letters for the ones that don't. The thing is, I don't know how to go about doing this.
This is my code for the five random letters:
void fiveLetters()
{
srand(time(NULL));
for (int i =0; i<=4; i++) // display 5 random letters
{
int n = rand()%26;
char letter = (char)(n+97);
letters[i] = letter;
cout << letters[i]<< " ";
}
letters[5] = '\0'; // last value in char array null value so I can convert to string
attempt = letters; // the 5 letters
checkWords();
}
For checking if it matches a word I use:
void checkWords()
{
if (find(words.begin(),words.end(), attempt) != words.end()) // if matches word in set
{
cout << " Words found" << endl;
}
else // if it doesn't match
{
cout << "cannot find word " << endl;
attempt.clear();
fiveLetters();
}
}
The dictionary is a text file with a lot of words however since I'm just trying to get things working before I implement it, I used 5-10 words and put them in a set<string>. Sorry for the long read, any help is appreciated!

This example demonstrates using a Trie to recursively search for words loaded into it, using search criteria like counts of available letters, minimum word length, and a maximum number of "missing" letters -- letters which are used without being included as an "available" letter.
This Trie is a 26-ary tree, so each node has up to 26 child nodes, one for each letter in the alphabet. The root node's children are selected using the first letter in a word, the second letter chooses one of that child's children, and so on. A node contains a boolean value indicating that a word ends at that node. Such a node is not necessarily a "leaf", however, because the terminated word might be a prefix of a longer word (a node terminates "ball", but still has a child branch for "balloon").
The Trie, combined with this search, is very fast for your particular task. (By the way, the search is recursive in nature but doesn't use a recursive function; it stores each level's context on a vector-based stack.)
#include <iostream>
#include <string>
#include <unordered_map>
#include <vector>
#include <fstream>
#include <random>
#include <algorithm>
class Trie {
// branch_count defines the number of characters which are combined to make words
// 26 is the number of letters, interpreted case-insensitively
static const int branch_count = 26;
// index_map takes a character and returns a number between 0 and branch_count-1
static int index_map(char c) {
if((c >= 'a') & (c <= 'z')) return c - 'a';
if((c >= 'A') & (c <= 'Z')) return c - 'A';
return -1;
}
// index_unmap takes an index, between 0 and branch_count-1, and returns
// the canonical character representation for that index
static char index_unmap(int index) {
return char('a' + index);
}
// verify_word returns true if all characters in the string map to an index
// returns false if at least one character is not recognized
static bool verify_word(const std::string& word) {
for(auto&& c : word) {
if(index_map(c) == -1) return false;
}
return true;
}
// Node is the Trie's branch_count-ary tree node class
struct Node {
Node* child_nodes[branch_count];
bool terminates_word;
Node() : terminates_word(false) {
for(auto& p : child_nodes) { p = nullptr; }
}
};
// make_lower(str) changes upper-case letters in str to lower-case (in-place)
static void make_lower(std::string& str) {
for(auto& c : str) {
if((c >= 'A') & (c <= 'Z')) {
c += 'a' - 'A';
} } }
// is_space_char(c) returns true when c is some kind of
// unprintable or blank, but nonzero, character
static bool is_space_char(char c) { return (c > 0) & (c <= ' '); }
// trim(str) removes whitespace from the left and right sides
// of str (str is modified in-place)
static void trim(std::string& str) {
const auto len = str.length();
if(!len) return;
auto i = len-1;
if(is_space_char(str[i])) {
for(--i; i+1; --i) {
if(!is_space_char(str[i])) {
str.resize(i+1);
break;
} } }
if(!(i+1)) {
str.clear();
return;
}
i=0;
if(is_space_char(str[i])) {
for(++i;; ++i) {
if(!is_space_char(str[i])) {
str = str.substr(i);
return;
} } } }
Node *root;
int node_count;
int word_count;
public:
// Trie::AddWord(word) stores a string in the Trie
void AddWord(std::string word) {
if(word.empty()) return;
make_lower(word);
if(!verify_word(word)) return;
Node *p = root;
for(const auto c : word) {
const int child_index = index_map(c);
if(child_index == -1) {
// verify_word(word) should have caught this.
// Well-behaved, but might use excess memory.
return;
}
Node *pchild = p->child_nodes[child_index];
if(!pchild) {
p->child_nodes[child_index] = pchild = new Node;
++node_count;
}
p = pchild;
}
if(!p->terminates_word) {
p->terminates_word = true;
++word_count;
} }
// LoadWords(input_stream) loads all line-delimited words from
// the stream into the Trie
int LoadWords(std::istream& stream_in) {
const int start_count = word_count;
std::string line;
while(std::getline(stream_in, line)) {
trim(line);
AddWord(line);
}
return word_count - start_count;
}
// LoadWords(filename) loads all line-delimited words from
// the file at the given path
int LoadWords(const std::string& file_path) {
std::ifstream stream_in(file_path.c_str());
return LoadWords(stream_in);
}
// WordCount() returns the number of words loaded so far
int WordCount() const { return word_count; }
// select_type is a helper for specifying iterator behavior
template <bool select_A, typename A, typename B>
struct select_type { typedef A type; };
template <typename A, typename B>
struct select_type<false, A, B> { typedef B type; };
template <bool select_A, typename A, typename B>
using select_type_t = typename select_type<select_A, A, B>::type;
// The iterator class is used for begin() and end(), as a minimal
// implementation compatible with range-based for,
// as well as by the destructor when destroying the
// tree's Node objects.
template <bool is_const=true, bool delete_iter=false>
class iterator {
friend class Trie;
typedef select_type_t<is_const, const Node*, Node*> pnode_t;
struct context {
pnode_t node;
int child_index;
};
std::vector<context> stack;
pnode_t advance() {
for(;;) {
if(stack.empty()) return nullptr;
pnode_t p = stack.back().node;
int &child_index = stack.back().child_index;
while(++child_index < branch_count) {
pnode_t pchild = p->child_nodes[child_index];
if(pchild) {
stack.push_back({pchild, -1});
if(!delete_iter && pchild->terminates_word) return nullptr;
break;
} }
if(stack.back().child_index == branch_count) {
stack.pop_back();
if(delete_iter) return p;
} } }
public:
iterator(pnode_t root) {
stack.push_back({root, -1});
if(!delete_iter) advance();
}
iterator() {}
std::string operator * () const {
if(stack.empty()) return std::string();
std::string word;
for(int i=0; stack[i].child_index != -1; ++i) {
word += index_unmap(stack[i].child_index);
}
return word;
}
iterator& operator ++ () {
advance();
return *this;
}
bool operator != (const iterator& iter) const {
if(stack.size() != iter.stack.size()) return true;
const int size = static_cast<int>(stack.size());
for(int i=0; i<size; ++i) {
if(stack[i].node != iter.stack[i].node) return true;
}
return false;
}
};
// ctor
Trie() : root(new Node), node_count(1), word_count(0) {}
// dtor
~Trie() {
iterator<false, true> iter(root);
int count = 0;
while(auto pn = iter.advance()) {
delete pn;
++count;
}
//std::cout << "Final word count: " << word_count << '\n';
//std::cout << count << " of " << node_count << " Node objects deleted\n";
}
// const_iterator defined from iterator with template parameter
// for selecting const Node pointers
typedef iterator<true> const_iterator;
const_iterator begin() const { return const_iterator(root); }
const_iterator end() const { return const_iterator(); }
// FindWords:
// * takes an unordered map with char keys and int values
// (counts[ch] = <how many ch may be used>)
// * takes a "max_missing" count (number of substituted characters which
// may be used when none remain in "counts")
// * takes a "min_length", the minimum length a word
// must have to be added to the results
std::vector<std::string> FindWords(const std::unordered_map<char, int>& counts, int max_missing=0, int min_length=0) {
struct context {
const Node* node;
int child_index;
std::unordered_map<char, int> counts;
int missing_count;
bool missing_letter;
context(const Node* node, const std::unordered_map<char, int>& counts, int missing_count) :
node(node),
child_index(-1),
counts(counts),
missing_count(missing_count),
missing_letter(false)
{}
};
std::vector<context> stack;
stack.push_back(context(root, counts, 0));
std::vector<std::string> match_list; // results are added to this
// This walks the tree just like the iterator's advance() function
// however, this function's context includes more info, like
// each level's available letter counts and whether a letter
// was used by taking one of the available "missing" letters.
while(!stack.empty()) {
context& ctx = stack.back();
while(++ctx.child_index < branch_count) {
const Node* pchild = ctx.node->child_nodes[ctx.child_index];
if(!pchild) continue;
const char c = index_unmap(ctx.child_index);
if(ctx.counts[c] > 0) {
ctx.missing_letter = false;
context child_ctx(pchild, ctx.counts, ctx.missing_count);
--child_ctx.counts[c];
stack.push_back(child_ctx); // ctx made invalid here
break;
}
else if(ctx.missing_count < max_missing) {
ctx.missing_letter = true;
context child_ctx(pchild, ctx.counts, ctx.missing_count + 1);
stack.push_back(child_ctx); // ctx made invalid here
break;
} }
context& fresh_ctx = stack.back();
if(fresh_ctx.child_index == branch_count) {
stack.pop_back();
continue;
}
// After a new level is pushed on the stack, check if the last node
// completes a word from the tree, then check various conditions
// required for adding it to the results.
if(static_cast<int>(stack.size()) > min_length && fresh_ctx.node->terminates_word) {
std::string word;
for(const auto& entry : stack) {
if(entry.child_index != -1) {
char c = index_unmap(entry.child_index);
if(entry.missing_letter) {
// modify the character to show it was substituted
if(c >= 'a' && c <= 'z') {
// capitalize missing lower-case letters
word += c + 'A' - 'a';
} else {
// put funky square brackets around other types of missing characters
word += '[';
word += c;
word += ']';
}
} else {
word += c;
} } }
match_list.push_back(word);
} }
return match_list;
}
// FindWords(letters, max_missing, min_length) creates a "counts" map
// from the "letters" string and uses it to call FindWords(counts...)
std::vector<std::string> FindWords(std::string letters, int max_missing=0, int min_length=0) {
std::unordered_map<char, int> counts;
for(auto c : letters) {
switch(c) {
case '?': ++max_missing; break; // '?' is a wildcard (blank tile)
default: ++counts[c]; break;
} }
return FindWords(counts, max_missing, min_length);
}
// DumpAllWords dumps all words contained in the Trie to cout (in alphabetical order)
void DumpAllWords() {
for(auto&& s : *this) {
std::cout << s << '\n';
} }
};
class DrawPool {
std::mt19937 rng;
std::string letters;
int last_count = 1;
struct arg_is_int { char i[4]; };
static arg_is_int argtype(int);
static char argtype(char);
void Add() {}
template <typename T, typename ...Args>
void Add(T arg, Args ...args) {
if(sizeof(argtype(arg)) == sizeof(arg_is_int)) {
last_count = arg;
} else {
letters += std::string(last_count, arg);
}
Add(args...);
}
public:
void Shuffle() {
letters.clear();
Add(2, '?',
12,'e', 9, 'a', 'i', 8, 'o', 6, 'n', 'r', 't',
4, 'l', 's', 'u', 'd', 3, 'g',
2, 'b', 'c', 'm', 'p', 'f', 'h', 'v', 'w', 'y',
1, 'k', 'j', 'x', 'q', 'z');
std::shuffle(letters.begin(), letters.end(), rng);
}
int Count() const { return static_cast<int>(letters.length()); }
std::string Get(int count) {
if(count > Count()) count = Count();
const std::string draw = letters.substr(0, count);
letters.erase(0, count);
return draw;
}
DrawPool() {
std::random_device rd;
std::seed_seq seed = {rd(), rd(), rd(), rd(), rd()};
rng.seed(seed);
Shuffle();
}
};
int main() {
Trie trie;
// Call trie.LoadWords(filename) with each filename in the list
// The names here are files from the SCOWL word lists.
// These may be replaced with your own file name(s).
for(auto s : {
"english-words.10",
"english-words.20",
"english-words.35",
"english-words.40",
"english-words.50",
"english-words.55",
"english-words.60",
"english-words.70",
"english-words.80",
"english-words.95"
}) {
int count = trie.LoadWords(s);
std::cout << "Loaded " << count << " new words from " << s << '\n';
}
std::cout << "\nLoaded a total of " << trie.WordCount() << " words\n";
//trie.DumpAllWords(); // send all words to cout
// match a set of 7 randomly-drawn letters
// draw one more letter each time no matches found
DrawPool draw_pool;
std::string rack;
std::vector<std::string> word_list;
do {
rack += draw_pool.Get(rack.empty() ? 7 : 1);
std::cout << "\nRack: " << rack << '\n';
// find words at least 3 letters long, using the letters
// from "rack". The only "missing" matches allowed are
// 1 for each wildcard ('?') in "rack".
word_list = trie.FindWords(rack, 0, 3);
} while(word_list.empty() && draw_pool.Count());
// Dump the results to cout
std::cout << "\nFound " << word_list.size() << " matches\n";
for(auto&& word : word_list) {
std::cout << " " << word << '\n';
}
}

This is something from my class. std::string::find returns string::npos when the text is not found:
if (variable_name.find(thing_to_check_for) != string::npos)
For example:
if (myname.find("james") != string::npos)
{
found=true;
//do something
}

Related

String function optimisation?

I'm new to C++ and I just wrote a function to tell me if certain characters in a string repeat or not:
bool repeats(string s)
{
int len = s.size(), c = 0;
for(int i = 0; i < len; i++){
for(int k = 0; k < len; k++){
if(i != k && s[i] == s[k]){
c++;
}
}
}
return c;
}
...but I can't help but think it's a bit congested for what it's supposed to do. Is there any way I could write such a function in fewer lines?
Is there any way I could write such a function in fewer lines?
With std, you might do:
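#include <set>
#include <string>
bool repeats(const std::string& s)
{
// a set keeps one copy of each character, so it is smaller than s if any character repeats
return std::/*unordered_*/set<char>{s.begin(), s.end()}.size() != s.size();
}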
#include <algorithm>
#include <string>
bool repeats(const std::string& s){
for (auto c : s){
// a character that occurs more than once is a repeat
if(std::count(s.begin(), s.end(), c) > 1)
return true;
}
return false;
}
Assuming you are not looking for repeated substrings:
#include <iostream>
#include <string>
#include <set>
std::set<char> ignore_characters{ ' ', '\n' };
bool has_repeated_characters(const std::string& input)
{
// std::set<char> is a collection of unique characters
std::set<char> seen_characters{};
// loop over all characters in the input string
for (const auto& c : input)
{
// skip characters to ignore, like spaces
if (ignore_characters.find(c) == ignore_characters.end())
{
// check if the set contains the character, in C++20 : seen_characters.contains(c)
// and maybe you need to do something with "std::tolower()" here too
if (seen_characters.find(c) != seen_characters.end())
{
return true;
}
// add the character to the set, we've now seen it
seen_characters.insert(c);
}
}
return false;
}
void show_has_repeated_characters(const std::string& input)
{
std::cout << "'" << input << "' ";
if (has_repeated_characters(input))
{
std::cout << "has repeated characters\n";
}
else
{
std::cout << "doesn't have repeated characters\n";
}
}
int main()
{
show_has_repeated_characters("Hello world");
show_has_repeated_characters("The fast boy");
return 0;
}
#include <string>
bool repeats(const std::string& str)
{
int counts[256]={0}; // one counter per possible byte value
for(auto s:str)
counts[(unsigned char)s]++;
for(int i=0;i<256;i++)
if(counts[i]>1) return true;
return false;
}
6 lines instead of 9
O(n+256) instead of O(n^2)
This is your new compact function:
#include <iostream>
#include <string>
#include <algorithm>
using namespace std;
int occurrences(const string& s, char c) {
return static_cast<int>(count(s.begin(), s.end(), c)); }
int main() {
//occurrences() counts how many times the character appears in the string.
//Any count other than 0 converts to true, so this prints "repeats!"
//as soon as the character occurs at least once.
occurrences("Hello World!",'x')?cout<<"repeats!":cout<<"no repeats!";
//This is equivalent to writing
//
// if(occurrences("Hello World!",'x'))
// cout<<"repeats!";
// else
// cout<<"no repeats!";
//To store the number of occurrences instead
//
// int count = occurrences("Hello World!",'x');
}

Getting Permutations with Repetitions in this special way

I have a list {a,b} and I need all possible combinations of length n, say n=3.
so:
[a,b,a],
[b,a,b]
[a,a,a]
[b,b,b]
etc.
Is there a name for such a problem?
My current solution just uses random sampling and is very inefficient:
void set_generator(const vector<int>& vec, int n){
map<string, vector<int>> imap;
int rcount = 0;
while(1){
string ms = "";
vector<int> mset;
for(int i=0; i<n; i++){
int sampled_int = vec[rand() % vec.size()];
ms += std::to_string(sampled_int);
mset.emplace_back(sampled_int);
}
if(rcount > 100)
break;
if(imap.count(ms)){
rcount += 1;
//cout << "*" << endl;
continue;
}
rcount = 0;
imap[ms] = mset;
cout << ms << endl;
}
}
set_generator({1,2},3);
Let us call b the size of the input vector.
The problem consists in generating all numbers from 0 to b^n - 1, in base b.
A simple solution increments the elements of an array one by one, each from 0 to b-1.
This is performed by the function increment in the code hereafter.
Output:
111
211
121
221
112
212
122
222
The code:
#include <iostream>
#include <vector>
#include <string>
#include <map>
#include <cstdlib> // rand
void set_generator_op (const std::vector<int>& vec, int n){
std::map<std::string, std::vector<int>> imap;
int rcount = 0;
while(1){
std::string ms = "";
std::vector<int> mset;
for(int i=0; i<n; i++){
int sampled_int = vec[rand() % vec.size()];
ms += std::to_string(sampled_int);
mset.emplace_back(sampled_int);
}
if(rcount > 100)
break;
if(imap.count(ms)){
rcount += 1;
//cout << "*" << endl;
continue;
}
rcount = 0;
imap[ms] = mset;
std::cout << ms << "\n";
}
}
// increments an array of int digits, in base "base"
// returns false if the maximum was already reached (all digits wrap back to 0)
bool increment (std::vector<int>& cpt, int base) {
int n = cpt.size();
for (int i = 0; i < n; ++i) {
cpt[i]++;
if (cpt[i] != base) {
return true;
}
cpt[i] = 0;
}
return false;
}
void set_generator_new (const std::vector<int>& vec, int n){
int base = vec.size();
std::vector<int> cpt (n, 0);
while (true) {
std::string permut = "";
for (auto &k: cpt) {
permut += std::to_string (vec[k]);
}
std::cout << permut << "\n";
if (!increment(cpt, base)) return;
}
}
int main() {
set_generator_op ({1,2},3);
std::cout << "\n";
set_generator_new ({1,2},3);
}
Following Jarod42's advice, I have:
- removed the useless conversion to a string
- used a more elegant do ... while instead of the while (true)
- reversed the iterators for printing the result
Moreover, I have created a templated version of the program.
New output:
111
112
121
122
211
212
221
222
aaa
aab
aba
abb
baa
bab
bba
bbb
And the new code:
#include <iostream>
#include <vector>
#include <string>
#include <map>
// increments an array of int digits, in base "base"
// returns false if the maximum was already reached (all digits wrap back to 0)
bool increment (std::vector<int>& cpt, int base) {
int n = cpt.size();
for (int i = 0; i < n; ++i) {
cpt[i]++;
if (cpt[i] != base) {
return true;
}
cpt[i] = 0;
}
return false;
}
template <typename T>
void set_generator_new (const std::vector<T>& vec, int n){
int base = vec.size();
std::vector<int> cpt (n, 0);
do {
for (auto it = cpt.rbegin(); it != cpt.rend(); ++it) {
std::cout << vec[*it];
}
std::cout << "\n";
} while (increment(cpt, base));
}
int main() {
set_generator_new<int> ({1,2}, 3);
std::cout << "\n";
set_generator_new<char> ({'a','b'}, 3);
}
Besides the concrete answer for integer usage, I want to provide a generic way I needed during test case construction for scenarios with a wide spread of various parameter variations. Maybe it's helpful to you too, at least for similar scenarios.
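#include <vector>
#include <memory>
#include <stdexcept> // std::logic_error
#include <iterator> // std::next
#include <cstddef> // size_t
class SingleParameterToVaryBase
{
public:
// virtual destructor: objects are destroyed through std::unique_ptr<SingleParameterToVaryBase>
virtual ~SingleParameterToVaryBase() = default;
virtual bool varyNext() = 0;
virtual void reset() = 0;
};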
template <typename _DataType, typename _ParamVariationContType>
class SingleParameterToVary : public SingleParameterToVaryBase
{
public:
SingleParameterToVary(
_DataType& param,
const _ParamVariationContType& valuesToVary) :
mParameter(param)
, mVariations(valuesToVary)
{
if (mVariations.empty())
throw std::logic_error("Empty variation container for parameter");
reset();
}
// Step to next parameter value, return false if end of value vector is reached
virtual bool varyNext() override
{
++mCurrentIt;
const bool finished = mCurrentIt == mVariations.cend();
if (finished)
{
return false;
}
else
{
mParameter = *mCurrentIt;
return true;
}
}
virtual void reset() override
{
mCurrentIt = mVariations.cbegin();
mParameter = *mCurrentIt;
}
private:
typedef typename _ParamVariationContType::const_iterator ConstIteratorType;
// Iterator to the actual values this parameter can yield
ConstIteratorType mCurrentIt;
_ParamVariationContType mVariations;
// Reference to the parameter itself
_DataType& mParameter;
};
class GenericParameterVariator
{
public:
GenericParameterVariator() : mFinished(false)
{
reset();
}
template <typename _ParameterType, typename _ParameterVariationsType>
void registerParameterToVary(
_ParameterType& param,
const _ParameterVariationsType& paramVariations)
{
mParametersToVary.push_back(
std::make_unique<SingleParameterToVary<_ParameterType, _ParameterVariationsType>>(
param, paramVariations));
}
const bool isFinished() const { return mFinished; }
void reset()
{
mFinished = false;
mNumTotalCombinationsVisited = 0;
for (const auto& upParameter : mParametersToVary)
upParameter->reset();
}
// Step into next state if possible
bool createNextParameterPermutation()
{
if (mFinished || mParametersToVary.empty())
return false;
auto itPToVary = mParametersToVary.begin();
while (itPToVary != mParametersToVary.end())
{
const auto& upParameter = *itPToVary;
// If we are the very first configuration at all, do not vary.
const bool variedSomething = mNumTotalCombinationsVisited == 0 ? true : upParameter->varyNext();
++mNumTotalCombinationsVisited;
if (!variedSomething)
{
// If we were not able to vary the last parameter in our list, we are finished.
if (std::next(itPToVary) == mParametersToVary.end())
{
mFinished = true;
return false;
}
++itPToVary;
continue;
}
else
{
if (itPToVary != mParametersToVary.begin())
{
// Reset all parameters before this one
auto itBackwd = itPToVary;
do
{
--itBackwd;
(*itBackwd)->reset();
} while (itBackwd != mParametersToVary.begin());
}
return true;
}
}
return true;
}
private:
// Linearized parameter set
std::vector<std::unique_ptr<SingleParameterToVaryBase>> mParametersToVary;
bool mFinished;
size_t mNumTotalCombinationsVisited;
};
Possible usage:
GenericParameterVariator paramVariator;
size_t param1;
int param2;
char param3;
paramVariator.registerParameterToVary(param1, std::vector<size_t>{ 1, 2 });
paramVariator.registerParameterToVary(param2, std::vector<int>{ -1, -2 });
paramVariator.registerParameterToVary(param3, std::vector<char>{ 'a', 'b' });
std::vector<std::tuple<size_t, int, char>> visitedCombinations;
while (paramVariator.createNextParameterPermutation())
visitedCombinations.push_back(std::make_tuple(param1, param2, param3));
Generates:
(1, -1, 'a')
(2, -1, 'a')
(1, -2, 'a')
(2, -2, 'a')
(1, -1, 'b')
(2, -1, 'b')
(1, -2, 'b')
(2, -2, 'b')
This can certainly be optimized or specialized further. For instance, you could add a hashing scheme and/or a filter functor if you want to skip combinations that are effectively repetitions. Also, since the parameters are held by reference, consider protecting the generator against error-prone usage by deleting the copy constructor and the copy assignment operator.
Time complexity is proportional to the number of combinations generated (the product of the variation container sizes), which is what any full enumeration requires.

Huffman Encoding C++ code is throwing a fatal error

I am writing code for the well-known Huffman encoding algorithm. I am getting a fatal error that turns the system into a blue screen and then restarts it. The error occurs in display_Codes, which makes recursive calls, on the following lines:
display_Codes(root->l, s + "0");
display_Codes(root->r, s + "1" );
Following is the complete code.
#include <iostream>
#include <bits/stdc++.h>
using namespace std;
class HeapNode_Min {
public:
char d;
unsigned f;
HeapNode_Min *l, *r;
HeapNode_Min(char d, unsigned f)
{
this->d = d;
this->f = f;
}
~HeapNode_Min()
{
delete l;
delete r;
}
};
class Analyze {
public:
bool operator()(HeapNode_Min* l, HeapNode_Min* r)
{
return (l->f > r->f);
}
};
void display_Codes(HeapNode_Min* root, string s)
{
if(!root)
return;
if (root->d != '$')
cout << root->d << " : " << s << "\n";
display_Codes(root->l, s + "0");
display_Codes(root->r, s + "1" );
}
void HCodes(char data[], int freq[], int s)
{
HeapNode_Min *t,*r, *l ;
priority_queue<HeapNode_Min*, vector<HeapNode_Min*>, Analyze> H_min;
int a=0;
while (a<s){H_min.push(new HeapNode_Min(data[a], freq[a])); ++a;}
while (H_min.size() != 1) {
l = H_min.top(); H_min.pop();
r = H_min.top(); H_min.pop();
t = new HeapNode_Min('$', r->f + l->f);
t->r = r; t->l = l;
H_min.push(t);
}
display_Codes(H_min.top(), "");
}
int main()
{
try
{
int frequency[] = { 3, 6, 11, 14, 18, 25 }; char alphabet[] = { 'A', 'L', 'O', 'R', 'T', 'Y' };
int size_of = sizeof(alphabet) / sizeof(alphabet[0]);
cout<<"Alphabet"<<":"<<"Huffman Code\n";
cout<<"--------------------------------\n";
HCodes(alphabet, frequency, size_of);
}
catch(...)
{
}
return 0;
}
You never set l or r to nullptr, but your code relies on the pointers being either valid or nullptr:
void display_Codes(HeapNode_Min* root, string s)
{
if(!root)
return;
if (root->d != '$')
cout << root->d << " : " << s << "\n";
display_Codes(root->l, s + "0");
display_Codes(root->r, s + "1" );
}
If you pass a root with no left and no right node, neither root->l nor root->r holds a value you can use for anything, because both are uninitialized. Passing them to the next recursion dereferences them, which invokes undefined behavior.
To fix that you need to initialize the pointers, e.g. in the constructor:
HeapNode_Min(char d, unsigned f) : d(d),f(f),l(nullptr),r(nullptr) { }
Also your class does not follow the rule of 3/5/0.

How to design sort algorithm based on two indicators?

I have a container (array or vector) holding millions of words, and I need to sort them as follows. The primary sort order should be the number of characters in the word. The secondary sort order should be lexicographical. I cannot use any library function such as sort; I want to create the algorithm from scratch. I would appreciate it if anyone could point me to a reference.
So sorting the words:
This is a list of unsorted words
should give:
a is of This list words unsorted
Edit:
I am not allowed to use any STL such as sort
//Following is my final program
//It will be run with the following args: <inputfile> <outputfile> <timesfile> <ntests>
//timesfile is for storing times and ntests is the number of tests
/*
Bernard Grey
10 Wednesday 10 Sep 2014
*/
#include <iostream>
#include <ctime>
#include <algorithm>
#include <fstream>
#include <cctype>
#include <cstdlib>
#include <cstring>
#include <vector>
using namespace std;
//This node contain two type of information both in the vector
//First is vector for hash function. it contains number of repetition of the word
//Second node contain a word for values in my vector and the other field is for future implementation ;)
struct node
{
string val;
int count;
};
//Definition of inner and outer vectors as cintainer of words and hash table
typedef std::vector<node> StringVector;
typedef std::vector<StringVector> StringVector2D;
//Cited at http://stackoverflow.com/questions/8317508/hash-function-for-a-string :In the comment
int HashTable (string word)
{
int seed = 378551;
unsigned long hash = 0;
for(int i = 0; i < word.length(); i++)
{
hash = (hash * seed) + word[i];
}
return hash % 1000000;//Later assign it to number of words
}
//Cite at: http://stackoverflow.com/questions/25726530/how-to-find-an-struct-element-in-a-two-dimention-vector
struct find_word
{
string val;
find_word(string val) : val(val) {}
bool operator () ( const node& m ) const
{
return m.val == val;
}
};
//I could use swap function in vector instead of implementing this function
void swap(StringVector& vec, int i, int j)
{
node tmp = vec[i];
vec[i] = vec[j];
vec[j] = tmp;
}
//To compare string alphabetically order
bool comp(node& i,node& p)
{
int cmp;
if(i.val.compare(p.val)<0)
{
return true;
}
return false;
}
void quickSort(StringVector& aVec, int left, int right);
int partition(StringVector& aVec, int left, int right);
void swap(StringVector& aVec, int left, int right);
void quickSort(StringVector& aVec, int left, int right)
{
if(right>0){
int index = partition(aVec,left,right);
if (left<index-1) {
quickSort(aVec, left, index-1);
}
if (index<right) {
quickSort(aVec, index,right);
}
}
}
int partition(StringVector& aVec, int left, int right)
{
string pivotNode;
pivotNode = aVec[(left+right)/2].val;
while (left<=right) {
while (aVec[left].val.compare(pivotNode)<0) {left++; }
while (aVec[right].val.compare(pivotNode)>0) {right--; }
if (left<=right) {
swap(aVec,left,right);
left++;
right--;
}
}
return left;
}
//Welcome to Maaaain
int main(int argc, char* argv[])
{
/*file reading and preprocessing*/
if(argc != 5)
{
cerr << "usage: " << argv[0] << " infile outfile timesfile ntests" << endl;
}
ifstream fin(argv[1]);
if(fin.fail())
{
cerr << "Error: failed to open file " << argv[1] << " for input" << endl;
exit(EXIT_FAILURE);
}
int ntests = atoi(argv[4]);
//Len of string and max num word
int stringlen, numwords;
get_max_words(fin, stringlen, numwords);
//initial string
string init[numwords];
//Read the file and add it to first array
for(int i=0; i<numwords; i++)
{
string tmp;
fin >> tmp;
int len = tmp.length();
//There is one single ' in the example output file. so I do not want to delete that one :-)
bool pp = true;
//Remove punct from leading and tail
if(len==1)
{
pp=false;
}
//Remove punc
if( ispunct(tmp[0]) && pp)
{
tmp.erase(0,1);
}
//Remove punc
if( ispunct(tmp[len-1]) && pp)
{
tmp.erase(len-1,1);
}
init[i] =tmp;
}
/*
At this point, everything should be in the initial array
The temporary array should be declared but not filled
*/
clockid_t cpu;
timespec start, end;
long time[ntests];
//2 Dimension vector this will called outer vector
StringVector2D twoD;
if(clock_getcpuclockid(0, &cpu) != 0)
{
cerr << "Error: could not get cpu clock" << endl;
exit(EXIT_FAILURE);
}
int rep = 0;
node tmp;
tmp.count = 0;
tmp.val = "";
//Later I need to assign it to number of words * M ... Good for encryption... It is not a security subject
vector<node> first(1000000,tmp);
//This is called inner vector
vector<string> templateVec;
//Last search?
bool last = false;
//Initialize inner map as needed and put it inside the outer vector with no data
for(int f=0;f<(stringlen);f++)
{
StringVector myVec;
twoD.push_back(myVec);
}
for(int i=0; i<ntests; i++)
{
if(clock_gettime(cpu, &start) == -1)
{
cerr << "Error: could not get start time" << endl;
exit(EXIT_FAILURE);
}
//Check if it is last iteration so do not delete data for printing purposeses
if(i == ntests-1)
{
last = true;
}
/*copy from initial array to temporary array*/
//Initialize inner vector with the values. In this point outer vector is filled with inner vector
//&&& inner vector is empty myvec.empty() = true;
//vector at index 0 is for words with one char... vector 1 is for words with two chars and so on...
for(int j=0; j<numwords; j++)
{
int len = init[j].length()-1;
if(len<0)continue;
//Initilize a node to fill up the vector
node currNode;
currNode.val = init[j];
//currNode.count = 0;
int hash = HashTable(init[j]);
//Node already existed
if(first[hash].count != 0){
//Add to its value in hash table
first[hash].count++;
}
else
{
//Activate word first time!
first[hash].count =1;
//Not strictly needed because of the hash table, but it may help with future improvements
first[hash].val = init[j];
//Add the word to appropriate level in outer string! 1char == [0] --- 2char== [1] so on
twoD[len].push_back(currNode);
}
}
//Sort each length bucket in alphabetical order
for(int f=0;f<(stringlen);f++)
{
//Efficient sorting algorithm with no chance of a segmentation fault ;)
quickSort(twoD[f],0,twoD[f].size()-1);
}
//Time finished
if(clock_gettime(cpu, &end) == -1)
{
cerr << "Error: could not get end time" << endl;
exit(EXIT_FAILURE);
}
//Delete items from vector if it is not last iteration --- This is not part of sorting algorithm so it is after clock
if(!last)
{
for(int f=0;f<stringlen;f++)
{
twoD[f].clear();
}
twoD.clear();
for(StringVector::iterator it3 = first.begin();it3!=first.end();it3++)
{
it3->val="";
it3->count=0;
}
//Initialize inner map as needed and put it inside the outer vector
for(int f=0;f<(stringlen);f++)
{
StringVector myVec;
twoD.push_back(myVec);
}
}
/*time per trial in nanoseconds*/
time[i] = (end.tv_sec - start.tv_sec)*1000000000 + end.tv_nsec - start.tv_nsec;
}
/*output sorted temporary array*/
int k=0;
int y =0;
int num=0;
ofstream fout(argv[2]);
//Pointer for inner vector
StringVector::iterator it2;
for (StringVector2D::iterator outer = twoD.begin(); outer != twoD.end(); ++outer){
y++;
k=0;
for (it2= outer->begin(); it2!=outer->end(); ++it2){
//Get back data from hash table
int hash = HashTable(it2->val);
//Number of word in other field of the node
int repWord = first[hash].count;
//Print according to that
for(int g=0; g < repWord ;g++){
num++;
//10 words per line
if(num%10 == 0)
{
fout << it2->val;
fout<<endl;
k++;
}
else
{
fout<< it2->val << " ";
}
}
}
}
//Sort times with STL for god sake....
sort(time,time+ntests);
//print times to the file///
ofstream ftimes(argv[3]);
for(int i=0; i<ntests; i++)
ftimes << time[i] << endl;
}
//Helper function .. nice job
void get_max_words(ifstream& fin, int& wordlen, int& numwords)
{
char c;
int count=0;
wordlen = numwords = 0;
while(fin.good() && fin.get(c) && isspace(c)){;} //skip leading space
while(fin.good())
{
++numwords;
while(fin.good() && !isspace(c))
{
++count;
fin.get(c);
}
if(count > wordlen)
wordlen = count;
count = 0;
while(fin.good() && fin.get(c) && isspace(c)){;} //skip space
}
if(count > wordlen)
wordlen = count;
fin.clear();
fin.seekg(0, ios::beg);
}
You'll primarily need a comparator for your sort routine to sort on:
bool lessThan(const std::string& a, const std::string& b) {
if (a.length() != b.length())
return a.length() < b.length();
return a < b;
}
There's actually an easy way to implement this with the STL. std::sort has an overload that takes a comparator:
template <class RandomAccessIterator, class Compare>
void sort (RandomAccessIterator first, RandomAccessIterator last, Compare comp);
So you can do this:
bool comparator(const string& a, const string& b) {
if (a.length() < b.length())
return true;
if (a.length() == b.length())
return a < b;
return false;
}
sort(words.begin(), words.end(), comparator);
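The same idea can be written with a lambda so the comparator sits next to the call. This is only a minimal sketch: the helper name sortWords is made up for illustration, and it assumes the words are already in a std::vector<std::string>.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: sorts in place, shortest words first,
// alphabetical order within each length.
void sortWords(std::vector<std::string>& words) {
    std::sort(words.begin(), words.end(),
              [](const std::string& a, const std::string& b) -> bool {
                  if (a.length() != b.length())
                      return a.length() < b.length();
                  return a < b;
              });
}
```

Calling sortWords on {"pear", "fig", "apple", "kiwi", "a"} leaves the vector as a, fig, kiwi, pear, apple.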
This is about sorting on multiple keys. I suggest you study an efficient sorting algorithm, say quicksort, and then change the comparator to handle multiple keys.
For any comparison-based sorting algorithm, the easiest way to support multi-key sorting is to change the comparison criterion from a single value to multiple values.
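One compact way to compare on several values at once is to pack the keys into tuples, since tuples compare key by key from left to right. The sketch below assumes C++11 is available; the function name multiKeyLess is hypothetical and not part of the question.

```cpp
#include <cstddef>
#include <string>
#include <tuple>

// Compares (length, text) pairs: std::tie builds tuples of references and
// the tuple operator< compares them key by key, left to right.
bool multiKeyLess(const std::string& a, const std::string& b) {
    const std::size_t lenA = a.size();
    const std::size_t lenB = b.size();
    return std::tie(lenA, a) < std::tie(lenB, b);
}
```

Adding a third key later only means adding one more element to each tuple.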
If you are not allowed to use the STL at all, i.e. you cannot use sort from <algorithm>, here is a post you can start with: Sorting an array using multiple sort criteria (QuickSort)
If you are allowed, just write a comparison function that supports multi-key comparison and plug it into the sort function. You can check this C++ reference for more details.
An illustration (it's just an illustration to point out how you can plug in the compare function):
bool comparator(const string& a, const string& b) {
if (a.length() < b.length())
return true;
if (a.length() > b.length())
return false;
return a < b;
}
void Qsort(string a[], int low, int high)
{
    if (low >= high)
    {
        return;
    }
    int left = low;
    int right = high;
    // Copy the pivot: the swaps below may move the element it came from.
    string key = a[low + (high - low) / 2];
    while (left <= right)
    {
        // Skip elements that are already on the correct side of the pivot.
        while (comparator(a[left], key)) left++;
        while (comparator(key, a[right])) right--;
        if (left <= right)
        {
            swap(a[left], a[right]);
            left++; right--;
        }
    }
    // Here right < left: everything in [low, right] sorts before [left, high].
    if (low < right) Qsort(a, low, right);
    if (left < high) Qsort(a, left, high);
}
The question asks for a design, so I'll focus on the design of your sorting library rather than on an implementation.
Your sort algorithm can use custom comparator objects with a member operator() implemented for comparing two elements.
Your comparator can be a linked list of comparators, each calling the next one if the current one reports a tie. You'll have to ensure that it always returns true or false, though, even when every comparator ties. Alternatively, make the sort stable so that tied elements keep their original order.
So the first comparator compares the number of characters and the second compares lexicographically.
The idea is general enough that the comparators can be reused if your requirements change tomorrow.
This is along the lines of the Chain of Responsibility pattern. You can templatize the comparator once you've got the gist.
Example:
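class Chain_Comparator
{
    Chain_Comparator* next;
public:
    Chain_Comparator( Chain_Comparator* n = 0 ) : next( n ) {}
    virtual ~Chain_Comparator() {}
    bool operator()( void* a, void* b )
    {
        if( a_is_less( a, b ) )
            return true;
        else if( b_is_less( a, b ) )
            return false;
        else if( next )
            return (*next)( a, b ); // tie: defer to the next comparator in the chain
        return false; // tie on every key: a is not less than b
    }
    virtual bool a_is_less( void* a, void* b ) = 0;
    virtual bool b_is_less( void* a, void* b ) = 0;
};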
class Num_Comparator : public Chain_Comparator
{
// Implements a_is_less etc.
};
class Lex_Comparator : public Chain_Comparator
{
// Implement lex comparisons.
};
void your_custom_sorting_method( vector<int>& a, Chain_Comparator& c)
{
// Implementation goes here.
// Call operator() on c with the addresses of the elements: c( &a[i], &a[j] )
}
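To make the chaining idea concrete, here is a type-safe sketch that stores the links as std::function objects instead of using void* and inheritance. It is only one possible shape for the design, assuming C++11; the names Less, ChainedLess and lengthThenLex are invented for this illustration.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Each link answers "is a strictly before b?" for one sort key.
using Less = std::function<bool(const std::string&, const std::string&)>;

// Hypothetical chain: tries each key in order and moves on only on a tie.
class ChainedLess {
public:
    explicit ChainedLess(std::vector<Less> links) : links_(std::move(links)) {}

    bool operator()(const std::string& a, const std::string& b) const {
        for (const Less& less : links_) {
            if (less(a, b)) return true;   // a wins on this key
            if (less(b, a)) return false;  // b wins on this key
            // tie on this key: fall through to the next link
        }
        return false;  // equal on every key, so a is not before b
    }

private:
    std::vector<Less> links_;
};

// The two keys from the question: length first, then lexicographic order.
inline ChainedLess lengthThenLex() {
    std::vector<Less> links;
    links.push_back([](const std::string& a, const std::string& b) {
        return a.size() < b.size();
    });
    links.push_back([](const std::string& a, const std::string& b) {
        return a < b;
    });
    return ChainedLess(links);
}
```

An object returned by lengthThenLex() can be passed to any comparison-based sort routine, hand-written or from the standard library, and new keys are added by appending another link.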

string Vector push_back failing in class

I have a class with a method that should return a vector of strings. The getCommVector method has to push_back the elements of a string array into a string vector that can then be returned by the method. When trying to add a string element to the string vector I get:
libc++abi.dylib: terminate called throwing an exception
2Program ended with exit code: 0
I cannot understand why I can't push_back strings to the vector. Any ideas?
Thanks in advance!
code segments of interest (edited after suggestions):
class Command {
public:
//Command (std::string, bool, bool);
void setOperators(std::string,bool, bool);
void analyseCommand();
Command();
std::vector<std::string> getCommVector ();
private:
int numOperators; //number of total commands
int opCount; //current command number
std::string input_string;
bool field_command, byte_command;
std::string commVector[3];
std::vector<std::string> finalCommVector;
void byte_analysis();
void field_analysis();
void decode_command();
void syntax_error();
void decode_error();
};
Command::Command() : numOperators(0), opCount(0), field_command(false),byte_command(false)
{
}
std::vector<std::string> Command::getCommVector ()
{
std::string s ="test";
finalCommVector.push_back("s");
return finalCommVector;
}
Adding an SSCCE:
class Command {
public:
//Command (std::string, bool, bool);
void setOperators(std::string,bool, bool);
void analyseCommand();
Command();
std::vector<std::string> getCommVector ();
private:
int numOperators; //number of total commands
int opCount; //current command number
std::string input_string;
bool field_command, byte_command;
std::string commVector[3];
std::vector<std::string> finalCommVector;
void byte_analysis();
void field_analysis();
void decode_command();
void syntax_error();
void decode_error();
};
Command::Command() : numOperators(0), opCount(0), field_command(false),byte_command(false)
{
}
void Command::syntax_error()
{
std::cout<<"Incorrect Syntax Error: Usage: linuxcut -b num -f num \n";
exit(EXIT_FAILURE);
}
void Command::decode_error()
{
std::cout<<"Decode Error: Usage: linuxcut -b num -f num \n";
exit(EXIT_FAILURE);
}
void Command::analyseCommand()
{
if (byte_command) {
//start byte command analysys
byte_analysis();
}
else if (field_command)
{
//start field command analysys
field_analysis();
}
}
void Command::setOperators(std::string input_argument, bool is_field, bool is_byte)
{
input_string = input_argument;
field_command = is_field;
byte_command = is_byte;
}
std::vector<std::string> Command::getCommVector ()
{
std::string s = "ssd";
finalCommVector.push_back(s);
/*
for (int i = 0; i<sizeof(commVector); i++)
{
if (commVector[i] != "")
{
//debug
std::cout<<"asdas";
}
}
*/
return finalCommVector;
}
void Command::byte_analysis()
{
int next_state = 0;
int dashCount = 0;
int commVectIndex = 0;
//iterate through string and check if the argument is valid
for (int i= 0; i<input_string.length(); i++) {
switch (next_state) {
case 0: //start
//if character is a number:
if (isdigit(input_string.at(i)))
{
//first elemnt of command commVector is number
commVector[commVectIndex]+=input_string.at(i);
//DEBUG
std::cout<<commVector[commVectIndex];
next_state = 1;
}
//if character is a dash:
else if (input_string[i] == '-')
{
//increment dashCount
dashCount++;
//if next character in input_string is a number continue
if (isdigit(input_string[i+1])) {
commVector[commVectIndex]+=input_string.at(i);
commVectIndex++;
next_state = 1;
}
else //else error
{
syntax_error();
}
}
//if it's niether: error!
else
{
syntax_error();
}
break;
case 1:
//if next character is a number:
if (isdigit(input_string[i]))
{
commVector[commVectIndex]+=input_string.at(i);
next_state = 1;
}
//if next character is dash
else if (input_string[i] == '-'&& dashCount <= 3)
{
dashCount++;
//increment commandVectIndex
commVectIndex++;
next_state = 2;
commVector[commVectIndex]+=input_string.at(i);
//increment commandVectIndex to accomodate next operation
commVectIndex++;
}
//if it's niether: error!
else
{
syntax_error();
}
break;
case 2://previous character was dash
//if next character is number
if (isdigit(input_string[i]))
{
commVector[commVectIndex]+=input_string.at(i);
next_state = 1;
}
//if it's niether: error!
else
{
syntax_error();
}
break;
default:
syntax_error();
break;
}
}
}
void Command::field_analysis()
{
}
/*****************FUNCTIONS DEFINITIONS***************/
void print_usage() {
std::cout<<"Incorrect Syntax Error: Usage: linuxcut -b num -f num \n";
}
/*****************END OF FUNCTIONS DEFINITIONS***************/
/***************** MAIN ***************/
int main(int argc, char *argv[]) {
int opt= 0;
std::string byte = "-1-2,2",field = "";
std::string sub_arg_delimiter = ","; //delimiter for comma serparated arguments
static bool complement = false;
int diffOpt = 0; //stores the difference between optind and argc to read filenames in command
std::string fileName;
//Specifying the expected options
//The two options l and b expect numbers as argument
static struct option long_options[] = {
{"byte", required_argument, 0, 'b' },
{"field", required_argument, 0, 'f' },
{"complement", no_argument, 0, 0 },
{0, 0, 0, 0 }
};
Command testCommand;
testCommand.setOperators("-2-", false, true);
std::vector<std::string> trial = testCommand.getCommVector();
std::cout<<"filename:"<<fileName<<std::endl;
std::cout<<"Selected flags:\n"<< "b: "<< byte<<"\nf: "<<field<<"\ncomplement: "<<complement<<std::endl;
return 0;
}
You're iterating way beyond the array size. sizeof(commVector) returns the size of the array in bytes.
If you have C++11 available, you can do this:
for (const auto &s : commVector) {
if (s != "") {
// as before
}
}
Or at least this (if you only have partial C++11 support):
for (auto it = std::begin(commVector); it != std::end(commVector); ++it) {
std::string s = *it;
// the rest as before
}
Without C++11, you can at least do this:
for (int i = 0; i < sizeof(commVector) / sizeof(commVector[0]); ++i) {
// the rest as before
}
Or provide your own function for obtaining the correct array size:
template <class T, size_t N>
size_t arraySize(const T (&)[N]) { return N; }
// Use:
for (size_t i = 0; i < arraySize(commVector); ++i) {
// the rest as before
}
i<sizeof(commVector);
should be
i<countof(commVector);
if countof/_countof is defined for your compiler. If not, you can do it yourself, it is typically defined as:
#define countof(a) (sizeof(a)/sizeof(a[0]))
and I won't go into discussion about using macros in C++ :)
Of course, you could also use a constant, as your array has a fixed number of elements, but I guess it's just an example.
sizeof returns the size in bytes of the object itself (in this case the string array), not the number of elements in the array.
So it equals the number of array elements multiplied by the size of a single string instance, and the loop ends up accessing non-existent items with operator[].
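The difference is easy to see side by side. In this small sketch the helper names byteSize and elementCount are made up for illustration, and the array length of 3 matches the commVector member from the question.

```cpp
#include <cstddef>
#include <string>

// Size of the whole array object in bytes: 3 * sizeof(std::string).
std::size_t byteSize(const std::string (&arr)[3]) {
    return sizeof(arr);
}

// Number of elements: total bytes divided by the bytes of one element.
std::size_t elementCount(const std::string (&arr)[3]) {
    return sizeof(arr) / sizeof(arr[0]);
}
```

The exact value of sizeof(std::string) depends on the standard library, but it is always larger than 1, so byteSize is always larger than the 3 valid indices.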
This is also broken:
finalCommVector.push_back("s");
and probably you meant:
finalCommVector.push_back(s);
If all you need is the std::string array commVector as a std::vector<std::string>, you can use std::vector::assign:
finalCommVector.assign(commVector, commVector+3);
The '3' is the length of your array.
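If C++11 is available, std::begin and std::end can supply the bounds so the length does not have to be hard-coded. A minimal sketch, where the helper name toVector is hypothetical:

```cpp
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Copies a fixed-size string array into a vector. The array length N is
// deduced from the argument, so no magic number is needed.
template <std::size_t N>
std::vector<std::string> toVector(const std::string (&arr)[N]) {
    std::vector<std::string> out;
    out.assign(std::begin(arr), std::end(arr));
    return out;
}
```

Inside the class, the equivalent call would be finalCommVector.assign(std::begin(commVector), std::end(commVector)).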