Maps and substrings - c++

I want to sort the suffixes of a string.
The simplest way to do that is to put all the suffixes into a map.
To use memory efficiently, I pass each suffix as (str+i), where str is a char* and i is the position at which the suffix starts. However, I found that the map does not sort these suffixes. Here is an example:
typedef std::map < char*, int,Comparator> MapType;
MapType data;
// let's declare some initial values to this map
char* bob=(char* )"Bobs score";
char* marty=(char* ) "Martys score";
data.insert(pair<char*,int>(marty+1,15));
data.insert(pair<char*,int>(bob+1,10));
MapType::iterator end = data.end();
for (MapType::iterator it = data.begin(); it != end; ++it) {
std::cout << "Who(key = first): " << it->first;
std::cout << " Score(value = second): " << it->second << '\n';
}
The output is
Who(key = first): obs score Score(value = second): 10
Who(key = first): artys score Score(value = second): 15
However, strcmp, the standard function for comparing strings, works correctly for bob+1 and marty+1: it says marty+1 is less than bob+1.

By default, the map will sort by the address held in the char*, not lexicographically. Change the key to a std::string or define a comparator.
EDIT:
It looks as though you have attempted to define a Comparator but the definition of it is not posted. Here is an example:
#include <iostream>
#include <map>
#include <string.h>
struct cstring_compare
{
bool operator()(const char* a_1, const char* a_2) const
{
return strcmp(a_1, a_2) < 0;
}
};
typedef std::map<const char*, int, cstring_compare> cstring_map;
int main()
{
cstring_map m;
m["bcd"] = 1;
m["acd"] = 1;
m["abc"] = 1;
for (cstring_map::iterator i = m.begin(); i != m.end(); i++)
{
std::cout << i->first << "\n";
}
return 0;
}
Output:
abc
acd
bcd

Define a custom comparator, e.g.
class compare_char {
public:
// strcmp returns a negative value when lhs sorts before rhs
bool operator()(const char* lhs, const char* rhs) const { return strcmp(lhs, rhs) < 0; }
};
Define your map using this comparator instead of whatever you currently have. Alternatively, use a map whose key type has a comparison operator that works on values; a std::string is a better fit for you. Currently you have a map keyed on char*, which by default compares the char* values themselves, i.e. the pointer addresses, not the contents.
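For the suffix-sorting case from the question, a key type that compares by content removes the need for a comparator altogether. The following is only a sketch under some assumptions: it requires C++17 and uses std::string_view (which none of the answers mention) so that suffixes are not copied, and sorted_suffix_starts is a hypothetical helper name.

```cpp
#include <cstddef>
#include <map>
#include <string_view>
#include <vector>

// Hypothetical helper: returns the start offsets of all suffixes of `text`,
// ordered lexicographically by suffix content.
std::vector<std::size_t> sorted_suffix_starts(std::string_view text)
{
    // string_view compares by content, so the map's default std::less
    // already orders the keys lexicographically. No characters are copied.
    std::map<std::string_view, std::size_t> suffixes;
    for (std::size_t i = 0; i < text.size(); ++i)
        suffixes.emplace(text.substr(i), i);

    std::vector<std::size_t> starts;
    starts.reserve(suffixes.size());
    for (const auto& entry : suffixes)
        starts.push_back(entry.second);
    return starts;
}
```

The views only point into the original buffer, so the string passed in has to outlive the map and anything that reads the keys.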

You should add the comparer class or function you are using, since that is probably where your error is coming from.
There is a slight difference between strcmp and a map comparison function.
strcmp returns 0 if a == b, a negative value if a < b, and a positive value if a > b.
comp returns true if a < b, false otherwise.
A correct way to implement the comparison function is the following:
bool operator() (char* lhs, char* rhs) const
{
return strcmp(lhs,rhs) < 0;
}

Related

C++ map customize comparator

I defined a map to count the number of strings while sorting the strings by their length:
struct cmp {
bool operator()(const string& a, const string& b) {
return a.size() > b.size();
}
};
int main() {
map<string, int, cmp> mp;
mp["aaa"] = 1;
mp["bbb"] = 2;
cout << mp["aaa"];
}
I'm confused as the output is 2. How should I achieve my goal?
Because of the way your comparator is defined, strings "aaa" and "bbb" are considered equal. Your map has one item, not two. First you assigned 1 to that item, then you assigned 2.
To solve the problem, define your comparator as follows:
struct cmp {
bool operator()(const string& a, const string& b) const {
return a.size() == b.size() ? a > b : a.size() > b.size();
}
};
That way, the strings will be considered equal only if they actually are equal, not only when their sizes match, but the string length will still have priority for sorting.
std::map not only sorts items by key, it stores them by (unique) key - 1 item per key.
This behavior is defined by the comparator: if for keys a & b neither of a<b and b<a is true, these keys are considered equal.
In your case mp["bbb"] = 2 just overwrites mp["aaa"].
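The equivalence rule described above can be written down directly. This is a small sketch, not part of the original answer: LongerFirst mirrors the length-only comparator from the question, and equivalent is a hypothetical helper that applies the "neither a<b nor b<a" test.

```cpp
#include <string>

// Same ordering as the question's comparator: longer strings first.
struct LongerFirst {
    bool operator()(const std::string& a, const std::string& b) const {
        return a.size() > b.size();
    }
};

// Two keys are equivalent for an ordered container when neither one
// is ordered before the other by the comparator.
template <class Compare, class Key>
bool equivalent(const Compare& comp, const Key& a, const Key& b)
{
    return !comp(a, b) && !comp(b, a);
}
```

With this comparator, equivalent(LongerFirst{}, std::string("aaa"), std::string("bbb")) is true, which is why the map treats both strings as one key.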
If you want to fit all the strings in the map, you can use std::multimap, which allows more than 1 value per key.
The other way is to redefine the comparator, so that it would take the different strings into account:
struct cmp {
bool operator()(const string& a, const string& b) const {
if(a.size() < b.size()) return true;
if(b.size() < a.size()) return false;
return a<b;
}
};
Thus your map will still prioritize sorting by string length, but it will also distinguish different strings of same size.
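The same "length first, content as tie-breaker" rule can also be expressed with std::tie, which compares element by element. This is a sketch only; ByLengthThenValue and ordered_keys are hypothetical names introduced for illustration.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <tuple>
#include <vector>

struct ByLengthThenValue {
    bool operator()(const std::string& a, const std::string& b) const {
        const std::size_t la = a.size();
        const std::size_t lb = b.size();
        // Tuples compare lexicographically: length first, then content.
        return std::tie(la, a) < std::tie(lb, b);
    }
};

// Hypothetical helper: counts the words in a map ordered by the comparator
// above and returns the distinct keys in the map's iteration order.
std::vector<std::string> ordered_keys(const std::vector<std::string>& words)
{
    std::map<std::string, int, ByLengthThenValue> counts;
    for (const std::string& w : words)
        ++counts[w];

    std::vector<std::string> keys;
    for (const auto& kv : counts)
        keys.push_back(kv.first);
    return keys;
}
```

Because the content takes part in the comparison, "aaa" and "bbb" stay separate keys while shorter strings still come first.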
Depending on your use case, you can also check other containers like priority_queue or just plain vector with a proper insertion technique.
If you want to allow strings of identical size in your map but do not care about their relative order, then std::multimap is an alternative solution:
#include <map>
#include <iostream>
#include <string>
struct cmp {
bool operator()(const std::string& a, const std::string& b) const {
return a.size() > b.size();
}
};
int main() {
std::multimap<std::string, int, cmp> mp;
mp.emplace("eee", 5);
mp.emplace("aaa", 1);
mp.emplace("bbb", 2);
mp.emplace("cccc", 3);
mp.emplace("dd", 4);
auto const elements_of_size_3 = mp.equal_range("aaa");
for (auto iter = elements_of_size_3.first; iter != elements_of_size_3.second; ++iter)
{
std::cout << iter->first << " -> " << iter->second << '\n';
}
}
Output:
eee -> 5
aaa -> 1
bbb -> 2
From std::multimap<std::string, int, cmp>'s point of view, "eee", "aaa" and "bbb" are all completely equal to each other, and std::multimap allows different keys to be equal. Their relative order is actually guaranteed to be the order of insertion since C++11.

Check if a string contains duplicates using std::map

I have a function that checks if a string contains duplicates using std::map by putting each char as a key. Can't figure out why this doesn't work.
#include<iostream>
#include<map>
#include<string>
int unique_char(std::string s){
for(int i=0 ; i < s.size(); i++ )
{
std::map<char,int> uniq_hash_table;
std::pair<std::map<char,int>::iterator,bool> ret;
ret = uniq_hash_table.insert(std::pair<char,int>(s[i],0));
if(ret.second==false)
{
std::cout << "The string contains duplicates" << std::endl;
return 1;
}
}
return 0;
}
int main()
{
std::string s="abcd";
std::string s1="aabc";
if(unique_char(s)==0){
std::cout << "The 1st string does not contain duplicates" << std::endl;}
if(unique_char(s1)==0){
std::cout << "The 2nd string does not contain duplicates" << std::endl;}
return 0;
}
The program returns "string does not contain duplicates" for both examples.
ps: I'm purposely using std::map to get O(n) solution.
Your solution doesn't work because your std::map<char,int> is re-created at each iteration of the loop. The map is therefore empty at the start of every iteration, so no duplicate is ever detected.
It is better to use a std::set<char>. You can do something like this:
bool contains_duplicated_char(const std::string& s)
{
std::set<char> check_uniq;
for(unsigned long int i = 0; i < s.length(); ++i)
if(!check_uniq.insert(s[i]).second)
return true; // Duplicated char found
return false; // No duplicated char found
}
and then call it this way:
const std::string str = "abcdefghijklamnopb";
const bool dupl = contains_duplicated_char(str);
To make your code more generic (handling more data types), you can also write your function this way:
template <typename Type, typename IteratorType>
bool contains_duplicated(IteratorType begin, IteratorType end)
{
std::set<Type> check_uniq;
for(IteratorType it = begin; it != end; ++it)
if(!check_uniq.insert(*it).second)
return true;
return false;
}
and then call it like this:
std::vector<std::string> vec_str;
vec_str.push_back("Foo");
vec_str.push_back("Bar");
vec_str.push_back("Baz");
vec_str.push_back("Bar");
const bool dupl = contains_duplicated<std::string>(vec_str.begin(), vec_str.end());
//...
const std::string str = "abcdefab";
const bool dupl2 = contains_duplicated<char>(str.begin(), str.end());
//...
std::deque<long int> x(4, 0); // not const, because the elements are assigned below
x[0] = 1;
x[1] = 17;
x[2] = 31;
x[3] = 0;
const bool dupl3 = contains_duplicated<long int>(x.begin(), x.end());
It does not work because uniq_hash_table is recreated for each symbol inside for loop.
Try to move it into the beginning of the function right before the for loop:
std::map<char,int> uniq_hash_table;
for(int i=0 ; i < s.size(); i++ )
{
// ...
}
As your map definition is within the body of the for loop, you recreate an empty map at each iteration.
Declare your container outside the loop and it'll work better.
Note that you could use a set instead of a map if you never increment the int value.
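Since the question asks for an O(n) check, here is a sketch of a variant that drops the tree-based containers. It swaps in a std::bitset with one flag per possible char value, which is not what the answers above use; has_duplicate_char is a hypothetical name.

```cpp
#include <bitset>
#include <climits>
#include <string>

// Returns true as soon as a character value is seen for the second time.
bool has_duplicate_char(const std::string& s)
{
    // Declared once, outside the loop, so it remembers earlier characters.
    std::bitset<UCHAR_MAX + 1> seen;
    for (unsigned char c : s) {
        if (seen.test(c))
            return true;   // this character appeared before
        seen.set(c);
    }
    return false;
}
```

Each lookup and update is constant time, so the whole check is linear in the length of the string.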

Vector point to another vector

What I have here is two arrays of different types that I'm converting to vectors.
int ham_array[] = {32,71,12,45,26};
char word_array[] = {"cat", "bat", "green", "red", "taxi"};
vector < int > hamvector (ham_array, ham_array + 5);
vector < char > wordvector(word_array, word_array + 5);
I am going to call a sort function to sort the elements of ham_array from least to greatest. At the same time, I would like the word_array to also get sorted the same way ham_vector gets sorted using references.
For example,
after I call sort(hamvector)
ham_array[] = {12, 26, 32, 45, 71}
and sort(wordvector)
word_array[] = {"green", "taxi", "cat", "red", "bat"};
Is there an easy way to do this?
Well, for one thing, that should be const char *word_array[]; the way you declared it, it would be a single string (an array of char).
Anyway the way to do this is you declare a structure to keep these things paired:
struct t {string name; int number;};
vector<t> list;
// fill in list
// comparer to compare two such structs
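// must be a strict "less than" (not >=) so that std::sort gets a valid ordering, ascending by number
bool comparer(const t &a, const t &b) { return a.number < b.number; }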
// and to sort the list
sort(list.begin(), list.end(), comparer);
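If the two vectors have to stay separate, the pairing idea can still be applied temporarily. The sketch below assumes both vectors have the same length and uses std::pair as a stand-in for the struct; sort_together is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Sorts `numbers` in ascending order and applies the same reordering to `words`.
// Assumes numbers.size() == words.size().
void sort_together(std::vector<int>& numbers, std::vector<std::string>& words)
{
    typedef std::pair<int, std::string> Item;

    // Pair each number with the word at the same index.
    std::vector<Item> items;
    items.reserve(numbers.size());
    for (std::size_t i = 0; i < numbers.size(); ++i)
        items.push_back(Item(numbers[i], words[i]));

    // Order by the number only; stable_sort keeps ties in their original order.
    std::stable_sort(items.begin(), items.end(),
                     [](const Item& a, const Item& b) { return a.first < b.first; });

    // Write the reordered values back into the two vectors.
    for (std::size_t i = 0; i < items.size(); ++i) {
        numbers[i] = items[i].first;
        words[i] = items[i].second;
    }
}
```

Called with the data from the question, this leaves the numbers as 12, 26, 32, 45, 71 and the words as green, taxi, cat, red, bat.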
If by simple you mean a more direct way, then yes: std::sort() supports sorting raw arrays as well:
sort(word_array, word_array + 5, wordcmp);
As Blindy showed, you need a comparator function to tell sort how the ordering is supposed to be done for your list of words. Otherwise you'll end up sorting by the memory address at which each string resides instead of by the letters in the string. Something like this should work:
bool wordcmp(const char *lhs, const char *rhs)
{
return strncmp(lhs, rhs, 256) < 0;
}
One other note, in practice you'll want to prefer std::vector over just raw pointer arrays since the latter isn't as safe.
I've tried to find a solution to a similar problem before and ultimately had to sort it manually. Another way I imagine you could do this would be to write a sorter functor that can somehow figure out, based on which string is being sorted, which integer is associated, and sort based on that. This is terribly inefficient, so I would highly advise doing your own manual sorting using std::swap.
#include <map>
#include <string>
#include <vector>
#include <algorithm>
#include <iostream>
template<typename KeyType, typename ValueType>
class CMappedSorter
{
std::map<KeyType, ValueType>* const m_Mappings;
public:
CMappedSorter(std::map<KeyType, ValueType>* Mappings) : m_Mappings(Mappings)
{
}
bool operator()(const KeyType& LHS, const KeyType& RHS) const
{
const ValueType LHSSortingValue = m_Mappings->find(LHS)->second;
const ValueType RHSSortingValue = m_Mappings->find(RHS)->second;
return (LHSSortingValue < RHSSortingValue);
}
};
int main(int argc, char* argv[])
{
std::vector<int> Integers;
std::vector<std::string> Strings;
Integers.push_back(3);
Integers.push_back(1);
Integers.push_back(2);
Strings.push_back("Apple");
Strings.push_back("Banana");
Strings.push_back("Cherry");
std::map<std::string, int> Mappings;
if(Integers.size() == Strings.size())
{
const unsigned int ElementCount = Strings.size();
// Generate mappings.
auto StringsIterator = Strings.begin();
auto IntegersIterator = Integers.begin();
for(unsigned int i = 0; i < ElementCount; ++i)
{
Mappings[*(StringsIterator)] = *(IntegersIterator);
++StringsIterator;
++IntegersIterator;
}
// Print out before sorting.
std::cout << "Before Sorting" << std::endl;
std::cout << "Int\tString" << std::endl;
StringsIterator = Strings.begin();
IntegersIterator = Integers.begin();
for(unsigned int i = 0; i < ElementCount; ++i)
{
std::cout << *(IntegersIterator) << '\t' << *(StringsIterator) << std::endl;
++StringsIterator;
++IntegersIterator;
}
// Sort
std::sort(Strings.begin(), Strings.end(), CMappedSorter<std::string, int>(&(Mappings)));
std::sort(Integers.begin(), Integers.end());
// Print out after sorting.
std::cout << "After Sorting" << std::endl;
std::cout << "Int\tString" << std::endl;
StringsIterator = Strings.begin();
IntegersIterator = Integers.begin();
for(unsigned int i = 0; i < ElementCount; ++i)
{
std::cout << *(IntegersIterator) << '\t' << *(StringsIterator) << std::endl;
++StringsIterator;
++IntegersIterator;
}
}
else
{
std::cout << "Error: Number of elements in each container are not equivalent." << std::endl;
}
}

C++: std::sort using already destroyed object with custom predicate?

I'm having a very odd problem with some code using std::sort. If I replace std::sort by stable_sort the problem goes away.
class Entry
{
public:
Entry() : _date(0), _time(0), _size(0) {}
Entry(unsigned int d, unsigned int t, unsigned int s) : _date(d), _time(t), _size(s) {}
~Entry() {_size=0xfffffffe;}
unsigned int _date, _time, _size;
};
void initialise(std::vector<Entry> &vec)
{
vec.push_back(Entry(0x3f92, 0x9326, 0x1ae));
vec.push_back(Entry(0x3f92, 0x9326, 0x8a54));
//.... + a large number of other entries
}
static bool predicate(const Entry &e1, const Entry &e2)
{
// Sort by date and time, then size
if (e1._date < e2._date )
return true;
if (e1._time < e2._time )
return true;
return e1._size < e2._size;
}
int main (int argc, char * const argv[]) {
using namespace std;
vector<Entry> vec;
initialise(vec);
sort(vec.begin(), vec.end(), predicate);
vector<Entry>::iterator iter;
for (iter=vec.begin(); iter!=vec.end(); ++iter)
cout << iter->_date << ", " << iter->_time <<
", 0x" << hex << iter->_size << endl;
return 0;
}
The idea is that I sort the data first by date and time then by size. However, depending on the data in the vector, I will end up with 0xfffffffe in the size printed out at the end for the first object, indicating that a destroyed object has been accessed, or a seg fault during the sort.
(Xcode 3.2.4 - 64 bit intel target)
Any ideas anyone??
I suspect it has something to do with my predicate, but I can't see for the life of me what it is....!!
This page seems to refer to the same problem:
http://schneide.wordpress.com/2010/11/01/bug-hunting-fun-with-stdsort/
but the reason he gives (that the predicate needs to define a strict weak ordering) seems to be satisfied here...
Your predicate does not satisfy strict weak ordering criteria. Look at your function and ask yourself, what happens if e1's date comes after e2, but e1's time comes before e2?
I think what your predicate really should be is something like this:
static bool predicate(const Entry &e1, const Entry &e2)
{
// Sort by date and time, then size
return e1._date < e2._date ||
(e1._date == e2._date &&
(e1._time < e2._time ||
(e1._time == e2._time && e1._size < e2._size)));
}
With what you wrote, if e1._date > e2._date the first condition is false, but the second may still be true, so the function will still claim that e1 < e2, which is probably not what you want.
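The inconsistency can be made visible with a small check. This is a sketch: Entry is a simplified stand-in for the class in the question, question_predicate reproduces the asker's logic, and orders_both_ways is a hypothetical helper.

```cpp
// Simplified stand-in for the Entry class from the question.
struct Entry {
    unsigned int _date, _time, _size;
};

// The predicate from the question, with its logic unchanged.
bool question_predicate(const Entry& e1, const Entry& e2)
{
    if (e1._date < e2._date)
        return true;
    if (e1._time < e2._time)
        return true;
    return e1._size < e2._size;
}

// True when the predicate claims both a < b and b < a, which a
// strict weak ordering must never do.
bool orders_both_ways(const Entry& a, const Entry& b)
{
    return question_predicate(a, b) && question_predicate(b, a);
}
```

For example, an entry with a later date but an earlier time, such as Entry{2, 1, 0} against Entry{1, 2, 0}, is reported as "less" in both directions.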
Your code needs to be:
static bool predicate(const Entry &e1, const Entry &e2)
{
// Sort by date and time, then size
if (e1._date != e2._date )
return e1._date < e2._date;
if (e1._time != e2._time )
return e1._time < e2._time;
return e1._size < e2._size;
}
If e1's date is after e2's, your version still goes on to compare the time and size. This is not what you want. It eventually confuses std::sort, because if you swap e1 and e2 you will not get a consistent answer.
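A compact way to get the same field-by-field comparison is std::tie from <tuple> (C++11), which is not used in the answers above. A minimal sketch, with Entry reduced to its three data members and entry_less as a hypothetical name:

```cpp
#include <tuple>

// Simplified stand-in for the Entry class from the question.
struct Entry {
    unsigned int _date, _time, _size;
};

// Compares date first, then time, then size. A later field is only
// consulted when all earlier fields are equal.
bool entry_less(const Entry& e1, const Entry& e2)
{
    return std::tie(e1._date, e1._time, e1._size)
         < std::tie(e2._date, e2._time, e2._size);
}
```

It can be passed to std::sort in place of predicate, e.g. sort(vec.begin(), vec.end(), entry_less).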

Sorting a set<string> on the basis of length

My question is related to this.
I wanted to perform a sort() operation over the set with the help of a lambda expression as a predicate.
My code is
#include <set>
#include <string>
#include <iostream>
#include <algorithm>
int main() {
using namespace std;
string s = "abc";
set<string> results;
do {
for (int n = 1; n <= s.size(); ++n) {
results.insert(s.substr(0, n));
}
} while (next_permutation(s.begin(), s.end()));
sort (results.begin(),results.end());[](string a, string b)->bool{
size_t alength = a.length();
size_t blength = b.length();
return (alength < blength);
});
for (set<string>::const_iterator x = results.begin(); x != results.end(); ++x) {
cout << *x << '\n';
}
return 0;
}
But the number and types of errors were so complex that I couldn't understand how to fix them. Can someone tell me what's wrong with this code?
Edit: Note that Steve Townsend's solution is actually the one you're searching for, as he inlines as a C++0x Lambda what I write as C++03 code below.
Another solution would be to customize the std::set ordering function:
The std::set is already ordered...
The std::set has its own ordering, and you are not supposed to change it once it is constructed. So, the following code:
int main(int argc, char* argv[])
{
std::set<std::string> aSet ;
aSet.insert("aaaaa") ;
aSet.insert("bbbbb") ;
aSet.insert("ccccccc") ;
aSet.insert("ddddddd") ;
aSet.insert("e") ;
aSet.insert("f") ;
outputSet(aSet) ;
return 0 ;
}
will output the following result:
- aaaaa
- bbbbb
- ccccccc
- ddddddd
- e
- f
... But you can customize its ordering function
Now, if you want, you can customize your set by using your own comparison function:
struct MyStringLengthCompare
{
bool operator () (const std::string & p_lhs, const std::string & p_rhs) const
{
const size_t lhsLength = p_lhs.length() ;
const size_t rhsLength = p_rhs.length() ;
if(lhsLength == rhsLength)
{
return (p_lhs < p_rhs) ; // when two strings have the same
// length, defaults to the normal
// string comparison
}
return (lhsLength < rhsLength) ; // compares with the length
}
} ;
In this comparison functor, I did handle the case "same length but different content means different strings", because I believe (perhaps wrongly) that the behaviour in the original program is an error. To have the behaviour coded in the original program, please remove the if block from the code.
And now, you construct the set:
int main(int argc, char* argv[])
{
std::set<std::string, MyStringLengthCompare> aSet ;
aSet.insert("aaaaa") ;
aSet.insert("bbbbb") ;
aSet.insert("ccccccc") ;
aSet.insert("ddddddd") ;
aSet.insert("e") ;
aSet.insert("f") ;
outputSet(aSet) ;
return 0 ;
}
The set will now use the functor MyStringLengthCompare to order its items, and thus, this code will output:
- e
- f
- aaaaa
- bbbbb
- ccccccc
- ddddddd
But beware of the ordering mistake!
When you create your own ordering function, it must follow the following rule:
return true if (lhs < rhs) is true, return false otherwise
If for some reason your ordering function does not respect it, you'll have a broken set on your hands.
std::sort rearranges the elements of the sequence you give it. The arrangement of the sequence in the set is fixed, so the only iterator you can have is a const iterator.
You'll need to copy results into a vector or deque (or such) first.
vector<string> sortable_results( results.begin(), results.end() );
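Putting the copy and the sort together might look like the sketch below. It uses std::stable_sort rather than std::sort so that strings of equal length keep the alphabetical order they had in the set; that choice and the name sorted_by_length are assumptions, not part of the original answer.

```cpp
#include <algorithm>
#include <set>
#include <string>
#include <vector>

// Copies the set into a vector (which has random access iterators)
// and orders the copy by string length.
std::vector<std::string> sorted_by_length(const std::set<std::string>& results)
{
    std::vector<std::string> sortable_results(results.begin(), results.end());
    std::stable_sort(sortable_results.begin(), sortable_results.end(),
                     [](const std::string& a, const std::string& b) {
                         return a.length() < b.length();
                     });
    return sortable_results;
}
```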
You can customize the ordering of the elements in the set by providing a custom predicate to determine ordering of added elements relative to extant members. set is defined as
template <
class Key,
class Traits=less<Key>,
class Allocator=allocator<Key>
>
class set
where Traits is
The type that provides a function object that can compare two element values as sort keys to determine their relative order in the set. This argument is optional, and the binary predicate less is the default value.
There is background on how to use lambda expression as a template parameter here.
In your case this translates to:
auto comp = [](const string& a, const string& b) -> bool
{ return a.length() < b.length(); };
auto results = std::set <string, decltype(comp)> (comp);
Note that this will result in set elements with the same string length being treated as duplicates which is not what you want, as far as I can understand the desired outcome.
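If equal-length strings should be kept as distinct elements, the lambda can fall back to the normal string comparison when the lengths match. A sketch under that assumption, wrapped in a hypothetical helper order_by_length so it can be tried on its own:

```cpp
#include <set>
#include <string>
#include <vector>

// Inserts the words into a set ordered by length, then by content,
// and returns them in the set's iteration order.
std::vector<std::string> order_by_length(const std::vector<std::string>& words)
{
    auto comp = [](const std::string& a, const std::string& b) {
        // Shorter strings first; equal lengths fall back to operator<.
        return a.length() != b.length() ? a.length() < b.length() : a < b;
    };
    // A lambda's closure type is not default-constructible before C++20,
    // so the comparator object is passed to the constructor.
    std::set<std::string, decltype(comp)> results(comp);
    results.insert(words.begin(), words.end());
    return std::vector<std::string>(results.begin(), results.end());
}
```

Only exact duplicates are dropped, so "a" and "b" both survive even though they have the same length.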
sort requires random access iterators, which set doesn't provide (its iterators are bidirectional). If you change the code to use vector it compiles fine.
You cannot sort a set. It's always ordered on keys (which are elements themselves).
To be more specific, std::sort requires random access iterators. The iterators provided by std::set are not random access.
Since I wrote the original code you're using, perhaps I can expand on it... :)
struct cmp_by_length {
template<class T>
bool operator()(T const &a, T const &b) const {
return a.length() < b.length() or (a.length() == b.length() and a < b);
}
};
This compares by length first, then by value. Modify the set definition:
set<string, cmp_by_length> results;
And you're good to go:
int main() {
using namespace std;
string s = "abc";
typedef set<string, cmp_by_length> Results; // convenience for below
Results results;
do {
for (int n = 1; n <= s.size(); ++n) {
results.insert(s.substr(0, n));
}
} while (next_permutation(s.begin(), s.end()));
// would need to add cmp_by_length below, if I hadn't changed to the typedef
// i.e. set<string, cmp_by_length>::const_iterator
// but, once you start using nested types on a template, a typedef is smart
for (Results::const_iterator x = results.begin(); x != results.end(); ++x) {
cout << *x << '\n';
}
// of course, I'd rather write... ;)
//for (auto const &x : results) {
// cout << x << '\n';
//}
return 0;
}
std::set is most useful for maintaining a sorted list that keeps changing. It is faster and smaller to use a vector when the set itself won't change much once it's been built.
#include <vector>
#include <string>
#include <iostream>
#include <algorithm>
int main() {
using namespace std;
string s = "abc";
vector<string> results;
do {
for (size_t n = 1; n <= s.size(); ++n) {
results.push_back(s.substr(0, n));
}
} while (next_permutation(s.begin(), s.end()));
//make it unique
sort( results.begin(), results.end() );
auto end_sorted = unique( results.begin(), results.end() );
results.erase( end_sorted, results.end() );
//sort by length
sort( results.begin(), results.end(),
[](string lhs, string rhs)->bool
{ return lhs.length() < rhs.length(); } );
for ( const auto& result: results ) {
cout << result << '\n';
}
}
I used the classic sort/unique/erase combo to make the results unique. I also cleaned up your code to be a little more C++0x-y.