How to force std::map::find() to search by value - c++

From what I have deduced, the std::map::find() method searches the map by comparing pointer addresses instead of values. Example:
std::string aa = "asd";
const char* a = aa.c_str();
const char* b = "asd";
// m_options is a std::map<const char*, int>
m_options.insert( std::make_pair( a, 0 ) );
if( m_options.find( b ) != m_options.end() ) {
// won't reach this place
}
I am kinda surprised (because I am using primitive types instead of some class) and I think that I have done something wrong, if not then how to force it to use value instead of address?

You are using const char * as the key type for the map. For pointer types, comparison is performed on the addresses themselves (the map cannot know that these pointers refer to null-terminated 8-bit strings).
To achieve your goal, you could create the map with custom compare function, e.g.:
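bool MyStringCompare(const char *s1, const char *s2) {
    return strcmp(s1, s2) < 0;
}
...
// The third template argument must be a type, so name the function pointer
// type there and pass the function itself to the constructor.
std::map<const char*, int, bool (*)(const char*, const char*)> m_options(MyStringCompare);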
Or consider using std::string as the key type.

Actually, map uses a strict ordering comparison operator to look for values, not the equality operator. Anyway, you can achieve this by passing a custom functor that compares the values of the strings, or do the right thing and use std::string instead.
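Both suggestions from the answers above can be sketched as follows. The names CStrLess, found_by_content and found_with_string_key are hypothetical and not from the original posts; this is a minimal illustration, not the only way to do it.

```cpp
#include <cstring>
#include <map>
#include <string>
#include <utility>

// Hypothetical comparator: orders C strings by content, not by address.
struct CStrLess {
    bool operator()(const char* lhs, const char* rhs) const {
        return std::strcmp(lhs, rhs) < 0;
    }
};

// Looks up a key through a different pointer that has equal content.
bool found_by_content() {
    std::string aa = "asd";
    const char* a = aa.c_str();  // points into aa's buffer
    const char* b = "asd";       // points to a string literal
    std::map<const char*, int, CStrLess> options;
    options.insert(std::make_pair(a, 0));
    return options.find(b) != options.end();  // aa is still alive here
}

// Same lookup with std::string keys; the map owns copies of the keys.
bool found_with_string_key() {
    std::map<std::string, int> options;
    options.insert(std::make_pair(std::string("asd"), 0));
    return options.find("asd") != options.end();
}
```

With the comparator the map still stores only pointers, so the pointed-to strings have to outlive the map; the std::string version has no such requirement.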

Related

std::set find behavior with char * type

I have below code line:
const char *values[] = { "I", "We", "You", "We"};
std::set<const char*> setValues;
for( int i = 0; i < 3; i++ ) {
const char *val = values[i];
std::set<const char*>::iterator it = setValues.find( val );
if( it == setValues.end() ) {
setValues.insert( val );
}
else {
cout << "Existing value" << endl;
}
}
With this I am trying to insert only non-repeated values into a set, but the code never prints the message for an existing element, and the duplicate value gets inserted.
What is wrong here?
By default, std::set<T>::find compares keys with std::less<T>, which for most types means operator< of the type T.
Your type is const char*, a pointer, so the find method just compares the address of the given string with the addresses of the strings in the set. These addresses are different for each string object (although a compiler may merge identical string literals into a single object).
You need to tell std::set how to compare strings correctly. I can see that AnatolyS already wrote how to do it in his answer.
You should define a less-than predicate for const char* and pass it as a template argument so that the set works correctly with pointers to C strings:
struct cstrless {
bool operator()(const char* a, const char* b) const {
return strcmp(a, b) < 0;
}
};
std::set<const char*, cstrless> setValues;
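A small sketch of the comparator in use, assuming a hypothetical helper count_distinct that is not part of the original answer. It relies on the fact that std::set::insert leaves the set unchanged when an equivalent key is already present, so the explicit find() from the question is not needed.

```cpp
#include <cstddef>
#include <cstring>
#include <set>

struct cstrless {
    bool operator()(const char* a, const char* b) const {
        return std::strcmp(a, b) < 0;
    }
};

// Hypothetical helper: returns how many distinct strings (by content)
// the array holds.
std::size_t count_distinct(const char* const* values, std::size_t n) {
    std::set<const char*, cstrless> seen;
    for (std::size_t i = 0; i < n; ++i) {
        seen.insert(values[i]);  // ignored if an equal string is already there
    }
    return seen.size();
}
```

Passing two separate arrays that both contain "We" would count as one entry here, whereas a plain std::set<const char*> would keep both pointers.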
Unless you use a custom comparison function object, std::set uses std::less<key_type> by default, which for pointers compares the addresses they hold. Two pointers are equal if, and only if, they point to the same object.
Here is an example of three objects:
char a[] = "apple";
char b[] = "apple";
const char (&c)[6] = "apple";
First two are arrays, the third is an lvalue reference that is bound to a string literal object that is also an array. Being separate objects, their address is of course also different. So, if you were to write:
setValues.insert(a);
bool is_in_set = setValues.find("apple") != setValues.end();
The value of is_in_set would be false, because the set contains only the address of the string in a, and not the address of the string literal, even though the contents of the strings are the same.
Solution: Don't use operator< to compare pointers to c strings. Use std::strcmp instead. With std::set, this means using a custom comparison object. However, you aren't done with caveats yet. You must still make sure that the strings stay in memory as long as they are pointed to by the keys in the set. For example, this would be a mistake:
char a[] = "apple";
setValues.insert(a);
return setValues; // oops, we returned setValues outside of the scope
// but it contains a pointer to the string that
// is no longer valid outside of this scope
Solution: Take care of scope, or just use std::string.
(This answer plagiarises my own answer about std::map here)
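The "just use std::string" option can be sketched like this; make_fruit_set is a hypothetical name used only for illustration. Because the set stores its own copies of the characters, the scope problem described above does not arise.

```cpp
#include <set>
#include <string>

// Hypothetical helper: the set owns a copy of "apple", so returning it is safe.
std::set<std::string> make_fruit_set() {
    char a[] = "apple";           // local array, destroyed on return
    std::set<std::string> fruits;
    fruits.insert(a);             // copies the characters into a std::string
    return fruits;
}
```

A caller can then look the word up by content, for example with make_fruit_set().count("apple").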

Map C-style string to int using C++ STL?

Mapping of string to int is working fine.
std::map<std::string, int> // working
But I want to map C-style string to int
For example:
char A[10] = "apple";
map<char*,int> mapp;
mapp[A] = 10;
But when I try to access the value mapped to "apple" I get a garbage value instead of 10. Why doesn't it behave the same as with std::string?
map<char*,int> mapp;
The key type here is not "c string". At least not if we define c string to be "an array of characters, with null terminator". The key type, which is char*, is a pointer to a character object. The distinction is important. You aren't storing strings in the map. You are storing pointers, and the strings live elsewhere.
Unless you use a custom comparison function object, std::map uses std::less<key_type> by default, which for pointers compares the addresses they hold. Two pointers are equal if, and only if, they point to the same object.
Here is an example of three objects:
char A[] = "apple";
char B[] = "apple";
const char (&C)[6] = "apple";
First two are arrays, the third is an lvalue reference that is bound to a string literal object that is also an array. Being separate objects, their address is of course also different. So, if you were to write:
mapp[A] = 10;
std::cout << mapp[B];
std::cout << mapp[C];
The output would be 0 for each, because you hadn't initialized mapp[B] nor mapp[C], so they will be value initialized by operator[]. The key values are different, even though each array contains the same characters.
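The effect of operator[] on a pointer-keyed map can be checked with a short sketch. The function name lookup_through_other_array is hypothetical; it returns the value read through the second array together with the resulting map size.

```cpp
#include <cstddef>
#include <map>
#include <utility>

// Hypothetical helper: reads the map through a second array with the
// same characters and reports (value read, number of entries).
std::pair<int, std::size_t> lookup_through_other_array() {
    char A[] = "apple";
    char B[] = "apple";
    std::map<char*, int> mapp;
    mapp[A] = 10;
    int b_value = mapp[B];  // B is a different address: inserts {B, 0}
    return std::make_pair(b_value, mapp.size());
}
```

The read through B does not find the entry for A; it silently creates a second, value-initialized entry.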
Solution: Don't use operator< to compare pointers to c strings. Use std::strcmp instead. With std::map, this means using a custom comparison object. However, you aren't done with caveats yet. You must still make sure that the strings stay in memory as long as they are pointed to by the keys in the map. For example, this would be a mistake:
char A[] = "apple";
mapp[A] = 10;
return mapp; // oops, we returned mapp outside of the scope
// but it contains a pointer to the string that
// is no longer valid outside of this scope
Solution: Take care of scope, or just use std::string.
It can be done but you need a smarter version of string:
struct CString {
    CString(const char *str) {
        strncpy(string, str, sizeof(string) - 1); // truncate instead of overflowing
        string[sizeof(string) - 1] = '\0';
    }
    // The implicitly generated copy constructor copies the array, which is enough here.
    char string[50]; // Or char * if you want to go that way, but you will need
                     // to be careful about memory so you can already see hardships ahead.
    bool operator<(const CString &rhs) const { // const so the map can call it on its keys
        return strcmp(string, rhs.string) < 0;
    }
};
map<CString,int> mapp;
mapp["someString"] = 5;
But as you can likely see, this is a huge hassle. There are probably some things that I have missed or overlooked as well.
You could also use a comparison function:
struct cmpStr{
bool operator()(const char *a, const char *b) const {
return strcmp(a, b) < 0;
}
};
map<char *,int,cmpStr> mapp;
char A[5] = "A";
mapp[A] = 5;
But this requires a lot of external memory management: if A's memory goes away while the map remains, any lookup is undefined behaviour. This is still a nightmare.
Just use a std::string.

std::map.count using c-strings does not work?

I wish to use c-strings instead of std::string for a performance situation. I have the following code:
std::map<const char*, int> myMap;
.
.
.
myMap.insert(std::pair<const char*, int>(str.c_str(), myint));
std::cout << myMap.count(str.c_str()) << std::endl;
Strangely enough the value I just entered returns 0 for count()?
By default, std::map uses std::less to compare the keys (which is the same as <, really, except it's guaranteed to work on unrelated pointers too). Which means it just does pointer comparison, definitely not what you want.
Just use the C++ string type (std::string) instead of a legacy type used for nul-terminated strings (const char*) and you'll be fine.
Why do you think using raw C strings will increase performance?
Anyway, std::map has no special treatment for char pointers. It treats them like any other kind of pointer and not like strings, which means that it simply compares the keys with std::less. Perhaps confusingly, this is different from the behaviour of C++ streams, which do behave in a special way when passed a char const *.
You'd get the same behaviour with something like std::map<double *, int>, std::map<long *, int> or std::map<MyClass *, int>. It's interesting to note that the pointer comparison works because std::less is guaranteed to work with pointers, even though pointer comparison with < is formally unspecified behaviour.
So, you are obviously not interested in comparing the pointer values directly. If you want lexicographical string comparison, you can specify the comparison for your map via the third template parameter:
std::map<char const *, int, RawPointerComparison>
What I called RawPointerComparison in this example must be a functor taking two pointers and returning whether the first is less than the second. You can use the strcmp C function for that. This should do the trick:
struct RawPointerComparison
{
bool operator()(char const *lhs, char const *rhs) const
{
return strcmp(lhs, rhs) < 0;
}
};
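As a rough check of the functor above, the sketch below inserts a key through one std::string and counts it through another. The helper name count_by_content is an assumption made for this example.

```cpp
#include <cstddef>
#include <cstring>
#include <map>
#include <string>
#include <utility>

struct RawPointerComparison {
    bool operator()(char const* lhs, char const* rhs) const {
        return std::strcmp(lhs, rhs) < 0;
    }
};

// Hypothetical helper: inserts str's characters as a key, then counts the
// key through a different std::string object with the same content.
std::size_t count_by_content(const std::string& str) {
    std::map<char const*, int, RawPointerComparison> myMap;
    myMap.insert(std::make_pair(str.c_str(), 1));
    std::string other = str;            // a different std::string object
    return myMap.count(other.c_str());  // characters are compared, not addresses
}
```

Both strings are alive for the whole lifetime of the map here; the functor does nothing to keep the key pointers valid.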
It seems that you use variable str to enter different strings in the map. For example
str = "first";
myMap.insert( { str.c_str(), 1 } );
str = "second";
myMap.insert( { str.c_str(), 2 } );
str = "first";
std::cout << myMap.count(str.c_str()) << std::endl;
In this case the first str.c_str() need not be equal to the last str.c_str() (you are comparing pointers into the string's buffer), because the string may have moved to a different memory region when it was reassigned.
If you would do the following
str = "first";
myMap.insert( { str.c_str(), 1 } );
std::cout << myMap.count(str.c_str()) << std::endl;
without intermediate statements then the result would be the output 1.
It seems that you are doing what you do not want.:)

C++: set of C-strings

I want to create one so that I could check whether a certain word is in the set using set::find
However, C-strings are pointers, so the set would compare them by the pointer values by default. To function correctly, it would have to dereference them and compare the strings.
I could just pass the constructor a pointer to the strcmp() function as a comparator, but this is not exactly how I want it to work. The word I might want to check could be part of a longer string, and I don't want to create a new string due to performance concerns. If it weren't for the set, I would use strncmp(a1, a2, 3) to check the first 3 letters. In fact, 3 is probably the longest it could go, so I'm fine with having the third argument constant.
Is there a way to construct a set that would compare its elements by calling strncmp()? Code samples would be greatly appreciated.
Here's pseudocode for what I want to do:
bool WordInSet (string, set, length)
{
for (each word in set)
{
if strncmp(string, word, length) == 0
return true;
}
return false;
}
But I'd prefer to implement it using the standard library functions.
You could create a comparator function object.
struct set_object {
    bool operator()(const char* first, const char* second) const {
        // strncmp returns a three-way result; the set needs "less than".
        return strncmp(first, second, 3) < 0;
    }
};
std::set<const char*, set_object> c_string_set;
However it would be far easier and more reliable to make a set of std::strings.
Make a wrapper function:
bool myCompare(const char * lhs, const char * rhs)
{
return strncmp(lhs, rhs, 3) < 0;
}
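A plain function such as myCompare can be handed to std::set by naming a function pointer type as the comparator and passing the function to the constructor. The sketch below assumes a hypothetical typedef CStrCompare, a hypothetical helper starts_with_known_word, and three made-up words.

```cpp
#include <cstring>
#include <set>

bool myCompare(const char* lhs, const char* rhs) {
    return std::strncmp(lhs, rhs, 3) < 0;
}

// Function pointer type used as the set's comparator type (hypothetical name).
typedef bool (*CStrCompare)(const char*, const char*);

// Hypothetical helper: true if the first three characters of text match
// one of the words in the set.
bool starts_with_known_word(const char* text) {
    static const char* const words[] = { "cat", "dog", "owl" };
    std::set<const char*, CStrCompare> known(words, words + 3, myCompare);
    return known.find(text) != known.end();
}
```

Because only three characters are compared, "doghouse" matches "dog" without a new string being created, which is also why words longer or shorter than three letters do not fit this scheme well.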
Assuming a constant value as a word length looks like asking for trouble to me. I recommend against this solution.
Look: The strcmp solution doesn't work for you because it treats the const char* arguments as nul-terminated strings. You want a function which does exactly the same, but treats the arguments as words - which translates to "anything-not-a-letter"-terminated string.
One could define strcmp in a generic way as:
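template<typename EndPredicate>
int generic_strcmp(const char* s1, const char* s2) {
    EndPredicate is_end; // EndPredicate is a type, so create an object to call
    for (;; ++s1, ++s2) {
        const bool end1 = is_end(*s1);
        const bool end2 = is_end(*s2);
        if (end1 || end2) {
            return end2 - end1; // 0 if both strings end here, negative if s1 is shorter
        }
        if (*s1 != *s2) {
            return static_cast<unsigned char>(*s1) - static_cast<unsigned char>(*s2);
        }
    }
}
If EndPredicate is a function object type whose operator() returns true iff its argument is equal to \0, then we obtain a regular strcmp which compares 0-terminated strings.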
But in order to have a function which compares words, the only required change is the predicate. It's sufficient to use the inverted isalpha function from <cctype> header file to indicate that the string ends when a non-alphabetic character is encountered.
So in your case, your comparator for the set would look like this:
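#include <cctype>
int wordcmp(const char* s1, const char* s2) {
    for (;; ++s1, ++s2) {
        const unsigned char c1 = static_cast<unsigned char>(*s1);
        const unsigned char c2 = static_cast<unsigned char>(*s2);
        // A word ends at the first non-letter, which includes '\0'.
        const bool end1 = !isalpha(c1);
        const bool end2 = !isalpha(c2);
        if (end1 || end2) {
            return end2 - end1; // 0 if both words end here, negative if s1 is shorter
        }
        if (c1 != c2) {
            return c1 - c2;
        }
    }
}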
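std::set expects a "less than" predicate rather than a three-way function, so a word comparison like this needs a thin adapter. The sketch below is self-contained and uses its own variant of wordcmp, in which a word ends at the first non-letter of either string; the names WordLess and word_in_set are hypothetical.

```cpp
#include <cctype>
#include <set>

// Three-way comparison of the leading words of s1 and s2.
// A word ends at the first non-alphabetic character (including '\0').
int wordcmp(const char* s1, const char* s2) {
    for (;; ++s1, ++s2) {
        const unsigned char c1 = static_cast<unsigned char>(*s1);
        const unsigned char c2 = static_cast<unsigned char>(*s2);
        const bool end1 = !std::isalpha(c1);
        const bool end2 = !std::isalpha(c2);
        if (end1 || end2) return int(end2) - int(end1);
        if (c1 != c2) return int(c1) - int(c2);
    }
}

// Hypothetical adapter: turns the three-way result into "less than".
struct WordLess {
    bool operator()(const char* a, const char* b) const {
        return wordcmp(a, b) < 0;
    }
};

// Hypothetical helper: true if the leading word of text is in the set.
bool word_in_set(const std::set<const char*, WordLess>& words, const char* text) {
    return words.find(text) != words.end();
}
```

The lookup works directly on a pointer into the longer text, so no temporary string is built for the word being checked.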

Inserting objects into hash table (C++)

This is my first time making a hash table. I'm trying to associate strings (the keys) with pointers to objects (the data) of class Strain.
// Simulation.h
#include <ext/hash_map>
using namespace __gnu_cxx;
struct eqstr
{
bool operator()(const char * s1, const char * s2) const
{
return strcmp(s1, s2) == 0;
}
};
...
hash_map< const char *, Strain *, hash< const char * >, struct eqstr > liveStrainTable;
In the Simulation.cpp file, I attempt to initialize the table:
string MRCA;
for ( int b = 0; b < SEQ_LENGTH; b++ ) {
int randBase = rgen.uniform(0,NUM_BASES);
MRCA.push_back( BASES[ randBase ] );
}
Strain * firstStrainPtr;
firstStrainPtr = new Strain( idCtr, MRCA, NUM_STEPS );
liveStrainTable[ MRCA ]= firstStrainPtr;
I get an error message that reads "no match for ‘operator[]’ in ‘((Simulation*)this)->Simulation::liveStrainTable[MRCA]’." I've also tried using "liveStrainTable.insert(...)" in different ways, to no avail.
Would really love some help on this. I'm having a difficult time understanding the syntax appropriate for SGI hash_map, and the SGI reference barely clarifies anything for me. Thanks.
Try liveStrainTable[ MRCA.c_str() ]= firstStrainPtr;. It expects const char * as type of key value, but MRCA has type string.
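Another way is to change liveStrainTable to use string keys. eqstr only accepts const char *, so drop it and rely on the default equal_to<string>; note that __gnu_cxx does not provide hash<string>, so a string hash still has to be supplied:
hash_map< string, Strain *, hash<string> > liveStrainTable;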
Others answered your direct question, but may I suggest using unordered_map instead: it has been part of the C++ standard library since C++11 and is supported by all major compilers.
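A minimal sketch of the unordered_map suggestion, assuming a C++11 compiler. The Strain struct here is a hypothetical stand-in for the class in the question, and lookup_id is a made-up helper.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the Strain class from the question.
struct Strain {
    int id;
};

// Hypothetical helper: registers a strain under its sequence and looks it up.
int lookup_id(const std::string& sequence) {
    std::unordered_map<std::string, Strain*> liveStrainTable;
    Strain first = { 7 };
    liveStrainTable[sequence] = &first;  // std::hash<std::string> is provided
    auto it = liveStrainTable.find(sequence);
    return it != liveStrainTable.end() ? it->second->id : -1;
}
```

With std::string keys, neither a custom hash nor a custom equality functor is needed, and the table owns its keys.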
hash_map is not part of the STL. There's no implementation provided for hash<string>; in other words, hash_map can't hash std::string keys by default. You need your own hash function.
Try:
struct strhash {
    size_t operator()( const string& str ) const {
        return __gnu_cxx::__stl_hash_string( str.c_str() );
    }
};
// eqstr takes const char *, so let the default equal_to<string> compare the keys.
hash_map< string, Strain *, strhash > liveStrainTable;
The hash_map is defined with const char * as the key type and you are using an std::string as the key when accessing. These are 2 different types, the template did not build an operator for the second type, so this is an error. Use std::string for the hashmap definition or use MRCA.c_str()
Right now, you have a type mis-match. You're passing MRCA (a string) where a char const * is expected. You can either use c_str() to get a char const * from the string, or (far better) change the definition of your hash table to take a string as its key type.