Using an array as key in map C++ - c++

Basically, I need to find all matching anagrams to a word. What I was doing was using an array of size 26 to represent the letters in a word.
Ex:
abcdefg={1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}
aaaaaaa={7,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}
This is how I'm creating the array.
//stringtemp is a C++ string representing the word.
//letters is a size 26 int array representing all the letters in the string.
for(int i=0;i<stringtemp.length();i++)
{
letters[stringtemp[i]-'a']+=1; //assumes only lowercase a-z, as in the examples above
}
And this is how I'm storing the array in the map.
dictionary[letters].push_back(stringtemp);
So, am I doing something wrong, or is this impossible in C++? All the other answers I found suggested using a vector as the key, but that won't work in my case (I think).

All of std::array<T, 26>, std::string and std::vector<T> are perfectly valid key types for a std::map, since they all define less-than comparison operators. Note that std::array<T, 26> is similar to std::tuple<T, T, ..., T>, and comparison is defined lexicographically, very similar to string comparison.
#include <array>
#include <map>
typedef std::array<unsigned int, 26> alphabet;
std::map<alphabet, std::string> dictionary;
dictionary[{{1, 0, ..., 8}}] = "hello";
With a bit more work, you can also make all of those types keys for an std::unordered_map, though you'll have to add a bit of boilerplate code from Boost (using hash_combine).
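As a rough sketch of the unordered_map route without pulling in Boost, the hash can be combined by hand in the style of boost::hash_combine. The names LetterCounts, LetterCountsHash, countLetters, AnagramIndex and groupAnagrams are made up for this illustration, and the code assumes plain ASCII words (anything outside a-z after lowercasing is skipped).

```cpp
#include <array>
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical names for illustration; none of them come from the question.
using LetterCounts = std::array<unsigned, 26>;

// Hash functor that folds the 26 counts together, in the spirit of
// boost::hash_combine (same mixing constant and shifts).
struct LetterCountsHash {
    std::size_t operator()(const LetterCounts& counts) const {
        std::size_t seed = 0;
        for (unsigned c : counts) {
            seed ^= std::hash<unsigned>{}(c) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
        }
        return seed;
    }
};

// Count letters case-insensitively; characters outside a-z are skipped.
LetterCounts countLetters(const std::string& word) {
    LetterCounts counts{};  // value-initialized: all zeros
    for (unsigned char ch : word) {
        int lower = std::tolower(ch);
        if (lower >= 'a' && lower <= 'z') ++counts[lower - 'a'];
    }
    return counts;
}

// std::array already provides operator==, so the default KeyEqual is enough.
using AnagramIndex =
    std::unordered_map<LetterCounts, std::vector<std::string>, LetterCountsHash>;

// Group words that share the same letter counts.
AnagramIndex groupAnagrams(const std::vector<std::string>& words) {
    AnagramIndex index;
    for (const std::string& w : words) index[countLetters(w)].push_back(w);
    return index;
}
```

Looking up the anagrams of a word is then a matter of calling countLetters on it and using the result as the key.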

std::map allows you to provide a Compare operator in the constructor. You may need to provide such a Comparator in order for two arrays {1,....} and {1,....} to match since they may be different actual objects.

The key type in a map must have an operator< defined for it. You could define operator< for your array type, but there's a much simpler approach: sort the letters in each word into alphabetical order, and use that sorted string as the key.
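A minimal sketch of the sorted-string approach, assuming the words are plain std::string values. The helper names signatureOf and buildDictionary are hypothetical.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: the sorted letters of a word act as its anagram signature.
// The parameter is taken by value so the caller's string is left untouched.
std::string signatureOf(std::string word) {
    std::sort(word.begin(), word.end());
    return word;
}

// Map each signature to every word that produces it.
std::map<std::string, std::vector<std::string>>
buildDictionary(const std::vector<std::string>& words) {
    std::map<std::string, std::vector<std::string>> dictionary;
    for (const std::string& w : words) dictionary[signatureOf(w)].push_back(w);
    return dictionary;
}
```

All anagrams of a word then sit under dictionary[signatureOf(word)], and std::string already has the operator< that std::map needs.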

Related

Same key, multiple entries for std::unordered_map?

I have a map inserting multiple values with the same key of C string type.
I would expect to have a single entry with the specified key.
However, the map seems to take its address into consideration when uniquely identifying a key.
#include <cassert>
#include <iostream>
#include <string>
#include <unordered_map>
typedef char const* const MyKey;
/// @brief Hash function for StatementMap keys
///
/// Delegates to std::hash<std::string>.
struct MyMapHash {
public:
size_t operator()(MyKey& key) const {
return std::hash<std::string>{}(std::string(key));
}
};
typedef std::unordered_map<MyKey, int, MyMapHash> MyMap;
int main()
{
// Build std::strings to prevent optimizations on the addresses of
// underlying C strings.
std::string key1_s = "same";
std::string key2_s = "same";
MyKey key1 = key1_s.c_str();
MyKey key2 = key2_s.c_str();
// Make sure addresses are different.
assert(key1 != key2);
// Make sure hashes are identical.
assert(MyMapHash{}(key1) == MyMapHash{}(key2));
// Insert two values with the same key.
MyMap map;
map.insert({key1, 1});
map.insert({key2, 2});
// Make sure we find them in the map.
auto it1 = map.find(key1);
auto it2 = map.find(key2);
assert(it1 != map.end());
assert(it2 != map.end());
// Get values.
int value1 = it1->second;
int value2 = it2->second;
// The first one of any of these asserts fails. Why is there not only one
// entry in the map?
assert(value1 == value2);
assert(map.size() == 1u);
}
A print in the debugger shows that map contains two elements just after inserting them.
(gdb) p map
$4 = std::unordered_map with 2 elements = {
[0x7fffffffda20 "same"] = 2,
[0x7fffffffda00 "same"] = 1
}
Why does this happen if the hash function, which delegates to std::hash<std::string>, only takes its value into account (this is asserted in the code)?
Moreover, if this is the intended behaviour, how can I use a map with C string as key, but with a 1:1 key-value mapping?
The reason is that hash maps (like std::unordered_map) do not rely only on the hash function to determine whether two keys are equal. The hash function is the first comparison layer; after that, the keys are always also compared for equality. The reason is that even with good hash functions you might have collisions where two different keys yield the same hash value, but you still need to be able to save both entries in the hash map. There are various strategies to handle that; you can find more information by searching for collision resolution for hash maps.
In your example both entries have the same hash value but different keys. The keys are compared by the default equality function, which compares the char* pointers, and those are different. Therefore the equality comparison fails and you get two entries in the map. To solve your issue you also need to define a custom equality function for your hash map, which can be done by specifying the fourth template parameter, KeyEqual, of std::unordered_map.
This fails because the unordered_map does not and cannot solely rely on the hash function for the key to differentiate keys, but it must also compare keys with the same hash for equality. And comparing two char pointers compares the address pointed to.
If you want to change the comparison, pass a KeyEqual parameter to the map in addition to the hash.
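// Needs #include <cstring> for std::strcmp.
struct MyKeyEqual
{
bool operator()(MyKey const &lhs, MyKey const &rhs) const
{
return std::strcmp(lhs, rhs) == 0; // compare the characters, not the addresses
}
};
typedef std::unordered_map<MyKey, int, MyMapHash, MyKeyEqual> MyMap;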
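Putting the hash and the equality functor together, a self-contained sketch could look like the following. It uses a plain const char* key rather than the question's MyKey typedef, and the names CStrHash, CStrEqual and CStrMap are invented for the example.

```cpp
#include <cstddef>
#include <cstring>
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical names; the question's MyKey is replaced by a plain const char*.
struct CStrHash {
    std::size_t operator()(const char* key) const {
        // Builds a temporary std::string, so the characters are hashed, not the address.
        return std::hash<std::string>{}(key);
    }
};

struct CStrEqual {
    bool operator()(const char* lhs, const char* rhs) const {
        return std::strcmp(lhs, rhs) == 0;  // equal contents, regardless of address
    }
};

// Third parameter: hash; fourth parameter: KeyEqual.
using CStrMap = std::unordered_map<const char*, int, CStrHash, CStrEqual>;
```

With both functors in place, inserting "same" through two different buffers leaves a single entry. Note that the map still stores only pointers, so the strings they point to must outlive the map.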
unordered_map needs to be able to perform two operations on the key: checking equality and obtaining a hash code. Naturally, two unequal keys are allowed to have the same hash code. When this happens, unordered_map applies a hash collision resolution strategy to treat these unequal keys as distinct.
That is precisely what happens when you supply a character pointer for the key, and provide an implementation of hash to it: the default equality comparison for pointers kicks in, so two different pointers produce two different keys, even though the content of the corresponding C strings is the same.
You can fix it by providing a custom implementation of KeyEqual template parameter to perform actual comparison of C strings, for example, by calling strcmp:
return !strcmp(lhsKey, rhsKey);
You didn't define a map of keys but a map of pointers to a key.
typedef char const* const MyKey;
The compiler may merge the two instances of "same" into a single instance in the const data segment, but that may or may not happen; whether identical string literals share an address is unspecified.
Your map should contain the key itself. Make the key a std::string or similar.
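A small sketch of that suggestion, assuming it is acceptable for the map to copy each key. StringMap and insertIfAbsent are hypothetical names.

```cpp
#include <string>
#include <unordered_map>

// With std::string keys, the default hash and equality both look at the characters.
using StringMap = std::unordered_map<std::string, int>;

// Insert from a C string; the key is copied into the map as a std::string.
// Returns false if an equal key is already present (the old value is kept).
bool insertIfAbsent(StringMap& map, const char* key, int value) {
    return map.emplace(key, value).second;
}
```

Because the map owns its keys, the lifetime of the original C strings no longer matters.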

C++ sort cannot sort set of strings?

I am wondering how I can sort a set that contains strings. For example, I have a set:
std::set<std::string> setA = {"B","A","C"};
Then I want to use this to do the sorting:
std::sort(setA.begin(),setA.end());
But the C++ compiler rejects it. The error message reports:
40: error: invalid operands to binary expression
('std::__1::__tree_const_iterator<std::__1::basic_string<char>, std::__1::__tree_node<std::__1::basic_string<char>, void *> *, long>'
and 'std::__1::__tree_const_iterator<std::__1::basic_string<char>, std::__1::__tree_node<std::__1::basic_string<char>, void *> *, long>')
difference_type __len = __last - __first;
Then I rechecked the sort function in C++. It seems that it can only deal with int, double, long ... and that there is no way to use sort() to sort strings.
So how can I sort strings?
std::sort requires random access iterators while std::set provides only bidirectional iterators.
Generally speaking, any attempt to sort a std::set contradicts the design of this container, because it stores its elements in sorted order.
From the cppreference.com:
std::set is an associative container that contains a sorted set of
unique objects of type Key.
You cannot re-sort a set; how it sorts is part of the type of the particular set. A given set has a fixed order that cannot be changed.
You could create a new set with the same data relatively easily. Just create a new set that sorts based on the new criteria.
If you want to use the two sets in the same code, you'll have to abstract the access to the underlying set.
Copied shamelessly from Sorting Sets using std::sort. Change your container.
You just insert these three strings, and they are already sorted if you're using std::set.
set<string> s;
s.insert("A");
...
OR
set<string> str = {"A", "B", "C", "D"}; //C++ 11
OR
string s[] = {"A", "B", "C", "D"};
set<string> str(s, s+ sizeof(s) / sizeof(s[0]));
And they are sorted.
If you want custom sorting (which is probably the case with you?)
Then use vector<string>() and sort()
bool cmp(const string& a, const string& b)
{
// replace with your own ordering; it must return true when a should come before b
return a < b;
}
vector<string> v;
v.push_back("s");
...
sort(v.begin(),v.end(),cmp);
By the way, you can't re-sort a set: the ordering of its elements is fixed by the set's comparison function (std::less by default), not by any later call to sort.

Is there a CompareTo method in C++ similar to Java where you can use > < = operations on a data type

I know that in Java there is a compareTo method that you can write in a class that will compare two objects and return -1, 1, or 0, signifying less than, greater than, and equal to. Is there a way to do this in C++?
Background:
I'm creating a modified string class that holds a string and an array list. I want to be able to compare the strings in the traditional fashion: a word that comes earlier in the alphabet is less than one that comes later. The array list stores the pages on which the word was indexed in a text file. Anyway, the specifics do not matter since I already have the class written. I just need to create a compareTo method that can be used in the main of my cpp file or by other data types, such as various trees.
I'll write the code in Java, as I know how, and maybe someone can help me with the C++ syntax (I'm required to write in C++ for this project, unfortunately, and I am new to C++).
I will shorten the code to give a rough outline of what I'm doing, then write the compareTo method as I know how in Java.
class name ModifiedString
Has variables: word , arraylist pagelist
Methods:
getWord (returns the word associated with the class, i.e its string)
appendPageList (adds page numbers to the array list, this doesnt matter in this question)
Here's how I would do it in Java:
int compareTo(ModifiedString a){
if(this.getWord() > a.getWord())
return 1;
else if (this.getWord() < a.getWord())
return -1;
else return 0;
}
Then when <, >, or == is used on a ModifiedString, the operations would be valid.
std::string already includes a working overload of operator<, so you can just compare strings directly. Java uses compareTo primarily because its built-in comparison operators aren't generally useful for strings (== compares references, and < and > don't apply to objects at all). Java doesn't support user-defined operator overloads, so it uses compareTo to fill that gap.
From your description, however, you don't need to deal with any of that directly at all. At least as you've described the problem, what you really want is something like:
std::map<std::string, std::vector<int> > page_map;
You'll then read words in from your text file, and insert the page number where each occurs into the page map:
page_map[current_word].push_back(current_page);
Note that I've used std::map above, on the expectation that you may want ordered results (e.g., be able to quickly find all words from age to ale in alphabetical order). If you don't care about ordering, you may want to use std::unordered_map instead.
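To make the ordering benefit concrete, here is a small sketch of a range query over such a map. The helper wordsInRange and the alias PageMap are hypothetical, and the range is treated as inclusive at both ends.

```cpp
#include <map>
#include <string>
#include <vector>

using PageMap = std::map<std::string, std::vector<int>>;

// Hypothetical helper: collect every indexed word w with first <= w <= last.
std::vector<std::string> wordsInRange(const PageMap& pages,
                                      const std::string& first,
                                      const std::string& last) {
    std::vector<std::string> result;
    if (last < first) return result;  // empty range; avoids walking past the end

    // lower_bound: first key >= first; upper_bound: first key > last.
    auto end = pages.upper_bound(last);
    for (auto it = pages.lower_bound(first); it != end; ++it) {
        result.push_back(it->first);
    }
    return result;
}
```

Both bounds are found in logarithmic time, which is what an unordered_map cannot offer.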
Edit: here's a simple text cross-reference program that reads a text file (from standard input) and writes out a cross-reference by line number (i.e., each "word", and the numbers of the lines on which that word appeared).
#include <map>
#include <unordered_map>
#include <iostream>
#include <string>
#include <vector>
#include <sstream>
#include <iterator>
#include <algorithm>
#include "infix_iterator.h"
typedef std::map<std::string, std::vector<unsigned> > index;
namespace std {
ostream &operator<<(ostream &os, index::value_type const &i) {
os << i.first << ":\t";
std::copy(i.second.begin(), i.second.end(),
infix_ostream_iterator<unsigned>(os, ", "));
return os;
}
}
void add_words(std::string const &line, size_t num, index &i) {
std::istringstream is(line);
std::string temp;
while (is >> temp)
i[temp].push_back(num);
}
int main() {
index i;
std::string line;
size_t line_number = 0;
while (std::getline(std::cin, line))
add_words(line, ++line_number, i);
std::copy(i.begin(), i.end(),
std::ostream_iterator<index::value_type>(std::cout, "\n"));
return 0;
}
If you look at the first typedef (of index), you can change it from map to unordered_map if you want to test a hash table vs. a red-black tree. Note that this interprets "word" pretty loosely -- basically any sequence of non-whitespace characters, so for example, it'll treat example, as a "word" (and it'll be separate from example).
Note that this uses the infix_iterator I've posted elsewhere.
Before C++20, there was no standard operator in C++ that does what the Java compareTo() function does (C++20 added the three-way comparison operator<=>). You can, however, implement
int compareTo(const ModifiedString&, const ModifiedString&);
Another option is to overload the <, <=, >, >=, == and != operators, e.g. by implementing
bool operator<(const ModifiedString&, const ModifiedString&);
In C++, you define bool operator< directly, with no need to invent funny names; the same goes for operator> and operator==. They're generally implemented as member functions taking one extra argument, the right-hand side, but you could also define them as non-member functions taking two arguments.
Sun decided not to include operator overloading in Java, so they provided an in-class way (through member functions) to do that job: the equals() and compareTo() functions.
C++ has operator overloading, which allows you to specify the behaviour of the language operators within your own types.
To learn how to overload operators, I suggest you read this thread: Operator overloading
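As a hedged sketch of what this could look like for the asker's class: the member layout below is guessed from the question (a word plus a page list), and the trailing-underscore member names are assumptions. It offers both a Java-style compareTo and the usual comparison operators.

```cpp
#include <string>
#include <utility>
#include <vector>

// Sketch of the asker's class; the member layout is an assumption based on the question.
class ModifiedString {
public:
    explicit ModifiedString(std::string word) : word_(std::move(word)) {}

    const std::string& getWord() const { return word_; }
    void appendPageList(int page) { pages_.push_back(page); }

    // Java-style three-way result: negative, zero, or positive.
    int compareTo(const ModifiedString& other) const {
        return word_.compare(other.word_);
    }

private:
    std::string word_;
    std::vector<int> pages_;
};

// Non-member operators that delegate to std::string's own comparisons.
inline bool operator<(const ModifiedString& a, const ModifiedString& b) {
    return a.getWord() < b.getWord();
}
inline bool operator>(const ModifiedString& a, const ModifiedString& b) {
    return b < a;
}
inline bool operator==(const ModifiedString& a, const ModifiedString& b) {
    return a.getWord() == b.getWord();
}
inline bool operator!=(const ModifiedString& a, const ModifiedString& b) {
    return !(a == b);
}
```

With operator< defined, ModifiedString can be used directly as the element type of std::set or the key type of std::map.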

Can we hold 2 data types in a STL list?

I want my list to hold an integer value as well as a string value. Is this possible?
I am implementing a hash table using STL lists, which currently store only the integer. I am hashing a string to get the index where I store my integer. Now I want my string to be stored with the integer as well.
EDIT 1:
So I am using this statement:
list<pair<int,string>> table[127];
and here is the error I'm getting:
'>>' should be '> >' within a nested template argument list
OK, I fixed this. It seems I didn't put a space in the ">>", so now it's fixed.
Next question:
How do I add my pair to the table array?
You can have a list of std::pairs or, with c++11, std::tuple, for example:
std::list < std::pair< int, std::string > >list;
std::list < std::tuple< int, std::string > >list;
To access the elements inside a pair, use pair.first and pair.second. To access the elements inside a tuple, use std::get:
auto t = std::make_tuple(1,"something");
std::get<0>(t);//will get the first element of the tuple
You can use std::pair or std::tuple,
std::list<std::pair<int, string>> list;
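To add a pair to an array of such lists, pick the bucket by index and push_back a pair built with std::make_pair. The sketch below assumes a simple character-sum hash (the question does not show its hash function), and the names Bucket, bucketOf, store and lookup are hypothetical. It keeps the "> >" spacing so it also compiles on pre-C++11 compilers.

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <utility>

const std::size_t kBuckets = 127;  // matches the table size in the question
typedef std::list<std::pair<int, std::string> > Bucket;

// Hypothetical hash: sum of the character codes modulo the bucket count.
std::size_t bucketOf(const std::string& key) {
    std::size_t sum = 0;
    for (std::size_t i = 0; i < key.size(); ++i) {
        sum += static_cast<unsigned char>(key[i]);
    }
    return sum % kBuckets;
}

// Store the integer together with the string it was hashed from.
void store(Bucket table[], int value, const std::string& key) {
    table[bucketOf(key)].push_back(std::make_pair(value, key));
}

// Return a pointer to the stored integer, or a null pointer if the key is absent.
const int* lookup(const Bucket table[], const std::string& key) {
    const Bucket& bucket = table[bucketOf(key)];
    for (Bucket::const_iterator it = bucket.begin(); it != bucket.end(); ++it) {
        if (it->second == key) return &it->first;
    }
    return 0;
}
```

A table declared as Bucket table[kBuckets]; can then be filled with store(table, 42, "hello"). Keeping the string in the pair is what lets lookup tell apart two keys that land in the same bucket.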
You can store the string and the integer in a structure and store the objects of the structure.
Each list element can look like:
struct element {
string str;
int val;
};
This is the C way to handle it; please see @SingerOfTheFall's answer as well.

predicate for a map from string to int

I have this small program that reads a line of input & prints the words in it, with their respective number of occurrences. I want to sort the elements in the map that stores these values according to their occurrences. I mean, the words that only appear once will be ordered to be at the beginning, then the words that appeared twice & so on. I know that the predicate should return a bool value, but I don't know what the parameters should be. Should it be two iterators to the map? If someone could explain this, it would be greatly appreciated. Thank you in advance.
#include<iostream>
#include<map>
#include<string>
using std::cout;
using std::cin;
using std::endl;
using std::string;
using std::map;
int main()
{
string s;
map<string,int> counters; //store each word & an associated counter
//read the input, keeping track of each word & how often we see it
while(cin>>s)
{
++counters[s];
}
//write the words & associated counts
for(map<string,int>::const_iterator iter = counters.begin();iter != counters.end();iter++)
{
cout<<iter->first<<"\t"<<iter->second<<endl;
}
return 0;
}
std::map is always sorted according to its key. You cannot sort the elements by their value.
You need to copy the contents to another data structure (for example std::vector<std::pair<string, int> >) which can be sorted.
Here is a predicate that can be used to sort such a vector. Note that sorting algorithms in C++ standard library need a "less than" predicate which basically says "is a smaller than b".
bool cmp(std::pair<string, int> const &a, std::pair<string, int> const &b) {
return a.second < b.second;
}
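A compact sketch of the copy-and-sort approach, under the assumption that ties should keep the map's alphabetical order (hence std::stable_sort). WordCount, lessByCount and sortedByCount are hypothetical names.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <utility>
#include <vector>

// The first member is deliberately non-const so the sort can move elements around.
typedef std::pair<std::string, int> WordCount;

// "Less than" predicate on the count only.
bool lessByCount(const WordCount& a, const WordCount& b) {
    return a.second < b.second;
}

// Copy the map's entries into a vector and order them by ascending count.
std::vector<WordCount> sortedByCount(const std::map<std::string, int>& counters) {
    std::vector<WordCount> result(counters.begin(), counters.end());
    std::stable_sort(result.begin(), result.end(), lessByCount);
    return result;
}
```

The predicate takes two elements of the vector by const reference, not iterators, which answers the question about its parameters.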
You can't re-sort a map; its order is predefined (by default, from std::less on the key type). The easiest solution for your problem would be to create a std::multimap<int, string> and insert your values there, then just loop over the multimap, which will be ordered on the key type (int, the number of occurrences), which will give you the order that you want without having to define a predicate.
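The multimap idea can be sketched as follows; invertCounts is a hypothetical helper name, and the input is assumed to be the word-to-count map from the question.

```cpp
#include <map>
#include <string>
#include <utility>

// Invert a word->count map into a count->word multimap.
// A multimap is needed because several words can share the same count.
std::multimap<int, std::string> invertCounts(const std::map<std::string, int>& counters) {
    std::multimap<int, std::string> byCount;
    for (std::map<std::string, int>::const_iterator it = counters.begin();
         it != counters.end(); ++it) {
        byCount.insert(std::make_pair(it->second, it->first));
    }
    return byCount;
}
```

Looping over the returned multimap from begin() to end() visits the words in ascending order of occurrences.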
You are not going to be able to do this with one pass with an std::map. It can only be sorted on one thing at a time, and you cannot change the key in-place. What I would recommend is to use the code you have now to maintain the counters map, then use std::max_element with a comparison function that compares the second field of each std::pair<string, int> in the map.
A map has its keys sorted, not its values. That's what makes the map efficient. You cannot sort it by occurrences without using another data structure (maybe a reversed index!)
As stated, it simply won't work -- a map always remains sorted by its key value, which would be the strings.
As others have noted, you can copy the data to some other structure, and sort by the value. Another possibility would be to use a Boost bimap instead. I've posted a demo of the basic idea previously.
You probably want to copy the map<string,int> into a vector<pair<string, int> > and then sort the vector on the int member. (The pair's first member must not be const, because std::sort needs to assign elements.)
You could do
struct PairLessSecond
{
template< typename P >
bool operator()( const P& pairLeft, const P& pairRight ) const
{
return pairLeft.second < pairRight.second;
}
};
You can probably also construct all this somehow using a lambda with a bind.
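Now
std::vector< std::pair<std::string,int> > byCount( counters.begin(), counters.end() ); // copy the map's entries; the non-const first member keeps the pairs assignable
std::sort( byCount.begin(), byCount.end(), PairLessSecond() );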