I have the following problem: I want to count the occurrences of each word in a file. I'm using a map<string,Count>, where the key is the string holding the word and the value is an object that keeps the count for that word:
class Count {
int i;
public:
Count() : i(0) {}
void operator++(int) { i++; } // Post-increment
int& val() { return i; }
};
The problem is that I want to use insert() instead of the operator[]. Here is the code.
typedef map<string, Count> WordMap;
typedef WordMap::iterator WMIter;
int main( ) {
ifstream in("D://C++ projects//ReadF.txt");
WordMap wordmap;
string word;
WMIter it;
while (in >> word){
// wordmap[word]++; // not that way
if((it= wordmap.find(word)) != wordmap.end()){ //if the word already exists
wordmap.insert(make_pair(word, (*it).second++); // how do I increment the value ?
}else{
...
}
for (WMIter w = wordmap.begin();
w != wordmap.end(); w++)
cout << (*w).first << ": "
<< (*w).second.val() << endl;
}
Could you refactor so as not to use find but simply attempt the insert?
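The single-element insert() returns a std::pair<iterator, bool>. The bool is false if the key is already present, in which case the iterator points to the existing element. So we can use that iterator to increment the value (the snippet below assumes the mapped type is a plain int rather than the Count class):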
// On successful insertion, we get a count of 1 for that word:
auto result_pair = wordmap.insert( { word, 1 } );
// Increment the count if the word is already there:
if (!result_pair.second)
result_pair.first->second++;
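For reference, here is a minimal self-contained sketch of the same idea. It assumes the mapped type is a plain int, and count_words is a hypothetical helper name that does not appear in the question; it reads from a string rather than a file so it can be tried in isolation.

```cpp
#include <map>
#include <sstream>
#include <string>

// Count the words in `text` using only map::insert (no operator[], no find).
// count_words is a hypothetical helper, not part of the original post.
std::map<std::string, int> count_words(const std::string& text) {
    std::map<std::string, int> counts;
    std::istringstream in(text);
    std::string word;
    while (in >> word) {
        // insert returns {iterator, bool}; the bool is false if the key existed.
        auto result = counts.insert({word, 1});
        if (!result.second)
            ++result.first->second;  // key already present: bump its count
    }
    return counts;
}
```

A new word is stored with a count of 1, and a repeated word costs a single lookup because the iterator returned by insert is reused for the increment.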
It was my first time posting. I'm learning C++ and welcome feedback on my idea.
The problem is that I want to use insert() instead of the operator[]
...why? std::map::insert cannot mutate existing values. operator[] is the right tool for this job.
If you really want to use insert (please don't), you first need to erase the existing value, if present:
if((it= wordmap.find(word)) != wordmap.end())
{
const auto curr = it->second; // current number of occurrences
wordmap.erase(word);
wordmap.insert(make_pair(word, curr + 1));
}
Say I have
vector<shared_ptr<string>> enemy;
how do I remove elements from the enemy vector?
Thanks for your help in advance
Edit (code in context):
void RemoveEnemy( vector<shared_ptr<Enemy>> & chart, string id )
{
int i = 0;
bool found = FALSE;
for(auto it = chart.begin(); it != chart.end(); i++)
{
if(id == chart[i]->GetEnemyID() )
{
found = TRUE;
chart.erase(it);
}
}
The code above segfaults.
You remove elements the same way you remove any elements from any std::vector - via the std::vector::erase() method, for instance. All you need for that is an iterator to the desired element to remove.
In your case, since you are storing std::shared_ptr<std::string> objects rather than storing actual std::string objects, you may need to use something like std::find_if() to find the vector element containing the desired string value, eg:
void removeEnemy(string name)
{
auto iter = std::find_if(enemy.begin(), enemy.end(),
[&](auto &s){ return (*s == name); }
);
if (iter != enemy.end())
enemy.erase(iter);
}
UPDATE: in the new code you have added, you are incorrectly mixing indexes and iterators together. You are creating an infinite loop if the vector is not empty, as you never increment the it iterator that controls your loop, you are incrementing your index i variable instead (see what happens when you don't give your variables unique and meaningful names?). So you end up going out of bounds of the vector into surrounding memory. That is why you get the segfault error.
Even though you are (trying to) use an iterator to loop through the vector, you are using indexes to access the elements, instead of dereferencing the iterator to access the elements. You don't need to use indexes at all in this situation, the iterator alone will suffice.
Try this instead:
void RemoveEnemy( vector<shared_ptr<Enemy>> & chart, string id )
{
for(auto it = chart.begin(); it != chart.end(); ++it)
{
// *it is the shared_ptr; dereference it before calling through to Enemy
if (id == (*it)->GetEnemyID() )
{
chart.erase(it);
return;
}
}
}
Or, using the kind of code I suggested earlier:
void RemoveEnemy( vector<shared_ptr<Enemy>> & chart, string id )
{
auto iter = std::find_if(chart.begin(), chart.end(),
[&](auto &enemy){ return (enemy->GetEnemyID() == id); }
);
if (iter != chart.end())
chart.erase(iter);
}
The problem with your code is that erase() invalidates the iterator. You must use it = chart.erase(it).
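A minimal sketch of that pattern, for the case where more than one element may match. The Enemy struct here is an assumed stand-in for the class in the question, and RemoveAllEnemies is a hypothetical name.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Assumed stand-in for the Enemy class from the question.
struct Enemy {
    std::string id;
    const std::string& GetEnemyID() const { return id; }
};

// Remove every enemy whose ID matches; returns how many were removed.
// RemoveAllEnemies is a hypothetical name, not from the original post.
std::size_t RemoveAllEnemies(std::vector<std::shared_ptr<Enemy>>& chart,
                             const std::string& id) {
    std::size_t removed = 0;
    for (auto it = chart.begin(); it != chart.end(); ) {
        if ((*it)->GetEnemyID() == id) {
            it = chart.erase(it);  // erase returns the next valid iterator
            ++removed;
        } else {
            ++it;                  // only advance when nothing was erased
        }
    }
    return removed;
}
```

Note that the loop header has no increment: the iterator moves forward either through the return value of erase or through the explicit ++it, never both.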
I like mine which will remove aliens at high speed and without any care for the ordering of the other items. Removal with prejudice!
Note: remove_if is most often used with erase, and it preserves the order of the remaining elements. partition, by contrast, does not care about the ordering of the elements and can be faster.
partition-test.cpp:
make partition-test && echo 1 alien 9 alien 2 8 alien 4 7 alien 5 3 | ./partition-test
#include <algorithm>
#include <iostream>
#include <iterator>
#include <memory>
#include <string>
#include <vector>
using namespace std;
template <typename T>
ostream &operator<<(ostream &os, const vector<T> &container) {
bool comma = false;
for (const auto &x : container) {
if (comma)
os << ", ";
os << *x;
comma = true;
}
return os;
}
int main() {
vector<shared_ptr<string>> iv;
auto x = make_shared<string>();
while (cin >> *x) {
iv.push_back(x);
x = make_shared<string>();
}
cout << iv << '\n';
iv.erase(partition(begin(iv), end(iv),
[](const auto &x) { return *x != "alien"s; }),
end(iv));
cout << iv << '\n';
return 0;
}
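To make the difference from the note above concrete, here is a small side-by-side sketch on a plain vector of strings. drop_stable and drop_unordered are hypothetical names chosen for illustration.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Order-preserving removal: the usual erase-remove idiom.
std::vector<std::string> drop_stable(std::vector<std::string> v,
                                     const std::string& bad) {
    v.erase(std::remove(v.begin(), v.end(), bad), v.end());
    return v;
}

// Order-agnostic removal: partition moves the kept elements to the front
// in an unspecified order, then the tail is erased.
std::vector<std::string> drop_unordered(std::vector<std::string> v,
                                        const std::string& bad) {
    v.erase(std::partition(v.begin(), v.end(),
                           [&](const std::string& s) { return s != bad; }),
            v.end());
    return v;
}
```

Both leave the same set of elements behind; only drop_stable guarantees they stay in their original relative order. If order matters and you still want partition semantics, std::stable_partition is the standard alternative.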
This is a2.hpp, the file that can be edited. As far as I know the code is correct, just too slow. I am honestly lost here. I know my for loops are probably what's slowing me down so much; maybe I should use an iterator?
#ifndef A2_HPP
#define A2_HPP
// <algorithm>, <list>, <vector>
// YOU CAN CHANGE/EDIT ANY CODE IN THIS FILE AS LONG AS SEMANTICS IS UNCHANGED
#include <algorithm>
#include <list>
#include <vector>
class key_value_sequences {
private:
std::list<std::vector<int>> seq;
std::vector<std::vector<int>> keyref;
public:
// YOU SHOULD USE C++ CONTAINERS TO AVOID RAW POINTERS
// IF YOU DECIDE TO USE POINTERS, MAKE SURE THAT YOU MANAGE MEMORY PROPERLY
// IMPLEMENT ME: SHOULD RETURN SIZE OF A SEQUENCE FOR GIVEN KEY
// IF NO SEQUENCE EXISTS FOR A GIVEN KEY RETURN 0
int size(int key) const;
// IMPLEMENT ME: SHOULD RETURN POINTER TO A SEQUENCE FOR GIVEN KEY
// IF NO SEQUENCE EXISTS FOR A GIVEN KEY RETURN nullptr
const int* data(int key) const;
// IMPLEMENT ME: INSERT VALUE INTO A SEQUENCE IDENTIFIED BY GIVEN KEY
void insert(int key, int value);
}; // class key_value_sequences
int key_value_sequences::size(int key) const {
//checks if the key is invalid or the count vector is empty.
if(key<0 || keyref[key].empty()) return 0;
// sub tract 1 because the first element is the key to access the count
return keyref[key].size() -1;
}
const int* key_value_sequences::data(int key) const {
//checks if key index or ref vector is invalid
if(key<0 || keyref.size() < static_cast<unsigned int>(key+1)) {
return nullptr;
}
// ->at(1) accesses the count (skipping the key) with a pointer
return &keyref[key].at(1);
}
void key_value_sequences::insert(int key, int value) {
//checks if key is valid and if the count vector needs to be resized
if(key>=0 && keyref.size() < static_cast<unsigned int>(key+1)) {
keyref.resize(key+1);
std::vector<int> val;
seq.push_back(val);
seq.back().push_back(key);
seq.back().push_back(value);
keyref[key] = seq.back();
}
//the index is already valid
else if(key >=0) keyref[key].push_back(value);
}
#endif // A2_HPP
This is a2.cpp, this just tests the functionality of a2.hpp, this code cannot be changed
// DO NOT EDIT THIS FILE !!!
// YOUR CODE MUST BE CONTAINED IN a2.hpp ONLY
#include <iostream>
#include "a2.hpp"
int main(int argc, char* argv[]) {
key_value_sequences A;
{
key_value_sequences T;
// k will be our key
for (int k = 0; k < 10; ++k) { //the actual tests will have way more than 10 sequences.
// v is our value
// here we are creating 10 sequences:
// key = 0, sequence = (0)
// key = 1, sequence = (0 1)
// key = 2, sequence = (0 1 2)
// ...
// key = 9, sequence = (0 1 2 3 4 5 6 7 8 9)
for (int v = 0; v < k + 1; ++v) T.insert(k, v);
}
T = T;
key_value_sequences V = T;
A = V;
}
std::vector<int> ref;
if (A.size(-1) != 0) {
std::cout << "fail" << std::endl;
return -1;
}
for (int k = 0; k < 10; ++k) {
if (A.size(k) != k + 1) {
std::cout << "fail";
return -1;
} else {
ref.clear();
for (int v = 0; v < k + 1; ++v) ref.push_back(v);
if (!std::equal(ref.begin(), ref.end(), A.data(k))) {
std::cout << "fail 3 " << A.data(k) << " " << ref[k];
return -1;
}
}
}
std::cout << "pass" << std::endl;
return 0;
} // main
If anyone could help me improve my codes efficiency I would really appreciate it, thanks.
First, I'm not convinced your code is correct. In insert, if the key is valid you create a new vector and insert it into seq. That sounds wrong, as it should only happen if you have a new key, but if your tests pass it might be fine.
Performance wise:
Avoid std::list. Linked lists perform poorly on today's hardware because they defeat pipelining, caching and prefetching. Prefer std::vector by default. If the payload is really big and you are worried about copies, use std::vector<std::unique_ptr<T>>.
Try to avoid copying vectors. In your code you have keyref[key] = seq.back() which copies the vector, but should be fine since it's only one element.
Otherwise there are no obvious performance problems. Try to benchmark and profile your program to see where the slow parts are. Usually there are one or two places that you need to optimize to get great performance. If it's still too slow, ask another question where you post your results so that we can better understand the problem.
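As a rough sketch of the "vector only" advice, the class could drop the std::list entirely and index a vector of vectors by key. This assumes keys are small non-negative integers, as in the test driver; it is one possible layout, not necessarily what the assignment intends.

```cpp
#include <cstddef>
#include <vector>

// Sketch: sequences stored directly in a vector indexed by key.
// Assumes keys are small and non-negative (as in the posted test driver).
class key_value_sequences {
    std::vector<std::vector<int>> seqs;  // seqs[key] holds the values for key

public:
    // Number of values stored for key, or 0 if there are none.
    int size(int key) const {
        if (key < 0 || static_cast<std::size_t>(key) >= seqs.size()) return 0;
        return static_cast<int>(seqs[key].size());
    }

    // Pointer to the first value for key, or nullptr if there are none.
    const int* data(int key) const {
        if (size(key) == 0) return nullptr;
        return seqs[key].data();
    }

    // Append value to the sequence for key; negative keys are ignored.
    void insert(int key, int value) {
        if (key < 0) return;
        if (static_cast<std::size_t>(key) >= seqs.size()) seqs.resize(key + 1);
        seqs[key].push_back(value);
    }
};
```

Because the key is no longer stored as element 0, size() and data() need no off-by-one adjustments, and the compiler-generated copy and assignment are sufficient.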
I will join Sorin in saying don't use std::list if avoidable.
So you use key as a direct index. Where does it say it is non-negative? Where does it say it's less than 100000000?
void key_value_sequences::insert(int key, int value) {
//checks if key is valid and if the count vector needs to be resized
if(key>=0 && keyref.size() < static_cast<unsigned int>(key+1)) {
keyref.resize(key+1); // could be large
std::vector<int> val; // don't need this temporary.
seq.push_back(val); // seq is useless?
seq.back().push_back(key);
seq.back().push_back(value);
keyref[key] = seq.back(); // we now have 100000000-1 empty indexes
}
//the index is already valid
else if(key >=0) keyref[key].push_back(value);
}
Can it be done faster? Depending on your key range, yes, it can. You will need to implement a flat_map or hash_map.
C++11 concept code for a flat_map version. Here keyref is kept sorted by key, and each entry stores its key at index 0 followed by its values.
// effectively a binary search; works for both const and non-const keyref
template <class Rows>
auto find_it(Rows& rows, int key) -> decltype(rows.begin()) {
return std::lower_bound(rows.begin(), rows.end(), key,
[](const std::vector<int>& check, int k) {
return check[0] < k; // key is 0-element
});
}
void key_value_sequences::insert(int key, int value) {
auto found = find_it(keyref, key);
// at the end or not found
if (found == keyref.end() || found->front() != key) {
found = keyref.insert(found, std::vector<int>{key}); // add entry holding only the key
}
found->push_back(value); // update entry, whether new or old.
}
const int* key_value_sequences::data(int key) const {
// lower_bound returns the first entry whose key is >= key, so check for an exact match
auto found = find_it(keyref, key);
if (found == keyref.end() || found->front() != key)
return nullptr;
// skip the key stored at index 0 and return a pointer to the first value
return found->data() + 1;
}
(hope I got that right ...)
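For anyone who wants to try the flat_map idea in isolation, here is a self-contained sketch. It departs from the snippet above in one respect, which is my own choice: the key is stored in a small struct next to its values instead of at index 0. The names flat_key_value_sequences and entry are hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Sketch of a flat map: entries are kept sorted by key in one vector and
// located by binary search. Works for any int key, including negative ones.
class flat_key_value_sequences {
    struct entry {
        int key;
        std::vector<int> values;
    };
    std::vector<entry> entries;  // invariant: sorted by key, keys unique

    // Ordering used by lower_bound: compares an entry against a bare key.
    static bool key_less(const entry& e, int key) { return e.key < key; }

public:
    void insert(int key, int value) {
        auto it = std::lower_bound(entries.begin(), entries.end(), key, key_less);
        if (it == entries.end() || it->key != key)
            it = entries.insert(it, entry{key, {}});  // new key: keep sorted
        it->values.push_back(value);
    }

    int size(int key) const {
        auto it = std::lower_bound(entries.begin(), entries.end(), key, key_less);
        if (it == entries.end() || it->key != key) return 0;
        return static_cast<int>(it->values.size());
    }

    const int* data(int key) const {
        auto it = std::lower_bound(entries.begin(), entries.end(), key, key_less);
        if (it == entries.end() || it->key != key) return nullptr;
        return it->values.data();
    }
};
```

Lookups are O(log n) in the number of distinct keys and memory is proportional to the keys actually used. The trade-off is that inserting a new key in the middle shifts the entries after it, so this suits workloads with many lookups and relatively few new keys.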
I have a function that counts how many times each character occurs in a string.
For example, for the string "hello" the result is stored as [h]=>1 [e]=>1 [l]=>2 [o]=>1
Whenever a letter occurs more than once I need to update its count.
I tried using
it->second = it->second+1;
But it doesn't work.
How can I do that?
Full code is
int fn(string a) {
map<char,int> mymap;
for(int i=0;i<a.size();i++)
{
std::map<char, int>::iterator it = mymap.find(i);
if(it!=mymap.end())
{
//say i need to update occurrence from 1 to 2 or 2 to 3...
it->second = it->second+1;//(how can i do that)
}
else
mymap.insert(pair<char,int>(a[i],1));
}
std::map<char,int>::iterator i;
for(i=mymap.begin();i!=mymap.end();i++)
{
cout<<i->first<<i->second;
}
}
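The update statement itself is fine; the bug is the lookup. mymap.find(i) searches for the loop index converted to a char rather than for the character a[i], so the existing entry is generally not found. Use mymap.find(a[i]). That said, you don't need all that code. You can just say
for (auto c : a) mymap[c]++;
This works because map's operator[] inserts a value-initialized element (0 for int) when none exists for the given key.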
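Wrapped into a complete function, that one-liner might look like the sketch below. count_chars is a hypothetical name, and returning the map (rather than printing inside the function) is my own choice.

```cpp
#include <map>
#include <string>

// Count how often each character occurs in s.
// count_chars is a hypothetical helper name, not from the question.
std::map<char, int> count_chars(const std::string& s) {
    std::map<char, int> counts;
    for (char c : s)
        ++counts[c];  // operator[] creates a missing entry with value 0 first
    return counts;
}
```

The caller can then iterate over the returned map to print each character and its count, much like the printing loop in the question.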
I am trying to write a program in C++ using maps...
My goal is to avoid the same values repeated in maps.
If the keys are same, we can use maps to avoid the duplicated keys. To allow duplicate keys, we use multimaps.
In case the value is the same, how can we avoid that?
The program which I have written allows duplicated values:
typedef std::map<int, std::string> MyMap;
int main()
{
MyMap map;
MyMap::iterator mpIter;
int key;
string value;
int count;
for(count = 0; count < 3;count++)
{
cin >> key;
cin >> value;
std::pair<MyMap::iterator, bool> res = map.insert(std::make_pair(key,value));
}
for (mpIter=map.begin(); mpIter != map.end(); ++mpIter)
cout << " " << (*mpIter).second << endl;
}
Make the value part of the key and/or use a set but that may not really solve the problem. It isn't possible to easily define a container that has both unique keys AND values if that's what you want. However, you might still construct one. Here's a very simple example to illustrate what is needed:
// Assuming keys are KEY and values are VAL
class MyMap {
public:
std::set<KEY> keyset;
std::set<VAL> valset;
std::map<KEY,VAL> theRealMap;
// assuming existence of function HAS(S,V)
// which returns true if v is in set S
bool MyInsert(KEY ky, VAL val) {
if (HAS(keyset, ky)) return false;
if (HAS(valset, val)) return false;
keyset.insert(ky);
valset.insert(val);
// map::insert returns pair<iterator,bool>; the bool reports success
return theRealMap.insert(std::pair<KEY,VAL>(ky, val)).second;
}
:
:
Since this is an example, it's not intended to be copied as is. You will likely want to include the functionality provided by std::map. An easy way would be to use std::map as a base class, but you will need to hide (by making private) or implement similar code for each variant of insert; otherwise you might get an inadvertent insert that is not unique.
Note: this requires twice the size of a single map. You can save some space by using theRealMap itself instead of a separate set for the keys. Another way would be to search the map for the value, but that sacrifices time for space. It's your call.
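A compilable sketch of that idea, using the map itself for the key check and one extra set for the values (the space-saving variant from the note above). UniqueMap and its member names are hypothetical.

```cpp
#include <map>
#include <set>
#include <string>

// Sketch of a map in which both keys and values must be unique.
// UniqueMap is a hypothetical name used only for illustration.
template <class Key, class Val>
class UniqueMap {
    std::map<Key, Val> entries;
    std::set<Val> values;  // every value currently stored in entries

public:
    // Returns false (and changes nothing) if the key or the value is taken.
    bool insert(const Key& k, const Val& v) {
        if (entries.count(k) || values.count(v)) return false;
        entries.emplace(k, v);
        values.insert(v);
        return true;
    }

    // Removing a key also frees its value for reuse.
    bool erase(const Key& k) {
        auto it = entries.find(k);
        if (it == entries.end()) return false;
        values.erase(it->second);
        entries.erase(it);
        return true;
    }

    // Read-only view for iteration and lookup.
    const std::map<Key, Val>& items() const { return entries; }
};
```

Exposing only a const view of the underlying map keeps callers from inserting behind the wrapper's back, which is the concern raised above about inheriting from std::map.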
One way to do this is to maintain a separate std::set of the values. When you insert a value into a set it returns a std::pair<iterator, bool>. The bool value is true if the value was not already in the set. This tells you if it is safe to also put the value in the map.
First, however, you need to make sure the key is unique because the same key may already have been inserted with a different value:
typedef std::map<int, std::string> MyMap;
int main()
{
MyMap map;
MyMap::iterator mpIter;
int key;
string value;
int count;
// keep track of values with a std::set
std::set<std::string> values;
for(count = 0; count < 3; count++)
{
cin >> key;
cin >> value;
auto found = map.find(key);
if(found != map.end()) // key already in map
continue; // don't add it again
// now try to add it to the set
// only add to the map if its value is not already in the set
if(values.insert(value).second)
map.insert(std::make_pair(key, value));
}
for(mpIter = map.begin(); mpIter != map.end(); ++mpIter)
cout << " " << (*mpIter).second << endl;
}
One (inefficient) way to do it is to create a reverse map (with <string,int>) and insert your input in reverse order as that of MyMap into it. If ok, then insert into MyMap
Here is the working code.
typedef std::map<int, std::string> MyMap;
typedef std::map<string, int> rev_Map;
int main()
{
MyMap map;
rev_Map rmap;
MyMap::iterator mpIter;
rev_Map::iterator rmap_iter;
int key;
string value;
int count;
for(count = 0; count < 3;count++)
{
cin >> key;
cin >> value;
std::pair<rev_Map::iterator, bool> ok = rmap.insert(std::make_pair(value,key)); //insert into the reverse map
if(ok.second) //if the rmap.insert above succeeded
std::pair<MyMap::iterator, bool> res = map.insert(std::make_pair(key,value));
}
for (mpIter=map.begin(); mpIter != map.end(); ++mpIter)
cout << " " << (*mpIter).second << endl;
}
I have the following multiset in C++:
template<class T>
class CompareWords {
public:
bool operator()(T s1, T s2)
{
if (s1.length() == s2.length())
{
return ( s1 < s2 );
}
else return ( s1.length() < s2.length() );
}
};
typedef multiset<string, CompareWords<string>> mySet;
typedef std::multiset<string,CompareWords<string>>::iterator mySetItr;
mySet mWords;
I want to print each unique std::string in the set once, and next to each element I want to print how many times it appears in the set (its frequency). As you can see, the functor CompareWords keeps the set sorted.
A solution is proposed here, but it's not what I need, because I am looking for a solution that does not use loops (while, for, do while).
I know that I can use this:
// gives iterators to the beginning and end of the range of elements equal to "word"
auto p = mWords.equal_range(word);
// compute the distance between the iterators that bound the range AKA frequency
int count = static_cast<int>(std::distance(p.first, p.second));
but I can't quite come up with a solution without loops?
Unlike the other solutions, this iterates over the container exactly once. This is important, as iterating over a structure like std::multiset has reasonably high overhead (the nodes are distinct allocations).
There are no explicit loops, but the tail-end recursion will typically be optimized down to a loop by the compiler, and I call an algorithm that will run a loop.
template<class Iterator, class Clumps, class Compare>
void produce_clumps( Iterator begin, Iterator end, Clumps&& clumps, Compare&& compare) {
if (begin==end) return; // do nothing for nothing
typedef decltype(*begin) value_type_ref;
// We know runs are at least 1 long, so don't bother comparing the first time.
// Generally, advancing will have a cost similar to comparing. If comparing is much
// more expensive than advancing, then this is sub optimal:
std::size_t count = 1;
Iterator run_end = std::find_if(
std::next(begin), end,
[&]( value_type_ref v ){
if (!compare(*begin, v)) {
++count;
return false;
}
return true;
}
);
// call our clumps callback:
clumps( begin, run_end, count );
// tail end recurse:
return produce_clumps( std::move(run_end), std::move(end), std::forward<Clumps>(clumps), std::forward<Compare>(compare) );
}
The above is a relatively generic algorithm. Here is its use:
int main() {
typedef std::multiset<std::string> mySet;
typedef std::multiset<std::string>::iterator mySetItr;
mySet mWords { "A", "A", "B" };
produce_clumps( mWords.begin(), mWords.end(),
[]( mySetItr run_start, mySetItr /* run_end -- unused */, std::size_t count )
{
std::cout << "Word [" << *run_start << "] occurs " << count << " times\n";
},
CompareWords<std::string>{}
);
}
The iterators must refer to a sorted sequence (with regards to the Comparator), then the clumps will be passed to the 3rd argument together with their length.
Every element in the multiset will be visited exactly once with the above algorithm (as a right-hand side argument to your comparison function). Every start of a clump will be visited (length of clump) additional times as a left-hand side argument (including clumps of length 1). There will be exactly N iterator increments performed, and no more than N+C+1 iterator comparisons (N=number of elements, C=number of clumps).
#include <iostream>
#include <algorithm>
#include <set>
#include <iterator>
#include <string>
int main()
{
typedef std::multiset<std::string> mySet;
typedef std::multiset<std::string>::iterator mySetItr;
mySet mWords;
mWords.insert("A");
mWords.insert("A");
mWords.insert("B");
mySetItr it = std::begin(mWords), itend = std::end(mWords);
std::for_each<mySetItr&>(it, itend, [&mWords, &it] (const std::string& word)
{
auto p = mWords.equal_range(word);
int count = static_cast<int>(std::distance(p.first, p.second));
std::cout << word << " " << count << std::endl;
std::advance(it, count - 1);
});
}
Outputs:
A 2
B 1
The following does the job without an explicit loop, using recursion:
void print_rec(const mySet& set, mySetItr it)
{
if (it == set.end()) {
return;
}
const auto& word = *it;
auto next = std::find_if(it, set.end(),
[&word](const std::string& s) {
return s != word;
});
std::cout << word << " appears " << std::distance(it, next) << std::endl;
print_rec(set, next);
}
void print(const mySet& set)
{
print_rec(set, set.begin());
}
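A possible variant of the recursive approach, sketched under my own assumptions: it asks the container for upper_bound to jump to the next distinct word, so it respects whatever comparator the multiset was built with, and it collects the counts into a vector instead of printing. collect_counts and WordCounts are hypothetical names.

```cpp
#include <cstddef>
#include <iterator>
#include <set>
#include <string>
#include <utility>
#include <vector>

using WordCounts = std::vector<std::pair<std::string, std::size_t>>;

// Recursively record (word, frequency) for each distinct word in `words`.
// No explicit loop statement; upper_bound and distance do the iterating.
template <class Set>
void collect_counts(const Set& words, typename Set::const_iterator it,
                    WordCounts& out) {
    if (it == words.end()) return;
    auto next = words.upper_bound(*it);  // first element ordered after *it
    out.emplace_back(*it, static_cast<std::size_t>(std::distance(it, next)));
    collect_counts(words, next, out);    // continue with the next distinct word
}
```

The recursion depth equals the number of distinct words, so for very large vocabularies an ordinary loop may still be the safer choice.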