I've defined a class called ClusterSet that just has one field, called clusters:
class ClusterSet {
std::map<std::string, std::map<std::string, float>* > clusters;
public:
typedef std::map<std::string, std::map<std::string, float> *>::iterator iterator;
typedef std::map<std::string, std::map<std::string, float> *>::const_iterator const_iterator;
iterator begin() { return clusters.begin(); }
const_iterator begin() const { return clusters.begin(); }
iterator end() { return clusters.end(); }
const_iterator end() const { return clusters.end(); }
void create_cluster(std::string representative);
void add_member(std::string representative, std::string member, float similarity);
int write_to_file(std::string outputfile);
int size();
~ClusterSet();
};
In my create_cluster method, I use new to allocate memory for the inner map, and store this pointer in clusters. I defined a destructor so that I can deallocate all this memory:
ClusterSet::~ClusterSet() {
ClusterSet::iterator clust_it;
for (clust_it = clusters.begin(); clust_it != clusters.end(); ++clust_it) {
std::cout << "Deleting members for " << clust_it->first << std::endl;
delete clust_it->second;
}
}
When my destructor is called, it seems to deallocate all the inner maps correctly (it prints out "Deleting members for..." for each one). However, once that's done I get a runtime error that says "failed to "munmap" 1068 bytes: Invalid argument". What's causing this?
I have briefly looked at the "rule of three" but I don't understand why I would need a copy constructor or an assignment operator, or how that might solve my problem. I would never need to use either directly.
There's no good reason (and plenty of disadvantages) for dynamically allocating the inner maps. Change the outer map type to
std::map<std::string, std::map<std::string, float> >
and then you won't need to implement your own destructor and copy/move semantics at all (unless you want to change those you get from the map, perhaps to prevent copying your class).
If, in other circumstances, you really do need to store pointers to objects and tie their lifetime to their presence in the map, store smart pointers:
std::map<std::string, std::unique_ptr<something> >
If you really want to manage their lifetimes by hand for some reason, then you will need to follow the Rule of Three and give your class valid copy semantics (either preventing copying by deleting the copy constructor/assignment operator, or implementing whatever semantics you want). Even if you don't think you're copying objects, it's very easy to write code that does.
Related
If I have a class with a map as a private member such as
class MyClass
{
public:
MyClass();
std::map<std::string, std::string> getPlatforms() const;
private:
std::map<std::string, std::string> platforms_;
};
MyClass::MyClass()
:
{
platforms_["key1"] = "value1";
// ...
platforms_["keyN"] = "valueN";
}
std::map<std::string, std::string> getPlatforms() const
{
return platforms_;
}
And in my main function would there be a difference between these two pieces of code?
Code1:
MyClass myclass();
std::map<std::string, std::string>::iterator definition;
for (definition = myclass.getPlatforms().begin();
definition != myclass.getPlatforms().end();
++definition){
std::cout << (*definition).first << std::endl;
}
Code2:
MyClass myclass();
std::map<std::string, std::string> platforms = myclass.getPlatforms();
std::map<std::string, std::string>::iterator definition;
for (definition = platforms.begin();
definition != platforms.end();
++definition){
std::cout << (*definition).first << std::endl;
}
In Code2 I just created a new map variable to hold the map returned from the getPlatforms() function.
Anyway, in my real code (which I cannot post the real code from but it is directly corresponding to this concept) the first way (Code1) results in a runtime error with being unable to access memory at a location.
The second way works!
Can you enlighten me as to the theoretical underpinnings of what is going on between those two different pieces of code?
getPlatforms() returns the map by value, rather than reference, which is generally a bad idea.
You have shown one example of why it is a bad idea:
getPlatforms().begin() is an iterator on a map that is gone before the iterator is used and getPlatforms().end() is an iterator on a different copy from the same original map.
Can you enlighten me as to the theoretical underpinnings of what is going on between those two different pieces of code?
When you return by value, you return a deep copy of the data.
When you call myclass.getPlatforms().begin(); and myclass.getPlatforms().end(); you are effectively constructing two copies of your data, then getting the begin iterator from one copy and the end iterator from the other. Then, you compare the two iterators for equality; This is undefined behavior.
results in a runtime error with being unable to access memory at a location.
This is because definition is initialized, then the temporary object used to create it is deleted, invalidating the data the iterator pointed to. Then, you attempt to use the data, through the iterator.
A problem that you have is that you should be using const_iterator not iterator. This is because the function getPlatforms is const qualified, whereas the function in the map iterator begin() is not; you must use the const qualified const_iterator begin() const instead to explicitly tell the compiler you will not modify any members of the class.
Note: this is only the case for code 1, which should, by the way return const&
I am inheriting an interface, and implementing a virtual function that is supposed to do some work on a list of dynamically allocated objects. The first step is to remove duplicates from the list based on some custom equivalence criteria:
class Foo { /* ... */ };
struct FooLess
{
bool operator()(const Foo *lhs, const Foo *rhs);
}
struct FooEqual
{
bool operator()(const Foo *lhs, const Foo *rhs);
}
void doStuff(std::list<Foo*> &foos)
{
// use the sort + unique idiom to find and erase duplicates
FooLess less;
FooEqual equal;
foos.sort( foos.begin(), foos.end(), less );
foos.erase(
std::unique( foos.begin(), foos.end(), equal ),
foos.end() ); // memory leak!
}
The problem is that using sort + unique doesn't clean up the memory, and the elements to be erased have unspecified values after unique, so I cannot perform the cleanup myself before eraseing. I was considering something like this:
void doStuff(std::list<Foo*> &foos)
{
// make a temporary copy of the input as a list of auto_ptr's
std::list<auto_ptr<Foo>> auto_foos;
for (std::list<Foo>::iterator it = foos.begin(); it != foos.end(); ++it)
auto_foos.push_back(auto_ptr<Foo>(*it));
foos.clear();
FooLess less; // would need to change implementation to work on auto_ptr<Foo>
FooEqual equal; // likewise
auto_foos.sort( auto_foos.begin(), auto_foos.end(), less );
auto_foos.erase(
std::unique( auto_foos.begin(), auto_foos.end(), equal ),
auto_foos.end() ); // okay now, duplicates deallocated
// transfer ownership of the remaining objects back
for (std::list<auto_ptr<Foo>>::iterator it = auto_foos.begin();
it != auto_foos.end(); ++it)
{ foos.push_back(it->get()); it->release(); }
}
Will this be okay, or am I missing something?
I am not able to use C++11 (Boost might be possible) or change the function signature to accept a list of straightforward Foos.
To put an object into a standard container the object needs value semantics (the standard says "copy assignable" and "copy constructable"). Among other things, that means the copy constructor and assignment operator needs to create a copy of an object (leaving the original intact)
The auto_ptr copy constructor does not do that. Instead, the copy constructor and assignment operator transfer ownership of the pointer.
As a consequence, it is not possible for a standard container to contain an auto_ptr.
A lot of implementations (as in compiler and standard library) have the standard containers and/or auto_ptr coded so attempting to have a container of auto_ptr's will trigger a compiler error. Unfortunately, not all implementations do that.
There are generally the following methods you can use in C++98:
Define some pointer that will do what std::auto_ptr can't do. There was an old version of that thing, which contained an additional field of type bool that marked ownership. It was marked mutable, so it could be modified also in the object being read from when copying. The object was deleted at the end only if owned was true. Something like:
==
template <class T> class owning_ptr
{
T* ptr;
mutable bool owns;
public:
void operator =(T* src) { ptr = src; owns = true; }
owning_ptr(const owning_ptr& other)
{
// copy the pointer, but STEAL ownership!
ptr = other.ptr; owns = other.owns; other.owns = false;
}
T* release() { owns = false; return ptr; }
~owning_ptr() { if ( owns ) delete ptr; }
/* ... some lacking stuff ..*/
};
You may try out boost::shared_ptr
Instead of std::unique, you may try to do std::adjacent_find in a loop. Then you'll just find all elements that are "the same" as by your equal. If there's more than one element, you will erase them in place (you are allowed to do it because it's a list, so iterators remain valid).
I'm tyring to understand how COW works, I found following class on wikibooks, but I don't understand this code.
template <class T>
class CowPtr
{
public:
typedef boost::shared_ptr<T> RefPtr;
private:
RefPtr m_sp;
void detach()
{
T* tmp = m_sp.get();
if( !( tmp == 0 || m_sp.unique() ) ) {
m_sp = RefPtr( new T( *tmp ) );
}
}
public:
CowPtr(T* t)
: m_sp(t)
{}
CowPtr(const RefPtr& refptr)
: m_sp(refptr)
{}
CowPtr(const CowPtr& cowptr)
: m_sp(cowptr.m_sp)
{}
CowPtr& operator=(const CowPtr& rhs)
{
m_sp = rhs.m_sp; // no need to check for self-assignment with boost::shared_ptr
return *this;
}
const T& operator*() const
{
return *m_sp;
}
T& operator*()
{
detach();
return *m_sp;
}
const T* operator->() const
{
return m_sp.operator->();
}
T* operator->()
{
detach();
return m_sp.operator->();
}
};
And I would use it in my multithreaded application on map object, which is shared.
map<unsigned int, LPOBJECT> map;
So I've assigned it to template and now I have :
CowPtr<map<unsigned int, LPOBJECT>> map;
And now my questions :
How I should propertly take instance of the map for random thread which want only read map objects ?
How I should modify map object from random thread, for ex. insert new object or erase it ?
The code you post is poor to the point of being unusable; the
author doesn't seem to understand how const works in C++.
Practically speaking: CoW requires some knowledge of the
operations being done on the class. The CoW wrapper has to
trigger the copy when an operation on the wrapped object might
modify; in cases where the wrapped object can "leak" pointers
or iterators which allow modification, it also has to be able to
memorize this, to require deep copy once anything has been
leaked. The code you posted triggers the copy depending on
whether the pointer is const or not, which isn't at all the same
thing. Thus, with an std::map, calling std::map<>::find on
the map should not trigger copy on write, even if the pointer
is not const, and calling std::map<>::insert should, even if
the pointer is const.
With regards to threading: it is very difficult to make a CoW
class thread safe without grabbing a lock for every operation
which may mutate, because it's very difficult to know when
the actual objects are shared between threads. And it's even
more difficult if the object allows pointers or iterators to
leak, as do the standard library objects.
You don't explain why you want a thread-safe CoW map. What's
the point of the map if each time you add or remove an element,
you end up with a new copy, which isn't visible in other
instances? If it's just to start individual instances with
a copy of some existing map, std::map has a copy constructor
which does the job just fine, and you don't need any fancy
wrapper.
How does this work?
The class class CowPtr does hold a shared pointer to the underlying object. It does have a private method to copy construct a new object and assign the pointer to to the local shared pointer (if any other object does hold a reference to it): void detach().
The relevant part of this code is, that it has each method as
const return_type&
method_name() const
and once without const. The const after a method guarantees that the method does not modify the object, the method is called a const method. As the reference to the underlying object is const too, that method is being called every time you require a reference without modifying it.
If however you chose to modify the Object behind the reference, for example:
CowPtr<std::map<unsigned int, LPOBJECT>> map;
map->clear();
the non-const method T& operator->() is being called, which calls detach(). By doing so, a copy is made if any other CowPtr or shared_ptr is referencing the same underlying object (the instance of <unsigned int, LPOBJECT> in this case)
How to use it?
Just how you would use a std::shared_ptr or boost::shared_ptr. The cool thing about that implementation is that it does everything automatically.
Remarks
This is no COW though, as a copy is made even if you do not write, it is more a Copy if you do not guarantee that you do not write-Implementation.
I have the following structure:
struct CacheNode {
set<int> *value;
int timestamp;
CacheNode() : value(new set<int>()), timestamp(0) {}
};
And I pre-allocate a vector of them as follows:
vector<CacheNode> V(10);
When I do this, every CacheNode element in the vector points to the same set<int> in its value field. In particular,
V[0].value->insert(0);
cout << V[1].value->size() << endl;
prints out 1 instead of the 0 that I want.
What is the correct way to pre-allocate the vector (or to declare the structure) so that each CacheNode have its own set<int> instance?
(Note: I do need the value to be a pointer to a set, because it is possible in my application for some CacheNodes to share sets.)
You have violated the rule of 3. You have created an object with a non-trivial constructor, and failed to create a destructor or copy constructor or operator=.
std::vector<blah> foo(10) creates a single default constructed blah, and makes 10 copies of it in foo. Because you violated the rule of 3, these 10 copies are all identical.
The easiest method would be to do away with the new:
struct CacheNode {
std::set<int> value;
int timestamp;
CacheNode() : value(), timestamp(0) {}
};
another route would be to use a unique_ptr for lifetime management, and explicitly copy:
struct CacheNode {
std::unique_ptr<std::set<int>> value;
int timestamp;
CacheNode() : value(new std::set<int>()), timestamp(0) {}
CacheNode(CacheNode&&) = default; // C++11 feature
CacheNode(CacheNode const& other):value(new std::set<int>( *other.value ) ), timestampe(other.timestamp) {}
CacheNode& operator=(CacheNode const& other) {
value.reset(new std::set<int>(*other.value));
timestampe = other.timestamp;
return *this;
}
CacheNode& operator=(CacheNode&& other) = default;
// no need for ~CacheNode, unique_ptr handles it
};
when you want to take the std::set<int> out of your CacheNode, call CacheNode().value.release() and store the resulting std::set<int>*.
std::shared_ptr<std::set<int>> would allow shared ownership of the std::set.
There are other approaches, including making your vector store pointers to CacheNode, creating value_ptr<T> templates that do value semantics, etc.
In C++11, these are relatively easy and safe, because std::vector will move things around, and move semantics on a value_ptr<T> won't create a new T.
I am a bit leery of your plan to have shared std::set<int> between different CacheNode, because in general that is bad smell -- the ownership/lifetime of things should be clear, and in this case you have some CacheNode that own the std::set<int> and others that don't (because they share ownership). A shared_ptr can get around this, but often there are better solutions.
vector<CacheNode> V(10); creates an initial CacheNode object and then copies it 10 times. So you have 10 identical objects.
You can use generate_n:
std::vector<CacheNode> v;
std::generate_n(std::back_inserter(v), 10u, [](){ return CacheNode{}; });
Here's an example program.
You will want to use vector.assign(10, CacheNode()) as what your doing is a way of reserving space mostly.
Also you should do what the other ones say, provide virtual destructor and such.
Say I have a class Foo, which contains some kind of container, say a vector<Bar *> bars. I want to allow the user to iterate through this container, but I want to be flexible so that I might change to a different container in the future. I'm used to Java, where I could do this
public class Foo
{
List<Bar> bars; // may change to a different collection
// User would use this
public Iterator<Bar> getIter()
{
return bars.iterator(); // can change without user knowing
}
}
C++ iterators are designed to look like raw C++ pointers. How do I get the equivalent functionality? I could do the following, which returns the beginning and end of the collection as an iterator that the user can walk himself.
class Foo
{
vector<Bar *> bars;
public:
// user would use this
std::pair<vector<Bar *>::iterator , vector<Bar *>::iterator > getIter()
{
return std::make_pair(bars.begin(), bars.end());
}
}
It works, but I feel I must be doing something wrong.
Function declaration exposes the fact that I'm using a vector. If I change the implementation, I need to change the function declaration. Not a huge deal but kind of goes against encapsulation.
Instead of returning a Java-like iterator class that can do its own bounds check, I need to return both the .begin() and .end() of the collection. Seems a bit ugly.
Should I write my own iterator class?
You could adapt the vector behaviour and provide the same interface:
class Foo
{
std::vector<Bar *> bars;
public:
typedef std::vector<Bar*>::iterator iterator;
iterator begin() {
return bars.begin();
}
iterator end() {
return bars.end();
}
};
Use Foo::iterator as the iterator type outside of the container.
However, bear in mind that hiding behind the typedef offers less than it seems. You can swap the internal implementation as long as it provides the same guarantees. For example, if you treat Foo::iterator as a random access iterator, then you cannot swap a vector for a list internally at a later date without a comprehensive refactoring because list iterators are not random access.
You could refer to Scott Meyers Effective STL, Item 2: beware the illusion of container independent code for a comprehensive argument as to why it might be a bad idea to assume that you can change the underlying container at any point in future. One of the more serious points is iterator invalidation. Say you treat your iterators as bi-directional, so that you could swap a vector for a list at some point. An insertion in the middle of a vector will invalidate all of its iterators, while the same does not hold for list. In the end, the implementation details will leak, and trying to hide them might be Sisyphus work...
You are looking for type erasure. Basically you want an iterator with vector erased from it. This is roughly what it looks like:
#include <vector>
#include <memory>
#include <iostream>
template<class T>
class Iterator{ //the class that erases the iterator type
//private stuff that the user should not care about
struct Iterator_base{
virtual void increment() = 0;
virtual T &dereference() = 0;
virtual ~Iterator_base() = default;
};
std::unique_ptr<Iterator_base> iter;
template<class Iter>
class Iterator_helper : public Iterator_base{
void increment() override{
++iter;
}
T &dereference() override{
return *iter;
}
Iter iter;
public:
Iterator_helper(const Iter &iter) : iter(iter){}
};
public:
template<class Iter>
Iterator(const Iter &iter) : iter(new Iterator_helper<Iter>(iter)){}
//iterator functions for the user
Iterator &operator ++(){
iter->increment();
return *this;
}
T &operator *(){
return iter->dereference();
}
};
struct Bar{
Bar(int i) : i(i){};
int i;
};
class Foo
{
std::vector<Bar> bars;
public:
Foo(){ //just so we have some elements to point to
bars.emplace_back(1);
bars.emplace_back(2);
}
// user would use this
Iterator<Bar> begin()
{
return bars.begin();
}
};
int main(){
Foo f;
auto it = f.begin();
std::cout << (*it).i << '\n'; //1
++it; //increment
std::cout << (*it).i << '\n'; //2
(*it).i++; //dereferencing
std::cout << (*it).i << '\n'; //3
}
You can now pass any iterator (actually anything) to Iterator that support pre-increment, dereferencing and copy constuction, completely hiding the vector inside. You can even assign Iterators that have a vector::iterator inside to an Iterator that has a list::iterator inside, though that may not be a good thing.
This is a very bare-bone implementation, you would want to also implement operators ++ for post-increment, --, ->, ==, =, <, >, <=, >=, != and possibly []. Once you are done with that you need to duplicate the code into a Const_Iterator. If you don't want to do that yourself consider using boost::type_erasure.
Also note that you are paying for this encapsulation with unnecessary dynamic memory allocations, cache misses, virtual function calls that probably cannot be inlined and triply redundant code (same functions in Iterator, Iteratr_base and Iterator_helper).
vector is still present in the private part of Foo, you can get rid of that with a pimpl, adding another level of indirection.
I feel like this bit of encapsulation is not worth the cost, but your mileage may vary.