Is c++11 "range-based for" thread-safe? - c++

I have an environment modelled by lines and points packed into two std::vector.
I want to calculate a field generated by this environment, and I multithreaded the process. As the environment is fully defined at the beginning, threads should only read from it, so I don't use any synchronisation, as described here and there.
The problem comes now: when I iterate through the lines in the environment, I get two different behaviours depending on whether I use the C++11 range-based for statement or a more common for statement with iterators.
It seems that the range-based for isn't thread-safe, and I'm wondering why.
If I'm wrong in assuming that, it might mean I have a deeper problem that may reappear later.
Here is a piece of code; the first worker seems to work, the second provokes a segfault.
Worker::Worker(Environment const* e, int id) : _threadId(id), env(e)
{
}

// worker that seems to do its job
void Worker::run() const
{
    cout << "in thread n " << _threadId << endl;
    vector<Line> const* lines = &env->_lines;
    for (std::vector<Line>::const_iterator it = lines->begin(); it != lines->end(); ++it) {
        it->hello();
    }
}

// creates a segfault
void Worker::run2() const
{
    cout << "in thread n " << _threadId << endl;
    vector<Line> const& lines = env->_lines;
    for (auto it : lines) {
        it.hello();
    }
}
The simplified structure of the data, if needed:

struct Line
{
    void hello() const { std::cout << "hello" << std::endl; }
};

struct Environment
{
    std::vector<Line> _lines;
    std::vector<Point> _points;
};

Related

Prevent or detect "this" from being deleted during use

One error that I often see is a container being cleared whilst iterating through it. I have attempted to put together a small example program demonstrating this happening. One thing to note is that this can often happen many function calls deep, so it is quite hard to detect.
Note: This example deliberately shows some poorly designed code. I am trying to find a solution to detect the errors caused by writing code such as this without having to meticulously examine an entire codebase (~500 C++ units).
#include <cstdlib>   // for rand()
#include <iostream>
#include <string>
#include <vector>

class Bomb;

std::vector<Bomb> bombs;

class Bomb
{
    std::string name;
public:
    Bomb(std::string name)
    {
        this->name = name;
    }

    void touch()
    {
        if(rand() % 100 > 30)
        {
            /* Simulate everything being exploded! */
            bombs.clear();
            /* An error: "this" is no longer valid */
            std::cout << "Crickey! The bomb was set off by " << name << std::endl;
        }
    }
};

int main()
{
    bombs.push_back(Bomb("Freddy"));
    bombs.push_back(Bomb("Charlie"));
    bombs.push_back(Bomb("Teddy"));
    bombs.push_back(Bomb("Trudy"));

    for(size_t i = 0; i < bombs.size(); i++)
    {
        bombs.at(i).touch();
    }
    return 0;
}
Can anyone suggest a way of guaranteeing this cannot happen?
The only way I can currently detect this kind of thing is by replacing the global new and delete with mmap / mprotect and detecting use-after-free memory accesses. This and Valgrind, however, sometimes fail to pick it up if the vector does not need to reallocate (i.e. only some elements are removed, or the new size has not yet reached the reserved size). Ideally I don't want to have to clone much of the STL to make a version of std::vector that always reallocates on every insertion/deletion during debug / testing.
One way that almost works is if the std::vector instead contains std::weak_ptr; the use of .lock() to create a temporary reference prevents deletion while execution is within the class's methods. However, this cannot work with std::shared_ptr, because you do not need lock(), and the same goes for plain objects. Creating a container of weak pointers just for this would be wasteful.
Can anyone else think of a way to protect ourselves from this?
Easiest way is to run your unit tests with Clang MemorySanitizer linked in. Let some continuous-integration Linux box do it automatically on each push to the repo.
MemorySanitizer has use-after-destruction detection (the flag -fsanitize-memory-use-after-dtor plus the environment variable MSAN_OPTIONS=poison_in_dtor=1), so it will blow up the test that executes the offending code and turn your continuous integration red.
If you have neither unit tests nor continuous integration in place, you can also just debug your code manually under MemorySanitizer, but that is the hard way compared with the easy one. So better to start using continuous integration and to write unit tests.
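As an illustration, a typical invocation might look like the following (the file names are placeholders; the flags are the ones named above):

```shell
# Build the tests with MemorySanitizer and use-after-destruction checking.
clang++ -fsanitize=memory -fsanitize-memory-use-after-dtor -g -O1 \
    my_tests.cpp -o my_tests

# Poison object memory in destructors so reads after ~T() are reported.
MSAN_OPTIONS=poison_in_dtor=1 ./my_tests
```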
Note that there may be legitimate reasons for memory reads and writes after a destructor has run but before the memory has been freed. For example, std::variant<std::string,double> lets us assign it a std::string and then a double, so its implementation may destroy the string and reuse the same storage for the double. Filtering such cases out is unfortunately manual work at the moment, but the tools evolve.
In your particular example the misery boils down to no less than two design flaws:
Your vector is a global variable. Limit the scope of all of your objects as much as possible and issues like this are less likely to occur.
Having the single responsibility principle in mind, I can hardly imagine how one could come up with a class that needs to have some method that either directly or indirectly (maybe through 100 layers of call stack) deletes objects that could happen to be this.
I am aware that your example is artificial and intentionally bad, so please don't get me wrong here: I'm sure that in your actual case it is not so obvious how sticking to some basic design rules can prevent you from doing this. But as I said, I strongly believe that good design will reduce the likelihood of such bugs coming up. And in fact, I cannot remember ever facing such an issue, but maybe I am just not experienced enough :)
However, if this really keeps being an issue despite sticking with some design rules, then I have this idea how to detect it:
Create a member int recursionDepth in your class and initialize it with 0
At the beginning of each non-private method increment it.
Use RAII to make sure that at the end of each method it is decremented again
In the destructor check it to be 0, otherwise it means that the destructor is directly or indirectly called by some method of this.
You may want to #ifdef all of this and enable it only in debug build. This would essentially make it a debug assertion, some people like them :)
Note that this does not work in a multithreaded environment.
In the end I went with a custom iterator that, if the owning std::vector resizes whilst the iterator is still in scope, will log an error or abort (giving me a stacktrace of the program). This example is a bit convoluted, but I have tried to simplify it as much as possible and have removed unused functionality from the iterator.
This system has flagged up about 50 errors of this nature. Some may be repeats. However, Valgrind and ElectricFence at this point came up clean, which is disappointing (in total they flagged up around 10, which I have already fixed since the start of the code cleanup).
In this example I use clear(), which Valgrind does flag as an error. However, in the actual codebase it is random-access erases (i.e. vec.erase(vec.begin() + 9)) which I need to check, and Valgrind unfortunately misses quite a few.
main.cpp
#include "sstd_vector.h"
#include <cstdlib>   // for rand()
#include <iostream>
#include <string>
#include <memory>

class Bomb;

sstd::vector<std::shared_ptr<Bomb> > bombs;

class Bomb
{
    std::string name;
public:
    Bomb(std::string name)
    {
        this->name = name;
    }

    void touch()
    {
        if(rand() % 100 > 30)
        {
            /* Simulate everything being exploded! */
            bombs.clear(); // Causes an ABORT
            std::cout << "Crickey! The bomb was set off by " << name << std::endl;
        }
    }
};

int main()
{
    bombs.push_back(std::make_shared<Bomb>("Freddy"));
    bombs.push_back(std::make_shared<Bomb>("Charlie"));
    bombs.push_back(std::make_shared<Bomb>("Teddy"));
    bombs.push_back(std::make_shared<Bomb>("Trudy"));

    /* The key part is the lifetime of the iterator. If the vector
     * changes during the lifetime of the iterator, even if it did
     * not reallocate, an error will be logged */
    for(sstd::vector<std::shared_ptr<Bomb> >::iterator it = bombs.begin(); it != bombs.end(); it++)
    {
        it->get()->touch();
    }
    return 0;
}
sstd_vector.h
#include <vector>
#include <stdlib.h>

namespace sstd
{
    template <typename T>
    class vector
    {
        std::vector<T> data;
        size_t refs;

        void check_valid()
        {
            if(refs > 0)
            {
                /* Report an error or abort */
                abort();
            }
        }

    public:
        vector() : refs(0) { }

        ~vector()
        {
            check_valid();
        }

        vector& operator=(vector const& other)
        {
            check_valid();
            data = other.data;
            return *this;
        }

        void push_back(T val)
        {
            check_valid();
            data.push_back(val);
        }

        void clear()
        {
            check_valid();
            data.clear();
        }

        class iterator
        {
            friend class vector;
            typename std::vector<T>::iterator it;
            vector<T>* parent;

            iterator() { }
            iterator& operator=(iterator const&) { abort(); }

        public:
            iterator(iterator const& other)
            {
                it = other.it;
                parent = other.parent;
                parent->refs++;
            }

            ~iterator()
            {
                parent->refs--;
            }

            bool operator !=(iterator const& other)
            {
                if(it != other.it) return true;
                if(parent != other.parent) return true;
                return false;
            }

            iterator operator ++(int val)
            {
                iterator rtn = *this;
                it++;
                return rtn;
            }

            T* operator ->()
            {
                return &(*it);
            }

            T& operator *()
            {
                return *it;
            }
        };

        iterator begin()
        {
            iterator rtn;
            rtn.it = data.begin();
            rtn.parent = this;
            refs++;
            return rtn;
        }

        iterator end()
        {
            iterator rtn;
            rtn.it = data.end();
            rtn.parent = this;
            refs++;
            return rtn;
        }
    };
}
The disadvantage of this system is that I must use an iterator rather than .at(idx) or [idx]. I personally don't mind this one so much, and I can still use .begin() + idx if random access is needed.
It is a little bit slower (nothing compared to Valgrind, though). When I am done, I can do a search / replace of sstd::vector with std::vector, and there should be no performance drop.

Possible bug with GCC: foreach loops operate on shadows rather than the actual objects

I believe I've stumbled upon a bug in GCC 4.8.2.
Consider the following MCVE:
#include <iostream>
#include <string>
#include <vector>

using namespace std;

class foreachtestobject
{
    std::string somevalue;
public:
    foreachtestobject(int i)
    {
        somevalue = "default value " + to_string(i);
    }

    void reSetSomeValue(string newvalue)
    {
        somevalue = newvalue;
    }

    string getValue()
    {
        return somevalue;
    }
};

int main()
{
    vector<foreachtestobject> vec;
    vec.push_back(foreachtestobject(1));
    vec.push_back(foreachtestobject(2));

    // reading via foreach is unproblematic
    for (auto obj : vec)
    {
        cout << "Object is: " << obj.getValue() << endl;
    }

    // changing values inside a foreach
    for (auto obj : vec)
    {
        obj.reSetSomeValue("new name");
    }

    // printing a second time
    for (auto obj : vec)
    {
        cout << "Object is: " << obj.getValue() << endl;
    } // Notice that nothing has changed.

    // now changing via a conventional loop
    for (size_t i = 0; i < vec.size(); i++)
    {
        vec[i].reSetSomeValue("this worked");
    }

    // printing a third time
    for (auto obj : vec)
    {
        cout << "Object is: " << obj.getValue() << endl;
    } // Notice how the values have been changed correctly.
}
Running the code through Qt's debugger, it appears that the foreach loops create temporary copies of the objects; the memory addresses match neither of the two actual objects. So when reSetSomeValue is called, it is called on the shadow objects instead.
I might add that I am not entirely sure Qt actually compiles with GCC 4.8.2. I know that I updated GCC some time ago; I don't know if Qt automatically picks up the updated version. The command gcc --version reports 4.8.2.
This strikes me as downright odd, not to mention inefficient: if each object iterated over is copied, it represents considerable overhead. According to every source I can find, foreach loops should work in the same manner as conventional for loops, yet here they do not.
That said, is this a bug? If not, why not?
for(auto obj : vec )
does and should create copies of the elements of the range; that's in the language rules. If you want a reference, say so:
for(auto &obj : vec )
it appears that the foreach loops create temporary copies of the objects
Did it not occur to you that this is because that's the code you wrote? The only "bug" here is yours: you're operating on copies of your array elements.
If you wish to operate on the original versions, use references. For example:
for (auto& el : container)
// ^
When you have a problem, accusing the compiler of being buggy is the last resort, not the first (unless you're using Visual Studio). At least look up what the range-based for construct means and does.

An attempt to create atomic reference counting is failing with deadlock. Is this the right approach?

So I'm attempting to create a copy-on-write map that uses atomic reference counting on the read side so that reads need no locking.
Something isn't quite right. I see some references getting over-incremented and some going negative, so something isn't really atomic. In my tests I have 10 reader threads looping 100 times, each doing a get(), and 1 writer thread doing 100 writes.
It gets stuck in the writer because some of the references never go down to zero, even though they should.
I'm attempting to use the 128-bit DCAS technique explained in this blog.
Is there something blatantly wrong with this, or is there an easier way to debug it than playing with it in the debugger?
#include <atomic>
#include <cstdint>
#include <iostream>
#include <pthread.h>
#include <string>
#include <unordered_map>

typedef std::unordered_map<std::string, std::string> StringMap;

static const int zero = 0; // provides an l-value for asm code

class NonBlockingReadMapCAS {
public:

    class OctaWordMapWrapper {
    public:
        StringMap* fStringMap;
        //std::atomic<int> fCounter;
        int64_t fCounter;

        OctaWordMapWrapper(OctaWordMapWrapper* copy) : fStringMap(new StringMap(*copy->fStringMap)), fCounter(0) { }

        OctaWordMapWrapper() : fStringMap(new StringMap), fCounter(0) { }

        ~OctaWordMapWrapper() {
            delete fStringMap;
        }

        /**
         * Does a compare and swap on an octa-word - in this case, our two adjacent class members:
         * the fStringMap pointer and fCounter.
         */
        static bool inline doubleCAS(OctaWordMapWrapper* target, StringMap* compareMap, int64_t compareCounter, StringMap* swapMap, int64_t swapCounter) {
            bool cas_result;
            __asm__ __volatile__
            (
                "lock cmpxchg16b %0;"  // cmpxchg16b sets ZF on success
                "setz %3;"             // if ZF set, set cas_result to 1
                : "+m" (*target),
                  "+a" (compareMap),     // compare target's stringmap pointer to compareMap
                  "+d" (compareCounter), // compare target's counter to compareCounter
                  "=q" (cas_result)      // result
                : "b" (swapMap),         // swap target's stringmap pointer with swapMap
                  "c" (swapCounter)      // swap target's counter with swapCounter
                : "cc", "memory"
            );
            return cas_result;
        }

        OctaWordMapWrapper* atomicIncrementAndGetPointer()
        {
            if (doubleCAS(this, this->fStringMap, this->fCounter, this->fStringMap, this->fCounter + 1))
                return this;
            else
                return NULL;
        }

        OctaWordMapWrapper* atomicDecrement()
        {
            while (true) {
                if (doubleCAS(this, this->fStringMap, this->fCounter, this->fStringMap, this->fCounter - 1))
                    break;
            }
            return this;
        }

        bool atomicSwapWhenNotReferenced(StringMap* newMap)
        {
            return doubleCAS(this, this->fStringMap, zero, newMap, 0);
        }
    } __attribute__((aligned(16)));

    std::atomic<OctaWordMapWrapper*> fReadMapReference;
    pthread_mutex_t fMutex;

    NonBlockingReadMapCAS() {
        fReadMapReference = new OctaWordMapWrapper();
        pthread_mutex_init(&fMutex, NULL); // note: the original snippet never initialized fMutex
    }

    ~NonBlockingReadMapCAS() {
        delete fReadMapReference;
    }

    bool contains(const char* key) {
        std::string keyStr(key);
        return contains(keyStr);
    }

    bool contains(std::string &key) {
        OctaWordMapWrapper *map;
        do {
            map = fReadMapReference.load()->atomicIncrementAndGetPointer();
        } while (!map);
        bool result = map->fStringMap->count(key) != 0;
        map->atomicDecrement();
        return result;
    }

    std::string get(const char* key) {
        std::string keyStr(key);
        return get(keyStr);
    }

    std::string get(std::string &key) {
        OctaWordMapWrapper *map;
        do {
            map = fReadMapReference.load()->atomicIncrementAndGetPointer();
        } while (!map);
        //std::cout << "inc " << map->fStringMap << " cnt " << map->fCounter << "\n";
        std::string value = map->fStringMap->at(key);
        map->atomicDecrement();
        return value;
    }

    void put(const char* key, const char* value) {
        std::string keyStr(key);
        std::string valueStr(value);
        put(keyStr, valueStr);
    }

    void put(std::string &key, std::string &value) {
        pthread_mutex_lock(&fMutex);
        OctaWordMapWrapper *oldWrapper = fReadMapReference;
        OctaWordMapWrapper *newWrapper = new OctaWordMapWrapper(oldWrapper);
        std::pair<std::string, std::string> kvPair(key, value);
        newWrapper->fStringMap->insert(kvPair);
        fReadMapReference.store(newWrapper);
        std::cout << oldWrapper->fCounter << "\n";
        while (oldWrapper->fCounter > 0);
        delete oldWrapper;
        pthread_mutex_unlock(&fMutex);
    }

    void clear() {
        pthread_mutex_lock(&fMutex);
        OctaWordMapWrapper *oldWrapper = fReadMapReference;
        OctaWordMapWrapper *newWrapper = new OctaWordMapWrapper(oldWrapper);
        fReadMapReference.store(newWrapper);
        while (oldWrapper->fCounter > 0);
        delete oldWrapper;
        pthread_mutex_unlock(&fMutex);
    }
};
Maybe not the answer, but this looks suspicious to me:

while (oldWrapper->fCounter > 0);
delete oldWrapper;

You could have a reader thread just entering atomicIncrementAndGetPointer() when the counter is 0, thus pulling the rug out from under the reader thread by deleting the wrapper.
Edit, to sum up the comments below into a potential solution:
The best implementation I'm aware of is to move fCounter from OctaWordMapWrapper to fReadMapReference (you don't actually need the OctaWordMapWrapper class at all). When the counter is zero, swap the pointer in your writer. Because reader contention can be high enough to block the writer indefinitely, you can reserve the highest bit of fCounter as a reader lock: while this bit is set, the readers spin until it is cleared. The writer sets this bit (__sync_fetch_and_or()) when it's about to change the pointer, waits for the counter to fall to zero (i.e. existing readers finish their work), and then swaps the pointer and clears the bit.
This approach should be waterproof, though it obviously blocks readers during writes. I don't know if that is acceptable in your situation; ideally you would like this to be non-blocking.
The code would look something like this (not tested!):
class NonBlockingReadMapCAS
{
public:
    NonBlockingReadMapCAS() : m_ptr(0), m_counter(0) {}

private:
    StringMap *acquire_read()
    {
        while(1)
        {
            uint32_t counter = atom_inc(m_counter);
            if(!(counter & 0x80000000))
                return m_ptr;
            atom_dec(m_counter);
            while(m_counter & 0x80000000);
        }
        return 0;
    }

    void release_read()
    {
        atom_dec(m_counter);
    }

    void acquire_write()
    {
        uint32_t counter = atom_or(m_counter, 0x80000000);
        assert(!(counter & 0x80000000));
        while(m_counter & 0x7fffffff);
    }

    void release_write()
    {
        atom_and(m_counter, uint32_t(0x7fffffff));
    }

    StringMap *volatile m_ptr;
    volatile uint32_t m_counter;
};
Just call acquire/release_read/write() before and after accessing the pointer for reading/writing. Replace atom_inc/dec/or/and() with __sync_fetch_and_add(), __sync_fetch_and_sub(), __sync_fetch_and_or() and __sync_fetch_and_and() respectively. You don't actually need doubleCAS() for this.
As noted correctly by @Quuxplusone in a comment below, this is a single-producer, multiple-consumer implementation. I modified the code to assert properly in order to enforce this.
Well, there are probably lots of problems, but here are the obvious two.
The most trivial bug is in atomicIncrementAndGetPointer. You wrote:
if (doubleCAS(this, this->fStringMap, this->fCounter, this->fStringMap, this->fCounter +1))
That is, you're attempting to increment this->fCounter in a lock-free way. But it doesn't work, because you're fetching the old value twice with no guarantee that the same value is read each time. Consider the following sequence of events:
Thread A fetches this->fCounter (with value 0) and computes argument 5 as this->fCounter +1 = 1.
Thread B successfully increments the counter.
Thread A fetches this->fCounter (with value 1) and computes argument 3 as this->fCounter = 1.
Thread A executes doubleCAS(this, this->fStringMap, 1, this->fStringMap, 1). It succeeds, of course, but we've lost the "increment" we were trying to do.
What you wanted is more like
StringMap* oldMap = this->fStringMap;
int64_t oldCounter = this->fCounter;
if (doubleCAS(this, oldMap, oldCounter, oldMap, oldCounter + 1))
    ...
The other obvious problem is that there's a data race between get and put. Consider the following sequence of events:
Thread A begins to execute get: it fetches fReadMapReference.load() and prepares to execute atomicIncrementAndGetPointer on that memory address.
Thread B finishes executing put: it deletes that memory address. (It is within its rights to do so, because the wrapper's reference count is still at zero.)
Thread A starts executing atomicIncrementAndGetPointer on the deleted memory address. If you're lucky, you segfault, but of course in practice you probably won't.
As explained in the blog post:
The garbage collection interface is omitted, but in real applications you would need to scan the hazard pointers before deleting a node.
Another user has suggested a similar approach, but if you are compiling with gcc (and perhaps with clang), you could use the intrinsic __sync_add_and_fetch_4 which does something similar to what your assembly code does, and is likely much more portable.
I have used it when I implemented refcounting in an Ada library (but the algorithm remains the same).
int __sync_add_and_fetch_4 (int* ptr, int value);
// increments the value pointed to by ptr by value, and returns the new value
Although I'm not sure how your reader threads work, I suspect your problem is that you are not catching and handling possible std::out_of_range exceptions in your get() method, which might arise from this line: std::string value = map->fStringMap->at(key);. Note that if key is not found in the map, this will throw and exit the function without decrementing the counter, which would lead to the condition you describe (getting stuck in the while-loop in the writer thread, waiting for the counters to decrement).
In any event, whether or not this is the cause of the issues you're seeing, you definitely need to either handle this exception (and any others) or modify your code so that there's no risk of a throw. For the at() method, I would probably just use find() instead and check the iterator it returns. More generally, I would suggest using the RAII pattern to ensure that you don't let any unexpected exceptions escape without unlocking/decrementing. For example, you might check out boost::scoped_lock to wrap your fMutex, and then write something simple like this for the OctaWordMapWrapper increment/decrement:
class ScopedAtomicMapReader
{
public:
    explicit ScopedAtomicMapReader(std::atomic<OctaWordMapWrapper*>& map) : fMap(NULL) {
        do {
            fMap = map.load()->atomicIncrementAndGetPointer();
        } while (NULL == fMap);
    }

    ~ScopedAtomicMapReader() {
        if (NULL != fMap)
            fMap->atomicDecrement();
    }

    OctaWordMapWrapper* map(void) {
        return fMap;
    }

private:
    OctaWordMapWrapper* fMap;
}; // class ScopedAtomicMapReader
With something like that, then for example, your contains() and get() methods would simplify to (and be immune to exceptions):
bool contains(std::string &key) {
    ScopedAtomicMapReader mapWrapper(fReadMapReference);
    return (mapWrapper.map()->fStringMap->count(key) != 0);
}

std::string get(std::string &key) {
    ScopedAtomicMapReader mapWrapper(fReadMapReference);
    return mapWrapper.map()->fStringMap->at(key); // Now it's fine if this throws...
}
Finally, although I don't think you should have to do this, you might also try declaring fCounter as volatile, given that the reads of it in the while-loop in the put() method happen on a different thread than the writes to it on the reader threads.
Hope this helps!
By the way, one other minor thing: fReadMapReference is leaking. I think you should delete it in your destructor.

tbb::concurrent_hash_map throws SIGSEGV

I'm running a small program built using TBB on Windows with mingw32. It does a parallel_for; inside the parallel_for, my object makes changes to a concurrent_hash_map object. It starts running but later raises SIGSEGV when I try to use an accessor. I don't know where the problem is.
My object:
class Foobar
{
public:
    Foobar(FoobarParent* rw) : _rw(rw)
    {
        _fooMap = &_rw->randomWalkers();
    }

    void operator() (const tbb::blocked_range<size_t>& r) const
    {
        for(size_t i = r.begin(); i != r.end(); ++i)
        {
            apply(i);
        }
    }

private:
    void apply(int i) const
    {
        pointMap_t::accessor a;
        _fooMap->find(a, i);
        Point3D current = a->second;
        Point3D next = _rw->getNext(current);
        if (!_rw->hasConstraint(next))
        {
            return;
        }
        a->second = next;
    }

    FoobarParent* _rw;
    pointMap_t* _fooMap;
};
pointMap_t is defined as:
typedef tbb::concurrent_hash_map<int, Point3D> pointMap_t;
Can someone shed some light on this issue? I'm new to TBB. The signal is raised when the apply method accesses a->second.
There are two potential problems in this code.
First, if find() does not find the specified key, dereferencing a->second will fail. You should rewrite it either with insert(), which will ensure the element exists, or add a condition check like:

if( a ) // process it

Second, you call getNext and hasConstraint while holding the accessor's lock. It is dangerous to call anything under the lock, since the callee can take another lock or call back into TBB, and thus can lead to deadlock or other problems.
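Putting the two points together, a corrected apply() could look roughly like this (a sketch against the names in the question; accessor::release() drops the element lock early so the callbacks run unlocked):

```cpp
void apply(int i) const
{
    pointMap_t::accessor a;
    if (!_fooMap->find(a, i))   // find() returns false when the key is absent
        return;                 // a is empty here; a->second would crash
    Point3D current = a->second;
    a.release();                // drop the element lock before calling out

    Point3D next = _rw->getNext(current);   // now runs without the lock held
    if (!_rw->hasConstraint(next))
        return;

    pointMap_t::accessor b;     // re-acquire the lock to write the result back
    if (_fooMap->find(b, i))
        b->second = next;
}
```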

Bin packing implementation in C++ with STL

This is my first time using this site, so sorry for any bad formatting or weird formulations; I'll try my best to conform to the rules here, but I might make some mistakes in the beginning.
I'm currently working on an implementation of some different bin packing algorithms in C++ using the STL containers. The current code still has some logical faults that need to be fixed, but this question is more about the structure of the program. I would like a second opinion on how to structure the program to minimize the number of logical faults and make it as easy to read as possible. In its current state I just feel that this isn't the best way to do it, but I don't really see any other way to write my code right now.
The problem is a dynamic online bin packing problem. It is dynamic in the sense that items have an arbitrary time before they will leave the bin they've been assigned to.
In short, my questions are:
How would the structure of a bin packing algorithm look in C++?
Are STL containers a good tool for making the implementation able to handle inputs of arbitrary length?
How should I handle the containers in a good, easy-to-read way?
Some thoughts about my own code:
Using classes to make a clear distinction between handling the list of bins and the list of items in those bins.
Getting the implementation as efficient as possible.
Making it easy to run with a lot of different data lengths and files for benchmarking.
#include <iostream>
#include <fstream>
#include <list>
#include <queue>
#include <string>
#include <vector>

using namespace std;

struct type_item {
    int size;
    int life;
    bool operator < (const type_item& input)
    {
        return size < input.size;
    }
};

class Class_bin {
    double load;
    list<type_item> contents;
    list<type_item>::iterator i;
public:
    Class_bin ();
    bool operator < (Class_bin);
    bool full (type_item);
    void push_bin (type_item);
    double check_load ();
    void check_dead ();
    void print_bin ();
};

Class_bin::Class_bin () {
    load = 0.0;
}

bool Class_bin::operator < (Class_bin input) {
    return load < input.load;
}

bool Class_bin::full (type_item input) {
    if (load + (1.0/(double) input.size) > 1) {
        return false;
    }
    else {
        return true;
    }
}

void Class_bin::push_bin (type_item input) {
    int sum = 0;
    contents.push_back(input);
    for (i = contents.begin(); i != contents.end(); ++i) {
        sum += i->size;
    }
    load += 1.0/(double) sum;
}

double Class_bin::check_load () {
    return load;
}

void Class_bin::check_dead () {
    for (i = contents.begin(); i != contents.end(); ) {
        i->life--;
        if (i->life == 0) {
            i = contents.erase(i); // erase() invalidates the iterator; use its return value
        }
        else {
            ++i;
        }
    }
}

void Class_bin::print_bin () {
    for (i = contents.begin (); i != contents.end (); ++i) {
        cout << i->size << " ";
    }
}

class Class_list_of_bins {
    list<Class_bin> list_of_bins;
    list<Class_bin>::iterator i;
public:
    void push_list (type_item);
    void sort_list ();
    void check_dead ();
    void print_list ();
private:
    Class_bin new_bin (type_item);
    bool comparator (type_item, type_item);
};

Class_bin Class_list_of_bins::new_bin (type_item input) {
    Class_bin temp;
    temp.push_bin (input);
    return temp;
}

void Class_list_of_bins::push_list (type_item input) {
    if (list_of_bins.empty ()) {
        list_of_bins.push_front (new_bin(input));
        return;
    }
    for (i = list_of_bins.begin (); i != list_of_bins.end (); ++i) {
        if (!i->full (input)) {
            i->push_bin (input);
            return;
        }
    }
    list_of_bins.push_front (new_bin(input));
}

void Class_list_of_bins::sort_list () {
    list_of_bins.sort();
}

void Class_list_of_bins::check_dead () {
    for (i = list_of_bins.begin (); i != list_of_bins.end (); ++i) {
        i->check_dead ();
    }
}

void Class_list_of_bins::print_list () {
    for (i = list_of_bins.begin (); i != list_of_bins.end (); ++i) {
        i->print_bin ();
        cout << "\n";
    }
}

int main () {
    int i, number_of_items;
    type_item buffer;
    Class_list_of_bins bins;
    queue<type_item> input;
    string filename;
    fstream file;

    cout << "Input file name: ";
    cin >> filename;
    cout << endl;

    file.open (filename.c_str(), ios::in);
    file >> number_of_items;
    for (i = 0; i < number_of_items; ++i) {
        file >> buffer.size;
        file >> buffer.life;
        input.push (buffer);
    }
    file.close ();

    while (!input.empty ()) {
        buffer = input.front ();
        input.pop ();
        bins.push_list (buffer);
    }
    bins.print_list ();
    return 0;
}
Note that this is just a snapshot of my code, and it is not yet running properly.
I don't want to clutter this with unrelated chatter; I just want to thank the people who contributed. I will review my code and hopefully be able to structure my programs a bit better.
How would the structure of a Bin packing algorithm look in C++?
Well, ideally you would have several bin-packing algorithms, separated into different functions, which differ only by the logic of the algorithm. That algorithm should be largely independent from the representation of your data, so you can change your algorithm with only a single function call.
You can look at what the STL Algorithms have in common. Mainly, they operate on iterators instead of containers, but as I detail below, I wouldn't suggest this for you initially. You should get a feel for what algorithms are available and leverage them in your implementation.
Are STL containers a good tool for making the implementation able to handle inputs of arbitrary length?
It usually works like this: create a container, fill the container, apply an algorithm to the container.
Judging from the description of your requirements, that is how you'll use this, so I think it'll be fine. There's one important difference between your bin packing algorithm and most STL algorithms, though.
The STL algorithms are either non-modifying or insert elements into a destination. Bin packing, on the other hand, is "here's a list of bins; use them or add a new bin". It's not impossible to do this with iterators, but it's probably not worth the effort. I'd start by operating on the container, get a working program, back it up, and then see if you can make it work on iterators only.
How should I handle the containers in a good, easy to read and implement way?
I'd take this approach, characterize your inputs and outputs:
Input: Collection of items, arbitrary length, arbitrary order.
Output: Collection of bins determined by algorithm. Each bin contains a collection of items.
Then I'd worry about "what does my algorithm need to do?"
Constantly check bins for "does this item fit?"
Your Class_bin is a good encapsulation of what is needed.
Avoid cluttering your code with unrelated stuff like "print()" - use non-member helper functions.
type_item
struct type_item {
    int size;
    int life;
    bool operator < (const type_item& input)
    {
        return size < input.size;
    }
};
It's unclear what life (or death) is used for. I can't imagine that concept being relevant to implementing a bin-packing algorithm. Maybe it should be left out?
This is personal preference, but I don't like giving operator< to my objects. Objects are usually non-trivial and have many meanings of less-than. For example, one algorithm might want all the alive items sorted before the dead items. I typically wrap that in another struct for clarity:
struct type_item {
    int size;
    int life;
    struct SizeIsLess {
        // Note this becomes a function object, which makes it easy to use with
        // STL algorithms.
        bool operator() (const type_item& lhs, const type_item& rhs) const
        {
            return lhs.size < rhs.size;
        }
    };
};
vector<type_item> items;
std::sort(items.begin(), items.end(), type_item::SizeIsLess());
Class_bin
class Class_bin {
    double load;
    list<type_item> contents;
    list<type_item>::iterator i;
public:
    Class_bin ();
    bool operator < (Class_bin);
    bool full (type_item);
    void push_bin (type_item);
    double check_load ();
    void check_dead ();
    void print_bin ();
};
I would skip the Class_ prefix on all your types - it's just a bit excessive, and it should be clear from the code. (This is a variant of Hungarian notation; programmers tend to be hostile towards it.)
You should not have a class member i (the iterator). It's not part of class state. If you need it in all the members, that's ok, just redeclare it there. If it's too long to type, use a typedef.
It's difficult to quantify "bin1 is less than bin2", so I'd suggest removing the operator<.
bool full(type_item) is a little misleading. I'd probably use bool can_hold(type_item). To me, bool full() would return true if there is zero space remaining.
check_load() would seem more clearly named load().
Again, it's unclear what check_dead() is supposed to accomplish.
I think you can remove print_bin and write that as a non-member function, to keep your objects cleaner.
Some people on StackOverflow would shoot me, but I'd consider just making this a struct, and leaving load and the item list public. It doesn't seem like you care much about encapsulation here (you only keep load around so you don't need to recalculate it each time).
Class_list_of_bins
class Class_list_of_bins {
    list<Class_bin> list_of_bins;
    list<Class_bin>::iterator i;
public:
    void push_list (type_item);
    void sort_list ();
    void check_dead ();
    void print_list ();
private:
    Class_bin new_bin (type_item);
    bool comparator (type_item, type_item);
};
I think you can do without this class entirely.
Conceptually, it represents a container, so just use an STL container. You can implement the methods as non-member functions. Note that sort_list can be replaced with std::sort.
comparator is too generic a name, it gives no indication of what it compares or why, so consider being more clear.
Overall Comments
Overall, I think the classes you've picked adequately model the space you're trying to represent, so you'll be fine.
I might structure my project like this:
struct bin {
    double load;  // sum of item sizes.
    std::list<type_item> items;
    bin() : load(0) { }
};

// Returns true if the bin can fit the item passed to the constructor.
struct bin_can_fit {
    bin_can_fit(const type_item &item) : item_(item) { }
    bool operator()(const bin &b) const {
        // Assumes a fixed bin capacity of 1.0 (item sizes are "out of 1").
        return item_.size <= 1.0 - b.load;
    }
private:
    type_item item_;
};

// ItemIter is an iterator over the items.
// BinOutputIter is an output iterator we can use to put bins.
template <typename ItemIter, typename BinOutputIter>
void bin_pack_first_fit(ItemIter curr, ItemIter end, BinOutputIter output_bins) {
    std::vector<bin> bins; // Create a local bin container, to simplify life.
    for (; curr != end; ++curr) {
        // Use a helper predicate to check whether the bin can fit this item.
        // This is untested, but just for an idea.
        std::vector<bin>::iterator bin_it =
            std::find_if(bins.begin(), bins.end(), bin_can_fit(*curr));
        if (bin_it == bins.end()) {
            // Did not find a bin with enough space, add a new bin.
            bins.push_back(bin());
            // push_back invalidates iterators, so reassign bin_it to the last bin.
            bin_it = bins.end() - 1;
        }
        // bin_it now points to the bin to put the item in.
        bin_it->items.push_back(*curr);
        bin_it->load += curr->size;
    }
    std::copy(bins.begin(), bins.end(), output_bins); // Copy our bins to the destination.
}
int main(int argc, char** argv) {
    std::vector<type_item> items;
    // ... fill items
    std::vector<bin> bins;
    bin_pack_first_fit(items.begin(), items.end(), std::back_inserter(bins));
    return 0;
}
Some thoughts:
Your names are kinda messed up in places.
You have a lot of parameters named input; that's just meaningless
I'd expect full() to check whether it is full, not whether it can fit something else
I don't think push_bin pushes a bin
check_dead modifies the object (I'd expect something named check_* to just tell me something about the object)
Don't put things like Class and type in the names of classes and types.
class_list_of_bins seems to describe what's inside rather than what the object is.
push_list doesn't push a list
Don't append stuff like _list to every method in a list class; if it's a list object, we already know it's a list method
I'm confused, given the parameters of life and load, as to what you are doing. The bin packing problem I'm familiar with just has sizes. I'm guessing that over time some of the objects are taken out of bins and thus go away?
Some further thoughts on your classes
Class_list_of_bins is exposing too much of itself to the outside world. Why would the outside world want to check_dead or sort_list? That's nobody's business but the object's own. The public methods you should have on that class really should be something like
* Add an item to the collection of bins
* Print solution
* Step one timestep into the future
list<Class_bin>::iterator i;
Bad, bad, bad! Don't put member variables on your classes unless they are actually member state. You should define that iterator where it is used. If you want to save some typing, add this: typedef list<Class_bin>::iterator bin_iterator; and then use bin_iterator as the type instead.
EXPANDED ANSWER
Here is my pseudocode:
class Item
{
    Item(Istream & input)
    {
        read input description of item
    }
    double size_needed() { return actual size required (out of 1) for this item }
    bool alive() { return true if object is still alive }
    void do_timestep() { decrement life }
    void print() { print something }
}

class Bin
{
    vector of Items
    double remaining_space
    bool can_add(Item item) { return true if we have enough space }
    void add(Item item) { add item to vector of items, update remaining space }
    void do_timestep() { call do_timestep() on all Items, remove all items which indicate they are dead, updating remaining_space as you go }
    void print() { print all the contents }
}

class BinCollection
{
    void do_timestep() { call do_timestep() on all of the bins }
    void add(Item item) { find first bin for which can_add returns true, then add it, creating a new bin if necessary }
    void print() { print all the bins }
}
Some quick notes:
In your code, you converted the int size to a float repeatedly; that's not a good idea. In my design that conversion is localized to one place.
You'll note that the logic relating to a single item is now contained inside the item itself. Other objects can only see what's important to them: size_needed and whether the object is still alive.
I've not included anything about sorting stuff because I'm not clear what that is for in a first-fit algorithm.
This interview gives some great insight into the rationale behind the STL. This may give you some inspiration on how to implement your algorithms the STL-way.