Prevent or detect "this" from being deleted during use - c++

One error that I often see is a container being cleared whilst iterating through it. I have attempted to put together a small example program demonstrating this happening. One thing to note is that this can often happen many function calls deep so is quite hard to detect.
Note: This example deliberately shows some poorly designed code. I am trying to find a solution to detect the errors caused by writing code such as this without having to meticulously examine an entire codebase (~500 C++ units)
#include <iostream>
#include <string>
#include <vector>
class Bomb;
std::vector<Bomb> bombs;
class Bomb
{
std::string name;
public:
Bomb(std::string name)
{
this->name = name;
}
void touch()
{
if(rand() % 100 > 30)
{
/* Simulate everything being exploded! */
bombs.clear();
/* An error: "this" is no longer valid */
std::cout << "Crickey! The bomb was set off by " << name << std::endl;
}
}
};
int main()
{
bombs.push_back(Bomb("Freddy"));
bombs.push_back(Bomb("Charlie"));
bombs.push_back(Bomb("Teddy"));
bombs.push_back(Bomb("Trudy"));
for(size_t i = 0; i < bombs.size(); i++)
{
bombs.at(i).touch();
}
return 0;
}
Can anyone suggest a way of guaranteeing this cannot happen?
The only way I can currently detect this kind of thing is replacing the global new and delete with mmap / mprotect and detecting use after free memory accesses. This and Valgrind however sometimes fail to pick it up if the vector does not need to reallocate (i.e only some elements removed or the new size is not yet the reserve size). Ideally I don't want to have to clone much of the STL to make a version of std::vector that always reallocates every insertion/deletion during debug / testing.
One way that almost works is if the std::vector instead contains std::weak_ptr, then the usage of .lock() to create a temporary reference prevents its deletion whilst execution is within the classes method. However this cannot work with std::shared_ptr because you do not need lock() and same with plain objects. Creating a container of weak pointers just for this would be wasteful.
Can anyone else think of a way to protect ourselves from this.

Easiest way is to run your unit tests with Clang MemorySanitizer linked in.
Let some continuous-integration Linux box to do it automatically on each push
into repo.
MemorySanitizer has "Use-after-destruction detection" (flag -fsanitize-memory-use-after-dtor + environment variable MSAN_OPTIONS=poison_in_dtor=1) and so it will blow the test up that executes the code and that turns your continuous-integration red.
If you have neither unit tests nor continuous integration in place then you can also just manually debug your code with MemorySanitizer but that is hard way compared with the easiest. So better start to use continuous integration and write unit tests.
Note that there may be legitimate reasons of memory reads and writes after destructor has been ran but memory hasn't yet been freed. For example std::variant<std::string,double>. It lets us to assign it std::string then double and so its implementation might destroy the string and reuse same storage for double. Filtering such cases out is unfortunately manual work at the moment, but tools evolve.

In your particular example the misery boils down to no less than two design flaws:
Your vector is a global variable. Limit the scope of all of your objects as much as possible and issues like this are less likely to occur.
Having the single responsibility principle in mind, I can hardly imagine how one could come up with a class that needs to have some method that either directly or indirectly (maybe through 100 layers of call stack) deletes objects that could happen to be this.
I am aware that your example is artificial and intentionally bad, so please don't get me wrong here: I'm sure that in your actual case it is not so obvious how sticking to some basic design rules can prevent you from doing this. But as I said, I strongly believe that good design will reduce the likelyhood of such bugs coming up. And in fact, I cannot remember that I was ever facing such an issue, but maybe I am just not experienced enough :)
However, if this really keeps being an issue despite sticking with some design rules, then I have this idea how to detect it:
Create a member int recursionDepth in your class and initialize it with 0
At the beginning of each non-private method increment it.
Use RAII to make sure that at the end of each method it is decremented again
In the destructor check it to be 0, otherwise it means that the destructor is directly or indirectly called by some method of this.
You may want to #ifdef all of this and enable it only in debug build. This would essentially make it a debug assertion, some people like them :)
Note, that this does not work in a multi threaded environment.

In the end I went with a custom iterator that if the owner std::vector resizes whilst the iterator is still in scope, it will log an error or abort (giving me a stacktrace of the program). This example is a bit convoluted but I have tried to simplify it as much as possible and removed unused functionality from the iterator.
This system has flagged up about 50 errors of this nature. Some may be repeats. However Valgrind and ElecricFence at this point came up clean which is disappointing (In total they flagged up around 10 which I have already fixed since the start of the code cleanup).
In this example I use clear() which Valgrind does flag as an error. However in the actual codebase it is random access erases (i.e vec.erase(vec.begin() + 9)) which I need to check and Valgrind unfortunately misses quite a few.
main.cpp
#include "sstd_vector.h"
#include <iostream>
#include <string>
#include <memory>
class Bomb;
sstd::vector<std::shared_ptr<Bomb> > bombs;
class Bomb
{
std::string name;
public:
Bomb(std::string name)
{
this->name = name;
}
void touch()
{
if(rand() % 100 > 30)
{
/* Simulate everything being exploded! */
bombs.clear(); // Causes an ABORT
std::cout << "Crickey! The bomb was set off by " << name << std::endl;
}
}
};
int main()
{
bombs.push_back(std::make_shared<Bomb>("Freddy"));
bombs.push_back(std::make_shared<Bomb>("Charlie"));
bombs.push_back(std::make_shared<Bomb>("Teddy"));
bombs.push_back(std::make_shared<Bomb>("Trudy"));
/* The key part is the lifetime of the iterator. If the vector
* changes during the lifetime of the iterator, even if it did
* not reallocate, an error will be logged */
for(sstd::vector<std::shared_ptr<Bomb> >::iterator it = bombs.begin(); it != bombs.end(); it++)
{
it->get()->touch();
}
return 0;
}
sstd_vector.h
#include <vector>
#include <stdlib.h>
namespace sstd
{
template <typename T>
class vector
{
std::vector<T> data;
size_t refs;
void check_valid()
{
if(refs > 0)
{
/* Report an error or abort */
abort();
}
}
public:
vector() : refs(0) { }
~vector()
{
check_valid();
}
vector& operator=(vector const& other)
{
check_valid();
data = other.data;
return *this;
}
void push_back(T val)
{
check_valid();
data.push_back(val);
}
void clear()
{
check_valid();
data.clear();
}
class iterator
{
friend class vector;
typename std::vector<T>::iterator it;
vector<T>* parent;
iterator() { }
iterator& operator=(iterator const&) { abort(); }
public:
iterator(iterator const& other)
{
it = other.it;
parent = other.parent;
parent->refs++;
}
~iterator()
{
parent->refs--;
}
bool operator !=(iterator const& other)
{
if(it != other.it) return true;
if(parent != other.parent) return true;
return false;
}
iterator operator ++(int val)
{
iterator rtn = *this;
it ++;
return rtn;
}
T* operator ->()
{
return &(*it);
}
T& operator *()
{
return *it;
}
};
iterator begin()
{
iterator rtn;
rtn.it = data.begin();
rtn.parent = this;
refs++;
return rtn;
}
iterator end()
{
iterator rtn;
rtn.it = data.end();
rtn.parent = this;
refs++;
return rtn;
}
};
}
The disadvantages of this system is that I must use an iterator rather than .at(idx) or [idx]. I personally don't mind this one so much. I can still use .begin() + idx if random access is needed.
It is a little bit slower (nothing compared to Valgrind though). When I am done, I can do a search / replace of sstd::vector with std::vector and there should be no performance drop.

Related

Unchecked read from a map in a const function

Suppose the following:
struct C {
... // lots of other stuff
int get(int key) const { return m.at(key); } // This will never throw
private:
std::unordered_map<int, int> m;
};
Due to how the application works, I know that get never throws. I want to make get as fast as possible. So, I would like to make the access unchecked, i.e. I would like to write something like return m[key]. Of course, I cannot write exactly that while keeping get const. However, I want to keep get const, since it is logically const.
Here is the only (ugly) solution I came up with:
struct C {
... // lots of other stuff
int get(int key) const { return const_cast<C *>(this)->m[key]; }
private:
std::unordered_map<int, int> m;
};
Is there a better way?
One approach would be to use std::unordered_map::find:
struct C {
... // lots of other stuff
int get(int key) const { return m.find(key)->second; }
private:
std::unordered_map<int, int> m;
};
I object to the very reasoning behind this question. The overhead (of map.at() vs map[]) associated with catching an error due to unknown key is presumably tiny compared to the cost of finding the key in the first place.
Yet, you willingly take the serious risk of a run-time error just for such a marginal efficiency advantage that you presumably have not even validated/measured. You may think that you know that key is always contained in the map, but perhaps future code changes (including bugs introduced by others) may change that?
If you really know, then you should use
map.find(key)->second;
which makes the bug explicit if the iterator returned is invalid (i.e. equal to map.end()). You may use assert in pre-production code, i.e.
auto it = map.find(key);
assert(it!=map.end());
return it->second;
which in production code (when assert is an empty macro) is removed.

How to iterate through all elements of set C++

[UPDATE: My problem is solved! Lots of thanks to Mike Seymour and Niall and all you guys!]
My code has errors in the for loop and I do not know how to fix it :(
MyClass::ITECH7603Class(set<Student>* students) {
/* Initialize dynamically the group field */
group = new map<string, Student>();
for (set<Student>::iterator it = students->begin(); it != students->end(); it++) {
addStudent(it);
}
}
void MyClass::addStudent(Student* studentPtr) {
string fullName = studentPtr->getName() + " " + studentPtr->getSurname();
group->insert(pair<string, Student>(fullName, *studentPtr));
}
So the main idea is to loop through all students in the set, and add each student into a map group. Any help? Thank you very much!
for (set<Student>::iterator it = students->begin; it != students->end; it++) {
addStudent(it);
}
should be:
for (set<Student>::iterator it = students->begin(); it != students->end(); it++) {
//^^ //^^
addStudent(it);
}
addStudent takes a pointer, while it is an iterator, so can't be passed directly.
You should change addStudent to take either a value or a pointer/reference to const:
// option 1
void addStudent(Student);
addStudent(*it);
// option 2
void addStudent(Student const &);
addStudent(*it);
// option 3
void addStudent(Student const *);
addStudent(&*it);
If, as you say in a comment, you must leave it taking a mutable pointer, then you'll need some grotesquery to deal with the fact that elements of the set are immutable:
// nasty option
addStudent(const_cast<Student*>(&*it));
// slightly less nasty option
Student copy = *it;
addStudent(&copy);
Beware that the first option will give undefined behaviour if the function uses the dodgy pointer to make any modification to the Student object stored in the set. The second makes a temporary copy, which can be modified without breaking the set. This is fine as long as addStudent only stores a copy of the object passed to it, not the pointer itself, which will become invalid when copy is destroyed.
In c++11 you can use range for sytax:
for (const auto &student : *students)
{
addStudent(it);
}
Then change addStudent function signature to accept reference:
void MyClass::addStudent(const Student &student) {
While you've gotten answers that "fix" your code to the extent of compiling and producing results that you apparently find acceptable, I don't find them very satisfying in terms of code style. I would do this job rather differently. In particular, my code to do this wouldn't have a single (explicit) loop. If I needed to do approximately what you're asking for, I'd probably use code something like this:
std::pair<std::string, Student> make_mappable(Student &stud) {
return std::make_pair(stud.getName() + " " + stud.getSurName(), stud);
}
std::map<std::string, Student> gen_map(std::set<Student> const &input) {
std::map<std::string, Student> ret;
std::transform(input.begin(), input.end(),
std::inserter(ret, ret.end()),
make_mappable);
return ret;
}
There definitely would not be any new in sight, nor would there be any passing a pointer to a Student.
OTOH, since the data you're using as the key for your map is data that's already in the items in the set, it may more convenient all around to continue to use a set, and just specify a comparison function based on the student's name:
struct by_given_name {
bool operator()(Student const &a, Student const &b) const {
if (a.getName() < b.getName())
return true;
if (b.getName() < a.getName())
return false;
return a.getSurName() < b.getSurName();
}
};
std::set<Student, by_given_name> xform(std::set<Student> const &in) {
return std::set<Student, by_given_name>{in.begin(), in.end()};
}
For what its worth, a Live Demo of the latter.
Whether the latter is practical will typically depend on one other factor though: your ability to create a Student from only a name/surname. If you can't do that, searching by name will be inconvenient (at best), so you'd want to use a map.
I realize this probably isn't much (if any) help in completely what's apparently home-work for a class--but even if your class prevents you from actually turning in decent code, it seems worthwhile to me to at least try to learn to write decent code in addition to what it requires. If you do pass the class and get a job writing code, you'd probably rather your coworkers didn't want to hurt you.

Is a default value of nullptr in a map of pointers defined behaviour?

The following code seems to always follow the true branch.
#include <map>
#include <iostream>
class TestClass {
// implementation
}
int main() {
std::map<int, TestClass*> TestMap;
if (TestMap[203] == nullptr) {
std::cout << "true";
} else {
std::cout << "false";
}
return 0;
}
Is it defined behaviour for an uninitialized pointer to point at nullptr, or an artifact of my compiler?
If not, how can I ensure portability of the following code? Currently, I'm using similar logic to return the correct singleton instance for a log file:
#include <string>
#include <map>
class Log {
public:
static Log* get_instance(std::string path);
protected:
Log(std::string path) : path(path), log(path) {};
std::string path;
std::ostream log;
private:
static std::map<std::string, Log*> instances;
};
std::map<std::string, Log*> Log::instances = std::map<std::string, Log*>();
Log* Log::get_instance(std::string path) {
if (instances[path] == nullptr) {
instances[path] = new Log(path);
}
return instances[path];
}
One solution would be to use something similar to this where you use a special function provide a default value when checking a map. However, my understanding is that this would cause the complexity of the lookup to be O(n) instead of O(1). This isn't too much of an issue in my scenario (there would only ever be a handful of logs), but a better solution would be somehow to force pointers of type Log* to reference nullptr by default thus making the lookup check O(1) and portable at the same time. Is this possible and if so, how would I do it?
The map always value-initializes its members (in situations where they are not copy-initialized, of course), and value-initialization for builtin types means zero-initialization, therefore it is indeed defined behaviour. This is especially true for the value part of new keys generated when accessing elements with operator[] which didn't exist before calling that.
Note however that an uninizialized pointer is not necessarily a null pointer; indeed, just reading its value already invokes undefined behaviour (and might case a segmentation fault on certain platforms under certain circumstances). The point is that pointers in maps are not uninitialized. So if you write for example
void foo()
{
TestClass* p;
// ...
}
p will not be initialized to nullptr.
Note however that you might want to check for presence instead, to avoid accumulating unnecessary entries. You'd check for presence using the find member function:
map<int, TestClass*>::iterator it = TestMap.find(203);
if (it == map.end())
{
// there's no such element in the map
}
else
{
TestClass* p = it->second;
// ...
}
Yes, that's defined behaviour. If an element isn't yet in a map when you access it via operator[], it gets default constructed.

return a vector vs use a parameter for the vector to return it

With the code below, the question is:
If you use the "returnIntVector()" function, is the vector copied from the local to the "outer" (global) scope? In other words is it a more time and memory consuming variation compared to the "getIntVector()"-function? (However providing the same functionality.)
#include <iostream>
#include <vector>
using namespace std;
vector<int> returnIntVector()
{
vector<int> vecInts(10);
for(unsigned int ui = 0; ui < vecInts.size(); ui++)
vecInts[ui] = ui;
return vecInts;
}
void getIntVector(vector<int> &vecInts)
{
for(unsigned int ui = 0; ui < vecInts.size(); ui++)
vecInts[ui] = ui;
}
int main()
{
vector<int> vecInts = returnIntVector();
for(unsigned int ui = 0; ui < vecInts.size(); ui++)
cout << vecInts[ui] << endl;
cout << endl;
vector<int> vecInts2(10);
getIntVector(vecInts2);
for(unsigned int ui = 0; ui < vecInts2.size(); ui++)
cout << vecInts2[ui] << endl;
return 0;
}
In theory, yes it's copied. In reality, no, most modern compilers take advantage of return value optimization.
So you can write code that acts semantically correct. If you want a function that modifies or inspects a value, you take it in by reference. Your code does not do that, it creates a new value not dependent upon anything else, so return by value.
Use the first form: the one which returns vector. And a good compiler will most likely optimize it. The optimization is popularly known as Return value optimization, or RVO in short.
Others have already pointed out that with a decent (not great, merely decent) compiler, the two will normally end up producing identical code, so the two give equivalent performance.
I think it's worth mentioning one or two other points though. First, returning the object does officially copy the object; even if the compiler optimizes the code so that copy never takes place, it still won't (or at least shouldn't) work if the copy ctor for that class isn't accessible. std::vector certainly supports copying, but it's entirely possible to create a class that you'd be able to modify like in getIntVector, but not return like in returnIntVector.
Second, and substantially more importantly, I'd generally advise against using either of these. Instead of passing or returning a (reference to) a vector, you should normally work with an iterator (or two). In this case, you have a couple of perfectly reasonable choices -- you could use either a special iterator, or create a small algorithm. The iterator version would look something like this:
#ifndef GEN_SEQ_INCLUDED_
#define GEN_SEQ_INCLUDED_
#include <iterator>
template <class T>
class sequence : public std::iterator<std::forward_iterator_tag, T>
{
T val;
public:
sequence(T init) : val(init) {}
T operator *() { return val; }
sequence &operator++() { ++val; return *this; }
bool operator!=(sequence const &other) { return val != other.val; }
};
template <class T>
sequence<T> gen_seq(T const &val) {
return sequence<T>(val);
}
#endif
You'd use this something like this:
#include "gen_seq"
std::vector<int> vecInts(gen_seq(0), gen_seq(10));
Although it's open to argument that this (sort of) abuses the concept of iterators a bit, I still find it preferable on practical grounds -- it lets you create an initialized vector instead of creating an empty vector and then filling it later.
The algorithm alternative would look something like this:
template <class T, class OutIt>
class fill_seq_n(OutIt result, T num, T start = 0) {
for (T i = start; i != num-start; ++i) {
*result = i;
++result;
}
}
...and you'd use it something like this:
std::vector<int> vecInts;
fill_seq_n(std::back_inserter(vecInts), 10);
You can also use a function object with std::generate_n, but at least IMO, this generally ends up more trouble than it's worth.
As long as we're talking about things like that, I'd also replace this:
for(unsigned int ui = 0; ui < vecInts2.size(); ui++)
cout << vecInts2[ui] << endl;
...with something like this:
std::copy(vecInts2.begin(), vecInts2.end(),
std::ostream_iterator<int>(std::cout, "\n"));
In C++03 days, getIntVector() is recommended for most cases. In case of returnIntVector(), it might create some unncessary temporaries.
But by using return value optimization and swaptimization, most of them can be avoided. In era of C++11, the latter can be meaningful due to the move semantics.
In theory, the returnIntVector function returns the vector by value, so a copy will be made and it will be more time-consuming than the function which just populates an existing vector. More memory will also be used to store the copy, but only temporarily; since vecInts is locally scoped it will be stack-allocated and will be freed as soon as the returnIntVector returns. However, as others have pointed out, a modern compiler will optimize away these inefficiencies.
returnIntVector is more time consuming because it returns a copy of the vector, unless the vector implementation is realized with a single pointer in which case the performance is the same.
in general you should not rely on the implementation and use getIntVector instead.

Bin packing implementation in C++ with STL

This is my first time using this site so sorry for any bad formatting or weird formulations, I'll try my best to conform to the rules on this site but I might do some misstakes in the beginning.
I'm right now working on an implementation of some different bin packing algorithms in C++ using the STL containers. In the current code I still have some logical faults that needs to be fixed but this question is more about the structure of the program. I would wan't some second opinion on how you should structure the program to minimize the number of logical faults and make it as easy to read as possible. In it's current state I just feel that this isn't the best way to do it but I don't really see any other way to write my code right now.
The problem is a dynamic online bin packing problem. It is dynamic in the sense that items have an arbitrary time before they will leave the bin they've been assigned to.
In short my questions are:
How would the structure of a Bin packing algorithm look in C++?
Is STL containers a good tool to make the implementation be able to handle inputs of arbitrary lenght?
How should I handle the containers in a good, easy to read and implement way?
Some thoughts about my own code:
Using classes to make a good distinction between handling the list of the different bins and the list of items in those bins.
Getting the implementation as effective as possible.
Being easy to run with a lot of different data lengths and files for benchmarking.
#include <iostream>
#include <fstream>
#include <list>
#include <queue>
#include <string>
#include <vector>
using namespace std;
struct type_item {
int size;
int life;
bool operator < (const type_item& input)
{
return size < input.size;
}
};
class Class_bin {
double load;
list<type_item> contents;
list<type_item>::iterator i;
public:
Class_bin ();
bool operator < (Class_bin);
bool full (type_item);
void push_bin (type_item);
double check_load ();
void check_dead ();
void print_bin ();
};
Class_bin::Class_bin () {
load=0.0;
}
bool Class_bin::operator < (Class_bin input){
return load < input.load;
}
bool Class_bin::full (type_item input) {
if (load+(1.0/(double) input.size)>1) {
return false;
}
else {
return true;
}
}
void Class_bin::push_bin (type_item input) {
int sum=0;
contents.push_back(input);
for (i=contents.begin(); i!=contents.end(); ++i) {
sum+=i->size;
}
load+=1.0/(double) sum;
}
double Class_bin::check_load () {
return load;
}
void Class_bin::check_dead () {
for (i=contents.begin(); i!=contents.end(); ++i) {
i->life--;
if (i->life==0) {
contents.erase(i);
}
}
}
void Class_bin::print_bin () {
for (i=contents.begin (); i!=contents.end (); ++i) {
cout << i->size << " ";
}
}
class Class_list_of_bins {
list<Class_bin> list_of_bins;
list<Class_bin>::iterator i;
public:
void push_list (type_item);
void sort_list ();
void check_dead ();
void print_list ();
private:
Class_bin new_bin (type_item);
bool comparator (type_item, type_item);
};
Class_bin Class_list_of_bins::new_bin (type_item input) {
Class_bin temp;
temp.push_bin (input);
return temp;
}
void Class_list_of_bins::push_list (type_item input) {
if (list_of_bins.empty ()) {
list_of_bins.push_front (new_bin(input));
return;
}
for (i=list_of_bins.begin (); i!=list_of_bins.end (); ++i) {
if (!i->full (input)) {
i->push_bin (input);
return;
}
}
list_of_bins.push_front (new_bin(input));
}
void Class_list_of_bins::sort_list () {
list_of_bins.sort();
}
void Class_list_of_bins::check_dead () {
for (i=list_of_bins.begin (); i !=list_of_bins.end (); ++i) {
i->check_dead ();
}
}
void Class_list_of_bins::print_list () {
for (i=list_of_bins.begin (); i!=list_of_bins.end (); ++i) {
i->print_bin ();
cout << "\n";
}
}
int main () {
int i, number_of_items;
type_item buffer;
Class_list_of_bins bins;
queue<type_item> input;
string filename;
fstream file;
cout << "Input file name: ";
cin >> filename;
cout << endl;
file.open (filename.c_str(), ios::in);
file >> number_of_items;
for (i=0; i<number_of_items; ++i) {
file >> buffer.size;
file >> buffer.life;
input.push (buffer);
}
file.close ();
while (!input.empty ()) {
buffer=input.front ();
input.pop ();
bins.push_list (buffer);
}
bins.print_list ();
return 0;
}
Note that this is just a snapshot of my code and is not yet running properly
Don't wan't to clutter this with unrelated chatter just want to thank the people who contributed, I will review my code and hopefully be able to structure my programming a bit better
How would the structure of a Bin packing algorithm look in C++?
Well, ideally you would have several bin-packing algorithms, separated into different functions, which differ only by the logic of the algorithm. That algorithm should be largely independent from the representation of your data, so you can change your algorithm with only a single function call.
You can look at what the STL Algorithms have in common. Mainly, they operate on iterators instead of containers, but as I detail below, I wouldn't suggest this for you initially. You should get a feel for what algorithms are available and leverage them in your implementation.
Is STL containers a good tool to make the implementation be able to handle inputs of arbitrary length?
It usually works like this: create a container, fill the container, apply an algorithm to the container.
Judging from the description of your requirements, that is how you'll use this, so I think it'll be fine. There's one important difference between your bin packing algorithm and most STL algorithms.
The STL algorithms are either non-modifying or are inserting elements to a destination. bin-packing, on the other hand, is "here's a list of bins, use them or add a new bin". It's not impossible to do this with iterators, but probably not worth the effort. I'd start by operating on the container, get a working program, back it up, then see if you can make it work for only iterators.
How should I handle the containers in a good, easy to read and implement way?
I'd take this approach, characterize your inputs and outputs:
Input: Collection of items, arbitrary length, arbitrary order.
Output: Collection of bins determined by algorithm. Each bin contains a collection of items.
Then I'd worry about "what does my algorithm need to do?"
Constantly check bins for "does this item fit?"
Your Class_bin is a good encapsulation of what is needed.
Avoid cluttering your code with unrelated stuff like "print()" - use non-member help functions.
type_item
struct type_item {
int size;
int life;
bool operator < (const type_item& input)
{
return size < input.size;
}
};
It's unclear what life (or death) is used for. I can't imagine that concept being relevant to implementing a bin-packing algorithm. Maybe it should be left out?
This is personal preference, but I don't like giving operator< to my objects. Objects are usually non-trivial and have many meanings of less-than. For example, one algorithm might want all the alive items sorted before the dead items. I typically wrap that in another struct for clarity:
struct type_item {
int size;
int life;
struct SizeIsLess {
// Note this becomes a function object, which makes it easy to use with
// STL algorithms.
bool operator() (const type_item& lhs, const type_item& rhs)
{
return lhs.size < rhs.size;
}
}
};
vector<type_item> items;
std::sort(items.begin, items.end(), type_item::SizeIsLess);
Class_bin
class Class_bin {
double load;
list<type_item> contents;
list<type_item>::iterator i;
public:
Class_bin ();
bool operator < (Class_bin);
bool full (type_item);
void push_bin (type_item);
double check_load ();
void check_dead ();
void print_bin ();
};
I would skip the Class_ prefix on all your types - it's just a bit excessive, and it should be clear from the code. (This is a variant of hungarian notation. Programmers tend to be hostile towards it.)
You should not have a class member i (the iterator). It's not part of class state. If you need it in all the members, that's ok, just redeclare it there. If it's too long to type, use a typedef.
It's difficult to quantify "bin1 is less than bin2", so I'd suggest removing the operator<.
bool full(type_item) is a little misleading. I'd probably use bool can_hold(type_item). To me, bool full() would return true if there is zero space remaining.
check_load() would seem more clearly named load().
Again, it's unclear what check_dead() is supposed to accomplish.
I think you can remove print_bin and write that as a non-member function, to keep your objects cleaner.
Some people on StackOverflow would shoot me, but I'd consider just making this a struct, and leaving load and item list public. It doesn't seem like you care much about encapsulation here (you're only need to create this object so you don't need do recalculate load each time).
Class_list_of_bins
class Class_list_of_bins {
list<Class_bin> list_of_bins;
list<Class_bin>::iterator i;
public:
void push_list (type_item);
void sort_list ();
void check_dead ();
void print_list ();
private:
Class_bin new_bin (type_item);
bool comparator (type_item, type_item);
};
I think you can do without this class entirely.
Conceptually, it represents a container, so just use an STL container. You can implement the methods as non-member functions. Note that sort_list can be replaced with std::sort.
comparator is too generic a name, it gives no indication of what it compares or why, so consider being more clear.
Overall Comments
Overall, I think the classes you've picked adequately model the space you're trying to represent, so you'll be fine.
I might structure my project like this:
struct bin {
double load; // sum of item sizes.
std::list<type_item> items;
bin() : load(0) { }
};
// Returns true if the bin can fit the item passed to the constructor.
struct bin_can_fit {
bin_can_fit(type_item &item) : item_(item) { }
bool operator()(const bin &b) {
return item_.size < b.free_space;
}
private:
type_item item_;
};
// ItemIter is an iterator over the items.
// BinOutputIter is an output iterator we can use to put bins.
template <ItemIter, BinOutputIter>
void bin_pack_first_fit(ItemIter curr, ItemIter end, BinOutputIter output_bins) {
std::vector<bin> bins; // Create a local bin container, to simplify life.
for (; curr != end; ++curr) {
// Use a helper predicate to check whether the bin can fit this item.
// This is untested, but just for an idea.
std::vector<bin>::iterator bin_it =
std::find_if(bins.begin(), bins.end(), bin_can_fit(*curr));
if (bin_it == bins.end()) {
// Did not find a bin with enough space, add a new bin.
bins.push_back(bin);
// push_back invalidates iterators, so reassign bin_it to the last item.
bin_it = std::advance(bins.begin(), bins.size() - 1);
}
// bin_it now points to the bin to put the item in.
bin_it->items.push_back(*curr);
bin_it->load += curr.size();
}
std::copy(bins.begin(), bins.end(), output_bins); // Apply our bins to the destination.
}
void main(int argc, char** argv) {
std::vector<type_item> items;
// ... fill items
std::vector<bin> bins;
bin_pack_first_fit(items.begin(), items.end(), std::back_inserter(bins));
}
Some thoughts:
Your names are kinda messed up in places.
You have a lot of parameters named input, thats just meaningless
I'd expect full() to check whether it is full, not whether it can fit something else
I don't think push_bin pushes a bin
check_dead modifies the object (I'd expect something named check_*, to just tell me something about the object)
Don't put things like Class and type in the names of classes and types.
class_list_of_bins seems to describe what's inside rather then what the object is.
push_list doesn't push a list
Don't append stuff like _list to every method in a list class, if its a list object, we already know its a list method
I'm confused given the parameters of life and load as to what you are doing. The bin packing problem I'm familiar with just has sizes. I'm guessing that overtime some of the objects are taken out of bins and thus go away?
Some further thoughts on your classes
Class_list_of_bins is exposing too much of itself to the outside world. Why would the outside world want to check_dead or sort_list? That's nobodies business but the object itself. The public method you should have on that class really should be something like
* Add an item to the collection of bins
* Print solution
* Step one timestep into the future
list<Class_bin>::iterator i;
Bad, bad, bad! Don't put member variables on your unless they are actually member states. You should define that iterator where it is used. If you want to save some typing add this: typedef list::iterator bin_iterator and then you use bin_iterator as the type instead.
EXPANDED ANSWER
Here is my psuedocode:
class Item
{
Item(Istream & input)
{
read input description of item
}
double size_needed() { return actual size required (out of 1) for this item)
bool alive() { return true if object is still alive}
void do_timestep() { decrement life }
void print() { print something }
}
class Bin
{
vector of Items
double remaining_space
bool can_add(Item item) { return true if we have enough space}
void add(Item item) {add item to vector of items, update remaining space}
void do_timestep() {call do_timestep() and all Items, remove all items which indicate they are dead, updating remaining_space as you go}
void print { print all the contents }
}
class BinCollection
{
void do_timestep { call do_timestep on all of the bins }
void add(item item) { find first bin for which can_add return true, then add it, create a new bin if neccessary }
void print() { print all the bins }
}
Some quick notes:
In your code, you converted the int size to a float repeatedly, that's not a good idea. In my design that is localized to one place
You'll note that the logic relating to a single item is now contained inside the item itself. Other objects only can see whats important to them, size_required and whether the object is still alive
I've not included anything about sorting stuff because I'm not clear what that is for in a first-fit algorithm.
This interview gives some great insight into the rationale behind the STL. This may give you some inspiration on how to implement your algorithms the STL-way.