I've recently (4 days) started to learn C++ coming from C / Java background. In order to learn a new language I ussualy start by re-implementing different classical algorithms, as language specific as I can.
I've come to this code, its a DFS - Depth First Search in an unoriented graph. Still from what I read it's best to pass parameters by references in C++. Unfortunately I can't quite grasp the concept of reference. Every time I need a reference, I get confused and I think in terms of pointers. In my current code, i use pass by value .
Here is the code (probably isn't Cppthonic as it should):
#include <algorithm>
#include <iostream>
#include <fstream>
#include <string>
#include <stack>
#include <vector>
using namespace std;
template <class T>
void utilShow(T elem);
template <class T>
void utilShow(T elem){
cout << elem << " ";
}
vector< vector<short> > getMatrixFromFile(string fName);
void showMatrix(vector< vector<short> > mat);
vector<unsigned int> DFS(vector< vector<short> > mat);
/* Reads matrix from file (fName) */
vector< vector<short> > getMatrixFromFile(string fName)
{
unsigned int mDim;
ifstream in(fName.c_str());
in >> mDim;
vector< vector<short> > mat(mDim, vector<short>(mDim));
for(int i = 0; i < mDim; ++i) {
for(int j = 0; j < mDim; ++j) {
in >> mat[i][j];
}
}
return mat;
}
/* Output matrix to stdout */
void showMatrix(vector< vector<short> > mat){
vector< vector<short> >::iterator row;
for(row = mat.begin(); row < mat.end(); ++row){
for_each((*row).begin(), (*row).end(), utilShow<short>);
cout << endl;
}
}
/* DFS */
vector<unsigned int> DFS(vector< vector<short> > mat){
// Gives the order for DFS when visiting
stack<unsigned int> nodeStack;
// Tracks the visited nodes
vector<bool> visited(mat.size(), false);
vector<unsigned int> result;
nodeStack.push(0);
visited[0] = true;
while(!nodeStack.empty()) {
unsigned int cIdx = nodeStack.top();
nodeStack.pop();
result.push_back(cIdx);
for(int i = 0; i < mat.size(); ++i) {
if(1 == mat[cIdx][i] && !visited[i]) {
nodeStack.push(i);
visited[i] = true;
}
}
}
return result;
}
int main()
{
vector< vector<short> > mat;
mat = getMatrixFromFile("Ex04.in");
vector<unsigned int> dfsResult = DFS(mat);
cout << "Adjancency Matrix: " << endl;
showMatrix(mat);
cout << endl << "DFS: " << endl;
for_each(dfsResult.begin(), dfsResult.end(), utilShow<unsigned int>);
return (0);
}
Can you please can give me some hints on how to use references, by referencing to this code ?
Is my current programming style, compatible with the constructs of C++ ?
Is there a standard alternative for vector and type** for bi dimensional arrays in C++ ?
LATER EDIT:
OK, I've analyzed your answers (thanks all), and I've rewritten the code in a more OOP manner. Also I've understand what a reference and were to use it. It's somewhat similar to a const pointer, except the fact that a pointer of that type can hold a NULL.
This is my latest code:
#include <algorithm>
#include <fstream>
#include <iostream>
#include <ostream>
#include <stack>
#include <string>
#include <vector>
using namespace std;
template <class T> void showUtil(T elem);
/**
* Wrapper around a graph
**/
template <class T>
class SGraph
{
private:
size_t nodes;
vector<T> pmatrix;
public:
SGraph(): nodes(0), pmatrix(0) { }
SGraph(size_t nodes): nodes(nodes), pmatrix(nodes * nodes) { }
// Initialize graph from file name
SGraph(string &file_name);
void resize(size_t new_size);
void print();
void DFS(vector<size_t> &results, size_t start_node);
// Used to retrieve indexes.
T & operator()(size_t row, size_t col) {
return pmatrix[row * nodes + col];
}
};
template <class T>
SGraph<T>::SGraph(string &file_name)
{
ifstream in(file_name.c_str());
in >> nodes;
pmatrix = vector<T>(nodes * nodes);
for(int i = 0; i < nodes; ++i) {
for(int j = 0; j < nodes; ++j) {
in >> pmatrix[i*nodes+j];
}
}
}
template <class T>
void SGraph<T>::resize(size_t new_size)
{
this->pmatrix.resize(new_size * new_size);
}
template <class T>
void SGraph<T>::print()
{
for(int i = 0; i < nodes; ++i){
cout << pmatrix[i];
if(i % nodes == 0){
cout << endl;
}
}
}
template <class T>
void SGraph<T>::DFS(vector<size_t> &results, size_t start_node)
{
stack<size_t> nodeStack;
vector<bool> visited(nodes * nodes, 0);
nodeStack.push(start_node);
visited[start_node] = true;
while(!nodeStack.empty()){
size_t cIdx = nodeStack.top();
nodeStack.pop();
results.push_back(cIdx);
for(int i = 0; i < nodes; ++i){
if(pmatrix[nodes*cIdx + i] && !visited[i]){
nodeStack.push(i);
visited[i] = 1;
}
}
}
}
template <class T>
void showUtil(T elem){
cout << elem << " ";
}
int main(int argc, char *argv[])
{
string file_name = "Ex04.in";
vector<size_t> dfs_results;
SGraph<short> g(file_name);
g.DFS(dfs_results, 0);
for_each(dfs_results.begin(), dfs_results.end(), showUtil<size_t>);
return (0);
}
For 4 days into C++, you're doing a great job. You're already using standard containers, algorithms, and writing your own function templates. The most sorely lacking thing I see is exactly in reference to your question: the need to pass by reference/const reference.
Any time you pass/return a C++ object by value, you are invoking a deep copy of its contents. This isn't cheap at all, especially for something like your matrix class.
First let's look at showMatrix. The purpose of this function is to output the contents of a matrix. Does it need a copy? No. Does it need to change anything in the matrix? No, it's purpose is just to display it. Thus we want to pass the Matrix by const reference.
typedef vector<short> Row;
typedef vector<Row> SquareMatrix;
void showMatrix(const SquareMatrix& mat);
[Note: I used some typedefs to make this easier to read and write. I recommend it when you have a lot of template parametrization].
Now let's look at getMatrixFromFile:
SquareMatrix getMatrixFromFile(string fName);
Returning SquareMatrix by value here could be expensive (depending on whether your compiler applies return value optimization to this case), and so is passing in a string by value. With C++0x, we have rvalue references to make it so we don't have to return a copy (I also modified the string to be passed in by const reference for same reasons as showMatrix, we don't need a copy of the file name):
SquareMatrix&& getMatrixFromFile(const string& fName);
However, if you don't have a compiler with these features, then a common compromise is to pass in a matrix by reference and let the function fill it in:
void getMatrixFromFile(const string& fName, SquareMatrix& out_matrix);
This doesn't give provide as convenient a syntax for the client (now they have to write two lines of code instead of one), but it avoids the deep copying overhead consistently. There is also MOJO to address this, but that will become obsolete with C++0x.
A simple rule of thumb: if you have any user-defined type (not a plain old data type) and you want to pass it to a function:
pass by const reference if the function only needs to read from it.
pass by reference if the function needs to modify the original.
pass by value only if the function needs a copy to modify.
There are exceptions where you might have a cheap UDT (user-defined type) that is cheaper to copy than it is to pass by const reference, e.g., but stick to this rule for now and you'll be on your way to writing safe, efficient C++ code that doesn't waste precious clock cycles on unnecessary copies (a common bane of poorly written C++ programs).
To pass by reference, you'd typically change this:
vector<unsigned int> DFS(vector< vector<short> > mat){
to:
vector<unsigned int> DFS(vector<vector<short>> const &mat) {
Technically, this is passing a const reference, but that's what you normally want to use when/if you're not planning to modify the original object.
On another note, I'd probably change this:
for_each((*row).begin(), (*row).end(), utilShow<short>);
to something like:
std::copy(row->begin(), row->end(), std::ostream_iterator<short>(std::cout, " "));
Likewise:
for_each(dfsResult.begin(), dfsResult.end(), utilShow<unsigned int>);
would become:
std::copy(dfsResult.begin(), dfsResult.end(),
std::ostream_iterator<unsigned int>(std::cout, " "));
(...which looks like it would obviate utilShow entirely).
As far as 2D matrices go, unless you need a ragged matrix (where different rows can be different lengths), you typically use a simple front-end to handle indexing in a single vector:
template <class T>
class matrix {
std::vector<T> data_;
size_t columns_;
public:
matrix(size_t rows, size_t columns) : columns_(columns), data_(rows * columns) {}
T &operator()(size_t row, size_t column) { return data[row * columns_ + column]; }
};
Note that this uses operator() for indexing, so instead of m[x][y], you'd use m(x,y), about like in BASIC or Fortran. You can overload operator[] in a way that allows you to use that notation if you prefer, but it's a fair amount of extra work with (IMO) little real benefit.
References and pointers are closely related. Both are ways of passing parameters without copying the parameter value onto the subroutine's stack frame.
The main difference between them:
A pointer p points to an object o.
A reference i is an object o. In other words, in an alias.
To make things more confusing, as far as I know, the compiler implementation between the two is pretty much the same.
Imagine the function Ptr(const T* t) and Ref(const T& t).
int main() {
int a;
Ptr(&a);
Ref(a);
}
In Ptr, t is going to point to the location of a. You can dereference it and get the value of a. If you do &t (take the address of t), you will get the address of the parameter.
In Ref, t is a. You can use a for the value of a. You can get the address of a with &a. It's a little syntactic sugar that c++ gives you.
Both provide a mechanism for passing parameters without copying. In your function (by the way, you don't need the declaration):
template <class T> void utilShow(T elem) { ... }
Every time it gets called, T will be copied. If T is a large vector, it is copying all the data in the vector. That's pretty inefficient. You don't want to pass the entire vector to the new stack frame, you want to say "hey - new stack frame, use this data". So you can pass by reference. What does that look like?
template <class T> void utilShow(const T &elem) { ... }
elem is const, because it's not changed by the function. It's also going to use the memory for elem that's stored in the caller, rather than copying it down the stack.
Again, for the same reason (to avoid a copy of the parameters), use:
vector< vector<short> > getMatrixFromFile(const string &fName) { ... }
void showMatrix(const vector< vector<short> > &mat) { ... }
The one tricky part is that you might think: "Hey, a reference means no copies! I'm gonna use it all the time! I'm gonna return references from functions!" And that's where your program crashes.
Imagine this:
// Don't do this!
Foo& BrokenReturnRef() {
Foo f;
return f;
}
int main() {
Foo &f = BrokenReturnRef();
cout << f.bar();
}
Unfortunately, this is broken! When BrokenReturnRef runs, f is in scope and everything is cool. Then you return to main and keep referencing f. The stack frame that created f has gone away, and that location is no longer valid, and you're referencing junk memory. In this case, you'll have to return by value (or allocate a new pointer on the heap).
The one exception to the rule of "don't return references" is when you know that memory will outlast the stack. This is how STL implements operator[] for its containers.
Hope that helps! :)
void utilShow(T& elem);
vector< vector<short> > getMatrixFromFile(const string& fName);
void showMatrix(vector< vector<short> >& mat);
vector<unsigned int> DFS(vector< vector<short> >& mat);
Some which I could figure out. And if possible if you aren't changing or intend to change the state of the object inside your method body make the variables passed as const.
I wouldn't ask you include all the C++ constructs in your first try itself, but gradually so that you don't overwhelm yourself to depression. Vector is the most used STL container. And usage of containers depend on your needs rather than feeling fanciful to use one over another.
One brief description of containers.
http://msdn.microsoft.com/en-us/library/1fe2x6kt%28VS.80%29.aspx
#Jerry Thanks for editing.
Vector isn't overused, but is used more because of its simplicity for simple objects, rather than large monolithic class objects. It resembles a C style array, but isn't, with a lot of extra algorithms. Two more which are used quite frequently are maps and lists. It maybe so because of the places where I work they need the use of these containers more than at other places.
Related
How do I get the position of an element inside a vector, where the elements are classes. Is there a way of doing this?
Example code:
class Object
{
public:
void Destroy()
{
// run some code to get remove self from vector
}
}
In main.cpp:
std::vector<Object> objects;
objects.push_back( <some instances of Object> );
// Some more code pushing back some more stuff
int n = 20;
objects.at(n).Destroy(); // Assuming I pushed back 20 items or more
So I guess I want to be able to write a method or something which is a member of the class which will return the location of itself inside the vector... Is this possible?
EDIT:
Due to confusion, I should explain better.
void Destroy(std::vector<Object>& container){
container.erase( ?...? );
}
The problem is, how can I find the number to do the erasing...? Apparently this isn't possible... I thought it might not be...
You can use std::find to find elements in vector (providing you implement a comparison operator (==) for Object. However, 2 big concerns:
If you need to find elements in a container then you will ger much better performance with using an ordered container such as std::map or std::set (find operations in O(log(N)) vs O(N)
Object should not be the one responsible of removing itself from the container. Object shouldn't know or be concerned with where it is, as that breaks encapsulation. Instead, the owner of the container should concern itself ith such tasks.
The object can erase itself thusly:
void Destroy(std::vector<Object>& container);
{
container.erase(container.begin() + (this - &container[0]));
}
This will work as you expect, but it strikes me as exceptionally bad design. Members should not have knowledge of their containers. They should exist (from their own perspective) in an unidentifiable limbo. Creation and destruction should be left to their creator.
Objects in a vector don't automatically know where they are in the vector.
You could supply each object with that information, but much easier: remove the object from the vector. Its destructor is then run automatically.
Then the objects can be used also in other containers.
Example:
#include <algorithm>
#include <iostream>
#include <vector>
class object_t
{
private:
int id_;
public:
int id() const { return id_; }
~object_t() {}
explicit object_t( int const id ): id_( id ) {}
};
int main()
{
using namespace std;
vector<object_t> objects;
for( int i = 0; i <= 33; ++i )
{
objects.emplace_back( i );
}
int const n = 20;
objects.erase( objects.begin() + n );
for( auto const& o : objects )
{
cout << o.id() << ' ';
}
cout << endl;
}
If you need to destroy the n'th item in a vector then the easiest way is to get an iterator from the beginning using std::begin() and call std::advance() to advance how ever many places you want, so something like:
std::vector<Object> objects;
const size_t n = 20;
auto erase_iter = std::advance(std::begin(objects), n);
objects.erase(erase_iter);
If you want to find the index of an item in a vector then use std::find to get the iterator and call std::distance from the beginning.
So something like:
Object object_to_find;
std::vector<Object> objects;
auto object_iter = std::find(std::begin(objects), std::end(objects), object_to_find);
const size_t n = std::distance(std::begin(objects), object_iter);
This does mean that you need to implement an equality operator for your object. Or you could try something like:
auto object_iter = std::find(std::begin(objects), std::end(objects),
[&object_to_find](const Object& object) -> bool { return &object_to_find == &object; });
Although for this to work the object_to_find needs to be the one from the actual list as it is just comparing addresses.
With the code below, the question is:
If you use the "returnIntVector()" function, is the vector copied from the local to the "outer" (global) scope? In other words is it a more time and memory consuming variation compared to the "getIntVector()"-function? (However providing the same functionality.)
#include <iostream>
#include <vector>
using namespace std;
vector<int> returnIntVector()
{
vector<int> vecInts(10);
for(unsigned int ui = 0; ui < vecInts.size(); ui++)
vecInts[ui] = ui;
return vecInts;
}
void getIntVector(vector<int> &vecInts)
{
for(unsigned int ui = 0; ui < vecInts.size(); ui++)
vecInts[ui] = ui;
}
int main()
{
vector<int> vecInts = returnIntVector();
for(unsigned int ui = 0; ui < vecInts.size(); ui++)
cout << vecInts[ui] << endl;
cout << endl;
vector<int> vecInts2(10);
getIntVector(vecInts2);
for(unsigned int ui = 0; ui < vecInts2.size(); ui++)
cout << vecInts2[ui] << endl;
return 0;
}
In theory, yes it's copied. In reality, no, most modern compilers take advantage of return value optimization.
So you can write code that acts semantically correct. If you want a function that modifies or inspects a value, you take it in by reference. Your code does not do that, it creates a new value not dependent upon anything else, so return by value.
Use the first form: the one which returns vector. And a good compiler will most likely optimize it. The optimization is popularly known as Return value optimization, or RVO in short.
Others have already pointed out that with a decent (not great, merely decent) compiler, the two will normally end up producing identical code, so the two give equivalent performance.
I think it's worth mentioning one or two other points though. First, returning the object does officially copy the object; even if the compiler optimizes the code so that copy never takes place, it still won't (or at least shouldn't) work if the copy ctor for that class isn't accessible. std::vector certainly supports copying, but it's entirely possible to create a class that you'd be able to modify like in getIntVector, but not return like in returnIntVector.
Second, and substantially more importantly, I'd generally advise against using either of these. Instead of passing or returning a (reference to) a vector, you should normally work with an iterator (or two). In this case, you have a couple of perfectly reasonable choices -- you could use either a special iterator, or create a small algorithm. The iterator version would look something like this:
#ifndef GEN_SEQ_INCLUDED_
#define GEN_SEQ_INCLUDED_
#include <iterator>
template <class T>
class sequence : public std::iterator<std::forward_iterator_tag, T>
{
T val;
public:
sequence(T init) : val(init) {}
T operator *() { return val; }
sequence &operator++() { ++val; return *this; }
bool operator!=(sequence const &other) { return val != other.val; }
};
template <class T>
sequence<T> gen_seq(T const &val) {
return sequence<T>(val);
}
#endif
You'd use this something like this:
#include "gen_seq"
std::vector<int> vecInts(gen_seq(0), gen_seq(10));
Although it's open to argument that this (sort of) abuses the concept of iterators a bit, I still find it preferable on practical grounds -- it lets you create an initialized vector instead of creating an empty vector and then filling it later.
The algorithm alternative would look something like this:
template <class T, class OutIt>
class fill_seq_n(OutIt result, T num, T start = 0) {
for (T i = start; i != num-start; ++i) {
*result = i;
++result;
}
}
...and you'd use it something like this:
std::vector<int> vecInts;
fill_seq_n(std::back_inserter(vecInts), 10);
You can also use a function object with std::generate_n, but at least IMO, this generally ends up more trouble than it's worth.
As long as we're talking about things like that, I'd also replace this:
for(unsigned int ui = 0; ui < vecInts2.size(); ui++)
cout << vecInts2[ui] << endl;
...with something like this:
std::copy(vecInts2.begin(), vecInts2.end(),
std::ostream_iterator<int>(std::cout, "\n"));
In C++03 days, getIntVector() is recommended for most cases. In case of returnIntVector(), it might create some unncessary temporaries.
But by using return value optimization and swaptimization, most of them can be avoided. In era of C++11, the latter can be meaningful due to the move semantics.
In theory, the returnIntVector function returns the vector by value, so a copy will be made and it will be more time-consuming than the function which just populates an existing vector. More memory will also be used to store the copy, but only temporarily; since vecInts is locally scoped it will be stack-allocated and will be freed as soon as the returnIntVector returns. However, as others have pointed out, a modern compiler will optimize away these inefficiencies.
returnIntVector is more time consuming because it returns a copy of the vector, unless the vector implementation is realized with a single pointer in which case the performance is the same.
in general you should not rely on the implementation and use getIntVector instead.
When using arrays you can do something like
class SomeClass
{
public:
int* LockMember( size_t& numInts );
private:
int* member;
size_t numInts;
};
int* SomeClass::LockMember( size_t& out_numInts )
{
out_numInts = numInts - 1;
return member + 1;
}
To return an array offset by some amount so as to prevent someone from modifying some part of contingeous memory, or, atleast, show some intent that this part of contingeous memory of the object should remain untouched.
Since I use vectors everywhere, I am wondering if there was some way to accomplish the same sort of thing:
class SomeClass
{
public:
std::vector<int> LockMember( void );
private:
std::vector<int> member;
};
std::vector<int> SomeClass::LockMember( void )
{
// somehow make a vector with its beginning iterator pointing to member.begin() + 1
// have a size smaller by one, still the same end iterator. The vector must be
// pointing to the same data as in this class as it needs to be modifiable.
return magicOffsetVector;
}
With the commented part replaced by real code. Any ideas?
If I understand you correctly: You want some memory with two parts: At the beginning you want something that can't be touched, and after that you want something that is open for use by client code.
You could do something along the following code. This will give the client code a copy to play with. This does mean you would have to do a lot of copying, though.
class SomeClass
{
public:
std::vector<int> getMember( void ) const;
void setMember(std::vector<int> newContent);
private:
std::vector<int> member;
size_t magicOffset;
};
// Read restricted part
std::vector<int> SomeClass::getMember( void ) const
{
return vector<int>(member.begin() + magicOffset, member.end());
}
// Assign to restricted part
void SomeClass::setMember(const std::vector<int>& v)
{
std::copy(v.begin(), v.end(), member.begin() + magicOffset);
}
In order to avoid the copying, it is possible that you could allocate memory for two vectors, one for the protected part and one for the unprotected part, and use placement new to put both vectors into that memory, thus ensuring that they are in contiguous memory. And then give the client code more or less free access to the public part of the vector. However, there's still the thing with bookkeeping variables in vector, and basically this would be an awful hack that's just waiting to blow up.
However, if you only need access to the unrestricted part on a per-element basis, you could just do range-checking on the arguments, i.e.:
int getElement(size_t idx)
{
idx += magicOffset;
if (idx > member.size() || idx < 0) throw std::out_of_range("Illegal index");
return member[idx];
}
And then either provide a setElement, or return int&.
This is my first time using this site so sorry for any bad formatting or weird formulations, I'll try my best to conform to the rules on this site but I might do some misstakes in the beginning.
I'm right now working on an implementation of some different bin packing algorithms in C++ using the STL containers. In the current code I still have some logical faults that needs to be fixed but this question is more about the structure of the program. I would wan't some second opinion on how you should structure the program to minimize the number of logical faults and make it as easy to read as possible. In it's current state I just feel that this isn't the best way to do it but I don't really see any other way to write my code right now.
The problem is a dynamic online bin packing problem. It is dynamic in the sense that items have an arbitrary time before they will leave the bin they've been assigned to.
In short my questions are:
How would the structure of a Bin packing algorithm look in C++?
Is STL containers a good tool to make the implementation be able to handle inputs of arbitrary lenght?
How should I handle the containers in a good, easy to read and implement way?
Some thoughts about my own code:
Using classes to make a good distinction between handling the list of the different bins and the list of items in those bins.
Getting the implementation as effective as possible.
Being easy to run with a lot of different data lengths and files for benchmarking.
#include <iostream>
#include <fstream>
#include <list>
#include <queue>
#include <string>
#include <vector>
using namespace std;
struct type_item {
int size;
int life;
bool operator < (const type_item& input)
{
return size < input.size;
}
};
class Class_bin {
double load;
list<type_item> contents;
list<type_item>::iterator i;
public:
Class_bin ();
bool operator < (Class_bin);
bool full (type_item);
void push_bin (type_item);
double check_load ();
void check_dead ();
void print_bin ();
};
Class_bin::Class_bin () {
load=0.0;
}
bool Class_bin::operator < (Class_bin input){
return load < input.load;
}
bool Class_bin::full (type_item input) {
if (load+(1.0/(double) input.size)>1) {
return false;
}
else {
return true;
}
}
void Class_bin::push_bin (type_item input) {
int sum=0;
contents.push_back(input);
for (i=contents.begin(); i!=contents.end(); ++i) {
sum+=i->size;
}
load+=1.0/(double) sum;
}
double Class_bin::check_load () {
return load;
}
void Class_bin::check_dead () {
for (i=contents.begin(); i!=contents.end(); ++i) {
i->life--;
if (i->life==0) {
contents.erase(i);
}
}
}
void Class_bin::print_bin () {
for (i=contents.begin (); i!=contents.end (); ++i) {
cout << i->size << " ";
}
}
class Class_list_of_bins {
list<Class_bin> list_of_bins;
list<Class_bin>::iterator i;
public:
void push_list (type_item);
void sort_list ();
void check_dead ();
void print_list ();
private:
Class_bin new_bin (type_item);
bool comparator (type_item, type_item);
};
Class_bin Class_list_of_bins::new_bin (type_item input) {
Class_bin temp;
temp.push_bin (input);
return temp;
}
void Class_list_of_bins::push_list (type_item input) {
if (list_of_bins.empty ()) {
list_of_bins.push_front (new_bin(input));
return;
}
for (i=list_of_bins.begin (); i!=list_of_bins.end (); ++i) {
if (!i->full (input)) {
i->push_bin (input);
return;
}
}
list_of_bins.push_front (new_bin(input));
}
void Class_list_of_bins::sort_list () {
list_of_bins.sort();
}
void Class_list_of_bins::check_dead () {
for (i=list_of_bins.begin (); i !=list_of_bins.end (); ++i) {
i->check_dead ();
}
}
void Class_list_of_bins::print_list () {
for (i=list_of_bins.begin (); i!=list_of_bins.end (); ++i) {
i->print_bin ();
cout << "\n";
}
}
int main () {
int i, number_of_items;
type_item buffer;
Class_list_of_bins bins;
queue<type_item> input;
string filename;
fstream file;
cout << "Input file name: ";
cin >> filename;
cout << endl;
file.open (filename.c_str(), ios::in);
file >> number_of_items;
for (i=0; i<number_of_items; ++i) {
file >> buffer.size;
file >> buffer.life;
input.push (buffer);
}
file.close ();
while (!input.empty ()) {
buffer=input.front ();
input.pop ();
bins.push_list (buffer);
}
bins.print_list ();
return 0;
}
Note that this is just a snapshot of my code and is not yet running properly
Don't wan't to clutter this with unrelated chatter just want to thank the people who contributed, I will review my code and hopefully be able to structure my programming a bit better
How would the structure of a Bin packing algorithm look in C++?
Well, ideally you would have several bin-packing algorithms, separated into different functions, which differ only by the logic of the algorithm. That algorithm should be largely independent from the representation of your data, so you can change your algorithm with only a single function call.
You can look at what the STL Algorithms have in common. Mainly, they operate on iterators instead of containers, but as I detail below, I wouldn't suggest this for you initially. You should get a feel for what algorithms are available and leverage them in your implementation.
Is STL containers a good tool to make the implementation be able to handle inputs of arbitrary length?
It usually works like this: create a container, fill the container, apply an algorithm to the container.
Judging from the description of your requirements, that is how you'll use this, so I think it'll be fine. There's one important difference between your bin packing algorithm and most STL algorithms.
The STL algorithms are either non-modifying or are inserting elements to a destination. bin-packing, on the other hand, is "here's a list of bins, use them or add a new bin". It's not impossible to do this with iterators, but probably not worth the effort. I'd start by operating on the container, get a working program, back it up, then see if you can make it work for only iterators.
How should I handle the containers in a good, easy to read and implement way?
I'd take this approach, characterize your inputs and outputs:
Input: Collection of items, arbitrary length, arbitrary order.
Output: Collection of bins determined by algorithm. Each bin contains a collection of items.
Then I'd worry about "what does my algorithm need to do?"
Constantly check bins for "does this item fit?"
Your Class_bin is a good encapsulation of what is needed.
Avoid cluttering your code with unrelated stuff like "print()" - use non-member help functions.
type_item
struct type_item {
int size;
int life;
bool operator < (const type_item& input)
{
return size < input.size;
}
};
It's unclear what life (or death) is used for. I can't imagine that concept being relevant to implementing a bin-packing algorithm. Maybe it should be left out?
This is personal preference, but I don't like giving operator< to my objects. Objects are usually non-trivial and have many meanings of less-than. For example, one algorithm might want all the alive items sorted before the dead items. I typically wrap that in another struct for clarity:
struct type_item {
int size;
int life;
struct SizeIsLess {
// Note this becomes a function object, which makes it easy to use with
// STL algorithms.
bool operator() (const type_item& lhs, const type_item& rhs)
{
return lhs.size < rhs.size;
}
}
};
vector<type_item> items;
std::sort(items.begin, items.end(), type_item::SizeIsLess);
Class_bin
class Class_bin {
double load;
list<type_item> contents;
list<type_item>::iterator i;
public:
Class_bin ();
bool operator < (Class_bin);
bool full (type_item);
void push_bin (type_item);
double check_load ();
void check_dead ();
void print_bin ();
};
I would skip the Class_ prefix on all your types - it's just a bit excessive, and it should be clear from the code. (This is a variant of hungarian notation. Programmers tend to be hostile towards it.)
You should not have a class member i (the iterator). It's not part of class state. If you need it in all the members, that's ok, just redeclare it there. If it's too long to type, use a typedef.
It's difficult to quantify "bin1 is less than bin2", so I'd suggest removing the operator<.
bool full(type_item) is a little misleading. I'd probably use bool can_hold(type_item). To me, bool full() would return true if there is zero space remaining.
check_load() would seem more clearly named load().
Again, it's unclear what check_dead() is supposed to accomplish.
I think you can remove print_bin and write that as a non-member function, to keep your objects cleaner.
Some people on StackOverflow would shoot me, but I'd consider just making this a struct, and leaving load and item list public. It doesn't seem like you care much about encapsulation here (you're only need to create this object so you don't need do recalculate load each time).
Class_list_of_bins
class Class_list_of_bins {
list<Class_bin> list_of_bins;
list<Class_bin>::iterator i;
public:
void push_list (type_item);
void sort_list ();
void check_dead ();
void print_list ();
private:
Class_bin new_bin (type_item);
bool comparator (type_item, type_item);
};
I think you can do without this class entirely.
Conceptually, it represents a container, so just use an STL container. You can implement the methods as non-member functions. Note that sort_list can be replaced with std::sort.
comparator is too generic a name, it gives no indication of what it compares or why, so consider being more clear.
Overall Comments
Overall, I think the classes you've picked adequately model the space you're trying to represent, so you'll be fine.
I might structure my project like this:
struct bin {
double load; // sum of item sizes.
std::list<type_item> items;
bin() : load(0) { }
};
// Returns true if the bin can fit the item passed to the constructor.
struct bin_can_fit {
bin_can_fit(type_item &item) : item_(item) { }
bool operator()(const bin &b) {
return item_.size < b.free_space;
}
private:
type_item item_;
};
// ItemIter is an iterator over the items.
// BinOutputIter is an output iterator we can use to put bins.
template <ItemIter, BinOutputIter>
void bin_pack_first_fit(ItemIter curr, ItemIter end, BinOutputIter output_bins) {
std::vector<bin> bins; // Create a local bin container, to simplify life.
for (; curr != end; ++curr) {
// Use a helper predicate to check whether the bin can fit this item.
// This is untested, but just for an idea.
std::vector<bin>::iterator bin_it =
std::find_if(bins.begin(), bins.end(), bin_can_fit(*curr));
if (bin_it == bins.end()) {
// Did not find a bin with enough space, add a new bin.
bins.push_back(bin);
// push_back invalidates iterators, so reassign bin_it to the last item.
bin_it = std::advance(bins.begin(), bins.size() - 1);
}
// bin_it now points to the bin to put the item in.
bin_it->items.push_back(*curr);
bin_it->load += curr.size();
}
std::copy(bins.begin(), bins.end(), output_bins); // Apply our bins to the destination.
}
void main(int argc, char** argv) {
std::vector<type_item> items;
// ... fill items
std::vector<bin> bins;
bin_pack_first_fit(items.begin(), items.end(), std::back_inserter(bins));
}
Some thoughts:
Your names are kinda messed up in places.
You have a lot of parameters named input, thats just meaningless
I'd expect full() to check whether it is full, not whether it can fit something else
I don't think push_bin pushes a bin
check_dead modifies the object (I'd expect something named check_*, to just tell me something about the object)
Don't put things like Class and type in the names of classes and types.
class_list_of_bins seems to describe what's inside rather then what the object is.
push_list doesn't push a list
Don't append stuff like _list to every method in a list class, if its a list object, we already know its a list method
I'm confused given the parameters of life and load as to what you are doing. The bin packing problem I'm familiar with just has sizes. I'm guessing that overtime some of the objects are taken out of bins and thus go away?
Some further thoughts on your classes
Class_list_of_bins is exposing too much of itself to the outside world. Why would the outside world want to check_dead or sort_list? That's nobodies business but the object itself. The public method you should have on that class really should be something like
* Add an item to the collection of bins
* Print solution
* Step one timestep into the future
list<Class_bin>::iterator i;
Bad, bad, bad! Don't put member variables on your unless they are actually member states. You should define that iterator where it is used. If you want to save some typing add this: typedef list::iterator bin_iterator and then you use bin_iterator as the type instead.
EXPANDED ANSWER
Here is my psuedocode:
class Item
{
Item(Istream & input)
{
read input description of item
}
double size_needed() { return actual size required (out of 1) for this item)
bool alive() { return true if object is still alive}
void do_timestep() { decrement life }
void print() { print something }
}
class Bin
{
vector of Items
double remaining_space
bool can_add(Item item) { return true if we have enough space}
void add(Item item) {add item to vector of items, update remaining space}
void do_timestep() {call do_timestep() and all Items, remove all items which indicate they are dead, updating remaining_space as you go}
void print { print all the contents }
}
class BinCollection
{
void do_timestep { call do_timestep on all of the bins }
void add(item item) { find first bin for which can_add return true, then add it, create a new bin if neccessary }
void print() { print all the bins }
}
Some quick notes:
In your code, you converted the int size to a float repeatedly, that's not a good idea. In my design that is localized to one place
You'll note that the logic relating to a single item is now contained inside the item itself. Other objects only can see whats important to them, size_required and whether the object is still alive
I've not included anything about sorting stuff because I'm not clear what that is for in a first-fit algorithm.
This interview gives some great insight into the rationale behind the STL. This may give you some inspiration on how to implement your algorithms the STL-way.
Suppose you have a function, and you call it a lot of times, every time the function return a big object. I've optimized the problem using a functor that return void, and store the returning value in a public member:
#include <vector>
const int N = 100;
std::vector<double> fun(const std::vector<double> & v, const int n)
{
std::vector<double> output = v;
output[n] *= output[n];
return output;
}
class F
{
public:
F() : output(N) {};
std::vector<double> output;
void operator()(const std::vector<double> & v, const int n)
{
output = v;
output[n] *= n;
}
};
int main()
{
std::vector<double> start(N,10.);
std::vector<double> end(N);
double a;
// first solution
for (unsigned long int i = 0; i != 10000000; ++i)
a = fun(start, 2)[3];
// second solution
F f;
for (unsigned long int i = 0; i != 10000000; ++i)
{
f(start, 2);
a = f.output[3];
}
}
Yes, I can use inline or optimize in an other way this problem, but here I want to stress on this problem: with the functor I declare and construct the output variable output only one time, using the function I do that every time it is called. The second solution is two time faster than the first with g++ -O1 or g++ -O2. What do you think about it, is it an ugly optimization?
Edit:
to clarify my aim. I have to evaluate the function >10M times, but I need the output only few random times. It's important that the input is not changed, in fact I declared it as a const reference. In this example the input is always the same, but in real world the input change and it is function of the previous output of the function.
More common scenario is to create object with reserved large enough size outside the function and pass large object to the function by pointer or by reference. You could reuse this object on several calls to your function. Thus you could reduce continual memory allocation.
In both cases you are allocating new vector many many times.
What you should do is to pass both input and output objects to your class/function:
void fun(const std::vector<double> & in, const int n, std::vector<double> & out)
{
out[n] *= in[n];
}
this way you separate your logic from the algorithm. You'll have to create a new std::vector once and pass it to the function as many time as you want. Notice that there's unnecessary no copy/allocation made.
p.s. it's been awhile since I did c++. It may not compile right away.
It's not an ugly optimization. It's actually a fairly decent one.
I would, however, hide output and make an operator[] member to access its members. Why? Because you just might be able to perform a lazy evaluation optimization by moving all the math to that function, thus only doing that math when the client requests that value. Until the user asks for it, why do it if you don't need to?
Edit:
Just checked the standard. Behavior of the assignment operator is based on insert(). Notes for that function state that an allocation occurs if new size exceeds current capacity. Of course this does not seem to explicitly disallow an implementation from reallocating even if otherwise...I'm pretty sure you'll find none that do and I'm sure the standard says something about it somewhere else. Thus you've improved speed by removing allocation calls.
You should still hide the internal vector. You'll have more chance to change implementation if you use encapsulation. You could also return a reference (maybe const) to the vector from the function and retain the original syntax.
I played with this a bit, and came up with the code below. I keep thinking there's a better way to do this, but it's escaping me for now.
The key differences:
I'm allergic to public member variables, so I made output private, and put getters around it.
Having the operator return void isn't necessary for the optimization, so I have it return the value as a const reference so we can preserve return value semantics.
I took a stab at generalizing the approach into a templated base class, so you can then define derived classes for a particular return type, and not re-define the plumbing. This assumes the object you want to create takes a one-arg constructor, and the function you want to call takes in one additional argument. I think you'd have to define other templates if this varies.
Enjoy...
#include <vector>
template<typename T, typename ConstructArg, typename FuncArg>
class ReturnT
{
public:
ReturnT(ConstructArg arg): output(arg){}
virtual ~ReturnT() {}
const T& operator()(const T& in, FuncArg arg)
{
output = in;
this->doOp(arg);
return this->getOutput();
}
const T& getOutput() const {return output;}
protected:
T& getOutput() {return output;}
private:
virtual void doOp(FuncArg arg) = 0;
T output;
};
class F : public ReturnT<std::vector<double>, std::size_t, const int>
{
public:
F(std::size_t size) : ReturnT<std::vector<double>, std::size_t, const int>(size) {}
private:
virtual void doOp(const int n)
{
this->getOutput()[n] *= n;
}
};
int main()
{
const int N = 100;
std::vector<double> start(N,10.);
double a;
// second solution
F f(N);
for (unsigned long int i = 0; i != 10000000; ++i)
{
a = f(start, 2)[3];
}
}
It seems quite strange(I mean the need for optimization at all) - I think that a decent compiler should perform return value optimization in such cases. Maybe all you need is to enable it.