How to implement a lightweight fast associative array?

I'm trying to understand how I should implement an associative array which gives constant time for search operations, right now my implementation looks like this:
#include <cstdio>
#include <iostream>
#include <vector>
#include <string>
using namespace std;
template <class Key, class Value> class Dict {
private:
typedef struct Item {
Value value;
Key key;
} Item;
vector<Item> _data;
public:
void clear() {
_data.clear();
}
long size() {
return _data.size();
}
bool is_item(Key key) {
for (int i = 0; i < size(); i++) {
if (_data[i].key == key) return true;
}
return false;
}
bool add_item(Key key, Value value) {
if (is_item(key)) return false;
Item new_item;
new_item.key = key;
new_item.value = value;
_data.push_back(new_item);
return true;
}
Value &operator[](Key key) {
for (int i = 0; i < size(); i++) {
if (_data[i].key == key) return _data[i].value;
}
long idx = size();
Item new_item;
new_item.key = key;
_data.push_back(new_item);
return _data[idx].value;
}
Key get_key(long index) {
if (index < 0) index = 0;
// Out of range: return a default-constructed key (NULL is not a valid Key in general).
if (index >= size()) return Key();
return _data[index].key;
}
Value &operator[](long index) {
if (index < 0) index = 0;
for (int i = 0; i < size(); i++) {
if (i == index) return _data[i].value;
}
return _data[0].value;
}
};
A simple test for this:
class Foo {
public:
Foo(int value) {
_value = value;
}
int get_value() {
return _value;
}
void set_value(int value) {
_value = value;
}
private:
int _value;
};
template <class Key, class Value> void print_dict(Dict<Key, Value> &dct) {
if (!dct.size()) {
printf("Empty Dict");
}
for (int i = 0; i < dct.size(); i++) {
printf("%d%s", dct[dct.get_key(i)], i == dct.size() - 1 ? "" : ", ");
}
printf("\n");
}
int main(int argc, char *argv[]) {
printf("\nDict tests\n------------\n");
Dict<string, int> dct;
string key1("key1");
string key2("key2");
string key3("key3");
dct["key1"] = 100;
dct["key2"] = 200;
dct["key3"] = 300;
printf("%d %d %d\n", dct["key1"], dct["key2"], dct["key3"]);
printf("%d %d %d\n", dct[key1], dct[key2], dct[key3]);
print_dict(dct);
dct.clear();
print_dict(dct);
Dict<Foo *, int> dct2;
Foo *f1 = new Foo(100);
Foo *f2 = new Foo(200);
dct2[f1] = 101;
dct2[f2] = 202;
print_dict(dct2);
}
Here's the thing, right now the search operation is linear time and I'd like it to become constant time and I'm wondering about a simple/lightweight way to achieve this.
I've seen hashtables are a possible option but I'd prefer not having to implement a hash function per object. Maybe something similar to an unordered_map... dunno.
Could anyone give some ideas or maybe providing a simple lightweight implementation of what I'm trying to achieve here?
In this fictional example I'm using std::vector to avoid making the question bigger and more complex than it is, but my real use case won't use the STL at all (i.e., I'll be coding my own custom implementation of std::vector).
CONSTRAINTS
The reason for not using the STL at all is not that its implementation isn't good enough (fast, generic, full-featured) but that it is quite heavy for my size-constrained projects (final exe <= 65536 bytes). Even this small implementation of the STL is actually too big to be used as is.
I don't need a full implementation of an associative array, just the interface I've already implemented above (the main problem being the linear-time search).
I don't care if the insert/delete methods are slow, but I definitely want search/lookup to be close to constant time.
I guess I'd need to convert the above implementation into an associative array backed by a hash table, but I'm unsure about the relevant implementation details (which hash function per object, which table size, ...).

Let me address some issues you've raised in your question.
Here's the thing, right now the search operation is linear time and I'd like it to become constant time and I'm wondering about a simple/lightweight way to achieve this.
A simple lightweight way to achieve this, i.e., to have an associative array (a.k.a. key-value-store), is to use one provided by the standard library.
If you are coding in a recent version of C++, you are in the lucky position that the standard library actually provides one that satisfies your constant-time requirement:
http://en.cppreference.com/w/cpp/container/unordered_map
The data structure implementations shipped with the standard library of any decent compiler these days are probably better than anything you could come up with. (Otherwise, why would you ask for the code?)
I've seen hashtables are a possible option but I'd prefer not having to implement a hash function per object. Maybe something similar to an unordered_map... dunno.
A std::unordered_map actually is a hash table, and as you can see in the docs, it takes a hash function. The docs also list many specializations of std::hash for built-in and library types, which can help you derive a custom hash function for your own object types:
http://en.cppreference.com/w/cpp/utility/hash
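As a rough illustration of deriving a custom hash from the existing std::hash specializations, here is a minimal sketch. The Point type, PointHash, name_of and the way the two hashes are combined are all hypothetical choices made for this example, not something from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical key type, used only for illustration.
struct Point {
    int x;
    int y;
    bool operator==(const Point &other) const { return x == other.x && y == other.y; }
};

// Custom hash built from the std::hash<int> specialization the library provides.
struct PointHash {
    std::size_t operator()(const Point &p) const {
        std::size_t h1 = std::hash<int>()(p.x);
        std::size_t h2 = std::hash<int>()(p.y);
        return h1 ^ (h2 + 0x9e3779b9u + (h1 << 6) + (h1 >> 2)); // one simple way to mix two hashes
    }
};

typedef std::unordered_map<Point, std::string, PointHash> PointNames;

// Average-case constant-time lookup; returns an empty string when the key is absent.
inline std::string name_of(const PointNames &names, const Point &p) {
    PointNames::const_iterator it = names.find(p);
    return it == names.end() ? std::string() : it->second;
}

// Usage sketch with made-up data.
inline void point_names_example() {
    PointNames names;
    Point origin = {0, 0};
    Point unit = {1, 1};
    names[origin] = "origin";
    assert(name_of(names, origin) == "origin");
    assert(name_of(names, unit).empty());
}
```

The key type only needs operator== and a hash functor; everything else comes from the library.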
Could anyone give some ideas or maybe providing a simple lightweight implementation of what I'm trying to achieve here?
Just have a look at the example code for std::unordered_map to see how it's used. If you worry about performance, don't forget to measure. If you really want some background on implementing hash tables, I liked these talks on the Python dictionary:
https://www.youtube.com/watch?v=C4Kc8xzcA68
https://www.youtube.com/watch?v=p33CVV29OG8
Also have a look at the wikipedia page (if you haven't already):
https://en.wikipedia.org/wiki/Associative_array
In this fictional example I'm using std::vector to avoid making the question bigger and more complex than it is, but my real use case won't use the STL at all (i.e., I'll be coding my own custom implementation of std::vector).
Unless you are doing this for educational/recreational purposes, don't do it. Don't be ashamed to base your endeavours on the shoulders of giants. That the standard library wasn't invented in your project is not a problem.

If you want to keep code size small, you should avoid templates as much as possible. At least templates that create non-trivial amounts of code.
For your hash map, that means: stick to one key type and only store void pointers to the values. If you don't want to deal with void* and the casts that go along with it all over in your code, implement one non-template hash map that stores void* as value, with all "no inline" functions. Then create an "all inline" (maybe even all "force inline") wrapper class that uses the void* map internally and just converts T* <-> void*.
If you really really really need different key types, see if you can stick to PODs without padding (memcpy copyable and memcmp comparable). That way you can still use the same hash map class for everything: you just have to tell the map (at runtime) what the key-size is. Then you can copy the keys into the map using memcpy, compare using memcmp and hash them using any hash algorithm that can hash byte sequences (=almost every hash algorithm).
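The two paragraphs above can be sketched roughly as follows. This is only an illustration under several assumptions: the names RawMap and TypedMap are made up, the table uses open addressing with linear probing and an FNV-1a style byte hash (any byte hash would do), there is no removal, allocation failures are not checked, and it leans on malloc/memcpy/memcmp, which a really size-constrained build might replace with its own routines.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <utility>

// Non-template map: keys are raw byte blocks whose size is given at runtime,
// values are void*. Hypothetical sketch, not a tuned implementation.
class RawMap {
public:
    explicit RawMap(std::size_t key_size, std::size_t capacity = 16)
        : key_size_(key_size), capacity_(capacity), count_(0),
          keys_(static_cast<unsigned char *>(std::malloc(key_size * capacity))),
          values_(static_cast<void **>(std::malloc(sizeof(void *) * capacity))),
          used_(static_cast<unsigned char *>(std::calloc(capacity, 1))) {
        assert(key_size > 0 && (capacity & (capacity - 1)) == 0); // capacity must be a power of two
    }
    ~RawMap() { std::free(keys_); std::free(values_); std::free(used_); }
    RawMap(const RawMap &) = delete;
    RawMap &operator=(const RawMap &) = delete;

    void put(const void *key, void *value) {
        if ((count_ + 1) * 2 > capacity_) grow();  // keep the table at most half full
        std::size_t i = find_slot(key);
        if (!used_[i]) {
            std::memcpy(keys_ + i * key_size_, key, key_size_);
            used_[i] = 1;
            ++count_;
        }
        values_[i] = value;
    }
    void *get(const void *key) const {
        std::size_t i = find_slot(key);
        return used_[i] ? values_[i] : nullptr;
    }
    std::size_t size() const { return count_; }

private:
    std::size_t hash(const void *key) const {  // FNV-1a style hash over the key bytes
        const unsigned char *p = static_cast<const unsigned char *>(key);
        std::size_t h = 2166136261u;
        for (std::size_t i = 0; i < key_size_; ++i) { h ^= p[i]; h *= 16777619u; }
        return h;
    }
    // Linear probing: returns the slot holding key, or the first free slot on its probe path.
    std::size_t find_slot(const void *key) const {
        std::size_t i = hash(key) & (capacity_ - 1);
        while (used_[i] && std::memcmp(keys_ + i * key_size_, key, key_size_) != 0)
            i = (i + 1) & (capacity_ - 1);
        return i;
    }
    void grow() {
        RawMap bigger(key_size_, capacity_ * 2);
        for (std::size_t i = 0; i < capacity_; ++i)
            if (used_[i]) bigger.put(keys_ + i * key_size_, values_[i]);
        // Take over the bigger table's storage; `bigger` then frees the old arrays.
        std::swap(keys_, bigger.keys_);
        std::swap(values_, bigger.values_);
        std::swap(used_, bigger.used_);
        std::swap(capacity_, bigger.capacity_);
    }

    std::size_t key_size_, capacity_, count_;
    unsigned char *keys_;
    void **values_;
    unsigned char *used_;
};

// Thin all-inline wrapper: it only casts, so each instantiation adds very little code.
// K is assumed to be a POD without padding.
template <class K, class V>
class TypedMap {
public:
    TypedMap() : map_(sizeof(K)) {}
    void put(const K &key, V *value) { map_.put(&key, value); }
    V *get(const K &key) const { return static_cast<V *>(map_.get(&key)); }
    std::size_t size() const { return map_.size(); }
private:
    RawMap map_;
};
```

All the real work lives once in RawMap, so adding another TypedMap<K, V> instantiation should cost only a few casts.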
Of course you'll also want to do a lot of other stuff, e.g. avoid inlining any non-trivial functions, avoid C-Runtime library functions, disable exception handling and RTTI etc., but that's a different topic.
Or, maybe just stick to plain C :)

Related

Designing hash set in leetcode, code gives run time error

I am trying to solve a question on designing a HashSet.
Design a HashSet without using any built-in hash table libraries.
To be specific, your design should include these functions:
add(value): Insert a value into the HashSet.
contains(value): Return whether the value exists in the HashSet or not.
remove(value): Remove a value from the HashSet. If the value does not exist in the HashSet, do nothing.
Example:
MyHashSet hashSet = new MyHashSet();
hashSet.add(1);
hashSet.add(2);
hashSet.contains(1); // returns true
hashSet.contains(3); // returns false (not found)
hashSet.add(2);
hashSet.contains(2); // returns true
hashSet.remove(2);
hashSet.contains(2); // returns false (already removed)
Note:
All values will be in the range of [1, 1000000].
The number of operations will be in the range of [1, 10000].
Please do not use the built-in HashSet library.
The following code runs fine locally, but fails on submission giving the error,
Runtime Error Message:
reference binding to misaligned address 0x736c61662c657572 for type 'int', which requires 4 byte alignment
Last executed input:
["MyHashSet","add","remove","add","contains","add","remove","add","add","add","add"]
[[],[6],[4],[17],[14],[14],[17],[14],[14],[18],[14]]
class MyHashSet {
public:
vector<vector<int>> setHash;
MyHashSet() {
setHash.reserve(10000);
}
void add(int key) {
int bucket = key % 10000;
vector<int>::iterator it;
it = find(setHash[bucket].begin(),setHash[bucket].end(),key);
if(it == setHash[bucket].end()){
setHash[bucket].push_back(key);
}
}
void remove(int key) {
int bucket = key % 10000;
vector<int>::iterator it1;
it1 = find(setHash[bucket].begin(),setHash[bucket].end(),key);
if(it1 != setHash[bucket].end()){
int index = distance(it1,setHash[bucket].begin());
setHash[bucket].erase(setHash[bucket].begin()+index);
}
}
/** Returns true if this set contains the specified element */
bool contains(int key) {
int bucket = key % 10000;
vector<int>::iterator it2;
it2 = find(setHash[bucket].begin(),setHash[bucket].end(),key);
if(it2 != setHash[bucket].end()){
return true;
}
return false;
}
};
I suspect it's due to a memory issue, but I haven't been able to figure it out, as I am still learning the fundamentals of C++.

If your question is how to fix your implementation, I'd heed the advice in the comments.
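The comments are not reproduced here, so the following is only my reading of the likely fix. In the posted code, reserve(10000) allocates capacity but creates no buckets, so every setHash[bucket] access is out of bounds (undefined behavior). Also, distance(it1, begin()) has its arguments swapped, which gives a zero or negative offset. A minimal sketch that keeps the same bucket scheme (index_of and kBuckets are names I introduced):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

class MyHashSet {
public:
    // Construct the outer vector with 10000 empty buckets (same effect as resize(10000)).
    MyHashSet() : setHash(kBuckets) {}

    void add(int key) {
        std::vector<int> &b = setHash[index_of(key)];
        if (std::find(b.begin(), b.end(), key) == b.end()) b.push_back(key);
    }
    void remove(int key) {
        std::vector<int> &b = setHash[index_of(key)];
        std::vector<int>::iterator it = std::find(b.begin(), b.end(), key);
        if (it != b.end()) b.erase(it);  // erase through the iterator; no index arithmetic needed
    }
    bool contains(int key) const {
        const std::vector<int> &b = setHash[index_of(key)];
        return std::find(b.begin(), b.end(), key) != b.end();
    }

private:
    static const int kBuckets = 10000;
    static std::size_t index_of(int key) {
        assert(key >= 0);  // the problem statement guarantees values in [1, 1000000]
        return static_cast<std::size_t>(key % kBuckets);
    }
    std::vector<std::vector<int> > setHash;
};
```

With the buckets actually present, the failing input from the question (add 6, remove 4, add 17, ...) no longer touches memory outside the vector.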
If you're looking to learn about C++ and solve the problem in an optimal way, I'd use std::bitset. The fact they give you the defined input range [1,1000000] leads me to believe they're looking for something like this.
This might fall into the category of built-in hash table libraries so here is a potential implementation.
#include <bitset>
class MyHashSet {
public:
void add(int key) {
flags.set(key);
}
void remove(int key) {
flags.reset(key);
}
bool contains(int key) const {
return flags[key];
}
private:
// One bit per possible value in [0, 1000000].
std::bitset<1000000+1> flags;
};
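A std::bitset of 1,000,001 bits occupies about 125 kB (one bit per possible value), compared with roughly 240 kB for 10000 empty vectors on a typical 64-bit platform, where each std::vector<int> is 24 bytes. It also requires no dynamic memory allocation.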
If you consider this off-topic or cheating, please provide the title/number of the LeetCode problem so I can work on your code draft using their test cases. I'm also studying hash tables right now so it'd be a win-win.

Is there a C++ container for unique values that supports strict size checking?

I'm looking for a C++ container to store pointers to objects which also meets the following requirements.
A container that keeps the order of elements (sequence container, so std::set is not suitable)
A container that has a member function which return the actual size (As std::array::size() always returns the fixed size, std::array is not suitable)
A container that supports random accesses such as operator [].
This is my code snippet and I'd like to remove the assertions used for checking size and uniqueness of elements.
#include <vector>
#include <set>
#include <cassert>
#include <cstdint>
class Foo {
public:
void DoSomething() {
}
};
int main() {
// a variable used to check whether a container is properly assigned
const uint8_t size_ = 2;
Foo foo1;
Foo foo2;
// Needs a kind of sequential containers to keep the order
// used std::vector instead of std::array to use member function size()
const std::vector<Foo*> vec = {
&foo1,
&foo2
};
std::set<Foo*> set_(vec.begin(), vec.end());
assert(vec.size() == size_); // size checking against pre-defined value
assert(vec.size() == set_.size()); // check for elements uniqueness
// Needs to access elements using [] operator
for (auto i = 0; i < size_; i++) {
vec[i]->DoSomething();
}
return 0;
}
Is there a C++ container that doesn't need the two assertions used in my code snippet? Or do I need to write my own class that encapsulates one of the STL containers?

So you want a class that acts like a vector, except that on insert it rejects duplicates like a set or a map.
One option might be the Boost.Bimap with indices of T* and sequence_index.
Your vector-like indexing would be via the sequence_index. You might even be willing to live with holes in the sequence after an element is erased.
Sticking with the STL, you could implement a bidirectional map using two maps; the code below instead uses a map and a vector.
Note that by inheriting from vector I get all the vector methods for free, but I also risk the user upcasting to the vector.
One way round that without remodelling with a wrapper (a la queue vs list) is to make it protected inheritance and then explicitly using all the methods back to public. This is actually safer as it ensures you haven't inadvertently left some vector modification method live that would take the two containers out of step.
Note also that you would need to roll your own initializer_list constructor if you wanted one to filter out any duplicates. And you would have to do a bit of work to get this thread-safe.
template <class T>
class uniqvec : public std::vector<T*>
{
private:
typedef typename std::vector<T*> Base;
enum {push_back, pop_back, emplace_back, emplace}; //add anything else you don't like from vector
std::map <T*, size_t> uniquifier;
public:
std::pair<typename Base::iterator, bool> insert(T* t)
{
auto rv1 = uniquifier.insert(std::make_pair(t, Base::size()));
if (rv1.second)
{
Base::push_back(t);
}
return std::make_pair(Base::begin()+rv1.first->second, rv1.second);
}
void erase(T* t)
{
auto found = uniquifier.find(t);
if (found != uniquifier.end())
{
auto index = found->second;
uniquifier.erase(found);
Base::erase(Base::begin()+index);
for (auto& u : uniquifier)
if (u.second > index)
u.second--;
}
}
// Note that c++11 returns the next safe iterator,
// but I don't know if that should be in vector order or set order.
void erase(typename Base::iterator i)
{
return erase(*i);
}
};
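The protected-inheritance variant mentioned above could look roughly like this. It is a sketch under my own assumptions: the name unique_seq is made up, a std::set is used for the duplicate check, and only read-only members are re-exposed so that nothing can bypass insert(). Erase is left out for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

template <class T>
class unique_seq : protected std::vector<T*> {
    typedef std::vector<T*> Base;
    std::set<T*> seen_;  // tracks which pointers are already stored
public:
    typedef typename Base::const_iterator const_iterator;

    // Re-expose only read-only members of the vector.
    using Base::size;
    using Base::empty;
    using Base::cbegin;
    using Base::cend;

    // Read-only random access; returns the pointer by value so the slot cannot be overwritten.
    T* operator[](std::size_t i) const {
        assert(i < Base::size());
        return Base::operator[](i);
    }

    // Appends p if it is not already present; returns whether it was added.
    bool insert(T* p) {
        if (!seen_.insert(p).second) return false;
        Base::push_back(p);
        return true;
    }
};
```

Because the base is protected, a unique_seq<T> cannot be passed where a std::vector<T*>& is expected, which removes the upcasting risk noted earlier.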

As others have mentioned, your particular question seems like an XY problem (you are down in the weeds about a particular solution instead of focusing on the original problem). An extremely useful flowchart was posted here a number of years ago (credit to @MikaelPersson) that will help you choose the STL container that best fits your needs. You can find it in the original question, "In which scenario do I use a particular STL container?".

Not sure what data structure to use

I'm currently trying to work with vectors / deques of structures. Simple example of the structure...
struct job {
int id;
int time;
};
I want to be able to search through the structure to find the job that matches the time, remove it from the structure and continue to check for other ids in that structure. Sample code...
std::vector<job> jobs;
std::deque<job> started;
for (unsigned int i = 0; i < jobs.size(); i++)
{
if (jobs.at(i).time == time)
{
started.push_back(jobs.at(i));
jobs.erase(jobs.begin() + i);
i--;
}
}
time++;
This works how I want it to but it also seems very hacky since I'm adjusting the index whenever I delete and I think it's simply because I'm not as knowledgeable as should be with data structures. Anyone able to give me some advice?
NOTE - I don't think this is a duplicate of the post it has been linked to, as I'm not looking to do something more efficiently with what I already have. To me, it seems efficient enough considering I'm reducing the size of the deque each time I get what I need from it. What I was hoping for is some advice on figuring out the best data structure for what I'm attempting to do with deques, which are likely not meant to be handled the way I'm handling them.
I could also be wrong and my usage is fine, but it just seems off to me too.

Well, I always knew that this talk would come in handy! The message here is "know your STL algorithms". With that, let me introduce you to std::stable_partition.
One thing you can do is use just one single vector, as follows:
using namespace std;
vector<job> jobs;
// fill the vector with jobs
auto startedJobsIter = stable_partition(begin(jobs), end(jobs),
[time](job const &_job) { return _job.time == time; });
Now, everything between begin(jobs) and startedJobsIter satisfies the condition, while everything from startedJobsIter to end(jobs) does not.
Edit
If you don't care about the relative ordering of the items, then you could just use std::partition, which could be even more performant, because it would not preserve the relative ordering of the elements in the original vector, but will still divide it into the two parts.
Edit 2
Here's an adaptation for older C++ standards:
struct job_time_predicate {
public:
job_time_predicate(int time) : time_(time) { }
bool operator()(job const &the_job) const { return the_job.time == time_; }
private:
int time_;
};
int main()
{
using namespace std;
int time = 10;
vector<job> jobs;
// fill that vector
vector<job>::iterator startedJobsIter =
stable_partition(jobs.begin(), jobs.end(), job_time_predicate(time));
}
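To connect this back to the original loop, here is a minimal sketch of how the partition point could be used to move the matching jobs into the started deque and drop them from jobs in one call. The helper name start_due_jobs and the sample data are hypothetical, and it assumes C++11 for the lambda.

```cpp
#include <algorithm>
#include <cassert>
#include <deque>
#include <vector>

struct job {
    int id;
    int time;
};

// Moves every job whose time equals `now` from `jobs` to the back of `started`,
// keeping the relative order within both groups.
inline void start_due_jobs(std::vector<job> &jobs, std::deque<job> &started, int now) {
    std::vector<job>::iterator firstNotDue = std::stable_partition(
        jobs.begin(), jobs.end(),
        [now](const job &j) { return j.time == now; });
    started.insert(started.end(), jobs.begin(), firstNotDue);  // copy the due jobs
    jobs.erase(jobs.begin(), firstNotDue);                     // then remove them in one range erase
}

// Usage sketch with made-up data.
inline void start_due_jobs_example() {
    std::vector<job> jobs = {{1, 5}, {2, 3}, {3, 5}, {4, 7}};
    std::deque<job> started;
    start_due_jobs(jobs, started, 5);
    assert(started.size() == 2 && jobs.size() == 2);
    assert(started[0].id == 1 && started[1].id == 3);
}
```

This avoids adjusting the loop index by hand, since the erase happens once, after the partition.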

return a vector vs use a parameter for the vector to return it

With the code below, the question is:
If you use the "returnIntVector()" function, is the vector copied from the local to the "outer" (global) scope? In other words is it a more time and memory consuming variation compared to the "getIntVector()"-function? (However providing the same functionality.)
#include <iostream>
#include <vector>
using namespace std;
vector<int> returnIntVector()
{
vector<int> vecInts(10);
for(unsigned int ui = 0; ui < vecInts.size(); ui++)
vecInts[ui] = ui;
return vecInts;
}
void getIntVector(vector<int> &vecInts)
{
for(unsigned int ui = 0; ui < vecInts.size(); ui++)
vecInts[ui] = ui;
}
int main()
{
vector<int> vecInts = returnIntVector();
for(unsigned int ui = 0; ui < vecInts.size(); ui++)
cout << vecInts[ui] << endl;
cout << endl;
vector<int> vecInts2(10);
getIntVector(vecInts2);
for(unsigned int ui = 0; ui < vecInts2.size(); ui++)
cout << vecInts2[ui] << endl;
return 0;
}

In theory, yes it's copied. In reality, no, most modern compilers take advantage of return value optimization.
So you can write whichever form is semantically right. If you want a function that modifies or inspects an existing value, take that value by reference. Your code does not do that; it creates a new value that does not depend on anything else, so return it by value.
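One way to check this yourself is to count copies. The sketch below uses a hypothetical Tracked wrapper around a vector and assumes C++11 or later, where a returned local is either elided or moved, so the copy constructor should not run at all.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical wrapper that counts how often it is copy-constructed.
struct Tracked {
    static int copies;
    std::vector<int> data;
    Tracked() {}
    Tracked(const Tracked &other) : data(other.data) { ++copies; }
    Tracked(Tracked &&other) noexcept : data(std::move(other.data)) {}
};
int Tracked::copies = 0;

Tracked makeTracked() {
    Tracked t;
    for (int i = 0; i < 10; ++i) t.data.push_back(i);
    return t;  // elided (NRVO) or, failing that, moved
}

// Returns how many deep copies were made while returning by value.
inline int copies_made_by_return() {
    Tracked::copies = 0;
    Tracked t = makeTracked();
    assert(t.data.size() == 10);
    return Tracked::copies;
}
```

If copies_made_by_return() reports zero on your compiler, returning the vector by value costs no deep copy there.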

Use the first form: the one which returns vector. And a good compiler will most likely optimize it. The optimization is popularly known as Return value optimization, or RVO in short.

Others have already pointed out that with a decent (not great, merely decent) compiler, the two will normally end up producing identical code, so the two give equivalent performance.
I think it's worth mentioning one or two other points though. First, returning the object does officially copy the object; even if the compiler optimizes the code so that copy never takes place, it still won't (or at least shouldn't) work if the copy ctor for that class isn't accessible. std::vector certainly supports copying, but it's entirely possible to create a class that you'd be able to modify like in getIntVector, but not return like in returnIntVector.
Second, and substantially more importantly, I'd generally advise against using either of these. Instead of passing or returning a (reference to) a vector, you should normally work with an iterator (or two). In this case, you have a couple of perfectly reasonable choices -- you could use either a special iterator, or create a small algorithm. The iterator version would look something like this:
#ifndef GEN_SEQ_INCLUDED_
#define GEN_SEQ_INCLUDED_
#include <iterator>
template <class T>
class sequence : public std::iterator<std::forward_iterator_tag, T>
{
T val;
public:
sequence(T init) : val(init) {}
T operator *() { return val; }
sequence &operator++() { ++val; return *this; }
bool operator!=(sequence const &other) { return val != other.val; }
};
template <class T>
sequence<T> gen_seq(T const &val) {
return sequence<T>(val);
}
#endif
You'd use this something like this:
#include "gen_seq"
std::vector<int> vecInts(gen_seq(0), gen_seq(10));
Although it's open to argument that this (sort of) abuses the concept of iterators a bit, I still find it preferable on practical grounds -- it lets you create an initialized vector instead of creating an empty vector and then filling it later.
The algorithm alternative would look something like this:
template <class T, class OutIt>
void fill_seq_n(OutIt result, T num, T start = 0) {
// Writes num consecutive values, beginning at start, through result.
for (T i = start; i != start + num; ++i) {
*result = i;
++result;
}
}
...and you'd use it something like this:
std::vector<int> vecInts;
fill_seq_n(std::back_inserter(vecInts), 10);
You can also use a function object with std::generate_n, but at least IMO, this generally ends up more trouble than it's worth.
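For comparison, the std::generate_n route might look like the sketch below. The counter class and first_n_ints helper are names I made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Function object that returns start, start + 1, start + 2, ... on successive calls.
template <class T>
class counter {
    T next_;
public:
    explicit counter(T start) : next_(start) {}
    T operator()() { return next_++; }
};

// Builds a vector holding 0, 1, ..., n - 1.
inline std::vector<int> first_n_ints(int n) {
    assert(n >= 0);
    std::vector<int> v;
    std::generate_n(std::back_inserter(v), n, counter<int>(0));
    return v;
}
```

It works, but you end up writing a small class just to hold the running value, which is the extra trouble referred to above.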
As long as we're talking about things like that, I'd also replace this:
for(unsigned int ui = 0; ui < vecInts2.size(); ui++)
cout << vecInts2[ui] << endl;
...with something like this:
std::copy(vecInts2.begin(), vecInts2.end(),
std::ostream_iterator<int>(std::cout, "\n"));

In the C++03 days, getIntVector() was recommended for most cases. With returnIntVector(), the compiler might create some unnecessary temporaries.
But with return value optimization and swaptimization, most of them can be avoided. In the era of C++11, the latter (returnIntVector()) can be meaningful thanks to move semantics.

In theory, the returnIntVector function returns the vector by value, so a copy will be made and it will be more time-consuming than the function which just populates an existing vector. More memory will also be used to store the copy, but only temporarily; since vecInts is locally scoped, the vector object itself lives on the stack (its elements are heap-allocated) and everything is freed as soon as returnIntVector returns. However, as others have pointed out, a modern compiler will optimize away these inefficiencies.

returnIntVector is more time-consuming because it returns a copy of the vector, unless the vector implementation is realized with a single pointer, in which case the performance is the same.
In general, you should not rely on the implementation; use getIntVector instead.

Bin packing implementation in C++ with STL

This is my first time using this site, so sorry for any bad formatting or weird formulations. I'll try my best to conform to the rules on this site, but I might make some mistakes in the beginning.
I'm currently working on an implementation of several bin packing algorithms in C++ using the STL containers. The current code still has some logical faults that need to be fixed, but this question is more about the structure of the program. I would like a second opinion on how to structure the program to minimize the number of logical faults and make it as easy to read as possible. In its current state I just feel that this isn't the best way to do it, but I don't really see any other way to write my code right now.
The problem is a dynamic online bin packing problem. It is dynamic in the sense that items have an arbitrary time before they will leave the bin they've been assigned to.
In short my questions are:
How would the structure of a Bin packing algorithm look in C++?
Are STL containers a good tool to make the implementation able to handle inputs of arbitrary length?
How should I handle the containers in a good, easy to read and implement way?
Some thoughts about my own code:
Using classes to make a good distinction between handling the list of the different bins and the list of items in those bins.
Getting the implementation as effective as possible.
Being easy to run with a lot of different data lengths and files for benchmarking.
#include <iostream>
#include <fstream>
#include <list>
#include <queue>
#include <string>
#include <vector>
using namespace std;
struct type_item {
int size;
int life;
bool operator < (const type_item& input)
{
return size < input.size;
}
};
class Class_bin {
double load;
list<type_item> contents;
list<type_item>::iterator i;
public:
Class_bin ();
bool operator < (Class_bin);
bool full (type_item);
void push_bin (type_item);
double check_load ();
void check_dead ();
void print_bin ();
};
Class_bin::Class_bin () {
load=0.0;
}
bool Class_bin::operator < (Class_bin input){
return load < input.load;
}
bool Class_bin::full (type_item input) {
if (load+(1.0/(double) input.size)>1) {
return false;
}
else {
return true;
}
}
void Class_bin::push_bin (type_item input) {
int sum=0;
contents.push_back(input);
for (i=contents.begin(); i!=contents.end(); ++i) {
sum+=i->size;
}
load+=1.0/(double) sum;
}
double Class_bin::check_load () {
return load;
}
void Class_bin::check_dead () {
for (i=contents.begin(); i!=contents.end(); ) {
i->life--;
if (i->life==0) {
// erase() invalidates i, so continue from the iterator it returns.
i=contents.erase(i);
}
else {
++i;
}
}
}
void Class_bin::print_bin () {
for (i=contents.begin (); i!=contents.end (); ++i) {
cout << i->size << " ";
}
}
class Class_list_of_bins {
list<Class_bin> list_of_bins;
list<Class_bin>::iterator i;
public:
void push_list (type_item);
void sort_list ();
void check_dead ();
void print_list ();
private:
Class_bin new_bin (type_item);
bool comparator (type_item, type_item);
};
Class_bin Class_list_of_bins::new_bin (type_item input) {
Class_bin temp;
temp.push_bin (input);
return temp;
}
void Class_list_of_bins::push_list (type_item input) {
if (list_of_bins.empty ()) {
list_of_bins.push_front (new_bin(input));
return;
}
for (i=list_of_bins.begin (); i!=list_of_bins.end (); ++i) {
if (!i->full (input)) {
i->push_bin (input);
return;
}
}
list_of_bins.push_front (new_bin(input));
}
void Class_list_of_bins::sort_list () {
list_of_bins.sort();
}
void Class_list_of_bins::check_dead () {
for (i=list_of_bins.begin (); i !=list_of_bins.end (); ++i) {
i->check_dead ();
}
}
void Class_list_of_bins::print_list () {
for (i=list_of_bins.begin (); i!=list_of_bins.end (); ++i) {
i->print_bin ();
cout << "\n";
}
}
int main () {
int i, number_of_items;
type_item buffer;
Class_list_of_bins bins;
queue<type_item> input;
string filename;
fstream file;
cout << "Input file name: ";
cin >> filename;
cout << endl;
file.open (filename.c_str(), ios::in);
file >> number_of_items;
for (i=0; i<number_of_items; ++i) {
file >> buffer.size;
file >> buffer.life;
input.push (buffer);
}
file.close ();
while (!input.empty ()) {
buffer=input.front ();
input.pop ();
bins.push_list (buffer);
}
bins.print_list ();
return 0;
}
Note that this is just a snapshot of my code and it is not yet running properly.
I don't want to clutter this with unrelated chatter; I just want to thank the people who contributed. I will review my code and hopefully be able to structure my programs a bit better.

How would the structure of a Bin packing algorithm look in C++?
Well, ideally you would have several bin-packing algorithms, separated into different functions, which differ only by the logic of the algorithm. That algorithm should be largely independent from the representation of your data, so you can change your algorithm with only a single function call.
You can look at what the STL Algorithms have in common. Mainly, they operate on iterators instead of containers, but as I detail below, I wouldn't suggest this for you initially. You should get a feel for what algorithms are available and leverage them in your implementation.
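To make "algorithms that differ only by their logic" concrete, here is a small sketch in which first-fit and best-fit share one signature and the packing loop takes either. Everything in it is an assumption for illustration: the Bin struct, the fixed kCapacity of 10, plain int item sizes, and ignoring the item lifetime from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Bin {
    int load;                // sum of the sizes placed in this bin
    std::vector<int> items;  // item sizes, in arrival order
    Bin() : load(0) {}
};

const int kCapacity = 10;    // assumed capacity shared by all bins

// Both strategies share this signature: return the index of the chosen bin,
// or bins.size() if no existing bin has room.
typedef std::size_t (*ChooseBin)(const std::vector<Bin> &bins, int size);

inline std::size_t first_fit(const std::vector<Bin> &bins, int size) {
    for (std::size_t i = 0; i < bins.size(); ++i)
        if (bins[i].load + size <= kCapacity) return i;
    return bins.size();
}

inline std::size_t best_fit(const std::vector<Bin> &bins, int size) {
    std::size_t best = bins.size();
    for (std::size_t i = 0; i < bins.size(); ++i) {
        if (bins[i].load + size > kCapacity) continue;
        // Prefer the fullest bin that still has room.
        if (best == bins.size() || bins[i].load > bins[best].load) best = i;
    }
    return best;
}

// The packing loop does not depend on which strategy is passed in.
inline std::vector<Bin> pack(const std::vector<int> &sizes, ChooseBin choose) {
    std::vector<Bin> bins;
    for (std::size_t k = 0; k < sizes.size(); ++k) {
        assert(sizes[k] > 0 && sizes[k] <= kCapacity);
        std::size_t i = choose(bins, sizes[k]);
        if (i == bins.size()) bins.push_back(Bin());
        bins[i].items.push_back(sizes[k]);
        bins[i].load += sizes[k];
    }
    return bins;
}
```

Switching algorithms is then a matter of calling pack(sizes, first_fit) or pack(sizes, best_fit).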
Are STL containers a good tool to make the implementation able to handle inputs of arbitrary length?
It usually works like this: create a container, fill the container, apply an algorithm to the container.
Judging from the description of your requirements, that is how you'll use this, so I think it'll be fine. There's one important difference between your bin packing algorithm and most STL algorithms.
The STL algorithms are either non-modifying or insert elements into a destination. Bin packing, on the other hand, is "here's a list of bins, use them or add a new bin". It's not impossible to do this with iterators, but probably not worth the effort. I'd start by operating on the container, get a working program, back it up, then see if you can make it work with only iterators.
How should I handle the containers in a good, easy to read and implement way?
I'd take this approach, characterize your inputs and outputs:
Input: Collection of items, arbitrary length, arbitrary order.
Output: Collection of bins determined by algorithm. Each bin contains a collection of items.
Then I'd worry about "what does my algorithm need to do?"
Constantly check bins for "does this item fit?"
Your Class_bin is a good encapsulation of what is needed.
Avoid cluttering your code with unrelated stuff like "print()" - use non-member help functions.
type_item
struct type_item {
int size;
int life;
bool operator < (const type_item& input)
{
return size < input.size;
}
};
It's unclear what life (or death) is used for. I can't imagine that concept being relevant to implementing a bin-packing algorithm. Maybe it should be left out?
This is personal preference, but I don't like giving operator< to my objects. Objects are usually non-trivial and have many meanings of less-than. For example, one algorithm might want all the alive items sorted before the dead items. I typically wrap that in another struct for clarity:
struct type_item {
int size;
int life;
struct SizeIsLess {
// Note this becomes a function object, which makes it easy to use with
// STL algorithms.
bool operator() (const type_item& lhs, const type_item& rhs) const
{
return lhs.size < rhs.size;
}
};
};
vector<type_item> items;
std::sort(items.begin(), items.end(), type_item::SizeIsLess());
Class_bin
class Class_bin {
double load;
list<type_item> contents;
list<type_item>::iterator i;
public:
Class_bin ();
bool operator < (Class_bin);
bool full (type_item);
void push_bin (type_item);
double check_load ();
void check_dead ();
void print_bin ();
};
I would skip the Class_ prefix on all your types - it's just a bit excessive, and it should be clear from the code. (This is a variant of Hungarian notation. Programmers tend to be hostile towards it.)
You should not have a class member i (the iterator). It's not part of class state. If you need it in all the members, that's ok, just redeclare it there. If it's too long to type, use a typedef.
It's difficult to quantify "bin1 is less than bin2", so I'd suggest removing the operator<.
bool full(type_item) is a little misleading. I'd probably use bool can_hold(type_item). To me, bool full() would return true if there is zero space remaining.
check_load() would seem more clearly named load().
Again, it's unclear what check_dead() is supposed to accomplish.
I think you can remove print_bin and write that as a non-member function, to keep your objects cleaner.
Some people on StackOverflow would shoot me, but I'd consider just making this a struct and leaving load and the item list public. It doesn't seem like you care much about encapsulation here (you only need this object so that you don't have to recalculate the load each time).
Class_list_of_bins
class Class_list_of_bins {
list<Class_bin> list_of_bins;
list<Class_bin>::iterator i;
public:
void push_list (type_item);
void sort_list ();
void check_dead ();
void print_list ();
private:
Class_bin new_bin (type_item);
bool comparator (type_item, type_item);
};
I think you can do without this class entirely.
Conceptually, it represents a container, so just use an STL container. You can implement the methods as non-member functions. Note that sort_list can be replaced with std::sort if you store the bins in a std::vector (std::list has to use its own sort member instead).
comparator is too generic a name, it gives no indication of what it compares or why, so consider being more clear.
Overall Comments
Overall, I think the classes you've picked adequately model the space you're trying to represent, so you'll be fine.
I might structure my project like this:
#include <algorithm>
#include <iterator>
#include <list>
#include <vector>
struct bin {
double load; // sum of item sizes.
std::list<type_item> items;
bin() : load(0) { }
// Assumed bin capacity; the original does not define one, so pick what your problem needs.
static double capacity() { return 100.0; }
double free_space() const { return capacity() - load; }
};
// Returns true if the bin can fit the item passed to the constructor.
struct bin_can_fit {
bin_can_fit(const type_item &item) : item_(item) { }
bool operator()(const bin &b) const {
return item_.size <= b.free_space();
}
private:
type_item item_;
};
// ItemIter is an iterator over the items.
// BinOutputIter is an output iterator we can use to put bins.
template <typename ItemIter, typename BinOutputIter>
void bin_pack_first_fit(ItemIter curr, ItemIter end, BinOutputIter output_bins) {
std::vector<bin> bins; // Create a local bin container, to simplify life.
for (; curr != end; ++curr) {
// Use a helper predicate to check whether the bin can fit this item.
// This is untested, but just for an idea.
std::vector<bin>::iterator bin_it =
std::find_if(bins.begin(), bins.end(), bin_can_fit(*curr));
if (bin_it == bins.end()) {
// Did not find a bin with enough space, add a new bin.
bins.push_back(bin());
// push_back invalidates iterators, so reassign bin_it to the last item.
bin_it = bins.end() - 1;
}
// bin_it now points to the bin to put the item in.
bin_it->items.push_back(*curr);
bin_it->load += curr->size;
}
std::copy(bins.begin(), bins.end(), output_bins); // Apply our bins to the destination.
}
int main() {
std::vector<type_item> items;
// ... fill items
std::vector<bin> bins;
bin_pack_first_fit(items.begin(), items.end(), std::back_inserter(bins));
return 0;
}

Some thoughts:
Your names are kinda messed up in places.
You have a lot of parameters named input; that's just meaningless.
I'd expect full() to check whether it is full, not whether it can fit something else
I don't think push_bin pushes a bin
check_dead modifies the object (I'd expect something named check_*, to just tell me something about the object)
Don't put things like Class and type in the names of classes and types.
class_list_of_bins seems to describe what's inside rather than what the object is.
push_list doesn't push a list
Don't append stuff like _list to every method in a list class. If it's a list object, we already know it's a list method.
Given the parameters life and load, I'm confused as to what you are doing. The bin packing problem I'm familiar with just has sizes. I'm guessing that over time some of the objects are taken out of bins and thus go away?
Some further thoughts on your classes
Class_list_of_bins is exposing too much of itself to the outside world. Why would the outside world want to check_dead or sort_list? That's nobody's business but the object's own. The public methods you should have on that class really should be something like
* Add an item to the collection of bins
* Print solution
* Step one timestep into the future
list<Class_bin>::iterator i;
Bad, bad, bad! Don't put member variables on your class unless they are actually part of its state. You should define that iterator where it is used. If you want to save some typing, add this: typedef list<Class_bin>::iterator bin_iterator; and then use bin_iterator as the type instead.
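A small sketch of what I mean, with the iterator declared inside the loop that uses it. The Bin stub, the BinList class, and the add_bin and total_load methods are made-up names for illustration, not part of your code.

```cpp
#include <cassert>
#include <list>

// Hypothetical stub; the real bin class has more in it.
struct Bin {
    double load;
};

// Saves typing without turning the iterator into object state.
typedef std::list<Bin>::iterator bin_iterator;

class BinList {
public:
    void add_bin(double load) {
        assert(load >= 0.0);  // a negative load would make no sense
        Bin b;
        b.load = load;
        bins_.push_back(b);
    }

    double total_load() {
        double total = 0.0;
        // The iterator lives only as long as this loop needs it.
        for (bin_iterator i = bins_.begin(); i != bins_.end(); ++i) {
            total += i->load;
        }
        return total;
    }

private:
    std::list<Bin> bins_;  // the only real member state
};
```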
EXPANDED ANSWER
Here is my pseudocode:
class Item
{
Item(Istream & input)
{
read input description of item
}
double size_needed() { return actual size required (out of 1) for this item }
bool alive() { return true if object is still alive}
void do_timestep() { decrement life }
void print() { print something }
}
class Bin
{
vector of Items
double remaining_space
bool can_add(Item item) { return true if we have enough space}
void add(Item item) {add item to vector of items, update remaining space}
void do_timestep() {call do_timestep() on all Items, remove all items which indicate they are dead, updating remaining_space as you go}
void print() { print all the contents }
}
class BinCollection
{
void do_timestep() { call do_timestep on all of the bins }
void add(Item item) { find first bin for which can_add returns true, then add it, create a new bin if necessary }
void print() { print all the bins }
}
Some quick notes:
In your code, you converted the int size to a float repeatedly, which is not a good idea. In my design that conversion is localized to one place.
You'll note that the logic relating to a single item is now contained inside the item itself. Other objects can only see what's important to them: size_needed and whether the object is still alive.
I've not included anything about sorting stuff because I'm not clear what that is for in a first-fit algorithm.
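For what it's worth, the pseudocode above could be fleshed out roughly like this. It is only a sketch under a few assumptions of mine: an item is built directly from a size (as a fraction of one bin) and a lifetime in timesteps instead of being read from a stream, every bin has capacity 1, and the item_count and bin_count helpers are additions that just make the result easy to inspect.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <vector>

class Item {
public:
    // Assumption: size is a fraction of one bin, life is in timesteps.
    Item(double size, int life) : size_(size), life_(life) {}
    double size_needed() const { return size_; }
    bool alive() const { return life_ > 0; }
    void do_timestep() { --life_; }
    void print(std::ostream& out) const {
        out << "(size " << size_ << ", life " << life_ << ")";
    }
private:
    double size_;
    int life_;
};

class Bin {
public:
    Bin() : remaining_space_(1.0) {}
    bool can_add(const Item& item) const {
        return item.size_needed() <= remaining_space_;
    }
    void add(const Item& item) {
        assert(can_add(item));
        items_.push_back(item);
        remaining_space_ -= item.size_needed();
    }
    // Age every item and drop the dead ones, freeing their space.
    void do_timestep() {
        std::vector<Item> survivors;
        for (std::size_t i = 0; i < items_.size(); ++i) {
            items_[i].do_timestep();
            if (items_[i].alive()) {
                survivors.push_back(items_[i]);
            } else {
                remaining_space_ += items_[i].size_needed();
            }
        }
        items_.swap(survivors);
    }
    std::size_t item_count() const { return items_.size(); }
    void print(std::ostream& out) const {
        for (std::size_t i = 0; i < items_.size(); ++i) {
            items_[i].print(out);
            out << ' ';
        }
        out << '\n';
    }
private:
    std::vector<Item> items_;
    double remaining_space_;
};

class BinCollection {
public:
    // First fit: use the first bin with room, otherwise open a new one.
    // Assumes no single item is larger than an empty bin.
    void add(const Item& item) {
        for (std::size_t i = 0; i < bins_.size(); ++i) {
            if (bins_[i].can_add(item)) {
                bins_[i].add(item);
                return;
            }
        }
        bins_.push_back(Bin());
        bins_.back().add(item);
    }
    void do_timestep() {
        for (std::size_t i = 0; i < bins_.size(); ++i) {
            bins_[i].do_timestep();
        }
    }
    std::size_t bin_count() const { return bins_.size(); }
    void print(std::ostream& out) const {
        for (std::size_t i = 0; i < bins_.size(); ++i) {
            bins_[i].print(out);
        }
    }
private:
    std::vector<Bin> bins_;
};
```

Note that this sketch never removes a bin once it is empty, and that repeatedly adding and subtracting doubles in remaining_space_ can accumulate rounding error; both are things to decide on for the real program.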

This interview gives some great insight into the rationale behind the STL. This may give you some inspiration on how to implement your algorithms the STL-way.
