sorting a list - the best way possible - c++

I have a list
std::list<Selector> _selectorList;
I parse some input and get back a smart pointer and a priority associated with that smart pointer. I implemented a structure to hold these two values, as shown below.
struct Selector
{
int priority;
SmartPointer *selector;
};
The parsing will be done n times, so n instances of the struct will be pushed back into the list. At the end I am supposed to sort the list in decreasing order of the priority member of the structure. Currently, I plan to do this.
_selectorList.sort();
Is there any better approach than this, provided that I must use a list (only and nothing else) to store the smart pointers returned by parsing?

As larsman told you, using a pointer to a SmartPointer is very probably wrong. Smart pointers are used to avoid memory leaks, and the reference counter is updated when the smart pointer object itself is copied or assigned, so a SmartPointer * is probably useless.
As for a better approach, you can use std::list::sort instead of implementing your own sort operation. The only thing you need to do is give Selector a comparison operator (or pass a comparison function to sort) so that the list can be sorted.
Have a look here.
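A minimal sketch of that suggestion, assuming the priority is a plain int and using an empty SmartPointer struct as a hypothetical stand-in for the real smart pointer type. The comparator name higherPriorityFirst is made up for illustration; passing it to std::list::sort gives the decreasing order the question asks for.

```cpp
#include <cassert>
#include <list>

// Hypothetical stand-in for the question's smart pointer type.
struct SmartPointer {};

struct Selector {
    int priority;
    SmartPointer selector;  // stored by value, not as SmartPointer*
};

// Comparator that orders higher priorities first.
inline bool higherPriorityFirst(const Selector& a, const Selector& b) {
    return a.priority > b.priority;
}

// Small usage check: after sorting, priorities are in decreasing order.
inline void sortExample() {
    std::list<Selector> selectorList;
    selectorList.push_back(Selector{2, SmartPointer{}});
    selectorList.push_back(Selector{7, SmartPointer{}});
    selectorList.push_back(Selector{5, SmartPointer{}});

    selectorList.sort(higherPriorityFirst);

    assert(selectorList.front().priority == 7);
    assert(selectorList.back().priority == 2);
}
```

Defining operator< on Selector and calling sort() with no argument would also work, but it sorts in increasing order unless the operator itself is inverted, which tends to surprise readers. A named comparator keeps the intent visible.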

Related

Using raw pointers without allocating memory

I would like to ask about my approach to using raw pointers without allocating any memory through them. I am working on an application that simulates a classical cash desk. I have a class CashDesk, which contains a vector of Items and a vector of Orders, the classes that represent items and orders. Furthermore, I want the Order class to contain a vector of pointers to Item. I don't want to store the same object multiple times in different orders, because that makes no sense to me. Through the pointers in Order, I only want to access the properties of Item; no memory is allocated through the pointers.
Simplified code:
class CashDesk {
vector<Item> items;
vector<Order> orders;
};
class Order {
vector<Item*> ItemsInOrder;
};
Class Item containing only structured data – information about the Item.
I create all objects at the level of the CashDesk class – create instance of Item when needed and push it to items vector.
I have been told that I should avoid using raw pointers unless there is no other option. The important thing is that I don't allocate any memory through the pointers. I really only use each pointer to point at an object and access its properties. Should I rather use something like unique_ptr, or a completely different approach?
Thanks for any response.
I have been told that I should avoid using raw pointers unless there is no another option.
You have been told something subtly wrong. You should avoid owning raw pointers, but non-owning raw pointers are perfectly fine.
You will have to ensure that the elements of Order::itemsInOrder aren't invalidated by operations on CashDesk::items, but that co-ordination should be within the private parts of CashDesk.
You could be more explicit about the lack of ownership semantics by using std::vector<Item>::iterator in place of Item *, but that doesn't change any behaviour (a conforming implementation may implement std::vector<Item>::iterator as an alias of Item *).
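One way to picture the co-ordination mentioned above is the sketch below. It is only an illustration under assumptions: the members name and price, the functions addItem, newOrder and total, and the use of std::deque instead of std::vector are all my own choices, not part of the question. std::deque is swapped in because its push_back does not invalidate pointers to existing elements, whereas a growing std::vector may.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <vector>

struct Item {
    std::string name;  // hypothetical fields for illustration
    double price;
};

struct Order {
    std::vector<const Item*> itemsInOrder;  // non-owning: CashDesk owns the items

    double total() const {
        double sum = 0.0;
        for (const Item* item : itemsInOrder) sum += item->price;
        return sum;
    }
};

class CashDesk {
public:
    // Returns a non-owning pointer to the stored item. push_back on a
    // std::deque leaves pointers to existing elements valid.
    const Item* addItem(const std::string& name, double price) {
        items_.push_back(Item{name, price});
        return &items_.back();
    }

    Order& newOrder() {
        orders_.push_back(Order{});
        return orders_.back();
    }

private:
    std::deque<Item> items_;
    std::deque<Order> orders_;
};

// Small usage check: the same Item can appear in an order twice without
// being stored twice.
inline void cashDeskExample() {
    CashDesk desk;
    const Item* apple = desk.addItem("apple", 1.5);
    const Item* bread = desk.addItem("bread", 2.5);
    Order& order = desk.newOrder();
    order.itemsInOrder.push_back(apple);
    order.itemsInOrder.push_back(bread);
    order.itemsInOrder.push_back(apple);
    assert(order.total() == 5.5);
}
```

If items ever need to be removed from the desk, the pointers held by orders would dangle, so that operation would have to update or forbid the affected orders. Keeping both containers private to CashDesk makes that rule enforceable in one place.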

Referencing and inserting into a vector of list<nodes>

I am implementing a hashtable and am having trouble with the implementation. After literally hours of googling this one thing, I've given up and was hoping I could get some help here. The biggest issue has to do with the use of vectors in the HashTable (it doesn't make sense to me, I would rather just use list<>, but using a vector is required).
My main issue is to do with how to implement the insert function to add to the HashTable.
void HashTable::insert(ulint key,ulint value){ //insert data associated with key
HashNode nodeToAdd;
nodeToAdd.assign(key, value);
int index = hash_function(key);
this->table[index].push_back(nodeToAdd);
}
Now the issue I'm having is adding the HashNode to my HashTable.
For reference, in HashTable the field for the table is
typedef vector <list<HashNode> > Table;
Table *table;
So by my understanding
this->table[index].push_back(nodeToAdd);
goes to the vector element at HashTable[index], which should be a list, and then push_backs the new node into that list.
However, when I compile, I'm hit by an error (no matching function to call) and I don't understand why.
Your list stores objects of type HashNode, not type HashNode*.
So you need to decide which of those you want to use, and change the code accordingly.
If you want to keep storing HashNode, then your insert is wrong -- it should instead create the node on the stack and store it by value in the list.
If you want to store a pointer, then your Table type is wrong, and should instead be vector<list<HashNode*>> -- note it should be managed carefully since the pointers will not be automatically deleted.
Personally, I'd suggest you go with #1 and save yourself a whole lot of headaches. But if you insist on #2, then I suggest you stop using malloc and use new -- or better yet use std::unique_ptr or std::shared_ptr for automatic lifetime management.
Also noteworthy is your definition Table *table. This is baffling, since Table is a vector. Your insert function is dereferencing this pointer, expecting it to perhaps point to an array of Table values, when it's quite clear you actually think it's a vector. I'm pretty sure you don't want that to be a pointer.
Since I only just noticed that detail, I imagine that's the first source of your error, since table[index] is actually type Table, not type list<HashNode> and you were trying to call the non-existent function vector<list<HashNode>>::push_back(HashNode*).
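To make option #1 concrete, here is a minimal sketch with the table held by value and the nodes stored by value in each bucket. Several details are assumptions on my part: ulint is taken to be unsigned long, HashNode is a plain struct with key and value (the question's assign function is not shown, so it is not used), the hash is a simple modulo, and the find function is added only so the result can be checked.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

typedef unsigned long ulint;  // assumed definition

struct HashNode {
    ulint key;
    ulint value;
};

class HashTable {
public:
    // Assumes buckets > 0.
    explicit HashTable(std::size_t buckets) : table(buckets) {}

    void insert(ulint key, ulint value) {
        HashNode node = {key, value};               // created on the stack
        table[hash_function(key)].push_back(node);  // copied into the bucket's list
    }

    // Returns a pointer to the stored value, or nullptr if the key is absent.
    const ulint* find(ulint key) const {
        const std::list<HashNode>& bucket = table[hash_function(key)];
        for (const HashNode& node : bucket)
            if (node.key == key) return &node.value;
        return nullptr;
    }

private:
    typedef std::vector<std::list<HashNode> > Table;
    Table table;  // a Table, not a Table*

    // Placeholder hash for illustration only.
    std::size_t hash_function(ulint key) const { return key % table.size(); }
};

// Small usage check: two keys that collide end up in the same bucket's list.
inline void hashTableExample() {
    HashTable t(8);
    t.insert(3, 30);
    t.insert(11, 110);  // 11 % 8 == 3, same bucket as key 3
    assert(*t.find(3) == 30);
    assert(*t.find(11) == 110);
    assert(t.find(19) == nullptr);
}
```

With table declared as a Table rather than a Table*, table[index] really is a list<HashNode>, so push_back(node) resolves as the question expected.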

Remove item from a list using its pointer

I have a pointer p (not an iterator) to an item in a list. Can I then use p to delete (erase) the item from the list? Something like:
mylist.erase(p);
So far I have only been able to do this by iterating through the list until I reach an item at the location p, and then using the erase method, which seems very inefficient.
Nope, you'll have to use an iterator. I don't get why getting the pointer is easier than getting an iterator though...
A std::list is not associative so there's no way you can use a pointer as a key to simply delete a specific element directly.
The fact that you find yourself in this situation points rather to questionable design since you're correct that the only way to remove the item from the collection as it stands is by iterating over it completely (i.e. linear complexity)
The following may be worth considering:
1. If possible, you could change the list to a std::multiset (assuming there are duplicate items), which will make direct access more efficient.
2. If the design allows, change the item that you're pointing to so that it incorporates a 'deleted' flag (or use a template to provide this), allowing you to avoid deleting the object from the collection but quickly mark it as deleted. The drawback is that all your software will have to change to accommodate this convention.
3. If this is the only bit of linear searching and the collection is not big (fewer than 20 items, say), then for the sake of expediency just do the linear search as you've suggested, but leave a big comment in the code indicating that you "completely get" how inefficient this is. You may find that this does not become a tangible issue for a while, if ever.
I'm guessing that 3 is probably your best option. :)
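For option 3, the linear search can be kept in one small helper. This is a sketch, not a standard facility: eraseByPointer is a made-up name, and it simply compares element addresses with std::find_if before calling erase.

```cpp
#include <algorithm>
#include <cassert>
#include <list>

// Erases the element whose address is p and returns true if it was found.
// Deliberately linear in the size of the list.
template <class T>
bool eraseByPointer(std::list<T>& items, const T* p) {
    typename std::list<T>::iterator it = std::find_if(
        items.begin(), items.end(),
        [p](const T& element) { return &element == p; });
    if (it == items.end()) return false;
    items.erase(it);
    return true;
}

// Small usage check: remove the middle element through its address.
inline void eraseExample() {
    std::list<int> mylist;
    mylist.push_back(10);
    mylist.push_back(20);
    mylist.push_back(30);

    std::list<int>::iterator second = mylist.begin();
    ++second;
    const int* p = &*second;

    assert(eraseByPointer(mylist, p));
    assert(mylist.size() == 2);
    assert(mylist.front() == 10 && mylist.back() == 30);
}
```

If the code that hands out p can hand out a std::list iterator instead, erase becomes constant time, since list iterators stay valid until their own element is erased.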
This is not what I advise doing, but just to answer the question:
Read only if you are ready to go into forbidden world of undefined behavior and non-portability:
There is non-portable way to make an iterator from T* pointer to an element in a list<T>. You need to look into your std library list header file. For Gnu g++ it includes stl_list.h where std::list definition is. Most typically std::list<T> consists of nodes similar to this:
template <class T>
struct Node {
T item;
Node* prev;
Node* next;
};
Given a pointer to Node<T>::item, you can use offsetof to calculate the pointer to the node itself. Be aware that this Node template could be a private part of std::list, so you must hack around this, let's say by defining an identical struct template with a different name. std::list<>::iterator is just a wrapper over this node.
It cannot be done.
I have a similar problem in that I'm using epoll_wait and processing a list of events. The events structure only contains a union, of which the most obvious type to use is void * to indicate which data is relevant (including the file descriptor) that was found.
It seems really silly that std::list will not allow you to remove an element via a pointer since there is obviously a next and previous pointer.
I'm considering going back to using the Linux kernel LIST macros instead to get around this. The problem with too much abstraction is that you have to give up on interoperability and communication with lower level apis.

What is the difference and benefits of these two lines of code?

I have two lines of code I want explained a bit please. As much as you can tell me. Mainly the benefits of each and what is happening behind the scenes with memory and such.
Here are two structs as an example:
struct Employee
{
std::string firstname, lastname;
char middleInitial;
Date hiringDate; // another struct, not important for example
short department;
};
struct Manager
{
Employee emp; // manager employee record
list<Employee*>group; // people managed
};
Which is better to use out of these two in the above struct and why?
list<Employee*>group;
list<Employee>group;
First of all, std::list is a doubly-linked list. So both those statements are creating a linked list of employees.
list<Employee*> group;
This creates a list of pointers to Employee objects. In this case there needs to be some other code to allocate each employee before you can add it to the list. Similarly, each employee must be deleted separately, std::list will not do this for you. If the list of employees is to be shared with some other entity this would make sense. It'd probably be better to place the employee in a smart pointer class to prevent memory leaks. Something like
typedef std::list<std::shared_ptr<Employee>> EmployeeList;
EmployeeList group;
This line
list<Employee>group;
creates a list of Employee objects by value. Here you can construct Employee objects on the stack, add them to the list and not have to worry about memory allocation. This makes sense if the employee list is not shared with anything else.
One is a list of pointers and the other is a list of objects. If you've already allocated the objects, the first makes sense.
You probably want to use the first one (the list of pointers) if the "people managed" are also stored in another location. To elaborate: if you also have a global list of companyEmployees you probably want to have pointers, as you want to share the object representing an employee between the locations (so that, for example, if you update the name the change is "seen" from both locations).
If instead you only want to know "why a list of structs instead of a list of pointers", the answer is: better memory locality and no need to de-allocate the individual Employee objects. Be careful, though, that every assignment to or from a list node (for example, through an iterator and its * operator) copies the whole struct and not just a pointer.
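The difference between sharing and copying can be seen in a few lines. This sketch assumes C++11 and uses std::shared_ptr in place of raw pointers; the trimmed-down Employee and the list named companyEmployees are hypothetical, taken from the wording above rather than from the question's code.

```cpp
#include <cassert>
#include <list>
#include <memory>
#include <string>

struct Employee {
    std::string firstname, lastname;  // trimmed down for the example
};

// Small usage check: a change made through one list of pointers is visible
// through the other, while a list of values holds an independent copy.
inline void sharingExample() {
    std::shared_ptr<Employee> alice = std::make_shared<Employee>();
    alice->lastname = "Smith";

    std::list<std::shared_ptr<Employee>> companyEmployees;  // hypothetical global list
    std::list<std::shared_ptr<Employee>> group;             // people managed
    companyEmployees.push_back(alice);
    group.push_back(alice);

    std::list<Employee> copies;
    copies.push_back(*alice);  // copies the whole struct

    companyEmployees.front()->lastname = "Jones";

    assert(group.front()->lastname == "Jones");  // same object, change is seen
    assert(copies.front().lastname == "Smith");  // separate copy, unchanged
}
```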
The first one stores the objects by pointer. In this case you need to carefully document who owns the allocated memory and who's responsible for cleaning it up when done. The second one stores the objects by value and has full control of their lifespan.
Which one to use depends on context you haven't given in your question although I favor the second slightly as a default because it doesn't leave open the possibility of mismanaging your memory.
But after all that, carefully consider if list is actually the right container choice for you. Typically it's a low-priority container that satisfies very specific needs. I almost always favor vector and deque first for random access containers, or set and map for ordered containers.
If you do need to store pointers in the container, boost provides ptr-container classes that manage the memory for you, or I suggest storing some sort of smart pointer so that the memory is cleaned up automatically when the object isn't needed anymore.
A lot depends on what you are doing. For starters, do you really want
Manager to contain an Employee, rather than to be one: the classical
example of a manager (one of the classic OO examples) would be:
struct Manager : public Employee
{
list<Employee*> group;
};
Otherwise, you have the problem that you cannot put managers into the
group of another manager; you're limited to one level in the management
hierarchy.
The second point is that in order to make an intelligent decision, you
have to understand the role of Employee in the program. If Employee
is just a value: some hard data, typically immutable (except by
assignment of a complete Employee), then list<Employee> group is
definitely to be preferred: don't use pointers unless you have to. If
Employee is an "entity", which models some external entity (say an
employee of the firm), you would generally make it uncopyable and
unassignable, and use list<Employee*> (with some sort of mechanism to
inform the Manager when the employee is fired, and the pointed to
object is deleted). If managers are employees, and you don't want to
lose this fact when they are added to a group, then you have to use the
pointer version: polymorphism requires pointers or references to work
(and you can't have a container of references).
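A short sketch of that last point, under the assumption that Employee has at least one virtual function. The role() member is hypothetical and exists only to make the dynamic type observable.

```cpp
#include <cassert>
#include <list>
#include <string>

struct Employee {
    virtual ~Employee() {}
    virtual std::string role() const { return "employee"; }  // hypothetical member
};

struct Manager : public Employee {
    std::list<Employee*> group;  // non-owning pointers to the people managed
    std::string role() const override { return "manager"; }
};

// Small usage check: a pointer keeps the dynamic type, a stored value does not.
inline void polymorphismExample() {
    Employee worker;
    Manager teamLead;
    Manager director;

    // A manager can be a member of another manager's group.
    director.group.push_back(&worker);
    director.group.push_back(&teamLead);
    assert(director.group.front()->role() == "employee");
    assert(director.group.back()->role() == "manager");

    // Storing by value copies only the Employee part (slicing).
    std::list<Employee> byValue;
    byValue.push_back(teamLead);
    assert(byValue.back().role() == "employee");
}
```

The objects here live on the stack and outlive the group, so nothing needs deleting. With dynamically allocated employees, the ownership question from the other answers comes back.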
The two lists are good, but they will require a completely different handling.
list<Employee*>group;
is a list of pointers to objects of type Employee and you will store there pointers to objects allocated dynamically, and you will need to be particularly clear as to who will delete those objects.
list<Employee>group;
is a list of objects of type Employee; you get the benefit (and associated cost in terms of performance) of dealing with concrete instances that you do not need to memory manage yourself.
Specifically, one of the advantages of using std::list compared to a plain array, is that you can have a list of objects and avoid the cost and risks of dealing with dynamic memory allocation and pointers.
With a list of objects, you can do, e. g.
Employee a;                           // object allocated on the stack
group.push_back(a);                   // the list makes a copy for you
Employee* b = new Employee(/*...*/);  // object allocated dynamically
group.push_back(*b);                  // the pointed-to object is copied
delete b;                             // safe: the list holds its own copy
With a list of pointers you are, in practice, forced to always use dynamic allocation, or to refer to objects whose lifetime is longer than the list's (if you can guarantee it).
By using a std::list of pointers, you are more or less in the same situation as when using a plain array of pointers as far as memory management is concerned. The only advantage you get is that the list can grow dynamically without effort on your part.
I personally don't see much sense in using a list of pointers; basically, because I think that pointers should be used (always, when possible) through smart pointers. So, if you really need pointers, you will be better off, IMO, using a list of smart pointers provided by boost.
Use the first one if you're allocating or accessing the structures separately.
Use the second one if you'll only be allocating/accessing them through the list.
First one defines a list of pointers to objects, the second a list of objects.
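The first version (with pointers) is preferred by many programmers.
The main reason is that STL containers copy elements by value, so copying a pointer is cheaper than copying a whole struct during sorting and internal reallocation. Note that this applies mainly to containers such as vector; std::list::sort relinks nodes rather than copying elements, and a list never reallocates its elements.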
You probably want to use unique_ptr<> or shared_ptr<> rather than plain old * pointers (auto_ptr<> is not safe to store in standard containers and was removed in C++17). This goes some, if not the whole, way toward getting the expected usage without most of the memory management issues that come with raw pointers.

Avoid making copies with vectors of vectors

I want to be able to have a vector of vectors of some type such as:
vector<vector<MyStruct> > vecOfVec;
I then create a vector of MyStruct, and populate it.
vector<MyStruct> someStructs;
// Populate it with data
Then finally add someStructs to vecOfVec;
vecOfVec.push_back(someStructs);
What I want to do is avoid having the copy constructor calls when pushing the vector. I know this can be accomplished by using a vector of pointers, but I'd like to avoid that if possible.
One strategy I've thought of seems to work, but I don't know if I'm over-engineering this problem.
// Push back an empty vector
vecOfVec.push_back(vector<MyStruct>());
// Swap the empty with the filled vector (constant time)
vecOfVec.back().swap(someStructs);
This seems like it would add my vector without having to do any copies, but this seems like something a compiler would already be doing during optimization.
Do you think this is a good strategy?
Edit: Simplified my swap statement due to some suggestions.
The swap trick is as good as it gets with C++03. In C++11 (called C++0x when this was written), you can use the vector's move constructor via std::move to achieve the same thing in a more obvious way.
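A minimal sketch of the std::move version, assuming a C++11 compiler and a trivial MyStruct invented for the example. The check on data() relies on the usual behaviour of the standard allocator, where the moved-into vector takes over the existing buffer.

```cpp
#include <cassert>
#include <utility>
#include <vector>

struct MyStruct { int value; };  // hypothetical payload

// Small usage check: moving hands over the buffer instead of copying elements.
inline void moveExample() {
    std::vector<std::vector<MyStruct>> vecOfVec;

    std::vector<MyStruct> someStructs(3);         // three value-initialized structs
    const MyStruct* buffer = someStructs.data();  // remember where the data lives

    vecOfVec.push_back(std::move(someStructs));   // selects push_back(T&&)

    assert(vecOfVec.back().size() == 3);
    assert(vecOfVec.back().data() == buffer);     // same storage, no element copies
}
```

After the move, someStructs is left in a valid but unspecified state, so it should be cleared or reassigned before being reused.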
Another option is to not create a separate vector<MyStruct>, but instead have the code that creates it accept a vector<MyStruct>& argument and operate on that. Then you add a new empty element to your outer vector<vector<MyStruct>> and pass a reference to it to the code that will fill it.
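The fill-in-place idea could look roughly like this. fillStructs and its count parameter are hypothetical stand-ins for whatever code currently populates someStructs. The sketch sticks to C++03-style syntax apart from the assert-based check.

```cpp
#include <cassert>
#include <vector>

struct MyStruct { int value; };  // hypothetical payload

// Hypothetical producer that fills a caller-provided vector.
inline void fillStructs(std::vector<MyStruct>& out, int count) {
    for (int i = 0; i < count; ++i) {
        MyStruct s = {i};
        out.push_back(s);
    }
}

// Small usage check: the inner vector is built directly inside the outer one.
inline void fillInPlaceExample() {
    std::vector<std::vector<MyStruct> > vecOfVec;

    vecOfVec.push_back(std::vector<MyStruct>());  // cheap: the new element is empty
    fillStructs(vecOfVec.back(), 3);              // fill it where it already lives

    assert(vecOfVec.size() == 1);
    assert(vecOfVec.back().size() == 3);
    assert(vecOfVec.back()[2].value == 2);
}
```

The reference returned by back() is only valid until the outer vector grows again, so the fill should finish before the next push_back on vecOfVec.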
I know this can be accomplished by
using a vector of pointers, but I'd
like to avoid that if possible.
Why?
That would be the most intuitive/readable/maintainable solution and would be much better than any weird hacks anyone comes up with (such as the swap you show).
Tim,
There's a common pattern to solve this. This is called smart pointers, and the best one to use is boost::shared_ptr.
Then, never pass the vector by value or store it by value. Instead, store boost::shared_ptr<vector<MyStruct> >. You don't need to care about allocations/deallocations (when the containing vector is destroyed, so will be the others, just as in your code), and you can access the inner members almost the same way. The copy is, however, avoided by means of the smart pointer object's reference counting mechanism.
Let me show you how.
using boost::shared_ptr;
vector<shared_ptr<vector<MyStruct> > > vecOfVecs;
shared_ptr<vector<MyStruct> > someStructs(new vector<MyStruct>);
// fill in the vector someStructs
someStructs->push_back(MyStruct()); // or some other struct, as you usually do
//...
vecOfVecs.push_back(someStructs); // Look! No copy of the vector, only of the shared_ptr
If you do not already use boost::shared_ptr, I recommend downloading it from boost.org rather than implementing your own. It is a really irreplaceable tool, and it has since been added to the C++ standard library as std::shared_ptr (C++11).
You can either do something like vect.push_back(vector<MyStruct>()); and then vect.back().push_back(MyStruct()); or use smart pointers and have a vector of smart pointers to vector<MyStruct>.
I think the swap idea is already fine, but can be written much easier:
vecOfVec.push_back(vector<MyStruct>());
vecOfVec.back().swap(someStructs);