Related
As a beginner in C++, I am practicing C++ with an algorithm assignment. Along the way, I have some questions that I have difficulty getting through. Pardon me if the questions sound entry-level since I am still learning.
The goal is to find collinear points in a given vector of points with three classes. The following briefly describes the three classes' purposes:
Point: Representing a point with x and y values.
LineSegment: Representing a line segment with two points at ends.
Collinear: Containing the segments found in a vector of Points. The main part of the algorithm.
So, I would expect the client code to look like this:
std::vector<Point> points; // may become huge
// populate points
// ...
Collinear collinear_points(points);
std::vector<LineSegment> segments_in_points = collinear_points.GetSegments();
Since the class Collinear depends on a certain vector of points to get the segments correspondingly, I think it would need it as a data member. The question that keeps haunting me is, should it hold a copy of the vector or hold a raw pointer/reference to the vector outside the object. I think a smart pointer would be an overkill here. According to the old answer here, maybe it is better to go with reference which also avoids potential expensive copying? What is the common practice for this kind of dependency between classes if any exists?
If points gets modified after the construction of collinear_points, then the data collinear_points is referencing will be inconsistent with the segments it contains. Is it common to leave the responsibility to users for making sure the validness of an object depending on other ones? Is there a way to let collinear_points know the content has been modified and put it in an invalid state?
To answer your actual question from the title: A non-owning raw pointer would still be the usual choice, mainly because that’s what we’ve been doing since the old C days. Of course, a pointer has the problem that it can be nullptr. So using a reference communicates more clearly that null is not an allowed value. Because of that I tend to use the reference, although it still feels a tiny bit weird even to myself. But imo it’s the better design decision overall
That said, I believe the real question here is one of ownership. If Collinear does not own the vector, the user of your API has to make sure that the vector lives at least as long as the associated Collinear object. Otherwise you’ll access a dangling pointer/reference and things tend to go downhill from there. ;)
Is there a way to let collinear_points know the content has been modified and put it in an invalid state?
Yes, there is. Own everything. That includes the points vector and the segments vector. Following this approach Collinear could look something like this:
class Collinear {
public:
// usings because I’m a lazy typer
using PointsVec = std::vector<Point>;
using SegmentsVec = std::vector<LineSegment>;
// Take ownership of the points vector by either copying
// or moving it into a member.
explicit Collinear(const PointsVec& p): m_points(p) {}
explicit Collinear(PointsVec&& p): m_points(std::move(p)) {}
// Retain ownership of the segments vector by returning a const ref.
const SegmentsVec& GetSegments(); // Check if you can make it const.
// access functions for the two vectors ...
private:
PointsVec m_points;
SegmentsVec m_segments;
}
Now Collinear controls access to the points vector. You’ll have to write the functions for the permitted operations as members of Collinear. The important thing is never to return a non-const pointer or non-const ref to m_points, because then you might miss write accesses.
The segments vector is similar. Your provide the write access member functions and Collinear retains ownership, which means it can re-calculate it when necessary and the user doesn’t need to be concerned with that. Depending on how expensive the calculation is you can now go wild with lazy evaluation and every optimization you can think of.
There is a completely different design approach, though. Own nothing. Does Collinear have to be a class at all? Could it be a bunch of free functions in a namespace?
namespace Collinear {
std::vector<LineSegment> GetSegments(const std::vector<Point>& points);
}
// ...
auto segments_in_points = Collinear::GetSegments(points);
That’s the opposite of the own-everything approach. Before, you had full control. Now your user has full control. On the other hand, they now have to take care of any laziness/optimizations/update detection.
Which approach is appropriate is a question of a) API design philosophy and b) your conrete situation. What are your users? What do they expect? Which approach makes their lives easier? Since this is an assignment, you probably won’t have any real users. So imagine a group of people that might want to use your code and decide based on that. Or just use the approach you’ll have more fun implementing. The important thing imo: Pick one of the two approaches. Don’t mix them, because such an API is inconsistent. That increases confusion, decreases ease of use, and makes errors more likely.
Btw:
I think a smart pointer would be an overkill here.
Using smart pointers is not a question of overkill. It’s a question of ownership. If you have an owning pointer never use a raw pointer for it. … Unless a legacy API forces you to. Even then it’s a great idea to mark it as owning with a transparent wrapper like gsl::owner<T>.
The question that keeps haunting me is, should it hold a copy of the vector or hold a raw pointer/reference to the vector outside the object
I think you should keep a copy of vector here to keep things generic.You don't want your collinear class depend on a particular vector<Points>, instead it should be like while creating instance of collinear class you just tell it on what vector<Points>, it has to work on.Then if you change this vector and you want collinear also to work on this data set(which you might not want), its your responsibility to tell collinear to work on new data set.If you want collinear to be updated automatically when you update vector<points>, you can do so, but you have to answer questions like what happens to the state of collinear(which would be depending on the vector<points>) when the data set changes.
I am trying to write a simple game using C++ and SDL. My question is, what is the best practice to store class member variables.
MyObject obj;
MyObject* obj;
I read a lot about eliminating pointers as much as possible in similar questions, but I remember that few years back in some books I read they used it a lot (for all non trivial objects) . Another thing is that SDL returns pointers in many of its functions and therefor I would have to use "*" a lot when working with SDL objects.
Also am I right when I think the only way to initialize the first one using other than default constructor is through initializer list?
Generally, using value members is preferred over pointer members. However, there are some exceptions, e.g. (this list is probably incomplete and only contains reason I could come up with immediately):
When the members are huge (use sizeof(MyObject) to find out), the difference often doesn't matter for the access and stack size may be a concern.
When the objects come from another source, e.g., when there are factory function creating pointers, there is often no alternative to store the objects.
If the dynamic type of the object isn't known, using a pointer is generally the only alternative. However, this shouldn't be as common as it often is.
When there are more complicated relations than direct owner, e.g., if an object is shared between different objects, using a pointer is the most reasonable approach.
In all of these case you wouldn't use a pointer directly but rather a suitable smart pointer. For example, for 1. you might want to use a std::unique_ptr<MyObject> and for 4. a std::shared_ptr<MyObject> is the best alternative. For 2. you might need to use one of these smart pointer templates combined with a suitable deleter function to deal with the appropriate clean-up (e.g. for a FILE* obtained from fopen() you'd use fclose() as a deleter function; of course, this is a made up example as in C++ you would use I/O streams anyway).
In general, I normally initialize my objects entirely in the member initializer list, independent on how the members are represented exactly. However, yes, if you member objects require constructor arguments, these need to be passed from a member initializer list.
First I would like to say that I completely agree with Dietmar Kühl and Mats Petersson answer. However, you have also to take on account that SDL is a pure C library where the majority of the API functions expect C pointers of structs that can own big chunks of data. So you should not allocate them on stack (you shoud use new operator to allocate them on the heap). Furthermore, because C language does not contain smart pointers, you need to use std::unique_ptr::get() to recover the C pointer that std::unique_ptr owns before sending it to SDL API functions. This can be quite dangerous because you have to make sure that the std::unique_ptr does not get out of scope while SDL is using the C pointer (similar problem with std::share_ptr). Otherwise you will get seg fault because std::unique_ptr will delete the C pointer while SDL is using it.
Whenever you need to call pure C libraries inside a C++ program, I recommend the use of RAII. The main idea is that you create a small wrapper class that owns the C pointer and also calls the SDL API functions for you. Then you use the class destructor to delete all your C pointers.
Example:
class SDLAudioWrap {
public:
SDLAudioWrap() { // constructor
// allocate SDL_AudioSpec
}
~SDLAudioWrap() { // destructor
// free SDL_AudioSpec
}
// here you wrap all SDL API functions that involve
// SDL_AudioSpec and that you will use in your program
// It is quite simple
void SDL_do_some_stuff() {
SDL_do_some_stuff(ptr); // original C function
// SDL_do_some_stuff(SDL_AudioSpec* ptr)
}
private:
SDL_AudioSpec* ptr;
}
Now your program is exception safe and you don't have the possible issue of having smart pointers deleting your C pointer while SDL is using it.
UPDATE 1: I forget to mention that because SDL is a C library, you will need a custom deleter class in order to proper manage their C structs using smart pointers.
Concrete example: GSL GNU scientific library. Integration routine requires the allocation of a struct called "gsl_integration_workspace". In this case, you can use the following code to ensure that your code is exception safe
auto deleter= [](gsl_integration_workspace* ptr) {
gsl_integration_workspace_free(ptr);
};
std::unique_ptr<gsl_integration_workspace, decltype(deleter)> ptr4 (
gsl_integration_workspace_alloc (2000), deleter);
Another reason why I prefer wrapper classes
In case of initialization, it depends on what the options are, but yes, a common way is to use an initializer list.
The "don't use pointers unless you have to" is good advice in general. Of course, there are times when you have to - for example when an object is being returned by an API!
Also, using new will waste quite a bit of memory and CPU-time if MyObject is small. Each object created with new has an overhead of around 16-48 bytes in a typical modern OS, so if your object is only a couple of simple types, then you may well have more overhead than actual storage. In a largeer application, this can easily add up to a huge amount. And of course, a call to new or delete will most likely take some hundreds or thousands of cycles (above and beyond the time used in the constructor). So, you end up with code that runs slower and takes more memory - and of course, there's always some risk that you mess up and have memory leaks, causing your program to potentially crash due to out of memory, when it's not REALLY out of memory.
And as that famous "Murphy's law states", these things just have to happen at the worst possible and most annoying times - when you have just done some really good work, or when you've just succeeded at a level in a game, or something. So avoiding those risks whenever possible is definitely a good idea.
Well, creating the object is a lot better than using pointers because it's less error prone. Your code doesn't describe it well.
MyObj* foo;
foo = new MyObj;
foo->CanDoStuff(stuff);
//Later when foo is not needed
delete foo;
The other way is
MyObj foo;
foo.CanDoStuff(stuff);
less memory management but really it's up to you.
As the previous answers claimed the "don't use pointers unless you have to" is a good advise for general programming but then there are many issues that could finally make you select the pointers choice. Furthermore, in you initial question you are not considering the option of using references. So you can face three types of variable members in a class:
MyObject obj;
MyObject* obj;
MyObject& obj;
I use to always consider the reference option rather than the pointer one because you don't need to take care about if the pointer is NULL or not.
Also, as Dietmar Kühl pointed, a good reason for selecting pointers is:
If the dynamic type of the object isn't known, using a pointer is
generally the only alternative. However, this shouldn't be as common
as it often is.
I think this point is of particular importance when you are working on a big project. If you have many own classes, arranged in many source files and you use them in many parts of your code you will come up with long compilation times. If you use normal class instances (instead of pointers or references) a simple change in one of the header file of your classes will infer in the recompilation of all the classes that include this modified class. One possible solution for this issue is to use the concept of Forward declaration, which make use of pointers or references (you can find more info here).
So after working on my last question, I boiled it down to this:
I need to add an unknown number user-defined classes (object_c) to a boost::intrusive::list. The classes have const members in them. All I need to do to push them to the list is to construct them and then have them persist, they automatically add themselves.
The code in question is basically
for (unsigned i = 0; i < json_objects.count(); ++i) {
ctor_data = read(json_objects[i]);
// construct object here
}
What I've tried:
mallocing an array of objects, then filling them in: Doesn't work, because I have const members.
static object_c *json_input = (object_c*) malloc(json_objects.size() * sizeof(object_c));
...
json_input[i](ctor_data); //error: no match for call to (object_c) (ctor_data&)
Making a pointer: This doesn't work, functions don't work properly with it, and it doesn't get destructed
new object_c(ctor_data);
Pushing the object back to an std::vector: This doesn't work, boost rants for dozens of lines when I try (output here)
vector_of_objects.push_back(object_c(ctor_data));
Just declaring the darn thing: Obviously doesn't work, goes out of scope immediately (dur)
object_c(ctor_data);
I'm sure there is an easy way to do this. Anyone have any ideas? I've been at this problem for most of the weekend.
#3 should be the method you need to use. You need to elabourate on what your errors are.
If it is just operator= as you show in your previous question, and you dont want to define one, you can try emplace_back as long as you are in C++11. Of Course I am talking std::vector, I need to check what is the equivalent if any in boost::intrusive. Edit: I might be wrong, but it doesnt seem to support move semantics yet..
Alternatively use #2 with smart pointers.
If you are going with #1, you would need to use placement new as #rasmus indicates.
At the end of the documentation's usage section it tells you that
“The lifetime of a stored object is not bound to or managed by the container”
So you need to somehow manage the objects’ lifetime.
One way is to have them in a std::vector, as in the documentation’s final example.
Sorry for the late reply, exam studying and all that.
It was simpler than I was making it out to be, basically. Also, for this answer, I'm referring to my class as entity_c, so that an object of entity_c actually makes sense.
What I was doing in my OP was when I push_back'd an entity_c, it was automatically adding itself to a global intrusive::list, and somehow that made it not work. After I stopped being lazy, I wrote up a minimal compilable progam and played around with that. I found out that making an std::vector to store the constructed entity_cs in worked (even though it deconstructs them when they're added? I dunno what that's about). Then all I had to do was populate a local intrusive::list with those objects, then clone the local list into the global list.
Thanks for all the help, I'll tweak that program to try and fit in different stuff, like placement new suggested by #rasmus (thanks for that, hadn't seen that before). Also thanks to #karathik for showing my emplace_new, I think I might have to go and find out about all these new C++11 features that have been added in, there are so many cool ones. I even learnt how to make my own copy constructor.
Truly and edifying educational experience.
My application problem is the following -
I have a large structure foo. Because these are large and for memory management reasons, we do not wish to delete them when processing on the data is complete.
We are storing them in std::vector<boost::shared_ptr<foo>>.
My question is related to knowing when all processing is complete. First decision is that we do not want any of the other application code to mark a complete flag in the structure because there are multiple execution paths in the program and we cannot predict which one is the last.
So in our implementation, once processing is complete, we delete all copies of boost::shared_ptr<foo>> except for the one in the vector. This will drop the reference counter in the shared_ptr to 1. Is it practical to use shared_ptr.use_count() to see if it is equal to 1 to know when all other parts of my app are done with the data.
One additional reason I'm asking the question is that the boost documentation on the shared pointer shared_ptr recommends not using "use_count" for production code.
Edit -
What I did not say is that when we need a new foo, we will scan the vector of foo pointers looking for a foo that is not currently in use and use that foo for the next round of processing. This is why I was thinking that having the reference counter of 1 would be a safe way to ensure that this particular foo object is no longer in use.
My immediate reaction (and I'll admit, it's no more than that) is that it sounds like you're trying to get the effect of a pool allocator of some sort. You might be better off overloading operator new and operator delete to get the effect you want a bit more directly. With something like that, you can probably just use a shared_ptr like normal, and the other work you want delayed, will be handled in operator delete for that class.
That leaves a more basic question: what are you really trying to accomplish with this? From a memory management viewpoint, one common wish is to allocate memory for a large number of objects at once, and after the entire block is empty, release the whole block at once. If you're trying to do something on that order, it's almost certainly easier to accomplish by overloading new and delete than by playing games with shared_ptr's use_count.
Edit: based on your comment, overloading new and delete for class sounds like the right thing to do. If anything, integration into your existing code will probably be easier; in fact, you can often do it completely transparently.
The general idea for the allocator is pretty much the same as you've outlined in your edited question: have a structure (bitmaps and linked lists are both common) to keep track of your free objects. When new needs to allocate an object, it can scan the bit vector or look at the head of the linked list of free objects, and return its address.
This is one case that linked lists can work out quite well -- you (usually) don't have to worry about memory usage, because you store your links right in the free object, and you (virtually) never have to walk the list, because when you need to allocate an object, you just grab the first item on the list.
This sort of thing is particularly common with small objects, so you might want to look at the Modern C++ Design chapter on its small object allocator (and an article or two since then by Andrei Alexandrescu about his newer ideas of how to do that sort of thing). There's also the Boost::pool allocator, which is generally at least somewhat similar.
If you want to know whether or not the use count is 1, use the unique() member function.
I would say your application should have some method that eliminates all references to the Foo from other parts of the app, and that method should be used instead of checking use_count(). Besides, if use_count() is greater than 1, what would your program do? You shouldn't be relying on shared_ptr's features to eliminate all references, your application architecture should be able to eliminate references. As a final check before removing it from the vector, you could assert(unique()) to verify it really is being released.
I think you can use shared_ptr's custom deleter functionality to call a particular function when the last copy has been released. That way, you're not using use_count at all.
You would need to hold something other than a copy of the shared_ptr in your vector so that the shared_ptr is only tracking the outstanding processing.
Boost has several examples of custom deleters in the shared_ptr docs.
I would suggest that instead of trying to use the shared_ptr's use_count to keep track, it might be better to implement your own usage counter. this way you will have full control over this rather than using the shared_ptr's one which, as you rightly suggest, is not recommended. You can also pre-set your own counter to allow for the number of threads you know will need to act on the data, rather than relying on them all being initialised at the beginning to get their copies of the structure.
I have an object which implements reference counting mechanism. If the number of references to it becomes zero, the object is deleted.
I found that my object is never deleted, even when I am done with it. This is leading to memory overuse. All I have is the number of references to the object and I want to know the places which reference it so that I can write appropriate cleanup code.
Is there some way to accomplish this without having to grep in the source files? (That would be very cumbersome.)
A huge part of getting reference counting (refcounting) done correctly in C++ is to use Resource Allocation Is Initialization so it's much harder to accidentally leak references. However, this doesn't solve everything with refcounts.
That said, you can implement a debug feature in your refcounting which tracks what is holding references. You can then analyze this information when necessary, and remove it from release builds. (Use a configuration macro similar in purpose to how DEBUG macros are used.)
Exactly how you should implement it is going to depend on all your requirements, but there are two main ways to do this (with a brief overview of differences):
store the information on the referenced object itself
accessible from your debugger
easier to implement
output to a special trace file every time a reference is acquired or released
still available after the program exits (even abnormally)
possible to use while the program is running, without running in your debugger
can be used even in special release builds and sent back to you for analysis
The basic problem, of knowing what is referencing a given object, is hard to solve in general, and will require some work. Compare: can you tell me every person and business that knows your postal address or phone number?
One known weakness of reference counting is that it does not work when there are cyclic references, i.e. (in the simplest case) when one object has a reference to another object which in turn has a reference to the former object. This sounds like a non-issue, but in data structures such as binary trees with back-references to parent nodes, there you are.
If you don't explicitly provide for a list of "reverse" references in the referenced (un-freed) object, I don't see a way to figure out who is referencing it.
In the following suggestions, I assume that you don't want to modify your source, or if so, just a little.
You could of course walk the whole heap / freestore and search for the memory address of your un-freed object, but if its address turns up, it's not guaranteed to actually be a memory address reference; it could just as well be any random floating point number, of anything else. However, if the found value lies inside a block a memory that your application allocated for an object, chances improve a little that it's indeed a pointer to another object.
One possible improvement over this approach would be to modify the memory allocator you use -- e.g. your global operator new -- so that it keeps a list of all allocated memory blocks and their sizes. (In a complete implementation of this, operator delete would have remove the list entry for the freed block of memory.) Now, at the end of your program, you have a clue where to search for the un-freed object's memory address, since you have a list of memory blocks that your program actually used.
The above suggestions don't sound very reliable to me, to be honest; but maybe defining a custom global operator new and operator delete that does some logging / tracing goes in the right direction to solve your problem.
I am assuming you have some class with say addRef() and release() member functions, and you call these when you need to increase and decrease the reference count on each instance, and that the instances that cause problems are on the heap and referred to with raw pointers. The simplest fix may be to replace all pointers to the controlled object with boost::shared_ptr. This is surprisingly easy to do and should enable you to dispense with your own reference counting - you can just make those functions I mentioned do nothing. The main change required in your code is in the signatures of functions that pass or return your pointers. Other places to change are in initializer lists (if you initialize pointers to null) and if()-statements (if you compare pointers with null). The compiler will find all such places after you change the declarations of the pointers.
If you do not want to use the shared_ptr - maybe you want to keep the reference count intrinsic to the class - you can craft your own simple smart pointer just to deal with your class. Then use it to control the lifetime of your class objects. So for example, instead of pointer assignment being done with raw pointers and you "manually" calling addRef(), you just do an assignment of your smart pointer class which includes the addRef() automatically.
I don't think it's possible to do something without code change. With code change you can for example remember the pointers of the objects which increase reference count, and then see what pointer is left and examine it in the debugger. If possible - store more verbose information, such as object name.
I have created one for my needs. You can compare your code with this one and see what's missing. It's not perfect but it should work in most of the cases.
http://sites.google.com/site/grayasm/autopointer
when I use it I do:
util::autopointer<A> aptr=new A();
I never do it like this:
A* ptr = new A();
util::autopointer<A> aptr = ptr;
and later to start fulling around with ptr; That's not allowed.
Further I am using only aptr to refer to this object.
If I am wrong I have now the chance to get corrections. :) See ya!