What exactly does reference counting mean in C++?

What exactly is reference counting? In particular, what is it for C++? What are the problems we can face if we don't handle them? Do all languages require reference counting?

What exactly is reference counting? In particular, what is it for C++?
In simple words, reference counting means counting the references to an object.
Typically, C++ employs the technique of RAII, wherein the ability to manage the deallocation of an object is tied to the object's type itself. It means that the user does not have to manage the lifetime of the object and its deallocation explicitly; the functionality to do this management is built into the object itself.
For a shared object, this means that the object should exist and remain valid as long as there are stakeholders who refer to it, and this is achieved by reference counting. Every time the object is shared (copied), the reference count (typically a member inside the class type) is incremented, and each time a destructor is called the count is decremented. When the count reaches 0, the object is no longer referred to by anyone, which marks the end of its lifetime, and hence it is destroyed.
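The mechanism described above can be sketched as a small hand-written handle. This is only an illustration under stated assumptions: SharedInt and count_with_copy are hypothetical names, the counter lives in a separate heap allocation shared by all copies, and the sketch is neither thread-safe nor hardened against allocation failure.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical minimal handle: every copy shares one int and one counter.
class SharedInt {
public:
    explicit SharedInt(int value)
        : value_(new int(value)), count_(new std::size_t(1)) {}

    // Copying shares the object and increments the count.
    SharedInt(const SharedInt& other)
        : value_(other.value_), count_(other.count_) { ++*count_; }

    // Copy-and-swap: the by-value parameter takes a new share, and its
    // destructor releases the share this handle held before.
    SharedInt& operator=(SharedInt other) {
        std::swap(value_, other.value_);
        std::swap(count_, other.count_);
        return *this;
    }

    // Each destructor decrements; the last owner frees the shared state.
    ~SharedInt() {
        if (--*count_ == 0) {
            delete value_;
            delete count_;
        }
    }

    int& get() { return *value_; }
    std::size_t use_count() const { return *count_; }

private:
    int* value_;
    std::size_t* count_;
};

// Returns the count seen by `a` while a second owner exists.
std::size_t count_with_copy() {
    SharedInt a(42);
    SharedInt b = a;       // count goes from 1 to 2
    return a.use_count();  // both handles are still alive here
}
```

In everyday code std::shared_ptr provides this behaviour, so a hand-written handle like this is mostly useful for seeing where the increments and decrements happen.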
What are the problems we can face if we don't handle them?
It would mean no more RAII, and endless and often faulty manual resource management.
In short: programming nightmares.
Do all languages require reference counting?
Languages don't require reference counting, but employing the technique makes life easier and takes less effort for users of the language, so many languages and libraries use it to provide these advantages to their users.

Reference counting is a simple but incomplete approach to garbage detection.
When the counter reaches zero, you can release the object.
But if objects that are no longer in use reference each other cyclically, they will never be released.
Consider a that references b, and b that references a, while nothing else references a or b.
The reference count on both a and b will still be 1 (= in use).
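The a/b cycle can be reproduced with std::shared_ptr. The following is a sketch with hypothetical names (Node, strong_peer, weak_peer and both functions are made up for the illustration). The first function shows that the two objects outlive every outside reference and then breaks the cycle by hand so the demo does not leak; the second replaces one link with a std::weak_ptr.

```cpp
#include <cassert>
#include <memory>

struct Node {
    std::shared_ptr<Node> strong_peer;  // owning link, adds to the count
    std::weak_ptr<Node> weak_peer;      // non-owning link, does not
};

// a and b own each other, so both counts stay at 1 after the locals die.
bool cycle_keeps_objects_alive() {
    std::weak_ptr<Node> observer;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->strong_peer = b;
        b->strong_peer = a;
        observer = a;
    }  // nothing outside the cycle references a or b any more
    bool still_alive = !observer.expired();
    // Break the cycle manually so this demo itself cleans up.
    if (auto a = observer.lock()) a->strong_peer.reset();
    return still_alive;
}

// Same shape, but the back link is weak: the count of a can reach zero.
bool weak_link_allows_cleanup() {
    std::weak_ptr<Node> observer;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->strong_peer = b;
        b->weak_peer = a;
        observer = a;
    }
    return observer.expired();
}
```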

Reference-count garbage collection is a powerful technique for managing memory that helps prevent objects from being deleted accidentally or more than once. The technique is not limited to C++ code and, despite its name, is unrelated to the C++ concept of reference variables. Rather, the term means that we maintain a count of all "owning references" to an object and delete the object when this count becomes zero.

Reference counting: let's use a metaphor.
You have an ear. You want it back at some point.
You get a group of people pointing at your ear. You count them as soon as they point.
When the number goes to zero - it is just yours and you can do with it as you wish.
I.e. take it out of the equation (free it back to memory).
BTW. Circular stuff is tricky to spot.

Related

Swig: simple idiomatic wrapper usage when weak_ptr are used?

note: this question is related to weak_ptr usage, but is not about wrapping weak_ptrs.
I am currently evaluating Swig and I have found an "inconvenience" in the usage of the wrappers by the client languages, that I have not found described online and for which I have no satisfactory solution.
In C++, if you have a complex graph of objects managed using shared_ptrs, you must take special care when the graph can contain a cycle (i.e. if it is not a DAG), or else you will get memory leaks. By that I mean that if you cannot avoid having a cycle, it must contain at least one weak_ptr. This means that you will have to handle the cases where you cannot lock the weak_ptr because the related shared_ptr has died. This kind of management is something C++ programmers can be expected to be used to dealing with. Now let's look at what can happen for a user of the wrappers:
So let's take the following example:
object a, held by a shared_ptr, has a shared_ptr to b
object b, held by a shared_ptr, has a weak_ptr to a
The following could happen to a user of the wrapper:
A a
B b = a->GetB()
// here let's suppose that a gets out of scope, so it can be garbage collected
b->GetA() // fails if a has been garbage collected
The failure could be cleanly managed by propagating a C++ exception to the client code (throw if the weak_ptr cannot be locked to create a shared_ptr). However, this is not idiomatic for Python/C#/Java users: they do not expect to have to manually keep some objects alive in order to access others.
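On the C++ side, the scenario and the exception-based handling might look roughly like the sketch below. The layout of A and B, the member names parent and child, and the make_a factory are assumptions made for illustration; only GetA and GetB come from the example above, and nothing here is SWIG-specific.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

struct A;

struct B {
    std::weak_ptr<A> parent;          // non-owning back link to a
    std::shared_ptr<A> GetA() const;  // throws if a is already gone
};

struct A {
    std::shared_ptr<B> child = std::make_shared<B>();  // owning link to b
    std::shared_ptr<B> GetB() const { return child; }
};

inline std::shared_ptr<A> B::GetA() const {
    if (auto a = parent.lock()) return a;  // a is still alive
    throw std::runtime_error("a has expired");
}

// Hypothetical factory that wires the weak back link.
std::shared_ptr<A> make_a() {
    auto a = std::make_shared<A>();
    a->child->parent = a;
    return a;
}

// Mirrors the client sequence: keep only b, let a die, then ask b for a.
bool get_a_fails_after_a_dies() {
    std::shared_ptr<B> b;
    {
        auto a = make_a();
        b = a->GetB();
    }  // the last shared_ptr to a is released here
    try {
        b->GetA();
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```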
I have a draft of an alternative solution, which involves creating a C++ "Co-Owner" of objects a and b that would be locked by the SWIG wrappers whenever a or b is accessed via the wrappers, thus keeping both a and b alive while either of them is accessed via the wrappers. The downsides are that this starts to look like I am implementing a proto-garbage-collection in C++, that it modifies the C++ implementation and API of objects a and b, and finally that objects a and b will have to be notified by the wrappers that they are used via the wrappers (not something I think I can do without patching SWIG to add function calls in the constructor and destructor of shadow objects?).
Have I missed anything? Is there another solution to this problem?
Publicly available weak pointers
If weak ownership is actually part of the interface, it is possible to bind extra types manually that exhibit the behavior of weak pointers. In this example they create a type FooWeakPtr providing the relevant interface. But as you can see, this is not taking advantage of the language-specific classes in every language.
Internal weak pointers
If the weak pointers are not part of the interface, then SWIG-generated bindings should not care about them, and treat the bound objects as shared pointers (at least in Python and Java). Therefore, for as long as your objects are available from the other language, you must make sure they all stay alive.
That means it is up to you to design your hierarchy of objects in a way that clients can never face a case where an internal weak pointer is invalid, and that dropping every reference actually leads to the destruction of the hierarchy.
The solution actually resides in the details of your object hierarchy, and SWIG cannot do much about it.

C++ implicit data sharing [duplicate]

I am building a game engine library in C++. A little while back I was using Qt to build an application and was rather fascinated with its use of Implicit Sharing. I am wondering if anybody could explain this technique in greater detail or could offer a simple example of this in action.
The key idea behind implicit sharing seems to be what is more commonly known as copy-on-write. The idea behind copy-on-write is to have each object serve as a wrapper around a pointer to the actual implementation. Each implementation object keeps track of the number of pointers into it. Whenever an operation is performed on the wrapper object, it's just forwarded to the implementation object, which does the actual work.
The advantage of this approach is that copying and destruction of these objects are cheap. To make a copy of the object, we just make a new instance of a wrapper, set its pointer to point at the implementation object, and then increment the count of the number of pointers to the object (this is sometimes called the reference count, by the way). Destruction is similar - we drop the reference count by one, then see if anyone else is pointing at the implementation. If not, we free its resources. Otherwise, we do nothing and just assume someone else will do the cleanup later.
The challenge in this approach is that it means that multiple different objects will all be pointing at the same implementation. This means that if someone ends up making a change to the implementation, every object referencing that implementation will see the changes - a very serious problem. To fix this, every time an operation is performed that might potentially change the implementation, the operation checks to see if any other objects also reference the implementation by seeing if the reference count is identically 1. If no other objects reference the object, then the operation can just go ahead - there's no possibility of the changes propagating. If there is at least one other object referencing the data, then the wrapper first makes a deep-copy of the implementation for itself and changes its pointer to point to the new object. Now we know there can't be any sharing, and the changes can be made without a hassle.
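A compact sketch of the scheme described above, using std::shared_ptr as the reference-counted pointer to the implementation. WordList, add, shares_with and detach are hypothetical names chosen for this illustration; this is not how Qt or the Stanford lecture code implements it, and the use_count() check is only meaningful in single-threaded code.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical copy-on-write word list; the vector is the "implementation".
class WordList {
public:
    WordList() : impl_(std::make_shared<std::vector<std::string>>()) {}

    // Copying a WordList only copies the pointer and bumps the count
    // (the compiler-generated copy operations do exactly that).

    // Read-only operations are forwarded to the shared implementation.
    std::size_t size() const { return impl_->size(); }
    const std::string& at(std::size_t i) const { return impl_->at(i); }

    // Mutating operations detach from other owners first.
    void add(const std::string& word) {
        detach();
        impl_->push_back(word);
    }

    // True while both wrappers point at the same implementation.
    bool shares_with(const WordList& other) const {
        return impl_ == other.impl_;
    }

private:
    // Deep-copy the implementation if anyone else is referencing it.
    void detach() {
        if (impl_.use_count() > 1)
            impl_ = std::make_shared<std::vector<std::string>>(*impl_);
    }

    std::shared_ptr<std::vector<std::string>> impl_;
};
```

Keeping the count check inside a single private detach() makes it harder to forget in a new mutating member function.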
If you'd like to see some examples of this in action, take a look at lecture examples 15.0 and 16.0 from Stanford's introductory C++ programming course. It shows how to design an object to hold a list of words using this technique.
Hope this helps!

C++: Safe to use locals of caller in function?

I think it's best if I describe the situation using a code example:
int MyFuncB( MyClass* instance );  // declared before its first use

int MyFuncA()
{
    MyClass someInstance;
    //<Work with and fill someInstance...>
    return MyFuncB( &someInstance );
}
int MyFuncB( MyClass* instance )
{
    //Do anything you could imagine with instance, *except*:
    //* Allowing references to it or any of its data members to escape this function
    //* Freeing anything the class will free in its destructor, including itself
    instance->DoThis();
    instance->ModifyThat();
    return 0;
}
And here come my straightforward questions:
Is the above concept guaranteed, by the C and C++ standards, to work as expected? Why (not)?
Is doing this, sparingly and with care, considered bad practice?
Is the above concept guaranteed, by the C and C++ standards, to work as expected? Why (not)?
Yes, it will work as expected. someInstance is available through the scope of MyFuncA. The call to MyFuncB is within that scope.
Is doing this, sparingly and with care, considered bad practice?
Don't see why.
I don't see any problem in actually using the pointer you were passed to call functions on the object. As long as you call public methods of MyClass, everything remains valid C/C++.
The actual instance you create at the beginning of MyFuncA() will get destroyed at the end of MyFuncA(), and you are guaranteed that the instance will remain valid for the whole execution of MyFuncB() because someInstance is still valid in the scope of MyFuncA().
Yes it will work. It does not matter if the pointer you pass into MyFuncB is on the stack or on the heap (in this specific case).
As for the bad practice part, you can probably argue both ways. In general I think it's bad, because if for any reason an object living outside of MyFuncA gets hold of the object reference, it will die a horrible death later on and cause bugs that are sometimes very hard to track down. It really depends on how extensive the usage of the object becomes in MyFuncB. Especially when it starts involving a third class, it can get messy.
Others have answered the basic question, with "yeah, that's legal". And in the absence of greater architecture it is hard to call it good or bad practice. But I'll try and wax philosophical on the broader question you seem to be picking up about pointers, object lifetimes, and expectations across function calls...
In the C++ language, there's no built-in way to pass a pointer to a function and "enforce" that it won't stow that away after the call is complete. And since C++ pointers are "weak references" by default, the objects pointed to may disappear out from under someone you pass it to.
But explicitly weak pointer abstractions do exist, for instance in Qt:
http://doc.qt.nokia.com/latest/qweakpointer.html
These are designed to specifically encode the "paranoia" to the recipient that the object it is holding onto can disappear out from under it. Anyone dereferencing one sort of realizes something is up, and they have to take the proper cautions under the design contract.
Additionally, abstractions like shared pointer exist which signal a different understanding to the recipient. Passing them one of those gives them the right to keep the object alive as long as they want, giving you something like garbage collection:
http://doc.qt.nokia.com/4.7-snapshot/qsharedpointer.html
These are only some options. But in the most general sense, if you come up with any interesting invariant for the lifetimes of your object...consider not passing raw pointers. Instead pass some pointer-wrapping class that embodies and documents the rules of the "game" in your architecture.
(One of the major reasons to use C++ instead of other languages is the wealth of tools you have to do cool things like that, without too much runtime cost!)
I don't think there should be any problem with that barring, as you say, something that frees the object or otherwise trashes its state. I think whatever unexpected things happen would not have anything to do with using the class this way. (Nothing in life is guaranteed of course, but classes are intended to be passed around and operated on; whether it's a local variable or otherwise is, I believe, not relevant.)
The one thing you would not be able to do is keep a reference to the object after it goes out of scope when MyFuncA() returns, but that's just the nature of the scoping rules.

Handling one object in some containers

I want to store pointers to one instance of an object in several (two or more) containers. I've run into one problem with this idea: how to handle removal of this object. Objects have a rather stormy life (I am talking about a game, but I think this situation is not so specific) and can be removed rather often. To my mind this problem divides into two problems:
1. How should I signal the deletion to the containers? In C# I used to create a boolean property IsDead in stored objects, so each iteration of the main loop first finds 'dead' objects and removes them. No circular references and everything is rather clear :-) Is this technique correct?
2. Even if I implement this technique in C++, I run into difficulty with calling destructors if the object is in several containers. Even if I create some kind of 'IsDead' field and remove the dead object from all lists, I still have to free the memory.
After reading some articles, my idea is that I should have one 'main' container with a shared_ptr to each of my objects, and the other containers should store weak_ptrs to them, so only the main container checks an object's status and the others only look at the shared_ptr. Are my intentions correct, or is there another solution?
It sounds like you're looking for shared_ptr<T>.
http://msdn.microsoft.com/en-us/library/bb982026.aspx
This is a reference-counted pointer in C++ that enables easy sharing of objects. The shared_ptr<T> can be freely handed out to several objects. As the shared_ptr instances are copied around and destructed, the internal reference counter will be updated appropriately. When all references are removed, the underlying data will be deleted.
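One way the layout from the question (one owning container of shared_ptr, other containers of weak_ptr) could look is sketched below. Entity, Owners, Observers, destroy and prune are hypothetical names for this illustration, and the sketch assumes the owning container holds no null pointers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct Entity { int id; };  // hypothetical game object

using Owners = std::vector<std::shared_ptr<Entity>>;   // the 'main' container
using Observers = std::vector<std::weak_ptr<Entity>>;  // any other container

// Erasing the only shared_ptr destroys the object and frees its memory.
void destroy(Owners& owners, int id) {
    owners.erase(
        std::remove_if(owners.begin(), owners.end(),
                       [id](const std::shared_ptr<Entity>& e) {
                           return e->id == id;
                       }),
        owners.end());
}

// Drops entries whose object is gone; returns how many are still alive.
std::size_t prune(Observers& observers) {
    observers.erase(
        std::remove_if(observers.begin(), observers.end(),
                       [](const std::weak_ptr<Entity>& w) {
                           return w.expired();
                       }),
        observers.end());
    return observers.size();
}
```

With this layout no IsDead flag is needed: an expired weak_ptr plays that role, and lock() gives an observer temporary, safe access to an object that is still alive.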

Know what references an object

I have an object which implements reference counting mechanism. If the number of references to it becomes zero, the object is deleted.
I found that my object is never deleted, even when I am done with it. This is leading to memory overuse. All I have is the number of references to the object and I want to know the places which reference it so that I can write appropriate cleanup code.
Is there some way to accomplish this without having to grep in the source files? (That would be very cumbersome.)
A huge part of getting reference counting (refcounting) done correctly in C++ is to use Resource Acquisition Is Initialization (RAII), so that it's much harder to accidentally leak references. However, this doesn't solve everything with refcounts.
That said, you can implement a debug feature in your refcounting which tracks what is holding references. You can then analyze this information when necessary, and remove it from release builds. (Use a configuration macro similar in purpose to how DEBUG macros are used.)
Exactly how you should implement it is going to depend on all your requirements, but there are two main ways to do this (with a brief overview of differences):
1. Store the information on the referenced object itself:
   - accessible from your debugger
   - easier to implement
2. Output to a special trace file every time a reference is acquired or released:
   - still available after the program exits (even abnormally)
   - possible to use while the program is running, without running in your debugger
   - can be used even in special release builds and sent back to you for analysis
The basic problem, of knowing what is referencing a given object, is hard to solve in general, and will require some work. Compare: can you tell me every person and business that knows your postal address or phone number?
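A rough sketch of the first option, storing holder information on the referenced object, is shown below. Everything in it is an assumption for illustration: the Tracked class, the REFCOUNT_DEBUG macro, and the convention that each caller passes a string naming itself. For simplicity release() only reports that the count reached zero and leaves the deletion to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

#define REFCOUNT_DEBUG 1  // hypothetical switch; set to 0 for release builds

class Tracked {
public:
    void addRef(const std::string& holder) {
        ++count_;
#if REFCOUNT_DEBUG
        holders_.insert(holder);  // remember who took this reference
#else
        (void)holder;
#endif
    }

    // Returns true when the last reference has been released.
    bool release(const std::string& holder) {
#if REFCOUNT_DEBUG
        auto it = holders_.find(holder);
        if (it != holders_.end()) holders_.erase(it);  // remove one entry only
#else
        (void)holder;
#endif
        return --count_ == 0;
    }

    std::size_t count() const { return count_; }

#if REFCOUNT_DEBUG
    // Inspect this from a debugger or dump it to a log to see who is left.
    const std::multiset<std::string>& holders() const { return holders_; }
#endif

private:
    std::size_t count_ = 0;
#if REFCOUNT_DEBUG
    std::multiset<std::string> holders_;
#endif
};
```

A multiset is used so that one holder taking two references shows up twice; whatever remains in holders() when the object should already be gone names the code that forgot to release.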
One known weakness of reference counting is that it does not work when there are cyclic references, i.e. (in the simplest case) when one object has a reference to another object which in turn has a reference to the former object. This sounds like a non-issue, but in data structures such as binary trees with back-references to parent nodes, there you are.
If you don't explicitly provide for a list of "reverse" references in the referenced (un-freed) object, I don't see a way to figure out who is referencing it.
In the following suggestions, I assume that you don't want to modify your source, or if so, just a little.
You could of course walk the whole heap / freestore and search for the memory address of your un-freed object, but if its address turns up, it's not guaranteed to actually be a memory address reference; it could just as well be any random floating point number, or anything else. However, if the found value lies inside a block of memory that your application allocated for an object, chances improve a little that it's indeed a pointer to another object.
One possible improvement over this approach would be to modify the memory allocator you use -- e.g. your global operator new -- so that it keeps a list of all allocated memory blocks and their sizes. (In a complete implementation of this, operator delete would have to remove the list entry for the freed block of memory.) Now, at the end of your program, you have a clue where to search for the un-freed object's memory address, since you have a list of memory blocks that your program actually used.
The above suggestions don't sound very reliable to me, to be honest; but maybe defining a custom global operator new and operator delete that does some logging / tracing goes in the right direction to solve your problem.
I am assuming you have some class with say addRef() and release() member functions, and you call these when you need to increase and decrease the reference count on each instance, and that the instances that cause problems are on the heap and referred to with raw pointers. The simplest fix may be to replace all pointers to the controlled object with boost::shared_ptr. This is surprisingly easy to do and should enable you to dispense with your own reference counting - you can just make those functions I mentioned do nothing. The main change required in your code is in the signatures of functions that pass or return your pointers. Other places to change are in initializer lists (if you initialize pointers to null) and if()-statements (if you compare pointers with null). The compiler will find all such places after you change the declarations of the pointers.
If you do not want to use the shared_ptr - maybe you want to keep the reference count intrinsic to the class - you can craft your own simple smart pointer just to deal with your class. Then use it to control the lifetime of your class objects. So for example, instead of pointer assignment being done with raw pointers and you "manually" calling addRef(), you just do an assignment of your smart pointer class which includes the addRef() automatically.
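Such a hand-crafted smart pointer could look roughly like this. It is a sketch under assumptions: Widget stands in for the class with addRef() and release(), its count starts at zero, release() deletes the object when the count reaches zero, and RefPtr is a made-up name. It is not thread-safe.

```cpp
#include <cassert>
#include <utility>

// Stand-in for the existing class with an intrusive reference count.
class Widget {
public:
    void addRef() { ++refs_; }
    void release() { if (--refs_ == 0) delete this; }  // heap objects only
    int refs() const { return refs_; }
private:
    int refs_ = 0;
};

// Minimal smart pointer that calls addRef()/release() automatically.
template <typename T>
class RefPtr {
public:
    RefPtr() = default;
    explicit RefPtr(T* p) : p_(p) { if (p_) p_->addRef(); }
    RefPtr(const RefPtr& other) : p_(other.p_) { if (p_) p_->addRef(); }

    // Copy-and-swap: the parameter's constructor does the addRef(), and its
    // destructor releases whatever this pointer referred to before.
    RefPtr& operator=(RefPtr other) {
        std::swap(p_, other.p_);
        return *this;
    }

    ~RefPtr() { if (p_) p_->release(); }

    T* operator->() const { return p_; }
    T& operator*() const { return *p_; }
    explicit operator bool() const { return p_ != nullptr; }

private:
    T* p_ = nullptr;
};
```

Once every owner holds a RefPtr<Widget> instead of a raw Widget*, a forgotten release() can no longer happen by accident, because the destructor performs it.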
I don't think it's possible to do something without code change. With code change you can for example remember the pointers of the objects which increase reference count, and then see what pointer is left and examine it in the debugger. If possible - store more verbose information, such as object name.
I have created one for my needs. You can compare your code with this one and see what's missing. It's not perfect but it should work in most of the cases.
http://sites.google.com/site/grayasm/autopointer
when I use it I do:
util::autopointer<A> aptr=new A();
I never do it like this:
A* ptr = new A();
util::autopointer<A> aptr = ptr;
and then later start fooling around with ptr; that's not allowed.
From then on I use only aptr to refer to this object.
If I am wrong, I now have the chance to get corrections. :) See ya!