The ** idiom in C++ for object construction

In a lot of C++ APIs (COM-based ones spring to mind) that create something for you, the pointer to the newly constructed object is usually handed back through a ** parameter (the function will construct and initialize it for you).
You usually see signatures like:
HRESULT createAnObject( int howbig, Object **objectYouWantMeToInitialize ) ;
-- but you seldom see the new object being passed as a return value.
Besides people wanting to see error codes, what is the reason for this? Is it better to use the ** pattern rather than a returned pointer for simpler operations such as:
wchar_t* getUnicode( const char* src ) ;
Or would this better be written as:
void getUnicode( const char* src, wchar_t** dst ) ;
The most important thing I can think of is to remember to free it, and the ** way, for some reason, tends to remind me that I have to deallocate it as well.
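To make the comparison concrete, here is a rough sketch of both call sites (the allocation and ownership details are assumptions, e.g. that the caller frees the buffer with delete []):

wchar_t *a = getUnicode("hello");   // return-value form
// ... use a ...
delete [] a;   // easy to forget that a was allocated for you

wchar_t *b = NULL;                  // out-parameter form
getUnicode("hello", &b);
// ... use b ...
delete [] b;   // passing &b at least hints that something was filled in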

"Besides wanting error codes"?
What makes you think there is a "besides"? Error codes are pretty much the one and only reason. The function needs some way to indicate failure. C doesn't have exceptions, so it has to do that through either a pointer parameter or the return value, and returning the error code is idiomatic and easier to check when calling the function.
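As a sketch of why the return-value convention is easy to check (using the createAnObject signature from the question; FAILED is the usual Win32 macro):

Object *obj = NULL;
HRESULT hr = createAnObject(42, &obj);
if (FAILED(hr))
    return hr;   // every call site can propagate errors the same way
// ... use obj ...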
(By the way, there's no universal rule that ** means you have to free the object. That's not always the case, and it's probably a bad idea to use something that arbitrary to remind you of which objects to clean up.)

Two reasons come to my mind.
The first is, indeed, error codes. Unlike C++, C doesn't have exceptions, and COM is a C API. Many C++-based projects also prefer not to use exceptions, for various reasons.
There may be cases where a return value can't signal errors. E.g. if your function returns an integer, there may be no integer value left over to represent an error code. While signalling errors with pointers is easy (NULL == error), some API designers prefer to signal errors in a consistent way across all functions.
Second, a function can have only one return value, but calling it may create multiple objects. Some Win32 API functions take several pointer-to-pointer arguments that are filled in optionally, if you call the function with non-NULL pointers. You cannot return two pointers, or rather it would be awkward if the return value were some struct, passed by value, containing more than one pointer. Here too, a consistent API is a sensible goal.
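A hedged sketch of that pattern (the function and its arguments are made up; only the conventions are COM-like):

HRESULT createPair(int howbig, Object **first, Object **second)
{
    // Callers pass NULL for any result they are not interested in.
    if (first)
        *first = new Object(howbig);
    if (second)
        *second = new Object(howbig);
    return S_OK;   // real code would translate failures into error codes
}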

Passing the new object out through a ** argument is better. It leaves room for future change, for example turning a void return into a bool to report whether the function succeeded, or into something else that tells the caller how the function did its work.
Answer in one line: it is much better for returning error codes.

Besides people wanting to see error codes, what is the reason for this?
There are some reasons for this. One of them is writing an interface that is usable in C (you see this in the WinAPI and Windows COM).
Backwards compatibility is another reason (i.e. the interface was written like that and breaking it now would break existing code).
I'd go with C compatibility as the design principle behind code like this. If you were writing in C++ you'd write
retval Myfunction(Result *& output);
instead of
retval Myfunction(Result ** output);
or (even better):
Result *Myfunction();
and have the function throw an exception on error.

I'm not sure I agree that's the best way to do it... this might be better:
Object * createAnObject(int howbig, HRESULT * optPlaceResultCodeHereIfNotNull = NULL);
That way there is no messing about with double-indirection (which can be a little bit tricky for people who aren't used to it), and the people who don't care about result codes don't have to worry about the second argument at all... they can just check to see if the return value is NULL or not.
Actually, since it's C++, you could make things easier still, using function overloading:
Object * createAnObject(int howbig);
Object * createAnObject(int howbig, HRESULT & returnResultCode);
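Call sites then read naturally either way (sketch):

Object *a = createAnObject(16);        // don't care about the result code

HRESULT hr = S_OK;
Object *b = createAnObject(16, hr);    // do care
if (b == NULL) {
    // inspect hr here
}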

Any method in a COM interface has to return an HRESULT. The return codes get leveraged all over the framework, and passing a double pointer is the well-known way to get at the created object.

Not answering your question, but a comment, as your question brought out some thoughts I have about COM/DCOM programming using C++.
All these "pointer" and "pointer to pointer", memory management and reference counting are the reasons why I shy away from doing COM programming with C++. Even with ATL in place, I dislike it for the simple reason that it does not look natural enough. Having said that, I did do a few projects using ATL.
Back then the alternative was to use VB. VB code looks more natural for COM or DCOM programming.
Today, I would use C#.

Related

Metaprogramming C/C++ using the preprocessor

So I have this huge tree that is basically a big switch/case with string keys and different function calls on one common object depending on the key and one piece of metadata.
Every entry basically looks like this
} else if ( strcmp(key, "key_string") == 0) {
((class_name*)object)->do_something();
} else if ( ...
where do_something can have different invocations, so I can't just use function pointers. Also, some keys require object to be cast to a subclass.
Now, if I were to code this in a higher level language, I would use a dictionary of lambdas to simplify this.
It occurred to me that I could use macros to simplify this to something like
case_call("key_string", class_name, do_something());
case_call( /* ... */ )
where case_call would be a macro that would expand this code to the first code snippet.
However, I am very much on the fence about whether that would be considered good style. I mean, it would reduce typing work and improve the DRYness of the code, but then again it really seems to abuse the macro system somewhat.
Would you go down that road, or rather type out the whole thing? And what would be your reasoning for doing so?
Edit
Some clarification:
This code is used as a glue layer between a simplified scripting API and a C++ API: the scripting side accesses several different aspects of the C++ API as simple key-value properties. The properties are implemented in different ways in C++, though: some have getter/setter methods, some are set in a special struct. Scripting actions reference C++ objects cast to a common base class. However, some actions are only available on certain subclasses and have to be cast down.
Further down the road, I may change the actual C++ API, but for the moment, it has to be regarded as unchangeable. Also, this has to work on an embedded compiler, so boost or C++11 are (sadly) not available.
I would suggest you slightly reverse the roles. You are saying that the object is already some class that knows how to handle a certain situation, so add a virtual void handle(const char * key) in your base class and let the object check in the implementation if it applies to it and do whatever is necessary.
This would not only eliminate the long if-else-if chain, but would also be more type safe and would give you more flexibility in handling those events.
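A minimal sketch of that inversion, sticking to C++03 since the asker can't use C++11 (the class names here are made up):

#include <cstring>   // strcmp

class ScriptObject {
public:
    virtual ~ScriptObject() {}
    // Each subclass overrides this and reacts only to the keys it knows about.
    virtual bool handle(const char * /*key*/) { return false; }
};

class Door : public ScriptObject {
public:
    void open();
    virtual bool handle(const char *key)
    {
        if (strcmp(key, "open") == 0) { open(); return true; }
        return ScriptObject::handle(key);
    }
};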
That seems to me an appropriate use of macros. They are, after all, made for eliding syntactic repetition. However, when you have syntactic repetition, it’s not always the fault of the language—there are probably better design choices out there that would let you avoid this decision altogether.
The general wisdom is to use a table mapping keys to actions:
std::map<std::string, void(Class::*)()> table;
Then look up and invoke the action in one go:
(object->*table[key])();
Or use find to check for failure:
const auto i = table.find(key);
if (i != table.end())
(object->*(i->second))();
else
throw std::runtime_error(...);
But if as you say there is no common signature for the functions (i.e., you can’t use member function pointers) then what you actually should do depends on the particulars of your project, which I don’t know. It might be that a macro is the only way to elide the repetition you’re seeing, or it might be that there’s a better way of going about it.
Ask yourself: why do my functions take different arguments? Why am I using casts? If you’re dispatching on the type of an object, chances are you need to introduce a common interface.
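If the macro route is taken anyway, here is a hedged sketch of what case_call might expand to (key and object are the variables from the question's snippet; everything else is made up):

#define case_call(key_string, class_name, call)   \
    } else if (strcmp(key, key_string) == 0) {    \
        ((class_name *)object)->call

// The chain then needs an artificial opening and a closing branch:
//
//     if (0) {
//     case_call("key_string", class_name, do_something());
//     case_call("other_key",  other_class, do_other(42));
//     } else {
//         /* unknown key */
//     }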

C++ and FULLY dynamic functions

I have a problem with detours. Detours, as you all know, only give you 5 bytes of space to work with (i.e. a 'jmp' instruction plus a 4-byte address). Because of this it is impossible to have the 'hook' function be a class method: you cannot supply the 'this' pointer because there is simply not enough space (here's the problem more thoroughly explained). So I've been brainstorming all day for a solution, and now I want your thoughts on the subject so I don't begin a 3-5 day project without knowing whether it would be possible or not.
I had 3 goals initially: I wanted the 'hook' functions to be class methods, I wanted the whole approach to be object-oriented (no static functions or global objects) and, the worst/hardest part, I wanted it to be completely dynamic. This is my (in theory) solution: with assembly one can modify functions at runtime (a perfect example is any detouring method). So since I can modify functions dynamically, shouldn't I also be able to create them dynamically? For example: I allocate, let's say, ~30 bytes of memory (through malloc/new). Wouldn't it be possible to just fill those bytes with the binary values corresponding to different assembly instructions (like 0xE9 is 'jmp') and then call the address directly (since it would then contain a function)?
NOTE: I know on beforehand the return value, and all the arguments to all functions that I want to detour, and since I'm using GCC, the thiscall convention is practically identical to the _cdecl one.
So this is my thought/soon-to-be implementation: I create a 'Function' class. Its constructor takes a variable number of arguments, the first of which describes the return value of the target function.
Each remaining argument is a description of an argument the hook will receive (its size, and whether it is a pointer or not). So let's say I want to create a Function class for an int * RandomClass::IntCheckNum(short arg1);. Then I would just have to do this: Function func(Type(4, true), Type(4, true), Type(2, false));, where 'Type' is defined as Type(uint size, bool pointer). Then through assembly I could dynamically create the function (note: this would all be using the _cdecl calling convention), since I can calculate the number of arguments and the total size.
EDIT: In the example, the first Type(4, true) is the return value (int*), the second Type(4, true) is the RandomClass 'this' pointer, and Type(2, false) describes the first argument (short arg1).
With this implementation I could easily have class methods as callbacks, but it would require an extensive amount of assembly code (which I'm not even especially experienced at).
In the end, the only non-dynamic thing would be the methods in my callback class (which also would require pre and post callbacks).
So I wanted to know; is this possible? How much work would it require, and am I way over my head here?
EDIT: I'm sorry if I presented everything a bit fuzzy, but if there is something you want more thoroughly explained, do ask!
EDIT2: I'd also like to know, if I can find the hex values for all assembly operators somewhere? A list would help a ton! And/or if it is possible to somehow 'save' the asm(""); code at a memory address (which I highly doubt).
What you describe is usually called "thunking", and is quite commonly implemented. Historically, the most common purpose has been mapping between 16-bit and 32-bit code (by autogenerating a new 32-bit function that calls an existing 16-bit one or vice versa). I believe some C++ compilers generate similar functions to adjust base class pointers to subclass pointers in multiple inheritance, also.
It certainly seems like a viable solution to your problem, and I don't foresee any huge issues. Just make sure you allocate the memory with any flags needed in your operating system to make sure the memory is executable (most modern OSs give out non-executable memory by default).
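For what it's worth, a minimal sketch of such a thunk on 32-bit x86 using POSIX mmap (on Windows you would use VirtualAlloc with PAGE_EXECUTE_READWRITE instead; on x86-64 a rel32 jmp only reaches targets within roughly ±2 GB):

#include <sys/mman.h>
#include <cstring>

// Builds a 5-byte "jmp target" stub in freshly allocated executable memory.
void *makeJmpThunk(void *target)
{
    void *mem = mmap(0, 4096, PROT_READ | PROT_WRITE | PROT_EXEC,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return 0;

    unsigned char *code = (unsigned char *)mem;
    code[0] = 0xE9;                                      // jmp rel32 opcode
    int rel = (int)((char *)target - (char *)(code + 5));
    std::memcpy(code + 1, &rel, sizeof rel);             // 32-bit displacement
    return code;
}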
You may find this link helpful, particularly if working in Win32: http://www.codeproject.com/Articles/16785/Thunking-in-Win32-Simplifying-Callbacks-to-Non-sta
Regarding finding the hex values of assembly operations, the best reference I know of is the Appendix to the manual of the NASM assembler (and I don't just say that because I helped write it). There's a copy available here: http://www.posix.nl/linuxassembly/nasmdochtml/nasmdoca.html

Does it make sense to use const in an interface or not?

I have a module that performs some calculations and during the calculations, communicates with other modules. Since the calculation module does not want to rely on the other modules, it exposes an interface like this (this is a very simplified version of course):
class ICalculationManager
{
public:
virtual double getValue (size_t index) = 0;
virtual void setValue (size_t index, double value) = 0;
virtual void notify (const char *message) = 0;
};
Applications that want to use the calculation module need to write their own implementation of the interface, and feed it to the calculation tool, like this:
MyCalculationManager calcMgr;
CalculationTool calcTool (calcMgr);
calcTool.calculate();
I am wondering now whether it makes sense to add "const" to the methods of the ICalculationManager interface.
It would seem logical that the getValue method only gets something and doesn't change anything, so I could make this const. And setValue probably changes data so that won't be const.
But for a more general method like notify I can't be sure.
In fact, for none of the methods can I know for sure that the method will really be implemented as a const method, and if I make the interface methods const, I am forcing all implementations to be const as well, which is possibly not wanted.
It seems to me that const methods only make sense if you know beforehand what your implementation will be and whether it will be const or not. Is this true?
Doesn't it make sense to make methods of this kind of interface const? And if it makes sense, what are good rules to determine whether the method should be const or not, even if I don't know what the implementation will be?
EDIT: changed the parameter from notify from "char *" to "const char *" since this lead to irrelevant answers.
You make a function const when you are advertising to clients that calling the function will never change the externally visible state of the object. Your object only has one piece of state that can be retrieved, getValue.
So, if getValue can cause the next getValue to return a different value then sure, leave it non-const. If you want to tell clients that calling getValue() will never change the value returned by the next getValue() then make it const.
Same for notify:
double d1 = mgr->getValue(i);
mgr->notify("SNTH"); // I'm cheating.
double d2 = mgr->getValue(i);
assert(d1==d2);
If that should hold true for all cases and all i's then notify() should be const. Otherwise it should not be.
Yes. One should use const whenever and wherever it is sensible to do so. It doesn't make sense that the method for performing a calculation (which is what your interface suggests) should change its observable behavior because it had notify called on it. (And for that matter, how is notification related to calculation at all?)
By making one of the interface members const, you don't force clients to be const -- you merely allow them the use of a const ICalculationManager.
I would probably make notify const. If clients need to do something non-const as a result of a notification, then notify is not a good method name -- that name suggests non-state-modifying operations such as logging, not modification.
For instance, most of the time you pass your interface around, you're going to want to use pass-by-reference-to-const to pass the interface implementor, but if the methods aren't const, you cannot do that.
The interface should be guiding the implementation, not the other way around. If you haven't decided if a method or parameter can be const or not, you're not done designing.
Using const is a way of making assertions about what the code is or is not allowed to do. This is extremely valuable in reasoning about a piece of code. If your parameter to notify isn't const for example, what changes would it make to the message? How would it make the message larger if it needed to?
Edit: You appear to know the value of declaring a const parameter, so lets build on that. Suppose you want a function to record the value of a calculation:
void RecordCalculation(const ICalculationManager *calculation);
The only methods you'll be able to call on that pointer are the const methods. You can be sure that after the function returns, the object will be unchanged. This is what I meant by reasoning about the code - you can be absolutely certain the object won't be changed, because the compiler will generate an error if you try.
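Concretely, a sketch using the question's interface (and assuming getValue has been declared const while setValue has not):

void RecordCalculation(const ICalculationManager *calculation)
{
    double v = calculation->getValue(0);   // fine: getValue is const
    // calculation->setValue(0, v);        // would not compile: setValue isn't const
}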
Edit 2: If your object contains some internal state that will be modified in response to operations that are logically const, such as a cache or buffer, go ahead and use the mutable keyword on those members. That's what it was invented for.
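For example, a hypothetical implementation that caches results but still presents getValue as const (this assumes the interface itself declares getValue const; std::vector comes from <vector>):

class CachingCalculationManager : public ICalculationManager
{
public:
    virtual double getValue (size_t index) const
    {
        if (!cacheValid)
        {
            refreshCache();    // only touches the mutable members below
            cacheValid = true;
        }
        return cache[index];
    }
    // ... setValue, notify ...
private:
    void refreshCache() const;
    mutable std::vector<double> cache;
    mutable bool cacheValid;
};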
For me it only depends on the contract of your interface.
For a getter method I do not see why it should change any data, and if it does, maybe mutable is an option.
For the setter method I agree: not const there, because it will certainly change data somehow.
For notify it is hard to say without knowing what it means for your system. Also, do you expect the message parameter to be modified by the implementation? If not, it should be const too.
Without reading your entire post: yes, of course it makes sense if you want to use an object (which inherits from ICalculationManager) in a const context. Generally, you should always use the const qualifier if you don't manipulate private data.
EDIT:
Like Mark Ransom said: you need to know exactly how your interface functions should behave, otherwise you're not finished designing.
I know I'm going to get a lot of downvotes for this, but in my opinion the usefulness of const-correctness in C++ is vastly exaggerated. The const idea is primitive (it captures only one bit of a concept: change/don't change) and comes with a high cost that even includes the need for code duplication. Also, it doesn't scale well (consider const_iterators).
What's more important I cannot remember even a single case (not even ONE) in which the const-correctness machinery helped me by spotting a true logical error, that is I was trying to do something that I shouldn't do. Instead every single time the compiler stopped me there was a problem in the const declaration part (i.e. what I was trying to do was logically legit, but a method or a parameter had a problem in the declaration about const-ness).
In all cases I can remember where I got a compiler error related to const-correctness the fix was just adding some missing const keywords or removing some that were in excess... without using the const-correctness idea those errors wouldn't have been there at all.
I like C++, but of course I don't love to death every bit of it (digression: when I interview someone a question I often ask is "what is the part you don't like about <language> ?" ... if the answer is "none" then simply means that who I'm talking to is still in the fanboy stage and clearly doesn't have a big real experience).
There are many parts of C++ that are very good, parts that are IMO horrible (stream formatting, for example) and parts that are not horrible but neither logically beautiful nor practically useful. Const-correctness idea is IMO in this gray area (and this is not a newbie impression... I came to this conclusion after many many lines and years of coding in C++).
Maybe it's me, but apparently const correctness solves a problem that my brain doesn't have... I have many other problems, but not the one of being confused about when I should change an instance's state and when I shouldn't.
Unfortunately (differently from stream formatting) you cannot just ignore the const-correctness machinery in C++ because it's part of the core language, so even if I don't like it I'm forced to comply with it anyway.
You may now say... ok, but what's the answer to the question? It's simply that I wouldn't get too crazy about that part of the semantic description... it's just a single bit and comes with a high price; if you're unsure and you can get away without declaring constness, then don't do it. Constness of references or methods is never a help to the compiler (remember that it can legally be cast away) and it was added to C++ purely as a help for programmers. My experience tells me, however, that (given the high cost and the low return) it's not a real help at all.

How to get rid of void-pointers

I inherited a big application that was originally written in C (but in the mean time a lot of C++ was also added to it). Because of historical reasons, the application contains a lot of void-pointers. Before you start to choke, let me explain why this was done.
The application contains many different data structures, but they are stored in 'generic' containers. Nowadays I would use templated STL containers for it, or I would give all data structures a common base class, so that the container can store pointers to the base class, but in the [good?] old C days, the only solution was to cast the struct-pointer to a void-pointer.
Additionally, there is a lot of code that works on these void-pointers, and uses very strange C constructions to emulate polymorphism in C.
I am now reworking the application, and trying to get rid of the void-pointers. Adding a common base-class to all the data structures isn't that hard (few days of work), but the problem is that the code is full of constructions like shown below.
This is an example of how data is stored:
void storeData (int datatype, void *data); // function prototype
...
Customer *myCustomer = ...;
storeData (TYPE_CUSTOMER, myCustomer);
This is an example of how data is fetched again:
Customer *myCustomer = (Customer *) fetchData (TYPE_CUSTOMER, key);
I actually want to replace all the void-pointers with some smart-pointer (reference-counted), but I can't find a trick to automate (or at least) help me to get rid of all the casts to and from void-pointers.
Any tips on how to find, replace, or interact in any possible way with these conversions?
I actually want to replace all the void-pointers with some smart-pointer (reference-counted), but I can't find a trick to automate (or at least) help me to get rid of all the casts to and from void-pointers.
Such automated refactoring bears many risks.
That said, sometimes I like to play the trick of turning such void* functions into function templates. So this:
void storeData (int datatype, void *data);
becomes:
template <class T>
void storeData (int datatype, T *data);
At first, implement the template by simply wrapping the original (renamed) function and converting the types. That might already let you see potential problems, simply by compiling the code.
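A sketch, assuming the original function is renamed to storeDataRaw:

void storeDataRaw(int datatype, void *data);   // the original, renamed

template <class T>
void storeData(int datatype, T *data)
{
    // The conversion to void* now happens in exactly one place; from here
    // you can tighten things gradually (e.g. disallow T = void, or check
    // that datatype matches T) and let the compiler flag suspect call sites.
    storeDataRaw(datatype, data);
}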
You probably don't need to get rid of the casts to use shared pointers.
storeData(TYPE_CUSTOMER, myCustomer1.get());
shared_ptr<Customer> myCustomer2(reinterpret_cast<Customer*>(fetchData(TYPE_CUSTOMER, "???")));
Of course, this assumes that you don't expect to share the same pointer across calls to store/fetch. In other words, myCustomer1 and myCustomer2 don't share the same pointer.
Apparently, there is no automated way or trick to convert or find all uses of void-pointers. I'll have to find the void-pointers by hand, in combination with PC-Lint, which will give errors whenever there is an incorrect conversion.
Case closed.

What are some 'good use' examples of dynamic casting?

We often hear/read that one should avoid dynamic casting. I was wondering what would be 'good use' examples of it, according to you?
Edit:
Yes, I'm aware of that other thread: it is indeed when reading one of the first answers there that I asked my question!
This recent thread gives an example of where it comes in handy. There is a base Shape class and classes Circle and Rectangle derived from it. In testing for equality, it is obvious that a Circle cannot be equal to a Rectangle and it would be a disaster to try to compare them. While iterating through a collection of pointers to Shapes, dynamic_cast does double duty, telling you if the shapes are comparable and giving you the proper objects to do the comparison on.
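Roughly like this (Shape, Circle and the radius member are placeholders standing in for the linked example):

bool Circle::equals(const Shape &other) const
{
    // dynamic_cast yields NULL when 'other' is not a Circle, so unrelated
    // shapes simply compare unequal instead of being compared field by field.
    const Circle *c = dynamic_cast<const Circle *>(&other);
    return c != NULL && c->radius == radius;
}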
Here's something I do often; it's not pretty, but it's simple and useful.
I often work with template containers that implement an interface,
imagine something like
template<class T>
class MyVector : public ContainerInterface
...
Where ContainerInterface has basic useful stuff, but that's all. If I want a specific algorithm on vectors of integers without exposing my template implementation, it is useful to accept the interface object and dynamic_cast it down to MyVector in the implementation. Example:
// function prototype (public API, in the header file)
void ProcessVector( ContainerInterface& vecIfce );
// function implementation (private, in the .cpp file)
void ProcessVector( ContainerInterface& vecIfce)
{
MyVector<int>& vecInt = dynamic_cast<MyVector<int>&>(vecIfce);
// the cast throws bad_cast in case of error but you could use a
// more complex method to choose which low-level implementation
// to use, basically rolling by hand your own polymorphism.
// Process a vector of integers
...
}
I could add a Process() method to the ContainerInterface that would be polymorphically resolved, it would be a nicer OOP method, but I sometimes prefer to do it this way. When you have simple containers, a lot of algorithms and you want to keep your implementation hidden, dynamic_cast offers an easy and ugly solution.
You could also look at double-dispatch techniques.
HTH
My current toy project uses dynamic_cast twice; once to work around the lack of multiple dispatch in C++ (it's a visitor-style system that could use multiple dispatch instead of the dynamic_casts), and once to special-case a specific subtype.
Both of these are acceptable, in my view, though the former at least stems from a language deficit. I think this may be a common situation, in fact; most dynamic_casts (and a great many "design patterns" in general) are workarounds for specific language flaws rather than something to aim for.
It can be used for a bit of run-time type-safety when exposing handles to objects though a C interface. Have all the exposed classes inherit from a common base class. When accepting a handle to a function, first cast to the base class, then dynamic cast to the class you're expecting. If they passed in a non-sensical handle, you'll get an exception when the run-time can't find the rtti. If they passed in a valid handle of the wrong type, you get a NULL pointer and can throw your own exception. If they passed in the correct pointer, you're good to go.
This isn't fool-proof, but it is certainly better at catching mistaken calls to the libraries than a straight reinterpret cast from a handle, and waiting until some data gets mysteriously corrupted when you pass the wrong handle in.
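A sketch of that handle check (every name here is made up):

class HandleBase { public: virtual ~HandleBase() {} };
class Widget : public HandleBase { /* ... */ };

typedef void *ApiHandle;   // what the C interface hands out

int api_widget_frob(ApiHandle h)
{
    HandleBase *base = static_cast<HandleBase *>(h);
    Widget *w = dynamic_cast<Widget *>(base);   // NULL if it's some other type
    if (w == NULL)
        return -1;   // reject the bad handle instead of corrupting data later
    // ... operate on *w ...
    return 0;
}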
Well it would really be nice with extension methods in C#.
For example let's say I have a list of objects and I want to get a list of all ids from them. I can step through them all and pull them out but I would like to segment out that code for reuse.
so something like
List<myObject> myObjectList = getMyObjects();
List<string> ids = myObjectList.PropertyList("id");
would be cool, except that in the extension method you won't know the type that is coming in.
So
public static List<string> PropertyList(this object objList, string propName) {
var genList = (objList.GetType())objList;
}
would be awesome.
It is very useful; however, most of the time it is too useful: if the easiest way to get the job done is a dynamic_cast, that is more often than not a symptom of bad OO design, which in turn might lead to trouble in the future in unforeseen ways.