provide member pointer to the member itself - c++

I am implementing my own C#-like property class in C++.
So I have to provide access to the internal field (mother::_i) from the property field (mother::i).
I found a few solutions, but none of them is perfect.
First, I made a method that provides the owner's pointer (mother in this case) at runtime, by calling something like RProperty<...>::SetOwner(mother&). But that requires additional code to use my property class and has a runtime cost.
Second, I came up with the idea that the this pointer of RProperty and the member pointer to the property itself are enough to find the owner's pointer: roughly, ownerpointer = this - &mother::i. But providing a member pointer to the member itself gives me a compile-time error. I tried a trick using an 'empty' struct to provide a member pointer to the property. But it turns out sizeof(struct empty) is not zero, so it costs unnecessary extra memory per instance. I have been stuck on this issue for a few days.
Does anyone have a good idea? :)
Code works but not perfect:
#include "stdafx.h"
struct empty{};
template<typename TCLASS, typename TFIELD>
class RPropertyBase
{
protected:
RPropertyBase(){ }
TCLASS& getOwner() const { };
};
template<typename TCLASS, typename TFIELD, TFIELD TCLASS::*PFIELD, empty TCLASS::*PTHIS>
class RProperty : RPropertyBase<TCLASS, TFIELD>
{
protected:
TCLASS& getOwner() const { return *(TCLASS*)((unsigned int)this-(unsigned int)&(((TCLASS*)0)->*PTHIS)-sizeof(empty) ); }
public:
RProperty<TCLASS, TFIELD, PFIELD, PTHIS>& operator=(const TFIELD& A){ getOwner().*PFIELD = A; return *this; }
operator TFIELD&() const { return getOwner().*PFIELD; }
};
class mother
{
int _i;
template<typename C>
struct __Propertyi : public RProperty<C, int, &C::_i, &C::_empty>
{
using RProperty<C, int, &C::_i, &C::_empty>::operator=;
};
public:
empty _empty;
__Propertyi<mother> i;
};
int _tmain(int argc, _TCHAR* argv[])
{
mother a;
a.i = 1;
int bb = (a.i);
return 0;
}

First...
So I have to provide access to the internal field (mother::_i) from the property field (mother::i).
Many identifiers beginning with an underscore are reserved in C++ - only the compiler and its libraries are supposed to use them. Strictly, that covers names beginning with an underscore followed by an uppercase letter, any name beginning with an underscore in the global namespace, and any identifier containing a double underscore (so __Propertyi is reserved). Identifiers with a single trailing underscore such as i_ are OK.
Getting to the point...
ownerpointer = this - &mother::i
It looks like you're trying to subtract a member pointer from a pointer, which you can't do. Member pointers are a bit like offsets into the layout of a type, but this breaks down in two ways...
It's not the abstraction they're designed to provide.
It's not accurate anyway - once you allow for multiple inheritance and virtual inheritance, the offset at which a particular member appears within a type doesn't just depend on its position within the base type in which it's defined, but also on which subtype you're looking at.
If you really want to do pointer arithmetic that's aware of the layout of a type, it's certainly possible, but it's a C programming technique that uses C-level features. There are also some significant limitations on context.
The key idea is that instead of trying to use member pointers as offsets, you use actual offsets. This costs you type-safety but, so long as you wrap the type-unsafe code and make absolutely certain it's correct, you should be OK.
The basic tool is offsetof, which is a macro that C++ inherits from C...
offsetof(structname, membername)
You can look up the implementation of that macro, but don't copy it - the standard requires that a compiler provide some way to implement the macro that works, but the implementation that works for one compiler may not work for another. However, two common approaches are...
Look at the address of the member in an imaginary instance of the struct at address zero. Problems with this (e.g. that imaginary instance obviously doesn't have a valid virtual pointer) are part of the reason for some restrictions.
Use a special "intrinsic" function provided by the compiler, which is one of the reasons why those identifiers with underscores are reserved.
Using that offset, in principle, you can cast your pointer to char* via void*, do your arithmetic, then cast back again to the needed type.
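As a minimal sketch of that recipe: the Owner and Property types and the owner_of helper below are made up for illustration, and Owner is deliberately kept free of virtual functions and base classes so that offsetof is permitted on it.

```cpp
#include <cassert>
#include <cstddef>   // offsetof

// Hypothetical types, for illustration only.
struct Property { int unused; };

struct Owner {       // no virtual functions, no base classes
    int value;
    Property prop;
};

// Recover the enclosing Owner from the address of its prop member.
inline Owner* owner_of(Property* p)
{
    assert(p != nullptr);
    char* raw = static_cast<char*>(static_cast<void*>(p));   // to char* via void*
    raw -= offsetof(Owner, prop);                            // step back to the start of Owner
    return static_cast<Owner*>(static_cast<void*>(raw));     // and back to the real type
}
```

This only holds when p really is the address of the prop member of a live Owner. Passing the address of a free-standing Property gives undefined behaviour, and nothing in the type system will catch it.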
The first problem is obvious - some members (ie the static ones) aren't at a fixed offset in each instance, they're at a fixed address irrespective of the instance. Obvious but perhaps best to say it.
The next problem is from that offsetof documentation I linked...
type shall be a POD class (including unions).
You're looking at the layout of a type. You need that layout to apply to subtypes as well. Because you've discarded the C++ polymorphism abstraction and you're dealing directly with offsets, the compiler can't handle any run-time layout resolution for you. Various inheritance-related issues would invalidate the offset calculations - multiple inheritance, virtual inheritance, a subtype that has a virtual pointer when the base doesn't.
So you need to do your layout with a POD struct. You can get away with single inheritance, but you can't have virtual methods. But there's another annoyance - POD is a bit of an overloaded term that obviously doesn't just relate to whether offsetof is valid or not. A type that has non-POD data members isn't POD.
I hit this problem with a multiway tree data structure. I used offsetof to implement the data structure (because different times). I wrapped this in a template, which used a struct and offsetof to determine the node layouts. In a whole series of compilers and compiler versions this was fine until I switched to a version of GCC, which started warning all over the place.
My question and answer about this on SO are here.
This issue with offsetof may have been addressed in C++11 - I'm not sure. In any case, even though a member within a struct is non-POD, that struct will still have a fixed layout determined at compile time. The offset is OK even if the compiler throws warnings at you, which luckily in GCC can be turned off.
The next problem from that offsetof documentation I linked...
type shall be a standard-layout class (including unions).
This is a new one from C++11 and, to be honest, I haven't really thought about it much myself.
The final problem - actually, the view of a pointer as an address is invalid. Sure, the compiler implements pointers as addresses, but there's lots of technicalities, and compiler writers have been exploiting these in their optimisers.
One area you have to be very careful with once you start doing pointer arithmetic is the compiler's "alias analysis" - how it decides whether two pointers might point to the same thing (in order to decide when it can safely keep values in registers and not refer back to memory to see if a write through an alias pointer changed it). I once asked this question about that, but it turns out the answer I accepted is a problem (I should probably go back and do something about it) because although it describes the problem correctly, the solution it suggests (using union-based puns) is only correct for GCC and not guaranteed by the C++ standard.
In the end, my solution was to hide the pointer arithmetic (and char* pointers) in a set of functions...
#include <cstddef> // std::ptrdiff_t
inline void* Ptr_Add (void* p1, std::ptrdiff_t p2)
{
return (((char*) p1) + p2);
}
inline void* Ptr_Sub (void* p1, std::ptrdiff_t p2)
{
return (((char*) p1) - p2);
}
inline std::ptrdiff_t Ptr_Diff (void* p1, void* p2)
{
return (((char*) p1) - ((char*) p2));
}
inline bool Ptr_EQ (void* p1, void* p2) { return (((char*) p1) == ((char*) p2)); }
inline bool Ptr_NE (void* p1, void* p2) { return (((char*) p1) != ((char*) p2)); }
inline bool Ptr_GT (void* p1, void* p2) { return (((char*) p1) > ((char*) p2)); }
inline bool Ptr_GE (void* p1, void* p2) { return (((char*) p1) >= ((char*) p2)); }
inline bool Ptr_LT (void* p1, void* p2) { return (((char*) p1) < ((char*) p2)); }
inline bool Ptr_LE (void* p1, void* p2) { return (((char*) p1) <= ((char*) p2)); }
That std::ptrdiff_t type is significant too - the bit-width of a pointer isn't guaranteed to match the bit-width of a long.
Outside of these functions, all pointers are either their correct type or void*. C++ treats void* specially (the compiler knows it can alias other pointer types) so it seems to work, though there may be details I'm not remembering. Sorry - these things are hard, especially these days with optimisers that are sometimes clever in the "obnoxious pedant" sense, and I only touch this evil code if I absolutely have to.
One last issue - I already mentioned that pointers aren't addresses. One oddity is that on some platforms, two different pointers may map to the same address in different address spaces - see for example the Harvard Architecture which has different address spaces for instructions. So even the offset between two pointers is invalid except within certain limits, no doubt described in complicated detail in the standard. A single struct is a single struct - obviously it lives on one address space, with the possible exception of static members - but don't just assume pointer arithmetic is always valid.
Long story short - yes, it's possible to subtract the offset of a member from the address of a member to find the address of the struct, but you have to use actual offsets (not member pointers) and there are limitations and technicalities that may mean you can't even solve your problem this way (e.g. I'm not sure you'll be able to use offsets as template parameters), and certainly mean it's harder than it seems.
Ultimately, the take-away advice is that if you read this, treat it as a warning. Don't do the things I've done. I wish I hadn't, and probably so will you.

Related

Placement new based on template sizeof()

Is this legal in c++11? Compiles with the latest intel compiler and appears to work, but I just get that feeling that it is a fluke.
class cbase
{
virtual void call();
};
template<typename T> class functor : public cbase
{
public:
functor(T& obj, void (T::*pfunc)())
: _obj(obj), _pfunc(pfunc) {}
virtual void call()
{
(_obj.*_pfunc)();
}
private:
T& _obj;
void (T::*_pfunc)();
//edited: this is no good:
//const static int size = sizeof(_obj) + sizeof(_pfunc);
};
class signal
{
public:
template<typename T> void connect(T& obj, void (T::*pfunc)())
{
_ptr = new (&space) functor<T>(obj, pfunc);
}
private:
cbase* _ptr;
class _generic_object {};
typename aligned_storage<sizeof(functor<_generic_object>),
alignment_of<functor<_generic_object>>::value>::type space;
//edited: this is no good:
//void* space[(c1<_generic_object>::size / sizeof(void*))];
};
Specifically I'm wondering if void* space[(c1<_generic_object>::size / sizeof(void*))]; is really going to give the correct size for c1's member objects (_obj and _pfunc). (It isn't).
EDIT:
So after some more research it would seem that the following would be (more?) correct:
typename aligned_storage<sizeof(c1<_generic_object>),
alignment_of<c1<_generic_object>>::value>::type space;
However, upon inspecting the generated assembly, using placement new with this space seems to inhibit the compiler from optimizing away the call to 'new' (which seemed to happen when using just a regular '_ptr = new c1;').
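For reference, here is a minimal sketch of the aligned_storage idea as a complete piece of code. The names mini_signal, generic_object, disconnect, emit and Counter are made up for this sketch. It assumes every functor<T> has the same size as functor<generic_object>, which is not guaranteed (some compilers vary the size of member function pointers by class), so that assumption is checked with static_assert rather than trusted.

```cpp
#include <cassert>
#include <new>          // placement new
#include <type_traits>  // std::aligned_storage

class cbase {
public:
    virtual void call() = 0;
    virtual ~cbase() {}
};

template <typename T>
class functor : public cbase {
public:
    functor(T& obj, void (T::*pfunc)()) : obj_(obj), pfunc_(pfunc) {}
    void call() override { (obj_.*pfunc_)(); }
private:
    T& obj_;
    void (T::*pfunc_)();
};

// Hypothetical stand-in for the signal class in the question.
class mini_signal {
    class generic_object {};
    typedef std::aligned_storage<sizeof(functor<generic_object>),
                                 alignof(functor<generic_object>)>::type storage;
public:
    mini_signal() : ptr_(nullptr) {}
    ~mini_signal() { disconnect(); }
    mini_signal(const mini_signal&) = delete;
    mini_signal& operator=(const mini_signal&) = delete;

    template <typename T>
    void connect(T& obj, void (T::*pfunc)()) {
        // The assumption made explicit: functor<T> must fit the generic buffer.
        static_assert(sizeof(functor<T>) <= sizeof(storage), "functor<T> too large");
        static_assert(alignof(functor<T>) <= alignof(storage), "functor<T> over-aligned");
        disconnect();
        ptr_ = new (&space_) functor<T>(obj, pfunc);   // construct in place, no heap allocation
    }
    void disconnect() {
        // Objects made with placement new need an explicit destructor call.
        if (ptr_) { ptr_->~cbase(); ptr_ = nullptr; }
    }
    void emit() {
        assert(ptr_ && "emit() called before connect()");
        ptr_->call();
    }
private:
    cbase* ptr_;
    storage space_;
};

// Hypothetical receiver used to exercise the sketch.
struct Counter { int hits = 0; void bump() { ++hits; } };
```

Connecting a Counter with `s.connect(c, &Counter::bump)` and calling `s.emit()` twice leaves `c.hits` at 2. If some T produces a larger functor, the build fails at the static_assert instead of overrunning the buffer.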
EDIT2: Changed the code to make intentions a little clearer.
const static int size = sizeof(_obj) + sizeof(_pfunc); will give the sum of the sizes of the members, but that may not be the same as the size of the class containing those members. The compiler is free to insert padding between members or after the last member. As such, adding together the sizes of the members approximates the smallest that object could possibly be, but doesn't necessarily give the size of an object with those members.
In fact, the size of an object can vary depending not only on the types of its members, but also on their order. For example:
struct A {
int a;
char b;
};
vs:
struct B {
char b;
int a;
};
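With only two members, the order usually changes where the padding goes rather than the total: with a 4-byte int, A typically gets 3 bytes of padding after b, while B gets 3 bytes of padding between b and a, and both come to 8 bytes. With more members the total changes too: a struct laid out as char, int, char typically takes 12 bytes, while int, char, char takes 8.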
As such, your space may not contain enough...space to hold the object you're trying to create there in init.
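To check what a particular compiler does, a small sketch along these lines can help. The struct names are made up here, and the sizes mentioned in the comments are what is typical with a 4-byte int, not something the standard promises.

```cpp
#include <cstddef>

struct CharIntChar { char a; int b; char c; };   // typically padded after a and after c
struct IntCharChar { int b; char a; char c; };   // typically only trailing padding

// Sum of the member sizes: the figure that adding up sizeof of each member computes.
constexpr std::size_t member_sum = sizeof(char) + sizeof(int) + sizeof(char);

// The whole struct can be larger than the sum of its members, never smaller.
static_assert(sizeof(CharIntChar) >= member_sum, "struct smaller than its members");
static_assert(sizeof(IntCharChar) >= member_sum, "struct smaller than its members");
```

Printing sizeof(CharIntChar), sizeof(IntCharChar) and member_sum side by side shows how much of each struct is padding on your platform.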
I think you just got lucky; Jerry's answer points out that there may be padding issues. What I think you have is a non-virtual class (i.e., no vtable), with essentially two pointers (under the hood).
That aside, the arithmetic: (c1<_generic_object>::size / sizeof(void*)) is flawed because it will truncate if size is not a multiple of sizeof(void *). You would need something like:
((c1<_generic_object>::size + sizeof(void *) - 1) / sizeof(void *))
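The same rounding can be written as a small helper, sketched below. The name slots_for is made up for this example.

```cpp
#include <cstddef>

// Number of void*-sized slots needed to hold `bytes` bytes, rounding up.
constexpr std::size_t slots_for(std::size_t bytes)
{
    return (bytes + sizeof(void*) - 1) / sizeof(void*);
}

static_assert(slots_for(0) == 0, "nothing needs no slots");
static_assert(slots_for(1) == 1, "a single byte still needs a whole slot");
static_assert(slots_for(sizeof(void*)) == 1, "exact multiples are unchanged");
static_assert(slots_for(sizeof(void*) + 1) == 2, "one byte over rolls to the next slot");
```

Plain division would give one slot too few in the last case, which is exactly the truncation described above.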
This code does not even get to padding issues, because it has a few more immediate ones.
Template class c1 is defined to contain a member T &_obj of reference type. Applying sizeof to _obj in scope of c1 will evaluate to the size of T, not to the size of reference member itself. It is not possible to obtain the physical size of a reference in C++ (at least directly). Meanwhile, any actual object of type c1<T> will physically contain a reference to T, which is typically implemented in such cases as a pointer "under the hood".
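A small sketch of that behaviour, with made-up types Big and Holder: sizeof applied to the reference member reports the size of the referenced type, while the enclosing object only needs room for however the compiler implements the reference.

```cpp
#include <cstddef>

struct Big { char bytes[64]; };   // hypothetical referenced type
struct Holder { Big& ref; };      // holds nothing but a reference

// sizeof applied to a reference yields the size of the referenced type...
constexpr std::size_t size_via_member = sizeof(Holder::ref);
// ...while the object itself is sized by the reference's implementation.
constexpr std::size_t size_of_holder = sizeof(Holder);

static_assert(size_via_member == sizeof(Big), "sizeof sees through the reference");
```

On common platforms size_of_holder comes out as the size of a pointer, but that is an implementation detail and not something the language guarantees.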
For this reason it is completely unclear to me why the value of c1<_generic_object>::size is used as a measure of memory required for in-place construction of an actual object of type c1<T> (for any T). It just doesn't make any sense. These sizes are not related at all.
By pure luck the size of an empty class _generic_object might evaluate to the same (or greater) value as the size of a physical implementation of a reference member. In that case the code will allocate a sufficient amount of memory. One might even claim that the sizeof(_generic_object) == sizeof(void *) equality will "usually" hold in practice. But that would be just a completely arbitrary coincidence with no meaningful basis whatsoever.
This even looks like a red herring deliberately inserted into the code for the purpose of pure obfuscation.
P.S. In GCC sizeof of an empty class actually evaluates to 1, not to any "aligned" size. Which means that the above technique is guaranteed to initialize c1<_generic_object>::size with a value that is too small. More specifically, in 32 bit GCC the value of c1<_generic_object>::size will be 9, while the actual size of any c1<some_type_t> will be 12 bytes.

C++: treating structure members as an array

I have a structure containing lots (like, hundreds) of pointers. Each pointer is a different type, but they all inherit from a common base class --- let's call it Base. I'm using multiple inheritance. (This is all machine-generated, which is why it's weird.)
e.g.:
class Data {
Class1* p1;
Class2* p2;
Class3* p3;
...etc...
};
I want to call a method defined in Base on all of these. I can generate code like this:
void callFooOnData(Data* p)
{
if (p->p1) p->p1->Foo();
if (p->p2) p->p2->Foo();
if (p->p3) p->p3->Foo();
...etc...
}
The problem is, I've got lots and lots of these, of a billion different kinds of Data, and the above code ends up being very large and is affecting my footprint. So I'm trying to replace this with something smarter.
In C, I could simply take the address of the first structure member and the count, and do something like:
void callFooOnData(Data* p)
{
callFooOnObjects(&p->p1, 3);
}
But, of course, I can't do this in C++, because the structure members aren't a uniform type, and casting them to Base* may involve changing them, and this may involve changing the pointer, and because they're not a uniform type the pointers will have to be changed differently for each member, and I can't do that.
Is there a way to do something like this in C++? This is all machine generated code, so it doesn't have to be pretty, thankfully --- that shark has already been jumped --- but I do need it to be as portable as possible...
(I do have access to RTTI, but would like to avoid it if possible for performance reasons.)
Update:
So, as far as I can tell... you just can't do this in C++. Simply can't be done. I was misled by the fact that it's totally straightforward in C. In C++ you can only safely cast a pointer between two types in a class hierarchy if you have a well typed pointer to begin with, and of course, I don't.
So I'm going to have to change the underlying problem to avoid this: the solution I've come up with is to store every pointer in the structure twice, once in an array of Base* for iterating over, and once as a ClassWhatever* for calling methods on. Which sucks, of course, because it's going to double the size of the Data structure.
So if anyone would like to confirm this (I would love to be proven wrong), I will happily mark their answer as correct...
Each pointer is a different type, but they all inherit from a common base class --- let's call it Base
Instead of having hundreds of members, just keep a container of Base class pointers:
class Data {
std::vector<Base*> objects;
};
In a good design, you don't really need to know the type of each object, and it's actually better if it's abstracted away. Remember, it's always good to program against interfaces, not concrete classes.
and casting them to Base* may involve changing them
Not really, for public inheritance the cast should be implicit.
If you only call Foo() on all of them, Foo() can be a virtual method in the base class, that way you take full advantage of polymorphism.
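A minimal sketch of this suggestion follows. The classes Other, Class1 and Class2 are made-up stand-ins for the generated types, and Class2 uses multiple inheritance so that the conversion to Base* actually adjusts the pointer.

```cpp
#include <vector>

struct Other { virtual ~Other() {} int padding = 0; };   // hypothetical extra base
struct Base  { virtual ~Base() {} virtual void Foo() = 0; };

struct Class1 : Base        { int calls = 0; void Foo() override { ++calls; } };
struct Class2 : Other, Base { int calls = 0; void Foo() override { ++calls; } };  // Base is not the first base

struct Data {
    // The conversion to Base* happens once, when each pointer is stored.
    std::vector<Base*> objects;
};

// Calls Foo() on every non-null entry and returns how many calls were made.
inline int callFooOnData(Data& d)
{
    int called = 0;
    for (Base* object : d.objects) {
        if (object) { object->Foo(); ++called; }
    }
    return called;
}
```

Because each pointer is converted while its static type is still known, the loop needs no casts and no RTTI. The cost is that the concrete type is no longer available from the container alone.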
But, of course, I can't do this in C++, because the structure members
aren't a uniform type, and casting them to Base* may involve changing
them, and this may involve changing the pointer, and because they're
not a uniform type the pointers will have to be changed differently
for each member, and I can't do that.
Although it's a bad idea, this is not true:
you can iterate over the pointers using pointer arithmetic like you do in C. The pointers are all the same size, and they have a common base class (not considering structure alignment problems etc). It's dirty but technically it works. That would be different if you hold the instances themselves as members in the class, not pointers to them.
The best way in C++11 would be to use
std::vector< std::unique_ptr<Base> > objects;
see http://www.drdobbs.com/cpp/c11-uniqueptr/240002708
You must "fight" with auto-generated code by auto-generating your code too. You can use tools like ctags to "parse" your auto-generated classes and auto-generate your code from ctags output. See http://ctags.sourceforge.net/ctags.html.
You can also try to cast your Data to tuple as advised in this, possible duplicate, question: Iterate through struct variables.
I am not sure which is faster...
If you can modify tool which auto-generates this source code - maybe best would be to extend this tool...
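Another option (a sketch; it assumes the generated members of Data are accessible from here):
class DataManipulator
{
public:
explicit DataManipulator(const Data& aData_in)
{
// each push_back converts the typed pointer to Base*
objects.push_back(aData_in.p1);
objects.push_back(aData_in.p2);
// ... one line per generated member
}
void operate()
{
for (std::vector<Base*>::iterator it = objects.begin(); it != objects.end(); ++it)
{
if (*it != NULL)
(*it)->Foo();
}
}
private:
std::vector<Base*> objects;
};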
I did not want to change my previous answer. However, I came up with another solution which should work for you. Full example here: http://ideone.com/u22FO
The main part below:
struct C {
A1* p1;
A2* p2;
A3* p3;
A4* p4;
// ...
};
template <class D, size_t startNumber, size_t numMembers, bool notFinished>
struct FooCaller;
template <class D, size_t startNumber>
struct FooSingleCall;
template <class D, size_t startNumber, size_t numMembers>
struct FooCaller<D, startNumber, numMembers, false> {
void operator() (D& d) {}
};
template <class D, size_t startNumber, size_t numMembers>
struct FooCaller<D, startNumber, numMembers, true> {
void operator() (D& d) {
FooSingleCall<D,startNumber>()(d);
FooCaller<D, startNumber + 1, numMembers, startNumber < numMembers>()(d);
}
};
#define FooSingleCallD(n) \
template <class D> \
struct FooSingleCall<D,n>{ \
void operator() (D& d) { \
d.p##n->foo(); \
} \
}
FooSingleCallD(1);
FooSingleCallD(2);
FooSingleCallD(3);
FooSingleCallD(4);
// ... unfortunately repeat as many times as needed
template <class D, size_t numMembers>
void callFoo(D& d)
{
FooCaller<D, 1, numMembers, 1 <= numMembers>()(d);
}
Beware: be smart enough not to define hundreds of FooSingleCall by hand... See one of the answers to this famous question: https://stackoverflow.com/a/4581720/1463922 and you will see how to make 1000 instances in just 5 lines...
Also, please replace FooSingleCall with something retrieving N-pointer from your class - something like GetNPointer...
All the pointers are the same size (all class object pointers in C++ are the same size), and in practice the compiler needs to be rather perverse in order to insert padding anywhere in this list of pointers. Anyway, the problem of possible padding is the same as in C. So you can do just the same as in C, no problem at all.
void foo( Base const* );
void bar()
{
// ...
foo( *(&thingy.p1 + 3) );
}
It's that easy.
That said, even with machine generated code the design sounds horrible, wrong, really really bad.
One does get something of that sort when generating vtables (where each pointer is to a function of generally different signature), but it's very rare. So, this sounds like an XY problem. Like, you’re trying to solve problem X, have come up with an ungood solution Y, and are now asking about imagined solution Y instead of the real problem X…

Class identity without RTTI

I've found a simple solution somewhere on the internet for identifying a class without built-in C++ RTTI.
template <typename T>
class Identity {
public:
static int64_t id()
{
static int64_t dummy;
return reinterpret_cast<int64_t>(&dummy);
}
};
When we need some class ID, we just use:
Identity<OurClass>::id();
I'm wondering, are there any collisions? Can it return the same ID for the different classes, or the different ID for the same classes? I have tried this code with g++ with different optimization values, everything seems ok.
First off: there is such an integral type that is made specifically to contain pointers:
intptr_t
and in C++11 uintptr_t
Second, even though in practice on gcc they are equal, the size of a pointer to an object and the size of a function pointer (or pointer to member) might well be different. Therefore it would be better using a specific object rather than the method itself (for Standard conformance).
Third, it only gives you identity, while RTTI is much richer, as it knows about all the subclasses a given object can be cast to, and even allows cross-casts or casts across virtual inheritance.
Still, the corrected version can be useful I guess:
struct Foo {
static intptr_t Id() {
static boost::none_t const Dummy = {};
return reinterpret_cast<intptr_t>(&Dummy);
}
};
And in hierarchies, having a virtual function returning that ID.
For completeness, I'll mention that Clang and LLVM have their own way of dealing with object identification without RTTI. You may want to read about their way of implementing isa, cast and dyn_cast here.
This solution casts a function pointer to an int. There is no guarantee that this pointer fits into an int, although in practice sizeof(void *) == sizeof(void (*)()) <= sizeof(int)
Edit: My bad. On x86_64 sizeof(int) = 4, sizeof(void (*)()) = 8, so collisions are possible and are unpredictable.
You can cast to an integral of appropriate size, but still it is undefined behavior theoretically.
This version avoids undefined behavior (and compiler warnings):
template <typename T>
class Identity {
public:
static const int* id() { static const int id = 0; return &id; }
};
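A short sketch for checking the two properties asked about (same ID for the same type, different IDs for different types). It is a variation on the version above that returns const void* and uses a char marker, and the Apple and Pear types are made up for the check.

```cpp
template <typename T>
class Identity {
public:
    // One function-local static exists per instantiation; its address is the ID.
    static const void* id()
    {
        static const char marker = 0;
        return &marker;
    }
};

// Hypothetical types used only to exercise the template.
struct Apple {};
struct Pear {};
```

Within a single program image, Identity<Apple>::id() compares equal to itself and unequal to Identity<Pear>::id(), because distinct objects have distinct addresses. One thing worth checking on your own platform: if the template is instantiated in several shared libraries, whether they end up sharing one marker depends on linker and symbol visibility settings.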

How do boost::variant and boost::any work?

How do variant and any from the boost library work internally? In a project I am working on, I currently use a tagged union. I want to use something else, because unions in C++ don't let you use objects with constructors, destructors or overloaded assignment operators.
I queried the size of any and variant, and did some experiments with them. On my platform, variant takes the size of its longest possible type plus 8 bytes: I think it may just be 8 bytes of type information, with the rest being the stored value. On the other hand, any just takes 8 bytes. Since I'm on a 64-bit platform, I guess any just holds a pointer.
How does Any know what type it holds? How does Variant achieve what it does through templates? I would like to know more about these classes before using them.
If you read the boost::any documentation they provide the source for the idea: http://www.two-sdg.demon.co.uk/curbralan/papers/ValuedConversions.pdf
It's basic information hiding, an essential C++ skill to have. Learn it!
Since the highest voted answer here is totally incorrect, and I have my doubts that people will actually go look at the source to verify that fact, here's a basic implementation of an any like interface that will wrap any type with an f() function and allow it to be called:
struct f_any
{
f_any() : ptr() {}
~f_any() { delete ptr; }
bool valid() const { return ptr != 0; }
void f() { assert(ptr); ptr->f(); }
struct placeholder
{
virtual ~placeholder() {}
virtual void f() const = 0;
};
template < typename T >
struct impl : placeholder
{
impl(T const& t) : val(t) {}
void f() const { val.f(); }
T val;
};
// ptr can now point to the entire family of
// struct types generated from impl<T>
placeholder * ptr;
template < typename T >
f_any(T const& t) : ptr(new impl<T>(t)) {}
// assignment, etc...
};
boost::any does the same basic thing, except that instead of f() the virtual function returns a std::type_info const&, which gives any_cast the information it needs to work.
The key difference between boost::any and boost::variant is that any can store any type, while variant can store only one of a set of enumerated types. The any type stores a void* pointer to the object, as well as a typeinfo object to remember the underlying type and enforce some degree of type safety. In boost::variant, it computes the maximum sized object, and uses "placement new" to allocate the object within this buffer. It also stores the type or the type index.
Note that if you have Boost installed, you should be able to see the source files in "any.hpp" and "variant.hpp". Just search for "include/boost/variant.hpp" and "include/boost/any.hpp" in "/usr", "/usr/local", and "/opt/local" until you find the installed headers, and you can take a look.
Edit
As has been pointed out in the comments below, there was a slight inaccuracy in my description of boost::any. While it can be implemented using void* (and a templated destroy callback to properly delete the pointer), the actual implementation uses any::placeholder*, with any::holder<T> as subclasses of any::placeholder for unifying the type.
boost::any just snapshots the typeinfo while the templated constructor runs: it has a pointer to a non-templated base class that provides access to the typeinfo, and the constructor instantiates a type-specific derived class satisfying that interface. The same technique can actually be used to capture other common capabilities of a set of types (e.g. streaming, common operators, specific functions), though boost doesn't offer control of this.
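The type-capturing step can be sketched in miniature as follows. This is an illustration of the technique only, not boost::any's actual source, and the names mini_any and get are made up here.

```cpp
#include <typeinfo>

// Hypothetical miniature of the technique.
class mini_any {
    struct placeholder {
        virtual ~placeholder() {}
        virtual const std::type_info& type() const = 0;
    };
    template <typename T>
    struct holder : placeholder {
        explicit holder(const T& v) : value(v) {}
        const std::type_info& type() const override { return typeid(T); }
        T value;
    };
    placeholder* content_;
public:
    // The stored type is captured here, while T is still known.
    template <typename T>
    explicit mini_any(const T& v) : content_(new holder<T>(v)) {}
    ~mini_any() { delete content_; }
    mini_any(const mini_any&) = delete;              // copying omitted to keep the sketch short
    mini_any& operator=(const mini_any&) = delete;

    const std::type_info& type() const { return content_->type(); }

    // Returns a pointer to the stored value, or nullptr if T is not the stored type.
    template <typename T>
    T* get()
    {
        if (type() != typeid(T)) return nullptr;
        return &static_cast<holder<T>*>(content_)->value;
    }
};
```

The object itself is a single pointer, which matches the 8-byte size observed on a 64-bit platform; the value and its type information live in the heap-allocated holder.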
boost::variant is conceptually similar to what you've done before, but by not literally using a union and instead taking a manual approach to placement construction/destruction of objects in its buffer (while handling alignment issues explicitly) it works around the restrictions that C++ has re complex types in actual unions.
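A rough two-alternative sketch of that approach is shown below: a raw, suitably aligned buffer, a tag, and manual placement construction and destruction. The class int_or_string and its member names are made up, and boost::variant itself is far more general than this.

```cpp
#include <cassert>
#include <cstddef>
#include <new>      // placement new
#include <string>
#include <utility>  // std::move

// Hypothetical sketch; copying is omitted to keep it short.
class int_or_string {
    typedef std::string string_t;
    enum which_t { holds_int, holds_string };
    static constexpr std::size_t size_ =
        sizeof(int) > sizeof(string_t) ? sizeof(int) : sizeof(string_t);
    static constexpr std::size_t align_ =
        alignof(int) > alignof(string_t) ? alignof(int) : alignof(string_t);

    which_t which_;
    alignas(align_) unsigned char buffer_[size_];   // big and aligned enough for either type

    void destroy() {
        // int needs no destructor call; the string does.
        if (which_ == holds_string) reinterpret_cast<string_t*>(buffer_)->~string_t();
    }
public:
    explicit int_or_string(int v) : which_(holds_int) { new (buffer_) int(v); }
    explicit int_or_string(const string_t& s) : which_(holds_string) { new (buffer_) string_t(s); }
    ~int_or_string() { destroy(); }
    int_or_string(const int_or_string&) = delete;
    int_or_string& operator=(const int_or_string&) = delete;

    void set(int v) { destroy(); new (buffer_) int(v); which_ = holds_int; }
    void set(const string_t& s) {
        string_t copy(s);   // copy first: s may refer to the string stored in the buffer
        destroy();
        new (buffer_) string_t(std::move(copy));
        which_ = holds_string;
    }

    bool is_string() const { return which_ == holds_string; }
    int as_int() const {
        assert(which_ == holds_int);
        return *reinterpret_cast<const int*>(buffer_);
    }
    const string_t& as_string() const {
        assert(which_ == holds_string);
        return *reinterpret_cast<const string_t*>(buffer_);
    }
};
```

This also accounts for the size observation in the question: the object is the largest alternative plus a tag, rounded up for alignment.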

A recurring const-conundrum

I often find myself having to define two versions of a function in order to have one that is const and one which is non-const (often a getter, but not always). The two vary only by the fact that the input and output of one is const, while the input and output of the other is non-const. The guts of the function - the real work, is IDENTICAL.
Yet, for const-correctness, I need them both. As a simple practical example, take the following:
inline const ITEMIDLIST * GetNextItem(const ITEMIDLIST * pidl)
{
return pidl ? reinterpret_cast<const ITEMIDLIST *>(reinterpret_cast<const BYTE *>(pidl) + pidl->mkid.cb) : NULL;
}
inline ITEMIDLIST * GetNextItem(ITEMIDLIST * pidl)
{
return pidl ? reinterpret_cast<ITEMIDLIST *>(reinterpret_cast<BYTE *>(pidl) + pidl->mkid.cb) : NULL;
}
As you can see, they do the same thing. I can choose to define one in terms of the other using yet more casts, which is more appropriate if the guts - the actual work, is less trivial:
inline const ITEMIDLIST * GetNextItem(const ITEMIDLIST * pidl)
{
return pidl ? reinterpret_cast<const ITEMIDLIST *>(reinterpret_cast<const BYTE *>(pidl) + pidl->mkid.cb) : NULL;
}
inline ITEMIDLIST * GetNextItem(ITEMIDLIST * pidl)
{
return const_cast<ITEMIDLIST *>(GetNextItem(const_cast<const ITEMIDLIST *>(pidl)));
}
So, I find this terribly tedious and redundant. But if I wish to write const-correct code, then I either have to supply both of the above, or I have to litter my "consumer-code" with const-casts to get around the problems of having only defined one or the other.
Is there a better pattern for this? What is the "best" approach to this issue in your opinion:
providing two copies of a given function - the const and non-const versions
or just one version, and then requiring consumers of that code to do their casts as they will?
Or is there a better approach to the issue entirely?
Is there work being done on the language itself to mitigate or obviate this issue entirely?
And for bonus points:
do you find this to be an unfortunate by-product of the C++ const-system
or do you find this to be tantamount to touching the very heights of mount Olympus?
EDIT:
If I supply only the first - takes const returns const, then any consumer that needs to modify the returned item, or hand the returned item to another function that will modify it, must cast off the constness.
Similarly, if I supply only the second definition - takes non-const and returns non-const, then a consumer that has a const pidl must cast off the constness in order to use the above function, which honestly, doesn't modify the constness of the item itself.
Maybe more abstraction is desirable:
THING & Foo(THING & it);
const THING & Foo(const THING & it);
I would love to have a construct:
const_neutral THING & Foo(const_neutral THING & it);
I certainly could do something like:
THING & Foo(const THING & it);
But that's always rubbed me the wrong way. I am saying "I don't modify the contents of your THING, but I'm going to get rid of the constness that you entrusted me with silently for you in your code."
Now, a client, which has:
const THING & it = GetAConstThing();
...
ModifyAThing(Foo(it));
That's just wrong. GetAConstThing's contract with the caller is to give it a const reference. The caller is expected NOT TO MODIFY the thing - only use const-operations on it. Yes, the caller can be evil and wrong and cast away that constness of it, but that's just Evil(tm).
The crux of the matter, to me, is that Foo is const-neutral. It doesn't actually modify the thing it's given, but its output needs to propagate the constness of its argument.
NOTE: edited a 2nd time for formatting.
IMO this is an unfortunate by-product of the const system, but it doesn't come up that often: only when functions or methods give out pointers/references to something (whether or not they modify something, a function can't hand out rights that it doesn't have or const-correctness would seriously break, so these overloads are unavoidable).
Normally, if these functions are just one short line, I'd just reduplicate them. If the implementation is more complicated, I've used templates to avoid code reduplication:
namespace
{
//here T is intended to be either [int] or [const int]
//basically you can also assert at compile-time
//whether the type is what it is supposed to be
template <class T>
T* do_foo(T* p)
{
return p; //suppose this is something more complicated than that
}
}
int* foo(int* p)
{
return do_foo(p);
}
const int* foo(const int* p)
{
return do_foo(p);
}
int main()
{
int* p = 0;
const int* q = foo(p); //non-const version
foo(q); //const version
}
The real problem here appears to be that you're providing the outside world with (relatively) direct access to the internals of your class. In a few cases (e.g., container classes) that can make sense, but in most it means you're providing low-level access to the internals as dumb data, where you should be looking at the higher-level operations that client code does with that data, and then provide those higher-level operations directly from your class.
Edit: While it's true that in this case there's apparently no class involved, the basic idea remains the same. I don't think it's shirking the issue either -- I'm simply pointing out that while I agree that it is an issue, it's one that arises only rather infrequently.
I'm not sure low-level code justifies such things either. Most of my code is much lower level than most people ever have much reason to work with, and I still only encounter it rather infrequently.
Edit2: I should also mention that C++ 0x has a new definition of the auto keyword, along with a new keyword (decltype) that make a fair number of things like this considerably easier to handle. I haven't tried to implement this exact function with them, but this general kind of situation is the sort of thing for which they're intended (e.g., automatically figuring out a return type based on passed arguments). That said, they normally do just a bit more than you want, so they might be a bit clumsy (if useful at all) for this exact situation.
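As a rough illustration of the general idea (not of GetNextItem itself), a trailing return type with decltype lets the return type follow the constness of the argument. The Buffer struct and the first_elem and demo_decltype names below are hypothetical, and this assumes a C++11 compiler.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical type used only for this sketch.
struct Buffer {
    int data[4];
};

// B is deduced as Buffer or const Buffer; decltype then yields
// int& or const int& respectively.
template <class B>
auto first_elem(B& b) -> decltype(b.data[0]) {
    return b.data[0];
}

// Small usage check (hypothetical helper).
bool demo_decltype() {
    Buffer buf = {{1, 2, 3, 4}};
    const Buffer& cbuf = buf;

    static_assert(std::is_same<decltype(first_elem(buf)), int&>::value,
                  "non-const argument gives a mutable reference");
    static_assert(std::is_same<decltype(first_elem(cbuf)), const int&>::value,
                  "const argument gives a const reference");

    first_elem(buf) = 9;        // allowed: buf is non-const
    // first_elem(cbuf) = 9;    // would not compile
    assert(buf.data[0] == 9);
    return first_elem(cbuf) == 9;
}
```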
I don't believe it's the deficiency of const-correctness per se, but rather the lack of convenient ability to generalize a method over cv-qualifiers (in the same way we can generalize over types via templates). Hypothetically, imagine if you could write something like:
template<cvqual CV>
inline CV ITEMIDLIST* GetNextItem(CV ITEMIDLIST * pidl)
{
return pidl ? reinterpret_cast<CV ITEMIDLIST *>(reinterpret_cast<CV BYTE *>(pidl) + pidl->mkid.cb) : NULL;
}
ITEMIDLIST o;
const ITEMIDLIST co;
ITEMIDLIST* po = GetNextItem(&o); // CV is deduced to be nothing
const ITEMIDLIST* pco = GetNextItem(&co); // CV is deduced to be "const"
Now you can actually do this kind of thing with template metaprogramming, but this gets messy real quick:
template<class T, class TProto>
struct make_same_cv_as {
typedef T result;
};
template<class T, class TProto>
struct make_same_cv_as<T, const TProto> {
typedef const T result;
};
template<class T, class TProto>
struct make_same_cv_as<T, volatile TProto> {
typedef volatile T result;
};
template<class T, class TProto>
struct make_same_cv_as<T, const volatile TProto> {
typedef const volatile T result;
};
template<class CV_ITEMIDLIST>
inline CV_ITEMIDLIST* GetNextItem(CV_ITEMIDLIST* pidl)
{
return pidl ? reinterpret_cast<CV_ITEMIDLIST*>(reinterpret_cast<typename make_same_cv_as<BYTE, CV_ITEMIDLIST>::result*>(pidl) + pidl->mkid.cb) : NULL;
}
The problem with the above is the usual problem with all templates: it'll let you pass an object of any random type so long as it has members with the proper names, not just ITEMIDLIST. You can use various "static assert" implementations, of course, but that's also a hack in and of itself.
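To make the "static assert" remark concrete, here is a minimal sketch that assumes C++11's static_assert and <type_traits> are available. Item is a hypothetical stand-in for ITEMIDLIST (the real Windows type is not used here), and NextItem and demo_next_item are illustrative names. It handles const only, not volatile.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for ITEMIDLIST: a byte count followed by payload.
struct Item {
    unsigned short cb;        // size of this item in bytes
    unsigned char data[6];
};

template <class CvItem>
CvItem* NextItem(CvItem* p) {
    // Reject anything that is not Item or const Item.
    static_assert(
        std::is_same<typename std::remove_const<CvItem>::type, Item>::value,
        "NextItem only accepts Item or const Item");

    // Pick a byte type with the same constness as CvItem.
    typedef typename std::conditional<std::is_const<CvItem>::value,
                                      const unsigned char,
                                      unsigned char>::type Byte;

    return p ? reinterpret_cast<CvItem*>(reinterpret_cast<Byte*>(p) + p->cb)
             : nullptr;
}

// Small usage check (hypothetical helper).
bool demo_next_item() {
    Item items[2] = {};
    items[0].cb = sizeof(Item);   // the next item starts right after this one

    Item* p = NextItem(&items[0]);        // CvItem = Item
    const Item* first = &items[0];
    const Item* q = NextItem(first);      // CvItem = const Item

    static_assert(std::is_same<decltype(NextItem(first)), const Item*>::value,
                  "constness of the argument is preserved");
    assert(p == &items[1]);
    return q == &items[1];
}
```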
Alternatively, you can use the templated version to reuse the code inside your .cpp file, and then wrap it into a const/non-const pair and expose that in the header. That way, you pretty much only duplicate function signature.
Your functions are taking a pointer to a pidl which is either const or non-const. Either your function will be modifying the parameter or it won't - choose one and be done with it. If the function also modifies your object, make the function non-const. I don't see why you should need duplicate functions in your case.
You've got a few workarounds now...
Regarding best practices: Provide a const and a non-const versions. This is easiest to maintain and use (IMO). Provide them at the lowest levels so that it may propagate most easily. Don't make the clients cast, you're throwing implementation details, problems, and shortcomings on them. They should be able to use your classes without hacks.
I really don't know of an ideal solution... I think a keyword would ultimately be the easiest (I refuse to use a macro for it). If I need const and non-const versions (which is quite frequent), I just define it twice (as you do), and remember to keep them next to each other at all times.
I think it's hard to get around, if you look at something like vector in the STL, you have the same thing:
iterator begin() {
return (iterator(_Myfirst, this));
}
const_iterator begin() const {
return (const_iterator(_Myfirst, this));
}
/A.B.
During my work I developed a solution similar to what Pavel Minaev proposed. However I use it a bit differently and I think it makes the thing much simpler.
First of all you will need two meta-functions: an identity and const adding. Both can be taken from Boost if you use it (boost::mpl::identity from Boost.MPL and boost::add_const from Boost.TypeTraits). They are however (especially in this limited case) so trivial that they can be defined without referring to Boost.
EDIT: C++0x provides add_const (in type_traits header) meta-function so this solution just became a bit simpler. Visual C++ 2010 provides identity (in utility header) as well.
The definitions are as follows:
template<typename T>
struct identity
{
typedef T type;
};
and
template<typename T>
struct add_const
{
typedef const T type;
};
With those in place, you generally provide a single implementation of a member function as a private (or protected, if required somehow) static function which takes this as one of its parameters (in the case of a non-member function, this is omitted).
That static function also has a template parameter: the meta-function for dealing with constness. The actual functions will then call this function, specifying as the template argument either identity (non-const version) or add_const (const version).
Generally this will look like:
class MyClass
{
public:
Type1* fun(
Type2& arg)
{
return fun_impl<identity>(this, arg);
}
const Type1* fun(
const Type2& arg) const
{
return fun_impl<add_const>(this, arg);
}
private:
template<template<typename Type> class Constness>
static typename Constness<Type1>::type* fun_impl(
typename Constness<MyClass>::type* p_this,
typename Constness<Type2>::type& arg)
{
// Do the implementation using Constness each time constness
// of the type differs.
}
};
Note that this trick does not force you to have implementation in header file. Since fun_impl is private it should not be used outside of MyClass anyway. So you can move its definition to source file (leaving the declaration in the class to have access to class internals) and move fun definitions to source file as well.
This is only a bit more verbose; however, for longer non-trivial functions it pays off.
I think it is natural. After all you just said that you have to repeat the same algorithm (function implementation) for two different types (const one and non-const one). And that is what templates are for. For writing algorithms which work with any type satisfying some basic concepts.
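A concrete instance of the pattern might look like the sketch below. The Table class, its cells_ member, and demo_table are invented for illustration, and the meta-functions are spelled Identity and AddConst here only to keep them visibly distinct from the standard and Boost ones.

```cpp
#include <cassert>

// Same meta-functions as above, under hypothetical capitalized names.
template <typename T> struct Identity { typedef T type; };
template <typename T> struct AddConst { typedef const T type; };

// Hypothetical class used only for this sketch.
class Table {
public:
    Table() : cells_() {}

    void set(int i, int v) { cells_[i] = v; }

    // Non-const version: returns a pointer the caller may write through.
    int* find(const int& key) { return find_impl<Identity>(this, key); }

    // Const version: returns a pointer to const.
    const int* find(const int& key) const {
        return find_impl<AddConst>(this, key);
    }

private:
    // Single implementation shared by both overloads.
    template <template <typename> class Constness>
    static typename Constness<int>::type* find_impl(
        typename Constness<Table>::type* self, const int& key)
    {
        for (int i = 0; i < 4; ++i)
            if (self->cells_[i] == key)
                return &self->cells_[i];
        return nullptr;
    }

    int cells_[4];
};

// Small usage check (hypothetical helper).
bool demo_table() {
    Table t;
    t.set(2, 42);
    *t.find(42) = 7;               // non-const overload: int*

    const Table& ct = t;
    const int* p = ct.find(7);     // const overload: const int*
    assert(p != nullptr);
    return *p == 7 && ct.find(42) == nullptr;
}
```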
I would posit that if you need to cast off the const of a variable to use it then your "consumer" code is not const correct. Can you provide a test case or two where you are running into this issue?
You don't need two versions in your case. A non-const thing will implicitly convert to a const thing, but not vice versa. From the name of your function, it looks like GetNextItem will have no reason to modify pidl, so you can rewrite it like this:
inline ITEMIDLIST * GetNextItem(const ITEMIDLIST * pidl);
Then clients can call it with a const or non-const ITEMIDLIST and it will just work:
ITEMIDLIST* item1;
const ITEMIDLIST* item2;
item1 = GetNextItem(item1);
item2 = GetNextItem(item2);
From your example, this sounds like a special case of having a pass-through function, where you want the return type to exactly match the parameter's type. One possibility would be to use a template. eg:
template<typename T> // T should be a (possibly const) ITEMIDLIST *
inline T GetNextItem(T pidl)
{
return pidl
? reinterpret_cast<T>(reinterpret_cast<const BYTE *>(pidl) + pidl->mkid.cb)
: NULL;
}
You could use templates.
template<typename T, typename U>
inline T* GetNextItem(T* pidl)
{
return pidl ? reinterpret_cast<T*>(reinterpret_cast<U*>(pidl) + pidl->mkid.cb) : NULL;
}
and use them like
ITEMIDLIST* foo = GetNextItem<ITEMIDLIST, BYTE>(bar);
const ITEMIDLIST* constfoo = GetNextItem<const ITEMIDLIST, const BYTE>(constbar);
or use some typedefs if you get fed up with typing.
If your function doesn't use a second type with the same changing constness, the compiler will deduce automatically which function to use and you can omit the template parameters.
But I think there may be a deeper problem hidden in the structure of ITEMIDLIST. Is it possible to derive from ITEMIDLIST? Almost forgot my win32 times... bad memories...
Edit: And you can, of course, always abuse the preprocessor. That's what it's made for. Since you are already on win32, you can completely turn to the dark side, doesn't matter anymore ;-)