Why doesn't void take a void value in C++? - c++

I'm curious why C++ does not define void via :
typedef struct { } void;
I.e. what is the value in a type that cannot be instantiated, even if that installation must produce no code?
If we use gcc -O3 -S, then both the following produce identical assembler :
int main() { return 0; }
and
template <class T> T f(T a) { }
typedef struct { } moo;
int main() { moo a; f(a); return 0; }
This makes perfect sense. A struct { } simply takes an empty value, easy enough to optimize away. In fact, the strange part is that they produce different code without the -O3.
You cannot however pull this same trick with simply typedef void moo because void cannot assume any value, not even an empty one. Does this distinction have any utility?
There are various other strongly typed languages like Haskell, and presumably the MLs, that have a value for their void type, but offer no valueless types overtly, although some posses native pointer types, which act like void *.

I see the rationale for void being unable to be instantiated coming from the C roots of C++. In the old gory days, type safety wasn't that big a deal and void*s were constantly passed around. However, you could always be looking at code that does not literally say void* (due to typedefs, macros, and in C++, templates) but still means it.
It is a tiny bit of safety if you cannot dereference a void* but have to cast it to a pointer to a proper type, first. Whether you accomplish that by using an incomplete type as Ben Voigt suggests, or if you use the built-in void type doesn't matter. You're protected from incorrectly guessing that you are dealing with a type, which you are not.
Yes, void introduces annoying special cases, especially when designing templates. But it's a good thing (i.e. intentional) that the compiler doesn't silently accept attempts to instantiate void.

Because that wouldn't make void an incomplete type, now would it?
Try your example code again with:
struct x; // incomplete
typedef x moo;

Why should void be an incomplete type?
There are many reasons.
The simplest is this: moo FuncName(...) must still return something. Whether it is a neutral value, whether it is the "not a value value", or whatever; it still must say return value;, where value is some actual value. It must say return moo();.
Why write a function that technically returns something, when it isn't actually returning something? The compiler can't optimize the return out, because it's returning a value of a complete type. The caller might actually use it.
C++ isn't all templates, you know. The compiler doesn't necessarily have the knowledge to throw everything away. It often has to perform in-function optimizations that have no knowledge of the external use of that code.
void in this case means "I don't return anything." There is a fundamental difference between that and "I return a meaningless value." It is legal to do this:
moo FuncName(...) { return moo(); }
moo x = FuncName(...);
This is at best misleading. A cursory scan suggests that a value is being returned and stored. The identifier x now has a value and can be used for something.
Whereas this:
void FuncName(...) {}
void x = FuncName(...);
is a compiler error. So is this:
void FuncName(...) {}
moo x = FuncName(...);
It's clear what's going on: FuncName has no return value.
Even if you were designing C++ from scratch, with no hold-overs from C, you would still need some keyword to indicate the difference between a function that returns a value and one that does not.
Furthermore, void* is special in part because void is not a complete type. Because the standard mandates that it isn't a complete type, the concept of a void* can mean "pointer to untyped memory." This has specialized casting semantics. Pointers to typed memory can be implicitly cast to untyped memory. Pointers to untyped memory can be explicitly cast back to the typed pointer that it used to be.
If you used moo* instead, then the semantics get weird. You have a pointer to typed memory. Currently, C++ defines casting between unrelated typed pointers (except for certain special cases) to be undefined behavior. Now the standard has to add another exception for the moo type. It has to say what happens when you do this:
moo *m = new moo;
*moo;
With void, this is a compiler error. What is it with moo? It's still useless and meaningless, but the compiler has to allow it.
To be honest, I would say that the better question would be "Why should void be a complete type?"

Related

const and non-const versions of *static* member functions

I have two versions of the same static member function: one takes a pointer-to-const parameter and that takes a pointer-to-non-const parameter. I want to avoid code duplication.
After reading some stack overflow questions (these were all about non-static member functions though) I came up with this:
class C {
private:
static const type* func(const type* x) {
//long code
}
static type* func(type* x) {
return const_cast<type*>(func(static_cast<const type*>(x)));
}
public:
//some code that uses these functions
};
(I know juggling with pointers is generally a bad idea, but I'm implementing a data structure.)
I found some code in libstdc++ that looks like this:
NOTE: these are not member functions
static type* local_func(type* x)
{
//long code
}
type* func(type* x)
{
return local_func(x);
}
const type* func(const type* x)
{
return local_func(const_cast<type*>(x));
}
In the first approach the code is in a function that takes a pointer-to-const parameter.
In the second approach the code is in a function that takes a pointer-to-non-const parameter.
Which approach should generally be used? Are both correct?
The most important rule is that an interface function (public method, a free function other than one in a detail namespace, etc), should not cast away the constness of its input. Scott Meyer was one of the first to talk about preventing duplication using const_cast, here's a typical example (How do I remove code duplication between similar const and non-const member functions?):
struct C {
const char & get() const {
return c;
}
char & get() {
return const_cast<char &>(static_cast<const C &>(*this).get());
}
char c;
};
This refers to instance methods rather than static/free functions, but the principle is the same. You notice that the non-const version adds const to call the other method (for an instance method, the this pointer is the input). It then casts away constness at the end; this is safe because it knows the original input was not const.
Implementing this the other way around would be extremely dangerous. If you cast away constness of a function parameter you receive, you are taking a big risk in UB if the object passed to you is actually const. Namely, if you call any methods that actually mutate the object (which is very easy to do by accident now that you've cast away constness), you can easily get UB:
C++ standard, section § 5.2.11/7 [const cast]
[ Note: Depending on the type of the object, a write operation through the pointer, lvalue or pointer to data member resulting from a
const_cast that casts away a const-qualifier may produce undefined
behavior. —end note ]
It's not as bad in private methods/implementation functions because perhaps you carefully control how/when its called, but why do it this way? It's more dangerous to no benefit.
Conceptually, it's often the case that when you have a const and non-const version of the same function, you are just passing along internal references of the object (vector::operator[] is a canonical example), and not actually mutating anything, which means that it will be safe either way you write it. But it's still more dangerous to cast away the constness of the input; although you might be unlikely to mess it up yourself, imagine a team setting where you write it the wrong way around and it works fine, and then someone changes the implementation to mutate something, giving you UB.
In summary, in many cases it may not make a practical difference, but there is a correct way to do it that's strictly better than the alternative: add constness to the input, and remove constness from the output.
I have actually only ever seen your first version before, so from my experience it is the more common idiom.
The first version seems correct to me while the second version can result in undefined behavior if (A) you pass an actual const object to the function and (B) the long code writes to that object. Given that in the first case the compiler will tell you if you're trying to write to the object I would never recommend option 2 as it is. You could consider a standalone function that takes/returns const however.

Why is return type before the function name?

The new C++11 standard adds a new function declaration syntax with a trailing return type:
// Usual declaration
int foo();
// New declaration
auto foo() -> int;
This syntax has the advantage of letting the return type be deduced, as in:
template<class T, class U>
auto bar(T t, U u) -> decltype(t + u);
But then why the return type was put before the function name in the first place? I imagine that one answer will be that there was no need for such type deduction in that time. If so, is there a reason for a hypothetical new programming language to not use trailing return type by default?
As always, K&R are the "bad guys" here. They devised that function syntax for C, and C++ basically inherited it as-is.
Wild guessing here:
In C, the declaration should hint at the usage, i.e., how to get the value out of something. This is reflected in:
simple values: int i;, int is accessed by writing i
pointers: int *p;, int is accessed by writing *p
arrays: int a[n];, int is accessed by writing a[n]
functions: int f();, int is accessed by writing f()
So, the whole choice depended on the "simple values" case. And as #JerryCoffin already noted, the reason we got type name instead of name : type is probably buried in the ancient history of programming languages. I guess K&R took type name as it's easier to put the emphasis on usage and still have pointers etc. be types.
If they had chosen name : type, they would've either disjoined usage from declarations: p : int* or would've made pointers not be types anymore and instead be something like a decoration to the name: *p : int.
On a personal note: Imagine if they had chosen the latter and C++ inherited that - it simply wouldn't have worked, since C++ puts the emphasis on types instead of usage. This is also the reason why int* p is said to be the "C++ way" and int *p to be the "C way".
If so, is there a reason for a hypothetical new programming language to not use trailing return type by default?
Is there a reason to not use deduction by default? ;) See, e.g. Python or Haskell (or any functional language for that matter, IIRC) - no return types explicitly specified. There's also a movement to add this feature to C++, so sometime in the future you might see just auto f(auto x){ return x + 42; } or even []f(x){ return x + 42; }.
C++ is based on C, which was based on B, which was based on BCPL which was based on CPL.
I suspect that if you were to trace the whole history, you'd probably end up at Fortran, which used declarations like integer x (as opposed to, for example, Pascal, which used declarations like: var x : integer;).
Likewise, for a function, Pascal used something like function f(<args>) : integer; Under the circumstances, it's probably safe to guess that (at least in this specific respect) Pascal's syntax probably would have fit a bit better with type deduction.

If void() does not return a value, why do we use it?

void f() means that f returns nothing. If void returns nothing, then why we use it? What is the main purpose of void?
When C was invented the convention was that, if you didn't specify the return type, the compiler automatically inferred that you wanted to return an int (and the same holds for parameters).
But often you write functions that do stuff and don't need to return anything (think e.g. about a function that just prints something on the screen); for this reason, it was decided that, to specify that you don't want to return anything at all, you have to use the void keyword as "return type".
Keep in mind that void serves also other purposes; in particular:
if you specify it as the list of parameters to a functions, it means that the function takes no parameters; this was needed in C, because a function declaration without parameters meant to the compiler that the parameter list was simply left unspecified. In C++ this is no longer needed, since an empty parameters list means that no parameter is allowed for the function;
void also has an important role in pointers; void * (and its variations) means "pointer to something left unspecified". This is useful if you have to write functions that must store/pass pointers around without actually using them (only at the end, to actually use the pointer, a cast to the appropriate type is needed).
also, a cast to (void) is often used to mark a value as deliberately unused, suppressing compiler warnings.
int somefunction(int a, int b, int c)
{
(void)c; // c is reserved for future usage, kill the "unused parameter" warning
return a+b;
}
This question has to do with the history of the language: C++ borrowed from C, and C used to implicitly type everything untyped as int (as it turned out, it was a horrible idea). This included functions that were intended as procedures (recall that the difference between functions and procedures is that function invocations are expressions, while procedure invocations are statements). If I recall it correctly from reading the early C books, programmers used to patch this shortcoming with a #define:
#define void int
This convention has later been adopted in the C standard, and the void keyword has been introduced to denote functions that are intended as procedures. This was very helpful, because the compiler could now check if your code is using a return value from a function that wasn't intended to return anything, and to warn you about functions that should return but let the control run off the end instead.
In imperative programming languages such as C, C++, Java, etc., functions and methods of type void are used for their side effects. They do not produce a meaningful value to return, but they influence the program state in one of many possible ways. E.g., the exit function in C returns no value, but it has the side effect of aborting the application. Another example, a C++ class may have a void method that changes the value of its instance variables.
void() means return nothing.
void doesn't mean nothing. void is a type to represent nothing. That is a subtle difference : the representation is still required, even though it represents nothing.
This type is used as function's return type which returns nothing. This is also used to represent generic data, when it is used as void*. So it sounds amusing that while void represents nothing, void* represents everything!
Because sometimes you dont need a return value. That's why we use it.
If you didn't have void, how would you tell the compiler that a function doesn't return a value?
Cause consider some situations where you may have to do some calculation on global variables and put results in global variable or you want to print something depending on arguments , etc.. In these situations you can use the method which dont return value.. i.e.. void
Here's an example function:
struct SVeryBigStruct
{
// a lot of data here
};
SVeryBigStruct foo()
{
SVeryBigStruct bar;
// calculate something here
return bar;
}
And now here's another function:
void foo2(SVeryBigStruct& bar) // or SVeryBigStruct* pBar
{
bar.member1 = ...
bar.member2 = ...
}
The second function is faster, it doesn't have to copy whole struct.
probably to tell the compiler " you dont need to push and pop all cpu-registers!"
Sometimes it can be used to print something, rather than to return it. See http://en.wikipedia.org/wiki/Mutator_method#C_example for examples
Functions are not required to return a value. To tell the compiler that a function does not return a value, a return type of void is used.

Recovering a function pointer from a boost::any

I want to use boost::any to store heterogeneous function pointers. I get an exception when I try to use boost::any_cast to recast to the function pointer.
Is what I want to do even allowed?
.h:
typedef void(*voidFunction)(void);
struct functionInfo{
CString functionName;
boost::any functionPointer;
};
void foo();
int foo2(int a);
.cpp
void foo()
{
;//do something
}
int foo2(int a)
{
;//do something
}
void main()
{
vector<functionInfo> functionList;
functionInfo myInfo;
myInfo.functionName = _T("foo");
myInfo.functionPointer = &foo;
functionList.push_back(myInfo);
myInfo.functionName = _T("foo2");
myInfo.functionPointer = &foo2;
functionList.push_back(myInfo);
voidFunction myVoidFunction = boost::any_cast<voidFunction>(functionList[0].functionPointer);
}
----EDIT----
Ok, you are right, the reason why it acted like this is because foo is a class member function.
Meaning:
void MyClass::foo();
myInfo.functionPointer = &MyClass::foo;
so I needed to typedef:
typedef void(MyClass::*voidClassFunction)(void);
voidClassFunction myVoidFunction = boost::any_cast<voidClassFunction>(functionList[0].functionPointer);
Is what I want to do even allowed?
Absolutely. As long as you cast it back to exactly the type you gave it.
And that's your problem. You don't. foo2 is not a voidFunction. Therefore, you cannot cast it to one.
The purpose of boost::any is to have a void* that is guaranteed to either work correctly according to the C++ standard or throw an exception. The C++ standard allows the conversion of any (non-member) pointer type to a void*. It also allows conversion of a void* back to a type, provided that the type being provided is exactly the same type as the original. If it's not, welcome to undefined behavior.
boost::any exist to prevent undefined behavior by storing type information with the void*. It will rightly throw an exception when you attempt to cast something to the wrong type. As you do here. boost::any is not a way to pretend that types don't exist and to pretend that you can turn anything into something else. It's just a type-safe typeless container. You still need to know what you actually put there.
There is no way to store a list of functions with arbitrary argument lists and call them with the same argument list. The user must provide a function or functor with the right argument list that you expect. boost::bind is a way to adapt a function/functor for a particular argument list, but the user must explicitly use it.
The best you can do is have a list of specific function parameter sets that you accept, stored in a boost::variant object. You can use a visitor to figure out which particular function to call.

Is it legal to cast function pointers? [duplicate]

Let's say I have a function that accepts a void (*)(void*) function pointer for use as a callback:
void do_stuff(void (*callback_fp)(void*), void* callback_arg);
Now, if I have a function like this:
void my_callback_function(struct my_struct* arg);
Can I do this safely?
do_stuff((void (*)(void*)) &my_callback_function, NULL);
I've looked at this question and I've looked at some C standards which say you can cast to 'compatible function pointers', but I cannot find a definition of what 'compatible function pointer' means.
As far as the C standard is concerned, if you cast a function pointer to a function pointer of a different type and then call that, it is undefined behavior. See Annex J.2 (informative):
The behavior is undefined in the following circumstances:
A pointer is used to call a function whose type is not compatible with the pointed-to
type (6.3.2.3).
Section 6.3.2.3, paragraph 8 reads:
A pointer to a function of one type may be converted to a pointer to a function of another
type and back again; the result shall compare equal to the original pointer. If a converted
pointer is used to call a function whose type is not compatible with the pointed-to type,
the behavior is undefined.
So in other words, you can cast a function pointer to a different function pointer type, cast it back again, and call it, and things will work.
The definition of compatible is somewhat complicated. It can be found in section 6.7.5.3, paragraph 15:
For two function types to be compatible, both shall specify compatible return types127.
Moreover, the parameter type lists, if both are present, shall agree in the number of
parameters and in use of the ellipsis terminator; corresponding parameters shall have
compatible types. If one type has a parameter type list and the other type is specified by a
function declarator that is not part of a function definition and that contains an empty
identifier list, the parameter list shall not have an ellipsis terminator and the type of each
parameter shall be compatible with the type that results from the application of the
default argument promotions. If one type has a parameter type list and the other type is
specified by a function definition that contains a (possibly empty) identifier list, both shall
agree in the number of parameters, and the type of each prototype parameter shall be
compatible with the type that results from the application of the default argument
promotions to the type of the corresponding identifier. (In the determination of type
compatibility and of a composite type, each parameter declared with function or array
type is taken as having the adjusted type and each parameter declared with qualified type
is taken as having the unqualified version of its declared type.)
127) If both function types are ‘‘old style’’, parameter types are not compared.
The rules for determining whether two types are compatible are described in section 6.2.7, and I won't quote them here since they're rather lengthy, but you can read them on the draft of the C99 standard (PDF).
The relevant rule here is in section 6.7.5.1, paragraph 2:
For two pointer types to be compatible, both shall be identically qualified and both shall be pointers to compatible types.
Hence, since a void* is not compatible with a struct my_struct*, a function pointer of type void (*)(void*) is not compatible with a function pointer of type void (*)(struct my_struct*), so this casting of function pointers is technically undefined behavior.
In practice, though, you can safely get away with casting function pointers in some cases. In the x86 calling convention, arguments are pushed on the stack, and all pointers are the same size (4 bytes in x86 or 8 bytes in x86_64). Calling a function pointer boils down to pushing the arguments on the stack and doing an indirect jump to the function pointer target, and there's obviously no notion of types at the machine code level.
Things you definitely can't do:
Cast between function pointers of different calling conventions. You will mess up the stack and at best, crash, at worst, succeed silently with a huge gaping security hole. In Windows programming, you often pass function pointers around. Win32 expects all callback functions to use the stdcall calling convention (which the macros CALLBACK, PASCAL, and WINAPI all expand to). If you pass a function pointer that uses the standard C calling convention (cdecl), badness will result.
In C++, cast between class member function pointers and regular function pointers. This often trips up C++ newbies. Class member functions have a hidden this parameter, and if you cast a member function to a regular function, there's no this object to use, and again, much badness will result.
Another bad idea that might sometimes work but is also undefined behavior:
Casting between function pointers and regular pointers (e.g. casting a void (*)(void) to a void*). Function pointers aren't necessarily the same size as regular pointers, since on some architectures they might contain extra contextual information. This will probably work ok on x86, but remember that it's undefined behavior.
I asked about this exact same issue regarding some code in GLib recently. (GLib is a core library for the GNOME project and written in C.) I was told the entire slots'n'signals framework depends upon it.
Throughout the code, there are numerous instances of casting from type (1) to (2):
typedef int (*CompareFunc) (const void *a,
const void *b)
typedef int (*CompareDataFunc) (const void *b,
const void *b,
void *user_data)
It is common to chain-thru with calls like this:
int stuff_equal (GStuff *a,
GStuff *b,
CompareFunc compare_func)
{
return stuff_equal_with_data(a, b, (CompareDataFunc) compare_func, NULL);
}
int stuff_equal_with_data (GStuff *a,
GStuff *b,
CompareDataFunc compare_func,
void *user_data)
{
int result;
/* do some work here */
result = compare_func (data1, data2, user_data);
return result;
}
See for yourself here in g_array_sort(): http://git.gnome.org/browse/glib/tree/glib/garray.c
The answers above are detailed and likely correct -- if you sit on the standards committee. Adam and Johannes deserve credit for their well-researched responses. However, out in the wild, you will find this code works just fine. Controversial? Yes. Consider this: GLib compiles/works/tests on a large number of platforms (Linux/Solaris/Windows/OS X) with a wide variety of compilers/linkers/kernel loaders (GCC/CLang/MSVC). Standards be damned, I guess.
I spent some time thinking about these answers. Here is my conclusion:
If you are writing a callback library, this might be OK. Caveat emptor -- use at your own risk.
Else, don't do it.
Thinking deeper after writing this response, I would not be surprised if the code for C compilers uses this same trick. And since (most/all?) modern C compilers are bootstrapped, this would imply the trick is safe.
A more important question to research: Can someone find a platform/compiler/linker/loader where this trick does not work? Major brownie points for that one. I bet there are some embedded processors/systems that don't like it. However, for desktop computing (and probably mobile/tablet), this trick probably still works.
The point really isn't whether you can. The trivial solution is
void my_callback_function(struct my_struct* arg);
void my_callback_helper(void* pv)
{
my_callback_function((struct my_struct*)pv);
}
do_stuff(&my_callback_helper);
A good compiler will only generate code for my_callback_helper if it's really needed, in which case you'd be glad it did.
You have a compatible function type if the return type and parameter types are compatible - basically (it's more complicated in reality :)). Compatibility is the same as "same type" just more lax to allow to have different types but still have some form of saying "these types are almost the same". In C89, for example, two structs were compatible if they were otherwise identical but just their name was different. C99 seem to have changed that. Quoting from the c rationale document (highly recommended reading, btw!):
Structure, union, or enumeration type declarations in two different translation units do not formally declare the same type, even if the text of these declarations come from the same include file, since the translation units are themselves disjoint. The Standard thus specifies additional compatibility rules for such types, so that if two such declarations are sufficiently similar they are compatible.
That said - yeah strictly this is undefined behavior, because your do_stuff function or someone else will call your function with a function pointer having void* as parameter, but your function has an incompatible parameter. But nevertheless, i expect all compilers to compile and run it without moaning. But you can do cleaner by having another function taking a void* (and registering that as callback function) which will just call your actual function then.
As C code compiles to instruction which do not care at all about pointer types, it's quite fine to use the code you mention. You'd run into problems when you'd run do_stuff with your callback function and pointer to something else then my_struct structure as argument.
I hope I can make it clearer by showing what would not work:
int my_number = 14;
do_stuff((void (*)(void*)) &my_callback_function, &my_number);
// my_callback_function will try to access int as struct my_struct
// and go nuts
or...
void another_callback_function(struct my_struct* arg, int arg2) { something }
do_stuff((void (*)(void*)) &another_callback_function, NULL);
// another_callback_function will look for non-existing second argument
// on the stack and go nuts
Basically, you can cast pointers to whatever you like, as long as the data continue to make sense at run-time.
Well, unless I understood the question wrong, you can just cast a function pointer this way.
void print_data(void *data)
{
// ...
}
((void (*)(char *)) &print_data)("hello");
A cleaner way would be to create a function typedef.
typedef void(*t_print_str)(char *);
((t_print_str) &print_data)("hello");
If you think about the way function calls work in C/C++, they push certain items on the stack, jump to the new code location, execute, then pop the stack on return. If your function pointers describe functions with the same return type and the same number/size of arguments, you should be okay.
Thus, I think you should be able to do so safely.
Void pointers are compatible with other types of pointer. It's the backbone of how malloc and the mem functions (memcpy, memcmp) work. Typically, in C (Rather than C++) NULL is a macro defined as ((void *)0).
Look at 6.3.2.3 (Item 1) in C99:
A pointer to void may be converted to or from a pointer to any incomplete or object type