I am working with a c++ file and have encountered the following line:
tmp.sort(Hash::pairval, printPair, (void *)(tmp.bitSize()));
I am most unsure of what (void *) means. bitsize() is a function, and I have heard the term passing a function pointer before. Is that what this is?
I know the :: is normally the scope resolution operator, which I have seen in .cpp/.h object type files. I believe it is serving the same purpose here, to state that pairval is found in Hash.
Thanks
The (void *) simply casts the return value of tmp.bitSize() to a void pointer type. Casting is a very common operation in both C++ and C.
Hash::pairval
is most probably a reference to a static member of class Hash. It is only named here, not called, since no parentheses follow it.
The (void*) part is a cast to void pointer of tmp.bitSize() which most probably returns some kind of value. So there is no function pointer.
I am most unsure of what (void *) means. bitsize() is a function, and I have heard the term passing a function pointer before. Is that what this is?
Nope. Note the parentheses, tmp.bitSize() is a function call expression that is called and returns a value. Hence - no function pointers involved here.
The return value is then cast to the pointer-to-void type (i.e. the catch-all "pointer to something" type) in order to be passed to a function which expects such pointer.
Why on Earth would someone convert a bit size (which looks like a number) into a pointer, I have no idea. This is somewhere between dubious and incorrect.
Read up on casting in C++. C-style casts are discouraged, and casting to void* is seldom useful and often dangerous because of the strict aliasing rule.
I know the :: is normally the scope resolution operator, which I have seen in .cpp/.h object type files. I believe it is serving the same purpose here, to state that pairval is found in Hash.
That's correct.
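To make the cast concrete, here is a minimal sketch of how a number can travel through a void * parameter and be recovered on the other side. The names pack_bits, read_bits_by_value and read_bits_by_address are invented for illustration; nothing in the question shows how tmp.sort() actually uses its third argument. The by-value round trip relies on an implementation-defined mapping between integers and pointers, which works on mainstream platforms but is not guaranteed by the standard.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: smuggle an integer through a void* "user data" slot,
// much as (void *)(tmp.bitSize()) does. The mapping is implementation-defined.
void* pack_bits(int bits) {
    return reinterpret_cast<void*>(static_cast<std::intptr_t>(bits));
}

// Hypothetical callee: recover the integer from the pointer value itself.
int read_bits_by_value(void* user_data) {
    return static_cast<int>(reinterpret_cast<std::intptr_t>(user_data));
}

// Portable alternative: pass the address of a real int and read through it.
int read_bits_by_address(void* user_data) {
    assert(user_data != nullptr);
    return *static_cast<int*>(user_data);
}

// Small driver for the by-address variant.
int demo_by_address(int bits) {
    int storage = bits;
    return read_bits_by_address(&storage);
}
```

The by-address form is the safer choice whenever the pointed-to int outlives the call that receives it.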
I have a function at a known memory address (for example: 0x11111111). The function returns an int, and takes a uint32_t pointer as its only argument.
How would I call this function using C++? I have seen a few examples of calling a function by its address, but I can't seem to find one that takes a pointer as its argument.
EDIT: I have seen that. It doesn't address how to call a function that takes a pointer as an argument.
If you’re sure that there’s a function there, you could call it by casting the address to a function pointer of the appropriate type, then calling it. Here’s C code to do this:
typedef int (*FunctionType)(uint32_t*);
FunctionType function = (FunctionType)0x11111111;
function(arg);
This can easily be modified to support any number of function arguments and any return type you’d like. Just tweak the argument types list of the FunctionType typedef.
Or, in one line (gulp):
((int (*)(uint32_t *)) 0x11111111)(arg);
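The same idea can be tried out safely in C++ by taking the address of a real function instead of hard-coding 0x11111111. This is only a sketch: increment, address_of_increment, call_at and demo_call are made-up names, and the code assumes a platform where a function pointer fits in std::uintptr_t (true on mainstream desktop systems, not guaranteed everywhere).

```cpp
#include <cassert>
#include <cstdint>

using FunctionType = int (*)(std::uint32_t*);

// Stand-in for the function that would live at the known address.
int increment(std::uint32_t* value) {
    *value += 1;
    return 0;
}

// Obtain a real address to experiment with (instead of a hard-coded one).
std::uintptr_t address_of_increment() {
    return reinterpret_cast<std::uintptr_t>(&increment);
}

// Turn a raw address back into a typed function pointer and call it.
int call_at(std::uintptr_t address, std::uint32_t* arg) {
    assert(address != 0);
    FunctionType function = reinterpret_cast<FunctionType>(address);
    return function(arg);
}

// Small driver: returns the value after one call through the raw address.
std::uint32_t demo_call(std::uint32_t start) {
    std::uint32_t value = start;
    call_at(address_of_increment(), &value);
    return value;
}
```

With a genuinely fixed address, only call_at is needed; the cast is the same, but nothing checks that a function of that type really lives there.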
Found this code example
void *handle;
double (*cosine)(double);
handle = dlopen("libm.so", RTLD_LAZY);
*(void **) (&cosine) = dlsym(handle, "cos");
I use rightmost-to-left reading rule to parse the variable's type:
double (*cosine)(double);
Starting at the name and moving outward: "cosine" -> "*" -> "is a pointer"
then at ")" we leave the innermost () scope -> "(double)" -> "to a function taking one double" -> and returning the leftmost "double".
But what is THIS? I don't even know where to start parsing it. Is "&cosine" an address or a reference? What does the (void **) mean? Why is there a leftmost "*" outside? Is it a dereference or part of a type?
*(void **) (&cosine)
Yup, that's a mouthful.
cosine is a function pointer. So &cosine is a pointer to that pointer. And then when we slap a * in front of it, we're changing the original pointer, to make it point somewhere else.
It's sort of like this:
int i = 5;
int *ip = &i;
*ip = 6; /* changes i to 6 */
Or it's more like this:
char a[10], b[10];
char *p = a;
*(&p) = b; /* changes p to point to b */
But in your case it's even trickier, because cosine is a pointer to a function, not a pointer to data. Most of the time, function pointers point to functions you have defined in your program. But here, we're arranging to make cosine point to a dynamically-loaded function, loaded by the dlsym() function.
dlsym is super special because it can return pointers to data, as well as pointers to functions. So it's got an impossible-to-define return type. It's declared as returning void *, of course, because that's the "generic" pointer type in C. (Think malloc.) But in pure C, void * is a generic data pointer type; it's not guaranteed to be able to be used with function pointers.
The straightforward thing to do would be to just say
cosine = dlsym(handle, "cos");
But a modern compiler will complain, because dlsym returns void *, and cosine has type double (*)(double) (that is, pointer to function taking double and returning double), and that's not a portable conversion.
So we go around the barn, and set cosine's value indirectly, not by saying
cosine = something
but rather by saying
*(&cosine) = something
But that's still no good in the dlsym case, because the types still don't match. We've got void * on the right, so we need void * on the left. And the solution is to take the address &cosine, which is otherwise a pointer-to-pointer-to-function, and cast it to a pointer-to-pointer-to-void, or void **, so that when we slap a * in front of it we've got a void * again, which is a proper destination to assign dlsym's return value to. So we end up with the line you were asking about:
* (void **) (&cosine) = dlsym(handle, "cos");
Now, it's important to note that we're on thin ice here. We've used the & and the cast to get around the fact that assigning a pointer-to-void to a pointer-to-function isn't strictly legal. In the process we've successfully silenced the compiler's warning that what we're doing isn't strictly legal. (Indeed, silencing the warning was precisely the original programmer's intent in employing this dodge.)
The potential problem is, what if data pointers and function pointers have different sizes or representations? This code goes to some length to treat a function pointer, cosine, as if it were a data pointer, jamming the bits of a data pointer into it. If, say, a data pointer were somehow bigger than a function pointer, this would have terrible effects. (And, before you ask "But how could a data pointer ever be bigger than a function pointer?", that's exactly how they were, for example, in the "compact" memory model, back in the days of MS-DOS programming.)
Normally, playing games like this to break the rules and shut off compiler warnings is a bad idea. In the case of dlsym, though, it's fine, I would say perfectly acceptable. dlsym can't exist on a system where function pointers are different from data pointers, so if you're using dlsym at all, you must be on a machine where all pointers are the same, and this code will work.
It's also worth asking, if we have to play games with casts when calling dlsym, why take the extra trip around the barn with the pointer-to-pointer? Why not just say
cosine = (double (*)(double))dlsym(handle, "cos");
And the answer is, I don't know. I'm pretty sure this simpler cast will work just as well (again, as long as we're on a system where dlsym can exist at all). Perhaps there are compilers that warn about this case, and that can only be tricked into silence by using the trickier, double-pointer trick.
See also Casting when using dlsym() .
This is nasty stuff. cosine is a pointer to a function that takes an argument of type double and returns double. &cosine is the address of that pointer. The cast says to pretend that that address is a pointer-to-pointer-to-void. The * in front of the cast is the usual dereference operator, so the result is to tell the compiler to pretend that the type of cosine is void*, so that the code can assign the return value from the call to dlsym to cosine. Phew; that hurts.
And, just for fun, a void* and a pointer-to-function are not at all related, which is why the code has to go through all that casting. The C++ language definition does not guarantee that this will work. I.e., the result is undefined behavior.
For C, a pointer to void can be converted to any pointer to an object without a cast. However, the C standard does not guarantee that void * can be converted to a pointer to a function - at all, since functions are not objects.
dlsym is a POSIX function, and POSIX requires, as an extension, that a pointer to a function be convertible to void * and back again. C++, however, does not allow such a conversion without a cast.
In any case, in *(void **) (&cosine) = dlsym(handle, "cos"); the pointer to the pointer to a function taking double and returning double is cast to a pointer to pointer to void, then dereferenced to get an lvalue of type void *, to which the return value of dlsym is assigned. This is rather ugly, and would be better written as cosine = (double (*)(double))dlsym(handle, "cos") wherever a cast is required. Both are undefined behaviour as far as standard C is concerned, but the latter is not as much dark magic.
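The two ways of storing a void * result in a function pointer can be compared side by side. To keep the sketch runnable without a shared library, fake_dlsym is a hypothetical stand-in for dlsym (real code would call dlopen and dlsym from <dlfcn.h>), and twice stands in for the loaded function. Converting between void * and a function pointer is only conditionally supported in C++, so this assumes a POSIX-like compiler such as GCC or Clang.

```cpp
#include <cassert>

// Stand-in for the dynamically loaded function.
double twice(double x) { return 2.0 * x; }

// Hypothetical replacement for dlsym(): hands the symbol back as a void*.
void* fake_dlsym(const char* /*name*/) {
    return reinterpret_cast<void*>(&twice);
}

using UnaryFn = double (*)(double);

// The double-pointer form from the question: treat the storage of fn
// as if it were a void* and assign into it.
UnaryFn load_indirect() {
    UnaryFn fn = nullptr;
    *(void**)(&fn) = fake_dlsym("twice");
    assert(fn != nullptr);
    return fn;
}

// The plain cast suggested in the answers: convert the returned value.
UnaryFn load_with_cast() {
    UnaryFn fn = reinterpret_cast<UnaryFn>(fake_dlsym("twice"));
    assert(fn != nullptr);
    return fn;
}
```

Both functions return a pointer that calls twice; the second states the conversion directly instead of hiding it behind a pointer-to-pointer.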
In this question:
the asker asks why #define offsetof(st, m) \
((size_t) ( (char *)&((st *)(0))->m - (char *)0 )) dereferences a null (0) pointer and yet causes no segmentation fault.
JaredPar's answer pointed out:
The -> operator is used above but it's not used to access the value.
Instead it's used to grab the address of the value. Here is a
non-macro code sample that should make it a bit clearer
SomeType *pSomeType = GetTheValue();
int* pMember = &(pSomeType->SomeIntMember);
The second line does not actually cause a dereference (implementation
dependent). It simply returns the address of SomeIntMember within the
pSomeType value.
My question is how to prove that int* pMember = &(pSomeType->SomeIntMember); just assigns SomeIntMember's address to pMember without dereferencing pSomeType.
Is there anything in the ISO C++ standard that says so? Or is there some other way to show it?
EDIT:
Although the question I linked is about C, I want a C++ answer, so I tagged this question c++.
If there is something in the C++ standard, all the better.
Otherwise, I hope to see something that proves JaredPar's conclusion, e.g. the assembly xaxxon posted, or how a specific compiler implements it.
If the answers hold that int* pMember = &(pSomeType->SomeIntMember); does dereference pSomeType, then why is offsetof's implementation (#define offsetof(st, m) ((size_t) ( (char *)&((st *)(0))->m - (char *)0 ))) valid?
UPDATE:
Thanks for all the comments and answers. Now I understand that #define offsetof(st, m) ((size_t) ( (char *)&((st *)(0))->m - (char *)0 )) is one of the implementations used in C, not in C++.
Also, I found MSVC's implementation, #define offsetof(s,m) ((size_t)&reinterpret_cast<char const volatile&>((((s*)0)->m))), but it is a little complex for me. Can someone give an explanation? Thanks in advance.
The -> operator does cause a dereference. a -> b is defined as (*a).b, if a is a pointer.
If people claim it is not a dereference, then they are either mistaken, or using some non-standard meaning of the word "dereference".
In the C++ standard, the formal name for * is the indirection operator. The word "dereference" isn't used as a verb; instead, the standard says that applying the * operator to a pointer yields an lvalue designating the object that was pointed to.
&(p->x) causes undefined behaviour if p is not a valid pointer.
Regarding the "offsetof" edit, the code in implementation headers is not subject to the rules of the language. It can contain magic and non-standard, non-portable code.
As M.M's answer points out, the -> operator is a dereference (if the operand is a pointer). The confusion about whether this is dereferencing likely arises from the related notion of memory access.
To dereference a pointer is to get the object at the pointed-to address. Or more precisely, given a pointer p of type T*, the expression *p is an lvalue of type T that refers to the pointed-to object.
Memory access is when a read or write occurs, which corresponds to lvalue-to-rvalue conversion and assignment respectively. When none happens, no access to memory is made.
pSomeType->SomeIntMember // is defined to be...
(*pSomeType).SomeIntMember
*pSomeType is an lvalue of type SomeType, and therefore its member SomeIntMember is an lvalue of type int.
Then its address is taken. No lvalue-to-rvalue conversion happens, and no assignment happens, therefore there is no memory access, as shown by @xaxxon's comment.
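A small sketch can show the address computation on a valid object, where the behaviour is well defined. SomeType here is an invented two-member struct and the function names are hypothetical; the point is only that &(p->member) yields the base address plus the member's offset, matching what the standard offsetof macro reports.

```cpp
#include <cassert>
#include <cstddef>

struct SomeType {
    char tag;
    int SomeIntMember;
};

// Computes the member's offset from a valid object: &(p->member) only forms
// an address; the member's value is never read or written.
std::size_t offset_via_valid_pointer() {
    SomeType object{};
    SomeType* pSomeType = &object;
    int* pMember = &(pSomeType->SomeIntMember);
    char* base = reinterpret_cast<char*>(pSomeType);
    char* member = reinterpret_cast<char*>(pMember);
    assert(member >= base);  // the member lies inside the object
    return static_cast<std::size_t>(member - base);
}

// The standard macro gives the same number without needing any object.
std::size_t offset_via_macro() {
    return offsetof(SomeType, SomeIntMember);
}
```

In portable code the macro from <cstddef> is the one to use; the null-pointer trick is something only the implementation's own headers may rely on.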
So in "the c++ programming language, 4th edition", there's a paragraph I don't understand about conversion of pointer-to-function types. Here is some of the code sample.
using P1 = int(*)(int*);
using P2 = void(*)(void);
void f(P1 pf) {
P2 pf2 = reinterpret_cast<P2>(pf);
pf2(); // likely serious problem
// other codes
}
When I run this it crashed.
I'm not sure if I am right, but I initially thought the "likely serious problem" comment meant that, once pf is cast to P2 and stored in pf2, pf2 no longer points to anything. When I created a function that matches P2's type and pointed pf2 to it, it didn't crash and ran normally.
After the code, I read this:
We need the nastiest of casts, reinterpret_cast, to do conversion of pointer-to-function types. The reason is that the result of using a pointer to function of the wrong type is so unpredictable and system-dependent. For example, in the example above, the called function may write to the object pointed to by its argument, but the call pf2() didn’t supply any argument!
Now I'm completely lost starting from the "For example, in the example above" part:
"may write to the object pointed to by its argument" //what object is it exactly?
"but the call pf2() didn’t supply any argument!" //"using P2 = void(*)(void);" doesn't really need an argument, does it?
I think I'm missing something here. Can someone explain this?
For example, in the example above, the called function may write to the object pointed to by its argument (...)
pf is a pointer to a function like this:
int foo(int* intPtr)
{
// ...
}
So it could be implemented to write to its argument:
int foo(int* intPtr)
{
*intPtr = 42; // writing to the address given as argument
return 0;
}
(...) but the call pf2() didn’t supply any argument!
When you call foo through its cast to type P2, it will be called without arguments, so it is unclear what intPtr will be:
P2 pf2 = reinterpret_cast<P2>(pf);
pf2(); // no argument given here, although pf2 really is foo() and expects one!
Writing to it will most likely corrupt something.
Moreover, compilers usually implement calls to functions that return something by reserving space for the return value first, that will then be filled by the function call. When you call a P1 using the signature of P2, the call to P2 won't reserve space (as the return value is void) and the actual call will write an int somewhere it should not, which is another source for corruption.
Now I'm completely lost starting from "For example, in the example
above" part:
"may write to the object pointed to by its argument" //what object is
it exactly?
P1 is a function expecting a non-const pointer-to-int argument. That means it very well may write to the int referenced in its argument.
"but the call pf2() didn’t supply any argument!" //"using P2 =
void(*)(void);" doesn't really need an argument, does it?
When you call the function through another function pointer type passing no argument, the expectations of the called function aren't met. It may try to interpret whatever is on the stack as an int pointer and write to it, causing undefined behavior.
This does fail, but not necessarily in the way one might expect.
The representation of a function pointer is left up to the implementation. Even the size of a function pointer can be bigger than that of a void*.
What is guaranteed about the size of a function pointer?
There are no guarantees about anything in the value of the function pointer. In fact, the only guarantee is that comparison for equality works between function pointers of the same type.
Comparing function pointers
The standard does guarantee that a function pointer can hold the value of a pointer to a function of another type, and that converting it back yields the original pointer.
Calling a function through a pointer that has been cast to a different function type is undefined behavior, meaning the compiler can do whatever it wants. Whether or not you supply the argument really doesn't matter, and how the call fails depends on the calling convention of the system. As far as you're concerned, it could allow "demons to fly out of your nose".
Casting a function pointer to another type
So that brings us back to the statement by the author:
We need the nastiest of casts, reinterpret_cast, to do conversion of pointer-to-function types. The reason is that the result of using a pointer to function of the wrong type is so unpredictable and system-dependent. For example, in the example above, the called function may write to the object pointed to by its argument, but the call pf2() didn’t supply any argument!
That is trying to make the point that, with no argument specified, if the function writes through its argument, it will write through some uninitialized value. Basically, if you look at the function as
int foo(int* arg) { *arg = 10; return 0; }
if you didn't initialize arg, the author says you could be writing anywhere. But again, there is no guarantee that this even matters. The system could store functions with the footprint int (*)(int*) and void(*)(void) in completely different memory spaces, in which case instead of the above problem you'd have a jump into a random location in the program. Undefined behavior is just that: undefined.
Just don't do it man.
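For contrast, the one thing the standard does allow with reinterpret_cast on function pointers is storing a pointer under another function type and converting it back before the call. The sketch below reuses P1 and P2 from the book example; store_answer, call_after_round_trip and demo_round_trip are invented names.

```cpp
#include <cassert>

using P1 = int (*)(int*);
using P2 = void (*)();

// A function matching P1 that writes through its argument.
int store_answer(int* out) {
    *out = 42;
    return 0;
}

int call_after_round_trip(int* out) {
    P2 erased = reinterpret_cast<P2>(&store_answer);  // storing is allowed
    // erased();  // would be undefined behavior: wrong type, no argument
    P1 restored = reinterpret_cast<P1>(erased);       // convert back first
    assert(restored == &store_answer);                // round trip keeps the value
    return restored(out);
}

// Small driver: returns what the function wrote through its argument.
int demo_round_trip() {
    int result = 0;
    call_after_round_trip(&result);
    return result;
}
```

The commented-out call is the "likely serious problem" from the book: the callee would still expect an int* that the caller never supplied.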
Let's say I have a function that accepts a void (*)(void*) function pointer for use as a callback:
void do_stuff(void (*callback_fp)(void*), void* callback_arg);
Now, if I have a function like this:
void my_callback_function(struct my_struct* arg);
Can I do this safely?
do_stuff((void (*)(void*)) &my_callback_function, NULL);
I've looked at this question and I've looked at some C standards which say you can cast to 'compatible function pointers', but I cannot find a definition of what 'compatible function pointer' means.
As far as the C standard is concerned, if you cast a function pointer to a function pointer of a different type and then call that, it is undefined behavior. See Annex J.2 (informative):
The behavior is undefined in the following circumstances:
A pointer is used to call a function whose type is not compatible with the pointed-to
type (6.3.2.3).
Section 6.3.2.3, paragraph 8 reads:
A pointer to a function of one type may be converted to a pointer to a function of another
type and back again; the result shall compare equal to the original pointer. If a converted
pointer is used to call a function whose type is not compatible with the pointed-to type,
the behavior is undefined.
So in other words, you can cast a function pointer to a different function pointer type, cast it back again, and call it, and things will work.
The definition of compatible is somewhat complicated. It can be found in section 6.7.5.3, paragraph 15:
For two function types to be compatible, both shall specify compatible return types.127)
Moreover, the parameter type lists, if both are present, shall agree in the number of
parameters and in use of the ellipsis terminator; corresponding parameters shall have
compatible types. If one type has a parameter type list and the other type is specified by a
function declarator that is not part of a function definition and that contains an empty
identifier list, the parameter list shall not have an ellipsis terminator and the type of each
parameter shall be compatible with the type that results from the application of the
default argument promotions. If one type has a parameter type list and the other type is
specified by a function definition that contains a (possibly empty) identifier list, both shall
agree in the number of parameters, and the type of each prototype parameter shall be
compatible with the type that results from the application of the default argument
promotions to the type of the corresponding identifier. (In the determination of type
compatibility and of a composite type, each parameter declared with function or array
type is taken as having the adjusted type and each parameter declared with qualified type
is taken as having the unqualified version of its declared type.)
127) If both function types are ‘‘old style’’, parameter types are not compared.
The rules for determining whether two types are compatible are described in section 6.2.7, and I won't quote them here since they're rather lengthy, but you can read them on the draft of the C99 standard (PDF).
The relevant rule here is in section 6.7.5.1, paragraph 2:
For two pointer types to be compatible, both shall be identically qualified and both shall be pointers to compatible types.
Hence, since a void* is not compatible with a struct my_struct*, a function pointer of type void (*)(void*) is not compatible with a function pointer of type void (*)(struct my_struct*), so this casting of function pointers is technically undefined behavior.
In practice, though, you can safely get away with casting function pointers in some cases. In the 32-bit x86 cdecl calling convention, arguments are pushed on the stack, and all pointers are the same size (4 bytes on x86, 8 bytes on x86_64, where the first few arguments travel in registers instead). Calling a function pointer boils down to placing the arguments and doing an indirect call to the function pointer target, and there's obviously no notion of types at the machine code level.
Things you definitely can't do:
Cast between function pointers of different calling conventions. You will mess up the stack and at best, crash, at worst, succeed silently with a huge gaping security hole. In Windows programming, you often pass function pointers around. Win32 expects all callback functions to use the stdcall calling convention (which the macros CALLBACK, PASCAL, and WINAPI all expand to). If you pass a function pointer that uses the standard C calling convention (cdecl), badness will result.
In C++, cast between class member function pointers and regular function pointers. This often trips up C++ newbies. Class member functions have a hidden this parameter, and if you cast a member function to a regular function, there's no this object to use, and again, much badness will result.
Another bad idea that might sometimes work but is also undefined behavior:
Casting between function pointers and regular pointers (e.g. casting a void (*)(void) to a void*). Function pointers aren't necessarily the same size as regular pointers, since on some architectures they might contain extra contextual information. This will probably work ok on x86, but remember that it's undefined behavior.
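For the member-function case, the usual way to avoid the cast altogether is a static member function that receives the object through the void* argument. This is a sketch under assumptions: Counter and count_after_two_events are invented, and the body given to do_stuff is a guess at what such a function does with its callback.

```cpp
#include <cassert>

// Same shape as do_stuff() in the question; the body here is an assumption.
void do_stuff(void (*callback_fp)(void*), void* callback_arg) {
    callback_fp(callback_arg);
}

struct Counter {
    int hits = 0;
    void on_event() { ++hits; }

    // Static member: it has no hidden `this`, so it is an ordinary function
    // whose type already matches void (*)(void*).
    static void trampoline(void* self) {
        assert(self != nullptr);
        static_cast<Counter*>(self)->on_event();
    }
};

// Registers the trampoline twice and reports how often on_event() ran.
int count_after_two_events() {
    Counter counter;
    do_stuff(&Counter::trampoline, &counter);
    do_stuff(&Counter::trampoline, &counter);
    return counter.hits;
}
```

No function pointer is cast anywhere; only the object pointer makes a round trip through void*, which is well defined.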
I asked about this exact same issue regarding some code in GLib recently. (GLib is a core library for the GNOME project and written in C.) I was told the entire slots'n'signals framework depends upon it.
Throughout the code, there are numerous instances of casting from type (1) to (2):
typedef int (*CompareFunc) (const void *a,
const void *b);
typedef int (*CompareDataFunc) (const void *a,
const void *b,
void *user_data);
It is common to chain-thru with calls like this:
int stuff_equal (GStuff *a,
GStuff *b,
CompareFunc compare_func)
{
return stuff_equal_with_data(a, b, (CompareDataFunc) compare_func, NULL);
}
int stuff_equal_with_data (GStuff *a,
GStuff *b,
CompareDataFunc compare_func,
void *user_data)
{
int result;
/* do some work here */
result = compare_func (data1, data2, user_data);
return result;
}
See for yourself here in g_array_sort(): http://git.gnome.org/browse/glib/tree/glib/garray.c
The answers above are detailed and likely correct -- if you sit on the standards committee. Adam and Johannes deserve credit for their well-researched responses. However, out in the wild, you will find this code works just fine. Controversial? Yes. Consider this: GLib compiles/works/tests on a large number of platforms (Linux/Solaris/Windows/OS X) with a wide variety of compilers/linkers/kernel loaders (GCC/Clang/MSVC). Standards be damned, I guess.
I spent some time thinking about these answers. Here is my conclusion:
If you are writing a callback library, this might be OK. Caveat emptor -- use at your own risk.
Else, don't do it.
Thinking deeper after writing this response, I would not be surprised if the code for C compilers uses this same trick. And since (most/all?) modern C compilers are bootstrapped, this would imply the trick is safe.
A more important question to research: Can someone find a platform/compiler/linker/loader where this trick does not work? Major brownie points for that one. I bet there are some embedded processors/systems that don't like it. However, for desktop computing (and probably mobile/tablet), this trick probably still works.
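For comparison, the same chaining can be done without casting the comparator, by carrying it through the user_data slot. This is not how GLib does it; it is a hedged alternative sketch in C++ in which CompareAdapter, compare_via_adapter, compare_ints, compare_with_data and demo_compare are all invented names.

```cpp
#include <cassert>

using CompareFunc = int (*)(const void*, const void*);
using CompareDataFunc = int (*)(const void*, const void*, void*);

// Carries the two-argument comparator through the user_data slot.
struct CompareAdapter {
    CompareFunc compare;
};

// A genuine three-argument function, so no function-pointer cast is needed.
int compare_via_adapter(const void* a, const void* b, void* user_data) {
    assert(user_data != nullptr);
    return static_cast<CompareAdapter*>(user_data)->compare(a, b);
}

// Hypothetical two-argument comparator for ints: returns -1, 0 or 1.
int compare_ints(const void* a, const void* b) {
    int lhs = *static_cast<const int*>(a);
    int rhs = *static_cast<const int*>(b);
    return (lhs > rhs) - (lhs < rhs);
}

// Stand-in for stuff_equal_with_data(): just forwards to the comparator.
int compare_with_data(const void* a, const void* b,
                      CompareDataFunc compare_func, void* user_data) {
    return compare_func(a, b, user_data);
}

// Chains the two-argument comparator through the three-argument interface.
int demo_compare(int x, int y) {
    CompareAdapter adapter{&compare_ints};
    return compare_with_data(&x, &y, &compare_via_adapter, &adapter);
}
```

The cost is one extra indirect call per comparison, in exchange for staying inside what the standard defines.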
The point really isn't whether you can. The trivial solution is
void my_callback_function(struct my_struct* arg);
void my_callback_helper(void* pv)
{
my_callback_function((struct my_struct*)pv);
}
do_stuff(&my_callback_helper, NULL);
A good compiler will only generate code for my_callback_helper if it's really needed, in which case you'd be glad it did.
You have a compatible function type if the return type and parameter types are compatible - basically (it's more complicated in reality :)). Compatibility is the same as "same type", just more lax, so that different types can still count as "almost the same". In C89, for example, two structs were compatible if they were otherwise identical but just their name was different. C99 seems to have changed that. Quoting from the C rationale document (highly recommended reading, btw!):
Structure, union, or enumeration type declarations in two different translation units do not formally declare the same type, even if the text of these declarations come from the same include file, since the translation units are themselves disjoint. The Standard thus specifies additional compatibility rules for such types, so that if two such declarations are sufficiently similar they are compatible.
That said - yes, strictly this is undefined behavior, because your do_stuff function or someone else will call your function through a function pointer having void* as parameter, but your function has an incompatible parameter. Nevertheless, I expect all compilers to compile and run it without moaning. But you can do it more cleanly by having another function that takes a void* (and registering that as the callback function), which then just calls your actual function.
As C code compiles to instructions that do not care at all about pointer types, it's quite fine to use the code you mention. You'd run into problems if you ran do_stuff with your callback function and a pointer to something other than a my_struct structure as the argument.
I hope I can make it clearer by showing what would not work:
int my_number = 14;
do_stuff((void (*)(void*)) &my_callback_function, &my_number);
// my_callback_function will try to access int as struct my_struct
// and go nuts
or...
void another_callback_function(struct my_struct* arg, int arg2) { /* ... */ }
do_stuff((void (*)(void*)) &another_callback_function, NULL);
// another_callback_function will look for non-existing second argument
// on the stack and go nuts
Basically, you can cast pointers to whatever you like, as long as the data continue to make sense at run-time.
Well, unless I understood the question wrong, you can just cast a function pointer this way.
void print_data(void *data)
{
// ...
}
((void (*)(char *)) &print_data)("hello");
A cleaner way would be to create a function typedef.
typedef void(*t_print_str)(char *);
((t_print_str) &print_data)("hello");
If you think about the way function calls work in C/C++, they push certain items on the stack, jump to the new code location, execute, then pop the stack on return. If your function pointers describe functions with the same return type and the same number/size of arguments, you should be okay.
Thus, I think you should be able to do so safely.
Void pointers are compatible with other types of pointer. It's the backbone of how malloc and the mem functions (memcpy, memcmp) work. Typically, in C (Rather than C++) NULL is a macro defined as ((void *)0).
Look at 6.3.2.3 (Item 1) in C99:
A pointer to void may be converted to or from a pointer to any incomplete or object type
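That guarantee concerns object pointers, and can be sketched as follows (C++ needs an explicit cast on the way back, where C does not). The name read_through_void is invented for illustration. Note that the rule says nothing about function types such as void (*)(void*) versus void (*)(struct my_struct*), which is what the standard quotations in the earlier answers address.

```cpp
#include <cassert>

struct my_struct {
    int value;
};

// Round-trips an object pointer through void* and reads the object back.
int read_through_void(int initial) {
    my_struct object{initial};
    void* erased = &object;                              // implicit conversion to void*
    my_struct* typed = static_cast<my_struct*>(erased);  // explicit cast back in C++
    assert(typed == &object);                            // same address after the round trip
    return typed->value;
}
```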