How are function pointers type unsafe - c++

First of all, type-safe means that the compiler can catch incorrect usage straight away.
Now, I have heard that function pointers are not type-safe, yet whenever I tried to use them incorrectly the compiler reported errors for me. So, how are they type-unsafe?
E.g. this is a function prototype that takes a function pointer:
void SortElements(void* MyArray, unsigned int iNumofElems, size_t size, int (*compare_funct)(void* First, void* SecondElem))
I have defined a few functions to pass to it:
int MySortAsc(void* First, void* Second);
void MyFunct2();
void MyFunct3(void* First);
The code only compiles for:
SortElements(MyArray, 10, sizeof(DataType), &MySortAsc); //Compiles
SortElements(MyArray, 10, sizeof(DataType), &MyFunct2); //Fails
Any idea how I can misuse function pointers here?
Is it because of this:
void (*functionPointer)();
...
int integer = 0xFFFFFFFF;
functionPointer = (void(*)())integer;
functionPointer();
Answer:
From what I can see, function pointers in C++ are type-safe. Of course, they can be used in an unsafe manner by casting them incorrectly, but that alone is no reason to call them type-unsafe. .NET delegates are strongly typed as well, and to me it looks like both are type-safe.

So, how is it type unsafe ?
void SortElements(void* MyArray,             // what type is pointed to here?
                  unsigned int N,            // Are there really N elements?
                  size_t size,               // Is the size correct?
                  int (*cmp)(void*, void*)); // Is this the correct function?
The code that you present is type-unsafe, not because of the function pointer but rather because of the use of void* in both the SortElements signature and the signature of the function pointer.
The reason why this is unsafe is that the caller has the whole responsibility of passing the right arguments, and the compiler cannot ensure that the pointer MyArray points to a contiguous memory region holding iNumofElems elements, each of the size stated in the interface. If the programmer makes a mistake, the compiler will not be able to help. If a maintainer modifies the type stored in the array (so its size changes) or the number of elements, the compiler will not be able to detect it and tell you that the call to SortElements needs updating. Finally, because the function pointer that is passed also uses void*, a comparator that compares apples has exactly the same signature as one that compares pears, and the compiler cannot help if you pass the incorrect function pointer.
struct Apple {
    int weight;
};
struct Pear {
    double weight;
};

int compare_pears( void * pear1, void * pear2 ) {
    return static_cast<Pear*>(pear1)->weight - static_cast<Pear*>(pear2)->weight;
}

int main() {
    Apple apples[10];
    SortElements( apples, 20, sizeof(Pear), compare_pears );
}
While the compiler is able to verify that the signature of the function pointer matches the signature that the function needs, the function pointer itself is unsafe, and allows you to pass a comparator for basically anything.
Compare that with this other alternative:
template <typename T, std::size_t N>
void SortElements( T (&array)[N], int (*cmp)( T const &, T const & ) );
Here the compiler will infer the type of the elements T and the size of the array N from the call. There is no need to pass the size of T, as the compiler knows it. The comparator function passed to this version of SortElements is strongly typed: it takes two constant references to the type of the element stored in the array and returns an int. If we tried this in the previous program:
int compare_pears( Pear const & lhs, Pear const & rhs );
int compare_apples( Apple const & l, Apple const & r );
Apple array[10];
//SortElements( array, compare_pears ); // Error!!!!
SortElements( array, compare_apples ); // Good!
You cannot get the size of the array or the size of the elements wrong: if someone changes the type Apple, the compiler will pick it up, and if the size of the array changes, the compiler will pick that up too. You cannot pass the wrong comparator, as the compiler will catch that as well. Now the program is type-safe, even though it uses function pointers (which may have an impact on performance, as they inhibit inlining; this is why std::sort is usually faster than qsort).
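To make the mechanics concrete, here is a minimal sketch of how such a strongly typed SortElements could be implemented; the insertion-sort body is just an illustration, not anything prescribed by the answer above:

#include <cstddef>
#include <utility>

// T and N are deduced from the argument; sizes cannot be mismatched.
template <typename T, std::size_t N>
void SortElements(T (&array)[N], int (*cmp)(T const &, T const &))
{
    // Insertion sort, purely for illustration.
    for (std::size_t i = 1; i < N; ++i)
        for (std::size_t j = i; j > 0 && cmp(array[j], array[j - 1]) < 0; --j)
            std::swap(array[j], array[j - 1]);
}

Called as SortElements(array, compare_apples), the compiler deduces T = Apple and N = 10, and any mismatch is a compile-time error.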

Function pointers are type safe. However, many environments force the programmer to recast them, and an incorrect cast can cause significant problems.

Function pointers are in fact type checked and are type safe.

Function pointers are strongly discouraged in nesC (a dialect of C used in TinyOS), for the reason that they hinder optimisation. Here static code analysis (or rather the lack of its applicability) is a bigger concern than type-safety, but I'm not sure whether these issues could be confused.
Another issue might be the use of function pointers as event handlers. When using a general event scheduler, you may want to abstract away from the proper type, which might tempt you to store function pointers as void* just for the sake of modularity. This would be a prominent example of type-unsafe usage of function pointers instead of type-safe dynamic binding.
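As a sketch of that second, type-unsafe pattern (all names here are hypothetical, and note that even the conversion between function pointers and void* is only conditionally supported by the C++ standard):

// Hypothetical event scheduler that erases handler types "for modularity".
void* handlers[16];

void on_click(int button) { /* ... */ }

void register_handler(int slot, void* fn) { handlers[slot] = fn; }

void fire(int slot, int arg)
{
    // Nothing stops a caller from having registered a handler with a
    // completely different signature; calling through the wrong type
    // is undefined behavior.
    auto fn = reinterpret_cast<void (*)(int)>(handlers[slot]);
    fn(arg);
}

int main()
{
    register_handler(0, reinterpret_cast<void*>(&on_click));
    fire(0, 1);
}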

Related

benefits of passing const reference vs values in function in c++ for primitive types

I want to know the possible advantages of passing primitive types like int, char, float, double, etc. to a function by value rather than by const reference. Is there any performance benefit to passing by value?
Example:
int sum(const int x,const int y);
or
int sum(const int& x,const int& y);
As for the second case, I have hardly ever seen people use it. I know there is a benefit to passing by reference for big objects.
In every ABI I know of, references are passed via something equivalent to pointers. So when the compiler cannot inline the function or otherwise must follow the ABI, it will pass pointers there.
Pointers are often larger than values; but more importantly, pointers do not point at registers, and while the top of the stack is almost always going to be in cache, what it points at may not. In addition, many ABIs have primitives passed via register, which can be faster than via memory.
The next problem is within the function. Whenever the code flow could possibly modify an int, the data behind a const int& parameter must be reloaded! While the reference is to const, the data it refers to can be changed via other paths.
The most common ways this can happen are when you call into code the compiler cannot see while compiling the function body, modify memory through a global variable, or follow a pointer to touch an int elsewhere.
In comparison, an int argument whose address is not taken cannot legally be modified through any means other than directly. This permits the compiler to understand that it isn't being mutated.
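A minimal sketch of the reload issue (opaque and g are hypothetical names):

int g = 0;
void opaque(); // defined in another translation unit; might modify g

int f(const int& x) // x could be a reference to g
{
    int before = x;    // first load
    opaque();          // the compiler must assume g, and therefore x, changed
    return before + x; // x has to be reloaded from memory
}

int h(int x) // by value: x cannot alias anything
{
    int before = x;
    opaque();
    return before + x; // no reload; x is known to be unchanged
}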
This isn't just a problem for the compiler trying to optimize and getting confused. Take something like:
struct ui {
    enum { defFontSize = 9 };
    std::optional<int> fontSize;

    void reloadFontSize() {
        fontSize = getFontSizePref();
        fontSizeChanged(*fontSize); // passes a reference to the int inside fontSize
    }
    void fontSizeChanged(int const& sz) {
        if (sz == defFontSize)
            fontSize = std::nullopt; // destroys the int that sz refers to
        else
            fontSize = sz;
        drawText(sz); // sz may now be a dangling reference
    }
    void drawText(int sz) {
        std::cout << "At size " << sz << "\n";
    }
};
Here the int inside the optional, to which we are passing a reference, gets destroyed (by the assignment fontSize = std::nullopt) and is then used after destruction.
A bug like this can be far less obvious in real code. If we defaulted to passing by value, it could not happen.
Typically, primitive types are not passed by reference, but sometimes there is a point to it. E.g., on an x64 machine a long double is 16 bytes and a pointer is 8 bytes, so passing a reference can be slightly better in this case.
In your example there is no point: a usual int is 4 bytes, so you can pass two integers for the price of one pointer.
You can use sizeof() to measure the size of the type.
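For example (the printed values are typical for x86-64 Linux; your platform may differ):

#include <iostream>

int main()
{
    std::cout << sizeof(int) << '\n';         // typically 4
    std::cout << sizeof(long double) << '\n'; // typically 16 on x64
    std::cout << sizeof(void*) << '\n';       // typically 8 on x64
}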

Why do C and C++ compilers allow array lengths in function signatures when they're never enforced?

This is what I found during my learning period:
#include <iostream>
#include <cstring> // for strlen (missing in the original snippet)
using namespace std;

int dis(char a[1])
{
    int length = strlen(a);
    char c = a[2]; // reads past the declared size [1] with no complaint
    return length;
}

int main()
{
    char b[4] = "abc";
    int c = dis(b);
    cout << c;
    return 0;
}
So in the declaration int dis(char a[1]), the [1] seems to do nothing and isn't enforced at all, since I can use a[2]. It behaves just like int a[] or char *a. I know that the array name decays to a pointer and how to pass an array, so my puzzle is not about that part.
What I want to know is why compilers allow this behavior (int a[1]). Or does it have some other meaning I don't know about?
It is a quirk of the syntax for passing arrays to functions.
Actually it is not possible to pass an array in C. If you write syntax that looks like it should pass the array, what actually happens is that a pointer to the first element of the array is passed instead.
Since the pointer does not include any length information, the contents of your [] in the function formal parameter list are actually ignored.
The decision to allow this syntax was made in the 1970s and has caused much confusion ever since...
The length of the first dimension is ignored, but the length of additional dimensions are necessary to allow the compiler to compute offsets correctly. In the following example, the foo function is passed a pointer to a two-dimensional array.
#include <stdio.h>

void foo(int args[10][20])
{
    printf("%zu\n", sizeof(args[0])); /* %zu: sizeof yields a size_t */
}

int main(int argc, char **argv)
{
    int a[2][20];
    foo(a);
    return 0;
}
The size of the first dimension [10] is ignored; the compiler will not prevent you from indexing off the end (notice that the formal wants 10 elements, but the actual provides only 2). However, the size of the second dimension [20] is used to determine the stride of each row, and here, the formal must match the actual. Again, the compiler will not prevent you from indexing off the end of the second dimension either.
The byte offset from the base of the array to an element args[row][col] is determined by:
sizeof(int)*(col + 20*row)
Note that if col >= 20, then you will actually index into a subsequent row (or off the end of the entire array).
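A quick sketch to convince yourself of this formula, using plain pointer arithmetic over the same array:

#include <cassert>
#include <cstddef>

int main()
{
    int a[2][20];
    int row = 1, col = 3;
    // Byte offset of a[row][col] from the start of the array:
    std::ptrdiff_t offset = reinterpret_cast<char*>(&a[row][col])
                          - reinterpret_cast<char*>(&a[0][0]);
    assert(offset == static_cast<std::ptrdiff_t>(sizeof(int) * (col + 20 * row)));
}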
sizeof(args[0]) returns 80 on my machine, where sizeof(int) == 4. However, if I attempt to take sizeof(args), I get the following compiler warning:
foo.c:5:27: warning: sizeof on array function parameter will return size of 'int (*)[20]' instead of 'int [10][20]' [-Wsizeof-array-argument]
printf("%zd\n", sizeof(args));
^
foo.c:3:14: note: declared here
void foo(int args[10][20])
^
1 warning generated.
Here, the compiler is warning that it is only going to give the size of the pointer into which the array has decayed instead of the size of the array itself.
The problem and how to overcome it in C++
The problem has been explained extensively by pat and Matt. The compiler basically ignores the first dimension of the array's size, effectively ignoring the size of the passed argument.
In C++, on the other hand, you can easily overcome this limitation in two ways:
using references
using std::array (since C++11)
References
If your function is only trying to read or modify an existing array (not copying it) you can easily use references.
For example, let's assume you want to have a function that resets an array of ten ints setting every element to 0. You can easily do that by using the following function signature:
void reset(int (&array)[10]) { ... }
Not only will this work just fine, it will also enforce the dimension of the array.
You can also make use of templates to make the above code generic:
template<class Type, std::size_t N>
void reset(Type (&array)[N]) { ... }
And finally you can take advantage of const correctness. Let's consider a function that prints an array of 10 elements:
void show(const int (&array)[10]) { ... }
By applying the const qualifier we are preventing possible modifications.
The standard library class for arrays
If you consider the above syntax both ugly and unnecessary, as I do, we can throw it in the can and use std::array instead (since C++11).
Here's the refactored code:
void reset(std::array<int, 10>& array) { ... }
void show(std::array<int, 10> const& array) { ... }
Isn't it wonderful? Not to mention that the generic code trick I've taught you earlier still works:
template<class Type, std::size_t N>
void reset(std::array<Type, N>& array) { ... }
template<class Type, std::size_t N>
void show(const std::array<Type, N>& array) { ... }
Not only that, but you get copy and move semantics for free. :)
template<class Type, std::size_t N>
void copy(std::array<Type, N> array) {
    // a copy of the original passed array
    // is made and can be dealt with independently
    // from the original
}
So, what are you waiting for? Go use std::array.
It's a fun feature of C that allows you to effectively shoot yourself in the foot if you're so inclined. I think the reason is that C is just a step above assembly language. Size checking and similar safety features have been removed to allow for peak performance, which isn't a bad thing if the programmer is being very diligent. Also, assigning a size to the function argument has the advantage that when the function is used by another programmer, there's a chance they'll notice a size restriction. Just using a pointer doesn't convey that information to the next programmer.
First, C never checks array bounds. Doesn't matter if they are local, global, static, parameters, whatever. Checking array bounds means more processing, and C is supposed to be very efficient, so array bounds checking is done by the programmer when needed.
Second, there is a trick that makes it possible to pass-by-value an array to a function. It is also possible to return-by-value an array from a function. You just need to create a new data type using struct. For example:
typedef struct {
    int a[10];
} myarray_t;

myarray_t my_function(myarray_t foo) {
    myarray_t bar;
    ...
    return bar;
}
You have to access the elements like this: foo.a[1]. The extra ".a" might look weird, but this trick adds great functionality to the C language.
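A short usage sketch (values hypothetical) showing that the callee really does get its own copy:

#include <cassert>

typedef struct {
    int a[10];
} myarray_t;

myarray_t clear_first(myarray_t foo) // foo is a copy of the caller's struct
{
    foo.a[0] = 0;
    return foo; // returned by value as well
}

int main()
{
    myarray_t x = {{1, 2, 3}};
    myarray_t y = clear_first(x);
    assert(x.a[0] == 1); // the original is untouched
    assert(y.a[0] == 0); // only the copy was modified
}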
To tell the compiler that myArray points to an array of at least 10 ints, C99 offers this syntax (it is not valid C++):
void bar(int myArray[static 10])
A good compiler should give you a warning if you access myArray[10]. Without the static keyword, the 10 would mean nothing at all.
This is a well-known "feature" of C, carried over into C++ because C++ is supposed to compile C code correctly.
The problem arises from several aspects:
An array name is supposed to be completely equivalent to a pointer.
C is supposed to be fast, originally developed to be a kind of "high-level Assembler" (especially designed to write the first "portable Operating System", Unix), so it is not supposed to insert "hidden" code; runtime range checking is thus "forbidden".
Machine code generated to access a static array or a dynamic one (either on the stack or allocated) is actually different.
Since the called function cannot know the "kind" of array passed as an argument, everything is supposed to be a pointer and treated as such.
You could say arrays are not really supported in C (this is not really true, as I was saying before, but it is a good approximation); an array is really treated as a pointer to a block of data and accessed using pointer arithmetic.
Since C does NOT have any form of RTTI, you have to declare the size of the array element in the function prototype (to support pointer arithmetic). This is even "more true" for multidimensional arrays.
Anyway all above is not really true anymore :p
Most modern C/C++ compilers do support some bounds checking, but the standards require it to be off by default (for backward compatibility). Reasonably recent versions of gcc and clang, for example, do compile-time range checking with "-O3 -Wall -Wextra", and can add run-time bounds checks via sanitizers such as "-fsanitize=address".
C will not only transform a parameter of type int[5] into int *; given the declaration typedef int intArray5[5];, it will transform a parameter of type intArray5 into int * as well. There are some situations where this behavior, although odd, is useful (especially with things like the va_list defined in stdarg.h, which some implementations define as an array). It would be illogical to allow as a parameter a type defined as int[5] (ignoring the dimension) but not allow int[5] to be specified directly.
I find C's handling of parameters of array type to be absurd, but it's a consequence of efforts to take an ad-hoc language, large parts of which weren't particularly well-defined or thought out, and come up with behavioral specifications consistent with what existing implementations did for existing programs. Many of the quirks of C make sense when viewed in that light, particularly if one considers that when many of them were invented, large parts of the language we know today didn't exist yet. From what I understand, in C's predecessor BCPL, compilers didn't really keep track of variable types very well. A declaration int arr[5]; was equivalent to int anonymousAllocation[5], *arr = anonymousAllocation;; once the allocation was set aside, the compiler neither knew nor cared whether arr was a pointer or an array. When accessed as either arr[x] or *arr, it would be regarded as a pointer regardless of how it was declared.
One thing that hasn't been answered yet is the actual question.
The answers already given explain that arrays cannot be passed by value to a function in either C or C++. They also explain that a parameter declared as int[] is treated as if it had type int *, and that a variable of type int[] can be passed to such a function.
But they don't explain why it has never been made an error to explicitly provide an array length.
void f(int *); // makes perfect sense
void f(int []); // sort of makes sense
void f(int [10]); // makes no sense
Why isn't the last of these an error?
A reason for that is that it causes problems with typedefs.
typedef int myarray[10];
void f(myarray array);
If it were an error to specify the array length in function parameters, you would not be able to use the myarray name in the function parameter. And since some implementations use array types for standard library types such as va_list, and all implementations are required to make jmp_buf an array type, it would be very problematic if there were no standard way of declaring function parameters using those names: without that ability, there could not be a portable implementation of functions such as vprintf.
It also allows compilers to check whether the size of the passed array matches what is expected, and to warn when it does not.

How to store various types of function pointers together?

Normal pointers can be stored using a generic void*. e.g.
void* arr[10];
arr[0] = pChar;
arr[1] = pInt;
arr[2] = pA;
Some time back, I came across a discussion saying that void* may not be able to store a function pointer without data loss on all platforms (say, 64-bit and beyond). I am not sure about this, though.
If that's true, then what is the most portable way to store a collection of function pointers ?
[Note: This question doesn't satisfactorily answer this.]
Edit: I will be storing these function pointers with an index. There is a typecast associated with every index whenever this collection is accessed. As of now, I am only interested in making an array or vector of them.
You can convert a function pointer to another function pointer of any function type and back without loss.
So as long as you cast it back to the correct type before making the call through the function pointer, you can store all of your function pointers in something like:
typedef void (*fcn_ptr)(void); // 'generic' function pointer
fcn_ptr arr[10];
A pointer to a function can be converted to a pointer to a function of a different type with a reinterpret_cast. If you convert it back to the original type you are guaranteed to get the original value back so you can then use it to call the function. (ISO/IEC 14882:2003 5.2.10 [expr.reinterpret.cast] / 6)
You now only need to select an arbitrary function pointer type for your array. void(*)() is a common choice.
E.g.
int main()
{
    int a( int, double );
    void b( const char * );
    void (*array[])() = { reinterpret_cast<void(*)()>(a)
                        , reinterpret_cast<void(*)()>(b) };
    int test = reinterpret_cast< int(*)( int, double) >( array[0] )( 5, 0.5 );
    reinterpret_cast< void(*)( const char* ) >( array[1] )( "Hello, world!" );
}
Naturally, you've lost a lot of type safety and you will have undefined behavior if you call a function through a pointer to a function of a different type.
Use a union of your function pointer types; this works in both C and C++ and assures sufficient storage for the pointers (which are likely the same size anyway...).
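A sketch of the union approach; the tag field is an addition of mine so the caller knows which member is active (this is exactly the per-item overhead discussed in the reply further down):

#include <iostream>

void say_hello(int n) { std::cout << "hello " << n << '\n'; }

struct tagged_fn {
    enum Kind { CMP, HANDLER } kind;
    union {
        int  (*cmp)(const void*, const void*);
        void (*handler)(int);
    } fn;
};

int main()
{
    tagged_fn t;
    t.kind = tagged_fn::HANDLER;
    t.fn.handler = &say_hello;
    if (t.kind == tagged_fn::HANDLER) // always check the tag before calling
        t.fn.handler(42);
}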
There are a few things that make up a function pointer's type.
the memory address of the code
the argument signature
the linkage/name mangling
the calling convention
for member functions, some other stuff too
If these features aren't uniform across your function pointers then you can't sensibly store them in the same container.
You can, however, bind different aspects into a std::function, which is a callable object that only requires the argument signature and return type to be uniform, as sketched below.
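A sketch of the std::function approach (the callables here are made up for illustration):

#include <functional>
#include <iostream>
#include <vector>

int twice(int x) { return 2 * x; }

struct AddN {
    int n;
    int operator()(int x) const { return x + n; }
};

int main()
{
    // Uniform signature int(int); the callables can otherwise differ freely.
    std::vector<std::function<int(int)>> fns;
    fns.push_back(&twice);                      // plain function pointer
    fns.push_back(AddN{10});                    // function object
    fns.push_back([](int x) { return x * x; }); // lambda

    for (const auto& f : fns)
        std::cout << f(3) << '\n'; // prints 6, 13, 9
}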
It may be a good time to re-think the problem in terms of virtual functions. Does this mishmash of function pointers have a coherent interface that you can express? If not, then you're doomed anyway :-)
RE: Hasturkun, you can store heterogeneous function pointers in unions, yes, they're just POD, but you will also need to store information about what type of pointer it is so that you can choose the correct member to call. Two problems with this:
there is a per-item overhead,
you have to manually check that you're using the right one consistently, all the time; this is a burden with nonlocal effects -- it's a spreading poison.
Far better to have one container per type, it will clarify the code and make it safer. Or, use a proxy such as std::function to make them all have the same type.

How do you declare a pointer to a function that returns a pointer to an array of int values in C / C++?

Is this correct?
int (*(*ptr)())[];
I know this is trivial, but I was looking at an old test about these kinds of constructs, and this particular combination wasn't on the test; it's really driving me crazy, and I just have to make sure. Is there a clear, solid, understandable rule for these kinds of declarations?
(i.e.: pointer to... array of... pointers to... functions that... etc.)
Thanks!
R
The right-left rule makes it easy.
int (*(*ptr)())[]; can be interpreted as
Start from the variable name ------------------------------- ptr
Nothing to the right but ), so go left to find * ----------- is a pointer
Jump out of the parentheses and encounter () --------------- to a function that takes no arguments (in C, an unspecified number of arguments)
Go left, find * --------------------------------------------- and returns a pointer
Jump out of the parentheses, go right and hit [] ------------ to an array of
Go left again, find int ------------------------------------- ints.
In almost all situations where you want to return a pointer to an array, the simplest thing to do is to return a pointer to the first element of the array. This pointer can be used in the same contexts as an array name and provides no more or less indirection than returning a pointer of type "pointer to array"; indeed, it will hold the same pointer value.
If you follow this, you want a pointer to a function returning a pointer to an int. You can build this up incrementally (construction of declarations is easier than parsing them).
Pointer to int:
int *A;
Function returning pointer to int:
int *fn();
pointer to function returning a pointer to int:
int *(*pfn)();
If you really want to return a pointer to a function returning a pointer to an array of int you can follow the same process.
Array of int:
int A[];
Pointer to array of int:
int (*p)[];
Function returning pointer ... :
int (*fn())[];
Pointer to fn ... :
int (*(*pfn)())[];
which is what you have.
You don't. Just split it up into two typedefs: one for pointer to int array, and one for pointer to functions. Something like:
typedef int (*IntArrayPtr_t)[];
typedef IntArrayPtr_t (*GetIntArrayFuncPtr_t)(void);
This is not only more readable, it also makes it easier to declare/define the functions that you are going to assign the variable:
IntArrayPtr_t GetColumnSums(void)
{ .... }
Of course this assumes this was a real-world situation, and not an interview question or homework. I would still argue this is a better solution for those cases, but that's only me. :)
If you feel like cheating:
typedef int(*PtrToArray)[5];
PtrToArray function();
int i = function;
Compiling that on gcc yields: invalid conversion from 'int (*(*)())[5]' to 'int'. The first bit is the type you're looking for.
Of course, once you have your PtrToArray typedef, the whole exercise becomes rather more trivial, but sometimes this comes in handy if you already have the function name and you just need to stick it somewhere. And, for whatever reason, you can't rely on template trickery to hide the gory details from you.
If your compiler supports it, you can also do this:
typedef int(*PtrToArray)[5];
PtrToArray function();
template<typename T> void print(T) {
    cout << __PRETTY_FUNCTION__ << endl;
}
print(function);
Which, on my machine, produces void print(T) [with T = int (* (*)())[5]]
Being able to read the types is pretty useful, since understanding compiler errors is often dependent on your ability to figure out what all those parenthesis mean. But making them yourself is less useful, IMO.
Here's my solution...
int** (*func)();
A function pointer returning int** (treating the array as a pointer to its first element rather than using a true pointer-to-array type). It isn't as complicated as your solution.
Using cdecl you get the following
cdecl> declare a as pointer to function returning pointer to array of int;
Warning: Unsupported in C -- 'Pointer to array of unspecified dimension'
(maybe you mean "pointer to object")
int (*(*a)())[]
This question from C-faq is similar but provides 3 approaches to solve the problem.

What is useful about a reference-to-array parameter?

I recently found some code like this:
typedef int TenInts[10];
void foo(TenInts &arr);
What can you do in the body of foo() that is useful, that you could not do if the declaration was:
void foo(int *arr); // or,
void foo(int arr[]); // or,
void foo(int arr[10]); // ?
I found a question that asks how to pass a reference to an array. I guess I am asking why.
Also, only one answer to "When is pointer to array useful?" discussed function parameters, so I don't think this is a duplicate question.
The reference-to-array parameter does not allow the array type to decay to a pointer type, i.e. the exact array type remains preserved inside the function. (For example, you can use the sizeof arr / sizeof *arr trick on the parameter and get the element count.) The compiler will also perform type checking to make sure the array argument type is exactly the same as the array parameter type, i.e. if the parameter is declared as an array of 10 ints, the argument is required to be an array of exactly 10 ints and nothing else.
In fact, in situations when the array size is fixed at compile time, using a reference-to-array (or pointer-to-array) parameter declaration can be perceived as the primary, preferred way to pass an array. The other variant (when the array type is allowed to decay to pointer type) is reserved for situations when it is necessary to pass arrays of run-time size.
For example, the correct way to pass an array of compile-time size to a function is
void foo(int (&arr)[10]); // reference to an array
or
void foo(int (*arr)[10]); // pointer to an array
An arguably incorrect way would be to use a "decayed" approach
void foo(int arr[]); // pointer to an element
// Bad practice!!!
The "decayed" approach should be normally reserved for arrays of run-time size and is normally accompanied by the actual size of the array in a separate parameter
void foo(int arr[], unsigned n); // pointer to an element
// Passing a run-time sized array
In other words, there's really no "why" question when it comes to reference-to-array (or pointer-to-array) passing. You are supposed to use this method naturally, by default, whenever you can, if the array size is fixed at compile-time. The "why" question should really arise when you use the "decayed" method of array passing. The "decayed" method is only supposed to be used as a specialized trick for passing arrays of run-time size.
The above is basically a direct consequence of a more generic principle. When you have a "heavy" object of type T, you normally pass it either by pointer T * or by reference T &. Arrays are no exception from this general principle. They have no reason to be.
Keep in mind, though, that in practice it often makes sense to write functions that work with arrays of run-time size, especially when it comes to generic, library-level functions. Such functions are more versatile. That means that there's often a good reason to use the "decayed" approach in real-life code. Nevertheless, this does not excuse the author of the code from recognizing the situations when the array size is known at compile time and using the reference-to-array method accordingly.
One difference is that it's (supposed to be) impossible to pass a null reference. So in theory the function does not need to check if the parameter is null, whereas an int *arr parameter could be passed null.
You can write a function template to find out the size of an array at compile time.
#include <iostream>
#include <cstddef>

template<class E, std::size_t size>
std::size_t array_size(E (&)[size])
{
    return size;
}

int main()
{
    int test[] = {2, 3, 5, 7, 11, 13, 17, 19};
    std::cout << array_size(test) << std::endl; // prints 8
}
No more sizeof(test) / sizeof(test[0]) for me ;-)
Shouldn't we also address the words in bold from the question:
What can you do in the body of foo() that is useful, that you could not do if the declaration was void foo(int arr[]);?
The answer is: nothing. Passing an argument by reference allows a function to change its value and pass back this change to the caller. However, it is not possible to change the value of the array as a whole, which would have been a reason to pass it by reference.
void foo(int (&arr)[3]) { // reference to an array
    arr = {1, 2, 3};   // ILLEGAL: array type int[3] is not assignable
    arr = new(int[3]); // same issue
    arr = arr2;        // same issue, with arr2 a global variable of type int[3]
}
You can ensure that the function is only called on int arrays of size 10. That may be useful from a type-checking standpoint.
You get more semantic meaning regarding what the function is expecting.