Testing for a missing callback function

In a Windows DLL / Linux SO I am writing, I give the user app a way to register a callback function. Works great. The callback prototype looks like void (*callback)(void*);
I was having a fit of paranoia while writing the docs and realized I have no really good way to know whether the registered address is valid. The only feedback is either a crash, or calling the callback inside a try/catch.
I have no idea what exception would be thrown if the callback did not exist and who-knows-what executed. I'm not even really sure that the call to "nowhere" could recover itself enough to generate the exception instead of a crash.
Yes, I know it's the user's problem. I'm just trying to be thoughtful and maybe help the user understand his bug.
So, what exception would this throw? Windows and Linux answers please, if they differ.
Or is there a better way to approach this without having to use an exception catch to detect the missing function?

There's no way to recover. Similarly, you cannot recover if the callback contains a line like *(int*)(0x1234) = 5;. Just live with it.
As a C++ library developer, you're not in the business of making sure that nothing ever crashes. You merely provide code that does what it promises when used the way you document.

A bit off-topic, but callbacks in the form of void(*)(), i.e. taking 0 arguments, are less than useful. A useful C-style callback accepts a user specified argument, so that the user can find the state corresponding to the callback. E.g.:
typedef void callback_fn(void* user_arg);   // the callback receives the caller's context back
callback_id register_callback(callback_fn* callback, void* user_arg);  // user_arg is stored alongside the callback
void unregister_callback(callback_id);
Without user_arg the user of your callback will be forced to use a global variable to store state corresponding to the callback function.
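For illustration, here is a minimal sketch of how a caller uses user_arg to route the callback back to per-object state. The Session type, the on_event trampoline, and the callback_id typedef are hypothetical; the registration function is assumed to be provided by the library:

typedef void callback_fn(void* user_arg);
typedef int callback_id;   // hypothetical handle type

callback_id register_callback(callback_fn* callback, void* user_arg);  // provided by the library

struct Session {
    int events_seen;
};

// Trampoline: recovers the caller's state from user_arg.
void on_event(void* user_arg) {
    Session* s = static_cast<Session*>(user_arg);
    ++s->events_seen;
}

void setup(Session& session) {
    // No global state needed: the Session travels through user_arg.
    callback_id id = register_callback(&on_event, &session);
    (void)id;
}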

The situation you describe is rather unlikely; I have never seen such handling anywhere. The program would just crash on its user.
But your concern is valid, because the failure's root cause (assigning a wrong address) and its manifestation (the call to an invalid address) can be so far away from each other that it can be very hard to identify.
All I can advise here is to "fail fast and loud". For instance, you could make a test call of the callback whenever it is assigned. This will still lead to a crash, but now the user will see in the stack trace where it all started from.
Again, this is not something an ordinary library user would expect...
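A minimal sketch of that "fail fast and loud" idea, assuming a hypothetical registration function and the documented convention that the callback must tolerate one probe call at registration time:

typedef void callback_fn(void* user_arg);

static callback_fn* g_callback = nullptr;
static void* g_user_arg = nullptr;

void register_callback(callback_fn* callback, void* user_arg) {
    g_callback = callback;
    g_user_arg = user_arg;
    // Probe the callback once, right here. If the address is garbage, the
    // crash happens with the registration call still on the user's stack
    // trace, instead of at some distant, unrelated point later on.
    if (g_callback != nullptr)
        g_callback(g_user_arg);
}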

As I answered here:
How to test if an instance is corrupted?
(completely different type of question, but same applies)
If the pointer is not a recognisable NULL or similar, then there is no way, in code, to tell whether it's valid or not.
You also can't use try/catch to capture the failure, as "failure to execute the code" does not result in a throw.
Since this is a "programmer error", I don't believe it's a big issue. Programmers can do what they like with their own code anyway, so whatever mechanism you add, it's going to be possible to circumvent in some way or another.

As others have said, there's no way to check, but... is it really necessary? I'm all in favor of defensive programming, but the only way you can possibly get a pointer to a function (other than a null pointer) is by taking the address of a function. Some compilers do allow explicitly converting a pointer to an object into a pointer to a function, despite the fact that the standard requires a diagnostic in such cases. But even then, the client code needs an explicit cast to screw up. And unlike objects, functions live throughout the lifetime of the program, so you cannot have a problem with a dangling pointer—a pointer which was once valid, but isn't any more. There is, in fact, practically no way to get an invalid pointer to a function except intentionally (the only way that occurs to me is if you unload a DLL containing the function), and if someone intentionally wants to screw up, there's no way you'll be able to prevent it.

Related

Is it possible to use function pointers this way?

This is something that recently crossed my mind. Quoting from Wikipedia: "To initialize a function pointer, you must give it the address of a function in your program."
So, I can't make it point to an arbitrary memory address, but what if I overwrite the memory at the address of a function with a piece of data the same size as before and then invoke it via the pointer? If such data corresponds to an actual function and the two functions have matching signatures, the latter should be invoked instead of the first.
Is it theoretically possible?
I apologize if this is impossible due to some very obvious reason that I should be aware of.
If you're writing something like a JIT, which generates native code on the fly, then yes you could do all of those things.
However, in order to generate native code you obviously need to know some implementation details of the system you're on, including how its function pointers work and what special measures need to be taken for executable code. For one example, on some systems after modifying memory containing code you need to flush the instruction cache before you can safely execute the new code. You can't do any of this portably using standard C or C++.
You might find when you come to overwrite the function, that you can only do it for functions that your program generated at runtime. Functions that are part of the running executable are liable to be marked write-protected by the OS.
The issue you may run into is Data Execution Prevention. It tries to keep you from executing data as code, or from writing to code as if it were data. You can turn it off on Windows. Some compilers/OSes may also place code into const-like sections of memory that the OS/hardware protect. The standard says nothing about what should or should not work when you write an array of bytes to a memory location and then call a function that involves jumping to that location. It's all dependent on your hardware and your OS.
While the standard does not provide any guarantees as to what would happen if you make a function pointer that does not refer to a function, in real life, with a particular implementation and knowledge of the platform, you may be able to do that with raw data.
I have seen example programs that created a char array with the appropriate binary code and have it execute by doing careful casting of pointers. So in practice, and in a non-portable way you can achieve that behavior.
It is possible, with caveats given in other answers. You definitely do not want to overwrite memory at some existing function's address with custom code, though. Not only is typically executable memory not writeable, but you have no guarantees as to how the compiler might have used that code. For all you know, the code may be shared by many functions that you think you're not modifying.
So, what you need to do is:
1. Allocate one or more memory pages from the system.
2. Write your custom machine code into them.
3. Mark the pages as non-writable and executable.
4. Run the code. There are two ways of doing it:
   - Cast the address of the pages you got in step 1 to a function pointer, and call the pointer.
   - Execute the code in another thread, passing the pointer to the code directly to a system API or framework function that starts the thread.
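A minimal sketch of those four steps on a POSIX system. Assumptions: mmap/mprotect are available, and the hard-coded bytes are x86-64 machine code; on Windows you would reach for VirtualAlloc/VirtualProtect instead:

#include <sys/mman.h>
#include <cstring>
#include <cstdio>

int main() {
    // x86-64 machine code for: mov eax, 42; ret
    unsigned char code[] = { 0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3 };

    // Step 1: allocate a writable page from the system.
    void* page = mmap(nullptr, 4096, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED) return 1;

    // Step 2: write the custom machine code into it.
    std::memcpy(page, code, sizeof code);

    // Step 3: mark the page executable and no longer writable.
    if (mprotect(page, 4096, PROT_READ | PROT_EXEC) != 0) return 1;

    // Step 4: cast the page address to a function pointer and call it.
    // (This cast is only conditionally supported by the standard.)
    int (*fn)() = reinterpret_cast<int (*)()>(page);
    std::printf("%d\n", fn());  // prints 42

    munmap(page, 4096);
    return 0;
}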
Your question is confusingly worded.
You can reassign function pointers, and you can assign them to null; the same goes for member pointers. Unless you declare them const, you can reassign them, and yes, the new function will be called instead. The signatures must match exactly. Consider using std::function instead.
You cannot "overwrite the memory at the address of a function". You probably can indeed do it some way, but just do not. You're writing into your program code and are likely to screw it up badly.

Leaving a C++ application without running destructors of global static objects

I have "inherited" a design where we are using a few global objects for doing stuff when the application exits (updating application status log files, etc ... not important to the question).
Basically the application creates dummy helper objects of specific classes and lets their destructors do this extra work when the application exits, either normally or when an error is encountered (the application knows what to do in all these cases; again, not relevant to the question).
But now I have encountered a situation where I do not want to call these destructors; I want to leave the application without executing these "termination jobs". How can I do this in a decent, platform-independent way? I do not want a solution such as dividing by zero :)
Edit: I know the design is broken :) We are working on fixing it.
Edit2: I would like to avoid any "trace" of abnormal exit... Sorry for late specification.
Edit3: Obtaining access to the source code for the destructors in order to modify them is very difficult. This happens when politicians take over the keyboard and try to write programs. We just know that "their" destructors will run on exit...
How can I do this in a decent, platform independent way?
You can't. At least not in a decent way.
You can accomplish this by throwing an exception and not catching it. The end result will be that your application will terminate quite ungracefully. Destructors will not be called. This is quite ugly, very hackish. If your design relies on this behavior to function correctly, well not only is your design completely demented, but it will be near-impossible to maintain.
I would prefer setting a boolean flag in the objects you don't want to run the destructors for. If this flag is set, then the destruction code would not be run. The destructors will still fire, but the actual code you want to avoid running can be skipped.
If you control the construction of the global, you might be able to leverage operator placement-new. Construct a global char buffer big enough for your global, then placement-new your global there. Since objects constructed this way must be destroyed by explicitly calling the destructor, simply don't call the destructor on shutdown for the global.
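A minimal sketch of the placement-new idea, with a hypothetical Logger standing in for one of the "termination job" globals:

#include <new>

struct Logger {   // hypothetical stand-in for a termination-job global
    ~Logger() { /* the exit-time work we may want to skip */ }
};

alignas(Logger) static unsigned char storage[sizeof(Logger)];
static Logger* g_logger = new (storage) Logger;  // placement-new into the buffer

// On a normal shutdown you would call g_logger->~Logger() explicitly;
// to skip the termination job, simply never call the destructor.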
abort();
Aborts the current process, producing an abnormal program termination. The function raises the SIGABRT signal (as if raise(SIGABRT) was called). This, if uncaught, causes the program to terminate, returning a platform-dependent unsuccessful termination error code to the host environment.
The program is terminated without destroying any objects and without calling any of the functions passed to atexit or at_quick_exit.
Taking your comment of "we know it's broken"...
Put a global bool somewhere:
namespace global { bool isExiting = false; }
Set this to true when exiting. In your destructors do:
if( !global::isExiting )
{
    // destruction code here
}
Hide somewhere, and be thoroughly ashamed.
The easiest approach is to replace the global objects with global pointers to heap-allocated objects. That way, their destructors only run if you manually delete the objects. Be aware that this is of course very, very nasty. A better way than using a raw pointer is to use a std::unique_ptr and, in case the destructor shouldn’t be called, release it.
If you don’t want to change the client code which is interacting with the global objects (and expect a non-pointer type), just wrap the actual pointer into a global proxy object.
(PS: There are few but very legitimate reasons for such a design. Sometimes it’s entirely safe not to call destructors, and calling them may be detrimental, e.g. because it’s known that they only release memory, but will take very long due to many nested destructor calls. This still needs to be done very carefully but is not at all “bad design”.)
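A minimal sketch of the unique_ptr variant, with a hypothetical StatusLogger standing in for one of the globals:

#include <memory>

struct StatusLogger {   // hypothetical stand-in for a global
    ~StatusLogger() { /* exit-time status/log update */ }
};

// The global is now a smart pointer to a heap-allocated object. If nothing
// intervenes, its destructor at exit deletes the object as usual.
std::unique_ptr<StatusLogger> g_logger(new StatusLogger);

void abandon_termination_jobs() {
    // release() gives up ownership without destroying the object, so
    // ~StatusLogger() never runs; the OS reclaims the memory at exit.
    (void)g_logger.release();
}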
It is ugly but it works:
Create a simple maker class.
The maker's constructor should allocate an object of your liking and assign it to a static pointer.
Make sure that nothing is done in the maker's destructor.
Create a single static instance of the maker object.
To make things look better, make the static pointer private and write an inline accessor function that converts the static pointer into a public reference.
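A sketch of that maker idiom, with a hypothetical Widget as the object whose destructor must never run:

struct Widget {   // hypothetical object whose destructor must not run
    ~Widget() { /* "termination job" that we want to skip */ }
};

class Maker {
public:
    Maker() { instance_ = new Widget; }  // constructor allocates the real object
    ~Maker() { /* deliberately empty: the Widget is never deleted */ }
    static Widget& get() { return *instance_; }  // accessor exposing a reference
private:
    static Widget* instance_;            // private static pointer
};

Widget* Maker::instance_ = nullptr;
static Maker maker;                      // the single static maker instance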

In either C or C++, should I check pointer parameters against NULL/nullptr?

This question was inspired by this answer.
I've always been of the philosophy that the callee is never responsible when the caller does something stupid, like passing invalid parameters. I have arrived at this conclusion for several reasons, but perhaps the most important one comes from this article:
Everything not defined is undefined.
If a function doesn't say in its docs that it's valid to pass nullptr, then you damn well better not be passing nullptr to that function. I don't think it's the responsibility of the callee to deal with such things.
However, I know there are going to be some who disagree with me. I'm curious whether or not I should be checking for these things, and why.
If you're going to check for NULL pointer arguments where you have not entered into a contract to accept and interpret them, do it with an assert, not a conditional error return. This way the bugs in the caller will be immediately detected and can be fixed, and it makes it easy to disable the overhead in production builds. I question the value of the assert except as documentation however; a segfault from dereferencing the NULL pointer is just as effective for debugging.
If you return an error code to a caller which has already proven itself buggy, the most likely result is that the caller will ignore the error, and bad things will happen much later down the line when the original cause of the error has become difficult or impossible to track down. Why is it reasonable to assume the caller will ignore the error you return? Because the caller already ignored the error return of malloc or fopen or some other library-specific allocation function which returned NULL to indicate an error!
In C++, if you don't want to accept NULL pointers, then don't take the chance: accept a reference instead.
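A minimal sketch of the assert approach, using a hypothetical count_chars function whose contract forbids NULL:

#include <cassert>
#include <cstddef>

// Hypothetical library function; the documented contract forbids NULL input.
std::size_t count_chars(const char* s) {
    assert(s != nullptr && "count_chars: caller passed NULL");  // debug builds fail at the call site
    std::size_t n = 0;
    while (s[n] != '\0')
        ++n;
    return n;   // with NDEBUG defined (production), the check costs nothing
}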
While in general I don't see the value in detecting NULL (why NULL and not some other invalid address?) for a public API I'd probably still do it simply because many C and C++ programmers expect such behavior.
Defense in Depth principle says yes. If this is an external API then totally essential. Otherwise, at least an assert to assist in debugging misuse of your API.
You can document the contract until you are blue in the face, but you cannot in callee code prevent ill-advised or malicious misuse of your function. The decision you have to make is what's the likely cost of misuse.
In my view, it's not a question of responsibility. It's a question of robustness.
Unless I have full control over the caller and must optimize for even the most minute speed improvement, I always check for NULL.
I lean heavily on the side of 'don't trust your user's input to not blow up your system' and on defensive programming in general. Having made APIs in a past life, I have seen users of those libraries pass in null pointers, with application crashes as the result.
If it is truly an internal library and I'm the only person (or one of a select few) with the ability to use it, then I might ease up on null pointer checks as long as everyone agrees to abide by general contracts. I can't trust the user base at large to adhere to that.
The answer is going to be different for C and C++.
C++ has references. The only difference between passing a pointer and passing a reference is that the pointer can be null. So, if the writer of the called function expects a pointer argument and forgets to do something sane when it's null, he's silly, crazy or writing C-with-classes.
Either way, this is not a matter of who wears the responsibility hat. In order to write good software, the two programmers must co-operate, and it is the responsibility of all programmers to (1) avoid special cases that would require this kind of decision and (2) when that fails, write code that blows up in a non-ambiguous and documented way in order to help with debugging.
So, sure, you can point and laugh at the caller because he messed up, because "everything not defined is undefined", and because he had to spend an hour debugging a simple null pointer bug, but your team wasted some precious time on that.
My philosophy is: Your users should be allowed to make mistakes, your programming team should not.
What this means is that the only place you should check for invalid parameters including NULL, is in the top-level user interface. Everywhere the user can provide input to your code, you should check for errors, and handle them as gracefully as possible.
Everywhere else, you should use asserts to ensure the programmers are using the functions correctly.
If you are writing an API, then only the top-level functions should catch and handle bad input. It is pointless to keep checking for a NULL pointer three or four levels deep into your call stack.
I am pro defensive programming.
Unless you can show by profiling that these nullptr checks happen in a bottleneck of your application (in such cases it is conceivable you should skip the tests at those points), comparing a pointer against zero is a really cheap operation.
I think it is a shame to leave potential crash bugs in place just to save so little CPU.
So: test your pointers against NULL!
I think that you should strive to write code that is robust for every conceivable situation. Passing a NULL pointer to a function is very common; therefore, your code should check for it and deal with it, usually by returning an error value. Library functions should NOT crash an application.
For C++, if your function doesn't accept a null pointer, then use a reference argument, in general.
There are some exceptions. For example, many people, including myself, think it's better to use a pointer argument when the actual argument will most naturally be a pointer, especially when the function stores away a copy of the pointer, even when the function doesn't support a null pointer argument.
How much to defend against invalid arguments depends, including on subjective opinion and gut feeling.
Cheers & hth.,
One thing you have to consider is what happens if some caller DOES misuse your API. In the case of passing NULL pointers, the result is an obvious crash, so it's OK not to check. Any misuse will be readily apparent to the calling code's developer.
The infamous glibc debacle is another thing entirely. The misuse resulted in actually useful behavior for the caller, and the API stayed that way for decades. Then they changed it.
In this case, the API developers should have checked values with an assert or some similar mechanism. But you can't go back in time to correct an error. The wailing and gnashing of teeth were inevitable. Read all about it here.
If you don't want a NULL then don't make the parameter a pointer.
By using a reference you guarantee that the object will not be NULL.
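A two-line illustration of the difference, with a hypothetical Widget parameter type:

struct Widget { /* ... */ };    // hypothetical parameter type

void process(Widget& w);        // cannot be handed "no object": the caller must supply a Widget
void process_maybe(Widget* w);  // may be handed nullptr: the callee must define what that means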
He who performs invalid operations on invalid or nonexistent data only deserves his system state to become invalid.
I consider it complete nonsense that functions which expect input should check for NULL, or any other specific value for that matter. The sole job of a function is to do a task based on its input or scope state, nothing else. If you have no valid input, or no input at all, then don't even call the function. Besides, a NULL check doesn't detect the millions of other possible invalid values. If you know beforehand you would be passing NULL, why would you still pass it, waste valuable cycles on yet another function call with parameter passing, an in-function comparison of some pointer, and then check the function output again for success or not? Sure, I might have done so when I was 6 years old back in 1982, but those days have long since gone.
There is of course the argument to be made for public APIs, like some DLL offering idiot-proof checking. You know the arguments: "If the user supplies NULL, you don't want your application to crash." What a non-argument. It is the user who passes bogus data in the first place; it's an explicit choice and nothing else than that. If one feels that is quality, well... I prefer solid logic and performance over such things. Besides, a programmer is supposed to know what he's doing. If he's operating on invalid data for the particular scope, then he has no business calling himself a programmer. I see no reason to degrade the performance, increase the power consumption, and increase the binary size (which in turn affects instruction caching and branch prediction) of my products in order to support such users.
I don't think it's the responsibility of the callee to deal with such things
If it doesn't take this responsibility, it might create bad results, like dereferencing NULL pointers. The problem is that it always implicitly takes this responsibility. That's why I prefer graceful handling.
In my opinion, it's the callee's responsibility to enforce its contract.
If the callee shouldn't accept NULL, then it should assert that.
Otherwise, the callee should be well behaved when it's handed a NULL. That is, either it should functionally be a no-op, return an error code, or allocate its own memory, depending on the contract that you specified for it. It should do whatever seems to be the most sensible from the caller's perspective.
As the user of the API, I want to be able to continue using it without having the program crash; I want to be able to recover at the least or shut down gracefully at worst.
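A minimal sketch of such a documented contract, using a hypothetical log_write function for which NULL is specified to be a no-op:

#include <cstddef>

// Hypothetical API function. Documented contract: a NULL message is
// treated as empty and the call is a harmless no-op.
std::size_t log_write(const char* msg) {
    if (msg == nullptr)
        return 0;                 // well-behaved no-op, per the contract
    std::size_t n = 0;
    while (msg[n] != '\0') ++n;
    // ... append msg to the log here ...
    return n;                     // number of characters written
}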
One side effect of that approach is that when your library crashes in response to being passed an invalid argument, you will tend to get the blame.
There is no better example of this than the Windows operating system. Initially, Microsoft's approach was to eliminate many tests for bogus arguments. The result was an operating system that was more efficient.
However, the reality is that invalid arguments are passed all the time, whether from programmers who aren't up to snuff or from values returned by other functions that weren't expected to be NULL. Now, Windows performs more validation and is less efficient as a result.
If you want to allow your routines to crash, then don't test for invalid parameters.
Yes, you should check for null pointers. You don't want to crash an application because the developer messed something up.
Overhead of development time + runtime performance has a trade-off with the robustness of the API you are designing.
If the API you are publishing has to run inside the process of the calling routine, you SHOULD NOT check for NULL or invalid arguments. In this scenario, if you crash, the client program crashes and the developer using your API should mend his ways.
However, if you are providing a runtime/framework which will run the client program inside it (e.g., you are writing a virtual machine, a middleware which can host the code, or an operating system), you should definitely check the correctness of the arguments passed. You don't want your program to be blamed for the mistakes of a plugin.
There is a distinction between what I would call legal and moral responsibility in this case. As an analogy, suppose you see a man with poor eyesight walking towards a cliff edge, blithely unaware of its existence. As far as your legal responsibility goes, it would in general not be possible to successfully prosecute you if you fail to warn him and he carries on walking, falls off the cliff and dies. On the other hand, you had an opportunity to warn him -- you were in a position to save his life, and you deliberately chose not to do so. The average person tends to regard such behaviour with contempt, judging that you had a moral responsibility to do the right thing.
How does this apply to the question at hand? Simple -- the callee is not "legally" responsible for the actions of the caller, stupid or otherwise, such as passing in invalid input. On the other hand, when things go belly up and it is observed that a simple check within your function could have saved the caller from his own stupidity, you will end up sharing some of the moral responsibility for what has happened.
There is of course a trade-off going on here, dependent on how much the check actually costs you. Returning to the analogy, suppose that you found out that the same stranger was inching slowly towards a cliff on the other side of the world, and that by spending your life savings to fly there and warn him, you could save him. Very few people would judge you harshly if, in this particular situation, you neglected to do so (let's assume that the telephone has not been invented, for the purposes of this analogy). In coding terms, however, if the check is as simple as checking for NULL, you are remiss if you fail to do so, even if the "real" blame in the situation lies with the caller.

How do I detect whether an address will cause an access violation?

I'm creating a class for a Lua binding which holds a pointer and can be changed by the scripter. It will include a few functions such as :ReadString and :ReadBool, but I don't want the application to crash if I can tell that the address they supplied will cause an access violation.
Is there a good way to detect whether an address is outside of readable/writable memory? Thanks!
The "Virtual" family of Windows functions may be useful here, for example VirtualQuery.
I'm not really looking for a foolproof design, I just want to omit the obvious (null pointers, pointers way outside the possible memory location)
I understand how unsafe this library is, and I'm not looking for safety, just sanity.
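Building on the VirtualQuery suggestion above, here is a minimal Windows-only sketch of such a sanity check. As the answers below explain, this is only a point-in-time test, so treat it as a debugging aid, not a safety guarantee:

#include <windows.h>

// Point-in-time sanity check: is the page containing p committed and readable?
bool looks_readable(const void* p) {
    MEMORY_BASIC_INFORMATION mbi = {};
    if (VirtualQuery(p, &mbi, sizeof mbi) == 0)
        return false;                        // address is not in any region
    if (mbi.State != MEM_COMMIT)
        return false;                        // reserved or free, not backed by memory
    if (mbi.Protect & PAGE_GUARD)
        return false;                        // touching a guard page raises an exception
    const DWORD readable = PAGE_READONLY | PAGE_READWRITE |
                           PAGE_EXECUTE_READ | PAGE_EXECUTE_READWRITE;
    return (mbi.Protect & readable) != 0;    // note: checks only one region, and another
                                             // thread can invalidate this result immediately
}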
There are ways, but they do not serve the purpose you intend. That is; yes, you can determine whether an address appears to be valid at the current moment in time. But; no, you cannot determine whether that address will be valid a few clock cycles from now. Another thread could change the virtual memory map and a formerly valid address would become invalid.
The only way to properly handle the possibility of accessing suspect pointers is to use whatever native exception handling is available on your platform. This may involve handling the SIGSEGV or SIGBUS signals, or it may involve using the proprietary __try and __except extensions.
The idiom to use is the one wherein you attempt the access, and explicitly handle the resulting exception, if any does happen to occur.
You also have the problem of ensuring that the pointers you are handed point to your memory rather than to some other memory. For that, you could build your own data structure (a tree springs to mind) which stores the valid address ranges your "pointers" may reach. Otherwise, the script code can hand you arbitrary absolute addresses, and you would end up modifying memory structures maintained by the operating system for the process environment.
The application you write about is highly suspect and you should probably start over with a less explosive design. But I thought I would tell you how to play with fire if you really want to.
Check out Raymond Chen's blog post, which goes more deeply into why this practice is bad. Most interestingly, he points out that, once a page is tested by IsBadReadPtr, further accesses to that page will not raise exceptions!
There is no good way, and that's why you should never do things like this.
Perhaps try using segvcatch, which can convert segfaults into C++ exceptions.

Is it good programming practice to always check for null pointers before using an object in C++?

This seems like a lot of work, checking for null each time an object is used.
I have been advised that it is a good idea to check for null pointers so you don't have to spend time looking for where segmentation faults occur.
Just wondering what the community here thinks?
Use references whenever you can, because they can't be null, therefore you don't have to check if they are null.
It's good practice to check for null in function parameters and other places you may be dealing with pointers someone else is passing you. However, in your own code, you might have pointers you know will always be pointing to a valid object, so a null check is probably overkill... just use your common sense.
I don't know if it really helps with debugging because any debugger will be showing you pretty clearly that a null pointer was used and it won't take long to find it. It's more about making sure you don't crash if another programmer passes in NULL, or that the mistake is picked up by an assert in a debug build.
No. You should instead make sure the pointers were not set to NULL in the first place. Note that in Standard C++:
int * p = new int;
then p can never be NULL because new will throw an exception if the allocation fails.
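A short illustration of the contrast; only the nothrow form can actually yield a null pointer:

#include <new>

int* a = new int;                  // never NULL: plain new throws std::bad_alloc on failure
int* b = new (std::nothrow) int;   // may be NULL: only this form needs a null check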
If you are writing functions that can take a pointer as a parameter, you should treat them like this
// does something with p
// note that p cannot be NULL
void f( int * p );
In other words you should document the requirements of the function. You can also use assert() to check if someone has ignored your documentation (if they have, it's their problem, not yours), but I must admit I have gone off this as time has gone on - simply say what the function requires, and leave the responsibility with the caller.
A third bit of advice is simply not to use pointers - most C++ code that I've seen overuses pointers to a ridiculous extent - you should use values and references wherever possible.
In general, I would advise against doing this, as it makes your code harder to read and you also have to come up with some sensible way of dealing with the situation if a pointer is actually NULL.
In my C++ projects, I check whether a pointer (if I am using pointers at all) is NULL only if NULL could be a valid state of the pointer. Checking for NULL if the pointer should never actually be NULL is a bit pointless, because you are then trying to work around some programming error you should fix instead.
Additionally, when you feel the need to check if a pointer is NULL, you should probably define more clearly who owns the pointer/object.
Also, you never have to check if new returns NULL, because it never will return NULL. It will throw an exception if it could not create an object.
I hate the amount of code that checking for nulls adds, so I only do it for functions I export to other people.
If I use the function internally and I know how I use it, I don't check for nulls, since it would make the code too messy.
The answer is yes, if you are not in control of the object; that is, if the object is returned from some method you do not control, or if in your own code you expect (or it is possible) that an object can be null.
It also depends on where the code will run. If you are writing professional code that customers/users will see, it's generally bad for them to see null pointer problems. It's better if you can detect it beforehand and print out some debugging information, or otherwise report it to them in a "nicer" way.
If it's just code you are using informally, you will probably be able to understand the source of the null pointer without any additional information.
I figure I can do a whole lot of checks for NULL pointers for the cost of (debugging) just one segfault.
And the performance hit is negligible. TWO INSTRUCTIONS. Test for register == zero, branch if test succeeds. Depending on the machine, maybe only ONE instruction, if the register load sets the condition codes (and some do).
Others (AshleysBrain and Neil Butterworth) already answered correctly, but I will summarize it here:
Use references as much as possible
If using pointers, initialize them either to NULL or to a valid memory address/object
If using pointers, always verify if they are NULL before using them
Use references (again)... This is C++, not C.
Still, there is one corner case where a reference can be invalid/NULL:
void foo(T & t)
{
    t.blah();
}
void bar()
{
    T * t = NULL;
    foo(*t);
}
The compiler will probably compile this, and then, at execution, the code will crash at the t.blah() line (if T::blah() uses this one way or another).
Still, this is cheating/sabotage: the writer of the bar() function dereferenced t without verifying that t was NOT null. So, even if the crash happens in foo(), the error is in the code of bar(), and the writer of bar() is responsible.
So, yes, use references as much as possible, be aware of this corner case, and don't bother protecting against sabotaged references...
And if you really need to use a pointer in C++, unless you are 100% sure the pointer is not NULL (some functions guarantee that kind of thing), then always test the pointer.
I think that is a good idea for a debug version.
In a release version, checking for null pointers can result in a performance degradation.
Moreover, there are cases where you can check the pointer value in a parent function and avoid the checking in its children.
If the pointers are coming to you as parameters to a function, then make sure they are valid at the beginning of the function. Otherwise, there is not much point. new throws an exception on failure.