Why am I getting a spurious error message from SymInitialize? - c++

In writing a C++ Windows application, I'm using SymInitializeW to initialize the symbols for getting a backtrace. As the documentation mentions, I'm checking the return code, and then using GetLastError and FormatMessage when SymInitializeW returns false (as in the example).
However, I'm getting an error message of "The data area passed to a system call is too small" when I do so. I'm not sure what that's referring to, as there really isn't a "data area" being passed - just the process handle, the PCWSTR for the search path, and the bool. -- It's doubly confusing as it seems like the symbol loading works. (e.g. if I skip the error handling, things seem to work properly.)
Does this message point to something I'm actually doing wrong, or is it spurious? If spurious, why is SymInitializeW returning false?

The SymInitialize functions should only be called once on a given process handle. If there's any code path in which a SymInitialize function can be called multiple times, you may get odd errors like "The data area passed to a system call is too small" (ERROR_INSUFFICIENT_BUFFER, 122 (0x7A)) or "The parameter is incorrect" (ERROR_INVALID_PARAMETER 87 (0x57)) and potentially others from GetLastError, despite the fact that you're using all the correct parameters according to the documentation. (There isn't necessarily a specific "Don't call SymInitialize twice" error.)
Best practice is to make sure the control flow through your symbol-handling functions is clear, and that you call SymInitialize once and only once at the top, and then call SymCleanup on the process handle prior to exiting the functions that are doing symbol handling. If you've properly called SymCleanup, subsequent calls to SymInitialize should succeed.
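A minimal sketch of that shape (the stack-walking body is elided; all names here are placeholders, not taken from your code):

#include <windows.h>
#include <dbghelp.h>
#pragma comment(lib, "dbghelp.lib")

bool DumpBacktrace()
{
    HANDLE hProcess = GetCurrentProcess();

    // Call SymInitializeW exactly once for this process handle.
    if (!SymInitializeW(hProcess, nullptr, TRUE))
    {
        DWORD err = GetLastError();
        // report err with FormatMessage, etc.
        return false;
    }

    // ... walk the stack with StackWalk64 / SymFromAddrW here ...

    // Pair the successful SymInitializeW with exactly one SymCleanup.
    SymCleanup(hProcess);
    return true;
}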

Related

What may cause EnumProcesses() to fail?

The documentation states:
If the function fails, the return value is zero. To get extended error
information, call GetLastError.
But it doesn't give any example how the function could possibly fail.
For unit testing I need to reliably create a situation that makes EnumProcesses() fail.
Like most functions, it can fail if you pass it invalid parameters. In this case that means a PID array smaller than the size you claim it is, or a NULL pointer for the returned byte count. It is a bit risky to do this on purpose because you don't know whether the function uses SEH to protect against this or whether it will just crash.
Internally the function has to allocate some memory before calling into NTDLL to get the process information and this can cause the function to fail if there is not enough memory available.
You should call EnumProcesses in a helper function to abstract away the memory/retry details anyway and that would be a good place to simulate failures when needed.
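For example, a minimal sketch of such a wrapper (the function name and growth strategy are just illustrative choices):

#include <windows.h>
#include <psapi.h>   // link with psapi.lib (older SDKs) or kernel32 (newer)
#include <vector>

bool GetAllProcessIds(std::vector<DWORD>& pids)
{
    pids.resize(256);
    for (;;)
    {
        DWORD bufferBytes = static_cast<DWORD>(pids.size() * sizeof(DWORD));
        DWORD bytesReturned = 0;
        if (!EnumProcesses(pids.data(), bufferBytes, &bytesReturned))
            return false;   // a natural place to inject failures for unit tests

        if (bytesReturned < bufferBytes)
        {
            // The buffer was large enough; trim to the number of PIDs returned.
            pids.resize(bytesReturned / sizeof(DWORD));
            return true;
        }
        // The buffer may have been too small; grow it and retry.
        pids.resize(pids.size() * 2);
    }
}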
If you absolutely need the function itself to fail you could hook it with something like Microsoft Detours or IAT hooking...

Which WinAPI functions set the last error to ERROR_SUCCESS if no error occurred?

There is a large legacy codebase which seems to fail in some hard to reproduce and investigate situation.
It calls some WinAPI function, say, CopyFile, and instead of checking the return code, checks the GetLastError() value. I know this is wrong, but it would be really nice to know whether a non-null last error value originates from this call or from something that happened earlier. If I were sure that CopyFile sets the last error to ERROR_SUCCESS when everything goes well, that would be enough to conclude that this specific call failed.
MSDN mentions that some functions do this and some do not, but does not say specifically which ones do. Is there some unofficial list/reference which covers this question?
GetLastError() usage is documented on a per-function basis. There is no single master list that documents which functions act which way in regards to GetLastError(). If any given function is not documented as setting the last error, do not use GetLastError() to check the last error after the function exits.
Most functions that do set the last error do not set it on success, only on failure, and are documented as such.
Functions that do set the last error on success are documented as such, and will also document which success conditions set the last error to which value. This is typically used in cases where a function's return value is ambiguous as to whether the function succeeded or failed, so GetLastError() is used to differentiate. For instance, most functions return 0 on failure, but some functions may return 0 on both success and failure. In such cases, if GetLastError() then returns 0 (or a defined success code) then the function succeeded, otherwise the function failed. TlsGetValue() is one example of this (see the sketch below).
A notable exception to this are functions that create named kernel objects (CreateMutex(), CreateEvent(), CreateSemaphore(), etc). They return a non-zero value on success, but also GetLastError() will return either 0 or ERROR_ALREADY_EXISTS depending on whether the function returns a handle to a newly-created object or an existing object. This is documented accordingly for each function.
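For illustration, a hedged sketch of both documented patterns (the mutex name and the surrounding function are placeholders):

#include <windows.h>

void LastErrorOnSuccessExamples()
{
    // CreateMutexW: a non-NULL handle means success, but GetLastError()
    // (checked immediately) tells you whether the named object already existed.
    HANDLE hMutex = CreateMutexW(nullptr, FALSE, L"Global\\MyAppMutex");
    if (hMutex != nullptr && GetLastError() == ERROR_ALREADY_EXISTS)
    {
        // another process/thread created the mutex first
    }
    if (hMutex != nullptr)
        CloseHandle(hMutex);

    // TlsGetValue: 0 is both a legal stored value and the failure return,
    // so the last error (which it clears to ERROR_SUCCESS on success)
    // is needed to tell the two cases apart.
    DWORD tlsIndex = TlsAlloc();
    if (tlsIndex != TLS_OUT_OF_INDEXES)
    {
        LPVOID value = TlsGetValue(tlsIndex);
        if (value == nullptr && GetLastError() != ERROR_SUCCESS)
        {
            // the call itself failed, as opposed to a stored value of 0
        }
        TlsFree(tlsIndex);
    }
}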

Is GetLastError() a kind of design pattern? Is it a good mechanism?

The Windows API uses the GetLastError() mechanism to retrieve information about an error or failure. I am considering the same mechanism to handle errors as I write APIs for a proprietary module. My question is: would it be better for the API to return the error code directly instead? Does GetLastError() have any particular advantage? Consider the simple Win32 API example below:
HANDLE hFile = CreateFile(sFile,
                          GENERIC_WRITE, FILE_SHARE_READ,
                          NULL, CREATE_NEW, FILE_ATTRIBUTE_NORMAL, NULL);
if (hFile == INVALID_HANDLE_VALUE)
{
    DWORD lrc = GetLastError();
    if (lrc == ERROR_FILE_EXISTS)
    {
        // msg box and so on
    }
}
As I was writing my own APIs I realized the GetLastError() mechanism means that CreateFile() must set the last error code at all exit points. This can be a little error-prone if there are many exit points and one of them may be missed. Dumb question, but is this how it is done, or is there some kind of design pattern for it?
The alternative would be to provide an extra parameter to the function which it can fill in with the error code directly, so a separate call to GetLastError() would not be needed. Yet another approach is shown below. I will stick with the above Win32 API, which is a good example to analyze. Here I am changing the signature to this (hypothetically):
result = CreateFile(hFile, sFile,
                    GENERIC_WRITE, FILE_SHARE_READ,
                    NULL, CREATE_NEW, FILE_ATTRIBUTE_NORMAL, NULL);
if (result == SUCCESS)
{
    // hFile has the correct value, process it
}
else if (result == FILE_ALREADY_EXISTS)
{
    // display message accordingly
    return;
}
else if (result == INVALID_PATH)
{
    // display message accordingly
    return;
}
My ultimate question is: what is the preferred way to return an error code from an API, or even just a function, since they are both the same thing?
Overall, it's a bad design. This is not specific to Windows' GetLastError function; Unix systems have the same concept with the global errno variable. It's bad because it's an implicit output of the function. This has a few nasty consequences:
Two functions being executed at the same time (in different threads) may overwrite the global error code. So you may need to have a per-thread error code. As pointed out by various comments to this answer, this is exactly what GetLastError and errno do - and if you consider using a global error code for your API then you'll need to do the same in case your API should be usable from multiple threads.
Two nested function calls may throw away error codes if the outer function overwrites an error code set by the inner.
It's very easy to ignore the error code. In fact, it's harder to actually remember that it's there because not every function uses it.
It's easy to forget setting it when you implement a function yourself. There may be many different code paths, and if you don't pay attention one of them may allow the control flow to escape without setting the global error code correctly.
Usually, error conditions are exceptional. They don't happen very often, but they can. A configuration file you need may not be readable - but most of the time it is. For such exceptional errors, you should consider using C++ exceptions. Any C++ book worth its salt will give a list of reasons why exceptions in any language (not just C++) are good, but there's one important thing to consider before getting all excited:
Exceptions unroll the stack.
This means that when you have a function which yields an exception, it gets propagated to all the callers (until it's caught by someone, possibly the C runtime system). This in turn has a few consequences:
All caller code needs to be aware of the presence of exceptions, so all code which acquires resources must be able to release them even in the face of exceptions (in C++, the 'RAII' technique is usually used to tackle this; a short sketch follows below).
Event loop systems usually don't allow exceptions to escape event handlers. There's no good concept of dealing with them in this case.
Programs dealing with callbacks (plain function pointers for instance, or even the 'signal & slot' system used by the Qt library) usually don't expect that a called function (a slot) can yield an exception, so they don't bother trying to catch it.
The bottom line is: use exceptions if you know what they do. Since you seem to be rather new to the topic, stick to function return codes for now, but keep in mind that this is not a good technique in general. Don't go for a global error variable/function in either case.
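To make the RAII point concrete, here is a minimal sketch (the File type is purely illustrative, not from any particular library):

#include <cstdio>
#include <stdexcept>

// RAII wrapper: the file is closed even if an exception propagates through.
class File {
public:
    explicit File(const char* path) : f_(std::fopen(path, "r")) {
        if (!f_) throw std::runtime_error("cannot open file");
    }
    ~File() { if (f_) std::fclose(f_); }
    File(const File&) = delete;
    File& operator=(const File&) = delete;
    std::FILE* get() const { return f_; }
private:
    std::FILE* f_;
};

void process(const char* path)
{
    File f(path);            // resource acquired here
    // ... code that may throw ...
}                            // resource released here, even if an exception unwinds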
The GetLastError pattern is by far the most prone to error and the least preferred.
Returning a status code enum is a better choice by far.
Another option which you did not mention, but is quite popular, would be to throw exceptions for the failure cases. This requires very careful coding if you want to do it right (and not leak resources or leave objects in half-set-up states) but leads to very elegant-looking code, where all the core logic is in one place and the error handling is neatly separated out.
I think GetLastError is a relic from the days before multi-threading. I don't think that pattern should be used any more except in cases where errors are extraordinarily rare. The problem is that the error code has to be per-thread.
The other irritation with GetLastError is that it requires two levels of testing. You first have to check the return code to see if it indicates an error and then you have to call GetLastError to get the error. This means you have to do one of two things, neither particularly elegant:
1) You can return a boolean indicating success or failure. But then, why not just return the error code with zero for success?
2) You can have a different return value test for each function based on a value that is illegal as its primary return value. But then what of functions where any return value is legal? And this is a very error-prone design pattern. (Zero is the only illegal value for some functions, so you return zero for error in that case. But where zero is legal, you may need to use -1 or some such. It's easy to get this test wrong.)
I have to say, I think the global error handler style (with proper thread-local storage) is the most realistically applicable when exception-handling cannot be used. This is not an optimal solution for sure, but I think if you are living in my world (a world of lazy developers who don't check for error status as often as they should), it's the most practical.
Rationale: developers just tend not to check error return values as often as they should. How many examples can we point to in real-world projects where a function returned some error status only for the caller to ignore it? Or how many times have we seen a function that wasn't even correctly returning error status even though it was, say, allocating memory (something which can fail)? I've seen too many examples like these, and going back and fixing them can sometimes even require massive design or refactoring changes throughout the codebase.
The global error handler is a lot more forgiving in this respect:
If a function failed to return a boolean or some ErrorStatus type to indicate failure, we don't have to modify its signature or return type to indicate failure and change the client code all over the application. We can just modify its implementation to set a global error status. Granted, we still have to add the checks on the client side, but if we miss an error immediately at a call site, there's still opportunity to catch it later.
If a client fails to check the error status, we can still catch the error later. Granted, the error may be overwritten by subsequent errors, but we still have an opportunity to see that an error occurred at some point whereas calling code that simply ignored error return values at the call site would never allow the error to be noticed later.
While being a sub-optimal solution, if exception-handling can't be used and we're working with a team of code monkeys who have a terrible habit of ignoring error return values, this is the most practical solution as far as I see.
Of course, exception-handling with proper exception-safety (RAII) is by far the superior method here, but sometimes exception-handling cannot be used (e.g., we should not throw across module boundaries). While a global error handler like the Win API's GetLastError or OpenGL's glGetError sounds like an inferior solution from a strict engineering standpoint, it's a lot more forgiving to retrofit into a system than making everything return an error code and forcing everything that calls those functions to check for it.
If this pattern is applied, however, one must take careful note to ensure it can work properly with multiple threads, and without significant performance penalties. I actually had to design my own thread-local storage system to do this, but our system predominantly uses exception-handling and only this global error handler to translate errors across module boundaries into exceptions.
All in all, exception-handling is the way to go, but if this is not possible for some reason, I have to disagree with the majority of the answers here and suggest something like GetLastError for larger, less disciplined teams (I'd say return errors through the call stack for smaller, more disciplined ones) on the basis that if a returned error status is ignored, this allows us to at least notice an error later, and it allows us to retrofit error-handling into a function that wasn't properly designed to return errors by simply modifying its implementation without modifying the interface.
If your API is in a DLL and you wish to support clients that use a different compiler than you, then you cannot use exceptions. There is no binary interface standard for exceptions.
So you pretty much have to use error codes. But don't model the system using GetLastError as your exemplar. If you want a good example of how to return error codes look at COM. Every function returns an HRESULT. This allows callers to write concise code that can convert COM error codes into native exceptions. Like this:
Check(pIntf->DoSomething());
where Check() is a function, written by you, that receives an HRESULT as its single parameter and raises an exception if the HRESULT indicates failure (a possible sketch is shown further below). It is the fact that the return value of the function indicates status that allows this more concise coding. Imagine the alternative of returning the status via a parameter:
pIntf->DoSomething(&status);
Check(status);
Or, even worse, the way it is done in Win32:
if (!pIntf->DoSomething())
Check(GetLastError());
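For reference, one possible sketch of the Check() helper described above (throwing _com_error is just one choice; any exception type would do):

#include <windows.h>
#include <comdef.h>   // _com_error

// Convert a failed HRESULT into an exception so call sites stay one line each.
inline void Check(HRESULT hr)
{
    if (FAILED(hr))
        throw _com_error(hr);
}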
On the other hand, if you are prepared to dictate that all clients use the same compiler as you, or you deliver the library as source, then use exceptions.
Exception handling in unmanaged code is not recommended. Dealing with memory leaks is a big issue even without exceptions; with exceptions it becomes a nightmare.
A thread-local variable for the error code is not such a bad idea, but as some of the other people said, it is a bit error-prone.
I personally prefer every method to return an error code. This creates an inconvenience for functional methods, because instead of:
int a = foo();
you will need to write:
int a;
HANDLE_ERROR(foo(a));
Here HANDLE_ERROR could be a macro that checks the code returned from foo and, if it is an error, propagates it up (by returning it).
If you prepare a good set of macros to handle different situations, writing code with good error handling but without exception handling becomes possible.
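As a rough illustration, here is one way such a macro might look (the names and the zero-means-success convention are assumptions, not a standard):

#include <cstdio>

// Propagate a nonzero error code up the call stack; 0 means success here.
#define HANDLE_ERROR(call)              \
    do {                                \
        int handleError_rc = (call);    \
        if (handleError_rc != 0)        \
            return handleError_rc;      \
    } while (0)

// Hypothetical function: fills 'a' and returns 0 on success.
int foo(int& a)
{
    a = 42;
    return 0;
}

int bar()
{
    int a;
    HANDLE_ERROR(foo(a));   // on failure, bar() returns foo's error code
    std::printf("a = %d\n", a);
    return 0;
}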
Now, when your project starts growing, you will notice that call stack information for the error is very important. You could extend your macros to store the call stack info in a thread-local storage variable. That is very useful.
Then you will notice that even the call stack is not enough. In many cases an error code of "File not found" at a line that says fopen(path, ...); does not give you enough information to find out what the problem is: which file was not found? At this point you could extend your macros to store messages as well, and then you could provide the actual path of the file that was not found.
The question is why bother with all of this when you could just use exceptions. Several reasons:
Again, exception handling in unmanaged code is hard to do right.
The macro-based code (if done right) happens to be smaller and faster than the code needed for exception handling.
It is way more flexible: you can enable or disable features.
In the project that I am working on at the moment I implemented such error handling. It took me two days to get it to a level where it was ready to start using, and in about a year since then I have probably spent about two weeks of total time maintaining it and adding features to it.
You should also consider an object/structure-based error code variable, like the C stdio library does for FILE streams.
On some of my I/O objects, for example, I just skip all further operations when the error state is set, so the user is fine checking for the error just once after a sequence of operations.
This pattern allows you to fine-tune the error handling scheme much better.
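A minimal sketch of that idea (the names are illustrative):

#include <cstddef>

// Stream-like object with a sticky error flag, in the spirit of FILE/ferror().
class Writer {
public:
    void write(const char* data, std::size_t len) {
        if (failed_) return;        // once an error is latched, skip further work
        if (!doWrite(data, len))
            failed_ = true;
    }
    bool failed() const { return failed_; }

private:
    // Hypothetical low-level write; always "succeeds" in this sketch.
    bool doWrite(const char*, std::size_t) { return true; }
    bool failed_ = false;
};

// Usage: issue a whole sequence of writes, then check the error state once.
// Writer w;
// w.write(buf1, n1); w.write(buf2, n2);
// if (w.failed()) { /* report the error */ }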
One of the bad designs of C/C++ comes to full light here when comparing it, for example, with Google's Go language: the return of just one value from a function. Go does not use exceptions; instead it always returns two values, the result and the error code.
There is a minority of people who think that exceptions are bad and misused most of the time, because errors are not exceptional but something you have to expect. And it hasn't been proven that software becomes more reliable and easier to write with them, especially in C++, where the only way to program nowadays is with RAII techniques.

How can PyImport_AppendInittab fail?

According to the official docs, PyImport_AppendInittab will return -1 on failure. They do not, however, specify why this function would fail.
I'd like to know if it can only fail due to the programmer's fault (incorrect arguments, not being called at the right time, etc), or if it can also fail because of some other factors that are out of the programmer's control (like Python not being installed).
I'm asking because I want to know if I should handle this with an assert or an exception. Also, in case I should handle it with exceptions, is there any way for me to catch an error message from the Python API that specifies why the function call failed?
According to the docs, PyImport_AppendInittab() is a convenience wrapper around PyImport_ExtendInittab() and returns -1 "if the table could not be extended". Furthermore, PyImport_ExtendInittab() returns -1 "if insufficient memory could be allocated to extend the internal table". Both functions "should be called before Py_Initialize()".
Consequently, these functions should only fail if the program is out of memory. I guess they could also fail when supplied with invalid arguments, for example when trying to register a built-in module with the same name as an existing one. The latter case is easily avoided, since names of built-in modules are well known.
In summary, you can assume that a return value of -1 means "out of memory", and this should never happen since the function is only called early in the process (before Py_Initialize()), plus the amount of memory required for the module table is rather small.
If PyImport_AppendInittab() fails, Python does not provide an error string. To throw a meaningful exception, you could just report the information you know at this point: failed to add the module MODULENAME to the interpreter's built-in modules.
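A hedged sketch of that approach (the module name and init function are placeholders for your own):

#include <Python.h>
#include <stdexcept>

extern "C" PyObject* PyInit_mymodule();   // your module's init function

void RegisterBuiltinModule()
{
    // Must be called before Py_Initialize().
    if (PyImport_AppendInittab("mymodule", &PyInit_mymodule) == -1)
    {
        // Python provides no error string here, so report what we know.
        throw std::runtime_error(
            "failed to add the module 'mymodule' to the interpreter's built-in modules");
    }
    Py_Initialize();
}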

What is a good way to recover from a fread() failure?

If a call to fread() returns 0 and ferror() indicates an error (vs. EOF), is it OK to retry the read or is it better to close and reopen the file?
I can't start over entirely -- the input file has been partially processed in a way that can't be undone (say I'm writing out a chunk at a time to a socket and, due to existing protocol, have no way of telling the remote end, "never mind, I need to start over").
I could fclose() and fopen() the file, fseek() past the data already processed, and continue the fread()-ing from there, but is all that necessary?
There's no "one size fits all" solution, since different errors can require different handling. Errors from fread() are unusual; if you're calling it correctly, an error may indicate a situation that has left the FILE* in a weird error state. In that case you're best off calling fclose(), fopen(), fseek() to get things back in a good state.
If you're coding for an error that's actually happening, please mention the actual errors you're getting from ferror()...
You can give the clearerr function a look.
You can show the error to the user with perror() or strerror() and ask whether they want to retry.
It's not mandatory for the implementation to provide such an error message, though. You should set errno to 0 before calling fread(); if it fails and errno is still 0 then no error information will be available.
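A minimal sketch combining those suggestions (whether a retry can actually succeed depends on the underlying error):

#include <cstdio>
#include <cerrno>
#include <cstring>

std::size_t ReadChunk(std::FILE* f, char* buf, std::size_t len)
{
    errno = 0;
    std::size_t n = std::fread(buf, 1, len, f);
    if (n == 0 && std::ferror(f))
    {
        if (errno != 0)
            std::fprintf(stderr, "read error: %s\n", std::strerror(errno));
        else
            std::fprintf(stderr, "read error (no errno information available)\n");
        std::clearerr(f);   // clear the sticky error flag before any retry
    }
    return n;
}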