Strange strings in executable VC++ - c++

I created simple C++ Hello world program, then I compiled it using MSVC++, and then I looked into executable using Notepad++ (I know it is not the best program to open binary files with it, but I wanted to know, if there are any human readable strings). I found there strings like A cast to a smaller data type has caused a loss of data. If this was intentional, you should mask the source of the cast with the appropriate bitmask.
The value of ESP was not properly saved across a function call. This is usually a result of calling a function declared with one calling convention with a function pointer declared with a different calling convention.
What are those strings, and where they came from, and how I can get rid of them?

All constant strings are human readable through the use of a text editor. Try looking for "hello world" in your program and it will pop up. (ran into this for the first time when we were trying to figure out if obfuscating our opencl code was worth it...it wasn't).
These strings are the error strings that windows throws on top of every executable. I have no idea how to get rid of them.

Related

movefile() fails error 2 or 123

I'm updating a c++ routine to move files that was written in visual studio express 2008/2010. I'm now running VS Express 2012
Obviously there are changes to the compiler because string functions have to be upgraded to strcpy_s etc. No problem. This is a console app. I never extended my C++ knowledge past C++ to C# etc. as I need little more than to be able to write small utils to do things on the command line. Still I'm able to write somewhat complex utilities.
My issue is movefile() function always fails to move with either error 2 or 123. I'm working in C:\users\alan\downloads folder so I know I have permission. I know the file is there. Small snippet of code is:
char source=".\\test.txt"; // edited for clarity.
char dest=".\\test.txt1";
printf("\nMove\n %s\n to %s\n",source,dest); // just to see what is going on
MoveFile((LPCWSTR) source, (LPCWSTR) dest);
printf("Error %u\n",GetLastError());
output is :
Move
.\test.txt
to .\test.txt1
Error 2
All of my strings are simple char strings and I'm not exactly sure, even after reading, what LPCWSTR was type def'd for and if this is the culprit. So to get this to compile I simply typedef'd my strings. And it compiles. But still it won't move the files.
The code is more complex in developing the source & dest variables but I've reduce it to a simple "just append a 1 to the file name" situation to see if I can just simply rename it. I thought C:\xxx\yyy\zzz\test.txt was maybe wrong in some fashion but that idea fell though with the above test. I've done it with and without the .\ same issue. I'm running out of ideas other than making my own fileopen read/write binary function to replace movefile(). I'm really against that but if I have to I will.
EDIT: I pasted the printf from original code that used FullPathName, I've corrected the snippet.
The fact that you are casting your arguments to LPCWSTR suggests that you are compiling your program with UNICODE defined, which means you are calling MoveFileW and the compiler warned about an argument type mismatch.
Inserting a cast does not fix that. You are telling the compiler to stop complaining, but you haven't actually fixed the problem (the underlying data is still wrong).
Actual solutions:
Use WCHAR as MoveFileW expects (or TCHAR/LPTSTR and the _T macro).
Explicitly call MoveFileA
Compile without UNICODE defined.
Thanks Andrew Medico. I used MoveFileA and the program seems to work now.
I'm not sure I turned off unicode, but I did change one item in the properties.
I'll need to read up on the compiler about unicode/ansi settings. But for now the issue is fixed and I'm sure I've got the idea of what I need to do. "research"!!!!

How to find a pointer to a function by string

I have a list of functions in a text file that I'd like to expose to LLVM for its execution engine at run time, I'm wondering if its possible to find pointers to the functions at runtime rather than hard code in all the GlobalMappings by hand as I'd probably like to add in more later. For example:
// File: InternalFunctions.txt
PushScreen
PopScreen
TopScreen
// File: ExposeEngine.cpp
// Somehow figure out the address of the function specified in a string
void* addy = magicAddress("PushScreen");
jit->addGlobalMapping(llvmfunction, addy);
If this is possible I love to know how to do it, as I am trying to write my game engine by jit-ing c++. I was able to create some results earlier, but I had to hard-code in the mappings. I noticed that Gtk uses something along the lines of what I'm asking. When you use glade and provide a signal handler, the program you build in c will automatically find the function in your executable referenced by the string provided in the glade file. If getting results requires me to look into this Gtk thing I'd be more than happy to, but I don't know what feature or part of the api deals with that - I've already tried to look it up. I'd love to hear suggestions or advice.
Yes, you can do this. Look at the man pages for dlopen() and dlsym(): these functions are standard on *nix systems and let you look up symbols (functions or variables) by name. There is one significant issue, which is that C++ function names are usually "mangled" to encode type information. A typical way around this is to define a set of wrapper functions in an extern "C" {} block: these will be non-member, C-style functions which can then call into your C++ code. Their names will not be mangled, making them easy to look up using dlsym().
This is a pretty standard way that some plugin architectures work. Or at least used to work, before everyone started using interpreted languages!

How to get the value of a variable that's deep in the source code (C++)? (eg. value of stage_sum in haar.cpp, OpenCV)

I'd like to get a value from a variable that's located deeply in the source code of the OpenCV library. Specifically, I'm trying to print out the value of stage_sum from the file haar.cpp. My starting point, facedetect.cpp, calls the method detectMultiScale, which then calls the function cvHaarDetectObjects, which calls cvHaarDetectObjectsForROC etc., until it finally reaches the function cvRunHaarClassifierCascadeSum, where stage_sum is calculated.
Is there a way I could get the value out to facedetect.cpp easily, without changing the declarations of all the preceding functions/methods, headers etc.? Simply trying to cout or printf the value directly in the source code hasn't given any results.
Thanks everyone for your help!
One option is simply to use a debugger.
However, if you want to do this programatically (i.e. access the variable as part of your application code), then unless the variable is exposed in the library's public interface, there are two options available:
Modify the library's source code, and recompile it.
Resort to undefined-behaviour (fiddling around with the raw bytes that make up an object, etc.).
Just to point the obvious, adding a std::cout() or printf() call inside haar.cpp won't do the trick. You need to recompile OpenCV for this changes to take effect and then reinstall the libraries on your system.

How do I force a section of unfinished code to run even under max optimizations?

I have a function in my program that preforms a whole bunch of floating point math. It returns an array of values which is not currently being used in my program yet.
I want to test this piece of code for speed under maximum optimizations, however since the code isn't used, the compiler conveniently skips the function all together and I can't get a time on it.
How do force the compiler to run that section of code under maximum optimizations even though the result is not used (I want the computer to just give me a sense as to how fast the section runs).
I'm running Visual C++ 2008.
You could use SecureZeroMemory() to overwrite the result after is has been received from the function. You don't even need to overwrite the whole result, one array element will be enough, maybe you can even pass zero as "number of bytes", so that nothing is done by the function.
This will do the trick on Windows - SecureZeroMemory() is intended to never be optimized out by the compiler. Using it is pretty straightforward and it's rather fast.
I'm sure there are many compiler tricks, but the easiest way is to just make it look like you are using the value. In this case, just pass the returned array to some other function. The other function doesn't need to do anything, but that should be enough to convince the compiler you need the results.
If you find that your empty second function is being optimized out as well, then just stick it in a shared library (DLL) and it is impossible for the compiler to know how it is being used.
How you allocate the result can also change this. If you pass the original function a pointer, you could just pass it a heap pointer. Since that pointer may be used somewhere else it is highly unlikely the compiler could optimize away the code, as it has no idea if the results will be used or not.
You could also just legitimately use the data. It makes sense to verify the results in another function. If doing performance testing just put this verification part outside of the timed section. This is generally how I do such performance tests (make sure the result is checked/used).
This is what a test case is for. Write a test case in a separate binary (even just in the main() method) which sets a throwaway local variable to the result of the function. Time using your preferred method (e.g by capturing time(NULL) from immediately before and after the assignment and printing the time difference). You should have a decent idea of running time from that.
EDIT: time(NULL) is whole-second precision = bad and evil. Use clock(), as shown here, for the most accurate precision in the C/C++ standard library.
if you are using visual studio the code down here would work, but idon't know about any other solutions for gcc
#pragma optimize( "", off )
.
.
.
#pragma optimize( "", on )

What's a good, threadsafe, way to pass error strings back from a C shared library

I'm writing a C shared library for internal use (I'll be dlopen()'ing it to a c++ application, if that matters). The shared library loads (amongst other things) some java code through a JNI module, which means all manners of nightmare error modes can come out of the JVM that I need to handle intelligently in the application. Additionally, this library needs to be re-entrant. Is there in idiom for passing error strings back in this case, or am I stuck mapping errors to integers and using printfs to debug things?
Thanks!
My approach to the problem would be a little different from everyone else's. They're not wrong, it's just that I've had to wrestle with a different aspect of this problem.
A C API needs to provide numeric error codes, so that the code using the API can take sensible measures to recover from errors when appropriate, and pass them along when not. The errno.h codes demonstrate a good categorization of errors; in fact, if you can reuse those codes (or just pass them along, e.g. if all your errors come ultimately from system calls), do so.
Do not copy errno itself. If possible, return error codes directly from functions that can fail. If that is not possible, have a GetLastError() method on your state object. You have a state object, yes?
If you have to invent your own codes (the errno.h codes don't cut it), provide a function analogous to strerror, that converts these codes to human-readable strings.
It may or may not be appropriate to translate these strings. If they're meant to be read only by developers, don't bother. But if you need to show them to the end user, then yeah, you need to translate them.
The untranslated version of these strings should indeed be just string constants, so you have no allocation headaches. However, do not waste time and effort coding your own translation infrastructure. Use GNU gettext.
If your code is layered on top of another piece of code, it is vital that you provide direct access to all the error information and relevant context information that that code produces, and you make it easy for developers against your code to wrap up all that information in an error message for the end user.
For instance, if your library produces error codes of its own devising as a direct consequence of failing system calls, your state object needs methods that return the errno value observed immediately after the system call that failed, the name of the file involved (if any), and ideally also the name of the system call itself. People get this wrong waaay too often -- for instance, SQLite, otherwise a well designed API, does not expose the errno value or the name of the file, which makes it infuriatingly hard to distinguish "the file permissions on the database are wrong" from "you have a bug in your code".
EDIT: Addendum: common mistakes in this area include:
Contorting your API (e.g. with use of out-parameters) so that functions that would naturally return some other value can return an error code.
Not exposing enough detail for callers to be able to produce an error message that allows a knowledgeable human to fix the problem. (This knowledgeable human may not be the end user. It may be that your error messages wind up in server log files or crash reports for developers' eyes only.)
Exposing too many different fine distinctions among errors. If your callers will never plausibly do different things in response to two different error codes, they should be the same code.
Providing more than one success code. This is asking for subtle bugs.
Also, think very carefully about which APIs ought to be allowed to fail. Here are some things that should never fail:
Read-only data accessors, especially those that return scalar quantities, most especially those that return Booleans.
Destructors, in the most general sense. (This is a classic mistake in the UNIX kernel API: close and munmap should not be able to fail. Thankfully, at least _exit can't.)
There is a strong case that you should immediately call abort if malloc fails rather than trying to propagate it to your caller. (This is not true in C++ thanks to exceptions and RAII -- if you are so lucky as to be working on a C++ project that uses both of those properly.)
In closing: for an example of how to do just about everything wrong, look no further than XPCOM.
You return pointers to static const char [] objects. This is always the correct way to handle error strings. If you need them localized, you return pointers to read-only memory-mapped localization strings.
In C, if you don't have internationalization (I18N) or localization (L10N) to worry about, then pointers to constant data is a good way to supply error message strings. However, you often find that the error messages need some supporting information (such as the name of the file that could not be opened), which cannot really be handled by constant data.
With I18N/L10N to worry about, I'd recommend storing the fixed message strings for each language in an appropriately formatted file, and then using mmap() to 'read' the file into memory before you fork any threads. The area so mapped should then be treated as read-only (use PROT_READ in the call to mmap()).
This avoids complicated issues of memory management and avoids memory leaks.
Consider whether to provide a function that can be called to get the latest error. It can have a prototype such as:
int get_error(int errnum, char *buffer, size_t buflen);
I'm assuming that the error number is returned by some other function call; the library function then consults any threadsafe memory it has about the current thread and the last error condition returned to that thread, and formats an appropriate error message (possibly truncated) into the given buffer.
With C++, you can return (a reference to) a standard String from the error reporting mechanism; this means you can format the string to include the file name or other dynamic attributes. The code that collects the information will be responsible for releasing the string, which isn't (shouldn't be) a problem because of the destructors that C++ has. You might still want to use mmap() to load the format strings for the messags.
You do need to be careful about the files you load and, in particular, any strings used as format strings. (Also, if you are dealing with I18N/L10N, you need to worry about whether to use the 'n$ notation to allow for argument reordering; and you have to worry about different rules for different cultures/languages about the order in which the words of a sentence are presented.)
I guess you could use PWideChars, as Windows does. Its thread safe. What you need is that the calling app creates a PwideChar that the Dll will use to set an error. Then, the callling app needs to read that PWideChar and free its memory.
R. has a good answer (use static const char []), but if you are going to have various spoken languages, I like to use an Enum to define the error codes. That is better than some #define of a bunch of names to an int value.
return integers, don't set some global variable (like errno— even if it is potentially TLSed by an implementation); aking to Linux kernel's style of return -ENOENT;.
have a function similar to strerror that takes such an integer and returns a pointer to a const string. This function can transparently do I18N if needed, too, as gettext-returnable strings also remain constant over the lifetime of the translation database.
If you need to provide non-static error messages, then I recommend returning strings like this: error_code_t function(, char** err_msg). Then provide a function to free the error message: void free_error_message(char* err_msg). This way you hide how the error strings are allocated and freed. This is of course only worth implementing of your error strings are dynamic in nature, meaning that they convey more than just a translation of error codes.
Please havy oversight with mu formatting. I'm writing this on a cell phone...