Using shared C++/STL code with Objective-C++ - c++

I have a lot of shared C++ code that I'd like to use in my iPhone app. I added the .cpp and .h files to my Xcode project and used the classes in my Objective-C++ code. The project compiles fine with 0 errors or warnings.
However, when I run it in the simulator I get the following error when I attempt to access an STL method in my objective-c code (such as .c_str()):
Program received signal: “EXC_BAD_ACCESS”.
Unable to disassemble std::string::c_str.
Here's an example of the code that causes the error:
[name setText:[[NSString alloc] initWithCString:myCPlusPlusObject->cppname.c_str()]];
where name is an NSTextField object and cppname is a std::string member of myCPlusPlusObject.
Am I going about this the right way? Is there a better way to use STL-laden C++ classes in Objective-C++? I would like to keep the common C++ files untouched if possible, to avoid having to maintain the code in two places.
Thanks in advance!

Make sure the string isn't empty before passing it to the initWithCString function.
Also the function you're using has been deprecated, use this one instead.

Note also that this line of code:
[name setText:[[NSString alloc] initWithCString:myCPlusPlusObject->cppname.c_str()]];
leaks the created string.
Go back and read the memory management rules at http://developer.apple.com/documentation/Cocoa/Conceptual/MemoryMgmt/Articles/mmRules.html.
The main issue is there you have allocated the string, therefore you have taken ownership of it, and you then never release it. You should do one of the following:
[name setText:[[[NSString alloc] initWithCString:myCPlusPlusObject->cppname.c_str() encoding:NSUTF8StringEncoding] autorelease]];
or
NSString* myCPlusPlusString = [[NSString alloc] initWithCString:myCPlusPlusObject->cppname.c_str() encoding:NSUTF8StringEncoding];
[name setText:myCPlusPlusString];
[myCPlusPlusString release];
or
[name setText:[NSString stringWithCString:myCPlusPlusObject->cppname.c_str() encoding:NSUTF8StringEncoding]];
The latter is the best as far as code simplicity is concerned. The middle one is the best as far as memory usage is concerned, which is frequently an issue on the iPhone.
The first one is likely identical to the last one - I say "likely" because there is no guarentee that stringWithCString returns an autoreleased object. It probably does, but whether it does or not is not your concern, all that matters to you is that you do not take ownership of the string because the method name does not begin with “alloc” or “new” or contains “copy” and so you are not responsible for releaing it.

I would try this:
if (myCPlusPlusObject)
{
[name setText:[[NSString alloc] initWithUTF8String:myCPlusPlusObject->cppname.c_str()]];
}
else
{
[name setText:#"Plop: Bad myCPlusPlusObject"];
}
The NULL pointer is likely your problem as the std::String will always be initialized correctly if it exists, and the c_str() method will return a '\0' terminated string even if the string is empty.

Related

C++ What could cause a line of code in one place to change behavior in an unrelated function?

Consider this mock-up of my situation.
in an external header:
class ThirdPartyObject
{
...
}
my code: (spread among a few headers and source files)
class ThirdPartyObjectWrapper
{
private:
ThirdPartyObject myObject;
}
class Owner
{
public:
Owner() {}
void initialize();
private:
ThirdPartyObjectWrapper myWrappedObject;
};
void Owner::initialize()
{
//not weird:
//ThirdPartyObjectWrapper testWrappedObject;
//weird:
//ThirdPartyObject testObject;
}
ThirdPartyObject is, naturally, an object defined by a third party (static precompiled) library I'm using. ThirdPartyObjectWrapper is a convenience class that eliminates a lot of boiler-plating for working with ThirdPartyObject. Owner::initialize() is called shortly after an instance of Owner is created.
Notice the two lines I have labeled as "weird" and "not weird" in Owner::initialize(). All I'm doing here is creating a couple of objects on the stack with their default constructors. I don't do anything with those objects and they get destroyed when they leave scope. There are no build or linker errors involved, I can uncomment either or both lines and the code will build.
However, if I uncomment "weird" then I get a segmentation fault, and (here's why I say it's weird) it's in a completely unrelated location. Not in the constructor of testObject, like you might expect, but in the constructor of Owner::myObjectWrapper::myObject. The weird line never even gets called, but somehow its presence or absence consistently changes the behavior of an unrelated function in a static library.
And consider that if I only uncomment "not weird" then it runs fine, executing the ThirdPartyObject constructor twice with no problems.
I've been working with C++ for a year so it's not really a surprise to me that something like this would be able happen, but I've about reached the limit of my ability to figure out how this gotcha is happening. I need the input of people with significantly more C++ experience than me.
What are some possibilities that could cause this to happen? What might be going on here?
Also, note, I'm not asking for advice on how to get rid of the segfault. Segfaults I understand, I suspect it's a simple race condition. What I don't understand is the behavior gotcha so that's the only thing I'm trying to get answers for.
My best lead is that it has to do with headers and macros. The third party library actually already has a couple of gotchas having to do with its headers and macros, for example the code won't build if you put your #include's in the wrong order. I'm not changing any #include's so strictly this still wouldn't make sense, but perhaps the compiler is optimizing includes based on the presence of a symbol here? (it would be the only mention of ThirdPartyObject in the file)
It also occurs to me that because I am using Qt, it could be that the Meta-Object Compiler (which generates supplementary code between compilations) might be involved in this. Very unlikely, as Qt has no knowledge of the third party library where the segfault is happening and this is not actually relevant to the functionality of the MOC (since at no point ThirdPartyObject is being passed as an argument), but it's worth investigating at least.
Related questions have suggested that it could be a relatively small buffer overflow or race condition that gets tripped up by compiler optimizations. Continuing to investigate but all leads are welcome.
Typical culprits:
Some build products are stale and not binary-compatible.
You have a memory bug that has corrupted the state of your process, and are seeing a manifestation of that in a completely unrelated location.
Fixing #1 is trivial: delete the build folder and build again. If you're not building in a shadow build folder, you've set yourself up for failure, hopefully you now know enough to stop :)
Fixing #2 is not trivial. View manual memory management and possible buffer overflows with suspicion. Use modern C++ programming techniques to leverage the compiler to help you out: store things by value, use containers, use smart pointers, and use iterators and range-for instead of pointers. Don't use C-style arrays. Abhor C-style APIs of the (Type * array, int count) kind - they don't belong in C++.
What fun. I've boiled this down to the bottom.
//#include <otherthirdpartyheader.h>
#include <thirdpartyobject.h>
int main(...)
{
ThirdPartyObject test;
return 0;
}
This code runs. If I uncomment the first include, delete all build artifacts, and build again, then it breaks. There's obviously a header/macro component, and probably some kind of compiler-optimization component. But, get this, according to the library documentation it should give me a segfault every time because I haven't been doing a required initialization step. So the fact that it runs at all indicates unexpected behavior.
I'm chalking this up to library-specific issues rather than broad spectrum C++ issues. I'll be contacting the vendor going forward from here, but thanks everyone for the help.

What's a good, threadsafe, way to pass error strings back from a C shared library

I'm writing a C shared library for internal use (I'll be dlopen()'ing it to a c++ application, if that matters). The shared library loads (amongst other things) some java code through a JNI module, which means all manners of nightmare error modes can come out of the JVM that I need to handle intelligently in the application. Additionally, this library needs to be re-entrant. Is there in idiom for passing error strings back in this case, or am I stuck mapping errors to integers and using printfs to debug things?
Thanks!
My approach to the problem would be a little different from everyone else's. They're not wrong, it's just that I've had to wrestle with a different aspect of this problem.
A C API needs to provide numeric error codes, so that the code using the API can take sensible measures to recover from errors when appropriate, and pass them along when not. The errno.h codes demonstrate a good categorization of errors; in fact, if you can reuse those codes (or just pass them along, e.g. if all your errors come ultimately from system calls), do so.
Do not copy errno itself. If possible, return error codes directly from functions that can fail. If that is not possible, have a GetLastError() method on your state object. You have a state object, yes?
If you have to invent your own codes (the errno.h codes don't cut it), provide a function analogous to strerror, that converts these codes to human-readable strings.
It may or may not be appropriate to translate these strings. If they're meant to be read only by developers, don't bother. But if you need to show them to the end user, then yeah, you need to translate them.
The untranslated version of these strings should indeed be just string constants, so you have no allocation headaches. However, do not waste time and effort coding your own translation infrastructure. Use GNU gettext.
If your code is layered on top of another piece of code, it is vital that you provide direct access to all the error information and relevant context information that that code produces, and you make it easy for developers against your code to wrap up all that information in an error message for the end user.
For instance, if your library produces error codes of its own devising as a direct consequence of failing system calls, your state object needs methods that return the errno value observed immediately after the system call that failed, the name of the file involved (if any), and ideally also the name of the system call itself. People get this wrong waaay too often -- for instance, SQLite, otherwise a well designed API, does not expose the errno value or the name of the file, which makes it infuriatingly hard to distinguish "the file permissions on the database are wrong" from "you have a bug in your code".
EDIT: Addendum: common mistakes in this area include:
Contorting your API (e.g. with use of out-parameters) so that functions that would naturally return some other value can return an error code.
Not exposing enough detail for callers to be able to produce an error message that allows a knowledgeable human to fix the problem. (This knowledgeable human may not be the end user. It may be that your error messages wind up in server log files or crash reports for developers' eyes only.)
Exposing too many different fine distinctions among errors. If your callers will never plausibly do different things in response to two different error codes, they should be the same code.
Providing more than one success code. This is asking for subtle bugs.
Also, think very carefully about which APIs ought to be allowed to fail. Here are some things that should never fail:
Read-only data accessors, especially those that return scalar quantities, most especially those that return Booleans.
Destructors, in the most general sense. (This is a classic mistake in the UNIX kernel API: close and munmap should not be able to fail. Thankfully, at least _exit can't.)
There is a strong case that you should immediately call abort if malloc fails rather than trying to propagate it to your caller. (This is not true in C++ thanks to exceptions and RAII -- if you are so lucky as to be working on a C++ project that uses both of those properly.)
In closing: for an example of how to do just about everything wrong, look no further than XPCOM.
You return pointers to static const char [] objects. This is always the correct way to handle error strings. If you need them localized, you return pointers to read-only memory-mapped localization strings.
In C, if you don't have internationalization (I18N) or localization (L10N) to worry about, then pointers to constant data is a good way to supply error message strings. However, you often find that the error messages need some supporting information (such as the name of the file that could not be opened), which cannot really be handled by constant data.
With I18N/L10N to worry about, I'd recommend storing the fixed message strings for each language in an appropriately formatted file, and then using mmap() to 'read' the file into memory before you fork any threads. The area so mapped should then be treated as read-only (use PROT_READ in the call to mmap()).
This avoids complicated issues of memory management and avoids memory leaks.
Consider whether to provide a function that can be called to get the latest error. It can have a prototype such as:
int get_error(int errnum, char *buffer, size_t buflen);
I'm assuming that the error number is returned by some other function call; the library function then consults any threadsafe memory it has about the current thread and the last error condition returned to that thread, and formats an appropriate error message (possibly truncated) into the given buffer.
With C++, you can return (a reference to) a standard String from the error reporting mechanism; this means you can format the string to include the file name or other dynamic attributes. The code that collects the information will be responsible for releasing the string, which isn't (shouldn't be) a problem because of the destructors that C++ has. You might still want to use mmap() to load the format strings for the messags.
You do need to be careful about the files you load and, in particular, any strings used as format strings. (Also, if you are dealing with I18N/L10N, you need to worry about whether to use the 'n$ notation to allow for argument reordering; and you have to worry about different rules for different cultures/languages about the order in which the words of a sentence are presented.)
I guess you could use PWideChars, as Windows does. Its thread safe. What you need is that the calling app creates a PwideChar that the Dll will use to set an error. Then, the callling app needs to read that PWideChar and free its memory.
R. has a good answer (use static const char []), but if you are going to have various spoken languages, I like to use an Enum to define the error codes. That is better than some #define of a bunch of names to an int value.
return integers, don't set some global variable (like errno— even if it is potentially TLSed by an implementation); aking to Linux kernel's style of return -ENOENT;.
have a function similar to strerror that takes such an integer and returns a pointer to a const string. This function can transparently do I18N if needed, too, as gettext-returnable strings also remain constant over the lifetime of the translation database.
If you need to provide non-static error messages, then I recommend returning strings like this: error_code_t function(, char** err_msg). Then provide a function to free the error message: void free_error_message(char* err_msg). This way you hide how the error strings are allocated and freed. This is of course only worth implementing of your error strings are dynamic in nature, meaning that they convey more than just a translation of error codes.
Please havy oversight with mu formatting. I'm writing this on a cell phone...

Access violation calling C++ dll

I created c++ dll (using mingw) from code I wrote on linux (gcc), but somehow have difficulties using it in VC++. The dll basically exposes just one class, I created pure virtual interface for it and also factory function which creates the object (the only export) which looks like this:
extern "C" __declspec(dllexport) DeviceDriverApi* GetX5Driver();
I added extern "C" to prevent name mangling, dllexport is replaced by dllimport in actual code where I want to use the dll, DeviceDriverApi is the pure virtual interface.
Now I wrote simple code in VC++ which just call the factory function and then just tries to delete the pointer. It compiles without any problems but when I try to run it I get access violation error. If I try to call any method of the object I get access violation again.
When I compile the same code in MinGW (gcc) and use the same library, it runs without any problems. So there must be something (hehe, I guess many differences actually :)) between how VC++ code uses the library and gcc code.
Any ideas what?
Cheers,
Tom
Edit:
The code is:
DeviceDriverApi* x5Driver = GetX5Driver();
if (x5Driver->isConnected())
Console::WriteLine(L"Hello World");
delete x5Driver;
It's crashing when I try to call the method and when I try to delete the pointer as well. The object is created correctly though (the first line). There are some debug outputs when the object is created and I can see them before I get the access violation error.
You're using one compiler (mingw) for the DLL, and another (VC++) for the calling code.
You're calling a 'C' function, but returning a pointer to a C++ Object.
That will never work, because VTable layouts are almost guranteed to be incompatible. And, the DLL and app are probably using different memory managers, so you're doing new() with one and delete() with the other. Again, it just won't work.
For this to work the two compilers need to both support a standard ABI (Application Binary Interface). I don't think such a thing exists for Windows.
The best option is to expose all you DLL object methods and properties via C functions (including one to delete the object). You can the re-wrap into a C++ object on the calling end.
The two different compilers may be using different calling conventions. Try putting _cdecl before the function name in both the client code and the DLL code and recompiling both.
More info on calling conventions here: http://en.wikipedia.org/wiki/X86_calling_conventions
EDIT: The question was updated with more detail and it looks likely the problem is what Adrien Plisson describes at the end of his answer. You're creating an object in one module and freeing it in another, which is wrong.
(1) I suspect a calling covnention problem as well, though the simple suggestion by Leo doesn't seem to have helped.
Is isConnected virtual? It is possible that MinGW and VC++ use different implementations for a VTable, in which case, well, tough luck.
Try to see how far you get with the debugger: does it crash at the call, or the return? Do you arrive at invalid code? (If you know to read assembly, that usually helps a lot with these problems.)
Alternatively, add trace statements to the various methods, to see how far you get.
(2) For a public DLL interface, never free memory in the caller that was allocated by a callee (or vice versa). The DLL likely runs with a completely different heap, so the pointer is not known.
If you want to rely on that behavior, you need to make sure:
Caller and Callee (i.e. DLL and main program, in your case) are compiled with the same version of the sam compiler
for all supported compilers, you have configured the compile options to ensure caller and callee use the same shared runtime library state.
So the best way is to change your API to:
extern "C" __declspec(dllexport) DeviceDriverApi* GetX5Driver();
extern "C" __declspec(dllexport) void FreeDeviceDriver(DeviceDriverApi* driver);
and, at caller site, wrap in some way (e.g. in a boost::intrusive_ptr).
try looking at the imported libraries from both your DLL and your client executable. (you can use the Dependency Viewer or dumpbin or any other tool you like). verify that both the DLL and the client code are using the same C++ runtime.
if it is not the case, you can indeed run into some issues since the way the memory is managed may be different between the 2, leading to a crash when freeing from one runtime a pointer allocated from another runtime.
if this is really your problem, try not destroying the pointer in your client executable, but rather declare and export a function in your DLL which will take care of destroying the pointer.

C++, workaround for macro using 'this' in static member functions

I've overridden new so that I can track memory allocations. Additional parameters such as __FILE__, __LINE__, module name etc are added in the #define.
However I want to add the address of the calling object to the parameters so that I can backtrack up allocations when hunting down problems. The easiest way is to add 'this' to those additional parameters (which means the address of the caller is passed into my custom alloc stuff).
Unfortunately there's plenty of singletons in our code, which means a bunch of static member functions calling new. The compiler throws up the error C2671: '...' : static member functions do not have 'this' pointers
Is there a workaround where I can get the address of the object without using this, which would also realize it's in a static method and pass null say?
Or maybe is there a way that my #define new would recognize it's in a static method and switch to a different definition?
It's important that I don't affect the existing project code though - I don't want to force developers to use a custom method like staticnew just because it's in a static method - they should carry on using new like normal and this memory tracking stuff is all going on in the background...
You definitely cannot determine if a #define macro is inside a static method or not. You even shouldn't be using #define new as it violates the standard (even though all compilers support it). Your macro will also cause trouble to those who want to overload operator new for their class.
Generally, I would suggest not using this kind of memory debugging. There are many mature memory debuggers that do a better work when debugging memory errors. The most famous one is Valgrind.
To give a simple answer to your question - there is no clean solution in the way you are approaching the problem.
Well, once you're going down the "hack" path, you could throw portability out the window and get close to the compiler.
You could put some inline assembler in your macro that called a function with a pointer to the string generated by __FUNCDNAME__, and if it looks like a member function get the this pointer in the assembler, and if not just use null.

Remove never-run call to templated function, get allocation error on run-time

I have a piece of templated code that is never run, but is compiled. When I remove it, another part of my program breaks.
First off, I'm a bit at a loss as to how to ask this question. So I'm going to try throwing lots of information at the problem.
Ok, so, I went to completely redesign my test project for my experimental core library thingy. I use a lot of template shenanigans in the library. When I removed the "user" code, the tests gave me a memory allocation error. After quite a bit of experimenting, I narrowed it down to this bit of code (out of a couple hundred lines):
void VOODOO(components::switchBoard &board) {
board.addComponent<using_allegro::keyInputs<'w'> >();
}
Fundementally, what's weirding me out is that it appears that the act of compiling this function (and the template function it then uses, and the template functions those then use...), makes this bug not appear. This code is not being run. Similar code (the same, but for different key vals) occurs elsewhere, but is within Boost TDD code.
I realize I certainly haven't given enough information for you to solve it for me; I tried, but it more-or-less spirals into most of the code base. I think I'm most looking for "here's what the problem could be", "here's where to look", etc. There's something that's happening during compile because of this line, but I don't know enough about that step to begin looking.
Sooo, how can a (presumably) compilied, but never actually run, bit of templated code, when removed, cause another part of code to fail?
Error:
Unhandled exceptionat 0x6fe731ea (msvcr90d.dll) in Switchboard.exe:
0xC0000005: Access violation reading location 0xcdcdcdc1.
Callstack:
operator delete(void * pUser Data)
allocator< class name related to key inputs callbacks >::deallocate
vector< same class >::_Insert_n(...)
vector< " " >::insert(...)
vector<" ">::push_back(...)
It looks like maybe the vector isn't valid, because _MyFirst and similar data members are showing values of 0xcdcdcdcd in the debugger. But the vector is a member variable...
Update: The vector isn't valid because it's never made. I'm getting a channel ID value stomp, which is making me treat one type of channel as another.
Update:
Searching through with the debugger again, it appears that my method for giving each "channel" it's own, unique ID isn't giving me a unique ID:
inline static const char channel<template args>::idFunction() {
return reinterpret_cast<char>(&channel<CHANNEL_IDENTIFY>::idFunction);
};
Update2: These two are giving the same:
slaveChannel<switchboard, ALLEGRO_BITMAP*, entityInfo<ALLEGRO_BITMAP*>
slaveChannel<key<c>, char, push<char>
Sooo, having another compiled channel type changing things makes sense, because it shifts around the values of the idFunctions? But why are there two idFunctions with the same value?
you seem to be returning address of the function as a character? that looks weird. char has much smaller bit count than pointer, so it's highly possible you get same values. that could reason why changing code layout fixes/breaks your program
As a general answer (though aaa's comment alludes to this): When something like this affects whether a bug occurs, it's either because (a) you're wrong and it is being run, or (b) the way that the inclusion of that code happens to affect your code, data, and memory layout in the compiled program causes a heisenbug to change from visible to hidden.
The latter generally occurs when something involves undefined behavior. Sometimes a bogus pointer value will cause you to stomp on a bit of your code (which might or might not be important depending on the code layout), or sometimes a bogus write will stomp on a value in your data stack that might or might not be a pointer that's used later, or so forth.
As a simple example, supposing you have a stack that looks like:
float data[10];
int never_used;
int *important pointer;
And then you erroneously write
data[10] = 0;
Then, assuming that stack got allocated in linear order, you'll stomp on never_used, and the bug will be harmless. However, if you remove never_used (or change something so the compiler knows it can remove it for you -- maybe you remove a never-called function call that would use it), then it will stomp on important_pointer instead, and you'll now get a segfault when you dereference it.