Static library API question (std::string vs. char*) - c++

I have not worked with static libraries before, but now I need to.
Scenario:
I am writing a console app in Unix. I freely use std::string everywhere because it's easy to do so. However, I recently found out that I have to support it in Windows and a third party application would need API's to my code (I will not be sharing source, just the DLL).
With this in mind, can I still use std::string everywhere in my code but then provide them with char * when I code the API's? Would that work?

Yep. Use std::string internally and then just use const char * on the interface functions (which will be converted to std::strings on input.

Why not just provide them with std::string?
It's standard C++, and I'd be very suprised if they didn't support it.

The question is, what your clients will do with that pointer. It should of course be const char*, but if clients will keep and reference it later on, its probably risky to use std::string internally, because as soon as you operate yourself on the strings there is no way to keep std::string from moving memory, as its reference counting mechanism can not work with exported char* pointers. As long as you dont touch the std::string objects, their memory wont move, and the pointer is safe.

There is no standardized C++ binary interface (at least I haven;t heard about it), thus projects with different settings may appear to be unlinkable together. For example, Visual C++ provides a way to enable/disable iterator debug support. This is controlled by macro and size of some data structures depends on it.
If two codes compiled with different settings start to communicate using these data structures, the best thing you can have is linker error. Other alternatives are worse - stable run-time error, release-configuration-only error, etc...
So if you don't want to restrict your users to single correct project settings set and compiler version, use only primitive data for interface. For internal implementation choose what is more convenient.

Adding to Poita_'s response:
consider unicode support
If you ever have to support localization, too, you'll be happy to have done it in the first place
when returning char/wchar_t const *, define the lifetime of the data. The best would be to have a project-wide "unless stated otherwise..." standard
Alternatively, you can return a copy that must be freed through a method exported by your library. (C++ clients can move that into a smart pointer to regain automatic memory management.)

std::string will work in, at the very least, Visual Studio C++ (and others), so why not just use that?

Related

Is it possible to create a user-defined datatype in a language like C/C++(or maybe any) from a string as user input or from file

Well this might be a very weird question but my curiosity has striken pretty hard on this. So here it goes...
NOTE: Lets take the language C into consideration here.
As programmers we usually define a user-defined datatype(say struct) in the source code with the appropriate name.
Suppose I have a program in which I have a structure defined as:
struct Animal {
char *name;
int lifeSpan;
};
And also I have started the execution of this program.
Now, my question here is;
What if I want to define a new structure called "Plant" just like "Animal" mentioned above in my program, without writing its definition in the source code itself(which is obviously impossible currently) but rather from a user input string(or a file input) during runtime.
Lets say my program takes input string from a text file named file1.txt whose content is:
struct Plant {
char *name;
int lifeSpan;
};
What I want now is to have a new structure named "Plant" in my program which is already in execution. The program should read the file content and create a structure as written in the file and attach it to itself on-the-go.
I have checked out a solution for C++ in the discussion Declaring a data type dynamically in C++ but it doesnt seem to have a very convincing solution.
The solution I am looking for is at the compiler-linker-loader level rather than from the language itself.I would be very pleased and thankful if anyone is looking forward to sharing their ideas on this.
What you're asking about is basically "can we implement C as a scripting language?", since this is the only way code can be executed after compilation.
I'm aware that people have been writing (mostly in the comments) that it's possible in other languages but isn't possible in C, since C is a compiled language (hence data types should be defined during compile time).
However, to the best of my knowledge it's actually possible (and might not be as hard as one would imagine).
There are many possible approaches (machine code emulation (VM), JIT compilation, etc').
One approach will use a C compiler to compile the C script as an external dynamic library (.dll on windows, .so on linux, etc') and than "load" the compiled library and execute the code (this is pretty much the JIT compilation approach, for lazy people).
EDIT:
As mentioned in the comments, by using this approach, the new type is loaded as part of an external library.
The original code won't know about this new type, only the new code (or library) will be "aware" of this new type and able to properly use it.
On the other hand, I'm not sure why you're insisting on the need to use static types and a compiler-linker-loader level solution.
The language itself (the C language) can manage this task dynamically (during execution time).
Consider Ruby MRI, for example. The Ruby language supports dynamic types that can be defined during runtime...
...However, this is implemented in C and it's possible to use the code from within C to define new modules and classes. These aren't static types that can be tested during compilation (type creation and identification is performed during runtime).
This is a perfect example showing that C (as a language) can dynamically define "types".
However, this is also a poor example because Ruby's approach is slow. A custom approved can be far faster since it would avoid the huge overhead related to functionality you might not need (such as inheritance).

How can i pass C++ Objects to DLLs with different _ITERATOR_DEBUG_LEVEL

My executable makes calls to a number of DLLs, that i wrote myself. According to 3rd party C++ libs used by these DLLs i can not freely choose compiler settings for all DLLs. Therefore in some DLLs _ITERATOR_DEBUG_LEVEL is set to 2 (default in the debug version), but in my executable _ITERATOR_DEBUG_LEVEL is set to 0, according to serious performance problems.
When i now pass a std::string to the DLL, the application crashes, as soon as the DLL tries to copy it to a local std::string obj, as the memory layout of the string object in the DLL is different from that in my executable. So far i work around this by passing C-strings. I even wrote a small class that converts a std::map<std::string, int> to and from a temporary representation in C-Data in order to pass sich data from and to the DLL. This works.
How can i overcome this problem? I want to pass more different classes and containers, and for several reasons i do not want to work with _ITERATOR_DEBUG_LEVEL = 2.
The problem is that std::string and other containers are template classes. They are generated at compilation time for each binary, so in your case they are generated differently. You can say it's not the same objects.
To fix this, you have several solutions, but they are all following the same rule of thumb : don't expose any template code in your header code shared between binaries.
You could create specific interfaces only for this purpose, or just make sure your headers don't expose template types and functions. You can use those template types inside your binaries, but just don't expose them to other binaries.
std::string in interfaces can be replaced by const char * . You can still use std::string in your systems, just ask for const char * in interfaces and use std::string::c_str() to expose your data.
For maps and other containers you'll have to provide functions that allow external code to manipulate the internal map. Like "Find( const char* key ); ".
The main problem will be with template members of your exposed classes. A way to fix this is to use the PImpl idiom : create (or generate) an API, headers, that just expose what can be done with your binaries, and then make sure that API have pointers to the real objects inside your binaries. The API will be used outside but inside your library you can code with whatever you want. DirectX API and other OS API are done that way.
It is not recommended to use an c++ interface with complex types (stl...) with 3rd party libs, if you get them only as binary or need special settings for compile which are different from your settings.
As you wrote - the implementation could be different with the same compiler depending on your settings and with different compilers the situation gets even worse.
If possible compile the 3rd party lib with your compiler and your settings.
If that's not possible you may write an wrapper-DLL which is compiled with the same compiler and same settings than the 3rd party lib and provides an C-Data interface for you. In your project you may write another wrapper-class so you can make your function-calls with STL-Objects and get them converted and transfered "in background".
My own experience with flags like _SECURE_SCL and _ITERATOR_DEBUG_LEVEL, is that they must be consistent if you attempt to pass a stl object accross dll boudaries.
However I think you can pass a stl object to a dll which has a smaller _ITERATOR_DEBUG_LEVEL
since you can probably give a stl object instantiated in a debug dll to a dll compiled in release mode.
EDIT 07/04/2011
Apparently Visual Studio 2010 provides some niceties to detect mismatches between ITERATOR_DEBUG_LEVEL. I have not watch the video yet.
http://blogs.msdn.com/b/vcblog/archive/2011/04/05/10150198.aspx

What's a good, threadsafe, way to pass error strings back from a C shared library

I'm writing a C shared library for internal use (I'll be dlopen()'ing it to a c++ application, if that matters). The shared library loads (amongst other things) some java code through a JNI module, which means all manners of nightmare error modes can come out of the JVM that I need to handle intelligently in the application. Additionally, this library needs to be re-entrant. Is there in idiom for passing error strings back in this case, or am I stuck mapping errors to integers and using printfs to debug things?
Thanks!
My approach to the problem would be a little different from everyone else's. They're not wrong, it's just that I've had to wrestle with a different aspect of this problem.
A C API needs to provide numeric error codes, so that the code using the API can take sensible measures to recover from errors when appropriate, and pass them along when not. The errno.h codes demonstrate a good categorization of errors; in fact, if you can reuse those codes (or just pass them along, e.g. if all your errors come ultimately from system calls), do so.
Do not copy errno itself. If possible, return error codes directly from functions that can fail. If that is not possible, have a GetLastError() method on your state object. You have a state object, yes?
If you have to invent your own codes (the errno.h codes don't cut it), provide a function analogous to strerror, that converts these codes to human-readable strings.
It may or may not be appropriate to translate these strings. If they're meant to be read only by developers, don't bother. But if you need to show them to the end user, then yeah, you need to translate them.
The untranslated version of these strings should indeed be just string constants, so you have no allocation headaches. However, do not waste time and effort coding your own translation infrastructure. Use GNU gettext.
If your code is layered on top of another piece of code, it is vital that you provide direct access to all the error information and relevant context information that that code produces, and you make it easy for developers against your code to wrap up all that information in an error message for the end user.
For instance, if your library produces error codes of its own devising as a direct consequence of failing system calls, your state object needs methods that return the errno value observed immediately after the system call that failed, the name of the file involved (if any), and ideally also the name of the system call itself. People get this wrong waaay too often -- for instance, SQLite, otherwise a well designed API, does not expose the errno value or the name of the file, which makes it infuriatingly hard to distinguish "the file permissions on the database are wrong" from "you have a bug in your code".
EDIT: Addendum: common mistakes in this area include:
Contorting your API (e.g. with use of out-parameters) so that functions that would naturally return some other value can return an error code.
Not exposing enough detail for callers to be able to produce an error message that allows a knowledgeable human to fix the problem. (This knowledgeable human may not be the end user. It may be that your error messages wind up in server log files or crash reports for developers' eyes only.)
Exposing too many different fine distinctions among errors. If your callers will never plausibly do different things in response to two different error codes, they should be the same code.
Providing more than one success code. This is asking for subtle bugs.
Also, think very carefully about which APIs ought to be allowed to fail. Here are some things that should never fail:
Read-only data accessors, especially those that return scalar quantities, most especially those that return Booleans.
Destructors, in the most general sense. (This is a classic mistake in the UNIX kernel API: close and munmap should not be able to fail. Thankfully, at least _exit can't.)
There is a strong case that you should immediately call abort if malloc fails rather than trying to propagate it to your caller. (This is not true in C++ thanks to exceptions and RAII -- if you are so lucky as to be working on a C++ project that uses both of those properly.)
In closing: for an example of how to do just about everything wrong, look no further than XPCOM.
You return pointers to static const char [] objects. This is always the correct way to handle error strings. If you need them localized, you return pointers to read-only memory-mapped localization strings.
In C, if you don't have internationalization (I18N) or localization (L10N) to worry about, then pointers to constant data is a good way to supply error message strings. However, you often find that the error messages need some supporting information (such as the name of the file that could not be opened), which cannot really be handled by constant data.
With I18N/L10N to worry about, I'd recommend storing the fixed message strings for each language in an appropriately formatted file, and then using mmap() to 'read' the file into memory before you fork any threads. The area so mapped should then be treated as read-only (use PROT_READ in the call to mmap()).
This avoids complicated issues of memory management and avoids memory leaks.
Consider whether to provide a function that can be called to get the latest error. It can have a prototype such as:
int get_error(int errnum, char *buffer, size_t buflen);
I'm assuming that the error number is returned by some other function call; the library function then consults any threadsafe memory it has about the current thread and the last error condition returned to that thread, and formats an appropriate error message (possibly truncated) into the given buffer.
With C++, you can return (a reference to) a standard String from the error reporting mechanism; this means you can format the string to include the file name or other dynamic attributes. The code that collects the information will be responsible for releasing the string, which isn't (shouldn't be) a problem because of the destructors that C++ has. You might still want to use mmap() to load the format strings for the messags.
You do need to be careful about the files you load and, in particular, any strings used as format strings. (Also, if you are dealing with I18N/L10N, you need to worry about whether to use the 'n$ notation to allow for argument reordering; and you have to worry about different rules for different cultures/languages about the order in which the words of a sentence are presented.)
I guess you could use PWideChars, as Windows does. Its thread safe. What you need is that the calling app creates a PwideChar that the Dll will use to set an error. Then, the callling app needs to read that PWideChar and free its memory.
R. has a good answer (use static const char []), but if you are going to have various spoken languages, I like to use an Enum to define the error codes. That is better than some #define of a bunch of names to an int value.
return integers, don't set some global variable (like errno— even if it is potentially TLSed by an implementation); aking to Linux kernel's style of return -ENOENT;.
have a function similar to strerror that takes such an integer and returns a pointer to a const string. This function can transparently do I18N if needed, too, as gettext-returnable strings also remain constant over the lifetime of the translation database.
If you need to provide non-static error messages, then I recommend returning strings like this: error_code_t function(, char** err_msg). Then provide a function to free the error message: void free_error_message(char* err_msg). This way you hide how the error strings are allocated and freed. This is of course only worth implementing of your error strings are dynamic in nature, meaning that they convey more than just a translation of error codes.
Please havy oversight with mu formatting. I'm writing this on a cell phone...

isDefined function?

In C++ is there any function that returns "true" when the variable is defined or false in vice versa. Something like this:
bool isDefined(string varName)
{
if (a variable called "varName" is defined)
return true;
else
return false;
}
C++ is not a dynamic language. Which means, that the answer is no. You know this at compile time, not runtime.
There is no such a thing in runtime as it doesn't make sense in a non-dynamic language as C++.
However you can use it inside a sizeof to test if it exists on compile time without side-effects.
(void)sizeof(variable);
That will stop compilation if var doesn't exist.
As already stated, the C++ runtime system does not support the querying of whether or not a variable is declared or not. In general a C++ binary doesn't contain information on variable symbols or their mappings to their location. Technically, this information would be available in a binary compiled with debugging information, and you could certainly query the debugging information to see if a variable name is present at a given location in code, but it would be a dirty hack at best (If you're curious to see what it might look at, I posted a terrible snippet # Call a function named in a string variable in C which calls a C function by a string using the DWARF debugging information. Doing something like this is not advised)
Microsoft has two extensions to C++ named: __if_exists and __if_not_exists. They can be useful in some cases, but they don't take string arguments.
If you really need such a functionality you can add all your variables to a set and then query that set for variable existance.
Already mentioned that C++ doesn't provide such facility.
On the other hand there are cases where the OS implement mechanisms close to isDefined(),
like the GetProcAddress Function, on Windows.
No. It's not like you have a runtime system around C++ which keeps remembers variables with names in some sort of table (meta data) and lets you access a variable through a dynamically generated string. If you want this, you have to build it yourself, for example using a std::map that maps strings to some objects.
Some compile-time mechanism would fit into the language. But I don't think that it would be any useful.
In order to achieve this first you need to implement a dynamic variable handling system, or at least find some on the internet. As previously mentioned the C++ is designed to be a native language so there are no built-in facilities to do this.
What I can suggest for the most easy solution to create a std::map with string keys storing global variables of interest with a boost::any, wxVariant or something similar, and store your variables in this map. You can make your life a bit easier with a little preprocessor directive to define a variables by their name, so you don't need to retype the name of the variable twice. Also, to make life easier I suggest to create a little inline function which access this variable map, and checks if the given string key is contained by the map.
There are implementation such a functionality in many places, the runtime property handling systems are available in different fashion, but if you need just this functionality I suggest to implement by yourself, because most of these solutions are quite general what you probably don't need.
You can make such function, but it wouldn't operate strings. You would have to send variable name. Such a function would try to add 0 to the variable. If it doesn't exists, an error would occur, so you might want to try to make exception handling with try...throw...catch . But because I'm on the phone, I don't know if this wouldn't throw an error anyways when trying to send non-existing variable to the function...

use of char * vs std::string in different environments

I have been using std::string in my code. I was going to make a std::string and pass it by reference. However, someone suggested using a char * instead. Something about std::string is not reliable when porting code. Is that true? I have avoided using char * as I would need to do some memory management for it. Instead I find using the std::string much easier to use.
Basically I have a 10 digit output that I am storing in this string. Atm, I am not sure which would be better to use.
std::string is part of the C++ Standard, and has been since 1998. It is available in all the current C++ compilers. There really is no portability reason not to use it. If you have an API that needs to use a C-style string, you can use the std::string's c_str() member to get one from a string:
std::string s = "foo";
int n = strlen( s.c_str() );
In C++, almost every string should be std::string unless another library requires a cstring, in which case you should still be using an std::string and passing string.c_str(), unless you're using functions that work with buffers.
However, if you're writing a library and exporting functions, it's better to use const char* parameters rather than std::string parameters for portability.
Using a char * you are sure that you will not get portability issues among libraries.
If a library exports a function that uses an std::string, it might have problems communicating with another library that has been linked against a different version of the standard library.
I think that there is nothing to worry about unless you are going to provide some API to 3rd party.
Just use std::string
There's nothing unportable about std::string that isn't also an issue with char *. std::string actually uses a char * internally...
string is better. There is nothing unreliable about it on any platform. If you're worried about passing large classes, you can pass const references of your strings into functions. Makes coding faster and less bug prone.
In addition to the fact thata it's easier, std::string will probably be more efficient. Its small string optimization can keep the 10 digits in the std::string object itself, instead of putting them in another memory block off the heap.