Returning char* from dllexported function - c++

I'm creating a DLL which will be used by some external exe file. One of the exposed functions is
...
char *current_version = "1.1";
...
extern "C" _declspec(dllexport) char* version(){
return current_version;
}
Because the current version is used in multiple places, I created the current_version variable. Will the caller of the version function be able to change the content of the current_version variable? (I expect so.)
If I change the code to:
...
const char *current_version = "1.1"; //this is preferable
...
extern "C" _declspec(dllexport) char* version(){
char local_copy[100] = make_local_copy(current_version);
return *local_copy;
}
will the local_copy variable be disposed after execution of the version function finishes (and in this case returned pointer will point at some random data) ? If so, what is the best way to return a pointer to const char* ?

Will caller of the version function be able to change the content of current_version variable ?
Modifying a string literal is undefined behavior, so the actual result depends on the implementation. There is a good chance that the caller could indeed change the string. In some implementations, string literals are stored in read-only memory, so attempting to write to one through a non-const pointer causes a runtime fault (an access violation on Windows) instead.
will the local_copy variable be disposed after execution of the version function finishes
Yes.
(and in this case returned pointer will point at some random data) ?
It will, in most implementations, point to an area of stack. Writing to it would corrupt program execution flow.
If so, what is the best way to return a pointer to const char* ?
There is no good way to do that in C++.

extern "C" _declspec(dllexport) void version(char* buffer, int* len)
{
    // size needed, including the terminating NUL
    int required = (int)strlen(current_version) + 1;
    if (buffer == NULL || *len < required)
    {
        // no buffer, or buffer too small: only report the required size
        *len = required;
        return;
    }
    // *len is the buffer capacity here, so strcpy_s cannot overflow
    strcpy_s(buffer, *len, current_version);
    *len = required;
    return;
}
This should be a good starting point. There may be bugs, and I also recommend using wchar_t instead of char. This is my best guess at a function that avoids the memory issues. The caller makes a first call to determine the required length, dynamically allocates a buffer of that size, and then calls the function again. Alternatively, if the caller already has a buffer it knows is large enough, it passes the buffer and its size and only needs to call the function once.

About the questions, I agree with @Agent_L.
About the solution, I think you can use a static char array as a buffer. Just as follows:
static char local_copy[64];
static char current_version[] = "1.1";
char *version() {
strcpy(local_copy, current_version);
return local_copy;
}
Then you don't need to worry about disposing of local_copy.
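If the version string never changes at run time, one further option is to skip the copy and return a pointer to const data with static storage duration. The following is only a sketch under that assumption; the export attribute is omitted so the snippet compiles anywhere, and in the DLL you would add _declspec(dllexport) as in the question.

```cpp
#include <cstring>

// Static storage duration: the array lives for as long as the module is loaded.
static const char current_version[] = "1.1";

// Returning const char* documents that the caller must not write through
// the pointer; an accidental write such as version()[0] = 'x' will not compile.
extern "C" const char* version() {
    return current_version;
}
```

A caller can still cast the const away, but doing so and writing to the data is undefined behavior on the caller's side, not something the DLL needs to defend against with a copy.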

Within a DLL function returning memory, where to allocate and where to deallocate?

I'm extremely new to DLL writing/usage, and have written a function within the DLL that accepts a string, and returns another string to the executable as output.
#define DECL_EXPORT extern "C" __declspec(dllexport)
DECL_EXPORT char * organizeArgs(const char * args) {
uint32 outputLen;
...
char * result = new char[outputLen];
...
return result;
}
The easiest way for me was to allocate the memory in the DLL and return it to the executable to deallocate, but I'm reading that this is generally bad because it breaks when the allocation code in the DLL and in the exe is not identical. Trying to avoid allocating in the DLL and deallocating in the executable is making this function, and how it's used, much more complex.
How should this type of allocation/deallocation be handled?
I have a couple of theories but both of them feel kind of terrible:
Calculating the size of the output in a separate dll function call, so the caller can allocate that much memory for the output and provide it to the dll. This seems like the most sane solution, but requires running the analysis code twice:
DECL_EXPORT uint32 organizeArgsSize(const char * args) {
...
return size;
}
DECL_EXPORT char * organizeArgs(const char * args, char * outputBuffer) {
...
return outputBuffer;
}
Allocate on the dll and return a pointer that the executable isn't expected to free, which is valid until overwritten by a second call of the dll's function. This has the best ergonomics for the caller I think, but since I'm multi-threading, this would require thread_local storage. I'm still trying to research if using thread_local in a dll is acceptable and haven't found anything:
DECL_EXPORT const char * organizeArgs(const char * args) {
thread_local std::string buffer;
...
return buffer.c_str();
}
I imagine something like this comes up a lot in DLL writing, and I'm making it out to be harder than it actually is. How is it usually done?
You missed one: put a deallocate function in the DLL:
DECL_EXPORT char * organizeArgs(const char * args) {
...
char * result = new char[outputLen];
...
return result;
}
DECL_EXPORT void organizeArgsDeallocate(char *organizedArgs) {
delete [] organizedArgs;
}
This is okay because the new and delete operators are called within the same DLL.
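On the caller's side, the paired deallocate function can be attached to the pointer so that it is not forgotten. This is a sketch only: the two function bodies are hypothetical stand-ins for the DLL exports (here the "organized" result is just a copy of the input), and OrganizedArgs and organizeArgsOwned are names invented for the illustration.

```cpp
#include <cstring>
#include <memory>

// Stand-ins for the two DLL exports (hypothetical bodies, export macro omitted).
char* organizeArgs(const char* args) {
    std::size_t n = std::strlen(args) + 1;   // include the terminating NUL
    char* result = new char[n];
    std::memcpy(result, args, n);
    return result;
}

void organizeArgsDeallocate(char* organizedArgs) {
    delete[] organizedArgs;
}

// Caller side: a unique_ptr whose deleter is the DLL's own deallocate
// function, so the memory is always released by the module that allocated it.
using OrganizedArgs = std::unique_ptr<char, decltype(&organizeArgsDeallocate)>;

OrganizedArgs organizeArgsOwned(const char* args) {
    return OrganizedArgs(organizeArgs(args), &organizeArgsDeallocate);
}
```

With this wrapper the executable never calls delete[] itself; the buffer goes back through organizeArgsDeallocate when the OrganizedArgs object goes out of scope.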
There's also a fourth one: use an allocation method that isn't different in each DLL. The problem arises in the first place because each DLL might be using a different MSVCRT DLL (C/C++ standard library). But they all have the same Win32 API DLLs, and you can share memory that's been allocated from a Win32 API function.
DECL_EXPORT char * organizeArgs(const char * args) {
...
char * result = (char*)HeapAlloc(GetProcessHeap(), 0, outputLen);
// don't forget to check for NULL return value meaning out-of-memory
// or you can pass the HEAP_GENERATE_EXCEPTIONS flag
...
return result;
}
// caller does HeapFree(GetProcessHeap(), 0, result)
Note that on Linux you can generally call malloc in one shared library and free in another, since they normally all use the same C library; you don't need any of these workarounds.
These are all valid ways to approach this problem, and in fact, you can find all of these in the standard library and the Win32 API:
FormatMessage lets you call it with a NULL buffer to calculate the size (option 1) or you can pass a certain flag and it will allocate the buffer with a cross-DLL-safe way (option 4).
getaddrinfo allocates memory itself and you can free it with freeaddrinfo. (option 3)
asctime returns a pointer to a thread-local or global buffer (option 2). (I hope it's thread-local, but MSDN isn't clear!)
Another method is to change your function to the following:
DECL_EXPORT LONG organizeArgs(LPCSTR args, LPSTR outbuf, LONG length);
Then the API could be documented like this:
args - is the set of arguments
outbuf - is the output buffer or NULL
length - length of the output buffer, ignored if outbuf is NULL
Returns:
Number of characters written to outbuf, or if outbuf is NULL,
returns the maximum number of characters that would have been written.
So the onus is on the client on whether to call the function twice. If the client is confident that they have a buffer big enough to hold the information, then they will allocate it and call your function once using the length argument to limit the number of characters.
If they are not confident or want to ensure that they get all the arg information, then the client is responsible for calling your function twice, the first time with outbuf being NULL and getting the return value, and a second time with outbuf being the allocated buffer.
This is exactly how a few Windows API functions work. The DLL allocates no memory whatsoever.
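The documented contract above could be sketched as follows. Everything here is an assumption made for illustration: plain long and char pointers replace LONG, LPCSTR and LPSTR so the snippet builds without windows.h, the "organized" output is simply a copy of args, and callOrganizeArgs is a hypothetical client helper.

```cpp
#include <algorithm>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for the export.
long organizeArgs(const char* args, char* outbuf, long length) {
    const long needed = static_cast<long>(std::strlen(args));
    if (outbuf == nullptr)
        return needed;                    // size query: nothing is written
    if (length <= 0)
        return 0;
    const long n = std::min(needed, length);
    std::memcpy(outbuf, args, static_cast<std::size_t>(n));
    return n;                             // characters written, no terminator added
}

// Client that is not sure about the size: query first, then fill.
std::string callOrganizeArgs(const char* args) {
    const long needed = organizeArgs(args, nullptr, 0);
    std::vector<char> buf(static_cast<std::size_t>(needed));
    const long written = organizeArgs(args, buf.data(), needed);
    return std::string(buf.begin(), buf.begin() + written);
}
```

A client that already has a large enough buffer would skip the first call and pass the buffer and its length directly.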

Returning the address of a char* variable

I'm currently developing a project and in one of my functions, I need to return the address of a char* variable.
Or maybe it's better to return the address of my string, but I don't know how to do this.
This is my code:
const char* rogner(){
int tableauXY[2];
tableauXY[0]=5;
tableauXY[1]=8;
string valeurs=to_string(tableauXY[0])+";"+to_string(tableauXY[1]);
const char* val1=valeurs.c_str();
return val1;
}
There are several ways to export a string from an extern "C" function:
1) Provide a second function to release the allocated memory:
extern "C" const char* getStr(...)
{
auto result = new char[N];
...
return result;
}
extern "C" void freeStr(const char* str)
{
delete[] const_cast<char*>(str);
}
2) Use the OS allocator, e.g. in the case of Windows:
// Client must call SysFreeString later.
// Note that a BSTR holds wide characters (wchar_t), not char.
extern "C" BSTR getStr(...)
{
    BSTR result = SysAllocStringLen(NULL, N);
    ...
    return result;
}
3) Use client-provided buffer (there are a lot of variations how to tell the client what size buffer should have).
extern "C" int getStr(char* buffer, int bufSize, ...){}
There are many answers to your question. But one thing you can't do is return the address of a local variable. Using such a pointer after the function has returned is undefined behavior, so forget about that option completely.
Since you stated that the function is part of a DLL, then it is better to use the "standard" ways to handle strings between a DLL and the client application.
One usual and widely used method is to have the client create the char buffer, and have the char buffer passed to you. Your function then fills in the user-supplied buffer with the information. Optional items such as size of the buffer can also be passed.
The above method is used by most functions in the Windows API that requires passing and returning strings.
Here is a small example:
#include <algorithm>
#include <cstring>
#include <string>
//...
char *rogner(char *buffer, int buflen)
{
    // nothing can be written into a missing or empty buffer
    if (buffer == nullptr || buflen <= 0)
        return buffer;
    int tableauXY[2];
    tableauXY[0]=5;
    tableauXY[1]=8;
    std::string valeurs=std::to_string(tableauXY[0])+";"+std::to_string(tableauXY[1]);
    // copy at most buflen - 1 characters so that there is
    // always room for the terminating NUL
    size_t minSize = std::min(static_cast<size_t>(buflen) - 1, valeurs.length());
    memcpy(buffer, valeurs.c_str(), minSize);
    buffer[minSize] = '\0';
    // return the user's buffer back to them.
    return buffer;
}
There are variations on the above code, such as:
Returning the number of characters copied instead of the original buffer.
If buflen==0, returning only the total number of characters that would be copied. This allows the client to call your function to get the number of characters, and then call it a second time with a buffer of the appropriate size.
Thanks for all, it works
I've used the PaulMcKenzie's method and it's good!
I have not tried the Nikerboker's solution so maybe it works too.
Thanks

Access violation writing location 0x00000000. memset function issues

#include <iostream>
#include <string.h>
using namespace std;
void newBuffer(char* outBuffer, size_t sz) {
outBuffer = new char[sz];
}
int main(void) {
const char* abcd = "ABCD";
char* foo;
foo = NULL;
size_t len = strlen(abcd);
cout<<"Checkpoint 1"<<endl;
newBuffer(foo, len);
cout<<"Checkpoint 2"<<endl;
cout<<"Checkpoint 2-A"<<endl;
memset(foo, '-', len);
cout<<"Checkpoint 3"<<endl;
strncpy(foo, abcd, len);
cout<<"Checkpoint 4"<<endl;
cout << foo << endl;
int hold;
cin>>hold;
return 0;
}
This program crashes between checkpoints 2-A and 3. What it tries to do is fill the char array foo with the character '-', but it fails with an access violation. I do not understand why this happens. Thank you very much in advance!
Your newBuffer function should accept the first parameter by reference so that changes made to it inside the function are visible to the caller:
void newBuffer(char*& outBuffer, size_t sz) {
outBuffer = new char[sz];
}
As it is now, you assign the result of new char[sz] to the local variable outBuffer which is only a copy of the caller's foo variable, so when the function returns it's as if nothing ever happened (except you leaked memory).
Also you have a problem in that you are allocating the buffer to the size of the length of ABCD which is 4. That means you can hold up to 3 characters in that buffer because one is reserved for the NUL-terminator at the end. You need to add + 1 to the length somewhere (I would do it in the call to the function, not inside it, because newBuffer shouldn't be specialised for C-strings). strncpy only NUL-terminates the buffer if the source string is short enough, so in this case you are only lucky that there happens to be a 0 in memory after your buffer you allocated.
Also don't forget to delete[] foo in main after you're done with it (although it doesn't really matter for a program this size).
It fails because your newBuffer function doesn't actually work. The easiest way to fix it would be to change the declaration to void newBuffer (char *&outBuffer, size_t sz). As it's written, the address of the newly allocated memory doesn't actually get stored into main's foo because the pointer is passed by value.
You are passing the pointer by value. You would need to pass either a reference to the pointer, or the address of the pointer.
That said, using the return value would be better in my view:
char* newBuffer(size_t sz) {
return new char[sz];
}
When written this way, the newBuffer function doesn't really seem worthwhile. You don't need it. You can use new directly and that would be clearer.
Of course, if you are using C++ then this is all rather pointless. You should be using string, smart pointers etc. You should not have any need to call new directly. Once you fix the bug you are talking about in this question you will come across the problem that your string is not null-terminated and that the buffer is too short to hold the string since you forgot to allocate space for the null-terminator. One of the nice things about C++ is that you can escape the horrors of string handling in C.
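As a rough illustration of that last point, the same steps as the question's program (allocate, fill with '-', copy the text) could be written with std::string, which owns its memory and tracks its own length. The function name copyWithFill is made up for this sketch.

```cpp
#include <cstring>
#include <string>

// Mirrors the question's program without new[], memset or strncpy.
std::string copyWithFill(const char* src) {
    // Like new char[len] followed by memset(foo, '-', len), except that the
    // string releases its own memory and no terminator has to be counted.
    std::string foo(std::strlen(src), '-');
    // Like strncpy(foo, src, len), with termination handled by the class.
    foo.assign(src);
    return foo;
}
```

Printing the result with cout << copyWithFill("ABCD") then needs neither a manual delete[] nor the extra byte for the terminator.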

Conversion char[] to char*

Maybe this is a silly question, but please help.
void Temp1::caller()
{
char *cc=Called();
printf("sdfasfasfas");
printf("%s",cc);
}
char *Temp1::Called()
{
char a[6]="Hello";
return &a;
}
How can I print Hello here using printf("%s",cc);?
Firstly, this function:
char *Temp1::Called()
{
char a[6]="Hello";
return &a;
}
is returning the address of a local variable, which will cease to exist once the function ends. Change it to:
const char *Temp1::Called()
{
return "Hello";
}
and then, the way to print strings using printf() is to use "%s":
void Temp1::caller()
{
const char *cc=Called();
printf("sdfasfasfas");
printf("%s",cc);
}
You are returning the address of a local variable, which invokes undefined behaviour. You need to make it static inside Called, or global, or allocate memory for it.
And use %s as the format for printf.
char a[6] is a local variable, and you cannot return its address from the function. It is destroyed when execution leaves the function's scope.
You can use the standard library for this:
#include <stdio.h>
#include <string>
using namespace std;
string Called()
{
string a=string("Hello");
return a;
}
int main()
{
string cc=Called();
printf("sdfasfasfas\n");
printf("%s",cc.c_str());
}
2 things
You need %s, the string format specifier to print strings.
You are returning the address of a local array variable a[6], which is destroyed when the function returns. Using that pointer is undefined behavior: the program may print garbage or crash with a segmentation fault. If it does crash on a Linux machine, run ulimit -c unlimited first and then run the program again; you should see a core dump.
%c is for printing a single character. You need to use %s for printing a null-terminated string. That said, this code is likely to crash, as you are trying to return the address of the local variable a from the function Called. This variable's memory is released as soon as Called returns. You are trying to use this released memory in caller, and in many cases it will crash.
A few things:
Change return &a; to return a;
You are returning the address of a local array, which will cease to exist once the function returns, so allocate it dynamically using new or make it static.
Use the %s format specifier in printf in place of %c.
Problem 1: %c is the specifier for a single char, while you need to use %s, which is the format specifier for strings (pointers to a NUL-terminated array of chars).
Problem 2: in Called you are returning the address of the whole array (&a has type char (*)[6], a pointer to an array of 6 chars), which does not match the char * return type. In this context a by itself converts to a pointer to the first element of the array, so you don't need that ampersand in the return.
Problem 3: even if you corrected the other two errors, there's a major flaw in your code: you're trying to return a pointer to a local object (the array a), which is destroyed when it goes out of scope (i.e. when Called returns). So the caller will have a pointer to an area of memory which is no longer dedicated to a; what happens next is undefined behavior: it may work for a while, as long as the memory where a was stored isn't reused for something else (for example if you don't call other functions before using the returned value), but it will most likely explode tragically at the first change in the application.
The usual method for returning strings in C is to allocate them on the heap with malloc or calloc and return this pointer to the caller, which then has the responsibility to free it when it is no longer needed. Another common way to do it in C is to declare the local variable as static, so it won't be destroyed at the return, but this makes your function neither reentrant nor thread-safe (and it may also cause other nasty problems).
On the other hand, since you are using C++, the best way to deal with strings is the std::string class, which has a nice copy semantic so that you can return it as a normal return value without bothering about scope considerations.
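The reentrancy problem with a static buffer mentioned above can be seen in a small sketch. The function label and its "item-%d" format are invented for the illustration.

```cpp
#include <cstdio>
#include <cstring>

// Formats into one shared static buffer and returns a pointer to it.
const char* label(int n) {
    static char buffer[32];
    std::snprintf(buffer, sizeof buffer, "item-%d", n);
    return buffer;
}

// Returns true when the second call has overwritten the first result.
bool secondCallOverwritesFirst() {
    const char* first = label(1);
    const char* second = label(2);   // reuses the buffer that first points at
    return first == second && std::strcmp(first, "item-2") == 0;
}
```

Both pointers refer to the same array, so a caller that keeps the first result across a second call (or another thread calling at the same time) sees it change underneath it. Returning std::string by value avoids this.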
By the way, if the string you need to return is always the same, you can just declare the return type of your function as const char * and return directly the string, as in
const char * Test()
{
return "Test";
}
This works because the string "Test" is put by the compiler in a fixed memory location, where it will stay during all the execution. It needs to be a const char * because the compiler is allowed to say to every other piece of program that needs a "Test" string to look there, and because all the strings of the application are tightly-packed there, so if you tried to make that string longer you would overwrite some other string.
Still, in my opinion, if you are making errors like those I outlined, it may be that you're trying to do something a bit too complex for your current skills: it may be better to have another look at your C++ manual, especially at the chapter about pointers and strings.

Caching a const char * as a return type

Was reading up a bit on my C++, and found this article about RTTI (Runtime Type Identification):
http://msdn.microsoft.com/en-us/library/70ky2y6k(VS.80).aspx . Well, that's another subject :) - However, I stumbled upon a weird saying in the type_info-class, namely about the ::name-method. It says: "The type_info::name member function returns a const char* to a null-terminated string representing the human-readable name of the type. The memory pointed to is cached and should never be directly deallocated."
How can you implement something like this yourself!? I've been struggling quite a bit with this exact problem often before, as I don't want to make a new char-array for the caller to delete, so I've stuck to std::string thus far.
So, for the sake of simplicity, let's say I want to make a method that returns "Hello World!", let's call it
const char *getHelloString() const;
Personally, I would make it somehow like this (Pseudo):
const char *getHelloString() const
{
char *returnVal = new char[13];
strcpy(returnVal, "HelloWorld!"); // destination first, then source
return returnVal;
}
.. But this would mean that the caller should do a delete[] on my return pointer :(
Thx in advance
How about this:
const char *getHelloString() const
{
return "HelloWorld!";
}
Returning a literal directly means the space for the string is allocated in static storage by the compiler and will be available throughout the duration of the program.
I like all the answers about how the string could be statically allocated, but that's not necessarily true for all implementations, particularly the one whose documentation the original poster linked to. In this case, it appears that the decorated type name is stored statically in order to save space, and the undecorated type name is computed on demand and cached in a linked list.
If you're curious about how the Visual C++ type_info::name() implementation allocates and caches its memory, it's not hard to find out. First, create a tiny test program:
#include <cstdio>
#include <typeinfo>
#include <vector>
int main(int argc, char* argv[]) {
std::vector<int> v;
const type_info& ti = typeid(v);
const char* n = ti.name();
printf("%s\n", n);
return 0;
}
Build it and run it under a debugger (I used WinDbg) and look at the pointer returned by type_info::name(). Does it point to a global structure? If so, WinDbg's ln command will tell the name of the closest symbol:
0:000> ?? n
char * 0x00000000`00857290
"class std::vector<int,class std::allocator<int> >"
0:000> ln 0x00000000`00857290
0:000>
ln didn't print anything, which indicates that the string wasn't in the range of addresses owned by any specific module. It would be in that range if it was in the data or read-only data segment. Let's see if it was allocated on the heap, by searching all heaps for the address returned by type_info::name():
0:000> !heap -x 0x00000000`00857290
Entry User Heap Segment Size PrevSize Unused Flags
-------------------------------------------------------------------------------------------------------------
0000000000857280 0000000000857290 0000000000850000 0000000000850000 70 40 3e busy extra fill
Yes, it was allocated on the heap. Putting a breakpoint at the start of malloc() and restarting the program confirms it.
Looking at the declaration in <typeinfo> gives a clue about where the heap pointers are getting cached:
struct __type_info_node {
void *memPtr;
__type_info_node* next;
};
extern __type_info_node __type_info_root_node;
...
_CRTIMP_PURE const char* __CLR_OR_THIS_CALL name(__type_info_node* __ptype_info_node = &__type_info_root_node) const;
If you find the address of __type_info_root_node and walk down the list in the debugger, you quickly find a node containing the same address that was returned by type_info::name(). The list seems to be related to the caching scheme.
The MSDN page linked in the original question seems to fill in the blanks: the name is stored in its decorated form to save space, and this form is accessible via type_info::raw_name(). When you call type_info::name() for the first time on a given type, it undecorates the name, stores it in a heap-allocated buffer, caches the buffer pointer, and returns it.
The linked list may also be used to deallocate the cached strings during program exit (however, I didn't verify whether that is the case). This would ensure that they don't show up as memory leaks when you run a memory debugging tool.
Well gee, if we are talking about just a function that you always want to return the same value, it's quite simple.
const char * foo()
{
static char return_val[] = "HelloWorld!";
return return_val;
}
The tricky bit is when you start caching the result: then you have to consider threading, when your cache gets invalidated, and whether to store things in thread-local storage. But if it's just a one-off output that is immediately copied, this should do the trick.
Alternatively, if you don't have a fixed size, you have to either use a static buffer of arbitrary size, for which you might eventually have something too large, or turn to a managed class such as std::string.
const char * foo()
{
static std::string output;
DoCalculation(output);
return output.c_str();
}
also the function signature
const char *getHelloString() const;
is only applicable for member functions.
At which point you don't need to deal with static function local variables and could just use a member variable.
I think that since they know that there are a finite number of these, they just keep them around forever. It might be appropriate for you to do that in some instances, but as a general rule, std::string is going to be better.
They can also look up new calls to see if they made that string already and return the same pointer. Again, depending on what you are doing, this may be useful for you too.
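A minimal sketch of that "look it up and return the same pointer" idea, assuming the set of names is finite and the strings may stay alive until the program exits. The function cachedName is hypothetical and is not how any particular type_info implementation works.

```cpp
#include <mutex>
#include <set>
#include <string>

// Returns a stable const char* for name, reusing the stored copy when the
// same text has been requested before.
const char* cachedName(const std::string& name) {
    // std::set is node-based, so stored strings never move and the pointers
    // handed out stay valid until the cache itself is destroyed.
    static std::set<std::string> cache;
    static std::mutex guard;              // protects the cache across threads
    std::lock_guard<std::mutex> lock(guard);
    return cache.insert(name).first->c_str();
}
```

The caller never deletes the result, and repeated requests for the same text return the same address.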
Be careful when implementing a function that allocates a chunk of memory and then expects the caller to deallocate it, as you do in the OP:
const char *getHelloString() const
{
char *returnVal = new char[13];
strcpy(returnVal, "HelloWorld!"); // destination first, then source
return returnVal;
}
By doing this you are transferring ownership of the memory to the caller. If you call this code from some other function:
int main()
{
const char * str = getHelloString();
delete[] str; // memory from new[] must be released with delete[]
return 0;
}
...the semantics of transferring ownership of the memory is not clear, creating a situation where bugs and memory leaks are more likely.
Also, at least under Windows, if the two functions are in 2 different modules you could potentially corrupt the heap. In particular, if main() is in hello.exe, compiled in VC9, and getHelloString() is in utility.dll, compiled in VC6, you'll corrupt the heap when you delete the memory. This is because VC6 and VC9 both use their own heap, and they aren't the same heap, so you are allocating from one heap and deallocating from another.
Why does the return type need to be const? Don't think of the method as a get method, think of it as a create method. I've seen plenty of API that requires you to delete something a creation operator/method returns. Just make sure you note that in the documentation.
/* create a hello string
* must be deleted after use
*/
char *createHelloString() const
{
char *returnVal = new char[13];
strcpy(returnVal, "HelloWorld!"); // destination first, then source
return returnVal;
}
What I've often done when I need this sort of functionality is to have a char * pointer in the class - initialized to null - and allocate when required.
viz:
class CacheNameString
{
private:
char *name;
public:
CacheNameString():name(NULL) { }
~CacheNameString() { free(name); } // release the cached copy; free(NULL) is a no-op
const char *make_name(const char *v)
{
if (name != NULL)
free(name);
name = strdup(v);
return name;
}
};
Something like this would do:
const char *myfunction() {
static char *str = NULL; /* this only happens once */
delete [] str; /* delete previous cached version */
str = new char[strlen("whatever") + 1]; /* allocate space for the string and its NUL terminator */
strcpy(str, "whatever");
return str;
}
EDIT: Something that occurred to me is that a good replacement for this could be returning a boost::shared_ptr instead. That way the caller can hold onto it as long as they want and they don't have to worry about explicitly deleting it. A fair compromise IMO.
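A sketch of that shared-pointer idea, using std::shared_ptr from the standard library in place of the Boost class and a std::string as the pointee (both substitutions are mine, not part of the suggestion above):

```cpp
#include <memory>
#include <string>

// The returned pointer shares ownership of the string; the memory is
// released automatically when the last copy of the pointer goes away.
std::shared_ptr<const std::string> myfunction() {
    return std::make_shared<std::string>("whatever");
}
```

A caller that needs a const char* can use myfunction()->c_str(), but only while it still holds a copy of the shared pointer.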
The advice that warns about the lifetime of the returned string is sound advice. You should always be careful about recognising your responsibilities when it comes to managing the lifetime of returned pointers. The practice is quite safe, however, provided the variable pointed to will outlast the call to the function that returned it. Consider, for instance, the pointer to const char returned by c_str() as a method of class std::string. This returns a pointer to memory managed by the string object, which is guaranteed to be valid as long as the string object is not destroyed or made to reallocate its internal memory.
In the case of the std::type_info class, it is a part of the C++ standard, as its namespace implies. In many implementations, the memory returned from name() points to static memory created by the compiler and linker when the class was compiled, and is a part of the run-time type identification (RTTI) system. Either way, the string is owned by the runtime, so you should not attempt to delete it.
I think something like this can only be implemented "cleanly" using objects and the RAII idiom.
When the object's destructor is called (obj goes out of scope), we can safely assume that the const char* pointers aren't used anymore.
Example code:
class ICanReturnConstChars
{
    // std::vector (needs #include <vector>) rather than std::stack,
    // because push_back/back/pop_back are vector members
    std::vector<char*> cached_strings;
public:
    const char* yeahGiveItToMe(){
        char* newmem = new char[something];
        //write something to newmem
        cached_strings.push_back(newmem);
        return newmem;
    }
    ~ICanReturnConstChars(){
        // free every string that was handed out
        while(!cached_strings.empty()){
            delete [] cached_strings.back();
            cached_strings.pop_back();
        }
    }
};
The only other possibility i know of is to pass a smart_ptr ..
It's probably done using a static buffer:
const char* GetHelloString()
{
static char buffer[256] = { 0 };
strcpy( buffer, "Hello World!" );
return buffer;
}
This buffer is like a global variable that is accessible only from this function.
You can't rely on GC; this is C++. That means you must keep the memory available until the program terminates. You simply don't know when it becomes safe to delete[] it. So, if you want to construct and return a const char*, simply new[] it and return it. Accept the unavoidable leak.