retrieving string from LPVOID - c++

Can someone explain and help me out here, please?
Let's say I have a function like this, where lpData holds a pointer to the data I want.
void foo(LPVOID lpData) {
}
What is the proper way to retrieve this? This works, but I get weird characters at the end:
void foo(LPVOID lpData) {
LPVOID *lpDataP = (LPVOID *)lpData;
char *charData = (char*)lpDataP;
//i log charData....
}
I would prefer to use strings, but I don't understand how to retrieve the data; I just get a null pointer error when I try to use string. lpData holds a pointer, right? (But my function takes lpData, not *lpData.) So it isn't working? Am I doing this all wrong?
string *datastring = reinterpret_cast<std::string *>(lpData);
is what I'm trying.

This works, but I get weird characters at the end
That means that your string isn't null-terminated—that is, it doesn't have a NUL byte (0) marking the end of the string.
C strings have to be null-terminated.* When you log a C string (char *), it keeps logging characters until it finds a NUL. If there wasn't one on the end of the string, it'll keep going through random memory until it finds one (or until you hit a page fault and crash). This is bad. And there's no way to fix it; once you lose the length, there's no way to get it back.
However, an unterminated string along with its length can be useful. Many functions can take the length alongside the char *, as an extra argument (e.g., the string constructor) or otherwise (e.g., width specifiers in printf format strings).
So, if you take the length, and only call functions that also take the length—or just make a null-terminated copy and use that—you're fine. So:
void foo(LPVOID lpData, int cchData) {
// cchData is the number of characters; no terminator is required
std::string sData(static_cast<const char *>(lpData), cchData);
// now do stuff with sData
}
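Here is a minimal, portable sketch of the same idea. It assumes nothing Windows-specific: `const void *` stands in for LPVOID (which is a typedef for `void *`), and the helper name `string_from_buffer` is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: copies exactly `len` characters out of an untyped
// buffer. The buffer does not need to be null-terminated.
std::string string_from_buffer(const void *data, std::size_t len) {
    assert(data != nullptr || len == 0);  // a null buffer is only OK when empty
    if (len == 0) {
        return std::string();
    }
    return std::string(static_cast<const char *>(data), len);
}
```

Because the length travels with the pointer, the resulting std::string never reads past the end of the caller's buffer.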
Meanwhile, casting from LPVOID (aka void *, aka pointer-to-anything) to LPVOID * (aka void **, aka pointer to pointer-to-anything) and then to char * (pointer-to-characters) is wrong: lpData doesn't point at another pointer, so the intermediate type is a lie. It only appears to work because a C-style cast silently reinterprets the pointer and you never dereference the void **. Cast directly to const char * instead. Also, it's generally better to use modern casts instead of C-style casts, and it's always better to be const-correct when there's no down side; it just makes things more explicit to the reader and safer in the face of future maintenance.
Finally:
string *datastring = reinterpret_cast<std::string *>(lpData);
This is almost certainly wrong.** The LPVOID is just pointing at a bunch of characters. You're saying you want to interpret those characters as if they were a string object. But a string object is some header information (maybe a length and capacity, etc.) plus a pointer to a bunch of characters. Treating one as the other is going to lead to garbage or crashes.***
* Yes, you're using C++, not C, but a char * is a "C string".
** If you actually have a string object that you've kept alive somewhere, and you stashed a pointer to that object in an LPVOID and have now retrieved it (e.g., with SetWindowLongPtr/GetWindowLongPtr), then a cast from LPVOID to string * would make sense. But I doubt that's what you're doing. (If you are, then you don't need the reinterpret_cast. The whole point of void * is that it's not interpreted, so there's nothing to reinterpret from. Just use static_cast.)
*** Or, worst of all, it may appear to work, but then lead to hard-to-follow crashes or corruption. Some standard C++ libraries use a special allocator to put the header right before the characters and return a pointer to the first character, so that a string can be used anywhere a char * can. Inside the string class, every method has to fudge the this pointer backward; for example, instead of just saying m_length it has to do something like reinterpret_cast<_string_header *>(this)[-1].m_length. But the other way around doesn't work: if you just have a bunch of characters, not a string object, that fudge is going to read whatever bytes happened to be allocated right before the characters and try to interpret them as an integer, so you may end up thinking you have a string of length 0, or 182423742341241243.
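For the case in the second footnote, where the untyped pointer really does point at a live std::string object, a sketch could look like this. The function name `length_via_void` is hypothetical, and `void *` again stands in for LPVOID.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical callback: the caller promises that `context` is really the
// address of a std::string that is still alive.
std::size_t length_via_void(void *context) {
    assert(context != nullptr);
    // static_cast is enough to get back from void * to the original type.
    std::string *s = static_cast<std::string *>(context);
    return s->size();
}
```

This is only valid if the pointer was produced from a `std::string *` in the first place; pointing it at raw characters gives exactly the garbage described above.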

There are at least two ways:
void foo(LPVOID lpData)
{
char *charData = (char*)lpData;
//i log charData....
}
or
void foo(LPVOID lpData)
{
char *charData = static_cast<char*>(lpData);
//i log charData....
}

Related

How to avoid providing length along with char*?

There is a function which sends data to the server:
int send(
_In_ SOCKET s,
_In_ const char *buf,
_In_ int len,
_In_ int flags
);
Providing length seems to me a little bit weird. I need to write a function, sending a line to the server and wrapping this one such that we don't have to provide length explicitly. I'm a Java-developer and in Java we could just invoke String::length() method, but now we're not in Java. How can I do that, unless providing length as a template parameter? For instance:
void sendLine(SOCKET s, const char *buf)
{
}
Is it possible to implement such a function?
Use std::string:
void sendLine(SOCKET s, const std::string& buf) {
send (s, buf.c_str(), buf.size()+1, 0); //+1 will also transmit terminating \0.
}
On a side note: your wrapper function ignores the return value and doesn't take any flags.
You can retrieve the length of a C string with the strlen(const char*) function.
Make sure all the strings are null-terminated, and keep in mind that strlen does not count the terminator, so if you transmit it as well the length grows by 1.
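A rough sketch of such a wrapper follows. To keep it runnable anywhere, the real socket call is replaced by a made-up `fake_send` that appends to a std::string; only the shape (pointer plus explicit length) is meant to mirror send.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a C API shaped like send(): pointer plus length.
// It appends to `sink` and reports how many bytes it took.
int fake_send(std::string &sink, const char *buf, int len) {
    sink.append(buf, static_cast<std::size_t>(len));
    return len;
}

// The wrapper computes the length itself, so callers pass only the text.
int send_line(std::string &sink, const char *line) {
    assert(line != nullptr);  // strlen needs a valid, null-terminated string
    const int len = static_cast<int>(std::strlen(line));
    return fake_send(sink, line, len);
}
```

Note that this only works for null-terminated text; for anything else the caller still has to supply the length.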
Edit: My answer originally only mentioned std::string. I've now also added std::vector<char> to account for situations where send is not used for strictly textual data.
First of all, you absolutely need a C++ book. You are looking for either the std::string class or for std::vector<char>, both of which are fundamental elements of the language.
Your question is a bit like asking, in Java, how to avoid char[] because you never heard of java.lang.String, or how to avoid arrays in general because you never heard of java.util.ArrayList.
For the first part of this answer, let's assume you are dealing with just text output here, i.e. with output where a char is really meant to be a text character. That's the std::string use case.
Providing length seems to me a little bit weird.
That's the way strings work in C. A C string is really a pointer to a memory location where characters are stored. Normally, C strings are null-terminated. This means that the last character stored for the string is '\0'. It means "the string stops here, and if you move further, you enter illegal territory".
Here is a C-style example:
#include <string.h>
#include <stdio.h>
void f(char const* s)
{
size_t l = strlen(s); // l = 3
printf("%s", s); // prints "foo"; never pass variable text as the format string itself
}
int main()
{
char* test = new char[4]; // avoid new[] in real programs
test[0] = 'f';
test[1] = 'o';
test[2] = 'o';
test[3] = '\0';
f(test);
delete[] test;
}
strlen just counts all characters at the specified position in memory until it finds '\0'. printf just writes all characters at the specified position in memory until it finds '\0'.
So far, so good. Now what happens if someone forgets about the null terminator?
char* test = new char[3]; // don't do this at home, please
test[0] = 'f';
test[1] = 'o';
test[2] = 'o';
f(test); // uh-oh, there is no null terminator...
The result will be undefined behaviour. strlen will keep looking for '\0'. So will printf. The functions will try to read memory they are not supposed to. The program is allowed to do anything, including crashing. The evil thing is that most likely, nothing will happen for a while because a '\0' just happens to be stored there in memory, until one day you are not so lucky anymore.
That's why C functions are sometimes made safer by requiring you to explicitly specify the number of characters. Your send is such a function. It works fine even without null-terminated strings.
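When you do know how big the buffer is, you can look for the terminator without ever reading past the end. The following is a sketch using std::memchr; the name `bounded_length` is an assumption for illustration, not a standard function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Returns the number of characters before the first '\0' in buf, but never
// examines more than `capacity` bytes. If no terminator is found, it
// returns `capacity`.
std::size_t bounded_length(const char *buf, std::size_t capacity) {
    assert(buf != nullptr);
    const void *end = std::memchr(buf, '\0', capacity);
    if (end == nullptr) {
        return capacity;  // unterminated within the known bounds
    }
    return static_cast<std::size_t>(static_cast<const char *>(end) - buf);
}
```

Unlike strlen, this stays inside the array even when the terminator is missing, because the caller supplies the size.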
So much for C strings. And now please don't use them in your C++ code. Use std::string. It is designed to be compatible with C functions by providing the c_str() member function, which returns a null-terminated char const * pointing to the contents of the string, and it of course has a size() member function to tell you the number of characters, not counting the null terminator (e.g. for a std::string representing the word "foo", size() would be 3, not 4, and 3 is also what a C function like yours would probably expect, but you have to look at the documentation of the function to find out whether it needs the number of visible characters or the number of elements in memory).
In fact, with std::string you can just forget about the whole null-termination business. Everything is nicely automated. std::string is exactly as easy and safe to use as java.lang.String.
Your sendLine should thus become:
void sendLine(SOCKET s, std::string const& line)
{
// send takes the length as an int and a flags argument (0 = no flags)
send(s, line.c_str(), static_cast<int>(line.size()), 0);
}
(Passing a std::string by const& is the normal way of passing big objects in C++. It's just for performance, but it's such a widely-used convention that your code would look strange if you just passed std::string.)
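One thing a wrapper like this should probably not ignore is the return value: a send-style function reports how many bytes it actually accepted, which can be fewer than requested. Below is a hedged sketch of a retry loop. The name `send_all` and the callable parameter are assumptions made so the example runs without sockets; in real code the callable would forward to send.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// SendFn is any callable shaped like int(const char *buf, int len) that
// returns the number of bytes it accepted, or a value <= 0 on failure.
template <typename SendFn>
bool send_all(SendFn send_some, const std::string &line) {
    std::size_t sent = 0;
    while (sent < line.size()) {
        const int remaining = static_cast<int>(line.size() - sent);
        const int n = send_some(line.data() + sent, remaining);
        if (n <= 0) {
            return false;  // error or no progress; give up
        }
        assert(n <= remaining);  // a sane sender never takes more than offered
        sent += static_cast<std::size_t>(n);
    }
    return true;
}
```

The loop keeps calling the sender with the unsent tail of the string until everything has gone out or the sender reports failure.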
How can I do that, unless providing length as a template parameter?
This is a misunderstanding of how templates work. With a template, the length would have to be known at compile time. That's certainly not what you intended.
Now, for the second part of the answer, perhaps you aren't really dealing with text here. It's unlikely, as the name "sendLine" in your example sounds very much like text, but perhaps you are dealing with raw data, and a char in your output does not represent a text character but just a value to be interpreted as something completely different, such as the contents of an image file.
In that case, std::string is a poor choice. Your output could contain '\0' characters that do not have the meaning of "data ends here", but which are part of the normal contents. In other words, you don't really have strings anymore, you have a range of char elements in which '\0' has no special meaning.
For this situation, C++ offers the std::vector template, which you can use as std::vector<char>. It is also designed to be usable with C functions by providing a member function that returns a char pointer. Here's an example:
void sendLine(SOCKET s, std::vector<char> const& data)
{
// send takes the length as an int and a flags argument (0 = no flags)
send(s, &data[0], static_cast<int>(data.size()), 0);
}
(The unusual &data[0] syntax means "pointer to the first element of the encapsulated data". C++11 has a nicer-to-read way of doing this, the data() member function, but &data[0] also works in older versions of C++. Note that &data[0] must not be used on an empty vector.)
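As a sketch of the C++11 spelling, here is a std::vector<char> handed to a C-style function through data(). The function `byte_sum` is a made-up stand-in for any API that takes a pointer plus a length.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical C-style function: takes raw bytes plus an explicit length.
int byte_sum(const char *buf, int len) {
    assert(len >= 0);
    int sum = 0;
    for (int i = 0; i < len; ++i) {
        sum += static_cast<unsigned char>(buf[i]);
    }
    return sum;
}

// data() yields the pointer, size() the length; a zero byte in the middle
// is just another value and does not end the data.
int byte_sum_of(const std::vector<char> &data) {
    return byte_sum(data.data(), static_cast<int>(data.size()));
}
```

Unlike &data[0], calling data() is also fine on an empty vector, as long as the pointer is not dereferenced.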
Things to keep in mind:
std::string is like String in Java.
std::vector is like ArrayList in Java.
std::string is for a range of char with the meaning of text, std::vector<char> is for a range of char with the meaning of raw data.
std::string and std::vector are designed to work together with C APIs.
Do not use new[] in C++.
Understand the null termination of C strings.

What's the difference between char* and char when storing a string in C++?

I saw this example:
const char* SayHi() { return "Hi"; }
And it works fine, but if I try to remove the pointer it doesn't work and I can't figure out why.
const char SayHi() { return "Hi"; } // Pointer removed
It works if I assign it a single character like this:
const char SayHi() { return 'H'; } // Pointer removed and only 1 character
But I don't know what makes it work exactly. Why would a pointer be able to hold more than one character? Isn't a pointer just a variable that points to another one? What does this point to?
That is because a char is by definition a single character (like in your third case). If you want a string, you can either use an array of chars, which decays to a pointer (const char*, like in your first case), or, the C++ way, use std::string.
Here you can read more about the "array decaying to pointer" thing.
You are correct that a pointer is just a variable that points somewhere -- in this case it points to a string of characters somewhere in memory. By convention, strings (arrays of char) end with a null character (0), so operations like strlen can terminate safely without overflowing a buffer.
As for where that particular pointer (in your first example) points to, it is pointing to the string literal "Hi" (with a null terminator at the end added by the compiler). That location is platform-dependent and is answered here.
It is also better practice to use std::string in C++ than plain C arrays of characters.
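To make the difference concrete, here is a small sketch with the three return types side by side. The function names are invented for this example.

```cpp
#include <cassert>
#include <string>

// The literal "Hi" is an array of three chars: 'H', 'i' and the terminator.
static_assert(sizeof("Hi") == 3, "two characters plus the terminator");

// Returns a pointer to the first character of that array.
const char *say_hi_pointer() { return "Hi"; }

// A char can hold exactly one character, so only 'H' fits.
char say_hi_char() { return 'H'; }

// std::string owns a copy of all the characters.
std::string say_hi_string() { return "Hi"; }

// Following the pointer gives one char; the rest are at the next addresses.
char first_char(const char *s) {
    assert(s != nullptr);
    return s[0];
}
```

The pointer itself holds only an address; the "more than one character" lives in the array it points into.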

Good manner to get char[] from another function. Starting to think in C/C++

As I understand it, good programming style says that if you want to get a string (char[]) from another function, the caller should create the char* buffer and pass it to the string-formatting function together with the buffer's length. In my case the string-formatting function is "getss".
void getss(char *ss, int& l)
{
sprintf (ss,"aaaaaaaaaa%d",1);
l=11;
}
int _tmain(int argc, _TCHAR* argv[])
{
char *f = new char [1];
int l =0;
getss(f,l);
cout<<f;
char d[50] ;
cin>> d;
return 0;
}
"getss" formats the string and returns it through ss. I thought that getss is not allowed to go outside the length of the string created by the caller. By my understanding, the caller tells the length through the variable "l", and "getss" returns the length back in case the buffer is not filled completely, but it is not allowed to go outside the array range defined by the caller.
But reality told me that it is not really so important what size of buffer the caller created. It is OK if you create a buffer of size 1 and getss fills it with 11 characters. In the output I get all the characters that "getss" has filled in.
So what is the reason to pass a length variable? You will always get a string that is zero-terminated, and you can find the end from that.
What is the reason to create a buffer with a specified length if getss can expand it?
How is it done in the real world, getting a string from another function?
Actually, the caller is the one that has allocated the buffer and knows the maximum size of the string that can fit inside. It passes that size to the function, and the function has to use it to avoid overflowing the passed buffer.
In your example, it means calling snprintf() rather than sprintf():
void getss(char *ss, int& l)
{
l = snprintf(ss, l, "aaaaaaaaaa%d", 1);
}
In C++, of course, you only have to return an instance of std::string, so that's mostly a C paradigm. Since C does not support references, the function usually returns the length of the string:
int getss(char *buffer, size_t bufsize)
{
return snprintf(buffer, bufsize, "aaaaaaaaaa%d", 1);
}
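The return value of snprintf is what lets the caller detect truncation: it is the length the full text would need, not the number of characters actually stored. A sketch under that assumption, with the hypothetical names `getss_checked` and `fits`:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Writes "aaaaaaaaaa<value>" into buffer, storing at most bufsize bytes
// including the terminator. Returns the length the complete text needs
// (excluding the terminator), or a negative value on error.
int getss_checked(char *buffer, std::size_t bufsize, int value) {
    assert(buffer != nullptr || bufsize == 0);
    return std::snprintf(buffer, bufsize, "aaaaaaaaaa%d", value);
}

// True when the whole text plus its terminator fit into the buffer.
bool fits(int needed, std::size_t bufsize) {
    return needed >= 0 && static_cast<std::size_t>(needed) < bufsize;
}
```

If `fits` reports false, the caller can allocate `needed + 1` bytes and call the function again.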
You were only lucky. sprintf() can't expand the storage the caller allocated, and unless you pass in a char array of at least length + 1 elements, expect your program to crash.
In this case you are simply lucky that there is no "important" other data after the "char*" in memory.
The C runtime does not always detect these kinds of violations reliably.
Nonetheless, you are corrupting memory here, and your program is prone to crash at any time.
Apart from that, using raw "char*" pointers is really a thing you should not do any more in "modern" C++ code.
Use STL classes (std::string, std::wstring) instead. That way you do not have to bother about memory issues like this.
In the real world, in C++ it is better to use std::string objects and std::stringstream.
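A minimal sketch of the stream-based version of getss; the function name `make_label` is made up, and std::ostringstream is used because only output is needed here.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Builds the same text as the sprintf call, but the stream grows as needed,
// so there is no buffer size to get wrong.
std::string make_label(int value) {
    std::ostringstream out;
    out << "aaaaaaaaaa" << value;
    assert(!out.fail());  // formatting into memory is not expected to fail
    return out.str();
}
```

The caller simply receives a std::string by value and never has to allocate or free anything.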
char *f = new char [1];
sprintf (ss,"aaaaaaaaaa%d",1);
Hello, buffer overflow! Use snprintf instead of sprintf in C and use C++ features in C++.
By my understanding, the caller tells the length through the variable "l", and "getss" returns the length back in case the buffer is not filled completely, but it is not allowed to go outside the array range defined by the caller.
This is spot on!
But reality told me that it is not really so important what size of buffer the caller created. It is OK if you create a buffer of size 1 and getss fills it with 11 characters. In the output I get all the characters that "getss" has filled in.
This is absolutely wrong: you invoked undefined behavior, and did not get a crash. A memory checker such as valgrind would report this behavior as an error.
So what is the reason to pass a length variable?
The length is there to avoid this kind of undefined behavior. I understand that this is rather frustrating when you do not know the length of the string being returned, but this is the only safe way of doing it that does not create questions of string ownership.
One alternative is to allocate the return value dynamically. This lets you return strings of arbitrary length, but the caller is now responsible for freeing the returned value. This is not very intuitive to the reader, because malloc and free happen in different places.
The answer in C++ is quite different, and it is a lot better: you use std::string, a class from the standard library that represents strings of arbitrary length. Objects of this class manage the memory allocated for the string, eliminating the need of calling free manually.
For C++, consider smart pointers, in your case probably a shared_ptr; it will take care of freeing the memory. Currently your program is leaking memory, since you never free the memory you allocate with new. Space allocated by new must be deallocated with delete (and space allocated by new[] with delete[]), or it will stay allocated until your program exits. This is bad: imagine your browser not freeing the memory it uses for tabs when you close them.
In the special case of strings, I would recommend what the others have said: go with a std::string. With C++11 it will be moved (not copied) when returned, and you don't need to use new and have no worries about delete.
std::string myFunc() {
std::string str;
// work with str
return str;
}
In C++ you don't have to build a string. Just output the parts separately
std::cout << "aaaaaaaaaa" << 1;
Or, if you want to save it as a string
std::string f = "aaaaaaaaaa" + std::to_string(1);
(Even though calling to_string is a bit silly for a constant value.)

CString : What does (TCHAR*)(this + 1) mean?

In the CString header file (be it Microsoft's or Open Foundation Classes - http://www.koders.com/cpp/fid035C2F57DD64DBF54840B7C00EA7105DFDAA0EBD.aspx#L77 ), there is the following code snippet
struct CStringData
{
long nRefs;
int nDataLength;
int nAllocLength;
TCHAR* data() { return (TCHAR*)(&this[1]); };
...
};
What does the (TCHAR*)(&this[1]) indicate?
The CStringData struct is used in the CString class (http://www.koders.com/cpp/fid100CC41B9D5E1056ED98FA36228968320362C4C1.aspx).
Any help is appreciated.
CString has lots of internal tricks that make it look like a normal string when passed to, e.g., printf-style functions, despite actually being a class, without having to cast it to LPCTSTR in the argument list of a varargs (...) call. Thus trying to understand a single individual trick or function in the CString implementation is bad news. (The data function is an internal function which gets the 'real' buffer associated with the string.)
There's a book, MFC Internals that goes into it, and IIRC the Blaszczak book might touch it.
EDIT: As for what the expression actually translates to in terms of raw C++:-
TCHAR* data() { return (TCHAR*)(&this[1]); };
This says: "pretend you're actually the first entry in an array of items allocated together. Now, the second item isn't actually a CStringData, it's a normal NUL-terminated buffer of either Unicode or normal characters, i.e., an LPTSTR".
Another way of expressing the same thing is:
TCHAR* data() { return (TCHAR*)(this + 1); };
When you add 1 to a pointer to T, you actually add 1 * sizeof(T) in terms of a raw memory address. So if one has a CStringData located at 0x00000010 with, say, sizeof(CStringData) = 12, data() will return a pointer to a NUL-terminated buffer of characters starting at 0x0000001C.
But just understanding this one thing out of context isn't necessarily a good idea.
Why do you need to know?
It returns the memory area that is immediately after the CStringData structure as an array of TCHAR characters.
You can understand why they are doing this if you look at the CString.cpp file:
static const struct {
CStringData data;
TCHAR ch;
} str_empty = {{-1, 0, 0}, 0};
CStringData* pData = (CStringData*)mem_alloc(sizeof(CStringData) + size*sizeof(TCHAR));
They do this trick so that the CString's character data looks like a normal buffer: when you call data(), it skips the CStringData structure and points directly at the real character buffer, just like a char*.
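The layout can be sketched in portable C++ as follows. This is a simplified illustration, not the actual MFC code: `StringHeader`, `make_string` and `destroy_string` are invented names, char replaces TCHAR, and malloc stands in for whatever allocator the library uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>

// Simplified stand-in for CStringData.
struct StringHeader {
    long refs;
    int length;
    int capacity;
    // The characters live immediately after the header, in the same block.
    char *data() { return reinterpret_cast<char *>(this + 1); }
};

// Allocates header and characters as one block, then copies the text in.
StringHeader *make_string(const char *text) {
    assert(text != nullptr);
    const std::size_t len = std::strlen(text);
    void *block = std::malloc(sizeof(StringHeader) + len + 1);
    if (block == nullptr) {
        return nullptr;
    }
    StringHeader *header = new (block)
        StringHeader{1, static_cast<int>(len), static_cast<int>(len)};
    std::memcpy(header->data(), text, len + 1);  // includes the terminator
    return header;
}

// One allocation, so one free releases header and characters together.
void destroy_string(StringHeader *header) { std::free(header); }
```

The pointer returned by data() is exactly sizeof(StringHeader) bytes past the start of the block, which is what `this + 1` computes.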

Pass an element from C type string array to a COM object as BSTR? (in C++)

I am writing a C++ DLL that is called by an external program.
1.) I take an array of strings (as char *var) as an argument from this program.
2.) I want to iterate through this array and call a COM function on each element of the string array. The COM function must take a BSTR:
DLL_EXPORT(void) runUnitModel(char *rateMaterialTypeNames) {
HRESULT hr = CoInitialize(NULL);
// Create the interface pointer.
IUnitModelPtr pIUnit(__uuidof(BlastFurnaceUnitModel));
pIUnit->initialiseUnitModel();
int i;
for(i=0; i < sizeOfPortRatesArray; i++)
pIUnit->createPort(SysAllocString(BSTR((const char *)rateMaterialTypeNames[i])));
I think it's the SysAllocString(BSTR((const char *)rateMaterialTypeNames[i])) bit that is giving me problems. I get an access violation when the program runs.
Is this the right way to access the value of the rateMaterialTypeName at i? Note I am expecting something like "IronOre" as the value at i, not a single character.
If you're using Microsoft's ATL, you can use the CComBSTR class.
It will accept a char* and create a BSTR from it. Also, you don't need to worry about freeing the BSTR; that happens in the destructor of CComBSTR.
Also, see Matthew Xavier's answer; it doesn't look like you're passing your array of strings into that function properly.
Hope this helps
Because a variable holding a C string is just a pointer to the first element (a char*), in order to pass an array of C strings, the parameter to your function should be a char**:
DLL_EXPORT(void) runUnitModel(char **rateMaterialTypeNames)
This way, when you evaluate rateMaterialTypeNames[i], the result will be a char*, which is the parameter type you need to pass to SysAllocString().
Added note: you will also need to convert the strings to wide chars at some point, as Tommy Hui's answer points out.
If the parameter to the function rateMaterialTypeNames is a string, then
rateMaterialTypeNames[i]
is a character and not a string. You should use just the parameter name itself.
In addition, casts in general are bad. The cast to a BSTR is a big red flag. The parameter type for SysAllocString is
const OLECHAR*
which on Win32 is a pointer to wide characters. So this will definitely fail, because the actual parameter is a char*.
What the code needs is a conversion of narrow string to a wide string.
const OLECHAR* pOleChar = A2COLE( *pChar );
BSTR str = SysAllocString( pOleChar );
// do something with the 'str'
SysFreeString( str ); // need to cleanup the allocated BSTR
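Putting the two points from the answers together (index a char** to get whole strings, then widen each one), here is a portable sketch that avoids the COM calls entirely. The names `widen_ascii` and `widen_all` are hypothetical, and the conversion assumes plain ASCII input; real Windows code would use a proper conversion such as the ATL macros or MultiByteToWideChar.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Widens an ASCII-only narrow string one character at a time.
// Assumption: no multi-byte encodings are involved.
std::wstring widen_ascii(const char *narrow) {
    assert(narrow != nullptr);
    std::wstring wide;
    for (const char *p = narrow; *p != '\0'; ++p) {
        wide.push_back(static_cast<wchar_t>(static_cast<unsigned char>(*p)));
    }
    return wide;
}

// names[i] is a whole string (a const char *), not a single character,
// because names is a pointer to pointers.
std::vector<std::wstring> widen_all(const char *const *names,
                                    std::size_t count) {
    std::vector<std::wstring> result;
    result.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        result.push_back(widen_ascii(names[i]));
    }
    return result;
}
```

Each resulting std::wstring could then be passed via c_str() to an API that expects wide characters.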