Getting a char* from a _variant_t in optimal time - c++

Here's the code I want to speed up. It's getting a value from an ADO recordset and converting it to a char*. But this is slow. Can I skip the creation of the _bstr_t?
_variant_t var = pRs->Fields->GetItem(i)->GetValue();
if (V_VT(&var) == VT_BSTR)
{
const char* p = (const char*) (_bstr_t) var;

The four bytes immediately before the address a BSTR points to hold its length in bytes. You can loop through and take every other byte if the data is Unicode (this only works for ASCII-range text), or every byte if it is multibyte. A memcpy or some similar method would work too. IIRC, this can be faster than W2A or casting with (LPCSTR)(_bstr_t).
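The "every other byte" idea can be sketched portably as follows. This is only a minimal sketch under stated assumptions, not ADO or COM code: narrow_ascii is a hypothetical helper, char16_t stands in for the wide characters of a BSTR, the caller supplies the length in characters, and the text is assumed to be plain ASCII (anything else becomes '?').

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: narrows a UTF-16 buffer of known length to a byte string.
// Assumes ASCII content; any code unit above 0x7F is replaced with '?'.
std::string narrow_ascii(const char16_t* wide, std::size_t length)
{
    assert(wide != nullptr || length == 0);
    std::string narrow;
    narrow.reserve(length);                 // one allocation, no regrowth
    for (std::size_t i = 0; i < length; ++i) {
        const char16_t unit = wide[i];
        narrow.push_back(unit <= 0x7F ? static_cast<char>(unit) : '?');
    }
    return narrow;
}
```

If the data can contain non-ASCII characters, a real code page conversion is needed instead of this shortcut.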

Your problem (other than the possibility of a memory copy inside _bstr_t) is that you're converting the UNICODE BSTR into an ANSI char*.
You can use the USES_CONVERSION macros which perform the conversion on the stack, so they might be faster. Alternatively, keep the BSTR value as unicode if possible.
To convert:
USES_CONVERSION;
char* p = strdup(OLE2A(var.bstrVal));
// ...
free(p);
Remember: the string returned by OLE2A (and its sister macros) is allocated on the stack. Once you return from the enclosing function it is garbage unless you have copied it (and you must free the copy eventually, obviously).

This creates a temporary on the stack:
USES_CONVERSION;
char *p=W2A(var.bstrVal);
This uses a slightly newer syntax and is probably more robust. It has a configurable size, beyond which it will use the heap so it avoids putting massive strings onto the stack:
CW2AEX<> converted(var.bstrVal); // named object: the buffer lives as long as 'converted' does
char *p=converted;

_variant_t var = pRs->Fields->GetItem(i)->GetValue();
You can also make this assignment quicker by avoiding the Fields collection altogether. You should only use the Fields collection when you need to retrieve the item by name. If you know the fields by index you can use this instead.
_variant_t vara = pRs->Collect[i];
Note i cannot be an integer as ADO does not support VT_INTEGER, so you might as well use a long variable.

Ok, my C++ is getting a little rusty... but I don't think the conversion is your problem. That conversion doesn't really do anything except tell the compiler to consider _bstr_t a char*. Then you're just assigning the address of that pointer to p. Nothing's actually being "done."
Are you sure it's not just slow getting stuff from GetValue?
Or is my C++ rustier than I think...

Related

Using, StringCchCat

I'm trying to use the StringCchCat function:
HRESULT X;
LPWSTR _strOutput = new wchar_t[100];
LPCWSTR Y =L"Sample Text";
X = StringCchCat(_strOutput, 100, Y);
But for some reason I keep getting the "E_INVALIDARG One or more arguments are invalid." error from X. _strOutput is also full of random characters.
This is actually part of a bigger program. What I'm trying to do is concatenate the "Sample Text" to the empty _strOutput variable. This is inside a loop, so it is going to happen multiple times. For this particular example it will be as if I'm assigning the text "Sample Text" to _strOutput.
Any Ideas?
If it's part of a loop, a simple *_strOutput = 0; will fix your issue.
If you're instead trying to copy a string, not concatenate it, there's a special function that does this for you: StringCchCopy.
Edit: As an aside, if you're using the TCHAR version of the API (and you are), you should declare your strings as TCHAR arrays (ie LPTSTR instead of LPWSTR, and _T("") instead of L""). This would keep your code at least mildly portable.
String copy/concat functions look for null terminators to know where to copy/concat to. You need to initialize the first element of _strOutput to zero so the buffer is null terminated, then you can copy/concat values to it as needed:
LPWSTR _strOutput = new wchar_t[100];
_strOutput[0] = L'\0'; // <-- add this
X = StringCchCat(_strOutput, 100, Y);
I'm writing this answer to notify you (so you see the red 1 at the top of any Stack Overflow page) because you had the same bug yesterday (in your message box) and I now realize I neglected to say this in my answer yesterday.
Keep in mind that the new[] operator on a built-in type like WCHAR or int does NOT initialize the data at all. The memory you get will have whatever garbage was there before the call to new[], whatever that is. The same happens if you say WCHAR x[100]; as a local variable. You must be careful to initialize data before using it. Compilers are usually good at warning you about this. (I believe C++ objects have their constructors called for each element, so that won't give you an error... unless you forget to initialize something in the class, of course. It's been a while.)
In many cases you'll want everything to be zeroes. The '\0'/L'\0' character is also a zero. The Windows API has a function ZeroMemory() that's a shortcut for filling memory with zeroes:
ZeroMemory(array, size of array in bytes)
So to initialize a WCHAR str[100] you can say
ZeroMemory(str, 100 * sizeof (WCHAR));
where the sizeof (WCHAR) turns 100 WCHARs into its equivalent byte count.
As the other answers say, simply setting the first character of a string to zero will be sufficient for a string. Your choice.
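To make the initialization point concrete, here is a minimal portable sketch. It does not use StringCchCat itself: make_empty_buffer and append_text are hypothetical helpers, with append_text standing in for a bounded concatenation. The point is that new wchar_t[n]() (note the trailing parentheses) value-initializes the array, so the buffer starts out as a valid empty string.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <memory>

// new wchar_t[n]() value-initializes: every element is L'\0',
// so the buffer is a valid empty string from the start.
std::unique_ptr<wchar_t[]> make_empty_buffer(std::size_t capacity)
{
    assert(capacity > 0);
    return std::unique_ptr<wchar_t[]>(new wchar_t[capacity]());
}

// Hypothetical bounded append (a portable stand-in, not the Windows API).
// Returns false and leaves dest unchanged if src does not fit.
bool append_text(wchar_t* dest, std::size_t capacity, const wchar_t* src)
{
    const std::size_t used = std::wcslen(dest);   // requires dest to be terminated
    const std::size_t extra = std::wcslen(src);
    if (used + extra + 1 > capacity) {
        return false;
    }
    std::wmemcpy(dest + used, src, extra + 1);    // +1 copies the terminator too
    return true;
}
```

Inside a loop, resetting the buffer with dest[0] = L'\0' before each round gives the same "empty string" starting state without reallocating.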
Also just to make sure: have you read the other answers to your other question? They are more geared toward the task you were trying to do (and I'm not at all knowledgeable on the process APIs; I just checked the docs for my answer).

getline(cin,_string);

I know that getline(cin,_string); works perfectly
but this doesn't:
char* _chArr = new char;
getline(cin,_chArr);
Even this doesn't work:
char* _chArr = new char[30];
getline(cin,_chArr);
Isn't char* a string??
"Isn't char* a string?"
No, it's a pointer to a char and that's that. The function std::getline does some cool stuff (extending the string and all) that can't be done easily on a char *.
Well, think of it logically. A char* is just a pointer to a block of memory holding characters. You have to allocate some memory for it and then copy data into it using strcpy() or manually. std::getline does not accept a char*; it reads only into a std::string. Strings are in fact objects which store their size within themselves. They were designed by experts in the industry, who provided direct input and dynamic growth as built-in functionality.
There is a difference between string and cstring. Cstring is the char*.
No, C++ strings are not just character arrays, they are a full blown class, usually with quite a bit of extra stuff under the covers, over and above what a character array provides.
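For comparison, here is a small sketch of both ways to read a line. The free function std::getline reads into a std::string, while the member function istream::getline fills a caller-supplied char buffer. The function names read_line_into_string and read_line_into_buffer are made up for this illustration, and the stream is passed in so that std::cin or a string stream can be used.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this without a console
#include <string>

// std::getline (free function) grows the std::string as needed.
std::string read_line_into_string(std::istream& in)
{
    std::string line;
    std::getline(in, line);
    return line;
}

// istream::getline (member function) fills a caller-supplied char buffer.
// It stores at most size - 1 characters plus a terminating '\0'.
// Returns false if reading failed, e.g. because the line did not fit.
bool read_line_into_buffer(std::istream& in, char* buffer, std::size_t size)
{
    assert(buffer != nullptr && size > 0);
    in.getline(buffer, static_cast<std::streamsize>(size));
    return static_cast<bool>(in);
}
```

With a char buffer of 30, as in the question, the call would be read_line_into_buffer(std::cin, _chArr, 30); the std::string version has no such size limit to manage.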

What is the internal structure of an object of the (EDIT: MFC) CString class?

I need to strncpy() (effectively) from an MFC CString object to a C string variable. It's well known that strncpy() sometimes fails (depending on the source length and the length specified in the call) to terminate the destination C string correctly. To avoid that evil, I'm thinking of storing a NUL char inside the CString source object and then using strcpy() or memmove() on it.
Is this a reasonable way to go about it? If so, what must I manipulate inside the CString object? If not, then what's an alternative that will guarantee a properly-terminated destination C string?
strncpy() only "fails" to null-terminate the destination string when the source string is longer than the length limit you specify. You can ensure that the destination is null-terminated by setting its last character to null yourself. For example:
#define DEST_STR_LEN 10
char dest_str[DEST_STR_LEN + 1]; // +1 for the null
strncpy(dest_str, src_str, DEST_STR_LEN);
dest_str[DEST_STR_LEN] = '\0';
If src_str is more than DEST_STR_LEN characters long, dest_str will be a properly-terminated string of DEST_STR_LEN characters. If src_str is shorter than that, strncpy() will put a null terminator somewhere within dest_str, so the null at the very end is irrelevant and harmless.
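The same pattern can be wrapped in a small function so the termination step is never forgotten. This is a minimal sketch; copy_terminated is a hypothetical name, and dest_size is assumed to be the full size of the destination buffer, including room for the null.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies src into dest, truncating if necessary,
// and always leaves dest null-terminated.
void copy_terminated(char* dest, std::size_t dest_size, const char* src)
{
    assert(dest != nullptr && src != nullptr && dest_size > 0);
    std::strncpy(dest, src, dest_size - 1);  // may leave dest unterminated
    dest[dest_size - 1] = '\0';              // so terminate it explicitly
}
```

With a CString source in a non-Unicode build, the src argument would be the pointer obtained from the string, for example via GetString().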
CSimpleStringT::GetString gives a pointer to a null-terminated string. Use this as the source for strncpy. As this is C++, you should only use C-style strings when interfacing with legacy APIs. Use std::string instead.
One of the alternative ways would be to zero string first and then cast or memcpy from CString.
I hope they haven't changed since I used them; that was many years ago :)
They used an interesting 'trick' to handle the refcount and the very fast and efficient automatic conversion to char*: the object holds a pointer that is usable as an LPCSTR, and some bytes just before the pointed-to characters are reserved to keep the implementation state.
So the struct can be used with the older Windows API (as an LPCSTR, without overhead). I found the idea interesting at the time!
Of course the key is the allocator: it simply offsets the pointer when mallocing/freeing.
I remember there was a buffer request to (for instance) modify the data available: GetBuffer(0), followed by ReleaseBuffer().
HTH
If you are not compiling with _UNICODE enabled, then you can get a const char * from a CString very easily. Just cast it to an LPCTSTR:
CString myString("stuff");
const char *byteString = (LPCTSTR)myString;
This is guaranteed to be NULL-terminated.
If you have built with _UNICODE, then CString is a UTF-16 encoded string. You can't really do anything directly with that.
If you do need to copy the data from the CString, this is very easy, even using C-style code. Just make sure that you allocate sufficient memory and copy the right length:
CString myString("stuff");
char *outString = (char*)malloc(myString.GetLength() + 1);
strncpy(outString, (LPCTSTR)myString, myString.GetLength() + 1); // +1 copies the terminating null
CString ends with NULL so as long as your text is correct (no NULL characters inside) then copying should be safe. You can write:
char szStr[256];
strncpy(szStr, (LPCSTR) String, 3);
szStr[3]='\0'; // because no null character is implicitly appended to the end of the destination
If you store a null somewhere inside the CString object you will probably cause yourself more problems, because CString stores its length internally.
Another alternative solution would rather involve support from CPU or compiler, as it's much better approach - simply make sure that when copying memory in "safe" mode, at any time after every atomic operation there is zero added on the end, so when whole loop fails, the destination string will still be terminated, without need to zero it fully before making copy.
There could be also support for fast zero - just mark start and stop of zeroed region and it's instantly cleared in RAM, this would make things a lot easier.

Strange error in variable values C++

I have used this code. Here a string is present from location starting from 4 and length of string is 14. All these calculations are done prior to this code. I am pasting a small snippet of the error containing code.
void *data = malloc(4096);
int len = 14;
int fileptr = 4;
string str;
cout<<len<<endl;
cout<<fileptr<<endl;
memcpy(&str, (char *)data+fileptr, len);
cout<<len<<endl;
cout<<fileptr<<endl;
The output I get is:
14
4
4012176
2009288233
Here I am reading the string "System Catalog" from memory. It's displaying the string correctly, but the values of fileptr and len change abruptly after the call to memcpy().
string is not the same as a char*. string is an object. So you can't just memcpy() data to it. So the behavior of this code is undefined.
In your case, you are copying 14 bytes of junk data into str and corrupting the stack.
The result is that you are overwriting both len and fileptr with junk from the malloc().
I'm not sure exactly what you're trying to do, but if you want to create a string, you should do it like this:
string str = "System Catalog";
A string is an object and is not just a sequence of bytes. You cannot just memcpy over it from raw memory.
My guess is that in your code the str variable is allocated before other variables in stack memory and memcpy-ing over it you are overwriting them.
Note that your phrase "It's displaying the string correctly" has the seed of a common misconception about C++ in it.
When you do bad things in C++ (e.g. writing bytes over an object) you should expect the worst possible behavior. The worst possible behavior however is NOT an ugly result, a crash or a runtime error... but something that seems to work but that has bad consequences in the future.
You want to assign this many characters from that char pointer into a std::string, so you should look at what facilities a string object provides for doing that rather than hitting it over the head with memcpy(). As others have noted, memcpy() is for use in low-level C-style code, not for interacting with C++ objects.
In particular, you should study the assignment methods provided by std::string, one of which does exactly what you want -- which isn't a coincidence.
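As a rough sketch of what that looks like, the overload assign(const char*, count) copies a given number of characters from a pointer. The helper name string_from_buffer and its parameters are made up for this illustration; they mirror the data, fileptr and len variables from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: builds a std::string from `length` bytes that start
// `offset` bytes into a raw buffer. assign() copies the bytes into storage
// the string owns, instead of overwriting the string object itself.
std::string string_from_buffer(const void* data, std::size_t offset, std::size_t length)
{
    assert(data != nullptr);
    const char* bytes = static_cast<const char*>(data) + offset;
    std::string result;
    result.assign(bytes, length);
    return result;
}
```

In the question's terms this would be called as string_from_buffer(data, fileptr, len), and len and fileptr stay intact because nothing is written over the string object.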
string is an object - please look up the semantics for it. Why are you doing this and what are you trying to achieve?
If for some reason you actually MUST use memcpy you can get the Internal address of the string to copy to (provided the string is big enough to contain the information you want to copy)
static_cast < char * >(&(str[0]));
But this is VERY VERY BAD. If you use it, I'm quite sure there are more crazy things going on in your code :-)

How do I find the memory address of a string?

I am having a mental block and I know I should know this but I need a little help.
If I declare a string variable like this:
string word = "Hello";
How do I find the memory address of "Hello"?
Edit: This is what I am trying to do...
Write a function that takes one argument, the address of a string, and prints that string once. (Note: you will need to use a pointer to complete this part.)
However, if a second argument, type int, is provided and is nonzero, the function should print the string a number of times equal to the number of times that function has been called at that point. (Note that the number of times the string is printed is not equal to the value of the second argument; it is equal to the number of times the function has been called so far.)
Use either:
std::string::data() if your data isn't null-terminated c-string like.
or
std::string::c_str() if you want the data and be guaranteed to get the null-termination.
Note that the pointer returned by either of these calls doesn't have to be the underlying data the std::string object is manipulating.
Taking the address of the first character is the usual way to do it: &word[0]. However, needing to do this when you're not working with legacy code is usually a sign that you're doing something wrong.
I guess you want a pointer to a plain old C-string? Then use word.c_str(). Note that this is not guaranteed to point to the internal storage of the string, it's just a (constant) C-string version you can work with.
You can use the c_str() function to get a pointer to the C string (const char *); however note that the pointer is invalidated whenever you modify the string; you have to invoke c_str() again as the old string may have been deallocated.
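A quick way to see how these options relate is to compare the pointers they return. The sketch below uses a hypothetical function name, addresses_agree. Since C++11, std::string stores its characters contiguously and c_str(), data() and &word[0] all refer to the same storage; older standards did not promise this, which is where the caveats about internal storage come from.

```cpp
#include <cassert>
#include <string>

// Returns true when the three ways of obtaining the character address agree.
// Since C++11 they are required to agree for a non-empty std::string.
bool addresses_agree(const std::string& word)
{
    assert(!word.empty());
    return word.c_str() == word.data() && word.data() == &word[0];
}
```

Whichever call is used, the pointer should be fetched again after any modification of the string.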
OK, so, I know this question is old but I feel like the obvious answer here is actually:
std::string your_string{"Hello"};
//Take the address of the beginning
auto start_address = &(*your_string.begin());
//Take the address one past the last character (dereferencing end() itself is undefined behavior)
auto end_address = start_address + your_string.size();
In essence this will accomplish the same thing as using:
auto start_address = your_string.c_str();
auto end_address = your_string.c_str() + strlen(your_string.c_str());
However I would prefer the first approach (taking the address of the dereferenced iterator) because:
a) Guaranteed to work with begin/end compatible containers which might not have the c_str method. So for example, if you decided you wanted a QString (the string Qt uses) or Aws::String or std::vector to hold your characters, the c_str() approach wouldn't work but the one above would.
b) Possibly not as costly as c_str()... which generally speaking should be implemented similarly to the call I made in the second line of code to get the address but isn't guaranteed to be implemented that way (E.g. if the string you are using is not null terminated it might require the reallocation and mutation of your whole string in order to add the null terminator... which will suddenly make it thread unsafe and very costly, this is not the case for std::string but might be for other types of strings)
c) It communicates intent better, in the end what you want is an address and the first bunch of code expresses exactly that.
So I'd say this approach is better for:
Clarity, compatibility and efficiency.
Edit:
Note that with either approach, if you change your initial string after taking the address, the address will be invalidated (since the string might be reallocated). The compiler will not warn you about this and it could cause some very nasty bugs :/
Declare a pointer to the variable and then view it how you would.
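For the exercise quoted in the question, the "address of a string" is simply &word, which has type std::string*. One possible sketch is below. The name print_string, the returned count and the explicit output stream parameter are assumptions added for illustration (the exercise itself just prints); a static local variable keeps the call count between calls.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>   // std::ostringstream, handy for capturing the output
#include <string>

// Takes the address of a string and prints it once, or, when repeat_flag is
// nonzero, once for every call made so far (including this one).
// Returns the number of times the string was printed.
int print_string(const std::string* text, std::ostream& out, int repeat_flag = 0)
{
    static int call_count = 0;   // persists across calls
    assert(text != nullptr);
    ++call_count;
    const int times = (repeat_flag != 0) ? call_count : 1;
    for (int i = 0; i < times; ++i) {
        out << *text << '\n';
    }
    return times;
}
```

A call would look like print_string(&word, std::cout) or print_string(&word, std::cout, 1).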