Convert Platform::Array<byte> to String - c++

I have a function in C++ from a library that reads a resource and returns Platform::Array<byte>^.
How can I convert this into a Platform::String or a std::string?
BasicReaderWriter^ m_basicReaderWriter = ref new BasicReaderWriter()
Platform::Array<byte>^ data = m_basicReaderWriter ("file.txt")
I need a Platform::String from data

If your Platform::Array<byte>^ data contains an ASCII string (as you clarified in a comment to your question), you can convert it to std::string using the appropriate std::string constructor overloads (note that Platform::Array offers STL-like begin() and end() methods):
// Using std::string's range constructor
std::string s( data->begin(), data->end() );
// Using std::string's buffer pointer + length constructor
std::string s( data->begin(), data->Length );
Unlike std::string, Platform::String contains Unicode UTF-16 (wchar_t) strings, so you need a conversion from your original byte array containing the ANSI string to Unicode string. You can perform this conversion using ATL conversion helper class CA2W (which wraps calls to Win32 API MultiByteToWideChar()).
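Then you can use the Platform::String constructor taking a raw UTF-16 character pointer. CA2W expects a NUL-terminated string, which the raw byte array is not guaranteed to be, so convert from the std::string built above rather than directly from data->begin():
Platform::String^ str = ref new Platform::String( CA2W( s.c_str() ) );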
Note:
I currently don't have VS2012 available, so I haven't tested this code with the C++/CX compiler. If you get some argument matching errors, you may want to consider reinterpret_cast<const char*> to convert from the byte * pointer returned by data->begin() to a char * pointer (and similar for data->end()), e.g.
std::string s( reinterpret_cast<const char*>(data->begin()), data->Length );
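For readers without a C++/CX toolchain, here is a minimal portable sketch of the same two conversions. ByteBuffer is a hypothetical stand-in for Platform::Array<byte>, and the widening function is only valid under the assumption stated in the question (pure 7-bit ASCII); it is not a replacement for CA2W or MultiByteToWideChar() on other encodings.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for Platform::Array<byte>: a plain byte buffer.
using ByteBuffer = std::vector<unsigned char>;

// Narrow copy: the range constructor converts each byte to char.
std::string BytesToString(const ByteBuffer& data)
{
    return std::string(data.begin(), data.end());
}

// Wide copy, valid for 7-bit ASCII only: every ASCII byte has the same
// numeric value as its wide code unit, so widening is a plain copy.
std::wstring AsciiBytesToWString(const ByteBuffer& data)
{
    std::wstring out;
    out.reserve(data.size());
    for (unsigned char b : data) {
        assert(b < 0x80 && "input must be pure ASCII");
        out.push_back(static_cast<wchar_t>(b));
    }
    return out;
}
```

Both functions use the buffer's length, so neither depends on the bytes being NUL-terminated. On Windows, the c_str() of the resulting std::wstring could then be handed to the Platform::String constructor mentioned above.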

Related

Using rapidjson and ATL CString

I am attempting to use the rapidjson library with Microsoft ATL CString type, as shown in the example below.
#include "stdafx.h"
#include "rapidjson\document.h"
using namespace rapidjson;
typedef GenericDocument<UTF16<> > WDocument;
int main()
{
WDocument document;
CString hello = _T("Hello");
document.SetObject();
document.AddMember(_T("Hello"), hello, document.GetAllocator());
return 0;
}
This fails with the compiler error
'rapidjson::GenericValue::GenericValue(rapidjson::GenericValue &&)': cannot convert argument 1 from 'CString' to 'rapidjson::Type' rapidjson document.h 1020
which does imply that a conversion between CString and a format which rapidjson would need is required. I know that rapidjson internally uses wchar_t as the encoding for the UTF16 version of its functions, however I am not sure how to convert a CString to a wchar_t (or array of wchar_t) in a way that rapidjson will be able to use the string as it uses strings defined by the _T macro.
I have looked at the msdn resources on converting between string types here but this only gives a way to return a pointer to the first member of an array of wchar_t, which rapidjson cannot then use.
The correct way to do this is to use one of the constructors rapidjson provides for its GenericValue class, namely the constructor for a pointer to a character encoding type and a character length.
GenericValue(const Ch* s, SizeType length) RAPIDJSON_NOEXCEPT : data_(), flags_() { SetStringRaw(StringRef(s, length)); }
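This constructor can take a pointer to any of the character types which rapidjson accepts, along with a length, and read this into a value. Note that it only stores a reference to the characters rather than copying them (it goes through StringRef), so the source buffer must outlive the value; rapidjson also offers an overload taking an allocator that copies the string. For the ATL::CString class, the pointer and length are available through the .GetString() and .GetLength() methods on a CString object. A function to return a Value which can be used in a DOM tree would look like this: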
typedef GenericValue<UTF16<> > WValue;
WValue CStringToRapidjsonValue(CString in)
{
WValue out(in.GetString(), in.GetLength());
return out;
}

Deep copy of TCHAR array is truncated

I've created a class to test some functionality I need to use. Essentially, the class takes a deep copy of the passed-in string and makes it available via a getter. I am using Visual Studio 2012, and Unicode is enabled in the project settings.
The problem is that the memcpy operation yields a truncated string. The output looks like this:
THISISATEST: InstanceDataConstructor: Testing testing 123
Testing te_READY
where the first line is the check of the passed-in TCHAR* string and the second line is the output after populating the allocated memory with the memcpy operation. The expected output is "Testing testing 123".
Can anyone explain what is wrong here?
N.B. Got the #ifndef UNICODE typedefs from here: how-to-convert-tchar-array-to-stdstring
#ifndef INSTANCE_DATA_H//if not defined already
#define INSTANCE_DATA_H//then define it
#include <string>
//TCHAR is just a typedef, that depending on your compilation configuration, either defaults to char or wchar.
//Standard Template Library supports both ASCII (with std::string) and wide character sets (with std::wstring).
//All you need to do is to typedef String as either std::string or std::wstring depending on your compilation configuration.
//To maintain flexibility you can use the following code:
#ifndef UNICODE
typedef std::string String;
#else
typedef std::wstring String;
#endif
//Now you may use String in your code and let the compiler handle the nasty parts. String will now have constructors that lets you convert TCHAR to std::string or std::wstring.
class InstanceData
{
public:
InstanceData(TCHAR* strIn) : strMessage(strIn)//constructor
{
//Check to passed in string
String outMsg(L"THISISATEST: InstanceDataConstructor: ");//L for wide character string literal
outMsg += strMessage;//concatenate message
const wchar_t* finalMsg = outMsg.c_str();//prepare for outputting
OutputDebugStringW(finalMsg);//print the message
//Prepare TCHAR dynamic array. Deep copy.
charArrayPtr = new TCHAR[strMessage.size() +1];
charArrayPtr[strMessage.size()] = 0;//null terminate
std::memcpy(charArrayPtr, strMessage.data(), strMessage.size());//copy characters from array pointed to by the passed in TCHAR*.
OutputDebugStringW(charArrayPtr);//print the copied message to check
}
~InstanceData()//destructor
{
delete[] charArrayPtr;
}
//Getter
TCHAR* getMessage() const
{
return charArrayPtr;
}
private:
TCHAR* charArrayPtr;
String strMessage;//is used to conveniently ascertain the length of the passed in underlying TCHAR array.
};
#endif//header guard
A solution without all of the dynamically allocated memory.
#include <tchar.h>
#include <vector>
//...
class InstanceData
{
public:
InstanceData(TCHAR* strIn) : strMessage(strIn)
{
charArrayPtr.insert(charArrayPtr.begin(), strMessage.begin(), strMessage.end());
charArrayPtr.push_back(0);
}
TCHAR* getMessage()
{ return &charArrayPtr[0]; }
private:
String strMessage;
std::vector<TCHAR> charArrayPtr;
};
This does what your class does; the major difference is that it contains no hand-rolled dynamic allocation code. The class is also safely copyable, unlike the code with the dynamic allocation (which lacked a user-defined copy constructor and assignment operator).
The std::vector class has superseded new[]/delete[] in almost all circumstances, because vector stores its data in contiguous memory, no different than calling new[].
Please pay attention to the following lines in your code:
// Prepare TCHAR dynamic array. Deep copy.
charArrayPtr = new TCHAR[strMessage.size() + 1];
charArrayPtr[strMessage.size()] = 0; // null terminate
// Copy characters from array pointed to by the passed in TCHAR*.
std::memcpy(charArrayPtr, strMessage.data(), strMessage.size());
The third argument to pass to memcpy() is the count of bytes to copy.
If the string is a simple ASCII string stored in std::string, then the count of bytes is the same as the count of ASCII characters.
But, if the string is a wchar_t Unicode UTF-16 string, then each wchar_t occupies 2 bytes in Visual C++ (with GCC things are different, but this is a Windows Win32/C++ code compiled with VC++, so let's just focus on VC++).
So, you have to properly scale the size count for memcpy(), considering the proper size of a wchar_t, e.g.:
memcpy(charArrayPtr, strMessage.data(), strMessage.size() * sizeof(TCHAR));
So, if you compile in Unicode (UTF-16) mode, then TCHAR is expanded to wchar_t, and sizeof(wchar_t) is 2, so the content of your original string should be properly deep-copied.
As an alternative, for Unicode UTF-16 strings in VC++ you may also use wmemcpy(), which considers wchar_t as its "unit of copy". So, in this case, you don't have to scale the size factor by sizeof(wchar_t).
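To make the bytes-versus-elements distinction concrete, here is a small portable sketch of both copies side by side. The function names are made up for illustration, and std::wstring is used instead of TCHAR so that it builds outside Visual C++.

```cpp
#include <cassert>
#include <cstring>
#include <cwchar>
#include <memory>
#include <string>

// Deep-copies a wide string with memcpy; the count is in BYTES,
// so the character count is scaled by sizeof(wchar_t).
std::unique_ptr<wchar_t[]> CopyWithMemcpy(const std::wstring& src)
{
    std::unique_ptr<wchar_t[]> dst(new wchar_t[src.size() + 1]);
    std::memcpy(dst.get(), src.data(), src.size() * sizeof(wchar_t));
    dst[src.size()] = L'\0'; // null terminate
    return dst;
}

// Same copy with wmemcpy; the count is in wchar_t ELEMENTS,
// so no scaling is needed.
std::unique_ptr<wchar_t[]> CopyWithWmemcpy(const std::wstring& src)
{
    std::unique_ptr<wchar_t[]> dst(new wchar_t[src.size() + 1]);
    std::wmemcpy(dst.get(), src.data(), src.size());
    dst[src.size()] = L'\0'; // null terminate
    return dst;
}
```

Passing the unscaled src.size() to memcpy here would copy only a fraction of the characters, which is the kind of truncation described in the question.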
As a side note, in your constructor you have:
InstanceData(TCHAR* strIn) : strMessage(strIn)//constructor
Since strIn is an input string parameter, consider passing it as a pointer to const, i.e.:
InstanceData(const TCHAR* strIn)

C++ using std::string, std::wstring as a buffer

When using WinAPI, you often encounter functions that take an LPWSTR or LPSTR parameter. Sometimes this pointer is actually a pointer to an output buffer, for example:
int GetWindowTextW(HWND hWnd, LPWSTR lpString, int nMaxCount);
Is it a good idea to use std::wstring for such buffers, in the particular case where I need to produce a std::wstring as the result and cannot replace it with, for example, vector<wchar_t>?
std::wstring myWrapper(HWND hWnd){
auto desiredBufferSize = GetWindowTextLengthW(hWnd);
std::wstring resultWstr;
resultWstr.resize(desiredBufferSize);
auto ret = GetWindowText(hWnd,
const_cast<wchar_t*>(resultWstr.data()), // const_cast
resultWstr.size());
// handle return code
return resultWstr;
}
Both the data() and c_str() string methods return a const pointer, so we must use const_cast to remove constness, which is sometimes a bad sign. Is it a good idea in this case? Can I do better?
Using std::string as a C string:
Conversion from const char* to std::string is implicit, but not the other way around.
The character '\0' is not special for std::string.
Use &s[0] for write access. Make sure the string's size (not just its capacity) is big enough for C-style writing.
Use s.c_str() for read-only access. The pointer is valid only until the next call of a non-const member function.
Code sample:
const int MAX_BUFFER_SIZE = 30;         // Including NULL terminator.         
string s(MAX_BUFFER_SIZE, '\0');      // Allocate enough space, NULL terminated
strcpy(&s[0], "This is source string.");    // Write, C++11 only (VS2010 OK)
printf("C str: '%s'\n", s.c_str());     // Read only: Use const whenever possible.
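Applied to the question's situation, the same idea can be sketched without const_cast: size the string first, let the C-style function write through &s[0], then shrink the string to what was actually written. FillBuffer below is a hypothetical stand-in for a function such as GetWindowTextW (it writes a fixed text), so this only illustrates the pattern, not the WinAPI call itself.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical C-style API: writes at most bufSize chars (including the
// terminator) and returns the number of chars written, excluding it.
int FillBuffer(char* buf, int bufSize)
{
    int n = std::snprintf(buf, static_cast<std::size_t>(bufSize), "%s", "window title");
    if (n < 0) return 0;                  // encoding error: report nothing written
    return n < bufSize ? n : bufSize - 1; // snprintf reports the untruncated length
}

std::string ReadIntoString(int maxChars)
{
    assert(maxChars >= 0);
    // Size (not just capacity) must cover the text plus the terminator.
    std::string s(static_cast<std::size_t>(maxChars) + 1, '\0');
    int written = FillBuffer(&s[0], static_cast<int>(s.size()));
    s.resize(static_cast<std::size_t>(written)); // drop the unused tail
    return s;
}
```

The final resize matters: without it the returned string would keep trailing '\0' characters as part of its contents.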
It's tempting to go for the nice standard wstring. However, it's never good to cast away const...
Here is a temporary string wrapper that automatically creates a buffer, passes its pointer to the WinAPI function, copies the content of the buffer to your string, and disappears cleanly:
auto ret = GetWindowText(hWnd,
tmpstr (resultWstr, desiredBufferSize),
resultWstr.size());
This solution works with any Windows API function that writes to a character pointer before it returns (i.e. not asynchronously).
How does it work?
It's based on C++ standard §12.2 point 3 : "Temporary objects are destroyed as the last step in evaluating the full-expression that (lexically) contains the point where they were created. (...) The value computations and side effects of destroying a temporary object are associated only with the full-expression, not with any specific subexpression.".
Here is its implementation:
typedef std::basic_string<TCHAR> tstring; // based on microsoft's TCHAR
class tmpstr {
private:
tstring &t; // for later cpy of the result
TCHAR *buff; // temp buffer
public:
tmpstr(tstring& v, int ml) : t(v) { // ctor
buff = new TCHAR[ml]{}; // temp buffer, zero-initialized by the {}
std::cout << "tmp created\n"; // just for tracing, for proof of concept
}
tmpstr(tmpstr&c) = delete; // No copy allowed
tmpstr& operator= (tmpstr&c) = delete; // No assignment allowed
~tmpstr() {
t = tstring(buff); // copy to string passed by ref at construction
delete[] buff; // clean everything (the buffer was allocated with new[])
std::cout<< "tmp destroyed"; // just for proof of concept. remove this line
}
operator LPTSTR () {return buff; } // auto conversion to serve as windows function parameter without having to care
};
As you can see, the first line uses a typedef, in order to be compatible with several windows compilation options (e.g. Unicode or not). But of course, you could just replace tstring and TCHAR with wstring and wchar_t if you prefer.
The only drawback is that you have to repeat the buffer size, both as a parameter of the tmpstr constructor and as a parameter of the Windows function. But this is why you're writing a wrapper for the function, isn't it?
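The lifetime rule this relies on can be tried outside Windows with a reduced sketch. FakeGetText is a hypothetical stand-in for the API call, and TmpBuf is a simplified variant of the class above that uses std::vector instead of new[]; neither name comes from a real library.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a Windows API that fills a caller buffer.
int FakeGetText(char* buf, int maxCount)
{
    std::strncpy(buf, "hello", static_cast<std::size_t>(maxCount) - 1);
    buf[maxCount - 1] = '\0';
    return static_cast<int>(std::strlen(buf));
}

// Temporary buffer that copies itself into `target` when destroyed.
class TmpBuf {
public:
    TmpBuf(std::string& target, int size)
        : target_(target), buf_(static_cast<std::size_t>(size), '\0') {}
    TmpBuf(const TmpBuf&) = delete;
    TmpBuf& operator=(const TmpBuf&) = delete;
    ~TmpBuf() { target_ = buf_.data(); } // runs at the end of the full-expression
    operator char*() { return buf_.data(); }
private:
    std::string& target_;
    std::vector<char> buf_;
};

std::string ReadText(int maxCount)
{
    assert(maxCount >= 1); // room for at least the terminator
    std::string result;
    // The temporary lives until the end of this statement, i.e. until
    // after FakeGetText has returned and written into the buffer.
    FakeGetText(TmpBuf(result, maxCount), maxCount);
    return result;
}
```

One design caveat: the copy happens in a destructor, so an allocation failure there cannot be reported with an exception in the usual way. That is a trade-off of the approach rather than something this sketch solves.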
For a string buffer, why not just use a char array? :)
DWORD username_len = UNLEN + 1;
vector<TCHAR> username(username_len);
GetUserName(&username[0], &username_len);
The accepted solution is a nice example of overthinking.

What is the use of the c_str() function?

I understand c_str converts a string, that may or may not be null-terminated, to a null-terminated string.
Is this true? Can you give some examples?
c_str returns a const char* that points to a null-terminated string (i.e., a C-style string). It is useful when you want to pass the "contents"¹ of an std::string to a function that expects to work with a C-style string.
For example, consider this code:
std::string string("Hello, World!");
std::size_t pos1 = string.find_first_of('W'); // capital 'W', as in the literal above
std::size_t pos2 = static_cast<std::size_t>(std::strchr(string.c_str(), 'W') - string.c_str());
if (pos1 == pos2) {
std::printf("Both ways give the same result.\n");
}
Notes:
¹ This is not entirely true because an std::string (unlike a C string) can contain the \0 character. If it does, the code that receives the return value of c_str() will be fooled into thinking that the string is shorter than it really is, since it will interpret \0 as the end of the string.
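The footnote can be checked with a two-function sketch (the function names are invented for the example): one measures the string the way std::string does, the other the way a C function receiving c_str() would.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Length as std::string sees it: embedded '\0' characters count.
std::size_t CppLength(const std::string& s) { return s.size(); }

// Length as a C function sees it: stops at the first '\0'.
std::size_t CLength(const std::string& s) { return std::strlen(s.c_str()); }
```

For a string built as std::string s("ab\0cd", 5), the first function reports 5 and the second reports 2. The explicit length argument is needed in that constructor call, because constructing from the literal alone would itself stop at the first '\0'.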
In C++, you define your strings as
std::string MyString;
instead of
char MyString[20];
While writing C++ code, you encounter some C functions that require a C string as a parameter.
For example:
void IAmACFunction(int abc, float bcd, const char * cstring);
Now there is a problem. You are working with C++ and you are using std::string string variables. But this C function is asking for a C string. How do you convert your std::string to a standard C string?
Like this:
std::string MyString;
// ...
MyString = "Hello world!";
// ...
IAmACFunction(5, 2.45f, MyString.c_str());
This is what c_str() is for.
Note that, for std::wstring strings, c_str() returns a const wchar_t*.
Most old C++ and C functions, when dealing with strings, use const char*.
std::string provides c_str() so that you can convert from std::string to const char*.
That means that if you promise not to change the buffer, you'll be able to use read-only string contents. PROMISE = const char*
In C/C++ programming there are two types of strings: the C strings and the standard strings. With the <string> header, we can use the standard strings. On the other hand, the C strings are just an array of normal chars. So, in order to convert a standard string to a C string, we use the c_str() function.
For example
// A string to a C-style string conversion //
std::string str1 = "Testing the c_str";
const char *cstr1 = str1.c_str();
cout<<"Operation: *cstr1 = str1.c_str()"<<endl;
cout<<"The C-style string c_str1 is: "<<cstr1<<endl;
cout<<"\nOperation: strlen(cstr1)"<<endl;
cout<<"The length of C-style string str1 = "<<strlen(cstr1)<<endl;
And the output will be,
Operation: *cstr1 = str1.c_str()
The C-style string c_str1 is: Testing the c_str
Operation: strlen(cstr1)
The length of C-style string str1 = 17
c_str() converts a C++ string into a C-style string which is essentially a null terminated array of bytes. You use it when you want to pass a C++ string into a function that expects a C-style string (e.g., a lot of the Win32 API, POSIX style functions, etc.).
It's used to make std::string interoperable with C code that requires a null terminated char*.
You may use this when you encode or decode a string object that is transferred between two programs.
Let's say you use Base64 to encode an array in Python, and then you want to decode it in C++. Once you have the Base64-decoded string in C++, getting it back into an array of float only takes:
float arr[1024];
memcpy(arr, ur_string.c_str(), sizeof(float) * 1024);
This is pretty common use, I suppose.
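One thing to watch with the snippet above is that it always reads sizeof(float) * 1024 bytes, even if the decoded string is shorter. A more defensive sketch, assuming both programs agree on float format and byte order, copies only as many whole floats as the string actually holds (BytesToFloats is a made-up helper name):

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Reinterprets the raw bytes held in a std::string as floats.
// Assumes both programs use the same float format and byte order.
std::vector<float> BytesToFloats(const std::string& bytes)
{
    std::vector<float> out(bytes.size() / sizeof(float));
    // Copy only whole floats; never read more bytes than the string holds.
    if (!out.empty())
        std::memcpy(out.data(), bytes.data(), out.size() * sizeof(float));
    return out;
}
```

Going through memcpy rather than casting the char pointer to float* sidesteps alignment and aliasing problems. Any trailing bytes that do not form a whole float are ignored.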
const char* c_str() const;
It returns a pointer to an array that contains a null-terminated sequence of characters (i.e., a C string), representing the current value of the string object.
This array includes the same sequence of characters that make up the value of the string object plus an additional terminating null character ('\0') at the end.
std::string str = "hello";
std::cout << str; // hello
printf("%s", str); // wrong: str is not a const char*, so this printed garbage: ,²/☺
printf("%s", str.c_str()); // hello

CStringT to char[]

I'm trying to make changes to some legacy code. I need to fill a char[] ext with a file extension gotten using filename.Right(3). Problem is that I don't know how to convert from a CStringT to a char[].
There has to be a really easy solution that I'm just not realizing...
TIA.
If you have access to ATL, which I imagine you do if you're using CString, then you can look into the ATL conversion classes like CT2CA.
CString fileExt = _T ("txt");
CT2CA fileExtA (fileExt);
If a conversion needs to be performed (as when compiling for Unicode), then CT2CA allocates some internal memory and performs the conversion, destroying the memory in its destructor. If compiling for ANSI, no conversion needs to be performed, so it just hangs on to a pointer to the original string. It also provides an implicit conversion to const char * so you can use it like any C-style string.
This makes conversions really easy, with the caveat that if you need to hang on to the string after the CT2CA goes out of scope, then you need to copy the string into a buffer under your control (not just store a pointer to it). Otherwise, the CT2CA cleans up the converted buffer and you have a dangling reference.
Well you can always do this even in unicode
char str[4];
strcpy( str, CStringA( cString.Right( 3 ) ).GetString() );
If you know you AREN'T using unicode then you could just do
char str[4];
strcpy( str, cString.Right( 3 ).GetString() );
All the original code block does is transfer the last 3 characters into a non-Unicode string (CStringA is always char-based, CStringW is always Unicode, and CStringT depends on whether the UNICODE define is set) and then get the string as a simple char string.
First use CStringA to make sure you're getting char and not wchar_t. Then just cast it to (const char *) to get a pointer to the string, and use strcpy or something similar to copy to your destination.
If you're completely sure that you'll always be copying 3 characters, you could just do it the simple way.
ext[0] = filename[filename.GetLength()-3];
ext[1] = filename[filename.GetLength()-2];
ext[2] = filename[filename.GetLength()-1];
ext[3] = 0;
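The character-by-character version assumes the name has at least three characters. A bounds-checked sketch of the same copy follows; std::string stands in for CStringA so that it compiles without ATL/MFC, and CopyExtension is a made-up helper name.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Copies up to the last three characters of `filename` into `ext`,
// always leaving `ext` null-terminated. std::string stands in for
// CStringA here so the sketch compiles without ATL/MFC.
void CopyExtension(const std::string& filename, char (&ext)[4])
{
    const std::size_t n = filename.size() < 3 ? filename.size() : 3;
    std::memcpy(ext, filename.data() + (filename.size() - n), n);
    ext[n] = '\0';
}
```

Taking the destination as a reference to char[4] lets the compiler reject buffers of any other size, which is a cheap guard in legacy code where the array is declared far from where it is filled.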
I believe this is what you are looking for:
CString theString( "This is a test" );
char* mychar = new char[theString.GetLength()+1];
_tcscpy(mychar, theString);
If I remember my old school MS C++.
You do not specify where the CStringT type is from. It could be anything, including your own implementation of a string handling class. Assuming it is CStringT from the MFC/ATL library available in Visual C++, you have a few options:
You have not said whether you compile with or without Unicode, so this example uses TCHAR rather than char:
CStringT<TCHAR, StrTraitMFC<TCHAR, ChTraitsCRT<TCHAR> > > file(TEXT("test.txt"));
TCHAR* file1 = new TCHAR[file.GetLength() + 1];
_tcscpy(file1, file);
If you use CStringT specialised for ANSI string, then
std::string file1(CStringA(file).GetString()); // calling GetString() also avoids the most vexing parse
char const* pfile = file1.c_str(); // to copy to char[] buffer