UnicodeString compatibility issue - c++

I am porting an older project from C++ Builder 2009 to XE5. In the old project, the compiler option for Unicode strings was set as "_TCHAR maps to: char". This worked fine in the old project.
When porting it, I set the same compiler option in XE5. But I still get compiler errors for code like this:
std::string str = String(some_component.Text).t_str();
This gives the following errors:
[bcc32 Warning] file.cpp(89): W8111 Accessing deprecated
entity 'UnicodeString::t_str() const'
[bcc32 Error] file.cpp(89): E2285 Could not find a match for
'operator string::=(wchar_t *)'
So apparently XE5 has decided that String::t_str() should give me a wchar_t* rather than a char*, even though I have set the compiler option as described above.
How do I solve this?
I am well-aware that C++ Builder has taken the step to use Unicode internally (even in the 2009 version), but this is an old project with 200k LOC. Updating it to Unicode would be a steep task with very low priority.
EDIT
I can get it to work by changing the code to
std::string str = AnsiString(some_component.Text).c_str();
But this means I have to change the code at numerous places. Is there a better way that doesn't involve rewriting code?

When UnicodeString::t_str() was first introduced in CB2009, it returned either a char* or wchar_t* depending on what TCHAR mapped to. In order to return a char*, it ALTERED the internal data of the UnicodeString to make it Ansi (thus breaking the contract that UnicodeString is a Unicode string). THIS WAS TEMPORARY for migration purposes while people were still re-writing their code to support Unicode. This breakage was acceptable because the RTL had special logic to handle Ansi-encoded UnicodeString (and Unicode-encoded AnsiString) values. However, this was dangerous code. After a few versions, when people had adequate time to migrate, this RTL logic was removed and UnicodeString::t_str() was locked in to wchar_t* only, to match UnicodeString::c_str(). DO NOT USE t_str() anymore! That is why it is marked as deprecated now. If you need to pass a UnicodeString to something that expects Ansi data, converting to an intermediate AnsiString is the correct and safe approach. That is just the way it is now.

Related

How can I convert an std::filesystem::path to LPCSTR for use in one of the LoadLibrary() variants?

On Windows I'm trying to use one of the variants of LoadLibrary() to open a dll previously written to an std::filesystem::path with an ofstream.
Note: I know the dll is written correctly as I can use it in the standard fashion by linking to it at runtime.
I've been trying to combine the methods from the two answers below.
How to convert std::string to LPCSTR?
how to convert filesystem path to string
This seems like it should be pretty basic but with anything I've tried so far I either get an error about conversion to LPCSTR or something like C2228: left of '.c_str' must have class/struct/union which I am baffled by.
Here's a simple example:
// Assuming I have
// std::filesystem::path path1
// correctly set, I should be able to directly access it in
// a number of ways; i.e. path1.c_str(), path1.string.c_str(), etc.
// in order to pass it the function or a temp variable.
// However direct use of it in LoadLibrary() fails with the C2228 error.
HINSTANCE hGetProcIDDLL = LoadLibrary(path1.c_str());
I've tried avoiding the macro and calling LoadLibraryA() directly with no luck.
I've also tried various ways of passing path1 with path1.string(), path1.string.c_str(), path1.wstring(), etc. with no luck.
I've also tried using a temp variable in a number of ways to avoid the cast within LoadLibrary().
LPCSTR temp_lpcstr = path1.c_str(); // Also tried things like path1.string() path1.string.c_str()
// Also tried just using a temp string...
std::string temp_string = path1.string(); // and variants.
I'm willing to try playing with the encoding (like path1.u8string() etc.) but I think it shouldn't be necessary with use of LoadLibraryA() directly.
I'm trying to avoid C casts and would prefer a c++ static_ or dynamic_ but I'll use anything that works.
Any help is appreciated.
Thanks in advance.
UPDATE
#eryk-sun's comment and #Gulrak's answer solved it for me.
It looks like with my setup, path1.c_str() alone is wchar_t but the LoadLibrary() macro was not picking that up and directing it to LoadLibraryW() as it should.
Note: For anyone else who might stumble onto this in the future here's more details of my specific setup. I'm using the MSVC compiler from 16.1.0 (~VS2019) but that's getting called from VSCode and CMake. I'm not explicitly defining _UNICODE however VSCode's intellisense certainly thinks it's been defined somewhere and points me to LoadLibraryA(). However, I think the compiler is not actually seeing that define so it interprets path1.c_str() as a wchar_t.
Actually on Windows you should be able to use LoadLibraryW(path1.c_str()) as on Windows the returned type of std::filesystem::path::c_str() should be a const wchar_t* so it's a good fit for the LoadLibraryW expected LPCWSTR.
As for the error of C2228 my guess is, you tried path1.string.c_str() as given by your comment, wich should have been path1.string().c_str(). That would give you a LPCSTR compatible string for LoadLibaryA, but if there is a chance of Non-ASCII in your path I would suggest using the explicit LoadLibaryW version.
In any way: When interfacing WinAPI with std::filesystem::path you should use the explicit A/W-Version to make your code safe independent of the state of _UNICODE, and I allways suggest the *W versions.
You should use string member function of path class which returns std::string. Then call c_stron the returned string.
std::filesystem::path path /* = initialization here */;
std::string str = path.string();
/* some handle = */ LoadLibrary(str.c_str());

C++ C2440 Error, cannot convert from 'std::_String_iterator<_Elem,_Traits,_Alloc>' to 'LPCTSTR'

Unfortunately I have been tasked with compiling an old C++ DLL so that it will work on Win 7 64 Bit machines. I have zero C++ experience. I have muddled through other issues, but this one has stumped me and i have not found a solution elsewhere. The following code is throwing a C2440 compiler error.
The error: "error C2440: '' : cannot convert from 'std::_String_iterator<_Elem,_Traits,_Alloc>' to 'LPCTSTR'"
the code:
#include "StdAfx.h"
#include "Antenna.h"
#include "MyTypes.h"
#include "objbase.h"
const TCHAR* CAntenna::GetMdbPath()
{
if(m_MdbPath.size() <= 0) return NULL;
return LPCTSTR(m_MdbPath.begin());
}
"m_MdbPath" is defined in the Antenna.h file as
string m_MdbPath;
Any assistance or guidance someone could provide would be extremely helpful. Thank you in advance. I am happy to provide additional details on the code if needed.
std::string has a .c_str() member function that should accomplish what you're looking for. It will return a const char* (const wchar_t* with std::wstring). std::string also has an empty() member function and I recommend using nullptr instead of the NULL macro.
const TCHAR* CAntenna::GetMdbPath()
{
if(m_MdbPath.empty()) return nullptr;
return m_MdbPath.c_str();
}
The solution:
const TCHAR* CAntenna::GetMdbPath()
{
if(m_MdbPath.size() <= 0) return NULL;
return m_MdbPath.c_str();
}
will work if you can guarantee that your program is using the MBCS character set option (or if the Character Set option is set to Not Setif you're using Visual Studio), since a std::string character type will be the same as the TCHAR type.
However, if your build is UNICODE, then a std::string character type will not be the same as the TCHAR type, since TCHAR is defined as a wide character, not a single-byte character. Thus returning c_str() will give a compiler error. So your original code, plus the tentative fix of returning std::string::c_str() may work, but it is technically wrong with respect to the types being used.
You should be working directly with the types presented to your function -- if the type is TCHAR, then you should be using TCHAR based types, but again, a TCHAR is a type that changes definition depending on the build type.
As a matter of fact, the code will fail miserably at runtime if it were a UNICODE build and to fix the compiler error(s), you casted to an LPCTSTR to keep the compiler quiet. You cannot cast a string pointer type to another string pointer type. Casting is not converting, and merely casting will not convert a narrow string into a wide string and vice-versa. You will wind up with at the least, odd characters being used or displayed, and worse, a program with random weird behavior and crashes.
In addition to this you never know when you really will need to build a UNICODE version of the DLL, since MBCS builds are becoming more rare. Going by your original post, the DLL's usage of TCHAR indicates that yes, this DLL may (or even has) been built for UNICODE.
There are a few alternate solutions that you can use. Note that some of these solutions may require you to make additional coding changes beyond the string types. If you're using C++ stream objects, they may also need to be changed to match the character type (for example, std::ostream and std::wostream)
Solution 1. Use a typedef, where the string type to use depends on the build type.
For example:
#ifdef UNICODE
typedef std::wstring tchar_string;
#else
typedef std::string tchar_string;
#endif
and then use tchar_string instead of std::string. Switching between MBCS and UNICODE builds will work without having to cast strin types, but requires coding changes.
Solution 2. Use a typedef to use TCHAR as the underlying character type in the std::basic_string template.
For example:
typedef std::basic_string<TCHAR> tchar_string;
And use tchar_string in your application. You get the same public interface as std::(w)string and this allows you to switch between MBCS and UNICODE build seamlessly (with respect to the string types).
Solution 3. Go with UNICODE builds and drop MBCS builds altogether
If the application you're building is a new one, then UNICODE is the default option (at least in Visual Studio) when creating a new application. Then all you really need to do is use std::wstring.
This is sort of the opposite of the original solution given of making sure you use MBCS, but IMO this solution makes more sense if there is only going to be one build type. MBCS builds are becoming more rare, and basically should be used only for legacy apps.

movefile() fails error 2 or 123

I'm updating a c++ routine to move files that was written in visual studio express 2008/2010. I'm now running VS Express 2012
Obviously there are changes to the compiler because string functions have to be upgraded to strcpy_s etc. No problem. This is a console app. I never extended my C++ knowledge past C++ to C# etc. as I need little more than to be able to write small utils to do things on the command line. Still I'm able to write somewhat complex utilities.
My issue is movefile() function always fails to move with either error 2 or 123. I'm working in C:\users\alan\downloads folder so I know I have permission. I know the file is there. Small snippet of code is:
char source=".\\test.txt"; // edited for clarity.
char dest=".\\test.txt1";
printf("\nMove\n %s\n to %s\n",source,dest); // just to see what is going on
MoveFile((LPCWSTR) source, (LPCWSTR) dest);
printf("Error %u\n",GetLastError());
output is :
Move
.\test.txt
to .\test.txt1
Error 2
All of my strings are simple char strings and I'm not exactly sure, even after reading, what LPCWSTR was type def'd for and if this is the culprit. So to get this to compile I simply typedef'd my strings. And it compiles. But still it won't move the files.
The code is more complex in developing the source & dest variables but I've reduce it to a simple "just append a 1 to the file name" situation to see if I can just simply rename it. I thought C:\xxx\yyy\zzz\test.txt was maybe wrong in some fashion but that idea fell though with the above test. I've done it with and without the .\ same issue. I'm running out of ideas other than making my own fileopen read/write binary function to replace movefile(). I'm really against that but if I have to I will.
EDIT: I pasted the printf from original code that used FullPathName, I've corrected the snippet.
The fact that you are casting your arguments to LPCWSTR suggests that you are compiling your program with UNICODE defined, which means you are calling MoveFileW and the compiler warned about an argument type mismatch.
Inserting a cast does not fix that. You are telling the compiler to stop complaining, but you haven't actually fixed the problem (the underlying data is still wrong).
Actual solutions:
Use WCHAR as MoveFileW expects (or TCHAR/LPTSTR and the _T macro).
Explicitly call MoveFileA
Compile without UNICODE defined.
Thanks Andrew Medico. I used MoveFileA and the program seems to work now.
I'm not sure I turned off unicode, but I did change one item in the properties.
I'll need to read up on the compiler about unicode/ansi settings. But for now the issue is fixed and I'm sure I've got the idea of what I need to do. "research"!!!!

Pass CString to fprintf

I have ran the code analyzer in visual studio on a large code base and i got about a billion of this error:
warning C6284: Object passed as parameter '3' when string is required in call to 'fprintf'
According to http://msdn.microsoft.com/en-us/library/ta308ywy.aspx "This defect might produce incorrect output or crashes." My colleague however states that we can just ignore all these errors without any problems. So one of my questions is do we need to do anything about this or can we just leave it as is?
If these errors need to be solved what is the nicest approach to solve it?
Would it work to do like this:
static_cast<const char*>(someCString)
Is there a better or more correct approach for this?
The following lines generate this warning:
CString str;
fprintf(pFile, "text %s", str);
I'm assuming that you're passing a Microsoft "CString" object to a printf()-family function where the corresponding format specifier is %s. If I'm right, then your answer is here: How can CString be passed to format string %s? (in short, your code is OK).
It seems that originally an implementation detail allowed CString to be passed directly to printf(), and later it was made part of the contract. So you're good to go as far as your program being correct, but if you want to avoid the static analysis warning, you may indeed need to use the static_cast to a char pointer. I'm not sure it's worth it here...maybe there's some other way to make these tools place nice together, since they're all from Microsoft.
Following the MSDN suggestions in C6284, you may cast the warnings away. Using C++ casts will be the most maintainable option to do this. Your example above would change to
fprintf(pFile, "text %s", static_cast<const TCHAR*>(str));
or, just another spelling of the same, to
fprintf(pFile, "text %s", static_cast<LPCTSTR>(str));
The most convincing option (100% cast-free, see Edits section) is
fprintf(pFile, "text %s", str.GetString());
Of course, following any of these change patterns will be a first porting step, and if nothing indicates a need for it, this may be harmful (not only for your team atmosphere).
Edits: (according to the comment of xMRi)
1) I added const because the argument is read-only for fprintf
2) notes to the cast-free solution CSimpleStringT::GetString: the CSimpleStringT class template is used for the definition of CStringT which again is used to typedef the class CString used in the original question
3) reworked answer to remove noise.
4) reduced the intro about the casting option
Technically speaking it is ok because the c-string is stored in such a way in CString that you can use it as stated but it is not good rely on how CString is implemented to do a shortcut. printf is a C-runtime function and knows nothing about C++ objects but here one is relying on an that the string is stored first in the CString - an implementation detail.
If I recall correctly originally CString could not be used that way and one had to cast the CString to a c-string to print it out but in later versions MS changed the implementation to allow for it to be treated as a c-string.
Another breaking issue is UNICODE, it will definitely not work if you one day decide to compile the program with UNICODE character set since even if you changed all string formatters to %ld, embedded 0s will sometimes prevent the string from being printed.
The actual problem is rather why are you using printf instead of C++ to print/write files?

How to assign a value to a TCHAR array

I have a TCHAR array in my C++ code which I want to assign static strings to it.
I set an initial string to it via
TCHAR myVariable[260] = TEXT("initial value");
Everything works fine on this. However, when I split it in two lines as in
TCHAR myVariable[260];
myVariable = TEXT("initial value");
it bugs and gives a compiler error:
error C2440: '=': cannot convert from 'const char [14]' to 'TCHAR [260]'
shouldn't the TEXT() function do exactly what I want here? convert the given string to TCHARs? Why does it work, when putting the two lines together? What do I have to change in order to get it working?
Some other confusing thing I have encountered:
I've searched the internet for it and have seen that there are also _T() and _TEXT() and __T() and __TEXT(). What are they for? Which of them should I use in what environment?
The reason the assignment doesn't work has very little to do with TCHARs and _T. The following won't work either.
char var[260];
var = "str"; // fails
The reason is that in C and C++ you can't assign arrays directly. Instead, you have to copy the elements one by one (using, for example, strcpy, or in your case _tcscpy).
strcpy(var, "str");
Regarding the second part of your question, TEXT, _T and the others are macros, that in Unicode builds turn a string literal to a wide-string literal. In non-Unicode builds they do nothing.
See avakar's answer for the direct answer. I was going to add this as a comment, but it really is a freestanding recommendation. I will warn you up from that this will sound like a rant but it does come from using TCHAR and then working issues for a few years before trying to remove it from a rather large codebase.
Make sure that you really understand the effect of using TCHAR arrays and their friends. They are deceptively difficult to use correctly. Here is a short list of things that you have to watch out for:
sizeof(TCHAR) is conditional: Scrutinize code that contains this. If it is doing anything other than calculating the size that is being passed to malloc() or memcpy() then it is probably wrong.
TCHAR is a type alias: Since TCHAR is nothing more than a typedef, it is really easy to write things like wcscpy(tszStr, wszAry) and be none the wiser. Basically, TCHAR is either char or wchar_t so overload selections might surprise you.
wsprintf() and swprintf() are different: This is a special case of the previous but it bears special consideration since it is so easy to make a mistake here!
If you want to use TCHAR, then make sure that you compile both UNICODE and MBCS versions regularly. If you do not do this, then you are probably going to undo any advantage that you are trying to gain by using them in the first place.
Here are two recommendations that I have for how to not use TCHAR in the first place when you are writing C++ code.
Use CString if you are using MFC or are comfortable tying yourself to MSFT.
Use std::string or std::wstring in conjunction with the type-specific API - use CreateFileA() and CreateFileW() where appropriate.
Personally, I choose the latter but the character encoding and string transcoding issues are just a nightmare from time to time.