I'm writing a locale-aware Node application for Windows, Mac, and Linux. Since JavaScript's handling of locales doesn't take into account a user's custom regional settings, I'm writing a native Node module to handle this for each platform.
I'm designing a function that takes a number to format and a precision (number of digits after the decimal) and returns a string. For example, if I pass in 1234.456 with precision 1, and my regional settings use "_" for groups and "/" for decimals, I would expect "1_234/5", using the appropriate locale info of the user's current language and region settings.
On the Windows side of things, I'm struggling to get the Windows API to give me exactly what I want. Here's the C++ function I'm writing:
std::string FormatNumber(const double number, const int precision)
{
std::wstringstream stream;
stream.setf(std::ios::fixed, std::ios::floatfield);
stream.precision(precision);
stream << number;
std::wstring roundedNumberStr = stream.str();
const wchar_t *numStr = roundedNumberStr.c_str();
wchar_t wBuffer[MAX_PATH];
GetNumberFormatEx(
LOCALE_NAME_USER_DEFAULT, // lpLocaleName,
0, // dwFlags,
numStr, // lpValue,
NULL, // lpFormat,
wBuffer, // lpNumberStr,
MAX_PATH // cchNumber
);
// Return a std::string using wBuffer
}
This almost works perfectly, except it always tacks on 2 decimal places regardless of the precision I use (something like "1_234/00"). To fix this, I suspect I need to pass in a parameter for lpFormat. The Windows API documentation suggests that I only need to provide values that I want to control and leave the rest to the user's settings, so I tried something like this:
NUMBERFMTW format{ precision };
GetNumberFormatEx(
LOCALE_NAME_USER_DEFAULT, // lpLocaleName,
0, // dwFlags,
numStr, // lpValue,
&format, // lpFormat,
wBuffer, // lpNumberStr,
MAX_PATH // cchNumber
);
But...that causes an ERROR_INVALID_PARAMETER error. Sadly, the documentation doesn't give me any more insight.
So how do I get this function to give me the strings I am expecting?
The two strings in NUMBERFMT are not optional. The other members are optional in the sense that 0 is valid but 0 does not mean they get their correct locale specific values!
Call GetLocaleInfoEx with LOCALE_SDECIMAL and LOCALE_STHOUSAND to load the "correct" strings and LOCALE_SGROUPING for the grouping. MSDN tells you which locale query to use for the other members.
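Completely initialize your NUMBERFMT. A declaration like this one sets only the first member:
NUMBERFMTW format{NumDigits };
With what you have, every member after NumDigits is zero-initialized rather than filled in from the user's settings. Grouping is 0 and both separator strings are null pointers, which is the likely cause of the ERROR_INVALID_PARAMETER.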
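One detail that is easy to trip over when filling the struct by hand: LOCALE_SGROUPING comes back as a string such as "3;0" or "3;2;0", while the Grouping member of NUMBERFMT is a number such as 3 or 32. The portable sketch below shows one way to do that conversion. The function name groupingFromLocaleString is made up for illustration, and the code assumes the usual form that ends in ";0" (repeat the last group); other forms are passed through without special handling.

```cpp
#include <string>

// Hypothetical helper: turns a LOCALE_SGROUPING-style string ("3;0", "3;2;0")
// into the numeric form used by NUMBERFMT::Grouping (3, 32).
// Assumption: the string ends in ";0". Strings without that trailing zero
// are returned as their plain digit sequence, which may not be what you want.
unsigned int groupingFromLocaleString(const std::wstring& grouping)
{
    unsigned int value = 0;
    for (wchar_t ch : grouping)
    {
        // Keep the digits, skip the ';' separators.
        if (ch >= L'0' && ch <= L'9')
            value = value * 10 + static_cast<unsigned int>(ch - L'0');
    }
    // "3;0" is now 30 and "3;2;0" is now 320: drop the trailing zero.
    return (value % 10 == 0) ? value / 10 : value;
}
```

On Windows, the input would come from GetLocaleInfoEx with LOCALE_SGROUPING, and the result would be assigned to format.Grouping next to the separator strings loaded with LOCALE_SDECIMAL and LOCALE_STHOUSAND.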
I saw the answer here and have read the original documentation.
So their code:
bool IsValidFloat(const CString& text, double& value)
{
LPCTSTR ptr = (LPCTSTR) text;
LPTSTR endptr;
value = _tcstod(ptr, &endptr);
return (*ptr && endptr - ptr == text.GetLength());
}
Some of my users have been encountering an issue and I think I have narrowed it down to:
LPCTSTR lpszValue = (LPCTSTR)strWord ;
LPTSTR lpszEndChar = NULL ;
double dLineSpace = _tcstod(lpszValue, &lpszEndChar);
I noticed that in the sample above endptr is not defaulted to NULL. Nor the example here (where they simply use char *string, *stopstring;).
Is this the reason my code is failing? Is there a specific reason why we can't default lpszEndChar to NULL?
If you give one of the strtod family of functions (which includes _tcstod on Windows/MSVC builds) a non-NULL argument as the second, endptr argument (that is, the address of a valid pointer variable), then it doesn't (or shouldn't) matter what address value that pointer has (even if it is NULL) when you call the function: it is an 'output-only' argument. If you pass an actual NULL as the second argument, then the functions won't (can't) modify what is (not) pointed-to.
In your case, you are comparing endptr - ptr (the number of characters consumed) with the length of the entire string to check for a valid read. This assumes that there are no 'extra' characters after the number (not even whitespace).
A more likely cause of this test failing is that floating-point numbers are being given in the wrong locale; so, if your locale is set to the default "C" (expecting a dot decimal point), and a number is given as "12,34" (in European format), then 12 will be successfully read from the string, but endptr will point at the comma.
To address the locale issue, there are a couple of options. First (if you know the 'origin' locale), you can use the setlocale function before calling _tcstod to set the appropriate decimal character:
setlocale(LC_NUMERIC, FRENCH_LOCALE);
double dLineSpace = _tcstod(lpszValue, &lpszEndChar);
setlocale(LC_NUMERIC, "C"); // Revert to default, 'C' locale
Or, if you're not sure whether the decimals will be dots or commas, you should first replace any commas with dots. If your data are (initially) in a CString variable, and you only have a number in it, then you can use the CString::Replace function:
myString.Replace(_T(','), _T('.'));
//...
However, if your CString is more complex than just a single number (or isn't a CString), you will have to write a small function to do the replacement(s) for you. (I can maybe offer a hint, if that's what you need!)
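As a rough, portable illustration of the points above (plain std::string and std::strtod instead of CString and _tcstod, and two helper names invented for this sketch), the "whole string consumed" check and the comma replacement could look like this. It assumes the program is still in the default "C" locale.

```cpp
#include <algorithm>
#include <cstdlib>
#include <string>

// Hypothetical helper: true only if the entire string was consumed as a number.
bool parseWholeDouble(const std::string& text, double& value)
{
    const char* begin = text.c_str();
    char* end = nullptr;  // the initial value does not matter; strtod overwrites it
    value = std::strtod(begin, &end);
    // Reject empty input and anything left over after the number.
    return end != begin && *end == '\0';
}

// Hypothetical helper: same check, after turning decimal commas into dots.
bool parseWholeDoubleLenient(std::string text, double& value)
{
    std::replace(text.begin(), text.end(), ',', '.');
    return parseWholeDouble(text, value);
}
```

With the "C" locale active, parseWholeDouble rejects "12,5" because parsing stops at the comma, while parseWholeDoubleLenient accepts it. The lenient version is only suitable when the comma can never be a thousands separator: "1,234.5" would become "1.234.5" and be rejected.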
I wrote a function to convert a wstring to a string. If I remove the call to setlocale(LC_CTYPE, ""), the program goes wrong. I looked up the documentation on cplusplus.com:
C string containing the name of a C locale. These are system specific,
but at least the two following locales must exist:
"C" Minimal "C" locale
"" Environment's default locale
If the value of this parameter is NULL, the function does not make any
changes to the current locale, but the name of the current locale is
still returned by the function.
My code is here; the source is from cplusplus.com (I added some Chinese characters):
/* wcstombs example */
#include <stdio.h> /* printf */
#include <stdlib.h> /* wcstombs, wchar_t(C) */
#include <locale.h> /* setlocale */
int main()
{
setlocale(LC_CTYPE, "");
const wchar_t str[] = L"中国、wcstombs example";
char buffer[64];
int ret;
printf ("wchar_t string: %ls \n",str);
ret = wcstombs ( buffer, str, sizeof(buffer) );
if (ret==64)
buffer[63]='\0';
if (ret)
printf ("length:%d,multibyte string: %s \n",ret,buffer);
return 0;
}
If I remove the call to setlocale(LC_CTYPE, ""), the program does not run as I expect.
My question is: will the program behave differently if I run it on a different machine? As the doc says, if the locale is "", the function does not make any changes to the current locale, but the name of the current locale is still returned by the function.
Is that because the current locale may differ between machines?
Here is my C++ version that converts between wstring and string. The string-to-wstring direction does not need setlocale, and the program runs well:
/*
string converts to wstring
*/
std::wstring s2ws(const std::string& src)
{
std::wstring res = L"";
size_t const wcs_len = mbstowcs(NULL, src.c_str(), 0);
std::vector<wchar_t> buffer(wcs_len + 1);
mbstowcs(&buffer[0], src.c_str(), src.size());
res.assign(buffer.begin(), buffer.end() - 1);
return res;
}
/*
wstring converts to string
*/
std::string ws2s(const std::wstring & src)
{
setlocale(LC_CTYPE, "");
std::string res = "";
size_t const mbs_len = wcstombs(NULL, src.c_str(), 0);
std::vector<char> buffer(mbs_len + 1);
wcstombs(&buffer[0], src.c_str(), buffer.size());
res.assign(buffer.begin(), buffer.end() - 1);
return res;
}
If the second argument to setlocale is NULL, it does nothing apart from returning the current locale. But you're not doing that. You're sending it a string that consists entirely of a single null byte, aka "". My setlocale man page says
If locale is an empty string, "", each part of the locale that should be modified is set according to the environment variables. The details are implementation-dependent.
So what this is doing for you is setting the locale to whatever the user has specified or to the system default.
If you never call setlocale, the program stays in the default "C" locale that every C and C++ program starts in, and that locale cannot represent the Chinese characters in your string. That is why your program fails without that call.
Two other man pages for stuff you're using say
The behavior of mbstowcs() depends on the LC_CTYPE category of the current locale.
The behavior of wcstombs() depends on the LC_CTYPE category of the current locale.
Presumably these routines are what is failing if you haven't set the locale at all.
I would guess that you probably don't need to run the setlocale statement on every invocation of these routines, but you do need to make sure it's run at least once before running them.
As for what changes depending on the current locale, I believe it is exactly how the multibyte string is converted to wide characters and vice versa. I think the man pages for those routines stay vague because of that difference. Personally, I'd prefer if they gave some examples, such as, "if the current locale is C, the multibyte string is ASCII characters." I would guess there's also at least one locale in which it is interpreted as UTF-8, but I don't know enough about the different locales to say exactly which one that is. There's probably also at least one locale where the multibyte string is some other two-bytes-per-character encoding, but C and C++ would still treat it as bytes.
Edit: Thinking about this more, given the characters you added to the example code, it makes sense to state explicitly that in locales that do not support Chinese characters, including the default C locale, wcstombs returns (size_t)-1, so the final printf reports a length of -1. In this case, the contents of the buffer are not clearly specified by the standard. My reading is that the buffer will probably hold all of the characters up to but not including the one that failed to convert, but neither the C++ documentation nor the C documentation states what happens to the character that could not be converted. I haven't paid for the official standards, but I do have copies of the last free releases. C++17 defers to C17, and C17 also refrains from commenting on this aspect of the function. For wcsrtombs, it explicitly states that the conversion state is unspecified. However, on wcstombs_s, C17 states
If the conversion stops without converting a null wide character and dst is not a null pointer, then a null character is stored into the array pointed to by dst immediately following any multibyte characters already stored.
In my own experiments with the code provided by the OP above, it appears that the wcstombs implementation on Fedora 28 simply refrains from making any further changes to the buffer. That seems to indicate to me, if the exact behavior of the code matters for this situation, it may make sense to use wcstombs_s instead. But at a minimum, you just check to see if the length returned is -1, and if it is, report an error rather than assuming the conversion worked.
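A minimal sketch of that last piece of advice, using plain wcstombs and reporting failure instead of assuming success. The function name wideToMultibyte is invented for this example. Like the code in the question, it relies on wcstombs(NULL, ..., 0) returning the required length, which is POSIX and MSVC behaviour rather than something ISO C guarantees, and it leaves the choice of locale (for example a single setlocale(LC_CTYPE, "") at startup) to the caller.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: converts using the current LC_CTYPE locale.
// Returns false if any character cannot be represented in that locale.
bool wideToMultibyte(const std::wstring& src, std::string& out)
{
    const std::size_t failed = static_cast<std::size_t>(-1);

    // First pass: ask how many bytes are needed (not counting the null).
    const std::size_t needed = std::wcstombs(nullptr, src.c_str(), 0);
    if (needed == failed)
        return false;

    // Second pass: convert into a buffer with room for the terminating null.
    std::vector<char> buffer(needed + 1);
    const std::size_t written = std::wcstombs(buffer.data(), src.c_str(), buffer.size());
    if (written == failed)
        return false;

    out.assign(buffer.data(), written);
    return true;
}
```

In the default "C" locale this succeeds for plain ASCII input. For the Chinese characters in the question it is expected to return false until a suitable locale has been selected.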
I am trying to convert a std::string to a TCHAR* for use in CreateFile(). The code I have compiles, and works, but Visual Studio 2013 comes up with a compiler warning:
warning C4996: 'std::_Copy_impl': Function call with parameters that may be unsafe - this call relies on the caller to check that the passed values are correct. To disable this warning, use -D_SCL_SECURE_NO_WARNINGS. See documentation on how to use Visual C++ 'Checked Iterators'
I understand why I get the warning, as my code uses std::copy, but I don't want to define _SCL_SECURE_NO_WARNINGS if at all possible, as they have a point: std::copy is unsafe/unsecure. As a result, I'd like to find a way that doesn't throw this warning.
The code that produces the warning:
std::string filename = fileList->getFullPath(index);
TCHAR *t_filename = new TCHAR[filename.size() + 1];
t_filename[filename.size()] = 0;
std::copy(filename.begin(), filename.end(), t_filename);
audioreader.setFile(t_filename);
audioreader.setFile() calls CreateFile() internally, which is why I need to convert the string.
fileList and audioreader are instances of classes I wrote myself, but I'd rather not change the core implementation of either if at all possible, as it would mean changing a lot of implementation in other areas of my program, whereas this conversion only happens in that piece of code. The method I used to convert there came from a solution I found at http://www.cplusplus.com/forum/general/12245/#msg58523
I've seen something similar in another question (Converting string to tchar in VC++) but I can't quite fathom how to adapt the answer to work with mine, as the size of the string isn't constant. All the other ways I've seen involve a straight (TCHAR *) cast (or something equally unsafe), which, as far as I know about the way TCHAR and other Windows string types are defined, is relatively risky, as TCHAR could be a single-byte or a wide character type depending on the UNICODE definition.
Does anyone know a safe, reliable way to convert a std::string to a TCHAR* for use in functions like CreateFile()?
EDIT to address questions in the comments and answers:
Regarding UNICODE being defined or not: The project in VS2013 is a Win32 console application, with #undef UNICODE at the top of the .cpp file containing main(). What is the difference between UNICODE and _UNICODE? I assume the underscore in what Amadeus was asking is significant.
Not directly related to the question but may add perspective: This program is not going to be used outside the UK, so ANSI vs UNICODE does not matter for this. This is part of a personal project to create an audio server and client. As a result you may see some bits referencing network communication. The aim of this program is to get me using Xaudio and winsock. The conversion issue purely deals with the loading of the file on the server side so it can open it and start reading chunks to transmit. I'm testing with .wav files found in c:/windows/media
Filename encoding: I read the filenames in at runtime by using FindFirstFileA() and FindNextFileA(). The names are retrieved by looking at cFileName in a WIN32_FIND_DATAA structure. They are stored in a vector<string> (wrapped in a unique_ptr if that matters) but that could be changed. I assume this is what Dan Korn means.
More info about my classes and functions:
The following are spread between AudioReader.h, AudioReader.cpp, FileList.h, FileList.cpp and ClientSession.h. The fragment above is in ClientSession.cpp. Note that in most of my files I declare using namespace std;
shared_ptr<FileList> fileList; //ClientSession.h
AudioReader audioreader; //ClientSession.h
string _storedpath; //FileList.h
unique_ptr<vector<string>> _filenames; //FileList.h
//FileList.cpp
string FileList::getFullPath(int i)
{
string ret = "";
unique_lock<mutex> listLock(listmtx);
if (static_cast<size_t>(i) < _count)
{
ret = _storedpath + _filenames->at(i);
}
else
{
//rather than go out of bounds, return the last element, as returning an error over the network is difficult at present
ret = _storedpath + _filenames->at(_count - 1);
}
return ret;
}
unique_ptr<AudioReader_Impl> audioReaderImpl; //AudioReader.h
//AudioReader.cpp
HRESULT AudioReader::setFile(TCHAR * fileName)
{
return audioReaderImpl->setFile(fileName);
}
HANDLE AudioReader_Impl::fileHandle; //AudioReader.cpp
//AudioReader.cpp
HRESULT AudioReader_Impl::setFile(TCHAR * fileName)
{
fileHandle = CreateFile(fileName, GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, 0, NULL);
if (fileHandle == INVALID_HANDLE_VALUE)
{
return HRESULT_FROM_WIN32(GetLastError());
}
if (SetFilePointer(fileHandle, 0, NULL, FILE_BEGIN) == INVALID_SET_FILE_POINTER)
{
return HRESULT_FROM_WIN32(GetLastError());
}
return S_OK;
}
If you do not need to support the string containing UTF-8 (or another multi-byte encoding) then simply use the ANSI version of Windows API:
handle = CreateFileA( filename.c_str(), .......)
You might need to rejig your code for this as you have the CreateFile buried in a function that expects TCHAR. That's not advised these days; it's a pain to splatter T versions of everything all over your code and it has flow-on effects (such as std::tstring that someone suggested - ugh!)
There hasn't been any need to support dual compilation from the same source code since about 1998. Windows API has to support both versions for backward compatibility but your own code does not have to.
If you do want to support the string containing UTF-8 (and this is a better idea than using UTF-16 everywhere) then you will need to convert it to a UTF-16 string in order to call the Windows API.
The usual way to do this is via the Windows API function MultiByteToWideChar which is a bit awkward to use correctly, but you could wrap it up in a function:
std::wstring make_wstring( std::string const &s );
that invokes MultiByteToWideChar to return a UTF-16 string that you can then pass to WinAPI functions by using its .c_str() function.
See this codereview thread for a possible implementation of such a function (although note discussion in the answers)
The root of your problem is that you are mixing TCHARs and non-TCHARs. Even if you get it to work on your machine, unless you do it precisely right, it will fail when non-ASCII characters are used in the path.
If you can use std::tstring (not a standard type; usually a hand-written typedef for std::basic_string<TCHAR>) instead of regular string, then you won't have to worry about format conversions or codepage versus Unicode issues.
If not, you can use conversion functions like MultiByteToWideChar but make sure you understand the encoding used in the source string or it will just make things worse.
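Try this instead (it assumes setFile() is changed to take a const TCHAR*, since c_str() returns a pointer to const and CreateFile() only needs to read the name):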
std::string filename = fileList->getFullPath(index);
#ifndef UNICODE
audioreader.setFile(filename.c_str());
#else
std::wstring w_filename;
int len = MultiByteToWideChar(CP_ACP, 0, filename.c_str(), filename.length(), NULL, 0);
if (len > 0)
{
w_filename.resize(len);
MultiByteToWideChar(CP_ACP, 0, filename.c_str(), filename.length(), &w_filename[0], len);
}
audioreader.setFile(w_filename.c_str());
#endif
Alternatively:
std::string filename = fileList->getFullPath(index);
#ifndef UNICODE
audioreader.setFile(filename.c_str());
#else
std::wstring_convert<std::codecvt<wchar_t, char, std::mbstate_t>> conv;
std::wstring w_filename = conv.from_bytes(filename);
audioreader.setFile(w_filename.c_str());
#endif
I have recently begun to write in C on Windows and have been trying to be careful with the different ways that string buffers are handled. For instance, GetWindowText() takes an int nMaxCount of the maximum number of characters, including null. GetModuleFileName() takes a DWORD nSize of the size of the buffer, in TCHARs (I assume this also includes null). Even though these are worded differently and one takes a DWORD while the other takes an int (why the difference in types?), the behavior is identical, correct?
Both return the length of the string that is copied, not including the null, so I should be able to call either of them repeatedly, doubling the buffer size until the returned length is less than the buffer size passed in, like this:
DWORD buf_size = 1024;
DWORD return_val;
wchar_t *full_path = malloc(buf_size * sizeof(wchar_t));
// double the buffer until it's big enough
while ((return_val = GetModuleFileNameW(NULL, full_path, buf_size)) == buf_size) {
buf_size *= 2;
full_path = realloc(full_path, buf_size * sizeof(wchar_t));
}
if (!return_val) {
fprintf(stderr, "Error in GetModuleFileNameW()\n");
return NULL;
}
Do all of the Windows API functions with string [out] parameters work in the same way? Are there any individual functions or groups of functions that behave differently? (for instance, functions that take the size of the buffer in bytes instead of characters, or that take a maximum string length not including the null character or that return a different value than these two)
Actually, I just noticed that the return value of these two is not entirely consistent: GetModuleFileName() returns 0 when it errors; GetWindowText() will return 0 whenever there is an empty string for the window text, which I think I saw quite frequently when I was enumerating windows...
One reason I want to understand it in detail is because in some cases (GetModuleFileName() on WinXP, for instance), an off-by-one error in my code will result in a string that is not null-terminated.
By and large the majority of the Win32 API functions that return strings do so in a consistent manner. GetWindowText is a good choice for a canonical such function. However, there are exceptions, and I don't think anyone has ever compiled a comprehensive list.
The bottom line here is that you need to consult the documentation carefully every single time you write code to call a Win32 API function. Not only regarding the treatment of string output values, but all parameters. And all return values. And error handling. There is variation in style across the API, and even variation within related groups of functions.
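For the functions that do follow the convention described in the question, the retry loop can be written once and reused. The sketch below is portable C++ and is not tied to any particular API: fetchString and its fill callback are names invented for this example, and the callback is assumed to behave the way the question describes GetModuleFileNameW (0 on error, the copied length without the null on success, a value equal to the capacity on truncation). Functions that deviate from that contract, such as GetWindowText returning 0 for an empty string, need their own handling.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: calls fill(buffer, capacity) with ever larger buffers.
// Assumed contract for fill (modelled on GetModuleFileNameW):
//   returns 0                 -> error
//   returns n < capacity      -> success, n characters copied (null excluded)
//   returns n >= capacity     -> result was truncated, retry with more room
template <typename Fill>
bool fetchString(Fill fill, std::wstring& out, std::size_t initialCapacity = 260)
{
    // Start with at least room for one character plus the null.
    std::vector<wchar_t> buffer(initialCapacity < 2 ? 2 : initialCapacity);
    for (;;)
    {
        const std::size_t copied = fill(buffer.data(), buffer.size());
        if (copied == 0)
            return false;                   // the callee reported an error
        if (copied < buffer.size())
        {
            // Use the returned length rather than searching for a null.
            out.assign(buffer.data(), copied);
            return true;
        }
        buffer.resize(buffer.size() * 2);   // truncated: double and try again
    }
}
```

On Windows, fill could be a lambda that forwards to GetModuleFileNameW(NULL, buf, static_cast<DWORD>(cap)). Building the result from the returned length rather than scanning for a terminator also sidesteps the missing null terminator that the question mentions for Windows XP.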
I'm a Javascript developer, so go easy on me! I am trying to write just a patch of C++ to enable printing on a framework. I'm compiling with Unicode, and based on my research, that is what is messing me up.
I think this is a relatively simple thing that I'm overcomplicating. The application has a std::string that contains the current printer name. The script first checks if it is unset (if it is, it uses GetDefaultPrinter, which outputs an LPTSTR). Finally, the script takes either that std::string or the LPTSTR and converts it to an LPCTSTR for CreateDC.
Here is my code:
std::string PrinterName = window->getPrinter();
LPDWORD lPrinterNameLength;
LPWSTR szPrinterName;
LPCTSTR PrinterHandle;
if (PrinterName == "unset") {
GetDefaultPrinter( szPrinterName, &lPrinterNameLength );
PrinterHandle = szPrinterName; //Not sure the best way to convert here
} else {
PrinterHandle = PrinterName.c_str();
}
HDC hdc = CreateDC( L"WINSPOOL\0", PrinterHandle, NULL, NULL);
When compiling, I only get conversion errors, such as
Cannot convert parameter 2 from LPDWORD * to LPDWORD (GetDefaultPrinter)
and
Cannot convert from 'const char *' to 'LPCTSTR' (On the PrinterHandle = PrinterName.c_str() line)
I've done quite a bit of SO research on this, but haven't come up with a concrete solution.
Any help is greatly appreciated!
Even if you're compiled for "Unicode" (wide characters strings), you can call the "ANSI" (narrow characters strings) versions of the API functions. Windows will do the conversions for you and call the wide character version under the covers.
For example, for most Windows APIs like CreateDC, there isn't actually a function with that name. Instead, there's a macro named CreateDC that expands to either CreateDCA or CreateDCW, which are the actual function names. When you're compiled for "Unicode", the macros expand to the -W versions (which are the native ones in all modern versions of the OS). Nothing prevents you from explicitly calling either version, regardless of whether you're compiled for Unicode. In most cases, the -A version will simply convert the narrow strings to wide ones for you and then call the corresponding -W version. (There are some caveats here related to creating windows, but I don't think they apply to DCs.)
std::string PrinterName = window->getPrinter();
if (PrinterName == "unset") {
char szPrinterName[MAX_PATH]; // simplified for illustration
DWORD cchPrinterNameLength = ARRAYSIZE(szPrinterName);
GetDefaultPrinterA(szPrinterName, &cchPrinterNameLength);
PrinterName = szPrinterName;
}
HDC hdc = CreateDCA("WINSPOOL", PrinterName.c_str(), NULL, NULL);
First of all, as mentioned in the comments, the proper way is to make a DWORD and pass the address:
DWORD lpPrinterNameLength;
...
GetDefaultPrinter(..., &lpPrinterNameLength);
Why it's like that is so that it can use and change a number:
On input, specifies the size, in characters, of the pszBuffer buffer. On output, receives the size, in characters, of the printer name string, including the terminating null character.
It would just take a DWORD, but the function changes the number in the variable passed in, so the function needs the address of the variable to change in order for those changes to reflect back to the caller.
Secondly, since window->getPrinter() returns a narrow string and you're compiling with UNICODE, which makes the functions take wide strings, you need to convert the narrow string into a wide one. There are several ways to do this (such as the really easy one mentioned in ildjarn's comment), and C++11 makes it slightly nicer, but I'll use MultiByteToWideChar and C++03:
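std::wstring narrowToWide(const std::string &narrow) {
// With -1 as the input length, the returned size includes the terminating null
int length = MultiByteToWideChar(CP_ACP, MB_ERR_INVALID_CHARS, narrow.c_str(), -1, NULL, 0);
if (!length) {
//error: bail out before indexing into an empty buffer
return std::wstring();
}
std::vector<wchar_t> wide(length);
if (!MultiByteToWideChar(CP_ACP, MB_ERR_INVALID_CHARS, narrow.c_str(), -1, &wide[0], length)) {
//error, should probably check that the number of characters written is consistent as well
return std::wstring();
}
// Leave the terminating null out, or comparisons such as == L"unset" will fail
return std::wstring(wide.begin(), wide.end() - 1);
}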
...
std::wstring PrinterName = narrowToWide(window->getPrinter());
//rest is same, but should be L"unset"
CreateDC( L"WINSPOOL\0", PrinterHandle, NULL, NULL);