How can I iterate over CString and compare its characters to int? - c++

I'm writing a dialog based MFC application in Visual Studio 2017 in C++. In the dialog I added a list control where the user can change the values of the cells as shown in the picture below:
After the user changes the values, I want to check whether they are valid (so that if the wrong key was pressed by accident, the user is notified). For this purpose I iterate over the cells of the list and extract the text of each cell into a CString variable. I want to check that this variable has exactly 8 characters, each of which is '1' or '0'. The problem with the code I've written is that I get weird values when I try to print the individual characters of the CString variable.
The Code for checking the validity of the CString:
void CEditableListControlDlg::OnBnClickedButton4()
{
// TODO: Add your control notification handler code here
// Iterate over the different cells
int bit7Col = 2;
int bit6Col = 3;
int bit5Col = 4;
int bit4Col = 5;
int bit3Col = 6;
int bit2Col = 7;
int bit1Col = 8;
int bit0Col = 9;
for (int i = 0; i < m_EditableList.GetItemCount(); ++i) {
CString bit7 = m_EditableList.GetItemText(i, bit7Col);
CString bit6 = m_EditableList.GetItemText(i, bit6Col);
CString bit5 = m_EditableList.GetItemText(i, bit5Col);
CString bit4 = m_EditableList.GetItemText(i, bit4Col);
CString bit3 = m_EditableList.GetItemText(i, bit3Col);
CString bit2 = m_EditableList.GetItemText(i, bit2Col);
CString bit1 = m_EditableList.GetItemText(i, bit1Col);
CString bit0 = m_EditableList.GetItemText(i, bit0Col);
CString cvalue = bit7 + bit6 + bit5 + bit4 + bit3 + bit1 + bit0;
std::string value((LPCSTR)cvalue);
int length = value.length();
if (length != 7) {
MessageBox("Register Value Is Too Long", "Error");
return;
}
for (int i = 0; i < length; i++) {
if (value[i] != static_cast<char>(0) || value[i] != static_cast<char>(1)) {
char c = value[i];
MessageBox(&c, "value"); // this is where I try to print the value
return;
}
}
}
}
Picture of what gets printed in the message box when I try to print one character of the variable value. I expect to see '1' but instead I see '1iiiiii' in the message box:
I've tried extracting the characters directly from the variable cvalue of type CString like this:
cvalue[i]
and I got its length by using
strlen(cvalue[i])
but I've got the same result. I've also tried accessing the characters in the variable cvalue of type CString as follows:
cvalue.GetAt(i)
and getting its length by using:
cvalue.GetLength()
But again, I've got the same results.
Could anyone advise me on how to check that the characters in the variable cvalue of type CString are '0' or '1'?
Thank you.

You don't need to use std::string to process your strings in this case: CString works fine.
Assuming that your CString cvalue is the string you want to check, you can write a simple loop like this:
// Check cvalue for characters different than '0' and '1'
for (int i = 0; i < cvalue.GetLength(); i++)
{
TCHAR currChar = cvalue.GetAt(i);
if ((currChar != _T('0')) && (currChar != _T('1')))
{
CString message;
message.Format(_T("Invalid character at position %d : %c"), i, currChar);
MessageBox(message, _T("Error"));
}
}
The reason for the apparently weird output in your case is that you are passing a pointer to a character that is not followed by a null-terminator:
// Wrong code
char c = value[i];
MessageBox(&c, "value");
If you don't want to build a CString with a formatted message containing the offending character, like I did in the previous sample code, an alternative could be creating a simple raw char array storing the character you want to output followed by the null-terminator:
// This is an array storing two chars: value[i] followed by '\0' (null)
char s[2] = {value[i], '\0'};
MessageBox(s, "value");
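For comparison, here is a minimal sketch of the same idea in portable C++, outside MFC. The helper names charToText and lengthOfSingleCharBuffer are hypothetical and are not part of the original code; they only illustrate that a single char has to end up in a null-terminated buffer (or a std::string) before it is handed to an API that expects a C string.

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: wraps one character in a std::string, whose c_str()
// is always null-terminated, so it is safe to pass to C-string APIs.
std::string charToText(char c)
{
    return std::string(1, c);
}

// Same idea with a raw array: the character followed by '\0'.
std::size_t lengthOfSingleCharBuffer(char c)
{
    char s[2] = { c, '\0' };
    return std::strlen(s); // well-defined: strlen stops at the terminator
}
```

Passing &c directly, as in the wrong code above, gives the callee no terminator to stop at, which is why extra garbage characters showed up in the message box.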
P.S.
I used TCHAR in my code sample instead of char, to make the code more easily portable to Unicode builds. Think of TCHAR as a preprocessor macro that maps to char for ANSI/MBCS builds (which seems to be your current case), and to wchar_t for Unicode builds.
A Brief Note on Validating User Input
With the above answer, I tried to strictly address your specific problem with CString character validation. But, if you can take a look from a broader perspective, I would definitely consider validating the user input before storing it in the list-view control. For example, you could handle the LVN_ENDLABELEDIT notification from the list-view control, and reject invalid input values.
Or, considering that the only valid values for each bit are 0 and 1, you could let the user select them from a combo-box.
Doing that starting from the MFC's CListCtrl is non-trivial work; so, you may also consider using other open-source controls, like this CGridListCtrlEx control available from CodeProject.

As you write in your last paragraph, "check that the characters in the variable cvalue of type CString are '0' or '1'?".
That's exactly how. '0' is the character 0. But you check for the integer 0, which is equal to the character '\0'. That's the end-of-string character.
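A minimal sketch of that check in standard C++, assuming the cell texts have already been concatenated into a std::string. The name isBitString is hypothetical. Note also that the condition in the question joins the two comparisons with ||, which is true for every character, since no character can be equal to both values at once; the sketch uses && instead.

```cpp
#include <string>

// Hypothetical helper: true only if value has exactly expectedLength
// characters and every one of them is the character '0' or '1'.
bool isBitString(const std::string& value, std::size_t expectedLength)
{
    if (value.length() != expectedLength)
        return false;
    for (char c : value)
    {
        // Compare against character literals, not the integers 0 and 1.
        // Note the &&: a character is invalid only if it is neither '0' nor '1'.
        if (c != '0' && c != '1')
            return false;
    }
    return true;
}
```

For the eight-column register in the question, the call would be isBitString(text, 8).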

Related

Retrieve CString file path from XML file

I have an XML file with many values and a working C++ function that can retrieve these values
Two of these values are:
A file path such as: "C:\foo1\foo2" and
A file name: "foo3.txt"
Combining these together, they would become "C:\foo1\foo2\foo3.txt"
However, when I try to store a file path in a CString, I get an error, because a bare \ is not allowed in a string literal: the \ character starts an escape sequence.
I am using MFC, and I know WIN32 allows you to create a file path with / instead of \, so: "C:/foo1/foo2/foo3.txt" would work. I tested this in Windows Explorer and it worked.
I would like to read the file path from the XML file, but it arrives with \ instead of / in it, so I assume it will not be possible to replace the character (the incoming string will already be in error, since XML has no problem with the \ character).
How do I safely retrieve the path as a CString, ideally while converting any \ character to a / character?
Now I'm not familiar with the "CString" class you are referring to. Googling the API documentation just turns up the standard C-style char array functions, so I'm going to assume, rightly or wrongly, that a cstring is a char array.
The fact that we need to use an object that is not resizable means we either
Need to use the heap, which will be slow, and can leak memory if the memory isn't deleted later
Allow a maximum string length and accept that the result will be truncated if it exceeds this
Heap example (NOTE: I'm not using smart pointers as I assume they don't have access to them, else you'd just use std::string and not do this.)
#include <cstring>
#include <iostream>

char* escapeString(const char* data, unsigned int length){
    //In the worst case every character needs escaping, so reserve two
    //output characters per input character plus the null terminator.
    const unsigned int newLen = length * 2;
    char* escaped = new char[newLen + 1];
    unsigned int index = 0;
    for(unsigned int i = 0; i < length; i++){
        if(data[i] == '\\' || data[i] == '\"'){
            escaped[index++] = '\\';
        }
        else if(data[i] == '%'){
            escaped[index++] = '%';
        }
        //else anything else you want to escape
        escaped[index++] = data[i];
    }
    //Make sure the escaped string is null terminated
    escaped[index] = '\0';
    return escaped;
}
int main() {
    const char* stringWithBadChars = "I\"m not a %%good \\string";
    char* escapedString = escapeString(stringWithBadChars, strlen(stringWithBadChars));
    std::cout << escapedString;
    delete [] escapedString;
    return 0;
}
If we do this on the stack instead it would be a lot faster, but we are limited by the size of the buffer we pass in and by the size of the buffer inside the function. The function returns false if either limit is exceeded.
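bool escapeString(char* data, unsigned int length, unsigned int capacity){
    //capacity is the total size of the buffer that data points to
    const unsigned int maxLen = 1000;
    char escaped[maxLen + 1];
    unsigned int index = 0;
    for(unsigned int i = 0; i < length; i++){
        //Each input character can produce up to two output characters
        if(index + 2 > maxLen){
            return false;
        }
        if(data[i] == '\\' || data[i] == '\"'){
            escaped[index++] = '\\';
        }
        else if(data[i] == '%'){
            escaped[index++] = '%';
        }
        escaped[index++] = data[i];
    }
    //Fail if the result plus its null terminator does not fit in data
    if(index + 1 > capacity){
        return false;
    }
    //Make sure the escaped string is null terminated, then copy it back
    escaped[index] = '\0';
    memcpy(data, escaped, index + 1);
    return true;
}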
You could probably gain some more efficiency by using memmove rather than copying character by character. Doing it that way, you also wouldn't need the second char array.
CString reserves some special characters. Have a look at the Format command as an example. The linked documentation refers you to: Format specification syntax: printf and wprintf functions.
The \ is used as mentioned in the comments to indicate a special character. For example:
\t will insert a tab character.
\" will insert a double quote character.
So when it hits the \ it expects the next character to be one of the special ones. Therefore, when you actually need a backslash, you use \\.
The linked article does explain about % but not the slash. However, it is exactly the same with % because it too has special meaning. So you would use %% when you want the percent sign.
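To illustrate the point about \\, here is a minimal sketch in standard C++ using std::string as a stand-in for CString (which offers a comparable Replace member taking two characters). The helper name toForwardSlashes is hypothetical. The doubling of the backslash only applies to literals written in source code; a path read from a file at run time already contains plain single backslashes and can be processed directly.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: returns a copy of path with every backslash
// replaced by a forward slash.
std::string toForwardSlashes(std::string path)
{
    // '\\' is the source-code spelling of ONE backslash character.
    std::replace(path.begin(), path.end(), '\\', '/');
    return path;
}
```

For example, the literal "C:\\foo1" holds seven characters at run time, with a single backslash after the colon.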

Replacing chars from string

My code is the following (reduced):
CComVariant* input is an input parameter
CString cstrPath(input ->bstrVal);
const CHAR cInvalidChars[] = {"/*&#^°\"§$[]?´`\';|\0"};
for (unsigned int i = 0; i < strlen(cInvalidChars); i++)
{
cstrPath.Replace(cInvalidChars[i],_T(''));
}
When debugging, value of cstrPath is L"§", value of cInvalidChars[7] is -89 '§'
I have tried to use .Remove() before, but the problem remains the same: when it comes to § or ´, the code table does not seem to match, so the char is not recognized properly and is not removed. Using a TCHAR array for invalidChars results in yet other problems ('§' -> 'ᄡ').
The problem seems to be that I am not using the correct code tables, but nothing I have tried so far has worked.
I want to successfully replace/delete any occurring '§'.
I also have had a look at several "delete character from string"-Posts but I did not find anything that helped me.
executable code:
CComVariant* pccovaValue = new CComVariant();
pccovaValue->bstrVal = L"§§";
const CHAR cInvalidChars[] = {"§"};
CString cstrPath(pccovaValue->bstrVal);
for (unsigned int i = 0; i < strlen(cInvalidChars); i++)
{
cstrPath.Remove(cInvalidChars[i]);
}
cstrPath = cstrPath;
just break into cstrPath = cstrPath;
According to the comments you are mixing up Unicode and ANSI encodings. It seems that your application is targeting Unicode which is good. You should stop using ANSI altogether.
Declare cInvalidChars like this:
CString cInvalidChars = L"/*&#^°\"§$[]?´`\';|";
The use of the L prefix means that the string literal is a wide character UTF-16 literal.
Then your loop can look like this:
for (int i = 0; i < cInvalidChars.GetLength(); i++)
cstrPath.Remove(cInvalidChars[i]);
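The same approach can be sketched in standard C++ with std::wstring, assuming both the text and the list of invalid characters are wide strings so that no code-page conversion is involved. The helper name removeChars is hypothetical.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: removes from text every character listed in invalid.
std::wstring removeChars(std::wstring text, const std::wstring& invalid)
{
    text.erase(std::remove_if(text.begin(), text.end(),
                              [&invalid](wchar_t ch) {
                                  return invalid.find(ch) != std::wstring::npos;
                              }),
               text.end());
    return text;
}
```

Writing the section sign as L"\u00A7" instead of typing it directly also sidesteps any dependence on the encoding of the source file.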

Converting an unsigned char (BYTE) array to const wchar_t* (LPCWSTR)

Alright so I have a BYTE array that I need to ultimately convert into a LPCWSTR or const WCHAR* to use in a built in function. I have been able to print out the BYTE array with printf but now that I need to convert it into a string I am having problems... mainly that I have no idea how to convert something like this into a non array type.
BYTE ba[0x10];
for(int i = 0; i < 0x10; i++)
{
printf("%02X", ba[i]); // Outputs: F1BD2CC7F2361159578EE22305827ECF
}
So I need to have this same thing basically but instead of printing the array I need it transformed into a LPCWSTR or WCHAR or even a string. The main problem I am having is converting the array into a non array form.
LPCWSTR represents a UTF-16 encoded string. The array contents you have shown are outside the 7bit ASCII range, so unless the BYTE array is already encoded in UTF-16 (the array you showed is not, but if it were, you could just use a simple type-cast), you will need to do a conversion to UTF-16. You need to know the particular encoding of the array before you can do that conversion, such as with the Win32 API MultiByteToWideChar() function, or third-party libraries like iconv or ICU, or built-in locale convertors in C++11, etc. So what is the actual encoding of the array, and where is the array data coming from? It is not UTF-8, for instance, so it has to be something else.
Alright, I got it working. Now I can convert the BYTE array to a char* variable. Thanks for the help, guys, but the formatting wasn't a large problem in this instance. I appreciate the help though; it's always nice to have some extra input.
#include <cstring>

// Helper function: converts one byte to two uppercase hex digits plus a null terminator
void Char2Hex(unsigned char ch, char* szHex)
{
    unsigned char byte[2];
    byte[0] = ch/16; // high nibble
    byte[1] = ch%16; // low nibble
    for(int i = 0; i < 2; i++)
    {
        if(byte[i] <= 9)
        {
            szHex[i] = '0' + byte[i];
        }
        else
            szHex[i] = 'A' + byte[i] - 10;
    }
    szHex[2] = 0;
}
// Function used throughout code to convert a byte array to a hex string.
// pszHexStr must have room for at least 2 * iSize + 1 characters.
void CharStr2HexStr(unsigned char const* pucCharStr, char* pszHexStr, int iSize)
{
    int i;
    char szHex[3];
    pszHexStr[0] = 0;
    for(i = 0; i < iSize; i++)
    {
        Char2Hex(pucCharStr[i], szHex);
        strcat(pszHexStr, szHex);
    }
}
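Since the question asked for an LPCWSTR, here is a minimal sketch that builds the hex text directly as a wide string in standard C++. The helper name bytesToHexWide is hypothetical. On Windows, LPCWSTR is a const wchar_t*, so the result's c_str() can be passed wherever one is expected, as long as the std::wstring outlives the call.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: formats each byte as two uppercase hex digits.
std::wstring bytesToHexWide(const unsigned char* data, std::size_t size)
{
    static const wchar_t digits[] = L"0123456789ABCDEF";
    std::wstring result;
    result.reserve(size * 2);
    for (std::size_t i = 0; i < size; ++i)
    {
        result.push_back(digits[data[i] >> 4]);   // high nibble
        result.push_back(digits[data[i] & 0x0F]); // low nibble
    }
    return result;
}
```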

SetWindowText with a single dimensional array

Is it possible to display a single dimensional array of values using SetWindowText() in a text box with the Windows API?
For example: SetWindowText(hwndStatic3, sArray);
******************EDIT************
I have a text box in a Windows API application where I use GetWindowText() to retrieve the string written in the text box, and then I convert the string to a decimal array. I then convert each decimal value to a hexadecimal value, and I am trying to print those values using SetWindowText in another text box. However, only the last value of the array is printed. How can I print all the values?
******************EDIT************
code:
GetWindowText(hwndtext1, value, 256);
for (i = 15; i >= 0; i--)
{
temp[i] = atoll(value); //converts string to decimal
ulltoa(temp[i] , sArray, 16); //converts decimal to hexadecimal
buf[i] = temp[i];
}
SetWindowText(hwndStatic3, sArray);
SetWindowText is just a macro with signature:
BOOL SetWindowText(HWND, const TCHAR*);
Depending on your build settings, it will call one of the following:
BOOL SetWindowTextA(HWND, const char*); //ansi version
BOOL SetWindowTextW(HWND, const wchar_t*); //unicode version
where TCHAR is defined as:
#ifdef _UNICODE
typedef wchar_t TCHAR;
#else
typedef char TCHAR;
#endif
So, an array of strings is not compatible with SetWindowText but an array of characters will work, provided that the array is of type TCHAR *, or of type (char * or wchar_t *) that is compatible with your settings.
First, atoll and ulltoa aren't documented with the Microsoft Visual C/C++ (which is what I use for Windows) so I'm working from documentation I found online. Either your versions do more than those I've found documented, or you've left out some significant code from your example.
Based on the loop control, I'm guessing that you expect to always find 16 values in the string you read from the first control (the loop runs from 15 down to 0). BUT... the atoll and ulltoa functions only operate on one value at a time and do nothing to advance through the input list. So your loop is converting the first number from string to 64 bit int and then converting that into a string 16 times.
Since you say the last value is the only one you see, your functions must actually be parsing the value string in some way that is not apparent in your example. However, ulltoa seems to always be placing the value into the same place in the same string variable, with each subsequent call in the loop overwriting the previous call. My lazy self would add a bit like this:
int len = 0;
char szOutput[16*20]; // enough space for 16 64 bit hex strings
GetWindowText(hwndtext1, value, 256);
for (i = 15; i >= 0; i--)
{
    temp[i] = atoll(value); //converts string to decimal
    ulltoa(temp[i] , sArray, 16); //converts decimal to hexadecimal
    buf[i] = temp[i];
    len += sprintf( szOutput+len, "%s ", sArray );
}
szOutput[len-1] = '\0'; // remove the final space
SetWindowText(hwndStatic3, szOutput);
Of course, with the sprintf you could also skip the ulltoa call entirely and change the sprintf line to:
len += sprintf( szOutput+len, "%16.16I64X", temp[i] );
(or whatever flavor/form of the hex output you want; see the printf format documentation for details). If you want your list to be one item per line, then replace the trailing space with a newline. Oh, the I64 in the %16.16I64X is a Microsoft thing that might be different in other compilers/libraries.
FYI, the sprintf technique I used lets the function keep appending to the end of the buffer by incrementing the offset into the buffer (len) by the length of the string just appended, which is the value returned by sprintf. It is a quick and easy way to assemble string lists such as yours.
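Here is a minimal, self-contained sketch of that append-at-offset technique in standard C++. It swaps in snprintf with the standard %llX format in place of sprintf and the Microsoft-specific I64 prefix, so that the buffer size is checked. The helper name joinAsHex and the 512-byte buffer are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: writes each value as hex, separated by spaces,
// by appending to one buffer at a running offset.
std::string joinAsHex(const unsigned long long* values, std::size_t count)
{
    char buffer[512];
    std::size_t len = 0;
    buffer[0] = '\0';
    for (std::size_t i = 0; i < count; ++i)
    {
        // snprintf returns the number of characters it wanted to write.
        int written = std::snprintf(buffer + len, sizeof(buffer) - len,
                                    "%llX ", values[i]);
        if (written < 0 || static_cast<std::size_t>(written) >= sizeof(buffer) - len)
            break; // out of space (or error): stop appending
        len += static_cast<std::size_t>(written);
    }
    if (len > 0)
        buffer[len - 1] = '\0'; // drop the trailing space
    return std::string(buffer);
}
```

The resulting string's c_str() could then be handed to SetWindowText in an ANSI build.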

Convert wchar_t to char

I was wondering is it safe to do so?
wchar_t wide = /* something */;
assert(wide >= 0 && wide < 256 &&);
char myChar = static_cast<char>(wide);
If I am pretty sure the wide char will fall within ASCII range.
Why not just use a library routine such as wcstombs?
assert is for ensuring that something is true in a debug mode, without it having any effect in a release build. Better to use an if statement and have an alternate plan for characters that are outside the range, unless the only way to get characters outside the range is through a program bug.
Also, depending on your character encoding, you might find a difference between the Unicode characters 0x80 through 0xff and their char version.
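A minimal sketch of the "if statement with an alternate plan" suggested above, assuming the execution character set is ASCII-compatible and that only the 7-bit range should be accepted. The helper name tryNarrowAscii is hypothetical.

```cpp
// Hypothetical helper: narrows wide to a char only when it is in the
// 7-bit ASCII range; otherwise leaves out untouched and returns false.
bool tryNarrowAscii(wchar_t wide, char& out)
{
    // The cast to unsigned long also rejects negative values on
    // platforms where wchar_t is a signed type.
    if (static_cast<unsigned long>(wide) > 0x7F)
        return false;
    out = static_cast<char>(wide);
    return true;
}
```

Unlike the assert, this check stays active in release builds, and the caller decides what to do when it fails (substitute '?', report an error, and so on).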
You are looking for wctomb(): it's in the ANSI standard, so you can count on it. It works even when the wchar_t uses a code above 255. You almost certainly do not want to use it.
wchar_t is an integral type, so your compiler won't complain if you actually do:
char x = (char)wc;
but because it's an integral type, there's absolutely no reason to do this. If you accidentally read Herbert Schildt's C: The Complete Reference, or any C book based on it, then you're completely and grossly misinformed. Characters should be of type int or better. That means you should be writing this:
int x = getchar();
and not this:
char x = getchar(); /* <- WRONG! */
As far as integral types go, char is worthless. You shouldn't make functions that take parameters of type char, and you should not create temporary variables of type char, and the same advice goes for wchar_t as well.
char* may be a convenient typedef for a character string, but it is a novice mistake to think of this as an "array of characters" or a "pointer to an array of characters" - despite what the cdecl tool says. Treating it as an actual array of characters with nonsense like this:
for(int i = 0; s[i]; ++i) {
wchar_t wc = s[i];
char c = doit(wc);
out[i] = c;
}
is absurdly wrong. It will not do what you want; it will break in subtle and serious ways, behave differently on different platforms, and you will most certainly confuse the hell out of your users. If you see this, you are trying to reimplement wcstombs(), which is part of ANSI C already, but it's still wrong.
You're really looking for iconv(), which converts a character string from one encoding (even if it's packed into a wchar_t array), into a character string of another encoding.
Now go read this, to learn what's wrong with iconv.
An easy way is :
wstring your_wchar_in_ws(<your wchar>);
string your_wchar_in_str(your_wchar_in_ws.begin(), your_wchar_in_ws.end());
const char* your_wchar_in_char = your_wchar_in_str.c_str(); // c_str() returns a const pointer
I've been using this method for years :)
A short function I wrote a while back to pack a wchar_t array into a char array. Characters outside the ASCII range (0-127) are replaced by '?' characters, and a surrogate pair is replaced by a single '?'.
size_t to_narrow(const wchar_t * src, char * dest, size_t dest_len){
    size_t in = 0;  // index into src
    size_t out = 0; // index into dest
    if (dest_len == 0)
        return 0;
    while (src[in] != L'\0' && out < (dest_len - 1)){
        wchar_t code = src[in];
        if (code < 128)
            dest[out] = char(code);
        else{
            dest[out] = '?';
            // lead surrogate: skip the next code unit, which is the trail,
            // unless the string ends right after the lead
            if (code >= 0xD800 && code <= 0xDBFF && src[in + 1] != L'\0')
                in++;
        }
        in++;
        out++;
    }
    dest[out] = '\0';
    return out; // number of characters written, excluding the terminator
}
Technically, 'char' could have the same range as either 'signed char' or 'unsigned char'. For the unsigned characters, your range is correct; theoretically, for signed characters, your condition is wrong. In practice, very few compilers will object - and the result will be the same.
Nitpick: the last && in the assert is a syntax error.
Whether the assertion is appropriate depends on whether you can afford to crash when the code gets to the customer, and what you could or should do if the assertion condition is violated but the assertion is not compiled into the code. For debug work, it seems fine, but you might want an active test after it for run-time checking too.
Here's another way of doing it; remember to use free() on the result.
#include <cstdlib>

char* wchar_to_char(const wchar_t* pwchar)
{
    // count the characters in the string, including the terminator
    size_t charCount = 0;
    while (pwchar[charCount] != L'\0')
    {
        charCount++;
    }
    charCount++;
    // allocate one char per wide character
    char* result = (char*)malloc(sizeof(char) * charCount);
    if (result == NULL)
    {
        return NULL;
    }
    for (size_t i = 0; i < charCount; i++)
    {
        // narrow to char; only meaningful for values in the ASCII range
        result[i] = (char)pwchar[i];
    }
    return result;
}
one could also convert wchar_t --> wstring --> string --> char
wchar_t wide = L'A'; // example value
wstring wstrValue(1, wide); // a wstring holding the single wide character
string strValue;
strValue.assign(wstrValue.begin(), wstrValue.end()); // convert wstring to string
char char_value = strValue[0];
In general, no. int(wchar_t(255)) == int(char(255)) of course, but that just means they have the same int value. They may not represent the same characters.
You would see such a discrepancy in the majority of Windows PCs, even. For instance, on Windows Code page 1250, char(0xFF) is the same character as wchar_t(0x02D9) (dot above), not wchar_t(0x00FF) (small y with diaeresis).
Note that it does not even hold for the ASCII range, as C++ doesn't even require ASCII. On IBM systems in particular you may see that 'A' != 65.