CSocket Client-Server String recieved is chinese looking characters - c++

Simple CAsyncSocket Server and client program. Right now I'm testing locally using tera term vt. So I type a word in TT and it gets sent to my program but the string I receive is just a bunch of Chinese characters. I'm using MFC and compiling in Unicode. Now the funny thing is when I comply with multibyte character set the string is received just fine so I'm not sure what that means or what I can change to get that result.
Code where the receiving happens
void CClientSock::OnReceive(int nErrorCode)
{
TCHAR buf[1000];
memset(buf,'\0',1000);
CString recStr;
int bytesRead;
bytesRead = Receive(buf,1000);
switch(bytesRead)
{
case 0:
Close();
break;
case SOCKET_ERROR:
if(GetLastError() != WSAEWOULDBLOCK)
{
AfxMessageBox(L"Error occured");
Close();
}
break;
default:
buf[bytesRead] = '\0';
CString temp(buf);
recStr = temp;
CT2A Astring(recStr);
CString nString(Astring);
AfxMessageBox(nString);
}
CAsyncSocket::OnReceive(nErrorCode);
}

The data that you received from CAsyncSocket::Receive is probably multi-byte character, so just replace the TCHAR buf[1000]; with char buf[1000];
You also have created too many redundant CStrings for text conversion. It can be simplied to:
default:
buf[bytesRead] = '\0';
recStr = buf;
AfxMessageBox(recStr);

Related

C++ URLDownloadToFile to executable directory in CLR forum

I'm trying to figure out how I can fix two issues that I've been having with the URLDownloadToFile function in C++. The first is that when attempting to download the file in question, the download doesn't actually appear until the CLR C++ window is closed. The second is that as shown on the image below, the resulting file's file name and extension (Which is above the success window) is messed up (Although opening it normally shows that the file downloaded fine, with the name as the exception). If anyone has any suggestions on what I could do to fix these two issues, I'd greatly appreciate it.
For this logic, I'm using:
{
char buffer[MAX_PATH];
GetModuleFileNameA(NULL, buffer, MAX_PATH);
std::string::size_type pos = std::string(buffer).find_last_of("\\/");
return std::string(buffer).substr(0, pos);
}
void StartDownload()
{
HRESULT downloadUpdate;
LPCTSTR downloadUrl = "https://www.google.com/images/branding/googlelogo/1x/googlelogo_color_272x92dp.png", File = "download.png";
string currentDirectory = GetCurrentDirectory();
LPTSTR currentDirectoryLPTSTR = new TCHAR[currentDirectory.size() + 1];
std::string(currentDirectoryLPTSTR).append(File).c_str();
downloadUpdate = URLDownloadToFile(0, downloadUrl, currentDirectoryLPTSTR, 0, 0);
switch (downloadUpdate)
{
case S_OK:
updateSuccess();
break;
case E_OUTOFMEMORY:
updateOOMError();
break;
case INET_E_DOWNLOAD_FAILURE:
updateError();
break;
default:
updateErrorUnknown();
break;
}
}
[STAThread]
int main() {
Application::EnableVisualStyles();
Application::SetCompatibleTextRenderingDefault(false);
Application::Run(gcnew UpdaterGUIProject::UpdaterGUI());
StartDownload();
return 0;
}
LPTSTR currentDirectoryLPTSTR = new TCHAR[currentDirectory.size() + 1]; allocates memory, but does not fill it with any data.
std::string(currentDirectoryLPTSTR) creates a new string object and tries to copy data from currentDirectoryLPTSTR into the string. Which is undefined behavior since currentDirectoryLPTSTR is not a proper null-terminated string. This code does not cause the string object to point at the memory allocated for currentDirectoryLPTSTR, like you clearly think it does.
You are then append()ing File to that string object, not to the contents of currentDirectoryLPTSTR.
And then you throw away the string object you just created, and pass the unfilled currentDirectoryLPTSTR as-is to URLDownloadToFile(). Which is why your output file has a messed-up filename (you are lucky it even shows up in the correct folder at all).
Try this instead:
std::string GetCurrentDirectory()
{
char buffer[MAX_PATH] = {};
DWORD size = GetModuleFileNameA(NULL, buffer, MAX_PATH);
if (size == 0 || size == MAX_PATH) return "";
std::string fileName(buffer, size);
std::string::size_type pos = fileName.find_last_of("\\/");
return fileName.substr(0, pos + 1);
}
void StartDownload()
{
HRESULT downloadUpdate;
LPCSTR downloadUrl = "https://www.google.com/images/branding/googlelogo/1x/googlelogo_color_272x92dp.png", File = "download.png";
string localFile = GetCurrentDirectory() + File;
downloadUpdate = URLDownloadToFileA(0, downloadUrl, localFile.c_str(), 0, 0);
switch (downloadUpdate)
{
case S_OK:
updateSuccess();
break;
case E_OUTOFMEMORY:
updateOOMError();
break;
case INET_E_DOWNLOAD_FAILURE:
updateError();
break;
default:
updateErrorUnknown();
break;
}
}
On a side note, you really shouldn't be downloading files to the same folder that your program is running from. If your program is installed in a folder like C:\Program Files or C:\Program Files (x86), somewhere only admins can write to, then the download will be likely to fail. You should be downloading to a folder that the user has write access to, such as to a subfolder you create under %APPDATA% for your program to use.

How to handle pasted text correctly via GetConsoleInput()?

In Windows console, we can use GetConsoleInput() to get raw keyboard (and more) input. I want to use it to implement a custom function that read keystrokes with possible CTRL, SHIFT, ALT status. A simplified version of the function is
// for demo only, no error checking ...
struct ret {
wchar_t ch; // 2-byte UTF-16 in Windows
DWORD control_keys;
};
ret getch() {
HANDLE in = GetStdHandle(STD_INPUT_HANDLE);
INPUT_RECORD buf;
DWORD cnt;
for (;;) {
ReadConsoleInput(in, &buf, 1, &cnt);
if (buf.EventType != KEY_EVENT)
continue;
const KEY_EVENT_RECORD& rec = buf.Event.KeyEvent;
if (!rec.bKeyDown)
continue;
if (!rec.uChar.UnicodeChar)
continue;
return { rec.uChar.UnicodeChar,rec.dwControlKeyState };
}
}
It works fine, except that when I try to paste character not representable in 2 bytes in UTF-16, the UnicodeChar field is 0 when bKeyDown==true, and the UnicodeChar field is the pasted content when bKeyDown==false. Can anyone tell why it is the case and suggest possible workarounds?
Here is some demo code and result.

std::wcout, why is the printed character not the same as the input? [duplicate]

I tried to printf with some accented characters such as á é í ó ú:
printf("my name is Seán\n");
The text editor in the DEVC++ IDE displays them fine - i.e the source code looks fine.
I guess I need some library other than stdio.h and maybe some variant of the normal printf.
I'm using IDE Bloodshed DEVC running on Windows XP.
Perhaps the best is to use Unicode.
Here's how...
First, manually set your console font to "Consolas" or "Lucida Console" or whichever True-Type Unicode font you can choose ("Raster fonts" may not work, those aren't Unicode fonts, although they may include characters you're interested in).
Next, set the console code page to 65001 (UTF-8) with SetConsoleOutputCP(CP_UTF8).
Then convert your text to UTF-8 (if it's not yet in UTF-8) using WideCharToMultiByte(CP_UTF8, ...).
Finally, call WriteConsoleA() to output the UTF-8 text.
Here's a little function that does all these things for you, it's an "improved" variant of wprintf():
int _wprintf(const wchar_t* format, ...)
{
int r;
static int utf8ModeSet = 0;
static wchar_t* bufWchar = NULL;
static size_t bufWcharCount = 256;
static char* bufMchar = NULL;
static size_t bufMcharCount = 256;
va_list vl;
int mcharCount = 0;
if (utf8ModeSet == 0)
{
if (!SetConsoleOutputCP(CP_UTF8))
{
DWORD err = GetLastError();
fprintf(stderr, "SetConsoleOutputCP(CP_UTF8) failed with error 0x%X\n", err);
utf8ModeSet = -1;
}
else
{
utf8ModeSet = 1;
}
}
if (utf8ModeSet != 1)
{
va_start(vl, format);
r = vwprintf(format, vl);
va_end(vl);
return r;
}
if (bufWchar == NULL)
{
if ((bufWchar = malloc(bufWcharCount * sizeof(wchar_t))) == NULL)
{
return -1;
}
}
for (;;)
{
va_start(vl, format);
r = vswprintf(bufWchar, bufWcharCount, format, vl);
va_end(vl);
if (r < 0)
{
break;
}
if (r + 2 <= bufWcharCount)
{
break;
}
free(bufWchar);
if ((bufWchar = malloc(bufWcharCount * sizeof(wchar_t) * 2)) == NULL)
{
return -1;
}
bufWcharCount *= 2;
}
if (r > 0)
{
if (bufMchar == NULL)
{
if ((bufMchar = malloc(bufMcharCount)) == NULL)
{
return -1;
}
}
for (;;)
{
mcharCount = WideCharToMultiByte(CP_UTF8,
0,
bufWchar,
-1,
bufMchar,
bufMcharCount,
NULL,
NULL);
if (mcharCount > 0)
{
break;
}
if (GetLastError() != ERROR_INSUFFICIENT_BUFFER)
{
return -1;
}
free(bufMchar);
if ((bufMchar = malloc(bufMcharCount * 2)) == NULL)
{
return -1;
}
bufMcharCount *= 2;
}
}
if (mcharCount > 1)
{
DWORD numberOfCharsWritten, consoleMode;
if (GetConsoleMode(GetStdHandle(STD_OUTPUT_HANDLE), &consoleMode))
{
fflush(stdout);
if (!WriteConsoleA(GetStdHandle(STD_OUTPUT_HANDLE),
bufMchar,
mcharCount - 1,
&numberOfCharsWritten,
NULL))
{
return -1;
}
}
else
{
if (fputs(bufMchar, stdout) == EOF)
{
return -1;
}
}
}
return r;
}
Following tests this function:
_wprintf(L"\xA0\xA1\xA2\xA3\xA4\xA5\xA6\xA7"
L"\xA8\xA9\xAA\xAB\xAC\xAD\xAE\xAF"
L"\xB0\xB1\xB2\xB3\xB4\xB5\xB6\xB7"
L"\xB8\xB9\xBA\xBB\xBC\xBD\xBE\xBF"
L"\n"
L"\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7"
L"\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF"
L"\xD0\xD1\xD2\xD3\xD4\xD5\xD6\xD7"
L"\xD8\xD9\xDA\xDB\xDC\xDD\xDE\xDF"
L"\n"
L"\xE0\xE1\xE2\xE3\xE4\xE5\xE6\xE7"
L"\xE8\xE9\xEA\xEB\xEC\xED\xEE\xEF"
L"\xF0\xF1\xF2\xF3\xF4\xF5\xF6\xF7"
L"\xF8\xF9\xFA\xFB\xFC\xFD\xFE\xFF"
L"\n");
_wprintf(L"\x391\x392\x393\x394\x395\x396\x397"
L"\x398\x399\x39A\x39B\x39C\x39D\x39E\x39F"
L"\x3A0\x3A1\x3A2\x3A3\x3A4\x3A5\x3A6\x3A7"
L"\x3A8\x3A9\x3AA\x3AB\x3AC\x3AD\x3AE\x3AF\x3B0"
L"\n"
L"\x3B1\x3B2\x3B3\x3B4\x3B5\x3B6\x3B7"
L"\x3B8\x3B9\x3BA\x3BB\x3BC\x3BD\x3BE\x3BF"
L"\x3C0\x3C1\x3C2\x3C3\x3C4\x3C5\x3C6\x3C7"
L"\x3C8\x3C9\x3CA\x3CB\x3CC\x3CD\x3CE"
L"\n");
_wprintf(L"\x410\x411\x412\x413\x414\x415\x401\x416\x417"
L"\x418\x419\x41A\x41B\x41C\x41D\x41E\x41F"
L"\x420\x421\x422\x423\x424\x425\x426\x427"
L"\x428\x429\x42A\x42B\x42C\x42D\x42E\x42F"
L"\n"
L"\x430\x431\x432\x433\x434\x435\x451\x436\x437"
L"\x438\x439\x43A\x43B\x43C\x43D\x43E\x43F"
L"\x440\x441\x442\x443\x444\x445\x446\x447"
L"\x448\x449\x44A\x44B\x44C\x44D\x44E\x44F"
L"\n");
And should result in the following text in the console:
 ¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿
ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞß
àáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ
ΑΒΓΔΕΖΗΘΙΚΛΜΝΞΟΠΡ΢ΣΤΥΦΧΨΩΪΫάέήίΰ
αβγδεζηθικλμνξοπρςστυφχψωϊϋόύώ
АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ
абвгдеёжзийклмнопрстуфхцчшщъыьэюя
I do not know the encoding in which your IDE stores non-ASCII characters in .c/.cpp files and I do not know what your compiler does when encounters non-ASCII characters. This part you should figure out yourself.
As long as you supply to _wprintf() properly encoded UTF-16 text or call WriteConsoleA() with properly encoded UTF-8 text, things should work.
P.S. Some gory details about console fonts can be found here.
Windows console is generally considered badly broken regarding to character encodings. You can read about this problem here, for example.
The problem is that Windows generally uses the ANSI codepage (which is assuming you are in Western Europe or America Windows-1252), but the console uses the OEM codepage (CP850 under the same assumption).
You have several options:
Convert the text to CP850 before writing it (see CharToOem()). The drawback is that if the user redirects the output to a file (> file.txt) and opens the file with e.g. Notepad, he will see it wrong.
Change the codepage of the console: You need to select a TTF console font (Lucida Console, for example) and use the command chcp 1252.
Use UNICODE text and wprintf(): You need the TTF console font anyway.
The Windows-1252 (also known as "ANSI") character set used by Windows console mode is not the same as that used by GUI applications. Hence the IDE representation differs from the runtime representation.
A quick-and-dirty solution for your example is:
printf("my name is Se\xe9n\n");
Most solutions to this problem are flawed one way or another and the simplest solution for Windows applications that need extensive multi-language localisation is to write them as GUI apps using Unicode.

Call HPDF_SaveToFile() with japanese filename

Im trying to save one pdf in path that contains japanese username. In this case, HPDF_SaveToFile is doing crash my app on windows. Any options to compile or other thing? Any idea to support Unicode filenames with libhaur? I not want to create pdf with japanese encode, I want to write pdf with japanese filename.
A solution in Qt. If you use C++, you can use fstream/ofstream(::write). If you use C, you can use fwrite.
QFile file(path);
if (file.open(QIODevice::WriteOnly))
{
HPDF_SaveToStream(m_pdf);
/* get the data from the stream and write it to file. */
for (;;)
{
HPDF_BYTE buf[4096];
HPDF_UINT32 siz = 4096;
HPDF_STATUS ret = HPDF_ReadFromStream(m_pdf, buf, &siz);
if (siz == 0)
{
break;
}
if (-1 == file.write(reinterpret_cast<const char *>(buf), siz))
{
qDebug() << "Write PDF error";
break;
}
}
}
HPDF_Free(m_pdf);
Refrence: Libharu Usage examples

Handle the content of a txt file using windows API

I use the fragment of code as follows to get the contents of a text file. However the buffer buff at the end has only the number 8 at one place and nothing else. The file being opened has the word "Project" as the only content. How can I handle (i.e. print) the content or the result I should receive? What is wrong with the following code:
TCHAR buff[20];
DWORD dwNumRead;
CString finalPath = path + L"\\" + fileName.c_str();
HANDLE hfile=CreateFile(finalPath ,GENERIC_READ|GENERIC_WRITE,0,NULL,OPEN_EXISTING,FILE_ATTRIBUTE_NORMAL,NULL);
if(ReadFile(hfile,buff,20,&dwNumRead,NULL))
{
CString temp;
temp.Format(L"%s",&buff[0]);
ATLTRACE(L"Success %s", temp);
}
CloseHandle(hfile);
The trouble is that you are trying to print the MFC CString which is composed of wide character with %s macro. You need the %S macro to print the wide character.
This works :
char buff[20] = "";
DWORD dwNumRead;
CString finalPath = path + L"\\" + fileName.c_str();;
HANDLE hfile=CreateFile(finalPath ,GENERIC_READ|GENERIC_WRITE,0,NULL,OPEN_EXISTING,FILE_ATTRIBUTE_NORMAL,NULL);
if(ReadFile(hfile,buff,20,&dwNumRead,NULL))
{
CString temp = buff;
ATLTRACE("Success %S", temp);
}
CloseHandle(hfile);
Otherwise, compile your program in unicode with the following extra C++ defs.
UNICODE,_UNICODE