Strange behavior using CString in swscanf directly - c++

I have a problem with CString and the STL's set.
It may look a bit strange to use CString and the STL together, but I was curious and tried it.
My code is below:
#include "stdafx.h"
#include <iostream>
#include <set>
#include <atlstr.h>
using namespace std;
int _tmain(int argc, _TCHAR* argv[])
{
wchar_t line[1024] = {0};
FILE * pFile = _wfopen(L"F:\\test.txt", L"rt");
set<CString> cstr_set;
while (fgetws(line, 1024, pFile))
{
CString cstr;
swscanf(line, L"%s\n", cstr);
cstr_set.insert(cstr);
}
fclose(pFile);
cout << "count" << cstr_set.size();
return 0;
}
The contents of test.txt are:
13245
123
2344
45
After the loop ends, cstr_set contains only one value.
It works as if cstr were a static or const variable.
What is the problem?

A CString is a Microsoft implementation wrapping a character array into a C++ object to allow simpler processing.
But swscanf is a good old C function that knows nothing about what a CString is: it just expects its arguments to be buffers large enough to accept the decoded values. It should never be passed a CString directly.
The correct way would be:
...
#include <cwchar> // for wcscspn
...
while (fgetws(line, 1024, pFile))
{
    line[wcscspn(line, L"\n")] = 0; // remove the optional end of line
    CString cstr(line);
    cstr_set.insert(cstr);
}
...
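If you really want to keep swscanf, a minimal sketch (my own, under the same file layout) is to scan into a plain wchar_t buffer and only then build the CString; alternatively, CString::GetBuffer/ReleaseBuffer can hand swscanf a writable buffer of a chosen size:
while (fgetws(line, 1024, pFile))
{
    wchar_t word[1024] = {0};
    if (swscanf(line, L"%1023s", word) == 1) // decode into a raw buffer swscanf understands
    {
        cstr_set.insert(CString(word));      // then wrap it in a CString for the set
    }
}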

Related

How to concatenate a string with the GetWindowsDirectoryA returned result?

I'm using the GetWindowsDirectoryA Windows API function to get the location of the Windows folder.
#include <iostream>
#include <string>
#include <vector>
#ifdef __WIN32
#include <fcntl.h>
#include <io.h>
#include <Windows.h>
#include <sysinfoapi.h>
#endif
std::string GetOSFolder() {
    std::vector<char> buffer(MAX_PATH + 1);
    GetWindowsDirectoryA(buffer.data(), MAX_PATH);
    std::string windowsRoot{ buffer.data(), buffer.size() };
    return windowsRoot + "/SomeFolder";
}
int main() {
    std::cout << GetOSFolder() << "\n";
}
I want to concatenate a folder name with the returned Windows folder string.
windowsRoot + "/SomeFolder"
The above attempt results in the following string:
C:\Windows
/SomeFolder
This seems to be happening because the buffer size is set to MAX_PATH, which is larger than the actual string.
Is there a way to construct the string from the buffer with the actual string size?
You could initialize the string directly from the buffer; you wouldn't even need to know the size (which GetWindowsDirectoryA returns, as stated in the comment section), but it's advisable to use it, since constructing the std::string from a pointer and a length avoids an extra scan for the terminator.
Example:
std::string GetOSFolder() {
    char buffer[MAX_PATH + 1];
    const auto size = GetWindowsDirectoryA(buffer, MAX_PATH);
    if (size) {
        return std::string(buffer, size).append("/SomeFolder");
    }
    return ""; // or however you want to handle the error
}
int main() {
    std::cout << GetOSFolder() << "\n";
}
Or you could avoid the separate char buffer and write straight into the std::string, but then you have to resize it afterwards:
std::string GetOSFolder() {
    std::string buffer(MAX_PATH + 1, 0);
    auto result = GetWindowsDirectoryA(buffer.data(), MAX_PATH); // or &buffer[0] prior to C++17
    if (result) {
        buffer.resize(result);
        return buffer.append("/SomeFolder");
    }
    return ""; // empty string if the API call fails
}
Now, if you want to avoid over-allocating a MAX_PATH-sized buffer, you can use some trickery:
std::string GetOSFolder() {
    const auto size = GetWindowsDirectoryA(nullptr, 0); // required size, including the null terminator
    if (size) {
        std::string buffer(size, 0);
        const auto result = GetWindowsDirectoryA(buffer.data(), size); // or &buffer[0] prior to C++17
        if (result) {
            buffer.resize(result); // drop the embedded null terminator
            return buffer.append("\\OtherPath");
        }
    }
    return "";
}
Here you call GetWindowsDirectoryA twice: the first time to learn the size you need for your buffer, the second to actually retrieve the path.
Note that in this last option the first call returns the length of the string including the null terminator, whereas the second call returns the length without the null byte, which is why the final resize is needed. This is typical of Win32 API calls of this kind; it's a well-known pattern.
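If you use this pattern in more than one place, you could wrap it in a small helper. This is only a sketch under the assumptions above (the name win32_string is mine, not a Win32 API, and the non-const data() requires C++17):
#include <string>
#include <Windows.h>
template <typename Call>
std::string win32_string(Call&& call) {
    const UINT size = call(nullptr, 0);              // required size, including the null terminator
    if (size == 0) return {};
    std::string buffer(size, '\0');
    const UINT written = call(buffer.data(), size);  // characters written, excluding the terminator
    buffer.resize(written);
    return buffer;
}
// Usage:
// std::string root = win32_string([](char* buf, UINT n) { return GetWindowsDirectoryA(buf, n); });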

How to convert a command line argument to an int?

I'm trying to convert the command line argument (*argv[]) to an integer using the atoi function:
int main(int argc, char *argv[]) {
This is my attempt
#include <iostream>
#include <sstream>
#include <string>
#include <cstdlib>
#include <conio.h>
using namespace std;
int main(int argc, char *argv[]) {
    int x = 0;
    for (x = 0; x < argc; x++)
    {
        int x = atoi(argv[1]);
        cout << x;
    }
    return 0;
}
However, this returns 0 and I'm unsure why. Thank you.
It's hard to say without seeing the arguments you pass to your program, but there are a few problems here.
Your loop goes from 0 to argc, but inside the loop you always use argv[1]. If you didn't pass any arguments, that indexes out of bounds, because argv[0] is always the path to your executable.
atoi is a function from C, and when it fails to parse its argument as an int it returns 0. Replace it with std::stoi and you will get an exception if the conversion fails. You can catch this exception with try/catch and then inspect the string that you tried to convert.
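A minimal sketch of that suggestion, looping over the real arguments and catching the two exceptions std::stoi can throw, could look like this:
#include <iostream>
#include <string>
#include <stdexcept>
int main(int argc, char* argv[]) {
    for (int i = 1; i < argc; ++i) {          // argv[0] is the program path, so start at 1
        try {
            int value = std::stoi(argv[i]);
            std::cout << value << "\n";
        } catch (const std::invalid_argument&) {
            std::cerr << "Not a number: " << argv[i] << "\n";
        } catch (const std::out_of_range&) {
            std::cerr << "Out of int range: " << argv[i] << "\n";
        }
    }
    return 0;
}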
Well, this
#include <iostream>
#include <sstream>
#include <string>
#include <cstdlib>
#include <conio.h>
using namespace std;
int main(int argc, char* argv[]) {
    int x = 0;
    for (x = 0; x < argc; x++)
    {
        cout << argv[x];
    }
    return 0;
}
just prints the path to the .exe; the path is a string, it has no numbers. And as I understood from my "research" about command line arguments, you need to run your program from a command line (a terminal) to populate the argv argument with anything else.
Link : https://www.tutorialspoint.com/cprogramming/c_command_line_arguments.htm
Also, as I understand it, argv[0] is always the path of the .exe.
I hope this is of some help; if I am mistaken about something, please tell me where and I will correct myself by editing the answer.

How to convert unicode code points to utf-8 in c++?

I have an array consisting of Unicode code points:
unsigned short array[3]={0x20ac,0x20ab,0x20ac};
I just want this to be converted to UTF-8 so that I can write it into a file byte by byte using C++.
Example:
0x20ac should be converted to e2 82 ac.
Or is there any other method that can directly write Unicode characters to a file?
Finally! With C++11!
#include <string>
#include <locale>
#include <codecvt>
#include <cassert>
int main()
{
    std::wstring_convert<std::codecvt_utf8<char32_t>, char32_t> converter;
    std::string u8str = converter.to_bytes(0x20ac);
    assert(u8str == "\xe2\x82\xac");
}
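To handle the whole array from the question and write the raw UTF-8 bytes to a file, a minimal sketch based on the same converter could look like this (note that std::codecvt_utf8 is deprecated since C++17, although it still works; the output file name out.txt is just an example):
#include <fstream>
#include <string>
#include <locale>
#include <codecvt>
int main()
{
    unsigned short array[3] = {0x20ac, 0x20ab, 0x20ac};
    std::wstring_convert<std::codecvt_utf8<char32_t>, char32_t> converter;
    std::string utf8;
    for (unsigned short cp : array)
        utf8 += converter.to_bytes(static_cast<char32_t>(cp)); // encode each code point
    std::ofstream out("out.txt", std::ofstream::binary);       // write the bytes unmodified
    out << utf8;
}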
The term Unicode refers to a standard for encoding and handling of text. This incorporates encodings like UTF-8, UTF-16, UTF-32, UCS-2, ...
I guess you are programming in a Windows environment, where Unicode typically refers to UTF-16.
When working with Unicode in C++, I would recommend the ICU library.
If you are programming on Windows, don't want to use an external library, and have no constraints regarding platform dependencies, you can use WideCharToMultiByte.
Example for ICU:
#include <iostream>
#include <unicode\ustream.h>
using icu::UnicodeString;
int main(int, char**) {
    //
    // Convert from UTF-16 to UTF-8
    //
    std::wstring utf16 = L"foobar";
    UnicodeString str(utf16.c_str());
    std::string utf8;
    str.toUTF8String(utf8);
    std::cout << utf8 << std::endl;
}
To do exactly what you want:
// Assuming you have ICU\include in your include path
// and ICU\lib(64) in your library path.
#include <iostream>
#include <fstream>
#include <unicode\ustream.h>
#pragma comment(lib, "icuio.lib")
#pragma comment(lib, "icuuc.lib")
void writeUtf16ToUtf8File(char const* fileName, wchar_t const* arr, size_t arrSize) {
    UnicodeString str(arr, arrSize);
    std::string utf8;
    str.toUTF8String(utf8);
    std::ofstream out(fileName, std::ofstream::binary);
    out << utf8;
    out.close();
}
The following code may help you:
#include <atlconv.h>
#include <atlstr.h>
#define ASSERT ATLASSERT
int main()
{
    const CStringW unicode1 = L"\x0391 and \x03A9"; // 'Alpha' and 'Omega'
    const CStringA utf8 = CW2A(unicode1, CP_UTF8);
    ASSERT(utf8.GetLength() > unicode1.GetLength());
    const CStringW unicode2 = CA2W(utf8, CP_UTF8);
    ASSERT(unicode1 == unicode2);
}
This code uses WideCharToMultiByte (I assume that you are using Windows):
unsigned short wide_str[3] = {0x20ac, 0x20ab, 0x20ac};
const wchar_t* wide_ptr = reinterpret_cast<const wchar_t*>(wide_str); // WideCharToMultiByte expects wchar_t*
int utf8_size = WideCharToMultiByte(CP_UTF8, 0, wide_ptr, 3, NULL, 0, NULL, NULL) + 1;
char* utf8_str = static_cast<char*>(calloc(utf8_size, 1));
WideCharToMultiByte(CP_UTF8, 0, wide_ptr, 3, utf8_str, utf8_size, NULL, NULL);
You need to call it twice: the first time to get the number of output bytes, and the second time to actually convert. If you know the output buffer size, you may skip the first call. Or you can simply allocate a buffer twice as large as the original plus 1 byte (for your case that means 12+1 bytes); that is always enough for UTF-16 to UTF-8.
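In C++ you can also let std::string own the output buffer instead of using calloc; a minimal sketch of the same two-call pattern (the helper name to_utf8 is mine):
#include <string>
#include <windows.h>
std::string to_utf8(const wchar_t* wide_str, int wide_len) {
    int utf8_size = WideCharToMultiByte(CP_UTF8, 0, wide_str, wide_len, NULL, 0, NULL, NULL);
    if (utf8_size <= 0) return {};
    std::string utf8(utf8_size, '\0');
    WideCharToMultiByte(CP_UTF8, 0, wide_str, wide_len, &utf8[0], utf8_size, NULL, NULL);
    return utf8;
}
// Usage:
// wchar_t wide_str[3] = {0x20ac, 0x20ab, 0x20ac};
// std::string utf8 = to_utf8(wide_str, 3);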
With standard C++:
#include <iostream>
#include <locale>
#include <string>
#include <vector>
int main()
{
    typedef std::codecvt<wchar_t, char, std::mbstate_t> Convert;
    std::wstring w = L"\u20ac\u20ab\u20ac";
    std::locale locale("en_GB.utf8");
    const Convert& convert = std::use_facet<Convert>(locale);
    std::mbstate_t state{}; // must be zero-initialized
    const wchar_t* from_ptr;
    char* to_ptr;
    std::vector<char> result(3 * w.size() + 1, 0);
    Convert::result convert_result = convert.out(state,
        w.c_str(), w.c_str() + w.size(), from_ptr,
        result.data(), result.data() + result.size(), to_ptr);
    if (convert_result == Convert::ok)
        std::cout << result.data() << std::endl;
    else
        std::cout << "Failure: " << convert_result << std::endl;
}
Iconv is a popular library used on many platforms.
I had a similar but slightly different problem: I had strings containing Unicode code points as escaped string representations, e.g. "F\u00f3\u00f3 B\u00e1r", and I needed to convert those escapes to their Unicode characters.
Here is my C# solution
using System.Globalization;
using System.Text.RegularExpressions;
static void Main(string[] args)
{
    Regex CodePoint = new Regex(@"\\u(?<UTF32>....)");
    Match Letter;
    string s = @"F\u00f3\u00f3 B\u00e1r"; // verbatim string, so the escapes stay literal
    string utf32;
    Letter = CodePoint.Match(s);
    while (Letter.Success)
    {
        utf32 = Letter.Groups[1].Value;
        if (Int32.TryParse(utf32, NumberStyles.HexNumber, CultureInfo.GetCultureInfoByIetfLanguageTag("en-US"), out int HexNum))
            s = s.Replace("\\u" + utf32, Char.ConvertFromUtf32(HexNum));
        Letter = Letter.NextMatch();
    }
    Console.WriteLine(s);
}
Output: Fóó Bár

How to convert accented chars from command line to wstring?

I'm trying to implement an application where I would like users to enter accented chars on the command line. What I'm trying to do is to convert the char array into a vector of wstring.
I'm on Linux.
Here is what I got so far:
#include <vector>
#include <string>
#include <cstring>
#include <iostream>
std::vector<std::wstring> parse_args(int argc, const char* argv[]){
    std::vector<std::wstring> args;
    for(int i = 0; i < argc - 1; ++i){
        auto raw = argv[i+1];
        wchar_t* buf = new wchar_t[1025];
        auto size = mbstowcs(buf, raw, 1024);
        args.push_back(std::wstring(buf, size));
        delete[] buf;
    }
    return std::move(args);
}
int main(int argc, const char* argv[]){
    auto args = parse_args(argc, argv);
    for(auto& arg : args){
        std::wcout << arg << std::endl;
    }
}
It works as expected with normal characters, but does not with accented chars. For instance, if I do:
./a.out Ménage
it crashes:
terminate called after throwing an instance of 'std::length_error'
what(): basic_string::_S_create
[1] 30564 abort ./a.out Ménage
The exception comes from the constructor of wstring because size = 18446744073709551615 ((size_t)-1, I think), which seems to indicate that there is an unexpected character.
I don't see where it goes wrong.
What am I doing wrong?
EDIT: It's getting better
If I add
setlocale(LC_ALL, "");
at the beginning of the program, it doesn't crash, but it does output a weird char:
M�nage
Could it be a problem with my console now?
The mbstowcs function uses the character encoding from the current locale. You are not setting the locale, so the default "C" locale gets used; the default locale supports ASCII characters only. Also, you should check the return value of mbstowcs, so it won't fail without you knowing it.
To fix this problem, set the locale in your program:
#include <clocale>
...
int main(int argc, const char* argv[]){
    setlocale(LC_ALL, ""); // Use locale from environment
    ....
}
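Putting both pieces of advice together, a minimal sketch of parse_args with the locale set and the mbstowcs return value checked (it returns (size_t)-1 on an invalid multibyte sequence) could look like this:
#include <vector>
#include <string>
#include <cstdlib>
#include <clocale>
#include <iostream>
std::vector<std::wstring> parse_args(int argc, const char* argv[]) {
    std::vector<std::wstring> args;
    for (int i = 1; i < argc; ++i) {
        wchar_t buf[1025];
        size_t size = mbstowcs(buf, argv[i], 1024);
        if (size == (size_t)-1) {                     // conversion failed: invalid sequence for the locale
            std::wcerr << L"could not convert argument " << i << std::endl;
            continue;
        }
        args.push_back(std::wstring(buf, size));
    }
    return args;
}
int main(int argc, const char* argv[]) {
    setlocale(LC_ALL, "");                            // use the locale from the environment
    for (auto& arg : parse_args(argc, argv)) {
        std::wcout << arg << std::endl;
    }
}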

c++ print buffer (integer to string)

Code:
#include "stdafx.h"
#include <windows.h>
#include <iostream>
#include <iomanip>
#include <locale>
#include <sstream>
#include <string>
int main()
{
HWND handle = FindWindow(0 ,TEXT("window name"));
if(handle == 0)
{
MessageBox(0,TEXT("Failed to find window"),TEXT("Return"),MB_OK);
}
else
{
DWORD ID;
GetWindowThreadProcessId(handle,&ID);
HANDLE hProcess = OpenProcess(PROCESS_VM_WRITE|PROCESS_VM_OPERATION , FALSE, ID);
hProcess = OpenProcess(PROCESS_VM_READ , FALSE, ID);
if(!hProcess)
{
Beep(1000,1000);
}else {
int buffer;
if (ReadProcessMemory(hProcess,(void *)0x00963FC4,&buffer,4,NULL))
{
printf(buffer);
}
else {
MessageBox(0,TEXT("Could not Read"),TEXT("Return"),MB_OK);
}
}CloseHandle(hProcess);
}
}
I tried to make this program that reads a memory address,
but I got this error:
IntelliSense: argument of type "int" is incompatible with parameter of type "const char *"
I tried printf(buffer);
I tried making a std::string and that also doesn't work:
string test;
First, try using the correct printf() call with a format string:
printf("%d", buffer);
C is a statically typed language and you cannot do Python-like stuff with printf() to output anything you want. The printf() function always prints its first "const char *" argument, substituting the remaining arguments into that string according to the format rules.
Second, I see the TEXT() macro in your code, so you might be using Unicode strings in your project setup. If so (you would get linker errors 2019/2005 in VC++), you have to use the wprintf() function:
wprintf(L"%d", buffer);
To print a std::string object with printf you must also convert it to a "const char *". This is done with the string::c_str() member function:
std::string MyString("Test");
printf("Your string is = %s", MyString.c_str());