How to convert accented chars from command line to wstring? - c++

I'm trying to implement an application where I would like users to enter accented chars on the command line. What I'm trying to do is to convert the char array into a vector of wstring.
I'm on Linux.
Here is what I got so far:
#include <vector>
#include <string>
#include <cstring>
#include <iostream>
std::vector<std::wstring> parse_args(int argc, const char* argv[]){
std::vector<std::wstring> args;
for(int i = 0; i < argc - 1; ++i){
auto raw = argv[i+1];
wchar_t* buf = new wchar_t[1025];
auto size = mbstowcs(buf, raw, 1024);
args.push_back(std::wstring(buf, size));
delete[] buf;
}
return std::move(args);
}
int main(int argc, const char* argv[]){
auto args = parse_args(argc, argv);
for(auto& arg : args){
std::wcout << arg << std::endl;
}
}
It works as expected with normal characters, but does not with accented chars. For instance, if I do:
./a.out Ménage
it crashes:
terminate called after throwing an instance of 'std::length_error'
what(): basic_string::_S_create
[1] 30564 abort ./a.out Ménage
The exception comes from the wstring constructor because size = 18446744073709551615 ((size_t)-1, I think), which seems to indicate that mbstowcs hit an unexpected character.
I don't see what is going wrong. What am I doing wrong?
EDIT: I made some progress.
If I add
setlocale(LC_ALL, "");
at the beginning of the program, it no longer crashes, but it prints a weird character:
M�nage
Could it be a problem with my console now?

The mbstowcs function uses the character encoding of the current locale. Your program never sets the locale, so the default "C" locale is used, and that locale is only guaranteed to handle ASCII characters. You should also check the return value of mbstowcs: it returns (size_t)-1 when it encounters an invalid multibyte sequence, and passing that value on as a length is what makes the wstring constructor throw.
To fix this problem, set the locale at the start of your program:
#include <clocale>
...
int main(int argc, const char* argv[]){
setlocale(LC_ALL,""); // Use locale from environment
....
}
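Putting both points together (locale plus return-value check), a minimal sketch might look like this. The helper name to_wide is hypothetical. Calling mbstowcs with a null destination to get the required length is POSIX behavior, which is fine on Linux but is an assumption beyond the original code. Skipping arguments that cannot be converted is also just one possible policy.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: converts one multibyte string using the current C locale.
// Returns false if the bytes are not valid in that locale's encoding.
bool to_wide(const char* raw, std::wstring& out) {
    // With a null destination, mbstowcs (POSIX) only counts the wide characters needed.
    const std::size_t len = std::mbstowcs(nullptr, raw, 0);
    if (len == static_cast<std::size_t>(-1)) {
        return false;  // invalid multibyte sequence for this locale
    }
    std::vector<wchar_t> buf(len + 1);  // +1 for the terminating L'\0'
    std::mbstowcs(buf.data(), raw, buf.size());
    out.assign(buf.data(), len);
    return true;
}

// Converts argv[1] .. argv[argc - 1]; arguments that fail to convert are skipped.
std::vector<std::wstring> parse_args(int argc, const char* const argv[]) {
    std::vector<std::wstring> args;
    for (int i = 1; i < argc; ++i) {
        std::wstring wide;
        if (to_wide(argv[i], wide)) {
            args.push_back(wide);
        }
    }
    return args;
}
```

Call setlocale(LC_ALL, "") once at the top of main, before parse_args(argc, argv). Without it the conversion still runs in the "C" locale, and an argument such as Ménage may again fail to convert.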

Related

Strange behavior using CString in swscanf directly

I have a problem with CString and the STL's set.
It may look a bit strange to use CString and the STL together, but I was curious and tried it.
My code is below:
#include "stdafx.h"
#include <iostream>
#include <set>
#include <atlstr.h>
using namespace std;
int _tmain(int argc, _TCHAR* argv[])
{
wchar_t line[1024] = {0};
FILE * pFile = _wfopen(L"F:\\test.txt", L"rt");
set<CString> cstr_set;
while (fgetws(line, 1024, pFile))
{
CString cstr;
swscanf(line, L"%s\n", cstr);
cstr_set.insert(cstr);
}
fclose(pFile);
cout << "count" << cstr_set.size();
return 0;
}
The contents of the test.txt is:
13245
123
2344
45
After the loop ends, cstr_set contains only one value.
It behaves as if cstr were a static or const variable.
What is the problem?
A CString is a Microsoft class that wraps a character array in a C++ object to make string handling simpler.
However, swscanf is a plain C function that knows nothing about CString: for %s it expects a pointer to a wchar_t buffer large enough to hold the decoded value. A CString should never be passed to it directly.
The correct way would be:
...
#include <cwchar> // wcscspn
...
while (fgetws(line, 1024, pFile))
{
line[wcscspn(line, L"\n")] = 0; // remove an optional end of line
CString cstr(line);
cstr_set.insert(cstr);
}
...
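For comparison, here is a portable sketch of the same idea that avoids CString and the scanf family entirely. It swaps in std::wstring for CString and std::getline for fgetws, so it is not the code from the question. The function names are hypothetical.

```cpp
#include <cstddef>
#include <istream>
#include <set>
#include <sstream>
#include <string>

// Reads lines from any wide input stream and returns the distinct ones.
std::set<std::wstring> read_unique_lines(std::wistream& in) {
    std::set<std::wstring> lines;
    std::wstring line;
    while (std::getline(in, line)) {  // getline drops the trailing L'\n'
        if (!line.empty() && line.back() == L'\r') {
            line.pop_back();          // tolerate Windows-style line endings
        }
        lines.insert(line);
    }
    return lines;
}

// Convenience wrapper for in-memory text, handy for quick experiments.
std::size_t count_unique_lines(const std::wstring& text) {
    std::wistringstream in(text);
    return read_unique_lines(in).size();
}
```

With the four lines of test.txt as input, read_unique_lines should return four entries, because each line is copied into its own string object before it is inserted into the set.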

How to convert a command line argument to an int?

I'm trying to convert a command line argument (argv[]) to an integer using the atoi function.
int main(int argc, char *argv[]) {
This is my attempt
#include <iostream>
#include <sstream>
#include <string>
#include <cstdlib>
#include <conio.h>
using namespace std;
int main(int argc, char *argv[]) {
int x = 0;
for ( x=0; x < argc; x++ )
{
int x = atoi(argv[1]);
cout << x;
}
return 0;
}
However, this returns 0 and I'm unsure why. Thank you.
It's hard to say without knowing the arguments you pass to your program, but there are a few problems here.
Your loop goes from 0 to argc, but inside the loop you always use argv[1]. argv[0] is conventionally the name or path of your executable, so if you don't pass any arguments, argv[1] is a null pointer and passing it to atoi is undefined behavior. The inner int x also shadows the loop counter.
atoi is a function from C, and when it fails to parse its argument as an int, it returns 0, so you cannot tell a failure from a real 0. Replace it with std::stoi and you will get an exception (std::invalid_argument or std::out_of_range) if the conversion fails. You can catch this exception with try/catch and then inspect the string that you tried to convert.
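A minimal sketch of the std::stoi approach could look like this. The helper name parse_int and the decision to reject trailing characters (such as "12abc") are assumptions for illustration, not part of the question.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: parses text as a base-10 int.
// Returns true and stores the value in `out` only if the whole string is a number.
bool parse_int(const std::string& text, int& out) {
    try {
        std::size_t used = 0;  // number of characters std::stoi consumed
        const int value = std::stoi(text, &used);
        if (used != text.size()) {
            return false;      // trailing characters such as "12abc"
        }
        out = value;
        return true;
    } catch (const std::invalid_argument&) {
        return false;          // no digits at all, e.g. "abc" or ""
    } catch (const std::out_of_range&) {
        return false;          // value does not fit in an int
    }
}
```

In main you would check argc > 1 first and then call parse_int(argv[1], value), printing an error message when it returns false.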
Well, this
#include <iostream>
using namespace std;
int main(int argc, char* argv[]) {
    // Print every entry of argv, one per line; argv[0] is the program itself.
    for (int x = 0; x < argc; x++)
    {
        cout << argv[x] << '\n';
    }
    return 0;
}
just prints the path to the .exe when you run it without arguments. The path is a string and contains no numbers. As far as I understand from my research on command line arguments, you need to run your program from a command line (a terminal) and pass the arguments there for argv to contain them.
Link: https://www.tutorialspoint.com/cprogramming/c_command_line_arguments.htm
Also, as far as I understand, argv[0] is always the path of the .exe.
I hope this helps. If I am mistaken about something, please tell me where and I will correct the answer.

std::ifstream::imbue takes no effect

I'm trying to read a text file written in UTF-16 as UTF-8.
#include <fstream>
#include <codecvt>
#include <cassert>
int main(int argc, char** argv)
{
std::ios_base::sync_with_stdio(false);
const wchar_t utf16_raw_string[] = L"Привет!";
const char expected_string [] = u8"Привет!";
std::ofstream("file.txt").write((char*)utf16_raw_string, sizeof(utf16_raw_string));
std::ifstream ifs("file.txt");
ifs.imbue(std::locale(std::locale::empty(), new std::codecvt<wchar_t, char, std::mbstate_t>() ));
std::string got_string;
ifs >> got_string;
assert(got_string == expected_string);
return 0;
}
It seems that imbue has no effect. No matter which codecvt I use, got_string is always "\x1f\x4#\x48\x42\x45\x4B\x4" and the assertion fails. Any ideas?
I'm using Visual Studio 2015 with update 3.

Linux File Read and Write - C++

I am supposed to create a program that reads the first 100 characters of source.txt, writes them to destination1.txt, then replaces every "2" with "S" and writes the result to destination2.txt. Below is my code:
#include <sys/types.h>
#include <unistd.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <cstdio>
#include <iostream>
using namespace std;
int main(int argc, const char* argv[]){
argv[0] = "source.txt";
argv[1] = "destination1.txt";
argv[2] = "destination2.txt";
int count=100;
char buff[125];
int fid1 = open(argv[0],O_RDWR);
read(fid1,buff,count);
close(fid1);
int fid2 = open(argv[1],O_RDWR);
write(fid2,buff,count);
close(fid2);
//How to change the characters?
return 0;
}
Thanks guys, I am able to do the copying. But how do I perform the character replacement? With fstream I would know how to do it with a for loop, but I'm supposed to use Linux system calls.
Define an array out_buf and copy buff into out_buf character by character, replacing each '2' with 'S'.
...
ssize_t n = read(fid1, buff, count); // number of bytes actually read
close(fid1);
if (n < 0)
    return 1; // read failed
char out_buf[125];
for (ssize_t i = 0; i < n; i++) {
    if (buff[i] == '2')
        out_buf[i] = 'S';
    else
        out_buf[i] = buff[i];
}
int fid2 = open(argv[2], O_RDWR); // argv[2] is destination2.txt
write(fid2, out_buf, n);
close(fid2);
return 0;
You should replace the filename assignments with something like this:
const std::string source_filename = "source.txt";
const std::string dest1_filename = "destination1.txt";
const std::string dest2_filename = "destination2.txt";
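Writing to argv[0], argv[1] and argv[2] is not safe: argv only holds argc + 1 pointers, so when the program is started without arguments, argv[1] is the terminating null pointer and argv[2] is out of bounds. Pass source_filename.c_str() (and so on) to open instead.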
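Combining the two answers, the whole task can be sketched with plain POSIX calls as follows. The helper names replace_chars and copy_replaced are hypothetical, and opening the destination with O_CREAT | O_TRUNC and mode 0644 is an assumption (the question used O_RDWR on an existing file).

```cpp
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>
#include <cstddef>

// Replaces every occurrence of `from` with `to` in the first n bytes of buf.
void replace_chars(char* buf, std::size_t n, char from, char to) {
    for (std::size_t i = 0; i < n; ++i) {
        if (buf[i] == from) {
            buf[i] = to;
        }
    }
}

// Hypothetical helper: copies up to 100 bytes from src to dst, applying the replacement.
// Returns the number of bytes written, or -1 on error.
ssize_t copy_replaced(const char* src, const char* dst, char from, char to) {
    char buf[100];
    const int in = open(src, O_RDONLY);
    if (in < 0) return -1;
    const ssize_t n = read(in, buf, sizeof buf);  // may read fewer than 100 bytes
    close(in);
    if (n < 0) return -1;

    replace_chars(buf, static_cast<std::size_t>(n), from, to);

    // Assumption: create the destination if needed and overwrite old contents.
    const int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out < 0) return -1;
    const ssize_t written = write(out, buf, static_cast<std::size_t>(n));
    close(out);
    return written;
}
```

For the assignment, copy_replaced("source.txt", "destination2.txt", '2', 'S') would produce the second file. Passing the same character twice, for example '2' and '2', gives the unmodified copy for destination1.txt.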

C++ CMD input along with exe

Sorry for such a newbie question, but how can I get my program to read the options I type after its name, the way cmd programs do, for example:
shutdown.exe -f
How do I read the -f in this example into my program?
This prints each of the whitespace-delimited arguments that were passed to your program, starting with the program name in argv[0].
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[])
{
for(int i = 0; i < argc; i++)
{
printf("%s\n", argv[i]);
}
return 0;
}
If you're using plain old C++, your main function will need to look something like this:
int main(int argc, char *argv[])
where argc is the number of arguments, including the program name, and argv is an array of pointers to each of them. The shell normally splits the command line on whitespace.
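To react to a specific option such as -f, you can scan argv for it. This is only a sketch: the helper name has_flag is hypothetical, and it does exact string matching with no support for combined or valued options.

```cpp
#include <string>

// Hypothetical helper: returns true if `flag` appears among the arguments.
// argv[0] is skipped because it conventionally holds the program name.
bool has_flag(int argc, const char* const argv[], const std::string& flag) {
    for (int i = 1; i < argc; ++i) {
        if (flag == argv[i]) {
            return true;
        }
    }
    return false;
}
```

Inside main you would then write something like if (has_flag(argc, argv, "-f")) { ... } to enable the forced behavior.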
int main(int argc, char *argv[])
{
}
The arguments are passed in argv.