I'm trying to read a character of the Russian alphabet from the console. This is my code:
#include <iostream>
#include <windows.h>
#include <locale.h>
using namespace std;
void main(){
char c;
setlocale(LC_ALL,"rus");
cout << "Я хочу видеть это по-русски!" << endl;
cin >> c;
cout << c;
}
I entered 'ф', but it prints 'д'. I tried to use
char buf[2];
char str[2];
str[0] = c;
str[1] = '\0';
OemToAnsi(buf, str);
But this is what I get:
+ str 0x0015fef4 "¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦ф¦¦¦¦d §" char [2]
+ buf 0x0015ff00 "¦¦¦ф¦¦¦¦d §" char [2]
And then I get the error: Run-Time Check Failure #2 - Stack around the variable 'str' was corrupted.
I assume the set-up you're using is to have the source saved using cp1251 (Cyrillic Windows) and to have the console using cp866 (Cyrillic DOS). (This will be the default set up on Russian versions of Windows.) The problem you're running into seems to be that setting the locale as you do causes output to be converted from cp1251 to cp866, but does not cause the inverse conversion for input. So when you read a character in, the program gets the cp866 representation. This cp866 representation, when output, is incorrectly treated as a cp1251 representation and converted to cp866, resulting in the ф to д transformation.
I think the conversion is done by the CRT based on the C locale, but I don't know how to enable a similar conversion for input. There are several options for getting your program to work:
1) Manually convert input data from cp866 to cp1251 before echoing it.
2) Replace setlocale(LC_ALL,"rus"), which changes how the CRT deals with output, with calls to SetConsoleCP(1251); SetConsoleOutputCP(1251);, which instead change the console's behavior (the changes persist for the lifetime of the console rather than the lifetime of your program).
3) Replace uses of cin and cout with Windows APIs that use UTF-16. Microsoft's implementation of the standard library forces the use of legacy encodings and causes all sorts of similar problems on Windows, so just avoid it altogether.
Here's an example of the second option:
#include <iostream>
#include <clocale>
#include <Windows.h>
int main(){
char c;
SetConsoleCP(1251);
SetConsoleOutputCP(1251);
std::cout << "Я хочу видеть это по-русски!\n";
std::cin >> c;
std::cout << c;
}
Assuming the source is cp1251 encoded then the output will appear correctly and an input ф will not be transformed into a д.
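The first option (manual conversion) can be sketched as follows. This is only an illustration under the assumption stated above, that input arrives as cp866 bytes and the program works in cp1251. The function names are made up for this example, and only the Russian letters are remapped.

```cpp
#include <cassert>
#include <string>

// Convert one byte from cp866 (DOS Cyrillic) to cp1251 (Windows Cyrillic).
// Only the Russian letters are remapped; every other byte is returned unchanged.
unsigned char cp866_char_to_cp1251(unsigned char c) {
    if (c >= 0x80 && c <= 0xAF) return static_cast<unsigned char>(c + 0x40); // А..Я, а..п
    if (c >= 0xE0 && c <= 0xEF) return static_cast<unsigned char>(c + 0x10); // р..я
    if (c == 0xF0) return 0xA8; // Ё
    if (c == 0xF1) return 0xB8; // ё
    return c;
}

// Convert a whole string of cp866 bytes to cp1251 bytes.
std::string cp866_to_cp1251(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (unsigned char c : in) {
        out += static_cast<char>(cp866_char_to_cp1251(c));
    }
    return out;
}
```

In the original program this would be applied right after cin >> c, for example c = static_cast<char>(cp866_char_to_cp1251(static_cast<unsigned char>(c)));, so that the byte 0xE4 read from the console (ф in cp866) becomes 0xF4 (ф in cp1251) before it is echoed.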
The locale might be wrong. Try
setlocale(LC_ALL, "");
This sets the locale to "the default, which is the user-default ANSI code page obtained from the operating system".
const int N = 34;
const char DosABC[N] = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя";
const char WinABC[N] = " ЎўЈ¤Ґс¦§Ё©Є«¬®Їабвгдежзийклмноп";
std::string ToDosStr(std::string input)
{
std::string output = "";
bool Ok;
for (unsigned i = 0; i < input.length(); i++)
{
Ok = false;
for (int j = 0; j < N; j++)
if (input[i] == WinABC[j])
{
output += DosABC[j];
Ok = true;
}
if (!Ok)
output += input[i];
}
return output;
}
This works for me, but everyone is welcome to suggest a simpler answer.
See the code below and help me understand why I am not getting the right result. Or suggest any other C++ function to convert a C-string like "$567,789,675.89" into a long double.
long double mstold( char s[] )
{
int len = strlen(s);
long double cash;
int n=0;
char amount[ 100 ];
for( int i=0; i<len; i++) // for copying the passed C-String into another C-string
{
amount[n] = s[i];
n++;
if( s[i] == '$' || s[i] == ',') // Because the C-String has been passed in format: "$567,789,564.987"
n--;
}
cash = _atold( amount ); // This does not gives the right result
return cash;
}
Use the strtold() function, since _atold() is non-standard. The code below works in Compiler Explorer. You were not terminating the amount array with '\0', which is probably why _atold() did not work.
#include <cstdlib>
#include <iostream>
#include <cstring>
using namespace std;
long double mstold(const char* s)
{
int len = strlen(s);
long double cash;
int n = 0;
char* amount = new char[len+1];
for (int i = 0; i<len; i++) // for copying the passed C-String into another C-string
{
amount[n] = s[i];
n++;
if (s[i] == '$' || s[i] == ',') // Because the C-String has been passed in format: "$567,789,564.987"
n--;
}
amount[n] = '\0';
cash = strtold(amount, NULL); // parse the cleaned-up, null-terminated copy
delete[] amount;
return cash;
}
int main()
{
long double cash = mstold("$567,789,675.89");
std::cout << cash << std::endl;
}
First, a note: please do not use C-style strings. In C++ we use std::string. That said, C-style strings also work here, because they convert to std::string implicitly.
For beginners, the best approach is to transform the monetary input string into a plain number string (digits and a single decimal point only) and then use the function stold for the conversion.
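The simpler approach mentioned above (strip the formatting, then call std::stold) could look like this minimal sketch. The function name money_to_long_double is made up for illustration, and the code assumes that '$' and ',' are the only characters that need to be removed.

```cpp
#include <cassert>
#include <string>

// Strip the currency symbol and thousands separators, then let std::stold parse.
// Assumes '$' and ',' are the only decorations; std::stold throws
// std::invalid_argument if no number remains.
long double money_to_long_double(const std::string& text) {
    std::string digits;
    for (char ch : text) {
        if (ch != '$' && ch != ',') {
            digits += ch;
        }
    }
    return std::stold(digits);
}
```

A call such as money_to_long_double("$567,789,675.89") also accepts a C-style string, since it converts to std::string implicitly.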
But in the real C++ world, you would do 2 things:
use dedicated C++ facilities
use localization
Unfortunately this is a rather complex topic and you need a while to understand.
You need to read about the localization library. Here you will learn about 2 major concepts:
locales
facets
In general, the textual representations of dates, monetary values and numbers are governed by regional or cultural conventions. All these properties are contained in a std::locale object.
But the std::locale does not offer much functionality. The real localization facilities are offered in the form of facets. And a std::locale encapsulates several facets. And one of them is about the monetary formatting.
You really can do many things with that and in the end get fully customized behaviour. But, as said, not that easy to understand.
I will use the std::money_get class in my below example.
Please note: this converts your number into the smallest currency units (cents here), without a fractional part. Financial calculations should generally avoid binary floating-point fractions, because double or even long double cannot represent every decimal value exactly (0.1, for instance). Please read about this as well.
Anyway, I will show you an example of how such a monetary value would be converted in C++. You may be shocked by the complexity, but flexibility has its price . . .
Please see:
#include <iostream>
#include <iomanip>
#include <limits>
#include <string>
#include <locale>
#include <sstream>
int main() {
// Input String
char valueAsCString[] = "$567,789,675.89";
// Put it in an istringstream for formatted extraction
std::istringstream iss{ valueAsCString };
// Assume US currency format and set it for the stream. Locale names are platform-specific: typically "en-US" on Windows and "en_US.UTF-8" on POSIX systems
std::locale myLocale("en_US");
iss.imbue(myLocale);
// Assume that everything is good
std::ios_base::iostate ioError{ std::ios_base::goodbit };
// Here we will get the value in UNITS, so without a fraction!!!
long double value{};
// Parse the money string and get the result
std::use_facet<std::money_get<char>>(myLocale).get(std::istreambuf_iterator<char>(iss), {}, false, iss, ioError, value);
// Check Error state
iss.setstate(ioError);
if (iss)
// Show result
std::cout << std::setprecision(std::numeric_limits<long double>::digits10 + 1) << std::setw(25) << value / 100 << '\n';
else
std::cerr << "\nError during conversion\n";
}
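To illustrate the point about keeping monetary amounts in whole units, here is a small hand-written parser that produces integer cents without going through floating point at all. It is not the std::money_get approach shown above, just a sketch; the name money_to_cents is made up, and the code assumes two fraction digits per unit and ignores every character that is not a digit or '.'.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Parse a string such as "$567,789,675.89" into whole cents using integer
// arithmetic only. Digits beyond the second fraction digit are ignored.
long long money_to_cents(const std::string& text) {
    long long cents = 0;
    int fractionDigits = -1; // -1 means the decimal point has not been seen yet
    for (unsigned char ch : text) {
        if (ch == '.') {
            fractionDigits = 0;
        } else if (std::isdigit(ch)) {
            if (fractionDigits >= 2) break; // anything beyond cents is dropped
            cents = cents * 10 + (ch - '0');
            if (fractionDigits >= 0) ++fractionDigits;
        }
    }
    // Pad inputs like "12" or "12.5" up to two fraction digits
    if (fractionDigits < 0) fractionDigits = 0;
    for (; fractionDigits < 2; ++fractionDigits) {
        cents *= 10;
    }
    return cents;
}
```

Sums and comparisons on the resulting long long values are exact, and the amount is only divided by 100 when it is formatted for display.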
Error screenshot.
I am testing a program that prints an emoji in C++. My coding environment is Dev C++. But when I execute it, it shows a question mark instead of the desired emoji.
Any helpful suggestion to fix my problem would be appreciated.
Here's my code:
#include <iostream>
#include <conio.h>
#include <stdlib.h>
using namespace std;
int main(void)
{
system("cls");
int sml = 1, i, limit;
char ch = sml;
cout << "How many smiley face you want to print ? ";
cin >> limit;
for (i = 0; i < limit; i++)
{
cout << ch << ' ';
}
return 0;
}
I've got this working for a console app on Windows 11 with the new Terminal and Visual Studio using UTF8. See https://utf8everywhere.org.
You want to make sure the language and the console are set for UTF8:
auto UTF8 = std::locale("en_US.UTF-8");
std::locale::global(UTF8);
std::cout.imbue(UTF8);
setlocale(LC_ALL, "en_us.utf8");
SetConsoleOutputCP(CP_UTF8);
I use std::u8string and entered emojis with Win+.
static std::u8string sDead(u8"🖤");
static std::u8string sLive(u8"😀");
static std::u8string sBorn(u8"💕");
static std::u8string sOld(u8"🤡");
static std::u8string sDying(u8"🤢");
static std::u8string sUnknown(u8"⁉️");
And you need to make sure your files are saved as UTF8 in Visual Studio. Click File, Save As... click the down arrow next to save, and click Save with Encoding...
You also need to make sure your Console and Font support emojis.
Full code is here https://github.com/mgradwohl/TerminalLife
Your program doesn't have any problem in itself, but I can't see the expected output in VS Code. Alternatively, you can use Unicode and print ☺ (U+263A) as its UTF-8 bytes:
#include<iostream>
#include<conio.h>
#include<stdlib.h>
using namespace std;
int main()
{
system("cls");
int i, limit;
cout<<"How many smiley face you want to print ? ";
cin>>limit;
for(i=0; i<limit; i++)
{
// Print ☺ in UTF-8
cout << "\342\230\272" << "\t";
}
return 0;
}
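The octal escapes "\342\230\272" in the code above are simply the UTF-8 encoding of U+263A. As a rough sketch of where such byte sequences come from, the helper below encodes an arbitrary code point. The name encode_utf8 is made up for this example, and invalid input (surrogates, values above U+10FFFF) is not rejected.

```cpp
#include <cassert>
#include <string>

// Encode one Unicode code point as UTF-8 bytes.
// Sketch only: surrogates and values above U+10FFFF are not rejected.
std::string encode_utf8(char32_t cp) {
    std::string out;
    if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

With this, cout << encode_utf8(0x263A) writes the same three bytes as the literal above. Whether a smiley actually appears still depends on the console's code page and font.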
The program you've written is correct; it is the console that is at fault. It is unable to display a glyph for the character code you've passed: 1 is an ASCII control code, and only consoles using a code page such as 437 draw it as a smiley. I suggest you change the console (you will have to search for how to do that).
Take a look at my console output:
I have a weird issue I cannot figure out. The code below reads file.txt line by line into a vector<string> and then compares each element to the string "--", but the comparison never succeeds.
Furthermore, in convert_file(), inside the for loop, the string m behaves strangely: string m = "1"; m+= "--"; (the "--" coming from the vector) m+= "2"; prints 2-- to the console. The 2 replaces the 1, the first character, which makes it look like the vector is bugged.
#include <iostream>
#include <sstream>
#include <fstream>
#include <string>
#include <vector>
using namespace std;
vector<string> get_file(const char* file){
int SIZE=256, ln=0;
char str[SIZE];
vector<string> strs;
ifstream in(file, ios::in);
if(!in){
return strs;
} else {
while(in.getline(str,SIZE)){
strs.push_back(string(str));
ln++;
}
}
in.close();
return strs;
}
void convert_file(const char* file){
vector<string> s = get_file(file);
vector<string> d;
int a, b;
bool t = false;
string comp = "--";
for(int i=0; i<s.size(); i++){
string m = "1";
m+= string(s.at(i));
m+= "2";
cout << m << endl;
if(s.at(i) == comp){
cout << "s[i] == '--'" << endl;
}
}
}
int main(){
convert_file("test.txt");
return 0;
}
Now, when I run a test file to check a similar program:
#include <iostream>
#include <string>
#include <vector>
using namespace std;
int main(){
vector<string> s;
s.push_back("--");
s.push_back("a");
for(int i=0; i<s.size(); i++){
cout << "1" << s.at(i) << "2" << endl;
if(s.at(i) == "--"){
cout << i << "= --" << endl;
}
}
return 0;
}
It prints 1--2, 0= --, 1a2. It works: it prints properly and does the comparison. This leads me to think something is happening when I read the line into a string.
Windows 7, cygwin64
g++ version 4.9.3
compile: D:\projects\test>g++ -o a -std=c++11 test.cpp
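Based on the behavior and the discussion, the lines in the file are terminated by a "\r\n" sequence. The leftover '\r' is why the comparison with "--" fails, and why the 2 appears to overwrite the 1: printing '\r' moves the cursor back to the start of the line. The easiest approach for dealing with the remaining '\r' is to remove it after reading a line. For example: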
for (std::string line; std::getline(file, line); ) {
if (!line.empty() && line.back() == '\r') {
line.resize(line.size() - 1u);
}
strs.push_back(line);
}
If you insist on reading into char arrays, you can use file.gcount() to determine the number of characters read and so find the end of the string quickly. Note, however, that the count includes the newline character, i.e., you'd want to check str[file.gcount() - 2] and potentially set it to '\0' (if the count is greater than or equal to 2, of course).
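The char-array variant could be sketched like this. The function name read_lines and the buffer size are assumptions for illustration. One extra detail is handled here: if the last line has no trailing newline, gcount() does not include one, so the code checks eof() before deciding which index to inspect.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Read lines into a fixed-size char buffer and drop a trailing '\r'.
std::vector<std::string> read_lines(std::istream& in) {
    const std::streamsize SIZE = 256;
    char buf[SIZE];
    std::vector<std::string> lines;
    while (in.getline(buf, SIZE)) {
        // gcount() counts the extracted '\n' too, unless the input ended first
        std::streamsize len = in.eof() ? in.gcount() : in.gcount() - 1;
        if (len > 0 && buf[len - 1] == '\r') {
            buf[len - 1] = '\0';
        }
        lines.push_back(buf);
    }
    return lines;
}
```

It works with any std::istream, so it can be tried on a std::istringstream holding "--\r\na\r\n" before pointing it at a std::ifstream.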
As answered by Dietmar Kühl already, the problem is with the \r\n line endings.
However, you should not need to modify your source code. The default behaviour in C++ is supposed to be to open files in text mode. Text mode means that whenever a line ending is found, where "line ending" depends on the platform you're using, it gets translated so that your program just sees a single \n. You're supposed to explicitly request "binary mode" from your program to disable this line ending translation. This has been long-standing practise on Windows systems, is the behaviour well supported by the C++ standard, and is the expected behaviour with native Windows compilers, but for compatibility with POSIX and existing Unix programs that do not bother setting the file mode properly, Cygwin ignores this and defaults to opening files in binary mode unless a custom Cygwin-specific text mode is explicitly requested.
This is covered in the Cygwin FAQ. The first solutions provided there (using O_TEXT or "t", depending on how you open your file) are non-standard so break your code with other environments, and they are not as easy to use with C++ <fstream> file access.
However, the next solutions provided there do work even for C++ programs:
You can also avoid to change the source code at all by linking an additional object file to your executable. Cygwin provides various object files in the /usr/lib directory which, when linked to an executable, changes the default open modes of any file opened within the executed process itself. The files are
binmode.o - Open all files in binary mode.
textmode.o - Open all files in text mode.
textreadmode.o - Open all files opened for reading in text mode.
automode.o - Open all files opened for reading in text mode,
all files opened for writing in binary mode.
And indeed, changing your compiler and linker invocation from g++ -o a -std=c++11 test.cpp to g++ -o a -std=c++11 test.cpp /usr/lib/textmode.o, your program works without changes to your source code. Linking with textmode.o basically means that your I/O will work the way it already should work by default.
On Linux with g++, if I set a global UTF-8 locale, then wcin correctly transcodes UTF-8 to the internal wchar_t encoding.
However, if I use the classic locale and imbue a UTF-8 locale into wcin, this doesn't happen. Input either fails altogether, or each individual byte gets converted to wchar_t independently.
With clang++ and libc++, neither setting the global locale nor imbuing the locale in wcin works.
#include <iostream>
#include <locale>
#include <string>
using namespace std;
int main() {
if(true)
// this works with g++, but not with clang++/libc++
locale::global(locale("C.UTF-8"));
else
// this doesn't work with either implementation
wcin.imbue(locale("C.UTF-8"));
wstring s;
wcin >> s;
cout << s.length() << " " << (s == L"áéú");
return 0;
}
The input stream contains only áéú characters. (They are in UTF-8, not any single-byte encoding).
Live demo: one two (I can't reproduce the other behaviour with online compilers).
Is this standard-conforming? Shouldn't I be able to leave the global locale alone and use imbue instead?
Should either of the described behaviours be classified as an implementation bug?
First of all you should use wcout with wcin.
Now you have two possible solutions to that:
1) Deactivate synchronization of iostream and cstdio streams by using
ios_base::sync_with_stdio(false);
Note that this must be called before any input or output is performed on the standard streams; otherwise the behavior is implementation-defined.
int main() {
ios_base::sync_with_stdio(false);
wcin.imbue(locale("C.UTF-8"));
wstring s;
wcin >> s;
wcout << s.length() << " " << (s == L"áéú");
return 0;
}
2) Set both the C locale and the locale of wcout:
int main() {
std::setlocale(LC_ALL, "C.UTF-8");
wcout.imbue(locale("C.UTF-8"));
wstring s;
wcin >> s;
wcout << s.length() << " " << (s == L"áéú");
return 0;
}
Tested both of them using ideone, works fine. I don't have clang++/libc++ with me, so wasn't able to test this behavior, sorry.
I have this code to serialize/deserialize class objects to file, and it seems to work.
However, I have two questions.
What if instead two wstring's (as I have now) I want to have one wstring and one string member
variable in my class? (I think in such case my code won't work?).
Finally, below, in main, when I initialize s2.product_name_= L"megatex"; if instead of megatex I write something in Russian say (e.g., s2.product_name_= L"логин"), the code doesn't work anymore as intended.
What can be wrong? Thanks.
Here is code:
// ConsoleApplication3.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
#include <iostream>
#include <string>
#include <fstream> // std::ifstream
using namespace std;
// product
struct Product
{
double price_;
double product_index_;
wstring product_name_;
wstring other_data_;
friend std::wostream& operator<<(std::wostream& os, const Product& p)
{
return os << p.price_ << endl
<< p.product_index_ << endl
<< p.product_name_ << endl
<< p.other_data_ << endl;
}
friend wistream& operator>>(std::wistream& is, Product& p)
{
is >> p.price_ >> p.product_index_;
is.ignore(std::numeric_limits<streamsize>::max(), '\n');
getline(is,p.product_name_);
getline(is,p.other_data_);
return is;
}
};
int _tmain(int argc, _TCHAR* argv[])
{
Product s1,s2;
s1.price_ = 100;
s1.product_index_ = 0;
s1.product_name_= L"flex";
s1.other_data_ = L"dat001";
s2.price_ = 300;
s2.product_index_ = 2;
s2.product_name_= L"megatex";
s2.other_data_ = L"dat003";
// write
wofstream binary_file("c:\\test.dat",ios::out|ios::binary|ios::app);
binary_file << s1 << s2;
binary_file.close();
// read
wifstream binary_file2("c:\\test.dat");
Product p;
while (binary_file2 >> p)
{
if(2 == p.product_index_){
cout<<p.price_<<endl;
cout<<p.product_index_<<endl;
wcout<<p.product_name_<<endl;
wcout<<p.other_data_<<endl;
}
}
if (!binary_file2.eof())
std::cerr << "error during parsing of input file\n";
else
std::cerr << "Ok \n";
return 0;
}
What if instead two wstring's (as I have now) I want to have one
wstring and one string member variable in my class? (I think in such
case my code won't work?).
There is an inserter defined for const char* for every basic_ostream (ostream and wostream), so you can use the result of the c_str() member function for the string member. For example, if the string member is other_data_:
return os << p.price_ << endl
<< p.product_index_ << endl
<< p.product_name_ << endl
<< p.other_data_.c_str() << endl;
The extractor case is more complex, since you'll have to read a wstring and then convert it to a string. The simplest way to do this is to read a wstring and then narrow each character:
wstring temp;
getline(is, temp);
p.other_data_ = string(temp.begin(), temp.end());
I'm not using locales in this sample, just widening a sequence of bytes (8 bits) to a sequence of wide characters (16 bits on Windows) for output and doing the opposite (truncating values) for input. That is OK if you are using ASCII characters, or single-byte characters and you don't require a specific format (such as Unicode) for output.
Otherwise, you will need to deal with locales. A locale gives the cultural context needed to interpret the string (remember that a string is just a sequence of bytes, not characters in the sense of letters or symbols; the mapping between the bytes and the symbols they represent is defined by the locale). Locales are not a very easy concept to use (neither is human culture). As you suggest yourself, it would be better to first investigate how they work.
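If silent truncation is a concern, a slightly more defensive narrowing can go through the std::ctype<wchar_t> facet, which substitutes a fallback character for anything the locale cannot represent. This is only a sketch: the name narrow_or_fallback and the '?' default are assumptions, and with the classic locale only ASCII is expected to survive.

```cpp
#include <cassert>
#include <locale>
#include <string>

// Narrow a wide string with std::ctype<wchar_t>::narrow, replacing any
// character the locale cannot represent with `fallback`.
std::string narrow_or_fallback(const std::wstring& wide,
                               char fallback = '?',
                               const std::locale& loc = std::locale::classic()) {
    std::string out(wide.size(), fallback);
    if (!wide.empty()) {
        std::use_facet<std::ctype<wchar_t>>(loc).narrow(
            wide.data(), wide.data() + wide.size(), fallback, &out[0]);
    }
    return out;
}
```

In the extractor this would replace the iterator-based conversion, for example p.other_data_ = narrow_or_fallback(temp);.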
Anyway, the idea is:
Identify the charset used in string and the charset used in file (Unicode or utf-16).
Convert the strings from original charset to Unicode using locale for output.
Convert the wstrings read from file (in Unicode) to strings using locale.
Finally, below, in main, when I initialize s2.product_name_=
L"megatex"; if instead of megatex I write something in Russian say
(e.g., s2.product_name_= L"логин"), the code doesn't work anymore as
intended.
When you define a string literal using L"", you are not really specifying that the string is Unicode, just that its elements are wchar_t rather than char. I suppose the intent is for s2.product_name_ to store the name in Unicode, but the compiler may take every byte of that string as it appears in the source file (as it would without the L) and convert it to wchar_t simply by padding the most significant byte with zeros. Unicode was poorly supported by the C++ standard before C++11 (and support is still incomplete). It works for ASCII characters only because they have the same encoding in Unicode (and in UTF-8).
To use Unicode characters in a string literal, you can use universal character names: \uXXXX. Doing that for every non-English character is not very convenient, I know. You can find lists of Unicode characters on many sites, for example on Wikipedia: http://en.wikipedia.org/wiki/List_of_Unicode_characters.
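As a small illustration of the \uXXXX escapes, the Russian word from the question can be written so that the literal no longer depends on how the source file is encoded. The function name login_escaped is made up for this example.

```cpp
#include <cassert>
#include <string>

// "логин" spelled with universal character names, so the literal does not
// depend on the encoding of the source file.
std::wstring login_escaped() {
    return L"\u043B\u043E\u0433\u0438\u043D"; // л о г и н
}
```

Assigning s2.product_name_ = login_escaped(); stores five wide characters. Writing them to a file correctly still requires the stream to be imbued with a locale that can convert them.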