Writing class object to file using streams - c++

I have this code to serialize/deserialize class objects to file, and it seems to work.
However, I have two questions.
What if, instead of two wstrings (as I have now), I want one wstring and one string member
variable in my class? (I think my code won't work in that case?)
Finally, below, in main, when I initialize s2.product_name_= L"megatex"; if instead of megatex I write something in Russian say (e.g., s2.product_name_= L"логин"), the code doesn't work anymore as intended.
What can be wrong? Thanks.
Here is code:
// ConsoleApplication3.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
#include <iostream>
#include <string>
#include <fstream> // std::ifstream
using namespace std;
// product
struct Product
{
double price_;
double product_index_;
wstring product_name_;
wstring other_data_;
friend std::wostream& operator<<(std::wostream& os, const Product& p)
{
return os << p.price_ << endl
<< p.product_index_ << endl
<< p.product_name_ << endl
<< p.other_data_ << endl;
}
friend wistream& operator>>(std::wistream& is, Product& p)
{
is >> p.price_ >> p.product_index_;
is.ignore(std::numeric_limits<streamsize>::max(), '\n');
getline(is,p.product_name_);
getline(is,p.other_data_);
return is;
}
};
int _tmain(int argc, _TCHAR* argv[])
{
Product s1,s2;
s1.price_ = 100;
s1.product_index_ = 0;
s1.product_name_= L"flex";
s1.other_data_ = L"dat001";
s2.price_ = 300;
s2.product_index_ = 2;
s2.product_name_= L"megatex";
s2.other_data_ = L"dat003";
// write
wofstream binary_file("c:\\test.dat",ios::out|ios::binary|ios::app);
binary_file << s1 << s2;
binary_file.close();
// read
wifstream binary_file2("c:\\test.dat");
Product p;
while (binary_file2 >> p)
{
if(2 == p.product_index_){
cout<<p.price_<<endl;
cout<<p.product_index_<<endl;
wcout<<p.product_name_<<endl;
wcout<<p.other_data_<<endl;
}
}
if (!binary_file2.eof())
std::cerr << "error during parsing of input file\n";
else
std::cerr << "Ok \n";
return 0;
}

What if instead two wstring's (as I have now) I want to have one
wstring and one string member variable in my class? (I think in such
case my code won't work?).
There is an inserter defined for const char* for any basic_ostream (ostream and wostream), so you can use the result of the c_str() member function for the string member. For example, if the string member is other_data_:
return os << p.price_ << endl
<< p.product_index_ << endl
<< p.product_name_ << endl
<< p.other_data_.c_str() << endl;
The extractor case is more complex, since you have to read as a wstring and then convert to a string. The simplest way is to read into a wstring and then narrow each character:
wstring temp;
getline(is, temp);
p.other_data_ = string(temp.begin(), temp.end());
I'm not using locales in this sample, just converting a sequence of bytes (8 bits) to a sequence of words (16 bits) for output and the opposite (truncating values) for input. That is OK if you are using ASCII chars, or single-byte chars and you don't require a specific format (such as Unicode) for output.
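Putting the inserter and extractor together, a minimal sketch could look like the following. MixedProduct and round_trip are hypothetical names used only for illustration, and the narrowing step assumes other_data_ holds ASCII only:

```cpp
#include <istream>
#include <limits>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical variant of Product: one wide and one narrow string member.
struct MixedProduct {
    double price_ = 0;
    double product_index_ = 0;
    std::wstring product_name_;
    std::string other_data_;  // assumed to hold ASCII only
};

std::wostream& operator<<(std::wostream& os, const MixedProduct& p) {
    // The const char* inserter widens each char for the wide stream.
    return os << p.price_ << L'\n'
              << p.product_index_ << L'\n'
              << p.product_name_ << L'\n'
              << p.other_data_.c_str() << L'\n';
}

std::wistream& operator>>(std::wistream& is, MixedProduct& p) {
    is >> p.price_ >> p.product_index_;
    // Drop the rest of the index line before reading whole lines.
    is.ignore(std::numeric_limits<std::streamsize>::max(), L'\n');
    std::getline(is, p.product_name_);
    std::wstring temp;
    std::getline(is, temp);
    // Narrow by truncation: only safe for ASCII content.
    p.other_data_ = std::string(temp.begin(), temp.end());
    return is;
}

// Writes p to an in-memory wide stream and reads it back.
MixedProduct round_trip(const MixedProduct& p) {
    std::wstringstream ws;
    ws << p;
    MixedProduct q;
    ws >> q;
    return q;
}
```

The in-memory wstringstream stands in for the file here; the same operators should work with wofstream and wifstream.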
Otherwise, you will need to deal with locales. A locale gives the cultural context needed to interpret the string (remember that a string is just a sequence of bytes, not characters in the sense of letters or symbols; the mapping between the bytes and the symbols they represent is defined by the locale). Locales are not a very easy concept to use (neither is human culture). As you suggest yourself, it would be better to first investigate how they work.
Anyway, the idea is:
Identify the charset used in string and the charset used in file (Unicode or utf-16).
Convert the strings from original charset to Unicode using locale for output.
Convert the wstrings read from file (in Unicode) to strings using locale.
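As a rough sketch of the two conversion steps, assuming the narrow strings are UTF-8: this uses std::wstring_convert with std::codecvt_utf8 rather than a full locale object, which is a substitution on my part. Both are deprecated since C++17, and other charsets would need a different facet. The function names widen_utf8 and narrow_utf8 are made up for this example.

```cpp
#include <codecvt>
#include <locale>
#include <string>

// Assumption: narrow strings hold UTF-8 text.
// Converts UTF-8 bytes to a wide string (one wchar_t per code point
// where wchar_t is wide enough to hold it).
std::wstring widen_utf8(const std::string& s) {
    std::wstring_convert<std::codecvt_utf8<wchar_t>> conv;
    return conv.from_bytes(s);
}

// Converts a wide string back to UTF-8 bytes.
std::string narrow_utf8(const std::wstring& ws) {
    std::wstring_convert<std::codecvt_utf8<wchar_t>> conv;
    return conv.to_bytes(ws);
}
```

Both functions throw std::range_error on input they cannot convert, so real code would probably want to catch that.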
Finally, below, in main, when I initialize s2.product_name_=
L"megatex"; if instead of megatex I write something in Russian say
(e.g., s2.product_name_= L"логин"), the code doesn't work anymore as
intended.
When you define an array of wchar_t using L"", you are not really specifying that the string is Unicode, just that the array is of wchar_t, not char. I suppose the intent is for s2.product_name_ to store the name in Unicode, but the compiler may take every char in that string as it is encoded in the source file (as it would without the L) and convert it to wchar_t by padding the most significant byte with zeros. Unicode was not well supported in the C++ standard until C++11 (and is still not fully supported). It works for ASCII characters only because they have the same encoding in Unicode (and UTF-8).
To use Unicode characters in a string literal, you can use escape sequences: \uXXXX. Doing that for every non-English character is not very comfortable, I know. You can find lists of Unicode characters on many sites on the web. For example, on Wikipedia: http://en.wikipedia.org/wiki/List_of_Unicode_characters.
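For instance, the Russian word from the question could be written with \uXXXX escapes as in this small sketch (russian_login is a hypothetical helper name):

```cpp
#include <string>

// The five Cyrillic letters of the question's example word, spelled with
// universal character names so the result does not depend on the
// encoding of the source file.
std::wstring russian_login() {
    return L"\u043B\u043E\u0433\u0438\u043D";
}
```

The string can then be assigned as s2.product_name_ = russian_login();. Whether it survives the trip through the file still depends on the stream's locale.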

Related

How to use operator L on array? (C++, Visual Studio 2019)

Part 2 on encoding characters in C++ (by User123).
Yesterday I was writing some code, and Paul Sanders gave me a useful solution in this question: he told me not to use std::cout << "something"; but to use std::wcout << L"something";.
But I have another problem. Now I want to do something like this (some special characters, but in array):
#include <iostream>
using namespace std;
string myArray[2] = { "łŁšđřžőšě", "×÷¤ßł§ř~ú" };
int main()
{
cout << myArray[0] << endl << myArray[1];
return 0;
}
But now I get something really unusual:
│úÜ­°×§Üý
θĄ▀│ž°~˙
If I add L in front of the array, I get (Visual Studio 2019):
C++ initialization with '{...}' expected for aggregate object
How can I represent these special characters but in the array?
#include <iostream>
using namespace std;
wstring myArray[2] = { L"łŁšđřžőšě", L"×÷¤ßł§ř~ú" };
int main()
{
wcout << myArray[0] << endl << myArray[1];
return 0;
}
L can only be applied directly to string literals. The result is a string literal of type const wchar_t[] (wide characters) rather than the usual const char[] (narrow characters), so you cannot store it in a string. You need to store it in a wstring. And to output a wstring you need to pass it to wcout, not cout.

Changing type of char using wchar_t used not so like L

In my code I tried to create an array of 4-byte chars, where every char contains a Cyrillic letter.
wchar_t OUT_STRING[4] = { L'т',L'л',L'о',L'р' };
Everything is fine with this and I get the expected output. This is only a test; in reality I need to convert an element of a string to the same type as in OUT_STRING. I tried something like this:
wchar_t OUT_STRING[4] = { (wchar_t)'т',L'л',L'о',L'р' };
But it didn't work, and the output shows a rectangle.
I think you want to pass in a string using std::string in UTF-8 encoding and process it one character at a time, each time converting the single character to a wide character string of length 1 so that you can pass it to TTF_SizeUNICODE, and TTF_RenderUNICODE_Blended.
I will demonstrate the relevant string conversion code.
Here is a test function that expects a null-terminated wide character string with just one character in it. The body of main shows how to convert a UTF-8 string to UTF-16 (using codecvt_utf8_utf16) and how to convert a single character to a string (using std::wstring(1, ch))
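#include <codecvt>
#include <cwchar>   // std::wcslen
#include <iostream>
#include <locale>   // std::wstring_convert
#include <string>
// Prints the length of a null-terminated wide string and the numeric value of each element.
void test(const wchar_t* str) {
    std::cout << "Number of characters in string: " << std::wcslen(str) << std::endl;
    for (const wchar_t* ch = str; *ch; ++ch) {
        std::cout << ((int)*ch) << std::endl;
    }
}
int main() {
    // Before C++20 a u8 literal has type const char[], so it can initialize a std::string.
    std::string input = u8"тлор";
    // Convert UTF-8 to UTF-16, then handle one wide character at a time.
    for (wchar_t ch : std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>, wchar_t>().from_bytes(input)) {
        std::wstring string_with_just_one_character(1, ch);
        test(string_with_just_one_character.c_str());
    }
    return 0;
}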

C++ c_str() doesn't return complete string

I'm doing a C++ assignment that requires taking user input of an expression (eg: 2 * (6-1) + 2 ) and outputting the result. Everything works correctly unless a space is encountered in the user input.
It is a requirement to pass the user input to the following method;
double Calculate(char* expr);
I'm aware the issue is caused by c_str() where the space characters act as a terminating null byte, though I'm not sure how to overcome this problem.
Ideally I'd like to preserve the space characters but I'd settle for simply removing them, as a space serves no purpose in the expression. I get the same result when using string::data instead of c_str.
int main(int argc, char **argv)
{
string inputExpr;
Calc myCalc;
while(true) {
cin >> inputExpr;
if(inputExpr == "q") break;
cout << "You wrote:" << (char*)inputExpr.c_str() << endl; // debug
printf("Result: %.3f \n\n", myCalc.Calculate( (char*)inputExpr.c_str() ) );
}
return 0;
}
c_str works just fine. Your problem is cin >> inputExpr. The >> operator only reads until the next space, so you do not read your equation fully.
What you want to use is std::getline:
std::getline (std::cin,inputExpr);
which will read until it reaches a newline character. See the function description if you need a specific delimiter.
The problem is not with inputExpr.c_str() or c_str as such: c_str() returns a pointer to a character array that contains a null-terminated sequence. When you read through cin with >>, the input is split at spaces, tabs, etc. into multiple strings. Check the content of the string after reading to see this for yourself.
First, I think your Calculate() method should take as input a const char* string, since expr should be an input (read-only) parameter:
double Calculate(const char* expr);
Note that if you use const char*, you can simply call std::string::c_str() without any ugly cast to remove const-ness.
And, since this is C++ and not C, using std::string would be nice:
double Calculate(const std::string& expr);
On the particular issue of reading also whitespaces, this is not a problem of terminating NUL byte: a space is not a NUL.
You should just change the way you read the string, using std::getline() instead of simple std::cin >> overload:
#include <iostream>
#include <string>
using namespace std;
int main()
{
string line;
getline(cin, line);
cout << "'" << line << "'" << endl;
}
If you compile and run this code, and enter something like Hello World, you get the whole string as output (including the space separating the two words).
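To see the difference between the two reading styles side by side, here is a small sketch that uses an in-memory stream in place of cin (first_token and first_line are hypothetical helper names):

```cpp
#include <sstream>
#include <string>

// Reads one whitespace-delimited token, as `cin >> inputExpr` does.
std::string first_token(const std::string& input) {
    std::istringstream in(input);
    std::string token;
    in >> token;
    return token;
}

// Reads everything up to the next newline, spaces included.
std::string first_line(const std::string& input) {
    std::istringstream in(input);
    std::string line;
    std::getline(in, line);
    return line;
}
```

With the input 2 * (6-1) + 2, first_token yields only 2, while first_line yields the whole expression.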

Stringstream don't copy new lines

Special characters disappear when I pass a string into a stringstream.
I tried this code which can directly be tested:
#include <iostream>
#include <sstream>
using namespace std;
int main(int argc, char* argv[]) {
string txt("hehehaha\n\t hehe\n\n<New>\n\ttest:\t130\n\ttest_end:\n<New_end>\n");
cout << txt << endl; // No problem with new lines and tabs
stringstream stream;
stream << txt;
string s;
while(stream >> s) {
cout << s; // Here special characters like '\n' and '\t' don't exist anymore.
}
cout << "\n\n";
return 0;
}
What can I do to overcome this?
Edit: I tried this:
stream << txt.c_str();
and it worked. But I don't know why...
basically, you are just printing it wrong, it should be:
cout << stream.str() << endl;
Some details. You are calling operator<<(string) which
overloads operator<< to behave as described in ostream::operator<<
for c-strings
The referred to behaviour is explained here:
(2) character sequence Inserts the C-string s into os. The terminating
null character is not inserted into os. The length of the c-string is
determined beforehand (as if calling strlen).
Strlen documentation says that the result is affected by nothing but
the terminating null-character
Indeed, strlen(tmp) in your examples outputs 55.
The stream, hence, gets "assigned" everything which comes up to the 55th character in your input string.
cout << stream.str() << endl;
will show you that this is indeed what happens.
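A parenthesis: whitespace handling belongs to extraction (the stream >> s loop), not to the stream << txt line. The flag that controls skipping of leading whitespace can be unset, as in
stream.unsetf ( std::ios::skipws );
which you can try out, although extraction into a std::string still stops at the first whitespace character.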
The statement
while(stream >> s)
is the problem: it gives you one token on each call, splitting on whitespace and therefore discarding it.
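If the goal is to walk through the stream while keeping the newlines and tabs, one option is unformatted input with get(). This is a sketch with hypothetical function names, not the only way to do it:

```cpp
#include <istream>
#include <sstream>
#include <string>

// Copies the stream's contents character by character with get(), which
// is unformatted and therefore keeps '\n', '\t' and spaces.
std::string copy_with_whitespace(std::istream& in) {
    std::string out;
    char c;
    while (in.get(c)) {
        out += c;
    }
    return out;
}

// For comparison: token-wise extraction, as in the question's loop.
std::string copy_tokens(std::istream& in) {
    std::string out;
    std::string s;
    while (in >> s) {
        out += s;
    }
    return out;
}
```

std::getline in a loop would be another choice when the text should be processed line by line; it keeps tabs and spaces but consumes the newline itself.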

C++ iostream >> operator behaves differently than get() unsigned char

I was working on a piece of code to do some compression, and I wrote a bitstream class.
My bitstream class kept track of the current bit we are reading and the current byte (unsigned char).
I noticed that reading the next unsigned character from the file was done differently if I used the >> operator vs get() method in the istream class.
I was just curious why I was getting different results?
ex:
this->m_inputFileStream.open(inputFile, std::ifstream::binary);
unsigned char currentByte;
this->m_inputFileStream >> currentByte;
vs.
this->m_inputFileStream.open(inputFile, std::ifstream::binary);
unsigned char currentByte;
this->m_inputFileStream.get((char&)currentByte);
Additional Info:
To be specific the byte I was reading was 0x0A however when using >> it would read it as 0x6F
I'm not sure how they're even related ? (they're not the 2s complement of each other?)
The >> operator is also defined to work for unsigned char, however (see the C++ istream class reference).
operator>> is for formatted input. It'll read "23" as an integer if you stream it into an int, and it'll eat whitespace between tokens. get() on the other hand is for unformatted, byte-wise input.
If you aren't parsing text, don't use operator>> or operator<<. You'll get weird bugs that are hard to track down. They also tend to slip past unit tests unless you know what to look for. Reading a uint8, for instance, will fail on the value 9, since that is the tab character and gets skipped as whitespace.
edit:
#include <iostream>
#include <sstream>
#include <cstdint>
void test(char r) {
std::cout << "testing " << r << std::endl;
char t = '!';
std::ostringstream os(std::ios::binary);
os << r;
if (!os.good()) std::cout << "os not good" << std::endl;
std::istringstream is(os.str(), std::ios::binary);
is >> t;
if (!is.good()) std::cout << "is not good" << std::endl;
std::cout << std::hex << (uint16_t)r
<< " vs " << std::hex << (uint16_t)t << std::endl;
}
int main(int argc, char ** argv) {
test('z');
test('\n');
return 0;
}
produces:
testing z
7a vs 7a
testing
is not good
a vs 21
I suppose that would never have been evident a priori.
C++'s formatted input (operator >>) treats char and unsigned char as a character, rather than an integer. This is a little annoying, but understandable.
You have to use get, which returns the next byte, instead.
However, if you open a file with the binary flag, you should not be using formatted I/O. You should be using read, write and related functions. Formatted I/O won't behave correctly, as it's intended to operate on text formats, not binary formats.
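A small sketch of the difference, using an in-memory stream instead of a file. The byte sequence 0x0A 0x6F is an assumption chosen to match the values in the question (the asker's file may simply have had 0x6F right after 0x0A), and the function names are hypothetical:

```cpp
#include <istream>
#include <sstream>
#include <string>

// Formatted read: skips leading whitespace (0x0A counts as whitespace)
// before taking a character.
unsigned char read_formatted(std::istream& in) {
    unsigned char b = 0;
    in >> b;
    return b;
}

// Unformatted read: returns the next byte exactly as stored.
unsigned char read_raw(std::istream& in) {
    char b = 0;
    in.read(&b, 1);
    return static_cast<unsigned char>(b);
}
```

On a stream holding the bytes 0x0A 0x6F, read_formatted skips the newline and returns 0x6F, while read_raw returns 0x0A.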