cannot read character value '26' from file (substitute character) in c++

hi,
I've just done something like the following in C++, on Windows 10:
//1. Serialize a number into string then write it to file
std::string filename = "D:\\Hello.txt";
size_t OriNumber = 26;
std::string str;
str.resize(sizeof(size_t));
memcpy(&str[0], reinterpret_cast<const char*>(&OriNumber), str.size());
std::ofstream ofs(filename);
ofs << str << std::endl;
ofs.close();
//2. Now read back the string from file and deserialize it
std::ifstream ifs(filename);
std::string str1{std::istreambuf_iterator<char>(ifs), std::istreambuf_iterator<char>()};
// 3. Expect that the string str1 will not be empty here.
size_t DeserializedNumber = *(reinterpret_cast<const size_t*>(str1.c_str()));
std::cout << DeserializedNumber << std::endl;
At step 3, I could not read the string back from the file, even though opening the file with Notepad++ showed several characters. The last line still prints a value for DeserializedNumber, but only because str1.c_str() is still a valid pointer; the bytes read through it are garbage.
After debugging the program, I found that std::ifstream gets the value -1 (EOF) at the beginning of the file. According to Wikipedia, 26 is the value of the substitute character, which is sometimes treated as EOF.
My questions are:
If I can't read the value 26 from a file as above, how can a serialization library serialize this value to bytes?
Is there a way to read, write, or transfer this value properly if I still serialize the value 26 as I did above?
thanks,
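One way to round-trip the value is sketched below. It assumes the cause is Windows text-mode translation: on a little-endian machine the first byte of the serialized 26 is 0x1A (Ctrl-Z), which the Windows runtime treats as end of input when a file is read in text mode. Opening both streams with std::ios::binary turns that translation off. The helper names writeNumber, readNumber and roundTripExample and the file name are hypothetical, and the sketch only covers a file written and read on the same platform (same size_t width and byte order).

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: write the raw bytes of a size_t to a file.
// std::ios::binary disables text-mode translation (newline conversion
// and, on Windows, treating byte 0x1A as end of file when reading).
bool writeNumber(const std::string& path, std::size_t value)
{
    std::ofstream ofs(path, std::ios::binary);
    ofs.write(reinterpret_cast<const char*>(&value), sizeof value);
    return static_cast<bool>(ofs);
}

// Hypothetical helper: read the bytes back into a size_t.
// Returns false if the file cannot be opened or is too short.
bool readNumber(const std::string& path, std::size_t& value)
{
    std::ifstream ifs(path, std::ios::binary);
    ifs.read(reinterpret_cast<char*>(&value), sizeof value);
    return ifs.gcount() == static_cast<std::streamsize>(sizeof value);
}

// Usage sketch: the value 26 (byte 0x1A) survives the round trip.
void roundTripExample()
{
    const std::string path = "number.bin"; // hypothetical file name
    const bool written = writeNumber(path, 26);
    std::size_t restored = 0;
    const bool read = readNumber(path, restored);
    assert(written && read && restored == 26);
}
```

Serialization libraries that produce raw bytes generally expect the caller to use binary streams for this reason.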

Related

how to make a new line in binary file c++?

Can anybody help me with this simple thing in file handling?
This is my code:
#include<iostream>
#include<fstream>
#include<string>
using namespace std;
int main()
{
fstream f;
f.open("input05.bin", ios_base::binary | ios_base::out);
string str = "";
cout << "Input text:"<<endl;
while (1)
{
getline(cin, str);
if (str == "end")
break;
else {
f.write((char*)&str, sizeof(str));
}
}
f.close();
f.open("input05.bin", ios_base::in | ios_base::binary);
while (!f.eof())
{
string st;
f.read((char*)&st, sizeof(st));
cout << st << endl;
}
f.close();
}
It is running successfully now. I want to format the output of the text file according to my way.
I have:
hi this is first program i writer this is an experiment
How can I make my output file look like the following:
hi this is first program
I writer this is an experiment
What should I do to format the output in that way?
First of all,
string str;
....
f.write((char*)&str, sizeof(str));
is absolutely wrong as you cast a pointer to an object of type std::string to a pointer to a character, i.e. char*. Note that an std::string is an object having data members like the length of the string and a pointer to the memory where the string content is kept, but it is not a c-string of type char *. Further, sizeof(str) gives you the size of the "wrapper object" with the length member and the pointer, but it does not give you the length of the string.
So it should be something like this:
f.write(str.c_str(), str.length());
Another thing is the OS-dependent handling of the newline character. Depending on the operating system, a new line is represented either by 0x0d 0x0a (Windows) or just by 0x0a (Unix-like systems). In memory, C++ always treats a new line as the single character '\n' (i.e. 0x0a). When writing to a file in text mode, C++ will expand a '\n' to 0x0d 0x0a or keep it as 0x0a, depending on the platform. If you write to a file in binary mode, however, this replacement does not occur. So if you create a file in binary mode and insert only a 0x0a, then, depending on the platform and the program displaying the file, it may not be shown as a new line.
Try to write ...
f.write(str.c_str(), str.length());
f.put('\n');
such that it will work on your platform (but possibly not on other platforms).
That's why you should write in text mode if you want to write text.
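The advice above can be sketched as follows: write each string's characters (not the std::string object itself) followed by '\n' to a text-mode stream, then read the lines back with std::getline. The helper names writeLines and readLines are hypothetical.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <vector>

// Write each string on its own line. The stream is opened in text mode,
// so '\n' is translated to the platform's native line ending.
bool writeLines(const std::string& path, const std::vector<std::string>& lines)
{
    std::ofstream out(path);
    for (const std::string& line : lines) {
        // A line that itself contains '\n' would not survive the round trip.
        assert(line.find('\n') == std::string::npos);
        out.write(line.data(), static_cast<std::streamsize>(line.size()));
        out.put('\n');
    }
    return static_cast<bool>(out);
}

// Read the file back line by line; std::getline strips the '\n'.
std::vector<std::string> readLines(const std::string& path)
{
    std::vector<std::string> lines;
    std::ifstream in(path);
    std::string line;
    while (std::getline(in, line)) {
        lines.push_back(line);
    }
    return lines;
}
```

Looping on the result of std::getline also avoids the extra iteration that a while (!f.eof()) loop performs after the last successful read.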

C++ reading a file in binary mode. Problems with END OF FILE

I am learning C++ and I have to read a file in binary mode. Here's how I do it (following the C++ reference):
unsigned values[255];
unsigned total;
ifstream in ("test.txt", ifstream::binary);
while(in.good()){
unsigned val = in.get();
if(in.good()){
values[val]++;
total++;
cout << val <<endl;
}
}
in.close();
So, I am reading the file byte by byte as long as in.good() is true. I put a cout at the end of the loop in order to understand what's happening, and here is the output:
marco@iceland:~/workspace/huffman$ ./main
97
97
97
97
10
98
98
10
99
99
99
99
10
100
100
10
101
101
10
221497852
marco@iceland:~/workspace/huffman$
Now, the input file "test.txt" is just:
aaaa
bb
cccc
dd
ee
So everything works perfectly till the end, where there's that 221497852. I guess it's something about the end of file, but I can't figure the problem out.
I am using gedit & g++ on a debian machine(64bit).
Any help will be appreciated.
Many thanks,
Marco
fstream::get returns an int-value. This is one of the problems.
Secondly, since you are reading in binary mode, it is simpler to read the file as a block with fstream::read rather than one character at a time:
// read a file into memory
#include <iostream> // std::cout
#include <fstream> // std::ifstream
int main () {
std::ifstream is ("test.txt", std::ifstream::binary);
if (is) {
// get length of file:
is.seekg (0, is.end);
int length = is.tellg();
is.seekg (0, is.beg);
char * buffer = new char [length];
std::cout << "Reading " << length << " characters... ";
// read data as a block:
is.read (buffer,length);
if (is)
std::cout << "all characters read successfully.";
else
std::cout << "error: only " << is.gcount() << " could be read";
is.close();
// ...buffer contains the entire file...
delete[] buffer;
}
return 0;
}
This isn't the way istream::get() was designed to be used.
The classical idiom for using this function would be:
for ( int val = in.get(); val != EOF; val = in.get() ) {
// ...
}
or even more idiomatic:
char ch;
while ( in.get( ch ) ) {
// ...
}
The first loop is really inherited from C, where in.get() is
the equivalent of fgetc().
Still, as far as I can tell, the code you give should work.
It's not idiomatic, and it's not
The C++ standard is unclear what it should return if the
character value read is negative. fgetc() requires a value in
the range [0...UCHAR_MAX], and I think it safe to assume that
this is the intent here. It is, at least, what every
implementation I've used does. But this doesn't affect your
input. Depending on how the implementation interprets the
standard, the return value of in.get() must be in the range
[0...UCHAR_MAX] or [CHAR_MIN...CHAR_MAX], or it must be EOF
(typically -1). (The reason I'm fairly sure that the intent is
to require [0...UCHAR_MAX] is because otherwise, you may not
be able to distinguish end of file from a valid character.)
And if the return value is EOF (almost always
-1), failbit should be set, so in.good() would return
false. There is no case where in.get() would be allowed
to return 221497852. The only explanation I can possibly think
of for your results is that your file has some character with
bit 7 set at the end of the file, that the implementation is
returning a negative number for this (but not end of file,
because it is a character), which results in an out of bounds
index in values[val], and that this out of bounds index
somehow ends up modifying val. Or that your implementation is
broken, and is not setting failbit when it returns end of
file.
To be certain, I'd be interested in knowing what you get from
the following:
std::ifstream in( "test.txt", std::ios_base::binary );
int ch = in.get();
while ( ch != std::istream::traits_type::eof() ) {
std::cout << ch << std::endl;
ch = in.get();
}
This avoids any issues of a possibly invalid index, and any type
conversions (although the conversion int to unsigned is well
defined). Also, out of curiosity (since I can only access VC++
here), you might try replacing in as follows:
std::istringstream in( "\n\xE5" );
I would expect to get:
10
229
(Assuming 8 bit bytes and an ASCII based code set. Both of
which are almost, but not quite universal today.)
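For reference, a minimal sketch of the byte-counting loop using the get(ch) idiom. Two details differ from the code in the question and are assumptions about what was intended: the table has 256 entries (one per possible byte value, so index 255 is valid), and both the table and the total are zero-initialized. The names ByteCounts, countBytes and countBytesExample are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

struct ByteCounts {
    std::array<std::size_t, 256> values{}; // zero-initialized
    std::size_t total = 0;
};

// Count how often each byte value occurs in the stream.
ByteCounts countBytes(std::istream& in)
{
    ByteCounts counts;
    char ch;
    while (in.get(ch)) { // the loop ends as soon as a read fails
        // Convert through unsigned char so bytes >= 0x80 map to 128..255
        // instead of a negative index.
        const unsigned char byte = static_cast<unsigned char>(ch);
        ++counts.values[byte];
        ++counts.total;
    }
    return counts;
}

// Usage sketch with the first two lines of the test file from the question.
void countBytesExample()
{
    std::istringstream in("aaaa\nbb\n");
    const ByteCounts counts = countBytes(in);
    assert(counts.total == 8);
    assert(counts.values['a'] == 4);
    assert(counts.values['\n'] == 2);
}
```

For a file, pass a std::ifstream opened with std::ios_base::binary instead of the std::istringstream.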
I've eventually figured this out.
Apparently the problem wasn't due to the code at all. The problem was gedit: it always appends a newline character at the end of the file. This also happens with other editors, such as vim. Some editors can be configured not to append anything, but in gedit this is apparently not possible. https://askubuntu.com/questions/13317/how-to-stop-gedit-gvim-vim-nano-from-adding-end-of-file-newline-char
Cheers to everyone who asked me,
Marco

ifstream get line change output from char to string

C++ ifstream get line change getline output from char to string
I have a text file, so I read it and do something like
char data[50];
readFile.open(filename.c_str());
while(readFile.good())
{
readFile.getline(data,50,',');
cout << data << endl;
}
My question is: instead of creating a char array of size 50 named data, can I have getline read into a string instead, something like
string myData;
readFile.getline(myData,',');
My text file is something like this
Line2D, [3,2]
Line3D, [7,2,3]
I tried, and the compiler says:
no matching function for getline(std::string&,char)
So is it possible to still break by delimiter and assign the value to a string instead of a char array?
Updates:
Using
while (std::getline(readFile, line))
{
std::cout << line << std::endl;
}
It reads line by line, but I want to break the string at a delimiter. Originally, when using a char array, I specified the delimiter as the third argument:
readFile.getline(data,50,',');
How do I do this with a string if I want to split each line at the comma delimiter, as above?
Use std::getline():
std::string line;
while (std::getline(readFile, line, ','))
{
std::cout << line << std::endl;
}
Always check the result of read operations immediately, otherwise the code will attempt to process the result of a failed read, as is the case with the posted code.
Though it is possible to specify a different delimiter in getline(), it could mistakenly process two invalid lines as a single valid line. I recommend retrieving each line in full and then splitting the line. A useful utility for splitting lines is boost::split().
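A sketch of the "read the whole line, then split it" approach using only the standard library (std::getline on a std::istringstream) rather than boost::split. The helper names splitLine and splitLineExample are hypothetical. Note that for the sample data it also splits inside the brackets, so the first line yields three fields.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Split one line into fields at each occurrence of delim.
// Fields are returned as-is (surrounding spaces are not trimmed),
// and an empty field after a trailing delimiter is dropped.
std::vector<std::string> splitLine(const std::string& line, char delim)
{
    std::vector<std::string> fields;
    std::istringstream stream(line);
    std::string field;
    while (std::getline(stream, field, delim)) {
        fields.push_back(field);
    }
    return fields;
}

// Usage sketch with the first line of the sample file.
void splitLineExample()
{
    const std::vector<std::string> fields = splitLine("Line2D, [3,2]", ',');
    assert(fields.size() == 3);
    assert(fields[0] == "Line2D");
    assert(fields[1] == " [3");
    assert(fields[2] == "2]");
}
```

In the reading loop, each line obtained with std::getline(readFile, line) would be passed to splitLine(line, ',').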

Reading a string from a file in C++

I'm trying to store strings directly into a file to be read later in C++ (basically for the full scope I'm trying to store an object array with string variables in a file, and those string variables will be read through something like object[0].string). However, every time I try to read the string variables the system gives me a jumbled up error. The following code is a basic part of what I'm trying.
#include <iostream>
#include <fstream>
using namespace std;
/*
//this is run first to create the file and store the string
int main(){
string reed;
reed = "sees";
ofstream ofs("filrsee.txt", ios::out|ios::binary);
ofs.write(reinterpret_cast<char*>(&reed), sizeof(reed));
ofs.close();
}*/
//this is run after that to open the file and read the string
int main(){
string ghhh;
ifstream ifs("filrsee.txt", ios::in|ios::binary);
ifs.read(reinterpret_cast<char*>(&ghhh), sizeof(ghhh));
cout<<ghhh;
ifs.close();
return 0;
}
The second part is where things go haywire when I try to read it.
Sorry if it's been asked before, I've taken a look around for similar questions but most of them are a bit different from what I'm trying to do or I don't really understand what they're trying to do (still quite new to this).
What am I doing wrong?
You are reading from a file and trying to put the data in the string structure itself, overwriting it, which is plain wrong.
As can be verified at http://www.cplusplus.com/reference/iostream/istream/read/ , the types you used were wrong, and you know it because you had to force the std::string into a char * using a reinterpret_cast.
C++ Hint: using a reinterpret_cast in C++ is (almost) always a sign you did something wrong.
Why is it so complicated to read a file?
A long time ago, reading a file was easy. In some Basic-like language, you used the function LOAD, and voilà!, you had your file.
So why can't we do it now?
Because you don't know what's in a file.
It could be a string.
It could be a serialized array of structs with raw data dumped from memory.
It could even be a live stream, that is, a file which is appended continuously (a log file, the stdin, whatever).
You could want to read the data word by word
... or line by line...
Or the file is so large it doesn't fit in a string, so you want to read it by parts.
etc..
The more generic solution is to read the file (thus, in C++, a fstream) byte by byte using the function get (see http://www.cplusplus.com/reference/iostream/istream/get/), to transform the bytes yourself into the type you expect, and to stop at EOF.
The std::istream interface has all the functions you need to read the file in different ways (see http://www.cplusplus.com/reference/iostream/istream/), and even then, there is an additional non-member function for std::string to read a file until a delimiter is found (usually "\n", but it could be anything, see http://www.cplusplus.com/reference/string/getline/)
But I want a "load" function for a std::string!!!
Ok, I get it.
We assume that what you put in the file is the content of a std::string, but keeping it compatible with a C-style string, that is, the \0 character marks the end of the string (if not, we would need to load the file until reaching the EOF).
And we assume you want the whole file content fully loaded once the function loadFile returns.
So, here's the loadFile function:
#include <iostream>
#include <fstream>
#include <string>
bool loadFile(const std::string & p_name, std::string & p_content)
{
// We create the file object, saying I want to read it
std::fstream file(p_name.c_str(), std::fstream::in) ;
// We verify if the file was successfully opened
if(file.is_open())
{
// We use the standard getline function to read the file into
// a std::string, stopping only at "\0"
std::getline(file, p_content, '\0') ;
// We return the success of the operation
return ! file.bad() ;
}
// The file was not successfully opened, so returning false
return false ;
}
If you are using a C++11 enabled compiler, you can add this overloaded function, which will cost you nothing (while in C++03, barring optimizations, it could have cost you a temporary object):
std::string loadFile(const std::string & p_name)
{
std::string content ;
loadFile(p_name, content) ;
return content ;
}
Now, for completeness' sake, I wrote the corresponding saveFile function:
bool saveFile(const std::string & p_name, const std::string & p_content)
{
std::fstream file(p_name.c_str(), std::fstream::out) ;
if(file.is_open())
{
file.write(p_content.c_str(), p_content.length()) ;
return ! file.bad() ;
}
return false ;
}
And here, the "main" I used to test those functions:
int main()
{
const std::string name(".//myFile.txt") ;
const std::string content("AAA BBB CCC\nDDD EEE FFF\n\n") ;
{
const bool success = saveFile(name, content) ;
std::cout << "saveFile(\"" << name << "\", \"" << content << "\")\n\n"
<< "result is: " << success << "\n" ;
}
{
std::string myContent ;
const bool success = loadFile(name, myContent) ;
std::cout << "loadFile(\"" << name << "\", \"" << content << "\")\n\n"
<< "result is: " << success << "\n"
<< "content is: [" << myContent << "]\n"
<< "content ok is: " << (myContent == content)<< "\n" ;
}
}
More?
If you want to do more than that, then you will need to explore the C++ IOStreams library API, at http://www.cplusplus.com/reference/iostream/
You can't use std::istream::read() to read into a std::string object. What you could do is to determine the size of the file, create a string of suitable size, and read the data into the string's character array:
std::string str;
std::ifstream file("whatever");
std::string::size_type size = determine_size_of(file);
str.resize(size);
file.read(&str[0], size);
The tricky bit is determining the size the string should have. Given that the character sequence may get translated while reading, e.g., because line end sequences are transformed, this pretty much amounts to reading the string in the general case. Thus, I would recommend against doing it this way. Instead, I would read the string using something like this:
std::string str;
std::ifstream file("whatever");
if (std::getline(file, str, '\0')) {
...
}
This works OK for text strings and is about as fast as it gets on most systems. If the file can contain null characters, e.g., because it contains binary data, this doesn't quite work. If this is the case, I'd use an intermediate std::ostringstream:
std::ostringstream out;
std::ifstream file("whatever");
out << file.rdbuf();
std::string str = out.str();
A string object is not a mere char array, the line
ifs.read(reinterpret_cast<char*>(&ghhh), sizeof(ghhh));
is probably the root of your problems.
Try applying the following changes:
char ghhh[BUFF_LEN];
....
ifs.read(ghhh, BUFF_LEN);
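Since the goal in the question is to store objects with string members, one common alternative (not taken from the answers above) is to write each string as a length followed by its characters, so that the reader knows how many bytes to read back. A minimal sketch, assuming the data is written and read on the same platform (same byte order for the length field) and that the input is trusted (the length is not sanity-checked); writeString, readString and stringRoundTripExample are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Write a 64-bit length followed by the characters of s.
void writeString(std::ostream& out, const std::string& s)
{
    const std::uint64_t size = s.size();
    out.write(reinterpret_cast<const char*>(&size), sizeof size);
    out.write(s.data(), static_cast<std::streamsize>(s.size()));
}

// Read a string written by writeString. Returns false on a short read.
bool readString(std::istream& in, std::string& s)
{
    std::uint64_t size = 0;
    if (!in.read(reinterpret_cast<char*>(&size), sizeof size)) {
        return false;
    }
    s.resize(static_cast<std::string::size_type>(size));
    if (size != 0 && !in.read(&s[0], static_cast<std::streamsize>(size))) {
        return false;
    }
    return true;
}

// Usage sketch: two strings round-trip through an in-memory stream.
// With files, open both streams with std::ios::binary.
void stringRoundTripExample()
{
    std::stringstream buffer;
    writeString(buffer, "sees");
    writeString(buffer, std::string("a\0b", 3)); // embedded '\0' is kept

    std::string first, second;
    const bool ok = readString(buffer, first) && readString(buffer, second);
    assert(ok && first == "sees" && second.size() == 3);
}
```

Unlike the '\0'-terminated approach, this also works when the string contains null characters, and several strings can be stored back to back in one file.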

How can ofstream write NULL to a file in binary mode?

I am maintaining a C++ method which one of my clients is hitting an issue with. The method is supposed to write out a series of identifiers to a file delimited by a new line. However on their machine somehow the method is writing a series of NULL's out to the file. Opening the file in a binary editor shows that it contains all zeros.
I can't understand why this is happening. I've tried assigning empty strings and strings with the first character set to 0. There is no problem creating the file, just writing the identifiers to it.
Here is the method:
void writeIdentifiers(std::vector<std::string> IDs, std::string filename)
{
std::ofstream out (filename.c_str(), std::ofstream::binary);
if (out.is_open())
{
for (std::vector<std::string>::iterator it = IDs.begin();
it != IDs.end();
it++)
{
out << *it << "\n";
}
}
out.close();
}
My question: is there any possible input to that method which will create a file containing NULL values?
Yeah, the following code quite clearly writes a series of NULL bytes:
std::vector<std::string> ids;
std::string nullstring;
nullstring.assign("\0\0\0\0\0\0\0\0\0\0", 10);
ids.push_back(nullstring);
writeIdentifiers(ids, "test.dat");
Because the std::string container stores the string length, it can't necessarily be used in the same way as an ordinary C (null-terminated) string. Here, I assign a string containing 10 NULL bytes. Those are then output because the string length is 10.
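To see the difference between the stored length and the C-string view, the following sketch builds the same 10-byte string and streams it with operator<<, as writeIdentifiers does. It uses an in-memory std::ostringstream instead of a file; the function name writeNulls is hypothetical.

```cpp
#include <cassert>
#include <cstring>
#include <sstream>
#include <string>

// Stream a std::string holding 10 null bytes and return what was written.
std::string writeNulls()
{
    std::string nullstring;
    nullstring.assign("\0\0\0\0\0\0\0\0\0\0", 10);

    // size() reports the stored length; strlen stops at the first '\0'.
    assert(nullstring.size() == 10);
    assert(std::strlen(nullstring.c_str()) == 0);

    std::ostringstream out;
    out << nullstring << "\n"; // writes all 10 bytes, then the newline
    return out.str();
}
```

The returned string has 11 bytes: the ten null bytes plus the newline, which matches a file that looks like all zeros in a binary editor apart from the line endings.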