How do the read and write functions work in C++ file handling?

I'm learning file handling in C++ on my own from the internet. I came across the read and write functions, but the parameters they take confused me.
So, I found the syntax as
fstream fout;
fout.write( (char *) &obj, sizeof(obj) );
and
fstream fin;
fin.read( (char *) &obj, sizeof(obj) );
In both of these, what is the function of char*?
And how does it read and write the file?

The function fstream::read has the following signature:
istream& read (char* s, streamsize n);
You need to cast your arguments to the correct type. (char*) tells the compiler to pretend &obj is the correct type. Usually, this is a really bad idea.
Instead, you should do it this way:
// C++ program to demonstrate the getline() function
#include <fstream>
#include <iostream>
#include <string>
using namespace std;
int main()
{
    string str;
    fstream fin("input.txt");   // example file name; the stream must be opened before reading
    getline(fin, str);          // use cin instead to read from stdin
    cout << str << endl;        // print the line that was read
    return 0;
}
Source: https://www.geeksforgeeks.org/getline-string-c/

The char* cast with read and write treats the obj variable as a generic, contiguous sequence of characters (ignoring any structure).
The read function will read from the stream directly into the obj variable, without any byte translation or mapping to data members (fields). Note that pointers in classes or structures will be replaced with whatever value comes from the stream (which means the pointer will probably point to an invalid or improper location). Beware of padding issues.
The write function will write the entire area of memory occupied by obj to the stream. Any padding between structure or class members will also be written. Values of pointers will be written to the stream, not the items the pointers point to.
Note: these functions work "as-is". There are no conversions or translations of the data. For example, there is no conversion between Big Endian and Little Endian, and no processing of "end of line" or "end of file" characters. They are basically mirror-image data transfers.
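To make that concrete, here is a minimal sketch of such a raw write followed by a raw read, using a plain struct that contains no pointers (the struct and the file name record.bin are just examples, not anything from the question):

#include <fstream>
#include <iostream>

// A plain struct with no pointers, so its bytes can be dumped and restored as-is.
struct Record {
    int id;
    double value;
};

int main()
{
    Record out{42, 3.14};

    // write: dump the raw bytes of 'out' into the file
    std::ofstream fout("record.bin", std::ios::binary);
    fout.write(reinterpret_cast<const char*>(&out), sizeof(out));
    fout.close();

    // read: copy the same bytes back into another Record
    Record in{};
    std::ifstream fin("record.bin", std::ios::binary);
    fin.read(reinterpret_cast<char*>(&in), sizeof(in));

    std::cout << in.id << " " << in.value << "\n";
    return 0;
}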

Related

scanf function for strings

The problem is simple: the code below does not work. It says Process finished with exit code -1073740940 (0xC0000374). Removing the ampersand does not change anything.
#include <cstdio>
#include <iostream>
#include <string>
using namespace std;

int main(){
    string x;
    scanf("%s", &x);
    cout << x;
}
scanf() with the %s format specifier reads bytes into a preallocated character array (char[]), to which you pass a pointer.
Your x is not a character array. It is a std::string, a complex object.
A std::string* is not in any way the same as a char*. Your code overwrites the memory of parts of a complex object in unpredictable ways, so you end up with a crash.
Your compiler should have warned about this, since it knows that a char* is not a std::string*, and because compilers are clever and can detect mistakes like this despite the type-unsafe nature of C library functions.
Even if this were valid via some magic compatibility layer, the string is empty.
Use I/O streams instead.
You cannot pass complex objects through the ... (variadic) part of printf/scanf. Many compilers print a warning for that.
For %s, scanf requires a pointer of type char* pointing to sufficient storage. std::string is something completely different.
In C++ the iostream operators are intended for text input and output.
cin >> x;
will do the job.
You should not use scanf in C++. There are many pitfalls; you found one of them.
Another pitfall: %s with scanf is almost always undefined behavior unless you really ensure that the source stream can only contain strings of limited size. In that case a buffer of char buffer[size]; is the right target.
In any other case you should at least restrict the size of the string to scan, e.g. use %20s and of course a matching char buffer, char buffer[21]; in this case. Note the +1 for the terminating null character.
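A minimal sketch of that bounded form:

#include <cstdio>

int main()
{
    char buffer[21];       // room for 20 characters plus the terminating '\0'
    scanf("%20s", buffer); // never stores more than 20 characters plus the '\0'
    printf("%s\n", buffer);
}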
You should use cin. But if you want to use scanf() for whatever reason and still manipulate your strings with std::string, then you can read the C-string and use it to initialize your C++ string.
#include <iostream>
#include <cstdio>
#include <string>
using std::cout;
using std::string;

int main()
{
    char c_str[80];
    scanf("%79s", c_str); // limit the read so it cannot overflow c_str
    string str(c_str);
    cout << str << "\n";
}
If you want to use strings, use cin (or getline).
string s;
cin>>s; //s is now read
If you want to use scanf, you want to have a char array (and don't use &):
char text[30];
scanf("%s", text); //text is now read
You can use char[] instead of string
#include <cstdio>
#include <iostream>
using namespace std;

int main()
{
    char tmp[101];
    scanf("%100s", tmp);
    cout << tmp;
}

File loader problems

I have a text file which contains lists of authors and books, and I need to load it into my program. Here is the code of the method which should load it:
void Loader::loadFile(const char* path)
{
FILE* file = fopen(path, "r");
char* bufferString;
while (feof(file) != 1) {
fgets(bufferString, 1000, file);
printf("%s", bufferString);
}
}
I use it in my main file:
int main(int argc, char** argv) {
Loader* loader = new Loader();
loader->loadFile("/home/terayon/prog/parser/data.txt");
return 0;
}
And the data.txt file is not completely printed.
What should I do to get the complete data?
fgets reads into the memory pointed to by the pointer passed as the first parameter, bufferString in your case.
But your bufferString is an uninitialised pointer (leading to undefined behaviour):
char * bufferString;
// not initialised,
// and definitely not pointing to valid memory
So you need to provide some memory to read into, e.g by making it an array:
char bufferString[1000];
// that's a bit large to store on the stack
As a side note: Your code is not idiomatic C++. You're using the IO functions provided by the C standard library, which is possible, but using the facilities of the C++ STL would be more appropriate.
You have undefined behavior: you have a pointer bufferString, but you never actually make it point anywhere. Since it's not initialized, its value will be indeterminate and will seem to be random, meaning you will write to unallocated memory in the fgets call.
It's easy to solve though, declare it as an array, and use the array size when calling fgets:
char bufferString[500];
...
fgets(bufferString, sizeof(bufferString), file);
Besides the problem detailed above, you should not do while(!feof(file)); it will not work as you expect it to. The reason is that the EOF flag is not set until you try to read from beyond the end of the file, leading the loop to iterate one time too many.
You should instead do e.g. while (fgets(...) != NULL)
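Putting both fixes together, the corrected loop could look something like this (a sketch, keeping the C-style I/O of the original code):

char bufferString[500];
while (fgets(bufferString, sizeof(bufferString), file) != NULL) {
    printf("%s", bufferString);
}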
The code you have is not very C++-ish; it's using the old C functions for file handling. I suggest you read more about the C++ standard I/O library and std::string, which is an auto-expanding string class that won't have the limits of C arrays and won't suffer from potential buffer overflows in the same way.
The code could then look something like this
std::ifstream input_file(path);
std::string input_buffer;
while (std::getline(input_file, input_buffer))
std::cout << input_buffer << '\n';

Parsing binary data from file

Thank you in advance for your help!
I am in the process of learning C++. My first project is to write a parser for a binary-file format we use at my lab. I was able to get a parser working fairly easily in Matlab using "fread", and it looks like that may work for what I am trying to do in C++. But from what I've read, it seems that using an ifstream is the recommended way.
My question is two-fold. First, what, exactly, are the advantages of using ifstream over fread?
Second, how can I use ifstream to solve my problem? Here's what I'm trying to do. I have a binary file containing a structured set of ints, floats, and 64-bit ints. There are 8 data fields all told, and I'd like to read each into its own array.
The structure of the data is as follows, in repeated 288-byte blocks:
Bytes 0-3: int
Bytes 4-7: int
Bytes 8-11: float
Bytes 12-15: float
Bytes 16-19: float
Bytes 20-23: float
Bytes 24-31: int64
Bytes 32-287: 64x float
I am able to read the file into memory as a char * array, with the fstream read command:
char * buffer;
ifstream datafile (filename,ios::in|ios::binary|ios::ate);
datafile.read (buffer, filesize); // Filesize in bytes
So, from what I understand, I now have a pointer to an array called "buffer". If I were to call buffer[0], I should get a 1-byte memory address, right? (Instead, I'm getting a seg fault.)
What I now need to do really ought to be very simple. After executing the above ifstream code, I should have a fairly long buffer populated with a number of 1's and 0's. I just want to be able to read this stuff from memory, 32-bits at a time, casting as integers or floats depending on which 4-byte block I'm currently working on.
For example, if the binary file contained N 288-byte blocks of data, each array I extract should have N members each. (With the exception of the last array, which will have 64N members.)
Since I have the binary data in memory, I basically just want to read from buffer, one 32-bit number at a time, and place the resulting value in the appropriate array.
Lastly - can I access multiple array positions at a time, a la Matlab? (e.g. array(3:5) -> [1,2,1] for array = [3,4,1,2,1])
Firstly, the advantage of using iostreams, and in particular file streams, relates to resource management. Automatic file stream variables will be closed and cleaned up when they go out of scope, rather than having to manually clean them up with fclose. This is important if other code in the same scope can throw exceptions.
Secondly, one possible way to address this type of problem is to simply define the stream insertion and extraction operators in an appropriate manner. In this case, because you have a composite type, you need to help the compiler by telling it not to add padding bytes inside the type. The following code should work on gcc and microsoft compilers.
#include <cstdint>
#include <istream>
#include <ostream>

#pragma pack(push, 1)   // tell the compiler not to insert padding between the members
struct MyData
{
    int i0;
    int i1;
    float f0;
    float f1;
    float f2;
    float f3;
    uint64_t ui0;
    float f4[64];
};
#pragma pack(pop)

std::istream& operator>>( std::istream& is, MyData& data ) {
    is.read( reinterpret_cast<char*>(&data), sizeof(data) );
    return is;
}
std::ostream& operator<<( std::ostream& os, const MyData& data ) {
    os.write( reinterpret_cast<const char*>(&data), sizeof(data) );
    return os;
}
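One possible way to use these operators would then be to read block after block into a container (a sketch; the vector and the loop are illustrative, and <fstream> and <vector> are assumed to be included):

std::vector<MyData> records;
std::ifstream datafile(filename, std::ios::in | std::ios::binary);
MyData block;
while (datafile >> block)      // stops once a full 288-byte block can no longer be read
    records.push_back(block);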
char * buffer;
ifstream datafile (filename,ios::in|ios::binary|ios::ate);
datafile.read (buffer, filesize); // Filesize in bytes
You need to allocate a buffer before you read into it:
buffer = new char[filesize];
datafile.read (buffer, filesize);
As to the advantages of ifstream, it is a matter of abstraction. You can abstract the contents of your file in a more convenient way. You then do not have to work with raw buffers, but can instead model the data with classes and hide the details of how it is stored in the file, for instance by overloading the << operator.
You might perhaps look for serialization libraries for C++. Perhaps s11n might be useful.
This question shows how you can convert data from a buffer to a certain type. In general, you should prefer using a std::vector<char> as your buffer. This would then look like this:
#include <fstream>
#include <iostream>
#include <vector>
#include <algorithm>
#include <iterator>

int main() {
    std::ifstream input("your_file.dat", std::ios::binary);
    std::vector<char> buffer;
    std::copy(std::istreambuf_iterator<char>(input),
              std::istreambuf_iterator<char>(),
              std::back_inserter(buffer));
}
This code will read the entire file into your buffer. The next thing you'd want to do is to write your data into valarrays (for the selection you want). valarray is constant in size, so you have to be able to calculate the required size of your array up-front. This should do it for your format:
std::valarray<int> array1(buffer.size()/288); // one entry per 288-byte block; needs <valarray>
std::valarray<int> array2(buffer.size()/288); // second int field, filled in the loop below
Then you'd use a normal for-loop to insert the elements into your arrays:
for (std::size_t i = 0; i < buffer.size()/288; i++) {
    array1[i] = *reinterpret_cast<int*>(&buffer[i*288]);     // first field (bytes 0-3)
    array2[i] = *reinterpret_cast<int*>(&buffer[i*288 + 4]); // second field (bytes 4-7)
}
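As a side note, casting into the middle of a char buffer like this relies on the bytes being suitably aligned and is technically outside what the standard guarantees; a variation of the same loop could copy each field out with std::memcpy instead, which is well defined. A sketch, reusing array1 and array2 from above:

// needs <cstring> for std::memcpy
for (std::size_t i = 0; i < buffer.size()/288; i++) {
    int first = 0, second = 0;
    std::memcpy(&first,  &buffer[i*288],     sizeof(first));  // bytes 0-3 of the block
    std::memcpy(&second, &buffer[i*288 + 4], sizeof(second)); // bytes 4-7 of the block
    array1[i] = first;
    array2[i] = second;
}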
Note that the sizes of the built-in types are not fixed by the standard; if int is not 4 bytes on your platform, this will not match the file layout as you expect. This question explains a bit about C++ and sizes of types.
The selection you describe there can be achieved using valarray.

difference between cout and write in c++

I am still confused about the difference between ostream& write ( const char* s, streamsize n ) and cout in C++.
The first function writes the block of data pointed to by s, with a size of n characters, into the output buffer. The characters are written sequentially until n have been written.
whereas cout is an object of class ostream that represents the standard output stream. It corresponds to the cstdio stream stdout.
Can anyone clearly bring out the differences between the two functions?
ostream& write ( const char* s , streamsize n );
is an unformatted output function, and what is written is not necessarily a C-string; therefore any null character found in the array s is copied to the destination and does not end the writing process.
cout is an object of class ostream that represents the standard output stream.
It can write characters either as formatted data, using for example the insertion operator ostream::operator<<, or as unformatted data, using the write member function.
You are asking what is the difference between a class member function and an instance of the class? cout is an ostream and has a write() method.
As to the difference between cout << "Some string" and cout.write("Some string", 11): they do the same thing; << might be a tiny bit slower, since write() can be optimized because it knows the length of the string in advance. On the other hand, << looks nicer and can be used with many types, such as numbers. You can write cout << 5;, but not cout.write(5).
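A small sketch contrasting the two forms (the raw write of the int just dumps its bytes, which will usually not be printable):

#include <iostream>

int main()
{
    int n = 5;
    std::cout << n;                                          // formatted: prints the text "5"
    std::cout.write(reinterpret_cast<char*>(&n), sizeof n);  // unformatted: writes the raw bytes of n
    std::cout << "hello";                                    // length of the C-string found automatically
    std::cout.write("hello", 5);                             // length must be given explicitly
    return 0;
}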
cout is not a function. Like you said, it is an object of class ostream. And as an object of that class, it possesses the write function, which can be called like this:
cout.write(source,size);
"In binary files, to input and output data with the extraction and insertion operators (<< and >>) and functions like getline is not efficient, since we do not need to format any data, and data may not use the separation codes used by text files to separate elements (like space, newline, etc...).
File streams include two member functions specifically designed to input and output binary data sequentially: write and read. The first one (write) is a member function of ostream inherited by ofstream. And read is a member function of istream that is inherited by ifstream. Objects of class fstream have both members. Their prototypes are:
write ( memory_block, size );
read ( memory_block, size );
"
from: http://www.cplusplus.com/doc/tutorial/files/
There is no function ostream& write ( const char* s , streamsize n ). Perhaps you are referring to the member function ostream& ostream::write ( const char* s , streamsize n )?
The .write() function is called raw (or unformatted) output. It simply outputs a series of bytes into the stream.
The global object cout is an instance of class ostream and has the .write() method. However, cout is typically used for formatted output, such as:
string username = "Poulami";
cout << "Username: '" << username << "'." << endl;
Many different types have an ostream& operator<<(ostream& stream, const UserDefinedType& data), which can be overloaded to enrich ostream's vocabulary.
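For example, a sketch of such an overload for a hypothetical Person type (not anything from the question) might look like this:

#include <iostream>
#include <string>

struct Person {
    std::string name;
    int age;
};

// Teach ostream how to print a Person with the usual formatted syntax.
std::ostream& operator<<(std::ostream& stream, const Person& data)
{
    return stream << data.name << " (" << data.age << ")";
}

int main()
{
    Person p{"Poulami", 30};
    std::cout << p << std::endl;   // prints: Poulami (30)
}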
Oh boy! A chance to smash up a question.
From your question I feel you are a Java or Python programmer and definitely not a beginner.
What you may not realize is that C++ lets programmers implement the primitive built-in operators as class members and as part of the general interface, in a way few other languages do.
In Java you could never go
class Money
{
int operator + (int cash) { return this.cash + cash; }
void operator << () { System.out.println(cash); }
int cash;
}
public class Main_
{
public static void Main(String [] args)
{
Money cashOnHand;
System << cashOnHand;
}
}
But C++ allows this to great effect. The class std::ostream implements the stream operators, but also implements a regular write function which does raw binary output.
I agree with Alok Save. A little while ago I searched this problem and read the answers carefully.
To put it another way: cout is an object of ostream, while write is just a member function it provides. So cout gives coders two ways to write output: one is the write member function, the other is the << operator.

binary read/write runtime failure

I've looked at binary reading and writing of objects in C++ but am having some problems. It "works", but in addition I get a huge output of errors/"info".
What I've done is
Person p2;
std::fstream file;
file.open( filename.c_str(), std::ios::in | std::ios::out | std::ios::binary );
file.seekg(0, std::ios::beg );
file.read ( (char*)&p2, sizeof(p2));
file.close();
std::cout << "Name: " << p2.name;
Person is a simple struct containing string name and int age. When I run the program it outputs "Name: Bob" since I have already made a program to write to a file (so the object is already in filename).
IN ADDITION to outputting the name it also outputs:
*** glibc detected *** program: double free or corruption (fasttop): ***
Backtrace:
...
Memory map:
...
Abort
Is the name string in the Person struct a character array or an STL string? You can't fill in an STL string by binary-reading data over the top of it, since its internal format is not serializable (it contains pointers).
It would be interesting to see how you write the information to file as well, as well as how the Person struct is built.
If you don't mind the file being plain text, my suggestion would be to write to the file using string::c_str() (which returns a const char*), and using itoa() or itoa_s() to get the integer as a char*.
You can also have one or several constructors in Person:
Person(const std::string& name, int age);
Person(const char* name, int age);
then, when you extract the data from the file you just call the constructor with that data.
Either p2.name is a char* and you are writing and reading the pointer value, not what it points to, or p2.name is a more complex type such as std::string, which uses pointers internally and has the same problem.
Serializing classes often needs more work than just dumping the memory representation.
You said you wrote the Person object to a file. Did you try to use a dump tool to see if what you have inside the file is what you are expecting?
Also, did you try using an ordinary char array instead of a string (as @bdk pointed out)?
When you use binary I/O like this, the size must be fixed. Using an STL string here is a problem, because the size of an STL string is arbitrary.
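One common workaround, sketched below under the assumption that Person holds a std::string name and an int age as described in the question, is to serialize the string explicitly: write its length first, then its characters, and reverse the process when reading (the helper functions save and load are purely illustrative):

#include <cstddef>
#include <fstream>
#include <string>

struct Person {
    std::string name;
    int age;
};

// Write the members explicitly instead of dumping the struct's memory.
void save(std::ofstream& file, const Person& p)
{
    std::size_t len = p.name.size();
    file.write(reinterpret_cast<const char*>(&len), sizeof(len));
    file.write(p.name.data(), len);
    file.write(reinterpret_cast<const char*>(&p.age), sizeof(p.age));
}

// Read them back in the same order, resizing the string first.
void load(std::ifstream& file, Person& p)
{
    std::size_t len = 0;
    file.read(reinterpret_cast<char*>(&len), sizeof(len));
    p.name.resize(len);
    file.read(&p.name[0], len);
    file.read(reinterpret_cast<char*>(&p.age), sizeof(p.age));
}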