Splitting files in C++ - c++

I need to split a file into multiple files without compression. I found this example on cpp reference:
#include <fstream>
using namespace std;
int main () {
char * buffer;
long size;
ifstream infile ("test.txt",ifstream::binary);
ofstream outfile ("new.txt",ofstream::binary);
// get size of file
infile.seekg(0,ifstream::end);
size=infile.tellg();
infile.seekg(0);
// allocate memory for file content
buffer = new char [size];
// read content of infile
infile.read (buffer,size);
// write to outfile
outfile.write (buffer,size);
// release dynamically-allocated memory
delete[] buffer;
outfile.close();
infile.close();
return 0;
}
I thought I could do it like this, but the problem is that I can only create the first file, because I can only read data from the beginning of the file. Can it be done this way? If not, what is the best way to split a file?

The example code doesn't split a file into multiple files; it just
copies the file. To split a file into multiple files, just don't close
the input. In pseudo-code:
open input
decide size of each block
read first block
while block is not empty (read succeeded):
    open new output file
    write block
    close output file
    read another block
The important part is not closing the input file, so that each read
picks up exactly where the preceding read ended.
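The pseudo-code above could be turned into C++ roughly as follows. This is only a sketch: the function name split_file and the output naming scheme (prefix followed by a running number) are my own assumptions, not something from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Splits `input_path` into files named <prefix>0, <prefix>1, ... holding at
// most `block_size` bytes each. Returns the number of parts written.
// (Hypothetical helper; name and naming scheme are illustrative.)
std::size_t split_file(const std::string& input_path,
                       const std::string& prefix,
                       std::size_t block_size) {
    assert(block_size > 0);
    std::ifstream in(input_path, std::ios::binary);
    if (!in) return 0;  // input could not be opened

    std::vector<char> block(block_size);
    std::size_t parts = 0;
    for (;;) {
        // The input stays open, so each read continues where the last ended.
        in.read(block.data(), static_cast<std::streamsize>(block.size()));
        const std::streamsize got = in.gcount();  // bytes actually read
        if (got <= 0) break;                      // block is empty: done

        std::ofstream out(prefix + std::to_string(parts),
                          std::ios::binary | std::ios::trunc);
        if (!out) break;                          // could not create the part
        out.write(block.data(), got);             // last part may be shorter
        ++parts;
    }
    return parts;
}
```

Calling split_file("test.txt", "part", 1 << 20) would then write part0, part1, and so on, each up to 1 MiB. Only one block is held in memory at a time, so the input size does not matter.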

You can seek the stream to the desired position and then read from it. Look at this piece of your own code:
// get size of file
infile.seekg(0,ifstream::end);
size=infile.tellg();
infile.seekg(0);
All you need to do is remember the position where you stopped reading infile, close outfile, open a new outfile, reallocate the buffer, read the next part of infile into it, and write it to the second outfile.
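If you prefer seeking explicitly, something like the following sketch would fetch one part at a time. The helper name read_part and the idea of numbering parts from 0 are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>

// Returns the bytes of part number `index` (0-based) when the file is viewed
// as consecutive parts of `part_size` bytes. The result is shorter than
// `part_size` for the last part and empty past the end of the file.
// (Hypothetical helper, not from the question.)
std::string read_part(const std::string& path,
                      std::size_t index,
                      std::size_t part_size) {
    assert(part_size > 0);
    std::ifstream in(path, std::ios::binary);
    if (!in) return std::string();

    // Jump straight to the position where this part starts.
    in.seekg(static_cast<std::streamoff>(index * part_size), std::ios::beg);

    std::string part(part_size, '\0');
    in.read(&part[0], static_cast<std::streamsize>(part_size));
    // Keep only the bytes that were really read.
    part.resize(static_cast<std::size_t>(in.gcount()));
    return part;
}
```

This reopens the file for every part, which is simple but not the fastest option; keeping one stream open and reading sequentially avoids both the reopen and the seek.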

You can read data from anywhere in the file - you already moved to the end and back to the start successfully.
You don't need to though: just write a loop to sequentially read each outputSize and write it to a new file, for some outputSize < size.

Why reinvent the wheel? Try split.
Its source code is available too, in case you want ideas for implementing it in C++.

I think I have a solution to your problem.
Read the whole input file into a char array.
Then write the first half of the array to one file and the second half to another.
For example:
#include <fstream>
using namespace std;
int main () {
char * buffer;
long size;
ifstream infile ("test.txt",ifstream::binary);
ofstream outfile ("new.txt",ofstream::binary);
ofstream outfile2 ("new2.txt",ofstream::binary);
// get size of file
infile.seekg(0,ifstream::end);
size=infile.tellg();
infile.seekg(0);
// allocate memory for file content
buffer = new char [size];
// read content of infile
infile.read (buffer,size);
// write the first half to outfile and the remaining bytes to outfile2
outfile.write (buffer,size/2);
outfile2.write (buffer+size/2,size-size/2);
// release dynamically-allocated memory
delete[] buffer;
outfile.close();
infile.close();
outfile2.close();
return 0;
}
You can also read the first half, write it, then read the second half and write it. Have a look at this:
#include <fstream>
using namespace std;
int main () {
char * buffer1;
char * buffer2;
long size;
long halfSize;
ifstream infile ("test.txt",ifstream::binary);
ofstream outfile ("new.txt",ofstream::binary);
ofstream outfile2 ("new2.txt",ofstream::binary);
// get size of file
infile.seekg(0,ifstream::end);
size=infile.tellg();
infile.seekg(0);
// integer division already rounds down
halfSize = size/2;
// allocate memory for each half
buffer1 = new char[halfSize];
buffer2 = new char[size-halfSize];
// read each half; the second read continues where the first one stopped
infile.read (buffer1,halfSize);
infile.read (buffer2,size-halfSize);
// write each half to its own file
outfile.write (buffer1,halfSize);
outfile2.write (buffer2,size-halfSize);
// release dynamically-allocated memory
delete[] buffer1;
delete[] buffer2;
outfile.close();
infile.close();
outfile2.close();
return 0;
}

Related

Read big files in C++ but also small files as well in C++?

I want to write a C++ program that reads huge files (around 50 GB each) on a machine with only 4 or 8 GB of RAM.
I want the algorithm to be fast and to work with small files as well.
This is the code I have so far:
#include<iostream>
#include<fstream>
#include<string>
using namespace std;
//Making a buffer to store the chunks of the file read:
// Buffer size 1 Megabyte (or any number you like)
size_t buffer_size = 1<<20;
char *buffer = new char[buffer_size];
int main(){
string filename="stats.txt";
//compute file size
size_t iFileSize = 0;
std::ifstream ifstr(filename.c_str(), std::ios::binary); // create the file stream - this is scoped for destruction
if(!ifstr.good()){
cout<<"File is not valid!"<<endl;
exit(EXIT_FAILURE);
}
//get the file size
iFileSize = ifstr.tellg();
ifstr.seekg( 0, std::ios::end ); // open file at the end to get the size
iFileSize = (int) ifstr.tellg() - iFileSize;
cout<<"File size is: "<<iFileSize<<endl;
//close the file and reopen it for reading:
ifstr.close();
cout<<"Buffer size before check is:"<<buffer_size<<endl;
if(buffer_size>iFileSize){
buffer_size=iFileSize;
}
cout<<"Buffer size after check is:"<<buffer_size<<endl;
ifstream myFile;
myFile.open(filename);
if(myFile.fail()){
cerr<<"Error opening file!"<<endl;
exit(EXIT_FAILURE);
}
if(!myFile.good()){
cout<<"File is not valid!"<<endl;
exit(EXIT_FAILURE);
}
if(!myFile.is_open()){
cout<<"File is NOT opened anymore!"<<endl;
return 1;
}
while(myFile.is_open()&&myFile){
// Try to read next chunk of data
myFile.read(buffer, buffer_size);
// Get the number of bytes actually read
size_t count = myFile.gcount();
// If nothing has been read, break
if (!count){
break;
}
// Do whatever you need with first count bytes in the buffer:
string line;
while(getline(myFile, line)){
if(!line.empty()){
cout <<"Line: '" << line << "'" <<endl;
}
}
}
delete[] buffer;
buffer = NULL;
myFile.close();
return 0;
}
My files can have blank lines between the text lines, and even the first lines can be blank.
I tested the program on a small file (128 KB) to see how it works, but it doesn't: it displays no lines at all, even though the file is so small.
What is wrong? Also, if I change the buffer size to a very small number, it reads just the first one or two lines. Why doesn't it loop to the end of the file and display every line of that small file? Any help, please?
Thank you in advance!
This is the test file: (It starts with a few blank lines also.)
Population UK: 97876876723
Population France: 898989
This is the test end of the file: Yay!
This is the result:
And no line from file is displayed.
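One likely cause, judging only from the posted code: buffer_size is clamped to the file size, so the single myFile.read() call consumes the whole file into buffer, and the inner getline loop finds nothing left to read. Bytes taken by read() are never seen by getline. If the goal is just to process the file line by line, a sketch like the one below may be enough; the stream does its own buffering, so memory use depends on the longest line and not on the file size. The name for_each_nonempty_line is made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <istream>
#include <string>

// Calls `visit(line)` for every non-empty line of `in` and returns how many
// lines were visited. Blank lines anywhere, including at the start, are
// skipped. (Hypothetical helper; works with std::ifstream or any istream.)
template <typename Visitor>
std::size_t for_each_nonempty_line(std::istream& in, Visitor visit) {
    std::size_t seen = 0;
    std::string line;
    while (std::getline(in, line)) {  // stops at EOF or on a read error
        if (line.empty()) continue;   // skip blank lines
        visit(line);
        ++seen;
    }
    return seen;
}
```

With a std::ifstream opened on stats.txt, passing a lambda that prints its argument would display every non-empty line, whether the file is 128 KB or 50 GB.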

How to create an array by reading a file in c++ if every line contains an integer?

In C++ I need to read a file in which every line contains an integer and store every integer in an array. I tried counting the lines with getline() and then creating the array. However, counting the lines consumes the stream, so when I call getline() again it reads nothing. What should I do? Thank you.
ifstream inFile( fileName );
if ( inFile.is_open() ) {
int size = 0;
string line;
while( getline(inFile, line))
size++;
int* array = new int [ size ];
while ( getline( inFile, line )) {
....
}
}
The code does not enter the second while.
Reset the stream before the second while loop:
inFile.clear();
inFile.seekg(0, std::ios::beg);
while ( getline( inFile, line )) {
....
When this loop completes:
while( getline(inFile, line))
// ...
the stream inFile has been exhausted, and there's no more data to read from it.
One option is to open the file again, and then read from it. However, this is wasteful, since you can keep track of the numbers you read from the file the first time:
std::vector<int> v;
int i;
while (inFile >> i)
v.push_back(i);
Now you have all the numbers in a container, and you don't need to read from the file again.
Note that I'm using std::vector instead of an array, since it's much easier to work with.
If you absolutely must use an array, then you can read the file once to figure out how many integers are in the file:
int i, count = 0;
while (inFile >> i)
++count;
and then allocate memory for an array:
int *array = new int[count];
and then rewind the stream (clearing its EOF state first), and read into the array:
inFile.clear();
inFile.seekg(0, std::ios::beg);
int k = 0;
while (k < count && inFile >> array[k])
++k;
You can read the file into a string and then work with the string instead of the file:
std::ifstream t("file.txt");
std::stringstream buffer;
buffer << t.rdbuf();
std::string stringOfNumbers = buffer.str();
From there, split the string on newline characters and parse each piece as an int.
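The splitting and parsing step could look like the sketch below. The function name parse_ints is hypothetical, and silently skipping blank or non-numeric lines is my assumption about what is wanted.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Parses one integer per line from `text`. Lines that are blank or do not
// start with an integer are skipped. (Hypothetical helper.)
std::vector<int> parse_ints(const std::string& text) {
    std::vector<int> values;
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line)) {   // split on newline characters
        std::istringstream fields(line);
        int value;
        if (fields >> value) {            // parse the piece as an int
            values.push_back(value);
        }
    }
    return values;
}
```

Passing stringOfNumbers from the snippet above would return the integers in file order, and the vector's size() replaces the separate counting pass.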

Using fstream and fstream.eof. Working with files

I'm trying to write a program that reads a file, replaces a specified word with '#' symbols, and writes the result back to the same file. But I have problems with that.
1st question.
It seems I need to store the file in a buffer before writing it back to the file. How should I do that?
2nd question:
I can't understand why the loop in this code never ends. There are about 200 words in the file, but I always get a memory exception and i reaches 10075.
int main(int argc, char* argv[]){
char** temp = new char*[10000];
int i = 0;
fstream fTemp("D:\doc.txt", ios_base::in);
while (!fTemp.eof()){
temp[i] = new char[50];
fTemp >> temp[i];
temp[i][1] = '#';
cout << temp[i] << endl;
i++;
}
fTemp.open("D:\doc.txt", ios_base::trunc);
for (int i = 0; i < sizeof(*temp); i++){
fTemp << temp[i];
}
_getch();
}
First, you should use getline, because your use of eof() is incorrect (the eof bit is set only after a read has failed).
Next, store the strings in a std::vector<std::string>. That frees you from manual memory management (the current code leaks) and gives a more flexible solution.
std::string buffer;
std::vector<std::string> data;
while(std::getline(fTemp,buffer)) {
data.push_back(buffer);
}
The problem is probably the incorrect eof() call, but you should check your cout output to pin down what goes wrong in this code.
To store the contents of the file in a buffer, you can get the size of the file and use read() to load all of it. See this code:
// Load file in a buffer
ifstream fl(FileName);
fl.seekg(0, ios::end);
size_t len = fl.tellg();
char* fdata = new char[len];
fl.seekg(0, ios::beg);
fl.read(fdata, len);
fl.close();
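In your case, the same fstream that you opened for reading is reused for writing, without closing the file before reopening it.
Your write loop is bounded by sizeof(*temp), which is the size of a pointer and not the number of words read; nothing in the code tracks that count. It is better to record a size while the file is open; in this case the size of the file is the "size_t len" above. Note also that while (!fTemp.eof()) cannot end if the file never opened (the unescaped backslash in "D:\doc.txt" makes that likely), because a stream that fails every read never reaches EOF.
To rewrite the file you can create another stream. See this code: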
// Write File
ofstream flOut(FileName, ios_base::trunc);
flOut.write(fdata, len);
flOut.close();
Between these two snippets you can modify the contents of fdata. But what exactly do you want to do? Replace some word with the '#' symbol? Which word?
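Assuming the goal is to overwrite every occurrence of one given word with the same number of '#' characters, the in-memory step could be sketched like this. The name mask_word is made up, and the match is a plain substring search, so "cat" inside "catalog" would be masked too.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns a copy of `text` in which every occurrence of `word` is replaced
// by '#' repeated word.size() times, so the text keeps its length.
// (Hypothetical helper; substring match, not whole-word match.)
std::string mask_word(std::string text, const std::string& word) {
    assert(!word.empty());
    std::size_t pos = 0;
    while ((pos = text.find(word, pos)) != std::string::npos) {
        text.replace(pos, word.size(), word.size(), '#');
        pos += word.size();  // continue after the masked occurrence
    }
    return text;
}
```

The buffer read from the file can be wrapped as std::string(fdata, len), passed through mask_word, and the result written back with the output stream.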

C++ writing to text using buffer returns non ascii character at end of file.

I'm fairly new to C++ and am trying to read and write a binary file. I used the read and write functions to read text from one file and output it to a new file. However, the characters "ÌÌ" always appear at the end of the created text file. Is some character that marks the end of the file being saved in the character buffer?
int main(){
ifstream myfile("example.txt", ios::ate);
ofstream outfile("new.txt");
ifstream::pos_type size;
char buf [1024];
if(myfile.is_open()){
size=myfile.tellg();
cout<<"The file's size is "<<(int) size<<endl;
myfile.seekg(0,ios::beg);
while(!myfile.eof()){
myfile.read(buf, sizeof(buf));
}
outfile.write(buf,size);
}
else
cout<<"Error"<<endl;
myfile.close();
outfile.close();
cin.get();
return 0;
}
This is not the only problem with your code (try it on a file bigger than 1024 bytes), but since you are doing binary I/O you need:
ifstream myfile("example.txt", ios::ate|ios::binary);
ofstream outfile("new.txt", ios::binary);
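For files bigger than the buffer, the write has to happen inside the loop, and only the bytes actually read should be written. A minimal sketch, with copy_stream as a made-up name, assuming both streams were opened with ios::binary as shown above:

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <istream>
#include <ostream>

// Copies everything from `in` to `out` in 1024-byte chunks and returns the
// number of bytes copied. (Hypothetical helper.)
std::size_t copy_stream(std::istream& in, std::ostream& out) {
    char buf[1024];
    std::size_t total = 0;
    // read() fails on the final, partial chunk, so also check gcount().
    while (in.read(buf, sizeof buf) || in.gcount() > 0) {
        out.write(buf, in.gcount());  // write only what was really read
        total += static_cast<std::size_t>(in.gcount());
    }
    return total;
}
```

Because gcount() bounds every write, uninitialized parts of buf never reach the output file.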

Open a new file using std::ifstream

I open a file using:
std::ifstream ifs(filename);
I want to open a new file using the same ifs variable. How do I do that?
ifs.close();
ifs.open(newfilename);
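Please keep in mind that std::ifstream::close() does not clear the stream's state flags,
which may still hold values from the previous file. Clear the flags with clear() before using the stream with another file. (Since C++11 a successful open() clears the state itself, so this matters mainly with older standard libraries; calling clear() is harmless either way.)
Example: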
ifstream mystream;
mystream.open("myfile");
while(mystream.good())
{
// read the file content until EOF
}
mystream.clear(); // if you do not do it the EOF flag remains switched on!
mystream.close();
mystream.open("my_another_file");
while(mystream.good()) // if not cleared, this loop will not start!
{
// read the file
}
mystream.close();
ifs.close(); //close the previous file that was open
ifs.open("NewFile.txt", std::ios::in); //opens the new file in read-only mode
if(!ifs) //checks to see if the file was successfully opened
{
std::cout<<"Unable to read file...\n";
return;
}
char* word = new char[SIZE]; //allocate whatever size you want to
while(ifs>>word)
{
//do whatever
}
ifs.close(); //close the new file
delete[] word; //free the allocated memory
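Putting the pieces together, here is a sketch that reuses one std::ifstream for several files. The function name count_words is hypothetical, and it reads into a std::string so that no fixed-size char buffer can overflow on a long word.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Counts whitespace-separated words across all files in `paths`, reusing a
// single ifstream object. Files that cannot be opened are skipped.
// (Hypothetical helper.)
std::size_t count_words(const std::vector<std::string>& paths) {
    std::ifstream ifs;
    std::size_t words = 0;
    for (const std::string& path : paths) {
        ifs.open(path);
        if (!ifs) {      // open failed: reset the state and move on
            ifs.clear();
            continue;
        }
        std::string word;
        while (ifs >> word) {
            ++words;
        }
        ifs.clear();     // drop eofbit/failbit left by the last read
        ifs.close();     // release the file before opening the next one
    }
    return words;
}
```

The explicit clear() calls are redundant on a C++11 library when the next open() succeeds, but they keep the behaviour the same on older ones.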