how to reverse order of lines in a file [duplicate] - c++

How can we reverse the order of the lines in a file, not the lines themselves?
The file can get huge.
No assumption should be made about the length of a line.
Input:
this is line1
this is line2
this is line3
Example Output:
this is line3
this is line2
this is line1
I thought of using another file as a buffer, like a stack data structure, but could not really get anywhere with it.
Any thoughts on this?

Read in large blocks of the file starting at both ends. Inside those blocks, swap the first line for the last line and then move both pointers to keep track of where you are. Write out each block as you fill it. When the two pointers meet in the middle, you are done.
Don't try to modify the blocks in place, that will make things more complicated. Use four blocks, the first read block, the first write block, the last read block, and the last write block. As each write block is complete, write it out. As each read block is exhausted, read in another one. Be careful not to overwrite anything you've not yet read!
It should be fairly straightforward, just tedious. If you don't need it to be optimal, you can just read blocks backwards and write out a new file and then move it on top of the existing file.
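The simpler variant mentioned last (read blocks backwards and write a new file) could be sketched roughly as follows. This is only a sketch under some assumptions: reverse_lines and block_size are names made up for illustration, lines are terminated by '\n', and the longest line still has to fit in memory even though the whole file does not.

```cpp
#include <algorithm>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: writes the lines of `input` to `output` in reverse order.
// Memory use is roughly one block plus the longest line, not the whole file.
bool reverse_lines(const std::string& input, const std::string& output,
                   std::size_t block_size = 64 * 1024)
{
    std::ifstream fin(input, std::ios::binary);
    std::ofstream fout(output, std::ios::binary);
    if (!fin || !fout || block_size == 0) return false;

    fin.seekg(0, std::ios::end);
    std::streamoff remaining = fin.tellg();   // bytes not read yet, counted from the start
    const bool non_empty = remaining > 0;

    std::vector<char> block(block_size);
    std::string tail;          // end of a line whose start lies in an earlier block
    bool first_block = true;

    while (remaining > 0) {
        const std::streamoff n =
            std::min(remaining, static_cast<std::streamoff>(block_size));
        remaining -= n;
        fin.seekg(remaining);
        fin.read(block.data(), n);

        std::string data(block.data(), static_cast<std::size_t>(n));
        data += tail;
        // Drop the terminator of the very last line so it is not seen as an empty line.
        if (first_block && data.back() == '\n') data.pop_back();
        first_block = false;

        // Emit every complete line found in this data, last line first.
        std::size_t end = data.size();
        std::size_t nl;
        while (end > 0 && (nl = data.rfind('\n', end - 1)) != std::string::npos) {
            fout.write(data.data() + nl + 1,
                       static_cast<std::streamsize>(end - nl - 1));
            fout.put('\n');
            end = nl;
        }
        tail = data.substr(0, end);
    }
    if (non_empty) fout << tail << '\n';      // the first line of the input
    return static_cast<bool>(fout);
}
```

Calling reverse_lines("test.in", "test.out") and then renaming test.out over test.in would give the "move it on top of the existing file" step.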

If the file won't fit in memory, then it's a two-pass process. The first pass, you read chunks of the file (as many lines as will fit into memory), and then write them to a temporary file in reverse order. So you have:
while not end of input
read chunk of file into array of lines
write lines from array to temporary file, in reverse order
end while
When you're done with the first pass, you'll have a bunch of temporary files: temp1.txt, temp2.txt, temp3.txt ... tempN.txt.
Now open the last file (tempN.txt) for append, and start appending the files in reverse order. So you have:
open fileN for append
fileno = N-1
while fileno > 0
append file_fileno to fileN
fileno--
end while
Then rename tempN.txt and delete the other temporary files.
By the way, you can use the operating system supplied concatenation utility for step 2. On Windows, for example, you could replace step 2 with:
copy /A file4.txt+file3.txt+file2.txt+file1.txt mynewfile.txt
There are similar utilities on other platforms.
You might run into command line length limitations, though.
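For illustration, the two passes might look like this in C++. It is a sketch, not a tuned implementation: the function name, the ".partN" temporary file names and the max_lines_per_chunk limit are assumptions, and it writes the concatenation to a separate output file instead of appending to the last temporary file.

```cpp
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: reverses the line order of `input` into `output`
// while holding at most `max_lines_per_chunk` lines in memory.
bool reverse_lines_chunked(const std::string& input, const std::string& output,
                           std::size_t max_lines_per_chunk = 100000)
{
    std::ifstream fin(input);
    if (!fin || max_lines_per_chunk == 0) return false;

    // Pass 1: write each chunk of lines, reversed, to its own temporary file.
    std::vector<std::string> temp_names;
    std::vector<std::string> lines;
    std::string line;
    while (true) {
        lines.clear();
        while (lines.size() < max_lines_per_chunk && std::getline(fin, line))
            lines.push_back(line);
        if (lines.empty()) break;             // nothing left to read

        const std::string name = output + ".part" + std::to_string(temp_names.size());
        std::ofstream tmp(name);
        if (!tmp) return false;
        for (auto it = lines.rbegin(); it != lines.rend(); ++it)
            tmp << *it << '\n';
        temp_names.push_back(name);
    }

    // Pass 2: concatenate the temporary files, last one first.
    std::ofstream fout(output);
    if (!fout) return false;
    for (auto it = temp_names.rbegin(); it != temp_names.rend(); ++it) {
        {
            std::ifstream tmp(*it);
            fout << tmp.rdbuf();              // temp files are never empty here
        }                                     // closed before it is deleted
        std::remove(it->c_str());
    }
    return static_cast<bool>(fout);
}
```

The chunk limit is counted in lines for simplicity; with very long lines a byte budget would be the safer choice.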

It can be done in 2 simple steps:
step 1: reverse the whole file, character by character
step 2: reverse each line
step:  0        1        2
---------------------------
       abc      zyx      xyz
       1234  => 4321  => 1234
       xyz      cba      abc
EDIT: here is a complete solution:
#include <iostream>
#include <fstream>
#include <algorithm>
#define BUFFSIZE 4098 /*make sure this is larger than the longest line...*/
using namespace std;
bool reverse_file(const char* input, const char* output)
{
streamsize count=0;
streamoff size=0,pos;
char buff[BUFFSIZE];
ifstream fin(input);
ofstream fout(output);
if(fin.fail() || fout.fail()){
return false;
}
fin.seekg(0, ios::end);
size = fin.tellg();
fin.seekg(0);
while(!fin.eof()){
fin.read(buff, BUFFSIZE);
count = fin.gcount();
reverse(buff,buff+count);
pos = fin.tellg();
if(pos<0) {
pos = size;
}
fout.seekp(size - pos);
fout.write(buff,count);
}
return true;
}
bool reverse_file_lines(const char* input, const char* output)
{
streamsize count=0;
char buff[BUFFSIZE];
ifstream fin(input);
ofstream fout(output);
if(fin.fail() || fout.fail()){
return false;
}
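while(!fin.eof()){
fin.getline(buff, BUFFSIZE);
/*if a line does not fit in BUFFSIZE, getline sets failbit,
which this loop does not handle...*/
count = fin.gcount();
/*gcount includes the extracted newline, which getline stores as '\0';
the count>0 check avoids reading buff[-1] when nothing was extracted*/
if(count>0 && buff[count-1]==0)count--;
reverse(buff,buff+count);
fout.write(buff,count);
if(!fin.eof()){
fout<<endl;
}
}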
return true;
}
int main()
{
reverse_file("test.in", "test.tmp");
reverse_file_lines("test.tmp","test.out");
return 0;
}

Related

deleting a line in a .txt file without using another file or array in C++

The professor told us to delete a line in a .txt file without the help of another file or an array.
I tried to replace the line with a backspace, but it writes the BS character instead:
void rem()
{
fstream f("test.txt");
f.seekp(3, ios_base::beg);
f.write("\b",sizeof(char));
f.close();
}
1
2
3
4
5
I want to remove the 2:
1
3
4
5
After searching for a few hours I found that everyone uses another file or a vector, or they try to replace the line with BS like me.
EDIT:
This is the correct code:
void skip_line(std::fstream& f)
{
char c;
f.get(c);
f.ignore(50, '\n');
}
void getpos(int& readpos, int& writepos)
{
std::fstream f("test.txt", std::ios::in | std::ios::binary);
skip_line(f);
writepos = f.tellg();
skip_line(f);
readpos = f.tellg();
f.close();
}
void rem()
{
int writepos, readpos, length, newSize;
std::fstream f;
getpos(readpos, writepos);
f.open("test.txt");
length = readpos - writepos;
f.seekg(readpos);
for (char c; f.get(c);)
{
f.seekg(writepos);
if (c != '\n') f.put(c);
readpos++;
writepos++;
f.seekg(readpos);
}
f.close();
//fs::path p = "test.txt"
newSize = fs::file_size("test.txt") - length;
fs::resize_file("test.txt", newSize);
}
The result:
Before:
111111
222222
333333
444444
555555
After:
111111
333333
444444
555555
Backspace will not do what you hoped for. A backspace character takes up one char just like any other character. When printed on devices capable of moving the cursor backwards, that's what'll happen. It's just a visual thing and it does not work with files.
Since you are not allowed to use another file or arrays, I'm going to assume that std::vectors and std::strings are also forbidden so I suggest shifting everything down in the file, one char at a time, to overwrite the line to be removed.
You will need a function like std::getline which is capable of reading a line from a stream into a std::string - but you do not need to store any data so we can call it skip_line. It could look like this:
std::istream& skip_line(std::istream& is) {
// read until reading fails or a newline is read:
for(char ch; is.get(ch) && ch != '\n';);
return is;
}
When you've opened the file, call skip_line until you've reached the line you want to remove. If you want to remove line 2, call skip_line 1 time. If you instead want to remove line 3, call skip_line 2 times.
The get position in the stream (f.tellg()) is now where you should start writing when you move everything in the file back to overwrite the line to be removed. Store this position in a variable called writepos.
Call skip_line one time. The get position is now where you should start reading when moving the contents of the file. Store this position in a variable called readpos.
Calculate and store the length of the line to be removed: lenght_of_line_to_be_removed = readpos - writepos.
Now, you need to read one char at a time from the readpos position and write that char to the writepos position. It could look like this:
f.seekg(readpos); // set the _get_ position where we should read from
for(char ch; f.get(ch);) { // loop for as long as you can read a char
f.seekp(writepos); // set the _put_ position where you should write to
f.put(ch); // ...and write the char
writepos += 1; // step both positions forward
readpos += 1; // -"-
f.seekg(readpos); // set the new _get_ position
}
When the above is done, everything is "shifted down" in the file - but the size of the file will still be the same as it was before:
original: 1 2 3 4 5
after : 1 3 4 5 5
If you use C++17 or newer, you can use the standard functions std::filesystem::file_size and std::filesystem::resize_file to fix this: the new size is the current file size minus the lenght_of_line_to_be_removed that you stored above. If you use a version of C++ that does not have std::filesystem, you need to use a platform-specific function. POSIX systems have the truncate function, which can be used for this.
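Putting the steps above together, a complete version might look like the sketch below. It assumes C++17 (for std::filesystem), opens the file in binary mode so that stream positions are plain byte offsets, and uses a made-up function name, remove_line, with a 1-based line number.

```cpp
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

// Reads and discards characters up to and including the next newline.
std::istream& skip_line(std::istream& is) {
    for (char ch; is.get(ch) && ch != '\n';) {}
    return is;
}

// Hypothetical helper: removes line `line_no` (1-based) from the file in place,
// moving one char at a time, without a second file or a buffer.
bool remove_line(const std::string& path, int line_no)
{
    std::fstream f(path, std::ios::in | std::ios::out | std::ios::binary);
    if (!f || line_no < 1) return false;

    for (int i = 1; i < line_no; ++i) skip_line(f);
    if (!f) return false;                       // the file has too few lines
    std::streamoff writepos = f.tellg();        // start of the line to remove

    skip_line(f);
    f.clear();                                  // the line may end at end-of-file
    std::streamoff readpos = f.tellg();         // start of the following line
    const std::streamoff removed = readpos - writepos;
    if (removed == 0) return false;             // there is no such line

    f.seekg(readpos);
    for (char ch; f.get(ch);) {                 // shift the rest of the file down
        f.seekp(writepos++);
        f.put(ch);
        f.seekg(++readpos);
    }
    f.close();

    // The file still has its old length, so cut off the leftover bytes.
    std::error_code ec;
    const std::uintmax_t size = std::filesystem::file_size(path, ec);
    if (ec) return false;
    std::filesystem::resize_file(path, size - static_cast<std::uintmax_t>(removed), ec);
    return !ec;
}
```

With the five-line example file, remove_line("test.txt", 2) should leave the lines 1, 3, 4 and 5.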

Reading lines of txt file into array prints only the last element

First of all, I haven't coded in C++ for more than 8 years, but there is a hobby project I would like to work on where I ran into this issue.
I checked a similar question: Only printing last line of txt file when reading into struct array in C
but in my case I don't have a semicolon at the end of the while cycle.
Anyway, so I have a nicknames.txt file where I store nicknames, one in each line.
Then I want to read these nicknames into an array and select one random element of it.
Example nicknames.txt:
alpha
beta
random nickname
...
Pirate Scrub
Then I read the TXT file:
int nicknameCount = 0;
char *nicknames[2000];
std::string line;
std::ifstream file("nicknames.txt");
FILE *fileID = fopen("asd.txt", "w");
while (std::getline(file, line))
{
nicknames[nicknameCount++] = line.data();
// (1)
fprintf(fileID, "%i: %s\n", nicknameCount - 1, nicknames[nicknameCount - 1]);
}
int randomNickIndex = rand() % nicknameCount;
// (2)
for (int i = 0; i < nicknameCount; i++)
fprintf(fileID, "%i: %s\n", i, nicknames[i]);
fprintf(fileID, "Result: %s\n", nicknames[randomNickIndex]);
fprintf(fileID, "Result: %i\n", randomNickIndex);
fclose(fileID);
exit(0);
What then I see at point (1) is what I expect; the nicknames. Then later at point (2) every single member of the array is "Pirate Scrub", which is the last element of the nicknames.txt.
I think it must be something obvious, but I just can't figure it out. Any ideas?
line.data() returns a pointer to the string's internal sequence of characters. Unless the string reallocates, it is the same pointer every time. Every time you read a new line, the contents of line are overwritten. To fix this, you will need to copy the contents of line.
Change:
char *nicknames[2000];
to
char nicknames[2000][256];
and
nicknames[nicknameCount++] = line.data();
to
strcpy(nicknames[nicknameCount++], line.data());
Note that strcpy needs <cstring> and will overflow the buffer if a line is longer than 255 characters. Using a vector to store the lines is probably better, since this is C++.
Your nicknames array does not contain copies of the strings, all the nicknames are pointers to the same data owned by line.
Instead of char* nicknames[2000], I would recommend you use
std::vector<std::string> nicknames;
and then inside the loop:
nicknames.push_back(line);
This:
char *nicknames[2000];
is an array of 2000 pointers to char. Nowhere in your code are you actually storing the strings from the file. This
nicknames[nicknameCount++] = line.data();
merely stores a pointer to line's internal buffer in the array. In the next iteration this buffer is overwritten with the contents of the next line.
Forget about all the C i/o. Mixing C and C++ is advanced and you don't need it here. If you want to store a dynamically sized array of strings in C++, that is a std::vector<std::string>:
std::vector<std::string> lines;
std::string line;
while (std::getline(file, line))
{
lines.push_back(line);
}
Also for writing to the output file you should use an std::ofstream.

Go to a specific line in file and read it

Problem description
I have a file containing a set of lines.
File 1:
"Hello How are you"
"The cat ate the mouse"
Based on the beginning and ending of the lines given by the user as input, I want to go to each line in the file and extract it.
For example, if the user types 1, 17 then I have to go to line 1, which has a size of 17 characters. The user can give any line number in the file.
I read the following answer: Read from a specific spot in a file C++. But I didn't really understand it. Why do the lines have to be the same size? If I have the information concerning the beginning and ending of every line in the file, why can't I access it directly?
Source code
I tried to use the following code, which was inspired by Read Data From Specified Position in File Using Seekg, but I couldn't extract the lines. Why?
#include <fstream>
#include <iostream>
using namespace std;
void getline(int, int, ifstream & );
int main()
{
    //open file1 containing the sentences
    ifstream file1("file1.txt");
    int beg = 1;
    int end = 17;
    getline(beg,end, file1);
    beg = 2;
    end = 20;
    getline(beg,end, file1);
    return 0;
}
//the stream is taken by non-const reference: seekg and read modify it
void getline(int beg, int end, ifstream & file)
{
    file.seekg(beg, ios::beg);
    int length = end;
    char * buffer = new char [length];
    file.read (buffer,length);
    buffer [length - 1] = '\0';
    cout.write (buffer,length);
    delete[] buffer;
}
This code appears to be using line numbers as byte offsets. If you seek to offset "1" the file seeks forward 1 byte, not 1 line. If you seek to offset 2, the file seeks forward 2 bytes, not 2 lines.
To seek to a specific line you need to read the file and count the number of line breaks until you get to the line you want. There is code that already does this, for example std::getline(). If you don't already know the exact byte offset of the line you want, you can call std::getline() the number of times equal to the line number you want.
Also remember that the first byte of a file is at offset 0 not offset 1, and that different platforms use different bytes to indicate the end of a line (E.g. on Windows it's "\r\n", on Unix it's "\n"). If you're using a library function to read lines, the line ending should be taken care of for you.
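A minimal sketch of the "call std::getline() until you reach the line" idea, assuming 1-based line numbers; read_line_number is a made-up name for illustration:

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: stores line `line_number` (1-based) of the file in `out`.
// Returns false if the file cannot be opened or has fewer lines than that.
bool read_line_number(const std::string& path, std::size_t line_number,
                      std::string& out)
{
    std::ifstream file(path);
    if (!file || line_number == 0) return false;

    // Read and discard lines until the requested one has been read.
    for (std::size_t current = 1; std::getline(file, out); ++current) {
        if (current == line_number) return true;
    }
    out.clear();
    return false;
}
```

If the byte offset and length of each line really are known in advance, seekg followed by read works too, as long as the values passed are byte offsets and not line numbers.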

Merging two text files gives weird results

I need to merge two text files by putting them in a vector and then writing them to a new text file.
After merging them, the new file has extra characters.
For example:
f1.txt ("text1")
f2.txt ("text2.")
f12.txt ("text1˙text2.˙W64")
Content of the buffer: "text1 text2. W64"
Here is the code:
int main(){
enum errorcode{FNF,FNC};
vector<char> buffer;
char ime[255];
cin>>ime;//first file
ifstream ud1(ime,ios::in);
if(ud1.is_open()){
while(!ud1.eof())buffer.push_back(ud1.get());
ud1.close();
}
else {cout<<"File not found.";return FNF;}
cin>>ime;//second file
ifstream ud2(ime,ios::in);
if(ud2.is_open()){
while(!ud2.eof())buffer.push_back(ud2.get());
ud2.close();
}
else {cout<<"File not found.";return FNF;}
cin>>ime;//new file
ofstream id(ime,ios::out);
if(id.is_open()){
for(int i=0;i<buffer.capacity();i++)id.put(buffer[i]);
id.close();
}
else {cout<<"File not created.";return FNC;}
return 0;
}
I guess this is because of Notepad or the files themselves.
Can you please tell me the reason for this?
You are using vector::capacity, which returns the size of the storage space currently allocated for the vector, expressed in terms of elements.
You must use vector::size, which returns the number of elements in the vector. This is the number of actual objects held in the vector, which is not necessarily equal to its storage capacity.
About the ˙:
Please look at the istream::get return value:
Return Value
The first signature returns the character read, or the end-of-file value (EOF) if no characters are available in the stream (note that in this case, the failbit flag is also set).
So, you could change the loop to this:
while(!ud1.eof()){
int tmpChar = ud1.get();
if( !ud1.eof() )
buffer.push_back(tmpChar);
}
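Both fixes can be combined in a short sketch like the one below. It reads through std::istreambuf_iterator, which stops at end-of-file without ever storing the EOF value, and writes exactly buffer.size() characters. The function names are made up for the example.

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Appends the whole content of the file at `path` to `buffer`.
bool append_file(const std::string& path, std::vector<char>& buffer)
{
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    buffer.insert(buffer.end(),
                  std::istreambuf_iterator<char>(in),
                  std::istreambuf_iterator<char>());
    return true;
}

// Hypothetical helper: writes `first` followed by `second` into `merged`.
bool merge_files(const std::string& first, const std::string& second,
                 const std::string& merged)
{
    std::vector<char> buffer;
    if (!append_file(first, buffer) || !append_file(second, buffer)) return false;

    std::ofstream out(merged, std::ios::binary);
    if (!out) return false;
    // size(), not capacity(): only the elements actually stored are written.
    out.write(buffer.data(), static_cast<std::streamsize>(buffer.size()));
    return static_cast<bool>(out);
}
```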

Need to write specific lines of a text into a new text

I have numerical text data files ranging between 1 MB and 150 MB in size. I need to write out the lines of numbers related to heights; for example, with heights=4 the new text must include lines 1, 5, 9, 13, 17, 21, ... in sequence.
I have been trying to find a way to do this for a while now. I tried using a list instead of a vector, which ended up with compilation errors.
I have cleaned up the code as advised. It now writes all the lines to the sample2 text file; all done here. Thank you all.
I am open to a change of method as long as it delivers what I need. Thank you for your time and help.
The following is what I have so far:
#include <iostream>
#include <fstream>
#include <string>
#include <list>
#include <vector>
using namespace std;
int h,n,m;
int c=1;
int main () {
cout<< "Enter Number Of Heights: ";
cin>>h;
ifstream myfile_in ("C:\\sample.txt");
ofstream myfile_out ("C:\\sample2.txt");
string line;
std::string str;
vector <string> v;
if (myfile_in.is_open()) {
myfile_in >> noskipws;
int i=0;
int j=0;
while (std::getline(myfile_in, line)) {
v.push_back( line );
++n;
if (n-1==i) {
myfile_out<<v[i]<<endl;
i=i+h;
++j;
}
}
cout<<"Number of lines in text file: "<<n<<endl;
}
else cout << "Unable to open file(s) ";
cout<< "Reaching here, Writing one line"<<endl;
system("PAUSE");
return 0;
}
You need to use seekg to set the position back to the beginning of the file once you have read it. (You read it once to count the lines, which I don't think you actually need, as this count is never used, at least in this piece of code.)
And what is the point of the inner while? On each loop, you have
int i=1;
myfile_out<<v[i]; //Not writing to text
i=i+h;
So on each loop, i gets 1, so you output the element with index 1 all the time. Which is not the first element, as indices start from 0. So, once you put seekg or remove the first while, your program will start to crash.
So, make i start from 0. And get it out of the two while loops, right at the beginning of the if-statement.
Ah, the second while is also unnecessary. Leave just the first one.
EDIT:
Add
myfile_in.clear();
before seekg to clear the flags.
Also, your algorithm is wrong. You'll get seg fault, if h > 1, because you'll get out of range (of the vector). I'd advise to do it like this: read the file in the while, that counts the lines. And store each line in the vector. This way you'll be able to remove the second reading, seekg, clear, etc. Also, as you already store the content of the file into a vector, you'll NOT lose anything. Then just use for loop with step h.
Again edit, regarding your edit: no, it has nothing to do with any flags. The if, where you compare i==j is outside the while. Add it inside. Also, increment j outside the if. Or just remove j and use n-1 instead. Like
if ( n-1 == i )
Several things.
First you read the file completely, just to count the number of lines,
then you read it a second time to process it, building up an in memory
image in v. Why not just read it in the first time, and do everything
else on the in memory image? (v.size() will then give you the number
of lines, so you don't have to count them.)
And you never actually use the count anyway.
Second, once you've reached the end of file the first time, the
failbit is set; all further operations are no-ops, until it is reset.
If you have to read the file twice (say because you do away with v
completely), then you have to do myfile_in.clear() after the first
loop, but before seeking to the beginning.
You only test for is_open after having read the file once. This test
should be immediately after the open.
You also set noskipws, although you don't do any formatted input
which would be affected by it.
The final while is highly suspect. Because you haven't done the
clear, you probably never enter the loop, but if you did, you'd very
quickly start accessing out of bounds: after reading n lines, the size
of v will be n, but you read it with index i, which will be n * h.
Finally, you should explicitly close the output file and check for
errors after the close, just in case.
It's not clear to me what you're trying to do. If all you want to do is
insert h empty lines between each existing line, something like:
std::string separ( h + 1, '\n' );
std::string line;
while ( std::getline( myfile_in, line ) ) {
myfile_out << line << separ;
}
should do the trick. No need to store the complete input in memory.
(For that matter, you don't even have to write a program for this.
Something as simple as sed 's:$:\n\n\n\n:' < infile > outfile would do
the trick.)
EDIT:
Reading other responses, I gather that I may have misunderstood the
problem, and that he only wants to output every h-th line. If this is
the case:
std::string line;
while ( std::getline( myfile_in, line ) ) {
    myfile_out << line << '\n';
    // skip the next h - 1 lines
    for ( int count = h - 1; count > 0; -- count ) {
        std::getline( myfile_in, line );
        // or myfile_in.ignore( INT_MAX, '\n' );
    }
}
But again, other tools seem more appropriate. (I'd follow thiton's
suggestion and use AWK.) Why write a program in a language you don't
know well when tools are already available to do the job?
If there is no absolutely compelling reason to do this in C++, you are using the wrong programming language for this. In awk, your whole program is:
{ if ( FNR % 4 == 1 ) print; }
Or, giving the whole command line e.g. in sh to filter lines 1,5,9,13,...:
awk '{ if ( FNR % 4 == 1 ) print; }' a.txt > b.txt