Reading a stream from the right in C++

I'm a bit of a noob to C++. I understand that when you read from a stream, you read from the left. E.g. if you had a file with the line:
I'm playing around with streams
When you read the file, the first string you'll retrieve from the file is I'm.
Is it possible to make the first string you retrieve be streams?
Reading a stream from the right, basically.
Note: this assumes that you don't read an entire line at a time.

Streams are not read from left to right, they're read from first to last. They are supposed to model things where this is the logical way to read them and in some cases (e.g. keyboard input) the only sensible way to read them. For a stream that is entirely known at the start (e.g. a file) you could, if you really really wanted, painstakingly seek to the last element and then read them in one-by-one seeking as you go. This would be slow and ugly.
Instead, I recommend you read from first to last in the usual fashion and then manipulate the data once you've got it.

What you want to do is read the words from right to left, not the whole stream. Reading the stream from right to left would give you smaerts, not streams, and I am sure you cannot do that out of the box. What I suggest is that you read the words into a vector and then reverse it. Alternatively, reverse the whole string before reading from it and then reverse each word after reading it.

No, there's no support for this in the streams library. A file is generally arranged with the document data going from left to right, top to bottom. Given variable length lines, you can't generally know where the line terminates unless you scan over all the data in the line.
For this requirement, you're best off reading an entire line into a string with getline, then you have many options such as:
writing your own string scanner to find each word in turn (simple enough, fast)
creating an istringstream from the reversed line text, then looping to stream each word in turn - reversing it back before processing (convenient for stream features - whitespace skipping, conversions, status etc.)
tokenising the line into an array or similar, and iterating that in reverse order (simple conceptually, but memory hungry)
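As a rough sketch of the second option (reverse the line, stream the words, reverse each word back), assuming the line has already been read with getline. The name words_right_to_left is hypothetical:

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: yields the words of `line` from right to left.
// `line` is taken by value because it is reversed in place.
std::vector<std::string> words_right_to_left(std::string line) {
    std::reverse(line.begin(), line.end());      // "a bc" becomes "cb a"
    std::istringstream in(line);
    std::vector<std::string> result;
    std::string word;
    while (in >> word) {                         // whitespace skipping still works
        std::reverse(word.begin(), word.end());  // undo the reversal per word
        result.push_back(word);
    }
    return result;
}
```

The result is collected into a vector here only to keep the sketch short; in real code you could process each word inside the loop instead.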

Related

How do i write to a specific line of a text file?

myfile<<hashdugumu[key].numara;
I have this piece of code. For example, I would like to write to the eighth line. How do I do that in C++? Thanks in advance.
If the line you want to write is exactly the same length as the existing one (in bytes, not in characters; remember that some encodings, e.g. UTF-8, are variable length) then it's very easy: just skip over the first seven lines and then write the line.
There is a caveat with this, though: a stream exposes separate read and write positions (tellg()/seekg() and tellp()/seekp()). For a file stream both refer to the same underlying file position, but you still have to reposition the stream when you switch from reading to writing. To do that, get the read position with tellg() and set the write position to the same value with seekp().
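A sketch of the same-length case, under the assumption that lines end in a plain '\n'. The function name overwrite_line_same_length is invented for this example, and the file is opened in binary mode so that positions count bytes exactly:

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: overwrites 1-based line `target` in place.
// Refuses to write unless `text` has exactly the same byte length.
bool overwrite_line_same_length(const std::string& path, std::size_t target,
                                const std::string& text) {
    if (target == 0) return false;
    std::fstream f(path, std::ios::in | std::ios::out | std::ios::binary);
    if (!f) return false;
    std::string line;
    for (std::size_t i = 1; i < target; ++i) {   // skip the lines before it
        if (!std::getline(f, line)) return false;
    }
    const std::streampos start = f.tellg();      // where the target line begins
    if (!std::getline(f, line) || line.size() != text.size()) return false;
    f.clear();                                   // drop eofbit if this was the last line
    f.seekp(start);                              // reposition before writing
    f.write(text.data(), static_cast<std::streamsize>(text.size()));
    f.flush();
    return static_cast<bool>(f);
}
```

With "\r\n" line endings the '\r' counts as part of the line in binary mode, so the length check would have to account for it.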
As an alternative, or if the data you want to write is not the same size as the existing data, then you have to use a temporary "buffer", be it another file or an actual in-memory buffer.
If the file is not big you can use an in-memory buffer, for example using a std::vector for the lines. Read each line into the vector, and then modify the lines (elements in the vector) that you want to modify. Finally reopen the file for writing, truncating it, and then just write each "line" to the file.
There is a slight problem with the above, though, when it comes to rewriting the data: if the file is truncated and then there's an error when you write to the file, you can lose data. This can be solved by using a temporary file.
With a temporary file it's easier not to bother with the in-memory buffer, and instead read from the original file and write directly to the temporary file. Knowing when you should write something else is done by keeping track of the current line number, which is easy if you read one line at a time. In your example you read the first seven lines from the original file and write them to the temporary file; after the seventh line you write your special eighth line while skipping the original eighth line from the original file, and then just continue reading/writing the remaining lines. When done, close the files and then rename the temporary file as the original file.
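Here is one way the temporary-file approach could look. It is only a sketch: replace_line and the ".tmp" suffix are names I made up, and error handling is kept to a minimum:

```cpp
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical helper: replaces 1-based line `target` of `path` with `text`
// by copying the file line by line into a temporary file.
bool replace_line(const std::string& path, std::size_t target,
                  const std::string& text) {
    const std::string tmp_path = path + ".tmp";  // assumed temporary name
    bool ok = false;
    {
        std::ifstream in(path);
        if (!in) return false;
        std::ofstream out(tmp_path);
        if (!out) return false;
        std::string line;
        std::size_t n = 0;                       // current line number
        while (std::getline(in, line)) {
            ++n;
            out << (n == target ? text : line) << '\n';
        }
        out.flush();
        ok = static_cast<bool>(out) && target != 0 && n >= target;
    }   // both files are closed here
    if (!ok) {
        std::remove(tmp_path.c_str());           // leave the original untouched
        return false;
    }
    // Some platforms refuse to rename onto an existing file, so the original
    // is removed first. This leaves a short window with no file at `path`.
    std::remove(path.c_str());
    return std::rename(tmp_path.c_str(), path.c_str()) == 0;
}
```

For the question's case the call would be something like replace_line("data.txt", 8, new_text), where the file name is just a placeholder.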

Go back one line on a text file C++

my program reads a line from a text file using :
std::ifstream myReadFile("route.txt");
getline(myReadFile, line)
If it finds a tag that I'm looking for, it stores that line in a temporary string. I want to continue this until I find some other tag; when I find another tag, I want to be able to return to the previous line so that the program can read it again as that other tag and do something else.
I have been looking at putback() and unget(), but I'm confused about how to use them and whether they are the correct answer.
Best would be to consider a one pass algorithm, that stores in memory what it could need at the first tag without going back.
If this is not possible, you can "bookmark" the stream position with tellg() and return to it later with seekg():
streampos oldpos = myReadFile.tellg(); // stores the position
....
myReadFile.seekg (oldpos); // get back to the position
If you read recursively embedded tags (html for example), you could even use a stack<streampos> to push and pop the positions while reading. However, be aware that performance is slowed down by such forward/backward accesses.
You mention putback() and unget(), but these are limited to one char, and seem not suited to your getline() approach.
The easiest thing by far, if you only ever want to roll back by one line, is always to keep track of the line you're on and the line before.
Maintain a cur variable that stores the current line, and prev that stores the previous one. When you move to the next line, you copy cur into prev, and read the new line into cur.
That way, you always have the previous line available.

Access last 6 lines in a text file c++

I want to access the last 6 lines in a text file using c++. Can anyone provide me with a code that reaches there in a constant time? Thanks in advance. :)
fstream myfile("test.txt");
myfile.seekg(-6,ios_base::end);
string line;
while(getline(myfile,line))
{
if(vect.size() != VSIZE)
{
vect.push_back(line);
}
else
{
vect.erase(vect.begin());
vect.push_back(line);
}
}
It seems not to be working... and VSIZE is 6... please provide me with help and working code.
This line:
myfile.seekg(-6,ios_base::end);
seeks to the 6th byte before the end of the file, not 6 lines. You need to count the newlines backwards or start from the beginning. So your code should work if you remove the line above.
This is quite a hard thing to do, and there are several edge cases to consider.
Broadly the strategy is:
Open the file in binary mode so you see every byte.
Seek to (end - N), where N is the size of an arbitrary buffer. About 1K should do it.
Read N bytes into a buffer. Scan backwards looking for LF characters ('\n'). Skip the one at the end, if there is one.
Each line starts just after an LF, so count the lines backwards until you get to 6.
If you don't find 6 then seek backwards another N bytes, read another buffer and continue the scan.
If you reach the beginning of the file, stop.
I leave the code as an exercise.
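For what it's worth, here is one possible sketch of that strategy. It assumes LF line endings, and the name last_lines_of_file and the chunk parameter are my own choices:

```cpp
#include <algorithm>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: returns the last `count` lines of the file as one
// string by scanning backwards in chunks of `chunk` bytes.
std::string last_lines_of_file(const std::string& path, std::size_t count,
                               std::size_t chunk = 1024) {
    if (count == 0 || chunk == 0) return "";
    std::ifstream in(path, std::ios::binary);    // binary: see every byte
    if (!in) return "";
    in.seekg(0, std::ios::end);
    const std::streamoff size = in.tellg();
    std::streamoff pos = size;    // start of the region scanned so far
    std::streamoff start = 0;     // where the tail begins (0 = whole file)
    std::size_t newlines = 0;
    bool found = false;
    std::vector<char> buf(chunk);
    while (pos > 0 && !found) {
        const std::streamoff n =
            std::min<std::streamoff>(pos, static_cast<std::streamoff>(chunk));
        pos -= n;
        in.seekg(pos);
        in.read(buf.data(), n);
        for (std::streamoff i = n - 1; i >= 0; --i) {
            if (buf[static_cast<std::size_t>(i)] != '\n') continue;
            if (pos + i == size - 1) continue;   // LF at the very end: skip it
            if (++newlines == count) {           // a line starts just after an LF
                start = pos + i + 1;
                found = true;
                break;
            }
        }
    }
    in.clear();
    in.seekg(start);
    std::string result(static_cast<std::size_t>(size - start), '\0');
    in.read(&result[0], size - start);
    return result;
}
```

If the file has fewer than `count` lines, the scan reaches the beginning and the whole file is returned.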
Another answer here explains why what you do won't work. Below I explain what will work.
1. Open the file in binary mode.
2. Read forward from the start, storing the positions of '\n' in a circular buffer of length 6. (boost::circular_buffer can help)
3. Dump the contents of the file starting from the smallest position in the ring buffer.
Step 2 can be improved by seeking to end-X where X is derived by some sort of bisection around the end of file.
Probably the easiest approach is to just mmap() the file. This puts its contents into your virtual address space, so you can easily scan it from the end for the first six line endings.
Since mmapping a file gives you the illusion of the entire file being in memory in a single large buffer without actually loading the parts that you don't need, it both avoids unnecessary I/O and relieves you of managing a growing buffer as you search backwards for the line endings.

Retrieving file from .dat via getline() w/ c++

I posted this over at Code Review Beta but noticed that there is much less activity there.
I have the following code and it works just fine. Its function is to grab the input from a file and display it (to confirm that it's been grabbed). My task is to write a program that counts how many times a certain word (string) "abc" is found in the input file.
Is it better to store the input as a string or in arrays/vectors and have each line be stored separately (a[1], a[2], etc.)? Perhaps someone could also point me to a resource that I can use to learn how to filter through the input data.
Thanks.
input_file.open ("in.dat");
while(getline(input_file,STRING)) // Reads one line into STRING per pass; stops at end of file (eof) or on error.
{
cout<<STRING; // Prints our STRING.
}
input_file.close();
Reading as much of the file as possible into memory at once is generally more efficient than reading one letter or one text line at a time. Disk drives take a lot of time to spin up and relocate to a sector, so your program will run faster if you can minimize the number of reads from the file.
Memory is fast to search.
My recommendation is to read the entire file, or as much as you can, into memory, then search the memory for a "word". Remember that in English, words can contain hyphens ('-') and apostrophes (as in "don't"). Word recognition may become more difficult if a word is split across lines or if you include abbreviations (with periods).
Good luck.
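As a rough sketch of "read it all, then search", assuming a plain substring match is good enough (so "abc" inside "xabcx" counts, and matches do not overlap). The name count_occurrences is made up:

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: loads everything from `in` into memory and counts
// non-overlapping occurrences of `needle` as a plain substring.
std::size_t count_occurrences(std::istream& in, const std::string& needle) {
    if (needle.empty()) return 0;
    std::ostringstream buffer;
    buffer << in.rdbuf();                 // read the entire stream at once
    const std::string text = buffer.str();
    std::size_t count = 0;
    std::size_t pos = text.find(needle);
    while (pos != std::string::npos) {
        ++count;
        pos = text.find(needle, pos + needle.size());
    }
    return count;
}
```

Matching whole words only would need extra checks on the characters around each hit, which is where the hyphen and apostrophe questions come in.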

getline() text with UNIX formatting characters

I am writing a C++ program which reads lines of text from a .txt file. Unfortunately the text file is generated by a twenty-something year old UNIX program and it contains a lot of bizarre formatting characters.
The first few lines of the file are plain, English text and these are read with no problems. However, whenever a line contains one or more of these strange characters mixed in with the text, that entire line is read as characters and the data is lost.
The really confusing part is that if I manually delete the first couple of lines so that the very first character in the file is one of these unusual characters, then everything in the file is read perfectly. The unusual characters obviously just display as little ascii squiggles -arrows, smiley faces etc, which is fine. It seems as though a decision is being made automatically, without my knowledge or consent, based on the first line read.
Based on some googling, I suspected that the issue might be with the locale, but according to the visual studio debugger, the locale property of the ifstream object is "C" in both scenarios.
The code which reads the data is as follows:
//Function to open file at location specified by inFilePath, load and process data
int OpenFile(const char* inFilePath)
{
string line;
ifstream codeFile;
//open text file
codeFile.open(inFilePath,ios::in);
//read file line by line
while ( codeFile.good() )
{
getline(codeFile,line);
//check non-zero length
if (line != "")
ProcessLine(&line[0]);
}
//close line
codeFile.close();
return 1;
}
If anyone has any suggestions as to what might be going on or how to fix it, they would be very welcome.
From reading about your issues it sounds like you are reading in binary data, which will cause getline() to throw out content or simply skip over the line.
You have a couple of choices:
If you simply need lines from the data file you can first sanitise them by removing all non-printable characters (that is the "official" name for those weird ascii characters). On UNIX a tool such as strings would help you with that process.
You can of course also do this programmatically in your code by simply reading in X amount of data, storing it in a string, and then removing those characters that fall outside of the standard ASCII character range. This will most likely cause you to lose any Unicode that may be stored in the file.
You change your program to understand the format and basically write a parser that allows you to parse the document in a more sane way.
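A small sketch of the programmatic clean-up, assuming the data is already in a string and that newlines and tabs should be kept. The name strip_non_printable is hypothetical:

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: returns a copy of `raw` without non-printable bytes.
// Newlines and tabs are kept so the line structure survives.
std::string strip_non_printable(const std::string& raw) {
    std::string clean;
    clean.reserve(raw.size());
    for (unsigned char c : raw) {   // unsigned char: safe to pass to isprint
        if (std::isprint(c) || c == '\n' || c == '\t') {
            clean.push_back(static_cast<char>(c));
        }
    }
    return clean;
}
```

In the default "C" locale this drops every byte above 0x7E as well, which is the loss of non-ASCII text mentioned above.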
If you can, I would suggest trying solution number 1, simply to see if the results are sane and can still be used. You mention that this is medical data; do you perchance know what file format this is? If you are trying to find out and have access to a unix/linux machine, you can use the utility file and maybe it can give you a clue (worst case it will tell you it is simply data).
If possible, try getting a "clean" file that you can post the hex dump of, so that we can try to provide better help than what we are currently providing. By clean I mean that there is no personally identifying information in the file.
For number 2, open the file in binary mode. You mentioned using Windows, where std::fstream objects handle binary and non-binary (text) files differently; on UNIX systems this is not the case (on most systems, I'm sure I'll get a comment regarding the one system that doesn't match this description).
codeFile.open(inFilePath,ios::in);
would become
codeFile.open(inFilePath, ios::in | ios::binary);
Instead of getline() you will want to become intimately familiar with .read() which will allow unformatted operations on the ifstream.
Reading will be like this:
// This code has not been tested!
char input[1024];
codeFile.read(input, 1024);
std::streamsize actual_read = codeFile.gcount(); // gcount() returns std::streamsize
// Here you can process input, up to a maximum of actual_read characters.
//ProcessLine() // We didn't necessarily read a line!
ProcessData(input, actual_read);
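For a fuller picture, here is a sketch of a loop that reads a whole file this way. The name read_all_binary is an assumption of mine; the point is only that read() plus gcount() also handles the final, partial chunk:

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: returns every byte of the file, unmodified.
std::string read_all_binary(const std::string& path) {
    std::ifstream in(path, std::ios::in | std::ios::binary);
    std::string data;
    char buf[1024];
    // read() fails on the last, short chunk, but gcount() still reports how
    // many bytes arrived. On the pass after that, gcount() is 0 and we stop.
    while (in.read(buf, sizeof buf) || in.gcount() > 0) {
        data.append(buf, static_cast<std::size_t>(in.gcount()));
    }
    return data;
}
```

As far as I know, the Microsoft runtime in text mode translates "\r\n" and treats a Ctrl-Z byte (0x1A) as end of file, which may be related to what you are seeing; binary mode avoids both.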
The other thing, as mentioned, is that you can change the locale for the current stream and change the separator it considers a new line; maybe this will fix your issue without requiring the unformatted operations:
imbue the stream with a new locale that only knows about the newline. This method may or may not let your getline() work without issues.