I am trying to read only 10 or 100 lines from my file. Is there a way to read just a certain number of lines like this?
To read a single line from a file, use:
std::string text_from_file;
std::getline(text_file_stream, text_from_file);
In C++, to perform an action many times, we use a loop. So to read 10 lines from a file, we would use a for loop:
for (unsigned int i = 0U; i < 10U; ++i)
{
    if (!std::getline(text_file_stream, text_from_file))
    {
        break; // the file has fewer than 10 lines
    }
}
Another method:
unsigned int lines_read = 0U;
while ((lines_read < 10) && (std::getline(text_file_stream, text_from_file)))
{
    ++lines_read;
}
To read 100 lines, you would change the constant from 10 to 100.
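As a rough sketch, the loops above can be wrapped in a small helper that takes the line count as a parameter. The name read_first_lines and the choice to return a std::vector are assumptions made for illustration; they are not part of the original answer.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <string>
#include <vector>

// Reads at most max_lines lines from the stream.
// Stops early, without error, if the input has fewer lines.
std::vector<std::string> read_first_lines(std::istream& in, std::size_t max_lines)
{
    std::vector<std::string> lines;
    std::string line;
    while (lines.size() < max_lines && std::getline(in, line))
    {
        lines.push_back(line);
    }
    return lines;
}
```

Calling read_first_lines(text_file_stream, 10) or read_first_lines(text_file_stream, 100) then covers both cases from the question.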
Skipping Lines
The fundamental issue with skipping lines or seeking to a given line is that a text file has variable-length records. You have to read each line to figure out where the next one starts.
So the technique for skipping lines is to read a line into a text variable and ignore it, much like the examples above.
There are methods to speed this up, but they involve reading large blocks of data into memory or treating the file as memory (a.k.a. memory mapping). One issue with this technique is handling the case where the text line you want crosses the end of the buffer (it is not fully in the buffer). These techniques can be found in other posts on StackOverflow or on the Internet.
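The basic read-and-ignore technique from this section might look like the following sketch. The function name read_line_number and the 1-based line numbering are assumptions chosen for the example.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <string>

// Reads line number `line_number` (1-based) into `out`.
// Earlier lines are read into a scratch string and thrown away.
// Returns false if the stream ends before that line is reached.
bool read_line_number(std::istream& in, std::size_t line_number, std::string& out)
{
    if (line_number == 0)
    {
        return false; // lines are numbered from 1 in this sketch
    }
    std::string discarded;
    for (std::size_t i = 1; i < line_number; ++i)
    {
        if (!std::getline(in, discarded))
        {
            return false;
        }
    }
    return static_cast<bool>(std::getline(in, out));
}
```

The cost grows with the line number, because every preceding character still has to be read.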
Reading until a delimiter
A delimiter is something that indicates the end of text. The standard delimiter for text files is a newline. You can read text until a comma, period, tab, or other delimiter by using the third parameter of std::getline.
const char delimiter = '.';
std::string text_from_file;
std::getline(text_data_stream, text_from_file, delimiter);
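Calling the three-argument std::getline in a loop splits the input into fields. The sketch below assumes a helper named split_on, which is made up for this example. Note that once a custom delimiter is given, a newline is treated like any other character.

```cpp
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <string>
#include <vector>

// Splits everything remaining in the stream on `delimiter`.
// The delimiter itself is consumed and not stored in the fields.
std::vector<std::string> split_on(std::istream& in, char delimiter)
{
    std::vector<std::string> fields;
    std::string field;
    while (std::getline(in, field, delimiter))
    {
        fields.push_back(field);
    }
    return fields;
}
```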
All this is available in good text books or a good online reference.
Related
I'm currently having trouble with my program. I want to modify only 6 lines in a text file, lines 76 to 81, and I don't know how to do it.
I want to add something at the very end of these lines (or replace them if that's easier) and not modify any of the other lines (maybe also check that the modification hasn't already been made, but that's a bonus).
I got lost looking for an answer on Google. Can you help me?
If your replacement text is the same, exact length as the original text, you can open the file as read/write and overwrite the text.
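A minimal sketch of the same-length overwrite, assuming the byte offset of the text is already known. The helper name overwrite_at is hypothetical, and it is the caller's job to make sure the replacement has exactly the same length as the text it replaces.

```cpp
#include <ios>
#include <istream>   // std::iostream
#include <sstream>   // std::stringstream, handy for trying this without a file
#include <string>

// Overwrites bytes in place, starting at `offset` from the beginning.
// Nothing is inserted or removed, so the rest of the file does not move.
bool overwrite_at(std::iostream& file, std::streamoff offset, const std::string& replacement)
{
    file.seekp(offset, std::ios::beg);
    file.write(replacement.data(), static_cast<std::streamsize>(replacement.size()));
    return static_cast<bool>(file);
}
```

For a real file, the stream would be a std::fstream opened with std::ios::in | std::ios::out | std::ios::binary so that the existing contents are kept.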
The traditional method (since reel-to-reel tapes), has been to copy the text that is not modified to a new file, then the modified text, then the remainder of the original text.
I recommend using std::getline and std::string.
If you really need performance, you may want to consider double buffering.
Edit 1: example
// Copy lines 1 through 75 unchanged; line 76 is the first line to modify.
for (unsigned int i = 0; i < 75; ++i)
{
    std::string text;
    std::getline(original_file, text);
    new_file << text << "\n";
}
// Write new text to new file
// Read old text and ignore it.
// Copy remaining text to new file.
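The copy, modify, copy outline above could be filled in roughly as follows. This is a sketch under assumptions: the function name append_to_lines, the 1-based inclusive line range, and the "skip lines that already end with the suffix" check (the bonus from the question) are all choices made for this example.

```cpp
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>   // string streams, handy for trying this without files
#include <string>

// Copies `original` to `updated` line by line. Lines numbered
// first_line..last_line (1-based, inclusive) get `suffix` appended,
// unless they already end with it.
void append_to_lines(std::istream& original, std::ostream& updated,
                     std::size_t first_line, std::size_t last_line,
                     const std::string& suffix)
{
    std::string text;
    for (std::size_t line_number = 1; std::getline(original, text); ++line_number)
    {
        const bool in_range = line_number >= first_line && line_number <= last_line;
        const bool already_done =
            text.size() >= suffix.size() &&
            text.compare(text.size() - suffix.size(), suffix.size(), suffix) == 0;
        if (in_range && !already_done)
        {
            text += suffix;
        }
        updated << text << '\n'; // every output line ends with a newline
    }
}
```

For the question above the call would be append_to_lines(original_file, new_file, 76, 81, suffix). Once the new file is written, it can replace the original.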
Background
Although files can be treated as random access (meaning you can seek to a random position), the text is not of fixed length. In general, text files can be considered as containing variable length records of text. The only way to count a line is to read until a newline is found. So in order to seek to line 76, one has to count line endings until 76 is found.
I'm using ifstream to parse a file in C++ code. I'm not able to use seekg() and tellg() to jump to a particular line of the file.
In particular, I would like to read a line with getline, starting from a particular position in the file. The position was saved in the previous iteration.
You just have to skip the required number of lines.
The best way to do it is to discard the lines with std::istream::ignore:
for (int currLineNumber = 0; currLineNumber < startLineNumber; ++currLineNumber){
    if (addressesFile.ignore(numeric_limits<streamsize>::max(), addressesFile.widen('\n'))){
        //just skipping the line
    } else {
        // todo: handle the error
    }
}
The first argument is the maximum number of characters to extract. If it is exactly numeric_limits<streamsize>::max(), there is no limit.
You should use it instead of std::getline for better performance, since the skipped text is never stored.
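The ignore-based skip can also be packaged as a reusable function. In this sketch the name skip_lines is an assumption, and so is the extra gcount() check: ignore sets only eofbit when it runs into the end of the input, so checking the stream state alone would report the problem one call late.

```cpp
#include <cstddef>
#include <ios>
#include <istream>
#include <limits>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <string>

// Discards `count` lines without storing their text.
// Returns false if the input ends before `count` lines were skipped.
bool skip_lines(std::istream& in, std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i)
    {
        in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
        // gcount() == 0 means not even a newline was consumed: end of input.
        if (!in || in.gcount() == 0)
        {
            return false;
        }
    }
    return true;
}
```

After skip_lines(addressesFile, startLineNumber) succeeds, the next std::getline call reads the wanted line.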
It seems there is no dedicated C++ function, such as a "seekline", for your needs, and I see two ways to solve this task:
1. First, pad every line in the text file with spaces to a constant length L (counting the line terminator). Then, to seek to line N (numbered from 0), just use seekg with an offset of L * N.
2. This method is more complicated. Create an auxiliary binary file in which each byte holds the length of one line of the source file (line terminator included). This auxiliary file is a kind of database. Load it into an array during your program's initialization phase. The offset of a required line in the text file is then the sum of the first N array elements. Of course, the auxiliary file must be updated whenever the source file changes.
The first option is more efficient if the text file can tolerate the extra size. The second gives the best performance for long text files that are rarely edited.
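The second option can also be tried entirely in memory. The sketch below is a variation, not the answer's exact scheme: instead of storing line lengths in an auxiliary file, it scans the stream once and stores each line's starting position as reported by tellg. The names build_line_index and read_indexed_line are hypothetical.

```cpp
#include <cstddef>
#include <ios>
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <string>
#include <vector>

// Scans the whole stream once and records where each line starts.
std::vector<std::streampos> build_line_index(std::istream& in)
{
    std::vector<std::streampos> offsets;
    std::string line;
    in.clear();
    in.seekg(0, std::ios::beg);
    while (true)
    {
        const std::streampos start = in.tellg();
        if (!std::getline(in, line))
        {
            break; // no more lines
        }
        offsets.push_back(start);
    }
    in.clear(); // leave the stream usable for later seeks
    return offsets;
}

// Jumps straight to line `index` (0-based) using the recorded offsets.
bool read_indexed_line(std::istream& in, const std::vector<std::streampos>& offsets,
                       std::size_t index, std::string& out)
{
    if (index >= offsets.size())
    {
        return false;
    }
    in.clear();
    in.seekg(offsets[index]);
    return static_cast<bool>(std::getline(in, out));
}
```

As with the auxiliary file, the index becomes stale as soon as the text file is edited and has to be rebuilt.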
I am writing a program to read a text file line by line, store the line values in a vector, do some processing then write back to a new text file. This is what the text file typically looks like:
As you can see, there are two columns: one for the frame number and another for the time. What I want is only the second column (the time). There can be hundreds, if not thousands, of lines in the text file. Previously I have been manually deleting the frame number column, which I'd rather not do. So my question is: is there an easy way to edit my current code so that when I read the file with getline() it skips the first word and only gets the second? Here is the code that I use to read the text file. Thanks
ifstream sysfile(sys_time_dir);
//Store lines in a vector
vector<string> sys_times;
string textline;
while (getline(sysfile, textline))
{
    sys_times.push_back(textline);
}
Since you have two numbers in each line, you can read two numbers and ignore the first number.
vector<double> sys_times;
int first;
double second;
while ( sysfile >> first >> second )
{
    sys_times.push_back(second);
}
std::string ignore_me;
while (sysfile >> ignore_me, getline(sysfile, textline)) {
...
This utilizes the comma operator, reading in the first word (here defining "word" as a continuous sequence of non-space characters) of the line, but ignoring the result, then using getline to read the rest of the line.
Note that for the specific data format you describe, I would rather choose what RSahu showed in their answer. My answer is more general to the problem of "skipping the first word and reading the rest of the line".
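Both ideas can be combined by reading whole lines and then parsing each one with a string stream, so that one malformed line does not end the loop. This is a sketch that assumes the time column is a plain number, as the first answer does. The function name read_second_column is made up for the example.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Returns the second whitespace-separated field of every line as a double.
// Lines that do not contain a word followed by a number are skipped.
std::vector<double> read_second_column(std::istream& in)
{
    std::vector<double> values;
    std::string line;
    while (std::getline(in, line))
    {
        std::istringstream fields(line);
        std::string first_word; // read and then ignored
        double second = 0.0;
        if (fields >> first_word >> second)
        {
            values.push_back(second);
        }
    }
    return values;
}
```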
I want to access the last 6 lines in a text file using C++. Can anyone provide code that gets there in constant time? Thanks in advance. :)
fstream myfile("test.txt");
myfile.seekg(-6,ios_base::end);
string line;
while(getline(myfile,line))
{
    if(vect.size() != VSIZE)
    {
        vect.push_back(line);
    }
    else
    {
        vect.erase(vect.begin());
        vect.push_back(line);
    }
}
It seems not to be working... and VSIZE is 6... please provide me with help and working code.
This line:
myfile.seekg(-6,ios_base::end);
seeks to the 6th byte before the end of the file, not 6 lines before it. You need to count newlines backwards from the end or read from the beginning. Your code should therefore work if you remove the line above.
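Reading from the beginning while keeping only the most recent lines can be sketched with a std::deque, which avoids the linear cost of erasing from the front of a vector. The function name last_lines is an assumption for this example. Note that this approach reads the whole file, so it is not constant time.

```cpp
#include <cstddef>
#include <deque>
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <string>

// Reads the whole stream and keeps only the final `count` lines.
std::deque<std::string> last_lines(std::istream& in, std::size_t count)
{
    std::deque<std::string> tail;
    if (count == 0)
    {
        return tail;
    }
    std::string line;
    while (std::getline(in, line))
    {
        if (tail.size() == count)
        {
            tail.pop_front(); // drop the oldest line to make room
        }
        tail.push_back(line);
    }
    return tail;
}
```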
This is quite a hard thing to do, and there are several edge cases to consider.
Broadly the strategy is:
1. Open the file in binary mode so you see every byte.
2. Seek to (end - N), where N is the size of an arbitrary buffer. About 1K should do it.
3. Read N bytes into a buffer. Scan backwards looking for LF characters ('\n'). Skip the one at the end, if there is one.
4. Each line starts just after an LF, so count the lines backwards until you get to 6.
5. If you don't find 6, seek backwards another N bytes, read another buffer, and continue the scan.
6. If you reach the beginning of the file, stop.
I leave the code as an exercise.
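For readers who want a starting point, here is one possible sketch of the strategy above. It is not the answerer's code: the name tail_lines, the default chunk size, and the decision to return the lines as strings are assumptions. It expects a seekable stream, opened in binary mode if it is a file, and it does not strip '\r' from CRLF line endings.

```cpp
#include <algorithm>
#include <cstddef>
#include <ios>
#include <istream>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <string>
#include <vector>

// Returns the last `count` lines by scanning backwards in chunks.
std::vector<std::string> tail_lines(std::istream& in, std::size_t count,
                                    std::size_t chunk_size = 1024)
{
    std::vector<std::string> lines;
    if (count == 0 || chunk_size == 0) return lines;

    in.clear();
    in.seekg(0, std::ios::end);
    const std::streamoff size = in.tellg();
    if (size <= 0) return lines;

    std::streamoff start = 0;   // where the tail begins; stays 0 if there are fewer lines
    std::streamoff pos = size;  // everything from pos to the end has been scanned
    std::size_t found = 0;      // LF characters seen so far, excluding a final one
    bool done = false;
    std::vector<char> buffer(chunk_size);

    while (pos > 0 && !done)
    {
        const std::streamoff n =
            std::min<std::streamoff>(pos, static_cast<std::streamoff>(chunk_size));
        pos -= n;
        in.seekg(pos, std::ios::beg);
        if (!in.read(buffer.data(), static_cast<std::streamsize>(n))) return lines;

        for (std::streamoff i = n - 1; i >= 0; --i)
        {
            // Ignore non-LF bytes and the LF that terminates the final line.
            if (buffer[static_cast<std::size_t>(i)] != '\n' || pos + i == size - 1) continue;
            if (++found == count)
            {
                start = pos + i + 1; // the tail starts just after this LF
                done = true;
                break;
            }
        }
    }

    in.clear();
    in.seekg(start, std::ios::beg);
    std::string line;
    while (std::getline(in, line)) lines.push_back(line);
    return lines;
}
```

The work done depends on the length of the last few lines rather than on the size of the file, which is as close to the requested constant time as a variable-length format allows.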
This answer explains why what you do won't work. Below I explain what will work.
1. Open the file in binary mode.
2. Read forward from the start, storing the positions of '\n' in a circular buffer of length 6. (boost::circular_buffer can help.)
3. Dump the contents of the file starting from the smallest position in the ring buffer.
Step 2 can be improved by seeking to end-X, where X is derived by some sort of bisection around the end of the file.
Probably the easiest approach is to just mmap() the file. This puts its contents into your virtual address space, so you can easily scan it from the end for the first six line endings.
Since mmapping a file gives you the illusion of the entire file being in memory in a single large buffer without actually loading the parts that you don't need, it both avoids unnecessary I/O and relieves you of managing a growing buffer as you search backwards for the line endings.
I am running C++ code where I need to import data from txt file.
The text file contains 10,000 lines. Each line contains n columns of binary data.
The code has to loop 100,000 times, each time it has to randomly select a line out of the txt file and assign the binary values in the columns to some variables.
What is the most efficient way to write this code? should I load the file first into the memory or should I randomly open a random line number?
How can I implement this in C++?
To randomly access a line in a text file, all lines need to have the same byte length. If they don't, you need to loop until you reach the correct line. Since this will be pretty slow for so many accesses, it is better to just load the file into a std::vector of std::strings, each entry being one line (this is easily done with std::getline). Or, since you want to assign values from the different columns, you can use a std::vector with your own struct like
struct MyValues{
    double d;
    int i;
    // whatever you have / need
};
std::vector<MyValues> vec;
This might be better than parsing the line every time.
With the std::vector, you get your random access and only have to loop once through the whole file.
10K lines is a pretty small file.
If you have, say, 100 chars per line, it will use the HUGE amount of 1MB of your RAM.
Load it to a vector and access it the way you want.
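A minimal sketch of the load-then-sample approach, assuming the lines are kept as strings and parsed later. The names load_lines and pick_random_line are hypothetical, and std::mt19937 is just one reasonable choice of generator.

```cpp
#include <cstddef>
#include <istream>
#include <random>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <string>
#include <vector>

// Reads every line of the stream into memory, once.
std::vector<std::string> load_lines(std::istream& in)
{
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line))
    {
        lines.push_back(line);
    }
    return lines;
}

// Picks one stored line uniformly at random.
// Precondition: `lines` must not be empty.
const std::string& pick_random_line(const std::vector<std::string>& lines, std::mt19937& rng)
{
    std::uniform_int_distribution<std::size_t> pick(0, lines.size() - 1);
    return lines[pick(rng)];
}
```

The 100,000 iterations from the question then become a loop that calls pick_random_line and touches no file at all. Parsing each line into a struct once at load time, as suggested above, would also avoid re-parsing on every pick.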
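Maybe not THE most efficient, but you could try this:
#include <cstdlib>
#include <ctime>
#include <fstream>
#include <limits>
#include <string>
using namespace std;

int main() {
    //use ifstream to read
    ifstream in("yourfile.txt");
    //string to store the line
    string line;
    //seed the random number generator
    srand(static_cast<unsigned>(time(NULL)));
    for(int i = 0; i < 100000; i++) {
        //seekg takes a byte offset, not a line number (assumes the file has at least 10000 bytes)
        in.clear();
        in.seekg(rand() % 10000);
        //skip the rest of the partial line, then read the next full line
        in.ignore(numeric_limits<streamsize>::max(), '\n');
        getline(in, line);
        //do what you want with the line here...
    }
}
I'm too lazy right now, but you need to make sure that you check your ifstream for errors like end-of-file, index-out-of-bounds, etc.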
Since you're taking 100,000 samples from just 10,000 lines, the majority of lines will be sampled. Read the entire file into an array data structure, and then randomly sample the array. This avoids file seeking entirely.
The more common case is to sample only a small subset of the file's data. To do that, assuming the lines have different lengths, seek to random points in the file, skip to the next newline (for example with cin.ignore( numeric_limits< streamsize >::max(), '\n' )), and then parse the subsequent text.
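The seek-and-skip idea can be sketched as below. The function name sample_line is an assumption, and so is the wrap-around to the first line when the random offset lands in the final line. The sampling is not uniform over lines: a line is more likely to be chosen when the line before it is long.

```cpp
#include <ios>
#include <istream>
#include <limits>
#include <random>
#include <sstream>   // std::istringstream, handy for trying this without a file
#include <string>

// Seeks to a random byte, discards the rest of that line, and reads the next one.
// `file_size` is the stream length in bytes. Returns false if nothing can be read.
bool sample_line(std::istream& in, std::streamoff file_size, std::mt19937& rng,
                 std::string& out)
{
    if (file_size <= 0)
    {
        return false;
    }
    std::uniform_int_distribution<std::streamoff> pick(0, file_size - 1);
    in.clear();
    in.seekg(pick(rng), std::ios::beg);
    in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    if (!std::getline(in, out))
    {
        // The offset fell inside the last line: wrap around to the first line.
        in.clear();
        in.seekg(0, std::ios::beg);
        return static_cast<bool>(std::getline(in, out));
    }
    return true;
}
```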