Ignoring remaining newlines and white space when reading input file (C++)

I have a function that reads a text file as input and stores the data in a vector.
It works, as long as the text file doesn't contain any extra new lines or white space.
Here is the code I currently have:
std::ifstream dataStream;
dataStream.open(inputFileName, std::ios_base::in);
std::string pushThis;
while(dataStream >> pushThis){
dataVector.push_back(pushThis);
}
For example:
safe mace
bait mate
The above works as an input text file.
This does not work (the same words, but followed by extra new lines or white space):
safe mace
bait mate
Is there any way to stop the stream once you reach the final character in the file, while still maintaining separation via white space between words in order to add them to something like a vector, stack, whatever?
i.e. a vector would contain ['safe', 'mace', 'bait', 'mate']

Answer:
The problem came from having two streams, one using !dataStream.eof() and the other using dataStream >> pushThis.
Fixed so that both use dataStream >> pushThis.
For future reference for myself and others who may find this:
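Don't use eof() as the loop condition: it only becomes true after a read has already hit the end of the file, so trailing whitespace causes one extra, failed read. Test the extraction itself (dataStream >> pushThis) instead.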
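A minimal sketch of the difference, for illustration only. The function names readWords and readWordsEofLoop are made up for this example, and the streams are taken as std::istream& so that a std::ifstream or a std::istringstream can be passed in.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Loop on the extraction itself: the loop ends as soon as a read fails,
// so trailing newlines or spaces never produce an extra entry.
std::vector<std::string> readWords(std::istream& in) {
    std::vector<std::string> words;
    std::string word;
    while (in >> word) {
        words.push_back(word);
    }
    return words;
}

// For comparison only: looping on eof(). After the last word, trailing
// whitespace means eof() is still false, so one more read is attempted.
// That read fails, but its (stale) result is pushed anyway.
std::vector<std::string> readWordsEofLoop(std::istream& in) {
    std::vector<std::string> words;
    std::string word;
    while (!in.eof()) {
        in >> word;             // may fail; the result is not checked
        words.push_back(word);
    }
    return words;
}
```

With an input that ends in blank lines, readWords yields exactly the four words, while readWordsEofLoop yields one entry too many.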

Related

How should I go about reading in a text file with each line containing various data types in C++?

I have a textfile with lots of data. Each row has two integers which specify coordinates, followed by the name of the coordinate, and additional attributes.
I'm trying to read in all the attributes into a vector and then later access the different attributes and each row.
I'm using getline to read in each line, but how would I be able to access the various attributes instead of the whole line?
Here is my code:
while (getline(location_file, line)) {
vector<string> file;
file.push_back(line);
}
Would structs be a better option?
Instead of using getline to read the whole line, since the input seems to follow a fixed layout, you can use the >> operator on the std::istream to read formatted input, which skips whitespace. You basically have to read 2 integers and then use getline to read the rest of the line.
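A rough sketch of that idea, also answering the struct question. The struct Location, its field names, and the function readLocations are assumptions made for this example; the question does not say how the name and attributes are separated, so everything after the two integers is kept as one string.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type: two coordinates plus the remaining text.
struct Location {
    int x;
    int y;
    std::string rest;  // name and additional attributes, unparsed
};

std::vector<Location> readLocations(std::istream& in) {
    std::vector<Location> locations;
    Location loc;
    // Read the two integers with >>, then the remainder of the row with getline.
    while (in >> loc.x >> loc.y && std::getline(in, loc.rest)) {
        // getline keeps the separator that followed the second integer; drop it.
        const auto first = loc.rest.find_first_not_of(" \t");
        loc.rest = (first == std::string::npos) ? "" : loc.rest.substr(first);
        locations.push_back(loc);
    }
    return locations;
}
```

The rest field can later be split further with a std::istringstream once the attribute layout is known.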

C++ Easy way to ignore first word of input

I am writing a program to read a text file line by line, store the line values in a vector, do some processing then write back to a new text file. This is what the text file typically looks like:
As you can see, there are two columns: one for the frame number and another for the time. What I want is only the second column (i.e. the time). There can be hundreds, if not thousands, of lines in the text file. Previously I have been manually deleting the frame number column, which I'd rather not do. So my question is: is there an easy way to edit my current code so that when I read the file with getline() it skips the first word and only gets the second? Here is the code that I use to read the text file. Thanks
ifstream sysfile(sys_time_dir);
//Store lines in a vector
vector<string> sys_times;
string textline;
while (getline(sysfile, textline))
{
sys_times.push_back(textline);
}
Since you have two numbers in each line, you can read two numbers and ignore the first number.
vector<double> sys_times;
int first;
double second;
while ( sysfile >> first >> second )
{
sys_times.push_back(second);
}
std::string ignore_me;
while (sysfile >> ignore_me, getline(sysfile, textline)) {
...
This utilizes the comma operator, reading in the first word (here defining "word" as a continuous sequence of non-space characters) of the line, but ignoring the result, then using getline to read the rest of the line.
Note that for the specific data format you describe, I would rather choose what RSahu showed in their answer. My answer is more general to the problem of "skipping the first word and reading the rest of the line".
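For the general case, a variant that parses each line separately may be more forgiving of blank lines. This is only a sketch: the function name readWithoutFirstWord is made up, and it swaps the comma-operator loop for getline plus a std::istringstream per line.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Returns, for every non-blank line, the text that follows the first word.
std::vector<std::string> readWithoutFirstWord(std::istream& in) {
    std::vector<std::string> result;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string firstWord;
        if (!(fields >> firstWord)) {
            continue;  // blank line: nothing to skip, nothing to keep
        }
        std::string rest;
        // std::ws discards the whitespace between the first word and the rest.
        std::getline(fields >> std::ws, rest);
        result.push_back(rest);  // empty if the line held only one word
    }
    return result;
}
```

Because each line is parsed on its own, a short or empty line cannot shift the columns of the lines after it.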

c++ overwriting file data?

I am trying to run a program to replace certain data within a file. The relevant parts of the file attempting to be replaced look like the following:
1 Information 15e+10
2 Information 2e+16
3 Information 6e+2
And so on.
The files in question can be very large in the multiple gigabyte range and to my understanding because of this using a buffer of the whole file and rewriting the whole file is impossible/unreasonable. Well that is all fine I just want to replace the values (ex. the 15e+10).
This all works fine with simple ios::in|ios::out and tellp() if I am replacing the value with a similar sized value (15e+10->12e+12) or even if its a smaller size as I can simply add an extra space which can be ignored down the line (ex. 15e+10->4e+10 ). But I am running into the problem if I need to replace the value with a value whose length is longer than already in the file (ex. 6e+2->16e+10) it will write over the new line character or start writing over the information in the next line.
I have searched on the forums and everyone says you can either overwrite in the file, you can append to the end of the file, or you can buffer and recreate the whole file. Is there anyway I can achieve my goal of overwriting the value correctly without having to recreate the file?
If not then how can I have 2 files open (1 input 1 output) to do this if multiple files in question are too large for the memory?
Note: I would also like to avoid using boost:: as I need to be able to run this on a system without the boost library.
Open a stream to read from the input (IN) file and a second stream (OUT) to write to a new output (tmp) file.
Read from IN and write to OUT. When you get a value from IN that you want to replace write the replacement to OUT instead of the value you got from IN.
When parsing is complete replace the first file with the second (tmp) file.
Would this work for you?
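The steps above can be sketched roughly as follows. The function name copyWithReplacements and the map of new values keyed by the leading number are assumptions for this example; the line layout (number, label, value) is taken from the sample in the question. Only one line is held in memory at a time, so file size should not matter.

```cpp
#include <istream>
#include <map>
#include <ostream>
#include <sstream>
#include <string>

// Copies `in` to `out` line by line. Lines of the form "<id> <label> <value>"
// whose id appears in newValues are written with the replacement value;
// every other line is copied unchanged.
void copyWithReplacements(std::istream& in, std::ostream& out,
                          const std::map<int, std::string>& newValues) {
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        int id;
        std::string label, value;
        if (fields >> id >> label >> value) {
            const auto it = newValues.find(id);
            if (it != newValues.end()) {
                out << id << ' ' << label << ' ' << it->second << '\n';
                continue;
            }
        }
        out << line << '\n';
    }
}
```

With real files, `in` would be a std::ifstream on the original and `out` a std::ofstream on a temporary file. Once both are closed, std::remove and std::rename from <cstdio> can be used to put the temporary file in place of the original.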
Use lseek()/fseek() to jump to a given position in a file.
You can use seekp to go to the location and rewrite it with <<
Example:
example.txt ( |?| = 1 byte of data )
|A|B|C|\n|1|2|3|D|E|F|\n|4|5|6|
//Somewhere in the code
fstream file;
file.open("example.txt");
//Somehow find the character distance and store it into "distance"
file.seekp(distance);//If distance = 0, it will go to "A" like rewind() but easier for me
If the distance is 4, the next character to be overwritten is 1:
file << "987";
And the file will be
|A|B|C|\n|9|8|7|D|E|F|\n|4|5|6|
BUT the only problem here is when you need to increase/decrease the size:
Increase:
You would overwrite the characters that follow, so you need a temporary string to hold the rest of the data (or process it in smaller chunks if the data is too large), like:
|A|B|C|\n|9|8|7|D|E|F|\n|4|5|6|
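string tempstring;
file.seekg(distance + oldLength); //oldLength = length of the value being replaced (skip past it)
//operator>> would stop at the first whitespace, so read everything up to the end of the file
tempstring.assign(istreambuf_iterator<char>(file), istreambuf_iterator<char>());
file.clear(); //reading to the end sets eofbit, clear it before writing
file.seekp(distance);
file << content << tempstring; //content is the data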
Decrease:
The easiest solution is to write the NUL character \0 into the excess space, like:
|A|B|C|\n|1|\0|\0|D|E|F|\n|4|5|6|
The only side effect is that the file size stays the same as before.
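A small sketch of the same-size-or-smaller case, padding with spaces (as the question suggests) rather than \0. The function name overwritePadded and its parameters are made up for this example; it takes a std::iostream& so it works with a std::fstream on a real file or a std::stringstream for experimenting.

```cpp
#include <cstddef>
#include <fstream>
#include <ios>
#include <iostream>
#include <sstream>
#include <string>

// Overwrites `width` bytes starting at offset `pos` with `value`, padded with
// spaces on the right. Returns false without writing if `value` would not fit,
// since a longer value would run into the data that follows.
bool overwritePadded(std::iostream& file, std::streamoff pos,
                     std::size_t width, const std::string& value) {
    if (value.size() > width) {
        return false;
    }
    std::string field = value;
    field.append(width - value.size(), ' ');  // pad to the original width
    file.seekp(pos, std::ios::beg);
    file << field;
    return static_cast<bool>(file);
}
```

When the function returns false, the value does not fit in place and one of the other approaches (moving the tail, or rewriting to a temporary file) is needed.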

Stop carriage return from appearing in stringstream

I have some text parsing that I'd like to behave identically whether read from a file or from a stringstream. As such, I'm trying to use an std::istream to perform all the work. In the string version, I'm trying to get it to read from a static memory byte array I've created (which was originally from a text file). Let's say the original file looked like this:
4
The corresponding byte array is this:
const char byte_array[] = { 52, 13, 10 };
Where 52 is ASCII for the character 4, then the carriage return, then the linefeed.
When I read directly from the file, the parsing works fine.
When I try to read it in "string mode" like this:
std::istringstream iss(byte_array);
std::istream& is = iss;
I end up getting the carriage returns stuck on the end of the strings I retrieve from the stringstream with this method:
std::string line;
std::getline(is, line);
This screws up my parsing because the string.empty() method no longer gets triggered on "blank" lines -- every line contains at least a 13 for the carriage return even if it's empty in the original file that generated the binary data.
Why is the ifstream behaving differently from the istringstream in this respect? How can I have the istringstream version discard the carriage return just like the ifstream version does?
std::ifstream operates in text mode by default, which on platforms such as Windows means that CRLF line endings are converted to a single LF as the file is read. In this case, std::ifstream is removing the CR character before std::getline() ever sees it.
std::istringstream does not do any interpretation of the source string, and passes through all bytes as they are in the string.
It's important to note that std::string represents a sequence of bytes, not characters. Typically one uses std::string to store ASCII-encoded text, but they can also be used to store arbitrary binary data. The assumption is that if you have read text from a file into memory, you have already done any text transformations such as standardization of line endings.
The correct course of action here would be to convert line endings when the file is being read. In this case, it looks like you are generating code from a file. The program that reads the file and converts it to code should be eliminating the CR characters.
An alternative approach would be to write a stream wrapper that takes an std::istream and delegates read operations to it, converting line endings on the fly. This approach is viable, though can be tricky to get right. (Efficiently handling seeking, in particular, will be difficult.)
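Short of a full stream wrapper, a lighter workaround (not the approach described above, just a common substitute) is a small helper that strips a trailing carriage return after each std::getline call. The name getlineNoCR is made up for this sketch.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Like std::getline, but removes one trailing '\r' so that CRLF and LF
// input produce the same lines, whatever kind of stream they come from.
std::istream& getlineNoCR(std::istream& in, std::string& line) {
    if (std::getline(in, line) && !line.empty() && line.back() == '\r') {
        line.pop_back();
    }
    return in;
}
```

With this helper, a line that held only CR LF comes back empty, so a check such as line.empty() behaves the same for file and string input.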

How do you read a word in from a file in C++?

So I was feeling bored and decided I wanted to make a hangman game. I did an assignment like this back in high school when I first took C++. But this was before I even took geometry, so unfortunately I didn't do well in any way, shape or form in it, and after the semester I trashed everything in a fit of rage.
I'm looking to make a txt document and just throw in a whole bunch of words
(ie:
test
love
hungery
flummuxed
discombobulated
pie
awkward
you
get
the
idea
)
So here's my question:
How do I get C++ to read a random word from the document?
I have a feeling #include<ctime> will be needed, as well as srand(time(0)); to get some kind of pseudorandom choice...but I haven't the foggiest on how to have a random word taken from a file...any suggestions?
Thanks ahead of time!
Here's a rough sketch, assuming that the words are separated by whitespace (space, tab, newline, etc):
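vector<string> words;
ifstream in("words.txt");
string word;
while(in >> word) { // loop on the read itself, so no empty word is pushed at end of file
words.push_back(word);
}
string r=words[rand()%words.size()]; // assumes the file held at least one word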
The operator >> used on a string will read 1 (white) space separated word from a stream.
So the question is do you want to read the file each time you pick a word or do you want to load the file into memory and then pick up the word from a memory structure. Without more information I can only guess.
Pick a Word from a file:
// Note: an ifstream is also an istream.
std::string pickWordFromAStream(std::istream& s,std::size_t pos)
{
std::istream_iterator<std::string> iter(s);
for(;pos;--pos)
{
++iter;
}
// This code assumes that pos (zero-based) is smaller than
// the number of words in the file
return *iter;
}
Load a file into memory:
void loadStreamIntoVector(std::istream& s,std::vector<std::string>& words) // by reference, so the caller sees the words
{
std::copy(std::istream_iterator<std::string>(s),
std::istream_iterator<std::string>(),
std::back_inserter(words)
);
}
Generating a random number should be easy enough, assuming you only want pseudo-random.
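One possible way to do the random pick once the words are in memory, using the <random> header instead of rand(). The function names loadWords and pickRandomWord are made up for this sketch, and it assumes the word list is not empty.

```cpp
#include <cstddef>
#include <istream>
#include <iterator>
#include <random>
#include <sstream>
#include <string>
#include <vector>

// Reads every whitespace-separated word from the stream.
std::vector<std::string> loadWords(std::istream& in) {
    return std::vector<std::string>(std::istream_iterator<std::string>(in),
                                    std::istream_iterator<std::string>());
}

// Picks one word uniformly at random. Precondition: words is not empty.
const std::string& pickRandomWord(const std::vector<std::string>& words,
                                  std::mt19937& rng) {
    std::uniform_int_distribution<std::size_t> index(0, words.size() - 1);
    return words[index(rng)];
}
```

The generator can be seeded once at startup, for example with std::mt19937 rng(std::random_device{}());, and then reused for every pick.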
I would recommend creating a plain text file (.txt) in Notepad and using the standard C file APIs (fopen(), and fread()) to read from it. You can use fgets() to read each line one at a time.
Once you have your plain text file, just read each line into an array and then randomly choose an entry in the array using the method you've suggested above.