The first line of my file must be exactly a two digit number which I read into firstline[2]. I use sscanf to read the data from that buffer and store it into an int to represent the number of lines (not including the first one) in the file. If there is a third character I must exit with an error code.
I've tried introducing a new char buffer thirdchar[1] and comparing it to a new line (10 or '\n'). If thirdchar does not equal newline then it should exit with an error code. Later in my program I use sscanf to read firstline and store that number into an int called numberoflines. When I intriduce thirdchar, it appends an extra two zeros to numberoflines to the end of what was in firstline.
//If the first line was "20"
int numberoflines;
char firstline[2];
file.get(firstline[0]);//should be '2'
file.get(firstline[1]);//should be '0'
char thridchar[1];
file.get(thirdchar[0]);//should be '\n'
if (thirdchar !=10){exit();}//10 is the value gdb spits out to represent '\n'
sscanf(firstline, "%d", &numberoflines);//numberoflines should be 20
I debugged this and firstline and thirdchar are the expected values, but numberoflines becomes 2000! I've removed the code related to thirdchar and it works fine, but doesnt meet the requirement of it being a 2 digit number. Am I misunderstanding what sscanf does? Is there a better way to implement this? Thanks.
---------------UPDATE------------------
So I've updated my code to use std::string and std::getline:
std::string firstline;
std::getline(file, firstline);
And I get the following error when trying to print the value of firstline
$1 = Python Exception <class 'gdb.error'> There is no member named _M_dataplus.:
sscanf requires the input string to be null-terminated. You are not passing it a null-terminated string so it's not behaving as expected.
As suggested, you would be better placed reading in the string using std::getline and converting the std::string into an integer.
Further reading here if using C++11 onwards, or here otherwise.
Related
I wrote the following C++ program to read a text file line by line and print out the content of the file line by line. I entered the name of the text file as the only command line argument into the command line.
#include <iostream>
#include <fstream>
using namespace std;
int main(int argc, char* argv[])
{
char buf[255] = {};
if (argc != 2)
{
cout << "Invalid number of files." << endl;
return 1;
}
ifstream f(argv[1], ios::in | ios::binary);
if (!f)
{
cout << "Error: Cannot open file." << endl;
return 1;
}
while (!f.eof())
{
f.get(buf,255);
cout << buf << endl;
}
f.close();
return 0;
}
However, when I ran this code in Visual Studio, the Debug Console was completely blank. What's wrong with my code?
Apart from the errors mentioned in the comments, the program has a logical error because istream& istream::get(char* s, streamsize n) does not do what you (or I, until I debugged it) thought it does. Yes, it reads to the next newline; but it leaves the newline in the input!
The next time you call get(), it will see the newline immediately and return with an empty line in the buffer, for ever and ever.
The best way to fix this is to use the appropriate function, namely istream::getline() which extracts, but does not store the newline.
The EOF issue
is worth mentioning. The canonical way to read lines (if you want to write to a character buffer) is
while (f.getline(buf, bufSz))
{
cout << buf << "\n";
}
getline() returns a reference to the stream which in turn has a conversion function to bool, which makes it usable in a boolean expression like this. The conversion is true if input could be obtained. Interestingly, it may have encountered the end of file, and f.eof() would be true; but that alone does not make the stream convert to false. As long as it could extract at least one character it will convert to true, indicating that the last input operation made input available, and the loop will work as expected.
The next read after encountering EOF would then fail because no data could be extracted: After all, the read position is still at EOF. That is considered a read failure. The condition is wrong and the loop is exited, which was exactly the intent.
The buffer size issue
is worth mentioning, as well. The standard draft says in 30.7.4.3:
Characters are extracted and stored until one of the following occurs:
end-of-file occurs on the input sequence (in which case the function calls setstate(eofbit));
traits::eq(c, delim) for the next available input character c
(in which case the input character
is extracted but not stored);
n is less than one or n - 1 characters are stored
(in which case the function calls setstate(
failbit)).
The conditions are tested in that order, which means that if n-1 characters have been stored and the next character is a newline (the default delimiter), the input was successful (and the newline is extracted as well).
This means that if your file contains a single line 123 you can read that successfully with f.getline(buf, 4), but not a line 1234 (both may or may not be followed by a newline).
The line ending issue
Another complication here is that on Windows a file created with a typical editor will have a hidden carriage return before the newline, i.e. a line actually looks like "123\r\n" ("\r" and "\n" each being a single character with the values 13 and 10, respectively). Because you opened the file with the binary flag the program will see the carriage return; all lines will contain that "invisible" character, and the number of visible characters fitting in the buffer will be one shorter than one would assume.
The console issue ;-)
Oh, and your Console was not entirely empty; it's just that modern computers are too fast and the first line which was probably printed (it was in my case) scrolled away faster than anybody could switch windows. When I looked closely there was a cursor in the bottom left corner where the program was busy printing line after line of nothing ;-).
The conclusion
Debug your programs. It's very easy with VS.
Use getline(istream, string).
Use the return value of input functions (typically the stream)
as a boolean in a while loop: "As long as you can extract any input, use that input."
Beware of line ending issues.
Consider C I/O (printf, scanf) for anything non-trivial (I didn't discuss this in my answer but I think that's what many people do).
I have a small program, that is meant to copy a small phrase from a file, but it appears that I am either misinformed as to how seekg() works, or there is a problem in my code preventing the function from working as expected.
The text file contains:
//Intro
previouslyNoted=false
The code is meant to copy the word "false" into a string
std::fstream stats("text.txt", std::ios::out | std::ios::in);
//String that will hold the contents of the file
std::string statsStr = "";
//Integer to hold the index of the phrase we want to extract
int index = 0;
//COPY CONTENTS OF FILE TO STRING
while (!stats.eof())
{
static std::string tempString;
stats >> tempString;
statsStr += tempString + " ";
}
//FIND AND COPY PHRASE
index = statsStr.find("previouslyNoted="); //index is equal to 8
//Place the get pointer where "false" is expected to be
stats.seekg(index + strlen("previouslyNoted=")); //get pointer is placed at 24th index
//Copy phrase
stats >> previouslyNotedStr;
//Output phrase
std::cout << previouslyNotedStr << std::endl;
But for whatever reason, the program outputs:
=false
What I expected to happen:
I believe that I placed the get pointer at the 24th index of the file, which is where the phrase "false" begins. Then the program would've inputted from that index onward until a space character would have been met, or the end of the file would have been met.
What actually happened:
For whatever reason, the get pointer started an index before expected. And I'm not sure as to why. An explanation as to what is going wrong/what I'm doing wrong would be much appreciated.
Also, I do understand that I could simply make previouslyNotedStr a substring of statsStr, starting from where I wish, and I've already tried that with success. I'm really just experimenting here.
The VisualC++ tag means you are on windows. On Windows the end of line takes two characters (\r\n). When you read the file in a string at a time, this end-of-line sequence is treated as a delimiter and you replace it with a single space character.
Therefore after you read the file you statsStr does not match the contents of the file. Every where there is a new line in the file you have replaced two characters with one. Hence when you use seekg to position yourself in the file based on numbers you got from the statsStr string, you end up in the wrong place.
Even if you get the new line handling correct, you will still encounter problems if the file contains two or more consecutive white space characters, because these will be collapsed into a single space character by your read loop.
You are reading the file word by word. There are better methods:
while (getline(stats, statsSTr)
{
// An entire line is read into statsStr.
std::string::size_type posn = statsStr.find("previouslyNoted=");
// ...
}
By reading entire text lines into a string, there is no need to reposition the file.
Also, there is a white-space issue when reading by word. This will affect where you think the text is in the file. For example, white space is skipped, and there is no telling how many spaces, newlines or tabs were skipped.
By the way, don't even think about replacing the text in the same file. Replacement of text only works if the replacement text has the same length as the original text in the file. Write to a new file instead.
Edit 1:
A better method is to declare your key strings as array. This helps with positioning pointers within a string:
static const char key_text[] = "previouslyNoted=";
while (getline(stats, statsStr))
{
std::string::size_type key_position = statsStr.find(key_text);
std::string::size_type value_position = key_position + sizeof(key_text) - 1; // for the nul terminator.
// value_position points to the character after the '='.
// ...
}
You may want to save programming type by making your data file conform to an existing format, such as INI or XML, and using appropriate libraries to parse them.
string numbers;
string fileName = "text.txt";
ifstream inputFile;
inputFile.open(fileName.c_str(),ios_base::in);
inputFile >> numbers;
inputFile.close();
cout << numbers;
And my text.txt file is:
1 2 3 4 5
basically a set of integers separated by tabs.
The problem is the program only reads the first integer in the text.txt file and ignores the rest for some reason. If I remove the tabs between the integers it works fine, but with tabs between them, it won't work. What causes this? As far as I know it should ignore any white space characters or am I mistaken? If so is there a better way to get each of these numbers from the text file?
When reading formatted strings the input operator starts with ignoring leading whitespace. Then it reads non-whitespace characters up to the first space and stops. The non-whitespace characters get stored in the std::string. If there are only whitespace characters before the stream reaches end of file (or some error for that matter), reading fails. Thus, your program reads one "word" (in this case a number) and stops reading.
Unfortunately, you only said what you are doing and what the problems are with your approach (where you problem description failed to cover the case where reading the input fails in the first place). Here are a few things you might want to try:
If you want to read multiple words, you can do so, e.g., by reading all words:
std::vector<std::string> words;
std::copy(std::istream_iterator<std::string>(inputFile),
std::istream_iterator<std::string>(),
std::back_inserter(words));
This will read all words from inputFile and store them as a sequence of std::strings in the vector words. Since you file contains numbers you might want to replace std::string by int to read numbers in a readily accessible form.
If you want to read a line rather than a word you can use std::getline() instead:
if (std::getline(inputFile, line)) { ... }
If you want to read multiple lines, you'd put this operation into a loop: There is, unfortunately, no read-made approach to read a sequence of lines as there is for words.
If you want to read the entire file, not just the first line, into a file, you can also use std::getline() but you'd need to know about one character value which doesn't occur in your file, e.g., the null value:
if (std::getline(inputFile, text, char()) { ... }
This approach considers a "line" a sequence of characters up to a null character. You can use any other character value as well. If you can't be sure about the character values, you can read an entire file using std::string's constructor taking iterators:
std::string text((std::istreambuf_iterator<char>(inputFile)),
std::istreambuf_iterator<char>());
Note, that the extra pair of parenthesis around the first parameter is, unfortunately, necessary (if you are using C++ 2011 you can avoid them by using braces, instead of parenthesis).
Use getline to do the reading.
string numbers;
if (inputFile.is_open())//checking if open
{
getline (inputFile,numbers); //fetches entire line into string numbers
inputFile.close();
}
Your program does behave exactly as in your description : inputFile >> numbers; just extract the first integer in the input file, so if you suppress the tab, inputFile>> will extract the number 12345, not 5 five numbers [1,2,3,4,5].
a better method :
vector< int > numbers;
string fileName = "text.txt";
ifstream inputFile;
inputFile.open(fileName.c_str(),ios_base::in);
char c;
while (inputFile.good()) // loop while extraction from file is possible
{
c = inputFile.get(); // get character from file
if ( inputFile.good() and c!= '\t' and c!=' ' ) // not sure of tab and space encoding in C++
{
numbers.push_back( (int) c);
}
}
inputFile.close();
I have this code....
file_parser::file_parser(string file){
txtfile.open(file.c_str(), ios::in);
if (!txtfile.good()){
string error=err(0," "+file+" not found. Exit code ",1);
throw file_parse_exception(error);
}
while (!txtfile.eof()){
char str[200];
txtfile.getline(str, 200);
string str2=str;
vfile.push_back(str2);
}
txtfile.close();
}
and the problem is that if I have a line in the input file greater than 200 characters it hangs then crashes. I checked out the value of str at crash and it is preceded by a null char then it tries to push back a null(non-initialized) string onto the vector which causes the hang/crash. does anyone know a way to get around this? I thought by using getline it would truncate the char array at 199(+null) characters but apparently this isn't happening. I'm stumped. The thing is that I want each pushback to have a max of 200 characters. I really don't want the WHOLE line which is what 'string str' would do. and if a line is over 200 characters it should read the first 200 and then move on to the next line.
Replace your input loop with this:
std::string str;
while (std::getline(txtfile, str)){
vfile.push_back(str);
}
Using ios::eof() as a loop condition almost always creates a buggy program, as it did here. In this case, using eof() has two problems. First, eof() is only set after the read fails, not before, but you are checking it before the read. Second, eof() doesn't check the range of other errors. When an input line has more than 200 characters, istream::getline sets failbit, but not eofbit.
EDIT: With the added requirement of limiting the input lines to 200 characters, this should work:
// untested
std::string str;
while(std::getline(txtfile, str)) {
if(str.size() > 200)
str.erase(200, std::string::npos);
vfile.push_back(str);
}
once again I ask for help. I haven't coded anything for sometime!
Now I have a text file filled with random gibberish. I already have a basic idea on how I will count the number of occurrences per word.
What really stumps me is how I will determine what line the word is in. Gut instinct tells me to look for the newline character at the end of each line. However I have to do this while going through the text file the first time right? Since if I do it afterwords it will do no good.
I already am getting the words via the following code:
vector<string> words;
string currentWord;
while(!inputFile.eof())
{
inputFile >> currentWord;
words.push_back(currentWord);
}
This is for a text file with no set structure. Using the above code gives me a nice little(big) vector of words, but it doesn't give me the line they occur in.
Would I have to get the entire line, then process it into words to make this possible?
Use a std::map<std::string, int> to count the word occurrences -- the int is the number of times it exists.
If you need like by line input, use std::getline(std::istream&, std::string&), like this:
std::vector<std::string> lines;
std::ifstream file(...) //Fill in accordingly.
std::string currentLine;
while(std::getline(file, currentLine))
lines.push_back(currentLine);
You can split a line apart by putting it into an std::istringstream first and then using operator>>. (Alternately, you could cobble up some sort of splitter using std::find and other algorithmic primitaves)
EDIT: This is the same thing as in #dash-tom-bang's answer, but modified to be correct with respect to error handing:
vector<string> words;
int currentLine = 1; // or 0, however you wish to count...
string line;
while (getline(inputFile, line))
{
istringstream inputString(line);
string word;
while (inputString >> word)
words.push_back(pair(word, currentLine));
}
Short and sweet.
vector< map< string, size_t > > line_word_counts;
string line, word;
while ( getline( cin, line ) ) {
line_word_counts.push_back();
map< string, size_t > &word_counts = line_word_counts.back();
istringstream line_is( line );
while ( is >> word ) ++ word_counts[ word ];
}
cout << "'Hello' appears on line 5 " << line_word_counts[5-1]["Hello"]
<< " times\n";
You're going to have to abandon reading into strings, because operator >>(istream&, string&) discards white space and the contents of the white space (== '\n' or != '\n', that is the question...) is what will give you line numbers.
This is where OOP can save the day. You need to write a class to act as a "front end" for reading from the file. Its job will be to buffer data from the file, and return words one at a time to the caller.
Internally, the class needs to read data from the file a block (say, 4096 bytes) at a time. Then a string GetWord() (yes, returning by value here is good) method will:
First, read any white space characters, taking care to increment the object's lineNumber member every time it hits a \n.
Then read non-whitespace characters, putting them into the string object you'll be returning.
If it runs out of stuff to read, read the next block and continue.
If the you hit the end of file, the string you have is the whole word (which may be empty) and should be returned.
If the function returns an empty string, that tells the caller that the end of file has been reached. (Files usually end with whitespace characters, so reading whitespace characters cannot imply that there will be a word later on.)
Then you can call this method at the same place in your code as your cin >> line and the rest of the code doesn't need to know the details of your block buffering.
An alternative approach is to read things a line at a time, but all the read functions that would work for you require you to create a fixed-size buffer to read into beforehand, and if the line is longer than that buffer, you have to deal with it somehow. It could get more complicated than the class I described.