c++ getline reads entire file

I'm using std::getline() to read from a text file, line by line. However, the first call to getline is reading in the entire file! I've also tried specifying the delimiter as '\n' explicitly. Any ideas why this might be happening?
My code:
std::ifstream serialIn;
...
serialIn.open(argv[3]);
...
std::string tmpStr;
std::getline(serialIn, tmpStr, '\n');
// All 570 lines in the file are in tmpStr!
...
std::string serialLine;
std::getline(serialIn, serialLine);
// serialLine == "" here
I am using Visual Studio 2008. The text file has 570 lines (I'm viewing it in Notepad++ fwiw).
Edit: I worked around this problem by using Notepad++ to convert the line endings in my input text file to "Windows" line endings. The file was written with '\n' at the end of each line, using c++ code. Why would getline() require the Windows line endings (\r\n)?? Does this have to do with character width, or Microsoft implementation?

Just guessing, but could your file have Unix line-endings and you're running on Windows?

You're confusing the newline you see in code ('\n') with the actual line-ending representation for the platform (some combination of carriage-return (CR) and linefeed (LF) bytes).
The standard I/O library functions automatically convert line-endings for your platform to and from conceptual newlines for text-mode streams (the default). See What's the difference between text and binary I/O? from the comp.lang.c FAQ. (Although that's from the C FAQ, the concepts apply to C++ as well.) Since you're on Windows, the standard I/O functions by default write newlines as CR-LF and expect CR-LF for newlines when reading.
If you don't want these conversions done and would prefer to see the raw, unadulterated data, then you should set your streams to binary mode. In binary mode, \n corresponds to just LF, and \r corresponds to just CR.
In C, you can specify binary mode by passing "b" as one of the flags to fopen:
FILE* file = fopen(filename, "rb"); // Open a file for reading in binary mode.
In C++:
std::ifstream in;
in.open(filename, std::ios::binary);
or:
std::ifstream in(filename, std::ios::binary);
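To make the difference concrete, here is a minimal sketch (the helper name rawSizeAfterWrite and any file name passed to it are made up for illustration) that writes text in binary mode and then counts the bytes read back in binary mode. Since binary mode performs no newline translation, the count should equal the length of the string on any platform.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: writes `text` to `path` in binary mode, then reads the
// file back in binary mode and returns the number of bytes found on disk.
std::size_t rawSizeAfterWrite(const std::string& path, const std::string& text) {
    {
        std::ofstream out(path.c_str(), std::ios::binary | std::ios::trunc);
        out << text;  // in binary mode '\n' is written as a single LF byte
    }                 // leaving the scope closes (and flushes) the stream
    std::ifstream in(path.c_str(), std::ios::binary);
    std::string raw((std::istreambuf_iterator<char>(in)),
                    std::istreambuf_iterator<char>());
    return raw.size();
}
```

For "one\ntwo\n" this returns 8. If std::ios::binary were dropped from the ofstream only, a Windows build would be expected to report 10, because each '\n' is then written as CR-LF.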

Related

How can i convert linux text file to windows text file by using qt?

When I copy text files to USB flash memory with Qt on a Raspberry Pi 3 and then open them on Windows, the '\n' characters in the files do not seem to work.
I searched for this topic and saw that text file formats differ between Linux and Windows. So I have to copy Linux-based text files to flash memory with Qt and open these files on Windows.
There are a few characters which can indicate a new line. The usual ones are these two:
'\n' or '0x0A' (10 in decimal) -> This character is called "Line Feed" (LF).
'\r' or '0x0D' (13 in decimal) -> This one is called "Carriage return" (CR).
Different Operating Systems handle newlines in a different way. Here is a short list of the most common ones:
DOS and Windows :
They expect a newline to be the combination of two characters, namely '\r\n' (or 13 followed by 10).
Unix (and hence Linux as well) :
Unix uses a single '\n' to indicate a new line.
Mac :
Macs use a single '\r'.
EDIT : As MSalters mentioned Mac OSX is Unix and uses \n. The single \r is ancient Mac OS9
I guess you are just transporting the file, not doing anything with it, but I can't think of another option than opening it and rewrite the line endings.
If you open the .txt file on Windows and read from it (with C++ or C++/Qt) and then write the lines as you get them to a new file, the line endings should then match the Windows conventions.
You can read the file like this:
std::ifstream file;
file.open(filePath);
std::ofstream file2;
file2.open(filePath2);
std::string line;
while(std::getline(file, line))
{
// getline strips the line terminator, so write a newline back explicitly
file2<<line<<'\n';
}
At least the documentation states that getline searches for '\n', so it should work on Windows and Unix. If it doesn't, you can still set the delimiter to '\n' explicitly.
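If you want to write the file 'Windows-like' on your Raspberry Pi, you can try to replace the '\n' characters with '\r\n'.
It should look something like this:
std::string myFileAsString;
std::string toReplace = "\n";
std::string replaceWith = "\r\n";
// Loop over every match: a single find/replace call would only change the first '\n'
// (and would throw std::out_of_range if there were none).
for (std::size_t pos = 0; (pos = myFileAsString.find(toReplace, pos)) != std::string::npos; pos += replaceWith.length())
myFileAsString.replace(pos, toReplace.length(), replaceWith);
where find searches for the next '\n', replace substitutes '\r\n' for it, and the search resumes after the inserted text.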
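One caveat with a plain search-and-replace of '\n' is that a file which already contains "\r\n" would end up with "\r\r\n". A minimal sketch that avoids this, assuming the whole file fits in memory (the function name lfToCrlf is made up for illustration):

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns a copy of `text` in which every bare LF is
// replaced by CR LF. An LF that already follows a CR is left unchanged.
std::string lfToCrlf(const std::string& text) {
    std::string result;
    result.reserve(text.size());
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] == '\n' && (i == 0 || text[i - 1] != '\r')) {
            result += '\r';  // insert the missing carriage return
        }
        result += text[i];
    }
    return result;
}
```

The converted string should then be written through a stream opened with std::ios::binary; in text mode on Windows the library would translate each '\n' again and add a second '\r'.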

C++ ofstream, printing without CRLF

I have a C++ code I am running in Linux with wine. I think this is actually part of the problem.
Usually, when I do something like this in a native Linux C++ program:
ofstream fout;
fout.open("myfile.txt");
fout<<"blah blah"<<endl;
fout<<"blah blah 2"<<endl;
fout.close();
The file is standard ASCII text. However, in the code I am running under wine, myfile.txt is now ASCII text with CRLF line terminators.
This is a problem because if I want to read the file using a native Linux C++ code running on the same machine, the CRLF line terminators really mess up a lot of the file handling and parsing.
Is there a way to get the code running under wine to output files without CRLF line terminators and in a fashion that I can read it using the native Linux C++ code on the same machine?
You could open the file in ios::binary mode. This doesn't, strictly speaking, mean that it's a binary file [any more than any other file is "text", since all files are binary]. Binary in this context just means "don't muck about with the stuff inside the file by interpreting characters as special, adding or removing any characters, etc."
Or when you copy the file to Linux, use dos2unix myfile.txt to convert it from "dos" (and Windows) format to "unix" style text file.
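If running an external tool is not convenient, the same conversion can be approximated in code. This is only a sketch of the idea, not how dos2unix itself is implemented, and the function name crlfToLf is made up for illustration:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns a copy of `text` with every CR LF pair reduced
// to a single LF. A CR that is not followed by an LF is kept as-is.
std::string crlfToLf(const std::string& text) {
    std::string result;
    result.reserve(text.size());
    for (std::size_t i = 0; i < text.size(); ++i) {
        // Skip a CR only when it is immediately followed by an LF.
        if (text[i] == '\r' && i + 1 < text.size() && text[i + 1] == '\n') {
            continue;
        }
        result += text[i];
    }
    return result;
}
```

The input should be read through a stream opened in binary mode so that the raw CR bytes are actually visible to the function.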

Find the system's line terminator

Is there a header file somewhere that stores the line termination character/s for the system (so that without any #ifdefs it will work on all platforms, MAC, Windows, Linux, etc)?
You should open the file in "text mode" (that is, "not use binary"), and newline is always '\n', whatever the native file is. The C library will translate whatever native character(s) indicate newlines into '\n' whenever appropriate [that is, reading/writing text files]. Note that this also means you can't rely on counting the number of characters read and using that count to seek back to a location in the file.
If the file is binary, then newlines aren't newlines anyways.
And unless you plan on running on really ancient systems, and you REALLY want to do this, I would do:
#ifdef _WIN32 // predefined by MSVC and MinGW when targeting Windows
#define END_LINE "\r\n"
#else
#define END_LINE "\n"
#endif
This won't work for MacOS before MacOS X, but surely nobody is using pre-MacOS X hardware any longer?
No, because it's \n everywhere. That expands to the correct newline character(s) when you write it to a text file.
POSIX requires it to be \n. So if _POSIX_VERSION is defined, it's \n. Otherwise, special-case the only non-POSIX OS, Windows, and you're done.
It doesn't look like there's anything in the standard library to obtain the current platform's line terminator.
The closest looking API is
char_type std::basic_ios::widen(char c);
It "converts a character c to its equivalent in the current locale" (cppreference). I was pointed at it by the documentation for std::endl which "inserts a endline character into the output sequence os and flushes it as if by calling os.put(os.widen('\n')) followed by os.flush()" (cppreference).
On Posix,
widen('\n') returns '\n' (as a char, for char-based streams);
endl inserts a '\n' and flushes the buffer.
On Windows, they do exactly the same. In fact
#include <iostream>
#include <fstream>
using namespace std;
int main() {
ofstream f;
f.open("aaa.txt", ios_base::out | ios_base::binary);
f << "aaa" << endl << "bbb";
f.close();
return 0;
}
will result in a file with just '\n' as a line terminator.
As others have suggested, when the file is open in text mode (the default) the '\n' will be automatically converted to '\r' '\n' on Windows.
(I've rewritten this answer because I had incorrectly assumed that std::endl translated to "\r\n" on Windows)
The answer to your question can be extended a little further by using the same code to read both Windows-based and Unix-based text files on Windows, MacOS and Linux/Unix systems (excluding the ancient Macintosh systems that use \r as the line delimiter).
As already pointed out by others, \n can be used as the line delimiter on all of the above systems because the underlying C library converts it to the native delimiter used by each system. Therefore, one can use the following code to read text files that use either \n or \r\n as line delimiters while discarding all delimiter characters:
// Open a file in text mode
std::ifstream file_stream(file_name, std::ios_base::in);
// Use widened '\n' as line delimiter
for(std::string text_line; std::getline(file_stream, text_line, file_stream.widen('\n'));)
{
if(!text_line.empty())
{
// Discard the trailing '\r' left over when reading a Windows-based file on a Unix-like system
if(text_line.back() == '\r') text_line.pop_back();
// Do more with text_line
}
}
In the code above, lines ending in \r will only be encountered when reading Windows-based text files on Unix-like systems, because a single \n is used as the delimiter while Windows-based text files use \r\n. On the other hand, when reading text files on Windows-based systems, both \r\n and \n line endings are removed by std::getline using the widened \n as the delimiter. Note that this code snippet doesn't remove any \r that is not adjacent to \n, because such text files are not correctly formed on Windows, Mac or Linux/Unix systems.
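The same idea can be packaged as a reusable function. The sketch below is one possible arrangement, not part of the answer above: the names readLines and splitLines are made up, and unlike the loop above it keeps empty lines instead of skipping them.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads all lines from `in`, dropping one trailing '\r'
// per line so that both "\n" and "\r\n" terminated input give the same result.
std::vector<std::string> readLines(std::istream& in) {
    std::vector<std::string> lines;
    for (std::string line; std::getline(in, line);) {
        if (!line.empty() && line.back() == '\r') {
            line.pop_back();
        }
        lines.push_back(line);
    }
    return lines;
}

// Convenience wrapper for in-memory text, handy for trying the function out.
std::vector<std::string> splitLines(const std::string& text) {
    std::istringstream in(text);
    return readLines(in);
}
```

For example, splitLines("a\r\nb\nc") yields the three strings "a", "b" and "c". For files, pass an std::ifstream to readLines.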

0x0A after 0x0D when reading file

I read a file and find that there is a 0x0D before every 0x0A.
I only know that it is Windows that does the conversion.
But I have used binary mode; can't it prevent this?
ifstream input(inputname, ios::binary);
input.get(ch);
How do I avoid it? I only want to get the \n.
What about writing a file?
Thanks in advance.
If you're on a system that does use \r\n line endings then opening a file in text mode will cause the system to automatically convert these to the standard \n without \r. Opening a file in binary mode prevents this conversion.
If you're on a system that does not use this convention then there's no mode that will convert the line endings. You will have to convert them manually yourself or preprocess the file using an external tool.
If you want to detect whether a file uses \r\n you'll have to do it manually. Scan through the text file and see if every \n is preceded by a \r.
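A minimal sketch of that scan, assuming the file contents have already been read into a string through a stream opened in binary mode (the function name usesCrlfEverywhere is made up for illustration):

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns true only if `raw` contains at least one LF
// and every LF in it is immediately preceded by a CR.
bool usesCrlfEverywhere(const std::string& raw) {
    bool sawNewline = false;
    for (std::size_t i = 0; i < raw.size(); ++i) {
        if (raw[i] != '\n') {
            continue;
        }
        sawNewline = true;
        if (i == 0 || raw[i - 1] != '\r') {
            return false;  // found a bare LF
        }
    }
    return sawNewline;
}
```

Returning false for text with no newline at all is a design choice of this sketch; a caller that needs to distinguish "no newlines" from "mixed or LF-only" would need a richer return type.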
As an alternative, instead of trying to preemptively detect what kind of line endings a file uses, you could simply add logic in your processing code to specially handle \r followed by \n. Something like:
for (int i=0; i<n; ++i) {
if ('\r' == text[i] && (i+1<n) && '\n' == text[i+1])
++i; // skip carriage return, just handle newline
if ('\n' == text[i]) {
// handle newline...
} else {
// handle other characters
}
}
Hmm. If you use binary mode, ios::binary tells the library that you want to read the file as it is (uncooked, raw). On MS-DOS (some people nowadays call it Windows NT), lines in text files are terminated by 0d0a. So if you don't want to see these two chars, you have to open the file in text mode (just omit the ios::binary). Or you have to convert these files to Unix style with a utility like dos2unix, but then, if you are on a Windows system, Notepad, for example, may not be able to display these files as expected...

Mismatch between characters put and read

I'm trying to write a Huffman encoder but I'm getting some compression errors. I identified the problem as mismatches between characters that were put() to the ofstream and the characters read() from the same file.
One specific instance of this problem :
The put() writes ASCII character 10 (Line feed)
The read() reads ASCII character 13 (Carriage return)
I thought read and put read and write raw data (no character translations), so I'm not sure why this is happening. Can someone help me out?
Here is the ofstream instance for writing the compressed file:
std::ofstream compressedFileStream(getCompressedFileName(),std::ios::binary||std::ios::ate);
and the ifstream instance for reading the same
std::ifstream fileInput(getFileName()+".huf",std::ios::binary);
The code is running on Windows 7 and all streams in the program are opened in binary mode.
Not opening in binary mode due to a typo:
std::ofstream compressedFileStream(getCompressedFileName(),std::ios::binary||std::ios::ate)
should be:
std::ofstream compressedFileStream(getCompressedFileName(),std::ios::binary|std::ios::ate)
// ^
|, not ||.
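To see why the typo matters: logical OR produces a bool, so the two flags collapse into the single value true, whereas bitwise OR keeps both bits. The sketch below uses made-up flag values (the real values of std::ios::binary and std::ios::ate are implementation-defined) purely to illustrate the arithmetic.

```cpp
// Hypothetical flag values chosen for illustration only; they are NOT the
// actual values of std::ios::binary and std::ios::ate on any given compiler.
constexpr unsigned kBinary = 0x20;
constexpr unsigned kAtEnd  = 0x04;

// Logical OR: both operands are non-zero, so the result is true, i.e. 1.
constexpr unsigned kLogicalOr = (kBinary || kAtEnd);
// Bitwise OR: the result has both flag bits set.
constexpr unsigned kBitwiseOr = (kBinary | kAtEnd);

static_assert(kLogicalOr == 1, "logical OR collapses the flags to 1");
static_assert((kLogicalOr & kBinary) == 0, "the binary bit is lost");
static_assert((kBitwiseOr & kBinary) != 0, "bitwise OR keeps the binary bit");
static_assert((kBitwiseOr & kAtEnd) != 0, "bitwise OR keeps the at-end bit");
```

On an implementation where the resulting 1 is accepted as an open mode, it no longer contains the binary flag, which would be consistent with the text-mode behaviour described in the question.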
The symptoms show that you are creating the ofstream in text mode, or you are creating it from a file descriptor that is opened in text mode.
You will want to pass ios::binary to it at construction time or it may run in text mode on Windows.
After you added the code, the reason proves to be a typo;
std::ios::binary||std::ios::ate
should be
std::ios::binary|std::ios::ate
On Windows, if you are writing binary data, you need to open the file in binary mode (std::ios::binary for C++ streams, or "wb" for fopen).
Similarly, if you are reading binary data, you need to open the file in binary mode (std::ios::binary, or "rb" for fopen).