Using ifstream's getline in C++

Hello World,
I am fairly new to C++ and I am trying to read a text file line by line. I did some research online and stumbled across ifstream.
What is troubling me is the getline method.
The parameters are istream& getline (char* s, streamsize n );
I understand that the variable s is where the line being read is saved. (Correct me if I am wrong)
What I do not understand is what the streamsize n is used for.
The documentation states that:
Maximum number of characters to write to s (including the terminating null character).
However, if I do not know how long a given line is, what do I set the streamsize n to?
Also,
What is the difference between ifstream and istream?
Would istream be more suitable for reading lines? Is there a difference in performance?
Thanks for your time

You almost never want to use this getline function. It's a leftover from back before std::string had been defined. It's for reading into a fixed-size buffer, so you'd do something like this:
static const int N = 1024;
char mybuffer[N];
myfile.getline(mybuffer, N);
...and the N was there to prevent getline from writing into memory past the end of the space you'd allocated.
For new code you usually want to use an std::string, and let it expand to accommodate the data being read into it:
std::string input;
std::getline(myfile, input);
In this case, you don't need to specify the maximum size, because the string can/will expand as needed for the size of the line in the input. Warning: in a few cases, this can be a problem--if (for example) you're reading data being fed into a web site, it could be a way for an attacker to stage a DoS attack by feeding an immense string, and bringing your system to its knees trying to allocate excessive memory.
Between istream and ifstream: an istream is mostly a base class that defines an interface that can be used to work with various derived classes (including ifstream objects). When/if you want to open a file from disk (or something similar) you want to use an ifstream object.

Related

Input Redirection reading integers and char C++

Thanks for taking your time to read this!
I'm having trouble parsing through a file with input redirection and I am having trouble reading through integers and characters.
Without using getline(), how do you read in the file including integers, characters, and any amount of whitespaces? (I know the >> operator can skip whitespace but fails when it hits a character)
Thanks!
The first thing you need to realise is that, fundamentally, there are no things like "integers" in your file. Your file does not contain typed data: it contains bytes.
Now, since C++ doesn't support any text encodings, for our purposes here we can consider bytes equivalent to "characters". (In reality, you'll probably layer something like a UTF-8 support library on top of your code, at which point "characters" takes on a whole new meaning. But we'll save that discussion for another day.)
At the most basic, then, we can just extract a bunch of bytes. Let's say 50 at a time:
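std::ifstream ifs("filename.dat");
static constexpr std::size_t CHUNK_SIZE = 50;
char buf[CHUNK_SIZE];
// read() reports failure on a short final chunk, so also check gcount()
// to avoid silently dropping the last few bytes of the file.
while (ifs.read(buf, CHUNK_SIZE) || ifs.gcount() > 0) {
    const std::size_t num_extracted = static_cast<std::size_t>(ifs.gcount());
    parseData(buf, num_extracted);
}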
The function parseData would then examine those bytes in whatever manner you see fit.
For many text files this is unnecessarily arduous. So, as you've discovered, the IOStreams part of the C++ Standard Library provides us with some shortcuts. For example, std::getline will read bytes up to a delimiter, rather than reading a certain quantity of bytes.
Using this, we can read in things "line by line" — assuming a "line" is a sequence of bytes terminated by a \n (or \r\n if your platform performs line-ending translation, and you haven't put the stream into binary mode):
std::ifstream ifs("filename.dat");
std::string line;
while (std::getline(ifs, line)) {
    parseLine(line);
}
Instead of \n you can provide, as a third argument to std::getline, some other delimiter.
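For instance, here is a rough sketch of splitting a string on commas with that third argument (split is a hypothetical helper, not a standard function):

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits `text` at every occurrence of `delimiter`.
std::vector<std::string> split(const std::string& text, char delimiter)
{
    std::vector<std::string> fields;
    std::istringstream in(text);
    std::string field;
    // Same loop as for lines, but std::getline stops at `delimiter`
    // instead of '\n'. The delimiter itself is discarded.
    while (std::getline(in, field, delimiter)) {
        fields.push_back(field);
    }
    return fields;
}
```

Note that a delimiter at the very end of the text does not produce a trailing empty field with this approach.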
The other facility it offers is operator>>, which will pick out tokens (sequences of bytes delimited by whitespace) and attempt to "lexically cast" them; that is, it'll try to interpret friendly human ASCII text into C++ data. So if your input is "123 abc", you can pull out the "123" into an int with value 123, and the "abc" into another string.
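A small sketch of the "123 abc" case using formatted extraction with >> (the function name is made up for the example; a std::istringstream stands in for the file or std::cin):

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: expects an integer followed by a word.
// Returns false if either token is missing or the first is not a number.
bool parse_number_and_word(const std::string& text, int& number, std::string& word)
{
    std::istringstream in(text);
    // Each >> skips leading whitespace, then converts one token.
    // The stream tests false if either conversion failed.
    return static_cast<bool>(in >> number >> word);
}
```

If the first token isn't a number, the extraction fails and the stream enters a failed state, which is the behaviour the question ran into.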
If you need more complex parsing, though, you're back to the initial offering, and to the conclusion of my answer: read everything and parse it byte-by-byte as you see fit. To help with this, there's sscanf inherited from the C standard library, or spooky incantations from Boost; or you could just write your own algorithms.
The above is true of any compatible input stream, be it a std::ifstream, a std::istringstream, or the good old ready-provided std::istream instance named std::cin (which I guess is how you're accepting the data, given your mention of input redirection: shell scripting?).

Any way to get rid of the null character at the end of an istream get?

I'm currently trying to write a bit of code to read a file and extract bits of it and save them as variables.
Here's the relevant code:
char address[10];
ifstream tracefile;
tracefile.open ("trace.txt");
tracefile.seekg(2, ios::beg);
tracefile.get(address, 10, ' ');
cout << address;
The contents of the file: (just the first line)
R 0x00000000
The issue I'm having is that address misses the final '0' because it puts a '\0' character there, and I'm not sure how to get around that. So it outputs:
0x0000000
I'm also having issues with
tracefile.seekg(2, ios::cur);
It doesn't seem to work, which is why I've changed it to ios::beg just to try to get something working, although obviously that won't be usable once I try to read multiple lines one after another.
Any help would be appreciated.
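ifstream::get() always produces a null-terminated C string, so with a count of 10 it stores at most 9 characters plus the terminating '\0'. Your address is 10 characters long, so you haven't provided enough space for it.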
You can either:
Allocate char address[11]; (or bigger) to hold a null-terminated string longer than 9 characters.
Use ifstream::read() instead to read the 10 bytes without a null-terminator.
Edit:
If you want a buffer that can dynamically account for the length of the line, use std::getline with a std::string.
std::string buffer;
tracefile.seekg(2, ios::beg);
std::getline( tracefile, buffer );
Edit 2
If you only want to read to the next whitespace, use:
std::string buffer;
tracefile.seekg(2, ios::beg);
tracefile >> buffer;
Make the buffer bigger, so that you can read the entire input text into it, including the terminating '\0'. Or use std::string, which doesn't have a pre-determined size.
There are several issues with your code. The first is that seekg( 2, ios::beg ) is undefined behavior unless the stream is opened in binary mode (which yours isn't). It will work under Unix, and depending on the contents of the file, it might work under Windows (but it could also send you to the wrong place). On some other systems, it might systematically fail, or do just about anything else. You cannot reliably seek to arbitrary positions in a text stream.
The second is that if you want to read exactly 10 characters, the function you need is istream::read, and not istream::get. On the other hand, if you want to read up to the next white space, using >> into a string will work best. If you want to limit the number of characters extracted to a maximum, set the width before calling >>:
std::string address;
// ...
tracefile >> std::setw( 10 ) >> address;
This avoids all issues of '\0', etc.
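Wrapped up as a function, that might look like the following sketch (read_limited_token and its parameters are names I've invented for the example):

```cpp
#include <cstddef>
#include <iomanip>
#include <istream>
#include <string>

// Hypothetical helper: reads one whitespace-delimited token, but never
// more than `max_chars` characters of it. Returns false if nothing was read.
bool read_limited_token(std::istream& in, std::size_t max_chars, std::string& token)
{
    // std::setw applies only to the next extraction; the width resets afterwards.
    return static_cast<bool>(in >> std::setw(static_cast<int>(max_chars)) >> token);
}
```

Any characters of the token beyond the limit stay in the stream, so the next extraction picks up where this one stopped.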
Finally, of course, you need error checking. You should probably check whether the open succeeded before doing anything else, and you should definitely check whether the read succeeded before using the results. (As you've written the code, if the open fails for any reason, you have undefined behavior.)
If you're reading multiple lines, of course, the best solution is usually to use std::getline to read each line into a string, and then parse that string (possibly using std::istringstream). This prevents the main stream from entering error state if there is a format error in the line, and it provides automatic resynchronization in such cases.
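A sketch of that line-then-parse approach for a trace file like the one in the question. The TraceEntry struct, the parse_trace name and the assumption that every line is "<letter> <hex address>" are mine; the real format may differ:

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical record type for one line such as "R 0x00000000".
struct TraceEntry {
    char op;
    unsigned long address;
};

// Reads every line of `in`, appends the well-formed ones to `entries`,
// and returns the number of malformed lines that were skipped.
std::size_t parse_trace(std::istream& in, std::vector<TraceEntry>& entries)
{
    std::size_t bad_lines = 0;
    std::string line;
    while (std::getline(in, line)) {
        // Parse a copy of the line; a format error only affects `fields`,
        // never the main stream, so the loop resynchronizes by itself.
        std::istringstream fields(line);
        char op;
        std::string token;
        if (!(fields >> op >> token)) {
            ++bad_lines;
            continue;
        }
        try {
            std::size_t used = 0;
            // Base 16 accepts an optional "0x" prefix.
            unsigned long address = std::stoul(token, &used, 16);
            if (used != token.size()) {   // trailing junk after the number
                ++bad_lines;
                continue;
            }
            entries.push_back(TraceEntry{op, address});
        } catch (const std::exception&) {  // not a number, or out of range
            ++bad_lines;
        }
    }
    return bad_lines;
}
```

No seeking is needed here: the leading letter is simply read as the first field of each line.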

fgets - maximum size (int num)

In my program, I'm calling the function fgets:
char * fgets ( char * str, int num, FILE * stream );
in a loop several times, and then dealing with the new input (if there is any).
The fgets specification says:
num:
Maximum number of characters to be read (including the final null-character). Usually, the length of the array passed as str is used.
The problem is that I want to read NO MORE than the specified num and IGNORE the rest of the line.
What I've found is that fgets reads the next part of the line on the next call.
How can I avoid this behavior?
You'll need to do it manually: after fgets returns, check whether the buffer ends with a newline character. If it doesn't, the line was longer than the buffer, so keep reading and discarding characters until you reach the newline, then continue with the next fgets call.
The size parameter is intended to be used to prevent reading more data than your buffer can hold. It won't work for skipping over data.
You'll have to write code to throw away the parts of the string you don't want after it's read.
fgets() is an old C function. The idea is that the language will provide minimal complexity functions that you can combine to do what you like. They don't include any extra capability on purpose. This keeps everyone from paying for things they don't use. Think LEGO.
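Putting those answers together, a wrapper along these lines should do it. This is only a sketch: read_line_truncated is a name I've made up, it is written as C++ using <cstdio>, and stripping the newline is my own choice rather than something the question asked for:

```cpp
#include <cstdio>
#include <cstring>

// Hypothetical wrapper: reads at most n - 1 characters of the next line
// into buf and throws away whatever is left of that line.
// Returns false at end of file or on a read error.
bool read_line_truncated(std::FILE* stream, char* buf, int n)
{
    if (std::fgets(buf, n, stream) == nullptr) {
        return false;
    }
    std::size_t len = std::strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';   // the whole line fit; drop the newline
        return true;
    }
    // No newline in the buffer: the line was longer than the buffer (or the
    // file ended without one). Discard characters up to the end of the line.
    int c;
    while ((c = std::fgetc(stream)) != EOF && c != '\n') {
    }
    return true;
}
```

Each call now starts at the beginning of a fresh line, whatever the length of the previous one.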

parse an unknown size string

I am trying to read a string of unknown size from a text file, and I used this code:
ifstream inp_file;
char line[1000];
inp_file.getline(line, 1000);
but I don't like it because it has a limit (even though I know it's very hard to exceed this limit). I want to implement better code which reallocates according to the size of the incoming string.
The following are some of the available options:
istream& getline ( istream& is, string& str, char delim );
istream& getline ( istream& is, string& str );
One of the usual idioms for reading unknown-size inputs is to read a chunk of known size inside a loop, check for the presence of more input (i.e. verify that you are not at the end of the line/file/region of interest), and extend the size of your buffer. While the getline primitives may be appropriate for you, this is a very general pattern for many tasks in languages where allocation of storage is left up to the programmer.
Maybe you could look at using re2c which is a flexible scanner for parsing the input stream? In that way you can pull in any sized input line without having to know in advance... for example using a regex notation
^.+$
once captured by re2c you can then determine how much memory to allocate...
Have a look at memory-mapped files in boost::iostreams.
Maybe it's too late to answer now, but for documentation purposes, another way to read a line of unknown size is to use a wrapper function. In this function, you call fgets() with a local buffer:
(1) Set the last character that fgets() can fill (index size - 2 of the local buffer) to '\0'.
(2) Call fgets().
(3) Check that character to see whether it is still '\0'.
(4) If it is neither '\0' nor '\n', the line has not been read completely. Copy the data into a heap buffer (allocate one, or call realloc() to grow the one you already have) and go back to step (1).
(5) Otherwise, you are done. Append the last chunk and return the data in the allocated buffer.
This was a tip given in my algorithms lecture.
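Here is a rough C++ variation on that wrapper. It swaps in a std::string for the realloc()-managed buffer and looks for the newline at the end of the accumulated text instead of using the sentinel check, so it illustrates the same chunk-and-grow idea rather than the exact steps above. The name read_whole_line and the chunk size are arbitrary:

```cpp
#include <cstdio>
#include <string>

// Hypothetical wrapper: reads one complete line of any length from `stream`.
// Returns false when there is nothing left to read.
// Assumes the text contains no embedded '\0' characters.
bool read_whole_line(std::FILE* stream, std::string& line)
{
    line.clear();
    char chunk[16];   // deliberately small to show the growth
    while (std::fgets(chunk, static_cast<int>(sizeof chunk), stream) != nullptr) {
        line += chunk;                     // the string grows as needed
        if (line.back() == '\n') {
            line.pop_back();               // complete line: drop the newline
            return true;
        }
        // No newline yet: the line continues in the next chunk.
    }
    return !line.empty();   // last line of a file that lacks a final newline
}
```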

Overloading operator>> to a char buffer in C++ - can I tell the stream length?

I'm on a custom C++ crash course. I've known the basics for many years, but I'm currently trying to refresh my memory and learn more. To that end, as my second task (after writing a stack class based on linked lists), I'm writing my own string class.
It's gone pretty smoothly until now; I want to overload operator>> so that I can do stuff like cin >> my_string;.
The problem is that I don't know how to read the istream properly (or perhaps the problem is that I don't know streams...). I tried a while (!stream.eof()) loop that .read()s 128 bytes at a time, but as one might expect, it stops only on EOF. I want it to read to a newline, like you get with cin >> to a std::string.
My string class has an alloc(size_t new_size) function that (re)allocates memory, and an append(const char *) function that does that part, but I obviously need to know the amount of memory to allocate before I can write to the buffer.
Any advice on how to implement this? I tried getting the istream length with seekg() and tellg(), to no avail (it returns -1), and as I said looping until EOF (doesn't stop reading at a newline) reading one chunk at a time.
To read characters from the stream until the end of line use a loop.
char c;
while (istr.get(c) && c != '\n')
{
    // Append 'c' to the end of your string.
}
// If you want to put the '\n' back onto the stream,
// call istr.unget() here (it takes no argument).
// But I think it's safe to say that dropping the '\n' is fine.
If you run out of room reallocate your buffer with a bigger size.
Copy the data across and continue. No need to be fancy for a learning project.
You can use std::cin.getline(buffer, buffer_size);
Then you will need to check the bad, eof and fail flags:
std::cin.bad(), std::cin.eof(), std::cin.fail()
Unless bad or eof is set, the fail flag being set usually indicates that the buffer filled up before the end of the line, so you should reallocate your buffer and continue reading into the new buffer after calling std::cin.clear().
A side note: in the standard library, the operator>> overloads for an istream are either member functions of the stream or (as for char*) non-member functions. It may be wiser to provide a non-member overload for your class rather than defining the operator inside it.
Check Jerry Coffin's answer to this question.
The first method he used is very simple (just a helper class) and allows you to read your input into a std::vector<std::string> where each element of the vector represents a line of the original input.
That really makes things easy when it comes to processing afterwards!