Trying to implement an iterator without an explicit container - c++

After my recent question, I am trying to implement my own contrived example.
I have a basic structure in place but even after reading this, which is probably the best tutorial I've seen, I'm still very confused. I think I should probably convert the Chapter._text into a stream and for the increment operator do something like
string p = "";
string line;
while ( getline(stream, line) ) {
p += line;
}
return *p;
but I'm not sure which of the "boilerplate" typedefs to use and how to put all these things together. Thanks much for your help
#include <stdio.h>
#include <stdlib.h>
#include <vector>
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
class Paragraph {
public:
string _text;
Paragraph (string text) {
_text = text;
}
};
class Chapter {
public:
string _text;
/* // I guess here I should do something like:
class Iterator : public iterator<input_iterator_tag, Paragraph> {
}
// OR should I be somehow delegating from istream_iterator ? */
Chapter (string txt_file) {
string line;
ifstream infile(txt_file.c_str());
if (!infile.is_open()) {
cout << "Error opening file " << txt_file << endl;
exit(0);
}
while ( getline(infile, line) ) {
_text += (line + "\n");
}
}
};
int main(int argc, char** argv) {
Chapter c(argv[1]);
// want to do something like:
// for (<Paragraph>::iterator pIt = c.begin(); pIt != c.end(); pIt++) {
// Paragraph p(*pIt);
// // Do something interesting with p
// }
return 0;
}

As you weren't planning on a chapter loading at a time (merely a paragraph), and your paragraph is empty, I think this might be best done with a single paragraph_iterator class
class paragraph_iterator :
public std::iterator<std::input_iterator_tag, std::string>
{
std::shared_ptr<std::ifstream> _infile; //current file
std::string _text; //current paragraph
paragraph_iterator(const paragraph_iterator &b); //not defined, so no copy
paragraph_iterator &operator=(const paragraph_iterator &b); //not defined, so no copy
// don't allow copies, because streams aren't copiable.
// make sure to always pass by reference
// unfortunately, this means no stl algorithms either
public:
paragraph_iterator(string txt_file) :_infile(txt_file.c_str()) {
if (!infile.is_open())
throw std::exception("Error opening file ");
std::getline(_infile, _text);
}
paragraph_iterator() {}
paragraph_iterator &operator++() {
std::getline(_infile, _text);
return *this;
}
// normally you'd want operator++(int) as well, but that requires making a copy
// and std::ifstream isn't copiable.
bool operator==(const paragraph_iterator &b) const {
if (_infile.bad() == b._infile.bad())
return true; //all end() and uninitialized iterators equal
// so we can use paragraph_iterator() as end()
return false; //since they all are seperate strings, never equal
}
bool operator!=(const paragraph_iterator &b) const {return !operator==(b);}
const std::string &operator*() const { return _text;}
};
int main() {
paragraph_iterator iter("myfile.txt");
while(iter != paragraph_iterator()) {
// dostuff with *iter
}
}
the stream is encapsulated in the iterator, so that if we have two iterators to the same file, both will get every line. If you have a seperate Chapter class with two iterators, you may run into "threading" problems. This is pretty bare code, and completely untested. I'm sure there's a way to do it with copiable iterators, but far trickier.
In general, an iterator class implementation is closely tied with the data structure it iterates over. Otherwise, we'd just have a few generic iterator classes.

Related

C++ random access iterators for containers with elements loaded on demand

I'm currently working on a small project which requires loading messages from a file. The messages are stored sequentially in the file and files can become huge, so loading the entire file content into memory is unrewarding.
Therefore we decided to implement a FileReader class that is capable of moving to specific elements in the file quickly and load them on request. Commonly used something along the following lines
SpecificMessage m;
FileReader fr;
fr.open("file.bin");
fr.moveTo(120); // Move to Message #120
fr.read(&m); // Try deserializing as SpecificMessage
The FileReader per se works great. Therefore we thought about adding STL compliant iterator support as well: A random access iterator that provides read-only references to specific messages. Used in the following way
for (auto iter = fr.begin<SpecificMessage>(); iter != fr.end<SpecificMessage>(); ++iter) {
// ...
}
Remark: the above assumes that the file only contains messages of type SpecificMessage. We've been using boost::iterator_facade to simplify the implementation.
Now my question boils down to: how to implement the iterator correctly? Since FileReader does not actually hold a sequence of messages internally, but loads them on request.
What we've tried so far:
Storing the message as an iterator member
This approach stores the message in the iterator instance. Which works great for simple use-cases but fails for more complex uses. E.g. std::reverse_iterator has a dereference operation that looks like this
reference operator*() const
{ // return designated value
_RanIt _Tmp = current;
return (*--_Tmp);
}
This breaks our approach as a reference to a message from a temporary iterator is returned.
Making the reference type equal the value type
#DDrmmr in the comments suggested making the reference type equal the value type, so that a copy of the internally stored object is returned. However, I think this is not valid for the reverse iterator which implements the -> operator as
pointer operator->() const {
return (&**this);
}
which derefs itself, calls the *operator which then returns a copy of a temporary and finally returns the address of this temporary.
Storing the message externally
Alternatively I though about storing the message externally:
SpecificMessage m;
auto iter = fr.begin<SpecificMessage>(&m);
// ...
which also seems to be flawed for
auto iter2 = iter + 2
which will have both iter2 and iter point to the same content.
As I hinted in my other answer, you could consider using memory mapped files. In the comment you asked:
As far as memory mapped files is concerned, this seems not what I want to have, as how would you provide an iterator over SpecificMessages for them?
Well, if your SpecificMessage is a POD type, you could just iterate over the raw memory directly. If not, you could have a deserialization helper (as you already have) and use Boost transform_iterator to do the deserialization on demand.
Note that we can make the memory mapped file managed, effectively meaning that you can just use it as a regular heap, and you can store all standard containers. This includes node-based containers (map<>, e.g.), dynamic-size containers (e.g. vector<>) in addition to the fixed-size containers (array<>) - and any combinations of those.
Here's a demo that takes a simple SpecificMessage that contains a string, and (de)derializes it directly into shared memory:
using blob_t = shm::vector<uint8_t>;
using shared_blobs = shm::vector<blob_t>;
The part that interests you would be the consuming part:
bip::managed_mapped_file mmf(bip::open_only, DBASE_FNAME);
shared_blobs* table = mmf.find_or_construct<shared_blobs>("blob_table")(mmf.get_segment_manager());
using It = boost::transform_iterator<LazyLoader<SpecificMessage>, shared_blobs::const_reverse_iterator>;
// for fun, let's reverse the blobs
for (It first(table->rbegin()), last(table->rend()); first < last; first+=13)
std::cout << "blob: '" << first->contents << "'\n";
// any kind of random access is okay, though:
auto random = rand() % table->size();
SpecificMessage msg;
load(table->at(random), msg);
std::cout << "Random blob #" << random << ": '" << msg.contents << "'\n";
So this prints each 13th message, in reverse order, followed by a random blob.
Full Demo
The sample online uses the lines of the sources as "messages".
Live On Coliru
#include <boost/interprocess/file_mapping.hpp>
#include <boost/interprocess/managed_mapped_file.hpp>
#include <boost/container/scoped_allocator.hpp>
#include <boost/interprocess/containers/vector.hpp>
#include <iostream>
#include <boost/iterator/transform_iterator.hpp>
#include <boost/range/iterator_range.hpp>
static char const* DBASE_FNAME = "database.map";
namespace bip = boost::interprocess;
namespace shm {
using segment_manager = bip::managed_mapped_file::segment_manager;
template <typename T> using allocator = boost::container::scoped_allocator_adaptor<bip::allocator<T, segment_manager> >;
template <typename T> using vector = bip::vector<T, allocator<T> >;
}
using blob_t = shm::vector<uint8_t>;
using shared_blobs = shm::vector<blob_t>;
struct SpecificMessage {
// for demonstration purposes, just a string; could be anything serialized
std::string contents;
// trivial save/load serialization code:
template <typename Blob>
friend bool save(Blob& blob, SpecificMessage const& msg) {
blob.assign(msg.contents.begin(), msg.contents.end());
return true;
}
template <typename Blob>
friend bool load(Blob const& blob, SpecificMessage& msg) {
msg.contents.assign(blob.begin(), blob.end());
return true;
}
};
template <typename Message> struct LazyLoader {
using type = Message;
Message operator()(blob_t const& blob) const {
Message result;
if (!load(blob, result)) throw std::bad_cast(); // TODO custom excepion
return result;
}
};
///////
// for demo, create some database contents
void create_database_file() {
bip::file_mapping::remove(DBASE_FNAME);
bip::managed_mapped_file mmf(bip::open_or_create, DBASE_FNAME, 1ul<<20); // Even sparse file size is limited on Coliru
shared_blobs* table = mmf.find_or_construct<shared_blobs>("blob_table")(mmf.get_segment_manager());
std::ifstream ifs("main.cpp");
std::string line;
while (std::getline(ifs, line)) {
table->emplace_back();
save(table->back(), SpecificMessage { line });
}
std::cout << "Created blob table consisting of " << table->size() << " blobs\n";
}
///////
void display_random_messages() {
bip::managed_mapped_file mmf(bip::open_only, DBASE_FNAME);
shared_blobs* table = mmf.find_or_construct<shared_blobs>("blob_table")(mmf.get_segment_manager());
using It = boost::transform_iterator<LazyLoader<SpecificMessage>, shared_blobs::const_reverse_iterator>;
// for fun, let's reverse the blobs
for (It first(table->rbegin()), last(table->rend()); first < last; first+=13)
std::cout << "blob: '" << first->contents << "'\n";
// any kind of random access is okay, though:
auto random = rand() % table->size();
SpecificMessage msg;
load(table->at(random), msg);
std::cout << "Random blob #" << random << ": '" << msg.contents << "'\n";
}
int main()
{
#ifndef CONSUMER_ONLY
create_database_file();
#endif
srand(time(NULL));
display_random_messages();
}
You are having issues because your iterator does not conform to the forward iterator requirements. Specifically:
*i must be an lvalue reference to value_type or const value_type ([forward.iterators]/1.3)
*i cannot be a reference to an object stored in the iterator itself, due to the requirement that two iterators are equal if and only if they are bound to the same object ([forward.iterators]/6)
Yes, these requirements are a huge pain in the butt, and yes, that means that things like std::vector<bool>::iterator are not random access iterators even though some standard library implementations incorrectly claim that they are.
EDIT: The following suggested solution is horribly broken, in that dereferencing a temporary iterator returns a reference to an object that may not live until the reference is used. For example, after auto& foo = *(i + 1); the object referenced by foo may have been released. The implementation of reverse_iterator referenced in the OP will cause the same problem.
I'd suggest that you split your design into two classes: FileCache that holds the file resources and a cache of loaded messages, and FileCache::iterator that holds a message number and lazily retrieves it from the FileCache when dereferenced. The implementation could be something as simple as storing a container of weak_ptr<Message> in FileCache and a shared_ptr<Message> in the iterator: Simple demo
I have to admit I may not fully understand the trouble you have with holding the current MESSAGE as a member of Iter. I would associate each iterator with the FileReader it should read from and implement it as a lightweight encapsulation of a read index for FileReader::(read|moveTo). The most important method to overwtite is boost::iterator_facade<...>::advance(...) which modifies the current index and tries to pull a new MESSAGE from the FileReader If this fails it flags the the iterator as invalid and dereferencing will fail.
template<class MESSAGE,int STEP>
class message_iterator;
template<class MESSAGE>
class FileReader {
public:
typedef message_iterator<MESSAGE, 1> const_iterator;
typedef message_iterator<MESSAGE,-1> const_reverse_iterator;
FileReader();
bool open(const std::string & rName);
bool moveTo(int n);
bool read(MESSAGE &m);
// get the total count of messages in the file
// helps us to find end() and rbegin()
int getMessageCount();
const_iterator begin() {
return const_iterator(this,0);
}
const_iterator end() {
return const_iterator(this,getMessageCount());
}
const_reverse_iterator rbegin() {
return const_reverse_iterator(this,getMessageCount()-1);
}
const_reverse_iterator rend() {
return const_reverse_iterator(this,-1);
}
};
// declaration of message_iterator moving over MESSAGE
// STEP is used to specify STEP size and direction (e.g -1 == reverse)
template<class MESSAGE,int STEP=1>
class message_iterator
: public boost::iterator_facade<
message_iterator<MESSAGE>
, const MESSAGE
, boost::random_access_traversal_tag
>
{
typedef boost::iterator_facade<
message_iterator<MESSAGE>
, const MESSAGE
, boost::random_access_traversal_tag
> super;
public:
// constructor associates an iterator with its FileReader and a given position
explicit message_iterator(FileReader<MESSAGE> * p=NULL,int n=0): _filereader(p),_idx(n),_valid(false) {
advance(0);
}
bool equal(const message_iterator & i) const {
return i._filereader == _filereader && i._idx == _idx;
}
void increment() {
advance(+1);
}
void decrement() {
advance(-1);
}
// overwrite with central functionality. Move to a given relative
// postion and check wether the position can be read. If move/read
// fails we flag the iterator as incalid.
void advance(int n) {
_idx += n*STEP;
if(_filereader!=NULL) {
if( _filereader->moveTo( _idx ) && _filereader->read(_m)) {
_valid = true;
return;
}
}
_valid = false;
}
// Return a ref to the currently cached MESSAGE. Throw
// an acception if positioning at this location in advance(...) failes.
typename super::reference dereference() const {
if(!_valid) {
throw std::runtime_error("access to invalid pos");
}
return _m;
}
private:
FileReader<MESSAGE> * _filereader;
int _idx;
bool _valid;
MESSAGE _m;
};
Boost PropertyMap
You could avoid writing the bulk of the code using Boost PropertyMap:
Live On Coliru
#include <boost/property_map/property_map.hpp>
#include <boost/property_map/function_property_map.hpp>
using namespace boost;
struct SpecificMessage {
// add some data
int index; // just for demo
};
template <typename Message>
struct MyLazyReader {
typedef Message type;
std::string fname;
MyLazyReader(std::string fname) : fname(fname) {}
Message operator()(size_t index) const {
Message m;
// FileReader fr;
// fr.open(fname);
// fr.moveTo(index); // Move to Message
// fr.read(&m); // Try deserializing as SpecificMessage
m.index = index; // just for demo
return m;
}
};
#include <iostream>
int main() {
auto lazy_access = make_function_property_map<size_t>(MyLazyReader<SpecificMessage>("file.bin"));
for (int i=0; i<10; ++i)
std::cout << lazy_access[rand()%256].index << "\n";
}
Sample output is
103
198
105
115
81
255
74
236
41
205
Using Memory Mapped Files
You could store a map of index -> BLOB objects in a shared vector<array<byte, N>>, flat_map<size_t, std::vector<uint8_t> > or similar.
So, now you only have to deserialize from myshared_map[index].data() (begin() and end() in case the BLOB size varies)

Trouble implementing a line-by-line file parser in C++

I have trouble implementing a simple file parser in C++11 which reads a file line by line and tokenizes the line. It should properly manage its resources. Usage of the parser should be like:
Parser parser;
parser.open("/path/to/file");
std::pair<int> header = parser.getHeader();
while (parser.hasNext()) {
std::vector<int> tokens = parser.getNext();
}
parser.close();
So the Parser class needs one member std::ifstream file (or std::ifstream* file?)
1) How should the constructor initialize this->file?
2) How should the open method set this->file to the input file?
3) How should the next line from the file get loaded into a string?
(Is this what you would use: std::getline(this->file, line)) ?
Can you give some advice? Ideally, could you sketch out the class as a code example.
Since the Parser is probably in a pretty useless state once you've constructed it and before you've opened the file, I would suggest having your use case look something like this:
Parser parser("/path/to/file");
std::pair<int> header = parser.getHeader();
while (parser.hasNext()) {
std::vector<int> tokens = parser.getNext();
}
parser.close();
In which case, you should use the constructor's member initialization list to initialise the file member (which, yes, should be of type std::ifstream):
Parser::Parser(std::string file_name)
: file(file_name)
{
// ...
}
If you kept the constructor and open member function separate, you could just leave the constructor as default because the file member will be default constructed giving you a file stream that is not associated with any file. You would then get Parser::open to forward the file name to std::ifstream::open, like so:
void Parser::open(std::string file_name)
{
file.open(file_name);
}
Then, yes, to read lines from the file, you want to use something similar to this:
std::string line;
while (std::getline(file, line)) {
// Do something with line
}
Good job for not falling into the trap of doing while (!file.eof()).
It can be designed in many ways.
You may ask the user to provide you a stream instead of specifying a filename.
That will be more generic and will work in all streams.
That way you should have a std::ifstream& member variable though you can have a pointer type as well but you need to do *_stream << to invoke any operator.
If you take a file, you mat construct a stream in your constructor and close it if open in destructor
Actually, there is an alternative to feeding the name of the file to Parser: you could feed it a std::istream. What's interesting in this is that this way any derived class of std::istream can be used, and thus you could feed it, for example, a std::istringstream, which makes it easier to write unit-tests.
class Parser {
public:
explicit Parser(std::istream& is);
/**/
private:
std::istream& _stream;
/**/
};
Next, comes iteration. It is not idiomatic in C++ to have a has followed by a get. std::istream supports iteration (with an input iterator), you could perfectly design your parser so it does too. This way you will have the benefit of compatibility with many STL algorithms.
class ParserIterator:
public std::iterator< std::input_iterator_tag, std::vector<int> >
{
public:
ParserIterator(): _stream(nullptr) {} // end
ParserIterator(std::istream& is): _stream(&is) { this->advance(); }
// Accessors
std::vector<int> const& operator*() const { return _vec; }
std::vector<int> const* operator->() const { return &_vec; }
bool equals(ParserIterator const& other) const {
if (_stream != other._stream) { return false; }
if (_stream == nullptr) { return true; }
return false;
}
// Modifiers
ParserIterator& operator++() { this->advance(); return *this; }
ParserIterator operator++(int) {
ParserIterator tmp(*this);
this->advance();
return tmp;
}
private:
void advance() {
assert(_stream && "cannot advance an end iterator");
_vec.clear();
std::string buffer;
if (not getline(*_stream, buffer)) {
_stream = 0; // end of story
}
// parse here
}
std::istream* _stream;
std::vector<int> _vec;
}; // class ParserIterator
inline bool operator==(ParserIterator const& left, ParserIterator const& right) {
return left.equals(right);
}
inline bool operator!= (parserIterator const& left, ParserIterator const& right) {
return not left.equals(right);
}
And with that we can augment our parser:
ParserIterator Parser::begin() const {
return ParserIterator(_stream);
}
ParserIterator Parser::end() const {
return ParserIterator();
}
I'll leave the getHeader method and the actual parsing content to you ;)

returning c++ iterators

I have a function that returns an iterator if an object is found.
Now i have a problem. How do i fix the problem of informing the object that called this function that the object was not found?
vector<obj>::iterator Find(int id, int test)
{
vector<obj>::iterator it;
aClass class;
for(it = class.vecCont.begin(); it != class.vecCont.end(); ++it)
{
if(found object) //currently in psuedo code
return it;
}
return ???? // <<< if not found what to insert here?
}
Do i need to change my data structure in this instead?
Thanks in advance! :)
Return vector::end(), throw an exception, or return something other than a plain iterator
Better yet, don't implement your own Find function. That is what the <algorithm> library is for. Based on your psudocode, you can probably use std::find or std::find_if. find_if is particularly useful in cases where equality doesn't necessarily mean operator==. In those cases, you can use a [C++11] lambda or if C++11 isn't available to you, a functor class.
Since the functor is the lowest common denominator, I'll start with that:
#include <cstdlib>
#include <string>
#include <algorithm>
#include <vector>
#include <functional>
using namespace std;
class Person
{
public:
Person(const string& name, unsigned age) : name_(name), age_(age) {};
string name_;
unsigned age_;
};
class match_name : public unary_function <bool, string>
{
public:
match_name(const string& rhs) : name_(rhs) {};
bool operator()(const Person& rhs) const
{
return rhs.name_ == name_;
}
private:
string name_;
};
#include <iostream>
int main()
{
vector<Person> people;
people.push_back(Person("Hellen Keller", 99));
people.push_back(Person("John Doe", 42));
/** C++03 **/
vector<Person>::const_iterator found_person = std::find_if( people.begin(), people.end(), match_name("John Doe"));
if( found_person == people.end() )
cout << "Not FOund";
else
cout << found_person->name_ << " is " << found_person->age_;
}
found_person now points to the person whose name is "John Doe", or else points to people_.end() if that person wasn't found.
A C++11 lambda is new language syntax that makes this process of declaring/defining a functor and using is somewhat simpler for many cases. It's done like this:
string target = "John Doe";
vector<Person>::const_iterator found_person = std::find_if(people.begin(), people.end(), [&target](const Person& test) { return it->name_ == target; });
You can return an iterator to the end, i.e. return class.vecCont.end() to indicate that.
How about just returning the end iterator?
Your code becomes:-
vector<obj>::iterator Find(int id, int test)
{
vector<obj>::iterator it;
aClass class;
for(it = class.vecCont.begin(); it != class.vecCont.end(); ++it)
{
if(found object) //currently in psuedo code
break;
}
return it;
}
or just use std::find.
You should return class.vecCont.end() if the object was not found. But #chris is right - this is exactly what std::find is for.
Something like this
std::vector<obj>::iterator pos;
pos = find(coll.begin(),coll.end(), val);
And don't forget to these check for presence of your element or not in the container
if (pos != coll.end())
Don't return an iterator to a hidden container. Return simply what it is that you want, namely a means to access an object if it exists. In this example, I store the objects in the container via pointer. If your objects only exist temporarily, then new one up and copy the object over!
class AClass;
//...some time later
std::vector<AClass*> vecCont; //notice, store pointers in this example!
//..some time later
AClass * findAClass(int id, int test)
{
vector<AClass*>::iterator it;
for(it = class.vecCont.begin(); it != class.vecCont.end(); ++it)
{
if(found object) //currently in psuedo code
return it;
}
return NULL;
}
//later still..
AClass *foundVal = findAClass(1, 0);
if(foundVal)
{
//we found it!
}
else
{
//we didn't find it
}
edit: the intelligent thing to do is to write a comparator for your class and use the std algorithms sort and to find them for you. However, do what you want.
Never emulate std::algorithm functions inside a class. They are free functions for a reason. It usually is enough to expose begin and end member function that return the right iterators (and possibly a boost::iterator_range). If you need to do a fancy find with a functor, expose the functor as well.

map inheritance

I have a parent class which holds a map and n the child class i have used to inherit that class with for some reason can't access the map which i can't under stand why, i want to access the values inside the map.
my code is as follows
//HEADER FILE
#include <iostream>
#include <map>
using namespace std;
//////PARENT CLASS
struct TTYElementBase
{
//some code here
};
class element
{
public:
std::map<char,std::string> transMask;
std::map<char,std::string>::iterator it;
void populate();
};
//////CHILD CLASS .HPP
class elementV : public element
{
public :
std::string s1;
std::string s2;
elementV();
friend ostream &operator<< (ostream &, const elementV &);
void transLateMask();
};
//CPP FILE
#include "example.h"
#include <iostream>
elementV::elementV()
{
}
void elementV::transLateMask()
{
for ( it=transMask.begin() ; it != transMask.end(); it++ )
cout << (*it).first << endl;
}
int main()
{
elementV v;
v.transLateMask();
}
// ' OUTPUT IS NOTHING I DONT KNOW WHY?'
output is nothing but i need to acces the map fron the parent class, what am i doing wrong?
any help i will be very gratefull
Thanks
Does the map contain an entry for 'D' when you call transLateMask()? You'll get undefined behaviour (perhaps a runtime error) if it doesn't, since you don't check the result of find(). Something like this would be more robust:
auto found = transMask.find('D');
if (found == transMask.end()) {
// handle the error in some way, perhaps
throw std::runtime_error("No D in transMask");
}
std::string str = found->second;
(If you're not using C++11, then replace auto with the full type name, std::map<char,std::string>::const_iterator).
Alternatively, C++11 adds an at() method which throws std::out_of_range if the key is not found:
std::string str = transMask.at('D')->second;
The find() method of the std::map can return an iterator that is "one beyond the end" of the map, i.e. equals to result of end(). This means there's no such entry in the map. You have to check for that:
typedef std::map<char,std::string> mymap;
mymap::const_iterator i = transMask.find('D');
if ( i == transMask.end()) {
std::cerr << "'D' not found" << std::endl;
} else {
...
}

How do I iterate over cin line by line in C++?

I want to iterate over std::cin, line by line, addressing each line as a std::string. Which is better:
string line;
while (getline(cin, line))
{
// process line
}
or
for (string line; getline(cin, line); )
{
// process line
}
? What is the normal way to do this?
Since UncleBen brought up his LineInputIterator, I thought I'd add a couple more alternative methods. First up, a really simple class that acts as a string proxy:
class line {
std::string data;
public:
friend std::istream &operator>>(std::istream &is, line &l) {
std::getline(is, l.data);
return is;
}
operator std::string() const { return data; }
};
With this, you'd still read using a normal istream_iterator. For example, to read all the lines in a file into a vector of strings, you could use something like:
std::vector<std::string> lines;
std::copy(std::istream_iterator<line>(std::cin),
std::istream_iterator<line>(),
std::back_inserter(lines));
The crucial point is that when you're reading something, you specify a line -- but otherwise, you just have strings.
Another possibility uses a part of the standard library most people barely even know exists, not to mention being of much real use. When you read a string using operator>>, the stream returns a string of characters up to whatever that stream's locale says is a white space character. Especially if you're doing a lot of work that's all line-oriented, it can be convenient to create a locale with a ctype facet that only classifies new-line as white-space:
struct line_reader: std::ctype<char> {
line_reader(): std::ctype<char>(get_table()) {}
static std::ctype_base::mask const* get_table() {
static std::vector<std::ctype_base::mask>
rc(table_size, std::ctype_base::mask());
rc['\n'] = std::ctype_base::space;
return &rc[0];
}
};
To use this, you imbue the stream you're going to read from with a locale using that facet, then just read strings normally, and operator>> for a string always reads a whole line. For example, if we wanted to read in lines, and write out unique lines in sorted order, we could use code like this:
int main() {
std::set<std::string> lines;
// Tell the stream to use our facet, so only '\n' is treated as a space.
std::cin.imbue(std::locale(std::locale(), new line_reader()));
std::copy(std::istream_iterator<std::string>(std::cin),
std::istream_iterator<std::string>(),
std::inserter(lines, lines.end()));
std::copy(lines.begin(), lines.end(),
std::ostream_iterator<std::string>(std::cout, "\n"));
return 0;
}
Keep in mind that this affects all input from the stream. Using this pretty much rules out mixing line-oriented input with other input (e.g. reading a number from the stream using stream>>my_integer would normally fail).
What I have (written as an exercise, but perhaps turns out useful one day), is LineInputIterator:
#ifndef UB_LINEINPUT_ITERATOR_H
#define UB_LINEINPUT_ITERATOR_H
#include <iterator>
#include <istream>
#include <string>
#include <cassert>
namespace ub {
template <class StringT = std::string>
class LineInputIterator :
public std::iterator<std::input_iterator_tag, StringT, std::ptrdiff_t, const StringT*, const StringT&>
{
public:
typedef typename StringT::value_type char_type;
typedef typename StringT::traits_type traits_type;
typedef std::basic_istream<char_type, traits_type> istream_type;
LineInputIterator(): is(0) {}
LineInputIterator(istream_type& is): is(&is) {}
const StringT& operator*() const { return value; }
const StringT* operator->() const { return &value; }
LineInputIterator<StringT>& operator++()
{
assert(is != NULL);
if (is && !getline(*is, value)) {
is = NULL;
}
return *this;
}
LineInputIterator<StringT> operator++(int)
{
LineInputIterator<StringT> prev(*this);
++*this;
return prev;
}
bool operator!=(const LineInputIterator<StringT>& other) const
{
return is != other.is;
}
bool operator==(const LineInputIterator<StringT>& other) const
{
return !(*this != other);
}
private:
istream_type* is;
StringT value;
};
} // end ub
#endif
So your loop could be replaced with an algorithm (another recommended practice in C++):
for_each(LineInputIterator<>(cin), LineInputIterator<>(), do_stuff);
Perhaps a common task is to store every line in a container:
vector<string> lines((LineInputIterator<>(stream)), LineInputIterator<>());
The first one.
Both do the same, but the first one is much more readable, plus you get to keep the string variable after the loop is done (in the 2nd option, its enclosed in the for loop scope)
Go with the while statement.
See Chapter 16.2 (specifically pages 374 and 375) of Code Complete 2 by Steve McConell.
To quote:
Don't use a for loop when a while loop is more appropriate. A common abuse of the flexible for loop structure in C++, C# and Java is haphazardly cramming the contents of a while loop into a for loop header.
.
C++ Example of a while loop abusively Crammed into a for Loop Header
for (inputFile.MoveToStart(), recordCount = 0; !inputFile.EndOfFile(); recordCount++) {
inputFile.GetRecord();
}
C++ Example of appropriate use of a while loop
inputFile.MoveToStart();
recordCount = 0;
while (!InputFile.EndOfFile()) {
inputFile.getRecord();
recordCount++;
}
I've omitted some parts in the middle but hopefully that gives you a good idea.
This is based on Jerry Coffin's answer. I wanted to show c++20's std::ranges::istream_view. I also added a line number to the class. I did this on godbolt, so I could see what happened. This version of the line class still works with std::input_iterator.
https://en.cppreference.com/w/cpp/ranges/basic_istream_view
https://www.godbolt.org/z/94Khjz
class line {
std::string data{};
std::intmax_t line_number{-1};
public:
friend std::istream &operator>>(std::istream &is, line &l) {
std::getline(is, l.data);
++l.line_number;
return is;
}
explicit operator std::string() const { return data; }
explicit operator std::string_view() const noexcept { return data; }
constexpr explicit operator std::intmax_t() const noexcept { return line_number; }
};
int main()
{
std::string l("a\nb\nc\nd\ne\nf\ng");
std::stringstream ss(l);
for(const auto & x : std::ranges::istream_view<line>(ss))
{
std::cout << std::intmax_t(x) << " " << std::string_view(x) << std::endl;
}
}
prints out:
0 a
1 b
2 c
3 d
4 e
5 f
6 g