nlohmann json access nested value by single string - c++

I have a json like :
{
"answer": {
"everything": 42
}
}
Using nlohmann json library in C++, I can access at this nested value like this :
std::cout << my_json["answer"]["everything"];
I'm looking for a way to access it with something like :
std::cout << my_json["answer.everything"];

It won't be possible to implement this with the syntax j["name1.name2"] you used. Many operator overloads, including the call operator () and the subscript operator [], can only be declared inside a class. Furthermore, nlohmann::json already defines the [] operator to work like j["name1"]["name2"], so you would have to modify the library to get this precise syntax. But it seems you only want to browse an arbitrarily nested Json with a simple input string, without necessarily sticking to the square bracket operator.
A simple solution is writing a function get(nlohmann::json, std::string) that takes an nlohmann::json as well as the corresponding dot-separated names of the form name1.name2.name3 etc. and returns the corresponding value. This can be achieved by splitting the string at the delimiter and then calling the nlohmann::json accessors to walk down the Json one name at a time:
// Not noexcept: push_back and substr may throw std::bad_alloc
std::vector<std::string> split(std::string const& str, char const delim) {
std::vector<std::string> res = {};
std::size_t start {0};
std::size_t end {0};
while ((start = str.find_first_not_of(delim, end)) != std::string::npos) {
end = str.find(delim, start);
res.push_back(str.substr(start, end - start));
}
return res;
}
nlohmann::json get(nlohmann::json const& root, std::string const& dot_separated_names) {
std::vector<std::string> const names = split(dot_separated_names, '.');
nlohmann::json const* leaf = &root;
for (auto const& name : names) {
if (leaf->contains(name)) {
leaf = &leaf->at(name);
} else {
// Stop here instead of silently returning the wrong node (needs <stdexcept>)
throw std::out_of_range("Name '" + name + "' not found!");
}
}
return *leaf;
}
With the functions above you can achieve the desired output with get(j, "name1.name2");.
If you want to be able to set values as well, you can similarly write a function set:
nlohmann::json& set(nlohmann::json& root, std::string const& dot_separated_names) {
std::vector<std::string> const names = split(dot_separated_names, '.');
nlohmann::json* leaf = &root;
for (auto const& name : names) {
if (leaf->contains(name)) {
leaf = &leaf->at(name);
} else {
// Stop here instead of silently returning the wrong node (needs <stdexcept>)
throw std::out_of_range("Name '" + name + "' not found!");
}
}
return *leaf;
}
that can be called as set(j, "name1.name2") = "value";. It might be unintuitive to set a value like this, so you might want to change it to something like template <typename T> void set(nlohmann::json& root, std::string const& dot_separated_names, T&& value).
Probably it is even better to write this function for std::vector<std::string> instead of std::string and move the splitting of the string to a different routine. This way you can switch delimiter easily and can also browse through a Json just given a vector of names.
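The split-then-walk separation suggested above could look roughly like the sketch below. The names split_path and find_node are made up for illustration, and find_node is written as a template that only assumes the node type offers contains(name) and at(name), the two calls the get() function above already relies on.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Splits "a.b.c" into {"a", "b", "c"}; empty segments are skipped.
// The delimiter is a parameter, so "a/b/c" works with delim = '/'.
std::vector<std::string> split_path(const std::string& path, char delim = '.') {
    std::vector<std::string> names;
    std::size_t start = 0;
    std::size_t end = 0;
    while ((start = path.find_first_not_of(delim, end)) != std::string::npos) {
        end = path.find(delim, start);
        names.push_back(path.substr(start, end - start));
    }
    return names;
}

// Walks an already-split path. Json is a template parameter (assumption:
// it provides contains(name) and at(name)). Returns nullptr when a name
// is missing instead of printing and carrying on.
template <typename Json>
const Json* find_node(const Json& root, const std::vector<std::string>& names) {
    const Json* node = &root;
    for (const auto& name : names) {
        if (!node->contains(name)) {
            return nullptr;
        }
        node = &node->at(name);
    }
    return node;
}
```

A call would then read find_node(j, split_path("name1.name2")), and a caller that already holds a vector of names can skip the splitting step entirely.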
If you really wanted to use it as j["name1.name2"] you would have to write your own wrapper class for a nlohmann::json that defines the bracket operator in this way. But in this case you lose a lot of functionality of the wrapped nlohmann::json that you would have to supply by yourself, e.g. the other operators such as << etc.
class YourJsonWrapper {
public:
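// Not noexcept: copying a json may allocate and throw.
// Parentheses instead of braces: brace-initializing a json from a json can yield an array.
YourJsonWrapper(nlohmann::json const& j)
: j(j) {
}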
// Your custom bracket operator
nlohmann::json const& operator [] (std::string const& dot_separated_names) const {
// Get implementation (like above)
// ...
}
nlohmann::json& operator [] (std::string const& dot_separated_names) {
// Set implementation (like above)
// ...
}
// Add the other operators you need and forward them to the underlying nlohmann::json
nlohmann::json j; ///< The wrapped nlohmann::json
};
Furthermore - as pointed out in the comments - using dot-separated strings could be problematic as a Json name itself might contain a dot as well. Therefore make sure your Json names can't contain dots by design.

Related

Function taking either reference to single element or a vector

I want a function that modifies passed in elements of known type, matching against private data we iterate over in an outer loop comparing each to the passed in elements. Quantities are small so no need to build a map or optimize away the n² nature of this. In pseudo-code:
function fill_in_thing_data([in-out] things)
for (item in private_items)
info = wrangle(item)
for (thing in things)
if (matching(thing, info))
thing.data = info.data
Say private_items is expensive to iterate over, or to set up for iteration, and so I definitely want that in the outer loop.
Easy, right? Except that I want two overloaded C++ functions that share some underlying code, one that takes a non-const reference to thing, one that takes a reference to a vector of things:
void fill_in_thing_data(Thing& single_thing);
void fill_in_thing_data(std::vector<Thing>& some_things);
What's the best way to share code between these 2 functions? I was thinking a helper function that takes iterators or some similar sequence type, and the first function passing in an iterator or sequence made out of the 1 element, the other one made out of the vector. I want to use C++ idioms so was looking at doing it with ForwardIterators.
Problem:
C++ iterators in general aren't easy. Ok using them is ok, but writing a function that takes any kind of ForwardIterator to simply a known type seems unexpectedly tricky.
Is a special case iterator over a single element reference something that's available? Or something that's easy to define? It seems the answers are no and no.
Instead of iterators, this is my ugly solution using a visitor lambda passed in, and a visit lambda passed back, allowing the iteration logic to be abstracted out of the shared function:
// The lambdas below return nothing, so the callbacks are declared as returning void
using VisitThing = std::function<void(Thing& thing)>;
using ThingVisitor = std::function<void(VisitThing visit)>;
void match_things(ThingVisitor visitor);
void fill_in_thing_data(Thing& single_thing) {
match_things([&](VisitThing visit) {
visit(single_thing);
});
}
void fill_in_thing_data(std::vector<Thing>& some_things) {
match_things([&](VisitThing visit) {
for (auto& thing : some_things) { visit(thing); }
});
}
void match_things(ThingVisitor visitor) {
auto stuff = fetch_private_stuff();
while (auto item = stuff.get_next_item()) { // can't change this home-brew iteration
auto item_info = wrangle(item); // but more complex, logic about skipping items etc
visitor([&](Thing& thing) {
if (item_info.token == thing.token) {
thing.data = item_info.data;
}
});
}
}
Am I wrong and can do this with iterators without too much complexity? Or can I do this better in some other way, like maybe a good data structure class that can either be built with either the reference or the vector and then pass that in? Or like something else obvious I'm just not seeing? Thanks!
I hope I got OP's issue right. To me, it boiled down to having a function which can be applied to a single reference as well as to a std::vector of instances.
There is actually a very simple solution which I even learnt (decades ago) in C but would work in C++ as well:
The function takes a pointer and a count:
void fill(Thing *pThing, size_t len);
To use it with a single instance:
Thing thing;
fill(&thing, 1);
To use it with a std::vector<Thing>:
std::vector<Thing> things;
fill(things.data(), things.size()); // data() is also safe for an empty vector
IMHO, this is somewhat C-ish, and besides, OP mentioned iterators.
So, here we go:
template <typename ITER>
void fill(ITER first, ITER last)
{
for (const Item &item : items) {
for (ITER iter = first; iter != last; ++iter) {
if (matching(*iter, item)) iter->data = item;
}
}
}
// a wrapper for a single thing
void fill(Thing &thing) { fill(&thing, &thing + 1); }
// a wrapper for a vector of things
void fill(std::vector<Thing> &things) { fill(things.begin(), things.end()); }
The principle is still the same as above, but using iterators.
A complete demo:
#include <iostream>
#include <vector>
// an item
struct Item {
int id = 0;
};
// the vector of OPs private items
std::vector<Item> items = {
{ 1 }, { 2 }, { 3 }
};
// a thing
struct Thing {
int id;
Item data;
Thing(int id): id(id) { }
};
// a hypothetical function matching a thing with an item
bool matching(const Thing &thing, const Item &item)
{
return thing.id == item.id;
}
// a generic fill (with matches) using iterators
template <typename ITER>
void fill(ITER first, ITER last)
{
for (const Item &item : items) {
for (ITER iter = first; iter != last; ++iter) {
if (matching(*iter, item)) iter->data = item;
}
}
}
// a wrapper for a single thing
void fill(Thing &thing) { fill(&thing, &thing + 1); }
// a wrapper for a vector of things
void fill(std::vector<Thing> &things) { fill(things.begin(), things.end()); }
// overloaded output operator for thing (demo sugar)
std::ostream& operator<<(std::ostream &out, const Thing &thing)
{
return out << "Thing { id: " << thing.id
<< ", data: Item { id: " << thing.data.id << "} }";
}
// demo sugar
#define DEBUG(...) std::cout << #__VA_ARGS__ << ";\n"; __VA_ARGS__
// demonstrate
int main()
{
// call to fill a single instance of Thing
DEBUG(Thing thing(2));
DEBUG(std::cout << thing << '\n');
DEBUG(fill(thing));
DEBUG(std::cout << thing << '\n');
std::cout << '\n';
// call to fill a vector of Thing
DEBUG(std::vector<Thing> things = { Thing(2), Thing(3) });
DEBUG(for (const Thing &thing : things) std::cout << thing << '\n');
DEBUG(fill(things));
DEBUG(for (const Thing &thing : things) std::cout << thing << '\n');
}
Output:
Thing thing(2);
std::cout << thing << '\n';
Thing { id: 2, data: Item { id: 0} }
fill(thing);
std::cout << thing << '\n';
Thing { id: 2, data: Item { id: 2} }
std::vector<Thing> things = { Thing(2), Thing(3) };
for (const Thing &thing : things) std::cout << thing << '\n';
Thing { id: 2, data: Item { id: 0} }
Thing { id: 3, data: Item { id: 0} }
fill(things);
for (const Thing &thing : things) std::cout << thing << '\n';
Thing { id: 2, data: Item { id: 2} }
Thing { id: 3, data: Item { id: 3} }
Live demo on coliru
A note about the template function with iterators:
Just recently I became painfully aware that just naming something ITER is not enough to guarantee that it accepts iterators only. In my case, I had a variety of overloads, and the one accepting an iterator range was only one of them. So, I had to eliminate ambiguities. For this, I found a quite simple solution (here on Stack Overflow) using SFINAE:
template <typename ITER,
typename = decltype(
*std::declval<ITER&>(), void(), // has dereference
++std::declval<ITER&>(), void())> // has prefix inc.
void fill(ITER first, ITER last);
An iterator is something which (among other things) has to provide a de-reference and an increment operator. The 2nd template type argument checks precisely this in its default initialization. (Template type arguments for SFINAE should never be used explicitly in template instances, of course.)
Live demo on coliru
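To see what the defaulted template argument actually accepts, the same two expressions can be moved into a small trait and probed with static_assert. This is only a sketch: is_iterator_like is a name made up here, not something from the answer or the standard library.

```cpp
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Primary template: chosen when the expressions below are ill-formed for T.
template <typename T, typename = void>
struct is_iterator_like : std::false_type {};

// Specialization: chosen when T supports dereference and prefix increment,
// i.e. the same two expressions checked in the SFINAE argument above.
template <typename T>
struct is_iterator_like<T, decltype(*std::declval<T&>(), void(),
                                    ++std::declval<T&>(), void())>
    : std::true_type {};

static_assert(is_iterator_like<int*>::value, "raw pointers qualify");
static_assert(is_iterator_like<std::vector<int>::iterator>::value,
              "container iterators qualify");
static_assert(!is_iterator_like<int>::value, "int cannot be dereferenced");
static_assert(!is_iterator_like<std::string>::value,
              "std::string has no unary operator*");
```

Note that the check is deliberately loose: it asks for two operators, not for a full iterator, so a type that merely happens to offer both would pass as well.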

Creating a generic conversion function

I have a ResourceManager which takes in classes of type Resource. Resource is a parent class of other classes such as ShaderProgram, Texture, Mesh and even Camera who are completely unrelated to one another.
Suffice it to say, the ResourceManager works. But there is one thing that is very tedious and annoying, and that's when I retrieve the objects from the ResourceManager. Here is the problem:
In order to get an object from ResourceManager you call either of these functions:
static Resource* get(int id);
static Resource* get(const std::string &name);
The first function checks one std::unordered_map by an integer id; whereas the second function checks another std::unordered_map by the name that is manually given by the client. I have two versions of these functions for flexibility's sake because there are times where we don't care what the object contained within ResourceManager is (like Mesh) and there are times where we do care about what it is (like Camera or ShaderProgram) because we may want to retrieve the said objects by name rather than id.
Either way, both functions return a pointer to a Resource. When you call the function, it's as easy as something like:
rm::get("skyboxShader");
Where rm is just a typedef of ResourceManager since the class is static (all members/functions are static). The problem though is that the rm::get(..) function returns a Resource*, and not the child class that was added to the ResourceManager to begin with. So, in order to solve this problem I have to do a manual type conversion so that I can get ShaderProgram* instead of Resource*. I do it like this:
auto s = static_cast<ShaderProgram*>(rm::get(name));
So, every time I want to access a Resource I have to insert the type I want to actually get into the static_cast. This is problematic insofar as every time someone needs to access a Resource they have to type convert it. So, naturally I created a function, and being that ShaderProgram is the subject here, thus:
ShaderProgram* Renderer::program(const std::string &name)
{
auto s = static_cast<ShaderProgram*>(rm::get(name));
return s;
}
This function is static, and ResourceManager is a static class so the two go well hand-in-hand. This is a nice helper function and it works effectively and my program renders the result just fine. The problem is what I have to do when I'm dealing with other Resources; that means for every Resource that exists, there has to be a type-conversion function to accommodate it. Now THAT is annoying. Isn't there a way I can write a generic type-conversion function something like this?
auto Renderer::getResource(classTypeYouWant T, const std::string &name)
{
auto s = static_cast<T*>(rm::get(name));
return s;
}
Here, the auto keyword causes the function to deduce which type it's supposed to be dealing with and return the result accordingly. My first guess is that I might have to use templates; but the problem with templates is that I can't limit which types get inserted into the function, and I really REALLY don't want floating-point id numbers, char ids, let alone custom-defined ids. The id is either a string (might change to const char* tbh) or an int, nothing else.
How can I create a generic conversion function like the one described above?
Have you looked at using dynamic_cast? If the conversion fails with dynamic_cast, the pointer will be set to nullptr (this requires Resource to have at least one virtual function). So you could either write overloads for each type, or you could write a template function where you pass the type you want to convert to as well as the string or id, and return true or false depending on whether the conversion succeeds.
template<typename T>
bool Renderer::getResource(T*& type, const std::string &name)
{
type = dynamic_cast<T*>(rm::get(name)); // nullptr if the resource is not a T
return type != nullptr;
}
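A variation on the same idea returns the typed pointer directly, which is closer to the call shape asked for in the question. The sketch below is self-contained, so Resource, ShaderProgram, Texture and ResourceManager are minimal stand-ins rather than the real classes, and get_as is a hypothetical name. Only the result type is a template parameter; the key stays a std::string, so no other id types can sneak in, and a static_assert rejects target types that are not resources.

```cpp
#include <string>
#include <type_traits>
#include <unordered_map>

// Stand-ins for the question's classes (assumed shapes, not the real code).
struct Resource { virtual ~Resource() = default; };
struct ShaderProgram : Resource {};
struct Texture : Resource {};

// Hypothetical minimal manager, keyed by name only.
struct ResourceManager {
    static std::unordered_map<std::string, Resource*>& table() {
        static std::unordered_map<std::string, Resource*> t;
        return t;
    }
    static Resource* get(const std::string& name) {
        auto it = table().find(name);
        return it == table().end() ? nullptr : it->second;
    }
};

// Returns the resource as a T*, or nullptr if it is missing or of another type.
template <typename T>
T* get_as(const std::string& name) {
    static_assert(std::is_base_of<Resource, T>::value,
                  "T must derive from Resource");
    return dynamic_cast<T*>(ResourceManager::get(name));
}
```

A call then reads auto* s = get_as<ShaderProgram>("skyboxShader"); followed by a null check.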
OK, I did not like the idea of typeless storage, but maybe you will find this basic program useful as a starting point. There are a lot of things which must be beautified, but some work must remain :-)
Again: It is a design failure to solve something in that way!
In addition to your example code, this solution provides a minimum of safety by checking the stored type when retrieving the element. But this solution needs RTTI, and this is not available on all platforms.
#include <map>
#include <iostream>
#include <string>
#include <typeinfo>
#include <utility>
class ResourcePointerStorage
{
private:
std::map< const std::string, std::pair<void*, const std::type_info*>> storage;
public:
bool Get(const std::string& id, std::pair<void*, const std::type_info*>& ptr )
{
auto it= storage.find( id );
if ( it==storage.end() ) return false;
ptr= it->second;
return true;
}
bool Put( const std::string& id, void* ptr, const std::type_info* ti)
{
storage[id]=std::make_pair(ptr, ti);
return true; // the original fell off the end of a non-void function
}
};
template < typename T>
bool Get(ResourcePointerStorage& rm, const std::string& id, T** ptr)
{
std::pair<void*, const std::type_info*> p;
if ( rm.Get( id,p ))
{
if ( *p.second != typeid(T)) { return false; }
*ptr= static_cast<T*>(p.first);
return true;
}
else
{
return false;
}
}
template < typename T>
void Put( ResourcePointerStorage& rm, const std::string& id, T *ptr)
{
rm.Put( id, ptr, &typeid(T) );
}
class Car
{
private:
int i;
public:
Car(int _i):i(_i){}
void Print() { std::cout << "A car " << i << std::endl; }
};
class Animal
{
private:
double d;
public:
Animal( double _d):d(_d) {}
void Show() { std::cout << "An animal " << d << std::endl; }
};
int main()
{
ResourcePointerStorage store;
Put( store, "A1", new Animal(1.1) );
Put( store, "A2", new Animal(2.2) );
Put( store, "C1", new Car(3) );
Animal *an;
Car *car;
if ( Get(store, "A1", &an)) { an->Show(); } else { std::cout << "Error" << std::endl; }
if ( Get(store, "A2", &an)) { an->Show(); } else { std::cout << "Error" << std::endl; }
if ( Get(store, "C1", &car)) { car->Print(); } else { std::cout << "Error" << std::endl; }
// not stored object
if ( Get(store, "XX", &an)) { } else { std::cout << "Expected false condition" << std::endl; }
// false type
if ( Get(store, "A1", &car)) { } else { std::cout << "Expected false condition" << std::endl; }
}
I've found the solution to my question. I created a macro:
#define convert(type, func) dynamic_cast<type>(func)
Extremely generic and code-neutral which allows types to be dynamic_casted from the return type of the function. It also allows for doing checks:
if (!convert(ShaderProgram*, rm::get("skyboxShader")))
cerr << "Conversion unsuccessful!" << endl;
else cout << "Conversion successful!" << endl;
I hope my solution will help people who search for similar questions. Thanks all!

Parameter validation C++

I've been thinking of a solution to validate the set of parameters a function/method receives using an object oriented approach. For example, in the following snippet the parameters are checked "manually" before being used.
InstallData::InstallData(std::string appPath, std::string appName,
std::string errMsg) {
if(appPath.empty()) {
#ifndef NDEBUG
std::cout << "Path not specified" << std::endl;
#endif
}
if(appName.empty()) {
#ifndef NDEBUG
std::cout << "Application name not specified" << std::endl;
std::cout << "Defaulting to AppName" << std::endl;
this->appName = "AppName";
#endif
}
if(errMsg.empty()) {
#ifndef NDEBUG
std::cout << "Error message not specified" << std::endl;
std::cout << "Defaulting to Error" << std::endl;
this->errMsg = "Error";
#endif
}
// ... further initialization beyond this point ...
}
As the number of parameters increases so does the size of the validation code. I've thought of a basic approach of checking parameters (strings and pointers) for being empty or null (the aim is to make the code providing functionality more readable).
class Validator {
public:
bool validateStrs(std::vector<std::string> strings, std::vector<std::string> messages, bool quiet);
bool validateStr(std::string str, std::string message, bool quiet);
bool validatePtrs(std::vector<void*> ptrs, std::vector<std::string> messages, bool quiet);
bool validatePtr(void* ptr, std::string message, bool quiet);
};
The validation methods validateStrs and validatePtrs check whether each element of the first array is empty or null and display a message from the second array (there is a one-to-one relationship between the elements of the first array and the second) if the quiet flag is not set.
In my implementation this looks like:
InstallData::InstallData(std::string appPath, std::string appName,
std::string errMsg, std::string errTitle) {
// Initialize string container
std::vector<std::string> strings;
strings.push_back(appPath);
strings.push_back(appName);
strings.push_back(errMsg);
strings.push_back(errTitle);
// Initialize name container
std::vector<std::string> names;
names.push_back("ApplicationPath");
names.push_back("ApplicationName");
names.push_back("ErrorMessage");
names.push_back("ErrorTitle");
boost::shared_ptr<Validator> valid(new Validator());
bool result = true;
#ifndef NDEBUG
result = valid->validateStrs(strings, names, false);
#else
result = valid->validateStrs(strings, names, true);
#endif
if(result){
this->appPath = appPath;
this->appName = appName;
this->errMsg = errMsg;
this->errTitle = errTitle;
} else {
std::exit(0);
}
}
The messages can also be placed in a separate file thus making the method body cleaner.
Numeric value range checkers can also be implemented similarly. This approach, however, doesn't consider dependencies between parameters.
Is there a more elegant solution of implementing a parameter validation mechanism, possibly using templates?
A more elegant way is not to use standard types for parameters but to define specific classes that check parameters on construction. Something like
class InvalidAppPath {};
class AppPath {
public:
AppPath(const std::string & appPath) : path(appPath) {
if ( appPath.empty() ) throw InvalidAppPath();
}
operator std::string() const { return path; }
private:
std::string path;
};
This would also make it easier to ensure that an AppPath is checked for validity only on construction and possibly on modification.
These slides from a presentation by Ric Parkin at the 2007 ACCU Conference explore the idea in greater detail.
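As a rough illustration of how such a checked type changes the consuming code, here is a self-contained sketch. It departs from the snippet above in a few labeled ways: InvalidAppPath derives from std::invalid_argument, the constructor is explicit, and access goes through a str() accessor instead of a conversion operator. The InstallData shown is a hypothetical, cut-down stand-in for the class in the question.

```cpp
#include <stdexcept>
#include <string>
#include <utility>

// Thrown for an empty path. Deriving from a standard exception is a choice
// made for this sketch; the answer's InvalidAppPath is an empty class.
struct InvalidAppPath : std::invalid_argument {
    InvalidAppPath() : std::invalid_argument("application path is empty") {}
};

// A path that is validated once, on construction.
class AppPath {
public:
    explicit AppPath(std::string appPath) : path_(std::move(appPath)) {
        if (path_.empty()) throw InvalidAppPath();
    }
    const std::string& str() const { return path_; }
private:
    std::string path_;
};

// Hypothetical consumer: an AppPath that reaches this constructor is already
// valid, so the body needs no checks of its own.
class InstallData {
public:
    explicit InstallData(AppPath appPath) : appPath_(std::move(appPath)) {}
    const std::string& appPath() const { return appPath_.str(); }
private:
    AppPath appPath_;
};
```

The validation code no longer grows with the number of parameters inside InstallData; each parameter type carries its own check, and an invalid value fails at the call site that tried to create it.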
Perhaps you would find it easier to leverage function name overloading and variadic templates. You can group the parameter information you want to validate along with the corrective action together in a std::tuple. I implemented a small demo of this idea on IDEONE.
bool validate (std::string s) { return !s.empty(); }
bool validate (const void *p) { return p; }
template <typename Tuple>
bool validate (Tuple param) {
if (validate(std::get<0>(param))) return true;
#ifndef NDEBUG
std::cout << "Invalid: " << std::get<1>(param) << std::endl;
std::get<2>(param)();
#endif
return false;
}
bool validate () { return true; }
template <typename T, typename... Params>
bool validate (T param, Params... params) {
return validate(param) & validate(params...);
}
Then, you could use it like:
bool result
= validate(
std::make_tuple(appPath, "ApplicationPath",
[&](){ appPath = "defaultPath"; }),
std::make_tuple(appName, "ApplicationName",
[&](){ appName = "defaultName"; })
//...
);

Trouble implementing a line-by-line file parser in C++

I have trouble implementing a simple file parser in C++11 which reads a file line by line and tokenizes the line. It should properly manage its resources. Usage of the parser should be like:
Parser parser;
parser.open("/path/to/file");
std::pair<int, int> header = parser.getHeader(); // std::pair needs two type arguments
while (parser.hasNext()) {
std::vector<int> tokens = parser.getNext();
}
parser.close();
So the Parser class needs one member std::ifstream file (or std::ifstream* file?)
1) How should the constructor initialize this->file?
2) How should the open method set this->file to the input file?
3) How should the next line from the file get loaded into a string?
(Is this what you would use: std::getline(this->file, line)) ?
Can you give some advice? Ideally, could you sketch out the class as a code example.
Since the Parser is probably in a pretty useless state once you've constructed it and before you've opened the file, I would suggest having your use case look something like this:
Parser parser("/path/to/file");
std::pair<int, int> header = parser.getHeader(); // std::pair needs two type arguments
while (parser.hasNext()) {
std::vector<int> tokens = parser.getNext();
}
parser.close();
In which case, you should use the constructor's member initialization list to initialise the file member (which, yes, should be of type std::ifstream):
Parser::Parser(std::string file_name)
: file(file_name)
{
// ...
}
If you kept the constructor and open member function separate, you could just leave the constructor as default because the file member will be default constructed giving you a file stream that is not associated with any file. You would then get Parser::open to forward the file name to std::ifstream::open, like so:
void Parser::open(std::string file_name)
{
file.open(file_name);
}
Then, yes, to read lines from the file, you want to use something similar to this:
std::string line;
while (std::getline(file, line)) {
// Do something with line
}
Good job on not falling into the trap of doing while (!file.eof()).
It can be designed in many ways.
You may ask the user to provide you a stream instead of specifying a filename.
That will be more generic and will work in all streams.
That way you would have a std::istream& member variable; you could use a pointer instead, but then you need to write *_stream >> ... to invoke any operator.
If you take a file name, you may construct a stream in your constructor and close it, if open, in the destructor.
Actually, there is an alternative to feeding the name of the file to Parser: you could feed it a std::istream. What's interesting in this is that this way any derived class of std::istream can be used, and thus you could feed it, for example, a std::istringstream, which makes it easier to write unit-tests.
class Parser {
public:
explicit Parser(std::istream& is);
/**/
private:
std::istream& _stream;
/**/
};
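To make the unit-testing point concrete, here is a small sketch of a parser built on a std::istream& and exercised with a std::istringstream. LineParser and next() are names made up for this example, and the token format (whitespace-separated integers on each line) is an assumption, since the question does not specify it.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

class LineParser {
public:
    explicit LineParser(std::istream& is) : stream_(is) {}

    // Reads the next line and splits it into ints.
    // Returns false once the input is exhausted; tokens is then left untouched.
    bool next(std::vector<int>& tokens) {
        std::string line;
        if (!std::getline(stream_, line)) {
            return false;
        }
        tokens.clear();
        std::istringstream fields(line);
        for (int value; fields >> value; ) {
            tokens.push_back(value);
        }
        return true;
    }

private:
    std::istream& stream_;  // not owned; the caller keeps the stream alive
};
```

In a test, std::istringstream input("1 2 3\n40 50\n"); LineParser parser(input); is enough to drive it; in production code the same class accepts a std::ifstream.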
Next, comes iteration. It is not idiomatic in C++ to have a has followed by a get. std::istream supports iteration (with an input iterator), you could perfectly design your parser so it does too. This way you will have the benefit of compatibility with many STL algorithms.
class ParserIterator:
public std::iterator< std::input_iterator_tag, std::vector<int> >
{
public:
ParserIterator(): _stream(nullptr) {} // end
ParserIterator(std::istream& is): _stream(&is) { this->advance(); }
// Accessors
std::vector<int> const& operator*() const { return _vec; }
std::vector<int> const* operator->() const { return &_vec; }
bool equals(ParserIterator const& other) const {
if (_stream != other._stream) { return false; }
if (_stream == nullptr) { return true; }
return false;
}
// Modifiers
ParserIterator& operator++() { this->advance(); return *this; }
ParserIterator operator++(int) {
ParserIterator tmp(*this);
this->advance();
return tmp;
}
private:
void advance() {
assert(_stream && "cannot advance an end iterator");
_vec.clear();
std::string buffer;
if (not getline(*_stream, buffer)) {
_stream = nullptr; // end of story
return; // nothing to parse
}
// parse here
}
std::istream* _stream;
std::vector<int> _vec;
}; // class ParserIterator
inline bool operator==(ParserIterator const& left, ParserIterator const& right) {
return left.equals(right);
}
inline bool operator!= (ParserIterator const& left, ParserIterator const& right) {
return not left.equals(right);
}
And with that we can augment our parser:
ParserIterator Parser::begin() const {
return ParserIterator(_stream);
}
ParserIterator Parser::end() const {
return ParserIterator();
}
I'll leave the getHeader method and the actual parsing content to you ;)

How do I iterate over cin line by line in C++?

I want to iterate over std::cin, line by line, addressing each line as a std::string. Which is better:
string line;
while (getline(cin, line))
{
// process line
}
or
for (string line; getline(cin, line); )
{
// process line
}
? What is the normal way to do this?
Since UncleBen brought up his LineInputIterator, I thought I'd add a couple more alternative methods. First up, a really simple class that acts as a string proxy:
class line {
std::string data;
public:
friend std::istream &operator>>(std::istream &is, line &l) {
std::getline(is, l.data);
return is;
}
operator std::string() const { return data; }
};
With this, you'd still read using a normal istream_iterator. For example, to read all the lines in a file into a vector of strings, you could use something like:
std::vector<std::string> lines;
std::copy(std::istream_iterator<line>(std::cin),
std::istream_iterator<line>(),
std::back_inserter(lines));
The crucial point is that when you're reading something, you specify a line -- but otherwise, you just have strings.
Another possibility uses a part of the standard library most people barely even know exists, not to mention being of much real use. When you read a string using operator>>, the stream returns a string of characters up to whatever that stream's locale says is a white space character. Especially if you're doing a lot of work that's all line-oriented, it can be convenient to create a locale with a ctype facet that only classifies new-line as white-space:
struct line_reader: std::ctype<char> {
line_reader(): std::ctype<char>(get_table()) {}
static std::ctype_base::mask const* get_table() {
static std::vector<std::ctype_base::mask>
rc(table_size, std::ctype_base::mask());
rc['\n'] = std::ctype_base::space;
return &rc[0];
}
};
To use this, you imbue the stream you're going to read from with a locale using that facet, then just read strings normally, and operator>> for a string always reads a whole line. For example, if we wanted to read in lines, and write out unique lines in sorted order, we could use code like this:
int main() {
std::set<std::string> lines;
// Tell the stream to use our facet, so only '\n' is treated as a space.
std::cin.imbue(std::locale(std::locale(), new line_reader()));
std::copy(std::istream_iterator<std::string>(std::cin),
std::istream_iterator<std::string>(),
std::inserter(lines, lines.end()));
std::copy(lines.begin(), lines.end(),
std::ostream_iterator<std::string>(std::cout, "\n"));
return 0;
}
Keep in mind that this affects all input from the stream. Using this pretty much rules out mixing line-oriented input with other input (e.g. reading a number from the stream using stream>>my_integer would normally fail).
What I have (written as an exercise, but perhaps turns out useful one day), is LineInputIterator:
#ifndef UB_LINEINPUT_ITERATOR_H
#define UB_LINEINPUT_ITERATOR_H
#include <iterator>
#include <istream>
#include <string>
#include <cassert>
namespace ub {
template <class StringT = std::string>
class LineInputIterator :
public std::iterator<std::input_iterator_tag, StringT, std::ptrdiff_t, const StringT*, const StringT&>
{
public:
typedef typename StringT::value_type char_type;
typedef typename StringT::traits_type traits_type;
typedef std::basic_istream<char_type, traits_type> istream_type;
LineInputIterator(): is(0) {}
// Read the first line right away; otherwise *begin would yield an empty string
LineInputIterator(istream_type& is): is(&is) { ++*this; }
const StringT& operator*() const { return value; }
const StringT* operator->() const { return &value; }
LineInputIterator<StringT>& operator++()
{
assert(is != NULL);
if (is && !getline(*is, value)) {
is = NULL;
}
return *this;
}
LineInputIterator<StringT> operator++(int)
{
LineInputIterator<StringT> prev(*this);
++*this;
return prev;
}
bool operator!=(const LineInputIterator<StringT>& other) const
{
return is != other.is;
}
bool operator==(const LineInputIterator<StringT>& other) const
{
return !(*this != other);
}
private:
istream_type* is;
StringT value;
};
} // end ub
#endif
So your loop could be replaced with an algorithm (another recommended practice in C++):
for_each(LineInputIterator<>(cin), LineInputIterator<>(), do_stuff);
Perhaps a common task is to store every line in a container:
vector<string> lines((LineInputIterator<>(stream)), LineInputIterator<>());
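For comparison, the same "store every line" task with the plain getline loop from the question fits in a few lines as well. The function name read_lines is made up for this sketch; it takes any std::istream, so it works for std::cin, files and string streams alike.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Collects every line of the stream, including empty ones.
std::vector<std::string> read_lines(std::istream& in) {
    std::vector<std::string> lines;
    for (std::string line; std::getline(in, line); ) {
        lines.push_back(line);
    }
    return lines;
}
```

Called as read_lines(std::cin), it returns once the input reaches end-of-file.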
The first one.
Both do the same, but the first one is much more readable, plus you get to keep the string variable after the loop is done (in the 2nd option, it's enclosed in the for loop scope).
Go with the while statement.
See Chapter 16.2 (specifically pages 374 and 375) of Code Complete 2 by Steve McConnell.
To quote:
Don't use a for loop when a while loop is more appropriate. A common abuse of the flexible for loop structure in C++, C# and Java is haphazardly cramming the contents of a while loop into a for loop header.
C++ Example of a while loop abusively Crammed into a for Loop Header
for (inputFile.MoveToStart(), recordCount = 0; !inputFile.EndOfFile(); recordCount++) {
inputFile.GetRecord();
}
C++ Example of appropriate use of a while loop
inputFile.MoveToStart();
recordCount = 0;
while (!inputFile.EndOfFile()) {
inputFile.GetRecord();
recordCount++;
}
I've omitted some parts in the middle but hopefully that gives you a good idea.
This is based on Jerry Coffin's answer. I wanted to show C++20's std::ranges::istream_view. I also added a line number to the class. I did this on godbolt, so I could see what happened. This version of the line class still works with std::input_iterator.
https://en.cppreference.com/w/cpp/ranges/basic_istream_view
https://www.godbolt.org/z/94Khjz
class line {
std::string data{};
std::intmax_t line_number{-1};
public:
friend std::istream &operator>>(std::istream &is, line &l) {
std::getline(is, l.data);
++l.line_number;
return is;
}
explicit operator std::string() const { return data; }
explicit operator std::string_view() const noexcept { return data; }
constexpr explicit operator std::intmax_t() const noexcept { return line_number; }
};
int main()
{
std::string l("a\nb\nc\nd\ne\nf\ng");
std::stringstream ss(l);
for(const auto & x : std::ranges::istream_view<line>(ss))
{
std::cout << std::intmax_t(x) << " " << std::string_view(x) << std::endl;
}
}
prints out:
0 a
1 b
2 c
3 d
4 e
5 f
6 g