I'm implementing a very simple serialization library, to save/restore user_info object, which contains an int and a std::string, here's my code:
#include <iostream>
#include <sstream>
using namespace std;
struct user_info {
int id;
string name;
};
// for any POD type like int, read sizeof(int) from the stream
template <class stream_t, class T>
void de_serialize(stream_t& stream, T& x) {
stream.read((char*)&x, sizeof(T));
}
// for std::string, read length(int) first, then read string content when length>0
template <class stream_t>
void de_serialize(stream_t& stream, std::string& str) {
int len;
de_serialize(stream, len);
str.resize(len);
char x;
for(int i=0; i<len; i++) {
de_serialize(stream, x);
str[i] = x;
}
}
// read from stream, fill id and name
template <typename stream_t>
void de_serialize(stream_t& ss, user_info& ui) {
de_serialize(ss, ui.id);
de_serialize(ss, ui.name);
}
int main() {
// read from file, but here I use a 8-bytes-content represents the file content
stringstream ifs;
// two int, \x1\x1\x1\x1 for id, \x0\x0\x0\x0 for name
ifs.write("\x1\x1\x1\x1\x0\x0\x0\x0", 8);
while(!ifs.eof()) {
user_info ui;
de_serialize(ifs, ui);
// when first time goes here, the stream should've read 8 bytes and reaches eof,
// then break the while loop, but it doesn't
// so the second time it calls de_serialize, the program crashes
}
}
the code illustrates the part of de-serializing. The while loop is supposed to run once, and the stream reaches eof, but why it doesn't stop looping? The second loop causes a crash.
Please refer to the comments, thanks.
The eof() flag will never be set if the streem develops an error condition. It is usually wrong to loop on eof(). What I would do here is change the return type from your de_serialize() function to return the stream and then rewrite your loop around the de_serialize() function
Like this:
#include <iostream>
#include <sstream>
using namespace std;
struct user_info {
int id;
string name;
};
// for any POD type like int, read sizeof(int) from the stream
template <class stream_t, class T>
stream_t& de_serialize(stream_t& stream, T& x) {
stream.read((char*)&x, sizeof(T));
return stream; // return the stream
}
// for std::string, read length(int) first, then read string content when length>0
template <class stream_t>
stream_t& de_serialize(stream_t& stream, std::string& str) {
int len;
de_serialize(stream, len);
str.resize(len);
char x;
for(int i=0; i<len; i++) {
de_serialize(stream, x);
str[i] = x;
}
return stream; // return the stream
}
// read from stream, fill id and name
template <typename stream_t>
stream_t& de_serialize(stream_t& ss, user_info& ui) {
de_serialize(ss, ui.id);
de_serialize(ss, ui.name);
return ss; // return the stream
}
int main() {
// read from file, but here I use a 8-bytes-content represents the file content
stringstream ifs;
// two int, \x1\x1\x1\x1 for id, \x0\x0\x0\x0 for name
ifs.write("\x1\x1\x1\x1\x0\x0\x0\x0", 8);
user_info ui;
while(de_serialize(ifs, ui)) { // loop on de_serialize()
// Now you know ui was correctly read from the stream
// when first time goes here, the stream should've read 8 bytes and reaches eof,
// then break the while loop, but it doesn't
// so the second time it calls de_serialize, the program crashes
}
}
Related
I want to take input from an istream from either a std::ifstream or std::cin depending on some condition.
As far as I can get it working, I had to use a raw std::istream* pointer:
int main(int argc, char const* argv[]) {
std::istream *in_stream;
std::ifstream file;
if (argc > 1) {
std::string filename = argv[1];
file.open(filename);
if (not file.is_open()) {
return -1;
}
in_stream = &file;
} else {
in_stream = &std::cin;
}
// do stuff with the input, regardless of the source...
}
Is there a way to rewrite the above using smart pointers?
This may be a question about ownership: How does one wrap and pass around two different std::istream instances, one of which is owned and needs disposal (which calls close() upon destruction) and ownership transfers, whereas the other one is not owned by user code (e.g. std::cin) and cannot be deleted. The solution is polymorphism: Create an interface like
struct StreamWrapper {
virtual std::istream &stream() = 0;
virtual ~StreamWrapper() = default;
};
and two implementations, e.g. OwnedStreamWrapper and UnownedStreamWrapper. The former can contain a std::istream directly (which is moveable, but not copyable) and the latter can contain (e.g.) a std::istream&, i.e. have no ownership over the referenced stream instance.
struct OwnedStreamWrapper : public StreamWrapper {
template <typename... A>
OwnedStreamWrapper(A &&...a) : stream_{std::forward<A>(a)...} {}
std::istream &stream() override { return stream_; }
private:
std::ifstream stream_;
};
struct UnownedStreamWrapper : public StreamWrapper {
UnownedStreamWrapper(std::istream &stream) : stream_{stream} {}
std::istream &stream() override { return stream_; }
private:
std::istream &stream_;
};
Apart from the stream ownership, wrapper ownership matters as well and can be used either to handle the wrappers’ (and streams’) lifespan automatically on the stack, or to allow a wrapper’s lifespan to exceed that of the current stack frame. In the example below, the first ReadWrapper() takes ownership of the wrapper and deletes it automatically whereas the second ReadWrapper() does not take ownership. In both cases polymorphism (i.e. the implementation of StreamWrapper) determines how the underlying stream is handled, i.e. whether it outlives its wrapper (like std::cin should) or dies together with it (like a manually instantiated stream should).
void ReadWrite(std::istream &in, std::ostream &out) {
std::array<char, 1024> buffer;
while (in) {
in.read(buffer.data(), buffer.size());
out.write(buffer.data(), in.gcount());
}
}
void ReadWrapper(std::unique_ptr<StreamWrapper> stream) {
ReadWrite(stream->stream(), std::cout);
}
void ReadWrapper(StreamWrapper &stream) {
ReadWrite(stream.stream(), std::cout);
}
Below are all four combinations of { owned | unowned } { streams | stream wrappers } in a runnable example:
#include <array>
#include <fstream>
#include <iostream>
#include <memory>
#include <utility>
namespace {
/********** The three snippets above go here! **********/
} // namespace
int main() {
OwnedStreamWrapper stream1{"/proc/cpuinfo"};
std::unique_ptr<StreamWrapper> stream2{
std::make_unique<OwnedStreamWrapper>("/proc/cpuinfo")};
std::ifstream stream3_file{"/proc/cpuinfo"}; // Let’s pretend it is unowned.
UnownedStreamWrapper stream3{stream3_file};
std::ifstream stream4_file{"/proc/cpuinfo"}; // Let’s pretend it is unowned.
std::unique_ptr<StreamWrapper> stream4{
std::make_unique<UnownedStreamWrapper>(stream4_file)};
ReadWrapper(stream1); // owned stream, wrapper kept
ReadWrapper(std::move(stream2)); // owned stream, wrapper transferred
ReadWrapper(stream3); // unowned stream, wrapper kept
ReadWrapper(std::move(stream4)); // unowned stream, wrapper transferred
}
A smart pointer is kind of an awkward fit for this situation, because it requires a dynamic allocation (which is probably not necessary in this case)
1. Do your processing in another function
As #BoP pointed out in the comments a nice way to handle this could be to use another function that gets passed the appropriate input stream:
godbolt
int do_thing(std::istream& in_stream) {
// do stuff with the input, regardless of the source...
int foobar;
in_stream >> foobar;
return 0;
}
int main(int argc, char const* argv[]) {
std::ofstream{"/tmp/foobar"} << 1234;
if(argc > 1) {
std::ifstream file{argv[1]};
if(!file) return -1;
return do_thing(file);
} else {
return do_thing(std::cin);
}
}
2. Switch the stream buffer
If you won't be using std::cin at all if a filename is passed you could change the streambuffer of std::cin to the one of the file with rdbuf() - that way you can use std::cin in both cases.
godbolt
std::ifstream file;
std::streambuf* oldbuf = std::cin.rdbuf();
if (argc > 1) {
file.open(argv[1]);
if (!file) return -1;
// switch buffer of std::cin
std::cin.rdbuf(file.rdbuf());
}
// do stuff with the input, regardless of the source...
int foobar;
std::cin >> foobar; // this will reader either from std::cin or the file, depending on the current streambuffer assigned to std::cin
// restore old stream buffer
std::cin.rdbuf(oldbuf);
3. Using std::unique_ptr / std::shared_ptr with a custom deleter
The easiest solution if you still want to use smart pointers would be to use std::unique_ptr / std::shared_ptr with a custom deleter that handles the std::cin case:
godbolt
// deletes the stream only if it is not std::cin
struct istream_ptr_deleter {
void operator()(std::istream* stream) {
if(&std::cin == stream) return;
delete stream;
}
};
using istream_ptr = std::unique_ptr<std::istream, istream_ptr_deleter>;
// Example Usage:
istream_ptr in_stream;
if (argc > 1) {
in_stream.reset(new std::ifstream(argv[1]));
if (!*in_stream) return -1;
} else {
in_stream.reset(&std::cin);
}
int foobar;
*in_stream >> foobar;
I have currently a function
int myfun(const int a) {
...
return rval;
}
that performs several actions.
I mean to adapt it to write debug information on its behaviour or not according to some parameter that I can pass.
In the cases I want to write that info, I also want to pass the ofstream to use.
And I want applications that were using myfun to still work with no modifications.
So I would ideally change to
int myfun(const int a, ofstream & ofs) {
...
if (!(ofs == NULL)) {
ofs << ...
}
...
if (!(ofs == NULL)) {
ofs << ...
}
return rval;
}
with a default value similar to &ofs=NULL. I know NULL is not appropriate.
What is an appropriate way of handling this?
Note 1:
I could pass an optional parameter with the output file name, but that is less flexible.
Note 2:
I could also change to
int myfun(const int a, const bool debug_info, ofstream & ofs) {
...
if (debug_info) {
ofs << ...
}
with a default value debug_info=false.
I guess this still requires a default value for ofs as above.
Note 3:
The accepted answer in Default ofstream class argument in function proposes an overload without the ofs parameter.
In my case that may not work, as I mean to not write anything when "ofs=NULL".
Note 4:
This other answer apparently works, but it seems somewhat contrived to me, and I am not sure it provides all the same functionality as with passing an ofs.
Related:
Is there a null std::ostream implementation in C++ or libraries?
I want applications that were using myfun to still work with no modifications.
If so, use an ofs with default nullptr
int myfun(const int a, ofstream *ofs = nullptr)
{
if (ofs != nullptr)
{
// (*ofs) << ... ;
}
// ...
}
You can't use a reference parameter ofstream& ofs for such function because a reference cannot be null.
Make an abstract Logger class. It has a method for logging a message. In derived classes you can add logging to file (ofstream) or simply do nothing. You can use any logger, the implementation of myfun() stays the same.
#include <fstream>
class Logger {
public:
virtual void log(const char*) = 0;
};
class NullLogger: public Logger {
public:
void log(const char*) override {};
};
class FileLogger: public Logger {
public:
FileLogger(std::ofstream& s): ofs(s){}
void log(const char* msg) override {
ofs << msg;
}
private:
std::ofstream& ofs;
};
static NullLogger defaultLogger;
int myfun(const int a, Logger& logger=defaultLogger)
{
logger.log("hello");
// ...
logger.log("asdf");
}
int main(){
std::ofstream ofs;
FileLogger fileLogger(ofs);
NullLogger nullLogger;
myfun(10,fileLogger); // logs to file
myfun(10,nullLogger); // logs nothing
myfun(10); // also logs nothing
return 0;
}
In C++17 there is a solution involving std::optional but since it requires default constructible types, std::reference_wrapper has to be used too.
#include <fstream>
#include <optional>
#include <functional>
int myfun(const int a, std::optional<std::reference_wrapper<std::ofstream>> ofs)
{
if (ofs) {
ofs->get() << "...";
return 1;
}
else{
return 0;
}
}
#include <iostream>
int main(){
std::ofstream file;
//Calling is quite nice.
std::cout<<myfun(10,{file})<<'\n'; //Prints 1
std::cout<<myfun(10,{})<<'\n'; //Prints 0
}
The downside of this solution, although idiomatic, is being verbose and heavy on the syntax in some cases.
Class A contains a template function which reads an object into a string by first passing it to a stringstream and then converting the stream to a string.
For the life of me, I don't know why these objects aren't passed to the stream. What could be the problem? What could I try to resolve this?
class A
{
public:
template<class T>
void Input(const T&);
private:
std::string result;
};
template<class T>
void A::Input(const T& obj)
{
//Pass object to the stream
std::ostringstream ss;
ss << obj;
//After this line, result == "" with size 1 (??), no matter the input
result = ss.str();
int test = 1234;
ss << test;
//Even with a test int, result == "" with size 1 (???)
result = ss.str();
}
EDIT: I'm using Visual Studio Pro 2012.
This is how I usually read files with std::ifstream:
while (InFile.peek() != EOF)
{
char Character = InFile.get();
// Do stuff with Character...
}
This avoids the need of an if statement inside the loop. However, it seems that even peek() causes eofbit to be set, which makes calling clear() necessary if I plan on using that same stream later.
Is there a cleaner way to do this?
Typically, you would just use
char x;
while(file >> x) {
// do something with x
}
// now clear file if you want
If you forget to clear(), then use an RAII scope-based class.
Edit: Given a little more information, I'd just say
class FileReader {
std::stringstream str;
public:
FileReader(std::string filename) {
std::ifstream file(filename);
file >> str.rdbuf();
}
std::stringstream Contents() {
return str;
}
};
Now you can just get a copy and not have to clear() the stream every time. Or you could have a self-clearing reference.
template<typename T> class SelfClearingReference {
T* t;
public:
SelfClearingReference(T& tref)
: t(&tref) {}
~SelfClearingReference() {
tref->clear();
}
template<typename Operand> T& operator>>(Operand& op) {
return *t >> op;
}
};
I'm not sure I understand. Infile.peek() only sets eofbit
when it returns EOF. And if it returns EOF, and later read
is bound to fail; the fact that it sets eofbit is an
optimization, more than anything else.
I want to iterate over std::cin, line by line, addressing each line as a std::string. Which is better:
string line;
while (getline(cin, line))
{
// process line
}
or
for (string line; getline(cin, line); )
{
// process line
}
? What is the normal way to do this?
Since UncleBen brought up his LineInputIterator, I thought I'd add a couple more alternative methods. First up, a really simple class that acts as a string proxy:
class line {
std::string data;
public:
friend std::istream &operator>>(std::istream &is, line &l) {
std::getline(is, l.data);
return is;
}
operator std::string() const { return data; }
};
With this, you'd still read using a normal istream_iterator. For example, to read all the lines in a file into a vector of strings, you could use something like:
std::vector<std::string> lines;
std::copy(std::istream_iterator<line>(std::cin),
std::istream_iterator<line>(),
std::back_inserter(lines));
The crucial point is that when you're reading something, you specify a line -- but otherwise, you just have strings.
Another possibility uses a part of the standard library most people barely even know exists, not to mention being of much real use. When you read a string using operator>>, the stream returns a string of characters up to whatever that stream's locale says is a white space character. Especially if you're doing a lot of work that's all line-oriented, it can be convenient to create a locale with a ctype facet that only classifies new-line as white-space:
struct line_reader: std::ctype<char> {
line_reader(): std::ctype<char>(get_table()) {}
static std::ctype_base::mask const* get_table() {
static std::vector<std::ctype_base::mask>
rc(table_size, std::ctype_base::mask());
rc['\n'] = std::ctype_base::space;
return &rc[0];
}
};
To use this, you imbue the stream you're going to read from with a locale using that facet, then just read strings normally, and operator>> for a string always reads a whole line. For example, if we wanted to read in lines, and write out unique lines in sorted order, we could use code like this:
int main() {
std::set<std::string> lines;
// Tell the stream to use our facet, so only '\n' is treated as a space.
std::cin.imbue(std::locale(std::locale(), new line_reader()));
std::copy(std::istream_iterator<std::string>(std::cin),
std::istream_iterator<std::string>(),
std::inserter(lines, lines.end()));
std::copy(lines.begin(), lines.end(),
std::ostream_iterator<std::string>(std::cout, "\n"));
return 0;
}
Keep in mind that this affects all input from the stream. Using this pretty much rules out mixing line-oriented input with other input (e.g. reading a number from the stream using stream>>my_integer would normally fail).
What I have (written as an exercise, but perhaps turns out useful one day), is LineInputIterator:
#ifndef UB_LINEINPUT_ITERATOR_H
#define UB_LINEINPUT_ITERATOR_H
#include <iterator>
#include <istream>
#include <string>
#include <cassert>
namespace ub {
template <class StringT = std::string>
class LineInputIterator :
public std::iterator<std::input_iterator_tag, StringT, std::ptrdiff_t, const StringT*, const StringT&>
{
public:
typedef typename StringT::value_type char_type;
typedef typename StringT::traits_type traits_type;
typedef std::basic_istream<char_type, traits_type> istream_type;
LineInputIterator(): is(0) {}
LineInputIterator(istream_type& is): is(&is) {}
const StringT& operator*() const { return value; }
const StringT* operator->() const { return &value; }
LineInputIterator<StringT>& operator++()
{
assert(is != NULL);
if (is && !getline(*is, value)) {
is = NULL;
}
return *this;
}
LineInputIterator<StringT> operator++(int)
{
LineInputIterator<StringT> prev(*this);
++*this;
return prev;
}
bool operator!=(const LineInputIterator<StringT>& other) const
{
return is != other.is;
}
bool operator==(const LineInputIterator<StringT>& other) const
{
return !(*this != other);
}
private:
istream_type* is;
StringT value;
};
} // end ub
#endif
So your loop could be replaced with an algorithm (another recommended practice in C++):
for_each(LineInputIterator<>(cin), LineInputIterator<>(), do_stuff);
Perhaps a common task is to store every line in a container:
vector<string> lines((LineInputIterator<>(stream)), LineInputIterator<>());
The first one.
Both do the same, but the first one is much more readable, plus you get to keep the string variable after the loop is done (in the 2nd option, its enclosed in the for loop scope)
Go with the while statement.
See Chapter 16.2 (specifically pages 374 and 375) of Code Complete 2 by Steve McConell.
To quote:
Don't use a for loop when a while loop is more appropriate. A common abuse of the flexible for loop structure in C++, C# and Java is haphazardly cramming the contents of a while loop into a for loop header.
.
C++ Example of a while loop abusively Crammed into a for Loop Header
for (inputFile.MoveToStart(), recordCount = 0; !inputFile.EndOfFile(); recordCount++) {
inputFile.GetRecord();
}
C++ Example of appropriate use of a while loop
inputFile.MoveToStart();
recordCount = 0;
while (!InputFile.EndOfFile()) {
inputFile.getRecord();
recordCount++;
}
I've omitted some parts in the middle but hopefully that gives you a good idea.
This is based on Jerry Coffin's answer. I wanted to show c++20's std::ranges::istream_view. I also added a line number to the class. I did this on godbolt, so I could see what happened. This version of the line class still works with std::input_iterator.
https://en.cppreference.com/w/cpp/ranges/basic_istream_view
https://www.godbolt.org/z/94Khjz
class line {
std::string data{};
std::intmax_t line_number{-1};
public:
friend std::istream &operator>>(std::istream &is, line &l) {
std::getline(is, l.data);
++l.line_number;
return is;
}
explicit operator std::string() const { return data; }
explicit operator std::string_view() const noexcept { return data; }
constexpr explicit operator std::intmax_t() const noexcept { return line_number; }
};
int main()
{
std::string l("a\nb\nc\nd\ne\nf\ng");
std::stringstream ss(l);
for(const auto & x : std::ranges::istream_view<line>(ss))
{
std::cout << std::intmax_t(x) << " " << std::string_view(x) << std::endl;
}
}
prints out:
0 a
1 b
2 c
3 d
4 e
5 f
6 g