How do I create my own ostream/streambuf? - c++

For educational purposes I want to create a ostream and stream buffer to do:
fix endians when doing << myVar;
store in a deque container instead of using std:cout or writing to a file
log extra data, such as how many times I did <<, how many times I did .write, the amount of bytes I written and how many times I flush(). But I do not need all the info.
I tried overloading but failed horribly. I tried overloading write by doing
ostream& write( const char* s, streamsize n )
in my basic_stringstream2 class (I copied paste basic_stringstream into my cpp file and modified it) but the code kept using basic_ostream. I looked through code and it looks like I need to overload xsputn (which isn't mention on this page http://www.cplusplus.com/reference/iostream/ostream ) but what else do I need to overload? and how do I construct my class (what does it need to inherit, etc)?

The canonical approach consists in defining your own streambuf.
You should have a look at:
Angelika LAnger's articles on IOStreams derivation
James Kanze's articles on filtering streambufs
boost.iostream for examples of application

For A+C) I think you should look at facets, they modify how objects are written as characters. You could store statistics here as well on how many times you streamed your objects.
Check out How to format my own objects when using STL streams? for an example.
For B) You need to create your own streambuf and connect your ostream to that buffer (constructor argument). See Luc's links + Deriving new streambuf classes.
In short you need to implement this for an ostream (minimum):
overflow (put a single char or flush buffer) (link)
xsputn (put a char array to buffer)(link)
sync (link)

I'm not sure that what you want to do is possible. The << operators are not virtual. So you could define yourstream &operator << (yourstream &strm, int i) to do what you want with the endian conversion and counting, and it will work when your code calls it directly. But if you pass a yourstream object into a function that expects an ostream, any time that function calls <<, it will go to the original ostream version instead of yours.
As I understand it, the streams facilities have been set up so that you can "easily" define a new stream type which uses a different sort of buffer (like, say, a deque of chars), and you can very easily add support for outputting your own classes via <<. I don't think you are intended to be able to redefine the middle layer between those.
And particularly, the entire point of the << interface is to provide nicely formatted text output, while it sounds like you actually want binary output. (Otherwise the reference to "endian" makes no sense.) Even assuming there is some way of doing this I don't know, it will produce awkward binary output at best. For instance, consider the end user overload to output a point in 3D space. The end user version of << will probably do something like << '(' << x << ", " << y << ", " << z << ')'. That will look nice in a text stream, but it's a lot of wasted and completely useless characters in a binary stream, which would ideally just use << x << y << z. (And how many calls to << should those count as?)

Related

C++ When should I std::ctype<char>::widen()?

Apparently, writing a single character of type char to a stream whose char type is char is guaranteed by the standard to not invoke ctype<char>.widen() on the associated locale.
On the other hand, according to my reading of the standard (C++17), when writing a string of chars (const char*) instead of a single char, ctype<char>.widen() must be invoked.
I am struggling to understand how to make sense of this.
On one hand, the fact, that widen() is required when writing strings, suggests that there are valid scenarios where widen() has an effect. But if that is the case, then how can it be alright to omit the widening operation when writing single characters?
It seems to me that there must be an intended difference in the roles (domains of applicability) of the two operations, output of single char (char) and output of string (const char*), but I do not see what it is.
To make things more concrete, let us say that I wanted to implement an output operator for a range object, and have the output be on the form 0->2. My first inkling would be something like this:
std::ostream& operator<<(std::ostream& out, const Range& range)
{
// ...
out << "->"; // Invokes widen()
// ...
}
But, is this how I am supposed to do it? Or would out << '-' << '>' (no widening) have been better / more correct?
Curiously, the formulation of the standard suggests to me that the two forms do not always produce the same result. Also, as far as I can tell, the latter form (with separate chars), could be much faster on some platforms.
What is the upshot? What are the rules that should guide me in choosing between the two types of output operations?
For reference, here is an earlier attempt of mine at posing the same question (3 years ago): C++ What is the role of std::ctype<char>::widen()?
Since the old question never got much traction, I'd prefer to mark that one as a duplicate of this one, rather than vice versa.
EDIT: I recognize that a good output operator might not want to use formatted output operations internally, but that is not what I am interested in here. I'm interested in the reasoning behind the difference in behavior of the two types of output operations.
EDIT: Here is one explanation that would make sense to me: << on single char is to be understood as a special case of << on std::string, and not as a special case of << on const char*. But, is this the right explanation? If so, I believe it means that I should use << "->" above. Not << '-' << '>'.
EDIT: Here is what makes me think that the explanation above (2nd EDIT) is not the right one: In the case of a wchar_t stream, both << on char and << on const char* invokes widen(), so from this point of view, they are in the same "family". So, from a consistency point of view, we should expect that when we switch stream type from wchar_t to char, either both of those operators should still invoke widen(), or both should not.
EDIT: Here is another kind of explanation, which I don't think is right, but I'll include it for exposition: For a char stream out, out << "->" has the same effect as out << '-' << '>', because even though the first form is required to invoke widen(), widen() is required to be a "no op" on a char stream in any locale (I don't believe this is the case). So, while there may be a significant difference in performance, the results are always the same. This would suggest that the difference in formulation of required behavior is a kind of unintended, but fairly benign accident. If this is the right explanation, then I should chose out << '-' << '>' due to the possibly much better performance.
EDIT: Ok, I found another 3 year old question from myself, where I am coming at it from a slightly different angle: C++ When are characters widened in output stream operator<<()?. The comments from Dietmar Kühl suggests that widen() is always a "no op" on a char stream, and the whole "issue" is due to imprecise wording in the standard. If so, it would render my second proposed explanation above correct (4th EDIT). Still, It would be nice to get this corroborated by somebody else.

Least onerous way to implement generic formatted stream output in CUDA?

I want to be able to write something close to:
std::cout << "Hello" << my_world_string << ", " << std::setprecision(5) << my_double << '\n';
in CUDA device-side code, for debugging templated functions - and for this kind of line of code to result in a single, unbroken, output line (i.e. the equivalent of a single CUDA printf() call - which typically doesn't get mangled with output from other threads).
Of course, that's not possible since there are no files or file descriptors in device-side code, nor is any of the std::ostream code usable in device-side code. Essentially what we have to work with is CUDA's hardware+software hack enabling printf()s. But it is obviously possible to get something like:
stream << "Hello" << my_world_string << ", " << foo::setprecision(5) << my_double << '\n';
stream.flush();
or:
stream << "Hello" << my_world_string << ", " << foo::setprecision(5) << my_double << '\n';
printf("%s", stream.str());
My question is: What should I implement which would allow me to write code as close to the above as possible, minimizing effort / amount of code to write?
Notes:
I used the identifier stream but it doesn't have to be a stream. Nor does the code need to look just like I laid it out. The point is for me to be able to have printing code in a templated device function.
All code will be written in C++11.
Code may assume compilation is performed either with C++11 or a later version of the standard.
I can use existing FOSS code, but only if its license is permissive, e.g. 3-BSD, CC-BY-SA, MIT - but not GPL.
Currently, the way I'm thinking of implementing this is:
Implement an std::ostringstream-like class which can take its initial storage from elsewhere (on construction).
With such an object, you can then printf("%s\n", my_gpu_sstream.str()) .
Allow the GPU-ostringstream to be constructed with a fixed-sized buffer.
Allow the GPU-ostringstream to allocate variable-size buffers using CUDA's device-side malloc().
and Bob's your uncle.
However, I would really rather avoid implementing a full-blown stringstream myself. Seems like a whole lot of redundant work and code.
Edit: Done! I now havea working implementation in my cuda-kat library. I've used robhz786's strf library, which is (header-only-if-you-like) string formatting library not based on standard streams. On its basis I've implemented an on-device stringstream, kat::stringstream, and on the basis of that, a "printf'ing ostream" class.
It's far from perfect: strf doesn't use standard library manipulators and has it's own idioms for filling, setting precision etc. Also, compilation time is quite high. But it is quite usable. Even has the option to prepend each printed line with a prefix (e.g. the block & thread indices) if you configure it to do so. Output uses CUDA's intrinsic printf() mechanism - when reaching the end of a line.

What is the point of using ostringstream instead of just using a regular string?

I am learning C++ sockets for the first time, and my example uses ostringstream a lot. What is the purpose and advantage of using stringstreams here instead of just using strings? It looks to me in this example that I could just as easily use a regular string. Isn't using this ostringstream more bulky?
std::string NetFunctions::GetHostDescription(cost sockaddr_in &sockAddr)
{
std::ostringstream stream;
stream << inet_ntoa(sockAddr.sin_addr) << ":" << ntohs(sockAddr.sin_port);
return stream.str();
}
Streams are buffers. They are not equal to char arrays, such as std::string, which is basically an object containing a pointer to an array of chars. Streams have their intrinsic functions, manipulators, states, and operators, already at hand. A string object in comparison will have some deficiencies, e.g., with outputting numbers, lack of handy functions like endl, troublesome concatenation, esp. with function effects (results, returned by functions), etc. String objects are simply cumbersome for that.
Now std::ostringstream is a comfortable and easy-to-set buffer for formatting and preparation of much data in textual form (including numbers) for the further combined output. Moreover, in comparison to simple ostream object cout, you may have a couple of such buffers and juggle them as you need.

Forcing output inside a loop C++ [duplicate]

Can someone please explain (preferably using plain english) how std::flush works?
What is it?
When would you flush a stream?
Why is it important?
Thank you.
Since it wasn't answered what std::flush happens to be, here is some detail on what it actually is. std::flush is a manipulator, i.e., a function with a specific signature. To start off simple, you can think of std::flush of having the signature
std::ostream& std::flush(std::ostream&);
The reality is a bit more complex, though (if you are interested, it is explained below as well).
The stream class overload output operators taking operators of this form, i.e., there is a member function taking a manipulator as argument. The output operator calls the manipulator with the object itself:
std::ostream& std::ostream::operator<< (std::ostream& (*manip)(std::ostream&)) {
(*manip)(*this);
return *this;
}
That is, when you "output" std::flush with to an std::ostream, it just calls the corresponding function, i.e., the following two statements are equivalent:
std::cout << std::flush;
std::flush(std::cout);
Now, std::flush() itself is fairly simple: All it does is to call std::ostream::flush(), i.e., you can envision its implementation to look something like this:
std::ostream& std::flush(std::ostream& out) {
out.flush();
return out;
}
The std::ostream::flush() function technically calls std::streambuf::pubsync() on the stream buffer (if any) which is associated with the stream: The stream buffer is responsible for buffering characters and sending characters to the external destination when the used buffer would overflow or when the internal representation should be synced with the external destination, i.e., when the data is to be flushed. On a sequential stream syncing with the external destination just means that any buffered characters are immediately sent. That is, using std::flush causes the stream buffer to flush its output buffer. For example, when data is written to a console flushing causes the characters to appear at this point on the console.
This may raise the question: Why aren't characters immediately written? The simple answer is that writing characters is generally fairly slow. However, the amount of time it takes to write a reasonable amount of characters is essentially identical to writing just one where. The amount of characters depends on many characteristics of the operating system, file systems, etc. but often up to something like 4k characters are written in about the same time as just one character. Thus, buffering characters up before sending them using a buffer depending on the details of the external destination can be a huge performance improvement.
The above should answer two of your three questions. The remaining question is: When would you flush a stream? The answer is: When the characters should be written to the external destination! This may be at the end of writing a file (closing a file implicitly flushes the buffer, though) or immediately before asking for user input (note that std::cout is automatically flushed when reading from std::cin as std::cout is std::istream::tie()'d to std::cin). Although there may be a few occasions where you explicitly want to flush a stream, I find them to be fairly rare.
Finally, I promised to give a full picture of what std::flush actually is: The streams are class templates capable of dealing with different character types (in practice they work with char and wchar_t; making them work with another characters is quite involved although doable if you are really determined). To be able to use std::flush with all instantiations of streams, it happens to be a function template with a signature like this:
template <typename cT, typename Traits>
std::basic_ostream<cT, Traits>& std::flush(std::basic_ostream<cT, Traits>&);
When using std::flush immediately with an instantiation of std::basic_ostream it doesn't really matter: The compiler deduces the template arguments automatically. However, in cases where this function isn't mentioned together with something facilitating the template argument deduction, the compiler will fail to deduce the template arguments.
By default, std::cout is buffered, and the actual output is only printed once the buffer is full or some other flushing situation occurs (e.g. a newline in the stream). Sometimes you want to make sure that the printing happens immediately, and you need to flush manually.
For example, suppose you want to report a progress report by printing a single dot:
for (;;)
{
perform_expensive_operation();
std::cout << '.';
std::flush(std::cout);
}
Without the flushing, you wouldn't see the output for a very long time.
Note that std::endl inserts a newline into a stream as well as causing it to flush. Since flushing is mildly expensive, std::endl shouldn't be used excessively if the flushing isn't expressly desired.
Here's a short program that you can write to observe what flush is doing
#include <iostream>
#include <unistd.h>
using namespace std;
int main() {
cout << "Line 1..." << flush;
usleep(500000);
cout << "\nLine 2" << endl;
cout << "Line 3" << endl ;
return 0;
}
Run this program: you'll notice that it prints line 1, pauses, then prints line 2 and 3. Now remove the flush call and run the program again- you'll notice that the program pauses and then prints all 3 lines at the same time. The first line is buffered before the program pauses, but because the buffer is never flushed, line 1 is not outputted until the endl call from line 2.
A stream is connected to something. In the case of standard output, it could be the console/screen or it could be redirected to a pipe or a file. There is a lot of code between your program and, for example, the hard disk where the file is stored. For example, the operating system is doing stuff with any file or the disk drive itself might be buffering data to be able to write it in fixed size blocks or just to be more efficient.
When you flush the stream, it tells the language libraries, the os and the hardware that you want to any characters that you have output so far to be forced all the way to storage. Theoretically, after a 'flush', you could kick the cord out of the wall and those characters would still be safely stored.
I should mention that the people writing the os drivers or the people designing the disk drive might are free to use 'flush' as a suggestion and they might not really write the characters out. Even when the output is closed, they might wait a while to save them. (Remember that the os does all sorts of things at once and it might be more efficient to wait a second or two to handle your bytes.)
So a flush is a sort of checkpoint.
One more example: If the output is going to the console display, a flush will make sure the characters actually get all the way out to where the user can see them. This is an important thing to do when you are expecting keyboard input. If you think you have written a question to the console and its still stuck in some internal buffer somewhere, the user doesn't know what to type in answer. So, this is a case where the flush is important.

How do I pass a specific number of characters from an istream to an ostream

I have an istream (ifstream, in this case), and I want to write a specific number of characters from it into an ostream (cout, to be specific).
I can see that this is possible using something like istream.get(), but that will create an intermediate buffer, which is something I'd prefer to avoid.
Something like:
size_t numCharacters = 8;
ostream.get(istream, numCharacters);
Can someone point me in the right direction?
Thanks for reading :)
Edit: added c++ tag
Edit: fixed title :/
New Edit:
Thanks very much for the answers guys. As a side note, can anyone explain this weird behaviour of copy_n? Basically it seems to not consume the last element copied from the input stream, even though that element appears in the output stream. The code below should illustrate:
string test = "testing the weird behaviour of copy_n";
stringstream testStream(test);
istreambuf_iterator<char> inIter( testStream );
ostream_iterator<char> outIter( cout );
std::copy_n(inIter, 5, outIter);
char c[10];
testStream.get(c,10);
cout << c;
The output I get is:
testiing the w
The output I would expect is:
testing the we
Nothing in the docs at cppreference.com mentions this kind of behaviour. Any further help would be much appreciated :)
My workaround for the moment is to seek 1 extra element after the copy - but that's obviously not ideal.
You won't be able to avoid copying the bytes if you want to limit the number of characters and you want to be efficient. For a slow version without a buffer you can use std::copy_n():
std::copy_n(std::istreambuf_iterator<char>(in),
std::istreambuf_iterator<char>(),
std::ostreambuf_iterator<char>(out),
n);
I'd be pretty sure that this is quite a bit slower than reading a buffer or a sequence thereof and writing the buffer (BTW make sure to call std::ios_base::sync_with_stdio(false) as it is otherwise likely to be really slow to do anything with std::cout).
You could also create a filtering stream buffer, i.e., a class derived from std::streambuf using another stream buffer as its source of data. The filter would indicate after n characters that ther are no more characters. It would internally still use buffers as individual processing of characters is slow. It could then be used as:
limitbuf buf(in, n);
out << &buf;