How to keep track of the current position in std::stringstream - c++

I am writing a custom logger where I buffer my log messages in a std::stringstream and flush it to a file (std::ofstream) whenever the std::stringstream is big enough (to save some I/O latency). Since std::stringstream doesn't have a .size() method, I use seekg and tellg:
template <typename T>
MyClass & operator<< (const T& val)
{
boost::unique_lock<boost::mutex> lock(mutexOutput);
output << val; //std::stringstream output;
output.seekg(0, std::ios::end);
if(output.tellg() > 1048576/*1MB*/){
flushLog();
}
return *this;
}
Problem:
It seems to me that, whenever I invoke this method, seekg counts the bytes from the beginning all the way to the end before tellg reports the size. I came up with this design to save some I/O time in the first place, but doesn't this continuous counting impose a larger cost (if the number of calls to this method is high and the log messages are small, as in most cases)?
Is there a better way to do this?
And a side question: is 1 MB a good buffer size for a typical present-day computer?
Thank you

You can just use ostringstream::tellp() to get the length of the string. Here's an example lifted from http://en.cppreference.com/w/cpp/io/basic_ostream/tellp.
#include <iostream>
#include <sstream>
int main()
{
std::ostringstream s;
std::cout << s.tellp() << '\n';
s << 'h';
std::cout << s.tellp() << '\n';
s << "ello, world ";
std::cout << s.tellp() << '\n';
s << 3.14 << '\n';
std::cout << s.tellp() << '\n' << s.str();
}
Output:
0
1
13
18
hello, world 3.14
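Applied to the logger from the question, the tellp() check might look like the sketch below. BufferedLog, its sink and limit parameters, and pending() are names made up for this illustration; locking is left out, and the sink is any std::ostream rather than specifically a std::ofstream.

```cpp
#include <cstddef>
#include <ios>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: buffers text and forwards it to `sink` once the
// buffered size passes `limit` bytes. Not thread-safe as written.
class BufferedLog {
public:
    BufferedLog(std::ostream& sink, std::size_t limit)
        : sink_(sink), limit_(limit) {}

    template <typename T>
    BufferedLog& operator<<(const T& val) {
        buffer_ << val;
        if (pending() > limit_) {
            flush();
        }
        return *this;
    }

    // Write the buffered text to the sink and empty the buffer.
    void flush() {
        sink_ << buffer_.str();
        buffer_.str(std::string());  // write position goes back to 0
        buffer_.clear();
    }

    // tellp() reports the current write position, which here equals the
    // number of characters written since the buffer was last emptied.
    std::size_t pending() {
        return static_cast<std::size_t>(
            static_cast<std::streamoff>(buffer_.tellp()));
    }

private:
    std::ostream& sink_;
    std::size_t limit_;
    std::ostringstream buffer_;
};
```

Nothing is scanned or counted on each call: tellp() only reads the stream buffer's current put position.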

Related

Difference between modifying and non-modifying putback()

The question comes from https://en.cppreference.com/w/cpp/io/basic_istream/putback, the example code.
#include <sstream>
#include <iostream>
int main()
{
std::istringstream s2("Hello, world"); // input-only stream
s2.get();
if (s2.putback('Y')) // cannot modify input-only buffer
std::cout << s2.rdbuf() << '\n';
else
std::cout << "putback failed\n";
s2.clear();
if (s2.putback('H')) // non-modifying putback
std::cout << s2.rdbuf() << '\n';
else
std::cout << "putback failed\n";
}
Why does s2.putback('Y') fail but s2.putback('H') succeed? Isn't the latter also an operation that modifies the input-only stream buffer?
Also, I found something confusing while experimenting.
I added one line of code to the sample above and now the second putback fails. Why is that?
#include <sstream>
#include <iostream>
int main()
{
std::istringstream s2("Hello, world"); // input-only stream
s2.get();
if (s2.putback('Y')) // cannot modify input-only buffer
std::cout << s2.rdbuf() << '\n';
else
std::cout << "putback failed\n";
std::cout << s2.rdbuf() << '\n'; //1 line code added
s2.clear();
if (s2.putback('H')) // non-modifying putback
std::cout << s2.rdbuf() << '\n';
else
std::cout << "putback failed\n";
}
Why does s2.putback('Y') fail but s2.putback('H') succeed? Isn't the latter also an operation that modifies the input-only stream buffer?
The invocation s2.putback('H') potentially modifies the buffer, but in this case, it does not, because the data already start with an 'H'.
You can exemplify the behavior like this:
s2.clear();
assert(s2.putback('H')); // Ok, replacing 'H' with 'H' doesn't change anything
assert(!s2.putback('Z')); // Can't modify.
You can read more about this behavior in the documentation of sputbackc:
If a putback position is available in the get area (gptr() > eback()), and the character c is equal to the character one position to the left of gptr() (as determined by Traits::eq(c, gptr()[-1])), then simply decrements the next pointer (gptr()).
So in the case of s2.putback('H'), only the next pointer is decremented. The buffer is not changed.
Answer to your edit: basic_ostream& operator<<(std::basic_streambuf<CharT, Traits>* sb); extracts the characters maintained by sb, so after std::cout << s2.rdbuf() << '\n'; the next pointer points to the end of the buffer. The character to its left is now 'd' rather than 'H', so s2.putback('H') would have to modify the buffer, and it fails.
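The three cases discussed above can be checked with a small experiment. This is only a sketch; the function names are invented for the illustration, and the read/write case relies on std::stringstream being opened for both input and output.

```cpp
#include <sstream>
#include <string>

// Read one character from an input-only stream, then try to put `c` back.
bool can_putback_after_get(const std::string& text, char c) {
    std::istringstream in(text);
    in.get();
    return static_cast<bool>(in.putback(c));
}

// Consume the whole buffer through rdbuf() first, then try to put `c` back.
bool can_putback_after_drain(const std::string& text, char c) {
    std::istringstream in(text);
    std::ostringstream discard;
    discard << in.rdbuf();  // moves the next pointer to the end of the buffer
    in.clear();
    return static_cast<bool>(in.putback(c));
}

// Same as the first experiment, but on a stream opened for input and output,
// where putback is allowed to overwrite the character. Returns what is read next.
std::string putback_in_read_write_stream(const std::string& text, char c) {
    std::stringstream io(text);
    io.get();
    io.putback(c);
    std::string rest;
    std::getline(io, rest);
    return rest;
}
```

With "Hello, world" as the text, putting back 'H' after one get() succeeds and 'Y' fails; after draining the buffer only 'd' can be put back; and on the read/write stream 'Y' replaces the first character.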

Postpone standard output in the end

Many warning messages (via std::cout) might be printed during the process. Is there a way to postpone printing the warning messages until the end of the program? A huge amount of processing information will be printed, so I'm planning to have all the warnings together at the end rather than scattered around.
More background:
The code is already there.
There are about 50 warning messages in the code (if there is some sort of delay() function, I don't want to add it 50 times; it would be nice to have a global delay/postpone function for standard output).
Thanks
One way to do it is to send everything to a stringstream, and then print at the end.
For example:
#include <iostream>
#include <sstream>
int main(){
int i = 5, j = 4;
std::stringstream ss;
std::cout << i * j << std::endl;
ss << "success" << std::endl;
std::cout << j + i * i + j << std::endl;
ss << "failure" << std::endl;
std::cout << ss.str() << std::endl;
return 0;
}
Output:
20
33
success
failure
If you're just trying to delay all printing of std::cout what you can do is redirect standard out to a string stream that acts as a buffer. It's pretty simple and avoids all of the dup, dup2, and piping stuff that one might be inclined to try.
#include <iostream>
#include <sstream>
int main() {
// Make a buffer for all of your output
std::stringstream buffer;
// Save std::cout's stream buffer pointer since we're going to replace it temporarily
std::streambuf* normal_cout = std::cout.rdbuf();
// Replace std::cout's stream buffer with your buffer
std::cout.rdbuf(buffer.rdbuf());
// Now your program runs and does its thing writing to std::cout
std::cout << "Additional errors or details" << std::endl;
// Now restore std::cout
std::cout.rdbuf(normal_cout);
// Print the stuff you buffered
std::cout << buffer.str() << std::endl;
}
Also in the future, you should really use a buffer for errors from the start OR at a minimum write errors and logging to std::cerr so that your normal runtime print outs aren't cluttered with errors.
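If the redirection has to survive early returns or exceptions, the save/restore pair can be wrapped in a small RAII object. The following is a sketch only; CoutCapture and text() are hypothetical names, not part of the standard library.

```cpp
#include <iostream>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical RAII helper: while an instance is alive, everything written to
// std::cout lands in an internal buffer; the destructor restores std::cout.
class CoutCapture {
public:
    // rdbuf(new_buf) installs the new buffer and returns the previous one.
    CoutCapture() : saved_(std::cout.rdbuf(buffer_.rdbuf())) {}
    ~CoutCapture() { std::cout.rdbuf(saved_); }

    CoutCapture(const CoutCapture&) = delete;
    CoutCapture& operator=(const CoutCapture&) = delete;

    // Everything captured so far.
    std::string text() const { return buffer_.str(); }

private:
    std::ostringstream buffer_;  // declared first so it exists before saved_ is set
    std::streambuf* saved_;
};
```

A caller would create a CoutCapture at the start of the noisy section, copy text() before the object goes out of scope, and print that copy at the end of the program.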

What is the best way (performance driven) to convert and write variables to a file in c++?

I want to write something like this to a file:
0x050 addik r13, r0, 4496
0x054 addik r2, r0, 2224
0x058 addik r1, r0, 7536
0x05c brlid r15, 200
...
And so on... It's a program instruction trace, which will have thousands of lines.
I am reading from an ELF file, decoding each instruction, creating an object, setting its address, instruction, name and register parameters, and then writing it to a file in the above format.
What is the best way to do this in terms of speed/performance?
Right now I have this (still just hexadecimals), and I don't know if it is the best way to continue writing my code:
Converting function:
template <typename T>
static std::string toHex(const T &i) {
std::stringstream stream;
stream << "0x"
<< std::setfill ('0') << std::setw(sizeof(T)*2)
<< std::hex << i;
return stream.str();
}
And the writing:
while((newInstruction = manager->newInstruction())){
stream << Utils::toHex(newInstruction->getAddress())
<< " "
<< Utils::toHex(newInstruction->getInstruction())
<< endl;
trace->writeFile(stream.str());
stream.str(std::string());
}
EDIT:
So I have reached a faster solution based on the answers.
First, I implemented the solution given by Escualo to stop creating objects each time I read a new instruction.
Then I read the answer given by Thomas Matthews, which gave me the idea not to write to my file for every instruction read. The stringstream now works like a buffer of size 1024, and once it exceeds that size the stream is written to the file:
while((newInstruction = manager->newInstruction())){
stream << myHex<unsigned int> << newInstruction->getAddress() << ' '
<< myHex<uint32_t> << newInstruction->getInstruction();
if(stream.tellp() > 1024){
trace->writeFile(stream.str());
stream.str(std::string());
}
}
Since file I/O is slower than formatting, I suggest formatting into a buffer, then block-writing the buffer to the file.
char text_buffer[1024];
unsigned int bytes_formatted = 0;
for (unsigned int i = 0; i < 10; ++i)
{
int chars_formatted = snprintf(&text_buffer[bytes_formatted],
1024-bytes_formatted,
"hello %d", i);
if (chars_formatted > 0)
{
bytes_formatted += chars_formatted;
}
}
my_file.write(text_buffer, bytes_formatted);
Most file I/O operations have a constant overhead regardless of the operation. So the more data that can be written during one operation, the better.
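One detail worth guarding against when the buffer can fill up: snprintf returns the length the complete text would have had, even when the output was truncated, so adding that value unconditionally can push the offset past the end of the buffer. A bounded append step might look like this sketch (append_hello and its parameters are invented for the example):

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical helper: formats "hello <i>" into buf starting at offset `used`.
// Returns the new number of bytes used, or `used` unchanged if the text
// does not fit completely.
std::size_t append_hello(char* buf, std::size_t capacity, std::size_t used, int i) {
    if (used >= capacity) {
        return used;  // no room left, not even for the terminator
    }
    const std::size_t room = capacity - used;
    const int n = std::snprintf(buf + used, room, "hello %d", i);
    if (n < 0 || static_cast<std::size_t>(n) >= room) {
        buf[used] = '\0';  // drop the truncated fragment
        return used;
    }
    return used + static_cast<std::size_t>(n);
}
```

When the return value stops growing, the caller can write the buffer to the file, reset the offset to 0, and retry the same item.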
For one, I would avoid creating and destroying an std::stringstream for every call to your formatting function.
Recall that the I/O manipulators are nothing but functions which return the stream itself. For example, a manipulator doing exactly what you indicated above, but without resorting to a temporary std::stringstream could look like:
#include <iostream>
#include <iomanip>
template<typename T,
typename CharT,
typename Traits = std::char_traits<CharT> >
inline std::basic_ostream<CharT, Traits>&
myhex(std::basic_ostream<CharT, Traits>& os) {
return os << "0x"
<< std::setfill('0')
<< std::setw(2 * sizeof(T))
<< std::hex;
}
int main() {
int x;
std::cout << myhex<int> << &x << std::endl;
}
to print (for example):
0x0x7fff5926cf9c
To clarify: I do not know why you chose the fill, width, prefix, and format; I am just showing you how to create an I/O manipulator that does not entail creating and destroying temporary objects.
Notice that the manipulator will work with any std::basic_ostream<CharT, Traits> such as std::cout, std::cerr, std::ofstream, and std::stringstream.
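For the trace format in the question, the same idea applied to integer values could look like the sketch below. It is simplified to plain std::ostream; hex_field and write_trace_line are hypothetical names, and the 32-bit field width is an assumption about the address and instruction types.

```cpp
#include <cstdint>
#include <iomanip>
#include <ostream>
#include <sstream>  // only needed for the usage shown at the bottom

// Manipulator in the style of myhex: writes the "0x" prefix and prepares the
// stream to print the next value as zero-padded hex, two digits per byte of T.
template <typename T>
std::ostream& hex_field(std::ostream& os) {
    return os << "0x" << std::setfill('0') << std::setw(2 * sizeof(T)) << std::hex;
}

// Hypothetical helper: appends one "address instruction" line to `out`
// without creating any temporary stream.
void write_trace_line(std::ostream& out, std::uint32_t address, std::uint32_t instruction) {
    out << hex_field<std::uint32_t> << address << ' '
        << hex_field<std::uint32_t> << instruction << '\n';
}

// Usage:
//   std::ostringstream out;
//   write_trace_line(out, 0x50, 0x31a01190);  // appends "0x00000050 0x31a01190\n"
```

The width has to be set again for every value because it is reset after each formatted output, which is why the manipulator is applied before both fields.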

How to print a bunch of integers with the same formatting?

I would like to print a bunch of integers on 2 fields with '0' as fill character. I can do it but it leads to code duplication. How should I change the code so that the code duplication can be factored out?
#include <ctime>
#include <sstream>
#include <iomanip>
#include <iostream>
using namespace std;
string timestamp() {
time_t now = time(0);
tm t = *localtime(&now);
ostringstream ss;
t.tm_mday = 9; // cheat a little to test it
t.tm_hour = 8;
ss << (t.tm_year+1900)
<< setw(2) << setfill('0') << (t.tm_mon+1) // Code duplication
<< setw(2) << setfill('0') << t.tm_mday
<< setw(2) << setfill('0') << t.tm_hour
<< setw(2) << setfill('0') << t.tm_min
<< setw(2) << setfill('0') << t.tm_sec;
return ss.str();
}
int main() {
cout << timestamp() << endl;
return 0;
}
I have tried
std::ostream& operator<<(std::ostream& s, int i) {
return s << std::setw(2) << std::setfill('0') << i;
}
but it did not work, the operator<< calls are ambiguous.
EDIT I got 4 awesome answers and I picked the one that is perhaps the simplest and the most generic one (that is, doesn't assume that we are dealing with timestamps). For the actual problem, I will probably use std::put_time or strftime though.
In C++20 you'll be able to do this with std::format in a less verbose way:
ss << std::format("{}{:02}{:02}{:02}{:02}{:02}",
t.tm_year + 1900, t.tm_mon + 1, t.tm_mday,
t.tm_hour, t.tm_min, t.tm_sec);
and it's even easier with the {fmt} library that supports tm formatting directly:
auto s = fmt::format("{:%Y%m%d%H%M%S}", t);
You need a proxy for your string stream like this:
struct stream{
std::ostringstream ss;
stream& operator<<(int i){
ss << std::setw(2) << std::setfill('0') << i;
return *this; // See Note below
}
};
Then your formatting code will just be this:
stream ss;
ss << (t.tm_year+1900)
<< (t.tm_mon+1)
<< t.tm_mday
<< t.tm_hour
<< t.tm_min
<< t.tm_sec;
return ss.ss.str();
ps. Note the general format of my stream::operator<<() which does its work first, then returns something.
The "obvious" solution is to use a manipulator to install a custom std::num_put<char> facet which just formats ints as desired.
The above statement may be a bit cryptic although it entirely describes the solution. Below is the code to actually implement the logic. The first ingredient is a special std::num_put<char> facet which is just a class derived from std::num_put<char> and overriding one of its virtual functions. The used facet is a filtering facet which looks at a flag stored with the stream (using iword()) to determine whether it should change the behavior or not. Here is the code:
class num_put
: public std::num_put<char>
{
std::locale loc_;
static int index() {
static int rc(std::ios_base::xalloc());
return rc;
}
friend std::ostream& twodigits(std::ostream&);
friend std::ostream& notwodigits(std::ostream&);
public:
num_put(std::locale loc): loc_(loc) {}
iter_type do_put(iter_type to, std::ios_base& fmt,
char fill, long value) const {
if (fmt.iword(index())) {
fmt.width(2);
return std::use_facet<std::num_put<char> >(this->loc_)
.put(to, fmt, '0', value);
}
else {
return std::use_facet<std::num_put<char> >(this->loc_)
.put(to, fmt, fill, value);
}
}
};
The main part is the do_put() member function which decides how the value needs to be formatted: If the flag in fmt.iword(index()) is non-zero, it sets the width to 2 and calls the formatting function with a fill character of '0'. The width is going to be reset anyway and the fill character doesn't get stored with the stream, i.e., there is no need for any clean-up.
Normally, the code would probably live in a separate translation unit and it wouldn't be declared in a header. The only functions really declared in a header would be twodigits() and notwodigits() which are made friends in this case to provide access to the index() member function. The index() member function just allocates an index usable with std::ios_base::iword() when called the first time and it then just returns this index. The manipulators twodigits() and notwodigits() primarily set this index. If the num_put facet isn't installed for the stream twodigits() also installs the facet:
std::ostream& twodigits(std::ostream& out)
{
if (!dynamic_cast<num_put const*>(
&std::use_facet<std::num_put<char> >(out.getloc()))) {
out.imbue(std::locale(out.getloc(), new num_put(out.getloc())));
}
out.iword(num_put::index()) = true;
return out;
}
std::ostream& notwodigits(std::ostream& out)
{
out.iword(num_put::index()) = false;
return out;
}
The twodigits() manipulator allocates the num_put facet using new num_put(out.getloc()). It doesn't require any clean-up because installing a facet in a std::locale object does the necessary clean-up. The original std::locale of the stream is accessed using out.getloc(). It is changed by the facet. In theory the notwodigits could restore the original std::locale instead of using a flag. However, imbue() can be a relatively expensive operation and using a flag should be a lot cheaper. Of course, if there are lots of similar formatting flags, things may become different...
To demonstrate the use of the manipulators there is a simple test program below. It sets up the formatting flag twodigits twice to verify that the facet is only created once (it would be a bit silly to create a chain of std::locales to pass through the formatting):
int main()
{
std::cout << "some-int='" << 1 << "' "
<< twodigits << '\n'
<< "two-digits1='" << 1 << "' "
<< "two-digits2='" << 2 << "' "
<< "two-digits3='" << 3 << "' "
<< notwodigits << '\n'
<< "some-int='" << 1 << "' "
<< twodigits << '\n'
<< "two-digits4='" << 4 << "' "
<< '\n';
}
Besides formatting integers with std::setw / std::setfill or ios_base::width / basic_ios::fill, if you want to format a date/time object you may want to consider using std::put_time / std::get_time.
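As a rough sketch of the std::put_time route (the function name timestamp_from is made up here, and the format string is the one matching the question's output):

```cpp
#include <ctime>
#include <iomanip>
#include <sstream>
#include <string>

// Formats a broken-down time as YYYYMMDDhhmmss using std::put_time.
std::string timestamp_from(const std::tm& t) {
    std::ostringstream ss;
    ss << std::put_time(&t, "%Y%m%d%H%M%S");
    return ss.str();
}
```

The conversion specifiers are the same ones strftime uses, so each field is zero-padded without any setw/setfill calls.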
For convenient output formatting you may use boost::format() with sprintf-like formatting options:
#include <boost/format.hpp>
#include <iostream>
int main() {
int i1 = 1, i2 = 10, i3 = 100;
std::cout << boost::format("%03i %03i %03i\n") % i1 % i2 % i3;
// output is: 001 010 100
}
Little code duplication, additional implementation effort is marginal.
If all you want to do is output formatting of your timestamp, you should obviously use strftime(). That's what it's made for:
#include <ctime>
#include <iostream>
std::string timestamp() {
char buf[20];
const char fmt[] = "%Y%m%d%H%M%S";
time_t now = time(0);
strftime(buf, sizeof(buf), fmt, localtime(&now));
return buf;
}
int main() {
std::cout << timestamp() << std::endl;
}
operator<<(std::ostream& s, int i) is "ambiguous" because such a function already exists.
All you need to do is give that function a signature that doesn't conflict.
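One way to get a non-conflicting signature is to wrap the int in a tiny type of your own and overload operator<< for that type instead. The names TwoDigits and compact_time below are hypothetical; this is a sketch, not the only possible design.

```cpp
#include <iomanip>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical wrapper type: marks an int that should print in a
// two-character, zero-filled field.
struct TwoDigits {
    int value;
};

// No clash with the built-in operator<<(int) because the parameter type differs.
std::ostream& operator<<(std::ostream& os, TwoDigits d) {
    const char old_fill = os.fill('0');  // fill() returns the previous fill character
    os << std::setw(2) << d.value;
    os.fill(old_fill);                   // leave the stream as we found it
    return os;
}

// Example use: hour, minute and second, each printed as two digits.
std::string compact_time(int hour, int minute, int second) {
    std::ostringstream ss;
    ss << TwoDigits{hour} << TwoDigits{minute} << TwoDigits{second};
    return ss.str();
}
```

Values that should keep the default formatting, such as the year, are simply streamed as plain ints.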

"Roll-Back" or Undo Any Manipulators Applied To A Stream Without Knowing What The Manipulators Were [duplicate]

If I apply an arbitrary number of manipulators to a stream, is there a way to undo the application of those manipulators in a generic way?
For example, consider the following:
#include <iostream>
#include <iomanip>
using namespace std;
int main()
{
cout << "Hello" << hex << 42 << "\n";
// now i want to "roll-back" cout to whatever state it was in
// before the code above, *without* having to know
// what modifiers I added to it
// ... MAGIC HAPPENS! ...
cout << "This should not be in hex: " << 42 << "\n";
}
Suppose I want to add code at MAGIC HAPPENS that will revert the state of the stream manipulators to whatever it was before I did cout << hex. But I don't know what manipulators I added. How can I accomplish this?
In other words, I'd like to be able to write something like this (pseudocode/fantasy code):
std::something old_state = cout.current_manip_state();
cout << hex;
cout.restore_manip_state(old_state);
Is this possible?
EDIT:
In case you're curious, I'm interested in doing this in a custom operator<<() I'm writing for a complex type. The type is a kind of discriminated union, and different value types will have different manips applied to the stream.
EDIT2:
Restriction: I cannot use Boost or any other 3rd party libraries. Solution must be in standard C++.
Yes.
You can save the state and restore it:
#include <iostream>
#include <iomanip>
using namespace std;
int main()
{
std::ios state(NULL);
state.copyfmt(std::cout);
cout << "Hello" << hex << 42 << "\n";
// now i want to "roll-back" cout to whatever state it was in
// before the code above, *without* having to know what modifiers I added to it
// ... MAGIC HAPPENS! ...
std::cout.copyfmt(state);
cout << "This should not be in hex: " << 42 << "\n";
}
If you want to get back to the default state, you don't even need to save the state; you can extract it from a temporary object.
std::cout.copyfmt(std::ios(NULL));
The standard manipulators all manipulate a stream's format flags, precision and width settings. The width setting is reset by most formatted output operations anyway. These can all be retrieved like this:
std::ios_base::fmtflags saveflags = std::cout.flags();
std::streamsize prec = std::cout.precision();
std::streamsize width = std::cout.width();
and restored:
std::cout.flags( saveflags );
std::cout.precision( prec );
std::cout.width( width );
Turning this into an RAII class is an exercise for the reader...
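A minimal sketch of such a class, assuming only flags, precision and fill need to be preserved (width is left out because most formatted output resets it anyway). FormatGuard and hex_then_decimal are names invented for the illustration.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical RAII guard: remembers a stream's flags, precision and fill
// character on construction and puts them back on destruction.
class FormatGuard {
public:
    explicit FormatGuard(std::ios& stream)
        : stream_(stream),
          flags_(stream.flags()),
          precision_(stream.precision()),
          fill_(stream.fill()) {}

    ~FormatGuard() {
        stream_.flags(flags_);
        stream_.precision(precision_);
        stream_.fill(fill_);
    }

    FormatGuard(const FormatGuard&) = delete;
    FormatGuard& operator=(const FormatGuard&) = delete;

private:
    std::ios& stream_;
    std::ios_base::fmtflags flags_;
    std::streamsize precision_;
    char fill_;
};

// Example use: the value is printed in hex inside the guarded scope and in
// decimal after it.
std::string hex_then_decimal(int value) {
    std::ostringstream out;
    {
        FormatGuard guard(out);
        out << std::hex << value << ' ';
    }  // guard restores the previous flags here
    out << value;
    return out.str();
}
```

Because the restore happens in a destructor, it also runs when the guarded scope is left through an exception.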
Saving and restoring state is not exception-safe. I would propose to shuffle everything into a stringstream, and finally you put that on the real stream (which has never changed its flags at all).
#include <iostream>
#include <iomanip>
#include <sstream>
int main()
{
std::ostringstream out;
out << "Hello" << std::hex << 42 << "\n";
std::cout << out.str();
// no magic necessary!
std::cout << "This should not be in hex: " << 42 << "\n";
}
Of course this is a little less performant. The best solution depends on your specific needs.
Boost IO State saver might be of help.
http://www.boost.org/doc/libs/1_40_0/libs/io/doc/ios_state.html
I know this is an old question, but for future generations:
You can also write a simple state saver yourself (it will certainly help you avoid leaving the state changed). Just use the solution suggested by @loki and run it from the constructor/destructor of an object (in short: RAII) along these lines:
class stateSaver
{
public:
stateSaver(ostream& os): stream_(os), state_(nullptr) { state_.copyfmt(os); }
~stateSaver() { stream_.copyfmt(state_); }
private:
std::ios state_;
ostream& stream_;
};
Then, you will use it like this:
void myFunc() {
stateSaver state(cout);
cout << hex << 42 << endl; // will be in hex
}
int main() {
cout << 42 << endl; // will be in dec
myFunc();
cout << 42 << endl; // will also be in dec
}