How reliable are basic_stringbuf::in_avail() and basic_stringbuf::str()? - c++

I need to get all the contents from a stream, without actually extracting them (just like stringstream::str()). I've tried basic_stringbuf::str(), but it behaves incorrectly when the stream is empty. To avoid that case, I had a go at basic_stringbuf::in_avail(), but that hasn't worked out very well either.
In the following test case, in_avail() doesn't return the number of available elements on the stream, and str() returns more elements than what is currently there:
#include <iostream>
#include <iterator>
#include <vector>
#include <sstream>
// extracts everything from the stream
std::vector<unsigned char> stream2vector(std::basic_istream<unsigned char>& stream)
{
std::vector<unsigned char> retrievedData;
std::istreambuf_iterator<unsigned char> it(stream);
const std::istreambuf_iterator<unsigned char> endOfStream;
retrievedData.insert(retrievedData.begin(), it, endOfStream);
return retrievedData;
}
int main() {
std::basic_stringbuf<unsigned char> buf;
std::basic_iostream<unsigned char> stream(&buf);
unsigned char array[5] = { 1, 2, 3, 4, 5 };
stream.write(array, 5);
std::cout << "rdbuf()->in_avail(): " << buf.in_avail() << "\n";
std::vector<unsigned char> d1 = stream2vector(stream);
std::cout << "d1.size(): " << d1.size() << "\n";
std::cout << "\n";
// d2 should be empty
std::vector<unsigned char> d2 = stream2vector(stream);
std::cout << "d2.size(): " << d2.size() << "\n";
std::basic_string<unsigned char> s = buf.str();
std::cout << "buf.str().size(): " << buf.str().size() << "\n";
}
Compiling on g++ 4.4, the output is:
rdbuf()->in_avail(): 1 // expected: 5
d1.size(): 5 // as expected
d2.size(): 0 // as expected
buf.str().size(): 5 // expected: 0
What am I doing wrong? What's the best way to do what I'm trying?
Thanks a lot.

in_avail() reports how many characters are immediately available in the buffer's get area, not the total amount of data held by the buffer. While data is pending, it is allowed to return any positive value here, so 1 is a legitimate result.
However, I can't answer what the best way is of what you're doing, because I don't know what you're doing. If you already have things as an unsigned char array, then you're going to want to do:
std::vector<unsigned char> data(array, array + sizeof(array)/sizeof(unsigned char));
If you're just trying to read a whole stream into a vector, then I would do exactly what you're doing; I'd just replace your stream2vector function with this equivalent, simpler one:
// extracts everything from the stream
std::vector<unsigned char> stream2vector(std::basic_istream<unsigned char>& stream)
{
std::istreambuf_iterator<unsigned char> it(stream);
const std::istreambuf_iterator<unsigned char> endOfStream;
return std::vector<unsigned char>(it, endOfStream);
}
I'm not entirely sure why you're specializing every operation here for unsigned char -- I would just use the default char versions, because unsigned char is allowed to be the same size as a short, which is probably not what you want (but I am not aware of any implementation that does this).
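For the original goal of looking at what is left in a stream without extracting it, one possible approach is sketched below. It assumes a plain char-based std::stringstream rather than the unsigned char specialization, and the helper name peek_remaining is made up for this illustration. str() returns the whole underlying sequence no matter how much has already been read, so the sketch trims it at the read position reported by tellg().

```cpp
#include <cstddef>
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: returns the characters that have not been read yet,
// without consuming them. Assumes a char-based std::stringstream.
std::string peek_remaining(std::stringstream& stream)
{
    const std::string all = stream.str();       // entire underlying sequence
    const std::streampos pos = stream.tellg();  // current read position
    if (pos == std::streampos(-1)) {
        return std::string();                   // stream is in a failed state
    }
    const std::size_t offset =
        static_cast<std::size_t>(static_cast<std::streamoff>(pos));
    return offset < all.size() ? all.substr(offset) : std::string();
}
```

Calling it twice in a row should give the same result, because nothing is extracted from the stream.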

Related

Inputting data to stringstream with hexadecimal representation

I am attempting to extract a hash-digest in hexadecimal via a stringstream, but I cannot get it to work when iterating over data.
Using std::hex I can do this easily with normal integer literals, like this:
#include <sstream>
#include <iostream>
std::stringstream my_stream;
my_stream << std::hex;
my_stream << 100;
std::cout << my_stream.str() << std::endl; // Prints "64"
However when I try to push in data from a digest it just interprets the data as characters and pushes them into the stringstream. Here is the function:
#include <sstream>
#include <sha.h> // Crypto++ library required
std::string hash_string(const std::string& message) {
using namespace CryptoPP;
std::stringstream buffer;
byte digest[SHA256::DIGESTSIZE]; // 32 bytes or 256 bits
static SHA256 local_hash;
local_hash.CalculateDigest(digest, reinterpret_cast<byte*>(
const_cast<char*>(message.data())),
message.length());
// PROBLEMATIC PART
buffer << std::hex;
for (size_t i = 0; i < SHA256::DIGESTSIZE; i++) {
buffer << *(digest+i);
}
return buffer.str();
}
The type byte is just a typedef of unsigned char so I do not see why this would not input correctly. Printing the return value using std::cout gives the ASCII mess of normal character interpretation. Why does it work in the first case, and not in the second case?
Example:
std::string my_hash = hash_string("hello");
std::cout << my_hash << std::endl; // Prints: ",≥M║_░ú♫&Φ;*┼╣Γ₧←▬▲\▼ºB^s♦3bôïÿ$"
First, the std::hex format modifier applies to integers, not to characters. Since you are trying to print unsigned char, the format modifier is not applied. You can fix this by casting to int instead. In your first example, it works because the literal 100 is interpreted as an integer. If you replace 100 with e.g. static_cast<unsigned char>(100), you would no longer get the hexadecimal representation.
Second, std::hex is not enough, since you likely want to pad each byte to a 2-digit hex value (i.e. F should be printed as 0F). You can fix this by also applying the format modifiers std::setfill('0') and std::setw(2). Note that std::setw only affects the next insertion, so it has to be applied to every byte inside the loop, whereas std::hex and std::setfill stay in effect.
Applying these modifications, your code would then look like this:
#include <iomanip>
...
buffer << std::hex << std::setfill('0');
for (size_t i = 0; i < SHA256::DIGESTSIZE; i++) {
buffer << std::setw(2) << static_cast<int>(*(digest+i));
}
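As a standalone illustration of the same idea, independent of Crypto++, here is a minimal sketch of a hex-encoding helper. The name to_hex and its signature are assumptions made for this example, not part of any library.

```cpp
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: formats each byte as two lowercase hex digits.
std::string to_hex(const unsigned char* data, std::size_t length)
{
    std::ostringstream out;
    out << std::hex << std::setfill('0');  // both stay in effect for later insertions
    for (std::size_t i = 0; i < length; ++i) {
        // setw applies only to the next insertion, so it is repeated per byte;
        // the cast selects the integer overload instead of the character one.
        out << std::setw(2) << static_cast<int>(data[i]);
    }
    return out.str();
}
```

With this helper, the loop in hash_string could presumably be replaced by return to_hex(digest, SHA256::DIGESTSIZE);.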

How do I perform string formatting to a static buffer in C++?

I am working in a section of code with very high performance requirements. I need to perform some formatted string operations, but I am trying to avoid memory allocations, even internal library ones.
In the past, I would have done something similar to the following (assuming C++11):
constexpr int BUFFER_SIZE = 200;
char buffer[BUFFER_SIZE];
int index = 0;
index += snprintf(&buffer[index], BUFFER_SIZE-index, "Part A: %d\n", intA);
index += snprintf(&buffer[index], BUFFER_SIZE-index, "Part B: %d\n", intB);
// etc.
I would prefer to use all C++ methods, such as ostringstream, to do this instead of the old C functions.
I realize I could use std::string::reserve and std::ostringstream to procure space ahead of time, but that will still perform at least one allocation.
Does anyone have any suggestions?
Thanks ahead of time.
Does anyone have any suggestions?
Yes, use std::ostrstream. I know it is deprecated. But I find it useful for output to static buffers. No possibility of memory leaks if an exception occurs.
No allocation of memory at all.
#include <strstream> // for std::ostrstream
#include <ostream> // for std::ends
// :
constexpr int BUFFER_SIZE = 200;
char buffer[BUFFER_SIZE];
std::ostrstream osout(buffer, sizeof(buffer));
osout << "Part A: " << intA << "Part B: " << intB << std::ends;
My thanks to all that posted suggestions (even in the comments).
I appreciate the suggestion by SJHowe, being the briefest solution to the problem, but one of the things I am looking to do with this attempt is to start coding for the C++ of the future, and not use anything deprecated.
The solution I decided to go with stems from the comment by Remy Lebeau:
#include <iostream> // For std::ostream and std::streambuf
#include <cstring> // For std::memset
#include <string> // For std::string
template <int bufferSize>
class FixedBuffer : public std::streambuf
{
public:
FixedBuffer()
: std::streambuf()
{
std::memset(buffer, 0, sizeof(buffer));
setp(buffer, &buffer[bufferSize-1]); // Remember the -1 to preserve the terminator.
setg(buffer, buffer, &buffer[bufferSize-1]); // Technically not necessary for an std::ostream.
}
std::string get() const
{
return buffer;
}
private:
char buffer[bufferSize];
};
//...
constexpr int BUFFER_SIZE = 200;
FixedBuffer<BUFFER_SIZE> buffer;
std::ostream ostr(&buffer);
ostr << "PartA: " << intA << std::endl << "PartB: " << intB << std::endl << std::ends;
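One caveat with a get() that returns std::string is that building the string may itself allocate. The variant below is a sketch of one way around that, assuming it is acceptable to hand out a pointer and a length instead of a null-terminated string. The class name FixedStreamBuf and its data()/size() members are invented for this illustration.

```cpp
#include <cstddef>
#include <ostream>
#include <streambuf>

// Hypothetical variant: output-only stream buffer over a fixed array.
// The written bytes are exposed as pointer + length, so reading them back
// does not construct a std::string.
template <std::size_t N>
class FixedStreamBuf : public std::streambuf
{
public:
    FixedStreamBuf() { setp(storage_, storage_ + N); }

    // Start of the written data (not null-terminated).
    const char* data() const { return pbase(); }

    // Number of characters written so far.
    std::size_t size() const { return static_cast<std::size_t>(pptr() - pbase()); }

private:
    char storage_[N];
};
```

Because overflow() is not overridden, a write that does not fit fails and the attached std::ostream goes into an error state, so checking the stream after formatting shows whether the output was truncated.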

How to store the value of a buffer in a map and not its reference?

I have a UDP server which gets messages in a buffer, that I would like to store like a mailbox. For this, I would like to create either a vector or a map that could hold these incoming messages, but the value of my map or vector keeps pointing to the current value of the buffer.
How do I get the values properly stored in a map or vector?
To demonstrate my issue, I've written a simple static example which represents what happens in my script:
#include <cstring>
#include <map>
#include <iostream>
int main(int argc, char const *argv[])
{
char buffer[65535];
std::map<int, char *> messages;
std::strcpy(buffer, "hello"); // stands in for a received message
messages[0] = buffer;
std::strcpy(buffer, "how");
messages[1] = buffer;
std::strcpy(buffer, "are");
messages[2] = buffer;
std::strcpy(buffer, "you");
messages[3] = buffer;
std::cout << messages[0] << std::endl;
std::cout << messages[1] << std::endl;
std::cout << messages[2] << std::endl;
std::cout << messages[3] << std::endl;
return 0;
}
The outcome of this is:
you
you
you
you
But I would like to get:
hello
how
are
you
How do I achieve this?
If you declare your map with std::map<int, char *> the second member is just a pointer to a char.
In your code, this pointer points to the first char of your buffer and you add it several times in your map hence you get the same result at the end.
If you want to keep your map with char*, you have to allocate separate memory for each entry you add and free it at the end.
I advise you to replace char * with std::string, which makes the manipulations much simpler. As an example:
#include <map>
#include <string>
#include <iostream>
std::map<int, std::string> messages;
messages[0] = "hello";
messages[1] = "how";
messages[2] = "are";
messages[3] = "you";
std::cout << messages[0] << std::endl;
std::cout << messages[1] << std::endl;
std::cout << messages[2] << std::endl;
std::cout << messages[3] << std::endl;
What's happening there is that you keep overwriting the memory that your buffer variable refers to, while every map entry points at that same memory. If you want to use char*, you have to allocate new memory for every message. But as some people mentioned in the comments, you can just use the standard type std::string instead. Then the memory management is done behind the scenes.

Casting from `int` to `unsigned char`

I am running the following C++ code on Coliru:
#include <iostream>
#include <string>
int main()
{
int num1 = 208;
unsigned char uc_num1 = (unsigned char) num1;
std::cout << "test1: " << uc_num1 << "\n";
int num2 = 255;
unsigned char uc_num2 = (unsigned char) num2;
std::cout << "test2: " << uc_num2 << "\n";
}
I am getting the output:
test1: �
test2: �
This is a simplified example of my code.
Why does this not print out:
test1: 208
test2: 255
Am I misusing std::cout, or am I not doing the casting correctly?
More background
I want to convert from int to unsigned char (rather than unsigned char*). I know that all my integers will be between 0 and 255 because I am using them in the RGBA color model.
I want to use LodePNG to encode images. The library in example_encode.cpp uses unsigned chars in std::vector<unsigned char>& image:
//Example 1
//Encode from raw pixels to disk with a single function call
//The image argument has width * height RGBA pixels or width * height * 4 bytes
void encodeOneStep(const char* filename, std::vector<unsigned char>& image, unsigned width, unsigned height)
{
//Encode the image
unsigned error = lodepng::encode(filename, image, width, height);
//if there's an error, display it
if(error) std::cout << "encoder error " << error << ": "<< lodepng_error_text(error) << std::endl;
}
std::cout is correct =)
Press ALT then 2 0 8
This is the char that you are printing with test1. The console might not know how to print that properly so it outputs the question mark. Same thing with 255. Once the PNG data is in the std::vector, there is no point in writing it to the screen: it is binary data, not printable text.
If you want to see "208" and "255", either print the int before converting it to unsigned char, or cast the unsigned char to an integer type such as int when printing, like this:
std::cout << num1 << std::endl;
std::cout << (int) uc_num1 << std::endl;
You are looking at a special case of std::cout which is not easy to understand at first.
operator<< is overloaded on the type of its right-hand operand, and the overload is chosen at compile time. In your case, std::cout << uc_num1 selects the overload for unsigned char, which writes the value as a single character rather than converting it to a number, because an unsigned char is usually meant to be a printable character. Try this:
unsigned char uc_num3 = 65;
std::cout << uc_num3 << std::endl;
If you write std::cout << num1, the overload for int is selected instead. It converts the int to its decimal text and prints that for you.
You might want to read up on C++ operator overloading to understand how it works, but it is not crucial at the moment. You just need to realize that std::cout can behave differently for the different data types you try to print.
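The difference between the two overloads can be seen in a small sketch that formats into a string instead of the console. The helper names as_number and as_character are made up for this example.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: the cast selects the int overload of operator<<,
// so the value is written as decimal digits.
std::string as_number(unsigned char value)
{
    std::ostringstream out;
    out << static_cast<int>(value);
    return out.str();
}

// Hypothetical helper: without the cast, the character overload is used
// and exactly one character is written.
std::string as_character(unsigned char value)
{
    std::ostringstream out;
    out << value;
    return out.str();
}
```

Here as_number(208) yields the three characters "208", while as_character(208) yields a single byte that a terminal may not be able to display.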

std::cout << stringstream.str()->c_str() prints nothing

In a function that receives an unsigned char pointer and its length:
void pcap_callback(u_char *args, const struct pcap_pkthdr* pkthdr, const u_char* packet)
{
std::vector<unsigned char> vec(packet, packet+pkthdr->len); // optimized from foo.
std::stringstream scp;
for (int i=0;i<pkthdr->len;i++) {
scp<<vec[i];
}
std::string mystr = std::string(scp.rdbuf()->str());
std::cout << "WAS: " << packet << std::endl;
std::cout << "GOOD: " << scp.str() << std::endl;
std::cout << "BAD: " << scp.str().c_str() << std::endl;
std::cout << "TEST: " << mystr.size() << std::endl;
assert(mystr.size() == pkthdr->len);
}
Results:
WAS: prints nothing (guess there is a pointer to const.. case)
GOOD: prints data
BAD: prints nothing
TEST, assert: prints that mystr.size() is equal to passed unsigned char size.
I tried:
string.assign(scp.rdbuf());
memcpy(char, scp.str(), 10);
different methods of creating/allocating temporary chars, strings
No help. What I want is a std::string that contains the data and can be printed with std::cout (the data was picked from foo, which was unsigned char packet data).
I am guessing that either the original foo is not null-terminated, or the problem is something similarly simple that I can't see. What are the things to look for here?
(this code is another attempt to use libpcap, just to print packets in C++ way, without using known C++ magic wrappers like libpcapp).
For a quick test, throw in a check for scp.str().size() == strlen(scp.str().c_str()) to see if there are embedded '\0' characters in the string, which is what I suspect is happening.
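That check can be wrapped in a small function, sketched here with a made-up name (has_embedded_null):

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: true if the string contains a '\0' before its end.
// strlen stops at the first null byte, while size() counts every character.
bool has_embedded_null(const std::string& s)
{
    return std::strlen(s.c_str()) != s.size();
}
```

If this returns true for the packet data, printing c_str() stops at the first null byte, which would explain output that is shorter than expected or empty.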
I think you're going about this the wrong way. It looks like you're dealing with binary data here, in which case you can't expect to meaningfully output it to the screen as text. What you really need is a hex dump.
// needs <iomanip> for std::setw and std::setfill
const unsigned char* ucopy = packet;
std::ios_base::fmtflags old_flags = std::cout.flags();
std::cout.setf(std::ios::hex, std::ios::basefield);
for (const unsigned char* p = ucopy, *e = p + pkthdr->len; p != e; ++p) {
std::cout << std::setw(2) << std::setfill('0') << static_cast<unsigned>(*p) << " ";
}
std::cout.flags(old_flags);
This will output the data byte-by-byte, and let you examine the individual hex values of the binary data. A null byte will simply be output as 00.
Check std::cout.good() after the failed output attempt. My guess is that there's some failure on output (i.e. trying to write a nonprintable character to the console), which is setting failbit on cout.
Also check that the string does not start with a null character ('\0'), which would make empty output the expected behavior :)
(Side note, please use reinterpret_cast for unsigned char *ucopy = (unsigned char*)packet; if you're in C++ ;) )