C++: a better practice for using streams and buffers

I know a way to read from a stream and use it like below:
strstream s; // It can be another standard stream type
// ...
while (!s.eof())
{
char buf[MAX];
s.read(buf, sizeof (buf));
int count = s.gcount();
THIRD_PARTY_FUNCTION(buf, count);
// ...
}
but this code has a drawback: it first copies data from the stream into buf and then passes buf to THIRD_PARTY_FUNCTION.
Is there any way to restructure the code into something like the following, which avoids the extra copy?
strstream s; // It can be another standard stream type
// ...
while (!s.eof())
{
char *buf = A_POINTER_TO_DATA_OF_STREAM(s);
int count = AVAILABLE_DATA_SIZE_OF_STREAM(s);
// Maybe it needs s.seekg(...) here
THIRD_PARTY_FUNCTION(buf, count);
// ...
}

Something like this might work for you.
char buffer[2000];
std::istream& s = getStreamReference();
s.rdbuf()->pubsetbuf(buffer, 2000);
while(s)
{
THIRD_PARTY_FUNCTION(buffer, s.rdbuf()->in_avail());
s.ignore(s.rdbuf()->in_avail());
// Not sure this may go into an infinite loop.
// Its late here so I have not tested it.
}
I am not sure I would care about the cost of copying a 2K buffer.
Profiling would have to show that this is a real hotspot causing a significant degradation in performance before I would look at making this kind of optimization. Readability is going to be my most important factor here 99% of the time.
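For reference, the plain copying loop can be written so that the final short chunk is handled without testing eof() up front. This is only a sketch: third_party_function and g_received are hypothetical stand-ins for THIRD_PARTY_FUNCTION, and the 4096-byte chunk size is an arbitrary assumption.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical stand-in for THIRD_PARTY_FUNCTION: it just collects the bytes.
std::string g_received;
void third_party_function(const char* data, std::size_t count) {
    g_received.append(data, count);
}

// Reads the stream in fixed-size chunks and forwards each chunk.
// Returns the total number of bytes forwarded.
std::size_t forward_stream(std::istream& in) {
    char buf[4096];
    std::size_t total = 0;
    // read() sets failbit on a short final chunk, so gcount() is checked too.
    while (in.read(buf, sizeof buf) || in.gcount() > 0) {
        const std::size_t count = static_cast<std::size_t>(in.gcount());
        third_party_function(buf, count);
        total += count;
    }
    return total;
}
```

The loop condition performs the read itself, so the body never runs on a chunk that was not actually filled.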

You can convert a std::stringstream to a c-style string by first calling its member method str to get an std::string and then call the member function c_str of that to convert it to a c-style null-terminated char[].

Related

Reading contents of file into dynamically allocated char* array- can I read into std::string instead?

I have found myself writing code which looks like this
// Treat the following as pseudocode - just an example
iofile.seekg(0, std::ios::end); // iofile is a file opened for read/write
uint64_t f_len = iofile.tellg();
if(f_len >= some_min_length)
{
// Focus on the following code here
char *buf = new char[7];
char buf2[]{"MYFILET"}; // just some random string
// if we see this it's a good indication
// the rest of the file will be in the
// expected format (unlikely to see this
// sequence in a "random file", but don't
// worry too much about this)
iofile.read(buf, 7);
if(memcmp(buf, buf2, 7) == 0) // I am confident this works
{
// carry on processing file ...
// ...
// ...
}
}
else
cout << "invalid file format" << endl;
This code is probably an okay sketch of what we might want to do when opening a file, which has some specified format (which I've dictated). We do some initial check to make sure the string "MYFILET" is at the start of the file - because I've decided all my files for the job I'm doing are going to start with this sequence of characters.
I think this code would be better if we didn't have to play around with "c-style" character arrays, but used strings everywhere instead. This would be advantageous because we could do things like if(buf == buf2) if buf and buf2 were std::strings.
A possible alternative could be,
// Focus on the following code here
std::string buf;
std::string buf2("MYFILET"); // very nice
buf.resize(7); // okay, but not great
iofile.read(buf.data(), 7); // pretty awful - error prone if wrong length argument given
// also we have to resize buf to 7 in the previous step
// lots of potential for mistakes here,
// and the length was used twice which is never good
if(buf == buf2) then do something
What are the problems with this?
We had to use the length variable 7 (or constant in this case) twice. Which is somewhere between "not ideal" and "potentially error prone".
We had to access the contents of buf using .data() which I shall assume here is implemented to return a raw pointer of some sort. I don't personally mind this too much, but others may prefer a more memory-safe solution, perhaps hinting we should use an iterator of some sort? I think in Visual Studio (for Windows users which I am not) then this may return an iterator anyway, which will give [?] warnings/errors [?] - not sure on this.
We had to have an additional resize statement for buf. It would be better if the size of buf could be automatically set somehow.
Before C++17, std::string::data() returns a const char*, and writing through that pointer is undefined behavior (C++17 added a non-const overload). However, you are free to use std::vector::data() in this way.
If you want to use std::string, and dislike setting the size yourself, you may consider whether you can use std::getline(). This is the free function, not std::istream::getline(). The std::string version will read up to a specified delimiter, so if you have a text format you can tell it to read until '\0' or some other character which will never occur, and it will automatically resize the given string to hold the contents.
If your file is binary in nature, rather than text, I think most people would find std::vector<char> to be a more natural fit than std::string anyway.
We had to use the length variable 7 (or constant in this case) twice.
Which is somewhere between "not ideal" and "potentially error prone".
The second time you can use buf.size()
iofile.read(buf.data(), buf.size());
We had to access the contents of buf using .data() which I shall
assume here is implemented to return a raw pointer of some sort.
As pointed out by John Zwinck, .data() returns a pointer to const.
I suppose you could define buf as std::vector<char>; for a vector (if I'm not wrong) .data() returns a pointer to char (in this case), not to const char.
size() and resize() are working in the same way.
We had to have an additional resize statement for buf. It would be
better if the size of buf could be automatically set somehow.
I don't think read() permits this.
p.s.: sorry for my bad English.
We can validate a signature without double buffering (rdbuf and a string) and allocating from the heap...
// terminating null not included
constexpr char sig[] = { 'M', 'Y', 'F', 'I', 'L', 'E', 'T' };
auto ok = all_of(begin(sig), end(sig), [&fs](char c) { return fs.get() == (int)c; });
if (ok) {}
template<class Src>
std::string read_string( Src& src, std::size_t count){
std::string buf;
buf.resize(count);
src.read(&buf[0], count); // use count, not a hard-coded 7; in C++17 you can pass buf.data()
return buf;
}
Now auto read = read_string( iofile, 7 ); is clean at point of use.
buf2 is a bad plan. I'd do:
if(read=="MYFILET")
directly, or use a const char myfile_magic[] = "MYFILET";.
I liked many of the ideas from the examples above, however I wasn't completely satisfied that there was an answer which would produce undefined-behaviour-free code for C++11 and C++17. I currently write most of my code in C++11 - because I don't anticipate using it on a machine in the future which doesn't have a C++11 compiler.
If one doesn't, then I add a new compiler or change machines.
However it does seem to me to be a bad idea to write code which I know may not work under C++17... That's just my personal opinion. I don't anticipate using this code again, but I don't want to create a potential problem for myself in the future.
Therefore I have come up with the following code. I hope other users will give feedback to help improve this. (For example there is no error checking yet.)
std::string
fstream_read_string(std::fstream& src, std::size_t n)
{
char *const buffer = new char[n + 1];
src.read(buffer, n);
buffer[n] = '\0';
std::string ret(buffer);
delete [] buffer;
return ret;
}
This seems like a basic, probably fool-proof method... It's a shame there seems to be no way to get std::string to use the same memory as allocated by the call to new.
Note we had to add an extra trailing null character in the C-style string, which is sliced off in the C++-style std::string.

Using sprintf with std::string in C++

I am using sprintf function in C++ 11, in the following way:
std::string toString()
{
std::string output;
uint32_t strSize=512;
do
{
output.reserve(strSize);
int ret = sprintf(output.c_str(), "Type=%u Version=%u ContentType=%u contentFormatVersion=%u magic=%04x Seg=%u",
INDEX_RECORD_TYPE_SERIALIZATION_HEADER,
FORAMT_VERSION,
contentType,
contentFormatVersion,
magic,
segmentId);
strSize *= 2;
} while (ret < 0);
return output;
}
Is there a better way to do this, than to check every time if the reserved space was enough? For future possibility of adding more things.
Your construct -- writing into the buffer received from c_str() -- is undefined behaviour, even if you checked the string's capacity beforehand. (The return value is a pointer to const char, and the function itself marked const, for a reason.)
Don't mix C and C++, especially not for writing into internal object representation. (That is breaking very basic OOP.) Use C++, for type safety and not running into conversion specifier / parameter mismatches, if for nothing else.
std::ostringstream s;
s << "Type=" << INDEX_RECORD_TYPE_SERIALIZATION_HEADER
<< " Version=" << FORMAT_VERSION
// ...and so on...
;
std::string output = s.str();
Alternative:
std::string output = "Type=" + std::to_string( INDEX_RECORD_TYPE_SERIALIZATION_HEADER )
+ " Version=" + std::to_string( FORMAT_VERSION )
// ...and so on...
;
The C++ patterns shown in other answers are nicer, but for completeness, here is a correct way with sprintf:
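const char* format = "your %x format %d string %s";
int size = std::snprintf(nullptr, 0, format /*, arguments go here */);
std::string output(size + 1, '\0'); // +1 for the terminating null written by sprintf
std::sprintf(&output[0], format /*, arguments go here */);
output.resize(size); // drop the terminating null from the string's contents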
Pay attention to
You must resize your string. reserve does not change the size of the buffer. In my example, I construct correctly sized string directly.
c_str() returns a const char*. You may not pass it to sprintf.
The std::string buffer was not guaranteed to be contiguous prior to C++11, and this relies on that guarantee. If you need to support exotic pre-C++11 conforming platforms that use a rope implementation for std::string, then you're probably better off sprintf-ing into a std::vector<char> first and then copying the vector to the string.
This only works if the arguments are not modified between the size calculation and formatting; use either local copies of variables or thread synchronisation primitives for multi-threaded code.
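The std::vector<char> route mentioned in the notes above could look roughly like this. The function name format_header and its two parameters are assumptions chosen for illustration; the two-pass snprintf pattern is the same as in the answer.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: formats two values via a std::vector<char> scratch buffer.
std::string format_header(unsigned type, unsigned version) {
    const char* fmt = "Type=%u Version=%u";
    // First pass: ask snprintf how many characters are needed (excluding '\0').
    const int size = std::snprintf(nullptr, 0, fmt, type, version);
    if (size < 0) return std::string();  // formatting error
    std::vector<char> buf(static_cast<std::size_t>(size) + 1);
    // Second pass: format into the buffer, which has room for the '\0'.
    std::snprintf(buf.data(), buf.size(), fmt, type, version);
    return std::string(buf.data(), static_cast<std::size_t>(size));
}
```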
We can mix code from here https://stackoverflow.com/a/36909699/2667451 and here https://stackoverflow.com/a/7257307 and result will be like that:
template <typename ...Args>
std::string stringWithFormat(const std::string& format, Args && ...args)
{
auto size = std::snprintf(nullptr, 0, format.c_str(), std::forward<Args>(args)...);
std::string output(size + 1, '\0');
std::sprintf(&output[0], format.c_str(), std::forward<Args>(args)...);
output.resize(size); // drop the terminating null written by sprintf
return output;
}
A better way is to use the {fmt} library. Ex:
std::string message = fmt::sprintf("The answer is %d", 42);
It exposes also a nicer interface than iostreams and printf. Ex:
std::string message = fmt::format("The answer is {}", 42);
See:
https://github.com/fmtlib/fmt
http://fmtlib.net/latest/api.html#printf-formatting-functions
Your code is wrong. reserve allocates memory for the string, but does not change its size. Writing into the buffer returned by c_str does not change its size either. So the string still believes its size is 0, and you've just written something into the unused space in the string's buffer. (Probably. Technically, the code has Undefined Behaviour, because writing into c_str is undefined, so anything could happen).
What you really want to do is forget sprintf and similar C-style functions, and use the C++ way of string formatting—string streams:
std::ostringstream ss;
ss << "Type=" << INDEX_RECORD_TYPE_SERIALIZATION_HEADER
<< " Version=" << FORAMT_VERSION
<< /* ... the rest ... */;
return ss.str();
Yes, there is!
In C, the better way is to associate a file with the null device and make a dummy printf of the desired output to it, to learn how much space would it take if actually printed. Then allocate appropriate buffer and sprintf the same data to it.
In C++ you could associate the output stream with a null device, too, and test the number of characters printed with std::ostream::tellp. However, using ostringstream is a way better solution – see the answers by DevSolar or Angew.
You can use an implementation of sprintf() into a std::string I wrote that uses vsnprintf() under the hood.
It splits the format string into sections of plain text which are just copied to the destination std::string and sections of format fields (such as %5.2lf) which are first vsnprintf()ed into a buffer and then appended to the destination.
https://gitlab.com/eltomito/bodacious-sprintf

Is it possible to use an std::string for read()?

Is it possible to use an std::string for read() ?
Example :
std::string data;
read(fd, data, 42);
Normally we have to use a char*, but is it possible to use a std::string directly? (I would prefer not to create a char* to store the result.)
Thanks
Well, you'll need to create a char* somehow, since that's what the
function requires. (BTW: you are talking about the Posix function
read, aren't you, and not std::istream::read?) The problem isn't
the char*, it's what the char* points to (which I suspect is what
you actually meant).
The simplest and usual solution here would be to use a local array:
char buffer[43];
int len = read(fd, buffer, 42);
if ( len < 0 ) {
// read error...
} else if ( len == 0 ) {
// eof...
} else {
std::string data(buffer, len);
}
If you want to capture directly into an std::string, however, this is
possible (although not necessarily a good idea):
std::string data;
data.resize( 42 );
int len = read( fd, &data[0], data.size() );
// error handling as above...
data.resize( len ); // If no error...
This avoids the copy, but quite frankly... The copy is insignificant
compared to the time necessary for the actual read and for the
allocation of the memory in the string. This also has the (probably
negligible) disadvantage of the resulting string having an actual buffer
of 42 bytes (rounded up to whatever), rather than just the minimum
necessary for the characters actually read.
(And since people sometimes raise the issue, with regards to the
contiguity of the memory in std::string: this was an issue ten or more
years ago. The original specifications for std::string were designed
expressly to allow non-contiguous implementations, along the lines of
the then popular rope class. In practice, no implementor found this
to be useful, and people did start assuming contiguity. At which point,
the standards committee decided to align the standard with existing
practice, and require contiguity. So... no implementation has ever not
been contiguous, and no future implementation will forego contiguity,
given the requirements in C++11.)
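Putting the pieces above together, reading directly into a std::string with the error handling spelled out might look like the sketch below. The function name read_into_string and the bool-plus-out-parameter interface are assumptions made for the example; the retry on EINTR is an addition that the answer does not mention.

```cpp
#include <cerrno>
#include <cstddef>
#include <string>
#include <unistd.h>

// Reads at most max_len bytes from fd. Returns false on error; on success
// `out` holds the bytes read (empty at end of file).
bool read_into_string(int fd, std::size_t max_len, std::string& out) {
    out.assign(max_len, '\0');
    if (max_len == 0) return true;
    ssize_t len;
    do {
        len = ::read(fd, &out[0], out.size());
    } while (len < 0 && errno == EINTR);  // retry if interrupted by a signal
    if (len < 0) {
        out.clear();
        return false;
    }
    out.resize(static_cast<std::size_t>(len));  // keep only what was read
    return true;
}
```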
No, you cannot and you should not. Usually, std::string implementations internally store other information such as the size of the allocated memory and the length of the actual string. C++ documentation explicitly states that modifying values returned by c_str() or data() results in undefined behaviour.
If the read function requires a char *, then no. You could use the address of the first element of a std::vector of char as long as it's been resized first. I don't think old (pre-C++11) strings are guaranteed to have contiguous memory; otherwise you could do something similar with the string.
No, but
std::string data;
cin >> data;
works just fine. If you really want the behaviour of read(2), then you need to allocate and manage your own buffer of chars.
Because read() is intended for raw data input, std::string is actually a bad choice, because std::string handles text. std::vector seems like the right choice to handle raw data.
std::getline from the strings library (see cplusplus.com) can read from a stream and write directly into a string object. Example (taken from cplusplus.com, the first hit on Google for getline):
#include <iostream>
#include <string>
int main () {
std::string str;
std::cout << "Please enter full name: ";
std::getline (std::cin, str);
std::cout << "Thank you, " << str << ".\n";
}
This works when reading from stdin (cin) and from a file (ifstream).

How to put stringstream contents into char instead string type?

Everyone knows that stringstream.str() returns a std::string, so a string variable is needed to store its result.
I want to store the content of stringstream.str() in a char variable, a char array, or a pointer.
Is it possible to do that?
Please, write a simple example with your answer.
Why not just
std::string s = stringstream.str();
const char* p = s.c_str();
?
Edit: Note that you cannot freely give the p outside your function: its lifetime is bound to the lifetime of s, so you may want to copy it.
Edit 2: as @David suggests, "copy" above means copying the content, not the pointer itself. There are several ways to do that. You can do it manually (the legacy way "inherited" from C) with functions like std::strcpy. This way is quite complicated, since it involves manual resource management, which is usually discouraged because it leads to more complicated and error-prone code. Or you can use smart pointers or containers: either std::vector<char> or std::unique_ptr/std::shared_ptr.
I personally would go for the second way. See the discussion on this answer and on @Oli's answer; it can be useful.
If you want to get the data into a char buffer, why not put it there immediately anyway? Here is a stream class which takes an array, determines its size, fills it with null characters (primarily to make sure the resulting string is null terminated), and then sets up an std::ostream to write to this buffer directly.
#include <iostream>
#include <algorithm>
struct membuf: public std::streambuf {
template <size_t Size> membuf(char (&array)[Size]) {
this->setp(array, array + Size - 1);
std::fill_n(array, Size, 0);
}
};
struct omemstream: virtual membuf, std::ostream {
template <size_t Size> omemstream(char (&array)[Size]):
membuf(array),
std::ostream(this)
{
}
};
int main() {
char array[20];
omemstream out(array);
out << "hello, world";
std::cout << "the buffer contains '" << array << "'\n";
}
Obviously, this stream buffer and stream would probably live in a suitable namespace and would be implemented in some header (there isn't much point in putting any of it into a C++ file because all the functions are templates that need to be instantiated). You could also use the [deprecated] class std::ostrstream to do something similar, but it is so easy to create a custom stream that it may not be worth bothering.
You can do this if you want an actual copy of the string (vital if the stringstream object is going to go out of scope at some point):
char *p = new char[ss.str().size()+1]; // non-const, so strcpy can write into it
strcpy(p, ss.str().c_str());
...
delete [] p;
As discussed in comments below, you should be wary of doing it like this (manual memory management is error-prone, and very non-idiomatic C++). Why do you want a raw char array?
I figured it out: use namespace std and replace tstringstream with stringstream. The next step is:
stringstream strstream;
strstream.imbue(std::locale("C"));
string str = strstream.str();
const char *sql = str.c_str();
Now you can execute the SQL statement.
sqlite3_exec(db, sql, callback, (void*)data, &zErrMsg);
Maybe it helps to somebody.

C++: how to get fprintf results as a std::string w/o sprintf

I am working with an open-source UNIX tool that is implemented in C++, and I need to change some code to get it to do what I want. I would like to make the smallest possible change in hopes of getting my patch accepted upstream. Solutions that are implementable in standard C++ and do not create more external dependencies are preferred.
Here is my problem. I have a C++ class -- let's call it "A" -- that currently uses fprintf() to print its heavily formatted data structures to a file pointer. In its print function, it also recursively calls the identically defined print functions of several member classes ("B" is an example). There is another class C that has a member std::string "foo" that needs to be set to the print() results of an instance of A. Think of it as a to_str() member function for A.
In pseudocode:
class A {
public:
...
void print(FILE* f);
B b;
...
};
...
void A::print(FILE *f)
{
std::string s = "stuff";
fprintf(f, "some %s", s.c_str());
b.print(f);
}
class C {
...
std::string foo;
bool set_foo(std::string);
...
};
...
A a;
C c;
...
// wish i knew how to write A's to_str()
c.set_foo(a.to_str());
I should mention that C is fairly stable, but A and B (and the rest of A's dependents) are in a state of flux, so the less code changes necessary the better. The current print(FILE* F) interface also needs to be preserved. I have considered several approaches to implementing A::to_str(), each with advantages and disadvantages:
1: Change the calls to fprintf() to sprintf()
- I wouldn't have to rewrite any format strings
- print() could be reimplemented as: fprintf(f, "%s", this->to_str().c_str());
- But I would need to manually allocate char[]s, merge a lot of C strings, and finally convert the character array to a std::string
2: Try to catch the results of a.print() in a string stream
- I would have to convert all of the format strings to << output format. There are hundreds of fprintf()s to convert :-{
- print() would have to be rewritten because there is no standard way that I know of to create an output stream from a UNIX file handle (though this guy says it may be possible).
3: Use Boost's string format library
- More external dependencies. Yuck.
- Format's syntax is different enough from printf() to be annoying:
printf(format_str, args) -> cout << boost::format(format_str) % arg1 % arg2 % etc
4: Use Qt's QString::asprintf()
- A different external dependency.
So, have I exhausted all possible options? If so, which do you think is my best bet? If not, what have I overlooked?
Thanks.
Here's the idiom I like for making functionality identical to 'sprintf', but returning a std::string, and immune to buffer overflow problems. This code is part of an open source project that I'm writing (BSD license), so everybody feel free to use this as you wish.
#include <cstdarg>
#include <cstdio>
#include <string>
#include <vector>
std::string vformat (const char *fmt, va_list ap); // declared before its first use in format()
std::string
format (const char *fmt, ...)
{
va_list ap;
va_start (ap, fmt);
std::string buf = vformat (fmt, ap);
va_end (ap);
return buf;
}
std::string
vformat (const char *fmt, va_list ap)
{
// Allocate a buffer on the stack that's big enough for us almost
// all the time.
size_t size = 1024;
char buf[1024]; // fixed size: a variable-length array is not standard C++
// Try to vsnprintf into our buffer.
va_list apcopy;
va_copy (apcopy, ap);
int needed = vsnprintf (&buf[0], size, fmt, ap);
// NB. On Windows, vsnprintf returns -1 if the string didn't fit the
// buffer. On Linux & OSX, it returns the length it would have needed.
if (needed >= 0 && (size_t) needed < size) { // needed == size means the output was truncated
// It fit fine the first time, we're done.
return std::string (&buf[0]);
} else {
// vsnprintf reported that it wanted to write more characters
// than we allotted. So do a malloc of the right size and try again.
// This doesn't happen very often if we chose our initial size
// well.
std::vector <char> buf;
size = needed + 1; // one extra byte for the terminating null
buf.resize (size);
needed = vsnprintf (&buf[0], size, fmt, apcopy);
return std::string (&buf[0]);
}
}
EDIT: when I wrote this code, I had no idea that this required C99 conformance and that Windows (as well as older glibc) had different vsnprintf behavior, in which it returns -1 for failure, rather than a definitive measure of how much space is needed. Here is my revised code; could everybody look it over, and if you think it's OK, I will edit again to make that the only code listed:
std::string
Strutil::vformat (const char *fmt, va_list ap)
{
// Allocate a buffer on the stack that's big enough for us almost
// all the time. Be prepared to allocate dynamically if it doesn't fit.
size_t size = 1024;
char stackbuf[1024];
std::vector<char> dynamicbuf;
char *buf = &stackbuf[0];
va_list ap_copy;
while (1) {
// Try to vsnprintf into our buffer.
va_copy(ap_copy, ap);
int needed = vsnprintf (buf, size, fmt, ap_copy); // use the copy so ap stays valid for a retry
va_end(ap_copy);
// NB. C99 (which modern Linux and OS X follow) says vsnprintf
// failure returns the length it would have needed. But older
// glibc and current Windows return -1 for failure, i.e., not
// telling us how much was needed.
if (needed < (int)size && needed >= 0) {
// It fit fine so we're done.
return std::string (buf, (size_t) needed);
}
// vsnprintf reported that it wanted to write more characters
// than we allotted. So try again using a dynamic buffer. This
// doesn't happen very often if we chose our initial size well.
size = (needed > 0) ? (needed+1) : (size*2);
dynamicbuf.resize (size);
buf = &dynamicbuf[0];
}
}
I am using #3: the boost string format library - but I have to admit that I've never had any problem with the differences in format specifications.
Works like a charm for me - and the external dependencies could be worse (a very stable library)
Edited: adding an example how to use boost::format instead of printf:
sprintf(buffer, "This is a string with some %s and %d numbers", "strings", 42);
would be something like this with the boost::format library:
string = boost::str(boost::format("This is a string with some %s and %d numbers") %"strings" %42);
Hope this helps clarify the usage of boost::format
I've used boost::format as a sprintf / printf replacement in 4 or 5 applications (writing formatted strings to files, or custom output to logfiles) and never had problems with format differences. There may be some (more or less obscure) format specifiers that behave differently, but I never had a problem.
In contrast, there were some format specifications that I couldn't really reproduce with streams (as far as I remember).
You can use std::string and iostreams with formatting, such as the setw() call and others in iomanip
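For instance, a printf-style field such as "%04x" or "%5u" can be reproduced with manipulators from <iomanip>. The function and field names below are invented for illustration.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Rough stream equivalent of printf("id=%04x count=%5u", id, count).
std::string format_fields(unsigned id, unsigned count) {
    std::ostringstream os;
    // setw applies only to the next inserted item; hex/dec and the fill
    // character stay in effect until changed, so they are reset explicitly.
    os << "id=" << std::hex << std::setw(4) << std::setfill('0') << id
       << " count=" << std::dec << std::setw(5) << std::setfill(' ') << count;
    return os.str();
}
```

With id = 0xab and count = 42 this produces "id=00ab count=   42".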
The {fmt} library provides fmt::sprintf function that performs printf-compatible formatting (including positional arguments according to POSIX specification) and returns the result as std::string:
std::string s = fmt::sprintf("The answer is %d.", 42);
Disclaimer: I'm the author of this library.
The following might be an alternative solution:
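void A::printto(ostream& outputstream) { // streams are not copyable: pass by reference
char buffer[100];
string s = "stuff";
sprintf(buffer, "some %s", s.c_str()); // %s needs a const char*, not a std::string
outputstream << buffer << endl;
b.printto(outputstream);
}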
(B::printto similar), and define
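void A::print(FILE *f) {
// std::ofstream cannot be constructed from a FILE*, so format into a string first
const string s = to_str();
fwrite(s.data(), 1, s.size(), f);
}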
string A::to_str() {
ostringstream os;
printto(os);
return os.str();
}
Of course, you should really use snprintf instead of sprintf to avoid buffer overflows. You could also selectively change the more risky sprintfs to << format, to be safer and yet change as little as possible.
You should try the Loki library's SafeFormat header file (http://loki-lib.sourceforge.net/index.php?n=Idioms.Printf). It's similar to boost's string format library, but keeps the syntax of the printf(...) functions.
I hope this helps!
Is this about serialization? Or printing proper?
If the former, consider boost::serialization as well. It's all about "recursive" serialization of objects and sub-object.
Very very late to the party, but here's how I'd attack this problem.
1: Use pipe(2) to open a pipe.
2: Use fdopen(3) to convert the write fd from the pipe to a FILE *.
3: Hand that FILE * to A::print().
4: Use read(2) to pull bufferloads of data, e.g. 1K or more at a time from the read fd.
5: Append each bufferload of data to the target std::string
6: Repeat steps 4 and 5 as needed to complete the task.
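The steps above can be sketched as follows for a POSIX system. print_report is a hypothetical stand-in for A::print(FILE*), and capture_via_pipe is a made-up name. Note one important assumption: this single-threaded version writes everything before reading anything, so it only works while the output fits into the pipe's buffer; larger output would need a separate reader thread or process.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <unistd.h>

// Hypothetical stand-in for A::print(FILE*).
void print_report(FILE* f) {
    std::fprintf(f, "some %s %d\n", "stuff", 42);
}

// Captures what `printer` writes to a FILE* by routing it through a pipe.
std::string capture_via_pipe(void (*printer)(FILE*)) {
    int fds[2];
    if (pipe(fds) != 0) return std::string();        // step 1
    FILE* out = fdopen(fds[1], "w");                 // step 2
    if (!out) {
        close(fds[0]);
        close(fds[1]);
        return std::string();
    }
    printer(out);                                    // step 3
    std::fclose(out);  // flushes and closes the write end so read() sees EOF
    std::string result;
    char buf[1024];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0)  // steps 4 to 6
        result.append(buf, static_cast<std::size_t>(n));
    close(fds[0]);
    return result;
}
```

A::to_str() could then be a thin wrapper around such a function, leaving the existing print(FILE*) code untouched.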