I'm trying to print N characters pointed to by a pointer; there is no terminating character. Let's say I have something like this (hopefully my ASCII artwork is OK here). I want to write the chars/string "bcd" to a file or stdout:
char* ptr ----> 'a' , 'b' , 'c' , 'd' , 'e' , 'f'
^ ^
| |
begin end
Now I have no terminating character there. I have pointers to the beginning and end of the chars I want to write to stdout (or, say, a log file). Performance is really important, and I want to avoid the overhead of copy-constructing a std::string (using the begin/end pointers).
What's the fastest way to accomplish this? I've googled around but can't find anything. I could iterate over begin->end and print/write one char at a time, but I'd like something faster/ready-made. This is a theoretical question (for my own benefit), but I'd like to know how this is done in high-performance applications (think FIX message strings in low-latency applications).
Thanks much
Graham
If you would like to do custom buffering, I could suggest something like this:
class buffered_stream_buf : public std::streambuf {
public:
buffered_stream_buf(std::ostream* stream)
: _size(), _stream(stream), _out(_stream->rdbuf()) {
_stream->rdbuf(this);
}
~buffered_stream_buf() {
_stream->flush();
_stream->rdbuf(_out);
}
int sync() override {
if (_size) {
_out->sputn(_buffer, _size);
_size = 0;
}
return _out->pubsync();
}
int overflow(int c) override {
if (c == std::streambuf::traits_type::eof()) {
return std::streambuf::traits_type::not_eof(c);
}
_buffer[_size] = static_cast<char>(c);
++_size;
if (_size == sizeof(_buffer) && sync() != 0)
return std::streambuf::traits_type::eof();
return c;
}
private:
char _buffer[8 * 1024];
size_t _size;
std::ostream* _stream;
std::streambuf* _out;
};
int main() {
// Unbuffering `cout` might be a good idea:
// to avoid double copying
std::cout.setf(std::ios::unitbuf);
buffered_stream_buf mycoutbuf(&std::cout);
std::ofstream f("testmybuffer.txt", std::ios_base::out);
buffered_stream_buf myfbuf(&f);
std::cout << "Hello";
f << "Hello";
std::string my_long_string("long_long_long");
auto b = my_long_string.begin() + 3;
auto e = my_long_string.begin() + 5;
for (; b != e; ++b) {
std::cout << *b;
f << *b;
}
return 0;
}
The question is: will this improve performance? I am not sure; it can even worsen it. Why? cout and fstream are usually already buffered, with a buffer size that is probably well chosen for your machine. That means that before data is sent to OS objects (files, pipes, etc.) the C++ standard library implementation may buffer it (though the standard does not require this). Adding a new layer of buffering might be unnecessary and can hurt performance (you first copy the data to your buffer, then later copy it again to the library's buffer). Second, OS objects such as files and pipes are already mapped to memory, i.e. buffered. However, with this language-level buffering you make fewer system calls, which can be expensive. To be sure whether your own buffering helps, you should benchmark it. If you do not have time for that, I recommend leaving it out of scope and relying on the standard library and the OS; those are usually quite good at this.
You can use std::basic_ostream::write:
std::cout.write(begin, end - begin);
Related
It might not be advisable according to what I have read at a couple of places (and that's probably the reason std::string doesn't do it already), but in a controlled environment and with careful usage, I think it might be ok to write a string class which can be implicitly converted to a proper writable char buffer when needed by third party library methods (which take only char* as an argument), and still behave like a modern string having methods like Find(), Split(), SubString() etc. While I can try to implement the usual other string manipulation methods later, I first wanted to ask about the efficient and safe way to do this main task. Currently, we have to allocate a char array of roughly the maximum size of the char* output that is expected from the third party method, pass it there, then convert the return char* to a std::string to be able to use the convenient methods it allows, then again pass its (const char*) result to another method using string.c_str(). This is both lengthy and makes the code look a little messy.
Here is my very initial implementation so far:
MyString.h
#pragma once
#include<string>
using namespace std;
class MyString
{
private:
bool mBufferInitialized;
size_t mAllocSize;
string mString;
char *mBuffer;
public:
MyString(size_t size);
MyString(const char* cstr);
MyString();
~MyString();
operator char*() { return GetBuffer(); }
operator const char*() { return GetAsConstChar(); }
const char* GetAsConstChar() { InvalidateBuffer(); return mString.c_str(); }
private:
char* GetBuffer();
void InvalidateBuffer();
};
MyString.cpp
#include "MyString.h"
#include <cstring> // memcpy, strlen
MyString::MyString(size_t size)
:mAllocSize(size)
,mBufferInitialized(false)
,mBuffer(nullptr)
{
mString.reserve(size);
}
MyString::MyString(const char * cstr)
:MyString()
{
mString.assign(cstr);
}
MyString::MyString()
:MyString((size_t)1024)
{
}
MyString::~MyString()
{
if (mBufferInitialized)
delete[] mBuffer;
}
char * MyString::GetBuffer()
{
if (!mBufferInitialized)
{
mBuffer = new char[mAllocSize]{ '\0' };
mBufferInitialized = true;
}
if (mString.length() > 0)
memcpy(mBuffer, mString.c_str(), mString.length());
return mBuffer;
}
void MyString::InvalidateBuffer()
{
if (mBufferInitialized && mBuffer && strlen(mBuffer) > 0)
{
mString.assign(mBuffer);
mBuffer[0] = '\0';
}
}
Sample usage (main.cpp)
#include "MyString.h"
#include <iostream>
void testSetChars(char * name)
{
if (!name)
return;
//This length is not known to us, but the maximum
//return length is known for each function.
char str[] = "random random name";
strcpy_s(name, strlen(str) + 1, str);
}
int main(int, char**)
{
MyString cs("test initializer");
cout << cs.GetAsConstChar() << '\n';
testSetChars(cs);
cout << cs.GetAsConstChar() << '\n';
getchar();
return 0;
}
Now, I plan to call the InvalidateBuffer() in almost all the methods before doing anything else. Now some of my questions are :
Is there a better way to do it in terms of memory/performance and/or safety, especially in C++ 11 (apart from the usual move constructor/assignment operators which I plan to add to it soon)?
I had initially implemented the 'buffer' using a std::vector of chars, which was easier to implement and more C++ like, but I was concerned about performance. So the GetBuffer() method would just return the beginning pointer of the resized vector of chars. Do you think there are any major pros/cons of using a vector instead of char* here?
I plan to add wide char support to it later. Do you think a union of two structs : {char,string} and {wchar_t, wstring} would be the way to go for that purpose (it will be only one of these two at a time)?
Is it too much overkill rather than just doing the usual way of passing char array pointer, converting to a std::string and doing our work with it. The third party function calls expecting char* arguments are used heavily in the code and I plan to completely replace both char* and std::string with this new string if it works.
Thank you for your patience and help!
If I understood you correctly, you want this to work:
mystring foo;
c_function(foo);
// use the filled foo
with a c_function like ...
void c_function(char * dest) {
strcpy(dest, "FOOOOO");
}
Instead, I propose this (ideone example):
template<std::size_t max>
struct string_filler {
char data[max+1];
std::string & destination;
string_filler(std::string & d) : destination(d) {
data[0] = '\0'; // paranoia
}
~string_filler() {
destination = data;
}
operator char *() {
return data;
}
};
and using it like:
std::string foo;
c_function(string_filler<80>{foo});
This way you provide a "normal" buffer to the C function with a maximum that you specify (which you should know either way ... otherwise calling the function would be unsafe). On destruction of the temporary (which, according to the standard, must happen after that expression with the function call) the string is copied (using std::string assignment operator) into a buffer managed by the std::string.
Addressing your questions:
Do you think there are any major pros/cons of using a vector instead of char* here?
Yes: Using a vector frees you from manual memory management. This is a huge pro.
I plan to add wide char support to it later. Do you think a union of two structs : {char,string} and {wchar_t, wstring} would be the way to go for that purpose (it will be only one of these two at a time)?
A union is a bad idea. How do you know which member is currently active? You need a flag outside of the union. Do you really want every string to carry that around? Instead look what the standard library is doing: It's using templates to provide this abstraction.
Is it too much overkill [..]
Writing a string class? Yes, way too much.
What you want to do already exists. For example with this plain old C function:
/**
* Write n characters into buffer.
* n cannot be more than size
* Return number of written characters
*/
ssize_t fillString(char * buffer, ssize_t size);
Since C++11:
std::string str;
// Resize string to be sure to have memory
str.resize(80);
auto newSize = fillString(&str[0], str.size());
str.resize(newSize);
or without first resizing:
std::string str;
if (!str.empty()) // To avoid UB
{
auto newSize = fillString(&str[0], str.size());
str.resize(newSize);
}
But before C++11, std::string isn't guaranteed to be stored in a single chunk of contiguous memory. So you have to go through a std::vector<char> first:
std::vector<char> v;
// Resize the vector to be sure to have memory
v.resize(80);
ssize_t newSize = fillString(&v[0], v.size());
std::string str(v.begin(), v.begin() + newSize);
You can use it easily with something like Daniel's proposition
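A small self-contained sketch of the C++11 approach above. The function fill_buffer is a hypothetical stand-in for the third-party C function (it uses std::size_t rather than ssize_t to stay portable), and read_into_string is a hypothetical wrapper name.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a C-style function: writes at most 'size'
// characters into 'buffer' and returns how many it wrote.
std::size_t fill_buffer(char* buffer, std::size_t size) {
    const char text[] = "FOOOOO";
    std::size_t n = std::strlen(text);
    if (n > size) n = size;          // never write past the caller's buffer
    std::memcpy(buffer, text, n);
    return n;
}

// Let the C function write straight into the string's own storage
// (contiguous since C++11), then shrink to what was actually written.
std::string read_into_string(std::size_t max_len) {
    std::string s(max_len, '\0');
    if (max_len == 0) return s;      // &s[0] on an empty string is best avoided
    const std::size_t written = fill_buffer(&s[0], s.size());
    s.resize(written);
    return s;
}
```

This avoids the intermediate char array and the extra copy, at the cost of zero-filling the string before the call.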
I have a class called Game which contains the following:
vector<shared_ptr<A>> attr; // attributes
D diff; // differences
vector<shared_ptr<C>> change; // change
My question is, how can I write these (save) to a file and read/load it up later?
I thought about using a struct with these in it, and simply saving the struct but I have no idea where to start.
This is my attempt so far, with just trying to save change. I've read up a lot on the issue and my issue (well one of them, anyway) here seems to be that I am storing pointers which after closing the program would be invalid (compounded by the fact that I also free them before exiting).
/* Saves state to file */
void Game::saveGame(string toFile) {
ofstream ofs(toFile, ios::binary);
ofs.write((char *)&this->change, sizeof(C));
/* Free memory code here */
....
exit(0);
};
/* Loads game state from file */
void Game::loadGame(string fromFile) {
ifstream ifs(fromFile, ios::binary);
ifs.read((char *)&this->change, sizeof(C));
this->change.toString(); // display load results
};
Can anyone guide me in the right direction for serializing this data? I'd like to use only standard packages, so no boost.
Thanks.
I have no idea how classes A, C or D are implemented, but that is the first question: how to serialize an object of each class. For C, you need to implement something like this:
std::ostream& operator <<(std::ostream& os, const C& c) {
// ... code to serialize c to an output stream
return os;
}
std::istream& operator >>(std::istream& is, C& c) {
// ... code to populate c contents from the input stream
return is;
}
or, if you prefer, create a write() and read() function for that class.
Well, if you want to serialize a vector<shared_ptr<C>>, it looks obvious that you don't want to serialize the pointers but the contents. So you need to dereference each of those pointers and serialize what it points to. If the size of the vector is not known before loading it (i.e., it is not always the same), you'll need to store that information too. Then you can create a pair of functions to serialize the complete vector:
std::ostream& operator <<(std::ostream& os, const std::vector<std::shared_ptr<C>>& vc) {
os << vc.size() << '\n'; // serialize the size of the vector first
for (const auto& pc : vc)
os << *pc << '\n'; // store the element pointed to by the pointer, not the pointer
return os;
}
std::istream& operator >>(std::istream& is, std::vector<std::shared_ptr<C>>& vc) {
std::size_t n = 0;
is >> n; // read the size of the vector
vc.resize(n); // set the size of the vector
for (auto& pc : vc) {
C c; // temporary object, assuming the class C has a default constructor
is >> c; // read the object stored in the stream
pc = std::make_shared<C>(c); // construct the shared pointer, assuming the class C has a copy constructor
}
return is;
}
And then,
/* Saves state to file */
void Game::saveGame(string toFile) {
ofstream ofs(toFile);
ofs << change;
....
};
/* Loads game state from file */
void Game::loadGame(string fromFile) {
ifstream ifs(fromFile);
ifs >> change;
};
I know there are still a lot of things you need to resolve. I suggest you investigate them so that you understand well how to solve your problem.
Not only are you saving pointers, you're trying to save a shared_ptr but using the wrong size.
You need to write serialization functions for all your classes, taking care to never just write the raw bits of a non-POD type. It's safest to always implement member-by-member serialization for everything, because you never know what the future will bring.
Then handling collections of them is just a matter of also storing how many there are.
Example for the Cs:
void Game::save(ofstream& stream, const C& data)
{
// Save data as appropriate...
}
void Game::saveGame(string toFile) {
ofstream ofs(toFile, ios::binary);
size_t count = change.size();
ofs.write((const char *)&count, sizeof(count));
for (vector<shared_ptr<C>>::const_iterator c = change.begin(); c != change.end(); ++c)
{
save(ofs, **c);
}
};
shared_ptr<C> Game::loadC(ifstream& stream)
{
shared_ptr<C> data(new C);
// load the object...
return data;
}
void Game::loadGame(string fromFile) {
change.clear();
size_t count = 0;
ifstream ifs(fromFile, ios::binary);
ifs.read((char *)&count, sizeof(count));
change.reserve(count);
for (size_t i = 0; i < count; ++i)
{
change.push_back(loadC(ifs));
}
};
All the error handling is missing of course - you would need to add that.
It's actually a good idea to at least start with text storage (using << and >>) instead of binary. It's easier to find bugs, or mess around with the saved state, when you can just edit it in a text editor.
Writing your own serialization is quite a challenge. Even if you do not use Boost.Serialization, I would recommend you learn how to use it and understand how it works rather than rediscovering it all yourself.
When serializing, you end up with a buffer of data whose content you have only a vague idea about. You have to save everything you need to be able to restore it, and you read it back chunk by chunk. Example (not compiled, not tested and not stylish):
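void save(ostream& out, const string& s)
{
// write the length, one separator character, then the raw characters
out << s.size() << ' ';
out.write(s.data(), s.size());
}
void load(istream& in, string& s)
{
size_t len = 0;
in >> len;
in.ignore(1); // skip the separator written by save()
s.resize(len);
if (len > 0)
in.read(&s[0], len);
}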
struct Game
{
void save(ostream& out)
{
player.save(out);
};
void load(istream& in)
{
player.load(in);
}
};
struct Player
{
void save(ostream& out)
{
// save in the same order as loading, serializing everything you need to read it back
save(out, name);
save(out, experience);
}
void load(istream& in)
{
load(in, name);
load(in, experience); //
}
};
I do not know why you would do it to yourself instead of using boost but those are some of the cases you should consider:
- type - you must figure out a way to know what "type of change" you actually have there.
- a string (vector, whatever) - size + data (then the first thing you read back from the string is the length, you resize it and copy the "length" number of characters)
- a pointer - save the data pointed by pointer, then upon deserialization you have to allocate it, construct it (usually default construct) and read back the data and reset the members to their respective values. Note: you have to avoid memory leakage.
- polymorphic pointer - ouch you have to know what type the pointer actually points to, you have to construct the derived type, save the values of the derived type... so you have to save type information
- null pointer... you have to distinguish null pointer so you know that you do not need to further read data from the stream.
- versioning - you have to be able to read a data after you added/removed a field
There is too much of it for you to get a complete answer.
Background:
I'm trying to optimize a logging system so that it uses memory-mapped files. I need to provide an std::ostream-like interface so that the logging system can write to that memory.
I have identified std::strstream (which is deprecated though) and boost::iostreams::basic_array_sink could fit my needs.
Now I want to have the logging cyclic, meaning that when the output pointer is near the end of the memory block it should start over at the beginning again.
My question is where would be the best point to start in order to implement this specific behaviour.
I'm rather overwhelmed by the std::iostreams class hierarchy and don't grasp all the internal workings as for now.
I'm uncertain whether I should/need to derive from ostream, streambuf, or both.
Are these made for being derived from, anyway?
Or, using boost::iostreams, would I need to write my own Sink?
EDIT:
The following attempt compiles and produces the expected output:
class rollingstreambuf : public std::basic_streambuf<TCHAR>
{
public:
typedef std::basic_streambuf<TCHAR> Base;
rollingstreambuf(Base::char_type* baseptr, size_t size)
{
setp(baseptr, baseptr + size);
}
protected:
virtual int_type overflow (int_type c)
{
// reset position to start of buffer
setp(pbase(), epptr());
return putchar(c);
}
virtual std::streamsize xsputn (const char* s, std::streamsize n)
{
if (n >= epptr() - pptr())
// reset position to start of buffer
setp(pbase(), epptr());
return Base::xsputn(s, n);
}
};
char buffer[100];
rollingstreambuf buf(buffer, sizeof(buffer));
std::basic_ostream<TCHAR> out(&buf);
for (int i=0; i<10; i++)
{
out << "Mumblemumble " << i << '\n';
}
out << std::ends; //write terminating NULL char
Printing the buffer gives:
Mumblemumble 6
Mumblemumble 7
Mumblemumble 8
Mumblemumble 9
(which confirms the roll-over has taken place)
What it does is that it makes the streambuf use the provided buffer as a cyclic output buffer (put area), without ever advancing the buffer window in the output sequence (stream).
(Using terminology from http://en.cppreference.com/w/cpp/io/basic_streambuf)
Now i feel very uncertain about the robustness and quality of this implementation. Please review and comment it.
This is a valid approach. overflow() should return:
traits::eof() or throws an exception if the function fails. Otherwise, returns some value other than traits::eof() to indicate success.
E.g.:
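virtual int_type overflow (int_type c)
{
// reset position to start of buffer
setp(pbase(), epptr());
// store the overflowing character, otherwise it is lost
if (!traits_type::eq_int_type(c, traits_type::eof()))
sputc(traits_type::to_char_type(c));
return traits_type::not_eof(c);
}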
xsputn() should probably write the beginning of the sequence to the end of the buffer, then rewind and write the remaining sequence to the front of the buffer. You could probably get away with the default implementation of xsputn() that calls sputc(c) for each character and then overflow() when the buffer is full.
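The wrap-around xsputn() described above could look roughly like this. It is a sketch, not a reviewed implementation: the class name ring_streambuf is hypothetical, it is char-only rather than TCHAR-based for brevity, and it assumes the block has a size greater than zero.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <ostream>
#include <streambuf>

// Hypothetical cyclic output buffer over a caller-supplied block of memory.
class ring_streambuf : public std::streambuf {
public:
    // Assumption: size > 0, and the block outlives this object.
    ring_streambuf(char* base, std::size_t size) { setp(base, base + size); }

protected:
    int_type overflow(int_type c) override {
        setp(pbase(), epptr());                  // rewind to the start
        if (traits_type::eq_int_type(c, traits_type::eof()))
            return traits_type::not_eof(c);
        *pptr() = traits_type::to_char_type(c);  // keep the overflowing char
        pbump(1);
        return c;
    }

    std::streamsize xsputn(const char* s, std::streamsize n) override {
        std::streamsize left = n;
        while (left > 0) {
            if (pptr() == epptr())
                setp(pbase(), epptr());          // wrap around to the front
            const std::streamsize room = epptr() - pptr();
            const std::streamsize chunk = std::min(room, left);
            std::memcpy(pptr(), s, static_cast<std::size_t>(chunk));
            pbump(static_cast<int>(chunk));      // fine for modest block sizes
            s += chunk;
            left -= chunk;
        }
        return n;
    }
};
```

Unlike the version in the question, a write that does not fit in the remaining space fills the tail of the block first and continues at the front, so no space at the end is skipped.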
I am trying to implement a stream buffer and I'm having trouble with making overflow() work. I resize the buffer by 10 more characters and reset the buffer using setp. Then I increment the pointer back where we left off. For some reason the output is not right:
template <class charT, class traits = std::char_traits<charT>>
class stringbuf : public std::basic_stringbuf<charT, traits>
{
public:
using char_type = charT;
using traits_type = traits;
using int_type = typename traits::int_type;
public:
stringbuf()
: buffer(10, 0)
{
this->setp(&buffer.front(), &buffer.back());
}
int_type overflow(int_type c = traits::eof())
{
if (traits::eq_int_type(c, traits::eof()))
return traits::not_eof(c);
std::ptrdiff_t diff = this->pptr() - this->pbase();
buffer.resize(buffer.size() + 10);
this->setp(&buffer.front(), &buffer.back());
this->pbump(diff);
return traits::not_eof(traits::to_int_type(*this->pptr()));
}
// ...
std::basic_string<charT> str()
{
return buffer;
}
private:
std::basic_string<charT> buffer;
};
int main()
{
stringbuf<char> buf;
std::ostream os(&buf);
os << "hello world how are you?";
std::cout << buf.str();
}
When I print the string it comes out as:
hello worl how are ou?
It's missing the d and the y. What did I do wrong?
The first thing to note is that you are deriving from std::basic_stringbuf<char> for whatever reason without overriding all of the relevant virtual functions. For example, you don't override xsputn() or sync(): whatever these functions end up doing, you'll inherit. I'd strongly recommend deriving your stream buffer from std::basic_streambuf<char> instead!
The overflow() method announces a buffer which is one character smaller than the string to the stream buffer: &buffer.back() isn't a pointer to the end of the array but to the last character in the string. Personally, I would use
this->setp(&this->buffer.front(), &this->buffer.front() + this->buffer.size());
There is no problem so far. However, after making space for more characters you omitted adding the overflowing character, i.e., the argument passed to overflow() to the buffer:
this->pbump(diff);
*this->pptr() = traits::to_char_type(c);
this->pbump(1);
There are a few more little things which are not quite right:
It is generally a bad idea to give overriding virtual functions a default parameter. The base class function already provides the default and the new default is only picked up when the function is ever called explicitly.
The string returned may contain a number of null characters at the end, because the held string is actually bigger than the sequence which was written so far unless the buffer is exactly full. You should probably implement the str() function differently:
std::basic_string<charT> str() const
{
return this->buffer.substr(0, this->pptr() - this->pbase());
}
Growing the string by a constant value is a major performance problem: the cost of writing n characters is n * n. For larger n (they don't really need to become huge) this will cause problems. You are much better off growing your buffer exponentially, e.g., doubling it every time or growing by a factor of 1.5 if you feel doubling isn't a good idea.
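Putting the points above together, a stream buffer along these lines might look like the sketch below. The class name growing_buf and the initial capacity of 16 are arbitrary assumptions, and it is written for char only to keep it short.

```cpp
#include <cstddef>
#include <ostream>
#include <streambuf>
#include <string>

// Hypothetical growable output buffer derived from std::streambuf.
class growing_buf : public std::streambuf {
public:
    growing_buf() : buffer_(16, '\0') {
        // Announce the whole string as the put area (one past the last char).
        setp(&buffer_[0], &buffer_[0] + buffer_.size());
    }

    // Return only what has been written, not the unused tail.
    std::string str() const {
        return buffer_.substr(0, static_cast<std::size_t>(pptr() - pbase()));
    }

protected:
    int_type overflow(int_type c) override {
        if (traits_type::eq_int_type(c, traits_type::eof()))
            return traits_type::not_eof(c);
        const std::ptrdiff_t used = pptr() - pbase();
        buffer_.resize(buffer_.size() * 2);       // grow geometrically
        setp(&buffer_[0], &buffer_[0] + buffer_.size());
        pbump(static_cast<int>(used));            // restore the write position
        *pptr() = traits_type::to_char_type(c);   // store the overflowing char
        pbump(1);
        return c;
    }

private:
    std::string buffer_;
};
```

Doubling keeps the total copying cost proportional to the number of characters written; a factor of 1.5 would work the same way with less slack.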
I need to share a stack of strings between processes (possibly more complex objects in the future). I've decided to use boost::interprocess but I can't get it to work. I'm sure it's because I'm not understanding something. I followed their example, but I would really appreciate it if someone with experience with using that library can have a look at my code and tell me what's wrong. The problem is it seems to work but after a few iterations I get all kinds of exceptions both on the reader process and sometimes on the writer process. Here's a simplified version of my implementation:
using namespace boost::interprocess;
class SharedMemoryWrapper
{
public:
SharedMemoryWrapper(const std::string & name, bool server) :
m_name(name),
m_server(server)
{
if (server)
{
named_mutex::remove("named_mutex");
shared_memory_object::remove(m_name.c_str());
m_segment = new managed_shared_memory (create_only,name.c_str(),65536);
m_stackAllocator = new StringStackAllocator(m_segment->get_segment_manager());
m_stack = m_segment->construct<StringStack>("MyStack")(*m_stackAllocator);
}
else
{
m_segment = new managed_shared_memory(open_only ,name.c_str());
m_stack = m_segment->find<StringStack>("MyStack").first;
}
m_mutex = new named_mutex(open_or_create, "named_mutex");
}
~SharedMemoryWrapper()
{
if (m_server)
{
named_mutex::remove("named_mutex");
m_segment->destroy<StringStack>("MyStack");
delete m_stackAllocator;
shared_memory_object::remove(m_name.c_str());
}
delete m_mutex;
delete m_segment;
}
void push(const std::string & in)
{
scoped_lock<named_mutex> lock(*m_mutex);
boost::interprocess::string inStr(in.c_str());
m_stack->push_back(inStr);
}
std::string pop()
{
scoped_lock<named_mutex> lock(*m_mutex);
std::string result = "";
if (m_stack->size() > 0)
{
result = std::string(m_stack->begin()->c_str());
m_stack->erase(m_stack->begin());
}
return result;
}
private:
typedef boost::interprocess::allocator<boost::interprocess::string, boost::interprocess::managed_shared_memory::segment_manager> StringStackAllocator;
typedef boost::interprocess::vector<boost::interprocess::string, StringStackAllocator> StringStack;
bool m_server;
std::string m_name;
boost::interprocess::managed_shared_memory * m_segment;
StringStackAllocator * m_stackAllocator;
StringStack * m_stack;
boost::interprocess::named_mutex * m_mutex;
};
EDIT Edited to use named_mutex. Original code was using interprocess_mutex which is incorrect, but that wasn't the problem.
EDIT2 I should also note that things work up to a point. The writer process can push several small strings (or one very large string) before the reader breaks. The reader breaks in a way that the line m_stack->begin() does not refer to a valid string. It's garbage. And then further execution throws an exception.
EDIT3 I have modified the class to use boost::interprocess::string rather than std::string. Still the reader fails with invalid memory address. Here is the reader/writer
//reader process
SharedMemoryWrapper mem("MyMemory", true);
std::string myString;
int x = 5;
do
{
myString = mem.pop();
if (myString != "")
{
std::cout << myString << std::endl;
}
} while (1); //while (myString != "");
//writer
SharedMemoryWrapper mem("MyMemory", false);
for (int i = 0; i < 1000000000; i++)
{
std::stringstream ss;
ss << i; //causes failure after few thousand iterations
//ss << "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA" << i; //causes immediate failure
mem.push(ss.str());
}
return 0;
There are several things that leaped out at me about your implementation. One was the use of a pointer to the named mutex object, whereas the documentation of most boost libraries tends to bend over backwards to not use a pointer. This leads me to ask for a reference to the program snippet you worked from in building your own test case, as I have had similar misadventures and sometimes the only way out was to go back to the exemplar and work forward one step at a time until I come across the breaking change.
The other thing that seems questionable is your allocation of a 65k block for shared memory, and then in your test code, looping to 1000000000, pushing a string onto your stack each iteration.
With a modern PC able to execute 1000 instructions per microsecond and more, and operating systems like Windows still doling out execution quanta in 15 millisecond chunks, it won't take long to overflow that stack. That would be my first guess as to why things are haywire.
P.S.
I just returned from fixing my name to something resembling my actual identity. Then the irony hit that my answer to your question has been staring us both in the face from the upper left hand corner of the browser page! (That is, of course, presuming I was correct, which is so often not the case in this biz.)
Well maybe shared memory is not the right design for your problem to begin with. However we would not know, because we don't know what you try to achieve in the first place.