It might not be advisable according to what I have read at a couple of places (and that's probably the reason std::string doesn't do it already), but in a controlled environment and with careful usage, I think it might be ok to write a string class which can be implicitly converted to a proper writable char buffer when needed by third party library methods (which take only char* as an argument), and still behave like a modern string having methods like Find(), Split(), SubString() etc. While I can try to implement the usual other string manipulation methods later, I first wanted to ask about the efficient and safe way to do this main task. Currently, we have to allocate a char array of roughly the maximum size of the char* output that is expected from the third party method, pass it there, then convert the return char* to a std::string to be able to use the convenient methods it allows, then again pass its (const char*) result to another method using string.c_str(). This is both lengthy and makes the code look a little messy.
Here is my very initial implementation so far:
MyString.h
#pragma once
#include<string>
using namespace std;
class MyString
{
private:
bool mBufferInitialized;
size_t mAllocSize;
string mString;
char *mBuffer;
public:
MyString(size_t size);
MyString(const char* cstr);
MyString();
~MyString();
operator char*() { return GetBuffer(); }
operator const char*() { return GetAsConstChar(); }
const char* GetAsConstChar() { InvalidateBuffer(); return mString.c_str(); }
private:
char* GetBuffer();
void InvalidateBuffer();
};
MyString.cpp
#include "MyString.h"
#include <cstring> // memcpy, strlen
MyString::MyString(size_t size)
:mAllocSize(size)
,mBufferInitialized(false)
,mBuffer(nullptr)
{
mString.reserve(size);
}
MyString::MyString(const char * cstr)
:MyString()
{
mString.assign(cstr);
}
MyString::MyString()
:MyString((size_t)1024)
{
}
MyString::~MyString()
{
if (mBufferInitialized)
delete[] mBuffer;
}
char * MyString::GetBuffer()
{
if (!mBufferInitialized)
{
mBuffer = new char[mAllocSize]{ '\0' };
mBufferInitialized = true;
}
if (mString.length() > 0)
memcpy(mBuffer, mString.c_str(), mString.length());
return mBuffer;
}
void MyString::InvalidateBuffer()
{
if (mBufferInitialized && mBuffer && strlen(mBuffer) > 0)
{
mString.assign(mBuffer);
mBuffer[0] = '\0';
}
}
Sample usage (main.cpp)
#include "MyString.h"
#include <iostream>
void testSetChars(char * name)
{
if (!name)
return;
//This length is not known to us, but the maximum
//return length is known for each function.
char str[] = "random random name";
strcpy_s(name, strlen(str) + 1, str);
}
int main(int, char**)
{
MyString cs("test initializer");
cout << cs.GetAsConstChar() << '\n';
testSetChars(cs);
cout << cs.GetAsConstChar() << '\n';
getchar();
return 0;
}
Now, I plan to call the InvalidateBuffer() in almost all the methods before doing anything else. Now some of my questions are :
Is there a better way to do it in terms of memory/performance and/or safety, especially in C++ 11 (apart from the usual move constructor/assignment operators which I plan to add to it soon)?
I had initially implemented the 'buffer' using a std::vector of chars, which was easier to implement and more C++ like, but was concerned about performance. So the GetBuffer() method would just return the beginning pointer of the resized vector of chars. Do you think there are any major pros/cons of using a vector instead of char* here?
I plan to add wide char support to it later. Do you think a union of two structs : {char,string} and {wchar_t, wstring} would be the way to go for that purpose (it will be only one of these two at a time)?
Is it too much overkill rather than just doing the usual way of passing char array pointer, converting to a std::string and doing our work with it. The third party function calls expecting char* arguments are used heavily in the code and I plan to completely replace both char* and std::string with this new string if it works.
Thank you for your patience and help!
If I understood you correctly, you want this to work:
mystring foo;
c_function(foo);
// use the filled foo
with a c_function like ...
void c_function(char * dest) {
strcpy(dest, "FOOOOO");
}
Instead, I propose this (ideone example):
template<std::size_t max>
struct string_filler {
char data[max+1];
std::string & destination;
string_filler(std::string & d) : destination(d) {
data[0] = '\0'; // paranoia
}
~string_filler() {
destination = data;
}
operator char *() {
return data;
}
};
and using it like:
std::string foo;
c_function(string_filler<80>{foo});
This way you provide a "normal" buffer to the C function with a maximum that you specify (which you should know either way ... otherwise calling the function would be unsafe). On destruction of the temporary (which, according to the standard, must happen after that expression with the function call) the string is copied (using std::string assignment operator) into a buffer managed by the std::string.
Addressing your questions:
Do you think there are any major pros/cons of using a vector instead of char* here?
Yes: Using a vector frees you from manual memory management. This is a huge pro.
I plan to add wide char support to it later. Do you think a union of two structs : {char,string} and {wchar_t, wstring} would be the way to go for that purpose (it will be only one of these two at a time)?
A union is a bad idea. How do you know which member is currently active? You need a flag outside of the union. Do you really want every string to carry that around? Instead look what the standard library is doing: It's using templates to provide this abstraction.
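As a rough sketch of what "using templates" could look like here: the wrapper below is parameterised on the character type, the same way std::basic_string is. The names basic_buffer_string, buffer(), sync(), c_fill and c_fill_wide are made up for this illustration, and the two c_fill functions merely stand in for third-party C functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <cwchar>
#include <string>

// Hypothetical wrapper: one template covers char and wchar_t, so no union
// and no runtime flag are needed.
template <typename CharT>
class basic_buffer_string {
public:
    explicit basic_buffer_string(std::size_t capacity) : text_(capacity, CharT()) {}

    // Writable, contiguous buffer for a C API (contiguity is guaranteed since C++11).
    CharT* buffer() { return &text_[0]; }
    std::size_t capacity() const { return text_.size(); }

    // Call after the C function returns: trim at the first terminator.
    void sync() {
        const std::size_t end = text_.find(CharT());
        if (end != std::basic_string<CharT>::npos) text_.resize(end);
    }

    const std::basic_string<CharT>& str() const { return text_; }

private:
    std::basic_string<CharT> text_;
};

typedef basic_buffer_string<char> buffer_string;
typedef basic_buffer_string<wchar_t> wbuffer_string;

// Stand-ins for third-party C functions (assumptions for the example).
inline void c_fill(char* dest) { std::strcpy(dest, "FOOOOO"); }
inline void c_fill_wide(wchar_t* dest) { std::wcscpy(dest, L"FOOOOO"); }
```

The caller still has to pick a capacity that the C function cannot exceed; the template only removes the need to duplicate the class per character type.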
Is it too much overkill [..]
Writing a string class? Yes, way too much.
What you want to do already exists. For example with this plain old C function:
/**
* Write n characters into buffer.
* n can't be more than size
* Return number of written characters
*/
ssize_t fillString(char * buffer, ssize_t size);
Since C++11:
std::string str;
// Resize string to be sure to have memory
str.resize(80);
auto newSize = fillString(&str[0], str.size());
str.resize(newSize);
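or without first resizing, reusing whatever size the string already has:
// str is an existing std::string
if (!str.empty()) // To avoid UB: an empty string has no writable characters
{
auto newSize = fillString(&str[0], str.size());
str.resize(newSize);
}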
But before C++11, std::string isn't guaranteed to be stored in a single chunk of contiguous memory. So you have to pass through a std::vector<char> first:
std::vector<char> v;
// Resize vector to be sure to have memory
v.resize(80);
ssize_t newSize = fillString(&v[0], v.size());
std::string str(v.begin(), v.begin() + newSize);
You can use it easily with something like Daniel's proposition
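The resize / fill / resize steps can be wrapped once so call sites stay short. This is only a sketch: filled_string and fill_demo are hypothetical names, fill_demo stands in for a C function with the fillString contract described above, and std::ptrdiff_t is used in place of the POSIX-only ssize_t.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Stand-in for a C API with the same contract as fillString in the text:
// writes at most `size` characters and returns how many it wrote.
inline std::ptrdiff_t fill_demo(char* buffer, std::ptrdiff_t size) {
    const char text[] = "random random name";
    const std::ptrdiff_t n = std::min<std::ptrdiff_t>(size, sizeof(text) - 1);
    std::memcpy(buffer, text, static_cast<std::size_t>(n));
    return n;
}

// Hypothetical helper: hands `fill` a writable buffer of max_size chars and
// returns exactly the characters it reported writing.
template <typename Fill>
std::string filled_string(std::size_t max_size, Fill fill) {
    std::string result(max_size, '\0');
    if (max_size == 0) return result;  // nothing to hand out
    const std::ptrdiff_t written =
        fill(&result[0], static_cast<std::ptrdiff_t>(result.size()));
    result.resize(written > 0 ? static_cast<std::size_t>(written) : 0);
    return result;
}
```

A call then looks like `std::string name = filled_string(80, fill_demo);`, and the result is an ordinary std::string with all the usual member functions.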
I have a program that employs an entity-component-system framework. Essentially this means that I have a collection of entities that have various components attached to them. Entities are actually just integer ID numbers, and components are attached to them by mapping the component to the specified ID number of the entity.
Now, I need to store collections of entities and the associated components to a file that can be modified later on, so basically I need a saving and loading functionality. However, being somewhat a newcomer to C++, I have hard time figuring out how to exactly do this.
Coming from Java and C#, my first choice would be to serialize the objects into, say, JSON, and then deserialize them when the JSON is loaded. However, C++ does not have any reflection features. So, the question is: how do I save and load C++ objects? I don't mean the actual file operations, I mean the way the objects and structs should be handled in order to preserve them between program launches.
One way of doing this is to create persistent objects in C++ and store your data through them.
Check out the following links:
C++ object persistence library similar to eternity
http://sourceforge.net/projects/litesql/
http://en.wikipedia.org/wiki/ODB_(C%2B%2B)
http://drdobbs.com/cpp/184408893
http://tools.devshed.com/c/a/Web-Development/C-Programming-Persistence/
C++ doesn't support persistence directly (there are proposals for adding persistence and reflection to C++ in the future). Persistence support is not as trivial as it may seem at first. The size and memory layout of the same object may vary from one platform to another. Different byte ordering, or endianness, complicates matters even further. To make an object persistent, we have to store its state on a non-volatile storage device, i.e. write the object out so that its state survives outside the program in which it was created.
Another way is to store the objects in an array and then push the array buffer to a file.
The advantages are that the disk platters don't have to waste time ramping up and that the writing can be performed contiguously.
You can increase the performance by using threads. Dump the objects to a buffer and, once done, trigger a thread to handle the output.
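To illustrate the byte-ordering point, one common way around it is to pick a byte order for the file and convert explicitly, instead of copying the in-memory representation. The helper names put_u32_le and get_u32_le are invented for this sketch, and little-endian is an arbitrary choice.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Writes a 32-bit value as four bytes, least significant first, so the
// stored bytes are the same regardless of the host byte order.
inline void put_u32_le(unsigned char* out, std::uint32_t value) {
    for (std::size_t i = 0; i < 4; ++i)
        out[i] = static_cast<unsigned char>((value >> (8 * i)) & 0xFFu);
}

// Reads the four bytes back into a value, again independent of the host.
inline std::uint32_t get_u32_le(const unsigned char* in) {
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < 4; ++i)
        value |= static_cast<std::uint32_t>(in[i]) << (8 * i);
    return value;
}
```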
Example:
The following code has not been compiled and is for illustrative purposes only.
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <fstream>
#include <string>
using std::ofstream;
#define MAX_DATA_LEN 1024 // Assuming max size of data be 1024
class stream_interface
{
public:
virtual ~stream_interface() {}
virtual void load_from_buffer(const unsigned char *& buf_ptr) = 0;
virtual size_t size_on_stream(void) const = 0;
virtual void store_to_buffer(unsigned char *& buf_ptr) const = 0;
};
struct Component
: public stream_interface
{
unsigned int entity;
std::string data;
const unsigned int data_length;
Component() : entity(0), data_length(MAX_DATA_LEN) {}
void load_from_buffer(const unsigned char *& buf_ptr)
{
// memcpy avoids an unaligned read through a casted pointer
std::memcpy(&entity, buf_ptr, sizeof(unsigned int));
buf_ptr += sizeof(unsigned int);
// The text occupies a fixed, zero-padded field of data_length bytes
const char * text = (const char *) buf_ptr;
data.assign(text, std::find(text, text + data_length, '\0'));
buf_ptr += data_length;
}
size_t size_on_stream(void) const
{
return sizeof(unsigned int) + data_length;
}
void store_to_buffer(unsigned char *& buf_ptr) const
{
std::memcpy(buf_ptr, &entity, sizeof(unsigned int));
buf_ptr += sizeof(unsigned int);
// Zero the whole field, then copy at most data_length bytes of text
std::fill(buf_ptr, buf_ptr + data_length, (unsigned char) 0);
std::memcpy(buf_ptr, data.data(), std::min<size_t>(data.size(), data_length));
buf_ptr += data_length;
}
};
int main(void)
{
Component c1;
c1.data = "Some Data";
c1.entity = 5;
ofstream data_file("ComponentList.bin", std::ios::binary);
// Determine size of buffer
size_t buffer_size = c1.size_on_stream();
// Allocate the buffer
unsigned char * buffer = new unsigned char [buffer_size];
unsigned char * buf_ptr = buffer;
// Write / store the object into the buffer.
c1.store_to_buffer(buf_ptr);
// Write the buffer to the file / stream.
data_file.write((char *) buffer, buffer_size);
data_file.close();
delete [] buffer;
return 0;
}
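The example above stores a single component in a fixed-size field. For a whole collection with variable-length text, a length-prefixed layout is one option. The sketch below is an assumption-laden illustration rather than part of the original answer: NamedComponent, save and load are hypothetical names, and the stream interface lets the same code target a std::fstream or an in-memory std::stringstream.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

struct NamedComponent {  // hypothetical component type
    std::uint32_t entity;
    std::string data;
};

// Fixed little-endian encoding for the integers in the file.
inline void write_u32(std::ostream& out, std::uint32_t v) {
    char b[4];
    for (int i = 0; i < 4; ++i) b[i] = static_cast<char>((v >> (8 * i)) & 0xFF);
    out.write(b, 4);
}

inline bool read_u32(std::istream& in, std::uint32_t& v) {
    char b[4];
    if (!in.read(b, 4)) return false;
    v = 0;
    for (int i = 0; i < 4; ++i)
        v |= static_cast<std::uint32_t>(static_cast<unsigned char>(b[i])) << (8 * i);
    return true;
}

// Layout: count, then per component: entity id, text length, text bytes.
inline void save(std::ostream& out, const std::vector<NamedComponent>& items) {
    write_u32(out, static_cast<std::uint32_t>(items.size()));
    for (const NamedComponent& c : items) {
        write_u32(out, c.entity);
        write_u32(out, static_cast<std::uint32_t>(c.data.size()));
        out.write(c.data.data(), static_cast<std::streamsize>(c.data.size()));
    }
}

// Returns false if the stream ends early or cannot be read.
inline bool load(std::istream& in, std::vector<NamedComponent>& items) {
    std::uint32_t count = 0;
    if (!read_u32(in, count)) return false;
    items.clear();
    for (std::uint32_t i = 0; i < count; ++i) {
        NamedComponent c;
        std::uint32_t length = 0;
        if (!read_u32(in, c.entity) || !read_u32(in, length)) return false;
        c.data.resize(length);
        if (length != 0 && !in.read(&c.data[0], length)) return false;
        items.push_back(c);
    }
    return true;
}
```

A production format would also want a version number and a sanity limit on the lengths it reads; both are left out to keep the sketch short.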
I have a message class which was previously a bit of a pain to work with: you had to construct the message class, tell it to allocate space for your object, and then populate the space either by construction or memberwise.
I want to make it possible to construct the message object with an immediate, inline new of the resulting object, but to do so with a simple syntax at the call site while ensuring copy elision.
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <utility>
typedef uint8_t id_t;
enum class MessageID { WorldPeace };
class Message
{
uint8_t* m_data; // current memory
uint8_t m_localData[64]; // upto 64 bytes.
id_t m_messageId;
size_t m_size; // amount of data used
size_t m_capacity; // amount of space available
// ...
public:
Message(size_t requestSize, id_t messageId)
: m_data(m_localData)
, m_messageId(messageId)
, m_size(0), m_capacity(sizeof(m_localData))
{
grow(requestSize);
}
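void grow(size_t newSize)
{
if (newSize > m_capacity)
{
// realloc returns void*, which C++ does not convert implicitly
m_data = static_cast<uint8_t*>(realloc((m_data == m_localData) ? nullptr : m_data, newSize));
assert(m_data != nullptr); // my system uses less brutal mem mgmt
m_capacity = newSize;
}
m_size = newSize;
}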
template<typename T>
T* allocatePtr()
{
size_t offset = m_size;
grow(offset + sizeof(T));
return (T*)(m_data + offset);
}
#ifdef USE_CPP11
template<typename T, typename... Args>
Message(id_t messageId, Args&&... args)
: Message(sizeof(T), messageId)
{
// we know m_data points to a large enough buffer
new ((T*)m_data) T (std::forward<Args>(args)...);
}
#endif
};
Pre-C++11 I had a nasty macro, CONSTRUCT_IN_PLACE, which did:
#define CONSTRUCT_IN_PLACE(Message, Typename, ...) \
new ((Message).allocatePtr<Typename>()) Typename (__VA_ARGS__)
And you would say:
Message outgoing(sizeof(MyStruct), MessageID::WorldPeace);
CONSTRUCT_IN_PLACE(outgoing, MyStruct, wpArg1, wpArg2, wpArg3);
With C++11, you would use
Message outgoing<MyStruct>(MessageID::WorldPeace, wpArg1, wpArg2, wpArg3);
But I find this to be messy. What I want to implement is:
template<typename T>
Message(id_t messageId, T&& src)
: Message(sizeof(T), messageId)
{
// we know m_data points to a large enough buffer
new ((T*)m_data) T (src);
}
So that the user uses
Message outgoing(MessageID::WorldPeace, MyStruct(wpArg1, wpArg2, wpArg3));
But it seems that this first constructs a temporary MyStruct on the stack turning the in-place new into a call to the move constructor of T.
Many of these messages are simple, often POD, and they are often in marshalling functions like this:
void dispatchWorldPeace(int wpArg1, int wpArg2, int wpArg3)
{
Message outgoing(MessageID::WorldPeace, MyStruct(wpArg1, wpArg2, wpArg3));
outgoing.send(g_listener);
}
So I want to avoid creating an intermediate temporary that is going to require a subsequent move/copy.
It seems like the compiler should be able to eliminate the temporary and the move and forward the construction all the way down to the in-place new.
What am I doing that is causing it not to? (GCC 4.8.1, Clang 3.5, MSVC 2013)
You won't be able to elide the copy/move in the placement new: copy elision is entirely based on the idea that the compiler knows at construction time where the object will eventually end up. Also, since copy elision actually changes the behavior of the program (after all, it won't call the respective constructor and the destructor even if they have side-effects) copy elision is limited to a few very specific cases (listed in 12.8 [class.copy] paragraph 31: essentially when returning a local variable by name, when throwing a local variable by name, when catching an exception of the correct type by value, and when copying/moving a temporary variable; see the clause for exact details). Since [placement] new is none of the contexts where the copy can be elided and the argument to the constructor is clearly not a temporary (it is named), the copy/move will never be elided. Even adding the missing std::forward<T>(...) to your constructor will not cause the copy/move to be elided:
template<typename T>
Message(id_t messageId, T&& src)
: Message(sizeof(T), messageId)
{
// placement new takes a void* anyway, i.e., no need to cast
new (m_data) T (std::forward<T>(src));
}
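The difference can be observed with a payload that counts its own moves. This is a small illustration under assumed names (Probe, Slot, from_object and from_args are not from the question): handing over a finished object costs one move, while handing over the constructor arguments costs none.

```cpp
#include <cassert>
#include <new>
#include <utility>

struct Probe {  // hypothetical payload that counts its moves
    static int moves;
    int a, b;
    Probe(int a_, int b_) : a(a_), b(b_) {}
    Probe(Probe&& other) : a(other.a), b(other.b) { ++moves; }
};
int Probe::moves = 0;

struct Slot {  // stand-in for the message buffer
    alignas(Probe) unsigned char bytes[sizeof(Probe)];

    // Takes a finished object: placement new has to move (or copy) it.
    Probe* from_object(Probe&& src) { return new (bytes) Probe(std::move(src)); }

    // Takes constructor arguments: the object is built directly in the buffer.
    template <typename... Args>
    Probe* from_args(Args&&... args) {
        return new (bytes) Probe(std::forward<Args>(args)...);
    }
};
```

Calling from_object(Probe(1, 2)) increments Probe::moves once, whereas from_args(3, 4) leaves it unchanged.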
I don't think you can explicitly specify a template parameter when calling a constructor. Thus, I think the closest you could probably get without constructing the object ahead of time and getting it copied/moved is something like this:
template <typename>
struct Tag {};
template <typename T, typename... A>
Message::Message(Tag<T>, id_t messageId, A&&... args)
: Message(sizeof(T), messageId) {
new(this->m_data) T(std::forward<A>(args)...);
}
One approach which might make things a bit nicer is using the id_t to map to the relevant type assuming that there is a mapping from message Ids to the relevant type:
typedef uint8_t id_t;
template <typename T, id_t id> struct Tag {};
struct MessageId {
static constexpr Tag<MyStruct, 1> WorldPeace{};
// ...
};
template <typename T, id_t id, typename... A>
Message::Message(Tag<T, id>, A&&... args)
: Message(sizeof(T), id) {
new(this->m_data) T(std::forward<A>(args)...);
}
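For reference, here is a self-contained sketch of the tag idea that can be compiled on its own. TinyMessage and Treaty are stand-ins invented for the example, and the class is simplified to local storage only, so it is not the Message class from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <utility>

template <typename T> struct Tag {};

struct Treaty {  // hypothetical payload
    int a, b, c;
    Treaty(int a_, int b_, int c_) : a(a_), b(b_), c(c_) {}
};

class TinyMessage {  // simplified: local storage only
    std::uint8_t id_;
    alignas(std::max_align_t) unsigned char data_[64];
public:
    // Tag<T> carries the payload type, so T is deduced from the argument;
    // constructor templates cannot be given explicit template arguments.
    template <typename T, typename... A>
    TinyMessage(Tag<T>, std::uint8_t id, A&&... args) : id_(id) {
        static_assert(sizeof(T) <= sizeof(data_), "payload too large");
        new (data_) T(std::forward<A>(args)...);
    }
    std::uint8_t id() const { return id_; }
    template <typename T> T& as() { return *reinterpret_cast<T*>(data_); }
};
```

Usage is `TinyMessage m(Tag<Treaty>{}, 1, 10, 20, 30);`, which builds the Treaty directly inside the message without an intermediate temporary.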
Foreword
The conceptual barrier that even C++2049 cannot cross is that you require all the bits that compose your message to be aligned in a contiguous memory block.
The only way C++ can give you that is through the use of the placement new operator. Otherwise, objects will simply be constructed according to their storage class (on the stack or through whatever you define as a new operator).
It means any object you pass to your payload constructor will be first constructed (on the stack) and then used by the constructor (that will most likely copy-construct it).
Avoiding this copy completely is impossible. You may have a forward constructor doing the minimal amount of copy, but still the scalar parameters passed to the initializer will likely be copied, as will any data that the constructor of the initializer deemed necessary to memorize and/or produce.
If you want to be able to pass parameters freely to each of the constructors needed to build the complete message without them being first stored in the parameter objects, it will require
the use of a placement new operator for each of the sub-objects that compose the message,
the memorization of each single scalar parameter passed to the various sub-constructors,
specific code for each object to feed the placement new operator with the proper address and call the constructor of the sub-object.
You will end up with a toplevel message constructor taking all possible initial parameters and dispatching them to the various sub-objects constructors.
I don't even know if this is feasible, but the result would be very fragile and error-prone at any rate.
Is that what you want, just for the benefit of a bit of syntactic sugar?
If you're offering an API, you cannot cover all cases. The best approach is to make something that degrades nicely, IMHO.
The simple solution would be to limit payload constructor parameters to scalar values or implement "in-place sub-construction" for a limited set of message payloads that you can control. At your level you cannot do more than that to make sure the message construction proceeds with no extra copies.
Now the application software will be free to define constructors that take objects as parameters, and then the price to pay will be these extra copies.
Besides, this might be the most efficient approach, if the parameter is something costly to construct (i.e. the construction time is greater than the copy time, so it is more efficient to create a static object and modify it slightly between each message) or if it has a greater lifetime than your function for any reason.
a working, ugly solution
First, let's start with a vintage, template-less solution that does in-place construction.
The idea is to have the message pre-allocate the right kind of memory (local buffer or dynamic) depending on the size of the object.
The proper base address is then passed to a placement new to construct the message contents in place.
#include <cstdint>
#include <cstdio>
#include <new>
typedef uint8_t id_t;
enum class MessageID { WorldPeace, Armaggedon };
#define SMALL_BUF_SIZE 64
class Message {
id_t m_messageId;
uint8_t* m_data;
uint8_t m_localData[SMALL_BUF_SIZE];
public:
// choose the proper location for contents
Message (MessageID messageId, size_t size)
{
m_messageId = (id_t)messageId;
m_data = size <= SMALL_BUF_SIZE ? m_localData : new uint8_t[size];
}
// dispose of the contents if need be
~Message ()
{
if (m_data != m_localData) delete[] m_data; // allocated with new[]
}
// let placement new know about the contents location
void * location (void)
{
return m_data;
}
};
// a macro to do the in-place construction
#define BuildMessage(msg, id, obj, ... ) \
Message msg(MessageID::id, sizeof(obj)); \
new (msg.location()) obj (__VA_ARGS__);
// example uses
struct small {
int a, b, c;
small (int a, int b, int c) :a(a),b(b),c(c) {}
};
struct big {
int lump[1000];
};
int main(void)
{
BuildMessage(msg1, WorldPeace, small, 1, 2, 3)
BuildMessage(msg2, Armaggedon, big)
}
This is just a trimmed down version of your initial code, with no templates at all.
I find it relatively clean and easy to use, but to each his own.
The only inefficiency I see here is the static allocation of 64 bytes that will be useless if the message is too big.
And of course all type information is lost once the messages are constructed, so accessing their contents afterward would be awkward.
About forwarding and construction in place
Basically, the new && qualifier does no magic. To do in-place construction, the compiler needs to know the address that will be used for object storage before calling the constructor.
Once you've invoked an object creation, the memory has been allocated and the && thing will only allow you to use that address to pass ownership of the said memory to another object without resorting to useless copies.
You can use templates to recognize a call to the Message constructor involving a given class passed as message contents, but that will be too late: the object will have been constructed before your constructor can do anything about its memory location.
I can't see a way to create a template on top of the Message class that would defer an object construction until you have decided at which location you want to construct it.
However, you could work on the classes defining the object contents to have some in-place construction automated.
This will not solve the general problem of passing objects to the constructor of the object that will be built in place.
To do that, you would need the sub-objects themselves to be constructed through a placement new, which would mean implementing a specific template interface for each of the initializers, and have each object provide the address of construction to each of its sub-objects.
Now for syntactic sugar.
To make the ugly templating worth the while, you can specialize your message classes to handle big and small messages differently.
The idea is to have a single lump of memory to pass to your sending function. So in case of small messages, the message header and contents are defined as local message properties, and for big ones, extra memory is allocated to include the message header.
Thus the magic DMA used to propel your messages through the system will have a clean data block to work with either way.
Dynamic allocations will still occur once per big message, and never for small ones.
#include <cstdint>
#include <new>
// ==========================================================================
// Common definitions
// ==========================================================================
// message header
enum class MessageID : uint8_t { WorldPeace, Armaggedon };
struct MessageHeader {
MessageID id;
uint8_t __padding; // one free byte here
uint16_t size;
};
// small buffer size
#define SMALL_BUF_SIZE 64
// dummy send function
int some_DMA_trick(int destination, void * data, uint16_t size);
// ==========================================================================
// Macro solution
// ==========================================================================
// -----------------------------------------
// Message class
// -----------------------------------------
class mMessage {
// local storage defined even for big messages
MessageHeader m_header;
uint8_t m_localData[SMALL_BUF_SIZE];
// pointer to the actual message
MessageHeader * m_head;
public:
// choose the proper location for contents
mMessage (MessageID messageId, uint16_t size)
{
m_head = size <= SMALL_BUF_SIZE
? &m_header
: (MessageHeader *) new uint8_t[size + sizeof (m_header)];
m_head->id = messageId;
m_head->size = size;
}
// dispose of the contents if need be
~mMessage ()
{
// the big buffer was allocated with new uint8_t[], so release it the same way
if (m_head != &m_header) delete[] (uint8_t *) m_head;
}
// let placement new know about the contents location
void * location (void)
{
return m_head+1;
}
// send a message
int send(int destination)
{
return some_DMA_trick (destination, m_head, (uint16_t)(m_head->size + sizeof (*m_head)));
}
};
// -----------------------------------------
// macro to do the in-place construction
// -----------------------------------------
#define BuildMessage(msg, obj, id, ... ) \
mMessage msg (MessageID::id, sizeof(obj)); \
new (msg.location()) obj (__VA_ARGS__);
// ==========================================================================
// Template solution
// ==========================================================================
#include <utility>
// -----------------------------------------
// template to check storage capacity
// -----------------------------------------
template<typename T>
struct storage
{
enum { local = sizeof(T)<=SMALL_BUF_SIZE };
};
// -----------------------------------------
// base message class
// -----------------------------------------
class tMessage {
protected:
MessageHeader * m_head;
tMessage(MessageHeader * head, MessageID id, uint16_t size)
: m_head(head)
{
m_head->id = id;
m_head->size = size;
}
public:
int send(int destination)
{
return some_DMA_trick (destination, m_head, (uint16_t)(m_head->size + sizeof (*m_head)));
}
};
// -----------------------------------------
// general message template
// -----------------------------------------
template<bool local_storage, typename message_contents>
class aMessage {};
// -----------------------------------------
// specialization for big messages
// -----------------------------------------
template<typename T>
class aMessage<false, T> : public tMessage
{
public:
// in-place constructor
template<class... Args>
aMessage(MessageID id, Args&&... args)
: tMessage(
(MessageHeader *)new uint8_t[sizeof(T)+sizeof(*m_head)], // dynamic allocation
id, sizeof(T))
{
new (m_head+1) T(std::forward<Args>(args)...);
}
// destructor
~aMessage ()
{
delete[] (uint8_t *) m_head; // allocated with new uint8_t[]
}
// syntactic sugar to access contents
T& contents(void) { return *(T*)(m_head+1); }
};
// -----------------------------------------
// specialization for small messages
// -----------------------------------------
template<typename T>
class aMessage<true, T> : public tMessage
{
// message body defined locally
MessageHeader m_header;
uint8_t m_data[sizeof(T)]; // no need for 64 bytes here
public:
// in-place constructor
template<class... Args>
aMessage(MessageID id, Args&&... args)
: tMessage(
&m_header, // local storage
id, sizeof(T))
{
new (m_head+1) T(std::forward<Args>(args)...);
}
// syntactic sugar to access contents
T& contents(void) { return *(T*)(m_head+1); }
};
// -----------------------------------------
// helper macro to hide template ugliness
// -----------------------------------------
#define Message(T) aMessage<storage<T>::local, T>
// something like typedef aMessage<storage<T>::local, T> Message<T>
// ==========================================================================
// Example
// ==========================================================================
#include <cstdio>
#include <cstring>
// message sending
int some_DMA_trick(int destination, void * data, uint16_t size)
{
printf("sending %d bytes #%p to %08X\n", size, data, destination);
return 1;
}
// some dynamic contents
struct gizmo {
char * s;
gizmo(void) { s = nullptr; };
gizmo (const gizmo& g) = delete;
gizmo (const char * msg)
{
s = new char[strlen(msg) + 8]; // room for '#', a few '*' move markers and the terminator
strcpy(s, msg);
strcat(s, "#");
}
gizmo (gizmo&& g)
{
s = g.s;
g.s = nullptr;
strcat(s, "*");
}
~gizmo()
{
delete[] s;
}
gizmo& operator=(gizmo g)
{
std::swap(s, g.s);
return *this;
}
bool operator!=(gizmo& g)
{
return strcmp (s, g.s) != 0;
}
};
// some small contents
struct small {
int a, b, c;
gizmo g;
small (gizmo g, int a, int b, int c)
: a(a), b(b), c(c), g(std::move(g))
{
}
void trace(void)
{
printf("small: %d %d %d %s\n", a, b, c, g.s);
}
};
// some big contents
struct big {
gizmo lump[1000];
big(const char * msg = "?")
{
for (size_t i = 0; i != sizeof(lump) / sizeof(lump[0]); i++)
lump[i] = gizmo (msg);
}
void trace(void)
{
printf("big: set to ");
gizmo& first = lump[0];
for (size_t i = 1; i != sizeof(lump) / sizeof(lump[0]); i++)
if (lump[i] != first) { printf(" Erm... mostly "); break; }
printf("%s\n", first.s);
}
};
int main(void)
{
// macros
BuildMessage(mmsg1, small, WorldPeace, gizmo("Hi"), 1, 2, 3);
BuildMessage(mmsg2, big , Armaggedon, "Doom");
((small *)mmsg1.location())->trace();
((big *)mmsg2.location())->trace();
mmsg1.send(0x1000);
mmsg2.send(0x2000);
// templates
Message (small) tmsg1(MessageID::WorldPeace, gizmo("Hello"), 4, 5, 6);
Message (big ) tmsg2(MessageID::Armaggedon, "Damnation");
tmsg1.contents().trace();
tmsg2.contents().trace();
tmsg1.send(0x3000);
tmsg2.send(0x4000);
}
output:
small: 1 2 3 Hi#*
big: set to Doom#
sending 20 bytes #0xbf81be20 to 00001000
sending 4004 bytes #0x9e58018 to 00002000
small: 4 5 6 Hello#**
big: set to Damnation#
sending 20 bytes #0xbf81be0c to 00003000
sending 4004 bytes #0x9e5ce50 to 00004000
Arguments forwarding
I see little point in doing constructor parameters forwarding here.
Any bit of dynamic data referenced by the message contents would have to be either static or copied into the message body, otherwise the referenced data would vanish as soon as the message creator would go out of scope.
If the users of this wonderfully efficient library start passing around magic pointers and other global data inside messages, I wonder how the global system performance will like that. But that's none of my business, after all.
Macros
I resorted to a macro to hide the template ugliness in type definition.
If someone has an idea to get rid of it, I'm interested.
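One possible way to drop the macro is a C++11 alias template, which is roughly what the "something like typedef" comment is reaching for. The sketch below uses stripped-down stand-ins for aMessage and the payloads, a constexpr bool in place of the enum, and kSmallBufSize in place of SMALL_BUF_SIZE; those substitutions are assumptions made to keep it self-contained.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kSmallBufSize = 64;  // mirrors SMALL_BUF_SIZE

template <typename T>
struct storage {
    static constexpr bool local = (sizeof(T) <= kSmallBufSize);
};

// Stripped-down stand-ins for the two specialisations in the text.
template <bool local_storage, typename T> struct aMessage;
template <typename T> struct aMessage<true, T>  { static constexpr bool is_local = true; };
template <typename T> struct aMessage<false, T> { static constexpr bool is_local = false; };

// Alias template: picks the specialisation from the payload size.
template <typename T>
using Message = aMessage<storage<T>::local, T>;

struct small_payload { int a, b, c; };
struct big_payload { int lump[1000]; };
```

With the alias, the declaration reads `Message<small> tmsg1(...)` instead of `Message (small) tmsg1(...)`.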
Efficiency
The template variation requires an extra forwarding of the contents parameters to reach the constructor. I can't see how that could be avoided.
The macro version wastes 68 bytes of memory for big messages, and some memory for small ones (64 - sizeof (contents object)).
Performance-wise, this extra bit of memory is the only gain the templates offer. Since all these objects are supposedly constructed on the stack and live for a handful of microseconds, it is pretty negligible.
Compared to your initial version, this one should handle message sending more efficiently for big messages. Here again, if these messages are rare and only offered for convenience, the difference is not terribly useful.
The template version maintains a single pointer to the message payload, that could be spared for small messages if you implemented a specialized version of the send function.
Hardly worth the code duplication, IMHO.
A last word
I think I know pretty well how an operating system works and what the performance concerns might be. I wrote quite a few real-time applications, plus some drivers and a couple of BSPs in my time.
I also saw more than once a very efficient system layer ruined by too permissive an interface that allowed application software programmers to do the silliest things without even knowing.
That is what triggered my initial reaction.
If I had my say in global system design, I would forbid all these magic pointers and other under-the-hood mingling with object references, to limit non-specialist users to an innocuous use of system layers, instead of allowing them to inadvertently spread cockroaches through the system.
Unless the users of this interface are template- and real-time-savvy, they will not understand a bit of what is going on beneath the syntactic sugar crust, and might very soon shoot themselves (and their co-workers and the application software) in the foot.
Suppose a poor application software programmer adds a puny field to one of his structs and unknowingly crosses the 64-byte barrier. All of a sudden the system performance will crumble, and you will need Mr template & real time expert to explain to the poor guy that what he did killed a lot of kittens.
Even worse, the system degradation might be progressive or unnoticeable at first, so one day you might wake up with thousands of lines of code that did dynamic allocations for years without anybody noticing, and the global overhaul to correct the problem might be huge.
If, on the other hand, all people in your company are munching at templates and mutexes for breakfast, syntactic sugar is not even required in the first place.
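One way to make the 64-byte barrier visible, rather than silent, is a compile-time check next to each payload that is supposed to stay small. This is only a sketch: fits_locally, REQUIRE_LOCAL_STORAGE, kSmallBufSize and peace_payload are hypothetical names that do not appear in the code above.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

constexpr std::size_t kSmallBufSize = 64;  // mirrors SMALL_BUF_SIZE

// True when a payload of type T fits the local buffer.
template <typename T>
struct fits_locally
    : std::integral_constant<bool, (sizeof(T) <= kSmallBufSize)> {};

// Opt-in guard: the build fails as soon as the struct outgrows the buffer.
#define REQUIRE_LOCAL_STORAGE(T) \
    static_assert(fits_locally<T>::value, \
                  #T " no longer fits the local buffer; sending it would allocate")

struct peace_payload { int a, b, c; };
REQUIRE_LOCAL_STORAGE(peace_payload);
```

Adding a field that pushes peace_payload past 64 bytes then produces a compiler error at the struct, instead of a dynamic allocation nobody notices.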
Consider a typical function that fills in a buffer:
const char* fillMyBuffer( char* buf, int size );
Suppose this function fills the buffer with some useful data, that I want to use almost immediately after the call, and then I want to get rid of the buffer.
An efficient way of doing this is to allocate on the stack:
doStuff();
{
char myBuf[BUF_LEN];
const char* pBuf = fillMyBuffer( myBuf, BUF_LEN );
processBuffer( pBuf );
}
doOtherStuff();
So this is great for my library because the buffer is allocated on the stack - being essentially no cost to allocate, use and discard. It lasts the entire scope of the containing braces.
But I have a library where I do this pattern all the time. I'd like to automate this a little. Ideally I'd like code that looks like this:
doStuff();
{
// tricky - the returned buffer lasts the entire scope of the braces.
const char* pBuf = fillMyBufferLocal();
processBuffer( pBuf );
}
doOtherStuff();
But how to achieve this?
I did the following, which seems to work, but I know is counter to the standard:
class localBuf
{
public:
operator char* () { return &mBuf[0]; }
char mBuf[BUF_LEN];
};
#define fillMyBufferLocal() fillMyBuffer( localBuf(), BUF_LEN );
As a practical matter, the buffer stays on the stack for the entire lifetime of the containing braces. But the standard says that the temporary object only has to last until the end of the full expression, i.e. until the function has returned. So technically it's just as unsafe as if I'd allocated the buffer on the stack inside the function.
Is there a safe way to achieve this?
I would generally recommend your original solution. It separates the allocation of the buffer from filling it. However, if you want to implement this fillMyBufferLocal alternative, it will have to dynamically allocate the buffer and return a pointer to it. Of course, if you return a raw pointer to dynamically allocated memory, it's very unclear that the memory should later be destroyed. Instead, return a smart pointer that encapsulates the appropriate ownership:
std::unique_ptr<char[]> fillMyBufferLocal()
{
std::unique_ptr<char[]> buffer(new char[BUF_LEN]);
// Fill it
return buffer;
}
Then you can use it like so:
auto buffer = fillMyBufferLocal();
processBuffer(buffer.get());
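For completeness, here is a self-contained sketch of this approach. The fillMyBuffer stub (which just writes the text "payload") and the BUF_LEN value of 64 are assumptions made up for the illustration; the real library function would take their place.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <memory>

const std::size_t BUF_LEN = 64; // assumed size, for illustration only

// Hypothetical stand-in for the library's fill function.
inline const char* fillMyBuffer(char* buf, int size)
{
    assert(size > 0);
    std::snprintf(buf, static_cast<std::size_t>(size), "%s", "payload");
    return buf;
}

// Allocates the buffer, fills it, and hands ownership to the caller.
inline std::unique_ptr<char[]> fillMyBufferLocal()
{
    std::unique_ptr<char[]> buffer(new char[BUF_LEN]()); // zero-initialised
    fillMyBuffer(buffer.get(), static_cast<int>(BUF_LEN));
    return buffer; // moved out; freed when the caller's unique_ptr is destroyed
}
```

The price compared with the stack version is one heap allocation per call.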
I do not think you should want to do this. It just makes the code harder to understand.
Automatic storage duration means that when an object goes out of scope, it is destroyed. Here you want to trick the system into something that behaves like creating an object with automatic storage duration (i.e. allocates on the stack), but without respecting the corresponding rules (i.e. without being destroyed when returning from fillMyBuffer()).
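To make the lifetime issue concrete, here is a minimal sketch that records when a temporary passed to a function is destroyed. The names Tracer, passThrough, events and runTemporaryDemo are hypothetical, invented for this illustration. A temporary lives until the end of the full expression that created it, so a pointer into it is already dangling on the next statement.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical log of lifetime events, used only for this illustration.
inline std::vector<std::string>& events()
{
    static std::vector<std::string> log;
    return log;
}

// Hypothetical stand-in for a class that owns a buffer and converts to char*.
struct Tracer
{
    char mBuf[16];
    Tracer()  { mBuf[0] = '\0'; events().push_back("ctor"); }
    ~Tracer() { events().push_back("dtor"); }
    operator char*() { return mBuf; }
};

// Stand-in for a fill function: returns the pointer it was given.
inline const char* passThrough(char* buf)
{
    assert(buf != nullptr);
    events().push_back("call");
    return buf;
}

// Passes a temporary to the function and returns the recorded event order.
inline std::vector<std::string> runTemporaryDemo()
{
    events().clear();
    const char* p = passThrough(Tracer()); // the temporary dies at the ';'
    events().push_back("next statement");  // p is dangling from here on
    (void)p;                               // never dereference it
    return events();
}
```

The destructor entry is recorded before "next statement", even though the enclosing braces are still open.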
The closest, meaningful thing you can do in my opinion is to use a global buffer that fillMyBuffer() can reuse, or let that buffer be a static variable inside fillMyBuffer(). For instance:
template<int BUF_LEN = 255>
const char* fill_my_buffer()
{
static char myBuf[BUF_LEN];
// Fill...
return myBuf;
}
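One caveat of the static-buffer variant is worth a sketch: every call with the same BUF_LEN returns the same storage, so a second call overwrites what the first one returned, and concurrent calls from several threads would race on it. The fill_with function below, which simply copies a string into the buffer, is a hypothetical placeholder for whatever really fills it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical variant of fill_my_buffer() that copies `text` into the
// function-local static buffer, truncating if it does not fit.
template<std::size_t BUF_LEN = 255>
const char* fill_with(const char* text)
{
    assert(text != nullptr);
    static char myBuf[BUF_LEN];
    std::snprintf(myBuf, BUF_LEN, "%s", text); // always null-terminates
    return myBuf;
}

// Returns true if two successive calls hand back the same storage.
inline bool secondCallClobbersFirst()
{
    const char* first  = fill_with<>("first");
    const char* second = fill_with<>("second");
    // Both pointers refer to the one static array, so `first` now reads "second".
    return first == second && std::strcmp(first, "second") == 0;
}
```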
However, I strongly suggest reconsidering your requirements, and either:
Keep using the solution you are currently adopting (i.e. transparently allocate on the stack); or
Allocate the buffer dynamically inside fillMyBuffer() and return a RAII wrapper (like a unique_ptr) to this dynamically allocated buffer.
UPDATE:
As a last, desperate attempt, you could define a macro that does the allocation and the invocation of fillMyBuffer() for you:
#define PREPARE_BUFFER(B, S) \
char B##_storage[S]; \
const char* B = fillMyBuffer(B##_storage, S)
You would then use it this way:
PREPARE_BUFFER(pBuf, 256);
processBuffer(pBuf);
You could write a class that contains a stack-based buffer and converts to char const *, e.g.
#include <array>
void processBuffer(char const * buffer);
// The buffer parameter is non-const so that the function can write to it.
char const * fillMyBuffer(char * buffer, int size);
int const BUF_LEN = 123;
class Wrapper
{
public:
Wrapper(char const * (*fill)(char *, int))
{
fill(m_buffer.data(), static_cast<int>(m_buffer.size()));
}
operator char const * () const { return m_buffer.data(); }
private:
std::array<char, BUF_LEN> m_buffer;
};
void foo()
{
Wrapper wrapper(fillMyBuffer);
processBuffer(wrapper);
}
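As a variation on this idea, here is a sketch with the buffer size as a template parameter. It assumes the fill function takes a writable char* so that it can actually store data. LocalBuffer and fillWithHello are made-up names for the illustration, and fillWithHello stands in for the real library call.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Owns a fixed-size buffer with automatic storage duration and fills it on
// construction. The buffer lives exactly as long as the LocalBuffer object.
template<std::size_t N>
class LocalBuffer
{
    static_assert(N > 0, "buffer needs room for the terminator");
public:
    using FillFn = const char* (*)(char*, int);

    explicit LocalBuffer(FillFn fill)
    {
        assert(fill != nullptr);
        m_buffer.fill('\0');
        fill(m_buffer.data(), static_cast<int>(m_buffer.size()));
    }

    operator const char*() const { return m_buffer.data(); }

private:
    std::array<char, N> m_buffer;
};

// Hypothetical filler used only for this example.
inline const char* fillWithHello(char* buf, int size)
{
    std::snprintf(buf, static_cast<std::size_t>(size), "%s", "hello");
    return buf;
}
```

Declaring `LocalBuffer<16> buf(fillWithHello);` inside a block and passing `buf` wherever a const char* is expected keeps the storage alive until the closing brace, which is the lifetime the question asks for.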
I have a packet struct that contains a variable-length string, for example:
BYTE StringLen;
String MyString; //String is not a real type, just trying to represent a string of unknown size
My question is how I can implement this packet as a struct without knowing the size of its members (in this case strings). Here is an example of how I want it to look:
void ProcessPacket (PacketStruct* packet)
{
pointer = &packet.MyString;
}
I think it's not possible, since the compiler doesn't know the size of the string until run time. So how can I make it look high-level and comprehensible?
The reason I need structs is to document every packet, so that the user doesn't actually have to look at any of the functions that analyze the packet.
So I can sum the question up as: is there a way to declare a struct with members of undefined size, or something close to a struct?
I would recommend a shell class that just interprets the packet data.
struct StringPacket {
char *data_;
StringPacket (char *data) : data_(data) {}
unsigned char len () const { return *data_; }
std::string str () const { return std::string(data_+1, len()); }
};
As mentioned in comments, you wanted a way to treat a variable-sized packet like a struct. The old C way to do that was to create a struct that looked like this:
struct StringPacketC {
unsigned char len_;
char str_[1]; /* Modern C allows char str_[]; but C++ doesn't */
};
And then, cast the data (remember, this is C code):
struct StringPacketC *strpack = (struct StringPacketC *)packet;
However, this enters undefined behavior: to access the full range of data in strpack, you have to read beyond the 1-byte array bound declared in the struct. It is nevertheless a commonly used technique in C.
In C++, you don't have to resort to such a hack, because you can define accessor methods to treat the variable length data appropriately.
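As a sketch of the accessor approach with bounds checking added, the view below assumes the wire format is one length byte followed by that many characters (as in the question) and that the caller knows how many bytes were received. StringPacketView and its members are illustrative names, not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Non-owning view over a packet laid out as: [len : 1 byte][len characters].
class StringPacketView
{
public:
    // Returns false if the buffer is too short to hold the advertised string.
    static bool parse(const unsigned char* data, std::size_t size,
                      StringPacketView& out)
    {
        if (data == nullptr || size < 1) return false;
        const std::size_t len = data[0];
        if (size - 1 < len) return false;
        out.data_ = data;
        return true;
    }

    // The accessors require that parse() succeeded on this view.
    std::size_t length() const { assert(data_ != nullptr); return data_[0]; }

    // Copies the payload into an owning std::string.
    std::string str() const
    {
        return std::string(reinterpret_cast<const char*>(data_ + 1), length());
    }

    // Total bytes consumed, useful for locating the next field.
    std::size_t wireSize() const { return 1 + length(); }

private:
    const unsigned char* data_ = nullptr;
};
```

Because the view only stores a pointer, it must not outlive the packet buffer it was parsed from.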
You can copy the string into a high-level std::string (at least, if my guess that String is a typedef for const char* is correct):
void ProcessPacket( const PacketStruct& packet )
{
std::string highLevelString( packet.MyString,
static_cast< size_t >( packet.StringLen ) );
...
}
A simple variant according to your posting would be:
struct PacketStruct {
std::string MyString;
size_t length () const { return MyString.length(); }
const char* operator & () const { return MyString.c_str(); }
};
This can be used (almost) as you desired above:
void ProcessPacket (const PacketStruct& packet)
{
const char * pointer = &packet;
size_t length = packet.length();
std::cout << pointer << '\t' << length << std::endl;
}
and should be invoked like:
int main()
{
PacketStruct p;
p.MyString ="Hello";
ProcessPacket(p);
}
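Going the other way, a std::string member can be written back to the wire format when the packet is sent. The helper below is only a sketch: it assumes the one-byte length prefix from the question, and appendLengthPrefixed is a made-up name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Appends `text` to `out` as [len : 1 byte][len characters].
// Returns false (leaving `out` untouched) if text is longer than 255 bytes,
// because the length would not fit into the single prefix byte.
inline bool appendLengthPrefixed(const std::string& text,
                                 std::vector<unsigned char>& out)
{
    if (text.size() > 255) return false;
    const std::size_t before = out.size();
    out.push_back(static_cast<unsigned char>(text.size()));
    out.insert(out.end(), text.begin(), text.end());
    assert(out.size() == before + 1 + text.size()); // sanity check
    return true;
}
```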