Out-parameters and move semantics

Out-parameters and move semantics - c++

Consider a case of a lock-free concurrent data structure where a pop() operation needs to return an item or false if the cointainer is empty (rather than blocking or throwing). The data structure is templated on a user type T, which can potentially be large (but also could be lightweight, and I want things to be efficient in either case). T has to be at least movable, but I don't want it to have to be copyable.
I was thinking that the function signature would be bool DS<T>::pop(T &item) so the item is extracted as an out-parameter rather than return value (which instead is used to indicate success or failure). However, how do I actually pass it out? Assume there's an underlying buffer. Would I do item = std::move(_buff[_tail])—does it make sense to move into a reference out-parameter? A downside is that the user would have to pass in a default-constructed T, which goes a bit against effective RAII because the result is an object that hasn't actually initialized its resources if the function fails.
Another option is returning an std::pair<bool, T> rather than using an out-parameter, but there again needs to be a default-constructible T which holds no resource in the case of failure, for the return std::make_pair(false, T).
A third option would be returning the item as std::unique_ptr<T>, but this incurs useless overhead in the case of T being a pointer or another lightweight type. While I could store just pointers in the data structure with the actual items stored externally, that incurs not just the penalty of additional dereferences and cache misses, but also removes the natural padding that items stored directly in the buffer add and help minimize producer and consumer threads from hitting the same cache lines.

#include <boost/optional.hpp>
#include <string>
template<class T>
struct atomic_queue
{
using value_type = T;
auto pop() -> boost::optional<T>
{
boost::optional<T> result;
/*
* insert atomic ops here, optionally filling result
*/
return result;
};
auto push(T&& arg) -> bool
{
/*
* insert atomic ops here, optionally stealing arg
*/
return true;
};
static auto make_empty_result() {
return boost::optional<T>();
}
};
struct difficult {
difficult(std::string);
difficult() = delete;
difficult(difficult const&) = delete;
difficult& operator=(difficult const&) = delete;
difficult(difficult &&) = default;
difficult& operator=(difficult &&) = default;
};
extern void spin();
int main()
{
atomic_queue<difficult> q;
auto d = difficult("arg");
while(not q.push(std::move(d)))
spin();
auto popped = q.make_empty_result();
while(not (popped = q.pop()))
spin();
auto& val = popped.get();
}

Related

Copyable C++ coroutine with data

I have written a forward iterator that iterates over the nodes of a graph in order of a (preorder/postorder/inorder) DFS spanning tree. Since it is quite complicated compared to writing a simple DFS and calling a callback for each encountered node, I thought I could use C++20 coroutines to simplify the code of the iterator.
However, C++20 coroutines are not copyable (much less so if they are stateful) but iterators should better be copyable!
Is there any way I could still use some coroutine-like code for my iterator?
Note: I want all iterators to iterate independently from each other. If I copy an iterator and then call ++ on it, then only the iterator should have advanced, but not its copy.

I figured that, with the Miro Knejp "goto-hack", something resembling copyable co-routines is possible as follows (this toy example just counts "+1" "*2" until a certain value but it illustrates the point).
(1) this is just a simple wrapper for the actual function
template<class Func, class Data>
struct CopyableCoroutine {
Data data;
Func func;
CopyableCoroutine(const Data& _data, const Func& _func): data(_data), func(_func) {}
bool done() const { return data.done(); }
template<class... Args> decltype(auto) operator()(Args&&... args) {
return func(data, std::forward<Args>(args)...);
}
};
(2) this is where the magic happens
struct my_stack {
int n;
int next_label_idx = 0;
bool done() const { return n > 30; }
};
int wierd_count(my_stack& coro_data) {
static constexpr void* jump_table[] = {&&beginning, &&before_add, &&after_add, &&the_end};
goto* jump_table[coro_data.next_label_idx];
beginning:
while(coro_data.n <= 32) {
coro_data.next_label_idx = 1;
return coro_data.n;
before_add:
coro_data.next_label_idx = 2;
++coro_data.n; // odd steps: add one
return coro_data.n;
after_add:
coro_data.next_label_idx = 3;
coro_data.n *= 2; // even steps: multiply 2
}
the_end:
return -1;
}
play with it here.
NOTE: unfortunately, this "dirty hack" requires some extra work and I would be super happy if that could be avoided somehow. I'd really love to see C++ native coroutines that can be copied if the user promises that its stack is copyable.

Const lock wrapper for STL containers

I'm trying to create a wrapper for an std::vector (or any other container from STL, if possible) that can "lock" and "unlock" the const state of a vector that it's holding.
For example, if I create an object of that wrapper, I want to be able to do something like this:
int main()
{
ConstLockVectorWrapper<int> myWrapper(std::vector<int>{}); // Here I pass an empty vector in the constructor parameters,
// which means that my wrapper will be holding an empty vector
// By default the vector inside my wrapper is not locked,
// I can change its size and the values that it holds
myWrapper.get().push_back(10); // ok
myWrapper.get().push_back(20); // ok
myWrapper.get().at(0) = 5; // ok
print(myWrapper.get()); // Prints 5 20
myWrapper.lock(); // Now I made the vector inside my wrapper unchangable
myWrapper.get().push_back(30); // error, the vector is locked
myWrapper.get().at(0) = 55; // error
print(myWrapper.get()); // ok
myWrapper.unlock(); // Now I can change my vector's size and its values again
_getch();
return 0;
}
The only solution (that's not working, unfortunately) I've got, is to create a const reference (const std::vector<T> &) and a regular reference (td::vector<T> &) inside a wrapper class, and bound them to the main vector in our wrapper class.
So, this is what I've done:
template <typename T>
class ConstLockVectorWrapper {
public:
ConstLockVectorWrapper(const std::vector<T> & vec)
: wrappedVector(vec), wrappedVectorRef(wrappedVector), wrappedVectorConstRef(wrappedVector), constLock(false)
{}
void lock()
{
if (constLock) // if the vector is already locked, we just exit the function
return;
// else we lock the vector
constLock = true;
}
void unlock()
{
if (!constLock) // if the vector is already unlocked (changable), we just exit the function
return;
// else we unlock the vector
constLock = false;
}
return_type get() // I need to return a const std::vector<T> & if constLock == true, and std::vector<T> & otherwise, what return type should I put in here?
{
if (constLock)
return wrappedVectorConstRef;
else
return wrappedVectorRef;
}
private:
bool constLock;
std::vector<T> wrappedVector;
// refs
std::vector<T> & wrappedVectorRef;
const std::vector<T> & wrappedVectorConstRef;
};
Of course, it doesn't work. Just because I don't know what to put in the return type of my get() fucntion.
I've tried using trailing return type, didn't work:
template <typename T>
class ConstLockVectorWrapper {
public:
// ...
private:
bool constLock;
std::vector<T> wrappedVector;
// refs
std::vector<T> & wrappedVectorRef;
const std::vector<T> & wrappedVectorConstRef;
public:
auto get() -> decltype((constLock ? wrappedVectorConstRef : wrappedVectorRef))
{
if (constLock)
return wrappedVectorConstRef;
else
return wrappedVectorRef;
}
};
I can't come up with any solution that will actually work, because I'm not so good at C++ yet.
So I'm asking for your help with my problem. Any suggestions or hints to solve this problem would be appreciated!
Thanks
PS
My main goal is to make my wrapper container-type-independent, so it can "lock" and "unlock" the const state of the container it's holding, independently of its type.
And here's the print() function I used in the first code snippet:
template <typename Container>
void print(const Container & c)
{
for (const auto & var : c)
std::cout << var << std::endl;
}

Fundamentally, a method always returns the same thing. The same type. Every time. It's not possible, in C++, to have a method sometimes return one type, and another type at other times. C++ does not work this way.
So, the initial approach would be to have get() return a proxy object with a state. Using, roughly, the same classes and names from your question:
class return_type {
bool is_const;
std::vector<T> &wrapped_vec;
public:
return_type(bool is_constArg,
std::vector<T> &wrapped_vecArg)
: is_const(is_constArg), wrapped_vec(wrapped_vecArg)
{
}
void push_back(T &&t)
{
if (is_const)
throw std::runtime_error(); // Or, whatever...
wrapped_vec.push_back(std::forward<T>(t));
}
// return_type will have to implement, and baby-sit all other
// methods you wish to invoke on the underlying vector.
};
return_type get()
{
return return_type(constLock);
}
This is simple, but crude and somewhat tedious. You would have to implement every std::vector method you need to use in the return_type proxy.
A better approach would be to take advantage of C++11 lambdas. This will avoid the need to reimplement every wheel, at an expense of some additional code bloat. But, big deal. RAM is cheap, these days. Instead of get() and return_type, you will now be implementing two template methods in your wrapper: get_const() and get_mutable(). Each one of them takes a lambda parameter and invokes it and, if all goes well, passing it the wrapped vector as an argument:
template<typename lambda>
void get_mutable(lambda &&l)
{
if (constLock)
throw std::runtime_error(); // Or, whatever...
l(wrapped_vec);
}
template<typename lambda>
void get_const(lambda &&l)
{
l(const_cast<const std::vector<T> &>(wrapped_vec));
}
The only thing you now need to decide is whether you need access a mutable or a constant vector, and pick the right getter:
myWrapper.get_mutable( [&](std::vector<int> &v) { v.push_back(10); } );
get_mutable() throws an exception if the vector is locked at this time. Otherwise it passes the vector to your lambda. Your lambda does whatever the heck it wants with it, which can be push_back(), or anything else, then returns.
But if you only need read-only access to the vector, use get_const():
int s;
myWrapper.get_const( [&](const std::vector<int> &v) { s=v.size(); } );
Note that get_const() takes care to const_cast the vector, before invoking the lambda, so the lambda will not be able to modify it. This will be enforced at compile-time.
With some additional work, it would also be possible to clean this up a little bit, and have the getter also return whatever lambda returns to the caller, making it possible to do something like this:
int s=myWrapper.get_const( [&](const std::vector<int> &v) { return v.size(); } );
It's possible to have get_const() and get_mutable() be smart enough to figure out if the lambda returns something, and happily pass it back to the caller, whatever it is. And how to do that, I suppose, will have to be another question on stackoverflow.com
P.S. If you don't have C++11, you can just have get_const() and get_mutable() return the wrapped vector (with get_mutable() verifying that it's not locked). This really accomplishes the same thing. The key point is that due to the way that C++ works, you will have to disambiguate, in advance, whether you need constant or mutable access.

I was working on a similar problem a while back. In multithreaded environment sometimes its more efficient to have different types of lock depending on whether you are reading or writing. But the locking is entirely cooperative. It is possible to obtain a read-only lock but still accidentally write to the object.
One solution I am exploring is, instead of obtaining a read-only lock from an object, getting a read-only wrapper of my object so that not only is the object read-only locked it is also only possible to call read-only (const) methods on the object.
The basic wrapper I used was something like this:
template<typename T>
class ConstWrapper
{
T& v;
public:
ConstWrapper(T& v): v(v) {}
T const& operator* () const { return v; } // return const reference
T const* operator->() const { return &v;} // return const pointer
};
By overloading the * and -> operators you get a kind of pass through ability to call the enclosed objects methods - but using pointer semantics (though its not a pointer).
std::vector<int> v {1, 2, 3, 4}; // not const
ConstWrapper<std::vector<int>> cv(v); // const wrapper
std::cout << cv->at(0) << '\n'; // okay at() is a const method
cv->push_back(8); // ILLEGAL!! push_back() is not a const method

Modify internal structure when implementing std::hash<T>

I'm writing a custom OrderedTree class I want to use as a key to an unordered_set.
I want to do a couple things when hashing the Tree:
calculate the hash lazily and cache it as needed (since this may be an expensive operation),
maybe balance the tree.
Neither of these operations change the semantic equality or hash value of the object, but they do modify some private fields.
Unfortunately, trying to modify any members in OrderedTree while inside std::hash<Tree>::operator() seems to violate const correctness that unordered_set expects.
Can I use my OrderedTree with unordered_set? If so, how?
EDIT:
As per request in the comments, minimal proof of concept:
#include <unordered_set>
std::size_t hash_combine(std::size_t a, std::size_t b) {
// TODO: Copy from boost source or something
return 0;
}
struct Node {
int value;
Node *left, *right, *parent;
std::size_t hash(std::size_t seed) const {
if (left != nullptr)
seed = left->hash(seed);
std::hash<int> hasher;
seed = hash_combine(seed, hasher(value));
if (right != nullptr)
seed = right->hash(seed);
return seed;
}
};
struct Tree {
Tree(): hash_(0), root(nullptr) {}
Node *root;
std::size_t hash() const {
if (hash_ == 0 && root != nullptr) {
hash_ = root->hash(7);
}
return hash_;
}
private:
std::size_t hash_;
};
namespace std {
template<>
struct hash<Tree> {
std::size_t operator()(const Tree& t) const {
return t.hash();
}
};
}
int main() {
std::unordered_set<Tree> set;
}
When I try to compile I get:
Sample.cc:31:13: error: cannot assign to non-static data member within const member function 'hash'
hash_ = root->hash(7);
~~~~~ ^
Sample.cc:29:15: note: member function 'Tree::hash' is declared const here
std::size_t hash() const {
~~~~~~~~~~~~^~~~~~~~~~~~

There is a guarantee that std containers will only call const members when doing const or logically const operations. If those const operations are multiple-reader safe, then so is the container; contrawise, if they are not, neither is the container.
The immutability of the hash value and equality (or < on ordered containers) are the only things you need guarantee in a key type in an associative container. Actual const gives the above multiple-reader guarantee, which can be quite useful. What more, violating it costs you using this in the future, and/or subtle buts when someone does presume const means immutable.
You could carefully synchonize the write operation internally to keep the multiple-reader guarantee, or you can give it up.
To violate const, typically you use mutable. A const method that uses casting to bypass const risks Undefined Behaviour if the object was actually const, and not just a const view of a non-const object.
In general, be careful before using this kind of optimizaton; it can easily increase code complexity (hance bugs, maintenance, etc) more than it adds speed. And speeding up code is fungible: make sure you identify this as slow code and this part as a bottlenecm prior to investing in it. And if you are going to balance in hash, why wait for hash? Balance before insert!

C++ safe idiom to call a member function of a class through a shared_ptr class member

Problem description
In designing an observer pattern for my code, I encountered the following task: I have a class Observer which contains a variable std::shared_ptr<Receiver> and I want to use a weak_ptr<Receiver> to this shared-pointer to safely call a function update() in Observer (for a more detailed motivation including some profiling measurements, see the EDIT below).
Here is an example code:
struct Receiver
{
void call_update_in_observer() { /* how to implement this function? */}
};
struct Observer
{
virtual void update() = 0;
std::shared_ptr<Receiver> receiver;
};
As mentioned there is a weak_ptr<Receiver> from which I want to call Observer::update() -- at most once -- via Receiver::call_update_in_observer():
Observer observer;
std::weak_ptr<Receiver> w (observer.receiver);
auto s = w.lock();
if(s)
{
s->call_update_in_observer(); //this shall call at most once Observer::update()
//regardless how many copies of observer there are
}
(Fyi: the call of update() should happen at most once because it updates a shared_ptr in some derived class which is the actual observer. However, whether it is called once or more often does not affect the question about "safeness" imo.)
Question:
What is an appropriate implementation of Observer and Receiver to carry out that process in a safe manner?
Solution attempt
Here is an attempt for a minimal implementation -- the idea is that Receiver manages a set of currently valid Observer objects, of which one member is called:
struct Receiver
{
std::set<Observer *> obs;
void call_update_in_observer() const
{
for(auto& o : obs)
{
o->update();
break; //one call is sufficient
}
}
};
The class Observer has to take care that the std::shared_ptr<Receiver> object is up-to-date:
struct Observer
{
Observer()
{
receiver->obs.insert(this);
}
Observer(Observer const& other) : receiver(other.receiver)
{
receiver->obs.insert(this);
}
Observer& operator=(Observer rhs)
{
std::swap(*this, rhs);
return *this;
}
~Observer()
{
receiver->obs.erase(this);
}
virtual void update() = 0;
std::shared_ptr<Receiver> receiver = std::make_shared<Receiver>();
};
DEMO
Questions:
Is this already safe? -- "safe" meaning that no expired Foo object is called. Or are there some pitfalls which have to be
considered?
If this code is safe, how would one implement the move constructor and assignment?
(I know this has the feeling of being appropriate for CodeReview, but it's rather about a reasonable pattern for this task than about my code, so I posted it here ... and further the move constructors are still missing.)
EDIT: Motivation
As the above requirements have been called "confusing" in the comments (which I can't deny), here is the motivation: Consider a custom Vector class which in order to save memory performs shallow copies:
struct Vector
{
auto operator[](int i) const { return v[i]; }
std::shared_ptr<std::vector<double> > v;
};
Next one has expression template classes e.g. for the sum of two vectors:
template<typename _VectorType1, typename _VectorType2>
struct VectorSum
{
using VectorType1 = std::decay_t<_VectorType1>;
using VectorType2 = std::decay_t<_VectorType2>;
//alternative 1: store by value
VectorType1 v1;
VectorType2 v2;
//alternative 2: store by shared_ptr
std::shared_ptr<VectorType1> v1;
std::shared_ptr<VectorType2> v2;
auto operator[](int i) const
{
return v1[i] + v2[i];
}
};
//next overload operator+ etc.
According to my measurements, alternative 1 where one stores the vector expressions by value (instead of by shared-pointer) is faster by a factor of two in Visual Studio 2015. In a simple test on Coliru, the speed improvement is even a factor of six:
type Average access time ratio
--------------------------------------------------------------
Foo : 2.81e-05 100%
std::shared_ptr<Foo> : 0.000166 591%
std::unique_ptr<Foo> : 0.000167 595%
std::shared_ptr<FooBase>: 0.000171 611%
std::unique_ptr<FooBase>: 0.000171 611%
The speedup appears particularly when operator[](int i) does not perform expensive calculations which would make the call overhead negligible.
Consider now the case where an arithmetic operation on a vector expression is too expensive to calculate each time anew (e.g. an exponential moving average). Then one needs to memoize the result, for which as before a std::shared_ptr<std::vector<double> > is used.
template<typename _VectorType>
struct Average
{
using VectorType = std::decay_t<_VectorType>;
VectorType v;
std::shared_ptr<std::vector<double> > store;
auto operator[](int i) const
{
//if store[i] is filled, return it
//otherwise calculate average and store it.
}
};
In this setup, when the vector expression v is modified somewhere in the program, one needs to propagate that change to the dependent Average class (of which many copies can exists), such that store is recalculated -- otherwise it will contain wrong values. In this update process, however, store needs to be recalculated only once, regardless how many copies of the Average object exist.
This mix of shared-pointer and value semantics is the reason why I'm running in the somewhat confusing situation as above. My solution attempt is to enforce the same cardinality in the observer as in the updated objects -- this is the reason for the shared_ptr<Receiver>.

Forwarding to in-place constructor

I have a message class which was previously a bit of a pain to work with, you had to construct the message class, tell it to allocate space for your object and then populate the space either by construction or memberwise.
I want to make it possible to construct the message object with an immediate, inline new of the resulting object, but to do so with a simple syntax at the call site while ensuring copy elision.
#include <cstdint>
typedef uint8_t id_t;
enum class MessageID { WorldPeace };
class Message
{
uint8_t* m_data; // current memory
uint8_t m_localData[64]; // upto 64 bytes.
id_t m_messageId;
size_t m_size; // amount of data used
size_t m_capacity; // amount of space available
// ...
public:
Message(size_t requestSize, id_t messageId)
: m_data(m_localData)
, m_messageId(messageId)
, m_size(0), m_capacity(sizeof(m_localData))
{
grow(requestSize);
}
void grow(size_t newSize)
{
if (newSize > m_capacity)
{
m_data = realloc((m_data == m_localData) ? nullptr : m_data, newSize);
assert(m_data != nullptr); // my system uses less brutal mem mgmt
m_size = newSize;
}
}
template<typename T>
T* allocatePtr()
{
size_t offset = size;
grow(offset + sizeof(T));
return (T*)(m_data + offset);
}
#ifdef USE_CPP11
template<typename T, typename Args...>
Message(id_t messageId, Args&&... args)
: Message(sizeof(T), messageID)
{
// we know m_data points to a large enough buffer
new ((T*)m_data) T (std::forward<Args>(args)...);
}
#endif
};
Pre-C++11 I had a nasty macro, CONSTRUCT_IN_PLACE, which did:
#define CONSTRUCT_IN_PLACE(Message, Typename, ...) \
new ((Message).allocatePtr<Typename>()) Typename (__VA_ARGS__)
And you would say:
Message outgoing(sizeof(MyStruct), MessageID::WorldPeace);
CONSTRUCT_IN_PLACE(outgoing, MyStruct, wpArg1, wpArg2, wpArg3);
With C++11, you would use
Message outgoing<MyStruct>(MessageID::WorldPeace, wpArg1, wpArg2, wpArg3);
But I find this to be messy. What I want to implement is:
template<typename T>
Message(id_t messageId, T&& src)
: Message(sizeof(T), messageID)
{
// we know m_data points to a large enough buffer
new ((T*)m_data) T (src);
}
So that the user uses
Message outgoing(MessageID::WorldPeace, MyStruct(wpArg1, wpArg2, wpArg3));
But it seems that this first constructs a temporary MyStruct on the stack turning the in-place new into a call to the move constructor of T.
Many of these messages are simple, often POD, and they are often in marshalling functions like this:
void dispatchWorldPeace(int wpArg1, int wpArg2, int wpArg3)
{
Message outgoing(MessageID::WorldPeace, MyStruct(wpArg1, wpArg2, wpArg3));
outgoing.send(g_listener);
}
So I want to avoid creating an intermediate temporary that is going to require a subsequent move/copy.
It seems like the compiler should be able to eliminate the temporary and the move and forward the construction all the way down to the in-place new.
What am I doing that is causing it not to? (GCC 4.8.1, Clang 3.5, MSVC 2013)

You won't be able to elide the copy/move in the placement new: copy elision is entirely based on the idea that the compiler knows at construction time where the object will eventually end up. Also, since copy elision actually changes the behavior of the program (after all, it won't call the respective constructor and the destructor even if they have side-effects) copy elision is limited to a few very specific cases (listed in 12.8 [class.copy] paragraph 31: essentially when returning a local variable by name, when throwing a local variable by name, when catching an exception of the correct type by value, and when copying/moving a temporary variable; see the clause for exact details). Since [placement] new is none of the contexts where the copy can be elided and the argument to constructor is clearly not a temporary (it is named), the copy/move will never be elided. Even adding the missing std::forward<T>(...) to your constructor will cause the copy/move to be elided:
template<typename T>
Message(id_t messageId, T&& src)
: Message(sizeof(T), messageID)
{
// placement new take a void* anyway, i.e., no need to cast
new (m_data) T (std::forward<T>(src));
}
I don't think you can explicitly specify a template parameter when calling a constructor. Thus, I think the closest you could probably get without constructing the object ahead of time and getting it copied/moved is something like this:
template <typename>
struct Tag {};
template <typename T, typename A>
Message::Message(Tag<T>, id_t messageId, A... args)
: Message(messageId, sizeof(T)) {
new(this->m_data) T(std::forward<A>(args)...);
}
One approach which might make things a bit nicer is using the id_t to map to the relevant type assuming that there is a mapping from message Ids to the relevant type:
typedef uint8_t id_t;
template <typename T, id_t id> struct Tag {};
struct MessageId {
static constexpr Tag<MyStruct, 1> WorldPeace;
// ...
};
template <typename T, id_t id, typename... A>
Message::Message(Tag<T, id>, A&&... args)
Message(id, sizeof(T)) {
new(this->m_data) T(std::forward<A>)(args)...);
}

Foreword
The conceptual barrier that even C++2049 cannot cross is that you require all the bits that compose your message to be aligned in a contiguous memory block.
The only way C++ can give you that is through the use of the placement new operator. Otherwise, objects will simply be constructed according to their storage class (on the stack or through whatever you define as a new operator).
It means any object you pass to your payload constructor will be first constructed (on the stack) and then used by the constructor (that will most likely copy-construct it).
Avoiding this copy completely is impossible. You may have a forward constructor doing the minimal amount of copy, but still the scalar parameters passed to the initializer will likely be copied, as will any data that the constructor of the initializer deemed necessary to memorize and/or produce.
If you want to be able to pass parameters freely to each of the constructors needed to build the complete message without them being first stored in the parameter objects, it will require
the use of a placement new operator for each of the sub-objects that compose the message,
the memorization of each single scalar parameter passed to the various sub-constructors,
specific code for each object to feed the placement new operator with the proper address and call the constructor of the sub-object.
You will end up with a toplevel message constructor taking all possible initial parameters and dispatching them to the various sub-objects constructors.
I don't even know if this is feasible, but the result would be very fragile and error-prone at any rate.
Is that what you want, just for the benefit of a bit of syntactic sugar?
If you're offering an API, you cannot cover all cases. The best approach is to make something that degrades nicely, IMHO.
The simple solution would be to limit payload constructor parameters to scalar values or implement "in-place sub-construction" for a limited set of message payloads that you can control. At your level you cannot do more than that to make sure the message construction proceeds with no extra copies.
Now the application software will be free to define constructors that take objects as parameters, and then the price to pay will be these extra copies.
Besides, this might be the most efficient approach, if the parameter is something costly to construct (i.e. the construction time is greater than the copy time, so it is more efficient to create a static object and modify it slightly between each message) or if it has a greater lifetime than your function for any reason.
a working, ugly solution
First, let's start with a vintage, template-less solution that does in-place construction.
The idea is to have the message pre-allocate the right kind of memory (local buffer of dynamic) depending on the size of the object.
The proper base address is then passed to a placement new to construct the message contents in place.
#include <cstdint>
#include <cstdio>
#include <new>
typedef uint8_t id_t;
enum class MessageID { WorldPeace, Armaggedon };
#define SMALL_BUF_SIZE 64
class Message {
id_t m_messageId;
uint8_t* m_data;
uint8_t m_localData[SMALL_BUF_SIZE];
public:
// choose the proper location for contents
Message (MessageID messageId, size_t size)
{
m_messageId = (id_t)messageId;
m_data = size <= SMALL_BUF_SIZE ? m_localData : new uint8_t[size];
}
// dispose of the contents if need be
~Message ()
{
if (m_data != m_localData) delete m_data;
}
// let placement new know about the contents location
void * location (void)
{
return m_data;
}
};
// a macro to do the in-place construction
#define BuildMessage(msg, id, obj, ... ) \
Message msg(MessageID::id, sizeof(obj)); \
new (msg.location()) obj (__VA_ARGS__); \
// example uses
struct small {
int a, b, c;
small (int a, int b, int c) :a(a),b(b),c(c) {}
};
struct big {
int lump[1000];
};
int main(void)
{
BuildMessage(msg1, WorldPeace, small, 1, 2, 3)
BuildMessage(msg2, Armaggedon, big)
}
This is just a trimmed down version of your initial code, with no templates at all.
I find it relatively clean and easy to use, but to each his own.
The only inefficiency I see here is the static allocation of 64 bytes that will be useless if the message is too big.
And of course all type information is lost once the messages are constructed, so accessing their contents afterward would be awkward.
About forwarding and construction in place
Basically, the new && qualifier does no magic. To do in-place construction, the compiler needs to know the address that will be used for object storage before calling the constructor.
Once you've invoked an object creation, the memory has been allocated and the && thing will only allow you to use that address to pass ownership of the said memory to another object without resorting to useless copies.
You can use templates to recognize a call to the Message constructor involving a given class passed as message contents, but that will be too late: the object will have been constructed before your constructor can do anything about its memory location.
I can't see a way to create a template on top of the Message class that would defer an object construction until you have decided at which location you want to construct it.
However, you could work on the classes defining the object contents to have some in-place construction automated.
This will not solve the general problem of passing objects to the constructor of the object that will be built in place.
To do that, you would need the sub-objects themselves to be constructed through a placement new, which would mean implementing a specific template interface for each of the initializers, and have each object provide the address of construction to each of its sub-objects.
Now for syntactic sugar.
To make the ugly templating worth the while, you can specialize your message classes to handle big and small messages differently.
The idea is to have a single lump of memory to pass to your sending function. So in case of small messages, the message header and contents are defined as local message properties, and for big ones, extra memory is allocated to include the message header.
Thus the magic DMA used to propell your messages through the system will have a clean data block to work with either way.
Dynamic allocations will still occur once per big message, and never for small ones.
#include <cstdint>
#include <new>
// ==========================================================================
// Common definitions
// ==========================================================================
// message header
enum class MessageID : uint8_t { WorldPeace, Armaggedon };
struct MessageHeader {
MessageID id;
uint8_t __padding; // one free byte here
uint16_t size;
};
// small buffer size
#define SMALL_BUF_SIZE 64
// dummy send function
int some_DMA_trick(int destination, void * data, uint16_t size);
// ==========================================================================
// Macro solution
// ==========================================================================
// -----------------------------------------
// Message class
// -----------------------------------------
class mMessage {
// local storage defined even for big messages
MessageHeader m_header;
uint8_t m_localData[SMALL_BUF_SIZE];
// pointer to the actual message
MessageHeader * m_head;
public:
// choose the proper location for contents
mMessage (MessageID messageId, uint16_t size)
{
m_head = size <= SMALL_BUF_SIZE
? &m_header
: (MessageHeader *) new uint8_t[size + sizeof (m_header)];
m_head->id = messageId;
m_head->size = size;
}
// dispose of the contents if need be
~mMessage ()
{
if (m_head != &m_header) delete m_head;
}
// let placement new know about the contents location
void * location (void)
{
return m_head+1;
}
// send a message
int send(int destination)
{
return some_DMA_trick (destination, m_head, (uint16_t)(m_head->size + sizeof (m_head)));
}
};
// -----------------------------------------
// macro to do the in-place construction
// -----------------------------------------
#define BuildMessage(msg, obj, id, ... ) \
mMessage msg (MessageID::id, sizeof(obj)); \
new (msg.location()) obj (__VA_ARGS__); \
// ==========================================================================
// Template solution
// ==========================================================================
#include <utility>
// -----------------------------------------
// template to check storage capacity
// -----------------------------------------
template<typename T>
struct storage
{
enum { local = sizeof(T)<=SMALL_BUF_SIZE };
};
// -----------------------------------------
// base message class
// -----------------------------------------
class tMessage {
protected:
MessageHeader * m_head;
tMessage(MessageHeader * head, MessageID id, uint16_t size)
: m_head(head)
{
m_head->id = id;
m_head->size = size;
}
public:
int send(int destination)
{
return some_DMA_trick (destination, m_head, (uint16_t)(m_head->size + sizeof (*m_head)));
}
};
// -----------------------------------------
// general message template
// -----------------------------------------
template<bool local_storage, typename message_contents>
class aMessage {};
// -----------------------------------------
// specialization for big messages
// -----------------------------------------
template<typename T>
class aMessage<false, T> : public tMessage
{
public:
// in-place constructor
template<class... Args>
aMessage(MessageID id, Args...args)
: tMessage(
(MessageHeader *)new uint8_t[sizeof(T)+sizeof(*m_head)], // dynamic allocation
id, sizeof(T))
{
new (m_head+1) T(std::forward<Args>(args)...);
}
// destructor
~aMessage ()
{
delete m_head;
}
// syntactic sugar to access contents
T& contents(void) { return *(T*)(m_head+1); }
};
// -----------------------------------------
// specialization for small messages
// -----------------------------------------
template<typename T>
class aMessage<true, T> : public tMessage
{
// message body defined locally
MessageHeader m_header;
uint8_t m_data[sizeof(T)]; // no need for 64 bytes here
public:
// in-place constructor
template<class... Args>
aMessage(MessageID id, Args...args)
: tMessage(
&m_header, // local storage
id, sizeof(T))
{
new (m_head+1) T(std::forward<Args>(args)...);
}
// syntactic sugar to access contents
T& contents(void) { return *(T*)(m_head+1); }
};
// -----------------------------------------
// helper macro to hide template ugliness
// -----------------------------------------
#define Message(T) aMessage<storage<T>::local, T>
// something like typedef aMessage<storage<T>::local, T> Message<T>
// ==========================================================================
// Example
// ==========================================================================
#include <cstdio>
#include <cstring>
// message sending
int some_DMA_trick(int destination, void * data, uint16_t size)
{
printf("sending %d bytes #%p to %08X\n", size, data, destination);
return 1;
}
// some dynamic contents
struct gizmo {
char * s;
gizmo(void) { s = nullptr; };
gizmo (const gizmo& g) = delete;
gizmo (const char * msg)
{
s = new char[strlen(msg) + 3];
strcpy(s, msg);
strcat(s, "#");
}
gizmo (gizmo&& g)
{
s = g.s;
g.s = nullptr;
strcat(s, "*");
}
~gizmo()
{
delete s;
}
gizmo& operator=(gizmo g)
{
std::swap(s, g.s);
return *this;
}
bool operator!=(gizmo& g)
{
return strcmp (s, g.s) != 0;
}
};
// some small contents
struct small {
int a, b, c;
gizmo g;
small (gizmo g, int a, int b, int c)
: a(a), b(b), c(c), g(std::move(g))
{
}
void trace(void)
{
printf("small: %d %d %d %s\n", a, b, c, g.s);
}
};
// some big contents
struct big {
gizmo lump[1000];
big(const char * msg = "?")
{
for (size_t i = 0; i != sizeof(lump) / sizeof(lump[0]); i++)
lump[i] = gizmo (msg);
}
void trace(void)
{
printf("big: set to ");
gizmo& first = lump[0];
for (size_t i = 1; i != sizeof(lump) / sizeof(lump[0]); i++)
if (lump[i] != first) { printf(" Erm... mostly "); break; }
printf("%s\n", first.s);
}
};
int main(void)
{
// macros
BuildMessage(mmsg1, small, WorldPeace, gizmo("Hi"), 1, 2, 3);
BuildMessage(mmsg2, big , Armaggedon, "Doom");
((small *)mmsg1.location())->trace();
((big *)mmsg2.location())->trace();
mmsg1.send(0x1000);
mmsg2.send(0x2000);
// templates
Message (small) tmsg1(MessageID::WorldPeace, gizmo("Hello"), 4, 5, 6);
Message (big ) tmsg2(MessageID::Armaggedon, "Damnation");
tmsg1.contents().trace();
tmsg2.contents().trace();
tmsg1.send(0x3000);
tmsg2.send(0x4000);
}
output:
small: 1 2 3 Hi#*
big: set to Doom#
sending 20 bytes #0xbf81be20 to 00001000
sending 4004 bytes #0x9e58018 to 00002000
small: 4 5 6 Hello#**
big: set to Damnation#
sending 20 bytes #0xbf81be0c to 00003000
sending 4004 bytes #0x9e5ce50 to 00004000
Arguments forwarding
I see little point in doing constructor parameters forwarding here.
Any bit of dynamic data referenced by the message contents would have to be either static or copied into the message body, otherwise the referenced data would vanish as soon as the message creator would go out of scope.
If the users of this wonderfully efficient library start passing around magic pointers and other global data inside messages, I wonder how the global system performance will like that. But that's none of my business, after all.
Macros
I resorted to a macro to hide the template ugliness in type definition.
If someone has an idea to get rid of it, I'm interested.
Efficiency
The template variation requires an extra forwarding of the contents parameters to reach the constructor. I can't see how that could be avoided.
The macro version wastes 68 bytes of memory for big messages, and some memory for small ones (64 - sizeof (contents object)).
Performance-wise, this extra bit of memory is the only gain the templates offer. Since all these objects are supposedly constructed on the stack and live for a handful of microseconds, it is pretty neglectible.
Compared to your initial version, this one should handle message sending more efficiently for big messages. Here again, if these messages are rare and only offered for convenience, the difference is not terribly useful.
The template version maintains a single pointer to the message payload, that could be spared for small messages if you implemented a specialized version of the send function.
Hardly worth the code duplication, IMHO.
A last word
I think I know pretty well how an operating system works and what performances concerns might be. I wrote quite a few real-time applications, plus some drivers and a couple of BSPs in my time.
I also saw more than once a very efficient system layer ruined by too permissive an interface that allowed application software programmers to do the silliest things without even knowing.
That is what triggered my initial reaction.
If I had my say in global system design, I would forbid all these magic pointers and other under-the-hood mingling with object references, to limit non-specialist users to an inoccuous use of system layers, instead of allowing them to inadvertently spread cockroaches through the system.
Unless the users of this interface are template and real-time savvies, they will not understand a bit what is going on beneath the syntactic sugar crust, and might very soon shoot themselves (and their co-workers and the application software) in the foot.
Suppose a poor application software programmer adds a puny field in one of its structs and crosses unknowingly the 64 bytes barrier. All of a sudden the system performance will crumble, and you will need Mr template & real time expert to explain the poor guy that what he did killed a lot of kittens.
Even worse, the system degradation might be progressive or unnoticeable at first, so one day you might wake up with thousands of lines of code that did dynamic allocations for years without anybody noticing, and the global overhaul to correct the problem might be huge.
If, on the other hand, all people in your company are munching at templates and mutexes for breakfast, syntactic sugar is not even required in the first place.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Out-parameters and move semantics - c++

Related

Copyable C++ coroutine with data

Const lock wrapper for STL containers

Modify internal structure when implementing std::hash<T>

C++ safe idiom to call a member function of a class through a shared_ptr class member

Forwarding to in-place constructor

Categories

Resources