I am porting to C++11 a C code base that makes use of a number of custom intrusive data structures.
In C, the usage patterns will typically look like this:
struct foo {
// some members
struct data_structure_node node;
};
// user code
struct foo *foo = NULL;
struct data_structure_node *result = find_in_data_structure(data_structure, some_key);
if (result) {
foo = container_of(result, struct foo, node);
// use foo
}
Here, container_of is implemented much like in the Linux kernel:
#define container_of(ptr, type, member) ({ \
const typeof( ((type *)0)->member ) *__mptr = (ptr); \
(type *)( (char *)__mptr - offsetof(type,member) );})
As the code moves to more idiomatic C++, structures like foo typically end up becoming classes that use different access controls, virtual functions, etc. This, in turn, makes them non-standard-layout types and causes GCC and clang to emit the following diagnostic when container_of is used (a warning, promoted to an error here by -Werror):
error: 'offsetof' within non-standard-layout type 'foo' is conditionally-supported [-Werror=invalid-offsetof]
I have been pondering how to implement a safe alternative to the container_of macro. Using a pointer to data member is the first idea that came to my mind, and I'm considering replacing uses of container_of with, essentially,
template <class Parent, class Member>
Parent* my_container_of(Member *member, Member Parent::* ptr_to_member)
{
Parent *dummy_parent = nullptr;
auto *offset_of_member = reinterpret_cast<char *>(&(dummy_parent->*ptr_to_member));
auto address_of_parent = reinterpret_cast<char *>(member) - offset_of_member;
return reinterpret_cast<Parent *>(address_of_parent);
}
to get struct foo * from a struct data_structure_node *.
In particular, the use of ptr_to_member against the null dummy_parent makes me uneasy as it seems equivalent to performing arithmetic on a null pointer, which I understand is undefined behavior (C++11 Standard 5.7.5).
[...] Unless both pointers point to elements of the same array object, or one past the last element of the array object, the behavior is undefined
Boost.Intrusive uses an approach that seems roughly equivalent to my_container_of().
I'm wondering:
is my_container_of() safe?
is there a cleaner way of achieving this that I'm missing?
You can implement intrusive data structures in C++ even more nicely than in C. The first step is to use inheritance. You might try this:
struct List {
List *next{nullptr};
};
struct MyFoo : List {
MyFoo * get_next() const { return next; }
};
But this gives an error, because next is a List * and not a MyFoo *. To fix this you can introduce templates:
template <typename T>
struct List {
T *next{nullptr};
};
struct MyFoo : List<MyFoo> {
MyFoo * get_next() const { return next; }
};
Now your intrusive list has the right type for next. But you are limited to one intrusive list per object. So let's extend the template a bit more:
template <typename T, typename U>
struct List {
T *next{nullptr};
};
class Siblings;
class Children;
struct MyFoo : List<MyFoo, Siblings>, List<MyFoo, Children> {
using Sibling = List<MyFoo, Siblings>;
using Child = List<MyFoo, Children>;
MyFoo * get_sibling() const { return Sibling::next; }
MyFoo * get_child() const { return Child::next; }
};
Now you can inherit from as many List instantiations as you want in one class, and qualify each access to reach the right List. There is no need for any offsetof() or container_of macros.
Note: The Siblings and Children classes are just declarations and purely there to give the List different types. They are never defined or instantiated.
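To connect this back to the original lookup pattern, here is a minimal sketch of how a tagged base class might replace container_of when the data structure hands back an untyped node pointer. The names hook, by_name, by_age and parent_of are hypothetical, and data_structure_node is only a stand-in for the real C node type. It assumes every node passed to parent_of really is the hook<Tag> base of a Parent object.

```cpp
#include <cassert>

// Stand-in for the C library's untyped node (assumption, not the real type).
struct data_structure_node {
    data_structure_node *next = nullptr;
};

// The Tag parameter only makes each embedded hook a distinct base class.
template <typename Tag>
struct hook : data_structure_node {};

struct by_name;  // tag types: declared, never defined
struct by_age;

class foo : public hook<by_name>, public hook<by_age> {
public:
    explicit foo(int id) : id_(id) {}
    virtual ~foo() = default;  // makes foo non-standard-layout on purpose
    int id() const { return id_; }

private:
    int id_;
};

// Replacement for container_of: two base-to-derived static_casts.
// Only valid if node is the hook<Tag> subobject of a Parent.
template <typename Parent, typename Tag>
Parent *parent_of(data_structure_node *node) {
    assert(node != nullptr);
    return static_cast<Parent *>(static_cast<hook<Tag> *>(node));
}
```

Because the conversion goes through static_cast along the inheritance chain, the compiler applies the base-class adjustment itself and no offsetof is involved, so the non-standard-layout diagnostic does not arise.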
Related
I am writing a C++ wrapper around a C library. Here is an example of my strategy.
// header file
class LibrdfUri { // wrapper around librdf.h librdf_uri*
/*
* If the deleter of std::unique_ptr is an empty
* class then it can do some optimizations and
* not actually store the deleter object.
* Otherwise it has to accommodate extra space for
* the deleter, which is unnecessary
* https://stackoverflow.com/questions/61969200/what-is-the-purpose-of-wrapping-this-private-deleter-function-in-a-struct/61969274#61969274
*/
struct deleter {
// turns deleter into a functor. For passing on to unique_ptr
void operator()(librdf_uri *ptr);
};
// automate management of librdf_uri* lifetime
std::unique_ptr<librdf_uri, deleter> librdf_uri_;
public:
LibrdfUri() = default;
explicit LibrdfUri(const std::string& uri); // construct from string
librdf_uri *get(); // returns the underlying raw pointer
};
// implementation
void LibrdfUri::deleter::operator()(librdf_uri *ptr) {
librdf_free_uri(ptr); // this is the C library function for destruction of librdf_uri
}
LibrdfUri::LibrdfUri(const std::string &uri) {
// create pointer to underlying C library 'object'
librdf_uri_ = std::unique_ptr<librdf_uri, deleter>(
librdf_new_uri(World::getWorld(), (const unsigned char *) uri.c_str()) // World::getWorld is static. Returns a pointer required by librdf_new_uri
);
}
librdf_uri *LibrdfUri::get() {
return librdf_uri_.get();
}
// and is used like so:
LibrdfUri uri("http://uri.com");
librdf_uri* curi = uri.get(); // when needed
This works for the single type librdf_uri*, which is part of the underlying library; however, I have lots of these. My question is double-barrelled. The first part concerns the best general strategy for generalizing this wrapper to other classes, while the second concerns the implementation of that strategy.
Regarding the first part, here are my thoughts:
1. I could implement each class manually like I've done here. This is probably the simplest and least elegant. Yet it still might be my best option. However there is a small amount of code duplication involved, since each CWrapper I write will essentially have the same structure. Not to mention if I need to change something then I'll have to do each class individually.
2. Use a base class (abstract?)
3. Use a template
The second part of my question is basically: if I implement either option 2 or 3 (which I think might even be just a single option) how would I do it?
Here is a (vastly broken) version of what I'm thinking:
template<class LibrdfType>
class CWrapper {
struct deleter { ; //?
void operator()(LibrdfType *ptr) {
// ??
};
}
std::unique_ptr<LibrdfType, deleter> ptr;
public:
CWrapper() = default;
LibrdfType *get() {
ptr.get();
};
};
Then LibrdfUri, and any other C class I need to wrap, would just subclass CWrapper.
This is a better deleter:
template<auto f>
using deleter=std::integral_constant< std::decay_t<decltype(f)>, f >;
use:
deleter<librdf_free_uri>
is a stateless deleter that calls librdf_free_uri.
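If template<auto> (C++17) is not available, a similar stateless deleter can be sketched in C++11 by passing the free function as a typed non-type template parameter. This is only an illustration: c_uri, c_uri_new, c_uri_free and c_uri_live are hypothetical stand-ins for a C library, and CWrapper here is one possible shape for the template the question asks about, not a definitive design.

```cpp
#include <cassert>
#include <memory>

// Generic owner for a C handle of type T that is released by Free.
// Free is a template argument, so the deleter remains an empty class.
template <class T, void (*Free)(T *)>
class CWrapper {
    struct deleter {
        void operator()(T *ptr) const { Free(ptr); }
    };
    std::unique_ptr<T, deleter> ptr_;

public:
    CWrapper() = default;
    explicit CWrapper(T *raw) : ptr_(raw) {}
    T *get() const { return ptr_.get(); }
    T &operator*() const {
        assert(ptr_ != nullptr);
        return *ptr_;
    }
};

// Hypothetical stand-in for a C library type and its functions.
struct c_uri { int id; };
inline int &c_uri_live() { static int n = 0; return n; }
inline c_uri *c_uri_new(int id) { ++c_uri_live(); return new c_uri{id}; }
inline void c_uri_free(c_uri *u) { --c_uri_live(); delete u; }

// A concrete wrapper only has to supply its own constructor.
class Uri : public CWrapper<c_uri, c_uri_free> {
public:
    explicit Uri(int id) : CWrapper(c_uri_new(id)) {}
};
```

Each wrapped type then costs one short class (or even just a type alias), and a change to the ownership logic is made once in CWrapper.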
But we don't need that I think. Here is what I might do:
There are 3 pieces of information you need.
How to construct
How to destroy
What type to store
One way is to define ADL-based helpers with well-known names that you overload to delete/construct.
template<class T>struct tag_t{};
template<class T>constexpr tag_t<T> tag{};
template<class T>
void delete_wrapptr(T*)=delete;
struct cleanup_wrapptr{
template<class T>
void operator()(T* t)const{ delete_wrapptr(t); }
};
template<class T>
using wrapptr=std::unique_ptr<T, cleanup_wrapptr>;
template<class T>
wrapptr<T> make_wrapptr( tag_t<T>, ... )=delete;
now you just have to write overloads for make and delete.
void delete_wrapptr(librdf_uri* ptr){
librdf_free_uri(ptr); // this is the C library function for destruction of librdf_uri
}
wrapptr<librdf_uri> make_wrapptr(tag_t<librdf_uri>, const std::string &uri) {
// unique_ptr's pointer constructor is explicit, so wrap the raw pointer here
return wrapptr<librdf_uri>(librdf_new_uri(World::getWorld(), (const unsigned char *) uri.c_str())); // World::getWorld is static. Returns a pointer required by librdf_new_uri
}
and you can write:
wrapptr<librdf_uri> ptr = make_wrapptr(tag<librdf_uri>, uri);
The implementation becomes just overloading those two functions.
The make_wrapptr and delete_wrapptr overloads you write need to be visible at the point of creation, and declared in the namespace of T, tag_t or cleanup_wrapptr. The implementations can be hidden in a cpp file, but the declarations of the overloads cannot.
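A self-contained sketch of that ADL scheme, using a made-up library in place of librdf, might look like this. The namespace fakelib and everything inside it (uri, new_uri, free_uri, live_count) are assumptions for illustration only. To stay within C++11 the sketch passes tag_t<T>{} directly instead of using the tag variable template, and it leaves out the deleted primary make_wrapptr.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Generic machinery, following the answer above.
template <class T> struct tag_t {};

template <class T> void delete_wrapptr(T *) = delete;

struct cleanup_wrapptr {
    template <class T>
    void operator()(T *t) const { delete_wrapptr(t); }  // overload found via ADL
};

template <class T>
using wrapptr = std::unique_ptr<T, cleanup_wrapptr>;

// Hypothetical C-style library standing in for librdf.
namespace fakelib {
struct uri { std::string text; };
inline int &live_count() { static int n = 0; return n; }
inline uri *new_uri(const char *s) { ++live_count(); return new uri{s}; }
inline void free_uri(uri *u) { --live_count(); delete u; }

// The two overloads a wrapper author writes, placed in the namespace of uri.
inline void delete_wrapptr(uri *u) {
    assert(u != nullptr);  // unique_ptr never invokes the deleter on null
    free_uri(u);
}
inline wrapptr<uri> make_wrapptr(tag_t<uri>, const std::string &s) {
    return wrapptr<uri>(new_uri(s.c_str()));
}
}  // namespace fakelib
```

A call such as make_wrapptr(tag_t<fakelib::uri>{}, "http://uri.com") needs no namespace qualification, because fakelib is an associated namespace of the tag argument.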
What I need can be done by storing the this pointer of the enclosing class in the nested class, for example this way:
class CEnclosing {
public:
class CNested : public CSomeGeneric {
public:
CNested(CEnclosing* e) : m_e(e) {}
virtual void operator=(int i) { m_e->SomeMethod(i); }
CEnclosing* m_e;
};
CNested nested;
CEnclosing() : nested(this) {}
virtual void SomeMethod(int i);
};
int main()
{
CEnclosing e;
e.nested = 123;
return 0;
}
This works well, but requires sizeof(void*) more bytes of memory for each nested member class. Is there an efficient and portable way to do this without storing a pointer to the CEnclosing instance in m_e?
As stated previously, C++ does not provide any way to do this. A nested class has no special way to find its enclosing class. The solution you already have is the recommended way.
If you have an advanced scenario, and if you are prepared to maintain non-portable code, and if the cost of storing an additional pointer is important enough to use a risky solution, then there is a way based on the C++ object model. With a number of provisos I won't go into, you can rely on the enclosing and nested classes being laid out in memory in a predictable order, and there being a fixed offset between the start of the enclosing and nested classes.
The code is something like:
CEnclosing e;
int offset = (char*)&e.nested - (char*)&e;
//... inside nested class
CEnclosing* pencl = (CEnclosing*)((char*)this - offset);
OTOH it's equally possible that the offsetof macro may just do it for you, but I haven't tried it.
If you really want to do this, read about trivially copyable and standard layout in the standard.
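For the restricted case where the enclosing class is standard-layout, the offsetof variant mentioned above could be sketched as follows. Enclosing, Nested and counter are illustrative names rather than the classes from the question, and the sketch assumes every Nested object is the nested member of some Enclosing object. The question's CEnclosing has virtual functions, so it would not satisfy the static_assert.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

struct Enclosing {
    struct Nested {
        Enclosing *enclosing();  // defined below, once Enclosing is complete
    };
    int counter;
    Nested nested;
};

// offsetof is only unconditionally supported for standard-layout classes.
static_assert(std::is_standard_layout<Enclosing>::value,
              "Enclosing must be standard-layout for offsetof");

inline Enclosing *Enclosing::Nested::enclosing() {
    // Step back from the member to the start of the enclosing object.
    char *self = reinterpret_cast<char *>(this);
    return reinterpret_cast<Enclosing *>(self - offsetof(Enclosing, nested));
}

inline void example_usage() {
    Enclosing e = {0, {}};
    e.nested.enclosing()->counter = 5;
    assert(e.counter == 5);
}
```

No pointer is stored in Nested, but a Nested object created on its own (not as Enclosing::nested) would make enclosing() return a bogus address.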
I believe the following could be portable; though it is not fool-proof. Specifically, it will not work across virtual inheritance.
Also, I would like to point out that it is not safe, in that it will happily compile even if the member you pass does not correspond to the one you compute the offset with:
#include <iostream>
template <typename C, typename T>
std::ptrdiff_t offsetof_impl(T C::* ptr) {
C c; // only works for default constructible classes
T* t = &(c.*ptr);
return reinterpret_cast<char*>(&c) - reinterpret_cast<char*>(t);
}
template <typename C, typename T, T C::* Ptr>
std::ptrdiff_t offset_of() { // not named offsetof, which is a macro from <cstddef>
static std::ptrdiff_t const Offset = offsetof_impl(Ptr);
return Offset;
}
template <typename C, typename T, T C::* Ptr>
C& get_enclosing(T& t) {
return *reinterpret_cast<C*>(reinterpret_cast<char*>(&t)
+ offset_of<C, T, Ptr>());
}
// Demo
struct E { int i; int j; };
int main() {
E e = { 3, 4 };
//
// BEWARE: get_enclosing<E, int, &E::j>(e.i); compiles ERRONEOUSLY too.
// ^ != ^
//
E& ref = get_enclosing<E, int, &E::j>(e.j);
std::cout << (void const*)&e << " " << (void const*)&ref << "\n";
return 0;
}
Still, it does run on this simplistic example, which allowed me to find 2 bugs in my initial implementation (already). Handle with caution.
The clear and simple answer to your question is no, C++11 doesn't have any special feature to handle your scenario. But there is a trick in C++ to allow you to do this:
If CEnclosing didn't have a virtual function, a pointer to nested would have the same value as a pointer to the containing instance. That is:
(void*)&e == (void*)&e.nested
This is because the variable nested is the first in the class CEnclosing.
However, since the CEnclosing class has a virtual function, all you need to do is subtract the size of the vtable pointer from &e.nested and you should have a pointer to e. Don't forget to cast correctly, though!
EDIT: As Stephane Rolland said, this is a dangerous solution and, honestly, I wouldn't use it, but this is the only way (or trick) I could think of to access the enclosing class from a nested class. Personally, I would probably try to redesign the relation between these two classes if I really want to optimise memory usage up to the level you mentioned.
How about using multiple inheritance like this:
class CNested {
public:
virtual void operator=(int i) { SomeMethod(i); }
virtual void SomeMethod(int i) = 0;
};
class CEnclosing: public CSomeGeneric, public CNested {
int nEncMember;
public:
CNested& nested;
CEnclosing() : nested(*this), nEncMember(456) {}
virtual void SomeMethod(int i) { std::cout << i + nEncMember; }
};
I am writing a template of a working class, and this might just be a dumb question, but if I have a template structure (linked list) that may hold pointers to objects, then how do I know that they are being deleted, or that they were pointers in the first place?
for example: the linkedList will be used in 2 ways in this program
a pointer to an object of class Thing is placed inside a node inside a linkedList
an enum is placed inside a node inside a linkedList
I know that the nodes are being deleted, but how do I know that the thing in the node is a pointer so that it can be deleted as well, and not just be a Null referenced object?
You can specialize the node based on the type of the object, and for the pointer specialization, create a destructor for the node-type that properly allocates and deletes the pointer managed by the node.
For instance:
//general node type for non-pointer types
template<typename T>
struct linked_list_node
{
T data;
linked_list_node<T>* next;
linked_list_node(const T& d): data(d), next(NULL) {}
~linked_list_node() {}
};
//specialized version for pointer types
template<typename T>
struct linked_list_node<T*>
{
typedef void (*deleter)(T*);
T* data;
linked_list_node<T*>* next;
deleter d_func; //custom function for reclaiming pointer-type
linked_list_node(const T& d): data(new T(d)), next(NULL), d_func(NULL) {}
linked_list_node(const T& d, deleter func): data(new T(d)),
next(NULL), d_func(func) {}
~linked_list_node()
{
if(d_func)
d_func(data); //execute custom function for reclaiming pointer-type
else
delete data;
}
};
You can then instantiate the different versions by passing the correct template argument when creating an instance of the linked_list_node type. For instance,
linked_list_node<MyPtr*> node(FooPtr); //creates the specialized ptr version
linked_list_node<MyEnum> node(FooEnum); //creates a non-ptr version of the node
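As an alternative to specializing the node, ownership can be moved into the element type itself. The sketch below swaps in std::unique_ptr (C++11) so that one unspecialized node template handles both cases. list_node, Thing and Mode are hypothetical names, and Thing only exists to count live objects for the demonstration.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// One node template for every payload type: no pointer specialization.
template <typename T>
struct list_node {
    T data;
    std::unique_ptr<list_node> next;  // the node owns its successor
    explicit list_node(T d) : data(std::move(d)) {}
};

// Hypothetical payload type that counts live instances.
struct Thing {
    static int &alive() { static int n = 0; return n; }
    Thing() { ++alive(); }
    Thing(const Thing &) = delete;
    ~Thing() { --alive(); }
};

enum Mode { ModeA, ModeB };

inline void example_usage() {
    {
        // Owning payload: the Thing is deleted together with the node.
        list_node<std::unique_ptr<Thing>> owning(std::unique_ptr<Thing>(new Thing));
        // Plain payload: nothing extra to delete.
        list_node<Mode> plain(ModeB);
        assert(Thing::alive() == 1);
        assert(plain.data == ModeB);
    }
    assert(Thing::alive() == 0);
}
```

With this design the list never has to ask whether its payload is a pointer; the payload's own destructor decides what cleanup means.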
Template specialization is the best answer, and will work well as long as you don't mix types of nodes. However, if you want to mix the types of your linked nodes, let me show you how to do it. First, there is no straightforward template solution. You would have to cast your linked nodes to a common type because of strict type constraints.
A quite common solution is to construct a variant class (which can hold one value of several possible types, and always knows which one it holds). Qt has a QVariant class, for instance. Boost has boost::any.
Here is a complete example implementation using a custom variant class that could hold any of your types. It can handle your suggested object pointer and enum, but could be extended to hold more.
An example which will print "delete obj" once:
#include <iostream>
int
main( int argc, char **argv )
{
LinkedList<VariantExample> elementObj( new ExampleObj );
LinkedList<VariantExample> elementEnum( enumOne );
elementEnum.setNext( elementObj );
}
// VariantExample class. Have a look at QVariant to see what a fairly
// complete interface could look like.
struct ExampleObj
{
};
enum ExampleEnum
{
enumOne,
enumTwo
};
struct VariantExample
{
ExampleObj* obj; // or better boost::shared_ptr<ExampleObj> obj
ExampleEnum en;
bool is_obj;
bool is_enum;
VariantExample() : obj(0), is_obj(false), is_enum(false) {}
// implicit conversion constructors
VariantExample( ExampleObj* obj_ ) : is_obj(true), is_enum(false)
{ obj = obj_;
}
VariantExample( ExampleEnum en_ ) : obj(0), is_obj(false), is_enum(true)
{ en = en_;
}
// Not needed when using boost::shared_ptr above
void
destroy()
{
if( is_obj && obj )
{
std::cout << "delete obj" << std::endl;
delete obj;
}
}
};
// The linked list template class which handles variant classes with a destroy()
// method (see VariantExample).
template
<
typename _type_ = VariantExample
>
struct LinkedList
{
LinkedList* m_next;
_type_ m_variant;
explicit
LinkedList( _type_ variant_ ) : m_next(0), m_variant( variant_ ){ }
void
setNext( LinkedList& next_ ){ m_next = &next_; }
// Not needed when using boost::shared_ptr above
~LinkedList()
{
m_variant.destroy();
}
};
Because elementObj's destroy method is called once, when the LinkedList destructor runs, the output "delete obj" appears just once. Again, since you were quite specific about deletion and ownership, this example has a destroy method, which is called explicitly in the destructor of the LinkedList class. A better ownership model could be implemented with, e.g., boost::shared_ptr; then you don't need to destroy the object manually. It also helps to read about conversion constructors, by the way.
// the first parameter becomes boost::shared_ptr<ExampleObj>( new ExampleObj ) )
// and is deleted when LinkedList is destroyed. See code comments above.
LinkedList<> elementObj( new ExampleObj );
Finally, note that a single variant class has to hold all the types that can appear in your LinkedList chain. Two different LinkedList variant types would not work, again because their "next" pointer types would not be compatible.
Footnote:
How do type constraints prevent an easy solution? The type of your linked node's "next" pointer is not just the bare template name; that is only a shorthand. The actual type is qualified with the template arguments, and that full type is what the compiler uses to judge type compatibility.
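With a C++17 compiler, the hand-written variant class could be replaced by std::variant. That is a different tool from the QVariant and boost::any mentioned above, swapped in here only as a sketch. ExampleObj is given a hypothetical alive() counter purely to show that the owned object is released, and Element and count_objects are illustrative names.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>
#include <variant>

// Counts live instances so the example can check that cleanup happened.
struct ExampleObj {
    static int &alive() { static int n = 0; return n; }
    ExampleObj() { ++alive(); }
    ExampleObj(const ExampleObj &) = delete;
    ~ExampleObj() { --alive(); }
};

enum ExampleEnum { enumOne, enumTwo };

// One element type that holds either an owned object or an enum value.
using Element = std::variant<std::unique_ptr<ExampleObj>, ExampleEnum>;

// Counts how many elements currently hold an object pointer.
inline std::size_t count_objects(const std::list<Element> &elements) {
    std::size_t n = 0;
    for (const Element &e : elements) {
        if (std::holds_alternative<std::unique_ptr<ExampleObj>>(e)) ++n;
    }
    return n;
}

inline void example_usage() {
    {
        std::list<Element> elements;
        elements.emplace_back(std::make_unique<ExampleObj>());
        elements.emplace_back(enumOne);
        assert(count_objects(elements) == 1);
        assert(ExampleObj::alive() == 1);
    }
    // Destroying the list destroyed the variant, which released the object.
    assert(ExampleObj::alive() == 0);
}
```

The variant tracks which alternative is active, so the is_obj and is_enum flags and the manual destroy() call are no longer needed.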
Say I have a struct with a bunch of members:
struct foo {
int len;
bar *stuff;
};
As it so happens stuff will point to an array of bars that is len long. I'd like to encode this in stuff's type. So something like:
struct foo {
int len;
DependentLength<bar, &foo::len> stuff;
};
Then I could implement DependentLength to behave like a pointer to a bar array, but one that asserts when an index greater than or equal to foo::len is accessed. However, I can't implement DependentLength<&foo::len>::operator[], because operator[] only takes one parameter, the index, and it needs to know the location of the 'foo' object in order to dereference the member pointer template parameter and do the assert check.
However, I happen to know that DependentLength will only ever be used here as a member of 'foo'. What I'd really like to do is tell DependentLength where to find len relative to itself, rather than relative to a foo pointer. So something like DependentLength<(char*)&foo::stuff - (char*)&foo::len> stuff;, but that's not legal C++. Is there a good, or failing that an evil, language hack that could make this work?
So something like DependentLength<(char*)&foo::stuff - (char*)&foo::len> stuff;
You're asking templates to perform calculations based on dynamic properties passed to them at run time. That won't work, since templates must be instantiated with values that allow the compiler to create the requested code at compile time. Thus any value passed to a template must be resolvable at compile time, not at run time.
You're going to have to use a dynamic container type. For instance, std::vector meets your request: the std::vector::at() function throws an exception if you exceed the bounds of the underlying container. It's unfortunately not as convenient as a static_assert, but again, using static_assert is impossible in this situation since you need run-time checking of the bounds. Additionally, std::vector also provides an overload of operator[], iterators, queries for its size, etc.
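A minimal sketch of the std::vector suggestion, with a placeholder bar type and a hypothetical helper can_access added only to make the bounds check observable:

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

struct bar { int value; };  // placeholder element type

struct foo {
    std::vector<bar> stuff;  // the length travels with the data
};

// Returns true if stuff.at(index) succeeds, false if it throws std::out_of_range.
inline bool can_access(const foo &f, std::size_t index) {
    try {
        (void)f.stuff.at(index);
        return true;
    } catch (const std::out_of_range &) {
        return false;
    }
}

inline void example_usage() {
    foo f;
    f.stuff.resize(3);
    f.stuff.at(2).value = 7;
    assert(f.stuff.size() == 3);
    assert(can_access(f, 2));
    assert(!can_access(f, 3));  // one past the end is rejected
}
```

The separate len member disappears, since stuff.size() always reports the current length.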
You can tell the template the offset of the member to use as length.
template<typename T, typename LEN_T, ptrdiff_t LEN_OFFSET>
class DependentArray
{
public:
T& operator[](LEN_T i_offset)
{
if (i_offset < 0) throw xxx;
if (i_offset >= this->size()) throw xxx;
return this->m_pArr[i_offset];
} // []
private:
LEN_T& size()
{
return *reinterpret_cast<LEN_T*>(reinterpret_cast<char*>(this) + LEN_OFFSET);
} // ()
private:
T* m_pArr;
};
struct foo
{
int len;
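DependentArray<bar, int, -static_cast<ptrdiff_t>(sizeof(int))> stuff; // assumes len sits exactly sizeof(int) bytes before stuff, with no padding in between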
};
Edit 2:
Thought of another solution. Use a class that is good for foo only to supply the offset of the size field and define its method after foo is defined and offsets can be calculated:
#define MEMBER_OFFSET(T,M) \
(reinterpret_cast<char*>(&reinterpret_cast<T*>(0x10)->M) - \
reinterpret_cast<char*>(reinterpret_cast<T*>(0x10)))
template<typename T, typename LEN_T, typename SIZE_OFFSET_SUPPLIER>
class FooDependentArray
{
public:
T& operator[](LEN_T i_offset)
{
if (i_offset < 0) throw xxx;
if (i_offset >= this->size()) throw xxx;
return this->m_pArr[i_offset];
} // []
private:
LEN_T& size()
{
const ptrdiff_t len_offset = SIZE_OFFSET_SUPPLIER::getOffset();
return *reinterpret_cast<LEN_T*>(reinterpret_cast<char*>(this) + len_offset);
} // ()
private:
T* m_pArr;
};
struct FooSizeOffsetSupplier
{
static ptrdiff_t getOffset();
};
struct foo
{
int len;
FooDependentArray<bar, int, FooSizeOffsetSupplier> stuff;
};
ptrdiff_t FooSizeOffsetSupplier::getOffset()
{
return MEMBER_OFFSET(foo, len) - MEMBER_OFFSET(foo, stuff);
} // ()
This makes it possible to add and remove members from foo.
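A compilable sketch of this second idea is shown below. It uses the standard offsetof macro in place of MEMBER_OFFSET, which is an assumption that holds only while foo stays standard-layout, and it throws std::out_of_range where the code above leaves the exception type open. The names FooOffset, offset() and length() are illustrative. The array object must live inside a foo as its stuff member; a copy stored anywhere else would read an unrelated location as its length.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

struct bar { int value; };  // placeholder element type

template <typename T, typename LenT, typename OffsetSupplier>
class DependentArray {
public:
    explicit DependentArray(T *arr = nullptr) : arr_(arr) {}

    T &operator[](LenT i) {
        if (i < 0 || i >= length()) throw std::out_of_range("index");
        return arr_[i];
    }

private:
    // Reads the length field located at a fixed byte offset from this object.
    LenT length() const {
        const char *self = reinterpret_cast<const char *>(this);
        return *reinterpret_cast<const LenT *>(self + OffsetSupplier::offset());
    }
    T *arr_;
};

struct FooOffset {
    static std::ptrdiff_t offset();  // defined once foo is complete
};

struct foo {
    int len;
    DependentArray<bar, int, FooOffset> stuff;
};

inline std::ptrdiff_t FooOffset::offset() {
    return static_cast<std::ptrdiff_t>(offsetof(foo, len)) -
           static_cast<std::ptrdiff_t>(offsetof(foo, stuff));
}

inline void example_usage() {
    bar storage[3] = {{1}, {2}, {3}};
    foo f = {3, DependentArray<bar, int, FooOffset>(storage)};
    assert(f.stuff[2].value == 3);
    bool threw = false;
    try { (void)f.stuff[3]; } catch (const std::out_of_range &) { threw = true; }
    assert(threw);
}
```

Because the offset is computed from the real layout, padding between len and stuff is accounted for automatically.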
So basically the assignment was to create a doubly linked list that's templated generically instead of locked to a single data type. I've tried compiling with both gcc and msvc, and both compilers give me roughly the same errors, so I'm assuming it's just my bad coding and not the quirkiness of one compiler or the other.
Currently, I'm getting errors saying that my classes in linkList.h are not a template
../linkList.h:34: error: ‘llist’ is not a template type
../linkList.h:143: error: ‘iter’ is not a template type
../josephus.cpp:14: error: ‘llist’ is not a template
../josephus.cpp:14: error: aggregate ‘llist ppl’ has incomplete type and cannot be defined
../josephus.cpp:15: error: ‘iter’ is not a template
linkList.h
template<typename T>
class iter
{
public:
iter()
{
position = sentin;
container = sentin->payload;
}
T get() const
{
assert(position != sentin);
return position->payload;
}
void next()
{
position = position->next;
}
void previous()
{
position = position->prev;
}
bool equals(iter itr) const
{
return position == itr.position;
}
private:
node *position;
llist *container;
};
josephus.cpp
llist<int> ppl;
iter<int> pos;
int start = static_cast<int>(argv[1]) - 1;
int end = static_cast<int>(argv[2]) - 1;
Any help in this matter is much appreciated
Your forward declaration says llist is a class:
class llist;
Then you say it is a template:
template<typename T>
class llist;
Similarly with iter.
I don't know how you could make it compilable easily. However, you can make node and iter 'inside' of llist.
There are several issues.
class A;
is not the way you forward declare a templated class.
If A has a single templated parameter you need to say:
template<typename T>
class A;
If you say that after you've already said class A; you're contradicting yourself. The next issue is similar: friend class A; won't work if A is templated, and you need to say friend class A<T>; or similar. Finally, static_cast<int>(argv[1]) will not compile (although static_cast<int>(argv[1][0]) would, it is still not what you want). To convert a string to an integer meaningfully, you'll need to use atoi, strtol, stringstream, etc.
llist is not a class, so forward declaring it as one is not useful. Declare it as a template instead:
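For the argv conversion, a small helper built on std::strtol might look like the sketch below. parse_int is a hypothetical name, and the policy of rejecting empty input, trailing characters and out-of-range values is one reasonable choice rather than the only one.

```cpp
#include <cassert>
#include <cerrno>
#include <climits>
#include <cstdlib>

// Returns true and stores the value in out if text is a complete base-10
// integer that fits in an int; returns false otherwise and leaves out alone.
inline bool parse_int(const char *text, int &out) {
    char *end = nullptr;
    errno = 0;
    const long value = std::strtol(text, &end, 10);
    if (end == text || *end != '\0') return false;  // nothing parsed, or trailing junk
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX) return false;
    out = static_cast<int>(value);
    return true;
}

inline void example_usage() {
    int n = 0;
    assert(parse_int("42", n) && n == 42);
    assert(!parse_int("4x", n));
    assert(n == 42);  // unchanged after a failed parse
}
```

In josephus.cpp this could be used as parse_int(argv[1], start) after checking that argc is large enough.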
template<typename T> class llist;
Trying to make the code compile is relatively simple.
You have just missed the template part of a lot of the types. Search for iter, llist and node and make sure they have the appropriate <T> on the end.
If you look at the STL, it is conventional to typedef some internal types for ease of use. You could follow the same principle.
template<typename T>
class llist
{
typedef iter<T> Iter;
typedef node<T> Node;
// The rest of the code.
};
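Putting these suggestions together, one possible shape is to nest node and iter inside llist, so that neither needs its own template header or forward declaration. The sketch below is a reduced illustration (forward iteration and push_back only), not a reconstruction of the original linkList.h.

```cpp
#include <cassert>

template <typename T>
class llist {
    // node is an implementation detail; it sees T through the enclosing template.
    struct node {
        T payload;
        node *prev;
        node *next;
    };

public:
    class iter {
    public:
        explicit iter(node *pos) : position(pos) {}
        const T &get() const {
            assert(position != nullptr);
            return position->payload;
        }
        void next() { position = position->next; }
        void previous() { position = position->prev; }
        bool equals(const iter &other) const { return position == other.position; }

    private:
        node *position;
    };

    llist() : head(nullptr), tail(nullptr) {}
    llist(const llist &) = delete;
    llist &operator=(const llist &) = delete;
    ~llist() {
        while (head != nullptr) {
            node *following = head->next;
            delete head;
            head = following;
        }
    }

    void push_back(const T &value) {
        node *n = new node{value, tail, nullptr};
        if (tail != nullptr) tail->next = n; else head = n;
        tail = n;
    }

    iter begin() const { return iter(head); }
    iter end() const { return iter(nullptr); }  // a null position marks the end

private:
    node *head;
    node *tail;
};
```

User code then writes llist<int> ppl; and llist<int>::iter pos = ppl.begin();, and the only template parameter list in the file is the one on llist itself.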