Where does a tuple store its data? - c++

When I define a tuple like std::tuple<int, char> foo; Where inside the class does it store the int and char values? I'm looking for a layman's terms explanation.

If you take the time to digest it then the GNU implementation is actually a decent example of recursive inheritance using C++0x variadic templates. This is not a subject that lends itself easily to a layman's explanation and is best understood by reading the code over and over until it makes sense.
From what I can see they're inheriting upwards for each successive type in the tuple's type-list with each inherited class taking charge of the storage for that type until the recursion hits the end of the type-list.
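To make the recursive inheritance idea concrete, here is a minimal sketch. It is not libstdc++'s actual code; MiniTuple, Element and mini_get are made-up names, and real implementations add indices, empty-base tricks and much more. Each level of the hierarchy owns exactly one value and inherits from the tuple of the remaining types.

```cpp
#include <cassert>
#include <cstddef>

template <typename... Ts>
struct MiniTuple;

// End of the recursion: an empty type list stores nothing.
template <>
struct MiniTuple<> {};

// Each level stores one value (Head) and inherits from the tuple of the rest.
template <typename Head, typename... Tail>
struct MiniTuple<Head, Tail...> : MiniTuple<Tail...> {
    Head value;
    MiniTuple(Head h, Tail... t) : MiniTuple<Tail...>(t...), value(h) {}
};

// Element<I, Tuple> names the I-th type and the base class that stores it.
template <std::size_t I, typename Tuple>
struct Element;

template <typename Head, typename... Tail>
struct Element<0, MiniTuple<Head, Tail...>> {
    typedef Head type;
    typedef MiniTuple<Head, Tail...> level;
};

template <std::size_t I, typename Head, typename... Tail>
struct Element<I, MiniTuple<Head, Tail...>>
    : Element<I - 1, MiniTuple<Tail...>> {};

template <std::size_t I, typename... Ts>
typename Element<I, MiniTuple<Ts...>>::type& mini_get(MiniTuple<Ts...>& t) {
    // Converting to the right base class selects the level that owns element I.
    typename Element<I, MiniTuple<Ts...>>::level& level = t;
    return level.value;
}

void mini_tuple_demo() {
    MiniTuple<int, char> foo(42, 'x');
    assert(mini_get<0>(foo) == 42);   // stored in MiniTuple<int, char>
    assert(mini_get<1>(foo) == 'x');  // stored in the base MiniTuple<char>
}
```

So for MiniTuple<int, char> the int is a plain data member of the most derived class and the char is a plain data member of its base class.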

Anywhere it wants, really; it's an implementation detail. In practice, though, every implementation boils down to data members (direct or inherited) of the tuple object itself, so the values live wherever the tuple object lives. If you're really interested, you can look into the source code of open-source standard library implementations (like libc++ and libstdc++) or the implementation used by your compiler.

Related

Easy way to implement small buffer optimization for arbitrary type erasure (like in std::function.)

I tend to use type erasure technique quite a bit.
It typically looks like this:
class YetAnotherTypeErasure
{
public:
    // interface redirected to pImpl
private:
    // Adapting function
    template ...
    friend YetAnotherTypeErasure make_YetAnotherTypeErasure (...);
    class Interface {...};
    template <typename Adaptee>
    class Concrete final : public Interface {
        // redirecting Interface to Adaptee
    };
    std::unique_ptr<Interface> pImpl_; // always on the heap
};
std::function does something similar, but it has a small buffer optimization: if Concrete<Adaptee> is smaller than some threshold and has nothrow move operations, it is stored inside the object itself. Is there a generic library solution that makes this fairly easy? What about enforcing small-buffer-only storage at compile time? Has something been proposed for standardisation?
I know of no requirement for the small buffer optimization in the standard or in any proposal, though it is often allowed or encouraged.
Note that some (conditionally) non-throwing requirements on such types effectively require the optimization in practice because alternatives (like non-throwing allocation from emergency buffers) seem insane here.
On the other hand, you can build your own solution from scratch on top of the standard library (e.g. std::aligned_storage). This may still be verbose from the user's point of view, but it is not too hard.
Actually I implemented (not proposed then) any with such optimization and some related utilities several years ago. Lately, libstdc++'s implementation of std::experimental::any used the technique almost exactly as this (however, __ prefixed internal names are certainly not good for ordinary library users).
My implementation now uses some common helpers to deal with the storage. These helpers make it easier to implement the type erasure storage strategy (at least well enough for something similar to any). But I am still interested in a more general high-level solution to simplify the interface redirection.
There is also a function implementation based directly on the any implementation above. Both support move-only types and a sane allocator interface, while the std ones do not. The function implementation has better performance than std::function in libstdc++ in some cases, thanks to the (partially no-op) default initialization of the underlying any object.
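For the "make your own from scratch" route, a minimal sketch of the Interface/Concrete pattern from the question with a small buffer might look like the following. Everything here is an assumption for illustration: SmallFn, kBufSize and the fixed int(int) signature are made up, the 32-byte threshold is arbitrary, and assignment and copying are left out. It uses an alignas'd byte array in place of std::aligned_storage, which does the same job.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <type_traits>
#include <utility>

// Hypothetical SmallFn: erases any callable usable as int(int).
class SmallFn {
    struct Interface {
        Interface() = default;
        Interface(const Interface&) = default;
        virtual ~Interface() = default;
        virtual int call(int x) = 0;
        // Move-construct this object into raw storage at dst.
        virtual Interface* move_to(void* dst) = 0;
    };

    template <typename F>
    struct Concrete final : Interface {
        F f;
        explicit Concrete(F fn) : f(std::move(fn)) {}
        int call(int x) override { return f(x); }
        Interface* move_to(void* dst) override {
            return new (dst) Concrete(std::move(*this));
        }
    };

    static constexpr std::size_t kBufSize = 32;  // arbitrary threshold (assumption)

    // Compile-time test: small enough, not over-aligned, and nothrow movable.
    template <typename C>
    struct FitsInBuffer
        : std::integral_constant<bool,
              sizeof(C) <= kBufSize &&
              alignof(C) <= alignof(std::max_align_t) &&
              std::is_nothrow_move_constructible<C>::value> {};

    alignas(std::max_align_t) unsigned char buf_[kBufSize];
    Interface* p_ = nullptr;  // points into buf_ or at a heap allocation
    bool local_;              // true when *p_ lives in buf_

    template <typename F>
    void construct(F f, std::true_type) {   // small: placement-new into the buffer
        p_ = new (buf_) Concrete<F>(std::move(f));
    }
    template <typename F>
    void construct(F f, std::false_type) {  // large: fall back to the heap
        p_ = new Concrete<F>(std::move(f));
    }

public:
    template <typename F>
    SmallFn(F f) : local_(FitsInBuffer<Concrete<F>>::value) {
        construct(std::move(f), FitsInBuffer<Concrete<F>>());
    }

    SmallFn(SmallFn&& other) noexcept : local_(other.local_) {
        if (!other.p_) return;
        if (local_) {
            p_ = other.p_->move_to(buf_);  // relocate into our own buffer
            other.p_->~Interface();
        } else {
            p_ = other.p_;                 // steal the heap pointer
        }
        other.p_ = nullptr;
    }
    SmallFn(const SmallFn&) = delete;
    SmallFn& operator=(const SmallFn&) = delete;  // assignment left out of the sketch

    ~SmallFn() {
        if (!p_) return;
        if (local_) p_->~Interface();  // buffer-resident: destroy, do not free
        else delete p_;                // heap-resident: destroy and free
    }

    int operator()(int x) { return p_->call(x); }
    bool stored_locally() const { return local_; }
};

void small_fn_demo() {
    int offset = 1;
    SmallFn small([offset](int x) { return x + offset; });  // tiny closure
    assert(small.stored_locally());
    assert(small(41) == 42);

    SmallFn moved(std::move(small));  // buffer contents are relocated
    assert(moved(41) == 42);
}
```

To enforce buffer-only storage at compile time, the heap overload of construct could be replaced by a static_assert on FitsInBuffer; that variation is not shown here.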
I found a reasonably nice solution for everyday code: use std::function.
With tiny library support to help with const correctness,
the code gets down to 20 lines:
https://gcc.godbolt.org/z/GtewFI
I think the proposed polymorphic_value (wg21.link/p0201) comes closest to what we can do in modern C++. Note that it was proposed for standardization but is not part of C++20.
Basically it's like std::any but all of your types have to inherit the same interface.
It is Semiregular; they decided to drop equality.
This has some overhead: one vptr in the class itself and a separate dispatch mechanism in the polymorphic value. It also has a pointer-like interface instead of a value-like one.
However, considering how easy it is to use compared to writing your own type-erased adapter, I'd say it would be more than good enough for most use cases.

What is the purpose of boost::fusion?

I've spent the day reading notes and watching a video on boost::fusion and I really don't get some aspects of it.
Take for example, the boost::fusion::has_key<S> function. What is the purpose of having this in boost::fusion? Is the idea that we just try and move as much programming as possible to happen at compile-time? So pretty much any boost::fusion function is the same as the run-time version, except it now evaluates at compile time? (and we assume doing more at compile-time is good?).
Related to boost::fusion, I'm also a bit confused about why metafunctions always return types. Why is this?
Another way to look at boost::fusion is to think of it as "poor man introspection" library. The original motivation for boost::fusion comes from the direction of boost::spirit parser/generator framework, in particular the need to support what is called "parser attributes".
Imagine, you've got a CSV string to parse:
aaaa, 1.1
The type this string parses into can be described as "tuple of string and double". We can define such tuples in "plain" C++, either with old-school structs (struct { string a; double b; }) or with the newer tuple<string, double>. The only thing missing is some sort of adapter that allows passing tuples (and some other types) of arbitrary composition to a unified parser interface, which can then make sense of them without any out-of-band information (such as the format strings used by scanf).
That's where boost::fusion comes into play. The most straightforward way to construct a "fusion sequence" is to adapt a normal struct:
struct a {
    string s;
    double d;
};
BOOST_FUSION_ADAPT_STRUCT(a, (string, s)(double, d))
The "ADAPT_STRUCT" macro gives the parser framework (in this example) the information it needs to "iterate" over the members of struct a, asking questions such as:
I just parsed a string. Can I assign it to first member of struct a?
I just parsed a double. Can I assign it to second member of struct a?
Are there any other members in struct a or should I stop parsing?
Obviously, this basic example can be further extended (and boost::fusion supplies the capability) to address much more complex cases:
Variants - let's say the parser can encounter either a string or a double and wants to assign it to the right member of struct a. BOOST_FUSION_ADAPT_ASSOC_STRUCT comes to the rescue (now our parser can ask questions like "which member of struct a is of type double?").
Transformations - our parser may be designed to accept certain types as parameters, but the rest of the program has changed quite a bit since. Fusion metafunctions can be conveniently used to adapt new types to old realities (or vice versa).
The rest of boost::fusion functionality naturally follows from the above basics. Fusion really shines when there's a need for conversion (in either direction) between "loose IO data" and the strongly typed, structured data C++ programs operate upon (if efficiency is of concern). It is the enabling factor behind spirit::qi and spirit::karma being such efficient (probably the fastest) I/O frameworks.
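The "iterate over the members and dispatch on each member's type" idea can be sketched with nothing but the standard library. To be clear, this is not the Boost.Fusion API: it uses std::tuple in place of an adapted struct, and parse_field, parse_members and parse_csv are made-up names. It only illustrates how the member types themselves drive the parsing, with no format string.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <tuple>
#include <type_traits>

// One overload per member type: the member's static type selects the parser.
void parse_field(const std::string& text, std::string& out) { out = text; }
void parse_field(const std::string& text, double& out) { out = std::stod(text); }
void parse_field(const std::string& text, int& out) { out = std::stoi(text); }

// Compile-time loop over the members: no members left, so stop.
template <std::size_t I, typename... Ts>
typename std::enable_if<I == sizeof...(Ts)>::type
parse_members(std::istringstream&, std::tuple<Ts...>&) {}

// Handle member I, then recurse to member I + 1.
template <std::size_t I, typename... Ts>
typename std::enable_if<(I < sizeof...(Ts))>::type
parse_members(std::istringstream& in, std::tuple<Ts...>& out) {
    std::string field;
    std::getline(in, field, ',');
    parse_field(field, std::get<I>(out));
    parse_members<I + 1>(in, out);
}

template <typename... Ts>
std::tuple<Ts...> parse_csv(const std::string& line) {
    std::tuple<Ts...> result;
    std::istringstream in(line);
    parse_members<0>(in, result);
    return result;
}

void parse_csv_demo() {
    std::tuple<std::string, double> row =
        parse_csv<std::string, double>("aaaa, 1.1");
    assert(std::get<0>(row) == "aaaa");
    assert(std::get<1>(row) == 1.1);
}
```

Fusion's contribution is making the same kind of loop work over ordinary structs (and many other sequence types), which std::tuple alone does not give you.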
Fusion is there as a bridge between compile-time and run-time containers and algorithms. You may or may not want to move some of your processing to compile-time, but if you do want to then Fusion might help. I don't think it has a specific manifesto to move as much as possible to compile-time, although I may be wrong.
Meta-functions return types because template meta-programming wasn't invented on purpose. It was discovered more-or-less by accident that C++ templates can be used as a compile-time programming language. A meta-function is a mapping from template arguments to instantiations of a template. As of C++03 there were two kinds of template (class- and function-), therefore a meta-function has to "return" either a class or a function. Classes are more useful than functions, since you can put values etc. in their static data members.
C++11 adds another kind of template (for typedefs), but that is kind of irrelevant to meta-programming. More importantly for compile-time programming, C++11 adds constexpr functions. They're properly designed for the purpose and they return values just like normal functions. Of course, their input is not a type, so they can't be mappings from types to something else in the way that templates can. So in that sense they lack the "meta-" part of meta-programming. They're "just" compile-time evaluation of normal C++ functions, not meta-functions.
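A small side-by-side may help here. This is only an illustrative sketch; AddPointer, Factorial and factorial are made-up names. The first two are meta-functions in the template sense (they "return" a class, with the result exposed as a nested typedef or a static data member), while the third is an ordinary function that happens to be evaluable at compile time.

```cpp
#include <type_traits>

// Type -> type: the "return value" is the nested typedef named type.
template <typename T>
struct AddPointer { typedef T* type; };

// Value -> value, C++03 style: still returns a class; the result is a static member.
template <unsigned N>
struct Factorial { static const unsigned value = N * Factorial<N - 1>::value; };
template <>
struct Factorial<0> { static const unsigned value = 1; };

// C++11 constexpr: a normal function returning a normal value.
constexpr unsigned factorial(unsigned n) {
    return n == 0 ? 1 : n * factorial(n - 1);
}

static_assert(std::is_same<AddPointer<int>::type, int*>::value, "type in, type out");
static_assert(Factorial<5>::value == 120, "computed by template instantiation");
static_assert(factorial(5) == 120, "computed by constant evaluation");
```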

Will C++ compiler generate code for each template type?

I have two questions about templates in C++. Let's imagine I have written a simple List and now I want to use it in my program to store pointers to different object types (A*, B* ... ALot*). My colleague says that for each type there will be generated a dedicated piece of code, even though all pointers in fact have the same size.
If this is true, can somebody explain to me why? For example in Java generics have the same purpose as templates for pointers in C++. Generics are only used for pre-compile type checking and are stripped down before compilation. And of course the same byte code is used for everything.
Second question is, will dedicated code be also generated for char and short (considering that they both have the same size and there is no specialization).
If this makes any difference, we are talking about embedded applications.
I have found a similar question, but it did not completely answer my question: Do C++ template classes duplicate code for each pointer type used?
Thanks a lot!
I have two questions about templates in C++. Let's imagine I have written a simple List and now I want to use it in my program to store pointers to different object types (A*, B* ... ALot*). My colleague says that for each type there will be generated a dedicated piece of code, even though all pointers in fact have the same size.
Yes, this is equivalent to having both functions written.
Some linkers will detect the identical functions, and eliminate them. Some libraries are aware that their linker doesn't have this feature, and factor out common code into a single implementation, leaving only a casting wrapper around the common code. I.e., a std::vector<T*> specialization may forward all work to a std::vector<void*> and then do the casting on the way out.
Now, comdat folding is delicate: it is relatively easy to make functions you think are identical, but end up not being the same, so two functions are generated. As a toy example, you could go off and print the typename via typeid(x).name(). Now each version of the function is distinct, and they cannot be eliminated.
In some cases, you might do something like this thinking that it is a run time property that differs, and hence identical code will be created, and the identical functions eliminated -- but a smart C++ compiler might figure out what you did, use the as-if rule and turn it into a compile-time check, and block not-really-identical functions from being treated as identical.
If this is true, can somebody explain to me why? For example in Java generics have the same purpose as templates for pointers in C++. Generics are only used for pre-compile type checking and are stripped down before compilation. And of course the same byte code is used for everything.
No, they aren't. Generics are roughly equivalent to the C++ technique of type erasure, such as what std::function<void()> does to store any callable object. In C++, type erasure is often done via templates, but not all uses of templates are type erasure!
The things that C++ does with templates that are not in essence type erasure are generally impossible to do with Java generics.
In C++, you can create a type erased container of pointers using templates, but std::vector doesn't do that -- it creates an actual container of pointers. The advantage to this is that all type checking on the std::vector is done at compile time, so there doesn't have to be any run time checks: a safe type-erased std::vector may require run time type checking and the associated overhead involved.
Second question is, will dedicated code be also generated for char and short (considering that they both have the same size and there is no specialization).
They are distinct types. I can write code that will behave differently with a char or short value. As an example:
std::cout << x << "\n";
with x being a short, this prints an integer whose value is x -- with x being a char, this prints the character corresponding to x.
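A tiny sketch of that difference, wrapped in a template so the two instantiations can be compared directly. The name show is made up, and the expected character assumes an ASCII-compatible execution character set.

```cpp
#include <cassert>
#include <sstream>
#include <string>

template <typename T>
std::string show(T x) {
    std::ostringstream out;
    out << x;  // overload resolution picks a different operator<< for each T
    return out.str();
}

void show_demo() {
    assert(show<short>(65) == "65");  // formatted as a number
    assert(show<char>(65) == "A");    // formatted as a character (ASCII assumed)
}
```

The source text of show is identical for both types, yet the two instantiations call different functions, so they cannot share generated code.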
Now, almost all template code exists in header files, and is implicitly inline. While inline doesn't mean what most folk think it means, it does mean that the compiler can hoist the code into the calling context easily.
If this makes any difference, we are talking about embedded applications.
What really makes a difference is what your particular compiler and linker are, and what settings and flags they have active.
The answer is maybe. In general, each instantiation of a
template is a unique type, with a unique implementation, and
will result in a totally independent instance of the code.
Merging the instances is possible, but would be considered
"optimization" (under the "as if" rule), and this optimization
isn't widespread.
With regards to comparisons with Java, there are several points
to keep in mind:
C++ uses value semantics by default. A std::vector, for
example, will actually insert copies. And whether you're
copying a short or a double does make a difference in the
generated code. In Java, short and double will be boxed,
and the generated code will clone a boxed instance in some way;
cloning doesn't require different code, since it calls a virtual
function of Object, but physically copying does.
C++ is far more powerful than Java. In particular, it allows
comparing things like the address of functions, and it requires
that the functions in different instantiations of templates have
different addresses. Usually, this is not an important point,
and I can easily imagine a compiler with an option which tells
it to ignore this point, and to merge instances which are
identical at the binary level. (I think VC++ has something like
this.)
Another issue is that the implementation of a template in C++
must be present in the header file. In Java, of course,
everything must be present, always, so this issue affects all
classes, not just templates. This is, of course, one of the
reasons why Java is not appropriate for large applications. But
it means that you don't want any complicated functionality in a
template; doing so loses one of the major advantages of C++,
compared to Java (and many other languages). In fact, it's not
rare, when implementing complicated functionality in templates,
to have the template inherit from a non-template class which
does most of the implementation in terms of void*. While
implementing large blocks of code in terms of void* is never
fun, it does have the advantage of offering the best of both
worlds to the client: the implementation is hidden in compiled
files, invisible in any way, shape or manner to the client.
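A minimal sketch of that "thin template over a void* implementation" technique, which is also what the std::vector<void*> forwarding mentioned earlier amounts to. PtrListBase and PtrList are made-up names; in a real project the base class's member functions would be defined in a separately compiled source file rather than inline as they are here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Non-template base: compiled once, does all the real work in terms of void*.
class PtrListBase {
public:
    std::size_t size() const { return items_.size(); }
protected:
    void push(void* p) { items_.push_back(p); }
    void* at(std::size_t i) const { return items_[i]; }
private:
    std::vector<void*> items_;
};

// Thin typed wrapper: each instantiation only adds casts, which inline away.
template <typename T>
class PtrList : private PtrListBase {
public:
    using PtrListBase::size;
    void push_back(T* p) { push(p); }
    T* operator[](std::size_t i) const { return static_cast<T*>(at(i)); }
};

void ptr_list_demo() {
    int a = 1, b = 2;
    PtrList<int> ints;
    ints.push_back(&a);
    ints.push_back(&b);
    assert(ints.size() == 2);
    assert(*ints[1] == 2);
}
```

However many pointer types PtrList is instantiated for, the only per-type code is the pair of one-line wrappers, which matters on an embedded target with tight code size limits.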

Is boost::variant rocket science? (And should I therefore avoid it for simple problems?)

OK, so I have this tiny little corner of my code where I'd like my function to return one of (int, double, CString) to clean up the code a bit.
So I think: No problem to write a little union-like wrapper struct with three members etc. But wait! Haven't I read of boost::variant? Wouldn't this be exactly what I need? This would save me from messing around with a wrapper struct myself! (Note that I already have the boost library available in my project.)
So I fire up my browser, navigate to Chapter 28. Boost.Variant and lo and behold:
The variant class template is a safe, generic, stack-based discriminated union container, offering a simple solution for manipulating an object from a heterogeneous set of types [...]
Great! Exactly what I need!
But then it goes on:
Boost.Variant vs. Boost.Any
Boost.Any makes little use of template metaprogramming techniques (avoiding potentially hard-to-read error messages and significant compile-time processor and memory demands).
[...]
Troubleshooting
"Internal heap limit reached" -- Microsoft Visual C++ -- The compiler option /ZmNNN can increase the memory allocation limit. The NNN is a scaling percentage (i.e., 100 denotes the default limit). (Try /Zm200.)
[...]
Uh oh. So using boost::variant may significantly increase compile-time and generate hard-to-read error messages. What if someone moves my use of boost::variant to a common header, will our project suddenly take lots longer to compile? Am I introducing an (unnecessarily) complex type?
Should I use boost::variant for my simple tiny problem?
Generally, use boost::variant if you do want a discriminated union (any is for unknown types -- think of it as some kind of equivalent to how void* is used in C).
Some advantages include exception handling, potential usage of less space than the sum of the type sizes, type discriminated "visiting". Basically, stuff you'd want to perform on the discriminated union.
However, for boost::variant to be efficient, at least one of the types used must be "easily" constructed (read the documentation for more details on what "easily" means).
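For a feel of what the "simple tiny problem" looks like in code, here is a minimal sketch using std::variant from C++17, whose interface is similar to boost::variant's. The assumptions: std::string stands in for CString, and Value, parse_setting, Describe and describe are made-up names. With Boost the corresponding pieces would be boost::variant, boost::get and boost::apply_visitor.

```cpp
#include <cassert>
#include <string>
#include <variant>

using Value = std::variant<int, double, std::string>;

// A function that returns exactly one of the three types.
Value parse_setting(int kind) {
    switch (kind) {
        case 0:  return 42;
        case 1:  return 1.5;
        default: return std::string("text");
    }
}

// Type-discriminated visiting: one overload per alternative, checked at compile time.
struct Describe {
    std::string operator()(int) const { return "int"; }
    std::string operator()(double) const { return "double"; }
    std::string operator()(const std::string&) const { return "string"; }
};

std::string describe(const Value& v) { return std::visit(Describe(), v); }

void variant_demo() {
    Value v = parse_setting(0);
    assert(std::get<int>(v) == 42);
    assert(describe(v) == "int");
    assert(describe(parse_setting(2)) == "string");
}
```

If an alternative is later added to Value and Describe is not updated, the visit fails to compile, which is the main thing a hand-rolled union wrapper does not give you.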
Boost.Variant is not that complex, IMHO. Yes, it is template based, but it doesn't use any really complex feature of C++. I've used it quite a bit with no problems at all. I think in your case it would help to better describe what your code is doing.
Another way of thinking is transforming what that function returns into a more semantically rich structure/class that allows interpreting which inner element is interesting, but that depends on your design.
This kind of boost element comes from functional programming, where you have variants around every corner.
It should be a way to have a type-safe approach to returning a value that can be of one of several precise types. This means that it is useful for solving your problem, BUT you should consider whether it's really what you need to do.
The added value compared to other approaches that try to solve the same problem should be the type safety (you won't be able to place whatever you want inside a variant without noticing, as opposed to a void*).
I don't use it because, to me, it's a symptom of bad design.
Either your method should return an object that implements a specific interface, or it should be split into more than one method. Either way, the design should be reviewed.

Internal typedefs in C++ - good style or bad style?

Something I have found myself doing often lately is declaring typedefs relevant to a particular class inside that class, i.e.
class Lorem
{
public: // the typedefs must be public to be usable as Lorem::ptr outside the class
    typedef boost::shared_ptr<Lorem> ptr;
    typedef std::vector<Lorem::ptr> vector;
    //
    // ...
    //
};
These types are then used elsewhere in the code:
Lorem::vector lorems;
Lorem::ptr lorem( new Lorem() );
lorems.push_back( lorem );
Reasons I like it:
It reduces the noise introduced by the class templates: std::vector<boost::shared_ptr<Lorem> > becomes Lorem::vector, etc.
It serves as a statement of intent - in the example above, the Lorem class is intended to be reference counted via boost::shared_ptr and stored in a vector.
It allows the implementation to change - i.e. if Lorem needed to be changed to be intrusively reference counted (via boost::intrusive_ptr) at a later stage then this would have minimal impact to the code.
I think it looks 'prettier' and is arguably easier to read.
Reasons I don't like it:
There are sometimes issues with dependencies - if you want to embed, say, a Lorem::vector within another class but only need (or want) to forward declare Lorem (as opposed to introducing a dependency on its header file) then you end up having to use the explicit types (e.g. boost::shared_ptr<Lorem> rather than Lorem::ptr), which is a little inconsistent.
It may not be very common, and hence harder to understand?
I try to be objective with my coding style, so it would be good to get some other opinions on it so I can dissect my thinking a little bit.
I think it is excellent style, and I use it myself. It is always best to limit the scope of names as much as possible, and use of classes is the best way to do this in C++. For example, the C++ Standard library makes heavy use of typedefs within classes.
It serves as a statement of intent -
in the example above, the Lorem class
is intended to be reference counted
via boost::shared_ptr and stored in a
vector.
This is exactly what it does not do.
If I see 'Foo::Ptr' in the code, I have absolutely no idea whether it's a shared_ptr or a Foo* (STL has ::pointer typedefs that are T*, remember) or whatever. Esp. if it's a shared pointer, I don't provide a typedef at all, but keep the shared_ptr use explicitly in the code.
Actually, I hardly ever use typedefs outside Template Metaprogramming.
The STL does this type of thing all the time
The STL design with concepts defined in terms of member functions and nested typedefs is a historical cul-de-sac, modern template libraries use free functions and traits classes (cf. Boost.Graph), because these do not exclude built-in types from modelling the concept and because it makes adapting types that were not designed with the given template libraries' concepts in mind easier.
Don't use the STL as a reason to make the same mistakes.
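To illustrate the point about built-in types: a function written against a traits class and free functions accepts a raw array, which cannot have nested typedefs or member functions. This is a minimal sketch; sum_of is a made-up name.

```cpp
#include <cassert>
#include <iterator>
#include <vector>

// Uses std::begin/std::end (free functions) and std::iterator_traits (a traits
// class) instead of r.begin() and Range::value_type.
template <typename Range>
auto sum_of(const Range& r)
    -> typename std::iterator_traits<decltype(std::begin(r))>::value_type {
    typedef typename std::iterator_traits<decltype(std::begin(r))>::value_type value_type;
    value_type total = value_type();
    for (auto it = std::begin(r); it != std::end(r); ++it) total += *it;
    return total;
}

void sum_of_demo() {
    int raw[] = {1, 2, 3};
    std::vector<int> vec = {4, 5};
    assert(sum_of(raw) == 6);  // built-in array: no nested typedefs available
    assert(sum_of(vec) == 9);  // class type: works the same way
}
```

Had sum_of been written with typename Range::value_type and r.begin(), the call with raw would not compile.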
Typedefs are what policy-based design and traits are built upon in C++, so the power of generic programming in C++ stems from typedefs themselves.
Typedefs are definitely good style. And all your "reasons I like" are good and correct.
About the problems you have with that: well, forward declaration is not a holy grail. You can simply design your code to avoid multi-level dependencies.
You can move the typedef outside the class, but Class::ptr is so much prettier than ClassPtr that I don't do this. For me it is like with namespaces: things stay connected within the scope.
Sometimes I have used
Trait<Lorem>::ptr
Trait<Lorem>::collection
Trait<Lorem>::map
It can have a default for all domain classes, with specializations for certain ones.
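A minimal sketch of such a traits class, with a default for every domain class and a specialization for one of them. The assumptions: std::shared_ptr and std::unique_ptr are used instead of the Boost pointers from the question, and Ipsum is a made-up second domain class.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <type_traits>
#include <vector>

// Default: every domain class is shared and can be collected or mapped by name.
template <typename T>
struct Trait {
    typedef std::shared_ptr<T> ptr;
    typedef std::vector<ptr> collection;
    typedef std::map<std::string, ptr> map;
};

class Lorem {};
class Ipsum {};

// Specialization: Ipsum objects have a single owner.
template <>
struct Trait<Ipsum> {
    typedef std::unique_ptr<Ipsum> ptr;
    typedef std::vector<ptr> collection;
    typedef std::map<std::string, ptr> map;
};

void trait_demo() {
    Trait<Lorem>::collection lorems;
    lorems.push_back(std::make_shared<Lorem>());
    assert(lorems.size() == 1);

    Trait<Ipsum>::map ipsums;
    ipsums["first"] = Trait<Ipsum>::ptr(new Ipsum());
    assert(ipsums.count("first") == 1);
}
```

Because the typedefs live in Trait rather than in Lorem, Trait<Lorem>::ptr can be named where Lorem is only forward declared, which sidesteps the dependency issue raised in the question.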
The STL does this type of thing all the time - the typedefs are part of the interface for many classes in the STL.
reference
iterator
size_type
value_type
etc...
are all typedefs that are part of the interface for various STL template classes.
Another vote for this being a good idea. I started doing this when writing a simulation that had to be efficient, both in time and space. All of the value types had a Ptr typedef that started out as a boost shared pointer. I then did some profiling and changed some of them to a boost intrusive pointer without having to change any of the code where these objects were used.
Note that this only works when you know where the classes are going to be used, and that all the uses have the same requirements. I wouldn't use this in library code, for example, because you can't know when writing the library the context in which it will be used.
Currently I'm working on code that makes intensive use of this kind of typedef. So far that is fine.
But I noticed that there are quite often typedefs of typedefs whose definitions are split among several classes, so you never really know what type you are dealing with. My task is to summarize the size of some complex data structures hidden behind these typedefs, so I can't rely on existing interfaces. In combination with three to six levels of nested namespaces, it becomes confusing.
So before using them, there are some points to be considered:
Does anyone else need these typedefs? Is the class used a lot by other classes?
Do I shorten the usage or hide the class? (In case of hiding you also could think of interfaces.)
Are other people working with the code? How do they do it? Will they think it is easier or will they become confused?
When the typedef is used only within the class itself (i.e. is declared as private) I think it's a good idea. However, for exactly the reasons you give, I would not use it if the typedefs need to be known outside the class. In that case I recommend moving them outside the class.
I recommend moving those typedefs outside the class. This way, you remove the direct dependency on the shared pointer and vector classes and you can include them only when needed. Unless you are using those types in your class implementation, I don't think they should be inner typedefs.
The reasons you like it are still matched, since they are solved by the type aliasing through typedef, not by declaring them inside your class.