How does the compiler define the classes in type_traits? - c++

In C++11 and later, the <type_traits> header contains many classes for type checking, such as std::is_empty, std::is_polymorphic, std::is_trivially_constructible and many others.
While we use these classes just like normal classes, I cannot figure out any way to possibly write the definition of these classes. No amount of SFINAE (even with C++14/17 rules) or other method seems to be able to tell if a class is polymorphic, empty, or satisfy other properties. An class that is empty still occupies a positive amount of space as the class must have a unique address.
How then, might compilers define such classes in C++? Or perhaps it is necessary for the compiler to be intrinsically aware of these class names and parse them specially?

Back in the olden days, when people were first fooling around with type traits, they wrote some really nasty template code in attempts to write portable code to detect certain properties. My take on this was that you had to put a drip-pan under your computer to catch the molten metal as the compiler overheated trying to compile this stuff. Steve Adamczyk, of Edison Design Group (provider of industrial-strength compiler frontends), had a more constructive take on the problem: instead of writing all this template code that takes enormous amounts of compiler time and often breaks them, ask me to provide a helper function.
When type traits were first formally introduced (in TR1, 2006), there were several traits that nobody knew how to implement portably. Since TR1 was supposed to be exclusively library additions, these couldn't count on compiler help, so their specifications allowed them to get an answer that was occasionally wrong, but they could be implemented in portable code.
Nowadays, those allowances have been removed; the library has to get the right answer. The compiler help for doing this isn't special knowledge of particular templates; it's a function call that tells you whether a particular class has a particular property. The compiler can recognize the name of the function, and provide an appropriate answer. This provides a lower-level toolkit that the traits templates can use, individually or in combination, to decide whether the class has the trait in question.

Related

Does C++20 offer any new solutions to the problem of public member invisibility and source code bloat with inherited class templates?

Does C++20 offer any new solutions to the problem of public member invisibility and source code bloat/repetition with inherited class templates described in this question over 2 years ago ?
The "Problem"
The alleged "problem" is that, in a template, an unqualified name used in a way which isn't dependent on the specialization is truly independent of the specialization and refers to the entity with that name found at that point. The alleged source code "bloat" is using this-> to explicitly make the name dependent or qualifying the name. This is still the situation in C++20.
Just to be clear, the set of entities is not known at the point we refer to them. In the linked question, the base class depends on the template parameter, and we have only seen the primary base class template. The base class template may be specialized later and may have completely different member functions than the ones we've seen. So, any "solution" requires that a name without any obvious contextual dependence on the specialization find entities not yet declared which may be surprising.
Why It's Impossible
Any naive changes in this direction are either pretty big or have severe downsides, or both.
You could postpone all name lookup until instantiation time, discarding two phase lookup. That invites ODR violations which is silent UB, which is a huge downside.
You could restrict specialization so that you cannot specialize the base class later such that a different entity is found. That is difficult to diagnose, so it would likely be a new rule introducing silent UB.
You could opt in with a using declaration: using X::* as you propose or using class X as someone else suggested in a different context. This has the benefit of explicitness. It moves the problem a level up: if X is not dependent, presumably it should be found now, but if it is dependent, what happens? We can't instantiate it prior to instantiating the template we're in now. Thus, we can't interpret any names we see until instantiation. It has similar downsides to discarding two phase lookup.
Any of these changes would add complexity to an already complex area and would also pose significant backwards compatibility hurdles. None of them is a clear win.
Note: In C++20 they did make the rules more uniform by allowing ADL to find function templates with specified parameters to be found: f<int>(1).
Why It's Not a Problem
I doubt there would be consensus that this really is a problem. The linked question makes a poor argument. The derived class adds member function behavior to a certain base class, but a free function works better. These behaviors did not need member access, they aren't required by the language to be members, they can be found as non-members with ADL, and by using a free function they apply even when the static type you have is the base type. So, using inheritance for this is unnecessary coupling, and is a worse option.
Searching for 100s of places to add this-> and adding these 6 characters 100s of times strikes me as Code Bloat and Repetition when I have to templatize a base class
"Searching": The compiler will tell you when a name can't be found, which is better than silent bad behavior, such as making ODR violations easier to hit.
"templatize a base": Templatizing the base class doesn't trigger this. Templatizing the derived class and making the base dependent does. Yes, when templatizing the derived class, specifying that a bare name used in a way that is independent of the template parameter is in fact dependent may seem like boilerplate to some, but others might argue being explicit is clearer.
"100s of times": Seems hyperbolic.
These code patterns are used all the time in real world. Just look at CRTP. (comment on linked question)
Again, this only applies if the derived class is templated. I would dispute the commonality, but these idioms do exist and have a place.
Most importantly, though, is that CRTP is not a goal. CRTP is a hack. It's a C++ idiom because C++ lacks better facilities. CRTP allows a class to opt into certain behaviors that would otherwise be bothersome to write. Relevant C++ proposals do exist, but by and large, they have focused on making extension easier or removing boilerplate, and not on making CRTP, the hack, easier.
These are some that come to mind:
C++20: Comparisons
A very common use of CRTP is for things that require a lot of extra boilerplate. C++ required you to define operator== and operator!=, for instance. By opting into a CRTP base class, one could define only the primitive operation and have the other one generated.
C++20 fixed the underlying problem with comparisons. The typical member-wise comparison can be defaulted, and comparisons can be re-written so that != can invoke ==.
The problem is solved at the root, removing CRTP, not enhancing it.
C++20: Iterators to Ranges
Another common use of CRTP in the same vein as above is iterators. Writing custom iterators requires only a few fundamental operations: advance and dereference for forward iterators. But then there's a lot of extra seemingly unnecessary ceremony: pre-increment, post-increment, const iterators, typedefs, etc.
C++20 took a large step forward by introducing range concepts and the range library. The result is that it should be much less necessary to write a custom iterator. Ranges become a capable concept on their own, and there's a good suite of range combinators.
C++20: Concepts
C++ essentially has two systems for specifying an interface: virtual functions and concepts. They have trade-offs. Virtual functions are intrusive. But prior to C++20, concepts were implicit and emulated. One reason to use CRTP or inheritance generally would be to inject virtual functions. But in C++20, concepts are a language feature removing one big negative.
Future C++: Metaclasses
One value of CRTP in addition to the boilerplate reduction is satisfying an entire collection of multiple type requirements. A comparable class defines all the comparison operators. An clonable base defines a virtual destructor and clone.
This is the topic of metaclasses, which is not yet in C++.
See Also
See also the work on customization points which seems very interesting. And see the debates on unified function call syntax, which seems like we'll never get.
Summary
There's a very good question hiding in here about how C++20 makes it easier to reduce boilerplate, remove hacks like CRTP, and write better and clearer code. C++20 takes several steps in this regard, but they made the expression of intent easier, not a particular idiom.

Pertinence of void pointers

Looking through a colleague's code, I see that some of its handles are stored as void pointers.
// Class header
void* hSomeSdk;
// Class implementation
hSomeSdk = new SomeSDK(...);
((SomeSDK*)hSomeSdk)->DoSomeWork();
Now I know that sometimes handles are void pointers because it may be unknown before runtime what will be the actual type of the handle. Or that it can help when we need to share the pointer without revealing its actual structure. But this does not seem to be the case in my situation: it will always be SomeSDK and it is not shared outside the class where it is created. Also the author of this code is gone from the company.
Are there other reasons why it would make sense to have it be a void pointer?
Since this is a member variable, I'm gonna go out on a limb and say your colleague wanted to minimize dependencies. Including the header for SomeSDK is probably undesirable just to define a pointer. The colleague may have had one of two reasons as far as I can see from the code you show:
They just didn't know they can add a forward declarations like class SomeSDK; to allow defining pointers. Some programmers just aren't aware of it.
They couldn't forward declare it. If SomeSDK is not a class, but a type alias (aka typedef), then it's not possible to forward declare it exactly. One can only declare the class it aliases, but that in turn may be an implementation detail that's hard to keep track of. Even the standard library has a similar problem, that is why it provides iosfwd to make forward declaring standard stream types easier.
If the code is peppered with casts of this handle, then the design should have been reworked ages ago. Otherwise, if it's in one place (or a few at most) only, I can see why the people maintaining it could live with it peacefully.
Nope.
If I had to guess, the ex-colleague was unfamiliar with forward declarations and thus didn't know they could still do SomeSDK* in the header without including the entire SomeSDK definition.
Given the constraints you've mentioned, the only thing this pattern achieves is to eliminate some type safety, make the code harder to read/maintain, and generate a Stack Overflow question.
void* were popular and needed back in C. They are convenient in the sense that they can be easily cast to anything. If you need to cast from double* to char*, you have to make a mid cast to void*.
The problem with void* is that they are too flexible: they do not convey intentions of the writer, making them very unsafe especially in big projects.
In Object Oriented Design it is popular to create abstract interface classes (all members are virtual and not implemented) and make pointers to such classes and then instantiate various possible implementation depending on the usage.
However, nowadays, it is more recommended to work with templates (main advantage of C++ over other languages), as those are much faster and enable more compile-time optimization than OOD allowed. Unfortunately, working with templates is still a huge hassle - they have more complicated syntax and it is difficult to convey intentions of the writer to users about restrictions and demands of the template parameters (Concepts TS that solves this problem decently will be available in C++20 - currently there is only SFINAE, a horrible temporary solution from 20 years ago; while Reflection TS, that will greatly enhance generic programming in C++, is unlikely to be available even in C++23).

How is is_standard_layout useful?

From what I understand, standard layout allows three things:
Empty base class optimization
Backwards compatibility with C with certain pointer casts
Use of offsetof
Now, included in the library is the is_standard_layout predicate metafunction, but I can't see much use for it in generic code as those C features I listed above seem extremely rare to need checking in generic code. The only thing I can think of is using it inside static_assert, but that is only to make code more robust and isn't required.
How is is_standard_layout useful? Are there any things which would be impossible without it, thus requiring it in the standard library?
General response
It is a way of validating assumptions. You wouldn't want to write code that assumes standard layout if that wasn't the case.
C++11 provides a bunch of utilities like this. They are particularly valuable for writing generic code (templates) where you would otherwise have to trust the client code to not make any mistakes.
Notes specific to is_standard_layout
It looks to me like the (pseudo code) definition of is_pod would roughly be...
// note: applied recursively to all members
bool is_pod(T) { return is_standard_layout(T) && is_trivial(T); }
So, you need to know is_standard_layout in order to implement is_pod. Given that, we might as well expose is_standard_layout as a tool available to library developers. Also of note: if you have a use-case for is_pod, you might want to consider the possibility that is_standard_layout might actually be a better (more accurate) choice in that case, since POD is essentially a subset of standard layout.
I get the feeling that they added every conceivable variant of type evaluation, regardless of any obvious value, just in case someone might encounter a need sometime before the next standard comes out. I doubt if piling on these "extra" type properties adds a significant additional burden to compiler developers.
There is a nice discussion of standard layout here: Why is C++11's POD "standard layout" definition the way it is?
There is also a lot of good detail at cppreference.com: Non-static data members

Will C++ compiler generate code for each template type?

I have two questions about templates in C++. Let's imagine I have written a simple List and now I want to use it in my program to store pointers to different object types (A*, B* ... ALot*). My colleague says that for each type there will be generated a dedicated piece of code, even though all pointers in fact have the same size.
If this is true, can somebody explain me why? For example in Java generics have the same purpose as templates for pointers in C++. Generics are only used for pre-compile type checking and are stripped down before compilation. And of course the same byte code is used for everything.
Second question is, will dedicated code be also generated for char and short (considering that they both have the same size and there are no specialization).
If this makes any difference, we are talking about embedded applications.
I have found a similar question, but it did not completely answer my question: Do C++ template classes duplicate code for each pointer type used?
Thanks a lot!
I have two questions about templates in C++. Let's imagine I have written a simple List and now I want to use it in my program to store pointers to different object types (A*, B* ... ALot*). My colleague says that for each type there will be generated a dedicated piece of code, even though all pointers in fact have the same size.
Yes, this is equivalent to having both functions written.
Some linkers will detect the identical functions, and eliminate them. Some libraries are aware that their linker doesn't have this feature, and factor out common code into a single implementation, leaving only a casting wrapper around the common code. Ie, a std::vector<T*> specialization may forward all work to a std::vector<void*> then do casting on the way out.
Now, comdat folding is delicate: it is relatively easy to make functions you think are identical, but end up not being the same, so two functions are generated. As a toy example, you could go off and print the typename via typeid(x).name(). Now each version of the function is distinct, and they cannot be eliminated.
In some cases, you might do something like this thinking that it is a run time property that differs, and hence identical code will be created, and the identical functions eliminated -- but a smart C++ compiler might figure out what you did, use the as-if rule and turn it into a compile-time check, and block not-really-identical functions from being treated as identical.
If this is true, can somebody explain me why? For example in Java generics have the same purpose as templates for pointers in C++. Generics are only used for per-compile type checking and are stripped down before compilation. And of course the same byte code is used for everything.
No, they aren't. Generics are roughly equivalent to the C++ technique of type erasure, such as what std::function<void()> does to store any callable object. In C++, type erasure is often done via templates, but not all uses of templates are type erasure!
The things that C++ does with templates that are not in essence type erasure are generally impossible to do with Java generics.
In C++, you can create a type erased container of pointers using templates, but std::vector doesn't do that -- it creates an actual container of pointers. The advantage to this is that all type checking on the std::vector is done at compile time, so there doesn't have to be any run time checks: a safe type-erased std::vector may require run time type checking and the associated overhead involved.
Second question is, will dedicated code be also generated for char and short (considering that they both have the same size and there are no specialization).
They are distinct types. I can write code that will behave differently with a char or short value. As an example:
std::cout << x << "\n";
with x being a short, this print an integer whose value is x -- with x being a char, this prints the character corresponding to x.
Now, almost all template code exists in header files, and is implicitly inline. While inline doesn't mean what most folk think it means, it does mean that the compiler can hoist the code into the calling context easily.
If this makes any difference, we are talking about embedded applications.
What really makes a difference is what your particular compiler and linker is, and what settings and flags they have active.
The answer is maybe. In general, each instantiation of a
template is a unique type, with a unique implementation, and
will result in a totally independent instance of the code.
Merging the instances is possible, but would be considered
"optimization" (under the "as if" rule), and this optimization
isn't wide spread.
With regards to comparisons with Java, there are several points
to keep in mind:
C++ uses value semantics by default. An std::vector, for
example, will actually insert copies. And whether you're
copying a short or a double does make a difference in the
generated code. In Java, short and double will be boxed,
and the generated code will clone a boxed instance in some way;
cloning doesn't require different code, since it calls a virtual
function of Object, but physically copying does.
C++ is far more powerful than Java. In particular, it allows
comparing things like the address of functions, and it requires
that the functions in different instantiations of templates have
different addresses. Usually, this is not an important point,
and I can easily imagine a compiler with an option which tells
it to ignore this point, and to merge instances which are
identical at the binary level. (I think VC++ has something like
this.)
Another issue is that the implementation of a template in C++
must be present in the header file. In Java, of course,
everything must be present, always, so this issue affects all
classes, not just template. This is, of course, one of the
reasons why Java is not appropriate for large applications. But
it means that you don't want any complicated functionality in a
template; doing so loses one of the major advantages of C++,
compared to Java (and many other languages). In fact, it's not
rare, when implementing complicated functionality in templates,
to have the template inherit from a non-template class which
does most of the implementation in terms of void*. While
implementing large blocks of code in terms of void* is never
fun, it does have the advantage of offering the best of both
worlds to the client: the implementation is hidden in compiled
files, invisible in any way, shape or manner to the client.

Features of C++ that can't be implemented in C?

I have read that C++ is super-set of C and provide a real-time implementation by creating objects. Also C++ is closed to real world as it is enriched with Object Oriented concepts.
What all concepts are there in C++ that can not be implemented in C?
Some say that we can not over write methods in C then how can we have different flavors of printf()?
For example printf("sachin"); will print sachin and printf("%d, %s",count ,name); will print 1,sachin assuming count is an integer whose value is 1 and name is a character array initililsed with "sachin".
Some say data abstraction is achieved in C++, so what about structures?
Some responders here argues that most things that can be produced with C++ code can also be produced with C with enough ambition. This is true in some parts, but some things are inherently impossible to achieve unless you modify the C compiler to deviate from the standard.
Fakeable:
Inheritance (pointer to parent-struct in the child-struct)
Polymorphism (Faking vtable by using a group of function pointers)
Data encapsulation (opaque sub structures with an implementation not exposed in public interface)
Impossible:
Templates (which might as well be called preprocessor step 2)
Function/method overloading by arguments (some try to emulate this with ellipses, but never really comes close)
RAII (Constructors and destructors are automatically invoked in C++, so your stack resources are safely handled within their scope)
Complex cast operators (in C you can cast almost anything)
Exceptions
Worth checking out:
GLib (a C library) has a rather elaborate OO emulation
I posted a question once about what people miss the most when using C instead of C++.
Clarification on RAII:
This concept is usually misinterpreted when it comes to its most important aspect - implicit resource management, i.e. the concept of guaranteeing (usually on language level) that resources are handled properly. Some believes that achieving RAII can be done by leaving this responsibility to the programmer (e.g. explicit destructor calls at goto labels), which unfortunately doesn't come close to providing the safety principles of RAII as a design concept.
A quote from a wikipedia article which clarifies this aspect of RAII:
"Resources therefore need to be tied to the lifespan of suitable objects. They are acquired during initialization, when there is no chance of them being used before they are available, and released with the destruction of the same objects, which is guaranteed to take place even in case of errors."
How about RAII and templates.
It is less about what features can't be implemented, and more about what features are directly supported in the language, and therefore allow clear and succinct expression of the design.
Sure you can implement, simulate, fake, or emulate most C++ features in C, but the resulting code will likely be less readable, or maintainable. Language support for OOP features allows code based on an Object Oriented Design to be expressed far more easily than the same design in a non-OOP language. If C were your language of choice, then often OOD may not be the best design methodology to use - or at least extensive use of advanced OOD idioms may not be advisable.
Of course if you have no design, then you are likely to end up with a mess in any language! ;)
Well, if you aren't going to implement a C++ compiler using C, there are thousands of things you can do with C++, but not with C. To name just a few:
C++ has classes. Classes have constructors and destructors which call code automatically when the object is initialized or descrtucted (going out of scope or with delete keyword).
Classes define an hierarchy. You can extend a class. (Inheritance)
C++ supports polymorphism. This means that you can define virtual methods. The compiler will choose which method to call based on the type of the object.
C++ supports Run Time Information.
You can use exceptions with C++.
Although you can emulate most of the above in C, you need to rely on conventions and do the work manually, whereas the C++ compiler does the job for you.
There is only one printf() in the C standard library. Other varieties are implemented by changing the name, for instance sprintf(), fprintf() and so on.
Structures can't hide implementation, there is no private data in C. Of course you can hide data by not showing what e.g. pointers point to, as is done for FILE * by the standard library. So there is data abstraction, but not as a direct feature of the struct construct.
Also, you can't overload operators in C, so a + b always means that some kind of addition is taking place. In C++, depending on the type of the objects involved, anything could happen.
Note that this implies (subtly) that + in C actually is overridden; int + int is not the same code as float + int for instance. But you can't do that kind of override yourself, it's something for the compiler only.
You can implement C++ fully in C... The original C++ compiler from AT+T was infact a preprocessor called CFront which just translated C++ code into C and compiled that.
This approach is still used today by comeau computing who produce one of the most C++ standards compliant compilers there is, eg. It supports all of C++ features.
namespace
All the rest is "easily" faked :)
printf is using a variable length arguments list, not an overloaded version of the function
C structures do not have constructors and are unable to inherit from other structures they are simply a convenient way to address grouped variables
C is not an OO langaueage and has none of the features of an OO language
having said that your are able to imitate C++ functionality with C code but, with C++ the compiler will do all the work for you in compile time
What all concepts are there in C++
that can not be implemented in C?
This is somewhat of an odd question, because really any concept that can be expressed in C++ can be expressed in C. Even functionality similar to C++ templates can be implemented in C using various horrifying macro tricks and other crimes against humanity.
The real difference comes down to 2 things: what the compiler will agree to enforce, and what syntactic conveniences the language offers.
Regarding compiler enforcement, in C++ the compiler will not allow you to directly access private data members from outside of a class or friends of the class. In C, the compiler won't enforce this; you'll have to rely on API documentation to separate "private" data from "publicly accessible" data.
And regarding syntactic convenience, C++ offers all sorts of conveniences not found in C, such as operator overloading, references, automated object initialization and destruction (in the form of constructors/destructors), exceptions and automated stack-unwinding, built-in support for polymorphism, etc.
So basically, any concept expressed in C++ can be expressed in C; it's simply a matter of how far the compiler will go to help you express a certain concept and how much syntactic convenience the compiler offers. Since C++ is a newer language, it comes with a lot more bells and whistles than you would find in C, thus making the expression of certain concepts easier.
One feature that isn't really OOP-related is default arguments, which can be a real keystroke-saver when used correctly.
Function overloading
I suppose there are so many things namespaces, templates that could not be implemented in C.
There shouldn't be too much such things, because early C++ compilers did produce C source code from C++ source code. Basically you can do everything in Assembler - but don't WANT to do this.
Quoting Joel, I'd say a powerful "feature" of C++ is operator overloading. That for me means having a language that will drive you insane unless you maintain your own code. For example,
i = j * 5;
… in C you know, at least, that j is
being multiplied by five and the
results stored in i.
But if you see that same snippet of
code in C++, you don’t know anything.
Nothing. The only way to know what’s
really happening in C++ is to find out
what types i and j are, something
which might be declared somewhere
altogether else. That’s because j
might be of a type that has operator*
overloaded and it does something
terribly witty when you try to
multiply it. And i might be of a type
that has operator= overloaded, and the
types might not be compatible so an
automatic type coercion function might
end up being called. And the only way
to find out is not only to check the
type of the variables, but to find the
code that implements that type, and
God help you if there’s inheritance
somewhere, because now you have to
traipse all the way up the class
hierarchy all by yourself trying to
find where that code really is, and if
there’s polymorphism somewhere, you’re
really in trouble because it’s not
enough to know what type i and j are
declared, you have to know what type
they are right now, which might
involve inspecting an arbitrary amount
of code and you can never really be
sure if you’ve looked everywhere
thanks to the halting problem (phew!).
When you see i=j*5 in C++ you are
really on your own, bubby, and that,
in my mind, reduces the ability to
detect possible problems just by
looking at code.
But again, this is a feature. (I know I will be modded down, but at the time of writing only a handful of posts talked about downsides of operator overloading)