Access to protected member through member-pointer: is it a hack? - c++

We all know that members specified as protected in a base class can only be accessed through a derived class's own instance. This is a feature of the Standard, and it has been discussed on Stack Overflow multiple times:
Cannot access protected member of another instance from derived type's scope
Why can't my object access protected members of another object defined in common base class?
And others.
But it seems possible to work around this restriction with member pointers, as user chtz has shown me:
struct Base { protected: int value; };

struct Derived : Base
{
    void f(Base const& other)
    {
        //int n = other.value; // error: 'int Base::value' is protected within this context
        int n = other.*(&Derived::value); // ok??? why?
        (void) n;
    }
};
Live demo on coliru
Why is this possible? Is it a wanted feature, or a glitch somewhere in the implementation or the wording of the Standard?
From comments emerged another question: if Derived::f is called with an actual Base, is it undefined behaviour?

The fact that a member is not accessible using a class member access expression [expr.ref] (aclass.amember) due to access control [class.access] does not make this member inaccessible using other expressions.
The expression &Derived::value (whose type is int Base::*) is perfectly standard-compliant, and it designates the member value of Base. Then the expression a_base.*p, where p is a pointer to a member of Base and a_base is an instance of Base, is also standard-compliant.
So any standard-compliant compiler shall give the expression other.*(&Derived::value) defined behavior: it accesses the member value of other.
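To spell the trick out in two steps (a sketch of the same code as in the question, with the intermediate pointer named explicitly):

struct Base { protected: int value; };

struct Derived : Base
{
    void f(Base const& other)
    {
        int Base::*p = &Derived::value; // well-formed: the member is named
                                        // through Derived, which is allowed
                                        // inside Derived's own scope
        int n = other.*p;               // well-formed: .* performs no access
                                        // check; only the pointer's type
                                        // (int Base::*) matters here
        (void) n;
    }
};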

is it a hack?
In a similar vein to using reinterpret_cast, this can be dangerous and may potentially be a source of hard-to-find bugs. But it's well-formed and there's no doubt whether it should work.
To clarify the analogy: The behaviour of reinterpret_cast is also specified exactly in the standard and can be used without any UB. But reinterpret_cast circumvents the type system, and the type system is there for a reason. Similarly, this pointer to member trick is well formed according to the standard, but it circumvents the encapsulation of members, and that encapsulation (typically) exists for a reason (I say typically, since I suppose a programmer can use encapsulation frivolously).
[Is it] a glitch somewhere in the implementation or the wording of the Standard?
No, the implementation is correct. This is how the language has been specified to work.
A member function of Derived can obviously access &Derived::value, since value is a protected member of a base.
The result of that operation is a pointer to a member of Base. This can be applied to a reference to Base. Member access control does not apply to pointers to members: it applies only to the names of the members.
From comments emerged another question: if Derived::f is called with an actual Base, is it undefined behaviour?
Not UB. Base has the member.

Just to add to the answers and zoom in a bit on the horror I can read between your lines: if you see access specifiers as 'the law', policing you to keep you from doing 'bad things', I think you are missing the point. public, protected, private, const ... are all part of a system that is a huge plus for C++. Languages without it may have many merits, but when you build large systems such things are a real asset.
Having said that: I think it's a good thing that it is possible to get around almost all the safety nets provided to you. As long as you remember that 'possible' does not mean 'good'. This is why it should never be 'easy'. But for the rest - it's up to you. You are the architect.
Years ago I could simply do this (and it may still work in certain environments):
#define private public
Very helpful for 'hostile' external header files. Good practice? What do you think? But sometimes your options are limited.
So yes, what you show is kind of a breach in the system. But hey, what keeps you from deriving and handing out public references to the member? If horrible maintenance problems turn you on - by all means, why not?
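For instance, nothing stops you from doing this (a sketch, reusing the Base from the question; Leaky is a made-up name):

struct Leaky : Base
{
    int& expose() { return value; } // fine: a derived class may access the
                                    // protected member of its own instance
};

void breach()
{
    Leaky l;
    l.expose() = 42; // the 'protected' member is now writable by anyone
}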

Basically what you're doing is tricking the compiler, and this is supposed to work. I always see this kind of question, and people sometimes get bad results and sometimes it works, depending on how it converts to assembly code.
I remember seeing a case with the const keyword on an integer, where with some trickery the guy was able to change the value and successfully circumvent the compiler's awareness. The result was a wrong value for a simple mathematical operation. The reason is simple: assembly on x86 does make a distinction between constants and variables, because some instructions contain constants in their opcode. So, since the compiler believes it's a constant, it'll treat it as a constant and deal with it in an optimized way with the wrong CPU instruction, and bam, you have an error in the resulting number.
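A minimal sketch of that kind of trick (deliberately undefined behavior, shown only to illustrate why the result can go wrong):

#include <iostream>

int main()
{
    const int c = 42;
    *const_cast<int*>(&c) = 7;  // undefined behavior: modifying a const object
    std::cout << c + 1 << "\n"; // the compiler may have folded 42 straight into
                                // the generated instruction, so this can print 43
}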
In other words: The compiler will try to enforce all the rules it can enforce, but you can probably eventually trick it, and you may or may not get wrong results based on what you're trying to do, so you better do such things only if you know what you're doing.
In your case, the pointer &Derived::value is essentially the offset in bytes of the member from the beginning of the object. This is basically how the compiler accesses it, so the compiler:
Doesn't see any problem with permissions, because you're accessing value through Derived at compile time.
Can do it, because you're taking the offset in bytes in an object that has the same layout as Derived (well, obviously, the base).
So, you're not violating any rules. You successfully circumvented the compilation rules. You shouldn't do it, exactly because of the reasons described in the links you attached - it breaks OOP encapsulation - but, well, if you know what you're doing...

Related

Can we make virtual function inline [duplicate]

Pure virtual functions are those member functions that are virtual and have the pure-specifier ( = 0; )
Clause 10.4 paragraph 2 of C++03 tells us what an abstract class is and, as a side note, the following:
[Note: a function declaration cannot provide both a pure-specifier and a definition —end note] [Example:

struct C {
    virtual void f() = 0 { }; // ill-formed
};

—end example]
For those who are not very familiar with the issue, please note that pure virtual functions can have definitions, but the above-mentioned clause forbids such definitions to appear inline (lexically in-class). (For uses of defining pure virtual functions see, for example, this GotW.)
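In other words, this is the legal spelling (a minimal sketch):

struct C {
    virtual void f() = 0; // pure-specifier: no body allowed here
};

void C::f() { } // legal: a pure virtual function may still be defined,
                // just not lexically in-class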
Now for all other kinds and types of functions an in-class definition is allowed, and this restriction seems at first glance absolutely artificial and inexplicable. Come to think of it, it seems such on second and subsequent glances :) But I believe the restriction wouldn't be there if there weren't a specific reason for it.
My question is: does anybody know those specific reasons? Good guesses are also welcome.
Notes:
MSVC does allow PVF's to have inline definitions. So don't get surprised :)
the word inline in this question does not refer to the inline keyword. It is supposed to mean lexically in-class
In the SO thread "Why is a pure virtual function initialized by 0?" Jerry Coffin provided this quote from Bjarne Stroustrup’s The Design & Evolution of C++, section §13.2.3, where I've added some emphasis of the part I think is relevant:
The curious =0 syntax was chosen over the obvious alternative of introducing a new keyword pure or abstract because at the time I saw no chance of getting a new keyword accepted. Had I suggested pure, Release 2.0 would have shipped without abstract classes. Given a choice between a nicer syntax and abstract classes, I chose abstract classes. Rather than risking delay and incurring the certain fights over pure, I used the traditional C and C++ convention of using 0 to represent "not there." The =0 syntax fits with my view that a function body is the initializer for a function and also with the (simplistic, but usually adequate) view of the set of virtual functions being implemented as a vector of function pointers. [ … ]
So, when choosing the syntax Bjarne was thinking of a function body as a kind of initializer part of the declarator, and =0 as an alternate form of initializer, one that indicated “no body” (or in his words, “not there”).
It stands to reason that one cannot both indicate "not there" and have a body - in that conceptual picture.
Or, still in that conceptual picture, have two initializers.
Now, that's as far as my telepathic powers, google-foo and soft-reasoning goes. I surmise that nobody's been Interested Enough™ to formulate a proposal to the committee about having this purely syntactical restriction lifted, and following up with all the work that that entails. Thus it's still that way.
You shouldn't have so much faith in the standardization committee. Not everything has a deep reason to explain it. Some things are the way they are just because at first nobody thought otherwise, and afterwards nobody thought that changing them was important enough (I think that is the case here); for things old enough it could even be an artifact of the first implementation. Some are the result of evolution - there was a deep reason at one time, but the reason was removed and the initial decision was never reconsidered (it could also be the case here, where the initial decision was that any definition of the pure function was forbidden). Some are the result of negotiation between different points of view, and the result lacks coherence, but this lack was deemed necessary to reach consensus.
Good guesses... well, considering the situation:
it is legal to declare the function inline and provide an explicitly inline body (outside the class), so there's clearly no objection to the only practical implication of being declared inside the class.
I see no potential ambiguities or conflicts introduced in the grammar, so no logical reason for the exclusion of function definitions in situ.
My guess: the use for bodies for pure virtual functions was realised after the = 0 | { ... } grammar was formulated, and the grammar simply wasn't revised. It's worth considering that there are a lot of proposals for language changes / enhancements - including those to make things like this more logical and consistent - but the number that are picked up by someone and written up as formal proposals is much smaller, and the number of those the Committee has time to consider, and believes the compiler-vendors will be prepared to implement, is much smaller again. Things like this need a champion, and perhaps you're the first person to see an issue in it. To get a feel for this process, check out http://www2.research.att.com/~bs/evol-issues.html.
Good guesses are welcome you say?
I think the = 0 in the declaration comes from having the implementation in mind. Most likely this definition means that you get a NULL entry in the class's vtable - the table where the addresses of a class's member functions are stored for lookup at runtime.
But actually, when you put a definition of the function in your *.cpp file, you introduce a name into the object file for the linker: an address in the *.o file where a specific function can be found.
The basic linker then doesn't need to know about C++ anymore. It can just link things together, even though you declared the function as = 0.
I think I read somewhere that what you describe is possible, although I forget the exact behaviour :-)...
Leaving destructors aside, implementations of pure virtual functions are a strange thing, because they never get called in the natural way. I.e. if you have a pointer or reference to your Base class, the underlying object will always be some Derived that overrides the function, and that override will always get called.
The only way to actually get the implementation to be called is using the Base::func() syntax from one of the derived class's overrides.
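A minimal sketch of that call pattern:

struct Base {
    virtual void f() = 0;
};

void Base::f() { /* shared fallback work */ } // out-of-class definition

struct Derived : Base {
    void f() { Base::f(); } // the qualified call is the only way to reach
                            // the pure virtual function's body
};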
This actually, in some ways, makes it a better target for inlining, as at the point where the compiler wants to invoke it, it is always clear which overload is being called.
Also, if implementations for pure virtual functions were forbidden, there would be an obvious workaround: some other (probably protected) non-virtual function in the Base class that you could just call in the regular way from your derived function. Of course the scope would be less limited, in that you could call it from any function.
(By the way, I am under the assumption that Base::f() can only be called with this syntax from Derived::f() and not from Derived::anyOtherFunc(). Am I right with this assumption?).
Pure virtual destructors are a different story, in a sense. They are used as a technique simply to make a class abstract - preventing anyone from creating an instance of it - when there are no other pure virtual functions; and a pure virtual destructor must still be given a definition, because derived classes' destructors implicitly call it.
The answer to the actual question of "why" it is not permitted is really just because the standards committee said so, but my answer sheds some light on what we are trying to achieve anyway.

Unions as Base Class

The standard specifies that a union cannot be used as a base class, but is there any specific reasoning for this? As far as I understand, unions can have constructors, destructors, member variables, and methods to operate on those variables. In short, a union can encapsulate a datatype and state which might be accessed through member functions. Thus in most common terms it qualifies for being a class, and if it can act as a class, then why is it restricted from acting as a base class?
Edit: Though the answers try to explain the reasoning, I still do not understand how a union as a derived class is worse than a union as just a class. So in the hope of getting a more concrete answer and reasoning I will push this one for a bounty. No offence to the already posted answers - thanks for those!
Tony Park gave an answer which is pretty close to the truth. The C++ committee basically didn't think it was worth the effort to make unions a strong part of C++, similarly to the treatment of arrays as legacy stuff we had to inherit from C but didn't really want.
Unions have problems: if we allow non-POD types in unions, how do they get constructed? It can certainly be done, but not necessarily safely, and any consideration would require committee resources. And the final result would be less than satisfactory, because what is really required in a sane language is discriminated unions, and bare C unions could never be elevated to discriminated unions in a way compatible with C (that I can imagine, anyhow).
To elaborate on the technical issues: since you can wrap a POD-components-only union in a struct without losing anything, there's no advantage to allowing unions as bases. With POD-only union components, there's no problem with explicit constructors simply assigning one of the components, nor with using a bitblit (memcpy) for the compiler-generated copy constructor (or assignment).
Such unions, however, aren't useful enough to bother with except to retain them so existing C code can be considered valid C++. These POD-only unions are broken in C++ because they fail to retain a vital invariant they possess in C: any data type can be used as a component type.
To make unions useful, we must allow constructable types as members. This is significant because it is not acceptable to merely assign a component in a constructor body, either of the union itself, or any enclosing struct: you cannot, for example, assign a string to an uninitialised string component.
It follows that one must invent some rules for initialising union components with mem-initialisers, for example:

union X { std::string a; std::string b; X(std::string q) : a(q) {} };
But now the question is: what is the rule? Normally the rule is you must initialise every member and base of a class, if you do not do so explicitly, the default constructor is used for the remainder, and if one type which is not explicitly initialised does not have a default constructor, it's an error [Exception: copy constructors, the default is the member copy constructor].
Clearly this rule can't work for unions: the rule has to be instead: if the union has at least one non-POD member, you must explicitly initialise exactly one member in a constructor. In this case, no default constructor, copy constructor, assignment operator, or destructor will be generated and if any of these members are actually used, they must be explicitly supplied.
So now the question becomes: how would you write, say, a copy constructor? It is of course quite possible to do and get right if you design your union the way, say, X-Windows event unions are designed - with the discriminant tag in each component - but you will have to use placement operator new to do it, and you will have to break the rule I wrote above, which at first glance appeared to be correct!
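To illustrate the kind of manual management involved, here is a sketch of a hand-rolled discriminated union with a non-POD member (C++11 unrestricted unions; the names are made up):

#include <new>
#include <string>

struct Value {
    enum Tag { Int, Str } tag;
    union {
        int i;
        std::string s; // non-POD member: must be managed by hand
    };
    Value(int v) : tag(Int), i(v) {}
    Value(const std::string& v) : tag(Str) { new (&s) std::string(v); }
    Value(const Value& other) : tag(other.tag) { // hand-written copy constructor
        if (tag == Str) new (&s) std::string(other.s);
        else i = other.i;
    }
    ~Value() {
        if (tag == Str) s.~basic_string(); // explicit destructor call
    }
};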
What about default constructor? If you don't have one of those, you can't declare an uninitialised variable.
There are other cases where you can determine the component externally and use placement new to manage a union externally, but that isn't a copy constructor. The fact is, if you have N components you'd need N constructors, and C++ has a broken idea that constructors use the class name, which leaves you rather short of names and forces you to use phantom types to allow overloading to choose the right constructor... and you can't do that for the copy constructor, since its signature is fixed.
Ok, so are there alternatives? Probably, yes, but they're not so easy to dream up, and harder to convince over 100 people that it's worthwhile to think about in a three day meeting crammed with other issues.
It is a pity the committee did not implement the rule above: unions are mandatory for aligning arbitrary data, and external management of the components is not really that hard to do manually; it is trivial and completely safe when the code is generated by a suitable algorithm. In other words, the rule is mandatory if you want to use C++ as a compiler target language and still generate readable, portable code. Such unions with constructable members have many uses, but the most important one is to represent the stack frame of a function containing nested blocks: each block has its local data in a struct, and each struct is a union component. There is no need for any constructors or such; the compiler will just use placement new. The union provides alignment and size, and cast-free component access.
[And there is no other conforming way to get the right alignment!]
Therefore the answer to your question is: you're asking the wrong question. There's no advantage to POD-only unions being bases, and they certainly can't be derived classes because then they wouldn't be PODs. To make them useful, some time is required to understand why one should follow the principle used everywhere else in C++: missing bits aren't an error unless you try to use them.
A union is a type that can be used as any one of its members, depending on which member has been set - only that member can later be read.
When you derive from a type, the derived type inherits the base type - the derived type can be used wherever the base type could be. If you could derive from a union, the derived class could be used (not implicitly, but explicitly, through naming the member) wherever any of the union members could be used - but among those members only one could be legally accessed. The problem is that the data on which member has been set is not stored in the union.
To avoid this subtle yet dangerous contradiction, which in fact subverts the type system, deriving from a union is not allowed.
Bjarne Stroustrup said 'there seems little reason for it' in The Annotated C++ Reference Manual.
The title asks why unions can't be a base class, but the question appears to be about unions as a derived class. So, which is it?
There's no technical reason why unions can't be a base class; it's just not allowed. A reasonable interpretation would be to think of the union as a struct whose members happen to potentially overlap in memory, and to consider the derived class as a class that inherits from this (rather odd) struct. If you need that functionality, you can usually persuade most compilers to accept an anonymous union as a member of a struct. Here's an example that's suitable for use as a base class. (And there's an anonymous struct in the union for good measure.)
struct V3 {
    union {
        struct {
            float x, y, z;
        };
        float f[3];
    };
};
The rationale for unions as a derived class is probably simpler: the result wouldn't be a union. Unions would have to be the union of all their members, and all of their bases. That's fair enough, and might open up some interesting template possibilities, but you'd have a number of limitations (all bases and members would have to be POD -- and would you be able to inherit twice, because a derived type is inherently non-POD?), this type of inheritance would be different from the other type the language sports (OK, not that this has stopped C++ before) and it's sort of redundant anyway -- the existing union functionality would do just as well.
Stroustrup says this in the D&E book:
As with void *, programmers should know that unions ... are inherently dangerous, should be avoided wherever possible, and should be handled with special care when actually needed.
(The elision doesn't change the meaning.)
So I imagine the decision is arbitrary, and he just saw no reason to change the union functionality (it works fine as-is with the C subset of C++), and so didn't design any integration with the new C++ features. And when the wind changed, it got stuck that way.
I think you got the answer yourself in your comments on EJP's answer.
I think unions are only included in C++ at all in order to be backwards compatible with C. I guess unions seemed like a good idea in 1970, on systems with tiny memory spaces. By the time C++ came along I imagine unions were already looking less useful.
Given that unions are pretty dangerous anyway, and not terribly useful, the vast new opportunities for creating bugs that inheriting from unions would create probably just didn't seem like a good idea :-)
Here's my guess for C++03.
As per §9.5/1, in C++03 unions cannot have virtual functions. The whole point of a meaningful derivation is to be able to override behaviors in the derived class. If a union cannot have virtual functions, there is no point in deriving from a union.
Hence the rule.
You can inherit the data layout of a union using the anonymous union feature from C++11.
#include <cstddef>

template <size_t N, typename T>
struct VecData {
    union {
        struct {
            float x;
            float y;
            float z;
        };
        float a[N];
    };
};

template <size_t N, typename T>
class Vec : public VecData<N, T> {
    // methods..
};
In general it's almost always better not to work with unions directly but to enclose them within a struct or class. Then you can base your inheritance off the struct outer layer and use unions within if you need to.

Why is std::type_info polymorphic?

Is there a reason why std::type_info is specified to be polymorphic? The destructor is specified to be virtual (and there's a comment to the effect of "so that it's polymorphic" in The Design and Evolution of C++). I can't really see a compelling reason why. I don't have any specific use case, I was just wondering if there ever was a rationale or story behind it.
Here are some ideas that I've come up with and rejected:
It's an extensibility point - implementations might define subclasses, and programs might then try to dynamic_cast a std::type_info to another, implementation-defined derived type. This is possibly the reason, but it seems that it's just as easy for implementations to add an implementation-defined member, which could possibly be virtual. Programs wishing to test for these extensions would necessarily be non-portable anyway.
It's to ensure that derived types are destroyed properly when deleting a base pointer. But there are no standard derived types, users can't define useful derived types because type_info has no standard public constructors, and so deleting a type_info pointer is never both legal and portable. And the derived types aren't useful because they can't be constructed - the only use I know for such non-constructible derived types is in the implementation of things like the is_polymorphic type trait.
It leaves open the possibility of metaclasses with customized types - each real polymorphic class A would get a derived "metaclass" A__type_info, which derives from type_info. Perhaps such derived classes could expose members that call new A with various constructor arguments in a type-safe way, and things like that. But making type_info polymorphic itself actually makes such an idea basically impossible to implement, because you'd have to have metaclasses for your metaclasses, ad infinitum, which is a problem if all the type_info objects have static storage duration. Maybe barring this is the reason for making it polymorphic.
There's some use for applying RTTI features (other than dynamic_cast) to std::type_info itself, or someone thought that it was cute, or embarrassing if type_info wasn't polymorphic. But given that there's no standard derived type, and no other classes in the standard hierarchy which one might reasonably try cross-cast to, the question is: what? Is there a use for expressions such as typeid(std::type_info) == typeid(typeid(A))?
It's because implementers will create their own private derived type (as I believe GCC does). But, why bother specifying it? Even if the destructor wasn't specified as virtual and an implementer decided that it should be, surely that implementation could declare it virtual, because it doesn't change the set of allowed operations on type_info, so a portable program wouldn't be able to tell the difference.
It's something to do with compilers with partially compatible ABIs coexisting, possibly as a result of dynamic linking. Perhaps implementers could recognize their own type_info subclass (as opposed to one originating from another vendor) in a portable way if type_info was guaranteed to be virtual.
The last one is the most plausible to me at the moment, but it's pretty weak.
I assume it's there for the convenience of implementers. It allows them to define extended type_info classes, and delete them through pointers to type_info at program exit, without having to build in special compiler magic to call the correct destructor, or otherwise jump through hoops.
surely that implementation could declare it virtual, because it doesn't change the set of allowed operations on type_info, so a portable program wouldn't be able to tell the difference.
I don't think that's true. Consider the following:
#include <typeinfo>

struct A {
    int x;
};

struct B {
    int x;
};

int main() {
    // allowed: the operand type is polymorphic, so this compiles (and yields null)
    const A *a1 = dynamic_cast<const A*>(&typeid(int));
    B b;
    // ill-formed: B is not polymorphic, so this dynamic_cast is not allowed
    const A *a2 = dynamic_cast<const A*>(&b);
}
Whether it's reasonable or not, the first dynamic cast is allowed (and evaluates to a null pointer), whereas the second dynamic cast is not allowed. So, if type_info was defined in the standard to have the default non-virtual destructor, but an implementation added a virtual destructor, then a portable program could tell the difference[*].
Seems simpler to me to put the virtual destructor in the standard, than to either:
a) put a note in the standard that, although the class definition implies that type_info has no virtual functions, it is permitted to have a virtual destructor.
b) determine the set of programs which can distinguish whether type_info is polymorphic or not, and ban them all. They may not be very useful or productive programs, I don't know, but to ban them you have to come up with some standard language that describes the specific exception you're making to the normal rules.
Therefore I think that the standard has to either mandate the virtual destructor, or ban it. Making it optional is too complex (or perhaps I should say, I think it would be judged unnecessarily complex. Complexity never stopped the standards committee in areas where it was considered worthwhile...)
If it was banned, though, then an implementation could:
add a virtual destructor to some derived class of type_info
derive all of its typeinfo objects from that class
use that internally as the polymorphic base class for everything
that would solve the situation I described at the top of the post, but the static type of a typeid expression would still be const std::type_info, so it would be difficult for implementations to define extensions where programs can dynamic_cast to various targets to see what kind of type_info object they have in a particular case. Perhaps the standard hoped to allow that, although an implementation could always offer a variant of typeid with a different static type, or guarantee that a static_cast to a certain extension class will work, and then let the program dynamic_cast from there.
In summary, as far as I know the virtual destructor is potentially useful to implementers, and removing it doesn't gain anyone anything other than that we wouldn't be spending time wondering why it's there ;-)
[*] Actually, I haven't demonstrated that. I've demonstrated that an illegal program would, all else being equal, compile. But an implementation could perhaps work around that by ensuring that all isn't equal, and that it doesn't compile. Boost's is_polymorphic isn't portable, so while it's possible for a program to test that a class that should be polymorphic is, there may be no way for a conforming program to test that a class that shouldn't be polymorphic isn't. I think though that even if that's impossible, proving it, in order to remove one line from the standard, is quite a lot of effort.
The C++ standard says that typeid returns an object of type type_info, OR AN IMPLEMENTATION-DEFINED subclass thereof. So... I guess this is pretty much the answer. So I don't see why you reject your points 1 and 2.
Paragraph 5.2.8 clause 1 of the current C++ standard reads:
The result of a typeid expression is an lvalue of static type const std::type_info (18.5.1) and dynamic type const std::type_info or const name where name is an implementation-defined class derived from std::type_info which preserves the behavior described in 18.5.1. The lifetime of the object referred to by the lvalue extends to the end of the program. Whether or not the destructor is called for the type_info object at the end of the program is unspecified.
Which in turn means that the following code is legal and fine:

const type_info& x = typeid(expr);

and this may require that type_info be polymorphic.
3/ It leaves open the possibility of metaclasses with customized types - each real polymorphic class A would get a derived "metaclass" A__type_info, which derives from type_info. Perhaps such derived classes could expose members that call new A with various constructor arguments in a type-safe way, and things like that. But making type_info polymorphic itself actually makes such an idea basically impossible to implement, because you'd have to have metaclasses for your metaclasses, ad infinitum, which is a problem if all the type_info objects have static storage duration. Maybe barring this is the reason for making it polymorphic.
Clever...
Anyway, I disagree with this reasoning: such an implementation could easily rule out metaclasses for types derived from type_info, including type_info itself.
About the simplest "global" id you can have in C++ is a class name, and typeinfo provides a way to compare such ids for equality. But the design is so awkward and limited that you then need to wrap typeinfo in some wrapper class, e.g. to be able to put instances in collections. Andrei Alexandrescu did that in his "Modern C++ Design", and I think that typeinfo wrapper is part of the Loki library; there's probably one also in Boost; and it's pretty easy to roll your own, e.g. see my own wrapper.
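A minimal sketch of such a wrapper (illustrative only - not Loki's or Boost's actual class):

#include <typeinfo>

class TypeId {
    const std::type_info* info_;
public:
    TypeId(const std::type_info& info) : info_(&info) {}
    bool operator==(const TypeId& o) const { return *info_ == *o.info_; }
    bool operator<(const TypeId& o) const    // ordering, so instances can be
    { return info_->before(*o.info_) != 0; } // used as keys in collections
    const char* name() const { return info_->name(); }
};

Note that it holds the type_info by pointer and never deletes it, which is why no virtual destructor is needed for this use.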
But even for such a wrapper there's not in general any need for a virtual destructor in typeinfo.
The question is therefore not so much "huh, why is there a virtual destructor" but rather, as I see it, "huh, why is the design so backward, awkward and not directly usable"? And I'd put that down to the standardization process. For example, iostreams are not exactly examples of superb design, either; not something to emulate.

How is dynamic_cast typically implemented?

Is the type check a mere integer comparison? Or would it make sense to have a GetTypeId virtual function to distinguish, which would make it an integer comparison?
(Just don't want things to be a string comparison on the class names)
EDIT: What I mean is, if I'm often expecting the wrong type, would it make sense to use something like:
#include <cstddef>

struct Token
{
    enum {
        AND,
        OR,
        IF
    };
    virtual std::size_t GetTokenId() = 0;
};

struct AndToken : public Token
{
    std::size_t GetTokenId() { return AND; }
};
And use the GetTokenId member instead of relying on dynamic_cast.
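Usage would then look something like this (a sketch, given the Token above):

void handle(Token& t)
{
    if (t.GetTokenId() == Token::AND) // one virtual call plus an integer
    {                                 // compare - no RTTI machinery involved
        AndToken& a = static_cast<AndToken&>(t);
        // ... work with a ...
    }
}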
The functionality of the dynamic_cast goes far beyond a simple type check. If it was just a type check, it would be very easy to implement (something like what you have in your original post).
In addition to type checking, dynamic_cast can perform casts to void * and hierarchical cross-casts. These kinds of casts conceptually require some ability to traverse the class hierarchy in both directions (up and down). The data structures needed to support such casts are more complicated than a mere scalar type id. The information dynamic_cast uses is part of RTTI.
Trying to describe it here would be counterproductive. I used to have a good link that described one possible implementation of RTTI... will try to find it.
I don't know the exact implementation, but here is an idea how I would do it:
Casting from Derived* to Base* can be done at compile time. Casting between two unrelated polymorphic types can be done at compile time too (just return NULL).
Casting from Base* to Derived* needs to be done at run time, because multiple derived classes are possible. The identification of the dynamic type can be done using the virtual method table bound to the object (that's why it requires polymorphic classes).
This VMT probably contains extra information about the base classes and their data offsets. These data offsets are relevant when multiple inheritance is involved and are added to the source pointer to make it point to the right location.
If the desired type was not found among the base classes, dynamic_cast would return null.
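A rough sketch of the down-cast part of that idea (not how real RTTI is laid out, just the flavor of it):

#include <typeinfo>

struct Base { virtual ~Base() {} };
struct Derived : Base {};

// identify the dynamic type via the vtable-attached type_info, then let the
// compiler apply any pointer adjustment; unlike dynamic_cast, this exact-match
// check would miss classes further derived from Derived
Derived* my_down_cast(Base* b)
{
    if (b != 0 && typeid(*b) == typeid(Derived))
        return static_cast<Derived*>(b);
    return 0; // not (exactly) a Derived
}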
In some of the original compilers, you are correct, they used string comparison.
As a result, dynamic_cast<> was very slow (relatively speaking), as the class hierarchy was traversed and each step up/down the hierarchy chain required a string compare against the class name.
This led to a lot of people developing their own casting techniques. This was nearly always ultimately futile, as it required each class to be annotated correctly, and when things went wrong it was nearly impossible to trace the error.
But that is also ancient history.
I am not sure how it is done now, but it definitely does not involve string comparison. Doing it yourself is also a bad idea (never do work that the compiler is already doing). Any attempt you make will not be as fast or as accurate as the compiler's; remember that years of development have gone into making the compiler code as quick as possible (and it will always be correct).
The compiler cannot divine additional information you may have and stick it in dynamic_cast. If you know certain invariants about your code and you can show that your manual casting mechanism is faster, do it yourself. It doesn't really matter how dynamic_cast is implemented in that case.

Is it always evil to have a struct with methods?

I've just been browsing and spotted the following...
When should you use a class vs a struct in C++?
The consensus there is that, by convention, you should only use struct for POD, no methods, etc.
I've always felt that some types were naturally structs rather than classes, yet could still have a few helper functions as members. The struct should still be POD by most of the usual rules - in particular it must be safe to copy using memcpy. It must have all member data public. But it still makes sense to me to have helper functions as members. I wouldn't even necessarily object to a private method, though I don't recall ever doing this myself. And although it breaks the normal POD rules, I wouldn't object to a struct having constructors, provided they were just initialise-a-few-fields constructors (overriding assignment or destructors would definitely be against the rules).
To me a struct is intuitively a collection of fields - a data structure node or whatever - whereas a class is an abstraction. The logical place to put the helper functions for your collection-of-fields may well be within the struct.
I even think I once read some advice along these lines, though I don't remember where.
Is this against accepted best practice?
EDIT - POD (Plain Old Data) is misrepresented by this question. In particular, a struct can be non-POD purely because a member is non-POD - e.g. an aggregate with a member of type std::string. That aggregate must not be copied with memcpy. In case of confusion, see here.
For what it's worth, all the standard STL functors are defined as structs, and their sole purpose is to have member functions; STL functors aren't supposed to have state.
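For instance, std::less is essentially just this (simplified; the real C++03 one also derives from binary_function):

template <typename T>
struct less {
    // a struct whose only purpose is its call operator - no state at all
    bool operator()(const T& x, const T& y) const { return x < y; }
};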
EDIT: Personally, I use struct whenever a class has all public members. It matters little, so long as one is consistent.
As far as the language is concerned, it doesn't matter, except for default private vs. public access. The choice is subjective.
I'd personally say use struct for PODs, but remember that "POD" doesn't mean "no member functions". It means no virtual functions, constructors, destructor, or operator=.
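So something like the following is still a POD (a minimal sketch):

#include <cmath>

struct Point {
    float x, y; // all public data, no virtuals, no special member functions

    // a plain helper member function does not affect POD-ness
    float length() const { return std::sqrt(x * x + y * y); }
};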
Edit: I also use structs for simple public-access aggregates of data, even if their members aren't PODs.
I tend to use struct a lot.
For the "traditional" OOP classes (the ones that represent a specific "thing"), I tend to use class simply because it's a common convention.
But most of my classes aren't really OOP objects. They tend to be functors, traits classes, and all sorts of other more abstract code concepts, things I use to express myself rather than to model specific "things". And those I usually make structs. It saves me having to type the initial public:, and so it makes the class definition shorter and easier to get an overview of.
I also lean towards making larger, more complex classes class. Except this rarely affects anything, because I lean even more towards refactoring large, complex classes... And then I'm left with small simple ones, which might be made struct's.
But in either case, I'd never consider it "evil". "Evil" is when you do something that actively obfuscates your code, or when you make something far more complex than necessary.
But every C++ programmer knows that struct and class mean virtually the same thing. You're not going to be confused for long when you see struct Animal {...}; you know that it's a class designed to model an animal. So no, it's not "evil" no matter which way you do it.
When I have control over the style guides, everything is defined as a struct. Historically, I wanted all my objects to be one or the other, since I was taught it was undefined behavior to forward declare a struct as a class, and vice versa (I'm not sure if this was ever actually true; it's just what I was told). And really, struct has the more reasonable default state.
I still do the same because I've never been convinced of the value of using it as a form of documentation. If you need to convey that an object only has public members, or doesn't have any member functions, or whatever arbitrary line you chose to draw between the two, you have actual documentation available for that. Since you never actually use the struct or class keyword when using the type, you would need to go hunting for the definition anyhow if you want the information. And with everybody having their own opinion on what struct actually means, its usefulness as self-documentation is reduced. So I shoot for consistency: everything as one or the other. In this case, struct.
It's not a popular way of doing things, to say the least. Fortunately for everybody involved, I very rarely have control over the coding standards.
I use a struct whenever I need an aggregate of public data members (as opposed to class, which I use for types that come with a certain level of abstractions and have their data private).
Whether or not such a data aggregate has member functions doesn't matter to me. In fact, I rarely ever write a struct without also providing at least one constructor for it.
You said, "To me a struct is intuitively a collection of fields - a data structure node or whatever". What you are describing is Plain Old Data. The C++ standard does have an opinion of what POD means with respect to C++ language features. I suggest that, where possible, you adopt the meaning of POD used in C++0x.
I need to find a reference for this, but I thought that, as far as type definitions are concerned, the keywords struct and class are synonyms in C++0x. The only difference is that class defaults to private members, while struct defaults to public. This means that you'll never see compiler error messages like "Type X first seen using struct, now seen using class." once we're all using C++0x.
I think that it is wrong to have strict rules on when to use class and when to use struct. If consistency is important to you, then go ahead and make a coding standard with whatever sort of rules you like. Just make sure everyone knows that your coding standard relates to how you want your code to look - it doesn't relate to the underlying language features at all.
Maybe to give an example where I break all your rules: you say "To me a struct is intuitively a collection of fields - a data structure node or whatever - whereas a class is an abstraction." When I want to write a class that implements some abstraction, I define the interface as a struct with pure virtual methods (egad!) - this is exposed in a header, as is a factory method to construct my concrete class. The concrete class is not exposed in a public header at all. Since the concrete class is a secret, it can be implemented however I like, and it makes no difference to external code.
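A minimal sketch of that pattern (the names are made up):

// in the public header
struct Renderer { // the abstraction: a struct of pure virtual methods
    virtual void draw() = 0;
    virtual ~Renderer() {}
};

Renderer* makeRenderer(); // factory; callers never see the concrete type

// in the .cpp file only
class GlRenderer : public Renderer {
    virtual void draw() { /* ... */ }
};

Renderer* makeRenderer() { return new GlRenderer; }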
I'd suggest that if you think that the keywords struct and class have any relation to the concept of POD, then you're not understanding C++ yet.