Given this:
struct { int x; } ix;
struct A { A() {}; int x; };
A ia;
Which of these is true?
a. ix is an object
b. ia is an object
c. both are objects
d. both are not objects.
Many of these answers have ignored the C++ tag. In C++, "an object is a region of storage. [Note: a function is not an object, regardless of whether or not it occupies storage in the same way that objects do.]" (The C++ Standard, 1.8/1).
If the homework question is about C++, then no other definition of object is applicable, not even "anything that is visible or tangible and is relatively stable in form" (dictionary.reference.com). It's not asking for your opinion about OOP principles, it's in effect asking whether ix and ia are variables.
Since it's homework I'll not tell you the answer, but do note that struct { int x; } ix; is not the same thing as struct ix { int x; };.
On the other hand, if the homework assignment is about OOP principles, then knock yourself out with whatever definition your lecturer has given you of "object". Since I don't know what that is, I can't tell you what answer he'll consider correct...
Given the C++ tag, the answer is pretty much "take your choice."
The C standard defines an object as meaning (in essence) anything that has an address, including all instances of native/primitive types (e.g. int). Since C++ depends so heavily on C, that definition still carries some weight in C++. By this definition, essentially every variable is an object, and so are a few other things (e.g. character string literals, dynamically allocated blocks of memory).
In Smalltalk (at rather the opposite extreme) the answer would be none of them is an object -- an object never has public data. Its behavior is defined entirely in terms of responses to messages.
The word "object" is a rather ambiguous specification without some more context, but in general objects have identity, behavior, and state.
Neither ix nor ia have all three; ix fails because it lacks identity or behavior, and ia fails because it has no behavior. Both are essentially just blobs of data.
There are two commonly used definitions of "object" in C++.
One is official according to the C++ standard, and says that everything that has storage allocated for it is an object. A struct is an object, an int is an object, a bool is an object, a pointer is an object, a string literal is an object, and so on. By this definition, ix, ia and x are all objects. But this probably isn't what your teacher meant. You have to be a bit of a language lawyer to use this definition, and it's not that widely known among "average" C++ users. It's also not a very relevant definition for someone just learning the language.
The definition you are probably expected to use is that of an "object" in the object-oriented sense. Here (at least in the C++ family of languages), an object is typically meant to be an instance of a class.
Which leaves the next obvious question: Is an instance of a struct also an object? Depends. In C++, a class and a struct are essentially the same, so semantically, yes, but technically, you're not using the class keyword, so syntactically, probably not.
In short: It's a silly, and badly worded question, and only you know what your teacher means or wants to hear, because you're the one who attended the classes, not us. All we can do is guess at what he thinks defines a class.
These questions are impossible to answer without extra clarification. The question is tagged C++, which means that the language is supposedly C++.
In this case, if the declarations are made in namespace scope, the ix declaration is invalid. It is illegal to use an unnamed class type (which has no linkage) to declare an object with external linkage. The declaration of ix would work in local scope
void foo() {
    struct { int x; } ix; // OK, no linkage
}
It might also work if ix was declared with internal linkage at namespace scope
static struct { int x; } ix; // OK? Internal linkage?
although I personally believe that this was intended to be ill-formed as well (Comeau somehow allows it).
But a namespace-scope declaration with external linkage is ill-formed
// In namespace scope
struct { int x; } ix; // ERROR
So, if the namespace scope is assumed and if the above declarations are meant to be taken as a single piece of code, there are no meaningful answers to these questions. The whole code is simply invalid. It is meaningless. It is not C++.
Otherwise, if ix is declared with no linkage (local) or with internal linkage, then ix is an object.
As for ia, it is an object regardless of where it is declared, since the class type is named.
Note though that the notion of object in C++ has nothing to do with classes. Object in C++ is a region of storage (memory). A variable of int type is an object in C++, for one example.
Added later: The bit about legality of ix declaration is an interesting issue. Apparently C++98 allowed such declarations, which was proposed to be outlawed in DR#132. However, later the proposal was rejected (for a rather weird reason) and the things were left as is. Yet, Comeau Online refuses to accept a declaration of an object with external linkage with unnamed type (internal linkage is OK). It could quite possibly be a formal bug in Comeau compiler (not that I'd complain about it).
Added even later: Oh, I see that there's an even later DR#389, which finally outlaws such declarations, but the status of this DR is still CD1.
By my definition, I'd say an object has properties and methods. Both nouns and verbs.
You can kick a ball, you can invade a country, and you can eat, milk, or punch a cow. Those are therefore objects.
You might have a data structure that represents the properties of a ball (radius), country (population), or cow (daily milk output in liters), but that data structure doesn't represent an object in my mind until you tell it how to process pertinent behaviors.
I recognize this definition may not work in 100% of cases, but it's close enough for my needs.
Technically, an object is an instance of a class, but objects' true usefulness lies in their ability to encapsulate information and aid in the design of systems. They are an analysis tool.
An object is an instance of a type (be it POD or class).
As such you are able to take the address of an object. All objects take up at least one byte. The reason for this is that you don't have to add special code for handling zero-sized objects, because every object in memory has a distinct address (by making everything at least one byte, the compiler automatically gets a unique address for each object).
int main()
{
    struct { int x; } ix;
    struct A { A() {}; int x; };
    A ia;
    ix.x = 5; // Assigned a value.
              // Thus it has state and thus is an object.
    ia.x = 6; // Assigned a value.
}
So they are both objects.
The real answer is "e. Whoever writes code like this should be coached to improve their legibility." Okay, that was a joke.
This question isn't so complex. But elsewhere, I've seen programming tests written purposely complex for the purpose of seeing if you can solve puzzles. It's completely pointless, because code that is that complex shouldn't and usually does not exist. If it's that hard to read, it's poorly written code.
Remember, code is not written for computers. Code is written for the next developer after you to read and understand.
And don't write code just so it works. That's not a high enough standard. The worst junk in the world will run, but it's a nightmare to fix or upgrade.
We all know that members declared protected in a base class can only be accessed through an instance of the derived class itself. This is a feature of the Standard, and it has been discussed on Stack Overflow multiple times:
Cannot access protected member of another instance from derived type's scope
Why can't my object access protected members of another object defined in common base class?
And others.
But it seems possible to walk around this restriction with member pointers, as user chtz has shown me:
struct Base { protected: int value; };
struct Derived : Base
{
    void f(Base const& other)
    {
        //int n = other.value; // error: 'int Base::value' is protected within this context
        int n = other.*(&Derived::value); // ok??? why?
        (void) n;
    }
};
Live demo on coliru
Why is this possible, is it a wanted feature or a glitch somewhere in the implementation or the wording of the Standard?
From comments emerged another question: if Derived::f is called with an actual Base, is it undefined behaviour?
The fact that a member is not accessible using class member access expr.ref (aclass.amember) due to access control [class.access] does not make this member inaccessible using other expressions.
The expression &Derived::value (whose type is int Base::*) is perfectly standard compliant, and it designates the member value of Base. Then the expression a_base.*p where p is a pointer to a member of Base and a_base an instance of Base is also standard compliant.
So any standard compliant compiler shall make the expression other.*(&Derived::value); defined behavior: access the member value of other.
is it a hack?
In similar vein to using reinterpret_cast, this can be dangerous and may potentially be a source of hard to find bugs. But it's well formed and there's no doubt whether it should work.
To clarify the analogy: The behaviour of reinterpret_cast is also specified exactly in the standard and can be used without any UB. But reinterpret_cast circumvents the type system, and the type system is there for a reason. Similarly, this pointer to member trick is well formed according to the standard, but it circumvents the encapsulation of members, and that encapsulation (typically) exists for a reason (I say typically, since I suppose a programmer can use encapsulation frivolously).
[Is it] a glitch somewhere in the implementation or the wording of the Standard?
No, the implementation is correct. This is how the language has been specified to work.
Member function of Derived can obviously access &Derived::value, since it is a protected member of a base.
The result of that operation is a pointer to a member of Base. This can be applied to a reference to Base. Member access privileges do not apply to pointers to members: they apply only to the names of the members.
From comments emerged another question: if Derived::f is called with an actual Base, is it undefined behaviour?
Not UB. Base has the member.
Just to add to the answers and zoom in a bit on the horror I can read between your lines. If you see access specifiers as 'the law', policing you to keep you from doing 'bad things', I think you are missing the point. public, protected, private, const ... are all part of a system that is a huge plus for C++. Languages without it may have many merits but when you build large systems such things are a real asset.
Having said that: I think it's a good thing that it is possible to get around almost all the safety nets provided to you. As long as you remember that 'possible' does not mean 'good'. This is why it should never be 'easy'. But for the rest - it's up to you. You are the architect.
Years ago I could simply do this (and it may still work in certain environments):
#define private public
Very helpful for 'hostile' external header files. Good practice? What do you think? But sometimes your options are limited.
So yes, what you show is kind-of a breach in the system. But hey, what keeps you from deriving and handing out public references to the member? If horrible maintenance problems turn you on - by all means, why not?
Basically what you're doing is tricking the compiler, and this is supposed to work. I see this kind of question all the time; sometimes people get bad results and sometimes it works, depending on how the code is translated to assembly.
I remember seeing a case with a const keyword on an integer, but then with some trickery the guy was able to change the value and successfully circumvented the compiler's awareness. The result was: A wrong value for a simple mathematical operation. The reason is simple: Assembly in x86 does make a distinction between constants and variables, because some instructions do contain constants in their opcode. So, since the compiler believes it's a constant, it'll treat it as a constant and deal with it in an optimized way with the wrong CPU instruction, and baam, you have an error in the resulting number.
In other words: The compiler will try to enforce all the rules it can enforce, but you can probably eventually trick it, and you may or may not get wrong results based on what you're trying to do, so you better do such things only if you know what you're doing.
In your case, the pointer &Derived::value can be calculated from an object by how many bytes there are from the beginning of the class. This is basically how the compiler accesses it, so, the compiler:
Doesn't see any problem with permissions, because you're accessing value through derived at compile-time.
Can do it, because you're taking the offset in bytes in an object that has the same structure as derived (well, obviously, the base).
So, you're not violating any rules. You successfully circumvented the compilation rules. You shouldn't do it, exactly because of the reasons described in the links you attached, as it breaks OOP encapsulation, but, well, if you know what you're doing...
Doesn't a compiler have all the information it needs to generate a dependency tree of all globals and create a well defined and correct initialization order for them? I realize you could write a cyclic dependency with globals - make only that case undefined behavior - and the compiler could warn and maybe error about it.
Usually the reason for this sort of thing is that it would be burdensome to compiler makers or cause compilation to slow significantly. I have no metrics or evidence that indicates either of these wouldn't be true in this case, but my inclination is that neither would be true.
Hm, imagine the following setup, which is perfectly valid C++, but tricky to analyze:
// TU #1
bool c = coin();
// TU #2
extern bool c;
extern int b;
int a = c ? b : 10;
// TU #3
extern bool c;
extern int a;
int b = c ? 20 : a;
It is clear that TU #1 needs to be initialized first, but then what? The standard solution with references-to-statics allows you to write this code correctly with standard C++, but solving this by fixing the global initialization order seems tricky.
The part the compiler can deal with is actually defined: objects with static storage duration are constructed in the order their definitions appear in the translation unit. The destruction order is just the reverse.
When it comes to ordering objects between translation units, the dependency graph for objects is typically not explicitly represented. However, even if the dependencies were explicitly represented, they wouldn't actually help much: on small projects the dependencies between objects with static storage duration can be managed relatively easily. Where things become interesting is large projects, but these have a much higher chance of including initializations of the form
static T global = functionWhichMayuseTheword();
i.e., in the case where the ordering would be useful it is bound not to work.
There is a trivial way to make sure objects are constructed in time which is even thread-safe in C++ (it wasn't thread-safe in C++03 as this standard didn't mention any concept of threads in the first place): Use a function local static object and return a reference to it. The objects will be constructed upon demand but if there are dependencies between them this is generally acceptable:
static T& global() {
    static T rc = someInitialization(); // constructed on first call
    return rc;
}
Given that there is a simple work-around and neither a proposal nor a working implementation demonstrating that the proposal does work, there is little interest to change the state of how global objects are initialized. Not to mention that improving the support for global objects seems as useful as making goto better.
I am not a compiler author so take what I say with a grain of salt. I think the reasons are as follows.
1) Desire to preserve the C model of separate compilation. Link time analysis is certainly allowed, but I suspect they did not want to make it required.
2) Meyers Singleton (especially now that it has been made thread-safe) provides a good enough alternative in that it is almost as easy to use as a global variable but provides the guarantees you are looking for.
Suppose I have a class whose only purpose is the side-effects caused during construction of its objects (e.g., registering a class with a factory):
class SideEffectCauser {
public:
    SideEffectCauser() { /* code causing side-effects */ }
};
Also suppose I'd like to have an object create such side-effects once for each of several translation units. For each such translation unit, I'd like to be able to just put a SideEffectCauser object at namespace scope in the .cpp file, e.g.,
SideEffectCauser dummyGlobal;
but 3.6.2/3 of the C++03 standard suggests that this object need not be constructed at all unless an object or function in the .cpp file is used, and articles such as this and online discussions such as this suggest that such objects are sometimes not initialized.
On the other hand, Is there a way to instantiate objects from a string holding their class name? has a solution that is claimed to work, and I note that it's based on using an object of a type like SideEffectCauser as a static data member, not as a global, e.g.,
class Holder {
    static SideEffectCauser dummyInClass;
};
SideEffectCauser Holder::dummyInClass;
Both dummyGlobal and dummyInClass are non-local statics, but a closer look at 3.6.2/3 of the C++03 standard shows that that passage applies only to objects at namespace scope. I can't actually find anything in the C++03 standard that says when non-local statics at class scope are dynamically initialized, though 9.4.2/7 suggests that the same rules apply to them as to non-local statics at namespace scope.
Question 1: In C++03, is there any reason to believe that dummyInClass is any more likely to be initialized than dummyGlobal? Or may both go uninitialized if no functions or objects in the same translation unit are used?
Question 2: Does anything change in C++11? The wording in 3.6.2 and 9.4.2 is not the same as the C++03 versions, but, from what I can tell, there is no behavioral difference specified for the scenarios I describe above.
Question 3: Is there a reliable way to use objects of a class like SideEffectCauser outside a function body to force side-effects to take place?
I think the only reliable solution is to design this for specific compiler(s) and runtime. No standard covers the initialization of globals in a shared library which I think is the most intricate case, as this is much dependent on the loader and thus OS dependent.
Q1: No
Q2: Not in any practical sense
Q3: Not in a standard way
I'm using something similar with g++ / C++11 under Linux and get my factories registered as expected. I'm not sure why you wouldn't get the functions called. If what you describe were implemented, it would mean that every single function in that unit has to call the initialization function, and I'm not too sure how that could be done. My factories are also inside namespaces, although they are named namespaces. But I don't see why the constructor wouldn't be called.
namespace snap {
namespace plugin_name {
class plugin_name_factory {
public:
    plugin_name_factory() { plugin_register(this, name); }
    ...
} g_plugin_name_factory;
}
}
Note that the static keyword should not be used anymore in C++ anyway. It is often slower to have a static definition than a global.
Suppose I've a POD struct which has more than 40 members. These members are not built-in types, rather most of them are POD structs, which in turn has lots of members, most of which are POD struct again. This pattern goes up to many levels - POD has POD has POD and so on - up to even 10 or so levels.
I cannot post the actual code, so here is one very simple example of this pattern:
//POD struct
struct Big
{
    A a[2]; //POD has POD
    B b;    //POD has POD
    double dar[6];
    int m;
    bool is;
    double d;
    char c[10];
};
And A and B are defined as:
struct A
{
    int i;
    int j;
    int k;
};
struct B
{
    A a; //POD has POD
    double x;
    double y;
    double z;
    char *s;
};
It's really very simplified version of the actual code which was written (in C) almost 20 years back by Citrix Systems when they came up with ICA protocol. Over the years, the code has been changed a lot. Now we've the source code, but we cannot know which code is being used in the current version of ICA, and which has been discarded, as the discarded part is also present in the source code.
That is the background of the problem. The problem is: now we've the source code, and we're building a system on top of the ICA protocol, for which at some point we need to know the values of a few members of the big struct. A few members, not all. Fortunately, those members appear at the beginning of the struct, so can we write a struct which is part of the big struct, as:
//Part of struct Big
//Whatever members it has, they are in the same order as they appear in Big.
struct partBig
{
    A a[2];
    B b;
    double dar[6];
    //rest is ignored
};
Now suppose we know a pointer to the Big struct (which we know by deciphering the protocol and data streams); can we then write this:
Big *pBig = GetBig();
partBig *part = (partBig*)pBig; //Is this safe?
/*here can we pretend that part is actually Big, so as to access
first few members safely (using part pointer), namely those
which are defined in partBig?*/
I don't want to define the entire Big struct in our code, as it has too many POD members and if I define the struct entirely, I've to first define hundreds of other structs. I don't want to do that, as even if I do, I doubt if I can do that correctly, as I don't know all the structs correctly (as to which version is being used today, and which is discarded).
I've done the casting already, and it has seemed to work for the last year; I didn't see any problem with that. But now I thought why not start a topic and ask everyone. Maybe I'll get a better solution, or at least make some important notes.
Relevant references from the language specification will be appreciated. :-)
Here is a demo of such casting: http://ideone.com/c7SWr
The language specification has some features that are similar to what you are trying to do: 6.5.2.3/5 states that if a union contains several structs that share a common initial sequence of members, you are allowed to inspect these common members through any of the structs in the union, regardless of which one is currently active. So, from this guarantee one can easily derive that structs with a common initial sequence of members should have identical memory layout for these common members.
However, the language does not seem to explicitly allow doing what you are doing, i.e. reinterpreting one struct object as another unrelated struct object through a pointer cast. (The language allows you to access the first member of a struct object through a pointer cast, but not what you do in your code). So, from that perspective, what you are doing might be a strict-aliasing violation (see 6.5/7). I'm not sure about this though, since it is not immediately clear to me whether the intent of 6.5/7 was to outlaw this kind of access.
However, in any case, I don't think that compilers that use strict-aliasing rules for optimization do it at aggregate type level. Most likely they only do it at fundamental type level, meaning that your code should work fine in practice.
Yes, assuming your small structure is laid out with the same packing/padding rules as the big one, you'll be fine. The memory layout order of fields in a structure is well-defined by the language specifications.
The only way to get partBig from Big is using reinterpret_cast (or a C-style cast, as in your question), and the standard does not specify what happens in that case. However, it should work as expected. It would probably fail only for some super exotic platforms.
The relevant type-aliasing rule is 3.10/10: "If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined: ...".
The list contains two cases that are relevant to us:
a type similar (as defined in 4.4) to the dynamic type of the object
an aggregate or union type that includes one of the aforementioned types among its elements or nonstatic data members (including, recursively, an element or non-static data member of a subaggregate or contained union)
The first case isn't sufficient: it merely covers cv-qualifications. The second case allows the union hack that AndreyT mentions (the common initial sequence), but not anything else.
void foo (int x)
{
    struct A { static const int d = 0; }; // error
}
Other than the reference from standard, is there any motivation behind this to disallow static field inside an inner class ?
error: field `foo(int)::A::d' in local class cannot be static
Edit: However, static member functions are allowed. I have one use case for such scenario. Suppose I want foo() to be called only for PODs then I can implement it like,
template<typename T>
void foo (T x)
{
    struct A { static const T d = 0; }; // many compilers allow double, float etc.
}
foo() should pass for PODs only (if static is allowed) and not for other data types. This is just one use case which comes to my mind.
Because static members of a class need to be defined in a global scope, e.g.
foo.h
class A {
static int dude;
};
foo.cpp
int A::dude = 314;
Since the scope inside void foo(int x) is local to that function, there is no scope to define its static member[s].
Magnus Skog has given the real answer: a static data member is just a declaration; the object must be defined elsewhere, at namespace scope, and the class definition isn't visible at namespace scope.
Note that this restriction only applies to static data members. Which means that there is a simple work-around:
class Local
{
    static int& static_i()
    {
        static int value;
        return value;
    }
};
This provides you with exactly the same functionality, at the cost of using the function syntax to access it.
Because nobody saw any need for it?
[edit]: static variables need be defined only once, generally outside of the class (except for built-ins). Allowing them within a local class would require designing a way to define them also. [/edit]
Any feature added to a language has a cost:
it must be implemented by the compiler
it must be maintained in the compiler (and may introduce bugs, even in other features)
it lives in the compiler (and thus may cause some slow down even when unused)
Sometimes, not implementing a feature is the right decision.
Local functions, and classes, add difficulty already to the language, for little gain: they can be avoided with static functions and unnamed namespaces.
Frankly, if I had to make the decision, I'd remove them entirely: they just clutter the grammar.
A single example: The Most Vexing Parse.
I think this is the same naming problem that has prevented us from using local types in template instantiations.
The name foo()::A::d is not a good name for the linker to resolve, so how should it find the definition of the static member? What if there is another struct A in function baz()?
Interesting question, but I have difficulty understanding why you'd want a static member in a local class. Statics are typically used to maintain state across program flow, but in this case wouldn't it be better to use a static variable whose scope was foo()?
If I had to guess why the restriction exists, I'd say it was something to do with the difficulty for the compiler in knowing when to perform the static initialisation. The C++ standards docs might provide a more formal justification.
Just because.
One annoying thing about C++ is that there's a strong dependence on a "global context" concept where everything must be uniquely named. Even the nested namespaces machinery is just string trickery.
I suppose (just a wild guess) that one serious technical issue is working with linkers that were designed for C and that just got some tweak to get them working with C++ (and C++ code needs C interoperability).
It would be nice to be able to get any C++ code and "wrap it" to be able to use it without conflicts in a larger project, but this is not the case because of linkage problems. I don't think there is any reasonable philosophical reason for forbidding statics or non-inline methods (or even nested functions) at the function level but this is what we got (for now).
Even the declaration/definition duality with all its annoying verbosity and implications is just about implementation problems (and to give the ability to sell usable object code without providing the source, something that is now a lot less popular for good reasons).