Could someone point me to the (official) definition of object in C++? In the current specification, the word "object" is used a few thousand times, but I can't seem to find a section or reference that explains what an object is.
The background to this somewhat basic question is a discussion I recently had with another user, who was surprised by my question of whether a pointer to a variable of a scoped enum type could be considered an object pointer.
According to what he says, in C++ each variable is an object, hence also the variable i in int i = 42;.
Anyway, I could find other sources stating that an object in C++ is an instance of a class (and this is surely what I was taught at school many years ago), which, as I understand it, contradicts the assumption above that every variable is an object. Is there an explanation for this apparent contradiction?
References aren't objects. Instances of pretty much any other type are.
Here's the definition, found in section 1.8:
The constructs in a C++ program create, destroy, refer to, access, and manipulate objects. An object is a region of storage. [ Note: A function is not an object, regardless of whether or not it occupies storage in the way that objects do. — end note ] An object is created by a definition (3.1), by a new-expression (5.3.4) or by the implementation (12.2) when needed. The properties of an object are determined when the object is created. An object can have a name (Clause 3). An object has a storage duration (3.7) which influences its lifetime (3.8). An object has a type (3.9). The term object type refers to the type with which the object is created. Some objects are polymorphic (10.3); the implementation generates information associated with each such object that makes it possible to determine that object's type during program execution. For other objects, the interpretation of the values found therein is determined by the type of the expressions (Clause 5) used to access them.
More useful is the definition of object type in 3.9p8:
An object type is a (possibly cv-qualified) type that is not a function type, not a reference type, and not a void type.
Functions have function type but they aren't instances, and there never are instances of void.
To deal with your particular debate, you need the definition of object pointer, from 3.9.2p3:
The type of a pointer to void or a pointer to an object type is called an object pointer type.
As it turns out, the definition of object never mattered, only the definition of object type. A pointer to a scoped enum is certainly an object pointer (and it is itself also an object).
You'll find that the Standard uses the phrase class object when it means to restrict to instances of class, struct, or union type.
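The definitions quoted in this answer can be checked at compile time with the standard type traits. This is only a minimal sketch, assuming C++17: Color and is_object_pointer_v are hypothetical names made up for illustration (the standard library has no trait called is_object_pointer), while std::is_object follows the "not a function, not a reference, not void" rule.

```cpp
#include <cassert>
#include <type_traits>

enum class Color { red, green };  // hypothetical scoped enum

// Hypothetical helper, not a standard trait: true for a pointer to void
// or a pointer to an object type, mirroring the quoted wording.
template <class T>
constexpr bool is_object_pointer_v =
    std::is_pointer_v<T> &&
    (std::is_void_v<std::remove_pointer_t<T>> ||
     std::is_object_v<std::remove_pointer_t<T>>);

static_assert(std::is_object_v<int>);       // int is an object type
static_assert(std::is_object_v<Color>);     // so is a scoped enum
static_assert(!std::is_object_v<int&>);     // reference types are not
static_assert(!std::is_object_v<void>);     // void is not
static_assert(!std::is_object_v<void()>);   // function types are not

static_assert(is_object_pointer_v<Color*>);       // pointer to scoped enum
static_assert(is_object_pointer_v<void*>);        // pointer to void
static_assert(!is_object_pointer_v<void (*)()>);  // function pointer
static_assert(std::is_object_v<Color*>);          // a pointer type is itself an object type

inline bool pointer_to_scoped_enum_is_object_pointer() {
    return is_object_pointer_v<Color*>;
}
```

If any of the static_assert lines were wrong, the translation unit would fail to compile, so a successful build is the whole check.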
Related
I am learning C++ using the books listed here. My question is whether there is a difference between a variable and an object. Consider the following example:
int i = 0; //i is an object of type int. Or we can say i is a variable of type int
int &refI = i; //refI is a "reference variable" of reference type int&. But here refI is not an object.
My current understanding is that the terms variable and object overlap to a large extent, but in some contexts, as with refI above, they can differ. In particular, refI is a reference variable of reference type int&, and refI is not an object because a reference is an alias for some other object, while i is both an object and a variable.
My first question: am I analyzing the refI case above correctly? If not, what does the standard say about this?
My second question: does the C++ standard strictly differentiate between these two terms? If yes, how and where? For example, something like:
a variable may be defined as an object with a name. And any object without a name is not a variable.
Here the user says that a variable and object are different.
Edit
I am also asking this question because I am aware that in Python (as it is a dynamically typed language) there is a clear distinction between variables and objects. Does the C++ standard also make such a clear distinction?
Difference between an object and a variable in C++
Variable is a programming language level concept. A variable has a type and it (usually) has a name. A variable can denote an object, or a reference.
There's no concise definition of "variable" in the standard, nor a section dedicated to variables alone, but the closest individual rule specifying its meaning is:
[basic.pre]
A variable is introduced by the declaration of a reference other than a non-static data member or of an object.
The variable's name, if any, denotes the reference or object.
Object is a concept in the level of the abstract machine that the language defines. It is mostly specified in the section "Object model [intro.object]" which begins:
[intro.object]
The constructs in a C++ program create, destroy, refer to, access, and manipulate objects.
An object is created by a definition, by a new-expression ([expr.new]), by an operation that implicitly creates objects (see below), when implicitly changing the active member of a union, or when a temporary object is created ([conv.rval], [class.temporary]).
An object occupies a region of storage in its period of construction ([class.cdtor]), throughout its lifetime, and in its period of destruction ([class.cdtor]).
int i = 0; //i is an object of type int. Or we can say i is a variable of type int
i is a variable of type int. The variable's name denotes an object.
int &refI = i; //refI is a "reference variable" of reference type int&. But here refI is not an object.
refI is a variable of type int&. The variable's name denotes a reference. The reference is bound to the object named by i and can be seen as another name for the same object.
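A small sketch of the two declarations discussed above, assuming C++17. The function name reference_names_same_object is made up for illustration; the checks only show that the two variables have different declared types while both names designate one and the same object.

```cpp
#include <cassert>
#include <type_traits>

inline bool reference_names_same_object() {
    int i = 0;        // variable of type int; its name denotes an object
    int& refI = i;    // variable of type int&; its name denotes a reference

    // The declared types of the two variables differ.
    static_assert(std::is_same_v<decltype(i), int>);
    static_assert(std::is_same_v<decltype(refI), int&>);

    refI = 7;         // writes through the reference to the object named by i

    // Taking the address of the reference yields the address of that object.
    return &refI == &i && i == 7;
}
```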
Your understanding seems to be correct. I'm going to reuse the quote from @RichardCritten's answer, but with a different explanation.
[basic.pre]/6
A variable is introduced by the declaration of a reference other than a non-static data member or of an object.
The variable's name, if any, denotes the reference or object.
So a variable is one of:
A named object, e.g. int i = 1;. Non-static data members don't count. Functions don't count, since they're not objects, see below. Named objects are always created by declarations.
An unnamed object that has a declaration. The only ones I'm aware of are unnamed function parameters (and structured bindings, see below).
A named reference, e.g. int &j = i;. Non-static data members don't count. Named references are always created by declarations.
An unnamed reference that has a declaration. The only ones I'm aware of are unnamed function parameters (and structured bindings, see below). As far as I'm aware, there are no unnamed references without declarations, since expressions can't have reference types.
A structured binding: there's a single unnamed object or reference per structured binding (regardless of the number of identifiers), AND, if this structured binding was initialized with a std::tuple-like class (as opposed to an array or a class with magically detected members), there's also one reference per each member (unnamed, surprisingly - the identifiers magically refer to those references, they are not their names).
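Two of the less obvious cases from the list above can be sketched as follows, assuming C++17. The function names are hypothetical; the example shows an unnamed function parameter and a structured binding declared with auto& over an array.

```cpp
#include <cassert>

// The parameter is a variable introduced by a declaration, but it has no name.
inline int ignores_argument(int) { return 1; }

inline bool binding_aliases_array() {
    int arr[2] = {1, 2};

    // One unnamed reference to arr is introduced by this declaration;
    // the identifiers a and b designate arr[0] and arr[1].
    auto& [a, b] = arr;

    a = 10;  // modifies arr[0] through the binding
    return arr[0] == 10 && &b == &arr[1];
}
```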
[intro.object]
The constructs in a C++ program create, destroy, refer to, access, and manipulate objects. An object is created by a definition, by a new-expression, by an operation that implicitly creates objects (see below), when implicitly changing the active member of a union, or when a temporary object is created. An object occupies a region of storage in its period of construction, throughout its lifetime, and in its period of destruction.
[Note 1: A function is not an object, regardless of whether or not it occupies storage in the way that objects do.
— end note]
The properties of an object are determined when the object is created. An object can have a name. An object has a storage duration which influences its lifetime. An object has a type.
The above explains what counts as an object.
But I find it easier to remember what isn't one:
Functions are not objects. (see quote above)
References are not objects. (again, note that expressions can't have reference types; there are no temporary unnamed references)
Prvalues are not objects (since C++17).
If the "object" doesn't exist, it isn't an object (i.e. if its lifetime hasn't started or has already ended, and neither its constructor nor its destructor is currently running).
Obviously, labels are not objects, and neither are macros.
Ignoring the language-lawyer tag since you are just learning. You don't need to know the exact wording of the standard to be able to understand the example you gave.
int i = 0; //i is an object of type int. Or we can say i is a variable of type int
int &refI = i; //refI is a "reference variable" of reference type int&. But here refI is not an object.
Both i and refI are variables. Anything you give a name is a variable.
But both i and refI are also objects. Every variable has an address in memory where it lives and that makes it an object. There are objects that are not variables, for example the C string literal "Hello, world!". The string is placed somewhere in memory and has an address, it is an object. But it has no name so it isn't a variable.
Note: If nothing uses the address of an object the compiler may, and often will, optimize. Often the data is only kept in CPU registers and never actually stored in memory.
So to summarize: All variables are objects but not all objects are variables.
Current Draft Standard basic.pre.6 "A variable is introduced by the declaration of a reference other than a non-static data member or of an object. The variable's name, if any, denotes the reference or object."
Current draft standard says (previous standards have similar wording) in [basic.life/1]:
The lifetime of an object or reference is a runtime property of the
object or reference. An object is said to have non-vacuous
initialization if it is of a class or aggregate type and it or one of
its subobjects is initialized by a constructor other than a trivial
default constructor. [ Note: Initialization by a trivial copy/move
constructor is non-vacuous initialization. — end note ] The lifetime
of an object of type T begins when:
(1.1) storage with the proper alignment and size for type T is obtained, and
(1.2) if the object has non-vacuous initialization, its initialization is complete,
See this code:
alignas(int) char obj[sizeof(int)];
Does [basic.life]/1 mean that an int (and several other types that have the same or weaker alignment and size requirements as int) has begun its lifetime here?
What does this even mean? If an object has begun its lifetime, is it created? [intro.object/1] says:
[...] An object is created by a definition ([basic.def]), by a new-expression, when implicitly changing the active member of a union ([class.union]), or when a temporary object is created ([conv.rval], [class.temporary]) [...]
So, according to this, my obj (as an int) is not created. But its lifetime as an int (and as an object of possibly infinitely many other vacuously initializable types) has started.
I'm confused, can you give a clarification on this?
You cannot begin the lifetime of an object unless the object has been created. And [intro.object]/1 defines the only ways in which objects can be created:
An object is created by a definition (6.1), by a new-expression (8.3.4), when implicitly changing the active member of a union (12.3), or when a temporary object is created (7.4, 15.2).
The object created by this definition is of type char[]. Therefore, that is the only object whose lifetime begins. And no other objects are created by this construct.
To lend credence to this interpretation, the proposal for C++20 P0593 exists whose primary purpose is to allow that very declaration to implicitly create other such objects.
Comments:
The condition in (1.2) still bothers me. Why is it there?
It is there because it cannot say "the initialization is complete" for an object that doesn't undergo initialization.
Suppose that I have a new(obj) int afterwards. That clearly creates an int object. But before that, obj has obtained the necessary storage.
No, the declaration of obj obtained storage for an object of type char[]. What obtains storage for the int object being created is new(obj). Yes, the placement-new expression obtains storage for the object that it creates. Just like a declaration of a variable obtains storage for the object it creates.
Just because that storage happens to exist already doesn't mean it isn't being obtained.
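The distinction drawn in this exchange can be sketched as below. It is only an illustration under the reading given in the answer: the definition creates the char array, and the int object comes into existence when the placement new-expression runs. The function name create_int_in_buffer is made up.

```cpp
#include <cassert>
#include <new>

inline int create_int_in_buffer() {
    // This definition creates an array of char; per the answer above,
    // no int object has been created at this point.
    alignas(int) char obj[sizeof(int)];

    // The new-expression creates an int object in the array's storage
    // and begins its lifetime.
    int* p = new (obj) int(42);

    // Access the int through the pointer returned by placement new.
    return *p;
}
```

Using the pointer returned by the new-expression, rather than casting obj, avoids having to reason about whether an older pointer refers to the new object.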
I interpret
The lifetime of an object of type T begins when...
to mean
Given that a program creates an object of T, the following describes when that object's lifetime is said to begin...
and not
If the following conditions are satisfied, then an object of type T exists, and its lifetime begins when...
That is, there's an implicit additional condition that the object is "created" in some way described in [intro.object]/1. Paragraph [basic.life]/1 is not meant to imply by itself that any object exists; it only describes one of the properties of objects that do exist.
So for your declaration, the text describes the beginning of the lifetimes of one object of type char[sizeof(int)] and one or more objects of type char (even if the declaration is a statement in a block scope and there is no initialization), but since there is no object of type int implied to exist, we won't say anything about the lifetime of such an object.
Because the Standard deliberately refrains from requiring that all implementations be suitable for all purposes, it will often be necessary for quality implementations intended for various purposes to guarantee the behavior of code over which the Standard itself would impose no requirements.
If some type T supports implicit object creation and a program converts the address of some object to a T*, a high-quality implementation which is intended to support low-level programming concepts without requiring special syntax will behave as though such conversion creates an object of type T in cases where that would allow the program to have defined behavior, but would not implicitly create such objects in cases where doing so would not be necessary but would instead result in Undefined Behavior by destroying other objects.
Thus, if float and uint32_t are the same size and have the same alignment requirements, then given e.g.
alignas(uint32_t) char obj[sizeof(uint32_t)];
float *fp = (float*)obj;
*fp = 1.0f;
uint32_t *up = (uint32_t*)obj;
The initialization of fp would create a float because that would be needed to make the assignment to *fp work. If up will be used in a fashion that would require a uint32_t to exist there, the assignment to up could create one while destroying the float that was there. If up isn't used in such a fashion, but fp is used in a way that would require that the float still exist, that float would still exist. If both pointers are used in ways that would require that the respective objects still exist, even a quality compiler intended for low-level programming might be incapable of handling that possibility.
Note that implementations which are not particularly suitable for low-level programming may not support the semantics described here. The authors of the Standard allow compiler writers to support such semantics or not, based upon whether they are necessary for their compilers' intended purposes; unfortunately, there is not as yet any standard way to distinguish compilers that are suitable for such purposes from those that aren't.
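For code that must not depend on how a particular implementation treats such pointer conversions, a different technique is available: copying the bytes with std::memcpy. This is not what the answer above describes; it is a sketch of a common alternative, with hypothetical function names, and it assumes only that float and std::uint32_t have the same size.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// Copy the object representation of a float into a uint32_t.
inline std::uint32_t bits_of(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

// Copy the bits back into a float.
inline float float_from(std::uint32_t u) {
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}
```

Each function works on objects that were created by ordinary definitions, so no float or uint32_t has to come into existence inside a char array.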
In the comments and answers to this question:
Virtual function compiler optimization c++
it is argued that a virtual function call in a loop cannot be devirtualized, because the virtual function might replace this by another object using placement new, e.g.:
void A::foo() { // virtual
static_assert(sizeof(A) == sizeof(Derived));
new(this) Derived;
}
The example is from a LLVM blog article about devirtualization
Now my question is: is that allowed by the standard?
I could find this on cppreference about storage reuse: (emphasis mine)
A program is not required to call the destructor of an object to end its lifetime if the object is trivially-destructible or if the program does not rely on the side effects of the destructor. However, if a program ends the lifetime of an non-trivial object, it must ensure that a new object of the same type is constructed in-place (e.g. via placement new) before the destructor may be called implicitly
If the new object must have the same type, it must have the same virtual functions. So it is not possible to have a different virtual function, and thus, devirtualization is acceptable.
Or do I misunderstand something?
The quote you provided says:
If a program ends the lifetime of an non-trivial object, it must ensure that a new object of the same type is constructed in-place (e.g. via placement new) before the destructor may be called implicitly
The intent of this statement relates to something a bit different from what you are doing. The statement is meant to say that when you destroy an object without destroying its name, something still refers to that storage with the original type, so you need to construct a new object there so that when the implicit destruction occurs, there is a valid object to destroy. This is relevant, for example, if you have an automatic ("stack") variable and you call its destructor: you need to construct a new instance there before the destructor is called when the variable goes out of scope.
The statement as a whole, and its "of the same type" clause in particular, has no bearing on the topic you're discussing, which is whether you are allowed to construct a different polymorphic type having the same storage requirements in place of an old one. I don't know of any reason why you shouldn't be allowed to do that.
Now, that being said, the question you linked to is doing something different: it is calling a function using implicit this in a loop, and the question is whether the compiler could assume that the vptr for this will not change in that loop. I believe the compiler could (and clang -fstrict-vtable-pointers does) assume this, because this is only valid if the type is the same after the placement new.
So while the quotes from the standard you have provided are not relevant to this issue, the end result is that it does seem possible for an optimizer to devirtualize function calls made in a loop under the assumption that the type of *this (or its vptr) cannot change. The type of an object stored at an address (and its vptr) can change, but if it does, the old this is no longer valid.
It appears that you intend to use the new object using handles (pointers, references, or the original variable name) that existed prior to its recreation. That's allowed only if the instance type is not changed, plus some other conditions excluding const objects and sub-objects:
From [basic.life]:
If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if:
the storage for the new object exactly overlays the storage location which the original object occupied,
and
the new object is of the same type as the original object (ignoring the top-level cv-qualifiers), and
the type of the original object is not const-qualified, and, if a class type, does not contain any non-static data member whose type is const-qualified or a reference type, and
the original object was a most derived object of type T and the new object is a most derived object of type T (that is, they are not base class subobjects).
Your quote from the Standard is merely a consequence of this one.
Your proposed "devirtualization counter-example" does not meet these requirements, therefore all attempts to access the object after it is replaced will cause undefined behavior.
The blog post even pointed this out, in the very next sentence after the example code you looked at.
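For contrast, here is a sketch of a replacement that does meet the quoted conditions: the new object has the same type, exactly overlays the old storage, has no const or reference members, and is a most derived object. Widget and recreate_in_place are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <new>

struct Widget {  // hypothetical type: no const or reference members
    int value;
    explicit Widget(int v) : value(v) {}
};

inline int recreate_in_place() {
    Widget w(1);
    Widget* p = &w;      // handle obtained before the replacement

    w.~Widget();         // end the lifetime of the original object
    new (&w) Widget(2);  // create a new Widget in the same storage

    // Same type and same storage, so the old pointer (and the name w)
    // refer to the new object.
    return p->value;
    // At scope exit the implicit destructor call finds a live Widget.
}
```

Replacing the object with a Derived instead, as in the question's A::foo, would fail the "same type" condition, which is the case the answer above calls undefined behavior.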
From what I understand (and what my textbook says), an object is a programming element that is self-contained, which holds data and a procedure that performs an operation on that data. With this being said, why are things like cin, cout, string, etc. considered objects? Is cin an object, in the way that I defined? Is cin the name of a self-contained unit, which holds data and a procedure that performs operations on that data, found within the source code of the iostream header file?
cin and cout are variables, and as such they're objects.
An object, in C++, is a not-necessarily contiguous region of storage, with an associated content interpretation in the form of a type.
This is a term defined by the C++ standard.
C++11 §1.8/1
The constructs in a C++ program create, destroy, refer to, access, and manipulate objects. An object is a
region of storage. [Note: A function is not an object, regardless of whether or not it occupies storage in the
way that objects do. —end note ] An object is created by a definition (3.1), by a new-expression (5.3.4) or
by the implementation (12.2) when needed. The properties of an object are determined when the object is
created. An object can have a name (Clause 3). An object has a storage duration (3.7) which influences
its lifetime (3.8). An object has a type (3.9). The term object type refers to the type with which the object
is created. Some objects are polymorphic (10.3); the implementation generates information associated with
each such object that makes it possible to determine that object’s type during program execution. For other
objects, the interpretation of the values found therein is determined by the type of the expressions (Clause 5)
used to access them.
The non-contiguous thing was primarily in support of multiple inheritance, but at least one committee member argued strongly, in a discussion with me, that it was intended to support making objects in general non-contiguous. However, I know of no extant compiler that does that. It seems meaningless to me.
std::string is not an object, it's a type.
Note: with some other programming languages, and in computer science in general, the term “object” often denotes an instance of a class type. In C++ even instances of non-class types such as int, are objects.
They are considered objects, because they are "objects". They are not types, they are instances.
You can see how they are defined on cppreference.
Example:
extern std::istream cin;
extern std::wistream wcin;
As you can see, cin is a variable whose type is std::istream.
Regarding your assumption about std::string: again, cppreference is very helpful.
We can see that std::string is not a variable/object, but a type alias for std::basic_string<char> instead.
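Both observations can be confirmed with a few compile-time checks. This is a minimal sketch assuming C++17 and a standard library that declares std::cin with exactly the type std::istream, as the cppreference declarations shown above do; cout_is_one_object is a made-up name.

```cpp
#include <cassert>
#include <iostream>
#include <string>
#include <type_traits>

// cin is a variable; its declared type is std::istream, an object type.
static_assert(std::is_same_v<decltype(std::cin), std::istream>);
static_assert(std::is_object_v<decltype(std::cin)>);

// std::string is a type: an alias for std::basic_string<char>.
static_assert(std::is_same_v<std::string, std::basic_string<char>>);

inline bool cout_is_one_object() {
    std::ostream& out = std::cout;  // bind a reference to the object named cout
    return &out == &std::cout;      // both expressions designate the same object
}
```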
Can one say that variables and constants are objects of data types?
I wonder what would be the proper explanation for this:
int a;
float f;
Here, can we say a is an object of type int and f is an object of type float?
Yes
Per paragraph §1.8, both a and f are objects of their corresponding types.
1 An object is a region of storage. [Note: A function is not an object,
regardless of whether or not it occupies storage in the way that
objects do. —end note ] An object is created by a definition (3.1), by
a new-expression (5.3.4) or by the implementation (12.2) when needed.
The properties of an object are determined when the object is created.
An object can have a name (Clause 3). An object has a storage duration
(3.7) which influences its lifetime (3.8). An object has a type (3.9).
The term object type refers to the type with which the object is
created.
[intro.object]
and those variables fit in the above quoted definition.
a and f are objects of type int and type float, respectively. Yes, that contradicts what @Patashu says, and that's because we're using different definitions of "object".
@Patashu uses the definition from object-oriented programming: an object is a thing with methods, etc. And that's perfectly fine.
However, C++ is a multi-paradigm language -- it supports more than one programming model. The C++ language definition uses the word "object" in the broader sense that compiler writers use: an object is a region of storage with various operations that can be performed on that storage. The operations are defined by the object's type. There's a well-defined set of operations that can be applied to an object of type int, so when you know that you're dealing with an int you and the compiler know what things you can do with it and, by implication, what things you can't do with it.
I'd say yes. A data object is simply a region of storage that contains a value or a group of values. Both int a and float f agree with this definition. If we want to see the differences between those and the "traditional" objects in object oriented languages, we should show the concept of data type, which helps the compiler allocate storage for that data object, and interpret its memory values when it is accessed.
Each data object in C++ must have a data type (identifiers for data objects and data types are established in the variable/constant declaration). In the classification of data types is where we see that int a; and Object a; are not "quite the same":
int and float are basic data types, in the sense that they are provided by the language. The Object type in this example would be a derived type because it is created from basic types.
Data types can be classified in other, often overlapping, groups: For example, one can say that Object is a user-defined type; and that int is a scalar type, because it represents a single data value.
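The classification described in this answer roughly corresponds to categories that the standard type traits can report. In the sketch below, assuming C++17, the struct Object is a hypothetical stand-in for the user-defined type mentioned above; the standard's own terms are "fundamental", "scalar" and "class" types.

```cpp
#include <cassert>
#include <type_traits>

struct Object {  // hypothetical user-defined type built from basic types
    int id;
    float weight;
};

// int and float are provided by the language (fundamental types).
static_assert(std::is_fundamental_v<int> && std::is_fundamental_v<float>);
// int holds a single value (scalar type).
static_assert(std::is_scalar_v<int>);
// Object is a class type, neither fundamental nor scalar.
static_assert(std::is_class_v<Object>);
static_assert(!std::is_fundamental_v<Object> && !std::is_scalar_v<Object>);
// All of them are object types, so instances of each are objects.
static_assert(std::is_object_v<int> && std::is_object_v<Object>);

inline bool both_are_object_types() {
    return std::is_object_v<int> && std::is_object_v<Object>;
}
```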
No
(or rather "arguably not")
Ad "Variables and constants are objects of data types":
Although you can find the phrase "object of type" in the C++ Standard (like "object of type T" [basic.def.odr]/5) and the C Standard (e.g. "object of type wchar_t" 3.7.3), one could argue about the use of the term "variable" in your example, at least in C++:
[basic]/6
A variable is introduced by the declaration of a reference other than a non-static data member or of an object. The variable’s name denotes the reference or object.
So, int answer = 42; int& deepthought = answer; introduces:
an object of type int
a variable, name answer, referring to the object above
a variable, name deepthought, referring to the same object
But a reference AFAIK is not an object (does not have to be) -- so one could argue that variables are not necessarily objects. Conversely, not every object is a variable: an object created by dynamic memory allocation, for example, is not introduced by a declaration.
Ad "a is object of datatype int and f an object of type float"
AFAIK that complies with the Standard, although to be more precise, one would have to include something like "denotes", e.g. "a denotes an object of datatype int".
But I think there's no ambiguity, therefore I consider it OK.