What does it actually mean when we say "initialize the object" in C++?

I have tried to research it, but I think everything is an object in C++,
like (int, float) are scalar objects, etc.
But when we create a class's instance, documentation refers to "initialize an object with constructor" in C++. What does it actually mean?

I think everything is an object in c++
By the letter of the standard, an "object" is "a region of storage". You can read the nitty-gritty details under [intro.object].
But in layman's terms yes you are right.
like (int, float) are scalar objects..etc.
Absolutely. An int is an object. A float is an object.
(Of course, int and float themselves are types.)
But when we create a class's instance, documentation refers to "initialize an object with constructor" in C++.
There's nothing wrong with that. You can initialise an int, and you can initialise a float, and you can initialise an object of class type. For the latter case, one way to do that is using a constructor. That doesn't change anything.
What does it actually mean?
Exactly what it says: performing the steps needed to give some object an initial value.
I'll caution you also that there is a lot of bad "documentation" (notably poor tutorials etc.) for C++ out there on the web, so it's also possible that you came across badly-worded or flat-out incorrect text. Notice that, even in the comments section under your question, some people got this wrong.
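To make that concrete, here is a small sketch of giving objects of different types an initial value. The names Point and sum_of_initialised_objects are made up for this illustration and do not come from the question.

```cpp
#include <string>

// Hypothetical class type, used only for this illustration.
struct Point {
    int x;
    int y;
    Point(int x_, int y_) : x(x_), y(y_) {}  // a constructor: one way to initialise a class object
};

// Initialises a few objects and returns a sum that depends on all of them.
int sum_of_initialised_objects() {
    int i = 42;            // an int object, initialised to 42
    float f{1.5f};         // a float object, initialised to 1.5
    Point p(3, 4);         // a Point object, initialised by its constructor
    std::string s("abc");  // a std::string object, also initialised by a constructor
    return i + static_cast<int>(f) + p.x + p.y + static_cast<int>(s.size());
}
```

In every case the object gets its initial value at the point of its definition; only the mechanism differs between built-in and class types.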

everything is an object in c++.... like (int, float) are scalar objects
This is wrong. int and float are built-in types. The user can define their own types, for example:
struct A {};
Here, A is a user-defined type. It is not an object!
An object is an instance of a type:
A a;
int i;
Here, a and i are objects of type A and int. When you initialize an object of type A, it means you instantiate class A and initialize it by calling one of the class constructors.
Another example:
std::vector<int> v {1,2,3};
Here, the object v of type std::vector<int> is initialized with the values {1,2,3}, by calling the constructor of the class std::vector<int>.

Creating an object in C++ has 2 steps:
1) find some memory to contain the object.
This can be some space in the stack frame of the function for local objects, or the data section for global objects. In those cases the compiler deals with that. Or operator new is used to allocate some memory dynamically, which is the case if you write "new Foo()" anywhere.
2) "initialize an object with constructor"
This simply means the constructor is called. The address from step 1 is passed as this to the constructor, along with any arguments you specified. If you declared no constructor, then the implicitly generated default constructor is used when possible.
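The two steps can be written out by hand, which is normally not needed but makes the separation visible. This is only a sketch: Foo and create_in_two_steps are hypothetical names, and ordinary code would simply write "new Foo(...)" or declare a local variable.

```cpp
#include <new>      // placement new, ::operator new, ::operator delete
#include <string>

// Hypothetical class used only for this illustration.
struct Foo {
    std::string name;
    explicit Foo(const char* n) : name(n) {}
};

std::string create_in_two_steps() {
    // Step 1: find some memory. No Foo exists yet, this is just raw storage.
    void* storage = ::operator new(sizeof(Foo));

    // Step 2: run the constructor on that memory (placement new).
    Foo* foo = new (storage) Foo("example");
    std::string result = foo->name;

    // Tear down in reverse order: destroy the object, then release the memory.
    foo->~Foo();
    ::operator delete(storage);
    return result;
}
```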

Related

Does a member have to be initialized to take its address?

Can I initialize a pointer to a data member before initializing the member? In other words, is this valid C++?
#include <string>
class Klass {
public:
Klass()
: ptr_str{&str}
, str{}
{}
private:
std::string *ptr_str;
std::string str;
};
this question is similar to mine, but the order is correct there, and the answer says
I'd advise against coding like this in case someone changes the order of the members in your class.
Which seems to mean reversing the order would be illegal but I couldn't be sure.
Does a member have to be initialized to take its address?
No.
Can I initialize a pointer to a data member before initializing the member? In other words, is this valid C++?
Yes. Yes.
There is no restriction that the operand of unary & needs to be initialised. There is an example in the standard, in the specification of the unary & operator:
int a;
int* p1 = &a;
Here, the value of a is indeterminate and it is OK to point to it.
What that example doesn't demonstrate is pointing to an object before its lifetime has begun, which is what happens in your example. Using a pointer to an object before and after its lifetime is explicitly allowed if the storage is occupied. Standard draft says:
[basic.life] Before the lifetime of an object has started but after the storage which the object will occupy has been allocated or, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, any pointer that represents the address of the storage location where the object will be or was located may be used but only in limited ways ...
The rule goes on to list how the usage is restricted. You can get by with common sense. In short, you can treat it as you could treat a void*, except that violating these restrictions is UB rather than ill-formed. A similar rule exists for references.
There are also restrictions on computing the address of non-static members specifically. Standard draft says:
[class.cdtor] ... To form a pointer to (or access the value of) a direct non-static member of an object obj, the construction of obj shall have started and its destruction shall not have completed, otherwise the computation of the pointer value (or accessing the member value) results in undefined behavior.
In the constructor of Klass, the construction of Klass has started and destruction hasn't completed, so the above rule is satisfied.
P.S. Your class is copyable, but the copy will have a pointer to the member of another instance. Consider whether that makes sense for your class. If not, you will need to implement custom copy and move constructors and assignment operators. A self-reference like this is a rare case where you may need custom definitions for those, but not a custom destructor, so it is an exception to the rule of five (or three).
P.P.S If your intention is to point to one of the members, and no object other than a member, then you might want to use a pointer to member instead of pointer to object.
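A sketch of both postscripts follows. Klass mirrors the class from the question, with a hypothetical points_to_own_member() helper added so the effect of copying can be observed; Pair and pick are likewise made-up names for the pointer-to-member alternative.

```cpp
#include <string>
#include <type_traits>

class Klass {
public:
    Klass() : ptr_str{&str}, str{} {}
    // Hypothetical helper: true if ptr_str points at this object's own str.
    bool points_to_own_member() const { return ptr_str == &str; }
private:
    std::string* ptr_str;
    std::string str;
};

bool original_points_to_own_member() {
    Klass original;
    return original.points_to_own_member();
}

bool copy_points_to_own_member() {
    Klass original;
    Klass copy = original;               // implicit copy: ptr_str is copied as-is
    return copy.points_to_own_member();  // false: it still points into original
}

// Pointer-to-member alternative: it names "which member", not a particular object.
struct Pair {
    std::string first;
    std::string second;
};

std::string pick(const Pair& p, std::string Pair::* which) {
    return p.*which;  // the object is supplied at the point of use
}

static_assert(std::is_same<decltype(&Pair::second), std::string Pair::*>::value,
              "&Pair::second is a pointer to member, not a std::string*");
```

Because a pointer to member carries no address of any one object, copying a class that stores one does not leave the copy pointing into the original.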
Funny question.
It is legitimate and will "work". There is a little "but" related to types which is worth knowing about, and which might matter in some border cases involving inheritance, though it does not make the code illegitimate here.
You can, of course, take the address of any object whether it's initialized or not, as long as it exists in the scope and has a name which you can prepend operator& to. Dereferencing the pointer is a different thing, but that wasn't the question.
Now, the subtle point is that the standard defines the result of operator& as "“pointer to member of class C of type T” and is a prvalue designating C::m" only when the operand is a qualified-id naming a non-static member, such as &Klass::str.
In ptr_str{&str} the operand is the unqualified name str, which inside a member function means this->str. So the expression takes the address of this object's str, and its type is plain pointer-to (std::string*), not pointer-to-member-of. No conversion is involved.
In other words, although you do not need to explicitly write &this->str, that's nevertheless what it means [1].
Is this valid, and is it safe to use it within the initializer list? Yes. It's safe to use it as long as it's not being used to access uninitialized members or virtual functions, directly or indirectly. Which, as it happens, is the case here (it might not be the case in a different, arguably contrived case).
[1] Funnily, paragraph 4 starts with a clause that says that no member pointer is formed when you put the qualified-id in parentheses. That's remarkable because most people would probably add parentheses just to be 100% sure they got operator precedence right. So &Klass::foo and &(Klass::foo) are not the same thing, whereas &this->foo and &(this->foo) are.

Why is C++ "known" for making lots of copies?

Quite recently, I ran across an old (but still funny) "If Programming Languages Were Essays" comic. I'm quite familiar with the majority of the languages on it, but I was a little confused about the one on C++.
Having just started C++ recently, I wasn't entirely sure why C++ is known for making tonnes of copies of objects. I went to do a little research, and found that when arguments are passed by value, a copy of the object is passed. However, plenty of languages do passing by value as a default so I don't think I'm hitting the right reason. As well, I got into copy constructors and how C++ has (unlike Java) a default copy constructor that does shallow copies, but that doesn't have me convinced either.
Can anybody shed some light on this conception of C++?
Pass by value and return by value is something C++ inherited from C. It was simple because data types in C (structs) were essentially packs of data. With some hacking with function pointers you can get "member functions", but it's not really the same as structs in C++.
However, as of C++11, move semantics enable "moves" rather than copies. Moving an object from one place to another involves casting it to an rvalue reference using std::move, and then passing it to either a move constructor or a move assignment operator that takes an rvalue reference.
However, moving an object from one place to another leaves the original in a valid but unspecified state. For example, moving a std::string to another variable typically leaves the original as the empty string, although the standard does not promise that.
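A rough way to see when copies and moves happen is to count them. The Tracker type and the three functions below are invented for this sketch; they only count which constructor runs for each kind of argument.

```cpp
#include <utility>  // std::move

// Hypothetical type that counts how often it is copied or moved.
struct Tracker {
    static int copies;
    static int moves;
    Tracker() = default;
    Tracker(const Tracker&) { ++copies; }
    Tracker(Tracker&&) noexcept { ++moves; }
};
int Tracker::copies = 0;
int Tracker::moves = 0;

void take_by_value(Tracker) {}
void take_by_reference(const Tracker&) {}

void exercise() {
    Tracker t;
    take_by_value(t);             // lvalue argument: the copy constructor runs
    take_by_reference(t);         // no new object is created at all
    take_by_value(std::move(t));  // rvalue argument: the move constructor runs
}
```

After one call to exercise(), Tracker::copies and Tracker::moves are each 1: passing by value costs a copy unless the caller explicitly gives the object up with std::move.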
Pass by value is not a problem for languages which default to making everything a reference.
But in C++, parameters default to value types unless you explicitly specify that they are taken by reference.
Furthermore, full copies happen on every assignment unless the assignment operator is overloaded to do something else.
#include <cstdint> // std::uint64_t
class Foo {
int a;
double d;
std::uint64_t z;
};
Foo foo{}; // members are zero-initialized
Foo bar = foo; // just made a copy of all of the guts of Foo.
In Java, that would have been assigning a reference.
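The difference is observable: after a copy, changing one object leaves the other untouched, whereas a C++ reference behaves more like the Java assignment. In this sketch Foo is reduced to public members so it can be inspected, and the function names are made up.

```cpp
// Simplified, hypothetical version of Foo with public members.
struct Foo {
    int a = 0;
    double d = 0.0;
};

// Modifies a copy and reports what the original holds afterwards.
int modify_copy() {
    Foo foo;
    Foo bar = foo;   // bar is a separate object holding a copy of foo's members
    bar.a = 7;       // changes only bar
    return foo.a;    // foo is unaffected
}

// Modifies through a reference and reports what the original holds afterwards.
int modify_through_reference() {
    Foo foo;
    Foo& ref = foo;  // ref is another name for foo, no copy is made
    ref.a = 7;       // changes foo itself
    return foo.a;
}
```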

What are some of the disadvantages of using a reference instead of a pointer?

Given a class "A" exists and is correct. What would be some of the negative results of using a reference to "A" instead of a pointer in a class "B". That is:
// In Declaration File
class A;
class B
{
public:
B();
~B();
private:
A& a;
};
// In Definition File
B::B(): a(* new A())
{}
B::~B()
{
delete &a;
}
Omitted extra code for further correctness of "B", such as the copy constructor and assignment operator, just wanted to demonstrate the concept of the question.
The immediate limitations are that:
You cannot alter a reference's value. You can alter the A it refers to, but you cannot reallocate or reassign a during B's lifetime.
a must never be 0.
Thus:
The object is not assignable.
B should not be copy constructible, unless you teach A and its subtypes to clone properly.
B will not be a good candidate as an element of collection types if stored as value. A vector of Bs would likely be implemented most easily as std::vector<B*>, which may introduce further complications (or simplifications, depending on your design).
These may be good things, depending on your needs.
Caveats:
slicing is another problem to be aware of if a is assignable and assignment is reachable within B.
You can't change the object referred to by a after the fact, e.g. on assignment. Also, it makes your type non-POD (the type given would be non-POD anyway due to the private data member, but in some cases it might matter).
But the main disadvantage is probably it might confuse readers of your code.
Of course, adding a reference member to your class B means that the compiler can no longer generate the implicit default constructor or the copy assignment operator (the implicit copy constructor is still generated, and binds the copy's reference to the same A); and even a manually written assignment operator cannot reseat a.
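The effect on the implicitly generated members can be checked at compile time. The two structs below are hypothetical stand-ins, not the B from the question; they differ only in holding a reference versus a pointer.

```cpp
#include <type_traits>

struct A {};

// Hypothetical stand-ins: the same shape, once with a reference and once with a pointer.
struct WithReference {
    A& a;
};
struct WithPointer {
    A* a = nullptr;
};

// A reference member must be bound on construction and cannot be reseated.
static_assert(!std::is_default_constructible<WithReference>::value,
              "no implicit default constructor with a reference member");
static_assert(!std::is_copy_assignable<WithReference>::value,
              "no implicit copy assignment with a reference member");

// The pointer version keeps both.
static_assert(std::is_default_constructible<WithPointer>::value,
              "pointer member: default construction works");
static_assert(std::is_copy_assignable<WithPointer>::value,
              "pointer member: copy assignment works");
```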
I don't think there are negative results, other than the fact that delete &a may look odd. The fact that the object was created by new is somewhat lost by binding the result to a reference, and it may only matter since the fact that its lifetime has to be controlled by B is not clear.
If you use a reference:
You must provide the value at construction time
You cannot change what it refers to
It cannot be null
It will prevent your class from being assignable
You might perhaps consider using a smart pointer of some sort instead (std::unique_ptr, std::shared_ptr, etc). This has the added benefit of automatically deleting the object for you.
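A minimal sketch of the smart-pointer variant, assuming A is default constructible. The value member and the get() and replace() functions are invented here to make the behaviour observable.

```cpp
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical A with one member so the example has something to inspect.
struct A {
    int value = 0;
};

class B {
public:
    B() : a(new A()) {}  // std::make_unique<A>() is the usual spelling from C++14 on
    // No destructor needed: the unique_ptr deletes the A automatically.
    A& get() { return *a; }
    void replace() { a.reset(new A()); }  // unlike a reference, the pointer can be reseated
private:
    std::unique_ptr<A> a;
};

// Copying is disabled automatically, which avoids two Bs deleting the same A.
static_assert(!std::is_copy_constructible<B>::value, "B is move-only");
static_assert(std::is_move_assignable<B>::value, "B can still be assigned by move");
```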
A reference to a dynamically-allocated object violates the principle of least surprise; that is, no one normally expects code written as you have. In general that will make the maintenance cost higher in the future.
The problem with references is that a reference looks like an alias, or another name, for the same variable, but if you look at the generated machine code, a reference member is typically implemented as a constant pointer. This can be a performance issue: you may assume that plain variable arithmetic is taking place, while at the assembly level each access goes through a pointer, which you may not realize from the source.
You may not want to use pointer manipulation where you want fast integer or float arithmetic.

Built-in datatypes versus User defined datatypes in C++

Constructors build objects from dust.
This is a statement which I have been coming across many times recently.
While initializing a built-in datatype variable, the variable also HAS to be "built from dust" . So, are there also constructors for built in types?
Also, how does the compiler treat a BUILT IN DATATYPE and a USER DEFINED CLASS differently, while creating instances for each?
I mean details regarding constructors, destructors etc.
This question on Stack Overflow is about the same topic and has some pretty interesting details, the most interesting one being what Bjarne said:
Do built-in types have default constructors?
Simply put, according to the C++ standard:
12.1 Constructors [class.ctor]
2. A constructor is used to initialize objects of its class type...
so no, built-in datatypes (assuming you're talking about things like ints and floats) do not have constructors because they are not class types. Class types are specified as such:
9 Classes [class]
1. A class is a type. Its name becomes a class-name (9.1) within
its scope.
class-name:
identifier
template-id
Class-specifiers and elaborated-type-specifiers (7.1.5.3) are used
to make class-names. An object of a class consists of a (possibly
empty) sequence of members and base class objects.
class-specifier:
class-head { member-specification (opt) }
class-head:
class-key identifier (opt) base-clause (opt)
class-key nested-name-specifier identifier base-clause (opt)
class-key nested-name-specifier (opt) template-id base-clause (opt)
class-key:
class
struct
union
And since the built-in types are not declared like that, they cannot be class types.
So how are instances of built-in types created? The general process of bringing built-in and class instances into existence is called initialization, for which there's a huge 8-page section in the C++ standard (8.5) that lays it out in excruciating detail. Here's some of the rules you can find in section 8.5.
As already mentioned, built-in data types don't have constructors.
But you can still use construction-like initialization syntax, as in int i(3), or int i = int(). As far as I know that was introduced to the language to better support generic programming, i.e. to be able to write
template <class T>
T f() { T t = T(); return t; }
f<int>(); // T cannot be deduced from an argument, so it is named explicitly
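The point of the T() syntax is that one template body works for built-in and class types alike. A small sketch, with make_default as a made-up name:

```cpp
#include <string>

// Hypothetical helper: returns a value-initialized T.
template <class T>
T make_default() {
    T t = T();  // zero for built-in types, the default constructor for std::string
    return t;
}
```

make_default<int>() yields 0 and make_default<std::string>() yields an empty string, without the template needing to know which kind of type it was given.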
While initializing a built-in datatype variable, the variable also HAS to be "built from dust" . So, are there also constructors for built in types?
Per request, I am rebuilding my answer from dust.
I'm not particularly fond of that "Constructors build objects from dust" phrase. It is a bit misleading.
An object, be it a primitive type, a pointer, or an instance of a big class, occupies a certain known amount of memory. That memory must somehow be set aside for the object. In some circumstances, that set-aside memory is initialized. That initialization is what constructors do. They do not set aside (or allocate) the memory needed to store the object. That step is performed before the constructor is called.
There are times when a variable does not have to be initialized. For example,
int some_function (int some_argument) {
int index;
...
}
Note that index was not assigned a value. On entry to some_function, a chunk of memory is set aside for the variable index. This memory already exists somewhere; it is just set aside, or allocated. Since the memory already exists somewhere, each bit will have some pre-existing value. Even if a variable is not initialized, it starts out with whatever value those bits happen to hold. The initial value of the variable index might be 42, or 1404197501, or something entirely different. (Formally the value is indeterminate, and reading it before assigning to it is undefined behavior.)
Some languages provide a default initialization in case the programmer did not specify one. (C and C++ do not.) Sometimes there is nothing wrong with not initializing a variable to a known value. The very next statement might be an assignment statement, for example. The upside of providing a default initialization is that failing to initialize variables is a typical programming mistake. The downside is that this initialization has a cost, albeit typically tiny. That tiny cost can be significant when it occurs in a time-critical, multiply-nested loop. Not providing a default initial value fits the C and C++ philosophy of not providing something the programmer did not ask for.
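In C++ the programmer can ask for a known initial value with empty braces, and gets none when nothing is asked for. A short sketch, with Sample and the function names invented for illustration:

```cpp
// Hypothetical plain struct for the illustration.
struct Sample {
    int count;
    double ratio;
};

int value_initialized_int() {
    int n{};     // value-initialization was requested: n is 0
    return n;
}

Sample value_initialized_sample() {
    Sample s{};  // every member is zero-initialized
    return s;
}

int assigned_before_use() {
    int index;   // nothing requested: no value is written, so it must not be read yet
    index = 3;   // assigning before the first read is fine
    return index;
}
```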
Some variables, even non-class variables, absolutely do need to be given an initial value. For example, there is no way to assign a value to a variable that is of a reference type except in the declaration statement. The same goes for variables that are declared to be constant.
Some classes have hidden data that absolutely do need to be initialized. Some classes have const or reference data members that absolutely do need to be initialized. These classes need to be initialized, or constructed. Not all classes do need to be initialized. A class or structure that doesn't have any virtual functions, doesn't have an explicitly-provided constructor or destructor, and whose member data are all primitive data types, is called plain old data, or POD. POD classes do not need to be constructed.
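The distinction can be probed with type traits. This is only an approximation: the exact definition of POD changed between standard versions, so the sketch checks the two properties that matter here rather than POD itself. PlainData and NeedsInit are hypothetical types.

```cpp
#include <type_traits>

// No constructors, no virtual functions, only primitive members.
struct PlainData {
    int id;
    double weight;
};

// const and reference members force initialization through a constructor.
struct NeedsInit {
    const int limit;
    int& target;
    NeedsInit(int l, int& t) : limit(l), target(t) {}
};

static_assert(std::is_trivially_default_constructible<PlainData>::value,
              "PlainData can be created without running any code");
static_assert(std::is_standard_layout<PlainData>::value,
              "PlainData is laid out like a C struct");
static_assert(!std::is_default_constructible<NeedsInit>::value,
              "NeedsInit cannot be created without initial values");
```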
Bottom line:
An object, whether it is a primitive type or an instance of a very complex class, is not "built from dust". Dust is, after all, very harmful to computers. They are built from bits.
Setting aside, or allocating, memory for some object and initializing that set-aside memory are two different things.
The memory needed to store an object is allocated, not created. The memory already exists. Because that memory already exists, the bits that comprise the object will have some pre-existing values. You should of course never rely on those preexisting values, but they are there.
The reason for initializing variables, or data members, is to give them a reliable, known value. Sometimes that initialization is just a waste of CPU time. If you didn't ask the compiler to provide such a value, C and C++ assume the omission is intentional.
The constructor for some object does not allocate the memory needed to store the object itself. That step has already been done by the time the constructor is called. What a constructor does do is to initialize that already allocated memory.
The initial response:
A variable of a primitive type does not have to be "built from dust". The memory to store the variable needs to be allocated, but the variable can be left uninitialized. A constructor does not build the object from dust. A constructor does not allocate the memory needed to store the to-be constructed object. That memory has already been allocated by the time the constructor is called. (A constructor might initialize some pointer data member to memory allocated by the constructor, but the bits occupied by that pointer must already exist.)
Some objects such as primitive types and POD classes do not necessarily need to be initialized. Declare a non-static primitive type variable without an initial value and that variable will be uninitialized. The same goes for POD classes. Suppose you know you are going to assign a value to some variable before the value of the variable is accessed. Do you need to provide an initial value? No.
Some languages do give an initial value to every variable. C and C++ do not. If you didn't ask for an initial value, C and C++ are not going to force an initial value on the variable. That initialization has a cost, typically tiny, but it exists.
Built-in data types (fundamental types, arrays, references, pointers, and enums) do not have constructors.
A constructor is a member function. A member function can only be defined for a class type.
C++03 9.3/1:
"Functions declared in the definition of a class, excluding those declared with a friend specifier, are called member functions of that class".
Using a POD type with certain syntax (such as the line below) might give the impression that it is constructed using a constructor or copy constructor, but it is just initialization without either of the two.
int x(5);

What can be instantiated?

What types in C++ can be instantiated?
I know that the following each directly create a single instance of Foo:
Foo bar;
Foo *bizz = new Foo();
However, what about with built-in types? Does the following create two instances of int, or is instance the wrong word to use and memory is just being allocated?
int bar2;
int *bizz2 = new int;
What about pointers? Did the above example create an int * instance, or just allocate memory for an int *?
Would using literals like 42 or 3.14 create an instance as well?
I've seen the argument that if you cannot subclass a type, it is not a class, and if it is not a class, it cannot be instantiated. Is this true?
So long as we're talking about C++, the only authoritative source is the ISO standard. That doesn't ever use the word "instantiation" for anything but class and function templates.
It does, however, use the word "instance". For example:
An instance of each object with automatic storage duration (3.7.2) is associated with each entry into its block.
Note that in C++ parlance, an int lvalue is also an "object":
The constructs in a C++ program create, destroy, refer to, access, and manipulate objects. An object is a region of storage.
Since new clearly creates regions of storage, anything thus created is an object, and, following the precedent of the specification, can be called an instance.
As far as I can tell, you're really just asking about terminology here. The only real distinction made by the C++ standard is POD types and non-POD types, where non-POD types have features like user-defined constructors, member functions, private variables, etc., and POD types don't. Basic types like int and float are of course PODs, as are arrays of PODs and C-structs of PODs.
Apart from (and overlapping with) C++, the concept of an "instance" in Object-Oriented Programming usually refers to allocating space for an object in memory, and then initializing it with a constructor. Whether this is done on the stack or the heap, or any other location in memory for that matter, is largely irrelevant.
However, the C++ standard seems to consider all data types "objects." For example, in 3.9 it says:
"The object representation of type T
is the sequence of N unsigned char
objects taken up by the object of type
T, where N equals sizeof(T)..."
So basically, the only distinction made by the C++ standard itself is POD versus non-POD.
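The quoted passage about the object representation can be illustrated by copying an int out to its sizeof(int) bytes and back. round_trip is a made-up name, and the sketch relies on int being trivially copyable, which makes the memcpy round trip well defined.

```cpp
#include <cstring>  // std::memcpy

// Copies the object representation of value into bytes, then rebuilds an int from them.
int round_trip(int value) {
    unsigned char bytes[sizeof(int)];
    std::memcpy(bytes, &value, sizeof(int));    // the N = sizeof(int) bytes of the object
    int rebuilt;
    std::memcpy(&rebuilt, bytes, sizeof(int));  // the same bytes give the same value back
    return rebuilt;
}
```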
In C++, 'instance' and 'instantiate' are only associated with Classes.
Note, however, that these are also English words that can have a conversational meaning.
'pointer' is certainly a class of things in the English usage, and a pointer is certainly an instance of that class,
but in C++ speak 'pointer' is not a Class and a pointer is not an Instance of a Class.
see also - how many angels on pinheads
The concept of an "instance" isn't something that's really intrinsic to C++ -- basically you have "things which have a constructor and things which don't".
So, all types have a size, e.g. an int is commonly 4 bytes, a struct with a couple of ints is going to be 8 and so on. Now, slap a constructor on that struct, and it starts looking (and behaving) like a class. More specifically:
int foo; // <-- 4 bytes, no constructor
struct Foo
{
int foo;
int bar;
}; // <-- 8 bytes, no constructor
struct Foo
{
Foo() : foo(0), bar(0) {}
int foo;
int bar;
}; // <-- 8 bytes, with constructor
Now, any of these types can live on the stack or on the heap. When you create something on the stack, like the "int foo;" above, it goes away when its scope ends (e.g. at the end of the function call). If you create something with "new" it goes on the heap and gets its own place to live in memory until you call delete on it. In both cases the constructor, if there is one, will be called during instantiation.
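A sketch that counts constructor and destructor calls shows the same constructor running in both cases, with only the end of the lifetime differing. Counted and the two functions are hypothetical names.

```cpp
// Hypothetical type that counts how often it is constructed and destroyed.
struct Counted {
    static int constructed;
    static int destroyed;
    Counted() { ++constructed; }
    ~Counted() { ++destroyed; }
};
int Counted::constructed = 0;
int Counted::destroyed = 0;

void automatic_storage() {
    Counted local;             // constructor runs here
}                              // destructor runs when the scope ends

void dynamic_storage() {
    Counted* p = new Counted;  // constructor runs here
    delete p;                  // destructor runs only because delete is called
}
```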
It is unusual to do "new int", but it's allowed. You can even give it an initializer with 0 or 1 arguments, as if it had a constructor. "new int()" value-initializes the int, so it is 0, as distinct from "new int", which leaves the value indeterminate.
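The forms with an initializer can be checked directly; the function names below are made up. The plain "new int" form is deliberately not read from, since its value is indeterminate.

```cpp
int value_of_new_int_with_parens() {
    int* p = new int();    // empty parentheses: value-initialized, so *p is 0
    int v = *p;
    delete p;
    return v;
}

int value_of_new_int_with_argument() {
    int* p = new int(42);  // one argument: *p starts out as 42
    int v = *p;
    delete p;
    return v;
}
```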
When you define a value on the stack, it's not usually called "allocating memory" (although it is getting memory on the stack in theory, it's possible that the value lives only in CPU registers).
Literals don't necessarily get an address in program memory; CPU instructions may encode data directly (e.g. put 42 into register B). Probably arbitrary floating point constants have an address.