Does an unused member variable take up memory? - c++

Does initializing a member variable and not referencing/using it further take up RAM during runtime, or does the compiler simply ignore that variable?
#include <iostream>

struct Foo {
    int var1;
    int var2;
    Foo() { var1 = 5; std::cout << var1; }
};
In the example above, the member var1 gets a value which is then displayed in the console. var2, however, is not used at all. Therefore, writing it to memory during runtime would be a waste of resources. Does the compiler take these kinds of situations into account and simply ignore unused variables, or does the Foo object always have the same size, regardless of whether its members are used?

The golden C++ "as-if" rule¹ states that, if the observable behavior of a program doesn't depend on an unused data member's existence, the compiler is allowed to optimize it away.
Does an unused member variable take up memory?
No (if it is "really" unused).
Now two questions come to mind:
When would the observable behavior not depend on a member's existence?
Does that kind of situation occur in real-life programs?
Let's start with an example.
Example
#include <iostream>
struct Foo1
{ int var1 = 5; Foo1() { std::cout << var1; } };
struct Foo2
{ int var1 = 5; int var2; Foo2() { std::cout << var1; } };
void f1() { (void) Foo1{}; }
void f2() { (void) Foo2{}; }
If we ask gcc to compile this translation unit, it outputs:
f1():
mov esi, 5
mov edi, OFFSET FLAT:_ZSt4cout
jmp std::basic_ostream<char, std::char_traits<char> >::operator<<(int)
f2():
jmp f1()
f2 is the same as f1, and no memory is ever used to hold an actual Foo2::var2. (Clang does something similar).
Discussion
Some may say this is different for two reasons:
this is too trivial an example,
the struct is entirely optimized, it doesn't count.
Well, a good program is a smart and complex assembly of simple things rather than a simple juxtaposition of complex things. In real life, you write tons of simple functions using simple structures that the compiler optimizes away. For instance:
#include <set>

bool insert(std::set<int>& set, int value)
{
    return set.insert(value).second;
}
This is a genuine example of a data-member (here, std::pair<std::set<int>::iterator, bool>::first) being unused. Guess what? It is optimized away (simpler example with a dummy set if that assembly makes you cry).
Now would be the perfect time to read the excellent answer of Max Langhof (upvote it for me please). It explains why, in the end, the concept of structure doesn't make sense at the assembly level the compiler outputs.
"But, if I do X, the fact that the unused member is optimized away is a problem!"
There have been a number of comments arguing this answer must be wrong because some operation (like assert(sizeof(Foo2) == 2*sizeof(int))) would break something.
If X is part of the observable behavior of the program², the compiler is not allowed to optimize things away. There are a lot of operations on an object containing an "unused" data member which would have an observable effect on the program. If such an operation is performed, or if the compiler cannot prove none is performed, that "unused" data member is part of the observable behavior of the program and cannot be optimized away.
Operations that affect the observable behavior include, but are not limited to:
taking the size of a type of object (sizeof(Foo)),
taking the address of a data member declared after the "unused" one,
copying the object with a function like memcpy,
manipulating the representation of the object (like with memcmp),
qualifying an object as volatile,
etc.
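To make those operations concrete, here is a minimal sketch (my own illustration, not from the original answer) in which the "unused" var2 takes part in the observable behavior and therefore cannot be optimized away:
#include <cassert>
#include <cstring>

struct Foo2 { int var1 = 5; int var2 = 0; };

int main()
{
    // Taking the size: var2 must be accounted for.
    static_assert(sizeof(Foo2) >= 2 * sizeof(int), "var2 contributes to the size");

    // Inspecting the object representation: var2's bytes must be present
    // (comparing just the two ints, assuming no padding between them).
    Foo2 a, b;
    assert(std::memcmp(&a, &b, sizeof(int) * 2) == 0);
}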
1)
[intro.abstract]/1
The semantic descriptions in this document define a parameterized nondeterministic abstract machine. This document places no requirement on the structure of conforming implementations. In particular, they need not copy or emulate the structure of the abstract machine. Rather, conforming implementations are required to emulate (only) the observable behavior of the abstract machine as explained below.
2) Like an assert passing or failing is.

It's important to realize that the code the compiler produces has no actual knowledge of your data structures (because such a thing doesn't exist on assembly level), and neither does the optimizer. The compiler only produces code for each function, not data structures.
Ok, it also writes constant data sections and such.
Based on that, we can already say that the optimizer won't "remove" or "eliminate" members, because it doesn't output data structures. It outputs code, which may or may not use the members, and among its goals is saving memory or cycles by eliminating pointless uses (i.e. writes/reads) of the members.
The gist of it is that "if the compiler can prove within the scope of a function (including functions that were inlined into it) that the unused member makes no difference for how the function operates (and what it returns) then chances are good that the presence of the member causes no overhead".
As you make the interactions of a function with the outside world more complicated/unclear to the compiler (take/return more complex data structures, e.g. a std::vector<Foo>, hide the definition of a function in a different compilation unit, forbid/disincentivize inlining etc.), it becomes more and more likely that the compiler cannot prove that the unused member has no effect.
There are no hard rules here because it all depends on the optimizations the compiler makes, but as long as you do trivial things (such as shown in YSC's answer) it's very likely that no overhead will be present, whereas doing complicated things (e.g. returning a std::vector<Foo> from a function too large for inlining) will probably incur the overhead.
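For instance, a hypothetical sketch of that "complicated" case (the function and its name are my own invention):
#include <cstddef>
#include <vector>

struct Foo { int var1 = 5; int var2 = 0; };

// Defined here but callable from other translation units: the compiler must
// assume callers will inspect every element, so the vector's storage keeps
// the full sizeof(Foo) per element, var2 included.
std::vector<Foo> make_foos(std::size_t n)
{
    return std::vector<Foo>(n);
}
Here sizeof(Foo) is baked into the vector's allocation and element stride, so var2 occupies memory in every element.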
To illustrate the point, consider this example:
#include <array>
#include <cstring>

struct Foo {
    int var1 = 3;
    int var2 = 4;
    int var3 = 5;
};

int test()
{
    Foo foo;
    std::array<char, sizeof(Foo)> arr;
    std::memcpy(&arr, &foo, sizeof(Foo));
    return arr[0] + arr[4];  // low bytes of var1 and var2: 3 + 4 here
}
We do non-trivial things here (take addresses, inspect and add bytes from the byte representation) and yet the optimizer can figure out that the result is always the same on this platform:
test(): # #test()
mov eax, 7
ret
Not only did the members of Foo not occupy any memory, a Foo didn't even come into existence! If there are other usages that can't be optimized then e.g. sizeof(Foo) might matter - but only for that segment of code! If all usages could be optimized like this then the existence of e.g. var3 does not influence the generated code. But even if it is used somewhere else, test() would remain optimized!
In short: Each usage of Foo is optimized independently. Some may use more memory because of an unneeded member, some may not. Consult your compiler manual for more details.

The compiler will only optimise away an unused member variable (especially a public one) if it can prove that removing the variable has no side effects and that no part of the program depends on the size of Foo being the same.
I don't think any current compiler performs such optimisations unless the structure isn't really being used at all. Some compilers may at least warn about unused private variables but not usually for public ones.
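For the private case, clang (for example) has -Wunused-private-field; a minimal sketch (the class and member names are mine):
// Compile with: clang++ -Wunused-private-field widget.cpp -c
class Widget {
    int used_;
    int unused_;  // clang: warning: private field 'unused_' is not used
public:
    explicit Widget(int v) : used_(v) {}
    int value() const { return used_; }
};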

In general, you have to assume that you get what you have asked for, for example, the "unused" member variables are there.
Since in your example both members are public, the compiler cannot know if some code (particularly from other translation units = other *.cpp files, which are compiled separately and then linked) would access the "unused" member.
The answer of YSC gives a very simple example, where the class type is only used as a variable of automatic storage duration and where no pointer to that variable is taken. There, the compiler can inline all the code and can then eliminate all the dead code.
If you have interfaces between functions defined in different translation units, typically the compiler does not know anything. The interfaces typically follow some predefined ABI (like that) such that different object files can be linked together without any problems. Typically, ABIs do not distinguish between used and unused members. So, in such cases, the second member has to be physically in memory (unless eliminated by the linker later).
And as long as you are within the boundaries of the language, you cannot observe that any elimination happens. If you call sizeof(Foo), you will get 2*sizeof(int). If you create an array of Foos, the distance between the beginnings of two consecutive objects of Foo is always sizeof(Foo) bytes.
Your type is a standard-layout type, which means that you can also access members based on compile-time computed offsets (cf. the offsetof macro). Moreover, you can inspect the byte-by-byte representation of the object by copying it onto an array of char using std::memcpy. In all these cases, the second member can be observed to be there.
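A minimal sketch of such observations (my own illustration):
#include <cstddef>  // offsetof
#include <cstdio>
#include <cstring>  // std::memcpy

struct Foo { int var1; int var2; };

int main()
{
    Foo f{5, 7};

    // Standard layout: the offset of the "unused" member is well defined.
    std::printf("offsetof(Foo, var2) = %zu\n", offsetof(Foo, var2));

    // Byte-wise inspection: var2's representation is observably present.
    unsigned char bytes[sizeof(Foo)];
    std::memcpy(bytes, &f, sizeof(Foo));
    int v2;
    std::memcpy(&v2, bytes + offsetof(Foo, var2), sizeof v2);
    std::printf("var2 read back = %d\n", v2);  // prints 7
}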

The examples provided by other answers to this question which elide var2 are based on a single optimization technique: constant propagation, and subsequent elision of the whole structure (not the elision of just var2). This is the simple case, and optimizing compilers do implement it.
For unmanaged C/C++ code, the answer is that the compiler will in general not elide var2. As far as I know, there is no support for such a struct transformation in debugging information, and if the struct is accessible as a variable in a debugger, then var2 cannot be elided. As far as I know, no current C/C++ compiler can specialize functions according to elision of var2, so if the struct is passed to or returned from a non-inlined function, then var2 cannot be elided.
For managed languages such as C#/Java with a JIT compiler the compiler might be able to safely elide var2 because it can precisely track if it is being used and whether it escapes to unmanaged code. The physical size of the struct in managed languages can be different from its size reported to the programmer.
As of 2019, C/C++ compilers cannot elide var2 from the struct unless the whole struct variable is elided. For the interesting cases of elision of var2 from the struct, the answer is: no.
Some future C/C++ compilers will be able to elide var2 from the struct, and the ecosystem built around the compilers will need to adapt to process elision information generated by compilers.

It's dependent on your compiler and its optimization level.
In gcc, if you specify -O, it will turn on the following optimization flags:
-fauto-inc-dec
-fbranch-count-reg
-fcombine-stack-adjustments
-fcompare-elim
-fcprop-registers
-fdce
-fdefer-pop
...
-fdce stands for Dead Code Elimination.
You can use __attribute__((used)) to prevent gcc from eliminating an unused variable with static storage:
This attribute, attached to a variable with static storage, means that
the variable must be emitted even if it appears that the variable is
not referenced.
When applied to a static data member of a C++ class template, the
attribute also means that the member is instantiated if the class
itself is instantiated.
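A minimal sketch of the attribute in use (the variable name is mine):
// Without the attribute, gcc may drop this unreferenced table as dead data;
// with it, the symbol is emitted anyway (useful when an external tool or a
// linker script locates it).
__attribute__((used)) static const int magic_table[4] = {1, 2, 3, 4};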

Related

Do the C++ standards guarantee that unused private fields will influence sizeof?

Consider the following class:
class Foo {
int a;
};
Testing in g++, I get that sizeof(Foo) == 4 but is that guaranteed by the standard? Would a compiler be allowed to notice that a is an unused private field and remove it from the in-memory representation of the class (leading to a smaller sizeof)?
I don't expect any compilers to actually do that kind of optimization but this question popped up in a language lawyering discussion so now I'm curious.
The C++ standard doesn't define a lot about memory layouts. The fundamental rule for this case is item 4 under section 9 Classes:
4 Complete objects and member subobjects of class type shall have nonzero size. [ Note: Class objects can be assigned, passed as arguments to functions, and returned by functions (except objects of classes for which copying or moving has been restricted; see 12.8). Other plausible operators, such as equality comparison, can be defined by the user; see 13.5. — end note ]
Now there is one more restriction, though: standard-layout classes (no static elements, no virtuals, same visibility for all members). Section 9.2 Class members requires layout compatibility between different classes for standard-layout classes. This prevents elimination of members from such classes.
For non-trivial non-standard-layout classes I see no further restriction in the standard. The exact behavior of sizeof(), reinterpret_cast(), ... is implementation-defined (i.e., 5.2.10: "The mapping function is implementation-defined.").
The answer is yes and no. A compiler could not exhibit exactly that behaviour within the standard, but it could do so partly.
There is no reason at all why a compiler could not optimise away the storage for the struct if that storage is never referenced. If the compiler gets its analysis right, then no program that you could write would ever be able to tell whether the storage exists or not.
However, the compiler cannot report a smaller sizeof() thereby. The standard is pretty clear that objects have to be big enough to hold the bits and bytes they contain (see for example 3.9/4 in N3797), and to report a sizeof smaller than that required to hold an int would be wrong.
At N3797 5.3.2:
The sizeof operator yields the number of bytes in the object
representation of its operand
I do not see that 'representation' can change according to whether the struct or member is referenced.
As another way of looking at it:
#include <cassert>

struct A {
    int i;
};
struct B {
    int i;
};

int main()
{
    A a;
    a.i = 0;
    assert(sizeof(A) == sizeof(B));
}
I do not see that this assert can be allowed to fail in a standards-conforming implementation.
If you look at templates, you'll notice that optimization of them often ends up producing nearly nothing in the output, even though the template files may be thousands of lines...
I think that the optimization you are talking about will nearly always occur in a function when the object is used on the stack and the object doesn't get copied or passed down to another function and the private field is never accessed (not even initialized... which could be viewed as a bug!)

C++ constructor initialization list alternative in C?

In C++, class constructors can use initializer lists, which I am told is a feature that improves performance by avoiding extra assignments. So I wonder if there is a similar approach to achieve the same benefits in C for functions that basically serve the same purpose as C++ class constructors, i.e., to initialize structs?
I am a little unclear on how exactly the feature works in a C++ compiler, so any additional info on the subject will also be appreciated.
C doesn't have any similar feature, however since C also doesn't have constructors, there is no danger of unnecessary assignments.
The bigger principle is that introducing one feature into a language often creates a need for additional features to reinforce the original. A trivial example is threads. If threads are built into the language as a feature, then there is the immediate question of how to synchronize them. Hence synchronization is also needed. So you see languages (like C) with no built-in threads or synchronization, and languages with both, but not one without the other. Here, constructors are to initializer lists as threads are to synchronization.
In C++ constructors, initializer lists allow the C++ compiler to construct members in place, at the location of the member variable, instead of using an assignment operator, copy constructor, or move constructor to initialize the member variable. See Section 10.6 of the C++ FAQ for more details.
In C, there are no such automatic operations provided by the C compiler. This means that the programmer controls all initialization directly, and no special-language features are required to avoid these extra operations.
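For contrast, a minimal C++ sketch of the two styles in question (the names are mine):
#include <string>

struct Person {
    std::string name;

    // Initializer list: name is constructed in place, in one step.
    Person(const char* n) : name(n) {}

    // The assignment style would instead default-construct name first
    // and then assign to it, i.e., two steps:
    //   Person(const char* n) { name = n; }
};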
To be a little more clear, consider what happens when you use assignment to initialize in a C++ constructor:
1. The member variable is first constructed with a default constructor.
2. A temporary object is constructed.
3. An assignment or move-assignment operator is called to re-initialize the member variable with the temporary.
4. The destructor is called on the temporary.
While some compilers can optimize this away in some situations, your mileage may vary, and no C++ compiler can optimize these steps away in all situations. Now, consider how a programmer would exactly duplicate these steps in C:
void my_struct_init(struct my_struct* sp)
{
    member_init_default(&sp->the_member);   /* default constructor for member */
    struct member memb;                     /* temporary on the stack */
    member_init_other(&memb, ...params...); /* initialize memb */
    member_assign(&sp->the_member, &memb);  /* assign member */
    member_finalize(&memb);                 /* finalize the temporary */
}
Few C programmers would do this (without good reason). Instead, they would simply write the optimized form directly:
member_init_other(&sp->the_member, ...params...);
The feature exists in C++ because the compiler does a lot of automatic things for the programmer. This often makes life easier for the programmer, but requires features like initializer lists to help the compiler generate optimal code. C compilers present a much simpler model of the underlying machine, do fewer things automatically, and thus require fewer features (though not necessarily less work) to generate similarly optimal code.
There is no such feature in C. The closest thing is designated initializers.
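A minimal sketch of designated initializers (the field names are mine; this is C99 syntax, which C++20 also accepts):
struct mine { int x; int y; };

int main(void)
{
    struct mine m = { .x = 4, .y = 2 };  // members named at the init site
    return m.x + m.y;
}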
You can write a separate function, call it after creating the object, and pass the object to it:
C:
typedef struct {
    int x;
} mine;

void mine_initializer(mine* me)
{
    me->x = 4; // initialization
}

int main(void)
{
    mine me;
    mine_initializer(&me);
    return 0;
}
You can also do that in C++:
#include <cstdio>

struct mine {
    int x;
    void initialize()
    {
        x = 4; // initialization
    }
};

int main()
{
    mine me;
    me.initialize();
    printf("%d", me.x);
    return 0;
}
This will output 4.

Pure functions in C++11

Can one, in C++11, somehow (e.g., in gcc) mark a function (not a class method) as const, telling the compiler that it is pure and does not use global memory but only its arguments?
I've tried gcc's __attribute__((const)) and it is precisely what I want. But it does not produce any compile-time error when global memory is touched in the function.
Edit 1
Please be careful. I mean pure functions. Not constant functions. GCC's attribute is a little bit confusing. Pure functions only use their arguments.
Are you looking for constexpr? This tells the compiler that the function may be evaluated at compile time. A constexpr function must have literal return and parameter types and the body can only contain static asserts, typedefs, using declarations and directives and one return statement. A constexpr function may be called in a constant expression.
constexpr int add(int a, int b) { return a + b; }
int x[add(3, 6)];
Having looked at the meaning of __attribute__((const)), the answer is no, you cannot do this with standard C++. Using constexpr will achieve the same effect, but only on a much more limited set of functions. There is nothing stopping a compiler from making these optimizations on its own, however, as long as the compiled program behaves the same way (the as-if rule).
Because it has been mentioned a lot here, let's forget about metaprogramming for now, which is purely functional anyway and off topic. However, a constexpr function foo can be called with non-constexpr arguments, and in this context foo is actually a pure function evaluated at runtime (I am ignoring global variables here). But you can write many pure functions that you cannot make constexpr; this includes any function throwing exceptions, for example.
Second, I assume the OP means marking pure as an assertion for the compiler to check. GCC's pure attribute is the opposite: a way for the coder to help the compiler.
While the answer to the OP's question is NO, it is very interesting to read about the history of attempts to introduce a pure keyword (or impure and let pure be the default).
The d-lang community quickly figured out that the meaning of "pure" is not clear. Logging should not make a function impure. Mutable variables that do not escape the function call should be allowed in pure functions. Equal return values having different addresses should not be considered impure. But D goes even further than that in stretching purity.
So the d-lang community introduced the terms "weakly pure" and "strongly pure". But later disputes showed that weak and strong are not black and white and there are grey zones; see purity in D.
Rust introduced the "pure" keyword early on, but dropped it because of its complexity; see purity in Rust.
Among the great benefits of a "pure" keyword there is an ugly consequence though. A templated function can be pure or not depending on its type parameters. This can explode the number of template instantiations. Those instantiations may only need to exist temporarily in the compiler and not get into the executable but they can still explode compile times.
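A hypothetical sketch of that template problem (all names are mine):
#include <cstdio>

// Whether apply() is pure depends entirely on F, so each instantiation
// would need its own purity verdict, multiplying the compiler's work.
template <class F>
int apply(F f, int x) { return f(x); }

int square(int x) { return x * x; }                      // pure
int noisy(int x)  { std::printf("%d\n", x); return x; }  // impure: does I/O

int main()
{
    apply(square, 2);  // this instantiation could be marked pure
    apply(noisy, 2);   // this one could not
}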
A syntax highlighting editor could be of some help here without modifying the language. Optimizing C++ compilers do actually reason about the pureness of a function; they just do not guarantee catching all cases.
I find it sad that this feature seems to have low priority. It makes reasoning about code so much easier. I would even argue that it would improve software design by incentivizing programmers to think differently.
Using just standard C++11:
namespace g { int x; }

constexpr int foo()
{
    // return g::x = 42; Nah, not constant
    return 42; // OK
}

int main()
{}
here's another example:
constexpr int foo( int blah = 0 )
{
    return blah + 42; // OK
}

int main( int argc, char** )
{
    int bah[foo(2)];               // Very constant.
    int const troll = foo( argc ); // Very non-constant.
}
The meaning of GCC's __attribute__( const ) is documented in the GNU compiler docs as …
Many functions do not examine any values except their arguments, and have no effects except the return value. Basically this is just slightly more strict class than the pure attribute below, since function is not allowed to read global memory.
One may take that to mean that the function result should only depend on the arguments, and that the function should have no side effects.
This allows a more general class of functions than C++11 constexpr, which makes the function inline, restricts arguments and function result to literal types, and restricts the "active" statements of the function body to a single return statement, where (C++11 §7.1.5/3)
— every constructor call and implicit conversion used in initializing the return value (6.6.3, 8.5) shall be one of those allowed in a constant expression (5.19)
As an example, it is difficult (I would think not impossible, but difficult) to make a constexpr sin function.
But the purity of the result matters only to two parties:
When known to be pure, the compiler can elide calls with known results.
This is mostly an optimization of macro-generated code. Replace macros with inline functions to avoid silly generation of identical sub-expressions.
When known to be pure, a programmer can remove a call entirely.
This is just a matter of proper documentation. :-)
So instead of looking for a way to express the purity of e.g. sin in the language, I suggest just avoid code generation via macros, and document pure functions as such.
And use constexpr for the functions where it's practically possible (unfortunately, as of Dec. 2012 the latest Visual C++ compiler doesn't yet support constexpr).
There is a previous SO question about the relationship between pure and constexpr. Mainly, every constexpr function is pure, but not vice versa.

Inheritance Costs in C++

Taking the following snippet as an example:
struct Foo
{
    typedef int type;
};

class Bar : private Foo
{
};

class Baz
{
};
As you can see, no virtual functions exist in this relationship. Since this is the case, are the following assumptions accurate as far as the language is concerned?
No virtual function table will be created in Bar.
sizeof(Bar) == sizeof(Baz)
Basically, I'm trying to figure out if I'll be paying any sort of penalty for doing this. My initial testing (albeit on a single compiler) indicates that my assertions are valid, but I'm not sure if this is my compiler's optimizer or the language specification that's responsible for what I'm seeing.
According to the standard, Bar is not a POD (plain old data) type, because it has a base. As a result, the standard gives C++ compilers wide latitude with what they do with such a type.
However, very few compilers are going to do anything insane here. The one thing you probably have to look out for is the Empty Base Optimization. For various technical reasons, the C++ standard requires that any instance be allocated storage space. For some compilers, Foo will be allocated dedicated space in the Bar class. Compilers which implement the Empty Base Optimization (almost all in modern use) will remove the empty base, however.
If the given compiler does not implement the EBO, then sizeof(Bar) will be at least twice sizeof(Baz).
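A quick check (a sketch; it passes on compilers implementing the EBO, which the standard does not require):
struct Foo { typedef int type; };
class Bar : private Foo {};
class Baz {};

static_assert(sizeof(Bar) == sizeof(Baz), "empty base takes no extra space");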
Yeah, without any virtual members or member variables, there shouldn't be a size difference.
As far as I know the compiler will optimize this correctly, if any optimizing is needed at all.

What are the benefits to passing integral types by const ref

The question: Is there benefit to passing an integral type by const reference as opposed to simply by value.
i.e.
void foo(const int& n); // case #1
vs
void foo(int n); // case #2
The answer is clear for user-defined types: case #1 avoids needless copying while ensuring the constness of the object. However, in the above case, the reference and the integer (at least on my system) are the same size, so I can't imagine there being a whole lot of difference in terms of how long the function call takes (due to copying). However, my question is really related to the compiler inlining the function:
For very small inline functions, will the compiler have to make a copy of the integer in case #2? By letting the compiler know we won't change the reference can it inline the function call without needless copying of the integer?
Any advice is welcome.
Passing a built-in int type by const ref will actually be a minor de-optimization (generally). At least for a non-inline function. The compiler may have to actually pass a pointer that has to be de-referenced to get the value. You might think it could always optimize this away, but aliasing rules and the need to support separate compilation might force the compiler's hand.
However, for your secondary question:
For very small inline functions, will the compiler have to make a copy of the integer in case #2? By letting the compiler know we won't change the reference can it inline the function call without needless copying of the integer?
The compiler should be able to optimize away the copy or the dereference if semantics allow it, since in that situation the compiler has full knowledge of the state at the call site and the function implementation. It'll likely just load the value into a register, have its way with it, and reuse the register for something else when it's done with the parameter. Of course, all this is very dependent on the actual implementation of the function.
I actually find it irritating when somebody uses const references like this for basic datatypes. I can't see any benefit of doing it, although it may be argued that for datatypes bigger than sizeof(pointer) it may be more efficient. That said, I really don't care about such minute 'optimizations'.
It depends on the compiler, but I'd expect that any reasonable optimizer would give you the same results either way.
I tested with gcc, and the results were indeed the same. Here's the code I tested:
inline int foo(const int& n) {
    return n * 2;
}

int bar(int x) {
    int y = foo(x);
    return y;
}
(with and without const & on foo's n parameter)
I then compiled with gcc 4.0.1 with the following command line:
g++ -O3 -S -o foo.s foo.cc
The outputs of the two compiles were identical.
It's usually not worth it. Even for inline functions, the compiler won't be stupid. The only time I would say it's appropriate is if you had a template; it might not be worth the extra effort to specialize for builtins just to take a copy instead of a reference.
You can use boost::call_traits<your type>::param_type for optimal parameter passing. It defaults to passing primitive types by value and structs and classes by const reference.
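A minimal sketch of that (assuming Boost is available):
#include <boost/call_traits.hpp>

// param_type is int for built-ins but const T& for class types, so the
// best passing convention is chosen per type.
template <class T>
void foo(typename boost::call_traits<T>::param_type t)
{
    // use t ...
}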
A lot of people are saying there's no difference between the two. I can see one (perhaps contrived) case in which the difference would matter...
int i = 0;

void f(const int &j)
{
    i++;          // if j refers to i, this increments j as well
    if (j == 0)   // called as f(i): j is now 1, so the branch is skipped;
    {             // with pass-by-value, j would still be 0 here
        // Do something.
    }
}

void g()
{
    f(i);         // j aliases i
}
But, as others have mentioned, integers and pointers are likely to be of similar size. For something as small as an integer, references will decrease your performance. It probably won't be too noticeable unless your method is called a lot, but it will be there. On the other hand, under some circumstances the compiler may optimize it out.
When writing or using templates, you may end up with (const int &) because the template writer can't know what the type actually is. If the object is heavyweight, passing a reference is the right thing to do; if it's an int or something, the compiler may be able to optimize it away.
In the absence of some kind of external requirement, there is generally no reason to do something like this for a one-off function -- it's just extra typing, plus throwing around references actually tends to inhibit optimization. Copying small data in registers is much cheaper than reloading it from memory in case it's changed!
I can't think of any benefit. I've even seen recommendation that when writing templates, you use meta-programming to pass integral types by value and only use const reference for non-integral types.
Well, the cost of a reference is typically the same as that of an integral type, but with the reference an indirection has to take place, because the reference to some memory has to be resolved into a value.
Just copy by value, stick to an immutable convention for built-in types.
Don't do this. int is the same size as a pointer/reference on common 32-bit platforms, and smaller on 64-bit, so you could get a performance disadvantage instead of a benefit. That is, all function arguments are pushed onto the stack in order so that the function can read them, and it will be either your int, or its address in the case of a reference. Another disadvantage is that the callee will either access your n through an indirection (dereferencing an address), or it will make a copy on its stack as an optimization.
If you make some changes to an int passed by value, it might be written either back onto the place on the stack where it was passed, or onto a new stack position. The second case naturally isn't advantageous, but shouldn't happen. By consting you bar yourself from making such a change, but this would work the same with const int.
In the proper inline case it doesn't matter, naturally, but keep in mind that not everything you mark inline actually will be inlined.
Please read Want Speed? Pass by Value by Dave Abrahams.
It's not only performance.
A true story: this week I noticed that a colleague tried to improve upon the Numerical Recipes and replaced the macro
#define SHFT(a,b,c,d) do { (a)=(b); (b)=(c); (c)=(d); } while (0)
by this function
inline void Rotate(double& dFirst, double& dSecond, double& dThird,
                   const double dNewValue)  // by value: copied before the body runs
{
    dFirst = dSecond;
    dSecond = dThird;
    dThird = dNewValue;
} // Function Rotate
This would have worked, if he had passed the last parameter by reference, but as it is, this code
Rotate(dum,*fb,*fa,dum);
which was supposed to swap *fa and *fb no longer works.
Passing it by reference without const is not possible, as in other places non-lvalues are passed to the last parameter.
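For completeness, a sketch of the fix the answer implies: a const reference still binds to non-lvalues, and it reads the last argument after the first assignment has run, matching the macro's behavior:
inline void Rotate(double& dFirst, double& dSecond, double& dThird,
                   const double& dNewValue)  // by reference, not by value
{
    dFirst = dSecond;    // with Rotate(dum, *fb, *fa, dum): dum = *fb
    dSecond = dThird;    // *fb = *fa
    dThird = dNewValue;  // *fa = dum, i.e. the old *fb, so the swap works
}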