In the following C++ program:
struct O
{
const int id;
};
extern void externalFunc();
int func(O* o)
{
//first load of o->id happens here:
int first = o->id;
externalFunc();
//second load happens here, but why?
return o->id + first;
}
Both Clang and MSVC, with all optimizations on, compile this code so that the o->id value gets loaded from memory twice.
Why are these compilers unable to remove the second load? I am trying to tell the compiler the value is guaranteed not to change by having the id member marked const, but apparently both compilers do not find this sufficient guarantee. If I remove the invocation of externalFunc() they do optimize away the second load. How do I convince the compiler this value is really not going to change?
Consider:
#include <iostream>
#include <new> // placement new
struct O
{
const int id;
O(int x) : id(x) {}
};
O* global = nullptr;
void externalFunc() {
global->~O();
new(global) O(42);
}
int func(O* o)
{
int first = o->id;
externalFunc();
// o->id has changed, even though o hasn't
return o->id + first;
}
int main()
{
O o(1);
global = &o;
std::cout << func(&o);
}
Output: 43.
externalFunc() might alter o->id. (Not o, which is a local variable.)
Why are these compilers unable to remove the second load? I am trying to tell the compiler the value is guaranteed not to change by having the id member marked const, but apparently both compilers do not find this sufficient guarantee.
Because it isn't. Consider this example.
static O mg {5};
void
externalFunc()
{
mg.~O();
new (&mg) O {6};
}
int
main()
{
std::cout << mg.id << '\n';
func(&mg);
std::cout << mg.id << '\n';
}
The first load reads the value 5, the second will read 6.
How do I convince the compiler this value is really not going to change?
Simply cache the field. This will still not convince the compiler that o->id will not change but it will assure it that if it does, you don't care.
int
func(O* o)
{
const int id = o->id;
externalFunc();
return id + id;
}
I have made it a general habit to cache all values of primitive fields that I access via a pointer (including the this pointer) into local (const) variables. If the compiler can ensure that the values can't change, it has no additional cost, and if it can't, it might produce slightly better code. As a nice aside, it also allows you to give names to the values that make most sense in the context of the function.
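The same habit applies to fields read through the this pointer inside a loop. The following is only a sketch: Accumulator, scale and logProgress are hypothetical names, and logProgress stands in for any function whose body the compiler cannot see.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical opaque call. In real code its body would live in another
// translation unit, so the compiler could not tell what it modifies.
void logProgress(std::size_t) {}

struct Accumulator
{
    int scale;                // read through `this` on every access
    std::vector<int> values;

    long sum() const
    {
        const int s = scale;  // cached once; later changes to scale are ignored
        long total = 0;
        for (std::size_t i = 0; i < values.size(); ++i)
        {
            logProgress(i);   // as far as the compiler knows, may modify *this
            total += static_cast<long>(values[i]) * s;
        }
        return total;
    }
};
```

Without the local copy, the compiler would generally have to reload scale after every call to logProgress.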
The compiler itself doesn't have the code for externalFunc() while compiling func(), so it doesn't know what it might do. Therefore, it functions as a barrier.
If you linked statically, this would fall under link-time-optimization (which can be enabled on GCC with -flto), and is also supported in MSVC.
In order to hint to GCC/Clang (looks like MSVC doesn't support this one) that the function doesn't change global memory, you can mark it with the pure attribute:
extern void externalFunc() __attribute__((pure));
And then it will stop posing that obstacle.
I am quite a newbie to the C++ programming, but this question keeps on spinning in my head. I understand that returning reference to a local variable in a function is illegal, i.e. compiling this code snippet:
inline int& funref() {
int a = 8;
return a; // not OK!
}
results in a warning from the compiler and then a runtime error. But then, why does this piece of code get compiled without any warnings and run without error?
inline int& funref() {
int a = 8;
int& refa = a;
return refa; // OK!
}
int main() {
int& refa = funref();
cout << refa;
}
My compiler is g++ on Linux Fedora platform.
It's still wrong, it just happens to be working by (un)happy coincidence.
This code has undefined behaviour with all the usual caveats (it might always work, it might always work until it's too late to fix, it might set fire to your house and run away with your betrothed).
The compiler isn't required to issue a diagnostic (warning or error message) for every possible mistake, just because it isn't always possible to do so. Here, at least your current version of g++ hasn't warned. A different compiler, or a different version of g++, or even the same version with different flags, might warn you.
You can't safely return a reference to a local variable because the variable's lifetime ends when the function returns, so the reference is left dangling. Simply put, the compiler's warning is trying to stop you from referencing garbage data.
However, the compiler isn't bulletproof (as shown in your example #2).
It does work for retrieving a singleton instance, though.
inline int& funref()
{
static int* p_a = nullptr;
if (nullptr == p_a)
p_a = new int(8);
return *p_a;
}
This case is valid because the memory pointed to by p_a remains valid after the function returns.
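The same lifetime argument covers a function-local static object, which avoids the heap allocation altogether. A minimal sketch, assuming nothing else needs the object to live on the heap:

```cpp
// A function-local static has static storage duration: it is initialized the
// first time control passes through its declaration and then outlives every
// call, so returning a reference to it is well defined.
inline int& funref()
{
    static int a = 8;
    return a;
}
```

Every call returns a reference to the same int, so a write through one reference is visible through the next call.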
It's my first year of using C++ and learning on the way. I'm currently reading up on Return Value Optimizations (I use C++11 btw). E.g. here https://en.wikipedia.org/wiki/Return_value_optimization, and immediately these beginner examples with primitive types spring to mind:
int& func1()
{
int i = 1;
return i;
}
//error, 'i' was declared with automatic storage (in practice on the stack(?))
//and is undefined by the time function returns
...and this one:
int func1()
{
int i = 1;
return i;
}
//perfectly fine, 'i' is copied... (to previous stack frame... right?)
Now, I get to this and try to understand it in the light of the other two:
Simpleclass func1()
{
return Simpleclass();
}
What actually happens here? I know most compilers will optimise this, what I am asking is not 'if' but:
how the optimisation works (the accepted response)
does it interfere with storage duration: stack/heap (Old: Is it basically random whether I've copied from stack or created on heap and moved (passed the reference)? Does it depend on created object size?)
is it not better to use, say, explicit std::move?
You won't see any effect of RVO when returning ints.
However, when returning large objects like this:
struct Huge { ... };
Huge makeHuge() {
Huge h { x, y, z };
h.doSomething();
return h;
}
The following code...
auto h = makeHuge();
... after RVO would be implemented something like this (pseudo code) ...
h_storage = allocate_from_stack(sizeof(Huge));
makeHuge(addressof(h_storage));
auto& h = *properly_aligned(h_storage);
... and makeHuge would compile to something like this...
void makeHuge(Huge* h_storage) // in fact this address can be
// inferred from the stack pointer
// (or just 'known' when inlining).
{
// placement new: construct the Huge directly in the caller's storage
Huge* phuge = new (h_storage) Huge(x, y, z);
phuge->doSomething();
}
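On the question of an explicit std::move: one way to see the difference is to count constructor calls. The sketch below uses illustrative names (Tracker, makeTracker, makeTrackerWithMove) that are not from the question. Since C++17 the plain return of a temporary is guaranteed to construct the result in place; before that, elision is permitted and commonly performed but not required.

```cpp
#include <utility>

struct Tracker
{
    static int copies;  // incremented by the copy constructor
    static int moves;   // incremented by the move constructor

    Tracker() {}
    Tracker(const Tracker&) { ++copies; }
    Tracker(Tracker&&) { ++moves; }
};

int Tracker::copies = 0;
int Tracker::moves = 0;

// Plain return of a temporary: eligible for return value optimization.
Tracker makeTracker()
{
    return Tracker();
}

// Explicit std::move on a local: the operand is no longer a plain variable
// name, so the compiler may not elide and must run the move constructor.
// (Some compilers warn about this as a pessimizing move.)
Tracker makeTrackerWithMove()
{
    Tracker t;
    return std::move(t);
}
```

With these definitions, makeTracker never runs the copy constructor, while makeTrackerWithMove always runs the move constructor at least once, so the explicit std::move tends to add work rather than remove it.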
Given the following:
class ReadWrite {
public:
int Read(size_t address);
void Write(size_t address, int val);
private:
std::map<size_t, int> db;
};
In read function when accessing an address which no previous write was made to I want to either throw exception designating such error or allow that and return 0, in other words I would like to either use std::map<size_t, int>::operator[]() or std::map<size_t, int>::at(), depending on some bool value which user can set. So I add the following:
class ReadWrite {
public:
int Read(size_t add) { if (allow) return db[add]; return db.at(add);}
void Write(size_t add, int val) { db[add] = val; }
void Allow() { allow = true; }
private:
bool allow = false;
std::map<size_t, int> db;
};
The problem with that is:
Usually, the program will have one call of allow or none at the beginning of the program and then afterwards many accesses. So, performance wise, this code is bad because it every-time performs the check if (allow) where usually it's either always true or always false.
So how would you solve such problem?
Edit:
While the described use case (one or none Allow() at first) of this class is very likely it's not definite and so I must allow user call Allow() dynamically.
Another Edit:
Solutions which use a function pointer: what about the performance overhead incurred by calling through a function pointer, which the compiler cannot inline? If we use std::function instead, will that solve the issue?
Usually, the program will have one call of allow or none at the
beginning of the program and then afterwards many accesses. So,
performance wise, this code is bad because it every-time performs the
check if (allow) where usually it's either always true or always
false. So how would you solve such problem?
I won't. The CPU will.
Branch prediction will figure out that the outcome is most likely to stay the same for a long time, so the branch will be very cheap at the hardware level. It will still incur some overhead, but a negligible amount.
If you really need to optimize your program, I think you'd do better to use std::unordered_map instead of std::map, or move to some faster map implementation, like google::dense_hash_map. The branch is insignificant compared to the map lookup.
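As a rough sketch of that suggestion, the class from the question could be written on top of std::unordered_map with a single find per read. This is an assumption about how one might restructure it, not the asker's code, and it differs from operator[] in one respect: a permitted read of an unwritten address returns 0 without inserting an entry.

```cpp
#include <cstddef>
#include <stdexcept>
#include <unordered_map>

class ReadWrite {
public:
    int Read(std::size_t address)
    {
        // One hash lookup serves both the strict and the permissive mode.
        auto it = db.find(address);
        if (it != db.end()) return it->second;
        if (allow) return 0;  // permissive mode: report 0, insert nothing
        throw std::out_of_range("ReadWrite::Read: address never written");
    }
    void Write(std::size_t address, int val) { db[address] = val; }
    void Allow() { allow = true; }
private:
    bool allow = false;
    std::unordered_map<std::size_t, int> db;
};
```

The allow check is only reached on a miss, so reads of written addresses never evaluate it.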
If you want to decrease the time-cost, you have to increase the memory-cost. Accepting that, you can do this with a function pointer. Below is my answer:
class ReadWrite {
public:
void Write(size_t add, int val) { db[add] = val; }
// when allowed, make the function pointer point to read2
void Allow() { Read = &ReadWrite::read2;}
//function pointer that points to read1 by default
int (ReadWrite::*Read)(size_t) = &ReadWrite::read1;
private:
int read1(size_t add){return db.at(add);}
int read2(size_t add) {return db[add];}
std::map<size_t, int> db;
};
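Because Read is a data member holding a pointer to a member function, it has to be invoked with the pointer-to-member call syntax rather than like an ordinary member function. As an example:
ReadWrite rwObject;
//some code here
//...
(rwObject.*rwObject.Read)(5); //call through the member function pointer
//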
Note that non-static data member initialization is available from C++11 onwards, so int (ReadWrite::*Read)(size_t) = &ReadWrite::read1; may not compile with older versions. In that case, you have to explicitly declare a constructor in which the function pointer is initialized.
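A sketch of that constructor-based variant follows. Since std::map::at is itself a C++11 addition, this version assumes the strict read is written with find instead; the exception message is made up for illustration.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>

class ReadWrite {
public:
    // The constructor initializer list replaces the in-class initializer.
    ReadWrite() : Read(&ReadWrite::read1) {}
    void Write(std::size_t add, int val) { db[add] = val; }
    void Allow() { Read = &ReadWrite::read2; }
    // Pointer to the currently selected read function.
    int (ReadWrite::*Read)(std::size_t);
private:
    // Strict read: throws if the address was never written.
    int read1(std::size_t add)
    {
        std::map<std::size_t, int>::const_iterator it = db.find(add);
        if (it == db.end()) throw std::out_of_range("address never written");
        return it->second;
    }
    // Permissive read: operator[] default-inserts 0 for unknown addresses.
    int read2(std::size_t add) { return db[add]; }
    std::map<std::size_t, int> db;
};
```

A call then looks like (rwObject.*rwObject.Read)(5), because Read is a data member holding a pointer to a member function.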
You can use a pointer to function.
class ReadWrite {
public:
void Write(size_t add, int val) { db[add] = val; }
int Read(size_t add) { return (this->*Rfunc)(add); }
void Allow() { Rfunc = &ReadWrite::Read2; }
private:
std::map<size_t, int> db;
int Read1(size_t add) { return db.at(add); }
int Read2(size_t add) { return db[add]; }
int (ReadWrite::*Rfunc)(size_t) = &ReadWrite::Read1;
};
If you want runtime dynamic behaviour you'll have to pay for it at runtime (at the point you want your logic to behave dynamically).
You want different behaviour at the point where you call Read depending on a runtime condition and you'll have to check that condition.
No matter whether your overhead is a function pointer call or a branch, you'll find a jump or call to different places in your program depending on allow at the point Read is called by the client code.
Note: Profile and fix real bottlenecks - not suspected ones. (You'll learn more if you profile by either having your suspicion confirmed or by finding out why your assumption about the performance was wrong.)
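For a first rough measurement, before reaching for a real profiler, a small timing helper can be enough. This is only a sketch: elapsedMicroseconds is a hypothetical helper name, and a single wall-clock timing like this is noisy, so it should be repeated and treated as an estimate.

```cpp
#include <chrono>

// Runs `work` once and returns the elapsed wall-clock time in microseconds,
// measured with a monotonic clock.
template <typename F>
long long elapsedMicroseconds(F&& work)
{
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

Wrapping the branch version and the function pointer version of Read in the same loop and timing both gives a direct comparison on the actual workload.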
I'm looking for the answer to the following question: is may_alias suitable as an attribute for a pointer to an object of some class Foo? Or must it be used at class level only?
Consider the following code(it is based on a real-world example which is more complex):
#include <iostream>
using namespace std;
#define alias_hack __attribute__((__may_alias__))
template <typename T>
class Foo
{
private:
/*alias_hack*/ char Data[sizeof (T)];
public:
/*alias_hack*/ T& GetT()
{
return *((/*alias_hack*/ T*)Data);
}
};
struct Bar
{
int Baz;
Bar(int baz)
: Baz(baz)
{}
} /*alias_hack*/; // <- uncommenting this alias_hack apparently solves the problem, but does so at class level (rather than pointer level)
// uncommenting previous alias_hack's doesn't help
int main()
{
Foo<Bar> foo;
foo.GetT().Baz = 42;
cout << foo.GetT().Baz << endl;
}
Is there any way to tell gcc that a single pointer may alias another?
BTW, please note that gcc detection mechanism of such problem is imperfect, so it is very easy to just make this warning go away without actually solving the problem.
Consider the following snippet of code:
#include <iostream>
using namespace std;
int main()
{
long i = 42;
long* iptr = &i;
//(*(short*)&i) = 3; // with warning
//(*(short*)iptr) = 3; // without warning
cout << i << endl;
}
Uncomment one of the lines to see the difference in compiler output.
Simple answer - sorry, no.
__attribute__ gives instructions to the compiler. Objects exist in the memory of the executed program. Hence nothing in the __attribute__ list can relate to run-time execution.
Dimitar is correct. may_alias is a type attribute. It can only apply to a type, not an instance of the type. What you'd like is what gcc calls a "variable attribute". It would not be easy to disable optimizations for one specific pointer. What would the compiler do if you call a function with this pointer? The function is potentially already compiled and will behave based on the type passed to the function, not based on the address stored in the pointer (you should see now why this is a type attribute).
Now depending on your code something like that might work:
#define define_may_alias_type(X) class X ## _may_alias : public X { } __attribute__((__may_alias__));
You'd just pass your pointer as Foo_may_alias * (instead of Foo *) when it might alias. That's hacky though
Wrt your question about the warning, it's because -Wall defaults to -Wstrict-aliasing=3 which is not 100% accurate. Actually, -Wstrict-aliasing is never 100% accurate but depending on the level you'll get more or less false negatives (and false positives). If you pass -Wstrict-aliasing=1 to gcc, you'll see a warning for both
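A different technique from may_alias, for the narrower case of reading or overwriting the bytes of one object through another type, is to copy the bytes with std::memcpy, which does not involve aliasing through an incompatible pointer at all. The sketch below uses hypothetical helper names and mirrors the long/short snippet from the question.

```cpp
#include <cstring>

// Reads the first sizeof(short) bytes of `value` as a short without ever
// dereferencing a short* that points into a long.
short firstBytesAsShort(long value)
{
    short result;
    std::memcpy(&result, &value, sizeof result);
    return result;
}

// Overwrites the first sizeof(short) bytes of `target` with `value`.
// Which part of the long's numeric value changes depends on endianness.
void overwriteFirstBytes(long& target, short value)
{
    std::memcpy(&target, &value, sizeof value);
}
```

Compilers typically turn such small fixed-size memcpy calls into plain loads and stores, so this usually costs nothing compared with the cast.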
I have the following structure:
struct CountCarrier
{
int *CurrCount;
};
And this is what I want to do:
int main()
{
CountCarrier carrier = CountCarrier();
*(carrier.CurrCount) = 2; // initialize the *(carrier.CurrCount) to 2
IncreaseCount(&carrier); // should increase the *(carrier.CurrCount) to 3
}
void IncreaseCount(CountCarrier *countCarrier)
{
int *currCounts = countCarrier->CurrCount;
(*currCounts)++;
}
So, my intention is specified in the comments.
However, I couldn't get this to work. For starters, the program throws an exception at this line:
*(carrier.CurrCount) = 2;
And I suspect the following line won't work as well. Anything I did wrong?
struct CountCarrier
{
int *CurrCount; //No memory assigned
};
You need to allocate some valid memory to the pointer inside the structure to be able to put data in this.
Unless you do so, what you are trying to do is write to some invalid address, which results in undefined behavior, which luckily in this case shows up as an exception.
Resolution:
struct CountCarrier
{
int *CurrCount; // points to memory allocated by the constructor
CountCarrier():CurrCount(new(int))
{
}
};
Suggestion:
Stay away from dynamic allocations as long as you can.
When you think of using pointers always think whether you really need one. In this case it doesn't really seem that you need one, A simple int member would be just fine.
You need to give the pointer something to point at, i.e. carrier.CurrCount = new int;
*(carrier.CurrCount)
This is dereferencing the pointer carrier.CurrCount, but you never initialized it. I suspect this is what you want:
carrier.CurrCount = new int(2);
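If the member really must stay a pointer, one option is to let a smart pointer own the allocation so that it is released automatically. This is a sketch under that assumption; the initial value of 0 is chosen arbitrarily here.

```cpp
#include <memory>

struct CountCarrier
{
    // Owns the int and frees it when the CountCarrier is destroyed.
    std::unique_ptr<int> CurrCount;

    CountCarrier() : CurrCount(new int(0)) {}
};

void IncreaseCount(CountCarrier* countCarrier)
{
    // Dereference the owned int and increment it in place.
    ++*countCarrier->CurrCount;
}
```

With this definition, *(carrier.CurrCount) = 2; followed by IncreaseCount(&carrier); leaves the count at 3, as the comments in the question intend.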
I seriously doubt that your program throws an exception at the line:
*(carrier.CurrCount) = 2;
While throwing an exception is certainly allowed behaviour, it seems much more likely that you encountered an access violation that caused the process to be killed by the operating system.
The problem is that you are using a pointer, but your pointer is not initialised to point at anything. This means that the result of the pointer dereference is undefined.
In this situation there does not seem to be any advantage to using a pointer at all. Your CurrCount member would work just as well if it was just a plain int.
If you are using C++, then you should make use of its facilities. Instead of correcting your code, I am showing here how the code should look:
struct CountCarrier
{
int CurrCount; // simple data member
CountCarrier(int count) : CurrCount(count) {} // constructor
CountCarrier& operator ++ () // overloaded operator
{
++ CurrCount;
return *this;
}
};
We are overloading operator ++, because you have only one data member. You can replace with some named method also, like void IncrementCount().
CountCarrier carrier(2);
++ carrier;
As Als said, you need to provide some memory for the code to work.
But why make it so complicated? You don't need any pointers for the code you have to work. The "modern C++" way looks more like this:
struct CountCarrier
{
public:
CountCarrier(int currCount) : currCount(currCount) {}
void IncreaseCount() { ++currCount; }
int GetCount() const { return currCount; }
private:
int currCount;
};
int main()
{
CountCarrier carrier(2); // Initialize carrier.currCount to 2
carrier.IncreaseCount(); // Increment carrier.currCount to 3
}
Note how much cleaner and less error prone that is. Like I said, pick up a good introductory C++ book and read through it.