Class member access through out-of-bounds access to array member - C++

I've seen a very experienced programmer do something like:
#include <iostream>
#include <string>
using namespace std;

typedef int BOOL;

BOOL is_max(int a) {
    return a & 3;
}

class Foo
{
public:
    inline Foo() {}
    inline ~Foo() {}
    int min[3];
    int max[3];
};

int main()
{
    Foo foo = Foo();
    for (int i = 0; i < 3; i++) {
        foo.min[i] = i;
        foo.max[i] = i + 3;
    }
    BOOL b = is_max(75); // b==3
    // Print out foo.max[1] (which is foo.min[4])
    cout << foo.min[1+b] << endl;
}
This is in a very computationally expensive part of the code, so I guess it is indeed faster than creating a branch with an if condition. Since both arrays (max and min) are of int type and contiguous in the class definition, this should always work.
Is there a reason one should avoid this approach? I know it is probably not the best for code readability and maintainability (e.g. if someone added a third member to the class definition in the wrong place, i.e. between min and max). A better approach would probably then be to have
int extrema[6];
Other than that, would there be other downsides to this approach? Could it somehow lead to premature termination or a segmentation fault?

There are two problems. First of all, it is an out-of-bounds access whenever b is larger than 1, and that is undefined behavior.
The other problem is that the compiler only sees foo.max[i] = i+3; and has no indication in your code that max is used at any point after that loop. So from the perspective of the optimizer, and because accessing max through min is not valid, it could assume that foo.max[i] = i+3; in the loop is useless and could theoretically optimize it away.
Based on a short look at the compiled output of GCC with optimizations turned on, this does indeed seem to be the case.
So even if there were no unknown padding involved and you could be sure about the memory layout, it is still definitely something you must not do.
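If you want to keep the branch-free lookup, the single-array layout the question itself suggests is well defined. A minimal sketch (assuming the same index trick; extrema is a name made up for illustration):

#include <iostream>

struct Foo {
    int extrema[6]; // extrema[0..2] play the role of min, extrema[3..5] of max
};

int main() {
    Foo foo{};
    for (int i = 0; i < 3; i++) {
        foo.extrema[i] = i;         // the "min" half
        foo.extrema[i + 3] = i + 3; // the "max" half
    }
    int b = 75 & 3; // 3, same result as the original is_max(75)
    std::cout << foo.extrema[1 + b] << std::endl; // in-bounds access, prints 4
}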

What is the point of including array's size in a function declaration?

I understand how arrays can be passed to functions in C++, but I don't understand the point of including an array's size in a function declaration when this size is ignored anyway, because what we are really passing to the function is a pointer to the array's first element.
For example, take the following code:
void ArrayTest1(int(&ints)[3])
{
    for (int i = 0; i < 3; i++) std::cout << ints[i];
}

void ArrayTest2(int ints[3])
{
    for (int i = 0; i < 3; i++) std::cout << ints[i];
}

void ArrayTest3(int ints[])
{
    for (int i = 0; i < 3; i++) std::cout << ints[i];
}

int main()
{
    int ints[] = { 1, 2 };
    //ArrayTest1(ints); //won't compile
    ArrayTest2(ints); //ok
    ArrayTest3(ints); //ok
    return 0;
}
From what I understand, the functions ArrayTest2 and ArrayTest3 are then identical.
Is the syntax used in ArrayTest2 only meant to make it clear that this function expects an array with 3 elements, and that passing an array of a different size can cause errors? I'm not new to programming, but I am new to C++, so I'd like to know what the point of such syntax is and when people use it.
One of C++’s major design goals was far reaching compatibility with C. Not supporting all aspects of C-array usage would have been seriously detrimental to this goal.
Your examples even give a good indication that compatibility is indeed the intent, and that C++ would work differently if that hadn’t been the case. Consider your ArrayTest1(). It takes a C-array by reference, something that does not exist in C. And in this case the array dimension is significant.
// C-array of length 3 taken by reference
void foo(int (&x)[3]);

void bar() {
    int a[4] = {1,2,3,4};
    foo(a); // Does not compile. Wrong array size.
}
Only ArrayTest2() and ArrayTest3() are exactly equivalent to:
void foo(int*);
Btw: Bjarne Stroustrup’s The Design and Evolution of C++ is a good read if you’re interested in why C++ is designed the way it is.
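As an aside, the by-reference form becomes genuinely useful once the array size is a template parameter, because the compiler then deduces the real length from the argument. A minimal sketch (printAll is a made-up name):

#include <cstddef>
#include <iostream>

// N is deduced from the argument, so the true length travels with the array
template <std::size_t N>
void printAll(const int (&xs)[N]) {
    for (std::size_t i = 0; i < N; i++)
        std::cout << xs[i] << ' ';
}

int main() {
    int a[] = {1, 2};
    int b[] = {1, 2, 3};
    printAll(a); // N deduced as 2
    printAll(b); // N deduced as 3
}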
I don't exactly know why C permits this (yes, this comes from C), but it is indeed pointless.
Worse than that, the "wrong" dimension would be very misleading to readers of your code.
We might argue that it was to keep the language grammar simple, as the grammar of a declarator already makes the dimension optional (consider int x[] = {1,2,3} vs int x[3] = {1,2,3}).
Maybe some people use it to document intent; I certainly never would.
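To illustrate how misleading the "wrong" dimension can be: the declared size is simply ignored, so an apparent size mismatch compiles without complaint (printFirst is a made-up name):

#include <iostream>

// The 10 here is ignored: the parameter really has type int*
void printFirst(int a[10]) {
    std::cout << a[0] << std::endl;
}

int main() {
    int tiny[2] = {7, 8};
    printFirst(tiny); // compiles fine despite the "size mismatch"
}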

What changed the value in int a? A simple C++ problem confused me

I solved this introductory problem on HackerRank.
Here is something strange I noticed while solving it.
The input is
4
1 4 3 2
I want to read the numbers into an array.
#include <cmath>
#include <cstdio>
#include <vector>
#include <iostream>
#include <algorithm>
using namespace std;

int main() {
    int a;
    int arr[a];
    scanf("%d",&a);
    for(int i=0; i<=a-1; i++){
        scanf("%d",&arr[i]);
        printf("i = %d, a = %d\n", i, a);
    }
    return 0;
}
I got the output:
i = 0, a = 4
i = 1, a = 4
i = 2, a = 4
i = 3, a = 2
The array is correct.
My question is: why is the value in int a changed? And why is it changed to 2 instead of 3?
If I rearrange the following lines:
int a;
scanf("%d",&a);
int arr[a];
the value in int a is not changed:
i = 0, a = 4
i = 1, a = 4
i = 2, a = 4
i = 3, a = 4
This is wrong:
int a;
int arr[a];
scanf("%d",&a);
Two problems: you are using a before you read its value from the user, and using a uninitialized is undefined behavior. The output of your code could be anything or nothing.
The other problem is that you cannot have a static array with a run-time size. Some compilers support variable length arrays as an extension, but they are not standard C++.
If you want to write C++, then you should actually use C++. Dynamically sized arrays are std::vector. Your code could look like this:
#include <vector>
#include <iostream>

int main() {
    int a;
    std::cin >> a;         // read user input before you use the value
    std::vector<int> x(a); // create vector with a elements
    for (size_t i=0; i < x.size(); ++i) {
        std::cin >> x[i];
        std::cout << "i = " << i << " a = " << a << "\n";
    }
}
My question is why the value in int a is changed? Why it is changed to 2 instead of 3?
Undefined behavior means just that: the behavior of your program is undefined. Compilers are not made to compile invalid code, and if you do compile invalid code then strange things can happen. Accessing arr[i] accesses some completely bogus memory address, and it can happen that writing to it overwrites the value of a. However, it is important to note that what happens here has little to do with C++, but rather with your compiler and the output of the compiler. If you really want to understand the details you need to look at the assembly, but that won't tell you anything about how C++ "works". You can do that with https://godbolt.org/, but it would probably be better to pay attention to your compiler's warnings and try to write correct code.
int a;
int arr[a];
scanf("%d",&a);
This means:
Declare an uninitialised a, with some unspecified value that is not permitted to be used
Declare an array with runtime bounds a (which doesn't exist), which is not permitted
Read user input into a.
Even if these steps were performed in the correct order, you cannot have runtime bounds in C++. Some compilers permit it as an extension, though I've found these to work haphazardly, and it's certainly going to result in strange effects when you use an uninitialised value to do it!
In this case, in practice, you probably have all sorts of weirdness going on in your stack, since you're accessing "non-existent" elements of arr, overwriting the variables on the stack that are "below" it, such as a. Though I caution that trying to analyse the results of undefined behaviour is kind of pointless, as they can change at any time for various black-boxed reasons.
Make a nice vector instead.

c++ changing implicit conversion from double to int

I have code which has a lot of conversions from double to int. The code can be seen as
double n = 5.78;
int d = n; // double implicitly converted to an int
The implicit conversion from double to int is a truncation, which means 5.78 will be saved as 5. However, it has been decided to change this behavior to a custom rounding.
One approach to such a problem would be to have your own DOUBLE and INT data types and use conversion operators, but alas my code is big and I am not allowed to make many changes. Another approach I thought of was to add 0.5 to each of the numbers, but again the code is big and I would be changing too much.
What would be a simple approach to changing the double-to-int conversion behaviour that impacts the whole code?
You can use uniform initialization syntax to forbid narrowing conversions:
double a;
int b{a}; // error
If you don't want that, you can use the std::round function from <cmath> (or its sisters std::ceil/std::floor/std::trunc):
int b = std::round(a);
If you want minimal diff changes, here's what you can do. Please note, though, that this is a bad solution (if it can be called a solution at all), and much more likely to leave you crashing and burning due to undefined behavior than to actually solve real problems.
Define your custom Int type that handles conversions the way you want it to:
class MyInt
{
//...
};
then evilly replace each occurrence of int with MyInt with the help of preprocessor black magic:
#define int MyInt
Problems:
if you accidentally change definitions in the standard library - you're in the UB-land
if you change the return type of main - you're in the UB-land
if you change the definition of a function but not its forward declarations - you're in the UB/linker-error land. Or in the silently-calling-a-different-overload land.
probably more.
Do something like this:
#include <iostream>
using namespace std;
int myConvert (double rhs)
{
    int answer = (int)rhs; //do something fancier here to meet your needs
    return answer;
}

int main()
{
    double n = 5.78;
    int d = myConvert(n);
    cout << "d = " << d << endl;
    return 0;
}
You can make myConvert as fancy as you want. Otherwise, you could define your own class for int (e.g. a MyInt class) and overload its conversions to do the right thing.
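For illustration, a minimal sketch of such a wrapper class (hypothetical MyInt; rounds instead of truncating when built from a double) might look like:

#include <cmath>
#include <iostream>

class MyInt {
public:
    MyInt(int i = 0) : value(i) {}
    // converting constructor: round rather than truncate
    MyInt(double d) : value(static_cast<int>(std::lround(d))) {}
    operator int() const { return value; } // convert back to plain int
private:
    int value;
};

int main() {
    MyInt d = 5.78;
    std::cout << d << std::endl; // prints 6, not 5
}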

Does a simple cast to perform a raw copy of a variable break strict aliasing?

I've been reading about strict aliasing quite a lot lately. The C/C++ standards say that the following code is invalid (undefined behavior, to be precise), since the compiler might have the value of a cached somewhere and would not recognize that it needs to update the value when I update b:
float *a;
...
int *b = reinterpret_cast<int*>(a);
*b = 1;
The standard also says that char* can alias anything, so (correct me if I'm wrong) the compiler would reload all cached values whenever a write access through a char* variable is made. Thus the following code would be correct:
float *a;
...
char *b = reinterpret_cast<char*>(a);
*b = 1;
But what about cases where pointers are not involved at all? For example, I have the following code, and GCC throws strict-aliasing warnings at me.
float a = 2.4;
int32_t b = reinterpret_cast<int&>(a);
What I want to do is just copy the raw value of a, so strict aliasing shouldn't apply. Is there a possible problem here, or is GCC just being overly cautious?
EDIT
I know there's a solution using memcpy, but it results in code that is much less readable, so I would like not to use that solution.
EDIT2
int32_t b = *reinterpret_cast<int*>(&a); also does not work.
SOLVED
This seems to be a bug in GCC.
If you want to copy some memory, you could just tell the compiler to do that:
Edit: added a function for more readable code:
#include <iostream>
using std::cout; using std::endl;
#include <string.h>

template <class T, class U>
T memcpy(const U& source)
{
    T temp;
    memcpy(&temp, &source, sizeof(temp));
    return temp;
}

int main()
{
    float f = 4.2;
    cout << "f: " << f << endl;
    int i = memcpy<int>(f);
    cout << "i: " << i << endl;
}
Edit: As user/GMan correctly pointed out in the comments, a full-featured implementation could check that T and U are PODs. However, given that the name of the function is still memcpy, it might be OK to rely on your developers treating it as having the same constraints as the original memcpy. That's up to your organization. Also, use the size of the destination, not the source. (Thanks, Oli.)
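If C++20 is available, std::bit_cast does exactly this bitwise copy with well-defined semantics and no hand-rolled helper:

#include <bit>
#include <cstdint>
#include <iostream>

int main() {
    float a = 2.4f;
    // bitwise reinterpretation, fully defined behavior (C++20)
    std::int32_t b = std::bit_cast<std::int32_t>(a);
    std::cout << b << std::endl;
}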
Basically the strict aliasing rule is "it is undefined to access memory with another type than its declared one, except as an array of characters". So GCC isn't being overcautious.
If this is something you need to do often, you can also just use a union, which IMHO is more readable than casting or memcpy for this specific purpose:
union floatIntUnion {
    float a;
    int32_t b;
};

int main() {
    floatIntUnion fiu;
    fiu.a = 2.4;
    int32_t &x = fiu.b;
    cout << x << endl;
}
I realize that this doesn't really answer your question about strict aliasing, but I think this method makes the code look cleaner and shows your intent better. (Note that reading a union member other than the one last written is itself technically undefined in C++, although major compilers support it as an extension.)
And also realize that even when the copy is done correctly, there is no guarantee that the int you get out will correspond to the same float on other platforms, so count out any network/file I/O of these floats/ints if you plan to create a cross-platform project.

i++ less efficient than ++i, how to show this?

I am trying to show by example that the prefix increment is more efficient than the postfix increment.
In theory this makes sense: i++ needs to be able to return the unincremented original value and therefore store it, whereas ++i can return the incremented value without storing the previous value.
But is there a good example to show this in practice?
I tried the following code:
int array[100];

int main()
{
    for(int i = 0; i < sizeof(array)/sizeof(*array); i++)
        array[i] = 1;
}
I compiled it using gcc 4.4.0 like this:
gcc -Wa,-adhls -O0 myfile.cpp
I did this again, with the postfix increment changed to a prefix increment:
for(int i = 0; i < sizeof(array)/sizeof(*array); ++i)
The result is identical assembly code in both cases.
This was somewhat unexpected. It seemed that by turning off optimizations (with -O0) I should see a difference that shows the concept. What am I missing? Is there a better example to show this?
In the general case, the post-increment will result in a copy where a pre-increment will not. Of course this will be optimized away in a large number of cases, and in the cases where it isn't, the copy operation will be negligible (i.e., for built-in types).
Here's a small example that shows the potential inefficiency of post-increment.
#include <stdio.h>

class foo
{
public:
    int x;

    foo() : x(0) {
        printf( "construct foo()\n");
    }

    foo( foo const& other) {
        printf( "copy foo()\n");
        x = other.x;
    }

    foo& operator=( foo const& rhs) {
        printf( "assign foo()\n");
        x = rhs.x;
        return *this;
    }

    foo& operator++() {
        printf( "preincrement foo\n");
        ++x;
        return *this;
    }

    foo operator++( int) {
        printf( "postincrement foo\n");
        foo temp( *this);
        ++x;
        return temp;
    }
};

int main()
{
    foo bar;

    printf( "\n" "preinc example: \n");
    ++bar;

    printf( "\n" "postinc example: \n");
    bar++;
}
The results from an optimized build (which actually removes a second copy operation in the post-increment case due to RVO):
construct foo()
preinc example:
preincrement foo
postinc example:
postincrement foo
copy foo()
In general, if you don't need the semantics of the post-increment, why take the chance that an unnecessary copy will occur?
Of course, it's good to keep in mind that a custom operator++() - either the pre or post variant - is free to return whatever it wants (or even do whatever it wants), and I'd imagine that there are quite a few that don't follow the usual rules. Occasionally I've come across implementations that return void, which makes the usual semantic difference go away.
You won't see any difference with integers. You need to use iterators or something where post and prefix really do something different. And you need to turn all optimisations on, not off!
I like to follow the rule of "say what you mean".
++i simply increments. i++ increments and has a special, non-intuitive result of evaluation. I only use i++ if I explicitly want that behavior, and use ++i in all other cases. If you follow this practice, when you do see i++ in code, it's obvious that post-increment behavior really was intended.
Several points:
First, you're unlikely to see a major performance difference in any way
Second, your benchmarking is useless if you have optimizations disabled. What we want to know is if this change gives us more or less efficient code, which means that we have to use it with the most efficient code the compiler is able to produce. We don't care whether it is faster in unoptimized builds, we need to know if it is faster in optimized ones.
For built-in datatypes like integers, the compiler is generally able to optimize the difference away. The problem mainly occurs for more complex types with overloaded increment operators, where the compiler can't trivially see that the two operations would be equivalent in the context.
You should use the code that most clearly expresses your intent. Do you want to "add one to the value", or "add one to the value, but keep working on the original value a bit longer"? Usually the former is the case, and then a pre-increment expresses your intent better.
If you want to show the difference, the simplest option is simply to implement both operators, and point out that one requires an extra copy and the other does not.
This code and its comments should demonstrate the differences between the two.
// Stand-in definition so the example compiles: an expensive-to-copy member
struct some_ridiculously_big_type {
    char payload[1000000];
};

struct a {
    int index;
    some_ridiculously_big_type big;
    //etc...
};

// prefix ++a: increments in place and returns the same object, no copy
a& operator++ (a& _a) {
    ++_a.index;
    return _a;
}

// postfix a++: must copy the whole object before incrementing
a operator++ (a& _a, int) {
    a temp(_a);
    _a.index++;
    return temp;
}

// now the program
int main (void) {
    a my_a{};

    // prefix:
    // 1. updates my_a.index
    // 2. copies my_a.index to b
    int b = (++my_a).index;

    // postfix:
    // 1. creates a copy of my_a, including the *big* member
    // 2. updates my_a.index
    // 3. copies index out of the **copy** of my_a created in step 1
    int c = (my_a++).index;
}
You can see that the postfix has an extra step (step 1) which involves creating a copy of the object. This has implications for both memory consumption and runtime. That is why prefix is more efficient than postfix for non-basic types.
Depending on some_ridiculously_big_type and also on whatever you do with the result of the increment, you'll be able to see the difference either with or without optimizations.
In response to Mihail, this is a somewhat more portable version of his code:
#include <cstdio>
#include <ctime>
using namespace std;

#define SOME_BIG_CONSTANT 100000000
#define OUTER 40

int main( int argc, char * argv[] ) {
    int d = 0;
    time_t now = time(0);
    if ( argc == 1 ) {
        for ( int n = 0; n < OUTER; n++ ) {
            int i = 0;
            while(i < SOME_BIG_CONSTANT) {
                d += i++;
            }
        }
    }
    else {
        for ( int n = 0; n < OUTER; n++ ) {
            int i = 0;
            while(i < SOME_BIG_CONSTANT) {
                d += ++i;
            }
        }
    }
    int t = time(0) - now;
    printf( "%d\n", t );
    return d % 2;
}
The outer loops are there to allow me to fiddle the timings to get something suitable on my platform.
I don't use VC++ any more, so I compiled it (on Windows) with:
g++ -O3 t.cpp
I then ran it by alternating:
a.exe
and
a.exe 1
My timing results were approximately the same for both cases. Sometimes one version would be faster by up to 20% and sometimes the other. This I would guess is due to other processes running on my system.
Try using while, or do something with the returned value, e.g.:
#include <windows.h> // added: for DWORD and GetTickCount()
#include <tchar.h>   // added: for _TCHAR and _tmain
#include <stdio.h>

#define SOME_BIG_CONSTANT 1000000000

int _tmain(int argc, _TCHAR* argv[])
{
    int i = 1;
    int d = 0;

    DWORD d1 = GetTickCount();
    while(i < SOME_BIG_CONSTANT + 1)
    {
        d += i++;
    }
    DWORD t1 = GetTickCount() - d1;
    printf("%d", d);
    printf("\ni++ > %d <\n", t1);

    i = 0;
    d = 0;
    d1 = GetTickCount();
    while(i < SOME_BIG_CONSTANT)
    {
        d += ++i;
    }
    t1 = GetTickCount() - d1;
    printf("%d", d);
    printf("\n++i > %d <\n", t1);

    return 0;
}
Compiled with VS 2005 using /O2 or /Ox, and tried on both my desktop and my laptop.
On the laptop I stably get something like the following; on the desktop the numbers are a bit different (but the ratio is about the same):
i++ > 8xx <
++i > 6xx <
xx means that the numbers differ from run to run, e.g. 813 vs 640 - still around a 20% speed-up.
And one more point - if you replace "d +=" with "d =" you will see a nice optimization trick:
i++ > 935 <
++i > 0 <
However, it's quite specific. But after all, I don't see any reason to change my mind and conclude there is no difference :)
Perhaps you could just show the theoretical difference by writing out both versions with x86 assembly instructions? As many people have pointed out before, the compiler will always make its own decisions on how best to compile/assemble the program.
If the example is meant for students not familiar with the x86 instruction set, you might consider using the MIPS32 instruction set -- for some odd reason many people seem to find it to be easier to comprehend than x86 assembly.
Ok, all this prefix/postfix "optimization" is just... some big misunderstanding.
The major idea is that i++ has to return its original value and thus requires copying that value.
This may be correct for some inefficient implementations of iterators. However, in 99% of cases, even with STL iterators, there is no difference, because the compiler knows how to optimize it and the actual iterators are just pointers wrapped in a class. And of course there is no difference for primitive types like integers or pointers.
So... forget about it.
EDIT: Clarification
As I mentioned, most STL iterator classes are just pointers wrapped in classes, with all member functions inlined, which allows the compiler to optimize out such an irrelevant copy.
And yes, if you have your own iterators without inlined member functions, then it may work slower. But you should understand what the compiler does and what it does not.
As a small proof, take this code:
#include <vector>
#include <set>
using namespace std;

int sum1(vector<int> const &v)
{
    int n = 0; // initialized; the original left n uninitialized
    for(auto x=v.begin();x!=v.end();x++)
        n+=*x;
    return n;
}

int sum2(vector<int> const &v)
{
    int n = 0;
    for(auto x=v.begin();x!=v.end();++x)
        n+=*x;
    return n;
}

int sum3(set<int> const &v)
{
    int n = 0;
    for(auto x=v.begin();x!=v.end();x++)
        n+=*x;
    return n;
}

int sum4(set<int> const &v)
{
    int n = 0;
    for(auto x=v.begin();x!=v.end();++x)
        n+=*x;
    return n;
}
Compile it to assembly and compare sum1 with sum2, and sum3 with sum4...
I can just tell you: GCC gives exactly the same code with -O2.
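To reproduce the comparison, one way (assuming the code is saved as sums.cpp) is to have the compiler emit assembly directly and diff the functions:
g++ -O2 -S sums.cpp -o sums.s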