I'm wondering if the arrays a[] and b[] in the code below are contiguous in memory:
int main() {
int a[3];
int b[3];
return 0;
}
a[0], a[1] and a[2] should be contiguous, and the same for b, but are there any guarantees on where b will be allocated in relation to a?
If not, is there any way to force a and b to be contiguous with each other, i.e. so that they are allocated next to each other on the stack?
No, C++ makes no guarantee about the location of these variables in memory. One or both may not even be in memory (for example if they are optimised out)!
In order to obtain a relative ordering between the two, at least, you would have to encapsulate them into a struct or class and, even then, there would be padding/alignment to consider.
There's no guarantee that individual variables will be next to each other on the stack.
In your example you could "cheat" and simulate what you're looking for by doing this:
int main()
{
int buffer[6];
int *a = buffer;
int *b = buffer+3;
return 0;
}
Now, when you write to a[3] you'll actually write to b[0].
Related
Let's say we have the following two pieces of code:
int *a = (int *)malloc(sizeof(*a));
int *b = (int *)malloc(sizeof(*b));
And
int *a = (int *)malloc(2 * sizeof(*a));
int *b = a + 1;
Both of them allocate two integers on the heap and (assuming normal usage) they should be equivalent. The first seems to be slower, as it calls malloc twice, while the second can result in more cache-friendly code. The second, however, is possibly insecure, as we can accidentally overwrite the value b points to just by incrementing a and writing to the resulting pointer (or someone malicious can instantly change the value of b just by knowing where a is).
It's possible that the above claims are not true (for example the speed is questioned here: Minimizing the amount of malloc() calls improves performance?) but my question is just: Can the compiler do this type of transformation or is there something fundamentally different between the two according to the standard? If it is possible, what compiler flags (let's say gcc) can allow it?
In reality, no, the compiler will never combine the 2 malloc() calls into a single malloc() call automatically. Each call to malloc() returns the address of a new memory block, there is no guarantee that the allocated blocks will be located anywhere close to each other, and each allocated block must be free()'d individually. So no compiler will ever assume anything about the relationship between multiple allocated blocks and try to optimize their allocations for you.
Now, it is possible that in a very simplified use case, where the allocation and deallocation are in the same scope and the transformation can be proven safe, a compiler vendor might decide to try to optimize, i.e.:
void doIt()
{
int *a = (int *)malloc(sizeof(*a));
int *b = (int *)malloc(sizeof(*b));
...
free(a);
free(b);
}
Could become:
void doIt()
{
void *ptr = malloc(sizeof(int) * 2);
int *a = (int *)ptr;
int *b = a + 1;
...
free(ptr);
}
But in reality, no compiler vendor will actually attempt to do this. It is not worth the effort, or the risk, for such little gain. And it would not work in more complex scenarios anyway, eg:
void doIt()
{
int *a = (int *)malloc(sizeof(*a));
int *b = (int *)malloc(sizeof(*b));
...
UseAndFree(a, b);
}
void UseAndFree(int *a, int *b)
{
...
free(a);
free(b);
}
No, it can't, because the compiler (in general) doesn't know when a and b might get free()'d, and if it allocates them both as part of a single allocation, then it would need to free() them both at the same time also.
There are a number of reasons why this will likely never happen, but the most important is lifetimes: allocations made independently can be freed independently, whereas allocations made together are locked to the same lifetime.
This sort of nuance is best expressed by the developer rather than determined by the compiler.
Is the second "insecure" in that you can overwrite values? In C, and by extension C++, the language does not protect you from bad programming. You are free to shoot yourself in the foot at any time, using any means necessary:
int a;
int b;
int* p = &a;
p[1] = 9; // Bullet, meet foot
(&b)[-1] = 9; // Why not?
If you want to allocate N of something by all means use calloc() to express it, or an appropriately sized malloc(). Doing individual allocations is pointless unless there's a good reason.
Normally you wouldn't allocate a single int, that's kind of useless, but there are cases where that might be the only reasonable option. Typically it's larger blocks of things, like a full struct or a character buffer.
First of all:
int *a = (int *)malloc(8);
int *b = a + 4;
Is not what you think. You want:
int *a = malloc(sizeof(*a) * 2);
int *b = a + 1;
It shows that pointer arithmetic is something you need to learn.
Secondly: the compiler will not combine separate malloc calls into one for you. What you are trying to achieve is a micro-optimization. If you want a larger chunk of memory, simply allocate it as an array.
int *a = malloc(sizeof(*a) * 2);
a[0] = 5;
a[1] = 6;
/* some other code */
free(a);
Do not use "magic" numbers in malloc, only the sizeof of the objects. Do not cast the result of malloc (in C; C++ requires the cast).
I've done exactly that with a bignum library, but you only free the one pointer.
//initialization every time program runs
extern bignum_t *scratch00; //these are useful for taylor series, etc.
extern bignum_t *scratch01;
extern bignum_t *scratch02;
.
.
.
bignum_t *bn_malloc(int bignums)
{
return(malloc(bignums * bn_numbytes));
}
.
.
.
//bignums specific to the program being written at the moment
bignum_t *numerator;
bignum_t *denom;
bignum_t *denom_add;
bignum_t *accum;
bignum_t *term;
.
.
.
numerator = bn_malloc(1);
denom = bn_malloc(1);
denom_add = bn_malloc(1);
accum = bn_malloc(1);
term = bn_malloc(1);
On VS (release), I run the following:
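#include <stdio.h>
int main(void)
{
char b[] = "123";
char a[] = "1234567";
/* %p is the correct conversion for pointers; %x expects an unsigned int */
printf("%p %p\n", (void *)b, (void *)a);
return 0;
}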
I can see that the memory address of a is b+3 (the length of the string), which shows that the memory is allocated with no gaps and guarantees that the least memory is used.
So I now kind of believe that all compilers will do so.
I want to check this guess here. Can somebody give me a more formal proof, or tell me that my guess is based on a coincidence?
No, it's not guaranteed that there will always be perfect packing of data.
For example, I compiled and ran this code with g++, and the difference is 8.
You can read more about this here.
tl;dr: Compilers can place objects only at addresses divisible by some constant (often the machine-word length, though it depends on the type), because the processor works more easily with such addresses.
UPD: one interesting example about alignment:
#include <iostream>
using namespace std;
struct A
{
int a;
char b;
int c;
char d;
};
struct B
{
int a;
int c;
char b;
char d;
};
int main()
{
cout << sizeof(A) << " " << sizeof(B) << "\n";
}
For me, it prints
16 12
There is no guarantee what addresses will be chosen for each variable. Different processors may have different requirements or preferences for alignment of variables for instance.
Also, I hope there were at least 4 bytes between the addresses in your example. "123" requires 4 bytes - the extra byte being for the null terminator.
Try reversing the order of declaring a[] and b[], and/or increase the length of b.
You are making a very big assumption about how storage is allocated. Depending on your compiler the string literals might get stored in a literal pool that is NOT on the stack. Yet a[] and b[] do occupy elements on the stack. So, another test would be to add int c and compare those addresses.
I want to do something like this below:
int main() {
int a[10];
int *d = generateArrayOfSize(10); // This generates an array of size 10 on the heap
a = d;
print(a); // Prints the first 10 elements of array.
}
However, the above code gives a compilation error (incompatible types in assignment of ‘int*’ to ‘int [10]’).
What can I do to make the above code work?
Arrays are non-assignable and non-copyable, so you'd have to copy each element by hand (in a loop), or using std::copy.
If you're using C++, then use C++ arrays rather than C style arrays and pointers. Here's an example
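#include <array>
#include <cstddef>
#include <iostream>
template<std::size_t N>
std::array<int, N> generateArrayOfSize()
{
std::array<int, N> a;
// Fill with 0, 1, ..., N-1; the index type matches N to avoid a signed/unsigned comparison.
for (std::size_t n = 0; n < N; ++n)
a[n] = static_cast<int>(n);
return a;
}
template<std::size_t N>
void print(std::array<int, N> const &a)
{
for (auto num : a)
std::cout << num << " ";
}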
int main() {
std::array<int, 10> a;
std::array<int, 10> d = generateArrayOfSize<10>();
a = d;
print(a); // Prints the first 10 elements of array.
}
which outputs 0 1 2 3 4 5 6 7 8 9
Arrays are not pointers.
You can't do :
int a[10];
int *d;
a = d;
Change it to :
int *a;
int *d;
a = d;
Main differences between arrays and pointers in C programming :
Pointer                                    | Array
-------------------------------------------|-------------------------------------------
A pointer is a place in memory that keeps  | An array is a single, preallocated chunk
the address of another place in memory.    | of contiguous elements (all of the same
                                           | type), fixed in size and location.
-------------------------------------------|-------------------------------------------
A pointer can point to dynamically         | Arrays are static in nature. Once memory
allocated memory. In this case, the memory | is allocated, it cannot be resized or
allocation can be resized or freed later.  | freed dynamically.
-------------------------------------------|-------------------------------------------
You have a quite good explanation here : https://stackoverflow.com/a/7725410/1394283
An array is not a pointer (although a name of an array often decays to a pointer to its first element).
To make the above code to work, you can declare a as a pointer: int *a;. The print function takes an int* (or a decayed array) anyway.
If you really want to have two arrays and copy contents from one array to another, you should copy the data in a loop.
When you assign an array to a pointer, you have to dereference the pointer (*ptr) to print the value it points to. Otherwise, as with print(d) in your case, which is like cout << d in C++, it will only print the address of d[0].
int ary[5]={1,2,3,4,5};
int *d;
d=ary;
for(int i=0;i<5;i++)
cout<<*(d+i);
Because array names are non-modifiable. So you can't do
a = d;
Declare it as a pointer like this:
int *a;
Little rusty with my C++ but try something like this.
int main() {
int *a;
int *d = generateArrayOfSize(10); // This generates an array of size 10 on the heap
a = d;
print(a); // Prints the first 10 elements of array.
}
In C, when Thing X[10]; is declared, X converts in most expressions to the address of its first element (i.e. &X[0]). So you can say:
Thing *Y = X; // Equivalent to (Thing *Y = &X[0];)
C++ keeps this conversion, but the compiler also "remembers" that the Thing array X has 10 elements: its type is Thing[10], not Thing *. Imagine we add Thing Z[20]; to the discussion.
Thing *Y = X; and Thing *Y = Z; are both allowed, because the length is dropped when the array decays to a pointer. Arrays of length 10 and 20 are still very different (ahem) "things", as a quick look at a 2D array will reveal, which is why the reverse direction (assigning a pointer, or another array, to an array) is rejected.
If you prefer to make the conversion explicit, use
Thing *Y = &X[0]; and Thing *Y = &Z[0]; instead.
This approach has two advantages. It does what is wanted, and it actually compiles. :-)
I want to store a state variable composed of multiple POD structs of various types in a single memory area. Since the combination of structs that makes up the state variable is decided at run time, I cannot just place them into a surrounding struct or class. I also want the number of memory allocations to be as low as possible.
What is the best way to do it? Is the following code legal/portable or can it cause alignment errors on some platforms / with some compilers?
struct TestA {
int a;
short b;
};
struct TestB {
int c;
float d;
char e;
};
int main() {
void* mem = new uint8_t[sizeof(TestA) + sizeof(TestB)];
TestA* a1 = (TestA*) mem;
a1->a = a1->b = 42;
a1++;
TestB* b = (TestB*) a1;
b->c = 5;
b->d = 23.f;
b->e = 'e';
}
What you're trying to do is essentially "placement new." So all caveats apply here too. If the memory location is not aligned properly for the given type, then you're into undefined behavior. In your code:
a1++;
is not guaranteed to give an address that's properly aligned for a TestB. So your code is not standard-conformant.
Why does this program only work when I initialize a and b?
I want to pass it without initializing a and b, for example:
numChange(10,15);
Is this possible?
#include <iostream>
using namespace std;
void numChange(int *x,int *y)
{
*x = 99;
*y = 77;
}
int main()
{
numChange(10,15);
//int a=10;
//int b=15;
//numChange(&a,&b);
cout<<a<<" , "<<b<<endl;
return 0;
}
Because you have defined your function to receive pointers, but when you call that function you are trying to pass an int.
The compiler is expecting memory addresses and you are trying to pass constants.
It does not make sense, you are trying to do something like 10 = 99; 15 = 77;?
numChange(10,15);
//int a=10;
//int b=15;
It seems that you are hoping that a = 10 = 99 and b = 15 = 77.
If this were possible, then after the call numChange(10,15); I could never give a variable the value 10, because 10 would be "pointing" to 99 (it is not).
Recall: a pointer is an integer containing a location in memory.
This:
int a, b;
...
a = b;
copies the integer stored at the memory location reserved for 'b' to
the memory location reserved for 'a'.
This:
int *a, b;
...
a = &b;
stores the location of 'b' in 'a'. Following it with this:
*a = 42;
will store 42 in the memory location stored in 'a', which is the
variable 'b'.
Now, let's look at your code. This:
void numChange(int *x,int *y)
tells the compiler that 'numChange' will be called with two
pointers--that is, memory addresses. This part:
*x = 99;
*y = 77;
then stores two integers at the locations given in 'x' and 'y'.
When you call:
numChange(10,15);
the arguments are integers instead of memory locations. Under
the hood, memory locations are also integers, so a C compiler will
typically convert the arguments to pointers. Effectively, it's doing this:
numChange((int *)10, (int*)15);
(A C compiler should issue a warning when this happens, since it's
almost never a good idea. A C++ compiler rejects the implicit
conversion, so in C++ the call only compiles with the explicit casts.)
Basically, your call to 'numChange' tells it that there are integer
variables at memory addresses 10 and 15, and 'numChange' carries on
and stores integers at those memory locations. Since there aren't
variables (that we know of) at those locations, this code either
overwrites some other data somewhere or, more likely on a modern
system, crashes.
Meanwhile, this code:
int a=10;
int b=15;
numChange(&a,&b);
creates two integer variables and then passes their addresses in
memory to 'numChange'. BTW, you don't actually need to initialize
them. This works too:
int a, b;
numChange(&a,&b);
What's important is that the variables are created (and the compiler
sets aside RAM for them) and that their locations are then passed to
'numChange'.
(One aside: I'm treating variables as always being stored in RAM.
It's safe to think of them this way but modern compilers will try to
store them in CPU registers as much as possible for performance
reasons, copying them back into RAM when needed.)