Confusion in Memory Addressing with Arrays - C++

Let's have an array of type int:
int arr[5];
Now,
if arr[0] is at address 100, then why do we have
arr[1] at address 102,
arr[2] at address 104, and so on,
instead of
arr[1] at address 101,
arr[2] at address 102, and so on?
Is it because an integer takes 2 bytes?
Does each memory block have a capacity of 1 byte (whether it is a 32-bit or a 64-bit processor)?

Your first example is consistent with 16-bit ints.
As to your second example (&arr[0]==100, &arr[1]==101, &arr[2]==103), this can't possibly be a valid layout since the distance between consecutive elements varies between the first pair and the second.

It is because an integer takes 2 bytes?
Yes
Apparently on your system, int has a size of 2 bytes. On other systems this might not be the case: int is usually 4 bytes, but 2 and 8 are also possible.

You are right: on your machine sizeof(int) is 2, so each element in the array is 2 bytes away from the previous one.
-------------------------------
|100|101|102|103|104|105|106|...
-------------------------------
 arr[0]   arr[1]   arr[2]
Each cell above is one byte; with 2-byte ints, arr[0] occupies bytes 100-101, arr[1] occupies 102-103, and so on.
There is no guarantee regarding the size of int. The C++ spec only requires that sizeof(char) == 1 and sizeof(char) <= sizeof(short) <= sizeof(int). The actual size depends on the processor, compiler, etc.

Related

How does this code work without any errors?

I have this code I wrote that sets the array to 0:
int arr[4];
memset(arr, 0, sizeof (arr));
Very simple, but how does the code work without any errors even though sizeof(arr) = 16 (array size 4 * 4 bytes per int), while the size I used when I declared the array is 4? How can memset set 16 bits to zero when the array I passed as a parameter has a size of 4?
I used memset(arr, 0, sizeof(arr)/sizeof(*arr)); to get the real size of the array, and the result was accurate: it gives me 4. But how does the above code work correctly?
memset sets 16 bytes (not bits) to 0. This is correct because the size of your array is 16 bytes, as you correctly stated (4 integers x 4 bytes per integer). sizeof knows the number of elements in your array and it knows the size of each element. As you can see in the docs, the third argument of memset takes the number of bytes, not the number of elements. http://www.cplusplus.com/reference/cstring/memset/
But be careful using sizeof() where you pass an array as int x[] or int* x. For example, the following piece of code will not do what you expect:
void foo(int arr[]) {
    auto s = sizeof(arr); // careful! arr has decayed to int*, so this is the size of a pointer, not of the array
    ...
}
int a[10];
foo(a);
The third parameter is number of bytes. Which is 4*4=16 for your case.
memset
Actually the first solution is the correct one.
The function memset takes as third parameter the number of bytes to set to zero.
num:
Number of bytes to be set to the value.
sizeof returns the number of bytes occupied by the expression.
In your case sizeof(arr) = 16, which is exactly the number of bytes expected by the memset function.
Your second solution:
memset(arr, 0, sizeof(arr)/sizeof(*arr)); // Note that: sizeof(arr)/sizeof(*arr) == 16 / 4 (generally) == 4 bytes
will set only the first 4 bytes to zero, that is the first integer of the array. So that solution is wrong if your intent is to set each element of the array to zero.

C Pointers with 2D arrays [duplicate]

Here is the problem program:
#include <stdio.h>
int main()
{
int apricot[2][3][5];
int (*r)[5]=apricot[0];
int *t=apricot[0][0];
printf("%p\n%p\n%p\n%p\n",r,r+1,t,t+1);
}
The output of it is:
# ./a.out
0xbfa44000
0xbfa44014
0xbfa44000
0xbfa44004
I think t's step should be 5 because t corresponds to the last dimension, and the fact matches (0xbfa44004 - 0xbfa44000 + 1 = 5).
But r's step is 0xbfa44014 - 0xbfa44000 + 1 = 21. I think it should be 3*5 = 15, because 3 and 5 are the last two dimensions. So why the difference of 21?
r is a pointer to an array of 5 ints.
Assuming 1 int is 4 bytes on your system (from t and t+1), then "stepping" that pointer by 1 (r+1) means an increase in 5*4 = 20 bytes. Which is what you get here.
You get tricked by the C syntax. r is an array pointer to an array of int, t is a plain int pointer. When doing any kind of pointer arithmetic, you do it in the unit pointed at.
Thus t+1 means the address of t + the size of one pointed-at object. Since t points at int and int is 4 bytes on your system, you get an address 4 bytes from t.
The same rule applies to r. It is a pointer to an array of 5 int. When you do pointer arithmetic on it by r+1, you get the size of the pointed-at object, which has size 5*sizeof(int), which happens to be 20 bytes on your computer. So therefore r+1 gives you an address 20 bytes (==14 hex) from r.

Difference between various ways of using memset function

What is the difference between the following three commands?
Suppose we declare an array arr having 10 elements.
int arr[10];
Now the commands are:
Command 1:
memset(arr,0,sizeof(arr));
and
Command 2:
memset(arr,0,10*sizeof(int));
These two commands run smoothly in a program, but the following command does not:
Command 3:
memset(arr,0,10);
So what is the difference between the 3 commands?
Case #1: sizeof(arr) returns 10 * sizeof(int)
Case #2: sizeof(int) * 10 returns the same thing
Case #3: 10 returns 10
An int takes up more than one byte (usually 4 on 32 bit). So if you did 40 for case three, it would probably work. But never actually do that.
memset's 3rd parameter is how many bytes to fill. So here you're telling memset to set 10 bytes:
memset(arr,0,10);
But arr isn't necessarily 10 bytes. (In fact, it's not.) You need to know how many bytes are in arr, not how many elements.
The size of an int is not guaranteed to be 1 byte. On most modern PC-type hardware, it's going to be 4 bytes.
You should not assume the size of any particular datatype, except char which is guaranteed to be 1 byte exactly. For everything else, you must determine the size of it (at compile time) by using sizeof.
memset(arr,0,sizeof(arr)) fills arr with sizeof(arr) zeros -- as bytes. sizeof(arr) is correct in this case, but beware using this approach on pointers rather than arrays.
memset(arr,0,10*sizeof(int)) fills arr with 10*sizeof(int) zeros, again as bytes. This is once again the correct size in this case, but it is more fragile than the first: what if arr does not contain 10 elements, or the type of each element is not int? For example, suppose you find you are getting overflow and change int arr[10] to long long arr[10]; the hard-coded 10*sizeof(int) would then be wrong.
memset(arr,0,10) fills the first 10 bytes of arr with zeros. This clearly is not what you want.
None of these is very C++-like. It would be much better to use std::fill, which you get from the <algorithm> header. For example, std::fill(a, a + 10, 0).

Weird sizeof() result

When I run sizeof(r) on my Mac, it says sizeof(r) = 1. My understanding is that the size of a union is the size of its largest element. In this case, shouldn't the largest element be the struct s?
union
{
struct
{
char i:1;
char j:2;
char m:3;
}s;
char ch;
}r;
Your union is composed of two parts: a struct and a character. The size of the union, since the members share memory, is the size of the largest element, plus the size of any padding the compiler adds (which in your case is 0 bytes).
First, let's see the size ideone reports for each:
http://ideone.com/LAhop
Okay, both are 1. Therefore, the union's size must be 1 as well.
The struct is composed of bitfields. One is 1 bit, one is 2, and one is 3. This gives a total of 6 out of the 8 bits in one byte. Since it has to be at least one byte anyway (bitfields aren't really sized in bits), the size is 1.
As for char, here's what the C++11 standard says in § 3.9.1/1 [basic.fundamental]:
Objects declared as characters (char) shall be large enough to store any member
of the implementation’s basic character set.
For pretty much every platform, this is one byte.
This is one byte.
The struct s takes up 1 + 2 + 3 = 6 bits, which fit into 1 byte, and it is unioned with a char, which is 1 byte. Hence the answer: 1 byte.

Understanding sizeof(char) in 32 bit C compilers

sizeof(char) always returns 1 in a 32-bit GCC compiler.
But since the basic block size in a 32-bit compiler is 4, how does char occupy a single byte when the basic size is 4 bytes?
Considering the following :
struct st
{
int a;
char c;
};
sizeof(st) returns 8, in agreement with the default block size of 4 bytes (since 2 blocks are allotted).
I can never understand why sizeof(char) returns 1 when it is allotted a block of size 4.
Can someone please explain this? I would be very thankful for any replies explaining it!
EDIT: The typo of 'bits' has been changed to 'bytes'. I apologize to the person who made the first edit; I rolled back the edit since I did not notice the change you made.
Thanks to all those who pointed out that it had to be changed, especially @Mike Burton for downvoting the question, and to @jalf, who seemed to jump to conclusions about my understanding of the concepts!
sizeof(char) is always 1. Always. The 'block size' you're talking about is just the native word size of the machine - usually the size that will result in most efficient operation. Your computer can still address each byte individually - that's what the sizeof operator is telling you about. When you do sizeof(int), it returns 4 to tell you that an int is 4 bytes on your machine. Likewise, your structure is 8 bytes long. There is no information from sizeof about how many bits there are in a byte.
The reason your structure is 8 bytes long rather than 5 (as you might expect), is that the compiler is adding padding to the structure in order to keep everything nicely aligned to that native word length, again for greater efficiency. Most compilers give you the option to pack a structure, either with a #pragma directive or some other compiler extension, in which case you can force your structure to take minimum size, regardless of your machine's word length.
char is size 1, since that's the smallest access size your computer can handle - for most machines an 8-bit value. The sizeof operator gives you the size of all other quantities in units of how many char objects would be the same size as whatever you asked about. The padding (see link below) is added by the compiler to your data structure for performance reasons, so it is larger in practice than you might think from just looking at the structure definition.
There is a wikipedia article called Data structure alignment which has a good explanation and examples.
It is structure alignment with padding: c uses 1 byte, and the other 3 bytes are unused.
Sample code demonstrating structure alignment:
struct st
{
int a;
char c;
};
struct stb
{
int a;
char c;
char d;
char e;
char f;
};
struct stc
{
int a;
char c;
char d;
char e;
char f;
char g;
};
std::cout<<sizeof(st) << std::endl; //8
std::cout<<sizeof(stb) << std::endl; //8
std::cout<<sizeof(stc) << std::endl; //12
The size of the struct is bigger than the sum of its individual members, since the 32-bit compiler rounds it up to a multiple of 4 bytes. These results may differ between compilers, especially on a 64-bit compiler.
First of all, sizeof returns a number of bytes, not bits. sizeof(char) == 1 tells you that a char is one byte long (eight bits on virtually every modern platform). All of the fundamental data types in C are at least one byte long.
Your structure returns a size of 8. This is a sum of three things: the size of the int, the size of the char (which we know is 1), and the size of any extra padding that the compiler added to the structure. Since many implementations use a 4-byte int, this would imply that your compiler is adding 3 bytes of padding to your structure. Most likely this is added after the char in order to make the size of the structure a multiple of 4 (a 32-bit CPU access data most efficiently in 32-bit chunks, and 32 bits is four bytes).
Edit: Just because the block size is four bytes doesn't mean that a data type can't be smaller than four bytes. When the CPU loads a one-byte char into a 32-bit register, the value will be sign-extended automatically (by the hardware) to make it fill the register. The CPU is smart enough to handle data in N-byte increments (where N is a power of 2), as long as it isn't larger than the register. When storing the data on disk or in memory, there is no reason to store every char as four bytes. The char in your structure happened to look like it was four bytes long because of the padding added after it. If you changed your structure to have two char variables instead of one, you should see that the size of the structure is the same (you added an extra byte of data, and the compiler added one fewer byte of padding).
All object sizes in C and C++ are defined in terms of bytes, not bits. A byte is the smallest addressable unit of memory on the computer. A bit is a single binary digit, a 0 or a 1.
On most computers, a byte is 8 bits (so a byte can store values from 0 to 255), although computers exist with other byte sizes.
A memory address identifies a byte, even on 32-bit machines. Addresses N and N+1 point to two subsequent bytes.
An int, which is typically 32 bits, covers 4 bytes, meaning that 4 different memory addresses exist that each point to part of the int.
In a 32-bit machine, all the 32 actually means is that the CPU is designed to work efficiently with 32-bit values, and that an address is 32 bits long. It doesn't mean that memory can only be addressed in blocks of 32 bits.
The CPU can still address individual bytes, which is useful when dealing with chars, for example.
As for your example:
struct st
{
int a;
char c;
};
sizeof(st) returns 8 not because all structs have a size divisible by 4, but because of alignment. For the CPU to read an integer efficiently, it must be located at an address that is divisible by the size of the integer (4 bytes). So an int can be placed at address 8, 12 or 16, but not at address 11.
A char only requires its address to be divisible by the size of a char (1), so it can be placed on any address.
So in theory, the compiler could have given your struct a size of 5 bytes... Except that this wouldn't work if you created an array of st objects.
In an array, each object is placed immediately after the previous one, with no padding. So if the first object in the array is placed at an address divisible by 4, then the next object would be placed at a 5 bytes higher address, which would not be divisible by 4, and so the second struct in the array would not be properly aligned.
To solve this, the compiler inserts padding inside the struct, so its size becomes a multiple of its alignment requirement.
Not because it is impossible to create objects that don't have a size that is a multiple of 4, but because one of the members of your st struct requires 4-byte alignment, and so every time the compiler places an int in memory, it has to make sure it is placed at an address that is divisible by 4.
If you create a struct of two chars, it won't get a size of 4. It will usually get a size of 2, because when it contains only chars, the object can be placed at any address, and so alignment is not an issue.
sizeof returns the value in bytes; you were talking about bits. 32-bit architectures are word-aligned and byte-addressed. It is irrelevant how the architecture stores a char: to the compiler, chars are referenced 1 byte at a time, even if the hardware works in larger units.
This is why sizeof(char) is 1.
ints are 32 bits, hence sizeof(int) = 4; doubles are 64 bits, hence sizeof(double) = 8; etc.
Because of optimisation, padding is added so that the size of an object is 1, 2 or n*4 bytes (or something like that; this is about x86). That's why padding is added to the 5-byte object but not to the 1-byte one. A single char doesn't have to be padded; it can be allocated in 1 byte, and we can store it in space allocated with malloc(1). st cannot be stored in space allocated with malloc(5), because when the st struct is copied, the whole 8 bytes are copied.
It works the same way as using half a piece of paper: you use one part for a char and the other part for something else. The compiler hides this from you, since how a char is loaded into and stored from a 32-bit processor register depends on the processor.
Some processors have instructions to load and store only part of the 32 bits; others have to use bitwise operations to extract the value of a char.
Addressing a char works because a char is, AFAIR by definition, the smallest addressable unit of memory. On a 32-bit system, pointers to two different ints will be at least 4 addresses apart, while char addresses can be just 1 apart.