C++ vector max_size()

On a 32-bit system:
std::vector<char>::max_size() returns 2^32-1; sizeof(char) is 1 byte
std::vector<int>::max_size() returns 2^30-1; sizeof(int) is 4 bytes
std::vector<double>::max_size() returns 2^29-1; sizeof(double) is 8 bytes
Can anyone tell me what max_size() depends on? And what will max_size() return if it runs on a 64-bit system?

max_size() is the theoretical maximum number of items that could be put in your vector. On a 32-bit system, you could in theory allocate 4 GB == 2^32 bytes, which is 2^32 char values, 2^30 int values or 2^29 double values. It would appear that your implementation is using that value, but subtracting 1.
Of course, you could never really allocate a vector that big on a 32-bit system; you'll run out of memory long before then.
There is no requirement on what value max_size() returns other than that you cannot allocate a vector bigger than that. On a 64-bit system it might return 2^64-1 for char, or it might return a smaller value because the system only has a limited memory space. 64-bit PCs are often limited to a 48-bit address space anyway.
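A minimal sketch to see the values on your own implementation (the exact numbers are implementation-defined, so your output may well differ):

#include <iostream>
#include <vector>

int main()
{
    std::vector<char> vc;
    std::vector<int> vi;
    std::vector<double> vd;
    // Implementation-defined; on a typical 32-bit build these come out
    // near 2^32 / sizeof(T), on a 64-bit build near 2^64 / sizeof(T).
    std::cout << "char:   " << vc.max_size() << '\n';
    std::cout << "int:    " << vi.max_size() << '\n';
    std::cout << "double: " << vd.max_size() << '\n';
}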

max_size() returns "the maximum potential size the vector could reach due to system or library implementation limitations."
So I suppose that the maximum value is implementation-dependent. On my machine the following code
std::vector<int> v;
std::cout << v.max_size();
produces output:
4611686018427387903 // built as 64-bit target
1073741823 // built as 32-bit target
so the formula 2^64/sizeof(type) - 1 (and 2^32/sizeof(type) - 1 for the 32-bit target) matches both outputs.

Simply get the answer by
std::vector<dataType> v;
std::cout << v.max_size();
Or we can get the answer by (2^nativePointerBitWidth)/sizeof(dataType) - 1. For example, on a 64 bit system, long long is (typically) 8 bytes wide, so we have (2^64)/8 - 1 == 2305843009213693951.

Related

Why does -1 equal the 4th byte of the integer -5?

Why does this code print true (or 1 without std::boolalpha)?
#include <iostream>
#include <new>
int main() {
    char* arr = new char[4];
    int* i = new (arr) int(-5);
    char c = -1;
    std::cout << std::boolalpha << (arr[3] == c) << std::endl;
}
Depending on the system used to run the program, the output could be either true or false, or behaviour of the program could be undefined.
On systems where negative numbers are represented using two's complement (which is very common, and is guaranteed since C++20) and where byte endianness is little-endian (common on desktop systems; less so elsewhere) and where the size of int is exactly 4, it just so happens that the byte arr[3] has the value -1. An example of a CPU architecture where all of these conditions hold is x86; a non-matching example is AVR32.
On big-endian systems this would not be the case, and the output wouldn't be true. On systems where the size of int is less than 4 bytes, the byte arr[3] would be left uninitialised, in which case the output could be either true or false. And in the case where the size of int is greater than 4 bytes, the behaviour of the program would be undefined, because the placement new writes past the end of the 4-byte buffer.
If you inspect arr in a debugger, you will probably see the bytes fb ff ff ff. This will be the case if your computer uses little-endian byte order, in which the least significant byte is stored at the lowest address. For your code to execute as you seem to expect, you should be examining arr[0]. Also, your code as written would probably execute as you expect on a machine that uses big-endian byte order.
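A short sketch of inspecting the object representation directly; reading the bytes through unsigned char* is the well-defined way to do this. The byte values in the comments assume a 4-byte, two's-complement int:

#include <cstdio>

int main()
{
    int n = -5;    // two's complement bit pattern: 0xFFFFFFFB
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&n);
    for (unsigned k = 0; k < sizeof n; ++k)
        std::printf("byte %u: 0x%02X\n", k, static_cast<unsigned>(p[k]));
    // Little-endian output: FB FF FF FF -- p[3] is 0xFF, i.e. (char)-1.
    // Big-endian output:    FF FF FF FB -- p[3] would be 0xFB instead.
}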

Pointer Conception

Here I get 4225440 as the address of arr[0]; as it is an integer array, each address increases by 4, so the next one will be 4225444.
Now, what happens with those addresses? If I manually put in one of the addresses, it shows an absurd value. Where does that come from?
This is the code under discussion
#include <stdio.h>

int arr[10], i, a, *j;
void del(int a);

int main()
{
    for (i = 0; i < 4; i++)
        scanf("%d", &arr[i]);
    j = (int *)4225443;            /* an arbitrary integer forced into a pointer */
    for (i = 0; i < 4; i++)
    {
        printf("\n%d ", arr[i]);
        printf(" %d ", &arr[i]);   /* note: %p is the correct specifier for addresses */
    }
    printf(" %d ", *j);            /* dereferences the forced pointer */
}
j=(int*)4225443;
/* ... */
printf(" %d ",*j);
C has its word to say:
(C11, 6.3.2.3p5) "An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation."
In your case, add to that that you are also violating strict aliasing rules.
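What is well-defined (on mainstream platforms, at least) is the round trip through uintptr_t: taking the address of an existing object, converting it to an integer, and converting it back. A sketch of what the questioner presumably intended, using a hypothetical 4-element array rather than a hard-coded address:

#include <cstdint>
#include <cstdio>

int main()
{
    int arr[4] = {10, 20, 30, 40};
    // Pointer -> integer is always allowed; integer -> pointer is only
    // meaningful here because the result still points into arr.
    std::uintptr_t base = reinterpret_cast<std::uintptr_t>(&arr[0]);
    int* p = reinterpret_cast<int*>(base + 2 * sizeof(int));
    std::printf("%d\n", *p);   // prints 30
}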
Most of the CPUs we use today have either a 32-bit or a 64-bit wide bus between the CPU and memory. Let's use the 32-bit wide bus for demonstration purposes.

In general, each memory access will read (or write) 32 bits, where the first address of those 32 bits is evenly divisible by 4 (addresses count bytes, so a 32-bit boundary is a 4-byte boundary). In such an architecture, an int will start at an address that is evenly divisible by 4 and will be 4 bytes (32 bits) long.

In general, when the address of 'something' is NOT on a 32-bit boundary (i.e. the address is not evenly divisible by 4), the CPU will:
for a read: read the whole 32 bits from memory, starting at the 32-bit boundary, then, within the CPU, using the registers and the logic and math operations, extract the desired byte;
for a write: read the whole 32 bits from memory, starting at the 32-bit boundary, then, within the CPU, using the registers and the logic and math operations, modify the desired byte, then write the whole 32 bits back to memory.
In other words, accessing memory other than on 32-bit boundaries is SLOW. Unfortunately, some CPUs, if requested to read/write some value at other than a 32-bit boundary, will instead raise a bus error.

Regarding the 'unbelievable' value of the int when the second byte of the int is modified: an int (let's use a little-endian architecture) is 4 bytes, aligned on a 32-bit boundary (i.e. the lowest address of the int is on a 32-bit boundary). Let's say, for example, that the int contains 5; its representation in memory is then 0x05, 0x00, 0x00, 0x00 (least significant byte at the lowest address). If the second byte (address of the int + 1) is then set to some value, say 3, the int contains 0x05, 0x03, 0x00, 0x00, which is the value 0x00000305. When that int is printed, it will display: 773. Note: the order of the bytes in memory is different for a big-endian architecture.
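A minimal sketch of that byte-patching arithmetic, assuming a little-endian machine with a 4-byte int:

#include <cstdio>

int main()
{
    int n = 5;                 // little-endian memory layout: 05 00 00 00
    unsigned char* bytes = reinterpret_cast<unsigned char*>(&n);
    bytes[1] = 3;              // memory is now: 05 03 00 00
    std::printf("%d\n", n);    // 0x00000305 == 773
}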
It will print the value located at address 4225443, if that address happens to be accessible; otherwise it will produce a memory-access violation (for example, a segmentation fault).

Cannot resize vector to 1765880295

I want to allocate a vector of size 1765880295, so I used resize(1765880295), but the program stops running. The adjacent problem is that Code::Blocks stops responding. What is wrong?
Although max_size() gives 4294967295, which is greater than 1765880295, the problem is still the same even without resizing the vector.
Depending on what is stored in the vector -- for example, a 32-bit pointer or uint32 -- the size of the vector (number of elements * size of each element) will exceed the maximum addressable space on a 32-bit system.
The max_size is dependent on the implementation (some may have 1073741823 as their max_size). But even if your implementation supports a bigger number, the program will fail if there is not enough memory.
For example: if you have a vector<int> and sizeof(int) == 4 bytes, we do the math:
1765880295 * 4 bytes = 7063521180 bytes ≈ 6.578 gibibytes
So you would require around 6.6GiB of free memory to allocate that enormous vector.
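A hedged sketch of making that request defensively: resize() reports failure by throwing std::length_error when the request exceeds max_size(), or std::bad_alloc when the allocator cannot supply the memory:

#include <iostream>
#include <new>
#include <stdexcept>
#include <vector>

int main()
{
    std::vector<int> v;
    try {
        v.resize(1765880295);   // ~6.6 GiB of ints; likely to fail on 32-bit
        std::cout << "resized to " << v.size() << " elements\n";
    } catch (const std::length_error&) {
        std::cout << "request exceeds max_size()\n";
    } catch (const std::bad_alloc&) {
        std::cout << "not enough memory for that many elements\n";
    }
}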

Does compiler adjust int size?

I wonder if, in that case, the compiler will adjust the int variable's size down to the maximum value actually needed, or will it use the whole 32-bit int?
pseudocode:
int func()
{
    if (statement)
        return 10;
    else if (statement2)
        return 50;
    else
        return 100;
}
// how much memory will be allocated, as it needs only 1 byte?
The function returns int, the allocated memory will be sizeof(int), regardless of the actual value stored in it.
It will use the full 32 bits (assuming that an int is 32 bits on this architecture). This is fixed at compile time.
Yes, it will use the whole 32 bits, because storage for the primitive types is laid out at compile time.
An int is a value type; it is stored on the stack, unless it is part of an object, in which case it goes to the heap (dynamic memory).
In your case, for any return value, the compiler will allocate a fixed amount of space for your returned integer value, according to the size of a 32-bit int, which has the range -2,147,483,648 to 2,147,483,647 if signed and 0 to 4,294,967,295 if unsigned.
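A small sketch confirming this; sizeof applied to a call expression reports the size of the return type without even evaluating the call:

#include <iostream>

int func()
{
    return 10;   // the value returned never affects the storage size
}

int main()
{
    std::cout << sizeof(int) << '\n';      // e.g. 4
    std::cout << sizeof(func()) << '\n';   // same as sizeof(int); func() is not called
}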

Understanding sizeof(char) in 32 bit C compilers

sizeof(char) always returns 1 with the 32-bit GCC compiler.
But since the basic block size in a 32-bit compiler is 4 bytes, how does a char occupy a single byte when the basic size is 4 bytes?
Considering the following :
struct st
{
    int a;
    char c;
};
sizeof(st) returns 8, in agreement with the default block size of 4 bytes (since 2 blocks are allotted).
I can never understand why sizeof(char) returns 1 when it is allotted a block of size 4.
Can someone please explain this? I would be very thankful for any replies!
sizeof(char) is always 1. Always. The 'block size' you're talking about is just the native word size of the machine - usually the size that will result in most efficient operation. Your computer can still address each byte individually - that's what the sizeof operator is telling you about. When you do sizeof(int), it returns 4 to tell you that an int is 4 bytes on your machine. Likewise, your structure is 8 bytes long. There is no information from sizeof about how many bits there are in a byte.
The reason your structure is 8 bytes long rather than 5 (as you might expect), is that the compiler is adding padding to the structure in order to keep everything nicely aligned to that native word length, again for greater efficiency. Most compilers give you the option to pack a structure, either with a #pragma directive or some other compiler extension, in which case you can force your structure to take minimum size, regardless of your machine's word length.
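A hedged sketch of that packing option; #pragma pack is a widely supported extension (GCC, Clang, MSVC) rather than standard C++, and misaligned members can be slower, or even unsafe, to access on some CPUs:

#include <iostream>

struct Padded {
    int  a;
    char c;
};                   // typically 8 bytes: 3 bytes of tail padding

#pragma pack(push, 1)
struct Packed {
    int  a;
    char c;
};                   // typically 5 bytes: padding suppressed
#pragma pack(pop)

int main()
{
    std::cout << sizeof(Padded) << '\n';   // usually 8
    std::cout << sizeof(Packed) << '\n';   // usually 5
}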
char is size 1, since that's the smallest access size your computer can handle - for most machines an 8-bit value. The sizeof operator gives you the size of all other quantities in units of how many char objects would be the same size as whatever you asked about. The padding (see link below) is added by the compiler to your data structure for performance reasons, so it is larger in practice than you might think from just looking at the structure definition.
There is a Wikipedia article called Data structure alignment which has a good explanation and examples.
It is structure alignment with padding: c uses 1 byte, and the other 3 bytes are unused.
Sample code demonstrating structure alignment:
struct st
{
    int a;
    char c;
};

struct stb
{
    int a;
    char c;
    char d;
    char e;
    char f;
};

struct stc
{
    int a;
    char c;
    char d;
    char e;
    char f;
    char g;
};

std::cout << sizeof(st)  << std::endl; // 8
std::cout << sizeof(stb) << std::endl; // 8
std::cout << sizeof(stc) << std::endl; // 12
The size of the struct is bigger than the sum of its individual members, since the 32-bit compiler rounds it up to a multiple of 4 bytes. These results may differ between compilers, especially 64-bit ones.
First of all, sizeof returns a number of bytes, not bits. sizeof(char) == 1 tells you that a char is one byte long (eight bits on virtually every modern platform). All of the fundamental data types in C are at least one byte long.
Your structure returns a size of 8. This is a sum of three things: the size of the int, the size of the char (which we know is 1), and the size of any extra padding that the compiler added to the structure. Since many implementations use a 4-byte int, this implies that your compiler is adding 3 bytes of padding to your structure. Most likely this is added after the char in order to make the size of the structure a multiple of 4 (a 32-bit CPU accesses data most efficiently in 32-bit chunks, and 32 bits is four bytes).
Edit: Just because the block size is four bytes doesn't mean that a data type can't be smaller than four bytes. When the CPU loads a one-byte char into a 32-bit register, the value will be sign-extended automatically (by the hardware) to make it fill the register. The CPU is smart enough to handle data in N-byte increments (where N is a power of 2), as long as it isn't larger than the register. When storing the data on disk or in memory, there is no reason to store every char as four bytes. The char in your structure happened to look like it was four bytes long because of the padding added after it. If you changed your structure to have two char variables instead of one, you should see that the size of the structure is the same (you added an extra byte of data, and the compiler added one fewer byte of padding).
All object sizes in C and C++ are defined in terms of bytes, not bits. A byte is the smallest addressable unit of memory on the computer. A bit is a single binary digit, a 0 or a 1.
On most computers, a byte is 8 bits (so a byte can store values from 0 to 255), although computers exist with other byte sizes.
A memory address identifies a byte, even on 32-bit machines. Addresses N and N+1 point to two subsequent bytes.
An int, which is typically 32 bits, covers 4 bytes, meaning that 4 different memory addresses exist that each point to part of the int.
In a 32-bit machine, all the 32 actually means is that the CPU is designed to work efficiently with 32-bit values, and that an address is 32 bits long. It doesn't mean that memory can only be addressed in blocks of 32 bits.
The CPU can still address individual bytes, which is useful when dealing with chars, for example.
As for your example:
struct st
{
    int a;
    char c;
};
sizeof(st) returns 8 not because all structs have a size divisible by 4, but because of alignment. For the CPU to read an integer efficiently, it must be located at an address that is divisible by the size of the integer (4 bytes). So an int can be placed at address 8, 12 or 16, but not at address 11.
A char only requires its address to be divisible by the size of a char (1), so it can be placed on any address.
So in theory, the compiler could have given your struct a size of 5 bytes... except that this wouldn't work if you created an array of st objects.
In an array, each object is placed immediately after the previous one, with no padding. So if the first object in the array is placed at an address divisible by 4, the next object would be placed at an address 5 bytes higher, which would not be divisible by 4, and so the second struct in the array would not be properly aligned.
To solve this, the compiler inserts padding inside the struct, so its size becomes a multiple of its alignment requirement.
It is not impossible to create objects whose size is not a multiple of 4; rather, one of the members of your st struct requires 4-byte alignment, so every time the compiler places an int in memory, it has to make sure it is placed at an address divisible by 4.
If you create a struct of two chars, it won't get a size of 4. It will usually get a size of 2, because when it contains only chars, the object can be placed at any address, and so alignment is not an issue.
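A small sketch checking these claims directly; alignof (C++11) reports the alignment requirement that drives the padding (the exact numbers are implementation-specific):

#include <iostream>

struct TwoChars { char a; char b; };   // no member needs 4-byte alignment
struct IntChar  { int a;  char c; };   // the int forces 4-byte alignment

int main()
{
    std::cout << sizeof(TwoChars) << ' ' << alignof(TwoChars) << '\n'; // usually 2 1
    std::cout << sizeof(IntChar)  << ' ' << alignof(IntChar)  << '\n'; // usually 8 4
}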
sizeof returns the value in bytes; you were talking about bits. 32-bit architectures are word-aligned and byte-addressed. It is irrelevant how the architecture stores a char: to the compiler, you must reference chars 1 byte at a time, even if they use up less than 1 byte.
This is why sizeof(char) is 1.
ints are 32 bits, hence sizeof(int) == 4; doubles are 64 bits, hence sizeof(double) == 8; etc.
Because of optimisation, padding is added so that the size of an object is 1, 2 or n*4 bytes (or something like that; this concerns x86). That is why padding is added to the 5-byte object but not to the 1-byte one. A single char doesn't have to be padded; it can be allocated in 1 byte, and we can store it in space allocated with malloc(1). An st cannot be stored in space allocated with malloc(5), because when an st struct is copied, the whole 8 bytes are copied.
It works the same way as using half a piece of paper: you use one part for a char and the other part for something else. The compiler hides this from you, since loading and storing a char into a 32-bit processor register depends on the processor.
Some processors have instructions to load and store only part of a 32-bit word; others have to use binary operations to extract the value of a char.
Addressing a char works because a char is, as far as I remember, by definition the smallest addressable unit of memory. On a 32-bit system, pointers to two different ints will be at least 4 address points apart, while char addresses will be only 1 apart.
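A closing sketch of that address arithmetic; the gap between consecutive array elements, measured in bytes, is exactly sizeof of the element type:

#include <iostream>

int main()
{
    int  ints[2];
    char chars[2];
    // Measure the distance between consecutive elements in bytes.
    std::cout << reinterpret_cast<char*>(&ints[1]) -
                 reinterpret_cast<char*>(&ints[0]) << '\n';   // usually 4
    std::cout << &chars[1] - &chars[0] << '\n';               // always 1
}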