Converting char array to int using C++

Can anyone explain to me how this code produces 16843009? How does it work?
In my tests, (int *)&x gives 0x61ff1b, which as far as I know is the address of the first element of the array. So how does *(int *)&x result in 16843009? Thanks.
#include <iostream>
using namespace std;

int main()
{
    char x[5] = {1, 1, 1, 1, 1};
    cout << *(int *)&x;
    return 0;
}

If we write 16843009 in binary we get 1000000010000000100000001. Padding that with leading zeros gives 00000001000000010000000100000001. Every 8 bits (one char) has the value 00000001, which is 1.
&x is a pointer to an array of char (specifically a char (*)[5]). This is reinterpreted as a pointer to int. On your system, int is probably 4 bytes, and all four of those bytes are separately set to 1, which means you get an int whose four bytes are each 00000001.
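For what it's worth, the same value can be obtained without the aliasing cast by copying the bytes into an int with std::memcpy; a minimal sketch, assuming a 4-byte int:

#include <cstring>
#include <iostream>

int main()
{
    char x[5] = {1, 1, 1, 1, 1};
    int value = 0;
    // Copy the first sizeof(int) bytes of the array into an int.
    // Each of those bytes is 0x01, so the result is 0x01010101 == 16843009
    // regardless of endianness, and no strict-aliasing rule is violated.
    std::memcpy(&value, x, sizeof value);
    std::cout << value << '\n';
}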

Related

Pointer Arithmetic (adding ints to arrays)

So I read online that if you have an array like
    int arr[3] = {1, 2, 3};
I read that if you take (arr+n) any number n it will just add sizeof(n) to the address, so if n is an integer (takes up 4 bytes) it will just add 4, right? But then I also experimented on my own, read some more, and found that
arr[i] == *(arr+i) for any i, which means it isn't just adding sizeof(i), so how exactly is this working?
Because obviously arr[0] == *(arr+0) and arr[1] == *(arr+1), so it isn't just adding sizeof(the number). What is it doing?
I read that if you take (arr+n) any number n it will just add sizeof(n) to the address
This is wrong. When n is an int (or an integer literal of type int), then sizeof(n) is the same as sizeof(int), and that is a compile-time constant. What actually happens is that arr first decays to a pointer to the first element of the array. Then sizeof(int) * n is added to the pointer's value (because the element type is int):
  1     2     3
  ^           ^
  |           |
 arr        arr+2
This is because each element in the array occupies sizeof(int) bytes and to get to memory address of the next element you have to add sizeof(int).
[...] and read some more stuff and found that arr[i] = *(arr+i)
This is correct. For C arrays, arr[i] is just a shorthand way of writing *(arr+i).
When you write some_pointer + x, how much the pointer value is incremented depends on the type the pointer points to. Consider this example:
#include <iostream>

int main(void) {
    int * x = 0;
    double * y = 0;
    std::cout << x + 2 << "\n";
    std::cout << y + 2 << "\n";
}
Possible output is
0x8
0x10
because x is incremented by 2 * sizeof(int) while y is incremented by 2 * sizeof(double). That's also the reason why you get different results here:
#include <iostream>

int main(void) {
    int x[] = {1, 2, 3};
    std::cout << &x + 1 << "\n";
    std::cout << &x[0] + 1;
}
However, note that you get different output with int* x = new int[3]{1,2,3}; because then x is just an int* that points to the first element of an array; it is not an array itself. This distinction between arrays and pointers causes much confusion. It is important to understand that arrays are not pointers, but they often decay to pointers to their first element, as illustrated in the sketch below.
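To illustrate that last point, here is a small sketch (the printed addresses will vary; the byte counts in the comments assume a 4-byte int):

#include <iostream>

int main()
{
    int x[3] = {1, 2, 3};           // x is an array
    int* p = new int[3]{1, 2, 3};   // p is only a pointer to the first element

    std::cout << &x + 1 << "\n";    // steps over the whole array: +sizeof(int[3]) == +12
    std::cout << &x[0] + 1 << "\n"; // steps over one element:     +sizeof(int)    == +4
    std::cout << p + 1 << "\n";     // same as &x[0] + 1: one element forward

    delete[] p;
}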

Pointer typecasting c++

#include <iostream>
#define print(x) std::cout << x;
#define println(x) std::cout << x << std::endl;

int main() {
    int ex[5];
    int* ptr = ex;
    for (int i = 0; i < 5; i++) {
        ex[i] = 2;
    }
    ex[2] = 3;
    *(int*)((char*)ptr + 8) = 4;
    println(ex[2]);
}
On the line *(int*)((char*)ptr + 8) = 4; I'm using (char*), and when I run println(sizeof(char*)) it says that it's 4 bytes, but my instructor says that it's 1 byte long, so we need to add 8 bytes to access the value in ex[2]. How could this be possible? I didn't understand! :/
It depends on the architecture you use. By definition char is the type that has a size of 1, so sizeof(char) evaluates to 1, but that does not automatically mean it is 8 bits. (Also note that sizeof(char*) is the size of a pointer, which is 4 bytes on your 32-bit target; your instructor was talking about sizeof(char), the size of the type the pointer points to.)
To access the next value, you must add sizeof(int) to the pointer to make your code work independent of the architecture it is used on.
When you work with pointers, the pointer's type tells the compiler how much space the pointed-to value occupies in memory, and pointer arithmetic moves in steps of that many bytes. So if you cast your int pointer to a char pointer, you have to add sizeof(int) to the char pointer to get the same effect as adding 1 to the int pointer. This works because char has size 1 by definition; for any other type there is no architecture-independent size you could hard-code, which is why you should write sizeof(int) rather than a literal 8.
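As a sketch of that advice, the assignment from the question could be written so it does not depend on int being 4 bytes:

#include <iostream>

int main() {
    int ex[5] = {2, 2, 2, 2, 2};
    int* ptr = ex;
    // Same effect as ex[2] = 4: stepping 2 * sizeof(int) bytes forward
    // through a char* lands on the third int no matter how large int is.
    *(int*)((char*)ptr + 2 * sizeof(int)) = 4;
    std::cout << ex[2] << std::endl; // prints 4
}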

What is *(int*)&data[18] actually doing in this code?

I came across this syntax for reading a BMP file in C++
#include <fstream>

int main() {
    std::ifstream in("filename.bmp", std::ifstream::binary);
    in.seekg(0, in.end);
    std::streamsize size = in.tellg();
    in.seekg(0);
    unsigned char * data = new unsigned char[size];
    in.read((char *)data, size);
    int width = *(int*)&data[18];
    // omitted remainder for minimal example
}
and I don't understand what the line
int width = *(int*)&data[18];
is actually doing. Why doesn't a simple cast to int, like int width = (int)data[18];, work?
Note
As @user4581301 indicated in the comments, this depends on the implementation and will fail in many instances. And as @NathanOliver - Reinstate Monica and @ChrisMM pointed out, this is Undefined Behavior and the result is not guaranteed.
According to the bitmap header format, the width of the bitmap in pixels is stored as a signed 32-bit integer beginning at byte offset 18. The syntax
int width = *(int*)&data[18];
reads the four bytes at offsets 18 through 21, inclusive (assuming a 32-bit int), and interprets them as an integer.
How?
&data[18] gets the address of the unsigned char at index 18
(int*) casts the address from unsigned char* to int*, so that a dereference reads sizeof(int) bytes instead of one
*(int*) dereferences the address to get the referred-to int value
So basically, it takes the address of data[18] and reads the bytes at that address as if they were an integer.
Why doesn't a simple cast to `int` work?
sizeof(data[18]) is 1, because unsigned char is one byte (0-255), whereas sizeof(&data[18]) is the size of a pointer: 4 on a 32-bit system, 8 on a 64-bit system, and possibly smaller on a 16-bit system. Obviously reading more than 4 bytes is not desired here; the cast to (int*) and the subsequent dereference yield exactly 4 bytes, namely the bytes at offsets 18 through 21, inclusive. A simple cast from unsigned char to int also yields a 4-byte value, but it carries only one byte of information from data. This is illustrated by the following example:
#include <iostream>
#include <bitset>
#include <string>

int main() {
    unsigned char data[32] = {}; // stand-in for the buffer read from the file

    // Populate 18-21 with a recognizable pattern for demonstration
    std::bitset<8> _bits(std::string("10011010"));
    unsigned long bits = _bits.to_ulong();
    for (int ii = 18; ii < 22; ii++) {
        data[ii] = static_cast<unsigned char>(bits);
    }

    std::cout << "data[18] -> 1 byte "
              << std::bitset<32>(data[18]) << std::endl;
    std::cout << "*(unsigned short*)&data[18] -> 2 bytes "
              << std::bitset<32>(*(unsigned short*)&data[18]) << std::endl;
    std::cout << "*(int*)&data[18] -> 4 bytes "
              << std::bitset<32>(*(int*)&data[18]) << std::endl;
}
data[18] -> 1 byte 00000000000000000000000010011010
*(unsigned short*)&data[18] -> 2 bytes 00000000000000001001101010011010
*(int*)&data[18] -> 4 bytes 10011010100110101001101010011010
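Since the cast-and-dereference is undefined behavior as noted above, a well-defined way to read the same four bytes is std::memcpy. A minimal sketch (read_i32 is a made-up helper name; like the original cast, it assumes the host byte order matches the little-endian layout of the BMP file):

#include <cstdint>
#include <cstring>

// Copies a 32-bit value out of the buffer starting at data[offset].
// Works for any alignment and does not violate strict aliasing.
std::int32_t read_i32(const unsigned char* data, std::size_t offset)
{
    std::int32_t value;
    std::memcpy(&value, data + offset, sizeof value);
    return value;
}

// Usage, mirroring the question:
//   int width = read_i32(data, 18);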

Divide char* into few variables

I have a char array, char array[8], which contains for example two ints: the first 4 indexes hold the first int and the next 4 indexes hold the second int.
char array[8] = {0,0,0,1,0,0,0,1};
int a = array[0-3]; // =1;
int b = array[4-8]; // =1;
How do I cast this array to two ints?
It could be any other type, not necessarily int; this is just an example.
I know I can copy this array into two char arrays of size 4 and then cast each of them to int, but I don't think that's nice, and it breaks the principle of clean code.
If your data has the correct endianness, you can extract blittable types from a byte buffer with memcpy:
int8_t array[8] = {0,0,0,1,0,0,0,1};
int32_t a, b;
memcpy(&a, array + 0, sizeof a);
memcpy(&b, array + 4, sizeof b);
While @Vivek is correct that ntohl can be used to normalize endianness, you have to do that as a second step. Do not play games with pointers, as that violates strict aliasing and leads to undefined behavior (in practice, either alignment exceptions or the optimizer discarding large portions of your code as unreachable).
int8_t array[8] = {0,0,0,1,0,0,0,1};
int32_t tmp;
memcpy(&tmp, array + 0, sizeof tmp);
int a = ntohl(tmp);
memcpy(&tmp, array + 4, sizeof tmp);
int b = ntohl(tmp);
Please note that almost all optimizing compilers are smart enough to not call a function when they see memcpy with a small constant count argument.
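A further alternative, sketched below with a made-up helper name, is to assemble each value from the bytes with shifts; this spells out the byte order explicitly, so it needs neither ntohl nor memcpy:

#include <cstdint>
#include <iostream>

// Interprets four consecutive bytes as a big-endian 32-bit value,
// matching the {0,0,0,1} layout from the question.
std::uint32_t from_big_endian(const unsigned char* p)
{
    return (std::uint32_t(p[0]) << 24) |
           (std::uint32_t(p[1]) << 16) |
           (std::uint32_t(p[2]) << 8)  |
            std::uint32_t(p[3]);
}

int main()
{
    unsigned char array[8] = {0, 0, 0, 1, 0, 0, 0, 1};
    std::cout << from_big_endian(array) << " "
              << from_big_endian(array + 4) << "\n"; // prints 1 1
}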
Let's use a little bit of the C++ algorithms, such as std::accumulate (note that this approach treats the array as ASCII digit characters rather than raw bytes):
#include <numeric>
#include <iostream>
int getTotal(const char* value, int start, int end)
{
    return std::accumulate(value + start, value + end, 0,
                           [](int n, char ch) { return n * 10 + (ch - '0'); });
}

int main()
{
    char value[8] = {'1','2','3','4','0','0','1','4'};
    int total1 = getTotal(value, 0, 4);
    int total2 = getTotal(value, 4, 8);
    std::cout << total1 << " " << total2;
}
Note the usage of std::accumulate and the lambda function. All we did was keep a running total, multiplying each subtotal by 10. Each character is translated to a number by simply subtracting '0'.
You can typecast the bytes of the array to an int *. Dereferencing will then cause 4 bytes to be read as an int, and ntohl will ensure that the bytes of the int are arranged per the host byte order.
char array[8] = {0,0,0,1,0,0,0,1};
int a = *((int *)array);
int b = *((int *)&array[4]);
a = ntohl(a);
b = ntohl(b);
This will set a and b to 1 on both little and big endian systems.
If the compiler enforces strict aliasing, memcpy can be used to achieve the same result, as follows:
char array[8] = {0,0,0,1,0,0,0,1};
int a, b;
memcpy(&a, array, sizeof(int));
memcpy(&b, array+4, sizeof(int));
a = ntohl(a);
b = ntohl(b);

Unsigned int not working C++

Following are different programs/scenarios using unsigned int with respective outputs. I don't know why some of them are not working as intended.
Expected output: 2
Program 1:
#include <iostream>

int main()
{
    int value = -2;
    std::cout << (unsigned int)value;
    return 0;
}
// OUTPUT: 4294967294
Program 2:
#include <iostream>

int main()
{
    int value;
    value = -2;
    std::cout << (unsigned int)value;
    return 0;
}
// OUTPUT: 4294967294
Program 3:
#include <iostream>

int main()
{
    int value;
    std::cin >> value; // 2
    std::cout << (unsigned int)value;
    return 0;
}
// OUTPUT: 2
Can someone explain why Program 1 and Program 2 don't work? Sorry, I'm new at coding.
You are expecting the cast from int to unsigned int to simply change the sign of a negative value while maintaining its magnitude. But that isn't how it works in C or C++. Conversion to unsigned follows modular arithmetic: assigning or initializing an unsigned integer from a negative value such as -1 or -2 wraps around to the largest or second-largest unsigned value, and so on. So, for example, these two are equivalent:
unsigned int n = -1;
unsigned int m = -2;
and
unsigned int n = std::numeric_limits<unsigned int>::max();
unsigned int m = std::numeric_limits<unsigned int>::max() - 1;
Also note that there is no substantial difference between programs 1 and 2. It is all down to the sign of the value used to initialize or assign to the unsigned integer.
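A minimal sketch of that equivalence, assuming a 32-bit unsigned int:

#include <iostream>
#include <limits>

int main()
{
    int value = -2;
    // Converting -2 to unsigned wraps modulo 2^32, giving the
    // second-largest unsigned value, 4294967294.
    std::cout << (unsigned int)value << "\n";
    std::cout << std::numeric_limits<unsigned int>::max() - 1 << "\n";
}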
Casting a value from signed to unsigned changes how the individual bits of the value are interpreted. Let's look at a simple example with an 8-bit value like char and unsigned char.
The values of a signed char range from -128 to 127. Including 0, these are 256 (2^8) values. The first (most significant) bit indicates whether the value is negative or positive, so only the remaining 7 bits describe the magnitude.
An unsigned char can't take any negative values because there is no bit reserved to say whether the value is negative or positive. Therefore its values range from 0 to 255.
When all bits are set (1111 1111), the unsigned char has the value 255. The plain signed char, however, treats the first bit as the sign indicator; in two's complement this bit pattern is -1.
This is the reason the cast from int to unsigned int does not do what you expected it to do; it does exactly what it's supposed to do.
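A small sketch making the reinterpretation visible by printing the bit pattern of -1 (assuming an 8-bit char):

#include <bitset>
#include <iostream>

int main()
{
    signed char   s = -1;
    unsigned char u = static_cast<unsigned char>(s);
    // Both variables hold the same bits, 11111111; only the interpretation
    // differs: -1 when read as signed, 255 when read as unsigned.
    std::cout << std::bitset<8>(static_cast<unsigned char>(s)) << "\n";
    std::cout << std::bitset<8>(u) << " == " << static_cast<int>(u) << "\n";
}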
EDIT
If you just want to switch from negative to positive values, write yourself a simple function like this:
#include <cstdint>

uint32_t makeUnsigned(int32_t toCast)
{
    if (toCast < 0)
        toCast *= -1;
    return static_cast<uint32_t>(toCast);
}
This way you convert your incoming int to an unsigned int holding its absolute value (at most 2^31). Note that negating the most negative int32_t overflows, so that case would need special handling.