Can somebody please explain why it is behaving like this? (C++)

#include <iostream>
using namespace std;

int main() {
    int p = 10, q = 11;
    int a[3];
    int *ptr = &q;
    cout << *ptr + 1 << endl;
    cout << "P = " << (int)&p << endl;
    cout << "q = " << (int)&q << endl;
    cout << "Size of p = " << sizeof(p) << endl;
    cout << "(int)&a[0]:" << (int)&a[0] << endl;
    cout << "(int)&a[1]:" << (int)&a[1] << endl;
    cout << "(int)&a[2]:" << (int)&a[2] << endl;
}
And the output of this code is
12
P = 14679472
q = 14679460
Size of p = 4
(int)&a[0]:14679440
(int)&a[1]:14679444
(int)&a[2]:14679448
Now I have a few questions.
- An int takes 4 bytes or 8 bytes, but here the difference between p and q is 12. I don't get it; am I missing something important here?
- I was actually reading about stack frames, so if anybody can explain what they are or provide good study material, it would be really appreciated.

With regard to (paraphrased):
An int take four or eight bytes but here the difference between p and q is twelve.
An int does not have to take four or eight bytes; those are just common sizes in many implementations. The standard specifies a minimum range but no maximum. In any case, the standard does not mandate that distinct variables are allocated on the stack in any particular order. In fact, it doesn't mandate a stack at all; the program just has to behave in a certain way.
Granted, array elements have to be ordered and contiguous (as you can see from the addresses of the a elements): the standard guarantees there is no padding between array elements, so consecutive elements are exactly sizeof(int) apart. But p and q are not part of an array, so nothing constrains their relative placement.
Since you can't (safely) use the address of p to do anything other than access p itself, its address is actually irrelevant.
In regard to your second question about good reference material, that's actually disfavoured by Stack Overflow, one of the close reasons being:
Seeking recommendations for books, tools, software libraries, and more. This question is likely to lead to opinion-based answers.
Hence I won't bother answering that bit.
However, I will give a bit of extra advice: if you really want to know what specific implementations are doing with your code, you should head on over to GodBolt and type your code in. Making some minor modifications to get it to compile cleanly:
#include <iostream>
#define USE_P 1
#define USE_Q 1
#define USE_A 1
int main() {
#if USE_P
    int p = 10;
#endif
#if USE_Q
    int q = 11;
#endif
#if USE_A
    int a[3] = {12, 13, 14};
#endif
    std::cout << "hello\n";
}
You'll see something at the start of main that looks like this (though I added the comments):
push rbp
mov rbp, rsp
sub rsp, 32 // Stack frame size
mov DWORD PTR [rbp-4], 10 // p
mov DWORD PTR [rbp-8], 11 // q
mov DWORD PTR [rbp-20], 12 // a[0]
mov DWORD PTR [rbp-16], 13 // a[1]
mov DWORD PTR [rbp-12], 14 // a[2]
And you can watch how this changes when you tell it to stop using any of those variables, in any combination. From the example I gave (and using x86-64 gcc 10.1), it appears that the variables are placed as you expected. However, as stated earlier, this is not a requirement.

Why is it printing 12 for *ptr+1?
As per operator precedence, the dereference operator * is applied first, giving the value 11; 11 plus 1 is 12.
int take 4 bytes or 8 bytes but here the difference between p and q is 12. I don't get it ? what it is? am i missing something important here ?
The compiler/implementation can place variables with automatic storage duration in any order. In some implementations, there might not be a stack at all. The standard just defines the behavior, and each implementation has to abide by it, but implementations are free to use any data structure for function calls (though stacks are most common). There could also be other variables between p and q, with as much padding as the implementation wants.
I was actually reading about about stack frame so any body can explain or provide a good study material what it is would be really appreciated.
See this and read about ABI.

Related

C++ HOW can this out-of-range access inside struct go wrong?

#include <iostream>
#include <random>
using namespace std;

struct TradeMsg {
    int64_t timestamp;        // 0->7
    char exchange;            // 8
    char symbol[17];          // 9->25
    char sale_condition[4];   // 26 -> 29
    char source_of_trade;     // 30
    uint8_t trade_correction; // 31
    int64_t trade_volume;     // 32->39
    int64_t trade_price;      // 40->47
};
static_assert(sizeof(TradeMsg) == 48);

char buffer[1000000];

template<class T, size_t N = 1>
int someFunc(char* buffer, T* output, int& cursor) {
    // read + process data from buffer. Return data in output. Set cursor to the last byte read + 1.
    return cursor + (rand() % 20) + 1; // dummy code
}

void parseData(TradeMsg* msg) {
    int cursor = 0;
    cursor = someFunc<int64_t>(buffer, &msg->timestamp, cursor);
    cursor = someFunc<char>(buffer, &msg->exchange, cursor);
    cursor++;
    int i = 0;
    // i is GUARANTEED to be <= 17 after this loop,
    // edit: the input data in buffer[] guarantee that fact.
    while (buffer[cursor + i] != ',') {
        msg->symbol[i] = buffer[cursor + i];
        i++;
    }
    msg->symbol[i] = '\n'; // might access symbol[17].
    cursor = cursor + i + 1;
    for (i = 0; i < 4; i++) msg->sale_condition[i] = buffer[cursor + i];
    cursor += 5;
    //cursor = someFunc...
}

int main()
{
    TradeMsg a;
    a.symbol[17] = '\0';
    return 0;
}
I have this struct that is guaranteed to have a predictable size. In the code, there is a case where the program tries to assign a value to an array element past its size: msg->symbol[17] = ....
However, in that case, the assignment does not cause any harm as long as:
It is done before the next struct members (sale_condition) are assigned (no unexpected code reordering).
It does not modify any previous members (timestamp, exchange).
It does not access any memory outside the struct.
I read that this is undefined behavior. But what kind of compiler optimization/code generation can make this go wrong? symbol[17] is pretty deep inside the middle of the struct, so I don't see how the compiler could generate an access outside it. Assume the platform is x86-64 only.
Various folks have pointed out debug-mode checks that will fire on access outside the bounds of an array member of a struct, with options like gcc -fsanitize=undefined. Separate from that, it's also legal for a compiler to use the assumption of non-overlap between member accesses to reorder two assignments which actually do alias:
#Peter in comments points out that the compiler is allowed to assume that accesses to msg->symbol[i] don't affect other struct members, and potentially delay msg->symbol[i] = '\n'; until after the loop that writes msg->sale_condition[i]. (i.e. sink that store to the bottom of the function).
There isn't a good reason you'd expect a compiler to want to do that in this function alone, but perhaps after inlining into some caller that also stored something there, it could be relevant. Or just because it's a DeathStation 9000 that exists in this thought experiment to break your code.
You could write this safely, although GCC compiles that worse
Since char* is allowed to alias any other object, you could offset a char* relative to the start of the whole struct, rather than to the start of the member array. Use offsetof to find the right start point like this:
#include <cstddef>
...
((char*)msg + offsetof(TradeMsg, symbol))[i] = '\n'; // might access symbol[17].
That's exactly equivalent to *((char*)msg + offsetof(...) + i) = '\n'; by definition of C++'s [] operator, even though it lets you use [i] to index relative to the same position.
However, that does compile to less efficient asm with GCC11.2 -O2. (Godbolt), mostly because int i, cursor are narrower than pointer-width. The "safe" version that redoes indexing from the start of the struct does more indexing work in asm, not using the msg+offsetof(symbol) pointer that it was already using as the base register in the loop.
# original version, with UB if `i` goes past the buffer.
# gcc11.2 -O2 -march=haswell. -O3 fully unrolls into a chain of copy/branch
... partially peeled first iteration
.L3: # do{
mov BYTE PTR [rbx+8+rax], dl # store into msg->symbol[i]
movsx rdi, eax # not read inside the loop
lea ecx, [r8+rax]
inc rax
movzx edx, BYTE PTR buffer[rsi+1+rax] # load from buffer
cmp dl, 44
jne .L3 # }while(buffer[cursor+i] != ',')
## End of copy-and-search loop.
# Loops are identical up to this point except for MOVSX here vs. MOV in the no-UB version.
movsx rcx, ecx # just redo sign extension of this calculation that was done repeatedly inside the loop just for this, apparently.
.L2:
mov BYTE PTR [rbx+9+rdi], 10 # store a newline
mov eax, 1 # set up for next loop
# offsetof version, without UB
# same loop, but with RDI and RSI usage switched.
# And with mov esi, eax zero extension instead of movsx rdi, eax sign extension
cmp dl, 44
jne .L3 # }while(buffer[cursor+i] != ',')
add esi, 9 # offsetof(TradeMsg, symbol)
movsx rcx, ecx # more stuff getting sign extended.
movsx rsi, esi # including something used in the newline store
.L2:
mov BYTE PTR [rbx+rsi], 10
mov eax, 1 # set up for next loop
The RCX calculation seems to just be for use by the next loop, setting sale_conditions.
BTW, the copy-and-search loop is like strcpy but with a ',' terminator. Unfortunately gcc/clang don't know how to optimize that; they compile to a slow byte-at-a-time loop, not e.g. an AVX512BW masked store using mask-1 from a vec == set1_epi8(',') compare, to get a mask selecting the bytes-before-',' instead of the comma element. (Probably needs a bithack to isolate that lowest-set-bit as the only set bit, though, unless it's safe to always copy 16 or 17 bytes separate from finding the ',' position, which could be done efficiently without masked stores or branching.)
Another option might be a union between a char[21] and struct{ char sym[17], sale[4];}, if you use a C++ implementation that allows C99-style union type-punning. (It's a GNU extension, and also supported by MSVC, but not necessarily literally every x86 compiler.)
Also, style-wise, shadowing int i = 0; with for( int i=0 ; i<4 ; i++ ) is poor style. Pick a different var name for that loop, like j. (Or if there is anything meaningful, a better name for i which has to survive across multiple loops.)
In a few cases:
When variable guard is set up: https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html
In a C++ interpreter (yes they exist): https://root.cern/cling/
Your symbol has a size of 17, yet you are trying to assign a value to the 18th element: a.symbol[17] = '\0';.
Remember, index values start at 0, not 1.
So you have two places that can go wrong: i can equal 17 in the loop, which will cause an error, and the last line I showed above will cause an error.

Count array elements where there are significant 0 bits (below the highest 1 bit)

An array of 3 bytes is specified. Count the number of bytes where there's a zero after any one, i.e. where the bits below the most-significant 1 are not all 1.
{00000100, 00000011, 00001000} - for this array the answer is 2.
My code gives 1, but that is incorrect; how can I fix it?
#include <iostream>
#include <bitset>
using namespace std;
int main() {
int res = 0, res1 = 0;
__int8 arr[3] = { 4, 3, 8 };
__asm {
mov ecx, 3
mov esi, 0
start_outer:
mov bx, 8
mov al, arr[esi]
start_inner :
shl al, 1
jnb zero
jc one
one :
dec bx
test bx, bx
jnz start_inner
jmp end_
zero :
dec bx
test bx, bx
jz end_
inc res
shl al, 1
jnb was_zero
jc start_inner
was_zero :
dec bx
dec res
jmp start_inner
end_ :
inc esi
loop start_outer
}
cout << res << endl;
system("pause");
}
Next try.
Please try to explain your question better next time; many people did not understand it. Anyway, I hope that I understand it now.
I will explain the algorithm used for one byte. Later in the program, we will simply run an outer loop 3 times to work on all values. And I will of course show the result in assembler. This is one of many possible solutions.
We can observe the following:
Your statement "Count the number of bytes where there's a zero after any one." means that you want to count the number of transitions of a bit from 1 to 0 within one byte, looking at the bits from the msb to the lsb, i.e. from left to right.
If we formulate this vice versa, then we can also count the number of transitions from 0 to 1, if we go from right to left.
A transition from 0 to 1 can always be calculated by "and"ing the new value with the negated old value. Example:
OldValue  NewValue  NotOldValue  And
   0         0           1        0
   0         1           1        1   --> Rising edge
   1         0           0        0
   1         1           0        0
We can also say in words: if the old, previous value was not set, and the new value is set, then we have a rising edge.
We can look at one bit (of a byte) after the other if we shift the byte right. Then the new value (the new lowest bit) will be the LSB. We remember the old previous bit, then do the test. Then we set old = new, read the next new value, do the test, and so on. This we do for all bits.
In C++ this could look like this:
#include <iostream>
#include <bitset>
using byte = unsigned char;
byte countForByte(byte b) {
// Initialize counter variable to 0
byte counter{};
// Get the first old value. The lowest bit of the original array entry
byte oldValue = b & 1;
// Check all 8 bits
for (int i=0; i<8; ++i) {
// Calculate a new value. First shift to right
b = b >> 1;
// Then mask out lowest bit
byte newValue = b & 1;
// Now apply our algorithm. The result will always be 0 or 1. Add it to the counter
counter += (newValue & !oldValue);
// The next old value is the current value from this time
oldValue = newValue;
}
return counter;
}
int main() {
unsigned int x;
std::cin >> x;
std::cout << std::bitset<8>(x).to_string() << "\n";
byte s = countForByte(x);
std::cout << static_cast<int>(s) << '\n';
return 0;
}
So, for whatever reason, you want a solution in assembler. Here too, you should tell people why you want it, what compiler you use, and what target microprocessor you use. Otherwise, how can people give the correct answer?
Anyway, here is the solution for the X86 architecture, tested with MS VS2019.
#include <iostream>
int main() {
int res = 0;
unsigned char arr[3] = { 139, 139, 139 };
__asm {
mov esi, 0; index in array
mov ecx, 3; We will work with 3 array values
DoArray:
mov ah, arr[esi]; Load array value at index
mov bl, ah; Old Value
and bl, 1; Get lowest bit of old value
push ecx; Save loop Counter for outer loop
mov ecx, 7; 7 loop runs to get the result for one byte
DoTest:
shr ah, 1; This was the original given byte
mov al, ah; Get the lowest bit from the new shifted value
and al, 1; This is now new value
not bl; Invert the old value
and bl, al; Check for rising edge
movzx edi, bl
add res, edi; Calculate new result
mov bl, al; Old value = new value
loop DoTest
inc esi; Next index in array
pop ecx; Get outer loop counter
loop DoArray; Outer loop
}
std::cout << res << '\n';
return 0;
}
And for this work, I want 100 upvotes and an accepted answer . . .
Basically, user #Michael already gave the correct answer, so all credit goes to him.
You can find a lot of bit-fiddling posts here on Stack Overflow, but a very good description of such activities can be found in the book "Hacker’s Delight" by Henry S. Warren, Jr. I have the 2nd edition here.
The solution is presented in chapter 2, "Basics", then "2–1 Manipulating Rightmost Bits"
And if you manually check which values do NOT fulfill your condition, you will find that these are
0, 1, 3, 7, 15, 31, 63, 127, 255,
or, in binary
0b0000'0000, 0b0000'0001, 0b0000'0011, 0b0000'0111, 0b0000'1111, 0b0001'1111, 0b0011'1111, 0b0111'1111, 0b1111'1111,
And we see that these values correspond to 2^n - 1. And, according to "Hacker’s Delight", we can detect them with the simple formula
(x & (x + 1)) != 0
So, we can translate that to the following code:
#include <iostream>
int main() {
unsigned char arr[3];
unsigned int x, y, z;
std::cin >> x >> y >> z;
arr[0] = static_cast<unsigned char>(x);
arr[1] = static_cast<unsigned char>(y);
arr[2] = static_cast<unsigned char>(z);
unsigned char res = ((arr[0] & (arr[0] + 1)) != 0) + ((arr[1] & (arr[1] + 1)) != 0) + ((arr[2] & (arr[2] + 1)) != 0);
std::cout << static_cast<unsigned int>(res) << '\n';
return 0;
}
Very important: you do not need assembler code. An optimizing compiler will nearly always outperform your handwritten code.
You can check many different versions on Compiler Explorer. There you can see that your code example with static values would be completely optimized away: the compiler would simply calculate everything at compile time and show 2 as the result. So, caveat. Compiler Explorer will show you the assembly language generated by different compilers for selected hardware. You can take that if you want.
Please additionally note: The above sketched algorithm does not need any branch. Except, if you want to iterate over an array/vector. For this, you could write a small lambda and use algorithms from the C++ standard library.
C++ solution
#include <iostream>
#include <vector>
#include <algorithm>
#include <numeric>
#include <iterator>
int main() {
// Define Lambda to check conditions
auto add = [](const size_t& sum, const unsigned char& x) -> size_t {
return sum + static_cast<size_t>(((x & (x + 1)) == 0) ? 0U : 1U); };
// Vector with any number of test values
std::vector<unsigned char> test{ 4, 3, 8 };
// Calculate and show result
std::cout << std::accumulate(test.begin(), test.end(), 0U, add) << '\n';
return 0;
}

Dynamically Allocating Arrays Depending on User Input in C++

I'm watching this tutorial on YouTube: https://www.youtube.com/watch?v=8XAQzcJvOHk (Dynamically Allocating Arrays Depending on User Input in C++).
this is his code
1 int main()
2 {
3 int *pointer = nullptr;
4
5 cout << "how many items u are gonna enter" << endl;
6 int input;
7 cin >> input;
8
9 pointer = new int[input];
10
11 int temp;
12
13 for (int counter = 0; counter < input; counter++) {
14 cout << "enter the item " << counter + 1 << endl;
15 cin >> temp;
16 *(pointer + counter) = temp;
17 }
18
19 cout << "the items you have entered are" << endl;
20 for (int counter = 0; counter < input; counter++) {
21 cout << counter + 1 << " item is " << *(pointer + counter) << endl;
22 }
23
24 delete[]pointer;
25
26 return 0;
27}
I'm stuck on line 16; I don't understand why, inside the parentheses, the pointer variable and counter are added to each other.
Pointer Arithmetic is a good point where to start.
I'm going to try to explain briefly how it works, but I strongly suggest you supplement these concepts with a good book or internet references, because they are very important for properly handling pointers.
A pointer (as you can imagine from the name) points a memory cell:
int* ptr = /*an address to the memory cell*/
Your memory is composed of sequential cells, graphically:
Address| Value
-------|------------
0x00 | [#some value] 8 bit
0x01 | [#some value] 8 bit
... | ...
0xN | [#some value] 8 bit
Just to keep this example short, we can assume each cell contains 8 bits and an integer value is represented with exactly 32 bits (that is not always true; it depends on the machine architecture and compiler).
Then an int value is stored in exactly 4 cells. (We explicitly don't consider memory alignment.)
So your pointer contains a memory location, the address in the memory which contains the value you've allocated (with the usage of dynamic memory).
For example:
int* ptr = 0x01
That means the variable ptr, stored somewhere in the memory, contains the address 0x01. In the memory cell 0x01 there will be the integer value allocated dynamically.
But, since the value is an integer type, the data will take 4 cells to store the complete information. So the data will be "split" among the cells 0x01, 0x02, 0x03, 0x04.
The pointer will point to the first memory location of the data, and the number of cells occupied is given by the type of the pointer (in this case a pointer to int, so the compiler knows the information starts at cell 0x01 and ends at 0x04).
A pointer variable can be used in arithmetic expressions, such as sums and differences.
For example:
ptr + 10
ptr - 10
Simply put, the meaning of that expression is to access the memory address starting from the address stored in ptr and jumping 10 int cells forward or backward.
Attention: the expression does not simply add the value to the address to obtain a new address.
Indeed, assuming ptr = 0x01, then the expression:
ptr + 10;
does not mean 0x01 + 10 = 0xa!
Instead, it means jumping 10 "blocks", each of size equal to the type pointed to by the pointer itself.
That is, 0x01 + 10 * 4 bytes.
Since ptr is a pointer to int, +10 means "plus 10 blocks of integers", and, in this example, each int occupies 4 bytes (32 bits).
To conclude, the expression:
*(pointer + counter) = temp;
means: take the address stored in pointer, add #counter blocks of int, dereference that address with operator*, and write the value temp at that location.
That notation can be simplified with operator[]:
pointer[counter] = temp;
where the meaning is exactly the same, but the notation is more readable, especially when you are working with arrays.
This part:
*(pointer + counter)
is just simple pointer arithmetic: we are adding counter (of type int) to the pointer address and then dereferencing it using *. It is the same as pointer[counter]. After that, we are saving the value of temp into that particular (dereferenced) location in memory.
*(pointer + counter) is equivalent to pointer[counter], as has been pointed out. The reason it's equivalent is that pointer holds a memory address; when you add, say, 1 to that memory address you are in fact adding the size of whatever data type the pointer points to, multiplied by 1.
If you have a primitive array
int arr[2] = {1,55};
*arr would give you 1 and *(arr + 1) would give you 55
*(pointer + counter) = temp;
is same as
pointer[counter] = temp;
The variable pointer contains the address of the first element of the array.
Adding counter means selecting the address counter elements away from the starting address.
counter is simply an offset from pointer.

Can someone explain the meaning of malloc(20 * c | -(20 * (unsigned __int64)(unsigned int)c >> 32 != 0))

In decompiled code generated by IDA I see expressions like:
malloc(20 * c | -(20 * (unsigned __int64)(unsigned int)c >> 32 != 0))
malloc(6 * n | -(3 * (unsigned __int64)(unsigned int)(2 * n) >> 32 != 0))
Can someone explain the purpose of these calculations?
c and n are int (signed integer) values.
Update.
Original C++ code was compiled with MSVC for 32-bit platform.
Here's assembly code for second line of decompiled C-code above (malloc(6 * ..)):
mov ecx, [ebp+pThis]
mov [ecx+4], eax
mov eax, [ebp+pThis]
mov eax, [eax]
shl eax, 1
xor ecx, ecx
mov edx, 3
mul edx
seto cl
neg ecx
or ecx, eax
mov esi, esp
push ecx ; Size
call dword ptr ds:__imp__malloc
I'm guessing that the original source code used the C++ new operator to allocate an array and was compiled with Visual C++. As user3528438's answer indicates, this code is meant to prevent overflows. Specifically, it's a 32-bit unsigned saturating multiply: if the result of the multiplication would be greater than 4,294,967,295, the maximum value of a 32-bit unsigned number, the result is clamped or "saturated" to that maximum.
Since Visual Studio 2005, Microsoft's C++ compiler has generated code to protect against overflows. For example, I can generate assembly code that could be decompiled into your examples by compiling the following with Visual C++:
#include <stdlib.h>
void *
operator new[](size_t n) {
return malloc(n);
}
struct S {
char a[20];
};
struct T {
char a[6];
};
void
foo(int n, S **s, T **t) {
*s = new S[n];
*t = new T[n * 2];
}
Which, with Visual Studio 2015's compiler generates the following assembly code:
mov esi, DWORD PTR _n$[esp]
xor ecx, ecx
mov eax, esi
mov edx, 20 ; 00000014H
mul edx
seto cl
neg ecx
or ecx, eax
push ecx
call _malloc
mov ecx, DWORD PTR _s$[esp+4]
; Line 19
mov edx, 6
mov DWORD PTR [ecx], eax
xor ecx, ecx
lea eax, DWORD PTR [esi+esi]
mul edx
seto cl
neg ecx
or ecx, eax
push ecx
call _malloc
Most of the decompiled expression is actually meant to handle just one assembly statement. The assembly instruction seto cl sets CL to 1 if the previous MUL instruction overflows, otherwise it sets CL to 0. Similarly the expression 20 * (unsigned __int64)(unsigned int)c >> 32 != 0 evaluates to 1 if the result of 20 * c overflows, and evaluates to 0 otherwise.
If this overflow protection wasn't there and the result of 20 * c did actually overflow then the call to malloc would probably succeed, but allocate much less memory than the program intended. The program would then likely write past the end of the memory actually allocated and trash other bits of memory. This would amount to a buffer overrun, one that could be potentially exploited by hackers.
Since this code is decompiled from ASM, we can only guess what it actually does.
Let's first format it to figure out the precedence:
malloc(20 * c | -(20 * (unsigned __int64)(unsigned int)c >> 32 != 0))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
//this is first evaluated, promoting c to
//64 bit unsigned int without doing sign
//extension, regardless the type of c
malloc(20 * c | -(20 * (uint64_t)c >> 32 != 0))
^^^^^^^^^^^^^^^^
//then, multiply by 20, with uint64 result
malloc(20 * c | -(20 * (uint64_t)c >> 32 != 0))
^^^^^^^^^^^^^^^^^^^^^^^^^^^
//if 20c is greater than 2^32-1, then result is true,
//use -1 to generate a mask of 0xffffffff,
//bitwise operator | then masks 20c to 0xffffffff
//(2^32-1, the maximum of size_t, input type to malloc)
//regardless what 20c actually is
//if 20c is smaller than 2^32-1, then result is false,
//the mask is 0, bitwise operator | keeps the final
//input to malloc as 20c untouched
What are 20 and 6?
Those probably come from the common usage of
malloc(sizeof(Something)*count). Those two calls to malloc are probably made with sizeof(Something) and sizeof(SomethingElse) evaluated to 20 and 6 at compile time.
So what this code actually does:
My guess: it's trying to prevent sizeof(Something)*count from overflowing, which would let the malloc succeed but cause a buffer overflow when the memory is used.
By evaluating the product as a 64-bit unsigned int and testing against 2^32-1: when the size is greater than 2^32-1, the input to malloc is set to a very large value that is guaranteed to fail (no 32-bit system can allocate 2^32-1 bytes of memory).
Can someone explain the purpose of these calculations?
It is important to understand that compiling changes the semantic meaning of code. Much unspecified behavior of the original code becomes specified by the compilation process.
IDA has no idea whether things the generated assembly code just happens to do are important or not. To be safe, it tries to perfectly replicate the behavior of the assembly code, even in cases that cannot possibly happen given the way the code is used.
Here, IDA is probably replicating the overflow characteristics that the conversion of types just happens to have on this platform. It can't just replicate the original C code because the original C code likely had unspecified behavior for some values of c or n, likely negative ones.
For example, say I write this C code: int f(unsigned j) { return j; }. My compiler will likely turn that into very simple assembly code giving whatever behavior for negative values of j that my platform just happens to give.
But if you decompile the generated assembly, you cannot decompile it to int f(unsigned j) { return j; }, because that will not behave the same as my assembly code did on platforms with different overflow behavior. On other platforms, it could compile to code that returns different values than my assembly code does for negative values of j.
So it is often literally impossible (in fact, incorrect) to decompile C code into the original code, it will often have these kinds of "portably replicate this platform's behavior" oddities.
Earlier I said it's rounding up to the nearest block size; forgive me. What it's actually doing is calculating a multiple of c while simultaneously checking for overflow (a negative value):
#include <iostream>
#include <cstdint>
size_t foo(char c)
{
return 20 * c | -(20 * (std::uint64_t)(unsigned int)c >> 32 != 0);
}
int main()
{
using namespace std;
for (char i = -4 ; i < 4 ; ++i)
{
cout << "input is: " << int(i) << ", result is " << foo(i) << endl;
}
return 0;
}
results:
input is: -4, result is 18446744073709551615
input is: -3, result is 18446744073709551615
input is: -2, result is 18446744073709551615
input is: -1, result is 18446744073709551615
input is: 0, result is 0
input is: 1, result is 20
input is: 2, result is 40
input is: 3, result is 60
To me the number 18446744073709551615 doesn't mean much, at a glance. Only after seeing it expressed in hex I went "ah". – Jongware
adding << hex:
input is: -1, result is ffffffffffffffff

Modify byte with pointer

long double i, *ptr;
ptr = &i;
I want to modify the value of byte No. 4. The size of long double is 8 bytes. So is it possible by subtracting 4 from *ptr?
i.e.
(ptr)-4 = 9;
You can access the bytes that represent an object by converting a pointer to the object to a pointer to unsigned char and then accessing bytes through that pointer. For example, the fourth byte of i could be set to 9 by:
unsigned char *p = (unsigned char *) &i;
*(p+4) = 9;
However, you should not do this without good reason. It runs into portability problems and should only be done for special purposes and with careful attention to the C standard and/or the documentation of your C implementation. If you explain further why you want to do something like this, it might be possible to show better ways of doing it or to explain the hazards.
Note that the correct address for byte four (starting numbering at byte zero) is p+4, not p-4 as used in the question.
I would attempt something more readable
union {
long double d;
char v[sizeof(long double)];
} x;
x.d = 1234567890;
std::cout << x.d << ' ' << int(x.v[6]) << std::endl;
x.v[6] = 0xCC;
std::cout << x.d << ' ' << int(x.v[6]) << std::endl;
yields
1.23457e+09 44
1.23981e+09 -52
(*ptr)-4 = 9 is not permitted because (*ptr)-4 is not an lvalue: the left-hand side of an assignment must be a modifiable lvalue, and the result of a subtraction is not one.
Note also that bitwise operators such as & are not defined for floating-point types like long double; they only work on integer operands, so to manipulate individual bytes of i you would need to go through an unsigned char * (or an integer object of the right size).
First see here: How do you set only certain bits of a byte in C without affecting the rest?. The idea is to clear the byte you are interested in, and then set your value with an or operation. So in your case you'd do:
val &= ~0xff; // Clear lower byte
val |= 9 & 0xff; // Set Nine into the least significant byte.