Values of padding bytes in C - c++

I've spotted some strange behavior in this small C program. I have two structures, both with padding bytes, but in different places.
The first structure has padding bytes at indices [1:3], and the output is what I expected: static variables are zeroed out, so the padding bytes are all 0, while local variables on the stack are left with garbage values in the padding bytes. Example output:
Char is first, then int:
aa 60 8e ef ff ff ff ff
aa 00 00 00 ff ff ff ff
But in the second structure, something strange happens. The padding bytes in this structure are at indices [5:7], so I expected some garbage values in the non-static variable, but every time the output is:
Int is first, then char:
ff ff ff ff aa 7f 00 00
ff ff ff ff aa 00 00 00
Why is the padding always 7f 00 00?
The complete program:
#include "stdio.h"
#include "stdint.h"
#include "stddef.h"
// 0 1 2 3 4 5 6 7
// |a|#|#|#|b|b|b|b|
typedef struct
{
uint8_t a;
uint32_t b;
} S1;
// 0 1 2 3 4 5 6 7
// |a|a|a|a|b|#|#|#|
typedef struct
{
uint32_t a;
uint8_t b;
} S2;
void print_bytes(void* mem, size_t num_bytes)
{
for (size_t i = 0; i < num_bytes; i++)
printf("%02x ", *((unsigned char*)mem + i));
putc('\n', stdout);
}
int main()
{
S1 var1 = { .a = 0xAA, .b = 0xFFFFFFFF };
static S1 var1_s = { .a = 0xAA, .b = 0xFFFFFFFF };
printf("Char is first, then int:\n");
print_bytes(&var1, sizeof(S1));
print_bytes(&var1_s, sizeof(S1));
S2 var2 = { .a = 0xFFFFFFFF, .b = 0xAA };
static S2 var2_s = { .a = 0xFFFFFFFF, .b = 0xAA };
printf("\nInt is first, then char:\n");
print_bytes(&var2, sizeof(S2));
print_bytes(&var2_s, sizeof(S2));
}

There's no problem when I run your program: the last bytes come out random each time, so what you're seeing most likely depends on your system. The padding bytes of a local variable are simply whatever happened to be at that spot on the stack. On 64-bit Linux, for example, stack addresses typically look like 0x00007ffc...., so if an 8-byte pointer had previously occupied that slot, its top three bytes would be exactly 7f 00 00, which would explain the recurring value.
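If you ever need the padding bytes themselves to be deterministic (for example before hashing or memcmp-ing whole structs), the usual approach is to clear the entire object before filling in the members. A minimal sketch of that idea, reusing the S2 layout from the question:

#include <stdio.h>
#include <string.h>
#include <stdint.h>

typedef struct
{
    uint32_t a;
    uint8_t b;   /* bytes [5:7] are padding */
} S2;

int main()
{
    S2 var;
    memset(&var, 0, sizeof var);   /* zeroes member bytes and padding bytes alike */
    var.a = 0xFFFFFFFF;
    var.b = 0xAA;

    for (size_t i = 0; i < sizeof var; i++)
        printf("%02x ", ((unsigned char*)&var)[i]);
    putc('\n', stdout);            /* in practice: ff ff ff ff aa 00 00 00 */
    return 0;
}

Strictly speaking the standard still allows a later member assignment to leave the padding with unspecified values, but in practice the bytes cleared by memset stay zero.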

Related

C++ structure/array initialization

I have a C++ array or structure initialization issue that I have not been able to resolve.
I have a 4-level nested structure. Each level is actually the same 48 bytes wrapped in the next level up. The issue is that when the structure is declared as a scalar and initialized, it is correctly initialized with the provided values. However, when it is declared as a single-element array, all 48 bytes become zeros, as shown below. Unfortunately the structures are too complicated to paste here.
If I define 4 simple structures, one containing another, with the innermost one containing the same 12 unsigned integers, then it is initialized correctly, even when it is declared in an array.
Has anyone experienced similar issues? What am I missing? What compiler flags, options, etc. could lead to such a problem? I'd appreciate any comments and help.
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include "bls12_381/fq.hpp"

static constexpr embedded_pairing::bls12_381::Fq scalar = {
    {{{.std_words = {0x1c7238e5, 0xcf1c38e3, 0x786f0c70, 0x1616ec6e, 0x3a6691ae, 0x21537e29,
                     0x4d9e82ef, 0xa628f1cb, 0x2e5a7ddf, 0xa68a205b, 0x47085aba, 0xcd91de45}}}}
};

static constexpr embedded_pairing::bls12_381::Fq array[1] = {
    {{{{.std_words = {0x1c7238e5, 0xcf1c38e3, 0x786f0c70, 0x1616ec6e, 0x3a6691ae, 0x21537e29,
                      0x4d9e82ef, 0xa628f1cb, 0x2e5a7ddf, 0xa68a205b, 0x47085aba, 0xcd91de45}}}}}
};

void print_struct(const char *title, const uint8_t *cbuf, int len)
{
    printf("\n");
    printf("[%s] %d\n", title, len);
    for (int i = 0; i < len; i++) {
        if (i % 30 == 0 && i != 0)
            printf("\n");
        else if ((i % 10 == 0 || i % 20 == 0) && i != 0)
            printf(" ");
        printf("%02X ", cbuf[i]);
    }
    printf("\n");
}

void run_tests()
{
    print_struct("scalar", (const uint8_t *) &scalar, sizeof(scalar));
    print_struct("array", (const uint8_t *) &array[0], sizeof(array[0]));
}
[scalar] 48
E5 38 72 1C E3 38 1C CF 70 0C 6F 78 6E EC 16 16 AE 91 66 3A 29 7E 53 21 EF 82 9E 4D CB F1
28 A6 DF 7D 5A 2E 5B 20 8A A6 BA 5A 08 47 45 DE 91 CD
[array] 48
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
I've now narrowed down the example.
The following is a complete, standalone example. I also forgot to mention that on Linux, using g++ 9.3.0 with -std=c++17, the initialization gives the expected result of all FF's. However, on an embedded device, the structure that inherits gets all 0's.
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

struct Data {
    uint32_t words;
};

struct Overlay {
    Data val;
};

struct Inherit : Data {
};

static Overlay overlay[1] = {
    {{.words = 0xffffffff}}
};

static Inherit inherit[1] = {
    {{.words = 0xffffffff}}
};

void print_struct(const char *title, const uint8_t *cbuf, int len)
{
    printf("[%s] %d\n", title, len);
    for (int i = 0; i < len; i++) {
        printf("%02X ", cbuf[i]);
    }
    printf("\n");
}

int main()
{
    print_struct("overlay", (const uint8_t *) &overlay[0], sizeof(overlay[0])); // FF FF FF FF
    print_struct("inherit", (const uint8_t *) &inherit[0], sizeof(inherit[0])); // 00 00 00 00 <-- incorrect?
    return 0;
}
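For what it's worth, since Inherit is a C++17 aggregate with a base class, one workaround worth trying on the embedded toolchain (just a sketch, not a confirmed fix for that compiler) is to initialize the base subobject explicitly rather than through a nested designated initializer:

#include <stdint.h>
#include <stdio.h>

struct Data {
    uint32_t words;
};

struct Inherit : Data {
};

// Initialize the Data base subobject with an explicit Data{...} instead of
// relying on designated initializers inside nested braces.
static Inherit inherit[1] = {
    { Data{0xffffffff} }
};

int main()
{
    const uint8_t *p = (const uint8_t *) &inherit[0];
    for (size_t i = 0; i < sizeof(inherit[0]); i++)
        printf("%02X ", p[i]);   // expected: FF FF FF FF
    printf("\n");
    return 0;
}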

Disable alignment on a 64-bit structure

I'm trying to align my structure and make it as small as possible using bit fields. I have to send this data back to a client, which will examine the fields to set a few data members.
The size of the structure is indeed the same, but when I set members it does not work at all.
Here's some example code:
#include <cstdint>
#include <iostream>
#include <windows.h> // Int64ShllMod32

#pragma pack(push, 1)
struct PW_INFO
{
    char hash[16];          //Does not matter
    uint32_t number;        //Does not matter
    uint32_t salt_id : 30;  //Position: 0 bits
    uint32_t enc_level : 7; //Position: 30 bits
    uint32_t delta : 27;    //Position: 37 bits
};                          //Total size: 28 bytes
#pragma pack(pop)

void int64shrl(uint64_t& base, uint32_t to_shift, uint32_t position)
{
    uint64_t res = static_cast<uint64_t>(to_shift);
    res = Int64ShllMod32(res, position);
    base |= res;
}

int32_t main()
{
    std::cout << "Size of PW_INFO: " << sizeof(PW_INFO) << "\n"; //Returns 28 as expected (16 + sizeof(uint32_t) + 8)
    PW_INFO pw = { "abc123", 0, 0, 0, 0 };
    pw.enc_level = 105;
    uint64_t base{ 0 };
    &base; //debug purposes
    int64shrl(base, 103, 30);
    return 0;
}
Here's where it gets weird: setting the "enc_level" field (which is 30 bits into the bit field) yields the following result in memory:
0x003FFB8C 61 62 63 31 32 33 00 00 abc123..
0x003FFB94 00 00 00 00 00 00 00 00 ........
0x003FFB9C 00 00 00 00 00 00 00 00 ........
0x003FFBA4 69 00 00 00 i...
(Only the last 8 bytes are of concern, since they represent the bit field.)
But Int64ShllMod32 returns the correct result (the remote client understands it perfectly):
0x003FFB7C 00 00 00 c0 19 00 00 00 ...À....
I'm guessing it has to do with alignment; if so, how would I completely get rid of it? It seems that even though the size is correct, the compiler still aligns the bit fields somehow, despite the 1-byte boundary the #pragma directive requests.
More information:
I use Visual Studio 2015 and its compiler.
I am not trying to write these in a different format; the reason I'm asking is precisely that I do NOT want to use my own format. The client reads from 64-bit bit fields everywhere. I don't have access to its source code, but I see a lot of calls to Int64ShrlMod32 (from what I've read, this is what the compiler produces when dealing with 8-byte structures).
The actual bit field starts at "salt_id": 30 + 7 + 27 = 64 bits. I hope that is clearer now.
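Since the standard leaves bit-field allocation order and straddling behavior up to the implementation, one way to get a layout the remote side can rely on is to pack the 64-bit field manually with shifts and masks instead of a bit-field struct. A hedged sketch (pack_fields is an illustrative helper, not anything from the client's code; the field names and widths 30/7/27 follow the struct above):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

// Manually pack salt_id (30 bits), enc_level (7 bits) and delta (27 bits)
// into one 64-bit value; every bit position is now explicit.
uint64_t pack_fields(uint32_t salt_id, uint32_t enc_level, uint32_t delta)
{
    uint64_t v = 0;
    v |= static_cast<uint64_t>(salt_id & 0x3FFFFFFFu);        // bits  0..29
    v |= static_cast<uint64_t>(enc_level & 0x7Fu) << 30;      // bits 30..36
    v |= static_cast<uint64_t>(delta & 0x7FFFFFFu) << 37;     // bits 37..63
    return v;
}

int main()
{
    uint64_t packed = pack_fields(0, 103, 0);   // mirrors int64shrl(base, 103, 30) above
    unsigned char bytes[8];
    memcpy(bytes, &packed, sizeof bytes);       // shown in little-endian order (x86)
    for (size_t i = 0; i < sizeof bytes; i++)
        printf("%02x ", bytes[i]);              // 00 00 00 c0 19 00 00 00, matching the dump in the question
    printf("\n");
    return 0;
}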

vector or array value insertion

I have a variable
long long int alpha;
alpha is 8 bytes, but I want the number of bytes actually used to be decided dynamically by an input to the function, so that the value can be inserted into a char* array. Suppose there is a function:
int putInput(int sizeOfAlpha){
    long long int alpha;
    char* beta = (char*)malloc(128);
    for(int i = 0 ; i < 128 ; i++){
        ... alpha calculation ...
        beta[i*sizeOfAlpha] = alpha; // This is also wrong
    }
}
Then the number of bytes written for alpha has to be controlled by sizeOfAlpha.
For instance, if sizeOfAlpha is 2 and alpha is 0x00 00 00 00 00 00 04 20 in hex, then for i == 0, beta[0] should be 04 and beta[1] should be 20.
If alpha is 0x00 00 00 00 00 00 42 AB in hex, then for i == 1, beta[2] should be 42 and beta[3] should be AB.
Can anyone help me with this?
Assuming alpha is unsigned:
std::vector<std::uint8_t> vec(8);
// Walk backwards from the last byte of this alpha's slot, peeling off the low
// byte of alpha each time, so the bytes land most significant byte first.
for (std::size_t j = (i + 1u) * sizeOfAlpha - 1u; sizeOfAlpha; --j, --sizeOfAlpha) {
    vec[j] = alpha & 0xff;
    alpha >>= 8;
}
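Putting that loop into a complete function looks roughly like this. This is only a sketch: the signature of putInput is adapted (it returns a vector and takes an element count), and the alpha values are placeholders for the real "alpha calculation":

#include <stdint.h>
#include <stdio.h>
#include <vector>

// Writes the low sizeOfAlpha bytes of each computed alpha into the output,
// most significant byte first, at consecutive positions.
std::vector<uint8_t> putInput(size_t sizeOfAlpha, size_t count)
{
    std::vector<uint8_t> out(count * sizeOfAlpha);
    for (size_t i = 0; i < count; ++i) {
        unsigned long long alpha = 0x420 + i;   // stand-in for the real calculation
        for (size_t n = sizeOfAlpha, j = (i + 1) * sizeOfAlpha - 1; n; --n, --j) {
            out[j] = alpha & 0xff;              // peel off the low byte
            alpha >>= 8;
        }
    }
    return out;
}

int main()
{
    std::vector<uint8_t> bytes = putInput(2, 2);
    for (size_t k = 0; k < bytes.size(); ++k)
        printf("%02x ", (unsigned)bytes[k]);    // 04 20 04 21
    printf("\n");
    return 0;
}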

Structure size and memory layout depending on #pragma pack

Consider the following program compiled in VC++ 2010:
#include <iostream>

#pragma pack(push, 1) // 1, 2, 4, 8
struct str_test
{
    unsigned int n;
    unsigned short s;
    unsigned char b[4];
};
#pragma pack(pop)

int main()
{
    str_test str;
    str.n = 0x01020304;
    str.s = 0xa1a2;
    str.b[0] = 0xf0;
    str.b[1] = 0xf1;
    str.b[2] = 0xf2;
    str.b[3] = 0xf3;
    unsigned char* p = (unsigned char*)&str;
    std::cout << sizeof(str_test) << std::endl;
    return 0;
}
I set a breakpoint on the return 0; line and look at the memory window in the debugger, starting from address p. I get the following results (sizeof and memory layout, depending on the pack value):
// 1 - 10 (pack, sizeof)
// 04 03 02 01 a2 a1 f0 f1 f2 f3
// 2 - 10
// 04 03 02 01 a2 a1 f0 f1 f2 f3
// 4 - 12
// 04 03 02 01 a2 a1 f0 f1 f2 f3
// 8 - 12
// 04 03 02 01 a2 a1 f0 f1 f2 f3
Two questions:
Why is sizeof(str_test) 12 for pack 8?
Why is the memory layout the same regardless of the pack value?
Why is sizeof(str_test) 12 for pack 8?
From the MSDN docs:
The alignment of a member will be on a boundary that is either a
multiple of n or a multiple of the size of the member, whichever is
smaller.
In your case the largest member is 4 bytes, which is smaller than 8, so 4 bytes will be used for alignment.
Why is the memory layout the same regardless of the pack value?
The compiler isn't permitted to reorder struct members, but it can pad them. In the case of pack 8 it does the following:
#pragma pack(push, 8) // largest member is 4 bytes, so 4 is used instead of 8
struct str_test
{
    unsigned int n;     // 4 bytes
    unsigned short s;   // 2 bytes
    unsigned char b[4]; // 4 bytes
    // 2 bytes padding here
};
#pragma pack(pop)
#pragma pack(pop)
And hence sizeof(str_test) will be 12.
Well, it seems the compiler (MSVC 2010) changes the padding location based on the type of the last member: in the case of unsigned char b[4]; it placed the two padding bytes at the end of the structure. In your case the two cc cc bytes happen to come after the character array.
#pragma pack(push, 8) // largest member is 4 bytes, so 4 is used instead of 8
struct str_test
{
    unsigned int n;   // 4 bytes
    unsigned short s; // 2 bytes
    // 2 bytes padding here
    int b;            // 4 bytes (last member changed from unsigned char[4] to int)
};
#pragma pack(pop)
What I did was change the last member from char[4] to int; this can be verified by subtracting the address of the first member from the address of the last member, which gives 6 and 8 respectively in the two cases.
Memory dump in the case of last member int as follows
04 03 02 01 a2 a1 cc cc f0 f1 f2 f3
Memory dump in the case of last member unsigned char[4] as follows
04 03 02 01 a2 a1 f0 f1 f2 f3 cc cc
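One way to see where the compiler actually puts each member and the padding, without reading the debugger memory window, is offsetof. A small sketch (vary the pack value at the top to compare):

#include <cstddef>
#include <iostream>

#pragma pack(push, 8) // try 1, 2, 4, 8
struct str_test
{
    unsigned int n;
    unsigned short s;
    unsigned char b[4];
};
#pragma pack(pop)

int main()
{
    std::cout << "sizeof: " << sizeof(str_test) << "\n";      // 12 with pack 4 or 8, 10 with pack 1 or 2
    std::cout << "n at:   " << offsetof(str_test, n) << "\n"; // 0 in every case
    std::cout << "s at:   " << offsetof(str_test, s) << "\n"; // 4 in every case
    std::cout << "b at:   " << offsetof(str_test, b) << "\n"; // 6 in every case
    return 0;
}

The member offsets come out the same for every pack value; only the trailing padding changes, which is why the memory dump looks identical.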

C union char array prints 'd' on mac?

I am new to C/C++, but I'm curious about the issue I am seeing.
#include <stdio.h>

typedef union
{
    int a;
    float c;
    char b[20];
} Union;

int main()
{
    Union y = {100};
    printf("Union y :%d - %s - %f \n", y.a, y.b, y.c);
}
And the output is:
Union y :100 - d - 0.000000
My question is: why is 'd' getting printed? I changed the order of the members in the union and still got the same output, but if I declare a char f[20] outside the union, then nothing gets printed for it.
I'm on a MacBook running Lion and using Xcode.
Thanks in advance.
The ASCII code for 'd' is 100. Setting a to 100 amounts to setting b to {'d', '\0', '\0', '\0', …noise…} (on a 32-bit little-endian machine), which printf treats as "d".
The following program may help you understand better:
#include <stdio.h>

typedef union
{
    int a;
    float c;
    char b[20];
} Union;

void dump(const void* buffer, size_t length)
{
    size_t i;
    for (i = 0; i < length;) {
        printf("%.2x ", reinterpret_cast<const unsigned char*>(buffer)[i]);
        ++i;
        if (i % 16 == 0) {
            putchar('\n');
        } else if (i % 8 == 0) {
            putchar(' ');
        }
    }
    if (i % 16 != 0) {
        putchar('\n');
    }
}

int main()
{
    Union y = {100};
    printf("Union y :%d - %s - %f \n", y.a, y.b, y.c);
    printf("The content of the Union is: \n");
    dump(&y, sizeof y);
}
The output is:
Union y :100 - d - 0.000000
The content of the Union is:
64 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00
Effectively, the binary representation of a (which is an int) is 64 00 00 00, that is, 100 in little-endian hexadecimal. The binary representation of b is 64 00 ...: the 0x00 ends the string, and 0x64 is 'd'. The binary representation of c is also 64 00 00 00, which as an IEEE 754 float is a tiny subnormal value (the exponent bits are all zero and the only set bits fall in the low mantissa bits), roughly 1.4e-43, which %f prints as 0.000000.
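If you want to convince yourself that y.c is a tiny subnormal rather than exactly zero, you can print the same bit pattern with %g instead of %f. A small sketch (it bypasses the union and just copies the int's bits into a float):

#include <stdio.h>
#include <string.h>
#include <stdint.h>

int main()
{
    uint32_t bits = 100;          // the shared bytes: 64 00 00 00
    float f;
    memcpy(&f, &bits, sizeof f);  // reinterpret the int's bit pattern as a float
    printf("%f\n", f);            // 0.000000 (what the question's printf shows)
    printf("%g\n", f);            // about 1.4013e-43: a subnormal, not exactly 0
    return 0;
}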
I changed the order in union still the same output.
The order of the elements in a union doesn't change anything, because all the elements of a union share the same piece of memory. Your code prints 100 for y.a and d for y.b because both expressions interpret the same bytes. So, for example, if you add a line that sets the first byte of y.b and then prints again:
Union y = {100};
printf("Union y :%d - %s - %f \n", y.a, y.b, y.c);
y.b[0] = 'f';
printf("Union y :%d - %s - %f \n", y.a, y.b, y.c);
you'll see that y.a and y.c change whenever y.b changes, and vice versa. y.a should change to 102 in the second printf(), since that's the ASCII character code for 'f'.