C++: Casting unsigned char to a Structure

What I am trying to do
typedef struct {
unsigned char a;
unsigned char b;
unsigned int c;
} Packet;
unsigned char buffer[] = {1, 1, 0, 0, 0, 1};
Packet pkt = (Packet)buffer;
Basically, I am trying to cast a byte array to a structure in C++. When compiling, I get:
No matching function call for Packet::Packet(unsigned char[6])
Is this not possible or do I have to manually index into the array?

There are a few ways to do this:
// packet.h
////////////////
struct Packet {
unsigned char a;
unsigned char b;
unsigned int c;
};
If you compile and dump the structs with pahole, you will see the padding:
$ pahole -dr --structs main.o
struct Packet {
unsigned char a; /* 0 1 */
unsigned char b; /* 1 1 */
/* XXX 2 bytes hole, try to pack */
unsigned int c; /* 4 4 */
/* size: 8, cachelines: 1, members: 3 */
/* sum members: 6, holes: 1, sum holes: 2 */
/* last cacheline: 8 bytes */
};
So it's basically the 2 chars, 2 padding bytes and 4 bytes of an int for a total of 8 bytes.
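Because Intel x86 is a little-endian platform, the least significant byte of c comes first in memory, as in:
// main.cpp
#include <cstdio>
#include "packet.h"
void print_packet( Packet* pkt ) {
// a, b and c are members of Packet, so they are accessed through pkt
printf( "a:%d b:%d c:%u\n", int(pkt->a), int(pkt->b), pkt->c );
}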
int main() {
unsigned char buffer[] = {1, 1, 0, 0, 1, 0, 0, 0};
print_packet( (Packet*) buffer );
print_packet( reinterpret_cast<Packet*>(buffer));
}
Produces:
$ g++ main.cpp -o main
$ ./main
a:1 b:1 c:1
a:1 b:1 c:1
However, one can change the packing from the command line, as below, where the maximum alignment is set to 2 bytes:
$ g++ -ggdb main.cpp -o main -fpack-struct=2
$ pahole -dr --structs main
struct Packet {
unsigned char a; /* 0 1 */
unsigned char b; /* 1 1 */
unsigned int c; /* 2 4 */
/* size: 6, cachelines: 1, members: 3 */
/* last cacheline: 6 bytes */
} __attribute__((__packed__));
Then you can see that the Packet struct is only 6 bytes, and the result of running main is completely different:
$ ./main
a:1 b:1 c:65536
a:1 b:1 c:65536
This is because c now starts at offset 2 and covers the bytes 0, 0, 1, 0, so its value is 0x00010000, or 65536.
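Both results can be checked by hand. The following is a minimal sketch, not part of the original program: load_le32 and example_buffer are hypothetical names, and the helper simply assembles a little-endian 32-bit value from four bytes so that the value of c can be computed for either layout.

```cpp
#include <cstdint>

// Hypothetical helper: assemble a 32-bit value from four bytes stored
// least significant byte first (little-endian order).
constexpr std::uint32_t load_le32(const unsigned char* p) {
    return static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Same bytes as the buffer passed to print_packet in main().
constexpr unsigned char example_buffer[] = {1, 1, 0, 0, 1, 0, 0, 0};

// Default layout: c starts at offset 4 and covers the bytes 1, 0, 0, 0.
static_assert(load_le32(example_buffer + 4) == 1u,
              "c is 1 when two padding bytes precede it");

// 2-byte packing: c starts at offset 2 and covers the bytes 0, 0, 1, 0.
static_assert(load_le32(example_buffer + 2) == 0x00010000u,
              "c is 65536 when the padding is removed");
```

Because the helper reads individual bytes, it gives the same answer regardless of how the compiler lays out Packet.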
To avoid being at the mercy of these compiler settings, it is better to define your packet in code with explicit packing and explicit padding bytes:
// packet.h
////////////////
struct [[gnu::packed]] Packet {
unsigned char a;
unsigned char b;
unsigned char reserved[2];
unsigned int c;
};
Then execution becomes
$ g++ -ggdb main.cpp x.cpp -o main -fpack-struct=2
$ ./main
a:1 b:1 c:1
a:1 b:1 c:1
$ g++ -ggdb main.cpp x.cpp -o main -fpack-struct=4
$ ./main
a:1 b:1 c:1
a:1 b:1 c:1
$ g++ -ggdb main.cpp x.cpp -o main -fpack-struct=8
$ ./main
a:1 b:1 c:1
a:1 b:1 c:1
$ g++ -ggdb main.cpp x.cpp -o main -fpack-struct=16
$ ./main
a:1 b:1 c:1
a:1 b:1 c:1

First of all, your assumption that the byte representation of your structure is exactly what you wrote in the struct definition is wrong on most current architectures.
For example, on a 32-bit architecture your definition will be equivalent to:
struct Packet {
char a;
char b;
char __hidden_padding[2];
int c;
};
A similar thing happens on 64-bit architectures, although the amount of padding can differ between ABIs. To avoid this, you need to tell the compiler to "pack" the structure without padding bytes. There is no standard syntax for this, but most compilers provide a way to do it. For example, with gcc/clang you can write:
struct [[gnu::packed]] Packet {
char a;
char b;
int c;
};
Warning: when working with such structures, it is not advised to take the address of their members; see "Is gcc's __attribute__((packed)) / #pragma pack unsafe?".
Now, since "simple" types like char, int, etc. have implementation-defined sizes, it is much better to use fixed-size types (from <cstdint>) and finally check that the structure size is what you expect, as Evg suggested:
struct [[gnu::packed]] Packet {
int8_t a;
int8_t b;
int32_t c;
};
static_assert(sizeof(Packet) == 6);
Copying is best done with either std::bit_cast, if you have C++20, or just memcpy. As far as I know, these two are the only standard ways today. Using *reinterpret_cast<Packet*>(buffer) is undefined behavior, though it still works with most compilers.
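The memcpy route could look roughly like the sketch below. WirePacket, parse_packet and parse_packet_self_check are illustrative names that do not appear in the answer, [[gnu::packed]] remains a GCC/Clang extension, and the unsigned fixed-size types are an assumption made here to match the unsigned char buffer from the question.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Illustrative 6-byte wire layout; [[gnu::packed]] is GCC/Clang-specific.
struct [[gnu::packed]] WirePacket {
    std::uint8_t a;
    std::uint8_t b;
    std::uint32_t c;
};
static_assert(sizeof(WirePacket) == 6,
              "WirePacket must match the 6-byte wire format");

// Copy the raw bytes into a properly typed object. Unlike dereferencing a
// reinterpret_cast pointer, memcpy is well defined here, and the array
// reference parameter rejects buffers that are not exactly 6 bytes long.
inline WirePacket parse_packet(const unsigned char (&buffer)[6]) {
    WirePacket pkt;
    std::memcpy(&pkt, buffer, sizeof pkt);
    return pkt;
}

// Small self-check. The four bytes of c are identical, so the expected
// value does not depend on the byte order of the host.
inline void parse_packet_self_check() {
    const unsigned char buffer[6] = {1, 2, 7, 7, 7, 7};
    const WirePacket pkt = parse_packet(buffer);
    assert(pkt.a == 1);
    assert(pkt.b == 2);
    assert(pkt.c == 0x07070707u);
}
```

The value read for c from real data still depends on the byte order of the host, so a protocol with a fixed byte order needs an explicit conversion after the copy.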

You can do this with a reinterpret_cast from the array:
Packet pkt = *reinterpret_cast<Packet*>(buffer);
What this does is decay the array into a pointer to its 1st element, then treat that pointer as a Packet* pointer, then we dereference that and copy it into a new Packet structure. This circumvents essentially all compiler type and safety checks, so you need to be very careful here.
One thing we can do to make this a bit safer is to use a static_assert to ensure that the structure is the size that we expect. This will then fail to compile if the compiler inserts any padding into the structure definition.
static_assert(sizeof(Packet) == 6);
Depending on your compiler and compilation settings, it is almost certain that your structure as written is NOT 6 bytes.
Any time you use reinterpret_cast, you are working very close to the realm of undefined or compiler-dependent behavior. Generally speaking, as long as you do the padding checks and deal only with primitive data types inside the structure, things will work as you would expect, even if the code is technically undefined according to the C++ standard. Compiler writers realize that this type of code is often needed, and so they generally support it in a sane way even though the C++ standard does not require them to.

Related

Signedness of int in while comparing with string.size() showing warning

So, I was writing this code:
void shortened(string s){
int cnt=0;
for(int i=0;i<s.size()-1;++i){
cnt++;
}
//some extra code
}
This for loop gave me a warning about a comparison of integer expressions of different signedness: int and string::size_type. As soon as I changed int i = 0 to unsigned int i = 0, the warning went away. I know that the length of a string can never be negative, and I assume the warning appears because int i can hold negative numbers as well. But why is the warning shown in the first place?
i=0;i<s.size()-1
seemed complete in itself. I would like to clear up my doubt.

I assume you are using g++ as the compiler (gcc behaves the same; I am not familiar with clang).
This warning comes from the compilation option -Wall, which enables a large set of commonly useful warnings (the name suggests "warn all", but it does not enable every warning); in C++ that set includes -Wsign-compare. (For more information, see https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#Warning-Options)
So g++ -o tmp tmp.cpp won't show the warning, but g++ -o tmp -Wall tmp.cpp will.
Back to the main topic: this odd-looking warning is shown because the return type of string.size() is size_t, and size_t is an unsigned type.
On my machine, size_t is defined in stddef.h as follows:
#define __SIZE_TYPE__ long unsigned int
typedef __SIZE_TYPE__ size_t;
In the for statement you compare an int with an unsigned type (here long unsigned int), and this is why the warning appears.
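One way to write the loop without the warning is sketched below, assuming the goal is simply to visit every index except the last one. count_all_but_last and count_self_check are hypothetical names. The sketch also sidesteps a second hazard of the original condition: because size() is unsigned, s.size() - 1 wraps around to a very large value when the string is empty.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Illustrative rewrite of the loop from the question.
inline std::size_t count_all_but_last(const std::string& s) {
    std::size_t cnt = 0;
    // i has the same unsigned type as s.size(), so both sides of the
    // comparison agree in signedness. Writing i + 1 < s.size() instead of
    // i < s.size() - 1 avoids the wraparound for an empty string.
    for (std::size_t i = 0; i + 1 < s.size(); ++i) {
        ++cnt;
    }
    return cnt;
}

// Small self-check of the boundary cases.
inline void count_self_check() {
    assert(count_all_but_last("hello") == 4);
    assert(count_all_but_last("a") == 0);
    assert(count_all_but_last("") == 0);
}
```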

Must I care about the padding at the end of a C++ struct? I promise I won't use it in an array

I have a struct, and don't want implicit padding.
#include <cstdint>
struct foo
{
uint8_t a;
uint32_t b;
};
static_assert(sizeof(foo) == 8, "");
I turn on -Wpadded warning.
> g++ test.cpp -c -Wpadded -std=c++14
test.cpp:5:12: warning: padding struct to align 'foo::b' [-Wpadded]
uint32_t b;
^
> clang++ test.cpp -c -Wpadded -std=c++14
test.cpp:5:12: warning: padding struct 'foo' with 3 bytes to align 'b' [-Wpadded]
uint32_t b;
^
1 warning generated.
That's great, and that's what I want.
I'll now switch the members around. I don't care about padding at the end to make the struct of the proper alignment. I'd rather just have the size minimised.
#include <cstdint>
struct foo
{
uint32_t b;
uint8_t a;
};
static_assert(sizeof(foo) == 5, "");
> g++ test.cpp -c -Wpadded -std=c++14
test.cpp:2:8: warning: padding struct size to alignment boundary [-Wpadded]
struct foo
^
test.cpp:7:1: error: static assertion failed:
static_assert(sizeof(foo) == 5, "");
^
> clang++ test.cpp -c -Wpadded -std=c++14
test.cpp:2:8: warning: padding size of 'foo' with 3 bytes to alignment boundary [-Wpadded]
struct foo
^
test.cpp:7:1: error: static_assert failed ""
static_assert(sizeof(foo) == 5, "");
^ ~~~~~~~~~~~~~~~~
1 warning and 1 error generated.
That gives me a warning that I don't want.
How can I get a warning or compile-time error if implicit padding is added, but not if padding is missing at the end to align the whole struct? I'm not interested in using it in an array. Or is that a risky and careless thing to permit? I do need instances of the struct to be aligned properly.
Is there an attribute or modifier that would achieve the same effect?

You can do this:
static_assert(offsetof(foo, b) == offsetof(foo, a) + sizeof(uint8_t), "");
Of course you'll need to add one assert per member after the first one, or change the above to include all the fields, e.g. if c is the last field:
static_assert(offsetof(foo, c) == sizeof(uint8_t) + sizeof(uint32_t), "");
Now you may be wondering how to make this more automatic. You can use Boost.Fusion if you want; here's a question I asked and answered along those lines:
Boost Fusion: validate adapted struct member ordering at compile time
You'd use the same approach: sum the size of all fields up to the last one, and check if the offset of the last field is the same.
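Applied to the reordered struct from the question, the per-member checks might look like the sketch below. The constant foo_tail_padding is a hypothetical name added for illustration; it only reports how much trailing padding the compiler added, which these checks deliberately tolerate.

```cpp
#include <cstddef>
#include <cstdint>

// The reordered struct from the question.
struct foo {
    std::uint32_t b;
    std::uint8_t a;
};

// Fail to compile if there is a hole before or between the members.
static_assert(offsetof(foo, b) == 0, "unexpected padding before b");
static_assert(offsetof(foo, a) == offsetof(foo, b) + sizeof(std::uint32_t),
              "unexpected padding between b and a");

// Trailing padding is not checked: sizeof(foo) may exceed the 5 bytes of
// member data. This constant records how many bytes were added at the end.
constexpr std::size_t foo_tail_padding =
    sizeof(foo) - (offsetof(foo, a) + sizeof(std::uint8_t));
```

With this arrangement a hole between members is a compile-time error, while padding at the end stays silent.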

Clang: error: invalid use of non-static data member

Is this gcc being overly nice and doing what the dev thinks it will do, or is clang being overly fussy about something? Am I missing some subtle rule in the standard where clang is actually correct in complaining about this?
Or should I use the second bit of code, which is basically how offsetof works?
[adrian#localhost ~]$ g++ -Wall -pedantic -ansi a.cc
[adrian#localhost ~]$ a.out
50
[adrian#localhost ~]$ cat a.cc
#include <iostream>
struct Foo
{
char name[50];
};
int main(int argc, char *argv[])
{
std::cout << sizeof(Foo::name) << std::endl;
return 0;
}
[adrian#localhost ~]$ clang++ a.cc
a.cc:10:29: error: invalid use of non-static data member 'name'
std::cout << sizeof(Foo::name) << std::endl;
~~~~~^~~~
1 error generated.
[adrian#localhost ~]$ g++ -Wall -pedantic -ansi b.cc
[adrian#localhost ~]$ a.out
50
[adrian#localhost ~]$ cat b.cc
#include <iostream>
struct Foo
{
char name[50];
};
int main(int argc, char *argv[])
{
std::cout << sizeof(static_cast<Foo*>(0)->name) << std::endl;
return 0;
}
[adrian#localhost ~]$ clang++ b.cc
[adrian#localhost ~]$ a.out
50

I found adding -std=c++11 stops it complaining. GCC is fine with it in either version.
Modern GCC versions allow this even in -std=c++98 mode. However, older versions, like GCC 3.3.6 of mine, do complain and refuse to compile.
So now I wonder which part of C++98 I am violating with this code.
Wikipedia explicitly states that such a feature was added in C++11, and refers to N2253, which says that the syntax was not considered invalid by the C++98 standard initially, but then intentionally clarified to disallow this (I have no idea how non-static member fields are any different from other variables with regard to their data type). Some time later they decided to make this syntax valid, but not until C++11.
The very same document mentions an ugly workaround, which can also be seen throughout the web:
sizeof(((Class*) 0)->Field)
It looks like simply using 0, NULL or nullptr may trigger compiler warnings for possible dereference of a null pointer (despite the fact that sizeof never evaluates its argument), so an arbitrary non-zero value might be used instead, although it will look like a counter-intuitive “magic constant”. Therefore, in my C++ graceful degradation layer I use:
#if __cplusplus >= 201103L
#define CXX_MODERN 2011
#else
#define CXX_LEGACY 1998
#endif
#ifdef CXX_MODERN
#define CXX_FEATURE_SIZEOF_NONSTATIC
#define CxxSizeOf(TYPE, FIELD) (sizeof TYPE::FIELD)
#else
// Use of `nullptr` may trigger warnings.
#define CxxSizeOf(TYPE, FIELD) (sizeof (reinterpret_cast<const TYPE*>(1234)->FIELD))
#endif
Usage examples:
// On block level:
class SomeHeader {
public:
uint16_t Flags;
static CxxConstExpr size_t FixedSize =
#ifdef CXX_FEATURE_SIZEOF_NONSTATIC
(sizeof Flags)
#else
sizeof(uint16_t)
#endif
;
}; // end class SomeHeader
// Inside a function:
void Foo(void) {
size_t nSize = CxxSizeOf(SomeHeader, Flags);
} // end function Foo(void)
By the way, note the syntax difference between sizeof(Type) and sizeof Expression: they are formally not the same, even though sizeof(Expression) works, as long as sizeof (Expression) is valid. So the most correct and portable form would be sizeof(decltype(Expression)), but unfortunately it only became available in C++11; some compilers have provided typeof(Expression) for a long time, but this was never a standard extension.
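For comparison, the two spellings discussed in this thread can be placed side by side. This is a minimal sketch assuming a C++11 or later compiler; the constant names name_size_cxx11 and name_size_cxx98 are made up for illustration.

```cpp
#include <cstddef>

struct Foo {
    char name[50];
};

// C++11 and later: a non-static data member may be named directly inside
// an unevaluated operand such as sizeof.
constexpr std::size_t name_size_cxx11 = sizeof(Foo::name);

// Workaround from the question: form a member access through a null
// pointer. sizeof never evaluates its operand, so nothing is dereferenced.
const std::size_t name_size_cxx98 = sizeof(static_cast<Foo*>(0)->name);

static_assert(name_size_cxx11 == 50, "sizeof(Foo::name) is the array size");
```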

Assign a value to a variable at compilation time

I'd like to assign a specific value to a variable when my code is compiling (for C and C++):
For example having :
//test.c
#include <stdio.h>
int main()
{
int x = MYTRICK; /* edit: changed __MYTRICK__ to MYTRICK, following the advice in the comments */
printf ("%d\n", x);
return 0;
}
being able to do something like:
gcc -XXX MYTRICK=44 test.c -o test
and having as a result :
$./test
44

Use -D option:
gcc -DMYTRICK=44 test.c -o test
And use the MYTRICK macro in your program, not __MYTRICK__. Names beginning with __ are reserved for the implementation.
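If the file should also compile when the option is left out, a fallback definition is one possibility. The sketch below is an assumption about how one might arrange this, not something the answer prescribes; the default of 0 and the name configured_value are arbitrary.

```cpp
// Hypothetical fallback: if the build does not pass -DMYTRICK=<value>,
// use 0 so that the file still compiles.
#ifndef MYTRICK
#define MYTRICK 0
#endif

// The macro is replaced by the preprocessor, so the value is fixed at
// compile time and can initialize a constant.
constexpr int configured_value = MYTRICK;
```

Compiling with -DMYTRICK=44 then makes configured_value equal to 44, while a plain build falls back to 0.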

Why would the size of a packed structure be different on Linux and Windows when using gcc?

In the code below, why is the size of the packed structure different on Linux and Windows when compiled with gcc?
#include <inttypes.h>
#include <cstdio>
// id3 header from an mp3 file
struct header
{
uint8_t version[ 2 ];
uint8_t flags;
uint32_t size;
} __attribute__((packed));
int main( int argc, char **argv )
{
printf( "%u\n", (unsigned int)sizeof( header ) );
return 0;
}
gcc versions used:
$ g++ --version
g++ (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2
$ x86_64-w64-mingw32-g++ --version
x86_64-w64-mingw32-g++ (GCC) 4.7.0 20110831 (experimental)
Compile and test:
$ g++ -Wall packed.cpp -o packed && ./packed
7
$ x86_64-w64-mingw32-g++ -Wall packed.cpp -o packed.exe
--> prints '8' when run on Windows.
The Linux binary prints the expected size of 7 bytes, the Windows binary 8 bytes. Why the difference?

gcc 4.7.0 does it this way to be compatible with 64-bit MSVC++. If you want to pack the structure properly, compile with -mno-ms-bitfields. (But then your layout will be incompatible with MSVC++.)

Section 6.37.3 of the gcc attributes explains it as a difference in ABI specs, see here: http://gcc.gnu.org/onlinedocs/gcc/Type-Attributes.html

__attribute__((packed)) is specific to GCC and compatible compilers.
Hence, that code won't even compile with MSVC++. Maybe you used another compiler for Windows, though. However, with MSVC++ you could do this:
#include <stdint.h>
#include <cstdio>
// id3 header from an mp3 file
#pragma pack(push,1)
struct header
{
uint8_t version[ 2 ];
uint8_t flags;
uint32_t size;
};
#pragma pack(pop)
int main( int argc, char **argv )
{
printf( "%u\n", (unsigned int)sizeof( header ) );
return 0;
}
and the struct will be 7 bytes.
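Since #pragma pack is accepted by MSVC, GCC and Clang alike, the expected layout can also be verified at compile time rather than by printing it. The following sketch adds static_assert checks that are not in the original answer; it assumes a C++11 compiler.

```cpp
#include <cstddef>
#include <cstdint>

// id3 header from an mp3 file, packed to 1-byte alignment.
#pragma pack(push, 1)
struct header {
    std::uint8_t version[2];
    std::uint8_t flags;
    std::uint32_t size;
};
#pragma pack(pop)

// Compilation fails on any toolchain that lays the struct out differently.
static_assert(offsetof(header, size) == 3, "size must follow flags directly");
static_assert(sizeof(header) == 7, "header must be exactly 7 bytes");
```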

This is all about the attribute and word alignment in memory.
If you write:
struct header
{
uint8_t version[ 2 ];
uint8_t flags;
uint32_t size;
};
then the size is 8 on both Linux and Windows.
But when you specify the attribute to avoid the default word alignment:
struct header
{
uint8_t version[ 2 ];
uint8_t flags;
uint32_t size;
} __attribute__((packed));
then on Linux, because of the attribute, the size becomes 7.
The gcc documentation says:
If packed is used on a structure, or if bit-fields are used
it may be that the Microsoft ABI packs them differently than
GCC would normally pack them.

Update. Latest MinGW works fine.
Both g++ (i686-win32-dwarf-rev0, Built by MinGW-W64 project) 8.1.0 and
g++ (x86_64-win32-seh-rev0, Built by MinGW-W64 project) 8.1.0
report that sizeof() for the sample code is exactly 7 bytes.
