C++ Union Array differs in 32/64 bits - c++

My code:
union FIELD {
int n;
char c;
const char *s;
FIELD(){}
FIELD(int v){ n = v; }
FIELD(char v){ c = v; }
FIELD(const char* v){ s = v; }
};
struct SF {
const char* s0;
char s1;
int s2;
const char* s3;
};
int main() {
printf("sizeof(long) = %ld\n", sizeof(long));
printf("now is %d bit\n", sizeof(long) == 8?64:32);
FIELD arrField[] = {
FIELD("any 8 words 0 mixed"), FIELD('d'), FIELD(251356), FIELD("edcba")
};
SF* sf0 = (SF*)&arrField;
printf("sf0->s0 = %s, ", sf0->s0);
printf("sf0->s1 = %c, ", sf0->s1);
printf("sf0->s2 = %d, ", sf0->s2);
printf("sf0->s3 = %s\n", sf0->s3);
}
When I build with the default 64-bit settings, the output is:
I then add these compile parameters to CMakeLists.txt:
set_target_properties(untitled PROPERTIES COMPILE_FLAGS "-m32" LINK_FLAGS "-m32")
This builds a 32-bit program, which outputs:
My question is, how can I make a 64-bit program have the same output behavior as a 32-bit program?

Apply alignas(FIELD) to every member variable of SF.
Additionally, you cannot rely on the size of long to tell 64-bit and 32-bit systems apart. Check the size of a pointer instead. On some 64-bit systems long is 32 bits wide; this is the case on my system, for example.
Furthermore, %ld requires a long argument, but the sizeof operator yields a size_t, which is unsigned and does not necessarily match long in size. Add a cast to be safe (or use std::cout, which automatically selects the correct conversion based on the type of the right-hand operand of <<).
#include <cstdio>
union FIELD {
int n;
char c;
const char* s;
FIELD() {}
FIELD(int v) { n = v; }
FIELD(char v) { c = v; }
FIELD(const char* v) { s = v; }
};
struct SF {
alignas(FIELD) const char* s0;
alignas(FIELD) char s1;
alignas(FIELD) int s2;
alignas(FIELD) const char* s3;
};
int main() {
printf("sizeof(long) = %ld\n", static_cast<long>(sizeof(long)));
printf("now is %d bit\n", static_cast<int>(sizeof(void*)) * 8);
FIELD arrField[] = {
FIELD("any 8 words 0 mixed"), FIELD('d'), FIELD(251356), FIELD("edcba")
};
SF* sf0 = (SF*)&arrField;
printf("sf0->s0 = %s, ", sf0->s0);
printf("sf0->s1 = %c, ", sf0->s1);
printf("sf0->s2 = %d, ", sf0->s2);
printf("sf0->s3 = %s\n", sf0->s3);
}

Related

How do I deserialise a const byte * to a structure in cpp?

I have a structure like this
struct foo {
string str1;
uint16_t int1;
string str2;
uint32_t int2;
string str3;
};
The strings str1, str2, str3 have fixed lengths of 12 bytes, 3 bytes, etc., and are left-padded with spaces.
I have a function
void func(const byte* data, const size_t len) which is supposed to convert the byte* data to the structure foo. len is the length of data. What are the ways in which I can do this?
Again, data is a const pointer to byte and will not have null characters in between to distinguish the different members.
Should I use a character array instead of string for str1, str2, str3?
The easiest (but most error-prone) way is to just reinterpret_cast / std::memcpy if the strings have a fixed length:
// no padding
#pragma pack(push, 1)
struct foo {
char str1[12];
uint16_t int1;
char str2[3];
uint32_t int2;
char str3[4];
};
#pragma pack(pop)
void func(const byte* data, const size_t len) {
assert(len == sizeof(foo));
// non owning
const foo* reinterpreted = reinterpret_cast<const foo*>(data);
// owning
foo reinterpreted_val = *reinterpret_cast<const foo*>(data);
foo copied;
memcpy(&copied, data, len);
}
Notes:
Make sure that you're allowed to use reinterpret_cast
https://en.cppreference.com/w/cpp/language/reinterpret_cast#Type_aliasing
If you try to use strlen or another string operation on any of these strings, you will most likely get UB, since the strings are not null-terminated.
Slightly better approach:
struct foo {
char str1[13];
uint16_t int1;
char str2[4];
uint32_t int2;
char str3[5];
};
void func(const char* data, const size_t len) {
foo f;
memcpy(f.str1, data, 12);
f.str1[12] = '\0';
data+=12;
memcpy(&f.int1, data, sizeof(uint16_t));
data+=sizeof(uint16_t);
memcpy(f.str2, data, 3);
f.str2[3] = '\0';
data+=3;
memcpy(&f.int2, data, sizeof(uint32_t));
data+=sizeof(uint32_t);
memcpy(f.str3, data, 4);
f.str3[4] = '\0';
data+=4;
}
Notes:
You could combine both approaches to get rid of the pointer arithmetic. That would also account for any padding your struct might have.
I think the easiest way to do this is to change the strings inside the structure
to char arrays. Then you can simply copy objects of this
structure according to their size.
You will also have to deal with byte order somehow if the data moves between machines
with different byte order.
struct foo {
char str1[12];
uint16_t int1;
char str2[3];
uint32_t int2;
char str3[5];
};
byte* Encode(foo* p, int Size) {
int FullSize = Size * sizeof(foo);
byte* Returner = new byte[FullSize];
memcpy_s(Returner, FullSize, p, FullSize);
return Returner;
}
foo * func(const byte* data, const size_t len) {
int ArrSize = len/sizeof(foo);
if (!ArrSize || (ArrSize* sizeof(foo)!= len))
return nullptr;
foo* Returner = new foo[ArrSize];
memcpy_s(Returner, len, data, len);
return Returner;
}
int main()
{
const size_t ArrSize = 3;
foo Test[ArrSize] = { {"Test1",1000,"TT",2000,"cccc"},{"Test2",1001,"YY",2001,"vvvv"},{"Test1",1002,"UU",2002,"bbbb"}};
foo* Test1 = nullptr;
byte* Data = Encode(Test, ArrSize);
Test1 = func(Data, ArrSize * sizeof(foo));
if (Test1 == nullptr) {
std::cout << "Error extracting data!" << std::endl;
delete [] Data;
return -1;
}
std::cout << Test1[0].str1 << " " << Test1[1].str1 << " " << Test1[2].str3 << std::endl;
delete [] Data;
delete[] Test1;
return 0;
}
output
Test1 Test2 bbbb

How to let a variable be dependent on other variables inside a class?

What is wrong with the variable international_standard_book_number? How can I make it that it changes, whenever isbn_field_i changes?
#include <iostream>
#include <string>
class ISBN
{
private:
unsigned int isbn_field_1 = 0;
unsigned int isbn_field_2 = 0;
unsigned int isbn_field_3 = 0;
char digit_or_letter = 'a';
std::string international_standard_book_number =
std::to_string(isbn_field_1) + "-" + std::to_string(isbn_field_2) + "-" +
std::to_string(isbn_field_3) + "-" + digit_or_letter;
public:
ISBN()
{
isbn_field_1 = 0, isbn_field_2 = 0, isbn_field_3 = 0, digit_or_letter = 'a';
}
ISBN(unsigned int a, unsigned int b, unsigned int c, char d)
{
isbn_field_1 = a, isbn_field_2 = b, isbn_field_3 = c, digit_or_letter = d;
}
friend std::ostream& operator<<(std::ostream& os, ISBN const& i)
{
return os << i.international_standard_book_number;
}
};
int
main()
{
ISBN test(1, 2, 3, 'b');
std::cout << test << "\n";
return 0;
}
Output:
0-0-0-a
Desired output:
1-2-3-b
Edit: This question aims at something else(why, instead of how) than mine, and its answers don't help me as much as the answers from this topic.
What is wrong with the variable international_standard_book_number? How can I make it that it changes, whenever isbn_field_i changes?
Generally speaking: you have to reassign it every time one of its components changes.
In your particular case: change the constructor to use an initialization list.
That is, instead of
ISBN(unsigned int a, unsigned int b, unsigned int c, char d)
{isbn_field_1=a, isbn_field_2=b, isbn_field_3=c, digit_or_letter=d;};
write
ISBN(unsigned int a, unsigned int b, unsigned int c, char d)
: isbn_field_1{a}, isbn_field_2{b}, isbn_field_3{c}, digit_or_letter{d}
{}
Now your example code prints
1-2-3-b
What changes?
With
ISBN(unsigned int a, unsigned int b, unsigned int c, char d)
{isbn_field_1=a, isbn_field_2=b, isbn_field_3=c, digit_or_letter=d;};
first your fields are initialized from their default member initializers, so
isbn_field_1 = 0;
isbn_field_2 = 0;
isbn_field_3 = 0;
digit_or_letter = 'a';
international_standard_book_number="0"+"-"+"0"+"-"+"0"+"-"+'a';
then the body of the constructor is executed
isbn_field_1 = 1;
isbn_field_2 = 2;
isbn_field_3 = 3;
digit_or_letter = 'b';
but international_standard_book_number remains unchanged.
With
ISBN(unsigned int a, unsigned int b, unsigned int c, char d)
: isbn_field_1{a}, isbn_field_2{b}, isbn_field_3{c}, digit_or_letter{d}
{}
the initialization list initializes the fields (replacing their default member initializers)
isbn_field_1 = 1;
isbn_field_2 = 2;
isbn_field_3 = 3;
digit_or_letter = 'b';
and then the default member initializer of international_standard_book_number is executed, but using the new values, so
international_standard_book_number="1"+"-"+"2"+"-"+"3"+"-"+'b';
Use a member function.
#include <iostream>
#include <string>
class ISBN
{
private:
unsigned int isbn_field_1=0;
unsigned int isbn_field_2=0;
unsigned int isbn_field_3=0;
char digit_or_letter='a';
std::string international_standard_book_number() const {
return std::to_string(isbn_field_1)+"-"+std::to_string(isbn_field_2)+"-"+std::to_string(isbn_field_3)+"-"+digit_or_letter;
}
public:
ISBN(){isbn_field_1=0, isbn_field_2=0, isbn_field_3=0, digit_or_letter='a';}
ISBN(unsigned int a, unsigned int b, unsigned int c, char d){isbn_field_1=a, isbn_field_2=b, isbn_field_3=c, digit_or_letter=d;};
friend std::ostream &operator<<(std::ostream &os, ISBN const &i)
{
return os << i.international_standard_book_number();
}
};
int main()
{
ISBN test(1,2,3,'b');
std::cout << test << "\n";
return 0;
}
Variables in C++ use value semantics. When you do
std::string international_standard_book_number=
std::to_string(isbn_field_1)+"-"+std::to_string(isbn_field_2)+"-"+std::to_string(isbn_field_3)+"-"+digit_or_letter;
it assigns a value to international_standard_book_number based on the values that isbn_field_n have right now. It does not create any kind of automatic link between these variables that keeps them in sync.
If you want that behaviour, you have to make sure you update international_standard_book_number every time one of the other fields is updated.
If you only need to set the value once (e.g. the other values don't change after the object
has been constructed), you can use an initializer list:
ISBN(unsigned int a, unsigned int b, unsigned int c, char d)
: isbn_field_1(a),
isbn_field_2(b),
isbn_field_3(c),
digit_or_letter(d),
international_standard_book_number(
std::to_string(isbn_field_1) + "-" +
std::to_string(isbn_field_2) + "-" +
std::to_string(isbn_field_3) + "-" +
digit_or_letter)
{};
But keep in mind that the members are still initialized in the order they are declared, not in the order of the initializer list.
Technically, you don't need to initialize international_standard_book_number in the initializer list, as max66's answer shows; it's a question of personal preference.
Maintaining class invariants (dependent variables) is something you have to code manually. It is one of the reasons why we need classes. In a class you can forbid direct changes to members (make them private) and, when they are changed through dedicated set methods, for instance, update the invariants accordingly.
E.g.
void set_field_1(int field) {
isbn_field_1 = field;
international_standard_book_number = std::to_string(isbn_field_1)+"-"+std::to_string(isbn_field_2)+"-"+std::to_string(isbn_field_3)+"-"+digit_or_letter;
}
I want to add to @max66's answer.
Usually you have a hierarchy of constructors calling each other and one final "master" constructor that takes all arguments and initializes the variables. This avoids code duplication and makes it much clearer which constructor initializes what. You can see what I mean in the example below. Besides that, for readable string formatting, consider the {fmt} library:
#include <fmt/format.h>
#include <string>
class ISBN
{
private:
unsigned int isbn_field_1;
unsigned int isbn_field_2;
unsigned int isbn_field_3;
char digit_or_letter;
std::string international_standard_book_number;
public:
ISBN()
: ISBN{ 0, 0, 0, 'a' }
{}
ISBN(unsigned int a, unsigned int b, unsigned int c, char d)
: isbn_field_1{ a }
, isbn_field_2{ b }
, isbn_field_3{ c }
, digit_or_letter{ d }
, international_standard_book_number{ fmt::format("{}-{}-{}-{}",
isbn_field_1,
isbn_field_2,
isbn_field_3,
digit_or_letter) }
{}
};
This code in Visual Studio 2019 at least works:
#include <iostream>
#include <string>
class ISBN
{
private:
unsigned int isbn_field_1 = 0;
unsigned int isbn_field_2 = 0;
unsigned int isbn_field_3 = 0;
char digit_or_letter = 'a';
std::string international_standard_book_number =
std::to_string(isbn_field_1) + "-" + std::to_string(isbn_field_2) + "-" +
std::to_string(isbn_field_3) + "-" + digit_or_letter;
public:
ISBN(unsigned int a, unsigned int b, unsigned int c, char d)
:isbn_field_1(a), isbn_field_2(b), isbn_field_3(c), digit_or_letter(d), international_standard_book_number(std::to_string(isbn_field_1) + "-" + std::to_string(isbn_field_2) + "-" +
std::to_string(isbn_field_3) + "-" + digit_or_letter)
{
}
friend std::ostream& operator<<(std::ostream& os, ISBN const& i)
{
return os << i.international_standard_book_number;
}
};
int
main()
{
ISBN test(1, 2, 3, 'b');
std::cout << test << "\n";
test = {2, 3, 4, 'c'};
std::cout << test << "\n";
return 0;
}
Also, why the empty constructor?

How to allocate memory for array of structs?

I have a C++ library (really a wrapper for another C++ library) and I need to pass some structs to my C application.
I don't know how to allocate the memory dynamically.
I get a segmentation fault.
library.h
struct my_substruct {
unsigned char id ;
time_t date ;
char *info ;
};
typedef struct my_substruct My_substruct ;
struct my_struct {
char *description ;
unsigned char value ;
My_substruct *substruct ;
};
typedef struct my_struct My_struct ;
library.cpp
unsigned char getStructs(My_struct *structs)
{
vector <structCPPLibrary> structsCPPLibrary = getFromCPPLibrary();
unsigned char numStructs = structsCPPLibrary.size();
structs = (My_struct *)malloc(numStructs*sizeof(My_struct));
unsigned char indexStruct = 0;
for (auto s : structsCPPLibrary)
{
structs[indexStruct].description = (char *)malloc(s.description.size() + 1);
strcpy(structs[indexStruct].description, s.description.c_str()); // In 's' is a std::string
structs[indexStruct].value = s.value; // In 's' is an unsigned char too
unsigned char numSubstructs = s.substructs.size(); // In 's' is a vector of Substructs
structs[indexStruct].substruct = (My_substruct *)malloc(numSubstructs*sizeof(My_substruct));
unsigned char indexSubstruct = 0;
for (auto sub : s.substructs) {
structs[indexStruct].substruct[indexSubstruct].id = sub.id; // In 'sub' is an unsigned char too
structs[indexStruct].substruct[indexSubstruct].date = sub.date; // In 'sub' is a time_t too
structs[indexStruct].substruct[indexSubstruct].info = (char *)malloc(sub.info.size() + 1);
strcpy(structs[indexStruct].substruct[indexSubstruct].info, sub.info.c_str()); // In 'sub' is a std::string
indexSubstruct++;
}
indexStruct++;
}
return indexStruct;
}
main.c
void getStructFromWrapper(void)
{
My_struct *structs;
unsigned char numStruct = getStructs(structs);
show_content(structs);
}
Change
unsigned char getStructs(My_struct *structs) {
...
}
getStructs(structs);
To
unsigned char getStructs(My_struct **p_structs) {
// C has no references, so bind a C++ reference to the caller's pointer here
auto& structs = *p_structs;
...
}
...
getStructs(&structs);
Your problem is that the structs = ... line does not modify the value of structs in getStructFromWrapper; it only changes the function's local copy of the pointer.

How to obtain sizes of arrays in a loop

I am aligning several arrays in order and performing some sort of classification. I created an array to hold other arrays in order to simplify the operations that I want to perform.
Sadly, my program crashed when I ran it. After debugging, I finally realized that within the loop the sizeof operator gives me the sizes of pointers, not of arrays. So I resorted to the cumbersome solution and my program worked.
How can I avoid this cumbersome method? I want to calculate within a loop!
#include <cstdio>   // printf
#include <cstring>  // strcmp
#include <iostream>
#include <string>
#define ARRSIZE(X) (sizeof(X) / sizeof(*(X)))
int classify(const char *asset, const char ***T, size_t T_size, size_t *index);
int main(void)
{
const char *names[] = { "book","resources","vehicles","buildings" };
const char *books[] = { "A","B","C","D" };
const char *resources[] = { "E","F","G" };
const char *vehicles[] = { "H","I","J","K","L","M" };
const char *buildings[] = { "N","O","P","Q","R","S","T","U","V" };
const char **T[] = { books,resources,vehicles,buildings };
size_t T_size = sizeof(T) / sizeof(*T);
size_t n, *index = new size_t[T_size];
/* This will yield the size of pointers, not arrays...
for (n = 0; n < T_size; n++) {
index[n] = ARRSIZE(T[n]);
}
*/
/* Cumbersome solution */
index[0] = ARRSIZE(books);
index[1] = ARRSIZE(resources);
index[2] = ARRSIZE(vehicles);
index[3] = ARRSIZE(buildings);
const char asset[] = "L";
int i = classify(asset, T, T_size, index);
if (i < 0) {
printf("asset is alien !!!\n");
}
else {
printf("asset ---> %s\n", names[i]);
}
delete[] index;
return 0;
}
int classify(const char *asset, const char ***T, size_t T_size, size_t *index)
{
size_t x, y;
for (x = 0; x < T_size; x++) {
for (y = 0; y < index[x]; y++) {
if (strcmp(asset, T[x][y]) == 0) {
return x;
}
}
}
return -1;
}
As you are including <string> and <iostream>, I assume that the question is about C++ and not C. To avoid all this complication, simply use containers. E.g.:
#include <vector>
std::vector<int> vect = std::vector<int>(3,0);
std::cout << vect.size() << std::endl; // prints 3
One solution, if you are coding in C, is to terminate your array with a special item, such as NULL:
const char *books[] = { "A","B","C","D", NULL };
size_t size(const char *arr[])
{
const char **p = arr;
while (*p)
{
p++;
}
return p - arr;
}
You can specify the array sizes explicitly:
size_t n, index[] = {ARRSIZE(books), ARRSIZE(resources), ARRSIZE(vehicles), ARRSIZE(buildings)};
or, if you want to avoid typing everything twice, you can use X-macros to generate both tables:
#define TBL \
X(books) \
X(resources) \
X(vehicles) \
X(buildings)
const char **T[] = {
#define X(x) x,
TBL
};
#undef X
size_t n, index[] = {
#define X(x) ARRSIZE(x),
TBL
};
#undef X
which produces the same result.

Is this a C++ alternative to the struct hack?

Is the following valid C++? It's an alternative way of implementing a variable length tail to a flat structure. In C this is commonly done with the struct hack
struct Str
{
Str(int c) : count(c) {}
size_t count;
Elem* data() { return (Elem*)(this + 1); }
};
Str* str = (Str*)new char[sizeof(Str) + sizeof(Elem) * count];
new (str) Str(count);
for (int i = 0; i < count; ++i)
new (str->data() + i) Elem();
str->data()[0] = elem0;
str->data()[1] = elem1;
// etc...
I ask this in response to the following related question
No, it is not valid:
Elem might have a different alignment than Str, so (reinterpret_)casting Str + 1 to Elem* might or might not give you a valid pointer, and accessing it might give undefined behavior.
But after all, why would you want to do something like that?
Valid in what sense? It is C++ using C-like techniques which imho is fine as long as the project requirements leave no other choice.
If you are asking whether it will work: it will, as long as data alignment issues do not crash the code (e.g. on non-x86 architectures such as SPARC). C++ behaves much like C when addressing memory.
I tested it using the following modifications under gcc and VS and it works:
struct Elem
{
Elem() : x(0), t(0) { memset(c, 0, sizeof(c));}
Elem(int v) : x(v), t(0) { memset(c, 0, sizeof(c));}
Elem(const Elem &e) { *this = e; }
Elem &operator=(const Elem &e)
{
if (this != &e)
{
memcpy(c, e.c, sizeof(c));
x = e.x;
t = e.t;
}
return *this;
}
char c[21];
int x;
char t;
};
struct Str
{
Str(int c) : count(c) {}
size_t count;
Elem* data() { return (Elem*)(this + 1); }
};
int count = 11;
Str *str = (Str*)new char[sizeof(Str) + sizeof(Elem) * count];
new (str) Str(count);
for (int i = 0; i < count; ++i)
{
new (str->data() + i) Elem();
str->data()[i] = Elem(i+1);
}
for (int i=0; i<str->count; i++)
cout << "[" << i << "]: " << str->data()[i].x << endl;
Also, I added various members of different sizes to Str and Elem to force different padding and played with alignments (VS/some GCC: #pragma pack(...), GCC: __attribute__((aligned(...))) and __attribute__((packed))).
Please note that playing with alignments is not safe on all architectures - Relevant question