Unsigned char alias for struct - c++

Suppose I have this function:
template<class T>
uint8_t* toBytes(T&& obj)
{
uint8_t* array = new uint8_t[sizeof(T)];
for (int x = 0; x < sizeof(T); x++)
{
array[x] = reinterpret_cast<uint8_t*>(&obj)[x];
}
return array;
}
I am fairly certain that this is defined behavior (as long as I don't expect the memory to look like anything specific ... I think).
But now suppose I have another function:
template<class T>
T* toType(uint8_t* array)
{
return reinterpret_cast<T*>(array);
}
Is the following defined?
class A { /* Members of A */ };
A a;
uint8_t* array = toBytes(a);
A* anotherA = toType<A>(array);

I think it's undefined due to alignment issues. new uint8_t[sizeof(T)]; doesn't necessarily return memory that is suitably aligned for T.
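One way to sidestep both the alignment and the aliasing concerns is to copy the bytes back into a real T object instead of casting the buffer pointer. The following is only a minimal sketch under assumptions: it requires C++14, it works only for trivially copyable types, and the names toByteBuffer, fromByteBuffer and Sample are hypothetical and not from the question.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <memory>
#include <type_traits>

// Copy the object representation of a trivially copyable value into a
// heap-allocated byte buffer that cleans up after itself.
template <class T>
std::unique_ptr<std::uint8_t[]> toByteBuffer(const T& obj)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "only trivially copyable types can be copied bytewise");
    auto bytes = std::make_unique<std::uint8_t[]>(sizeof(T));
    std::memcpy(bytes.get(), &obj, sizeof(T));
    return bytes;
}

// Rebuild a T by copying the bytes into an actual T object. No pointer to
// the buffer is ever reinterpreted as a T*.
template <class T>
T fromByteBuffer(const std::uint8_t* bytes)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "only trivially copyable types can be copied bytewise");
    assert(bytes != nullptr);
    T result;
    std::memcpy(&result, bytes, sizeof(T));
    return result;
}

// Hypothetical example type.
struct Sample {
    int id;
    double value;
};
```

With these helpers, fromByteBuffer<Sample>(toByteBuffer(s).get()) yields a copy of s, at the cost of one extra copy compared with the cast.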

Related

Out of bounds array accesses in C++ and reinterpret_cast

Say I have code like this
struct A {
int header;
unsigned char payload[1];
};
A* a = reinterpret_cast<A*>(new unsigned char[sizeof(A)+100]);
a->payload[50] = 42;
Is this undefined behavior? Creating a pointer that points outside payload should be undefined AFAIK, but I'm unsure whether this is also true in the case where I have allocated the memory after the array.
The standard says p[n] is the same as *(p + n) and "if the expression P points to the i-th element of an array object, the expressions (P)+N point to the i+n-th elements of the array". In the example, payload points to an element in the array allocated with new, so this might be OK.
If possible, it would be nice if your answers contained references to the C++ standard.
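So accessing the result of this reinterpret_cast is undefined behavior: we can reinterpret_cast an object's address to char* or unsigned char* to inspect its bytes, but we can never go the other way and treat a char or unsigned char array as an object of another type. If we do: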
Accessing the object through the new pointer or reference invokes undefined behavior. This is known as the strict aliasing rule.
So yes this is a violation of the strict aliasing rule.
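For comparison, here is a minimal sketch of how an object can legitimately live inside a byte buffer: the buffer is aligned for the type and the object is created with placement new, so there really is an object of that type at the address. The names Packet and roundTripThroughBuffer are hypothetical, and the sketch deliberately stays within payload[0]; it does not make indexing past the end of the one-element array valid.

```cpp
#include <new>

// Hypothetical stand-in for the struct in the question.
struct Packet {
    int header;
    unsigned char payload[1];
};

inline int roundTripThroughBuffer()
{
    // Raw storage with the size and alignment that Packet requires.
    alignas(Packet) unsigned char storage[sizeof(Packet)];

    // Placement new starts the lifetime of a Packet inside the buffer.
    Packet* p = new (storage) Packet{};
    p->header = 5;
    p->payload[0] = 42;
    const int sum = p->header + p->payload[0];

    // End the lifetime explicitly; the storage itself is automatic.
    p->~Packet();
    return sum;
}
```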
Consider the code:
struct {char x[4]; char a; } foo;
int work_with_foo(int i)
{
foo.a = 1;
foo.x[i]++;
return foo.a;
}
Even though the program would "own" the storage at foo.x+4, the fact that
access via the array type is only defined for the first four elements would
allow a compiler to, among other things, replace the above code with either
of the following:
int work_with_foo(int i) { foo.a = 1; foo.x[i]++; return 1; }
int work_with_foo(int i) { foo.x[i]++; foo.a = 1; return 1; }
The above substitutions are clearly permissible under the Standard. It is
less clear what alternate ways of writing the increment would force the
compiler to behave as though it reloads foo.a. For example, I think the
code *(i+(char*)&foo)+=1; would have defined behavior when i equals the
offset of foo.a, and I would think the same should be true of
*(i+(char*)&foo.x)+=1; but I'm not sure about *(i+foo.x)+=1; or
*(i+(char*)foo.x)+=1;.
This old C hack is never necessary in C++.
Consider:
#include <cstddef>
#include <utility>
#include <memory>
template<std::size_t Size>
struct A {
int header;
unsigned char payload[Size];
};
struct polyheader
{
// named concept_t because `concept` is a reserved keyword since C++20
struct concept_t
{
virtual int& header() = 0;
virtual unsigned char* payload() = 0;
virtual std::size_t size() const = 0;
virtual ~concept_t() = default; // not strictly necessary, but a reasonable precaution
};
template<std::size_t Size>
struct model : concept_t
{
using a_type = A<Size>;
model(a_type a) : _a(std::move(a)) {}
int& header() override {
return _a.header;
}
unsigned char* payload() override {
return _a.payload;
}
std::size_t size() const override {
return Size;
}
A<Size> _a;
};
int& header() { return _impl->header(); }
unsigned char* payload() { return _impl->payload(); }
std::size_t size() const { return _impl->size(); }
template<std::size_t Size>
polyheader(A<Size> a)
: _impl(std::make_unique<model<Size>>(std::move(a)))
{}
std::unique_ptr<concept_t> _impl;
};
int main()
{
auto p1 = polyheader(A<40>());
auto p2 = polyheader(A<80>());
}

Assigning to data inside class via pointer

Suppose we have a type like this:
struct MyType
{
OtherType* m_pfirst;
OtherType* m_psecond;
OtherType* m_pthird;
....
OtherType* m_pn;
};
Would it be a safe way to assign to its members?
MyType inst;
....
OtherType** pOther = &inst.m_pfirst;
for (int i = 0; i < numOfFields; ++i, ++pOther)
{
*pOther = getAddr(i);
}
If your fields are named this way, then you have no choice:
inst.m_pfirst = getAddr(0);
inst.m_psecond = getAddr(1);
...
A better struct could be:
struct MyType {
OtherType *m_pFields[10];
};
...
for (int i=0; i<10; i++) {
inst.m_pFields[i] = getAddr(i);
}
As you tagged C++, you can use a ctor:
struct MyType {
OtherType *m_pFirst;
OtherType *m_pSecond;
MyType(OtherType *p1,OtherType *p2): m_pFirst(p1), m_pSecond(p2) {};
};
...
MyType inst(getAddr(0),getAddr(1));
*pOther = getAddr(i);
Here you assign to the object that pOther points to, not to pOther itself. The unary * dereferences a pointer (it accesses what the pointer points to), and & is the opposite: it takes the address of an object.
OtherType** pOther = &inst.m_pfirst;
In &inst.m_pfirst, the member access . (precedence 2) binds more tightly than the address-of & (precedence 3), so you first select the struct field and then take its address. OtherType** pOther is a pointer to a pointer. In short, pOther points at inst.m_pfirst, and writing through *pOther changes that member. Now you can judge for yourself whether that makes sense.
Usually I do:
struct mystruct {
int data;
char *arr;
};
struct mystruct *s = (struct mystruct *)malloc(sizeof *s); // the cast is required in C++
s->data = 5; // same as (*s).data
int n = 3;
s->data = n; // data is a plain int, so assign the value directly; now it's 3
int capacity = s->data;
s->arr = (char *)malloc(capacity * sizeof(*(s->arr)));
And then you can plug in your address-functions at the appropriate places.
Hopefully that clears it up a little bit.
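This is not a proper way to do it. The members are separate objects, not elements of an array, so ++pOther moves the pointer past m_pfirst to a location you may not dereference. This code produces undefined behavior.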
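If the member names have to stay as they are, a loop is still possible without pointer arithmetic across members, by iterating over a table of pointers to members. This is only a sketch under assumptions: OtherType is reduced to a hypothetical one-field struct, getAddr is a hypothetical stand-in for the asker's function, and assignAll is a made-up helper name.

```cpp
#include <cstddef>

// Hypothetical stand-in for the asker's OtherType.
struct OtherType {
    int value;
};

struct MyType {
    OtherType* m_pfirst;
    OtherType* m_psecond;
    OtherType* m_pthird;
};

// Hypothetical stand-in for getAddr(i): hands out addresses from a fixed pool.
inline OtherType* getAddr(std::size_t i)
{
    static OtherType pool[3] = {{10}, {20}, {30}};
    return &pool[i];
}

inline void assignAll(MyType& inst)
{
    // Pointer to a data member of MyType whose type is OtherType*.
    using FieldPtr = OtherType* MyType::*;
    static constexpr FieldPtr fields[] = {
        &MyType::m_pfirst, &MyType::m_psecond, &MyType::m_pthird
    };

    std::size_t i = 0;
    for (FieldPtr field : fields) {
        inst.*field = getAddr(i++);  // each member is accessed by name
    }
}
```

The table has to be kept in sync with the struct by hand, which is the price for keeping the named members.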

Multidimensional array: operator overloading

I have a class that wraps a multidimensional array. It is possible to create a one-, two-, ..., n-dimensional array with this class.
If the array has n dimensions, I want to chain n calls to operator[] to get an element.
Example:
A a({2,2,2,2});
a[0][1][1][0] = 5;
However, the array is not a vector of pointers that lead to other vectors, and so on,
so I want operator[] to return a class object until the last dimension, and then return an integer.
This is strongly simplified code, but it shows my problem.
The error I receive: "[Error] cannot convert 'A::B' to 'int' in initialization"
#include <cstddef> // nullptr_t, ptrdiff_t, size_t
#include <iostream> // cin, cout...
class A {
private:
static int* a;
public:
static int dimensions;
A(int i=0) {
dimensions = i;
a = new int[5];
for(int j=0; j<5; j++) a[j]=j;
};
class B{
public:
B operator[](std::ptrdiff_t);
};
class C: public B{
public:
int& operator[](std::ptrdiff_t);
};
B operator[](std::ptrdiff_t);
};
//int A::count = 0;
A::B A::operator[] (std::ptrdiff_t i) {
B res;
if (dimensions <= 1){
res = C();
}
else{
res = B();
}
dimensions--;
return res;
}
A::B A::B::operator[] (std::ptrdiff_t i){
B res;
if (dimensions <=1){
res = B();
}
else{
res = C();
}
dimensions--;
return res;
}
int& A::C::operator[](std::ptrdiff_t i){
return *(a+i);
}
int main(){
A* obj = new A(5);
int res = obj[1][1][1][1][1];
std::cout<< res << std::endl;
}
The operator[] is evaluated from left to right in obj[1][1]...[1], so obj[1] returns a B object. Suppose now you just have int res = obj[1]; then you are initializing an int from a B object (or a C object in the case of multiple invocations of []), but there is no conversion from B or C to int. You probably need to write a conversion operator, like
operator int()
{
// convert to int here
}
for A and B. Note that conversion functions are inherited, so C picks up B's unless it declares its own.
I got rid of your compiling error just by writing such operators for A and B (of course I have linking errors since there are un-defined functions).
Also, note that if you want to write something like obj[1][1]...[1] = 10, you need to overload operator=, as again there is no implicit conversion from int to A or your proxy objects.
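To make the two pieces concrete (a conversion to int for reading and an operator= taking int for writing), here is a minimal sketch of a proxy-based array whose number of dimensions is chosen at run time. It is not the asker's class: NdArray and Proxy are hypothetical names, elements are stored in one flat row-major vector, and misuse (too few or too many indices) is only caught by assert.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <utility>
#include <vector>

class NdArray {
public:
    // Returned by every operator[]; it remembers how many indices have been
    // applied so far and the flat offset they add up to.
    class Proxy {
    public:
        Proxy(NdArray& owner, std::size_t depth, std::size_t offset)
            : owner_(owner), depth_(depth), offset_(offset) {}

        Proxy operator[](std::size_t i) const {
            assert(depth_ < owner_.dims_.size());
            assert(i < owner_.dims_[depth_]);
            return Proxy(owner_, depth_ + 1, offset_ * owner_.dims_[depth_] + i);
        }
        // Reading: only valid once every dimension has been indexed.
        operator int() const {
            assert(depth_ == owner_.dims_.size());
            return owner_.data_[offset_];
        }
        // Writing: same precondition as reading.
        Proxy& operator=(int value) {
            assert(depth_ == owner_.dims_.size());
            owner_.data_[offset_] = value;
            return *this;
        }

    private:
        NdArray& owner_;
        std::size_t depth_;
        std::size_t offset_;
    };

    explicit NdArray(std::vector<std::size_t> dims)
        : dims_(std::move(dims)),
          data_(std::accumulate(dims_.begin(), dims_.end(), std::size_t{1},
                                std::multiplies<std::size_t>()),
                0) {}

    Proxy operator[](std::size_t i) { return Proxy(*this, 0, 0)[i]; }

private:
    std::vector<std::size_t> dims_;  // declared first, so initialized first
    std::vector<int> data_;          // zero-initialized, row-major
};
```

With this, NdArray a({2, 2, 2, 2}); a[0][1][1][0] = 5; int v = a[0][1][1][0]; compiles and reads back the stored value. The checks happen at run time only, which is the trade-off for not encoding the dimensions in the type.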
Hope this makes sense.
PS: see also @Oncaphillis' comment!
vsoftco is totally right: you need to implement a conversion operator if you want to actually access your elements. This is necessary if you want it to be dynamic, which is how you describe it. I actually thought this was an interesting problem, so I implemented what you described as a template. I think it works, but a few things might be slightly off. Here's the code:
#include <cassert>
#include <vector>
template<typename T>
class nDimArray {
using thisT = nDimArray<T>;
T m_value;
std::vector<thisT*> m_children;
public:
nDimArray(std::vector<T> sizes) {
assert(sizes.size() != 0);
int thisSize = sizes[sizes.size() - 1];
sizes.pop_back();
m_children.resize(thisSize);
if(sizes.size() == 0) {
//initialize elements
for(auto &c : m_children) {
c = new nDimArray(T(0));
}
} else {
//initialize children
for(auto &c : m_children) {
c = new nDimArray(sizes);
}
}
}
~nDimArray() {
for(auto &c : m_children) {
delete c;
}
}
nDimArray<T> &operator[](const unsigned int index) {
assert(!isElement());
assert(index < m_children.size());
return *m_children[index];
}
//icky dynamic cast operators
operator T() {
assert(isElement());
return m_value;
}
T &operator=(T value) {
assert(isElement());
m_value = value;
return m_value;
}
private:
nDimArray(T value) {
m_value = value;
}
bool isElement() const {
return m_children.size() == 0;
}
//no implementation yet
nDimArray(const nDimArray&);
nDimArray&operator=(const nDimArray&);
};
The basic idea is that this class can either act as an array of arrays, or an element. That means that in fact an array of arrays COULD be an array of elements! When you want to get a value, it tries to cast it to an element, and if that doesn't work, it just throws an assertion error.
Hopefully it makes sense, and of course if you have any questions ask away! In fact, I hope you do ask because the scope of the problem you describe is greater than you probably think it is.
It could be fun to use a Russian-doll style template class for this.
#include <cstddef> // size_t, ptrdiff_t
// general template where 'd' indicates the number of dimensions of the container
// and 'n' indicates the length of each dimension
// with a bit more template magic, we could probably support each
// dimension being able to have its own size
template<size_t d, size_t n>
class foo
{
private:
foo<d-1, n> data[n];
public:
foo<d-1, n>& operator[](std::ptrdiff_t x)
{
return data[x];
}
};
// a specialization for one dimension. n can still specify the length
template<size_t n>
class foo<1, n>
{
private:
int data[n];
public:
int& operator[](std::ptrdiff_t x)
{
return data[x];
}
};
int main(int argc, char** argv)
{
foo<3, 10> myFoo;
for(int i=0; i<10; ++i)
for(int j=0; j<10; ++j)
for(int k=0; k<10; ++k)
myFoo[i][j][k] = i*10000 + j*100 + k;
return myFoo[9][9][9]; // would be 90909 in this case
}
Each dimension keeps an array of previous-dimension elements. Dimension 1 uses the specialization that wraps a 1D int array. Dimension 2 then keeps an array of one-dimensional arrays, dimension 3 an array of two-dimensional arrays, etc. Access then looks the same as with native multi-dimensional arrays. I'm using arrays inside the class in my example. This makes all the memory contiguous for the n-dimensional arrays, and doesn't require dynamic allocations inside the class. However, you could provide the same functionality with dynamic allocation as well.
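The "bit more template magic" for per-dimension sizes could look roughly like the following. This is only a sketch: Grid is a hypothetical name, it needs C++11 variadic templates, and like the class above it performs no bounds checking.

```cpp
#include <cstddef>

// Primary template: the first extent is peeled off, the remaining extents
// describe the element type one level down.
template <std::size_t N, std::size_t... Rest>
class Grid {
    Grid<Rest...> data_[N];

public:
    Grid<Rest...>& operator[](std::size_t i) { return data_[i]; }
    const Grid<Rest...>& operator[](std::size_t i) const { return data_[i]; }
};

// Partial specialization for the innermost dimension: a plain int array.
template <std::size_t N>
class Grid<N> {
    int data_[N];

public:
    int& operator[](std::size_t i) { return data_[i]; }
    const int& operator[](std::size_t i) const { return data_[i]; }
};
```

A declaration such as Grid<2, 3, 4> g{}; then gives a value-initialized 2 x 3 x 4 container that is indexed as g[i][j][k].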

error: Illegal zero sized array

I get this error:
error C2229: class 'GenerateRandNum<int [],int>' has an illegal zero-sized array
In my main, I call my random generator function to fill an empty data set.
I call the method in my main like so:
//declare small array
const int smallSize = 20;
int smallArray[smallSize];
// call helper function to put random data in small array
GenerateRandNum <int[], int> genData(smallArray, smallSize);
genData.generate();
Header file
template <class T, class B>
class GenerateRandNum
{
public:
T data;
B size;
GenerateRandNum(T list, B length)
{
data = list;
size = length;
}
void generate();
};
File with method definition
template<class T, class B>
void GenerateRandNum<T, B> ::generate()
{
for (B i = 0; i < size; i++)
{
data[0] = 1 + rand() % size;
}
}
Pointers and arrays are not the same in C/C++. They are two very different things. However, arrays decay into pointers. Most notably in function declarations: The declaration
void foo(int array[7]);
is defined to be equivalent to
void foo(int* array);
That said, all the GenerateRandNum constructor gets is an int*, because that's what T = int [] decays to in the function declaration context. The data member of GenerateRandNum, however, is of type int [] (no decay here), which your compiler treats as a zero-sized array. Consequently, when you try to assign a pointer to the array, your compiler complains.
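The difference between the two contexts can be checked at compile time. A small sketch, with hypothetical names (takesArray, takesPointer, Holder) chosen only for illustration:

```cpp
#include <type_traits>

// Declarations only; they are used purely for their types.
void takesArray(int array[7]);
void takesPointer(int* array);

// A data member keeps its array type; nothing decays here.
struct Holder {
    int data[7];
};

// In a parameter list, int[7] is adjusted to int*, so both functions have
// the same type void(int*).
constexpr bool kParameterDecays =
    std::is_same<decltype(takesArray), decltype(takesPointer)>::value;

// The member is a genuine array, not a pointer.
constexpr bool kMemberStaysArray =
    std::is_same<decltype(Holder::data), int[7]>::value;
constexpr bool kMemberIsNotPointer =
    !std::is_same<decltype(Holder::data), int*>::value;

static_assert(kParameterDecays, "array parameters are adjusted to pointers");
static_assert(kMemberStaysArray && kMemberIsNotPointer,
              "array members do not decay");
```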
You have two options to fix this:
1. Use a std::vector<> instead, as Marco A. suggests.
2. Declare your GenerateRandNum class as:
template <class T>
class GenerateRandNum {
public:
T* data;
size_t size;
GenerateRandNum(T* list, size_t length) {
data = list;
size = length;
}
void generate();
};
Note:
I have removed the template parameter for the size type: size_t is guaranteed to be suitable for counting anything in memory, so there is absolutely no point in using anything different. Templating this parameter only obfuscates your code.
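For completeness, here is a sketch of how the pointer-based class might look with generate() filled in. Two things are assumptions on my part rather than something stated above: the member function is defined inside the class template (templates are normally defined in the header so that every user can instantiate them), and the loop writes to data[i] rather than data[0].

```cpp
#include <cstddef>
#include <cstdlib>

template <class T>
class GenerateRandNum {
public:
    GenerateRandNum(T* list, std::size_t length) : data(list), size(length) {}

    // Fill the caller's array with pseudo-random values in [1, size].
    // std::rand is not seeded here; call std::srand once at program start
    // if different sequences per run are wanted.
    void generate()
    {
        for (std::size_t i = 0; i < size; ++i) {
            data[i] = static_cast<T>(1 + std::rand() % size);
        }
    }

private:
    T* data;           // points at the caller's array; not owned
    std::size_t size;  // number of elements in that array
};
```

Usage would be along the lines of int smallArray[20]; GenerateRandNum<int> genData(smallArray, 20); genData.generate();, where the array decays to int* at the call.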
There are some problems with your approach:
1. The first template parameter is an array type whose dimension can't be deduced from the argument, as n.m. noted; you would need to specify it explicitly:
GenerateRandNum<int[20], int>
2. There is no point in doing
data = list
since in your code sample these are two arrays and you can't assign them directly. You can either copy the memory or specialize your routines/template.
3. You should really consider using a vector of integers, e.g.
#include <cstdlib> // rand, srand
#include <ctime> // time
#include <vector>
template <class T, class B>
class GenerateRandNum
{
public:
T data;
B size;
GenerateRandNum(T list, B length) {
data = list;
size = length;
}
void generate();
};
template<class T, class B>
void GenerateRandNum<T, B> ::generate()
{
srand((unsigned int)time(NULL)); // You should initialize with a seed
for (B i = 0; i < size; i++) {
data[i] = 1 + rand() % size; // I believe you wanted data[i] and not data[0]
}
}
int main(){
//declare small array
const int smallSize = 20;
std::vector<int> smallArray(smallSize);
// call helper function to put random data in small array
GenerateRandNum <std::vector<int>, int> genData(smallArray, smallSize);
genData.generate();
}
I fixed two issues in the code above, take a look at the comments.

Template & memory-allocation

#include <iostream>
template<class T> T CreateArray(T a, int n)
{
a = new T [n]; // mistake: double* = double**
return a;
}
int main()
{
double* a;
int n = 5;
a = CreateArray(a,n);
return 0;
}
Can I allocate memory using a template and new? And what is my mistake?
Your code has a few problems. First, you can do something like what you're trying to do, but you should write something like this:
template<class T> T* CreateArray(int n)
{
T* a = new T [n];
return a;
}
int main()
{
double* a;
int n = 5;
a = CreateArray<double>(n);
return 0;
}
Note that you don't have to pass a to the function (the pointer would be copied into CreateArray, and changes to the copy wouldn't be visible inside main). Note also that the template now returns a pointer T*, which is what a in main() expects.
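Neither version of main() ever calls delete[] on the array. If the raw pointer is not a requirement, the same idea can return an owning smart pointer instead. This is a sketch, not part of the original suggestion, and CreateOwnedArray is a hypothetical name:

```cpp
#include <cstddef>
#include <memory>

// Allocate n value-initialized elements and hand ownership to the caller.
// The array is released with delete[] when the unique_ptr goes out of scope.
template <class T>
std::unique_ptr<T[]> CreateOwnedArray(std::size_t n)
{
    return std::unique_ptr<T[]>(new T[n]());
}
```

The call site becomes auto a = CreateOwnedArray<double>(5);, and elements are accessed as a[i] just like with the raw pointer.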
So others have explained why your code doesn’t work and how it can be improved.
Now I’ll show how you can still get the following code to compile – and to work properly:
double* a = CreateArray(5);
int* b = CreateArray(7);
The problem, as already mentioned, is that C++ does not infer template arguments from return types alone.
You can circumvent this limitation by making the above function return a simple proxy object. The proxy object has a single operation: an (implicit) conversion to T*. This is where the actual allocation happens.
The CreateArray function is therefore very simple (and not a template):
CreateArrayProxy CreateArray(std::size_t num_elements) {
return CreateArrayProxy(num_elements);
}
As for the proxy:
struct CreateArrayProxy {
std::size_t num_elements;
CreateArrayProxy(std::size_t num_elements) : num_elements(num_elements) { }
template <typename T>
operator T*() const {
return new T[num_elements];
}
};
Easy as π.
Now, should you use this code? No, probably not. It offers no real advantage over direct allocation. But it’s a useful idiom to know.
You want to accept a pointer to the type you want to allocate:
template<class T> T* CreateArray(T* a, int n)
{
a = new T [n];
return a;
}
This should do the trick.
I prefer to keep pointers that don't own anything set to NULL.
#include <iostream>
template<class T> bool CreateArray(T * &a, int n)
{
if ( a != 0 )
return false;
a = new T [n];
return true;
}
int main()
{
double* a = 0;
int n = 5;
CreateArray(a,n);
return 0;
}
A vector could be a good solution, too. I think it is the better one, because it cannot leak memory.
#include <vector>
int main()
{
std::vector<double> a;
int n = 5;
a.resize(n);
return 0;
}