Cleaner pointer arithmetic syntax for manipulation with byte offsets

Cleaner pointer arithmetic syntax for manipulation with byte offsets - c++

In the following lines of code, I need to adjust the pointer pm by an offset in bytes in one of its fields. Is there an better/easier way to do this, than incessantly casting back and forth from char * and PartitionMap * such that the pointer arithmetic still works out?
PartitionMap *pm(reinterpret_cast<PartitionMap *>(partitionMaps));
for ( ; index > 0 ; --index)
{
pm = (PartitionMap *)(((char *)pm) + pm->partitionMapLength);
}
return pm;
For those that can't grok from the code, it's looping through variable length descriptors in a buffer that inherit from PartitionMap.
Also for those concerned, partitionMapLength always returns lengths that are supported by the system this runs on. The data I'm traversing conforms to the UDF specification.

I often use these templates for this:
template<typename T>
T *add_pointer(T *p, unsigned int n) {
return reinterpret_cast<T *>(reinterpret_cast<char *>(p) + n);
}
template<typename T>
const T *add_pointer(const T *p, unsigned int n) {
return reinterpret_cast<const T *>(reinterpret_cast<const char *>(p) + n);
}
They maintain the type, but add single bytes to them, for example:
T *x = add_pointer(x, 1); // increments x by one byte, regardless of the type of x

Casting is the only way, whether it's to a char* or intptr_t or other some such type, and then to your final type.

You can of course just keep two variables around: a char * to step through the buffer and a PartitionMap * to access it. Makes it a little clearer what's going on.
for (char *ptr = ??, pm = (PartitionMap *)ptr ; index > 0 ; --index)
{
ptr += pm->partitionMapLength;
pm = (PartitionMap *)ptr;
}
return pm;

As others have mentioned you need the casts, but you can hide the ugliness in a macro or function. However, one other thing to keep in mind is alignment requirements. On most processors you can't simply increment a pointer to a type by an arbitrary number of bytes and cast the result back into a pointer to the original type without problems accessing the struct through the new pointer due to misalignment.
One of the few architectures (even if it is about the most popular) that will let you get away with it is the x86 architecture. However, even if you're writing for Windows, you'll want to take this problem into account - Win64 does enforce alignment requirements.
So even accessing the partitionMapLength member through the pointer might crash your program.
You might be able to easily work around this problem using a compiler extension like __unaligned on Windows:
PartitionMap __unaliged *pm(reinterpret_cast<PartitionMap *>(partitionMaps));
for ( ; index > 0 ; --index)
{
pm = (PartitionMap __unaligned *)(((char *)pm) + pm->partitionMapLength);
}
return pm;
Or you can copy the potentially unaligned data into a properly aligned struct:
PartitionMap *pm(reinterpret_cast<PartitionMap *>(partitionMaps));
char* p = reinterpret_cast<char*>( pm);
ParititionMap tmpMap;
for ( ; index > 0 ; --index)
{
p += pm->partitionMapLength;
memcpy( &tmpMap, p, sizeof( newMap));
pm = &tmpMap;
}
// you may need a more spohisticated copy to return something useful
size_t siz = pm->partitionMapLength;
pm = reinterpret_cast<PartitionMap*>( malloc( siz));
if (pm) {
memcpy( pm, p, siz);
}
return pm;

The casting has to be done, but it makes the code nearly unreadable. For readability's sake, isolate it in a static inline function.

What is puzzling me is why you have 'partitionMapLength' in bytes?
Wouldn't it be better if it was in 'partitionMap' units since you anyway cast it?
PartitionMap *pmBase(reinterpret_cast<PartitionMap *>(partitionMaps));
PartitionMap *pm;
...
pm = pmBase + index; // just guessing about your 'index' variable here

Both C and C++ allow you to iterate through an array via pointers and ++:
#include <iostream>
int[] arry = { 0, 1, 2, 3 };
int* ptr = arry;
while (*ptr != 3) {
std::cout << *ptr << '\n';
++ptr;
}
For this to work, adding to pointers is defined to take the memory address stored in the pointer and then add the sizeof whatever the type is times the value being added. For instance, in our example ++ptr adds 1 * sizeof(int) to the memory address stored in ptr.
If you have a pointer to a type, and want to advance a particular number of bytes from that spot, the only way to do so is to cast to char* (because sizeof(char) is defined to be one).

Related

How to convert a byte array of size 64 to a list of double values in Arduino C++?

void Manager::byteArrayToDoubleArray(byte ch[]) {
int counter = 0;
// temp array to break the byte array into size of 8 and read it
byte temp[64];
// double result values
double res[8];
int index = 0;
int size = (sizeof(ch) / sizeof(*ch));
for (int i = 0; i < size; i++) {
counter++;
temp[i] = ch[i];
if (counter % 8 == 0) {
res[index] = *reinterpret_cast<double * const>(temp);
index++;
counter = 0;
}
}
}
Here result would be a list of double values with count = 8.

Your problem is two things. You have some typos and misunderstanding. And the C++ standard is somewhat broken in this area.
I'll try to fix both.
First, a helper function called laundry_pods. It takes raw memory and "launders" it into an array of a type of your choice, so long as you pick a pod type:
template<class T, std::size_t N>
T* laundry_pods( void* ptr ) {
static_assert( std::is_pod<std::remove_cv_t<T>>{} );
char optimized_away[sizeof(T)*N];
std::memcpy( optimized_away, ptr , sizeof(T)*N );
T* r = ::new( ptr ) T[N];
assert( r == ptr );
std::memcpy( r, optimized_away, sizeof(T)*N );
return r;
}
now simply do
void Manager::byteArrayToDoubleArray(byte ch[]) {
double* pdouble = laundry_pods<double, 8>(ch);
}
and pdouble is a pointer to memory of ch interpreted as an array of 8 doubles. (It is not a copy of it, it interprets those bytes in-place).
While laundry_pods appears to copy the bytes around, both g++ and clang optimize it down into a binary noop. The seeming copying of bytes around is a way to get around aliasing restrictions and object lifetime rules in the C++ standard.
It relies on arrays of pod not having extra bookkeeping overhead (which C++ implementations are free to do; none do that I know of. That is what the non-static assert double-checks), but it returns a pointer to a real honest to goodness array of double. If you want to avoid that assumption, you could instead create each doulbe as a separate object. However, then they aren't an array, and pointer arithmetic over non-arrays is fraught as far as the standard is concerned.
The use of the term "launder" has to do with getting around aliasing and object lifetime requirements. The function does nothing at runtime, but in the C++ abstract machine it takes the memory and converts it into binary identical memory that is now a bunch of doubles.

The trick of doing this kind of "conversion" is to always cast the double* to a char* (or unsigned char or std::byte). Never the other way round.
You should be able to do something like this:
void byteArrayToDoubleArray(byte* in, std::size_t n, double* out)
{
for(auto out_bytes = (byte*) out; n--;)
*out_bytes++ = *in++;
}
// ...
byte ch[64];
// .. fill ch with double data somehow
double res[8];
byteArrayToDoubleArray(ch, 64, res);
Assuming that type byte is an alias of char or unsigned char or std::byte.

I am not completly sure what you are trying to achieve here because of the code (sizeof(ch) / sizeof(*ch)) which does not make sense for an array of undefined size.
If you have a byte-Array (POD data type; something like a typedef char byte;) then this most simple solution would be a reinterpret_cast:
double *result = reinterpret_cast<double*>(ch);
This allows you to use result[0]..result[7] as long as ch[] is valid and contains at least 64 bytes. Be aware that this construct does not generate code. It tells the compiler that result[0] corresponds to ch[0..7] and so on. An access to result[] will result in an access to ch[].
But you have to know the number of elements in ch[] to calculate the number of valid double elements in result.
If you need a copy (because - for example - the ch[] is a temporary array) you could use
std::vector<double> result(reinterpret_cast<double*>(ch), reinterpret_cast<double*>(ch) + itemsInCh * sizeof(*ch) / sizeof(double));
So if ch[] is an array with 64 items and a byte is really an 8-bit value, then
std::vector<double> result(reinterpret_cast<double*>(ch), reinterpet_cast<double*>(ch) + 8);
will provide a std::vector containing 8 double values.
There is another possible method using a union:
union ByteToDouble
{
byte b[64];
double d[8];
} byteToDouble;
the 8 double values will occupie the same memory as the 64 byte values. So you can write the byte values to byteToDouble.b[] and read the resultingdouble values from byteToDouble.d[].

Portable tagged pointers

Is there a portable way to implement a tagged pointer in C/C++, like some documented macros that work across platforms and compilers? Or when you tag your pointers you are at your own peril? If such helper functions/macros exist, are they part of any standard or just are available as open source libraries?
Just for those who do not know what tagged pointer is but are interested, it is a way to store some extra data inside a normal pointer, because on most architectures some bits in pointers are always 0 or 1, so you keep your flags/types/hints in those extra bits, and just erase them right before you want to use pointer to dereference some actual value.
const int gc_flag = 1;
const int flag_mask = 7; // aka 0b00000000000111, because on some theoretical CPU under some arbitrary OS compiled with some random compiler and using some particular malloc last three bits are always zero in pointers.
struct value {
void *data;
};
struct value val;
val.data = &data | gc_flag;
int data = *(int*)(val.data & flag_mask);
https://en.wikipedia.org/wiki/Pointer_tagging

You can get the lowest N bits of an address for your personal use by guaranteeing that the objects are aligned to multiples of 1 << N. This can be achieved platform-independently by different ways (alignas and aligned_storage for stack-based objects or std::aligned_alloc for dynamic objects), depending on what you want to achieve:
struct Data { ... };
alignas(1 << 4) Data d; // 4-bits, 16-byte alignment
assert(reinterpret_cast<std::uintptr_t>(&d) % 16 == 0);
// dynamic (preferably with a unique_ptr or alike)
void* ptr = std::aligned_alloc(1 << 4, sizeof(Data));
auto obj = new (ptr) Data;
...
obj->~Data();
std::free(ptr);
You pay by throwing away a lot of memory, exponentionally growing with the number of bits required. Also, if you plan to allocate many of such objects contiguously, such an array won't fit in the processor's cacheline for comparatively small arrays, possibly slowing down the program considerably. This solution therefore is not to scale.

If you're sure that the addresses you are passing around always have certain bits unused, then you could use uintptr_t as a transport type. This is an integer type that maps to pointers in the expected way (and will fail to exist on an obscure platform that offers no such possible map).
There aren't any standard macros but you can roll your own easily enough. The code (sans macros) might look like:
void T_func(uintptr_t t)
{
uint8_t tag = (t & 7);
T *ptr = (T *)(t & ~(uintptr_t)7);
// ...
}
int main()
{
T *ptr = new T;
assert( ((uintptr_t)ptr % 8) == 0 );
T_func( (uintptr_t)ptr + 3 );
}
This may defeat compiler optimizations that involve tracking pointer usage.

Well, GCC at least can compute the size of bit-fields, so you can get portability across platforms (I don't have an MSVC available to test with). You can use this to pack the pointer and tag into an intptr_t, and intptr_t is guaranteed to be able to hold a pointer.
#include <limits.h>
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>
#include <inttypes.h>
struct tagged_ptr
{
intptr_t ptr : (sizeof(intptr_t)*CHAR_BIT-3);
intptr_t tag : 3;
};
int main(int argc, char *argv[])
{
struct tagged_ptr p;
p.tag = 3;
p.ptr = (intptr_t)argv[0];
printf("sizeof(p): %zu <---WTF MinGW!\n", sizeof p);
printf("sizeof(p): %lu\n", (unsigned long int)sizeof p);
printf("sizeof(void *): %u\n", (unsigned int)sizeof (void *));
printf("argv[0]: %p\n", argv[0]);
printf("p.tag: %" PRIxPTR "\n", p.tag);
printf("p.ptr: %" PRIxPTR "\n", p.ptr);
printf("(void *)*(intptr_t*)&p: %p\n", (void *)*(intptr_t *)&p);
}
Gives:
$ ./tag.exe
sizeof(p): zu <---WTF MinGW!
sizeof(p): 8
sizeof(void *): 8
argv[0]: 00000000007613B0
p.tag: 3
p.ptr: 7613b0
(void *)*(intptr_t*)&p: 60000000007613B0
I've put the tag at the top, but changing the order of the struct would put it at the bottom. Then shifting the pointer-to-be-stored right by 3 would implement the OP's use case. Probably make macros for access to make it easier.
I also kinda like the struct because you can't accidentally dereference it as if it were a plain pointer.

Allocating an array of aligned struct

I'm trying to allocate an array of struct and I want each struct to be aligned to 64 bytes.
I tried this (it's for Windows only for now), but it doesn't work (I tried with VS2012 and VS2013):
struct __declspec(align(64)) A
{
std::vector<int> v;
A()
{
assert(sizeof(A) == 64);
assert((size_t)this % 64 == 0);
}
void* operator new[] (size_t size)
{
void* ptr = _aligned_malloc(size, 64);
assert((size_t)ptr % 64 == 0);
return ptr;
}
void operator delete[] (void* p)
{
_aligned_free(p);
}
};
int main(int argc, char* argv[])
{
A* arr = new A[200];
return 0;
}
The assert ((size_t)this % 64 == 0) breaks (the modulo returns 16). It looks like it works if the struct only contains simple types though, but breaks when it contains an std container (or some other std classes).
Am I doing something wrong? Is there a way of doing this properly? (Preferably c++03 compatible, but any solution that works in VS2012 is fine).
Edit:
As hinted by Shokwav, this works:
A* arr = (A*)new std::aligned_storage<sizeof(A), 64>::type[200];
// this works too actually:
//A* arr = (A*)_aligned_malloc(sizeof(A) * 200, 64);
for (int i=0; i<200; ++i)
new (&arr[i]) A();
So it looks like it's related to the use of new[]... I'm very curious if anybody has an explanation.

I wonder why you need such a huge alignment requirement, moreover to store a dynamic heap allocated object in the struct. But you can do this:
struct __declspec(align(64)) A
{
unsigned char ___padding[64 - sizeof(std::vector<int>)];
std::vector<int> v;
void* operator new[] (size_t size)
{
// Make sure the buffer will fit even in the worst case
unsigned char* ptr = (unsigned char*)malloc(size + 63);
// Find out the next aligned position in the buffer
unsigned char* endptr = (unsigned char*)(((intptr_t)ptr + 63) & ~63ULL);
// Also store the misalignment in the first padding of the structure
unsigned char misalign = (unsigned char)(endptr - ptr);
*endptr = misalign;
return endptr;
}
void operator delete[] (void* p)
{
unsigned char * ptr = (unsigned char*)p;
// It's required to call back with the original pointer, so subtract the misalignment offset
ptr -= *ptr;
free(ptr);
}
};
int main()
{
A * a = new A[2];
printf("%p - %p = %d\n", &a[1], &a[0], int((char*)&a[1] - (char*)&a[0]));
return 0;
}
I did not have your align_malloc and free function, so the implementation I'm providing is doing this:
It allocates larger to make sure it will fit in 64-bytes boundaries
It computes the offset from the allocation to the closest 64-bytes boundary
It stores the "offset" in the padding of the first structure (else I would have required a larger allocation space each time)
This is used to compute back the original pointer to the free()
Outputs:
0x7fff57b1ca40 - 0x7fff57b1ca00 = 64
Warning: If there is no padding in your structure, then the scheme above will corrupt data, since I'll be storing the misalignement offset in a place that'll be overwritten by the constructor of the internal members.
Remember that when you do "new X[n]", "n" has to be stored "somewhere" so when calling delete[], "n" calls to the destructors will be done. Usually, it's stored before the returned memory buffer (new will likely allocate the required size + 4 for storing the number of elements). The scheme here avoid this.
Another warning: Because C++ calls this operator with some additional padding included in the size for storing the array's number of elements, you'll might still get a "shift" in the returned pointer address for your objects. You might need to account for it. This is what the std::align does, it takes the extra space, compute the alignment like I did and return the aligned pointer. However, you can not get both done in the new[] overload, because of the "count storage" shift that happens after returning from new(). However, you can figure out the "count storage" space once by a single allocation, and adjust the offset accordingly in the new[] implementation.

Assigning an array size depending on a condition

I am trying to assign an array of unsigned short depending on a condition. The problem I encounter is the following (according to the code below) :
error C2057: constant expression expected
error C2466: impossible to allocate array with constant size 0
error C2133: 'packet' : unknown size
unsigned int length=4;
if(...)
{
length = 8;
}
else if(...)
{
length = 6;
}
else
{
length = 4;
}
unsigned short packet[length/2];
I tried to do some shenanigans like adding this before the array declaration and using it for the array size but it doesn't do the trick:
const unsigned int halfLength=length/2;
I can't use vectors to replace my array. Do you have any idea ?

Yup, dynamically allocated arrays:
unsigned short* packet = new unsigned short[length/2];
You can't specify the size of an automatic-storage allocated array at run-time.
You also have to free up the memory yourself:
delete[] packet;

The number of elements in a C style array must be an integral constant
expression in C++. (C90 has some support for non-constant expressions
here, but I'm not familiar with it.) The obvious answer is
std::vector, but you say you can't use that. If that's the case, you
probably can't use dynamic allocation either; otherwise, a pointer and
new unsigned short[length / 2] can be used, although you'll have to
ensure that a delete[] also occurs when your done with it, including
if you leave scope via an exception—in the end, you're not far
from having implemented about half of std::vector locally.
If your code extract isn't too simplified: why not just reserve the
maximum length, e.g.:
unsigned short packet[8 / 2];
In your example, the largest length is 8, and always reserving for 8
isn't going to cause any problems. (Obviously, if the actual length
can vary more with values coming from an external function, etc., this
may not be a realistic solution. But if it is... Why do complicated
when you can do simple?)

I would take it to a class to avoid memory leaks:
template <class T1> class array
{
public:
array( size_t size )
: addr(0)
{
if ( size > 0 )
this->addr = new T1[size];
};
~array( void )
{
if ( this->addr != 0 )
{
delete [] this->addr;
this->addr = 0;
}
};
T1 & operator[]( size_t index )
{
return this->addr[index];
};
bool empty( void ) { return (this->addr != 0); };
private:
T1 * addr;
};
array<unsigned short> packet(length/2);

for c programmers:
//length value is dynamically assigned
int length=10;
//runtime allocation
unsigned short * f = (unsigned short *) malloc (length/2 * sizeof(unsigned short));
//use the vector
f[0]=1;
...
//free the memory once the program does not need more
free(f);
f=NULL;

You cannot assign the size of array dynamically.
You can use a pointer for allocating dynamic size of array.
int * t = malloc(a * sizeof(int))

Increment void pointer by one byte? by two?

I have a void pointer called ptr. I want to increment this value by a number of bytes. Is there a way to do this?
Please note that I want to do this in-place without creating any more variables.
Could I do something like ptr = (void *)(++((char *) ptr)); ?

You cannot perform arithmetic on a void pointer because pointer arithmetic is defined in terms of the size of the pointed-to object.
You can, however, cast the pointer to a char*, do arithmetic on that pointer, and then convert it back to a void*:
void* p = /* get a pointer somehow */;
// In C++:
p = static_cast<char*>(p) + 1;
// In C:
p = (char*)p + 1;

No arithmeatic operations can be done on void pointer.
The compiler doesn't know the size of the item(s) the void pointer is pointing to. You can cast the pointer to (char *) to do so.
In gcc there is an extension which treats the size of a void as 1. so one can use arithematic on a void* to add an offset in bytes, but using it would yield non-portable code.

Just incrementing the void* does happen to work in gcc:
#include <stdlib.h>
#include <stdio.h>
int main() {
int i[] = { 23, 42 };
void* a = &i;
void* b = a + 4;
printf("%i\n", *((int*)b));
return 0;
}
It's conceptually (and officially) wrong though, so you want to make it explicit: cast it to char* and then back.
void* a = get_me_a_pointer();
void* b = (void*)((char*)a + some_number);
This makes it obvious that you're increasing by a number of bytes.

You can do:
++(*((char **)(&ptr)));

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Cleaner pointer arithmetic syntax for manipulation with byte offsets - c++

Casting is the only way, whether it's to a char* or intptr_t or other some such type, and then to your final type.

The casting has to be done, but it makes the code nearly unreadable. For readability's sake, isolate it in a static inline function.

Related

How to convert a byte array of size 64 to a list of double values in Arduino C++?

Portable tagged pointers

Allocating an array of aligned struct

Assigning an array size depending on a condition

Increment void pointer by one byte? by two?

Categories

Resources