Converting an std::array of std::bytes to a numeric value

I wonder what a safe way to convert a compact subarray of std::bytes to a numeric value is. I guess this code:
std::array<std::byte, 12> ar = { /* some bytes */ };
uint32_t value = *reinterpret_cast<uint32_t*>(&ar[4]);
is rather error-prone, for example because it depends on how the compiler aligns the values in the array. Thanks in advance!

A reliable approach is to use std::memcpy to copy the target bytes into a uint32_t object. This is well defined in every version of C++, because uint32_t is trivially copyable. The pattern is common enough that compilers can usually optimize out the copy.
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
template<class T, std::size_t N>
T read_int_from_bytes(const std::array<std::byte, N> & data, std::size_t index)
{
// Written this way so that a huge index cannot overflow the addition
if(sizeof(T) > N || index > N - sizeof(T)) {
throw std::invalid_argument("read_int index out of bounds");
}
// Integer to copy the bytes to
T result;
// Copy the bytes
std::memcpy(&result, &(data[index]), sizeof(result));
return result;
}
Here is an example. This test creates an array of bytes with values { 0x00, 0x10, ..., 0xB0 } and reads a uint32_t starting at index 4 (the fifth byte).
#include <iostream>
#include <iomanip>
int main()
{
std::array<std::byte, 12> data{};
for(std::size_t i = 0; i < data.size(); ++i)
{
data[i] = static_cast<std::byte>(0x10 * i);
}
std::cout << "0x" << std::hex << read_int_from_bytes<std::uint32_t>(data, 4);
}
The test produces 0x70605040 when I try it (on a little-endian platform). The generated assembly also shows that the entire function call is optimized out and the result is precalculated, which indicates that the compiler was able to reason through the memcpy and remove it entirely.
Beware that the result is platform-dependent: it depends on the endianness of the target platform, that is, whether the first byte is the most significant or the least significant. For many applications this doesn't matter, but if it does, C++20 introduced std::endian (in <bit>), which you can use to check the system's endianness.
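If a fixed byte order is needed regardless of the host, one option is to assemble the value with shifts instead of memcpy. The following is a minimal sketch, not part of the original answer: the helper name read_le32 is hypothetical, and it assumes the bytes are stored least significant first (little-endian) and that C++17 is available for std::to_integer.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical helper: reads 4 bytes starting at `index` as a little-endian
// 32-bit value. The result does not depend on the host's byte order.
template <std::size_t N>
std::uint32_t read_le32(const std::array<std::byte, N>& data, std::size_t index)
{
    if (N < 4 || index > N - 4) {
        throw std::out_of_range("read_le32 index out of bounds");
    }
    std::uint32_t result = 0;
    for (std::size_t i = 0; i < 4; ++i) {
        // Byte i contributes bits [8*i, 8*i + 8) of the result.
        result |= std::to_integer<std::uint32_t>(data[index + i]) << (8 * i);
    }
    return result;
}
```

With the test array from the answer above, read_le32(data, 4) gives 0x70605040 on any platform. For big-endian data, the shift would be 8 * (3 - i) instead.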

Related

uint32_t pointer to the same location as uint8_t pointer

#include <iostream>
int main(){
uint8_t memory[1024];
memory[0] = 1;
memory[1] = 1;
uint32_t *test = memory;
//is it possible to get a value for *test that would be in this example 257?
}
I want to create a uint32_t pointer to the same address as the uint8_t pointer. Is this possible without using new(address)? I don't want to lose the information at the address. I know pointers are just addresses, and therefore I should be able to just set the uint32_t pointer to the same address.
This code produces an error:
invalid conversion from 'uint8_t*' to 'uint32_t*' in initialization

Reading the buffer through a uint32_t pointer would violate the so-called strict aliasing rule, so it cannot be done safely. Sad, but true.
Use memcpy to copy the data instead. In many cases compilers will optimize the copy away and generate the same code as they would for the cast, but in a standard-conforming way.
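A minimal sketch of the memcpy approach for a buffer like the one in the question. The helper name load_u32 is hypothetical and not from the original answer; the bytes are interpreted in the host's native byte order.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copies 4 bytes from `src` into a uint32_t.
// No uint32_t* ever points into the byte buffer, so there is no aliasing
// violation and no alignment requirement on `src`.
std::uint32_t load_u32(const std::uint8_t* src)
{
    assert(src != nullptr);
    std::uint32_t value;
    std::memcpy(&value, src, sizeof value);
    return value;
}
```

With memory[0] = 1, memory[1] = 1 and the next two bytes set to zero, load_u32(memory) yields 257 on a little-endian host and 0x01010000 on a big-endian one. Note that the buffer in the question is uninitialized beyond its first two bytes, so those bytes would have to be set before reading.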

As already mentioned, you cannot convert uint8_t * to uint32_t * due to the strict aliasing rule. You can convert uint32_t * to unsigned char *, though:
#include <cstdint>
#include <iostream>
int main(){
uint32_t test[1024/4] = {}; // initialize it!
auto memory = reinterpret_cast<unsigned char *>( test );
memory[0] = 1;
memory[1] = 1;
std::cout << test[0] << std::endl;
}
This is not portable code due to endianness, but at least it does not have UB.

This question completely ignores endianness. Because your example sets the two bytes to the same value, swapping those two bytes makes no difference. With different values, or whenever the byte order of the full 32-bit word differs, the number will be unexpectedly wrong; on a big-endian machine, for instance, the bytes 01 01 00 00 read as 0x01010000 rather than 257.
As such, there's no portable way to use the resulting number.

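You can do that with a union. As mentioned above, you have to be aware of the endianness of the target device, though in most cases it will be little-endian. There is also some controversy about using unions this way: reading a member other than the one most recently written is undefined behavior in standard C++ (C permits it, and GCC documents it as an extension). FWIW it gets the job done, and for some uses it's good enough.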
#include <iostream>
int main(){
union {
uint8_t memory[1024] = {};
uint32_t test[1024/4];
};
memory[0] = 1;
memory[1] = 1;
std::cout << test[0]; // 257
}

uint32_t *test =(uint32_t*) memory;
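The cast tells the compiler that the memory pointed to by test should be treated as containing uint32_t values. This compiles, but dereferencing test still runs into the strict aliasing problem (and possible misalignment) described in the other answers.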

Portable tagged pointers

Is there a portable way to implement a tagged pointer in C/C++, like some documented macros that work across platforms and compilers? Or when you tag your pointers you are at your own peril? If such helper functions/macros exist, are they part of any standard or just are available as open source libraries?
Just for those who do not know what tagged pointer is but are interested, it is a way to store some extra data inside a normal pointer, because on most architectures some bits in pointers are always 0 or 1, so you keep your flags/types/hints in those extra bits, and just erase them right before you want to use pointer to dereference some actual value.
const int gc_flag = 1;
const int flag_mask = 7; // aka 0b00000000000111, because on some theoretical CPU under some arbitrary OS compiled with some random compiler and using some particular malloc last three bits are always zero in pointers.
struct value {
void *data;
};
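alignas(8) int data = 42; // 8-byte aligned, so the low three bits of &data are zero
struct value val;
val.data = (void *)((uintptr_t)&data | gc_flag); // store the flag in the low bits
int result = *(int *)((uintptr_t)val.data & ~(uintptr_t)flag_mask); // clear the tag bits before dereferencing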
https://en.wikipedia.org/wiki/Pointer_tagging

You can get the lowest N bits of an address for your personal use by guaranteeing that the objects are aligned to multiples of 1 << N. This can be achieved platform-independently by different ways (alignas and aligned_storage for stack-based objects or std::aligned_alloc for dynamic objects), depending on what you want to achieve:
struct Data { ... };
alignas(1 << 4) Data d; // 4-bits, 16-byte alignment
assert(reinterpret_cast<std::uintptr_t>(&d) % 16 == 0);
// dynamic (preferably with a unique_ptr or alike)
void* ptr = std::aligned_alloc(1 << 4, sizeof(Data));
auto obj = new (ptr) Data;
...
obj->~Data();
std::free(ptr);
You pay by throwing away a lot of memory, growing exponentially with the number of bits required. Also, if you plan to allocate many such objects contiguously, such an array will stop fitting in the processor's cache lines even for comparatively small arrays, possibly slowing down the program considerably. This solution therefore does not scale.

If you're sure that the addresses you are passing around always have certain bits unused, then you could use uintptr_t as a transport type. This is an integer type that maps to pointers in the expected way (and will fail to exist on an obscure platform that offers no such possible map).
There aren't any standard macros but you can roll your own easily enough. The code (sans macros) might look like:
void T_func(uintptr_t t)
{
uint8_t tag = (t & 7);
T *ptr = (T *)(t & ~(uintptr_t)7);
// ...
}
int main()
{
T *ptr = new T;
assert( ((uintptr_t)ptr % 8) == 0 );
T_func( (uintptr_t)ptr + 3 );
}
This may defeat compiler optimizations that involve tracking pointer usage.
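The uintptr_t approach can also be wrapped in a small class so that the tag cannot be forgotten when dereferencing. This is a minimal sketch under stated assumptions, not a standard facility: TaggedPtr and Node are hypothetical names, the pointee is assumed to be aligned to at least 8 bytes, and the pointer-to-integer round trip is assumed to preserve the address (implementation-defined, but true on mainstream platforms).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical wrapper: keeps a T* and a 3-bit tag in one uintptr_t.
template <class T>
class TaggedPtr {
    static_assert(alignof(T) >= 8, "need three free low bits");
    static constexpr std::uintptr_t kTagMask = 7;
    std::uintptr_t bits_;

public:
    TaggedPtr(T* ptr, unsigned tag)
        : bits_(reinterpret_cast<std::uintptr_t>(ptr) | (tag & kTagMask))
    {
        // The low bits of the address must be free to hold the tag.
        assert((reinterpret_cast<std::uintptr_t>(ptr) & kTagMask) == 0);
    }

    // The tag stored in the low three bits.
    unsigned tag() const { return static_cast<unsigned>(bits_ & kTagMask); }

    // The original pointer, with the tag bits cleared.
    T* get() const { return reinterpret_cast<T*>(bits_ & ~kTagMask); }
};

// Illustrative pointee type, forced to 8-byte alignment.
struct alignas(8) Node {
    int value;
};
```

Given a Node n, TaggedPtr<Node> p(&n, 3) stores both values in one word; p.get() returns &n again and p.tag() returns 3. Because the class has no operator*, it cannot accidentally be dereferenced with the tag still set.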

Well, GCC at least can compute the size of bit-fields, so you can get portability across platforms (I don't have an MSVC available to test with). You can use this to pack the pointer and tag into an intptr_t, and intptr_t is guaranteed to be able to hold a pointer.
#include <limits.h>
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>
#include <inttypes.h>
struct tagged_ptr
{
intptr_t ptr : (sizeof(intptr_t)*CHAR_BIT-3);
intptr_t tag : 3;
};
int main(int argc, char *argv[])
{
struct tagged_ptr p;
p.tag = 3;
p.ptr = (intptr_t)argv[0];
printf("sizeof(p): %zu <---WTF MinGW!\n", sizeof p);
printf("sizeof(p): %lu\n", (unsigned long int)sizeof p);
printf("sizeof(void *): %u\n", (unsigned int)sizeof (void *));
printf("argv[0]: %p\n", argv[0]);
printf("p.tag: %" PRIxPTR "\n", p.tag);
printf("p.ptr: %" PRIxPTR "\n", p.ptr);
printf("(void *)*(intptr_t*)&p: %p\n", (void *)*(intptr_t *)&p);
}
Gives:
$ ./tag.exe
sizeof(p): zu <---WTF MinGW!
sizeof(p): 8
sizeof(void *): 8
argv[0]: 00000000007613B0
p.tag: 3
p.ptr: 7613b0
(void *)*(intptr_t*)&p: 60000000007613B0
I've put the tag at the top, but changing the order of the struct would put it at the bottom. Then shifting the pointer-to-be-stored right by 3 would implement the OP's use case. Probably make macros for access to make it easier.
I also kinda like the struct because you can't accidentally dereference it as if it were a plain pointer.

C++ unsigned char array length

I have in my C++ program unsigned char array of hex values:
unsigned char buff[] = {0x03, 0x35, 0x6B};
And I would like to calculate the size of this array so that I can send it on UART port linux using this function:
if ((count = write(file,buff,length))<0)
{
perror("FAIL to write on exit\n");
}
As far as I can see, length is an int, and buff is an array which can change size during program execution.
Can anyone help me with how to write it? Thanks

One option for getting the number of elements is a template like this:
template<typename T, size_t s>
size_t arrSize(T(&)[s])
{
return s;
}
And afterwards call:
auto length = arrSize(buff);
This could be used across the code for various array types.
If by array size you mean its total size in bytes, you can just use sizeof(buff). Or, as others suggested, you can use std::array, std::vector or any other container instead and write a helper like this:
template<typename T>
size_t byteSize(const T& data)
{
// sizeof on the type itself avoids default-constructing an unused element
return data.size() * sizeof(typename T::value_type);
}
Then to acquire the actual byte size of the data you can simply call:
std::vector<unsigned char> buff{0x03, 0x35, 0x6B};
auto bSize = byteSize(buff);
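Since C++17 the standard library also provides std::size and std::data (in <iterator>), which work for built-in arrays and containers alike and do not compile for a plain pointer. A minimal sketch, where byte_size is a hypothetical helper name and the container is assumed to store its elements contiguously:

```cpp
#include <cstddef>
#include <iterator>

// Hypothetical helper: total size in bytes of the elements of a built-in
// array or contiguous container. std::size and std::data have no overload
// for a plain pointer, so passing one is a compile error.
template <class C>
std::size_t byte_size(const C& c)
{
    // sizeof does not evaluate its operand, so this is safe for empty containers.
    return std::size(c) * sizeof(*std::data(c));
}
```

For the buff array from the question, both std::size(buff) and byte_size(buff) are 3, so the call could be written as write(file, buff, byte_size(buff)).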

You can do this with an array:
size_t size = sizeof array;
With your example, that gives:
ssize_t count = write(file, buff, sizeof buff);
if (count < 0 || (size_t)count != sizeof buff) {
perror("FAIL to write on exit\n");
}
Note: I use C semantics because write comes from the C library.
In C++, you can use a template to make sure that sizeof is applied to an array and not to a pointer.
template<typename T, size_t s>
size_t array_sizeof(T (&array)[s]) {
return sizeof array;
}
With your example, that gives:
ssize_t count = write(file, buff, array_sizeof(buff));
if (count < 0 || static_cast<size_t>(count) != array_sizeof(buff)) {
perror("FAIL to write on exit\n");
}

If you are using C++11 you might think of switching to
#include <array>
std::array<char, 3> buff{ {0x03, 0x35, 0x6B} };
That offers an interface like std::vector (including size & data) for fixed arrays.
Using array might prevent some usual errors and offer some functionality covered by <algorithm>.
The call to write will then be:
write(file,buff.data(),buff.size())

And I would like to calculate the size of this array so that I can send it on UART port linux using this function...
You need a COUNTOF macro or function. They can be tricky to get right in all cases. For example, the accepted answer shown below will silently fail when working with pointers:
size_t size = sizeof array;
size_t number_element = sizeof array / sizeof *array;
Microsoft Visual Studio 2005 has a built-in macro or template class called _countof. It handles all cases properly. Also see the _countof Macro documentation on MSDN.
On non-Microsoft systems, I believe you can use something like the following. It rejects pointers at compile time (from "Making COUNTOF suck less"):
template <typename T, size_t N>
char (&ArraySizeHelper( T (&arr)[N] ))[N];
#define COUNTOF(arr) ( sizeof(ArraySizeHelper(arr)) )
void foo(int primes[]) {
// compiler error: primes is not an array
std::cout << COUNTOF(primes) << std::endl;
}
Another good reference is "Better array 'countof' implementation with C++ 11". It discusses the ways to do things incorrectly, and how to do things correctly under different compilers, like Clang, ICC, GCC and MSVC. It includes the Visual Studio trick.
buff is an array which can change size during program execution
As long as you have the data at compile time, the countof macro or function should work. If you are building data on the fly, then it probably won't work.
This is closely related: Common array length macro for C?. It may even be a duplicate.

Can you write a static assert to verify the offset of data members?

Given the following struct:
struct ExampleStruct {
char firstMember[8];
uint64_t secondMember;
};
Is there a way to write a static assert to verify that the offset of secondMember is some multiple of 8 bytes?

Offsetof
You can use the offsetof macro provided by the <cstddef> header. Here I first get the offset, then use the modulus operator to check whether it is a multiple of 8: if the remainder is 0, the offset is indeed a multiple of 8 bytes.
// Offset.cpp
#include <cstddef>
#include <cstdint>
#include <iostream>
struct ExampleStruct {
char firstMember[8];
uint64_t secondMember;
};
int main()
{
size_t offset = offsetof(ExampleStruct, secondMember);
if(offset%8==0)
std::cout << "Offset is a multiple of 8 bytes";
}
Offsetof with static_assert
Or, in the context of this question, where the goal is a static_assert, it is pretty much the same thing:
// OffsetAssert.cpp
#include <cstddef>
#include <cstdint>
#include <iostream>
struct ExampleStruct {
char firstMember[8];
uint64_t secondMember;
};
int main()
{
// Compilation fails here if the offset is not a multiple of 8
static_assert(offsetof(ExampleStruct, secondMember)%8==0,"Not Aligned 8 Bytes");
std::cout << "Aligned 8 Bytes" << std::endl; // Reached only if the assertion passed at compile time
}
Type Uses
I use the std::size_t type because it is the type normally used to store sizes of variables, objects, and so on, and because offsetof itself expands to an expression of type std::size_t, according to cppreference.com:
The macro offsetof expands to an integral constant expression of type std::size_t, the value of which is the offset, in bytes, from the beginning of an object of specified type to its specified member, including padding if any.
References
cppreference
cplusplus

If your type has standard layout, you can use the offsetof macro:
#include <cstddef>
static_assert(offsetof(ExampleStruct, secondMember) % 8 == 0, "Bad alignment");
Since the offsetof macro results in a constant expression, you can use a static assertion to produce a translation error if the condition is not met.
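The standard-layout precondition can itself be checked at compile time. A minimal sketch, reusing the struct from the question; the assertion messages are illustrative.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

struct ExampleStruct {
    char firstMember[8];
    std::uint64_t secondMember;
};

// offsetof is only guaranteed to be supported for standard-layout types,
// so verify that first.
static_assert(std::is_standard_layout<ExampleStruct>::value,
              "offsetof requires a standard-layout type");

// Compilation fails if secondMember does not start on an 8-byte boundary.
static_assert(offsetof(ExampleStruct, secondMember) % 8 == 0,
              "secondMember must start on an 8-byte boundary");
```

Both assertions are evaluated during compilation, so they add no runtime cost and can live next to the struct definition in a header.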

Boost.Multiprecision cpp_int - convert into an array of bytes?

http://www.boost.org/doc/libs/1_53_0/libs/multiprecision/doc/html/index.html
I just started exploring this library. There doesn't seem to be a way to convert cpp_int into an array of bytes.
Can someone see such functionality?

This is an undocumented way. cpp_int's backend has a limbs() member function, which returns a pointer to the internal array of limbs.
#include <iostream>
#include <boost/multiprecision/cpp_int.hpp>
namespace mp = boost::multiprecision;
int main()
{
mp::cpp_int x("11111111112222222222333333333344444444445555555555");
std::size_t size = x.backend().size();
mp::limb_type* p = x.backend().limbs();
for (std::size_t i = 0; i < size; ++i) {
std::cout << *p << std::endl;
++p;
}
}
result:
10517083452262317283
8115000988553056298
32652620859

This is the documented way of exporting and importing the underlying limb data of a cpp_int (and cpp_float). From the example given in the docs, trimmed down for the specific question:
#include <boost/multiprecision/cpp_int.hpp>
#include <vector>
using boost::multiprecision::cpp_int;
cpp_int i{"2837498273489289734982739482398426938568923658926938478923748"};
// export into 8-bit unsigned values, most significant bit first:
std::vector<unsigned char> bytes;
export_bits(i, std::back_inserter(bytes), 8);
This mechanism is quite flexible, as you can save the bytes into other integral types (just remember to specify the number of bits per array element), which in turn works with import_bits, too, if you need to restore a cpp_int from the deserialized sequence.
