What is the benefit of detecting endianness at runtime in C++?

I've searched for macros to determine endianness on a machine and didn't find any standard preprocessor macros for this, but I found a lot of solutions that do the detection at runtime. Why should I detect endianness at runtime?
If I do something like this:
#if defined(LITTLE_ENDIAN)
inline int swap(int x) {
    // reverse the byte order (assumes 32-bit int)
    unsigned u = static_cast<unsigned>(x);
    u = (u >> 24) | ((u >> 8) & 0x0000FF00u)
      | ((u << 8) & 0x00FF0000u) | (u << 24);
    return static_cast<int>(u);
}
#elif defined(BIG_ENDIAN)
inline int swap(int x) { return x; } // already big-endian, nothing to do
#else
#error "some blabla"
#endif

int main() {
    int x = 0x1234;
    int y = swap(x);
    (void)y;
    return 0;
}
the compiler will generate only one of the two functions.
But if I do the detection at runtime instead (see predef.endian):
#include <stdint.h>

enum {
    ENDIAN_UNKNOWN,
    ENDIAN_BIG,
    ENDIAN_LITTLE,
    ENDIAN_BIG_WORD,   /* Middle-endian, Honeywell 316 style */
    ENDIAN_LITTLE_WORD /* Middle-endian, PDP-11 style */
};

int endianness(void)
{
    uint8_t buffer[4];
    buffer[0] = 0x00;
    buffer[1] = 0x01;
    buffer[2] = 0x02;
    buffer[3] = 0x03;
    switch (*((uint32_t *)buffer)) {
    case 0x00010203: return ENDIAN_BIG;
    case 0x03020100: return ENDIAN_LITTLE;
    case 0x02030001: return ENDIAN_BIG_WORD;
    case 0x01000302: return ENDIAN_LITTLE_WORD;
    default:         return ENDIAN_UNKNOWN;
    }
}

int swap(int x)
{
    switch (endianness()) {
    case ENDIAN_BIG:
        return x;   // already big-endian, nothing to do
    case ENDIAN_LITTLE: {
        // reverse the byte order (assumes 32-bit int)
        unsigned u = (unsigned)x;
        u = (u >> 24) | ((u >> 8) & 0x0000FF00u)
          | ((u << 8) & 0x00FF0000u) | (u << 24);
        return (int)u;
    }
    default:
        // error blabla
        return x;
    }
}
the compiler generates code for the detection and executes it at runtime.
I don't get it: why should I do this? If my code is compiled for a little-endian machine, the whole binary is generated for little endian; and if I try to run such code on a big-endian machine (or on a bi-endian machine like ARM, wiki: bi-endian), the binary was still compiled for a little-endian machine, so all other declarations of e.g. int are also LE.
// compiled on little endian
uint32_t x = 0x1234; // constant literal 0x1234
// stored as the bytes 34 12 00 00, which read as big-endian would be 0x34120000

There are actually systems where SOFTWARE can set whether the system is (currently running in) little- or big-endian mode. Most systems only support switching under special circumstances, and not (fortunately for system programmers and such) switching back and forth arbitrarily. But it would be conceivable for an executable file to define whether that particular executable runs in LE or BE mode. In that case, you can't rely on picking out what OS and processor model it is...
On the other hand, if the hardware only EVER supports one endianness (e.g. x86 in its different forms), then I don't see a need to check at runtime. You know it's little endian, and that's it. It is wasteful (in terms of performance and code-size) to have the system contain code to check which endianness it is, and carry around conversion methods to convert from big endian to little endian.

Robust endianness detection at compile time isn't necessarily possible. There are platforms where endianness can change even between runs of the same binary.
http://gcc.gnu.org/ml/gcc-help/2007-07/msg00343.html

I think the only benefit of detecting endianness at runtime is that you don't have to mess around with macros. As you have noticed yourself, there is no standard macro saying what the endianness of the machine you are compiling your code for is, so you must define something yourself and pass it to the compiler, or define it conditionally depending on other flags indicating the architecture/operating system, something like:
#ifdef _this_system_
#define LITTLE_ENDIAN
#endif
#ifdef _that_system_
#define BIG_ENDIAN
#endif
but repeated many times, for every possible architecture, which is messy and error-prone. It is easier and safer to check it at runtime. I know, it seems silly, but it is really more practical.
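For illustration, a minimal sketch of such a runtime check (the function name is just an example; memcpy is used instead of a pointer cast to stay clear of strict-aliasing problems):
#include <cstdint>
#include <cstring>

// Returns true if the machine this code runs on stores the least
// significant byte first (little-endian).
inline bool is_little_endian()
{
    const std::uint32_t probe = 1;
    unsigned char bytes[sizeof probe];
    std::memcpy(bytes, &probe, sizeof probe); // inspect the object representation
    return bytes[0] == 1;
}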

Related

Why do we need to downcast a variable even if the function will upcast it again right before returning?

While reading a book about modern C++, I came across a code snippet that confused me. The code sets up PWM (a 16-bit timer) for an 8-bit AVR microcontroller. The code is as follows:
class pwm_base : private util::noncopyable
{
public:
    typedef std::uint_fast16_t duty_type;

    pwm_base(const duty_type resol,
             const duty_type duty = 0U) : resolution(resol),
                                          counter   (0U),
                                          duty_cycle(duty),
                                          shadow    (duty) { }

    duty_type get_resolution() const { return resolution; }

    void set_duty(const duty_type duty)
    {
        // Set a new duty cycle in the shadow register.
        mcal::irq::disable_all();
        shadow = static_cast<std::uint_fast8_t>((std::min)(duty, get_resolution())); // ???? (1)
        mcal::irq::enable_all();
    }

    duty_type get_duty() const
    {
        // Retrieve the duty cycle.
        mcal::irq::disable_all();
        const volatile std::uint_fast8_t the_duty = duty_cycle; // ???? (2)
        mcal::irq::enable_all();
        return the_duty;
    }

    virtual void service() = 0;

protected:
    const duty_type resolution;
    duty_type counter;
    duty_type duty_cycle;
    duty_type shadow;
};
I have problems with the lines indicated by ????. It is clearly seen that both shadow and duty_cycle are defined as duty_type, i.e. std::uint_fast16_t, so they have to be at least 16 bits wide. If that is the case, why does the author downcast the result of std::min to uint_fast8_t instead of casting to uint_fast16_t at line ???? (1)? And why, at line ???? (2), does he again downcast the variable to uint_fast8_t even though the function's return type is uint_fast16_t? Are these downcasts required? What is their purpose?
This code seems severely over-engineered, which is always the biggest danger whenever you allow C++. The language tends to encourage making things needlessly complicated just for the heck of it, instead of utilizing the KISS principle.
The hard requirements are: on the given hardware, the CPU works fastest with 8 bit types but the PWM register is 16 bits. Period.
Therefore any variable describing period or duty cycle needs to be exactly 16 bits. It doesn't make sense to declare it as uint_fast16_t because the corresponding hardware registers will always be exactly 16 bits, no more, no less.
In general the uint_fast types are only helpful if planning to port this to a 32 bit system somehow, but it is very unlikely that there will exist a 32 bit MCU with an identical PWM peripheral. Thus in this case uint_fast is just useless noise, because the code will never get ported.
The cast is presumably intended to truncate the 16-bit value to the 8 least significant bits. If so, the cast is incorrect: it should be static_cast<std::uint8_t>, never static_cast<std::uint_fast8_t>, because the "fast" type is only guaranteed to be at least 8 bits and may be wider. I'm otherwise not quite following what the code is supposed to do; update the duty cycle somewhere, I would assume.
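A small standalone sketch of the distinction being drawn here (on many implementations uint_fast8_t happens to be exactly 8 bits, so both lines print 0x34, but only the uint8_t cast guarantees the truncation):
#include <cstdint>
#include <cstdio>

int main()
{
    std::uint16_t value = 0x1234;

    // uint8_t is exactly 8 bits: the cast reliably truncates to 0x34.
    std::printf("uint8_t:      0x%x\n",
                static_cast<unsigned>(static_cast<std::uint8_t>(value)));

    // uint_fast8_t is only *at least* 8 bits: if it is wider on a given
    // platform, the cast does not truncate and the value stays 0x1234.
    std::printf("uint_fast8_t: 0x%x\n",
                static_cast<unsigned>(static_cast<std::uint_fast8_t>(value)));

    return 0;
}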
Furthermore, disabling the interrupt mask as a re-entrancy protection feature probably doesn't make sense, as this can cause unexpected timing problems. If this class is to be used from inside an ISR then another protection mechanism might be more suitable, such as a semaphore or atomic access/inline asm. PWM in particular is picky with when/how you change the duty cycle, or you might get glitches.
Overall, I would strongly recommend not using C++ on an ancient, severely resource-constrained 8-bit microcontroller. Nor would I recommend using ancient 8-bit microcontrollers instead of ARM when learning embedded systems; 8-bitters are much harder to program in C (or C++) than 32-bitters.

How to make sure a data type is as large as it needs to be in C++

Is there a simple way to make sure data types remain the same size across different platforms and architectures in C/C++?
In C++, let's say I have written a hypothetical, bare-bones program:
#include <iostream>

int foo;

int main(void)
{
    std::cout << "Size of foo is " << sizeof(foo) << " bytes." << std::endl;
    return 0;
}
All this program does is print the size of an arbitrary integer data type and exit. On my machine, it returns the size of foo as 4 bytes (I have a 64-bit machine).
But, let's say I write this program for an Arduino or other ATMega powered board:
int foo;

void setup()
{
    Serial.begin(9600);
    Serial.print("Size of foo is ");
    Serial.print(sizeof(foo));
    Serial.println(" bytes.");
}

void loop() {}
On my Arduino UNO board, this only returns 2 (same size as short on my machine).
Is it possible to make a (preferably cross-platform) header or declaration to ensure that the size of a data type is a certain size, so as not to run into precision problems or issues with very large numbers?
Ideally, this would work with not just Arduino boards, but also other compact MCUs (like the PicKit).
I'm relatively new to Arduino and the whole Arduino IDE in general, so forgive me if I'm missing something super simple.
A reasonably modern C++ implementation should have the header <cstdint>, which provides intN_t and uintN_t, where N is 8, 16, 32 and 64. Using those well-defined types will guarantee that either the code will compile and behave as you wish, or it won't compile at all, so you won't get the problem of "my code isn't doing what I expect from a 32-bit integer". For super-portability, you could use the int_leastN_t and int_fastN_t types, which (unlike the exact-width types) are guaranteed to exist on every implementation.
On a system that doesn't have <cstdint>, you could achieve the same thing with suitable typedefs based on, for example, which compiler and architecture you are targeting (e.g. #if defined(__GNUC__) && defined(__i386__) ...), or some makefile trickery that samples the sizes of the various expected types and then generates the right typedef lines.
Yes. Use a fixed width integer type defined in <cstdint>, usually one of the following:
int8_t
int16_t
int32_t
int64_t
uint8_t
uint16_t
uint32_t
uint64_t
There are also other ones for different use cases. They are available since C++11.
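For example, a minimal sketch of the original program rewritten with an exact-width type (the static_assert is redundant, since int32_t is exactly 32 bits wherever it exists, but it documents the intent):
#include <cstdint>
#include <iostream>

std::int32_t foo; // exactly 32 bits on every platform that provides it
static_assert(sizeof(foo) == 4, "foo must be exactly 4 bytes");

int main()
{
    std::cout << "Size of foo is " << sizeof(foo) << " bytes." << std::endl;
    return 0;
}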

portable ntohl and friends

I'm writing a small program that will save and load data, it'll be command line (and not interactive) so there's no point in including libraries I need not include.
When using sockets directly, I get the ntohl functions just by including sockets, however here I don't need sockets. I'm not using wxWidgets, so I don't get to use its byte ordering functions.
In C++ there are a lot of newly standardised things; for example, look at timers and regex (although regex is not yet fully supported everywhere), but certainly timers!
Is there a standardised way to convert things to network-byte ordered?
Naturally I've tried searching "c++ network byte order cppreference" and similar things, nothing comes up.
BTW, in this little project the program will manipulate files that may be shared across computers, so it'd be wrong to assume "always x86_64".
Is there a standardised way to convert things to network-byte ordered?
No. There isn't.
Boost ASIO has equivalents, but that somewhat violates your requirements.
GCC has __BYTE_ORDER__ which is as good as it will get! It's easy to detect if the compiler is GCC and test this macro, or detect if it is Clang and test that, then stick the byte ordering in a config file and use the pre-processor to conditionally compile bits of code.
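For illustration, a minimal sketch of a 32-bit host-to-network conversion gated on that macro (GCC and Clang both predefine __BYTE_ORDER__, __ORDER_LITTLE_ENDIAN__ and __ORDER_BIG_ENDIAN__; the function name is just an example):
#include <cstdint>

// Convert a 32-bit value from host order to network (big-endian) order.
inline std::uint32_t to_network32(std::uint32_t host)
{
#if defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
    return host;                    // already big-endian, nothing to do
#elif defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
    return __builtin_bswap32(host); // GCC/Clang built-in byte swap
#else
#error "Unknown byte order"
#endif
}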
There are no C++ standard functions for that, but you can compose the required functionality from the C++ standard functions.
Big-endian-to-host byte-order conversion can be implemented as follows:
#include <boost/detail/endian.hpp>
#include <boost/utility/enable_if.hpp>
#include <boost/type_traits/is_integral.hpp>
#include <algorithm>
#include <cstddef>

#ifdef BOOST_LITTLE_ENDIAN
# define BE_TO_HOST_COPY std::reverse_copy
#elif defined(BOOST_BIG_ENDIAN)
# define BE_TO_HOST_COPY std::copy
#endif

inline void be_to_host(void* dst, void const* src, size_t n) {
    char const* csrc = static_cast<char const*>(src);
    BE_TO_HOST_COPY(csrc, csrc + n, static_cast<char*>(dst));
}

template<class T>
typename boost::enable_if<boost::is_integral<T>, T>::type
be_to_host(T const& big_endian) {
    T host;
    be_to_host(&host, &big_endian, sizeof(T));
    return host;
}
Host-to-big-endian byte-order conversion can be implemented in the same manner.
Usage:
uint64_t big_endian_piece_of_data;
uint64_t host_piece_of_data = be_to_host(big_endian_piece_of_data);
The following should work correctly on any endian platform
#include <cassert>
#include <cstddef>
#include <cstdint>

int32_t getPlatformInt(const uint8_t* bytes, size_t num)
{
    uint32_t ret;
    assert(num == 4);
    // Assemble the value with shifts on an unsigned type, so the result does
    // not depend on the host's byte order (and avoids signed-overflow issues).
    ret  = static_cast<uint32_t>(bytes[0]) << 24;
    ret |= static_cast<uint32_t>(bytes[1]) << 16;
    ret |= static_cast<uint32_t>(bytes[2]) << 8;
    ret |= static_cast<uint32_t>(bytes[3]);
    return static_cast<int32_t>(ret);
}
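The writing direction can be composed the same way; a minimal sketch (putPlatformInt is a hypothetical name, not from the original answer):
#include <cstdint>

// Store 'value' into 'bytes' in network (big-endian) order, regardless of
// the endianness of the machine this runs on.
void putPlatformInt(uint8_t* bytes, int32_t value)
{
    const uint32_t v = static_cast<uint32_t>(value);
    bytes[0] = static_cast<uint8_t>(v >> 24);
    bytes[1] = static_cast<uint8_t>(v >> 16);
    bytes[2] = static_cast<uint8_t>(v >> 8);
    bytes[3] = static_cast<uint8_t>(v);
}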
Your network integer can easily be viewed as an array of bytes using:
uint8_t* p = reinterpret_cast<uint8_t*>(&network_byte_order_int);
The code from Doron that should work on any platform did not work for me on a big-endian system (Power7 CPU architecture).
Using a compiler built-in is much cleaner and worked great for me with gcc on both Windows and *nix (AIX):
uint32_t getPlatformInt(const uint32_t* bytes)
{
    uint32_t ret;
    // __builtin_bswap32 reverses the byte order of a 32-bit value
    ret = __builtin_bswap32(*bytes);
    return ret;
}
See also How can I reorder the bytes of an integer in c?

Pointers Casting Endianness

#include "stdio.h"
typedef struct CustomStruct
{
short Element1[10];
}CustomStruct;
void F2(char* Y)
{
*Y=0x00;
Y++;
*Y=0x1F;
}
void F1(CustomStruct* X)
{
F2((char *)X);
printf("s = %x\n", (*X).Element1[0]);
}
int main(void)
{
CustomStruct s;
F1(&s);
return 0;
}
The above C code prints 0x1f00 when compiled and run on my PC. But when I flash it to an embedded target (a microcontroller) and debug it, I find that (*X).Element1[0] = 0x001f.
1. Why are the results different on the PC and on the embedded target?
2. What can I modify in this code so that it prints 0x001f in the PC case as well, without changing the core of the code (by adding a compiler option or something similar)?
shorts are typically two bytes and 16 bits. When you say:
short s;
((char*)&s)[0] = 0x00;
((char*)&s)[1] = 0x1f;
This sets the first of those two bytes to 0x00 and the second of those two bytes to 0x1f. The thing is that C++ doesn't specify what setting the first or second byte does to the value of the overall short, so different platforms can do different things. In particular, some platforms say that setting the first byte affects the 'most significant' bits of the short's 16 bits and setting the second byte affects the 'least significant' bits. Other platforms say the opposite: setting the first byte affects the least significant bits and setting the second byte affects the most significant bits. These two platform behaviors are referred to as big-endian and little-endian respectively.
The solution to getting consistent behavior independent of these differences is to not access the bytes of the short this way. Instead you should simply manipulate the value of the short using methods that the language does define, such as with bitwise and arithmetic operators.
short s;
s = (0x1f << 8) | (0x00 << 0); // set the most significant bits to 0x1f and the least significant bits to 0x00.
The problem is that, for many reasons, I can only change the body of the function F2. I cannot change its prototype. Is there a way to find the size of the object Y points to before it has been cast, or something like that?
You cannot determine the original type and size using only the char*. You have to know the correct type and size through some other means. If F2 is never called except with CustomStruct then you can simply cast the char* back to CustomStruct like this:
void F2(char* Y)
{
    CustomStruct* X = (CustomStruct*)Y;
    X->Element1[0] = 0x1F00;
}
But remember, such casts are not safe in general; you should only cast a pointer back to what it was originally cast from.
The portable way is to change the definition of F2:
void F2(short* p)
{
    *p = 0x1F;
}

void F1(CustomStruct* X)
{
    F2(&X->Element1[0]);
    printf("s = %x\n", (*X).Element1[0]);
}
When you reinterpret an object as an array of chars, you expose the implementation details of the representation, which is inherently non-portable and... implementation-dependent.
If you need to do I/O, i.e. interface with a fixed, specified, external wire format, use functions like htons and ntohs to convert and leave the platform specifics to your library.
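For example, assuming a POSIX system (htons and ntohs come from <arpa/inet.h>; on Windows they live in <winsock2.h>), the wire conversion for this 16-bit value might look like:
#include <arpa/inet.h> // htons/ntohs
#include <cstdint>
#include <cstdio>

int main()
{
    std::uint16_t host_value = 0x001F;
    std::uint16_t wire_value = htons(host_value);  // big-endian on the wire
    // ... write wire_value to the file or socket, read it back ...
    std::uint16_t round_trip = ntohs(wire_value);
    std::printf("round trip = 0x%04x\n", static_cast<unsigned>(round_trip)); // 0x001f on any platform
    return 0;
}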
It appears that the PC is little endian and the target is either big-endian, or has 16-bit char.
There isn't a great way to modify the C code on the PC, unless you replace your char * references with short * references, and perhaps use macros to abstract the differences between your microcontroller and your PC.
For example, you might make a macro PACK_BYTES(hi, lo) that packs two bytes into a short the same way, regardless of machine endian. Your example becomes:
#include "stdio.h"
#define PACK_BYTES(hi,lo) (((short)((hi) & 0xFF)) << 8 | (0xFF & (lo)))
typedef struct CustomStruct
{
short Element1[10];
}CustomStruct;
void F2(short* Y)
{
*Y = PACK_BYTES(0x00, 0x1F);
}
void F1(CustomStruct* X)
{
F2(&(X->Element1[0]));
printf("s = %x\n", (*X).Element1[0]);
}
int main(void)
{
CustomStruct s;
F1(&s);
return 0;
}

What can go wrong in the following code, and can the requirements be checked at compile time?

First, let me say I know the following code will be considered "bad" practice. But I'm limited by the environment a "little" bit:
In a dynamic library I wish to use "pointers" (to point to classes); however, the program that will use this DLL can only pass and receive doubles. So I need to "fit" the pointer into a double. The following code tries to achieve this, which I hope will work in a 64-bit environment:
#include <cstring> // memcpy

EXPORT double InitializeClass() {
    SampleClass* pNewObj = new SampleClass;
    double ret;
    unsigned long long tlong(reinterpret_cast<unsigned long long>(pNewObj));
    memcpy(&ret, &tlong, sizeof(tlong));
    return ret;
}

EXPORT double DeleteClass(double i) {
    unsigned long long tlong;
    memcpy(&tlong, &i, sizeof(i));
    SampleClass* ind = reinterpret_cast<SampleClass*>(tlong);
    delete ind;
    return 0;
}
Now, once again, I realize I might have been better off using a vector and storing the pointers inside it. However, I really wish to do this using pointers (as an alternative). So can anyone tell me possible failures / better versions?
The obvious failure is if double and unsigned long long aren't the same size (or if pointers are longer than 64 bits). Is there a method to check this at compile time, and give a compile error in case the sizes aren't the same?
In theory, at least, a 64-bit pointer, type-punned to a 64-bit IEEE double, could result in a trapping NaN, which would in turn trap. In practice, this might not be a problem; my attempts to get trapping NaNs to actually do something other than be ignored have not been very successful.
Another possible problem is that the values might not be normalized (and in fact, probably won't be). What the hardware does with non-normalized values depends: it could either just pass them on transparently, silently normalize them (changing the value of the "pointer"), or trigger some sort of runtime error.
There's also the issue of aliasing. Accessing a pointer through an lvalue which has a type of double is undefined behavior, and many compilers will take advantage of this when optimizing, assuming that changes through a double* or a double& reference cannot affect any pointers (and moving the load of the pointer before the write of the double, or not reloading the pointer after a modification of the double).
In practice, if you're working in an Intel environment, I think all "64-bit" pointers will in fact have the upper 16 bits 0. This is where the exponent lives in an IEEE double, and an exponent of 0 is a gradual underflow, which won't trap (at least with the default modes), and won't be changed. So your code might actually seem to work, as long as the compiler doesn't optimize too much.
assert(sizeof(SampleClass*) <= sizeof(unsigned long long));
assert(sizeof(unsigned long long) <= sizeof(double));
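Since the question specifically asks for a compile-time check, the same conditions can be written as C++11 static_asserts (also available in VS2010), which break the build instead of failing at runtime:
static_assert(sizeof(SampleClass*) <= sizeof(unsigned long long),
              "a pointer does not fit into unsigned long long");
static_assert(sizeof(unsigned long long) <= sizeof(double),
              "unsigned long long does not fit into double");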
I would say that you'll have to test it in both 64-bit and 32-bit builds to make sure it works. If it does behave differently on 64-bit systems, then you could use this pattern to get around the problem (since you've mentioned that you're using VS2010):
EXPORT double InitializeClass64() {
    // Assert the pointer-size is the same as the data-type being used
    assert(sizeof(void*) == sizeof(double));
    // 64-bit specific code
    return ret;
}

EXPORT double DeleteClass64(double i) {
    // Assert the pointer-size is the same as the data-type being used
    assert(sizeof(void*) == sizeof(double));
    // 64-bit specific code
    return 0;
}

EXPORT double InitializeClass32() {
    // Assert the pointer-size is the same as the data-type being used
    assert(sizeof(void*) == sizeof(double));
    // 32-bit specific code
    return ret;
}

EXPORT double DeleteClass32(double i) {
    // Assert the pointer-size is the same as the data-type being used
    assert(sizeof(void*) == sizeof(double));
    // 32-bit specific code
    return 0;
}
#if defined(_M_X64) || defined(_M_IA64)
// If it's 64-bit
# define InitializeClass InitializeClass64
# define DeleteClass DeleteClass64
#else
// If it's 32-bit
# define InitializeClass InitializeClass32
# define DeleteClass DeleteClass32
#endif // _M_X64 || _M_IA64