I am having trouble finding information on the proper handling of variables during binary arithmetic. I am currently working on implementing an algorithm on an Atmel ATTiny84 microcontroller. I am coding it in C++.
The issue I am having is that with binary arithmetic you could end up with overflow, or you could end up with a variable size that is larger than needed for the value being stored in it. I apologize if this is confusing; let me explain with an example.
uint16_t A=500;
uint8_t B=8;
uint32_t C;
C=A*B;
From what I've learned via Google search, if you multiply a variable of size M bits by a variable of size N bits, the result needs M+N bits. In the above case C=4000, but M+N is 24. The value 4000, however, can fit in 16 bits. Can I simply declare C as 16 bits, or does it have to be 32 bits as shown above?
uint16_t A=500;
uint8_t B=8;
uint16_t C;
C=A*B;
If I do have to store 4000 in a variable that is 32 bits, can I simply transfer it to a 16-bit variable with the following?
uint16_t D;
uint32_t C;
C=4000;
D=C;
Thanks in advance for the help.
Multiplication won't return a larger type than the two operands unless you specifically tell it to. The operands are converted to a common width and then multiplied. When you multiply a 16 bit int by an 8 bit int, the 8 bit int will be converted to a 16 bit int, and then the two will be multiplied; on AVR, int is 16 bits, so nothing gets wider than that. Although mathematically the result can be larger than 16 bits, the operation will only return a 16 bit number. If the result cannot fit in 16 bits, it simply wraps around modulo 2^16; the hardware carry flag is not visible from C, so if the full product matters you have to widen an operand before multiplying.
In your case, the operation will return 4000, which can be stored in a 16 bit variable, so that is fine.
uint16_t A=500;
uint8_t B=8;
uint32_t C;
C=A*B;
The operands A and B are promoted to (unsigned) int, which is 16 bits on AVR, the multiplication is done at that width, and the 16 bit result is then converted to the type of C for the assignment. The destination type does not widen the math itself.
uint16_t A=500;
uint8_t B=8;
uint16_t C;
C=A*B;
Exactly the same as above: the math is done at 16 bits and the result is then stored in C, which here is already 16 bits, so nothing changes.
uint16_t D;
uint32_t C;
C=4000;
D=C;
You may or may not get a warning that you are trying to shove 32 bits into 16.
D=(uint16_t)C;
Will drop the upper 16 bits without a warning. Both cases result in the lower 16 bits of C going into D.
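As a side note, if the product could ever exceed 16 bits (say A were 50000), the assignment target alone won't save it; you would have to widen one operand yourself before multiplying. A minimal sketch, assuming an AVR target where int is 16 bits:
#include <stdint.h>
int main() {
    uint16_t A = 50000;        // example value: now the product no longer fits in 16 bits
    uint8_t B = 8;
    uint32_t wrong = A * B;    // on AVR int is 16 bits, so A*B wraps: 400000 % 65536 = 6784
    uint32_t right = (uint32_t)A * B;   // the cast widens the multiplication itself, so 400000 survives
    (void)wrong;
    (void)right;
    return 0;
}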
I want to decode a GPS navigation message where some parameters are marked such that:
Parameters so indicated shall be two's complement, with the sign bit
(+ or -) occupying the MSB
For example, I want to store a parameter af0 which is 22 bits long, with bit 22 as the MSB.
I have already decoded af0, and now I need to perform the two's complement interpretation. I stored af0 using a uint32_t integer type.
There are also other parameters like IDOT, which is 14 bits long and which I stored using a uint16_t.
I'm not sure, but if I understand it correctly I have to check the MSB for 1 or 0. If it is 1, I can simply calculate the two's complement by negation (and casting) of the value, i.e. int32_t af0_i = -(int32_t)af0. If the MSB is 0, I just cast the value accordingly: int32_t af0_i = (int32_t)af0.
Is this correct for uintX_t integer types? I also tried out https://stackoverflow.com/a/34076866/6518689 but it didn't fix my problem; the value remains the same.
af0_i = -(int32_t)af0 will not work as expected; it negates the value arithmetically, whereas you need to sign-extend the MSB and leave the remaining bits unchanged.
Let's assume you extracted the raw 22 bits into a 32-bit variable:
int32_t af0 = ... /* some 22-bit value, top 10 bits are 0 */;
So now bit 21 is the sign bit. But with int32_t the sign bit is bit 31 (technically two's complement isn't guaranteed until C++20).
So we can shift left by 10 bits and immediately back right, which will sign-extend it.
af0 <<= 10; af0 >>= 10;
The code above is guaranteed to sign-extend since C++20, and is implementation-defined before that (on x86 it will work as expected, though you can add a static_assert for that).
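For reference, this can also be wrapped into a small helper that works for any field width up to 31 bits and sidesteps the implementation-defined shifts entirely; the function name and the XOR-and-subtract trick are my own choice, not something from the question:
#include <cstdint>
// Interpret the low `bits` bits of `value` as a two's-complement number (1 <= bits <= 31).
int32_t sign_extend(uint32_t value, unsigned bits) {
    const uint32_t sign = 1u << (bits - 1);   // position of the field's sign bit
    value &= (1u << bits) - 1;                // keep only the field itself
    // (x ^ sign) - sign subtracts 2^bits when the sign bit is set, else leaves x unchanged
    return static_cast<int32_t>(value ^ sign) - static_cast<int32_t>(sign);
}
// int32_t af0_i  = sign_extend(af0, 22);
// int32_t idot_i = sign_extend(idot, 14);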
I was asked this question in an interview, and I can't really understand what is going on here. The question is "What would be displayed in the console?"
#include <iostream>
int main()
{
unsigned long long n = 0;
((char*)&n)[sizeof(unsigned long long)-1] = 0xFF;
n >>= 7*8;
std::cout << n;
}
What is happening here, step by step?
Let's take this one step at a time:
((char*)&n)
This casts the address of the variable n from unsigned long long* to char*. This is legal; accessing objects of different types via a pointer to char is one of the very few "type punning" cases accepted by the language. This in effect allows you to access the memory of the object n as an array of bytes (aka char in C++).
((char*)&n)[sizeof(unsigned long long)-1]
You access the last byte of the object n. Remember that sizeof returns the size of a data type in bytes (in C++ char has an alter ego of byte).
((char*)&n)[sizeof(unsigned long long)-1] = 0xFF;
You set the last byte of n to the value 0xFF.
Since n was 0 initially, the memory layout of n is now:
00 .. 00 FF
Now notice the ... I put in the middle. That's not because I am too lazy to paste as many byte values as n has; it's because the size of unsigned long long is not fixed by the standard. There are some restrictions, but it can vary from implementation to implementation. So this is the first "unknown". However on most modern architectures sizeof(unsigned long long) is 8, so we are going to go with that, but in a serious interview you are expected to mention this.
The other "unknown" is how these bytes are interpreted. Unsigned integers are simply encoded in binary, but that encoding can be little endian or big endian. x86 is little endian, so we are going with that for this example. And again, in a serious interview you are expected to mention this.
n >>= 7*8;
This right-shifts the value of n by 7*8 = 56 bits. Pay attention: now we are talking about the value of n, not the bytes in memory. With our assumptions (size 8, little endian) the value encoded in memory is 0xFF00000000000000, so shifting it right by 56 bits results in the value 0xFF, which is 255.
So, assuming sizeof(unsigned long long) is 8 and a little endian encoding the program prints 255 to the console.
If we are talking about a big endian system, the memory layout after setting the last byte to 0xff is still the same: 00 ... 00 FF, but now the value encoded is 0xFF. So the result of n >>= 7*8; would be 0. In a big endian system the program would print 0 to the console.
As pointed out in the comments, there are other assumptions:
char being 8 bits. Although sizeof(char) is guaranteed to be 1, it doesn't have to have 8 bits. All modern systems I know of have bits grouped in 8-bit bytes.
integers don't have to be little or big endian. There can be other arrangement patterns like middle endian. Being something other than little or big endian is considered esoteric nowadays.
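If you would rather verify the endianness assumption than state it, a tiny check is easy to write. A minimal sketch using memcpy so no type-punning questions arise (in C++20 you could instead compare std::endian::native against std::endian::little):
#include <cstdint>
#include <cstring>
#include <iostream>
bool is_little_endian() {
    const uint32_t probe = 1;
    unsigned char first_byte;
    std::memcpy(&first_byte, &probe, 1);   // look at the lowest-addressed byte
    return first_byte == 1;                // the 1 lands there only on little endian
}
int main() {
    std::cout << (is_little_endian() ? "little" : "big (or other)") << " endian\n";
}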
Cast the address of n to a pointer to char, set the char element at index 7 (the last byte, assuming sizeof(unsigned long long)==8) to 0xFF, then right-shift the result (as an unsigned long long) by 56 bits.
I am trying to extract the lower 25 bits of a uint64_t into a uint32_t. This solution shows how to extract the lower 16 bits from a uint32_t, but I am not able to figure out how to do it for uint64_t. Any help would be appreciated.
See How do you set, clear, and toggle a single bit? for bit operations.
To answer your question:
uint64_t lower25Bits = inputValue & (uint64_t)0x1FFFFFF;
Just AND with a mask that keeps only the bits you care about.
uint32_t out = input & ((1UL<<25)-1);
The idea here is: 1UL<<25 provides an (unsigned long, which is guaranteed to be at least 32 bits wide) integer with just bit 25 set (counting from zero), i.e.
00000010000000000000000000000000
the -1 turns it into a value with all the bits below that one set, i.e.:
00000001111111111111111111111111
the AND "lets through" only the bits that correspond to a one in the mask.
Another way is to throw away those bits with a double shift:
uint32_t out = (((uint32_t)input)<<7)>>7;
The cast to uint32_t makes sure we are dealing with a 32-bit wide unsigned integer; the unsigned part is important to get well-defined results with shifts (and bitwise operations in general), the 32 bit-wide part because we need a type with known size for this trick to work.
Let's say that (uint32_t)input is
11111111111111111111111111111111
we left shift it by 32-25=7; this throws away the top 7 bits
11111111111111111111111110000000
and we right-shift it back in place:
00000001111111111111111111111111
and there we go, we got just the bottom 25 bits.
Notice that the first uint32_t cast wouldn't be strictly necessary because you already have a known-size unsigned value; you could just do (input<<39)>>39, but (1) I prefer to be sure - what if tomorrow input becomes a type with another size/signedness? and (2) in general current CPUs are more efficient working with 32 bit integers than 64 bit integers.
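If you need the same extraction for several different widths, the mask version can be wrapped in a small helper; a sketch with my own naming, valid for widths from 1 to 63 so the shift stays defined:
#include <cstdint>
// Return the lowest `bits` bits of `value`; valid for 1 <= bits <= 63.
uint64_t low_bits(uint64_t value, unsigned bits) {
    return value & ((UINT64_C(1) << bits) - 1);
}
// uint32_t out = static_cast<uint32_t>(low_bits(input, 25));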
Say I have a binary protocol where the first 4 bits represent a numeric value which can be less than or equal to 10 (ten in decimal).
In C++, the smallest data type available to me is char, which is 8 bits long. So, within my application, I can hold the value represented by 4 bits in a char variable. My question is: if I have to pack the char's value back into 4 bits for network transmission, how do I do that?
You do bitwise operations on the char, like so:
unsigned char packedvalue = 0;
packedvalue |= 0xF0 & (7 <<4);
packedvalue |= 0x0F & (10);
This sets the upper 4 bits to 7 and the lower 4 bits to 10.
Unpacking these again works like this:
int upper, lower;
upper = (packedvalue & 0xF0) >>4;
lower = packedvalue & 0x0F;
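Wrapped up as reusable functions, the same packing and unpacking could look like the sketch below (names are mine; the masking keeps each value confined to its own nibble even if a caller passes something larger):
#include <cstdint>
// Pack two 4-bit values into one byte; `high` goes into the upper nibble.
uint8_t pack_nibbles(uint8_t high, uint8_t low) {
    return static_cast<uint8_t>(((high & 0x0F) << 4) | (low & 0x0F));
}
void unpack_nibbles(uint8_t packed, uint8_t& high, uint8_t& low) {
    high = packed >> 4;     // upper nibble
    low = packed & 0x0F;    // lower nibble
}
// uint8_t byte = pack_nibbles(7, 10);   // gives 0x7A, same layout as the snippet above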
As an extra answer to the question -- you may also want to look at protocol buffers for a way of encoding and decoding data for binary transfers.
Sure, just use one char for your value:
std::ofstream outfile("thefile.bin", std::ios::binary);
unsigned int n = 10; // at most 10!
char c = n << 4; // fits
outfile.write(&c, 1); // we wrote the value "10"
The lower 4 bits will be left at zero. If they're also used for something, you'll have to populate c fully before writing it. To read:
std::ifstream infile("thefile.bin", std::ios::binary);
infile.read(&c, 1);
unsigned int n = c >> 4; // if plain char is signed, use static_cast<unsigned char>(c) >> 4 to avoid sign extension
Well, there's the popular but non-portable "Bit Fields". They're standard-compliant, but may create a different packing order on different platforms. So don't use them.
Then, there are the highly portable bit shifting and bitwise AND and OR operators, which you should prefer. Essentially, you work on a larger field (usually 32 bits, for TCP/IP protocols) and extract or replace subsequences of bits. See Martin's link and Soren's answer for those.
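Those extract/replace operations come down to a shift plus a mask; here is a generic sketch (the names and the LSB-based bit numbering are my own convention, valid for field widths below 32):
#include <cstdint>
// Read `width` bits starting at bit `pos` (counted from the LSB).
uint32_t get_field(uint32_t word, unsigned pos, unsigned width) {
    return (word >> pos) & ((1u << width) - 1);
}
// Write `value` into the same field, leaving all other bits untouched.
uint32_t set_field(uint32_t word, unsigned pos, unsigned width, uint32_t value) {
    const uint32_t mask = ((1u << width) - 1) << pos;
    return (word & ~mask) | ((value << pos) & mask);
}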
Are you familiar with C's bitfields? You simply write
struct my_bits {
unsigned v1 : 4;
...
};
Be warned, various operations are slower on bitfields because the compiler must unpack them for things like addition. I'd imagine unpacking a bitfield will still be faster than the addition operation itself, even though it requires multiple instructions, but it's still overhead. Bitwise operations should remain quite fast. Equality too.
You must also take care with endianness and threads (see the Wikipedia article I linked for details, but the issues are kinda obvious). You should learn about endianness anyway since you said "binary protocol" (see this previous question).
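For the 4-bit value from the original question, using such a bitfield could look like the sketch below; the second field is purely hypothetical, just to show how the rest of the byte might be declared:
struct my_bits {
    unsigned v1 : 4;      // the 4-bit protocol value
    unsigned flags : 4;   // hypothetical: however the rest of the byte is used
};
int main() {
    my_bits b{};
    b.v1 = 10;            // the compiler does the masking and shifting for you
    b.flags = 3;
    return b.v1 == 10 ? 0 : 1;   // v1 reads back as 10
}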
To begin with, the application in question is always going to be on the same processor, and the compiler is always gcc, so I'm not concerned about bitfields not being portable.
gcc lays out bitfields such that the first listed field corresponds to the least significant bits of a byte. So with the following structure and a=0, b=1, c=1, d=1, you get a byte of value e0.
struct Bits {
unsigned int a:5;
unsigned int b:1;
unsigned int c:1;
unsigned int d:1;
} __attribute__((__packed__));
(Actually, this is C++, so I'm talking about g++.)
Now let's say I'd like a to be a six bit integer.
Now, I can see why this won't work, but I coded the following structure:
struct Bits2 {
unsigned int a:6;
unsigned int b:1;
unsigned int c:1;
unsigned int d:1;
} __attribute__((__packed__));
Setting b, c, and d to 1, and a to 0 results in the following two bytes:
c0 01
This isn't what I wanted. I was hoping to see this:
e0 00
Is there any way to specify a structure that has three bits in the most significant bits of the first byte and six bits spanning the five least significant bits of the first byte and the most significant bit of the second?
Please be aware that I have no control over where these bits are supposed to be laid out: it's a layout of bits that are defined by someone else's interface.
(Note that all of this is gcc-specific commentary - I'm well aware that the layout of bitfields is implementation-defined).
Not on a little-endian machine: The problem is that on a little-endian machine, the most significant bit of the second byte isn't considered "adjacent" to the least significant bits of the first byte.
You can, however, combine the bitfields with the ntohs() function:
union u_Bits2{
struct Bits2 {
uint16_t _padding:7;
uint16_t a:6;
uint16_t b:1;
uint16_t c:1;
uint16_t d:1;
} __attribute__((__packed__)) bits;
uint16_t word;
};
union u_Bits2 flags;
flags.word = ntohs(flag_bytes_from_network);
However, I strongly recommend you avoid bitfields and instead use shifting and masks.
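For the layout asked about here, the shift-and-mask version could look like the following sketch; how the six bits of a are split across the byte boundary (high five bits first) is my assumption about the external interface, so swap it around if the spec says otherwise:
#include <cstdint>
// d, c, b go into the top three bits of byte 0; the 6-bit `a` fills the
// remaining five bits of byte 0 plus the MSB of byte 1 (high bits of `a` first).
void pack(uint8_t out[2], unsigned a, bool b, bool c, bool d) {
    out[0] = static_cast<uint8_t>((d << 7) | (c << 6) | (b << 5) | ((a >> 1) & 0x1F));
    out[1] = static_cast<uint8_t>((a & 0x01) << 7);
}
// With a = 0 and b = c = d = 1 this produces the bytes e0 00 asked for in the question.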
Usually you can't make strong assumptions about how the union will be packed; every compiler implementation may choose to pack it differently (to save space or to align bitfields inside bytes).
I would suggest you just work it out with masking and bitwise operators.
from this link:
The main use of bitfields is either to allow tight packing of data or to be able to specify the fields within some externally produced data files. C gives no guarantee of the ordering of fields within machine words, so if you do use them for the latter reason, your program will not only be non-portable, it will be compiler-dependent too. The Standard says that fields are packed into ‘storage units’, which are typically machine words. The packing order, and whether or not a bitfield may cross a storage unit boundary, are implementation defined. To force alignment to a storage unit boundary, a zero width field is used before the one that you want to have aligned.
C/C++ has no means of specifying the bit-by-bit memory layout of structs, so you will need to do manual bit shifting and masking on 8- or 16-bit (unsigned) integers (uint8_t, uint16_t from <stdint.h> or <cstdint>).
Of the good dozen of programming languages I know, only very few allow you to specify bit-by-bit memory layout for bit fields: Ada, Erlang, VHDL (and Verilog).
(Community wiki if you want to add more languages to that list.)