I am trying to extract the lower 25 bits of a uint64_t into a uint32_t. This solution shows how to extract the lower 16 bits from a uint32_t, but I am not able to figure out how to do it for a uint64_t. Any help would be appreciated.
See How do you set, clear, and toggle a single bit? for bit operations.
To answer your question:
uint64_t lower25Bits = inputValue & (uint64_t)0x1FFFFFF;
Just mask with a mask that leaves just the bits you care about.
uint32_t out = input & ((1UL<<25)-1);
The idea here is: 1UL<<25 provides an (unsigned long, which is guaranteed to be at least 32 bits wide) integer with just bit 25 set (counting from bit 0), i.e.
00000010000000000000000000000000
the -1 makes it become a value with all the bits below it set, i.e.:
00000001111111111111111111111111
the AND "lets through" only the bits that correspond to ones in the mask.
Another way is to throw away those bits with a double shift:
uint32_t out = (((uint32_t)input)<<7)>>7;
The cast to uint32_t makes sure we are dealing with a 32-bit wide unsigned integer; the unsigned part is important to get well-defined results with shifts (and bitwise operations in general), the 32 bit-wide part because we need a type with known size for this trick to work.
Let's say that (uint32_t)input is
11111111111111111111111111111111
we left shift it by 32-25=7; this throws away the top 7 bits
11111111111111111111111110000000
and we right-shift it back in place:
00000001111111111111111111111111
and there we go, we got just the bottom 25 bits.
Notice that the first uint32_t cast wouldn't be strictly necessary because you already have a known-size unsigned value; you could just do (input<<39)>>39, but (1) I prefer to be sure - what if tomorrow input becomes a type with another size/signedness? and (2) in general current CPUs are more efficient working with 32 bit integers than 64 bit integers.
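For reference, a small self-contained check (the variable names are mine) showing that both approaches agree:
#include <cassert>
#include <cstdint>

int main() {
    std::uint64_t input = 0xFFFFFFFFFFFFFFFFULL;   // all 64 bits set

    // Mask approach: keep only the bottom 25 bits.
    std::uint32_t masked = static_cast<std::uint32_t>(input & ((1UL << 25) - 1));

    // Double-shift approach: drop the top 7 bits of the low 32-bit half.
    std::uint32_t shifted = (static_cast<std::uint32_t>(input) << 7) >> 7;

    assert(masked == shifted);
    assert(masked == 0x1FFFFFFu);                  // 25 ones
}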
Related
I want to decode a GPS navigation message where some parameters are marked such that:
Parameters so indicated shall be two's complement, with the sign bit
(+ or -) occupying the MSB
For example, I want to store a parameter af0 which is 22 bits long, with bit 22 as the MSB.
I have already decoded af0, and now I need to perform the two's complement operation. I stored af0 using a uint32_t.
There are also other parameters like IDOT, which is 14 bits long and which I stored using a uint16_t.
I'm not sure, but if I understand it correctly I have to check whether the MSB is 1 or 0. If it is 1, I can
simply calculate the two's complement by negation (and casting) of the value, i.e. int32_t af0_i = -(int32_t)af0. If the MSB is 0, I just cast the value accordingly: int32_t af0_i = (int32_t)af0.
Is this correct for uintX_t integer types? I also tried out https://stackoverflow.com/a/34076866/6518689 but it didn't fix my problem; the value remains the same.
af0_i = -(int32_t)af0 will not work as expected; it just negates the value arithmetically, whereas what you need is to sign-extend the MSB of the 22-bit field and keep the rest of the bits unchanged.
Let's assume you extracted the raw 22 bits into a 32-bit variable:
int32_t af0 = ... /* some 22-bit value, top 10 bits are 0 */;
So now bit 21 is the sign bit. But with int32_t the sign bit is bit 31 (technically two's complement isn't guaranteed until C++20).
So we can shift left by 10 bits and immediately back right, which will sign-extend it.
af0 <<= 10; af0 >>= 10;
The code above is guaranteed to sign-extend since C++20, and is implementation-defined before that (on x86 it will work as expected, though you can add a static_assert for that).
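If you need this for several field widths (af0 with 22 bits, IDOT with 14), one possible generalization is the xor/subtract trick sketched below; the helper is my own illustration, not part of the answer above:
#include <cstdint>

// Sign-extend the low `bits` bits of `value` into a full int32_t (valid for 1 <= bits <= 31).
std::int32_t sign_extend(std::uint32_t value, unsigned bits) {
    std::uint32_t sign = 1u << (bits - 1);   // sign bit of the narrow field
    value &= (sign << 1) - 1;                // keep only the low `bits` bits
    // (value ^ sign) - sign maps [0, 2^bits) onto [-2^(bits-1), 2^(bits-1)).
    // The conversion back to int32_t is modulo 2^32 (guaranteed since C++20,
    // and what two's complement implementations do in practice).
    return static_cast<std::int32_t>((value ^ sign) - sign);
}

// e.g. int32_t af0_i  = sign_extend(af0, 22);
//      int32_t idot_i = sign_extend(idot, 14);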
As a beginner, I know we can use an array to store larger numbers if required, but I want to have a 16-byte integer data type in C++ on which I can perform all the arithmetic operations that work on basic data types like int or float.
So can we, in effect, increase the size of the default data types as desired, like an int of 64 bytes or a double of 120 bytes? Not directly on the basic data type, but in effect the same as increasing the capacity of the data types.
Is this even possible? If yes, how? And if not, what are the alternative ways to achieve the same thing?
Yes, it's possible, but no, it's not trivial.
First, I feel obliged to point out that this is one area where C and C++ really don't provide as much access to the hardware at the lowest level as you'd really like. In assembly language, you normally get a couple of features that make multiple-precision arithmetic quite a bit easier to implement. One is a carry flag. This tracks whether a previous addition generated a carry (or a previous subtraction a borrow). So to add two 128-bit numbers on a machine with 64-bit registers you'd typically write code on this general order:
; r0 contains the bottom 64-bits of the first operand
; r1 contains the upper 64 bits of the first operand
; r2 contains the lower 64 bits of the second operand
; r3 contains the upper 64 bits of the second operand
add r0, r2
adc r1, r3
Likewise, when you multiply two numbers, most processors generate the full answer in two separate registers, so when (for example) you multiply two 64-bit numbers, you get a 128-bit result.
In C and C++, however, we don't get that. One easy way to get around it is to work in smaller chunks. For example, if we want a 128-bit type on an implementation that provides 64-bit long long as its largest integer type, we can work in 32-bit chunks. When we're going to do an operation, we widen those to a long long, and do the operation on the long long. This way, when we add or multiply two 32-bit chunks, if the result is larger than 32 bits, we can still store it all in our 64-bit long long.
So, for addition life is pretty easy. We add the two lowest order words. We use a bitmask to get the bottom 32 bits and store them into the bottom 32 bits of the result. Then we take the upper 32 bits, and use them as a "carry" when we add the next 32 bits of the operands. Continue until we've added all 128 (or whatever) bits of operands and gotten our overall result.
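As a rough sketch of that addition scheme (a toy fixed-size example of my own, not a full bignum implementation):
#include <array>
#include <cstddef>
#include <cstdint>

// A 128-bit unsigned value stored as four 32-bit chunks, least significant first.
using u128_chunks = std::array<std::uint32_t, 4>;

u128_chunks add(const u128_chunks& a, const u128_chunks& b) {
    u128_chunks result{};
    std::uint64_t carry = 0;                            // widened working value
    for (std::size_t i = 0; i < result.size(); ++i) {
        std::uint64_t sum = std::uint64_t(a[i]) + b[i] + carry;
        result[i] = static_cast<std::uint32_t>(sum & 0xFFFFFFFFu);  // bottom 32 bits
        carry = sum >> 32;                              // upper bits become the carry
    }
    // Any carry left over here means the 128-bit result overflowed.
    return result;
}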
Subtraction is pretty similar. In fact, we can do 2's complement on the second operand, then add to get our result.
Multiplication gets a little trickier. It's not always immediately obvious how we can carry out multiplication in smaller pieces. The usual approach is based on the distributive property. That is, we can take some large numbers A and B and break them up into A = a1*2^32 + a0 and B = b1*2^32 + b0, where each an and bn is a 32-bit chunk of its operand. Then we use the distributive property to turn the product into:
a1*b1*2^64 + (a1*b0 + a0*b1)*2^32 + a0*b0
This can be extended to an arbitrary number of "chunks", though if you're dealing with really large numbers there are much better ways (e.g., Karatsuba multiplication).
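To make that concrete, here is a minimal sketch (my own illustration, not from the answer) of computing the full 128-bit product of two 64-bit numbers from 32-bit halves, using only 64-bit arithmetic:
#include <cstdint>

struct u128 { std::uint64_t hi, lo; };   // high and low halves of the product

u128 mul_64x64(std::uint64_t a, std::uint64_t b) {
    std::uint64_t a0 = a & 0xFFFFFFFFu, a1 = a >> 32;
    std::uint64_t b0 = b & 0xFFFFFFFFu, b1 = b >> 32;

    std::uint64_t p00 = a0 * b0;   // contributes to bits 0..63
    std::uint64_t p01 = a0 * b1;   // contributes to bits 32..95
    std::uint64_t p10 = a1 * b0;   // contributes to bits 32..95
    std::uint64_t p11 = a1 * b1;   // contributes to bits 64..127

    std::uint64_t mid = (p00 >> 32) + (p01 & 0xFFFFFFFFu) + (p10 & 0xFFFFFFFFu);
    std::uint64_t lo  = (p00 & 0xFFFFFFFFu) | (mid << 32);
    std::uint64_t hi  = p11 + (p01 >> 32) + (p10 >> 32) + (mid >> 32);
    return {hi, lo};
}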
If you want to define non-atomic big integers, you can use plain structs.
#include <array>
#include <cstddef>
#include <cstdint>

template <std::size_t size>
struct big_int {
    std::array<std::int8_t, size> bytes;  // raw storage; the arithmetic operators still have to be written by hand
};

using int128_t = big_int<16>;
using int256_t = big_int<32>;
using int512_t = big_int<64>;

int main() {
    int128_t i128 = { 0 };
}
I'm working on some simple bit manipulation problems in C++, and came across this while trying to visualize my steps. I understand that the number of bits assigned to different primitive types may vary from system to system. On my machine, sizeof(int) outputs 4, so I've got 4 chars' worth of bits for my value. I also know that a byte is usually defined as 8 bits, but that is not necessarily the case. When I output CHAR_BIT I get 8. I therefore expect there to be a total of 32 bits for my int values.
I can then go ahead and print the binary value of my int to the screen:
int max=~0; //All my bits are turned on now
std::cout<<std::bitset<sizeof(int)*CHAR_BIT>(max)<<std::endl;
$:11111111111111111111111111111111
I can increase the bitset size if I want though:
int max=~0;
std::cout<<std::bitset<sizeof(int)*CHAR_BIT*3>(max)<<std::endl;
$:000000000000000000000000000000001111111111111111111111111111111111111111111111111111111111111111
Why are there so many ones? I would have expected to have only 32 ones, padded with zeros. Instead there's twice as many, what's going on?
When I repeat the experiment with unsigned int, which has the same size as int, the extra ones don't appear:
unsigned int unmax=~0;
std::cout<<std::bitset<sizeof(unsigned int)*CHAR_BIT*3>(unmax)<<std::endl;
$:000000000000000000000000000000000000000000000000000000000000000011111111111111111111111111111111
The constructor of std::bitset takes an unsigned long long, and when you try to assign a -1 (which is what ~0 is in an int) to an unsigned long long, you get 8 bytes (64 bits) worth of 1s.
It doesn't happen with unsigned int because you are assigning the value 4294967295 instead of -1, which is 32 1s in an unsigned long long.
When you write int max=~0;, max will be 32 bits filled with 1s, which, interpreted as an integer, is -1.
When you write
std::bitset<sizeof(int)*CHAR_BIT>(max)
// basically, same as
std::bitset<32>(-1)
You need to keep in mind that the std::bitset constructor takes an unsigned long long. So the -1 that you pass to it, gets converted to a 64 bit representation of -1, which is 64 bits all filled with 1 (because you have a negative value, sign extension maintains it as such, by filling the 32 leftmost bits with 1s).
Therefore, the constructor of std::bitset gets an unsigned long long all filled with 1s, and it initializes the 32 bits you asked for with 1s. So, when you print it, you get:
11111111111111111111111111111111
Then, when you write:
std::bitset<sizeof(int)*CHAR_BIT*3>(max)
// basically, same as
std::bitset<96>(-1)
The std::bitset constructor will initialize the 64 rightmost bits of the 96 that you asked for with the value of the unsigned long long that you passed, so those 64 bits are filled with 1s. The remaining bits (the 32 leftmost) are initialized with zeros. So when you print it, you get:
000000000000000000000000000000001111111111111111111111111111111111111111111111111111111111111111
On the other hand, when you write unsigned int unmax=~0;, you're assigning all 1s to an unsigned int, so you get UINT_MAX.
Then, when you write:
std::bitset<sizeof(unsigned int)*CHAR_BIT*3>(unmax)
// basically, same as
std::bitset<96>(UINT_MAX)
The UINT_MAX that you pass gets converted to a 64-bit representation, which is the 32 rightmost bits filled with 1s and the remaining bits all 0s (because the value is now unsigned, the conversion simply zero-fills the 32 leftmost bits).
So the unsigned long long that the std::bitset constructor gets is represented as 32 0s followed by 32 1s. It will initialize the 64 rightmost bits of the 96 that you asked for with those 32 0s followed by 32 1s. The remaining 32 leftmost bits (of 96) are initialized with zeros. So when you print it, you get (64 0s followed by 32 1s):
000000000000000000000000000000000000000000000000000000000000000011111111111111111111111111111111
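One way to avoid the surprise in the first place (my suggestion, not part of the answers above) is to convert the value to the corresponding unsigned type before handing it to std::bitset, so the widening to unsigned long long zero-fills instead of sign-extending:
#include <bitset>
#include <climits>
#include <iostream>

int main() {
    int max = ~0;   // all bits set, value -1
    // Converting to unsigned int first yields UINT_MAX, which widens to
    // unsigned long long with zero-filled upper bits.
    std::cout << std::bitset<sizeof(int)*CHAR_BIT*3>(static_cast<unsigned int>(max)) << std::endl;
    // prints 64 zeros followed by 32 ones
}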
I am having trouble finding information on the proper handling of variables during binary arithmetic. I am currently working on implementing an algorithm on an Atmel ATTiny84 microcontroller. I am coding it in C++.
The issue I am having is that with binary arithmetic you could end up with overflow, or with a variable whose size is larger than the value stored in it requires. I apologize if this is confusing; let me explain with an example.
uint16_t A=500;
uint8_t B=8;
uint32_t C;
C=A*B;
From what I've learned via Google search, if you multiply a variable of M bits by a variable of N bits, the result can need M+N bits. In the above case C=4000 but M+N is 24. The value 4000, however, can fit in 16 bits. Can I simply declare C as 16 bits, or does it have to be 32 bits as shown above?
uint16_t A=500;
uint8_t B=8;
uint16_t C;
C=A*B;
If I do have to store 4000 in a variable that is 32 bits, can I simply transfer it to a variable that is 16 bits by the following
uint16_t D;
uint32_t C;
C=4000;
D=C;
Thanks in advance for the help.
Multiplication won't produce a result type larger than its operands unless you specifically tell it to. Before the multiply, both operands go through the usual integer promotions and are converted to a common type: on an 8-bit AVR like the ATtiny84, where int is 16 bits, the uint8_t is promoted and the multiplication is carried out in 16 bits; on a typical 32-bit machine both operands would be promoted to a 32-bit int first. Although mathematically the result can be larger than that common type can hold, the operation only returns a value of that type; with unsigned operands anything that doesn't fit simply wraps around (the hardware may set a carry flag, but that isn't visible from C++).
In your case, the operation will return 4000, which can be stored in a 16 bit variable, so that is fine.
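If the product could genuinely exceed 16 bits on a target where the multiplication is done in 16 bits, a common idiom (a sketch of my own, not part of this answer) is to widen one operand first so the multiply itself happens in 32 bits:
#include <cstdint>

std::uint16_t A = 500;
std::uint8_t  B = 8;

// Casting one operand up forces the multiplication to be carried out in
// (at least) 32 bits, so it cannot wrap even if A*B exceeds 65535.
std::uint32_t C = static_cast<std::uint32_t>(A) * B;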
uint16_t A=500;
uint8_t B=8;
uint32_t C;
C=A*B;
A and B are first promoted (to int, or to unsigned int on targets like the AVR where int is only 16 bits); the math is done in that promoted type, and the result is then converted to C's type when it is assigned.
uint16_t A=500;
uint8_t B=8;
uint16_t C;
C=A*B;
Same thing here: the math is done in the promoted type of A and B, and the result is converted to C's 16-bit type on assignment.
uint16_t D;
uint32_t C;
C=4000;
D=C;
You may or may not get a warning that you are trying to shove 32 bits into 16.
D=(uint16_t)C;
Will clip off the upper 16 bits without a warning. Both cases result in the lower 16 bits going into D.
Say I have a binary protocol, where the first 4 bits represent a numeric value which can be less than or equal to 10 (ten in decimal).
In C++, the smallest data type available to me is char, which is 8 bits long. So, within my application, I can hold the value represented by 4 bits in a char variable. My question is, if I have to pack the char value back into 4 bits for network transmission, how do I pack my char's value back into 4 bits?
You do bitwise operations on the char, like so:
unsigned char packedvalue = 0;
packedvalue |= 0xF0 & (7 <<4);
packedvalue |= 0x0F & (10);
This sets the upper 4 bits to 7 and the lower 4 bits to 10.
You unpack them again as:
int upper, lower;
upper = (packedvalue & 0xF0) >>4;
lower = packedvalue & 0x0F;
As an extra answer to the question -- you may also want to look at protocol buffers for a way of encoding and decoding data for binary transfers.
Sure, just use one char for your value:
#include <fstream>

std::ofstream outfile("thefile.bin", std::ios::binary);
unsigned int n = 10;  // at most 10!
char c = n << 4;      // fits in the upper 4 bits
outfile.write(&c, 1); // we wrote the value "10"
The lower 4 bits will be left at zero. If they're also used for something, you'll have to populate c fully before writing it. To read:
std::ifstream infile("thefile.bin", std::ios::binary);
infile.read(&c, 1);
unsigned int n = static_cast<unsigned char>(c) >> 4; // the cast avoids sign extension for values >= 8
Well, there's the popular but non-portable "Bit Fields". They're standard-compliant, but may create a different packing order on different platforms. So don't use them.
Then, there are the highly portable bit shifting and bitwise AND and OR operators, which you should prefer. Essentially, you work on a larger field (usually 32 bits, for TCP/IP protocols) and extract or replace subsequences of bits. See Martin's link and Soren's answer for those.
Are you familiar with C's bitfields? You simply write
struct my_bits {
unsigned v1 : 4;
...
};
Be warned, various operations are slower on bit-fields because the compiler must unpack them for things like addition. Unpacking a bit-field takes only a few extra instructions, but it is still overhead. Bitwise operations should remain quite fast. Equality too.
You must also take care with endianness and threads (see the Wikipedia article I linked for details, but the issues are kinda obvious). You should learn about endianness anyway, since you said "binary protocol" (see this previous question).
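For completeness, a minimal self-contained sketch (the struct and field names are my own, not from the answer above) of packing the protocol's 4-bit value into one byte with bit-fields:
#include <cstdint>
#include <iostream>

struct nibble_pair {
    std::uint8_t low  : 4;   // holds 0..15; the protocol value (<= 10) fits
    std::uint8_t high : 4;
};

int main() {
    nibble_pair p{};
    p.low  = 10;
    p.high = 7;
    // Typically one byte, but which nibble lands in which half of that byte
    // is implementation-defined, which is exactly the portability caveat above.
    std::cout << sizeof(p) << " byte(s), low=" << int(p.low)
              << ", high=" << int(p.high) << '\n';
}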