I have areas of memory that could be considered "array of bits". They are equivalent to
unsigned char arr[256];
But it would be better thought of as
bit arr[2048];
I'm accessing separate bits from it with
#define GETBIT(x,in) ((in)[ ((x)/8) ] & 1<<(7-((x)%8)))
but I do it a lot in many places of the code, often in performance-critical sections and I wonder if there are any smarter, more optimal methods to do it.
extra info: Architecture: ARM9 (32 bit); gcc/Linux. The physical data representation can't be changed - it is externally provided or mapped for external use.
I don't think so. In fact, many CPU architectures won't access bits individually.
In C++ you have std::bitset<N>, but it may not give the highest performance depending on your compiler's implementation and optimization.
BTW, it may be better to group your bit array as uint32_t[64] (or uint64_t[32]) for aligned dereferencing (which bitset already does for you).
For randomly accessing individual bits, the macro you've suggested is as good as you're going to get (as long as you turn on optimisations in your compiler).
If there is any pattern at all to the bits you're accessing, then you may be able to do better. For example, if you often access pairs of bits, then you may see some improvement by providing a method to get two bits instead of one, even if you don't always end up using both bits.
As with any optimisation problem, you will need to be very familiar with the behaviour of your code, in particular its access patterns in your bit array, to make a meaningful improvement in performance.
Update: Since you access ranges of bits, you can probably squeeze some more performance out of your macros. For example, if you need to access four bits you might have macros like this:
#define GETBITS_0_4(x,in) (((in)[(x)/8] & 0x0f))
#define GETBITS_1_4(x,in) (((in)[(x)/8] & 0x1e) >> 1)
#define GETBITS_2_4(x,in) (((in)[(x)/8] & 0x3c) >> 2)
#define GETBITS_3_4(x,in) (((in)[(x)/8] & 0x78) >> 3)
#define GETBITS_4_4(x,in) (((in)[(x)/8] & 0xf0) >> 4)
#define GETBITS_5_4(x,in) ((((in)[(x)/8] & 0xe0) >> 5) | (((in)[(x)/8+1] & 0x01)) << 3)
#define GETBITS_6_4(x,in) ((((in)[(x)/8] & 0xc0) >> 6) | (((in)[(x)/8+1] & 0x03)) << 2)
#define GETBITS_7_4(x,in) ((((in)[(x)/8] & 0x80) >> 7) | (((in)[(x)/8+1] & 0x07)) << 1)
// ...etc
These macros would clip out four bits from each bit position 0, 1, 2, etc. (To cut down on the proliferation of pointless parentheses, you might want to use inline functions for the above.) Then perhaps define an inline function like:
inline int GETBITS_4(int x, unsigned char *in) {
switch (x % 8) {
case 0: return GETBITS_0_4(x,in);
case 1: return GETBITS_1_4(x,in);
case 2: return GETBITS_2_4(x,in);
// ...etc
}
}
Since this is a lot of tedious boilerplate code, especially if you've got multiple different widths, you may want to write a program to generate all the GETBITS_* accessor functions.
(I notice that the bits in your bytes are stored in the reverse order from what I've written above. Apply an appropriate transformation to match your structure if you need to.)
Taking Greg's solution as a basis:
template<unsigned int n, unsigned int m>
inline unsigned long getbits(const unsigned long bits[]) {
    const unsigned int bitsPerLong = sizeof(unsigned long) * CHAR_BIT;
    const unsigned int bitsToGet = m - n;
    BOOST_STATIC_ASSERT(bitsToGet < bitsPerLong);
    const unsigned long mask = (1UL << bitsToGet) - 1;
    const size_t index0 = n / bitsPerLong;
    const size_t index1 = (m - 1) / bitsPerLong;
    // Do the bits to extract straddle a boundary?
    if (index0 == index1) {
        return (bits[index0] >> (n % bitsPerLong)) & mask;
    } else {
        return ((bits[index0] >> (n % bitsPerLong)) |
                (bits[index1] << (bitsPerLong - (n % bitsPerLong)))) & mask;
    }
}
This can extract up to one bit short of a full unsigned long (31 bits on a 32-bit platform), even if the bits are not aligned. Note that it's intentionally inline, as you don't want to end up with tons of these functions.
If you reverse the bit order in 'arr', then you can eliminate the subtraction from the macro. That is the best I can say without knowledge of the problem context (how the bits are used).
#define GETBIT(x,in) ((in)[ ((x)/8) ] & 1<<(7-((x)%8)))
can be optimized.
1) Use a standard int, which is normally the fastest accessible integer datatype.
If you don't need to be portable, you can find out the size of an int with
sizeof and adapt the following code.
2)
#define GETBIT(x,in) ((in)[ ((x) >> 3) ] & 1<<((x) & 7))
The mod operator % is slower than ANDing. And you don't need to subtract,
simply adjust your SETBIT routine.
Why not create your own wrapper class?
You could then add bits to the "array" using an operator such as + and get back the individual bits using the [] operator.
Your macro could be improved by using & 7 instead of % 8 but its likely the compiler will make that optimisation for you anyway.
I recently did exactly what you are doing and my stream could consist of any number of bits.
So I have something like the following:
BitStream< 1 > oneBitBitStream;
BitStream< 2 > twoBitBitStream;
oneBitBitStream += Bit_One;
oneBitBitStream += Bit_Zero;
twoBitBitStream += Bit_Three;
twoBitBitStream += Bit_One;
and so on. It makes for nice readable code and you can provide an STL-like interface to it to aid familiarity :)
Since the question is tagged with C++, is there any reason you can't simply use the standard bitset?
Instead of the unsigned char array and custom macros, you can use std::vector<bool>. The vector class template has a special template specialization for the bool type. This specialization is provided to optimize for space allocation: In this template specialization, each element occupies only one bit (which is eight times less than the smallest type in C++: char).
I have a hex pattern stored in a variable; how do I know the size of the hex pattern?
E.g. --
#define MY_PATTERN 0xFFFF
now I want to know the size of MY_PATTERN, to use somewhere in my code.
sizeof (MY_PATTERN)
this is giving me warning -- "integer conversion resulted in truncation".
How can I fix this ? What is the way I should write it ?
The pattern can increase or decrease in size so I can't hard code it.
Don't do it.
There's no such thing in C++ as a "hex pattern". What you actually use is an integer literal. See paragraph "The type of the literal". Thus, sizeof (0xffff) is equal to sizeof(int). And the bad thing is: the exact size may vary.
From the design point of view, I can't really think of a situation where such a solution is acceptable. You're not even deriving a type from a literal value, which would be a suspicious as well, but at least, a typesafe solution. Sizes of values are mostly used in operations working with memory buffers directly, like memcpy() or fwrite(). Sizes defined in such indirect ways lead to a very brittle binary interface and maintenance difficulties. What if you compile a program on both x86 and Motorola 68000 machines and want them to interoperate via a network protocol, or want to write some files on the first machine, and read them on another? sizeof(int) is 4 for the first and 2 for the second. It will break.
Instead, explicitly use the exactly sized types, like int8_t, uint32_t, etc. They're defined in the <cstdint> header.
This will solve your problem:
#define MY_PATTERN 0xFFFF
struct TypeInfo
{
template<typename T>
static size_t SizeOfType(T) { return sizeof(T); }
};
int main()
{
size_t size_of_type = TypeInfo::SizeOfType(MY_PATTERN);
}
as pointed out by Nighthawk441 you can just do:
sizeof(MY_PATTERN);
Just make sure to use a size_t wherever you are getting a warning and that should solve your problem.
You could explicitly typedef various types to hold hex numbers with restricted sizes such that:
typedef unsigned char one_byte_hex;
typedef unsigned short two_byte_hex;
typedef unsigned int four_byte_hex;
one_byte_hex pattern = 0xFF;
two_byte_hex bigger_pattern = 0xFFFF;
four_byte_hex big_pattern = 0xFFFFFFFF;
//sizeof(pattern) == 1
//sizeof(bigger_pattern) == 2
//sizeof(big_pattern) == 4
four_byte_hex new_pattern = static_cast<four_byte_hex>(pattern);
//sizeof(new_pattern) == 4
It would be easier to just treat all hex numbers as unsigned ints regardless of pattern used though.
Alternatively, you could put together a function which checks how many times it can shift the bits of the pattern until it's 0.
size_t sizeof_pattern(unsigned int pattern)
{
size_t bits = 0;
size_t bytes = 0;
unsigned int tmp = pattern;
while(tmp >> 1 != 0){
bits++;
tmp = tmp >> 1;
}
bytes = (bits + 1) / 8; //add 1 to bits to shift range from 0-31 to 1-32 so we can divide properly. 8 bits per byte.
if((bits + 1) % 8 != 0){
bytes++; //requires one more byte to store value since we have remaining bits.
}
return bytes;
}
I need to define a struct which has data members of size 2 bits and 6 bits.
Should I use a char type for each member? Or, in order not to waste memory, can I use something like the :2 / :6 bit-field notation?
How can I do that?
Can I define a typedef for a 2- or 6-bit type?
You can use something like:
typedef struct {
unsigned char SixBits:6;
unsigned char TwoBits:2;
} tEightBits;
and then use:
tEightBits eight;
eight.SixBits = 31;
eight.TwoBits = 3;
But, to be honest, unless you're having to comply with packed data external to your application, or you're in a very memory constrained situation, this sort of memory saving is not usually worth it. You'll find your code is a lot faster if it's not having to pack and unpack data all the time with bitwise and bitshift operations.
Also keep in mind that using any type other than _Bool, signed int or unsigned int for a bit-field is implementation-defined. Specifically, unsigned char may not work everywhere.
It's probably best to use uint8_t for something like this. And yes, use bit fields:
struct tiny_fields
{
uint8_t twobits : 2;
uint8_t sixbits : 6;
};
I don't think you can be sure that the compiler will pack this into a single byte, though. Also, you can't know how the bits are ordered within the byte(s) that values of the struct type occupy. It's often better to use explicit masks if you want more control.
Personally I prefer shift operators and some macros over bit fields, so there's no "magic" left for the compiler. It is usual practice in embedded world.
#define SET_VAL2BIT(_var, _val) ( (_var) |= ((_val) & 3) )
#define SET_VAL6BIT(_var, _val) ( (_var) |= (((_val) & 63) << 2) )
#define GET_VAL2BIT(_var) ( (_var) & 3 )
#define GET_VAL6BIT(_var) ( ((_var) >> 2) & 63 )
static uint8_t my_var;
<...>
SET_VAL2BIT(my_var, 1);
SET_VAL6BIT(my_var, 5);
int a = GET_VAL2BIT(my_var); /* a == 1 */
int b = GET_VAL6BIT(my_var); /* b == 5 */
When asking a question on how to do wrapped N bit signed subtraction I got the following answer:
template<int bits>
int
sub_wrap( int v, int s )
{
struct Bits { signed int r: bits; } tmp;
tmp.r = v - s;
return tmp.r;
}
That's neat and all, but how will a compiler implement this? From this question I gather that accessing bit fields is more or less the same as doing it by hand, but what about when combined with arithmetic as in this example? Would it be as fast as a good manual bit-twiddling approach?
An answer for "gcc" in the role of "a compiler" would be great if anyone wants to get specific. I've tried reading the generated assembly, but it is currently beyond me.
As written in the other question, unsigned wrapping math can be done as:
int tmp = (a - b) & 0xFFF; /* 12 bit mask. */
Writing to a (12bit) bitfield will do exactly that, signed or unsigned. The only difference is that you might get a warning message from the compiler.
For reading though, you need to do something a bit different.
For unsigned maths, it's enough to do this:
int result = tmp; /* whatever bit count, we know tmp contains nothing else. */
or
int result = tmp & 0xFFF; /* 12bit, again, if we have other junk in tmp. */
For signed maths, the extra magic is the sign-extend:
int result = (tmp << (32-12)) >> (32-12); /* assuming 32bit int, and 12bit value. */
All that does is replicate the top bit of the bitfield (bit 11) across the wider int.
This is exactly what the compiler does for bitfields. Whether you code them by hand or as bitfields is up to you, but just make sure you get the magic numbers right.
(I have not read the standard, but I suspect that relying on bitfields to do the right thing on overflow might not be safe?)
The compiler has knowledge about the size and exact position of r in your example. Suppose it is like
[xxxxrrrr]
Then
tmp.r = X;
could e.g. be expanded to (the b-suffix indicating binary literals, & is bitwise and, | is bitwise or)
tmp = (tmp & 11110000b) // <-- get the remainder which is not tmp.r
| (X & 00001111b); // <-- put X into tmp.r and filter away unwanted bits
Imagine your layout is
[xxrrrrxx] // 4 bits, 2 left-shifts
the expansion could be
tmp = (tmp & 11000011b) // <-- get the remainder which is not tmp.r
| ((X<<2) & 00111100b); // <-- filter 4 relevant bits, then shift left 2
How X actually looks like, whether a complex formulation or just a literal, is actually irrelevant.
If your architecture does not support such bitwise operations, there are still multiplications and divisions by power of two to simulate shifting, and probably these can also be used to filter out unwanted bits.
I want to store bits in an array (like structure). So I can follow either of the following two approaches
Approach number 1 (AN 1)
struct BIT
{
int data : 1;
};
int main()
{
BIT a[100];
return 0;
}
Approach number 2 (AN 2)
int main()
{
std::bitset<100> BITS;
return 0;
}
Why would someone prefer AN 2 over AN 1?
Because approach nr. 2 actually uses 100 bits of storage, plus some very minor (constant) overhead, while nr. 1 typically uses four bytes of storage per Bit structure. In general, a struct is at least one byte large per the C++ standard.
#include <bitset>
#include <iostream>
struct Bit { int data : 1; };
int main()
{
Bit a[100];
std::bitset<100> b;
std::cout << sizeof(a) << "\n";
std::cout << sizeof(b) << "\n";
}
prints
400
16
Apart from this, bitset wraps your bit array in a nice object representation with many useful operations.
A good choice depends on how you're going to use the bits.
std::bitset<N> is of fixed size. Visual C++ 10.0 is non-conforming with respect to its constructors; in general you have to provide a workaround. This was, ironically, due to what Microsoft thought was a bug-fix: they introduced a constructor taking an int argument, as I recall.
std::vector<bool> is optimized in much the same way as std::bitset. Cost: indexing doesn't directly provide a reference (there are no references to individual bits in C++), but instead returns a proxy object -- which isn't something you notice until you try to use it as a reference. Advantage: minimal storage, and the vector can be resized as required.
Simply using e.g. unsigned is also an option, if you're going to deal with a small number of bits (in practice, 32 or less, although the formal guarantee is just 16 bits).
Finally, ALL UPPERCASE identifiers are by convention (except Microsoft) reserved for macros, in order to reduce the probability of name collisions. It's therefore a good idea to not use ALL UPPERCASE identifiers for anything else than macros. And to always use ALL UPPERCASE identifiers for macros (this also makes it easier to recognize them).
Cheers & hth.,
bitset has more operations
Approach number 1 will most likely be compiled as an array of 4-byte integers, and one bit of each will be used to store your data. Theoretically a smart compiler could optimize this, but I wouldn't count on it.
Is there a reason you don't want to use std::bitset?
To quote cplusplus.com's page on bitset, "The class is very similar to a regular array, but optimizing for space allocation". If your ints are 4 bytes, a bitset uses 32 times less space.
Even doing bool bits[100], as sbi suggested, is still worse than bitset, because sizeof(bool) is at least one byte in every implementation.
If, for reasons of intellectual curiosity only, you wanted to implement your own bitset, you could do so using bit masks:
typedef struct {
unsigned char bytes[100];
} MyBitset;
bool getBit(MyBitset *bitset, int index)
{
int whichByte = index / 8;
return bitset->bytes[whichByte] & (1 << (index % 8));
}
void setBit(MyBitset *bitset, int index, bool newVal)
{
int whichByte = index / 8;
if (newVal)
{
bitset->bytes[whichByte] |= (1 << (index % 8));
}
else
{
bitset->bytes[whichByte] &= ~(1 << (index % 8));
}
}
(Sorry for using a struct instead of a class by the way. I'm thinking in straight C because I'm in the middle of a low-level assignment for school. Obviously two huge benefits of using a class are operator overloading and the ability to have a variable-sized array.)
Is there an easy way to read/write a nibble in a byte without using bit fields?
I'll always need to read both nibbles, but will need to write each nibble individually.
Thanks!
Use masks :
unsigned char byte = 0;
byte = (byte & 0xF0) | (nibble1 & 0xF); // write low quartet
byte = (byte & 0x0F) | ((nibble2 & 0xF) << 4); // write high quartet
You may want to put this inside macros.
The smallest unit you can work with is a single byte. If you want to manage the bits you should use bitwise operators.
Here's a modern answer that takes C++11 into account:
// Fixed-width integer types
#include <cstdint>
// Constexpr construction
constexpr uint8_t makeByte(uint8_t highNibble, uint8_t lowNibble)
{
return (((highNibble & 0xF) << 4) | ((lowNibble & 0xF) << 0));
}
// Constexpr high nibble extraction
constexpr uint8_t getHighNibble(uint8_t byte)
{
return ((byte >> 4) & 0xF);
}
// Constexpr low nibble extraction
constexpr uint8_t getLowNibble(uint8_t byte)
{
return ((byte >> 0) & 0xF);
}
Big benefits:
No nasty union trickery
No ugly macros
No boilerplate
Using standardised fixed-width types
constexpr functions
(i.e. they can be used in compile-time calculations and template parameters.)
Just plain simple
(Before anyone asks, the >> 0 and << 0 are primarily for visual balance, to demonstrate that the same concept is in use even in the exceptional case where no shift is actually needed. If your compiler doesn't optimise those away, complain to your compiler provider, not me.)
However, if your nibbles actually represent something important, e.g. a bitfield, then you might want to create a class/struct.
For example if you were programming a device that required a frame buffer with index 16-colour values, with 2 pixel values packed per byte, you might want to create something like this:
struct PixelPair
{
private:
uint8_t value;
public:
constexpr explicit PixelPair(uint8_t rawValue) :
value { rawValue }
{
}
constexpr PixelPair(uint8_t leftPixel, uint8_t rightPixel) :
value { makeByte(leftPixel, rightPixel) }
{
}
constexpr uint8_t getLeftPixel() const
{
return getHighNibble(this->value);
}
constexpr uint8_t getRightPixel() const
{
return getLowNibble(this->value);
}
constexpr uint8_t getRawValue() const
{
return this->value;
}
};
Note that this is essentially just a vanishingly thin wrapper around the above functions.
In this case it provides:
Type safety - No accidentally mixing up a plain old uint8_t and a specifically designated PixelPair. (See also: Bjarne Stroustrup's 2012 Keynote, where he discusses "Type-rich Programming".)
Improved readability - pixelPair.getLeftPixel() tells you exactly what the code is dealing with: the left-hand pixel of a pair of pixels.
Clear semantics - The code tells you what it is dealing with, not how it is dealing with it. pixelPair.getLeftPixel() tells you that the function is retrieving the left-hand pixel without specifying how, whereas getHighNibble(pixelByte) only tells you the how, i.e. that the high nibble of a pixel byte is being retrieved, it doesn't tell you what that nibble represents - perhaps the high nibble actually represents the right-hand pixel?
You could take this further and create a Pixel class too if you wanted even more type safety, and it could have relevant functions for dealing with the specific pixel format. This sort of code gets you thinking about what kind of data you are dealing with and the relationships between the data, rather than just thinking about the data as quantities of bits and bytes.
You could create yourself a pseudo union for convenience:
union ByteNibbles
{
ByteNibbles(BYTE hiNibble, BYTE loNibble)
{
data = loNibble;
data |= hiNibble << 4;
}
BYTE data;
};
Use it like this:
ByteNibbles byteNibbles(0xA, 0xB);
BYTE data = byteNibbles.data;