I need to define a struct which has data members of size 2 bits and 6 bits.
Should I use a char type for each member? Or, in order not to waste memory, can I use something like the :2 / :6 notation?
How can I do that?
Can I define a typedef for a 2- or 6-bit type?
You can use something like:
typedef struct {
unsigned char SixBits:6;
unsigned char TwoBits:2;
} tEightBits;
and then use:
tEightBits eight;
eight.SixBits = 31;
eight.TwoBits = 3;
But, to be honest, unless you're having to comply with packed data external to your application, or you're in a very memory constrained situation, this sort of memory saving is not usually worth it. You'll find your code is a lot faster if it's not having to pack and unpack data all the time with bitwise and bitshift operations.
Also keep in mind that using any type other than _Bool, signed int, or unsigned int for a bit field is implementation-defined. Specifically, unsigned char may not work everywhere.
It's probably best to use uint8_t for something like this. And yes, use bit fields:
#include <stdint.h>

struct tiny_fields
{
uint8_t twobits : 2;
uint8_t sixbits : 6;
};
I don't think you can be sure that the compiler will pack this into a single byte, though. Also, you can't know how the bits are ordered within the byte(s) that values of the struct type occupy. It's often better to use explicit masks if you want more control.
Personally I prefer shift operators and some macros over bit fields, so there's no "magic" left for the compiler. It is common practice in the embedded world.
#include <stdint.h>

#define SET_VAL2BIT(_var, _val) ( (_var) = ((_var) & ~3u) | ((_val) & 3) )
#define SET_VAL6BIT(_var, _val) ( (_var) = ((_var) & ~(63u << 2)) | (((_val) & 63) << 2) )
#define GET_VAL2BIT(_var) ( (_var) & 3 )
#define GET_VAL6BIT(_var) ( ((_var) >> 2) & 63 )
static uint8_t my_var;
<...>
SET_VAL2BIT(my_var, 1);
SET_VAL6BIT(my_var, 5);
int a = GET_VAL2BIT(my_var); /* a == 1 */
int b = GET_VAL6BIT(my_var); /* b == 5 */
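If you want the arguments type-checked, the same shifts and masks can also live in small inline functions instead of macros. A minimal sketch, assuming the same 2-bit/6-bit layout as above (the function names are made up):
#include <stdint.h>

static inline void set_val2bit(uint8_t *var, uint8_t val) { *var = (uint8_t)((*var & ~0x03u) | (val & 0x03u)); }
static inline void set_val6bit(uint8_t *var, uint8_t val) { *var = (uint8_t)((*var & ~0xFCu) | ((val & 0x3Fu) << 2)); }
static inline uint8_t get_val2bit(uint8_t var) { return (uint8_t)(var & 0x03u); }
static inline uint8_t get_val6bit(uint8_t var) { return (uint8_t)((var >> 2) & 0x3Fu); }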
My task is to create a class that implements Floating point number.
The size of the class must be exactly 3 bytes:
1 bit for the sign
6 bits for exponent
17 bits for mantissa
I tried to implement the class using bit fields, but the size
is 4 bytes :
class FloatingPointNumber
{
private:
unsigned int sign : 1;
unsigned int exponent : 6;
unsigned int mantissa : 17;
};
C++ (and C, for that matter) compilers are permitted to insert and append any amount of padding into a struct as they see fit. So if your task specifies that the class must be exactly 3 bytes, then it cannot be done with a struct (or class) using only standard language features.
Using compiler-specific attributes or pragmas you can force the compiler not to insert padding; however, for bit fields the compiler may still pad the struct up to the alignment requirement of the underlying type.
For this specific task your best bet probably is to use a class like this
class CustomFloat {
protected: // or private: as per #paddy's comment
unsigned char v[3];
};
…and hoping for the compiler not to append some padding bytes.
The surefire way would be simply to
typedef char CustomFloat[3];
and accept that you won't enjoy any static type checking benefits.
And then for each operation use a form of type punning to transfer the contents of v into an (at least 32-bit-wide) variable, unpack the bits from there, perform the desired operation, pack the bits, and transfer them back into v. E.g. something like this:
uint32_t u = 0;
static_assert( sizeof(u) >= sizeof(v) );
memcpy((void*)&u, (void const*)v, sizeof(v));
unsigned sign = (u & SIGN_MASK) >> SIGN_SHIFT;
unsigned mant = (u & MANT_MASK) >> MANT_SHIFT;
unsigned expt = (u & EXPT_MASK) >> EXPT_SHIFT;
// perform operation
u = 0;
u |= (sign << SIGN_SHIFT) & SIGN_MASK;
u |= (mant << MANT_SHIFT) & MANT_MASK;
u |= (expt << EXPT_SHIFT) & EXPT_MASK;
memcpy((void*)v, (void const*)&u, sizeof(v));
Yes, this looks ugly. Yes, it is quite verbose. But that's what's going to happen under the hood anyway, so you might just as well write it down.
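If you want to avoid repeating the memcpy dance at every call site, the transfer steps could be wrapped in small helpers. A sketch, where the shift/mask constants are assumptions based on the 1/6/17-bit split described above and a little-endian host is assumed for the 3-byte copy:
#include <cstdint>
#include <cstring>

// Assumed layout within the 24 bits: bit 23 = sign, bits 22..17 = exponent, bits 16..0 = mantissa.
constexpr std::uint32_t SIGN_SHIFT = 23, EXPT_SHIFT = 17, MANT_SHIFT = 0;
constexpr std::uint32_t SIGN_MASK = 0x1u << SIGN_SHIFT;
constexpr std::uint32_t EXPT_MASK = 0x3Fu << EXPT_SHIFT;
constexpr std::uint32_t MANT_MASK = 0x1FFFFu << MANT_SHIFT;

// Copy the three stored bytes into the low bits of a 32-bit value.
inline std::uint32_t load24(const unsigned char (&v)[3]) {
    std::uint32_t u = 0;
    std::memcpy(&u, v, sizeof v);
    return u;
}

// Copy the low three bytes of u back into the storage array.
inline void store24(unsigned char (&v)[3], std::uint32_t u) {
    std::memcpy(v, &u, sizeof v);
}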
I'm trying to interface with Ada code using C++, so I'm defining a struct using bit fields, so that all the data is in the same place in both languages. The following is not precisely what I'm doing, but outlines the problem. The following is also a console application in VS2008, but that's not super relevant.
using namespace System;
int main() {
int array1[2] = {0, 0};
int *array2 = new int[2]();
array2[0] = 0;
array2[1] = 0;
#pragma pack(1)
struct testStruct {
// Word 0 (desired)
unsigned a : 8;
unsigned b : 1;
bool c : 1;
unsigned d : 21;
bool e : 1;
// Word 1 (desired)
int f : 32;
// Words 2-3 (desired)
int g[2]; //Cannot assign bit field but takes 64 bits in my compiler
};
testStruct test;
Console::WriteLine("size of char: {0:D}", sizeof(char) * 8);
Console::WriteLine("size of short: {0:D}", sizeof(short) * 8);
Console::WriteLine("size of int: {0:D}", sizeof(int) * 8);
Console::WriteLine("size of unsigned: {0:D}", sizeof(unsigned) * 8);
Console::WriteLine("size of long: {0:D}", sizeof(long) * 8);
Console::WriteLine("size of long long: {0:D}", sizeof(long long) * 8);
Console::WriteLine("size of bool: {0:D}", sizeof(bool) * 8);
Console::WriteLine("size of int[2]: {0:D}", sizeof(array1) * 8);
Console::WriteLine("size of int*: {0:D}", sizeof(array2) * 8);
Console::WriteLine("size of testStruct: {0:D}", sizeof(testStruct) * 8);
Console::WriteLine("size of test: {0:D}", sizeof(test) * 8);
Console::ReadKey(true);
delete[] array2;
return 0;
}
(If it wasn't clear, in the real program, the basic idea is that the program gets a void* from something communicating with the Ada code and casts it to a testStruct* to access the data.)
With #pragma pack(1) commented out, the output is:
size of char: 8
size of short: 16
size of int: 32
size of unsigned: 32
size of long: 32
size of long long: 64
size of bool: 8
size of int[2]: 64
size of int*: 32
size of testStruct: 224
size of test: 224
Obviously 4 words (indexed 0-3) should be 4*32 = 128 bits, not 224. The other output lines were to help confirm the size of types under the VS2008 compiler.
With #pragma pack(1) uncommented, that number (on the last two lines of output) is reduced to 176, which is still greater than 128. It seems that the bools aren't being packed together with the unsigned ints in "Word 0".
Note: a&b, c, d, e, f packed into different words would be 5 words, +2 for the array = 7 words, times 32 bits = 224, the number we get with #pragma pack(1) commented out. If c and e (the bools) instead take up 8 bits each, as opposed to 32, we get 176, which is the number we get with #pragma pack(1) uncommented. It seems #pragma pack(1) is only allowing the bools to be packed into single bytes by themselves, instead of being packed together with the unsigned ints into words.
So my question, in one sentence: Is there a way to force the compiler to pack a through e into one word? Related is this question: C++ bitfield packing with bools, but it doesn't answer my question; it only points out the behavior I'm trying to force to go away.
If there is literally no way to do this, does anyone have any ideas for workarounds? I'm at a loss, because:
I was asked to avoid changing the struct format that I'm copying (no re-ordering).
I don't want to change the bools to unsigned ints because it may cause problems down the road with constantly having to re-cast it to bool and maybe accidentally using the wrong version of an overloaded function, not to mention making the code more obscure for others who read it later.
I don't want to declare them as private unsigned ints then make public accessors or something because all other members of all other structs in the project are accessed directly without () afterward, so it would seem a bit hacky and obtuse, and one would almost NEED the IntelliSense or trial-and-error to remember which needs () and which doesn't.
I would like to avoid creating another struct type just for the data conversion (and e.g. make a constructor for testStruct that takes in a single testStructImport-type object) because the actual struct is very long with lots of bit-field-specified variables.
I recommend that you create a "normal" structure without any bit packing. Use default POD types for the members.
Create interface functions for loading the "normal" fields from a buffer (uint8_t), and storing to a buffer.
This will allow you to use the data members in a sane method in your program. The bit packing and unpacking will be handled by the interface function. The bit twiddling should use bitwise AND and bitwise OR functions and not rely on the bit field notation in a structure. This will allow you to adjust the bit twiddling and be more portable among compilers.
This is how I designed my protocol classes. And I don't have to worry about bit field positioning, Endianess or things of that sort.
Also, I can use block I/O for reading and writing the buffer.
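A minimal sketch of what such interface functions might look like, using the a/b/c/d/e fields from the question above; the bit positions and LSB-first packing are assumptions for illustration only:
#include <cstdint>

struct Record {            // "normal" POD members, no bit fields
    std::uint8_t  a;       // 8 bits on the wire
    std::uint8_t  b;       // 1 bit
    bool          c;       // 1 bit
    std::uint32_t d;       // 21 bits
    bool          e;       // 1 bit
};

// Pack word 0 exactly as the assumed wire format requires.
std::uint32_t pack_word0(const Record& r) {
    return  (std::uint32_t(r.a) & 0xFFu)
         | ((std::uint32_t(r.b) & 0x1u)      << 8)
         | ((std::uint32_t(r.c) & 0x1u)      << 9)
         | ((r.d                & 0x1FFFFFu) << 10)
         | ((std::uint32_t(r.e) & 0x1u)      << 31);
}

// Unpack word 0 back into the plain struct.
void unpack_word0(std::uint32_t w, Record& r) {
    r.a =  w        & 0xFFu;
    r.b = (w >> 8)  & 0x1u;
    r.c = (w >> 9)  & 0x1u;
    r.d = (w >> 10) & 0x1FFFFFu;
    r.e = (w >> 31) & 0x1u;
}
The struct members never depend on compiler-specific packing, so the same code works regardless of how the compiler would have laid out bit fields.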
Try packing in this way:
#pragma pack( push, 1 )
struct testStruct {
// Word 0 (desired)
unsigned a : 8;
unsigned b : 1;
unsigned c : 1;
unsigned d : 21;
unsigned e : 1;
// Word 1 (desired)
unsigned f : 32;
// Words 2-3 (desired)
unsigned g[2]; //Cannot assign bit field but takes 64 bits in my compiler
};
#pragma pack(pop)
There is no easy, elegant method without using accessors or an interface layer. Unfortunately, there is nothing like a #pragma to fix this. I ended up just converting the bools to unsigned int and renaming variables from e.g. f to f_flag or f_bool to encourage correct usage and make it clear what the variables contain. It's lower-effort than Thomas's solution, but obviously not as robust, and it still mitigates some of the main drawbacks of the easier methods.
Years after I posted this question, user #WaltK added this comment to the linked, related question:
"If you want to have more control over the layout of bit field
structures in memory, consider using this bit field facility,
implemented as a library header file."
I have a hex pattern stored in a variable; how do I know the size of the hex pattern?
E.g. --
#define MY_PATTERN 0xFFFF
now I want to know the size of MY_PATTERN, to use somewhere in my code.
sizeof (MY_PATTERN)
this is giving me a warning -- "integer conversion resulted in truncation".
How can I fix this ? What is the way I should write it ?
The pattern can increase or decrease in size so I can't hard code it.
Don't do it.
There's no such thing in C++ as a "hex pattern". What you actually use is an integer literal. See paragraph "The type of the literal". Thus, sizeof (0xffff) is equal to sizeof(int). And the bad thing is: the exact size may vary.
From the design point of view, I can't really think of a situation where such a solution is acceptable. You're not even deriving a type from a literal value, which would be suspicious as well, but at least a typesafe solution. Sizes of values are mostly used in operations working with memory buffers directly, like memcpy() or fwrite(). Sizes defined in such indirect ways lead to a very brittle binary interface and maintenance difficulties. What if you compile a program on both x86 and Motorola 68000 machines and want them to interoperate via a network protocol, or want to write some files on the first machine and read them on the second? sizeof(int) is 4 for the first and 2 for the second. It will break.
Instead, explicitly use the exactly sized types, like int8_t, uint32_t, etc. They're defined in the <cstdint> header.
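For example (a sketch; the constant names are made up):
#include <cstdint>

constexpr std::uint16_t MY_PATTERN16 = 0xFFFF;      // always 2 bytes
constexpr std::uint32_t MY_PATTERN32 = 0xDEADBEEF;  // always 4 bytes

static_assert(sizeof(MY_PATTERN16) == 2, "16-bit pattern");
static_assert(sizeof(MY_PATTERN32) == 4, "32-bit pattern");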
This will solve your problem:
#define MY_PATTERN 0xFFFF
struct TypeInfo
{
template<typename T>
static size_t SizeOfType(T) { return sizeof(T); }
};
int main()
{
size_t size_of_type = TypeInfo::SizeOfType(MY_PATTERN);
}
As pointed out by Nighthawk441, you can just do:
sizeof(MY_PATTERN);
Just make sure to use a size_t wherever you are getting a warning and that should solve your problem.
You could explicitly typedef various types to hold hex numbers with restricted sizes such that:
typedef unsigned char one_byte_hex;
typedef unsigned short two_byte_hex;
typedef unsigned int four_byte_hex;
one_byte_hex pattern = 0xFF;
two_byte_hex bigger_pattern = 0xFFFF;
four_byte_hex big_pattern = 0xFFFFFFFF;
//sizeof(pattern) == 1
//sizeof(bigger_pattern) == 2
//sizeof(big_pattern) == 4
four_byte_hex new_pattern = static_cast<four_byte_hex>(pattern);
//sizeof(new_pattern) == 4
It would be easier to just treat all hex numbers as unsigned ints regardless of pattern used though.
Alternatively, you could put together a function which checks how many times it can shift the bits of the pattern until it's 0.
size_t sizeof_pattern(unsigned int pattern)
{
size_t bits = 0;
size_t bytes = 0;
unsigned int tmp = pattern;
while(tmp >> 1 != 0){
bits++;
tmp = tmp >> 1;
}
bytes = (bits + 1) / 8; //add 1 to bits to shift range from 0-31 to 1-32 so we can divide properly. 8 bits per byte.
if((bits + 1) % 8 != 0){
bytes++; //requires one more byte to store value since we have remaining bits.
}
return bytes;
}
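For example, with the function above:
// sizeof_pattern(0xFF)       == 1
// sizeof_pattern(0xFFFF)     == 2
// sizeof_pattern(0x1FFFF)    == 3
// sizeof_pattern(0xFFFFFFFF) == 4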
I have areas of memory that could be considered "array of bits". They are equivalent to
unsigned char arr[256];
But it would be better thought of as
bit arr[2048];
I'm accessing separate bits from it with
#define GETBIT(x,in) ((in)[ ((x)/8) ] & 1<<(7-((x)%8)))
but I do it a lot in many places of the code, often in performance-critical sections and I wonder if there are any smarter, more optimal methods to do it.
extra info: Architecture: ARM9 (32 bit); gcc/Linux. The physical data representation can't be changed - it is externally provided or mapped for external use.
I don't think so. In fact, many CPU architectures won't access bits individually.
In C++ you have std::bitset<N>, but it may not have the highest performance depending on your compiler's implementation and optimization.
BTW, it may be better to group your bit array as uint32_t[64] (or uint64_t[32]) for aligned dereferencing (std::bitset already does this for you).
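For example, a word-based variant of the lookup might look like this (a sketch; it assumes LSB-first numbering within each 32-bit word, which differs from the byte layout in the question, so the data would have to be mapped accordingly):
#include <stdint.h>

static uint32_t arr32[64];   /* 64 * 32 = 2048 bits, same storage as unsigned char arr[256] */

#define GETBIT32(x, in) ( ((in)[(x) / 32] >> ((x) % 32)) & 1u )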
For randomly accessing individual bits, the macro you've suggested is as good as you're going to get (as long as you turn on optimisations in your compiler).
If there is any pattern at all to the bits you're accessing, then you may be able to do better. For example, if you often access pairs of bits, then you may see some improvement by providing a method to get two bits instead of one, even if you don't always end up using both bits.
As with any optimisation problem, you will need to be very familiar with the behaviour of your code, in particular its access patterns in your bit array, to make a meaningful improvement in performance.
Update: Since you access ranges of bits, you can probably squeeze some more performance out of your macros. For example, if you need to access four bits you might have macros like this:
#define GETBITS_0_4(x,in) (((in)[(x)/8] & 0x0f))
#define GETBITS_1_4(x,in) (((in)[(x)/8] & 0x1e) >> 1)
#define GETBITS_2_4(x,in) (((in)[(x)/8] & 0x3c) >> 2)
#define GETBITS_3_4(x,in) (((in)[(x)/8] & 0x78) >> 3)
#define GETBITS_4_4(x,in) (((in)[(x)/8] & 0xf0) >> 4)
#define GETBITS_5_4(x,in) ((((in)[(x)/8] & 0xe0) >> 5) | (((in)[(x)/8+1] & 0x01)) << 3)
#define GETBITS_6_4(x,in) ((((in)[(x)/8] & 0xc0) >> 6) | (((in)[(x)/8+1] & 0x03)) << 2)
#define GETBITS_7_4(x,in) ((((in)[(x)/8] & 0x80) >> 7) | (((in)[(x)/8+1] & 0x07)) << 1)
// ...etc
These macros would clip out four bits from each bit position 0, 1, 2, etc. (To cut down on the proliferation of pointless parentheses, you might want to use inline functions for the above.) Then perhaps define an inline function like:
inline int GETBITS_4(int x, unsigned char *in) {
switch (x % 8) {
case 0: return GETBITS_0_4(x,in);
case 1: return GETBITS_1_4(x,in);
case 2: return GETBITS_2_4(x,in);
// ...etc
}
}
Since this is a lot of tedious boilerplate code, especially if you've got multiple different widths, you may want to write a program to generate all the GETBIT_* accessor functions.
(I notice that the bits in your bytes are stored in the reverse order from what I've written above. Apply an appropriate transformation to match your structure if you need to.)
Taking Greg's solution as a basis:
#include <climits>                  // for CHAR_BIT
#include <boost/static_assert.hpp>  // for BOOST_STATIC_ASSERT

template<unsigned int n, unsigned int m>
inline unsigned long getbits(unsigned long bits[]) {
const unsigned bitsPerLong = sizeof(unsigned long) * CHAR_BIT;
const unsigned int bitsToGet = m - n;
BOOST_STATIC_ASSERT(bitsToGet < bitsPerLong);
const unsigned mask = (1UL << bitsToGet) - 1;
const size_t index0 = n / bitsPerLong;
const size_t index1 = m / bitsPerLong;
// Do the bits to extract straddle a boundary?
if (index0 == index1) {
return (bits[index0] >> (n % bitsPerLong)) & mask;
} else {
return ((bits[index0] >> (n % bitsPerLong)) | (bits[index1] << (bitsPerLong - (n % bitsPerLong)))) & mask;
}
}
This can fetch up to one word's worth of bits minus one (31 with a 32-bit unsigned long), even if they straddle a word boundary. Note that it's intentionally inline, as you don't want to have tons of these functions.
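Usage might look like this, assuming a 32-bit unsigned long (the numbers are just an example):
unsigned long data[8] = { 0 };
unsigned long field = getbits<30, 34>(data); // 4 bits straddling the first word boundary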
If you reverse the bit order in arr, you can eliminate the subtraction from the macro. That is the best I can say without knowing the problem context (how the bits are used).
#define GETBIT(x,in) ((in)[ ((x)/8) ] & 1<<(7-((x)%8)))
can be optimized.
1) Use a standard int, which is normally the fastest integer type to access.
If you don't need to be portable, you can find out the size of an int with
sizeof and adapt the following code.
2)
#define GETBIT(x,in) ((in)[ ((x) >> 3) ] & 1<<((x) & 7))
The mod operator % can be slower than ANDing, and you don't need to subtract;
simply adjust your SETBIT routine to match.
Why not create your own wrapper class?
You could then add bits to the "array" using an operator such as + and get back the individual bits using the [] operator.
Your macro could be improved by using & 7 instead of % 8, but it's likely the compiler will make that optimisation for you anyway.
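A minimal sketch of such a wrapper over externally provided memory (names and interface are illustrative only; only reading is shown):
#include <cstddef>
#include <cstdint>

class BitArray {
public:
    explicit BitArray(std::uint8_t* data) : data_(data) {}
    // MSB-first within each byte, matching the question's GETBIT macro.
    bool operator[](std::size_t i) const {
        return (data_[i / 8] >> (7 - (i % 8))) & 1u;
    }
private:
    std::uint8_t* data_;
};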
I recently did exactly what you are doing and my stream could consist of any number of bits.
So I have something like the following:
BitStream< 1 > oneBitBitStream;
BitStream< 2 > twoBitBitStream;
oneBitBitStream += Bit_One;
oneBitBitStream += Bit_Zero;
twoBitBitStream += Bit_Three;
twoBitBitStream += Bit_One;
and so on. It makes for nice readable code and you can provide an STL-like interface to it to aid familiarity :)
Since the question is tagged with C++, is there any reason you can't simply use the standard bitset?
Instead of the unsigned char array and custom macros, you can use std::vector<bool>. The vector class template has a special template specialization for the bool type. This specialization is provided to optimize for space allocation: In this template specialization, each element occupies only one bit (which is eight times less than the smallest type in C++: char).
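For example:
#include <vector>

std::vector<bool> flags(2048);  // 2048 bits, packed into roughly 256 bytes
flags[42] = true;
bool b = flags[42];
Note that, unlike the macro in the question, std::vector<bool> owns its own storage, so an externally provided buffer would have to be copied into it rather than wrapped in place.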
I noticed while making a program that a lot of my int type variables never went above ten. I figure that, because an int is 2 bytes at the shortest (1 if you count char), I should be able to store 4 unsigned ints with a max value of 15 in a short int, and I know I can access each one individually using >> and <<:
short unsigned int SLWD = 11434;
S is (SLWD >> 12), L is ((SLWD << 4) >> 12),
W is ((SLWD << 8) >> 12), and D is ((SLWD << 12) >> 12)
However, I have no idea how to encompass this in a function or class, since any kind of GetVal() function would have to be of type int, which defeats the purpose of isolating the bits in the first place.
First, remember the Rules of Optimization. But this is possible in C or C++ using bitfields:
struct mystruct {
unsigned int smallint1 : 3; /* 3 bits wide, values 0 -- 7 */
signed int smallint2 : 4; /* 4 bits wide, values -8 -- 7 */
unsigned int boolean : 1; /* 1 bit wide, values 0 -- 1 */
};
It's worth noting that while you gain by not requiring so much storage, you lose because it becomes more costly to access everything, since each read or write now has a bunch of bit twiddling mechanics associated with it. Given that storage is cheap, it's probably not worth it.
Edit: You can also use vector<bool> to store 1-bit bools, but beware that it doesn't act like a normal vector! In particular, its iterators return proxy objects rather than bool&, so it doesn't satisfy the usual container requirements. It's sufficiently different that it's fair to say a vector<bool> is not actually a vector. Scott Meyers wrote very clearly on this topic in 'Effective STL'.
In C, and for the sole purpose of saving space, you can reinterpret the unsigned short as a structure with bitfields (or use such structure without messing with reinterpretations):
#include <stdio.h>
typedef struct bf_
{
unsigned x : 4;
unsigned y : 4;
unsigned z : 4;
unsigned w : 4;
} bf;
int main(void)
{
unsigned short i = 5;
bf *bitfields = (bf *) &i;
bitfields->w = 12;
printf("%d\n", bitfields->x);
// etc..
return 0;
}
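If you would rather avoid the pointer cast (which has aliasing and struct-size caveats), here is a sketch of the same idea with plain shifts and masks, where nibble index 0 is the lowest 4 bits (an assumption for illustration):
/* read nibble i (0..3) from v */
unsigned get_nibble(unsigned short v, unsigned i)
{
    return (v >> (4 * i)) & 0xFu;
}

/* return a copy of v with nibble i replaced by val */
unsigned short set_nibble(unsigned short v, unsigned i, unsigned val)
{
    return (unsigned short)((v & ~(0xFu << (4 * i))) | ((val & 0xFu) << (4 * i)));
}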
That's a very common technique. You usually allocate an array of the larger primitive type (e.g., ints or longs) and have some abstraction to deal with the mapping. If you're using an OO language, it's usually a good idea to define some sort of BitArray or SmartArray class and implement a getVal() that takes an index. The important thing is to make sure you hide the details of the internal representation (e.g., for when you move between platforms).
That being said, most mainstream languages already have this functionality available.
If you just want bits, WikiPedia has a good list.
If you want more than bits, you can still find something, or implement it yourself with a similar interface. Take a look at the Java BitSet for reference