I want to toggle a bit at a given 'offset'. I have tried using typedef to create a new type, BYTEBUF, with a variable named bitstream.
...
typedef struct{
char *data;
unsigned int nb_bytes;
unsigned long bitlength;
}BYTEBUF;
This is my type definition.
I want to toggle the bit at a given offset, and I tried using:
bitstream->data[offset]^=1
but many suggest that instead of "offset" it should be "offset/8".
(This is my first question, so please bear with any mistakes.)
You can simply use the std::bitset class from the standard library, which offers all the tools you need for manipulating bits. In your case you would use it like this:
// An array of bits of size 16
std::bitset<16> bits;
// Flip the 6th bit
bits.flip(5);
// Set the 6th bit to one
bits.set(5, true);
If you need to have a struct of variable size (which in your example is the case) then you could do something like this:
struct BYTES
{
char* bytes;
// Toggle the bit at the given position
// Note that I'm not checking for any overflow
// which you should definitely do
void toggle(const size_t position)
{
bytes[position/8] ^= 1 << (position % 8);
}
};
// I'm assuming everything has been allocated properly
BYTES b;
// Toggle the 14th bit
b.toggle(14);
The position/8 gives you the index into the array (as it is an array of char) and position%8 gives you the offset of the single bit inside one char. I would strongly advise you to do the arithmetic on paper yourself to see the picture here!
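For instance, a quick worked example of that arithmetic for the toggle(14) call above:
// For position 14:
//   14 / 8 == 1  -> the bit lives in bytes[1]
//   14 % 8 == 6  -> it is bit 6 within that byte
// so toggle(14) performs: bytes[1] ^= (1 << 6);   // mask 0x40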
If you want to toggle the bit corresponding to the integer offset, you can calculate:
int bytenum = (offset >> 3);
int bitnum = offset - (bytenum << 3);
Then assuming bitstream is of type BYTEBUF you can do:
bitstream.data[bytenum] ^= (1 << bitnum);
Obviously, you need to be careful that the bytenum is in range (within length of valid memory pointed to by data), that the object has been initialised/constructed properly, etc...
The dataFile.bin is a binary file with 6-byte records. The first 3
bytes of each record contain the latitude and the last 3 bytes contain
the longitude. Each 24 bit value represents radians multiplied by
0X1FFFFF
This is a task I've been working on. I haven't done C++ in years, so it's taking me way longer than I thought it would -_-. After googling around I saw this algorithm, which made sense to me.
int interpret24bitAsInt32(byte[] byteArray) {
int newInt = (
((0xFF & byteArray[0]) << 16) |
((0xFF & byteArray[1]) << 8) |
(0xFF & byteArray[2])
);
if ((newInt & 0x00800000) > 0) {
newInt |= 0xFF000000;
} else {
newInt &= 0x00FFFFFF;
}
return newInt;
}
The problem is a syntax issue; I am restricted to working within the way the other guy had programmed this. I am not understanding how I can store the CHAR "data" into an INT. Wouldn't it make more sense if "data" was an array? Since it's receiving 24 integers of information stored into a BYTE.
double BinaryFile::from24bitToDouble(char *data) {
int32_t iValue;
// ****************************
// Start code implementation
// Task: Fill iValue with the 24bit integer located at data.
// The first byte is the LSB.
// ****************************
//iValue +=
// ****************************
// End code implementation
// ****************************
return static_cast<double>(iValue) / FACTOR;
}
bool BinaryFile::readNext(DataRecord &record)
{
const size_t RECORD_SIZE = 6;
char buffer[RECORD_SIZE];
m_ifs.read(buffer,RECORD_SIZE);
if (m_ifs) {
record.latitude = toDegrees(from24bitToDouble(&buffer[0]));
record.longitude = toDegrees(from24bitToDouble(&buffer[3]));
return true;
}
return false;
}
double BinaryFile::toDegrees(double radians) const
{
static const double PI = 3.1415926535897932384626433832795;
return radians * 180.0 / PI;
}
I appreciate any help or hints even if you don't understand; a clue or hint will help me a lot. I just need to talk to someone.
I am not understanding how I can store the CHAR "data" into an INT.
Since char is a numeric type, there is no problem combining them into a single int.
Since it's receiving 24 integers of information stored into a BYTE
It's 24 bits, not bytes, so there are only three integer values that need to be combined.
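As a minimal sketch of that combination (matching the quoted algorithm's MSB-first order; sign extension is dealt with separately):
#include <cstdint>

int32_t combine(char b0, char b1, char b2) {
    // Masking with 0xFF keeps only the low 8 bits, so a char that was
    // sign-extended to a negative int contributes just its byte value.
    return ((b0 & 0xFF) << 16) | ((b1 & 0xFF) << 8) | (b2 & 0xFF);
}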
An easier way of producing the same result without using conditionals is as follows:
int interpret24bitAsInt32(byte[] byteArray) {
return (
(byteArray[0] << 24)
| (byteArray[1] << 16)
| (byteArray[2] << 8)
) >> 8;
}
The idea is to store the three bytes supplied as an input into the upper three bytes of the four-byte int, and then shift it down by one byte. This way the program would sign-extend your number automatically, avoiding conditional execution.
Note on portability: This code is not portable, because it assumes 32-bit integer size. To make it portable use <cstdint> types:
#include <array>
#include <cstdint>

int32_t interpret24bitAsInt32(const std::array<uint8_t, 3>& byteArray) {
    return (
        (static_cast<int32_t>(byteArray[0]) << 24)
      | (static_cast<int32_t>(byteArray[1]) << 16)
      | (static_cast<int32_t>(byteArray[2]) << 8)
    ) >> 8;
}
It also assumes that the most significant byte of the 24-bit number is stored in the initial element of byteArray, then comes the middle element, and finally the least significant byte.
Note on sign extension: This code automatically takes care of sign extension by constructing the value in the upper three bytes and then shifting it to the right, as opposed to constructing the value in the lower three bytes right away. This additional shift operation ensures that C++ takes care of sign-extending the result for us.
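Note that the assignment quoted earlier says the first byte is the LSB, while this code assumes the MSB comes first. A hedged sketch of the same trick adapted to LSB-first input:
#include <cstdint>

int32_t interpret24bitLSBFirst(const unsigned char* p) {
    // p[0] is the least significant byte; build the value in the upper
    // three bytes, then arithmetic-shift down to sign-extend.
    return ((static_cast<int32_t>(p[2]) << 24)
          | (static_cast<int32_t>(p[1]) << 16)
          | (static_cast<int32_t>(p[0]) << 8)) >> 8;
}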
When an unsigned char is cast to an int, the higher-order bits are filled with 0s.
When a signed char is cast to an int, the sign bit is extended.
ie:
int x;
char y;
unsigned char z;
y = 0xFF;
z = 0xFF;
x=y;
/*x will be 0xFFFFFFFF*/
x=z;
/*x will be 0x000000FF*/
So your algorithm uses 0xFF as a mask to remove C's sign extension, i.e.:
0xFF == 0x000000FF
0xABCDEF10 & 0x000000FF == 0x00000010
Then it uses bit shifts and bitwise ANDs to put the bits in their proper place.
Lastly it checks the most significant bit ((newInt & 0x00800000) > 0) to decide whether to fill the highest byte with zeros or ones.
int32_t upperByte = ((int32_t) dataRx[0] << 24);
int32_t middleByte = ((int32_t) dataRx[1] << 16);
int32_t lowerByte = ((int32_t) dataRx[2] << 8);
int32_t ADCdata32 = (((int32_t) (upperByte | middleByte | lowerByte)) >> 8); // Right-shift of signed data maintains the sign bit
I have a long list of numbers between 0 and 67600. Now I want to store them using an array that is 67600 elements long. An element is set to 1 if the number is in the set and to 0 if it is not, i.e. I need only 1 bit of information to store the presence of each number. Is there any hack in C/C++ that helps me achieve this?
In C++ you can use std::vector<bool> if the size is dynamic (it's a special case of std::vector, see this) otherwise there is std::bitset (prefer std::bitset if possible.) There is also boost::dynamic_bitset if you need to set/change the size at runtime. You can find info on it here, it is pretty cool!
In C (and C++) you can manually implement this with bitwise operators. A good summary of common operations is here. One thing I want to mention: it's a good idea to use unsigned integers when you are doing bit operations, because << and >> are undefined when shifting negative integers.

You will need to allocate an array of some integral type like uint32_t. If you want to store N bits, it will take N/32 of these uint32_ts (rounded up). Bit i is stored in the (i % 32)'th bit of the (i / 32)'th uint32_t. You may want to use a differently sized integral type depending on your architecture and other constraints.

Note: prefer using an existing implementation (e.g. as described in the first paragraph for C++; search Google for C solutions) over rolling your own (unless you specifically want to, in which case I suggest learning more about binary/bit manipulation from elsewhere before tackling this). This kind of thing has been done to death and there are "good" solutions.
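For illustration only (per the note above, prefer an existing implementation), a minimal sketch of the word-indexing scheme just described:
#include <cstddef>
#include <cstdint>
#include <vector>

struct BitArray {
    std::vector<uint32_t> words;
    explicit BitArray(std::size_t nbits) : words((nbits + 31) / 32, 0) {}
    // bit i lives in bit (i % 32) of word (i / 32)
    void set(std::size_t i)        { words[i / 32] |=  (uint32_t(1) << (i % 32)); }
    void clear(std::size_t i)      { words[i / 32] &= ~(uint32_t(1) << (i % 32)); }
    bool test(std::size_t i) const { return (words[i / 32] >> (i % 32)) & 1u; }
};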
There are a number of tricks that will maybe only consume one bit: e.g. arrays of bitfields (applicable in C as well), but whether less space actually gets used is up to the compiler. See this link.
Please note that whatever you do, you will almost surely never be able to use exactly N bits to store N bits of information - your computer very likely can't allocate less than 8 bits: if you want 7 bits you'll have to waste 1 bit, and if you want 9 you will have to take 16 bits and waste 7 of them. Even if your computer (CPU + RAM etc.) could "operate" on single bits, if you're running in an OS with malloc/new it would not be sane for your allocator to track data to such a small precision due to overhead. That last qualification was pretty silly - you won't find an architecture in use that allows you to operate on less than 8 bits at a time I imagine :)
You should use std::bitset.
std::bitset functions like an array of bool (actually like std::array, since it copies by value), but only uses 1 bit of storage for each element.
Another option is vector<bool>, which I don't recommend because:
It uses slower pointer indirection and heap memory to enable resizing, which you don't need.
That type is often maligned by standards-purists because it claims to be a standard container, but fails to adhere to the definition of a standard container*.
*For example, a standard-conforming function could expect &container.front() to produce a pointer to the first element of any container type, which fails with std::vector<bool>. Perhaps a nitpick for your usage case, but still worth knowing about.
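A minimal sketch for the 67600-number case, assuming std::bitset as recommended (the names record and seen are illustrative, not part of any API):
#include <bitset>

std::bitset<67601> present;   // indices 0..67600, all bits start at 0

void record(int n) { present.set(n); }         // mark n as present
bool seen(int n)   { return present.test(n); } // query membership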
There is in fact! std::vector<bool> has a specialization for this: http://en.cppreference.com/w/cpp/container/vector_bool
See the doc, it stores it as efficiently as possible.
Edit: as somebody else said, std::bitset is also available: http://en.cppreference.com/w/cpp/utility/bitset
If you want to write it in C, have an array of char long enough to hold 67601 bits (67601/8 rounds up to 8451 bytes) and then turn on/off the appropriate bit for each value.
Others have given the right idea. Here's my own implementation of a bitsarr, or 'array' of bits. An unsigned char is one byte, so it's essentially an array of unsigned chars that stores information in individual bits. I added the option of storing TWO or FOUR bit values in addition to ONE bit values, because those both divide 8 (the size of a byte), and would be useful if you want to store a huge number of integers that will range from 0-3 or 0-15.
When setting and getting, the math is done in the functions, so you can just give it an index as if it were a normal array--it knows where to look.
Also, it's the user's responsibility to not pass a value to set that's too large, or it will screw up other values. It could be modified so that overflow loops back around to 0, but that would just make it more convoluted, so I decided to trust myself.
#include<stdio.h>
#include <stdlib.h>
#define BYTE 8
typedef enum {ONE=1, TWO=2, FOUR=4} numbits;
typedef struct bitsarr{
unsigned char* buckets;
numbits n;
} bitsarr;
bitsarr new_bitsarr(int size, numbits n)
{
int b = sizeof(unsigned char)*BYTE;
int numbuckets = (size*n + b - 1)/b;
bitsarr ret;
ret.buckets = malloc(sizeof(*ret.buckets)*numbuckets);
ret.n = n;
return ret;
}
void bitsarr_delete(bitsarr xp)
{
free(xp.buckets);
}
void bitsarr_set(bitsarr *xp, int index, int value)
{
int buckdex, innerdex;
buckdex = index/(BYTE/xp->n);
innerdex = index%(BYTE/xp->n);
xp->buckets[buckdex] = (value << innerdex*xp->n) | ((~(((1 << xp->n) - 1) << innerdex*xp->n)) & xp->buckets[buckdex]);
//longer version
/*unsigned int width, width_in_place, zeros, old, newbits, new;
width = (1 << xp->n) - 1;
width_in_place = width << innerdex*xp->n;
zeros = ~width_in_place;
old = xp->buckets[buckdex];
old = old & zeros;
newbits = value << innerdex*xp->n;
new = newbits | old;
xp->buckets[buckdex] = new; */
}
int bitsarr_get(bitsarr *xp, int index)
{
int buckdex, innerdex;
buckdex = index/(BYTE/xp->n);
innerdex = index%(BYTE/xp->n);
return ((((1 << xp->n) - 1) << innerdex*xp->n) & (xp->buckets[buckdex])) >> innerdex*xp->n;
//longer version
/*unsigned int width = (1 << xp->n) - 1;
unsigned int width_in_place = width << innerdex*xp->n;
unsigned int val = xp->buckets[buckdex];
unsigned int retshifted = width_in_place & val;
unsigned int ret = retshifted >> innerdex*xp->n;
return ret; */
}
int main()
{
bitsarr x = new_bitsarr(100, FOUR);
for(int i = 0; i<16; i++)
bitsarr_set(&x, i, i);
for(int i = 0; i<16; i++)
printf("%d\n", bitsarr_get(&x, i));
for(int i = 0; i<16; i++)
bitsarr_set(&x, i, 15-i);
for(int i = 0; i<16; i++)
printf("%d\n", bitsarr_get(&x, i));
bitsarr_delete(x);
}
Is it even possible to create an array of bits with more than 100000000 elements? If it is, how would I go about doing this? I know that for a char array I can do this:
char* array;
array = (char*)malloc(100000000 * sizeof(char));
If I were to declare the array as char array[100000000] then I would get a segmentation fault, since the maximum number of elements would be exceeded, which is why I use malloc.
Is there something similar I can do for an array of bits?
If you are using C++, std::vector<bool> is specialized to pack elements into a bit map. Of course, if you are using C++, you need to stop using malloc.
You could try looking at boost::dynamic_bitset. Then you could do something like the following (taken from Boost's example page):
boost::dynamic_bitset<> x(100000000); // all 0's by default
x[0] = 1;
x[1] = 1;
x[4] = 1;
The bitset will use a single bit for each element so you can store 32 items in the space of 4 bytes, decreasing the amount of memory required considerably.
In C and C++, char is the smallest type. You can't directly declare an array of bits. However, since an array of any basic type is fundamentally made of bits, you can emulate them, something like this (code untested):
#include <limits.h>
#include <stdlib.h>

#define BITS_PER_UNSIGNED (sizeof(unsigned) * CHAR_BIT)

unsigned *array;
array = (unsigned *) malloc((100000000 / BITS_PER_UNSIGNED + 1) * sizeof(unsigned));

/* Retrieves the value in bit i */
#define GET_BIT(array, i) ((array)[(i) / BITS_PER_UNSIGNED] & (1u << ((i) % BITS_PER_UNSIGNED)))

/* Sets bit i to true */
#define SET_BIT(array, i) ((array)[(i) / BITS_PER_UNSIGNED] |= (1u << ((i) % BITS_PER_UNSIGNED)))

/* Sets bit i to false */
#define CLEAR_BIT(array, i) ((array)[(i) / BITS_PER_UNSIGNED] &= ~(1u << ((i) % BITS_PER_UNSIGNED)))
The segmentation fault you noticed is due to running out of stack space. Of course you can't declare a local variable that is 12.5 MB in size (100 million bits), let alone 100MB in size (100 million bytes) in a thread with a stack of ~ 4 MB. Should work as a global variable, although then you may end up with a 12 or 100 MB executable file -- still not a good idea. Dynamic allocation is definitely the right thing to do for large buffers like that.
If it is allowed to use STL, then I would use std::bitset.
(For 100,000,000 bits, it would use 100000000 / 32 unsigned int underneath, each storing 32 bits.)
std::vector<bool>, already mentioned, is another good solution.
There are a few approaches to creating a bitmap in C++.
If you already know the size of bitmap at compile time, you can use the STL, std::bitset template.
This is how you would do it with bitset
std::bitset<100000000> array; // declare it static, global, or heap-allocated: ~12.5 MB is too big for the stack
Otherwise, if the size of the bitmap changes dynamically during runtime, you can use std::vector<bool> or boost::dynamic_bitset as recommended here http://en.cppreference.com/w/cpp/utility/bitset (See note at the bottom)
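A minimal sketch of the runtime-sized option (std::vector<bool> packs its elements into bits internally):
#include <vector>

std::vector<bool> bits(100000000, false); // ~12.5 MB, heap-allocated
bits[12345] = true;                       // set a bit
bool b = bits[12345];                     // test a bit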
Yes, but it's going to be a little more complicated!
The better way to store bits is to pack them into the chars themselves!
So you can store 8 bits in a char,
which will "only" require 12,500,000 bytes!
Here is some documentation about binary manipulation: http://www.somacon.com/p125.php
You should also look on Google :)
Other solution:
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t DWORD; /* as on Windows */

unsigned char * array;
array = (unsigned char *) malloc ( 100000000 / CHAR_BIT + 1 );

/* If set is true, turns on the bit at 'position' and returns false;
   otherwise returns whether that bit is currently on. */
bool MapBit ( unsigned char arraybit[], DWORD position, bool set)
{
    //works for bit positions 0 to 4294967295
    //calc byte and bit position
    DWORD bytepos = ( position / 8 );
    unsigned char bitpos = ( position % 8 );
    //build the bit mask
    unsigned char bit = (unsigned char)(0x01 << bitpos);
    if ( set )
    {
        arraybit [ bytepos ] |= bit;
    }
    else
    {
        //get
        if ( arraybit [ bytepos ] & bit )
            return true;
    }
    return false;
}
I'm fond of the bitarray that's in the open source fxt library at http://www.jjj.de/fxt/. It's simple, efficient and contained in a few headers, so it's easy to add to your project. Plus there's many complementary functions to use with the bitarray (see http://www.jjj.de/bitwizardry/bitwizardrypage.html).
In C/C++, is there an easy way to apply bitwise operators (specifically left/right shifts) to dynamically allocated memory?
For example, let's say I did this:
unsigned char * bytes=new unsigned char[3];
bytes[0]=1;
bytes[1]=1;
bytes[2]=1;
I would like a way to do this:
bytes>>=2;
(then the 'bytes' would have the following values):
bytes[0]==0
bytes[1]==64
bytes[2]==64
Why the values should be that way:
After allocation, the bytes look like this:
[00000001][00000001][00000001]
But I'm looking to treat the bytes as one long string of bits, like this:
[000000010000000100000001]
A right shift by two would cause the bits to look like this:
[000000000100000001000000]
Which finally looks like this when separated back into the 3 bytes (thus the 0, 64, 64):
[00000000][01000000][01000000]
Any ideas? Should I maybe make a struct/class and overload the appropriate operators? Edit: If so, any tips on how to proceed? Note: I'm looking for a way to implement this myself (with some guidance) as a learning experience.
I'm going to assume you want bits carried from one byte to the next, as John Knoeller suggests.
The requirements here are insufficient. You need to specify the order of the bits relative to the order of the bytes: when the least significant bit falls out of one byte, does it go to the next higher or next lower byte?
What you are describing, though, used to be very common for graphics programming. You have basically described a monochrome bitmap horizontal scrolling algorithm.
Assuming that "right" means higher addresses but less significant bits (ie matching the normal writing conventions for both) a single-bit shift will be something like...
void scroll_right (unsigned char* p_Array, int p_Size)
{
unsigned char orig_l = 0;
unsigned char orig_r;
unsigned char* dest = p_Array;
while (p_Size > 0)
{
p_Size--;
orig_r = *p_Array++;
*dest++ = (orig_l << 7) + (orig_r >> 1);
orig_l = orig_r;
}
}
Adapting the code for variable shift sizes shouldn't be a big problem (a sketch follows below). There are obvious opportunities for optimisation (e.g. doing 2, 4 or 8 bytes at a time) but I'll leave that to you.
To shift left, though, you should use a separate loop which should start at the highest address and work downwards.
If you want to expand "on demand", note that the orig_l variable contains the last byte above. To check for an overflow, check if (orig_l << 7) is non-zero. If your bytes are in an std::vector, inserting at either end should be no problem.
EDIT I should have said - optimising to handle 2, 4 or 8 bytes at a time will create alignment issues. When reading 2-byte words from an unaligned char array, for instance, it's best to do the odd byte read first so that later word reads are all at even addresses up until the end of the loop.
On x86 this isn't necessary, but it is a lot faster. On some processors it's necessary. Just do a switch based on the base (address & 1), (address & 3) or (address & 7) to handle the first few bytes at the start, before the loop. You also need to special case the trailing bytes after the main loop.
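A hedged sketch of the variable-shift adaptation mentioned above (same carry scheme as scroll_right, assuming 0 < n < 8):
void scroll_right_n (unsigned char* p_Array, int p_Size, int n)
{
    unsigned char orig_l = 0;
    unsigned char orig_r;
    unsigned char* dest = p_Array;
    while (p_Size > 0)
    {
        p_Size--;
        orig_r = *p_Array++;
        /* the low n bits of the previous byte become the high bits here */
        *dest++ = (unsigned char)((orig_l << (8 - n)) | (orig_r >> n));
        orig_l = orig_r;
    }
}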
Decouple the allocation from the accessor/mutators
Next, see if a standard container like bitset can do the job for you
Otherwise check out boost::dynamic_bitset
If all fails, roll your own class
Rough example:
#include <limits.h>
#include <stddef.h>

typedef unsigned char byte;

/* Extract the low bitcount bits of value. */
byte extract(byte value, int bitcount)
{
    return (byte)(value & ((1u << bitcount) - 1u));
}

byte *right_shift(byte *bytes, size_t nbytes, size_t n) {
    byte rollover = 0;
    for (size_t i = 0; i < nbytes; ++i) {
        /* grab the bits that will fall off before shifting this byte */
        byte next_rollover = extract(bytes[ i ], n);
        bytes[ i ] = (byte)((bytes[ i ] >> n) | (rollover << (CHAR_BIT - n)));
        rollover = next_rollover;
    }
    return &bytes[ 0 ];
}
Here's how I would do it for two bytes:
unsigned int rollover = byte[0] & 0x3;
byte[0] >>= 2;
byte[1] = byte[1] >> 2 | (rollover << 6);
From there, you can generalize this into a loop for n bytes. For flexibility, you will want to generate the magic numbers (0x3 and 6) rather than hardcode them.
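For instance, one hedged way to derive those magic numbers from a shift amount s:
unsigned s = 2;                  // shift amount, 1..7
unsigned mask = (1u << s) - 1u;  // 0x3 when s == 2: the bits that roll over
unsigned up   = 8 - s;           // 6 when s == 2: where they land in the next byte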
I'd look into something similar to this:
#define number_of_bytes 3
template<size_t num_bytes>
union MyUnion
{
char bytes[num_bytes];
__int64 ints[num_bytes / sizeof(__int64) + 1];
};
int main()
{
MyUnion<number_of_bytes> mu;
mu.bytes[0] = 1;
mu.bytes[1] = 1;
mu.bytes[2] = 1;
mu.ints[0] >>= 2;
}
Just play with it. You'll get the idea I believe.
Operator overloading is syntactic sugar. It's really just a way of calling a function and passing your byte array without having it look like you are calling a function.
So I would start by writing this function
unsigned char * ShiftBytes(unsigned char * bytes, size_t count_of_bytes, int shift);
Then if you want to wrap this up in an operator overload in order to make it easier to use or because you just prefer that syntax, you can do that as well. Or you can just call the function.
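A hedged sketch of that wrapper, assuming a hypothetical owning class (allocation and ShiftBytes's body omitted):
#include <cstddef>

unsigned char * ShiftBytes(unsigned char * bytes, std::size_t count_of_bytes, int shift);

struct ByteBuffer {
    unsigned char* bytes;
    std::size_t count;

    // the overload just forwards to the plain function above
    ByteBuffer& operator>>=(int shift) {
        ShiftBytes(bytes, count, shift);
        return *this;
    }
};

// usage: buf >>= 2;  // reads like the built-in operator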
The short version is: how do I learn the size (in bits) of an individual bit field of a C++ struct?
To clarify, an example of the field I am talking about:
struct Test {
unsigned field1 : 4; // takes up 4 bits
unsigned field2 : 8; // 8 bits
unsigned field3 : 1; // 1 bit
unsigned field4 : 3; // 3 bits
unsigned field5 : 16; // 16 more to make it a 32 bit struct
int normal_member; // normal struct variable member, 4 bytes on my system
};
Test t;
t.field1 = 1;
t.field2 = 5;
// etc.
To get the size of the entire Test object is easy, we just say
sizeof(Test); // returns 8, for 8 bytes total size
We can get a normal struct member through
sizeof(((Test*)0)->normal_member); // returns 4 (on my system)
I would like to know how to get the size of an individual field, say Test::field4. The above example for a normal struct member does not work. Any ideas? Or does someone know a reason why it cannot work? I am fairly convinced that sizeof will not be of help since it only returns size in bytes, but if anyone knows otherwise I'm all ears.
Thanks!
You can calculate the size at run time, fwiw, e.g.:
//instantiate
Test t;
//fill all bits in the field
t.field1 = ~0;
//extract to unsigned integer
unsigned int i = t.field1;
... TODO use contents of i to calculate the bit-width of the field ...
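One hedged way to finish that TODO, counting the bits that survived the round-trip:
//count how many bits it takes to represent i
unsigned int width = 0;
while (i) {
    ++width;
    i >>= 1;
}
//width is now 4 for field1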
You cannot take the sizeof a bit field and get the number of bits.
Your best bet would be to use #defines or enums:
struct Test {
enum Sizes {
sizeof_field1 = 4,
sizeof_field2 = 8,
sizeof_field3 = 1,
sizeof_field4 = 3,
sizeof_field5 = 16,
};
unsigned field1 : sizeof_field1; // takes up 4 bits
unsigned field2 : sizeof_field2; // 8 bits
unsigned field3 : sizeof_field3; // 1 bit
unsigned field4 : sizeof_field4; // 3 bits
unsigned field5 : sizeof_field5; // 16 more to make it a 32 bit struct
int normal_member; // normal struct variable member, 4 bytes on my system
};
printf("%d\n", Test::sizeof_field1); // prints 4
For the sake of consistency, I believe you can move normal_member up to the top and add an entry in Sizes using sizeof(normal_member). This messes with the order of your data, though.
Seems unlikely, since sizeof() is in bytes, and you want bits.
http://en.wikipedia.org/wiki/Sizeof
Building on the bit-counting answer, you can use:
http://www-graphics.stanford.edu/~seander/bithacks.html
Using ChrisW's idea (nice, by the way), you can create a helper macro:
#define SIZEOF_BITFIELD(class,member,out) { \
class tmp_; \
tmp_.member = ~0; \
unsigned int tmp2_ = tmp_.member; \
++tmp2_; \
out = log2(tmp2_); \
}
#include <limits.h>

unsigned int log2(unsigned int x) {
    // Overflow occurred.
    if(!x) {
        return sizeof(unsigned int) * CHAR_BIT;
    }
    // Some bit twiddling... Exploiting the fact that floats use base 2 and store the exponent. Assumes 32-bit IEEE.
    float f = (float)x;
    return (*(unsigned int *)&f >> 23) - 0x7f;
}
Usage:
size_t size;
SIZEOF_BITFIELD(Test, field1, size); // Class of the field, field itself, output variable.
printf("%d\n", size); // Prints 4.
My attempts to use templated functions have failed. I'm not an expert on templates, however, so it may still be possible to have a clean method (e.g. sizeof_bitfield(Test::field1)).
I don't think you can do it. If you really need the size, I suggest you use a #define (or, better yet, if possible a const variable -- I'm not sure if that's legal) like so:
#define TEST_FIELD1_SIZE 4
struct Test {
unsigned field1 : TEST_FIELD1_SIZE;
...
};
This is not possible.
Answer to comment:
Because the type is just an int, there is no 'bit' type. The bit field assignment syntax is just shorthand for performing the bitwise code for reads and writes.
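For illustration, roughly what that shorthand expands to (hedged: the actual layout and storage unit are implementation-defined; this assumes field4 from the question sits at bit offset 13 of a 32-bit word):
unsigned storage; // the int that actually holds the bit fields

// t.field4 = 3; behaves roughly like:
storage = (storage & ~(0x7u << 13)) | ((3u & 0x7u) << 13);

// and int v = t.field4; behaves roughly like:
int v = (storage >> 13) & 0x7;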