Keeping track of boolean data - c++

I need to keep track of n samples. The information I am keeping track of is of boolean type, i.e. something is true or false. As soon as I am on sample n+1, I basically want to ignore the oldest sample and record information about the newest one.
So say I keep track of 5 samples, I may have something like
OLDEST 0 0 1 1 0 NEWEST
If the next sample is 1, this will become
OLDEST 0 1 1 0 1 NEWEST
If the next one is 0, this will become
OLDEST 1 1 0 1 0 NEWEST
So what is the best way to implement this in terms of simplicity and memory?
Some ideas I had:
Vector of bool (this would require shifting elements, so it seems expensive)
Storing it as bits and using bit shifting (cheap memory-wise, but is there a limit on the number of samples?)
Linked lists? (might be overkill for the task)
Thanks for the ideas and suggestions :)

You want a set of bits. Maybe you can look into a std::bitset
http://www.sgi.com/tech/stl/bitset.html
Very straightforward to use, optimal memory consumption and probably the best performance.
The only limitation is that you need to know the value of n at compile time. If you want to set it at runtime, have a look at boost http://www.boost.org/doc/libs/1_36_0/libs/dynamic_bitset/dynamic_bitset.html
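To make the bitset suggestion concrete, here is a minimal sketch of a sliding window on top of std::bitset. The class name SampleWindow and its member names are made up for illustration; only std::bitset itself comes from the answer above. It assumes bit 0 holds the newest sample.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical helper: keeps the N most recent boolean samples.
// Bit 0 is the newest sample, bit N-1 the oldest.
template <std::size_t N>
class SampleWindow {
public:
    // Shift every sample one position toward "older"; the oldest falls off.
    void record(bool sample) {
        bits_ <<= 1;
        bits_.set(0, sample);
    }

    // age 0 is the newest sample, age N-1 the oldest.
    bool at(std::size_t age) const {
        assert(age < N);
        return bits_.test(age);
    }

    // Number of samples in the window that are true.
    std::size_t count_true() const { return bits_.count(); }

private:
    std::bitset<N> bits_;
};
```

With SampleWindow<5>, recording 0 0 1 1 0 and then 1 leaves the window at OLDEST 0 1 1 0 1 NEWEST, as in the question. The shift touches every word of the bitset, which is cheap for small N but not free for very large windows.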

Sounds like a perfect use of a ring buffer. Unfortunately there isn't one in the standard library, but you could use boost.
Alternatively, roll your own using a fixed-length std::list and splice the head node to the tail when you need to overwrite an old element.

It really depends on how many samples you want to keep.
vector<bool> could be a valid option; I would expect an
erase() on the first element to be reasonably efficient.
Otherwise, there's deque<bool>. If you know how many elements
you want to keep at compile time, bitset<N> is probably better
than either.
In any case, you'll have to wrap the standard container in some
additional logic; none have the actual logic you need (that of
a ring buffer).

If you only need 8 bits, use an unsigned char, shift with "<<" and ">>", and apply a mask to look at the bit you need.
16 bits - unsigned short
32 bits - unsigned int
64 bits - unsigned long long
etc...
Example:
Oldest 00110010 Newest -> Oldest 01100101 Newest
Done by:
unsigned char c = 0x32; // 50 decimal or 00110010 in binary
c <<= 1; // Logical shift left once; the oldest bit falls off the top.
c |= 1; // Set the LSB, since the LSB is the newest sample.
// Now look at the 3rd newest bit (bit 2)
printf("The 3rd newest bit is: %d\n", (c >> 2) & 1);
Simple and EXTREMELY cheap on resources. Will be VERY VERY high performance.

From your question, it's not clear what you intend to do with the samples. If all you care about is storing the N most recent samples, you could try the following. I'll do it for "chars" and let you figure out how to optimize for "bool" should you need that.
char buffer[N];
int samples = 0;
void record_sample( char value )
{
buffer[samples%N] = value;
samples = samples + 1;
}
Once you've stored N samples (once you've called record_sample N times) you can read the oldest and newest samples like so:
char oldest_sample()
{
return buffer[samples%N];
}
char newest_sample()
{
return buffer[(samples+N-1)%N];
}
Things get a little trickier if you intend to read the oldest sample before you've stored N samples, but not that much trickier. For that, you want a "ring buffer", which you can find in boost and on Wikipedia.
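For the "fewer than N samples so far" case, a ring buffer only needs one extra counter. The following is a rough sketch, not the boost implementation; the class name BoolRing and its members are invented here, and it stores one byte per sample for simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical ring buffer holding at most `capacity` boolean samples.
class BoolRing {
public:
    explicit BoolRing(std::size_t capacity)
        : data_(capacity, 0), head_(0), size_(0) {
        assert(capacity > 0);
    }

    // Overwrites the oldest sample once the buffer is full.
    void record(bool value) {
        data_[head_] = value ? 1 : 0;
        head_ = (head_ + 1) % data_.size();
        if (size_ < data_.size()) ++size_;
    }

    // Number of valid samples stored so far (at most the capacity).
    std::size_t size() const { return size_; }

    // Valid only when size() > 0.
    bool newest() const {
        assert(size_ > 0);
        return data_[(head_ + data_.size() - 1) % data_.size()] != 0;
    }

    // Valid only when size() > 0.
    bool oldest() const {
        assert(size_ > 0);
        // Before the buffer wraps, the oldest sample sits at index 0.
        const std::size_t index = (size_ < data_.size()) ? 0 : head_;
        return data_[index] != 0;
    }

private:
    std::vector<unsigned char> data_;
    std::size_t head_;  // next slot to write
    std::size_t size_;  // number of valid samples
};
```

Keeping a separate size_ instead of an ever-growing sample counter also avoids the counter overflowing on very long runs.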

Related

Using part of a variable as bool

Let's say memory is precious, and I have a class with a uint32_t member variable ui and I know that the values will stay below 1 million. The class also has some bool members.
Does it make sense to use the highest (highest 2,3,..) bit(s) of ui in order to save memory, since bool is 1 byte?
If it does make sense, what is the most efficient way to get the highest (leftmost?) bit (or 2nd)? I read a few old threads and there seems to be disagreement about using inline ASM or some sort of shift.
It's a bit dangerous to use some of the bits as a bool. Because of the way numbers are stored in binary, it is harder to keep that scheme correct.
Negative numbers are stored in two's complement. Check this for more explanation. You might set the number to 10, then flip the bool bit from false to true, and the number may turn into a huge negative number as a result.
As for getting if n-th bit is 0 or 1 you can use this, where 0-th bit is the right most:
int nth_bit(int a, int n){
return (a >> n) & 1;
}
It will return 0 or 1 identifying the n-th bit.
Well, if the memory is in fact precious, you should look deeper.
1,000,000 uses only 20 bits. This is less than 3 bytes. So you can allocate 3 bytes to keep your value and up to four booleans. Obviously, access will be a bit more complicated, but you save 25% of memory!
If you know that the values are below 524,287, for example, you can save another 15% by packing it (with bool) into 20 bits :)
Also, keeping bool in a separate array (as you said in a comment) would kill performance if you need to access the value and a corresponding bool simultaneously because they are far apart and will likely never be in a cache.
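As a sketch of what the packing in the question could look like with plain shifts and masks (no inline ASM): the layout below, with 20 value bits and 12 flag bits, and all function names are assumptions made for illustration. Since the member is a uint32_t, setting the top bit cannot make it negative.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout: bits 0..19 hold the value (below 2^20 = 1,048,576),
// bits 20..31 are free for up to 12 boolean flags.
const std::uint32_t kValueBits = 20;
const std::uint32_t kValueMask = (std::uint32_t(1) << kValueBits) - 1;

// Extracts the numeric value, ignoring the flag bits.
inline std::uint32_t get_value(std::uint32_t packed) {
    return packed & kValueMask;
}

// Returns `packed` with the value replaced and the flags untouched.
inline std::uint32_t set_value(std::uint32_t packed, std::uint32_t value) {
    assert(value <= kValueMask);
    return (packed & ~kValueMask) | value;
}

// flag 0 lives in bit 20, flag 11 in bit 31 (the highest bit).
inline bool get_flag(std::uint32_t packed, unsigned flag) {
    assert(flag < 32 - kValueBits);
    return ((packed >> (kValueBits + flag)) & 1u) != 0;
}

// Returns `packed` with one flag set or cleared and the value untouched.
inline std::uint32_t set_flag(std::uint32_t packed, unsigned flag, bool on) {
    assert(flag < 32 - kValueBits);
    const std::uint32_t bit = std::uint32_t(1) << (kValueBits + flag);
    return on ? (packed | bit) : (packed & ~bit);
}
```

Every read of the value now costs one extra AND, which is usually negligible next to a cache miss.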

Fast code for searching bit-array for contiguous set/clear bits?

Is there some reasonably fast code out there which can help me quickly search a large bitmap (a few megabytes) for runs of contiguous zero or one bits?
By "reasonably fast" I mean something that can take advantage of the machine word size and compare entire words at once, instead of doing bit-by-bit analysis which is horrifically slow (such as one does with vector<bool>).
It's very useful for e.g. searching the bitmap of a volume for free space (for defragmentation, etc.).
Windows has an RTL_BITMAP data structure one can use along with its APIs.
But I needed the code for this sometime ago, and so I wrote it here (warning, it's a little ugly):
https://gist.github.com/3206128
I have only partially tested it, so it might still have bugs (especially on reverse). But a recent version (only slightly different from this one) seemed to be usable for me, so it's worth a try.
The fundamental operation for the entire thing is being able to -- quickly -- find the length of a run of bits:
long long GetRunLength(
const void *const pBitmap, unsigned long long nBitmapBits,
long long startInclusive, long long endExclusive,
const bool reverse, /*out*/ bool *pBit);
Everything else should be easy to build upon this, given its versatility.
I tried to include some SSE code, but it didn't noticeably improve the performance. However, in general, the code is many times faster than doing bit-by-bit analysis, so I think it might be useful.
It should be easy to test if you can get a hold of vector<bool>'s buffer somehow -- and if you're on Visual C++, then there's a function I included which does that for you. If you find bugs, feel free to let me know.
I can't figure out how to do this well directly on memory words, so I've made up a quick solution that works on bytes; for convenience, let's sketch the algorithm for counting contiguous ones:
Construct two tables of size 256 in which you store, for each number between 0 and 255, the number of consecutive 1s at the beginning of the byte and at the end of the byte. For example, for the number 167 (10100111 in binary), put 1 in the first table and 3 in the second table. Let's call the first table BBeg and the second table BEnd. Then, for each byte b, there are two cases: if it is 255, add 8 to the length of the current contiguous run of ones, and you are still in a region of ones. Otherwise, you end the current region with BBeg[b] bits and begin a new one with BEnd[b] bits.
Depending on what information you want, you can adapt this algorithm (this is a reason why I don't put here any code, I don't know what output you want).
A flaw is that it does not count (small) contiguous set of ones inside one byte ...
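Here is one possible reading of the BBeg/BEnd idea, as a sketch that finds the longest run of ones. It assumes bits are read most significant bit first within each byte (so 167 = 10100111 begins with one 1 and ends with three), and the helper names are made up. It keeps the flaw mentioned above: runs strictly inside a single byte are not seen.

```cpp
#include <cassert>
#include <cstddef>

// Number of consecutive 1 bits starting at the most significant bit.
inline unsigned leading_ones(unsigned char b) {
    unsigned n = 0;
    for (unsigned mask = 0x80; mask != 0 && (b & mask); mask >>= 1) ++n;
    return n;
}

// Number of consecutive 1 bits ending at the least significant bit.
inline unsigned trailing_ones(unsigned char b) {
    unsigned n = 0;
    for (unsigned mask = 0x01; mask <= 0x80 && (b & mask); mask <<= 1) ++n;
    return n;
}

// The two 256-entry tables described in the answer.
struct RunTables {
    unsigned char BBeg[256];
    unsigned char BEnd[256];
    RunTables() {
        for (unsigned v = 0; v < 256; ++v) {
            const unsigned char b = static_cast<unsigned char>(v);
            BBeg[v] = static_cast<unsigned char>(leading_ones(b));
            BEnd[v] = static_cast<unsigned char>(trailing_ones(b));
        }
    }
};

// Longest run of ones that touches a byte boundary or fills whole bytes.
inline std::size_t longest_run_of_ones(const unsigned char* data,
                                       std::size_t size) {
    static const RunTables tables;
    std::size_t best = 0, current = 0;
    for (std::size_t i = 0; i < size; ++i) {
        const unsigned char b = data[i];
        if (b == 255) {
            current += 8;               // still inside a region of ones
        } else {
            current += tables.BBeg[b];  // the region ends in this byte
            if (current > best) best = current;
            current = tables.BEnd[b];   // a new region may start here
        }
    }
    return current > best ? current : best;
}
```

For the bytes 0x0F 0xFF 0xF0 this reports a run of 16, built from 4 + 8 + 4 bits.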
Besides this algorithm, a friend tells me that if it is for disk compression, you can just look for bytes different from 0 (empty disk area) and 255 (full disk area). It is a quick heuristic to build a map of which blocks you have to compress. Maybe it is beyond the scope of this topic ...
Sounds like this might be useful:
http://www.aggregate.org/MAGIC/#Population%20Count%20%28Ones%20Count%29
and
http://www.aggregate.org/MAGIC/#Leading%20Zero%20Count
You don't say if you wanted to do some sort of RLE or to simply count in-bytes zeros and one bits (like 0b1001 should return 1x1 2x0 1x1).
A lookup table plus a SWAR algorithm for a fast check might give you that information easily.
A bit like this:
// Assumes: typedef unsigned char byte; uint is a 32-bit unsigned type;
// bitmapSize is the number of words in the bitmap.
byte lut[0x10000] = { /* see below */ };
for (uint * word = words; word < words + bitmapSize; word++) {
    if (*word == 0 || *word == (uint)-1) // Fast bailout
    {
        // Do what you want if all 0 or all 1
        continue;
    }
    byte hiVal = lut[*word >> 16], loVal = lut[*word & 0xFFFF];
    // Do what you want with hiVal and loVal
}
The LUT will have to be constructed depending on your intended algorithm. If you want to count the number of contiguous 0s and 1s in the word, you'll build it like this:
for (int i = 0; i < sizeof(lut); i++)
lut[i] = countContiguousZero(i); // Or countContiguousOne(i)
// The implementation of countContiguousZero can be slow, you don't care
// The result should be the largest number of contiguous zeros (0 to 15, using the 4 low bits of the byte), and might also carry the position of the run in the 4 high bits of the byte
// Dismissing word == 0 does not rule out one 16-bit half being all zeros (e.g. 0x0000FFFF), so a run of 16 still needs handling, for example by checking the half against 0 before the lookup.
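The table construction could be sketched as follows. This is only one way to fill it: countContiguousZero is the name used in the answer, but its body here is a guess at the intent, and for simplicity each entry stores the plain run length (0 to 16) in a whole byte rather than the 4-bit length plus 4-bit position packing hinted at above. buildZeroRunLut is a made-up name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef unsigned char byte;

// Longest run of zero bits in the low 16 bits of `value` (0 to 16).
// Slow and simple on purpose: it only runs while building the table.
inline unsigned countContiguousZero(unsigned value) {
    unsigned best = 0, current = 0;
    for (unsigned bit = 0; bit < 16; ++bit) {
        if ((value >> bit) & 1u) {
            current = 0;
        } else {
            ++current;
            if (current > best) best = current;
        }
    }
    return best;
}

// Builds the 65536-entry table once; each entry stores the plain count.
inline std::vector<byte> buildZeroRunLut() {
    std::vector<byte> lut(0x10000);
    for (std::size_t i = 0; i < lut.size(); ++i)
        lut[i] = static_cast<byte>(countContiguousZero(static_cast<unsigned>(i)));
    return lut;
}
```

Note that this gives the longest run inside each 16-bit half only; joining runs across halves and across words needs the leading and trailing counts as well.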

C++: I need some guidance in how to create dynamic sized bitmaps

I'm trying to create a simple DBMS and although I've read a lot about it and have already designed the system, I have some issues about the implementation.
I need to know what's the best method in C++ to use a series of bits whose length will be dynamic. This series of bits will be saved in order to figure out which pages in the files are free and not free. For a single file the number of pages used will be fixed, so I can probably use a bitset for that. However the number of records per page AND file will not be fixed. So I don't think bitset would be the best way to do this.
I thought maybe to just use a sequence of characters, since each character is 1 byte = 8 bits maybe if I use an array of them I would be able to create the bit map that I want.
I never had to manipulate bits at such a low level, so I don't really know if there is some other better method to do this, or even if this method would work at all.
thanks in advance
If you are just wanting the basics on the bit twiddling, the following is one way of doing it using an array of characters.
Assume you have an array for the bits (the length needs to be (totalitems + 7) / 8, i.e. totalitems / 8 rounded up):
unsigned char *bits; // this of course needs to be allocated somewhere
You can compute the index into the array and the specific bit within that position as follows:
// compute array position
int pos = item / 8; // 8 bits per byte
// compute the bit within the byte. Could use "item & 7" for the same
// result, however modern compilers will typically already make
// that optimization.
int bit = item % 8;
And then you can check if a bit is set with the following (assumes zero-based indexing):
if ( bits[pos] & ( 1 << bit ))
return 1; // it is set
else
return 0; // it is not set
The following will set a specific bit:
bits[pos] |= ( 1 << bit );
And the following can be used to clear a specific bit:
bits[pos] &= ~( 1 << bit );
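Putting those pieces together for the dynamic-size case, a small wrapper might look like this. It is only a sketch: the class name Bitmap and its interface are invented, and it uses a std::vector of bytes so the bitmap can grow at runtime.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical wrapper: a growable bitmap stored in a vector of bytes.
class Bitmap {
public:
    explicit Bitmap(std::size_t bit_count = 0) : bit_count_(0) {
        resize(bit_count);
    }

    // Grows or shrinks the bitmap; newly added bits start cleared.
    void resize(std::size_t bit_count) {
        bytes_.resize((bit_count + 7) / 8, 0);  // round up to whole bytes
        if (bit_count % 8 != 0) {
            // Clear the unused bits of the last byte so they cannot
            // reappear as set bits after a later grow.
            bytes_.back() &= static_cast<unsigned char>((1u << (bit_count % 8)) - 1);
        }
        bit_count_ = bit_count;
    }

    std::size_t size() const { return bit_count_; }

    bool test(std::size_t item) const {
        assert(item < bit_count_);
        return ((bytes_[item / 8] >> (item % 8)) & 1u) != 0;
    }

    void set(std::size_t item) {
        assert(item < bit_count_);
        bytes_[item / 8] |= static_cast<unsigned char>(1u << (item % 8));
    }

    void clear(std::size_t item) {
        assert(item < bit_count_);
        bytes_[item / 8] &= static_cast<unsigned char>(~(1u << (item % 8)));
    }

private:
    std::vector<unsigned char> bytes_;
    std::size_t bit_count_;
};
```

For a page-allocation map, set() would mark a page as used and clear() as free; resize() covers a file that gains pages.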
I would implement a wrapper class and simply store your bitmap in a linked list of chunks where each chunk would hold a fixed size array (I would use a stdint type like uint32_t to ensure a given number of bits) then you simply add links to your list to expand. I'll leave contracting as an exercise to the reader.

Iterating through a boost::dynamic_bitset

I have a boost dynamic_bitset that I am trying to extract the set bits from:
boost::dynamic_bitset<unsigned long> myBitset(1000);
My first thought was to do a simple 'dump' loop through each index and ask if it was set:
for(size_t index = 0 ; index < 1000 ; ++index)
{
if(myBitset.test(index))
{
/* do something */
}
}
But then I saw two interesting methods, find_first() and find_next() that I thought for sure were meant for this purpose:
size_t index = myBitset.find_first();
while(index != boost::dynamic_bitset<unsigned long>::npos)
{
/* do something */
index = myBitset.find_next(index);
}
I ran some tests and it seems like the second method is more efficient, but this concerns me that there might be another 'more correct' way to perform this iteration. I wasn't able to find any examples or notes in the documentation indicating the correct way to iterate over the set bits.
So, is using find_first() and find_next() the best way to iterate over a dynamic_bitset, or is there another way?
find_first and find_next are the fastest way. The reason is that these can skip over an entire block (of dynamic_bitset::bits_per_block bits, probably 32 or 64) if none of them are set.
Note that dynamic_bitset does not have iterators, so it will behave a bit un-C++'ish no matter what.
Depends on your definition of more correct. A correct method probably must yield correct results on all valid inputs and be fast enough.
find_first and find_next are there so that they can be optimized to scan entire blocks of bits in one comparison. If a block is, say, an unsigned long of 64 bits, one block comparison analyses 64 bits at once, where a straightforward loop like you posted would do 64 iterations for that.
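To illustrate why scanning by block is faster, here is a standalone sketch of the principle. It is not Boost's implementation; find_from and kNoBit are hypothetical names, and the bitmap is assumed to be a vector of 64-bit blocks with bit i stored in block i / 64.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

const std::size_t kNoBit = static_cast<std::size_t>(-1);

// Returns the index of the first set bit at position >= start, or kNoBit.
// Bit i lives in blocks[i / 64], at bit position i % 64.
inline std::size_t find_from(const std::vector<std::uint64_t>& blocks,
                             std::size_t start) {
    std::size_t block = start / 64;
    if (block >= blocks.size()) return kNoBit;

    // Mask off the bits below `start` in the first block.
    std::uint64_t word = blocks[block] & (~std::uint64_t(0) << (start % 64));
    while (word == 0) {  // one comparison skips 64 bits
        if (++block == blocks.size()) return kNoBit;
        word = blocks[block];
    }

    std::size_t bit = 0;
    while (((word >> bit) & 1u) == 0) ++bit;  // locate the lowest set bit
    return block * 64 + bit;
}
```

On a sparse bitmap most iterations are the single `word == 0` comparison, whereas a test(index) loop pays for every bit.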

how to efficiently access 3^20 vectors in a 2^30 bits of memory

I want to store a 20-dimensional array where each coordinate can have 3 values,
in a minimal amount of memory (2^30 or 1 Gigabyte).
It is not a sparse array, I really need every value.
Furthermore I want the values to be integers of arbitrary but fixed precision,
say 256 bits or 8 words.
Example:
set_big_array(1,0,0,0,1,2,2,0,0,2,1,1,2,0,0,0,1,1,1,2, some_256_bit_value);
and
get_big_array(1,0,0,0,1,2,2,0,0,2,1,1,2,0,0,0,1,1,1,2, &some_256_bit_value);
Because the value 3 is relatively prime to 2, it's difficult to implement this using
efficient bitwise shift, AND and OR operators.
I want this to be as fast as possible.
any thoughts?
Seems tricky to me without some compression:
3^20 = 3486784401 values to store
256bits / 8bitsPerByte = 32 bytes per value
3486784401 * 32 = 111577100832 size for values in bytes
111577100832 / (1024^3) = 104 GB
You're trying to fit 104 GB in 1 GB. There'd need to be some pattern to the data that could be used to compress it.
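The same arithmetic can be checked in code. This is just a sketch with made-up helper names (ipow, total_bytes); the point is that 64-bit integers are needed, since 3^20 no longer fits in 32 bits.

```cpp
#include <cassert>
#include <cstdint>

// Integer power, used to count the cells of the array.
inline std::uint64_t ipow(std::uint64_t base, unsigned exp) {
    std::uint64_t result = 1;
    while (exp-- > 0) result *= base;
    return result;
}

// Total bytes needed for `dims` coordinates with 3 values each.
inline std::uint64_t total_bytes(unsigned dims, std::uint64_t bytes_per_value) {
    assert(dims <= 36);  // 3^36 * 32 still fits in 64 bits; keep it simple
    return ipow(3, dims) * bytes_per_value;
}
```

total_bytes(20, 32) gives 111577100832, and dividing by 1024^3 lands just under 104.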
Sorry, I know this isn't much help, but maybe you can rethink your strategy.
There are 3.48e9 variants of 20-tuple of indexes that are 0,1,2. If you wish to store a 256 bit value at each index, that means you're talking about 8.92e11 bits - about a terabit, or about 100GB.
I'm not sure what you're trying to do, but that sounds computationally expensive. It may be reasonably feasible as a memory-mapped file, and may be reasonably fast as a memory-mapped file on an SSD.
What are you trying to do?
So, a practical solution would be to use a 64-bit OS and a large memory-mapped file (preferably on an SSD) and simply compute the address for a given element in the typical way for arrays, i.e. as sum-of(forall-i(i-th-index * 3^i)) * 32 bytes in pseudo-math. Or, use a very very expensive machine with that much memory, or another algorithm that doesn't require this array in the first place.
A few notes on platforms: Windows 7 supports just 192GB of memory, so using physical memory for a structure like this is possible but really pushing it (more expensive editions support more), if you can find such a machine at all. According to Microsoft's page on the matter, the user-mode virtual address space is 7-8TB, so mmap/virtual memory should be doable. Alex Ionescu explains why there's such a low limit on virtual memory despite an apparently 64-bit architecture. Wikipedia puts Linux's addressable limits at 128TB, though probably that's before the kernel/usermode split.
Assuming you want to address such a multidimensional array, you must process each index at least once: that means any algorithm will be O(N) where N is the number of indexes. As mentioned before, you don't need to convert to base-2 addressing or anything else, the only thing that matters is that you can compute the integer offset - and which base the maths happens in is irrelevant. You should use the most compact representation possible and ignore the fact that each dimension is not a multiple of 2.
So, for a 16-dimensional array, that address computation function could be:
int offset = 0;
for(int ii=0;ii<16;ii++)
offset = offset*3 + indexes[ii];
return &the_array[offset];
As previously said, this is just the common array indexing formula, nothing special about it. Note that even for "just" 16 dimensions, if each item is 32 bytes, you're dealing with a little more than a gigabyte of data.
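For the full 20 dimensions the same formula works, but the offset needs a 64-bit type because 3^20 * 32 is far beyond what a 32-bit int can hold. A minimal sketch, where byte_offset, kDims and kValueBytes are names chosen here for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

const std::size_t kDims = 20;          // number of coordinates
const std::uint64_t kValueBytes = 32;  // one 256-bit value

// Maps 20 indexes, each 0, 1 or 2, to a byte offset into the flat array.
inline std::uint64_t byte_offset(const unsigned char (&indexes)[kDims]) {
    std::uint64_t cell = 0;
    for (std::size_t i = 0; i < kDims; ++i) {
        assert(indexes[i] < 3);
        cell = cell * 3 + indexes[i];  // Horner's rule in base 3
    }
    return cell * kValueBytes;
}
```

The offset could then be added to the base address of a memory-mapped file, as suggested above.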
Maybe I misunderstand your question, but can't you just use a normal array?
INT256 bigArray[3][3][3][3][3][3][3][3][3][3][3][3][3][3][3][3][3][3][3][3];
OR
INT256 ********************bigArray = malloc(3^20 * 8);
bigArray[1][0][0][1][2][0][1][1][0][0][0][0][1][1][2][1][1][1][1][1] = some_256_bit_value;
etc.
Edit:
This will not work, because you would need 3^20 * 8 bytes = ca. 26 GiB (and a 256-bit value is 32 bytes rather than 8, so four times that).
The malloc variant is wrong.
I'll start by doing a direct calculation of the address, then see if I can optimize it
address = 0;
for(i=15; i>=0; i--)
{
address = 3*address + array[i];
}
address = address * number_of_bytes_needed_for_array_value;
2^30 bits is 2^27 bytes so not actually a gigabyte, it's an eighth of a gigabyte.
It appears mathematically impossible, although of course you can build the data at a larger size and then compress it, which may get you down to the required size but cannot be guaranteed to. (It must fail some of the time, as the compression is lossless.)
If you do not require immediate "random" access your solution may be a "variable sized" two-bit word so your most commonly stored value takes only 1 bit and the other two take 2 bits.
If 0 is your most common value then:
0 = 0
10 = 1
11 = 2
or something like that.
In that case you will be able to store your bits in sequence this way.
It could take up to 2^40 bits this way but probably will not.
You could pre-run through your data and see which is the commonly occurring value and use that to indicate your single-bit word.
You can also compress your data after you have serialized it in up to 2^40 bits.
My assumption here is that you will be using disk possibly with memory mapping as you are unlikely to have that much memory available.
My assumption is that space is everything and not time.
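The variable-length code above (0 stored as "0", 1 as "10", 2 as "11") can be sketched like this. The function names encode and decode are made up, and std::vector<bool> is used only as a convenient packed bit container; it assumes 0 is the most common value.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Encodes values 0, 1, 2 as the bit strings 0, 10, 11.
inline std::vector<bool> encode(const std::vector<unsigned char>& values) {
    std::vector<bool> bits;
    for (std::size_t i = 0; i < values.size(); ++i) {
        assert(values[i] < 3);
        if (values[i] == 0) {
            bits.push_back(false);            // single bit for the common value
        } else {
            bits.push_back(true);             // prefix bit
            bits.push_back(values[i] == 2);   // second bit picks 1 or 2
        }
    }
    return bits;
}

// Reverses encode(); must read sequentially, so there is no random access.
inline std::vector<unsigned char> decode(const std::vector<bool>& bits) {
    std::vector<unsigned char> values;
    std::size_t i = 0;
    while (i < bits.size()) {
        if (!bits[i]) {
            values.push_back(0);
            i += 1;
        } else {
            assert(i + 1 < bits.size());
            values.push_back(bits[i + 1] ? 2 : 1);
            i += 2;
        }
    }
    return values;
}
```

Because the codes have different lengths, finding the k-th value means decoding everything before it, which is the "no immediate random access" trade-off mentioned above.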
You might want to take a look at something like STXXL, an implementation of the STL designed for handling very large volumes of data.
You can actually use a pointer-to-array20 to have your compiler implement the index calculations for you:
/* Note: there are 19 of the [3]'s below */
my_256bit_type (*foo)[3][3][3][3][3][3][3][3][3][3][3][3][3][3][3][3][3][3][3];
foo = allocate_giant_array();
foo[0][1][1][0][2][1][2][2][0][2][1][0][2][1][0][0][2][1][0][0] = some_256bit_value;