Bit-Mapped Memory Manager - C++

I am reading a tutorial on writing memory management tools for c++ programs. Here is the link to the tutorial.
One of the variants of this memory manager is a Bit-Mapped Memory Manager in which optimization is based on the idea of prefetching a large chunk of memory and using it in our programs later.
This chunk is further divided into smaller fixed-sized units called blocks to be used for the allocation of a particular type of object.
The tutorial clearly mentions, "All free blocks have their corresponding bit set to 1. Occupied blocks have their bits reset to 0."
With each chunk, a bitmap is associated which represents the above idea. But, in the implementation of BitMap class, each corresponding bit for each block is represented by a 32-bit integer instead of just a single bit boolean value. This is what I am not able to understand.
Also, below is the declaration of the above-mentioned class. You can see this in Listing-12 of the tutorial. I also think the line with memset is incorrect: the initialization is incomplete, and it should be BIT_MAP_SIZE * 4 even if we go their way, since the array holds BIT_MAP_SIZE ints rather than BIT_MAP_SIZE bytes.
typedef struct BitMapEntry
{
    int Index;
    int BlocksAvailable;
    int BitMap[BIT_MAP_SIZE];
public:
    BitMapEntry(): BlocksAvailable(BIT_MAP_SIZE)
    {
        memset(BitMap, 0xff, BIT_MAP_SIZE / sizeof(char));
        // initially all blocks are free and bit value 1 in the map denotes
        // available block
    }
    void SetBit(int position, bool flag);
    void SetMultipleBits(int position, bool flag, int count);
    void SetRangeOfInt(int* element, int msb, int lsb, bool flag);
    Complex* FirstFreeBlock(size_t size);
    Complex* ComplexObjectAddress(int pos);
    void* Head();
} BitMapEntry;

The initialization does look incorrect, but the choice of int as the "bit container" type is not necessarily invalid.
You see, C++ doesn't have a native bit type, and on typical computers you can't address single bits.
What is likely happening (you haven't shown us the implementation) is that each int acts as a container for sizeof(int) * CHAR_BIT bits, so if you ask for position k, you'll be looking at bit k % (sizeof(int) * CHAR_BIT) inside integer number k / (sizeof(int) * CHAR_BIT).
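For illustration, here is a minimal sketch (my own, not the tutorial's Listing-12 code) of how a SetBit-style function could map a block position onto a bit inside the int array, together with a memset that covers the whole array. The BIT_MAP_SIZE value and the unsigned element type are assumptions:

#include <climits>   // CHAR_BIT
#include <cstring>   // std::memset

const int BIT_MAP_SIZE = 1024;                     // assumed value, for illustration only
const int BITS_PER_INT = sizeof(int) * CHAR_BIT;   // 32 on typical platforms

struct BitMapSketch
{
    unsigned int BitMap[BIT_MAP_SIZE];

    BitMapSketch()
    {
        // initialize every byte of the array: BIT_MAP_SIZE * sizeof(int) bytes, all bits set
        std::memset(BitMap, 0xff, BIT_MAP_SIZE * sizeof(unsigned int));
    }

    void SetBit(int position, bool flag)
    {
        int index  = position / BITS_PER_INT;   // which int holds this block's bit
        int offset = position % BITS_PER_INT;   // which bit inside that int
        if (flag)
            BitMap[index] |= (1u << offset);    // 1 = block is free
        else
            BitMap[index] &= ~(1u << offset);   // 0 = block is occupied
    }

    bool GetBit(int position) const
    {
        return (BitMap[position / BITS_PER_INT] >> (position % BITS_PER_INT)) & 1u;
    }
};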

Related

Why reserve memory in the structure?

I often see structures in code that have a memory reserve at the end.
struct STAT_10K4
{
    int32_t npos;   // position number
    ...
    float Plts;
    Pxts;
    float Plto[NUM];
    uint32_t reserv[(NUM * 3) % 2 + 1];
};
Why do they do this?
Why are some of the reserve values dependent on constants?
What can happen if you do not make such reserves? Or make a mistake in their size?
This is a form of manual padding of a class to make its size a multiple of some number. In your case:
uint32_t reserv [(NUM * 3)% 2 + 1];
NUM * 3 % 2 is actually nonsensical, as it is equivalent to NUM % 2 (overflow aside). So if NUM is odd, the struct is padded with one additional uint32_t, on top of the one extra element that the + 1 always provides. This padding means that STAT_10K4's size is always a multiple of 8 bytes.
You will have to consult the documentation of your software to see why exactly this is done. Perhaps padding this struct with up to 8 bytes makes some algorithm easier to implement. Or maybe it has some perceived performance benefit. But this is pure speculation.
Typically, the compiler will pad your structs to 64-bit boundaries if you use any 64-bit types, so you don't need to do this manually.
Note: This answer is specific to mainstream compilers and x86. Obviously this does not apply to compiling for TI-calculators with 20-bit char & co.
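For what it's worth, here is a tiny, made-up illustration of the idea (not the original STAT_10K4, whose elided fields we cannot see): the payload is three 32-bit words, and one manual reserve word brings the size up to a multiple of 8 bytes, which can be checked at compile time:

#include <cstdint>

// Hypothetical payload of three 32-bit words (12 bytes).
struct Padded
{
    int32_t  a;
    int32_t  b;
    int32_t  c;
    uint32_t reserve[1];   // one manual 32-bit pad word
};

// 12 + 4 = 16 bytes; the compiler would not add this padding by itself,
// because every member only needs 4-byte alignment.
static_assert(sizeof(Padded) % 8 == 0, "size should be a multiple of 8 bytes");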
This would typically be to support variable-length records. A couple of ways this could be used are:
1. If the maximum number of records is known, then a simple structure definition can accommodate all cases.
2. In many protocols there is a "header-data" idiom. The header will be a fixed size but the data variable. The data will be received as a "blob". Thus the structure of the header can be declared and accessed through a pointer to the blob, and the data will follow on from that. For example:
typedef struct
{
    uint32_t messageId;
    uint32_t dataType;
    uint32_t dataLenBytes;
    uint8_t data[MAX_PAYLOAD];
} tsMessageFormat;
The data is received as a blob, i.e. a void* ptr and a size_t len.
The buffer pointer is then cast so the message can be read as follows:
tsMessageFormat* pMessage = (tsMessageFormat*) ptr;
for (uint32_t i = 0; i < pMessage->dataLenBytes; i++)
{
    // do something with pMessage->data[i];
}
In some languages the "data" could be specified as being an empty record, but C++ does not allow this. Sometimes you will see the "data" omitted and you have to perform pointer arithmetic to access the data.
The alternative to this would be to use a builder pattern and/or streams.
Windows uses this pattern a lot; many structures have a cbSize field which allows additional data to be conveyed beyond the structure. The structure accommodates most cases, but having cbSize allows additional data to be provided if necessary.
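To make the cbSize idea concrete, here is a hedged sketch using a hypothetical structure (not an actual Windows API struct): the first field records how many bytes the caller filled in, so newer fields can be added without breaking old callers:

#include <cstdint>
#include <cstring>

struct SettingsV1
{
    uint32_t cbSize;   // sizeof the struct the caller was compiled against
    uint32_t colour;
};

struct SettingsV2
{
    uint32_t cbSize;
    uint32_t colour;
    uint32_t opacity;  // field added in a later version
};

uint32_t readOpacity(const void* blob)
{
    // Read the size prefix first; it tells us which layout the caller used.
    uint32_t cbSize = 0;
    std::memcpy(&cbSize, blob, sizeof cbSize);

    if (cbSize >= sizeof(SettingsV2)) {
        SettingsV2 s;
        std::memcpy(&s, blob, sizeof s);
        return s.opacity;              // newer caller: the field is present
    }
    return 255;                        // older caller: fall back to a default
}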

What is QList's maximum size?

Has anyone encountered a maximum size for QList?
I have a QList of pointers to my objects and have found that it silently throws an error when it reaches the 268,435,455th item, which is exactly 28 bits. I would have expected it to have at least a 31bit maximum size (minus one bit because size() returns a signed integer), or a 63bit maximum size on my 64bit computer, but this doesn't appear to be the case. I have confirmed this in a minimal example by executing QList<void*> mylist; mylist.append(0); in a counting loop.
To restate the question, what is the actual maximum size of QList? If it's not actually 2^32-1 then why? Is there a workaround?
I'm running a Windows 64bit build of Qt 4.8.5 for MSVC2010.
While the other answers make a useful attempt at explaining the problem, none of them actually answers the question; they all missed the point. Thanks to everyone for helping me track down the issue.
As Ali Mofrad mentioned, the error thrown is a std::bad_alloc error when the QList fails to allocate additional space in my QList::append(MyObject*) call. Here's where that happens in the Qt source code:
qlist.cpp: line 62:
static int grow(int size)   // size = 268435456
{
    // this is the problem line
    volatile int x = qAllocMore(size * sizeof(void *), QListData::DataHeaderSize) / sizeof(void *);
    return x;   // x = -2147483648
}
qlist.cpp: line 231:
void **QListData::append(int n)   // n = 1
{
    Q_ASSERT(d->ref == 1);
    int e = d->end;
    if (e + n > d->alloc) {
        int b = d->begin;
        if (b - n >= 2 * d->alloc / 3) {
            //...
        } else {
            realloc(grow(d->alloc + n));   // <-- grow() is called here
        }
    }
    d->end = e + n;
    return d->array + e;
}
In grow(), the new size requested (268,435,456) is multiplied by sizeof(void*) (8) to compute the size of the new block of memory to accommodate the growing QList. The problem is, 268435456*8 equals +2,147,483,648 if it's an unsigned int32, or -2,147,483,648 for a signed int32, which is what's getting returned from grow() on my OS. Therefore, when std::realloc() is called in QListData::realloc(int), we're trying to grow to a negative size.
The workaround here, as ddriver suggested, is to use QList::reserve() to pre-allocate the space, preventing my QList from ever having to grow.
In short, the maximum size for QList is 2^28-1 items unless you pre-allocate, in which case the maximum size truly is 2^31-1 as expected.
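For completeness, a minimal sketch of the reserve() workaround (assuming Qt 4.8, and assuming the machine can actually hand out that much contiguous memory):

#include <QList>

int main()
{
    const int count = 300000000;   // more than 2^28 items

    QList<void*> list;
    list.reserve(count);           // one big allocation up front, so grow() is never called

    for (int i = 0; i < count; ++i)
        list.append(0);            // no reallocation, hence no int overflow inside grow()

    return 0;
}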
Update (Jan 2020): This appears to have changed in Qt 5.5, such that 2^28-1 is now the maximum size allowed for QList and QVector, regardless of whether or not you reserve in advance. A shame.
The theoretical maximum positive number stored in an int is 2^31 - 1. The size of a pointer is 4 bytes (on a 32-bit machine), so the maximum possible number of them is 2^29 - 1. Appending data to the container increases fragmentation of the heap, so it is possible that you can only allocate half of the theoretically possible memory. Try using reserve() or resize() instead.
Moreover, Win32 has limits on memory allocation, so an application compiled without special options cannot allocate more than that limit (1 GB or 2 GB).
Are you sure about these huge containers? Wouldn't it be better to optimize the application?
QList stores its elements in a void * array.
Hence, a list with 2^28 items, each of which is a void *, will be 2^30 bytes long on a 32-bit machine, and 2^31 bytes on a 64-bit machine. I doubt you can request such a big chunk of contiguous memory.
And why allocating such a huge list anyhow? Are you sure you really need it?
The idea of being backed by an array of void * elements is that several operations on the list can be moved into non-templated code, thereby reducing the amount of generated code.
QList stores items straight in the void * array if the type is small enough (i.e. sizeof(T) <= sizeof(void*)), and if the type can be moved in memory via memmove. Otherwise, each item will be allocated on the heap via new, and the array will store the pointers to those items. A set of type traits is used to figure out how to handle each type, see Q_DECLARE_TYPEINFO.
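For illustration, this is how a type can be marked as movable so Qt's containers are allowed to relocate it with memcpy (a small sketch; Point2D is a made-up type):

#include <QtGlobal>

struct Point2D
{
    double x;
    double y;
};

// Tell Qt's containers that Point2D can be relocated with memcpy when they
// reallocate, instead of copy-constructing and destroying each element.
Q_DECLARE_TYPEINFO(Point2D, Q_MOVABLE_TYPE);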
While in theory this approach may sound attractive, in practice:
For all primitive types smaller than void * (char; int and float on 64 bit; etc.) you waste from 50 to 75% of the allocated space in the array
For all movable types bigger than void * (double on 32bit, QVariant, ...), you pay a heap allocation per each item in the list (plus the array itself)
The QList code is generally less optimized than the QVector code
Compilers these days do a pretty good job at merging template instantiations, hence the original reason for this design gets lost.
Today it's a much better idea to stick with QVector. Unfortunately the Qt APIs expose QList everywhere and cannot be changed (and we would need C++11 to define QList as a template alias for QVector...)
I tested this on Ubuntu 32-bit with 4 GB RAM using Qt 4.8.6. The maximum size for me is 268,435,450.
I tested this on Windows 7 32-bit with 4 GB RAM using Qt 4.8.4. The maximum size for me is 134,217,722.
This error happened: 'std::bad_alloc'
#include <QCoreApplication>
#include <QDebug>

int main(int argc, char *argv[])
{
    QCoreApplication a(argc, argv);
    QList<bool> li;
    for(int i = 0; ; i++)
    {
        li.append(true);
        if(i > 268435449)
            qDebug() << i;
    }
    return a.exec();
}
Output is :
268435450
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc

Can anyone explain why the size_t type is used, with an example?

I was wondering why size_t is used where I could just use, say, int. It's said that size_t is the return type of the sizeof operator. What does that mean? If I use sizeof(int) and store what it returns in an int variable, that also works, so it doesn't seem necessary to store it in a size_t variable. I just want to clearly understand the basic concept of using size_t, with an easily understandable example. Thanks.
size_t is guaranteed to be able to represent the largest size possible, int is not. This means size_t is more portable.
For instance, what if int could only store up to 255 but you could allocate arrays of 5000 bytes? Clearly this wouldn't work; with size_t, however, it will.
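A small sketch of the usual pattern, keeping size-typed values in size_t rather than squeezing them into int (the buffer size is arbitrary):

#include <cstddef>
#include <cstdio>
#include <vector>

int main()
{
    std::vector<char> buffer(100000, 'x');

    size_t n = buffer.size();                    // size() returns an unsigned size type
    std::printf("buffer holds %zu bytes\n", n);

    size_t total = 0;
    for (size_t i = 0; i < buffer.size(); ++i)   // index with size_t, not int
        total += static_cast<unsigned char>(buffer[i]);

    std::printf("sum of bytes: %zu\n", total);
    return 0;
}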
The simplest example is pretty dated: on an old 16-bit-int system with 64 k of RAM, the value of an int can be anywhere from -32768 to +32767, but after:
char buf[40960];
the buffer buf occupies 40 kbytes, so sizeof buf is too big to fit in an int, and it needs an unsigned int.
The same thing can happen today if you use 32-bit int but allow programs to access more than 4 GB of RAM at a time, as is the case on what are called "I32LP64" models (32 bit int, 64-bit long and pointer). Here the type size_t will have the same range as unsigned long.
You use size_t mostly for casting pointers into unsigned integers of the same size, to perform calculations on pointers as if they were integers, that would otherwise be prevented at compile time. Such code is intended to compile and build correctly in the context of different pointer sizes, e.g. 32-bit model versus 64-bit.
It is implementation defined but on 64bit systems you will find that size_t is often 64bit while int is still 32bit (unless it's ILP64 or SILP64 model).
Depending on what architecture you are on (16-bit, 32-bit or 64-bit), an int could be a different size.
If you want a specific size, use uint16_t or uint32_t. You can check out this thread for more information:
What does the C++ standard state the size of int, long type to be?
size_t is a typedef used to represent object sizes. It can store the size of the largest object supported by the target platform, which makes it portable.
For example:
void * memcpy(void * destination, const void * source, size_t num);
memcpy() copies num bytes from source into destination. The maximum number of bytes that can be copied depends on the platform. So, making num as type size_t makes memcpy portable.
Refer https://stackoverflow.com/a/7706240/2820412 for further details.
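A short usage sketch of that signature; sizeof already yields a size_t, so no conversion is needed:

#include <cstdio>
#include <cstring>

int main()
{
    char source[] = "hello, size_t";
    char destination[sizeof source];

    size_t num = sizeof source;             // sizeof yields a size_t
    std::memcpy(destination, source, num);  // copies num bytes, whatever the platform

    std::printf("%s\n", destination);
    return 0;
}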
size_t is a typedef for one of the fundamental unsigned integer types. It could be unsigned int, unsigned long, or unsigned long long depending on the implementation.
Its special property is that it can represent the size (in bytes) of any object, including the largest object possible. That is one of the reasons it is widely used in the standard library for array indexing and loop counting (which also solves the portability issue). Let me illustrate this with a simple example.
Consider a vector of length 2*UINT_MAX, where UINT_MAX denotes the maximum value of unsigned int (which is 4294967295 for my implementation considering 4 bytes for unsigned int).
std::vector<size_t> vec(2ULL*UINT_MAX, 0);
If you wanted to fill the vector using a for-loop such as this, it would not work, because an unsigned int can only count up to UINT_MAX (beyond which it wraps around to 0):
for(unsigned int i = 0; i < 2ULL*UINT_MAX; ++i) vec[i] = i;   // i wraps back to 0 at UINT_MAX, never reaching the bound
The solution here is to use size_t since it is guaranteed to represent the size of any object (and therefore our vector vec too!) in bytes. Note that for my implementation size_t is a typedef for unsigned long and therefore its max value = ULONG_MAX = 18446744073709551615 considering 8 bytes.
for(size_t i = 0; i < 2ULL*UINT_MAX; ++i) vec[i] = i;
References: https://en.cppreference.com/w/cpp/types/size_t

Unions between pointers and data, possible pitfalls?

I'm programming a system which has a massive amount of redundant data that needs to be kept in memory, and accessible with as little latency as possible. (uncompressed, the data is guaranteed to absorb 1GB of memory, minimum).
One such method I thought of is creating a container class like the following:
class Chunk {
public:
    Chunk() { /* ... */ }
    ~Chunk() { /* carefully delete elements according to mask */ }
    Uint32 getElement(int index);
    void setElement(int index, Uint32 value);
private:
    unsigned char mask;     // bit on == data is not-redundant; array is 8x8, 64 elements
    union {
        Uint32  redundant;  // all 8 elements are this value if mask bit == 0
        Uint32* ptr;        // pointer to 8 allocated elements if mask bit == 1
    } array[8];
};
My question is: are there any unseen consequences of using a union to switch between a Uint32 primitive and a Uint32* pointer?
This approach should be safe on all C++ implementations.
Note, however, that if you know your platform's memory alignment requirements, you may be able to do better than this. In particular, if you know that memory allocations are aligned to 2 bytes or greater (many platforms use 8 or 16 bytes), you can use the lower bit of the pointer as a flag:
class Chunk {
    //...
    uintptr_t ptr;
};
// In your get function:
if ( (ptr & 1) == 0 ) {
    return ((uint32_t *)ptr)[index];                // low bit clear: ptr is a real pointer to the 8 elements
} else {
    return *((uint32_t *)(ptr & ~(uintptr_t)1));    // low bit set: mask off the flag bit, then dereference
}
You can further reduce space usage by using a custom allocation method (with placement new) and placing the pointer immediately after the class, in a single memory allocation (ie, you'll allocate room for Chunk and either the mask or the array, and have ptr point immediately after Chunk). Or, if you know most of your data will have the low bit off, you can use the ptr field directly as the fill-in value:
} else {
    return ptr & ~(uintptr_t)1;   // mask off the flag bit; the remaining bits are the value itself
}
If it's the high bit that's usually unused, a bit of bit shifting will work:
} else {
    return ptr >> 1;
}
Note that this approach of tagging pointers is unportable. It is only safe if you can ensure your memory allocations will be properly aligned. On most desktop OSes this will not be a problem - malloc already ensures some degree of alignment; on Unixes, you can be absolutely sure by using posix_memalign. If you can obtain such a guarantee for your platform, though, this approach can be quite effective.
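Pulling the snippets above together, here is one possible self-contained sketch (the names and the choice of storing the shared value above the tag bit are my assumptions, not the original poster's design):

#include <cassert>
#include <cstdint>

// One slot: either a pointer to 8 distinct values (tag bit 0) or a single
// shared fill-in value stored above the tag bit (tag bit 1).
struct TaggedSlot
{
    uintptr_t bits;

    void setShared(uint32_t value)
    {
        bits = (static_cast<uintptr_t>(value) << 1) | 1;        // value in the upper bits, tag = 1
    }

    void setArray(uint32_t* eight)
    {
        assert((reinterpret_cast<uintptr_t>(eight) & 1) == 0);  // relies on >= 2-byte alignment
        bits = reinterpret_cast<uintptr_t>(eight);              // tag = 0
    }

    uint32_t get(int index) const
    {
        if (bits & 1)
            return static_cast<uint32_t>(bits >> 1);            // shared value
        return reinterpret_cast<uint32_t*>(bits)[index];        // per-element value
    }
};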
If space is at a premium you may be wasting memory. It will allocate enough space for the largest element, which in this case could be up to 64 bits for the pointer.
If you stick to 32-bit architectures you should not have problems with the cast.

Copying array of ints vs pointers to bools

I'm working on a program that requires an array to be copied many thousands/millions of times. Right now I have two ways of representing the data in the array:
An array of ints:
int someArray[8][8];
where someArray[a][b] can have a value of 0, 1, or 2, or
An array of pointers to booleans:
bool * someArray[8][8];
where someArray[a][b] can be 0 (null pointer), otherwise *someArray[a][b] can be true (corresponding to 1), or false (corresponding to 2).
Which array would be copied faster (and yes, if I made the pointers to booleans array, I would have to declare new bools every time I copy the array)?
Which would copy faster is beside the point. The overhead of allocating and freeing entries, and of dereferencing the pointer to retrieve each value, with your bool* approach will swamp the cost of copying.
If you just have 3 possible values, use an array of char and that will copy 4 times faster than int. OK, that's not a scientifically proven statement but the array will be 4 times smaller.
Actually, both look more or less the same in terms of copying - an array of 32-bit ints vs an array of 32-bit pointers. If you compile as 64-bit, then the pointer would probably be bigger.
BTW, if you store pointers, you probably don't want to have a SEPARATE instance of "bool" for every field of that array, do you? That would be certainly much slower.
If you want a fast copy, reduce the size as much as possible. Either:
use char instead of int, or
devise a custom class with bit manipulations for this array. If you represent one value as two bits - a "null" bit and "value-if-not-null" bit, then you'd need 128 bits = 4 ints for this whole array of 64 values. This would certainly be copied very fast! But the access to any individual bit would be a bit more complex - just a few cycles more.
OK, you made me curious :) I rolled up something like this:
struct BitArray {
public:
    static const int DIMENSION = 8;
    enum BitValue {
        BitNull  = -1,
        BitTrue  = 1,
        BitFalse = 0
    };
    BitArray() { for (int i = 0; i < DIMENSION*DIMENSION/16; ++i) data[i] = 0; }
    BitValue get(int x, int y) {
        int k = x + y*DIMENSION;   // [0 .. 64)
        int n = k/16;              // [0 .. 4)
        unsigned bit1 = 1 << ((k%16)*2);
        unsigned bit2 = 1 << ((k%16)*2 + 1);
        int isnull = data[n] & bit1;
        int value  = data[n] & bit2;
        return static_cast<BitValue>( (!!isnull)*-1 + (!isnull)*!!value );
    }
    void set(int x, int y, BitValue value) {
        int k = x + y*DIMENSION;   // [0 .. 64)
        int n = k/16;              // [0 .. 4)
        unsigned bit1 = 1 << ((k%16)*2);
        unsigned bit2 = 1 << ((k%16)*2 + 1);
        char v = static_cast<char>(value);
        // set the null bit to 1 if v == -1, else 0
        if (v == -1) {
            data[n] |= bit1;
        } else {
            data[n] &= ~bit1;
        }
        // set the value bit to 1 if v == 1, else 0
        if (v == 1) {
            data[n] |= bit2;
        } else {
            data[n] &= ~bit2;
        }
    }
private:
    unsigned data[DIMENSION*DIMENSION/16];
};
The size of this object for an 8x8 array is 16 bytes, which is a nice improvement compared to 64 bytes with the solution of char array[8][8] and 256 bytes of int array[8][8].
This is probably as low as one can go here without delving into greater magic.
I would say you need to redesign your program. Converting between int x[8][8] and bool *b[8][8] "millions" of times cannot be "right", however lax your definition of "right" is.
The answer to your question will be linked to the size of the data types. Typically bool is one byte while int is not. A pointer varies in length depending on the architecture, but these days is usually 32- or 64-bits.
Not taking caching or other processor-specific optimizations into consideration, the data type that is larger will take longer to copy.
Given that you have three possible states (0, 1, 2) and 64 entries you can represent your entire structure in 128 bits. Using some utility routines and two unsigned 64-bit integers you can efficiently copy your array around very quickly.
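A hedged sketch of that idea: one 64-bit word records whether each cell holds a value, the other holds the true/false bit, so copying the whole board is just copying 16 bytes:

#include <cstdint>

// 8x8 board, three states per cell (0 = unset, 1 = true, 2 = false),
// packed into two 64-bit words: one bit per cell in each word.
struct PackedBoard
{
    uint64_t occupied;   // bit set -> the cell holds a value
    uint64_t value;      // meaningful only where the occupied bit is set

    int get(int row, int col) const
    {
        int bit = row * 8 + col;
        if (!((occupied >> bit) & 1))
            return 0;                          // unset
        return ((value >> bit) & 1) ? 1 : 2;   // 1 = true, 2 = false
    }

    void set(int row, int col, int state)
    {
        int bit = row * 8 + col;
        uint64_t mask = uint64_t(1) << bit;
        if (state == 0) { occupied &= ~mask; return; }
        occupied |= mask;
        if (state == 1) value |= mask; else value &= ~mask;
    }
};

// Copying is two plain integer copies: PackedBoard b = a;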
I am not 100% sure, but I think they will take roughly the same time, though I prefer using stack allocation (since dynamic allocation might take some time looking for a free space).
Consider using short type instead of int since you do not need a wide range of numbers.
I think it might be better to use a one-dimensional array if you really want maximum speed, since iterating the for loops in the wrong order relative to how the compiler stores multidimensional arrays (row-major or column-major) could cause a performance penalty!
Without knowing too much about how you use the arrays, this is a possible solution:
typedef char Array[8][8];
Array someArray, otherArray;
memcpy(someArray, otherArray, sizeof(Array));
These arrays are only 64 bytes and should copy fairly fast. You can change the data type to int but that means copying at least 256 bytes.
"copying" this array with the pointers would require a deep copy, since otherwise changing the copy will affect the original, which is probably not what you want. This is going to slow things down immensely due to the memory allocation overhead.
You can get around this by using boost::optional to represent "optional" quantities - which is the only reason you're adding the level of indirection here. There are very few situations in modern C++ where a raw pointer is really the best thing to be using :) However, since you only need a char to store the values {0, 1, 2} anyway, that will probably be better in terms of space. I am pretty sure that sizeof(boost::optional<bool>) > 1, though I haven't tested it. I would be impressed if they specialized for this :)
You could even bit-pack an array of 2-bit quantities, or use two bit-packed boolean arrays (one "mask" and then another set of actual true-false values) - using std::bitset for example. That will certainly save space and reduce copying time, although it would probably increase access time (assuming you really do need to access one value at a time).
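For example, a sketch of the two-bitset variant (one mask bitset plus one value bitset), which keeps the whole structure trivially copyable and typically only 16 bytes on common implementations:

#include <bitset>

// Two bit-packed 8x8 boolean arrays: one mask saying which cells are set,
// one holding the true/false value of each set cell.
struct BitsetBoard
{
    std::bitset<64> mask;     // cell is non-null
    std::bitset<64> values;   // true/false where mask is set

    void set(int row, int col, bool present, bool value = false)
    {
        int i = row * 8 + col;
        mask[i] = present;
        values[i] = present && value;
    }

    int get(int row, int col) const   // 0 = null, 1 = true, 2 = false
    {
        int i = row * 8 + col;
        if (!mask[i])
            return 0;
        return values[i] ? 1 : 2;
    }
};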