Remove memory from the middle of a file - c++

I have a binary format that is built up like this:
magic number
name size blob
name size blob
name size blob
...
It is built this way so that it is easy to move through the file and find the right entry. But I would also like to remove an entry (let's call it a chunk, as it is one). I guess I can use std::copy/memmove with some iostream iterators to move the chunks behind the one to delete forward, overwriting the chunk to delete. But then the space I freed ends up at the end of the file, filled with unusable data (I could fill it with zeros or not). I would likely shrink the file afterwards.
I know I can read all the data that I want to keep into a buffer and put it into a new file, but I don't like rewriting the whole file just to delete one chunk.
Any ideas for the best way of removing data in a file?

@MarkSetchell had a good idea for how to treat this problem:
I now have a magic number at the beginning of every chunk to check whether another valid chunk is coming. After moving some data towards the beginning, I move the write pointer right behind the last chunk and fill the space for the next magic number with zeros. So when listing the entries, it will stop when there is no valid magic number, and if I add another entry, it will automatically overwrite the unused space.
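The move-and-shrink idea from the question could be sketched roughly as follows. This is only an illustration under assumptions: the function name remove_range and the 64 KiB buffer size are made up here, and C++17 is assumed so that std::filesystem::resize_file can do the truncation.

```cpp
#include <algorithm>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>
#include <vector>

// Hypothetical helper: removes `length` bytes starting at `offset` by moving
// everything behind that range towards the beginning, then truncating.
// Returns false if the range lies outside the file or an I/O error occurs.
bool remove_range(const std::string& path, std::uintmax_t offset, std::uintmax_t length)
{
    namespace fs = std::filesystem;
    std::error_code ec;
    const std::uintmax_t size = fs::file_size(path, ec);
    if (ec || offset > size || length > size - offset)
        return false;
    {
        std::fstream f(path, std::ios::in | std::ios::out | std::ios::binary);
        if (!f)
            return false;
        std::vector<char> buffer(64 * 1024);        // assumed block size
        std::uintmax_t read_pos = offset + length;  // first byte to keep
        std::uintmax_t write_pos = offset;          // where that byte moves to
        while (read_pos < size) {
            const std::uintmax_t want =
                std::min<std::uintmax_t>(buffer.size(), size - read_pos);
            f.seekg(static_cast<std::streamoff>(read_pos));
            f.read(buffer.data(), static_cast<std::streamsize>(want));
            if (f.gcount() != static_cast<std::streamsize>(want))
                return false;
            f.seekp(static_cast<std::streamoff>(write_pos));
            f.write(buffer.data(), static_cast<std::streamsize>(want));
            if (!f)
                return false;
            read_pos += want;
            write_pos += want;
        }
    } // the stream is closed here, before the file is resized
    fs::resize_file(path, size - length, ec);       // drop the stale tail
    return !ec;
}
```

Only the bytes behind the deleted chunk are rewritten, so deleting a chunk near the end is cheap, while deleting the first chunk still touches almost the whole file.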

You can't have the best of both worlds. If you want to preserve space, you will need something that describes the file's sections (let's call it an allocation table), with each section consisting of a sequence of shards.
A section would start off normally (as one shard), but as soon as it is de-allocated, its space is made available as a shard for a new section. You can then choose at what point you are willing to live with sharded (non-contiguous) sections (perhaps only after your file reaches a certain size limit).
The allocation table describes each section as a series (linked list) of shards (or one shard, if contiguous). You could either reserve a fixed size for the allocation table, keep it in a different file, or shard it and give it the ability to reconstruct itself.
#include <cstddef>
#include <istream>
#include <string>
#include <vector>

struct Section
{
    // One contiguous piece of a section inside the file
    struct Shard
    {
        std::size_t baseAddr_;
        std::size_t size_;
    };
    std::string name_;
    std::size_t shardCount_;
    std::vector<Shard> shards_;
    std::istream& readFrom( std::istream& );
};
struct AllocTable
{
    std::size_t sectionCount_;
    std::vector<Section> sections_;
    std::size_t next_; // file offset of the next table, or -1 if this is the last one
    std::istream& readFrom( std::istream& is, AllocTable* previous )
    {
        //Brief code... error handling left as your exercise
        is >> sectionCount_;
        sections_.resize( sectionCount_ );
        for( std::size_t i = 0; i < sectionCount_; ++i )
        {
            sections_[i].readFrom( is );
        }
        is >> next_; //Note - no error handling for brevity
        if( next_ != static_cast<std::size_t>(-1) )
        {
            is.seekg( next_ ); //Seek to next_ from file beginning
            AllocTable nextTable;
            nextTable.readFrom( is, this );
            // Append the sections of the chained table(s) to this one
            sections_.insert( sections_.end(),
                nextTable.sections_.begin(), nextTable.sections_.end() );
        }
        return is;
    }
};
...

Related

C++ - Efficient way to group double vectors following a certain criteria

I have a list of objects saved in a CSV-like file using the following scheme:
[value11],...,[value1n],[label1]
[value21],...,[value2n],[label2]
...
[valuen1],...,[valuenn],[labeln]
(each line is a single object, i.e. a vector of doubles and the respective label).
I would like to collect them in groups according to a custom criterion (e.g. the same values at the n-th and (n+1)-th position for all objects of a group). I need to do that in the most efficient way, since the text file contains hundreds of thousands of objects. I'm using the C++ programming language.
To do so, I first load all the CSV lines into a simple custom container (with getObject, getLabel and import methods). Then I use the following code to read them and make groups. "verifyGroupRequirements" is a function which returns true if the group conditions are satisfied, false otherwise.
for (size_t i = 0; i < ObjectsList.getSize(); ++i) {
MyObject currentObj;
currentObj.attributes = ObjectsList.getObject(i);
currentObj.label = ObjectsList.getLabel(i);
if (i == 0) {
// Sequence initialization with the first object
ObjectsGroup currentGroup = ObjectsGroup();
currentGroup.objectsList.push_back(currentObj);
tmpGroupList.push_back(currentGroup);
} else {
// if it is not the first pattern, then we check sequence conditions
list<ObjectsGroup>::iterator it5;
for (it5 = tmpGroupList.begin(); it5 != tmpGroupList.end(); ++it5) {
bool AddObjectToGroupRequirements =
verifyGroupRequirements(it5->objectsList.back(), currentObj) &
( (it5->objectsList.size() < maxNumberOfObjectsPerGroup) |
(maxNumberOfObjectsPerGroup == 0) );
if (AddObjectToGroupRequirements) {
// Object added to the group
it5->objectsList.push_back(currentObj);
break;
} else {
// If we can't find a group which satisfy those conditions and we
// arrived at the end of the list of groups, then we create a new
// group with that object.
size_t gg = std::distance(it5, tmpGroupList.end());
if (gg == 1) {
ObjectsGroup tmp1 = ObjectsGroup();
tmp1.objectsList.push_back(currentObj);
tmpGroupList.push_back(tmp1);
break;
}
}
}
}
if (maxNumberOfObjectsPerGroup > 0) {
// With a for loop we can take all the elements of
// tmpGroupList which have reached the maximum size
list<ObjectsGroup>::iterator it2;
for (it2 = tmpGroupList.begin(); it2 != tmpGroupList.end(); ++it2) {
if (it2->objectsList.size() == maxNumberOfObjectsPerGroup)
finalGroupList.push_back(*it2);
}
// Since tmpGroupList is a list we can use remove_if to remove them
tmpGroupList.remove_if(rmCondition);
}
}
if (maxNumberOfObjectsPerGroup == 0)
finalGroupList = vector<ObjectsGroup> (tmpGroupList.begin(), tmpGroupList.end());
else {
list<ObjectsGroup>::iterator it6;
for (it6 = tmpGroupList.begin(); it6 != tmpGroupList.end(); ++it6)
finalGroupList.push_back(*it6);
}
Where tmpGroupList is a list<ObjectsGroup>, finalGroupList is a vector<ObjectsGroup> and rmCondition is a boolean function that returns true if the size of an ObjectsGroup is bigger than a fixed value. MyObject and ObjectsGroup are two simple data structures, written in the following way:
// Data structure of the single object
class MyObject {
public:
MyObject(
unsigned short int &spaceToReserve,
double &defaultContent,
string &lab) {
attributes = vector<double>(spaceToReserve, defaultContent);
label = lab;
}
vector<double> attributes;
string label;
};
// Data structure of a group of object
class ObjectsGroup {
public:
list<MyObject> objectsList;
double health;
};
This code seems to work, but it is really slow. Since, as I said before, I have to apply it to a large set of objects, is there a way to improve it and make it faster? Thanks.
[EDIT] What I'm trying to achieve is to make groups of objects where each object is a vector<double> (read from a CSV file). So what I'm asking here is: is there a more efficient way to collect those kinds of objects into groups than what is shown in the code example above?
[EDIT2] I need to make groups using all of those vectors.
So, I'm reading your question...
... I would like to collect them in groups with a certain custom
criteria (i.e. same values at n-th and (n+1)-th position of all
objects of that group) ...
Ok, I read this part, and kept on reading...
... And i need to do that in the most efficient way, since the text file
contains hundreds of thousands of objects...
I'm still with you, makes perfect sense.
... To do so, firstly I load all the CSV lines ...
{thud} {crash} {loud explosive noises}
Ok, I stopped reading right there, and didn't pay much attention to the rest of the question, including the large code sample. This is because we have a basic problem right from the start:
1) You say that your intention is, typically, to read only a small
portion of this huge CSV file, and...
2) ... to do that you load the entire CSV file, into a fairly sophisticated data structure.
These two statements are at odds with each other. You're reading a huge number of values from a file. You are creating an object for each value. Based on the premise of your question, you're going to have a large number of these objects. But then, when all is said and done, you're only going to look at a small number of them, and throw the rest away?
You are doing a lot of work, presumably using up a lot of memory and CPU cycles, loading a huge data set, only to ignore most of it. And you are wondering why you're having performance issues? Seems pretty cut and dried to me.
What would be an alternative way of doing this? Well, let's turn this whole problem inside out, and approach it piecemeal. Let's read a CSV file, one line at a time, parse the values in the CSV-formatted file, and pass the resulting strings to a lambda.
Something like this:
template<typename Callback> void parse_csv_lines(std::ifstream &i,
                                                 Callback &&callback)
{
    std::string line;
    while (1)
    {
        line.clear();
        std::getline(i, line);
        // Deal with missing newline on the last line...
        if (i.eof() && line.empty())
            break;
        std::vector<std::string> words;
        // At this point, you'll take this "line", and split it apart, at
        // the commas, into the individual words. Parsing a CSV-
        // formatted file. Not very exciting, you're doing this
        // already, the algorithm is boring to implement, you know
        // how to do it, so let's say you replace this entire comment
        // with your boiler-plate CSV parsing logic from your existing
        // code
        callback(words);
    }
}
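The splitting step that the comment above leaves out could look roughly like this. It is only a sketch under assumptions: split_csv_line is a made-up name, and the format is assumed to be the simple one from the question, with no quoted fields and no escaped commas.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits one line at every comma.
// Quoted fields and escaped commas are NOT handled.
std::vector<std::string> split_csv_line(const std::string& line)
{
    std::vector<std::string> words;
    std::string word;
    std::istringstream stream(line);
    while (std::getline(stream, word, ','))
        words.push_back(word);
    // std::getline drops a trailing empty field ("a,b,"), so restore it.
    if (!line.empty() && line.back() == ',')
        words.emplace_back();
    return words;
}
```

Inside parse_csv_lines, the declaration of words would then become something like std::vector<std::string> words = split_csv_line(line);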
Ok, now we've done that task of parsing the CSV file. Now, let's say we want to do the task you've set out in the beginning of your question, grab every nth and n+1th position. So...
void do_something_with_n_and_nplus1_words(size_t n)
{
    std::ifstream input_file("input_file.csv");
    // Insert code to check if input_file.is_open(), and if not, do
    // whatever
    parse_csv_lines(input_file,
                    [n]
                    (const auto &words)
                    {
                        // So now, grab words[n] and words[n+1]
                        // (after checking, of course, for a malformed
                        // CSV file with fewer than "n+2" values)
                        // and do whatever you want with them.
                    });
}
That's it. Now, you end up simply reading the CSV file, and doing the absolute minimum amount of work required to extract the nth and the n+1th values from each CSV file. It's going to be fairly difficult to come up with an approach that does less work (except, of course, micro-optimizations related to CSV parsing, and word buffers; or perhaps foregoing the overhead of std::ifstream, but rather mmap-ing the entire file, and then parsing it out by scanning its mmap-ed contents, something like that), I'd think.
For other similar one-off tasks, requiring only a small number of values from the CSV files, just write an appropriate lambda to fetch them out.
Perhaps, you need to retrieve two or more subsets of values from the large CSV file, and you want to read the CSV file once, maybe? Well, hard to give the best general approach. Each one of these situations will require individual analysis, to pick the best approach.

detecting invalid iterators for a ring buffer

I'm trying to implement a ring buffer (or circular buffer). As with most of these implementations it should be as fast and lightweight as possible but still provide enough safety to be robust enough for production use. This is a difficult balance to strike. In particular I'm faced with the following problem.
I want to use said buffer to store the last n system events. As new events come in, the oldest get deleted. Other parts of my software can then access those stored events and process them at their own pace. Some systems might consume events almost as fast as they arrive, others may only check sporadically. Each system would store an iterator into the buffer so that it knows where it left off the last time it checked. This is no problem as long as they check often enough, but the slower systems in particular may often find themselves with an old iterator that points to a buffer element that has since been overwritten, without a way to detect that.
Is there a good (not too costly) way of checking whether any given iterator is still valid?
Things I came up with so far:
keep a list of all iterators and store their valid state (rather costly)
store not only the iterator in the calling systems but also a copy of the pointed-to element in the client of the buffer. On each access, check whether the element is still the same. This can be unreliable. If the element has been overwritten by an identical element it is impossible to check whether it has changed or not. Also, the responsibility of finding a good way to check elements lies with the client, which is not ideal in my mind.
Many ring buffer implementations don't bother with this at all or use a single-read-single-write idiom, where reading is deleting.
Instead of storing values, store (value, sequence_num) pairs. When you push a new value, always make sure that it uses a different sequence_num. You can use a monotonically increasing integer for sequence_num.
Then, the iterator remembers the sequence_num of the element that it was last looking at. If it doesn't match, it's been overwritten.
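A minimal sketch of this (value, sequence_num) idea, with made-up names (TaggedRing, push, get) and a 64-bit counter so that overflow can be ignored in practice:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical ring buffer that tags every slot with a sequence number.
// T is assumed to be default-constructible and copy-assignable.
template <typename T, std::size_t N>
class TaggedRing {
public:
    // Stores a value and returns the sequence number assigned to it.
    std::uint64_t push(const T& value)
    {
        const std::uint64_t seq = next_seq_++;
        slots_[seq % N] = Slot{value, seq, true};
        return seq;
    }

    // Returns the value pushed with `seq`, or nullptr if that slot has
    // been overwritten since (or was never written).
    const T* get(std::uint64_t seq) const
    {
        const Slot& slot = slots_[seq % N];
        return (slot.used && slot.seq == seq) ? &slot.value : nullptr;
    }

private:
    struct Slot {
        T value{};
        std::uint64_t seq = 0;
        bool used = false;
    };
    std::array<Slot, N> slots_{};
    std::uint64_t next_seq_ = 0;
};
```

A client keeps only the sequence number it last looked at; a nullptr from get tells it that the element is gone.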
I agree with Roger Lipscombe, use sequence numbers.
But you don't need to store (value, sequence_num) pairs: just store the values, and keep track of the highest sequence number so far. Since it's a ring buffer, you can deduce the seq num for all entries.
Thus, the iterators consist simply of a sequence number.
Given Obj the type of object you store in your ring buffer, if you use a simple array, your ring buffer would look like this:
struct RingBuffer {
    Obj buf[RINGBUFFER_SIZE];
    size_t idx_last_element;
    uint32_t seqnum_last_element;
    void Append(const Obj& obj) { // TODO: Locking needed if multithreaded
        if (idx_last_element == RINGBUFFER_SIZE - 1)
            idx_last_element = 0;
        else
            ++idx_last_element;
        buf[idx_last_element] = obj; // copy.
        ++seqnum_last_element;
    }
};
And the iterator would look like this:
struct RingBufferIterator {
    const RingBuffer* ringbuf;
    uint32_t seqnum;
    bool IsValid() const {
        // Valid if the element is at most RINGBUFFER_SIZE - 1 steps older than the newest one.
        return ringbuf &&
               seqnum <= ringbuf->seqnum_last_element && //TODO: handle seqnum rollover.
               ringbuf->seqnum_last_element - seqnum < RINGBUFFER_SIZE;
    }
    const Obj* ToPointer() const {
        if (!IsValid()) return NULL;
        // Number of steps to go back from the newest element (0 = newest).
        size_t age = ringbuf->seqnum_last_element - seqnum;
        // Unsigned arithmetic cannot go negative, so add RINGBUFFER_SIZE
        // before subtracting and wrap around with the modulus.
        size_t idx = (ringbuf->idx_last_element + RINGBUFFER_SIZE - age) % RINGBUFFER_SIZE;
        return ringbuf->buf + idx;
    }
};
A variation of Roger Lipscombe's answer is to use a sequence number as the iterator. The sequence number should be monotonically increasing (take special care when your integer type overflows) with a fixed step (e.g. 1).
The circular buffer itself would store the data as normal, and would keep track of the oldest sequence number it currently contains (at the tail position).
When dereferencing the iterator, the iterator's sequence number is checked against the buffer's oldest sequence number. If it's bigger or equal (again take special care of integer overflow), the data can be retrieved using a simple index calculation. If it's smaller, it means the data has been overwritten, and the current tail data should be retrieved instead (updating the iterator's sequence number accordingly).
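The dereferencing rule described above could be sketched like this. The names (SeqRing, read, oldest) are made up for illustration, and a 64-bit counter is assumed so that the overflow case is not handled:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical ring buffer whose "iterators" are plain sequence numbers.
template <typename T, std::size_t N>
class SeqRing {
public:
    void push(const T& value) { buf_[next_ % N] = value; ++next_; }

    // Sequence number of the oldest element still stored (the tail).
    std::uint64_t oldest() const { return next_ > N ? next_ - N : 0; }

    // Reads the element at `cursor`. If that element has been overwritten,
    // the cursor is first moved forward to the tail. Returns nullptr when
    // the cursor has caught up with the writer.
    const T* read(std::uint64_t& cursor) const
    {
        if (cursor < oldest())
            cursor = oldest();   // data was lost, resume at the tail
        if (cursor >= next_)
            return nullptr;      // nothing new to read
        return &buf_[cursor % N];
    }

private:
    std::array<T, N> buf_{};
    std::uint64_t next_ = 0;     // sequence number of the next push
};
```

A consumer would call read, process the element, and then increment its cursor; comparing the cursor before and after the call tells it whether events were skipped.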

Find hit/miss in cache c++

I'm struggling with my homework. It asks me to read a trace file in which each line has a reference type and an address in hex. For example, the first line in the file has the address 0x4ef1200231, with the type instruction. It also asks me to check whether each address is a hit or a miss in the cache (in L1 and L2). I'm not quite sure how to write C++ (I'm very new) to check whether it is a hit or a miss.
I'm picturing there is a function, say address(long int), then if I call address(0x4ef1200231) then the console can tell me if this address is a hit or miss at L1, and if it is a miss, then call another function to check this address at L2. Is this too naive? Please help. Thanks.
---few lines in the trace---
4ef1200231 Int
2ff1e0122234 WR
82039ef9a3 R
comment: Int means instruction, WR means data write, R means data read. Question is after reading the whole trace file, how many hits, and misses total. Thanks.
This question is probably too advanced for a C++ beginner, but here's some explanation of how to implement a solution....
First, you need to have a container that mimics the logic used by each level of cache: the simplest (and likely adequate) such container is a Least Recently Used (LRU) data structure. What that does is record a fixed maximum number of in-cache elements, and when an element is accessed it searches for it in the list: if it's found, it's moved to the top/front of the list, displacing the 1st and subsequent list elements until the gap it left behind is again filled. If it's not in the list, then it's also added at the top, with all other elements shifted down to make room, and the last element removed if the list is at its maximum size. To implement an LRU nicely, you need to be able to find elements quickly by value, while inserting and removing elements quickly mid-list. This is best done with a combination of an unordered_map and a list, but implementing that alone is more than you can reasonably be expected to do as a C++ beginner. You could start by using only a list - the searching will be slow (O(n) or linear / brute force), but you can get it working functionally.
Given such an LRU class, you can set the sizes of two instances to represent pages in L1 and L2 cache, then for each address in the input file, you seek that page (say for 4k pages you could divide it by 4096, or bitwise-and it with the bitwise negation of 4095, or bitwise-or it with 4095, or bitshift it right 12 times etc.) in L1, falling back on L2 if necessary. The "is it already in the cache" code can keep hit/miss counters.
Here's some example code to get you started:
#include <algorithm>
#include <cstddef>
#include <list>

template <typename T>
class Dumb_LRU
{
public:
    Dumb_LRU(size_t max_size) : n_(max_size) { }
    // Returns true on a hit and false on a miss; either way t ends up at the front
    bool operator()(const T& t)
    {
        typename std::list<T>::iterator i = std::find(l_.begin(), l_.end(), t);
        if (i == l_.end())
        {
            l_.push_front(t);
            if (l_.size() > n_)
                l_.pop_back();
            return false;
        }
        if (i != l_.begin()) // not already the first element...
        {
            l_.erase(i);
            l_.push_front(t);
        }
        return true;
    }
private:
    std::list<T> l_;
    size_t n_;
};
You can then do your simulation like this:
static const size_t l1_cache_pages = 256;
static const size_t l2_cache_pages = 2048;
static const size_t page_size = 4096;
Dumb_LRU<size_t> l1(l1_cache_pages);
Dumb_LRU<size_t> l2(l2_cache_pages);
size_t address;
std::string doing_what;
int l1_hits = 0, l1_misses = 0, l2_hits = 0, l2_misses = 0;
// The trace stores addresses in hex without a 0x prefix, so parse them as hex
while (std::cin >> std::hex >> address >> doing_what)
{
    if (l1(address / page_size))
        ++l1_hits;
    else
    {
        ++l1_misses;
        if (l2(address / page_size))
            ++l2_hits;
        else
            ++l2_misses;
    }
}
// ...print out hits/misses...
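For reference, the unordered_map plus list combination mentioned earlier in this answer could be sketched as follows. Lru_set and access are made-up names; the map stores list iterators so that a hit can be moved to the front without searching the list.

```cpp
#include <cstddef>
#include <list>
#include <unordered_map>

// Hypothetical LRU set: same hit/miss behaviour as the list-only version,
// but lookups go through a hash map instead of a linear search.
template <typename T>
class Lru_set {
public:
    explicit Lru_set(std::size_t max_size) : max_size_(max_size) {}

    // Records an access; returns true on a hit, false on a miss.
    bool access(const T& key)
    {
        auto found = index_.find(key);
        if (found != index_.end()) {
            // Hit: relink the existing node to the front. splice does not
            // copy the element and keeps all list iterators valid.
            order_.splice(order_.begin(), order_, found->second);
            return true;
        }
        // Miss: insert at the front and evict the least recently used key.
        order_.push_front(key);
        index_[key] = order_.begin();
        if (order_.size() > max_size_) {
            index_.erase(order_.back());
            order_.pop_back();
        }
        return false;
    }

private:
    std::size_t max_size_;
    std::list<T> order_;  // most recently used first
    std::unordered_map<T, typename std::list<T>::iterator> index_;
};
```

With this class, a call such as l1.access(address / page_size) would take the place of l1(address / page_size) in the loop above.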

Better understanding the LRU algorithm

I need to implement a LRU algorithm in a 3D renderer for texture caching. I write the code in C++ on Linux.
In my case I will use texture caching to store "tiles" of image data (16x16 pixel blocks). Now imagine that I do a lookup in the cache and get a hit (the tile is in the cache). How do I return the content of the cache for that entry to the function caller? Let me explain. I imagine that when I load a tile into the cache memory, I allocate the memory to store 16x16 pixels, for example, then load the image data for that tile. Now there are two ways to pass the content of the cache entry to the function caller:
1) either as pointer to the tile data (fast, memory efficient),
TileData *tileData = cache->lookup(tileId); // not safe?
2) or I need to recopy the tile data from the cache within a memory space allocated by the function caller (copy can be slow).
void Cache::lookup(int tileId, float *&tileData)
{
// find tile in cache, if not in cache load from disk add to cache, ...
...
// now copy tile data, safe but isn't that slow?
memcpy((char*)tileData, tileDataFromCache, sizeof(float) * 3 * 16 * 16);
}
float *tileData = new float[3 * 16 * 16]; // need to allocate the memory for that tile
// get tile data from cache, requires a copy
cache->lookup(tileId, tileData);
I would go with 1), but the problem is: what happens if the tile gets deleted from the cache just after the lookup, and the function then tries to access the data through the returned pointer? The only solution I see is to use some form of reference counting (auto_ptr), where the data is only actually deleted when it's no longer in use?
The application might access more than one texture. I can't seem to find a way of creating a key that is unique to each texture and each tile of a texture. For example, I may have tile 1 from file1 and tile 1 from file2 in the cache, so searching on tileId=1 is not enough, but I can't find a way of creating a key that accounts for both the file name and the tileID. I could build a string containing the file name and the tileID (FILENAME_TILEID), but wouldn't a string used as a key be much slower than an integer?
Finally, I have a question regarding time stamps. Many papers suggest using a time stamp for ordering the entries in the cache. What is a good function for obtaining a time stamp? The time() function, clock()? Is there a better way than using time stamps?
Sorry, I realise it's a very long message, but LRU doesn't seem as simple to implement as it sounds.
Answers to your questions:
1) Return a shared_ptr (or something logically equivalent to it). Then all of the "when-is-it-safe-to-delete-this-object" issues pretty much go away.
2) I'd start by using a string as a key, and see if it actually is too slow or not. If the strings aren't too long (e.g. your filenames aren't too long) then you may find it's faster than you expect. If you do find out that string keys aren't efficient enough, you could try something like computing a hashcode for the string and adding the tile ID to it... that would probably work in practice, although there would always be the possibility of a hash collision. But you could have a collision-check routine run at startup that generates all of the possible filename+tileID combinations and alerts you if any two map to the same key value, so that at least you'd know immediately during your testing when there is a problem and could do something about it (e.g. by adjusting your filenames and/or your hashcode algorithm). This assumes that all the filenames and tile IDs are known in advance, of course.
3) I wouldn't recommend using a timestamp, it's unnecessary and fragile. Instead, try something like this (pseudocode):
typedef shared_ptr<TileData> TileDataPtr; // automatic memory management!
linked_list<TileDataPtr> linkedList;
hash_map<data_key_t, TileDataPtr> hashMap;
// This is the method the calling code would call to get its tile data for a given key
TileDataPtr GetData(data_key_t theKey)
{
    if (hashMap.contains_key(theKey))
    {
        // The desired data is already in the cache, great! Just move it to the head
        // of the LRU list (to reflect its popularity) and then return it.
        TileDataPtr ret = hashMap.get(theKey);
        linkedList.remove(ret);     // move this item to the head
        linkedList.push_front(ret); // of the linked list -- this is O(1)/fast
        return ret;
    }
    else
    {
        // Oops, the requested object was not in our cache, load it from disk or whatever
        TileDataPtr ret = LoadDataFromDisk(theKey);
        linkedList.push_front(ret);
        hashMap.put(theKey, ret);
        // Don't let our cache get too large -- delete
        // the least-recently-used item if necessary
        if (linkedList.size() > MAX_LRU_CACHE_SIZE)
        {
            TileDataPtr dropMe = linkedList.tail();
            hashMap.remove(dropMe->GetKey());
            linkedList.remove(dropMe);
        }
        return ret;
    }
}
In the same order as your questions:
Copying over the texture data does not seem reasonable from a performance standpoint. Reference counting sounds far better, as long as you can actually code it safely. The data memory would be freed as soon as it is no longer used by the renderer and no reference to it is stored in the cache.
I assume that you are going to use some sort of hash table for the look-up part of what you are describing. The common solution to your problem has two parts:
Using a suitable hashing function that combines multiple values e.g. the texture file name and the tile ID. Essentially you create a composite key that is treated as one entity. The hashing function could be a XOR operation of the hashes of all elementary components, or something more complex.
Selecting a suitable hash function is critical for performance reasons - if the said function is not random enough, you will have a lot of hash collisions.
Using a suitable composite equality check to handle the case of hash collisions.
This way you can look-up the combination of all attributes of interest in a single hash table look-up.
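A composite key along these lines could be sketched as follows. TileKey, TileKeyHash and TileIndex are made-up names, and the mixing step is just one common shift-and-xor pattern; a plain XOR of the two hashes would also work but tends to collide more.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical composite key: texture file name plus tile ID.
struct TileKey {
    std::string file_name;
    std::uint32_t tile_id;

    // Composite equality check, used to resolve hash collisions.
    bool operator==(const TileKey& other) const
    {
        return tile_id == other.tile_id && file_name == other.file_name;
    }
};

// Combines the hashes of both components into one value.
struct TileKeyHash {
    std::size_t operator()(const TileKey& key) const
    {
        const std::size_t h1 = std::hash<std::string>{}(key.file_name);
        const std::size_t h2 = std::hash<std::uint32_t>{}(key.tile_id);
        return h1 ^ (h2 + 0x9e3779b9u + (h1 << 6) + (h1 >> 2));
    }
};

// Example index type: composite key -> slot number in the cache.
using TileIndex = std::unordered_map<TileKey, int, TileKeyHash>;
```

Tile 1 of file1 and tile 1 of file2 then map to different entries in a single look-up.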
Using timestamps for this is not going to work - period. Most sources regarding caching usually describe the algorithms in question with network resource caching in mind (e.g. HTTP caches). That is not going to work here for three reasons:
Using natural time only makes sense if you intend to implement caching policies that take it into account, e.g. dropping a cache entry after 10 minutes. Unless you are doing something very weird, something like this makes no sense within a 3D renderer.
Timestamps have a relatively low actual resolution, even if you use high precision timers. Most timer sources have a precision of about 1ms, which is a very long time for a processor - in that time your renderer would have worked through several texture entries.
Do you have any idea how expensive timer calls are? Abusing them like this could even make your system perform worse than not having any cache at all...
The usual solution to this problem is to not use a timer at all. The LRU algorithm only needs to know two things:
The maximum number of entries allowed.
The order of the existing entries w.r.t. their last access.
Item (1) comes from the configuration of the system and typically depends on the available storage space. Item (2) generally implies the use of a combined linked list/hash table data structure, where the hash table part provides fast access and the linked list retains the access order. Each time an entry is accessed, it is placed at the end of the list, while old entries are removed from its start.
Using a combined data structure, rather than two separate ones allows entries to be removed from the hash table without having to go through a look-up operation. This improves the overall performance, but it is not absolutely necessary.
As promised I am posting my code. Please let me know if I have made mistakes or if I could improve it further. I am now going to look into making it work in a multi-threaded environment. Again thanks to Jeremy and Thkala for their help (sorry the code doesn't fit the comment block).
#include <cstdlib>
#include <cstdio>
#include <memory>
#include <list>
#include <unordered_map>
#include <cstdint>
#include <iostream>
typedef uint32_t data_key_t;
class TileData
{
public:
TileData(const data_key_t &key) : theKey(key) {}
data_key_t theKey;
~TileData() { std::cerr << "delete " << theKey << std::endl; }
};
typedef std::shared_ptr<TileData> TileDataPtr; // automatic memory management!
TileDataPtr loadDataFromDisk(const data_key_t &theKey)
{
return std::shared_ptr<TileData>(new TileData(theKey));
}
class CacheLRU
{
public:
// the linked list keeps track of the order in which the data was accessed
std::list<TileDataPtr> linkedList;
// the hash map (unordered_map is part of c++0x while hash_map isn't?) gives quick access to the data
std::unordered_map<data_key_t, TileDataPtr> hashMap;
CacheLRU() : cacheHit(0), cacheMiss(0) {}
TileDataPtr getData(data_key_t theKey)
{
std::unordered_map<data_key_t, TileDataPtr>::const_iterator iter = hashMap.find(theKey);
if (iter != hashMap.end()) {
TileDataPtr ret = iter->second;
linkedList.remove(ret);
linkedList.push_front(ret);
++cacheHit;
return ret;
}
else {
++cacheMiss;
TileDataPtr ret = loadDataFromDisk(theKey);
linkedList.push_front(ret);
hashMap.insert(std::make_pair(theKey, ret)); // let make_pair deduce its types; explicit template arguments fail to compile with lvalues in C++11
if (linkedList.size() > MAX_LRU_CACHE_SIZE) {
const TileDataPtr dropMe = linkedList.back();
hashMap.erase(dropMe->theKey);
linkedList.remove(dropMe);
}
return ret;
}
}
static const uint32_t MAX_LRU_CACHE_SIZE = 8;
uint32_t cacheMiss, cacheHit;
};
int main(int argc, char **argv)
{
CacheLRU cache;
for (uint32_t i = 0; i < 238; ++i) {
int key = random() % 32;
TileDataPtr tileDataPtr = cache.getData(key);
}
std::cerr << "Cache hit: " << cache.cacheHit << ", cache miss: " << cache.cacheMiss << std::endl;
return 0;
}

Queue + Stack C++

How do you push items to the front of the array (like a stack) without starting at MAXSIZE-1? I've been trying to use the modulus operator to do so.
bool quack::pushFront(const int nPushFront)
{
if ( count == maxSize ) // indicates a full array
{
return false;
}
else if ( count == 0 )
{
++count;
items[0].n = nPushFront;
return true;
}
intBack = intFront;
items[++intBack] = items[intFront];
++count;
items[(top+(count)+maxSize)%maxSize].n = nPushFront;
/*
for ( int shift = count - 1; shift >= 0; --shift )
{
items[shift] = items[shift-1];
}
items[top+1].n = nPushFront; */
return true;
}
"quack" meaning a cross between a queue and a stack. I cannot simply shift my elements by 1 because it is terribly inefficient. I've been working on this for over a month now. I just need some guidance on how to push_front using the modulus operator... I don't think a loop is even necessary.
It's funny, because I will need to print the list at arbitrary times. So if I start adding values at the MAXSIZE-1 element of my integer array and then need to print the array, I will have garbage values.
not actual code:
pushFront(2);
pushFront(4);
cout << q;
If we started adding from the back, I would get several null values.
I cannot simply shift the array elements down or up by one.
I can't use the STL or Boost.
Not sure what your problem is. Are you trying to implement a queue (which also can work as a stack, no need for your quack) as a ring buffer?
In that case, you need to save both a front and a back index. The mechanics are described in the article linked above. Pay attention to the “Difficulties” section: in particular, you need to either have an extra variable or pay attention to leave one field empty – otherwise you won’t know how to differentiate between a completely empty and a completely full queue.
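A minimal sketch of such a ring buffer, assuming a fixed capacity and made-up names (Quack, pushFront, pushBack, at). It keeps a front index plus an element count, the count being the "extra variable" that distinguishes an empty buffer from a full one:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-capacity double-ended queue of ints, no STL containers.
class Quack {
public:
    static constexpr std::size_t maxSize = 8;  // assumed capacity

    bool pushFront(int value)
    {
        if (count_ == maxSize) return false;        // full
        // Step the front index back by one; adding maxSize first keeps
        // the unsigned arithmetic from underflowing.
        front_ = (front_ + maxSize - 1) % maxSize;
        items_[front_] = value;
        ++count_;
        return true;
    }

    bool pushBack(int value)
    {
        if (count_ == maxSize) return false;        // full
        items_[(front_ + count_) % maxSize] = value; // slot after the last
        ++count_;
        return true;
    }

    // i-th element counted from the logical front.
    int at(std::size_t i) const
    {
        assert(i < count_);
        return items_[(front_ + i) % maxSize];
    }

    std::size_t size() const { return count_; }

private:
    int items_[maxSize] = {};
    std::size_t front_ = 0;
    std::size_t count_ = 0;
};
```

Printing then walks at(0) up to at(size() - 1), so slots that were never written are not visited, no matter where in the array the data physically starts.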
Well, it seems kind of silly to rule out the stl, since std::deque is exactly what you want. Amortized constant time random access. Amortized constant insert/removal time from both the front and the back.
This can be achieved with an array with extra space at the beginning and end. When you run out of space at either end, allocate a new array with twice the space and copy everything over, again with space at both the end and the beginning. You need to keep track of the beginning index and the end index in your class.
It seems to me that you have some conflicting requirements:
You have to push to the head of a C++ array primitive.
Without shifting all of the existing elements.
Maintain insertion order.
Short answer: You can't do it, as the above requirements are mutually exclusive.
One of these requirements has to be relaxed.
To help you without having to guess, we need more information about what you are trying to do.