I'm looking for an implementation of a stack allocated 2d array (array of arrays) which supports O(1) reads. You can guess what I mean from the below picture. Black are filled entries, white are possible gaps from erasure.
The implementation should allow fast random access O(1) on each element but also allow insertion and erase operations which do not shift around too many elements (there might be string objects located there).
Information for each array is held in an object like this:
struct array
{
iterator begin_;
iterator end_;
array* next_;
array* prev_;
};
It contains information on where this particular array starts and the memory neighbours (prev_ and next_).
I'm looking for a proven battle hardened algorithm for insertion and erasing that I can rely on. I tried constructing a few on my own but they tend to become very complicated very quickly.
Hurdles:
When arrays are shifted, each updated array needs to somehow receive the memo (adapt begin and end pointers).
Array objects will be themselves located in an array. This means that with every additional data member of struct array, the memory requirements of the whole thing will grow by member_size * 2d_array_size.
I'm open for all suggestions!
I am thinking of an idea, where we can segment the storage into different segments of size n. entire buffer size will be the multiple of n.
When a new array is to be initialized, we allocate a segment to it. normal array operations can be performed there. when it needs more space, it request for one more segment, and if more segment space available we allocate it to them to extend it.
In this case, Minimum length of an array cannot go below segment size n. this size n can be fine tuned as per requirement for better space efficiency and utilization.
Each segment is numbered, so we can calculate the index of an element and fetch it in O(1).
Sample program (In Python):
class segment:
def __init__(self,number):
self.number=number
class storage:
def __init__(self):
self.size=100
self.default_value=None
self.array=[self.default_value]*self.size
self.segment_size=5
self.number_of_segments =len(self.array)//self.segment_size
self.segment_map=[None]*self.number_of_segments
def store(self,index,value):
if index<self.size and index>=0:
self.array[index]=value
def get(self,index):
return self.array[index]
def get_next_segment(self):
new_seg=None
for i,seg in enumerate(self.segment_map):
if seg == self.default_value:
new_seg= segment(i)
break
self.occupy_segment(new_seg)
self.clean_segment(new_seg)
return new_seg
def occupy_segment(self,seg):
self.segment_map[seg.number]=True
def free_segment(self,seg):
self.segment_map[seg.number]=self.default_value
def destroy_segment(self,seg):
self.clean_segment(seg)
self.free_segment(seg)
def clean_segment(self,segment):
if segment==None:
return
segment_start_index=((segment.number) * (self.segment_size)) + 0
segment_end_index=((segment.number) * (self.segment_size)) + self.segment_size
for i in range(segment_start_index,segment_end_index):
self.store(i,self.default_value)
class array:
def __init__(self,storage):
self.storage=storage
self.length=0
self.segments=[]
self.add_new_segment()
def add_new_segment(self):
new_segment=self.storage.get_next_segment()
if new_segment==None:
raise Exception("Out of storage")
self.segments.append(new_segment)
def is_full(self):
return self.length!=0 and self.length%self.storage.segment_size==0
def calculate_storage_index_of(self,index):
segment_number=index//self.storage.segment_size
element_position=index%self.storage.segment_size
return self.segments[segment_number].number * self.storage.segment_size + element_position
def add(self,value):
if self.is_full():
self.add_new_segment()
last_segement=self.segments[-1]
element_position=0
if self.length!=0:
element_position=(self.length%self.storage.segment_size)
index=(last_segement.number*self.storage.segment_size)+element_position
self.__store(index,value)
self.length+=1
def __store(self,index,value):
self.storage.store(index,value)
def update(self,index,value):
self.__store(
self.calculate_storage_index_of(index),
value
)
def get(self,index):
return self.storage.get(self.calculate_storage_index_of(index))
def destroy(self):
for seg in self.segments:
self.storage.destroy_segment(seg)
st=storage()
array1=array(st)
array1.add(3)
Hi I did not have enough time to implement a full solution (and no time to complete it) but here is the direction I was taking (live demo : https://onlinegdb.com/sp7spV_Ui)
#include <cassert>
#include <array>
#include <stdexcept>
#include <vector>
#include <iostream>
namespace meta_array
{
template<typename type_t, std::size_t N> class meta_array_t;
// template internal class not for public use.
namespace details
{
// a block contains the meta information on a subarray within the meta_array
template<typename type_t, std::size_t N>
class meta_array_block_t
{
public:
// the iterator within a block is of the same type as that of the containing array
using iterator_t = typename std::array<type_t, N>::iterator;
/// <summary>
///
/// </summary>
/// <param name="parent">parent, this link is needed if blocks need to move within the parent array</param>
/// <param name="begin">begin iterator of the block</param>
/// <param name="end">end iterator of the block (one past last)</param>
/// <param name="size">cached size (to not have to calculate it from iterator differences)</param>
meta_array_block_t(meta_array_t<type_t, N>& parent, const iterator_t& begin, const iterator_t& end, std::size_t size) :
m_parent{ parent },
m_begin{ begin },
m_end{ end },
m_size{ size }
{
}
// the begin and end methods allow a block to be used in a range based for loop
iterator_t begin() const noexcept
{
return m_begin;
}
iterator_t end() const noexcept
{
return m_end;
}
// operation to shrink the size of the last free block in the meta-array
void move_begin(std::size_t n) noexcept
{
assert(n <= m_size);
m_size -= n;
m_begin += n;
}
// operation to move a block n items back in the meta array
void move_to_back(std::size_t n) noexcept
{
m_begin += n;
m_end += n;
}
std::size_t size() const noexcept
{
return m_size;
}
// assign a new array to the sub array
// if the new array is bigger then the array that is already there
// then move the blocks after it toward the end of the meta-array
template<std::size_t M>
meta_array_block_t& operator=(const type_t(&values)[M])
{
// move all other sub-arrays back if the new sub-array is bigger
// if it is smaller then adjusting the end iterator of the block is fine
if (M > m_size)
{
m_parent.move_back(m_end, M - m_size);
}
m_size = M;
// copy will do the right thing (copy from back to front) if needed
std::copy(std::begin(values), std::end(values), m_begin);
m_end = m_begin + m_size;
return *this;
}
private:
meta_array_t<type_t, N>& m_parent;
std::size_t m_index;
iterator_t m_begin;
iterator_t m_end;
std::size_t m_size;
};
} // details
//---------------------------------------------------------------------------------------------------------------------
//
template<typename type_t, std::size_t N>
class meta_array_t final
{
public:
meta_array_t() :
m_free_size{ N },
m_size{ 0ul },
m_last_free_block{ *this, m_buffer.begin(), m_buffer.end(), N }
{
}
~meta_array_t() = default;
// meta_array is non copyable & non moveable
meta_array_t(const meta_array_t&) = delete;
meta_array_t operator=(const meta_array_t&) = delete;
meta_array_t(meta_array_t&&) = delete;
meta_array_t operator=(meta_array_t&&) = delete;
// return the number of subarrays
std::size_t array_count() const noexcept
{
return m_size;
}
// return number of items that can still be allocated
std::size_t free_size() const noexcept
{
return m_free_size;
}
template<std::size_t M>
std::size_t push_back(const type_t(&values)[M])
{
auto block = allocate(M);
std::copy(std::begin(values), std::end(values), block.begin());
return m_blocks.size();
}
auto begin()
{
return m_blocks.begin();
}
auto end()
{
return m_blocks.end();
}
auto& operator[](const std::size_t index)
{
assert(index < m_size);
return m_blocks[index];
}
private:
friend class details::meta_array_block_t<type_t, N>;
void move_back(std::array<type_t,N>::iterator begin, std::size_t offset)
{
std::copy(begin, m_buffer.end() - offset - 1, begin + offset);
// update block administation
for (auto& block : m_blocks)
{
if (block.begin() >= begin )
{
block.move_to_back(offset);
}
}
}
auto allocate(std::size_t size)
{
if ((size == 0ul) || (size > m_free_size)) throw std::bad_alloc();
if (m_last_free_block.size() < size)
{
compact();
}
m_blocks.push_back({ *this, m_last_free_block.begin(), m_last_free_block.begin() + size, size });
m_last_free_block.move_begin(size);
m_free_size -= size;
m_size++;
return m_blocks.back();
}
void compact()
{
assert(false); // not implemented yet
// todo when a gap is found between 2 sub-arrays (compare begin/end iterators) then move
// the next array to the front
// the array after that will move to the front by the sum of the gaps ... etc...
}
std::array<type_t, N> m_buffer;
std::vector<details::meta_array_block_t<type_t,N>> m_blocks;
details::meta_array_block_t<type_t,N> m_last_free_block;
std::size_t m_size;
std::size_t m_free_size;
};
} // meta_array
//---------------------------------------------------------------------------------------------------------------------
#define ASSERT_TRUE(x) assert(x);
#define ASSERT_FALSE(x) assert(!x);
#define ASSERT_EQ(x,y) assert(x==y);
static constexpr std::size_t test_buffer_size = 16;
template<typename type_t, std::size_t N>
void show_arrays(meta_array::meta_array_t<type_t, N>& meta_array)
{
std::cout << "\n--- meta_array ---\n";
for (const auto& sub_array : meta_array)
{
std::cout << "sub array = ";
auto comma = false;
for (const auto& value : sub_array)
{
if (comma) std::cout << ", ";
std::cout << value;
comma = true;
}
std::cout << "\n";
}
}
void test_construction()
{
meta_array::meta_array_t<int, test_buffer_size> meta_array;
ASSERT_EQ(meta_array.array_count(),0ul);
ASSERT_EQ(meta_array.free_size(),test_buffer_size);
}
void test_push_back_success()
{
meta_array::meta_array_t<int, test_buffer_size> meta_array;
meta_array.push_back({ 1,2,3 });
meta_array.push_back({ 14,15 });
meta_array.push_back({ 26,27,28,29 });
ASSERT_EQ(meta_array.array_count(),3ul); // cont
ASSERT_EQ(meta_array.free_size(),(test_buffer_size-9ul));
}
void test_range_based_for()
{
meta_array::meta_array_t<int, test_buffer_size> meta_array;
meta_array.push_back({ 1,2,3 });
meta_array.push_back({ 14,15 });
meta_array.push_back({ 26,27,28,29 });
show_arrays(meta_array);
}
void test_assignment()
{
meta_array::meta_array_t<int, test_buffer_size> meta_array;
meta_array.push_back({ 1,2,3 });
meta_array.push_back({ 4,5,6 });
meta_array.push_back({ 7,8,9 });
meta_array[0] = { 11,12 }; // replace with a smaller array then what there was
meta_array[1] = { 21,22,23,24 }; // replace with a bigger array then there was
show_arrays(meta_array);
}
//---------------------------------------------------------------------------------------------------------------------
int main()
{
test_construction();
test_push_back_success();
test_range_based_for();
test_assignment();
return 0;
}
Related
I've been thinking if it is possible to actually let the user decide how many "dimensions" an array should have, based on a number given.
Let n be the number of dimensions, the user will type in its value.
It will create an array with n dimensions.
Example: for n=5, it will create an array called list, like that: int list[size1][size2][size3][size4][size5].
size variables will still be mentioned by the user, but that's actually part 2.
I want to know if I can add more dimensions to an array, after I have declared it. And if not, I want to find a solution to this problem.
The C++ language does not have provision for variable-sized or variable-dimensioned arrays.
You can, however, create a class to encapsulate these behaviors.
The important characteristic is the dimensions. You can use a std::vector<int> to track the number of elements per dimension; for example, {3, 4, 5} to represent a three-dimensional matrix where the rank of the innermost dimension is 3, the middle 4, and the outer 5.
Use a templated vector or deque to allocate space for the elements. The number of elements required is the product of the dimension ranks. (You can use std::accumulate with a multiplication operator to compute this over your ranks vector.)
Next, you'll need a method that takes some object (say, a vector of int) that provides all the indices into the MD-array necessary to access an element. You can provide overloads that take a variable number of arguments using some fancy template metaprogramming.
All of this is overkill outside of some very specialized uses, such as: you are writing Mathematica-like software that allows users to play with these things.
You may be interested in an array class I implemented a few months ago that aims to provide a syntax for arrays that mimics that of matlab arrays. It utilizes initilizer_list syntax to allow for arbitrary dimensional arrays to be created using
Array<double> array({10, 20, 30});
You can then access and modify individual elements using
double d = array[{1, 2, 3}];
array[{1, 2, 3}] = 10;
And even slice the matrix up into pieces using
array.getSlice({___, 3, 4});
where "___" is used as a wildcard.
See more on: http://www.second-quantization.com/Array.html
Implementation: https://github.com/dafer45/TBTK/blob/master/Lib/include/Utilities/TBTK/Array.h
Solution for object let's the user choose the number of dimensions. A little robust, my C++ maybe is not the best, but it was fun implementing. nvector<T> represents resizable (in dimensions and count of elements in each dimension) array of element of T type, although only some resize functions are implemented. narray<T> is the same, but the number of dimensions is not resizable. This works around the idea of recalculating index position of a multidimensional array using a single continuous array.
#include <cstdio>
#include <vector>
#include <iostream>
#include <cstddef>
#include <cstdarg>
#include <algorithm>
#include <numeric>
#include <cassert>
#include <memory>
#include <cstring>
using namespace std;
template<typename T>
class narray {
public:
static size_t compute_size(initializer_list<size_t>& dims) {
return accumulate(dims.begin(), dims.end(), 1, multiplies<size_t>());
}
static size_t compute_size(vector<size_t>& dims) {
return accumulate(dims.begin(), dims.end(), 1, multiplies<size_t>());
}
static size_t compute_distance(vector<size_t>& dims) {
return dims.size() > 1 ? dims[1] : 1;
}
static vector<size_t> remove_one_dim(vector<size_t> dims_) {
return vector<size_t>(dims_.begin() + 1, dims_.end());
}
narray(initializer_list<size_t> dims, T* data) :
dims_(dims), data_(data) {}
narray(vector<size_t> dims, T* data) :
dims_(dims), data_(data) {}
T operator*() {
return *data_;
}
T* operator&() {
return data_;
}
void operator=(T v) {
if (dims_.size() != 0)
throw runtime_error(__PRETTY_FUNCTION__);
*data_ = v;
}
void operator=(initializer_list<T> v) {
if (v.size() > size())
throw runtime_error(__PRETTY_FUNCTION__);
copy(v.begin(), v.end(), data_);
}
T* data() {
return data_;
}
T* data_last() {
return &data()[compute_size(dims_)];
}
size_t size() {
return compute_size(dims_);
}
size_t size(size_t idx) {
return dims_[idx];
}
narray<T> operator[](size_t idx) {
if (dims_.size() == 0)
throw runtime_error(__PRETTY_FUNCTION__);
return narray<T>(remove_one_dim(dims_),
&data_[idx * compute_distance(dims_)]);
}
class iterator {
public:
iterator(initializer_list<size_t>& dims, T* data) :
dims_(dims), data_(data) { }
iterator(vector<size_t>& dims, T* data) :
dims_(dims), data_(data) { }
iterator operator++() {
iterator i = *this;
data_ += compute_distance(dims_);
return i;
}
narray<T> operator*() {
return narray<T>(remove_one_dim(dims_), data_);
}
bool operator!=(const iterator& rhs) {
if (dims_ != rhs.dims_)
throw runtime_error(__PRETTY_FUNCTION__);
return data_ != rhs.data_;
}
private:
vector<size_t> dims_;
T* data_;
};
iterator begin() {
return iterator(dims_, data());
}
iterator end() {
return iterator(dims_, data_last());
}
private:
vector<size_t> dims_;
T* data_;
};
template<typename T>
class nvector {
public:
nvector(initializer_list<size_t> dims) :
dims_(dims), data_(narray<T>::compute_size(dims)) {}
nvector(vector<size_t> dims) :
dims_(dims), data_(narray<T>::compute_size(dims)) {}
nvector(initializer_list<size_t> dims, T* data) :
dims_(dims), data_(data) {}
nvector(vector<size_t> dims, T* data) :
dims_(dims), data_(data) {}
T* data() {
return data_.data();
}
T* data_last() {
return &data()[narray<T>::compute_size(dims_)];
}
size_t size() {
return narray<T>::compute_size(dims_);
}
narray<T> operator&() {
return narray<T>(dims_, data());
}
narray<T> operator[](size_t idx) {
if (dims_.size() == 0)
throw runtime_error(__PRETTY_FUNCTION__);
return narray<T>(narray<T>::remove_one_dim(dims_),
&data()[idx * narray<T>::compute_distance(dims_)]);
}
void operator=(initializer_list<T> v) {
if (v.size() > size())
throw runtime_error(__PRETTY_FUNCTION__);
copy(v.begin(), v.end(), data_.begin());
}
auto begin() {
return typename narray<T>::iterator(dims_, data());
}
auto end() {
return typename narray<T>::iterator(dims_, data_last());
}
// add and remove dimensions
void dimension_push_back(size_t dimsize) {
dims_.push_back(dimsize);
data_.resize(size());
}
void dimension_pop_back() {
dims_.pop_back();
data_.resize(size());
}
// TODO: resize dimension of index idx?
private:
vector<size_t> dims_;
vector<T> data_;
};
int main()
{
nvector<int> A({2, 3});
A = { 1,2,3, 4,5,6 };
assert(A.size() == 6);
assert(&A[0] == &A.data()[0]);
assert(&A[0][0] == &A.data()[0]);
assert(&A[1] == &A.data()[3]);
assert(&A[0][1] == &A.data()[1]);
assert(&A[1][1] == &A.data()[4]);
cout << "Currently array has " << A.size() << " elements: " << endl;
for(narray<int> arr1 : A) { // we iterate over arrays/dimensions
for(narray<int> arr2 : arr1) { // the last array has no dimensions
cout << "elem: " << *arr2 << endl;
}
}
cout << endl;
// assigment example
cout << "Now it is 4: " << *A[1][0] << endl;
A[1][0] = 10;
cout << "Now it is 10: " << *A[1][0] << endl;
return 0;
}
This code needs still much more work. It works only as a simple example. Maybe use shared_ptr in narray? Implement better exceptions?
So creating an array of n=5 dimensions with sizes size1, size2, size3, size4 and size5 would like this:
narray<int> arr({size1, size2, size3, size4, size5});
arr[0][1][2][3][4] = 5; // yay!
I am given series of ranges and I need to iterate each number in any of the ranges exactly once. The ranges may overlap and contain the same numbers.
The numbers in the range are
using Number = uint32_t;
Ranges are of this form
struct Range {
Number first;
Number last;
Number interval;
};
Just to clarify the representation of Range.
Range range = {
2, //first
14, //last
3 //interval
};
//is equivalent to...
std::vector<Number> list = {2, 5, 8, 11, 14};
I have a few Ranges and I need to efficiently iterate all of the numbers in any order only once.
How do I efficiently iterate a set of ranges?
Also, Is there there a more efficient algorithm if interval is always 1?
For each range, remember the "current" value (going from first to last with the step size). Put that along with the range in a priority queue, sorted after the current value.
Take the top out, if its current value is different from the last, then use it. Then, insert the next step if there is another.
Assumes positive step size.
template<typename Iterator, typename Operation>
void iterate_ranges (Iterator from, Iterator to, Operation op) {
using R = typename std::iterator_traits<Iterator>::value_type;
using N = typename std::decay<decltype(std::declval<R>().first)>::type;
using P = std::pair<N, R>;
auto compare = [](P const & left, P const & right) {
return left.first > right.first;};
std::priority_queue<P, std::vector<P>, decltype(compare)> queue(compare);
auto push = [& queue] (P p) {
if (p.first < p.second.last) queue.push(p); };
auto next = [](P const & p) -> P {
assert(p.second.step > 0);
return {p.first + p.second.step, p.second}; };
auto init = [&push] (R const & r) {
push({r.first, r}); };
std::for_each(from, to, init);
if (queue.empty()) return;
N last = queue.top().first;
push(next(queue.top()));
queue.pop();
op(last);
while (! queue.empty()) {
P current = queue.top();
queue.pop();
if (current.first != last) {
op(current.first);
last = current.first;
}
push(next(current));
}
}
Memory requirement: linear in the number of ranges. Time requirement: sum of all step counts within each range.
Small example:
struct Range {
int first;
int last;
int step; // a better name ...
};
int main() {
Range ranges [] = {
{1, 10, 2},
{2, 50, 5}};
auto print = [](auto n) { std::cout << n << std::endl; };
iterate_ranges(std::begin(ranges), std::end(ranges), print);
}
To get all numbers in a vector, use a lambda with a reference to a vector and push back each one.
Is there there a more efficient algorithm if interval is always 1?
You could add that as a special case, but I don't think it will be necessary. If you only got ~50 ranges, then above push won't be that expensive. Though, with all optimisation: profile first!
If the sequences are very long you might like to just take each result in order, without storing the list, discarding duplicates as you go.
#include <vector>
// algorithm to interpolate integer ranges/arithmetic_sequences
template<typename ASqs, typename Action>
void arithmetic_sequence_union(ASqs arithmetic_sequences, Action action)
{
using ASq = ASqs::value_type;
using T = ASq::value_type;
std::vector<ASq> remaining_asqs(begin(arithmetic_sequences), end(arithmetic_sequences));
while (remaining_asqs.size()) {
// get next value
T current_value = **std::min_element(begin(remaining_asqs), end(remaining_asqs),
[](auto seq1, auto seq2) { return *seq1 < *seq2; }
);
// walk past this value and any duplicates, dropping any completed arithmetic_sequence iterators
for (size_t seq_index = 0; seq_index < remaining_asqs.size(); )
{
ASq &asq = remaining_asqs[seq_index];
if (current_value == *asq // do we have the next value in this sequence?
&& !++asq) { // consume it; was it the last value in this sequence?
remaining_asqs.erase(begin(remaining_asqs) + seq_index);//drop the empty sequence
}
else {
++seq_index;
}
}
action(current_value);
}
}
This wants the range presented in a "generator"-type object. Would probably look very like the implementation of checked a iterator, but iterators don't expose the notion of knowing they are at the end of the sequence so we might have to roll our own simple generator.
template <typename ValueType, typename DifferenceType>
class arithmetic_sequence {
public:
using value_type = ValueType;
using difference_type = DifferenceType;
arithmetic_sequence(value_type start, difference_type stride, value_type size) :
start_(start), stride_(stride), size_(size) {}
arithmetic_sequence() = default;
operator bool() { return size_ > 0; }
value_type operator*() const { return start_; }
arithmetic_sequence &operator++() { --size_; start_ += stride_; return *this;}
private:
value_type start_;
difference_type stride_;
value_type size_;
};
Test example:
#include "sequence_union.h"
#include "arithmetic_sequence.h"
#include <cstddef>
#include <array>
#include <algorithm>
#include <iostream>
using Number = uint32_t;
struct Range {
Number first;
Number last;
Number interval;
};
using Range_seq = arithmetic_sequence<Number, Number>;
Range_seq range2seq(Range range)
{
return Range_seq(range.first, range.interval, (range.last - range.first) / range.interval + 1 );
}
int main() {
std::array<Range, 2> ranges = { { { 2,14,3 },{ 2,18,2 } } };
std::array<Range_seq, 2> arithmetic_sequences;
std::transform(begin(ranges), end(ranges), begin(arithmetic_sequences), range2seq);
std::vector<size_t> results;
arithmetic_sequence_union(
arithmetic_sequences,
[&results](auto item) {std::cout << item << "; "; }
);
return 0;
}
// output: 2; 4; 5; 6; 8; 10; 11; 12; 14; 16; 18;
I am new to C++ world and I need a help. My problem is I try implement my structure hash pair array, there is key and data. In this structure I have nested structure iterator with methods hasNext and next. Because I can not see my array (this array is in parent) from nested structure I need pass it through constructor, but there is error ": cannot convert from...", problem is with pass _array in method getIterator. Code is below. Could you help me? Thanks
#pragma once
template<typename T, typename U, int Size, int(*HashFunction)(T)>
struct HashPairPole {
// Pair - key - data
struct Par {
// key
T _first;
// data
U _second;
// list for collision records
Par* _overflow;
Par(T t, U u) {
_first = t;
_second = u;
_overflow = nullptr;
}
};
HashParovePole() {}
// Static array for save data
Par* _array[Size];
// Add record into hash table
void add(T t, U u) {
// calculating of index
Par* prvek;
int idx = HashFunction(t) % Size;
// Element will be saved in _array[idx], if it is free, else will be
//saved to list (->_overflow)
prvek = new Par(t, u);
if (_array[idx] == nullptr) {
_array[idx] = prvek;
}
else {
prvek->_overflow = _array[idx];
}
_array[idx] = prvek;
}
// Get data from hash tabule
U& get(T t) {
int idx = HashFunction(t) % Size;
Par * prvni = _array[idx];
while (prvni->_overflow != nullptr) {
if (prvni->_first == t) {
return prvni->_second;
}
prvni = prvni->_overflow;
}
}
U& operator[](T t) {
return get(t);
}
U operator[](T t) const {
const U temp = get(t);
return temp;
}
// Iterator for walking all hash table
struct iterator {
Par* index[Size];
Par* pomPar;
int temp = 0;
iterator(Par * _array) {
index = _array;
pomPar = index[0];
}
bool hasNext()const {
return pomPar != nullptr;
}
std::pair<T, U> next() {
std::pair<T, U> data;
if (hasNext()) {
data.first = pomPar->_first;
data.second = pomPar->_second;
pomPar = pomPar->_overflow;
}
temp++;
pomPar = index[temp];
return data;
}
};
// Vytvori iterator
iterator getIterator() {
return iterator(_array);
}
};
As far as I see, the problem is in this line:
Par* _array[Size];
Here you declare an array of size Size of pointers to Par structures, which is probably not what you want.
Later you try to pass this array to constructor iterator(Par * _array), which accepts a pointer to Par structure, which is impossible.
I would fix this code in the following way:
Par _array[Size]; // Instead of Par* _array[Size]
// You need an array of structures instead of array of pointers
...
Par* index; // Instead of Par* index[Size]
// Here looks like index is a pointer to a current element
...
pomPar = index; // Instead of pomPar = index[0];
// This is a pointer to the node, while index[0] is its value
Also, consider using std::vector instead of raw pointers. It will handle memory management issues for you.
I'm trying to keep a vector of commands so that it keeps 10 most recent. I have a push_back and a pop_back, but how do I delete the oldest without shifting everything in a for loop? Is erase the only way to do this?
Use std::deque which is a vector-like container that's good at removal and insertion at both ends.
If you're amenable to using boost, I'd recommend looking at circular_buffer, which deals with this exact problem extremely efficiently (it avoids moving elements around unnecessarily, and instead just manipulates a couple of pointers):
// Create a circular buffer with a capacity for 3 integers.
boost::circular_buffer<int> cb(3);
// Insert threee elements into the buffer.
cb.push_back(1);
cb.push_back(2);
cb.push_back(3);
cb.push_back(4);
cb.push_back(5);
The last two ops simply overwrite the elements of the first two.
Write a wrapper around a vector to give yourself a circular buffer. Something like this:
include <vector>
/**
Circular vector wrapper
When the vector is full, old data is overwritten
*/
class cCircularVector
{
public:
// An iterator that points to the physical begining of the vector
typedef std::vector< short >::iterator iterator;
iterator begin() { return myVector.begin(); }
iterator end() { return myVector.end(); }
// The size ( capacity ) of the vector
int size() { return (int) myVector.size(); }
void clear() { myVector.clear(); next = 0; }
void resize( int s ) { myVector.resize( s ); }
// Constructor, specifying the capacity
cCircularVector( int capacity )
: next( 0 )
{
myVector.resize( capacity );
}
// Add new data, over-writing oldest if full
void push_back( short v )
{
myVector[ next] = v;
advance();
}
int getNext()
{
return next;
}
private:
std::vector< short > myVector;
int next;
void advance()
{
next++;
if( next == (int)myVector.size() )
next = 0;
}
};
What about something like this:
http://ideone.com/SLSNpc
Note: It's just a base, you still need to work a bit on it. The idea is that it's easy to use because it has it's own iterator, which will give you the output you want. As you can see the last value inserted is the one shown first, which I'm guessing is what you want.
#include <iostream>
#include <vector>
template<class T, size_t MaxSize>
class TopN
{
public:
void push_back(T v)
{
if (m_vector.size() < MaxSize)
m_vector.push_back(v);
else
m_vector[m_pos] = v;
if (++m_pos == MaxSize)
m_pos = 0;
}
class DummyIterator
{
public:
TopN &r; // a direct reference to our boss.
int p, m; // m: how many elements we can pull from vector, p: position of the cursor.
DummyIterator(TopN& t) : r(t), p(t.m_pos), m(t.m_vector.size()){}
operator bool() const { return (m > 0); }
T& operator *()
{
static T e = 0; // this could be removed
if (m <= 0) // if someone tries to extract data from an empty vector
return e; // instead of throwing an error, we return a dummy value
m--;
if (--p < 0)
p = MaxSize - 1;
return r.m_vector[p];
}
};
decltype(auto) begin() { return m_vector.begin(); }
decltype(auto) end() { return m_vector.end(); }
DummyIterator get_dummy_iterator()
{
return DummyIterator(*this);
}
private:
std::vector<T> m_vector;
int m_pos = 0;
};
template<typename T, size_t S>
void show(TopN<T,S>& t)
{
for (auto it = t.get_dummy_iterator(); it; )
std::cout << *it << '\t';
std::cout << std::endl;
};
int main(int argc, char* argv[])
{
TopN<int,10> top10;
for (int i = 1; i <= 10; i++)
top10.push_back(5 * i);
show(top10);
top10.push_back(60);
show(top10);
top10.push_back(65);
show(top10);
return 0;
}
I'm creating a pre-allocator with dynamic memory chunk size, and I need to unify contiguous memory chunks.
struct Chunk // Chunk of memory
{
Ptr begin, end; // [begin, end) range
}
struct PreAlloc
{
std::vector<Chunk> chunks; // I need to unify contiguous chunks here
...
}
I tried a naive solution, that, after sorting the chunks based on their begin, basically did a pass through the vector checking if the next chunk's begin was equal to the current chunk's end. I'm sure it could be improved.
Is there a good algorithm to unify contiguous ranges?
Information:
Chunks can never "overlap".
Chunks can have any size greater than 0.
Performance is the most important factor.
NOTE: there was an error in my original algorithm, where I only considered blocks to the left of the current block.
Use two associative tables (e.g. unordered_map), one mapping the begin address to the Chunk, another mapping the end to the Chunk. This lets you find the neighbouring blocks quickly. Alternatively, you can change the Chunk struct to store a pointer/id/whatever to the neighbouring Chunk, plus a flag to mark to tell if it's free.
The algorithm consists of scanning the vector of chunks once, while maintaining the invariant: if there is a neighbour to the left, you merge them; if there is a neighbour to the right, you merge them. At the end, just collect the remaining chunks.
Here's the code:
void unify(vector<Chunk>& chunks)
{
unordered_map<Ptr, Chunk> begins(chunks.size() * 1.25); // tweak this
unordered_map<Ptr, Chunk> ends(chunks.size() * 1.25); // tweak this
for (Chunk c : chunks) {
// check the left
auto left = ends.find(c.begin);
if (left != ends.end()) { // found something to the left
Chunk neighbour = left->second;
c.begin = neighbour.begin;
begins.erase(neighbour.begin);
ends.erase(left);
}
// check the right
auto right = begins.find(c.end);
if (right != begins.end()) { // found something to the right
Chunk neighbour = right->second;
c.end = neighbour.end;
begins.erase(right);
ends.erase(neighbour.end);
}
begins[c.begin] = c;
ends[c.end] = c;
}
chunks.clear();
for (auto x : begins)
chunks.push_back(x.second);
}
The algorithm has O(n) complexity assuming constant time access to the begins and ends tables (which is nearly what you get if you don't trigger rehashing, hence the "tweak this" comments). There are quite a few options to implement associative tables, make sure to try a few different alternatives; as pointed out in the comment by Ben Jackson, a hash table doesn't always make good use of cache, so even a sorted vector with binary searches might be faster.
If you can change the Chunk structure to store left/right pointers, you get a guaranteed O(1) lookup/insert/remove. Assuming you are doing this to consolidate free chunks of memory, the left/right checking can be done in O(1) during the free() call, so there is no need to consolidate it afterwards.
I think you can not do better then N log(N) - the naive approach. The idea using an unordered associative container I dislike - the hashing will degenerate performance. An improvement might be: keep the chunks sorted at each insert, making 'unify' O(N).
It seems you are writing some allocator, hence I dig up some old code of mine (with some adjustment regarding C++ 11 and without any warranty). The allocator is for small objects having a size <= 32 * sizeof(void*).
Code:
// Copyright (c) 1999, Dieter Lucking.
//
// Permission is hereby granted, free of charge, to any person or organization
// obtaining a copy of the software and accompanying documentation covered by
// this license (the "Software") to use, reproduce, display, distribute,
// execute, and transmit the Software, and to prepare derivative works of the
// Software, and to permit third-parties to whom the Software is furnished to
// do so, all subject to the following:
//
// The copyright notices in the Software and this entire statement, including
// the above license grant, this restriction and the following disclaimer,
// must be included in all copies of the Software, in whole or in part, and
// all derivative works of the Software, unless such copies or derivative
// works are solely in the form of machine-executable object code generated by
// a source language processor.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE, TITLE AND NON-INFRINGEMENT. IN NO EVENT
// SHALL THE COPYRIGHT HOLDERS OR ANYONE DISTRIBUTING THE SOFTWARE BE LIABLE
// FOR ANY DAMAGES OR OTHER LIABILITY, WHETHER IN CONTRACT, TORT OR OTHERWISE,
// ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
// DEALINGS IN THE SOFTWARE.
//
#include <limits>
#include <chrono>
#include <iomanip>
#include <iostream>
#include <mutex>
#include <thread>
#include <vector>
// raw_allocator
// =============================================================================
class raw_allocator
{
// Types
// =====
public:
typedef std::size_t size_type;
typedef std::ptrdiff_t difference_type;
typedef void value_type;
typedef void* pointer;
typedef const void* const_pointer;
typedef unsigned char byte_type;
typedef byte_type* byte_pointer;
typedef const unsigned char* const_byte_pointer;
// Information
// ===========
public:
static size_type max_size() noexcept {
return std::numeric_limits<size_type>::max();
}
static size_type mem_size(size_type) noexcept;
// Allocation.System
// =================
public:
static pointer system_allocate(size_type) noexcept;
static void system_allocate(size_type, pointer&, size_type&) noexcept;
static void system_deallocate(pointer) noexcept;
// Allocation
// ==========
public:
static void allocate(size_type, pointer& result, size_type& capacity) noexcept;
static pointer allocate(size_type n) noexcept {
pointer result;
allocate(n, result, n);
return result;
}
static void deallocate(pointer p, size_type n) noexcept;
// Allocation.Temporary:
//======================
public:
static void allocate_temporary(size_type, pointer& result,
size_type& capacity) noexcept;
static pointer allocate_temporary(size_type n) noexcept {
pointer result;
allocate_temporary(n, result, n);
return result;
}
static void deallocate_temporary(pointer, size_type) noexcept;
// Logging
// =======
public:
static void log(std::ostream& stream);
};
// static_allocator
// =============================================================================
template<class T> class static_allocator;
template<>
class static_allocator<void>
{
public:
typedef void value_type;
typedef void* pointer;
typedef const void* const_pointer;
template<class U> struct rebind
{
typedef static_allocator<U> other;
};
};
template<class T>
class static_allocator
{
// Types
// =====
public:
typedef raw_allocator::size_type size_type;
typedef raw_allocator::difference_type difference_type;
typedef T value_type;
typedef T& reference;
typedef const T& const_reference;
typedef T* pointer;
typedef const T* const_pointer;
template<class U> struct rebind
{
typedef static_allocator<U> other;
};
// Construction/Destruction
// ========================
public:
static_allocator() noexcept {};
static_allocator(const static_allocator&) noexcept {};
~static_allocator() noexcept {};
// Information
// ===========
public:
static size_type max_size() noexcept {
return raw_allocator::max_size() / sizeof(T);
}
static size_type mem_size(size_type n) noexcept {
return raw_allocator::mem_size(n * sizeof(T)) / sizeof(T);
}
static pointer address(reference x) {
return &x;
}
static const_pointer address(const_reference x) {
return &x;
}
// Construct/Destroy
//==================
public:
static void construct(pointer p, const T& value) {
new ((void*) p) T(value);
}
static void destroy(pointer p) {
((T*) p)->~T();
}
// Allocation
//===========
public:
static pointer allocate(size_type n) noexcept {
return (pointer)raw_allocator::allocate(n * sizeof(value_type));
}
static void allocate(size_type n, pointer& result, size_type& capacity) noexcept
{
raw_allocator::pointer p;
raw_allocator::allocate(n * sizeof(value_type), p, capacity);
result = (pointer)(p);
capacity /= sizeof(value_type);
}
static void deallocate(pointer p, size_type n) noexcept {
raw_allocator::deallocate(p, n * sizeof(value_type));
}
// Allocation.Temporary
// ====================
static pointer allocate_temporary(size_type n) noexcept {
return (pointer)raw_allocator::allocate_temporary(n * sizeof(value_type));
}
static void allocate_temporary(size_type n, pointer& result,
size_type& capacity) noexcept
{
raw_allocator::pointer p;
raw_allocator::allocate_temporary(n * sizeof(value_type), p, capacity);
result = (pointer)(p);
capacity /= sizeof(value_type);
}
static void deallocate_temporary(pointer p, size_type n) noexcept {
raw_allocator::deallocate_temporary(p, n);
}
// Logging
// =======
public:
static void log(std::ostream& stream) {
raw_allocator::log(stream);
}
};
template <class T1, class T2>
inline bool operator ==(const static_allocator<T1>&,
const static_allocator<T2>&) noexcept {
return true;
}
template <class T1, class T2>
inline bool operator !=(const static_allocator<T1>&,
const static_allocator<T2>&) noexcept {
return false;
}
// allocator:
// =============================================================================
template<class T> class allocator;
template<>
class allocator<void>
{
public:
typedef static_allocator<void>::value_type value_type;
typedef static_allocator<void>::pointer pointer;
typedef static_allocator<void>::const_pointer const_pointer;
template<class U> struct rebind
{
typedef allocator<U> other;
};
};
template<class T>
class allocator
{
// Types
// =====
public:
typedef typename static_allocator<T>::size_type size_type;
typedef typename static_allocator<T>::difference_type difference_type;
typedef typename static_allocator<T>::value_type value_type;
typedef typename static_allocator<T>::reference reference;
typedef typename static_allocator<T>::const_reference const_reference;
typedef typename static_allocator<T>::pointer pointer;
typedef typename static_allocator<T>::const_pointer const_pointer;
template<class U> struct rebind
{
typedef allocator<U> other;
};
// Constructor/Destructor
// ======================
public:
template <class U>
allocator(const allocator<U>&) noexcept {}
allocator() noexcept {};
allocator(const allocator&) noexcept {};
~allocator() noexcept {};
// Information
// ===========
public:
size_type max_size() const noexcept {
return static_allocator<T>::max_size();
}
pointer address(reference x) const {
return static_allocator<T>::address(x);
}
const_pointer address(const_reference x) const {
return static_allocator<T>::address(x);
}
// Construct/Destroy
// =================
public:
void construct(pointer p, const T& value) {
static_allocator<T>::construct(p, value);
}
void destroy(pointer p) {
static_allocator<T>::destroy(p);
}
// Allocation
// ==========
public:
pointer allocate(size_type n, typename allocator<void>::const_pointer = 0) {
return static_allocator<T>::allocate(n);
}
void deallocate(pointer p, size_type n) {
static_allocator<T>::deallocate(p, n);
}
// Logging
// =======
public:
static void log(std::ostream& stream) {
raw_allocator::log(stream);
}
};
template <class T1, class T2>
inline bool operator ==(const allocator<T1>&, const allocator<T2>&) noexcept {
return true;
}
template <class T1, class T2>
inline bool operator !=(const allocator<T1>&, const allocator<T2>&) noexcept {
return false;
}
// Types
// =============================================================================
typedef raw_allocator::size_type size_type;
typedef raw_allocator::byte_pointer BytePointer;
struct LinkType
{
LinkType* Link;
};
struct FreelistType
{
LinkType* Link;
};
// const
// =============================================================================
// Memory layout:
// ==============
//
// Freelist
// Index Request Alignment
// =============================================================================
// [ 0 ... 7] [ 0 * align ... 8 * align] every 1 * align bytes
// [ 8 ... 11] ( 8 * align ... 16 * align] every 2 * align bytes
// [12 ... 13] ( 16 * align ... 24 * align] every 4 * align bytes
// [14] ( 24 * align ... 32 * align] 8 * align bytes
//
// temporary memory:
// [15] [ 0 * align ... 256 * align] 256 * align
static const unsigned FreeListArraySize = 16;
static const size_type FreelistInitSize = 16;
static const size_type MinAlign =
(8 < 2 * sizeof(void*)) ? 2 * sizeof(void*) : 8;
static const size_type MaxAlign = 32 * MinAlign;
static const size_type MaxIndex = 14;
static const size_type TmpIndex = 15;
static const size_type TmpAlign = 256 * MinAlign;
static const size_type IndexTable[] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 8, 9, 9, 10,
10, 11, 11, 12, 12, 12, 12, 13, 13, 13, 13, 14, 14, 14, 14, 14, 14, 14, 14 };
static_assert(sizeof(IndexTable) / sizeof(size_type) == MaxAlign / MinAlign, "Invalid Index Table");
inline size_type get_index(size_type n) {
return IndexTable[long(n - 1) / MinAlign];
}
static const size_type AlignTable[] = { MinAlign * 1, MinAlign * 2, MinAlign
* 3, MinAlign * 4, MinAlign * 5, MinAlign * 6, MinAlign * 7, MinAlign * 8,
MinAlign * 10, MinAlign * 12, MinAlign * 14, MinAlign * 16, MinAlign * 20,
MinAlign * 24, MinAlign * 32, TmpAlign, };
static_assert(sizeof(AlignTable) / sizeof(size_type) == TmpIndex + 1, "Invalid Align Table");
inline size_type get_align(size_type i) {
return AlignTable[i];
}
// Thread
// ============================================================================
static LinkType* Freelist[FreeListArraySize];
static BytePointer HeapBeg;
static BytePointer HeapEnd;
static size_type TotalHeapSize;
static std::mutex FreelistMutex[FreeListArraySize] = { };
inline void lock_free_list(size_type i) {
FreelistMutex[i].lock();
}
inline void unlock_free_list(size_type i) {
FreelistMutex[i].unlock();
}
// Allocation
// ============================================================================
// Requiers: freelist[index] is locked
LinkType* allocate_free_list(size_type index) noexcept {
static std::mutex mutex;
const size_type page_size = 4096; // FIXME some system_page_size();
std::lock_guard<std::mutex> guard(mutex);
size_type heap_size = HeapEnd - HeapBeg;
size_type align = get_align(index);
if(heap_size < align) {
LinkType* new_list = (LinkType*)(HeapBeg);
// If a temporary list:
if(MaxAlign <= heap_size) {
LinkType* current = new_list;
LinkType* next;
while(2*MaxAlign <= heap_size) {
next = (LinkType*)(BytePointer(current) + MaxAlign);
current->Link = next;
current = next;
heap_size -= MaxAlign;
}
if(index != MaxIndex) lock_free_list(MaxIndex);
current->Link = Freelist[MaxIndex];
Freelist[MaxIndex] = new_list;
if(index != MaxIndex) unlock_free_list(MaxIndex);
new_list = (LinkType*)(BytePointer(current) + MaxAlign);
heap_size -= MaxAlign;
}
if(MinAlign <= heap_size) {
std::cout << "heap_size: " << heap_size << std::endl;
size_type i = get_index(heap_size);
if(heap_size < get_align(i)) --i;
if(index != i) lock_free_list(i);
new_list->Link = Freelist[i];
Freelist[i] = new_list;
if(index != i) unlock_free_list(i);
}
heap_size = FreelistInitSize * align + TotalHeapSize / FreelistInitSize;
heap_size = (((heap_size - 1) / page_size) + 1) * page_size;
HeapBeg = BytePointer(raw_allocator::system_allocate(heap_size));
if(HeapBeg) {
HeapEnd = HeapBeg + heap_size;
TotalHeapSize += heap_size;
}
else {
HeapEnd = 0;
size_type i = FreeListArraySize;
while(HeapBeg == 0) {
--i;
if(i <= index) return 0;
lock_free_list(i);
if(Freelist[i]) {
heap_size = get_align(i);
HeapBeg = (BytePointer)(Freelist[i]);
HeapEnd = HeapBeg + heap_size;
Freelist[i] = Freelist[i]->Link;
}
unlock_free_list(i);
}
}
}
size_type size = FreelistInitSize * align;
size_type count = FreelistInitSize;
if(heap_size < size) {
count = heap_size / align;
size = align * count;
}
LinkType* beg_list = (LinkType*)(HeapBeg);
LinkType* end_list = beg_list;
while(--count) {
LinkType* init = (LinkType*)(BytePointer(end_list) + align);
end_list->Link = init;
end_list = init;
}
LinkType*& freelist = Freelist[index];
end_list->Link = freelist;
freelist = beg_list;
HeapBeg += size;
return freelist;
}
// raw_allocator
// ============================================================================
// size
// ====
raw_allocator::size_type
raw_allocator::mem_size(size_type n) noexcept {
if( ! n) return 0;
else {
if(n <= MaxAlign) return get_align(get_index(n));
else return ((difference_type(n) - 1) / difference_type(MaxAlign)) * MaxAlign
+ MaxAlign;
}
}
// allocation.system
// =================
raw_allocator::pointer raw_allocator::system_allocate(size_type n) noexcept
{
return ::malloc(n);
}
void raw_allocator::system_allocate(size_type n, pointer& p, size_type& capacity) noexcept
{
capacity = mem_size(n);
p = ::malloc(capacity);
if(p == 0) capacity = 0;
}
void raw_allocator::system_deallocate(pointer p) noexcept {
::free(p);
}
// allocation
// ==========
void raw_allocator::allocate(size_type n, pointer& p, size_type& capacity) noexcept
{
if(n == 0 || MaxAlign < n) system_allocate(n, p, capacity);
else {
p = 0;
capacity = 0;
size_type index = get_index(n);
lock_free_list(index);
LinkType*& freelist = Freelist[index];
if(freelist == 0) {
freelist = allocate_free_list(index);
}
if(freelist != 0) {
p = freelist;
capacity = get_align(index);
freelist = freelist->Link;
}
unlock_free_list(index);
}
}
void raw_allocator::deallocate(pointer p, size_type n) noexcept {
if(p) {
if(n == 0 || MaxAlign < n) system_deallocate(p);
else {
size_type index = get_index(n);
lock_free_list(index);
LinkType*& freelist = Freelist[index];
LinkType* new_list = ((LinkType*)(p));
new_list->Link = freelist;
freelist = new_list;
unlock_free_list(index);
}
}
}
// allocation.temporary
// ====================
void raw_allocator::allocate_temporary(size_type n, pointer& p,
size_type& capacity) noexcept
{
if(n == 0 || size_type(TmpAlign) < n) system_allocate(n, p, capacity);
else {
p = 0;
capacity = 0;
lock_free_list(TmpIndex);
LinkType*& freelist = Freelist[TmpIndex];
if(freelist == 0) freelist = allocate_free_list(TmpIndex);
if(freelist != 0) {
p = freelist;
freelist = freelist->Link;
capacity = TmpAlign;
}
unlock_free_list(TmpIndex);
}
}
void raw_allocator::deallocate_temporary(pointer p, size_type n) noexcept {
if(p) {
if(n == 0 || size_type(TmpAlign) < n) system_deallocate(p);
else {
lock_free_list(TmpIndex);
LinkType*& freelist = Freelist[TmpIndex];
LinkType* new_list = ((LinkType*)(p));
new_list->Link = freelist;
freelist = new_list;
unlock_free_list(TmpIndex);
}
}
}
void raw_allocator::log(std::ostream& stream) {
stream << " Heap Size: " << TotalHeapSize << '\n';
size_type total_size = 0;
for (unsigned i = 0; i < FreeListArraySize; ++i) {
size_type align = get_align(i);
size_type size = 0;
size_type count = 0;
lock_free_list(i);
LinkType* freelist = Freelist[i];
while (freelist) {
size += align;
++count;
freelist = freelist->Link;
}
total_size += size;
unlock_free_list(i);
stream << " Freelist: " << std::setw(4) << align << ": " << size
<< " [" << count << ']' << '\n';
}
size_type heap_size = HeapEnd - HeapBeg;
stream << " Freelists: " << total_size << '\n';
stream << " Free Heap: " << heap_size << '\n';
stream << " Allocated: " << TotalHeapSize - total_size - heap_size
<< '\n';
}
int main() {
const unsigned sample_count = 100000;
std::vector<char*> std_allocate_pointers;
std::vector<char*> allocate_pointers;
std::vector<unsigned> sample_sizes;
typedef std::chrono::nanoseconds duration;
duration std_allocate_duration;
duration std_deallocate_duration;
duration allocate_duration;
duration deallocate_duration;
std::allocator<char> std_allocator;
allocator<char> allocator;
for (unsigned i = 0; i < sample_count; ++i) {
if (std::rand() % 2) {
unsigned size = unsigned(std::rand()) % MaxAlign;
//std::cout << " Allocate: " << size << std::endl;
sample_sizes.push_back(size);
{
auto start = std::chrono::high_resolution_clock::now();
auto p = std_allocator.allocate(size);
auto end = std::chrono::high_resolution_clock::now();
std_allocate_pointers.push_back(p);
std_allocate_duration += std::chrono::duration_cast<duration>(
end - start);
}
{
auto start = std::chrono::high_resolution_clock::now();
auto p = allocator.allocate(size);
auto end = std::chrono::high_resolution_clock::now();
allocate_pointers.push_back(p);
allocate_duration += std::chrono::duration_cast<duration>(
end - start);
}
}
else {
if (!sample_sizes.empty()) {
char* std_p = std_allocate_pointers.back();
char* p = allocate_pointers.back();
unsigned size = sample_sizes.back();
//std::cout << "Deallocate: " << size << std::endl;
{
auto start = std::chrono::high_resolution_clock::now();
std_allocator.deallocate(std_p, size);
auto end = std::chrono::high_resolution_clock::now();
std_deallocate_duration += std::chrono::duration_cast<
duration>(end - start);
}
{
auto start = std::chrono::high_resolution_clock::now();
allocator.deallocate(p, size);
auto end = std::chrono::high_resolution_clock::now();
deallocate_duration += std::chrono::duration_cast<duration>(
end - start);
}
std_allocate_pointers.pop_back();
allocate_pointers.pop_back();
sample_sizes.pop_back();
}
}
}
for (unsigned i = 0; i < sample_sizes.size(); ++i) {
unsigned size = sample_sizes[i];
std_allocator.deallocate(std_allocate_pointers[i], size);
allocator.deallocate(allocate_pointers[i], size);
}
std::cout << "std_allocator: "
<< (std_allocate_duration + std_deallocate_duration).count() << " "
<< std_allocate_duration.count() << " "
<< std_deallocate_duration.count() << std::endl;
std::cout << " allocator: "
<< (allocate_duration + deallocate_duration).count() << " "
<< allocate_duration.count() << " " << deallocate_duration.count()
<< std::endl;
raw_allocator::log(std::cout);
return 0;
}
Note: The raw allocator never release memory to the system (That
might be a bug).
Note: Without optimizations enabled the performance
is lousy (g++ -std=c++11 -O3 ...)
Result:
std_allocator: 11645000 7416000 4229000
allocator: 5155000 2758000 2397000
Heap Size: 94208
Freelist: 16: 256 [16]
Freelist: 32: 640 [20]
Freelist: 48: 768 [16]
Freelist: 64: 1024 [16]
Freelist: 80: 1280 [16]
Freelist: 96: 1536 [16]
Freelist: 112: 1792 [16]
Freelist: 128: 2176 [17]
Freelist: 160: 5760 [36]
Freelist: 192: 6144 [32]
Freelist: 224: 3584 [16]
Freelist: 256: 7936 [31]
Freelist: 320: 10240 [32]
Freelist: 384: 14208 [37]
Freelist: 512: 34304 [67]
Freelist: 4096: 0 [0]
Freelists: 91648
Free Heap: 2560
Allocated: 0
It seemed like an interesting problem so I invested some time in it. The aproach you took is far from being naive. Actually it has pretty good results. It can definetly be optimized further though. I will assume the list of chunks is not already sorted because your algo is probably optimal then.
To optimize it my aproach was to optimize the sort itself eliminating the chunks that can be combined during the sort, thus making the sort faster for the remaining elements.
The code below is basically a modified version of bubble-sort. I also implemented your solution using std::sort just for comparison.
The results are suprisingly good using my also. For a data set of 10 million chunks the combined sort with the merge of chunks performs 20 times faster.
The output of the code is (algo1 is std::sort followed by merging consecutive elements, algo 2 is the sort optimized with removing the chunks that can be merged):
generating input took: 00:00:19.655999
algo 1 took 00:00:00.968738
initial chunks count: 10000000, output chunks count: 3332578
algo 2 took 00:00:00.046875
initial chunks count: 10000000, output chunks count: 3332578
You can probably improve it further using a better sort algo like introsort.
full code:
#include <vector>
#include <map>
#include <set>
#include <iostream>
#include <boost\date_time.hpp>
#define CHUNK_COUNT 10000000
struct Chunk // Chunk of memory
{
char *begin, *end; // [begin, end) range
bool operator<(const Chunk& rhs) const
{
return begin < rhs.begin;
}
};
std::vector<Chunk> in;
void generate_input_data()
{
std::multimap<int, Chunk> input_data;
Chunk chunk;
chunk.begin = 0;
chunk.end = 0;
for (int i = 0; i < CHUNK_COUNT; ++i)
{
int continuous = rand() % 3; // 66% chance of a chunk being continuous
if (continuous)
chunk.begin = chunk.end;
else
chunk.begin = chunk.end + rand() % 100 + 1;
int chunk_size = rand() % 100 + 1;
chunk.end = chunk.begin + chunk_size;
input_data.insert(std::multimap<int, Chunk>::value_type(rand(), chunk));
}
// now we have the chunks randomly ordered in the map
// will copy them in the input vector
for (std::multimap<int, Chunk>::const_iterator it = input_data.begin(); it != input_data.end(); ++it)
in.push_back(it->second);
}
void merge_chunks_sorted(std::vector<Chunk>& chunks)
{
if (in.empty())
return;
std::vector<Chunk> res;
Chunk ch = in[0];
for (size_t i = 1; i < in.size(); ++i)
{
if (in[i].begin == ch.end)
{
ch.end = in[i].end;
} else
{
res.push_back(ch);
ch = in[i];
}
}
res.push_back(ch);
chunks = res;
}
void merge_chunks_orig_algo(std::vector<Chunk>& chunks)
{
std::sort(in.begin(), in.end());
merge_chunks_sorted(chunks);
}
void merge_chunks_new_algo(std::vector<Chunk>& chunks)
{
size_t new_last_n = 0;
Chunk temp;
do {
int last_n = new_last_n;
new_last_n = chunks.size() - 1;
for (int i = chunks.size() - 2; i >= last_n; --i)
{
if (chunks[i].begin > chunks[i + 1].begin)
{
if (chunks[i].begin == chunks[i + 1].end)
{
chunks[i].begin = chunks[i + 1].begin;
if (i + 1 != chunks.size() - 1)
chunks[i + 1] = chunks[chunks.size() - 1];
chunks.pop_back();
} else
{
temp = chunks[i];
chunks[i] = chunks[i + 1];
chunks[i + 1] = temp;
}
new_last_n = i + 1;
} else
{
if (chunks[i].end == chunks[i + 1].begin)
{
chunks[i].end = chunks[i + 1].end;
if (i + 1 != chunks.size() - 1)
chunks[i + 1] = chunks[chunks.size() - 1];
chunks.pop_back();
}
}
}
} while (new_last_n < chunks.size() - 1);
}
void run_algo(void (*algo)(std::vector<Chunk>&))
{
static int count = 1;
// allowing the algo to modify the input vector is intentional
std::vector<Chunk> chunks = in;
size_t in_count = chunks.size();
boost::posix_time::ptime start = boost::posix_time::microsec_clock::local_time();
algo(chunks);
boost::posix_time::ptime stop = boost::posix_time::microsec_clock::local_time();
std::cout<<"algo "<<count++<<" took "<<stop - start<<std::endl;
// if all went ok, statistically we should have around 33% of the original chunks count in the output vector
std::cout<<" initial chunks count: "<<in_count<<", output chunks count: "<<chunks.size()<<std::endl;
}
int main()
{
boost::posix_time::ptime start = boost::posix_time::microsec_clock::local_time();
generate_input_data();
boost::posix_time::ptime stop = boost::posix_time::microsec_clock::local_time();
std::cout<<"generating input took:\t"<<stop - start<<std::endl;
run_algo(merge_chunks_orig_algo);
run_algo(merge_chunks_new_algo);
return 0;
}
I've seen below you mention n is not that high. so I rerun the test with 1000 chunks, 1000000 runs to make the times significant. The modified bubble sort still performs 5 times better. Basically for 1000 chunks total run time is 3 microseconds. Numbers below.
generating input took: 00:00:00
algo 1 took 00:00:15.343456, for 1000000 runs
initial chunks count: 1000, output chunks count: 355
algo 2 took 00:00:03.374935, for 1000000 runs
initial chunks count: 1000, output chunks count: 355
Add pointers to the chunk struct for previous and next adjacent chunk in contiguous memory, if such exists, null otherwise. When a chunk is released you check if adjacent chunks are free, and if they are you merge them and update prev->next and next->prev pointers. This procedure is O(1) and you do it each time a chunk is released.
Some memory allocators put the size of current and previous chunk at the memory position immediately before the address returned by malloc. It is then possible calculate the offset to adjacent chunks without explicit pointers.
The following doesn't require sorted input or provide sorted output. Treat the input as a stack. Pop a chunk off and check if it is adjacent to a member of the initially empty output set. If not, add it to the output set. If it is adjacent, remove the adjacent chunk from the output set and push the new combined chunk onto the input stack. Repeat until input is empty.
vector<Chunk> unify_contiguous(vector<Chunk> input)
{
vector<Chunk> output;
unordered_set<Ptr, Chunk> begins;
unordered_set<Ptr, Chunk> ends;
while (!input.empty())
{
// pop chunk from input
auto chunk = input.back();
input.pop_back();
// chunk end adjacent to an output begin?
auto it = begins.find(chunk.end);
if (it != begins.end())
{
auto end = it->second.end;
Chunk combined{chunk.begin, end};
ends.erase(end);
begins.erase(it);
input.push_back(combined);
continue;
}
// chunk begin adjacent to an output end?
it = ends.find(chunk.begin);
if (it != ends.end())
{
auto begin = it->second.begin;
Chunk combined{begin, chunk.end};
begins.erase(begin);
ends.erase(it);
input.push_back(combined);
continue;
}
// if not add chunk to output
begins[chunk.begin] = chunk;
ends[chunk.end] = chunk;
}
// collect output
for (auto kv : begins)
output.push_back(kv.second);
return output;
}