Background:
I'm working in C++.
I recall there being a way to store "arrays" of 0's and 1's (or, equivalently, true/false, etc.) memory-efficiently when there is a disproportionate number of one value or the other (e.g. mostly zeroes). The array might be made of std::vector's, std::set's, etc.; I don't care how, so long as it is memory efficient and I'm able to check the value of each element.
I've written an algorithm that populates an "array" (currently a vector<vector<size_t>>) with 0's and 1's according to some function; for these purposes, we can more or less treat the values as random. The array is quite large (of variable size, on the order of 1000 columns and 1E+8 or more rows) and always rectangular.
There need to be this many data points. At best, my machine quickly becomes resource constrained and slows to a crawl; at worst, I get std::bad_alloc.
Putting aside what I intend to do with this array, what is the most memory-efficient way to store a rectangular array of 1's and 0's (or T/F, etc.) where one value is far more common than the other (and I know which)?
Note that the array needs to be created "dynamically" (i.e. one element at a time), elements must keep their positions, and I only need to check the value of individual elements after creation. I'm concerned about memory footprint, nothing else.
This is known as a sparse array or matrix.
std::set<std::pair<int,int>> bob;
If you want element (7,100) to be 1, just call bob.insert({7,100});. Missing elements are 0. You can use bob.count({3,7}) to get a 0/1 value if you like.
Looping over both columns and rows is tricky, though; the easiest approach is to keep two sets, one with the pair order reversed.
If you have no need to loop in order, use an unordered set instead.
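A minimal sketch of that idea, wrapped in a small helper (SparseBits and its member names are mine, not a standard type):

#include <cstddef>
#include <set>
#include <utility>

// Only the coordinates of the 1-elements are stored; anything
// not present in the set is implicitly 0.
struct SparseBits {
    std::set<std::pair<std::size_t, std::size_t>> ones;

    void set_one(std::size_t row, std::size_t col) { ones.insert({row, col}); }

    // count() on a set returns 0 or 1, which doubles as the element value.
    int get(std::size_t row, std::size_t col) const { return ones.count({row, col}) ? 1 : 0; }
};

int main() {
    SparseBits bob;
    bob.set_one(7, 100);       // element (7,100) is now 1
    int a = bob.get(7, 100);   // 1
    int b = bob.get(3, 7);     // 0, nothing stored there
    (void)a; (void)b;
}

Keep in mind that each stored coordinate pair costs a full tree node (typically several dozen bytes), so this only pays off when the rarer value really is rare relative to the array size.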
Related
I have an unsigned int array with n elements (with n being at most around 20-25 elements). Duplicates are possible.
I know that the smallest k values are of type A and the other (larger) n-k values are of type B. In order to differentiate between A and B I need to find the indices of the k smallest values (or the n-k largest values, depending on what is easier/faster). The original array must not be altered as the element's index contains information.
There are multiple solutions for this problem on the web (e.g. here). However, most of them try to optimize processing time and neglect memory usage.
As I am implementing the code in C++ on an (Arduino-based) microcontroller, I have to focus on low memory usage and, if necessary, accept a slightly longer processing time. I therefore feel unsafe using pointers and recursion (maybe I wouldn't if I knew more about them, but in fact I don't).
Can you recommend which algorithm would be best for that task (an implementation is welcome but not essential)?
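One low-memory approach that fits these constraints (a sketch only; N, K and classify are hypothetical names) is to rank each element by counting how many others are smaller, breaking ties by index. With n at most around 25, the O(n^2) comparisons are cheap, the input array is never modified, and no recursion or dynamic allocation is needed:

#include <stdint.h>

// Sketch: rank each element by (value, index) and mark it as "type A" if
// fewer than K elements come before it in that order.
const uint8_t N = 20;   // hypothetical array size (n <= ~25)
const uint8_t K = 5;    // hypothetical number of "type A" (smallest) elements

void classify(const unsigned int values[N], bool isTypeA[N]) {
    for (uint8_t i = 0; i < N; ++i) {
        uint8_t smaller = 0;
        for (uint8_t j = 0; j < N; ++j) {
            // count elements ranked before element i; ties are broken by index,
            // so duplicates are handled deterministically
            if (values[j] < values[i] || (values[j] == values[i] && j < i)) {
                ++smaller;
            }
        }
        isTypeA[i] = (smaller < K);   // element i is among the K smallest
    }
}

Exactly K entries of isTypeA end up true, because the (value, index) ranking is a strict total order.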
I am looking for input on an associative data structure that might take advantage of the specific criteria of my use case.
Currently I am using a red/black tree to implement a dictionary that maps keys to values (in my case integers to addresses).
In my use case, the maximum number of elements is known up front (1024), and I will only ever be inserting and searching. Searching happens twenty times more often than inserting. At the end of the process I clear the structure and repeat again. There can be no allocations during use - only the initial up front one. Unfortunately, the STL and recent versions of C++ are not available.
Any insight?
I ended up implementing a simple linear-probe HashTable from an example here. I used the MurmurHash3 hash function since my data is randomized.
I modified the hash table in the following ways:
The size is a template parameter. Internally, the size is doubled. The implementation requires power-of-2 sizes and traditionally resizes at 75% occupation. Since I know I am going to be filling up the hash table, I pre-emptively double its capacity to keep it sparse enough. This might be less efficient when adding a small number of objects, but it is more efficient once the capacity starts to fill up. Since I cannot resize it, I chose to start it doubled in size.
I do not allow keys with a value of zero to be stored. This is okay for my application and it keeps the code simple.
All resizing and deleting is removed, replaced by a single clear operation which performs a memset.
I chose to inline the insert and lookup functions since they are quite small.
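A rough sketch of what such a table can look like (the names are mine; the hash function below is a cheap stand-in for MurmurHash3, and values are stored as plain addresses):

#include <stdint.h>
#include <string.h>

// Fixed-capacity, linear-probe hash table along the lines described above.
// Key 0 is reserved to mean "empty slot".
template <uint32_t RequestedSize>           // must be a power of two
class FixedHashTable {
    static const uint32_t Capacity = RequestedSize * 2;  // pre-doubled to stay sparse

    uint32_t keys_[Capacity];
    void*    values_[Capacity];

    static uint32_t mix(uint32_t key) {     // stand-in integer mixer
        key ^= key >> 16;
        key *= 0x45d9f3bu;
        key ^= key >> 16;
        return key;
    }

public:
    FixedHashTable() { clear(); }

    // One memset instead of per-slot deletion.
    void clear() { memset(keys_, 0, sizeof(keys_)); }

    // Defined in-class, so effectively inlined. Assumes key != 0 and that
    // no more than RequestedSize entries are ever inserted.
    void insert(uint32_t key, void* value) {
        uint32_t i = mix(key) & (Capacity - 1);
        while (keys_[i] != 0 && keys_[i] != key)
            i = (i + 1) & (Capacity - 1);   // linear probing
        keys_[i] = key;
        values_[i] = value;
    }

    void* lookup(uint32_t key) const {
        uint32_t i = mix(key) & (Capacity - 1);
        while (keys_[i] != 0) {
            if (keys_[i] == key) return values_[i];
            i = (i + 1) & (Capacity - 1);
        }
        return 0;                           // not present
    }
};

With the numbers above this would be instantiated as FixedHashTable<1024>, giving 2048 slots so the load factor never exceeds 50%.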
It is faster than my previous red/black tree implementation. The only change I might make is to revisit the hashing scheme to see if there is something in the source keys that would help make a cheaper hash.
Billy ONeal suggested that, given a small number of elements (1024), a simple linear search in a fixed array would be faster. I followed his advice and implemented one for a side-by-side comparison. On my target hardware (roughly a first-generation iPhone) the hash table outperformed the linear search by a factor of two to one. At smaller sizes (256 elements) the hash table was still superior. Of course these values are hardware dependent; cache line sizes and memory access speed are terrible in my environment. However, others looking for a solution to this problem would be smart to follow his advice and profile both approaches first.
Pros, I need some performance opinions on the following:
1st Question:
I want to store objects in a 3D-Grid-Structure, overall it will be ~33% filled, i.e. 2 out of 3 gridpoints will be empty.
Short image to illustrate:
Maybe Option A)
vector<vector<vector<deque<Obj>>>> grid; // (SizeX, SizeY, SizeZ)
grid[x][y][z].push_back(someObj);
This way I'd have a lot of empty deques, but accessing one of them would be fast, wouldn't it?
The Other Option B) would be
std::unordered_map<Pos3D, deque<Obj>, Pos3DHash, Pos3DEqual> Pos3DMap;
where I add & delete deques as data is added or deleted. Probably less memory used, but maybe slower? What do you think?
2nd Question (follow up)
What if I had multiple containers at each position? Say 3 buckets for 3 different entities (object types ObjA, ObjB, ObjC) per grid point; then my data essentially becomes 4D?
Another illustration:
Using Option 1B I could just extend Pos3D to include the bucket number to account for even more sparse data.
Possible queries I want to optimize for:
Give me all Objects out of ObjA-buckets from the entire structure
Give me all Objects out of ObjB-buckets for a set of grid-positions
Which is the nearest non-empty ObjC-bucket to position x,y,z?
PS:
I had also thought about a tree-based data structure before, while reading about nearest-neighbour approaches. Since my data is so regular, I thought I could skip all the tree-building subdivision of cells into smaller pieces and just make a static 3D grid of the final leaves. That's how I came to ask about the best way to store this grid here.
A related question: if I have a map<int, Obj>, is there a fast way to ask for "all objects with keys between 780 and 790"? Or is the fastest way building the tree mentioned above?
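Regarding that side question: an ordered map already supports this kind of range query through lower_bound/upper_bound, so no separate tree is needed just for this. A small sketch (Obj is a placeholder):

#include <cstdio>
#include <map>

struct Obj { int payload; };   // placeholder

int main() {
    std::map<int, Obj> objects;
    objects[779] = Obj{1};
    objects[785] = Obj{2};
    objects[790] = Obj{3};
    objects[791] = Obj{4};

    // All objects with keys in [780, 790]: O(log n) to find the bounds,
    // then a walk over just the matching entries (here: 785 and 790).
    std::map<int, Obj>::const_iterator it  = objects.lower_bound(780);
    std::map<int, Obj>::const_iterator end = objects.upper_bound(790);
    for (; it != end; ++it)
        std::printf("key %d -> payload %d\n", it->first, it->second.payload);
}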
EDIT
I ended up going with a 3D boost::multi_array with Fortran ordering. It's a bit like the chunks that games like Minecraft use, which in turn is a little like a kd-tree with a fixed leaf size and a fixed number of leaves. It works pretty fast now, so I'm happy with this approach.
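For reference, a sketch of what that can look like, assuming Boost is available (the dimensions and the Obj type are placeholders):

#include <boost/multi_array.hpp>
#include <cstddef>

struct Obj { int id; };   // placeholder

int main() {
    const std::size_t X = 64, Y = 64, Z = 64;   // placeholder dimensions

    // Dense 3D array with Fortran (column-major) storage order:
    // the first index varies fastest in memory.
    boost::multi_array<Obj, 3> grid(boost::extents[X][Y][Z],
                                    boost::fortran_storage_order());

    grid[1][2][3] = Obj{42};     // plain indexed access into contiguous storage
    Obj o = grid[1][2][3];
    (void)o;
}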
Answer to 1st question
As @Joachim pointed out, this depends on whether you prefer fast access or small data. Roughly, this corresponds to your options A and B.
A) If you want fast access, go with a multidimensional std::vector (or a plain array, if you will). std::vector brings easier maintenance at minimal overhead, so I'd prefer that. In terms of space it consumes O(N^3), where N is the number of grid points along one dimension. To get the best performance when iterating over the data, keep the innermost (last) index in the innermost loop so that consecutive accesses walk memory contiguously.
B) If you instead wish to keep things as small as possible, use a hash map, preferably one optimized for space. That results in O(M) space, with M being the number of stored (non-empty) elements. Here is a benchmark comparing several hash maps. I have had good experience with google::sparse_hash_map, which has the smallest constant overhead I have seen so far. Plus, it is easy to add to your build system. A small sketch of this option follows below.
If you need a mixture of speed and small data or don't know the size of each dimension in advance, use a hash map as well.
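A small sketch of option B using the standard container (Pos3D, Pos3DHash and Obj are placeholders mirroring the question; google::sparse_hash_map could be swapped in if its lower per-entry overhead matters, as its insert/find interface is very similar):

#include <cstddef>
#include <deque>
#include <functional>
#include <unordered_map>

// Placeholder types mirroring the question.
struct Obj { int id; };

struct Pos3D {
    int x, y, z;
    bool operator==(const Pos3D& o) const { return x == o.x && y == o.y && z == o.z; }
};

struct Pos3DHash {
    std::size_t operator()(const Pos3D& p) const {
        std::size_t h = std::hash<int>()(p.x);
        h = h * 31 + std::hash<int>()(p.y);
        h = h * 31 + std::hash<int>()(p.z);
        return h;
    }
};

int main() {
    std::unordered_map<Pos3D, std::deque<Obj>, Pos3DHash> grid;

    grid[Pos3D{1, 2, 3}].push_back(Obj{42});   // creates the deque on demand

    auto it = grid.find(Pos3D{1, 2, 3});       // O(1) expected lookup
    bool occupied = (it != grid.end());        // empty cells are simply absent
    (void)occupied;
}

Empty cells have no entry at all, so memory scales with the number of occupied grid points rather than with N^3.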
Answer to 2nd question
I'd say your data is 4D if you have a variable number of elements along the 4th dimension, or a fixed large number of them. With option 1B) you'd indeed add the bucket index; for 1A) you'd add another vector.
Which is the nearest non-empty ObjC-bucket to position x,y,z?
This operation is commonly called nearest neighbor search. You want a KDTree for that. There is libkdtree++, if you prefer small libraries. Otherwise, FLANN might be an option. It is a part of the Point Cloud Library which accomplishes a lot of tasks on multidimensional data and could be worth a look as well.
I want to allocate a two-dimensional array of 10^9 * 10^9 elements, but this is not possible. Is there any way out?
I think a vector could be a solution to this, but I don't know how to do it.
You cannot allocate 10^18 bytes of memory in any computer today (that's roughly a million terabytes). However, if your data is mostly zeros (i.e. it is a sparse matrix), then you can use a different kind of data structure to store it. It all depends on what kind of data you are storing and whether it has any redundant characteristics.
Assuming that the number of non-zero elements is much less than 10^18, you'll want to read up on sparse arrays. In fact, it's not even a requirement that most of the elements in a sparse array be zero -- they just need to be the same. The essential idea is to keep the non-default values in a structure like a list; any values not found in the list are assumed to be the default value.
I want to allocate a two-dimensional array of 10^9 * 10^9 elements, but this is not possible. Is there any way out?
That's way beyond current hardware capabilities, and an array this big is unsuitable for any practical purpose (you're free to calculate how many thousands of years it would take to walk through every element).
You need to create a "sparse" array: store only the non-zero elements in memory, provide an array-like interface to access them, but internally keep them in something like std::map<std::pair<xcoord, ycoord>, value> and return zero for every element not in the map. As long as you don't do something reckless like trying to set every element to a non-zero value, this should be a sufficient array replacement.
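A minimal sketch of such a wrapper (the class name, coordinate and value types are placeholders):

#include <cstdint>
#include <map>
#include <utility>

// Only non-zero elements occupy memory; everything else is implicitly 0.
class SparseArray2D {
    std::map<std::pair<uint64_t, uint64_t>, double> data_;

public:
    void set(uint64_t x, uint64_t y, double value) {
        if (value == 0.0)
            data_.erase(std::make_pair(x, y));   // never store explicit zeros
        else
            data_[std::make_pair(x, y)] = value;
    }

    double get(uint64_t x, uint64_t y) const {
        std::map<std::pair<uint64_t, uint64_t>, double>::const_iterator it =
            data_.find(std::make_pair(x, y));
        return it == data_.end() ? 0.0 : it->second;
    }
};

int main() {
    SparseArray2D a;   // logically 10^9 x 10^9, stores only what is set
    a.set(123456789ULL, 987654321ULL, 3.14);
    double v = a.get(0, 0);   // 0.0, never stored
    (void)v;
}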
so....
What do you need that much memory for?
I would like to compress an array of integers, initially all 0, using a yet-to-be-determined integer compression/decompression method.
Is it possible with some integer compression method to increment (+1) a specific element of an array of compressed integers accurately using C or C++?
Of all the common compression techniques, two stand out as potentially usable for this without a full decompress/recompress cycle.
First, sparse arrays were built specifically for this. With a sparse array, you typically store a map from index to value. You don't store array elements that haven't been modified, so if most of your array is 0, it need not be stored. Many arrays (and matrices) in simulations are sparse, and there's a huge literature on them. Here, incrementing a value is simply accessing the index with [] and incrementing; the access creates the entry if it doesn't exist.
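A tiny illustration of that increment behaviour, using std::map as the backing store (any associative container with the same semantics would do):

#include <cstddef>
#include <map>

int main() {
    // Sparse array of counters: indices that are never touched cost nothing.
    std::map<std::size_t, int> counts;

    counts[42]++;        // operator[] creates the entry as 0, then it is incremented
    counts[42]++;
    counts[1000000]++;   // a far-away index still costs only one map node

    int at42 = counts[42];                            // 2
    int at7  = counts.count(7) ? counts.at(7) : 0;    // 0, without inserting a node
    (void)at42; (void)at7;
}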
Next, run-length encoding may also work if you find that you are working with long sequences of repeated numbers, but those "runs" are not all the same number. Since they are not all the same, a sparse array would not work, and RLE is a solution. Incrementing a number is not as easy as with a sparse array: if the element is not part of a run, you increment it and check whether you can now form a new run; if it is part of a run, you split the run. RLE typically only makes sense with visual data or certain mathematical patterns.
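A compact sketch of the run-splitting step described above (the Run type and the increment function are simplifications of mine; merging adjacent equal runs back together is omitted):

#include <cstddef>
#include <vector>

// The "array" is stored as (value, length) runs; incrementing one element
// splits its run into up to three pieces. Out-of-range indices are ignored.
struct Run { int value; std::size_t length; };

void increment(std::vector<Run>& runs, std::size_t index) {
    std::size_t pos = 0;
    for (std::size_t r = 0; r < runs.size(); ++r) {
        if (index < pos + runs[r].length) {
            Run hit = runs[r];
            std::size_t before = index - pos;              // elements left of the target
            std::size_t after  = hit.length - before - 1;  // elements right of the target

            std::vector<Run> pieces;
            if (before) pieces.push_back({hit.value, before});
            pieces.push_back({hit.value + 1, std::size_t(1)});
            if (after)  pieces.push_back({hit.value, after});

            runs.erase(runs.begin() + r);
            runs.insert(runs.begin() + r, pieces.begin(), pieces.end());
            return;
        }
        pos += runs[r].length;
    }
}

In the worst case every run has length 1 and this degenerates into a plain array plus per-run overhead, which is why RLE only pays off when long runs actually occur.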
You can certainly implement this, if your increment method:
Decompresses the entire array.
Increments the desired entry.
Compresses the entire array again.
If you want to increment in a less brute-force way, you'll need intimate knowledge of the compression process, and so would we in order to give you more assistance.