I have a 2D lookup table of int16_t.
int16_t my_array[37][73] = {{**DATA HERE**}};
I have a mixture of values that range from just above the range of int8_t to just below the range of int8_t and some of the values repeat themselves. I am trying to reduce the size of this lookup table.
What I have done so far is split each int16_t value into two int8_t values to visualize the wasted bytes.
int8_t part_1 = original_value >> 8;
int8_t part_2 = original_value & 0xFF;
// If the upper 8 bits of the original_value were empty
if(part_1 == 0) wasted_bytes_count++;
I can easily remove the zero-valued int8_t entries that are wasting a byte of space, and I can also remove the duplicate values, but my question is: how do I remove those values while retaining the ability to look up based on the two indices?
I contemplated translating this into a 1D array and adding a number following each duplicated value that would represent the number of duplicates that were removed, but I am struggling with how I would then identify what is a lookup value and what is a duplicate count. Also, it is further complicated by stripping out the zero int8_t values that were wasted bytes.
EDIT: This array is already stored in ROM, because RAM is even more limited than ROM.
EDIT: I am going to post a bounty for this question as soon as I can. I need a complete answer of how to store the information AND retrieve it. It does not need to be a 2D array as long as I can get the same values.
EDIT: Adding the actual array below:
{150,145,140,135,130,125,120,115,110,105,100,95,90,85,80,75,70,65,60,55,50,45,40,35,30,25,20,15,10,5,0,-4,-9,-14,-19,-24,-29,-34,-39,-44,-49,-54,-59,-64,-69,-74,-79,-84,-89,-94,-99,104,109,114,119,124,129,134,139,144,149,154,159,164,169,174,179,175,170,165,160,155,150}, \
{143,137,131,126,120,115,110,105,100,95,90,85,80,75,71,66,62,57,53,48,44,39,35,31,27,22,18,14,9,5,1,-3,-7,-11,-16,-20,-25,-29,-34,-38,-43,-47,-52,-57,-61,-66,-71,-76,-81,-86,-91,-96,101,107,112,117,123,128,134,140,146,151,157,163,169,175,178,172,166,160,154,148,143}, \
{130,124,118,112,107,101,96,92,87,82,78,74,70,65,61,57,54,50,46,42,38,34,31,27,23,19,16,12,8,4,1,-2,-6,-10,-14,-18,-22,-26,-30,-34,-38,-43,-47,-51,-56,-61,-65,-70,-75,-79,-84,-89,-94,100,105,111,116,122,128,135,141,148,155,162,170,177,174,166,159,151,144,137,130}, \
{111,104,99,94,89,85,81,77,73,70,66,63,60,56,53,50,46,43,40,36,33,30,26,23,20,16,13,10,6,3,0,-3,-6,-9,-13,-16,-20,-24,-28,-32,-36,-40,-44,-48,-52,-57,-61,-65,-70,-74,-79,-84,-88,-93,-98,103,109,115,121,128,135,143,152,162,172,176,165,154,144,134,125,118,111}, \
{85,81,77,74,71,68,65,63,60,58,56,53,51,49,46,43,41,38,35,32,29,26,23,19,16,13,10,7,4,1,-1,-3,-6,-9,-13,-16,-19,-23,-26,-30,-34,-38,-42,-46,-50,-54,-58,-62,-66,-70,-74,-78,-83,-87,-91,-95,100,105,110,117,124,133,144,159,178,160,141,125,112,103,96,90,85}, \
{62,60,58,57,55,54,52,51,50,48,47,46,44,42,41,39,36,34,31,28,25,22,19,16,13,10,7,4,2,0,-3,-5,-8,-10,-13,-16,-19,-22,-26,-29,-33,-37,-41,-45,-49,-53,-56,-60,-64,-67,-70,-74,-77,-80,-83,-86,-89,-91,-94,-97,101,105,111,130,109,84,77,74,71,68,66,64,62}, \
{46,46,45,44,44,43,42,42,41,41,40,39,38,37,36,35,33,31,28,26,23,20,16,13,10,7,4,1,-1,-3,-5,-7,-9,-12,-14,-16,-19,-22,-26,-29,-33,-36,-40,-44,-48,-51,-55,-58,-61,-64,-66,-68,-71,-72,-74,-74,-75,-74,-72,-68,-61,-48,-25,2,22,33,40,43,45,46,47,46,46}, \
{36,36,36,36,36,35,35,35,35,34,34,34,34,33,32,31,30,28,26,23,20,17,14,10,6,3,0,-2,-4,-7,-9,-10,-12,-14,-15,-17,-20,-23,-26,-29,-32,-36,-40,-43,-47,-50,-53,-56,-58,-60,-62,-63,-64,-64,-63,-62,-59,-55,-49,-41,-30,-17,-4,6,15,22,27,31,33,34,35,36,36}, \
{30,30,30,30,30,30,30,29,29,29,29,29,29,29,29,28,27,26,24,21,18,15,11,7,3,0,-3,-6,-9,-11,-12,-14,-15,-16,-17,-19,-21,-23,-26,-29,-32,-35,-39,-42,-45,-48,-51,-53,-55,-56,-57,-57,-56,-55,-53,-49,-44,-38,-31,-23,-14,-6,0,7,13,17,21,24,26,27,29,29,30}, \
{25,25,26,26,26,25,25,25,25,25,25,25,25,26,25,25,24,23,21,19,16,12,8,4,0,-3,-7,-10,-13,-15,-16,-17,-18,-19,-20,-21,-22,-23,-25,-28,-31,-34,-37,-40,-43,-46,-48,-49,-50,-51,-51,-50,-48,-45,-42,-37,-32,-26,-19,-13,-7,-1,3,7,11,14,17,19,21,23,24,25,25}, \
{21,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,21,20,18,16,13,9,5,1,-3,-7,-11,-14,-17,-18,-20,-21,-21,-22,-22,-22,-23,-23,-25,-27,-29,-32,-35,-37,-40,-42,-44,-45,-45,-45,-44,-42,-40,-36,-32,-27,-22,-17,-12,-7,-3,0,3,7,9,12,14,16,18,19,20,21,21}, \
{18,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,18,17,16,14,10,7,2,-1,-6,-10,-14,-17,-19,-21,-22,-23,-24,-24,-24,-24,-23,-23,-23,-24,-26,-28,-30,-33,-35,-37,-38,-39,-39,-38,-36,-34,-31,-28,-24,-19,-15,-10,-6,-3,0,1,4,6,8,10,12,14,15,16,17,18,18}, \
{16,16,17,17,17,17,17,17,17,17,17,16,16,16,16,16,16,15,13,11,8,4,0,-4,-9,-13,-16,-19,-21,-23,-24,-25,-25,-25,-25,-24,-23,-21,-20,-20,-21,-22,-24,-26,-28,-30,-31,-32,-31,-30,-29,-27,-24,-21,-17,-13,-9,-6,-3,-1,0,2,4,5,7,9,10,12,13,14,15,16,16}, \
{14,14,14,15,15,15,15,15,15,15,14,14,14,14,14,14,13,12,11,9,5,2,-2,-6,-11,-15,-18,-21,-23,-24,-25,-25,-25,-25,-24,-22,-21,-18,-16,-15,-15,-15,-17,-19,-21,-22,-24,-24,-24,-23,-22,-20,-18,-15,-12,-9,-5,-3,-1,0,1,2,4,5,6,8,9,10,11,12,13,14,14}, \
{12,13,13,13,13,13,13,13,13,13,13,13,12,12,12,12,11,10,9,6,3,0,-4,-8,-12,-16,-19,-21,-23,-24,-24,-24,-24,-23,-22,-20,-17,-15,-12,-10,-9,-9,-10,-12,-13,-15,-17,-17,-18,-17,-16,-15,-13,-11,-8,-5,-3,-1,0,1,1,2,3,4,6,7,8,9,10,11,12,12,12}, \
{11,11,11,11,11,12,12,12,12,12,11,11,11,11,11,10,10,9,7,5,2,-1,-5,-9,-13,-17,-20,-22,-23,-23,-23,-23,-22,-20,-18,-16,-14,-11,-9,-6,-5,-4,-5,-6,-8,-9,-11,-12,-12,-12,-12,-11,-9,-8,-6,-3,-1,0,0,1,1,2,3,4,5,6,7,8,9,10,11,11,11}, \
{10,10,10,10,10,10,10,10,10,10,10,10,10,10,9,9,9,7,6,3,0,-3,-6,-10,-14,-17,-20,-21,-22,-22,-22,-21,-19,-17,-15,-13,-10,-8,-6,-4,-2,-2,-2,-2,-4,-5,-7,-8,-8,-9,-8,-8,-7,-5,-4,-2,0,0,1,1,1,2,2,3,4,5,6,7,8,9,10,10,10}, \
{9,9,9,9,9,9,9,10,10,9,9,9,9,9,9,8,8,6,5,2,0,-4,-7,-11,-15,-17,-19,-21,-21,-21,-20,-18,-16,-14,-12,-10,-8,-6,-4,-2,-1,0,0,0,-1,-2,-4,-5,-5,-6,-6,-5,-5,-4,-3,-1,0,0,1,1,1,1,2,3,3,5,6,7,8,8,9,9,9}, \
{9,9,9,9,9,9,9,9,9,9,9,9,8,8,8,8,7,5,4,1,-1,-5,-8,-12,-15,-17,-19,-20,-20,-19,-18,-16,-14,-11,-9,-7,-5,-4,-2,-1,0,0,1,1,0,0,-2,-3,-3,-4,-4,-4,-3,-3,-2,-1,0,0,0,0,0,1,1,2,3,4,5,6,7,8,8,9,9}, \
{9,9,9,8,8,8,9,9,9,9,9,8,8,8,8,7,6,5,3,0,-2,-5,-9,-12,-15,-17,-18,-19,-19,-18,-16,-14,-12,-9,-7,-5,-4,-2,-1,0,0,1,1,1,1,0,0,-1,-2,-2,-3,-3,-2,-2,-1,-1,0,0,0,0,0,0,0,1,2,3,4,5,6,7,8,8,9}, \
{8,8,8,8,8,8,9,9,9,9,9,9,8,8,8,7,6,4,2,0,-3,-6,-9,-12,-15,-17,-18,-18,-17,-16,-14,-12,-10,-8,-6,-4,-2,-1,0,0,1,2,2,2,2,1,0,0,-1,-1,-1,-2,-2,-1,-1,0,0,0,0,0,0,0,0,0,1,2,3,4,5,6,7,8,8}, \
{8,8,8,8,9,9,9,9,9,9,9,9,9,8,8,7,5,3,1,-1,-4,-7,-10,-13,-15,-16,-17,-17,-16,-15,-13,-11,-9,-6,-5,-3,-2,0,0,0,1,2,2,2,2,1,1,0,0,0,-1,-1,-1,-1,-1,0,0,0,0,-1,-1,-1,-1,-1,0,0,1,3,4,5,7,7,8}, \
{8,8,9,9,9,9,10,10,10,10,10,10,10,9,8,7,5,3,0,-2,-5,-8,-11,-13,-15,-16,-16,-16,-15,-13,-12,-10,-8,-6,-4,-2,-1,0,0,1,2,2,3,3,2,2,1,0,0,0,0,0,0,0,0,0,0,-1,-1,-2,-2,-2,-2,-2,-1,0,0,1,3,4,6,7,8}, \
{7,8,9,9,9,10,10,11,11,11,11,11,10,10,9,7,5,3,0,-2,-6,-9,-11,-13,-15,-16,-16,-15,-14,-13,-11,-9,-7,-5,-3,-2,0,0,1,1,2,3,3,3,3,2,2,1,1,0,0,0,0,0,0,0,-1,-1,-2,-3,-3,-4,-4,-4,-3,-2,-1,0,1,3,5,6,7}, \
{6,8,9,9,10,11,11,12,12,12,12,12,11,11,9,7,5,2,0,-3,-7,-10,-12,-14,-15,-16,-15,-15,-13,-12,-10,-8,-7,-5,-3,-1,0,0,1,2,2,3,3,4,3,3,3,2,2,1,1,1,0,0,0,0,-1,-2,-3,-4,-4,-5,-5,-5,-5,-4,-2,-1,0,2,3,5,6}, \
{6,7,8,10,11,12,12,13,13,14,14,13,13,11,10,8,5,2,0,-4,-8,-11,-13,-15,-16,-16,-16,-15,-13,-12,-10,-8,-6,-5,-3,-1,0,0,1,2,3,3,4,4,4,4,4,3,3,3,2,2,1,1,0,0,-1,-2,-3,-5,-6,-7,-7,-7,-6,-5,-4,-3,-1,0,2,4,6}, \
{5,7,8,10,11,12,13,14,15,15,15,14,14,12,11,8,5,2,-1,-5,-9,-12,-14,-16,-17,-17,-16,-15,-14,-12,-11,-9,-7,-5,-3,-1,0,0,1,2,3,4,4,5,5,5,5,5,5,4,4,3,3,2,1,0,-1,-2,-4,-6,-7,-8,-8,-8,-8,-7,-6,-4,-2,0,1,3,5}, \
{4,6,8,10,12,13,14,15,16,16,16,16,15,13,11,9,5,2,-2,-6,-10,-13,-16,-17,-18,-18,-17,-16,-15,-13,-11,-9,-7,-5,-4,-2,0,0,1,3,3,4,5,6,6,7,7,7,7,7,6,5,4,3,2,0,-1,-3,-5,-7,-8,-9,-10,-10,-10,-9,-7,-5,-4,-1,0,2,4}, \
{4,6,8,10,12,14,15,16,17,18,18,17,16,15,12,9,5,1,-3,-8,-12,-15,-18,-19,-20,-20,-19,-18,-16,-15,-13,-11,-8,-6,-4,-2,-1,0,1,3,4,5,6,7,8,9,9,9,9,9,9,8,7,5,3,1,-1,-3,-6,-8,-10,-11,-12,-12,-11,-10,-9,-7,-5,-2,0,1,4}, \
{4,6,8,11,13,15,16,18,19,19,19,19,18,16,13,10,5,0,-5,-10,-15,-18,-21,-22,-23,-22,-22,-20,-18,-17,-14,-12,-10,-8,-5,-3,-1,0,1,3,5,6,8,9,10,11,12,12,13,12,12,11,9,7,5,2,0,-3,-6,-9,-11,-12,-13,-13,-12,-11,-10,-8,-6,-3,-1,1,4}, \
{3,6,9,11,14,16,17,19,20,21,21,21,19,17,14,10,4,-1,-8,-14,-19,-22,-25,-26,-26,-26,-25,-23,-21,-19,-17,-14,-12,-9,-7,-4,-2,0,1,3,5,7,9,11,13,14,15,16,16,16,16,15,13,10,7,4,0,-3,-7,-10,-12,-14,-15,-14,-14,-12,-11,-9,-6,-4,-1,1,3}, \
{4,6,9,12,14,17,19,21,22,23,23,23,21,19,15,9,2,-5,-13,-20,-25,-28,-30,-31,-31,-30,-29,-27,-25,-22,-20,-17,-14,-11,-9,-6,-3,0,1,4,6,9,11,13,15,17,19,20,21,21,21,20,18,15,11,6,2,-2,-7,-11,-13,-15,-16,-16,-15,-13,-11,-9,-7,-4,-1,1,4}, \
{4,7,10,13,15,18,20,22,24,25,25,25,23,20,15,7,-2,-12,-22,-29,-34,-37,-38,-38,-37,-36,-34,-31,-29,-26,-23,-20,-17,-13,-10,-7,-4,-1,2,5,8,11,13,16,18,21,23,24,26,26,26,26,24,21,17,12,5,0,-6,-10,-14,-16,-16,-16,-15,-14,-12,-10,-7,-4,-1,1,4}, \
{4,7,10,13,16,19,22,24,26,27,27,26,24,19,11,-1,-15,-28,-37,-43,-46,-47,-47,-45,-44,-41,-39,-36,-32,-29,-26,-22,-19,-15,-11,-8,-4,-1,2,5,9,12,15,19,22,24,27,29,31,33,33,33,32,30,26,21,14,6,0,-6,-11,-14,-15,-16,-15,-14,-12,-9,-7,-4,-1,1,4}, \
{6,9,12,15,18,21,23,25,27,28,27,24,17,4,-14,-34,-49,-56,-60,-60,-60,-58,-56,-53,-50,-47,-43,-40,-36,-32,-28,-25,-21,-17,-13,-9,-5,-1,2,6,10,14,17,21,24,28,31,34,37,39,41,42,43,43,41,38,33,25,17,8,0,-4,-8,-10,-10,-10,-8,-7,-4,-2,0,3,6}, \
{22,24,26,28,30,32,33,31,23,-18,-81,-96,-99,-98,-95,-93,-89,-86,-82,-78,-74,-70,-66,-62,-57,-53,-49,-44,-40,-36,-32,-27,-23,-19,-14,-10,-6,-1,2,6,10,15,19,23,27,31,35,38,42,45,49,52,55,57,60,61,63,63,62,61,57,53,47,40,33,28,23,21,19,19,19,20,22}, \
{168,173,178,176,171,166,161,156,151,146,141,136,131,126,121,116,111,106,101,-96,-91,-86,-81,-76,-71,-66,-61,-56,-51,-46,-41,-36,-31,-26,-21,-16,-11,-6,-1,3,8,13,18,23,28,33,38,43,48,53,58,63,68,73,78,83,88,93,98,103,108,113,118,123,128,133,138,143,148,153,158,163,168}, \
Thanks for your time.
I see several options for your array compaction.
1. Separate 8-bit and 1-bit arrays
You can split your array into two parts: the first stores the 8 low-order bits of each original value, and the second stores '1' if the value does not fit in 8 bits and '0' otherwise. This takes 9 bits per value (the same space as nightcracker's approach, but a little simpler). To read a value from these two arrays, do the following:
int8_t array8[37*73] = {...};
uint16_t array1[(37*73+15)/16] = {...};
size_t offset = 73 * x + y; // x is the row (0..36), y is the column (0..72)
int16_t item = static_cast<int16_t>(array8[offset]); // sign extend
int16_t overflow = (array1[offset/16] >> (offset%16)) & 0x0001;
if (overflow) item ^= -256; // flip bits 8..15 to undo the wrong sign extension
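For illustration, here is a minimal C sketch of how the two arrays from option 1 could be built and read back. The six-entry table and the names `split_table`, `read_item`, `low8` and `overflow_bits` are made up for this example; in a real build the two arrays would be generated offline and placed in ROM as constants.

```c
#include <stddef.h>
#include <stdint.h>

#define N 6  /* hypothetical tiny table, for illustration only */

static const int16_t original[N] = {150, -99, 127, -128, 179, 0};
static int8_t   low8[N];                       /* low 8 bits of each value */
static uint16_t overflow_bits[(N + 15) / 16];  /* 1 bit per value */

/* Build both arrays from the original table. */
void split_table(void)
{
    for (size_t i = 0; i < N; ++i) {
        int v  = original[i];
        int lo = v & 0xFF;                       /* 0..255 */
        low8[i] = (int8_t)(lo >= 128 ? lo - 256 : lo);
        if (v < INT8_MIN || v > INT8_MAX)        /* does not fit in 8 bits */
            overflow_bits[i / 16] |= (uint16_t)(1u << (i % 16));
    }
}

/* Recover one original value. */
int16_t read_item(size_t i)
{
    int item = low8[i];                          /* sign-extended low byte */
    if ((overflow_bits[i / 16] >> (i % 16)) & 1u)
        item ^= -256;                            /* flip bits 8 and above */
    return (int16_t)item;
}
```

When the overflow bit is set, the sign extension of the low byte went the wrong way, so flipping every bit above bit 7 restores the original value.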
2. Approximation
If you can approximate your array with some efficiently computed function (such as a polynomial or an exponential), you can store in the array only the difference between each value and the approximation. This may require only 8 bits per value or even fewer.
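A rough C sketch of the idea, assuming a straight line as the approximating function. The eight sample values are columns 26 to 33 of the first row in the question; the function `approx` and the other names are hypothetical, and in practice the residual array would be precomputed and stored in ROM.

```c
#include <stddef.h>
#include <stdint.h>

#define N 8

/* Columns 26..33 of the first row of the question's table. */
static const int16_t sample[N] = {20, 15, 10, 5, 0, -4, -9, -14};

/* Assumed approximation: a straight line through the first values. */
static int approx(size_t col)
{
    return 20 - 5 * (int)col;
}

/* Only these small residuals would need to be stored. */
static int8_t residual[N];

void build_residuals(void)
{
    for (size_t i = 0; i < N; ++i)
        residual[i] = (int8_t)(sample[i] - approx(i));
}

/* Lookup = cheap function + stored correction. */
int16_t lookup(size_t col)
{
    return (int16_t)(approx(col) + residual[col]);
}
```

Here the residuals are only 0 or 1, so they would fit in far fewer than 8 bits; how well this works depends entirely on how closely the function follows the data.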
3. Delta encoding
If your data is smooth enough, then in addition to applying either of the previous methods you can store a shorter table holding only some of the data values, plus a second table holding the differences between each remaining value and the corresponding value from the first table. This requires fewer bits per value.
For example, you can store every fifth value and the differences for the other values:
Original array: 0 0 1 1 2 2 2 2 2 3 3 3 4 4 5 5 5 5 5 6 6 6 6 6 6 6 6 7 7 7
Short array: 0 2 3 5 6 6
Difference array: 0 1 1 2 0 0 0 1 0 1 1 2 0 0 0 1 0 0 0 0 0 1 1 1
Alternatively, you can use differences from the previous value, which requires even fewer bits per value:
Original array: 0 0 1 1 2 2 2 2 2 3 3 3 4 4 5 5 5 5 5 6 6 6 6 6 6 6 6 7 7 7
Short array: 0 2 3 5 6 6
Delta array: 0 1 0 1 0 0 0 1 0 1 0 1 0 0 0 1 0 0 0 0 0 1 0 0
The delta-array approach can be implemented efficiently with bitwise operations if a group of delta values fits exactly in an int16_t.
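Both variants can be sketched in C using the exact arrays from the example above. The function names and the `STEP` constant are made up, and the deltas are kept one per byte for readability; a real table would pack them into a few bits each.

```c
#include <stddef.h>
#include <stdint.h>

#define STEP 5  /* one full value is kept for every STEP entries */

static const uint8_t short_array[] = {0, 2, 3, 5, 6, 6};

/* Differences from the stored value of the same group (first variant). */
static const uint8_t diff_array[] = {
    0,1,1,2, 0,0,0,1, 0,1,1,2, 0,0,0,1, 0,0,0,0, 0,1,1,1};

/* Differences from the previous value (second variant). */
static const uint8_t delta_array[] = {
    0,1,0,1, 0,0,0,1, 0,1,0,1, 0,0,0,1, 0,0,0,0, 0,1,0,0};

/* First variant: at most one addition per lookup. */
int lookup_from_anchor(size_t i)
{
    size_t group = i / STEP, pos = i % STEP;
    int value = short_array[group];
    if (pos != 0)
        value += diff_array[group * (STEP - 1) + (pos - 1)];
    return value;
}

/* Second variant: sums up to STEP - 1 deltas per lookup. */
int lookup_from_previous(size_t i)
{
    size_t group = i / STEP, pos = i % STEP;
    int value = short_array[group];
    for (size_t k = 0; k < pos; ++k)
        value += delta_array[group * (STEP - 1) + k];
    return value;
}
```

The second variant trades a short loop for smaller deltas, which is the usual trade-off between lookup speed and table size.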
Initialization
For option #2, the preprocessor may be used. For the other options the preprocessor is possible but may not be very convenient (it is not very good at processing long value lists). Some combination of the preprocessor and variadic templates may be better, or it may be easier to use a text-processing script.
Update
After looking at the actual data, I can give some more details. Option #2 (Approximation) is not very convenient for your data. Option #1 seems to be better. Or you can use Mark Ransom's or nightcracker's approach. It doesn't matter which one: in all cases you save 7 bits out of 16.
Option #3 (Delta encoding) saves much more space. It cannot be used directly, because in some cells of the array the data changes abruptly. But, as far as I know, these large changes happen at most once per row. This may be handled with one additional column holding the full data value and one special value in the delta array.
I noticed that (ignoring these abrupt changes) the difference between neighboring values is never more than +/- 32. This requires 6 bits to encode each delta value, which means 6.6 bits per value: 58% compression, about 2400 bytes. (Not much, but a little bit better than the 2464K in your comments.)
The middle part of the array is much smoother. You'll need only 5 bits per value to encode it separately. This may save 300..400 bytes more. It's probably a good idea to split this array into several parts and encode each part differently.
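To decide how many bits a region needs, a small helper along these lines could be run offline over each row or block. This is only a sketch: the function names and the two sample arrays are made up, with the first sample resembling a smooth middle row and the second containing one of the abrupt jumps.

```c
#include <stddef.h>
#include <stdint.h>

/* Smallest width n such that every value in [lo, hi] fits in an
   n-bit two's complement field. */
unsigned bits_for_range(int lo, int hi)
{
    unsigned n = 1;
    while (lo < -(1 << (n - 1)) || hi > (1 << (n - 1)) - 1)
        ++n;
    return n;
}

/* Width needed for the deltas between neighbors in one row. */
unsigned delta_bits(const int16_t *row, size_t n)
{
    int lo = 0, hi = 0;
    for (size_t i = 1; i < n; ++i) {
        int d = row[i] - row[i - 1];
        if (d < lo) lo = d;
        if (d > hi) hi = d;
    }
    return bits_for_range(lo, hi);
}

/* Hypothetical samples: a smooth stretch and one with an abrupt jump. */
static const int16_t smooth[] = {8, 9, 9, 7, 4, 0, -3};
static const int16_t jumpy[]  = {-94, -99, 104};
```

Running `delta_bits` separately on different parts of the table shows where a narrower encoding is possible and where a jump forces a special case.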
As nightcracker has noted, your values will fit into 9 bits. There's an easier way to store those values though: put the absolute values into a byte array and the sign bits into a separate packed bit array.
uint8_t my_array[37][73] = {{**DATA ABSOLUTE VALUES HERE**}};
uint8_t my_signs[37][10] = {{**SIGN BITS HERE**}};
int16_t my_value = my_array[i][j];
if (my_signs[i][j/8] & (1 << j%8))
    my_value = -my_value;
This is a 43% reduction in your original table size without too much effort.
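As a sketch of how the two tables could be generated and checked, here is a small C example. The 2x10 sample table and all names are made up for illustration; the real tables would be produced once on a PC and pasted into the source as ROM constants.

```c
#include <stddef.h>
#include <stdint.h>

#define ROWS 2
#define COLS 10
#define SIGN_BYTES ((COLS + 7) / 8)  /* sign bits per row, rounded up */

/* Hypothetical sample values in the same range as the real table. */
static const int16_t original[ROWS][COLS] = {
    {150, 145, 0, -4, -9, -99, 104, 179, 175, 5},
    {22, -18, -81, -96, 2, 6, -1, 63, 168, -96},
};
static uint8_t magnitudes[ROWS][COLS];
static uint8_t signs[ROWS][SIGN_BYTES];

void build_tables(void)
{
    for (size_t r = 0; r < ROWS; ++r)
        for (size_t c = 0; c < COLS; ++c) {
            int v = original[r][c];
            magnitudes[r][c] = (uint8_t)(v < 0 ? -v : v);
            if (v < 0)
                signs[r][c / 8] |= (uint8_t)(1u << (c % 8));
        }
}

int16_t lookup(size_t r, size_t c)
{
    int v = magnitudes[r][c];
    if (signs[r][c / 8] & (1u << (c % 8)))
        v = -v;
    return (int16_t)v;
}

/* Number of entries that do not survive the round trip. */
int count_mismatches(void)
{
    int bad = 0;
    for (size_t r = 0; r < ROWS; ++r)
        for (size_t c = 0; c < COLS; ++c)
            if (lookup(r, c) != original[r][c])
                ++bad;
    return bad;
}
```

Note that the magnitudes need an unsigned byte, since values up to 179 do not fit in a signed one.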
I know from experience that visualizing things can help find a good solution to a problem. Since it isn't very clear what your data actually represents (and so we know very little about the problem domain), we might not come up with "the best" solution (if one exists at all, of course). So I took the liberty of visualizing the data; as the saying goes, a picture is worth a thousand words :-)
I am sorry I do not have a solution (yet) better than the ones already posted but I thought the plot might help someone (or myself) come up with a better solution.
You want the range +-179, which means 360 values are enough. It is possible to express 360 unique values in 9 bits. This is an example of a 9 bit integer lookup table:
// size is ceil(37 * 73 * 9 / 16)
uint16_t my_array[1520];
int16_t get_lookup_item(int x, int y) {
    // calculate bitoffset (x is the row, y is the column)
    size_t bitoffset = (73 * x + y) * 9;
    // calculate difference with 16 bit array offset
    size_t diff = bitoffset % 16;
    uint16_t item;
    // our item doesn't overlap a 16 bit boundary
    if (diff <= (16 - 9)) {
        item = my_array[bitoffset / 16]; // get item
        item >>= diff;
        item &= (1 << 9) - 1;
    // our item does overlap a 16 bit boundary
    } else {
        // the low (16 - diff) bits come from the first word
        item = my_array[bitoffset / 16];
        item >>= diff;
        item &= (1 << (16 - diff)) - 1;
        // the remaining high bits come from the next word
        item += (my_array[bitoffset / 16 + 1] & ((1 << (9 - 16 + diff)) - 1)) << (16 - diff);
    }
    // we now have the unsigned item, subtract 179 to bring it into the correct range
    return item - 179;
}
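The packed table has to be generated somehow, so here is a hedged C sketch of a matching writer together with a reader, round-tripping through a tiny buffer. The names `put_item`, `get_item` and `packed` are hypothetical, and the bit layout (least significant bits first within each 16-bit word) is an assumption chosen to match the reader above.

```c
#include <stddef.h>
#include <stdint.h>

#define BITS   9
#define OFFSET 179  /* bias so that stored values are non-negative */
#define COUNT  8    /* hypothetical tiny table, for illustration only */

static uint16_t packed[(COUNT * BITS + 15) / 16];  /* starts zeroed */

/* Store one value; assumes the target bits are still zero. */
void put_item(size_t index, int16_t value)
{
    uint32_t item = (uint32_t)(value + OFFSET);   /* 0..358 fits in 9 bits */
    size_t bitoffset = index * BITS;
    size_t word = bitoffset / 16, shift = bitoffset % 16;
    uint32_t window = item << shift;              /* at most 24 bits wide */
    packed[word] |= (uint16_t)(window & 0xFFFFu);
    if (shift + BITS > 16)                        /* spills into next word */
        packed[word + 1] |= (uint16_t)(window >> 16);
}

/* Read one value back. */
int16_t get_item(size_t index)
{
    size_t bitoffset = index * BITS;
    size_t word = bitoffset / 16, shift = bitoffset % 16;
    uint32_t window = packed[word];
    if (shift + BITS > 16)
        window |= (uint32_t)packed[word + 1] << 16;
    uint32_t item = (window >> shift) & ((1u << BITS) - 1u);
    return (int16_t)((int)item - OFFSET);
}
```

Working in a 32-bit window avoids handling the boundary case with separate masks, at the cost of slightly wider arithmetic on a small microcontroller.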
Here's another approach, totally different from my first one which is why it's a separate answer.
If the number of values that won't fit in 8 bits is less than 1/8 of the total, you can devote an entire extra byte to each and still wind up with a smaller result versus keeping another 1-bit array.
In the interest of simplicity and speed I wanted to stick with full byte values, rather than bit packing. You've never said if there are speed constraints to this problem, but decoding an entire file just to look up one value seems wasteful. If this really isn't a problem for you, your best results would probably come from implementing the decoding part of some readily available open-source compression utility.
For this implementation I kept to a very simple encoding. First I did a delta as suggested by Evgeny Kluev, starting over for each row; your data is uncommonly amenable to this approach. Each byte is then encoded via the following rules:
An absolute value >= 97 is given a leading byte of 97. This value was arrived at by trying different thresholds and choosing the one that generated the smallest result. This is followed by the value less 97.
The run length is only checked for values between -96 and 96. Run lengths between 3 and 32 are encoded as 98 to 127, and run lengths between 33 and 64 are encoded as -97 to -128.
Finally the values between -96 and 96 are output as is.
This results in an encoded array of 2014 bytes, plus another of 36 bytes for indexing to the start of each row for a total of 2050 bytes.
A full implementation can be found at http://ideone.com/SNdRI . The output is identical to the table posted in the question.
As others have suggested, you can save a lot of space by storing the absolute value of each entry in an array of 8-bit integers, and the sign bit in a separate packed bit array. Mark Ransom's solution is simple and will give good performance, and will reduce the size from 5,402 bytes to 3,071 bytes, saving 43.1%.
If you are really trying to squeeze out every last bit of space, you can do a bit better still by exploiting the characteristics of this data set. In particular, note that the values are mostly positive, and that there are several runs of values with the same sign. Instead of tracking the sign for every value in the "my_signs" array, you could track only the runs of negative values as a start index (two bytes, for the range [0..2700]) and a run length (one byte, since the longest run is 37 entries long). For this data set, that reduces the size of the signs table from 370 bytes to 168 bytes. The total storage is then 2,869 bytes, a savings of 46.8% compared to the original (2,533 bytes less).
Here's code that implements this strategy:
uint8_t my_array[37][73] = {{ /* ABSOLUTE VALUES OF ORIGINAL ARRAY HERE */ }};
// Sign bits for the values in my_array. The data is arranged in groups of
// three bytes. The first two give the starting index of a run of negative
// values. The third gives the length of the run. To determine if a given
// value should be negated, compute its index as (row * 73) + col, then scan this
// table to see if that index appears in any of the runs. If it does, the value
// should be negated.
uint8_t my_signs[168] = {
0x00, 0x1f, 0x14, 0x00, 0x68, 0x15, 0x00, 0xb1, 0x16, 0x00, 0xfa, 0x18,
0x01, 0x42, 0x1a, 0x01, 0x8b, 0x1e, 0x01, 0xd2, 0x23, 0x02, 0x1a, 0x24,
0x02, 0x62, 0x24, 0x02, 0xaa, 0x25, 0x02, 0xf2, 0x25, 0x03, 0x3a, 0x25,
0x03, 0x83, 0x25, 0x03, 0xcb, 0x25, 0x04, 0x14, 0x24, 0x04, 0x5c, 0x24,
0x04, 0xa5, 0x23, 0x04, 0xee, 0x14, 0x05, 0x05, 0x0c, 0x05, 0x36, 0x14,
0x05, 0x50, 0x0a, 0x05, 0x7f, 0x13, 0x05, 0x9a, 0x09, 0x05, 0xc8, 0x12,
0x05, 0xe4, 0x07, 0x06, 0x10, 0x12, 0x06, 0x2f, 0x05, 0x06, 0x38, 0x05,
0x06, 0x59, 0x12, 0x06, 0x7f, 0x08, 0x06, 0xa2, 0x11, 0x06, 0xc7, 0x0b,
0x06, 0xeb, 0x11, 0x07, 0x10, 0x0c, 0x07, 0x34, 0x11, 0x07, 0x59, 0x0d,
0x07, 0x7c, 0x12, 0x07, 0xa2, 0x0d, 0x07, 0xc5, 0x12, 0x07, 0xeb, 0x0e,
0x08, 0x0e, 0x13, 0x08, 0x34, 0x0e, 0x08, 0x57, 0x13, 0x08, 0x7e, 0x0e,
0x08, 0x9f, 0x14, 0x08, 0xc7, 0x0e, 0x08, 0xe8, 0x14, 0x09, 0x10, 0x0e,
0x09, 0x30, 0x16, 0x09, 0x5a, 0x0d, 0x09, 0x78, 0x17, 0x09, 0xa4, 0x0c,
0x09, 0xc0, 0x18, 0x09, 0xef, 0x09, 0x0a, 0x04, 0x1d, 0x0a, 0x57, 0x14
};
int getSign(int row, int col)
{
    int want = (row * 73) + col;
    for (int i = 0 ; i < 168 ; i += 3) {
        int16_t start = (my_signs[i] << 8) | my_signs[i + 1];
        if (start > want) {
            // Not going to find it, so may as well stop now.
            break;
        }
        int runlength = my_signs[i + 2];
        if (want < start + runlength) {
            // Found this index in the signs array, so this entry is negative.
            return -1;
        }
    }
    return 1;
}
int16_t getValue(int row, int col)
{
    return getSign(row, col) * my_array[row][col];
}
In fact you could even do a little bit better still, at the cost of more complex code, by recognizing that for the run-length encoded version of the signs table, you really need only 12 bits for the start index and 6 bits for the run length, for 18 bits total (compared to the 24 that the simple implementation above uses). That would cut the size another 42 bytes to 2,827 total, a 47.6% savings compared to the original (2,575 bytes less).
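A possible shape for that tighter packing, sketched in C. The first three runs of the table above are used as sample data; the helper names and the MSB-first bit order are my own assumptions, not part of the code above.

```c
#include <stddef.h>
#include <stdint.h>

#define RUN_BITS 18  /* 12-bit start index + 6-bit run length */
#define RUNS     3   /* only the first three runs, for illustration */

static uint8_t packed_runs[(RUNS * RUN_BITS + 7) / 8];  /* starts zeroed */

/* Write nbits of v at bit position bitpos, most significant bit first. */
static void write_bits(uint8_t *buf, size_t bitpos, unsigned nbits, uint32_t v)
{
    for (unsigned i = 0; i < nbits; ++i, ++bitpos)
        if ((v >> (nbits - 1 - i)) & 1u)
            buf[bitpos / 8] |= (uint8_t)(0x80u >> (bitpos % 8));
}

/* Read nbits starting at bit position bitpos. */
static uint32_t read_bits(const uint8_t *buf, size_t bitpos, unsigned nbits)
{
    uint32_t v = 0;
    for (unsigned i = 0; i < nbits; ++i, ++bitpos)
        v = (v << 1) | ((buf[bitpos / 8] >> (7 - bitpos % 8)) & 1u);
    return v;
}

void pack_run(size_t k, unsigned start, unsigned length)
{
    write_bits(packed_runs, k * RUN_BITS, 12, start);
    write_bits(packed_runs, k * RUN_BITS + 12, 6, length);
}

/* index is (row * 73) + col; returns 1 if that entry is negative. */
int is_negative(size_t index)
{
    for (size_t k = 0; k < RUNS; ++k) {
        unsigned start  = read_bits(packed_runs, k * RUN_BITS, 12);
        unsigned length = read_bits(packed_runs, k * RUN_BITS + 12, 6);
        if (start > index)
            break;                 /* runs are sorted, so stop early */
        if (index < start + length)
            return 1;
    }
    return 0;
}
```

Reading bit by bit is slow but short; a faster reader could fetch three bytes at a time, since an 18-bit field never spans more than three.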
Investigating the actual array shows that the data is very smooth and may be compacted significantly. Simple methods do not give much space reduction beyond encoding the 16-bit values in 9 bits. This is because the data characteristics vary between different places in the array. Splitting the array into several pieces and encoding them differently may reduce the array size further, but this is more complicated and increases code size.
The approach described here encodes data blocks of variable length while still giving relatively quick access to the original values (though slower than the simple methods). For the price of speed, the compression ratio increases significantly.
The main idea is delta encoding. But in comparison to the simple algorithm in my previous post, variable block length and variable bit depth are possible. This allows, for example, a bit depth of zero for the deltas of repeating values, which means only a fixed header and no delta values at all (similar to run-length encoding).
Also, there is a single base value for all deltas in a block. This allows encoding linearly changing data (which is quite common in the actual array) with only the base value, again spending zero space on delta values. It also slightly decreases the average bit depth in other cases.
Compressed data is stored in an array of bitstreams, accessed by a bitstream reader. To give quick access to the start of each bitstream, an index table is used (just an array of 37 16-bit indexes).
Each bitstream starts with the number of blocks in the stream (5 bits), followed by the index of blocks, and finally the data blocks. The index of blocks gives a way to skip unneeded data blocks during the search. Each index entry contains: the number of elements in the block (4 bits allow encoding from 9 to 24 delta values, plus the starting value), the size of the base value for all deltas (1 bit, for sizes of 4 or 6), and the size of the deltas (2 bits, for sizes 0..3 if the base size is 4, or 2..5 if the base size is 6). These specific bit depths are probably close to optimal, but may be changed to trade some speed for some space or to adapt the algorithm to a different data array.
A data block contains the starting value (9 bits), the base value for the deltas (4 or 6 bits), and the delta values (0..3 or 2..5 bits each).
Here is the function that extracts original values from the compressed data:
int get(size_t row, unsigned col)
{
    BitstreamReader bsr(indexTable[row]);
    unsigned blocks = bsr.getUI(5);
    unsigned block = 0;
    unsigned start = 0;
    unsigned nextStart = 0;
    unsigned offset = 0;
    unsigned nextOffset = 0;
    unsigned blockSize = 0;
    unsigned baseSize = 0;
    unsigned deltaSize = 0;
    while (col >= nextStart) // 3 iterations on average
    {
        start = nextStart;
        offset = nextOffset;
        ++block;
        blockSize = bsr.getUI(4) + 9;
        nextStart += blockSize;
        baseSize = bsr.getUI(1)*2 + 4;
        deltaSize = bsr.getUI(2) + baseSize - 4;
        nextOffset += deltaSize * blockSize + baseSize + 9;
    }
    --block;
    bsr.skip((blocks - block) * 7 + offset);
    int value = bsr.getI(9);
    int base = bsr.getI(baseSize);
    while (col-- > start) // 12 iterations on average
    {
        int delta = base + bsr.getUI(deltaSize);
        value += delta;
    }
    return value;
}
Here is an implementation for bitstream reader:
class BitstreamReader
{
public:
    BitstreamReader(size_t start): word_(start), bit_(0) {}
    void skip(unsigned offset)
    {
        // advance by whole words, carrying any overflow of the bit position
        word_ += (bit_ + offset) / 16;
        bit_ = (bit_ + offset) % 16;
    }
    unsigned getUI(unsigned size)
    {
        unsigned old = bit_;
        unsigned result = dataTable[word_] >> bit_;
        result &= ((1 << size) - 1);
        bit_ += size;
        if (bit_ >= 16)
        {
            ++word_;
            bit_ -= 16;
            if (bit_ > 0)
            {
                // the field continues in the low bits of the next word
                result += (dataTable[word_] & ((1 << bit_) - 1)) << (16 - old);
            }
        }
        return result;
    }
    int getI(unsigned size)
    {
        int result = static_cast<int>(getUI(size));
        // sign-extend from the top bit of the field
        return result | -(result & (1 << (size - 1)));
    }
private:
    size_t word_;
    unsigned bit_;
};
I computed an estimate of the resulting data size (I am not posting the code that did it because of its very low quality). The result is 1250 bytes, which is larger than what the best compression programs can achieve, but significantly smaller than with any of the simple methods.
Update
1250 bytes is not the limit. This algorithm may be improved to compress the data further and to work faster.
I noticed that the number of blocks (5 bits) may be moved from the bitstream to unused bits of the row index table. This saves about 30 bytes.
To save 20 bytes more, you can store the bitstreams in bytes instead of uint16, which saves space on padding bits.
So we have about 1200 bytes, which is not exact. The size may be a little underestimated because I didn't take into account that not every bit depth can be encoded in the row index. It may also be overestimated, because the only heuristic assumed for the encoder was calculating the bit depth for the first 9 values and limiting the block size only if this bit depth needs to be increased by more than 2 bits. Of course, the encoder may be smarter than this.
Decoding speed may also be increased. If we move the 9th bit from the original values to the row indexes, each element of the index is exactly 8 bits. This allows bitstreams to start with a set of bytes, each of which may be decoded with faster methods than a general bitstream's accessors. The remaining 8 bits of the original value may be moved to the place just after the row index for the same purpose. Or, alternatively, they may be included in each index entry, so that the index consists of 16-bit values. After these modifications, bitstreams contain only data fields of variable length.
1049 bytes
I noticed that most runs are linear. That is why I decided to encode not the delta value, but a delta-of-delta. Think of it as a second derivative. This lets me store the values -1, 0 and 1 most of the time, with some notable exceptions.
Secondly, I make the data one-dimensional. Converting it back into two dimensions is easy, but keeping it in one dimension permits the compression to span several lines.
The compressed data is organized in varying-size chunks. Each chunk starts with a header:
9 bits - an absolute value, the value of input[x]
7 bits - a difference, the value of input[x+1]-input[x]
7 bits - a difference, the value of input[x+2]-input[x+1]
9 bits - the length of following data of second-derivative
2 bits each - an array of the second-derivative
The runs of second derivatives in this example are surprisingly long, although only the values -2, -1, 0 and 1 can be stored.
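To make the delta-of-delta idea concrete, here is a simplified C sketch. It is not the chunk format described above: it keeps a single starting difference and stores the second differences one per byte instead of packing them into 2 bits, and all names are made up. The sample values are columns 28 to 35 of the first row.

```c
#include <stddef.h>
#include <stdint.h>

#define N 8

/* Columns 28..35 of the first row of the input table. */
static const int16_t input[N] = {10, 5, 0, -4, -9, -14, -19, -24};

static int16_t first_value;  /* input[0] */
static int16_t first_delta;  /* input[1] - input[0] */
static int8_t  dd[N - 2];    /* second differences */

void encode(void)
{
    first_value = input[0];
    first_delta = (int16_t)(input[1] - input[0]);
    for (size_t k = 0; k + 2 < N; ++k)
        dd[k] = (int8_t)((input[k + 2] - input[k + 1])
                       - (input[k + 1] - input[k]));
}

/* Rebuild input[i] by integrating twice from the start of the chunk. */
int16_t decode_at(size_t i)
{
    int value = first_value, slope = first_delta;
    for (size_t k = 0; k < i; ++k) {
        value += slope;        /* value is now input[k + 1] */
        if (k < N - 2)
            slope += dd[k];    /* slope for the next step */
    }
    return (int16_t)value;
}
```

For this stretch the second differences come out as 0, 1, -1, 0, 0, 0, so each one would fit in the 2-bit field.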
Below I provide complete, compilable code. It contains:
C (GCC) code. No C++ constructs.
The input array you provided
Visualization function to print the contents of the array
Compression function (in case your input changes a bit)
Getter function - to fetch an element out from the array
In the main function: I compress, decompress and perform a check
Have fun!
#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
typedef int16_t Arr[37][73];
typedef int16_t ArrFlat[37*73];
typedef int16_t* ArrPtr;
Arr input = { {150,145,140,135,130,125,120,115,110,105,100,95,90,85,80,75,70,65,60,55,50,45,40,35,30,25,20,15,10,5,0,-4,-9,-14,-19,-24,-29,-34,-39,-44,-49,-54,-59,-64,-69,-74,-79,-84,-89,-94,-99,104,109,114,119,124,129,134,139,144,149,154,159,164,169,174,179,175,170,165,160,155,150}, \
{143,137,131,126,120,115,110,105,100,95,90,85,80,75,71,66,62,57,53,48,44,39,35,31,27,22,18,14,9,5,1,-3,-7,-11,-16,-20,-25,-29,-34,-38,-43,-47,-52,-57,-61,-66,-71,-76,-81,-86,-91,-96,101,107,112,117,123,128,134,140,146,151,157,163,169,175,178,172,166,160,154,148,143}, \
{130,124,118,112,107,101,96,92,87,82,78,74,70,65,61,57,54,50,46,42,38,34,31,27,23,19,16,12,8,4,1,-2,-6,-10,-14,-18,-22,-26,-30,-34,-38,-43,-47,-51,-56,-61,-65,-70,-75,-79,-84,-89,-94,100,105,111,116,122,128,135,141,148,155,162,170,177,174,166,159,151,144,137,130}, \
{111,104,99,94,89,85,81,77,73,70,66,63,60,56,53,50,46,43,40,36,33,30,26,23,20,16,13,10,6,3,0,-3,-6,-9,-13,-16,-20,-24,-28,-32,-36,-40,-44,-48,-52,-57,-61,-65,-70,-74,-79,-84,-88,-93,-98,103,109,115,121,128,135,143,152,162,172,176,165,154,144,134,125,118,111}, \
{85,81,77,74,71,68,65,63,60,58,56,53,51,49,46,43,41,38,35,32,29,26,23,19,16,13,10,7,4,1,-1,-3,-6,-9,-13,-16,-19,-23,-26,-30,-34,-38,-42,-46,-50,-54,-58,-62,-66,-70,-74,-78,-83,-87,-91,-95,100,105,110,117,124,133,144,159,178,160,141,125,112,103,96,90,85}, \
{62,60,58,57,55,54,52,51,50,48,47,46,44,42,41,39,36,34,31,28,25,22,19,16,13,10,7,4,2,0,-3,-5,-8,-10,-13,-16,-19,-22,-26,-29,-33,-37,-41,-45,-49,-53,-56,-60,-64,-67,-70,-74,-77,-80,-83,-86,-89,-91,-94,-97,101,105,111,130,109,84,77,74,71,68,66,64,62}, \
{46,46,45,44,44,43,42,42,41,41,40,39,38,37,36,35,33,31,28,26,23,20,16,13,10,7,4,1,-1,-3,-5,-7,-9,-12,-14,-16,-19,-22,-26,-29,-33,-36,-40,-44,-48,-51,-55,-58,-61,-64,-66,-68,-71,-72,-74,-74,-75,-74,-72,-68,-61,-48,-25,2,22,33,40,43,45,46,47,46,46}, \
{36,36,36,36,36,35,35,35,35,34,34,34,34,33,32,31,30,28,26,23,20,17,14,10,6,3,0,-2,-4,-7,-9,-10,-12,-14,-15,-17,-20,-23,-26,-29,-32,-36,-40,-43,-47,-50,-53,-56,-58,-60,-62,-63,-64,-64,-63,-62,-59,-55,-49,-41,-30,-17,-4,6,15,22,27,31,33,34,35,36,36}, \
{30,30,30,30,30,30,30,29,29,29,29,29,29,29,29,28,27,26,24,21,18,15,11,7,3,0,-3,-6,-9,-11,-12,-14,-15,-16,-17,-19,-21,-23,-26,-29,-32,-35,-39,-42,-45,-48,-51,-53,-55,-56,-57,-57,-56,-55,-53,-49,-44,-38,-31,-23,-14,-6,0,7,13,17,21,24,26,27,29,29,30}, \
{25,25,26,26,26,25,25,25,25,25,25,25,25,26,25,25,24,23,21,19,16,12,8,4,0,-3,-7,-10,-13,-15,-16,-17,-18,-19,-20,-21,-22,-23,-25,-28,-31,-34,-37,-40,-43,-46,-48,-49,-50,-51,-51,-50,-48,-45,-42,-37,-32,-26,-19,-13,-7,-1,3,7,11,14,17,19,21,23,24,25,25}, \
{21,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,21,20,18,16,13,9,5,1,-3,-7,-11,-14,-17,-18,-20,-21,-21,-22,-22,-22,-23,-23,-25,-27,-29,-32,-35,-37,-40,-42,-44,-45,-45,-45,-44,-42,-40,-36,-32,-27,-22,-17,-12,-7,-3,0,3,7,9,12,14,16,18,19,20,21,21}, \
{18,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,18,17,16,14,10,7,2,-1,-6,-10,-14,-17,-19,-21,-22,-23,-24,-24,-24,-24,-23,-23,-23,-24,-26,-28,-30,-33,-35,-37,-38,-39,-39,-38,-36,-34,-31,-28,-24,-19,-15,-10,-6,-3,0,1,4,6,8,10,12,14,15,16,17,18,18}, \
{16,16,17,17,17,17,17,17,17,17,17,16,16,16,16,16,16,15,13,11,8,4,0,-4,-9,-13,-16,-19,-21,-23,-24,-25,-25,-25,-25,-24,-23,-21,-20,-20,-21,-22,-24,-26,-28,-30,-31,-32,-31,-30,-29,-27,-24,-21,-17,-13,-9,-6,-3,-1,0,2,4,5,7,9,10,12,13,14,15,16,16}, \
{14,14,14,15,15,15,15,15,15,15,14,14,14,14,14,14,13,12,11,9,5,2,-2,-6,-11,-15,-18,-21,-23,-24,-25,-25,-25,-25,-24,-22,-21,-18,-16,-15,-15,-15,-17,-19,-21,-22,-24,-24,-24,-23,-22,-20,-18,-15,-12,-9,-5,-3,-1,0,1,2,4,5,6,8,9,10,11,12,13,14,14}, \
{12,13,13,13,13,13,13,13,13,13,13,13,12,12,12,12,11,10,9,6,3,0,-4,-8,-12,-16,-19,-21,-23,-24,-24,-24,-24,-23,-22,-20,-17,-15,-12,-10,-9,-9,-10,-12,-13,-15,-17,-17,-18,-17,-16,-15,-13,-11,-8,-5,-3,-1,0,1,1,2,3,4,6,7,8,9,10,11,12,12,12}, \
{11,11,11,11,11,12,12,12,12,12,11,11,11,11,11,10,10,9,7,5,2,-1,-5,-9,-13,-17,-20,-22,-23,-23,-23,-23,-22,-20,-18,-16,-14,-11,-9,-6,-5,-4,-5,-6,-8,-9,-11,-12,-12,-12,-12,-11,-9,-8,-6,-3,-1,0,0,1,1,2,3,4,5,6,7,8,9,10,11,11,11}, \
{10,10,10,10,10,10,10,10,10,10,10,10,10,10,9,9,9,7,6,3,0,-3,-6,-10,-14,-17,-20,-21,-22,-22,-22,-21,-19,-17,-15,-13,-10,-8,-6,-4,-2,-2,-2,-2,-4,-5,-7,-8,-8,-9,-8,-8,-7,-5,-4,-2,0,0,1,1,1,2,2,3,4,5,6,7,8,9,10,10,10}, \
{9,9,9,9,9,9,9,10,10,9,9,9,9,9,9,8,8,6,5,2,0,-4,-7,-11,-15,-17,-19,-21,-21,-21,-20,-18,-16,-14,-12,-10,-8,-6,-4,-2,-1,0,0,0,-1,-2,-4,-5,-5,-6,-6,-5,-5,-4,-3,-1,0,0,1,1,1,1,2,3,3,5,6,7,8,8,9,9,9}, \
{9,9,9,9,9,9,9,9,9,9,9,9,8,8,8,8,7,5,4,1,-1,-5,-8,-12,-15,-17,-19,-20,-20,-19,-18,-16,-14,-11,-9,-7,-5,-4,-2,-1,0,0,1,1,0,0,-2,-3,-3,-4,-4,-4,-3,-3,-2,-1,0,0,0,0,0,1,1,2,3,4,5,6,7,8,8,9,9}, \
{9,9,9,8,8,8,9,9,9,9,9,8,8,8,8,7,6,5,3,0,-2,-5,-9,-12,-15,-17,-18,-19,-19,-18,-16,-14,-12,-9,-7,-5,-4,-2,-1,0,0,1,1,1,1,0,0,-1,-2,-2,-3,-3,-2,-2,-1,-1,0,0,0,0,0,0,0,1,2,3,4,5,6,7,8,8,9}, \
{8,8,8,8,8,8,9,9,9,9,9,9,8,8,8,7,6,4,2,0,-3,-6,-9,-12,-15,-17,-18,-18,-17,-16,-14,-12,-10,-8,-6,-4,-2,-1,0,0,1,2,2,2,2,1,0,0,-1,-1,-1,-2,-2,-1,-1,0,0,0,0,0,0,0,0,0,1,2,3,4,5,6,7,8,8}, \
{8,8,8,8,9,9,9,9,9,9,9,9,9,8,8,7,5,3,1,-1,-4,-7,-10,-13,-15,-16,-17,-17,-16,-15,-13,-11,-9,-6,-5,-3,-2,0,0,0,1,2,2,2,2,1,1,0,0,0,-1,-1,-1,-1,-1,0,0,0,0,-1,-1,-1,-1,-1,0,0,1,3,4,5,7,7,8}, \
{8,8,9,9,9,9,10,10,10,10,10,10,10,9,8,7,5,3,0,-2,-5,-8,-11,-13,-15,-16,-16,-16,-15,-13,-12,-10,-8,-6,-4,-2,-1,0,0,1,2,2,3,3,2,2,1,0,0,0,0,0,0,0,0,0,0,-1,-1,-2,-2,-2,-2,-2,-1,0,0,1,3,4,6,7,8}, \
{7,8,9,9,9,10,10,11,11,11,11,11,10,10,9,7,5,3,0,-2,-6,-9,-11,-13,-15,-16,-16,-15,-14,-13,-11,-9,-7,-5,-3,-2,0,0,1,1,2,3,3,3,3,2,2,1,1,0,0,0,0,0,0,0,-1,-1,-2,-3,-3,-4,-4,-4,-3,-2,-1,0,1,3,5,6,7}, \
{6,8,9,9,10,11,11,12,12,12,12,12,11,11,9,7,5,2,0,-3,-7,-10,-12,-14,-15,-16,-15,-15,-13,-12,-10,-8,-7,-5,-3,-1,0,0,1,2,2,3,3,4,3,3,3,2,2,1,1,1,0,0,0,0,-1,-2,-3,-4,-4,-5,-5,-5,-5,-4,-2,-1,0,2,3,5,6}, \
{6,7,8,10,11,12,12,13,13,14,14,13,13,11,10,8,5,2,0,-4,-8,-11,-13,-15,-16,-16,-16,-15,-13,-12,-10,-8,-6,-5,-3,-1,0,0,1,2,3,3,4,4,4,4,4,3,3,3,2,2,1,1,0,0,-1,-2,-3,-5,-6,-7,-7,-7,-6,-5,-4,-3,-1,0,2,4,6}, \
{5,7,8,10,11,12,13,14,15,15,15,14,14,12,11,8,5,2,-1,-5,-9,-12,-14,-16,-17,-17,-16,-15,-14,-12,-11,-9,-7,-5,-3,-1,0,0,1,2,3,4,4,5,5,5,5,5,5,4,4,3,3,2,1,0,-1,-2,-4,-6,-7,-8,-8,-8,-8,-7,-6,-4,-2,0,1,3,5}, \
{4,6,8,10,12,13,14,15,16,16,16,16,15,13,11,9,5,2,-2,-6,-10,-13,-16,-17,-18,-18,-17,-16,-15,-13,-11,-9,-7,-5,-4,-2,0,0,1,3,3,4,5,6,6,7,7,7,7,7,6,5,4,3,2,0,-1,-3,-5,-7,-8,-9,-10,-10,-10,-9,-7,-5,-4,-1,0,2,4}, \
{4,6,8,10,12,14,15,16,17,18,18,17,16,15,12,9,5,1,-3,-8,-12,-15,-18,-19,-20,-20,-19,-18,-16,-15,-13,-11,-8,-6,-4,-2,-1,0,1,3,4,5,6,7,8,9,9,9,9,9,9,8,7,5,3,1,-1,-3,-6,-8,-10,-11,-12,-12,-11,-10,-9,-7,-5,-2,0,1,4}, \
{4,6,8,11,13,15,16,18,19,19,19,19,18,16,13,10,5,0,-5,-10,-15,-18,-21,-22,-23,-22,-22,-20,-18,-17,-14,-12,-10,-8,-5,-3,-1,0,1,3,5,6,8,9,10,11,12,12,13,12,12,11,9,7,5,2,0,-3,-6,-9,-11,-12,-13,-13,-12,-11,-10,-8,-6,-3,-1,1,4}, \
{3,6,9,11,14,16,17,19,20,21,21,21,19,17,14,10,4,-1,-8,-14,-19,-22,-25,-26,-26,-26,-25,-23,-21,-19,-17,-14,-12,-9,-7,-4,-2,0,1,3,5,7,9,11,13,14,15,16,16,16,16,15,13,10,7,4,0,-3,-7,-10,-12,-14,-15,-14,-14,-12,-11,-9,-6,-4,-1,1,3}, \
{4,6,9,12,14,17,19,21,22,23,23,23,21,19,15,9,2,-5,-13,-20,-25,-28,-30,-31,-31,-30,-29,-27,-25,-22,-20,-17,-14,-11,-9,-6,-3,0,1,4,6,9,11,13,15,17,19,20,21,21,21,20,18,15,11,6,2,-2,-7,-11,-13,-15,-16,-16,-15,-13,-11,-9,-7,-4,-1,1,4}, \
{4,7,10,13,15,18,20,22,24,25,25,25,23,20,15,7,-2,-12,-22,-29,-34,-37,-38,-38,-37,-36,-34,-31,-29,-26,-23,-20,-17,-13,-10,-7,-4,-1,2,5,8,11,13,16,18,21,23,24,26,26,26,26,24,21,17,12,5,0,-6,-10,-14,-16,-16,-16,-15,-14,-12,-10,-7,-4,-1,1,4}, \
{4,7,10,13,16,19,22,24,26,27,27,26,24,19,11,-1,-15,-28,-37,-43,-46,-47,-47,-45,-44,-41,-39,-36,-32,-29,-26,-22,-19,-15,-11,-8,-4,-1,2,5,9,12,15,19,22,24,27,29,31,33,33,33,32,30,26,21,14,6,0,-6,-11,-14,-15,-16,-15,-14,-12,-9,-7,-4,-1,1,4}, \
{6,9,12,15,18,21,23,25,27,28,27,24,17,4,-14,-34,-49,-56,-60,-60,-60,-58,-56,-53,-50,-47,-43,-40,-36,-32,-28,-25,-21,-17,-13,-9,-5,-1,2,6,10,14,17,21,24,28,31,34,37,39,41,42,43,43,41,38,33,25,17,8,0,-4,-8,-10,-10,-10,-8,-7,-4,-2,0,3,6}, \
{22,24,26,28,30,32,33,31,23,-18,-81,-96,-99,-98,-95,-93,-89,-86,-82,-78,-74,-70,-66,-62,-57,-53,-49,-44,-40,-36,-32,-27,-23,-19,-14,-10,-6,-1,2,6,10,15,19,23,27,31,35,38,42,45,49,52,55,57,60,61,63,63,62,61,57,53,47,40,33,28,23,21,19,19,19,20,22}, \
{168,173,178,176,171,166,161,156,151,146,141,136,131,126,121,116,111,106,101,-96,-91,-86,-81,-76,-71,-66,-61,-56,-51,-46,-41,-36,-31,-26,-21,-16,-11,-6,-1,3,8,13,18,23,28,33,38,43,48,53,58,63,68,73,78,83,88,93,98,103,108,113,118,123,128,133,138,143,148,153,158,163,168} };
void visual(Arr arr) {
int row;
int col;
for (row=0; row<37; ++row) {
for (col=0; col<73; ++col)
printf("%3d",arr[row][col]);
printf("\n");
}
}
void visualFlat(ArrFlat arr) {
int cell;
for (cell=0; cell<37*73; ++cell) {
printf("%3d",arr[cell]);
}
printf("\n");
}
typedef struct {
int16_t absolute:9;
int16_t adiff:7;
int16_t diff:7;
unsigned short diff2_length:9;
} __attribute__((packed)) Header;
typedef union {
struct {
int16_t diff2_a:2;
int16_t diff2_b:2;
int16_t diff2_c:2;
int16_t diff2_d:2;
} __attribute__((packed));
unsigned char all;
} Chunk;
int16_t chunkGet(Chunk k, int16_t offset) {
switch (offset) {
case 0 : return k.diff2_a;
case 1 : return k.diff2_b;
case 2 : return k.diff2_c;
case 3 : return k.diff2_d;
}
return 0; //not reached for offsets 0..3; avoids falling off the end of a non-void function
}
void chunkSet(Chunk *k, int16_t offset, int16_t value) {
switch (offset) {
case 0 : k->diff2_a=value; break;
case 1 : k->diff2_b=value; break;
case 2 : k->diff2_c=value; break;
case 3 : k->diff2_d=value; break;
default: printf("Invalid offset %hd\n", offset);
}
}
unsigned char data[1049];
void compress (ArrFlat src) {
Chunk diffData;
int16_t headerIdx=0;
int16_t diffIdx;
int16_t currentDiffValue;
int16_t length=-3;
int16_t shift=0;
Header h;
int16_t position=0;
while (position<37*73) {
if (length==-3) { //encode the absolute value
h.absolute=currentDiffValue=src[position];
++position;
++length;
continue;
}
if (length==-2) { //encode the first diff value
h.adiff=currentDiffValue=src[position]-src[position-1];
if (currentDiffValue<-64 || currentDiffValue>+63)
printf("\nDIFF TOO BIG\n");
++position;
++length;
continue;
}
if (length==-1) { //encode the second diff value
h.diff=currentDiffValue=src[position]-src[position-1];
if (currentDiffValue<-64 || currentDiffValue>+63)
printf("\nDIFF TOO BIG\n");
++position;
++length;
diffData.all=0;
diffIdx=headerIdx+sizeof(Header);
shift=0;
continue;
}
//compute the diff2
int16_t diff=src[position]-src[position-1];
int16_t diff2=diff-currentDiffValue;
if (diff2>1 || diff2<-2) { //big change - restart with header
if (length>511)
printf("\nLENGTH TOO LONG\n");
if (shift!=0) { //store partial byte
data[diffIdx]=diffData.all;
diffData.all=0;
++diffIdx;
}
h.diff2_length=length;
memcpy(data+headerIdx,&h,sizeof(Header));
headerIdx=diffIdx;
length=-3;
continue;
}
chunkSet(&diffData,shift,diff2);
shift+=1;
currentDiffValue=diff;
++position;
++length;
if (shift==4) {
data[diffIdx]=diffData.all;
diffData.all=0;
++diffIdx;
shift=0;
}
}
if (shift!=0) { //finalize
data[diffIdx]=diffData.all;
++diffIdx;
}
h.diff2_length=length;
memcpy(data+headerIdx,&h,sizeof(Header));
headerIdx=diffIdx;
printf("Ending byte=%hd\n",headerIdx);
}
int16_t get(int row, int col) {
int idx=row*73+col;
int dataIdx=0;
int pos=0;
int16_t absolute;
int16_t diff;
Header h;
while (1) {
memcpy(&h, data+dataIdx, sizeof(Header));
if (idx==pos) return h.absolute;
absolute=h.absolute+h.adiff;
if (idx==pos+1) return absolute;
diff=h.diff;
absolute+=diff;
if (idx==pos+2) return absolute;
dataIdx+=sizeof(Header);
pos+=3;
if (pos+h.diff2_length <= idx) {
pos+=h.diff2_length;
dataIdx+=(h.diff2_length+3)/4;
} else break;
}
int shift=4;
Chunk diffData;
while (pos<=idx) {
if (shift==4) {
diffData.all=data[dataIdx];
++dataIdx;
shift=0;
}
diff+=chunkGet(diffData,shift);
absolute+=diff;
++shift;
++pos;
}
return absolute;
}
int main() {
printf("Input:\n");
visual(input);
int row;
int col;
ArrPtr flatInput=(ArrPtr)input;
printf("sizeof(Header)=%zu\n",sizeof(Header)); //%zu is the portable format for size_t
printf("sizeof(Chunk)=%zu\n",sizeof(Chunk));
compress(flatInput);
ArrFlat re;
for (row=0; row<37; ++row)
for (col=0; col<73; ++col) {
int cell=row*73+col;
re[cell]=get(row,col);
if (re[cell]!=flatInput[cell])
printf("ERROR DETECTED IN CELL %d\n",cell);
}
visual(re);
return 0;
}
A Visual Studio version (compiled with VS 2010):
#include <stdlib.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
typedef int16_t Arr[37][73];
typedef int16_t ArrFlat[37*73];
typedef int16_t* ArrPtr;
Arr input = { [... your array as above ...] };
void visual(Arr arr) {
int row;
int col;
for (row=0; row<37; ++row) {
for (col=0; col<73; ++col)
printf("%3d",arr[row][col]);
printf("\n");
}
}
void visualFlat(ArrFlat arr) {
int cell;
for (cell=0; cell<37*73; ++cell) {
printf("%3d",arr[cell]);
}
printf("\n");
}
#pragma pack(1)
typedef struct {
int16_t absolute:9;
int16_t adiff:7;
int16_t diff:7;
unsigned short diff2_length:9;
} Header;
#pragma pack(1)
typedef union {
struct {
char diff2_a:2;
char diff2_b:2;
char diff2_c:2;
char diff2_d:2;
};
unsigned char all;
} Chunk;
int16_t chunkGet(Chunk k, int16_t offset) {
switch (offset) {
case 0 : return k.diff2_a;
case 1 : return k.diff2_b;
case 2 : return k.diff2_c;
case 3 : return k.diff2_d;
}
return 0; //not reached for offsets 0..3; avoids falling off the end of a non-void function
}
void chunkSet(Chunk *k, int16_t offset, int16_t value) {
switch (offset) {
case 0 : k->diff2_a=value; break;
case 1 : k->diff2_b=value; break;
case 2 : k->diff2_c=value; break;
case 3 : k->diff2_d=value; break;
default: printf("Invalid offset %hd\n", offset);
}
}
unsigned char data[1049];
void compress (ArrFlat src) {
Chunk diffData;
int16_t headerIdx=0;
int16_t diffIdx;
int16_t currentDiffValue;
int16_t length=-3;
int16_t shift=0;
int16_t diff;
int16_t diff2;
Header h;
int16_t position=0;
while (position<37*73) {
if (length==-3) { //encode the absolute value
h.absolute=currentDiffValue=src[position];
++position;
++length;
continue;
}
if (length==-2) { //encode the first diff value
h.adiff=currentDiffValue=src[position]-src[position-1];
if (currentDiffValue<-64 || currentDiffValue>+63)
printf("\nDIFF TOO BIG\n");
++position;
++length;
continue;
}
if (length==-1) { //encode the second diff value
h.diff=currentDiffValue=src[position]-src[position-1];
if (currentDiffValue<-64 || currentDiffValue>+63)
printf("\nDIFF TOO BIG\n");
++position;
++length;
diffData.all=0;
diffIdx=headerIdx+sizeof(Header);
shift=0;
continue;
}
//compute the diff2
diff=src[position]-src[position-1];
diff2=diff-currentDiffValue;
if (diff2>1 || diff2<-2) { //big change - restart with header
if (length>511)
printf("\nLENGTH TOO LONG\n");
if (shift!=0) { //store partial byte
data[diffIdx]=diffData.all;
diffData.all=0;
++diffIdx;
}
h.diff2_length=length;
memcpy(data+headerIdx,&h,sizeof(Header));
headerIdx=diffIdx;
length=-3;
continue;
}
chunkSet(&diffData,shift,diff2);
shift+=1;
currentDiffValue=diff;
++position;
++length;
if (shift==4) {
data[diffIdx]=diffData.all;
diffData.all=0;
++diffIdx;
shift=0;
}
}
if (shift!=0) { //finalize
data[diffIdx]=diffData.all;
++diffIdx;
}
h.diff2_length=length;
memcpy(data+headerIdx,&h,sizeof(Header));
headerIdx=diffIdx;
printf("Ending byte=%hd\n",headerIdx);
}
int16_t get(int row, int col) {
int idx=row*73+col;
int dataIdx=0;
int pos=0;
int16_t absolute;
int16_t diff;
int shift;
Header h;
Chunk diffData;
while (1) {
memcpy(&h, data+dataIdx, sizeof(Header));
if (idx==pos) return h.absolute;
absolute=h.absolute+h.adiff;
if (idx==pos+1) return absolute;
diff=h.diff;
absolute+=diff;
if (idx==pos+2) return absolute;
dataIdx+=sizeof(Header);
pos+=3;
if (pos+h.diff2_length <= idx) {
pos+=h.diff2_length;
dataIdx+=(h.diff2_length+3)/4;
} else break;
}
shift=4;
while (pos<=idx) {
if (shift==4) {
diffData.all=data[dataIdx];
++dataIdx;
shift=0;
}
diff+=chunkGet(diffData,shift);
absolute+=diff;
++shift;
++pos;
}
return absolute;
}
int main() {
int row;
int col;
ArrPtr flatInput=(ArrPtr)input;
ArrFlat re;
printf("Input:\n");
visual(input);
printf("sizeof(Header)=%lu\n",(unsigned long)sizeof(Header)); //cast so the argument matches %lu
printf("sizeof(Chunk)=%lu\n",(unsigned long)sizeof(Chunk));
compress(flatInput);
for (row=0; row<37; ++row)
for (col=0; col<73; ++col) {
int cell=row*73+col;
re[cell]=get(row,col);
if (re[cell]!=flatInput[cell])
printf("ERROR DETECTED IN CELL %d\n",cell);
}
visual(re);
return 0;
}
726 bytes
This algorithm encodes the difference between the actual value and the value predicted by linear extrapolation from the previous values. In other words, it uses a first-order Taylor series or, as CygnusX1 calls it, delta-of-delta.
After this extrapolation encoding, most values are in the range [-1 .. 1], which makes the data a good fit for arithmetic coding or range encoding. I've implemented the arithmetic coder by Arturo San Emeterio Campos. An algorithm for a range coder by the same author is also available.
Small values in the range [-2 .. 2] are compressed by the arithmetic coder, while larger values are packed in 4-bit nibbles.
Several optimizations are also used to pack the data a little tighter:
all values are compressed into one continuous stream
the last column is not encoded at all, since it is equal to the first one
while encoding the first column, the history is updated only partially to improve results for the second column
the few cases where the value jumps from -100 to 100 are handled separately
This algorithm is slow: it uses up to 8000 32-bit integer divisions and lots of bit manipulation to extract a single value. But it packs the data into a 726-byte array, and the code size is not very big.
Speed may be improved (to ~2800 32-bit integer divisions) if the frequency table is properly scaled. Using range encoding instead of arithmetic coding may also give some speed improvement. Space may be reduced if both the arithmetic coder data and the nibbles are packed in byte arrays instead of uint16 arrays (saving 2 bytes) and if up to two leading zero bytes are aliased with the end of some other data structure (saving 1..2 bytes). Using a second-order Taylor series did not gain any space, but other methods of extrapolation might give some improvement.
A full implementation can be found here: encoder, decoder and a test. Tested on GCC.
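To illustrate just the linear-extrapolation step described above (not the arithmetic coder, and not the linked implementation), here is a minimal sketch. The function names `extrapolation_residuals` and `extrapolation_restore` are made up for this example, and it assumes the first value is stored as-is and the second as a plain delta.

```c
#include <stddef.h>
#include <stdint.h>

/* Residual of a linear (first-order) extrapolation:
   predicted[i] = v[i-1] + (v[i-1] - v[i-2]) = 2*v[i-1] - v[i-2]
   residual[i]  = v[i] - predicted[i]
   Index 0 keeps the raw value, index 1 keeps a plain delta. */
void extrapolation_residuals(const int16_t *v, size_t n, int16_t *out)
{
    for (size_t i = 0; i < n; ++i) {
        if (i == 0)
            out[i] = v[0];
        else if (i == 1)
            out[i] = (int16_t)(v[1] - v[0]);
        else
            out[i] = (int16_t)(v[i] - (2 * v[i - 1] - v[i - 2]));
    }
}

/* Inverse transform: rebuild the original values from the residuals. */
void extrapolation_restore(const int16_t *res, size_t n, int16_t *v)
{
    for (size_t i = 0; i < n; ++i) {
        if (i == 0)
            v[i] = res[0];
        else if (i == 1)
            v[i] = (int16_t)(v[0] + res[1]);
        else
            v[i] = (int16_t)(2 * v[i - 1] - v[i - 2] + res[i]);
    }
}
```

For a smooth stretch of the table such as 6,7,8,10,11,12,12,13,13,14, every residual after the first two entries lands in [-1 .. 1], which is what makes the entropy coding step worthwhile.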
There is another possibility:
Have two arrays: one main and one overflow.
Every element of the main array contains the 7 bits of actual data + 1 "status" bit.
If the status bit is reset, the value fits into the remaining 7 bits.
If the status bit is set, part of the value is still in these 7 bits, but the remaining bits are contained in the overflow array.
The index in the overflow array is found by counting all the preceding elements in the main array that have their status bit set.
This has the following advantages:
Very fast lookup of values that fit into 7 bits.
Can handle values of unlimited range (either by using suitably large elements in the overflow array, or by repeating the algorithm and stacking another overflow array on top etc...).
On the other hand, if you know the values will always fit into 9 bits, use 2-bit elements in the overflow array to save additional space (some bit-twiddling is required, but it can be done).
For some distributions of data (when most values fit into 7 bits), it may use less space than just using 9-bit elements, either in a single array or in an 8-bit array + 1-bit array.
Fairly simple to implement, so the code size won't eat away the savings made on the data.
Disadvantages:
Slow lookup of values that don't fit into 7 bits. Access to such a value requires linearly traversing all the elements left of it in the main array (and examining their status bits) to determine the index in the overflow array.
For some other distributions of data (when there are many values that don't fit into 7 bits), it may use more space than the 9-bit approach.
Not as simple as the 8-bit array + 1-bit array approach, so while still not very large, the code will be somewhat larger than that.
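A rough sketch of the main + overflow idea, to make the lookup concrete. The names `mo_encode` and `mo_get` and the exact bit layout are assumptions for this example: bit 7 is the status bit, bits 0..6 hold the low 7 bits of the value, and the overflow array holds the remaining high part as an int8_t.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode n values. Values in -64..63 fit in the main array alone.
   Returns the number of overflow entries used. */
size_t mo_encode(const int16_t *src, size_t n,
                 uint8_t *main_arr, int8_t *overflow)
{
    size_t used = 0;
    for (size_t i = 0; i < n; ++i) {
        int v = src[i];
        int low = v & 0x7F;                 /* low 7 bits, 0..127 */
        if (v >= -64 && v <= 63) {
            main_arr[i] = (uint8_t)low;     /* status bit clear */
        } else {
            main_arr[i] = (uint8_t)(low | 0x80);
            /* v - low is an exact multiple of 128 */
            overflow[used++] = (int8_t)((v - low) / 128);
        }
    }
    return used;
}

/* Look up element idx. Fast when the status bit is clear; otherwise
   counts the set status bits to the left to find the overflow slot. */
int16_t mo_get(const uint8_t *main_arr, const int8_t *overflow, size_t idx)
{
    uint8_t cell = main_arr[idx];
    int low = cell & 0x7F;
    if (!(cell & 0x80))                     /* sign-extend the 7-bit value */
        return (int16_t)((low & 0x40) ? low - 128 : low);

    size_t k = 0;
    for (size_t i = 0; i < idx; ++i)
        k += main_arr[i] >> 7;
    return (int16_t)(overflow[k] * 128 + low);
}
```

For the 2D table, the lookup index would be row*73+col. With the arrays in ROM, `mo_encode` would only run offline to generate the initializers.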
Don't forget to check the size of the compiled code if the sum of code+data sizes is important. Here is an example that uses normal 8-bit encoding for the data (50% gain) and optimizes for code size.
We'll store 8-bit values for each row:
unsigned char *row_data = &compressed_data[row*73]; /* pointer to the start of the row */
int value = row_data[column];
For the first rows, break them in two. The first value will be encoded directly. The next part will use a negative delta from the first value. The second part will be encoded as a positive delta from 100.
if (row <= 4) {
char brk = break_point[row]; /* "break" is a reserved word in C */
if (column >= brk) return 100 + value;
if (column == 0) return value;
return row_data[0] - value;
}
The break_point would be the position of the 104, 101, 100, 103, 110 in the first five rows. I haven't checked if it can be computed rather than stored. Is it perhaps 51+row?
After the 5th row the values become smoother, so we can just store them in 8-bit two's complement. The exception is the last row.
if (row != 36) return (signed char) value;
The last row can be encoded like this, without any data (which saves 73 bytes):
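int x = 168 + 5*column; /* columns 0..2: 168, 173, 178 */
if (x <= 178) return x;
value = 359 - x; /* 359 = 176 + 183; columns 3..18: 176 down to 101 */
if (value >= 101) return value;
value = -value; /* columns 19..38: -96 up to -1 */
if (value > 0) value--; /* columns 39..72: 3 up to 168 */
return value;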
This would require about 2640 bytes, but it would be very fast and compact to access.
The first row could be encoded similarly to the last (with an increment at -5, a sign change at -104, and a 359-x "flip" at 184), saving 70 bytes of data at some cost in code size.
If the duplicates are contiguous and you have extra CPU, you could use run-length encoding.
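As a minimal sketch of run-length encoding with random access (the `Run` struct and the names `rle_encode` and `rle_get` are made up for this example): each run stores a value and a repeat count, and a lookup walks the runs until the accumulated count passes the requested index. Whether this saves anything depends on how long the runs actually are. Struct padding may also make each `Run` 4 bytes, so two parallel arrays could be tighter.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    int16_t value;   /* the repeated value */
    uint8_t count;   /* how many times it repeats, 1..255 */
} Run;

/* Encode n values into runs; returns the number of runs written.
   The caller must provide room for up to n runs. */
size_t rle_encode(const int16_t *src, size_t n, Run *runs)
{
    size_t used = 0;
    for (size_t i = 0; i < n; ++i) {
        if (used > 0 && runs[used - 1].value == src[i]
                     && runs[used - 1].count < 255) {
            runs[used - 1].count++;
        } else {
            runs[used].value = src[i];
            runs[used].count = 1;
            used++;
        }
    }
    return used;
}

/* Return the value at flat index idx (row*73+col for the 2D table). */
int16_t rle_get(const Run *runs, size_t nruns, size_t idx)
{
    size_t pos = 0;
    for (size_t r = 0; r < nruns; ++r) {
        pos += runs[r].count;
        if (idx < pos)
            return runs[r].value;
    }
    return 0; /* idx out of range */
}
```

The lookup is linear in the number of runs. Storing the run index at which each row starts would bound the scan to a single row.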
The dataset, sadly, looks too dense for a DFA... but you can totally get one working. It'll require preprocessing and be super fast. The assembly might exceed the 4K dataset, so it may not be an option.
Assuming your 16-bit values are infrequent, a hash might work for the extra large entries (see: google sparsehash)... there's a 1-bit+ overhead per entity.
You can also use 9-bit values and manage your memory byte boundaries manually, which is the same overhead as a separate bit array... maybe more.
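A sketch of the 9-bit packing, assuming every value fits in the signed 9-bit range -256..255. The names `pack9_set` and `pack9_get` and the LSB-first bit order are choices made for this example. The 37*73 table would then take 3039 bytes instead of 5402.

```c
#include <stddef.h>
#include <stdint.h>

/* Store v (range -256..255) as the idx-th 9-bit field of buf.
   buf must hold at least (count*9 + 7) / 8 bytes. */
void pack9_set(uint8_t *buf, size_t idx, int16_t v)
{
    size_t bitpos = idx * 9;
    size_t byte = bitpos / 8;
    unsigned shift = (unsigned)(bitpos % 8);      /* 0..7 */
    uint32_t bits = (uint32_t)v & 0x1FF;          /* 9-bit two's complement */
    /* A 9-bit field always spans exactly two bytes. */
    uint32_t word = buf[byte] | ((uint32_t)buf[byte + 1] << 8);
    word &= ~((uint32_t)0x1FF << shift);
    word |= bits << shift;
    buf[byte] = (uint8_t)(word & 0xFF);
    buf[byte + 1] = (uint8_t)((word >> 8) & 0xFF);
}

/* Read the idx-th 9-bit field and sign-extend it. */
int16_t pack9_get(const uint8_t *buf, size_t idx)
{
    size_t bitpos = idx * 9;
    size_t byte = bitpos / 8;
    unsigned shift = (unsigned)(bitpos % 8);
    uint32_t word = buf[byte] | ((uint32_t)buf[byte + 1] << 8);
    int v = (int)((word >> shift) & 0x1FF);
    if (v & 0x100)
        v -= 512;                                 /* negative value */
    return (int16_t)v;
}
```

As before, the lookup for the 2D table would be pack9_get(buf, row*73+col). Only `pack9_get` needs to ship on the target if the packed array is generated offline.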