Creating a 6 bit crc using boost - c++

I'm new to CRCs and Boost, and more of a Java developer for that matter. I'm trying to use the Boost crc.hpp library to create a 6-bit CRC calculated from only two bits. First, is this possible?
It seems that the Theoretical CRC Computer can be used to process a specific number of bits; however, I'm unclear how to specify a 6-bit result. Help please.

Assuming your input is based on 2 actual bits and not two bytes, this should work:
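#include <boost/crc.hpp>
#include <cstdio>
// The first constructor argument of crc_basic is the truncated polynomial;
// the initial remainder is the optional second argument (default 0).
const unsigned char truncated_polynomial = 0x03; // example value only: x^6 + x + 1
unsigned char input = 0x3;                        // the two input bits, in the low-order bits
boost::crc_basic<6> checksum(truncated_polynomial);
checksum.process_bits(input, 2);                  // process only the 2 low-order bits
printf("%u", static_cast<unsigned>(checksum.checksum()));
You still need to choose the polynomial (and the initial remainder, if you want a nonzero one), though.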
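To see what such a computation does without Boost, here is a minimal hand-rolled sketch of a bit-serial 6-bit CRC. The function name crc6_bits, the default polynomial 0x03 (x^6 + x + 1) and the zero initial remainder are assumptions for illustration, not something the question specifies. It processes bits MSB-first with no reflection and no final XOR; whether it matches a particular crc_basic<6> configuration bit for bit has not been checked here.

```cpp
#include <cstddef>
#include <cstdint>

// Bit-serial CRC with a 6-bit remainder: MSB-first, no reflection, no final XOR.
// `poly` is the truncated polynomial (the x^6 term is implicit).
// ASSUMPTION: 0x03 (x^6 + x + 1) and init = 0 are illustrative defaults only.
std::uint8_t crc6_bits(std::uint8_t bits, std::size_t bit_count,
                       std::uint8_t poly = 0x03, std::uint8_t init = 0x00) {
    std::uint8_t rem = init & 0x3F;
    for (std::size_t i = bit_count; i-- > 0;) {
        const unsigned in_bit = (bits >> i) & 1u;          // next input bit, MSB-first
        const unsigned top = ((rem >> 5) & 1u) ^ in_bit;   // bit leaving the register, XOR input
        rem = static_cast<std::uint8_t>((rem << 1) & 0x3F);
        if (top) rem = static_cast<std::uint8_t>(rem ^ (poly & 0x3F));
    }
    return rem;
}
```

With these assumed parameters, crc6_bits(0x3, 2) processes just the two low-order bits of the input, much like process_bits(input, 2) in the answer.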

This should just be a custom code that maximizes the Hamming distance between four byte values (the two input bits plus six check bits). It would be a table of four 8-bit values indexed by the two bits as a number in 0..3.
A set of values (there are 280 such sets) that maximizes the minimum Hamming distance between any two of the four values is: 0x00, 0x4f, 0xb3, 0xfc. The minimum Hamming distance is 5. The high two bits of each value are the two-bit index, in order.
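The claimed minimum distance is easy to check. The sketch below uses the four table values from the answer; the names kCodeTable, hamming_distance and min_hamming_distance are made up for this example.

```cpp
#include <algorithm>
#include <bitset>
#include <cstddef>
#include <cstdint>

// The four code words from the answer, indexed by the two input bits (0..3).
const std::uint8_t kCodeTable[4] = {0x00, 0x4f, 0xb3, 0xfc};

// Number of bit positions in which a and b differ.
int hamming_distance(std::uint8_t a, std::uint8_t b) {
    return static_cast<int>(std::bitset<8>(static_cast<unsigned>(a ^ b)).count());
}

// Smallest pairwise distance over the whole table.
int min_hamming_distance() {
    int best = 8;
    for (std::size_t i = 0; i < 4; ++i)
        for (std::size_t j = i + 1; j < 4; ++j)
            best = std::min(best, hamming_distance(kCodeTable[i], kCodeTable[j]));
    return best;
}
```

Encoding is then a plain table lookup, kCodeTable[two_bit_value], and a receiver could decode by picking the table entry closest in Hamming distance to the received byte.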

Related

compress 4 byte floating point data to 1 byte

I need to compress floating point numbers (4 bytes) to 1 byte (0 to 0xFF) to send to another device. The floating point numbers range from -100000.0 to 100000.0.
The other device will decode from 1 byte back to floating point numbers. How do I do it with minimum data loss?
Thanks, JC
One solution is to use quantization. Divide 100000 into 127 intervals. Send the number of the interval the float falls in (7 bits), with the sign in the lowest or highest bit.
In your case the interval width is 100000 / 127 ≈ 787.4.
For example, for an input of 100, send 1; for an input of 1000.147732, send 2.
On the device you can restore the number from its interval.
The easiest solution is to restore the number as the middle of the interval. For example, every float that belongs to the first interval will be restored as 393.7.
If you have some statistics on the distribution of the values and it's not uniform, you can exploit that by varying the interval lengths and quantizing frequent values more precisely.
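A minimal sketch of the uniform version of this scheme, assuming the sign goes in the highest bit and intervals are numbered from 1 as in the example above. The names encode_float and decode_float are made up for illustration, and treating code 0 as zero is my own assumption.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

const float kMaxAbs = 100000.0f;            // input range is -kMaxAbs..+kMaxAbs
const int kIntervals = 127;                 // 7 bits of magnitude, numbered 1..127
const float kWidth = kMaxAbs / kIntervals;  // about 787.4

// Sign in the highest bit, 1-based interval number in the low 7 bits.
std::uint8_t encode_float(float value) {
    const float magnitude = std::min(std::fabs(value), kMaxAbs);
    int interval = static_cast<int>(magnitude / kWidth) + 1;
    interval = std::min(interval, kIntervals);  // the maximum lands in the last interval
    const int sign = value < 0.0f ? 0x80 : 0x00;
    return static_cast<std::uint8_t>(sign | interval);
}

// Restore the value as the middle of its interval.
float decode_float(std::uint8_t code) {
    const int interval = code & 0x7F;
    if (interval == 0) return 0.0f;  // ASSUMPTION: never produced by encode_float
    const float magnitude = (interval - 0.5f) * kWidth;
    return (code & 0x80) ? -magnitude : magnitude;
}
```

The worst-case error of this sketch is half an interval, roughly 393.7, which is the price of squeezing a range of 200000 into 255 codes.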

Converting 12 bit color values to 8 bit color values C++

I'm attempting to convert 12-bit RGGB color values into 8-bit RGGB color values, but my current method gives strange results.
Logically, I thought that simply dividing the 12-bit RGGB into 8-bit RGGB would work and be pretty simple:
// raw_color_array contains R,G1,G2,B in a bayer pattern with each element
// ranging from 0 to 4095
for (int i = 0; i < array_size; i++)
{
    raw_color_array[i] /= 16; // 4095 becomes 255 and so on
}
However, in practice this actually does not work. Given, for example, a small image with water and a piece of ice in it you can see what actually happens in the conversion (right most image).
Why does this happen, and how can I get the same (or close to the same) image as on the left, but as 8-bit values instead? Thanks!
EDIT: going off of @MSalters' answer, I get a better quality image, but the colors are still drastically skewed. What resources can I look into for converting 12-bit data to 8-bit data without a steep loss in quality?
It appears that your raw 12-bit data isn't on a linear scale. That is quite common for images. For a non-linear scale, you can't use a linear transformation like dividing by 16.
A non-linear transform like sqrt(x*16) would also give you an 8-bit value. So would std::pow(x, 8.0/12.0), which maps the 12-bit range 0..4095 onto roughly 0..256.
A known problem with low-gradient images is banding. If your image has an area where the original value varies from, say, 100 to 200, the 12-to-8-bit reduction shrinks that to far fewer than 100 distinct values (only about 7 with a plain division by 16). You get rounding, and with naive (local) rounding you get bands. Linear or non-linear, there will then be some inputs x that all map to y, and some that map to y+1. This can be mitigated by doing the transformation in floating point and then adding a random value between -1.0 and +1.0 before rounding. This effectively breaks up the band structure.
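A rough sketch of both ideas, with made-up function names (clamp_to_u8, to8_sqrt, to8_pow, dither_to_u8). Which curve actually suits the data depends on how the sensor encodes its values, which is not known here, so treat the two curves as examples rather than the right answer. Results are clamped because both curves can round up to 256 at the top of the range.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <random>

// Round to nearest and clamp into the 8-bit range.
std::uint8_t clamp_to_u8(double v) {
    return static_cast<std::uint8_t>(std::min(255.0, std::max(0.0, std::round(v))));
}

// Square-root curve: sqrt(x * 16) maps 0..4095 to about 0..256.
std::uint8_t to8_sqrt(std::uint16_t x) {
    return clamp_to_u8(std::sqrt(x * 16.0));
}

// Power curve: x^(8/12) also maps 0..4095 to about 0..256.
std::uint8_t to8_pow(std::uint16_t x) {
    return clamp_to_u8(std::pow(static_cast<double>(x), 8.0 / 12.0));
}

// Dithered rounding: add noise in [-1, 1) before rounding to break up bands.
template <typename Rng>
std::uint8_t dither_to_u8(double v, Rng& rng) {
    std::uniform_real_distribution<double> noise(-1.0, 1.0);
    return clamp_to_u8(v + noise(rng));
}
```

To dither a linear conversion, for instance, one could call dither_to_u8(x / 16.0, rng) with a std::mt19937 generator instead of using the integer division.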
After you clarified that this 12-bit data is only for one color, here is my simple answer:
Since you are converting each value to its 8-bit equivalent, you necessarily lose some of the data (4 bits). This is why you are not getting the same output.
After clarification:
If you want to retain the actual colour values, apply de-mosaicking to the 12-bit image first and then scale the resulting data to 8 bits. The colour loss due to de-mosaicking will then be smaller than with the previous approach.
You say that your 12 bits represent 2^12 levels of one colour. That is incorrect. There are reds, greens and blues in your image. Look at the histogram. I made this with ImageMagick at the command line:
convert cells.jpg histogram:png:h.png
If you want 8 bits per pixel, rather than trying to blindly/statically apportion 3 bits to green, 2 bits to red and 3 bits to blue, you would probably be better off going with an 8-bit palette, so you can have 250+ colours of all variations rather than restricting yourself to just 8 blue shades, 4 reds and 8 greens. So, like this:
convert cells.jpg -colors 254 PNG8:result.png
Here is the result of that beside the original:
The process above is called "quantisation" and if you want to implement it in C/C++, there is a writeup here.

RLE Encoding bit sequence, not bytes

I need to implement a compression algorithm for binary data that needs to work on constrained embedded devices (256 kB ROM, 48 kB RAM).
I'm thinking of using the RLE compression algorithm. Short of implementing it from scratch, I've found a lot of C implementations (for example: http://sourceforge.net/projects/bcl/?source=typ_redirect ), but they apply the RLE algorithm over the byte sequence (the words of the dictionary are 1 to 255, that is, an 8-bit encoding).
I'm looking for an implementation that, starting from a sequence of bytes, applies RLE encoding over the corresponding bit sequence (0s and 1s). Note that another algorithm would also work (I need a compression ratio < 0.9, so I think any algorithm can do it), but the implementation needs to work on a bit basis, not on bytes.
Can anyone help me? Thank you!
I think that you can encode bytes such as 0, 1, 2, 3, 255, etc. (where there are lots of 0 or 1 bits in a row).
Let's encode this bit sequence:
000000011111110
1. Shift in the next bit and increment the counter if it equals the previous bit.
2. If it does not, write control bit 1, then the run length as 3 bits (here 111), then the repeated bit, to the output buffer.
3. If the bits cannot be packed, write a 0 control bit followed by the raw bit.
Output:
111101111100 (that is, 1 111 0 for seven 0s, then 1 111 1 for seven 1s, then 0 0 for the single trailing 0)
To decompress, read the first control bit:
If 0: write the next bit to the output buffer.
If 1: read 3 bits (the length field could be another size) and convert them to a number. The bit after that says which bit to repeat; loop that many times to reproduce the original run.
But this compression method will only work well on files that have lots of 0 and 255 bytes (00000000 and 11111111 in binary), such as BMP files with a black or white background.
Hope I helped you!
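Here is a small sketch of that scheme. To keep it readable, bits are held as a std::string of '0' and '1' characters; a real embedded implementation would pack them into bytes. The function names and the rule that only runs of 3 or more are packed (a run token costs 5 bits, a literal costs 2) are my own assumptions, not part of the description above.

```cpp
#include <cstddef>
#include <string>

const std::size_t kMaxRun = 7;  // largest run that fits the 3-bit length field
const std::size_t kMinRun = 3;  // ASSUMPTION: shorter runs are cheaper as literals

std::string rle_encode_bits(const std::string& bits) {
    std::string out;
    std::size_t i = 0;
    while (i < bits.size()) {
        std::size_t run = 1;
        while (i + run < bits.size() && bits[i + run] == bits[i] && run < kMaxRun) ++run;
        if (run >= kMinRun) {
            out += '1';  // control bit: a run follows
            for (int b = 2; b >= 0; --b) out += ((run >> b) & 1u) ? '1' : '0';
            out += bits[i];  // the bit being repeated
            i += run;
        } else {
            out += '0';  // control bit: a literal bit follows
            out += bits[i];
            i += 1;
        }
    }
    return out;
}

std::string rle_decode_bits(const std::string& code) {
    std::string out;
    std::size_t i = 0;
    while (i < code.size()) {
        if (code[i] == '0') {
            if (i + 1 >= code.size()) break;  // truncated input
            out += code[i + 1];
            i += 2;
        } else {
            if (i + 4 >= code.size()) break;  // truncated input
            std::size_t run = 0;
            for (std::size_t b = 1; b <= 3; ++b)
                run = (run << 1) | (code[i + b] == '1' ? 1u : 0u);
            out.append(run, code[i + 4]);
            i += 5;
        }
    }
    return out;
}
```

Note that data without long runs grows under this scheme, since every literal bit costs two output bits, which is why it only pays off on inputs dominated by runs.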

Deinterleaving PCM (*.wav) stereo audio data

I understand that PCM data is stored as [left][right][left][right].... I am trying to convert stereo PCM to mono Vorbis (*.ogg), which I understand is achievable by averaging the left and right channels ((left+right)*0.5). I have actually achieved this by amending the encoder example in the libvorbis SDK like this,
#define READ 1024
signed char readbuffer[READ*4];
and the PCM data is read thus
fread(readbuffer, 1, READ*4, stdin)
I then halved the two channels,
buffer[0][i] = ((((readbuffer[i*4+1]<<8) | (0x00ff&(int)readbuffer[i*4]))/32768.f) + (((readbuffer[i*4+3]<<8) | (0x00ff&(int)readbuffer[i*4+2]))/32768.f)) * 0.5f;
It worked perfectly, but I don't understand how the left and right channels are deinterleaved from the PCM data (i.e. all the bit shifting and "ANDing" and "ORing").
A .wav file typically stores its PCM data in little endian format, with 16 bits per sample per channel. For the usual signed 16-bit PCM file, this means that the data is physically stored as
[LEFT LSB] [LEFT MSB] [RIGHT LSB] [RIGHT MSB] ...
so that every group of 4 bytes makes up a single stereo PCM sample. Hence, you can find sample i by looking at bytes 4*i through 4*i+3, inclusive.
To decode a single 16-bit value from two bytes, you do this:
(MSB << 8) | LSB
Because your read buffer values are stored as signed chars, you have to be a bit careful because both MSB and LSB will be sign-extended. This is undesirable for the LSB; therefore, the code uses
0xff & (int)LSB
to obtain the unsigned version of the low byte (technically, this works by upcasting to an int, and selecting the low 8 bits; an alternate formulation would be to just write (uint8_t)LSB).
Note that the MSBs are at indices 1 and 3, and the LSBs are at indices 0 and 2. So,
((readbuffer[i*4+1]<<8) | (0x00ff&(int)readbuffer[i*4]))
and
((readbuffer[i*4+3]<<8) | (0x00ff&(int)readbuffer[i*4+2]))
are just obtaining the values of the left and right channels as 16-bit signed values by using some bit manipulation to assemble the bytes into numbers.
Then, each of these values is divided by 32768.0. Note that a signed 16-bit value has a range of [-32768, 32767]. Thus, dividing by 32768 gives a range of approximately [-1, 1]. The two divided values are added to give a number in the range [-2, 2], and then the whole thing is multiplied by 0.5 to obtain the average (a floating-point value in the range [-1, 1]).
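The same steps can be spelled out in separate functions, which may make the one-liner easier to follow. The names decode_s16le and stereo_to_mono are made up for this sketch. It uses msb * 256 instead of msb << 8, which gives the same value but avoids left-shifting a negative number.

```cpp
#include <cstddef>
#include <vector>

// Assemble one signed 16-bit little-endian sample from two signed chars.
// The low byte is reinterpreted as unsigned so that it is not sign-extended.
int decode_s16le(signed char lsb, signed char msb) {
    return static_cast<int>(msb) * 256 + static_cast<unsigned char>(lsb);
}

// Average interleaved stereo frames ([L lsb][L msb][R lsb][R msb]) into mono floats.
std::vector<float> stereo_to_mono(const signed char* data, std::size_t frames) {
    std::vector<float> mono(frames);
    for (std::size_t i = 0; i < frames; ++i) {
        const float left = decode_s16le(data[i * 4], data[i * 4 + 1]) / 32768.f;
        const float right = decode_s16le(data[i * 4 + 2], data[i * 4 + 3]) / 32768.f;
        mono[i] = (left + right) * 0.5f;  // in the range [-1, 1]
    }
    return mono;
}
```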

compact representation and delivery of point data

I have an array of point data; each point is represented as an x coordinate and a y coordinate.
There could be anywhere from 500 up to 2000 points or more.
The data represents a motion path, which could range from simple to very complex and can also have cusps in it.
Can I represent this data as one spline, a collection of splines, or some other format with very tight compression?
I have tried representing the points as a collection of beziers, but at best the result is about 40% of the original size.
For instance, if I have an array of 500 points, that gives me 500 x and 500 y values, so I have 1000 pieces of data.
I get around 100 quadratic beziers from this. Each bezier is represented as controlx, controly, anchorx, anchory,
which gives me 100 x 4 = 400 pieces of data.
So input = 1000 pieces, output = 400 pieces.
I would like to tighten this further. Any suggestions?
By its nature, a spline is an approximation. You can reduce the number of splines you use to reach a higher compression ratio, at the cost of accuracy.
You can also achieve lossless compression by using some kind of encoding scheme. I am just making this up as I type, using the range example from another answer (1000 for x and 400 for y):
Each point only needs 19 bits (10 for x, 9 for y), so you can use 3 bytes to represent a full coordinate.
Use 2 bytes to represent a displacement of up to +/- 63.
Use 1 byte to represent a short displacement of up to +/- 7 for x and +/- 3 for y.
To decode the sequence properly, you would need some prefix to identify the type of encoding. Let's say we use 110 for a full point, 10 for a displacement and 0 for a short displacement.
The bit layout will look like this,
Coordinates:        110xxxxxxxxxxxyyyyyyyyyy
Displacement:       10xxxxxxxyyyyyyy
Short displacement: 0xxxxyyy
Unless your sequence is totally random, you can easily achieve a high compression ratio with this scheme.
Let's see how it works using a short example.
3 points: A(500, 400), B(550, 380), C(545, 381)
Let's say you were using 2 bytes for each coordinate value. It would take 12 bytes (3 points x 2 values x 2 bytes) to encode this without compression.
To encode the sequence using the compression scheme:
A is the first point, so the full coordinate is used: 3 bytes.
B's displacement from A is (50, -20), which can be encoded as a displacement: 2 bytes.
C's displacement from B is (-5, 1), which fits the range of a short displacement: 1 byte.
So you use 6 bytes instead of 12, saving 6 bytes. The real compression ratio depends entirely on the data pattern. It works best on points forming a moving path. If the points are random, only a 25% saving can be achieved (3 bytes per point instead of 4).
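A sketch of how this prefix scheme could be implemented. The names Point, encode_path, decode_path and sign_extend are made up for the example, and storing the displacements as two's-complement fields is my assumption, since the answer does not say how negative values are represented.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

struct Point { int x; int y; };

// Interpret the low `bits` bits of v as a two's-complement signed number.
int sign_extend(unsigned v, int bits) {
    const unsigned sign = 1u << (bits - 1);
    return static_cast<int>(v ^ sign) - static_cast<int>(sign);
}

std::vector<std::uint8_t> encode_path(const std::vector<Point>& pts) {
    std::vector<std::uint8_t> out;
    Point prev = {0, 0};
    for (std::size_t i = 0; i < pts.size(); ++i) {
        const int dx = pts[i].x - prev.x;
        const int dy = pts[i].y - prev.y;
        const unsigned ux = static_cast<unsigned>(dx);
        const unsigned uy = static_cast<unsigned>(dy);
        if (i > 0 && std::abs(dx) <= 7 && std::abs(dy) <= 3) {
            // Short displacement, 1 byte: 0 xxxx yyy
            out.push_back(static_cast<std::uint8_t>(((ux & 0xFu) << 3) | (uy & 0x7u)));
        } else if (i > 0 && std::abs(dx) <= 63 && std::abs(dy) <= 63) {
            // Displacement, 2 bytes: 10 xxxxxxx yyyyyyy
            const unsigned v = (0x2u << 14) | ((ux & 0x7Fu) << 7) | (uy & 0x7Fu);
            out.push_back(static_cast<std::uint8_t>(v >> 8));
            out.push_back(static_cast<std::uint8_t>(v & 0xFFu));
        } else {
            // Full point, 3 bytes: 110 + 11 bits of x + 10 bits of y
            const unsigned v = (0x6u << 21)
                             | ((static_cast<unsigned>(pts[i].x) & 0x7FFu) << 10)
                             | (static_cast<unsigned>(pts[i].y) & 0x3FFu);
            out.push_back(static_cast<std::uint8_t>(v >> 16));
            out.push_back(static_cast<std::uint8_t>((v >> 8) & 0xFFu));
            out.push_back(static_cast<std::uint8_t>(v & 0xFFu));
        }
        prev = pts[i];
    }
    return out;
}

std::vector<Point> decode_path(const std::vector<std::uint8_t>& in) {
    std::vector<Point> pts;
    Point cur = {0, 0};
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned b0 = in[i];
        if ((b0 & 0x80u) == 0) {               // prefix 0: short displacement
            cur.x += sign_extend((b0 >> 3) & 0xFu, 4);
            cur.y += sign_extend(b0 & 0x7u, 3);
            i += 1;
        } else if ((b0 & 0xC0u) == 0x80u) {    // prefix 10: displacement
            if (i + 1 >= in.size()) break;     // truncated input
            const unsigned v = (b0 << 8) | in[i + 1];
            cur.x += sign_extend((v >> 7) & 0x7Fu, 7);
            cur.y += sign_extend(v & 0x7Fu, 7);
            i += 2;
        } else {                               // prefix 110: full point
            if (i + 2 >= in.size()) break;     // truncated input
            const unsigned v = (b0 << 16) | (static_cast<unsigned>(in[i + 1]) << 8) | in[i + 2];
            cur.x = static_cast<int>((v >> 10) & 0x7FFu);
            cur.y = static_cast<int>(v & 0x3FFu);
            i += 3;
        }
        pts.push_back(cur);
    }
    return pts;
}
```

For the three example points A(500, 400), B(550, 380), C(545, 381), this sketch emits 3 + 2 + 1 = 6 bytes and decodes back to the same coordinates.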
If, for example, you use 32-bit integers for point coordinates and there is a range limit, like x: 0..1000, y: 0..400, you can pack (x, y) into a single 32-bit variable.
That way you achieve another 50% compression.
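A minimal sketch of such packing, assuming 16 bits per coordinate (more than enough for the ranges above). The function names are hypothetical.

```cpp
#include <cstdint>

// x goes in the high 16 bits, y in the low 16 bits.
// ASSUMPTION: both values fit in 16 bits, which holds for x: 0..1000, y: 0..400.
std::uint32_t pack_point(std::uint32_t x, std::uint32_t y) {
    return (x << 16) | (y & 0xFFFFu);
}

std::uint32_t unpack_x(std::uint32_t packed) { return packed >> 16; }
std::uint32_t unpack_y(std::uint32_t packed) { return packed & 0xFFFFu; }
```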
You could do a frequency analysis of the numbers you are trying to encode and use varying bit lengths to represent them; of course, what I am vaguely describing here is Huffman coding.
Firstly, keep only as many decimal places in your data as you actually need. Removing the rest reduces your accuracy, but it's a calculated loss. To do that, try converting your number to a string, locating the dot's position, and cutting off the unwanted characters from the end. That could process faster than math, IMO. Lastly, you can convert it back to a number.
150.234636746 -> "150.234636746" -> "150.23" -> 150.23
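For comparison, here is an arithmetic alternative to the string approach. Note that it rounds to the nearest value instead of cutting digits off, so it can differ from truncation in the last kept digit. The name round_to_places is made up.

```cpp
#include <cmath>

// Round value to the given number of decimal places using arithmetic
// (a swapped-in alternative to the string-cutting approach described above).
double round_to_places(double value, int places) {
    double scale = 1.0;
    for (int i = 0; i < places; ++i) scale *= 10.0;
    return std::round(value * scale) / scale;
}
```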
Secondly, try storing your data relative to the last number ("relative values"). Basically, subtract the last number from the current one. Then later, to "decompress" it, you can keep an accumulator variable and add them up.
A    A    A      A    R   R      (A = absolute, R = relative)
150, 200, 250 -> 150, 50, 50
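A short sketch of relative values with an accumulator, using integers for simplicity. The names delta_encode and delta_decode are made up for the example.

```cpp
#include <vector>

// Store the first value as-is and every later value as the difference
// from its predecessor.
std::vector<int> delta_encode(const std::vector<int>& values) {
    std::vector<int> out;
    out.reserve(values.size());
    int previous = 0;
    for (int v : values) {
        out.push_back(v - previous);
        previous = v;
    }
    return out;
}

// Rebuild the absolute values by keeping a running sum (the accumulator).
std::vector<int> delta_decode(const std::vector<int>& deltas) {
    std::vector<int> out;
    out.reserve(deltas.size());
    int accumulator = 0;
    for (int d : deltas) {
        accumulator += d;
        out.push_back(accumulator);
    }
    return out;
}
```

The deltas are not smaller by themselves; the gain comes from storing them in fewer bits than the absolute values would need, or from feeding them to an entropy coder.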