compress 4 byte floating point data to 1 byte - compression

I need to compress floating point numbers (4 bytes) to 1 byte(0 to 0xFF) to send to another device. The floating point numbers range from -100000.0 to 100000.0.
The other device will decode from 1 byte back to floating point numbers. How do it do it with minimum data loss?
Thanks, JC

One solution is to use quantization. Divide 100000 to 127 intervals. Send the interval number to which float belongs to and a sign in lowest or highest bit
In your case the interval = 787,4
For example, you have input like 100. Send 1. Input 1000,147732. Send 2
On the device you can restore number by its interval.
The easiest solution is to restore the number as a middle of the interval. For example, every float that belongs to the first interval will be restored as 393.7
If you have some stats for digits distribution and it's not uniform, you can play around it by changing the intervals length and quantize frequent floats more precisely

Related

Compute FFT in frequency axis when signal is in rawData in Matlab

I have a signal of frequency 10 MHz sampled at 100 MS/sec. How to compute FFT in matlab in terms of frequency when my signal is in rawData (length of this rawData is 100000), also
what should be the optimum length of NFFT.(i.e., on what factor does NFFT depend)
why does my Amplitude (Y axis) change with NFFT
whats difference between NFFT, N and L. How to compute length of a signal
How to separate Noise and signal from a single signal (which is in rawData)
Here is my code,
t=(1:40);
f=10e6;
fs=100e6;
NFFT=1024;
y=abs(rawData(:1000,2));
X=abs(fft(y,NFFT));
f=[-fs/2:fs/NFFT:(fs/2-fs/NFFT)];
subplot(1,1,1);
semilogy(f(513:1024),X(513:1024));
axis([0 10e6 0 10]);
As you can find the corresponding frequencies in another post, I will just answer your other questions:
Including all your data is most of the time the best option. fft just truncates your input data to the requested length, which is probably not what you want. If you known the period of your input single, you can truncate it to include a whole number of periods. If you don't know it, a window (ex. Hanning) may be interesting.
If you change NFFT, you use more data in your fft calculation, which may change the amplitude for a given frequency slightly. You also calculate the amplitude at more frequencies between 0 and Fs/2 (half of the sampling frequency).
Question is not clear, please provide the definition of N and L.
It depends on your application. If the noise is at the same frequency as your signal, you are not able to separate it. Otherwise, you can a filter (ex. bandpass) to extract the frequencies of interest.

Length of FFT and IFFT

I have some signals which I add up to a larger signal, where each signal is located in a different frequency region.
Now, I perform the FFT operation on the big signal with FFTW and cut the concrete FFT bins (where the signals are located) out.
For example: The big signal is FFT transformed with 1024 points,
the sample rate of the signal is fs=200000.
I calculate the concrete bin positions for given start and stop frequencies in the following way:
tIndex.iStartPos = (int64_t) ((tFreqs.i64fstart) / (mSampleRate / uFFTLen));
and e.g. I get for the first signal to be cut out 16 bins.
Now I do the IFFT transformation again with FFTW and get the 16 complex values back (because I reserved the vector for 16 bins).
But when I compare the extracted signal with the original small signal in MATLAB, then I can see that the original signal (is a wav-File) has xxxxx data and my signal (which I saved as raw binary file) has only 16 complex values.
So how do I obtain the length of the IFFT operation to be correctly transformed? What is wrong here?
EDIT
The logic itself is split over 3 programs, each line is in a multithreaded environment. For that reason I post here some pseudo-code:
ReadWavFile(); //returns the signal data and the RIFF/FMT header information
CalculateFFT_using_CUFFTW(); //calculates FFT with user given parameters, like FFT length, polyphase factor, and applies polyphased window to reduce leakage effect
GetFFTData(); //copy/get FFT data from CUDA device
SendDataToSignalDetector(); //detects signals and returns center frequency and bandwith for each sigal
Freq2Index(); // calculates positions with the returned data from the signal detector
CutConcreteBins(position);
AddPaddingZeroToConcreteBins(); // adds zeros till next power of 2
ApplyPolyphaseAndWindow(); //appends the signal itself polyphase-factor times and applies polyphased window
PerformIFFT_using_FFTW();
NormalizeFFTData();
Save2BinaryFile();
-->Then analyse data in MATLAB (is at the moment in work).
If you have a real signal consisting of 1024 samples, the contribution from the 16 frequency bins of interest could be obtained by multiplying the frequency spectrum by a rectangular window then taking the IFFT. This essentially amounts to:
filling a buffer with zeros before and after the frequency bins of interest
copying the frequency bins of interest at the same locations in that buffer
if using a full-spectrum representation (if you are using fftw_plan_dft_1d(..., FFTW_BACKWARD,... for the inverse transform), computing the Hermitian symmetry for the upper half of the spectrum (or simply use a half-spectrum representation and perform the inverse transform through fftw_plan_dft_c2r_1d).
That said, you would get a better frequency decomposition by using specially designed filters instead of just using a rectangular window in the frequency domain.
The output length of the FT is equal to the input length. I don't know how you got to 16 bins; the FT of 1024 inputs is 1024 bins. Now for a real input (not complex) the 1024 bins will be mirrorwise identical around 512/513, so your FFT library may return only the lower 512 bins for a real input. Still, that's more than 16 bins.
You'll probably need to fill all 1024 bins when doing the IFFT, as it generally doesn't assume that its output will become a real signal. But that's just a matter of mirroring the lower 512 bins then.

Creating a 6 bit crc using boost

I'm new to CRCs, boost and more of a java developer for that matter. I'm trying to use the the crc.hpp boost library to create a 6 bit crc calculated based on only two bits. First is this possible?
It seems that the Theoretical CRC Computer can be used to process a specific number of bits, however I'm unclear how to specify a 6 bit result. Help please.
Assuming your input is based on 2 actual bits and not two bytes, this should work:
const int initial_remainder = 0xBAADF00D;
unsigned char input = 0x3;
boost::crc_basic<6> checksum(initial_remainder);
checksum.process_bits(input, 2);
printf("%i", checksum.checksum());
You still need to figure out what the initial remainder should be, though.
This should just be a custom code that maximizes the Hamming distance between four byte values. It would be a table of four 8-bit values indexed by the two bits as a number in 0..3.
A set of values (there 280 such sets) that maximizes the minimum Hamming distance between any two of the four values is: 0x00, 0x4f, 0xb3, 0xfc. The minimum Hamming distance is 5. The high two bits of those values is the two-bit index in order.

Rounding a value contained within a CString

So I have a CString which contains a number value e.g. "45.05" and I would like to round this number to one decimal place.
I use this funcion
_stscanf(strValue, _T("%f"), &m_Value);
to put the value into a float which i can round. However in the case of 45.05 the number i get is 45.04999... which rounds to 45.0 where one would expect 45.1
How can I get the correct value from my CString?
TIA
If you need a string result, your best bet is to find the decimal point and inspect the two digits after it and use them to make a rounded result. If you need a floating-point number as a result, well.. it's hopeless since 45.1 cannot be represented exactly.
EDIT: the nearest you can come to rounding with arithmetic is computing floor(x*10+0.5)/10, but know that doing this with 45.05 WILL NOT and CAN NOT result in 45.1.
You could extract the digits that make up the hundredths and below positions separately, convert them to a number and round it independently, and then add that to the rest of the number:
"45.05" = 45.0 and 0.5 tenths (0.5 can be represented exactly in binary)
round 0.5 tenths to 1
45.0 + 1 tenth = 45.1
don't confuse this with just handling the fractional position separately. "45.15" isn't divided into 45 and .15, it's divided into 45.1 and 0.5 tenths.
I haven't used c++ in a while but here are the steps I would take.
Count the characters after the Decimal
Remove the Decimal
cast the string to an Int
Perform Rounding operation
Divide by the (number of characters less one)*10
Store result in a float

compact representation and delivery of point data

I have an array of point data, the values of points are represented as x co-ordinate and y co-ordinate.
These points could be in the range of 500 upto 2000 points or more.
The data represents a motion path which could range from the simple to very complex and can also have cusps in it.
Can I represent this data as one spline or a collection of splines or some other format with very tight compression.
I have tried representing them as a collection of beziers but at best I am getting a saving of 40 %.
For instance if I have an array of 500 points , that gives me 500 x and 500 y values so I have 1000 data pieces.
I around 100 quadratic beziers from this. each bezier is represented as controlx, controly, anchorx, anchory.
which gives me 100 x 4 = 400 pcs of data.
So input = 1000pcs , output = 400pcs.
I would like to further tighen this, any suggestions?
By its nature, spline is an approximation. You can reduce the number of splines you use to reach a higher compression ratio.
You can also achieve lossless compression by using some kind of encoding scheme. I am just making this up as I am typing, using the range example in previous answer (1000 for x and 400 for y),
Each point only needs 19 bits (10 for x, 9 for y). You can use 3 bytes to represent a coordinate.
Use 2 byte to represent displacement up to +/- 63.
Use 1 byte to represent short displacement up to +/- 7 for x, +/- 3 for y.
To decode the sequence properly, you would need some prefix to identify the type of encoding. Let's say we use 110 for full point, 10 for displacement and 0 for short displacement.
The bit layout will look like this,
Coordinates: 110xxxxxxxxxxxyyyyyyyyyy
Dislacement: 10xxxxxxxyyyyyyy
Short Displacement: 0xxxxyyy
Unless your sequence is totally random, you can easily achieve high compression ratio with this scheme.
Let's see how it works using a short example.
3 points: A(500, 400), B(550, 380), C(545, 381)
Let's say you were using 2 byte for each coordinate. It will take 16 bytes to encode this without compression.
To encode the sequence using the compression scheme,
A is first point so full coordinate will be used. 3 bytes.
B's displacement from A is (50, -20) and can be encoded as displacement. 2 bytes.
C's displacement from B is (-5, 1) and it fits the range of short displacement 1 byte.
So you save 10 bytes out of 16 bytes. Real compression ratio is totally depending on the data pattern. It works best on points forming a moving path. If the points are random, only 25% saving can be achieved.
If for example you use 32-bit integers for point coords and there is range limit, like x: 0..1000, y:0..400, you can pack (x, y) into a single 32-bit variable.
That way you achieve another 50% compression.
You could do a frequency analysis of the numbers you are trying to encode and use varying bit lengths to represent them, of course here I am vaguely describing Huffman coding
Firstly, only keep enough decimal points in your data that you actually need. Removing these would reduce your accuracy, but its a calculated loss. To do that, try converting your number to a string, locating the dot's position, and cutting of those many characters from the end. That could process faster than math, IMO. Lastly you can convert it back to a number.
150.234636746 -> "150.234636746" -> "150.23" -> 150.23
Secondly, try storing your data relative to the last number ("relative values"). Basically subtract the last number from this one. Then later to "decompress" it you can keep an accumulator variable and add them up.
A A A A R R
150, 200, 250 -> 150, 50, 50