I've stumbled upon a steganographic PNG image with a divided IDAT structure of 12 chunks (the last one slightly smaller). I'll elaborate a bit on the structure of the problem before I get to the actual question, since I need to clarify a few things first, so please don't mark this as off-topic; I have to explain the idea behind the script in order to get to the issue itself. The image definitely has data embedded in it. The data seems to have been concealed by altering the enhanced LSB values, eliminating the high-order bits of each pixel except for the least significant one. So every sample carries only its original LSB, and since a 0 or 1 on a 0-255 scale gives no visible color, the enhancement stretches them to the extremes: a 0 stays at 0, and a 1 becomes the maximum value, 255. I've been analyzing this image in many different ways, but I don't see anything odd beyond the utter absence of one value in all three color channels (RGB) and the heightened presence of another value in a third of the color values. Studying these and replacing bytes has given me nothing, however, and I am at a loss as to whether this avenue is even worth pursuing.
Hence, I'm looking into developing a script, in Python, PHP, or C/C++, that would reverse the process and 'restore' the enhanced LSBs.
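To make the goal concrete, here is a minimal sketch of the extraction half of what I have in mind, in C++ with OpenCV (the filename is a placeholder, and the channel scan order and MSB-first bit order are guesses I would have to vary):

#include <opencv2/opencv.hpp>
#include <fstream>
#include <vector>

int main() {
    // Assumes every channel sample is 0 or 255, as described above.
    cv::Mat img = cv::imread("suspect.png");  // placeholder filename
    if (img.empty()) return 1;
    std::vector<unsigned char> out;
    unsigned char acc = 0;
    int nbits = 0;
    for (int y = 0; y < img.rows; ++y) {
        const unsigned char* row = img.ptr<unsigned char>(y);
        for (int i = 0; i < img.cols * img.channels(); ++i) {
            // Collapse the enhanced sample back to one bit: 255 -> 1, 0 -> 0.
            acc = (unsigned char)((acc << 1) | (row[i] >= 128 ? 1 : 0));
            if (++nbits == 8) { out.push_back(acc); acc = 0; nbits = 0; }
        }
    }
    std::ofstream f("recovered.bin", std::ios::binary);
    f.write(reinterpret_cast<const char*>(out.data()), out.size());
}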
I've converted it to a 24-bit .BMP, and following the red curve from a chi-square steganalysis, it's all but certain that there is steganographic data within the file.
First, there are a little more than 8 vertical zones, which suggests a little more than 8 kB of hidden data. One pixel can be used to hide three bits (one in the LSB of each RGB color channel), so we can hide (98x225)x3 bits. To get the number of kilobytes, we divide by 8 and by 1024: ((98x225)x3)/(8x1024), which comes to around 8.1 kilobytes. But that doesn't seem to be the case here.
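Double-checking that arithmetic in code (trivial, but it keeps me honest; 98x225 are my image's dimensions):

#include <cstdio>

int main() {
    const long width = 98, height = 225;                         // image dimensions
    const long bits = width * height * 3;                        // one bit per RGB channel
    std::printf("capacity = %.2f kB\n", bits / (8.0 * 1024.0));  // prints ~8.07 kB
}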
The analysis of the APP0 and APP1 markers of a .JPG export of the file also gives some awkward output:
Start Offset: 0x00000000
*** Marker: SOI (xFFD8) ***
OFFSET: 0x00000000
*** Marker: APP0 (xFFE0) ***
OFFSET: 0x00000002
length = 16
identifier = [JFIF]
version = [1.1]
density = 96 x 96 DPI (dots per inch)
thumbnail = 0 x 0
*** Marker: APP1 (xFFE1) ***
OFFSET: 0x00000014
length = 58
Identifier = [Exif]
Identifier TIFF = x[4D 4D 00 2A 00 00 00 08 ]
Endian = Motorola (big)
TAG Mark x002A = x[002A]
EXIF IFD0 # Absolute x[00000026]
Dir Length = x[0003]
[IFD0.x5110 ] =
[IFD0.x5111 ] = 0
[IFD0.x5112 ] = 0
Offset to Next IFD = [00000000]
*** Marker: DQT (xFFDB) ***
Define a Quantization Table.
OFFSET: 0x00000050
Table length = 67
----
Precision=8 bits
Destination ID=0 (Luminance)
DQT, Row #0: 2 1 1 2 3 5 6 7
DQT, Row #1: 1 1 2 2 3 7 7 7
DQT, Row #2: 2 2 2 3 5 7 8 7
DQT, Row #3: 2 2 3 3 6 10 10 7
DQT, Row #4: 2 3 4 7 8 13 12 9
DQT, Row #5: 3 4 7 8 10 12 14 11
DQT, Row #6: 6 8 9 10 12 15 14 12
DQT, Row #7: 9 11 11 12 13 12 12 12
Approx quality factor = 94.02 (scaling=11.97 variance=1.37)
I'm nearly convinced that no encryption algorithm was applied, and therefore that no key is involved in the concealment. My idea is to code a script that would shift the LSB values and return the originals. I've run the file through several structural analyses, statistical attacks, and BPCS.
The histogram of the image shows one specific color with an unusual spike. I've manipulated that as best I can to try and view any hidden data, but to no avail. Here are the histograms of the RGB values:
Then there are the multiple IDAT chunks. But I've put together a similar image by defining random color values at each pixel location, and I too wound up with several of them, and so far I've found very little inside them. Even more interesting is the way color values are repeated in the image: the frequency of reused colors could hold some clue, but I have yet to understand that relationship, if one exists. Additionally, there is only a single column and a single row of pixels that do not have the full value of 255 in their alpha channel. I've even interpreted the X, Y, A, R, G, and B values of every pixel in the image as ASCII, but wound up with nothing legible. Even the green curve of the average of the LSBs cannot tell us anything; there is no evident break. Here are several other histograms, which show the weird curve of the blue value from the RGB:
But the red curve, the output of the chi-square analysis, shows some difference. It can see something that we cannot; statistical detection is more sensitive than our eyes, and I guess that was my point. However, there is also a sort of latency in the red curve. Even without hidden data, it starts at the maximum and stays there for some time, close to a false positive: the LSBs of the image are very close to random, and the algorithm needs a large population (remember the analysis is done on an incrementally growing population of pixels) before reaching a threshold where it can decide that they are not random after all, at which point the red curve starts to go down. The same sort of latency happens with hidden data: you hide 1 or 2 kB, but the red curve does not drop right after that amount of data; it waits a little, here at around 1.3 kB and 2.6 kB respectively. Here is a representation of the data types from a hex editor:
byte = 166
signed byte = -90
word = 40,358
signed word = -25,178
double word = 3,444,481,446
signed double word = -850,485,850
quad = 3,226,549,723,063,033,254
signed quad = 3,226,549,723,063,033,254
float = -216652384.
double = 5.51490063721e-093
word motorola = 42,653
double word motorola = 2,795,327,181
quad motorola = 12,005,838,827,773,085,484
Here's another spectrum that confirms the behavior of the blue (RGB) value.
Please note that I needed to go through all of this in order to clarify the situation and the programming problem I'm pursuing. That by itself makes my question NOT off-topic, so I'd be glad if it doesn't get marked as such. Thank you.
In the case of an image with LSB enhancement applied, I cannot think of a way to reverse it back to its original state, because there is no clue about the original RGB values: they were set to either 255 or 0 depending on their least significant bit. The other option I see here is that this is some sort of protocol involving quantum steganography.
Matlab and some steganalysis techniques could be the key to your issue, though.
Here's a Java chi-square class for some statistical analysis:
private long[] pov = new long[256]; // pairs-of-values counts: pov[v] is how many samples have value v
and three methods:
public double[] getExpected() {
    double[] result = new double[pov.length / 2];
    for (int i = 0; i < result.length; i++) {
        // divide by 2.0 so the average isn't truncated by integer division
        double avg = (pov[2 * i] + pov[2 * i + 1]) / 2.0;
        result[i] = avg;
    }
    return result;
}
public void incPov(int i) {
    pov[i]++;
}
public long[] getPov() {
    long[] result = new long[pov.length / 2];
    for (int i = 0; i < result.length; i++) {
        result[i] = pov[2 * i + 1]; // the observed count is the odd member of each pair
    }
    return result;
}
or try some bitwise shift operations, such as:
int pRGB = image.getRGB(x, y);
int alpha = (pRGB >> 24) & 0xFF;
int red = (pRGB >> 16) & 0xFF; // getRGB packs the pixel as ARGB,
int green = (pRGB >> 8) & 0xFF; // so red is the high color byte
int blue = pRGB & 0xFF; // and blue is the low byte
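If it helps, here is the same pairs-of-values idea as a compilable C++ sketch of my own, computing the chi-square statistic that drives the red curve discussed above:

#include <cstdint>

// Pairs-of-values chi-square: LSB embedding tends to equalize the counts of
// each value pair (2k, 2k+1), so the statistic shrinks when data is hidden.
double chiSquare(const uint64_t pov[256]) {
    double chi = 0.0;
    for (int k = 0; k < 128; ++k) {
        double expected = (pov[2 * k] + pov[2 * k + 1]) / 2.0;
        if (expected > 0.0) {
            double diff = pov[2 * k + 1] - expected;
            chi += diff * diff / expected;
        }
    }
    return chi;  // compare against a chi-square distribution with 127 degrees of freedom
}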
I've inherited maintenance of a function that takes as parameter a value between 0 and 65535 (inclusive):
MyClass::mappingFunction(unsigned short headingIndex);
headingIndex can be converted to degrees using the following formula: degrees = headingIndex * 360 / 65536
The role of this function is to translate the headingIndex into 1 of 36 symbols representing various degrees of rotation, i.e. there is a symbol for 10 degrees, a symbol for 20 degrees etc, up to 360 degrees in units of 10 degrees.
A headingIndex of 0 would translate to displaying the 0 (360) degree symbol.
The function performs the following which I can't seem to get my head around:
const int MAX_INTEGER = 65536;
const int NUM_SYMBOLS = 36;
int symbolRange = NUM_SYMBOLS - 1;
int roundAmount = MAX_INTEGER / (symbolRange + 1) - 1;
int roundedIndex = headingIndex + roundAmount;
int symbol = (symbolRange * roundedIndex) / MAX_INTEGER;
I'm confused about the algorithm that is being used here, specifically with regard to the following:
What is the intention behind roundAmount? I understand it essentially divides the maximum input range into discrete chunks, but adding it onto the headingIndex seems a strange thing to do.
Is roundedIndex then the original value, offset (rotated clockwise) by some amount?
The algorithm produces results such as:
headingIndex of 0 --> symbol 0
headingIndex of 100 --> symbol 1
headingIndex of 65500 --> symbol 35
I'm thinking there must be a better way of doing this?
The code shown looks very convoluted (possibly a guard against integer overflow). A far simpler way to determine the symbol number would be code like the following:
symbol = (headingIndex * 36u) / 65536u;
However, if this does present problems with integer overflow, then the calculation could be done in double precision, converting the result back to int after rounding:
symbol = static_cast<int>((headingIndex * 36.0) / 65536.0 + 0.5) % 36; // add 0.5 to round to nearest; % 36 wraps inputs near 65535 back to symbol 0
You have 65536 possible inputs (0..65535) and 36 outputs (0..35). That means each output bin should represent about 1820 inputs if they are divided equally.
The code in the question doesn't do that. Only the first 54 values go to bin 0; the rest are divided equally across the remaining 35 bins (MAX_INTEGER/symbolRange), about 1872 per bin.
To show this, solve for the boundary where symbol becomes 1: 35 * (headingIndex + 1819) >= 65536 gives headingIndex >= 53.46, so headingIndex values 0 through 53 map to symbol 0 and 54 is the first to map to symbol 1.
If you want to keep the output the same but just tidy the code up: walk away.
There are odd features of that method that may or may not be what is desired.
The range for headingIndex of 0 - 53 gives a symbol of 0. That's a bucket (AKA bin) of 54 values.
The range of 63717 - 65535 gives 35. That bucket is 1819 values.
All the other buckets are either 1872 or 1873 values so seem 'big'.
We can't have equal-sized buckets, because the number of values is 65536, and 65536/36 is 1820 with a remainder of 16.
So we need to bury the 16 among the buckets. They have to be uneven in size.
Notice the constant MAX_INTEGER is a red herring. The max is 65535. 65536 is the range. The chosen name is misleading from the start.
Why:
int symbolRange = NUM_SYMBOLS - 1;
int roundAmount = MAX_INTEGER / (symbolRange + 1) - 1;
when the second line could simply be int roundAmount = MAX_INTEGER / NUM_SYMBOLS - 1;
It doesn't look quite thought through is all I'm saying. But looks can be deceptive.
What also bothers me is that the 'obvious' method proposed in other answers works great!
int symbol=(NUM_SYMBOLS*headingIndex)/(MAX_INTEGER);
That gives us buckets of either 1820 or 1821 values with an even distribution; I'd say it's the natural solution to the question in the title (the quick tally below confirms both distributions).
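Here's a quick sketch of my own that tallies the bucket sizes of both methods, using the question's constants:

#include <cstdio>

int main() {
    long oldB[36] = {0}, newB[36] = {0};
    for (long h = 0; h < 65536; ++h) {
        oldB[(35 * (h + 1819)) / 65536]++;  // the inherited method
        newB[(36 * h) / 65536]++;           // the 'obvious' method
    }
    for (int s = 0; s < 36; ++s)
        std::printf("symbol %2d: old=%4ld new=%4ld\n", s, oldB[s], newB[s]);
    // old: 54 first, then 1872/1873 per bucket, finally 1819; new: 1820 or 1821 throughout
}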
So why the current method? Is it some artefact of some measuring device?
I'll put money on the maximum value being 65535 because that's the maximum value of an unsigned 16-bit integer.
It's right to wonder about overflow, but if you're working in 16 bits it's already broken. So I wonder about a device that records 16 bits; that's quite realistic.
This is similar to what I know as "The Instalments Problem".
We want the customer to pay £655.36 over 36 months. Do they pay £18.20 a month, totalling £655.20, and we forget the 16p? They won't pay £18.21, totalling £655.56, overpaying 20p. A bigger first payment of £18.36 and then 35 payments of £18.20?
People wrestle with this one. The business answers are 'get the money' (bigger first payment); avoid complaints if they owe you money (bigger last payment); and forget the pennies (all the same - we're bigger than a few pence!).
In arithmetic terms, for a measurement (such as degrees), I'd say the 'sprinkled' method offered above is the most natural and even; it distributes the anomaly evenly.
But it's not the only answer. Up to you. Hint: if you haven't been asked to fix this and just think it's ugly - walk away. Walk away now.
This Image2LCD software (https://www.buydisplay.com/default/image2lcd) converts images to C arrays. I want to write this basic operation myself, but I don't understand why the software outputs an array of length 5000 for an input image of size 200x200. For 400x400 the array size is 20000. It seems like it's always 1/8 of the number of pixels.
The output array for the square 200x200 image begins and ends like this:
const unsigned char gImage_test[5000] = { /* 0X00,0X01,0XC8,0X00,0XC8,0X00, */
0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,
0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,
0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,
0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,
0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X60,0X00,0X00,0X00,0X00,
0X3C,0X60,0X00,0X0C,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,
0X00,0X00,0X00,0X00,0X70,0X00,0X00,0X00,0X00,0X7E,0X70,0X00,0X0E,0X00,0X00,0X00,
0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X78,0X00,0X00,
0X00,0X00,0X7F,0X78,0X00,0X0F,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,
0X00,0X00,0X00,0X00,0X00,0X00,0X7F,0XFC,0X3C,0X3E,0X3C,0X3F,0XF8,0X3C,0X7F,0X00,
0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X7F,
...
,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,
0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,};
(Yes, there is a lot of white in the image.)
Why don't you need one value for each pixel?
Shooting from the hip here, but if the image is monochrome, you only need one bit per pixel (a byte is 8 bits). These bits can be packed into bytes for storage efficiency. Say the first 8 pixels of your image are these:
0 1 0 0 0 0 0 1
If we interpret these eight bits as one binary number, we get 01000001, which is 65 in decimal - so just storing 65 in an 8-bit integer, taking up only one byte, stores all 8 monochrome pixels. The downside is that it's not as intuitive as having each pixel as a separate value in the array.
I may be wrong, but 1/8th points straight to this kind of packing.
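Here's a minimal sketch of that packing (my own guess at the convention; Image2LCD's actual bit order, threshold, and the commented-out 6-byte header in your dump may differ):

#include <cstdint>
#include <vector>

// Pack one bit per pixel, MSB first; 1 = white, 0 = black (assumed convention).
std::vector<uint8_t> packMonochrome(const std::vector<uint8_t>& gray,
                                    int width, int height) {
    std::vector<uint8_t> out((width * height + 7) / 8, 0);
    for (int i = 0; i < width * height; ++i) {
        if (gray[i] >= 128)                  // threshold the gray value to one bit
            out[i / 8] |= 0x80 >> (i % 8);   // set the matching bit within the byte
    }
    return out;                              // 200x200 -> 40000/8 = 5000 bytes
}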
I am learning bare-metal programming in C++, and it often involves setting a portion of a 32-bit hardware register to some bit combination.
For example, for an IO pin, I can set the 15th to 17th bits of a 32-bit register to 001 to mark the pin as an output pin.
I have seen code that does this and I half understand it based on an explanation of another SO question.
// here ra is a physical address
// the 15th to 17th bits are being
// cleared by AND-ing it with a value that is one everywhere
// except in the 15th to 17th bits
ra &= ~(7 << 12);
Another example is:
// this clears the 21st to 23rd bits of another address
ra &= ~(7 << 21);
How do I choose the 7 and how do I choose the number of bits to shift left?
I tried this out in Python to see if I could figure it out:
bin((7<<21)).lstrip('-0b').zfill(32)
'00000000111000000000000000000000'
# this has 8, 9 and 10 as the bits which is wrong
The 7 (base 10) is chosen because its binary representation is 111 (7 in base 2).
As for why bits 8, 9 and 10 appear set: you're reading from the wrong direction. Binary, just like normal base 10, counts right to left.
(I'd have left this as a comment, but my reputation isn't high enough.)
If you want to isolate and change some bits in a register but not all of them, you need to understand that the bitwise operations AND, OR, XOR, and NOT operate on a single bit column: bit 3 of each operand determines bit 3 of the result, and no other bits are involved. So here are some bits in binary, represented by letters since each can be either a 1 or a 0:
jklmnopq
You can look up the AND truth table: anything ANDed with zero is zero, anything ANDed with one is itself.
jklmnopq
& 01110001
============
0klm000q
Anything ORed with one is a one; anything ORed with zero is itself.
jklmnopq
| 01110001
============
j111nop1
So if you want to isolate and change two bits in this variable/register, say bits 5 and 6, and set them to 0b10 (2 in decimal), the common method is to AND them with zero and then OR in the desired value:
76543210
jklmnopq
& 10011111
============
j00mnopq
jklmnopq
| 01000000
============
j10mnopq
You could have ORed bit 6 with a 1 and ANDed bit 5 with a zero, but that is specific to the value you wanted to set. Generically, we think 'I want to change those bits to a 2', so to use that value 2 you zero the bits and then force the 2 onto them: AND to make them zero, then OR the 2 onto those bits. Generic.
In c
x = read_register(blah);
x = (x&(~(3<<5)))|(2<<5);
write_register(blah,x);
Let's dig into this (3 << 5):
00000011
00000110 1
00001100 2
00011000 3
00110000 4
01100000 5
76543210
That puts two ones on top of the bits we are interested in, but ANDing with that value would isolate those bits and mess up the others. To zero those bits without disturbing the rest of the register, we need to invert the mask.
Using x = ~x (a bitwise NOT operation) inverts those bits:
01100000
10011111
Now we have the mask we want to AND with our register, as shown way above, zeroing the bits in question while leaving the others alone: j00mnopq.
Now we need to prep the bits to OR in (2 << 5):
00000010
00000100 1
00001000 2
00010000 3
00100000 4
01000000 5
That gives the bit pattern we want to OR in, producing j10mnopq, which we write back to the register. Again, the j, m, n, ... bits are each either a one or a zero, and we don't want to change them, which is why we do this extra masking and shifting work. You may/will sometimes see examples that simply do write_register(blah, 2<<5), either because the author knows the state of the other bits, knows the other bits are unused and zero is okay/desired, or doesn't know what they are doing.
x = read_register(blah); // bits are jklmnopq
x = (x&(~(3<<5)))|(2<<5);
z = 3
z = z << 5
z = ~z
x = x & z
z = 2
z = z << 5
x = x | z
z = 3
z = 00000011
z = z << 5
z = 01100000
z = ~z
z = 10011111
x = x & z
x = j00mnopq
z = 2
z = 00000010
z = z << 5
z = 01000000
x = x | z
x = j10mnopq
If you have a 3-bit field, then the binary is 0b111, which in decimal is the number 7, or hex 0x7. A 4-bit field is 0b1111, which is decimal 15 or hex 0xF; as you get past 7 it is easier to use hex, IMO. A 6-bit field is 0x3F, a 7-bit field 0x7F, and so on.
You can take this further to be more generic. Say there is a register that controls some function for GPIO pins 0 through 15, two bits per pin, starting at bit 0. If you wanted to change the properties for GPIO pin 5, that would be bits 10 and 11: 5*2 = 10, there are two bits per pin, so 10 and the next one up, 11. Generically you could write:
x = (x&(~(0x3<<(pin*2)))) | (value<<(pin*2));
Since 2 is a power of 2:
x = (x&(~(0x3<<(pin<<1)))) | (value<<(pin<<1));
is an optimization the compiler might make when pin cannot be reduced to a specific value at compile time.
But if it were 3 bits per field, with the fields starting aligned at bit zero:
x = (x&(~(0x7<<(pin*3)))) | (value<<(pin*3));
for which the compiler might emit a multiply by 3, or perhaps instead just
pinshift = (pin<<1) + pin; // pin*2 + pin = pin*3
to get the multiply by three; it depends on the compiler and the instruction set.
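Putting the fragments together into something you can compile and run, here is a sketch of the generic read-modify-write (the register is faked with a plain variable; on real hardware it would be a volatile pointer or a platform call):

#include <cstdint>
#include <cstdio>

static uint32_t fake_reg = 0xFFFFFFFFu;  // stand-in for a hardware register

uint32_t read_register() { return fake_reg; }
void write_register(uint32_t v) { fake_reg = v; }

// Set a 3-bit field for the given pin, fields packed upward from bit 0.
void set_pin_field(unsigned pin, uint32_t value) {
    unsigned shift = pin * 3;            // 3 bits per field
    uint32_t x = read_register();        // read
    x &= ~(0x7u << shift);               // clear the field (modify...)
    x |= (value & 0x7u) << shift;        // insert the new value
    write_register(x);                   // ...write back
}

int main() {
    set_pin_field(5, 0x2);                      // pin 5 -> bits 15-17 become 010
    std::printf("%08X\n", (unsigned)fake_reg);  // prints FFFD7FFF
}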
Overall this is called a read-modify-write: you read something, modify some of it, then write it back (if you were modifying all of it you wouldn't need to bother with a read and a modify; you would just write the whole new value). And folks say 'masking and shifting' to generically cover isolating bits in a variable, whether for modification or for reading. If you wanted to read/see what those two bits above were, you would:
x = read_register(blah);
x = x >> 5;
x = x & 0x3;
or mask first, then shift:
x = x & (0x3<<5);
x = x >> 5;
Six of one, half a dozen of the other; both are equivalent in general, though on some instruction sets one may be more efficient than the other (and-then-shift versus shift-then-and). One may also make more sense visually to some folks than the other.
Technically, bit numbering is an endian-like convention: on some processors bit 0 is the most significant bit, while in C, AFAIK, bit 0 is the least significant bit. If/when a manual shows the bits laid out left to right, you want your right and left shifts to match that; that is why I showed 76543210 above, to indicate the documented bit numbers, and associated them with jklmnopq - that left-to-right correspondence was what mattered for the conversation about modifying bits 5 and 6. Some documents use Verilog/VHDL-style notation, 6:5 (meaning bits 6 to 5 inclusive; it makes more sense with, say, 4:2, meaning bits 4, 3, 2) or [6 downto 5]; more likely you will just see a picture with boxes or lines showing which bits form which field.
How do I choose the 7
You want to clear three adjacent bits. Three adjacent bits at the bottom of a word is 1+2+4=7.
and how do I choose the number of bits to shift left
You want to clear bits 21-23, not bits 1-3, so you shift left another 20.
Both your examples are wrong. To clear 15-17 you need to shift left 14, and to clear 21-23 you need to shift left 20.
this has 8, 9 and 10 ...
No it doesn't. You're counting from the wrong end.
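To make the corrected shift amounts concrete (using this answer's 1-based bit naming, where the '15th bit' is bit index 14):

#include <cstdio>

int main() {
    unsigned mask1 = ~(7u << 14);  // clears the 15th-17th bits (indices 14-16)
    unsigned mask2 = ~(7u << 20);  // clears the 21st-23rd bits (indices 20-22)
    std::printf("%08X\n%08X\n", mask1, mask2);  // FFFE3FFF and FF8FFFFF
}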
I have millions of unstructured 3D vectors, each associated with an arbitrary value - making for a set of 4D vectors. To make it simpler to understand: I have unixtime stamps associated with hundreds of thousands of 3D vectors each. And I have many time stamps, making for a very large dataset: upwards of 30 million vectors.
I need to search the particular datasets belonging to specific time stamps.
So lets say I have the following data:
For time stamp 1407633943:
(0, 24, 58, 1407633943)
(9, 2, 59, 1407633943)
...
For time stamp 1407729456:
(40, 1, 33, 1407729456)
(3, 5, 7, 1407729456)
...
etc etc
And I wish to make a very fast query along the lines of:
Query Example 1:
Give me vectors between:
X > 4 && X < 9 && Y > -29 && Y < 100 && Z > 0.58 && Z < 0.99
Give me the list of those vectors, so I can find the timestamps.
Query Example 2:
Give me vectors between:
X > 4 && X < 9 && Y > -29 && Y < 100 && Z > 0.58 && Z < 0.99 && W (timestamp) = 1407729456
So far I've used SQLite for the task, but even after column indexing, queries take between 500 ms and 7 s. I'm looking for a solution somewhere between 50 ms and 200 ms per query.
What sort of structures or techniques can I use to speed the query up?
Thank you.
kd-trees can be helpful here. Range search in a kd-tree is a well-known problem. The time complexity of one query depends on the output size, of course (in the worst case, the whole tree is traversed if every vector matches), but it can work pretty fast on average.
I would use an octree. In each node I would store the arrays of vectors in a hashtable, using the timestamp as the key.
To further increase performance you can use CUDA, OpenCL, OpenACC, or OpenMP and implement the algorithms to execute in parallel on the GPU or a multi-core CPU.
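To make the kd-tree suggestion concrete, here is a minimal range-query sketch of my own (untuned; for query example 2 you could keep one such tree per timestamp, and a production version would want iterative traversal and contiguous node storage):

#include <algorithm>
#include <cstdint>
#include <vector>

struct Vec { float x, y, z; int64_t t; };  // 3D point plus its timestamp

// Build a kd-tree in place over pts[lo, hi), cycling the split axis per level.
void build(std::vector<Vec>& pts, int lo, int hi, int axis) {
    if (hi - lo <= 1) return;
    int mid = (lo + hi) / 2;
    auto key = [axis](const Vec& v) { return axis == 0 ? v.x : axis == 1 ? v.y : v.z; };
    std::nth_element(pts.begin() + lo, pts.begin() + mid, pts.begin() + hi,
                     [&](const Vec& a, const Vec& b) { return key(a) < key(b); });
    build(pts, lo, mid, (axis + 1) % 3);
    build(pts, mid + 1, hi, (axis + 1) % 3);
}

// Report every point inside the axis-aligned box [mn, mx].
void query(const std::vector<Vec>& pts, int lo, int hi, int axis,
           const float mn[3], const float mx[3], std::vector<Vec>& out) {
    if (hi <= lo) return;
    const Vec& v = pts[(lo + hi) / 2];
    float c[3] = { v.x, v.y, v.z };
    if (c[0] >= mn[0] && c[0] <= mx[0] && c[1] >= mn[1] && c[1] <= mx[1] &&
        c[2] >= mn[2] && c[2] <= mx[2])
        out.push_back(v);                                         // inside the box
    if (c[axis] >= mn[axis])                                      // left half may overlap
        query(pts, lo, (lo + hi) / 2, (axis + 1) % 3, mn, mx, out);
    if (c[axis] <= mx[axis])                                      // right half may overlap
        query(pts, (lo + hi) / 2 + 1, hi, (axis + 1) % 3, mn, mx, out);
}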
BKaun: please accept my attempt at giving you some insight into the problem at hand. I suppose you have thought of every one of these points, but maybe seeing them here will help.
Regardless of how the ingest data is presented, consider that, using the C programming language, you can shrink the stored representation of the data to minimize space and search time. You would be searching for, loading, and parsing single bits of a vector instead of, say, a short int (2 bytes per entry) or a float (much more). The object, as I understand it, is to search the given data for given values of X, Y, and Z and then find the timestamp associated with those three, while optimizing the search. My solution does not go into the search itself, merely the representation of the data used in a search.
To illustrate my hints simply, I'm considering that the data consists of 4 vectors:
X between -2 and 7,
Y between 0.17 and 3.08,
Z between 0 and 50,
timestamp (many of same size - 10 digits)
To optimize, consider how many various numbers each vector can have in it:
1. X can be only 10 numbers (include 0)
2. Y can be (3.08 - 0.17) x 100 + 1 = 292 numbers (counting both endpoints)
3. Z can be 51 numbers
4. timestamp can be many (but in this scenario,
you are not searching for a certain one)
Consider how each variable is stored in binary:
1. Each entry in Vector X COULD be stored in 4 bits, using the first bit=1 for
the negative sign:
7="0111"
6="0110"
5="0101"
4="0100"
3="0011"
2="0010"
1="0001"
0="0000"
-1="1001"
-2="1010"
However, the original data that you are searching through may range
from -10 to 20!
Therefore, adding another 2 bits gives you a table like this:
-10="101010"
-9="101001" ...
...
-2="100010"
-1="100001" ...
...
8="001000"
9="001001" ...
...
19="001001"
20="010100"
And that's only 6 bits to store each X vector entry for integers from -10 to 20
For search purposes on a range of -10 to 20, there are 31 different X Vector entries
possible to search through.
Each entry in Vector Y COULD be stored in 9 bits (no extra sign bit is needed)
The 1's and 0's COULD be stored (accessed, really) in 2 parts
(the whole-number part, and a 2-digit decimal part).
Part 1 can be 0, 1, 2, or 3 (4 2-place bits from "00" to "11")
However, if the range of the entire Y dataset is 0 to 10,
part 1 can be 0, 1, ...9, 10 (which is 11 four-bit values
from "0000" to "1010")
Part 2 can be 00, 01, ...98, 99 (100 seven-bit values from "0000000" to "1100011")
Total storage bits for Vector Y entries is 4 + 7 = 11 bits (the 11 Part 1
values fit in 4 bits) in the range 00.00 to 10.99
For search purposes on a range 00.00 to 10.99, there are 1100 different Y Vector
entries possible to search through (11 x 100)
Each entry in Vector Z in the range of 0 to 50 COULD be stored in 6 bits
("000000" to "110010").
Again, the actual data range may be 7 bits long (for simplicity's sake)
0 to 64 ("0000000" to "1000000")
For search purposes on a range of 0 to 64, there are 65 different Z Vector entries
possible to search through.
Consider that you will be storing the data in this optimized format, in a single
succession of bits:
X=4 bits + 2 range bits = 6 bits
+ Y=4 bits part 1 and 7 bits part 2 = 11 bits
+ Z=7 bits
+ timestamp (10 numbers - each from 0 to 9 ("0000" to "1001") 4 bits each = 40 bits)
= TOTAL BITS: 6 + 11 + 7 + 40 = 64 stored bits for each 4D vector
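A hedged sketch of that packing (my own illustration of the 6 + 11 + 7 + 40 layout above; the field order and the BCD timestamp encoding are just one plausible choice):

#include <cmath>
#include <cstdint>
#include <cstdio>

// Pack one 4D vector into 64 bits: X = 6 bits (sign-magnitude), Y = 11 bits
// (4-bit whole part + 7-bit hundredths), Z = 7 bits, timestamp = 40 bits
// (ten BCD digits of 4 bits each).
uint64_t pack(int x, double y, int z, uint64_t ts) {
    uint64_t xb = x < 0 ? (0x20u | unsigned(-x)) : unsigned(x);  // e.g. -2 -> 100010
    unsigned cents = unsigned(std::lround(y * 100.0));           // 3.08 -> 308
    uint64_t yb = (uint64_t(cents / 100) << 7) | (cents % 100);  // whole | hundredths
    uint64_t tb = 0;
    for (int d = 0; d < 10; ++d) { tb |= (ts % 10) << (4 * d); ts /= 10; }  // BCD
    return (xb << 58) | (yb << 47) | (uint64_t(z) << 40) | tb;
}

int main() {
    std::printf("%016llX\n",
                (unsigned long long)pack(-2, 3.08, 50, 1407633943ULL));
}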
THE SEARCH:
Input xx, yy, zz to search for in arrays X, Y and Z (which are stored in binary)
Change xx, yy, and zz to binary bit strings per optimized format above.
function(xx, yy, zz)
Search for X first, since it has 31 possible outcomes (range is -10 to 20)
- the lowest number of any array
First search for positive targets (there are 8 of them, and a better chance
of finding one)
These all start with "000"
7="000111"
6="000110"
5="000101"
4="000100"
3="000011"
2="000010"
1="000001"
0="000000"
So you can check if the first 3 bits = "000". If so, you have a number
between 0 and 7.
Found: search for Z
Else search for xx=-2 or -1: does X = -2="100010" or -1="100001" ?
(do second because there are only 2 of them)
Found: Search for Z
NotFound: next X
Search for Z after X is Found: (Z second, since it has 65 possible outcomes
- range is 0 to 64)
You are searching for 6 bits of a 7 bit binary number
("0000000" to "1000000") If bits 1,2,3,4,5,6 are all "0", analyze bit 0.
If it is "1" (it's 64), next Z
Else begin searching 6 bits ("000000" to "110010") with LSB first
Found: Search for Y
NotFound: Next X
Search for Y (Y last, since it has 1100 possible outcomes - range is 0.00 to 10.99)
Search for Part 1 (whole-number place) bits (you are searching for
"0000", "0001", "0010", or "0011" only, so use yyPt1=YPt1)
Found: Search for Part 2 ("0000000" to "1100011") using yyPt2=YPt2
(direct comparison)
Found: Print out X, Y, Z, and timestamp
NotFound: Search criteria for X, Y, and Z not found in data.
Print X,Y,Z,"timestamp not found". Ask for new X, Y, Z. New search.
So I am running through "OpenCV 2 Computer Vision Application Programming Cookbook" by Robert Laganiere. Around page 42 it talks about an image reduction algorithm. I understand the algorithm (I think), but I do not understand exactly why one part was put in. I think I know why, but if I am wrong I would like to be corrected. I am going to copy and paste a little bit of it here:
"Color images are composed of 3-channel pixels. Each of these channels
corresponds to the intensity value of one of the three primary colors
(red, green, blue). Since each of these values is an 8-bit unsigned
char, the total number of colors is 256x256x256, which is more than 16
million colors. Consequently, to reduce the complexity of an analysis,
it is sometimes useful to reduce the number of colors in an image. One
simple way to achieve this goal is to simply subdivide the RGB space
into cubes of equal sizes. For example, if you reduce the number of
colors in each dimension by 8, then you would obtain a total of
32x32x32 colors. Each color in the original image is then assigned a
new color value in the color-reduced image that corresponds to the
value in the center of the cube to which it belongs. Therefore, the
basic color reduction algorithm is simple. If N is the reduction
factor, then for each pixel in the image and for each channel of this
pixel, divide the value by N (integer division, therefore the remainder
is lost). Then multiply the result by N, this will give you the
multiple of N just below the input pixel value. Just add N/2 and you
obtain the central position of the interval between two adjacent
multiples of N. If you repeat this process for each 8-bit channel
value, then you will obtain a total of 256/N x 256/N x 256/N possible
color values. How to do it... The signature of our color reduction
function will be as follows: void colorReduce(cv::Mat &image, int
div=64); The user provides an image and the per-channel reduction
factor. Here, the processing is done in-place, that is the pixel
values of the input image are modified by the function. See the
There's more... section of this recipe for a more general function
signature with input and output arguments. The processing is simply
done by creating a double loop that goes over all pixel values: "
void colorReduce(cv::Mat &image, int div=64) {
    int nl = image.rows;                    // number of lines
    int nc = image.cols * image.channels(); // total number of elements per line
    for (int j = 0; j < nl; j++) {
        uchar* data = image.ptr<uchar>(j);  // get the address of row j
        for (int i = 0; i < nc; i++) {
            // process each pixel ---------------------
            data[i] = data[i]/div*div + div/2; // <- HERE IS WHERE I NEED UNDERSTANDING!!!
            // end of pixel processing ---------------
        }
    }
}
So I get how I am reducing the 0:255 pixel value by div. I then lose whatever remainder was left, and by multiplying by div again we scale it back up to keep it in the range 0:255. Why are we then adding div/2 back into the answer? The only reason I can think of is that this causes some values to be rounded down and some rounded up; if you don't use it, then all your values are rounded down. So in a way it gives a "better" average?
I don't know, so what do you guys/girls think?
The easiest way to illustrate this is using an example.
For simplicity, let's say we are processing a single channel of an image. There are 256 distinct colors, ranging from 0 to 255. We are also going to use N=64 in our example.
Using these numbers, we will reduce the number of colors from 256 to 256/64 = 4. Let's draw a graph of our color space:
|......|......|......|......|
0 63 127 191 255
The dotted line represents our colorspace, going from 0 to 255. We have split this interval into 4 parts, and the splits are represented by the vertical lines.
In order to reduce all 256 colors to 4 colors, we are going to divide each color by 64 (losing the remainder), and then multiply it by 64 again. Let's see how this goes:
[0 , 63 ] / 64 * 64 = 0
[64 , 127] / 64 * 64 = 64
[128, 191] / 64 * 64 = 128
[192, 255] / 64 * 64 = 192
As you can see, all the colors from the first part became 0, all the colors from the second part became 64, third part 128, fourth part 192. So our color space looks like this:
|......|......|......|......|
0 63 127 191 255
|______/|_____/|_____/|_____/
| | | |
0 64 128 192
But this is not very useful. You can see that all our colors are slanted to the left of the intervals. It would be more helpful if they were in the middle of the intervals. And that's why we add 64/2 = 32 to the values. Adding half of the interval length shifts the colors to the center of the intervals. That's also what it says in the book: "Just add N/2 and you obtain the central position of the interval between two adjacent multiples of N."
So let's add 32 to our values and see how everything looks:
[0 , 63 ] / 64 * 64 + 32 = 32
[64 , 127] / 64 * 64 + 32 = 96
[128, 191] / 64 * 64 + 32 = 160
[192, 255] / 64 * 64 + 32 = 224
And the interval looks like this:
|......|......|......|......|
0 63 127 191 255
\______/\_____/\_____/\_____/
| | | |
32 96 160 224
This is a much better color reduction. The algorithm reduced our colorspace from 256 to 4 colors, and those colors sit in the middle of the intervals they represent.
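You can verify this with a tiny loop over the interval edges (same formula as the code in the question):

#include <cstdio>

int main() {
    const int div = 64;
    int samples[] = { 0, 63, 64, 127, 128, 191, 192, 255 };
    for (int v : samples)  // both edges of each interval map to its center
        std::printf("%3d -> %3d\n", v, v / div * div + div / 2);  // 32, 96, 160, 224
}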
It is done to land in the middle of the quantization interval, not at its floor.
For example, for N = 32, all data from 0 to 31 will give 16 instead of 0.
Please check the following picture or my Excel file.