Arduino eInk Image2LCD - Size of c-array - c++
This Image2LCD software (https://www.buydisplay.com/default/image2lcd) converts images to C arrays. I want to write this basic operation myself, but I don't understand why the software outputs an array of length 5000 for an input image of size 200x200. For 400x400 the array size is 20000. It seems like it's always 1/8 of the number of pixels.
The output array for the square 200x200 image begins and ends like this:
const unsigned char gImage_test[5000] = { /* 0X00,0X01,0XC8,0X00,0XC8,0X00, */
0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,
0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,
0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,
0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,
0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X60,0X00,0X00,0X00,0X00,
0X3C,0X60,0X00,0X0C,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,
0X00,0X00,0X00,0X00,0X70,0X00,0X00,0X00,0X00,0X7E,0X70,0X00,0X0E,0X00,0X00,0X00,
0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X78,0X00,0X00,
0X00,0X00,0X7F,0X78,0X00,0X0F,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,
0X00,0X00,0X00,0X00,0X00,0X00,0X7F,0XFC,0X3C,0X3E,0X3C,0X3F,0XF8,0X3C,0X7F,0X00,
0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X7F,
...
,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,
0X00,0X00,0X00,0X00,0X00,0X00,0X00,0X00,};
(Yes there is a lot of white in the image.)
Why don't you need one value for each pixel?
Shooting from the hip here, but if you're using monochrome, you only need one bit per pixel (a byte is 8 bits). These bits can be packed into bytes for storage efficiency. Say the first 8 pixels of your image are these:
0 1 0 0 0 0 0 1
If we interpret these eight bits as one binary number, most significant bit first, we get 01000001, which is 65 in decimal - so storing 65 in an 8-bit integer, taking up only one byte, holds all 8 monochrome pixels. The downside is that it's not as intuitive as having each pixel as a separate value in the array.
I may be wrong, but a ratio of exactly 1/8 points straight to this kind of bit packing.
Related
C++ - Reading number of bits per pixel from BMP file
I am trying to get the number of bits per pixel in a BMP file. According to Wikipedia, it is supposed to be at the 28th byte. So after opening the file:

// Seek to the byte where the bits-per-pixel value is stored
plik.seekg(28, ios::beg);
// Read the number of bits used per pixel
int liczbaBitow;
plik.read((char*)&liczbaBitow, 2);
cout << "liczba bitow " << liczbaBitow << endl;

But liczbaBitow (the variable that is supposed to hold the bits-per-pixel value) is -859045864. I don't know where it comes from... I'm pretty lost. Any ideas?
To clarify @TheBluefish's answer, this code has the bug:

// Read the number of bits used per pixel
int liczbaBitow;
plik.read((char*)&liczbaBitow, 2);

When you use (char*)&liczbaBitow, you're taking the address of a 4-byte integer and telling the code to put 2 bytes there. The other two bytes of that integer are unspecified and uninitialized. In this case they're 0xCC, because that's the stack initialization value used by the system. But if you're calling this from another function, or repeatedly, you can expect the stack to contain other bogus values. If you initialize the variable, you'll get the value you expect.

But there's another bug: byte order matters here too. This code assumes that the machine's native byte order exactly matches the byte order from the file specification. There are a number of different bitmap formats, but your reference, the Wikipedia article, says:

All of the integer values are stored in little-endian format (i.e. least-significant byte first).

That's the same as yours, which is obviously also x86 little-endian. Other fields aren't defined to be little-endian, so as you proceed to decode the image, you'll have to watch for it. Ideally, you'd read into a byte array and put the bytes where they belong. See Convert Little Endian to Big Endian.

int liczbaBitow;
unsigned char bpp[2];
plik.read((char*)bpp, 2);
liczbaBitow = bpp[0] | (bpp[1] << 8);
-859045864 can be represented in hexadecimal as 0xCCCC0018. Reading the low two bytes gives us 0x0018 = 24 bpp. What is most likely happening here is that liczbaBitow is being initialized to 0xCCCCCCCC, while your plik.read only writes the lower 16 bits and leaves the upper 16 bits unchanged. Changing that line should fix the issue:

int liczbaBitow = 0;

Though, especially with something like this, it's best to use a datatype that exactly matches your data:

int16_t liczbaBitow = 0;

This can be found in <cstdint>.
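As a sketch, the endian-safe assembly from the answer above can be isolated in a small helper (the name readLE16 is mine, not from the original code), which works regardless of the host byte order:

```cpp
#include <cstdint>

// Assemble a little-endian 16-bit value from two raw bytes,
// independent of the host machine's byte order.
// p[0] is the least significant byte, as the BMP format specifies.
uint16_t readLE16(const unsigned char* p) {
    return static_cast<uint16_t>(p[0] | (p[1] << 8));
}
```

For the BMP case: the two bytes on disk are 0x18, 0x00, and readLE16 returns 24, i.e. 24 bits per pixel.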
Loading DDS textures?
I'm reading about loading DDS textures. I read this article and saw this posting. (I also read the wiki about S3TC.) I understood most of the code, but there are a few lines I didn't quite get:

blockSize = (format == GL_COMPRESSED_RGBA_S3TC_DXT1_EXT) ? 8 : 16;

and:

size = ((width + 3) / 4) * ((height + 3) / 4) * blockSize;

and:

bufsize = mipMapCount > 1 ? linearSize * 2 : linearSize;

What is blockSize? And why are we using 8 for DXT1 and 16 for the rest? What exactly is happening when we're calculating size? More specifically, why are we adding 3, dividing by 4, then multiplying by blockSize? Why are we multiplying by 2 if mipMapCount > 1?
DXT1-5 formats are also called BCn formats (the numbers don't match up exactly), where BC stands for block compression. Pixels are not stored separately; the format only stores one block of data for each 4x4 group of pixels.

The 1st line checks whether the format is DXT1, because DXT1 has a size of 8 bytes per block, while DXT3 and DXT5 use 16 bytes per block. (Note that newer formats exist, and at least one of them, BC4, is also 8 bytes/block.)

The 2nd line rounds the dimensions of the texture up to a multiple of the dimensions of a block. This is required since these formats can only store whole blocks, not individual pixels. For example, if you have a texture of 15x6 pixels, and since BCn blocks are 4x4 pixels, you will need 4 blocks per row and 2 blocks per column, even if the last column/row of blocks is only partially filled.

One way of rounding a positive integer i up to a multiple of another positive integer m is:

(i + m - 1) / m * m

Here, we need the number of blocks in each dimension, then multiply by the size of a block to get the total size of the texture. To do that we round width and height up to the next multiple of 4, divide by 4 to get the number of blocks, and finally multiply by the block size:

size = ((width + 3) / 4 * 4 / 4) * ((height + 3) / 4 * 4 / 4) * blockSize;
//                 ^       ^                  ^        ^

If you look closely, each * 4 is immediately followed by a / 4 and can be simplified away. If you do that, you get exactly the code you had. The conclusion of all this could be: comment any code that isn't perfectly obvious :P

The 3rd line may be an approximation to calculate a buffer size big enough to easily store the whole mipmap chain. I'm not sure what this linearSize is; it corresponds to dwPitchOrLinearSize in the DDS header. In any case, you don't really need this value, since you can calculate the size of each level easily with the code above.
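Putting the block math together as a small helper (the function name is mine; the constants come from the answer above):

```cpp
// Size in bytes of one mip level of a BCn-compressed texture.
// blockSize is 8 for DXT1 (BC1) and 16 for DXT3/DXT5 (BC2/BC3).
unsigned dxtLevelSize(unsigned width, unsigned height, unsigned blockSize) {
    unsigned blocksW = (width  + 3) / 4;  // round up to whole 4x4 blocks
    unsigned blocksH = (height + 3) / 4;
    return blocksW * blocksH * blockSize;
}
```

For the 15x6 DXT1 example: 4 blocks across, 2 blocks down, 8 bytes each, so 64 bytes.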
Jpeg Compression fundamentals
I was reading about JPEG compression but I have some problems understanding the basics! Please see this schema: http://www.cs.cf.ac.uk/Dave/Multimedia/Topic5.fig_29.gif

My problem is in the last steps. Consider a 16x16-pixel gray image, so we have 4 blocks of size 8x8. After the zigzag scan we have 4 arrays of size 1x64, in which the first index of each array is the DC value and the remaining 63 values are the AC components. Let's assume they are:

BLOCK-1: 150, -1, 6, 0, -3, ...
BLOCK-2: -38, 4, -6, -1, 1, ...
BLOCK-3: 18, -2, 3, 4, 1, ...
BLOCK-4: 45, 3, 5, -1, 1, ...

I know that DPCM encodes the difference from the previous 8x8 block, but how? Something like this: 150, 150-(-38), -38-18, 45-18 >> 150, 188, -156, 27? Then according to the JPEG coefficient coding table we have 10010110-111110, 10111100-111110, 01100011-111110, 11011-110.

For the AC components of, for example, the first block (-1, 6, 0, -3, ...) we use RLE, so we have (0,-1), (0,6), (1,-3), ..., and then according to the JPEG default AC code table we have 00-0, 100-110, 111001-10.

If my calculations are correct, what happens next? Do we put the first DC of the first block and after that the RLE of the 63 remaining values, and so on? I mean, for the first block we have 10010110-111110, 00-0, 100-110, 111001-10, ... I'm a bit confused and I couldn't find the answer anywhere :(
First of all, I greatly recommend you refer to jpec, a tiny JPEG encoder written in C (grayscale only, baseline DCT-based JPEG, 8x8 blocks only). You can find the main compression steps here. In particular, please refer to the line that corresponds to the entropy coding step of the current block.

"the DPCM encode the difference from previous 8*8 blocks but how?"

The entropy coding operates block after block. Assuming the current block is not empty, the DC coefficient encoding is done by first computing the difference between the current and previous DC values:

int val, bits, nbits;
/* DC coefficient encoding */
if (block->len > 0) {
    val = block->zz[0] - s->dc;
    s->dc = block->zz[0];
}

Note: s represents the entropy coder state. Also, for the first block, s->dc is initialized to 0.

So val represents the current DC difference. The size of this difference (= number of bits) is encoded by reading the corresponding DC code in the Huffman table; then its amplitude (= value) is encoded. If the difference is negative, its one's complement is used:

bits = val;
if (val < 0) {
    val = -val;
    bits = ~val;
}
JPEC_HUFF_NBITS(nbits, val);
jpec_huff_write_bits(s, jpec_dc_code[nbits], jpec_dc_len[nbits]); /* (1) */
if (nbits) jpec_huff_write_bits(s, (unsigned int) bits, nbits);   /* (2) */

For the full version, please refer to this code section.
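A tiny sketch of just the DC DPCM step, with the predictor starting at 0 as in jpec's coder state. Note that each difference is current minus previous, so for the question's DC values 150, -38, 18, 45 the encoded differences come out as 150, -188, 56, 27:

```cpp
#include <vector>

// Replace each block's DC coefficient by its difference from the
// previous block's DC value (the predictor starts at 0, as in jpec).
std::vector<int> dpcmDC(const std::vector<int>& dc) {
    std::vector<int> diff;
    int prev = 0;
    for (int v : dc) {
        diff.push_back(v - prev);  // current DC minus previous DC
        prev = v;
    }
    return diff;
}
```

The decoder reverses this by keeping a running sum of the differences.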
Why is this "reduction factor" algo doing "+ div/2"
So I am running through "OpenCV 2 Computer Vision Application Programming Cookbook" by Robert Laganiere. Around page 42 it talks about an image reduction algorithm. I understand the algorithm (I think), but I do not understand exactly why one part was put in. I think I know why, but if I am wrong I would like to be corrected. I am going to copy and paste a little bit of it here:

"Color images are composed of 3-channel pixels. Each of these channels corresponds to the intensity value of one of the three primary colors (red, green, blue). Since each of these values is an 8-bit unsigned char, the total number of colors is 256x256x256, which is more than 16 million colors. Consequently, to reduce the complexity of an analysis, it is sometimes useful to reduce the number of colors in an image. One simple way to achieve this goal is to simply subdivide the RGB space into cubes of equal sizes. For example, if you reduce the number of colors in each dimension by 8, then you would obtain a total of 32x32x32 colors. Each color in the original image is then assigned a new color value in the color-reduced image that corresponds to the value in the center of the cube to which it belongs. Therefore, the basic color reduction algorithm is simple. If N is the reduction factor, then for each pixel in the image and for each channel of this pixel, divide the value by N (integer division, therefore the remainder is lost). Then multiply the result by N; this will give you the multiple of N just below the input pixel value. Just add N/2 and you obtain the central position of the interval between two adjacent multiples of N. If you repeat this process for each 8-bit channel value, then you will obtain a total of 256/N x 256/N x 256/N possible color values.

How to do it...

The signature of our color reduction function will be as follows:

void colorReduce(cv::Mat &image, int div=64);

The user provides an image and the per-channel reduction factor. Here, the processing is done in-place; that is, the pixel values of the input image are modified by the function. See the There's more... section of this recipe for a more general function signature with input and output arguments. The processing is simply done by creating a double loop that goes over all pixel values:"

void colorReduce(cv::Mat &image, int div=64) {
    int nl = image.rows;                    // number of lines
    int nc = image.cols * image.channels(); // total number of elements per line
    for (int j = 0; j < nl; j++) {
        // get the address of row j
        uchar* data = image.ptr<uchar>(j);
        for (int i = 0; i < nc; i++) {
            // process each pixel ---------------------
            data[i] = data[i]/div*div + div/2; // <- HERE IS WHERE I NEED UNDERSTANDING!!!
            // end of pixel processing ----------------
        }
    }
}

So I get how I am reducing the 0:255 pixel value by the div amount. I then lose whatever remainder was left. Then by multiplying it by the div amount again, we scale it back up to keep it in the range of 0:255. Why are we then adding (div/2) back into the answer? The only reason I can think of is that this will cause some values to be rounded down and some rounded up. If you don't use it, then all your values are rounded down. So in a way it gives a "better" average? I don't know, so what do you guys/girls think?
The easiest way to illustrate this is with an example. For simplicity, let's say we are processing a single channel of an image. There are 256 distinct colors, ranging from 0 to 255. We are also going to use N = 64 in our example. Using these numbers, we will reduce the number of colors from 256 to 256/64 = 4. Let's draw a graph of our color space:

|......|......|......|......|
0      63     127    191    255

The dotted line represents our color space, going from 0 to 255. We have split this interval into 4 parts, and the splits are represented by the vertical lines. In order to reduce all 256 colors to 4 colors, we divide each color by 64 (losing the remainder), and then multiply it by 64 again. Let's see how this goes:

[0  , 63 ] / 64 * 64 = 0
[64 , 127] / 64 * 64 = 64
[128, 191] / 64 * 64 = 128
[192, 255] / 64 * 64 = 192

As you can see, all the colors from the first part became 0, all the colors from the second part became 64, the third part 128, the fourth part 192. So our color space looks like this:

|......|......|......|......|
0      63     127    191    255
\______/\_____/\_____/\_____/
   |       |      |      |
   0      64     128    192

But this is not very useful. You can see that all our colors are slanted to the left of their intervals. It would be more helpful if they were in the middle of the intervals. And that's why we add 64/2 = 32 to the values: adding half of the interval length shifts the colors to the center of the intervals. That's also what it says in the book: "Just add N/2 and you obtain the central position of the interval between two adjacent multiples of N." So let's add 32 to our values and see how everything looks:

[0  , 63 ] / 64 * 64 + 32 = 32
[64 , 127] / 64 * 64 + 32 = 96
[128, 191] / 64 * 64 + 32 = 160
[192, 255] / 64 * 64 + 32 = 224

And the intervals look like this:

|......|......|......|......|
0      63     127    191    255
\______/\_____/\_____/\_____/
   |       |      |      |
  32      96     160    224

This is a much better color reduction.
The algorithm reduced our color space from 256 to 4 colors, and those colors sit in the middle of the intervals that they represent.
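The per-channel mapping can be isolated into a one-liner for experimenting (the helper name is mine; the expression is exactly the book's inner-loop statement):

```cpp
#include <cstdint>

// Map one 8-bit channel value to the center of its quantization
// interval, as colorReduce's inner loop does: floor to a multiple
// of div, then shift by half an interval.
uint8_t reduceChannel(uint8_t value, int div) {
    return static_cast<uint8_t>(value / div * div + div / 2);
}
```

With div = 64, every input in [0, 63] maps to 32, [64, 127] to 96, [128, 191] to 160, and [192, 255] to 224, matching the intervals drawn above.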
It is done to give the center of each quantization interval, not the floor of it. For example, for N = 32, all data from 0 to 31 will give 16 instead of 0. Please check the following picture or my Excel file.
Coding an enhanced LSB reverser
I'm stumbling upon a steganographied image (.PNG) with a divided IDAT structure of 12 blocks (the last one slightly smaller). I'll elaborate a bit on the structure of the issue before I get to the real point of my question, since I need to clarify some things, so please do not mark it as off-topic; it is not. I just have to explain the notion behind the script so that I may get to the issue itself.

The image definitely has data embedded in it. The data seems to have been concealed by altering the enhanced LSB values, eliminating the high-level bits for each pixel except for the least significant bit. So all bytes are going to be 0 or 1, since 0 or 1 on a 256-value range won't give any visible color: basically, a 0 stays at 0, and a 1 becomes the maximum value, 255. I've been analyzing this image in many different ways, but don't see anything odd beyond the utter lack of one value in any of the three color channels (RGB) and the heightened presence of another value in 1/3 of the color values. Studying these and replacing bytes has given me nothing, however, and I am at a loss as to whether this avenue is even worth pursuing. Hence, I'm looking into developing a script, in Python, PHP, or C/C++, that would reverse the process and 'restore' the enhanced LSBs.

I've converted it to a 24-bit .BMP, and tracking the red curve from a chi-square steganalysis, it's certain that there is steganographied data within the file. First, there are a little more than 8 vertical zones, which means that there is a little more than 8 kB of hidden data. One pixel can be used to hide three bits (one in the LSB of each RGB color tone), so we can hide (98x225)x3 bits. To get the number of kilobytes, we divide by 8 and by 1024: ((98x225)x3)/(8x1024). Well, that should be around 8.1 kilobytes. But that ain't the case here.
The analysis of the APP0 and APP1 markers of a .JPG export of the file also gives some awkward output:

Start Offset: 0x00000000
*** Marker: SOI (xFFD8) ***
  OFFSET: 0x00000000
*** Marker: APP0 (xFFE0) ***
  OFFSET: 0x00000002
  length     = 16
  identifier = [JFIF]
  version    = [1.1]
  density    = 96 x 96 DPI (dots per inch)
  thumbnail  = 0 x 0
*** Marker: APP1 (xFFE1) ***
  OFFSET: 0x00000014
  length = 58
  Identifier      = [Exif]
  Identifier TIFF = x[4D 4D 00 2A 00 00 00 08]
  Endian          = Motorola (big)
  TAG Mark x002A  = x[002A]
  EXIF IFD0 # Absolute x[00000026]
    Dir Length = x[0003]
    [IFD0.x5110] =
    [IFD0.x5111] = 0
    [IFD0.x5112] = 0
    Offset to Next IFD = [00000000]
*** Marker: DQT (xFFDB) ***
  Define a Quantization Table.
  OFFSET: 0x00000050
  Table length = 67
  ---- Precision=8 bits  Destination ID=0 (Luminance)
    DQT, Row #0:   2   1   1   2   3   5   6   7
    DQT, Row #1:   1   1   2   2   3   7   7   7
    DQT, Row #2:   2   2   2   3   5   7   8   7
    DQT, Row #3:   2   2   3   3   6  10  10   7
    DQT, Row #4:   2   3   4   7   8  13  12   9
    DQT, Row #5:   3   4   7   8  10  12  14  11
    DQT, Row #6:   6   8   9  10  12  15  14  12
    DQT, Row #7:   9  11  11  12  13  12  12  12
  Approx quality factor = 94.02 (scaling=11.97 variance=1.37)

I'm nearly convinced that there is no encryption algorithm applied, and therefore no key implementation follows the concealment. My notion is to code a script that would shift the LSB values and return the originals. I've run the file through several structure analyses, statistical attacks, and BPCS. The histogram of the image shows a specific color with an unusual spike to it. I've manipulated that as best I can to try to view any hidden data, but to no avail. Those are the histograms of the RGB values as follows. Then there are the multiple IDAT chunks. But I've put together a similar image by defining random color values at each pixel location, and I too wound up with several of these. So far, I've found very little inside them. Even more interesting is the way color values are repeated in the image. It seems that the frequency of reused colors could hold some clue.
But I have yet to fully understand that relationship, if one exists. Additionally, there is only a single column and a single row of pixels that do not possess a full value of 255 on their alpha channel. I've even interpreted the X, Y, A, R, G, and B values of every pixel in the image as ASCII, but wound up with nothing too legible. Even the green curve of the average of the LSBs cannot tell us anything; there is no evident break. Here are several other histograms which show the weird curve of the blue value from the RGB. But the red curve, the output of the chi-square analysis, shows some difference. It can see something that we cannot see: statistical detection is more sensitive than our eyes, and I guess that was my final point.

However, there is also a sort of latency in the red curve. Even without hidden data, it starts at maximum and stays like that for some time; it's close to a false positive. The LSB plane of the image is very close to random, and the algorithm needs a large population (remember the analysis is done on an incrementing population of pixels) before reaching a threshold where it can decide that, actually, they are not random after all, and the red curve starts to go down. The same sort of latency happens with hidden data: you hide 1 or 2 kB, but the red curve does not go down right after this amount of data. It waits a little bit, here respectively at around 1.3 kB and 2.6 kB.

Here is a representation of the data types from a hex editor:

byte                 = 166
signed byte          = -90
word                 = 40,358
signed word          = -25,178
double word          = 3,444,481,446
signed double word   = -850,485,850
quad                 = 3,226,549,723,063,033,254
signed quad          = 3,226,549,723,063,033,254
float                = -216652384.
double               = 5.51490063721e-093
word motorola        = 42,653
double word motorola = 2,795,327,181
quad motorola        = 12,005,838,827,773,085,484

Here's another spectrum to confirm the behavior of the blue (RGB) value.
Please note that I needed to go through all of this in order to clarify the situation and the programming matter that I'm pursuing. This by itself makes my question NOT off-topic, so I'd be glad if it doesn't get marked as such. Thank you.
In the case of an image with LSB enhancement applied, I cannot think of a way to reverse it back to its original state, because there is no clue about the original values of the RGBs: they are set to either 255 or 0 depending on their least significant bit. The other option I see here is that this is some sort of protocol involving quantum steganography. Matlab and some steganalysis techniques could be the key to your issue, though.

Here's a Java chi-square class for some statistical analysis:

private long[] pov = new long[256];

and three methods:

public double[] getExpected() {
    double[] result = new double[pov.length / 2];
    for (int i = 0; i < result.length; i++) {
        double avg = (pov[2 * i] + pov[2 * i + 1]) / 2.0;
        result[i] = avg;
    }
    return result;
}

public void incPov(int i) {
    pov[i]++;
}

public long[] getPov() {
    long[] result = new long[pov.length / 2];
    for (int i = 0; i < result.length; i++) {
        result[i] = pov[2 * i + 1];
    }
    return result;
}

Or try some bitwise shift operations (note the 0xAARRGGBB channel order that Java's getRGB returns):

int pRGB  = image.getRGB(x, y);
int alpha = (pRGB >> 24) & 0xFF;
int red   = (pRGB >> 16) & 0xFF;
int green = (pRGB >> 8) & 0xFF;
int blue  = pRGB & 0xFF;