Store OpenCV boolean matrix on disk - c++

I have a 1024x1024 float matrix and I want to store the sign of each element in a file. For this purpose, I want to keep the sign matrix as a matrix of booleans, which I have failed to do.
Assume, my matrix is:
2.312, 0.232, -2.132
5.754, -4.34, -3.23
-4.34, -1.23, 7.9453
My output should be
1,1,0
1,0,0
0,0,1
Since a float is 4 bytes and my matrix has 2^20 (~1M) elements, the file size is 4 MB. A boolean needs only 1 bit, so for 1M elements I expect the bool matrix to take about 1 Mbit = 128 KB. However, when I use the threshold method in OpenCV, my output file is 1 MB, which means the file is saved as uchar (8 bits per element).
I tried to use imwrite but it didn't work.
EDIT: I realized I didn't mention that speed is also an important factor in my tests: I'm loading approximately 10 million 1K x 1K matrices from disk.
Thanks in advance

In OpenCV you can write
Mat input;
Mat A = (input >= 0);
Now the problem is that OpenCV has no 1-bit data type. So the best you can get is Mat1b (one unsigned char per element).
If you want to save space in your storage, you need to do it on your own. For example, you can use libpng to write out a PNG file with bit depth 1. Unfortunately, imwrite does not support setting that bit depth (it can only write PNGs with bit depths 8 and 16).
If you want to write a compressed PNG with bitdepth 8, you can use imwrite:
std::vector<int> flags;
flags.push_back(CV_IMWRITE_PNG_COMPRESSION);
flags.push_back(9); // [0-9] 9 being max compression, default is 3
cv::imwrite("output.png", A, flags);
This will result in the best compression effort. Now you can use ImageMagick to compare the file size against the same image stored with bit depth 1:
convert output.png -type Bilevel -define "png:bit-depth=1" -define "png:compression-level=9" output-1b.png
I tested with a random example image.
8 bit, compressed PNG: 24,732 bytes
1 bit, compressed PNG: 20,529 bytes
8 bit, uncompressed PGM: 270,015 bytes
1 bit, uncompressed PBM: 34,211 bytes
As you can see, compressed 8-bit storage still beats uncompressed 1-bit storage in this example.
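If disk footprint really matters, the 1-bit packing has to be done by hand before writing. Here is a minimal pure-Python sketch of the idea (the names pack_signs/unpack_signs are illustrative, not an OpenCV API; in C++ the same bit twiddling would run over the Mat data):

```python
# Sketch (not OpenCV): pack the sign bits of a float matrix into a byte
# buffer at 1 bit per entry, so a 1024x1024 matrix takes 128 KB on disk.

def pack_signs(rows):
    """Flatten the matrix and pack (value >= 0) as individual bits."""
    flat = [v for row in rows for v in row]
    buf = bytearray((len(flat) + 7) // 8)
    for i, v in enumerate(flat):
        if v >= 0:
            buf[i // 8] |= 1 << (i % 8)
    return bytes(buf)

def unpack_signs(buf, n):
    """Recover the first n sign bits as 0/1 values."""
    return [(buf[i // 8] >> (i % 8)) & 1 for i in range(n)]

m = [[2.312, 0.232, -2.132],
     [5.754, -4.34, -3.23],
     [-4.34, -1.23, 7.9453]]
packed = pack_signs(m)          # 9 bits -> 2 bytes
print(unpack_signs(packed, 9))  # [1, 1, 0, 1, 0, 0, 0, 0, 1]
```

Since the question mentions loading millions of these matrices, raw packed bits (no PNG container) may also be the fastest option, as reading back is a single sequential read plus cheap bit tests.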

Related

Use png8 instead of png24 as RGB height map

I have a 1 band DEM geotiff, and a formula to convert altitude -> RGB and RGB -> altitude (something like this: https://docs.mapbox.com/help/troubleshooting/access-elevation-data).
Using the formula (and GDAL/Python), I converted my geotiff to a 3 bands (R, G & B) geotiff, each band having values in the 0-255 range.
Using mapnik / mod_tile, I'm then serving my geotiff as PNG tiles to a web client. Everything is fine if I setup mod_tile to serve the tiles as 24 or 32 bits PNGs. But if I serve them as 8 bits PNGs (to reduce their size), then the decoded values are a bit off (I can't see the difference when looking at the image, but the RGB values are not exactly the same and it thus messes my decoded altitudes).
Am I right in expecting to be able to do what I want (retrieving the exact RGB values) with 8 bit PNGs instead of 24/32, or is there something I don't understand about 8 bit PNGs (if so, I'll have to dive into the mod_tile code; I guess that when we ask for 8 bits, it generates 24 or 32 and then compresses)?
No, you are not right in expecting that you can compress any ensemble of 24-bit values losslessly to 8-bit values. If there are more than 256 different 24-bit values in the original, then some of those different values will necessarily map to the same 8-bit value.
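The pigeonhole argument can be demonstrated with a tiny sketch (illustrative values, not a real encoder):

```python
# Pigeonhole sketch: any map from more than 256 distinct 24-bit values
# down to 8-bit codes must send two different inputs to the same output.
values = list(range(300))          # 300 distinct "24-bit" colours
codes = [v % 256 for v in values]  # some 8-bit encoding (here: truncation)

assert len(set(values)) > 256
assert len(set(codes)) <= 256              # at most 256 codes exist...
assert len(set(codes)) < len(set(values))  # ...so collisions are forced
```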

Compression of an image

I have been calculating the uncompressed and compressed file sizes of an image. For me this has always resulted in the compressed image being smaller than the uncompressed one, which I would expect. If an image contains a large number of different colours, then storing the palette takes up a significant amount of space, and more bits are also needed to store each code. My question is: could the compression method potentially result in a larger file than the uncompressed RGB image? What would be the size (in pixels) of the smallest square RGB image, containing a total of k different colours, for which this compression method is still useful? So, for a given value of k, we want to find the smallest integer n for which an image of size n×n takes up less storage space after compression than the original RGB image.
Let's begin by making a small simplification -- the size of the encoded output depends on the number of pixels (the actual proportion of width vs. height doesn't really matter). Hence, let's generalize the problem to number of pixels N, from which we can always calculate n by taking a square root.
To further simplify the problem, we will also ignore the overhead of any image headers/metadata, such as width, height, size of the palette, etc. In practice, this would generally be some relatively small constant.
Problem Statement
Given that we have
N representing the number of pixels in an image
k representing the number of distinct colours in an image
24 bits per pixel RGB encoding
L_RGB representing the length of an RGB image, in bits
L_P representing the length of a palette image, in bits
our goal is to solve the following inequality in terms of N:

L_P < L_RGB
Size of RGB Image
An RGB image is just an array of N pixels, each pixel taking up a fixed number of bits given by the RGB encoding. Hence,

L_RGB = 24 * N
Size of Palette Image
A palette image consists of two parts: the palette and the pixel data.
The palette is an array of k colours, each colour taking up a fixed number of bits given by the RGB encoding. Therefore, the palette occupies

L_palette = 24 * k
In this case, each pixel holds an index into the palette rather than an actual RGB colour. The number of bits required to represent k distinct values is

log2(k)

However, unless we can encode fractional bits (which I consider outside the scope of this question), we need to round this up. Therefore, the number of bits required to encode a palette index is

ceil(log2(k))
Since there are N such palette indices, the size of the pixel data is

L_pixels = N * ceil(log2(k))

and the total size of the palette image is

L_P = N * ceil(log2(k)) + 24 * k
Solving the Inequality
Substituting the two sizes into L_P < L_RGB gives

N * ceil(log2(k)) + 24 * k < 24 * N

And finally

N > (24 * k) / (24 - ceil(log2(k)))
In Python, we could express this in the following way:
import math

def limit_size(k):
    return (k * 24.) / (24. - math.ceil(math.log(k, 2)))

def size_rgb(N):
    return N * 24.

def size_pal(N, k):
    return (N * math.ceil(math.log(k, 2))) + (k * 24.)
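As a usage sketch (repeating limit_size so the snippet is self-contained): for k = 256 colours a palette index costs ceil(log2 256) = 8 bits, so the palette only pays off once N exceeds limit_size(256) = 384 pixels, i.e. from a 20×20 image onward.

```python
import math

def limit_size(k):
    return (k * 24.) / (24. - math.ceil(math.log(k, 2)))

k = 256
N = int(limit_size(k)) + 1    # smallest pixel count where the palette wins: 385
n = math.ceil(math.sqrt(N))   # side length of the smallest square image: 20
print(N, n)
```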
In general no, but your question is not precise.
If we compress arbitrary files, the output can be larger. E.g. if you compress a randomly generated sequence of bytes, there is not much to compress, so all you gain is the compression program's header, which records which compression method was used, plus some versioning and possibly some escaping. This enlarges the file. A good compression program will detect that compression will not shrink the data and simply store it uncompressed, noting in the header that it is a flat file; possibly this is done per region of the file.
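The point about random data being incompressible can be checked with a short sketch, using zlib here as a stand-in for any general-purpose compressor:

```python
import os
import zlib

# Already-random bytes gain nothing from compression, so the container
# overhead makes the output (slightly) larger; repetitive data shrinks
# dramatically.
random_data = os.urandom(100_000)
repetitive_data = b"abc" * 33_000

assert len(zlib.compress(random_data, 9)) >= len(random_data)
assert len(zlib.compress(repetitive_data, 9)) < 1_000
```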
But your question is about images. There, compression happens inside the file, and often not over the whole file but just over the image bits. In this case the program will see that there is no need to compress and will keep the pixel data uncompressed. Because the image headers are always present, this changes only a flag, so there is no increase in size.
This can also depend on the file format. You wrote about a "palette", but that is not much used nowadays: compression works by finding similar patterns in the file. Again, this depends on the image format. If you look at Wikipedia for a particular file format, you may see a table of header parameters (e.g. bit depth, number of colours (palette), colour definitions, and the compression methods used).
For palette-like images, the answer by Dan Mašek (https://stackoverflow.com/a/58683948/2758823) has a nice mathematical explanation, but one should not forget that compression is largely heuristic and should be tested on real examples: real images have patterns.

Converting opencv image to gdi bitmap doesn't work depends on image size

I have this code that converts an opencv image to a bitmap:
void processimage(cv::Mat imageData)
{
    Gdiplus::Bitmap bitmap(imageData.cols, imageData.rows, stride, PixelFormat24bppRGB, imageData.data);
    // do some work with bitmap
}
It works well when the size of the image is 2748 x 3664, but when I try to process an image of size 1374 x 1832, it doesn't work.
The error is invalid parameter(2).
I checked and can confirm that:
for 2748 x 3664:
cols = 2748
rows = 3664
stride = 8244
image is continuous.
for 1374 x 1832:
cols = 1374
rows = 1832
stride = 4122
image is continuous.
So everything seems correct to me, but it generates an error.
What is the problem and how can I fix it?
Edit
Based on answer which explained why I can not create bitmap. I finally implemented it in this way:
Mat newImage;
cvtColor(imageData, newImage, CV_BGR2BGRA);
Gdiplus::Bitmap bitmap(newImage.cols, newImage.rows, newImage.step1(), PixelFormat32bppRGB, newImage.data);
So effectively, I convert the input image to 4 bytes per pixel and then convert that to a bitmap.
All credits to Roger Rowland for his answer.
I think the problem is that a BMP format must have a stride that is a multiple of 4.
Your larger image has a stride of 8244, which is valid (8244/4 = 2061) but your smaller image has a stride of 4122, which is not (4122/4 = 1030.5).
As it says on MSDN for the stride parameter (with my emphasis):
Integer that specifies the byte offset between the beginning of one
scan line and the next. This is usually (but not necessarily) the
number of bytes in the pixel format (for example, 2 for 16 bits per
pixel) multiplied by the width of the bitmap. The value passed to this
parameter must be a multiple of four.
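The required rounding can be sketched as follows (gdi_stride is an illustrative helper, not a GDI+ function):

```python
# GDI+ requires the stride to be rounded up to the next multiple of 4.
# OpenCV's Mat rows are tightly packed with no such padding, which is
# why one width happens to work and the other fails.
def gdi_stride(width, bytes_per_pixel):
    return (width * bytes_per_pixel + 3) // 4 * 4

print(gdi_stride(2748, 3))  # 8244 -> already aligned, Bitmap accepts it
print(gdi_stride(1374, 3))  # 4124 -> OpenCV supplies 4122, hence the error
```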
Assuming your stride is correct, I think your only option is to copy it row by row. So, something like:
Create a Gdiplus::Bitmap of the required size and format.
Use LockBits to get the bitmap pixel data.
Copy the OpenCV image one row at a time.
Call UnlockBits to release the bitmap data.
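The steps above can be sketched with plain Python byte buffers as a stand-in for the LockBits pointer (illustrative, not the GDI+ API):

```python
# Move tightly-packed rows (OpenCV layout) into a destination buffer
# whose rows are padded to a 4-byte-aligned stride (GDI+ layout).
def copy_with_padding(src, width, height, bpp):
    src_stride = width * bpp
    dst_stride = (src_stride + 3) // 4 * 4    # round up to multiple of 4
    dst = bytearray(dst_stride * height)      # padding bytes stay zero
    for y in range(height):
        start = y * dst_stride
        dst[start : start + src_stride] = src[y * src_stride : (y + 1) * src_stride]
    return bytes(dst)

src = bytes(range(6)) * 2             # two rows of 2 BGR pixels (stride 6)
dst = copy_with_padding(src, 2, 2, 3)
print(len(dst))                       # 16: each 6-byte row padded to 8
```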
You can use my class CGdiPlus that implements all you need to convert from cv::Mat to Gdiplus::Bitmap and vice versa:
OpenCV / Tesseract: How to replace libpng, libtiff etc with GDI+ Bitmap (Load into cv::Mat via GDI+)

How can I store each pixel in an image as a 16 bit index into a colortable?

I have a 2D array of float values:
float values[1024][1024];
that I want to store as an image.
The values are in the range: [-range,+range].
I want to use a colortable that goes from red(-range) to white(0) to black(+range).
So far I have been storing each pixel as a 32 bit RGBA using the BMP file format. The total memory for storing my array is then 1024*1024*4 bytes = 4MB.
This seems very wasteful knowing that my colortable is "1 dimensional" whereas the 32 bit RGBA is "4 dimensional".
To see what I mean; lets assume that my colortable went from black(-range) to blue(+range).
In this case the only component that varies is clearly the B, all the others are fixed.
So I am only getting 8bits of precision whereas I am "paying" for 32 :-(.
I am therefore looking for a "palette" based file format.
Ideally I would like each pixel to be a 16 bit index (unsigned short int) into a "palette" consisting of 2^16 RGBA values.
The total memory used for storing my array in this case would be: 1024*1024*2 bytes + 2^16*4bytes = 2.25 MB.
So I would get twice as good precision for almost half the "price"!
Which image formats support this?
At the moment I am using Qt's QImage to write the array to file as an image. QImage has an internal 8 bit indexed ("palette") format. I would like a 16 bit one. Also I did not understand from Qt's documentation which file formats support the 8 bit indexed internal format.
Store it as a 16 bit greyscale PNG and do the colour table manually yourself.
You don't say why your image can be decomposed into 2^16 colours, but using your knowledge of this special image you could design the mapping so that indices that are near each other have similar colours and are therefore easier to compress.
"I want to use a colortable that goes from red(-range) to white(0) to black(+range)."
Okay, so you've got FF,00,00 (red) to FF,FF,FF (white) to 00,00,00 (black). In 24 bit RGB, that looks to me like 256 values from red to white and then another 256 from white to black. So you don't need a palette size of 2^16 (65536); you need 2^9 (512).
If you're willing to compromise and use a palette size of 2^8, then the GIF format could work. That's still relatively fine resolution: 128 shades of red on the negative side, plus 128 shades of grey on the positive. Each of a GIF's 256 palette entries can be an RGB value.
PNG is another candidate for palette-based color. You have more flexibility with PNG, including RGBA if you need an alpha channel.
You mentioned RGBA in your question but the use of the alpha channel was not explained.
So independent of file format, if you can use a 256 entry palette then you will have a very well compressed image. Back to your mapping requirement (i.e. mapping floats [-range -> 0.0 -> +range] to [red -> white -> black], here is a 256 entry palette that covers the range red-white-black you wanted:
float   entry#  hex   color  rgb
------  ------  ----  -----  --------
-range     0     00   red    FF,00,00
           1     01          FF,02,02
           2     02          FF,04,04
         ...    ...
         127     7F          FF,FD,FD
 0.0     128     80   white  FF,FF,FF
         129     81          FD,FD,FD
         ...    ...
         253     FD          04,04,04
         254     FE          02,02,02
+range   255     FF   black  00,00,00
If you double the size of the color table to be 9 bits (512 values) then you can make the increments between RGB entries more fine: increments of 1 instead of 2. Such a 9-bit palette would give you full single-channel resolution in RGB on both the negative and positive sides of the range. It's not clear that allocating 16 bits of palette would really be able to store any more visual information given the mapping you want to do. I hope I understand your question and maybe this is helpful.
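The red-white-black mapping described above can be sketched as follows (illustrative Python, not tied to any file format; the helper names are made up):

```python
# Build a 256-entry red -> white -> black palette, then map each float
# in [-rng, +rng] to a palette index.
def build_palette():
    ramp = [(0xFF * i) // 127 for i in range(128)]
    red_to_white = [(0xFF, v, v) for v in ramp]           # entries 0..127
    white_to_black = [(v, v, v) for v in reversed(ramp)]  # entries 128..255
    return red_to_white + white_to_black

def to_index(value, rng):
    t = max(-rng, min(rng, value))  # clamp to [-rng, +rng]
    return round((t + rng) / (2 * rng) * 255)

pal = build_palette()
print(pal[to_index(-10.0, 10.0)])  # (255, 0, 0): red at -range
print(pal[to_index(0.0, 10.0)])    # (255, 255, 255): white at zero
print(pal[to_index(10.0, 10.0)])   # (0, 0, 0): black at +range
```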
The PNG format supports palette-based images up to 8 bits, but it also supports grayscale images up to 16 bits. However, 16-bit modes are less used, and software support may be lacking, so you should test your tools first.
But you could also test with plain 24-bit RGB truecolor PNG images. They are compressed and should produce better results than BMP in any case.

OpenCV convertTo slow

I have one image cv::Mat fooImage 1000*1000 pixels in CV_32F format.
Now I want to show the image, I use
fooImage.convertTo(displayImage,CV_8UC1)
However, it takes about 5 ms just for this line. Is this normal? How can I quickly convert a CV_32F Mat image to CV_8UC1?
Thanks!
That sounds slow, but convertTo() probably isn't particularly optimised to use SSE2 or anything.
You are reading 4 MB from RAM, allocating 1 MB, doing 4 million floating point ops, and writing 1 MB back to RAM, so a millisecond isn't unreasonable.
You could write a simple loop that converts the image data to uchar yourself by simply multiplying each value by 255.0.
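A sketch of that manual conversion, assuming the floats lie in [0, 1] and using a Python list as a stand-in for the Mat data (in C++ this would be a loop over the float buffer):

```python
# Scale floats in [0, 1] to uchar range with saturation, like
# convertTo(dst, CV_8UC1, 255.0) would.
def to_uchar(values):
    return [min(255, max(0, int(v * 255.0))) for v in values]

print(to_uchar([0.0, 0.5, 1.0, 1.2, -0.1]))  # [0, 127, 255, 255, 0]
```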
Are you including the time to display the image? You are creating a "displayImage" that is still 8-bit greyscale; it will have to be converted into an RGB or RGBA image when it is displayed.