Encode image in JPG with OpenCV avoiding the ghost effect

I have an application (OpenCV, C++) that grabs an image from a webcam, encodes it as JPG and transmits it from a server to a client. The webcam is stereo, so I actually have two images, LEFT and RIGHT. In the client, when I receive the image, I decode it and generate an anaglyph 3D effect.
To do this I use OpenCV...
I encode the image this way:
std::vector<int> params;
std::vector<uchar> buffer;
params.push_back(CV_IMWRITE_JPEG_QUALITY);
params.push_back(60); // image quality
cv::imshow("anaglyph", image); // here the anaglyph image is good!
cv::imencode(".jpg", image, buffer, params);
and decode in this way:
cv::Mat imageRecieved = cv::imdecode( cv::Mat(v), CV_LOAD_IMAGE_COLOR );
What I see is that this kind of encoding produces a "ghost effect" (artifact?) in the anaglyph image, so the edges of objects look bad. If I look at a door, for example, there is a ghost along the edge of the door. I'm sure this depends on the encoding, because if I show the anaglyph image before the encode instruction it looks fine. I cannot use PNG because it generates images that are too large, which is a problem for the connection between the server and the client.
I looked at GIF but, if I understood correctly, it is not supported by the cv::imencode function.
So is there another way to encode a cv::Mat object as JPG without this bad effect and without increasing the size of the image too much?

If your server is only used as an image storage, you can send to the server the 2 original stereo images (compressed) and just generate the Anaglyph when you need it. I figure that if you fetch the image pair (JPEG) from the server and then generate the Anaglyph (client-side), it will have no ghosting. It might be that the compressed pair of images combined is smaller than the Anaglyph .png.
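
To make the client-side idea concrete, here is a minimal sketch of building the anaglyph from the two decoded views. It assumes a red-cyan anaglyph (red channel from the left view, green and blue from the right view), which the question does not confirm, and it uses a hypothetical Rgb struct instead of cv::Mat so that it stands alone.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical minimal pixel type for illustration only.
// OpenCV would hold the same data in a cv::Mat, in BGR channel order.
struct Rgb {
    std::uint8_t r, g, b;
};

// Builds a red-cyan anaglyph (assumed convention): the red channel comes
// from the left view, green and blue come from the right view.
std::vector<Rgb> makeAnaglyph(const std::vector<Rgb>& left,
                              const std::vector<Rgb>& right) {
    if (left.size() != right.size()) {
        throw std::invalid_argument("views must have the same number of pixels");
    }
    std::vector<Rgb> out(left.size());
    for (std::size_t i = 0; i < left.size(); ++i) {
        out[i] = Rgb{left[i].r, right[i].g, right[i].b};
    }
    return out;
}
```

Because each view is compressed as an ordinary photo and the channels are only mixed after decoding, the JPEG artifacts of one view should not leak into the other.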

I assume the anaglyph encoding is using line interlacing to combine both sides into one image.
You are using JPEG to compress the image.
This algorithm is optimized to compress "photo-like" real-world images from cameras, and works very well on these.
The difference of "photo-like" and other images, regarding image compression, is about the frequencies occurring in the image.
Roughly speaking, in "photo-like" images, the high frequency part is relatively small, and mostly not important for the image content.
So the high frequencies can be safely compressed.
If two frames are interlaced line by line, this creates an image with very strong high frequency part.
The JPEG algorithm discards much of that information as unimportant, but because it is actually important, that causes relatively strong artefacts.
JPEG basically just "does not work" on this kind of image.
If you can change the encoding of the anaglyph images to side by side, or to alternating full images from left and right, JPEG compression should work just fine.
Is this an option for you?
If not, it will get much more complicated. One problem - if you need good compression - is that the algorithms that are great for compressing images with very high frequencies are really bad at compressing "photo-like" data, which is still the larger part of your image.
Therefore, please try really hard to change the encoding to be not line-interlacing, that should be about an order of magnitude easier than other options.
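
A rough sketch of the difference between the two packings, under the same assumption as above that the frames are combined by line interlacing. The Gray struct and the helper names are hypothetical, both views are assumed to have identical dimensions, and the "mean vertical difference" is only a crude stand-in for the vertical high-frequency content that JPEG tends to discard.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical grayscale image, row-major, for illustration only.
struct Gray {
    std::size_t width, height;
    std::vector<std::uint8_t> px;
};

// Line interlacing: even rows come from the left view, odd rows from the right.
Gray interlaceRows(const Gray& left, const Gray& right) {
    Gray out{left.width, left.height, std::vector<std::uint8_t>(left.px.size())};
    for (std::size_t y = 0; y < left.height; ++y) {
        const Gray& src = (y % 2 == 0) ? left : right;
        for (std::size_t x = 0; x < left.width; ++x) {
            out.px[y * out.width + x] = src.px[y * src.width + x];
        }
    }
    return out;
}

// Side-by-side packing: left view in the left half, right view in the right half.
Gray sideBySide(const Gray& left, const Gray& right) {
    Gray out{left.width * 2, left.height,
             std::vector<std::uint8_t>(left.px.size() * 2)};
    for (std::size_t y = 0; y < left.height; ++y) {
        for (std::size_t x = 0; x < left.width; ++x) {
            out.px[y * out.width + x] = left.px[y * left.width + x];
            out.px[y * out.width + left.width + x] = right.px[y * right.width + x];
        }
    }
    return out;
}

// Mean absolute difference between vertically adjacent pixels.
double meanVerticalDiff(const Gray& img) {
    if (img.height < 2 || img.width == 0) return 0.0;
    double sum = 0.0;
    for (std::size_t y = 1; y < img.height; ++y) {
        for (std::size_t x = 0; x < img.width; ++x) {
            int a = img.px[y * img.width + x];
            int b = img.px[(y - 1) * img.width + x];
            sum += std::abs(a - b);
        }
    }
    return sum / static_cast<double>(img.width * (img.height - 1));
}
```

For two views that differ in brightness, the interlaced image jumps between them on every row, while the side-by-side image keeps each view's own smooth rows next to each other.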

Related

Most efficient way to store video data

In order to accomplish some specific editing on some .avi files, I'd like to create an application (in C++) that is able to load, edit, and save those .avi files. But what is the most efficient way? When first thinking about it, a simple 3D array containing a 2D array of pixels for every frame seems the simplest solution, but then its size would be ENORMOUS. I mean, let's assume that a pixel only needs a color. One color would mean 3 bytes (1 char r, 1 char g, 1 char b). If I now have a 1920x1080 video format, this would mean about 6 MEGABYTES (1920 x 1080 x 3 bytes) for only one frame! This data may or may not be smaller if using pointers for the colors, so that already used colors won't take more space; I don't really know, since I'm pretty new to C++ and the whole low-level stuff. (As a comparison: one of my AVI files recorded with the Xvid codec is 40 seconds long, 30 fps, and only 2 MB.)
So how would you actually store the video data (Not even the audio, just the video) efficiently (while still being easily able to perform per-frame-changes on it)?

As you have realised, uncompressed video is enormous and it is not practical to store an entire video in this way.
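
To put numbers on it, here is a small sketch of the arithmetic (the function names are made up for illustration). A 1920x1080 frame at 3 bytes per pixel is 6,220,800 bytes, and 40 seconds at 30 fps comes to 7,464,960,000 bytes, roughly 7.5 GB of raw frames.

```cpp
#include <cstdint>

// Size of one uncompressed frame in bytes.
constexpr std::uint64_t rawFrameBytes(std::uint64_t width,
                                      std::uint64_t height,
                                      std::uint64_t bytesPerPixel) {
    return width * height * bytesPerPixel;
}

// Size of an uncompressed clip in bytes (video only, no audio, no container overhead).
constexpr std::uint64_t rawVideoBytes(std::uint64_t width,
                                      std::uint64_t height,
                                      std::uint64_t bytesPerPixel,
                                      std::uint64_t framesPerSecond,
                                      std::uint64_t seconds) {
    return rawFrameBytes(width, height, bytesPerPixel) * framesPerSecond * seconds;
}
```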
Video compression is an extremely complex topic, but more-or-less, it works as follows: certain "key-frames" are compressed using fairly standard compression techniques similar or identical to still-photo compression such as JPEG. Frames following key-frames are compressed by comparing the frame with the previous one and looking for changes (such as moving blocks). Every now and again, a new key-frame is used.
You don't really have to worry much about that as you are not going to write your own video coder/decoder (codec). There are standard ones.
What will happen is that your program will decode the compressed video frame-by-frame and keep a certain number of frames in memory while you are working on them and then re-encode them when it is finished. In the uncompressed form, you will have access to the individual pixels and can work on them how you want.
You are probably not going to do that either by yourself - it is very hard. You probably need to use a framework, such as OpenCV. There are a huge number of standard filters and tools built in to these frameworks, and it may be that what you want to do is already implemented somewhere.
The OpenCV framework can return individual frames in a Mat object and you can then access the pixels. See this post Get Pixels from Mat
OpenCV
Tutorial page: Open CV Tutorial

How can I compress jpeg image with compression rate 4 bpp or less?

I am trying to compress my .jpeg image in Photoshop.
What is the best way to do this?
I am now calculating the bpp by taking the image size in kB and working out how many bits that is. Then I multiply the image width by its height in pixels to get the number of pixels in the image. After that I divide bits by pixels to find how many bits per pixel the image has.
But how can I change this number? My guess is to change how many kB the image is, but how do I do this?
Thanks for any help!!

Yes, you can achieve a higher compression ratio than 4 bits per pixel. Images with solid color can have a rate as low as 0.13 bpp.
In fact 4 bpp is quite poor compression: it's the same as an uncompressed 16-color image or half of a 256-color image, which even GIF can manage. JPEG can look decent at 1-2 bpp.
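
For reference, the bits-per-pixel figure is just the file size in bits divided by the pixel count. A minimal sketch (the function name is made up, and the file size is taken to include headers and metadata, so small files will look worse than their actual image data):

```cpp
#include <cstdint>

// Bits per pixel of a stored image: total file bits divided by pixel count.
// Returns 0 for a degenerate image with no pixels.
double bitsPerPixel(std::uint64_t fileSizeBytes,
                    std::uint32_t width,
                    std::uint32_t height) {
    if (width == 0 || height == 0) return 0.0;
    return (static_cast<double>(fileSizeBytes) * 8.0) /
           (static_cast<double>(width) * static_cast<double>(height));
}
```

For example, a 1000x600 image stored in 150,000 bytes works out to 2 bpp, and a 100x100 image stored in 5,000 bytes works out to 4 bpp.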

in general, you cannot "compress" a jpeg image. all you can do is to reduce the image quality further in order to achieve a lower bpp value. jpeg streams are always compressed and they use a lossy compression method. it means that the original image will never ever be reconstructed from a jpeg image. the smaller the file the more information you have lost.
a specific "bpp value" is not, and should never be your target. especially with lossy compression. you should always look at your current image and decide whether it is still good enough or not.
if you still have the original image, try a lossless compression format, like zip-compressed or lzw-compressed tiff or compressed png. i'm sure Photoshop can handle these formats as well. other software like IrfanView (https://www.irfanview.com/) or XnView MP (https://www.xnview.com/en/xnviewmp/) will convert your images too.
if you want manual (i.e. full) control over your images, you should use command line utilities, like ImageMagick (https://imagemagick.org/) or NConvert (please find the XnView MP link above).
if you have only the jpeg images do not touch (edit & save) them. with every single save operation you lose another bunch of information. you should always work on file copies.
you should always keep your master image (the very picture you took with your phone or your camera).
of course, these rules of thumb will not answer your original question.

Can't find logic behind png file sizes

I'm saving a large number of small png files for use in a game on a phone, so space is at a premium.
I'm trying to figure out the logic behind the file sizes so I can save things most efficiently, but even after using pngcrush the sizes are totally inconsistent.
I saved a 1x1 image and it takes 3kb. I have another 23x21 image which takes only 2kb. I have two images which are almost the same size, but one takes 6kb and the other takes 13kb. I doubled the image height and copied one image into the empty space of the other and saved that. The combined image is only 11kb!
Why is a 1x1 image larger than a 23x21 image? Why can I combine a 13kb image and a 6kb image and get an 11kb image?
Here are the images I'm talking about (there's a 1x1 pixel in between the 1st and second images. It's difficult to see, so I'll just give the URL: http://g42.org/temp/png/1x1.png):
example http://g42.org/temp/png/hat.png
example http://g42.org/temp/png/1x1.png
example http://g42.org/temp/png/helmet1.png
example http://g42.org/temp/png/helmet2.png
example http://g42.org/temp/png/helmet1_2.png

It's not a compression thing. The problem with the 1x1 image is that it has metadata (added by Photoshop, it seems): a color profile (iCCP chunk). If you look inside the binary, it's the data between the strings "iCCP" and "IDAT"; it can be removed, and you get a 69-byte file.
If you reopen and save the file with most image viewers (e.g. XnView), or use pngcrush, you can strip that chunk. See it here: http://i.stack.imgur.com/fmOdA.png
And regarding the helmet images: besides other informational chunks (ImageReady adds some informational text, as you can see), the difference is due to different formats: the two-helmets image is a paletted image (8 bits per pixel), while the single helmet is RGB with alpha (32 bits per pixel).
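
If you want to see which chunks are taking up the space, a PNG file is an 8-byte signature followed by chunks, each laid out as a 4-byte big-endian length, a 4-byte type, the data, and a 4-byte CRC. Here is a minimal sketch that lists the chunks of a file already loaded into memory (the names are made up for illustration, and it does not verify the CRCs):

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

struct PngChunk {
    std::string type;      // e.g. "IHDR", "iCCP", "IDAT", "IEND"
    std::uint32_t length;  // size of the chunk's data field in bytes
};

// Walks the chunk list of an in-memory PNG file.
// Returns an empty list if the signature is wrong; stops at IEND or at a truncated chunk.
std::vector<PngChunk> listPngChunks(const std::vector<std::uint8_t>& data) {
    static const std::uint8_t kSignature[8] = {0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
    std::vector<PngChunk> chunks;
    if (data.size() < 8 || std::memcmp(data.data(), kSignature, 8) != 0) {
        return chunks;
    }
    std::size_t pos = 8;
    while (pos + 12 <= data.size()) {
        std::uint32_t length = (static_cast<std::uint32_t>(data[pos]) << 24) |
                               (static_cast<std::uint32_t>(data[pos + 1]) << 16) |
                               (static_cast<std::uint32_t>(data[pos + 2]) << 8) |
                               static_cast<std::uint32_t>(data[pos + 3]);
        std::string type(data.begin() + pos + 4, data.begin() + pos + 8);
        if (length > data.size() - pos - 12) break;  // truncated chunk
        chunks.push_back(PngChunk{type, length});
        pos += 12 + static_cast<std::size_t>(length);  // length + type + data + CRC
        if (type == "IEND") break;
    }
    return chunks;
}
```

A large iCCP or tEXt entry in the output next to a tiny IDAT is the situation described for the 1x1 image.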

PNG compression is based on the same algorithm as zlib and is highly sensitive to the data that is being compressed so you won't see a consistent relationship between image size and file size. In the case of the combined image, it is still bigger than the smaller image and given the similarity of the two halves of the image, the compressor was probably able to reuse a lot of the Huffman tree. I don't know enough about the algorithm to say for certain how it ended up smaller than the other half.
As long as you are not seeing oddities like the 1x1 image, which you seem to have figured out in the comments, I don't think this will make a lot of sense without extensive study of image compression.

There is a great utility called pngcrush
http://pmt.sourceforge.net/pngcrush/
Compressing to PNG is a rather difficult task - there are lots of assumptions and strategies to try - do we create a palette, or are we better off without it?
PNGcrush essentially bruteforces 100+ different compression strategies, while at the same time trimming useless tags and sections.

PNG has several sub-formats: 24-bit with or without alpha, 8-bit (includes alpha), grayscale, etc. which use different amount of bytes per pixel and have different "compressibility".
Plus PNG supports several compression tricks (filters and gzip settings) which affect how well image data is compressed.
On top of that PNG can contain metadata, which sometimes can be pretty large, like some embedded color profiles.
ImageAlpha converts images to the most space-efficient PNG8+alpha variant.
ImageOptim removes junk metadata and finds best compression parameters.
With a combination of those two your images can be reduced by 30-50%.

Large JPEG/PNG Image Sequence Looping

I have been working on a project involving remotely sensed image processing and image sequence looping. Each resulting image (in JPEG or PNG format) has approximately 8000 * 4000 pixels. Our users usually want to loop over an image sequence (more than 50 images) one region of interest at a time. Thus, I have to extract the required viewing area from each image according to the size of the user's visualization client. For example, if the user's current client view is 640 * 480, I'll have to find a 640 * 480 data block in each original image based on the current x (column) and y (row) coordinates, and remap it to the client view. When the user pans to another viewing area by dragging the mouse, our program must re-load the corresponding regional data from each original image as quickly as possible.
I know that neither the JPEG library nor the PNG library has built-in block read routines such as
long ReadRectangle (long x0, long y0, long x1, long y1, char* RectData);
long ReadInaRectangle (long x0, long y0, short width, short height, char* RectData);
The built-in JPEG decompressor lacks this kind of functionality. I know that the JPEG2000 format has provisions for decompressing a specific area of the image. I'm not entirely sure about JPEG.
Someone suggested that I use CreateFileMapping, MapViewOfFile, and CreateDIBSection to map a range of bytes of the file into a view. Unlike simple flat binary image formats such as *.raw, *.img, and *.bmp, a JPEG blob contains not only the image data but also the complicated JPG header. So it's not easy to map a block of data out of the JPEG file.
Someone else recommended that I use image tiling or image pyramid technology to generate sub-images, just as many popular image visualization applications (Google Earth, etc.) and GIS applications (WebGIS, etc.) do.
How can I solve this problem?
Thanks for your help.
Golden Lee

If you're OK with region co-ordinates being multiples of 8, the JPEG library from IJG may be able to help you load partial JPEG images.
You'd want to:
1. Get all DCT coefficients for the entire image. Here's an example of how to do this. Yes, this will involve entropy decoding of the entire image, but this is the less expensive step of JPEG decoding (IDCT is the most expensive one, and we're avoiding it).
2. Throw away the blocks that you don't need (each block consists of 8x8 coefficients). You'll have to do this by hand, but since the layout is quite simple (the blocks are in scanline order) it shouldn't be that hard.
3. Apply the block inverse DCT to each of the remaining blocks. You can probably get IJG to do that for you. If you can't, then you'll have to do your own IDCT and color transform back to [0, 255], because intensities are level-shifted to [-128, 127] in the world of JPEG.
4. If all goes well, you'll get your decoded JPEG image. Because of chroma subsampling, the luma and chroma channels may be of different dimensions, and you will have to compensate for this yourself by scaling.
The first two steps are pretty much covered by the links. The fourth one is quite trivial (you can get the type of chroma subsampling using the IJG interface, and scaling -- essentially upsampling -- is easily achieved by using something like OpenCV or rolling your own code). The third one is something I haven't tried yet, but it sounds like it would be possible.
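
For the "throw away the blocks you don't need" step, the bookkeeping is mostly integer division. Here is a rough sketch of mapping a pixel rectangle to the range of 8x8 blocks that covers it. The struct and function names are made up for illustration, it assumes non-negative coordinates and a non-empty rectangle, and it only covers a full-resolution (luma) component; with chroma subsampling the chroma components have their own, coarser block grid.

```cpp
// Range of blocks, in block units, that covers a pixel rectangle.
struct BlockRange {
    int firstCol;  // index of the leftmost block column
    int firstRow;  // index of the topmost block row
    int colCount;  // number of block columns to keep
    int rowCount;  // number of block rows to keep
};

// Assumes x, y >= 0 and width, height > 0.
BlockRange blocksCoveringRegion(int x, int y, int width, int height,
                                int blockSize = 8) {
    int firstCol = x / blockSize;
    int firstRow = y / blockSize;
    int lastCol = (x + width - 1) / blockSize;   // block containing the last pixel column
    int lastRow = (y + height - 1) / blockSize;  // block containing the last pixel row
    return BlockRange{firstCol, firstRow,
                      lastCol - firstCol + 1, lastRow - firstRow + 1};
}
```

A 640x480 view whose corner sits on a multiple of 8 needs exactly 80x60 blocks. An unaligned view needs up to one extra block in each direction, and you crop the decoded result afterwards.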

It's easy with the gd library. LibGD is an open source code library for the dynamic creation of images on the fly by programmers.

Does anyone know of a program/method to compress just certain parts of a PNG image w/o slicing it?

Please help! Thanks in advance.
Update: Sorry for the delayed response. It may be helpful to provide more context here, since I'm not sure what alternative question I should be asking.
I have an image for a website home page that is 300px x 300px. That image has several distinct regions, including two that have graphical copy on top of the regions.
I have compressed the image down as much as I can without compromising the appearance of that text, and those critical regions of the image.
I tried slicing the less critical regions of the image and saving those at lower compressions in order to get the total kbs down, but as gregmac posted, the sections don't look right when rejoined.
I was wondering if there was a piece of software out there, or manual solution for identifying critical regions of an image to "compress less" and could compress other parts of the image more in order to get the file size down, while keeping those elements in the graphic that need to be high resolution sharper.

You cannot - you can only compress an entire PNG file.
You don't need to (I cannot think of a single case where compressing a specific portion of a PNG file would be useful).
Dividing the image into multiple parts ("slicing") is the only way to compress different portions of an image file differently, although I'd even recommend against using different compression levels in one "sliced image", as differing compression artefacts joining up will probably look odd.
Regarding your update,
"identifying critical regions of an image to "compress less" and could compress other parts of the image more in order to get the file size down"
This is inherently what image compression does - if there's a big empty area it will be compressed to a few bytes (using RLE for example), but if there's a very detailed region it will have more bytes "spent" on it.
It sounds like the problem is that the image is too big (in terms of file size). Have you tried other image formats, mainly GIF or JPEG (or the other PNG format, PNG-8 or PNG-24)?
"I have compressed the image down as much as I can without compromising the appearance of that text"
Perhaps the text could be overlaid using CSS, rather than embedded in the image? Might not be practical, but it would allow you to compress the background more (if the background image is a photo, JPEG might work best, since you no longer have to worry about the text).
Other than that, I'm out of ideas. Is the 300*300px PNG really too big?
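
To illustrate the point about empty areas costing almost nothing, here is a tiny run-length encoder. This is only a sketch of the general idea: PNG itself uses deflate (LZ77 plus Huffman coding) rather than plain RLE, and the function name is made up.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Run-length encodes a byte sequence as (value, run length) pairs.
std::vector<std::pair<std::uint8_t, std::uint32_t>>
runLengthEncode(const std::vector<std::uint8_t>& data) {
    std::vector<std::pair<std::uint8_t, std::uint32_t>> runs;
    for (std::uint8_t value : data) {
        if (!runs.empty() && runs.back().first == value) {
            ++runs.back().second;         // extend the current run
        } else {
            runs.emplace_back(value, 1u);  // start a new run
        }
    }
    return runs;
}
```

A row of 1000 identical background pixels collapses to a single pair, while a detailed row where every pixel differs produces one pair per pixel, so the bytes end up being spent where the detail is.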

It sounds like you are compressing parts of your image using something like JPEG and then pasting those compressed images onto a PNG combined with other images, and the entire PNG is sent to the browser where you split them up.
The problem with this is that the more you compress your JPEG parts the more decompression artifacts you will get. Then when you put these low quality images onto the PNG, which uses deflate compression, you will actually end up increasing the file size because it won't be able to compress well.
So if you are keen on keeping PNG as your file format the best solution would be to not compress the parts using JPEG which you paste onto your PNG - keep everything as sharp as possible.
PNG compresses each row separately unless you have used a "predictor" in the compression.
So it's best to keep your PNG as wide as possible with similar images next to each other horizontally rather than under each other vertically.
Perhaps upload an example of the images you're working with?
