Is lossy compression followed by a lossless compression step?

A lossy algorithm can take pixels with similar colours, average them, and make several pixels share the same colour, but isn't a lossy algorithm supposed to be a compression algorithm? I don't see how that compresses a photo: the photo still has the same number of pixels. Unless, after the lossy algorithm has done its work, a lossless algorithm compresses the file?
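That is essentially what happens in practice: the lossy step alone does not shrink the data, but by making nearly identical values exactly identical it sets up a lossless stage that does. A minimal sketch of the idea in C++ (not tied to any particular codec; the step size and the run-length scheme are just illustration):

#include <cstdint>
#include <utility>
#include <vector>

// Lossy step: round each 8-bit value to the nearest multiple of `step`,
// so visually similar pixels become byte-identical. Same number of pixels.
std::vector<uint8_t> quantize(const std::vector<uint8_t>& pixels, int step) {
    std::vector<uint8_t> out(pixels.size());
    for (size_t i = 0; i < pixels.size(); ++i)
        out[i] = static_cast<uint8_t>((pixels[i] / step) * step);
    return out;
}

// Lossless step: run-length encode as (count, value) pairs.
std::vector<std::pair<int, uint8_t>> rle(const std::vector<uint8_t>& data) {
    std::vector<std::pair<int, uint8_t>> runs;
    for (size_t i = 0; i < data.size(); ) {
        size_t j = i;
        while (j < data.size() && data[j] == data[i]) ++j;
        runs.push_back({static_cast<int>(j - i), data[i]});
        i = j;
    }
    return runs;
}

The quantized image still has the same number of pixels; the size reduction only appears once the lossless stage exploits the repetition the lossy stage created.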

Related

Handle "Out-of-Gamut" Color in RGB to CIEL*a*b* to RGB Conversions

I've got functions (c++) that convert a game image (SDL2 SDL_Surface) from RGB through CIEXYZ to CIEL*a*b*, so that adjustments to hue, brightness, and saturation will be more visually natural than in HSV space. That works, with the exception of those pixels that are adjusted out of the RGB gamut in the process.
While it is easy enough to force a value back into gamut by:
individually clamping subpixel values below 0 to 0 and above 255 to 255, or
compressing and shifting the whole pixel or whole image into the 0-255 range, i.e. mapping each value x to (x - min) / (max - min) and scaling back up to 0-255;
these options lead to gross artifacts when doing multiple operations on the same image. I am looking for the least destructive method of handling out-of-gamut subpixels in code. Digging through many pages of Google results leads to hundreds of Photoshop links, a few design-oriented links, and references to CMSs like LittleCMS.
I need an algorithmic solution to put into c++ code.
Note: Just doing some basic experimentation, using linear compression on the entire image leads to massive loss of brightness over hundreds of iterations with calculations happening as floats. Further insight into the sigmoid compression comment below is most welcome.
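For reference, a minimal sketch of the two naive options described above, assuming subpixel values are floats in a nominal 0-255 range (the function names are made up for this example):

#include <algorithm>

// Option 1: clamp each subpixel independently. Hue can shift, because the
// three channels are no longer scaled together.
inline float clampChannel(float v) {
    return std::min(255.0f, std::max(0.0f, v));
}

// Option 2: rescale the whole pixel so its min..max range fits 0..255.
// Preserves the ratios between channels, but repeated application drifts
// brightness and contrast, as noted in the question.
inline void rescalePixel(float& r, float& g, float& b) {
    float lo = std::min({r, g, b, 0.0f});
    float hi = std::max({r, g, b, 255.0f});
    float scale = 255.0f / (hi - lo);
    r = (r - lo) * scale;
    g = (g - lo) * scale;
    b = (b - lo) * scale;
}

Per-channel clamping distorts hue, while whole-pixel rescaling preserves channel ratios but, as noted above, drifts over repeated applications.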
The fundamental issue you face is multiple conversions between color spaces. If the conversion isn't lossless, then you will get cumulative artifacts.
The better solution is to maintain all of your imagery in one color space and do all of your manipulation within that color space. Treat conversion as a one-way street, converting a copy to RGB for display. Do not convert back and forth.
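A rough illustration of that one-way pipeline, assuming a float Lab buffer as the master copy (the type and function names are invented for this sketch, and the Lab-to-RGB conversion is assumed to be the one the question already has):

#include <algorithm>
#include <cstdint>
#include <vector>

struct LabPixel { float L, a, b; };

// Hypothetical master image: all edits (hue, brightness, saturation)
// operate on this Lab buffer and never round-trip through RGB.
struct LabImage {
    int width = 0, height = 0;
    std::vector<LabPixel> pixels;
};

// Assumed to exist elsewhere: the Lab -> XYZ -> sRGB conversion the
// question already has, returning possibly out-of-range floats.
void labToRgbFloat(const LabPixel& in, float& r, float& g, float& b);

// Display-only conversion: clamp just the throwaway RGB copy.
// The Lab master is never modified here, so errors cannot accumulate.
std::vector<uint8_t> toDisplayRgb(const LabImage& img) {
    std::vector<uint8_t> out(img.pixels.size() * 3);
    for (size_t i = 0; i < img.pixels.size(); ++i) {
        float r, g, b;
        labToRgbFloat(img.pixels[i], r, g, b);
        out[i * 3 + 0] = static_cast<uint8_t>(std::min(255.0f, std::max(0.0f, r)));
        out[i * 3 + 1] = static_cast<uint8_t>(std::min(255.0f, std::max(0.0f, g)));
        out[i * 3 + 2] = static_cast<uint8_t>(std::min(255.0f, std::max(0.0f, b)));
    }
    return out;
}

Because the clamp is applied only to the throwaway display copy, rounding and gamut errors never feed back into the master image.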

Comparing 2 images pixel by pixel (the first image is stored in a database)

I want to compare 2 images, where the first image is stored in a database and the second image is from a live video stream via a webcam. Is it possible to determine whether there are some differences between the images, or whether they are identical?
I want the image comparison to be pixel by pixel. If a pixel by pixel comparison is hard, or even impossible, could you suggest a better way of doing this?
A simple pixel by pixel comparison is unlikely to work well because of noise in the webcam image.
You need a similarity measure like Peak Signal-to-Noise Ratio (PSNR) or Structural Similarity (SSIM).
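As a rough sketch of the PSNR idea in plain C++ (assuming both images have already been converted to same-sized 8-bit grayscale buffers; the 30-40 dB figure in the comment is only a ballpark):

#include <cmath>
#include <cstdint>
#include <limits>
#include <vector>

// Mean squared error between two same-sized 8-bit images.
double mse(const std::vector<uint8_t>& a, const std::vector<uint8_t>& b) {
    double sum = 0.0;
    for (size_t i = 0; i < a.size(); ++i) {
        double d = static_cast<double>(a[i]) - static_cast<double>(b[i]);
        sum += d * d;
    }
    return sum / a.size();
}

// PSNR in dB; higher means more similar. Identical images give infinity,
// so in practice you compare against a threshold (e.g. > 30-40 dB).
double psnr(const std::vector<uint8_t>& a, const std::vector<uint8_t>& b) {
    double m = mse(a, b);
    if (m == 0.0) return std::numeric_limits<double>::infinity();
    return 10.0 * std::log10((255.0 * 255.0) / m);
}

Comparing the PSNR against a threshold is far more robust to webcam noise than requiring exact pixel equality.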
Alternatively, compute a hash of your image and compare it with the precalculated image hash in the database.
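Note that an exact (cryptographic) hash only matches bit-identical images, which a noisy webcam frame will never be; a perceptual hash is the more useful reading of this suggestion. A minimal sketch of an average hash using OpenCV (the 8x8 size and the distance threshold are arbitrary illustration values):

#include <bitset>
#include <opencv2/opencv.hpp>

// Average hash: shrink to 8x8 grayscale, then set one bit per pixel
// depending on whether it is above the mean brightness.
std::bitset<64> averageHash(const cv::Mat& image) {
    cv::Mat gray, small;
    cv::cvtColor(image, gray, cv::COLOR_BGR2GRAY);
    cv::resize(gray, small, cv::Size(8, 8), 0, 0, cv::INTER_AREA);
    double mean = cv::mean(small)[0];
    std::bitset<64> hash;
    for (int r = 0; r < 8; ++r)
        for (int c = 0; c < 8; ++c)
            hash[r * 8 + c] = small.at<uchar>(r, c) > mean;
    return hash;
}

// Hamming distance between hashes: small values (e.g. <= 5) suggest the
// live frame matches the stored image despite webcam noise.
size_t hashDistance(const std::bitset<64>& a, const std::bitset<64>& b) {
    return (a ^ b).count();
}

Store the 64-bit hash in the database and compare incoming frames by Hamming distance.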

DCT based Video Encoding Process

I am having some issues that I am hoping you will be able to clarify. I have taught myself a video encoding process similar to MPEG-2. The process is as follows:
Split an RGBA image into 4 separate channel data memory blocks, i.e. an array of all R values, a separate array of all G values, etc.
Take an 8x8 block of pixel data from a channel's array and transform it using the Discrete Cosine Transform (DCT); a sketch of steps 2-4 appears after this list.
Quantize this 8x8 block using a pre-calculated quantization matrix.
Zigzag encode the output of the quantization step, so I should get a trail of consecutive numbers.
Run-length encode (RLE) the output from the zigzag algorithm.
Huffman code the data after the RLE stage, using substitution of values from a pre-computed Huffman table.
Go back to step 2 and repeat until every block of every channel has been encoded.
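For concreteness, here is a rough sketch of steps 2-4 for a single 8x8 block: a naive O(N^4) DCT-II, quantization by a flat illustrative matrix, and the zigzag scan. A real encoder would use a standard quantization table and a fast DCT, so treat this purely as a reference for how the pieces fit together.

#include <array>
#include <cmath>

constexpr int N = 8;
using Block = std::array<std::array<float, N>, N>;

// Step 2: naive 2D DCT-II of an 8x8 block (input values centred around 0).
Block dct8x8(const Block& in) {
    const float pi = 3.14159265358979f;
    Block out{};
    for (int u = 0; u < N; ++u)
        for (int v = 0; v < N; ++v) {
            float sum = 0.0f;
            for (int x = 0; x < N; ++x)
                for (int y = 0; y < N; ++y)
                    sum += in[x][y]
                         * std::cos((2 * x + 1) * u * pi / (2 * N))
                         * std::cos((2 * y + 1) * v * pi / (2 * N));
            float cu = (u == 0) ? std::sqrt(1.0f / N) : std::sqrt(2.0f / N);
            float cv = (v == 0) ? std::sqrt(1.0f / N) : std::sqrt(2.0f / N);
            out[u][v] = cu * cv * sum;
        }
    return out;
}

// Step 3: quantize with a pre-calculated matrix (flat value 16 here,
// purely for illustration).
std::array<std::array<int, N>, N> quantize(const Block& coeffs) {
    std::array<std::array<int, N>, N> q{};
    for (int u = 0; u < N; ++u)
        for (int v = 0; v < N; ++v)
            q[u][v] = static_cast<int>(std::round(coeffs[u][v] / 16.0f));
    return q;
}

// Step 4: zigzag scan, walking the anti-diagonals so the low frequencies
// come first and the trailing zeros bunch together.
std::array<int, N * N> zigzag(const std::array<std::array<int, N>, N>& q) {
    std::array<int, N * N> out{};
    int idx = 0;
    for (int s = 0; s < 2 * N - 1; ++s)
        for (int i = 0; i <= s; ++i) {
            int r = (s % 2 == 0) ? s - i : i;   // alternate direction per diagonal
            int c = s - r;
            if (r < N && c < N) out[idx++] = q[r][c];
        }
    return out;
}

The zigzagged array is what then feeds the RLE and Huffman stages.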
My first question: do I need to convert the RGBA values to YUV+A (YCbCr+A) for the process to work, or can it continue using RGBA? I ask because the RGBA->YUVA conversion is a heavy workload that I would like to avoid if possible.
Next question: should the RLE store runs for just 0s, or can that be extended to all the values in the array? See the examples below:
440000000111 == [2,4][7,0][3,1] // RLE for all values
or
440000000111 == 44[7,0]111 // RLE for 0's only
The final question is what a single symbol would be in regard to the Huffman stage. Would the symbol to be replaced be a single value like 2 or 4, or would it be a run-level pair such as [2,4]?
Thanks for taking the time to read and help me out here. I have read many papers and watched many YouTube videos, which have aided my understanding of the individual algorithms, but not of how they all link together to form the encoding process in code.
(this seems more like JPEG than MPEG-2 - video formats are more about compressing differences between frames, rather than just image compression)
If you work in RGB rather than YUV, you're probably not going to get the same compression ratio and/or quality, but you can do that if you want. Colour-space conversion is hardly a heavy workload compared to the rest of the algorithm.
Typically in this sort of application you RLE the zeros, because that's the element you get a lot of repetitions of (and hopefully also a good run of them at the end of each block, which can be replaced with a single end-of-block marker), whereas the other coefficients are not so repetitive. If you expect repetitions of other values, YMMV.
And yes, you can encode the RLE pairs as single symbols in the huffman encoding.
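A minimal sketch of the zeros-only variant, emitting (zero-run, value) pairs JPEG-style, with a (0, 0) end-of-block marker chosen purely as a convention for this example:

#include <utility>
#include <vector>

// Encode a zigzagged coefficient array as (number of preceding zeros, value)
// pairs, with (0, 0) used as an end-of-block marker once only zeros remain.
std::vector<std::pair<int, int>> rleZeros(const std::vector<int>& coeffs) {
    std::vector<std::pair<int, int>> out;
    int zeros = 0;
    for (int c : coeffs) {
        if (c == 0) {
            ++zeros;
        } else {
            out.push_back({zeros, c});
            zeros = 0;
        }
    }
    out.push_back({0, 0});  // end-of-block: everything after this is zero
    return out;
}

Each (run, value) pair can then be treated as one symbol for the Huffman table, as noted above.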
1) Yes you'll want to convert to YUV... to achieve higher compression ratios, you need to take advantage of the human eye's ability to "overlook" significant loss in color. Typically, you'll keep your Y plane the same resolution (presumably the A plane as well), but downsample the U and V planes by 2x2. E.g. if you're doing 640x480, the Y is 640x480 and the U and V planes are 320x240. Also, you might choose different quantization for the U/V planes. The cost for this conversion is small compared to DCT or DFT.
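A rough sketch of that conversion and the 2x2 chroma downsampling, using full-range BT.601-style coefficients (real encoders may use different coefficients, ranges, and filters; the helper names are invented for the example):

#include <cstdint>
#include <vector>

// Full-range BT.601-style RGB -> YCbCr for one pixel.
inline void rgbToYCbCr(uint8_t r, uint8_t g, uint8_t b,
                       uint8_t& y, uint8_t& cb, uint8_t& cr) {
    y  = static_cast<uint8_t>( 0.299 * r + 0.587 * g + 0.114 * b);
    cb = static_cast<uint8_t>(128 - 0.168736 * r - 0.331264 * g + 0.5 * b);
    cr = static_cast<uint8_t>(128 + 0.5 * r - 0.418688 * g - 0.081312 * b);
}

// 2x2 box downsample of a full-resolution chroma plane (width/height even).
std::vector<uint8_t> downsample2x2(const std::vector<uint8_t>& plane,
                                   int width, int height) {
    std::vector<uint8_t> out((width / 2) * (height / 2));
    for (int y = 0; y < height; y += 2)
        for (int x = 0; x < width; x += 2) {
            int sum = plane[y * width + x] + plane[y * width + x + 1]
                    + plane[(y + 1) * width + x] + plane[(y + 1) * width + x + 1];
            out[(y / 2) * (width / 2) + (x / 2)] = static_cast<uint8_t>(sum / 4);
        }
    return out;
}

After this, the Y (and A) planes are encoded at full resolution while the downsampled Cb and Cr planes carry a quarter of the samples each, which is where much of the extra compression comes from.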
2) You don't have to RLE it, you could just Huffman Code it directly.

OpenCV process parts of an image

I'm trying to use a PNG with an alpha channel to 'mask' the current frame from a video stream.
My PNG has black pixels in the areas that I don't want processed and transparent (alpha) pixels in the others - currently it's saved as a 4-colour image with 4 channels, but it might as well be a binary image.
I'm doing background subtraction and contour finding on the image, so I imagine that if I copy the black pixels from my 'mask' image into the current frame, then no contours would be found in the black areas. Is this a good approach? If so, how can I copy the black/non-transparent pixels from one cv::Mat on top of the other?
What you're describing sounds to me like the usage of an image mask. It's odd that you'd do it in the alpha channel, when so many methods available in the OpenCV libraries support masking. Rather than use the alpha channel, why not create a separate binary image with non-zero values everywhere you'd like to find contours?
Depending on which algorithms you use, you are correct in your assumption that you would not find contours in the black-pixeled areas. Unfortunately, I don't know of any efficient way of copying pixels from one image to another selectively without getting into the nitty-gritty of the Mat structure and iterating byte by byte/pixel by pixel. Using the mask idea presented above with your pre-processing functions, and then sending the resulting binary image into findContours or the like, lets you take advantage of the already well-written and optimized code of the OpenCV library, and keep more of your hair on your head, where it belongs ;).
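For example, a minimal sketch of the mask-based approach suggested above, using OpenCV (the threshold values and the placement of the pre-processing step are just illustrative):

#include <opencv2/opencv.hpp>
#include <vector>

// Build a binary mask from the 4-channel PNG: non-zero wherever the mask
// image is not black, i.e. wherever processing should happen.
cv::Mat makeMask(const cv::Mat& maskPng) {
    cv::Mat gray, mask;
    cv::cvtColor(maskPng, gray, cv::COLOR_BGRA2GRAY);
    cv::threshold(gray, mask, 0, 255, cv::THRESH_BINARY);
    return mask;
}

void processFrame(const cv::Mat& frame, const cv::Mat& mask) {
    // Copy only the pixels under the mask; everything else stays black,
    // so later stages find no contours there.
    cv::Mat masked = cv::Mat::zeros(frame.size(), frame.type());
    frame.copyTo(masked, mask);

    // ... background subtraction would go here; as a stand-in, produce a
    // single-channel binary image from the masked frame ...
    cv::Mat binary;
    cv::cvtColor(masked, binary, cv::COLOR_BGR2GRAY);
    cv::threshold(binary, binary, 127, 255, cv::THRESH_BINARY);

    std::vector<std::vector<cv::Point>> contours;
    cv::findContours(binary, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
}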

How does H.264 or video encoders in general compute the residual image of two frames?

I have been trying to understand how video encoding works for modern encoders, in particular H.264.
It is very often mentioned in documentation that residual frames are created from the differences between the current P-frame and the last I-frame (assuming the following frames are not used in the prediction). I understand that a YUV color space is used (maybe YV12), and that one image is "subtracted" from the other, and then the residual is formed.
What I don't understand is how exactly this subtraction works. I don't think it is the absolute value of the difference, because that would be ambiguous. What is the per-pixel formula to obtain this difference?
Subtraction is just one small step in video encoding; the core principle behind most modern video encoding is motion estimation, followed by motion compensation. Basically, the process of motion estimation generates vectors that show offsets between macroblocks in successive frames. However, there's always a bit of error in these vectors.
So what happens is the encoder will output both the vector offsets and the "residual", which is what's left over. The residual is not simply the difference between two frames; it's the difference between the two frames after motion compensation is taken into account. See the "Motion compensated difference" image in the Wikipedia article on motion compensation for a clear illustration of this; note that the motion-compensated difference is drastically smaller than the "dumb" residual.
Here's a decent PDF that goes over some of the basics.
A few other notes:
Yes, YUV is always used, and typically most encoders work in YV12 or some other chroma-subsampled format.
Subtraction will have to happen on the Y, U and V planes separately (think of them as three separate channels, all of which need to be encoded; then it becomes pretty clear how subtraction has to happen). Motion estimation may or may not happen on all of the Y, U and V planes; sometimes encoders only do it on the Y (luminance) values to save a bit of CPU at the expense of quality.
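To make the per-pixel arithmetic concrete, here is a minimal sketch for one plane, assuming the motion-compensated prediction has already been built (real encoders do this block by block and then transform and quantize the result):

#include <cstdint>
#include <vector>

// Residual = current plane minus motion-compensated prediction, kept as
// signed values. The decoder reverses this: reconstructed = prediction + residual.
std::vector<int16_t> residual(const std::vector<uint8_t>& current,
                              const std::vector<uint8_t>& predicted) {
    std::vector<int16_t> out(current.size());
    for (size_t i = 0; i < current.size(); ++i)
        out[i] = static_cast<int16_t>(current[i]) - static_cast<int16_t>(predicted[i]);
    return out;
}

So the formula is just a signed difference, current minus prediction (in the range -255 to 255), not an absolute value; keeping the sign is what lets the decoder add the residual back onto its own prediction to reconstruct the frame.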