I'm developing image manipulation software in which the user can adjust the brightness, contrast and local contrast ("clarity") of an image.
Adjustments are made using OpenCV's convertTo for brightness and contrast, and CLAHE for local contrast.
I want to know the order in which I should apply these adjustments to the image. Is there a rule of thumb for this? I get vastly different results when I change the order, and I can't find anything in the documentation.
The answer to this question follows from the definitions of those terms.
Brightness: Brightness is directly related to the intensity value of each pixel. A pixel (and so an image) becomes brighter as its value approaches 255 (white) and darker as it approaches 0 (black). This increase or decrease is done by adding a constant to, or subtracting a constant from, every pixel.
Contrast: For brightness we added a constant to each pixel; here we multiply each pixel by a constant. This widens (or narrows) the gaps between pixel values across the whole image.
CLAHE: I see CLAHE as an intelligent form of histogram equalization. In some images the pixel values are concentrated in a narrow interval inside 0-255. CLAHE is the tool for spreading that interval over the whole 0-255 range.
Coming back to your question:
If you set brightness and contrast first, CLAHE can work against those adjustments and partly undo them.
If you use CLAHE first, the order makes sense.
Note: In my experience, CLAHE is mostly used as a pre-processing step. Small brightness and contrast changes can be okay in either order, but for large changes I prefer to apply them after CLAHE.
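One way to see the "backward operation" effect is a small NumPy sketch. It is an illustration under assumptions, not OpenCV's implementation: adjust mirrors the alpha * pixel + beta arithmetic of convertTo with saturation to 0-255, and equalize is plain global histogram equalization used as a simplified stand-in for CLAHE (real CLAHE works per tile with a clip limit, so it will usually only partly undo an adjustment). Both function names are hypothetical.

```python
import numpy as np

def adjust(img, alpha=1.0, beta=0.0):
    """Contrast (alpha) and brightness (beta), saturated to 0..255.
    Mirrors the alpha * pixel + beta arithmetic described above."""
    out = np.rint(img.astype(np.float64) * alpha + beta)
    return np.clip(out, 0, 255).astype(np.uint8)

def equalize(img):
    """Global histogram equalization (simplified stand-in for CLAHE)."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = np.cumsum(hist).astype(np.float64)
    cdf_min = cdf[cdf > 0][0]
    denom = cdf[-1] - cdf_min
    if denom == 0:  # flat image: nothing to spread out
        return img.copy()
    lut = np.clip(np.rint((cdf - cdf_min) / denom * 255), 0, 255).astype(np.uint8)
    return lut[img]

# Synthetic low-contrast image: values 100..163 only.
img = np.tile(np.arange(100, 164, dtype=np.uint8), (8, 1))

adjust_then_equalize = equalize(adjust(img, alpha=1.2, beta=10))
equalize_then_adjust = adjust(equalize(img), alpha=1.2, beta=10)

# True: equalization erased the earlier adjustment.
print(np.array_equal(adjust_then_equalize, equalize(img)))
# False: the adjustment survives when applied after equalization.
print(np.array_equal(equalize_then_adjust, equalize(img)))
```

In this toy case the brightness/contrast change made before equalization is undone completely, because equalization depends only on the rank order of the pixel values, while the same change applied afterwards survives.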
I'm trying to understand what the meaning of a temporal derivative is in an image. While I understand the brightness constancy equation, I don't understand why taking the difference between two images gives me the temporal derivative.
Taking the difference between two frames gives me the difference in pixel intensity per pixel between the two, but how is that the same as asking how much the image changed over a certain span of time?
The temporal derivative dI/dt of the image I(x,y,t) is the rate of change of the image over time at a particular position. As you noted, this is the difference in pixel intensity between the two frames. Considering a single pixel at (x,y), the finite difference approximation to the derivative is
f_d = ( I(x,y,t+delta) - I(x,y,t) ) / delta so that f_d -> dI/dt as delta -> 0.
In this case delta is simply set to one. So we are approximating the image derivative (with respect to time) by the difference between adjacent frames.
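As a minimal sketch of that approximation (assuming NumPy and 8-bit grayscale frames; the function name and the fps parameter are hypothetical), the frames are converted to a signed floating-point type before subtracting so that negative changes do not wrap around:

```python
import numpy as np

def temporal_derivative(prev_frame, next_frame, fps=1.0):
    """Finite-difference estimate of dI/dt between two adjacent frames.
    With fps=1.0 the unit is 'intensity per frame' (delta = 1)."""
    prev = prev_frame.astype(np.float32)
    nxt = next_frame.astype(np.float32)
    return (nxt - prev) * fps  # dividing by delta = 1/fps

# A single bright pixel moving one step to the right.
frame0 = np.zeros((5, 5), dtype=np.uint8)
frame1 = np.zeros((5, 5), dtype=np.uint8)
frame0[2, 1] = 200
frame1[2, 2] = 200

dI_dt = temporal_derivative(frame0, frame1)
print(dI_dt[2, 1], dI_dt[2, 2])  # -200.0 200.0
```

The pixel the object left gets a negative derivative and the pixel it entered gets a positive one; every other pixel is zero.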
One aspect that may be confusing is how that relates to the movement of objects in the image. If you have some physics background, for instance, you might think about the difference between Eulerian and Lagrangian frames of reference: in the more intuitive Lagrangian viewpoint, you consider an object moving by tracking it over the pixels (space) in which it moves, e.g. watching a cat as it hops over a fence. The Eulerian view, which is closer to what we do in optical flow, is to track what happens at a single pixel, and never take our eyes off of it. As the cat passes over that area of (pixel) space, the pixel's values will change, and then go back to "normal" when it's gone.
These two views are in some sense equivalent, but may be useful in different situations. In computer vision, tracking an object is hard, while computing these Eulerian-like temporal derivatives is easy. Ideally, we could track the cat: consider a point p(t)=(x_p(t),y_p(t)) on say its head, then compute dp/dt and figure out p(t) for all t, and use that for downstream processing. Unfortunately, this is hard, so instead we hope that brightness constancy is usually locally true, and use the optical flow to estimate dp/dt. Of course, dI/dt often does not correspond well to dp/dt (this is why the brightness constancy is an assumption). For instance, consider a light moving around a stationary sphere: dI/dt will be large, but dp/dt will be zero.
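The link between the two derivatives can be checked on a toy case. The sketch below assumes a 1-D "image" that is a linear intensity ramp translating at a known speed; under brightness constancy the 1-D constraint is I_x * u + I_t = 0, so the motion u can be recovered as -I_t / I_x. All variable names are illustrative.

```python
import numpy as np

x = np.arange(20, dtype=np.float64)
slope = 3.0        # spatial intensity gradient of the ramp
true_shift = 2.0   # motion in pixels per frame (dp/dt)

frame0 = slope * x                  # I(x, t)
frame1 = slope * (x - true_shift)   # I(x, t+1): same ramp moved right

I_x = np.gradient(frame0)   # spatial derivative dI/dx
I_t = frame1 - frame0       # temporal derivative with delta = 1 frame

# Brightness constancy in 1-D: I_x * u + I_t = 0
estimated_shift = -I_t / I_x
print(estimated_shift[:3])
```

This only works because the ramp has a non-zero gradient everywhere and its brightness really is constant along the motion; where I_x is zero, or the lighting changes, the estimate breaks down.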
The difference between subsequent frames is the finite difference approximation to the temporal derivative.
Proper units would be obtained if the value were divided by the time between frames (i.e. multiplied by the frames per second value).
Let's say I have a series of infrared pictures and the task is to isolate the human body from other objects in the picture. The problem is noise from other relatively hot objects like lamps and their 'hot' shades.
Simple thresholding methods like binary and/or Otsu didn't give good results on difficult (noisy) pictures, so I've decided to do it manually.
Here are some samples
The results are not terrible, but I think they can be improved. Here I simply select pixels by their hue value in HSV. More or less, hot pixels are located in this range: hue < 50 or hue > 300. My main concern here is these pink pixels, which are sometimes noise from lamps but sometimes parts of the human body, so I can't simply discard them without causing significant damage to the results: e.g. on the left picture this will 'destroy' half of the left hand and so on.
As a last resort I could use some strong filtering and erosion, but I still believe there's a way to tell OpenCV: hey, I don't need these pink areas unless they are part of a large hot cluster.
Any ideas, keywords, techniques, good articles? Thanks in advance.
FIR data is presumably monotonically proportional (if not linear) to temperature, and this should yield a grayscale image.
Your examples are colorized with a color map - the color only conveys a single channel of actual information. It would be best if you could work directly on the grayscale image (maybe remap the images to grayscale).
Then, see if you can linearize the images to an actual temperature scale such that the pixel value represents the temperature. Once you do this you should be able to clamp your image to the temperature range that you expect a person to appear in. Check the datasheets of your camera/imager for the conversion formula.
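A rough sketch of that idea, assuming NumPy, a single-channel raw image and a linear conversion: the gain and offset values below are made up for illustration (the real formula comes from the imager's datasheet), and the 30-40 degree range is only a guess at where a person might appear.

```python
import numpy as np

def body_mask(raw, gain, offset, t_min=30.0, t_max=40.0):
    """Convert raw grayscale values to temperature with an assumed linear
    model, then keep only pixels inside the expected range for a person."""
    temperature = raw.astype(np.float64) * gain + offset
    return (temperature >= t_min) & (temperature <= t_max)

# Hypothetical raw readings; 250 stands in for a hot lamp.
raw = np.array([[100, 150, 160],
                [170, 250,  90]], dtype=np.uint8)

mask = body_mask(raw, gain=0.2, offset=5.0)
print(mask)
```

With these made-up numbers the lamp (55 degrees) and the cool background (23 and 25 degrees) fall outside the range, so only the three mid-range pixels are kept.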
I know images can be zoomed with the help of image pyramids, and I know OpenCV's pyrUp() method can zoom images. But beyond a certain extent, the image becomes unclear. For example, if we zoom a small image to 15 times its original size, it is definitely not clear.
Is there any method in OpenCV to zoom images while keeping the clarity of the original? Or else, any algorithm to do this?
One thing to remember: You can't pull extra resolution out of nowhere. When you scale up an image, you can have either a blurry, smooth image, or you can have a sharp, blocky image, or you can have something in between. Better algorithms, that appear to have better performance with specific types of subjects, make certain assumptions about the contents of the image, which, if true, can yield higher apparent performance, but will mess up if those assumptions prove false; there you are trading accuracy for sharpness.
There are several good algorithms out there for zooming specific types of subjects, including pixel art, faces, or text.
More general algorithms for sharpening images include unsharp masking, edge enhancement, and others; however, all of these assume specific things about the contents of the image, for instance, that the image contains text, or that a noisy area would still be noisy (or not) at a higher resolution.
A low-resolution polka-dot pattern, or a sandy beach's gritty pattern, will not go over very well, and the computer may turn your seascape into something more reminiscent of a mosh pit. Every zoom algorithm or sharpening filter has a number of costs associated with it.
In order to correctly select a zoom or sharpening algorithm, more context, including sample images, is absolutely necessary.
OpenCV has the Super Resolution module. I haven't had a chance to try it yet so not too sure how well it works.
You should check out Super-Resolution From a Single Image:
Methods for super-resolution (SR) can be broadly classified into two families of methods: (i) The classical multi-image super-resolution (combining images obtained at subpixel misalignments), and (ii) Example-Based super-resolution (learning correspondence between low and high resolution image patches from a database). In this paper we propose a unified framework for combining these two families of methods.
You most likely want to experiment with different interpolation schemes for your images. OpenCV provides the resize function, which can be used with various interpolation schemes (docs). You will likely be trading off blurriness (e.g., in bicubic or bilinear interpolation schemes) against jagged aliasing effects (for example, in nearest-neighbour interpolation). I'd recommend experimenting with the different schemes that it provides to see which ones give you the best results.
The supported interpolation schemes are listed as:
INTER_NEAREST: nearest-neighbor interpolation
INTER_LINEAR: bilinear interpolation (used by default)
INTER_AREA: resampling using pixel area relation. It may be the preferred method for image decimation, as it gives moire-free results. But when the image is zoomed, it is similar to the INTER_NEAREST method
INTER_CUBIC: bicubic interpolation over 4x4 pixel neighborhood
INTER_LANCZOS4: Lanczos interpolation over 8x8 pixel neighborhood
Wikimedia commons provides this nice comparison image for nearest-neighbour, bilinear, and bicubic interpolation:
You can see that you are unlikely to get the same sharpness as the original image when zoomed, but you can trade off "smoothness" for aliasing effects (i.e., jagged edges).
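The trade-off can be reproduced on a tiny step edge. The sketch below is plain NumPy and is not OpenCV's resize implementation: upscale_nearest and upscale_bilinear are hypothetical helper names, and the bilinear version assumes pixel-centre alignment with edge clamping.

```python
import numpy as np

def upscale_nearest(img, factor):
    """Nearest-neighbour zoom: every pixel becomes a factor x factor block."""
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)

def upscale_bilinear(img, factor):
    """Bilinear zoom with pixel-centre alignment and edge clamping."""
    h, w = img.shape
    src = img.astype(np.float64)
    # Output pixel centres expressed in source coordinates.
    ys = np.clip((np.arange(h * factor) + 0.5) / factor - 0.5, 0, h - 1)
    xs = np.clip((np.arange(w * factor) + 0.5) / factor - 0.5, 0, w - 1)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None]
    wx = (xs - x0)[None, :]
    top = src[y0][:, x0] * (1 - wx) + src[y0][:, x1] * wx
    bottom = src[y1][:, x0] * (1 - wx) + src[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy

# A hard vertical edge: black on the left, white on the right.
edge = np.array([[0, 0, 255, 255]] * 4, dtype=np.uint8)

near = upscale_nearest(edge, 4)
lin = upscale_bilinear(edge, 4)

print(np.unique(near))         # only the two original levels: blocky but sharp
print(len(np.unique(lin)))     # extra in-between levels: smooth but blurred
```

Neither result contains more detail than the 4x4 input; the two methods just distribute the same information differently.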
Take a look at quick image scaling algorithms.
First, I will discuss a simple algorithm, dubbed "smooth Bresenham" that can best be described as nearest neighbour interpolation on a zoomed grid, using a Bresenham algorithm. The algorithm is quick, it produces a quality equivalent to that of linear interpolation and it can zoom up and down, but it is only suitable for a zoom factor that is within a fairly small range. To offset this, I next develop a directional interpolation algorithm that can only magnify (scale up) and only with a factor of 2×, but that does so in a way that keeps edges sharp. This directional interpolation method is quite a bit slower than the smooth Bresenham algorithm, and it is therefore practical to cache those 2× images, once computed. Caching images with relative sizes that are powers of 2, combined with simple interpolation, is actually a third image zooming technique: MIP-mapping.
A related question is Image scaling and rotating in C/C++. Also, you can use CImg.
What you're asking for goes beyond the physics of this universe: there are simply not enough bits in the original image to represent 15*15 times more detail. No algorithm can invent the "right information" that is not there. It can only find a suitable interpolation, which will never increase the detail.
Despite what happens in a lot of police fiction, recovering a picture of a fingerprint on a car door handle starting from a panoramic view of a city is definitely fake.
You can easily zoom in or out of an image in OpenCV using the following two functions.
For Zoom In
pyrUp(tmp, dst, Size(tmp.cols * 2, tmp.rows * 2));
For Zoom Out
pyrDown(tmp, dst, Size(tmp.cols / 2, tmp.rows / 2));
You can get details about the method in the following link:
Image Zoom Out and Zoom In using OpenCV
Possible Duplicate:
Is there a way to detect if an image is blurry?
How can I calculate the blurriness and sharpness of a given image using OpenCV? Are there any functions in OpenCV to do it? If there are no such functions in OpenCV, how can I implement it? Any ideas would be great.
The input will be an image and the output should be the blurriness and sharpness of the image.
I recommend a frequency analysis of the image. Energy in the high band indicates that the image is sharp, while energy concentrated in the low band usually means that the image is blurry. To compute the spectrum, you can use the FFTW library.
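A minimal sketch of such a measure, using NumPy's FFT in place of FFTW. The function name, the 0.25 cycles-per-pixel cutoff and the test images are all assumptions chosen for illustration, not a calibrated sharpness metric.

```python
import numpy as np

def high_frequency_ratio(gray, cutoff=0.25):
    """Fraction of spectral energy above `cutoff` (in cycles per pixel),
    ignoring the DC term, which only encodes mean brightness."""
    power = np.abs(np.fft.fft2(gray.astype(np.float64))) ** 2
    h, w = gray.shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    power[radius == 0] = 0.0
    total = power.sum()
    return float(power[radius > cutoff].sum() / total) if total > 0 else 0.0

def binomial_blur(img):
    """Small separable [1, 2, 1] / 4 blur with wrap-around borders."""
    tmp = (np.roll(img, 1, axis=0) + 2 * img + np.roll(img, -1, axis=0)) / 4.0
    return (np.roll(tmp, 1, axis=1) + 2 * tmp + np.roll(tmp, -1, axis=1)) / 4.0

rng = np.random.default_rng(0)
sharp = rng.integers(0, 256, size=(64, 64)).astype(np.float64)
blurred = binomial_blur(sharp)

ratio_sharp = high_frequency_ratio(sharp)
ratio_blurred = high_frequency_ratio(blurred)
print(ratio_sharp > ratio_blurred)  # True
```

The ratio is only meaningful for comparing versions of similar content; an image of a smooth subject will score low even when it is perfectly in focus.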
I don't know about OpenCV.
If I were trying to get an approximate measurement of where an image is on the sharp-to-blurry spectrum, I'd start from the observation that the sharpness of parts of an image is evident from the contrast between adjacent pixels - something like max(c1 * abs(r1 - r2), c2 * abs(g1 - g2), c3 * abs(b1 - b2)), where c1-3 weigh the perceptual importance of the red, green and blue channels, and the two pixels are (r1,g1,b1) and (r2,g2,b2).
Many tweaks are possible, such as raising each colour's contribution to a power to emphasise changes at the dark (power < 1) or bright (power > 1) end of the brightness scale. Note that the max() approach considers sharpness for each colour channel separately: a change from say (255,255,255) to (0,255,255) is very dramatic despite only one channel changing.
You may find it better to convert from RGB to another colour representation, such as Hue/Saturation/Value (there'll be lots of sites online explaining the HSV space, and formulas for conversions).
Photographically, we're usually interested in knowing that the in-focus part of the image is sharp (foreground/background blur/bokeh due to shallow depth of field is a normal and frequently desirable quality) - the clearest indication of that is high contrast in some part of the image, suggesting you want the maximum value of adjacent-pixel contrasts. That said, some focused pictures can still have very low local contrasts (e.g. a picture of a solid coloured surface). Further, damaged pixel elements on the sensor, dirt on the lens/sensor, and high-ISO / long-exposure noise may all manifest as spots of extremely high contrast. So the validity of your result's always going to be questionable, but it might be ball-park right a useful percentage of the time.
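The max-of-adjacent-contrasts idea can be sketched in NumPy as follows. The function name is hypothetical, the image is assumed to be an H x W x 3 RGB array with at least two rows and two columns, and the channel weights (the common 0.299/0.587/0.114 luma coefficients) are just one plausible choice for c1-3.

```python
import numpy as np

def max_adjacent_contrast(rgb, weights=(0.299, 0.587, 0.114)):
    """Largest weighted per-channel difference between horizontally or
    vertically adjacent pixels; the max is taken over channels too."""
    img = rgb.astype(np.float64)
    w = np.asarray(weights, dtype=np.float64)
    dx = np.abs(np.diff(img, axis=1)) * w  # left/right neighbours
    dy = np.abs(np.diff(img, axis=0)) * w  # up/down neighbours
    return float(max(dx.max(), dy.max()))

# Hard black-to-white edge versus a gentle ramp (10 levels per pixel).
hard = np.zeros((4, 8, 3), dtype=np.uint8)
hard[:, 4:, :] = 255
ramp = np.tile((np.arange(8) * 10).astype(np.uint8)[None, :, None], (4, 1, 3))

score_hard = max_adjacent_contrast(hard)
score_ramp = max_adjacent_contrast(ramp)
print(score_hard > score_ramp)  # True
```

Because it takes a single maximum, one hot pixel or speck of dust is enough to produce a high score, which is the weakness described above; a high percentile of the differences would be a more robust variant.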
I am processing some images using the ImageMagick library. As part of the processing I want to minimize the number of colors if this doesn't affect image quality (too much).
For this I have tried to use the MagickQuantizeImage function. Can someone explain how I should choose the parameters?
treedepth:
Normally, this integer value is zero or one. A zero or one tells Quantize to choose an optimal tree depth of Log4(number_colors). A tree of this depth generally allows the best representation of the reference image with the least amount of memory and the fastest computational speed. In some cases, such as an image with low color dispersion (a few number of colors), a value other than Log4(number_colors) is required. To expand the color tree completely, use a value of 8.
dither:
A value other than zero distributes the difference between an original image and the corresponding color reduced algorithm to neighboring pixels along a Hilbert curve.
measure_error:
A value other than zero measures the difference between the original and quantized images. This difference is the total quantization error. The error is computed by summing over all pixels in an image the distance squared in RGB space between each reference pixel value and its quantized value.
PS: I have run some tests, but sometimes the quality of the images is severely affected, and I don't want to find a result by trial and error.
This is a really good description of the algorithm
http://www.imagemagick.org/www/quantize.html
They are referencing the command-line version, but the concepts are the same.
The parameter measure_error is meant to give you an indication of how good an answer you got. Set to non-zero, then look at the Image object's mean_error_per_pixel field after you quantize to see how good a quantization you got.
If it's not good enough, increase the number of colors.
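The "measure, then increase the number of colors" loop can be sketched without ImageMagick. The code below is NumPy only and is not ImageMagick's algorithm: uniform_quantize is a deliberately crude stand-in quantizer, total_squared_error follows the squared-RGB-distance definition quoted in the question, and all function names, the candidate list and the threshold are hypothetical.

```python
import numpy as np

def uniform_quantize(rgb, levels):
    """Stand-in quantizer: `levels` evenly spaced values per channel."""
    step = 255.0 / (levels - 1)
    return np.rint(np.rint(rgb.astype(np.float64) / step) * step)

def total_squared_error(reference, quantized):
    """Sum over all pixels of the squared RGB distance to the original."""
    diff = reference.astype(np.float64) - quantized.astype(np.float64)
    return float((diff ** 2).sum())

def fewest_levels(rgb, max_mean_error, candidates=(2, 4, 8, 16, 32, 64, 128, 256)):
    """Try increasing level counts until the mean squared error per pixel
    drops to `max_mean_error` or below; fall back to the last candidate."""
    n_pixels = rgb.shape[0] * rgb.shape[1]
    for levels in candidates:
        err = total_squared_error(rgb, uniform_quantize(rgb, levels)) / n_pixels
        if err <= max_mean_error:
            return levels, err
    return candidates[-1], err

# Gray ramp covering every 8-bit value, copied into three channels.
ramp = np.tile(np.arange(256, dtype=np.uint8), (4, 1))
image = np.stack([ramp, ramp, ramp], axis=-1)

levels, err = fewest_levels(image, max_mean_error=50.0)
print(levels, err)
```

With ImageMagick itself the same loop would call the quantizer with measure_error enabled and read the reported error after each attempt, instead of computing it by hand.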