OpenCV: Detecting seizure-inducing lights in a video? - c++

I have been working on an algorithm which can detect seizure-inducing strobe lights in a video.
Currently, my code returns virtually every frame as capable of causing a seizure (3Hz flashes).
My code calculates the relative luminance of each pixel and sees how many times the luminance goes up then down, etc. or down then up, etc. by more than 10% within any given second.
Is there any way to do this without comparing each individual pixel within a second of each other, and so that it only returns the correct frames?
An example of what I am trying to emulate:

The common approach to solving this type of problem is to convert the frames to grayscale and then construct a cube containing the frames from a 1 to 3 second time interval. From this cube, you can extract the time-varying characteristics of either individual pixels (noisy) or blocks (recommended). The resulting 1D curves can first be inspected manually to see if they actually show the 3Hz variation that you are looking for (sometimes these variations are lost or distorted because of the camera's auto exposure settings). If you can see it, then you should be able to use an FFT to isolate and detect it automatically.

Convert the image to grayscale. Break the image up into blocks, maybe 16x16 or 64x64 or larger (experiment to see what works). Take the average luminance of each block over a minimum of 2/3 seconds. This gives you a wave of luminance over time for each block. Do an FFT on this wave and check whether the energy around 3Hz exceeds a minimum threshold.
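A minimal sketch of the last step, assuming you already have the per-frame mean luminance of one block in a vector. Instead of a full FFT it evaluates a single DFT bin at the target frequency, which is enough when you only care about one frequency. The function name bandEnergyRatio and the normalisation (share of the non-DC energy) are my own choices for illustration, not something taken from OpenCV.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Share of the signal's non-DC energy that sits at targetHz (0 = none, about 1 = all of it).
// samples: mean luminance of one block, one value per frame.
// fps:     frame rate of the video.
double bandEnergyRatio(const std::vector<double>& samples, double fps, double targetHz) {
    assert(!samples.empty() && fps > 0.0);
    const double pi = std::acos(-1.0);
    const std::size_t n = samples.size();

    // Remove the mean so that constant brightness does not count as energy.
    double mean = 0.0;
    for (double s : samples) mean += s;
    mean /= static_cast<double>(n);

    // Single-bin DFT at targetHz, plus the total energy of the mean-free signal.
    double re = 0.0, im = 0.0, total = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double v = samples[i] - mean;
        const double phase = 2.0 * pi * targetHz * static_cast<double>(i) / fps;
        re += v * std::cos(phase);
        im -= v * std::sin(phase);
        total += v * v;
    }
    if (total == 0.0) return 0.0;  // perfectly flat block

    // Normalised so that a pure sinusoid at targetHz over whole periods gives about 1.
    return 2.0 * (re * re + im * im) / (static_cast<double>(n) * total);
}
```

You would call this per block with targetHz = 3.0 and flag the block when the ratio is high. Because the ratio ignores amplitude, you would probably also want to require a minimum luminance swing so that sensor noise is not flagged.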


Infrared images segmentation using OpenCV

Let's say I have a series of infrared pictures and the task is to isolate the human body from other objects in the picture. The problem is noise from other relatively hot objects like lamps and their 'hot' shades.
Simple thresholding methods like binary and/or Otsu didn't give good results on difficult (noisy) pictures, so I've decided to do it manually.
Here are some samples
The results are not terrible, but I think they can be improved. Here I simply select pixels by the hue value in HSV. More or less, hot pixels are located in this area: hue < 50, hue > 300. My main concern is the pink pixels, which are sometimes noise from lamps but sometimes parts of the human body, so I can't simply discard them without causing significant damage to the results: e.g. on the left picture this would 'destroy' half of the left hand and so on.
As a last resort I could use some strong filtering and erosion, but I still believe there's a way to tell OpenCV: hey, I don't need these pink areas unless they are part of a large hot cluster.
Any ideas, keywords, techniques, good articles? Thanks in advance.
FIR data is presumably monotonically proportional (if not linear) to temperature, and this should yield a grayscale image.
Your examples are colorized with a color map - the color only conveys a single channel of actual information. It would be best if you could work directly on the grayscale image (maybe remap the images to grayscale).
Then, see if you can linearize the images to an actual temperature scale such that the pixel value represents the temperature. Once you do this you should be able to clamp your image to the temperature range in which you expect a person to appear. Check the datasheets of your camera/imager for the conversion formula.
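A rough sketch of the clamping step, assuming the conversion from raw value to temperature is linear. The LinearCalibration struct and its gain/offset values are hypothetical placeholders; the real formula has to come from your camera's datasheet and may not be linear at all.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical calibration: temperatureC = gain * raw + offset.
// Replace with the formula from your imager's datasheet.
struct LinearCalibration {
    double gain;
    double offset;
};

// Returns a mask with 255 where the pixel temperature lies in [minC, maxC], else 0.
std::vector<std::uint8_t> maskTemperatureRange(const std::vector<std::uint16_t>& raw,
                                               const LinearCalibration& cal,
                                               double minC, double maxC) {
    assert(minC <= maxC);
    std::vector<std::uint8_t> mask(raw.size(), 0);
    for (std::size_t i = 0; i < raw.size(); ++i) {
        const double tempC = cal.gain * static_cast<double>(raw[i]) + cal.offset;
        if (tempC >= minC && tempC <= maxC) mask[i] = 255;
    }
    return mask;
}
```

With a range such as 30 to 40 degrees C, lamps that are much hotter than skin fall outside the mask regardless of how the color map painted them.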

How to estimate exposure time for camera to take a good image from a scene

I am trying to write code to calculate the correct exposure time for a camera to capture an image in correct brightness.
What I have is a camera that supplies data in RAW (Bayer raw data), and I can control its exposure time. I want to control the exposure so that when it captures an image, the image has the correct brightness (not too dark (underexposed) or too bright (overexposed)).
I think I need an algorithm similar to this:
1-capture a sample image
2-calculate image brightness.
3-calculate correct exposure.
4-capture a new image,
5-check that the image brightness is correct if not go to step 3.
6- capture final image.
My question is:
How can I calculate image brightness?
Once I have the image brightness, how can I calculate the exposure? One way of doing this is a search (for example, start from a very short exposure time and increase it until you get a correct exposure), but that is very time consuming. Is there a better way of doing this?
To do this, I may need to calibrate my camera (as the relationship between brightness and exposure time differs between sensors). How can I do this?
I am using OpenCV and I can use any algorithms that are available in OpenCV (C++).
There are multiple ways to measure the "correct" brightness of the image. A common method is to calculate the intensity histogram and make sure that the values cover the entire range of values and that not too many pixels are cut off at either end. I'm not sure there's a single "one-size-fits-all" way for every possible scene.
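To make the histogram idea concrete, here is a small sketch, assuming the Bayer data has already been converted to an 8-bit grayscale buffer. The names ExposureStats and measureExposure, and the cut-off levels 5 and 250, are arbitrary choices of mine for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct ExposureStats {
    double darkFraction;    // share of pixels at or below darkLevel
    double brightFraction;  // share of pixels at or above brightLevel
    double mean;            // mean intensity, 0..255
};

// Builds a 256-bin histogram and measures how much of it is cut off at each end.
ExposureStats measureExposure(const std::vector<std::uint8_t>& gray,
                              std::uint8_t darkLevel = 5,
                              std::uint8_t brightLevel = 250) {
    assert(!gray.empty());
    std::array<std::size_t, 256> histogram{};
    for (std::uint8_t v : gray) ++histogram[v];

    std::size_t dark = 0, bright = 0;
    double sum = 0.0;
    for (int v = 0; v < 256; ++v) {
        const std::size_t count = histogram[static_cast<std::size_t>(v)];
        if (v <= darkLevel) dark += count;
        if (v >= brightLevel) bright += count;
        sum += static_cast<double>(v) * static_cast<double>(count);
    }
    const double n = static_cast<double>(gray.size());
    return {static_cast<double>(dark) / n, static_cast<double>(bright) / n, sum / n};
}
```

A large brightFraction suggests overexposure and a large darkFraction suggests underexposure; what counts as "too much" depends on the scene.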
A faster way than linearly increasing the exposure is to do a binary search, by measuring at low and high exposure, then measuring in the middle, and then continuing to split the sub-range in the middle, until you find the optimum.
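The binary search could look roughly like this. It assumes that brightness increases monotonically with exposure time, and it takes the "capture an image and measure its brightness" step as a callback, since that part depends on your camera API. findExposure and its parameters are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Binary search for the exposure whose measured brightness is within
// `tolerance` of `target`. measureBrightness(exposure) is expected to capture
// a frame at that exposure and return its brightness (for example the mean).
// Assumes brightness grows monotonically with exposure.
double findExposure(const std::function<double(double)>& measureBrightness,
                    double lo, double hi, double target, double tolerance,
                    int maxIterations = 20) {
    assert(lo < hi);
    double mid = 0.5 * (lo + hi);
    for (int i = 0; i < maxIterations; ++i) {
        mid = 0.5 * (lo + hi);
        const double brightness = measureBrightness(mid);
        if (std::abs(brightness - target) <= tolerance) break;
        if (brightness < target) {
            lo = mid;  // too dark: search the longer exposures
        } else {
            hi = mid;  // too bright: search the shorter exposures
        }
    }
    return mid;
}
```

Each iteration costs one capture, so a 1000:1 exposure range is narrowed to within about one unit in roughly ten captures, compared with hundreds for a linear sweep.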

Detect if images are different in real-time

I am working on a microscope that streams live images via a built-in video camera to a PC, where further image processing can be performed on the streamed image. Any processing done on the streamed image must be done in "real-time" (minimal frames dropped).
We take the average of a series of static images to counter random noise from the camera to improve the output of some of our image processing routines.
My question is: how do I know if the image is no longer static - either the sample under inspection has moved or rotated/camera zoom-in or out - so I can reset the image series used for averaging?
I looked through some of the threads, and some ideas that seemed interesting:
Note: using Windows, C++ and Intel IPP. With IPP the image is a byte array (Ipp8u).
1. Hash the images, and compare the hashes (normal hash or perceptual hash?)
2. Use normalized cross correlation (IPP has many variations - which to use?)
Which do you guys think is suitable for my situation (speed)?
If your camera doesn't shake, you can, as inVader said, subtract images. The sum of absolute values of all pixels of the difference image is then sometimes enough to tell whether the images are the same or different. However, if your noise, lighting level, etc. vary, this will not give you a good enough S/N ratio.
And in noisy conditions normal hashes are even more useless.
The best approach would be to detect that some feature of your object has changed, like its boundary (if it's regular) or its mass center (if it's irregular). If you have a boundary position, you only need to analyze one line of pixels, perpendicular to that boundary, to tell that the boundary has moved.
Mass center position may be a subject to frequent false-negative responses, but adding a total mass and/or moment of inertia may help.
If the camera shakes, you may have to align images before comparing (depending on comparison method and required accuracy, a single pixel misalignment might be huge), and that's where cross-correlation helps.
Furthermore, you don't have to analyze every image. You can skip one, and if the next differs, discard both of them. That gives you twice as much time to analyze an image.
And if you are averaging images, you might just define an optimal amount of images you need and compare just the first and the last image in the sequence.
So, the simplest thing to try would be to take subsequent images, subtract them from each other and have a look at the difference. Then define some rules, including local and global thresholds for the difference, under which two images are considered equal. Simple subtraction of bitmap/array data, looking for maxima and calculating the average difference across the whole thing should be no problem to do in real time.
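A bare-bones sketch of that rule on raw 8-bit buffers (the same layout as an Ipp8u array). DiffStats, compareFrames and framesDiffer are made-up names; with IPP or OpenCV you would use their own absolute-difference and statistics routines instead of the hand-written loop.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

struct DiffStats {
    int maxDiff;      // largest per-pixel absolute difference (local measure)
    double meanDiff;  // average absolute difference over the frame (global measure)
};

// Per-pixel absolute difference of two equally sized 8-bit frames.
DiffStats compareFrames(const std::vector<std::uint8_t>& a,
                        const std::vector<std::uint8_t>& b) {
    assert(a.size() == b.size() && !a.empty());
    int maxDiff = 0;
    std::uint64_t sum = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const int d = std::abs(static_cast<int>(a[i]) - static_cast<int>(b[i]));
        if (d > maxDiff) maxDiff = d;
        sum += static_cast<std::uint64_t>(d);
    }
    return {maxDiff, static_cast<double>(sum) / static_cast<double>(a.size())};
}

// The frames count as different if either threshold is exceeded.
bool framesDiffer(const DiffStats& s, int localThreshold, double globalThreshold) {
    return s.maxDiff > localThreshold || s.meanDiff > globalThreshold;
}
```

The thresholds have to sit above your camera's noise floor, which you can estimate by comparing frames of a scene you know to be static.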
If there are varying light conditions or something moving in a predictable way (like a door opening and closing), then something more powerful, albeit slower, like Gaussian mixture models for background modeling, might be worth looking into. It is quite compute intensive, but can be parallelized pretty easily.
Motion detection algorithms are what you want here.
First of all I would take a series of images at a low frame rate and downsample those images to make them smaller, not too much but enough to speed up the process.
Now you have several options:
You could compute the sum of absolute differences (SAD) of the two images by subtracting them, and use a threshold to decide whether the image has changed.
If you want to speed it up even further, I would suggest doing a progressive SAD using a small kernel and moving from the top of the image to the bottom. You can track the cumulative amount of difference during the process and stop early once you are satisfied.
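One way the progressive variant could be sketched: accumulate the SAD one row at a time from top to bottom and stop as soon as the running total passes a limit. Using a full row as the "kernel" is a simplification of mine, and changedProgressive, sadLimit and rowsScanned are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Accumulates the SAD row by row and returns true as soon as it exceeds sadLimit.
// If rowsScanned is not null, it receives the number of rows that were examined.
bool changedProgressive(const std::vector<std::uint8_t>& a,
                        const std::vector<std::uint8_t>& b,
                        std::size_t width, std::size_t height,
                        std::uint64_t sadLimit,
                        std::size_t* rowsScanned = nullptr) {
    assert(a.size() == width * height && b.size() == width * height);
    std::uint64_t sad = 0;
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            const std::size_t i = y * width + x;
            sad += static_cast<std::uint64_t>(
                std::abs(static_cast<int>(a[i]) - static_cast<int>(b[i])));
        }
        if (sad > sadLimit) {  // enough evidence: stop early
            if (rowsScanned) *rowsScanned = y + 1;
            return true;
        }
    }
    if (rowsScanned) *rowsScanned = height;
    return false;
}
```

The early exit only saves time on frames that did change; unchanged frames are still scanned in full.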

finding image silhouette using openCV

As I want to track the motion of an object, I need the silhouette from a sequence of images.
Does anybody know how to do this?
Silhouette mask is a binary image that has non-zero pixels where the motion occurs
You can use the technique of background subtraction. Here are two ways of doing it.
Subtract the previous frame from the current frame. Only pixels that are unchanged between the two frames will result in zero. See cvSub, cvAbsDiff.
Maintain a running average of the video frames. See the function cvRunningAvg in the Motion Analysis and Object Tracking section of the OpenCV docs. For each new frame, subtract the running average from the current frame. When you're done, update the running average with the current frame.
After using one of the methods above, you could segment the resulting difference image using cvThreshold or cvAdaptiveThreshold. This will result in a binary image, ideally with zero where the image was static, and 1 or 255 where motion was present.
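To show the running-average variant end to end without depending on a particular OpenCV version, here is a plain C++ sketch on 8-bit grayscale buffers. It does by hand what cvRunningAvg, cvAbsDiff and cvThreshold would do for you; the class name RunningAverageBackground and the alpha parameter are my own, and the fixed threshold is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Keeps a running average of the frames and returns a binary silhouette mask.
class RunningAverageBackground {
public:
    // alpha: weight of the newest frame in the average (0 < alpha <= 1).
    RunningAverageBackground(std::size_t pixelCount, double alpha)
        : background_(pixelCount, 0.0), alpha_(alpha), initialized_(false) {
        assert(alpha > 0.0 && alpha <= 1.0);
    }

    // Returns 255 where |frame - background| > threshold, else 0,
    // then updates the running average with the current frame.
    std::vector<std::uint8_t> apply(const std::vector<std::uint8_t>& frame,
                                    double threshold) {
        assert(frame.size() == background_.size());
        std::vector<std::uint8_t> mask(frame.size(), 0);
        if (!initialized_) {
            // First frame: use it as the initial background, no motion yet.
            for (std::size_t i = 0; i < frame.size(); ++i) background_[i] = frame[i];
            initialized_ = true;
            return mask;
        }
        for (std::size_t i = 0; i < frame.size(); ++i) {
            const double diff = std::abs(static_cast<double>(frame[i]) - background_[i]);
            if (diff > threshold) mask[i] = 255;
            background_[i] = (1.0 - alpha_) * background_[i] + alpha_ * frame[i];
        }
        return mask;
    }

private:
    std::vector<double> background_;
    double alpha_;
    bool initialized_;
};
```

A small alpha makes the background adapt slowly, so moving objects stay in the mask longer; a large alpha absorbs them into the background quickly.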
Though you didn't mention this in your question, you can then proceed to calculate the contour of the binary image. There's cvFindContours for that.
Have a look at this: Tracking colored objects in OpenCV

Video upsampling with C/C++

I want to upsample an array of captured (from webcam) OpenCV images or corresponding float arrays (Pixel values don't need to be discrete integer). Unfortunately the upsampling ratio is not always integer, so I cannot figure myself how to do it with simple linear interpolation.
Is there an easier way or a library to do this?
Well, I don't know of a library to do frame rate scaling.
But I can tell you that the most appropriate way to do it yourself is by just dropping or doubling frames.
Blending pictures by simple linear pixel interpolation will not improve quality; playback will still look jerky, and now blurry as well.
To interpolate frame rates properly, much more complicated algorithms are needed.
Modern TVs have built-in hardware for that, and video editing software such as After Effects has functions that do it.
These algorithms are able to create in-between pictures by motion analysis, but that is beyond the scope of a small solution.
So either keep searching for an existing library you can use, or do it by just dropping/doubling frames.
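For a non-integer ratio, dropping/doubling boils down to picking, for every output frame, the source frame that is on screen at that moment. A small sketch of that index mapping (resampleFrameIndices is a made-up name, and holding the most recent source frame is my assumption about what "doubling" means here):

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// For each output frame, returns the index of the source frame to show.
// Frames get repeated when targetFps > sourceFps and skipped when it is lower.
std::vector<std::size_t> resampleFrameIndices(std::size_t sourceCount,
                                              double sourceFps, double targetFps) {
    assert(sourceCount > 0 && sourceFps > 0.0 && targetFps > 0.0);
    const std::size_t outCount = static_cast<std::size_t>(
        std::floor(static_cast<double>(sourceCount) * targetFps / sourceFps));
    std::vector<std::size_t> indices;
    indices.reserve(outCount);
    for (std::size_t k = 0; k < outCount; ++k) {
        // Source frame that is current at the time of output frame k.
        std::size_t src = static_cast<std::size_t>(
            std::floor(static_cast<double>(k) * sourceFps / targetFps));
        if (src >= sourceCount) src = sourceCount - 1;  // guard against rounding
        indices.push_back(src);
    }
    return indices;
}
```

Going from 10 to 15 fps, four source frames map to the indices 0, 0, 1, 2, 2, 3, so every other source frame is shown twice.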
The ImageMagick MagickWand library will resize images using proper filtering algorithms - see the MagickResizeImage() function (and use the Sinc filter).
I am not 100% familiar with video capture, so I'm not sure what you mean by "pixel values don't need to be discrete integer". Does this mean the color information per pixel may not be integers?
I am assuming that by "the upsampling ratio is not always integer", you mean that you will upsample from one resolution to another, but you might not be doubling or tripling. For example, instead of 640x480 -> 1280x960, you may be doing, 640x480 -> 800x600.
A simple algorithm might be:
For each pixel in the larger grid
Scale the x/y values to lie between 0,1 (divide x by width, y by height)
Scale the x/y values by the width/height of the smaller grid -> xSmaller, ySmaller
Determine the four pixels that contain your point, via floating point floor/ceiling functions
Get the x/y values of where the point lies within that rectangle, between 0,1 (subtract the floor values from xSmaller, ySmaller) -> xInterp, yInterp
Start with black, and add your four colors, scaled by the xInterp/yInterp factors for each
You can make this faster for multiple frames by creating a lookup table to map pixels -> xInterp/yInterp values
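The steps above can be sketched as follows for a single-channel float image stored row by row. One deliberate deviation: the coordinates are scaled by (size - 1) rather than by size, so the corner pixels of both grids line up exactly. resizeBilinear is a hypothetical helper, not a library function.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Bilinear resize of a single-channel, row-major float image.
std::vector<float> resizeBilinear(const std::vector<float>& src,
                                  std::size_t srcW, std::size_t srcH,
                                  std::size_t dstW, std::size_t dstH) {
    assert(srcW > 0 && srcH > 0 && dstW > 0 && dstH > 0);
    assert(src.size() == srcW * srcH);
    std::vector<float> dst(dstW * dstH);
    for (std::size_t y = 0; y < dstH; ++y) {
        // Position of this output row in source coordinates.
        const double fy = dstH > 1
            ? static_cast<double>(y) * static_cast<double>(srcH - 1) / static_cast<double>(dstH - 1)
            : 0.0;
        const std::size_t y0 = static_cast<std::size_t>(std::floor(fy));
        const std::size_t y1 = std::min(y0 + 1, srcH - 1);
        const double ty = fy - static_cast<double>(y0);  // yInterp, in [0, 1)
        for (std::size_t x = 0; x < dstW; ++x) {
            const double fx = dstW > 1
                ? static_cast<double>(x) * static_cast<double>(srcW - 1) / static_cast<double>(dstW - 1)
                : 0.0;
            const std::size_t x0 = static_cast<std::size_t>(std::floor(fx));
            const std::size_t x1 = std::min(x0 + 1, srcW - 1);
            const double tx = fx - static_cast<double>(x0);  // xInterp, in [0, 1)
            // Blend the four surrounding pixels, weighted by the distance to each.
            const double top = src[y0 * srcW + x0] * (1.0 - tx) + src[y0 * srcW + x1] * tx;
            const double bottom = src[y1 * srcW + x0] * (1.0 - tx) + src[y1 * srcW + x1] * tx;
            dst[y * dstW + x] = static_cast<float>(top * (1.0 - ty) + bottom * ty);
        }
    }
    return dst;
}
```

The x0/x1/tx values depend only on the two widths, so for a video you could compute them once and reuse them for every frame, which is the lookup table mentioned above.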
What is described above is essentially bilinear interpolation. I am sure there are much better algorithms out there (bicubic, and many more). This seems like the sort of thing you'd want optimized at the processor level.
Use libswscale from the ffmpeg project. It is the most optimized and supports a number of different resampling algorithms.