How to encode grayscale video in libvpx (webm)? - c++

I have a stream of raw images that comes from a network grayscale camera that we are developing. In this case, our images are arrays of 8-bit pixels (640x480). Since this camera outputs more than 200 frames per second, I need to store these images as a WebM video as quickly as possible, so that no frames are lost.
What is the best way of doing that, using libvpx?

The fastest and easiest thing to do would be to feed the grayscale plane directly into the libvpx compression function vpx_codec_encode with VPX_IMG_FMT_I420. You will also have to supply two 2x2-subsampled chroma planes, 320x240 in your case, and set every octet of those planes to the value 128.
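A minimal sketch of that approach, assuming VP8 via vpx_codec_vp8_cx() and a 640x480 source (the 200 fps timebase and the capture loop are placeholders, and writing the compressed packets into a WebM container, e.g. with libwebm, is left out):

// Sketch: encode 8-bit grayscale frames with libvpx by filling only the Y plane
// and keeping the chroma planes at the neutral value 128.
#include <vpx/vpx_encoder.h>
#include <vpx/vp8cx.h>
#include <cstdint>
#include <cstring>

void encode_gray_frame(vpx_codec_ctx_t* codec, vpx_image_t* img,
                       const uint8_t* gray, int width, int height,
                       vpx_codec_pts_t pts)
{
    // Copy the grayscale buffer into the luma (Y) plane row by row,
    // honoring the stride chosen by vpx_img_alloc().
    for (int y = 0; y < height; ++y)
        std::memcpy(img->planes[VPX_PLANE_Y] + y * img->stride[VPX_PLANE_Y],
                    gray + y * width, width);

    vpx_codec_encode(codec, img, pts, 1 /*duration*/, 0, VPX_DL_REALTIME);
    // Fetch the compressed packets with vpx_codec_get_cx_data() and hand
    // them to your WebM muxer.
}

int main()
{
    const int width = 640, height = 480;

    vpx_image_t img;
    vpx_img_alloc(&img, VPX_IMG_FMT_I420, width, height, 1);
    // Neutral chroma: fill U and V (320x240 here) with 128 once; they never change.
    std::memset(img.planes[VPX_PLANE_U], 128, img.stride[VPX_PLANE_U] * (height / 2));
    std::memset(img.planes[VPX_PLANE_V], 128, img.stride[VPX_PLANE_V] * (height / 2));

    vpx_codec_enc_cfg_t cfg;
    vpx_codec_enc_config_default(vpx_codec_vp8_cx(), &cfg, 0);
    cfg.g_w = width;
    cfg.g_h = height;
    cfg.g_timebase = {1, 200};   // 200 fps source

    vpx_codec_ctx_t codec;
    vpx_codec_enc_init(&codec, vpx_codec_vp8_cx(), &cfg, 0);

    // For each captured frame:
    //   encode_gray_frame(&codec, &img, frame_buffer, width, height, pts++);

    vpx_codec_destroy(&codec);
    vpx_img_free(&img);
    return 0;
}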

Related

OpenCV : imwrite changes the channels pixels values when saving

I'm reading an image and doing some processing on the blue channel without changing the red or the green channel.
When I finished processing the blue channel, I merged the three channels back into one RGB image. When I use imshow to view the channels, everything is fine and I can see that the changes I've made only affect the blue channel and do not touch the red or green ones.
Up to this point everything is fine!
But when I save the image using imwrite, the resulting image is slightly different: the changes made on the blue channel seem to get propagated to the red and green channels, as if imwrite were computing some kind of mean of the three channels:
image = imread("image.jpg", IMREAD_COLOR);
split(image, channels);
// Create some changes on channels[0]
merge(channels, 3, image);
// Up to this point every thing is alright
imwrite("modified.jpg", image); // Image changes when written;
Is there any solution to avoid this behavior ?
JPG is a lossy format: https://en.wikipedia.org/wiki/JPEG
JPEG (/ˈdʒeɪpɛɡ/ JAY-peg) is a commonly used method of lossy
compression for digital images, particularly for those images produced
by digital photography. The degree of compression can be adjusted,
allowing a selectable tradeoff between storage size and image quality.
JPEG typically achieves 10:1 compression with little perceptible loss
in image quality.
Solution: Use a lossless format like PNG to save your image.
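With OpenCV the encoder is picked from the file extension passed to imwrite, so a minimal illustration (reusing the image variable from the question) is simply:

imwrite("modified.png", image); // PNG is lossless, so the channel values are preserved exactly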

Portion of image sensor used for 1080p video

I'm trying to get the width and height of the sensor area used when recording 1080p video, for an image processing application with a Raspberry Pi camera. I have noticed that the field of view changes between 1080p video and a 1080p still image, even though the resolution is the same. I believe this is done due to a bit rate issue of h264 video.
All of these observations make me confused as to how I can calculate the correct width and height in mm when using 1080p video. The Raspberry Pi camera spec says:
sensor resolution - 2592 x 1944 pixels
sensor dimensions - 3.76 x 2.74 mm
Would a straightforward linear interpolation be accurate, e.g. (3.76 * 1920 / 2592)? But then it seems the image can be scaled as well, which happens in either the video or the still image format.
Note: I have calibrated the camera and have all intrinsic values in pixel units. My effort here is to convert all of these into mm.
Just calibrate the camera for the mode you want to use.
The width and height of your sensor is given in the specs.
It also gives you a pixel size of 1.4µm x 1.4µm. If that were not given, you could calculate it by dividing the sensor width by the image width, and likewise for the height.
It also says that there is cropping in 1080p mode, which means that only a region of your sensor is used. Simply multiply your image width and height by the pixel size and you get the size of the sensor area that is used for 1080p.
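As a quick worked example, assuming the 1.4µm pixel size from the spec (the exact position of the crop still has to be found as described next):

// Size of the sensor area used by a 1920x1080 crop, assuming 1.4 µm pixels
const double pixel_mm = 0.0014;                  // 1.4 µm expressed in mm
const double used_width_mm  = 1920 * pixel_mm;   // ~2.69 mm
const double used_height_mm = 1080 * pixel_mm;   // ~1.51 mm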
To get the position of that area take a picture of the same scene in 1080p and in full resolution and compare them.
Not sure about the scaling. You did not provide sufficient information here.
You can either calibrate your camera in 1080p mode, or calibrate it at full resolution and correct the pixel positions by some translation offset. Pixel size and physical pixel positions do not change through cropping...

Is there a way to have both grayscale and rgb pixels on the same image opencv C++?

I need to be able to work with images where some regions are grayscale while others are kept in RGB format. I don't want to convert the whole image to grayscale, since it would lose the channels and become single-channel. Is there a way to keep the RGB channels of some pixels in the picture and turn the others into grayscale?
NO.
I see two solutions to this:
Have both a gray (Mat1b) and an RGB (Mat3b) image, and work on whichever image you need.
Have a single RGB (Mat3b) image, and set the r, g, b channels to the same gray value where you need. In this way you can mimic a mixed gray/RGB image.
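A minimal sketch of the second option; the file name and ROI coordinates are placeholders for the region that should appear gray:

cv::Mat3b img = cv::imread("input.png", cv::IMREAD_COLOR);
cv::Rect roi(50, 50, 100, 100);   // hypothetical region that should look grayscale
for (int y = roi.y; y < roi.y + roi.height; ++y) {
    for (int x = roi.x; x < roi.x + roi.width; ++x) {
        cv::Vec3b& p = img(y, x);
        // Plain average here; cv::cvtColor uses a weighted (luma) formula instead.
        uchar g = static_cast<uchar>((p[0] + p[1] + p[2]) / 3);
        p = cv::Vec3b(g, g, g);
    }
}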

How to get grayscale value of pixels from grayscale image in Xcode

I was wondering how to determine the equivalent of RGB values for a grayscale image. The original image is grayscale, and everything I have found online is about converting an RGB image's pixel values to grayscale pixel values. I can already read in the image. Ideally, this would be for Xcode.
I was wondering if there is a class which would do this for me. If so, and you could point me to it, that would be great. I will read up on it.
Any help is greatly appreciated.
NOTE: I am a beginner in C++ and do not have time to learn everything formally; I have to learn all of my programming on the fly.
You need more information to transform from simple grayscale to RGB. When you do the reverse operation, the color information is "lost", as the three channels are set to the same value (depending on the algorithm, each channel has a different/same weight in the final color computation).
Digital cameras usually store more information per pixel: 12 bits per channel in 35mm and 14 bits per channel in medium format (those bit depths are averages; some products offer less or even more quality).
Thanks to those additional bits per channel, the camera can compute the "real" color, or what it thinks is the real color based on some parameters.
TL;DR: You can't without more data from your source, in this case the image.
You can convert a gray value to RGB by setting each component of the RGB value to the gray value:
ColorRGB myColorRGB = ColorRGBMake(myGrayValue, myGrayValue, myGrayValue);

Background extraction

Can anyone suggest me a fast way of getting the foreground image?
Currently I am using the BackgroundSubtractorMOG2 class to do this. It is very slow, and my task doesn't need such a complex algorithm.
I can get an image of the background in the beginning. The camera position will not change, so I believe there is an easy way to do this.
I need to capture a blob of the object moving in front of the camera, and there will always be only one object.
I suggest the following simple solution:
Compute difference matrix:
cv::absdiff(frame, background, absDiff);
This sets each pixel (i,j) in absDiff to |frame(i,j) - background(i,j)|. Each channel (e.g. R,G,B) is processed independently.
Convert the result to a single-channel grayscale image:
cv::cvtColor(absDiff, absDiffGray, cv::COLOR_BGR2GRAY);
Apply binary filter:
cv::threshold(absDiffGray, absDiffGrayThres, 0, 255, CV_THRESH_BINARY | CV_THRESH_OTSU);
Here we use Otsu's method to determine an appropriate threshold level. If there was any noise left from step 2, the binary filter removes it.
Apply blob detection in the absDiffGrayThres image. This can be one of the built-in OpenCV methods, or manually written code that looks for pixel positions whose value is 255 (remember about OpenCV's fast pixel retrieval operations); a consolidated sketch of the whole pipeline follows below.
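Putting the steps together (the camera index and the use of findContours for the blob step are my assumptions; OpenCV 3/4 constant names):

#include <opencv2/opencv.hpp>
#include <vector>

int main()
{
    cv::VideoCapture cap(0);                  // camera index is a placeholder
    cv::Mat background, frame, absDiff, absDiffGray, absDiffGrayThres;
    cap >> background;                        // grab the static background once

    while (cap.read(frame)) {
        cv::absdiff(frame, background, absDiff);
        cv::cvtColor(absDiff, absDiffGray, cv::COLOR_BGR2GRAY);
        cv::threshold(absDiffGray, absDiffGrayThres, 0, 255,
                      cv::THRESH_BINARY | cv::THRESH_OTSU);

        // Step 4: blob detection; here the largest external contour would be the object.
        std::vector<std::vector<cv::Point>> contours;
        cv::findContours(absDiffGrayThres, contours,
                         cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
        // ... pick the biggest contour / its bounding box and use it ...
    }
    return 0;
}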
Such a process is fast enough to handle 640x480 RGB images at a frame rate of at least 30 fps on a fairly old Core 2 Duo 2.1 GHz with 4 GB RAM and no GPU support.
Hardware remark: make sure that your camera lens aperture is not set to auto-adjust. Imagine the following situation: you compute a background image at the beginning. Then some object appears and covers a bigger part of the camera view. Less light reaches the lens and, because of automatic light adjustment, the camera increases the aperture; the background brightness changes, and the difference produces a blob in a place where there actually is no object.