Portion of image sensor used for 1080p video - computer-vision

I'm trying to get the width and height of the sensor area used when recording 1080p video for an image processing application with a Raspberry Pi camera. I have noticed that the field of view changes between 1080p video and a 1080p still image, even though the resolution is the same. I believe this is done because of bit-rate constraints of H.264 video.
All of these observations make me confused as to how I can calculate the correct width and height in mm when using 1080p video. The Raspberry Pi camera spec says:
sensor resolution - 2592 x 1944 pixels
sensor dimensions - 3.76 x 2.74 mm
Would a straightforward linear interpolation be accurate, e.g. (3.76 * 1920 / 2592)? But then it seems the image can be scaled as well, in either the video or the still image format.
Note: I have calibrated the camera and have all intrinsic values in pixel units. My effort here is to convert all of these into mm.

Just calibrate the camera for the mode you want to use.
The width and height of your sensor is given in the specs.
It also gives you a pixel size of 1.4µm x 1.4µm. If it were not given, you could calculate it by dividing the sensor width by the image width in pixels; same for the height.
And it says that there is cropping in 1080p mode. This means that only a region of your sensor is used. Simply multiply your image width and height by the pixel size and you'll get the size of the sensor area that is used for 1080p.
To get the position of that area take a picture of the same scene in 1080p and in full resolution and compare them.
Not sure about the scaling. You did not provide sufficient information here.
You can either calibrate your camera in 1080p mode, or calibrate it at full resolution and correct the pixel positions by a translation offset. Pixel size and physical pixel positions do not change through cropping...
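As a rough numeric sketch of that (assuming the 1.4 µm pixel pitch from the spec and that 1080p mode is a pure crop with no scaling; fx_px below is a placeholder for a value from your own calibration):

// Sketch, assuming 1.4 um pixel pitch and a pure crop (no binning/scaling) in 1080p mode.
const double pixelSize    = 0.0014;            // mm per pixel (1.4 um)
const double cropWidthMm  = 1920 * pixelSize;  // ~2.69 mm of sensor used horizontally
const double cropHeightMm = 1080 * pixelSize;  // ~1.51 mm of sensor used vertically

// Converting calibrated intrinsics from pixel units to mm is then just a scale.
double fx_px = 0.0;                            // placeholder: your calibrated focal length in pixels
double fx_mm = fx_px * pixelSize;              // same focal length expressed in mm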

Related

How will the Camera Intrinsics change if an image is cropped/resized?

I have a recorded ROS bag file from a RealSense camera. The camera intrinsics for the recorded setting are already known. The initial resolution of the image is 848*480. Because of a visual obstruction in the FOV of the camera, I would like to crop out the top of the image so it doesn't get detected by the visual SLAM algorithm I am using.
Since SLAM is heavily dependent on the Camera Intrinsics, I would like to know how will the camera parameters f_x, f_y, c_x and c_y change for :
Cropped Image
Resized Image (Image Scaling only)
There is no skew involved in the original camera parameters.
Will the new principal point c_x also change as Cropped_image_width?
I am a bit confused as to how to calculate the new camera parameters. Am I correct in assuming the following for Case 1, the cropped case?
Cropping:
cx, cy are reduced by the number of pixels cropped off the left/top. Cropping the right/bottom edges has no effect.
Scaling:
fx,fy are multiplied by the scaling factor
cx,cy are multiplied by the scaling factor
Remember, the principal point need not be in the center of the image.
Assuming top left image origin. Off-by-one/half (pixel) errors to be checked carefully against the specific scaling algorithm used, or just ignored.
For case no. 1, cx is the same as in the original image, but cy changes as newHeight/2.
For case no. 2, f changes as f * scale.
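A small sketch of the two rules from the answer above (function and variable names are my own; the crop offsets are the number of pixels removed from the left and top edges):

// Sketch: adjusting pinhole intrinsics for cropping and for uniform scaling.
// fx, fy, cx, cy are the original calibrated values; all names are illustrative.
void adjustIntrinsics(double fx, double fy, double cx, double cy,
                      int cropLeft, int cropTop, double sx, double sy,
                      double& fxNew, double& fyNew, double& cxNew, double& cyNew)
{
    // 1. Cropping: only the shift of the image origin matters.
    cxNew = cx - cropLeft;   // pixels removed from the left edge
    cyNew = cy - cropTop;    // pixels removed from the top edge
    fxNew = fx;              // focal lengths are unchanged by cropping
    fyNew = fy;

    // 2. Scaling by factors sx, sy (applied after the crop, if any):
    fxNew *= sx;  cxNew *= sx;
    fyNew *= sy;  cyNew *= sy;
}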

How to encode grayscale video in libvpx (webm)?

I have a stream of raw images that comes from a network grayscale camera that we are developing. In this case, our images are arrays of 8-bit pixels (640x480). Since this camera outputs more than 200 frames per second, I need to store these images as a WebM video as quickly as possible, in order not to lose any frames.
What is the best way of doing that, using libvpx?
The fastest and easiest thing to do would be to feed the grayscale plane directly into the libvpx compression function vpx_codec_encode as VPX_IMG_FMT_I420. You'll have to supply two 2x2-subsampled chroma planes with it though - 320x240 in your case - and make all the bytes of those planes have the value 128 (neutral chroma).
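A rough sketch of that approach (assuming a vpx_codec_ctx_t already initialised with vpx_codec_enc_init; the function name and error handling are illustrative, and the encoded packets still need to be drained with vpx_codec_get_cx_data and muxed into a WebM container):

#include <vpx/vpx_encoder.h>
#include <cstring>

// Sketch: wrap one 640x480 8-bit grayscale frame as I420 with neutral chroma
// and hand it to an already-initialised encoder context `codec`.
void encode_gray_frame(vpx_codec_ctx_t *codec, const unsigned char *gray,
                       int width, int height, vpx_codec_pts_t pts)
{
    vpx_image_t img;
    vpx_img_alloc(&img, VPX_IMG_FMT_I420, width, height, 1);

    // Copy the grayscale data into the Y plane row by row (strides may differ).
    for (int y = 0; y < height; ++y)
        std::memcpy(img.planes[VPX_PLANE_Y] + y * img.stride[VPX_PLANE_Y],
                    gray + y * width, width);

    // Fill the 2x2-subsampled U and V planes with 128 => no colour information.
    std::memset(img.planes[VPX_PLANE_U], 128, img.stride[VPX_PLANE_U] * height / 2);
    std::memset(img.planes[VPX_PLANE_V], 128, img.stride[VPX_PLANE_V] * height / 2);

    vpx_codec_encode(codec, &img, pts, 1, 0, VPX_DL_REALTIME);
    // Fetch compressed packets with vpx_codec_get_cx_data() and write them out.
    vpx_img_free(&img);
}

For 200+ fps you would normally allocate the vpx_image_t once and reuse it instead of allocating per frame.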

How to map thermal image (Flir A325sc) with Asus XTion Pro Live IR image under OpenCV+ROS

I want to map the thermal image of the Flir with the depth image of the XTion.
As the depth image is calculated from the XTion's IR camera, I want to map the Flir image to the XTion's IR image.
Therefore I placed both cameras on one plane, close to each other (offset about 7 cm in x, 1 cm in y and 3 cm in z).
Then I used ROS Indigo and OpenCV 2.4.9 to:
Set the Flir focus to a fixed one (no autofocus)
Get both images synchronized.
Resize the XTion IR image from 640x480 to 320x240 pixels, to match the Flir image
Calculate the intrinsic camera parameters for both cameras. (Flir + Xtion IR)
Calculate the extrinsic parameters
Remap both images to get the rectified images
I now have the two rectified images, but there is still an offset in X (horizontal direction).
If I understand it correctly, the offset is due to the different focal lengths and fields of view (Flir with objective: 45° H x 33.8° V and 9.66 mm focal length; XTion: 58° H x 45° V), and I could solve the problem with a perspective transform, but I don't have both focal lengths in mm.
The datasheets:
http://support.flir.com/DsDownload/Assets/48001-0101_en_40.pdf
https://www.imc-store.com.au/v/vspfiles/assets/images/1196960_en_51.pdf
http://www.asus.com/us/Multimedia/Xtion_PRO_LIVE/specifications/
I had the idea to get the focal lengths with cv::calibrationMatrixValues, but I don't know the apertureWidth and apertureHeight.
Cross-Post
How could I solve this problem?
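One possible direction, not taken from the question itself: if the physical pixel pitch can be read from each sensor's datasheet, apertureWidth and apertureHeight are just the pitch times the number of pixels, and cv::calibrationMatrixValues then reports the focal length in mm. A sketch, with the pitch left as a placeholder to be filled in from the datasheet:

#include <opencv2/calib3d/calib3d.hpp>

// Sketch: recover the focal length in mm from a calibrated camera matrix.
// pixelPitchMm is a placeholder - take it from the sensor datasheet.
double focalLengthMm(const cv::Mat& cameraMatrix, cv::Size imageSize, double pixelPitchMm)
{
    double apertureWidth  = imageSize.width  * pixelPitchMm; // physical sensor width in mm
    double apertureHeight = imageSize.height * pixelPitchMm; // physical sensor height in mm
    double fovx, fovy, focalLength, aspectRatio;
    cv::Point2d principalPoint;
    cv::calibrationMatrixValues(cameraMatrix, imageSize, apertureWidth, apertureHeight,
                                fovx, fovy, focalLength, principalPoint, aspectRatio);
    return focalLength; // in mm
}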

OpenCV: Denoising image / video frame

I want to denoise a video using OpenCV and C++. I found this on the OpenCV documentation site:
fastNlMeansDenoising(contourImage,contourImage2);
Every time a new frame is loaded, my program should denoise the current frame (contourImage) and write it to contourImage2.
But if I run the code, it returns 0 and exits. What am I doing wrong or is there an alternative way to denoise an image? (It should be fast, because I am processing a video)
While you are using C++, you are not providing the full arguments. Try it this way:
cv::fastNlMeansDenoisingColored(contourImage, contourImage2, 10, 10,7, 21);
// This is the original function signature, from the OpenCV docs:
cv::fastNlMeansDenoising(src[, dst[, h[, templateWindowSize[, searchWindowSize]]]]) → dst
Parameters:
src – Input 8-bit 1-channel, 2-channel or 3-channel image.
dst – Output image with the same size and type as src .
templateWindowSize – Size in pixels of the template patch that is used to compute weights. Should be odd. Recommended value 7 pixels.
searchWindowSize – Size in pixels of the window that is used to compute the weighted average for a given pixel. Should be odd. Affects performance linearly: greater searchWindowSize – greater denoising time. Recommended value 21 pixels.
h – Parameter regulating filter strength. Big h value perfectly removes noise but also removes image details, smaller h value preserves details but also preserves some noise
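For the video case, a minimal frame loop along those lines might look like this (the file name and parameter values are illustrative; note that non-local means denoising is fairly slow, so it may limit your frame rate):

#include <opencv2/opencv.hpp>

int main()
{
    cv::VideoCapture cap("input.avi");      // illustrative file name
    if (!cap.isOpened()) return -1;         // bail out if the video cannot be read

    cv::Mat frame, denoised;
    while (cap.read(frame))
    {
        // Colour frames go through the coloured variant; parameters as in the answer above.
        cv::fastNlMeansDenoisingColored(frame, denoised, 10, 10, 7, 21);
        cv::imshow("denoised", denoised);
        if (cv::waitKey(1) == 27) break;    // Esc to quit
    }
    return 0;
}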

Background extraction

Can anyone suggest me a fast way of getting the foreground image?
Currently I am using the BackgroundSubtractorMOG2 class to do this. It is very slow, and my task doesn't need such a complex algorithm.
I can get an image of the background at the beginning. The camera position will not change, so I believe there is an easy way to do this.
I need to capture a blob of the object moving in front of the camera, and there will always be only one object.
I suggest the following simple solution:
Compute difference matrix:
cv::absdiff(frame, background, absDiff);
This sets each pixel (i,j) in absDiff to |frame(i,j) - background(i,j)|. Each channel (e.g. R, G, B) is processed independently.
Convert the result to a single-channel grayscale image:
cv::cvtColor(absDiff, absDiffGray, cv::COLOR_BGR2GRAY);
Apply binary filter:
cv::threshold(absDiffGray, absDiffGrayThres, 0, 255, CV_THRESH_BINARY | CV_THRESH_OTSU);
Here we use Otsu's method to determine an appropriate threshold level. If there was any noise left from step 2, the binary filter will remove it.
Apply blob detection to the absDiffGrayThres image. This can be one of the built-in OpenCV methods or manually written code that looks for pixel positions whose value is 255 (remember to use fast OpenCV pixel-access operations). A combined sketch of these steps is given after the notes below.
This process is fast enough to handle 640x480 RGB images at a frame rate of at least 30 fps on a fairly old Core 2 Duo 2.1 GHz with 4 GB RAM, without GPU support.
Hardware remark: be sure that your camera's aperture is not set to auto-adjust. Imagine the following situation: you compute a background image at the beginning. Then some object appears and covers a bigger part of the camera view. Less light reaches the lens and, because of automatic exposure adjustment, the camera opens the aperture, the background colour changes, and the difference gives a blob in a place where there is actually no object.
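Putting the steps above together, a minimal sketch could look like this (the camera index, variable names and the contour-based blob extraction are illustrative):

#include <opencv2/opencv.hpp>
#include <vector>

int main()
{
    cv::VideoCapture cap(0);                 // illustrative camera index
    cv::Mat background, frame;
    cap >> background;                       // grab the static background once

    while (cap.read(frame))
    {
        cv::Mat absDiff, absDiffGray, absDiffGrayThres;
        cv::absdiff(frame, background, absDiff);                        // step 1: difference
        cv::cvtColor(absDiff, absDiffGray, cv::COLOR_BGR2GRAY);         // step 2: to grayscale
        cv::threshold(absDiffGray, absDiffGrayThres, 0, 255,
                      CV_THRESH_BINARY | CV_THRESH_OTSU);               // step 3: Otsu threshold

        // Step 4: take the largest contour as the single moving object's blob.
        std::vector<std::vector<cv::Point> > contours;
        cv::findContours(absDiffGrayThres.clone(), contours,
                         CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);
        // ... pick the contour with the largest cv::contourArea() as the blob ...
        if (cv::waitKey(1) == 27) break;     // Esc to quit
    }
    return 0;
}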