I want to get a part of an image loaded in another image. There are several, easy ways to do that but for example cv::Mat OutImage = Image(cv::Rect(7,47,1912,980)) but- the resulted image is to large For example:
I got an image with 1920 x 1024 pixel. I want to cut a cv:Rect(7,47,1912,980) from it. I would suggest, that the resulting image has the size (1912 - 7 = 1905) x (980 - 47 = 933) pixel but it has 1912 x 980. It seems, that Opencv is just cutting on the right lower side and keeping the left upper area.
The dimension of the image is important, because in the next step I'd like to perform a substraction which is only valid if the Mat object has the same dimension. I also don't want to use a loop designed by myself, because performance is very important.
Any ideas?
Regards,
Jan
It is actually cv:Rect(x,y,width,height), so you should set the last two parameters as your willing output width and height. Mind the range you set or it would cause errors.
I had also dealed with this issue I will just give my example here it is working for me well. You may also try this one.
Rect const box(100, 295, 400, 185); //this mean the first corner is
//(x,y)=(100,295)
// and the second corner is
//(x + b, y+c )= (100 +400,295+185)
Mat ROI = frame(box);
Related
I am writing a disparity matching algorithm using block matching, but I am not sure how to find the corresponding pixel values in the secondary image.
Given a square window of some size, what techniques exist to find the corresponding pixels? Do I need to use feature matching algorithms or is there a simpler method, such as summing the pixel values and determining whether they are within some threshold, or perhaps converting the pixel values to binary strings where the values are either greater than or less than the center pixel?
I'm going to assume you're talking about Stereo Disparity, in which case you will likely want to use a simple Sum of Absolute Differences (read that wiki article before you continue here). You should also read this tutorial by Chris McCormick before you read more here.
side note: SAD is not the only method, but it's really common and should solve your problem.
You already have the right idea. Make windows, move windows, sum pixels, find minimums. So I'll give you what I think might help:
To start:
If you have color images, first you will want to convert them to black and white. In python you might use a simple function like this per pixel, where x is a pixel that contains RGB.
def rgb_to_bw(x):
return int(x[0]*0.299 + x[1]*0.587 + x[2]*0.114)
You will want this to be black and white to make the SAD easier to computer. If you're wondering why you don't loose significant information from this, you might be interested in learning what a Bayer Filter is. The Bayer Filter, which is typically RGGB, also explains the multiplication ratios of the Red, Green, and Blue portions of the pixel.
Calculating the SAD:
You already mentioned that you have a window of some size, which is exactly what you want to do. Let's say this window is n x n in size. You would also have some window in your left image WL and some window in your right image WR. The idea is to find the pair that has the smallest SAD.
So, for each left window pixel pl at some location in the window (x,y) you would the absolute value of difference of the right window pixel pr also located at (x,y). you would also want some running value, which is the sum of these absolute differences. In sudo code:
SAD = 0
from x = 0 to n:
from y = 0 to n:
SAD = SAD + absolute_value|pl - pr|
After you calculate the SAD for this pair of windows, WL and WR you will want to "slide" WR to a new location and calculate another SAD. You want to find the pair of WL and WR with the smallest SAD - which you can think of as being the most similar windows. In other words, the WL and WR with the smallest SAD are "matched". When you have the minimum SAD for the current WL you will "slide" WL and repeat.
Disparity is calculated by the distance between the matched WL and WR. For visualization, you can scale this distance to be between 0-255 and output that to another image. I posted 3 images below to show you this.
Typical Results:
Left Image:
Right Image:
Calculated Disparity (from the left image):
you can get test images here: http://vision.middlebury.edu/stereo/data/scenes2003/
I have an image here with a table.. In the column on the right the background is filled with noise
How to detect the areas with noise? I only want to apply some kind of filter on the parts with noise because I need to do OCR on it and any kind of filter will reduce the overall recognition
And what kind of filter is the best to remove the background noise in the image?
As said I need to do OCR on the image
I tried some filters/operations in OpenCV and it seems to work pretty well.
Step 1: Dilate the image -
kernel = np.ones((5, 5), np.uint8)
cv2.dilate(img, kernel, iterations = 1)
As you see, the noise is gone but the characters are very light, so I eroded the image.
Step 2: Erode the image -
kernel = np.ones((5, 5), np.uint8)
cv2.erode(img, kernel, iterations = 1)
As you can see, the noise is gone however some characters on the other columns are broken. I would recommend running these operations on the noisy column only. You might want to use HoughLines to find the last column. Then you can extract that column only, run dilation + erosion and replace this with the corresponding column in the original image.
Additionally, dilation + erosion is actually an operation called closing. This you could call directly using -
cv2.morphologyEx(img, cv2.MORPH_CLOSE, kernel)
As #Ermlg suggested, medianBlur with a kernel of 3 also works wonderfully.
cv2.medianBlur(img, 3)
Alternative Step
As you can see all these filters work but it is better if you implement these filters only in the part where the noise is. To do that, use the following:
edges = cv2.Canny(img, 50, 150, apertureSize = 3) // img is gray here
lines = cv2.HoughLinesP(edges, 1, np.pi / 180, 100, 1000, 50) // last two arguments are minimum line length and max gap between two lines respectively.
for line in lines:
for x1, y1, x2, y2 in line:
print x1, y1
// This gives the start coordinates for all the lines. You should take the x value which is between (0.75 * w, w) where w is the width of the entire image. This will give you essentially **(x1, y1) = (1896, 766)**
Then, you can extract this part only like :
extract = img[y1:h, x1:w] // w, h are width and height of the image
Then, implement the filter (median or closing) in this image. After removing the noise, you need to put this filtered image in place of the blurred part in the original image.
image[y1:h, x1:w] = median
This is straightforward in C++ :
extract.copyTo(img, new Rect(x1, y1, w - x1, h - y1))
Final Result with alternate method
Hope it helps!
My solution is based on thresholding to get the resulted image in 4 steps.
Read image by OpenCV 3.2.0.
Apply GaussianBlur() to smooth image especially the region in gray color.
Mask the image to change text to white and the rest to black.
Invert the masked image to black text in white.
The code is in Python 2.7. It can be changed to C++ easily.
import numpy as np
import cv2
import matplotlib.pyplot as plt
%matplotlib inline
# read Danish doc image
img = cv2.imread('./imagesStackoverflow/danish_invoice.png')
# apply GaussianBlur to smooth image
blur = cv2.GaussianBlur(img,(5,3), 1)
# threshhold gray region to white (255,255, 255) and sets the rest to black(0,0,0)
mask=cv2.inRange(blur,(0,0,0),(150,150,150))
# invert the image to have text black-in-white
res = 255 - mask
plt.figure(1)
plt.subplot(121), plt.imshow(img[:,:,::-1]), plt.title('original')
plt.subplot(122), plt.imshow(blur, cmap='gray'), plt.title('blurred')
plt.figure(2)
plt.subplot(121), plt.imshow(mask, cmap='gray'), plt.title('masked')
plt.subplot(122), plt.imshow(res, cmap='gray'), plt.title('result')
plt.show()
The following is the plotted images by the code for reference.
Here is the result image at 2197 x 3218 pixels.
As I know the median filter is the best solution to reduce noise. I would recommend to use median filter with 3x3 window. See function cv::medianBlur().
But be careful when use any noise filtration simultaneously with OCR. Its can lead to decreasing of recognition accuracy.
Also I would recommend to try using pair of functions (cv::erode() and cv::dilate()). But I'm not shure that it will best solution then cv::medianBlur() with window 3x3.
I would go with median blur (probably 5*5 kernel).
if you are planning to apply OCR the image. I would advise you to the following:
Filter the image using Median Filter.
Find contours in the filtered image, you will get only text contours (Call them F).
Find contours in the original image (Call them O).
isolate all contours in O that have intersection with any contour in F.
Faster solution:
Find contours in the original image.
Filter them based on size.
Blur (3x3 box)
Threshold at 127
Result:
If you are very worried of removing pixels that could hurt your OCR detection. Without adding artefacts ea be as pure to the original as possible. Then you should create a blob filter. And delete any blobs that are smaller then n pixels or so.
Not going to write code, but i know this works great as i use this myself, though i dont use openCV (i wrote my own multithreaded blobfilter out of speed reasons). And sorry but i cannot share my code here. Just describing how to do it.
If processing time is not an issue, a very effective method in this case would be to compute all black connected components, and remove those smaller than a few pixels. It would remove all the noisy dots (apart those touching a valid component), but preserve all characters and the document structure (lines and so on).
The function to use would be connectedComponentWithStats (before you probably need to produce the negative image, the threshold function with THRESH_BINARY_INV would work in this case), drawing white rectangles where small connected components where found.
In fact, this method could be used to find characters, defined as connected components of a given minimum and maximum size, and with aspect ratio in a given range.
I had already faced the same issue and got the best solution.
Convert source image to grayscale image and apply fastNlMeanDenoising function and then apply threshold.
Like this -
fastNlMeansDenoising(gray,dst,3.0,21,7);
threshold(dst,finaldst,150,255,THRESH_BINARY);
ALSO use can adjust threshold accorsing to your background noise image.
eg- threshold(dst,finaldst,200,255,THRESH_BINARY);
NOTE - If your column lines got removed...You can take a mask of column lines from source image and can apply to the denoised resulted image using BITWISE operations like AND,OR,XOR.
Try thresholding the image like this. Make sure your src is in grayscale. This method will only retain the pixels which are between 150 and 255 intensity.
threshold(src, output, 150, 255, CV_THRESH_BINARY | CV_THRESH_OTSU);
You might want to invert the image as you are trying to negate the gray pixels. After the operation, invert it again to get your desired result.
I am totally new to OpenCV and I have started to dive into it. But I'd need a little bit of help.
So I want to combine these 2 images:
I would like the 2 images to match along their edges (ignoring the very right part of the image for now)
Can anyone please point me into the right direction? I have tried using the findTransformECC function. Here's my implementation:
cv::Mat im1 = [imageArray[1] CVMat3];
cv::Mat im2 = [imageArray[0] CVMat3];
// Convert images to gray scale;
cv::Mat im1_gray, im2_gray;
cvtColor(im1, im1_gray, CV_BGR2GRAY);
cvtColor(im2, im2_gray, CV_BGR2GRAY);
// Define the motion model
const int warp_mode = cv::MOTION_AFFINE;
// Set a 2x3 or 3x3 warp matrix depending on the motion model.
cv::Mat warp_matrix;
// Initialize the matrix to identity
if ( warp_mode == cv::MOTION_HOMOGRAPHY )
warp_matrix = cv::Mat::eye(3, 3, CV_32F);
else
warp_matrix = cv::Mat::eye(2, 3, CV_32F);
// Specify the number of iterations.
int number_of_iterations = 50;
// Specify the threshold of the increment
// in the correlation coefficient between two iterations
double termination_eps = 1e-10;
// Define termination criteria
cv::TermCriteria criteria (cv::TermCriteria::COUNT+cv::TermCriteria::EPS, number_of_iterations, termination_eps);
// Run the ECC algorithm. The results are stored in warp_matrix.
findTransformECC(
im1_gray,
im2_gray,
warp_matrix,
warp_mode,
criteria
);
// Storage for warped image.
cv::Mat im2_aligned;
if (warp_mode != cv::MOTION_HOMOGRAPHY)
// Use warpAffine for Translation, Euclidean and Affine
warpAffine(im2, im2_aligned, warp_matrix, im1.size(), cv::INTER_LINEAR + cv::WARP_INVERSE_MAP);
else
// Use warpPerspective for Homography
warpPerspective (im2, im2_aligned, warp_matrix, im1.size(),cv::INTER_LINEAR + cv::WARP_INVERSE_MAP);
UIImage* result = [UIImage imageWithCVMat:im2_aligned];
return result;
I have tried playing around with the termination_eps and number_of_iterations and increased/decreased those values, but they didn't really make a big difference.
So here's the result:
What can I do to improve my result?
EDIT: I have marked the problematic edges with red circles. The goal is to warp the bottom image and make it match with the lines from the image above:
I did a little bit of research and I'm afraid the findTransformECC function won't give me the result I'd like to have :-(
Something important to add:
I actually have an array of those image "stripes", 8 in this case, they all look similar to the images shown here and they all need to be processed to match the line. I have tried experimenting with the stitch function of OpenCV, but the results were horrible.
EDIT:
Here are the 3 source images:
The result should be something like this:
I transformed every image along the lines that should match. Lines that are too far away from each other can be ignored (the shadow and the piece of road on the right portion of the image)
By your images, it seems that they overlap. Since you said the stitch function didn't get you the desired results, implement your own stitching. I'm trying to do something close to that too. Here is a tutorial on how to implement it in c++: https://ramsrigoutham.com/2012/11/22/panorama-image-stitching-in-opencv/
You can use Hough algorithm with high threshold on two images and then compare the vertical lines on both of them - most of them should be shifted a bit, but keep the angle.
This is what I've got from running this algorithm on one of the pictures:
Filtering out horizontal lines should be easy(as they are represented as Vec4i), and then you can align the remaining lines together.
Here is the example of using it in OpenCV's documentation.
UPDATE: another thought. Aligning the lines together can be done with the concept similar to how cross-correlation function works. Doesn't matter if picture 1 has 10 lines, and picture 2 has 100 lines, position of shift with most lines aligned(which is, mostly, the maximum for CCF) should be pretty close to the answer, though this might require some tweaking - for example giving weight to every line based on its length, angle, etc. Computer vision never has a direct way, huh :)
UPDATE 2: I actually wonder if taking bottom pixels line of top image as an array 1 and top pixels line of bottom image as array 2 and running general CCF over them, then using its maximum as shift could work too... But I think it would be a known method if it worked good.
I want to merge 2 images. How can i remove the same area between 2 images?
Can you tell me an algorithm to solve this problem. Thanks.
Two image are screenshoot image. They have the same width and image 1 always above image 2.
When two images have the same width and there is no X-offset at the left side this shouldn't be too difficult.
You should create two vectors of integer and store the CRC of each pixel row in the corresponding vector element. After doing this for both pictures you find the CRC of the first line of the lower image in the first vector. This is the offset in the upper picture. Then you check that all following CRCs from both pictures are identical. If not, you have to look up the next occurrence of the initial CRC in the upper image again.
After checking that the CRCs between both pictures are identical when you apply the offset you can use the bitblit function of your graphics format and build the composite picture.
I haven't come across something similar before but I think the following might work:
Convert both to grey-scale.
Enhance the contrast, the grey box might become white for example and the text would become more black. (This is just to increase the confidence in the next step)
Apply some threshold, converting the pictures to black and white.
afterwards, you could find the similar areas (and thus the offset of overlap) with a good degree of confidence. To find the similar parts, you could harper's method (which is good but I don't know how reliable it would be without the said filtering), or you could apply some DSP operation(s) like convolution.
Hope that helps.
If your images are same width and image 1 is always on top. I don't see how that hard could it be..
Just store the bytes of the last line of image 1.
from the first line to the last of the image 2, make this test :
If the current line of image 2 is not equal to the last line of image 1 -> continue
else -> break the loop
you have to define a new byte container for your new image :
Just store all the lines of image 1 + all the lines of image 2 that start at (the found line + 1).
What would make you sweat here is finding the libraries to manipulate all these data structures. But after a few linkage and documentation digging, you should be able to easily implement that.
I want to do:
masked = image - mask
But I want to "displace" mask. That is, move it vertically and horizontally (as long as the intersection between it and image is not empty, this would be valid).
I have some hand-coded assembly (which uses MMX instructions) which does this, embedded in a C++ program, but it's unstable when doing vertical displacement, so I thought of using OpenCV instead. Would it be possible to do this calling only one OpenCV function?
Performance is critical; using OpenCV, time should be at least in the same order of magnitude as the assembly code.
EDIT: Here's an example
image (medium frame, see the contrast in the guy's skull):
mask (first frame, no contrast):
image - mask, without displacement. Notice how the contrast path is enhanced, but since the patient moved a little, we can see some skull contours which are visual noise for diagnostic purposes.
image - mask, mask displaced about 5 pixels down. To try and compensate for the noise introduced by the patient's movement, we "displace" the mask slightly so as to remove the contours and see the contrast path better (brightness and contrast were adjusted, that's why it looks a bit darker).
EDIT 2: About the algorithm, I managed to fix its issues. It doesn't crash anymore, but the downside is that it now processes all image pixels (it should only process those which need to be subtracted). Anyway, how to fix the old code is not my question; my question is, how do I do this processing using OpenCV? I'll post some profiling results later.
I know this is in Python, so not what you are after, but translating it to C++ should be very straight forward. It crops both images to matching sizes (required for nearly all operations), determined by the displacement between the images, and their relative sizes. This method should be quick, as cv.GetSubRect doesn't copy anything, so its just down to the cv.AbsDiff function (if you have an actual difference mask, you could use cv.Sub which should make it even quicker). Also this code will handle displacement in any direction and mask and image can be any size (mask can be larger than image). There must be an overlap for a specified displacement. The difference between images can be viewed alone, or the difference 'in-place'.
A nice diagram to illustrate whats going on. The first two squares are example image and mask. The next three squares show a horizontal displacement of the 'mask' of -30, 0, and 30 pixels, and the last one has a displacement of 20, 20.
import cv
image = cv.LoadImageM("image.png")
mask = cv.LoadImageM("mask.png")
image = cv.LoadImageM("image2.png")
mask = cv.LoadImageM("small_mask.png")
image_width, image_height = cv.GetSize(image)
mask_width, mask_height = cv.GetSize(mask)
#displacements here:
horiz_disp = 20
vert_disp = 20
image_horiz = mask_horiz = image_vert = mask_vert = 0
if vert_disp < 0:
mask_vert = abs(vert_disp)
sub_height = min(mask_height + vert_disp, image_height)
else:
sub_height = min(mask_height, image_height - vert_disp)
image_vert = vert_disp
if horiz_disp < 0:
mask_horiz = abs(horiz_disp)
sub_width = min(mask_width + horiz_disp, image_width)
else:
sub_width = min(mask_width, image_width - horiz_disp)
image_horiz = horiz_disp
#cv.GetSubRect returns a rectangular part of an image, without copying any data. - fast.
mask_sub = cv.GetSubRect(mask, (mask_horiz, mask_vert, sub_width, sub_height))
image_sub = cv.GetSubRect(image, (image_horiz, image_vert, sub_width, sub_height))
#Subtracts the mask overlap region from the image overlap region, puts it in image_sub
cv.AbsDiff(image_sub, mask_sub, image_sub)
# Shows diff only:
cv.ShowImage('image_sub', image_sub)
# Shows image with diff section
cv.ShowImage('image', image)
cv.WaitKey(0)