I am using the Python 2.7 OpenCV library to calculate histograms of some images, all of the exact same size (cv2.calcHist).
I need to do two things:
1. Calculate the average over multiple images that represent a similar object, so that I have a "representative" histogram of that object for future comparisons (if you have a better idea, I am open to suggestions).
2. Store the histogram data in my MongoDB for future comparisons (cv2 correlation).
The only code I see relevant to the question is my histogram_comparison code:
def histogram_comparison(real, fake):
    images = [real, fake]
    index = []
    for image in images:
        image = image.decode('base64')
        image = np.fromstring(image, dtype=np.uint8)
        image = cv2.imdecode(image, 1)
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        hist = cv2.calcHist([image], [0, 1, 2], None, [32, 32, 32],
                            [0, 256, 0, 256, 0, 256])
        hist = cv2.normalize(hist).flatten()
        index.append(hist)
    result_dist = cv2.compareHist(index[0], index[1], cv2.cv.CV_COMP_CORREL)
    return round(result_dist, 5)
taken from: http://www.pyimagesearch.com/2014/07/14/3-ways-compare-histograms-using-opencv-python/
I do realize that when using NumPy's (or was it SciPy's?) histograms, there is an easy way to get the bins and average them, but then I'm not really sure how to compare the resulting histograms, so I would rather stay with OpenCV.
Thanks in advance.
Since OpenCV (from version 2.2 onwards) natively uses NumPy arrays, and since len(images) is constant, you can average all your histograms and store the result in Mongo simply with:
h, b = np.histogram(images, bins=[0, 256])
db.histograms.insert({'hist': (h / len(images)).tolist(), 'bins': b.tolist()})
I do not know if it's exactly what you want, but I hope it helps! See ya!
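If you want to keep everything in OpenCV's own histogram format, here is a minimal sketch (my addition, not part of the answer above). It assumes hists is a list of the flattened, normalized cv2.calcHist histograms produced as in histogram_comparison(), and db, some_id, doc and new_hist are placeholder names: average the histograms, store the average as a plain list (BSON cannot encode NumPy arrays), and compare later with cv2.compareHist.
import cv2
import numpy as np

# `hists`: list of flattened, normalized histograms, one per image of the object
avg_hist = np.mean(np.vstack(hists), axis=0).astype(np.float32)

# store as a plain Python list so pymongo can serialize it
db.histograms.insert({'object_id': some_id, 'hist': avg_hist.tolist()})

# later: load the stored histogram and compare it against a new image's histogram
stored = np.array(doc['hist'], dtype=np.float32)
score = cv2.compareHist(stored, new_hist, cv2.cv.CV_COMP_CORREL)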
To whom it may concern,
I am pretty new to TensorFlow. I am trying to solve the famous MNIST problem with a CNN, but I have encountered difficulty when I have to reshuffle the training data x_train (which has shape [40000, 28, 28, 1]).
my code is as below:
x_train_final = tf.reshape(x_train_final, [-1, image_width, image_width, 1])
x_train_final = tf.cast(x_train_final, dtype=tf.float32)
perm = np.arange(num_training_example).astype(np.int32)
np.random.shuffle(perm)
x_train_final = x_train_final[perm]
The following error occurred:
ValueError: Shape must be rank 1 but is rank 2 for 'strided_slice_1371' (op: 'StridedSlice') with input shapes: [40000,28,28,1], [1,40000], [1,40000], [1].
Can anyone advise how I can work around this? Thanks.
I would suggest you make use of scikit-learn's shuffle function.
from sklearn.utils import shuffle
x_train_final = shuffle(x_train_final)
Also, you can pass in multiple arrays, and the shuffle function will reorganize (shuffle) the data in all of them while maintaining the same shuffling order across the arrays. So with that, you can even pass in your label dataset as well.
Ex:
X_train, y_train = shuffle(X_train, y_train)
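If you would rather stay inside TensorFlow, here is an alternative sketch (reusing x_train_final and num_training_example from the question; y_train_final is a hypothetical label tensor): TensorFlow tensors don't support NumPy-style fancy indexing with an index array, which is what caused the rank error above, but tf.gather applies the same permutation along the first axis.
import numpy as np
import tensorflow as tf

perm = np.random.permutation(num_training_example)
x_train_final = tf.gather(x_train_final, perm)
y_train_final = tf.gather(y_train_final, perm)  # keep labels in the same shuffled order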
I am trying to train a custom object detector using the HOG+SVM method with OpenCV.
I have managed to extract HOG features from my positive and negative samples using the code below:
import cv2
import numpy as np

hog = cv2.HOGDescriptor()

def poshoggify():
    for i in range(1, 20):
        image = cv2.imread("/Users/munirmalik/cvprojek/cod/pos/" + str(i) + ".jpg")
        (winW, winH) = (500, 500)
        # pyramid() and sliding_window() are helper functions defined elsewhere
        for resized in pyramid(image, scale=1.5):
            # loop over the sliding window for each layer of the pyramid
            for (x, y, window) in sliding_window(resized, stepSize=32, windowSize=(winW, winH)):
                # if the window does not meet our desired window size, ignore it
                if window.shape[0] != winH or window.shape[1] != winW:
                    continue
        img_pos = hog.compute(image)
        np.savetxt('posdata.txt', img_pos)
    return img_pos
And the equivalent function for the negative samples.
How do I format the data in such a way that the SVM knows which is positive and which is negative?
Furthermore, how do I translate this training to the "test" of detecting the desired objects through my webcam?
How do I format the data in such a way that the SVM knows which is positive and which is negative?
You would now create another list called labels which would store the class value associated with a corresponding image. For example, if you have a training set of features that looks like this:
features = [pos_features1, pos_features2, neg_features1, neg_features2, neg_features3, neg_features4]
you would have a corresponding labels list like
labels = [1, 1, 0, 0, 0, 0]
You would then feed this to a classifier like so:
from sklearn.svm import LinearSVC

clf = LinearSVC(C=1.0, class_weight='balanced')
clf.fit(features, labels)
Furthermore, how do I translate this training to the "test" of detecting the desired objects through my webcam?
Before training, you should have split your labelled dataset (ground truth) into training and testing datasets. You can do this using scikit-learn's KFold module.
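A minimal sketch of that split step (my addition; it assumes features and labels are the lists built above and uses scikit-learn's current model_selection.KFold with an arbitrary fold count):
import numpy as np
from sklearn.model_selection import KFold
from sklearn.svm import LinearSVC

features = np.array(features, dtype=np.float32)
labels = np.array(labels)

kf = KFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, test_idx in kf.split(features):
    clf = LinearSVC(C=1.0, class_weight='balanced')
    clf.fit(features[train_idx], labels[train_idx])
    # accuracy on the held-out fold
    print(clf.score(features[test_idx], labels[test_idx]))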
I saw the example of feature extraction in the Keras docs and used the following code to extract features from an input image:
import numpy as np
from keras.applications.vgg16 import VGG16, preprocess_input
from keras.preprocessing import image

input_shape = (224, 224, 3)
model = VGG16(weights='imagenet',
              input_shape=(input_shape[0], input_shape[1], input_shape[2]),
              pooling='max', include_top=False)
img = image.load_img(img_path, target_size=(input_shape[0], input_shape[1]))
img = image.img_to_array(img)
img = np.expand_dims(img, axis=0)
img = preprocess_input(img)
feature = model.predict(img)
Then when I output the shape of the feature variable, I found it is (1, 512). Why is it this dimension?
Printing model.summary() shows that the shape of the last conv layer's output after max pooling is (7, 7, 512), which is the dimension I expected feature to have.
Thanks to Yong Yuan for helping me figure this out. Since he has some problems answering questions on SO, I just put his answer here in case other people have the same question.
Basically it's because there is a global max pooling layer specified in this model (as we can see in the line model = VGG16(..., pooling='max', ...)), which takes the largest value over the 7*7 cells of each feature map. It is also stated in the Keras documentation:
pooling: Optional pooling mode for feature extraction when include_top is False.
And in the output given by model.summary(), we can see that after the max pooling of the fifth convolution block there is actually a global_max_pooling2d_1 layer, so the final dimension becomes 512.
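A quick way to check this explanation (a sketch reusing input_shape, img and model from the question; model_no_pool is a name introduced here): build the same network without the pooling argument and compare the output shapes.
model_no_pool = VGG16(weights='imagenet', include_top=False,
                      input_shape=input_shape, pooling=None)
print(model_no_pool.predict(img).shape)  # expected: (1, 7, 7, 512)
print(model.predict(img).shape)          # expected: (1, 512)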
I have a picture of a laser line and I would like to extract that line out of the image.
As the laser line is red, I take the red channel of the image and then search for the highest intensity in every row:
The problem now is that there are also some points which don't belong to the laser line (if you zoom into the second picture, you can see these points).
Does anyone have an idea for the next steps (to remove the single points and also to extract the lines)?
That was another approach to detect the line:
First I blurred that "black-white" line with a kernel, then I thinned (skeletonized) the blurred line to a thin line, then I applied an OpenCV function to detect the line. The result is in the image below:
NEW:
Now I have another, harder situation.
I have to extract a green laser light.
The problem here is that the colour range of the laser line is wider and changing.
On some parts of the laser line the pixels just have a high green component, while on other parts the pixels have a high blue component as well.
Taking the highest value in every row will always output a value, even when that value isn't actually part of the laser. Consider using a threshold as well, so that you can discard rows whose maximum isn't high enough.
However, that's not a very efficient way to do this at all. A much better and easier solution would be to use the OpenCV function inRange(); define a lower and upper bound for the red color in all three channels, and this will return a binary image with white pixels where the image intensity is within that BGR range.
This is in Python, but it does the job; it should be easy to see how to use the function:
import cv2
import numpy as np
img = cv2.imread('image.png')
lowerb = np.array([0, 0, 120])
upperb = np.array([100, 100, 255])
red_line = cv2.inRange(img, lowerb, upperb)
cv2.imshow('red', red_line)
cv2.waitKey(0)
This produces the output:
This could be further processed by finding contours or other methods to turn the points into a nice curve.
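One possible way to turn the mask into such a curve (a sketch of my own, not part of the original answer): take the centroid of the white pixels in each row of red_line, which gives one sub-pixel x position per image row.
ys, xs = np.nonzero(red_line)          # row and column indices of white pixels
rows = {}
for x, y in zip(xs, ys):
    rows.setdefault(y, []).append(x)
# one (x, y) point per image row that contains laser pixels
points = [(sum(v) / float(len(v)), y) for y, v in sorted(rows.items())]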
I'm really sorry for the short answer without any code, but I suggest you extract contours and process them.
I don't know exactly what you need, so here are two approaches for you:
Just collect as many contours as possible on a single line (use their centers and try to find the straight line with the smallest mean error).
As in the first approach, but try to heuristically combine separated lines. It's much harder, but this may give you an almost complete laser line from the image.
--
An example for your picture:
import cv2
import numpy as np
import math
img = cv2.imread('image.png')
hsv = cv2.cvtColor(img, cv2.COLOR_RGB2HSV)
# filtering red area of hue
redHueArea = 15
redRange = ((hsv[:, :, 0] + 360 + redHueArea) % 360)
hsv[np.where((2 * redHueArea) > redRange)] = [0, 0, 0]
# filtering by saturation
hsv[np.where(hsv[:, :, 1] < 95)] = [0, 0, 0]
# convert to rgb
rgb = cv2.cvtColor(hsv, cv2.COLOR_HSV2RGB)
# select only red grayscaled channel with low threshold
gray = cv2.cvtColor(rgb, cv2.COLOR_RGB2GRAY)
gray = cv2.threshold(gray, 15, 255, cv2.THRESH_BINARY)[1]
# contours processing
(_, contours, _) = cv2.findContours(gray.copy(), cv2.RETR_LIST, 1)
for c in contours:
    area = cv2.contourArea(c)
    if area < 8:
        continue
    epsilon = 0.1 * cv2.arcLength(c, True)  # tricky smoothing to a single line
    approx = cv2.approxPolyDP(c, epsilon, True)
    cv2.drawContours(img, [approx], -1, [255, 255, 255], -1)
cv2.imshow('result', img)
cv2.waitKey(0)
In your case it works perfectly, but, as I already said, you will need to do much more work with the contours.
When processing an image with text in OpenCV, my opening operation does not result in proper output data. The issue is quite similar to the one described in this article:
http://www.cpe.eng.cmu.ac.th/wp-content/uploads/CPE752_06part2.pdf
From what I can see, people suggest using reconstruction operations. Is there any built-in mechanism in OpenCV, or some known library/code that implements this?
Here's my Python 3 implementation, in analogy to MATLAB's imreconstruct algorithm:
import cv2
import numpy as np


def imreconstruct(marker: np.ndarray, mask: np.ndarray, radius: int = 1):
    """Iteratively expand the markers (white) while keeping them limited by the mask during each iteration.

    :param marker: Grayscale image where the initial seed is white on a black background.
    :param mask: Grayscale mask where the valid area is white on a black background.
    :param radius: Can be increased to improve expansion speed while causing decreased isolation from nearby areas.
    :returns: A copy of the last expansion.
    Written by Semnodime.
    """
    kernel = np.ones(shape=(radius * 2 + 1,) * 2, dtype=np.uint8)
    while True:
        expanded = cv2.dilate(src=marker, kernel=kernel)
        cv2.bitwise_and(src1=expanded, src2=mask, dst=expanded)

        # Termination criterion: expansion didn't change the image at all
        if (marker == expanded).all():
            return expanded
        marker = expanded
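A small usage sketch (my own assumption of how it could be applied to the text-image problem, e.g. opening by reconstruction; 'text.png' is a hypothetical input file): erode the image to build the marker, then reconstruct it under the original image as the mask.
img = cv2.imread('text.png', cv2.IMREAD_GRAYSCALE)
marker = cv2.erode(img, np.ones((3, 3), np.uint8), iterations=2)
restored = imreconstruct(marker, img)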
This answer arrives late, but here is the basic algorithm for under-reconstruction:
1. Inputs are two images: ImReference and ImMarker, with marker <= reference
2. Intermediate image: ImRec
3. Output image: ImResult
4. Copy ImMarker into ImRec
5. Copy ImRec into ImResult
6. ImDilated = Dilation(ImResult)
7. ImRec = Minimum(ImDilated, ImReference)
8. If ImRec != ImResult then return to step 5.
It's not the most optimal algorithm, but it uses only basic operations.
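For reference, here is a direct transcription of the steps above into OpenCV basics (a sketch, not the answerer's code; im_marker and im_reference are assumed to be same-sized grayscale NumPy arrays with marker <= reference):
import cv2
import numpy as np

def reconstruct_by_dilation(im_marker, im_reference, kernel_size=3):
    kernel = np.ones((kernel_size, kernel_size), np.uint8)
    im_rec = im_marker.copy()                       # step 4: copy ImMarker into ImRec
    while True:
        im_result = im_rec.copy()                   # step 5: copy ImRec into ImResult
        im_dilated = cv2.dilate(im_result, kernel)  # step 6: ImDilated = Dilation(ImResult)
        im_rec = cv2.min(im_dilated, im_reference)  # step 7: ImRec = Minimum(ImDilated, ImReference)
        if np.array_equal(im_rec, im_result):       # step 8: stop when nothing changes
            return im_result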