Track Eye Pupil Position with Webcam, OpenCV, and Python

I am trying to build a robot that I can control with basic eye movements. I am pointing a webcam at my face, and depending on the position of my pupil, the robot would move a certain way. If the pupil is near the top, bottom, left corner, or right corner of the eye, the robot would move forwards, backwards, left, or right respectively.
My original plan was to use an eye Haar cascade to find my left eye. I would then run HoughCircles on the eye region to find the center of the pupil. I would determine where the pupil was in the eye by measuring the distance from the center of the Hough circle to the borders of the general eye region.
So for the first part of my code, I'm hoping to be able to track the center of the eye pupil, as seen in this video. https://youtu.be/aGmGyFLQAFM?t=38
But when I run my code, it cannot consistently find the center of the pupil. The Hough circle is often drawn in the wrong area. How can I make my program consistently find the center of the pupil, even when the eye moves?
Is it possible/better/easier for me to tell my program where the pupil is at the beginning?
I've looked at some other eye tracking methods, but I cannot form a general algorithm. If anyone could help form one, that would be much appreciated!
https://arxiv.org/ftp/arxiv/papers/1202/1202.6517.pdf
import numpy as np
import cv2

face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
eye_cascade = cv2.CascadeClassifier('haarcascade_righteye_2splits.xml')

# number signifies camera
cap = cv2.VideoCapture(0)

while True:
    ret, img = cap.read()
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    #faces = face_cascade.detectMultiScale(gray, 1.3, 5)
    eyes = eye_cascade.detectMultiScale(gray)
    for (ex, ey, ew, eh) in eyes:
        cv2.rectangle(img, (ex, ey), (ex + ew, ey + eh), (0, 255, 0), 2)
        roi_gray2 = gray[ey:ey + eh, ex:ex + ew]
        roi_color2 = img[ey:ey + eh, ex:ex + ew]
        circles = cv2.HoughCircles(roi_gray2, cv2.HOUGH_GRADIENT, 1, 20,
                                   param1=50, param2=30, minRadius=0, maxRadius=0)
        try:
            for i in circles[0, :]:
                # draw the outer circle (HoughCircles returns floats,
                # so cast to int for drawing)
                cv2.circle(roi_color2, (int(i[0]), int(i[1])), int(i[2]), (255, 255, 255), 2)
                print("drawing circle")
                # draw the center of the circle
                cv2.circle(roi_color2, (int(i[0]), int(i[1])), 2, (255, 255, 255), 3)
        except Exception as e:
            print(e)
    cv2.imshow('img', img)
    k = cv2.waitKey(30) & 0xff
    if k == 27:
        break

cap.release()
cv2.destroyAllWindows()

I can see two alternatives, from some work that I did before:
Train a Haar detector to detect the eyeball, using training images with the center of the pupil at the center and the width of the eyeball as width. I found this better than using Hough circles or just the original eye detector of OpenCV (the one used in your code).
Use Dlib's face landmark points to estimate the eye region. Then use the contrast caused by the white and dark regions of the eyeball, together with contours, to estimate the center of the pupil. This produced much better results.
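For reference, here is a minimal sketch of the second (Dlib) approach, not the exact code I used: it assumes Dlib's standard 68-point model file shape_predictor_68_face_landmarks.dat is available, takes landmarks 36-41 as one eye, and estimates the pupil center as the centroid of the darkest blob in that region via Otsu thresholding plus contours.
import cv2
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
# Assumes the standard 68-point model file is in the working directory
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def pupil_center(gray, shape, indices):
    # Bounding box of the six eye landmarks (int32 required by boundingRect)
    pts = np.array([(shape.part(i).x, shape.part(i).y) for i in indices],
                   dtype=np.int32)
    x, y, w, h = cv2.boundingRect(pts)
    eye = cv2.GaussianBlur(gray[y:y + h, x:x + w], (5, 5), 0)
    # The pupil/iris is the darkest region; invert so it becomes the white blob
    _, mask = cv2.threshold(eye, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    cnts = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
    if not cnts:
        return None
    m = cv2.moments(max(cnts, key=cv2.contourArea))
    if m["m00"] == 0:
        return None
    # Centroid of the dark blob, mapped back to full-image coordinates
    return (x + int(m["m10"] / m["m00"]), y + int(m["m01"] / m["m00"]))

cap = cv2.VideoCapture(0)
while True:
    ret, img = cap.read()
    if not ret:
        break
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    for face in detector(gray):
        shape = predictor(gray, face)
        center = pupil_center(gray, shape, range(36, 42))  # one eye
        if center is not None:
            cv2.circle(img, center, 3, (0, 255, 0), -1)
    cv2.imshow('img', img)
    if cv2.waitKey(30) & 0xff == 27:
        break
cap.release()
cv2.destroyAllWindows()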

Just replace the line where you create HoughCircles with this:
circles = cv2.HoughCircles(roi_gray2,cv2.HOUGH_GRADIENT,1,200,param1=200,param2=1,minRadius=0,maxRadius=0)
I just changed a couple of parameters and it gives me more accuracy.
Detailed information about the parameters is available here.

Related

Computing real world distance using pixel distance

I have two images: one where a vehicle has entered the region and another where the vehicle has exited the region.
I captured these images using a single CCTV camera mounted on a road.
Now I want to compute the real-world distance travelled by the vehicle in order to find its speed. I use object detection to get bounding boxes for the car's number plate in both images, from which I can compute the pixel distance. I can map pixel distance to real-world distance only when the image plane and the road plane are parallel to each other (I was using this technique, but it wasn't giving accurate results). Since my camera is inclined at an angle to the road, I couldn't use that technique.
I have tried a few research papers but couldn't find any useful information relevant to my problem. Please share insights on how to do this; it would be helpful.
In this problem we have a single camera view, so there is no way to find the real-world distance between the objects using camera-view geometry alone. However, we can convert image pixels to real-world units by using reference objects with known real-world lengths.
In the captured image you can identify the road lane markers, as shown below in the image, and knowing their lengths in real-world units you can derive a pixels-to-real-world scale.
Below is a quick and basic implementation of the road-lane-marker detection approach. It will also pick up contours of objects like cars and bikes in the image, but you can remove those contours by masking the objects out once you know their bounding boxes.
import cv2

img = cv2.imread("road_lane.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blur = cv2.blur(gray, (3, 3))

# Find Canny edges
edged = cv2.Canny(blur, 30, 200)

# Finding contours
contours, hierarchy = cv2.findContours(edged, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)

boundRect = []
for i, c in enumerate(contours):
    # ignore large and small contours
    if len(c) < 300 and len(c) > 100:
        box = cv2.boundingRect(c)
        # check for vertical rectangles (lane markers are taller than wide)
        if box[2] < box[3]:
            boundRect.append(box)

for i in range(len(boundRect)):
    cv2.rectangle(img, (int(boundRect[i][0]), int(boundRect[i][1])),
                  (int(boundRect[i][0] + boundRect[i][2]),
                   int(boundRect[i][1] + boundRect[i][3])), (255, 0, 0), 5)
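To make the pixels-to-real-world step concrete, here is a small worked example with hypothetical numbers (the 3.0 m marker length and all pixel values are placeholders; actual lane-marker lengths vary by country, so verify locally):
MARKER_LENGTH_M = 3.0        # assumed real-world marker length (verify locally)
marker_px = 150.0            # e.g. boundRect[0][3], the detected marker's pixel height

metres_per_pixel = MARKER_LENGTH_M / marker_px          # 0.02 m/px

pixel_distance = 420.0       # number-plate displacement between the two frames
real_distance_m = pixel_distance * metres_per_pixel     # 8.4 m

elapsed_s = 1.5              # time between the two captures
speed_kmph = real_distance_m / elapsed_s * 3.6          # 20.16 km/h
print(speed_kmph)
Note that because the camera is inclined, the scale changes with depth, so this conversion is only accurate near the image location of the marker used; using several markers at different depths (or a full homography from four known road points) gives better results.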

Any better way to locate whitespaces on binary images?

This is a picture of a corridor after masking.
I'm trying to follow the "free path" in an indoor environment using OpenCV. The way I'm trying to locate the free area or whitespace in the image is by traversing the whole array and checking pixel values, but this seems too slow. I also tried findContours and edge-detection methods from OpenCV, but the largest contour area points at the far-left corner of the white area. Is there any other way I can do this?
import cv2

im = cv2.imread('YourImagePath\\image.png')
gray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
contours, hierarchy = cv2.findContours(gray, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)[-2:]

for cnt in contours:
    x, y, w, h = cv2.boundingRect(cnt)
    roi = im[y:y + h, x:x + w]
    cv2.rectangle(im, (x, y), (x + w, y + h), (0, 0, 255), 2)

cv2.drawContours(im, contours, -1, (0, 255, 0), 3)
cv2.imshow('img', im)
cv2.waitKey(0)
Now if you want the area of the white pixels, use: area = cv2.contourArea(cnt),
and if you want the area of the red rectangle, use: w*h (the bounding-box dimensions).
This is what you should get if you draw the bounding box using the code above :
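If all you need is a steering target rather than every contour, a minimal sketch like the following (with a hypothetical input filename) picks the largest white region directly and uses its centroid as the "free path" point:
import cv2

im = cv2.imread('corridor_mask.png')   # hypothetical path to the masked image
gray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
contours = cv2.findContours(gray, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)[-2]

# Largest white region by area, instead of looping over all contours
cnt = max(contours, key=cv2.contourArea)

# Its centroid gives a single point to steer towards
m = cv2.moments(cnt)
if m["m00"] > 0:
    cx, cy = int(m["m10"] / m["m00"]), int(m["m01"] / m["m00"])
    cv2.circle(im, (cx, cy), 5, (0, 0, 255), -1)

cv2.imshow('target', im)
cv2.waitKey(0)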

Rectangle detection / tracking using OpenCV

What I need
I'm currently working on an augmented-reality kind of game. The controller that the game uses (I'm talking about the physical input device here) is a mono-colored, rectangular piece of paper. I have to detect the position, rotation and size of that rectangle in the camera's capture stream. The detection should be invariant to scale and invariant to rotation along the X and Y axes.
The scale invariance is needed in case that the user moves the paper away or towards the camera. I don't need to know the distance of the rectangle so scale invariance translates to size invariance.
The rotation invariance is needed in case the user tilts the rectangle along its local X and/or Y axis. Such a rotation changes the shape of the paper from a rectangle to a trapezoid. In this case, the oriented bounding box can be used to measure the size of the paper.
What I've done
At the beginning there is a calibration step. A window shows the camera feed and the user has to click on the rectangle. On click, the color of the pixel the mouse is pointing at is taken as reference color. The frames are converted into HSV color space to improve color distinguishing. I have 6 sliders that adjust the upper and lower thresholds for each channel. These thresholds are used to binarize the image (using opencv's inRange function).
After that I'm eroding and dilating the binary image to remove noise and unite nearby chunks (using OpenCV's erode and dilate functions).
The next step is finding contours (using opencv's findContours function) in the binary image. These contours are used to detect the smallest oriented rectangles (using opencv's minAreaRect function). As final result I'm using the rectangle with the largest area.
A short conclusion of the procedure:
Grab a frame
Convert that frame to HSV
Binarize it (using the color that the user selected and the thresholds from the sliders)
Apply morph ops (erode and dilate)
Find contours
Get the smallest oriented bounding box of each contour
Take the largest of those bounding boxes as result
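A minimal sketch of this pipeline, with hypothetical HSV thresholds standing in for the calibration click and the slider values:
import cv2
import numpy as np

# Placeholder thresholds; in the real setup they come from the clicked
# reference color plus the six sliders
lower = np.array([100, 80, 50])
upper = np.array([130, 255, 255])

def detect_rect(frame):
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, lower, upper)           # binarize by color
    kernel = np.ones((5, 5), np.uint8)
    mask = cv2.erode(mask, kernel, iterations=2)    # remove noise
    mask = cv2.dilate(mask, kernel, iterations=2)   # unite nearby chunks
    contours = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                cv2.CHAIN_APPROX_SIMPLE)[-2]
    if not contours:
        return None
    # Smallest oriented rect of the largest blob: ((cx, cy), (w, h), angle)
    return cv2.minAreaRect(max(contours, key=cv2.contourArea))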
As you may have noticed, I don't take advantage of the knowledge about the actual shape of the paper, simply because I don't know how to use that information properly.
I've also thought about using the tracking algorithms of opencv. But there were three reasons that prevented me from using them:
Scale invariance: as far as I have read, some of the algorithms don't support different scales of the object.
Movement prediction: some algorithms use movement prediction for better performance, but the object I'm tracking moves completely random and therefore unpredictable.
Simplicity: I'm just looking for a mono colored rectangle in an image, nothing fancy like car or person tracking.
Here is a - relatively - good catch (binary image after erode and dilate)
and here is a bad one
The Question
How can I improve the detection in general and especially to be more resistant against lighting changes?
Update
Here are some raw images for testing.
Can't you just use thicker material?
Yes I can, and I already do (unfortunately I can't access those pieces at the moment). However, the problem still remains: even if I use a material like cardboard, which isn't bent as easily as paper, one can still bend it.
How do you get the size, rotation and position of the rectangle?
The minAreaRect function of opencv returns a RotatedRect object. This object contains all the data I need.
Note
Because the rectangle is mono colored, there is no possibility to distinguish between top and bottom or left and right. This means that the rotation is always in the range [0, 180], which is perfectly fine for my purposes. The ratio of the two sides of the rect is always w:h > 2:1. If the rectangle were a square, the range of rotation would change to [0, 90], but this can be considered irrelevant here.
As suggested in the comments I will try histogram equalization to reduce brightness issues and take a look at ORB, SURF and SIFT.
I will update on progress.
The H channel in HSV space is the hue, and it is not sensitive to lighting changes. The red hue range is roughly [150, 180].
Based on that information, I did the following:
Change into the HSV space, split the H channel, threshold and normalize it.
Apply morph ops (open)
Find contours, then filter them by properties such as width, height, area, ratio and so on.
PS. I cannot fetch the image you uploaded to Dropbox because of my network, so I just cropped the right side of your second image as the input.
import cv2
import numpy as np

imgname = "src.png"
img = cv2.imread(imgname)

## Split the H channel in HSV, and get the red range
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
h, s, v = cv2.split(hsv)
h[h < 150] = 0
h[h > 180] = 0

## normalize, do the open-morph-op
normed = cv2.normalize(h, None, 0, 255, cv2.NORM_MINMAX, cv2.CV_8UC1)
kernel = cv2.getStructuringElement(shape=cv2.MORPH_ELLIPSE, ksize=(3, 3))
opened = cv2.morphologyEx(normed, cv2.MORPH_OPEN, kernel)
res = np.hstack((h, normed, opened))
cv2.imwrite("tmp1.png", res)
Now, we get the result as this (h, normed, opened):
Then find contours and filter them.
contours = cv2.findContours(opened, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)[-2]
print(len(contours))
bboxes = []
rboxes = []
cnts = []
dst = img.copy()
for cnt in contours:
    ## Get the straight bounding rect
    bbox = cv2.boundingRect(cnt)
    x, y, w, h = bbox
    if w < 30 or h < 30 or w*h < 2000 or w > 500:
        continue

    ## Draw rect
    cv2.rectangle(dst, (x, y), (x + w, y + h), (255, 0, 0), 1, 16)

    ## Get the rotated rect
    rbox = cv2.minAreaRect(cnt)
    (cx, cy), (w, h), rot_angle = rbox
    print("rot_angle:", rot_angle)

    ## backup
    bboxes.append(bbox)
    rboxes.append(rbox)
    cnts.append(cnt)
The result is like this:
rot_angle: -2.4540319442749023
rot_angle: -1.8476102352142334
Because of the blue rectangle tag in the source image, the card is split into two parts; a clean image would pose no problem.
I know it's been a while since I asked the question. I recently continued on the topic and solved my problem (although not through rectangle detection).
Changes
Using wood to strengthen my controllers (the "rectangles") like below.
Placed 2 ArUco markers on each controller.
How it works
Convert the frame to grayscale,
downsample it (to increase performance during detection),
equalize the histogram using cv::equalizeHist,
find markers using cv::aruco::detectMarkers,
correlate markers (if multiple controllers),
analyze markers (position and rotation),
compute result and apply some error correction.
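A minimal sketch of the detection loop, assuming an opencv-contrib build with the pre-4.7 aruco API (newer releases wrap the same steps in cv2.aruco.ArucoDetector) and 4x4 markers from the DICT_4X4_50 dictionary:
import cv2

aruco_dict = cv2.aruco.Dictionary_get(cv2.aruco.DICT_4X4_50)
params = cv2.aruco.DetectorParameters_create()

cap = cv2.VideoCapture(0)
while True:
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    gray = cv2.equalizeHist(gray)   # equalize the histogram, as described above
    corners, ids, _ = cv2.aruco.detectMarkers(gray, aruco_dict, parameters=params)
    if ids is not None:
        # corners[i] is a 1x4x2 array of corner pixels; its mean gives the
        # marker position and the first edge its in-plane rotation
        cv2.aruco.drawDetectedMarkers(frame, corners, ids)
    cv2.imshow('markers', frame)
    if cv2.waitKey(1) & 0xff == 27:
        break
cap.release()
cv2.destroyAllWindows()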
It turned out that the marker detection is very robust to lighting changes and different viewing angles which allows me to skip any calibration steps.
I placed 2 markers on each controller to increase detection robustness even more. Both markers have to be detected only once (to measure how they correlate). After that, it's sufficient to find only one marker per controller, as the other can be extrapolated from the previously computed correlation.
Here is a detection result in a bright environment:
in a darker environment:
and when hiding one of the markers (the blue point indicates the extrapolated marker position):
Failures
The initial shape detection that I implemented didn't perform well. It was very fragile to lighting changes. Furthermore, it required an initial calibration step.
After the shape-detection approach I tried SIFT and ORB in combination with brute-force and kNN matchers to extract and locate features in the frames. It turned out that mono-colored objects don't provide many keypoints (what a surprise). The performance of SIFT was terrible anyway (ca. 10 fps @ 540p).
I drew some lines and other shapes on the controller, which resulted in more keypoints being available. However, this didn't yield huge improvements.

Fast and robust way to detect a reflective ball

I have a highly reflective ball in an image that looks like this:
What is a robust method to detect the ball in real-time? (5-10 FPS)
I tried several segmentation algorithms, but they fail to separate the ball from the background and instead cut the ball into pieces, as there are many different areas on the ball itself.
Due to its reflective nature, a simple circular Hough transform does not work well. The same goes for any simple threshold or morphological operation.
Do you have any advice for handling reflective surfaces in general?
The HoughCircles suggestion is great, as long as you have a rough idea of how the ball will move in the frame and therefore roughly what minimum and maximum radius to account for:
import numpy as np
import cv2
import cv2.cv as cv  # OpenCV 2.4-style constants

img = cv2.imread('wcEXm.jpg', 0)

# Method 1: Hough Circles
img = cv2.medianBlur(img, 5)
cimg = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)

# HoughCircles(image, method, dp, minDist[, circles[, param1[, param2[, minRadius[, maxRadius]]]]])
circles = cv2.HoughCircles(img, cv.CV_HOUGH_GRADIENT, dp=1, minDist=50,
                           param1=127, param2=30, minRadius=50, maxRadius=150)
circles = np.uint16(np.around(circles))

for i in circles[0, :]:
    # draw the outer circle
    cv2.circle(cimg, (i[0], i[1]), i[2], (0, 255, 0), 2)
    # draw the center of the circle
    cv2.circle(cimg, (i[0], i[1]), 2, (0, 0, 255), 3)

cv2.imshow('detected circles', cimg)
cv2.waitKey(0)
Another option is to use findContours(). With the right options and a bit of filtering (e.g. dilate(), erode()) you can segment the ball from the background, and the ratio between width and height (closer to a square) will help.
However, there's one neat little thing that might simplify this a lot if you're not interested in the size of the ball, just knowing where it is.
Your ball is reflective, and for detection to work at all you need a light source; therefore, even though colours/environments will look different, the ball will have a highlight. Assuming the light source isn't in the frame, your reflective ball will probably be the next brightest thing in the scene:
import cv2
img = cv2.imread('wcEXm.jpg',0)
minVal, maxVal, minLoc, maxLoc = cv2.minMaxLoc(img)
cv2.circle(img, maxLoc, 20, (0,192,0),10)
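One refinement worth adding (my own suggestion, not part of the snippet above): a single noisy bright pixel can fool minMaxLoc(), so blurring first makes the maximum land in the center of the dominant bright region; the kernel size is a tunable guess:
import cv2

img = cv2.imread('wcEXm.jpg', 0)
# Blur so the max falls inside the largest bright region, not on a stray pixel
blurred = cv2.GaussianBlur(img, (41, 41), 0)
minVal, maxVal, minLoc, maxLoc = cv2.minMaxLoc(blurred)

vis = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)   # convert so the circle is colored
cv2.circle(vis, maxLoc, 20, (0, 192, 0), 10)
cv2.imshow('brightest region', vis)
cv2.waitKey(0)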
In terms of performance on RaspberryPi, I recommend the following:
Use Adrian's tutorial on using PiCam with OpenCV in Python
If you plan to use minMaxLoc() or other functions that work with grayscale images, you can use the 'yuv' colour space and simply use the Y (luminance) channel, saving a bit of time by not needing to convert from RGB to luma/grayscale
Use a smaller resolution (e.g. 320x240 or 160x120). You can scale the result back up if you need to map the x,y position of the ball to something else.
Update:
Yet another thing that might help is Canny edge detection because the scene is simple and the ball will stand out:
edges = cv2.Canny(img,100,200)

How to detect and draw a circle around the iris region of an eye?

I have been trying to detect the iris region of an eye and thereafter draw a circle around the detected area. I have managed to obtain a clear black and white eye image containing just the pupil, upper eyelid line and eyebrow using threshold function.
Once this is achieved, HoughCircles is applied to detect whether there are circles in the image. However, it never detects any circular regions. After reading up on HoughCircles, I found that it states that
the Hough gradient method works as follows:
First the image needs to be passed through an edge detection phase (in this case, cvCanny()).
I then added a Canny detector after the threshold function. This still produced zero detected circles. If I remove the threshold function, the eye image becomes busy with unnecessary lines, hence I left it in.
cv::equalizeHist(gray, img);
medianBlur(img, img, 1);
IplImage img1 = img;
cvAddS(&img1, cvScalar(70, 70, 70), &img1);  // brighten the image

// converting IplImage to cv::Mat
Mat imgg = cvarrToMat(&img1);
medianBlur(imgg, imgg, 1);
cv::threshold(imgg, imgg, 120, 255, CV_THRESH_BINARY);
cv::Canny(imgg, imgg, 0, 20);  // edge detection on the thresholded image
medianBlur(imgg, imgg, 1);

vector<Vec3f> circles;
/// Apply the Hough Transform to find the circles
HoughCircles(imgg, circles, CV_HOUGH_GRADIENT, 1, imgg.rows/8, 100, 30, 1, 5);
How can I overcome this problem?
Would hough circle method work?
Is there a better solution to detecting the iris region?
Are the parameters chosen correctly?
Also note that the image is directly obtained from the webcam.
Try using Daugman's integro-differential operator. It calculates the centres of the iris and pupil and draws accurate circles on the iris and pupil boundaries. The MATLAB code is available here: iris boundary detection using Daugman's method. Since I'm not familiar with OpenCV, you could convert it yourself.
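For anyone who wants to stay in Python/OpenCV, here is a rough, unoptimized sketch of the integro-differential idea (my own approximation, not the linked MATLAB code): for each candidate centre, find the radius where the smoothed radial derivative of the mean intensity along the circle is largest.
import cv2
import numpy as np

def circle_mean(img, cx, cy, r, n=64):
    # Average intensity sampled along a circle of radius r around (cx, cy)
    theta = np.linspace(0, 2 * np.pi, n, endpoint=False)
    xs = np.clip((cx + r * np.cos(theta)).astype(int), 0, img.shape[1] - 1)
    ys = np.clip((cy + r * np.sin(theta)).astype(int), 0, img.shape[0] - 1)
    return img[ys, xs].mean()

def daugman(img, candidates, r_min=20, r_max=80):
    # Returns (cx, cy, r) maximizing the smoothed radial derivative of the
    # circular mean intensity -- a crude stand-in for Daugman's operator
    best_val, best_circle = 0.0, None
    radii = np.arange(r_min, r_max)
    for (cx, cy) in candidates:
        means = np.array([circle_mean(img, cx, cy, r) for r in radii])
        deriv = np.abs(np.diff(means))                      # d(mean)/dr
        deriv = cv2.GaussianBlur(deriv.reshape(-1, 1), (1, 5), 0).ravel()
        i = int(np.argmax(deriv))
        if deriv[i] > best_val:
            best_val, best_circle = deriv[i], (cx, cy, int(radii[i]))
    return best_circle

# Usage sketch: scan a coarse grid of candidate centres on a grayscale eye image
# gray = cv2.cvtColor(eye_img, cv2.COLOR_BGR2GRAY)
# result = daugman(gray, [(x, y) for x in range(60, 120, 4)
#                                 for y in range(60, 120, 4)])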
The binary eye image contained three different elements: eyelashes, the eye and the eyebrow. The main aim is to get to the region of interest, which is the eye/iris, excluding the eyebrow and eyelashes. I followed these steps:
Step 1: Discard the upper half of the eye image, so we are left with the eyelashes, the eye region and small shadow regions.
Step 2: Find the contours.
Step 3: Find the largest contour, so that we have just the eye region.
Step 4: Use a bounding box to create a rectangle around the eye area.
http://docs.opencv.org/doc/tutorials/imgproc/shapedescriptors/bounding_rects_circles/bounding_rects_circles.html
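A minimal Python sketch of these four steps (the input filename is hypothetical; the original implementation was in C++):
import cv2

binary = cv2.imread('binary_eye.png', 0)   # hypothetical thresholded eye image

# Step 1: discard the upper half (removes the eyebrow)
lower = binary[binary.shape[0] // 2:, :]

# Steps 2-3: find contours and keep the largest one (the eye region)
contours = cv2.findContours(lower, cv2.RETR_EXTERNAL,
                            cv2.CHAIN_APPROX_SIMPLE)[-2]
eye = max(contours, key=cv2.contourArea)

# Step 4: bounding box around the eye area
x, y, w, h = cv2.boundingRect(eye)
roi = lower[y:y + h, x:x + w]
cv2.rectangle(lower, (x, y), (x + w, y + h), 128, 2)
cv2.imshow('eye ROI', lower)
cv2.waitKey(0)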
Now we have the region of interest. From this point I am extracting these images and using a neural network to train the system to emulate the properties of a mouse. I'm currently learning about neural networks (link1) and how to use them in OpenCV.
Using the previous methods, which included detecting the iris point, creating an eye vector, tracking it and calculating the gaze on the screen, is time consuming. Also, light reflected on the iris makes it difficult to detect.