OCR reading O instead of 0 - C++, OpenCV, Tesseract

I'm using the Tesseract framework with OpenCV and C++ to read letters from an image on the Windows platform. The result contains O instead of 0 in many cases. Is there any way to eliminate this and get an accurate answer?

One possible solution would be to use the tessedit_char_whitelist config option to restrict recognition to only the characters you are searching for.
If your image has a well-known text pattern, you could also crop it into multiple images and apply this config selectively, searching for '0' or 'O' only where you know each can appear.
Here's some code I wrote to demonstrate this config option:
import cv2
import numpy as np
import pytesseract

pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract'

img = cv2.imread('a.jpg')
grayImage = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
(_, blackWhiteImage) = cv2.threshold(grayImage, 127, 255, cv2.THRESH_BINARY)
# pad the image; Tesseract tends to do better with a margin around the text
blackWhiteImage = cv2.copyMakeBorder(src=blackWhiteImage, top=100, bottom=100, left=50, right=50, borderType=cv2.BORDER_CONSTANT, value=(255, 255, 255))
# restrict recognition to the whitelisted characters; --psm 6 assumes a uniform block of text
# image_to_data returns a TSV string: a header row, then one row per detected word
data = pytesseract.image_to_data(blackWhiteImage, config="-c tessedit_char_whitelist=ABCDEFGHIJKLMNO0123456789 --psm 6")
originalImage = cv2.cvtColor(blackWhiteImage, cv2.COLOR_GRAY2BGR)
for z, a in enumerate(data.splitlines()):
    if z != 0:  # skip the TSV header row
        a = a.split()
        if len(a) == 12:  # rows with all 12 fields contain recognized text
            x, y = int(a[6]), int(a[7])
            w, h = int(a[8]), int(a[9])
            cv2.rectangle(originalImage, (x, y), (x + w, y + h), (0, 255, 0), 1)
            cv2.putText(originalImage, a[11], (x, y - 2), cv2.FONT_HERSHEY_DUPLEX, 0.5, (0, 0, 255), 1)
cv2.imshow('final image', originalImage)
cv2.waitKey(0)
You will never achieve a perfect result with OCR alone; you will almost always need software tricks on top of it to get closer.
To improve your results, check this page from the Tesseract documentation: Improving the quality of the output
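As one example of such a trick (my own sketch, not from the Tesseract docs): if you know a field should be numeric, you can swap confusable characters after recognition:

import re

def fix_zero_o(text):
    # Replace 'O'/'o' with '0' whenever it sits next to a digit,
    # a common OCR confusion in numeric fields.
    return re.sub(r'(?<=\d)[Oo]|[Oo](?=\d)', '0', text)

print(fix_zero_o('Serial: 1O2O3'))  # -> Serial: 10203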

Related

Detect more than one face OpenCV

I'm learning Python, so I'm trying to work with OpenCV.
The program detects only one face: if there are two faces in the frame, it still shows only one.
Here's the code:
def getData(id):
    psg = psgconnect.cursor()
    psg.execute("SELECT name FROM people WHERE id=%s", (Id,))
    cursor = psg.execute("SELECT name FROM people WHERE id=%s", (Id,))
    Data = None
    psgconnect.commit()
    row = psg.fetchone()
    #psgconnect.close()
    return row

while True:
    ret, img = cam.read()
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    face = faceDetect.detectMultiScale(gray, 1.3, 5)
    for (x, y, w, h) in face:
        cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
        Id, conf = recognizer.predict(gray[y:y + h, x:x + w])
        data = getData(Id)
        # if data != None:
        if conf < 50:
            #cv2.putText(img, data[0], (x, y + h), cv2.FONT_HERSHEY_PLAIN, 4, (255, 255, 255), 4)
            cv2.putText(img, 'nashel', (x, y + h), cv2.FONT_HERSHEY_PLAIN, 4, (255, 255, 255), 4)
        elif conf > 51:
            cv2.putText(img, 'Unknown', (x, y + h), cv2.FONT_HERSHEY_PLAIN, 4, (255, 255, 255), 4)
    cv2.imshow("Face", img)
    k = cv2.waitKey(10)
    print("suda doshli")
    if k == 27:
        #psg.close()
        psgconnect.close()
        print("zdes")
        break

cam.release()
input()
cv2.destroyAllWindows()
What could be wrong?
Since you're not getting any answers, I'll try to help even though I don't use OpenCV.
First, are you sure that OpenCV is only detecting one face in a two-face image, or could it be a problem with your loop?
You can check how many faces are detected with a simple print:
ret, img = cam.read()
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
face = faceDetect.detectMultiScale(gray, 1.3, 5)
print(len(face))
If it prints 1, then there actually is a problem with how OpenCV is handling your image (and in that case I really can't help, because I don't know much about OpenCV). However, when I read your code, I get the feeling that some things are off (even after your edit), so I suggest you try the following:
def getData(Id):
    # use the function's parameter in the query, and run the query only once
    psg = psgconnect.cursor()
    psg.execute("SELECT name FROM people WHERE id=%s", (Id,))
    psgconnect.commit()
    row = psg.fetchone()
    return row

while True:
    ret, img = cam.read()
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    face = faceDetect.detectMultiScale(gray, 1.3, 5)
    # per-face work stays inside this loop; showing the frame and
    # handling keys happens once per frame, after the loop
    for (x, y, w, h) in face:
        cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
        Id, conf = recognizer.predict(gray[y:y + h, x:x + w])
        data = getData(Id)
        if conf < 50:
            cv2.putText(img, 'nashel', (x, y + h), cv2.FONT_HERSHEY_PLAIN, 4, (255, 255, 255), 4)
        elif conf > 51:
            cv2.putText(img, 'Unknown', (x, y + h), cv2.FONT_HERSHEY_PLAIN, 4, (255, 255, 255), 4)
    cv2.imshow("Face", img)
    k = cv2.waitKey(10)
    print("suda doshli")
    if k == 27:
        psgconnect.close()
        print("zdes")
        break

cam.release()
input()
cv2.destroyAllWindows()
Let me know how it goes!
As a side note: since you're learning Python, and considering the code you pasted before your first edit, I would just like to emphasize how important indentation is in Python. A change of indentation can completely change the meaning of your code, and it is very important that you understand how and why certain blocks are indented.
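For example, a toy illustration of the difference:

for i in range(3):
    print(i)
    print("inside the loop")   # indented: runs on every iteration

for i in range(3):
    print(i)
print("after the loop")        # not indented: runs once, after the loop finishes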

Finding rectangles in a PCB using Python

Is there any way to find rectangles on a PCB board using Python? My goal is to find the PCB components. I tried smoothing the picture and then applying Canny edge and contour detection, but the only correct contour I managed to find is the one around the board. Is there any way to find the components of the board and draw a rectangle around each of them? Any help will be highly appreciated! Thank you!
UPDATE
The code I used tries to find contours based on color:
import numpy as np
import cv2
from matplotlib import pyplot as plt

im = cv2.imread('img14.jpg')
#gray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
#ret, thresh = cv2.threshold(gray, 80, 255, 0)
#blur = cv2.bilateralFilter(img,9,75,75)

# 5x5 averaging (box) filter to smooth the image
kernel = np.ones((5, 5), np.float32) / 25
dst = cv2.filter2D(im, -1, kernel)

# find all the 'black' shapes in the image
lower = np.array([0, 0, 0])
upper = np.array([100, 100, 100])
shapeMask = cv2.inRange(dst, lower, upper)

(cnts, _) = cv2.findContours(shapeMask.copy(), cv2.RETR_EXTERNAL,
                             cv2.CHAIN_APPROX_SIMPLE)
print "I found %d black shapes" % (len(cnts))

for c in cnts:
    cv2.drawContours(im, [c], -1, (0, 255, 0), 2)

cv2.imshow('shapemask', shapeMask)
cv2.imshow('contours', im)
cv2.waitKey(0)
It prints that 322 contours have been found, and that's the problem: I need only the 8 biggest. Is there any way to keep only those with the largest area? Or do I have to preprocess the image first for better results?
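One way to do that (a sketch building on the cnts list from the code above) is to sort the contours by cv2.contourArea and keep the eight largest:

# keep only the 8 largest contours by area
biggest = sorted(cnts, key=cv2.contourArea, reverse=True)[:8]
for c in biggest:
    cv2.drawContours(im, [c], -1, (0, 255, 0), 2)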

Python 2.7- how to save a picture through tuple or set values?

So I've been using opencv2 and PIL to get pixel values. They're saved like this:
(0, 0, 0), (1, 1, 1)
I've tried about seven different ways to use this data to create an image.
My biggest problem is that I can't seem to get putdata to work with my tuple.
I would show code, but my laptop's battery is flat and my code is broken anyway.
TL;DR: how do I save an image with PIL using pixel values stored in a tuple?
Something like this should work:
from PIL import Image

W = 200
H = 200
img = Image.new("RGB", (W, H))
# one grayscale RGB tuple per pixel
pixel_list = [(i % 256, i % 256, i % 256) for i in range(W * H)]

i_pixel = 0
for x in range(W):
    for y in range(H):
        img.putpixel((x, y), pixel_list[i_pixel])
        i_pixel += 1

img.save('result.png')
With the following result.
Note: I read here the following:
In 1.1.6, the above is better written as:
pix = im.load()
for i in range(n):
    ...
    pix[x, y] = value
But I couldn't get that to work.
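As an aside on the asker's original putdata problem (my own sketch, untested against their data): Image.putdata accepts a flat, row-major sequence of per-pixel tuples, so the explicit loop can be replaced. Note that putdata fills the image row by row, while the putpixel loop above fills it column by column, so the two results come out transposed relative to each other.

from PIL import Image

W, H = 200, 200
img = Image.new("RGB", (W, H))
pixel_list = [(i % 256, i % 256, i % 256) for i in range(W * H)]
img.putdata(pixel_list)  # one tuple per pixel, left-to-right, top-to-bottom
img.save('result_putdata.png')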

How to merge 2 gray-scale images in Python with OpenCV

I want to merge two one-channel, gray-scale images with OpenCV's merge method. Here is the code:
...
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
zeros = numpy.zeros(img_gray.shape)
merged = cv2.merge([img_gray, zeros])
...
The problem is that the gray-scale image doesn't have a depth attribute that should be 1, and the merge function requires images of the same size and the same depth. I get this error:
error: /build/buildd/opencv-2.4.8+dfsg1/modules/core/src/convert.cpp:296: error: (-215) mv[i].size == mv[0].size && mv[i].depth() == depth in function merge
How can I merge these arrays?
Solved: I had to change the dtype of img_gray from uint8 to float64 (the dtype that numpy.zeros produces by default), so both arrays match:
img_gray = numpy.float64(img_gray)
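An equivalent fix (a sketch of my own, not from the original poster) is to go the other way and create the zeros array with the grayscale image's dtype, so img_gray stays uint8:

zeros = numpy.zeros(img_gray.shape, dtype=img_gray.dtype)  # uint8, matching img_gray
merged = cv2.merge([img_gray, zeros])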
OpenCV Version 2.4.11
import cv2
import numpy as np

# Load the image
img1 = cv2.imread(paths[0], cv2.IMREAD_UNCHANGED)

# could also use cv2.split(), but per the docs (link below) it's time-consuming
# split the channels using Numpy indexing; note it's a zero-based index, unlike MATLAB
b = img1[:, :, 0]
g = img1[:, :, 1]
r = img1[:, :, 2]

# to avoid overflows and truncation in turn, clip the image to the [0.0, 1.0] inclusive range
b = b.astype(np.float)
b /= 255

# manipulate the channels ... in my case, adding Gaussian noise to the blue channel (b => b1)

b1 = b1.astype(np.float)
g = g.astype(np.float)
r = r.astype(np.float)

# gotcha: notice the parameter is an array of channels
noisy_blue = cv2.merge((b1, g, r))

# store the outcome to disk
cv2.imwrite('output/NoisyBlue.png', noisy_blue)
N.B.:
Alternatively, you may also use np.double instead of np.float in astype for the type casting.
OpenCV Documentation Link

Color image segmentation with Python

I have many pictures like the one below:
My objective is to identify those "beads", mark each of them with a circle, and count the number detected.
I tried image segmentation algorithms via Python; the source code is below:
from matplotlib import pyplot as plt
from skimage import data
from skimage.feature import blob_dog, blob_log, blob_doh
from math import sqrt
from skimage.color import rgb2gray
from scipy import misc  # try

image = misc.imread('test.jpg')
image_gray = rgb2gray(image)

blobs_log = blob_log(image_gray, max_sigma=10, num_sigma=5, threshold=.1)
# Compute radii in the 3rd column.
blobs_log[:, 2] = blobs_log[:, 2] * sqrt(2)

blobs_dog = blob_dog(image_gray, max_sigma=2, threshold=.051)
blobs_dog[:, 2] = blobs_dog[:, 2] * sqrt(2)

blobs_doh = blob_doh(image_gray, max_sigma=2, threshold=.01)

blobs_list = [blobs_log, blobs_dog, blobs_doh]
colors = ['yellow', 'lime', 'red']
titles = ['Laplacian of Gaussian', 'Difference of Gaussian',
          'Determinant of Hessian']
sequence = zip(blobs_list, colors, titles)

for blobs, color, title in sequence:
    fig, ax = plt.subplots(1, 1)
    ax.set_title(title)
    ax.imshow(image, interpolation='nearest')
    for blob in blobs:
        y, x, r = blob
        c = plt.Circle((x, y), r, color=color, linewidth=2, fill=False)
        ax.add_patch(c)

plt.show()
The best results obtained so far are still unsatisfactory:
How can I improve it?
You could use Gimp or Photoshop to test some filters and color changes that differentiate the circles from the background. Brightness and contrast adjustments may work. Then you can apply an edge detector to find the circles.
By converting this image to grayscale you have effectively thrown away the most powerful cue you have for segmenting the beads: their distinctive green color. Try running the same code but replace
image_gray = rgb2gray(image)
with the green channel:
image_gray = image[:, :, 1]
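If the raw green channel is still too sensitive to lighting, a common variant (my own suggestion, assuming an RGB image as loaded above) is green chromaticity, i.e. green as a fraction of total brightness:

import numpy as np

rgb = image.astype(np.float64)
# green divided by overall brightness; the epsilon avoids division by zero
image_gray = rgb[:, :, 1] / (rgb.sum(axis=2) + 1e-6)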