how to remove all non-character objects from license plate image? [duplicate] - c++

This question already has answers here:
Removing extra pixels/lines from license plate
(2 answers)
Closed 6 years ago.
I'm developing an ANPR system for Persian plates. I've found a way to locate the plate, and with some processing I have reached the image below; now I need to remove all non-character objects from the image so I can process the characters later. There are some similar questions on SO, but they deal with different image noise and have different aims. I have also tried erode and dilate, but since the characters are small and have low resolution, those operations destroy them.
I don't want to use contour features because of performance. I need to remove this noise with some effects/filters, so this is not a duplicate question.
Here are some input images and the outputs I need.
input:
output:
input:
output:

At least in the Western world, licence plates have a fixed layout. With that prior knowledge it is sufficient to localize the plate and get its orientation.
Then simply crop the regions you are interested in.
We also have standardized characters optimized for machine readability. I don't know if this is the case for your characters as well. You should be able to apply any decent OCR to read the plate's contents.
Another option would be to search for blobs, then delete everything that is too small, too big, too eccentric, or otherwise unlikely to be a character.
Not sure if this dot above the U shaped character is important or can be omitted.
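A minimal sketch of that blob-filtering idea with OpenCV's connectedComponentsWithStats; the size and aspect-ratio limits below are placeholders that would need tuning to the actual plate resolution, not values from the question:

#include <opencv2/opencv.hpp>

// Keep only blobs whose size and aspect ratio look like characters.
cv::Mat keepCharacterBlobs(const cv::Mat& binaryPlate)
{
    cv::Mat labels, stats, centroids;
    int n = cv::connectedComponentsWithStats(binaryPlate, labels, stats, centroids, 8);

    cv::Mat cleaned = cv::Mat::zeros(binaryPlate.size(), CV_8U);
    for (int i = 1; i < n; ++i)  // label 0 is the background
    {
        int w    = stats.at<int>(i, cv::CC_STAT_WIDTH);
        int h    = stats.at<int>(i, cv::CC_STAT_HEIGHT);
        int area = stats.at<int>(i, cv::CC_STAT_AREA);
        double aspect = static_cast<double>(w) / h;

        // Hypothetical character criteria: not too small, not too big,
        // and not extremely elongated.
        bool looksLikeChar = area > 30 && area < 2000 &&
                             aspect > 0.1 && aspect < 1.2;
        if (looksLikeChar)
            cleaned.setTo(255, labels == i);  // copy the blob into the output
    }
    return cleaned;
}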

Related

OpenCV: Letters and words detection from edge detection image

I am currently dealing with text recognition. Here is part of a binarized image with edge detection (using Canny):
EDIT: I am posting a link to an image. I don't have 10 rep points so I cannot post an image.
EDIT 2: And here's the same piece after thresholding. Honestly, I don't know which approach would be better.
The questions remain the same:
How should I detect certain letters? I need to determine location of every letter and then every word.
Is it a problem that some letters are "opened"? I mean that they are not closed areas.
If I use cv::matchTemplate, does it mean that I need to have 24 templates for the letters plus 10 for the digits, and then loop over my image to determine the best correlation?
If both the letters and the squares they are in are 1 pixel wide, what filters/operations should I apply to close the opened letters? I tried various combinations of dilate and erode, with no effect.
The question is kind of "how do I do OCR with OpenCV?" and the answer is that it's an involved process and quite difficult.
But here are some pointers. Firstly, it's hard to detect letters which are only outlined; most of the tools are designed for filled letters. That image, however, looks as if there will only be one non-letter distractor left if you fill all loops below a certain size threshold. You can get rid of the non-letter lines because they form one huge connected object.
Once you've filled the letters, they can be skeletonised.
You can't use morphological operations like open and close very sensibly on images where the details are one pixel wide. You can put the image through the operation, but essentially there is no distinction between detail and noise if all features are one pixel. However once you fill the letters, that problem goes away.
This isn't in any way telling you how to do it, just giving some pointers.
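As a hedged sketch of the "fill all loops" idea (not the answerer's code): find the closed contours in the outline image and fill the ones small enough to be letters, so the remaining large connected distractor can be dropped by an area test. The 500-pixel area bound is an assumption:

#include <opencv2/opencv.hpp>
#include <vector>

cv::Mat fillLetterLoops(const cv::Mat& outlineImage)
{
    std::vector<std::vector<cv::Point>> contours;
    std::vector<cv::Vec4i> hierarchy;
    cv::findContours(outlineImage.clone(), contours, hierarchy,
                     cv::RETR_CCOMP, cv::CHAIN_APPROX_SIMPLE);

    cv::Mat filled = cv::Mat::zeros(outlineImage.size(), CV_8U);
    for (size_t i = 0; i < contours.size(); ++i)
    {
        // Fill only small loops; huge connected objects are the non-letter lines.
        if (cv::contourArea(contours[i]) < 500.0)
            cv::drawContours(filled, contours, static_cast<int>(i),
                             cv::Scalar(255), cv::FILLED);
    }
    return filled;
}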
As mentioned in the previous answer by Malcolm, OCR will work better on filled letters, so you can do the following:
1. Use your second approach, but take the inverse result rather than the one you are showing.
2. Run connected component labeling.
3. For each component, run the OCR algorithm (a minimal sketch follows).
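A minimal sketch of those three steps, assuming a binary thresholded input; the OCR call itself is left as a hypothetical hook (runOcrOnPatch) rather than a concrete API:

#include <opencv2/opencv.hpp>

void ocrPerComponent(const cv::Mat& thresholded)
{
    cv::Mat inverted;
    cv::bitwise_not(thresholded, inverted);   // step 1: take the inverse result

    cv::Mat labels, stats, centroids;
    int n = cv::connectedComponentsWithStats(inverted, labels, stats, centroids);  // step 2

    for (int i = 1; i < n; ++i)               // step 3: OCR on each component
    {
        cv::Rect box(stats.at<int>(i, cv::CC_STAT_LEFT),
                     stats.at<int>(i, cv::CC_STAT_TOP),
                     stats.at<int>(i, cv::CC_STAT_WIDTH),
                     stats.at<int>(i, cv::CC_STAT_HEIGHT));
        cv::Mat patch = inverted(box);
        // runOcrOnPatch(patch);  // hypothetical hook for Tesseract or any other OCR
    }
}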
In order to discard outliers, I would try to use the spatial relation between the detected letters: they should have other letters horizontally or vertically next to them.
Good luck

OCR in opencv - how to pass objects

I'd like to write an OCR in OpenCV that recognizes single letters using K-Nearest Neighbors. The letters may vary in size and font, and may be handwritten.
So, I'll prepare images to train on. The first question is: should I use letters in (1) images of the same size, or (2) images fitted to each letter?
1)
2)
What about the letters found at recognition time? Should I pass them as in (1) (the same size as the training images) or as in (2) (a rectangle fitted to the letter)?
The "benchmark" MNIST dataset normalizes and centers the characters as in scenario (1) you described. If you're just interested in classification, it may not make much difference how you do it.
If I understand you correctly, your second question has to do with what's called "preprocessing" in ML jargon. If you apply a transformation to convert each raw image into one of type (1) or (2), it's called a preprocessing step, whichever one you choose. Whatever preprocessing you do to the training set, the exact same preprocessing has to be done to the data before applying the model.
To make it simple, if you have a giant data set that you want to split into "training" and "testing" examples, first transform this into a "preprocessed data" set, and split this one. That way you're sure the exact same transformation parameters are used for both training and testing.
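A small sketch of what such a preprocessing function could look like for option (1), applied identically to training and query samples; the 20x20 canvas size is an arbitrary choice and a grayscale input is assumed:

#include <opencv2/opencv.hpp>

// Turn one grayscale letter image into a fixed-size float row vector,
// since cv::ml::KNearest expects one row per sample.
cv::Mat preprocessLetter(const cv::Mat& grayLetter)
{
    cv::Mat binary;
    cv::threshold(grayLetter, binary, 0, 255, cv::THRESH_BINARY_INV | cv::THRESH_OTSU);

    cv::Rect box = cv::boundingRect(binary);           // crop tightly around the glyph
    cv::Mat fixed, row;
    cv::resize(binary(box), fixed, cv::Size(20, 20));  // same size and rough centering for every sample
    fixed.reshape(1, 1).convertTo(row, CV_32F);
    return row;
}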

Improve Tesseract detection quality

I am trying to extract alphanumeric characters (a-z, 0-9) which do not form meaningful words from an image taken with a consumer camera (including mobile phones). The characters have equal size and font type and are not formatted. The actual processing is done under Windows.
The following image shows the raw input:
After perspective processing I apply the following with OpenCV:
Convert from RGB to gray
Apply cv::medianBlur to remove noise
Convert the image to binary using adaptive thresholding cv::adaptiveThreshold
I know the number of rows and columns of the grid. Thus I simply extract each grid cell using this information.
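For reference, a sketch of that pipeline written out in OpenCV; the blur/threshold parameters and the straight grid split are assumptions, not the exact settings used in the question:

#include <opencv2/opencv.hpp>
#include <vector>

std::vector<cv::Mat> extractCells(const cv::Mat& bgrInput, int rows, int cols)
{
    cv::Mat gray, blurred, binary;
    cv::cvtColor(bgrInput, gray, cv::COLOR_BGR2GRAY);   // convert to gray
    cv::medianBlur(gray, blurred, 3);                   // remove noise
    cv::adaptiveThreshold(blurred, binary, 255,
                          cv::ADAPTIVE_THRESH_GAUSSIAN_C,
                          cv::THRESH_BINARY, 31, 10);   // binarize adaptively

    // Cut the known grid into equally sized cells.
    std::vector<cv::Mat> cells;
    int cellW = binary.cols / cols, cellH = binary.rows / rows;
    for (int r = 0; r < rows; ++r)
        for (int c = 0; c < cols; ++c)
            cells.push_back(binary(cv::Rect(c * cellW, r * cellH, cellW, cellH)).clone());
    return cells;
}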
After all these steps I get images which look similar to these:
Then I run tesseract (latest SVN version with latest training data) on each extracted cell image individually (I tried different -psm and -l values):
tesseract.exe -l eng -psm 11 sample.png outtext
The results produced by tesseract are not very good:
Most characters are not recognized.
The grid lines are sometimes interpreted as "l" or "i" characters.
I already experimented with morphological operations (open, close, erode, dilate) and replaced adaptive thresholding with Otsu thresholding (THRESH_OTSU), but the results got worse.
What else could I try to improve the recognition quality? Or is there even a better method to extract the characters besides using tesseract (for instance template matching?)?
Edit (21-12-2014):
I tested simple template matching (using normalized cross-correlation and LMS), but with even worse results. However, I have made a huge step forward by extracting each character using findContours and then running tesseract on one character at a time with the -psm 10 option, which interprets each input image as a single character. Additionally, I remove non-alphanumeric characters in a post-processing step. The first results are encouraging, with detection rates of 90% and better. The main problem is misdetection of "9", "g" and "q" characters.
Regards,
As I say here, you can tell tesseract to pay attention to "almost the same" characters.
Also, there are some options in tesseract that won't help you in your example.
For instance, "Pocahonta5S" will, most of the time, become "PocahontaSS" because the digit sits inside a word of letters.
Concerning pre-processing, you should use a sharpening filter.
Don't forget that tesseract always applies an Otsu threshold before reading anything.
If you want good results, sharpening plus adaptive thresholding (with some other filters) is a good idea.
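A rough illustration of sharpening followed by adaptive thresholding; the kernel and threshold parameters are generic defaults rather than tuned values:

#include <opencv2/opencv.hpp>

cv::Mat sharpenAndBinarize(const cv::Mat& gray)
{
    // Simple unsharp-style sharpening kernel.
    cv::Mat kernel = (cv::Mat_<float>(3, 3) <<
                       0, -1,  0,
                      -1,  5, -1,
                       0, -1,  0);
    cv::Mat sharpened, binary;
    cv::filter2D(gray, sharpened, -1, kernel);
    cv::adaptiveThreshold(sharpened, binary, 255,
                          cv::ADAPTIVE_THRESH_MEAN_C,
                          cv::THRESH_BINARY, 25, 15);
    return binary;
}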
I recommend using OpenCV in combination with tesseract.
The problem for tesseract with your input images is the non-character regions they contain.
An approach I would try myself:
To get rid of these, I would use the OpenCV findContours function to obtain all contours in your binary image. Afterwards, define some criteria to eliminate the non-character regions: for example, only take regions that lie inside the image and don't touch the border, or only take regions with a specific area or a specific height-to-width ratio. Find some kind of features that let you distinguish between character and non-character contours.
Afterwards, eliminate these non-character regions and hand the images on to tesseract.
Just as an idea for testing this approach in general:
Eliminate the non-character regions manually (GIMP, Paint, ...) and give the image to tesseract. If the result meets your expectations, you can try to eliminate the non-character regions with the method proposed above.
I suggest a similar approach I'm using in my case.
(I only have a problem with speed, which you should not have if it's only a few characters to compare.)
First: get the form to a default size and transform it:
https://www.youtube.com/watch?v=W9oRTI6mLnU
Second: Use matchTemplate
Improve template matching with many templates for one Image/ find characters on image
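A bare-bones matchTemplate sketch for a single character template, roughly what the second step refers to; TM_CCOEFF_NORMED and the 0.8 correlation threshold are assumptions:

#include <opencv2/opencv.hpp>
#include <vector>

std::vector<cv::Point> findCharacter(const cv::Mat& image, const cv::Mat& charTemplate)
{
    cv::Mat response;
    cv::matchTemplate(image, charTemplate, response, cv::TM_CCOEFF_NORMED);

    cv::Mat matches;
    std::vector<cv::Point> hits;
    for (int y = 0; y < response.rows; ++y)
        for (int x = 0; x < response.cols; ++x)
            if (response.at<float>(y, x) > 0.8f)   // good enough correlation
                hits.push_back(cv::Point(x, y));   // top-left corner of a match
    return hits;
}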
I also played around with OCR, but I didn't like it for two reasons:
It is some kind of black box, and it is hard to debug why something is not recognized.
In my case it was never 100% accurate, no matter what I did, even for screenshots with "perfect" characters.

how to identify a car after performing canny edge in a still image opencv c++ [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking for code must demonstrate a minimal understanding of the problem being solved. Include attempted solutions, why they didn't work, and the expected results. See also: Stack Overflow question checklist
Closed 9 years ago.
I am new to OpenCV and I want to know how I can identify the cars in a Canny-edged image,
because I want to count the cars in the image based on their edges.
Here is the canny edged image
And here is the original image
The general problem of identifying dynamic objects in a given scene, for whatever purpose such as counting, may be tackled by the use of background subtraction.
The idea is to use one of the implementations of this technique that OpenCV provides, BackgroundSubtractorMOG for instance, to construct a background model for your scene by feeding it every frame of a video stream. It identifies which features of the scene are most probably static and builds a synthetic image of the most probable background, the parking lot without cars in your case. You would then subtract a given frame from this synthetic background and count the blobs which have a minimum size, i.e. are big enough to be vehicles.
The results are impressive and I particularly love this technique. On YouTube you can check some examples; I suggest this one, which is very close to your particular case. This one here is also very interesting, because it displays the synthetic background image side by side with the current frame, so you can see how well it works. Pay close attention around 00:50 in this last video: you can see a car slowly appearing in the background image because it stays in the same spot for too long.
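A minimal sketch of the idea, here using the MOG2 variant that ships in OpenCV's core video module (the MOG subtractor mentioned above lives in the opencv_contrib bgsegm module in recent versions); the blob-size threshold is a placeholder:

#include <opencv2/opencv.hpp>
#include <vector>

void countVehicles(cv::VideoCapture& stream)
{
    cv::Ptr<cv::BackgroundSubtractor> subtractor = cv::createBackgroundSubtractorMOG2();
    cv::Mat frame, foregroundMask;

    while (stream.read(frame))
    {
        subtractor->apply(frame, foregroundMask);   // update the model, get the foreground

        std::vector<std::vector<cv::Point>> blobs;
        cv::findContours(foregroundMask.clone(), blobs,
                         cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);

        int vehicles = 0;
        for (const auto& blob : blobs)
            if (cv::contourArea(blob) > 2000.0)     // "big enough to be a vehicle"
                ++vehicles;
        // use the per-frame count here ...
    }
}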
Aren't humans good at spotting things? You even recognize the cars in the canny edge image, even though there is not a single wheel visible.
Anyway, the main reason why you're using canny edge detection is because you have a datastream of 10-100 Megapixels per second. You need to quickly find the interesting bits in there. And as your image shows, it works fantastically for that.
Now, to count actual cars in parking spaces, I would suggest a fixed setup procedure that identifies the potential parking spots. You don't want to count passing cars anyway. This step can be semi-automated by checking for parallel sets of lines in the canny image.
Once you've got those parking spots identified, it may be a good idea to define a mask. Use this mask to zero out the non-parking spot pixels. (Doing this before canny edge detection speeds up that process too, but obviously adds a false edge around the mask so you'd have to reapply the mask.)
Now it's really just checking if there's anything sufficiently big in a parking spot. You probably don't care if a motorbike is counted as a car anyway. To do so, use the canny edges to separate the car pixels from the surrounding parking lot pixels, and count if they differ (in color/brightness/texture/...)
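A hedged sketch of that mask-and-count step: zero out everything outside a hand-picked parking-spot polygon, run Canny, drop the artificial edge along the mask border, and call the spot occupied when enough edge pixels remain. The polygon coordinates, Canny thresholds and edge count are illustrative only:

#include <opencv2/opencv.hpp>
#include <vector>

bool spotOccupied(const cv::Mat& gray, const std::vector<cv::Point>& spotPolygon)
{
    cv::Mat mask = cv::Mat::zeros(gray.size(), CV_8U);
    std::vector<std::vector<cv::Point>> polys{spotPolygon};
    cv::fillPoly(mask, polys, cv::Scalar(255));

    cv::Mat masked, edges;
    gray.copyTo(masked, mask);              // keep only the parking-spot pixels
    cv::Canny(masked, edges, 50, 150);

    // Re-apply a slightly eroded mask so the false edge around the mask border is removed.
    cv::Mat innerMask;
    cv::erode(mask, innerMask, cv::Mat(), cv::Point(-1, -1), 2);
    edges.setTo(0, innerMask == 0);

    return cv::countNonZero(edges) > 200;   // "sufficiently big" threshold, to be tuned
}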

How to detect Text Area from image?

I want to detect the text area in an image as a preprocessing step for the tesseract OCR engine. The engine works well when the input is text only, but when the input image contains non-text content it fails, so I want to detect only the text content in the image. Any idea of how to do that would be helpful, thanks.
Take a look at this bounding box technique demonstrated with OpenCV code:
Input:
Eroded:
Result:
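For readers without the linked code, a rough reconstruction of that technique (the linked answer erodes an inverted image; dilating the inverted binary, as below, has the same effect); the kernel size is an assumption:

#include <opencv2/opencv.hpp>
#include <vector>

std::vector<cv::Rect> findTextBoxes(const cv::Mat& gray)
{
    cv::Mat binary, merged;
    cv::threshold(gray, binary, 0, 255, cv::THRESH_BINARY_INV | cv::THRESH_OTSU);

    // A wide, flat kernel smears neighbouring characters into word/line blobs.
    cv::Mat kernel = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(17, 3));
    cv::dilate(binary, merged, kernel);

    std::vector<std::vector<cv::Point>> contours;
    cv::findContours(merged, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);

    std::vector<cv::Rect> boxes;
    for (const auto& c : contours)
        boxes.push_back(cv::boundingRect(c));  // one bounding box per text blob
    return boxes;
}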
Well, I'm not well-experienced in image processing, but I hope I could help you with my theoretical approach.
In most cases, text forms parallel, horizontal rows, and the space between rows contains lots of background pixels. This can be utilized to solve the problem.
So... if you merge all the pixel columns of the image into a single column, you'll get a 1-pixel-wide image as output. When the input image contains text, the output will very likely show a periodic pattern, where dark areas are followed by brighter areas repeatedly. These "groups" of darker pixels indicate the position of the text content, while the brighter "groups" indicate the gaps between the individual rows.
You'll probably find that the brighter areas are much smaller than the others. Text is much more regular than most other picture elements, so it should be easy to separate.
You have to implement a procedure to detect these periodic recurrences. Once the script can determine that the input picture has these characteristics, there's a high chance that it contains text. (However, this approach can't distinguish between actual text and simple horizontal stripes...)
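A small sketch of that projection idea, assuming dark text on a light background; the brightness threshold of 200 is a guess:

#include <opencv2/opencv.hpp>
#include <vector>

// Collapse the image to a single column and treat runs of "dark" rows as text lines.
std::vector<cv::Range> findTextRows(const cv::Mat& gray)
{
    cv::Mat profile;  // one mean value per row -> a 1-pixel-wide image
    cv::reduce(gray, profile, 1, cv::REDUCE_AVG, CV_32F);

    std::vector<cv::Range> rows;
    int start = -1;
    for (int y = 0; y < profile.rows; ++y)
    {
        bool dark = profile.at<float>(y, 0) < 200.0f;   // this row contains text-like pixels
        if (dark && start < 0)
            start = y;
        if (!dark && start >= 0) { rows.push_back(cv::Range(start, y)); start = -1; }
    }
    if (start >= 0)
        rows.push_back(cv::Range(start, profile.rows));
    return rows;
}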
For the next step, you must find a way to determine the boundaries of the paragraphs using the above-mentioned method. I'm thinking of a pretty simple algorithm which would divide the input image into smaller, narrow stripes (50-100 px) and check these areas separately. Then it would compare the results to build a map of the possible areas filled with text. This method wouldn't be very accurate, but that probably won't bother the OCR system.
And finally, you need to use the text-map to run the OCR on the desired locations only.
On the other hand, this method will fail if the input text is rotated more than ~3-5 degrees. There's another drawback: if you have only a few rows, your pattern search will be very unreliable. More rows, more accuracy...
Regards, G.
I am new to stackoverflow.com, but I wrote an answer to a question similar to this one which may be useful to any readers who share this question. Whether or not the question is actually a duplicate, since this one was first, I'll leave up to others. If I should copy and paste that answer here, let me know. I also found this question first on Google rather than the one I answered, so this link may benefit more people, especially since it provides different ways of going about getting text areas. For me, when I looked up this question, it did not fit my problem case.
Detect text area in an image using python and opencv
At the current time, the best way to detect text is to use EAST (An Efficient and Accurate Scene Text Detector).
The EAST pipeline is capable of predicting words and lines of text at arbitrary orientations on 720p images, and furthermore, can run at 13 FPS, according to the authors.
EAST quick start tutorial can be found here
EAST paper can be found here
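If you already use OpenCV, EAST can also be run through the dnn module (OpenCV 4.5 or newer). A hedged sketch, where the model file name is the commonly distributed frozen_east_text_detection.pb and the thresholds are common defaults, not values from the tutorial above:

#include <opencv2/opencv.hpp>
#include <opencv2/dnn.hpp>
#include <vector>

std::vector<std::vector<cv::Point>> detectText(const cv::Mat& bgrImage)
{
    cv::dnn::TextDetectionModel_EAST east("frozen_east_text_detection.pb");
    east.setConfidenceThreshold(0.5f);
    east.setNMSThreshold(0.4f);
    // Input size must be a multiple of 32; the mean values match the usual EAST setup.
    east.setInputParams(1.0, cv::Size(320, 320), cv::Scalar(123.68, 116.78, 103.94), true);

    std::vector<std::vector<cv::Point>> quads;   // one 4-point quadrilateral per detected word
    east.detect(bgrImage, quads);
    return quads;
}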