Finding all the regions in a webpage's image - python-2.7

I am working on a project where I need to find the different regions present in an image (of any web page), like the navigation bar, menu bar, body, advertisement section, etc. First I want to segment the entire image into distinct regions/sections using image processing.
What I have done:
1st approach: I ran an edge detection algorithm (Canny); this way I could see different regions in the form of rectangular boxes. However, I couldn't find a way to recognize all these regions.
2nd approach: I used the Hough transform to get all the horizontal and vertical lines, which can help in deciding the different rectangular sections in the image. However, I am not able to come up with a concrete approach for using these Hough lines to find all the rectangular regions embedded in the image.
Any help is highly appreciated!
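For reference, here is a minimal sketch of the first approach carried one step further in Python/OpenCV: run Canny, close small gaps so section borders form closed shapes, and take bounding boxes of the external contours. The file name and all thresholds are placeholders, and note that findContours returns two values in OpenCV 2.x/4.x but three in 3.x:

import cv2
import numpy as np

img = cv2.imread("page.png")  # placeholder: a screenshot of the web page
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 50, 150)
# Close small gaps so section borders form closed shapes
closed = cv2.morphologyEx(edges, cv2.MORPH_CLOSE, np.ones((5, 5), np.uint8))
contours, _ = cv2.findContours(closed, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for c in contours:
    x, y, w, h = cv2.boundingRect(c)
    if w * h > 5000:  # keep only section-sized boxes; threshold is a guess
        cv2.rectangle(img, (x, y), (x + w, y + h), (0, 0, 255), 2)
cv2.imwrite("sections.png", img)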

Related

Image bounding boxes from sprite sheet

I have a sprite sheet containing a set of icons, as shown here:
I'd like to get the bounding box (at pixel precision) of every icon inside it; some cases, like the list and grid icons, have to be considered as only one icon. Any ideas are more than welcome.
I think the main issue in your problem is that some icons contain disjoint parts.
If all the icons were in only one part, you could just find the "connected components" (groups of white pixels) in your image and isolate them.
I don't know your level in image processing, but to connect the parts of one icon, I would probably use dilation, which is a morphological method to expand (under constraints) the areas of maximum intensity in an image.
If you need any clarification, please let me know!
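To make that concrete, here is a minimal sketch in Python/OpenCV, assuming white icons on a black background and OpenCV 3+ for connectedComponentsWithStats; the 9x9 kernel is a guess and must stay smaller than the spacing between neighboring icons:

import cv2
import numpy as np

img = cv2.imread("sprite_sheet.png", cv2.IMREAD_GRAYSCALE)  # placeholder path
_, bw = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
# Dilation bridges the gaps between the disjoint parts of one icon
fat = cv2.dilate(bw, np.ones((9, 9), np.uint8))
n, labels, stats, _ = cv2.connectedComponentsWithStats(fat)
for i in range(1, n):  # label 0 is the background
    x, y, w, h, area = stats[i]
    # Boxes are a few pixels loose because of the dilation; intersect
    # each component with bw to tighten them to pixel precision
    print("icon %d: x=%d y=%d w=%d h=%d" % (i, x, y, w, h))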
In general, it is not possible: only humans have enough context to determine which of the disjoint parts belong together. You can approximate it in various ways, but it's a lost cause, and IMHO completely unnecessary. Imagine writing a test for this functionality: it's impossible, it requires a human in the loop, since the results for any particular icon sheet don't generalize. Knowing that the algorithm works for some sheet tells you nothing about whether it will work for some other sheet that you know nothing about a priori.
It'd be simpler to manually colorize each sprite with a color different from that of its neighbors. Then a greedy algorithm could find the bounding boxes easily without having to approximate anything.
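A rough sketch of that colorize-first idea, assuming each sprite has been hand-painted a unique flat color on a black background (needs numpy 1.13+ for the axis argument of np.unique):

import cv2
import numpy as np

img = cv2.imread("sprites_colorized.png")  # placeholder path
for color in np.unique(img.reshape(-1, 3), axis=0):
    if not color.any():
        continue  # skip the black background
    # Bounding box of all pixels painted exactly this color
    ys, xs = np.nonzero(np.all(img == color, axis=2))
    print(tuple(color), "box:", xs.min(), ys.min(), xs.max(), ys.max())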

Error in train_object_detector.cpp dlib

I was trying to run train_object_detector.cpp in the dlib library to train it for pedestrian detection. I'm using the INRIA dataset, and when I tried to use it, an exception was thrown:
exception thrown!
Error! An impossible set of object boxes was given for training. All the boxes need to have a similar aspect ratio and also not be smaller than about 1600 pixels in area. The following images contain invalid boxes:
crop001002.png
crop001027.png
crop001038.png
crop001160.png
crop001612.png
crop001709.png
Try the -h option for more information.
When I removed these photos, it ran and loaded all the photos, but then another exception was thrown:
exception thrown!
An impossible set of object labels was detected. This is happening because none of the object locations checked by the supplied image scanner is a close enough match to one of the truth boxes. To resolve this you need to either lower the match_eps or adjust the settings of the image scanner so that it hits this truth box. Or you could adjust the offending truth rectangle so it can be matched by the current image scanner. Also, if you are using the scan_image_pyramid object then you could try using a finer image pyramid or adding more detection templates. E.g. if one of your existing detection templates has a matching width/height ratio and smaller area than the offending rectangle then a finer image pyramid would probably help.
Please help me deal with this.
Did you label your images using ImgLab?
When you label your images with this tool, keep in mind that your bounding boxes must have a similar aspect ratio and must not be smaller than the sliding window.
Usually, the example that you are running should dynamically calculate the size of the sliding window according to the provided boxes.
If none of these helps, I'd suggest that you modify the source code a bit to track down the error source.
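As a first check, something like the following sketch can scan the imglab-style training XML for boxes dlib would reject. Here training.xml is a placeholder path, the 1600 px area limit comes straight from the error message, and the 25% aspect-ratio tolerance is only a guess:

import xml.etree.ElementTree as ET

tree = ET.parse("training.xml")  # the file passed to train_object_detector
boxes = []
for image in tree.getroot().iter("image"):
    for box in image.iter("box"):
        w, h = int(box.get("width")), int(box.get("height"))
        boxes.append((image.get("file"), w, h))

mean_ratio = sum(float(w) / h for _, w, h in boxes) / len(boxes)
for f, w, h in boxes:
    if w * h < 1600:
        print("%s: %dx%d box is under 1600 px in area" % (f, w, h))
    elif abs(float(w) / h - mean_ratio) > 0.25 * mean_ratio:
        print("%s: %dx%d deviates from mean aspect ratio %.2f" % (f, w, h, mean_ratio))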

How can you graph error as a shaded region?

When using TGraphErrors, the error bars appear as crosses. In the absence of significant X errors, and with many, many data points (such as an MCA spectrum with 16k bins or so), I'd like to be able to remove the individual points and error bars and graph the error as a shaded region bounding the curve from above and below.
But I'm still a rank beginner at using ROOT, and I cannot figure out how to leverage TGraphErrors to do what I want. Will I need to use a TMultiGraph instead (and calculate the upper and lower bounding curves), and if so, how can I control the shaded region?
Something like the plot below would be along the lines of what I'm looking for.
Take a look at the TGraphPainter documentation, which gives a few examples. One way is to draw the TGraphErrors using option 4:
A smoothed filled area is drawn through the end points of the vertical error bars.
You will probably find that to get the final plot to look as you want, you have to draw the same graph multiple times - once to get the shaded region, then again on top to get the central curve.
This blog post gives a working example. It's written in PyROOT, but can be easily adapted to C++.
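Putting those pieces together, here is a minimal PyROOT sketch with made-up sine data; the fill color, transparency, and error sizes are arbitrary:

import math
from array import array
import ROOT

n = 200
x  = array('d', [0.05 * i for i in range(n)])
y  = array('d', [math.sin(v) for v in x])
ex = array('d', [0.0] * n)    # negligible X errors
ey = array('d', [0.15] * n)   # constant Y error, just for illustration

g = ROOT.TGraphErrors(n, x, y, ex, ey)
g.SetTitle("error band;x;y")
g.SetFillColorAlpha(ROOT.kAzure + 1, 0.35)
g.SetLineColor(ROOT.kBlue + 1)
g.SetLineWidth(2)

c = ROOT.TCanvas("c", "shaded error band")
g.Draw("A4")       # "4": smoothed filled area through the error-bar end points
g.Draw("LX SAME")  # draw the same graph again: central curve, no error bars
c.SaveAs("band.png")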

How to detect Text Area from image?

I want to detect the text area in an image as a preprocessing step for the Tesseract OCR engine. The engine works well when the input is text only, but when the input image contains non-text content it fails. So I want to detect only the text content in the image. Any idea of how to do that would be helpful, thanks.
Take a look at this bounding box technique demonstrated with OpenCV code:
(Example images omitted: input, eroded intermediate, result.)
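A minimal sketch of that erode-and-box idea in Python/OpenCV; the kernel size and speck thresholds are guesses to tune per input, and findContours returns two values in OpenCV 2.x/4.x but three in 3.x:

import cv2

img = cv2.imread("input.png")  # placeholder path
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Erosion grows the dark (text) pixels, smearing letters into blobs
eroded = cv2.erode(gray, cv2.getStructuringElement(cv2.MORPH_RECT, (15, 3)))
_, bw = cv2.threshold(eroded, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)
contours, _ = cv2.findContours(bw, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for cnt in contours:
    x, y, w, h = cv2.boundingRect(cnt)
    if w > 20 and h > 8:  # drop specks
        cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite("result.png", img)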
Well, I'm not well experienced in image processing, but I hope I can help you with my theoretical approach.
In most cases, text forms parallel, horizontal rows, where the space between rows contains lots of background pixels. This can be utilized to solve the problem.
So... if you collapse every pixel row of the image into a single value, you'll get a one-pixel-wide image as output. When the input image contains text, the output will very likely show a periodic pattern, where dark areas are followed by brighter areas repeatedly. These "groups" of darker pixels indicate the position of the text content, while the brighter "groups" indicate the gaps between the individual rows.
You'll probably find that the brighter areas are much smaller than the others. Text is much more regular than most other picture elements, so it should be easy to separate.
You have to implement a procedure to detect these periodic recurrences. Once the script can determine that the input picture has these characteristics, there's a high chance that it contains text. (However, this approach can't distinguish between actual text and simple horizontal stripes...)
For the next step, you must find a way to determine the boundaries of the paragraphs, using the above-mentioned method. I'm thinking of a pretty simple algorithm, which would divide the input image into smaller, narrow stripes (50-100 px) and check these areas separately. Then it would compare the results to build a map of the areas possibly filled with text. This method isn't very accurate, but that probably won't bother the OCR system.
And finally, you need to use the text-map to run the OCR on the desired locations only.
On the other hand, this method will fail if the input text is rotated more than ~3-5 degrees. There's another drawback, because if you have only a few rows, the pattern search will be very unreliable. More rows, more accuracy...
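A minimal sketch of the row-profile part of this idea in Python/OpenCV; the 5% threshold is a guess:

import cv2

gray = cv2.imread("page.png", cv2.IMREAD_GRAYSCALE)  # placeholder path
_, bw = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)
profile = bw.sum(axis=1)                # "ink" per pixel row
dark = profile > 0.05 * profile.max()   # rows that likely contain text
# Collect [start, end) runs of consecutive text rows
runs, start = [], None
for i, d in enumerate(dark):
    if d and start is None:
        start = i
    elif not d and start is not None:
        runs.append((start, i))
        start = None
if start is not None:
    runs.append((start, len(dark)))
print(runs)  # candidate text bands to hand to the OCR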
Regards, G.
I am new to stackoverflow.com, but I wrote an answer to a similar question which may be useful to any readers who share this question. Whether or not the question is actually a duplicate, since this one was first, I'll leave up to others. If I should copy and paste that answer here, let me know. I also found this question first on Google rather than the one I answered, so a link here may benefit more people, especially since it covers different ways of finding text areas. For me, when I looked up this question, it did not fit my problem case.
Detect text area in an image using python and opencv
At the current time, the best way to detect text is by using EAST (An Efficient and Accurate Scene Text Detector).
The EAST pipeline is capable of predicting words and lines of text at arbitrary orientations on 720p images, and furthermore, can run at 13 FPS, according to the authors.
EAST quick start tutorial can be found here
EAST paper can be found here
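For completeness, here is a sketch of running EAST through OpenCV's dnn module (needs OpenCV 3.4.2+ and the frozen_east_text_detection.pb model from the tutorial). The input size must be a multiple of 32, and decoding the geometry into rotated boxes plus non-maximum suppression is covered in the linked quick start:

import cv2

net = cv2.dnn.readNet("frozen_east_text_detection.pb")  # model path is a placeholder
img = cv2.imread("scene.jpg")
blob = cv2.dnn.blobFromImage(img, 1.0, (320, 320),
                             (123.68, 116.78, 103.94), swapRB=True, crop=False)
net.setInput(blob)
scores, geometry = net.forward(["feature_fusion/Conv_7/Sigmoid",
                                "feature_fusion/concat_3"])
# scores holds a text/no-text confidence per 4x4 cell of the input;
# geometry holds the box offsets and angle for each of those cells
print((scores[0, 0] > 0.5).sum(), "cells above the 0.5 confidence threshold")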

Remove a certain pattern from an image in OPENCV

I am trying to write software for document management. First I input the blank invoice, then feed in the other invoices with data. Using SIFT detectors, I determine what type of invoice it is.
Then I want to remove the intersection of the two images. Basically, this will keep only the entered information and remove the data common to every invoice. I want to know whether there is a proper way to remove areas from an image.
There is a concept in image processing called the region of interest (ROI). It creates a pointer to a sub-region of the original image, which could help you read directly at given x, y coordinates in the image.
Another possibility would be to subtract the original (blank) image. But depending on the quality of the filled-form picture, this might lead to other problems.
I was suggesting the ROI in the sense that you could create an ROI for every place where the form has input data and process only those specific regions.
I found a function that might help you, cvAbsDiff, which can subtract one image from another.
Here is a link that might help you understand how to use it:
http://blog.damiles.com/?p=67
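A short sketch of the cvAbsDiff route using the modern cv2 API, assuming the two scans are already aligned (e.g. via the SIFT matching mentioned above plus cv2.findHomography and cv2.warpPerspective); the file names and the threshold are placeholders:

import cv2

blank  = cv2.imread("blank_invoice.png", cv2.IMREAD_GRAYSCALE)
filled = cv2.imread("filled_invoice.png", cv2.IMREAD_GRAYSCALE)
diff = cv2.absdiff(filled, blank)  # cv2 equivalent of the old cvAbsDiff
# Keep only pixels that differ clearly: the entered data
_, mask = cv2.threshold(diff, 40, 255, cv2.THRESH_BINARY)
cv2.imwrite("entered_data_only.png", mask)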