streaking image with opencv - c++

So I've been working on some code for a couple of weeks and it is far from complete; however, the one thing keeping me from moving forward is a strange problem which I cannot figure out. I've been stuck for a few days now.
The code below is for a program which accepts two command line arguments, an infile and an outfile. The infile will be a small square binary TIFF image, somewhere between 200x200 and 400x400. At this point the program should tile the image, stretching each part to various lengths. The outfile should have a height of 768 pixels and a width in the ballpark of 50k to 60k pixels. I apologize, but I can't supply example images; they are confidential.
While it does work, sort of, it only replicates images to around 34k pixels and stops. The last row continues to display a black streak to the end.
I think the problem is coming from my create1track() function. I have tried optimizing it with very few changes. If I use a while loop as opposed to a for loop I get three black streaks as opposed to one. Does anyone have any suggestions on why it might do this?
It's a pretty simple function; I don't see why it shouldn't work.
I'm posting my entire code, hoping for some advice. A copy is stored here:
https://www.dropbox.com/s/sp153rz252uikue/main_backup.cpp
I'd accept any other criticism or input, just be nice; I only started teaching myself C++ about 2 months ago, and since I'm pretty new to programming in general, I'm sure there are plenty of things I'm doing wrong.
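In case it helps readers follow along, here is a minimal sketch of the tile-and-stretch idea described in the question, assuming OpenCV and C++. It is not the original create1track(); the tile widths, file handling, and resize interpolation are placeholder assumptions.

// Minimal sketch of the tile-and-stretch idea, not the original create1track().
// Assumes: an input tile (square binary TIFF, ~200x200 to ~400x400) is stretched
// to varying widths and copied side by side into one wide output image.
#include <opencv2/opencv.hpp>
#include <vector>
#include <iostream>

cv::Mat create1track(const cv::Mat& tile, const std::vector<int>& widths, int outHeight)
{
    // Total output width is the sum of all stretched tile widths.
    int totalWidth = 0;
    for (int w : widths) totalWidth += w;

    cv::Mat out(outHeight, totalWidth, tile.type(), cv::Scalar(0));

    int x = 0;                                   // plain int easily covers 50k-60k columns
    for (int w : widths) {
        cv::Mat stretched;
        cv::resize(tile, stretched, cv::Size(w, outHeight), 0, 0, cv::INTER_NEAREST);
        stretched.copyTo(out(cv::Rect(x, 0, w, outHeight)));   // ROI copy, no per-pixel loop
        x += w;
    }
    return out;
}

int main(int argc, char** argv)
{
    if (argc < 3) { std::cerr << "usage: prog infile outfile\n"; return 1; }
    cv::Mat tile = cv::imread(argv[1], cv::IMREAD_GRAYSCALE);
    if (tile.empty()) { std::cerr << "could not read " << argv[1] << "\n"; return 1; }

    std::vector<int> widths(200, 280);           // placeholder widths, ~56k pixels total
    cv::Mat out = create1track(tile, widths, 768);
    cv::imwrite(argv[2], out);
    return 0;
}

Copying each stretched tile into an ROI of a preallocated output Mat keeps all the column arithmetic in plain ints and avoids per-pixel loops, which is usually where streaking or truncation bugs creep in.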

Related

How do I prevent square images that span 2 columns from pushing out the overall height of the row?

Example here: https://springetts.co.uk/test/
I'm having some difficulty figuring out the best way to turn the third image tile into a rectangle without having to change the original image shape.
Here's a mockup of the result I would like:
[screenshot of the third image as a rectangle]
I've tried a variety of things, but can't seem to figure out an elegant way for it to work nicely and so I would really appreciate some help. Thank you.
I've tried adding height:50% to the containing element in an attempt to crop the image nested inside it, but that did not work.
Just add this CSS rule somewhere.
.page-id-5077 #post-2661 img {
    aspect-ratio: 2/1;
    object-fit: cover;
}
That will get you pretty close, although you may need to change the aspect ratio slightly to get the height to match the square images in columns 1 and 2, because your rectangle needs to account for the gap between columns 3 and 4. Through experimentation, I found aspect-ratio: 41/20; to be fairly accurate.
PS. While in this case you provided a link which enabled me to understand the problem you’re trying to solve, on Stack Overflow it is considered good practice not to rely on links, because they tend to change over time. Better to write questions which are self-contained and therefore can still be useful in ten years’ time. That means including enough code in your question so that we can understand the problem without reference to external links, or better still, including a working Stack Snippet. More details here.

Computer vision algorithm to use for making lines thinner

I have lecture notes written by a professor using a stylus.
A sample:
The width of the line used here is making reading difficult for me. I would like to make the lines thinner. The only solution I could think of is dilating the image. This gives a passable result:
The picture above is with a uniform kernel of shape (2, 2) applied once; I've tried a bunch of kernel types, widths and numbers of iterations to arrive at this version, which looks best to me.
However, I can't help but wonder if there's maybe another applicable algorithm that I'm missing; one that could lead to even better results? I wasn't able to google any computer vision approaches to font thinning, so I would appreciate any information on the subject.
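For context, the dilation step described above might look roughly like this in OpenCV C++ (the kernel size and single iteration mirror the question; the file names are placeholders):

// Rough sketch of thinning dark strokes by dilating the (bright) background.
// Dilation takes the neighbourhood maximum, so on dark-text-on-white notes it
// shrinks the dark pen strokes. Kernel shape (2, 2) and one iteration mirror
// what is described above; "notes.png" is a placeholder file name.
#include <opencv2/opencv.hpp>

int main()
{
    cv::Mat notes = cv::imread("notes.png", cv::IMREAD_GRAYSCALE);
    if (notes.empty()) return 1;

    cv::Mat kernel = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(2, 2));
    cv::Mat thinner;
    cv::dilate(notes, thinner, kernel, cv::Point(-1, -1), /*iterations=*/1);

    cv::imwrite("notes_thinner.png", thinner);
    return 0;
}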
I have been looking into this kind of problem for several days. Try the thinning operation described here; the link also appears in the references of the OpenCV-Python tutorial on morphological transforms. Taking the image gradient can help too, but it will make the image grayscale, and by inverting the colors you can get black-on-white text. Try keeping the original color at the black-pixel locations when the original and final images are stacked.
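And here is a hedged sketch of the suggested thinning step, using cv::ximgproc::thinning from the opencv_contrib ximgproc module. thinning() expects white strokes on a black background, so the notes are inverted first; the Otsu threshold and file names are assumptions:

// Sketch of skeleton-style thinning with opencv_contrib's ximgproc module.
// The notes are binarized and inverted so the strokes are white, thinned,
// then inverted back to dark-on-white for reading.
#include <opencv2/opencv.hpp>
#include <opencv2/ximgproc.hpp>

int main()
{
    cv::Mat notes = cv::imread("notes.png", cv::IMREAD_GRAYSCALE);
    if (notes.empty()) return 1;

    cv::Mat strokes;
    cv::threshold(notes, strokes, 0, 255, cv::THRESH_BINARY_INV | cv::THRESH_OTSU);

    cv::Mat skeleton;
    cv::ximgproc::thinning(strokes, skeleton, cv::ximgproc::THINNING_ZHANGSUEN);

    cv::Mat result;
    cv::bitwise_not(skeleton, result);            // back to dark strokes on white
    cv::imwrite("notes_thin.png", result);
    return 0;
}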

Opencv C++ Recognize number

This should be easy. I'm working on a Sudoku solver and I am trying to figure out how to tell which number I am looking at.
I am able to isolate the number as seen above. I just can't get any image recognition to work. I've tried KNearest and something called Tesseract, but to no avail. Any help?
For easy tasks like this, I would not recommend using something like Tesseract. Just think of a simple trick. For example, threshold the image and count the black pixels, and see what the count is for each digit. Of course this method will fail to tell 6 from 9, so you may cut the digit into two halves, count each half, and compare, and so on.
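A minimal sketch of that pixel-counting idea, assuming a single pre-cropped digit image; the file name, the Otsu threshold, and the per-digit reference counts below are placeholders that would have to be measured from real samples:

// Sketch: classify a cropped Sudoku digit by counting ink pixels in the top
// and bottom halves, then matching against reference counts measured once
// from known digit images. The reference values here are made up.
#include <opencv2/opencv.hpp>
#include <array>
#include <utility>
#include <cstdlib>
#include <climits>
#include <iostream>

int main()
{
    cv::Mat digit = cv::imread("digit.png", cv::IMREAD_GRAYSCALE);
    if (digit.empty()) return 1;

    cv::Mat bin;
    cv::threshold(digit, bin, 0, 255, cv::THRESH_BINARY_INV | cv::THRESH_OTSU); // ink -> white

    cv::Mat top = bin(cv::Rect(0, 0, bin.cols, bin.rows / 2));
    cv::Mat bottom = bin(cv::Rect(0, bin.rows / 2, bin.cols, bin.rows - bin.rows / 2));

    int topCount = cv::countNonZero(top);
    int bottomCount = cv::countNonZero(bottom);

    // Placeholder reference (top, bottom) counts for digits 1..9.
    std::array<std::pair<int, int>, 9> refs = {{
        {120, 130}, {210, 260}, {220, 240}, {200, 180}, {240, 220},
        {230, 290}, {180, 150}, {260, 270}, {290, 230}
    }};

    int best = -1, bestDist = INT_MAX;
    for (int d = 0; d < 9; ++d) {
        int dist = std::abs(topCount - refs[d].first) + std::abs(bottomCount - refs[d].second);
        if (dist < bestDist) { bestDist = dist; best = d + 1; }
    }
    std::cout << "guess: " << best << "\n";
    return 0;
}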

Optimize calculation of white pixels in a binary image

I have a program which does the following steps (using OpenCV); a rough sketch of the loop follows the list:
Connect to a camera
Start a loop
Fetch frame
Extract red channel
Threshold the extracted channel
Put it into a deque to build a buffer (right now, a three image buffer)
Calculate the variation among frames in the buffer (some morphology included)
Take that variation as a binary image
Count the amount of variation (white pixels)
If there's variation, calculate its center.
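For reference, here is a rough sketch of that per-frame loop in OpenCV C++. The camera index, threshold value, morphology, and the absdiff-based "variation" are stand-ins for whatever the original code does, not the asker's exact method:

// Rough sketch of the per-frame loop described above. "Variation" here is the
// absolute difference between the newest and oldest buffered frames followed
// by an opening; the original code may differ.
#include <opencv2/opencv.hpp>
#include <deque>
#include <vector>
#include <iostream>

int main()
{
    cv::VideoCapture cap(0);                       // 1. connect to a camera
    if (!cap.isOpened()) return 1;

    std::deque<cv::Mat> buffer;                    // three-image buffer
    cv::Mat kernel = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(3, 3));

    for (;;) {                                     // 2. start a loop
        cv::Mat frame;
        if (!cap.read(frame)) break;               // 3. fetch frame

        std::vector<cv::Mat> channels;
        cv::split(frame, channels);                // 4. extract red channel (BGR order)
        cv::Mat red = channels[2];

        cv::Mat thresh;
        cv::threshold(red, thresh, 128, 255, cv::THRESH_BINARY);   // 5. threshold

        buffer.push_back(thresh);                  // 6. build the buffer
        if (buffer.size() < 3) continue;
        if (buffer.size() > 3) buffer.pop_front();

        cv::Mat variation;                         // 7-8. variation as a binary image
        cv::absdiff(buffer.back(), buffer.front(), variation);
        cv::morphologyEx(variation, variation, cv::MORPH_OPEN, kernel);

        int variatedPixels = cv::countNonZero(variation);           // 9. count white pixels

        if (variatedPixels > 0) {                  // 10. centre of the variation
            cv::Moments m = cv::moments(variation, /*binaryImage=*/true);
            double cx = m.m10 / m.m00, cy = m.m01 / m.m00;
            std::cout << variatedPixels << " px at (" << cx << ", " << cy << ")\n";
        }
    }
    return 0;
}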
My problem is that the loop starting at the second step should ideally repeat 90 times a second, but the CPU it runs on is quite weak (a Raspberry Pi), so I decided to benchmark the application once it bottlenecked.
I broke things up into four groups: steps 3, 4-6, 7-8 and 9. Here are some results in microseconds (the benchmarks are based on wall-clock system time, not CPU time, so they are not 100% precise):
Read camera:5101; Update buffer:15032; Calculate the variation:8149; Count non-zero:51665
Read camera:5446; Update buffer:16335; Calculate the variation:8365; Count non-zero:50005
Read camera:5394; Update buffer:15423; Calculate the variation:7163; Count non-zero:43006
Read camera:7527; Update buffer:20051; Calculate the variation:7919; Count non-zero:54895
Read camera:5492; Update buffer:16657; Calculate the variation:7757; Count non-zero:1034739
So it takes 5 to 7.5ms to read a frame, 15 to 20ms to apply some processing and update the buffer, 7 to 8.5ms to calculate the buffer variation, and then anywhere from 45ms to a second to count the amount of variation.
It spikes quite often in the last step, so that 1 second is not uncommon.
Why is it taking so long in the last step? It is a single line of code:
variatedPixels = countNonZero(variation);
With a best-case scenario of 72ms (27ms for the first steps + 45ms for the last), I'm nowhere close to being able to process 90 frames a second, and these are timings on an overclocked RPi 2; 90 FPS is definitely way too optimistic for the Pi.
The lowest I can accept is 30 FPS for the application to work, but in that case it can't drop a single frame. That means the code has to execute in less than 33ms.
Is there any way to reproduce that line in less than 6ms? It doesn't really seem to do that much compared to the remaining code; something just doesn't feel right. And why does it sometimes spike? Can it be due to a thread switch?
The ideas I have so far are:
Make the program multi-threaded. (It doesn't really need to answer in real time, it just can't drop frames; there's a 400ms window to display the results.)
Reduce the bit depth from 8 bits to 3 bits after thresholding (it can lead to wrong results and no performance benefit).
Since I'm new to C++ I would like to avoid complex solutions such as multi-threading.
EDIT:
Here is my code: https://gist.github.com/anonymous/90570c37f175fd2461b4
It's already cleaned up to be straight to the point.
I'm probably messing up with the pointers, but it works. Please tell me if something there is obviously wrong, since I'm new, and I hope the code is not that awful. :)
EDIT 2:
I fixed a little bug with the measurement while cleaning up the code: step 10 was always being executed, and its time was also mistakenly being counted under step 9.
It also seems that having 5-6 "imshows" being updated every second takes a lot of CPU on the Pi. (I had neglected that, since on the desktop displaying the debug frames wasn't even taking 1% CPU.)
Right now I think I'm at 25-35ms. I need a little more optimization to ensure it always works. So far the detection rate of my algorithm seems to be close to ~80%.

How to detect Text Area from image?

I want to detect the text area of an image as a preprocessing step for the Tesseract OCR engine. The engine works well when the input is text only, but when the input image contains non-text content it fails, so I want to detect only the text content in the image. Any idea of how to do that would be helpful. Thanks.
Take a look at this bounding box technique demonstrated with OpenCV code:
(The original answer showed three images: the input, the eroded intermediate, and the result with bounding boxes drawn.)
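As a hedged illustration of that kind of bounding-box technique (not the original answer's code; the kernel size, Otsu threshold, area filter, and file names are assumptions):

// Sketch of a bounding-box text-area technique: binarize, merge neighbouring
// characters into blobs with a wide morphological kernel, then take the
// bounding rectangles of the remaining contours as candidate text regions.
#include <opencv2/opencv.hpp>
#include <vector>

int main()
{
    cv::Mat img = cv::imread("page.png", cv::IMREAD_GRAYSCALE);
    if (img.empty()) return 1;

    cv::Mat bin;
    cv::threshold(img, bin, 0, 255, cv::THRESH_BINARY_INV | cv::THRESH_OTSU); // text -> white

    // A wide rectangular kernel joins characters on the same line into one blob.
    cv::Mat kernel = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(17, 3));
    cv::Mat merged;
    cv::morphologyEx(bin, merged, cv::MORPH_CLOSE, kernel);

    std::vector<std::vector<cv::Point>> contours;
    cv::findContours(merged, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);

    cv::Mat result;
    cv::cvtColor(img, result, cv::COLOR_GRAY2BGR);
    for (const auto& c : contours) {
        cv::Rect box = cv::boundingRect(c);
        if (box.area() > 500)                        // drop tiny specks
            cv::rectangle(result, box, cv::Scalar(0, 0, 255), 2);
    }
    cv::imwrite("boxes.png", result);
    return 0;
}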
Well, I'm not very experienced in image processing, but I hope I can help you with my theoretical approach.
In most cases, text forms parallel, horizontal rows, and the space between rows contains lots of background pixels. This can be used to solve the problem.
So... if you sum all the pixel columns of the image together, you'll get a one-pixel-wide image as output. When the input image contains text, the output will very likely show a periodic pattern, where dark areas are followed by brighter areas repeatedly. These "groups" of darker pixels indicate the position of the text content, while the brighter "groups" indicate the gaps between the individual rows.
You'll probably find that the brighter areas are much smaller than the others. Text is much more regular than most other picture elements, so it should be easy to separate.
You have to implement a procedure to detect these periodic recurrences. Once the script can determine that the input picture has these characteristics, there's a high chance that it contains text. (However, this approach can't distinguish between actual text and simple horizontal stripes...)
For the next step, you must find a way to determine the boundaries of the paragraphs using the method mentioned above. I'm thinking of a pretty simple algorithm, which would divide the input image into smaller, narrow stripes (50-100 px) and check these areas separately. Then it would compare the results to build a map of the possible areas filled with text. This method wouldn't be very accurate, but it probably wouldn't bother the OCR system.
And finally, you need to use the text-map to run the OCR on the desired locations only.
On the other hand, this method would fail if the input text is rotated by more than ~3-5 degrees. There's another drawback: if you have only a few rows, then the pattern search will be very unreliable. More rows, more accuracy...
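A minimal sketch of that projection-profile idea in OpenCV C++ (the Otsu binarization, the per-row pixel threshold, and the file name are assumptions):

// Sketch of the row-projection idea: collapse the binarized image to a single
// column with cv::reduce (one sum per row), then treat runs of rows with many
// text pixels as candidate text bands.
#include <opencv2/opencv.hpp>
#include <iostream>

int main()
{
    cv::Mat img = cv::imread("page.png", cv::IMREAD_GRAYSCALE);
    if (img.empty()) return 1;

    cv::Mat bin;
    cv::threshold(img, bin, 0, 255, cv::THRESH_BINARY_INV | cv::THRESH_OTSU); // text -> white

    // One value per row: dim=1 reduces the matrix to a single column.
    cv::Mat rowSum;
    cv::reduce(bin, rowSum, /*dim=*/1, cv::REDUCE_SUM, CV_32S);

    // Report each contiguous band of rows whose sum exceeds a placeholder threshold.
    const int minTextPixels = 5 * 255;             // at least ~5 text pixels in the row
    bool inBand = false;
    int bandStart = 0;
    for (int y = 0; y < rowSum.rows; ++y) {
        bool textRow = rowSum.at<int>(y, 0) > minTextPixels;
        if (textRow && !inBand) { inBand = true; bandStart = y; }
        if (!textRow && inBand) {
            inBand = false;
            std::cout << "text band: rows " << bandStart << " to " << y - 1 << "\n";
        }
    }
    if (inBand) std::cout << "text band: rows " << bandStart << " to " << rowSum.rows - 1 << "\n";
    return 0;
}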
I am new to stackoverflow.com, but I wrote an answer to a similar question which may be useful to readers who share this problem. Whether or not that question is actually a duplicate, since this one came first, I'll leave up to others; if I should copy and paste that answer here, let me know. I also found this question on Google before the one I answered, so a link may benefit more people, especially since the other answer covers different ways of finding text areas. When I looked up this question, it did not fit my problem case.
Detect text area in an image using python and opencv
At the time of writing, the best way to detect text is by using EAST (An Efficient and Accurate Scene Text Detector).
The EAST pipeline is capable of predicting words and lines of text at arbitrary orientations on 720p images, and furthermore, can run at 13 FPS, according to the authors.
The EAST quick-start tutorial can be found here.
The EAST paper can be found here.
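For reference, here is a rough sketch of running the frozen EAST model through OpenCV's dnn module. The model and image file names, input size, and score handling are assumptions; the two output layer names are the ones used by the standard frozen EAST graph, and decoding the geometry into rotated boxes is left to the tutorial mentioned above:

// Rough sketch of running the frozen EAST text detector with OpenCV's dnn
// module. It loads the graph, feeds one image, and retrieves the score and
// geometry maps; decoding those maps into rotated boxes (plus NMS) is omitted.
#include <opencv2/opencv.hpp>
#include <opencv2/dnn.hpp>
#include <vector>
#include <iostream>

int main()
{
    // Placeholder paths; the .pb file is the publicly available frozen EAST graph.
    cv::dnn::Net net = cv::dnn::readNet("frozen_east_text_detection.pb");
    cv::Mat frame = cv::imread("scene.jpg");
    if (frame.empty()) return 1;

    // EAST expects input dimensions that are multiples of 32.
    cv::Mat blob = cv::dnn::blobFromImage(frame, 1.0, cv::Size(320, 320),
                                          cv::Scalar(123.68, 116.78, 103.94),
                                          /*swapRB=*/true, /*crop=*/false);
    net.setInput(blob);

    // Output layers of the standard frozen EAST graph.
    std::vector<cv::String> outNames = {
        "feature_fusion/Conv_7/Sigmoid",   // per-location text/no-text scores
        "feature_fusion/concat_3"          // box geometry (distances + angle)
    };
    std::vector<cv::Mat> outs;
    net.forward(outs, outNames);

    std::cout << "score map size: " << outs[0].size[2] << "x" << outs[0].size[3] << "\n";
    // Next: threshold the scores, decode the geometry into rotated rectangles,
    // and run non-maximum suppression (see the tutorial linked above).
    return 0;
}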