I'm new to ARM and Linux in general, but I have Eclipse set up for programming my BeagleBone Black (running Angstrom). I want to process an image (a local file) and then use the processed image information to turn some LEDs on/off.
What's the best/most efficient way to process images with my setup? Should I write a script to process the image in MATLAB or a Linux equivalent? If so, how would I get the information from those programs into my C++ program? Or should I simply process the image in C++ (probably more difficult)?
This depends highly on what you mean by process. If you want to do something complicated, I would recommend OpenCV, since it offers a vast range of functionality you can use to process your images.
That being said, if by process you mean extract text from images, you could take a look at Tesseract, which is an open source OCR engine. If you do go for OCR, you could use OpenCV to do some pre-processing to make the text extraction easier and more successful.
If I am understanding you correctly, then you could take a look at this tutorial, which should do what you are after (you start with an image and end up with a pixellated version of it).
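Since the end goal is driving LEDs from the result, here is a minimal sketch of the OpenCV route in C++. The file name, the sysfs LED path, and the brightness threshold are all assumptions for illustration; on a BeagleBone Black the user LEDs are typically exposed under /sys/class/leds/:

    #include <opencv2/opencv.hpp>
    #include <fstream>
    #include <iostream>

    int main() {
        // Load the local file in grayscale (example path).
        cv::Mat img = cv::imread("photo.jpg", cv::IMREAD_GRAYSCALE);
        if (img.empty()) {
            std::cerr << "could not read image\n";
            return 1;
        }

        // Stand-in for your real processing: mean brightness of the image.
        double brightness = cv::mean(img)[0];

        // Assumed sysfs path for one of the on-board user LEDs.
        std::ofstream led("/sys/class/leds/beaglebone:green:usr0/brightness");
        led << (brightness > 127.0 ? 1 : 0);  // assumed on/off threshold
        return 0;
    }

This keeps everything in one C++ program, so there is no need to shuttle data between MATLAB (or a Linux equivalent) and your code.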
Related
I'm trying to write an application that records and saves the screen in C++ on the Windows platform. I'm not sure where to start with this. I assume I need some sort of API (FFmpeg, maybe OpenGL?). Could someone point me in the right direction?
You could start by looking at the Windows Remote Desktop Protocol; maybe some programming libraries are provided for it.
I know of a product that intercepts calls into the Windows GDI DLL and uses that to store the screen drawing activities.
A far simpler approach would be to take screenshots as often as possible and somehow minimize redundant data (parts of the screen that didn't change between frames).
If the desired output of your app is a video file (like MPEG), you are probably better off just grabbing frames and feeding them into a video encoder. I don't know how fast the encoders are these days. FFmpeg would be a good place to start.
If the encoder turns out not fast enough, you can try storing the frames and encoding the video file afterwards. Consecutive frames should have many matching pixels, so you could use that to reduce the amount of data stored.
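For the frame-grabbing part, a minimal Win32 GDI sketch (error handling omitted; BitBlt into a compatible bitmap is the classic approach):

    #include <windows.h>

    // Grab one frame of the primary screen into a GDI bitmap.
    // The caller owns the returned HBITMAP and must DeleteObject() it.
    HBITMAP CaptureScreen() {
        int w = GetSystemMetrics(SM_CXSCREEN);
        int h = GetSystemMetrics(SM_CYSCREEN);

        HDC screenDC = GetDC(NULL);                  // DC for the whole screen
        HDC memDC    = CreateCompatibleDC(screenDC); // off-screen DC
        HBITMAP bmp  = CreateCompatibleBitmap(screenDC, w, h);

        HGDIOBJ old = SelectObject(memDC, bmp);
        BitBlt(memDC, 0, 0, w, h, screenDC, 0, 0, SRCCOPY);
        SelectObject(memDC, old);

        DeleteDC(memDC);
        ReleaseDC(NULL, screenDC);
        return bmp;  // feed the pixels to your encoder of choice
    }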
What kind of debugging is available for image processing/computer vision/computer graphics applications in C++? What do you use to track errors/partial results of your method?
What I have found so far is just one tool for online and one for offline debugging:
bmd: attaches to a running process and enables you to view a block of memory as an image
imdebug: enables printf-style of debugging
Both are quite outdated and not really what I would expect.
What would seem useful for offline debugging would be some style of image logging, let's say a set of commands that let you write images together with text (probably in the form of HTML, maybe hierarchical), easy to switch off at both compile time and run time, and as unobtrusive as possible.
The output could look like this (output from our simple tool):
http://tsh.plankton.tk/htmldebug/d8egf100-RF-SVM-RBF_AC-LINEAR_DB.html
Are you aware of some code that goes in this direction?
I would be grateful for any hints.
Coming from a ray tracing perspective, maybe some of those visual methods are also useful to you (it is one of my plans to write a short paper about such techniques):
Surface Normal Visualization. Helps to find surface discontinuities. (no image handy, the look is very much reminiscent of normal maps)
color <- rgb (normal.x*0.5+0.5, normal.y*0.5+0.5, normal.z*0.5+0.5)
Distance Visualization. Helps to find surface discontinuities and errors in finding a nearest point. (image taken from an abandoned ray tracer of mine)
color <- (intersection.z-min)/range, ...
Bounding Volume Traversal Visualization. Helps to visualize a bounding volume hierarchy or other hierarchical structures, and to see the traversal hotspots, like a code profiler (e.g. for Kd-trees). (tbp of the http://ompf.org forum coined the term Kd-vision.)
color <- number_of_traversal_steps/f
Bounding Box Visualization (image from picogen or so, some years ago). Helps to verify the partitioning.
color <- const
Stereo. Maybe useful in your case for the real stereographic appearance. I must admit I never used this for debugging, but when I think about it, it could prove really useful when implementing new types of 3D primitives and trees (image from gladius, which was an attempt to unify realtime and non-realtime ray tracing).
You just render two images from slightly shifted positions, focused on some point.
Hit-or-not visualization. May help to find epsilon errors. (image taken from metatrace)
if (hit) color = const_a;
else color = const_b
Some hybrid of several techniques.
Linear interpolation: lerp(debug_a, debug_b)
Interlacing: if(y%2==0) debug_a else debug_b
Any combination of ideas, for example the color tone from Bounding Box Visualization, but with actual scene intersection and lighting applied. A small C++ sketch of two of these mappings follows.
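As a concrete illustration, a minimal sketch of the surface normal and hit-or-not mappings (the Vec3 and Hit types are hypothetical stand-ins for whatever your tracer uses):

    #include <cstdint>

    struct Vec3 { float x, y, z; };            // assumed vector type
    struct Hit  { bool found; Vec3 normal; };  // assumed intersection record
    struct Rgb  { std::uint8_t r, g, b; };

    // Surface normal visualization: map each component from [-1,1] to [0,255].
    Rgb debugNormal(const Hit& h) {
        return { std::uint8_t((h.normal.x * 0.5f + 0.5f) * 255),
                 std::uint8_t((h.normal.y * 0.5f + 0.5f) * 255),
                 std::uint8_t((h.normal.z * 0.5f + 0.5f) * 255) };
    }

    // Hit-or-not visualization: two flat colors make epsilon errors stand out.
    Rgb debugHit(const Hit& h) {
        return h.found ? Rgb{255, 255, 255} : Rgb{64, 0, 64};
    }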
You may find some more glitches and debugging imagery on http://phresnel.org , http://phresnel.deviantart.com , http://picogen.deviantart.com , and maybe http://greenhybrid.deviantart.com (an old account).
Generally, I prefer to dump the byte array of the currently processed image as raw RGB triplets and run ImageMagick to create a numbered PNG from it, e.g. img01.png. This way I can trace the algorithms very easily. ImageMagick is run from a function in the program using a system call. This makes it possible to debug without using any external libraries for image formats.
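A minimal sketch of that dump-and-convert trick (assuming ImageMagick's convert is on the PATH; -size, -depth and the rgb: input format are standard ImageMagick options):

    #include <cstdio>
    #include <cstdlib>
    #include <cstdint>
    #include <cstddef>

    // Dump a width*height buffer of packed RGB triplets and let
    // ImageMagick turn it into a numbered PNG for inspection.
    void dumpDebugImage(const std::uint8_t* rgb, int width, int height, int index) {
        std::FILE* f = std::fopen("frame.raw", "wb");
        std::fwrite(rgb, 3, static_cast<std::size_t>(width) * height, f);
        std::fclose(f);

        char cmd[256];
        std::snprintf(cmd, sizeof cmd,
                      "convert -size %dx%d -depth 8 rgb:frame.raw img%02d.png",
                      width, height, index);
        std::system(cmd);
    }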
Another option, if you are using Qt, is to work with QImage and call img.save("img01.png") from time to time, the way printf is used for debugging.
It's a bit primitive compared to what you are looking for, but I have done what you suggested in your OP using standard logging and by writing image files. Typically, the logging and signal export processes and staging live in unit tests.
Signals are given identifiers (often the input filename), which may be augmented (often with the process name or stage).
For development of processors, it's quite handy.
Adding HTML for messages would be simple. In that context, you could produce viewable HTML output easily - you would not need to generate any HTML, just use HTML template files and then insert the messages.
I would just do it myself (as I've done multiple times already for multiple signal types) if you get no good referrals.
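For what it's worth, a tiny sketch of such an HTML image log (the ImageLog class and the file layout are entirely hypothetical, just to show how little code this takes; images are assumed to be written to disk by whatever saver you already have):

    #include <fstream>
    #include <string>

    // Appends messages and <img> references to a single HTML page,
    // so a browser becomes the offline debugging viewer.
    class ImageLog {
    public:
        explicit ImageLog(const std::string& path) : out_(path) {
            out_ << "<html><body>\n";
        }
        ~ImageLog() { out_ << "</body></html>\n"; }

        void text(const std::string& msg) {
            out_ << "<p>" << msg << "</p>\n";
        }
        void image(const std::string& file, const std::string& caption) {
            out_ << "<div><img src=\"" << file << "\"><br>" << caption << "</div>\n";
        }

    private:
        std::ofstream out_;
    };

    // Usage: ImageLog log("debug.html");
    //        log.text("after thresholding");
    //        log.image("img01.png", "stage 1");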
In Qt Creator you can watch image modifications while stepping through the code in the normal C++ debugger; see e.g. http://labs.qt.nokia.com/2010/04/22/peek-and-poke-vol-3/
What should I use to perform screen capture on Windows for subsequent image processing?
I seek to do follow-up image processing in OpenCV.
Well, the most straightforward thing to do is to use an off-the-shelf video capture tool to create an AVI file and then have your image processing software operate on it after the fact.
To get up and running:
CamStudio is free and open source and has a simple GUI.
VirtualDub is also FOSS and is more powerful, but less intuitive to use. It's primarily a video editing and processing tool, but it actually has sophisticated capture capabilities.
Both work on Windows and both can output uncompressed AVI files that OpenCV can read.
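Once you have the AVI, a minimal sketch of reading it back in OpenCV for per-frame processing (the file name is an example; cv::VideoCapture is the standard OpenCV C++ interface):

    #include <opencv2/opencv.hpp>

    int main() {
        cv::VideoCapture cap("capture.avi");  // uncompressed AVI from CamStudio/VirtualDub
        if (!cap.isOpened()) return 1;

        cv::Mat frame, gray;
        while (cap.read(frame)) {
            // Per-frame processing goes here, e.g. convert to grayscale.
            cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);
        }
        return 0;
    }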
If you are completely new to OpenCV, then I recommend O'Reilly's "Learning OpenCV". It's for the older OpenCV 1.1, but it will at least get you started.
If you crack open that book and find that it's way above your head, then I would consider trying to do your image processing in a higher-level language. MATLAB with the Image Processing Toolbox is well suited for rapid prototyping of image processing, and it's a much more forgiving development environment. It's an interpreted language, so you can see-as-you-code.
Based on the question as stated, this is as much info as I can provide. Perhaps consider providing more details about your specific application requirements?
I am creating a photo editor app in webOS as a hybrid app. I am new to C++.
I don't want to display the image on the screen using C++, because on the front end I am using JavaScript as the UI, since the JavaScript UI is better than the PDK. But on the back end I have to use C++ just to process the image and save it to a file. I can't save it using JavaScript because webOS doesn't support the canvas.toDataURL() method.
So I have to pick an image file from a relative path in the local directory, get its RGB values, process the RGB values, and then save the image back to the directory, saving it as new and replacing the previous one.
OK, now I want assistance from you developers. Is all of this possible using the SDL library? Also, can I crop an image in C++, given the x, y coordinates of all of its edges to be cropped from?
I don't know the SDL library well (I suppose it can load and save images), but for loading and saving images you can use the OpenIL/DevIL library; it is quite simple and supports many formats. You could also take a look at OpenCV, but that could be a bit heavyweight for your purpose. To your second question: you can do everything with the image once it's loaded, just program it. With the right libraries and programmer, C++ can do nearly everything. Sorry for the blunt sentence, but you asked if you can do that in C++, and the answer is nearly always yes.
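A minimal DevIL sketch of the load-process-save round trip (the per-pixel loop just inverts the image as a stand-in for your real processing; the file name is an example):

    #include <IL/il.h>

    int main() {
        ilInit();
        ilEnable(IL_FILE_OVERWRITE);  // allow replacing the previous file

        ILuint img;
        ilGenImages(1, &img);
        ilBindImage(img);
        if (!ilLoadImage("photo.png")) return 1;

        // Force a known layout, then touch the raw RGB bytes directly.
        ilConvertImage(IL_RGB, IL_UNSIGNED_BYTE);
        ILubyte* data = ilGetData();
        ILint size = ilGetInteger(IL_IMAGE_WIDTH) * ilGetInteger(IL_IMAGE_HEIGHT) * 3;
        for (ILint i = 0; i < size; ++i)
            data[i] = 255 - data[i];  // stand-in processing: invert colors

        ilSaveImage("photo.png");     // write back, overwriting the original
        ilDeleteImages(1, &img);
        return 0;
    }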
Does anyone know of a C++ library for taking an image and performing image recognition on it such that it can find letters based on a given font and/or font height? Even one that doesn't let you select a font would be nice (e.g. readLetters(Image image)).
I've been looking into this a lot lately. Your best bet is simply Tesseract. If you need layout analysis on top of the OCR, then go with Ocropus (which in turn uses Tesseract to do the OCR). Layout analysis refers to being able to detect the position of text in the image and do things like line segmentation, block segmentation, etc.
I've found some really good tips through experimentation with Tesseract that are worth sharing. Basically I had to do a lot of preprocessing for the image.
Upsize/downsize your input image to 300 DPI.
Remove color from the image. Grayscale is good. I actually used a dither threshold and made my input black and white.
Cut out unnecessary junk from your image.
For all three steps above I used netpbm (a set of image manipulation tools for Unix) to get to the point where I was getting pretty much 100 percent accuracy for what I needed.
If you have a highly customized font and go with Tesseract alone, you have to "train" the system - basically, you have to feed it a bunch of training data. This is well documented on the tesseract-ocr site. You essentially create a new "language" for your font and pass it in with the -l parameter.
The other training mechanism I found was with Ocropus, using neural net (bpnet) training. It requires a lot of input data to build a good statistical model.
In terms of invoking them, Tesseract and Ocropus are both C++. It won't be as simple as ReadLines(Image), but there is an API you can check out. You can also invoke them via the command line.
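For reference, a minimal sketch of the Tesseract C++ API (this follows the tesseract::TessBaseAPI interface; Leptonica's pixRead is used for loading, and "eng" is the stock English language pack):

    #include <tesseract/baseapi.h>
    #include <leptonica/allheaders.h>
    #include <cstdio>

    int main() {
        tesseract::TessBaseAPI api;
        // NULL datapath = use the default tessdata location; "eng" selects the language.
        if (api.Init(NULL, "eng")) return 1;

        Pix* image = pixRead("scan.tif");   // example input file
        api.SetImage(image);

        char* text = api.GetUTF8Text();
        std::printf("%s", text);

        delete[] text;
        pixDestroy(&image);
        api.End();
        return 0;
    }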
While I cannot recommend one in particular, the term you are looking for is OCR (Optical Character Recognition).
There is tesseract-ocr, which is a professional library for doing exactly this.
From their web site:
The Tesseract OCR engine was one of the top 3 engines in the 1995 UNLV Accuracy test. Between 1995 and 2006 it had little work done on it, but it is probably one of the most accurate open source OCR engines available
I think what you want is Conjecture. It used to be the libgocr project. I haven't used it for a few years, but it used to be very reliable if you set up a key.
The Tesseract OCR library gives pretty accurate results; it's a C and C++ library.
My initial results were around 80% accurate, but after applying pre-processing on the images before supplying them for OCR, the results were around 95% accurate.
What the pre-processing is:
1) Binarize the bitmap (B&W worked better for me).
2) Resample your image to 300 DPI.
3) Save your image in a lossless format, such as LZW TIFF or CCITT Group 4 TIFF.
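A minimal sketch of those three steps using OpenCV (one of many ways to do it; the Otsu threshold is my choice for binarization, the 300/72 scale factor assumes a 72 DPI source, and cv::imwrite picks TIFF from the file extension):

    #include <opencv2/opencv.hpp>

    int main() {
        // Step 1: load as grayscale and binarize (Otsu picks the threshold).
        cv::Mat gray = cv::imread("input.jpg", cv::IMREAD_GRAYSCALE);
        if (gray.empty()) return 1;
        cv::Mat bw;
        cv::threshold(gray, bw, 0, 255, cv::THRESH_BINARY | cv::THRESH_OTSU);

        // Step 2: resample; the factor depends on your source DPI.
        cv::Mat resized;
        cv::resize(bw, resized, cv::Size(), 300.0 / 72.0, 300.0 / 72.0, cv::INTER_CUBIC);

        // Step 3: save in a lossless format; TIFF is chosen by the extension.
        cv::imwrite("ocr_input.tif", resized);
        return 0;
    }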