I am planning to do a project on Image Processing, my knowledge in this subject in general is low. My Preferred Language is C++.
Can the members out here give me:
A Brief idea of What Image Processing is?
What books should i consult [ please keep in mind i am a beginner and am ONLY interested in making a college Project ]
What libraries can i use? [ I know about Boost/OpenCV etc. I would like to know what simplest and can get my project done quickly - its a minor project ]
Apart from the above 3 points, anything which i should know if be told to me will be of good help. Thanks in Advance.
I would suggest reading a good book. Image processing is not a programming field - it's an engineering field and it involves mathematics and signal processing knowledge and intuition. The Gonzalez and Woods Image Processing is quite good and doesn't require a vast knowledge of signal processing before you start reading it. The bottom line is you don't learn image processing like you learn a new programming language; you learn it like a completely new subject that just happens to involve coding. To break this up into answers to your questions,
Image processing is a discipline of digital signal processing which is itself at an intersection of computer science and applied mathematics. It involves pixel-based image operations for purposes of image enhancement (color and contrast correction, denoising, deblurring), visual effects (spatial distortion, morphing, color-substitution), artificial vision (feature extraction, texture segmentation, pattern identification, spatial perception). There are also many narrowly applied areas of image processing such as RADAR image processing, medical image processing, etc.
The book I mentioned above is really a great read. If it's a bit pricey for you, I always find it useful going to Amazon and searching for an inexpensive older edition used book on the subject with a five star rating. Never failed me yet. Beware of getting books that are too old though.
There are plenty of libraries for the task, Boost/CImg are some of them, and it really depends on the platform you're coding for. However, I would think that an image processing project would not involve any libraries, instead you would be writing image processing filters and other operators yourself -- that's the essence of it. You would very likely use algorithm libraries though for faster computation. A project in image processing is not a software project; rather, it's an engineering project and using a library would kill the purpose completely. That is in my humble opinion, of course.
Answer to 3.: CImg might be a good choice to start quickly.
Modifying the image data in such a way to get the desired effect (for example, change a colour image into black and white image).
Very broad question, and the answer depends on what you want to do.
Take a look into GraphickMagick or ImageMagick.
Image processing is a lot about math, and is particular matrix manipulations and in more advanced processing, Fourrier transformation.
image processing is at its basic definition, image manipulation, whatever the manipulations are (either color manipulation, feature extractions, enhancements, ... ). Image processing is different than computer graphics (2d and 3d)
I would assume visit your local college library, they should have existing reference for image processing, algorithms and all that jazz. You have to decide (with your college professor/adviser) what part of image processing you want to explore.
Have a look at the ImageMagick libraries (among others), it offer a good package to start learning about image processing; source code is available).
Max.
Altough old, I trink Digital Image Processing by K. Pratt is a good choice to start (to get a gist of common techniques), but imho you shouldn't learn with C++; a high level language with good image processing toolbox (like MATLAB) is far better to try algorithms (whic sometimes need heavy use of complex numerical methods).
Related
I want to do image processing using OpenCV and C++. When I am capturing an image in a dark environment it seems to be hard to do people detection. Changing brightness and contrast may help the situation. But my project is related with computer vision. So i want my program to identify weather there is a need of adding or reducing brightness and contrast, But how to identify that? I have no idea, Please help
Good solution: Use illumination so your scene is not dark.
If this is not possible you can increase exposure time and/or gain. Both methods degrade your SNR. Especially with moving people motion blur will become a problem if your exposure time is too high.
Do not just increase image brightness or contrast by software. It makes no difference for your computer, only for you.
Read something about auto exposure algorithms. A well exposed image is neither under nor over exposed. It's histogram should be as broad as possible.
I believe, You can try "histogram equalization" .
Here is an example image that i have used for experiment.
Example
Source code in C++ language
Please let me know if you need any more information regarding this topic.
I think you should consider using an infrared camera. See this article here for example: "Selection of a Visible-Light vs. Thermal Infrared Sensor in
Dynamic Environments Based on Confidence Measures", authors: Cuerda and coworkoers.
I am working on a project that requires me to detect and track a human in a live video from a webcam connected to a Beagleboard xm.
I have completed this task using Opencv in pixel domain. The results on the board are very accurate but extremely slow. Many people have suggested me to leave pixel domain and do the same task in an h.264/MPEG-4 compressed video as it would extremely reduce the computational overhead.
I have read many research papers but failed to discover any software platform or a library that I can use to analyze and process h.264 compressed videos.
I will be thankful if someone can suggest me some library for h.264 compressed video analysis and guide me further.
Thanks and Regards.
I'm not sure how practical this really is (I've never tried to do it), but my guess would be that what they're referring to would be looking for a block of macro-blocks that all have (nearly) identical motion vectors.
For example, let's assume you have a camera that's not panning, and the picture shows a car driving across the screen. Looking at the motion vectors, you should have a (roughly) car-shaped bunch of macro-blocks that all have similar motion vectors (denoting the motion of the car). Then, rather than look at the entire picture for your object of interest, you can look at that block in isolation and try to identify it. Likewise, if the camera was panning with the car, you'd have a car-shaped block with small motion vectors, and most of the background would have similar motion vectors in the opposite direction of the car's movement.
Note, however, that this is likely to be imprecise at best. Just for example, let's assume our mythical car as driving in front of a brick building, with its headlights illuminating some of the bricks. In this case, a brick in one picture might (easily) not point back at the same brick in the previous picture, but instead point at the brick in the previous picture that happened to be illuminated about the same. The bricks are enough alike that the closest match will depend more on illumination than the brick itself.
You may be able, eventually, to parse and determine that h.264 has an object, but this will not be "object tracking" like your looking for. openCV is excellent software and what it does best. Have you considered scaling the video down to a smaller resolution for easier analysis by openCV?
I think you are highly over estimating the computing power of this $45 computer. Object recognition and tracking is VERY hard computationally speaking. I would start by seeing how many frames per second your board can track and optimize from there. Start looking at where your bottlenecks are, you may be better off processing raw video instead of having to decode h.264 video first. Again, RAW video takes a LOT of RAM, and processing through that takes a LOT of CPU.
Minimize overhead from decoding video, minimize RAM overhead by scaling down the video before analysis, but in the end, your asking a LOT from a 1ghz, 32bit ARM processor.
FFMPEG is a very old library that is not being supported now a days. It has very limited capabilities in terms of processing and object tracking in h.264 compressed video. Most of the commands usually are outdated.
The best thing would be to study h.264 thoroughly and then try to implement your own API in some language like Java or c#.
I raised this question due to curiousity while using Google Goggle and Google's "Search by Image".
If you try giving Google an image to search, it can show you some results. Identical images work best (of course), but taken photo of various objects could be difficult.
I guess Google Goggle has workaround a bit by using text recognition and image matching recognition. If text recognition found the text, for instance, "SONY", then things might get simpler. If a brand's image is detected, then things should be simpler as well. The same goes with other famous brand and famous landmark, such as an Eiffel Tower. Having text and brand's image could help recognize things easily.
But if we are to search for something more obscure (need a better wording here), for instance, take this ramen image.
If you put this image into Google, you will get images of various other images that have similar colors and sometimes similar shape. Heck, there are other ramen images in the result, but I think it would be better if these ramen images are up in the top, since we input a ramen image, and our context here is ramen.
So here is my question, will it be possible to create such a software that can understand the context of the image? How can we express the context in the software?
Man, you just pointet out the very reason why so much people work on computer vision.
Is is quite easy to mathematically describe objects. Color, shape, density, . . .
All those can be calculated easily.
But computer vision becomes very complex when talking about "real life objects".
Angle, luminosity, and simply non consistency make it really almost impossible to detect an object accurately.
When working on computer vision, you should always ask yourself : what makes the object I want to recognize unique ?
What descriptor can I use that no other object possess ?
Ask yourself the question for theses ramen. Let's say I simply want to detect ramens.
What if the color of the soup changes? What if the meat is bigger ?
If you want to know more, you should read about pattern recognition and pattern matching.
And if you can find the solution to this kind of problems in a generic way, you can register for the nobel price I think :)
Some things are quite well known nowadays, like face recognition or OCR; but they are often quite specialized and apply to only one domain.
Think about it, even Google's image search algorithm sucks when you feed it with ramen.
It is pretty efficient with sudoku though, as he knows exactly what he is searching for.
All the difference is made in training, where you give a list of assumptions to help the algorithm.
So basically you got it. either you create a really nice computer vision system good at detecting one thing based on a lot of assumptions, or an "ok" but quite generic one :).
The choice mostly depends on your application
I am a newbie and i would like to give artistic effects to the images such as photocopy , blur ,glow edge, mosaic bubble ,pencil sketch ,marker , glass effect,paint brush,glow diffused effect programatically?
I want to implement it and any input is appreciated.
The following link depicts what i want to do exactly...
http://picasaweb.google.com/ashish.j.soni/BloggerPictures?authkey=Gv1sRgCMPHzP2P_fjRlgE#5413931800038472178
I believe that these effects are implemented in the GIMP.
The source code is available online http://www.gimp.org/source/ This may help you see how such algorithms are implemented.
You should check OpenCV, it will make your life easier. It is aimed at real-time computer vision, Not really what you are after, but it has a lot of useful functionality (eg. opening/saving different image formats) and standard image operations.
Matlab/octave is also a good tool to get feeling with image processing algorithms.
We are developing some stress and strain analysis software at university. Now it's time to move from rectangles and boxes and spheres to some real models. But I still have little idea where to start.
In our software we are going to build mesh and then make calculations, but how do I import solid bodies from CAD/CAE software?
1) How CAD/CAE models are organised? How solid bodies are represented? What are the possibilities of DWG, DXF, IGES, STEP formats? There is e.g. a complete DXF reference, but it's too difficult for me to understand without knowing basic concepts.
2) Are there C++ libraries to import solid bodies from CAD/CAE file formats? Won't it be too difficult to build a complete model to be able to import comprehensive file?
To import solid bodies you first need to export them from the CAD system. Most CAD system datafiles are propriety (unless they've all moved over to XML in the few years I've been out of the industry!). DWG is Autodesk's file format and they don't (well didn't) encourage people to read it directly. They did offer a file reading/writing library if memory serves, but I don't know what the state of that is now. DXF, IGES and STEP are all data transfer formats.
DXF is owned by Autodesk but is published so other companies can use it to read and write models. The DXF reference is complicated, but is just a reference - you need to know the concepts before you can understand what it represents.
Solid models can be represented in a number of ways, either by Constructional Solid Geometry (CSG) where the shape is made up from the addition or subtraction of solid primitives from each other, or by Boundary Representation (B-Rep) where the edges are stored, or by triangulated faces (as used by 3D Studio MAX, WPF and many others) and so on. The particular format will depend on what the modeller is designed to do.
There are libraries and tools for reading the various file formats. I don't know which ones are still active as it's 5+ years since I was heavily involved in 3D graphics. You'd be better off searching for the current crop yourself. I'd recommend starting with Wikipedia - it will have some articles on 3D graphics and there should be plenty of links to further reading and tools/libraries.
Once you have a reader you'll need to convert the data to your internal format - not a trivial task. You might be better off adopting an existing format. One of my jobs was the reading of models from various sources into my company's data structure. My task was greatly helped by the fact that the modellers we supported came with API's that let us read the model meshes directly and from there it was a relatively straightforward (but never easy) task to convert their mesh into ours. There were always edge cases and nuances of the format that caused headaches. These were multiplied several times over if we had to read the file format ourselves - such as for DXF or VRML.
The most common way solid models are represented in current 3D CAD software (CATIA, Pro/Engineer/Solidworks/NX) is through Boundary Representation (B-REP).
However, most of libraries to import such CAD data are proprietary. Some libraries come directly from geometric modelers (such as ACIS with Interop, Parasolid, or Granite), other are from small software companies specialized on the CAD data translation market.
On the open source side, maybe have look at the OpenCascade kernel. This kernel have been open sourced (mostly), and it have some STEP import and meshing functionalities.
Your best bet is to work with an existing open source CAD system, such as BRL-CAD, that includes support for numerous importers and exporters.
Your intuition that learning a given format would be difficult to understand and implement support for is quite true, particularly when dealing with solid geometry formats intended for analysis purposes. Preserving solidity with topological guarantees is important for producing valid analyses, but rarely addressed by simple mesh formats.
In particular for the two prevalent international standards (IGES and STEP), they are excessively complex to support as they can contain the same solid geometry encoded in numerous ways. Consider a simple sphere example. That sphere could be encoded as a simple point and radius (with no explicit surface information, an implicit form common with CSG usage), it could be a polygonal mesh (lossy BREP facets mesh format), it could be a spline surface (BREP NURBS), it could be volumetric (think CT scan data), and more. Focusing on any one of those involves various tradeoffs (simplicity, solidity, analytic guarantees, flexibility, etc).
As mentioned regarding BRL-CAD, it's a large open source solid modeling system that has a lot of functionality in many areas you could leverage, about a dozen libraries of functionality and more than 400 succinct tools (two dozen or so being geometry converters). Even if it doesn't do exactly what you need, you have the source code and can contribute improvements back and collaborate with an existing community to help implement what you need.
Upon re-reading your question, let me completely change my answer. If you all need is meshes, then just use a simple mesh-based format.
OBJ is simple, good, and very standard. Conversion from many CAD formats to OBJ requires a tessellator/mesher, which you don't want to be writing anyway, just get a seat of a CAD package to do the translation. Moi or Rhino are low-cost, and support many formats.
I regularly work with a piece of commercial software for electromagnetic simulations that uses the ACIS modeling kernel and components from Simmetrix. While I can't personally attest to the ease of using those libraries, they do seem to work as advertised and could save you a lot of work. They may not be available on suitable terms for academic use, but they do seem to be designed to do exactly what you want.
As for i know, all of CAD/CAE softwarea support IGES, STEP etc file formats for geometry and ideas, anysis etc for mesh data. Most of time, we find that iges does not contain topological information. But the development of STEP(Standard for the Exchange of Product) started in 1984 as a successor of IGES.The initial plan was that "STEP shall be based on one single, complete, implementation-independent Product Information Model, which shall be the Master Record of the integrated topical and application information models". We have some libraries to read and write these file format. But as i wrote code to read and write geometries as well as mesh, reading or writing these file formats is not difficult, but alot boring.