I am facing an application that requires estimating the size (length, width and height) of cars in surveillance videos. Where should I start learning? And what baseline accuracy (1%? 10%?) can the state of the art achieve?
This is not a light subject to pick up, but if you are inclined, here is an excellent book to get you started: Multiple View Geometry in Computer Vision.
A very recent review of automated traffic analysis is this one. A slightly older one is here.
In reality, this is a hard problem. Measuring the dimensions of the vehicles in question will at the very least require some form of camera calibration.
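If you do calibrate against the road plane, one simple (and fairly crude) way to get planar dimensions such as length and width is a ground-plane homography. Below is a minimal OpenCV sketch of that idea; the reference points, pixel coordinates and lane geometry are made-up placeholders, and height estimation would need more than this.

    // Minimal sketch (not a complete solution): once the camera is calibrated
    // against the road plane, pixel coordinates on that plane can be mapped to
    // metric ground coordinates with a homography. All point values below are
    // hypothetical placeholders.
    #include <opencv2/opencv.hpp>
    #include <cmath>
    #include <iostream>

    int main() {
        // Four image points of known ground locations (e.g. lane markings), in pixels.
        std::vector<cv::Point2f> imagePts = { {412, 560}, {980, 548}, {300, 710}, {1100, 695} };
        // The same four points in metres, measured on the road plane.
        std::vector<cv::Point2f> worldPts = { {0, 0}, {3.5f, 0}, {0, 10}, {3.5f, 10} };

        cv::Mat H = cv::findHomography(imagePts, worldPts);

        // Image points where the front and rear of a vehicle touch the road.
        std::vector<cv::Point2f> vehiclePx = { {520, 600}, {760, 640} }, vehicleGround;
        cv::perspectiveTransform(vehiclePx, vehicleGround, H);

        double length = std::hypot(vehicleGround[0].x - vehicleGround[1].x,
                                   vehicleGround[0].y - vehicleGround[1].y);
        std::cout << "Estimated vehicle length: " << length << " m\n";
        return 0;
    }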
I want to do image processing using OpenCV and C++. When I capture an image in a dark environment, people detection becomes difficult. Changing brightness and contrast may help, but my project is about computer vision, so I want my program to identify whether brightness and contrast need to be increased or reduced. How can I identify that? I have no idea, please help.
Good solution: Use illumination so your scene is not dark.
If this is not possible you can increase exposure time and/or gain. Both methods degrade your SNR. Especially with moving people, motion blur will become a problem if your exposure time is too long.
Do not just increase image brightness or contrast by software. It makes no difference for your computer, only for you.
Read up on auto-exposure algorithms. A well-exposed image is neither under- nor over-exposed; its histogram should be as broad as possible.
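As a rough illustration of that idea, here is a small OpenCV sketch that inspects the grey-level histogram and flags a frame as under- or over-exposed; the thresholds are arbitrary placeholders you would tune for your camera.

    // Look at the grey-level histogram and at how many pixels are clipped at the
    // extremes to judge whether a frame is under- or over-exposed.
    #include <opencv2/opencv.hpp>
    #include <iostream>

    int main() {
        cv::Mat gray = cv::imread("frame.png", cv::IMREAD_GRAYSCALE);
        if (gray.empty()) return 1;

        // 256-bin histogram of the grey image.
        int histSize = 256;
        int channels[] = {0};
        float range[] = {0, 256};
        const float* ranges[] = {range};
        cv::Mat hist;
        cv::calcHist(&gray, 1, channels, cv::Mat(), hist, 1, &histSize, ranges);

        double total  = gray.total();
        double dark   = cv::sum(hist.rowRange(0, 16))[0]   / total;  // fraction near black
        double bright = cv::sum(hist.rowRange(240, 256))[0] / total; // fraction near white
        double mean   = cv::mean(gray)[0];

        if (mean < 60 || dark > 0.5)         std::cout << "under-exposed: raise exposure/gain\n";
        else if (mean > 190 || bright > 0.5) std::cout << "over-exposed: lower exposure/gain\n";
        else                                 std::cout << "exposure looks reasonable\n";
        return 0;
    }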
I believe you can try "histogram equalization".
Here is an example image that I have used for the experiment.
Example
Source code in C++ language
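A minimal sketch of histogram equalization in OpenCV looks roughly like this; the CLAHE variant is included since it often behaves better on unevenly lit scenes (file names are placeholders).

    #include <opencv2/opencv.hpp>

    int main() {
        cv::Mat gray = cv::imread("dark_scene.png", cv::IMREAD_GRAYSCALE);
        if (gray.empty()) return 1;

        // Global histogram equalization.
        cv::Mat equalized;
        cv::equalizeHist(gray, equalized);

        // Contrast Limited Adaptive Histogram Equalization (CLAHE).
        cv::Ptr<cv::CLAHE> clahe = cv::createCLAHE(2.0, cv::Size(8, 8));
        cv::Mat claheResult;
        clahe->apply(gray, claheResult);

        cv::imwrite("equalized.png", equalized);
        cv::imwrite("clahe.png", claheResult);
        return 0;
    }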
Please let me know if you need any more information regarding this topic.
I think you should consider using an infrared camera. See, for example, this article: "Selection of a Visible-Light vs. Thermal Infrared Sensor in Dynamic Environments Based on Confidence Measures" by Cuerda and coworkers.
Is there any open-source code which will take a video recorded indoors (from a smartphone, for example, of a home or office building's hallways) and superimpose the path traveled on a 2D picture? This could be a hand-drawn picture or a photo of a floor layout.
At first I thought of doing this using the accelerometer and compass sensors, but perhaps better accuracy is possible with a visual odometry approach. I only need 0.5 to 1 meter accuracy. The phone will also collect important information indoors (no GPS) for superimposing on the path traveled (this is the real application of the project and we know how to do that part). Post-processing of the video can be done later on a standalone computer, so speed and CPU power are not an issue.
Challenges -
The user will simply hand-carry the smartphone, so the camera is moving (walking) and not fixed
the video rate is limited to keep the file size small (5 frames/sec? is that OK?); typically a full hour of video is needed
Will using inputs from the phone sensors help the visual approach?
Any help or guidance is appreciated. Thanks!
I have worked in the area for quite some time. There are three points which I'd care to make.
Vision only is hard
Vision-based navigation using just a cellphone camera is very difficult. Most of the literature with great results reports ~1% of distance traveled as state of the art, but usually uses stereo cameras. Stereo helps a great deal, particularly in indoor environments, for coping with scale drift. I've worked on a system which achieves 0.5% of distance traveled for stereo but only roughly 5% for monocular. While I can't share code, much of our system was inspired by this Sibley and Mei paper.
Stereo code in our case ran at full 60fps on a desktop. Provided you can push data fast enough, it'll be fine. With your error envelope, you can only navigate for 100m or so. Is that enough?
Multi-sensor is the way to go, though the other sensors are worse than vision by themselves.
I've heard of some good work with accelerometers mounted on the foot to do ZUPT (zero-velocity updates) when the foot is briefly motionless on the ground during a step, in order to zero out drift. This approach has the clear drawback of needing to mount the device on your foot, making a vision approach largely useless.
The compass is interesting but will be disturbed by the ton of metal within an office building. Moving a few feet around a large metal cabinet might cause a 50+ degree jump in heading.
Ultimately, a combination of sensors is likely to be the best if you can make that work.
Can you solve a simpler problem?
How much control do you have over your environment? Can you slap down fiducial markers? Can you do Wi-Fi triangulation? Does it need to be an initial exploration? If you can go through the environment beforehand and produce visual bubbles (akin to Google Street View) to match against, you'll be much more accurate.
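If fiducial markers are an option, OpenCV's ArUco module (from opencv_contrib; the API moved and changed slightly in newer OpenCV releases) gives you absolute position fixes wherever a marker is visible. A hedged sketch with placeholder intrinsics and marker size:

    #include <opencv2/opencv.hpp>
    #include <opencv2/aruco.hpp>

    int main() {
        cv::Mat frame = cv::imread("hallway_frame.png");
        if (frame.empty()) return 1;

        cv::Ptr<cv::aruco::Dictionary> dict =
            cv::aruco::getPredefinedDictionary(cv::aruco::DICT_4X4_50);

        std::vector<int> ids;
        std::vector<std::vector<cv::Point2f>> corners;
        cv::aruco::detectMarkers(frame, dict, corners, ids);

        // Placeholder intrinsics from a prior calibration.
        cv::Mat K = (cv::Mat_<double>(3, 3) << 800, 0, 640, 0, 800, 360, 0, 0, 1);
        cv::Mat dist = cv::Mat::zeros(5, 1, CV_64F);

        // 10 cm markers; pose of each detected marker relative to the camera.
        std::vector<cv::Vec3d> rvecs, tvecs;
        if (!ids.empty())
            cv::aruco::estimatePoseSingleMarkers(corners, 0.10f, K, dist, rvecs, tvecs);

        // tvecs gives camera-to-marker translations, which can anchor the
        // odometry track to known positions on the floor plan.
        return 0;
    }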
I'm not aware of any software that does this directly (though it might exist) but stuff similar to what you want to do has been done. A few pointers:
Google for "Vision based robot localization" the problem you state is very similar to the problem robots with a camera have when they enter a new environment. In this field the approach is usually to have the robot map its environment and then use the model for later reference, but the techniques are similar to what you'll need.
Optical flow will roughly tell you in what direction the camera is moving, but it won't tell you the speed because you have no objective reference. This is because you don't know if the things you see moving in the video feed are 1cm away and very small or 1 mile away and very big.
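For a feel of what optical flow gives you, here is a small OpenCV sketch using sparse Lucas-Kanade tracking between two consecutive frames; it yields a mean pixel displacement (a heading cue) but, as said above, no absolute scale. File names are placeholders.

    #include <opencv2/opencv.hpp>
    #include <iostream>

    int main() {
        cv::Mat prev = cv::imread("frame_000.png", cv::IMREAD_GRAYSCALE);
        cv::Mat next = cv::imread("frame_001.png", cv::IMREAD_GRAYSCALE);
        if (prev.empty() || next.empty()) return 1;

        // Pick strong corners in the first frame and track them into the second.
        std::vector<cv::Point2f> p0, p1;
        cv::goodFeaturesToTrack(prev, p0, 500, 0.01, 8);

        std::vector<uchar> status;
        std::vector<float> err;
        cv::calcOpticalFlowPyrLK(prev, next, p0, p1, status, err);

        // Average displacement of successfully tracked points: a crude heading cue.
        cv::Point2f mean(0, 0);
        int n = 0;
        for (size_t i = 0; i < p0.size(); ++i)
            if (status[i]) { mean += p1[i] - p0[i]; ++n; }
        if (n > 0) mean *= 1.0f / n;
        std::cout << "mean flow (px): " << mean.x << ", " << mean.y << "\n";
        return 0;
    }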
If you know the camera matrix of the camera recording the images you could try partial 3D scene reconstruction techniques to take a stab at the speed. Note that you can do the 3D scene stuff without the camera matrix (this is the "uncalibrated" part you see in the title of a lot of the google results), the camera matrix will let you add real world object sizes (and hence distances) to your reconstruction.
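In the calibrated case, the usual OpenCV route (available since OpenCV 3) is to estimate an essential matrix from point matches and decompose it into a relative pose; note the translation comes out only up to scale, which is exactly the limitation described above. A sketch with placeholder intrinsics and match lists you would fill in yourself:

    #include <opencv2/opencv.hpp>

    int main() {
        // Matched feature points in frame 1 and frame 2 (e.g. from optical flow).
        std::vector<cv::Point2f> pts1 /* = ... */, pts2 /* = ... */;
        if (pts1.size() < 5 || pts1.size() != pts2.size()) return 0; // fill with real matches first

        // Camera matrix from a prior calibration (placeholder values).
        cv::Mat K = (cv::Mat_<double>(3, 3) << 700, 0, 320, 0, 700, 240, 0, 0, 1);

        // Essential matrix with RANSAC to reject bad matches.
        cv::Mat mask;
        cv::Mat E = cv::findEssentialMat(pts1, pts2, K, cv::RANSAC, 0.999, 1.0, mask);

        // Decompose into rotation R and unit-length translation direction t.
        cv::Mat R, t;
        cv::recoverPose(E, pts1, pts2, K, R, t, mask);

        // R and t can be chained frame-to-frame for a monocular trajectory,
        // but metric scale must come from another sensor or known geometry.
        return 0;
    }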
The amount of images/second you need depends on the speed of the camera. More is better, but my guess is that 5/second should be sufficient at walking speeds.
Using extra sensors will help. Probably the robot localization articles talk about this as well.
As part of my master's project I proposed to build a virtual trial-room application intended for retail clothing stores. Currently it is meant to be used directly in store, though it may be extended to online stores as well.
The application will show customers how a selected piece of apparel would look on them by rendering it on their 3D replica on screen.
It involves 3 steps
Sizing up the customer
Building a 3D humanoid replica of the customer
Applying simulated cloth to the model
My question is about the feasibility of the project and choice of framework.
Can this be achieved in real time using a normal Desktop computer? If yes what would be appropriate framework ( hardware, software, programming language etc ) for this purpose?
Based on the work I have done so far, I was planning to achieve the above steps in the following ways:
For step 1: option a) two cameras for front and side views, or
option b) one or two Kinects for complete 3D data.
For step 2: either use MakeHuman (http://www.makehuman.org/) code to build a customised 3D model from the above data, or build everything from scratch; I am unsure about the framework.
For step 3: I just need a few cloth samples, so I thought of building simulated clothes in Blender.
Currently I have just a vague idea about the different pieces, but I am not sure how to develop the complete application.
Theoretically this can be achieved in real time. Many useful algorithms for video tracking, stereo vision and 3D reconstruction are available in the OpenCV library, but it's very difficult to build a robust solution. For example, you'll probably need to track the human body as it moves from frame to frame and perform pose estimation (OpenCV contains the POSIT algorithm); however, it's not trivial to eliminate noise in the resulting object coordinates. For inspiration, see this nice work on video tracking.
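POSIT lives in the old C API; in current OpenCV versions the closest equivalent is cv::solvePnP. A minimal, hypothetical sketch with made-up model/image correspondences and intrinsics:

    #include <opencv2/opencv.hpp>
    #include <iostream>

    int main() {
        // 3D points on the body model, in the model's coordinate frame (metres).
        std::vector<cv::Point3f> modelPts = {
            {0.0f, 0.0f, 0.00f}, {0.3f, 0.0f, 0.05f}, {0.0f, 0.5f, 0.10f},
            {0.3f, 0.5f, 0.00f}, {0.15f, 0.8f, 0.15f}, {0.15f, 1.0f, 0.05f}};
        // Corresponding detected 2D points in the camera image (pixels, placeholders).
        std::vector<cv::Point2f> imagePts = {
            {310, 420}, {350, 418}, {312, 300}, {352, 298}, {330, 220}, {331, 180}};

        // Camera intrinsics from calibration (placeholder values).
        cv::Mat K = (cv::Mat_<double>(3, 3) << 600, 0, 320, 0, 600, 240, 0, 0, 1);
        cv::Mat dist = cv::Mat::zeros(5, 1, CV_64F);

        // Recover the model's rotation and translation relative to the camera.
        cv::Mat rvec, tvec;
        cv::solvePnP(modelPts, imagePts, K, dist, rvec, tvec);

        // Noise in the detected 2D points propagates straight into this pose,
        // which is the robustness issue mentioned above.
        std::cout << "tvec:\n" << tvec << "\n";
        return 0;
    }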
You might want to choose another way: simplify some things, avoid the complicated stuff, do things less dynamically, and estimate only the clothing size and approximate human location. In this case you will most likely create something useful and interesting.
I've lost the link to one online fitting room where hand and body detection is implemented. Using a Kinect solves many problems, but if for some reason you won't use it, then AR (augmented reality) can help you (yet another fitting room).
Are there any methods in the computer vision literature that allow detecting transparent glass in images? For example, if I have an image of a car, can I detect the windows? etc.
All methods I've found so far are active methods (i.e. they require calibration, control over the environment, or lasers). I need a passive method (i.e. all you have is an image, or multi-view images of the object, and that's it).
Here is some very recent work aimed at detecting transparent objects in a general setting.
http://books.nips.cc/papers/files/nips22/NIPS2009_0397.pdf
http://videolectures.net/nips09_fritz_alfm/
I think what you're looking for is detection of translucent regions. There is very limited work here since it is a very hard problem. Basically it is a major chicken-and-egg problem: translucent regions cause almost all fundamental image processing tools to fail (e.g. motion estimation, feature matching, tracking, etc.), yet you must use such tools to detect translucent regions. Anyway, to my knowledge this is the most recent piece of work in this area and I doubt there is any other.
http://www.mee.tcd.ie/~sigmedia/pmwiki/uploads/Misc.Icip2011/CVPR_new.pdf
It is published in CVPR which is a top conference in Computer Vision.
Just a wild guess: if the camera is moving and you perform a 3D reconstruction of the scene, you could detect large discontinuities in the reconstruction at the reflective regions.
I think you should provide a clearer description of what you are trying to achieve.
The paper "Deriving intrinsic images from image sequences" shows some results with transparencies.
If you are close enough, you may be able to use the glass refraction (a la Snell's law) to detect the glass from multiple views.
I also think that reflections (specular regions) are a good indication of curved glass.
Detecting it is one thing, but separating it is another. You can do separation because it's like mixing two sounds with one of them 180 degrees out of phase: if you manage to learn the phased sound by itself, you get the other sound automatically, so you could then learn that one too. I'm stuck at the point where I can only superimpose/subtract them if I learned them by themselves. So the real gain here is somehow learning this mixture as two separate things, even though you never saw them apart.
I'm looking to write a bit of software that will end up drawing a human frame (which can be configured with various parameters), and the plan is to have some sort of clothing placed on the dummy.
I've looked at Blender and OpenGL libraries as well as other rendering and physics engines. I'm not looking for you to tell me how to do this; mainly I'm wondering what libraries are out there to do this sort of thing.
So there will be a 2D pattern for the clothing, and then the system (at least in theory) will be able to translate that into a 3D representation of, for example, a shirt, and then place that on the human frame. I know there's a lot of work I need to do for this; however, in terms of rendering the clothing onto the frame and accounting for collisions and how it drapes around the frame etc., I've been googling and have found a few bits, but was wondering if there were C++ libraries out there that would do that.
I'm developing using Visual C++ 2010, and the target environment is a Windows box.
Either that, or I'm going to need to take some physics lessons.
Unfortunately, developing a system like the one you're talking about would be insanely difficult. On the plus side, there are a lot of easy-to-use technologies that will hopefully help you attain your goal.
Generally, this type of thing works as follows: you make a 3D asset in a modeling program such as Blender, 3ds Max, Maya, Softimage, etc., and then use it in your program/game. You can think of these programs as just spitting out a bunch of 3D coordinates, which your program, with the help of OpenGL or DirectX, can load into memory and render.
Modeling and loading assets is of course the alternative to developing algorithms to generate the geometry, which is what it seems like you're trying to accomplish.
The bad news is that clothing is really, really complicated. A big part of this is that most of it requires simulating cloth dynamics. Another part of the problem is that even if you had a 2D pattern, how would you specify the manner in which the clothing adheres to your human model? Is it skin-tight? Loose? How will you parameterize that? The placement of the actual clothing on the body is a chore in and of itself, as anyone with experience in 3D modeling might tell you.
Nevertheless, some of the industry's brightest professionals are looking for both better ways to simulate cloth, and better ways to automate asset creation.
In summation, the easy answer is that what you're trying to do, as interesting and noble as it may be, is going to be extremely difficult and may not have the result you're looking for.
As for where you can go for more answers:
If you're still interested in finding a way to automate clothing attachment to models, I would start by looking around academic websites. Look for any computer science departments which have computer graphics research programs. You will find a lot of interesting things there.
For more academic type resources look at Game Programming Gems, GPU Programming Gems, and Graphics Programming Gems book series. They feature many good articles that tackle difficult graphics problems such as these.
Another thing you might do is check out Blender a little more. There is an interesting project called MakeHuman:
http://makehuman.blogspot.com/
It automates the process of developing human models in Blender.
There are a couple of tutorials for putting clothing on the models, take a look at this one:
http://www.davidjarvis.ca/blender/tutorial-05.shtml
For more tutorials on clothing and cloth simulation in Blender, you can always check out:
www.blendercookie.com
cg.tutsplus.com
I hope some of this has been useful.
From what I remember, cloth is simulated as a mesh of springs, which suggests using a physics library for the simulation along with an understanding of the physics of springs/cloth. I've not heard of a physics library tailored to cloth simulation, though, but no doubt someone on this site will know of one.
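For a feel of the mesh-of-springs idea, here is a toy C++ sketch (no library): a grid of particles connected by structural springs, stepped with semi-implicit Euler under gravity. It is only an illustration, not a robust cloth simulator; a real one also needs shear/bend springs, collision handling and a more stable solver.

    #include <cmath>
    #include <cstdio>
    #include <vector>

    struct Vec3 { float x = 0, y = 0, z = 0; };
    Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
    Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
    Vec3 operator*(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
    float length(Vec3 a) { return std::sqrt(a.x * a.x + a.y * a.y + a.z * a.z); }

    struct Particle { Vec3 pos, vel; bool pinned = false; };
    struct Spring { int a, b; float rest; };

    int main() {
        const int W = 10, H = 10;       // grid of particles
        const float spacing = 0.1f;     // 10 cm between neighbours
        const float k = 500.0f;         // spring stiffness
        const float damping = 0.98f;
        const float mass = 0.05f;
        const float dt = 1.0f / 240.0f;

        std::vector<Particle> particles(W * H);
        std::vector<Spring> springs;

        for (int y = 0; y < H; ++y)
            for (int x = 0; x < W; ++x) {
                Particle& p = particles[y * W + x];
                p.pos = {x * spacing, 0.0f, y * spacing};
                p.pinned = (y == 0);    // pin the top row, like cloth on a rail
            }

        // Structural springs between horizontal and vertical neighbours.
        for (int y = 0; y < H; ++y)
            for (int x = 0; x < W; ++x) {
                if (x + 1 < W) springs.push_back({y * W + x, y * W + x + 1, spacing});
                if (y + 1 < H) springs.push_back({y * W + x, (y + 1) * W + x, spacing});
            }

        for (int step = 0; step < 2000; ++step) {
            std::vector<Vec3> force(particles.size(), Vec3{0, -9.81f * mass, 0}); // gravity

            // Hooke's law along each spring.
            for (const Spring& s : springs) {
                Vec3 d = particles[s.b].pos - particles[s.a].pos;
                float L = length(d);
                if (L < 1e-6f) continue;
                Vec3 f = d * (k * (L - s.rest) / L);
                force[s.a] = force[s.a] + f;
                force[s.b] = force[s.b] - f;
            }

            // Semi-implicit Euler: update velocity first, then position.
            for (size_t i = 0; i < particles.size(); ++i) {
                if (particles[i].pinned) continue;
                particles[i].vel = (particles[i].vel + force[i] * (dt / mass)) * damping;
                particles[i].pos = particles[i].pos + particles[i].vel * dt;
            }
        }

        std::printf("bottom-right corner ends at y = %.3f m\n", particles[W * H - 1].pos.y);
        return 0;
    }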
This answer is about cloth simulation itself (maybe that is not what you're interested in).
If you want to model cloth simulation with vendor middleware, you can try:
Havok (commercial). It seems to me that it supports any collision objects represented by a triangle mesh.
PhysX (free), although when you try to use it you will run into a lot of constraints.
If you want to model cloth physics by hand, I can advise these steps:
Refresh your basic knowledge of physics (inertia, energy, Newton's laws).
A good starting point for cloth simulation, and physics simulation in general, is this book:
http://www.amazon.com/Game-Physics-Pearls-Gino-Bergen/dp/1568814747
Read SIGGRAPH articles about cloth.
Think about which collision objects you need.
Think about which forces you need.
Split the challenge into:
Broad Phase / Integration / Collision Detection / Collision Response / Constraint Solver
I have developed a cloth physics simulation in C++ and OpenCL.
It took me about 4 months to develop, and about 2 months to debug stage 5.
It was a very intense time in my life; the job consumed a huge amount of time.
Except for the part where you want to change the dummy while the application is running, what you want is more or less the standard use case of game engines like Esenthel Engine. The whole idea is to load a mesh for the body and then put a "cloth" on it (cloth is already defined as a physical type in most game engines). But when it comes to runtime changes to the human frame it becomes a little more tricky, since you have to know how you are going to affect the parameters, which is not easy with organic shapes.
A free game engine to use these days is Unity 3D. It all depends on the level of detail, and Maya and 3ds Max are among the best modeling programs.