OpenCV - Detection of moving object (C++)

I am working on a Traffic Surveillance System, an OpenCV project. I need to detect moving cars and people. I am using the background subtraction method to detect moving objects and then drawing contours around them.
I have a problem: when two cars are moving close together on the road, my system detects them as one car. I have tried approaches like Canny edge detection, transformations, etc. Can anyone suggest a particular methodology for solving this type of problem?

Plenty of solutions are possible.
A geometric approach would detect that a single moving blob is too big to be one passenger car. Still, this may indicate a car with a caravan. That leads us to another question: if you have two blobs moving close together, how do you know it's two cars and not one car towing a caravan? You may need to add some elementary shape detection.
Another trivial approach is to observe that cars do not suddenly multiply: if you have 5 video frames, and in 4 of them you spot two cars, then it is very likely that the 5th frame also contains two cars.
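As a rough sketch of this kind of pipeline (background subtraction followed by the geometric "blob is too big" check), here is a minimal OpenCV C++ example. The video filename and the area threshold are assumptions; the threshold has to be tuned per scene and camera position:

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

int main() {
    cv::VideoCapture cap("traffic.mp4");   // placeholder video file
    cv::Ptr<cv::BackgroundSubtractor> subtractor = cv::createBackgroundSubtractorMOG2();
    const double kMaxSingleCarArea = 5000.0;   // assumed threshold; tune per scene

    cv::Mat frame, fgMask;
    while (cap.read(frame)) {
        subtractor->apply(frame, fgMask);
        cv::threshold(fgMask, fgMask, 200, 255, cv::THRESH_BINARY); // drop MOG2 shadow pixels (127)
        cv::morphologyEx(fgMask, fgMask, cv::MORPH_OPEN,
                         cv::getStructuringElement(cv::MORPH_RECT, cv::Size(5, 5)));

        std::vector<std::vector<cv::Point>> contours;
        cv::findContours(fgMask, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
        for (const auto& c : contours) {
            if (cv::contourArea(c) > kMaxSingleCarArea) {
                // Geometric cue: this blob is too large for a single car,
                // so it probably contains two merged vehicles (or a caravan).
                // A temporal check (blob count in the previous frames) can confirm.
            }
        }
    }
    return 0;
}
```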

A CV system tracks objects as moving blobs ("clouds" of moving pixels), identifies them, and distinguishes one from another in the case of occlusions. When two (or more) blobs intersect, the system merges them into one combined object and marks it with the IDs of all the source objects currently included in the combination. When one of the objects separates from the combination, the system recognizes which one has left and re-assigns the IDs appropriately.
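A minimal sketch of that ID bookkeeping, assuming the blob extraction and frame-to-frame matching already exist (all names here are illustrative):

```cpp
#include <set>

// Minimal bookkeeping sketch; blob detection and matching between
// frames (the hard part) are assumed to exist elsewhere.
struct TrackedBlob {
    std::set<int> ids;   // all source-object IDs contained in this blob
    // ... position, bounding box, appearance model, etc.
};

// When two blobs intersect, merge them and keep the union of their IDs.
TrackedBlob merge(const TrackedBlob& a, const TrackedBlob& b) {
    TrackedBlob combined;
    combined.ids.insert(a.ids.begin(), a.ids.end());
    combined.ids.insert(b.ids.begin(), b.ids.end());
    return combined;
}

// When a blob leaves the combination, decide (e.g. by appearance matching)
// which ID it carries, then remove that ID from the combined blob.
void split(TrackedBlob& combined, TrackedBlob& departing, int recognizedId) {
    combined.ids.erase(recognizedId);
    departing.ids = {recognizedId};
}
```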

Related

Object Detection: Training Required or No Training Required?

This question is related to object detection, basically detecting any "known" object. For example, imagine I have the objects below.
Table
Bottle
Camera
Car
I will take 4 photos of each individual object: one from the left, another from the right, and the other two from above and below. I originally thought it would be possible to recognize these objects with these 4 photos each, because you have photos from all 4 angles; no matter how you see the object, you can detect it.
But I got confused by someone's idea of training the engine with thousands of positive and negative images of each object. I really don't think this is required.
So simply speaking, my question is: in order to identify an object, do I need those thousands of positive and negative images, or are 4 photos from 4 angles simply enough?
I am expecting to use OpenCV for this.
Update
Actually, the main thing is something like this: imagine that I have 2 laptops, one Dell and the other HP. Both are laptops, but they have clearly visible differences, including the logo. Can we do this using feature description? If not, how "hard" is the "training" process? How many pictures are needed?
Update 2
I need to detect "specific" objects, not all cars, all bottles, etc. For example, "Maruti Car Model 123" and "Ferrari Car Model 234" are both cars, but different. Imagine I have pictures of the Maruti and Ferrari of the above-mentioned models; then I need to detect them. I don't have to worry about other cars or vehicles, or even other models of Maruti and Ferrari. But the above-mentioned "Maruti Car Model 123" should be identified as "Maruti Car Model 123", and the above-mentioned "Ferrari Car Model 234" should be identified as "Ferrari Car Model 234". How many pictures do I need for this?
Answers:
If you want to detect a specific object and you don't need to account for viewpoint changes, you can use 2D features:
http://docs.opencv.org/doc/tutorials/features2d/feature_homography/feature_homography.html
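That tutorial is built around SURF; here is a minimal sketch of the same detect/describe/match/homography idea using the patent-free ORB detector (the image paths are placeholders):

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

int main() {
    cv::Mat object = cv::imread("object.png", cv::IMREAD_GRAYSCALE); // placeholder paths
    cv::Mat scene  = cv::imread("scene.png",  cv::IMREAD_GRAYSCALE);
    if (object.empty() || scene.empty()) return 1;

    // Detect keypoints and compute binary descriptors in both images.
    cv::Ptr<cv::ORB> orb = cv::ORB::create();
    std::vector<cv::KeyPoint> kpObj, kpScene;
    cv::Mat descObj, descScene;
    orb->detectAndCompute(object, cv::noArray(), kpObj, descObj);
    orb->detectAndCompute(scene,  cv::noArray(), kpScene, descScene);

    // Match descriptors (Hamming distance for ORB) with cross-checking.
    cv::BFMatcher matcher(cv::NORM_HAMMING, /*crossCheck=*/true);
    std::vector<cv::DMatch> matches;
    matcher.match(descObj, descScene, matches);

    // Estimate a homography with RANSAC; enough inliers => object found.
    std::vector<cv::Point2f> ptsObj, ptsScene;
    for (const auto& m : matches) {
        ptsObj.push_back(kpObj[m.queryIdx].pt);
        ptsScene.push_back(kpScene[m.trainIdx].pt);
    }
    if (ptsObj.size() >= 4) {
        cv::Mat H = cv::findHomography(ptsObj, ptsScene, cv::RANSAC);
        // Project the object's corners through H to localize it in the scene.
    }
    return 0;
}
```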
To distinguish between 2 logos, you'll probably need to build a detector for each logo which will be trained on a set of images. For example, you can train a Haar cascade classifier.
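Using a trained cascade is then only a few lines; the cascade file below is a placeholder for one you would train yourself (e.g. with opencv_traincascade) on positive and negative logo images:

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

int main() {
    // "logo_cascade.xml" is a placeholder for your own trained cascade.
    cv::CascadeClassifier detector("logo_cascade.xml");
    cv::Mat image = cv::imread("laptop.jpg", cv::IMREAD_GRAYSCALE);
    if (detector.empty() || image.empty()) return 1;

    std::vector<cv::Rect> hits;
    detector.detectMultiScale(image, hits, /*scaleFactor=*/1.1, /*minNeighbors=*/3);
    for (const auto& r : hits)
        cv::rectangle(image, r, cv::Scalar(255), 2); // mark each detected logo
    return 0;
}
```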
To distinguish between different models of cars, you'll probably need to train a classifier using training images of each car. However, I encountered an application that does this using a nearest-neighbour approach: it just extracts features from the given test image and compares them to a known set of images of different car models.
Also, I can recommend some approaches and packages if you explain more about the application.
To answer the question you asked in the title, if you want to be able to determine what the object in the picture is you need a supervised algorithm (a.k.a. trained). Otherwise you would be able to determine, in some cases, the edges or the presence of an object, but not what kind of an object it is. In order to tell what the object is you need a labelled training set.
Regarding the contents of the question: the number of possible angles in a picture of an object is infinite. If you just have four pictures in your training set, the test example could be taken at an angle that falls halfway between training example A and training example B, making it hard for your algorithm to recognize. The larger the training set, the higher the probability of recognizing the object. Be careful: you never reach absolute certainty that your algorithm will recognize the object; it just becomes more likely.

Pattern matching/recognition library for vectors (like OpenCV for image input)

Does anyone know a good pattern matching/recognition library in C++ (OSS preferred) that is able to detect whether a list of vectors is an arrow or some other class?
I already know OpenCV, but that is meant to be used for raster graphics (or did I miss something?). I already have vector geometry, and it seems strange to convert it back into a raster graphic where you have to detect the edges again.
So what I need is a library that uses a list of vectors as input instead of a raster graphic and can recognize if the vectors are an arrow (independent from the direction) and extract the parts of the arrow (head/tip/tail etc.).
Does anyone know such a lib, or have a hint on where to look for this kind of problem (algorithms etc.)?
I am trying to change the way a UI is used. I have already tried the protractor algorithm, and I divided the recognition step into different parts, e.g. for the arrow example:
draw, stop drawing and take result
treat first line as body (route line, arrow shaft)
wait for accept (=> the result is recognised as a simple line; replace the hand-drawn graphic with the route graphic) or for the next draw process
draw arrow head and take result coordinates
wait for the accept/finish button (=> the result is recognised as an arrow and is not a simple route)
a) replace the hand-drawn vectors with the correct arrow graphic
b) or go on with any fletchings, and so on
But I want to do this in a single step for all vector lines (regardless of the order and direction). Are there any recommendations?
And what if the first stroke is a polyline with an angle, and a caret is also recognised, but the follow-up symbology needs to decide between them?
I want to draw commands instead of searching for them in a cluttered menu. But it is also important to detect the parts of a graphic (e.g. center line, left line, ...) and keep the aspect ratio (dimensions) as far as possible, which means that key coordinates should be kept too (e.g. the arrow tip). This is important for replacing the hand-drawn vectors with the corrected standard graphic.
Is this possible with a lib as a single task, or should I stay with the current concept of recognising each polyline separately and looking at the input order (e.g. the first line must be the direction)?
You can look here to get an idea: http://depts.washington.edu/aimgroup/proj/dollar/
There is the $1 Recognizer algorithm and some derived ones and you can try them online.
The problem is that my "commands" consist of multiple lines, and every line might have a different special meaning in the context of the complete graphic. The algorithms and libraries I already know (like the $1 Recognizer above) are more suited to single gestures than to a complex sequence of multiple gesture inputs, which only gets its precise meaning when interpreted as a whole sketch.
I think continuing with the interpretation of each line separately, without putting it into the whole context (recognising the whole sketch), could lead to a dead end. But maybe a mixed approach might get it.
Real-life comparison: it is like when somebody draws a horse. You wouldn't say it is a horse when he has just started to draw the first line; you'll need some more input, e.g. 4 legs, etc.
(Well, I know not everyone is good in drawing and some horses could look like cows... but anyway, this should give you an idea what I mean.)
Any hints?
Update: I've found a video here that is close to the problem. The missing link is how parts of the structure are accessible after the recognition but this can be done in a separate step, too (after knowing what the drawing shows).
In my humble opinion, I don't think there's a library in the wild that fulfils such specific needs. In the end, you'll end up writing custom code.
Either way, the first thing you'll have to do is extract classification features from every gesture you detect. You'll then have to put your acquired feature vectors into a feature space. Once you do this, there are literally a million things you can do to classify the feature vectors into one of the available classes (e.g., arrow, triangle, etc.). For example, the guys from the University of Washington in the link you supplied do their feature extraction in steps 1, 2 and 3, and classify the acquired feature vector in step 4.
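As a toy illustration of that classification step, here is a sketch using OpenCV's k-nearest-neighbour classifier. The feature values and class labels are made up; real ones would come from your gesture-extraction step:

```cpp
#include <opencv2/opencv.hpp>

int main() {
    // Toy training set: 4 two-dimensional feature vectors, two classes.
    cv::Mat trainData = (cv::Mat_<float>(4, 2) << 0.1f, 0.2f,
                                                  0.2f, 0.1f,
                                                  0.9f, 0.8f,
                                                  0.8f, 0.9f);
    cv::Mat labels = (cv::Mat_<int>(4, 1) << 0, 0, 1, 1); // 0 = arrow, 1 = caret

    cv::Ptr<cv::ml::KNearest> knn = cv::ml::KNearest::create();
    knn->train(trainData, cv::ml::ROW_SAMPLE, labels);

    // Classify one new feature vector by its 3 nearest neighbours.
    cv::Mat sample = (cv::Mat_<float>(1, 2) << 0.85f, 0.85f);
    cv::Mat result;
    knn->findNearest(sample, /*k=*/3, result);
    // result now holds the predicted class ID (1, i.e. "caret", here).
    return 0;
}
```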
The idea of breaking the gesture into sub-gestures sounds tempting, though I suspect it will introduce problems in a number of ways (e.g., how to detect the end of one sub-gesture and the beginning of the next), and it will also introduce significant overhead, since you will end up with additional steps and a sort of decision-tree structure.
One other thing that I forgot to mention above is that you will also need to create a training data-set of a reasonable size in order to train your classifiers.
I won't go to the trouble of suggesting libraries, classifiers, linear algebra packages, etc., since this is out of scope in the first place (i.e., I would kindly suggest searching the web for the specific components that will help you build your application).

Group of soldiers moving on grid map together

I am making an RTS game where the whole terrain is a grid (cells with x and y coordinates). I have a couple of soldiers in a group (a military unit), and I want to send them from point A to point B (with obstacles between A and B). I can solve this for one soldier using the A* algorithm; that is not the problem. How do I make my group of soldiers always move together? (I have noticed corner cases where they split up and take different ways to the same destination point. I can choose a leader for the group, but I don't want the soldiers to walk on the same cells as the leader; for example, a couple on the right side and a couple on the left side, if possible.) Has anyone solved a similar problem before? Any ideas for modifying the algorithm?
You want a flocking algorithm, where the leader of the pack follows the A* directions and the others follow the leader in formation.
In case you have very large formations you are going to get into issues like "how to fit all those soldiers through this small hole" and that's where you will need to get smart.
An example could be to enforce a single line formation for tight spots, others would involve breaking down the groups into smaller squads.
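A minimal sketch of the leader/follower idea, assuming the leader already follows the A* path and using fixed formation offsets (illustrative only; in practice the offsets would be rotated to match the leader's heading):

```cpp
#include <vector>

struct Cell { int x, y; };

// The leader follows the A* path; each follower steers toward a fixed
// offset from the leader's current cell, e.g. two left and two right.
std::vector<Cell> followerTargets(const Cell& leader) {
    const Cell offsets[] = { {-1, 0}, {-2, 0}, {1, 0}, {2, 0} };
    std::vector<Cell> targets;
    for (const auto& o : offsets)
        targets.push_back({leader.x + o.x, leader.y + o.y});
    return targets;
}
```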
If you don't have too many soldiers, a straightforward modification would be to treat the problem as a multidimensional one, with each soldier representing 2 dimensions. You can add constraints to this multidimensional space to ensure that your soldiers keep close to each other. However, this might become computationally expensive.
Artificial Potential Fields are usually less expensive and easy to implement, and they can be extended to cooperative strategies. If combined with graph search techniques, you cannot get stuck in local minima. Google gives plenty of starting points: http://www.google.com/search?ie=UTF-8&oe=utf-8&q=motion+planning+potential+fields
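For reference, a minimal single-step potential-field sketch, assuming point obstacles and illustrative constants that would need tuning per map (a cooperative variant would also add repulsion between soldiers to keep spacing):

```cpp
#include <cmath>
#include <vector>

struct Vec2 { double x, y; };

// One artificial-potential-field step for a single unit.
Vec2 potentialFieldStep(Vec2 pos, Vec2 goal, const std::vector<Vec2>& obstacles) {
    const double kAttract = 0.05, kRepulse = 1.0, influence = 3.0; // tune per map

    // Attractive force pulls the unit toward the goal.
    Vec2 force{kAttract * (goal.x - pos.x), kAttract * (goal.y - pos.y)};

    // Repulsive forces push it away from obstacles within the influence radius.
    for (const auto& ob : obstacles) {
        double dx = pos.x - ob.x, dy = pos.y - ob.y;
        double d = std::sqrt(dx * dx + dy * dy);
        if (d > 1e-6 && d < influence) {
            double mag = kRepulse * (1.0 / d - 1.0 / influence) / (d * d);
            force.x += mag * dx / d;
            force.y += mag * dy / d;
        }
    }
    return {pos.x + force.x, pos.y + force.y};
}
```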

Detecting Chess moves from successive image differences using OpenCV tools

Hey, I am coding up the vision system for a simple chess-playing robot. I am trying to improve on some previous research so that a camera and a standard chess set can be used, with both allowed to move during the game. So far I can locate the board in an image acquired via webcam, and I want to detect moves by taking the difference of successive images to determine what has changed, then use previous information about board occupancy to identify the moves.
My problem is that I can't seem to reliably detect changes at the moment, my current pipeline goes like this:
Subtract two images -> histogram-equalize the difference image -> erode and dilate the diff image to remove minor changes -> make a binary copy and apply a distance transform -> get the largest blob (corresponding to the highest value after the DT) and flood-fill that blob -> repeat until the DT returns a value small enough to ignore.
I am coding all this in OpenCV and C++, but my flood fill always seems to fail to fill the blobs, so in most cases I get only one change detected. I have also tried cv::inpaint, but that didn't help either. So my question is: am I just using the wrong approach, or is there some tuning that can make the change detection more reliable? If the former, could people suggest alternative routes, preferably codable in C++/Python and/or OpenCV in a reasonable time?
thanks
The problems of getting a fix on the board and detecting the movement of pieces can be solved independently, assuming one does not move the board while also moving pieces around.
Some thoughts on how I would approach it:
Detecting the orientation of the board
You have to be able to handle the board being rotated in place, as well as moved around as long as some angle is maintained that lets you see the pieces. It would help if there were something on the board that you could easily identify (e.g. a marker on each corner) so that if you lose orientation (e.g. someone moves the board away from the camera completely) you could easily find it again.
In order to keep track of the board you need to model the position of the camera relative to the board in 3D space. This is the same problem as determining the location of a camera being moved around a fixed board: a problem of egomotion. Once you solve that, you can move on to the next stage, which is detecting movement and tracking objects.
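When enough of the square pattern is unobstructed (e.g. at setup, or if you use your own corner markers instead), OpenCV's built-in chessboard detector plus solvePnP gives you exactly this pose. A minimal sketch with placeholder intrinsics (real values would come from a prior cv::calibrateCamera run):

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

int main() {
    cv::Mat image = cv::imread("board.jpg", cv::IMREAD_GRAYSCALE); // placeholder
    cv::Size pattern(7, 7); // inner corners of an 8x8 chessboard

    std::vector<cv::Point2f> corners;
    if (image.empty() || !cv::findChessboardCorners(image, pattern, corners)) return 1;

    // 3D model of the inner corners on the board plane (z = 0, unit squares).
    std::vector<cv::Point3f> model;
    for (int y = 0; y < pattern.height; ++y)
        for (int x = 0; x < pattern.width; ++x)
            model.emplace_back((float)x, (float)y, 0.0f);

    // Placeholder intrinsics; real values come from cv::calibrateCamera.
    cv::Mat K = (cv::Mat_<double>(3, 3) << 800, 0, 320, 0, 800, 240, 0, 0, 1);
    cv::Mat rvec, tvec;
    cv::solvePnP(model, corners, K, cv::noArray(), rvec, tvec);
    // rvec/tvec now describe the board's pose relative to the camera.
    return 0;
}
```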
Detecting movement of pieces
This is probably the simpler part of the problem. There are lots of algorithms out there for object detection in video. I would only add that you can use "key" frames. What I mean by that is identifying those frames in which you see only the board, before and after a single move, e.g. frames where you don't see a hand moving over and obscuring the pieces. Once you have the before/after frames, you can figure out what moved and where it is positioned relative to the board.
You can probably get away with not being able to recognize the shape of each piece if you assume continuity (i.e. that you've tracked all movements since the initial arrangement of the board, which is well known).
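A minimal sketch of the before/after comparison, assuming both key frames have already been warped to the same top-down board view (e.g. using the pose estimated above); the two most-changed squares are the move's source and destination:

```cpp
#include <opencv2/opencv.hpp>

int main() {
    cv::Mat before = cv::imread("before.png", cv::IMREAD_GRAYSCALE); // placeholders
    cv::Mat after  = cv::imread("after.png",  cv::IMREAD_GRAYSCALE);
    if (before.empty() || after.empty()) return 1;

    cv::Mat diff;
    cv::absdiff(before, after, diff);

    // Score each of the 64 squares by its average pixel change.
    int cellW = diff.cols / 8, cellH = diff.rows / 8;
    double best = 0, second = 0;
    cv::Point sq1(-1, -1), sq2(-1, -1);   // board squares, not pixels
    for (int r = 0; r < 8; ++r) {
        for (int c = 0; c < 8; ++c) {
            cv::Rect cell(c * cellW, r * cellH, cellW, cellH);
            double change = cv::mean(diff(cell))[0];
            if (change > best)        { second = best; sq2 = sq1; best = change; sq1 = cv::Point(c, r); }
            else if (change > second) { second = change; sq2 = cv::Point(c, r); }
        }
    }
    // sq1 and sq2 are the two most-changed squares; previous occupancy
    // information tells you which is the source and which the destination.
    return 0;
}
```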

Simulating a car moving along a track

For my Operating Systems class I'm going to write a scheduling simulator entitled "Jurassic Park".
The ultimate goal is to have a series of cars following a set path, and passengers waiting in line at a set location for those cars to return so they can be picked up and taken on the tour. This will be a simple 2D, top-down view of the track and the cars moving along it.
While I can code this easily without having to visually display anything I'm not quite sure what the best way would be to implement a car moving along a fixed track.
To start out, I'm going to simply use OpenGL to draw my cars as rectangles but I'm still a little confused about how to approach updating the car's position and ensuring it is moving along the set path for the simulated theme park.
Should I store vertices of the track in a list and have each call to update() move the cars a step closer to the next vertex?
If you want curved track, you can use splines, which are mathematically defined curves specified by two vector endpoints. You plop down the endpoints, and then solve for a nice curve between them. A search should reveal source code or math that you can derive into source code. The nice thing about this is that you can solve for the heading of your vehicle exactly, as well as get the next location on your path by doing a percentage calculation. The difficult thing is that you have to do a curve length calculation if you don't want the same number of steps between each set of endpoints.
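One concrete choice is the Catmull-Rom spline, which conveniently passes through the control points themselves, so your track vertices can double as spline endpoints. A minimal evaluation sketch (p1 and p2 are the current segment's endpoints; p0 and p3 are their neighbouring track vertices):

```cpp
struct Point { double x, y; };

// Catmull-Rom interpolation between p1 and p2, with t in [0, 1].
Point catmullRom(Point p0, Point p1, Point p2, Point p3, double t) {
    double t2 = t * t, t3 = t2 * t;
    auto interp = [&](double a, double b, double c, double d) {
        return 0.5 * ((2 * b) + (-a + c) * t +
                      (2 * a - 5 * b + 4 * c - d) * t2 +
                      (-a + 3 * b - 3 * c + d) * t3);
    };
    return { interp(p0.x, p1.x, p2.x, p3.x),
             interp(p0.y, p1.y, p2.y, p3.y) };
}
```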
An alternate approach is to use a hidden bitmap with the path drawn on it as a single pixel wide curve. You can find the next location in the path by matching the pixels surrounding your current location to a direction-of-travel vector, and then updating the vector with a delta function at each step. We used this approach for a path traveling prototype where a "vehicle" was being "driven" along various paths using a joystick, and it works okay until you have some intersections that confuse your vector calculations. But if it's a unidirectional closed loop, this would work just fine, and it's dead simple to implement. You can smooth out the heading angle of your vehicle by averaging the last few deltas. Also, each pixel becomes one "step", so your velocity control is easy.
In the former case, you can have specially tagged endpoints for start/stop locations or points of interest. In the latter, just use a different color pixel on the path for special nodes. In either case, what you display will probably not be the underlying path data, but some prettied up representation of your "park".
Just pick whatever is easiest, and write a tick() function that steps to the next path location and updates your vehicle heading whenever the car is in motion. If you're really clever, you can do some radius based collision handling so that cars will automatically stop when a car in front of them on the track has halted.
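A minimal sketch of such a tick() function for a closed-loop list of track vertices (the speed constant is illustrative):

```cpp
#include <cmath>
#include <vector>

struct Vec2 { double x, y; };

struct Car {
    Vec2 pos{0, 0};
    std::size_t target = 0;  // index of the next track vertex
    double speed = 2.0;      // units per tick, illustrative
};

// Advance the car one step toward the next vertex of a closed-loop track.
void tick(Car& car, const std::vector<Vec2>& track) {
    Vec2 goal = track[car.target];
    double dx = goal.x - car.pos.x, dy = goal.y - car.pos.y;
    double dist = std::sqrt(dx * dx + dy * dy);
    if (dist <= car.speed) {
        car.pos = goal;                               // snap to the vertex...
        car.target = (car.target + 1) % track.size(); // ...and aim at the next one
    } else {
        car.pos.x += car.speed * dx / dist;           // constant-speed step
        car.pos.y += car.speed * dy / dist;
    }
    // Heading for rendering the rectangle: atan2(dy, dx).
}
```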
I would keep it simple:
Run a timer (every 100 msec), and on each tick draw each of the cars at its new location. The location is read from a file, which contains the 2D coordinates of each car.
If you design the road to be very long (let's say, 30 seconds), writing 30*10 points would be... hard. So how about storing in the file the location at every full second? Then between two stored points you will have 9 blind spots; just move the car at constant speed (x += dx/9, y += dy/9).
I would like to hear a better approach :)
Well, you could use a path as you describe, either a fixed-point path or a spline, then move at a fixed 'velocity' along this path. This may look stiff if the car moves at the same speed on the straights as when cornering.
So you could then have a speed for each path section, but you would need many speed set points, or you could blend the speeds; otherwise you'll get jerky speed changes.
Or you could go for a full car simulation and use A* to build the optimal path. That's overkill, but very cool.
If the car only moves forward and backward, and you know that you want to go forward, you could just look at the cells around you, find the ones that are the color of the road, and move so you stay in the center of the road.
If you assume that you won't have abrupt curves, then you can assume that the road is directly in front of you and just scan to the left and right to see if the road curves a bit, staying in the center; this cuts down on processing.
There are other approaches that could work, but this one is simple, IMO, and allows you to have gentle curves in your road.
Another approach is just to make it tile-based, so you only look at the tile in front of you, and have different tiles for changes in road direction, so you know how to turn the car to stay on the tile.
This wouldn't be as smooth but is also easy to do.
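A minimal sketch of that tile lookup, assuming four headings and corner tiles named by the two edges they connect (the mapping is illustrative):

```cpp
// Headings: 0 = north, 1 = east, 2 = south, 3 = west.
enum class Tile { StraightNS, StraightEW, CornerNE, CornerNW, CornerSE, CornerSW };

// Given the tile the car is entering and its current heading, return the
// new heading. Corner tiles connect the two named edges of the cell.
int nextHeading(Tile tile, int heading) {
    switch (tile) {
        case Tile::CornerNE: return heading == 2 ? 1 : 0; // southbound exits east, westbound exits north
        case Tile::CornerNW: return heading == 2 ? 3 : 0; // southbound exits west, eastbound exits north
        case Tile::CornerSE: return heading == 0 ? 1 : 2; // northbound exits east, westbound exits south
        case Tile::CornerSW: return heading == 0 ? 3 : 2; // northbound exits west, eastbound exits south
        default:             return heading;              // straight tiles keep the heading
    }
}
```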