I am currently developing an application based on OpenCV/C++ to track small animals:
Here is an example of the kind of video it should process. The program simply outputs the x,y position of the animal and the time for each area. This is a graphical representation of the result overlaid on the original.
My question is a bit awkward (and maybe off topic) in that I am not asking how to improve my program, but how to assess it. I am aware of the existence of the Bonn Benchmark on Tracking dataset, but it is not appropriate for my case.
The program is meant to process very long videos; therefore, I cannot realistically ask independent humans to determine the positions of the animals and compare human vs. program.
I have also considered using robots or putting transponders on bigger animals in order to have the precise positions, but I do not really have the resources.
I came up with the idea of using a program to generate videos of blobs moving in a 2D environment. My question is simple:
Are you aware of any programmable high-level framework that I could use to graphically simulate the motion of an object moving stochastically over a parametrisable background?
My dream would be to have a command line tool that would work like this:
$ program [BACKGROUND_OPTIONS] [OBJECT_OPTIONS] -V VIDEO_OUTPUT -P POSITIONS_OUTPUT
The background texture could be manipulated, as well as the shape, colour, and motion pattern of the moving object.
I know that I could probably "easily" make it myself (and I will if I cannot find anything), but I would prefer that the program being assessed and the reference be as independent as possible (for instance, not both made by the same person).
One thing I've seen several motion detection/tracking projects do is create test videos with some 3D rendering software such as Blender. It doesn't have the simple interface of your dream test creator, but it's a good testing tool for lots of reasons:
You can set up whatever scenario you want (varying perspective, number of objects, test length, motion paths, etc.)
You completely control lighting parameters, shapes, sizes, etc.
You can design simple tests to verify basic functionality (solid color background with solid colored moving spheres makes a good starting point), then branch into more complex scenarios (other static objects, objects occluding other objects, background images, and so on).
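If you do end up rolling your own simple generator first, it is not much code with OpenCV itself. A minimal sketch, with hypothetical file names and parameters: a single blob doing a random walk on a noisy grey background, writing both the video and a ground-truth position file.

    #include <algorithm>
    #include <fstream>
    #include <random>
    #include <opencv2/opencv.hpp>

    int main()
    {
        const int width = 640, height = 480, frames = 1000;

        // Hypothetical outputs: the test video plus a ground-truth position file.
        cv::VideoWriter video("synthetic.avi",
                              cv::VideoWriter::fourcc('M', 'J', 'P', 'G'),
                              25.0, cv::Size(width, height));
        std::ofstream positions("positions.csv");          // frame,x,y

        std::mt19937 rng(42);
        std::normal_distribution<double> step(0.0, 3.0);    // random-walk step size
        double x = width / 2.0, y = height / 2.0;

        for (int f = 0; f < frames; ++f)
        {
            // Parametrisable background: flat grey plus Gaussian noise texture.
            cv::Mat frame(height, width, CV_8UC3, cv::Scalar(90, 90, 90));
            cv::Mat noise(height, width, CV_8UC3);
            cv::randn(noise, cv::Scalar::all(0), cv::Scalar::all(15));
            frame += noise;

            // The "animal": a dark blob doing a clamped random walk.
            x = std::min(std::max(x + step(rng), 10.0), width - 10.0);
            y = std::min(std::max(y + step(rng), 10.0), height - 10.0);
            cv::circle(frame, cv::Point(cvRound(x), cvRound(y)), 6,
                       cv::Scalar(30, 30, 30), -1);

            video.write(frame);
            positions << f << "," << x << "," << y << "\n";
        }
        return 0;
    }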
I'm looking for some advanced tutorials or maybe open-source applications written in C++ or .NET that implement a complex vector-based application, something like MS Visio or AutoCAD. What I need to know is how the gurus behind such applications manage the rendering of complex objects (> 1000 rectangles) on mouse move, when the user can drag a complex object over other complex objects. I know about XOR painting and the like, but if you look at the applications above, it is clear they are not using this technique. The whole object moves smoothly on top of the others, not only its XOR reflection. In addition, the moving objects show extra information while being moved, like the current coordinates, so it is not just a static copy of the object saved in a bitmap.
Any advice is welcome.
Thanks.
A lot of graphical applications use some kind of spatial partitioning to prune down the number of objects they need to look at. For example, if you move a rectangle, the application looks in a quadtree and finds the 2 or 3 other objects whose bounding boxes overlap the moving rectangle. Then it only needs to do full collision detection and graphics processing for 2 or 3 objects instead of 1000.
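A minimal illustration of that kind of pruning, assuming a fixed-depth quadtree over axis-aligned rectangles (not production code; a rectangle overlapping several quarters is simply stored in each of them, so a query may return duplicates):

    #include <iostream>
    #include <vector>

    struct Rect { float x, y, w, h; };

    bool overlaps(const Rect& a, const Rect& b)
    {
        return a.x < b.x + b.w && b.x < a.x + a.w &&
               a.y < b.y + b.h && b.y < a.y + a.h;
    }

    // Deliberately simple quadtree: a node either stores rectangles itself or
    // has four children covering its quarters.
    struct QuadTree
    {
        static const int kMaxItems = 8;
        static const int kMaxDepth = 6;

        Rect bounds;
        std::vector<Rect> items;
        std::vector<QuadTree> children;   // empty, or exactly four

        explicit QuadTree(Rect b) : bounds(b) {}

        void insert(const Rect& r, int depth = 0)
        {
            if (children.empty() && (items.size() < kMaxItems || depth >= kMaxDepth)) {
                items.push_back(r);
                return;
            }
            if (children.empty())
                split(depth);
            // Store the rectangle in every child it touches.
            for (QuadTree& c : children)
                if (overlaps(c.bounds, r))
                    c.insert(r, depth + 1);
        }

        void split(int depth)
        {
            const float hw = bounds.w / 2, hh = bounds.h / 2;
            children.push_back(QuadTree({bounds.x,      bounds.y,      hw, hh}));
            children.push_back(QuadTree({bounds.x + hw, bounds.y,      hw, hh}));
            children.push_back(QuadTree({bounds.x,      bounds.y + hh, hw, hh}));
            children.push_back(QuadTree({bounds.x + hw, bounds.y + hh, hw, hh}));
            for (const Rect& r : items)
                for (QuadTree& c : children)
                    if (overlaps(c.bounds, r))
                        c.insert(r, depth + 1);
            items.clear();
        }

        // Collect candidates whose bounding boxes overlap the query rectangle.
        void query(const Rect& q, std::vector<Rect>& out) const
        {
            if (!overlaps(bounds, q)) return;
            for (const Rect& r : items)
                if (overlaps(r, q))
                    out.push_back(r);
            for (const QuadTree& c : children)
                c.query(q, out);
        }
    };

    int main()
    {
        QuadTree tree({0, 0, 1000, 1000});
        for (int i = 0; i < 1000; ++i)
            tree.insert({float(i % 100) * 10, float(i / 100) * 100, 8, 8});

        std::vector<Rect> candidates;
        tree.query({200, 300, 50, 50}, candidates);   // the moving rectangle
        std::cout << candidates.size() << " candidates instead of 1000\n";
    }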
I'm writing an OpenGL 2D library in Python. Everything is going great, and the codebase is steadily growing.
Now I want to write unit tests so I don't accidentally introduce new bugs while fixing others or adding new features. But I have no idea how those would work with graphics libraries.
Some things I thought of:
make reference screenshots and compare them with autogenerated screenshots in the tests
replace OpenGL calls with logging statements and compare the logs
But both seem like a bad idea. What is the common way to test graphics libraries?
The approach I have used in the past for component-level testing is as follows (a small sketch of the comparison step follows the list):
Use a uniform colored background, with a few different colors.
Use uniform colored rectangles as graphical objects in tests (with a few different colors).
Place rectangles in known places where you can calculate their projected position in the image by yourself.
Calculate the expected intensity of each channel of each pixel (background, foreground, or a mixture).
If a test scenario results in non-integer positions, use an approximate comparison (e.g. correlation).
Use these calculations to create the expected result images.
Compare the output images to the expected result images.
If you have a blur effect, compare the sum of intensities instead of individual pixel intensities.
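A sketch of the comparison step under those assumptions (pure C++, no GL calls; the "output" image is faked here so the example runs on its own, but in a real test it would be a screenshot grabbed from the library, e.g. via glReadPixels):

    #include <cassert>
    #include <cstdint>
    #include <cstdlib>
    #include <iostream>
    #include <vector>

    // A tiny RGB image as a flat byte buffer.
    struct Image
    {
        int w, h;
        std::vector<uint8_t> px;                        // w * h * 3 bytes
        Image(int w_, int h_) : w(w_), h(h_), px(w_ * h_ * 3, 0) {}
        uint8_t& at(int x, int y, int c) { return px[(y * w + x) * 3 + c]; }
    };

    // Build the expected result analytically: a uniform background with one
    // axis-aligned rectangle at a known position.
    Image makeExpected(int w, int h, const uint8_t bg[3],
                       int rx, int ry, int rw, int rh, const uint8_t fg[3])
    {
        Image img(w, h);
        for (int y = 0; y < h; ++y)
            for (int x = 0; x < w; ++x)
                for (int c = 0; c < 3; ++c)
                    img.at(x, y, c) =
                        (x >= rx && x < rx + rw && y >= ry && y < ry + rh) ? fg[c] : bg[c];
        return img;
    }

    // Tolerant comparison: every channel of every pixel must be within `tol`.
    bool imagesMatch(Image& a, Image& b, int tol)
    {
        if (a.w != b.w || a.h != b.h) return false;
        for (size_t i = 0; i < a.px.size(); ++i)
            if (std::abs(int(a.px[i]) - int(b.px[i])) > tol)
                return false;
        return true;
    }

    int main()
    {
        const uint8_t bg[3] = {0, 0, 255}, fg[3] = {255, 0, 0};

        // In a real test, `output` would come from the library under test;
        // here it is faked so the example compiles and runs on its own.
        Image output   = makeExpected(64, 48, bg, 10, 10, 20, 15, fg);
        Image expected = makeExpected(64, 48, bg, 10, 10, 20, 15, fg);

        assert(imagesMatch(output, expected, /*tol=*/2));
        std::cout << "rectangle render test passed\n";
    }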
As graham stated, internal units may be unit-testable without any graphics calls at all.
Break it down even further.
The calls that make the graphics will rely on algorithms - test the algorithms.
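For example, the maths that eventually feeds the graphics calls can be tested without any graphics context at all. A trivial, hypothetical case:

    #include <cassert>
    #include <cmath>

    // Hypothetical helper that a 2D library might use before issuing draw calls.
    void rotatePoint(double& x, double& y, double radians)
    {
        const double nx = x * std::cos(radians) - y * std::sin(radians);
        const double ny = x * std::sin(radians) + y * std::cos(radians);
        x = nx;
        y = ny;
    }

    int main()
    {
        // Rotating (1, 0) by 90 degrees should give (0, 1); no OpenGL needed.
        const double pi = 3.14159265358979323846;
        double x = 1.0, y = 0.0;
        rotatePoint(x, y, pi / 2);
        assert(std::fabs(x) < 1e-9 && std::fabs(y - 1.0) < 1e-9);
        return 0;
    }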
What kind of debugging is available for image processing/computer vision/computer graphics applications in C++? What do you use to track errors/partial results of your method?
What I have found so far is just one tool for online and one for offline debugging:
bmd: attaches to a running process and enables you to view a block of memory as an image
imdebug: enables printf-style debugging of images
Both are quite outdated and not really what I would expect.
What would seem useful for offline debugging is some form of image logging, let's say a set of commands that let you write images together with text (probably in the form of HTML, maybe hierarchical), easy to switch off at both compile time and run time, and as unobtrusive as possible.
The output could look like this (output from our simple tool):
http://tsh.plankton.tk/htmldebug/d8egf100-RF-SVM-RBF_AC-LINEAR_DB.html
Are you aware of some code that goes in this direction?
I would be grateful for any hints.
Coming from a ray tracing perspective, maybe some of those visual methods are also useful to you (it is one of my plans to write a short paper about such techniques):
Surface Normal Visualization. Helps to find surface discontinuities. (no image handy, the look is very much reminiscent of normal maps)
color <- rgb (normal.x+0.5, normal.y+0.5, normal.z+0.5)
Distance Visualization. Helps to find surface discontinuities and errors in finding a nearest point. (image taken from an abandoned ray tracer of mine)
color <- (intersection.z-min)/range, ...
Bounding Volume Traversal Visualization. Helps to visualize a bounding volume hierarchy or other hierarchical structures (e.g. kd-trees) and to see the traversal hotspots, like a code profiler. (tbp of http://ompf.org/forum coined the term Kd-vision.)
color <- number_of_traversal_steps/f
Bounding Box Visualization (image from picogen or so, some years ago). Helps to verify the partitioning.
color <- const
Stereo. Maybe useful in your case for a true stereographic appearance. I must admit I never used this for debugging, but when I think about it, it could prove really useful when implementing new types of 3D primitives and trees (image from gladius, which was an attempt to unify realtime and non-realtime ray tracing).
You just render two images from slightly shifted positions, focused on some point.
Hit-or-not visualization. May help to find epsilon errors. (image taken from metatrace)
if (hit) color = const_a;
else color = const_b
Some hybrid of several techniques (a small sketch follows this list).
Linear interpolation: lerp(debug_a, debug_b)
Interlacing: if(y%2==0) debug_a else debug_b
Any combination of ideas, for example the color-tone from Bounding Box Visualization, but with actual scene-intersection and lighting applied
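To make the hybrid idea concrete, here is a small self-contained sketch of the interlacing variant; debugA and debugB are just stand-ins for any two of the visualizations above:

    #include <algorithm>
    #include <cmath>
    #include <iostream>
    #include <vector>

    struct Color { float r, g, b; };

    // Linear interpolation between two debug colors.
    Color lerp(Color a, Color b, float t)
    {
        return { a.r + (b.r - a.r) * t, a.g + (b.g - a.g) * t, a.b + (b.b - a.b) * t };
    }

    int main()
    {
        const int w = 256, h = 256;
        std::vector<Color> image(w * h);

        for (int y = 0; y < h; ++y)
            for (int x = 0; x < w; ++x)
            {
                // Stand-ins for two real debug views of the same pixel:
                // debugA ~ a surface-normal style gradient, debugB ~ a distance view.
                Color debugA = { x / float(w), y / float(h), 0.5f };
                float d = std::min(1.0f, std::hypot(x - w / 2.0f, y - h / 2.0f) / (w / 2.0f));
                Color debugB = { d, d, d };

                // Interlacing: even rows show one view, odd rows the other.
                image[y * w + x] = (y % 2 == 0) ? debugA : debugB;

                // A 50/50 blend would be: image[y * w + x] = lerp(debugA, debugB, 0.5f);
            }

        std::cout << "filled " << image.size() << " debug pixels\n";
    }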
You may find some more glitches and debugging imagery on http://phresnel.org , http://phresnel.deviantart.com , http://picogen.deviantart.com , and maybe http://greenhybrid.deviantart.com (an old account).
Generally, I prefer to dump the byte array of the currently processed image as raw data triplets and run ImageMagick to create a numbered PNG from it, e.g. img01.png. In this way I can trace the algorithms very easily. ImageMagick is run from a function in the program using a system call. This makes it possible to debug without using any external libraries for image formats.
Another option, if you are using Qt, is to work with QImage and call img.save("img01.png") from time to time, the way printf is used for debugging.
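A minimal version of that raw-dump approach, with hypothetical names: it writes a binary PPM and then shells out to ImageMagick's convert (which must be in the PATH) to get a numbered PNG.

    #include <cstdint>
    #include <cstdio>
    #include <cstdlib>
    #include <string>
    #include <vector>

    // Dump an interleaved RGB buffer as imgNN.ppm, then convert it to imgNN.png
    // with ImageMagick, so intermediate results can be inspected without linking
    // any image-format library into the program itself.
    void debugDump(const std::vector<uint8_t>& rgb, int width, int height, int index)
    {
        char ppm[64], png[64];
        std::snprintf(ppm, sizeof(ppm), "img%02d.ppm", index);
        std::snprintf(png, sizeof(png), "img%02d.png", index);

        std::FILE* f = std::fopen(ppm, "wb");
        if (!f) return;
        std::fprintf(f, "P6\n%d %d\n255\n", width, height);   // binary PPM header
        std::fwrite(rgb.data(), 1, rgb.size(), f);
        std::fclose(f);

        // Requires ImageMagick's `convert` to be in the PATH.
        std::system((std::string("convert ") + ppm + " " + png).c_str());
    }

    int main()
    {
        // Fake "currently processed image": a horizontal grey ramp.
        const int w = 320, h = 240;
        std::vector<uint8_t> rgb(w * h * 3);
        for (int y = 0; y < h; ++y)
            for (int x = 0; x < w; ++x)
                rgb[(y * w + x) * 3 + 0] = rgb[(y * w + x) * 3 + 1] =
                rgb[(y * w + x) * 3 + 2] = uint8_t(x * 255 / (w - 1));

        debugDump(rgb, w, h, 1);   // produces img01.ppm and img01.png
    }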
It's a bit primitive compared to what you are looking for, but I have done what you suggested in your OP using standard logging and by writing image files. Typically, the logging and signal export processes and staging exist in unit tests.
Signals are given identifiers (often the input filename), which may be augmented (often with the process name or stage).
For developing the processors, it's quite handy.
Adding HTML for the messages would be simple. In that context, you could produce viewable HTML output easily; you would not need to generate all the HTML yourself, just use HTML template files and insert the messages (a bare-bones sketch follows).
I would just do it myself (as I've done multiple times already for multiple signal types) if you get no good referrals.
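To illustrate how little is actually needed, a bare-bones image log could look like this (hypothetical class name; the images are assumed to already exist on disk, e.g. written by a dump function like the one above):

    #include <fstream>
    #include <string>

    // Bare-bones HTML debug log: interleaves text messages and image files that
    // already exist on disk; does nothing at all when disabled.
    class HtmlLog
    {
    public:
        HtmlLog(const std::string& path, bool enabled) : enabled_(enabled)
        {
            if (!enabled_) return;
            out_.open(path);
            out_ << "<html><body>\n";
        }
        ~HtmlLog() { if (enabled_) out_ << "</body></html>\n"; }

        void text(const std::string& msg)
        {
            if (enabled_) out_ << "<p>" << msg << "</p>\n";
        }
        void image(const std::string& file, const std::string& caption)
        {
            if (enabled_)
                out_ << "<div><img src=\"" << file << "\"/><br/>" << caption << "</div>\n";
        }

    private:
        bool enabled_;
        std::ofstream out_;
    };

    int main()
    {
        HtmlLog log("debug.html", /*enabled=*/true);
        log.text("after thresholding");
        log.image("img01.png", "binary mask, frame 1");
    }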
In Qt Creator you can watch image modifications while stepping through the code in the normal C++ debugger; see e.g. http://labs.qt.nokia.com/2010/04/22/peek-and-poke-vol-3/
As part of my master's project, I proposed building a virtual trial room application intended for retail clothing stores. Currently it is meant to be used directly in store, though it may be extended to online stores as well.
This application will show customers how a selected garment would look on them by rendering it on their 3D replica on screen.
It involves 3 steps
Sizing up the customer
Building a 3D humanoid replica of the customer
Applying simulated clothing to the model
My question is about the feasibility of the project and choice of framework.
Can this be achieved in real time using a normal desktop computer? If yes, what would be an appropriate framework (hardware, software, programming language, etc.) for this purpose?
Based on the work I have done so far, I was planning to achieve the above steps in the following ways:
For step 1: option a) two cameras for front and side views, or
option b) one or two Kinects for complete 3D data
For step 2: either use MakeHuman (http://www.makehuman.org/) code to build a customised 3D model from the above data, or build everything from scratch; I am unsure about the framework.
For step 3: I just need a few clothing samples, so I thought of building simulated clothes in Blender.
Currently I have just a vague idea about the different pieces, but I am not sure how to develop the complete application.
Theoretically this can be achieved in real time. Many useful algorithms for video tracking, stereo vision and 3D reconstruction are available in the OpenCV library. But it is very difficult to build a robust solution. For example, you will probably need to track the human body as it moves from frame to frame and perform pose estimation (OpenCV contains the POSIT algorithm); however, it is not trivial to eliminate noise in the resulting object coordinates. For inspiration, see a nice work on video tracking.
You might want to choose another way: simplify some things, avoid complicated stuff, do things less dynamically, and estimate only the clothing size and the approximate human location. In this case you will most likely create something useful and interesting.
I have lost the link to one online fitting room where hand and body detection were implemented. Using Kinect solves many problems. But if for some reason you will not use it, then AR (augmented reality) can help you (yet another fitting room).
I've started reading into the material on Wikipedia, but I still feel like I don't really understand how a scene graph works and how it can provide benefits for a game.
What is a scene graph in the game engine development context?
Why would I want to implement one for my 2D game engine?
Does the usage of a scene graph stand as an alternative to a classic entity system with a linear entity manager?
What is a scene graph in the game engine development context?
Well, it's some code that actively sorts your game objects in the game space in a way that makes it easy to quickly find which objects are around a point in the game space.
That way, it's easy to:
quickly find which objects are in the camera view (and send only them to the graphics cards, making rendering very fast)
quickly find objects near to the player (and apply collision checks to only those ones)
And other things. It's about allowing quick search in space. It's called "space partitioning". It's about divide and conquer.
Why would I want to implement one for my 2D game engine?
That depends on the type of game, more precisely on the structure of your game space.
For example, a game like Zelda might not need such techniques if it is fast enough to test collisions between all the objects on the screen. However, that can easily become really, really slow, so most of the time you at least set up a scene graph (or a space partition of some kind) to know what is around each moving object and test collisions only against those objects.
So, it depends. Most of the time it is required for performance reasons, but the implementation of your space partitioning depends entirely on the way your game space is structured.
Does the usage of a scene graph stand as an alternative to a classic entity system with a linear entity manager?
No.
However you manage your game entities' lifetimes, the space partition/scene graph is there only to let you quickly search for objects in space, no more, no less. Most of the time it will be an object with slots corresponding to different parts of the game space, and each slot holds the objects that are currently in that part.
It can be flat (like a 2D screen divided into 2 or 4 parts), or it can be a tree (a binary tree, a quadtree, or any other kind of tree), or any other sorting structure that limits the number of operations you have to execute to get space-related information.
Note one thing:
In some cases, you even need separate space partition systems for different purposes. Often a "scene graph" is about rendering, so it is optimized in a way that depends on the player's point of view, and its purpose is to quickly gather the list of objects to render and send to the graphics card. It is not really suited to searching for objects around another object, which makes it hard to use for precise collision detection, for example when you use a physics engine. So, to help, you might have a different space partition system just for physics purposes.
To give an example: say I want to make a "bullet hell" game, where there are a lot of bullets that the player's spaceship has to dodge in a very precise way. To achieve enough rendering and collision detection performance, I need to know:
when bullets appear in the screen space
when bullets leave the screen space
when the player collides with bullets
when the player collides with monsters
So I recursively cut the 2D screen into 4 parts, which gives me a quadtree. The quadtree is updated each game tick, because everything moves constantly, so I have to keep track of each object's (spaceship, bullet, monster) position in the quadtree to know which one is in which part of the screen.
Achieving 1 is easy: just enter the bullet into the system.
To achieve 2, I keep a list of the quadtree leaves (square sections of the screen) that lie on the border of the screen. Those leaves contain the ids/pointers of the bullets that are near the border, so I just have to check whether they are moving out to know when I can stop rendering them and managing their collisions. (It might be a bit more complex, but you get the idea.)
To achieve 3 and 4, I need to retrieve the objects that are near the player's spaceship. So first I get the leaf where the player's spaceship is and collect all of the objects in it. That way I only test collisions between the player's spaceship and the objects around it, not all objects. (It IS a bit more complex, but you get the idea.)
That way I can make sure that my game will run smoothly even with thousands of bullets constantly moving.
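A much simplified, flat-grid variant of that per-tick bookkeeping (hypothetical names; a real bullet-hell game would use the quadtree described above, but the rebuild-then-query pattern is the same):

    #include <iostream>
    #include <vector>

    struct Entity { float x, y; };

    // A flat screen partition: the screen is cut into cellSize x cellSize squares
    // and each cell stores the indices of the entities currently inside it.
    struct Grid
    {
        int cols, rows, cellSize;
        std::vector<std::vector<int>> cells;

        Grid(int width, int height, int cell)
            : cols(width / cell), rows(height / cell), cellSize(cell),
              cells(cols * rows) {}

        void clear()
        {
            for (auto& c : cells) c.clear();
        }
        void insert(int id, const Entity& e)
        {
            int cx = int(e.x) / cellSize, cy = int(e.y) / cellSize;
            if (cx >= 0 && cx < cols && cy >= 0 && cy < rows)
                cells[cy * cols + cx].push_back(id);
        }
        // Entities in the cell containing (x, y) and its 8 neighbours.
        std::vector<int> near(float x, float y) const
        {
            std::vector<int> result;
            int cx = int(x) / cellSize, cy = int(y) / cellSize;
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx)
                {
                    int nx = cx + dx, ny = cy + dy;
                    if (nx >= 0 && nx < cols && ny >= 0 && ny < rows)
                        for (int id : cells[ny * cols + nx])
                            result.push_back(id);
                }
            return result;
        }
    };

    int main()
    {
        std::vector<Entity> bullets(5000);
        for (size_t i = 0; i < bullets.size(); ++i)
            bullets[i] = { float((i * 37) % 800), float((i * 91) % 600) };

        Grid grid(800, 600, 50);
        Entity player = { 400, 300 };

        // Each game tick: rebuild the partition, then only test collisions
        // against the handful of bullets near the player.
        grid.clear();
        for (size_t i = 0; i < bullets.size(); ++i)
            grid.insert(int(i), bullets[i]);

        std::vector<int> candidates = grid.near(player.x, player.y);
        std::cout << "checking " << candidates.size()
                  << " bullets instead of " << bullets.size() << "\n";
    }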
For other kinds of game space, other types of space partitioning are required. Typically, kart/racing games will have a "tunnel" scene graph, because visually the player only sees things along the road, so you just have to check where the player is on the road to retrieve all the visible objects around them in the "tunnel".
What is a scene graph? A scene graph contains all of the geometry of a particular scene. It is useful for representing translations, rotations and scales (along with other affine transformations) of objects relative to each other.
For instance, consider a tank (the type with tracks and a gun). Your scene may have multiple tanks, but each one may be oriented and positioned differently, with its turret rotated to a different azimuth and with a different gun elevation. Rather than figuring out exactly how the gun should be positioned for each tank, you can accumulate affine transformations as you traverse your scene graph to position it properly. It makes the computation of such things much easier.
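A sketch of that accumulation for the tank example, in 2D for brevity and with hypothetical structure names: each node stores a transform relative to its parent, and the world transform is obtained by composing transforms along the path from the root during traversal.

    #include <cmath>
    #include <cstdio>
    #include <vector>

    // Minimal 2D affine transform (rotation + translation) and scene-graph node.
    struct Transform
    {
        float angle, tx, ty;   // rotate by angle, then translate by (tx, ty)
    };

    // Compose parent * child: the child's offset is expressed in the parent's frame.
    Transform compose(const Transform& p, const Transform& c)
    {
        float cs = std::cos(p.angle), sn = std::sin(p.angle);
        return { p.angle + c.angle,
                 p.tx + cs * c.tx - sn * c.ty,
                 p.ty + sn * c.tx + cs * c.ty };
    }

    struct Node
    {
        const char* name;
        Transform local;            // relative to the parent
        std::vector<Node> children;
    };

    // Walk the graph, accumulating transforms, and "draw" each node by printing
    // its world position.
    void draw(const Node& node, const Transform& parentWorld)
    {
        Transform world = compose(parentWorld, node.local);
        std::printf("%-8s at (%.1f, %.1f), heading %.2f rad\n",
                    node.name, world.tx, world.ty, world.angle);
        for (const Node& child : node.children)
            draw(child, world);
    }

    int main()
    {
        // Tank hull at (100, 50); turret mounted 2 units forward of the hull's
        // origin and rotated 90 degrees; gun 3 units in front of the turret.
        Node gun    = { "gun",    { 0.0f, 3.0f, 0.0f }, {} };
        Node turret = { "turret", { 1.5708f, 2.0f, 0.0f }, { gun } };
        Node hull   = { "hull",   { 0.0f, 100.0f, 50.0f }, { turret } };

        draw(hull, { 0.0f, 0.0f, 0.0f });
    }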
2D Scene Graphs: Use of a scene graph for 2D may be useful if your content is sufficiently complex and if your objects have a number of sub components not rigidly fixed to the larger body. Otherwise, as others have mentioned, it's probably overkill. The complexity of affine transformations in 2D is quite a bit less than in the 3D case.
Linear Entity Manager: I'm not clear on exactly what you mean by a linear entity manager, but if you are referring to just keeping track of where things are positioned in your scene, then scene graphs can make things easier if there is a high degree of spatial dependence between the various objects or sub-objects in your scene.
A scene graph is a way of organizing all objects in the environment. Usually care is taken to organize the data for efficient rendering. The graph, or tree if you like, can show ownership of sub objects. For example, at the highest level there may be a city object, under it would be many building objects, under those may be walls, furniture...
For the most part though, these are only used for 3D scenes. I would suggest not going with something that complicated for a 2D scene.
There appear to be quite a few different philosophies on the web as to what the responsibilities of a scenegraph are. People tend to put in a lot of different things, like geometry, cameras, light sources, game triggers, etc.
In general I would describe a scenegraph as a description of a scene, composed of one or more data structures containing the entities present in the scene. These data structures can be of any kind (array, tree, Composite pattern, etc.) and can describe any property of the entities or any relationship between the entities in the scene.
These entities can be anything, ranging from solid drawable objects to collision meshes, cameras and light sources.
The only real restriction I have seen so far is that people recommend keeping game-specific components (like game triggers) out of it to prevent dependency problems later on. Such things would have to be abstracted away as, say, "LogicEntity", "InvisibleEntity" or just "Entity".
Here are some common uses of, and data structures in, a scenegraph.
Parent/Child relationships
The way you could use a scenegraph in a game or engine is to describe parent/child relationships between anything that has a position, be it a solid object, a camera or anything else. Such a relationship would mean that the position, scale and orientation of any child would be relative to that of its parent. This would allow you to make the camera follow the player or to have a lightsource follow a flashlight object. It would also allow you to make things like the solar system in which you can describe the position of planets relative to the sun and the position of moons relative to their planet if that is what you're making.
Also, things specific to some system in your game/engine can be stored in the scenegraph. For example, as part of a physics engine you may have defined simple collision meshes for solid objects whose actual geometry is too complex to test collisions against. You could put these collision meshes (I'm sure they have another name, but I forgot it :P) in your scenegraph and have them follow the objects they model.
Space-partitioning
Another possible datastructure in a scenegraph is some form of space-partitioning as stated in other answers. This would allow you to perform fast queries on the scene like clipping any object that isn't in the viewing frustum or to efficiently filter out objects that need collision checking. You can also allow client code (in case you're writing an engine) to perform custom queries for whatever purpose. That way client code doesn't have to maintain its own space-partitioning structures.
I hope I gave you, and other readers, some ideas of how you can use a scenegraph and what you could put in it. I'm sure there are a lot of other ways to use a scenegraph, but these are the things I came up with.
In practice, scene objects in videogames are rarely organized into a graph that is "walked" as a tree when the scene is rendered. A graphics system typically expects one big array of stuff to render, and this big array is walked linearly.
Games that require geometric parenting relationships, such as those with people holding guns or tanks with turrets, define and enforce those relationships on an as-needed basis outside of the graphics system. These relationships tend to be only one-deep, and so there is almost never a need for an arbitrarily deep tree structure.
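A rough sketch of that flat arrangement (hypothetical fields): each renderable carries an optional parent index, the parenting is resolved once per frame, and the renderer then walks a plain array.

    #include <cstdio>
    #include <vector>

    // One entry per drawable thing; parent is an index into the same array, or -1.
    struct RenderItem
    {
        float localX, localY;    // position relative to the parent (or the world)
        int   parent;            // -1 means no parent
        float worldX, worldY;    // filled in each frame before rendering
    };

    int main()
    {
        std::vector<RenderItem> items = {
            { 10.0f, 20.0f, -1, 0, 0 },   // 0: tank hull, positioned in the world
            {  2.0f,  0.0f,  0, 0, 0 },   // 1: turret, parented to the hull
            {  1.0f,  0.0f,  1, 0, 0 },   // 2: gun, parented to the turret
            { 50.0f,  5.0f, -1, 0, 0 },   // 3: unrelated object
        };

        // Resolve parenting before rendering. Because parents are stored before
        // their children, a single forward pass is enough; no tree is walked.
        for (RenderItem& it : items)
        {
            if (it.parent < 0) { it.worldX = it.localX; it.worldY = it.localY; }
            else {
                const RenderItem& p = items[it.parent];
                it.worldX = p.worldX + it.localX;
                it.worldY = p.worldY + it.localY;
            }
        }

        // The "graphics system": one linear walk over a flat array.
        for (const RenderItem& it : items)
            std::printf("draw at (%.1f, %.1f)\n", it.worldX, it.worldY);
    }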