I was wondering if there is a way for OpenNI to recognize a movement such as crouching or running in place. And could you have hand, finger and body recognition all running at the same time?
I played with pose estimation a while back, but that's not the same as gesture recognition, which also involves time.
I recommend having a look at Dynamic Time Warping (DTW) as a useful technique.
Also have a look at the KineticSpace project which makes use of this technique.
It's written in Processing (Java) and uses a wrapper, but it still uses OpenNI under the hood.
This should help you work out crouching and perhaps running.
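If you want a feel for how DTW works, here is a minimal, self-contained C++ sketch (not KineticSpace's implementation; the joint values are made up) that scores a live sequence against a recorded gesture template:

    #include <algorithm>
    #include <cmath>
    #include <cstdio>
    #include <limits>
    #include <vector>

    // Dynamic Time Warping distance between two 1-D feature sequences
    // (e.g. a skeleton joint's height over time). Smaller = more similar.
    double dtw(const std::vector<double>& a, const std::vector<double>& b)
    {
        const size_t n = a.size(), m = b.size();
        const double INF = std::numeric_limits<double>::infinity();
        // cost[i][j] = best alignment cost of a[0..i) against b[0..j)
        std::vector<std::vector<double>> cost(n + 1, std::vector<double>(m + 1, INF));
        cost[0][0] = 0.0;
        for (size_t i = 1; i <= n; ++i)
            for (size_t j = 1; j <= m; ++j) {
                double d = std::fabs(a[i - 1] - b[j - 1]);
                cost[i][j] = d + std::min({cost[i - 1][j],        // insertion
                                           cost[i][j - 1],        // deletion
                                           cost[i - 1][j - 1]});  // match
            }
        return cost[n][m];
    }

    int main()
    {
        // Toy example: a recorded "crouch" template vs. a live trace
        // (values stand in for a normalised head-joint Y from OpenNI).
        std::vector<double> templ = {1.0, 0.8, 0.5, 0.3, 0.3, 0.5, 0.8, 1.0};
        std::vector<double> live  = {1.0, 0.9, 0.6, 0.3, 0.4, 0.7, 1.0};
        std::printf("DTW distance: %f\n", dtw(templ, live));
        return 0;
    }

A small distance against the crouch template (relative to a threshold you tune) counts as a detection, and DTW tolerates the live gesture being faster or slower than the recording.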
Regarding your second question: you get body recognition and hand tracking for free with OpenNI/NITE, but you'd have to do the finger detection yourself.
Here are a few results found on Google:
Kinect Core Vision
Finger Tip on CodePlex
ROS Finger tracking
There are more examples out there; it depends on what language you use and how comfortable you are with coding.
HTH
I'm really interested in the Google Street View mobile application, which integrates a method to create a fully functional spherical panorama using only your smartphone camera. (Here's the procedure for anyone interested: https://www.youtube.com/watch?v=NPs3eIiWRaw)
What strikes me the most is that it always manages to create the full sphere, even when stitching a featureless, near-monochrome blue sky or ceiling, which gets me thinking that they're not using feature-based matching.
Is it possible to get a decent-quality full spherical mosaic without feature-based matching, using only sensor data? Are smartphone sensors precise enough? What library would be usable to do this? OpenCV? Something else?
Thanks!
The features are needed for registration. In the app, the clever UI makes sure they already know where each photo is relative to the sphere, so in the extreme case all they have to do is reproject/warp and blend. No additional geometry processing is needed.
I would assume that they do try some small corrections to improve the registration, but even if these fail, they can fall back on the sensor-based estimates acquired at capture time.
This is a case where a clever UI makes the vision problem significantly easier.
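To make the reproject/warp point concrete, here is a rough OpenCV (C++) sketch of warping one shot under a pure-rotation assumption using only a sensor-reported orientation; the file name, intrinsics and rotation are all made up:

    #include <cmath>
    #include <opencv2/opencv.hpp>

    int main()
    {
        cv::Mat photo = cv::imread("shot.jpg");  // hypothetical input image

        // Camera intrinsics (focal length and principal point); on a phone
        // these come from the camera API. Values here are invented.
        cv::Mat K = (cv::Mat_<double>(3, 3) << 1200, 0, 640,
                                               0, 1200, 360,
                                               0,    0,   1);

        // Rotation between this shot and the panorama's reference
        // orientation, as reported by the gyroscope/compass
        // (here: an invented 30-degree yaw).
        double yaw = 30.0 * CV_PI / 180.0;
        cv::Mat R = (cv::Mat_<double>(3, 3) << std::cos(yaw), 0, std::sin(yaw),
                                               0,             1, 0,
                                              -std::sin(yaw), 0, std::cos(yaw));

        // For a camera that only rotates about its optical centre, the
        // inter-image mapping is the homography H = K * R * K^-1,
        // so no feature matching is required.
        cv::Mat H = K * R * K.inv();
        cv::Mat warped;
        cv::warpPerspective(photo, warped, H, cv::Size(2000, 1000));
        cv::imwrite("warped.jpg", warped);
        return 0;
    }

How precise the result looks then depends entirely on how well the sensors report the rotation, which is exactly where small feature-based corrections help when they are available.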
I want to implement finger tracking using the Kinect SDK in C++. I have worked on a lot of hand gestures, but I'm stuck at finger tracking. Can you tell me some good libraries or open source projects to get a head start? I am working on a Windows 7 64-bit system. Any help will be appreciated.
I don't know of any libraries in C++ that support it out of the box, but if you are just looking for a head start or a starting point, you might want to look at this or this. It's in C#, but it should give you a good idea, and it does involve finger detection, including directional information.
You can use OpenCV along with your Kinect SDK for finger tracking. Here is an inspiring video:
http://www.youtube.com/watch?v=xML2S6bvMwI
You can also see this link:
Finger detection in human hand
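If you go the OpenCV route, a common starting point is convexity-defect analysis on a segmented hand mask. Here is a rough sketch; the input file and thresholds are invented, and with the Kinect you would normally segment the hand by depth instead of loading a mask:

    #include <cstdio>
    #include <vector>
    #include <opencv2/opencv.hpp>

    int main()
    {
        // Binary hand mask, e.g. a depth image thresholded around the hand.
        cv::Mat mask = cv::imread("hand_mask.png", cv::IMREAD_GRAYSCALE);

        std::vector<std::vector<cv::Point>> contours;
        cv::findContours(mask, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
        if (contours.empty()) return 0;

        // Take the largest contour as the hand.
        size_t hand = 0;
        for (size_t i = 1; i < contours.size(); ++i)
            if (cv::contourArea(contours[i]) > cv::contourArea(contours[hand]))
                hand = i;

        // Fingertips sit on the convex hull; the gaps between fingers show
        // up as deep convexity defects between the contour and its hull.
        std::vector<int> hullIdx;
        cv::convexHull(contours[hand], hullIdx, false, false);
        std::vector<cv::Vec4i> defects;
        cv::convexityDefects(contours[hand], hullIdx, defects);

        int fingerGaps = 0;
        for (const cv::Vec4i& d : defects) {
            float depth = d[3] / 256.0f;  // defect depth is stored fixed-point
            if (depth > 20.0f)            // invented threshold, in pixels
                ++fingerGaps;
        }
        // N gaps between fingers roughly implies N + 1 extended fingers.
        std::printf("approx. extended fingers: %d\n", fingerGaps + 1);
        return 0;
    }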
I am working with OpenCV these days and I am capable of doing 99% of the stuff explained in the official OpenCV tutorials. And I managed to do motion tracking manually with background subtraction, which some users claimed was impossible.
However, right now I am working with object detection, where I need to track the hand and find whether it has moved to the left or right. Can this be done with the following steps (the ones I used in motion detection)? A rough sketch of what I mean appears after the list.
Get 2 instances of the camera video (real time)
Blur them to reduce noise
Threshold them to find the hand (or skip this if the blur is enough)
Find the absolute difference between the 2 images
Get the PSR
Find the pixel position of the motion
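Here is the rough sketch in OpenCV C++ (the blur size, threshold and minimum area are numbers I made up, and I have not verified this is the best approach):

    #include <cstdio>
    #include <opencv2/opencv.hpp>

    int main()
    {
        cv::VideoCapture cap(0);
        if (!cap.isOpened()) return 1;

        cv::Mat frame, gray, prev, diff, thresh;
        double lastX = -1.0;

        cap >> frame;
        cv::cvtColor(frame, prev, cv::COLOR_BGR2GRAY);
        cv::GaussianBlur(prev, prev, cv::Size(21, 21), 0);

        for (;;) {
            cap >> frame;
            if (frame.empty()) break;
            cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);
            cv::GaussianBlur(gray, gray, cv::Size(21, 21), 0);  // reduce noise

            cv::absdiff(prev, gray, diff);                      // what changed?
            cv::threshold(diff, thresh, 25, 255, cv::THRESH_BINARY);

            // Centroid of the moving pixels via image moments.
            cv::Moments m = cv::moments(thresh, true);
            if (m.m00 > 500) {                 // enough motion (invented area)
                double x = m.m10 / m.m00;
                if (lastX >= 0.0)
                    std::printf("%s\n", x > lastX ? "moving right" : "moving left");
                lastX = x;
            }

            gray.copyTo(prev);
            cv::imshow("motion", thresh);
            if (cv::waitKey(1) == 27) break;   // Esc to quit
        }
        return 0;
    }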
However, it seems like this is not 100% the same as motion detection, because I have read about the Kalman filter, block matching, etc., which I did not use in motion detection. I did find this tutorial:
http://homepages.cae.wisc.edu/~ece734/project/s06/lintangwuReport.pdf
But I really need your advice. Is there any tutorial that teaches me how to do this? I am interested in learning the core theory with an OpenCV explanation (C++).
Since I am not good at maths (I am working on it; I didn't go to university, they found me and invited me to join the final year for free because of my programming skills, so I missed the maths), anything full of maths will not work.
Please help. Thank you.
Are there any methods in the computer vision literature that allow for detecting transparent glass in images? For example, if I have an image of a car, can I detect the windows?
All the methods I've found so far are active (i.e. they require calibration, control over the environment, or lasers). I need a passive method (i.e. all you have is an image, or multi-view images of the object, and that's it).
Here is some very recent work aimed at detecting transparent objects in a general setting.
http://books.nips.cc/papers/files/nips22/NIPS2009_0397.pdf
http://videolectures.net/nips09_fritz_alfm/
I think what you're looking for is detection of translucent regions. There is very limited work here, since it is a very hard problem. Basically, it is a major chicken-and-egg problem: translucent regions cause almost all fundamental image processing tools to fail (e.g. motion estimation, feature matching, tracking, etc.), yet you must use such tools to detect translucent regions. Anyway, to my knowledge this is the most recent piece of work in this area, and I doubt there is any other:
http://www.mee.tcd.ie/~sigmedia/pmwiki/uploads/Misc.Icip2011/CVPR_new.pdf
It is published in CVPR which is a top conference in Computer Vision.
Just a wild guess: if the camera is moving and you perform a 3D reconstruction of the scene, you could detect large discontinuities of the reconstruction at the reflected regions.
I think you should provide a clearer description of what you are trying to achieve.
The paper "Deriving intrinsic images from image sequences" shows some results with transparencies.
If you are close enough, you may be able to use the glass refraction (a la Snell's law) to detect the glass from multiple views.
I also think that reflections (specular regions) are a good indication of curved glass.
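For reference, the refraction cue in the multi-view suggestion comes down to Snell's law: light bends at each air-glass interface, so background features seen through the pane shift slightly between viewpoints:

    n1 * sin(theta1) = n2 * sin(theta2),  with n_air ≈ 1.0 and n_glass ≈ 1.5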
Detecting it is one thing, but separating it is another. Separation is possible because it's like mixing two sounds, with one of them 180 degrees out of phase. If you manage to learn the phased sound by itself, you get the other sound automatically, so you could then learn that one too. I'm stuck at the point where I can only superimpose/subtract them if I learned them by themselves. So the real gain here would be somehow learning this mixture as two separate things, even though you never saw them apart.
I come from a C/C++ background and, more recently, Flash. Anyway, I wrote a 2D engine in AS3 and would like to get it running on the iPhone. To start with, I converted it to C++. As a test I wrote a very simple C++ engine and added the files to a standard view-based application in Xcode. I then added a UIImageView that covered the whole iPhone screen.
The way my test engine is set up at the moment, each frame it renders its result to an image, which is then used to update the UIImageView. Assuming I can pass input from the iPhone to the C++ engine, this seems like a fairly platform-independent solution. Since I have been coding for iPhone/Mac for less than a day, I was wondering whether this is the standard approach to getting an existing C++ engine running on the iPhone, and if not, what is?
There's no problem with rendering into an image and refreshing that image, but you get no acceleration from the GPU with this technique. So you'd be burning a lot of CPU cycles, which in turn eats battery.
If the objects you are rendering can be described in normal graphics primitives, be sure to use the drawing APIs which are optimised for the platform and can delegate work to the GPU.
An alternative approach is to make use of OpenGL ES, but this has a learning curve.
Yes, that is a fairly normal way to handle it. Generally you would either use small Objective-C stubs for your events and things like pushing out the frame, or you would set up an OpenGL context and then pass it to your C++ code.
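For illustration, here is a minimal sketch of the kind of C++ interface that keeps the engine platform-independent (all names are illustrative, not from any SDK); the Objective-C stub calls update/render each frame and blits pixels() into the UIImageView, or hands the same calls an OpenGL context instead:

    #include <cstdint>
    #include <vector>

    // Platform-independent 2D engine rendering into a 32-bit pixel buffer.
    class Engine2D {
    public:
        Engine2D(int w, int h) : width_(w), height_(h), pixels_(w * h, 0) {}

        void update(float dt) { /* advance game state by dt seconds */ }

        void render() {
            // Draw into pixels_ here; a solid clear stands in for real drawing.
            for (uint32_t& p : pixels_) p = 0xFF000000u;  // opaque black
        }

        // The platform layer reads this buffer to refresh the screen.
        const uint32_t* pixels() const { return pixels_.data(); }
        int width() const  { return width_; }
        int height() const { return height_; }

        // Input forwarded from the platform's touch events.
        void onTouch(int x, int y) { /* handle input */ }

    private:
        int width_, height_;
        std::vector<uint32_t> pixels_;
    };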