Hand gesture recognition based CAPTCHA - web services

I am planning to make a hand gesture recognition based CAPTCHA. My idea is to implement a CAPTCHA that asks the user to make the gestures indicated on the screen (images of the required hand gestures, such as a V, an F, a 2, etc., made with the hand). Using pattern matching, the system authenticates the gesture and directs the user to the next page if authentication succeeds.
Now I am unable to decide how to make the UI and how to incorporate this into a webpage (as a plugin or something similar). So far I have been coding it in Visual Studio with C++, using the OpenCV library.
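For context, here is a minimal sketch of the kind of OpenCV C++ pattern matching being described: the largest hand contour in the camera frame is compared against a reference gesture image via Hu-moment shape matching. The reference file name, the HSV skin range, and the match threshold are all assumptions to tune.

```cpp
#include <opencv2/opencv.hpp>
#include <algorithm>
#include <iostream>
#include <vector>

// Largest contour in a frame, assuming the hand can be segmented
// with a fixed HSV skin-colour range (a simplification to tune).
static std::vector<cv::Point> largestContour(const cv::Mat& bgr) {
    cv::Mat hsv, mask;
    cv::cvtColor(bgr, hsv, cv::COLOR_BGR2HSV);
    cv::inRange(hsv, cv::Scalar(0, 30, 60), cv::Scalar(20, 150, 255), mask);
    std::vector<std::vector<cv::Point>> contours;
    cv::findContours(mask, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
    if (contours.empty()) return {};
    return *std::max_element(contours.begin(), contours.end(),
        [](const std::vector<cv::Point>& a, const std::vector<cv::Point>& b) {
            return cv::contourArea(a) < cv::contourArea(b);
        });
}

int main() {
    cv::Mat refImg = cv::imread("gesture_V.png");  // hypothetical reference image
    std::vector<cv::Point> ref = largestContour(refImg);
    cv::VideoCapture cam(0);
    cv::Mat frame;
    while (cam.read(frame)) {
        std::vector<cv::Point> cur = largestContour(frame);
        if (!ref.empty() && !cur.empty()) {
            // matchShapes compares Hu moments; lower score = more similar.
            double score = cv::matchShapes(ref, cur, cv::CONTOURS_MATCH_I1, 0);
            if (score < 0.1)  // threshold is an assumption to tune
                std::cout << "Gesture matched - authenticate\n";
        }
        cv::imshow("camera", frame);
        if (cv::waitKey(30) == 27) break;  // Esc to quit
    }
    return 0;
}
```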

Check out a library called "JSFeat"; it implements basic computer vision algorithms in a web framework that you could use to at least prototype your project:
http://inspirit.github.io/jsfeat/#imgproc
The demos opened my webcam in the browser and ran some standard algorithms (edge detection, optical flow, etc.). If you're developing your own algorithms, there is some matrix math to help out, though I bet you could also combine it with other JavaScript linear algebra libraries.
Best of luck.

Related

How can I control and receive a video stream from a SITL drone using PX4, Gazebo, and C++ without using ROS?

I want to program a drone to fly using a C++ project with real-time image-processing analysis (using OpenCV). I want to do it with PX4 and the Gazebo simulator. The final goal is to run the project on a real drone using a Jetson Nano and a Pixhawk flight controller.
I have two main problems:
I can't manage to get the video stream of the PX4 drone models without using ROS. I have followed this official guide to install the relevant software (Gazebo, PX4, GCS).
For Python there is the DroneKit library to control the drone, but I want to use C++ for my project. What tools can I use instead of DroneKit to control drones with C++, and how can I receive the video stream from the Gazebo PX4 drone?
I have searched online for hours and gone through the documentation, but could not find a suitable guide or solution.
Thanks.
Posting this as an answer after details in the comments made things clearer.
For an application like this, ROS should most definitely be used. It comes with a wide range of pre-built packages and tools that make localization and navigation easy. For UAVs, the MoveIt! package is a good place to look; it handles 3D navigation and already has a few UAV implementations. The Hector Quadrotor package is another good option for something like SLAM.
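That said, if you want to try the non-ROS route the question asks about: the stock PX4 SITL camera models (e.g. the typhoon_h480 target) publish video as an RTP/H.264 stream over UDP, which OpenCV can consume through a GStreamer pipeline. A rough sketch, assuming the default port 5600 and an OpenCV build with GStreamer support:

```cpp
#include <opencv2/opencv.hpp>
#include <iostream>

int main() {
    // Pipeline assumes the simulated camera streams RTP/H.264 to UDP
    // port 5600 (the PX4 SITL default); adjust to match your model.
    const std::string pipeline =
        "udpsrc port=5600 ! application/x-rtp,encoding-name=H264 "
        "! rtph264depay ! avdec_h264 ! videoconvert ! appsink";

    cv::VideoCapture cap(pipeline, cv::CAP_GSTREAMER);
    if (!cap.isOpened()) {
        std::cerr << "Could not open GStreamer pipeline\n";
        return 1;
    }
    cv::Mat frame;
    while (cap.read(frame)) {
        // Run your real-time OpenCV analysis on `frame` here.
        cv::imshow("px4 camera", frame);
        if (cv::waitKey(1) == 27) break;  // Esc to quit
    }
    return 0;
}
```

For the control side, MAVSDK is the usual C++ alternative to DroneKit; it speaks MAVLink directly to the SITL instance, with no ROS required.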

Implementing OCR on iOS

I was wondering if anyone had some ideas on how one would implement OCR-style image linking on an iOS device.
What I want the app to do is scan an image using the iPhone's camera and then recognise that image. When the image is recognised, the app should open a link associated with that image.
A better example of what I am talking about comes from a company called Augment. They make a product called "Trackers", which is exactly what I would like to implement.
There are no built-in or custom SDKs that do exactly what you require.
However, you can achieve it by customizing the OpenCV library or one of the augmented reality SDKs.
Here are some links that may be helpful to you:
OpenCV library tutorial iOS
Wikitude Augmented reality
There is now the Real-Time Recognition SDK (http://rtrsdk.com).
It is free, by the way. Disclaimer: I work for ABBYY.
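To make "customizing the OpenCV library" concrete, below is a minimal C++ sketch of recognizing a known image with ORB feature matching; in an iOS app you would wrap the same calls (via Objective-C++) around live camera frames and trigger your link when a match is found. The file names and the good-match threshold are assumptions.

```cpp
#include <opencv2/opencv.hpp>
#include <iostream>
#include <vector>

int main() {
    // The "tracker" image to recognize and a captured frame
    // (hypothetical file names; use camera frames in a real app).
    cv::Mat target = cv::imread("tracker.png", cv::IMREAD_GRAYSCALE);
    cv::Mat scene  = cv::imread("camera_frame.png", cv::IMREAD_GRAYSCALE);
    if (target.empty() || scene.empty()) return 1;

    cv::Ptr<cv::ORB> orb = cv::ORB::create();
    std::vector<cv::KeyPoint> kp1, kp2;
    cv::Mat desc1, desc2;
    orb->detectAndCompute(target, cv::noArray(), kp1, desc1);
    orb->detectAndCompute(scene,  cv::noArray(), kp2, desc2);

    // Hamming distance suits ORB's binary descriptors.
    cv::BFMatcher matcher(cv::NORM_HAMMING);
    std::vector<std::vector<cv::DMatch>> knn;
    matcher.knnMatch(desc1, desc2, knn, 2);

    // Lowe's ratio test keeps only distinctive matches.
    int good = 0;
    for (const auto& m : knn)
        if (m.size() == 2 && m[0].distance < 0.75f * m[1].distance)
            ++good;

    // 25 good matches is an assumed threshold to tune.
    if (good > 25)
        std::cout << "Image recognized - open the associated link\n";
    return 0;
}
```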

Kinect SDK C++ grab and press gestures

I've been working with the Kinect for a few months, and I've used both OpenNI and the Kinect for Windows SDK in that time. When I started, I was using C# with WPF to create menus that a user could interact with through gestures. With WPF it's pretty easy to detect grab, swipe, and press gestures (this is done transparently to the programmer), but since I've migrated to C++, I have no clue how to detect them. Which functions of the Kinect SDK are used to do this, or where can I find a tutorial on this matter?
Many thanks!
I realize it's an old question, but I had the same problem and hadn't found an answer. What I did find was a post on the MSDN forums (linked below) stating that those gestures are an integral part of WPF/XAML and cannot be used without it. A person in that thread said: "I would say that press, pan and zoom as you found them are no "general gestures" out of the box, only interactions with some WPF/XAML controls inside a KinectRegion." The good news is that those gestures are not that hard to implement yourself.
https://social.msdn.microsoft.com/Forums/en-US/59cf4671-98cc-4fe5-a3e0-6ecc612cde3c/swipe-gesture-in-v2-sdk?forum=kinectv2sdk
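For the grab gesture specifically, the Kinect v2 C++ API does expose a per-hand open/closed state on each tracked body, so a grab can be detected as an open-to-closed transition. A rough sketch, assuming you already acquire IBody pointers from a body frame (error handling omitted):

```cpp
#include <windows.h>
#include <Kinect.h>

// Detect a grab as a transition from open to closed hand state.
// Assumes `body` is a tracked IBody obtained from an IBodyFrame.
bool detectGrab(IBody* body, HandState& previousState) {
    HandState current = HandState_Unknown;
    if (FAILED(body->get_HandRightState(&current)))
        return false;

    bool grabbed = (previousState == HandState_Open &&
                    current == HandState_Closed);

    // Only remember states we can trust for the next comparison.
    if (current != HandState_Unknown && current != HandState_NotTracked)
        previousState = current;
    return grabbed;
}
```

A press has to be hand-rolled similarly, e.g. by watching the hand joint's Z coordinate (from IBody::GetJoints) move toward the sensor faster than some threshold you choose.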

Getting the amplitude (or RMS voltage) of an audio signal captured in C++ with the waveIn API

I am working on a very basic robotics project and wish to implement voice recognition in it.
I know it's a complex thing, but I wish to do it for only 3 or 4 commands (or words).
I know that I can record audio using waveIn, but I wish to do real-time amplitude analysis on the audio signal. How can that be done? The input will be 8-bit, mono.
I have thought of dividing the signal into sets of some specific duration, further dividing them into smaller subsets, getting the average RMS value over each subset, and then summing them up and seeing how much they differ from the stored reference signal. If the error is below the accepted value for all (or most) of the sets, the word is recognized.
How can this be implemented?
Any other suggestions would also be greatly appreciated.
Thanks in advance.
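For the amplitude part of the question, a minimal sketch of per-frame RMS, assuming 8-bit unsigned mono PCM (the usual WAV convention, where silence sits at 128) and a frame size you would tune:

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// RMS of one frame of 8-bit unsigned mono PCM; silence is 128,
// so samples are re-centred to signed values before squaring.
double frameRms(const uint8_t* samples, size_t n) {
    double sumSquares = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double s = static_cast<double>(samples[i]) - 128.0;
        sumSquares += s * s;
    }
    return std::sqrt(sumSquares / static_cast<double>(n));
}

// Split a recording into fixed-size frames (e.g. 10 ms = 80 samples
// at 8 kHz; the frame size is an assumption) and return per-frame RMS.
std::vector<double> rmsEnvelope(const std::vector<uint8_t>& pcm, size_t frame) {
    std::vector<double> env;
    for (size_t i = 0; i + frame <= pcm.size(); i += frame)
        env.push_back(frameRms(&pcm[i], frame));
    return env;
}
```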
There is no simple way to recognize words, because they are basically sequences of phonemes which can vary in time and frequency.
Classical isolated-word recognition systems use the signal's MFCCs (mel-frequency cepstral coefficients) as input data and try to recognize patterns using HMM (hidden Markov model) or DTW (dynamic time warping) algorithms.
You will also need a silence-detection module if you don't want a record button.
For instance, the Edinburgh University toolkit provides some of these tools (with good documentation).
If you don't want to build it from scratch, or want a source of inspiration, here is an (old but free) implementation of such a system (which uses its own toolkit), with a full explanation and practical examples of how it works.
That system is an LVCSR (large-vocabulary continuous speech recognition) system, and you only need a subset of it. If someone knows of an open-source reduced-vocabulary system (like a simple IVR), it would be welcome.
If you want to build a basic system of your own, I recommend using MFCC and DTW (see the DTW sketch after the list below):
For each target word to model:
record some instances of the word
compute some delta-MFCCs through the word (e.g., every 10 ms) to form a model
When you want to recognize a signal:
compute the delta-MFCCs of this signal
use DTW to compare these delta-MFCCs against each modeled word's delta-MFCCs
output the word that fits best (use a threshold to drop garbage)
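As a concrete starting point for the comparison step, here is a minimal DTW sketch over sequences of feature vectors (plain Euclidean frame distance, no path constraints); recognition is then the arg-min of this cost over your word models, with a rejection threshold:

```cpp
#include <algorithm>
#include <cmath>
#include <limits>
#include <vector>

using Frame = std::vector<double>;  // one (delta-)MFCC vector per frame

// Euclidean distance between two frames; assumes equal dimensionality.
static double frameDist(const Frame& x, const Frame& y) {
    double d = 0.0;
    for (size_t k = 0; k < x.size(); ++k)
        d += (x[k] - y[k]) * (x[k] - y[k]);
    return std::sqrt(d);
}

// Classic O(n*m) dynamic-time-warping cost between two sequences;
// the lower the cost, the better the match.
double dtw(const std::vector<Frame>& a, const std::vector<Frame>& b) {
    const size_t n = a.size(), m = b.size();
    const double INF = std::numeric_limits<double>::infinity();
    std::vector<std::vector<double>> D(n + 1, std::vector<double>(m + 1, INF));
    D[0][0] = 0.0;
    for (size_t i = 1; i <= n; ++i)
        for (size_t j = 1; j <= m; ++j)
            D[i][j] = frameDist(a[i - 1], b[j - 1]) +
                      std::min({D[i - 1][j], D[i][j - 1], D[i - 1][j - 1]});
    return D[n][m];
}
```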
If you just want to recognize a few commands, there are many commercial and free products you can use. See Need text to speech and speech recognition tools for Linux, What is the difference between System.Speech.Recognition and Microsoft.Speech.Recognition?, or Speech Recognition on iPhone. The answers to these questions link to many available products and tools. Speech recognition and understanding of a list of commands is a very common, commercially solved problem: many of the voice-automated phone systems you call use this type of technology, and the same technology is available to developers.
From watching these questions for a few months, I've seen most developer choices break down like this (a PocketSphinx sketch follows the list):
Windows folks - use the System.Speech features of .NET or Microsoft.Speech and install the free recognizers Microsoft provides. Windows 7 includes a full speech engine, and others are downloadable for free. There is a C++ API to the same engines, known as SAPI. See http://msdn.microsoft.com/en-us/magazine/cc163663.aspx or http://msdn.microsoft.com/en-us/library/ms723627(v=vs.85).aspx
Linux folks - Sphinx seems to have a good following. See http://cmusphinx.sourceforge.net/ and http://cmusphinx.sourceforge.net/wiki/
Commercial products - Nuance, Loquendo, AT&T, others
Online services - Nuance, Yapme, others
Of course, this may also be helpful: http://en.wikipedia.org/wiki/List_of_speech_recognition_software
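As one illustration of the Sphinx route, a rough sketch of keyword spotting with a recent PocketSphinx (the 5prealpha-era C API); the model paths, audio file, and keyphrase are placeholders for your own setup:

```cpp
#include <pocketsphinx.h>
#include <cstdio>

int main() {
    // Paths and keyphrase are placeholders; point them at the en-us
    // model shipped with PocketSphinx and your own command word(s).
    cmd_ln_t* config = cmd_ln_init(nullptr, ps_args(), TRUE,
        "-hmm",  "/usr/local/share/pocketsphinx/model/en-us/en-us",
        "-dict", "/usr/local/share/pocketsphinx/model/en-us/cmudict-en-us.dict",
        "-keyphrase", "go forward",
        "-kws_threshold", "1e-20",
        nullptr);
    ps_decoder_t* ps = ps_init(config);

    FILE* fh = fopen("command.raw", "rb");  // 16 kHz, 16-bit mono raw audio
    if (!fh) return 1;

    int16 buf[512];
    size_t n;
    ps_start_utt(ps);
    while ((n = fread(buf, 2, 512, fh)) > 0)
        ps_process_raw(ps, buf, n, FALSE, FALSE);
    ps_end_utt(ps);

    // Non-null hypothesis means the keyphrase was spotted.
    const char* hyp = ps_get_hyp(ps, nullptr);
    if (hyp) printf("heard: %s\n", hyp);

    fclose(fh);
    ps_free(ps);
    cmd_ln_free_r(config);
    return 0;
}
```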

Creating a user interface that accepts sketch input

I would like to create a simple application that asks the user to draw an image they have been shown. Once the user has completed the drawing, the program would score it. Are there any existing libraries for creating interfaces that accept sketches/drawings from users? I need a sketch object (ideally as vector graphics) which can be processed.
The program should run on tablets and touch-screen laptops, preferably on Windows, though multi-platform would be ideal. I am open to using whatever programming language is best for this project.
Currently I am looking at the SATIN library (http://dub.washington.edu:2007/projects/satin/), but it is rather old; the last change was in 2001.
Maybe you can try Pencil. It's an add-on for Firefox with which you can easily draw sketch schemas (or mockups).
Download Pencil for Firefox
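If no library fits and you end up rolling your own capture widget, the core is small. A minimal sketch in C++ with Qt 5 (cross-platform; pen and touch input arrive as mouse events by default) that records each stroke as a polyline your scoring code can process:

```cpp
#include <QApplication>
#include <QMouseEvent>
#include <QPainter>
#include <QPolygonF>
#include <QVector>
#include <QWidget>

// Captures freehand strokes as polylines, a simple vector form
// that downstream scoring code can process.
class SketchPad : public QWidget {
public:
    QVector<QPolygonF> strokes;  // one polyline per stroke

protected:
    void mousePressEvent(QMouseEvent* e) override {
        strokes.append(QPolygonF() << e->pos());  // start a new stroke
    }
    void mouseMoveEvent(QMouseEvent* e) override {
        if (!strokes.isEmpty() && (e->buttons() & Qt::LeftButton)) {
            strokes.last() << e->pos();  // extend the current stroke
            update();                    // schedule a repaint
        }
    }
    void paintEvent(QPaintEvent*) override {
        QPainter p(this);
        p.setRenderHint(QPainter::Antialiasing);
        for (const QPolygonF& s : strokes)
            p.drawPolyline(s);
    }
};

int main(int argc, char** argv) {
    QApplication app(argc, argv);
    SketchPad pad;
    pad.resize(640, 480);
    pad.show();
    return app.exec();
}
```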