APIs/Libraries/SDKs for handwriting tracking? - computer-vision

I'm trying to make a mobile AR application. I want to track the user's handwriting in real time using a smartphone while the user writes on paper. That is, to track every stroke the user makes.
I know some SDKs and products like ManoMotion and Leap Motion provide relatively precise hand tracking and analysis, but since writing on paper doesn't involve many motions and gestures, I don't think they are suitable.
I have searched online and haven't found any resources for my particular use case, so I would like to ask whether there are other resources I should look at, or whether I should rely on lower-level APIs like OpenCV.
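For context, the simplest OpenCV baseline I can picture is a fixed camera over the page, accumulating pixels that turn dark between frames as "ink". Something like this untested sketch, where the threshold and camera index are placeholder guesses:

```cpp
// Minimal sketch: accumulate new "ink" pixels frame-to-frame with OpenCV.
// Assumes a roughly static camera pointed at white paper; the threshold
// value and camera index are placeholders that would need tuning.
#include <opencv2/opencv.hpp>

int main() {
    cv::VideoCapture cap(0);               // default camera; adjust index
    if (!cap.isOpened()) return 1;

    cv::Mat frame, gray, prevGray, diff, strokes;
    while (cap.read(frame)) {
        cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);
        if (prevGray.empty()) {
            prevGray = gray.clone();
            strokes = cv::Mat::zeros(gray.size(), CV_8U);
            continue;
        }
        // Pixels that became darker since the last frame are candidate ink.
        cv::Mat darker;
        cv::subtract(prevGray, gray, darker);        // prev - current, saturating
        cv::threshold(darker, diff, 25, 255, cv::THRESH_BINARY);
        strokes |= diff;                             // accumulate the stroke mask

        cv::imshow("strokes so far", strokes);
        prevGray = gray.clone();
        if (cv::waitKey(1) == 27) break;             // Esc to quit
    }
    return 0;
}
```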

mhc, kind of late to answer this, but regarding ManoMotion: you need to apply for access to the SDK products through the official website at www.manomotion.com. The approval process is manual, and they ask for some information about how you intend to use it in a project, so I highly suggest providing as much detail as possible.


C++ Usage Statistics

I have made an application in C++ and would like to know how to implement a usage-statistics system so that I can gather data on how users use the program.
E.g. IP address, number of hours spent in the application, and OS used.
In theory I know I can code this myself if I must, but I was wondering if there is a framework available to make this easier. Unfortunately I was unable to find anything on Google.
Though I don't know of any such framework, you could reduce the work needed to retrieve all this information by using some approaches and techniques, which I have tried to describe below. Please, anybody, feel free to correct me.
Let's summarise what groups of information we need to complete the task:
User Environment Information. I suggest you look at Microsoft's WMI infrastructure, in particular WMI classes such as Desktop, File System, Networking, etc. Using these classes in your application can help you retrieve almost any kind of system information (a minimal sketch follows this list). If that doesn't satisfy your needs, see #2.
Application and System Performance. Under these terms I mean overall system performance, processor count, processes running in the OS, etc. To retrieve this data you can use the NtQuerySystemInformation function. With its help you get access to detailed SystemProcessInformation and SystemProcessorPerformanceInformation (info about each processor) data, and much more.
User Related Information. It's hard to find a framework for such things, so I suggest you simply start writing code with your requirements in mind:
counting how many times each button was pressed, each text field was changed, etc.
measuring the delay between consecutive actions in predefined sequences (for example, if you have a settings GUI form and expect the user to fill in all required text fields quickly, delay measurements can tell you whether the user acted as expected or paused at TextBox2 for five minutes).
anything else that might interest you.
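Regarding item 1 above, the usual COM boilerplate for a WMI query looks roughly like this (error handling trimmed for brevity, so treat it as a sketch of the standard pattern rather than production code):

```cpp
// Sketch: reading system info (here, the OS caption) via WMI in C++.
#define _WIN32_DCOM
#include <comdef.h>
#include <Wbemidl.h>
#include <iostream>
#pragma comment(lib, "wbemuuid.lib")

int main() {
    CoInitializeEx(0, COINIT_MULTITHREADED);
    CoInitializeSecurity(NULL, -1, NULL, NULL,
                         RPC_C_AUTHN_LEVEL_DEFAULT, RPC_C_IMP_LEVEL_IMPERSONATE,
                         NULL, EOAC_NONE, NULL);

    IWbemLocator* locator = NULL;
    CoCreateInstance(CLSID_WbemLocator, 0, CLSCTX_INPROC_SERVER,
                     IID_IWbemLocator, (LPVOID*)&locator);

    IWbemServices* services = NULL;
    locator->ConnectServer(_bstr_t(L"ROOT\\CIMV2"),
                           NULL, NULL, 0, NULL, 0, 0, &services);
    CoSetProxyBlanket(services, RPC_C_AUTHN_WINNT, RPC_C_AUTHZ_NONE, NULL,
                      RPC_C_AUTHN_LEVEL_CALL, RPC_C_IMP_LEVEL_IMPERSONATE,
                      NULL, EOAC_NONE);

    IEnumWbemClassObject* results = NULL;
    services->ExecQuery(bstr_t("WQL"),
                        bstr_t("SELECT Caption FROM Win32_OperatingSystem"),
                        WBEM_FLAG_FORWARD_ONLY | WBEM_FLAG_RETURN_IMMEDIATELY,
                        NULL, &results);

    IWbemClassObject* obj = NULL;
    ULONG count = 0;
    while (results && results->Next(WBEM_INFINITE, 1, &obj, &count) == S_OK) {
        VARIANT caption;
        obj->Get(L"Caption", 0, &caption, 0, 0);   // e.g. "Microsoft Windows ..."
        std::wcout << caption.bstrVal << std::endl;
        VariantClear(&caption);
        obj->Release();
    }

    if (results)  results->Release();
    if (services) services->Release();
    if (locator)  locator->Release();
    CoUninitialize();
    return 0;
}
```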
So, how could you implement the last item (User Related Information)? As for me, I'd do something like the following (some of these may seem very hard to implement, or pointless); a small sketch of the first idea appears below:
- create a kind of base Counter class and derive some controls (buttons, edits, etc.) from it.
- use Windows hooks for the mouse or keyboard, retrieving the child window handle (to identify the control, for example).
- use a Callback class, which can do all the "dirty" work (counting, measuring, performing additional actions).
You could store all this information in a text file, an SQLite database, or wherever you prefer.
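As a sketch of the Counter idea (hypothetical class names, no real GUI toolkit wired in):

```cpp
// Sketch of the "base Counter" idea: controls inherit usage counting.
// Class and control names here are hypothetical; wiring this into a
// real GUI toolkit is left to the reader.
#include <iostream>
#include <map>
#include <string>

class Counter {
public:
    explicit Counter(std::string name) : name_(std::move(name)) {}
    void recordUse() { ++counts()[name_]; }
    static void dump() {
        for (const auto& kv : counts())
            std::cout << kv.first << ": " << kv.second << " uses\n";
    }
private:
    static std::map<std::string, int>& counts() {
        static std::map<std::string, int> c;   // shared usage table
        return c;
    }
    std::string name_;
};

// A control derives from Counter and records each interaction.
class CountedButton : public Counter {
public:
    explicit CountedButton(std::string name) : Counter(std::move(name)) {}
    void click() {
        recordUse();           // statistics hook
        // ... normal button behaviour here ...
    }
};

int main() {
    CountedButton ok("OkButton"), cancel("CancelButton");
    ok.click(); ok.click(); cancel.click();
    Counter::dump();           // OkButton: 2 uses, CancelButton: 1 use
}
```

At shutdown (or periodically) you would serialise the counts table to your text file or SQLite database instead of dumping it to stdout.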
I would recommend taking a look at DeskMetrics. This StackOverflow post summarizes the issue.
Building your own framework could take you months of development (apart from maintenance). With something like Trackerbird Software Analytics you can integrate a DLL with your app, start tracking within 30 minutes, and get all the real-time visualizations.
Disclaimer: I am affiliated with this company.

Audio Subtitle Transcription - C++

I'm on a project that, among other video-related tasks, should eventually be capable of extracting the audio of a video and applying some kind of speech recognition to it to get a transcribed text of what's said in the video. Ideally it should output some kind of subtitle format so that the text is linked to a certain point in the video.
I was thinking of using the Microsoft Speech API (aka SAPI), but from what I could see it is rather difficult to use. The very few examples I found for speech recognition (most are for text-to-speech, which is much easier) didn't perform very well (they don't recognize a thing). For example, this one: http://msdn.microsoft.com/en-us/library/ms717071%28v=vs.85%29.aspx
Some examples use so-called grammar files that are supposed to define the words the recognizer is waiting for, but since I haven't trained Windows Speech Recognition thoroughly, I think that might be distorting the results.
So my question is... what's the best tool for something like this? Could you list both paid and free options? The best "free" option (as it comes with Windows) is, I believe, SAPI; all the rest would be paid, but if they are really good it might be worth it. Also, if you have any good tutorials for using SAPI (or another API) in a context similar to this, that would be great.
On the whole this is a big ask!
The issue with any speech-recognition system is that it functions best after training. It needs context (what words to expect) and some kind of audio benchmark (what each voice sounds like). This might be possible in some cases, such as a TV series, if you wanted to churn through hours of speech (separated for each character) to train it. There's a lot of work there, though. For something like a film there's probably no hope of training a recogniser unless you can get hold of the actors.
Most film and TV production companies just hire media companies to transcribe the subtitles based on either direct transcription using a human operator, or converting the script. The fact that they still need humans in the loop for these huge operations suggests that automated systems just aren't up to it yet.
In video you have a plethora of things that make your life difficult, pretty much spanning huge swathes of current speech-technology research:
-> Multiple speakers -> "Speaker Identification" (can you tell characters apart? Also, subtitles normally have different coloured text for different speakers)
-> Multiple simultaneous speakers -> The "cocktail party problem" - can you separate the two voice components and transcribe both?
-> Background noise -> Can you pick the speech out from the soundtrack/foley/exploding helicopters?
The speech algorithm will need to be extremely robust, as different characters can have different genders/accents/emotions. From what I understand of the current state of recognition, you might be able to get a single speaker after some training, but asking a single program to nail all of them might be tough!
--
There is no "subtitle" format that I'm aware of. I would suggest saving an image of the text using a font like Tiresias Screenfont that's specifically designed for legibility in these circumstances, and use a lookup table to cross-reference images against video timecode (remembering NTSC/PAL/Cinema use different timing formats).
--
There are a bunch of proprietary speech-recognition systems out there. If you want the best, you'll probably want to license a solution from one of the big boys like Nuance. If you want to keep things free, RWTH and CMU have put some solutions together. I have no idea how good they are or how well suited they might be to the problem.
--
The only solution I can think of that is similar to what you're aiming at is the subtitling you can get on news channels here in the UK: "live closed captioning". Since it's live, I assume they use some kind of speech-recognition system trained to the reader (although it might not be trained; I'm not sure). It's got better over the past few years, but on the whole it's still pretty poor. The biggest thing it seems to struggle with is speed: dialogue is normally really fast, so live subtitling has the extra problem of getting everything done in time. Live closed captions quite frequently get left behind and have to skip a lot of content to catch up.
Whether you have to deal with this depends on whether you'll be subtitling "live" video or whether you can pre-process it. To deal with all the additional complications above, I assume you'll need to pre-process it.
--
As much as I hate citing the big W, there's a goldmine of useful links here!
Good luck :)
This falls into the category of dictation, which is a very-large-vocabulary task. Products like Dragon NaturallySpeaking are amazingly good, and that has a SAPI interface for developers. But it's not such a simple problem.
Normally a dictation product is meant to be single speaker and the best products adapt automatically to that speaker, thereby improving the underlying acoustic model. They also have sophisticated language modeling which serves to constrain the problem at any given moment by limiting what is known as the perplexity of the vocabulary. That's a fancy way of saying the system is figuring out what you're talking about and therefore what types of words and phrases are likely or not likely to come next.
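For reference, perplexity has a standard definition: it is roughly the average number of word choices the recognizer faces at each step. For a test sequence of N words:

```latex
% Perplexity of a language model P over a test sequence w_1 ... w_N.
% Lower perplexity = fewer effective word choices at each step,
% which makes recognition easier.
PP(w_1 \dots w_N)
  = P(w_1, \dots, w_N)^{-1/N}
  = \exp\!\Big(-\tfrac{1}{N} \sum_{i=1}^{N} \ln P(w_i \mid w_1, \dots, w_{i-1})\Big)
```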
It would be interesting, though, to apply a really good dictation system to your recordings and see how well it does. My suggestion for a paid system would be to get Dragon NaturallySpeaking from Nuance along with the developer API. I believe that provides a SAPI interface, which has the benefit of allowing you to swap in the Microsoft speech engine or any other ASR engine that supports SAPI. IBM would be another vendor to look at, but I don't think you will do much better than Dragon.
But it won't work well! After all the work of integrating the ASR engine, what you will probably find is that you get a pretty high error rate (maybe half), due to a few major challenges in this task:
1) multiple speakers, which will degrade the acoustic model and adaptation.
2) background music and sound effects.
3) mixed speech - people talking over each other.
4) lack of a good language model for the task.
For 1), if you had a way of separating each actor onto a separate track, that would be ideal. But there's no reliable way of separating speakers automatically that would be good enough for a speech recognizer. If each speaker were at a distinctly different pitch, you could try pitch detection (there's some free software out there for that) and separate based on that, but this is a sophisticated and error-prone task. The best thing would be hand-editing the speakers apart, but then you might as well just manually transcribe the speech! If you could get the actors on separate tracks, you would need to run the ASR with different user profiles.
For music (2), you'd either have to hope for the best or try to filter it out. Speech is more band-limited than music, so you could try a band-pass filter that attenuates everything except the voice band. You would want to experiment with the cutoffs, but I would guess 100 Hz to 2-3 kHz would keep the speech intelligible; a crude sketch follows.
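One crude way to try that in code is to chain a one-pole high-pass with a one-pole low-pass (real work would use steeper biquad or FIR filters; the cutoffs are just the guesses from above):

```cpp
// Sketch: crude band-pass (~100 Hz to ~3 kHz) built from a one-pole
// high-pass followed by a one-pole low-pass. Illustrative only: serious
// filtering for this job would use steeper (biquad/FIR) filters.
#include <vector>

std::vector<float> bandpassVoice(const std::vector<float>& x, float sampleRate) {
    const float pi = 3.14159265f;
    const float dt = 1.0f / sampleRate;

    // High-pass at ~100 Hz: removes rumble and low-frequency music energy.
    float rcHi = 1.0f / (2.0f * pi * 100.0f);
    float aHi  = rcHi / (rcHi + dt);
    // Low-pass at ~3 kHz: removes content above the main voice band.
    float rcLo = 1.0f / (2.0f * pi * 3000.0f);
    float aLo  = dt / (rcLo + dt);

    std::vector<float> y(x.size());
    float hp = 0.0f, prevX = 0.0f, lp = 0.0f;
    for (std::size_t n = 0; n < x.size(); ++n) {
        hp = aHi * (hp + x[n] - prevX);   // one-pole high-pass
        prevX = x[n];
        lp += aLo * (hp - lp);            // one-pole low-pass
        y[n] = lp;
    }
    return y;
}
```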
For (3), there's no solution. The ASR engine should return confidence scores, so at best I would say: if you can tag low-scoring segments, you can then go back and manually transcribe those bits of speech.
(4) is a sophisticated task for a speech scientist. Your best bet would be to search for an existing language model made for the topic of the movie. Talk to Nuance or IBM, actually. Maybe they could point you in the right direction.
Hope this helps.

Geocoding things from the database?

I am kind of new to geocoding. What I want to do is pull a bunch of place names from the DB and display them as markers on the page, and then allow people to choose different options, which would force another DB query and place a number of new markers on the page.
Is that possible? It seems like relatively simple functionality, but since I am not good with JSON, it is giving me a hard time.
Thanks,
Alex
There are lots of ways to geocode and really you need to give more information!
For example, in an offline environment, MapPoint is a pretty good solution (costs about $200-300 per license). It can be made to work on a web server but usually isn't worth the effort.
For a web server, I would look at a web service. These are usually free for limited use, with payment required for heavier (or commercial) use. Your question is too broad to give specifics, but look at the web services provided by Bing Maps, Google Maps, Yahoo (yes, they're still around), and OpenStreetMap-based offerings. Bing Maps and Google Maps look like they'll be around for a long time, but might cost, depending on your application. OpenStreetMap promises to have the widest coverage (including countries outside North America and Europe), but probably doesn't match the others' coverage yet.
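Whichever provider you pick, the client code has the same shape: HTTP GET with the address, then parse the lat/lng out of the JSON response. Here's a libcurl sketch, with a placeholder endpoint standing in for your provider's real API (check its documentation and terms of use):

```cpp
// Sketch: call a geocoding web service over HTTP with libcurl.
// The endpoint URL below is a placeholder; substitute your provider's
// real URL and API key, and parse the JSON with a proper JSON library.
#include <curl/curl.h>
#include <iostream>
#include <string>

static size_t collect(char* data, size_t size, size_t nmemb, void* out) {
    static_cast<std::string*>(out)->append(data, size * nmemb);
    return size * nmemb;
}

int main() {
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL* curl = curl_easy_init();
    if (!curl) return 1;

    std::string response;
    // Placeholder endpoint: replace with Bing/Google/OSM Nominatim etc.
    curl_easy_setopt(curl, CURLOPT_URL,
        "https://geocoder.example.com/geocode?q=1600+Amphitheatre+Pkwy");
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, collect);
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, &response);

    CURLcode rc = curl_easy_perform(curl);
    if (rc == CURLE_OK)
        std::cout << response << "\n";   // JSON with lat/lng to parse
    else
        std::cerr << curl_easy_strerror(rc) << "\n";

    curl_easy_cleanup(curl);
    curl_global_cleanup();
    return 0;
}
```

The JSON you return to your own page for the markers can be equally simple: an array of objects with name, lat, and lng fields that your map script iterates over.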
After figuring this out, I made a tool that takes an address and converts it to lat/lng. It also has a code tutorial: http://comehike.com/utils/address_to_geolocation.php
I hope it helps people. Would be fun to get feedback too :)

What Features Should Tomorrow's Wiki Include? [closed]

Closed. This question is off-topic. It is not currently accepting answers. Closed 11 years ago.
What features should "Tomorrow's" wikis include? How might they incorporate Web 2.0 features like AJAX? What other features are they currently missing? What do you want to see from the next release of your favorite Wiki?
Edit: How might a Wiki be integrated into other products? What "neat uses" could wikis have?
Preview-as-you-type works very nicely indeed here on Stack Overflow. Many wikis don't do that.
Make it really easy to link between pages, e.g. as you type, the wiki finds likely pages you may be referring to. That way you can make links without having to know the exact title of a target page, bounce on the shift key to WriteInCamelCase, or throw in square brackets. Make it very easy to link to other websites outside the wiki, too (and by "easy" I do not mean like wikisisters, which, if I remember correctly, is like foowiki:ALinkLikeThis).
Similarly, if you can generate links within text automatically, you could, for example, have a mail system that wikifies your email. You create a wiki page for, say, Joel Spolsky, and references to Joel in emails in your inbox become links to that page, which you can find by clicking "what links here". (This probably needs something along the lines of Bayesian filtering to prune the stray references to other Joels... your Bayesian classifier learns that if the context is being smart and getting things done, it's Spolsky; if it's flying Viking kittens, it's more likely Joel Veitch.)
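The Bayesian bit could start as something as simple as word counts. A toy disambiguator, with made-up training data, add-one smoothing, and no real tokenization, might look like:

```cpp
// Toy naive-Bayes-style disambiguator: which "Joel" does this email mean?
// Hypothetical example; a real implementation would need proper
// tokenization, priors, and counts from your actual mail corpus.
#include <cmath>
#include <iostream>
#include <map>
#include <sstream>
#include <string>

using Counts = std::map<std::string, int>;

double logScore(const std::string& text, const Counts& wordCounts, int total) {
    double score = 0.0;
    std::istringstream in(text);
    std::string w;
    while (in >> w) {
        auto it = wordCounts.find(w);
        int c = (it == wordCounts.end()) ? 0 : it->second;
        score += std::log((c + 1.0) / (total + 1.0));  // crude add-one smoothing
    }
    return score;
}

int main() {
    // "Training" word counts per candidate (entirely made up).
    Counts spolsky{{"software", 5}, {"smart", 3}, {"done", 3}};
    Counts veitch {{"kittens", 4}, {"viking", 3}, {"flying", 2}};

    std::string email = "flying viking kittens";
    double s1 = logScore(email, spolsky, 11);
    double s2 = logScore(email, veitch, 9);
    std::cout << (s1 > s2 ? "Joel Spolsky" : "Joel Veitch") << "\n";
}
```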
A variety of RSS feeds for tracking changes would be nice, too (diffs, full text, changes on pages I've edited, ...).
Wikipedia has grown a fairly colossal categorisation system ("Fictional Cats", anyone?); laying a taxonomy over a wiki's flat namespace could provide another way for users to find their way around. Wikipedia's doing this a little, but in fairly limited ways so far: there are links to the relevant category lists, but you can't, for example, look for a composer called "Smith".
Similarly, wikis give you this big graph of interconnected nodes, showing how closely your community sees the relevant concepts as being related. Is that interesting? Is that useful? Does anyone who isn't Google want to think about this stuff?
PS. If you believe Paul Graham's definition of Web 2.0 as "Democracy, Don't Maltreat Users, and Javascript works now", wikis are two thirds Web 2.0 already.
I am personally already tired of wikis. Wiki as software is outdated; now it's about wiki as a feature (like my favorite new website, Stack Overflow).
The main advantage of community wiki — more editing — came into existence when we introduced "Suggested Edits".
With "Suggested Edits", anyone, even an anonymous user, can edit anything — so long as another experienced user reviews and approves their edit.
I'm in the process of choosing a wiki tool, and have looked at numerous packages over the past week. I'm sure there are dozens I haven't even heard of yet, probably good ones. But in general, here's my "beginner's mind" take on the problem.
Wiki markup should be abandoned. A wiki that is limited to wiki markup will only be useful to 'nix hackers and others who get excited about doing things the hard way and insisting that everybody else is stupid. I mean, Morse code is fine with me personally; I don't get what was wrong with a nice, clean dash-dot-dash. Or smoke signals; they were nice, except for the carbon footprint. But times change, and we have to change with them.
Real users (business users, customers, clients) want rich-text editing. Period. And when a wiki tries to support both rich text and wiki markup, the results are not pretty. The model is confusing and (apparently) difficult to implement. The FCKeditor extension at wikiwiki is a nightmare, for example. It's just not worth it.
Wikis need better access control. The idea that all content should be open to everyone is fine for an open, public, non-profit wiki like this one. But in the business world, that's not how it works. Restricting access is not evil, it's reality. Wiki tools need to do a much better job of providing access control: access to pages and groups of pages based on role or group membership, where groups can be formed by anyone on an ad hoc basis and users can belong to multiple groups and pages can be accessible to multiple groups, at the whim of the page's creator.
Those are the two things that I want above all else, and I haven't found them in open source, at least not out of the box. Which, of course, is why open source is open source.
There's been some interesting work on using wikis for testing and software development, e.g. movement towards literate programming: allowing pages to exist as both code and documentation that is compiled down into one or the other (or, I suppose, both simultaneously).
They have a regular session about this at the annual WikiSym conference.
I think one direction for wikis is moving from open-ended collections of documents to "everyone can edit, but with more structure" applications like SO.
Another direction that I've seen is more direct integration with other project support tools, so project planning, issue management, and all that stuff.
Personally, I think the next big direction is going to be some sort of multimedia-based wiki, not just a wiki where multimedia can be embedded in the text.
I really like MediaWiki. It's widely used and free/Free. The markup syntax is straightforward and allows you to do enough basic styling that you don't need to use custom HTML or to use a WYSIWYG. I assume by "sexy web 2.0" you mean Flash/AJAX, but I like MediaWiki because it works cleanly with basic HTML/Javascript (you don't have to wait for custom widgets to load, etc...).
What makes wikis reach their potential of usefulness is the community that develops around them more than the software itself. You need to find a niche where people are both passionate about (but not criminally insane about) the central topic and have enough technical prowess to log on to a website and edit some text.
"Wiki" is ultimately just a pattern:
Open editing by all/most visitors
Integrated revision tracking and rollback to reduce the cost of mistakes
Simple syntax for cross-linking between articles, and auto-creation of stub articles when referenced (see the sketch after this list)
That's not a perfect description, but it's a combination that isn't particularly magic. Successful wikis combine those things with a critical mass of people creating and maintaining content.
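The cross-linking point is the easiest to demonstrate. A toy CamelCase linkifier captures the classic rule; the regex and URL scheme here are illustrative only, and a missing target page would get a stub on first visit:

```cpp
// Sketch of the classic wiki cross-linking rule: turn CamelCase words
// into links. Pages that don't exist yet become stubs when first visited.
#include <iostream>
#include <regex>
#include <string>

std::string wikify(const std::string& text) {
    // Two or more capitalised runs glued together, e.g. WikiWord.
    static const std::regex camel(R"(\b([A-Z][a-z]+){2,}\b)");
    // Replace each match with a link to a page of the same name
    // ("/wiki/" is an illustrative URL scheme, not a real one).
    return std::regex_replace(text, camel, "<a href=\"/wiki/$&\">$&</a>");
}

int main() {
    std::cout << wikify("See FrontPage and CodingStyle for details.") << "\n";
    // -> See <a href="/wiki/FrontPage">FrontPage</a> and
    //    <a href="/wiki/CodingStyle">CodingStyle</a> for details.
}
```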
The next step, IMO, is less about web 2.0 shininess and more about the integration of better structural information. Adding any metadata beyond "this points to that" is an exercise in brute force hand-markup. Maybe microformats? Maybe the development of more structured knowledgebase software that uses wiki-ish editing UI but a smarter backend? I'm not sure, but I think better handling of the structured data is really the next wave.
Extensibility.
Check out DekiWiki, they are doing an excellent job with this.
DekiWiki extensions
The wiki of the future will be completely editable online, concurrently by everyone. Check out EtherPad for a demo of the technology.
For me, in terms of enterprise-style uses for a wiki, I have a couple of thoughts:
An effective way to keep a central, web-based wiki synchronised with multiple offline, desktop-style wikis, for people on the go
A move towards wiki as a function, as opposed to wiki as a system, so we can integrate the wiki collaborative system into other things

What are some examples of how your company uses a wiki for development?

Do you use a wiki in your company? Who uses it, and what for? Do you share information between projects / teams / departments or not?
We use ours to store
Coding Style docs
Setup and Deployment procedures for web servers and sites
Network diagrams (what are all the servers in Dev, Staging, QA and Production called etc.)
Project docs (PDFs, Visio files, Excel sheets, Word docs, etc.) are stored in SVN. For the non-techies we have links to those docs in the wiki that point to an up-to-date share on my box. (Tip: some wikis provide source-control integration, but ours doesn't.)
Installation and Setup procedures for development tools
How-tos on things like using our bug-tracking system and our unit-testing philosophy
When doing research on a topic I often capture the important information in a wiki page for others to learn from
I've seen them used to keep seating charts in medium to large size organizations for the new people
At my previous company all of the emergency contacts and procedures for handling a critical outage were available on the front page of the wiki
The best part about a wiki is that it's searchable. Some wikis support searching inside uploaded or linked docs as well.
If you set up a wiki and encourage or even require people to use it, the amount of information that accumulates can be amazing. It's definitely worth the effort, especially if you have someone in IT with some spare time on their hands to set it up.
Do you use a wiki in your company?
= We use it as a knowledge base. Basically it is a wiki, but with many more functionalities integrated.
Who uses it, and what for?
= Employees. Knowledge sharing, preparation of collaborative documents, etc.
Do you share information between projects / teams / departments or not?
= Depends on the requirements. It is possible to set permissions between users.
We use a wiki for documenting our systems. It's updated gradually as things change and evolve. It should go without saying that there's benefit in that; however, whether you use a wiki or other methods is worth thinking about.
A wiki is great for collaborative editing. In theory the information shouldn't go stale, because as people use the systems they have the opportunity to keep it up to date.
However, we have found in our organisation that people struggle a little with wiki markup, especially tables. I think a solution with WYSIWYG editing would be better if you have non-highly-technical people editing it. SharePoint springs to mind, but it's expensive.
I use a wiki as my virtual "story wall" for agile development. All of my stories are written and organized in the wiki. While my customers are reasonably local (we can have face-to-face meetings), they aren't co-located. To enable better customer interaction I've resorted to a wiki instead of a wall-based story tracking mechanism. It also works a little better for me due to the fact that I often have multiple, concurrent projects and limited wall space in my cube. In a larger team with more focused projects and more wall area, I'm not sure I'd make the same choice.
My company uses a wiki for project planning, but also for storing documentation and ideas.
I have found that a wiki is a great way to link the programmers in the company with the business people.
When someone who is not on the programming team comes up with an idea or finds a bug, it's a lot simpler to let that person document it in the wiki.
I think it's important for a small company like mine to easily synchronize the business team with the development team. A wiki helps with that, since it gives people the feeling of being part of the development process, instead of having to ask a programmer directly about every little detail.
We have MediaWiki to store technical information that is not ready to be published in other formats: specification drafts, diagrams (via the GraphViz extension), results of short investigations, etc.
I also think this question is a wiki too :)