Cross Validation in libsvm - c++

I'm using the libsvm library in my project and have recently discovered that it provides out-of-the-box cross validation.
I'm checking the documentation and it says clearly that I have to call svm-train with the -v switch to use the CV feature.
When I call it with the -v switch, I cannot get the model file that svm-predict needs.
Implementing a Support Vector Machine from scratch is beyond the scope of my project, so I'd rather fix this one if it is broken or ask the community for support.
Can anybody help with that?
Here's the link to the library, implemented in C and C++, and here is the paper that describes how to use it.

Because libsvm uses cross validation only for parameter selection.
From libsvm FAQ:
Q: After doing cross validation, why there is no model file outputted ?
Cross validation is used for selecting good parameters. After finding them, you want to re-train the whole data without the -v option.
If you want to use cross validation to estimate the quality of a classifier on your data, you should implement external cross validation: split the data, train on one part, and test on the other.
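The external cross validation described above could be sketched roughly as follows. This is a minimal illustration, not libsvm's own procedure: `train_fn` and `predict_fn` are hypothetical callbacks that you would wrap around your libsvm training and prediction calls, and the sketch assumes `k` is no larger than the number of samples.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at i,
    # so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def cross_validate(samples, labels, k, train_fn, predict_fn):
    """Return the mean accuracy over k folds.

    train_fn(samples, labels) -> model and predict_fn(model, sample) -> label
    are hypothetical callbacks standing in for the real training/prediction.
    """
    accuracies = []
    for train_idx, test_idx in k_fold_indices(len(samples), k):
        model = train_fn([samples[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        correct = sum(predict_fn(model, samples[i]) == labels[i]
                      for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / k
```

Once the averaged accuracy looks acceptable, the final model would be trained once more on the whole data set.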

It's been a while since I used libsvm, so I don't think I have the answer you're looking for. But if you run the cross validation and are satisfied with the results, running svm-train with the same parameters without -v will produce the model file.

Related

How to use the `cc_common.create_link_variables` API?

I'm trying to write some complex Starlark rules that link and build multiple dynamic libraries on Linux using the (relatively) new cc_common APIs.
There seem to be two different ways to create compile/link actions using this API:
1. Using the compile()/link() methods, which are relatively "high-level", and
2. Using create_compile_variables()/create_link_variables() along with get_memory_inefficient_command_line(), and then calling actions.run() directly with the generated command line.
In particular, I'm trying to get #2 to work. My question is, how can I create the param_file to pass into create_link_variables? There doesn't seem to be any Starlark API for this.
https://docs.bazel.build/versions/1.1.0/skylark/lib/cc_common.html#create_link_variables
agoessling I have shared a couple of source files for you here
It should give you a pretty good idea of how this lower level cc_common API can be used end to end.
There are still known holes in this API, i.e. not everything possible with the built-in cc rules is also possible through cc_common, but I would say 90% is available.
I am not associated with the Bazel team and the code is the result of my own digging and sniffing. No warranties, but it works for me. Let me know if you get stuck on something - I will try to help.
If you get an idea of how to do some of it better (prettier, more compatible with the built-in rules, more platform-independent, etc.) I am all ears. Good luck!

Is tf.py_func allowed at online prediction time?

Is tf.py_func allowed at online prediction time?
If yes, are there any examples of how to use it?
Does the answer change if I need to install additional pip packages?
My use case: I work with text and need to do word stemming (using the Porter stemmer). I know how to do it in Python, but TensorFlow doesn't have ops for that. I would like to use the same text processing at training and prediction time, so I would like to encode it all into a TensorFlow graph.
https://www.tensorflow.org/api_docs/python/tf/py_func comes with known limitations and I would like to know if it will work during training and online prediction before I invest more time into it.
Thanks
Unfortunately, no. A py_func cannot be restored from a saved model. However, since your use case involves pre-processing, just invoke the py_func explicitly in all three (train, eval, serving) input functions. This won't work if the py_func is in the middle of your graph, but for stemming, it should work just fine.
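The structure of that advice, one preprocessing function called from all three input functions, might look like the sketch below. TensorFlow is deliberately left out so the example stays self-contained: `stem` is a toy stand-in for a real Porter stemmer, and the three input functions are hypothetical plain-Python placeholders for the real ones.

```python
def stem(word):
    """Toy stand-in for a Porter stemmer: strips a few common suffixes."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """The single preprocessing step shared by every input function."""
    return [stem(w) for w in text.lower().split()]


def train_input_fn(rows):
    # rows: iterable of (text, label) pairs
    return [(preprocess(text), label) for text, label in rows]


def eval_input_fn(rows):
    # Same preprocessing as training, applied to the evaluation set.
    return [(preprocess(text), label) for text, label in rows]


def serving_input_fn(texts):
    # No labels at prediction time; the text goes through the same function.
    return [preprocess(text) for text in texts]
```

Because every path calls the same `preprocess`, the features seen at prediction time match the ones used in training.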

Creating custom voice commands (GNU/Linux)

I'm looking for advice for a personal project.
I'm attempting to create software for defining customized voice commands. The goal is to let the user (me) record some audio data (2 to 3 seconds) to define commands/macros. Then, when the user speaks (records the same audio data), the command/macro is executed.
The software must be able to detect a command in less than 1 second of processing time on a low-cost computer (a Raspberry Pi, for example).
I have already looked in two directions:
- Speech recognition (CMU-Sphinx, Julius, simon): There are good open-source solutions, but they often need large database files, and speech recognition is not really what I'm attempting to do. It could also consume too much power for such a small feature.
- Audio fingerprinting (Chromaprint -> http://acoustid.org/chromaprint): It seems to be almost what I'm looking for. The principle is to create a fingerprint from raw audio data, then compare fingerprints to determine whether they match. However, this kind of software/library seems to be designed for song identification (like the well-known smartphone apps). I'm trying to configure a good "comparator", but I think I'm going about it the wrong way.
Do you know of any dedicated software or piece of code that does something similar?
Any suggestion would be appreciated.
I had a more or less similar project in which I intended to send voice commands to a robot. Speech recognition software is too complicated for such a task. I used an FFT implementation in C++ to extract the Fourier components of the sampled voice, and then I created a histogram of major frequencies (the frequencies at which the target voice command has the highest amplitudes). I tried two approaches:
1. Comparing the histogram of the given voice command with those saved in memory to identify the most probable command.
2. Using a Support Vector Machine (SVM) to train a classifier to distinguish voice commands. I used LibSVM, and the results are considerably better than with the first approach. However, one problem with the SVM method is that you need a rather large data set for training. Another problem is that, when an unknown voice is given, the classifier will output a command anyway (which is obviously a wrong detection). This can be avoided with the first approach, where I had a threshold on the similarity measure.
I hope this helps you to implement your own voice activated software.
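The first approach above, comparing frequency histograms against stored templates with a similarity threshold, could be sketched as follows. This is only an illustration under assumptions: a naive DFT replaces the FFT library, cosine similarity is one possible similarity measure, and the 0.8 threshold is an arbitrary placeholder that would need tuning on real recordings.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Naive O(n^2) DFT magnitudes for bins 0..n/2 (an FFT would replace this)."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]


def cosine_similarity(a, b):
    """Cosine of the angle between two equally long magnitude vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_command(samples, templates, threshold=0.8):
    """Return the best-matching command name, or None if nothing reaches the threshold."""
    spectrum = magnitude_spectrum(samples)
    best_name, best_score = None, threshold
    for name, template in templates.items():
        score = cosine_similarity(spectrum, template)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name
```

Returning `None` below the threshold is what lets this approach reject unknown sounds instead of always reporting some command.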
Song fingerprinting is not a good idea for that task because command timings can vary and fingerprinting expects an exact time match. However, it's very easy to implement matching with the DTW algorithm for time series, using features extracted with the CMUSphinx library Sphinxbase. See the Wikipedia entry about DTW for details.
http://en.wikipedia.org/wiki/Dynamic_time_warping
http://cmusphinx.sourceforge.net/wiki/download
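For reference, a minimal sketch of the textbook DTW recurrence is shown below. It assumes each frame is a single number and uses the absolute difference as the local cost; with real feature vectors (for example, ones extracted with Sphinxbase) the local cost would be a vector distance instead.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: step in a only, step in b only, step in both.
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]
```

A recorded command would then be matched to the stored command with the smallest distance, ideally with a rejection threshold for unknown input.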

Creating QR Codes with ColdFusion

Is there any way with pure ColdFusion/cfscript to produce a QR code, without relying on external APIs or JavaScript?
No. ColdFusion cannot generate bar codes by itself. You need a separate tool or library. It is easy enough to install a Java library, like ZXing, and then generate the images from CF. Alternatively, you could do a <cfhttp> call to an external server that generates the bar code image for you, or basically do the same thing with JavaScript. You would not need to install anything for the latter two options, but they still rely on an external resource.
Bottom line: you need something more than just ColdFusion. What is the reason you cannot use either an external API or JavaScript? Because without either of those, you are probably out of luck.
Edit based on comments:
If the only restriction is that the images must be generated locally, then you can use ZXing as described in the link above -OR- any of the other components/libraries mentioned in the other responses, like Joe's suggestion which uses iText (though also based on ZXing).
Some other external APIs
http://cfbarbecue.riaforge.org/
http://zanstra.com/my/Barcode.html?barcode=3PTSP8827A231
If you really wanted to, you could look up (perhaps you need to buy?) the encoding standard for QR codes, which I believe is an ISO standard. Then you could write a program which would output a table with the appropriate number of rows and columns, each with either a black or a white background. I wouldn't recommend this form of "rolling your own" though; it's a lot of work to do essentially what's been done before.
Tim Cunningham wrote a library, hosted on GitHub, that uses iText to do just this. https://github.com/boltz/QRToad

c++: program settings - boost.PropertyTree or boost.program_options?

I was looking for a solution to store program settings or options or configuration in C++. These could be settings that are exposed in a GUI and need to be saved between runs of my code.
In my search I came across boost.PropertyTree, which seemed to be a good choice. I know Boost is well-respected code, so I'm comfortable using it, and I started developing with it. Then I came across boost.program_options, which seems to allow you to do the same thing but also looks more specialized for the specific use case of program settings.
Now I'm wondering which is the most appropriate for the job? (Or is there a third option that is better than both?)
EDIT:
fyi this is for a plugin so it will not use command line options (as in, it's not even possible).
UPDATE
I ended up sticking with boost.PropertyTree. I needed to be able to save changed options back to the INI, and I didn't see a way of doing that with boost.program_options.
Use boost::program_options. It's exactly what it's for. In one library you get command line options, environment variable options and an INI-like configuration file parser. And they're all integrated together in the Right way, so when the user specifies the same option in more than one of these sources the library knows the Right priority order to consider.
boost::property_tree, on the other hand, is a more generalized library. The library parses the text stream into a uniform data model. But you need to do the real parsing, that of making sense of the blob of data for your needs. The library doesn't know when to expect a parameter when it sees a particular option string, or to disallow specific values or types of values for a particular option.
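The idea of a priority order across option sources can be illustrated with a short sketch. Python's `collections.ChainMap` stands in for the library here, and the precedence shown (command line over environment over configuration file) is an assumption chosen for illustration, not a statement of what Boost does by default.

```python
from collections import ChainMap


def resolve_options(command_line, environment, config_file):
    """Merge option sources; earlier arguments take priority over later ones."""
    return ChainMap(command_line, environment, config_file)


# Hypothetical option values from three sources.
options = resolve_options(
    {"verbose": "1"},                       # command line
    {"verbose": "0", "color": "auto"},      # environment variables
    {"color": "never", "width": "80"},      # configuration file
)
```

Looking up `options["color"]` here falls through the empty command-line entry and picks the environment value before the configuration file is consulted.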
After some digging around I think boost.PropertyTree is still the best solution because it gives me the capability to save the options after changing them from within the program which is a requirement.
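The load, modify, save round trip that this requirement describes can be sketched as below. Python's standard `configparser` is swapped in for Boost.PropertyTree purely for illustration (the C++ counterpart would be Boost's `read_ini` and `write_ini`), and the section and key names in the usage note are hypothetical.

```python
import configparser
import io


def update_ini(ini_text, section, key, value):
    """Parse INI text, change one option, and return the re-serialized text."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    if not parser.has_section(section):
        parser.add_section(section)
    parser.set(section, key, str(value))
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()
```

For example, `update_ini("[ui]\ntheme = dark\n", "ui", "theme", "light")` returns INI text in which the `theme` option of section `ui` has been changed, ready to be written back to the settings file.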
There is a non-Boost possibility too. Config4Cpp is a robust, simple-to-use and comprehensively documented configuration-file parser library that I wrote. It is available at www.config4star.org.
I suggest you read Chapter 3 (Preferences for a GUI Application) of the Practical Usage Guide manual for an overview of how Config4Cpp can do what you want. Then open the Getting Started Guide manual, and skim-read Chapters 2 and 3, and Section 7.4 (you might prefer to read the PDF version of that manual). Doing that should give you sufficient details to help you decide if Config4Cpp suits your needs better or worse than Boost.
By the way, the indicated chapters and sections of documentation are short, so they shouldn't take long to read.