I am doing my dissertation on sentiment analysis on transcriptions of oral testimonies and have a few questions/clarifications regarding the programming behind Google Cloud's Natural Language API v1beta2.
I assume that it is a combination of lexicon-based methods and machine learning based methods of sentiment analysis but would appreciate confirmation of this.
What language model does Google NLP use? (I am guessing something involving deep learning and TensorFlow, but I am not sure.)
What source material was the language model trained on? And was anything like SentiWordNet or WordNet used?
Can the API detect both implicit and explicit sentiment?
Is the tool only capable of working in English or can it translate/trace sentiment in, for example, German or Polish?
I am open to any and all answers. Also, if anyone knows of any official Google document which lists this information that would also be appreciated. Thank you.
We're currently using a large CNN implemented in TF. The model in the Cloud NL API does not use a LM. The API can detect both implicit and explicit sentiment, but it does not differentiate between them. The Sentiment API supports English, Spanish, French, German, Portuguese, Korean, Japanese, and Chinese
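For readers who want to try the non-English support mentioned above, here is a minimal sketch of what a sentiment request and its result might look like over REST. The endpoint URL, the field names (document, encodingType, documentSentiment, score, magnitude) and the sample response are assumptions based on the public REST documentation as I understand it, not something confirmed in this thread; no network call is made.

```python
import json

# Assumed REST endpoint for the v1beta2 API discussed in the question.
ENDPOINT = "https://language.googleapis.com/v1beta2/documents:analyzeSentiment"


def build_sentiment_request(text, language=None):
    """Build the JSON body for a sentiment request (field names assumed)."""
    document = {"type": "PLAIN_TEXT", "content": text}
    if language:
        # ISO language code, e.g. "de"; if omitted the service is expected
        # to detect the language itself.
        document["language"] = language
    return {"document": document, "encodingType": "UTF8"}


def summarize_sentiment(response):
    """Return (score, magnitude) for the whole document, or (None, None)."""
    doc = response.get("documentSentiment", {})
    return doc.get("score"), doc.get("magnitude")


# Hypothetical response, made up for illustration only.
sample_response = {"documentSentiment": {"score": -0.6, "magnitude": 1.2}}

body = build_sentiment_request("Das war eine furchtbare Zeit.", language="de")
print(json.dumps(body, ensure_ascii=False))
print(summarize_sentiment(sample_response))
```

Note that the result is a single score per document or sentence, which is consistent with the answer above: implicit and explicit sentiment are not reported separately.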
Source: I'm the Product Manager for NLU Research @ Google/Alphabet.
Is there a way to disable auto-correction in the Google Cloud Speech-to-Text API? It is important for me to get an accurate transcript of the user's speech, including any errors they make, rather than a corrected version.
It is difficult to distinguish between mistakes made by the speaker (grammar or pronunciation errors) in the audio and mistakes made by the Speech API. However, you can inspect the alternative transcriptions that the model produces behind the scenes by setting the maxAlternatives property of the API.
You have not provided an example of your use case, but if you expect unusual pronunciations or acronyms, you can pass hints with the request using the phraseHint property.
Please provide further details if it doesn't answer your question.
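As a rough illustration of the two options above, the sketch below builds a recognition request that asks for several alternatives and passes phrase hints, then lists every alternative in a response. The field names (maxAlternatives, speechContexts, phrases, results, alternatives) follow the v1 REST API as I understand it, where hints sit under speechContexts rather than a property literally named phraseHint; the sample response is invented, and no request is actually sent.

```python
import base64


def build_recognize_request(audio_bytes, hints=(), max_alternatives=5):
    """Build a JSON body for a speech recognition call (field names assumed)."""
    config = {
        "encoding": "LINEAR16",        # assumption: 16-bit PCM audio
        "sampleRateHertz": 16000,      # assumption: 16 kHz recording
        "languageCode": "en-US",
        "maxAlternatives": max_alternatives,
    }
    if hints:
        # Words or phrases the recognizer should favour, e.g. acronyms.
        config["speechContexts"] = [{"phrases": list(hints)}]
    audio = {"content": base64.b64encode(audio_bytes).decode("ascii")}
    return {"config": config, "audio": audio}


def list_alternatives(response):
    """Return (transcript, confidence) pairs for every alternative."""
    pairs = []
    for result in response.get("results", []):
        for alt in result.get("alternatives", []):
            pairs.append((alt.get("transcript"), alt.get("confidence")))
    return pairs


# Hypothetical response, for illustration only.
sample_response = {
    "results": [{
        "alternatives": [
            {"transcript": "he go to school", "confidence": 0.81},
            {"transcript": "he goes to school"},
        ]
    }]
}

request_body = build_recognize_request(b"\x00\x01", hints=["NLU", "ASR"])
print(list_alternatives(sample_response))
```

Comparing the alternatives side by side is one way to spot cases where the top hypothesis has been "tidied up" relative to what the speaker actually said.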
How flexible or supportive is the Amazon Machine Learning platform for sentiment analysis and text analytics?
You can build a good machine learning model for sentiment analysis using Amazon ML.
Here is a link to a GitHub project that is doing just that: https://github.com/awslabs/machine-learning-samples/tree/master/social-media
Since Amazon ML supports supervised learning and accepts text as an input attribute, you need to get a sample of tagged data and build the model with it.
The tagging can be done with Mechanical Turk, as in the example above, or by interns ("the summer is coming"). The benefit of doing your own tagging is that you can put your own logic into the model. For example, the difference between "The beer was cold" and "The steak was cold", where one is positive and the other is negative, is something that a generic system will find hard to learn.
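To make the beer/steak point concrete, here is a toy sketch of a supervised classifier trained on four hand-tagged sentences. It is a small naive Bayes model written from scratch, not the algorithm Amazon ML uses, and the training sentences, stopword list and word-pair features are all assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations

STOPWORDS = {"the", "was", "a", "is"}


def features(text):
    """Content words plus unordered word pairs, so 'beer+cold' is a feature."""
    words = [w for w in text.lower().split() if w not in STOPWORDS]
    pairs = ["+".join(sorted(p)) for p in combinations(words, 2)]
    return words + pairs


def train(examples):
    """Count features per label from (text, label) pairs."""
    label_counts = Counter()
    feature_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for f in features(text):
            feature_counts[label][f] += 1
            vocab.add(f)
    return label_counts, feature_counts, vocab


def predict(model, text):
    """Pick the label with the highest Laplace-smoothed log-probability."""
    label_counts, feature_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, count in label_counts.items():
        score = math.log(count / total)
        denom = sum(feature_counts[label].values()) + len(vocab)
        for f in features(text):
            score += math.log((feature_counts[label][f] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hand-tagged, domain-specific examples (made up for this sketch).
tagged = [
    ("The beer was cold", "positive"),
    ("The steak was hot", "positive"),
    ("The steak was cold", "negative"),
    ("The beer was warm", "negative"),
]
model = train(tagged)
print(predict(model, "The beer was cold"))
print(predict(model, "The steak was cold"))
```

The word "cold" alone carries no signal here; only the pair features learned from your own labels separate the two sentences, which is the kind of logic a generic sentiment system does not have.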
You can also try to play with some sample data, from the project above or from this Kaggle competition for sentiment analysis on movie reviews: https://www.kaggle.com/c/sentiment-analysis-on-movie-reviews. I used Amazon ML on that data set and got fairly good results rather easily and quickly.
Note that you can also use Amazon ML to run real-time predictions with the model you build, and use them to respond immediately to negative (or positive) input. See more here: http://docs.aws.amazon.com/machine-learning/latest/dg/interpreting_predictions.html
It is great for starting out. Highly recommend you explore this as an option. However, realize the limitations:
you'll want to build a pipeline because models are immutable--you have to build a new model to incorporate new training data (or new hyperparameters, for that matter)
you are drastically limited in the tweakability of your system
it only does supervised learning
the target variable can't be other text, only a number, boolean or categorical value
you can't export the model and import it into another system if you want--the model is a black box
Benefits:
you don't have to run any infrastructure
it integrates with AWS data sources well
the UX is nice
the algorithms are chosen for you, so you can quickly test and see if it is a fit for your problem space.
Is/are there existing C++ NLP API(s) out there? The closest thing I have found is CLucene, a port of Lucene. However, it seems a bit obsolete and the documentation is far from complete.
Ideally, this/these API(s) would permit tokenization, stemming and PoS tagging.
Freeling is written in C++ too, although most people just use their binaries to run the tools: http://devel.cpl.upc.edu/freeling/downloads?order=time&desc=1
Try something like DyNet. It is a generic neural network framework, but much of its functionality is geared towards NLP because its maintainers come from the NLP community.
Or perhaps Marian-NMT. It was designed for sequence-to-sequence machine translation, but many NLP tasks can potentially be structured as sequence-to-sequence tasks.
Outdated
Maybe you can try Ellogon http://www.ellogon.org/ , they have GUI support and also C/C++ API for NLP too.
If you remove the restriction to C++, you get the perfect fit: NLTK (Python).
The remaining effort is then interfacing between Python and C++.
Apache Lucy would get you part of the way there. It is under active development.
Maybe you can use Weka-C++. It's the very popular Weka library for machine learning and data mining (including NLP) ported from Java to C++.
Weka supports tokenization and stemming, you'll probably need to train a classifier for PoS tagging.
I have only used Weka with Java though, so I'm afraid I can't give you more details on this version.
There is TurboParser by André Martins at CMU, which also has a Python wrapper. There is an online demo for it.
This project provides free (even for commercial use) state-of-the-art information extraction tools. The current release includes tools for performing named entity extraction and binary relation detection as well as tools for training custom extractors and relation detectors.
MITIE is built on top of dlib, a high-performance machine-learning library. It makes use of several state-of-the-art techniques, including distributional word embeddings and Structural Support Vector Machines. MITIE offers several pre-trained models providing varying levels of support for both English and Spanish, trained using a variety of linguistic resources (e.g., CoNLL 2003, ACE, Wikipedia, Freebase, and Gigaword). The core MITIE software is written in C++, but bindings for several other languages, including Python, R, Java, C, and MATLAB, allow users to quickly integrate MITIE into their own applications.
https://github.com/mit-nlp/MITIE
Are there any open-source concept mining tools available today? I have only come across tools like Leximancer, which, although it seems to fit the role, is not open source and is quite expensive for an undergraduate student. I have been unsuccessful so far, since searching for the word 'concept' on both Google and Google Scholar does not seem to match what I want.
It seems to me you need a text mining tool for clustering. RapidMiner has an open-source, Java based Community Edition which has several extensions (Text Mining, R, etc.). In addition you can develop and integrate your own algorithms too.
Moreover, Rexer Analytics conducts a comprehensive data mining survey annually; you can request the reports for free.
I want to know about various techniques to do speech recognition and text to speech conversion.
Also, please let me know about any resources such as links, tutorials, ebooks, etc. on the subject.
Which is the most efficient technique to achieve it?
I'm going to answer the part about speech recognition (since I don't know much about text-to-speech):
This book, "Statistical Methods for Speech Recognition" is a classic that explains the mathematical foundations of statistical speech recognition, written by the founder of that area, Frederick Jelinek.
The most important concept you have to know is Hidden Markov Models. People have been using them in speech recognition for decades. A recent approach uses Conditional Random Fields, see the paper (PDF) and the associated software toolkit SCARF.
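To give a feel for how a Hidden Markov Model is used at decoding time, here is a minimal sketch of the Viterbi algorithm, which finds the most likely sequence of hidden states for a sequence of observations. The two states, the three observation symbols and all probabilities are toy values invented for this example; real recognizers use phone-level states, acoustic feature vectors and log-probabilities to avoid underflow.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (most likely state path, its probability) for the observations."""
    # best[t][s]: probability of the best path that ends in state s at time t
    best = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] * trans_p[p][s])
            scores[s] = best[-1][prev] * trans_p[prev][s] * emit_p[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state to recover the full path.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[-1][last]


# Toy model: is each audio frame silence or speech, given its energy level?
states = ["silence", "speech"]
start_p = {"silence": 0.6, "speech": 0.4}
trans_p = {
    "silence": {"silence": 0.7, "speech": 0.3},
    "speech": {"silence": 0.4, "speech": 0.6},
}
emit_p = {
    "silence": {"low": 0.5, "mid": 0.4, "high": 0.1},
    "speech": {"low": 0.1, "mid": 0.3, "high": 0.6},
}

path, prob = viterbi(["low", "mid", "high"], states, start_p, trans_p, emit_p)
print(path, prob)
```

With these toy numbers the decoded path is silence, silence, speech. Toolkits in this area implement the same idea at scale, together with training procedures for estimating the probabilities from data.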
It is fairly hard to write your own speech recognizer. It's an active research area with several scientific conferences, e.g. ASRU, Interspeech, ICASSP.
Both are very wide areas.
About recognition: In this schema you will find how to build a basic automatic speech recognition system. It isn't by any means close to the state of the art, but it is achievable and it works. If you want to do something more advanced, read about cepstral coefficients and Hidden Markov Models. Have a look at HTK; it is a widely used toolkit for Hidden Markov Models.
About text to speech: I'd have a look at Festival.
There are multiple Sphinx projects. The main active ones are pocketsphinx and sphinx4.
Sphinx4 is written in Java. It is better for desktop and web applications.
Pocketsphinx is written in C. It is better for embedded devices. There are iPhone/Android apps that use it.
Sounds like you want pocketsphinx. Try out this tutorial:
http://www.speech.cs.cmu.edu/sphinx/tutorial.html
A better place to ask pocketsphinx/sphinx4 questions is on CMU's sourceforge forum.
Also, you should provide more information, such as what you intend to build.
As for books, the bible of speech recognition is "Spoken Language Processing".
Since you mentioned MS -
You should just look at the Microsoft Speech site. It contains many resources for dealing with speech, including TTS and speech recognition.
If you're looking for some actual code, check out Sphinx, an open source speech recognition project from CMU. It's not written in C++, but if you're interested in algorithms, it implements a bunch of stuff you can learn from. (I'd like to echo @dehmann's point, too: read up on hidden Markov models.)
If you are curious about what to do with your fancy speech recognition you should read:
Voice Interaction Design by Randy Allen Harris
It provides some great advice about when to use Voice and how to use it in an application.