How to make a speech recognition chat bot using wit.ai? - python-2.7

I'm trying to build a chat bot using wit.ai that recognizes speech and converts it into text in the chat bot.
Is it possible to make this kind of chat bot with the wit.ai GUI?
I have already converted the voice into text, but I'm having difficulty integrating the voice input with the chat bot. How can I do this?

Since the new release of Messenger, you can convert speech into text, so if you are developing for Messenger or another app with good voice-to-text, you can rely on the app instead of trying to do it yourself. In the end you will only have text inputs, but people will be able to convert their speech into text.
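Once the app (or your own speech-to-text step) has produced text, handing it to wit.ai is an ordinary text request. Below is a minimal sketch, assuming wit.ai's HTTP `GET /message` endpoint with a bearer token. The token value, the version date and the helper name `build_wit_request` are placeholders for illustration, not something from the original post.

```python
import urllib.parse
import urllib.request

WIT_MESSAGE_URL = "https://api.wit.ai/message"


def build_wit_request(transcript, token, version="20200513"):
    """Build (but do not send) a wit.ai /message request for transcribed text.

    `version` is an assumed API version date; use the one your app targets.
    """
    query = urllib.parse.urlencode({"v": version, "q": transcript})
    return urllib.request.Request(
        WIT_MESSAGE_URL + "?" + query,
        headers={"Authorization": "Bearer " + token},
    )


# Hypothetical usage: the transcript comes from whatever speech-to-text
# step you already have (Messenger, a mobile keyboard, your own code).
request = build_wit_request("turn on the lights", token="YOUR_WIT_TOKEN")
# To actually call the service: urllib.request.urlopen(request).read()
```

The request is built separately from sending it, so the voice part and the bot part stay independent: anything that yields a string can feed the bot.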

Related

Make a website converting text to audio [Google Cloud Text to Speech API]

I'm a beginner in coding. I would like to make a simple website using Google Cloud Text to Speech API.
a web site with a text box
you write a text in the text box and click a button "convert to audio"
you can download mp3 file which is made by google cloud text to speech api
I have read Google Cloud Text to Speech API's official site, but couldn't find a solution.
I have searched for phrases like "develop a website converting text to audio".
I found this site.
Creating an HTML Application to Convert Text Files to Audio Files
However, it didn't meet my request.
Could you give me any information to develop a website converting text to audio?
Thank you in advance.
Sincerely, Kazu
I have made a python program on Google Colaboratory. I would like to do the same thing on a website.
from google.colab import drive
drive.mount('/content/drive')
!cp ./drive/'My Drive'/credential.json ./credential.json
import os
os.environ["GOOGLE_APPLICATION_CREDENTIALS"]="credential.json"
f= open("text.ssml","w+")
f.write('<speak><prosody rate="slow">hello world</prosody></speak>')
f.close()
!pip install google-cloud-texttospeech
#!/usr/bin/env python
from google.cloud import texttospeech
client = texttospeech.TextToSpeechClient()
with open('text.ssml', 'r') as f:
    ssml = f.read()
input_text = texttospeech.types.SynthesisInput(ssml=ssml)
voice = texttospeech.types.VoiceSelectionParams(language_code='en-US', name="en-US-Wavenet-A")
audio_config = texttospeech.types.AudioConfig(audio_encoding=texttospeech.enums.AudioEncoding.MP3)
response = client.synthesize_speech(input_text, voice, audio_config)
with open('output.mp3', 'wb') as out:
    out.write(response.audio_content)
    print('Audio content written to file "output.mp3"')
from google.colab import files
files.download('output.mp3')
To achieve what you want, and since you say you are new to coding, the first thing to do is get familiar with the GCP Text-to-Speech API. A good first step is to follow the quickstart tutorial, Using client libraries text-to-speech.
As for your requirement of an input box that converts text to audio, you need to follow the general guidelines for deploying an application on GCP:
Serve Machine Learning Model on App Engine Flexible Environment
So basically your options are to train a model and serve it via an App Engine deployment, or to deploy an application which sends requests with a JSON payload to the Text-to-Speech API. You will need to do quite a bit of reading. Hope this helps.
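For the second option (an application that sends a JSON payload to the API), the server side of the "convert to audio" button could look roughly like this. It is a sketch, assuming the v1 REST method `text:synthesize`, which returns the audio as a base64 string in `audioContent`. The helper names are made up for illustration, and authentication (API key or OAuth token) and the actual HTTP call are left out.

```python
import base64
import json

# REST endpoint of the Cloud Text-to-Speech API (v1).
SYNTHESIZE_URL = "https://texttospeech.googleapis.com/v1/text:synthesize"


def build_synthesize_payload(text, language_code="en-US",
                             voice_name="en-US-Wavenet-A"):
    """Return the JSON body for a text:synthesize request asking for MP3."""
    return json.dumps({
        "input": {"text": text},
        "voice": {"languageCode": language_code, "name": voice_name},
        "audioConfig": {"audioEncoding": "MP3"},
    })


def extract_mp3_bytes(response_body):
    """Decode the base64 `audioContent` field of a text:synthesize response."""
    return base64.b64decode(json.loads(response_body)["audioContent"])
```

A web handler would take the text box contents, POST `build_synthesize_payload(text)` to `SYNTHESIZE_URL` with credentials, and return `extract_mp3_bytes(...)` to the browser as a downloadable `audio/mpeg` file.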
If you want the flexibility of handling multiple TTS (text to speech) providers (we have at least 4), and enhanced discovery of voices, you might want to look at www.api.audio
Here's an example https://docs.api.audio/recipes/create-engaging-newscast

Can Lex start the conversation?

I want to create a Lex bot that would send a welcome message every time the chat gets opened. Does anyone know if this is possible?
It should depend on the channel you are going to use, but I know that Lex itself cannot initiate a conversation. Also, channels like Facebook Messenger highly discourage bots that initiate a chat because it could become flagged as a spam bot.
However, you could definitely build a workaround, but it will have to be channel-specific and live outside of Lex. It could be as simple as detecting that a user has opened a chat and sending a "hello" to Lex on that user's behalf, so that Lex replies with the welcome message. But something like that depends completely on the channel you use.
Word of Warning: Initiating a conversation may violate a user agreement or developer guidelines of Amazon Lex, or the chat channel your bot uses, so I don't suggest doing so.
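The workaround described above could be sketched as follows, assuming the Lex V1 runtime API, where a boto3 `lex-runtime` client exposes `post_text`. The function name, bot name, alias and trigger text are hypothetical, and detecting that the chat was opened remains the channel's job.

```python
def send_welcome_trigger(lex_client, bot_name, bot_alias, user_id,
                         trigger_text="hello"):
    """Send a greeting to Lex on the user's behalf and return the bot's reply.

    `lex_client` is expected to behave like boto3.client("lex-runtime");
    it is passed in so the function does not depend on AWS credentials.
    """
    response = lex_client.post_text(
        botName=bot_name,
        botAlias=bot_alias,
        userId=user_id,
        inputText=trigger_text,
    )
    # Lex puts the text to show the user in the "message" field.
    return response.get("message", "")


# Hypothetical usage from a "chat opened" hook in your own channel code:
#   import boto3
#   lex = boto3.client("lex-runtime")
#   welcome = send_welcome_trigger(lex, "MyBot", "prod", user_id="user-123")
```

The bot still only ever replies; your channel code is what speaks first, which is why the warning above about channel policies applies.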

Voice Recognition SDK/API and Windows 8 Store App (C++)

I'm currently trying to find an SDK or API that I can integrate into a Windows 8 Store App (C++). I have found several, but they either require desktop APIs not accessible in Store Apps or are only for C# (such as Bing Voice Recognition, which would be perfect if it were available for C++). I know there is minimal support for what I'm asking, but I've searched extensively, so any help or suggestions on what to try or use for very basic voice-to-text would be tremendously helpful.
Thank you!
If you are building a Windows 8 Store App (C++), the best option I found for voice recognition is the AT&T Speech API. There is an SDK for C#, but from C++ you can POST to their server using OAuth2 and get a JSON response back with the transcribed speech.

Using Google Api: Speech To Text on PC Version

Google Chrome provides speech-to-text (STT), and many smartphone apps provide STT as well. The recognition quality is good.
I want to program this in Visual Studio (MFC), but there is no method for doing STT. If I could use the Google Speech-to-Text API, it would be easy to solve this problem.
If there is no public Google API for STT, please tell me another way to do this.
To my knowledge, Google has not documented its speech API and does not intend it to be used by general-purpose clients. I believe the intent is for the speech API to support their Android and Chrome products. That said, there is more information at Does Anyone Uses Google Speech API in Production? and Is there an API for Google's speech recognition technology?.
Since you're programming for Windows, why don't you use the built-in Windows speech engine? You can use the System.Speech features of .NET, or Microsoft.Speech together with the free recognizers Microsoft provides. Windows 7 includes a full speech engine; others are downloadable for free. There is a C++ API to the same engines known as SAPI. See http://msdn.microsoft.com/en-us/magazine/cc163663.aspx or http://msdn.microsoft.com/en-us/library/ms723627(v=vs.85).aspx. For more background on the Microsoft engines for Windows, see
What is the difference between System.Speech.Recognition and Microsoft.Speech.Recognition?
One last link that I hope is helpful: a simple example of speech recognition in .NET, SAPI and Windows 7 Problem.
You may find this useful: https://gist.github.com/alotaiba/1730160. Basically, you need to send a FLAC-encoded audio file to Google's servers in a POST request. Be aware that it accepts only audio files of 15 seconds or less (for a simple voice command app that would be enough).
I'm looking into something like this too, and the MS Speech API isn't for me, even though it is good, because it doesn't support most of the languages Google's API does (Polish, for example; the same goes for MS text-to-speech).

Is it possible to build offline app with Appcelerator and Rhomobile?

I have recently found two look-alike solutions/IDEs for cross-platform mobile development, Appcelerator and Rhomobile (I know there are more), and I have some questions about these two platforms:
1) I believe the only way to build the view is using HTML, and I like the idea a lot. But does that mean the application itself isn't available if the phone is offline?
2) Do you know if it's possible to publish the application to the App Store and the Google store?
3) Are there simulators for different phones, and do they support all the slide/tab events?
4) And finally, is there a way to transfer the app to your phone without having to publish it anywhere?
Please note that I have no knowledge at all about mobile app dev and those two solutions (Appcelerator, Rhomobile) would be perfect for me as I am familiar with Javascript and HTML.
Thank you!
OK, I have only used Appcelerator, but:
1) A webview is like a browser without the address bar. It simply parses HTML; where it gets the HTML from is up to you. If you write the HTML and pass it in as a file, then yes, it can work offline. If it is used to parse a response from a web page, then no, as it needs to send an HTTP request to that page.
Many people seem to get this wrong (for a reason unknown to me, as all the documentation states otherwise): Appcelerator is not the same as PhoneGap. Appcelerator uses its own JavaScript-based API to let developers make native apps; it is NOT a webview wrapper. It is offline by default and allows you to send HTTP requests if you need something online.
2) Yes, you can publish to the App Store and the Google store from Appcelerator; the documentation walks you through the process.
3) Appcelerator requires you to download either the iOS SDK or the Android SDK, which come with simulators. Appcelerator and the emulators support the standard events found on these devices.
4) With Android you can build an .apk file and distribute it however you wish. With iOS the only way is to publish to the App Store. The only other option is to make a mobile website instead of an application.