Bypassing speech recognition in Amazon AVS

As I understand AVS, you send an audio clip to the API, which runs speech recognition on it, interprets the resulting text, and returns a result based on what you asked.
What I want to do is make a kind of CLI version of Alexa, where you type what you would normally say out loud to an Amazon Echo.
So I'm wondering whether there is some way to bypass the speech recognition step using an Amazon API, so that I can just send the text.
I thought about implementing the AI myself, but it would be nice to use all the skills available for Alexa.

No chance.
For your own skills you can do that by calling them directly. In the end it's a simple HTTPS call with a JSON payload. But that's not possible for other people's skills, unless the owner publishes the skill as an HTTP endpoint.
You also have to handle user sessions etc. yourself.
For a "CLI Echo", have a look at the different bot frameworks. Most companies with an Alexa app also have a documented REST backend which you can use directly. See Twitter, Facebook, etc.
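
To illustrate the "HTTPS call with a JSON payload" for a skill you own, here is a minimal sketch in Python that builds a body shaped like an Alexa custom-skill IntentRequest. The function name, the intent name and the IDs are made up for the example, and only the core fields are included. A skill that validates Alexa's request signature will reject a hand-built request like this, so it is only useful against an endpoint you control.

```python
import json
import uuid
from datetime import datetime, timezone


def build_intent_request(application_id, user_id, intent_name, slots=None, session_id=None):
    """Build a dict shaped like an Alexa custom-skill IntentRequest body.

    Hypothetical helper: it covers only the core fields, not the full schema.
    """
    slots = slots or {}
    return {
        "version": "1.0",
        "session": {
            # A request without a known session id starts a new session.
            "new": session_id is None,
            "sessionId": session_id or f"SessionId.{uuid.uuid4()}",
            "application": {"applicationId": application_id},
            "user": {"userId": user_id},
        },
        "request": {
            "type": "IntentRequest",
            "requestId": f"EdwRequestId.{uuid.uuid4()}",
            "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
            "locale": "en-US",
            "intent": {
                "name": intent_name,
                # The typed text goes straight into slot values, no audio involved.
                "slots": {n: {"name": n, "value": v} for n, v in slots.items()},
            },
        },
    }


if __name__ == "__main__":
    body = build_intent_request(
        "amzn1.ask.skill.example",  # placeholder skill id
        "cli-user-1",               # placeholder user id
        "GetWeatherIntent",         # placeholder intent name
        {"city": "Berlin"},
    )
    print(json.dumps(body, indent=2))
```

You would POST the resulting JSON to your skill's endpoint and keep reusing the same `sessionId` for follow-up requests, which is the session handling mentioned above.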

Related

Google Speech to text available offline?

I would like to leverage Google's Speech to text service for a desktop app, but I would like it to be offline. Is this possible?
They have on-prem solutions but can it be offline so no data is sent?
https://cloud.google.com/speech-to-text#all-features

Google's Speech-to-Text API only works through the cloud. It cannot work offline, because the Speech-to-Text and Text-to-Speech APIs are invoked through REST or RPC calls.
Speech-to-Text On-Prem allows you to deploy the Speech-to-Text API through a container or on any GKE cluster, but that doesn't mean you can run it on your local desktop.

Google's speech-to-text is available offline only for English on several devices. If you want it to work offline for other languages, you should install that specific language on your device; otherwise it won't work.
Basically, Google speech recognition requires internet access to make REST and RPC calls. If you have working internet access, it will work with every language you want. But in offline mode it only works with the device-specific language, most probably English.

aws lex human handoff / intervene

So, I am building a solution for web and mobile platforms which provides users with some screens from where the data from the Database is retrieved and inserted by some forms. One of the core features of the application is a chat-bot with supervised learning.
As per my understanding of the Lex API, it can share the current websocket connection with other AWS services like API Gateway and Lambda.
Also, I have come to know that human handoff is not provided out of the box in Lex, as it is in the DialogFlow API and Azure Bots.
Therefore, I am planning to share the same websocket opened by the Lex API for interacting with the user with API Gateway (as it supports websockets), thereby creating a human handover.
Please suggest whether there is a better approach to this problem, or whether I am on the right path.
P.S. My application stack is Node.js and Angular based, and the following is my app's architecture.

How do you use Amazon's Mechanical Turk API?

I'm using Amazon's Mechanical Turk as a requester and need to automate some things (specifically bonus payments).
I feel really stupid for asking this, but... how does one actually use the MTurk API? I was reading the API reference, and it specifies some details about a lot of requests, including this one for bonus payment, but there's nothing about how to actually perform such a request. I assume it's an HTTP request, but there's no mention of which endpoints to use or how to obtain keys for authorization.

You could call it via the AWS CLI or via your preferred programming language, such as Python.
For an introduction, you could do a web search and read articles like Tutorial: A beginner’s guide to crowdsourcing ML training data with Python and MTurk.

You'll need to have an AWS account and make a call from code or the CLI. For working code examples using the API in a variety of languages, check out mturk-code-samples on GitHub. The MTurk blog also has end-to-end examples.
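
As a rough idea of what "a call from code" looks like for bonus payments, here is a sketch using boto3, the AWS SDK for Python, whose `mturk` client has a `send_bonus` operation. The helper names and the IDs below are made up for illustration, and the sketch assumes boto3 is installed and your AWS credentials are already configured (for example through `aws configure`). It points at the requester sandbox so nothing real is paid out while testing.

```python
from decimal import Decimal

# MTurk requester sandbox endpoint; drop endpoint_url to hit production.
SANDBOX_ENDPOINT = "https://mturk-requester-sandbox.us-east-1.amazonaws.com"


def make_sandbox_client():
    """Create an MTurk client for the sandbox (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helper below can be tested without it

    return boto3.client("mturk", region_name="us-east-1", endpoint_url=SANDBOX_ENDPOINT)


def send_bonus(client, worker_id, assignment_id, amount_usd, reason, token=None):
    """Pay a bonus to one worker for one assignment.

    amount_usd is a string or Decimal in US dollars; the API expects a string.
    token is an optional idempotency key so a retry does not pay twice.
    """
    params = {
        "WorkerId": worker_id,
        "AssignmentId": assignment_id,
        "BonusAmount": f"{Decimal(amount_usd):.2f}",
        "Reason": reason,
    }
    if token is not None:
        params["UniqueRequestToken"] = token
    return client.send_bonus(**params)


if __name__ == "__main__":
    mturk = make_sandbox_client()
    # Placeholder IDs: use a real worker and assignment from your own HITs.
    send_bonus(mturk, "WORKER_ID", "ASSIGNMENT_ID", "0.50", "Thanks for the careful work")
```

Looping `send_bonus` over a list of (worker, assignment, amount) rows is then enough to automate a batch of bonuses.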

Request data from my Google Glass to Mirror API?

Since you can't do all the nice-looking stuff on cards with the GDK at this point (HTML, images and so on), I was wondering whether it is possible to ask the Mirror API from within my application (created with the GDK) to send me some data.
Example:
I see the flow like this:
1. The GDK app is started with "ok glass, search app".
2. You say what you want to search for.
3. The app takes the words and asks the Mirror API for a result.
4. The Mirror API sends the result to the Glass timeline.
Regards
Joakim

Absolutely. If it makes sense for your app either to communicate with an external service that sends data back via the Mirror API, or to call the Mirror API itself, then you can certainly do so. (Although you begin getting dangerously close to just doing it all in Mirror at that point.)
The biggest challenge you'll face is having your app go through the OAuth dance to get an auth token to use.

It sounds like the core of your issues is that you want a richer way to display content in the static part of the timeline to the right of the clock. You have a couple of options.
GDK
If you'd like to stay pure GDK, you can create your own view, and flatten it into a bitmap. The steps to complete this are the same as for other Android devices.
Mirror API
You could also use the Mirror API to insert HTML static cards using timeline's insert method, but to do this you will need to communicate some authentication information to your GDK Glassware. For example, if you want to insert into the Mirror API directly from Glass, you would need a way to provide an access and refresh token to your GDK Glassware.
There is no graceful way to do this with the released APIs, but I've seen some people accomplish this using the OAuth 2.0 flow for devices or scannable QR codes.
If you go down this route, be prepared to update your implementation. Google has announced improved support for sending authentication information to GDK Glassware. Once it's available you will want to switch over to it.
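
For reference, inserting an HTML static card comes down to one authenticated POST to the Mirror API's timeline collection. The sketch below only builds that request with Python's standard library and does not send it. The function name and the token value are placeholders, and getting a valid OAuth 2.0 access token onto the device is exactly the hard part described above.

```python
import json
import urllib.request

# REST endpoint for timeline.insert in the Mirror API (v1).
MIRROR_TIMELINE_URL = "https://www.googleapis.com/mirror/v1/timeline"


def build_timeline_insert(access_token, html):
    """Build (but do not send) a request that inserts an HTML static card."""
    body = json.dumps({"html": html}).encode("utf-8")
    return urllib.request.Request(
        MIRROR_TIMELINE_URL,
        data=body,
        method="POST",
        headers={
            # OAuth 2.0 bearer token obtained through one of the flows above.
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    req = build_timeline_insert("ACCESS_TOKEN_PLACEHOLDER", "<article><p>Search result</p></article>")
    print(req.get_method(), req.full_url)
    # Sending it would be: urllib.request.urlopen(req)
```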

The Mirror API allows you to communicate back to it from Glass through a contact that you create using the Mirror API.
How it works:
1. You create a contact.
2. Then, in step 2 of your flow, what you need to do with the GDK is share a note with your contact.
3. The words get transcribed and delivered to your Mirror notification listener.
4. The Mirror notification listener on the server gets the text of what kind of app you would like to search for, performs the search, and delivers the result by simply publishing to your timeline.
That's the best that I can see right now.
Here is a link on how to declare voice menu commands (only two are available now, but you can propose more):
https://developers.google.com/glass/develop/mirror/contacts#declaring_voice_menu_commands
P.S. To go through the OAuth2 challenge, download the sample Mirror app from
https://developers.google.com/glass/samples/mirror and edit your oauth2.properties file with the credentials you get from your Google developer web console (you will need to create the app with Google and request to enable the Mirror API).
Then run
mvn clean install
then run
mvn jetty:run

Invoking a web service API by using Text Message

I am creating an iOS application, and I have also implemented some web services. My requirement is: "The user should be able to call a web service API by sending a text message (SMS)". After a lot of research I found a provider called Clickatell (http://www.clickatell.com/), but I don't know how to configure it. Please help me configure it. Or are there any other APIs or SMS gateways providing this service?

Disclaimer, I do developer evangelism part time at Nexmo.
Here are a few SMS APIs that I've used (I've not really used Clickatell, but I've gone through the signup process, and the following APIs seem a lot simpler to use):
Nexmo
Twilio
Tropo
All three APIs are straightforward REST/HTTP APIs.
You can call the API directly from your mobile application. However, you should consider whether you really want to compile your API credentials into your application. It may be better to host a kind of proxy that your application uses. Here's some example code used as a verification service, but it's essentially the same concept: https://github.com/Nexmo/Verify
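
To make the proxy idea concrete: the credentials stay on your server, and the server is the only place that builds the authenticated call to the SMS provider. Here is a minimal sketch in Python, using Twilio's Messages endpoint as the example provider. The function name, phone numbers and credentials are placeholders, and the request is only built, not sent.

```python
import base64
import urllib.parse
import urllib.request


def build_sms_request(account_sid, auth_token, from_number, to_number, text):
    """Build (but do not send) a Twilio 'send message' request.

    Meant to run on the proxy server, so account_sid and auth_token
    never ship inside the mobile app.
    """
    url = f"https://api.twilio.com/2010-04-01/Accounts/{account_sid}/Messages.json"
    form = urllib.parse.urlencode({"To": to_number, "From": from_number, "Body": text})
    # HTTP Basic auth: account SID as user name, auth token as password.
    credentials = base64.b64encode(f"{account_sid}:{auth_token}".encode()).decode()
    return urllib.request.Request(
        url,
        data=form.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Basic {credentials}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


if __name__ == "__main__":
    # Placeholder values; in practice read them from server-side configuration.
    req = build_sms_request("ACXXXXXXXX", "secret", "+15550001111", "+15552223333", "Hello")
    print(req.get_method(), req.full_url)
    # Sending it would be: urllib.request.urlopen(req)
```

The app then talks only to your proxy, which can also rate-limit or validate requests before spending SMS credit.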

I would suggest taking a look at Mogreet's new developer web site.
It has very easy-to-use and very powerful REST/HTTP APIs. It supports sending SMS/MMS with awesome quality for all media types.
