Test Google Speech API with audio file


I want to see if Google Speech API will be accurate enough for my purposes. I have an audio file I want to test it with, but the demo on the main page only lets you record from a microphone. Is there a way to test Google's speech processing with an audio file without having to learn the API first?

No, you will have to use the API if you want to upload a file.
Google's documentation describes how to make the API request, and the process is fairly straightforward. The same page also explains how to set up your account, enable billing, and get the access token for the request.
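As a rough illustration of what that request looks like, here is a minimal sketch that posts a local audio file to the `speech:recognize` REST method. It assumes the file is 16 kHz, 16-bit mono PCM (`LINEAR16`) and short enough for a synchronous request, and that you already have an OAuth access token; the helper names `build_recognize_body` and `recognize` are made up for this example.

```python
import base64
import json
import urllib.request

# REST endpoint for synchronous recognition (Speech-to-Text v1).
SPEECH_URL = "https://speech.googleapis.com/v1/speech:recognize"


def build_recognize_body(audio_bytes, sample_rate_hz=16000, language_code="en-US"):
    """Build the JSON body for speech:recognize (hypothetical helper).

    Assumes the audio is LINEAR16; change "encoding" for other formats.
    """
    return {
        "config": {
            "encoding": "LINEAR16",
            "sampleRateHertz": sample_rate_hz,
            "languageCode": language_code,
        },
        # Inline audio content must be base64-encoded.
        "audio": {"content": base64.b64encode(audio_bytes).decode("ascii")},
    }


def recognize(audio_path, access_token):
    """Send a local audio file to the API and return the parsed JSON response."""
    with open(audio_path, "rb") as f:
        body = build_recognize_body(f.read())
    request = urllib.request.Request(
        SPEECH_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + access_token,
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `recognize("sample.raw", token)` with your own file and token should return a JSON object whose `results` list holds the transcription alternatives, which is usually enough to judge accuracy on a test clip.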

Related

Google Speech to text available offline?

I would like to leverage Google's Speech to text service for a desktop app, but I would like it to be offline. Is this possible?
They have on-prem solutions but can it be offline so no data is sent?
https://cloud.google.com/speech-to-text#all-features

Google's Speech-to-Text API only works through the cloud; it cannot work offline, because both the Speech-to-Text API and the Text-to-Speech API are called through REST or RPC requests.
Speech-to-Text On-Prem lets you deploy the Speech-to-Text API in a container or on any GKE cluster, but that doesn't mean you can run it on your local desktop.

Google Speech-to-Text is available offline only for English on several devices. If you want it to work offline for other languages, you have to install that specific language on your device; otherwise it won't work.
Basically, Google speech recognition requires internet access to make REST and RPC calls. If you have a working internet connection, it will work for every language you want. In offline mode, it only works for the languages installed on the device, most probably English.

How to use Google APIs to reduce in memory usage and to use Advance Google features?

I want to build a system that lets users post videos to their YouTube channel after they have logged in to my website with their Google account. The videos will then be published on the website.
I also need users to be able to comment on the videos shown on my website, and on YouTube.
Finally, I need each user's profile picture to be uploaded to their own Google Drive.
Tasks:
Upload videos to the website through YouTube (YouTube Video API).
Comment on videos on the website (YouTube Comments API).
Upload the profile picture to Google Drive (Google Drive API).
I don't know where to start, or how to make each user "host" their own content, for example their videos, comments, and profile picture.
I am using Django with Python 3.

Welcome to Stack Overflow! I hope you have a great time here. I have to warn you, however, that the question you are asking falls outside the site's guidelines. You should limit the scope of your question to a single problem that you are facing. I suggest you read more about this in the help center.
I recommend that you tackle one of the tasks first (such as the Google Drive one). You can follow the comprehensive quickstart provided by Google itself, which will let you write a script that lists the files in your Google Drive. Afterwards, you can try uploading files (your profile pictures) to it, following the upload documentation, which also includes Python examples.
Further to that, I also recommend that you check out the following links:
This explanation about how OAuth2 works, and more specifically with Google APIs.
Documentation on how to use the Google API Client library for Python.
Django social auth, a plugin for Django that allows your users to log into your platform using their Google credentials.
I hope this is useful to you and that you manage to build a great application.
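To make the Drive upload step a bit more concrete, here is a minimal sketch that uses only the standard library instead of the Google API Client library. It assumes the Drive v3 multipart upload endpoint and an OAuth access token that the user has already granted (for example through the social-auth login); the function names and the `profile.png` file name are hypothetical.

```python
import json
import urllib.request
import uuid

# Drive v3 upload endpoint; multipart sends metadata and content together.
DRIVE_UPLOAD_URL = (
    "https://www.googleapis.com/upload/drive/v3/files?uploadType=multipart"
)


def build_multipart_upload(name, content, mime_type, boundary=None):
    """Return (content_type_header, body_bytes) for a multipart/related upload.

    Hypothetical helper: part one is the JSON metadata, part two the file bytes.
    """
    boundary = boundary or uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    metadata = json.dumps({"name": name}).encode("utf-8")
    body = b"\r\n".join([
        delimiter,
        b"Content-Type: application/json; charset=UTF-8",
        b"",
        metadata,
        delimiter,
        b"Content-Type: " + mime_type.encode("ascii"),
        b"",
        content,
        delimiter + b"--",  # closing delimiter
        b"",
    ])
    return "multipart/related; boundary=" + boundary, body


def upload_profile_picture(path, access_token):
    """Upload a PNG to the Drive of the user who owns the access token."""
    with open(path, "rb") as f:
        content_type, body = build_multipart_upload(
            "profile.png", f.read(), "image/png"
        )
    request = urllib.request.Request(
        DRIVE_UPLOAD_URL,
        data=body,
        headers={
            "Authorization": "Bearer " + access_token,
            "Content-Type": content_type,
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Because the request is made with the user's own token, the file lands in that user's Drive rather than yours, which is what lets each user "host" their own picture.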

Bypassing speech recognition in Amazon AVS

As I understand AVS, you send an audio clip to the API, which runs speech recognition on it, interprets the resulting text, and gives you a result based on what you asked.
What I want to do is make a kind of CLI version of Alexa where you type in what you would normally say out loud to an Amazon Echo.
So I'm wondering whether there is some way to bypass the speech recognition step using an Amazon API, so that I can just send the text.
I thought about implementing the AI myself, but it would be nice to use all the skills available for Alexa.

No chance.
For your own skills you can do that by calling them directly. In the end it's a simple HTTPS call with a JSON payload. But it's not possible for other skills unless the owner publishes them as an HTTP endpoint.
You also have to handle the user sessions, etc., yourself.
For a "CLI Echo", have a look at the different bot frameworks. Most of the companies with an Alexa app also have a documented REST backend that you can use directly. See Twitter, Facebook, etc.
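For the "call your own skill directly" case mentioned above, the JSON payload could be built roughly like this. This is only a sketch: it assumes the standard Alexa custom-skill request envelope, the IDs are random placeholders, and a skill endpoint that verifies Amazon's request signature would reject a hand-made request like this one. `build_intent_request` is a made-up helper name.

```python
import json
import uuid
from datetime import datetime, timezone


def build_intent_request(application_id, user_id, intent_name, slots=None):
    """Build an IntentRequest envelope for a custom skill (hypothetical helper).

    slots maps slot names to plain string values, e.g. {"City": "Berlin"}.
    """
    slot_payload = {
        name: {"name": name, "value": value}
        for name, value in (slots or {}).items()
    }
    return {
        "version": "1.0",
        "session": {
            "new": True,
            # Placeholder IDs; you have to track sessions yourself.
            "sessionId": str(uuid.uuid4()),
            "application": {"applicationId": application_id},
            "user": {"userId": user_id},
        },
        "request": {
            "type": "IntentRequest",
            "requestId": str(uuid.uuid4()),
            "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
            "locale": "en-US",
            "intent": {"name": intent_name, "slots": slot_payload},
        },
    }


if __name__ == "__main__":
    payload = build_intent_request(
        "my-skill-id", "cli-user", "GetWeatherIntent", {"City": "Berlin"}
    )
    print(json.dumps(payload, indent=2))
```

You would then POST this JSON to your skill's HTTPS endpoint and read the `outputSpeech` part of the response, reusing the same `sessionId` for follow-up turns.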

Request data from my Google Glass to Mirror API?

Since you can't yet do all the nice-looking stuff on cards with the GDK (HTML, images, and so on), I was wondering whether it is possible to ask the Mirror API from within my application (created with the GDK) to send me some data.
Example:
I see the flow like this:
The GDK app is started with "ok glass, search app"
You say what you want to search for.
The app takes the word and asks the Mirror API for a result.
The Mirror API sends the result to the glass timeline.
Regards
Joakim

Absolutely. If it makes sense for your app to either communicate with an external service that sends data back via the Mirror API, or calls the Mirror API itself, then you can certainly do so. (Although you begin getting dangerously close to just doing it all in Mirror at that point.)
The biggest challenge you'll face is having your app go through the OAuth dance to get an auth token to use.

It sounds like the core of your issues is that you want a richer way to display content in the static part of the timeline to the right of the clock. You have a couple of options.
GDK
If you'd like to stay pure GDK, you can create your own view, and flatten it into a bitmap. The steps to complete this are the same as for other Android devices.
Mirror API
You could also use the Mirror API to insert HTML static cards using timeline's insert method, but to do this you will need to communicate some authentication information to your GDK Glassware. For example, if you want to insert into the Mirror API directly from Glass, you would need a way to provide an access and refresh token to your GDK Glassware.
There is no graceful way to do this with the released APIs, but I've seen some people accomplish this using the OAuth 2.0 flow for devices or scannable QR codes.
If you go down this route, be prepared to update your implementation. Google has announced improved support for sending authentication information to GDK Glassware. Once it's available you will want to switch over to it.
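Once a token is available, inserting an HTML static card comes down to one authenticated POST. The sketch below only builds that request; it assumes the Mirror API timeline endpoint and an access token obtained by one of the workarounds above, and `build_timeline_insert` is a hypothetical helper name.

```python
import json
import urllib.request

# Timeline collection of the Mirror API (v1).
MIRROR_TIMELINE_URL = "https://www.googleapis.com/mirror/v1/timeline"


def build_timeline_insert(html, access_token):
    """Build the POST request that inserts an HTML card into the timeline."""
    body = json.dumps({"html": html}).encode("utf-8")
    return urllib.request.Request(
        MIRROR_TIMELINE_URL,
        data=body,
        headers={
            "Authorization": "Bearer " + access_token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def insert_card(html, access_token):
    """Send the request and return the created timeline item as a dict."""
    request = build_timeline_insert(html, access_token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

On Glass itself you would make the equivalent call from Java; the Python version is just meant to show the shape of the request.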

The Mirror API allows you to communicate back to it from Glass through a contact that you create using the Mirror API.
How it works:
You create a contact.
Then, at step 2 of your flow, what you need to do with the GDK is share a note with your contact.
The words get transcribed and delivered to your Mirror notification listener.
The Mirror notification listener on the server gets the text describing what kind of app you would like to search for, performs the search, and delivers the result by simply publishing to your timeline.
That's the best that I can see right now.
Here is a link on how to declare voice menu commands (only two are available now, but you can propose more):
https://developers.google.com/glass/develop/mirror/contacts#declaring_voice_menu_commands
P.S. To get through the OAuth 2.0 challenge, download the sample Mirror app:
https://developers.google.com/glass/samples/mirror
Edit your oauth2.properties file with the credentials you get from the Google developer web console (you will need to create the app with Google and request that the Mirror API be enabled).
Then run
mvn clean install
Then run
mvn jetty:run

What kind for authentication I should use in Google Blogger Data API for my facebook application

I am building a Facebook application (using Django) in which I have to read data from blogs using the Google Blogger Data API. The blog could be any public blog.
So, what kind of authentication/authorization mechanism should I use with the Google Blogger API for my application, and how? I don't want a redirected Google login page to open in my app.
The Google Console also provides an API key with which we can read public data, but I am not sure whether it is really the right choice for my app.
I am currently using ClientLogin during development.
I even got the weird idea of keeping ClientLogin after the release to read data from my blog, as it doesn't limit the number of requests per day. Does that make my blog insecure?

Using OAuth service protocols or Google API access tokens to pull public data would do the job.
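For public blogs, the API key route needs no login page at all. Here is a minimal sketch, assuming the Blogger API v3 REST endpoints (rather than the older Data API the question mentions) and an API key from the Google console; the function names are made up for this example.

```python
import json
import urllib.request
from urllib.parse import urlencode

BLOGGER_BASE = "https://www.googleapis.com/blogger/v3"


def blog_by_url_endpoint(blog_url, api_key):
    """URL that looks up a public blog (and its ID) by its address."""
    query = urlencode({"url": blog_url, "key": api_key})
    return BLOGGER_BASE + "/blogs/byurl?" + query


def posts_endpoint(blog_id, api_key, max_results=10):
    """URL that lists the most recent posts of a public blog."""
    query = urlencode({"key": api_key, "maxResults": max_results})
    return "{}/blogs/{}/posts?{}".format(BLOGGER_BASE, blog_id, query)


def fetch_json(url):
    """GET a URL and parse the JSON response (no OAuth needed for public data)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)
```

A typical use would be `blog = fetch_json(blog_by_url_endpoint(url, key))` followed by `fetch_json(posts_endpoint(blog["id"], key))`. Keep the key on the server side of your Django app so it is not exposed to Facebook users.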
