Set Google Translate to translate a name as it sounds - google-cloud-platform

I want to translate a person's name from Arabic or Hebrew to English.
But if the name has a meaning, the translation result is the meaning and not the pronunciation of the name, which is what I need.
Is there a way to tell Google to translate just the pronunciation and not the meaning?
Thanks

The Google Cloud Translation API does not support this feature (transliteration).
However, a feature request has been filed for it. You can vote for the feature by clicking +1, and star it to receive updates.

Related

Google Dialogflow - Asking a customer for a unique number

I am trying to ask a customer for a unique number. I have tested this using the test console and it's coming up with multiple variations without giving a value.
The numbers are a mix of 4/6/8 digits. I want a customer to be able to say 'my plan number is 12345678' and for me to be able to get that value and work with it.
What parameters/system entities should I be using to get a result? Often it will miss a digit, put in a hyphen, etc.
P.S. this is using voice only, not text.
There's a feature called Auto speech adaptation that will help you in this specific case. After enabling it, check point 5 in the Example speech recognition improvements. It explains how you can use auto speech adaptation with Regexp entities to capture digit sequences, and it gives you a regular expression you can use. It also recommends using the @sys.number-sequence entity.
The enhanced speech models can also improve number recognition accuracy, but bear in mind that this is still a beta feature.
For reference, you can also check the article Improving speech recognition for contact centers on the Google Cloud Blog.
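Because the recognizer may still return a hyphen or drop a digit, it can help to validate the captured value in the fulfillment code before using it. The following is a minimal sketch, not part of Dialogflow itself: the function name `normalize_plan_number` is hypothetical, and the allowed lengths (4, 6, or 8 digits) are taken from the question.

```python
import re
from typing import Optional

# Pattern for the lengths mentioned in the question: exactly 4, 6, or 8 digits.
PLAN_NUMBER = re.compile(r"(?:[0-9]{4}|[0-9]{6}|[0-9]{8})")


def normalize_plan_number(recognized_value: str) -> Optional[str]:
    """Remove spaces and hyphens, then return the digits if the length is valid.

    Returns None when the value is not a 4-, 6-, or 8-digit number, so the
    caller can re-prompt the customer instead of working with a bad value.
    """
    digits = re.sub(r"[\s\-]", "", recognized_value)
    if PLAN_NUMBER.fullmatch(digits):
        return digits
    return None


if __name__ == "__main__":
    for value in ["1234-5678", "12 34 56", "12345"]:
        print(repr(value), "->", normalize_plan_number(value))
```

A `None` result is a signal to ask the customer to repeat the number rather than to guess the missing digit.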

How to disable auto correction for Google Cloud Speech to Text API

Is there a way to disable auto correction for the Google Cloud Speech to Text API? It is important for me to get an accurate transcript of the user's speech, including any errors they make, rather than a corrected version.
It is difficult to distinguish mistakes made by the speaker (grammar or pronunciation errors) in the audio content from mistakes made by the Speech API. However, you can inspect the alternative transcripts the model predicts behind the scenes by using the maxAlternatives property of the API.
You have not provided an example of such a use case, but if you already expect unusual pronunciations or acronyms, you can pass phrase hints with the request (the phrases list inside speechContexts).
Please provide further details if it doesn't answer your question.
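As a rough illustration of the two settings mentioned above, the sketch below builds the JSON body for a `speech:recognize` REST call. The field names `maxAlternatives` and `speechContexts`/`phrases` follow the v1 REST reference as I understand it, so verify them against the current documentation; the helper name `build_recognize_request`, the bucket URI, and the example phrases are hypothetical.

```python
import json
from typing import Iterable


def build_recognize_request(audio_uri: str,
                            phrases: Iterable[str],
                            max_alternatives: int = 5,
                            language_code: str = "en-US") -> dict:
    """Build a request body asking for several alternatives plus phrase hints.

    Encoding and sample rate are omitted here; they may be required
    depending on the audio format.
    """
    return {
        "config": {
            "languageCode": language_code,
            # Ask for more than the single top transcript.
            "maxAlternatives": max_alternatives,
            # Phrase hints bias recognition toward expected words.
            "speechContexts": [{"phrases": list(phrases)}],
        },
        "audio": {"uri": audio_uri},
    }


if __name__ == "__main__":
    body = build_recognize_request(
        "gs://example-bucket/sample.flac",  # hypothetical location
        ["I goed to school", "SKU"],        # hypothetical hints
    )
    print(json.dumps(body, indent=2))
```

Comparing the returned alternatives side by side is one way to spot cases where the top transcript smoothed over what the speaker actually said.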

Google Cloud Vision API - TEXT_DETECTION

When I try to recognize text in an image, like the Italian word "Perchè", the Vision API returns the word "Perche" (it gives back "e" and not the correct "è").
I don't want to use languageHints to try to obtain better results because I have to do OCR across different languages.
What is the problem here?
This is a known issue with the Cloud Vision API when you don't use language hints.
You can see the actual bug report here.
It is in the accepted state, but there seems to have been radio silence on it for the last few months. It may take some time for a fix to roll out.

A tool which checks that a local version of a site is fully translated (for continuous integration)

I'm working on a project in which we design a localized version of an existing site (written in English) for another country (which is not English-speaking). The business requirement is "no English text for all possible and impossible cases".
Does anyone know if there is checker software or a service that could verify that a site is fully translated, that is, that there is no English text in it?
I know that there are sites for checking broken links, HTML validity, etc. I need something like http://validator.w3.org/checklink, but for checking that there is no English text on any page of the site.
The reasons I think this is needed are:
1. There is a lot of code which is common (both on backend and frontend) for all countries.
2. If someone commits anything to the common code, I need to be sure that this will not lead to English text issues in the localized version.
3. From a business point of view, it is preferable that the site does not support some functionality than that it shows English text (legal matters).
4. The code on both frontend and backend changes a lot.
5. There are a lot of files which affect text on the client's screen. Not just one with messages, unfortunately. And some of the messages come from the backend, but most of them are in the frontend.
6. Due to all those facts, currently someone manually fills in all the forms and checks them with their own eyes, and that is before each deploy...
I think you're approaching the problem from the wrong direction. You're looking for an algorithm or web crawler that can detect whether any text is English or not? I don't know, but I doubt such a thing even exists.
If you have translated the website, you have full access to the codebase and/or translation texts, right? Can't you just open both the English and non-English strings files (.resx or whatever you are using) in a compare tool like Notepad++ and check the differences to see if there are any missing strings? Then check the source code and verify that all parts that can output user-displayable text use the meta:resourceKey property (or whatever you are using).
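The strings-file comparison can also be automated so it runs on every commit. Here is a minimal sketch, assuming the English and localized strings have already been loaded into dictionaries keyed by resource key (parsing .resx or any other format is left out); the function name `missing_translations` and the sample keys are hypothetical.

```python
from typing import Dict


def missing_translations(source: Dict[str, str],
                         localized: Dict[str, str]) -> Dict[str, str]:
    """Report keys that are absent, empty, or identical to the English text."""
    problems = {}
    for key, english in source.items():
        value = localized.get(key)
        if value is None:
            problems[key] = "missing"
        elif not value.strip():
            problems[key] = "empty"
        elif value == english:
            # Possibly a copy-pasted English string left untranslated.
            problems[key] = "untranslated"
    return problems


if __name__ == "__main__":
    english = {"greeting": "Hello", "submit": "Submit", "cancel": "Cancel"}
    german = {"greeting": "Hallo", "submit": "Submit"}
    print(missing_translations(english, german))
```

The "untranslated" check can produce false positives for strings that are legitimately the same in both languages (brand names, "OK"), so a CI job would likely need an allow-list for those keys.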
If you want to go the way of crawling, I'm not aware of an existing crawler that does this, but it sounds like a combination of two simple issues:
1. Finding existing open-source code for a web crawler should be dead simple.
2. Identifying a language through n-gram analysis is trivial if there's a limited number of languages the text can be in.
The only difficult part would be to ensure that the analyzer always has a decent chunk of text to work with. You could extract stuff paragraph by paragraph. For forms you'd probably have to combine the text of several form labels.
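To make the n-gram idea concrete, here is a toy sketch that compares character trigram counts using cosine similarity. It is not a production language detector: the profiles below are built from a single sentence each, whereas a real check would need much larger reference texts per language. All names (`trigrams`, `cosine`, `detect_language`) are hypothetical.

```python
import math
from collections import Counter
from typing import Dict


def trigrams(text: str) -> Counter:
    """Count character trigrams after lowercasing and collapsing whitespace."""
    padded = " " + " ".join(text.lower().split()) + " "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two trigram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def detect_language(text: str, profiles: Dict[str, Counter]) -> str:
    """Return the profile name whose trigram counts are most similar to the text."""
    sample = trigrams(text)
    return max(profiles, key=lambda lang: cosine(sample, profiles[lang]))


if __name__ == "__main__":
    profiles = {
        "en": trigrams("the quick brown fox jumps over the lazy dog and the cat"),
        "de": trigrams("der schnelle braune fuchs springt über den faulen hund und die katze"),
    }
    print(detect_language("the lazy dog", profiles))
    print(detect_language("der faule hund", profiles))
```

Very short inputs give unreliable scores, which is why combining several form labels into one chunk before scoring, as suggested above, matters.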

Free Language Identifier Service?

Do any of you guys know of a free online (or offline if it's in Java) language identifier service? (I don't want a tool you use manually. I need a service, since I have to do this identification programmatically.)
I've got a form and I'd like to figure out what language a user has written in.
Come to think of it, shouldn't this be doable through a Google thingy somehow? Since they detect page languages and all, and they're mostly open source...
Thanks for any help. Cheers!
[I added a "google-translate" tag since there isn't anything regarding text-recognition (there's image and voice but no text)]
Language Detection Library for Java looks like the kind of thing you are looking for.
Also see http://en.wikipedia.org/wiki/Language_identification for more links.
Language Detection API has a free plan. You can pass text via HTTP POST and receive a JSON result with the detected languages and scores.