I have been doing a bit of experimentation with Amazon Lex but I can't get voice to work in the console at all.
I'm using the Flower bot demo with the associated Python Lambda function connected and working with text on Chrome browser running on a Mac (10.13.1).
I am able to log any text entered into the test bot on the console from the Lambda function along with the rest of the event.
By going to the monitoring tab of the bot in the console I can see utterances from previous days (seems to be a one day delay on utterances appearing wether missed or detected, no idea why…).
I made a bunch of attempts to use voice yesterday that appear in the utterance table as a single blank entry with a count of 13 now that it is the next day. I'm not sure if this means that audio isn't getting to Lex or if Lex can't understand me.
I'm a native English speaker with a generic American accent (very few people can identify where I'm from more specifically than the U.S.) and Siri has no trouble understanding me.
My suspicion is that something is either blocking or garbling the audio before it gets to Lex but I don't know how to find what Lex is hearing to check that.
Are there troubleshooting tools I haven't found yet? Is there a way to get a live feed of what is being fed to a bot under test? (All I see for the test bot is the inspect response section, nothing for inspecting the request.)
Regarding the one day delay in appearance of utterances, according to AWS documentation:
Utterance statistics are generated once a day, generally in the
evening. You can see the utterance that was not recognized, how many
times it was heard, and the last date and time that the utterance was
heard. It can take up to 24 hours for missed utterances to appear in
the console.
In addition to #sid8491's answer, you can get the message that Lex parsed from your speech in the response it returns. This is in the field data.inputTranscript when using the Node SDK.
CoffeeScript example:
AWS = require 'aws-sdk'
lexruntime = new AWS.LexRuntime
accessKeyId: awsLexAccessKey
secretAccessKey: awsLexSecretAccessKey
region: awsLexRegion
endpoint: "https://runtime.lex.us-east-1.amazonaws.com"
params =
botAlias: awsLexAlias
botName: awsLexBot
contentType: 'audio/x-l16; sample-rate=16000; channels=1'
inputStream: speechData
accept: 'audio/mpeg'
lexruntime.postContent params, (err, data) ->
if err?
log.error err
else
log.debug "Lex heard: #{data.inputTranscript}"
Go to Monitoring tab of your Bot in Amazon Lex console, click "Utterances", there you can find a list of "Missed" and "Detected" utterance. From the missed utterances table, you can add them to any intent.
Related
I am trying to create a demo call centre using AWS connect.
Part of my contact flow makes use of the "Get Customer Input", as I want to use an Amazon lex bot. I have created to divert to a specific working queue. For example, if the user says "sales" they should be directed to the sales queue.
I have tested the Lex bot within the Lex console and it works as intended.
However when testing the Lex integration within AWS connect it will always follow the "error" path on the block after a user says something on the phone.
Here is the CloudWatch log showing the Error result of the module.
{
"Results": "Error",
"ContactFlowName": "Inbound Flow",
"ContactFlowModuleType": "GetUserInput",
"Timestamp": "2022-02-12T18:06:10.940Z"
}
Here is the contact flow:
Here is the settings for the "Get Customer Input Block":
Here is a test of the Lex bot in the Lex dashboard:
Any help would be greatly appreciated.
Turns out the solution was if you're using LexV2 make sure you set the proper Language Attribute as well. Easiest way is using the set Voice block in your Contact Flow, on the very bottom of the block you can enable "set language attribute".
I am a new in AWS services and we want to build a simple demo that detect a special word and: [1] trigger an action [2] responses (as speech during the call).
For example, if the user say: "Help" I want to reply "OK" and make an operation (AWS lambda).
We're using Twilio, and Twilio should streaming the audio.
As I understand I have two options, Android Lex and Transcribe, when Lex is for bots and transcribe just translate the speech and can't get involved in conversation.
So the questions are:
What Services should I use to trigger an action when the special word is recognize AND involved in the conversation?
Can I streaming the call directly to AWS service via Twilio?
Edit
To be more clear: The communication will be with two persons in real time, and I want to make interject during their call when someone say "Help" I want to add a bot voice to the conversation and say "OK", for example"
[Person 1]: Hi, how are you
[Person 2]: HELP ...
[BOT]: OK (like a third person in a conference call..).
I am not fully clear on the interaction taking place with the user, before they interject with help. Are they listening to a bot, media file, TTS, or communicating with another person in real time?
For realtime analysis, you would need to use Twilio Media Streams, which streams the voice conversation to a service that could then convert the speech to text in near real time, looking for keywords, and then programmatically perform some action based on those keywords.
An example of using Twilio Media streams with Lex:
Use Amazon Lex as a conversational interface with Twilio Media Streams
I was wondering if anybody has ever experimented with this issue I'm having and could give me any input on the subject.
As it stands right now I'm trying to see if there is a way to grab a users input through the AWS Connect. I understand that there is already a "Get User Input" block in the GUI that is available for me to use, unfortunately it does not offer the fine grain control I am looking for with requests and responses from Lex.
Right now I am able to Post Content to Lex and get responses just fine, as well as output speech using Amazon Polly via my Lambda. This works great for things that do not require a user to have to give feedback for a question.
For example if a client asks
"What time is my appointment?"
and we give back
"Your appointment is for X at X time, would you like an email with
this confirmation?"
I want to be able to capture what the user says back within that same lambda.
So the interaction would go like so:
User asks a question.
Lambda POST's it to Lex and gets a response
Amazon Polly says the response - i.e: 'Would you like an email to confirm?'
Lambda then picks up if the user says yes or no - POST's info to Lex
Gets response and outputs voice through Polly.
If anybody has any information on this please let me know, thank you!
Why do you make so much complications to implement IVR system using Amazon Connect. I have done the complete IVR automated system to one of my biggest US banking client. Use the below procedure to achieve what you desire.
Build a complete interactive lex bot(So that you can avoid amazon poly & using lex post content api). It is advised to build each bot has only one intent in it.
In connect using "Get User Input" node map the lex bot which you have created earlier with the question to be asked "What time is my appointment?". Once this question has been played the complete control goes to lex and then you fulfilled your intent from lex side, you can come back to connect as like that.
Refer AWS contact center for the clear idea.
I’ve created a Lex bot that is integrated with an Amazon Connect work flow. The bot is invoked when the user calls the phone number specified in the Connect instance, and the bot itself invokes a Lambda function for initialisation & validation and fulfilment. The bot asks several questions that require the caller to provide simple responses. It all works OK, so far so good. I would like to add a final question that asks the caller for their comments. This could be any spoken text, including non-English words. I would like to be able to capture this Comment slot value as an audio stream or file, perhaps for storage in S3, with the goal of emailing a call centre administrator and providing the audio file as an MP3 or WAV attachment. Is there any way of doing this in Lex?
I’ve seen mention of ‘User utterance storage’ here: https://aws.amazon.com/blogs/contact-center/amazon-connect-with-amazon-lex-press-or-say-input/, but there’s no such setting visible in my Lex console.
I’m aware that Connect can be configured to store a recording in S3, but I need to be able to access the recording for the current phone call from within the Lambda function in order to attach it to an email. Any advice on how to achieve this, or suggestions for a workaround, would be much appreciated.
Thanks
Amazon Connect call recording can only record conversations once an agent accepts the call. Currently Connect cannot record voice in the Contact Flows. So in regards to getting the raw audio from Connect, that is not possible.
However, it looks like you can get it from lex if you developed an external application (could be lambda) that gets utterances: https://docs.aws.amazon.com/lex/latest/dg/API_GetUtterancesView.html
I also do not see the option to enable or disable user utterance storage in Lex, but this makes me think that by default, all are recorded: https://docs.aws.amazon.com/lex/latest/dg/API_DeleteUtterances.html
I'm using the Amazon Lex service. My input is always a text message, but sometimes I'd like a spoken response in addition to the text. I configured an output voice in the Lex settings.
I've tried adding a header amz-lex:accept-content-types=SSML to the request, but it returns with Invalid Bot Configuration: No usable messages given the current slot and sessionAttribute set. (Service: AmazonLexRuntime; Status Code: 400; Error Code: BadRequestException;. The same request works just fine when I ask for PlainText. And even if I ask for SSML,PlainText it'll respond with plain text only.
Do I need to configure something else inside Lex to allow it to do voice responses?
Lex cannot actually output voice by itself.
Lex will always output a JSON response and that response needs to be processed by the channel the user is accessing Lex with. So that channel is what outputs either text or voice based on how it processes the response message delivered from Lex.
Amazon Lex can handle speech-to-text.
Amazon Polly can do the reverse: text-to-speech.
If you go to the above Lex page, they have a few examples of using Lex for conversation logic and then Polly for text-to-speech and outputting voice to the user.
You can utilize SSML(Speech Synthesis Markup Language) for this, even to test voice in Lex Test Bot Console, using message content-type.
Using SSML tags, you can customize and control aspects of speech, such as pronunciation, volume, and speech rate.
SSML comes with a variety of directives with which you can customize pronunciation and create based on your requirement . Eg - say-as directive
`"message": {
"contentType": "SSML",
"content": "<speak> Hi " + data["User ID"].split('.')[0]+", Your Reference Number <say-as interpret-as="characters">" + "ABC"+event.currentIntent.slots.RefNo+ "</say-as> is ," + data["Status"] +"</speak>"
}`
Introduced in 2018 - https://aws.amazon.com/about-aws/whats-new/2018/02/announcing-responses-capability-in-amazon-lex-and-ssml-support-in-text-response/