I'm building an Alexa skill that will allow Alexa users to interact with a consumer-facing e-commerce site. The site already has functionality to call a representative. Now, as a side project, I want to build a voice app that offers that same option via a conversation. It will need slots like location, category of call, etc. It's basically an application/transactional bot.
In the future, if this is successful, I'd like that same general app to be accessible on different IoT devices (like Google Home, etc.). Therefore, I'd like to abstract out the voice interactions and have the same (general) flow and API to interact with.
This leaves me doing some research on different technologies like api.ai, wit.ai, Lex, etc.
But, since this is an app for Alexa and I already rely on AWS and Amazon in general, I think I'd prefer to use Lex or just write a native Alexa app for now.
I'm having a hard time understanding the differences between the two. I understand that Alexa was built using Lex and I see that they have similar concepts like intent, slots, etc.
But, I'm looking for any differences between the two services:
Would using Lex allow me to more easily integrate with other devices? Or is there any benefit?
Would using Lex allow me greater flexibility in designing/modifying the flow of a conversation? It seems like Lex is a little more complex and, therefore, might allow greater functionality.
Or is it just that Lex offers nearly the exact same functionality and is just meant for devices that aren't Alexa?
Does Lex offer any more analytics processing than Alexa? In Alexa I can only see intents/slots, but if I could see the actual text in Lex, that would be ideal.
Alexa Skills Kit (ASK) is used to build skills for use in the Alexa ecosystem and devices and lets developers take advantage of all Alexa capabilities such as the Smart Home and Flash Briefing API, streaming audio and rich GUI experiences. Amazon Lex bots support both voice and text and can be deployed across mobile and messaging platforms.
(Source: Amazon Lex FAQs)
In my view (very limited Alexa dev experience) AWS Lex allows greater control over the bot dialog. It defines separate validation and fulfilment code hooks, enables specific prompts for slots on the UI, supports programmatic transitions between intents, gives proper versioning and alias handling, etc... so it seems it's more of an enterprise offering as opposed to "consumer-level" Alexa skills.
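A rough sketch of what the separate validation and fulfilment hooks can look like in a single Lambda function, assuming the Lex (V1) Lambda event and response shapes. The slot name CallCategory, the category list and the prompt texts are hypothetical, made up for illustration only:

```python
# Hypothetical categories for the "call a representative" intent.
KNOWN_CATEGORIES = {"billing", "returns", "shipping"}


def plain_text(text):
    """Build a Lex message object."""
    return {"contentType": "PlainText", "content": text}


def lambda_handler(event, context):
    intent = event["currentIntent"]
    slots = intent["slots"]

    # Validation hook: runs on every turn while slots are being collected.
    if event["invocationSource"] == "DialogCodeHook":
        category = slots.get("CallCategory")
        if category is not None and category.lower() not in KNOWN_CATEGORIES:
            slots["CallCategory"] = None  # discard the invalid value
            return {
                "dialogAction": {
                    "type": "ElicitSlot",
                    "intentName": intent["name"],
                    "slots": slots,
                    "slotToElicit": "CallCategory",
                    "message": plain_text(
                        "Is this about billing, returns or shipping?"
                    ),
                }
            }
        # Nothing to complain about: let Lex decide the next step.
        return {"dialogAction": {"type": "Delegate", "slots": slots}}

    # Fulfilment hook: all slots are filled and confirmed.
    return {
        "dialogAction": {
            "type": "Close",
            "fulfillmentState": "Fulfilled",
            "message": plain_text(
                "A representative will call you about {}.".format(
                    slots["CallCategory"]
                )
            ),
        }
    }
```

The point is only that validation and fulfilment are two distinct entry points that you control, which is where the extra dialog flexibility comes from.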
But surprisingly, it lacks a few important features. For example, it does not have a built-in "boolean" slot type, so you have to code around yes/no questions, and there are no CloudWatch logs for Lex at all. Also, the (growing) list of integrations will make it more generic.
But despite being a huge AWS fan, I have to say that api.ai seems to be a more polished, feature-rich proposition, at least for now.
With regard to integrations with other devices, I do not think any of these platforms promise that. It seems that if you target Google Home, then it's their platform; if you target Alexa, then it's Alexa or api.ai (not sure if Google will push this in the future). But if you plan to integrate with chat platforms, or directly into web applications, then I think all major platforms can give you that, now or in the near future.
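To give an idea of what "coding around" a yes/no question can mean, here is a minimal sketch in plain Python. The word lists and the function name parse_yes_no are my own assumptions, not part of any Lex API; you would call something like this from your code hook on the raw slot value:

```python
# Hypothetical word lists; extend them for your own users.
YES_WORDS = {"yes", "yeah", "yep", "sure", "ok", "okay", "correct"}
NO_WORDS = {"no", "nope", "nah"}


def parse_yes_no(utterance):
    """Return True, False, or None when the answer is unclear."""
    words = utterance.lower().strip(" .!?").split()
    if not words:
        return None
    # Only look at the first word, with trailing punctuation removed.
    first = words[0].strip(",.!?")
    if first in YES_WORDS:
        return True
    if first in NO_WORDS:
        return False
    return None
```

When the function returns None, the usual move is to re-prompt the user rather than guess.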
By the way, have you checked IBM Watson or Microsoft Bot framework (with LUIS)? They are also very capable, complete frameworks, too, don't discount them!
There is a risk in using an external NLP service to process raw text delivered by Alexa over its native hobbled interaction model. Amazon may not certify your skill. This is unfortunate to hear, but their excuse is the threat of exposing private user data users may not realize they're sending. This is sickening because to do anything robust you must avoid Alexa's native NLP system. And I don't believe Lex is advanced much beyond it. You're caught in a bind. This is what will perhaps set Alexa back in the long run with respect to natural conversation. We've been preparing our skills in stealth mode, and an Amazon rep said our approach was a "hack" and may not get certification when published. I'm not yet sure what the answer is. Does this raw text issue exist with Google Home or other voice platforms? Beware.
"Alexa for Business is intended to enable organizations to take advantage of Amazon’s voice enabled assistant, Alexa. Alexa for Business provides Alexa capabilities that make workers more productive, while working alongside all of the other capabilities that Alexa has today like music, smart home controls, shopping, and thousands of third party skills.
Amazon Lex is intended to help build custom conversational interfaces and chat bots for use cases like call centers or application based bots. Bots built with Lex can be highly customized and exist separately from Alexa but they do not take advantage of Alexa’s built in capabilities or third party skills. Both Alexa for Business and Amazon Lex use Amazon’s deep learning capabilities that provide Automatic Speech Recognition (ASR) and Natural Language Understanding (NLU)."
I am not sure if this is the appropriate place for this, but I have come up with a "conceptual" modular design architecture that separates the logic out into individual services to allow an almost plug and play type scenario whereby there are no dependencies between the services. Think a list of features and only enabling the ones that you want.
To facilitate this I realise that I will need some type of middleware that will connect these all together and control the flow of data. However I am not sure of the specifics around what would be appropriate to achieve this.
I plan on implementing the services as .NET SOAP-based services, so is this a case for using something like TIBCO?
Any suggestions around what would be most appropriate or even where to start looking would be great.
If the above description didn't make sense hopefully this image is a bit clearer in describing the relationship between the services.
Thanks.
Depending on your needs, you could use NServiceBus (http://particular.net/nservicebus). NServiceBus is communication middleware which can be used with different types of queuing systems like MSMQ, RabbitMQ and others. It is essentially a service bus which is very developer-friendly and focused. It not only facilitates asynchronous, message-based distributed communication but also offers:
Publish / Subscribe that is transport agnostic using automatic registration
Transports: Can be used with MSMQ, RabbitMQ, Azure Storage Queues, etc.
Security: Supports encryption of messages
BLOBs: Has support for storing large message payloads transparently with the data bus, to allow communicating messages larger than the transport allows.
Scalability: Out and upscaling to increase throughput
Reliability: Deduplication, idempotent processing without having distributed transactions.
Orchestration: Sagas can help in controlling message flow and routing.
Exception handling: Exceptions get automatically retried in two different stages.
Monitoring: Tools like Service Pulse, Service Insight and Windows Performance monitors to monitor performance and errors. See what errors occurred and
Serialization: Can use different serializers that support formats like xml, json, binary
Open Source: All source code is available
Auditing: Can move all processed messages to an audit queue for archiving or audit requirements
Community: Has a large community of developers that are active on the forums but also supply additional transports, serializers and other features.
I must mention that I work for Particular, but also that there are other options to consider. NServiceBus does not use SOAP for message exchange but a lightweight message in a format of your choice, as mentioned in the serialization bullet. It can integrate with services that require SOAP. It has the ability to expose a service (endpoint) as a WCF service for easy integration, and it can use SOAP from within code to call external SOAP services using the features that the .NET Framework and Visual Studio provide.
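To make the publish/subscribe idea behind this kind of middleware concrete, here is a deliberately tiny in-process sketch in plain Python. It is not the NServiceBus API (which is .NET and runs over a real transport); the MessageBus class, the "OrderPlaced" message type and the handlers are hypothetical names used for illustration:

```python
from collections import defaultdict


class MessageBus:
    """Routes messages to handlers by message type, so services never
    reference each other directly."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, message_type, handler):
        self._handlers[message_type].append(handler)

    def publish(self, message_type, payload):
        """Deliver payload to every subscriber; return how many got it."""
        handlers = self._handlers.get(message_type, [])
        for handler in handlers:
            handler(payload)
        return len(handlers)


# Two independent "services" that only know about the bus.
billed, mailed = [], []
bus = MessageBus()
bus.subscribe("OrderPlaced", lambda msg: billed.append(msg["order_id"]))
bus.subscribe("OrderPlaced", lambda msg: mailed.append(msg["order_id"]))

bus.publish("OrderPlaced", {"order_id": 42})
```

Enabling or disabling a feature then comes down to whether its service subscribes, which is the plug-and-play property asked about in the question. A real service bus adds the queuing, retries and durability that this sketch leaves out.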
Good luck in choosing the right technology for your project.
From what I know, Parse offers convenient communication stacks for various platforms such as iOS, so it is easy to build clients that use your web app.
But Parse also seems to be tightly integrated with Facebook. If you were to build a web app that does not need Facebook, but that may integrate with Facebook in the long term, is Parse the clear winner over deploying directly to AWS, or are there important disadvantages to consider?
As far as I understand their page, Parse is a PaaS (platform as a service) provider like Heroku and others, while AWS is an IaaS (infrastructure as a service) provider.
Pros for PaaS:
They care about the infrastructure
You build your app on an existing platform
At the start you don't need "ops guys", as you don't do ops
You can use their knowledge and prebuilt tools to your advantage
Pros for IaaS:
You have full control over the underlying infrastructure
You can start with a greenfield and build whatever you want
You can use tools like Puppet / Chef / ... to control your servers
You don't have to pay for the additional stuff you get when using PaaS
(but have to pay your people for it)
So there is no winner of this "battle". You have to decide whether you want to use prebuilt tooling and give up some independence for it, or whether you want absolute control over everything (nearly, as you can't touch the hardware) and invest time and manpower into building your own tooling.
"Better, Faster, Cheaper.."
If you are pursuing a mobile-first strategy, Parse is a great tool for bootstrapping a mature, full web presence from nothing more than an original beta app.
I don't have direct experience with AWS.
I have used Heroku/Parse to integrate (very quickly) a standalone mobile app with a back end that needs to cover the following:
DB/persistence/noSql
Workflow - async tasks
REST API interface HTTP
Once the mobile app existed with only stubbed local data, Parse allowed a single engineer to build out ALL of the infrastructure mentioned above very quickly, taking the app from single-user to multi-user with a full DB and workflow that backs client-side events with considerable server-side and cloud-side business logic and process. Scaling-related startup work that used to take weeks took only days.
The compression (time and money) when scaling up an app stack is really something. The Parse API did almost everything that I needed, with one small exception (remuxing UGC media).
Personally, I abandoned the Parse/Android SDK in favor of the more robust REST API (threading on the client side and heavy HTTP activity).
Developers used to Curl/REST dev stacks will take to Parse.
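For those who have not seen it, this is roughly what talking to the Parse REST API looks like, sketched with the Python standard library. The two X-Parse-* headers are the ones the REST API expects; the base URL, keys, class name and fields below are placeholders (the base URL is a parameter because it depends on where your Parse back end is hosted). The function only builds the request; nothing is sent:

```python
import json
import urllib.request


def build_create_request(base_url, app_id, rest_key, class_name, fields):
    """Build (but do not send) a POST that creates one object of class_name."""
    url = "{}/classes/{}".format(base_url.rstrip("/"), class_name)
    return urllib.request.Request(
        url,
        data=json.dumps(fields).encode("utf-8"),
        headers={
            "X-Parse-Application-Id": app_id,
            "X-Parse-REST-API-Key": rest_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values, for illustration only.
request = build_create_request(
    "https://example.invalid/1", "APP_ID", "REST_KEY",
    "Clip", {"title": "demo", "durationSeconds": 12},
)
```

Passing the request to urllib.request.urlopen would send it; the same call maps one-to-one onto a curl command, which is why the REST route feels natural if you come from that kind of stack.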
For personal and university research reasons, I am thinking of building a simple CRM using a service-oriented architecture. Its purpose is only to explain the architecture itself; it is not for commercial use.
I was thinking of implementing a CRM that offers a simple analytics service and customer care (user storing, personal comments, and few other things).
The architecture that I'm designing defines:
- WebGUI (a client of the other services)
- AnalyticsService (a service that receives data, analyzes and collects it)
- CustomerCareService (a service that uses RESTful APIs to apply CRUD operations).
Each service has its own database, being completely independent from the others. They expose a public interface. The interface of course must provide some sort of authentication, to deny unauthorized requests.
The advantage I'd like to demonstrate with this kind of architecture is the possibility of keeping everything independent while being able to combine services to offer new ones (for example, if there were an OrderService to handle orders, it would be easy to combine it with Customer using the public APIs). The big advantage to me is that it'd be easy enough to build other clients that use these services.
I don't know what a good authentication method would be that is also easy to implement. I'm also not sure how to build these APIs (use XML, or plain REST APIs with GET/POST data). I've worked with Amazon, PayPal and other company APIs; they seem to use REST services to know what to do (PayPal uses an ugly _cmd GET parameter while Amazon uses better URIs), but from reading about SOA it appears that people also use XML. Of course, I also need to take into account that the web interface must be able to recognize the logged-in user, get the permissions (a token or whatever else) and use them with the services to show information.
So I'm not sure SOA is the kind of architecture I'm really building up... is it SaaS instead of SOA?
I think it would be better to use RESTful applications, with JSON or something like that to implement it (I'm not a big fan of XML, I find it to be too verbose).
For clarity I'm listing here my questions:
Is this kind of architecture called SOA or SaaS (or both)?
What is a good implementation for what I want to achieve? (please explain in as much detail as possible)
What sort of authentication is more suitable for a client (user token vs OAuth or similar)?
Do you have some suggestion for this kind of project?
I have about 3 months to do it, so I cannot do something really complex (besides the fact that it would not be realistic for a single programmer).
I know Python (WSGI frameworks), Ruby on Rails, C/C++ and other languages (.NET excluded), and I'd like to develop it in a Linux environment (MySQL or Postgres, or even a NoSQL store if you have a suggestion for the right choice). I could also combine several languages, since these services are independent programs.
What I'd like here is to have some good point of view and some good suggestion.
Thanks!
I would define SaaS as a business model rather than an architecture. Like any business requirement, it will influence the system architecture, but it is not itself an architecture. What you have described is essentially a Service Oriented Architecture.
Your statement "independent and the ability to combine them to offer new services" is the essential non-functional design requirement that suggests SOA.
A good SOA implementation is about having well-defined and flexible interfaces, with a very clear delineation of responsibilities. However, it is a difficult subject to be prescriptive about. The proof of the pudding is in the eating: does it provide that flexible reuse? My suggestion is to spend time reading SOA design pattern resources and to understand the defining characteristics with regard to the appropriate context for use. Then apply the Single Responsibility Principle at the appropriate level of abstraction. Cf. (Domain) Space-Based Architecture, which is a kind of SOA meta-pattern.
In regard to authorisation, I recommend following the service approach: use a distributed directory service such as OpenLDAP. Note that it is entirely reasonable for service providers and users to have their own credentials, and you can use public/private keys for signing messages.
The main suggestion is to study and learn from the experience of others:
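As an illustration of message signing, here is a minimal sketch using only the Python standard library. Note that it swaps in a simpler technique: an HMAC over a shared secret rather than a public/private key pair, because the standard library has no asymmetric signing. The flow is the same either way (the sender signs the body, the receiver recomputes and compares). The secret and message contents are made up:

```python
import hashlib
import hmac
import json


def sign(secret, body):
    """Return a hex signature over a canonical JSON encoding of body."""
    payload = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hmac.new(secret, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def verify(secret, body, signature):
    """Recompute the signature and compare it in constant time."""
    return hmac.compare_digest(sign(secret, body), signature)


# Hypothetical per-client secret, issued when the client registers.
secret = b"per-client-secret"
message = {"customer_id": 7, "action": "update"}
signature = sign(secret, message)
```

The receiving service would call verify with the secret it holds for that client and reject the request if it returns False. With real key pairs, the client would sign with its private key and the service would check against the stored public key instead.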
http://www.soapatterns.org/
http://martinfowler.com/eaaCatalog/
SOA doesn't force you to use XML.
Currently, web technologies dominate and define the future.
So at my company we selected JSON RESTful services as the foundation, with SOA as the guiding principles.
There is no sense in suggesting languages, because the purpose of SOA and of a good implementation is
- to enable any language or framework to be used
(FYI we use Java with Spring MVC-based web-services, Node.js, PHP)
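Since the question mentions Python and WSGI, here is a minimal sketch of what one such JSON RESTful service could look like as a bare WSGI application. Everything in it is an assumption made for illustration: the token value, the /customers/<id> route and the in-memory data stand in for a real credential store and database:

```python
import json

API_TOKEN = "demo-token"                     # hypothetical shared token
CUSTOMERS = {1: {"id": 1, "name": "Ada"}}    # stand-in for the service's own DB


def respond(start_response, status, body):
    """Send body as a JSON response with the given status line."""
    data = json.dumps(body).encode("utf-8")
    start_response(status, [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(data))),
    ])
    return [data]


def app(environ, start_response):
    # Reject any request that does not carry the expected bearer token.
    if environ.get("HTTP_AUTHORIZATION") != "Bearer " + API_TOKEN:
        return respond(start_response, "401 Unauthorized",
                       {"error": "invalid token"})

    parts = environ.get("PATH_INFO", "").strip("/").split("/")
    is_customer_get = (
        environ.get("REQUEST_METHOD") == "GET"
        and len(parts) == 2
        and parts[0] == "customers"
        and parts[1].isdigit()
    )
    if is_customer_get:
        customer = CUSTOMERS.get(int(parts[1]))
        if customer is not None:
            return respond(start_response, "200 OK", customer)
    return respond(start_response, "404 Not Found", {"error": "not found"})
```

For local experiments it can be served with wsgiref.simple_server.make_server from the standard library. Any client that can speak HTTP and JSON (the web GUI, another service, a mobile app) can then use it, regardless of the language it is written in.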
Web services and web APIs have managed to increase the accessibility of the information stored and catalogued on the internet. They have also opened up a vast array of enterprise power functionality for smaller thin client applications.
By tapping into these services, developers can provide functionality that would have taken them months, perhaps years, to set up. They can combine them into single applications that make life generally easier for their users.
Whether displaying information about the music being played, finding items of interest in the locale of the user or just simply tweeting and blogging from the same application - the possibilities are growing everyday.
I want to know about the most interesting or useful services that are out there, especially ones that most of us may not have heard about yet. Do you maintain an API or service? or do you have a clever mash up that provides even more benefits than the originals?
YQL - Yahoo provide a tool that lets you query many different API's across the web, even for sites that don't provide an API as such.
From the site:
The Yahoo! Query Language is an expressive SQL-like language that lets you query, filter, and join data across Web services.
...
With YQL, developers can access and shape data across the Internet through one simple language, eliminating the need to learn how to call different APIs.
The World Bank API is pretty cool. Google uses it in search results. My favourite implementations are the cartograms at worldmapper.
(source: worldmapper.org)
It's very niche, but I happen to think the OpenCongress API is amazing.
Less niche: Google Translate has an API which will guess the language of something. You'd be AMAZED how frequently this comes in handy (even though it's not as tweakable as you'd like and is not trained on small samples).
I was just about to have a stab at using the SoundCloud API
I know many people who already use it for sharing their musical masterpieces, and it's a pretty good site. Hopefully the API will be as well!
I like the RESTful API for weather.com. It's free and very useful for the new age of location-aware apps: https://registration.weather.com/ursa/xmloap/step1
It does require registration, but they don't spam you or anything - it's just to provide you a key to use the API.
Ah yes - here's another one I've been meaning to check out but haven't tried yet
The BBC offer a bunch of APIs/feeds that look very promising
http://ideas.welcomebackstage.com/data
They include APIs for accessing schedule data for both TV and radio listings, along with all kinds of news searches. It even looks like they'll be offering some sort of geo-location service soon, so it will be interesting to see what that has to offer.
Another interesting one for liberal Brits! ;)
The Guardian newspaper have their own API
http://www.guardian.co.uk/open-platform
MusicBrainz
Excellent service for music mashups.
Not many people know that Last.FM's initial database was scraped from this service.
The United States Postal Service offers a web service that does address standardization. Quite useful in reducing clutter and cleaning data before it gets put into your database.
I think a package that would be quite useful is a centralised notification/news system.
This would run on a web server and client libraries could send messages to the server. Examples of messages might be:
Commits to version control.
Continuous build server failures (including logs).
News from project management.
Users could create accounts on the server and decide how they want to view the messages, e.g. email, RSS, etc. There could be filters based on channels, priorities, regexes, etc.
Does anyone know of any software package that provides these features (or could be extended to do so)? (Preferably Windows-based, but please cover other platforms.)
If I can't find one I was thinking of writing one in Python using Django.
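To make the filtering idea concrete, this is roughly the core I have in mind, as a plain Python sketch (no Django yet). The class names, fields and example data are just my first guess at a model, nothing here is an existing package:

```python
import re
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Message:
    channel: str     # e.g. "commits", "builds", "news"
    priority: int    # higher means more urgent
    text: str


@dataclass
class Subscription:
    user: str
    delivery: str                                     # e.g. "email" or "rss"
    channels: Set[str] = field(default_factory=set)   # empty set = all channels
    min_priority: int = 0
    pattern: Optional[str] = None                     # optional regex on the text

    def matches(self, message):
        if self.channels and message.channel not in self.channels:
            return False
        if message.priority < self.min_priority:
            return False
        if self.pattern and not re.search(self.pattern, message.text):
            return False
        return True


def route(message, subscriptions):
    """Return (user, delivery) pairs for every subscription that matches."""
    return [(s.user, s.delivery) for s in subscriptions if s.matches(message)]


subscriptions = [
    Subscription("alice", "email", channels={"builds"}, min_priority=5),
    Subscription("bob", "rss", pattern=r"(?i)release"),
]
failure = Message("builds", 8, "Nightly build failed on trunk")
```

Client libraries would only need to post a Message to the server; the server would run route and hand each pair to the matching delivery back end.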
I found an XMPP protocol (xep-0060) from the pubsubhubbub link:
This specification defines an XMPP protocol extension for generic publish-subscribe functionality. The protocol enables XMPP entities to create nodes (topics) at a pubsub service and publish information at those nodes; an event notification (with or without payload) is then broadcasted to all entities that have subscribed to the node. Pubsub therefore adheres to the classic Observer design pattern and can serve as the foundation for a wide variety of applications, including news feeds, content syndication, rich presence, geolocation, workflow systems, network management systems, and any other application that requires event notifications.
What you describe sounds like a normal feed aggregator service, but with real-time?
I recently saw a video of a Google tech talk where they announced a product that hooks over the existing RSS/Atom structure but provides real-time notification. Didn't bookmark it unfortunately (hopefully someone will comment with it?), but that sounds like the underlying technology for what you want.