I'm working on a client-side SDK for my product (based on AWS). The workflow is as follows:
The user of the SDK somehow uploads data to some S3 bucket
The user somehow saves a command on some queue in SQS
One of the workers on EC2 polls the queue, executes the operation and sends a notification via SNS. This point seems to be clear.
As you might have noticed, there are quite some unclear points about access management here. Is there any common practice to provide access to AWS services (S3 and SQS in this case) for 3rd-party users of such SDK?
Options which I see at the moment:
We create IAM users for the users of the SDK, with access to some S3 resources and write permission for SQS.
We create an additional server/layer between AWS and the SDK which writes messages to SQS on behalf of users and provides a one-time, short-lived link for the SDK to write data directly to S3.
The first one seems to be OK, however I worry that I'm missing some obvious issues here. The second one seems to have a problem with scalability - if this layer goes down, the whole system won't work.
P.S.
I tried my best to explain the situation, however I'm afraid that question might still lack some context. If you want more clarification - don't hesitate to write a comment.
I recommend you look closely at Temporary Security Credentials in order to limit customer access to only what they need, when they need it.
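As a rough illustration of "only what they need", here is a minimal sketch of a scoped-down policy document that could be passed as the `Policy` parameter when requesting temporary credentials from STS. The bucket name, per-customer prefix layout and queue ARN are hypothetical, and the function name is made up for this example.

```python
import json


def build_session_policy(bucket: str, customer_prefix: str, queue_arn: str) -> str:
    """Return a JSON policy allowing uploads under one prefix and sends to one queue."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Uploads only, and only below this customer's prefix.
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{customer_prefix}/*"],
            },
            {
                # Sending commands only, and only to the one command queue.
                "Effect": "Allow",
                "Action": ["sqs:SendMessage"],
                "Resource": [queue_arn],
            },
        ],
    }
    return json.dumps(policy)


# Hypothetical names, for illustration only.
policy_json = build_session_policy(
    "example-upload-bucket",
    "customers/42",
    "arn:aws:sqs:us-east-1:123456789012:example-commands",
)
print(policy_json)
```

The resulting credentials can do no more than the policy allows, and they expire on their own, so a leaked set is of limited use.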
Keep in mind that any solution to this kind of problem depends on your scale, your customers, and what you are OK with exposing to your customers.
With your first option, letting the customer directly use IAM or temporary credentials exposes knowledge to them that AWS is under the hood (since they can easily see requests leaving their system). It has the potential for them to make their own AWS requests using those credentials, beyond what your code can validate & control.
Your second option is better since it addresses this: by making your server the only point of contact for AWS, you can perform input validation and similar checks before sending customer-provided data to AWS. It also lets you replace the implementation easily without affecting customers. On availability/scalability concerns, that's what EC2 (and similar services) are for: the layer can run on more than one instance.
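To make the input-validation point concrete, here is a minimal sketch of what the intermediate layer might check before it puts anything on the queue. The command fields (`operation`, `s3_key`), the allowed operation names and the `customers/<id>/` key layout are all assumptions for illustration, not part of the original design.

```python
import json

ALLOWED_OPERATIONS = {"resize", "transcode"}  # hypothetical operation names


def build_queue_message(customer_id: str, command: dict) -> str:
    """Validate a customer command and return the message body to enqueue."""
    operation = command.get("operation")
    if operation not in ALLOWED_OPERATIONS:
        raise ValueError(f"unsupported operation: {operation!r}")

    key = command.get("s3_key", "")
    expected_prefix = f"customers/{customer_id}/"
    # Reject keys outside the caller's own prefix, including ".." tricks.
    if not key.startswith(expected_prefix) or ".." in key:
        raise ValueError("s3_key is outside the customer's prefix")

    # Only whitelisted fields are forwarded; anything else is dropped.
    return json.dumps({"customer_id": customer_id, "operation": operation, "s3_key": key})


body = build_queue_message("42", {"operation": "resize", "s3_key": "customers/42/in.png"})
print(body)
```

Only the returned body would then be sent to SQS by the server, so customers never hold queue credentials themselves.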
Again, all of this depends on your scale and your customers. For a toy application where you have a very small set of customers, simpler may be better for the purposes of getting something working sooner (rather than building & paying for a whole lot of infrastructure for something that may not be used).
Related
I have an application that queries some of my AWS accounts every few hours. Is it safe (from a memory and number-of-connections perspective) to create a new client object for every request? As we need to sync almost all of the resource types for almost all of the regions, we end up with a hundred clients (number of regions multiplied by resource types) per service run.
In general, creating AWS clients is pretty cheap, and it is fine to create them and quickly dispose of them. The one area I would be careful with when it comes to performance is when the SDK has to resolve the credentials, for example by assuming IAM roles. It sounds like in your case you are iterating through a bunch of accounts, so I'm guessing you are explicitly setting credentials, and so that will be okay.
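If client creation ever does show up in profiling, one option is to create each client once per (service, region) pair and reuse it. The sketch below uses a stand-in factory so it runs without AWS access; in real code `make_client` would wrap something like `boto3.client(service, region_name=region)`. The function names are made up for this example.

```python
from functools import lru_cache

created = []  # records every real construction, for demonstration


def make_client(service: str, region: str) -> dict:
    """Stand-in for an SDK client constructor (an assumption, not the SDK)."""
    created.append((service, region))
    return {"service": service, "region": region}


@lru_cache(maxsize=None)
def get_client(service: str, region: str) -> dict:
    """Create a client once per (service, region) pair and reuse it afterwards."""
    return make_client(service, region)


for _ in range(3):  # three sync runs
    for region in ("us-east-1", "eu-west-1"):
        for service in ("ec2", "s3"):
            get_client(service, region)

print(len(created))  # 4 clients created across 12 lookups
```

When several accounts are involved, the account (or credential set) would need to be part of the cache key as well, since a cached client keeps the credentials it was created with.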
I've recently signed up to AWS to test out their IoT platform and after setting up a few Things and going through the documentation I still seem to be missing a crucial bit of information - how to wrangle all the information from my Things?
For example if I were to build a web-based application to display the health/status of all the Things and possibly also interact with a specific Thing, what would be the way to go about it?
Do I register a "dummy" thing that also uses the device SDK to pub/sub to the topics?
Do I take whatever data the Things publish and route it to a shared DB for further processing?
Do I create Lambdas that the Things invoke?
Do I create a stand-alone application that uses the general AWS SDK to connect itself to the IoT platform?
To me the last idea sounds the most viable and "preferred" as I would need two-way interaction, not just passive listening to changes in Things, is that correct?
Generally speaking your setup might be:
IoT device publishes to AWS SQS
Some Service (application or lambda) reads from SQS and processes data (e.g. saves it to DynamoDB)
And then to display data
Stand alone application reads from DynamoDB and makes data available to users
There are lots of permutations of this. For example your IoT device can write directly to DynamoDB, then you can process the data from there. I would suggest a better pattern is to write to SQS, as you will have a clean separation between data publishing, processing and storage.
In the first instance I would probably write one application that reads from the SQS, processes the data, stores it in DynamoDB and then provides access to that data for users. A better solution longer term is to have separate systems to process/store the data, and to present that data to users.
Lambda is popular for processing the device data, as it's cost-effective (it runs only when needed) and scales well. Your data presentation application is probably a traditional web app running on something like Elastic Beanstalk.
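The processing step described above could look roughly like the following Lambda handler, assuming the function is triggered by SQS and that devices publish JSON bodies. The payload fields (`device_id`, `temperature`) are hypothetical, and `store` is a stand-in for a DynamoDB write so the sketch runs on its own.

```python
import json


def parse_readings(event: dict) -> list:
    """Extract device readings from an SQS-triggered Lambda event."""
    readings = []
    for record in event.get("Records", []):
        payload = json.loads(record["body"])  # each message body is a JSON string
        readings.append(
            {
                "device_id": payload["device_id"],      # hypothetical field
                "temperature": payload["temperature"],  # hypothetical field
            }
        )
    return readings


def handler(event, context, store=print):
    """Lambda entry point; `store` stands in for a DynamoDB write."""
    readings = parse_readings(event)
    for reading in readings:
        store(reading)
    return {"processed": len(readings)}


sample_event = {
    "Records": [
        {"messageId": "1", "body": json.dumps({"device_id": "thing-1", "temperature": 21.5})}
    ]
}
result = handler(sample_event, None)
```

Keeping the parsing separate from the storage call makes it easy to later split processing and presentation into separate systems, as suggested above.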
If I understood the whole concept correctly, the "serverless" architecture assumes that instead of using your own servers or containers, you use a bunch of AWS services. Usually such an architecture includes Amazon API Gateway, a bunch of Lambda functions and DynamoDB (or an alternative) for storing data and state, as Lambda can't keep state. Services such as EC2 do not participate in all this because an EC2 instance is a virtual server, which diminishes the benefits of a serverless architecture.
All this looks really cool, but I feel like I'm missing something important, because right now this does not seem applicable to cases such as real-time applications.
Say I have 2 users online. One of them performs an action in an app, which triggers changes in the database, which in turn should trigger changes in the second user's app.
The conventional way to send some data or a command from server to client is a websocket connection. But with a serverless architecture there seems to be no way to establish and maintain a websocket connection. So... where did I misunderstand the concept? Or, if I understood everything correctly, how do I implement the interaction between 2 users as described above?
Where did I misunderstand the concept?
Your observation is correct. It doesn't work out of the box using API Gateway and Lambda.
An applicable solution, as described here, is to use AWS IoT - yes, another AWS service.
Serverless isn't just a matter of Lambda, API Gateway and DynamoDB; it's much bigger than that. One of the big advantages of serverless is the operational burden that it takes off your plate. No more patching, no more capacity planning, no more config management. Those may seem trivial, but doing those things well and across a significant fleet of instances is complex, expensive and time-consuming. Another benefit is the economics. Public cloud leverages utility billing, meaning you pay for what you run whether or not you actually use it. With AWS most services are billed by the hour, but Lambda is billed per 100ms. The cheapest EC2 instance running for a full month is about $10/month (double that for redundancy). $20 in Lambda pricing gets you millions of invocations, so for most cases serverless is significantly cheaper.
Serverless isn't for everything though. It has its limitations; for example, it's not meant for running binaries. You can't run nginx in Lambda (for example); it's only meant to be a runtime environment for the programming languages that it supports. It's also specifically meant for event-based workloads, which is perfect for microservice-based architectures: small, independent, discrete pieces of compute that do some work and, when done, send an event to one or more others to do something else and, if needed, return a response.
To address your concerns about realtime processing: depending on what your code is doing, your Lambda function could complete in anything from less than 100ms all the way up to 5 minutes. There are strategies to optimize its duration, but in general it's for short-lived work, which is conducive to realtime scenarios.
In your example about the 2 users interacting with the web app and the DB, that could very easily be built using serverless technologies with one or 2 functions and a DynamoDB table. The total round-trip time could be anywhere from milliseconds to seconds; it really all depends on your code and what it's doing. These would all be HTTP calls, so no websockets are needed. Think of a number of APIs calling each other, with your Lambda code as the orchestrator.
You might want to look at SNS (Simple Notification Service). In your example, if app user 2 is a subscriber to an SNS topic, then when app user 1 makes a change that triggers an SNS message, it will be pushed to the subscriber (app user 2). The message can be pushed over several supported mobile push platforms (Amazon, Apple, Google, MS, Baidu) in addition to email or SMS. The SNS message can be triggered by a Lambda function, including one invoked from a DynamoDB stream after an update (a database trigger). It's up to the app developer to select a message protocol and format. The app only has to receive messages through its native channels. This may not exactly be millisecond-latency 'real-time', but it's fast enough for all but the most latency-sensitive applications.
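The "database trigger" path can be sketched as a Lambda function that receives DynamoDB stream records and turns each change into a notification. This is a minimal sketch under assumptions: the item attributes (`item_id`, `status`) are hypothetical, and `publish` stands in for an SNS publish call so the code runs without AWS access.

```python
import json


def changes_to_messages(event: dict) -> list:
    """Turn DynamoDB stream records into notification payloads."""
    messages = []
    for record in event.get("Records", []):
        if record.get("eventName") not in ("INSERT", "MODIFY"):
            continue  # deletions are ignored in this sketch
        image = record["dynamodb"]["NewImage"]
        # Stream images use typed attribute values, e.g. {"S": "text"}.
        messages.append(
            json.dumps({"item_id": image["item_id"]["S"], "status": image["status"]["S"]})
        )
    return messages


def handler(event, context, publish=print):
    """Lambda entry point; `publish` stands in for an SNS publish call."""
    messages = changes_to_messages(event)
    for message in messages:
        publish(message)
    return len(messages)


sample_event = {
    "Records": [
        {
            "eventName": "MODIFY",
            "dynamodb": {"NewImage": {"item_id": {"S": "a1"}, "status": {"S": "done"}}},
        },
        {"eventName": "REMOVE", "dynamodb": {"Keys": {"item_id": {"S": "b2"}}}},
    ]
}
sent = handler(sample_event, None)
```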
I've been working on an AWS serverless application for several months now, and am amazed at the variety of services available. The rate of improvement and new features being added is enough to leave you out-of-breath.
I'm still trying to wrap my mind around the limitations of AWS Lambda, especially now that AWS API Gateway opens up a lot of options for serving REST requests with Lambda.
I'm considering building a web app in Angular with Lambda serving as the back-end.
For simple CRUD stuff it seems straightforward enough, but what about authentication? Would I be able to use something like Passport within Lambda to do user authentication?
Yes, you can do pretty much anything; just store your session in an AWS-hosted database (RDS, DynamoDB, etc). But be aware of exactly what you are buying with Lambda. It has a lot of trade-offs.
Price: An EC2 server costs a fixed price per month, but Lambda has a cost per call. Which is cheaper depends on your usage patterns. Lambda is cheaper when nobody is using your product; EC2 is most likely cheaper as usage increases.
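The price trade-off comes down to a break-even point, which is easy to estimate. The figures below are illustrative assumptions only, not actual AWS prices; plug in current pricing for your region and function size.

```python
from decimal import Decimal


def break_even_calls(fixed_monthly: str, cost_per_call: str) -> int:
    """Monthly call count at which per-call billing equals the fixed fee."""
    return int(Decimal(fixed_monthly) / Decimal(cost_per_call))


def cheaper_option(calls_per_month: int, fixed_monthly: str, cost_per_call: str) -> str:
    """Return which billing model costs less for the given monthly volume."""
    per_call_total = Decimal(cost_per_call) * calls_per_month
    return "per-call" if per_call_total < Decimal(fixed_monthly) else "fixed"


FIXED = "10"           # assumed fixed monthly server cost, in dollars
PER_CALL = "0.000002"  # assumed all-in cost per invocation, in dollars

print(break_even_calls(FIXED, PER_CALL))
print(cheaper_option(100_000, FIXED, PER_CALL))
print(cheaper_option(50_000_000, FIXED, PER_CALL))
```

With these assumed numbers the two models cost the same at five million calls per month; below that the per-call model wins, above it the fixed server does.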
Scale: EC2 can scale (in many ways), but it's more "manual" and "chunky" (you can only run 1 server or 2, not 1.5). Lambda has fine-grained scaling. You don't worry about it, but you also have less control over it.
Performance: Lambda is a certain speed, and you have very little control. It may have huge latencies in some cases, as they spin up new containers to handle traffic. EC2 gives you many more options for performance tuning. (Box size, on-box caches, using the latest node.js, removing un-needed services from the box, being able to run strace, etc) You can pay for excess capacity to ensure low latency.
Code: The way you code will be slightly different in Lambda vs EC2. Lambda forces you to obey some conventions that are mostly best practice. But EC2 allows you to violate them for performance, or just speed of development. Lambda is a "black box" where you have less control and visibility when you need to troubleshoot.
Setup: Lambda is easier to set up and requires less knowledge overall. EC2 requires you to be a sysadmin and understand acronyms like VPC, EBS, VPN, AMI, etc.
Posting this here, since this is the first thread I found when searching for running Node.js Passport authentication on Lambda.
Since you can run Express apps on Lambda, you really could run Passport on Lambda directly. However, Passport is really middleware specifically for Express, and if you're designing for Lambda in the first place you probably don't want the bloat of Express (since API Gateway basically does all that).
As @Jason has mentioned, you can utilize a custom authorizer. This seems pretty straightforward, but who wants to build all the possible auth methods? That's one of the advantages of Passport: people have already done this for you.
If you're using the Serverless Framework, someone has built out the "Serverless-authentication" project. This includes modules for many of the standard auth providers: Facebook, Google, Microsoft. There is also a boilerplate for building out more auth providers.
It took me a good bunch of research to run across all of this, so hopefully it will help someone else out.
but what about authentication?
The most modular approach is to use API Gateway's Custom Authorizers (new since Feb'16) to supply an AWS Lambda function that implements authentication and authorization.
I wrote a generic Custom Authorizer that works with Auth0 as the 3rd-party Single-Sign-On service.
See this question also: How to - AWS Rest API Authentication
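For orientation, a token-based custom authorizer boils down to a Lambda function that inspects the incoming token and returns an IAM policy allowing or denying `execute-api:Invoke` on the requested method. The sketch below is not the Auth0 authorizer mentioned above; the shared-secret check is a deliberately simplistic placeholder, and a real authorizer would verify a signed token instead.

```python
import hmac

EXPECTED_TOKEN = "example-secret-token"  # hypothetical; replace with real token verification


def build_policy(principal_id: str, effect: str, method_arn: str) -> dict:
    """Build the authorizer response API Gateway expects."""
    return {
        "principalId": principal_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {"Action": "execute-api:Invoke", "Effect": effect, "Resource": method_arn}
            ],
        },
    }


def handler(event, context):
    """Lambda entry point for a token-based authorizer."""
    token = event.get("authorizationToken", "")
    # Constant-time comparison avoids leaking how much of the token matched.
    allowed = hmac.compare_digest(token, EXPECTED_TOKEN)
    return build_policy("user", "Allow" if allowed else "Deny", event["methodArn"])


# Hypothetical method ARN, for illustration only.
arn = "arn:aws:execute-api:us-east-1:123456789012:abcdef1234/prod/GET/items"
granted = handler({"authorizationToken": "example-secret-token", "methodArn": arn}, None)
denied = handler({"authorizationToken": "wrong", "methodArn": arn}, None)
```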
Would I be able to use something like Passport within Lambda to do user authentication?
Not easily. Passport relies on callback URLs which you would have to create and configure.
Objective: Using an iPhone app, I would like the users to store objects in DynamoDB and have Fine-Grained Access Control for the objects using IAM with TVM.
The objects will contain only Strings, no images/file storage -- I'm thinking I won't need S3?
Question: Since there is no server-side application, do I still need an EC2 instance? Which AWS services will I have to subscribe to in order to accomplish my objective?
You can use either DynamoDB (or S3), and neither of them would require an EC2 instance - there is no dependency.
If it were me, I'd first see if I could get what I wanted done in S3 (because you mentioned it as a possibility), and then go to DynamoDB if I couldn't (i.e. if I wanted to be able to run aggregation queries across my data set). S3 will be cheaper and, depending on what you are doing, may even be faster. It would also allow you to easily distribute the stored data globally through CloudFront, which may be beneficial if you have a globally diverse user base.
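If DynamoDB is chosen, the fine-grained access control from the question is usually expressed as an IAM policy that restricts each user to items whose partition key equals their own identifier, via the `dynamodb:LeadingKeys` condition key. A minimal sketch of such a per-user policy follows; the table ARN and user ID are hypothetical, and it assumes the table's partition key holds the user's ID.

```python
import json


def build_user_table_policy(table_arn: str, user_id: str) -> str:
    """Return a JSON policy limiting a user to items keyed by their own ID."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["dynamodb:GetItem", "dynamodb:PutItem", "dynamodb:Query"],
                "Resource": [table_arn],
                "Condition": {
                    # Every partition key touched by the request must be the user's ID.
                    "ForAllValues:StringEquals": {"dynamodb:LeadingKeys": [user_id]}
                },
            }
        ],
    }
    return json.dumps(policy)


# Hypothetical table and user, for illustration only.
user_policy = build_user_table_policy(
    "arn:aws:dynamodb:us-east-1:123456789012:table/ExampleObjects", "user-123"
)
print(user_policy)
```

A token vending machine would attach a policy like this when it issues temporary credentials for the app user.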