A company's data warehouse receives orders from multiple ordering systems. This data needs to be stored, and sales commissions need to be paid based on the sales made in each state. A mapping table associates each state with a sales manager. How would you implement such a solution? Which AWS services would you use? What are the major design decisions you would take to ensure that a payment is accurately tracked?
You would need a database to store the information, such as Amazon RDS for MySQL. It doesn't sound like the data volume or usage justifies a Data Warehouse solution like Amazon Redshift.
You'll also need to run some application logic somewhere, presumably on an Amazon EC2 instance.
The design of the application, including ensuring that the "payment is accurately tracked", is totally your responsibility. AWS provides the infrastructure for such a system, but not the software application.
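To make that concrete, here is a minimal sketch of the commission-tracking logic, independent of whichever database you choose. The mapping table contents, the flat commission rate, and the order fields are all assumptions for illustration; the key design decision shown is deduplicating on a unique order ID, so that a commission is never paid twice even if an ordering system delivers the same order more than once.

```python
# Hypothetical state-to-manager mapping table and commission rate.
STATE_TO_MANAGER = {"CA": "alice", "NY": "bob"}
COMMISSION_RATE = 0.05  # assumed flat 5% rate

def compute_commissions(orders, processed_ids):
    """Aggregate commission per manager, skipping orders already paid out.

    Deduplicating on a unique order ID is one way to ensure a payment is
    tracked exactly once, even if an ordering system sends a duplicate.
    """
    totals = {}
    for order in orders:
        oid = order["order_id"]
        if oid in processed_ids:
            continue  # idempotency: never pay commission twice
        manager = STATE_TO_MANAGER.get(order["state"])
        if manager is None:
            continue  # unmapped state; flag for manual review in practice
        totals[manager] = totals.get(manager, 0.0) + order["amount"] * COMMISSION_RATE
        processed_ids.add(oid)
    return totals
```

In a real system the set of processed IDs would live in the database (for example, as a unique constraint on the order ID column) rather than in memory.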
We are building a customer-facing App. For this app, data is being captured by IoT devices owned by a third party, and is transferred to us from their server via API calls. We store this data in our Amazon DocumentDB cluster. The user App is connected to this cluster, with real-time data feed requirements. Note: the data is time-series data.
The thing is, for long-term data storage and for creating analytics dashboards to be shared with stakeholders, our data governance folks are asking us to replicate/copy the data daily from the Amazon DocumentDB cluster to their Google Cloud Platform BigQuery instance. We could then run queries directly on BigQuery to perform analysis and feed the data into, say, Explorer or Tableau to create dashboards.
I couldn't find any straightforward solutions for this, so any ideas, comments or suggestions are welcome. How do I achieve or plan the above replication, and how do I make sure the data is copied efficiently, in terms of both memory and pricing? I also don't want to disturb the performance of the Amazon DocumentDB cluster, since it supports our user-facing App.
This solution would need some custom implementation. You can utilize change streams and process the data changes in intervals, sending them to BigQuery, which gives you a data replication mechanism for running analytics. One of the documented use cases for change streams is analytics with Redshift, so BigQuery should serve a similar purpose.
Using Change Streams with Amazon DocumentDB:
https://docs.aws.amazon.com/documentdb/latest/developerguide/change_streams.html
This document also contains a sample Python code for consuming change streams events.
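As a hedged sketch of what that custom implementation could look like, the following tails a change stream with pymongo (DocumentDB is MongoDB-compatible) and batches events for a streaming insert on the BigQuery side. The collection name, batch size, and row schema are all assumptions; the BigQuery side is represented by a generic `sink` callable (for example, a wrapper around google-cloud-bigquery's `insert_rows_json`).

```python
def event_to_row(event):
    """Flatten one change-stream event into a dict shaped for a BigQuery
    streaming insert. The chosen fields are illustrative only."""
    doc = event.get("fullDocument") or {}
    return {
        "op": event["operationType"],
        "doc_id": str(event["documentKey"]["_id"]),
        "payload": doc,
    }

def replicate(collection, sink, batch_size=500):
    """Tail the collection's change stream and flush rows to `sink` in
    batches, keeping the analytics load on the cluster small.

    `collection` is a pymongo Collection pointed at the DocumentDB
    cluster; change streams must first be enabled on it (see the AWS
    documentation linked above).
    """
    batch = []
    with collection.watch(full_document="updateLookup") as stream:
        for event in stream:
            batch.append(event_to_row(event))
            if len(batch) >= batch_size:
                sink(batch)  # e.g. push the batch to BigQuery here
                batch = []

# Usage (the connection string is a placeholder):
#   from pymongo import MongoClient
#   client = MongoClient("mongodb://user:pass@docdb-cluster:27017/?tls=true")
#   replicate(client["appdb"]["readings"], sink=print)
```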
I have an app, built upon multiple AWS services, that provides a document storage service for my users.
Rather than track my users' usage based on the way they use my app, and then multiply their usage by the cost of each service they consume (doing the calculation myself), I was wondering whether there is a way to do this automatically (having AWS track my users at a granular per-user level and compute per-user costs itself)?
For example, when a user consumes some AWS service, is there an option to provide an identifier to AWS, so that AWS tracks usage and computes the costs of individual IDs itself? That way it would be much simpler to just ask AWS how much my users are consuming and charge them appropriately.
It appears that you want to determine the costs of your service so that you can pass these costs on to your customers.
I would recommend that you re-think your pricing strategy. Rather than charging as "cost plus some profit", focus your pricing with these rules in mind:
Charge for things that are of value to your customers (that they want to do)
Charge for things that you want them to do less of (behaviour you don't want to encourage)
Make everything else free
Think about how this applies to other services that charge money:
Water utilities: Charge for providing water, but charge extra for consuming too much water
Netflix: Charge for providing shows, but charge extra for consuming more bandwidth (4K, multiple accounts)
Cell phone: Charge for service, but charge extra for consuming more data
Each of these organizations incurs extra costs for providing more water, bandwidth or data, so they would prefer that people either don't consume much of them, or pay extra for doing so.
If your application provides benefits to customers for storing documents, then they will pay for your service. However, you will incur extra costs for storing more documents, so you should charge extra for consuming more storage. Everything else, if possible, should be free.
Therefore, don't calculate how much each user costs you for running the Amazon EC2 instances, Data Transfer bandwidth, domain names, database storage, support staff, programmers, management and your time. Instead, concentrate on the value you are giving to your customers and they should be willing to pay if the benefit is greater than the cost. Find the element that really costs you more if they over-consume and charge extra for that. For example, if somebody stores more documents, which consumes more space in Amazon S3 and costs more Data Transfer, then charge them for having more documents. Don't charge them based on some technical aspect like the size of EC2 instance you are using.
To answer your specific question, you could use tagging to identify resources that relate to a specific user, such as objects stored in Amazon S3. However, you'll probably find that most of your costs are shared costs (e.g. you can't split up an EC2 instance hosting a web app or a database used by all customers). Your application database is probably already keeping track of their usage, so tagging wouldn't necessarily provide much additional insight unless specific resources are being consumed by specific users.
See: AWS Tagging Strategies
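To sketch the tagging idea with boto3 (the bucket name and the `user-id` tag key are my own invention, not an AWS convention): tag each S3 object with its owning user at upload time, then total per-user storage from a bucket listing. Note this is usage metering that you would multiply by S3's prices yourself; object tags don't automatically show up as per-user line items in your bill.

```python
def upload_tagged(s3, bucket, user_id, key, body):
    """Upload a document tagged with its owning user.

    `s3` is a boto3 S3 client; the 'user-id' tag key is an arbitrary
    choice for this example.
    """
    s3.put_object(Bucket=bucket, Key=f"{user_id}/{key}", Body=body,
                  Tagging=f"user-id={user_id}")

def sum_usage_by_user(objects):
    """Total storage per user from (user_id, size_bytes) pairs.

    In practice the pairs would come from listing the bucket (or from an
    S3 Inventory report) and reading each object's owner tag.
    """
    usage = {}
    for user_id, size in objects:
        usage[user_id] = usage.get(user_id, 0) + size
    return usage
```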
I'm looking for a list of all the AWS products and their pricing information, and I came across the Bulk API offer index file. This file contains the links to the pricing info for AWS services (such as SNS). However, the file contains many products. I don't understand the difference between a service and a product; shouldn't there be only one product for each service?
In this context, a service is an AWS service, like SNS, SQS, S3, etc. The products represent each individual usage and billing component associated with that service, such as (in the case of SNS) various types of message delivery events (per event), or outbound/inbound data transfer (per GB).
Note that billing components are defined for each individual product. When there is no charge for a particular product -- such as is sometimes the case for certain classes of data transfer -- there is still a billing component, but it happens to be priced as $0.00 per unit.
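For illustration, the offer index can be fetched and unpacked with nothing but the standard library. The URL below is the public Bulk API offer index endpoint; each entry under "offers" is a service, and the offer file it points to enumerates that service's products and price dimensions.

```python
import json
import urllib.request

OFFER_INDEX_URL = "https://pricing.us-east-1.amazonaws.com/offers/v1.0/aws/index.json"

def list_offer_files(index):
    """Map each service (offer code) in the parsed index to the path of
    its current offer file, which in turn lists the products."""
    return {code: entry["currentVersionUrl"]
            for code, entry in index.get("offers", {}).items()}

def fetch_offer_index(url=OFFER_INDEX_URL):
    """Download and parse the offer index (network access required)."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

# Usage:
#   for code, path in sorted(list_offer_files(fetch_offer_index()).items()):
#       print(code, path)
```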
If I want to utilize Amazon Web Services to provide the hardware (cores and memory) to process a large amount of data, do I need to upload that data to AWS? Or can I keep the data on the system and rent the hardware?
Yes, in order for an AWS-managed system to process a large amount of data, you will need to upload the data to an AWS region for processing at some point. AWS does not rent out servers to other physical locations, as far as I'm aware (EDIT: actually, AWS does have an offering for on-premises data processing as of Nov 30 2016, see Snowball Edge).
AWS offers a variety of services for getting large amounts of data into its data centers for processing (ranging from basic HTTP uploads to physically mailing disk drives for direct data import), and the best service to use will depend entirely on your specific use case, needs and budget. See the Cloud Data Migration overview page for a rundown of the various services and help selecting the most appropriate one.
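One rough way to choose between a network upload and a physical transfer device is simple arithmetic on the transfer time. The formula below uses decimal units and ignores protocol overhead, so treat it as a back-of-the-envelope estimate only.

```python
def transfer_days(data_tb, mbps):
    """Days to move `data_tb` terabytes over a `mbps` megabit/s link
    (decimal units, no protocol overhead). When this runs into weeks,
    a physical option such as Snowball starts to make sense."""
    bits = data_tb * 8 * 10**12
    return bits / (mbps * 10**6) / 86400
```

For example, 1 TB over a 100 Mbps link takes about a day, while 100 TB over the same link takes roughly three months, which is when shipping a device becomes attractive.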
Is there some way to purchase a product on Amazon via the API?
Currently I'm buying several products on a daily basis, where each product can be delivered to a different address, and each time I have to go through the checkout phase on Amazon (many clicks).
According to my searches (for example Programmatically make Amazon purchase?) it seems that there is no way to purchase a product via the API and I understand the reasons for that.
However, I wonder if there is some other way to automate the process of ordering multiple products on Amazon.
Another way to do it would be to automate the browser with Selenium. Of course, this would require updating the code every time the Amazon website changes.
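A minimal Selenium sketch of that approach follows. The product URL, the element ID and the rest of the checkout flow are placeholders, and every selector is fragile and will break when the site changes.

```python
def order_plan(products, addresses):
    """Pair each product URL with its delivery address, one order each."""
    return list(zip(products, addresses))

def place_orders(plan):
    """Drive a real browser through each order in the plan."""
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    try:
        for url, address in plan:
            driver.get(url)
            driver.find_element(By.ID, "add-to-cart-button").click()
            # ...continue through checkout, selecting `address`; each
            # step needs its own (fragile, site-specific) selectors.
    finally:
        driver.quit()

# Usage (placeholder URL and address):
#   place_orders(order_plan(["https://www.amazon.com/dp/B000000000"],
#                           ["123 Example St"]))
```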