Say I just issue a daily COPY command, as opposed to streaming all my data into Redshift immediately. Does that mean I have a really low percentage usage, and therefore a low bill?
According to the Amazon simple monthly calculator, using 10 ds1.xlarge on-demand nodes will run me $6,844.20 a month.
However, if I only use those nodes for one hour a day, it will only run me $263.50 a month.
To be more specific, there are two strategies I'm considering. One is to send my data (which comes in at a rate of hundreds of records per second) to a Firehose stream, which is pointed at a Redshift cluster (with an intermediate S3 bucket of course). The other strategy is to send my data to a different Firehose stream, which is pointed at an S3 bucket; then, I issue a daily COPY command (through JDBC). Let's assume that I read very rarely from my database, such that the total amount of time spent COPYing and reading in my database does not exceed one hour per day.
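For concreteness, the daily load in the second strategy would look roughly like this. This is a minimal sketch using Python's psycopg2 rather than a JDBC driver; the cluster endpoint, table, bucket prefix, and IAM role are all placeholders:

    # Once-a-day COPY of everything Firehose wrote to S3 into Redshift.
    import psycopg2

    conn = psycopg2.connect(
        host="my-cluster.xxxxxxxx.us-east-1.redshift.amazonaws.com",  # placeholder endpoint
        port=5439,
        dbname="events",
        user="admin",
        password="...",
    )
    with conn, conn.cursor() as cur:
        # Load the whole daily prefix in a single COPY statement.
        cur.execute("""
            COPY events_raw
            FROM 's3://my-firehose-bucket/some/daily/prefix/'
            IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftCopyRole'
            JSON 'auto' GZIP;
        """)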
You pay for Redshift by the server hour, just like EC2, RDS and ElastiCache. You are reserving a specific amount of server resources and you pay for that each hour that it exists, regardless of actual "usage".
The "Usage" field in the calculator defaults to "100% Utilized/Month" which would result in the price of a Redshift cluster that existed for the entire month. By changing it to "1 Hours/Day" you have indicated to the price calculator that you plan to create a Redshift cluster once a day, and delete it before it has existed for more than an hour, and then do that again the next day, every day of the month.
The amount of time you spend copying/updating/reading from your Redshift cluster has no bearing on the monthly price of the cluster.
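If you really want to pay for only about an hour a day, the cluster itself has to be created and deleted around each load. A rough sketch with boto3 follows; the identifiers are placeholders, and in practice you would restore from the previous day's snapshot rather than creating an empty cluster:

    import boto3

    redshift = boto3.client("redshift")

    # Bring the cluster up just before the daily COPY.
    redshift.create_cluster(
        ClusterIdentifier="daily-load-cluster",   # placeholder
        NodeType="ds1.xlarge",                    # node type from the question
        NumberOfNodes=10,
        MasterUsername="admin",
        MasterUserPassword="...",
    )
    redshift.get_waiter("cluster_available").wait(ClusterIdentifier="daily-load-cluster")

    # ... run the COPY and any queries, then tear the cluster down,
    # keeping a final snapshot so tomorrow's cluster can be restored from it.
    redshift.delete_cluster(
        ClusterIdentifier="daily-load-cluster",
        SkipFinalClusterSnapshot=False,
        FinalClusterSnapshotIdentifier="daily-load-snapshot",  # must be unique per delete
    )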
Let's suppose I have a DynamoDB table with 10 frequently accessed items of around 8 KB each.
I decided to use DAX in front of the table.
I get a total of 1 million read requests for these items.
a. Will I be charged for only 10 DynamoDB requests, since only 10 requests made it to DynamoDB and the rest were served from the DAX cache itself,
or
b. will I be charged for all 1 million DynamoDB requests?
I had a similar question, and asked AWS. The answer I received was:
Whenever DAX has the item available (a cache hit), DAX returns the item to the application without accessing DynamoDB. In that case, the request will not consume read capacity units (RCUs) from the DynamoDB table, and hence there will not be any DynamoDB cost for that request. Therefore, if you have 10k requests and only 2k of them go to DynamoDB, the total charge will be the read request charge for those 2k DynamoDB requests, plus the running cost of the DAX cluster and data transfer charges (if applicable).
DynamoDB charges for DAX capacity by the hour and your DAX instances run with no long-term commitments. Pricing is per node-hour consumed and is dependent on the instance type you select. Each partial node-hour consumed is billed as a full hour. Pricing applies to all individual nodes in the DAX cluster. For example, if you have a three-node DAX cluster, you are billed for each of the separate nodes (three nodes in total) on an hourly basis.
https://aws.amazon.com/dynamodb/pricing/on-demand/
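Putting the two parts of that answer together for the scenario in the question, here is a back-of-the-envelope sketch; the per-request and per-node-hour prices below are illustrative assumptions, not quoted rates:

    # 1M reads of 10 distinct ~8KB items, with DAX in front of the table.
    total_reads     = 1_000_000
    cache_misses    = 10        # roughly one cold read per item reaches DynamoDB
    rrus_per_miss   = 2         # an ~8KB strongly consistent read = 2 read request units
    price_per_m_rru = 0.25      # assumed on-demand price per million read request units (USD)

    dynamodb_cost = cache_misses * rrus_per_miss * price_per_m_rru / 1_000_000

    dax_nodes      = 3
    dax_node_rate  = 0.27       # assumed USD per node-hour; depends on instance type/region
    hours_in_month = 730
    dax_cost = dax_nodes * dax_node_rate * hours_in_month

    print(f"DynamoDB reads: ${dynamodb_cost:.6f}")   # effectively zero
    print(f"DAX cluster:    ${dax_cost:.2f}")        # the dominant cost

In other words, the answer is (a): you are charged only for the requests that actually reach DynamoDB, and the DAX node-hours become the cost that matters.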
In the GCP user interface I can estimate the pricing for whatever disk size I wish to use, but when I want to create my Bigtable instance I can only choose the number of nodes, and each node comes with 2.5 TB of SSD or HDD disk.
Is there a way to, for example, set up a Bigtable cluster with 1 node and 1 TB of SSD instead of the default 2.5 TB?
Even in the GCP pricing calculator I can change the disk size, but I can't find where to configure it when creating the cluster (https://cloud.google.com/products/calculator#id=2acfedfc-4f5a-4a9a-a5d7-0470d7fa3973)
Thanks
If you only want a 1TB database, then only write 1TB and you'll be charged accordingly.
From the Bigtable pricing documentation:
Cloud Bigtable frequently measures the average amount of data in your Cloud Bigtable tables during a short time interval. For billing purposes, these measurements are combined into an average over a one-month period, and this average is multiplied by the monthly rate. You are billed only for the storage you use, including overhead for indexing and Cloud Bigtable's internal representation on disk. For instances that contain multiple clusters, Cloud Bigtable keeps a separate copy of your data with every cluster, and you are charged for every copy of your data.
When you delete data from Cloud Bigtable, the data becomes inaccessible immediately; however, you are charged for storage of the data until Cloud Bigtable compacts the table. This process typically takes up to a week.
In addition, if you store multiple versions of a value in a table cell, or if you have set an expiration time for one of your table's column families, you can read the obsolete and expired values until Cloud Bigtable completes garbage collection for the table. You are also charged for the obsolete and expired values prior to garbage collection. This process typically takes up to a week.
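In short, nodes and storage are billed separately, so writing only 1 TB means you pay storage for roughly 1 TB, not for the 2.5 TB per-node limit. A minimal cost sketch follows; the per-node-hour and per-GB rates are placeholders, so check the pricing page for your region:

    # Illustrative Bigtable monthly cost: nodes and storage are independent line items.
    hours_in_month  = 730
    node_count      = 1
    node_rate       = 0.65      # assumed USD per node-hour (placeholder)
    ssd_rate_per_gb = 0.17      # assumed USD per GB-month of SSD storage (placeholder)
    data_stored_gb  = 1_000     # average data actually stored, incl. internal overhead

    node_cost    = node_count * node_rate * hours_in_month
    storage_cost = data_stored_gb * ssd_rate_per_gb
    print(f"Nodes:   ${node_cost:.2f}")
    print(f"Storage: ${storage_cost:.2f}")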
They say there is always a monthly fee. But does it get charged as the month progresses or right away?
Amazon WorkSpaces is charged on a pro-rata basis.
So, if you run one for one week and turn it off, you'll be charged approximately one quarter of the monthly fee.
The first month's fee and the last month's fee are similarly pro-rated, based upon how long the WorkSpace was running.
There is no charge for creating a WorkSpace.
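A minimal sketch of the pro-rating, with a placeholder monthly fee:

    # Run a WorkSpace for 7 days of a 30-day month and then delete it.
    monthly_fee   = 35.00     # placeholder bundle price, USD/month
    days_running  = 7
    days_in_month = 30

    charge = monthly_fee * days_running / days_in_month
    print(f"${charge:.2f}")   # roughly a quarter of the monthly fee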
Suppose I have a script which uploads a 100GB object every day to my S3 bucket. This same script will delete any file older than 1 week from the bucket. How much will I be charged at the end of the month?
Let's use pricing from the us-west-2 region. Suppose this is a 30-day month and I start with no data in the bucket at the beginning of the month.
If charged for maximum bucket volume per month, I would have 700GB at the end of the month and be charged $0.023 * 7 * 100 = $16.10. Also some money for my PUT requests ($0.005 per 1,000 requests so effectively 0).
If charged for total amount of data that had transited through the bucket over the course of that month, I would be charged $0.023 * 30 * 100 = $69. (again +effectively $0 for PUT requests)
I'm not clear on which of these two cases Amazon bills. This becomes very important for me, since I expect to have a high amount of churn in my bucket.
Both of your calculations are incorrect, although the first one comes close to the right answer, for the wrong reason. It is neither peak nor end-of-month that matters.
The charge for storage is calculated hourly. For all practical purposes, this is the same as saying that you are billed for your average storage over the course of a month -- not your maximum, and not the amount you uploaded.
Storing 30 GB for 30 days or storing 900 GB for 1 day would cost the same amount, $0.69.
The volume of storage billed in a month is based on the average storage used throughout the month. This includes all object data and metadata stored in buckets that you created under your AWS account. We measure your storage usage in “TimedStorage-ByteHrs,” which are added up at the end of the month to generate your monthly charges.
https://aws.amazon.com/s3/faqs/#billing
This is true for STANDARD storage.
STANDARD_IA and GLACIER are also billed hourly, but there is a notable penalty for early deletion: Each object stored in these classes has a minimum billable lifetime of 30 days in IA or 90 days in Glacier, no matter when you delete it. Both of these alternate storage classes are only appropriate for data you do not intend to delete soon or retrieve often, by design.
REDUCED_REDUNDANCY storage follows the same rules as STANDARD (hourly billing, no early delete penalty) but after the most recent round of price decreases, it is now only less expensive than STANDARD in regions with higher costs. It is an older offering that is no longer competitively priced in regions where STANDARD pricing is lowest.
Your bill for storage will be closer to your #1 example, perhaps a bit higher because, for brief amounts of time while uploading the 8th day's object, you still have 7 days of storage accruing charges; but you won't be charged anywhere near your #2 example.
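To make that concrete, here is a sketch of the average-storage math for the pattern in the question (STANDARD storage in us-west-2 at the $0.023/GB-month rate quoted above):

    # Rolling 7 x 100 GB pattern: day 1 holds 100 GB, day 2 holds 200 GB, ...
    # and from day 7 onward roughly 700 GB (briefly 800 GB while the newest upload
    # overlaps with the not-yet-deleted oldest object).
    price_per_gb_month = 0.023
    daily_gb = [min(day, 7) * 100 for day in range(1, 31)]   # GB held on each of 30 days

    average_gb = sum(daily_gb) / len(daily_gb)               # what you are billed on
    print(f"Average stored: {average_gb:.0f} GB")            # 630 GB in the first, ramp-up month
    print(f"Monthly cost:   ${average_gb * price_per_gb_month:.2f}")   # ~$14.49; ~$16.10 at steady state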
Firstly, you don't need a script to delete files older than 1 week. You can set a lifecycle rule on the bucket which will do that automatically, or transition the contents to Glacier (at roughly 10% of the cost) if you might need them later; see the sketch at the end of this answer.
Secondly, the storage cost might not be huge. A better approach would be for the script to delete the old data from S3 first (if you want the script to do that) and only then add the new data, so that the bucket never holds more data than necessary and you are always charged on a consistent storage basis.
Thirdly, your main charge could be bandwidth (if not handled well), which can be really large since you are transferring so much data. If all this data is generated internally from your own infrastructure, make sure you create a VPC endpoint for S3 so that you don't pay bandwidth charges, as the data transfer is then treated as internal traffic.
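The lifecycle rule mentioned in the first point can be set once with boto3; a minimal sketch, with the bucket name and rule ID as placeholders:

    import boto3

    s3 = boto3.client("s3")

    # Expire objects automatically 7 days after creation, instead of scripting deletes.
    s3.put_bucket_lifecycle_configuration(
        Bucket="my-bucket",                        # placeholder
        LifecycleConfiguration={
            "Rules": [
                {
                    "ID": "expire-after-7-days",
                    "Filter": {"Prefix": ""},      # apply to the whole bucket
                    "Status": "Enabled",
                    "Expiration": {"Days": 7},
                    # Or transition to Glacier instead of expiring:
                    # "Transitions": [{"Days": 7, "StorageClass": "GLACIER"}],
                }
            ]
        },
    )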
I would like to ask which service would suit me best. For example, a Facebook-like mobile app where I need to track every movement of a user, such as the pages visited or links clicked.
I am thinking of using DynamoDB and creating multiple tables to track the different activities. When I run my analytics app, it will query all the data from each table (same hash key but different range keys, so I can query all the data) and compute the result in the app. So the main cost is the read throughput, which can easily be 250 reads/s (~$28/month) for each table. The storage for each table has no limit, so is it free?
For Redshift, I would be paying for storage on a 100%-utilized-per-month basis for 160 GB. That will cost me about $14.62/month. Although it looks cheaper, I am not familiar with Redshift, so I am not sure what other hidden costs there are.
Thanks in advance!
Pricing for Amazon DynamoDB has several components:
Provisioned Throughput Capacity (the speed of the tables)
Indexed Data Storage (the cost of storing data)
Data Transfer (for data going from AWS to the Internet)
For example, 100GB of data storage would cost around $25.
If you want 250 reads/second, read capacity costs $0.0065 per hour for every 50 units, which is $0.0065 * 5 blocks of 50 units * 24 hours * 30 days = $23.40 (plus some write capacity units).
Pricing for Amazon Redshift is based upon the number and type of nodes. A 160GB dc1.large node would cost 25c/hour * 24 hours * 30 days = $180 per node (but only one node is probably required for your situation).
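Putting those two calculations side by side, using the same figures (a sketch; the prices quoted here may have changed since):

    # Monthly estimate using the numbers above.
    hours_in_month = 24 * 30

    # DynamoDB: 100 GB of storage plus 250 reads/second of provisioned capacity.
    ddb_storage = 100 * 0.25                               # ~$25 at $0.25/GB-month
    ddb_reads   = (250 / 50) * 0.0065 * hours_in_month     # ~$23.40; write capacity extra

    # Redshift: a single 160 GB dc1.large node.
    redshift_node = 0.25 * hours_in_month                  # ~$180

    print(f"DynamoDB: ${ddb_storage + ddb_reads:.2f}")     # ~$48.40
    print(f"Redshift: ${redshift_node:.2f}")               # ~$180.00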
Amazon Redshift therefore comes out as more expensive, but it is also a more-featured system. You can run complex SQL against Amazon Redshift, whereas you would have to write an application to retrieve, join and compute information from DynamoDB. Think of DynamoDB as a storage service, while Redshift is also a querying service.
The real decision, however, should be based on how you are going to use the data. If you can create an application that will work with DynamoDB, then use it. However, many people find the simplicity of using SQL on Redshift to be much easier.