How to calculate the monthly price for AWS OpenSearch Serverless?

I want to know the minimum monthly price for the serverless option of OpenSearch.
The docs say:
"you will be billed for a minimum of 4 OCUs (2x indexing includes
primary and standby, and 2x search includes one replica for HA) for
the first collection in an account."
A minimum of 4 OCUs per hour: 4 * $0.24 * 24 * 30 = $691.20 per month, plus storage (S3) cost.
Is that correct, or am I misunderstanding something?
I am asking because it looks quite pricey, and we could use provisioned servers instead to reduce the price.
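For reference, the arithmetic above as a small Python sketch (the $0.24 per OCU-hour rate is the one quoted in the question; check the OpenSearch Serverless pricing page for your region):
    # Minimum OpenSearch Serverless compute: 4 OCUs billed for every hour of the month
    ocu_hourly_price = 0.24          # USD per OCU-hour, the rate quoted above
    min_ocus = 4                     # 2x indexing + 2x search for the first collection
    hours_per_month = 24 * 30

    monthly_compute = min_ocus * ocu_hourly_price * hours_per_month
    print(f"~${monthly_compute:.2f} per month, plus managed storage")   # ~$691.20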

Related

Amazon obtain current buybox prices

I have a list containing Amazon products (ASINs). I want to update the buybox price in my list roughly every 5 hours. I am a registered Amazon seller, so I do have access to the Selling Partner API - Amazon-Services-API. But the issue here is the rate limit: it is only 0.5 requests per second.
I have around 500k products in my list, so it would take multiple days at a rate limit of 0.5 requests per second.
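To quantify "multiple days", a quick throughput sketch in Python (using the 500k products and 0.5 requests/second figures above):
    # Time for one full sweep of the catalog at the documented rate limit
    products = 500_000
    requests_per_second = 0.5        # Selling Partner API rate limit quoted above

    seconds = products / requests_per_second
    print(f"~{seconds / 86_400:.1f} days per full refresh")   # ~11.6 days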
There are several tools like scanunlimited or analyzer.tools which are able to obtain the current buybox price of a product far faster. Where are they getting their live data from? Am I missing out on some API?
Does anyone have an idea how I can gather the data more quickly than 0.5 requests per second?
Kind regards

AWS Dynamo DB Free Tier Limits

I am new to DynamoDB, and I have a small in-house application which will be used by my parents for their small business. I just have to keep records of 10 - 20 rows daily, and will have a few edits, close to 5 - 10 at most.
Will I be able to use the Free Tier of DynamoDB for the same?
I am using Heroku to host my LWC OSS (Node.js) application, which is again a free version. If not, then any pointers to a particular type of database that can fulfil my need would be appreciated.
Will I be able to use the Free Tier of DynamoDB for the same?
Yes, depending on the size of the items you write and the rate at which you write them.
Amazon DynamoDB offers a free tier with the following provisions, which is enough to handle up to 200M requests per month (roughly 65M writes plus 130M eventually consistent reads over a 30-day month):
25 GB of Storage
25 provisioned Write Capacity Units (WCU)
25 provisioned Read Capacity Units (RCU)
Just be aware of the fact that:
25 WCU is 25 writes per second for items up to 1 KB, or 5 writes per second for items up to 5 KB, etc.
25 RCU is 50 eventually consistent reads per second for items up to 4 KB, or 10 eventually consistent reads per second for items up to 20 KB, etc.
If your API calls fall within the above criteria, you'll be within the free tier.
The main cost drivers of DynamoDB are how much you read and write to the tables. AWS calls these "read capacity units" (RCU) and "write capacity units" (WCU).
When you create a DynamoDB table there are many options to choose from, but it's roughly accurate to say that:
One RCU gives you one strongly consistent read request per second (for an item up to 4 KB)
One WCU gives you one standard write request per second (for an item up to 1 KB)
So if you create a standard class table with 1 RCU and 1 WCU (the lowest possible), that would already easily accommodate what you predict you will need.
According to the AWS DynamoDB pricing page, you can get 25 WCUs and 25 RCUs in the free tier.
So I would say: choose a DynamoDB standard class table, with provisioned capacity, no auto scaling, and customized to 1 RCU and 1 WCU, and your usage will remain well within the free tier.
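For illustration, a minimal boto3 sketch of such a table (the table name and key attribute are placeholders; it assumes provisioned capacity with no auto scaling, as described above):
    import boto3

    dynamodb = boto3.client("dynamodb", region_name="us-east-1")

    # Standard table class, provisioned mode, 1 RCU / 1 WCU -- well inside the free tier
    dynamodb.create_table(
        TableName="daily-records",                                        # placeholder name
        AttributeDefinitions=[{"AttributeName": "record_id", "AttributeType": "S"}],
        KeySchema=[{"AttributeName": "record_id", "KeyType": "HASH"}],
        BillingMode="PROVISIONED",
        ProvisionedThroughput={"ReadCapacityUnits": 1, "WriteCapacityUnits": 1},
        TableClass="STANDARD",
    )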

How can I calculate the RCUs and WCUs from Cassandra for a AWS Keyspace cost estimation?

In order to consider AWS Keyspaces as an alternative to an on-prem Cassandra cluster, I'd like to do a cost estimation. However, Keyspaces pricing is based on write request units (WRUs) and read request units (RRUs).
https://aws.amazon.com/keyspaces/pricing/
Each RRU provides enough capacity to read up to 4 KB of data with LOCAL_QUORUM consistency.
Each WRU provides enough capacity to write up to 1 KB of data per row with LOCAL_QUORUM consistency
What metrics in Cassandra can be used for calculating the RCUs and WCUs for an existing cluster?
Currently we are storing iostat information (every second). Based on that information we were able to come up with approximate read and write counts (±10% error margin, 95% confidence level).
We are going to cross check our numbers with the AWS folks soon.
Example:
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
abc 0.00 0.00 1.00 0.00 0.03 0.00 64.00 0.00 0.00 0.00 0.00 0.00 0.00
We use the following calculation:
10,000 writes per second, of up to 1 KB each, in us-east-1.
Write cost:
On-demand capacity mode (at $1.45 per million WRUs; 10,000 writes/s = 0.01 million WRUs per second)
= $1.45 * 0.01 * 60 * 60 * 24 * 365 = $457,272 per year
Provisioned capacity mode (at $0.00075 per WCU-hour, provisioning 10,000 WCUs)
= $0.00075 * 10,000 * 24 * 365 = $65,700 per year
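The same comparison as a Python sketch (using the us-east-1 prices from this example: $1.45 per million on-demand WRUs and $0.00075 per provisioned WCU-hour; check the pricing page for current rates):
    writes_per_second = 10_000            # each write is <= 1 KB, so one WRU/WCU per write
    seconds_per_year = 60 * 60 * 24 * 365
    hours_per_year = 24 * 365

    # On-demand: pay per million write request units actually consumed
    on_demand = 1.45 * (writes_per_second / 1_000_000) * seconds_per_year
    print(f"on-demand:   ~${on_demand:,.0f} per year")     # ~$457,272

    # Provisioned: pay per WCU-hour for the capacity you keep provisioned
    provisioned = 0.00075 * writes_per_second * hours_per_year
    print(f"provisioned: ~${provisioned:,.0f} per year")   # ~$65,700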
Update: the AWS folks are calculating based on table partition size, which is wrong IMO.
Some accuracy can be lost by using IOPS. Cassandra has a lot of IOPS overhead. On reads, Cassandra can be reading from multiple SSTables. Cassandra also performs background compaction and repair, which consumes IOPS. This is not a factor in Amazon Keyspaces. Additionally, Keyspaces scales up and down based on utilization. Taking the average at a point in time will only give you a single dimension of cost. You need to take an average over a long period of time to cover the peaks and valleys of your workload. Workloads tend to look like sine or cosine waves rather than a flat line.
Gathering the following metrics will help provide more accurate cost estimates.
Results of the average row size report (see below)
Table live space in GBs divided by the replication factor
Average writes per second over an extended period
Average reads per second over an extended period
Storage size
Table live space in GBs
This method uses Apache Cassandra sizing statistics to determine the data size in Amazon Keyspaces. Apache Cassandra exposes storage metrics via Java Management Extensions (JMX). You can capture these metrics by using third-party monitoring tools such as DataStax OpsCenter, Datadog, or Grafana.
Capture the table live space from the cassandra.live_disk_space_used metric. Take the LiveTableSize and divide it by the replication factor of your data (most likely 3) to get an estimate of the Keyspaces storage size. Keyspaces automatically replicates data three times across multiple AWS Availability Zones, but pricing is based on the size of a single replica.
Say the table live space is 5 TB and the replication factor is 3. For us-east-1 you would use the following formula:
(Table live space in GB / Replication factor) * region storage price per GB
5,000 GB / 3 * $0.30 ≈ $500 per month
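The same numbers as a one-off Python check:
    live_space_gb = 5_000          # table live space reported by Cassandra
    replication_factor = 3
    price_per_gb_month = 0.30      # the us-east-1 storage price used in this example

    print(live_space_gb / replication_factor * price_per_gb_month)   # ~500.0 USD per month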
Collect the Row Size
Results of the Average row size report
Use the following script to collect row size metrics for your tables. The script exports table data from Apache Cassandra by using cqlsh and then uses awk to calculate the min, max, average, and standard deviation of row size over a configurable sample set of table data. Update the username, password, keyspace name, and table name placeholders with your cluster and table information. You can use dev and test environments if they contain similar data.
https://github.com/aws-samples/amazon-keyspaces-toolkit/blob/master/bin/row-size-sampler.sh
./row-size-sampler.sh YOURHOST 9042 -u "sampleuser" -p "samplepass"
The output will be used in the request unit calculation below. If your model uses large blobs, divide the average size by 2, because cqlsh exports blob values as a hexadecimal character representation (two characters per byte).
Read/write request metrics
Average writes per second/Total writes per month
Average reads per second/Total reads per month
Capturing the read and write request rate of your tables will help determine capacity and scaling requirements for your Amazon Keyspaces tables.
Keyspaces is serverless, and you pay for only what you use. The price of Keyspaces read/write throughput is based on the number and size of requests.
To gather the most accurate utilization metrics from your existing Cassandra cluster, you will capture the average requests per second (RPS) for coordinator-level read and write operations. Take an average over an extended period of time for a table to capture peaks and valleys of workload.
average read requests per second over two weeks = 200 reads per second
average write requests per second over two weeks = 100 writes per second
LOCAL_QUORUM READS
= READ REQUESTS PER SEC * ROUNDUP(ROW SIZE bytes / 4096) * RCU per hour price * HOURS PER DAY * DAYS PER MONTH
200 * ROUNDUP(900 / 4096) * $0.00015 * 24 * 30.41 ≈ $22 per month
LOCAL_ONE READS
Using eventually consistent (LOCAL_ONE) reads can save you half the cost of your read workload, since one RCU covers two LOCAL_ONE reads per second of up to 4 KB each.
= READ REQUESTS PER SEC * ROUNDUP(ROW SIZE bytes / 4096) * 0.5 * RCU per hour price * HOURS PER DAY * DAYS PER MONTH
200 * ROUNDUP(900 / 4096) * 0.5 * $0.00015 * 24 * 30.41 ≈ $11 per month
LOCAL_QUORUM WRITES
= WRITE REQUESTS PER SEC * ROUNDUP(ROW SIZE bytes / 1024) * WCU per hour price * HOURS PER DAY * DAYS PER MONTH
100 * ROUNDUP(900 / 1024) * $0.00075 * 24 * 30.41 ≈ $55 per month
Storage: ~$500 per month
Eventually consistent reads: ~$11 per month
Writes: ~$55 per month
Total: ~$566 per month
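Putting the formulas above together, a small Python sketch (the row size, request rates, and prices are the example values used above):
    import math

    row_size_bytes = 900
    reads_per_second = 200
    writes_per_second = 100
    hours_per_month = 24 * 30.41

    rcu_hour_price = 0.00015       # us-east-1 provisioned RCU-hour
    wcu_hour_price = 0.00075       # us-east-1 provisioned WCU-hour

    quorum_reads = reads_per_second * math.ceil(row_size_bytes / 4096) * rcu_hour_price * hours_per_month
    local_one_reads = quorum_reads / 2             # LOCAL_ONE reads use half an RCU per request
    writes = writes_per_second * math.ceil(row_size_bytes / 1024) * wcu_hour_price * hours_per_month
    storage = 5_000 / 3 * 0.30                     # from the storage estimate above

    print(f"LOCAL_QUORUM reads: ~${quorum_reads:.0f}/month")      # ~$22
    print(f"LOCAL_ONE reads:    ~${local_one_reads:.0f}/month")   # ~$11
    print(f"Writes:             ~${writes:.0f}/month")            # ~$55
    print(f"Total (storage + LOCAL_ONE reads + writes): ~${storage + local_one_reads + writes:.0f}/month")  # ~$566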
To further reduce cost, I might look at using client-side compression on writes for large blob data, or, if I have many small rows, I might use collections to fit more data into a single row.
Check out the pricing page for the most up-to-date information.

How to lower Data Transfer costs on an AWS course platform?

I am calculating the operating costs for a platform we want to develop for a client on AWS. The platform is an online course solution where users can subscribe and access different multimedia content.
My initial thought was to store the videos in an S3 bucket and simply "feed" them to my back-end solution so that the front end can access and show them.
My problem is that when doing the cost estimate I am getting huge figures for outbound data transfer. I don't really know how much traffic the client is expecting, so I estimated traffic the following way (this platform is going to be supported by the state, so it will have some traffic):
20 MB for every minute of video
2 hours per week for every user
200 users per month
20 MB/min * 60 min/h * 2 h/week * 4 weeks * 200 users = 1.92 TB
This, at 0.09 USD/GB, gives me 184.23 USD per month...
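For reference, the flat-rate arithmetic as a Python sketch (at a uniform $0.09/GB it comes out around $173; the exact figure depends on TB-to-GB conversion and on any other line items in the estimate):
    # Back-of-the-envelope outbound data transfer estimate (no tiering or free-tier allowance)
    mb_per_minute = 20
    minutes_per_user_per_month = 60 * 2 * 4     # 2 hours per week, ~4 weeks
    users = 200
    price_per_gb = 0.09                         # USD per GB transferred out to the internet

    total_gb = mb_per_minute * minutes_per_user_per_month * users / 1_000
    print(f"~{total_gb / 1_000:.2f} TB -> ~${total_gb * price_per_gb:.2f} per month")   # ~1.92 TB -> ~$172.80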
I don't know if I am not designing a well-made solution, or if my estimate is wrong... but I find this very expensive. Adding other costs, this means I have to pay nearly 2 USD per user. If someone finds a way to reduce costs, please let me know!
Thank you

AWS Personalize in Cost Explorer

I am using 4 dataset groups, for example:
Movies
Mobile
Laptops
AC
And in each dataset group, we have 3 datasets named Users, Item and Item_User_INTERACTIONS.
We also have one solution and one campaign for each dataset group.
I am also sending real-time events to AWS Personalize using the API (PutEvents).
The above costs me about 100 USD in two days, showing 498 TPS-hours used, and I am unable to find the real reason for this much cost.
Or does AWS Personalize simply cost this much?
As your billing shows, you have used 498 TPS-hours; let's check whether that should come to about $100.
According to official Amazon Personalize pricing:
https://aws.amazon.com/personalize/pricing/
For the first 20K TPS-hours per month you pay $0.20 per TPS-hour.
You have used 498 TPS-hours in two days, which gives us:
$0.20 * 498 = $99.60 in total.
The answer is: yes, it's expensive.
Another question is:
How is TPS usage calculated?
They charge you for each TPS that is currently reserved. So if you have a campaign with 1 TPS and it exists for 24 hours, you will be charged for 24 [h] x 1 [TPS] = 24 TPS-hours = $4.80.
The problem is that $0.20 doesn't look expensive, but once you multiply it by hours, it becomes very expensive.
For testing purposes you should always set TPS to 1, since you cannot set it to 0. 1 TPS allows you to get 3,600 recommendations per hour, which is a lot anyway.
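A quick Python sanity check of the bill and of the ongoing cost (using the $0.20 per TPS-hour rate quoted above):
    price_per_tps_hour = 0.20        # first 20K TPS-hours per month

    # The two-day bill from the question
    tps_hours_used = 498
    print(f"~${tps_hours_used * price_per_tps_hour:.2f}")                      # ~$99.60

    # A campaign reserved at 1 TPS accrues 24 TPS-hours per day
    daily = 24 * 1 * price_per_tps_hour
    print(f"~${daily:.2f} per day, ~${daily * 30:.0f} per month at 1 TPS")     # $4.80/day, ~$144/month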
The reason for such a high price is the campaign you created, which exists and is therefore running (this part of AWS Personalize uses more resources than uploading data to S3 or creating a model; it is billed on a TPS-hours-per-month metric).
E.g., suppose you uploaded a dataset with 100,000 rows:
Training will cost you about $0.24 * 2 ≈ $0.48 (assuming training took 2 hours)
Uploading to S3 and creating the dataset: almost free
A campaign that allows 1 request per second will cost $0.20 * 24 * 30 = $144 per month
If in the production environment you set a campaign to support 20 requests per second, it will be $2,880 per month.
So definitely, if these are your first steps with AWS Personalize, create campaigns that support only 1 request per second, and make sure you delete unused resources on time.
In the case of the SIMS recipe, there is also another approach that might save you some money: check how much it would cost just to retrain the model every 3 days, for example, and create batch recommendations for your items. Using this strategy we now spend only $50 per month per e-shop instead of $1,000 per month.
Find more details in the AWS docs.