I started working with Amazon CloudWatch Logs. My question is: does AWS use Glacier or S3 to store the logs? They use Kinesis to process the logs with filters. Can anyone tell me the answer?
AWS most likely uses S3, not Glacier.
Glacier would cause problems if you wanted to access older logs: retrieving data stored in Amazon Glacier can take a few hours, which is definitely not the response time one expects from a CloudWatch log-analysis solution.
Also, the price for storing 1 GB of ingested logs seems to be derived from the price of 1 GB stored on AWS S3.
The S3 price for one GB stored per month is 0.03 USD, and the price for storing 1 GB of logs per month is also 0.03 USD.
The CloudWatch pricing page has a note:
Data archived by CloudWatch Logs includes 26 bytes of metadata per log event and is compressed using gzip level 6 compression. Archived data charges are based on the sum of the metadata and compressed log data size.
According to a Henry Hahn (AWS) presentation on CloudWatch, it is "3 cents per GB and we compress it" ... "so you get 3 cents per 10 GB".
This makes me believe they store it on AWS S3.
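As a rough illustration of how that billed archived size comes together, here is a minimal Python sketch. The sample log line and event count are invented, and identical lines compress unrealistically well, so treat the numbers as a demonstration of the formula only:

import gzip

# Per the pricing note: billed archived size = gzip level 6 compressed data
# plus 26 bytes of metadata per log event.
events = ['127.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 2326'] * 10_000
raw = "\n".join(events).encode()

compressed = len(gzip.compress(raw, compresslevel=6))
metadata = 26 * len(events)  # 26 bytes of metadata per event
print(f"raw: {len(raw)} bytes, billed: {compressed + metadata} bytes")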
They are probably using DynamoDB. S3 (and Glacier) would not be good for files that are appended to on a very frequent basis.
I'm currently surprised by a pretty high AWS S3 cost of over 31 USD per day (I expected 9-12 USD per month):
I'm using eu-central-1
All buckets combined are less than 400 GB
No replication
The best explanation I have is that the number of requests was way higher than expected, but I don't know how to confirm this. How can I narrow down the source of my AWS S3 costs?
Is it possible to see the costs per bucket?
Is it possible to see a breakdown by storage / requests / transfers / other features like replication?
First, pay attention to the factors on which AWS S3 charges, i.e. storage used, number of requests, data transfer, and data retrieval.
Some ways to cut and keep track of the cost:
Delete previous object versions in your buckets if you don't need them.
Move data to a different S3 storage class based on how frequently it is retrieved.
Activate cost allocation tags on your buckets so you can review the cost of each individual bucket.
Create an S3 Storage Lens dashboard for all the buckets in your account.
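To answer the per-category question programmatically, here is a minimal sketch against the Cost Explorer API via boto3 (this assumes Cost Explorer is enabled on the account; the dates are placeholders). Grouping by USAGE_TYPE separates storage, requests, and transfer; once cost allocation tags are active, you can group by that tag instead to see cost per bucket:

import boto3

# Daily S3 cost broken down by usage type (storage vs. requests vs. transfer).
ce = boto3.client("ce")
resp = ce.get_cost_and_usage(
    TimePeriod={"Start": "2024-05-01", "End": "2024-05-31"},  # placeholder dates
    Granularity="DAILY",
    Metrics=["UnblendedCost"],
    Filter={"Dimensions": {"Key": "SERVICE",
                           "Values": ["Amazon Simple Storage Service"]}},
    GroupBy=[{"Type": "DIMENSION", "Key": "USAGE_TYPE"}],
)
for day in resp["ResultsByTime"]:
    for group in day["Groups"]:
        cost = float(group["Metrics"]["UnblendedCost"]["Amount"])
        if cost > 0.01:  # skip negligible line items
            print(day["TimePeriod"]["Start"], group["Keys"][0], f"${cost:.2f}")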
I am reading the AWS CloudWatch Logs documentation here. It says:
Archive log data – You can use CloudWatch Logs to store your log data in highly durable storage. The CloudWatch Logs agent makes it easy to quickly send both rotated and non-rotated log data off of a host and into the log service. You can then access the raw log data when you need it.
And on the pricing page, they have:
Store (Archival) $0.03 per GB
And in the Pricing Calculator, they mention:
Log Storage/Archival (Standard and Vended Logs)
Log volume archived is estimated to be 15% of Log volume ingested (due to compression). Storage/Archival costs are estimated assuming customer chooses a retention period of one (1) month. Default retention setting is 'never expire'.
Problem
I am trying to understand the behavior of this archive feature to decide if I need to move my log data to S3, but I cannot find any further details. I have tried exploring every button and link on the CloudWatch Logs pages but cannot find a way to archive the data; I can only delete it or edit its retention rules.
So how does it work? The remark in the Pricing Calculator says it is estimated to be 15% of ingested volume; does this mean it always archives 15% of the logs automatically? And why do they have to assume in the calculation that the retention period is set to 1 month? Does the archive feature behave differently otherwise?
The Archive log data feature simply refers to storing log data in CloudWatch Logs. You do not need to do anything additional to 'archive'; it is the regular storage you can see in the console.
Considering storage pricing alone, storing logs in S3 is cheaper. It varies by region, but on average S3 Standard is about $0.025 per GB versus $0.03 per GB for CloudWatch Logs storage. And if you move the objects to other storage classes, it becomes even cheaper.
About:
Log volume archived is estimated to be 15% of Log volume ingested (due to compression)
It means that if 100 GB of data is ingested into CloudWatch Logs, it shows up as only 15 GB (15%) in storage, due to the compressed format in which the logs are stored.
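To put rough numbers on it, a back-of-the-envelope sketch using the 15% estimate and the per-GB prices quoted above (actual prices vary by region):

# 100 GB ingested shows up as ~15 GB of billed storage after compression.
ingested_gb = 100
archived_gb = ingested_gb * 0.15

cloudwatch_storage = archived_gb * 0.03   # $0.03 per GB-month
s3_standard = archived_gb * 0.025         # ~$0.025 per GB-month (region average)
print(f"CloudWatch Logs storage: ${cloudwatch_storage:.2f}/month")  # $0.45
print(f"S3 Standard storage:     ${s3_standard:.2f}/month")         # $0.38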
I have an EC2 instance running an Apache application.
I have to store my Apache logs somewhere. For this, I have used two approaches:
The CloudWatch Agent to push logs to CloudWatch
A cron job to push log files to S3
I have used both methods, and both work fine for me. But I am a little worried about the cost.
Which of these will have the minimum cost?
S3 pricing is basically based on three factors:
The amount of storage.
The amount of data transferred every month.
The number of requests made monthly.
The cost for data transfer between S3 and AWS resources within the same region is zero.
According to CloudWatch pricing for logs:
All log types: there is no Data Transfer IN charge for any of CloudWatch. Data Transfer OUT from CloudWatch Logs is priced.
Pricing details for CloudWatch Logs:
Collect (Data Ingestion): $0.50/GB
Store (Archival): $0.03/GB
Analyze (Logs Insights queries): $0.005/GB of data scanned
Refer to CloudWatch pricing for more details.
Similarly, according to AWS, S3 pricing differs by region.
For example, for N. Virginia:
S3 Standard Storage
First 50 TB / Month: $0.023 per GB
Next 450 TB / Month: $0.022 per GB
Over 500 TB / Month: $0.021 per GB
Refer to S3 pricing for more details.
Hence, we can conclude that sending logs to S3 will be more cost-effective than sending them to CloudWatch.
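As a concrete example, a back-of-the-envelope sketch of the first month's bill for 10 GB of Apache logs, using the prices listed above (request, transfer, and compression effects ignored for simplicity):

log_gb = 10

# CloudWatch Logs: pay to ingest, then pay to store.
cw_total = log_gb * 0.50 + log_gb * 0.03   # ingestion + storage

# S3: no ingestion charge, just Standard storage (first 50 TB tier).
s3_total = log_gb * 0.023

print(f"CloudWatch Logs: ${cw_total:.2f}")  # $5.30
print(f"S3 Standard:     ${s3_total:.2f}")  # $0.23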
They both have similar storage costs, but CloudWatch Logs has an additional ingest charge.
Therefore, it would be cheaper to send logs straight to Amazon S3.
See: Amazon CloudWatch Pricing – Amazon Web Services (AWS)
I need to download this 60-million-tweet dataset (https://github.com/compsocial/CREDBANK-data) for my thesis work.
So I signed up for an AWS account and configured it with the correct parameters in my Mac terminal, then I also signed in to the S3 page, BUT when I try the following command:
aws s3api get-object --request-payer requester --bucket credbank --key stream_tweets_byTimestamp.data stream_tweets_byTimestamp.data
I get this error message:
An error occurred (NotSignedUp) when calling the GetObject operation: Your account is not signed up for the S3 service. You must sign up before you can use S3.
So, from what I understand, the problem is that my prepaid card wasn't accepted as a valid payment method and therefore my account isn't fully activated yet.
So, my question is: if I only need to run the get-object command above to download the data, could I be charged any money if I use a real credit card, even though my new account is in the 12-month free tier period?
Please let me know if you need further details to understand my question!
Thank you very much!
No. Your card will not be charged a single penny unless one of these happens:
a) You use >5 GB of Standard Storage. This one is really important in your case, so check the size of the data you will be downloading.
b) You make >20,000 Get Requests
c) You make >2,000 Put Requests
I've signed up for the AWS free tier and used their S3 and EC2 services for a year without incurring any cost.
If the file you want to download is smaller than 15 GB, then the download will cost you nothing, even if you use a real credit card.
You'll only use 1 GET request (you have 20,000 free) and a portion of the 15 GB of free download transfer per month.
Source:
As part of the AWS Free Usage Tier, you can get started with Amazon S3 for free. Upon sign-up, new AWS customers receive 5 GB of Amazon S3 storage in the Standard Storage class, 20,000 Get Requests, 2,000 Put Requests, and 15 GB of data transfer out each month for one year.
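If you want to be sure before spending the single GET, you can check the object's size first. A minimal boto3 sketch, assuming your credentials are already configured (the bucket and key are the ones from the question; note the bucket is requester-pays):

import boto3

# HEAD the object to learn its size before downloading it.
s3 = boto3.client("s3")
head = s3.head_object(
    Bucket="credbank",
    Key="stream_tweets_byTimestamp.data",
    RequestPayer="requester",  # required for requester-pays buckets
)
print(f"Object size: {head['ContentLength'] / 1024**3:.2f} GB")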
I'm studying AWS pricing and I have some doubts.
About Amazon S3: it says we pay $0.03 per GB per month.
But, for example, if I use only 256 KB of storage (256 KB = 0.000256 GB), the AWS S3 calculator says the cost is $0.00. So are small amounts of storage always free?
And I have my S3 bucket configured with the Glacier class, so when I store this 256 KB of data in S3, after 1 day the data is moved to Glacier. So in this case, using 256 KB for a day in S3 and then storing it in Glacier, do I pay nothing for S3 and Glacier?
And also about Amazon S3: it says we pay for GET requests and for data transfer out from Amazon S3 to the internet. If I access a file inside my bucket via a link such as https://s3.amazonaws.com/uploads/uploadedfiles/test/file.txt, is that a GET request or data transfer out from Amazon S3 to the internet?
And just one more about DynamoDB: it says the first 25 GB stored per month is free. Is it always free, or only free on the free tier?
S3 is free for 12 months for up to 5GB per month.
DynamoDB is 25GB per month for up to 12 months on the free tier.
Glacier is not part of the free tier program.
If I access for example a file inside my bucket from for example this link: https://s3.amazonaws.com/uploads/uploadedfiles/test/file.txt, is it a GET request or data transfer out from Amazon S3 to the internet?
That is both an S3 GET request and S3 data transfer out.
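To make that concrete, a toy calculation with example rates (roughly the N. Virginia prices: GET requests at about $0.0004 per 1,000 and transfer out to the internet at about $0.09 per GB; both only apply once the free-tier allowances are exhausted):

# Toy cost of one download of a 1 MB file from S3 over the internet.
file_gb = 1 / 1024                 # 1 MB expressed in GB
get_cost = 0.0004 / 1000           # one GET request
transfer_cost = file_gb * 0.09     # data transfer out to the internet
print(f"one download: ${get_cost + transfer_cost:.6f}")  # ~$0.000088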
AWS has each item and how much the free tier provides broken out on these pages:
http://aws.amazon.com/free/faqs/
http://aws.amazon.com/free/
Using the calculator with 256 KB will not give you realistic results. That's like using a mortgage calculator on a $0.01 loan.
Try using the AWS calculator (http://calculator.s3.amazonaws.com/index.html) with at least 3 GB. Still, with the AWS free tier you can do a lot in your first year and pay zero dollars to Amazon.
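For reference, the arithmetic behind the calculator's $0.00 for 256 KB (a quick sketch, using the $0.03 per GB-month price from the question):

# Why 256 KB shows as $0.00: the monthly cost rounds away entirely.
size_gb = 256 / 1024 / 1024          # 256 KB in GB (binary units)
monthly_cost = size_gb * 0.03        # $0.03 per GB-month
print(f"${monthly_cost:.8f}/month")  # $0.00000732 -- displays as $0.00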