AWS Ubuntu Server filling up, but why? - amazon-web-services

I'm currently on a free-tier AWS AmazonEC2 server with Ubuntu 16.04 installed. The main purpose is a web server running HTTP serving HTML/PHP pages and a MySQL database. The MySQL database is about 7GB large. It hasn't been inserting data for months so I don't believe the database is at fault here.
Currently Amazon is telling me my storage is at 27GB. About two days ago it was at 25GB. I haven't even touched the server in about a month and I absolutely have not been installing anything. I'm trying to find out what is taking up all this data.
I installed ncdu and switched to root, ran it and these are the results:
As you can see it's absolutely nowhere near 27GB, it's at just over 13GB. So where is this other 14GB of storage coming from? How do I find this out?
I'm afraid it's going to go over the 30GB free-tier limit and I don't know what will happen or how much I will be charged (I don't know how to find this out either).

You're misunderstanding the email. It has nothing to do with how much data you have stored.
It has to do with how much EBS storage you have used -- that is, how much you have provisioned over time. EBS doesn't bill based on what you store, it bills based on how big the disks are, regardless of what you put on them. Billing is in gigabyte-months.
A volume of size 1 gigabyte that exists for 30 days is said to "use" 1 gigabyte-month of EBS capacity.
1 gigabyte for 1 day is 1 day × 1 gb ÷ 30 days/mo =~ 0.333 gigabyte-month.
30 gigabytes for 30 days uses 30 gigabyte-months of EBS capacity.
So 30 gigabytes for 27 days would be 27 gigabyte-months, and 30 gigabytes for 25 days would be 25 gigabyte-months.
Currently Amazon is telling me my storage is at 27GB. About two days ago it was at 25GB.
So this is normal, exactly what you'd expect if you had a single 30 GB EBS volume. Your usage over time is growing because time is passing, not because usage is increasing.
AWS can't see¹ what you've stored on the volume -- they have no idea how full it is... only how large it is, which isn't a number that changes unless you resize the volume to make it physically larger.
¹ can't see may seem implausible but is true for multiple reasons, including the simple fact that they simply don't look. An EBS volume is a block storage device. Although typically the volumes are used for standards-based filesystems, there's no constraint on that. They can be used in any other way that you can use a block device. But even when used as a filesystem, the concept of "free space" is a concept only the filesystem itself understands -- not the raw, underlying device.

The issue had to do with an elastic IP being detached from a server while the server was down at one point for an amount of time.

Related

Google Cloud SQL - Database instance storage size increased dramatically everyday

I have a database instance (MySQL 8) on Google Cloud and since 20 days ago, the instance's storage usage just keeps increasing (approx 2Gb every single day!).
But I couldn't find out why.
What I have done:
Take a look at Point-in-time recovery "Point-in-time recovery" option, it's already disabled.
Binary logs is not enabled.
Check the actual database size and I see my database is just only 10GB in size
No innodb_per_table flag, so it must be "false" by default
Storage usage chart:
Database flags:
The actual database size is 10GB, now the storage usage takes up to 220GB! That's a lot of money!
I couldn't resolve this issue, please give me some ideal tips. Thank you!
I had the same thing happen to me about a year ago. I couldn't determine any root cause of the huge increase in storage size. I restarted the server and the problem stopped. None of my databases experienced any significant increase in size. My best guess is that some runaway process causes the binlog to blow up.
Turns out the problem is in a Wordpress theme's function called "related_products" which just read and write every instance of the products that user comes accross (it would be millions per day) and makes the database physically blew up.

Amazon Elasticsearch Upgrade from 7.7 to 7.9 still processing after 12 hours

Last night I started an upgrade of my Amazon Elasticsearch Cluster from version 7.7 to 7.9. It's now been running for over 12 hours and remains in the Upgrade Processing state. It has 12,000 documents which doesn't seem like a lot to me so I'm concerned that it may have become stuck in a partially upgraded state. Any input?
It is hard to answer to a question without logs or details but i will try to guess.
From what i'm seeing in the image, it seems you're lacking free space for the reindexing to continue.
Free storage 7.15 GiB == Minimum free storage space 7.15 GiB
When reindexing data you need at least twice the amount of space taken by the data
But you also need to take into account the fact that Elasticsearch stop writing data when it feels out of free space. The tresholds can be configured with the following properties:
cluster.routing.allocation.disk.threshold_enabled
cluster.routing.allocation.disk.watermark.low
cluster.routing.allocation.disk.watermark.high
cluster.routing.allocation.disk.watermark.flood_stage
See https://www.elastic.co/guide/en/elasticsearch/reference/7.12/modules-cluster.html#disk-based-shard-allocation for more details

How much would AWS ec2 cost for a project of my type

I have tried many times to install the R server on an AWS instance using terminal commands without any luck. I can install it using http://www.louisaslett.com/RStudio_AMI/
and following a Youtube video but I cannot get the dropbox sync to stop "syncing". I have tried installing a fresh version using the terminal and Putty and other methods without much success.
What I wanted to use AWS for was to use the bandwidth / computing time.
I basically wanted to run an R script to download a bunch of documents which could take 2 weeks to download. I had hoped to save these on a large dropbox account I have access to but unfortunately library("RStudioAMI")
linkDropbox()
excludeSyncDropbox("*") doesn`t seem to work for me and the whole dropbox folder gets synced onto my AWS instance and I run out of space.
So basically... I think I will forget dropbox and just use AWS storage.
I want to download appox 500GB - or perhaps 1TB worth of data (running an R script to download documents and save them), it just connects to a website and downloads a document, so no ML or high computing power needed. Just a consistent connection. Once the documents are fully downloaded I would like to then just transfer them to an external hard drive I have for further analysis.
So my question is, "approximately" how much do you think this may cost, I don't care about paying 20-30$ I just don`t want to go in with inexperience/without knowledge and rack up hundreds$.
Additionally: What other instances/servers do you suggest I pay for, I feel like I dont need that much power just consistency.
Here is another SO question I opened:
Amazon AWS Dropbox link error: "No directories are being ignored."
There will be three main costs for your scenario:
Amazon EC2, which is charged hourly. You do not need much processing power, so a t3.small would probably be adequate if you're not doing any big computations. It's only about 2c/hour, which is $7 for 2 weeks.
An Amazon EBS disk volume attached to your Amazon EC2 instance for storing the data. A General Purpose volume is 10c/GB/month. So, 1TB for 2 weeks would be $50. If you configure it to use "Cold HDD (sc1)", then it's a quarter of that price.
Data Transfer for when you download from AWS. If you are using AWS in the USA, it is 9c/GB. So, 1TB = $90. This would be your major cost.
There might be some other minor costs, but they won't be significant compared to the above.
Or, given that your basic goal is to collect and download data, you could just do it on a computer at home.
If you are not strictly limited to EC2 ( which I think you are not, considering the requirement you stated and the AMI approach failed for you) , AWS Lightsail would be a much better solution
It has bundled data transfer package and acceptable performance
Here is the 1-month plan
512 MB Memory
1 Core Processor
20 GB SSD Disk
1 TB Transfer ( Data in will cost nothing, only data Out, Ex: From LightSail to your local PC )
Additional SSD - $10 for 1 TB
Average network performance for that instance I see is about 30 Megabyte per second. You can just shutdown everything and only billed for the hours you used in the month

aws ecs instances running out of space

Since this morning I'm having troubles while updating services in AWS ECS. The tasks fails to start. The failed tasks shows this error:
open /var/lib/docker/devicemapper/metadata/.tmp928855886: no space left on device
I have checked disk space and there is.
/dev/nvme0n1p1 7,8G 5,6G 2,2G 73% /
Then I have checked the inodes usage, and I found that 100% are used:
/dev/nvme0n1p1 524288 524288 0 100% /
Narrowing the search I found that Docker volumes are the ones using the inodes.
I'm using the standard Centos AMI.
Does this mean that there is a maximum number of services that can run on a ECS cluster? (at this moment I'm running 18 services)
This can be solved? At this moment I can't do updates.
Thanks in advance
You need to tweak the following environment variables on your EC2 hosts:
ECS_ENGINE_TASK_CLEANUP_WAIT_DURATION
ECS_IMAGE_CLEANUP_INTERVAL
ECS_IMAGE_MINIMUM_CLEANUP_AGE
ECS_NUM_IMAGES_DELETE_PER_CYCLE
You can find the full docs on all these settings here: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-agent-config.html
The default behavior is to check every 30 minutes, and only delete 5 images that are more than 1 hour old and unused. You can make this behavior more aggressive if you want to clean up more images more frequently.
Another thing to consider to save space is rather than squashing your image layers together make use of a common shared base image layer for your different images and image versions. This can make a huge difference because if you have 10 different images that are each 1 GB in size that takes up 10 GB of space. But if you have a single 1 GB base image layer, and then 10 small application layers that are only a few MB in size that only takes up a little more than 1 GB of disk space.

What is an instance and how do I convert this to $

You'll have to excuse my ignorance on this one...but honestly, I've had a hard time finding clarity on this. That being said, I'm looking for a non technical answer...something in layman's terms!
Anyways, I've been playing around building a web app (first time obviously) and I'm getting to the point where I've started looking into hosting services. A quick google search and a few blogs later, I thought AWS would be a good place to start, since they give a free-year trial. I don't care about speedy upstarts or other hosting serves, so save your key strokes on offering other services.
My question is based on the fact that AWS charges "Linux Usage per hour" and they also use this term "instance". Yeah...an "instance" is an "object", which is also above my head (probably the real source of the problem), but that was the extent I was able to learn via a google search. That being said, I don't know how to translate the cost into a ball park cost. Yes, I can probably use the trial to help monitor predictable costs, but I don't want to go through the effort of learning one hosting companies system just to find out it's not going to work in the end.
OK...so hopefully by now you see where I'm coming from. What is an "instance" and how do I use the "Linux Usage per hour" to estimate cost? Is an instance a server? For example if I start NGINX is that in instance? Is it just one instance running NGINX or does every VPN represent an instance? If I have 100 people calling the server at once, can they fit on one instance? If I start another server say, Apache or Node, does that become another instance? If I connect to a database, is that an instance? Do instances start as needed? Yes, I know, that's more than one question...I'm just trying to express my confusion.
If I'm suppose to choose a pricing model from this list, "Linux Usage per hour", I need to know what them mean by "Linux Usage". If it's based on an "instance", I need to know what that is. So please, in layman's terms, help clear this up. Maybe some examples or analogies, but no deep technical stuff.
This is more a side note, but I was reading this article and it said
For a client needing to run 800 virtual instances, the annual cost of
a private cloud came to below $400,000 vs. somewhere between $800,000
and $1.2 million for public cloud services.
Considering I don't know what an instance is, that kinda made me a bit nervous...WAAAAAAyyyyyy outta my price range! Yes, it's obviously a big company, but can you imagine "hitting the lottery" with an app everyone loves then before you know it, AWS hits you with a bill of $1,000,000. Or even worse, your security sucks and someone spawns millions of these "instances"...help alive my paranoia!!
Basically, an instance is a virtual machine, which looks very much like a server. As such it's running an operating system - e.g. linux - which is capable of running many programs (aka 'processes' or sometimes, 'services') at the same time.
To go through your questions (some of the explanations below are not technically accurate, but are hopefully more explanatory for it - if anything is obvious or already known, apologies - trying not to assume any knowledge)
An instance is an object
This definition is coming up in your searches because 'instance' has many definitions in different situations. If you see the definition of 'instance' as an object, it's from the topic of object oriented programming languages - you define a class in your code (kind of like a 'template'), and then create instances of the class - kind of like real copies of the template.
Amazon borrowed the term to be analogous - because in the 'cloud' world, you can create an AMI (Amazon Machine Image - the template) and then create lots of instances that are copies or clones of that template.
Is an instance a server?
In terms of what you can do with it, yes, it's a server.
(Technically it's a virtual server - Amazon runs multiple virtual servers on each physical server.)
how do I use the "Linux Usage per hour" to estimate cost?
Estimate how long you will have your instance running for in hours per month, multiply it by cost per hour and you will have your estimated cost per instance per month.
e.g. - one instance always turned on would be - 24 hrs * 31 days = 744 hours. At $0.013/hr (for a t2.micro) that would be 744 * $0.013 = $9.672/mth.
(And that's the reason the free tier gives you 750 hours of instance time per month.)
Instances come in different types and sizes and each size costs a different amount. If you are not sure what size you need, I'd start with the smallest until you discover you need more - which would be when your program starts running too slowly.
For example if I start NGINX is that in instance?
Nginx is a program that runs as a daemon in linux terms - a program that runs in the background so it's always on. It will be one of the many programs running on the server (aka the instance)
If I have 100 people calling the server at once, can they fit on one instance?
It depends - on how big your instance is, and how efficient the program is that is responding to their requests. If you are just getting started learning to program websites, I wouldn't worry about handling 100 people issuing requests to the server all at once just yet - walk before you run :) (also, even when there are 100 people visiting your website, the odds that all of them issue a request at exactly the same time is low - usually they load a page and read it - while they're reading it, some of the other people are loading other pages, and it all spreads out so you might only have ~10 page requests actively being processed by your server at the same time.)
However, if you have 2,000 people on your site at the same time, you might be processing 200 page requests at once, so by then you do need to have put some thought into performance and scalability.
(Note: these numbers are arbitrary and depend entirely on the type of site and it's traffic patterns.)
Generally, most websites pick a mid-level instance size, and then to handle more requests they 'scale out' - create lots of copies of that instance, and allow each instance to handle a portion of the traffic.
If I start another server say, Apache or Node, does that become another instance
The language to use here would be 'start another service say, Apache or Node' - they are other programs, and your instance will be perfectly fine running nginx, apache and node all at the same time. Although each will consume some of the resources (e.g. memory and cpu) and the more activity they are doing, the faster you will run out of resources and need to get a bigger instance size
So - no, they don't automatically become another instance. The language is confusing because sometimes people don't distinguish between the 'server' (aka the instance) and the service (aka the program) and will say the 'apache server' and the 'apache service' interchangably.
If I connect to a database, is that an instance?
Your instance, as a fully capable server, could run a database service on it at the same time as the other services - e.g. you could install and run mysql on your instance.
There is another option, though - if you use the AWS RDS product, then you will be starting an RDS instance. An RDS instance is different from an EC2 instance (what we've been talking about so far) in that RDS instances are specialised to just run the database service and nothing else, but EC2 instances are general servers that you can do pretty much anything on.
It's usually recommended to use RDS, but if you are trying to save money and aren't serving many users, there's nothing particularly wrong with installing mysql on your instance yourself (especially while you're learning how it works) and then moving your data to an RDS instance when you want to support more load or traffic.
Do instances start as needed?
Not by default, no - you have to manually start and stop them.
However, there are options other than manually starting and stopping. Amazon provides a lot of APIs, so you could write a program that would connect to the API and automatically start and stop your instance(s) based on rules you build into your program..
Also, Amazon offers a product called "AutoScalingGroups" which allows you to have a related group of instances and for Amazon to automatically start and stop them according to rules that you configure into that product. These rules can be 'scheduled actions' - start/stop at certain times of day - or they can be reactive - e.g. when the average CPU usage is > 50% for more than 5 minutes, start another instance.
This is more a side note, but I was reading this article and it said
For a client needing to run 800 virtual instances, the annual cost of
a private cloud came to below $400,000 vs. somewhere between $800,000
and $1.2 million for public cloud services.
The 'free tier' gives you a t2.micro sized instance (1 vCPU, 1 GiB RAM) which you could leave turned on permanently for free during that free year.
Even after your free tier expires, that same instance would cost you $9.67/mth, and you have the option to go downgrade to a t2.nano (0.5 GiB RAM) which would only cost ~$4/mth - but 0.5GiB RAM isn't much these days, so may not be enough for you.
A t2.micro should be more than enough to learn how to build websites on. If you are fortunate enough to build a site that is popular enough that you are getting more requests than that server can handle, then you will have to decide if you can generate revenue from that popularity sufficient to cover the cost, but by then you'll have more of a sense of how efficient your program is, and what instance size (and/or how many instances) you'll need.
Yes, it's obviously a big company, but can you imagine "hitting the
lottery" with an app everyone loves then before you know it, AWS hits
you with a bill of $1,000,000
AWS protects you from yourself here a bit - they have limits which generally restrict you from running more than 20 instances at a time - unless you ask for permission. So, by default, your instance won't go multiplying like rabbits on it's own - unless you set it up to. And even if you have set it up to, it won't be able to grow beyond 20 instances unless you have asked amazon to let you. So, worst case is 20 x $9.67/mth - $197/mth.
But - that's just the instance cost. Amazon charges you for lots of things including data traffic in and out, RDS instance costs, and if you start using other service such as S3 buckets and/or elastic load balancers, they all attract their own costs.
But hopefully, if you hit the lottery with an app everyone loves, you've worked out how to convert that love into dollars and cents so you can pay for all those instances you're going to need :)