Aurora PostgreSQL engine: no space left on device - amazon-web-services

I am working with the AWS Aurora PostgreSQL 10.4 engine. I am trying to cluster table ... using index and getting an error like
could not write block .... of temporary file: no space left on device
If I were managing my own PostgreSQL instance I would be looking at the space available on individual volumes with df. (See also: I get an error "could not write block .... of temporary file no space left on device ..." using postgresql)
But with Aurora, AWS should be managing the storage and automatically expanding it on demand. So I'm wondering how I would go about fixing this condition if I'm not managing the storage myself. I'm guessing that that the PG engine's temp storage is separate from the Aurora-managed virtualized storage layer, but not sure how to change it.

Temp space uses the local “ephemeral” volume on the instance. Currently the only way to increase that space is to move to a larger instance size.

You're right in stating that Aurora should take care of this. If you have multiple instances in your cluster, then your cluster would self recover by initiating a failover. The faulty instance would be repaired in the background as well - mostly automatically, and in some rare cases by AWS operators.
If you noticed the issue persisting for more than a few minutes, then you should:
Manually trigger a failover using API/Console
Engage AWS Support to look into the matter if it happens more often.
If you think AWS missed your SLA, then do bring it to their notice as well.
AWS will use commercially reasonable efforts to make Multi-AZ instances available with a Monthly Uptime Percentage (defined below) of at least 99.95% during any monthly billing cycle (the "Service Commitment"). In the event Amazon RDS does not meet the Monthly Uptime Percentage commitment, you will be eligible to receive a Service Credit
Doc: https://aws.amazon.com/rds/sla/

Related

How to take a backup of EC2 instance in AWS and move to a low cost alternative?

We have an EC2 instance running in AWS EC2 instance. We have our ML algorithms and data that. We have also hosted a web-based interface also in that machine.
Now there are no new developments happening in that EC2 instance. We would like to terminate AWS subscription for a short period of time (for the purpose of cost-reduction and exploring new cloud services). Most importantly, we want to be in a position where we can purchase a new EC2 instance with a fresh AWS subscription, use the backup which we take now, and resume all operations (web-backend, SMS services for our app which is hosted in AWS, etc.).
What is the best way to do it? Is temporary termination of AWS subscription advisable?
There is no concept of an "AWS Subscription". AWS is charged on-demand, which means you only pay when you use resources.
If you temporarily do not want the Amazon EC2 instance, you could:
Stop the instance, which is like turning off the power. You will not be charged for the instance, but you will still pay for the disk storage attached to the instance. You can simply Start the instance again when you wish to use it. You will only be charged while the instance is running. OR
Create an image of the instance, then terminate the instance. This will create an Amazon Machine Image (AMI), which contains a copy of the disks. You can then launch a new Amazon EC2 instance from the AMI when you wish to use it again. This is a lower-cost option compared to simply stopping the instance, but it takes more effort to stop/start.
It is quite common for companies to stop Amazon EC2 instances at night or over the weekend to reduce costs while they are not needed.
EDIT: Just thought of a third option. Will test it and be back. Not worth it; it would involve creating an image from the EC2 instance and then convert that image to a VM image, storing the VM image in S3. There may be some advantages to this, but I do not see them.
I think you have two options, both of them very reasonably priced. If you can separate the data from the operating system, then your best option would be to use an S3 bucket as a file system within the EC2 instance. Your EC2 instance would use this bucket to store all your "ML algorithms and data" and, possibly, even your "web-based interface". Whenever you decide that you no longer need the processing capacity of the EC2, you would unmount the S3 bucket file system from the EC2 instance and terminate that instance. After configuring an appropriate lifecycle rule for the S3 bucket, it would transition to Glacier, or even Glacier Deep Archive [you must considerer the different options of long term storage]. In the future, whenever you want to work with your data again, you would move your data from Glacier back to S3, create a new EC2 instance, install your applications, mount your S3 bucket as a file system and you would have access to all your data. I think this is your least expensive and shortest recovery time objective option. To implement this option, look at my answer to this question; everything you need to use an S3 bucket as a regular folder inside the EC2 instance is there.
The second option provides an integrated solution, meaning the operating system and the data stay together, and allows you to restore everything as it was the day you stopped processing your data. It's made up of the following cycle:
Shutdown your EC2 and make a note of all the specs [you need them further down].
Export your instance to a virtual image, vmdk for example, and store it in your S3 bucket. Something like this:
aws ec2 create-instance-export-task --instance-id i-0d54b0682aa3998a0
--target-environment vmware --export-to-s3-task DiskImageFormat=VMDK,ContainerFormat=ova,S3Bucket=sm-vm-backup,S3Prefix=vms
Configure an appropriate lifecycle rule for the S3 bucket so that it transitions to Glacier, or even Glacier Deep Archive.
Terminate the EC2 instance.
In the future you will need to implement the inverse, so you will need to restore the archived S3 Object [make sure you you can live with the time needed by AWS to do this]
Import the virtual image as an EC2 AMI, something like this [this is not complete - you will need some more options that you saved above]:
aws ec2 import-image --disk-containers
Format=ova,UserBucket="{S3Bucket=sm-vm-backup,S3Key=vmsexport-i-0a1c382e740f8b0ee.ova}"
Create an EC2 instance based on the image and you're back in business.
Obviously you should do some trial runs and even automate the entire process if it's something that will be done frequently. I have a feeling, based on what you said, that the first option is a better option, provided you can easily install whatever applications they use.
I'm assuming that you launched an EC2 instance from a base Amazon Machine Image and then added your own software and models to it. As opposed to launched an EC2 instance from an AWS Marketplace offering.
The simplest thing to do is to create an Amazon Machine Image (AMI) from your running EC2 instance. That will capture the current state of the instance and persist it in your AWS account. Then you can terminate the instance. Later, when you want to recreate it, launch a new instance, selecting the saved AMI instead of a standard AMI.
An alternative is to avoid the need to capture machine state at all, by using standard DevOps practices to revision-control everything you need to recreate the state of a running machine.
Note that there are costs associated with an AMI, though they are minimal ($0.05 per GB-month of data stored, for example).
I had contacted AWS customer care regarding this issue. Given below is the response I received. Please add your comments on which option might be good for me.
Note: I acknowledge the AWS customer care team for their help.
I understand that you require some information on cost saving for your
Instance since you will not be utilizing the service for a while.
To assist you with this I would recommend checking out the Instance
Stop/Start link here:
==>https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/Stop_Start.html .
When you stop an Instance, you do not lose any data & you are not
charged for the resources any further. However please keep in mind
that you will still be charged for any EBS Storage Volumes attached to
the stopped Instance(s).
I also recommend checking out the below links on how you can reduce
your costs.
==>https://aws.amazon.com/premiumsupport/knowledge-center/reduce-aws-bill/
==>https://aws.amazon.com/blogs/compute/10-things-you-can-do-today-to-reduce-aws-costs/
That being said, please note that as I am in the billing department,
for the best assistance with the various plans you will require the
assistance of our Sales Team.
The Sales Team will be able to assist with ways to save while
maintaining your configurations.
You will be able to reach the Sales Team here:
==>https://aws.amazon.com/websites/contact-us/.
Once you have completed the details in the link, a member of the team
will be in touch with you at their soonest.

Multi region EC2 & RDS replication from Region A to various other regions

Our current server consisting of an 2x EC2 instances and RDS (Read/Write) database is in Mumbai Region. However I would like to copy everything (2x EC2 & RDS (R/W)) across to Sydney, and other to other regions.
Ideally I would like to replicate the contents in those instances as well.
Does anyone know a quick and easy way of doing this?
Edit 25/01/2019:
However I would like to copy everything including what ever is inside the instances (2x EC2s and the RDSs)
Edit 29/01/2019:
The purpose is to "scale/expand out". I want to have the same infrastructure replicated 1-to-1 (exactly/identically) across various regions.
It is simple!
- For EC2 - you need to create an AMI of those instances then right click on the AMI you've just created and choose "copy AMI" to the designated region.
For RDS
If you just wanna copy data to another region then take a snapshot then copy that snapshot to destination region
If you want to make the RDS replicate to another region continuously then you need to create a read-replica from your RDS instance.
Option for replicating environment depends on how much downtime can you tolerate.
If you are okay with downtime
1. Copy the AMI of EC2 instance and snapshot of RDS to another regions
2. Bring up your new environment.
This is perfect for non critial workload
If this is critical application
1. Copy the AMI of ec2 instance ( I am assuming this would be your web/app instnaces) For real time replication use rsync or robocopy .. or solution like cloudendure .
2. Create a new RDS instance in sydney
3. USE DMS migration tool .. create source and target relationship
4. once insync cut off the relation bring new environment in sydney
As suggested by previous answers for EC2 you can create AMIs and then move the AMI to a different region.
For RDS, you can either create read replicas (and read replicas of read replicas, but beware of latency), read replicas are used to mainly improve read performance of your app.
You can also create a Multi AZ backup which will act as a disaster recovery site. However, note that Multi-AZ is only used in case of a failover. Moreover, Multi-AZ involves Synchronous data copy and read replicas are asynchronous, so read replicas can demonstrate eventual consistency behavior.
But the real question here is - What are you trying to achieve?
Are you trying to "scale out" your infrastructure to support huge traffic to your application? Or are you simply trying to setup disaster recovery (DR)?
If your answer is DR, then the approach is pretty straight forward with Multi AZ and EC2 instance snapshots. But if the answer is scaling out and performance, you really need to be thinking of better strategies such as using Cloudfront (CDN) if it is a web app, using Elasticache in-memory cache for frequently read data, or RDS read replicas, using Elastic Load Balancers with Dynamic/Step scale-out/scale-in. Other, methods would be to evaluate the type of RDS storage subsystem used i.e. using Provisional IOPs vs. Using General Purpose SSD, checking if there are any NAT “instance” bottlenecks in your VPC and so on.
It may be tempting to spin up all these redundant copies of EC2 AMIs or RDS read replicas with a click of a button, but you really need to be thinking about the cost you are going to incur on a monthly basis for completely un-used resources.

AWS vs GCP Cost Model

I need to make a cost model for AWS vs GCP. Currently, our organization is using AWS. Our biggest services used are:
EC2
RDS
Labda
AWS Gateway
S3
Elasticache
Cloudfront
Kinesis
I have very limited knowledge of cloud platforms. However, I have access to:
AWS Simple Monthly Calculator
Google Cloud Platform Pricing Calculator
MAP AWS services to GCP products
I also have access to CloudHealth so that I can get a breakdown of costs per services within our organization.
Of the 8 major services listed above are main usage and costs go to EC2, S3, and RDS.
Our director of engineering mentioned that I should be most concerned with vCPU and memory.
I would appreciate any insight (big or small) that people have into how I can go about creating this model, any other factors I should consider, which functionalities of the two providers for the services are considered historically "better" or cheaper, etc.
Thanks in advance, and any questions people may have, I am more than happy to answer.
-M
You should certainly cost-optimize your resources. It's so easy to create cloud resources that people don't always think about turning things off or right-sizing them.
Looking at your Top 5...
Amazon EC2
The simplest way to save money with Amazon EC2 is to turn off unused resources. You can even stop instances overnight and on the weekend. If they are only used 8 hours per workday, then that is only 40 out of 168 hours, so you can save 75% by turning them off when unused! For example, Dev and Test instances. People have written various types of automated utilities to turn instances on and off based on tags. Try search the Internet for AWS Stopinator.
Another way to save money on Amazon EC2 is to use spot instances. They are a fraction of the price, but have a risk that they might be turned off when demand increases. They are great where it is okay for systems to be terminated sometimes, such as automated testing systems. They are also a great way to supplement existing capacity at a fraction of the price.
If you definitely need the Amazon EC2 instances to keep running all the time, purchase Amazon EC2 Reserved Instances, which also offer a price saving.
Chat with your AWS Account Manager for help with the above options.
Amazon Relational Database Service (RDS)
Again, Amazon RDS instances can be stopped overnight/on weekends and turned on again when needed. You only pay while the instance is running (plus storage costs).
Examine the CloudWatch metrics for your RDS instances and determine whether they can be downsized without impacting applications. You can even resize them when they are used less (eg over weekends). Everything can be scripted, so you could trigger such downsizing and upsizing on a schedule.
Also look at the Engine used with RDS. Commercial offerings such as Oracle and Microsoft SQL Server are more expensive than open-source offerings like MySQL and PostgreSQL. Yes, your applications might need some changes, but the cost savings can be significant.
AWS Lambda
It is most unusual that Lambda is #3 in your list. In fact, some customers never get a charge for Lambda because it falls in the monthly free usage tier. Having high charges means you're making good use of Lambda (which is saving you EC2 costs), but take a look at which applications are using it the most and see whether they are using it wisely.
When correctly used, a Lambda function should only ever run for a few seconds, so check whether any application seem to be using it outside this pattern.
AWS API Gateway
Once again, these costs tend to be low ($3.50/million calls) so again I'd recommend trying to figure out how this is being used. If you really need that many calls, it would also explain the high Lambda costs. It would probably be more expensive if you were providing such functionality via Amazon EC2.
Amazon S3
Consider using different Storage Classes to reduce your costs. Costs can be reduced by:
Moving infrequently-accessed data to a different storage class
Moving data to One-Zone (if you have a copy of the data elsewhere, so don't need the redundancy)
Archiving infrequently-accessed data to Amazon Glacier, which offers much cheaper storage but does not have instant access
With GCP, you can benefit by receiving discounts such as the Committed Use Discount and the Sustained Use Discount.
With a Committed Use Discount, you can receive a discount of up to 70% if your usage is predictable.
With the Sustained Use Discount, there is an incremental discount if you reach certain usage thresholds.
On your concern with vCPU and memory, you may use predefined machine types. They are cheaper than custom machine types.
Lastly, you can also test the charges by trying out the Google Cloud Platform Free Tier.

RDS incremental backup

We are trying to take incremental back for mysql in RDS. We are unable to find any methods to take incremental backup . How can this be done in RDS ? In FAQ we read that we can restore the data up to last five minutes. But we are not sure how to do that?
You can use AWS Data Pipeline to do this.
It supports full RDS dump or incremental dump and restore.The problem is you cannot reuse a pipeline. You will have to clone the pipeline and create a new one using AWS Lambda or Jenkins or some other job scheduling system each time you want to create a Backup or Restore.
Check out this blog to find more information on that.
a. RDS provides Native incremental backup feature - RDS snapshots and also has a feature called Point in time recovery (PITR). This allows you to restore a state of RDS instance from last 5 minutes upto max 35 days in the past (35 days being the max automatic backup retention period).
https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_WorkingWithAutomatedBackups.html
b. You can also trigger Manual snapshots in RDS - which is once again incremental (which means that if you have a running RDS server of 1TB your first/base snapshot will be 1TB) and any subsequent snapshots of the same server will only capture the modified blocks. In manual snapshots there is not retention period. You can keep as long as you wish unless you want to delete it manually. But the PITR feature is not available over Manual snapshots (i.e not longer than the configured automatic backup retention window)
In both the above features, you are dependent upon the RDS API/platform to take backup, list all the backups and restore RDS from backup. You dont have any control over the raw data / row level data.
For raw data backup, you need to consider Mysqldumps and restore - but that is an expensive operation (both backup and restore). You can use some third party tools like (percona) which provides good utilities to perform the same - but you cant use few tools because RDS does not allows you with RDS host access - so unless you run your own Mysql on VM/EC2, you are limited to the above 2 options. hope this helps.
https://www.percona.com/doc/percona-xtrabackup/2.3/innobackupex/incremental_backups_innobackupex.html
https://www.percona.com/doc/percona-xtrabackup/2.3/backup_scenarios/incremental_backup.html

Scaling Up an Elasticache Instance?

I'm currently running a site which uses Redis through Elasticache. We want to move to a larger instance with more RAM since we're getting to around 70% full on our current instance type.
Is there a way to scale up an Elasticache instance in the same way a RDS instance can be scaled?
Alternative, I wanted to create a replica group and add a bigger instance to it. Then, once it's replicated and running, promote the new instance to be the master. This doesn't seem possible through the AWS console as the replicas are created with the same instance type as the primary node.
Am I missing something or is it simply a use case which can't be achieved. I understand that I can start a bigger instance and manually deal with replication then move the web servers over to use the new server but this would require some downtime due to DNS migration, etc.
Thanks!,
Alan
Elasticache feels more like a cache solution in the memcached sense of the word, meaning that to scale up, you would indeed fire up a new cluster and switch your application over to it. Performance will degrade for a moment because the cache would have to be rebuilt, but nothing more.
For many people (I suspect you included), however, Redis is more of a NoSQL database solution in which data loss is unacceptable. Amazon offers the read replicas as a "solution" to that problem, but it's still a bit iffy. Of course, it offers replication to reduce the risk of data loss, but it's still nowhere near as production-safe (or mature) as RDS for a Redis database (as opposed to a cache, for which it's quite perfect), which offers backup and restore procedures, as well as well-structured change management to support scaling up. To my knowledge, ElastiCache does not support changing the instance type for a running cluster. This suggests that it's merely an in-memory solution that would lose all its data on reboot.
I'd go as far as saying that if data loss concerns you, you should look at a self-rolled Redis solution instead of simply using ElastiCache. Not only is it marginally cheaper to run, it would enable you to change the instance type like you would on any other EC2 instance (after stopping it, of course). It would also enable you to use RDB or AOF persistence.
You can now scale up to a larger node type while ElastiCache preserves:
https://aws.amazon.com/blogs/aws/elasticache-for-redis-update-upgrade-engines-and-scale-up/
Yes, you can instantly scale up a running Elasticache instance type to a larger size. I've tested it and experienced very little actual downtime (I think a few seconds at first, but very quickly it's back online, even while the Console will show the process taking roughly a few minutes to actually finish.) I went from a t2.micro to a m3.medium with no problem.
You can scale up or down
Go to Elasticache service
Select the cluster
From Actions menu in top, choose Modify
Modify Node Type as shown below
If you have a cluster, you can add more shards, decrease number of shards, rebalance slot distributions, or add more read replicas. just click on the cluster itself, you should be see something like this
Be aware when you delete shards, it will automatically redistribute data to other existing shards so it will affect on traffic and overloading other shards, when you try to delete a shard you would get a warning like this
Still need more help, please feel free to leave a comment and I would be more than happy to help.