On one of my AWS instances running Ubuntu 16.04, I have a MySQL replica database on a 1TB ext4 EBS volume. I plan to increase it to 2TB. Before I increase the size of the volume and extend the filesystem using the resize2fs command, do I need to take any precautions? Is there any possibility of data corruption? If so, would it be sane to create an EBS snapshot of this volume?
Do I need to take any precautions?
You shouldn't need to take any unusual precautions -- just standard best practices, like maintaining backups and having a tested recovery plan. Anything can go wrong at any time, even when you're sitting, doing nothing.
Important
Before modifying a volume that contains valuable data, it is a best practice to create a snapshot of the volume in case you need to roll back your changes.
https://docs.aws.amazon.com/AWSEC2/latest/WindowsGuide/ebs-modify-volume.html
But this is not indicative of the operation being especially risky. Anecdotally, I've never experienced complications, and have occasionally resized an EBS volume and then its filesystem under a live, master, production database.
Is there any possibility of data corruption?
The possibility of data corruption is always there, no matter what you are doing... but this seems to be a safe operation. The additional space becomes available immediately, and there is no I/O freeze or disruption.
If so, would it be sane to create an EBS snapshot of this volume?
As noted above, yes.
Concerns about errors creeping in later are valid, but EBS maintains internal consistency checks and will disable a volume if they fail. This helps avoid further scrambling of data, so that you can do a controlled recovery and repair operation.
This would not help if EBS is perfectly storing data that was corrupted by something on the instance, such as might be caused by a defect in resize2fs, but it seems to be a solid utility. It doesn't move your existing data -- it just fleshes out the filesystem structures as needed so that the filesystem can use the entire free space that has become available.
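The order of operations discussed above (snapshot first, then grow the volume, then grow the filesystem) could be sketched roughly as follows. This is only an illustration, not a tested procedure: the volume ID, the device name `/dev/xvdf`, and the size of 2048 GiB are made-up placeholders, and the helper `plan_resize` is hypothetical. It only builds the command lines so they can be reviewed before anything is run.

```python
import shlex


def plan_resize(volume_id, new_size_gib, device, partition_number=None):
    """Build the ordered commands for a snapshot-then-grow resize.

    All arguments are placeholders supplied by the caller; nothing is executed.
    Assumes xvd-style device naming, where partition 1 of /dev/xvdf is /dev/xvdf1.
    """
    steps = [
        # 1. Safety snapshot, so the change can be rolled back.
        ["aws", "ec2", "create-snapshot", "--volume-id", volume_id,
         "--description", "pre-resize safety snapshot"],
        # 2. Grow the EBS volume itself (size is given in GiB).
        ["aws", "ec2", "modify-volume", "--volume-id", volume_id,
         "--size", str(new_size_gib)],
    ]
    target = device
    if partition_number is not None:
        # 3. Only needed when the filesystem lives on a partition.
        steps.append(["sudo", "growpart", device, str(partition_number)])
        target = f"{device}{partition_number}"
    # 4. Extend the ext4 filesystem into the new space.
    steps.append(["sudo", "resize2fs", target])
    return steps


if __name__ == "__main__":
    # Hypothetical volume ID and device name, for illustration only.
    for step in plan_resize("vol-0123456789abcdef0", 2048, "/dev/xvdf"):
        print(shlex.join(step))
```

Printing the plan rather than executing it is deliberate: it leaves room to confirm that the snapshot has completed and that the volume modification has taken effect before running resize2fs.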
Related
I would like to update Samba on a 3TB NAS. My boss suggested making a clone, but we have no storage large enough to hold all of it. Would a snapshot of the VM take up less space and still let me restore Samba as it was in case of failure, making it the better idea?
There's no real guide to how much space snapshots occupy. That depends greatly on the activity on the VM where the snapshot has been taken. If it's an active VM (a database or something similar), a considerable amount of data could be written. If the VM is not used much, little or no data may be written to the backend datastore.
We run our own MySQL installation on GCE and are thinking of using GCE snapshots as a backup solution. As our MySQL database is quite busy, we would like to know whether a snapshot taken while the database is still in production will be free of corruption and retain its integrity. Thanks.
As described in the Best Practices for Persistent Disk Snapshots documentation, if your database is in use during the snapshot, you may have some data loss.
If you have few writes but lots of reads, that could do the trick, as the chance of losing new data will be smaller, but it's still not a 100% sure thing.
We have been using MariaDB in RDS and we noticed that swap usage keeps growing without being recycled. The freeable memory, however, seems to be fine. Please check the attached files.
Instance type : db.t2.micro
Freeable memory : 125MB
Swap space : increases by 5MB every 24h
IOPS : disabled
Storage : 10GB (SSD)
Soon RDS will eat all the swap space, which will cause lots of issues to the app.
Does anyone have similar issues?
What is the maximum swap space? (didn't find anything in the docs)
Please help!
Does anyone have similar issues?
I had similar issues on different instance types. The swapping trend persists even if you switch to a larger instance type with more memory.
An explanation from AWS can be found here:
Amazon RDS DB instances need to have pages in the RAM only when the pages are being accessed currently, for example, when executing queries. Other pages that are brought into the RAM by previously executed queries can be flushed to swap space if they haven't been used recently. It's a best practice to let the operating system (OS) swap older pages instead of forcing the OS to keep pages in memory. This helps make sure that there is enough free RAM available for upcoming queries.
And the resolution:
Check both the FreeableMemory and the SwapUsage Amazon CloudWatch metrics to understand the overall memory usage pattern of your DB instance. Check these metrics for a decrease in the FreeableMemory metric that occurs at the same time as an increase in the SwapUsage metric. This can indicate that there is pressure on the memory of the DB instance.
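The check described in that resolution, a drop in FreeableMemory at the same time as a rise in SwapUsage, could be sketched like this. It is a minimal illustration only: the function name `memory_pressure_points` and the sample values are invented, and it assumes the two metrics have already been exported from CloudWatch as equally spaced samples in the same unit.

```python
def memory_pressure_points(freeable_mb, swap_mb):
    """Return the sample indices where FreeableMemory fell while SwapUsage rose.

    Both inputs are assumed to be lists of equal length, sampled at the same
    timestamps. Index i is compared against index i - 1.
    """
    if len(freeable_mb) != len(swap_mb):
        raise ValueError("both series must have the same number of samples")
    return [
        i
        for i in range(1, len(freeable_mb))
        if freeable_mb[i] < freeable_mb[i - 1] and swap_mb[i] > swap_mb[i - 1]
    ]


if __name__ == "__main__":
    # Made-up sample data, not taken from the instance in the question.
    freeable = [125, 126, 110, 90, 95]
    swap = [20, 25, 30, 41, 41]
    print(memory_pressure_points(freeable, swap))
```

Swap that creeps up while FreeableMemory stays flat, as in the question, would produce no flagged samples here, which matches the reading that old pages are simply being swapped out.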
What is the maximum swap space?
By enabling Enhanced Monitoring you should be able to see OS metrics, e.g. "The amount of swap memory free, in kilobytes."
See details here
Enabling Enhanced Monitoring in RDS has made things clearer.
Obviously what we needed to watch was Committed Swap instead of Swap Usage. We were able to see how much Free Swap we had.
I now also believe that MySQL is dumping things in swap just because there is too much space in there, even though it wasn't really in urgent need of memory.
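Once the free swap figure is visible, a rough estimate of how long it would last at the growth rate from the question (5MB per 24h) is simple arithmetic. The sketch below is only that: the function name `days_until_swap_full` and the free swap value are hypothetical, and it assumes the growth stays linear, which it may well not.

```python
def days_until_swap_full(free_swap_kb, growth_mb_per_day):
    """Estimate the days left before free swap runs out, assuming linear growth."""
    if growth_mb_per_day <= 0:
        raise ValueError("growth_mb_per_day must be positive")
    free_swap_mb = free_swap_kb / 1024  # Enhanced Monitoring reports kilobytes
    return free_swap_mb / growth_mb_per_day


if __name__ == "__main__":
    # 4,096,000 KB of free swap is an invented figure for illustration.
    print(days_until_swap_full(4_096_000, 5))
```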
Currently I am looking at how backup/restore can be done in Cassandra. We've set up a three-node cluster in AWS. I understand that we can take a snapshot using the nodetool snapshot tool, but it's a bit of a cumbersome process.
My idea is :
Make use of EBS snapshots, because they're more durable and easy to set up. The one problem I see with EBS is that the backup may be inconsistent. Hence, my plan is to run a script prior to taking the EBS snapshot that runs the flush command to flush all the memtable data to disk (SSTables) and then creates hard links to the flushed SSTables.
Once that's done, initiate the EBS snapshot. This way we can address the inconsistency issue that we might face if we only use EBS snapshots.
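A pre-snapshot script along these lines might look like the sketch below. It is a rough outline under assumptions rather than a working backup tool: the volume ID and the tag are placeholders, the helper names `plan_backup` and `run_plan` are made up, and it would have to run on each node for that node's data volume. `nodetool snapshot` is used for the hard-link step.

```python
import shlex
import subprocess


def plan_backup(volume_id, tag):
    """Build the per-node command sequence: flush, hard-link, then EBS snapshot."""
    return [
        # Flush memtables to SSTables on disk.
        ["nodetool", "flush"],
        # Create hard links to the current SSTables under the given tag.
        ["nodetool", "snapshot", "-t", tag],
        # Take the block-level EBS snapshot of the data volume.
        ["aws", "ec2", "create-snapshot", "--volume-id", volume_id,
         "--description", f"cassandra backup {tag}"],
    ]


def run_plan(steps, dry_run=True):
    """Print each command, and execute it only when dry_run is False."""
    for step in steps:
        print(shlex.join(step))
        if not dry_run:
            subprocess.run(step, check=True)  # stop at the first failure


if __name__ == "__main__":
    # Hypothetical volume ID and tag; dry run only.
    run_plan(plan_backup("vol-0123456789abcdef0", "nightly-backup"))
```

Running with `check=True` means a failed flush stops the sequence before the EBS snapshot is started, so a snapshot is never taken on top of a flush that did not complete.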
Please let me know if you see any issue with this approach or share your suggestions.
Being immutable, SSTables do help a lot when it comes to backups, indeed.
Your idea sounds OK for situations where everything is healthy on your cluster. Actually, Cassandra is consistency-configurable (if I say eventually consistent, some people may be offended here, hehe), and as the system itself may not be fully consistent at a given time, you cannot say your backup will be either. On the other hand, one of the beauties of Cassandra (and NoSQL models) is that it tends to recover pretty well in most situations (quite the opposite of relational databases, which are very sensitive to data loss). It's very unlikely that you'll end up with a bunch of useless data if you have at least fully preserved the SSTable files.
Be aware that EBS snapshots are block-level. So, when you have a filesystem on top of the volume, that may be a concern as well. Fortunately, modern filesystems have journaling and are pretty reliable, so that shouldn't be a problem, but keeping your data on a separate partition is good practice, so the chances of someone else writing to it right after a full flush are smaller.
You may have some lost replicas when you eventually need to restore your cluster, requiring you to run nodetool repair, which, as you know if you have done it before, is a bit painful and takes very long for large amounts of data. (But running repair regularly is recommended anyway, especially if you delete a lot.)
Another thing to consider is hinted handoffs (writes whose row owners are missing, but which are kept by other nodes until the owners come back). I don't know what happens to them when you flush, but I guess they're kept in memory and in commit logs only.
And, of course, do a full restore before you assume this will work in the future.
I don't have much experience with Cassandra, but the backup solutions I have heard about for it are whole-cluster replicas in another region or datacenter, instead of cold backups like snapshots. That's probably more expensive, but also more reliable than raw disk snapshots like the ones you're trying to do.
I am not sure how backup of a node will help, because in C* data is already backed up in the replica nodes.
If a node is dead and has to be replaced, the new node will learn about the data from other nodes that it needs to own and get it from other nodes, so you might not need to restore from a disk backup.
Would a replication scenario like the following help?
Use two data centers (DC:A with 3 nodes, DC:B with one node) with an RF of (A:2 & B:1). Allow clients to interact with nodes in DC:A, with a read/write consistency of LOCAL_QUORUM. Here, since the quorum is 2, all reads and writes will be successful and you will get data replicated to DC:B. Now you could back up DC:B.
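The quorum figure in this scenario comes from the usual rule that a quorum is floor(RF / 2) + 1 replicas. A small sketch of the arithmetic follows; the function names are made up for illustration.

```python
def quorum(replication_factor):
    """Replicas that must respond for a quorum: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor):
    """Replicas that can be down while a quorum is still reachable."""
    return replication_factor - quorum(replication_factor)


if __name__ == "__main__":
    for rf in (1, 2, 3):
        print(rf, quorum(rf), tolerated_failures(rf))
```

With RF 2 in DC:A the local quorum is indeed 2, but that also means both replicas of a row must be reachable for a LOCAL_QUORUM request to succeed. An RF of 3 in DC:A would keep the quorum at 2 while tolerating one unavailable replica.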
I put our application on EC2 (Windows 2003 x64 server) and attached up to 7 EBS volumes. The app is very I/O intensive to storage -- typically we use DAS with NTFS mount points (usually around 32 mount points, each to a 1TB drive), so I tried to replicate that using EBS, but the I/O rates are bad, as in 22MB/s tops. We suspect the NIC connecting to EBS (which are dynamic SANs, if I read correctly) is limiting the pipeline. Our app mostly uses streaming for disk access (not random), so for us it works better when very little gets in the way of our talking to the disk controllers and handling I/O directly.
Also, when I create a volume and attach it, I see it appear in the instance (fine), and then I make it into a dynamic disk pointing to my mount point and quick format it. When I do this, does all the data on the volume get wiped? Because it certainly seems so when I attach it to another AMI. I must be missing something.
I'm curious whether anyone has experience putting I/O-intensive apps on the EC2 cloud and, if so, what's the best way to set up the volumes?
Thanks!
I've had limited experience, but I have noticed one small thing:
The initial write is generally slower than subsequent writes.
So if you're streaming a lot of data to disk, like writing logs, this will likely bite you. But if you make a big file, fill it with data, and do a lot of random-access I/O to it, performance gets better the second time you write to any specific location.