I am trying to automate backing up some EC2 instance volumes with the corresponding Ansible module.
However, when I log in to my instance:
$ df -h
Filesystem Size Used Avail Use% Mounted on
udev 488M 0 488M 0% /dev
tmpfs 100M 11M 89M 11% /run
/dev/xvda1 59G 3.2G 55G 6% /
tmpfs 496M 0 496M 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 496M 0 496M 0% /sys/fs/cgroup
/dev/loop4 13M 13M 0 100% /snap/amazon-ssm-agent/495
/dev/loop2 17M 17M 0 100% /snap/amazon-ssm-agent/734
/dev/loop6 88M 88M 0 100% /snap/core/5548
/dev/loop3 88M 88M 0 100% /snap/core/5662
/dev/loop1 17M 17M 0 100% /snap/amazon-ssm-agent/784
/dev/loop0 88M 88M 0 100% /snap/core/5742
tmpfs 100M 0 100M 0% /run/user/1003
tmpfs 100M 0 100M 0% /run/user/1004
When I tried to use /dev/xvda1 as the volume name, I got this error:
msg: Could not find volume with name /dev/xvda1 attached to instance i-02a334fgik4062
I had to explicitly use /dev/sda1 instead.
Why this inconsistency?
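For reference, the device name that the EC2 API itself reports for the attachment can be checked with the AWS CLI; a sketch, assuming the CLI is configured (the instance ID is the one from the error message):
$ aws ec2 describe-instances \
    --instance-ids i-02a334fgik4062 \
    --query 'Reservations[].Instances[].BlockDeviceMappings[].{Device:DeviceName,Volume:Ebs.VolumeId}' \
    --output table
# The API side typically reports /dev/sda1, even though the OS shows /dev/xvda1.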
That's not specific to Ansible; the AWS EC2 API does the same thing, as described in the Device Name Considerations section of its documentation, summarized here to avoid the "link-only" answer anti-pattern:
Depending on the block device driver of the kernel, the device could be attached with a different name than you specified. For example, if you specify a device name of /dev/sdh, your device could be renamed /dev/xvdh or /dev/hdh. In most cases, the trailing letter remains the same. In some versions of Red Hat Enterprise Linux (and its variants, such as CentOS), even the trailing letter could change (/dev/sda could become /dev/xvde). In these cases, the trailing letter of each device name is incremented the same number of times. For example, if /dev/sdb is renamed /dev/xvdf, then /dev/sdc is renamed /dev/xvdg. Amazon Linux creates a symbolic link for the name you specified to the renamed device. Other operating systems could behave differently.
In every case I've ever seen, the sd names are what you specify to the AWS API, but they materialize as xvd (or sometimes even nvme) devices on the actual instance.
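On the instance itself you can see how the kernel actually exposed the device; a minimal sketch (the device names follow the df output above, and the /dev/sda1 symlink is only guaranteed on Amazon Linux):
$ lsblk                   # block devices as the kernel names them, e.g. xvda or nvme0n1
$ ls -l /dev/sda1         # on Amazon Linux, a symlink pointing at the renamed device
$ readlink -f /dev/sda1   # resolves the symlink, e.g. to /dev/xvda1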
Processing a large dataset from a Google Cloud Platform notebook, I ran out of disk
space for /home/jupyter:
$ df -h
Filesystem Size Used Avail Use% Mounted on
udev 15G 0 15G 0% /dev
tmpfs 3.0G 8.5M 3.0G 1% /run
/dev/sda1 99G 38G 57G 40% /
tmpfs 15G 0 15G 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 15G 0 15G 0% /sys/fs/cgroup
/dev/sda15 124M 5.7M 119M 5% /boot/efi
/dev/sdb 492G 490G 2.1G 100% /home/jupyter
I deleted a large number of files and restarted the instance, but there was no change for /home/jupyter:
/dev/sdb 492G 490G 2.1G 100% /home/jupyter
I decided to explore this a little further to identify what on /home/jupyter was still taking up space.
$ du -sh /home/jupyter/
490G /home/jupyter/
$ du -sh /home/jupyter/*
254M /home/jupyter/SkullStrip
25G /home/jupyter/R01_2022
68G /home/jupyter/RSNA_ASNR_MICCAI_BraTS2021_TrainingData
4.0K /home/jupyter/Validate-Jay-nifti_skull_strip
284M /home/jupyter/imgbio-vnet-cgan-09012020-05172021
4.2M /home/jupyter/UNet
18M /home/jupyter/scott
15M /home/jupyter/tutorials
505M /home/jupyter/vnet-cgan-10042021
19M /home/jupyter/vnet_cgan_gen_multiplex_synthesis_10202021.ipynb
7.0G /home/jupyter/vnet_cgan_t1c_gen_10082020-12032020-pl-50-25-1
(base) jupyter@tensorflow-2-3-20210831-121523-08212021:~$
This does not add up. I would have thought that restarting the instance would clean up any processes that were still referencing deleted files.
What is taking up my disk space and how can I reclaim it?
Any direction would be appreciated.
Thanks,
Jay
The disk was fragmented. I created a new instance from scratch.
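As a follow-up for anyone hitting the same df/du mismatch: one quick way to test the deleted-but-still-referenced-files theory from the question is to list open handles to deleted files (a sketch, assuming lsof is installed):
$ sudo lsof +L1                 # open files with a link count of 0, i.e. deleted but still held open
$ sudo lsof | grep -i deleted   # alternative: grep the full listing for "(deleted)" entries
# If entries show up, the space is held by those processes; if the list is empty,
# deleted-but-open files are not the cause of the discrepancy.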
I am sharing one AWS volume ("Volume") between multiple EC2 instances (Instance A and Instance B).
[Instance A]
root@ip-xxx-xx-59-75:/data# df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/root 8065444 2405580 5643480 30% /
devtmpfs 1986800 0 1986800 0% /dev
tmpfs 1990908 8 1990900 1% /dev/shm
tmpfs 398184 820 397364 1% /run
tmpfs 5120 0 5120 0% /run/lock
tmpfs 1990908 0 1990908 0% /sys/fs/cgroup
/dev/nvme1n1 102687672 61468 97366940 1% /data <----- Same Volume
/dev/loop1 99328 99328 0 100% /snap/core/9665
/dev/loop0 28800 28800 0 100% /snap/amazon-ssm-agent/2012
/dev/loop2 56320 56320 0 100% /snap/core18/1880
/dev/loop3 56704 56704 0 100% /snap/core18/1885
/dev/loop4 73088 73088 0 100% /snap/lxd/16100
/dev/loop5 73216 73216 0 100% /snap/lxd/16530
tmpfs 398180 0 398180 0% /run/user/1001
tmpfs 398180 0 398180 0% /run/user/1000
[Instance B]
root@ip-xxx-xx-54-217:/home/ubuntu# df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/root 8065444 2368588 5680472 30% /
devtmpfs 1986800 0 1986800 0% /dev
tmpfs 1990908 8 1990900 1% /dev/shm
tmpfs 398184 828 397356 1% /run
tmpfs 5120 0 5120 0% /run/lock
tmpfs 1990908 0 1990908 0% /sys/fs/cgroup
/dev/nvme1n1 102687672 61468 97366940 1% /data <----- Same Volume
/dev/loop0 28800 28800 0 100% /snap/amazon-ssm-agent/2012
/dev/loop1 99328 99328 0 100% /snap/core/9665
/dev/loop2 56320 56320 0 100% /snap/core18/1880
/dev/loop3 56704 56704 0 100% /snap/core18/1885
/dev/loop4 73088 73088 0 100% /snap/lxd/16100
/dev/loop5 73216 73216 0 100% /snap/lxd/16530
tmpfs 398180 0 398180 0% /run/user/1001
tmpfs 398180 0 398180 0% /run/user/1000
Both instances are using the same volume.
I created a file (test.html) on Instance A, but I couldn't see the same file on Instance B.
If I reboot Instance B, then I can see test.html.
Is there any way to share the same files instantly between multiple instances (Instance A and B) at the same time, without a reboot?
You'll need a filesystem / app that's multi-attach aware. For example Oracle RAC can use such volumes, while normal filesystems like ext4 or xfs can't. They are designed to be mounted on a single host only.
Let's step back - what are you trying to achieve? Share files between the instances, I suppose? Your best bet is EFS (Elastic File System), an AWS cloud-native NFS service. Unless you've got a very specific need for multi-attach EBS and are running some very special application that can make use of it, I suggest you explore the EFS route instead. The need for multi-attach disks is rare, both in the cloud and outside it.
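A minimal sketch of the EFS approach, assuming an EFS file system already exists in the same VPC and its security group allows NFS from both instances (the file system ID fs-12345678, the region and the mount point /mnt/efs are placeholders):
# On both Instance A and Instance B:
$ sudo mkdir -p /mnt/efs
$ sudo mount -t nfs4 -o nfsvers=4.1 fs-12345678.efs.us-east-1.amazonaws.com:/ /mnt/efs
# A file written on one instance is visible on the other right away:
$ echo hello | sudo tee /mnt/efs/test.html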
I created an instance one month ago and it was working fine, but suddenly it stopped working, which is strange. When I logged in with SSH and ran df -h, I can see that some filesystems are at 100%, which looks like the cause of the "No space left on device" error I am getting. I tried attaching a volume, but it is still not working. Can anyone please help me resolve this issue? I have attached my df output here:
Filesystem Size Used Avail Use% Mounted on
udev 3.9G 0 3.9G 0% /dev
tmpfs 787M 80M 707M 11% /run
/dev/nvme0n1p1 7.7G 7.7G 0 100% /
tmpfs 3.9G 0 3.9G 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 3.9G 0 3.9G 0% /sys/fs/cgroup
/dev/loop0 29M 29M 0 100% /snap/amazon-ssm-agent/2012
/dev/loop1 97M 97M 0 100% /snap/core/9436
/dev/loop2 18M 18M 0 100% /snap/amazon-ssm-agent/1566
/dev/loop3 98M 98M 0 100% /snap/core/9289
tmpfs 787M 0 787M 0% /run/user/1000
First of all, don't panic. It can be easily solved.
Steps
Use the following command to see the disk usage of your machine:
df -h
This will display the list of volumes mounted on the file system:
Filesystem Size Used Avail Use% Mounted on
udev 992M 0 992M 0% /dev
tmpfs 200M 21M 180M 11% /run
/dev/xvda1 7.7G 7.7G 0 100% /
tmpfs 1000M 0 1000M 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 1000M 0 1000M 0% /sys/fs/cgroup
tmpfs 200M 0 200M 0% /run/user/1000
You can see that /dev/xvda1 is full.
To peek into /dev/xvda1 further, issue the following command:
sudo du -shx /* | sort -h
This will give the per-directory disk usage, like:
4.0K /mnt
4.0K /srv
16K /lost+found
16K /opt
24K /snap
32K /tmp
532K /home
7.0M /etc
14M /sbin
16M /bin
42M /run
76M /boot
186M /lib
195M /root
2.2G /usr
5.1G /var
We can see that /usr and /var are consuming the most space.
Just drill down into these folders and remove unnecessary files.
Then go to the AWS EC2 console and reboot the instance.
Problem solved.
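For example, on an Ubuntu-based AMI a few common, relatively safe cleanups under /var look like this (a sketch assuming apt and systemd-journald are in use; check what each command removes before running it):
$ sudo du -sh /var/* | sort -h          # find the biggest offenders under /var
$ sudo apt-get clean                    # drop cached .deb packages from /var/cache/apt
$ sudo journalctl --vacuum-size=100M    # shrink the systemd journal in /var/log/journal
# Old rotated logs such as /var/log/*.gz can usually be deleted as well.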
I am setting up a g3.16xlarge instance at AWS for a deep learning project. After installing the CUDA libraries, I am facing a space shortage issue. df -h outputs the following:
Filesystem Size Used Avail Use% Mounted on
udev 241G 0 241G 0% /dev
tmpfs 49G 8.8M 49G 1% /run
/dev/xvda1 7.7G 7.4G 376M 96% /
tmpfs 241G 0 241G 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 241G 0 241G 0% /sys/fs/cgroup
/dev/loop0 13M 13M 0 100% /snap/amazon-ssm-agent/495
/dev/loop1 88M 88M 0 100% /snap/core/5328
tmpfs 49G 0 49G 0% /run/user/1000
While my root /dev/xvda1 is full, I have 241G in the udev partition. Can I allot that space to /dev/xvda1? When I checked for solutions, all I could find was advice to increase the size of the EBS volume online, which I don't want to do, as I am already spending a huge fee for this machine. Even before I could complete the setup I stumbled upon this issue.
I have chosen an Ubuntu 16.04 AMI. However, this doesn't happen if I choose the Ubuntu 16.04 AMI that is pre-configured for deep learning.
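One thing worth checking before trying to reassign that space is what actually backs the udev mount; a quick sketch using the mounts from the df output above:
$ findmnt /dev     # shows SOURCE=udev, FSTYPE=devtmpfs
$ findmnt /        # shows the real disk, /dev/xvda1
# udev is a RAM-backed devtmpfs, so its 241G is memory, not disk space that
# could be reallocated to /dev/xvda1.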
First of all, this is my df -h output:
Filesystem Size Used Avail Use% Mounted on
/dev/xvda1 7.9G 4.2G 3.4G 56% /
udev 1.9G 8.0K 1.9G 1% /dev
tmpfs 751M 180K 750M 1% /run
none 5.0M 0 5.0M 0% /run/lock
none 1.9G 0 1.9G 0% /run/shm
/dev/xvdb 394G 8.4G 366G 3% /mnt
I know that /mnt is ephemeral storage, so all data stored there will be deleted after a reboot.
Is it OK to create a /mnt/swap file to use as a swap file? I added the following line to /etc/fstab:
/mnt/swap1 swap swap defaults 0 0
By the way, what is /run/shm used for?
Thanks.
Ephemeral storage preserves data between reboots, but loses it between restarts (stop/start). Also see this: What data is stored in Ephemeral Storage of Amazon EC2 instance?
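If you do go the /mnt swap route, a minimal sketch of creating and enabling the swap file (using the /mnt/swap1 path from the fstab line above; the 4G size is just an example):
$ sudo fallocate -l 4G /mnt/swap1   # or: sudo dd if=/dev/zero of=/mnt/swap1 bs=1M count=4096
$ sudo chmod 600 /mnt/swap1         # swap files should not be world-readable
$ sudo mkswap /mnt/swap1
$ sudo swapon /mnt/swap1
$ swapon -s                         # verify the swap file is active
# Since /mnt is ephemeral, the file has to be recreated after a stop/start.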