I've started an instance whose machine type ends in -d (which should mean it comes with scratch disks).
But after boot, the advertised disk space is not visible.
It should be:
8 vCPUs, 52 GB RAM, 2 scratch disks (1770 GB, 1770 GB)
But df -h outputs:
Filesystem Size Used Avail Use% Mounted on
rootfs 10G 644M 8.9G 7% /
/dev/root 10G 644M 8.9G 7% /
none 1.9G 0 1.9G 0% /dev
tmpfs 377M 116K 377M 1% /run
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 753M 0 753M 0% /run/shm
So how does one run an instance that boots from a persistent disk and still has its scratch disks available?
The thing is that I need high CPU and lots of scratch space.
df does not show scratch disks because they are not formatted and mounted. Issue the following command:
ls -l /dev/disk/by-id/
In the output there will be something like:
lrwxrwxrwx 1 root root ... scsi-0Google_EphemeralDisk_ephemeral-disk-0 -> ../../sdb
lrwxrwxrwx 1 root root ... scsi-0Google_EphemeralDisk_ephemeral-disk-1 -> ../../sdc
Then, you can use mkfs and mount the appropriate disks.
See documentation for more info.
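As a rough sketch of that last step (not taken from the documentation): the function below only prints the mkfs and mount commands for every ephemeral-disk link it finds, so they can be reviewed before being run as root. The ext4 filesystem, the -F flag and the /mnt/scratchN mount points are assumptions, and scratch_mount_plan is a made-up helper name.

```shell
#!/usr/bin/env bash
# Dry run: print (never execute) format-and-mount commands for each scratch disk.
# Assumptions: ext4 as the filesystem, /mnt/scratchN as mount points.
scratch_mount_plan() {
    local byid_dir="${1:-/dev/disk/by-id}"
    local mount_root="${2:-/mnt}"
    local n=0 link dev
    for link in "$byid_dir"/*ephemeral-disk-*; do
        [ -L "$link" ] || continue          # skip the literal pattern when nothing matches
        dev=$(readlink -f "$link")          # e.g. /dev/sdb
        echo "mkfs.ext4 -F $dev"            # -F: do not prompt; this erases the disk
        echo "mkdir -p $mount_root/scratch$n"
        echo "mount $dev $mount_root/scratch$n"
        n=$((n + 1))
    done
}
```

Calling scratch_mount_plan with no arguments reads /dev/disk/by-id. Running the printed commands with sudo formats the disks, which destroys anything already stored on them.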
Related
Processing a large dataset from a Google Cloud Platform notebook, I ran out of disk
space for /home/jupyter:
$ df -h
Filesystem Size Used Avail Use% Mounted on
udev 15G 0 15G 0% /dev
tmpfs 3.0G 8.5M 3.0G 1% /run
/dev/sda1 99G 38G 57G 40% /
tmpfs 15G 0 15G 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 15G 0 15G 0% /sys/fs/cgroup
/dev/sda15 124M 5.7M 119M 5% /boot/efi
/dev/sdb 492G 490G 2.1G 100% /home/jupyter
I deleted a large number of files and restarted the instance, but the usage on /home/jupyter did not change:
/dev/sdb 492G 490G 2.1G 100% /home/jupyter
Decided to explore this a little further to identify what on /home/jupyter was still
taking up space.
$ du -sh /home/jupyter/
490G /home/jupyter/
$ du -sh /home/jupyter/*
254M /home/jupyter/SkullStrip
25G /home/jupyter/R01_2022
68G /home/jupyter/RSNA_ASNR_MICCAI_BraTS2021_TrainingData
4.0K /home/jupyter/Validate-Jay-nifti_skull_strip
284M /home/jupyter/imgbio-vnet-cgan-09012020-05172021
4.2M /home/jupyter/UNet
18M /home/jupyter/scott
15M /home/jupyter/tutorials
505M /home/jupyter/vnet-cgan-10042021
19M /home/jupyter/vnet_cgan_gen_multiplex_synthesis_10202021.ipynb
7.0G /home/jupyter/vnet_cgan_t1c_gen_10082020-12032020-pl-50-25-1
(base) jupyter@tensorflow-2-3-20210831-121523-08212021:~$
This does not add up. I would think that by restarting the instance, the processes that were referencing deleted files would be cleaned up.
What is taking up my disk space and how can I reclaim it?
Any direction would be appreciated.
Thanks,
Jay
Disk was fragmented. Created a new instance from scratch.
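One thing worth checking before rebuilding: du -sh /home/jupyter/* does not count hidden entries, because the shell's * skips names that start with a dot. The visible directories listed above add up to roughly 100G, so most of the 490G may sit in dot directories (a trash or cache folder, for example; that part is a guess). A minimal sketch, assuming GNU du; usage_by_entry is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# List the size of every direct child of a directory, hidden entries included.
usage_by_entry() {
    local dir="${1:-.}"
    # -x: stay on one filesystem, -k: sizes in KiB,
    # --max-depth=1: one line per direct child plus a total for the directory itself
    du -x -k --max-depth=1 "$dir" 2>/dev/null | sort -n
}
```

Running usage_by_entry /home/jupyter prints the largest entries last, so a large dot directory shows up near the bottom of the output.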
I am setting up a g3.16xlarge instance on AWS for a deep learning project. After installing the CUDA libraries, I ran out of disk space. df -h outputs the following:
Filesystem Size Used Avail Use% Mounted on
udev 241G 0 241G 0% /dev
tmpfs 49G 8.8M 49G 1% /run
/dev/xvda1 7.7G 7.4G 376M 96% /
tmpfs 241G 0 241G 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 241G 0 241G 0% /sys/fs/cgroup
/dev/loop0 13M 13M 0 100% /snap/amazon-ssm-agent/495
/dev/loop1 88M 88M 0 100% /snap/core/5328
tmpfs 49G 0 49G 0% /run/user/1000
While my root /dev/xvda1 is full, I have 241G in the udev partition. Can I allot that space to /dev/xvda1? When I looked for solutions, all I could find was advice to increase the size of the EBS volume online, which I don't want to do, as I am already paying a huge fee for this machine. I stumbled upon this issue before I could even complete the setup.
I have chosen an Ubuntu 16.04 AMI. However, this doesn't happen if I choose the Ubuntu 16.04 AMI that is pre-configured for deep learning.
I am running an AWS AMI on a t2.large instance in US East. I was trying to upload some data, and in the terminal I ran:
df -h
and I got this result:
Filesystem Size Used Avail Use% Mounted on
udev 3.9G 0 3.9G 0% /dev
tmpfs 799M 8.6M 790M 2% /run
/dev/xvda1 9.7G 9.6G 32M 100% /
tmpfs 3.9G 0 3.9G 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 3.9G 0 3.9G 0% /sys/fs/cgroup
tmpfs 799M 0 799M 0% /run/user/1000
I know I have not uploaded 9.7 GB of data to the instance, but I don't know what /dev/xvda1 is or how to access it.
I also assume that all the tmpfs entries are temporary files; how can I erase those?
In response to some of the questions in the comments, I ran
sudo du -sh /*
And I got:
16M /bin
124M /boot
0 /dev
6.5M /etc
2.7G /home
0 /initrd.img
0 /initrd.img.old
4.0K /jupyterhub_cookie_secret
16K /jupyterhub.sqlite
268M /lib
4.0K /lib64
16K /lost+found
4.0K /media
4.0K /mnt
562M /opt
du: cannot access '/proc/15616/task/15616/fd/4': No such file or directory
du: cannot access '/proc/15616/task/15616/fdinfo/4': No such file or directory
du: cannot access '/proc/15616/fd/4': No such file or directory
du: cannot access '/proc/15616/fdinfo/4': No such file or directory
0 /proc
28K /root
8.6M /run
14M /sbin
8.0K /snap
8.0K /srv
0 /sys
64K /tmp
4.7G /usr
1.5G /var
0 /vmlinuz
0 /vmlinuz.old
When you run out of root filesystem space, and aren't doing anything that you know consumes space, then 99% of the time (+/- 98%) it's a logfile. Run this:
sudo du -s /var/log/* | sort -n
You'll see a listing of all of the sub-directories in /var/log (which is the standard logging destination for Linux systems), and at the end you'll probably see an entry with a very large number next to it. If you don't see anything there, then the next place to try is /tmp (which I'd do with du -sh /tmp since it prints a single number with "human" scaling). And if that doesn't work, then you need to run the original command on the root of the filesystem, /* (that may take some time).
Assuming that it is a logfile, then you should take a look at it to see if there's an error in the related application. If not, you may just need to learn about logrotate.
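The search order described above can be wrapped in a small helper. This is only a sketch: largest_entries is a made-up name and the default of ten rows is arbitrary. Sizes are printed in KiB so that sort -n orders them correctly.

```shell
#!/usr/bin/env bash
# Print the largest entries directly under a directory, biggest last.
largest_entries() {
    local dir="${1:-/var/log}"
    local count="${2:-10}"
    # -s: one total per entry, -k: KiB units; unreadable paths are silently skipped
    du -sk "$dir"/* 2>/dev/null | sort -n | tail -n "$count"
}

# Suggested order, from a root shell so that nothing is skipped:
#   largest_entries /var/log
#   largest_entries /tmp
#   largest_entries ""        # expands to /*, the whole root filesystem (slow)
```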
/dev/xvda1 is your root volume. The AMI you listed has a default root volume size of 20GB as you can see here:
Describe the image and get its block device mappings:
aws ec2 describe-images --image-ids ami-3b0c205e --region us-east-2 | jq '.Images[].BlockDeviceMappings[]'
Look at the volume size:
{
"DeviceName": "/dev/sda1",
"Ebs": {
"Encrypted": false,
"DeleteOnTermination": true,
"VolumeType": "gp2",
"VolumeSize": 20,
"SnapshotId": "snap-03341b1ff8ee47eaa"
}
}
{
"DeviceName": "/dev/sdb",
"VirtualName": "ephemeral0"
}
{
"DeviceName": "/dev/sdc",
"VirtualName": "ephemeral1"
}
When launched with the correct volume size of 20GB, there is plenty of free space (about 11GB):
root@ip-10-100-0-64:~# df -h
Filesystem Size Used Avail Use% Mounted on
udev 488M 0 488M 0% /dev
tmpfs 100M 3.1M 97M 4% /run
/dev/xvda1 20G 9.3G 11G 49% /
tmpfs 496M 0 496M 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 496M 0 496M 0% /sys/fs/cgroup
tmpfs 100M 0 100M 0% /run/user/1000
It appears the issue here is that the instance was launched with 10GB of storage instead of the default 20GB (I didn't think this was possible).
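If the smaller root volume came from the launch settings, the size can be stated explicitly at launch. A sketch, assuming the AWS CLI is installed and configured; root_volume_mapping is a hypothetical helper, and the device name and the 20 GB value are taken from the describe-images output above.

```shell
#!/usr/bin/env bash
# Build the JSON expected by the --block-device-mappings option of run-instances.
root_volume_mapping() {
    local device="$1"
    local size_gb="$2"
    printf '[{"DeviceName":"%s","Ebs":{"VolumeSize":%d}}]\n' "$device" "$size_gb"
}

# Example launch (not executed here; it creates a billable instance):
#   aws ec2 run-instances --image-id ami-3b0c205e --instance-type t2.large \
#       --region us-east-2 \
#       --block-device-mappings "$(root_volume_mapping /dev/sda1 20)"
```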
/dev/xvda1 is your disk-based storage on the Amazon storage system.
It is the only storage your system has, and it contains your operating system and all your data. So I guess most of the space is used by the Ubuntu installation.
Remember: T instances at Amazon don't have any local disk at all.
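On the tmpfs question: the tmpfs and udev rows are memory-backed filesystems, not files stored on /dev/xvda1, so there is nothing to erase there and their size cannot be moved to the root volume. A small sketch that hides them so that only disk-backed filesystems remain, assuming GNU df; disk_backed_filesystems is a made-up name:

```shell
#!/usr/bin/env bash
# Show df output without the memory-backed and snap loop filesystems.
disk_backed_filesystems() {
    # -T adds a Type column; each -x drops one filesystem type from the listing
    df -hT -x tmpfs -x devtmpfs -x squashfs "$@"
}
```

On the instance above this should leave just the /dev/xvda1 row, which is the only place where freeing files makes a difference.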
I have launched an i3.large, which offers about 475 GB of storage (https://aws.amazon.com/ec2/instance-types/), but when I run df -h I get:
Filesystem Size Used Avail Use% Mounted on
devtmpfs 7.5G 64K 7.5G 1% /dev
tmpfs 7.5G 0 7.5G 0% /dev/shm
/dev/xvda1 7.8G 7.7G 0 100% /
So this instance appears to have only about 22.5 GB. Why? What does the 0.475 TB in the instance types table refer to?
Try copying a large amount of data there - I'm wondering if it expands. I tried doing a df -h on mine and got small numbers but then I built a 200GB database on it.
First of all, this is my df -h output:
Filesystem Size Used Avail Use% Mounted on
/dev/xvda1 7.9G 4.2G 3.4G 56% /
udev 1.9G 8.0K 1.9G 1% /dev
tmpfs 751M 180K 750M 1% /run
none 5.0M 0 5.0M 0% /run/lock
none 1.9G 0 1.9G 0% /run/shm
/dev/xvdb 394G 8.4G 366G 3% /mnt
I know that /mnt is ephemeral storage and that all data stored there will be deleted after a reboot.
Is it OK to create a /mnt/swap file to use as a swap file? I added the following line to /etc/fstab:
/mnt/swap1 swap swap defaults 0 0
By the way, what is /run/shm used for?
Thanks.
Ephemeral storage preserves data across reboots, but loses it when the instance is stopped and started again. Also see: What data is stored in Ephemeral Storage of Amazon EC2 instance?
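Because the file disappears on a stop/start, a swap file on /mnt has to be recreated before it can be enabled again. A minimal sketch of that, assuming GNU dd; both function names and the 4096 MB size in the example are made up for illustration, and enable_swap needs root.

```shell
#!/usr/bin/env bash
# Create a fully allocated file suitable for use as swap.
create_swap_backing_file() {
    local path="$1"
    local size_mb="$2"
    # dd writes real blocks; a sparse file cannot be used as swap
    dd if=/dev/zero of="$path" bs=1M count="$size_mb" status=none
    chmod 600 "$path"                     # swap contents must not be readable by others
}

# Write the swap signature and activate the file (requires root).
enable_swap() {
    local path="$1"
    mkswap "$path" && swapon "$path"
}

# Example for a boot-time script run as root:
#   [ -f /mnt/swap1 ] || create_swap_backing_file /mnt/swap1 4096
#   enable_swap /mnt/swap1
```

With a script like this in place, the /etc/fstab entry is optional; if the entry is kept, the file still has to exist before swapon can use it.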