Scratch disk visibility on Google Compute Engine VM - google-cloud-platform

I've started an instance whose machine type ends in -d (this should mean it comes with scratch disks).
But on boot the stated disk space is not visible.
It should be:
8 vCPUs, 52 GB RAM, 2 scratch disks (1770 GB, 1770 GB)
But df -h outputs:
Filesystem Size Used Avail Use% Mounted on
rootfs 10G 644M 8.9G 7% /
/dev/root 10G 644M 8.9G 7% /
none 1.9G 0 1.9G 0% /dev
tmpfs 377M 116K 377M 1% /run
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 753M 0 753M 0% /run/shm
So how does one run an instance that boots from a persistent disk and still has its scratch disks available?
The thing is that I need high CPU and lots of scratch space.

df does not show the scratch disks because they are not yet formatted and mounted. Issue the following command:
ls -l /dev/disk/by-id/
In the output there will be something like:
lrwxrwxrwx 1 root root ... scsi-0Google_EphemeralDisk_ephemeral-disk-0 -> ../../sdb
lrwxrwxrwx 1 root root ... scsi-0Google_EphemeralDisk_ephemeral-disk-1 -> ../../sdc
Then, you can use mkfs and mount the appropriate disks.
See documentation for more info.
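A minimal sketch of the remaining steps, assuming the first scratch disk resolved to the by-id path shown above (verify against your own ls output before running this, since mkfs destroys any existing data on the device):

```shell
# Assumed device path from the ls -l /dev/disk/by-id/ output above; adjust to yours.
DISK=/dev/disk/by-id/scsi-0Google_EphemeralDisk_ephemeral-disk-0

if [ -e "$DISK" ]; then
    sudo mkfs.ext4 -F "$DISK"          # format the scratch disk (erases it)
    sudo mkdir -p /mnt/scratch0        # create a mount point
    sudo mount "$DISK" /mnt/scratch0   # mount it
    df -h /mnt/scratch0                # the space should now be visible
else
    echo "no scratch disk at $DISK on this machine"
fi
```

Since scratch disk contents don't survive the instance, this belongs in a startup script rather than being done once by hand.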

Related

Ran out of disk space on Google Cloud notebook, deleted files, still shows 100% usage

Processing a large dataset from a Google Cloud Platform notebook, I ran out of disk
space for /home/jupyter:
$ df -h
Filesystem Size Used Avail Use% Mounted on
udev 15G 0 15G 0% /dev
tmpfs 3.0G 8.5M 3.0G 1% /run
/dev/sda1 99G 38G 57G 40% /
tmpfs 15G 0 15G 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 15G 0 15G 0% /sys/fs/cgroup
/dev/sda15 124M 5.7M 119M 5% /boot/efi
/dev/sdb 492G 490G 2.1G 100% /home/jupyter
I deleted a large number of files and restarted the instance, but there was no change for /home/jupyter:
/dev/sdb 492G 490G 2.1G 100% /home/jupyter
Decided to explore this a little further to identify what on /home/jupyter was still
taking up space.
$ du -sh /home/jupyter/
490G /home/jupyter/
$ du -sh /home/jupyter/*
254M /home/jupyter/SkullStrip
25G /home/jupyter/R01_2022
68G /home/jupyter/RSNA_ASNR_MICCAI_BraTS2021_TrainingData
4.0K /home/jupyter/Validate-Jay-nifti_skull_strip
284M /home/jupyter/imgbio-vnet-cgan-09012020-05172021
4.2M /home/jupyter/UNet
18M /home/jupyter/scott
15M /home/jupyter/tutorials
505M /home/jupyter/vnet-cgan-10042021
19M /home/jupyter/vnet_cgan_gen_multiplex_synthesis_10202021.ipynb
7.0G /home/jupyter/vnet_cgan_t1c_gen_10082020-12032020-pl-50-25-1
(base) jupyter@tensorflow-2-3-20210831-121523-08212021:~$
This does not add up. I would think that by restarting the instance, the processes that were referencing deleted files would be cleaned up.
What is taking up my disk space and how can I reclaim it?
Any direction would be appreciated.
Thanks,
Jay
Disk was fragmented. Created a new instance from scratch.
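One diagnostic worth noting for anyone hitting the same du/df mismatch: the glob in du -sh /home/jupyter/* does not match hidden entries, so space held in dot-directories (a hypothetical .cache or .Trash, say) would not show up in the per-directory listing above, even though du -sh /home/jupyter/ counts it. A sketch that includes hidden entries:

```shell
# du over a bare * misses dot-entries; summarising the directory itself
# with a depth limit counts everything, hidden directories included.
du -h --max-depth=1 /home/jupyter/ 2>/dev/null | sort -h | tail -n 20
```

Whether that explains the gap in this particular case I can't tell from the listing, but it's a cheap check before rebuilding an instance.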

Increasing root space in AWS instance

I am setting up a g3.16xlarge instance at AWS for a deep learning project. After installing the CUDA libraries, I am facing a disk space shortage. df -h outputs the following:
Filesystem Size Used Avail Use% Mounted on
udev 241G 0 241G 0% /dev
tmpfs 49G 8.8M 49G 1% /run
/dev/xvda1 7.7G 7.4G 376M 96% /
tmpfs 241G 0 241G 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 241G 0 241G 0% /sys/fs/cgroup
/dev/loop0 13M 13M 0 100% /snap/amazon-ssm-agent/495
/dev/loop1 88M 88M 0 100% /snap/core/5328
tmpfs 49G 0 49G 0% /run/user/1000
While my root /dev/xvda1 is full, I have 241G in the udev partition. Can I allot that space to /dev/xvda1? When I checked for solutions, all I could find was advice to increase the size of the EBS volume online, which I don't want to do, as I am already paying a huge fee for this machine. Even before I could complete the setup I stumbled upon this issue.
I have chosen an Ubuntu 16.04 AMI. However, this doesn't happen if I choose the Ubuntu 16.04 AMI that is pre-configured for deep learning.
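For what it's worth, the 241G under udev is not reclaimable disk: udev is a RAM-backed device filesystem, sized relative to the instance's memory, so it cannot be handed to the root EBS volume. A quick way to confirm (a sketch; findmnt and lsblk ship with util-linux on Ubuntu 16.04):

```shell
# udev is devtmpfs (RAM-backed), not block storage.
findmnt -o FSTYPE,SIZE /dev    # filesystem type of /dev, sized from RAM
lsblk                          # the real block devices and their sizes
```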

Running out of disk space in Amazon EC2, can't find what I am using my storage for

I am running an AWS AMI on a t2.large instance in US East. I was trying to upload some data, and I ran in the terminal:
df -h
and I got this result:
Filesystem Size Used Avail Use% Mounted on
udev 3.9G 0 3.9G 0% /dev
tmpfs 799M 8.6M 790M 2% /run
/dev/xvda1 9.7G 9.6G 32M 100% /
tmpfs 3.9G 0 3.9G 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 3.9G 0 3.9G 0% /sys/fs/cgroup
tmpfs 799M 0 799M 0% /run/user/1000
I know I have not uploaded 9.7 GB of data to the instance, but I don't know what /dev/xvda1 is or how to access it.
I also assume that all the tmpfs entries are temporary filesystems; how can I erase those?
Answering some of the questions in the comments, I ran
sudo du -sh /*
And I got:
16M /bin
124M /boot
0 /dev
6.5M /etc
2.7G /home
0 /initrd.img
0 /initrd.img.old
4.0K /jupyterhub_cookie_secret
16K /jupyterhub.sqlite
268M /lib
4.0K /lib64
16K /lost+found
4.0K /media
4.0K /mnt
562M /opt
du: cannot access '/proc/15616/task/15616/fd/4': No such file or directory
du: cannot access '/proc/15616/task/15616/fdinfo/4': No such file or directory
du: cannot access '/proc/15616/fd/4': No such file or directory
du: cannot access '/proc/15616/fdinfo/4': No such file or directory
0 /proc
28K /root
8.6M /run
14M /sbin
8.0K /snap
8.0K /srv
0 /sys
64K /tmp
4.7G /usr
1.5G /var
0 /vmlinuz
0 /vmlinuz.old
When you run out of root filesystem space, and aren't doing anything that you know consumes space, then 99% of the time (+/- 98%) it's a logfile. Run this:
sudo du -s /var/log/* | sort -n
You'll see a listing of all of the sub-directories in /var/log (which is the standard logging destination for Linux systems), and at the end you'll probably see an entry with a very large number next to it. If you don't see anything there, then the next place to try is /tmp (which I'd do with du -sh /tmp since it prints a single number with "human" scaling). And if that doesn't work, then you need to run the original command on the root of the filesystem, /* (that may take some time).
Assuming that it is a logfile, then you should take a look at it to see if there's an error in the related application. If not, you may just need to learn about logrotate.
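Once the offender is identified, a hedged cleanup sketch (the path /var/log/myapp.log is a made-up stand-in for whatever file the du listing points at):

```shell
# Truncating in place is safer than rm for a live logfile: the writing
# process keeps its open handle either way, but after truncate the space
# is freed immediately, whereas an rm'd file stays allocated on disk
# until the process is restarted.
LOG=/var/log/myapp.log          # hypothetical offender; substitute your own
if [ -f "$LOG" ]; then
    sudo truncate -s 0 "$LOG"
fi

# Longer term, let logrotate cap it; check whether a rule already exists:
ls /etc/logrotate.d/ 2>/dev/null || true
```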
/dev/xvda1 is your root volume. The AMI you listed has a default root volume size of 20GB as you can see here:
Describe the image and get its block device mappings:
aws ec2 describe-images --image-ids ami-3b0c205e --region us-east-2 | jq .Images[].BlockDeviceMappings[]
Look at the volume size
{
"DeviceName": "/dev/sda1",
"Ebs": {
"Encrypted": false,
"DeleteOnTermination": true,
"VolumeType": "gp2",
"VolumeSize": 20,
"SnapshotId": "snap-03341b1ff8ee47eaa"
}
}
{
"DeviceName": "/dev/sdb",
"VirtualName": "ephemeral0"
}
{
"DeviceName": "/dev/sdc",
"VirtualName": "ephemeral1"
}
When launched with the correct volume size of 20GB there is plenty of free space (10GB):
root@ip-10-100-0-64:~# df -h
Filesystem Size Used Avail Use% Mounted on
udev 488M 0 488M 0% /dev
tmpfs 100M 3.1M 97M 4% /run
/dev/xvda1 20G 9.3G 11G 49% /
tmpfs 496M 0 496M 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 496M 0 496M 0% /sys/fs/cgroup
tmpfs 100M 0 100M 0% /run/user/1000
It appears the issue here is that the instance was somehow launched with 10GB of storage (I didn't think this was possible) instead of the default 20GB.
/dev/xvda1 is your disk-based storage on the Amazon storage system (EBS).
It is the only storage your system has, and it contains your operating system and all data, so I guess most of the space is used by the Ubuntu installation.
Remember: T instances at Amazon don't have any local disk at all.

How setup ec2 i3.large to get 475 Gb?

I have launched an i3.large, which should offer about 475 GB of storage (https://aws.amazon.com/ec2/instance-types/), but when I invoke df -h I get:
Filesystem Size Used Avail Use% Mounted on
devtmpfs 7.5G 64K 7.5G 1% /dev
tmpfs 7.5G 0 7.5G 0% /dev/shm
/dev/xvda1 7.8G 7.7G 0 100% /
E.g. this instance contains only 22.5 GB. Why? What does the 0.475 TB in the instance types table refer to?
Try copying a large amount of data there - I'm wondering if it expands. I tried doing a df -h on mine and got small numbers but then I built a 200GB database on it.
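For what it's worth, the 475 GB on i3 instances is instance-store (NVMe) capacity, and like the GCE scratch disks earlier in this thread it appears as a separate, unformatted block device rather than extra space on the root volume, so the root filesystem won't expand on its own. A sketch of locating and mounting it; the device name /dev/nvme0n1 is an assumption, so confirm with lsblk first (mkfs erases the device):

```shell
lsblk                            # the ~475G instance store should show up
                                 # as an extra disk, often nvme0n1

DEV=/dev/nvme0n1                 # assumed name -- confirm before formatting
if [ -e "$DEV" ]; then
    sudo mkfs.ext4 -F "$DEV"     # destroys anything on the device
    sudo mkdir -p /mnt/nvme
    sudo mount "$DEV" /mnt/nvme
    df -h /mnt/nvme
else
    echo "$DEV not present; check lsblk for the actual device name"
fi
```

As with any instance store, data there is lost on stop/start.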

Can I create swap file in the ephemeral storage?

First of all, this is my df -h output:
Filesystem Size Used Avail Use% Mounted on
/dev/xvda1 7.9G 4.2G 3.4G 56% /
udev 1.9G 8.0K 1.9G 1% /dev
tmpfs 751M 180K 750M 1% /run
none 5.0M 0 5.0M 0% /run/lock
none 1.9G 0 1.9G 0% /run/shm
/dev/xvdb 394G 8.4G 366G 3% /mnt
I know that /mnt is ephemeral storage, and that all data stored on it will be deleted after reboot.
Is it OK to create a /mnt/swap file to use as swap? I added the following line to /etc/fstab:
/mnt/swap1 swap swap defaults 0 0
By the way, what is /run/shm used for?
Thanks.
Ephemeral storage preserves data between reboots, but loses it on a restart (stop/start). Also see this: What data is stored in Ephemeral Storage of Amazon EC2 instance?
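A sketch of creating the swap file itself; since the ephemeral disk comes back empty after a stop/start, this (plus swapon) has to be repeated from a boot script, not just listed in /etc/fstab. The 2 GiB size is illustrative:

```shell
# Only proceed if /mnt really is the mounted ephemeral disk.
if mountpoint -q /mnt; then
    sudo dd if=/dev/zero of=/mnt/swap1 bs=1M count=2048   # 2 GiB swap file
    sudo chmod 600 /mnt/swap1    # swap files must not be world-readable
    sudo mkswap /mnt/swap1
    sudo swapon /mnt/swap1
    swapon -s                    # verify the new swap is active
else
    echo "/mnt is not a mounted filesystem; skipping swap setup"
fi
```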