Additional 500 GB persistent disk attached by default - google-cloud-platform

I am trying to run a workflow on GCP using Nextflow. The problem is that whenever an instance is created to run a process, it has two disks attached: the boot disk (default 10 GB) and an additional 'google-pipelines-worker' disk (default 500 GB). When I run multiple processes in parallel, multiple VMs are created and each has an additional 500 GB disk attached. Is there any way to customize the 500 GB default?
nextflow.config
process {
    executor = 'google-pipelines'
}

cloud {
    driver = 'google'
}

google {
    project = 'my-project'
    zone = 'europe-west2-b'
}
main.nf
#!/usr/bin/env nextflow
barcodes = Channel.from(params.analysis_cfg.barcodes.keySet())
process run_pbb {
    machineType 'n1-standard-2'
    container 'eu.gcr.io/my-project/container-1'

    output:
    file 'this.txt' into barcodes_ch

    script:
    """
    sleep 500
    touch this.txt   # create the declared output so the sample completes
    """
}
The code provided is just a sample. Basically, this creates a VM instance with an additional 500 GB standard persistent disk attached to it.

Nextflow addressed this in a recent edge release, so I will leave this here.
First run export NXF_VER=19.09.0-edge
Then, in the 'process' scope, you can declare a disk directive like so:
process this_process {
    disk "100GB"
}
This updates the size of the attached persistent disk (default: 500 GB).
There is still no functionality to edit the size of the boot disk (default: 10 GB).

I have been checking the Nextflow documentation, where it is specified:
The compute nodes local storage is the default assigned by the Compute Engine service for the chosen machine (instance) type. Currently it is not possible to specify a custom disk size for local storage.

Related

How to update disk in GCP using terraform?

Is it possible to create a terraform module that updates a specific resource which is created by another module?
Currently, I have two modules...
linux-system: which creates a Linux VM with boot disks
disk-updater: which I'm planning to use to update the disks created by the first module
The reason behind this is that I want to create a pipeline that will do disk operation tasks, like disk resizing, via Terraform.
data "google_compute_disk" "boot_disk" {
name = "linux-boot-disk"
zone = "europe-west2-b"
}
resource "google_compute_disk" "boot_disk" {
name = data.google_compute_disk.boot_disk.name
zone = data.google_compute_disk.boot_disk.zone
size = 25
}
I tried to use a data block to retrieve the existing disk details and pass them to the resource block, hoping to update the same disk, but it seems like it just tries to create a new disk with the same name, which is why I'm getting this error:
Error creating Disk: googleapi: Error 409: The resource ... already exists, alreadyExists
I think I'm doing it wrong. Can someone give me advice on how to proceed without using the first module I built? By the way, I'm a newbie when it comes to Terraform.
updates a specific resource which is created by another module?
No. You have to update the resource using its original definition.
The only way to update it from another module is to import it into that module, which is bad design: you would then have two definitions for the same resource, resulting in out-of-sync state files.

unable to create google compute disk using terraform

I want to create a Google Cloud compute disk, so in order to achieve that I wrote the code below:
resource "google_compute_disk" "default2" {
name = "test-disk"
type = "pd-balanced"
zone = "us-central1-a"
image = "centos-7-v20210609"
physical_block_size_bytes = 20480
}
When I run terraform apply, it shows the following error.
How can I fix this?
As described in the documentation:
physical_block_size_bytes - (Optional) Physical block size of the persistent disk, in bytes. If not present in a request, a default value is used. Currently supported sizes are 4096 and 16384, other sizes may be added in the future. If an unsupported value is requested, the error message will list the supported values for the caller's project.
In other words, 20480 is not a supported value; set physical_block_size_bytes to 4096 or 16384 (or omit the attribute to use the default) and the disk will be created.

What happens when I increase the size of a running volume of an EC2 instance

My question is quite simple:
What happens when I increase the size of a running volume of an EC2 instance?
1) Does all my data get wiped?
2) Will the space available to my instance also change to the new size?
Actually, my instance has 8 GB of storage and it is almost full. I want to increase the space so that I can save more files to my instance.
I have found this option in my console.
I have found the connected EC2 volume. Will directly modifying the volume size automatically be reflected in my instance's space after a reboot?
I know this is quite simple. I am just worried about my existing data.
Thank you for your help!
Assuming you have found the option in the console to modify the size of the volume, and the instance here is a Linux instance: what the other answer forgets to mention is an important point from the AWS documentation:
Modifying volume size has no practical effect until you also extend the volume's file system to make use of the new storage capacity. For more information, see Extending a Linux File System after Resizing the Volume.
For ext2, ext3, and ext4 file systems, this command is resize2fs. For XFS file systems, this command is xfs_growfs
Note:
If the volume you are extending has been partitioned, you need to increase the size of the partition before you can resize the file system
To check if your volume partition needs resizing:
Use the lsblk command to list the block devices attached to your instance. The example below shows three volumes: /dev/xvda, /dev/xvdb, and /dev/xvdf.
If the partition occupies all of the room on the device, it does not need resizing.
However, if /dev/xvdf1 is an 8-GiB partition on a 35-GiB device and there are no other partitions on the volume, then the partition must be resized in order to use the remaining space on the volume.
To extend a Linux file system:
Log in to the instance via SSH.
Use the df -h command to report the existing disk space usage on the file system.
Expand the modified partition using growpart (and note the unusual syntax of separating the device name from the partition number):
sudo growpart /dev/xvdf 1
Then use a file-system-specific command to resize each file system to the new volume capacity.
Finally, use the df -h command again to report the new file system disk space usage.
Note: it is recommended to take a snapshot of the EBS volume before making any changes.
Please refer to this AWS documentation.
Well, you can just modify the volume directly and this will not affect any files. It will take around a minute or so to apply the new size, or you may need to restart your instance.
To ensure data safety, you can create a snapshot of that volume, create a new volume of whatever size you want from that snapshot, and then delete the old volume which contains the old data.
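For reference, the same snapshot-then-resize approach can be scripted with boto3. This is only a minimal sketch; the region, volume ID, and target size below are placeholders rather than values from the question:

import boto3

# Placeholder region and volume ID; substitute your own.
ec2 = boto3.client('ec2', region_name='us-east-1')
volume_id = 'vol-0123456789abcdef0'

# Take a snapshot first so the existing data is safe if anything goes wrong.
ec2.create_snapshot(VolumeId=volume_id, Description='pre-resize safety snapshot')

# Grow the volume in place; the data on it is preserved.
ec2.modify_volume(VolumeId=volume_id, Size=20)  # new size in GiB

# Once the modification reaches the 'optimizing' or 'completed' state, extend the
# partition and file system inside the instance (growpart + resize2fs/xfs_growfs,
# as described in the answer above).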

Attaching the disk with same device-path or UUID

I had one disk attached to an instance, and I had taken a snapshot of it.
Now, after a few days, the disk went bad and I want to restore it.
What I have implemented is:
Store the metadata of the snapshot when it is taken
When a restore request comes, create a new disk from the snapshot
Detach the original disk (say, attached inside the host as /dev/sdz)
Attach the newly created disk to the same instance
This way, the user gets the view that the disk has been restored using the snapshot they had taken.
Now, the problem I'm seeing with this approach is:
as the original disk was attached as /dev/sdz, after the detach and attach the NEW disk should be seen as /dev/sdz ONLY, otherwise the application or upper layers may break.
So, is there any provision in the Google Cloud APIs to handle this?
PLEASE NOTE: I'm using google-api-python-client library & code is in Python.
I believe the name you are referring to is the "index" of the disk; I am not sure of that, however. If that is the case, you would just need to make sure the index of the new disk matches the index of the disk you remove.
That being said, there are better ways to do this if you can modify your fstab. For example, you can use the "deviceName" by mounting /dev/disk/by-id/whatever, in which case you would just need to make sure that the new disk has the same deviceName as the old disk.
Another option is to use the UUID of the filesystem to mount. Since these new disks are snapshots of the old disk, they will have the same UUID.
ls -l /dev/disk/by-uuid/
That should not change unless you reformat the partition entirely. In your fstab, instead of /dev/sdz1, you would use UUID=ef7481ea-a6f9-425b-940f-56e9c93492dd or whatever.
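Since the question mentions google-api-python-client, here is a minimal sketch of attaching the restored disk with an explicit deviceName; the project, zone, instance, and disk names are placeholders. With the GCE guest environment installed, the disk appears in the guest as /dev/disk/by-id/google-<deviceName>, so reusing the old disk's deviceName keeps the path you reference in fstab stable:

import google.auth
from googleapiclient import discovery

# Placeholder names; substitute your own project, zone, instance and disk.
project, zone, instance = 'my-project', 'europe-west2-b', 'my-instance'

credentials, _ = google.auth.default()
compute = discovery.build('compute', 'v1', credentials=credentials)

body = {
    # The new disk that was created from the snapshot.
    'source': f'projects/{project}/zones/{zone}/disks/restored-disk',
    # Reuse the old disk's deviceName so the guest sees the same
    # /dev/disk/by-id/google-<deviceName> symlink as before.
    'deviceName': 'data-disk',
}
operation = compute.instances().attachDisk(
    project=project, zone=zone, instance=instance, body=body).execute()
print(operation['name'])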

how to create a vm snapshot using pyvmomi

I have a task of implementing a basic backup and recovery system within a Django app. I have heard of pyvmomi, but have never used it before.
My specific tasks at hand are:
1) make a call to a vCenter, pass the vm name, and request to make a snapshot
2) obtain the file location of the snapshot
3) and upload the snapshot file into an OpenStack Swift object store
What is the actual syntax for creating a VM snapshot using pyvmomi?
Also, what is the syntax to request the actual snapshot file from vCenter?
https://github.com/rreubenur/vmware-pyvmomi-examples/blob/master/create_and_remove_snapshot.py
This should be helpful.
The snapshot task result itself contains a MoRef to the snapshot that was created, so you can get a reference to the created snapshot.
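For completeness, here is a minimal sketch of step 1 (creating the snapshot) with pyVmomi; the vCenter host, credentials, and VM name below are placeholders, and the unverified SSL context is only appropriate for a lab setup:

import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask
from pyVmomi import vim

# Placeholder connection details; substitute your own.
context = ssl._create_unverified_context()
si = SmartConnect(host='vcenter.example.com',
                  user='administrator@vsphere.local',
                  pwd='password',
                  sslContext=context)

# Find the VM by name using a container view.
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.VirtualMachine], True)
vm = next(v for v in view.view if v.name == 'my-vm')
view.DestroyView()

# Create the snapshot and wait for the task to finish.
task = vm.CreateSnapshot_Task(name='backup-snapshot',
                              description='taken via pyVmomi',
                              memory=False, quiesce=False)
WaitForTask(task)
snapshot = task.info.result  # MoRef of the newly created snapshot
print(snapshot)

Disconnect(si)

WaitForTask blocks until the task completes, and task.info.result is the MoRef mentioned in the answer above, which you can use to reference the new snapshot afterwards.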