What is a good API or way to create VMs programmatically? - virtualbox

I'm working on a project where I plan to distribute tasks to VMs that are dynamically managed (created, destroyed, paused, running processes from the host, etc.). I was wondering what would be a good approach or API to accomplish the management of the VMs. Below are some examples of what I'm considering, but I wanted to get some guidance on the best approach.
Vagrant-binding: it looked perfect, but it is out of date and no longer supported.
Oracle Tools: the Vagrant module looks interesting, but there isn't much documentation and I'm a bit confused about how to actually use it.
VirtualBox SDK: I'm a bit confused about how to set it up.
As you can see from the examples, I was thinking in Java, but I'm open to working in other languages. This project is academic in nature and I'm a student, so I know this might not be the most practical thing to do, but I wanted to see if it is possible and what the best way to accomplish it would be.

I recently created a program to manage the VMs at my work. I used Object Pascal (Delphi) to create the GUI and then did all the heavy lifting with VBoxManage commands that are run as processes through cmd but called from my program. Oracle has a nice list of available commands here
For example:
List running VMs:
VBoxManage.exe list runningvms
Import a VM:
VBoxManage.exe import (VM_To_Import_Location) --vsys 0 --vmname (Name_of_VM) --unit 11 --disk (Where_You_Want_VM_Stored)
Start a VM:
VBoxManage.exe startvm (Name_Of_VM) --type headless
Take a snapshot:
VBoxManage.exe snapshot (VM_Name) take (Snapshot_Name) --description "My Snapshot Description"
Restore a snapshot:
VBoxManage.exe snapshot (VM_Name) restore (Snapshot_Name)
There are many more for pretty much everything you would want to do with a VM.
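The same pattern works from almost any language that can spawn a process, including Java via ProcessBuilder. As a rough sketch in Python (the VM and snapshot names are placeholders, and VBoxManage must be on the PATH):

import subprocess

def vboxmanage(*args):
    """Run a VBoxManage subcommand and return its stdout."""
    result = subprocess.run(
        ["VBoxManage", *args],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

# List running VMs.
print(vboxmanage("list", "runningvms"))

# Start a VM headless, snapshot it, then power it off.
# "my-vm" and "clean-state" are placeholder names.
vboxmanage("startvm", "my-vm", "--type", "headless")
vboxmanage("snapshot", "my-vm", "take", "clean-state",
           "--description", "Known-good state")
vboxmanage("controlvm", "my-vm", "poweroff")

Since every call shells out to the same CLI the GUI tools use, whatever works in a terminal works here too.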

Related

Data Science/Engineering (Dev/Prod) Environment

I am going to create environments. For now I have a GCP machine and I run Jupyter there. Every time, I need to start it, and with three people it is hard to work in the same environment. I know there are Docker and JupyterHub, but I did not find a suitable roadmap to create a dev/prod environment.
My aim is to create dev and production environments. Everything should be on GCP.
Any suggested path?
Thanks
You can take a look at the best practices for enterprise organizations. In order to properly split resources it's often advised to use different projects. However, depending on the GCP product, you could also use versions, such as with App Engine (see this StackOverflow thread).

Create Google Cloud instance with custom FreeBSD ISO

I want to create a new Google Cloud instance with a HardenedBSD ISO. HardenedBSD is a FreeBSD-based OS. I checked the public documentation at https://cloud.google.com/compute/docs/images/import-existing-image but I couldn't see FreeBSD in the supported OS section.
Is there a way to do that?
FreeBSD works pretty well in GCE. The upload procedure for a custom image (or making your own) is quite easy, I would say even better than with AWS, so the chances are high that the same applies to HardenedBSD. The only "trick" is that after you have your raw disk you need to use GNU tar to package the image for upload:
gtar -cSzf freebsd.tar.gz disk.raw
To create the disk.raw I use this script: https://github.com/fabrik-red/images/blob/master/fabrik.sh (root on ZFS). To read more about the procedure, you can check https://fabrik.red/post/google/
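Once you have the tarball, the remaining steps are to push it to Cloud Storage and register it as a custom image. A minimal sketch (the bucket and image names below are placeholders; assumes the Cloud SDK is installed and authenticated):

import subprocess

BUCKET = "gs://my-bucket"   # placeholder bucket
IMAGE = "hardenedbsd-12"    # placeholder image name

# Upload the tarball to Cloud Storage.
subprocess.run(["gsutil", "cp", "freebsd.tar.gz", BUCKET + "/"], check=True)

# Register it as a custom Compute Engine image.
subprocess.run([
    "gcloud", "compute", "images", "create", IMAGE,
    "--source-uri", BUCKET + "/freebsd.tar.gz",
], check=True)

The same two commands can of course be run directly in a shell; the script form just makes the import repeatable.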
For testing or getting an idea, you could try FreeBSD 12.0
https://github.com/fabrik-red/images/releases/download/12.0/disk.tar.gz
I haven't tried working with any *BSD on Google Cloud Platform so take my words with a grain of salt.
You could try booting the instance in rescue mode (if supported) and use dd to write the HardenedBSD image to the main disk.
You could also take a look at Packer from HashiCorp, which is designed to build OS images for deployment in the cloud.
https://www.packer.io/docs/builders/googlecompute.html

Machine Learning (NLP) on AWS. Cloud9? SageMaker? EC2-AMI?

I have finally arrived in the cloud to take my NLP work to the next level, but I am a bit overwhelmed by all the possibilities I have. So I am coming to you for advice.
Currently I see three possibilities:
SageMaker
Pros: Jupyter Notebooks are great; it's quick and simple; saves a lot of time spent on managing everything; you can very easily get the model into production.
Cons: costs more; no version control.
Cloud9
EC2(-AMI)
Well, that's where I am for now. I really like SageMaker, although I don't like the lack of version control (at least I haven't found anything for it so far).
Cloud9 seems to be just an IDE on top of an EC2 instance. I haven't found any comparisons of Cloud9 vs. SageMaker for machine learning, maybe because Cloud9 is not advertised as an ML solution. But it seems to be an option.
What is your take on that question? What have I missed? What would you advise me to go for? What is your workflow and why?
I am looking for an easy work environment where I can quickly test my models. And it won't be only me working on it; it's a team effort.
Since you are working as a team, I would recommend using SageMaker with custom Docker images. That way you have complete freedom over your algorithm. The Docker images are stored in ECR, where you can upload many versions of the same image and tag them to keep track of the different versions (which you build from a Git repo).
SageMaker also passes the execution role into the Docker image, so you still have full access to other AWS resources (if the execution role has the right permissions).
https://github.com/awslabs/amazon-sagemaker-examples/blob/master/advanced_functionality/scikit_bring_your_own/scikit_bring_your_own.ipynb
In my opinion this is a good example to start with, because it shows how SageMaker interacts with your image.
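As a minimal sketch of what that interaction looks like from code (assuming SageMaker Python SDK v2; the image URI, role ARN, and S3 paths are placeholders):

from sagemaker.estimator import Estimator

# Placeholder ECR image and IAM role; replace with your own.
estimator = Estimator(
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/my-algo:v1",
    role="arn:aws:iam::123456789012:role/SageMakerExecutionRole",
    instance_count=1,
    instance_type="ml.m5.xlarge",
)

# Launch a training job against data in S3; SageMaker runs your container.
estimator.fit({"training": "s3://my-bucket/train"})

# Optionally stand up a real-time endpoint from the trained model.
predictor = estimator.deploy(initial_instance_count=1,
                             instance_type="ml.m5.xlarge")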
Some notes on other solutions:
The problem with every other solution you posted is that you build and execute on the same machine. Sure, you can do this, but keep in mind that GPU instances are expensive, and therefore you might only want to switch to the cloud when the code is ready to run.
Some other notes:
Jupyter Notebooks in general are not made for collaborative programming. I think they want to change this with JupyterLab, but it is still in development, and SageMaker only uses the classic notebook at the moment.
EC2 is cheaper than SageMaker, but you have to do more work, especially if you want to run your model as Docker images. Also, with SageMaker you can easily build an endpoint for model inference, which would be far more complex to set up with EC2.
Cloud9: I have never used this service, but at first glance it seems good to develop on; the question remains whether you want to do this on a GPU machine. Because it uses EC2 as the instance, you have the same advantages and disadvantages.
One thing I'd like to call out first is that the SageMaker notebook is not the only IDE environment from which you can interact with the other components of SageMaker, such as training and hosting. In fact, you can make API calls to SageMaker training/hosting from Cloud9, any IDE you've installed on EC2, or even your laptop, as long as you have the AWS SDK or the SageMaker Python SDK installed.
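For instance, a minimal sketch with boto3 (assuming AWS credentials are configured on whatever machine you use):

import boto3

# Works the same from a laptop, Cloud9, or an EC2 instance.
sm = boto3.client("sagemaker", region_name="us-east-1")

# List recent training jobs as a quick connectivity check.
for job in sm.list_training_jobs(MaxResults=10)["TrainingJobSummaries"]:
    print(job["TrainingJobName"], job["TrainingJobStatus"])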
Regarding the choice of IDE, it's really up to your particular needs. The SageMaker notebook is Jupyter-based (it now also supports JupyterLab in beta), ML-focused, and fully managed. Hundreds of Python packages that are commonly used in ML, as well as TensorFlow, Keras, MXNet, the SageMaker Python SDK, etc., are preinstalled and automatically maintained for you. It also integrates more closely with the other components of SageMaker, as one can imagine.
Cloud9 is a managed IDE too, but it is general-purpose rather than ML-specific. If you want to use Jupyter on Cloud9, it requires extra work on your side. It does not preinstall and maintain the versions of common ML/DL packages the way the SageMaker notebook does.

Remove Machines from VMware Horizon View with PowerCLI

We often add and remove machines from a manual Desktop Pool on our Horizon server. Registering the machine is done with an install script. When we're done with a machine we want to un-register it from the server. The only way we have found to do this is to log into the web portal and manually delete each one. This is cumbersome and time-consuming when we have large numbers of machines to un-register.
The machines that will need to be un-registered will have similar names. Is there a way to automate this with PowerCLI?
Of course you can:
$VM = Get-VM -Name nameOfVM   # look up the VM object by name
Remove-VM $VM                 # remove it from the inventory
A full example can be found here.
Not sure if you have had this question answered already, but there are a couple of ways you can do this, depending on the version of PowerCLI you have installed. The easiest way right now is to get the latest version of PowerCLI and make sure you install the View module along with it. From there, peruse the View API (https://code.vmware.com/web/dp/explorer-apis?id=58).
VMware also has a helper PSM1 script in their example gallery, which is available via GitHub: https://github.com/vmware/PowerCLI-Example-Scripts/tree/master/Modules/VMware.Hv.Helper

Best way to work with code on cloud?

I've lately started with Amazon Web Services and deployed a couple of Express applications on EC2, and I find it extremely tedious to edit code on the fly via SSH (SSH is a little unresponsive for coding purposes, and I'm not really comfortable with nano or vim for heavy editing).
I know I can edit it on my machine and scp it to EC2. I was wondering whether there's any way I can set up something like nodemon, but for the cloud, i.e., whenever I make a change in my local development copy, it deploys it to the cloud with scp? Kind of extending nodemon to the cloud.
Or is there any other way to work with that?
There are plug-ins and utilities that allow you to edit locally with Sublime Text (a good Australian editor; please register if you use it a lot!) and have the file automatically updated on a remote server.
See:
Stackoverflow: How to use Sublime over SSH
Editing files remotely via SSH on SublimeText 3
There are probably many similar utilities if you go looking for them.
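If you want the nodemon-like behaviour specifically, here is a minimal sketch in Python (assuming the watchdog package is installed and key-based SSH access to the instance; the host and paths are placeholders) that re-uploads a file over scp whenever it changes:

import subprocess
import time

from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer

REMOTE = "ec2-user@ec2-xx-xx-xx-xx.compute.amazonaws.com"  # placeholder host
LOCAL_DIR = "./app"
REMOTE_DIR = "/home/ec2-user/app"

class SyncHandler(FileSystemEventHandler):
    def on_modified(self, event):
        if event.is_directory:
            return
        # Push the changed file to the instance over scp.
        subprocess.run(["scp", event.src_path, REMOTE + ":" + REMOTE_DIR + "/"])

observer = Observer()
observer.schedule(SyncHandler(), LOCAL_DIR, recursive=True)
observer.start()
try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    observer.stop()
observer.join()

In practice, rsync over SSH handles whole directory trees and deletions more gracefully than per-file scp, so it may be the better tool once the project grows.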