We need to run tests on localized platforms, which puts a burden on our hardware resources: for just a few weeks at a time we might need plenty of servers and clients (Windows 2003 and Windows 2008, Vista, XP, Red Hat, etc.) in multiple languages.
We have typically relied on blades with Windows 2003 and VMWare, but these are sometimes outgrown by short-term peaks in demand, and the acquisition and deployment process is quite slow if the environment needs to grow.
Is Amazon EC2/S3 usable in the following scenario?
Install VMWare (Desktop because we need the ability to have snapshots) on an Amazon AMI.
Load existing VMWare images from S3 and run them on EC2 instances (perhaps 3 or 4 server or client OSes on each EC2 instance).
We are mostly interested in being able to start or stop VMWare snapshots very easily for relatively short tests. This is just for testing configurations, not a production environment that serves a real user workload. The only real user is the tester. These configurations might be required for just a few weeks and then turned off for a few months until the next release requires them again.
Is EC2/S3 a viable alternative for this type of testing purpose?
Do you actually need VMWare, or are you testing software that runs in the VMWare VMs? You might actually need VMWare if you are testing e.g. VMWare deployment policy, or are running code that tests the VMWare APIs. An example of the second case, where VMWare itself is not needed, would be testing an application server stack and currently using VMWare only to test on many platforms.
If you actually need VMWare, I do not believe that you can install VMWare in EC2. Someone will correct & enlighten me if this is not the case.
If you don't actually need VMWare, you have more options. If you can use one of the zillion public AMIs as a baseline, clone the appropriate AMIs and customize them to suit your needs (save the customized version as a private AMI for your team). Then, you can use as many of them as you like. Perhaps you already have a bunch of VMWare images that you need to use in your testing. In that case, you can migrate your VMWare image to an EC2 AMI as described in various places in Google, for example:
http://thewebfellas.com/blog/2008/9/1/creating-an-new-ec2-ami-from-within-vmware-or-from-vmdk-files
(Apologies to the SO censors for not pasting the entire article here. It's pretty long.) But that's a shortcut; you can always use the documented AMI creation process to convert any machine (VMWare or not) to an AMI. Perform that process for each VMWare VM you have, and you'll be all set. Just keep in mind that when you create an AMI, you have to upload it to S3, and that will take a lot of time for large VMs.
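To get a feel for how long the S3 upload might take, here is a rough back-of-the-envelope sketch in Python. The 20 GB image size and 10 Mbit/s uplink are made-up numbers for illustration, and the estimate ignores protocol overhead, compression and retries, so treat it as a lower bound at best.

```python
def estimated_upload_hours(image_size_gb: float, uplink_mbit_per_s: float) -> float:
    """Rough lower bound on upload time, ignoring overhead and retries."""
    if uplink_mbit_per_s <= 0:
        raise ValueError("uplink speed must be positive")
    size_megabits = image_size_gb * 1000 * 8  # decimal gigabytes -> megabits
    seconds = size_megabits / uplink_mbit_per_s
    return seconds / 3600


# Hypothetical example: one 20 GB VM image over a 10 Mbit/s uplink.
hours = estimated_upload_hours(20, 10)
print(f"{hours:.1f} hours")  # prints "4.4 hours"
```

Multiply by the number of VMs you plan to convert to see whether the one-time upload is acceptable for your release schedule.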
This is a bit of a shameless plug, but we have a new startup that may deal with exactly your problem. Amazon EC2 is excellent for on-demand computing, but is really targeted at just a single user launching production servers. We've extended EC2 to make it a Virtual Lab Management environment, with self-service, policies and VM sharing. You can check it out at http://LabSlice.com and see if it meets your needs.
Amazon provides a solution themselves now: http://aws.typepad.com/aws/2010/12/amazon-vm-import-bring-your-vmware-images-to-the-cloud.html
Related
We're testing a cloud version of our core product, which is fundamentally a Windows machine running a small VM instance of a heavily modified old OS (QNX6.5) as the core of a suite of our applications. Workspaces is perfect for our use case (it lets a lot of operators connect easily from a variety of clients). However, we're having real trouble creating this nested VM, which is fundamental. There doesn't seem to be a way to activate the hypervisor, but it is still present, which means that VMware won't run. Is there any way to get this up and running, or is Workspaces a non-starter? The required resources are very light. Any help will be greatly appreciated, thanks.
I have finally arrived in the cloud to put my NLP work to the next level, but I am a bit overwhelmed with all the possibilities I have. So I am coming to you for advice.
Currently I see three possibilities:
SageMaker
Jupyter Notebooks are great
It's quick and simple
saves a lot of time spent on managing everything, you can very easily get the model into production
costs more
no version control
Cloud9
EC2(-AMI)
Well, that's where I am for now. I really like SageMaker, although I don't like the lack of version control (at least I haven't found anything for now).
Cloud9 seems to be just an IDE on top of an EC2 instance. I haven't found any comparisons of Cloud9 vs SageMaker for machine learning, maybe because Cloud9 is not advertised as an ML solution. But it seems to be an option.
What is your take on that question? What have I missed? What would you advise me to go for? What is your workflow and why?
I am looking for an easy work environment where I can quickly test my models, exactly. And it won't be only me working on it, it's a team effort.
Since you are working as a team, I would recommend using SageMaker with custom Docker images. That way you have complete freedom over your algorithm. The Docker images are stored in ECR, where you can upload many versions of the same image (built from a git repo) and tag them to keep track of the different versions.
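A minimal sketch of that tagging idea in Python: tag each image with the git commit it was built from, so the tag in ECR always points back to a revision in the repo. The account ID, region, repository name and commit hash below are hypothetical placeholders; only the URI layout is the standard ECR one.

```python
import subprocess


def ecr_image_uri(account_id: str, region: str, repository: str, tag: str) -> str:
    """Build the full ECR image URI that a SageMaker job is pointed at."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


def current_git_commit() -> str:
    """Return the short hash of HEAD; must be run inside a git checkout."""
    output = subprocess.check_output(["git", "rev-parse", "--short", "HEAD"], text=True)
    return output.strip()


# Hypothetical values; in a real build the tag would come from current_git_commit().
uri = ecr_image_uri("123456789012", "eu-west-1", "nlp-training", "a1b2c3d")
print(uri)  # prints "123456789012.dkr.ecr.eu-west-1.amazonaws.com/nlp-training:a1b2c3d"
```

The same string is what you would pass to `docker tag` and `docker push` when publishing the image.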
SageMaker also makes the execution role available inside the Docker container, so you still have full access to other AWS resources (if the execution role has the right permissions).
https://github.com/awslabs/amazon-sagemaker-examples/blob/master/advanced_functionality/scikit_bring_your_own/scikit_bring_your_own.ipynb
In my opinion this is a good example to start because it shows how sagemaker is interacting with your image.
Some notes on other solutions:
The problem with every other solution you listed is that you build and execute on the same machine. Sure, you can do this, but keep in mind that GPU instances are expensive, so you might want to switch to the cloud only when the code is ready to run.
Some other notes
Jupyter Notebooks in general are not made for collaborative programming. I think they want to change this with JupyterLab, but that is still in development and SageMaker only uses the classic notebook at the moment.
EC2 is cheaper than SageMaker, but you have to do more work, especially if you want to run your model as Docker images. Also, with SageMaker you can easily build an endpoint for model inference, which would be more complex to set up with EC2.
Cloud9: I have never used this service, but at first glance it seems good to develop on. The question remains whether you want to do this on a GPU machine. Because it uses EC2 as the underlying instance, you have the same advantages and disadvantages.
One thing I'd like to call out first is SageMaker notebook is not the only IDE environment in which you can interact with other components of SageMaker such as training and hosting. In fact you can make API calls to SageMaker training/hosting through Cloud9 or any IDEs you've installed on EC2 or even your laptop, as long as you have AWS SDK or SageMaker Python SDK installed.
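As a rough sketch of what such an API call can look like from outside a SageMaker notebook, the Python below builds a request for the `create_training_job` operation of the AWS SDK (boto3). The job name, image URI, role ARN, bucket paths, region and instance type are all hypothetical placeholders, and the `submit` helper assumes boto3 is installed and AWS credentials are configured; it is defined but not called here.

```python
def training_job_request(job_name, image_uri, role_arn, input_s3_uri, output_s3_uri,
                         instance_type="ml.m5.large"):
    """Build the keyword arguments for SageMaker's create_training_job call."""
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,      # custom Docker image in ECR
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,                 # execution role assumed by the job
        "InputDataConfig": [{
            "ChannelName": "training",
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": input_s3_uri,
                "S3DataDistributionType": "FullyReplicated",
            }},
        }],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {
            "InstanceType": instance_type,
            "InstanceCount": 1,
            "VolumeSizeInGB": 10,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }


def submit(request, region="eu-west-1"):
    """Send the request; needs boto3 and valid AWS credentials (not run here)."""
    import boto3  # third-party dependency, imported lazily on purpose
    client = boto3.client("sagemaker", region_name=region)
    return client.create_training_job(**request)


# Hypothetical names throughout.
request = training_job_request(
    job_name="nlp-experiment-001",
    image_uri="123456789012.dkr.ecr.eu-west-1.amazonaws.com/nlp-training:a1b2c3d",
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    input_s3_uri="s3://example-bucket/train/",
    output_s3_uri="s3://example-bucket/output/",
)
print(request["ResourceConfig"]["InstanceType"])  # prints "ml.m5.large"
```

The same request can be sent from Cloud9, an EC2 box or a laptop; only the credentials and network access differ.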
Regarding the choice of IDE, it's really up to your particular needs. The SageMaker notebook is Jupyter based (it now also supports JupyterLab in beta), ML focused, and fully managed. Hundreds of Python packages that are commonly used in ML, as well as TensorFlow, Keras, MXNet, the SageMaker Python SDK, etc., are preinstalled and automatically maintained for you. It also integrates more closely with the other components of SageMaker, as one would expect.
Cloud9 is a managed IDE too, but it is general purpose rather than ML specific. If you want to use Jupyter on Cloud9, it requires extra work on your side. It does not preinstall and maintain common ML/DL packages the way the SageMaker notebook does.
For some time I have been managing EC2 (Windows boxes), RDS and S3 on AWS.
I know the manual steps needed to set up, let's say, a normal box (DB, storage and server). I heard about Vagrant, but everywhere I looked it mainly talks about Linux boxes on AWS.
My main question is: is Vagrant a tool that will save me time on deployment (Windows), or should I not use it at all in a Windows scenario?
Vagrant plays nicely with AWS (via vagrant-aws plugin).
Vagrant seems to play nicely with Windows as well since version 1.6 and the introduction of WinRM support (ssh alternative for Windows).
However, the AWS plugin doesn't support the WinRM communicator yet, so you'll need to pre-bake your Windows AMIs with an SSH service preinstalled if you want Vagrant to provision them.
Update (29/03/2016): Thanks to Rafael Goodman for pointing to vagrant-aws-winrm plugin as a possible workaround.
We recently bought a new rack and a set of servers for it. We want to be able to redeploy these boxes as build servers, QA regression test servers, lab re-correlation servers, simulation servers, etc.
We have played a bit with VMWare, VirtualPC, VirtualBox etc, creating a virtual build server, but we came across a lot of issues when we tried to copy it for others to use, having to reconfigure every new copy of the VM.
We are using Windows XP x86/x64 and Windows Vista x86/x64, so I had to rename the machine, join the domain etc for every new copy.
Ideally we just want to be able to add a new box, deploy a thin bootstrap OS (Linux is fine here) to get the VM up and running, then use it.
One other thing: we have limited to no budget, so free is best.
I would like to hear about others' experiences in doing the same thing.
FYI, I am not in systems IT; we are a group of software engineers trying to set this up.
Any links to good tutorials would be great.
The problem you're running into is the machine SID must be unique for each machine in a domain. Of course by copying an image you now break that unique constraint.
I'd suggest that you read the documentation for Sysprep in the reskit and Vista System Image Manager - your friends for XP/Win2k3 and Vista/Win2k8 respectively.
These tools enable you to "reseal" your configured instance of the OS so that the next time it boots it can prompt for information such as network configuration, machine names and admin user IDs, run scripts, etc.
Also be aware that the licencing restrictions for Windows desktop clients are generally per image - not per server.
Using these tools with HyperV we created complete preconfigured instances of Win2k3 & Win2k8 that boot to finish installing Sharepoint - going further we used the diffing disks to overlay Visual Studio so our devs could use the production images for their work. It has radically changed our development process.
At this point our entire public website is run on HyperV with 5 boxes running 15 images for a mix of soft and hard redundancy - they take several hundred million page views per week.
Another option for dealing with the SID problem is NewSID. This is a simpler tool than sysprep, in that all it does is rename the machine and reassign the SID; if you don't need all the other features of sysprep this is a much easier tool to use.
I develop exclusively on VMs. I currently run Boot Camp on a MacBook Pro and do all my development on a series of Virtual PC VMs for many different environments. This post by Andrew Connell literally changed the way I work.
I'm thinking about switching to Fusion and running everything in OS X, but I wasn't able to answer the following questions about VM Fusion/Workstation/Server. I need to know if the following features from Virtual PC/Server exist in their VMWare counterparts.
Differencing Disks (ability to create a Base VM and provision new VMs which just add deltas on top of the base [saves a ton of disk space, and makes it easy to spin up new VMs with a base set of functionality]). (Not available with Fusion, need Workstation [$189])
Undo disks (ability to rollback all changes to the VM within a session). (Available in both Workstation and Fusion [$189/$79.99 respectively])
Easily NAT out a different subnet for the VM to sit in. (In both Fusion/Workstation).
Share VMs between VM Player and VM Server. I'd like to build up a VM locally (on OS X/Fusion) and then move it to some server (Win2k3/Win2k8 and VM Server) and host it there but with VM Server. (In both Fusion/Workstation).
An equivalent to Hyper-V. (Both Fusion and Workstation take advantage of a type-2 hypervisor for x64 VMs; neither does for 32-bit VMs. VMWare claims they're no slower as a result, and some benchmarks corroborate this assertion.)
Ability to Share disks between multiple VMs. If I have a bunch of databases on a virtual disk and want them to appear on more than one VM I should be able to just attach them. (Available in both Fusion and Workstation)
(Nice to have) Support for multiple processors assigned to a VM (Available in both Fusion and Workstation).
Is there a VMWare guru out there who knows for sure that the above features are available on the other side?
Also the above has been free (as long as you have licenses for Windows machines), besides buying Fusion are there any other costs?
The end result of my research, thanks so much!
You can only create Linked Clones and Full Clones (which are close to differencing disks) in VMWare Workstation (not Fusion). Workstation also has better snapshot management, in addition to other features which are difficult to enumerate. That being said, Workstation is $189 (as opposed to $79) and not available on OS X. In addition, Fusion 1.1 (current release) has a bunch of display bugs on OS X 10.5 (it works well on 10.4). These will be remedied in Fusion 2.0, which is currently at RC1. I'll probably wait until v2.0 comes out and then use both Workstation and Fusion to provision and use these VMs on OS X.
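For anyone unfamiliar with the idea, a differencing disk or linked clone can be pictured as a copy-on-write overlay on a shared, read-only base. The Python below is only a conceptual model with made-up class names and block contents; it says nothing about how VMWare or Virtual PC actually lay the data out on disk.

```python
class BaseDisk:
    """Read-only base image, modelled as a mapping of block number -> bytes."""

    def __init__(self, blocks):
        self._blocks = dict(blocks)

    def read(self, block_no):
        # Blocks never written in the base read back as a zero byte.
        return self._blocks.get(block_no, b"\x00")


class DifferencingDisk:
    """Overlay that stores only the blocks changed relative to its parent."""

    def __init__(self, parent):
        self._parent = parent
        self._delta = {}

    def write(self, block_no, data):
        # Writes never touch the parent; they land in the delta.
        self._delta[block_no] = data

    def read(self, block_no):
        # Prefer the delta, otherwise fall through to the parent.
        if block_no in self._delta:
            return self._delta[block_no]
        return self._parent.read(block_no)

    def delta_size(self):
        return len(self._delta)


base = BaseDisk({0: b"boot", 1: b"os"})
dev_vm = DifferencingDisk(base)
test_vm = DifferencingDisk(base)

dev_vm.write(1, b"os+visual-studio")
print(dev_vm.read(1))       # prints b'os+visual-studio'
print(test_vm.read(1))      # prints b'os'
print(dev_vm.delta_size())  # prints 1
```

Each new VM costs only the blocks it changes, which is where the disk-space saving comes from.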
I've not used Fusion, just workstation and server
1) Yes, you can create a linked clone from current vm state, or from a saved state (snapshot) in VMware Workstation
2) Yes, revert to snapshots
3) There's a number of different network setups, NAT's one of them
4) VMware virtual machines created with VMware Fusion are fully compatible with VMware’s latest products.
5) ?
6) You can add pre-existing disks to other VMs
7) Yup, you can create multi-CPU VMs
Workstation costs, but VMWare Server is free
It doesn't have #1, at least.
VMWare server is free, but only allows for one snapshot, a serious deficiency. VMWare Workstation allows multiple snapshots and can perform most of the same functionality.
VMWare has a hypervisor which is equivalent to Hyper-V in Virtual PC.
You cannot share a VM that was created in Fusion with Windows VMWare Server (free version); you'll need the paid version to be able to share between both.
I'd also take a look at Sun's xVM VirtualBox for Mac. It runs Windows XP and Vista quite swiftly on my Mac.
1 and 2) VirtualBox has snapshots that branch off from the base VM like a tree. You can revert to any previous snapshots and name them.
3) It has NAT support and bridged networking like the VMWare and Microsoft products.
4) There is no server version of VirtualBox, but I know it shares an engine with Qemu, so it may be possible to host your VBox images on Qemu.
5) VirtualBox does have a hypervisor if your Mac has VT-x enabled.
6) Sure, you can add existing disks to other VMs. But you can't run the same disk in multiple VMs at once. (Isn't that a restriction of all virtualization hosts, though?)
7) No. VirtualBox will give each image one CPU and spread them out.
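The snapshot tree from points 1 and 2 can be pictured with a small Python model. The class and the snapshot names are invented for illustration and have nothing to do with VirtualBox's own API or file format.

```python
class Snapshot:
    """A named snapshot that knows its parent and its child branches."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def path_from_root(self):
        """Names of all snapshots from the base VM down to this one."""
        node, names = self, []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return list(reversed(names))


# Hypothetical snapshot names.
base = Snapshot("base-xp")
patched = Snapshot("sp3-installed", base)
ie7 = Snapshot("ie7", patched)

# Reverting to "sp3-installed" and snapshotting again starts a second branch.
ie8 = Snapshot("ie8", patched)

print(ie8.path_from_root())                # prints ['base-xp', 'sp3-installed', 'ie8']
print([c.name for c in patched.children])  # prints ['ie7', 'ie8']
```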