StarCluster is a well-known toolkit for Amazon EC2. However, it was developed for Python 2, which is going out of date, and it is not compatible with Python 3.x.
So I'd like to know: is there any alternative to StarCluster? I have searched Stack Overflow but found no answers. Does anyone know?
I am looking forward to your advice! Thanks!
I'd recommend taking a look at cfncluster, which is a Python tool for managing HPC clusters on AWS. It shares much of the functionality of StarCluster, and in particular uses SGE out of the box. It seems to be actively developed, at least at the time of writing, and has (experimental) support for Python 3.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 4 years ago.
Edited question:
We all know that Google is one of the largest technology companies in the world and is a benchmark of innovation and cutting-edge technology for all of us and especially those who study technology.
I need to draw up a study plan, and since our time is scarce we need to come up with the best strategy, so I would like to be inspired by the ways and choices made by Google since it is a reference for all of us.
So I would like to know what technologies Google uses on its cloud platform in both infrastructure and OS as well as in the chosen development languages.
For example, it is known that Microsoft likes to use Hyper-V, C++ and C# in its cloud, while Amazon used Xen and has now migrated to KVM.
I do not want to "discover the secrets of Google", not least because the real secret lies in the talented team the company has. I just want a reference for what they use, because that may point to the best way forward.
Thanks to all who can help.
First, it does not make sense to ask about programming languages in a scope as broad as the Google Cloud Platform. Many languages are used for many different parts of the platform.
Besides that, the software behind the platform is proprietary and not publicly available. For that reason, we can't tell you more than the obvious - Google Cloud Platform uses JavaScript.
The question is to be closed, how sad, but the answer is probably the same as "What programming languages does Google use?". This would make the answer a combination of mostly C++, Java, Python, and Go on the server, and others on clients, e.g. JavaScript and Swift.
You can get some insight by looking into open source code written by Google Cloud folks:
Kubernetes, Istio, gVisor and Cloud SQL Proxy are written in Go.
GCE Guest Environment, gcloud and gsutil are written in Python; Spinnaker is written mostly in Java.
Apache Beam is written for Java, Python and Go, with Java being the primary language.
TensorFlow is written in C++ and Python.
I am attacking a combinatorial optimization problem similar to the multi-knapsack problem. The problem has an optimal solution, and I prefer not to settle for an approximate solution.
Are there any recommended tutorials regarding the quick prototyping and deployment of combinatorial optimization solutions (for senior software engineers that are also Big Data newbies)? I want to move quickly from prototype to deployment onto a docker cluster or AWS.
My background is in distributed systems (a focus on .NET, Java, Kafka, Docker containers, etc.), so I'm typically inclined to solve complex problems by parallel processing across a cluster of machines (via scaling on a Docker cluster or AWS). However, this particular problem can NOT be solved in a brute-force manner, as the problem space is too large (roughly 100^1000 combinations are possible).
I've limited experience with “big data”, but I'm studying up on knapsack solvers, genetic algorithms, reinforcement learning, and some other AI/ML approaches. Given my limited exposure in this area, how would one recommend I tackle a problem such as this?
I tend to favor the approach of leveraging existing frameworks/libraries as much as possible. Good idea? Or would one recommend using Accord.Net or ML.Net or some other library to build a custom model?
If existing frameworks are the way to go, any particular favorites? TensorFlow? Any thoughts on Google OR-Tools: https://developers.google.com/optimization/ Anything in the AWS space?
Any good tutorials, videos, or podcasts that can get me prototyping quickly? (keeping in mind my goal of deploying and validating the model on a docker cluster)
Thank you for any help and guidance!
The Cloud Balancing problem in OptaPlanner (open source, Java) is a multi-knapsack problem. There's a tutorial for it in the user guide. Many users run OptaPlanner implementations on Docker (normal OpenJDK 8 image) and AWS. Here's an Employee Rostering implementation that is deployed to OpenShift Dedicated (which generates a Docker image that it runs on AWS); it exposes a REST API (which is even documented with Swagger).
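For a feel of what an exact (non-approximate) search looks like before reaching for a full solver, here is a minimal sketch in plain Python. It is not OptaPlanner's API or algorithm; the function name `solve_multi_knapsack` and the sample data are made up for illustration, and it assumes integer weights and non-negative values. It is a depth-first branch and bound that prunes any branch which cannot beat the best solution found so far, so it only scales to small instances.

```python
from typing import List, Sequence, Tuple


def solve_multi_knapsack(
    weights: Sequence[int],
    values: Sequence[int],
    capacities: Sequence[int],
) -> Tuple[int, List[int]]:
    """Exact multiple-knapsack search (hypothetical helper, illustration only).

    Returns (best_value, assignment), where assignment[i] is the index of the
    knapsack holding item i, or -1 if the item is left out.
    Assumes all values are non-negative.
    """
    n = len(weights)

    # suffix_value[i] = total value of items i..n-1. Adding it to the value
    # collected so far gives an optimistic upper bound for the branch.
    suffix_value = [0] * (n + 1)
    for i in range(n - 1, -1, -1):
        suffix_value[i] = suffix_value[i + 1] + values[i]

    remaining = list(capacities)
    current = [-1] * n
    best_value = -1
    best_assignment = [-1] * n

    def search(i: int, value: int) -> None:
        nonlocal best_value, best_assignment
        # Prune: even taking every remaining item cannot beat the incumbent.
        if value + suffix_value[i] <= best_value:
            return
        if i == n:
            best_value = value
            best_assignment = current.copy()
            return
        tried = set()
        for b in range(len(remaining)):
            # Knapsacks with the same remaining capacity are interchangeable,
            # so only one of them needs to be tried for this item.
            if weights[i] <= remaining[b] and remaining[b] not in tried:
                tried.add(remaining[b])
                remaining[b] -= weights[i]
                current[i] = b
                search(i + 1, value + values[i])
                remaining[b] += weights[i]
                current[i] = -1
        # Also consider leaving item i out entirely.
        search(i + 1, value)

    search(0, 0)
    return best_value, best_assignment


if __name__ == "__main__":
    # Made-up instance: four items, two knapsacks.
    best, assignment = solve_multi_knapsack(
        weights=[4, 3, 2, 5],
        values=[10, 7, 4, 12],
        capacities=[5, 6],
    )
    print(best, assignment)
```

The pruning bound here is deliberately crude. Dedicated solvers use much stronger bounds and heuristics, which is why they are worth adopting once the problem grows beyond toy size.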
Thanks to all for your insight above. I’m having a look at OptaPlanner and Google OR-Tools, as well as a few other solvers.
To follow up on this question: if I were to relax the constraint that I want the optimal answer, and allow for “approximate” solutions, would this change your guidance or recommended tool set (libraries/frameworks) in any way?
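To make the trade-off behind that follow-up concrete, here is a minimal sketch of an approximate approach in plain Python. It is not taken from any of the libraries mentioned; the function name `greedy_multi_knapsack` and the sample data are made up, and it assumes strictly positive weights. It sorts items by value per unit of weight and drops each one into the first knapsack with room, which is fast but carries no optimality guarantee.

```python
from typing import List, Sequence, Tuple


def greedy_multi_knapsack(
    weights: Sequence[int],
    values: Sequence[int],
    capacities: Sequence[int],
) -> Tuple[int, List[int]]:
    """Greedy first-fit heuristic (hypothetical helper, illustration only).

    Returns (total_value, assignment), where assignment[i] is the index of
    the knapsack holding item i, or -1 if the item did not fit anywhere.
    Assumes every weight is strictly positive.
    """
    remaining = list(capacities)
    assignment = [-1] * len(weights)
    total = 0

    # Most valuable per unit of weight first.
    order = sorted(
        range(len(weights)),
        key=lambda i: values[i] / weights[i],
        reverse=True,
    )
    for i in order:
        for b, capacity_left in enumerate(remaining):
            if weights[i] <= capacity_left:
                remaining[b] -= weights[i]
                assignment[i] = b
                total += values[i]
                break  # item placed; move on to the next item
    return total, assignment


if __name__ == "__main__":
    # Made-up instance: four items, two knapsacks.
    total, assignment = greedy_multi_knapsack(
        weights=[4, 3, 2, 5],
        values=[10, 7, 4, 12],
        capacities=[5, 6],
    )
    print(total, assignment)
```

On this made-up instance the heuristic reaches a total of 22, while the best achievable total is 26 (items 0 and 2 in the knapsack of capacity 6, item 3 in the knapsack of capacity 5). That gap is the price of relaxing the optimality requirement in exchange for a single pass over the items.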
I am working on using Cloudify with vCloud. From the information I have gathered so far, this integration should be possible, as jclouds has interfaces for vCloud too. I have yet to explore jclouds itself. I would like to know whether any effort has been put into integrating Cloudify with vCloud. Any pointers to documentation or earlier questions/blogs that I couldn't find myself would be valuable.
Thanks
Raghavendra
You can find a detailed discussion about this here
I'm not the only one with this question, but haven't found a lot of information in my research so far, so help me out.
We are a small IT crowd in an organization. We're looking to build a small, private service that would emulate a Heroku/GAE workflow. The basics of this: deploy an app as a git repository, and have it scale in a 'cloud' environment. Basically, a platform as a service (PaaS).
Pretend we are amateur PMs, programmers, and sysadmins tasked with this. What would you recommend? We know generally what is needed: some sort of routing, database, caching, authentication, etc. What other tools do we need?
We would prefer tools along a ruby/python/haskell/erlang dimension, on a linux/bsd stack, with postgres databases(couchdb or cassandra in the future). We are not touching anything in the ms/.net area, nothing on the JVM (We've looked at Steamcannon, but no; Scala and Clojure tools are not entirely out of the question). We have a basic grasp of bootstrapping a cloud (e.g. Eucalyptus) to build on. We have an understanding of the basics in server admin, and the physical infrastructure limitations aren't a factor right now.
We're not looking into why gaerokuyardspace is the best choice, a list of such services, why we should ditch our plans for one of these services, or an argument against this plan. For this situation the decision has been made that the cost to build privately is more attractive than the cost of deploying elsewhere. We already know why and how for these services. We're looking to emulate and build upon these for private needs.
A short list of tools to be expanded:
Beehive
Steamcannon
Gitosis/Gitolite
?
Basically, I'd like to generate a list of tools for building heroku/gae like service on a small, private, definitely experimental/toy level.
I don't know that it will meet all of your stated needs today, but you should take a look at Cloud Foundry from VMware. You can check the FAQ for the commercial project or look in to the Open Source version that you can host and manage yourself.
Some combination of Cloud Foundry (above), gitolite, and fabric will probably do well for you. Any such solution will take some time to get right.
(Disclaimer: I'm a lead developer on the AppScale project)
AppScale is pretty much right up your alley, especially if you're looking to run Google App Engine apps in your own private cloud. It's open source, so grab it and extend it if there are other types of apps you want to support (and definitely commit it back to us if you do).
I'm trying to pick version control, continuous integration, and a host for a smallish Flex + Ruby or Django project. Questions:
version control: I've used SVN and CVS in the past. I hear great things about git. Not sure what to pick.
continuous integration: I've heard good things about Hudson and CruiseControl. Not sure what to pick.
hosting: is my own server the only way to go? Are there decent cloud options that are not too expensive? Or should I look for some free hosting service?
thank you for your help!
f
Use Git.
Git is a great tool that allows a very flexible workflow. It has lots of benefits over subversion/cvs, the biggest of which is the ability to branch and merge seamlessly. This can't be overstated. The merge-hell that ensues when attempting to use svn's branching and merging is a thing of the past. For a better case on why to use git, check out http://whygitisbetterthanx.com/
Use Hudson.
Hudson is easily the best CI tool in the game. The reason Hudson is the best is that it's easy to configure (for one or multiple nodes), it has a ton of plugins, and it handles the 90% use case extremely well. You are in the 90% use case. People like Mozilla aren't. Check out C. Titus Brown's talk at PyCon for more info: http://pycon.blip.tv/file/3259794/ (If you decide that Hudson isn't what you should use, check out buildbot.)
Use Webfaction (or Rackspace Cloud).
Webfaction is a great starting ground. If your needs are low, check them out. Beyond that, I'd suggest taking a hard look at Rackspace Cloud (RSC). RSC makes scaling out much easier, and their pricing model is very palatable for things that aren't bandwidth intensive (i.e., most things that don't require tons of uploads/downloads). It starts at $10/mo. Their management console is good (save the DNS administration interface, but even that is more than bearable). If your needs expand beyond RSC (doubtful), you would do well to check out Amazon's EC2. Companies like RightScale can help when it comes to scaling out.