AWS AMI export to local Ubuntu - amazon-web-services

I have a running Ubuntu instance (launched from an AMI) hosting a CI/CD pipeline with Jenkins, Tomcat, Selenium, Sonar, Nagios, etc. I have used the Elastic IP directly in various configurations to get it working.
Is there an easy and direct way to export this image to my local Ubuntu machine, apart from the AWS import/export service?

For what it's worth I don't agree with the down votes. Cloud is fast becoming "infrastructure as code" and I place emphasis on the code aspect. As the differences between programming and infrastructure management begin to fade (see FaaS - functions as a service) there needs to be room for having that conversation in this forum IMHO.
That said, let's move on to your questions. AWS provides instance export functionality though this solution may not fit all use cases. Two links below illustrate how to perform that export. YMMV. Good luck!
http://docs.aws.amazon.com/cli/latest/reference/ec2/create-instance-export-task.html
http://docs.aws.amazon.com/vm-import/latest/userguide/vmexport.html
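As a rough illustration of the first link, the sketch below only assembles the argument list for `aws ec2 create-instance-export-task`; it does not call AWS. The instance ID, bucket name, prefix, and the helper name `build_export_command` are placeholders invented for this example, and the target environment and disk format would need to match whatever hypervisor you plan to use locally.

```python
import shlex


def build_export_command(instance_id, bucket, prefix="exports/",
                         target_env="vmware", disk_format="vmdk",
                         container_format="ova"):
    """Return the argv list for `aws ec2 create-instance-export-task`.

    Hypothetical helper: it only builds the command, it never runs it.
    """
    # The export destination is passed as one comma-separated shorthand value.
    export_spec = (
        f"DiskImageFormat={disk_format},"
        f"ContainerFormat={container_format},"
        f"S3Bucket={bucket},"
        f"S3Prefix={prefix}"
    )
    return [
        "aws", "ec2", "create-instance-export-task",
        "--instance-id", instance_id,
        "--target-environment", target_env,
        "--export-to-s3-task", export_spec,
    ]


if __name__ == "__main__":
    # Placeholder values; the command is printed, not executed.
    cmd = build_export_command("i-0123456789abcdef0", "my-export-bucket")
    print(shlex.join(cmd))
```

The exported image lands in the S3 bucket and still has to be downloaded and imported into a local hypervisor. Any configuration that references the Elastic IP would presumably need updating afterwards.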

Related

Data Science/Engineering (Dev/Prod) Environment

I am going to create environments. For now I have a GCP machine and I run Jupyter on it. I need to start it every time, and with 3 people it is hard to work in the same environment. I know there are Docker and JupyterHub, but I did not find a suitable roadmap for creating a dev/prod environment.
My aim is to create dev and production environments. Everything should be on GCP.
Any suggested path?
Thanks
You can take a look at the best practices for enterprise organizations. In order to properly split resources it's often advised to use different projects. However, depending on the GCP product, you could also use versions, such as with App Engine (see this StackOverflow thread).

GCP Deployment Manager - What Dev Ops Tool To Use In Conjunction?

I'm presently looking into GCP's Deployment Manager to deploy new projects, VMs and Cloud Storage buckets.
We need a web front end that authenticated users can connect to in order to deploy the required infrastructure, though I'm not sure what Dev Ops tools are recommended to work with this system. We have an instance of Jenkins and Octopus Deploy, though I see on Google's Configuration Management page (https://cloud.google.com/solutions/configuration-management) they suggest other tools like Ansible, Chef, Puppet and Saltstack.
I'm supposing that through one of these I can update something simple like a name variable in the config.yaml file and deploy a project.
Could I also ensure a chosen name for a project, VM or Cloud Storage bucket fits with a specific naming convention with one of these systems?
Which system do others use and why?
I use Deployment Manager, as all 3rd party tools are reliant upon the presence of GCP APIs, as well as trusting that those APIs are in line with the actual functionality of the underlying GCP tech.
GCP is decidedly behind the curve on API development, which means that even if you wanted to use TF or whatever, at some point you're going to be stuck inside the SDK, anyway. So that's why I went with Deployment Manager, as much as I wanted to have my whole infra/app deployment use other tools that I was more comfortable with.
To specifically answer your question about validating naming schema, what you would probably want to do is write a wrapper script that uses the gcloud deployment-manager subcommand. Do your validation in the wrapper script, then run the gcloud deployment-manager stuff.
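A minimal sketch of such a wrapper might look like the following. The naming convention (`<team>-<env>-<purpose>`), the function names, and the default `config.yaml` path are assumptions made up for illustration; only the `gcloud deployment-manager deployments create NAME --config FILE` form comes from the gcloud CLI itself. The script validates first and only then builds the command; passing the returned list to `subprocess.run(..., check=True)` would actually run it.

```python
import re
import shlex

# Hypothetical convention: <team>-<env>-<purpose>, lowercase, e.g. "data-dev-ingest".
NAME_PATTERN = re.compile(r"[a-z][a-z0-9]*-(dev|test|prod)-[a-z][a-z0-9]*")


def is_valid_name(name):
    """Return True if the name follows the (assumed) naming convention."""
    return NAME_PATTERN.fullmatch(name) is not None


def build_deploy_command(name, config_path="config.yaml"):
    """Validate the name, then return the gcloud argv for the deployment."""
    if not is_valid_name(name):
        raise ValueError(f"name {name!r} does not match the naming convention")
    return [
        "gcloud", "deployment-manager", "deployments", "create", name,
        "--config", config_path,
    ]


if __name__ == "__main__":
    # Dry run over sample names: print what would be executed or rejected.
    for candidate in ("data-dev-ingest", "My_Project"):
        try:
            print(shlex.join(build_deploy_command(candidate)))
        except ValueError as err:
            print(f"rejected: {err}")
```

Keeping validation in one function means a web front end or a Jenkins job could reuse the same check before anything reaches Deployment Manager.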
Word of warning about Deployment Manager: it makes troubleshooting very difficult. Very often it will obscure the error that can help you actually establish the root cause of a problem. I can't tell you how many times somebody in my office has shouted "UGGH! Shut UP with your Error 400!" I hope that Google takes note from my pointed survey feedback and refactors DM to pass the original error through.
Anyway, hope this helps. GCP has come a long way, but they've still got work to do.

Google Cloud Data Fusion + CI/CD for Data Pipelines

I am just getting started with both, GCP & Google Cloud Data Fusion. Just viewed the intro video. I see that pipelines can be exported. I was wondering how we might promote a pipeline from say, Dev to Prod env? My guess is that after some testing, the exported file is copied to the Prod branch on Git, from where we need to invoke the APIs to deploy it? Also, what about connection details, how do we avoid hard-coding the source/destination configurations & credentials?
Yes. You would have to export and re-import the pipeline.
About the first question, if you have different environments for development and production, you can export your pipeline and import it in the correct environment.
I didn't understand the second question very well. The official Data Fusion plugins have a standard way to provide your credentials. If you need a better answer, please explain your question in a little more detail.

Machine Learning (NLP) on AWS. Cloud9? SageMaker? EC2-AMI?

I have finally arrived in the cloud to put my NLP work to the next level, but I am a bit overwhelmed with all the possibilities I have. So I am coming to you for advice.
Currently I see three possibilities:
SageMaker
    Jupyter Notebooks are great
    It's quick and simple
    Saves a lot of time spent on managing everything; you can very easily get the model into production
    Costs more
    No version control
Cloud9
EC2(-AMI)
Well, that's where I am for now. I really like SageMaker, although I don't like the lack of version control (at least I haven't found anything for now).
Cloud9 seems to be just an IDE for an EC2 instance. I haven't found any comparisons of Cloud9 vs SageMaker for Machine Learning, maybe because Cloud9 is not advertised as an ML solution. But it seems to be an option.
What is your take on that question? What have I missed? What would you advise me to go for? What is your workflow and why?
I am looking for an easy work environment where I can quickly test my models. And it won't be only me working on it; it's a team effort.
Since you are working as a team, I would recommend using SageMaker with custom Docker images. That way you have complete freedom over your algorithm. The Docker images are stored in ECR, where you can upload many versions of the same image (built from a git repo) and tag them to keep track of the different versions.
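One way to tie image versions back to the git repo is to use a shortened commit hash as the ECR tag. The sketch below is only an illustration of that idea: the account ID, region, repository name, and the helper `ecr_image_uri` are placeholders, not part of any SageMaker or ECR API. It only formats the image URI in the usual `<account>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>` shape.

```python
import re


def ecr_image_uri(account_id, region, repository, git_commit, length=12):
    """Build an ECR image URI whose tag is a shortened git commit hash.

    Hypothetical helper for illustration; it does not talk to AWS.
    """
    # Accept abbreviated or full hex commit hashes only.
    if not re.fullmatch(r"[0-9a-f]{7,40}", git_commit):
        raise ValueError("git_commit must be a hex commit hash")
    tag = git_commit[:length]
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


if __name__ == "__main__":
    # Placeholder account, region, repository, and commit.
    print(ecr_image_uri("123456789012", "eu-west-1", "nlp-train",
                        "9fceb02d0ae598e95dc970b74767f19372d61af8"))
```

With a tag like this, everyone on the team can see which commit produced the image a given training job used.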
SageMaker also makes the execution role available inside the Docker container, so you still have full access to other AWS resources (if the execution role has the right permissions).
https://github.com/awslabs/amazon-sagemaker-examples/blob/master/advanced_functionality/scikit_bring_your_own/scikit_bring_your_own.ipynb
In my opinion this is a good example to start with, because it shows how SageMaker interacts with your image.
Some notes on other solutions:
The problem with every other solution you listed is that you would build and execute on the same machine. You can certainly do this, but keep in mind that GPU instances are expensive, so you might only want to switch to the cloud once the code is ready to run.
Some other notes:
Jupyter Notebooks in general are not made for collaborative programming. I think they want to change this with JupyterLab, but it is still in development and SageMaker only uses the notebook at the moment.
EC2 is cheaper than SageMaker, but you have to do more work, especially if you want to run your model as Docker images. Also, with SageMaker you can easily build an endpoint for model inference, which would be more complex to realize with EC2.
Cloud9: I have never used this service, but at first glance it seems good to develop on. The question remains whether you want to do this on a GPU machine. Because it runs on an EC2 instance, you have the same advantages and disadvantages.
One thing I'd like to call out first is that the SageMaker notebook is not the only IDE environment in which you can interact with other components of SageMaker such as training and hosting. In fact, you can make API calls to SageMaker training/hosting from Cloud9 or any IDE you've installed on EC2 or even your laptop, as long as you have the AWS SDK or the SageMaker Python SDK installed.
Regarding the choice of IDE, it's really up to your particular needs. The SageMaker notebook is Jupyter-based (it now also supports JupyterLab in beta), ML-focused, and fully managed. Hundreds of Python packages commonly used in ML, as well as TensorFlow, Keras, MXNet, the SageMaker Python SDK, etc., are preinstalled and automatically maintained for you. It also integrates more closely with other components of SageMaker, as one would expect.
Cloud9 is a managed IDE too, but it is general-purpose rather than ML-specific. If you want to use Jupyter on Cloud9, it requires extra work on your side. It does not preinstall and maintain common ML/DL packages the way the SageMaker notebook does.

implement a tool that uses the technologies - Jenkins, Docker, Docker Swarm, and AWS

I want to implement a tool that uses the technologies - Jenkins, Docker, Docker Swarm, and AWS - to achieve a deployment tool that our team of developers can use to manage instances and deploys.
Please recommend which technologies we (both administrators and developers) should be using, what needs to be built, and what sorts of machines we need.
Any help here would be much appreciated.
Your question is too generic for a specific answer, as there are different approaches to what you are trying to achieve. IMHO the best approach would be to talk with your existing dev team and administrators and come up with a solution for a container-based environment that all parties find easy to manage and maintain, rather than fixing on several specific technologies up front.
Each tool you have mentioned has different capabilities, and there are other tools providing the same features that might suit your situation better. (That's why a proper understanding between devs and admins of what you really want to achieve is necessary.)
Since you asked what kind of machines you need (I suppose this is in an AWS environment), try CoreOS on AWS instances. CoreOS (Container Linux) will be the best option to manage and run your container-based environments. [About CoreOS]
Jenkins can run in a Docker container and issue Docker commands to deploy new containers that reside in the same swarm as Jenkins. You also need to hook into a source code repository such as git. Jenkins Blue Ocean is something you could look at for pipelining your dev->build->test->deploy->maintain stages. Also, Travis CI, GitHub, JIRA, and Docker Hub are useful components for what you are trying to achieve.
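To make the "Jenkins issues Docker commands" idea a little more concrete, here is a small sketch of the commands a deploy stage might run, in order: build an image, push it to a registry, and roll the swarm service to the new image. The registry, image, tag, and service names are placeholders, and `deploy_steps` is a hypothetical helper; a real Jenkins job would more likely express this in its own pipeline syntax.

```python
import shlex


def deploy_steps(image, tag, service, context="."):
    """Return the docker commands for a build -> push -> swarm update cycle.

    Hypothetical helper: it only builds argv lists, it runs nothing.
    """
    ref = f"{image}:{tag}"
    return [
        ["docker", "build", "-t", ref, context],    # build the image
        ["docker", "push", ref],                    # publish to the registry
        ["docker", "service", "update", "--image", ref, service],  # roll out
    ]


if __name__ == "__main__":
    # Placeholder names; e.g. the tag could be the Jenkins build number.
    for step in deploy_steps("registry.example.com/app", "42", "app_web"):
        print(shlex.join(step))
```

Each list could be handed to `subprocess.run(step, check=True)` so that a failing build or push stops the rollout.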