As we move our workloads to AWS I am looking for an ETL tool which is widely used and has the appropriate connectors - Apache Camel appears to fit the bill. However, I am struggling to find information on how Camel can be deployed in AWS - the obvious one is on an EC2 instance, but we would like to avoid the setup and maintenance required by Virtual Machines. I don't see anyone offering it as a managed service, so the option I'd like to explore is running it as a container in ECS, as we will have numerous other containers running.
Containers don't seem to be an installation option on the Apache Camel website - perhaps it is just too limiting for a tool whose purpose is to connect to everything else?
Is it acceptable and practical to run Camel in a container, and where could I find more information about it?
Apache Camel appears to fit the bill.
Indeed the Apache Camel is a great integration framework. And that's the point. It is a framework, not a product. So there are multiple ways to run the Camel flows. As a web app, as a standalone app, as a part if our own code. Camel itself is pretty agnostic to the way you run the flows and that's why you don't have very specific way enforced in the web site.
If you want an out of box product, which can generate containerized deployments with Apache Camel, you could have a look at Apache ServiceMix, Apache Karaf or it's supported RedHat Fuse.
Is it acceptable and practical to run Camel in a container, and where could I find more information about it?
It is perfectly fine.
Question: Can you (are you able) to create a docker container with your (any other) application?. Based on the question this skill is lacking and I really suggest to learn it.
You may check folowing post https://medium.com/#wkrzywiec/how-to-put-your-java-application-into-docker-container-5e0a02acdd6b
FROM java:8-jdk-alpine
COPY ./target/myapp.jar /usr/app/myapp.jar
WORKDIR /usr/app
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "myapp.jar"]
Let's assume you can run your ETL tasks as a standalone application, then just run it in the container as any other standalone application.
it we would like to avoid the setup and maintenance required by Virtual Machines
Question: how do you distribute your camel tasks? I mean - what is result of your build? A war file? A standalone app?
To build a web app you could see https://www.baeldung.com/spring-apache-camel-tutorial
The most convenient way to deploy a war file in AWS is AWS Beanstalk service.
If you build a standalone application (or use servicemix) and you can build a container, then indeed ECS or Fargate seem as natural options.
Related
AWS App2Container (A2C) is a recently launched feature by AWS. It is a CLI tool to help you lift and shift applications that run in your on-premises data centres or on virtual machines so that they run in containers that are managed by Amazon ECS or Amazon EKS. Since there is not much info on the internet about this, apart from the AWS document so does anybody knows how to implement it and what are the dependencies required for it?
This is a fairly new service so most people will be relying on reading at the moment.
For JAVA applications the setup instructions on Linux indicate that you just download the app2container package and then run the following over your code
sudo app2container containerize --application-id java-app-id
For .NET applications the setup instructions on Windows indicate that it is exactly the same process, run the install file and that will have all dependencies.
The best way to try and implement this will be by following these tutorials step by step. Also remember at this time it is JAVA or .NET only.
This is my first time deploying an application. I have some idea about it but I am not sure if it is correct. How do I go about deploying a play application on google cloud?
1) I have created a package using dist command. I have the zip file now on my local pc. https://www.playframework.com/documentation/2.5.x/Deploying
2) Do I first need to create a compute resource on gcp? What configuration shall I use for the vm? My app is still in test phase so there are no external users at the moment
3) I suppose play uses netty web server. So do I need to install netty on the compute resource? I have looked online a bit but can't find a resource on how to deploy an application on netty.
deploy an application on netty
Netty is not a web server/application server, but an IO framework which can be used to build web servers or any high-performant IO applications.
If you really want to use netty, you need to write an HTTP server yourself, or just use an HTTP framework built on netty.
If you want to build an application using netty, have a look at the examples on https://github.com/netty/netty/tree/4.1/example/src/main/java/io/netty/example/
Deploying a container to the Cloud using Google Cloud Platform and Kubernetes Engine
Kubernetes is a way of orchestrating containers in the Cloud, enabling you to do things like auto-scale, fast deploys and manage running versions of containers. You simply create a container and upload it to a container repository. In this example I used Google’s Container Registry, it’s really simple to use and works brilliantly with their Kubernetes implementation.
follow this tutorial might help you with this
https://medium.com/beyond/deploying-a-container-to-the-cloud-using-google-cloud-platform-and-kubernetes-engine-10d8ee3aba86
I want to implement a tool that uses the technologies - Jenkins, Docker, Docker Swarm, and AWS - to achieve a deployment tool that our team of developers can use to manage instances and deploys.
Please recommend what technologies should we (both administrators and developers) be using, what needs to be built and what sorts of machines must be having.
Any help here would be much appreciated.
Your question is too generic to provide a specific answer, as there are different approaches to implement what you are trying to achieve. IMHO the best approach would be to talk with your existing dev team & administrators and come up with a solution which all parties find easy to manage and maintain container based environment rather than specifying several specific technologies.
Each tool you have mentioned has different capabilities and also there are other tools that provide the same features which would be more ideal for your situation. (Thats why proper understanding between Devs and admins are necessary on what you really want to achieve.) .
Since you have asked about what kind of machines you must be having (I suppose this is on AWS env) try Core OS on AWS instances. CoreOS (Container Linux) will be the best option to manage and run your container based environments. [About CoreOS]
Jenkins can run in a docker container and issue docker commands to deploy new docker containers that reside in the same swarm as jenkins. You also need to hook into a software repo like git. Jenkins Blue Ocean is something you could look at for pipe-lining your dev->build->test->deploy->maintain pipes. Also, Travis-ci, github, JIRA, and Dockerhub are useful components to what you are trying to achieve.
Please let me know if this question is more appropriate for a different channel but I was wondering what the recommended tools are for being able to install, configure and deploy hadoop/spark across a large number of remote servers. I'm already familiar with how to setup all of the software but I'm trying to determine what I should start using that would allow me to easily deploy across a large number of servers. I've started to look into configuration management tools (ie. chef, puppet, ansible) but was wondering what the best and most user friendly option to start off with is out there. I also do not want to use spark-ec2. Should I be creating homegrown scripts to loop through a hosts file containing IP? Should I use pssh? pscp? etc. I want to just be able to ssh with as many servers as needed and install all of the software.
If you have some experience in scripting language then you can go for chef. The recipes are already available for deployment and configuration of cluster and it's very easy to start with.
And if wants to do it by your own then you can use sshxcute java API which runs the script on remote server. You can build up the commands there and pass them to sshxcute API to deploy the cluster.
Check out Apache Ambari. Its a great tool for central management of configs, adding new nodes, monitoring the cluster, etc. This would be your best bet.
I have never actually worked for a company which is deploying a Django App (with a large user base), and am curious about what is the best way to do this.
Right now I am hosting a Django App on EC2. The code for the app is sitting in my github account. I have nginx serving static content, and behind it a single apache server running django + mod_wsgi.
I am trying to figure out what the best practice is for "continuous deployment". Right now, after I have added additional functionality I do the following on EC2:
1) git reset HEAD --hard
2) git pull
3) restart apache
4) restart nginx
I have custom logic in my settings.py file so that if I am running on EC2, debug gets set to False, and my databases switch from sqlite3 (development) to mysql (production).
This seems to be working for me now, but I am wondering what is wrong with this process and how could I improve it.
Thanks
I've worked with systems that use Fabric to deploy to multiple servers
I'm the former lead developer at The Texas Tribune, which is 100% Django. We deployed to EC2 using RightScale. I didn't personally write the deployment scripts, but it allowed us to get new instances into the rotation very, very quickly and scales on-demand. it's not cheap, but was worth every penny in my opinion.
I'd agree with John and say that Fabric is the tool to do this sort of thing comfortably. You probably don't want to configure git to automatically deploy with a post commit hook, but you might want to configure a fabric command to run your test suite locally, and then push to production if it passes.
Many people run separate dev and production settings files, rather than having custom logic in there to detect if it's in a production environment. You can inherit from a unified file, and then override the bits that are different between dev and production. Then you start the server using the production file, rather than relying on a single unified settings.py.
If you're just using apache to host the application, you might benefit from a lighter weight solution. Using fastcgi with nginx would allow you to do away with the overhead of apache entirely. There's also a wsgi module for nginx, but I don't know if it's production ready at this point.
There is one more good way how to manage this. For ubuntu/debian amis it is good to manager versions and do deployemnts by packeging your application into .deb