So I was able to connect to a GKE cluster from a java project and run a job using this:
https://github.com/fabric8io/kubernetes-client/blob/master/kubernetes-examples/src/main/java/io/fabric8/kubernetes/examples/JobExample.java
All I needed was to configure my machine's local kubectl to point to the GKE cluster.
Now I want to ask if it is possible to trigger a job inside a GKE cluster from a Google Cloud Function, which means using the same library https://github.com/fabric8io/kubernetes-client
but from a serverless environment. I have tried to run it but obviously kubectl is not installed in the machine where the cloud function runs. I have seen something like this working using a lambda function from AWS that uses https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/ecs/AmazonECS.html
to run jobs in an ECS cluster. We're basically trying to migrate from that to GCP, so we're open to any suggestion regarding triggering the jobs in the cluster from some kind of code hosted in GCP in case the cloud function can't do it.
Yes you can trigger your GKE method from a serverless environment. But you have to be aware to some edge cases.
With Cloud Functions, you don't manage the runtime environment. Therefore, you can't control what is installed on the container. And you can't control, or install Kubectl on it.
You have 2 solutions:
Kubernetes expose API. From your Cloud Functions you can simply call an API. Kubectl is an APIf call wrapper, nothing more! Of course, it requires more efforts, but if you want to stay on Cloud Functions you don't have any other choice
You can switch to Cloud Run. With Cloud Run, you can define your own container and therefore install Kubectl in it, in addition of your webserver (you have to wrap your Function in a webserver with Cloud Run, but it's pretty easy ;) )
Whatever the solution chosen, you also have to be aware of the GKE control plane exposition. If it is publicly exposed (generally not recommended), there is no issue. But, if you have a private GKE cluster, your control plane is only accessible from the internal network. To solve that with a serverless product, you have to create a serverless VPC connector to bridge the serverless Google-managed VPC with your GKE control-plane VPC.
Related
CONTEXT:
We have a platform where users can create their own projects - multiple projects per user. We need to provide them with a browser-based IDE to edit those projects.
We decided to go with coder-server. For this we need to configure an auto-scalable cluster on AWS. When the user clicks "Edit Project" we will bring up a new container each time.
https://hub.docker.com/r/codercom/code-server
QUESTION:
How to pass parameters from the url query (my-site.com/edit?project=1234) into a startup script to pre-configure the workspace in a docker container when it starts?
Let's say the stack is AWS + ECS + Fargate. We could use kubernetes instead of ECS if it helps.
I don't have any experience in cluster configuration. Will appreciate any help or at least a direction where to dig further.
The above can be achieved using multiple ways in AWS ECS. The basic requirements for such systems are to launch and terminate containers on the fly while persisting the changes in the files. (I will focus on launching the containers)
Using AWS SDK's:
The task can be easily achieved using AWS SDKs, Using a base task definition. AWS SDK allows starting tasks with overrides on the base task definition.
E.G. If task definition has a memory of 2GB then the SDK can override the memory to parameterised value while launching a task from task def.
Refer to the boto3 (AWS SDK for Python) docs.
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/ecs.html#ECS.Client.run_task
Overall Solution
Now that we know how to run custom tasks with python SDK (on demand). The overall flow for your application is your API calling AWS lambda function whit parameters to spin up and wait to keep checking task status and update and rout traffic to it once the status is healthy.
API calls AWS lambda functions with parameters
Lambda function using AWS SDK create a new task with overrides from base task definition. (assuming the base task definition already exists)
Keep checking the status of the new task in the same function call and set a flag in your database for your front end to be able to react to it.
Once the status is healthy you can add a rule in the application load balancer using AWS SDK to route traffic to the IP without exposing the IP address to the end client (AWS application load balancer can get expensive, I'll advise using Nginx or HAProxy on ec2 to manage dynamic routing)
Note:
Ensure your Image is lightweight, and the startup times are less than 15 mins as lambda cannot execute beyond that. If that's the case create a microservice for launching ad-hoc containers and hosting them on EC2
Using Terraform:
If you looking for infrastructure provisioning terraform is the way to go. It has a learning curve so recommend it as a secondary option.
Terraform is popular for parametrising using variables and it can be plugged in easily as a backend for an API. The flow of your application still remains the same from step 1, but instead of AWS Lambda API will be calling your ad-hoc container microservice, which in turn calls terraform script and passing variables to it.
Refer to the Terrafrom docs for AWS
https://registry.terraform.io/providers/hashicorp/aws/latest
Can we run an application that is configured to run on multi-node AWS EC2 K8s cluster using kops (project link) into local Kubernetes cluster (setup using kubeadm)?
My thinking is that if the application runs in k8s cluster based on AWS EC2 instances, it should also run in local k8s cluster as well. I am trying it locally for testing purposes.
Heres what I have tried so far but it is not working.
First I set up my local 2-node cluster using kubeadm
Then I modified the installation script of the project (link given above) by removing all the references to EC2 (as I am using local machines) and kops (particularly in their create_cluster.py script) state.
I have modified their application yaml files (app requirements) to meet my localsetup (2-node)
Unfortunately, although most of the application pods are created and in running state, some other application pods are unable to create and therefore, I am not being able to run the whole application on my local cluster.
I appreciate your help.
It is the beauty of Docker and Kubernetes. It helps to keep your development environment to match production. For simple applications, written without custom resources, you can deploy the same workload to any cluster running on any cloud provider.
However, the ability to deploy the same workload to different clusters depends on some factors, like,
How you manage authorization and authentication in your cluster? for example, IAM, IRSA..
Are you using any cloud native custom resources - ex, AWS ALBs used as LoadBalancer Services
Are you using any cloud native storage - ex, your pods rely on EFS/EBS volumes
Is your application cloud agonistic - ex using native technologies like Neptune
Can you mock cloud technologies in your local - ex. Using local stack to mock Kinesis, Dynamo
How you resolve DNS routes - ex, Say you are using RDS n AWS. You can access it using a route53 entry. In local you might be running a mysql instance and you need a DNS mechanism to discover that instance.
I did a google search and looked at the documentation of kOps. I could not find any info about how to deploy to local, and it only supports public cloud providers.
IMO, you need to figure out a way to set up your local EKS cluster, and if there are any usage of cloud native technologies, you need to figure out an alternative way about doing the same in your local.
The true answer, as Rajan Panneer Selvam said in his response, is that it depends, but I'd like to expand somewhat on his answer by saying that your application should run on any K8S cluster given that it provides the services that the application consumes. What you're doing is considered good practice to ensure that your application is portable, which is always a factor in non-trivial applications where simply upgrading a downstream service could be considered a change of environment/platform requiring portability (platform-independence).
To help you achieve this, you should be developing a 12-Factor Application (12-FA) or one of its more up-to-date derivatives (12-FA is getting a little dated now and many variations have been suggested, but mostly they're all good).
For example, if your application uses a database then it should use DB independent SQL or no-sql so that you can switch it out. In production, you may run on Oracle, but in your local environment you may use MySQL: your application should not care. The credentials and connection string should be passed to the application via the usual K8S techniques of secrets and config-maps to help you achieve this. And all logging should be sent to stdout (and stderr) so that you can use a log-shipping agent to send the logs somewhere more useful than a local filesystem.
If you run your app locally then you have to provide a surrogate for every 'platform' service that is provided in production, and this may mean switching out major components of what you consider to be your application but this is ok, it is meant to happen. You provide a platform that provides services to your application-layer. Switching from EC2 to local may mean reconfiguring the ingress controller to work without the ELB, or it may mean configuring kubernetes secrets to use local-storage for dev creds rather than AWS KMS. It may mean reconfiguring your persistent volume classes to use local storage rather than EBS. All of this is expected and right.
What you should not have to do is start editing microservices to work in the new environment. If you find yourself doing that then the application has made a factoring and layering error. Platform services should be provided to a set of microservices that use them, the microservices should not be aware of the implementation details of these services.
Of course, it is possible that you have some non-portable code in your system, for example, you may be using some Oracle-specific PL/SQL that can't be run elsewhere. This code should be extracted to config files and equivalents provided for each database you wish to run on. This isn't always possible, in which case you should abstract as much as possible into isolated services and you'll have to reimplement only those services on each new platform, which could still be time-consuming, but ultimately worth the effort for most non-trival systems.
Currently we have Jenkins that is running on-premise(VMware), planning to move into the cloud(aws). What would be the best approach to install Jenkins whether on ec2 or ECS?
Best way would be running on EC2. Make sure you have granular control over your instance Security Group and Network ACL's. I would recommend using terraform to build your environment as you can write code and also version control it. https://www.terraform.io/downloads.html
Have you previously containerized your Jenkins? On VMWare itself? If not, and if you are not having experience with containers, go for EC2. It will be as easy as running on any other VM. For reproducing the infrastructure, use Terraform or CloudFormartion.
I would recommend dockerize your on-premise Jenkins first. See how much efforts are required in implementation and administrating/scaling it. Then go for ECS.
Else, shift to EC2 and see how much admin overhead + costs you are billed. Then if required, go for ECS.
Another point you have to consider is how your Jenkins is architected. Are you using master-slave? Are you running builds contentiously so that VMs are never idle? Do you want easy scaling such that build environment is created and destroyed per build execution?
If you have no experience with running containers then create it on EC2. Before running on ECS make sure you really understand containers and container orchestration.
Just want to complement the other answers by providing link to official AWS white paper:
Jenkins on AWS
It might be of special interest as it discusses both options in detail: EC2 and ECS:
In this section we discuss two approaches to deploying Jenkinson AWS. First, you could use the traditional deployment on top of Amazon Elastic Compute Cloud (Amazon EC2). Second, you could use the containerized deployment that leverages Amazon EC2 Container Service (Amazon ECS).Both approaches are production-ready for an enterprise environment.
There is also AWS sample solution for Jenkins on AWS for ECS:
https://github.com/aws-samples/jenkins-on-aws:
This project will build and deploy an immutable, fault tolerant, and cost effective Jenkins environment in AWS using ECS. All Jenkins images are managed within the repository (pulled from upstream) and fully configurable as code. Plugin installation is automated, including versioning, as well as configured through the Configuration as Code plugin.
I am trying to access Kafka and 3rd-party services (e.g., InfluxDB) running in GKE, from a Dataflow pipeline.
I have a DNS server for service discovery, also running in GKE. I also have a route in my network to access the GKE IP range from Dataflow instances, and this is working fine. I can manually nslookup from the Dataflow instances using my custom server without issues.
However, I cannot find a proper way to set up an additional DNS server when running my Dataflow pipeline. How could I achieve that, so that KafkaIO and similar sources/writers can resolve hostnames against my custom DNS?
sun.net.spi.nameservice.nameservers is tricky to use, because it must be called very early on, before the name service is statically instantiated. I would call java -D, but Dataflow is going to run the code itself directly.
In addition, I would not want to just replace the systems resolvers but merely append a new one to the GCP project-specific resolvers that the instance comes pre-configured with.
Finally, I have not found any way to use a startup script like for a regular GCE instance with the Dataflow instances.
I can't think of a way today of specifying a custom DNS in a VM other than editing /etc/resolv.conf[1] file in the box. I don't know if it is possible to share the default network. If it is machines are available at hostName.c.[PROJECT_ID].internal, which may serve your purpose if hostName is stable [2].
[1] https://cloud.google.com/compute/docs/networking#internal_dns_and_resolvconf [2] https://cloud.google.com/compute/docs/networking
I did not find any topic talking about eucalyptus and Kubernetes. And I find that quite weird because eucalyptus allow you to create an hybrid cloud with a private cloud A.W.S. compatible for the S3, EBS and EC2. And, so I was thinking with just Eucalytpus and Kubernetes, you can easily create an awesome hybrid cloud.
Does some experiments Kubernetes with Eucalyptus? If yes, how did you setup it? And, how works your private and public cloud (working together or are they independent)?
Eucalyptus does not yet officially support AWS ECS service, but there has been some work done on this area recently. For that you will have to install Eucalyptus from source using a custom git branch.
The link below explains how to get AWS ECS like service on Eucalyptus,
http://jeevanullas.in/blog/aws-ec2-container-service-api-in-eucalyptus/
We installed Kubernetes on top of Eucalyptus using Rancher. We had to modify the AWS Docker Machine driver to make it work automatically, but a few months ago the ability to use custom regions with Docker Machine so it should be in Rancher soon. You can also install Rancher hosts manually using the bootstrap script.
For the most part, it works quite well and the bonus with using Rancher is you get Swarm, Mesos, and Cattle orchestration engines which increases the types of container deployments. New versions of Rancher to be released will support EBS volumes and we plan to experiment that for container storage.
We haven't tried connecting private and public, but in theory it should work as long as the UDP ports are open required for Rancher networking. But I'm not even sure this is how container orchestration is designed to work multi-region so try with caution.