Kubernetes Persistent volumes - amazon-web-services

I have a MySQL database pod running in a Google Cloud cluster, using a persistent volume (GCEPersistentDisk) to back up my data.
Is there a way to also have an AWS persistent volume (AWSElasticBlockStore) backing up the same Pod, in case something goes wrong with Google Cloud Platform and I can't reach any of my data? That way, when I create another Kubernetes Pod in AWS, I'll be able to get my latest data (from before GCP crashed) from that AWSElasticBlockStore.
If not, what's the best way to simultaneously back up a Kubernetes database pod to two different cloud providers, so that when one crashes, you'll still be able to deploy on the other?

Related

Can we run an application that is configured to run on multi-node AWS EC2 K8s cluster using kops into local kubernetes cluster (using kubeadm)?

Can we run an application that is configured to run on multi-node AWS EC2 K8s cluster using kops (project link) into local Kubernetes cluster (setup using kubeadm)?
My thinking is that if the application runs in k8s cluster based on AWS EC2 instances, it should also run in local k8s cluster as well. I am trying it locally for testing purposes.
Here's what I have tried so far, but it is not working.
First I set up my local 2-node cluster using kubeadm
Then I modified the installation script of the project (link given above) by removing all the references to EC2 (as I am using local machines) and to kops state (particularly in their create_cluster.py script).
I have modified their application YAML files (app requirements) to match my local setup (2 nodes).
Unfortunately, although most of the application pods are created and in the Running state, some other application pods fail to be created, so I am unable to run the whole application on my local cluster.
I appreciate your help.
That is the beauty of Docker and Kubernetes: they help keep your development environment matched to production. For simple applications, written without custom resources, you can deploy the same workload to any cluster running on any cloud provider.
However, the ability to deploy the same workload to different clusters depends on some factors, such as:
How do you manage authorization and authentication in your cluster? e.g. IAM, IRSA.
Are you using any cloud-native custom resources? e.g. AWS ALBs used as LoadBalancer Services.
Are you using any cloud-native storage? e.g. your pods rely on EFS/EBS volumes.
Is your application cloud agnostic? e.g. does it use provider-native technologies like Neptune?
Can you mock cloud technologies locally? e.g. using LocalStack to mock Kinesis or DynamoDB.
How do you resolve DNS routes? e.g. say you are using RDS in AWS and access it through a Route 53 entry. Locally you might be running a MySQL instance, and you need a DNS mechanism to discover that instance.
I did a Google search and looked at the kOps documentation. I could not find any information about deploying locally; it appears to support only public cloud providers.
IMO, you need to figure out a way to set up your local EKS cluster, and if there is any usage of cloud-native technologies, you need to find an alternative way of doing the same locally.
The true answer, as Rajan Panneer Selvam said in his response, is that it depends, but I'd like to expand somewhat on his answer by saying that your application should run on any K8S cluster given that it provides the services that the application consumes. What you're doing is considered good practice to ensure that your application is portable, which is always a factor in non-trivial applications where simply upgrading a downstream service could be considered a change of environment/platform requiring portability (platform-independence).
To help you achieve this, you should be developing a 12-Factor Application (12-FA) or one of its more up-to-date derivatives (12-FA is getting a little dated now and many variations have been suggested, but mostly they're all good).
For example, if your application uses a database, then it should use DB-independent SQL or NoSQL so that you can switch it out. In production, you may run on Oracle, but in your local environment you may use MySQL: your application should not care. The credentials and connection string should be passed to the application via the usual K8S techniques of Secrets and ConfigMaps to help you achieve this. And all logging should be sent to stdout (and stderr) so that you can use a log-shipping agent to send the logs somewhere more useful than a local filesystem.
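As a rough illustration of the point above, here is a minimal Python sketch of an application that takes its database settings from the environment and logs to stdout. The variable names `DB_URL` and `DB_PASSWORD` are hypothetical; the assumption is that the cluster injects them into the container from a ConfigMap and a Secret, so the same code runs unchanged on EC2 or on a local cluster.

```python
import logging
import os
import sys


def load_db_settings(env=os.environ):
    """Read database settings from environment variables.

    DB_URL and DB_PASSWORD are hypothetical names chosen for this sketch;
    in a cluster they would typically come from a ConfigMap and a Secret.
    """
    required = ("DB_URL", "DB_PASSWORD")
    missing = [name for name in required if name not in env]
    if missing:
        raise RuntimeError("missing required settings: " + ", ".join(missing))
    return {"url": env["DB_URL"], "password": env["DB_PASSWORD"]}


def make_logger(name="app"):
    """Return a logger that writes to stdout, where a log shipper can read it."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid attaching duplicate handlers
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

At startup the application would call `load_db_settings()` and `make_logger()` once, and only the injected values change between environments.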
If you run your app locally, then you have to provide a surrogate for every 'platform' service that is provided in production. This may mean switching out major components of what you consider to be your application, but this is OK; it is meant to happen. You provide a platform that provides services to your application layer. Switching from EC2 to local may mean reconfiguring the ingress controller to work without the ELB, or it may mean configuring Kubernetes secrets to use local storage for dev creds rather than AWS KMS. It may mean reconfiguring your persistent volume classes to use local storage rather than EBS. All of this is expected and right.
What you should not have to do is start editing microservices to work in the new environment. If you find yourself doing that, then the application has made a factoring and layering error. Platform services should be provided to a set of microservices that use them; the microservices should not be aware of the implementation details of these services.
Of course, it is possible that you have some non-portable code in your system. For example, you may be using some Oracle-specific PL/SQL that can't be run elsewhere. This code should be extracted to config files, with equivalents provided for each database you wish to run on. This isn't always possible, in which case you should abstract as much as possible into isolated services, and you'll have to reimplement only those services on each new platform. That could still be time-consuming, but it is ultimately worth the effort for most non-trivial systems.

How to backup ETCD of google kubernetes engine?

For Google Kubernetes Engine, the master node and etcd cluster are abstracted away from me, the user.
Most etcd backup guides assume I have either the endpoint or file system access needed to perform backups.
As such, how do I perform such a backup and restoration of etcd in GKE?
Or will GKE eventually provide a managed backup/restore service similar to Cloud SQL?
Also, if a full backup is not possible, even namespace backups would be great.
To clarify, the scenario to guard against is not "if Google goes down", but "if we do something stupid".
The GKE backend is completely managed, so there is no way to access the etcd API. Even if you could access the cluster's etcd, there are no guarantees of backwards compatibility for the storage backend, so the storage layer could change.
You'll have to use the Kubernetes API, which is backwards compatible, for any backups you might want. There is some discussion on the kubernetes users google group here which should clarify this further.
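One way to picture an API-level backup is sketched below in Python. It assumes the objects have already been exported through the API as JSON (for example with `kubectl get <kind> -n <namespace> -o json`) and only shows the clean-up step: dropping fields that the API server populates itself, so the saved objects can be re-applied later. The helper names are hypothetical, and the sketch covers API objects only, not the data inside persistent volumes.

```python
import copy
import json

# Metadata fields that the API server fills in; they should not be replayed.
SERVER_FIELDS = (
    "uid",
    "resourceVersion",
    "creationTimestamp",
    "generation",
    "managedFields",
    "selfLink",
)


def strip_server_fields(obj):
    """Return a copy of one exported object without server-populated fields."""
    cleaned = copy.deepcopy(obj)
    cleaned.pop("status", None)  # status is reported by the cluster, not desired state
    metadata = cleaned.get("metadata", {})
    for field in SERVER_FIELDS:
        metadata.pop(field, None)
    return cleaned


def backup_document(api_list):
    """Turn an exported `List` response (a dict) into a JSON backup string."""
    items = [strip_server_fields(item) for item in api_list.get("items", [])]
    backup = {"apiVersion": "v1", "kind": "List", "items": items}
    return json.dumps(backup, indent=2, sort_keys=True)
```

Running this per namespace gives the namespace-level backups mentioned in the question, and the resulting file can be kept in version control to recover from accidental deletions.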

What are strategies for bridging Google Cloud with AWS?

Let's say a company has an application with a database hosted on AWS and also has a read replica on AWS. Then that same company wants to build out a data analytics infrastructure in Google Cloud -- to take advantage of data analysis and ML services in Google Cloud.
Is it necessary to create an additional read replica within the Google Cloud context? If not, is there an alternative strategy that is frequently used in this context to bridge the two cloud services?
While services like Amazon Relational Database Service (RDS) provide read-replica capabilities, these work only between managed database instances on AWS.
If you are replicating a database between providers, then you are probably running the database yourself on virtual machines rather than using a managed service. This means the databases appear just like any resource on the Internet, so you can connect them exactly the way you would connect two resources across the internet. However, you would be responsible for managing, monitoring, deploying, etc. This takes away from much of the benefit of using cloud services.
Replicating between storage services like Amazon S3 would be easier since it is just raw data rather than a running database. Also, Big Data is normally stored in raw format rather than being loaded into a database.
If the existing infrastructure is on a cloud provider, then try to perform the remaining activities on the same cloud provider.

Deploying MEAN app on AWS ECS

I have successfully deployed a MEAN app on AWS ECS, but there are a couple of things I don't have set up properly.
1) If I spin up a new task, the Mongo data does not persist between the containers.
2) Should my Mongo container and my frontend container be in the same task definition? This seems wrong because I feel like they should be able to scale independently of each other. But if they should be in separate task definitions, do I link them the same way?
Current Architecture:
1 Task Definition
contains frontend container and mongo container which are linked
I did not define any mounts or volumes (which I assume is why data isn't persisting, but I am struggling to figure out how to properly set this up)
1 Cluster
1 service
contains load balancer and auto-scaling group (when this auto-scaling group creates a new task, I run into the issue of not having data persistence)
Your assumption is correct: since you are not defining any mounts, the data is not persistent. I recommend using Amazon EFS to persist data from Amazon ECS containers. You can find a step-by-step guide below.
Using Amazon EFS to Persist Data from Amazon ECS Containers

Deploy Docker container using Kubernetes

I'm learning about Kubernetes because it's a very useful tool for managing and deploying containers.
So my question is:
For example, I have 2 Amazon EC2 instances called Kube1 and Kube2. On Kube1 I created some containers using Docker and deployed WordPress successfully. Now I want to make a cluster of Kube1 and Kube2 and then use Kubernetes to deploy all of the containers on Kube1 to Kube2. Is there any step-by-step tutorial to get me through it? I'm kind of stuck with a lot of new Kubernetes concepts.
Kubernetes is an orchestration tool.
It lets you deploy containers on a cluster to ensure availability.
What that means is that you define specs for containers (or sets of containers), called Pods, and you send them to the cluster manager to be deployed.
You do not choose where the Pods get deployed: Kubernetes decides where your Pod is deployed depending on the resources it needs and the resources that are available in the cluster.
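To make the placement idea concrete, here is a deliberately simplified Python sketch. It is not the Kubernetes scheduler, which applies many more filtering and scoring rules; it only assumes each node reports free CPU and memory, and picks a node on which the Pod's request fits. All field names and numbers are hypothetical.

```python
def fits(pod, node):
    """True if the node has enough free CPU and memory for the pod's request."""
    return pod["cpu"] <= node["free_cpu"] and pod["memory"] <= node["free_memory"]


def pick_node(pod, nodes):
    """Pick the fitting node with the most free resources, or None if none fits."""
    candidates = [node for node in nodes if fits(pod, node)]
    if not candidates:
        return None  # in a real cluster the Pod would stay Pending
    best = max(candidates, key=lambda n: (n["free_cpu"], n["free_memory"]))
    return best["name"]


# Hypothetical two-node cluster matching the question (CPU in cores, memory in MiB).
nodes = [
    {"name": "kube1", "free_cpu": 0.5, "free_memory": 512},
    {"name": "kube2", "free_cpu": 2.0, "free_memory": 4096},
]
wordpress_pod = {"cpu": 1.0, "memory": 1024}
chosen = pick_node(wordpress_pod, nodes)
```

Here the Pod lands on `kube2`, simply because `kube1` does not have enough free capacity for the request.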
There is also the concept of a Service (which I find confusing, as 'service' often means your 'application' in today's jargon). A Kubernetes Service is a load-balanced proxy to the Pods you target.
The Service ensures that you can talk to a Pod by using a 'selector', which defines which Pods are targeted.
If you have a WordPress site, it serves content. If you have 2 identical Pods running this site, then the Service would load balance the requests to either one of the 2 Pods.
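The selector mechanism can be modelled in a few lines of Python. This is a toy model, not Kubernetes code: Pods are plain dictionaries with labels, the Service keeps the Pods whose labels match its selector, and requests are spread round-robin. Round-robin is an assumption made for illustration; how a real cluster picks a backend depends on its proxy implementation.

```python
import itertools


def matches(selector, labels):
    """A pod matches when every selector key/value pair appears in its labels."""
    return all(labels.get(key) == value for key, value in selector.items())


class ToyService:
    """Toy stand-in for a Service: selects pods by label and rotates between them."""

    def __init__(self, selector, pods):
        self.endpoints = [p["name"] for p in pods if matches(selector, p["labels"])]
        self._cycle = itertools.cycle(self.endpoints)

    def route(self):
        """Return the name of the pod that should receive the next request."""
        if not self.endpoints:
            raise LookupError("no pods match the selector")
        return next(self._cycle)


# Hypothetical pods: two WordPress replicas and one database pod.
pods = [
    {"name": "wordpress-1", "labels": {"app": "wordpress"}},
    {"name": "wordpress-2", "labels": {"app": "wordpress"}},
    {"name": "mysql-1", "labels": {"app": "mysql"}},
]
service = ToyService({"app": "wordpress"}, pods)
targets = [service.route() for _ in range(3)]
```

The database Pod never receives traffic from this Service because its labels do not match the selector.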
Now, for this to work, you need the 2 Pods to be the same, which means that if the data is updated (as it would be on a blog), the data needs to reach each WordPress server from a single source.
You could have a shared Database Pod that both servers connect to. Ideally you use a distributed version of the database that takes care of replication. Or you'd need to mirror the DB.
The use-case you mention though is a bit different: you're talking about porting an infrastructure to another server.
If you have your containers running on one node, replicating them somewhere else should be as easy as pushing your images to a registry and pulling them onto the other node. For the data, you may need to back up the volume and move it manually, or package the data into an image that you push to your registry (a Docker volume itself cannot be pushed).