Upgrading AWS Elasticsearch & Logstash with Terraform

I am currently trying to upgrade our AWS Elasticsearch (ES) cluster with Terraform and want to create two new clusters from the one we currently have.
Doing it through Terraform is preferable because we have a huge cluster managed by Terraform, so any change made through the console would be reverted the next time we apply. Does anyone have experience doing this from ES version 2.3 to version 5? I have been told to snapshot and restore, but I can't find any documentation on how to do this through Terraform. Thanks

So I think we have figured it out. You cannot upgrade from Elasticsearch version 2.3 through Terraform, as it is just too old. You need to manually create the two new clusters, point your logs at them, then take a snapshot of the old cluster and restore it to the new ones. From version 5 of Elasticsearch onwards, I believe you can upgrade through Terraform as long as your provider (e.g. AWS) is at least version 1.55. I have been informed that this should not need a snapshot, as the cluster is updated in place.
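For the snapshot-and-restore part, the calls go through Elasticsearch's `_snapshot` REST API. Here is a minimal sketch that only builds the three requests (method, path, body) and does not send anything. The repository name, snapshot name, bucket, region and role ARN are all placeholders I made up, and on AWS the requests would additionally have to be signed, which is not shown here.

```python
def register_repo_request(repo, bucket, region, role_arn):
    """Build the request that registers an S3 bucket as a snapshot repository."""
    body = {
        "type": "s3",
        "settings": {"bucket": bucket, "region": region, "role_arn": role_arn},
    }
    return ("PUT", f"/_snapshot/{repo}", body)


def snapshot_request(repo, snapshot):
    """Build the request that takes a snapshot on the OLD cluster."""
    return ("PUT", f"/_snapshot/{repo}/{snapshot}", None)


def restore_request(repo, snapshot, indices=None):
    """Build the request that restores a snapshot on the NEW cluster.

    `indices` is an optional comma-separated index pattern; when omitted,
    no body is sent and the whole snapshot is restored.
    """
    body = {"indices": indices} if indices else None
    return ("POST", f"/_snapshot/{repo}/{snapshot}/_restore", body)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(register_repo_request("my-repo", "my-bucket", "eu-west-1",
                                "arn:aws:iam::123456789012:role/SnapshotRole"))
    print(snapshot_request("my-repo", "pre-upgrade"))
    print(restore_request("my-repo", "pre-upgrade", "logs-*"))
```

The repository would need to be registered on both the old and the new cluster so that the new one can see the snapshot taken by the old one.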

Related

Migrating AWS Neptune Snapshots

I have taken a snapshot of a Neptune cluster which is on Neptune Engine V1.0.x and when I try to restore it I am getting an option to create a cluster with engine versions 1.0.x or 1.1.x.
The option to restore it on a cluster with engine version 1.2.x is not present.
If engine version 1.0.x and 1.1.x reach their end of life, then how would a snapshot created from engine version 1.0.x get restored?
Is it possible to migrate AWS Neptune snapshot from one engine version to another?
I reached out to AWS support and got a response to this query.
According to them, there are significant changes in the architecture starting from engine version 1.1.1.0, which is why DB engines on 1.0.x.x must be upgraded to 1.1.1.0 before upgrading to 1.2.x.x.
Also, there is no way to restore snapshots taken on 1.0.x.x to 1.2.x.x once 1.0.x.x reaches its end of life. The only way to keep such a snapshot usable is to restore it before end of life, upgrade the restored cluster, and then take a new snapshot of the upgraded cluster.
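The rule from AWS support can be written down as a small helper. This is only a sketch of the one rule quoted above (1.0.x.x has to pass through 1.1.1.0 before going to 1.2.x.x). The function name and the version strings in the example are mine, and the real list of valid upgrade paths should be checked against the Neptune documentation.

```python
def neptune_upgrade_path(current, target):
    """Return the engine versions to step through, in order, ending at `target`."""
    cur = tuple(int(part) for part in current.split("."))
    tgt = tuple(int(part) for part in target.split("."))
    if tgt < cur:
        raise ValueError("downgrades are not covered by this sketch")

    path = []
    # Rule reported by AWS support: 1.0.x.x must go to 1.1.1.0 first
    # before it can be upgraded to 1.2.x.x (or later).
    if cur[:2] == (1, 0) and tgt[:2] >= (1, 2):
        path.append("1.1.1.0")
    path.append(target)
    return path


if __name__ == "__main__":
    # Illustrative version numbers only.
    print(neptune_upgrade_path("1.0.5.1", "1.2.0.2"))
```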

EKS upgrade on the first of November

I've an EKS cluster in AWS.
[cloudshell-user@ip-10-0-87-109 ~]$ kubectl version --short
Kubeconfig user entry is using deprecated API version client.authentication.k8s.io/v1alpha1. Run 'aws eks update-kubeconfig' to update.
Client Version: v1.23.7-eks-4721010
Server Version: v1.20.11-eks-f17b81
[cloudshell-user@ip-10-0-87-109 ~]$ kubectl get nodes
Kubeconfig user entry is using deprecated API version client.authentication.k8s.io/v1alpha1. Run 'aws eks update-kubeconfig' to update.
NAME                    STATUS   ROLES    AGE   VERSION
fargate-172.101.0.134   Ready    <none>   13d   v1.20.15-eks-14c7a48
fargate-172.101.0.161   Ready    <none>   68d   v1.20.15-eks-14c7a48
On Nov 1st AWS is going to update the server version to 1.21 because it is ending support for 1.20.
What problems will come up? I read that "no EKS features were removed in 1.21".
What should I do in order to be safe?
An EKS version upgrade follows the official upstream Kubernetes release, so you should check which changes and deprecations landed between 1.20 and 1.21 and whether any API changes affect your current workloads:
https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.21.md#changelog-since-v1200
If they do, you have to update your manifests beforehand: https://aws.amazon.com/blogs/containers/preparing-for-kubernetes-api-deprecations-when-going-from-1-15-to-1-16/
From the official calendar, it says End Of Support on November 1st but there are some FAQs that you need to understand first: https://docs.aws.amazon.com/eks/latest/userguide/kubernetes-versions.html#version-deprecation
On the end of support date, you can no longer create new Amazon EKS clusters with the unsupported version.
Amazon EKS will automatically upgrade existing control planes (not nodes) to the oldest supported version through a gradual deployment process after the end of support date. Amazon EKS does not allow control planes to stay on a version that has reached end of support.
After the automatic control plane update, you must manually update cluster add-ons and Amazon EC2 nodes.
To be safe, you have to:
prepare your manifests for the Kubernetes version that you are going to upgrade to.
proactively upgrade your EKS cluster before AWS forces the upgrade.
remember to upgrade EKS add-ons and node groups, and test your current controllers on the new version: https://docs.aws.amazon.com/eks/latest/userguide/update-managed-node-group.html
I suggest you just provision a test cluster with spot instances and try out your application first.
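To get a first idea of whether your manifests are affected, you could scan them for `apiVersion` values you have flagged after reading the changelog. This is a rough sketch, not a replacement for testing on a real cluster: the function name is mine, it is a plain regex scan rather than a YAML parser, and the set of flagged versions has to come from you. The `extensions/v1beta1` entry in the example is taken from the 1.15 to 1.16 case covered by the blog post linked above, purely as an illustration.

```python
import re

# Matches lines such as "apiVersion: apps/v1" (optionally quoted).
API_VERSION_RE = re.compile(
    r"^\s*apiVersion:\s*['\"]?([\w./-]+)['\"]?\s*$", re.MULTILINE
)


def find_flagged_api_versions(manifest_text, flagged):
    """Return the sorted apiVersion values in the manifest that appear in `flagged`."""
    found = API_VERSION_RE.findall(manifest_text)
    return sorted({version for version in found if version in flagged})


if __name__ == "__main__":
    manifest = (
        "apiVersion: extensions/v1beta1\n"
        "kind: Deployment\n"
        "---\n"
        "apiVersion: v1\n"
        "kind: Service\n"
    )
    # The caller supplies the flagged set; this one is only an example.
    print(find_flagged_api_versions(manifest, {"extensions/v1beta1"}))
```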
You do not really need to perform any safety checks on your end; AWS will take care of the upgrade process (at least from 1.20 to 1.21). My recommendation is to upgrade before AWS does it for you, because once the version reaches end of life the upgrade can happen at any time.
The only things you need to update manually are:
Self-managed node groups
Any add-ons
The breaking change is the service account token expiry.
Any service that depends on a service account token must account for the fact that the token now expires after one hour, so the service/pod needs to refresh it:
Service account tokens now have an expiration of one hour. In previous Kubernetes versions, they didn't have an expiration.
On the end of support date, you can no longer create new Amazon EKS clusters with the unsupported version. Existing control planes are automatically updated by Amazon EKS to the earliest supported version through a gradual deployment process after the end of support date. After the automatic control plane update, make sure to manually update cluster add-ons and Amazon EC2 nodes.
kubernetes-versions-1.21
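On the token expiry: the usual way to cope is to stop caching the token for the lifetime of the process and re-read it from the mounted file every so often. Below is a minimal sketch of that idea. The class name and the 60 second re-read interval are my own choices, and it assumes the token is mounted at the standard in-pod service account path. Recent Kubernetes client libraries already do this for you, so this mostly matters for hand-rolled HTTP clients.

```python
import time

# Standard in-pod location of the mounted service account token.
TOKEN_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/token"


class ServiceAccountToken:
    """Caches the token briefly and re-reads the file once the cache is stale."""

    def __init__(self, path=TOKEN_PATH, max_age_seconds=60.0, clock=time.monotonic):
        self._path = path
        self._max_age = max_age_seconds
        self._clock = clock
        self._token = None
        self._read_at = None

    def get(self):
        now = self._clock()
        # Re-read on first use or when the cached copy is older than max_age.
        if self._token is None or now - self._read_at >= self._max_age:
            with open(self._path, encoding="utf-8") as handle:
                self._token = handle.read().strip()
            self._read_at = now
        return self._token
```

Each outgoing request would then call `get()` for its `Authorization: Bearer` header instead of reading the file once at startup.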

AWS Elasticsearch Update version 6.3 -> 7.1

I am a newbie and have recently started using some AWS services. I am a little confused about how to update AWS ES from version 6.x to 7.x. What should I be concerned about? Will it have any effect on the existing data in ES?
Thank you
When you start an upgrade, Amazon ES performs the following steps:
Pre-upgrade checks – Amazon ES performs a series of checks for issues that can block an upgrade and doesn't proceed to the next step unless these checks succeed.
Snapshot – Amazon ES takes a snapshot of the Elasticsearch cluster and doesn't proceed to the next step unless the snapshot succeeds. If the upgrade fails, Amazon ES uses this snapshot to restore the cluster to its original state. For more information about this snapshot, see Can't Downgrade After Upgrade.
Upgrade – Amazon ES starts the upgrade, which can take from 15 minutes to several hours to complete. Kibana might be unavailable during some or all of the upgrade.
If you're running a single node, you may experience downtime during the upgrade, lasting anywhere from 15 minutes to several hours. However, Amazon does perform checks beforehand to validate that your data is compatible with the newer version. If you have a multi-node setup, nodes are upgraded in rotation so that your service is not affected.
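If you prefer to trigger the pre-upgrade checks yourself before committing to anything, the API has a check-only mode. Here is a sketch, assuming a boto3 `es` client is passed in (the function name and the domain name in the usage note are placeholders). It first asks the service which target versions it reports as compatible and only then requests a check-only upgrade, so no data is touched.

```python
def dry_run_upgrade(es_client, domain_name, target_version):
    """Run only the pre-upgrade checks; return False if the target is not offered.

    `es_client` is expected to behave like boto3.client("es").
    """
    compat = es_client.get_compatible_elasticsearch_versions(DomainName=domain_name)
    targets = set()
    for entry in compat.get("CompatibleElasticsearchVersions", []):
        targets.update(entry.get("TargetVersions", []))

    if target_version not in targets:
        # The service does not list this version as a direct upgrade target.
        return False

    es_client.upgrade_elasticsearch_domain(
        DomainName=domain_name,
        TargetVersion=target_version,
        PerformCheckOnly=True,  # checks only, the upgrade itself is not started
    )
    return True
```

Usage would look like `dry_run_upgrade(boto3.client("es"), "my-domain", "7.1")`, and the outcome of the checks can then be read with `get_upgrade_status(DomainName="my-domain")`. If the function returns False, the compatible-versions response shows which intermediate version to go to first.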

AWS EMR Spark 1.0

Is there a way to force Amazon EMR to use Spark 1.0.1? The currently selectable versions only go back as far as 1.4.1.
I am using the Alternating Least Squares implementation within MLlib. Since v1.1 it uses weighted regularization, and for specific reasons (a research study) I do not want that implementation; I am trying to access the non-weighted regularization version that was implemented in v1.0.
I am using Zeppelin notebooks with Scala, if that helps.
Is working with Zeppelin a requirement? Because if so, it could be very difficult. Zeppelin is compiled against a specific version of Spark so downgrading the jar will most likely fail.
Otherwise, if you are ok with not using Zeppelin and using the EMR Step API instead, you might be able to spin up an EMR cluster with a bootstrap action that installs spark-assembly 1.0.1. I say "might" because there's no guarantee that the current EMR version is compatible with a two-year-old version of Spark.
To create the cluster:
Create a cluster from the UI, make sure to uncheck Spark from the additional software menu
Add a custom bootstrap action and use the script at s3://support.elasticmapreduce/spark/install-spark with arguments -v 1.0.1
(See https://github.com/awslabs/emr-bootstrap-actions/tree/master/spark for configuration options)
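If you would rather script the cluster creation than click through the UI, the same bootstrap action can be expressed as the dictionary that boto3's `run_job_flow` takes in its `BootstrapActions` list. A small sketch (the function name and the action's display name are mine; the script path and the `-v` argument are the ones given above):

```python
def spark_bootstrap_action(version="1.0.1"):
    """Build one BootstrapActions entry that runs the install-spark script."""
    return {
        "Name": f"Install Spark {version}",
        "ScriptBootstrapAction": {
            "Path": "s3://support.elasticmapreduce/spark/install-spark",
            "Args": ["-v", version],
        },
    }


if __name__ == "__main__":
    print(spark_bootstrap_action())
```

It would be passed as `BootstrapActions=[spark_bootstrap_action()]` alongside the rest of the cluster configuration, with Spark left out of the application list.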
To run Spark using the EMR Step API:
Upload your compiled jar to S3, then submit a step against that cluster. You will need:
Cluster ID: the id of your cluster (ex j-XXXXXXXX)
Region of cluster: where you created your EMR cluster (ex us-west-2)
Your spark main class: this is where you put your ML pipeline code
Your jar: you have to upload the jar with your code to S3 so your cluster can download it
arg1, arg2: arguments to your main (optional)
aws emr add-steps --cluster-id <your-cluster-id> --steps \
Name=SparkPi,Jar=s3://<your-cluster-region>.elasticmapreduce/libs/script-runner/script-runner.jar,Args=[/home/hadoop/spark/bin/spark-submit,--deploy-mode,cluster,--master,yarn,--class,com.your.spark.class.MainApp,s3://<your-bucket>/your.jar,arg1,arg2],ActionOnFailure=CONTINUE
(Taken from the official github repo at https://github.com/awslabs/emr-bootstrap-actions/blob/master/spark/examples/spark-submit-via-step.md)
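The same step can also be submitted from Python. This sketch builds the step dictionary that boto3's `add_job_flow_steps` expects. The function name is mine, and it assumes the script-runner jar lives in the bucket named after the cluster's region, which is what the "Region of cluster" parameter above is for.

```python
def spark_submit_step(region, main_class, jar_s3_uri, args=()):
    """Build one EMR step that calls spark-submit through script-runner."""
    return {
        "Name": "SparkPi",
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            # Assumption: the script-runner bucket is prefixed with the region.
            "Jar": f"s3://{region}.elasticmapreduce/libs/script-runner/script-runner.jar",
            "Args": [
                "/home/hadoop/spark/bin/spark-submit",
                "--deploy-mode", "cluster",
                "--master", "yarn",
                "--class", main_class,
                jar_s3_uri,
                *args,
            ],
        },
    }


if __name__ == "__main__":
    # Bucket and class names are placeholders.
    print(spark_submit_step("us-west-2", "com.your.spark.class.MainApp",
                            "s3://your-bucket/your.jar", ["arg1", "arg2"]))
```

With a client from `boto3.client("emr")`, the step would be submitted via `add_job_flow_steps(JobFlowId="j-XXXXXXXX", Steps=[step])`.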
Also if that fails, install Hadoop and check out https://spark.apache.org/docs/1.0.1/running-on-yarn.html
Or you could also run 1.0.1 locally on your laptop if your data is small.
Good luck.
Amazon EMR provides a list of supported software package versions that you can install by selecting them from a drop-down menu. Nothing stops you from installing additional custom software with a bootstrap action. I have some experience installing Java 8 when EMR only supported Java 7. It is a bit painful but totally possible.
EMR supports Spark 1.6.0. Take a look at their latest release of emr-4.4.0: http://docs.aws.amazon.com/ElasticMapReduce/latest/ReleaseGuide/emr-whatsnew.html

Installing Impala 2.3 on Amazon EMR

I see that Impala 2.3 is only supported on Cloudera CDH 5.5 & above. Impala 2.2 can be installed on Amazon EMR, as there is a bootstrap script available on GitHub and it doesn't require a Cloudera installation.
However, I don't see any way to install Cloudera CDH 5.5 or 5.6 on Amazon EMR. I want to install Impala 2.3 so is there any way through which Impala 2.3 can be installed on Amazon EMR?
Well, my previous answer has been deleted because it "does not provide an answer to the question". I'm not going to argue about whether it's better to have a partially incorrect answer to this question or whether making categorical claims without foundation is a good answer :/.
In any case, I'm not giving up :)
Yes, on paper it's possible to install "anything".
Once you launch the EMR cluster, all instances will appear on your EC2 console. The only thing is that you have to be careful assigning the right permissions to access thru SSH to your instances. My suggestion is to create a specific security group with the access and assign this extra security group to the instances using the Advanced configuration of the cluster.
With the proper configuration, you can ssh into any instance and install anything (you should be able to scp any file or download from the internet if your VPC is configured properly). Note that the user will be "hadoop" instead of "ec2-user", but this is documented in the EMR user guide.
Keep in mind that clusters get terminated, so the EMR instances are volatile and the installation is not going to survive the cluster termination.
On the other hand, with the latest versions of the EMR AMIs and the latest capabilities of AWS (I think this was always the case, but it doesn't matter now), you should be able to define bootstrap actions and install anything you want.
Using the "Advanced configuration" of your cluster, you can access the "Bootstrap" actions to be executed on your cluster. You can even have different actions depending on the node type (master, core, task). You should store your scripts (and/or jar files) in an S3 bucket and make this bucket available to your cluster. On paper, you could install Impala on the EC2 instances that make up the EMR cluster, but I'm not sure whether this will work.
For more information, you can read http://docs.aws.amazon.com//emr/latest/ManagementGuide/emr-plan-bootstrap.html
And for a previous version of EMR AMI and not so recent version of Impala you can read https://github.com/awslabs/emr-bootstrap-actions/tree/master/impala
Thanks Mark, you forced me to elaborate on my comment.
No, it is not possible to "install" anything on EMR because it's a PaaS provided by AWS. But if your goal is to run a newer version of Impala on AWS, there is an AWS Quick Start path for installing CDH 5.x (including Impala) that makes the process relatively easy.
http://aws.amazon.com/quickstart/