Ganglia and Amazon Elastic Map Reduce - install issues - elastic-map-reduce

Following the instructions for "Initializing Ganglia on a Job Flow" I get my cluster up but don't see any Ganglia process running (on 8157).
http://docs.amazonwebservices.com/ElasticMapReduce/latest/DeveloperGuide/init_Ganglia.html
elastic-mapreduce --create --alive --name "Tom's Daily Hive 8x Flow" --instance-type c1.medium --num-instances 8 --availability-zone us-east-1a --bootstrap-action "s3://elasticmapreduce/bootstrap-actions/install-ganglia" --stream
There is a 0K file in: /tmp/ganglia-installed
Any suggestions? Thanks!

OK ... so it turns out the instructions at Amazon are a bit light on the details.
Try here http://ecs-network.serv.pacific.edu/ecpe-293a/projects/commoncrawl-tutorial and here http://docs.amazonwebservices.com/ElasticMapReduce/latest/DeveloperGuide/UsingtheHadoopUserInterface.html#AccessingtheHadoopUserInterfacetoMonitorJobStatus2
and you will do much better.
I don't have much port-forwarding experience, and what's in the EMR instructions led me down many wrong paths (guesses).

Your create-cluster command looks OK. Set up FoxyProxy in Firefox as described in the same document you refer to in your question.
After the setup, go to localhost:8157/ganglia; that should take you to your monitoring graphs!
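In case the tunnel itself is the missing piece: the FoxyProxy option in the EMR docs rides on an SSH tunnel to the master node. A rough sketch of the two usual variants (the key path and master public DNS are placeholders you need to fill in):
# Dynamic (SOCKS) forwarding on 8157, which FoxyProxy then uses:
ssh -i ~/path/to/your-key.pem -N -D 8157 hadoop@<master-public-dns>
# Or a plain local forward (assuming Ganglia's web front end sits on port 80 of the master),
# after which localhost:8157/ganglia should load directly:
ssh -i ~/path/to/your-key.pem -N -L 8157:localhost:80 hadoop@<master-public-dns>
Leave the SSH session open while you browse.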

Related

Orchestrating many instances from a local machine - Too frequent operations from the source resource

I have a local linux machine of the following flavor:
NAME="Ubuntu"
VERSION="20.04.2 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.2 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
On that machine I have gcloud installed with the following version:
$ gcloud -v
Google Cloud SDK 334.0.0
alpha 2021.03.26
beta 2021.03.26
bq 2.0.66
core 2021.03.26
gsutil 4.60
On that machine I have the following shell script:
#!/bin/bash
client_name=$1
instance_name_arg=imageclient${client_name}
z=us-east1-b
gcloud beta compute --project=foo-123 instances create $instance_name_arg --zone=${z} --machine-type=g1-small --subnet=default --network-tier=PREMIUM --source-machine-image projects/foo-123/global/machineImages/envNameimage2
gcloud compute instances start $instance_name_arg --zone=${z}
sleep 10s
gcloud compute ssh --zone=${z} usr@${instance_name_arg} -- '/home/usr/miniconda3/envs/minienvName/bin/python /home/usr/git/bitstamp/envName/client_run.py --payload=client_packages_'${client_name}
gcloud compute instances stop $instance_name_arg --zone=${z}
As I scale this project, I need to launch this task many times at exactly the same time, but with different settings.
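Concretely, the launches happen in parallel, something like this (launch_client.sh is just an assumed name for the script above):
./launch_client.sh alpha &
./launch_client.sh beta &
./launch_client.sh gamma &
wait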
As I am launching more and more of these bash scripts, I am starting to get the following error message:
ERROR: (gcloud.beta.compute.instances.create) Could not fetch resource:
- Operation rate exceeded for resource 'projects/foo-123/global/machineImages/envNameimage2'. Too frequent operations from the source resource.
How do I work around this issue?
As currently architected, my solution maps one machine to one launch setting, and I would ideally like to keep it that way.
There must be other large-scale gcloud customers who need a large number of machines spawned in parallel.
Thank you very much
Since you are creating each new instance from the machine image envNameimage2, each creation counts as a snapshot of the underlying disk.
You can snapshot your disks at most once every 10 minutes. If you want to issue a burst of requests to snapshot your disks, you can issue at most 6 requests in 60 minutes.
Reference:
Snapshot frequency limits
Creating disks from snapshots
Other API rate limits
A workaround could be to follow the rate limits, or to create an instance using an existing (available) disk with the --disk flag.
--disk=name=clone-disk-1,device-name=clone-disk-1,mode=rw,boot=yes
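As a rough sketch of that workaround (assuming a disk named clone-disk-1 already exists in the same zone, and reusing the variables and flags from the script in the question):
gcloud compute instances create ${instance_name_arg} \
    --project=foo-123 \
    --zone=${z} \
    --machine-type=g1-small \
    --subnet=default \
    --network-tier=PREMIUM \
    --disk=name=clone-disk-1,device-name=clone-disk-1,mode=rw,boot=yes
Note that a persistent disk in read-write mode can be attached to only one instance at a time, so you would need one such disk per parallel instance.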

installing Neo4j on AWS (instructions fail)

I recently spun up a t2.micro instance and I want to install neo4j on it. I started with the instructions at https://neo4j.com/developer/neo4j-cloud-aws-ec2-ami/. But I got to the step for creating a security group and I received an error that a region needed to be supplied. Here is the command I used:
aws ec2 create-security-group \
--group-name $GROUP \
--description "Neo4j security group"
The error message was
You must specify a region. You can also configure your region by running "aws configure".
When I run that command I get prompted for a lot of things that don't seem related to the region. Not only am I prompted for values that I don't know where or how to get, but when I am prompted for the region I am not sure what format to enter it in. So my question is: how do I configure a security group so I can move on to installing neo4j on this instance?
There are still several steps to follow to install neo4j, but I seem to be tripped up on this step.
The commands expect a default region under ~/.aws/config
[default]
region=us-west-2
output=json
On the link that you have shared, there is a step to "Configure the AWS CLI with Your Credentials". This step lets you set up AWS profile(s), and as part of those profiles you can set a region.
Follow this link to understand how to set up your AWS profile with credentials and region details:
https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html
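Concretely, you can either run the interactive setup once (it writes the files shown above) or pass the region directly on the failing command; the region value below is just an example:
aws configure
# prompts for: AWS Access Key ID, AWS Secret Access Key,
# Default region name (e.g. us-east-1), Default output format (e.g. json)

aws ec2 create-security-group \
    --group-name $GROUP \
    --description "Neo4j security group" \
    --region us-east-1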
Hope it helps

spark-submit from outside AWS EMR cluster

I have an AWS EMR cluster running spark, and I'd like to submit a PySpark job to it from my laptop (--master yarn) to run in cluster mode.
I know that I need to set up some config on the laptop, but I'd like to know what the bare minimum is. Do I just need some of the config files from the master node of the cluster? If so, which? Or do I need to install hadoop or yarn on my local machine?
I've done a fair bit of searching for an answer, but I haven't been able to tell whether what I was reading referred to launching a job from the master node of the cluster or from some arbitrary laptop...
If you want to run the spark-submit job solely on your AWS EMR cluster, you do not need to install anything locally. You only need the EC2 key pair you specified in the Security Options when you created the cluster.
I personally scp over any relevant scripts and/or jars, ssh into the master node of the cluster, and then run spark-submit.
You can specify most of the relevant spark job configurations via spark-submit itself. AWS documents in some more detail how to configure spark-submit jobs.
For example:
>> scp -i ~/PATH/TO/${SSH_KEY} /PATH/TO/PYSPARK_SCRIPT.py hadoop@${PUBLIC_MASTER_DNS}:
>> ssh -i ~/PATH/TO/${SSH_KEY} hadoop@${PUBLIC_MASTER_DNS}
>> spark-submit --conf spark.OPTION.OPTION=VALUE PYSPARK_SCRIPT.py
However, if you already pass a particular configuration when creating the cluster itself, you do not need to re-specify those same configuration options via spark-submit.
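For example, Spark defaults set at cluster creation through a configuration classification apply to every job, so they don't need to be repeated on spark-submit. A hedged sketch (the release label, instance settings, and property values here are placeholders, not taken from the question):
aws emr create-cluster \
    --name "spark-cluster" \
    --release-label emr-5.30.0 \
    --applications Name=Spark \
    --instance-type m5.xlarge \
    --instance-count 3 \
    --use-default-roles \
    --configurations '[{"Classification":"spark-defaults","Properties":{"spark.executor.memory":"4g","spark.executor.cores":"4"}}]'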
You can set up the AWS CLI on your local machine, put your deployment on S3, and then add an EMR step to run on the EMR cluster. Something like this:
aws emr add-steps --cluster-id j-xxxxx --steps Type=spark,Name=SparkWordCountApp,Args=[--deploy-mode,cluster,--master,yarn,--conf,spark.yarn.submit.waitAppCompletion=false,--num-executors,5,--executor-cores,5,--executor-memory,20g,s3://codelocation/wordcount.py,s3://inputbucket/input.txt,s3://outputbucket/],ActionOnFailure=CONTINUE
Source: https://aws.amazon.com/de/blogs/big-data/submitting-user-applications-with-spark-submit/

How to setup Kubernetes Master HA on AWS

What I am trying to do:
I have set up a Kubernetes cluster using the documentation available on the Kubernetes website (http://kubernetes.io/v1.1/docs/getting-started-guides/aws.html). Using kube-up.sh, I was able to bring the cluster up with 1 master and 3 minions (highlighted in the blue rectangle in the diagram below). From the documentation, as far as I know, we can add minions as and when required, so from my point of view the k8s master instance is a single point of failure when it comes to high availability.
[diagram: Kubernetes Master HA on AWS]
So I am trying to set up an HA k8s master layer with the three master nodes as shown in the diagram above. To accomplish this I am following the Kubernetes high-availability cluster guide, http://kubernetes.io/v1.1/docs/admin/high-availability.html#establishing-a-redundant-reliable-data-storage-layer
What I have done:
Set up a k8s cluster using kube-up.sh with provider aws (master1 and minion1, minion2, and minion3)
Set up two fresh master instances (master2 and master3)
I then started configuring an etcd cluster on master1, master2 and master3 by following the link below:
http://kubernetes.io/v1.1/docs/admin/high-availability.html#establishing-a-redundant-reliable-data-storage-layer
In short, I copied etcd.yaml from the Kubernetes website (http://kubernetes.io/v1.1/docs/admin/high-availability/etcd.yaml) and updated NODE_IP, NODE_NAME and the discovery token on all three nodes as shown below.
NODE_NAME   NODE_IP        DISCOVERY_TOKEN
Master1     172.20.3.150   https://discovery.etcd.io/5d84f4e97f6e47b07bf81be243805bed
Master2     172.20.3.200   https://discovery.etcd.io/5d84f4e97f6e47b07bf81be243805bed
Master3     172.20.3.250   https://discovery.etcd.io/5d84f4e97f6e47b07bf81be243805bed
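For reference, those three values typically end up as etcd flags along these lines (a sketch for Master1 only; the real manifest follows the etcd.yaml layout from the guide):
etcd --name Master1 \
    --initial-advertise-peer-urls http://172.20.3.150:2380 \
    --listen-peer-urls http://172.20.3.150:2380 \
    --advertise-client-urls http://172.20.3.150:4001 \
    --listen-client-urls http://0.0.0.0:4001 \
    --discovery https://discovery.etcd.io/5d84f4e97f6e47b07bf81be243805bed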
And on running etcdctl member list on all three nodes, I am getting:
$ docker exec <container-id> etcdctl member list
ce2a822cea30bfca: name=default peerURLs=http://localhost:2380,http://localhost:7001 clientURLs=http://127.0.0.1:4001
As per the documentation we need to keep etcd.yaml in /etc/kubernetes/manifests; this directory already contains etcd.manifest and etcd-event.manifest files. For testing I modified the etcd.manifest file with the etcd parameters.
After making the above changes I forcefully terminated the docker container; the container kept exiting after a few seconds, and I got the error below when running kubectl get nodes:
error: couldn't read version from server: Get http://localhost:8080/api: dial tcp 127.0.0.1:8080: connection refused
So please suggest how I can set up a highly available k8s master on AWS.
To configure an HA master, you should follow the High Availability Kubernetes Cluster document, in particular making sure you have replicated storage across failure domains and a load balancer in front of your replicated apiservers.
Setting up HA controllers for kubernetes is not trivial and I can't provide all the details here but I'll outline what was successful for me.
Use kube-aws to set up a single-controller cluster: https://coreos.com/kubernetes/docs/latest/kubernetes-on-aws.html. This will create CloudFormation stack templates and cloud-config templates that you can use as a starting point.
Go to the AWS CloudFormation Management Console, click the "Template" tab and copy out the complete stack configuration. Alternatively, use $ kube-aws up --export to generate the cloudformation stack file.
Use the userdata cloud-config templates generated by kube-aws and replace the variables with actual values. This guide will help you determine what those values should be: https://coreos.com/kubernetes/docs/latest/getting-started.html. In my case I ended up with four cloud-configs:
cloud-config-controller-0
cloud-config-controller-1
cloud-config-controller-2
cloud-config-worker
Validate your new cloud-configs here: https://coreos.com/validate/
Insert your cloud-configs into the CloudFormation stack config. First compress and encode your cloud config:
$ gzip -k cloud-config-controller-0
$ cat cloud-config-controller-0.gz | base64 > cloud-config-controller-0.enc
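A quick sanity check (not from the original guide) is to decode the blob back and make sure it still looks like your cloud-config:
$ base64 -d cloud-config-controller-0.enc | gunzip | head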
Now copy the content of your encoded cloud-config into the CloudFormation config. Look for the UserData key for the appropriate InstanceController. (I added additional InstanceController objects for the additional controllers.)
Update the stack at the AWS CloudFormation Management Console using your newly created CloudFormation config.
You will also need to generate TLS assets: https://coreos.com/kubernetes/docs/latest/openssl.html. These assets will have to be compressed and encoded (same gzip and base64 as above), then inserted into your userdata cloud-configs.
When debugging on the server, journalctl is your friend:
$ journalctl -u oem-cloudinit # to debug problems with your cloud-config
$ journalctl -u etcd2
$ journalctl -u kubelet
Hope that helps.
There is also the kops project.
From the project README:
Operate HA Kubernetes the Kubernetes Way
also:
We like to think of it as kubectl for clusters
Download the latest release, e.g.:
cd ~/opt
wget https://github.com/kubernetes/kops/releases/download/v1.4.1/kops-linux-amd64
mv kops-linux-amd64 kops
chmod +x kops
ln -s ~/opt/kops ~/bin/kops
See kops usage, especially:
kops create cluster
kops update cluster
Assuming you already have an s3://my-kops bucket and a kops.example.com hosted zone.
Create configuration:
kops create cluster --state=s3://my-kops --cloud=aws \
--name=kops.example.com \
--dns-zone=kops.example.com \
--ssh-public-key=~/.ssh/my_rsa.pub \
--master-size=t2.medium \
--master-zones=eu-west-1a,eu-west-1b,eu-west-1c \
--network-cidr=10.0.0.0/22 \
--node-count=3 \
--node-size=t2.micro \
--zones=eu-west-1a,eu-west-1b,eu-west-1c
Edit configuration:
kops edit cluster --state=s3://my-kops
Export terraform scripts:
kops update cluster --state=s3://my-kops --name=kops.example.com --target=terraform
Apply changes directly:
kops update cluster --state=s3://my-kops --name=kops.example.com --yes
List cluster:
kops get cluster --state s3://my-kops
Delete cluster:
kops delete cluster --state s3://my-kops --name=kops.example.com --yes

"Unknown Options" Error for AWS CLI w/ EC2

I am attempting to test the creation of a spot instance from the CLI (on Windows). I am following the CLI documentation precisely as far as I can tell but I keep receiving an "unknown options" error as follows:
C:\Python27\Scripts>aws ec2 request-spot-instances --spot-price 0.04 --type persistent
--availability-zone us-east-1a
--block-device-mapping "/dev/sdb=snap-ec0f8df5::false" ami-e55a7e8c
Unknown options: --block-device-mapping, ami-e55a7e8c, /dev/sdb=snap-ec0f8df5::false
I don't see what I'm doing wrong... Thanks for your help!
I ultimately found that, at the time, there were separate CLI tools for AWS in general and for EC2, and the options I was passing came from the EC2-specific tools rather than the unified AWS CLI.
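For anyone hitting the same thing with the unified AWS CLI that later replaced those tools: it expects the instance details inside a --launch-specification JSON rather than as separate options. A rough sketch reusing the values from the question (untested, so treat it as a starting point); keeping the specification in a file also sidesteps Windows quoting issues:
aws ec2 request-spot-instances --spot-price "0.04" --type persistent --launch-specification file://spec.json
where spec.json contains:
{
  "ImageId": "ami-e55a7e8c",
  "Placement": {"AvailabilityZone": "us-east-1a"},
  "BlockDeviceMappings": [
    {"DeviceName": "/dev/sdb",
     "Ebs": {"SnapshotId": "snap-ec0f8df5", "DeleteOnTermination": false}}
  ]
}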