How to configure Cassandra in GCP to remotely connect? - google-cloud-platform

I am following the steps below to install and configure Cassandra on GCP.
It works perfectly as long as I use Cassandra from within GCP.
$java -version
$echo "deb http://downloads.apache.org/cassandra/debian 40x main" | sudo tee -a /etc/apt/sources.list.d/cassandra.sources.list
$curl https://downloads.apache.org/cassandra/KEYS | sudo apt-key add -
$sudo apt install apt-transport-https
$sudo apt-get update
$sudo apt-get install cassandra
$sudo systemctl status cassandra
//Active: active (running)
$nodetool status
//Datacenter: datacenter1
$tail -f /var/log/cassandra/system.log
$find /usr/lib/ -name cqlshlib
##/usr/lib/python3/dist-packages/cqlshlib
$export PYTHONPATH=/usr/lib/python3/dist-packages
$sudo nano ~/.bashrc
//Add
export PYTHONPATH=/usr/lib/python3/dist-packages
//save
$source ~/.bashrc
$python --version
$cqlsh
//it opens cqlsh shell
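A quick way to confirm the PYTHONPATH step took effect (a minimal sketch; the path comes from the find output above):

```shell
# Sanity check: the exported PYTHONPATH entry should appear on sys.path,
# which is how cqlsh locates cqlshlib under /usr/lib/python3/dist-packages.
export PYTHONPATH=/usr/lib/python3/dist-packages
python3 -c "import sys; print('/usr/lib/python3/dist-packages' in sys.path)"
# prints: True
```

If this prints False, cqlsh won't be able to find cqlshlib.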
But I want to configure Cassandra so I can connect to it remotely.
I tried the following 7 different solutions, but I am still getting the error.
1. In GCP,
VPC network -> firewall -> create
IP 0.0.0.0/0
port tcp=9000,9042,8088,9870,8123,8020, udp=9000
tag = hadoop
Add this tag in VMs
2. rm -Rf ~/.cassandra
3. sudo nano ~/.cassandra/cqlshrc
[connection]
hostname = 34.72.70.173
port = 9042
4. cqlsh 34.72.70.173 -u cassandra -p cassandra
5. firewall - open ports
https://stackoverflow.com/questions/2359159/cassandra-port-usage-how-are-the-ports-used
9000,9042,8088,9870,8123,8020,7199,7000,7001,9160
6. Get rid of this line: JVM_OPTS="$JVM_OPTS -Djava.rmi.server.hostname=localhost"
Try restart the service: sudo service cassandra restart
If you have a cluster, make sure that ports 7000 and 9042 are open within your security group.
7. You can set the environment variable CQLSH_HOST=1.2.3.4. Then simply type cqlsh.
https://stackoverflow.com/questions/20575640/datastax-devcenter-fails-to-connect-to-the-remote-cassandra-database/20598599#20598599
sudo nano /etc/cassandra/cassandra.yaml
listen_address: localhost
rpc_address: 34.72.70.173
broadcast_rpc_address: 34.72.70.173
sudo service cassandra restart
sudo nano ~/.bashrc
export CQLSH_HOST=34.72.70.173
source ~/.bashrc
sudo systemctl restart cassandra
sudo service cassandra restart
sudo systemctl status cassandra
nodetool status
Please suggest how to get rid of the following error:
Connection error: ('Unable to connect to any servers', {'127.0.0.1:9042': ConnectionRefusedE
rror(111, "Tried connecting to [('127.0.0.1', 9042)]. Last error: Connection refused")})

This indicates that when you ran cqlsh, you didn't specify the public IP:
Connection error: ('Unable to connect to any servers', \
{'127.0.0.1:9042': ConnectionRefusedError(111, "Tried connecting to [('127.0.0.1', 9042)]. \
Last error: Connection refused")})
When running Cassandra nodes on public clouds, you need to configure cassandra.yaml with the following:
listen_address: private_IP
rpc_address: public_IP
The listen address is what Cassandra nodes use to communicate with each other privately, e.g. for the gossip protocol.
The RPC address is what clients/apps/drivers use to connect to nodes on the CQL port (9042) so it needs to be set to the nodes' public IP address.
To connect to a node with cqlsh (a client), you need to specify the node's public IP:
$ cqlsh <public_IP>
Cheers!
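Putting that answer together, a cassandra.yaml sketch for a cloud VM could look like the following. The private IP 10.128.0.5 is a made-up example; 34.72.70.173 is the public IP from the question. One caveat worth hedging: on GCP the public IP is NAT-ed and not bound to the VM's network interface, so binding rpc_address directly to it can fail to start; the usual workaround is rpc_address: 0.0.0.0 plus broadcast_rpc_address (which is required whenever rpc_address is 0.0.0.0):

```yaml
# /etc/cassandra/cassandra.yaml (relevant lines only; IPs are examples)
listen_address: 10.128.0.5            # private IP: inter-node gossip
rpc_address: 0.0.0.0                  # bind the CQL port on all interfaces
broadcast_rpc_address: 34.72.70.173   # public IP advertised to clients/drivers
```

After sudo service cassandra restart, cqlsh 34.72.70.173 should connect, provided TCP 9042 is open in the VPC firewall.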

Related

Host key verification failed in google compute engine based mpich cluster

TLDR:
I have 2 Google Compute Engine instances and I've installed mpich on both.
When I try to run a sample I get Host key verification failed.
Detailed version:
I've followed this tutorial in order to get this task done: http://mpitutorial.com/tutorials/running-an-mpi-cluster-within-a-lan/.
I have 2 Google Compute Engine VMs with Ubuntu 14.04 (the Google Cloud account is a trial one, btw). I've downloaded this version of mpich on both instances: http://www.mpich.org/static/downloads/3.3rc1/mpich-3.3rc1.tar.gz and I installed it using these steps:
./configure --disable-fortran
sudo make
sudo make install
This is the way the /etc/hosts file looks on the master-node:
127.0.0.1 localhost
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
169.254.169.254 metadata.google.internal metadata
10.128.0.3 client
10.128.0.2 master
10.128.0.2 linux1.us-central1-c.c.ultimate-triode-161918.internal linux1 # Added by Google
169.254.169.254 metadata.google.internal # Added by Google
And this is the way the /etc/hosts file looks on the client-node:
127.0.0.1 localhost
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
169.254.169.254 metadata.google.internal metadata
10.128.0.2 master
10.128.0.3 client
10.128.0.3 linux2.us-central1-c.c.ultimate-triode-161918.internal linux2 # Added by Google
169.254.169.254 metadata.google.internal # Added by Google
The rest of the steps involved adding a user named mpiuser on both nodes, configuring passwordless SSH authentication between the nodes, and setting up a shared cloud directory between them.
The configuration worked till this point. I've downloaded this file https://raw.githubusercontent.com/pmodels/mpich/master/examples/cpi.c to /home/mpiuser/cloud/mpi_sample.c, compiled it this way:
mpicc -o mpi_sample mpi_sample.c
and issued this command on the master node while logged in as the mpiuser:
mpirun -np 2 -hosts client,master ./mpi_sample
and I got this error:
Host key verification failed.
What's wrong? I've tried to troubleshoot this problem over more than 2 days but I can't get a valid solution.
It turned out that my passwordless SSH wasn't configured properly. I created 2 new instances and did the following to get working passwordless SSH, and thus a working version of that sample. The following steps were executed on an Ubuntu Server 18.04.
First, by default, instances on Google Cloud have the PasswordAuthentication setting turned off. On the client server do:
sudo vim /etc/ssh/sshd_config
and change PasswordAuthentication no to PasswordAuthentication yes. Then
sudo systemctl restart ssh
Generate an SSH key on the master server with:
ssh-keygen -t rsa -b 4096 -C "user.mail@server.com"
Copy the generated SSH key from the master server to the client:
ssh-copy-id client
Now you have fully functional passwordless SSH from master to client. However, mpich still failed.
The additional step I did was to copy the public key to the ~/.ssh/authorized_keys file on both master and client. So execute this command on both servers:
sudo cat .ssh/id_rsa.pub >> .ssh/authorized_keys
Then make sure the /etc/ssh/sshd_config files from both the client and server have the following configurations:
PasswordAuthentication no
ChallengeResponseAuthentication no
UsePAM no
Restart the ssh service from both client and master
sudo systemctl restart ssh
And that's it, mpich works smoothly now.
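One hedged aside: "Host key verification failed" can also mean the peer's host key simply isn't in ~/.ssh/known_hosts yet, which bites non-interactive launchers like mpirun. With OpenSSH 7.6+ (as shipped on Ubuntu 18.04), an entry like this sketch in ~/.ssh/config on the master accepts a previously unseen key automatically (the Host alias is assumed to match the /etc/hosts entry):

```
# ~/.ssh/config (sketch)
Host client
    StrictHostKeyChecking accept-new
```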

Trouble Connecting to PostgreSQL Running in a Ubuntu VM

I have created an instance of PostgreSQL running in a Ubuntu/Bionic box in Vagrant/VirtualBox that will be used by Django in my dev environment. I wanted to test my ability to connect to it with either the terminal or pgAdmin before connecting with DJango, just to be sure it was working on that end first; the idea being that I could make later Django debugging easier if I am assured the connection works; but, I've had no success.
I have tried editing the configuration files that many posts suggest, with no effect. I can, however, ping the box via the IP assigned in the Vagrantfile with no issue, but not when specifying port 5432 with ping 10.1.1.1:5432. I can also use psql from within the box, so it's running.
I have made sure to enable ufw on the VM, created a rule to allow port 5432, and ensured that it took effect using sudo ufw status. I have also confirmed that I'm editing the correct files using the show command within psql.
Here are the relevant configs as they currently are:
Vagrantfile:
Vagrant.configure("2") do |config|
  config.vm.hostname = "hg-site-db"
  config.vm.provider "virtualbox" do |v|
    v.memory = 2048
    v.cpus = 1
  end
  config.vm.box = "ubuntu/bionic64"
  config.vm.network "forwarded_port", host_ip: "127.0.0.1", guest: 5432, host: 5432
  config.vm.network "public_network", ip: "10.1.1.1"
  config.vm.provision "shell", inline: <<-SHELL
    # Update and upgrade the server packages.
    sudo apt-get update
    sudo apt-get -y upgrade
    # Install PostgreSQL
    sudo apt-get install -y postgresql postgresql-contrib
    # Set Ubuntu Language
    sudo locale-gen en_US.UTF-8
  SHELL
end
/etc/postgresql/10/main/postgresql.conf:
listen_addresses = '*'
/etc/postgresql/10/main/pg_hba.conf - I am aware this is insecure, but I was just trying to find out why it was not working, with plans to go back and correct this:
host all all 0.0.0.0/0 trust
As we discussed in comments, you should remove host_ip from your forwarded port definition and just leave the guest and host ports.
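For reference, the forwarded-port line after that change would look like this sketch (guest and host ports kept from the question's Vagrantfile):

```ruby
config.vm.network "forwarded_port", guest: 5432, host: 5432
```

Without host_ip, Vagrant binds the forward on all host interfaces instead of only 127.0.0.1.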

CoreOS fleetctl list-machines not showing 3 machines

I am following the DigitalOcean tutorial on CoreOS (https://www.digitalocean.com/community/tutorials/how-to-create-flexible-services-for-a-coreos-cluster-with-fleet-unit-files). When I run the fleetctl list-machines command on node 1 and node 2, I am not able to see all 3 machines listed, just the one for its own node. The following is what I see:
core@coreos-1 ~ $ fleetctl list-machines
MACHINE IP METADATA
XXXX... 10.abc.de.fgh -
I logged onto my 3rd node and noticed that when I do a fleetctl list-machines I get the following error:
core@coreos-3 ~ $ fleetctl list-machines
Error retrieving list of active machines: googleapi: Error 503: fleet server unable to communicate with etc
What should I do to find out what is the problem and how to resolve this? I have tried rebooting and other things mentioned but nothing is helping.
What happened was that I had an etcd dependency in my unit file, such as the following:
# Dependency ordering
After=etcd.service
I think I needed etcd2 instead.
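In other words, the ordering block in the unit file would become (a sketch; etcd2.service is the unit name on that generation of CoreOS):

```
# Dependency ordering
After=etcd2.service
```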
So I did the following as directed:
sudo systemctl stop fleet.service fleet.socket etcd
sudo systemctl start etcd2
sudo systemctl reset-failed
I had to clean up on the instance that had the files when I queried for them:
core@coreos1 ~ $ etcdctl ls /_coreos.com/fleet/job
/_coreos.com/fleet/job/apache.1.service
/_coreos.com/fleet/job/apache@.service
/_coreos.com/fleet/job/apache@80.service
/_coreos.com/fleet/job/apache@9999.service
/_coreos.com/fleet/job/apache-discovery.1.service
/_coreos.com/fleet/job/apache-discovery@.service
/_coreos.com/fleet/job/apache-discovery@80.service
/_coreos.com/fleet/job/apache-discovery@9999.service
by issuing
etcdctl ls /_coreos.com/fleet/job/apache.1.service
etcdctl rm --recursive /_coreos.com/fleet/job/apache-discovery.1.service
Then I started fleet
sudo systemctl start fleet
And when I did a fleetctl list-machines again it showed all my instances connected.

docker container port mapping issue

I think I am missing something obvious but I can't seem to crack this one. I am trying to map a port from a django application running uwsgi in a docker container to my local Macintosh host. Here is the setup.
Mac 10.11 running docker-machine 0.5.1 with virtualbox 5.0.10 and docker 1.9.1
I created a server with docker-machine, set up my Dockerfile, and successfully built my docker container. In the container I have the following command:
# Port to expose
EXPOSE 8000
This maps to the port used by uwsgi inside the container. When I run the container via
eval "$(docker-machine env dev)"
docker-machine ip dev
192.168.99.100
docker run -P launch
The container starts properly. If I enter the container and perform a
curl http://localhost:8000
I get my HTML as I would expect. On the outside a docker inspect container_id gets me a
"Ports": {
    "8000/tcp": [
        {
            "HostIp": "0.0.0.0",
            "HostPort": "32768"
        }
    ]
},
So I can see the mapping to 32768 on the docker-machine host of 192.168.99.100 from the above commands. However, whenever I try to curl http://192.168.99.100:32768:
curl http://192.168.99.100:32768
curl: (7) Failed to connect to 192.168.99.100 port 32768: Connection refused
So any thoughts on this? Everything should work as far as I understand Docker.
Thanks
Craig
Since you are running through a VirtualBox VM, I would still recommend mapping the port on the VirtualBox level, as I mention in "How to connect mysql workbench to running mysql inside docker?"
VBoxManage controlvm "boot2docker-vm" --natpf1 "tcp-port8000,tcp,,8000,,8000"
VBoxManage controlvm "boot2docker-vm" --natpf1 "udp-port8000,udp,,8000,,8000"
And run the container with an explicit port mapping (instead of the random -P)
docker run -p 8000:8000 launch

ubuntu rabbitmq - Error: unable to connect to node 'rabbit@somename': nodedown

I am using Celery for Django, which needs RabbitMQ. Some 4 or 5 months back it used to work well. I tried using it again for a new project and got the below error from RabbitMQ while listing queues.
Listing queues ...
Error: unable to connect to node 'rabbit@somename': nodedown
diagnostics:
- nodes and their ports on 'somename': [{rabbitmqctl23014,44910}]
- current node: 'rabbitmqctl23014@somename'
- current node home dir: /var/lib/rabbitmq
- current node cookie hash: XfMxei3DuB8GOZUm1vdUsg==
What's the solution? If there is no good solution, can I uninstall and reinstall RabbitMQ?
I had apparently installed RabbitMQ as a service, and the
sudo rabbitmqctl force_reset
command was not working.
sudo service rabbitmq-server restart
Did exactly what I need.
P.S. I made sure I was the root user to do the previous command
sudo su
If you need to change the hostname:
sudo aptitude remove rabbitmq-server
sudo rm -fr /var/lib/rabbitmq/
set new hostname:
hostname newhost
set the new value in /etc/hostname, and add to /etc/hosts:
127.0.0.1 newhost
install rabbitmq:
sudo aptitude install rabbitmq-server
done
Check if the server is running by using this command:
sudo service rabbitmq-server status
If it says
Status of all running nodes...
Node 'rabbit@ubuntu' with Pid 26995:
running done.
It's running.
In my case, I accidentally ran the rabbitmqctl command with a different user and got the error you mentioned.
You might have installed it with root, try running
sudo rabbitmqctl stop_app
and see what the response is.
(If everything's fine, run
sudo rabbitmqctl start_app
afterwards).
Double check that your cookie hash file is the same.
Double check that your machine name (uname) is the same as the one stated in your configuration; this one can be tricky.
And double check that you start RabbitMQ with the same user you installed it with. Just using 'sudo' won't do the trick.