Browsing HDFS not working - hdfs

I installed HDFS using Cloudera Manager 5. Then I tried to browse http://localhost:50070/ (the HDFS health page) and it was not working.
Please help!

In your question you say you are browsing Hadoop. Does that mean you are listing the Hadoop file system? Port 50070 is the default NameNode web UI port; Cloudera Manager itself runs on port 7180, so try accessing Cloudera Manager on that port.
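As a quick sanity check (assuming default ports and that you run this on the host where the services live), you can see which UI actually answers:

curl -sI http://localhost:50070/   # NameNode web UI (the HDFS health page)
curl -sI http://localhost:7180/    # Cloudera Manager UI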

Related

Problems Integrating Hadoop 3.x on Flink cluster

I am facing some issues while trying to integrate Hadoop 3.x on a Flink cluster. My goal is to use HDFS as persistent storage and to store checkpoints. I am currently using Flink 1.13.1 and HDFS 3.3.1. The error I get when trying to submit a job is that HDFS is not supported as a file system. In the standalone setup, this error was solved by specifying HADOOP_CLASSPATH on my local machine. As a next step, I applied the same solution on all the machines used in my cluster, and in standalone mode I managed to submit my jobs on all of them without any issues. However, when I started modifying the configuration to set up my cluster (by specifying the IPs of my machines), the problem came up once again. What am I missing?
For Hadoop 2.x, the official Flink download page offers pre-bundled jar files that would have solved similar issues in the past, but that is not the case for Hadoop 3.x versions.
It should be enough to set HADOOP_CLASSPATH on every machine in the cluster.
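For example, if Hadoop is installed and the hadoop command is on the PATH of each machine, the usual way to set it is something along these lines:

export HADOOP_CLASSPATH=$(hadoop classpath)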
For anyone still struggling with a similar issue, the answer proposed by David worked for me in the end. The detail that I was missing was in the definition of the environment variables.
In my initial attempts, I was using the .bashrc script to permanently define my environment variables. That works for a standalone cluster but not for a distributed cluster, because of the scope of the script. What actually worked for me was defining my variables (including HADOOP_CLASSPATH) in /etc/profile.
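As a rough sketch, the /etc/profile additions could look like this (the Hadoop install path is a placeholder for your own layout):

export HADOOP_HOME=/opt/hadoop-3.3.1
export PATH=$PATH:$HADOOP_HOME/bin
export HADOOP_CLASSPATH=$(hadoop classpath)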
I also managed to find another solution while I was struggling with HADOOP_CLASSPATH. As I mentioned in my initial post, for Hadoop 2.x there are pre-bundled jar files on the official Flink download page to support HDFS integration, which is not the case for Hadoop 3.x. I found the following Maven repository page and, after testing all of the existing jars, I found one that worked in my case. To be more precise, for Hadoop 3.3.1 the 3.1.1.7.2.8.0-224-9.0 jar worked (placed in $FLINK_HOME/lib). While it is not an "official" solution, it seems to solve the issue.
https://mvnrepository.com/artifact/org.apache.flink/flink-shaded-hadoop-3-uber?repo=cloudera-repos
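For reference, applying that jar is just a matter of copying it into the Flink lib directory on every node and restarting the Flink processes so they pick it up; assuming the usual artifactId-version file name and a placeholder download location:

cp ~/Downloads/flink-shaded-hadoop-3-uber-3.1.1.7.2.8.0-224-9.0.jar $FLINK_HOME/lib/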

Can't connect Airflow to Google Cloud

I am trying to set up a Google Cloud Platform connection in Apache Airflow using a JSON key file.
I get this error when I trigger the DAG:
ERROR - [Errno 2] No such file or directory: 'C:/AIRFLOW1/secret/key.json'
I tried '/AIRFLOW1/secret/key.json' and got the same issue.
I am using Windows 10 as my OS and Ubuntu 20.04 as a subsystem (WSL), where my Airflow webserver is running.
Thank you guys, I solved the problem myself.
Here is the solution:
instead of defining the key path as "C:/AIRFLOW1/secret/key.json",
the right way is to define it as "/c/AIRFLOW1/secret/key.json".
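One detail worth checking: by default WSL mounts the Windows C: drive under /mnt/c, and /c only exists if the mount root has been remapped (for example via /etc/wsl.conf), so it helps to confirm from inside the Ubuntu subsystem which path actually exposes the key file:

ls -l /mnt/c/AIRFLOW1/secret/key.json
ls -l /c/AIRFLOW1/secret/key.json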

How to setup and use Kafka-Connect-HDFS in HDP 2.4

I want to use kafka-connect-hdfs on Hortonworks 2.4. Can you please help me with the steps I need to follow to set it up in an HDP environment?
Other than building Kafka Connect HDFS from source, you can download and extract Confluent Platform's TAR.GZ files on your Hadoop nodes. That doesn't mean you are "installing Confluent".
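For example, extracting an already-downloaded Confluent Platform tarball on the node is enough (the version is kept as a placeholder here):

tar xzf confluent-x.y.z.tar.gz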
Then you can cd /path/to/confluent-x.y.z/
And run Kafka Connect from there.
./bin/connect-standalone ./etc/kafka/connect-standalone.properties ./etc/kafka-connect-hdfs/quickstart-hdfs.properties
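For context, the quickstart-hdfs.properties file referenced above is a small HDFS sink connector configuration along these lines (the topic name and HDFS URL below are placeholders for your environment):

name=hdfs-sink
connector.class=io.confluent.connect.hdfs.HdfsSinkConnector
tasks.max=1
topics=test_hdfs
hdfs.url=hdfs://your-namenode-host:8020
flush.size=3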
If that is working for you, then in order to run connect-distributed (the recommended way to run Kafka Connect), you need to download the same thing on the rest of the machines you want to run Kafka Connect on.
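The distributed launch is almost the same command, just pointed at a distributed worker config instead, something like:

./bin/connect-distributed ./etc/kafka/connect-distributed.properties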

Send file from Jenkins to web server

I have a web server running on Ubuntu (AWS EC2) and I would like to send a file to it. To do that I would like to use Jenkins, but I didn't find a plugin or a good configuration to do it.
The problem is that when I configure a plugin or anything else in Jenkins, it asks for a password, but my access to the server goes through an encrypted SSH key, so Jenkins cannot use it.
I tried with:
FTP repository hosts
Publish over FTP
Publish over SSH
Can someone help me, please?
Thank you in advance.
I found the solution. In fact it was a problem with access rights. I used this command: sudo chown -R ubuntu:ubuntu [Directory] on the directory where I have my files. Then when I launched the build, it succeeded.
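In case it helps someone else, the fix boils down to giving the SSH user ownership of the target directory so the Publish over SSH transfer can write there; roughly (the path below is only a placeholder for your own web root):

sudo chown -R ubuntu:ubuntu /var/www/html
ls -ld /var/www/html   # verify that the ubuntu user now owns it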
Hope this helps.
Thank you.

Zeppelin: how to download or save the Zeppelin notebook?

I am using the Zeppelin sandbox with AWS EMR.
Is there a way to download or save the Zeppelin notebook so that it can be imported into another Zeppelin server?
As noted in the comments above, this feature is available starting in version 0.5.6. You can find more details in the release notes. Downloading and installing this version would solve that issue.
Given that you are using EMR, it looks like you will have to work with the version available. As Samuel mentioned above, you can back up the contents of the incubator-zeppelin/notebook folder and make the transfer.
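As a rough outline of that manual transfer (the host name and paths are placeholders; each notebook lives in its own subdirectory of the notebook folder):

tar czf zeppelin-notebooks.tar.gz -C /path/to/incubator-zeppelin notebook
scp zeppelin-notebooks.tar.gz hadoop@target-zeppelin-host:/tmp/
# on the target host: extract into its notebook directory and restart Zeppelin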