connecting pyarrow with libhdfs3 - hdfs

I'm trying to connect to a Hadoop cluster via pyarrow's HdfsClient / hdfs.connect().
I noticed pyarrow's have_libhdfs3() function, which returns False.
How does one go about getting the required hdfs support for pyarrow? I understand there's a conda command for libhdfs3, but I pretty much need to make it work through some "vanilla" way that doesn't involve things like conda.
If it's of importance, the files I'm interested in reading are parquet files.
EDIT:
The creators of the hdfs3 library have made a repo that allows installing libhdfs3:
http://hdfs3.readthedocs.io/en/latest/install.html

I don't know of a way to get libhdfs3 except through conda-forge or building from source. With conda you will need to run conda install libhdfs3=2.2.31, because a breaking API change gave newer libhdfs3 a different ABI from libhdfs, and we have not addressed that in Arrow yet. See https://issues.apache.org/jira/browse/ARROW-1445 (patches welcome).

On Ubuntu this worked for me:
echo "deb https://dl.bintray.com/wangzw/deb trusty contrib" | sudo tee /etc/apt/sources.list.d/bintray-wangzw-deb.list
sudo apt-get install -y apt-transport-https
sudo apt-get update
sudo apt-get install libhdfs3 libhdfs3-dev
It should work on other Linux distros as well using the appropriate installer.
Taken from:
http://hdfs3.readthedocs.io/en/latest/install.html
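To confirm that the package put the shared library where the dynamic linker can see it, you can look it up in the linker cache. This is a minimal sketch, not something from the hdfs3 docs: lib_listed is a hypothetical helper name, and it assumes a glibc-based Linux where ldconfig -p is available. It only tells you the library is installed in a standard location; it says nothing about ABI compatibility with pyarrow.

```shell
# Hypothetical helper: reads `ldconfig -p` style output on stdin and
# succeeds if any line mentions "<name>.so", where <name> is "$1".
lib_listed() {
    grep -q -- "$1\.so"
}

# Usage (assumes a glibc-based Linux with ldconfig on the PATH):
#   ldconfig -p | lib_listed libhdfs3 && echo "libhdfs3 is visible to the linker"
```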

no such option --system for pip install

I'm attempting to deploy a very basic trading system to AWS using serverless (following along with this link), but I have a bit of a problem.
Prior to running the deployment command, I'm supposed to run
pip3 install -r requirements.txt -t . --system
but I am getting an error message saying 'no such option: --system'
Initially, I just tried to install the packages without the --system option, but I think that's causing the cron Lambda function to fail when I execute it manually through the serverless console, because it's not finding the requisite modules.
I'm assuming that's because they aren't being installed properly, so my question is: how should I install them so this doesn't happen?
Running
pip3 install -r requirements.txt
alone (while in the trading system directory) does not suffice.
So, what should I do?
The original author was working on an older Debian-derived system; you aren't. You can safely omit this option if it's not supported.
I don't have an authoritative link available, although this came up in a Google search. But here's my summary:
With older Debian-derived systems (e.g., Ubuntu 18.04), the --user flag was enabled by default and it overrode the -t flag, so all packages would be installed in $HOME/.local. The --system flag was nominally intended to allow installation in the system package directory, but in practice it was needed to make -t work.
This is fixed for Debian-derived systems that default to Python 3 (e.g., Ubuntu 20.04).
It was never an issue for non-Debian systems (e.g., EC2 Linux).
Since you don't seem to be familiar with pip: the -r argument tells it to read the dependencies from a file, and the -t argument tells it to install those dependencies into the given directory, here "." (the current directory). Installing into the working directory is not a great habit, but I don't want to describe virtual environments.
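If the same deployment script has to run on both kinds of system, one option is to add --system only when the local pip advertises it. A minimal sketch: system_flag is a hypothetical helper, and it assumes the Debian-patched pip lists --system in the output of pip3 install --help, which I believe is the case but have not checked on every release.

```shell
# Hypothetical helper: reads pip's help text on stdin and prints
# "--system" only if that option is mentioned there.
system_flag() {
    if grep -q -- '--system'; then
        echo "--system"
    fi
}

# Usage: the command substitution expands to nothing where the flag is unsupported.
#   pip3 install -r requirements.txt -t . $(pip3 install --help | system_flag)
```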

Why is desired version of libboost-all-dev not found when building Docker container?

I'm trying to build a basic Docker container based on a tutorial. I am on Windows 10 Home version 2004, and I am using the standard command line. I've created the following Dockerfile for this, with the only change from the tutorial's version being my older version of gcc:
FROM gcc:6.3.0
RUN apt-get -qq update
RUN apt-get -qq upgrade
RUN apt-get -qq install cmake
RUN apt-get install libboost-all-dev=1.62.0.1
RUN apt-get -qq install build-essential libtcmalloc-minimal4 && \
ln -s /usr/lib/libtcmalloc_minimal.so.4 /usr/lib/libtcmalloc_minimal.so
Once the script gets to the step where it tries to install libboost-all-dev I get the following output:
Reading package lists...
Building dependency tree...
Reading state information...
E: Version '1.62.0.1' for 'libboost-all-dev' was not found
The command '/bin/sh -c apt-get install libboost-all-dev=1.62.0.1' returned a non-zero code: 100
and the build stops.
I've tried updating the build script to use the current version of Boost (1.74.0) as well and get the same issue. I'm not really finding any solutions in my research online and the output is not very helpful in trying to figure out what the issue is. Could anyone with more experience with installing Boost as part of the Docker process point me in the right direction?
The package manager will only be able to install versions of Boost that it knows exist, based on the enabled package manager repositories. There is typically only one version of Boost in the default repositories. In my experience, this applies to any Linux OS that supplies Boost, not only those that are run within a Docker container.
The Docker image you started with, gcc:6.3.0, appears to have only Boost version 1.55.0.2, so requesting any other version will yield the same error.
If you want a different version of Boost in your image, you can follow the typical steps for installing a different version of Boost outside a Docker container. These steps are well-documented on Stack Overflow, or you might find a repository such as this to enable in your package manager to directly install it from apt-get.
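To see which versions the enabled repositories actually offer before pinning one, you can ask apt itself. A minimal sketch, assuming a Debian-based image: apt-cache madison prints one line per candidate in the form "package | version | source", and the hypothetical helper madison_versions just pulls out the middle column.

```shell
# Hypothetical helper: reads `apt-cache madison <pkg>` output on stdin and
# prints the version column, one version per line, without duplicates.
madison_versions() {
    awk -F'|' 'NF >= 2 { gsub(/[ \t]/, "", $2); print $2 }' | sort -u
}

# Usage inside the container, after `apt-get update`:
#   apt-cache madison libboost-all-dev | madison_versions
```

Whatever this prints is what you can put after the "=" in apt-get install libboost-all-dev=<version>.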

RHEL: This system is currently not set up to build kernel modules

I am trying to install VirtualBox 5.2 on a RHEL 7 VM. When I try to rebuild the kernel modules, I get the following error:
[root#myserver~]# /usr/lib/virtualbox/vboxdrv.sh setup
vboxdrv.sh: Stopping VirtualBox services.
vboxdrv.sh: Building VirtualBox kernel modules.
This system is currently not set up to build kernel modules.
Please install the Linux kernel "header" files matching the current kernel
for adding new hardware support to the system.
The distribution packages containing the headers are probably:
kernel-devel kernel-devel-3.10.0-693.11.1.el7.x86_64
I tried installing kernel-devel and got a success message:
Installed:
kernel-devel.x86_64 0:3.10.0-693.21.1.el7
Complete!
But still the setup fails.
Any idea what is missing here?
sudo yum install -y "kernel-devel-$(uname -r)"
Substitute dnf for yum on Fedora. I didn't need to reboot, but YMMV.
Edit for 2020:
CentOS/RHEL 8 now also use dnf instead of yum. I haven't had occasion to test this on those distros, so the same YMMV disclaimer still applies.
First run uname -r in a terminal; it prints the release of the currently running kernel (CURRENT_KERNEL).
Now you can install the matching package with the command: yum install kernel-devel-CURRENT_KERNEL
Note: replace CURRENT_KERNEL with the string you got from uname -r.
I got the same message when I tried to upgrade the VirtualBox 5.2.12 Guest Additions on my Kali Linux (GNU/Linux Rolling). I fixed it with the following steps:
Run apt update/upgrade to bring your system up to date. Do not forget to reboot the system.
Run "apt-get install linux-headers-$(uname -r)".
Run VBoxLinuxAdditions.run from a terminal; the error message is gone and the Guest Additions install successfully.
Reboot the system; the Guest Additions work fine.
I got here looking for the same answer for CentOS 6, and the above answers worked with slight modification (so, for anyone else that lands here too)...
yum install -y kernel-devel kernel-devel-$(uname -r)
So, use "yum" instead of "apt-get".
Also, some distributions name the package "linux-headers" instead of "kernel-devel", but the principle seems to be the same.
The kernel you were running, 3.10.0-693.11.1.el7.x86_64, is slightly different from the kernel-devel package you installed, kernel-devel.x86_64 0:3.10.0-693.21.1.el7. In my case several kernel versions were installed on my OS, and "sudo yum install kernel-devel" always installed the newest one. I worked around it by setting my default kernel to the same version that yum had installed. You can list the kernels installed on your OS with the following command:
sudo awk -F\' '$1=="menuentry " {print i++ " : " $2}' /etc/grub2.cfg
Then set the default kernel to the version yum chose, using the following command (the number at the end is the index taken from the preceding command's output):
sudo grub2-set-default 0
Then generate the grub2 config with the grub2-mkconfig command and reboot the server:
sudo grub2-mkconfig -o /boot/grub2/grub.cfg
sudo reboot
Milan Rakos is right. The version suffix of your installed kernel-devel must be exactly the same as the uname -r output. Besides, the log printed by vboxdrv.sh setup also shows the kernel-devel version it wants.
So in your case, run the command: sudo yum install kernel-devel-3.10.0-693.11.1.el7.x86_64
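To check for this kind of mismatch before rerunning vboxdrv.sh, you can compare the running kernel with the kernel-devel packages that rpm reports. A minimal sketch: has_matching_devel is a hypothetical helper, and it assumes rpm -q kernel-devel prints one line per installed package in the form kernel-devel-<release>, where <release> is what uname -r prints.

```shell
# Hypothetical helper: reads `rpm -q kernel-devel` output on stdin and
# succeeds if one line is exactly kernel-devel-<release>, where <release>
# is "$1" or, by default, the running kernel from `uname -r`.
has_matching_devel() {
    grep -qxF -- "kernel-devel-${1:-$(uname -r)}"
}

# Usage on an RPM-based system:
#   rpm -q kernel-devel | has_matching_devel || echo "no kernel-devel for $(uname -r)"
```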
To solve this problem I ran yum update -y. I think this is the fastest way to solve it. Another solution is to configure the repos with the installation DVD, so you can install the kernel-headers for your current version of CentOS.
My History:
yum install epel-release
yum install perl gcc dkms kernel-devel kernel-headers make bzip2
yum groupinstall "Development tools"
yum update -y
reboot
After that, I mounted the VBoxGuestAdditions and ran the process.
yum install kernel-devel-3.10.0-693.11.1.el7.x86_64 fixed the issue.
A little late to the party but I just ran into this problem myself and here's what I did to resolve the issue.
yum update -y
yum install -y redhat-lsb-core net-tools kernel-headers kernel-devel epel-release
yum groupinstall -y "Development Tools"
reboot
Ensure that yum update -y has fully updated your system before continuing!
Cheers

Can I install using Macports both py27 and py34 ports in the same location?

I've been using Python3.4 to complete certain tasks, though I still use Python2.7 as default.
I think I should be able to begin downloading py34 ports using sudo port install py34-whatever in the same location as my Python2.7 ports.
However, I am running into significant downloading errors doing this.
Is it possible to download both py27 and py34 ports into the same location? Will there be problems doing this?
Your problems appear to be a generic Macports download problem. Resetting the download process via sudo port clean <portname> should help.
As to the general question of using multiple versions:
Macports allows you to install an arbitrary number of different versions in parallel. You switch between them using port select --set <application> <portname>, for example sudo port select --set python python34.
For easier access, you can define your own shell alias (e.g. python3 or python34), pointing to /opt/local/bin/python34.
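Such an alias could look like the following line in your shell startup file. A sketch only: the alias name python34 is just an example, and the path assumes the default MacPorts prefix /opt/local mentioned above.

```shell
# Example alias for ~/.bash_profile or ~/.zshrc; the name is arbitrary.
# Assumes MacPorts installed python34 under the default /opt/local prefix.
alias python34='/opt/local/bin/python34'
```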
My personal experience is that Anaconda makes these types of tasks painless while providing the same functionality. http://docs.continuum.io/anaconda/install
Suppose you want an isolated environment for py27:
http://conda.pydata.org/docs/using/envs.html#create-an-environment
conda create --name py27 python==2.7.10
To use the environment:
source activate py27
To install a package, use conda install or pip install.
If you want a Python 3.4 environment just change the above command a bit. I have no affiliation with Anaconda, and I would guess other Python distros work just as well. This just made things easier for me, hope it does for others as well!

How to install Docker from the Source code?

I am trying to install Docker from the source code downloaded from github.com/docker/docker
I am unable to install it from the source code.
The Makefile creates an image, but I want to install it on my system.
Can anyone suggest a solution?
I am using Ubuntu 14.04.
Well, I don't know if this works for your Linux distro (it looks like Ubuntu). I run Kali Linux, and even though some commands differ, the process is much the same on every distro.
First, before we go on, we need to update our package repositories:
sudo apt update
and,
sudo apt-get update
then,
sudo apt install git
[This installs git]
Now we can start cloning git repos onto our system.
Go to your desired working directory and type:
sudo git clone "link of the git repo, without the quotes"
However, I would suggest you just run:
sudo apt install docker.io
[To install docker by apt]
It's better to install it via the docker package and update it to the latest version. This is the best way to install Docker.