AWS Linux Server install R package - amazon-web-services

I am trying to install the package "data.table" (and "aws.s3") via RStudio Server on an Amazon Linux instance, following this guide:
http://stanke.co/category/r/
Unfortunately, I get the following error message. I really don't know what else to do.
Can anybody help? I installed devtools, and I am able to install other packages such as xml2, devtools and dplyr.

I had the same issue on AWS and have already fixed it.
You first need to install gcc64 and the OpenMP shared support library.
sudo yum install gcc64
sudo yum install libgomp
Then, in your user home directory, create a .R folder containing a Makevars file with the following content (it tells R which compiler to use):
CC = /usr/bin/gcc64
CXX = /usr/bin/g++
SHLIB_OPENMP_CFLAGS = -fopenmp
I hope this works for you as well.
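The Makevars step above can be scripted. The following is a minimal sketch, not part of the original answer: the function name `write_makevars` and its optional directory argument are made up for illustration, and the compiler paths are simply the ones quoted in the answer, so check that they exist on your instance.

```shell
# Sketch: write the compiler settings from the answer into <dir>/Makevars.
# write_makevars and its optional argument are illustrative names only.
# Note: this overwrites any existing Makevars file in that directory.
write_makevars() {
    r_dir="${1:-$HOME/.R}"      # per-user R build settings live in ~/.R/Makevars
    mkdir -p "$r_dir" || return 1
    cat > "$r_dir/Makevars" <<'EOF'
CC = /usr/bin/gcc64
CXX = /usr/bin/g++
SHLIB_OPENMP_CFLAGS = -fopenmp
EOF
}
```

Calling `write_makevars` with no argument writes to `~/.R/Makevars`; passing a directory is useful for trying it out somewhere harmless first.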

You need to install dmlc-core.
This link will provide more information:
A common bricks library for building scalable and portable distributed machine learning

Based on https://github.com/RcppCore/RcppArmadillo/issues/200, I think this issue is due to a g++ compatibility problem. It might also explain why installing devtools kept giving me [-Wdeprecated-declarations] warnings.
So run:
sudo yum remove gcc72-c++.x86_64 libgcc72.x86_64

yum install R-devel
Then you should be able to run the installation command.

Related

How to install the lustre client on Ubuntu nodes?

I am trying to install the Lustre client on Ubuntu 20.04 nodes I have in GCP. I'm using Linux kernel version 5.15.0-1021-gcp.
I'm trying to install the client with the following code:
cd /home/apps/
mkdir lustre
git clone git://git.whamcloud.com/fs/lustre-release.git
cd lustre-release
git checkout 2.15.0
sh autogen.sh
./configure --prefix=/home/apps/lustre --disable-server --enable-client ## fails here with "error: Run make config in /lib/modules/5.15.0-1021-gcp/build"
make debs
The configure step fails with an error about running make config in /lib/modules/5.15.0-1021-gcp/build. I tried running make config in /lib/modules/5.15.0-1021-gcp/build but was asked to input some values that I was unsure of.
I also tried downloading the deb package of the client software at
https://downloads.whamcloud.com/public/lustre/lustre-2.15.0/ubuntu2004/client/lustre-client-modules-5.4.0-96-generic_2.15.0-1_amd64.deb. However, this is for the wrong Linux kernel, and I'm not sure what environment variables need to be set for this package.
Anyone know how to install the client modules for lustre on Ubuntu?
You need to have the kernel sources or kernel-devel package that exactly match the kernel that you are installing on. This should also include the .config file that describes all of the options used when building your kernel.
Alternatively, you could try a pre-built package, but it isn't clear whether it will install on your kernel.
https://build.whamcloud.com/job/lustre-b2_15/40/arch=x86_64,build_type=client,distro=ubuntu2204,ib_stack=inkernel/
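As a quick sanity check before running ./configure, something like the following sketch can tell you whether the build tree for the running kernel contains the .config file mentioned above. `check_kernel_build_dir` is a hypothetical helper, not part of Lustre, and it only looks for .config; configure may also require other generated files. On Ubuntu the matching headers package is usually named linux-headers-$(uname -r), but treat that name as an assumption for the GCP kernel.

```shell
# Sketch: report whether a kernel build tree contains a .config file.
# check_kernel_build_dir is a hypothetical helper; it defaults to the
# build directory of the currently running kernel.
check_kernel_build_dir() {
    build_dir="${1:-/lib/modules/$(uname -r)/build}"
    if [ -f "$build_dir/.config" ]; then
        echo "ok: $build_dir/.config found"
    else
        echo "missing: $build_dir/.config (install headers matching $(uname -r))"
        return 1
    fi
}
```

Run `check_kernel_build_dir` with no argument on the node; a "missing" result suggests the headers or sources for the exact running kernel are not installed.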

Why is desired version of libboost-all-dev not found when building Docker container?

I'm trying to build a basic Docker container based on a tutorial. I am on Windows 10 Home version 2004, and I am using the standard command line. I've created the following Dockerfile for this; the only change from the tutorial's version is my older version of gcc:
FROM gcc:6.3.0
RUN apt-get -qq update
RUN apt-get -qq upgrade
RUN apt-get -qq install cmake
RUN apt-get install libboost-all-dev=1.62.0.1
RUN apt-get -qq install build-essential libtcmalloc-minimal4 && \
ln -s /usr/lib/libtcmalloc_minimal.so.4 /usr/lib/libtcmalloc_minimal.so
Once the script gets to the step where it tries to install libboost-all-dev I get the following output:
Reading package lists...
Building dependency tree...
Reading state information...
E: Version '1.62.0.1' for 'libboost-all-dev' was not found
The command '/bin/sh -c apt-get install libboost-all-dev=1.62.0.1' returned a non-zero code: 100
and the build stops.
I've tried updating the build script to use the current version of Boost (1.74.0) as well and get the same issue. I'm not really finding any solutions in my research online and the output is not very helpful in trying to figure out what the issue is. Could anyone with more experience with installing Boost as part of the Docker process point me in the right direction?
The package manager will only be able to install versions of Boost that it knows exist, based on the enabled package manager repositories. There is typically only one version of Boost in the default repositories. In my experience, this applies to any Linux OS that supplies Boost, not only those that are run within a Docker container.
The Docker image you started with, gcc:6.3.0, appears to have only Boost version 1.55.0.2, so requesting any other version will yield the same error.
If you want a different version of Boost in your image, you can follow the typical steps for installing a different version of Boost outside a Docker container. These steps are well-documented on Stack Overflow, or you might find a repository such as this to enable in your package manager to directly install it from apt-get.
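To see which version the enabled repositories actually offer, you can inspect the output of `apt-cache policy`. The sketch below is only an illustration: `candidate_version` is a made-up helper that pulls the "Candidate:" field out of that output, and it assumes the usual `apt-cache policy` layout.

```shell
# Sketch: print the "Candidate:" version from `apt-cache policy <pkg>` output
# supplied on stdin. candidate_version is an illustrative helper name.
candidate_version() {
    awk '$1 == "Candidate:" { print $2; exit }'
}

# Intended use inside the image (requires apt; run apt-get update first):
#   apt-cache policy libboost-all-dev | candidate_version
```

Whatever version this prints is the one you can pin with `libboost-all-dev=<version>`; leaving the version off installs the same candidate.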

connecting pyarrow with libhdfs3

I'm trying to connect to a Hadoop cluster via pyarrow's HdfsClient / hdfs.connect().
I noticed pyarrow's have_libhdfs3() function, which returns False.
How does one go about getting the required hdfs support for pyarrow? I understand there's a conda command for libhdfs3, but I pretty much need to make it work through some "vanilla" way that doesn't involve things like conda.
If it's of importance, the files I'm interested in reading are parquet files.
EDIT:
The creators of hdfs3 library have made a repo that allows installing libhdfs3:
http://hdfs3.readthedocs.io/en/latest/install.html
I don't know of a way to get libhdfs3 except through conda-forge, or building from source. You will need to conda install libhdfs3=2.2.31 since there was a breaking API change that made libhdfs3 have a different ABI from libhdfs that we have not addressed in Arrow yet. See https://issues.apache.org/jira/browse/ARROW-1445 (patches welcome)
On Ubuntu, this worked for me:
echo "deb https://dl.bintray.com/wangzw/deb trusty contrib" | sudo tee /etc/apt/sources.list.d/bintray-wangzw-deb.list
sudo apt-get install -y apt-transport-https
sudo apt-get update
sudo apt-get install libhdfs3 libhdfs3-dev
It should work on other Linux distros as well using the appropriate installer.
Taken from:
http://hdfs3.readthedocs.io/en/latest/install.html

How to restore gsutil command?

I have updated Google Cloud SDK to the latest version 135.0.0 from
After the update, I got the following message:
WARNING: There are older versions of Google Cloud Platform tools on
your system PATH. Please remove the following to avoid accidentally
invoking these old tools:
/usr/bin/git-credential-gcloud.sh
/usr/bin/bq
/usr/bin/gcloud
/usr/bin/gsutil
So I deleted all of the above.
After that, gsutil stopped working.
How can I resolve this issue?
The issue is that it was installed via
sudo apt-get update && sudo apt-get install google-cloud-sdk
see
https://cloud.google.com/sdk/docs/#deb
and you should have used the same mechanism to upgrade.
gcloud is also a kind of package manager, and it is able to upgrade itself and its dependent packages. Unfortunately, if you use gcloud itself to upgrade, it installs the new version in a different location. It likely does not work because the new location needs to be added to your PATH.
You can try to reinstall the google-cloud-sdk package via apt-get.
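To check which copies of a tool are visible on your PATH (and so whether the new install location is on it at all), a small sketch like the following may help. `find_on_path` is a hypothetical helper written for this answer, not a gcloud command; the optional second argument lets you test a candidate PATH string without changing your environment.

```shell
# Sketch: list every executable called <name> found in a colon-separated
# path list (defaults to $PATH), in lookup order.
# find_on_path is a hypothetical helper, not part of the Cloud SDK.
find_on_path() {
    name="$1"
    path_list="${2:-$PATH}"
    old_ifs="$IFS"
    IFS=:
    for dir in $path_list; do
        dir="${dir:-.}"                      # an empty entry means the current directory
        if [ -x "$dir/$name" ] && [ ! -d "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
        fi
    done
    IFS="$old_ifs"
}
```

For example, `find_on_path gsutil` prints nothing if no gsutil is reachable, which would be consistent with the new install directory not being on PATH.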

Building a Docker file

I am trying to reproduce my development environment in a docker image. I am able to get simple dependencies met such as python+a couple standard packages, largely through the builds from docker hub. But when it comes to installing xgboost or pandas I am having great difficulty.
After looking into the error messages it looked like I had the wrong version of g++ installed. The build had 4.7, but xgboost requires 4.9+. As I tried to update g++ I kept running into a wall where I couldn't update g++ because I needed another package (apt-add-repository), but to install that package I needed another (apt-utils) etc.
Does anyone have any general advice on setting up a Docker image, or on this specific problem of upgrading g++?
Here is the Dockerfile:
FROM continuumio/anaconda
MAINTAINER maintainer
RUN apt-get install -y g++-4.9
One test would be to start from a gcc:4.9 image (which uses wheezy), and try to add what anaconda Dockerfile does.
That way, you start from an image with the right gcc.
You first need to make sure your package lists are up to date. The RUN line in the Dockerfile should be
RUN apt-get update && apt-get install -y g++