no such option --system for pip install

I'm attempting to deploy a very basic trading system to AWS using serverless (following along with this link), but I have a bit of a problem.
Prior to running the deployment command, I'm supposed to run
pip3 install -r requirements.txt -t . --system
but I am getting an error message saying 'no such option: --system'
Initially, I just tried to install the packages without the --system option, but I think that's causing the cron lambda(??) function to fail when I execute it manually through the serverless console, because it can't find the requisite modules.
I'm assuming that's because they aren't being installed properly, so my question is: how should I install them so this doesn't happen?
Running
pip3 install -r requirements.txt
alone (while in the trading system directory) does not suffice.
So, what should I do?

The original author was working on an older Debian-derived system; you aren't. You can safely omit this option if it's not supported.
I don't have an authoritative link available, although this came up in a Google search. But here's my summary:
With older Debian-derived systems (e.g., Ubuntu 18.04), the --user flag was enabled by default and overrode the -t flag, so all packages were installed in $HOME/.local. The --system flag was nominally intended to allow installation in the system package directory, but in practice it was needed to enable -t.
This is fixed in Debian-derived systems that default to Python 3 (e.g., Ubuntu 20.04).
It was never an issue for non-Debian systems (e.g., Amazon Linux).
Since you don't seem to be familiar with pip: the -r argument tells it to read dependencies from a file, and the -t argument tells it where to install them ("." meaning the current directory; not a great habit, but I don't want to get into virtual environments here).
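In practice, on a platform where pip doesn't recognize --system, the tutorial's command simply reduces to the following (run from the trading system directory, so the packages land next to your handler code):
pip3 install -r requirements.txt -t .
That leaves the dependencies in the project directory, which is presumably what the serverless packaging step bundles into the Lambda deployment.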

What is the proper way to install bubblewrap for opam (ideally without admin privileges)?

I am getting this error:
(iit_synthesis) brando9~ $ bash -c "sh <(curl -fsSL https://raw.githubusercontent.com/ocaml/opam/master/shell/install.sh)"
## Using already downloaded "/tmp/user/22003/opam-2.1.4-x86_64-linux"
## Where should it be installed ? [/lfs/ampere4/0/brando9/.local/bin] ~/.local/bin
## '~/.local/bin' resolves to '/lfs/ampere4/0/brando9/.local/bin', do you confirm [Y/n] Y
## opam 2.1.4 installed to /lfs/ampere4/0/brando9/.local/bin
## Converting the opam root format & updating
No configuration file found, using built-in defaults.
Checking for available remotes: rsync and local, git.
- you won't be able to use mercurial repositories unless you install the hg command on your system.
- you won't be able to use darcs repositories unless you install the darcs command on your system.
[WARNING] Missing dependencies -- the following commands are required for opam to operate:
- bwrap: Sandboxing tool bwrap was not found. You should install 'bubblewrap'. See https://opam.ocaml.org/doc/FAQ.html#Why-does-opam-require-bwrap.
[ERROR] Sandboxing is not working on your platform ubuntu:
"~/.opam/opam-init/hooks/sandbox.sh build sh -c echo SUCCESS >$TMPDIR/opam-sandbox-check-out && cat $TMPDIR/opam-sandbox-check-out; rm -f
$TMPDIR/opam-sandbox-check-out" exited with code 10
Do you want to disable it? Note that this will result in less secure package builds, so please ensure that you have some other isolation mechanisms in
place (such as running within a container or virtual machine). [y/N]
but the link it gives (https://opam.ocaml.org/doc/FAQ.html#Why-does-opam-require-bwrap) doesn't actually explain how to install it. Also, I thought this would mean I don't need to do that:
opam init --disable-sandboxing
opam update --all
eval $(opam env)
Am I wrong? I'm confused.
To install bubblewrap on Ubuntu 18.04 or later just do
sudo apt-get install bubblewrap
If you have an older Ubuntu distribution or a distribution that doesn't package this program, then follow the instructions on the bubblewrap page to install it.
Of course, you can opt out of using bubblewrap; that is actually what the message is offering. Just answer y for yes and it will continue without bubble-wrapping anything. For example, if you're building in a Docker container, you don't need the extra layer of containerization that bubblewrap provides, so you can drop it.
Also, I thought this would mean I don't need to do that:
opam init --disable-sandboxing
...
Yes. Once you have installed the opam binary, and if you opted out of using bubblewrap, you need to initialize opam with this option. (Installing opam roughly consists of two steps: first you download and install the binary, then you run opam init so that it configures itself on your system.)
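To summarize the two paths as a rough sketch (the apt package name is the one from above, and the opam commands are the ones you listed):
# option 1: install bubblewrap and keep sandboxing enabled
sudo apt-get install bubblewrap
opam init
eval $(opam env)
# option 2: skip bubblewrap and disable sandboxing
opam init --disable-sandboxing
eval $(opam env)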

How to make GitLab Windows shared runners to build faster?

Background
I have a CI pipeline for a C++ library I've been developing. So far, I can distribute this lib to Linux and Windows systems. Since I use GitLab to build, test and package my lib, I'd like my Windows builds to run faster, but I have no clue how to do that.
Currently, I use the following script for my Windows builds:
.windows_template:
  tags:
    - windows
  before_script:
    - choco install cmake.install -y --installargs '"ADD_CMAKE_TO_PATH=System"'
    - choco install python --pre -y
    - choco install git -y
    - $env:ChocolateyInstall = Convert-Path "$((Get-Command choco).Path)\..\.."; Import-Module "$env:ChocolateyInstall\helpers\chocolateyProfile.psm1"; refreshenv
    - python -m pip install --upgrade pip
    - pip install conan monotonic
The problem
Any build with the script above can take up to 10 minutes; worse, I have two stages, each taking the same amount of time. This means my whole CI pipeline takes 20 minutes to finish because of the slowness of the Windows builds.
Ideal solution
EVERYTHING in my before_script can be cached or stored as an image. I only need some hints on how to do it properly.
Additional information
I use the following tools for my builds:
CMake: to support my building process;
Python3: to test and build packages;
Conan (requires Python3): to support the creation of packages with several features, as well as distribute them;
Git: to download Googletest in the CMake configuration step. This is already provided in the cookbooks, so I might just remove this extra installation step from my before_script;
Googletest (requires Python3): testing library;
Visual Studio DEV Tools: to compile the library. This is already in the cookbooks.
Installing packages like this (whether OS packages through apt-get install..., or pip packages, or anything else) is generally against best practices for CI/CD jobs, because every job that runs has to do the same thing, costing a lot of time as you run more pipelines, as you've seen already.
A few alternatives are to search for an existing image that has everything you need (possible but not likely with more dependencies), split up your job into pieces that might be solved by an image with just one or two dependencies, or create a custom docker image to use in your jobs. I answered a similar question with an example a few weeks ago here: "Unable to locate package git" when running GitLab CI/CD pipeline
But here's an example Dockerfile with Windows:
# Dockerfile
FROM mcr.microsoft.com/windows
RUN ./install_chocolatey.sh
RUN choco install cmake.install -y --installargs '"ADD_CMAKE_TO_PATH=System"'
RUN choco install python --pre -y
RUN choco install git -y
...
The FROM line says that our new image extends the mcr.microsoft.com/windows base image. You can extend any image you have access to, even if it already extends another image (in fact, that's how most images work: they start with something small, like a base OS installation, then add the things needed for that package. The official PHP images, for example, start from a Debian base image and then install the necessary PHP packages).
The first RUN line is just an example. I'm not a Windows user and don't have experience installing Chocolatey, but you'd do here whatever you'd normally do to install it locally. The rest are for installing whatever else you need.
Then run
docker build /path/to/dockerfile-dir -t mygroup/mytag:version
The path you supply needs to be the directory that contains the Dockerfile, not the Dockerfile itself. The -t flag sets the image's tag when it's built (though you can also do that afterwards with a separate command, docker tag).
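If you prefer to tag in a separate step, it might look roughly like this (the image ID is whatever docker build prints at the end; the group/tag names are just the placeholders from above):
docker build /path/to/dockerfile-dir
docker tag <image-id> mygroup/mytag:version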
Next, you'll have to log into whichever registry you're using: Docker Hub (https://docs.docker.com/docker-hub/repos/), the GitLab Container Registry (https://docs.gitlab.com/ee/user/packages/container_registry/), a private registry your employer may support, or any other option.
docker login my.docker.hub.com
Now you can push the image to the registry:
docker push my.docker.hub.com/mygroup/mytag:version
You'll have to review the information in the docs about telling your GitLab runner or pipelines how to authenticate with the registry (unless it's public on Docker Hub or you use the GitLab Container Registry): https://docs.gitlab.com/ee/ci/docker/using_docker_images.html#define-an-image-from-a-private-container-registry
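One pattern described on that page (double-check it against your runner setup) is to store the registry credentials in a DOCKER_AUTH_CONFIG CI/CD variable. The auth value it expects is just the base64 of user:password, which you can generate with something like:
echo -n "my_username:my_password" | base64
and then paste into a JSON of the form {"auths": {"my.docker.hub.com": {"auth": "<that value>"}}} as the variable's value (the registry host and credentials here are placeholders).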
Once all that's done, you can use your new image in your CI jobs, and everything we put into the image will be ready to use:
.windows_template:
  image: my.docker.hub.com/mygroup/mytag:version
  tags:
    - windows
  ...

Getting the following error: Could not find a version that satisfies the requirement command-not-found==0.3

I am deploying a Django app using Heroku.
When I run
git push heroku master
in my terminal I get the following error:
Could not find a version that satisfies the requirement command-not-found==0.3
When I run
sudo apt-get install command-not-found
I find that command-not-found is version 20.04.2. However, pip freeze tells me command-not-found is version 0.3.
command-not-found doesn't seem to exist on PyPI, but it is a package in Ubuntu and Debian repositories. It doesn't look like anything that your application should depend on, and it certainly doesn't belong on Heroku.
I suspect
you're trying to create your dependencies file after the fact, by simply doing pip freeze > requirements.txt, and
that you're either not working in a virtual environment or you created your virtual environment with system packages.
This is an antipattern that will cause several packages that your application doesn't actually need to be included in your requirements.txt. In this case it is even including Python packages that come from system packages and aren't meant to be installed from PyPI. Your requirements.txt should contain only your actual dependencies.
Instead of creating it with pip freeze after the fact, add dependencies to that file as you adopt them, and install them into your virtual environment with the same pip install -r requirements.txt command that you'll use in production. I also very strongly urge you to use a virtual environment.
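As a rough sketch of that workflow (the package names mentioned in the comments are only examples; list whatever your app actually needs):
# create and activate a clean virtual environment
python3 -m venv venv
source venv/bin/activate
# requirements.txt is written by hand, e.g. containing just Django and gunicorn
pip install -r requirements.txt
In a clean virtual environment, pip freeze only reports what you deliberately installed, so it no longer drags in system packages like command-not-found.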
In this case, I suggest you edit your requirements.txt and remove anything you don't actually need, commit, and redeploy.

Is it possible to create a standalone file to import a python library created with pybind?

I hope my question is clear; if not, please tell me.
I am using OpenImageIO's python bindings (pybind11) for some scripts that will run on hundreds of computers. Unfortunately it took me a lot of time to install OpenImageIO and make it work with my Python2 installation. I'd like to know if there's a way to create a file/folder that I could send to other computers so they can install the Python module simply with "pip install file/folder".
Thanks for your help.
Are you running the scripts on a compute cluster with a shared filesystem? If so, then there's no need to create separate installations of python for each machine. The simplest solution is to create ONE python environment in a location that is accessible by all of your machines. An easy way to create a Python environment in a non-system location is to use Miniconda. Install it to a shared (network) location, and create an environment for all of your machines to use.
If your machines do NOT have a shared file system, then you'll need to somehow reproduce the environment on all of them independently. In this case, there's no simple way to do that with pip.**
But if you can use conda instead, then there's a very straightforward solution. First, install everything you need into a single conda environment. Then you have a choice: You can export the list of conda packages, or simply copy the entire conda environment directory to the other machines.
OpenImageIO is available from the conda-forge channel, a community-developed repository of conda packages. The name of the package is py-openimageio. They have stopped updating the python-2.7 version, but the old versions are still available.
Here's how to do it.
Install Miniconda-2.7
Create a new environment with python 2.7, OpenImageIO, and any other packages you need:
conda create -n jao-stuff -c conda-forge py-openimageio python=2.7
conda activate jao-stuff
python -c "import OpenImageIO; print('It works!')"
Do ONE of the following:
a. Export the list of packages in your environment:
conda env export -n jao-stuff -f jao-stuff-packages.yaml
Then, on the other machines, install Miniconda, then create the environments using the package list from the previous step:
conda env create -n jao-stuff -f jao-stuff-packages.yaml
OR
b. Just copy all of the files in the environment to the other machines, and run them directly. Conda environments are self-contained (except for a few low-level system libraries), so you can usually just copy the whole thing to another machine and run it without any further install step.
tar czf jao-stuff.tar.gz -C $(conda info --base)/envs jao-stuff
On the other machine, unpack the tarball anywhere and just run the python executable it contains:
tar xzf jao-stuff.tar.gz
jao-stuff/bin/python -c "import OpenImageIO; print('It works!')"
**That's because OpenImageIO is a C++ project, with several C++ dependencies, and they don't provide binaries in the wheel format. I don't blame them -- pip is not well suited to this use-case, even with wheels. Conda, on the other hand, was designed for exactly this use-case, and works perfectly for it.

connecting pyarrow with libhdfs3

I'm trying to connect to a Hadoop cluster via pyarrow's HdfsClient / hdfs.connect().
I noticed pyarrow's have_libhdfs3() function, which returns False.
How does one go about getting the required hdfs support for pyarrow? I understand there's a conda command for libhdfs3, but I pretty much need to make it work through some "vanilla" way that doesn't involve things like conda.
If it's of importance, the files I'm interested in reading are parquet files.
EDIT:
The creators of the hdfs3 library have made a repo that allows installing libhdfs3:
http://hdfs3.readthedocs.io/en/latest/install.html
I don't know of a way to get libhdfs3 except through conda-forge, or building from source. You will need to conda install libhdfs3=2.2.31 since there was a breaking API change that made libhdfs3 have a different ABI from libhdfs that we have not addressed in Arrow yet. See https://issues.apache.org/jira/browse/ARROW-1445 (patches welcome)
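For reference, the conda-forge route would look something like this (version pinned as described above):
conda install -c conda-forge libhdfs3=2.2.31
# then re-run the have_libhdfs3() check from your question to confirm it's picked up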
On Ubuntu this worked for me:
echo "deb https://dl.bintray.com/wangzw/deb trusty contrib" | sudo tee /etc/apt/sources.list.d/bintray-wangzw-deb.list
sudo apt-get install -y apt-transport-https
sudo apt-get update
sudo apt-get install libhdfs3 libhdfs3-dev
It should work on other Linux distros as well using the appropriate installer.
Taken from:
http://hdfs3.readthedocs.io/en/latest/install.html