How to speed up the Jenkins build process when installing requirements with pip - Django

I am using Jenkins CI for my Django project. For Django-Jenkins integration I am using the django-jenkins app. In the build step of Jenkins I create a fresh virtualenv and install all the dependencies for each build from a requirements file. However, this makes the build extremely slow, because a fresh copy of all the dependencies must be downloaded from a PyPI mirror even if nothing has changed in the dependencies since the last build. So I started using the local caching built into pip by setting the PIP_DOWNLOAD_CACHE environment variable, but the whole build process is still painfully slow and takes more than 10 minutes. Is there any way I could speed up the whole process? Maybe by caching the compiled dependencies or something else?

Only install a fresh virtualenv when your requirements.txt file changes. This can be done easily with a few shell commands. We do something similar in one of our projects. In a Jenkins shell window we have (after svn up):
touch changed.txt
stat -c %Y project/requirements.txt > changed1.txt
# Only reinstall when the timestamp differs from the one seen in the last build
diff -q changed.txt changed1.txt || echo "DO YOUR PIP --upgrade HERE!"
# Remember the new timestamp for the next build (after the upgrade succeeds)
cp changed1.txt changed.txt
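A variant of the same idea, sketched below, keys a reusable virtualenv on a checksum of requirements.txt instead of its timestamp ($WORKSPACE is Jenkins' workspace variable; the cache location is just an illustration):
# Rebuild the virtualenv only when requirements.txt actually changes
HASH=$(md5sum project/requirements.txt | cut -d' ' -f1)
VENV="$WORKSPACE/.venv-cache/$HASH"
if [ ! -d "$VENV" ]; then
    virtualenv "$VENV"
    "$VENV/bin/pip" install -r project/requirements.txt
fi
. "$VENV/bin/activate"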

Why bother creating a fresh virtualenv each time you build? You should be able to create just one and simply activate it with . /path/to/venv/bin/activate in an 'Execute shell' build step (assuming Linux here). Then, if you need to install a new dependency, you can activate the venv yourself and pip install the new package.
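A minimal sketch of such an 'Execute shell' step, assuming the venv path above and a requirements.txt at the project root (the manage.py line applies only if the django-jenkins app is configured):
. /path/to/venv/bin/activate
# pip skips requirements that are already satisfied, so this is cheap when nothing changed
pip install -r requirements.txt
python manage.py jenkins   # django-jenkins test/report step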

Related

How to make GitLab Windows shared runners build faster?

Background
I have a CI pipeline for a C++ library I've been developing. So far, I can distribute this lib to Linux and Windows systems. Since I use GitLab to build, test and package my lib, I'd like my Windows builds to run faster, but I have no clue how to do that.
Currently, I use the following script for my Windows builds:
.windows_template:
  tags:
    - windows
  before_script:
    - choco install cmake.install -y --installargs '"ADD_CMAKE_TO_PATH=System"'
    - choco install python --pre -y
    - choco install git -y
    - $env:ChocolateyInstall = Convert-Path "$((Get-Command choco).Path)\..\.."; Import-Module "$env:ChocolateyInstall\helpers\chocolateyProfile.psm1"; refreshenv
    - python -m pip install --upgrade pip
    - pip install conan monotonic
The problem
Any build with the script above can take up to 10 minutes; worse, I have two stages, each taking the same amount of time. This means my whole CI pipeline takes 20 minutes to finish because of the slow Windows builds.
Ideal solution
EVERYTHING in my before_script can be cached or stored as an image. I only need some hints on how to do it properly.
Additional information
I use the following tools for my builds:
CMake: to support my build process;
Python3: to test and build packages;
Conan (requires Python3): to create packages with several features and to distribute them;
Git: to download Googletest during the CMake configuration step (this is already provided in the cookbooks, so I might just remove this extra installation step from my before_script);
Googletest (requires Python3): testing library;
Visual Studio Dev Tools: to compile the library (already provided in the cookbooks).
Installing packages like this (whether it's OS packages through apt-get install..., or pip, or anything else) is generally against best practice for CI/CD jobs, because every job that runs has to repeat the same work, costing more and more time as you run more pipelines, as you've already seen.
A few alternatives are to search for an existing image that has everything you need (possible, but less likely the more dependencies you have), to split your job into pieces that could each be served by an image with just one or two dependencies, or to create a custom Docker image to use in your jobs. I answered a similar question with an example a few weeks ago here: "Unable to locate package git" when running GitLab CI/CD pipeline
But here's an example Dockerfile with Windows:
# Dockerfile
FROM mcr.microsoft.com/windows
RUN ./install_chocolatey.sh
RUN choco install cmake.install -y --installargs '"ADD_CMAKE_TO_PATH=System"'
RUN choco install python --pre -y
RUN choco install git -y
...
The FROM line says that our new image extends the mcr.microsoft.com/windows base image. You can extend any image you have access to, even if it already extends another image (in fact, that's how most images work: they start with something small, like a base OS installation, then add what's needed on top. The official PHP images, for example, start from a Debian or Alpine base image and then install the necessary PHP packages).
The first RUN line is just an example. I'm not a Windows user and don't have experience installing Chocolatey, but you'd do here whatever you'd normally do to install it locally. The rest are for installing whatever else you need.
Then run
docker build -t mygroup/mytag:version /path/to/dockerfile-dir
The path you supply needs to be the directory that contains the Dockerfile, not the Dockerfile itself. The -t flag names and tags the image as it's built (though you can also do that afterwards with the separate docker tag command).
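Note that the push command further below uses a registry-prefixed name, so if you built the image as mygroup/mytag:version you would first add that name with docker tag:
docker tag mygroup/mytag:version my.docker.hub.com/mygroup/mytag:version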
Next, you'll have to log into whichever registry you're using (Docker Hub (https://docs.docker.com/docker-hub/repos/), the GitLab Container Registry (https://docs.gitlab.com/ee/user/packages/container_registry/), a private registry your employer may run, or any other option):
docker login my.docker.hub.com
Now you can push the image to the registry:
docker push my.docker.hub.com/mygroup/mytag:version
You'll have to review the information in the docs about telling your GitLab runner or pipelines how to authenticate with the registry (unless the image is public on Docker Hub or you use the GitLab Container Registry): https://docs.gitlab.com/ee/ci/docker/using_docker_images.html#define-an-image-from-a-private-container-registry
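One common route described in those docs is to log in once on any machine and store the resulting Docker auth JSON in a CI/CD variable (GitLab calls it DOCKER_AUTH_CONFIG); roughly:
docker login my.docker.hub.com
# Paste the contents of the generated config (the auths section) into a
# DOCKER_AUTH_CONFIG CI/CD variable so the runner can pull the private image
cat ~/.docker/config.json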
Once all that's done, you can use your new image in your CI jobs, and everything we put into the image will be ready to use:
.windows_template:
  image: my.docker.hub.com/mygroup/mytag:version
  tags:
    - windows
  ...

Keeping Pipfile Updated

I just started a new Django project and use PostgreSQL as my database, so I installed psycopg2 to make it work properly. When I deployed the project in the beginning, the app did not work because psycopg2 was not installed on the production server. As I quickly realized, this was because psycopg2 was missing from my Pipfile.
So my question is:
Do I have to update the Pipfile manually every time I install something for my project? I thought that the Pipfile would take care of that automatically every time I install something.
Isn't there anything similar to pip freeze > requirements.txt where I can do the update with one short command?
requirements.txt is just a file. There is no logic around it that updates it automatically (unless, of course, you have an IDE that does that). It is not, per se, the file used by the package manager: you can use any file, and you can use multiple files (for example, a requirements_test.txt file containing extra packages that should be installed when testing the software).
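Such a hypothetical requirements_test.txt might simply pull in the main file and add test-only packages (the pin below is illustrative):
-r requirements.txt
coverage==4.0.2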
You do not, per se, need to update the requirements.txt file each time you install something; as long as the file is correct when you deploy the software on another machine, it is fine.
You can, however, automate this to some extent. If you use git for version control, for example, you can add a pre-commit hook that runs each time you commit changes, by creating an executable file at .git/hooks/pre-commit in the repository. It might look like:
#!/bin/bash
# Regenerate requirements.txt from the project's virtualenv before each commit
. env/bin/activate
pip freeze > requirements.txt
# Stage the refreshed file so it is part of the commit being made
git add requirements.txt
Each time you make a commit, requirements.txt will thus be "harmonized" with the packages installed in the virtual environment.

How do I figure out what dependencies to install when I copy my Django app from one system to another?

I'm using Django and Python 3.7. I want to write a script to help me easily migrate my application from my local machine (a Mac running High Sierra) to a CentOS Linux instance. I'm using a virtual environment in both places. There are many things that need to be done here, but to keep the question specific: how do I determine, on the remote machine I'm deploying to, which dependencies are missing? I'm using rsync to copy the files (minus the virtual environment).
On the source system execute pip freeze > requirements.txt, then copy requirements.txt to the target system, and on the target system install all the dependencies with pip install -r requirements.txt. Of course you will need to activate the virtual environments on both systems before executing the pip commands.
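A minimal sketch of that sequence, with illustrative host and path names:
# On the source machine, inside the activated virtualenv
pip freeze > requirements.txt
# Copy it across (rsync is already part of the workflow here)
rsync requirements.txt user@centos-host:/path/to/project/
# On the target machine, inside its activated virtualenv
pip install -r /path/to/project/requirements.txt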
If you are using a source code management system like git it is a good idea to keep the requirements.txt up to date in your source code repository.

How to bring a Django project out of a virtualenv

I have a Django project in a virtualenv and now I am publishing it on a server, but the problem is that I cannot move the project out of the virtualenv. When I do, the related packages inside site-packages can't be read and errors occur. How can I bring my project out of the virtualenv without any issues?
Create a new virtualenv on the server. It's easy.
Step 1: Get the list of modules in the current virtualenv
source /path/to/current/bin/activate
pip freeze > /tmp/requirements.txt
Step 2: Create a new virtualenv. Log in to your new server and copy the requirements file there. Then either change into a suitable directory before executing the virtualenv command, or give a full path.
deactivate
virtualenv -p python envname
Step 3: Install the modules
source envname/bin/activate
pip install -r /tmp/requirements.txt
That's it.
As @bruno has pointed out, you really should be using a virtualenv on the server. And you should be using it on your local dev server as well. Then you can be really sure that the code will run at both ends without any surprises.

Virtualenv - Cleaning up unused package installations

So I have been developing my first Django web application over the past few months, and I have installed a number of packages that I wanted to try out to solve some of my problems. However, some of those packages I installed, tried to use, failed with, and then never uninstalled.
Is there a way to see what packages my application is using from the list given from "pip freeze"?
That way I can uninstall some of the clutter in my application. Is it a huge disadvantage to have this clutter?
In future development I will uninstall packages right away if I do not use them. So lesson learned :).
A method I use is with my requirements.txt files. From the root of my Django project, I create a requirements/ directory with the following files in it:
requirements/
    base.txt
    dev.txt
    prod.txt
    temp.txt
base.txt contains packages to be used in all environments such as Django==1.8.6.
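For completeness, base.txt in this layout might contain nothing more than the shared pins (the versions here are illustrative; psycopg2 is borrowed from the earlier question):
Django==1.8.6
psycopg2==2.6.1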
Then dev.txt includes base.txt plus other packages, and might look like:
-r base.txt
coverage==4.0.2
Then temp.txt includes dev.txt and contains packages I'm not sure I'll use permanently:
-r dev.txt
temp_package==1.0
git+https://github.com/django/django.git#1014ba026e879e56e0f265a8d9f54e6f39843348
Then I can blow away the entire virtualenv and reinstall it from the appropriate requirements file like so:
pip install -r requirements/dev.txt
Or, to include the temp_package I'm testing:
pip install -r requirements/temp.txt
That's just how I do it, and it helps keep my sandbox separate from the finished product.
Potentially, you could run isort myproject/* --diff to get a list of proposed changes to your project's imports. Note, though, that isort only reorders imports; a linter such as pyflakes (or flake8, which wraps it) is what actually flags imports that are never used. From that output, you could take a look at the packages installed in your virtual environment and start removing the unused ones with pip. This assumes you did not already remove the import statements, which may not be the case.
Another way would be to create a new env and attempt to run your app. Use error messages to get the packages you need until your app runs. Not pretty, but it would work.
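A rough sketch of that trial-and-error approach, assuming the requirements/ layout above and illustrative paths and version pins; run it from the project root:
python -m venv /tmp/clean-env
. /tmp/clean-env/bin/activate
pip install Django==1.8.6            # start from the packages you know you need
python manage.py check               # import errors point at the next package to install
pip freeze > requirements/base.txt   # once it runs cleanly, freeze the minimal set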