Correctly using pre-commit with flake8 plugins - pre-commit.com

So far, I have used tools from the local repo, where everything was run as a system command in the development environment. This works great, but is a bit on the slower side from the overhead of running things through poetry, and prevents running linting without a full setup.
Adjusting black and isort to use their respective repos was easy enough, but flake8's plugin system is causing trouble.
Running it from local, the versions of the plugins are managed by poetry, and can be easily modified in pyproject.toml while locking ensures there are no unexpected version bumps happening, but there's seemingly no way to achieve this when specifying the plugins through additional_dependencies on the flake8 hook.
If the versions are left unconstrained, pre-commit may use a different linting configuration in different environments, but then if an exact constraint is provided, it becomes increasingly more difficult to manage as the version bumps have to be done completely manually.
e.g this possibly won't produce the same environment across different runs if the plugin is updated
- repo: https://github.com/pycqa/flake8
rev: 4.0.1
hooks:
- id: flake8
additional_dependencies:
[
flake8-plugin, # or some lower upper bound
]
But providing an exact version flake8-plugin==1.2.3, requires me to manually keep track of the updates of all the plugins and bump everything myself.
Is there any way to get a behaviour similar to normal locking, where I'd specify a reasonable version constraint for the plugin (e.g. ~1.2) somewhere, and that'd get locked on autoupdate ?

Related

Go cross-platform tests

Problem description
I've developed a CLI in Go. My dev environment is Linux. I wrote unit-tests and only produce releases of executable files when tests pass. When I test or use my tool in a Linux environment, everything works fine.
My CI/CD pipeline is built around goreleaser to produce multi-platform executables. Since my app doesn't use exotic cross-platform functionalities, I was quite confident that windows executable should work as expected. But it didn't.
Long story short, always normalize paths with filepath.ToSlash(). But this is not my question.
Question
Hence my question is: "since behavior might change on different platforms for such little mistakes, is it possible to run go test for a list of os/architecture ?" I can't imagine rebooting on windows to test every commit manually and I don't think discipline is the answer. If it was, we wouldn't test things at all.
Search attempts
A fast search on Google and Stack Overflow for "golang cross-platform tests" didn't return any results. Am I missing something or is my approach to this problem wrong ?
Fist edit
Most comments pointed out that the only way to test the behavior of an executable on a given platform is... to test it on this platform (in a multi-stage CI/CD for example). This is so obvious that there might not be a way to achieve it otherwise, I know.
But triggering a parallel CI/CD job on every platform for every commit (of partially untested code) doesn't sound satisfying to me. It IS the only way to know for sure that the code behave as expected on every targeted platform but I'm wondering if anyone stumbled on this issue and found a pre-CI/CD solution to this problem.
Though it might be the only way to get conclusive test results, it implies triggering CI/CD with parallel tests on each platform. I was looking for some solution on the developer machine, before committing untested code
You can install a local CICD tool which would, on (local) commit, trigger those tests.
A Local GitLab, for instance, can run test in multiple platforms simultaneously (since GitLab 11.5)
But that implies at least a Docker image in order to test on Windows from your Linux dev environment.
With Go alone however, as mentioned in "Design and unit-test cross-platform application"
It's not possible to run go test for a target system that's different from the current system.
If you are trying to test a different CPU architecture, you should be able to do it with QEMU. This blog post explains how.
If you are trying to test a different OS, you can probably do everything you need to do in Docker containers. You could use a tool like VSCode’s remote development toolkit to easily dev/build/test a project in a specific container, or you could write a custom Makefile that calls the appropriate Docker commands when you run make test (allowing you to run tests in multiple OSes with one command).

Debugging executables across git commits

Has anyone come up with a good solution for debugging executables between different code versions using git? Basically, I want to be able to generate side-by-side executables between my recent set of commits and an older set of commits, as well as keep track of the relevant code to go with it, without everything overwriting when jumping between the commits.
Currently my method of doing this is to make/compile the current code and rename the executable and whatever code I'm trying to analyze in debugger (gdb in my case). Now, to the old code, I like to checkout a new branch before checking out the older commit because I'm paranoid and it builds the reflog for extra safety and check-pointing (I know it's not really necessary). I checkout old code and run the make file on the older commit. Now I have my side-by-side executables that I can run with gdb. It's all a bit tedious though, especially if a few headers/files have changed (I like to break up my code) and also if I now want to make changes and recompile (I've got tons of references/includes and I basically just have to start the process over again). Anyone have a better method?
I recently had a similar problem when trying to debug/upgrade a library which needed to be properly installed on the system (using make install) and make jumps between last stable and development versions.
Since version 2.6.0 (about three years from now), you now can use
git worktree
It basicaly enables yourself to turn any directory, at any place, into an annex of an official git local repository. It does by filling a .git textfile (instead of the regular subdirectory) at its top level, containing informations that point the original one. On the repository side, this annex is declared as so and a new branch is created.
Consequently, you now have the possibility to perform git checkout onto two different revisions simultaneously (one per working directory). And you may add as many annexes as you need. When you're done, simply delete all useless annexes, then call git worktree prune to make it dismiss everything that no longer exists (you may use git worktree lock to prevent some of them to be deleted if their annex directory is to be unavailable sometimes, with removable devices for instance). This will let you compile two different versions of the same application at the same time.
If, however, you need to retrieve a particular revision of your soft but you can't tell which one before you have compiled it, you may want to use
git bisect run
…instead, which will automatically call a script that tells if a revision is good or bad. It's useful when a compilation is rather long and when the whole search can span over several days.
Here's a link to the documentation page related to git worktree.

Implementing single script build - non-portable dependencies

It seems that a build system best-practice is to have a single script that can build all source and package the releases. See Joel Test #2
How do you account for non-portable dependencies? For example, if you code for .net 4, then you need .net 4 installed on the box. Standard MS release .net 4 is not xcopy deployable (unless I'm mistaken?). I can see a few avenues:
the dependencies are clearly stated in some resource file (wiki, txt, whatever). When you
call the build script, the build will fail if you don't have the dependency installed. This is an acceptable outcome.
The build script is responsible for setting up the environment. So if you require .net 4 and its not on the box then it installs it for you.
A flavor of #2 - instead of installing dependencies, the script spawns a pre-packaged image (virtual machine, Amazon EC2 AMI) that is setup with all dependencies
???
For implementing a build script you have to ask yourself, how much work you want/can spent on it. This leads to the question how often you have to set up the build environment. I can see #2 would be the perfect solution, but i would need a lot of work, since usually you have more than one non portable dependency.
So we use #1 one. And it works quite well. The most important thing is, that the build script is starting with some sort of self-test. It looks for everything which is needed to build the whole software and gives an error if something is not found. And it gives a clear error message, so that any new guy knows what to do to make it running. Of course as with a lot of software it is nearly never finished and gets extended by needs. The drawback that this test can take some seconds is insignificant when whole build process needs more than minutes.
A wiki (or even sth. else) with the setup solution was not a good solution for us, since after three month nobody knows where this was, but the build script is used every day.
The build script itself is a set of a lot of different things, which where chosen by needs. It is starting with a batch (we are using Windows) which invokes a lot of other things. Other batches, MSBuild, home grown tools. Each step by it self is checking for its own dependencies, to have the problem local and you can see three lines later why this special thing is needed.
Number 2 states "Can you make a build in one step?" As described this means for a development team to be effective the build process must be as simple as possible to reduce errors in the build process and insure consistency. This is especially important as a team gets larger. You want to make sure everyone is building the same thing. (What is done with that package should also be simple, but it is not as important IMHO.) Msbuild is great at this; they provide the facilities to set up a build server that access the source control system independently so the developers actions can't corrupt the build environment. I highly recommend setting up a build server using TFS -- many build issues will go away and you will have the 1-click build Joel describes.
As for your points about what that package does for deployment -- you have many options with MS, but the more "one click" you can make it the better. I believe this is slightly different than Joel's #2. In his example he describes changing what software he will use for the install not because one performs with fewer steps, but instead because one can be incorporated into a one step build.

Handling relations between multiple subversion projects

In my company we are using one SVN repository to hold our C++ code. The code base is composed from a common part (infrastructure and applications), and client projects (developed as plugins).
The repository layout looks like this:
Infrastructure
App1
App2
App3
project-for-client-1
App1-plugin
App2-plugin
Configuration
project-for-client-2
App1-plugin
App2-plugin
Configuration
A typical release for a client project includes the project data and every project that is used by it (e.g. Infrastructure).
The actual layout of each directory is -
Infrastructure
branches
tags
trunk
project-for-client-2
branches
tags
trunk
And the same goes for the rest of the projects.
We have several problems with the layout above:
It's hard to start a fresh development environment for a client project, since one has to checkout all of the involved projects (for example: Infrastructure, App1, App2, project-for-client-1).
It's hard to tag a release in a client projects, for the same reason as above.
In case a client project needs to change some common code (e.g. Infrastructure), we sometimes use a branch. It's hard to keep track which branches are used in projects.
Is there any way in SVN to solve any of the above? I thought of using svn:externals in the client projects, but after reading this post I understand it might not be right choice.
You could handle this with svn:externals. This is the url to a spot in an svn repo
This lets you pull in parts of a different repository (or the same one). One way to use this is under project-for-client2, you add an svn:externals link to the branch of infrastructure you need, the branch of app1 you need, etc. So when you check out project-for-client2, you get all of the correct pieces.
The svn:externals links are versioned along with everything else, so as project-for-client1 get tagged, branched, and updated the correct external branches will always get pulled in.
A suggestion is to change directory layout from
Infrastructure
branches
tags
trunk
project-for-client-1
branches
tags
trunk
project-for-client-2
branches
tags
trunk
to
branches
feature-1
Infrastructure
project-for-client-1
project-for-client-2
tags
trunk
Infrastructure
project-for-client-1
project-for-client-2
There are some problems with this layout too. Branches become massive, but at least it's easier to tag specific places in your code.
To work with the code, one would simply checkout the trunk and work with that. Then you don't need the scripts that check out all the different projects. They just refer to Infrastructure with "../Infrastructure". Another problem with this layout is that you need to checkout several copies if you want to work on projects completely independently. Otherwise a change in Infrastructure for one project might cause another project not to compile until that is updated too.
This might make releases a little more cumbersome too, and separating the code for different projects.
Yeah, it sucks. We do the same thing, but I can't really think of a better layout.
So, what we have is a set of scripts that can automate everything subversion related. Each customer's project will contain a file called project.list, which contains all of the subversion projects/paths needed to build that customer. For example:
Infrastructure/trunk
LibraryA/trunk
LibraryB/branches/foo
CustomerC/trunk
Each script then looks something like this:
for PROJ in $(cat project.list); do
# execute commands here
done
Where the commands might be a checkout, update or tag. It is a bit more complicated than that, but it does mean that everything is consistent, checking out, updating and tagging becomes a single command.
And of course, we try to branch as little as possible, which is the most important suggestion I can possibly make. If we need to branch something, we will try to either work off of the trunk or the previously-tagged version of as many of the dependencies as possible.
First, I don't agree that externals are evil. Although they aren't perfect.
At the moment you're doing multiple checkouts to build up a working copy. If you used externals it would do exactly this, but automatically and consistently each time.
If you point your externals to tags (and or specific revisions) within the target projects, you only need to tag the current project per release (as that tag would indicate exactly what external you were pointing to). You'd also have a record within your project of exactly when you changed your externals references to use a new version of a particular library.
Externals aren't a panacea - and as the post shows there can be problems. I'm sure there's something better than externals, but I haven't found it yet (even conceptually). Certainly, the structure you're using can yield a great deal of information and control in your development process, using externals can add to that. However, the problems he had weren't fundamental corruption issues - a clean get would resolve everything, and are pretty rare (are you really unable to create a new branch of a library in your repo?).
Points to consider - using recursive externals. I'm not sold on either the yes or no of this and tend to go with a pragmatic approach.
Consider using piston as the article suggests, I've not seen it in action so can't really comment, it may do the same job as externals in a better way.
From my experience I think it's more useful to have a repository for each individual project. Otherwise you have the problems you say and furthermore, revision numbers change if other projects change which could be confusing.
Only when there's a relation between individual projects, such as software, hardware schematics, documentation, etc. We use a single repository so the revision number serves to get the whole pack to a known state.

Do you version control the invidual apps or the whole project or both?

OK, so I have a Django project. I was wondering if I'm supposed to put each app in its own git repository, or is it better to just put the whole project into a git repository, or whether I should have a git repo for each app and also a git repo for the project?
Thanks.
It really depends on whether those reusable apps are actually going to be reused outside of the project or not?
The only reason you might need it in a different repo is if other projects with seperate repos might need to use it. But you can always split it out later. Making a git repo is cheap so it's one of those things you can do later if it becomes necessary. Making things complicated up front is just going to frustrate you later so feel free to wait till you know it's necessary
YAGNI it's for more than just code.
I like Tom's or Alex's answers, except they lack real rationale behind them
("easier to have one repository per development team" ?
"substantial numbers of people might be interested in pulling out (or watching changes to) separate apps"
why ?)
First of all, one or several repositories is an server-side administration decision (in term of server resources).
SVN easily sets up several repositories behind one server, whereas ClearCase will have its own "vob_server" process per VOB, meaning: you do not want more than a hundred VOB per (Unix) server (or more than 20-30 on a Windows server).
Git is particular: setting a repository is cheap, accessing it can be easy (through a simple shared path), or can involved a process (a git daemon). The latest solution means: "not to much repositories accessed directly from outside". (they can be accessed indirectly through submodules referenced by a super-project)
Then there is the client-side administration: how complex will the configuration of a workspace be, when one or several repositories are involved. How can the client references the right configuration? (list of labels needed to reference the correct files).
SVN would use externals, git "submodules", and in both case, that adds complexity.
That is why Tom's answer can be correct.
Finally, there is the configuration management aspect (referenced in Alex's answer). Can you tag part of a repo of not ?
If you can (like in SVN where you actually make a svn-copy of part of a repo), that means you can have a component approach, where several groups of files have their own life cycle, and their own tags (set at their own individual pace).
But in Git, that is not possible: a tag references a commit which always concerns the all repository.
That means a "system-based" approach, where you only want the all project anyway (as opposed to the "watching separate apps - I've never observed it in real life" bit from Alex's answer). If that is the case (if you want the all system anyway), that is not important.
But for those of us who think in term of "independent groups of files", that means a git repository actually represents an individual group of files (with its own rhythm in term of evolutions and tagging), with potentially a super-project referencing those repositories as submodules.
That is not your everyday setup, so for independent projects, I would recommend only few or one Git repo.
But for more complex interdependent set of projects... you need to realize that by the very way Git has been conceived, a "repository" represents a coherent set of files supposed to evolve at the same pace as a all. And not "all" can always fit in one set of files (if "all" is complex enough). Hence, several repositories would be required in this case.
And a complex set of inter-dependent projects does happen in real life too ;)
Almost always easier to have one repository per development team, no matter how many products you have. Eventually you will want to share code between projects (even several seperate DJango websites!) and it is much easier to do with only one repository.
By all means set up a nice folder structure WITHIN that repository. However, a single checkout from git (probably from a subfolder) should give you all the files you need for your test website (python, templates, css, jpegs...), and then you just copy httpd.conf and so on to complete the installation.
I'm no git expert, but I don't believe the appropriate strategy would be any different from one for hg, or even, in this case, svn, so I'm going to give my 2 cents' worth anyway;-).
A repo per app makes any sense only if substantial numbers of people might be interested in pulling out (or watching changes to) separate apps; this could theoretically be the case but I've never observed it in real life -- mostly, people who are interested in the project are interested in it as a whole. Therefore, I would avoid complications and make the whole proj into one repo.
Do those individual apps utilize shared code and libraries? If so, they really belong in the same repository .. which enables you to see the impact of new changes when running a single testing suite.
If they are completely separate and agnostic, it really doesn't matter. Just for sanity, ease of building and ease of packaging, I prefer to keep a whole project in a combined repository. However, we typically work in very small teams. So, if no shared code is involved, its really just a question of which method is most convenient and efficient for all concerned.
If you have reusable apps, put them in a seperate repo.
If you are worried about the number of private repositories have a look at bitbucket (if you decide to make them open source, I advice github)
There is a nice way of including your own apps in to the project, even making use of version tags:
I recently found out a way to do this with buildout and git tags
(I used svn:externals to include an app, but switched from svn to git recently).
I tried mr.developer first, but while not getting this to work i found an alternative for mr.developer:
I found out gp.vcsdevlop is very easy to use for this purpose.
see https://pypi.python.org/pypi/gp.vcsdevelop
I ended up, putting this in my buildout file and got it working at once (I had to add a pip requirements.txt to my app to get it working, but that's a good thing after all):
vcs-update = True
extensions =
gp.vcsdevelop
buildout-versions
develop-dir=./local_checkouts
vcs-extend-develop=git+git#bitbucket.org:<my bitbucket username>/<the app I want to include>.git#0.1.38#egg=<the appname in django>
develop = .
in this case it chacks out my app and clone it to version tag 0.1.38 in to the project in the subdit ./local_checkouts/
and develops it when running bin/buildout
Edit: remark 26 august 2013
While using this solution and editing in this local checkout of the app used in a project. I found out this:
you will get this warning when trying the normal 'git push origin master' command:
To prevent you from losing history, non-fast-forward updates were rejected
Merge the remote changes before pushing again. See the 'Note about
fast-forwards' section of 'git push --help' for details.
EDIT 28 august 2013
About working in the local_checkouts for shared apps included by gp.vcsdevelop:
(and handle the warning discussed in the remakr above)
git push origin +master seems to screw up commit history for the shared code
So the way to work in the local_checkout dir is like this:
go in to the local checkout (after a bin/buildout so the checkout is the master):
cd localcheckouts/<shared appname>
Create a new branch and swith to it like this:
use git checkout -b 'issue_nr1' (e.g. the name of the branch is here named after the issue you are working on)
and after you are done working in this branche, use: (after the usuals git add and git commit)
git push origin issue_nr1
when tested and completed merge the branch back in the master:
first checkout the master:
git checkout master
update (probably only neede when other commits in the mean time)
git pull
and merge to the master (where you are in right now):
git merge issue_nr1
and finaly push this merge:
git push origin master
(with sepcial thanks to this simplified guide to git: http://rogerdudler.github.io/git-guide/)
and after a while, to clean up the branches, you might want to delete this branch
git branch -d issue_nr1