Checks not running on every file (when running -a) even though they are parsed - pre-commit.com

Summary of the problem
I've set up a repository with the checks I developed, designed to be used exclusively in another repository.
The problem I've been facing is that when I run pre-commit run -a -v, the checks don't go through every file, and from time to time the files they do cover even change!
A few more details
I've run the identity check and it prints every file in the repo, meaning the files are read by pre-commit (per my understanding).
When I execute pre-commit run, the files in staging are correctly parsed.
Why do I think that the checks don't run on every file?
I always print something to the standard output for every check, and when I run it verbosely it only prints 5 to 7 lines, meaning it checked only those.
If I edit a file that's not on the short list (breaking a check), the check still passes.
Some code
The structure of the folder containing the files to be checked
./
    articles/
        some_name/
            file.md
            file.png
            and_so_on.jpeg
    team/
        some_other_name/
            file.md
            propic.jpg
Summary of the .pre-commit-hooks.yaml
- id: team-check-name
  name: blabla
  description: blabla
  entry: team-checkname
  files: 'team/.*/'
  types: [markdown]
  language: python
- id: article-check-name
  name: blabla
  description: blabla
  entry: article-checkname
  files: 'articles/.*/'
  types: [markdown]
  language: python
Essentially the checks supposed to run on articles/.*/ start with article- and they all have this property: files: 'articles/.*/'.
Similarly every check supposed to run on team/.*/ starts with team- and they all have files: 'team/.*/'
Summary of the .pre-commit-config.yaml
repos:
  - repo: https://github.com/repourl
    rev: commit_hash
    hooks:
      - id: team-check-name
      - id: article-check-name
But as you can see, I don't override any settings, so the config shouldn't interfere.

The hook tools you've written are incorrect: they only process sys.argv[1] rather than all of the positional arguments. pre-commit passes many filenames to a single invocation of a hook, so a tool that reads only the first one silently skips the rest.
Your code uses a lot of global variables, so it's not a straightforward refactor. Typically you'd either use argparse to collect the filenames with nargs='*' or loop over sys.argv[1:] (if you don't have any options).
As to why this is the convention: it's very wasteful to start a linter process over and over to lint a single file (oftentimes executable startup cost dwarfs the actual linting / formatting process).
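To make the convention concrete, here is a minimal sketch of a hook entry point that handles every filename it receives. The check itself (check_file, which fails on the string "TODO") is a hypothetical stand-in for the real team/article checks, which aren't shown in the question.

```python
import argparse


def check_file(filename):
    """Hypothetical per-file check: fail if the file contains 'TODO'."""
    with open(filename, encoding="utf-8") as f:
        ok = "TODO" not in f.read()
    print(f"checked {filename}: {'ok' if ok else 'FAILED'}")
    return ok


def main(argv=None):
    parser = argparse.ArgumentParser()
    # Collect every positional argument, not just the first one.
    parser.add_argument("filenames", nargs="*")
    args = parser.parse_args(argv)

    retval = 0
    for filename in args.filenames:
        # Keep going after a failure so every file gets reported.
        if not check_file(filename):
            retval = 1
    return retval
```

The console script named in entry: would then call main() and exit with its return value, for example via raise SystemExit(main()); any nonzero exit status makes the hook fail.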
disclaimer: I wrote pre-commit

Related

Extracting User Story from Commit Message in Jenkins

I'm using the Email Extension (email-ext) plug-in to extract the User Story, which is added as a commit message, and include it in the mail body.
This is the console output:
Commit message: "US285568"
I used the BUILD_LOG_EXCERPT token of the email-ext plug-in as follows:
STORY: ${BUILD_LOG_EXCERPT, start="Commit message:\ \"", end="\'"}
However, this does not match anything and I'm not able to understand why it's failing.
I couldn't find proper documentation for this plug-in.
I used a work-around solution by triggering a helper job at the end of the current job (which contains the commit message in the console output).
I am executing the following shell code in the helper job:
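# Fetch the upstream job's full console output.
result=$(curl -s --user "user:pass" "{JENKINS_IP}/jenkins/job/{UPSTREAM_JOB_NAME}/consoleFull")
# Keep only the line that carries the commit message.
comm=$(grep "Commit message:" <<< "$result")
if grep -qE "US[0-9]+" <<< "$comm"
then
    final=$(grep -oE "US[0-9]+" <<< "$comm")
else
    final="<font color=\"red\">User Story not found</font>"
fi
# Write the value where the EnvInject plug-in can pick it up.
echo "FINAL=$final" > env.properties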
To access the FINAL variable (which contains the value of the user story) as an environment variable, I have used the EnvInject plug-in.
To access the FINAL variable outside the shell code in the job, add the "Inject environment variables" build step after the shell code and enter "env.properties" in the Properties File Path.
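The extraction step of this workaround can also be sketched in Python, which may be easier to test than the grep pipeline. This is only an illustration under the same assumptions as the shell code: the console text contains a line with "Commit message:" and the story ID looks like US followed by digits. The function name find_user_story is hypothetical.

```python
import re


def find_user_story(console_text):
    """Return the first US<digits> ID on a 'Commit message:' line, or None."""
    for line in console_text.splitlines():
        if "Commit message:" in line:
            match = re.search(r"US[0-9]+", line)
            if match:
                return match.group(0)
    return None
```

A caller would write FINAL=<result> (or the "User Story not found" fallback when the result is None) to env.properties, exactly as the shell version does.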

Using file.managed for downloading a file in Salt

salt.states.file.managed takes source_hash as an argument to verify a downloaded file. This blocks me from using file.managed for a file on an online server I don't have control over. The file also changes regularly. My configuration looks like this.
download_stuff:
  file.managed:
    - name: localfile.tar.gz
    - source: http://someserver.net/onlinefile.tar.gz
    - source_hash: ???
I don't want to use cmd.run with curl or wget because this would always download the file, even when it's already on the local machine.
I would like to know if one of the options below is possible/exists:
1) Online md5 calculation service. Is there any way of getting an md5 hash of the file, using a free web service? I'm thinking of something like http://md5service.net?url={url-to-file}.
2) Salt-internal conversion or workaround. Is it possible to handle this in Salt? Maybe by leaving out source_hash somehow?
3) Alternative state. Is there another state in Salt for doing something like this, without losing the benefit of only downloading the file when needed?
If you can't control the other server, make sure you can trust the content it serves. Not using a hash will prevent you from detecting partial or corrupted downloads. Without a hash, there's also no way to detect that the file has changed on the remote server.
Nevertheless, you could use a state like this to get around the hash requirement. The creates argument prevents a second download once the file has been downloaded:
bootstrap:
  cmd.run:
    - name: curl -L https://bootstrap.saltstack.com -o /etc/salt/cloud.deploy.d/bootstrap-salt.sh
    - creates: /etc/salt/cloud.deploy.d/bootstrap-salt.sh
Since version 2016.3.0, file.managed can download a file even if you don't have access to its hash, by adding skip_verify: True. For the example given, it would be:
download_stuff:
  file.managed:
    - name: localfile.tar.gz
    - source: http://someserver.net/onlinefile.tar.gz
    - skip_verify: True
From the docs:
If True, hash verification of remote file sources (http://, https://, ftp://) will be skipped, and the source_hash argument will be ignored.

C++ execute temp file as bash script

I have a program that needs to run a program we'll call externalProg in parallel on our linux (CentOS) cluster - or rather, it needs to run many instances of externalProg, each on different cores. Each "thread" creates 3 files based on a few parameters - the inputs to externalProg, a command file to tell externalProg how to execute my file, and a bash script to set up the environment (calls a setup script provided by the manufacturer) and actually call externalProg with my inputs.
Since this needs to be parallel with an unknown number of concurrent threads and I don't want to risk overwriting another thread's files, I am creating temp files using
mkstemp("PREFIX_XXXXXX")
for these input files. After the external program runs, I extract the relevant data and store it, and close the temp files (therefore deleting them).
We'll call the files created (which actually have names based on the template above):
tmpInputs - Inputs to externalProg
tmpCommand - Input that tells externalProg how to execute tmpInputs
tmpBash - bash script to set up and call externalProg with my inputs
The file tmpBash looks something like
source /path/to/setup/script # Sets up environment variables
externalProg < /path/to/tmpCommand
where tmpCommand is just a simple text file.
The problem I'm having is actually executing the bash script. Within my program, I call
ostringstream launchcmd;
launchcmd << "bash " << path_to_tmpBash;
system(launchcmd.str().c_str());
But nothing happens. No error, no warning, no 'file not found' or permission denied or anything. I have verified that the files are being created and have the correct content. The rest of the code after system() is executed successfully (Though it fails since externalProg wasn't run).
Strangely, if I go back to the terminal and type
bash /path/to/tmpBash
then externalProg is executed successfully. I have also cout'd the launchcmd string and copied and pasted it into the terminal, which also works. For some reason, this only fails when called from within my program.
After a bit of experimentation, I've determined that system() calls /bin/sh on our cluster. If I change launchcmd to look like
/path/to/tmpBash
(so that the full command should look like /bin/sh -c /path/to/tmpBash), I get a permission denied error, which is no surprise. The problem is that I can't chmod +x the tmpBash file while it's still open, and if I close the file, it gets deleted - so I'm not sure how to address that.
Is there something obviously wrong I'm doing, or does system() have some nuance that I'm missing?
edit: I wanted to add that I can successfully call things like
system("echo $PATH")
and get the expected results (in this case, my default $PATH).
Two separate ideas:
Change your SHELL environment variable to be /bin/bash, then call system(),
or:
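Use execve directly: execve("/bin/bash", argv, environ), where argv is {"bash", "/path/to/tmpBash", NULL}. By convention argv[0] is the program name, so the script path goes in argv[1]. Since execve replaces the calling process, call it in a child created with fork().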
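The idea of handing the script path to an explicitly chosen interpreter can be sketched as follows. This is written in Python rather than C++ purely to keep it short, and run_temp_script is a hypothetical helper, not part of the asker's program. It also flushes the temp file before launching the shell; whether unflushed output is the cause of the original problem is an assumption, not something the question confirms.

```python
import shutil
import subprocess
import tempfile


def run_temp_script(script_text, shell=None):
    """Write script_text to a temp file and run it with an explicit shell."""
    # Prefer bash; fall back to sh if bash is not installed (an assumption
    # made only so the sketch runs on minimal systems).
    shell = shell or shutil.which("bash") or "sh"
    with tempfile.NamedTemporaryFile("w", suffix=".sh") as script:
        script.write(script_text)
        # Flush so the shell sees the full contents while the file is open.
        script.flush()
        # The interpreter is argv[0] and the script path is argv[1], so the
        # script does not need the executable bit (no chmod +x).
        result = subprocess.run(
            [shell, script.name], capture_output=True, text=True, check=False
        )
    # The temp file is deleted when the with-block exits.
    return result.returncode, result.stdout
```

For example, run_temp_script("echo hello\n") returns the script's exit status together with its captured standard output, and the file disappears only after the shell has finished with it.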

Is there a way to see the salt state converted to the actual command that is being run?

I have a state like
django.syncdb:
  module.run:
    - settings_module: mvod.dev_settings
    - bin_env: /home/vagrant/virtualenv/
    - migrate: True
    - require:
      - pip: mvod
      - mysql_grants: mvod_user_grants
      - file: /tmp/mvod.log
The docs aren't very specific about what exactly this does, though it does seem to do what I expect, namely run the command django-admin.py syncdb --settings=mvod.dev_settings --migrate from inside the directory /home/vagrant/virtualenv.
It actually fails to do this, since the bin_env path /home/vagrant/virtualenv/ needs to be set to /home/vagrant/virtualenv/bin/django-admin.py.
However, I ran this in an environment where Django wasn't installed, so I'd expect it to fail. Instead, the state returned Result: True, and the only output was: Is a directory
I eventually figured out that I have to replace the line bin_env: /home/vagrant/virtualenv/ with bin_env: /home/vagrant/virtualenv/bin/django-admin.py, since that's what I was trying to call.
Bottom line: I would have figured it out much sooner if I'd had a way of turning the state into the exact command being executed.
So is there a quick way to do this?
You can run the minion as salt-minion --log-level=debug and then execute the state. It will show you what commands are being executed by salt on the system based on your state file.

How to properly debug a binary generated by `go test -c` using GDB?

The go test command has support for the -c flag, described as follows:
-c Compile the test binary to pkg.test but do not run it.
(Where pkg is the last element of the package's import path.)
As far as I understand, generating a binary like this is the way to run it interactively using GDB. However, since the test binary is created by combining the source and test files temporarily in some /tmp/ directory, this is what happens when I run list in gdb:
Loading Go Runtime support.
(gdb) list
42 github.com/<username>/<project>/_test/_testmain.go: No such file or directory.
This means I cannot happily inspect the Go source code in GDB like I'm used to. I know it is possible to force the temporary directory to stay by passing the -work flag to the go test command, but then it is still a huge hassle since the binary is not created in that directory and such. I was wondering if anyone found a clean solution to this problem.
Go 1.5 has been released, and there is still no officially sanctioned Go debugger. I haven't had much success using GDB for effectively debugging Go programs or test binaries. However, I have had success using Delve, a non-official debugger that is still undergoing development: https://github.com/derekparker/delve
To run your test code in the debugger, simply install delve:
go get -u github.com/derekparker/delve/cmd/dlv
... and then start the tests in the debugger from within your workspace:
dlv test
From the debugger prompt, you can single-step, set breakpoints, etc.
Give it a whirl!
Unfortunately, this appears to be a known issue that's not going to be fixed. See this discussion:
https://groups.google.com/forum/#!topic/golang-nuts/nIA09gp3eNU
I've seen two solutions to this problem.
1) create a .gdbinit file with a set substitute-path command to
redirect gdb to the actual location of the source. This file could be
generated by the go tool but you'd risk overwriting someone's custom
.gdbinit file and would tie the go tool to gdb which seems like a bad
idea.
2) Replace the source file paths in the executable (which are pointing
to /tmp/...) with the location they reside on disk. This is
straightforward if the real path is shorter than the /tmp/... path.
This would likely require additional support from the compiler /
linker to make this solution more generic.
It spawned this issue on the Go Google Code issue tracker, to which the decision ended up being:
https://code.google.com/p/go/issues/detail?id=2881
This is annoying, but it is the least of many annoying possibilities.
As a rule, the go tool should not be scribbling in the source
directories, which might not even be writable, and it shouldn't be
leaving files elsewhere after it exits. There is next to nothing
interesting in _testmain.go. People testing with gdb can break on
testing.Main instead.
Russ
Status: Unfortunate
So, in short, it sucks, and while you can work around it and GDB a test executable, the development team is unlikely to make it as easy as it could be for you.
I'm still new to the golang game but for what it's worth basic debugging seems to work.
The list command you're trying to get working can be used as long as you're already at a breakpoint somewhere in your code. For example:
(gdb) b aws.go:54
Breakpoint 1 at 0x61841: file /Users/mat/gocode/src/github.com/stellar/deliverator/aws/aws.go, line 54.
(gdb) r
Starting program: /Users/mat/gocode/src/github.com/stellar/deliverator/aws/aws.test
[snip: some warnings about BinaryCache]
Breakpoint 1, github.com/stellar/deliverator/aws.imageIsNewer (latest=0xc2081fe2d0, ami=0xc2081fe3c0, ~r2=false)
at /Users/mat/gocode/src/github.com/stellar/deliverator/aws/aws.go:54
54 layout := "2006-01-02T15:04:05.000Z"
(gdb) list
49 func imageIsNewer(latest *ec2.Image, ami *ec2.Image) bool {
50 if latest == nil {
51 return true
52 }
53
54 layout := "2006-01-02T15:04:05.000Z"
55
56 amiCreationTime, amiErr := time.Parse(layout, *ami.CreationDate)
57 if amiErr != nil {
58 panic(amiErr)
This is just after running the following in the aws subdir of my project:
go test -c
gdb aws.test
As an additional caveat, it does seem very selective about where breakpoints can be placed. Seems like it has to be an expression but that conclusion is only via experimentation.
If you're willing to use tools besides GDB, check out godebug. To use it, first install with:
go get github.com/mailgun/godebug
Next, insert a breakpoint somewhere by adding the following statement to your code:
_ = "breakpoint"
Now run your tests with the godebug test command.
godebug test
It supports many of the parameters from the go test command.
  -test.bench string
        regular expression per path component to select benchmarks to run
  -test.benchmem
        print memory allocations for benchmarks
  -test.benchtime duration
        approximate run time for each benchmark (default 1s)
  -test.blockprofile string
        write a goroutine blocking profile to the named file after execution
  -test.blockprofilerate int
        if >= 0, calls runtime.SetBlockProfileRate() (default 1)
  -test.count n
        run tests and benchmarks n times (default 1)
  -test.coverprofile string
        write a coverage profile to the named file after execution
  -test.cpu string
        comma-separated list of number of CPUs to use for each test
  -test.cpuprofile string
        write a cpu profile to the named file during execution
  -test.memprofile string
        write a memory profile to the named file after execution
  -test.memprofilerate int
        if >=0, sets runtime.MemProfileRate
  -test.outputdir string
        directory in which to write profiles
  -test.parallel int
        maximum test parallelism (default 4)
  -test.run string
        regular expression to select tests and examples to run
  -test.short
        run smaller test suite to save time
  -test.timeout duration
        if positive, sets an aggregate time limit for all tests
  -test.trace string
        write an execution trace to the named file after execution
  -test.v
        verbose: print additional output