gcovr compare two reports - c++

I find it really difficult when refactoring huge codebase, changing tests etc. to find on my code coverage report what lines I'm no longer covering.
Is there any tool/way to get diff from two reports?

Gcovr currently has no support for showing coverage changes. This gives you roughly the following two options:
You can use gcovr's JSON output and write your own script to compare the results from multiple different runs.
You can use a different tool. For example, Henry Cox has created a fork of the lcov tool with interesting support for differential coverage. Many third party tools can show coverage changes, for example the Codecov.io service. If you're running some CI system, there might already be a feature of plugin to highlight coverage changes.
As an example of a script to process gcovr's JSON output, the following would alert you of lines that lost coverage:
#!/usr/bin/env python
import sys, json
def main(baseline_json: str, alternative_json: str):
baseline = load_covered_lines(baseline_json)
alternative = load_covered_lines(alternative_json)
found_any_differences = False
# compare covered lines for each source file
for filename in sorted(baseline):
difference = baseline[filename] - alternative[filename]
if not difference:
print(f"info: {filename}: ok")
continue
found_any_differences = True
for lineno in sorted(difference):
print(f"error: {filename}: {lineno}: not covered in {alternative_json}")
if found_any_differences:
sys.exit(1)
def load_covered_lines(gcovr_json_file: str) -> dict[str, set[int]]:
# JSON format is documented at
# <https://gcovr.com/en/stable/output/json.html#json-format-reference>
with open(gcovr_json_file) as f:
data = json.load(f)
# The JSON format may change between versions
assert data["gcovr/format_version"] == "0.3"
covered_lines = dict()
for filecov in data["files"]:
covered_lines[filecov["file"]] = set(
linecov["line_number"]
for linecov in filecov["lines"]
if linecov["count"] != 0 and not linecov["gcovr/noncode"]
)
return covered_lines
if __name__ == "__main__":
main(*sys.argv[1:])
Example usage:
./old-configuration
gcovr --json baseline.json
find . -name '*.gcda' -exec rm {} + # delete old coverage data
./new-configuration
gcovr --json alternative.json
python3 diffcov.py baseline.json alternative.json
Example output:
error: changed.c: 2: not covered in alternative.json
info: same.c: ok
You could create a similar script that also opens the original source files to create an annotated report with all coverage changes, similar to the textual reports generated by gcov.

Related

Running locally blastn against nt db thru python script

I have a fasta file with sequence that I want to blast locally to 'nt' database dowloaded on my computer from ncbi website
I dowloaded blast 2.6.0.
In order to access blast from anywhere, I did:
gedit ~/.bashrc
export PATH=/usr/local/ncbi-blast-2.6.0+/bin:$PATH
then I did:
source ~/.bashrc
Then I downloaded 'nt' database (155.6GB) and stored it in /usr/local/blastdb
I want to run in python script this command:
from Bio.Blast.Applications import NcbiblastnCommandline
cline = NcbiblastnCommandline(query="/home/proprietaire/Desktop/JADE/stage_scripts/seq_error_fasta.fasta", db="/usr/local/blastdb/nt", evalue=0.001, out="blast_result_local.xml", outfmt=5)
But it is not working for a reason. Please help me figure out what I'm doing wrong. Thank you for your help.
EDIT:
'seq_error_fasta.fasta' : is my fasta file with 64 sequences that I want to blast to 'nt' database.
My 'seq_error_fasta.fasta' contains sequence loaded with error like S, J, X so I want to blast them to 'nt' db in order to get the closest better sequence
I found out that I need to format the nt database dowloaded from ncbi so I did this:
makeblastdb -dbtype nucl -in nt
Then I added this after my cline variable in my python script:
stdout, stderr = cline()
The script is running but unfortunately I'm getting this error now:
Bus error (core dumped)
I think it's a ram memory problem so I thought that I need to shorten 'nt' db by taking only the bacteria sequence. I looked on NCBI for a whole bacteria only database but there is multiple database of different species like more then a thousand.
I also tried blast online using this script:
f = open('output_blast.xml','w')
for rec in SeqIO.parse(open("seq_error_fasta.fasta"), 'fasta):
result_handle = NCBIWWW.qblast("blastn", "nt", rec.format("fasta"), format_type="XML", alignments=1, perc_ident=95, expect= 0.001)
f.write(result_handle.read())
f.close()
but this only doing one query sequence and returning all hits, althought I specified 1 hit and 95% of identity.
This is driving me crazy lollll Please help
Download NCBI nr database:
$ mkdir db
$ cd db
$ wget ftp://ftp.ncbi.nlm.nih.gov/blast/db/nr*
Run blast:
import io
import shlex
import subprocess
BLAST_OUTFMT6 = """\
'6 qacc sacc pident length mismatch gapopen qstart qend sstart send evalue bitscore qseq sseq'\
"""
BLAST_OUTFMT6_COLUMN_NAMES = [
'query_id', 'subject_id', 'pc_identity', 'alignment_length', 'mismatches', 'gap_opens',
'q_start', 'q_end', 's_start', 's_end', 'evalue', 'bitscore', 'qseq', 'sseq',
]
def blastp(sequence, db, evalue=0.001, max_target_seqs=100000):
system_command = (
'blastp -db {db} -outfmt {outfmt} -evalue {evalue} -max_target_seqs {max_target_seqs}'
.format(db=db, outfmt=BLAST_OUTFMT6, evalue=evalue, max_target_seqs=max_target_seqs)
)
cp = subprocess.Popen(
shlex.split(system_command),
stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
universal_newlines=True)
result, error_message = cp.communicate(sequence)
if error_message.strip():
print("Error: {}".format(error_message))
return result
if __name__ == '__main__':
result = blastp('AAAAAAAAAAAAAA', db='/path/to/db/nr')
print(result)

Django-extension runscript No (valid) module for script

I'm trying to create a script that will populate my model families with informations extracted from a text file.
This is my first post in StackOverflow, please be gentle, sorry if the question is not well expressed or not correctly formatted.
Django V 1.9 and running on Python 3.5
Django-extensions installed
This is my model: it's in an app called browse
from django.db import models
from django_extensions.db.models import TimeStampedModel
class families(TimeStampedModel):
rfam_acc = models.CharField(max_length=7)
rfam_id = models.CharField(max_length=40)
description = models.CharField(max_length=75)
author = models.CharField(max_length=50)
comment = models.CharField(max_length=500)
rfam_URL = models.URLField()
Here I have my script familiespopulate.py. Positioned in the PROJECT_ROOT/scripts directory.
import csv
from browse.models import families
file_path = "/Users/work/Desktop/StructuRNA/website/scripts/RFAMfamily12.1.txt"
def run(file_path):
listoflists = list(csv.reader(open(file_path, 'rb'), delimiter='\t'))
for row in listoflists:
families.objects.create(
rfam_acc=row[0],
rfam_id=row[1],
description=row[3],
author=row[4],
comment=row[9],
)
When from Terminal i run:
python manage.py runscript familiespopulate
it returns:
No (valid) module for script 'familiespopulate' found
Try running with a higher verbosity level like: -v2 or -v3
The problem must be in importing the model families, I'm new to django, and I cannot find any solution here on StackOverflow or anywhere else online.
This is why I ask for your help!
Do you know how the model should be imported?
Or... Am I doing something else wrong.
Important piece of information is that the script runs if I modify it to PRINT out the parameters, instead of creating an object in families.
For your information and curiosity I will also post here an extract of the textfile that I'm using.
RF00001 5S_rRNA 1302 5S ribosomal RNA Griffiths-Jones SR, Mifsud W, Gardner PP Szymanski et al, 5S ribosomal database, PMID:11752286 38.00 38.00 37.90 5S ribosomal RNA (5S rRNA) is a component of the large ribosomal subunit in both prokaryotes and eukaryotes. In eukaryotes, it is synthesised by RNA polymerase III (the other eukaryotic rRNAs are cleaved from a 45S precursor synthesised by RNA polymerase I). In Xenopus oocytes, it has been shown that fingers 4-7 of the nine-zinc finger transcription factor TFIIIA can bind to the central region of 5S RNA. Thus, in addition to positively regulating 5S rRNA transcription, TFIIIA also stabilises 5S rRNA until it is required for transcription. NULL cmbuild -F CM SEED cmcalibrate --mpi CM cmsearch --cpu 4 --verbose --nohmmonly -T 24.99 -Z 549862.597050 CM SEQDB 712 183439 0 0 Gene; rRNA; Published; PMID:11283358 7946 0 0.59496 -5.32219 1600000 213632 305 119 1 -3.78120 0.71822 2013-10-03 20:41:44 2016-04-21 23:07:03
This is the first line and the result of the extraction from the listoflists is :
RF00002
5_8S_rRNA
5.8S ribosomal RNA
Griffiths-Jones SR, Mifsud W
5.8S ribosomal RNA (5.8S rRNA) is a component of the large subunit of the eukaryotic ribosome. It is transcribed by RNA polymerase I as part of the 45S precursor that also contains 18S and 28S rRNA. Functionally, it is thought that 5.8S rRNA may be involved in ribosome translocation [2]. It is also known to form covalent linkage to the p53 tumour suppressor protein [3]. 5.8S rRNA is also found in archaea.
Try adding empty file __init__.py (double underscore) into your /scipts folder and run with:
python manage.py runscript scipts.familiespopulate
Apart from adding init.py you are not supposed to pass any parameters in the run method.
def run():
<your code goes here>
Thanks for the useful comments.
I modified my code in this way:
import csv
from browse.models import families
def run():
file_path = "/Users/work/Desktop/StructuRNA/website/scripts/RFAMfamily12.1.txt"
listoflists = list(csv.reader(open(file_path, 'r'),delimiter='\t'))
print(listoflists)
for row in listoflists:
families.objects.create(
rfam_acc=row[0],
rfam_id=row[1],
description=row[3],
author=row[4],
comment=row[9],
)
This is all. Now it worked smoothly.
I want to confirm to everyone that my file: familiespopulate.py was in the folder script with the file init.py
The problem seemed to be resolved when I put
file_path = "/Users/work/Desktop/StructuRNA/website/scripts/RFAMfamily12.1.txt"
Inside the run function, removing the parameter file_path from run(file_path).
Another modify to my code was the argument r inside open(file_path, 'r'), before it was open(file_path, 'rb') that should corrispond to read binary.
I was also getting exactly the same error, I tried all of the solution above but unfortunately did not worked for me. Then I realized my mistake, and I found it.
Inside the script file (which is inside the script/ folder) I used different name for the function, which should be named as 'run'. So, make sure you checked it as well, if you get this error.
Here you can read more about "runscript"

Create version number variations for info.plist using #define and clang?

Years ago, when compiling with GCC, the following defines in a #include .h file could be pre-processed for use in info.plist:
#define MAJORVERSION 2
#define MINORVERSION 6
#define MAINTVERSION 4
<key>CFBundleShortVersionString</key> <string>MAJORVERSION.MINORVERSION.MAINTVERSION</string>
...which would turn into "2.6.4". That worked because GCC supported the "-traditional" flag. (see Tech Note TN2175 Info.plist files in Xcode Using the C Preprocessor, under "Eliminating whitespace between tokens in the macro expansion process")
However, fast-forward to 2016 and Clang 7.0.2 (Xcode 7.2.1) apparently does not support either "-traditional" or "-traditional-cpp" (or support it properly), yielding this string:
"2 . 6 . 4"
(see Bug 12035 - Preprocessor inserts spaces in macro expansions, comment 4)
Because there are so many different variations (CFBundleShortVersionString, CFBundleVersion, CFBundleGetInfoString), it would be nice to work around this clang problem, and define these once, and concatenate / stringify the pieces together. What is the commonly-accepted pattern for doing this now? (I'm presently building on MacOS but the same pattern would work for IOS)
Here is the Python script I use to increment my build number, whenever a source code change is detected, and update one or more Info.plist files within the project.
It was created to solve the issue raised in this question I asked a while back.
You need to create buildnum.ver file in the source tree that looks like this:
version 1.0
build 1
(you will need to manually increment version when certain project milestones are reached, but buildnum is incremented automatically).
NOTE the location of the .ver file must be in the root of the source tree (see SourceDir, below) as this script will look for modified files in this directory. If any are found, the build number is incremented. Modified means source files changes after the .ver file was last updated.
Then create a new Xcode target to run an external build tool and run something like:
tools/bump_buildnum.py SourceDir/buildnum.ver SourceDir/Info.plist
(make it run in ${PROJECT_DIR})
and then make all the actual Xcode targets dependent upon this target, so it runs before any of them are built.
#!/usr/bin/env python
#
# Bump build number in Info.plist files if a source file have changed.
#
# usage: bump_buildnum.py buildnum.ver Info.plist [ ... Info.plist ]
#
# andy#trojanfoe.com, 2014.
#
import sys, os, subprocess, re
def read_verfile(name):
version = None
build = None
verfile = open(name, "r")
for line in verfile:
match = re.match(r"^version\s+(\S+)", line)
if match:
version = match.group(1).rstrip()
match = re.match(r"^build\s+(\S+)", line)
if match:
build = int(match.group(1).rstrip())
verfile.close()
return (version, build)
def write_verfile(name, version, build):
verfile = open(name, "w")
verfile.write("version {0}\n".format(version))
verfile.write("build {0}\n".format(build))
verfile.close()
return True
def set_plist_version(plistname, version, build):
if not os.path.exists(plistname):
print("{0} does not exist".format(plistname))
return False
plistbuddy = '/usr/libexec/Plistbuddy'
if not os.path.exists(plistbuddy):
print("{0} does not exist".format(plistbuddy))
return False
cmdline = [plistbuddy,
"-c", "Set CFBundleShortVersionString {0}".format(version),
"-c", "Set CFBundleVersion {0}".format(build),
plistname]
if subprocess.call(cmdline) != 0:
print("Failed to update {0}".format(plistname))
return False
print("Updated {0} with v{1} ({2})".format(plistname, version, build))
return True
def should_bump(vername, dirname):
verstat = os.stat(vername)
allnames = []
for dirname, dirnames, filenames in os.walk(dirname):
for filename in filenames:
allnames.append(os.path.join(dirname, filename))
for filename in allnames:
filestat = os.stat(filename)
if filestat.st_mtime > verstat.st_mtime:
print("{0} is newer than {1}".format(filename, vername))
return True
return False
def upver(vername):
(version, build) = read_verfile(vername)
if version == None or build == None:
print("Failed to read version/build from {0}".format(vername))
return False
# Bump the version number if any files in the same directory as the version file
# have changed, including sub-directories.
srcdir = os.path.dirname(vername)
bump = should_bump(vername, srcdir)
if bump:
build += 1
print("Incremented to build {0}".format(build))
write_verfile(vername, version, build)
print("Written {0}".format(vername))
else:
print("Staying at build {0}".format(build))
return (version, build)
if __name__ == "__main__":
if os.environ.has_key('ACTION') and os.environ['ACTION'] == 'clean':
print("{0}: Not running while cleaning".format(sys.argv[0]))
sys.exit(0)
if len(sys.argv) < 3:
print("Usage: {0} buildnum.ver Info.plist [... Info.plist]".format(sys.argv[0]))
sys.exit(1)
vername = sys.argv[1]
(version, build) = upver(vername)
if version == None or build == None:
sys.exit(2)
for i in range(2, len(sys.argv)):
plistname = sys.argv[i]
set_plist_version(plistname, version, build)
sys.exit(0)
First, I would like to clarify what each key is meant to do:
CFBundleShortVersionString
A string describing the released version of an app, using semantic versioning. This string will be displayed in the App Store description.
CFBundleVersion
A string specifing the build version (released or unreleased). It is a string, but Apple recommends to use numbers instead.
CFBundleGetInfoString
Seems to be deprecated, as it is no longer listed in the Information Property List Key Reference.
During development, CFBundleShortVersionString isn't changed that often, and I normally set CFBundleShortVersionString manually in Xcode. The only string I change regularly is CFBundleVersion, because you can't submit a new build to iTunes Connect/TestFlight, if the CFBundleVersion wasn't changed.
To change the value, I use a Rake task with PlistBuddy to write a time stamp (year, month, day, hour, and minute) to CFBundleVersion:
desc "Bump bundle version"
task :bump_bundle_version do
bundle_version = Time.now.strftime "%Y%m%d%H%M"
sh %Q{/usr/libexec/PlistBuddy -c "Set CFBundleVersion #{bundle_version}" "DemoApp/DemoApp-Info.plist"}
end
You can use PlistBuddy, if you need to automate CFBundleShortVersionString as well.

Evaluating code after parsing it

I'm trying to create a tool written in Python that executes R scripts (from files), injecting values into variables before executing them and reading output variables after that.
The rinterface documentation mentions the parse function, but there is no indication about how to execute the result. The C interface contains an eval function but it doesn't seem available in Python.
Here's a very basic example of what I want to do :
import rpy2.rinterface as ri
ri.initr()
with open('script.r', 'r') as myFile:
script = myFile.read()
expr = ri.parse(script)
# prepare
ri.globalenv['input'] = ri.IntSexpVector((1, 2, 3, 4))
# execute
#??????????????????
# what to do here ?
#??????????????????
# fetch results
# The script is supposed to store results into a global var named 'output'
result = ri.globalenv['output']
Thanks
There are several ways.
One is:
from rpy2.robjects.packages import importr
base = importr('base')
base.eval(expr)

How do you shift all pages of a PDF document right by one inch?

I want to shift all the pages of an existing pdf document right one inch so they can be three hole punched without hitting the content. The pdf documents will be already generated so changing the way they are generated is not possible.
It appears iText can do this from a previous question.
What is an equivalent library (or way do this) for C++ or Python?
If it is platform dependent I need one that would work on Linux.
Update: Figured I would post a little script I wrote to do this in case anyone else finds this page and needs it.
Working code thanks to Scott Anderson's suggestion:
rightshift.py
#!/usr/bin/python2
import sys
import os
from pyPdf import PdfFileReader, PdfFileWriter
#not sure what default user space units are.
# just guessed until current document i was looking at worked
uToShift = 50;
if (len(sys.argv) < 3):
print "Usage rightshift [in_file] [out_file]"
sys.exit()
if not os.path.exists(sys.argv[1]):
print "%s does not exist." % sys.argv[1]
sys.exit()
pdfInput = PdfFileReader(file( sys.argv[1], "rb"))
pdfOutput = PdfFileWriter()
pages=pdfInput.getNumPages()
for i in range(0,pages):
p = pdfInput.getPage(i)
for box in (p.mediaBox, p.cropBox, p.bleedBox, p.trimBox, p.artBox):
box.lowerLeft = (box.getLowerLeft_x() - uToShift, box.getLowerLeft_y())
box.upperRight = (box.getUpperRight_x() - uToShift, box.getUpperRight_y())
pdfOutput.addPage( p )
outputStream = file(sys.argv[2], "wb")
pdfOutput.write(outputStream)
outputStream.close()
You can try the pypdf library. In 2022 PyPDF2 was merged back into pypdf.
two ways to perform this task in Linux
using ghostscript trough gsview
look in your /root or /home for the hidden file .gsview.ini
go to section:
[pdfwrite Options]
Options=
Xoffset=0
Yoffset=0
change the values for X axis, settling a convenient value (values are in postscript points, 1 inch = 72 postscript points)
so:
[pdfwrite Options]
Options=
Xoffset=72
Yoffset=0
close .gsview.ini
open your pdf file with gsview
file / convert / pdfwrite
select first odd pages and print to a new file (you can name this as odd.pdf)
now repeat same steps for even pages
open your pdf file with gsview
[pdfwrite Options]
Options=
Xoffset=-72
Yoffset=0
file / convert / pdfwrite
select first even pages and print to a new file (you can name this as even.pdf)
now you need to mix these two pdf with odd and even pages
you can use:
Pdf Transformer
http://sourceforge.net/projects/pdf-transformer/
java -jar ./pdf-transformer-0.4.0.jar <INPUT_FILE_NAME1> <INPUT_FILE_NAME2> <OUTPUT_FILE_NAME> merge -j
2: : use podofobox + pdftk
first step: with pdftk separate whole pdf document in two pdf files with only odd and even pages
pdftk file.pdf cat 1-endodd output odd.pdf && pdftk file.pdf cat 1-endeven output even.pdf
now with podofobox, included into podofo utils
http://podofo.sourceforge.net/about.html
podofobox file.pdf odd.pdf crop -3600 0 widht height for odd pages and
podofobox file.pdf even.pdf crop 3600 0 widht height for even pages
width and height are in postscript point x 100 and can be found with pdfinfo
e.g. if your pdf file has pagesize 482x680, then you enter
./podofobox file.pdf odd.pdf crop -3600 0 48200 68000
./podofobox file.pdf even.pdf crop 3600 0 48200 68000
then you can mix together odd and even in a unique file with already cited
Pdf Transformer
http://sourceforge.net/projects/pdf-transformer/
With pdfjam, the command to translate all pages 1 inch to the right is
pdfjam --offset '1in 0in' doc.pdf
The transformed document is saved to doc-pdfjam.pdf. For further options, type pdfjam --help. Currently pdfjam requires a Unix-like command prompt (Linux, Mac, or Cygwin). In Ubuntu, it can be installed with
sudo apt install pdfjam
Not a full answer, but you can use LaTeX with pdfpages:
http://www.ctan.org/tex-archive/macros/latex/contrib/pdfpages/
Multiple commandline linux tools also use this approach, for instance pdfjam uses this:
http://www2.warwick.ac.uk/fac/sci/statistics/staff/academic-research/firth/software/pdfjam
Maybe pdfjam can already provide what you need already.
Here is a modified version for python3.x.
First install pypdf2 via pip install pypdf2
import sys
import os
from PyPDF2 import PdfFileReader, PdfFileWriter
uToShift = 40; # amount to shift contents by. +ve shifts right
if (len(sys.argv) < 3):
print ("Usage rightshift [in_file] [out_file]")
sys.exit()
if not os.path.exists(sys.argv[1]):
print ("%s does not exist." % sys.argv[1])
sys.exit()
path=os.path.dirname(os.path.realpath(__file__))
with open(("%s\\%s" % (path, sys.argv[1])), "rb") as pdfin:
with open(("%s\\%s" % (path, sys.argv[2])), "wb") as pdfout:
pdfInput = PdfFileReader(pdfin)
pdfOutput = PdfFileWriter()
pages=pdfInput.getNumPages()
for i in range(0,pages):
p = pdfInput.getPage(i)
for box in (p.mediaBox, p.cropBox, p.bleedBox, p.trimBox, p.artBox):
box.lowerLeft = (box.getLowerLeft_x() - uToShift, box.getLowerLeft_y())
box.upperRight = (box.getUpperRight_x() - uToShift, box.getUpperRight_y())
pdfOutput.addPage( p )
pdfOutput.write(pdfout)