Writable data files in Bazel tests (C++)

I use Bazel and googletest in my C++ project. I'd like to write tests that open files containing initial data and possibly modify those files. Obviously, the tests should not overwrite the original files. My current (wrong) rules look like this:
filegroup(
    name = "test_data",
    srcs = glob(["test_data/*"]),
)

cc_test(
    name = "sample_test",
    srcs = ["sample_test.cc"],
    data = [":test_data"],
    deps = [ ... ],
)
In sample_test.cc I try to open a file from test_data/ with read-write permissions. Running bazel test //sample_test fails: open() in sample_test.cc fails with EROFS (read-only file system). Opening the files read-only works fine.
I found this: https://bazel.build/reference/test-encyclopedia#test-interaction-filesystem. It seems tests may only write to a few specific places, such as the directory named by TEST_TMPDIR. Is it possible, then, to make Bazel copy the test data files into that directory before running each test?
I guess I could create a fixture that copies the data files into the tmp directory (something like the sketch below), but that seems like a hack, and I'd have to add the logic to every test file. It would be much better to do it directly from the Bazel build files.
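For reference, this is roughly what I have in mind: a minimal sketch assuming googletest, C++17's <filesystem>, and a hypothetical data file test_data/input.bin (under Bazel the runfiles-relative path usually also includes the package name):

#include <cstdlib>
#include <filesystem>

#include "gtest/gtest.h"

namespace fs = std::filesystem;

class WritableDataTest : public ::testing::Test {
 protected:
  void SetUp() override {
    // TEST_TMPDIR is the writable scratch directory Bazel provides per test.
    const char* tmpdir = std::getenv("TEST_TMPDIR");
    ASSERT_NE(tmpdir, nullptr);
    writable_file_ = fs::path(tmpdir) / "input.bin";
    // "test_data/input.bin" is a placeholder runfiles-relative path.
    fs::copy_file("test_data/input.bin", writable_file_,
                  fs::copy_options::overwrite_existing);
    // copy_file preserves the source's read-only permission bits,
    // so explicitly make the copy writable.
    fs::permissions(writable_file_, fs::perms::owner_write,
                    fs::perm_options::add);
  }

  fs::path writable_file_;  // tests open this copy with RW permissions
};

Tests derived from such a fixture would then open writable_file_ instead of the original runfile.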

Related

Generation of jacoco.exec using Bazel not permitted in paths other than /tmp

In my BUILD.bazel my java_test looks like this:
java_test(
    name = "SomeServiceTest",
    srcs = [
        "src/test/java/com/service/SomeServiceTest.java",
    ],
    test_class = "com.service.SomeServiceTest",
    deps = [
        "SomeService",
        "@junit_junit//jar",
        "@commons_logging_commons_logging//jar",
        "@org_hamcrest_hamcrest_core//jar",
        "@com_fasterxml_jackson_core_jackson_annotations//jar",
        "@javax_servlet_javax_servlet_api//jar",
        "@org_springframework_spring_aop//jar",
        "@org_springframework_spring_beans//jar",
        "@org_springframework_spring_context//jar",
        "@org_springframework_spring_test//jar",
        "@org_springframework_spring_web//jar",
        "@org_mockito_mockito_core//jar",
        "@net_bytebuddy_byte_buddy//jar",
    ],
    size = "medium",
    jvm_flags = ["-javaagent:$$workspacepath/jacocoagent-runtime.jar=destfile=$$workspacepath/jacoco.exec"],
)
I want the paths of jacocoagent-runtime.jar and of the generated jacoco.exec to be dynamic, hence the jvm_flags setup. I defined $$workspacepath in my bazel test invocation below:
bazel test --test_output=all --action_env=workspacepath=/Users/Someone/Desktop some-service:all_tests
Now, I am getting the error below:
java.io.FileNotFoundException: /Users/Someone/Desktop/jacoco.exec (Operation not permitted)
at java.io.FileOutputStream.open0(Native Method)
at java.io.FileOutputStream.open(FileOutputStream.java:270)
at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
at org.jacoco.agent.rt.internal_290345e.output.FileOutput.openFile(FileOutput.java:67)
at org.jacoco.agent.rt.internal_290345e.output.FileOutput.writeExecutionData(FileOutput.java:53)
at org.jacoco.agent.rt.internal_290345e.Agent.shutdown(Agent.java:137)
at org.jacoco.agent.rt.internal_290345e.Agent$1.run(Agent.java:54)
If I change workspacepath to /tmp, it works fine. What is wrong with paths other than /tmp?
I agree with @Godin -- sounds like the input path is not in the sandbox. Does --spawn_strategy=standalone [1] help?
If that's indeed the problem then to fix the build with sandboxing you need to make the .jar file an input of the java_test's action and reference its path correctly from the jvm_flags.
To do that:
1. Either create a new package in your workspace and copy the Jacoco jar there, or add a new_local_repository rule to your WORKSPACE file, point it at the jar's directory, and set the build_file_content attribute to exports_files(["jacoco-runtime.jar"]).
2. Now that you can reference Jacoco by a label (e.g. @jacoco//:jacoco-runtime.jar), add it to the java_test rule's data attribute.
3. Finally, change the java_test rule's jvm_flags attribute to reference the jar using $(location <label>), e.g. $(location @jacoco//:jacoco-runtime.jar).
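Putting the three steps together, the relevant parts of the rule might look like this sketch (using the hypothetical @jacoco repository name from step 1, and /tmp for the output since the question confirms that path is writable):

java_test(
    name = "SomeServiceTest",
    srcs = ["src/test/java/com/service/SomeServiceTest.java"],
    test_class = "com.service.SomeServiceTest",
    data = ["@jacoco//:jacoco-runtime.jar"],
    jvm_flags = [
        # $(location ...) expands to the jar's runfiles path at build time.
        "-javaagent:$(location @jacoco//:jacoco-runtime.jar)=destfile=/tmp/jacoco.exec",
    ],
    deps = [ ... ],  # unchanged from the question
)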
[1] https://docs.bazel.build/versions/master/user-manual.html#flag--spawn_strategy

How to write files to current directory instead of bazel-out

I have the following directory structure:
my_dir
|
--> src
| |
| --> foo.cc
| --> BUILD
|
--> WORKSPACE
|
--> bazel-out/ (symlink)
|
| ...
src/BUILD contains the following code:
cc_binary(
    name = "foo",
    srcs = ["foo.cc"],
)
The file foo.cc creates a file named bar.txt in the usual way, using <fstream> utilities.
However, when I invoke Bazel with bazel run //src:foo, bar.txt is created in bazel-out/darwin-fastbuild/bin/src/foo.runfiles/foo/bar.txt instead of my_dir/src/bar.txt, next to the original source.
I tried adding an outs field to the foo rule, but Bazel complained that outs is not a recognized attribute for cc_binary.
I also thought of creating a filegroup rule, but there is no deps field where I can declare foo as a dependency for those files.
How can I make sure that the files generated by running the cc_binary rule are placed in my_dir/src/bar.txt instead of bazel-out/...?
Bazel doesn't allow you to modify the state of your workspace, by design.
The short answer is that you don't want the results of the past builds to modify the state of your workspace, hence potentially modifying the results of the future builds. It'll violate reproducibility if running Bazel multiple times on the same workspace results in different outputs.
Given your example: imagine calling bazel run //src:foo which inserts
#define true false
#define false true
at the top of the src/foo.cc. What happens if you call bazel run //src:foo again?
The long answer: https://docs.bazel.build/versions/master/rule-challenges.html#assumption-aim-for-correctness-throughput-ease-of-use-latency
Here's more information on the output directory: https://docs.bazel.build/versions/master/output_directories.html#documentation-of-the-current-bazel-output-directory-layout
A possible workaround is to use a genrule. Below is an example where I use a genrule to copy a file into the .git folder.
genrule(
    name = "precommit",
    srcs = glob(["git/**"]),
    outs = ["precommit.txt"],
    # The folder containing this BUILD.bazel file is `tools`, which gets
    # symlinked into the execution root; `cd -P` resolves the physical path.
    cmd = "echo 'setup pre-commit.sh' > $(OUTS) && cd -P tools && ./path/to/your-script.sh",
    local = 1,  # required
)
If you're passing the name of the output file in when running, you can simply use absolute paths. To make this easier, you can use the realpath utility on Linux; on a Mac it is included in brew install coreutils. Running it then looks something like:
bazel run my_app_dir:binary_target -- --output_file=`realpath relative/path/to.output`
This has been discussed and explained in a Bazel issue. The recommendation is to use a tool external to Bazel:
As I understand the use-case, this is out-of-scope for building and in the scope of, perhaps, workspace configuration. What I'm sure of is that an external tool would be both easier and safer to write for this purpose, than to introduce such a deep design change to Bazel.
The tool would copy the files from the output tree into the source tree, and update a manifest file (also in the source tree) that lists the path-digest pairs. The sources and the manifest file would all be versioned. A genrule or a sh_test would depend on the file-generating genrules, as well as on this manifest file, and compare the file-generating genrules' outputs' digests (in the output tree) to those in the manifest file, and would fail if there's a mismatch. In that case the user would need to run the external tool, thus update the source tree and the manifest, then rerun the build, which is the same workflow as you described, except you'd run this tool instead of bazel regenerate-autogenerated-sources.
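A rough sketch of how such a check could be wired up (all names hypothetical):

sh_test(
    name = "generated_sources_up_to_date",
    srcs = ["check_digests.sh"],  # hypothetical script, essentially `sha256sum -c manifest.txt`
    data = [
        ":generate_sources",  # the file-generating genrule
        "manifest.txt",       # the versioned path-digest pairs
    ],
)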

How to make autotools tests read files?

My autotools project has a couple of unit tests.
One of these tests (filereader) needs to read a file (data/test1.bin).
Here's my filesystem layout:
- libfoo/tests/filereader.c
- libfoo/tests/data/test1.bin
and my libfoo/tests/Makefile.am:
AUTOMAKE_OPTIONS = foreign
AM_CPPFLAGS = -I$(top_srcdir)/foo
LDADD = $(top_builddir)/src/libfoo.la
EXTRA_DIST = data/test1.bin
TESTS = filereader
check_PROGRAMS= filereader
filereader_SOURCES = filereader.c
This works great as long as I do in-tree builds.
However, when running the test suite out-of-tree (e.g. make distcheck), the filereader test cannot find the input file anymore.
This is obviously because only the source tree contains the input file, not the build tree.
I wonder what the canonical way to fix this problem is. The options I see:
- compile the directory of the test file into the unit test (AM_CPPFLAGS += -DSRCDIR=$(srcdir))
- pass the fully qualified input file as a command-line argument to the test (e.g. $(builddir)/filereader $(srcdir)/data/test1.bin)
- copy the input file from the source tree to the build tree (cp $(srcdir)/data/test1.bin $(builddir)/data/test1.bin; how would a proper make rule for that look?)
Canonically, the solution would be to compile the path of your data into the unit test, so the first option you laid out. The second one is also possible, but it requires an in-between driver script.
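For example, assuming the define is passed with escaped quotes (AM_CPPFLAGS += -DSRCDIR=\"$(srcdir)\", the same idiom the distcheck answer below uses), the test can simply concatenate string literals:

/* filereader.c -- sketch of option 1 */
#include <stdio.h>

#ifndef SRCDIR
#define SRCDIR "."  /* fall back to the in-tree layout */
#endif

int main(void) {
    FILE *f = fopen(SRCDIR "/data/test1.bin", "rb");
    if (f == NULL)
        return 1;  /* test fails when the data file is missing */
    /* ... read and verify the file ... */
    fclose(f);
    return 0;
}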
I would suggest avoiding the third one, but if you do want to go down that route, use $(LN_S) rather than cp; this way you reduce the I/O load of the test.
There is a way to do this with autoconf. From the netcdf-c configure.ac:
##
# Some files need to exist in build directories
# that do not correspond to their source directory, or
# the test program makes an assumption about where files
# live. AC_CONFIG_LINKS provides a mechanism to link/copy files
# if an out-of-source build is happening.
##
AC_CONFIG_LINKS([nc_test4/ref_hdf5_compat1.nc:nc_test4/ref_hdf5_compat1.nc])
AC_CONFIG_LINKS([nc_test4/ref_hdf5_compat2.nc:nc_test4/ref_hdf5_compat2.nc])
AC_CONFIG_LINKS([nc_test4/ref_hdf5_compat3.nc:nc_test4/ref_hdf5_compat3.nc])
AC_CONFIG_LINKS([nc_test4/ref_chunked.hdf4:nc_test4/ref_chunked.hdf4])
AC_CONFIG_LINKS([nc_test4/ref_contiguous.hdf4:nc_test4/ref_contiguous.hdf4])
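Applied to the layout in the question, that would presumably be a single line in configure.ac:

AC_CONFIG_LINKS([tests/data/test1.bin:tests/data/test1.bin])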

SCons and Boost.Test: my test project cannot link with my main project object files

I'd like to use Boost.Test for Test Driven Development.
I asked SCons to create two executables: the main one and the test one.
All my main project files are in ./src/, and all my test-dedicated files are in ./test/.
The problem is:
- the main project object files are put in ./build/src/
- the test project object files are put in ./build/test/
In such a configuration my Test executable cannot link, since the main project object files (for the classes I test) are not in the same directory.
Do you have an idea how I could tweak my SCons file so that the linking of the Test executable can use the object files in ./src/?
Below is my main.scons file:
import os

env = Environment(
    CPPPATH=['/usr/local/boost/boost_1_52_0/boost/', './src/'],
    CPPDEFINES=[],
    LIBPATH=['/usr/local/boost/boost_1_52_0/boost/libs/', '.'],
    LIBS=['boost_regex'],
    CXXFLAGS='-std=c++0x')
env['ENV']['TERM'] = os.environ['TERM']

env.Program('Main', Glob('src/*.cpp'))
#
testEnv = env.Clone()
testEnv['CPPPATH'].append('./test/')
testEnv['LIBS'].append('boost_unit_test_framework')
testEnv.Program('Test', Glob('test/*.cpp'))
While the "duplicate object lists" approach is fine for simple projects, you may run into limitations when your test code should not link against the entire object space of your main program, for example to stub out a database layer that's not the focus of a particular unit test.
As an alternative, you can create (static) libraries of common code that you link against your primary executable and your test framework.
common_sources = ['src/foo.cpp', 'src/bar.cpp']  # or use Glob and then filter
env.Library("common", common_sources)

program_sources = ['src/main.cpp']
env.Program("my_program", program_sources,
            LIBS=env['LIBS'] + ['common'])  # keep the existing libs, add libcommon
...
test_sources = Glob('test/*.cpp')  # whatever your test sources are
testEnv['LIBPATH'] = ['.']  # or wherever you build the library
testEnv.Program("unit_test", test_sources,
                LIBS=testEnv['LIBS'] + ['common'])  # boost_unit_test_framework stays in
This also avoids the duplicate main() problem that you mention because only the program_sources and test_sources lists should contain the appropriate file with the main routine.
I have continued searching and found this post on the web, which intrigued me: it uses SCons' env.Object, which returns the list of target object files.
And with slight modifications I have an SCons file that does what I wanted (my first attempt had a duplicate main() at link time; filtering out src/main.o below takes care of that):
import os

env = Environment(
    CPPPATH=['/usr/local/boost/boost_1_52_0/boost/', './src/'],
    CPPDEFINES=[],
    LIBPATH=['/usr/local/boost/boost_1_52_0/boost/libs/', '.'],
    LIBS=['boost_regex'],
    CXXFLAGS='-std=c++0x')
env['ENV']['TERM'] = os.environ['TERM']

# Here I keep track of the main project object files.
mainObjectFiles = env.Object(Glob('src/*.cpp'))
env.Program('PostgresCpp', mainObjectFiles)

testEnv = env.Clone()
testEnv['CPPPATH'].append('./test/')
testEnv['LIBS'].append('boost_unit_test_framework')

# Link the test sources against every main object file except the one
# containing main(); compare via str(), since these are SCons nodes.
testObjectFiles = Glob('test/*.cpp')
objectFilesExceptMain = [x for x in mainObjectFiles if str(x) != 'src/main.o']
testEnv.Program('Test', testObjectFiles + objectFilesExceptMain)

make distcheck and tests that need input files

I recently converted my build system to automake/autoconf. In my project I have a few unit tests that need some input data files in the directory from which they are run. When I run make distcheck and it tries the VPATH build, these tests fail because they are apparently not run from the directory where the input files are. I was wondering if there is some quick fix for this. For example, can I somehow tell the system not to run these tests on make distcheck (but still run them on make check)? Or to cd to the directory where the files are before running the tests?
I had the same problem and used a solution similar to William's. My Makefile.am looks something like this:
EXTRA_DIST = testdata/test1.dat
AM_CPPFLAGS = -DDATADIR=\"$(srcdir)/\"
Then, in my unit test, I use the DATADIR define (note the define already ends with a slash):
string path = DATADIR "testdata/test1.dat";
This works with make check and make distcheck.
The typical solution is to write the tests so that they look in the source directory for the data files. For example, you can reference $srcdir in the test, or convert test to test.in and refer to @srcdir@.
If your tests are all in the source directory, you can run all the tests in that directory by setting TESTS_ENVIRONMENT in Makefile.am:
TESTS_ENVIRONMENT = cd $(srcdir) &&
This will fail if some of your tests are created by configure and therefore live only in the build directory, in which case you can selectively cd with something like:
TESTS_ENVIRONMENT = { test $${tst} = mytest && cd $(srcdir); true; } &&
Trying to use TESTS_ENVIRONMENT like this is fragile at best; the more robust approach is to write the tests so that they look in the source directory for their data files.