How to make a Build system that turns features off and on based on pull requests? - c++

I'm looking for a build system for a c++ project I have on gitlab. I want to create a build similar to the linux Kernel config that allows features to be turned off or on before building and I'd like all the options to be based on the feature branches that I have merged in.
Example:
At time t=0, I have merged in features 1,2, and 3.
At time t=1, I want to create a realease using only features 1 and 3.
All my features are separated into merge requests. My current approach would be to create a script that makes a patch and removes unselected features. Some alternatives include declaring preprocessor directives in my code for each feature.
I'd like to know if such a tool already exists or if there are general best practices I should follow.

This is what is known as a feature toggle.
https://martinfowler.com/bliki/FeatureToggle.html

Related

Does Vowpal Wabbit support adding and removing features during training?

I would seem to think that it is fine at supporting adding features while training by just expanding the weight vector, and from a few tests it looks like it does exactly that.
I am also aware that the feature names are hashed by VW and therefore I was thinking that it is possible to remove features while training as well, but I cannot seem to confirm this in the code and have been having trouble testing via indices and weight values.
Is there a definitive answer on these issues?
If you train with --save_resume, you can continue the training (loading the previously trained model with -i) with different training data (with more features) and/or with different command line options for generating more features (e.g. --quadratic or --interactions). This is equivalent to adding features "while training".
There are also some reductions which automatically add more features during training. Currently, I am only aware of --stage_poly, but maybe there are more.

Advice for Web-based Remote Build System

I'm interested in setting up a remote build system at work, initially for internal use, potentially for some customers going forward. We need to compile library code on several different machines (PC, Mac) and with multiple compilers, and it can be a real pain trying to get access to a full set. This is not our main build system, which is Jenkins-based and uses an approach that is not easily modified for the purpose envisaged here.
The idea would be that you could post your source to a website with some basic build parameters, it would compile the code and you could then download the generated code. Ideally users could pick which version of the underlying software they compiled their libraries against. I envisage it being supported by a virtual machine.
Reason I'm posting is that I don't really want to roll-my-own as much as possible - longer term it has maintenance implications - and would prefer something as pre-existing as possible. Obviously one would expect some adaptation in terms of scripting.
Any suggestions? It would have to be supported on Mac and PC at absolute minimum.
This sounds like something you could do by creating a parameterized Jenkins job (the build params given as input to your web frontent could be passed on to the job, perhaps via the Jenkins API). Personally, I would see if you could skip the step of creating a new webfrontend, and have users pass their build params directly to Jenkins.
To support downloading the resulting compiled code, you could have the Jenkins job archive the build as an artifact. Users could then download the files from the result page for that individual build.
As for how to make a Jenkins job accept source code to compile as input, perhaps you could use branches in your CM system? Your users could push their code to a branch, and then pass the branch as a build param. Otherwise, you might be able to use the file parameter feature of Jenkins.

GNU Parallel host sticky jobs

I am writing a parallel build farm to build C++ cross-platform applications against various platforms / environments. Every time new code is pushed to a git repo, I build and test the latest code against all the platforms.
I've setup parallel to correctly distribute the jobs among several hosts using the --sshlogin option.
I transfer files, collect output and results. It's all working more than fine and I love the tool.
The build time being sometimes quite long for some platforms, I would like the build to be as incremental as possible.
My only issue is that the build is only incremental if the scheduler sends the jobs to the same machine and reuse the artefacts of the previous build on this specific host.
Say I have 3 hosts, I have 1 chance in 3 for the build to be incremental. If a hosts hasn't built this platform in a while, it might take a long time.
Is it possible to gain control over the host a specific input source will run on and only fallback to the other hosts if the host is busy?
Ideally, I would love to see a tag system where I tag input source with a name and tag several hosts with a name, creating pools of jobs and pools of machines specialized into that type of build.
But a very simple implementation where the input sources are distributed in the same order as the order the sshlogins are defined could be a simple & quick fix in my situation.
I tried to find the source code to implement it myself but I only see doc generation when I browse the code on Savannah.
Any ideas?
Thanks,
M
There is currently no support for prioritizing a given argument to a given sshlogin. The source code is at https://savannah.gnu.org/git/?group=parallel
Feel free to join the mailing list and discuss the idea: https://lists.gnu.org/mailman/listinfo/parallel
The only priority in the code is when a job has failed on an sshlogin, then GNU Parallel prefers to retry that job on another sshlogin. Maybe that could be extended?
If a job is marked as having failed -1 time for a given sshlogin, then GNU Parallel ought to prefer to run the job on that sshlogin.
I've been trying to discuss this idea on the mailing list as you suggested but never had any respone in more than 10 days... I guess you must be busy with other things at the moment. So I went along and forked the source code to make the necessary changes and make my solution work.
I pushed it there a week ago:
http://michakfromparis.github.io/gnu-parallel-sticky/
the source code is available on github here:
https://github.com/michaKFromParis/gnu-parallel-sticky
Wasn't exactly easy without any guidance as the source code has a lot of history so I tried to keep the changes surgical to ease merge of your future releases.
I've been using it in production for more than a week now and it works perfectly in my configuration.
It is also compatible with older formats, should be a drop-in replacement for usual parallel uses with extra features on the side.
Would love to get feedback from other users though as it might not be completely dry.
Thanks for sharing the original source code.
Best Regards,
M

buildbot vs hudson/jenkins for C++ continuous integration

I'm currently using jenkins/hudson for continuous integration a large mostly C++ project. We have separate projects for trunk and every branch. Also, there are some related projects for the Java code, but the setup for those are fairly basic right now (we may do more later though). The C++ projects do the following:
Builds everything with options for whether to reconfigure, do a clean build, or use a fresh checkout
Optionally builds and runs all tests
Optionally runs all tests using Valgrind's memcheck
Runs cppcheck
Generates doxygen documentation
Publishes reports: unit tests, valgrind, cppcheck, compiler warnings, SLOC, open tasks, and code coverage (using gcov, gcovr, and the cobertura plugin)
Deploys code nightly or on demand to a test environment and a package repository
Everything is configurable for automatic builds and optional for on demand builds. Underneath, there's a bash script that controls much of this, which farther depends on our build system, which uses automake and autoconf along with custom bash scripts.
We started using Hudson (at the time) because that's what the Java guys were using and we just wanted nightly builds. Since then, we've added a lot more and continue to add more. In some ways Hudson is great, but certainly isn't ideal.
I've looked at other solutions and the only one that looks like it could be a replacement is buildbot. Would buildbot be better for this situation? Is the investment worth it since we're already using Hudson? Why?
EDIT: Someone asked why I haven't found Hudson/Jenkins to be ideal. The short answer is that everything can be improved. I'm simply wondering if Jenkins is the best current solution for my use case or whether there is something better (buildbot?) that would be easier to maintain in the long run even as new requirements come up.
Both are open source projects, but you do not need to change buildbot code to "extend" it, it is actually quite easy to import your own packages in its configuration in which you can sub-class most of the features with your own additions. Examples: your own compilation or test code, some parsing of outputs/errors to be given to the next steps, your own formating of alert emails etc. there are lots of possibilities.
Generally I would say that buildbot is the most "general purpose" automatic builds tools. Jenkins however might be the best related to running tests, especially for parsing and presenting results in nice ways (results, details, charts.. some clicks away), things that buildbot does not do "out-of-the-box". I'm actually thinking of using both to have sexier test result pages.. :-)
Also as a rule of thumb it should not be difficult to create a new tool's config: if the specification of what to do (configs, builds, tests) is too hard to switch from one tool to another, it is a (bad) sign that not enough configuration scripts are moved to the sources. Buildbot (or Jenkins) should only call simple commands. If it is simple to run tests, then developers will do it as well and this will improve the success rate, whereas if only the continuous integration system runs the tests, you will be running after it to fix the new code failures, and will loose its non-regression value, just my 0.02€ :-)
Hope it'll help.
The 'result integration' is also in jenkins/hudson, and you can relatively easily capture build products without having to 'copy them elsewhere'.
For our instance, the coverage reports and unit test metrics and javadoc for the java code is all integrated. For our C++ code, the plugins are a little lacking, but you can still get most of it.
we ran buildbot since pre 0.7, and are now running 0.8 and are only now seeing any real reason to switch, as buildbot 0.8 forgot about windows slaves for an extended period of time and the support was pretty poor.
There are many other solutions out there, besides Jenkins/Hudson/BuildBot:
TeamCity by Jetbrains
Bamboo by Atlassian
Go by Thoughtworks
Cruise Control
OpenMake Meister
The specifics about what you are doing are not so important, in fact, as long as the agents (aka nodes) that you are doing them on support those tasks.
The beauty of a CI server is noticing when the build changes to trigger a new build (and test), publish the artifacts, and publish test results.
When you compare CI tools like those we mentioned, consider features like the usability of its interface, how easy is branching (and features it might offer like automatic merging), notifications (like XMPP/jabber), or an information-radiator (like hooking up a monitor to always show status). Product support is another thing to consider - Jenkins' support is only as good as who is responding to community questions at the time you have questions.
My personal favorite is Bamboo, but it comes with a license fee.
I'm a long-time Jenkins user in the middle of evaluating Buildbot and would like to offer a few items for folks considering using Buildbot for multi-module solutions:
*) Buildbot doesn't have any out-of-the-box concept of file artifacts related to each build. It's not in the UI and it's not in any of the builtin "steps" modules as far as I can see:
http://docs.buildbot.net/current/manual/configuration/buildsteps.html
...and I see no third party plugin:
https://github.com/buildbot/buildbot/wiki/PluginList#steps
Buildbot does collect all the console output from a given build, but critically, you can't collect files related to it.
*) Given that artifacts are not supported, it's not easy to create "collector" projects that bring multiple modules into say, a single installer. Jenkins has a great feature that lets you parameterize a build with builds from other modules (the parameter type is a run).
*) Establishing dependencies between modules is trickier in Buildbot. Say you have a library that three binaries depend on, and you want those binaries to rebuild each time the library changes. Jenkins has triggers built into the UI. If you want to do triggers in Buildbot you have to script them using schedulers.Dependent, and it causes a lot of item congestion in the Schedulers UI.
*) When you're working in Buildbot, it seems that pretty much all of the configuration is done in master.cfg in code. This is awesome and frustrating.
*) Buildbot forces you to create a worker in addition to a master server. This is annoying for beginners and systems for which a single build server is sufficient.
My impression after two days of Buildbot evaluation is that we'll stick with Jenkins, primarily due to it having artifacts. Buildbot is a tool we'd only use if we had more extensive customization needs, and the time to do it.
On the subject of buildbot and artifacts -- I don't have enough user score to make a comment -- you can get artifacts from buildbot 2.x series pretty easy with built-in file/directory upload actions. However you rarely want to just move files. Typically you make a triggered buildstep that does deployment directly off the worker for best results. eg push to cloud storage, containers, thirdparty (steam uploads), etc.
This way you can get metrics on the uploads and conditionally control them better (or even mix and match artifacts across worker machines).

Continuous build infrastructure recommendations for primarily C++; GreenHills Integrity

I need your recommendations for continuous build products for a large (1-2MLOC) software development project. Characteristics:
ClearCase revision control
Approx 80% C++; 15% Java; 5% script or low-level
Compiles for Green Hills Integrity OS, but also some windows and JVM chunks
Mostly an embedded system; also includes some UI pieces and some development support (simulation tools, config tools, etc...)
Each notional "version" of the deliverable includes deployment images for a number of boards, UI machines, etc... (~10 separate images; 5 distinct operating systems)
Need to maintain/track many simultaneous versions which, notably, are built for a variety of different board support packages
Build cycle time is a major issue on the project, need support for whatever features help address this (mostly need to manage a large farm of build machines, I guess..)
Operates in a secure environment (this is a gov't program) (Edited to add: This is a classified program; outsourcing the build infrastructure is a non-starter.)
Interested in any best practices or peripheral guidance you might offer. The build automation issues is one of several overlapping best practices that appear to be missing on the program, but try to keep your answers focused on build infrastructure piece and observations directly related.
Cost is not the driving concern. Scalability and ease of retrofitting onto an existing infrastructure are key.
(Edited to address #Dan's comment. ;-)
From my experience with similar systems, there are approximately two parts to this problem:
A repeatable method for checking out sources, building the software, and testing it (if you want to do continual testing as well as building), using a small number of command-line invocations.
A means of calling these command lines on various servers in the build farm.
For the latter, we've been using BuildBot, which seems to work pretty well.
For the former, we have a homegrown solution that started out as a simple bash shell script and grew ... rather substantially. From experience, I'd suggest starting out in python rather than bash -- you'll spend far more code in handling setup and configuration than in actually invoking programs. (Also, it's probably easier to run it on Windows if you're doing that.)
The things I've found to be really key in our script's usefulness are:
Ironclad repeatability. We have a standard set of build tools, and the scripts start out by scrubbing environment variables. There are very few command-line options; everything goes into configuration files, and those go in version control.
Logging. We produce a log of every command that the build script executes.
Configuration file inheritance. Each variant of our software gets a configuration file, and those files can include more-general settings (which include even-more-general settings).
Extensibility. When we add a new source component, it's pretty easy to add a set of instructions for building that component (and the instructions can be arbitrary bash code). The "can be arbitrary code" part is probably key here; no way is a pre-existing product going to be able to do all of the quirky things that you need for a large complex real-world system.
You can get started with a reasonably simple script and let it grow organically as the need arises; honestly, although ours is a bit messy, I think we got a much more usable result that way than we would have with heavy top-down design.
Cost isn't an object? I've worked for GreenHills, and they've solved these issues for their in-house build/test farms. Ask them to do the same for you.
When I see emphasis on things like scalability and security in a build system, I start thinking that you might be a candidate for the enterprise class build systems / CI systems. Conveniently, it sounds like you can afford them as well. A year old SD Times article provides a basic breakdown between the enterprise and team level build tools.
My company makes AnthillPro and we've worked with a number of companies on large embedded projects as well as highly secure projects. IBM is probably the largest other player in the space with BuildForge.
AnthillPro puts some extra emphasis on what you do with the images in the minutes/hours/days post build (do you install them onto simulators / hardware and run automated tests? stage them? promote them?) but we also see folks using it for just build.