Jenkins build fails when user logs off node - c++

Sometimes, builds done by Jenkins (1.461) will stop at a random spot somewhere in the middle. These builds are manually scripted calls to Visual Studio 2008 SP1's devenv.com for primarily C++ code. Visual Studio emits no error messages; the last message in devenv's log is some random file being built. The Jenkins build fails because of a post-build Windows batch command that relies on some of the build outputs. This happens fairly rarely (roughly 1 in 15 builds). Jenkins's error log shows nothing out of the ordinary around the time the build fails. Surprisingly, it says the build succeeded, even though it shows it as failed everywhere else.
The problem is isolated to Jenkins. The same build script run at a developer's desk has never failed in this way.
The Jenkins nodes are 32-bit Windows XP machines. They all have ample available disk space. Jenkins is configured to only run one job at a time per node. The event logs show no obviously bad things (e.g., Visual Studio crashes) happening at the times when the builds stop.
Does anyone have any ideas of things to look into to troubleshoot this?

I don't recall ever having this problem with Jenkins myself, but I have regular linker crashes in MSVC 2008. This happens almost every day for me. If it is the linker crashing, that could be an explanation (perhaps a linker crash is not logged?).
Edit:
We use MSVC2008 SP1 on 32-bit Win7.
We use MSBuild 3.5 when building the c++ solutions.

We ended up correlating the random build failures with logoff events on the Jenkins nodes. This led to this JVM bug/feature (Oracle Java bug ID 6871190), where a logoff event in Windows causes the signal handlers to terminate the JVM. You can disable this behavior (perhaps with other downsides) by passing the -Xrs option to the JVM, but that option will not automatically propagate to child Java processes.
We are passing -Xrs to kick off Jenkins itself, and the Jenkins service itself lives through a logoff. The current hypothesis is that some part of Jenkins's build process is kicked off through another Java child process that is not invoked with -Xrs.
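To illustrate the propagation issue, here is a minimal sketch of how a parent JVM could copy -Xrs onto the command line of a child JVM it starts. The class ChildJvmCommand and its methods are hypothetical names made up for this example, not part of Jenkins; it assumes the parent's -Xrs flag shows up in RuntimeMXBean.getInputArguments().

```java
import java.lang.management.ManagementFactory;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not Jenkins code): builds a child JVM command line
// that carries -Xrs over from the parent JVM's arguments.
final class ChildJvmCommand {
    private ChildJvmCommand() {}

    /** Returns true if the given JVM arguments contain -Xrs. */
    static boolean hasXrs(List<String> jvmArgs) {
        return jvmArgs.contains("-Xrs");
    }

    /**
     * Builds the child command line. -Xrs is inserted right after the
     * launcher when the parent had it and the child arguments do not.
     */
    static List<String> build(String javaExe, List<String> parentJvmArgs, List<String> childArgs) {
        List<String> cmd = new ArrayList<>();
        cmd.add(javaExe);
        if (hasXrs(parentJvmArgs) && !childArgs.contains("-Xrs")) {
            cmd.add("-Xrs");
        }
        cmd.addAll(childArgs);
        return cmd;
    }

    /** JVM options the current process was started with (excludes main() arguments). */
    static List<String> currentJvmArgs() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    /** Path to the java launcher of the running JVM (on Windows, ProcessBuilder resolves java.exe). */
    static String currentJavaExe() {
        return Paths.get(System.getProperty("java.home"), "bin", "java").toString();
    }
}
```

A caller would pass the result to ProcessBuilder, for example new ProcessBuilder(ChildJvmCommand.build(ChildJvmCommand.currentJavaExe(), ChildJvmCommand.currentJvmArgs(), Arrays.asList("-jar", "agent.jar"))).start(), where agent.jar is a placeholder. This only helps for child JVMs whose launch code you control.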

Related

Unable to delete EXE after it crashes even though process not shown in Task Manager

I have a program that I have written that crashes and I'm in the process of debugging it.
However, the issue is that when I attempt to create a new build, very frequently (but not always!) I get the message:
Cannot open file 'TheExecutable.exe'
I am then unable to delete, rename, move, or in any way modify the executable until the system is rebooted. Attempting to do so in Windows explorer gives
The action can't be completed because the file is open in TheExecutable.exe Close the file and try again.
This behavior isn't unique to the particular crash I'm dealing with right now, nor the particular program. Development is becoming a headache as every attempt to debug will now take several minutes to reboot and bring all my tooling back up.
What, if anything, can I do to prevent Windows from "locking" the executable in such a fashion?
No running process for that executable is visible in Task Manager
Full details of build system:
Windows 10
Intel Compiler, 19.1.0.166 Build 20191121
nmake
C++14
Your process is not being terminated all the way. Since it is not listed in Task Manager, you can use PsKill (from Sysinternals) to end it manually.
Open PowerShell or a console as administrator and run
pskill name_of_executable
and it should terminate it so you can re-run it.

Multi processor compilation of VS2015 produces "not enough quota is available to process this command" when running cl.exe

We are running automated Jenkins builds on Amazon servers (Windows Server 2012 R2) for a few Visual Studio solutions. The bigger projects in them are configured with the /MP, use multi processor compilation, attempting to minimize build time.
We run msbuild with its /m flag as well.
Problem is that after a few minutes we get:
TRACKER : error TRK0002: Failed to execute command: ""C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin\CL.exe" #C:\Users\Administrator\AppData\Local\Temp\tmpd19a7e5e426f4ec7baa597ed75516fd4.rsp". Not enough quota is available to process this command.
This problem occurs only when running Jenkins agent. With Visual Studio IDE everything is fine. When running MSBuild from command prompt everything is fine as well.
Any idea why that happens and how we can work around it?
Running
WMIC CPU Get DeviceID,NumberOfCores,NumberOfLogicalProcessors
got
DeviceID NumberOfCores NumberOfLogicalProcessors
CPU0 2 4
Maybe VS2015 does not detect the number of effective processors correctly and crosses some process boundary of spawning too many processes concurrently?
Any help would be greatly appreciated.
The problem was a quota defined in WinRM, as described here: https://msdn.microsoft.com/en-us/library/ee309367(v=vs.85).aspx
We changed the value of MaxProcessesPerShell to be higher than the default of 25 and voila.
The facile answer from technet to "create more virtual memory" is actually probably worth a bash.
Jenkins, at the time it launches MSBuild, appears to use Runtime.exec() (check out Launcher.java if you're having a look at their code).
This typically creates a new process, that, at least initially, has the same memory footprint as the launching process. So for a bit of time you're probably running twice as much memory as necessary for the build process. So having some more virtual memory available at launch time may be enough to allow the new process to run long enough for the launching process to finish and free.
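For readers unfamiliar with how a Java process starts an external tool, here is a minimal sketch using ProcessBuilder, which is what Runtime.exec() delegates to in OpenJDK. ChildProcessRunner is a hypothetical class written for this example; it is not Jenkins's Launcher code and says nothing about how much memory the child actually needs.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.util.List;

// Hypothetical helper (not Jenkins code): runs an external command,
// captures its combined output, and reports the exit code.
final class ChildProcessRunner {
    private ChildProcessRunner() {}

    /** Exit code and combined stdout/stderr of a finished child process. */
    static final class Result {
        final int exitCode;
        final String output;

        Result(int exitCode, String output) {
            this.exitCode = exitCode;
            this.output = output;
        }
    }

    static Result run(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        // Merge stderr into stdout so one reader drains everything and the
        // child cannot block on a full stderr pipe.
        pb.redirectErrorStream(true);
        Process process = pb.start();

        StringBuilder out = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), Charset.defaultCharset()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                out.append(line).append('\n');
            }
        }
        // The output stream has reached end-of-file; now collect the exit code.
        return new Result(process.waitFor(), out.toString());
    }
}
```

In a build job the command list would be something like the MSBuild executable followed by the solution path and flags; a failure to create the process surfaces as an IOException from start().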
Here's Technet's (cough) useful description of the error message: https://technet.microsoft.com/en-us/library/cc958981.aspx

How to cleanup CommitLog files using Cassandra in Unit Tests

I've got a problem that's very similar to the one listed here:
How to cleanup embedded cassandra after unittest?
In short, I'm firing up Cassandra to run some integration tests, but when my test classes run they fail as they are unable to delete the CommitLog files that are produced by Cassandra.
I'm following the suggestions in the answer given there, which are to perform cleanup on startup, but at that time the files still cannot be deleted (if I debug through the code, I am also unable to delete the files at that time via the command line or the GUI). The result is that my first test class passes, but all subsequent ones fail.
Further details:
My colleagues are running on OSX and do not have this problem; I am on Windows 7.
I've tried running the tests under DOS and Cygwin, as well as through Eclipse, in all cases both as my local user and as Administrator.
I've used Process Explorer to confirm that nothing apart from a single Java process has a handle on the file in question.
I've debugged through the code right down to the native call in java.io.Win32FileSystem, which is unable to delete the file.
Is there anything I can do to ensure Cassandra has shut down and/or deleted its CommitLog file?
Thanks!

How do I use RestartManager to restart explorer.exe with Windows Installer custom action?

I have an installer that prompts users to restart their computer after an install. I would rather not have the user restart their computer in this case, and have explorer.exe just restart itself using the RestartManager API provided with Windows Vista and up.
I've created a separate executable that gets copied to the local computer during install and runs after that. The separate executable registers explorer.exe, shuts it down, and restarts it based on this code: http://msdn.microsoft.com/en-us/library/aa373681%28v=VS.85%29.aspx. When the executable is run separately from the installer, it works as designed. But when it runs as a custom action as part of an MSI package created with InstallShield, it shuts down explorer.exe but does not restart it.
I always get a 160 error code for RmRestart when it runs with the installer. The docs say it's an error code meaning there were invalid arguments provided. (http://msdn.microsoft.com/en-us/library/aa373665%28v=vs.85%29.aspx). I'm fairly positive that my arguments are not invalid as they work when the executable runs separately from Windows Installer.
I'm stuck at this point and not sure what else to do to get this working. The only thing I'm uncertain of is if "0" can be a proper session handle returned from RmStartSession() with error code of 0 (Success). Assuming this was wrong, I set up my executable to also take in the RmSessionKey that's created by Windows Installer before InstallValidate. And I use that to call my executable as a deferred action. I get an error of 4c3 for RmShutdown in this case, which seems to be an invalid error code.
Cliffs: I have a separate .exe that uses the RestartManager API to shut down and restart explorer.exe. It works when not run with Windows Installer, but breaks when combined with it. I'm seeing an error code of 160 for RmRestart() and have run out of ideas to try. I can provide code snippets if people want...
Thanks for any suggestions/comments.
I ended up reaching a solution to this...
Rather than creating a separate executable that registers explorer.exe and shuts it down, create an MSI DLL Custom Action. All this DLL has to have is a single function that registers explorer.exe to be restarted and uses the existing restart manager session provided by Windows Installer (by default). Then in your installer, add the MsiFilesInUse dialog and you'll be good to go.
Now when the installer runs, it starts the restart manager session, and calls your MSI DLL CA, and adds explorer.exe to the list. The list gets displayed and the user is given options to close or defer closing of the applications.
Using this method allows you to avoid having to distribute a pointless executable to the user, as well as simplifies the amount of code written greatly.

visual c++ 2010 C++ build problems

When compiling any C++ project with Visual Studio 2010 Express I'm liable to get the following behaviour: the build started message appears in the output window, CPU climbs to near 100%, multiple MSBuild.exe processes are spawned, there is a long pause (several minutes) with nothing happening, and then the build aborts with the following message
xxx.vcxproj : error MSB4014: The build stopped unexpectedly because of an internal failure.
xxx.vcxproj : error MSB4014: Microsoft.Build.Exceptions.BuildAbortedException: Build was canceled. MSBuild.exe could not be launched as a child node as it could not be found at the location "c:\WINDOWS\Microsoft.NET\Framework\v4.0.30319\MSBuild.exe". If necessary, specify the correct location in the BuildParameters, or with the MSBUILD_EXE_PATH environment variable.
xxx.vcxproj : error MSB4014: at Microsoft.Build.BackEnd.NodeManager.AttemptCreateNode(INodeProvider nodeProvider, NodeConfiguration nodeConfiguration)
xxx.vcxproj : error MSB4014: at Microsoft.Build.BackEnd.NodeManager.CreateNode(NodeConfiguration configuration, NodeAffinity nodeAffinity)
xxx.vcxproj : error MSB4014: at Microsoft.Build.Execution.BuildManager.PerformSchedulingActions(IEnumerable`1 responses)
xxx.vcxproj : error MSB4014: at Microsoft.Build.Execution.BuildManager.HandleNewRequest(Int32 node, BuildRequestBlocker blocker)
xxx.vcxproj : error MSB4014: at Microsoft.Build.Execution.BuildManager.IssueRequestToScheduler(BuildSubmission submission, Boolean allowMainThreadBuild, BuildRequestBlocker blocker)
Microsoft have acknowledged a bug with this behaviour when your user name is 20 chars, but mine is much shorter. And needless to say I do have msbuild.exe in the right place.
The only work around I've found is to task switch to another app while the build is taking place. But I'm hoping someone has a better workaround.
MTIA
John
You've already eliminated KB2298853. Do make sure to install SP1. It is not the only cause; some other users have this problem too, even after the workaround. The basic failure appears to be a problem creating a pipe that lets msbuild talk to the IDE, which is why it doesn't fail when you run msbuild from the command line.
This is an environmental problem, as yet undiagnosed. You need to chase down the reason the execution environment is unusual on your machine. Do so by selectively disabling or killing processes. Start with your anti-malware software. Also, start another instance of Visual Studio and use Tools + Attach to Process to attach an unmanaged debugger to the first instance and/or msbuild. Use Debug + Break All and then Debug + Windows + Modules to find out what DLL might be injected in the process that is not made by Microsoft. Pay attention to the Path column. Not sure if Attach to Process is available on the Express edition btw.
It's actually just the behavior shield in Avast that is causing the problems with Visual Studio.
If you turn that off when trying to build, it will build and run. Now we just need either Microsoft or Avast to create an update which will eliminate this problem. Just discovered this 10 mins ago, at 2:50 pm 12/3/2011 Central Standard Time in Wisconsin.
I can confirm that this problem was also solved for me by turning off Avast's behavior shield. Something is definitely not right with that. Endless MSBuild.exe (and conime.exe) processes and devenv.exe maxing out the CPU. Seems to be a real system killer there by Avast. :-(
In my case simply restarting VS2010 helped fix the error. Anyone encountering this error might want to try that first.
(I am using VS 2010 with SP1)