Jenkins Job DSL Plugin - Script Execution Order

When specifying multiple scripts or wildcards to execute (in the DSL Scripts field), does the plugin make any guarantees about the execution order of the scripts? As of release job-dsl-1.43, the execution order has changed (apparently as a result of the fix for JENKINS-30541). Scripts now execute in the order that they appear in the DSL Scripts field. I can't rely on this ordering when creating DSL jobs because it's based on knowing the implementation (the .each closure together with the LinkedHashSet that stores the script requests).
I would like to be able to depend on the execution order.
Is it possible to add documentation that will guarantee that scripts will be run in the same order as they appear?

Scripts are executed in the same order as specified in the DSL Scripts field. The execution order of expanded wildcards is unspecified.
See JENKINS-33081 and PR #789.
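For illustration (hypothetical paths), given a DSL Scripts field containing:
jobs/seed.groovy
jobs/teams/*.groovy
jobs/cleanup.groovy
jobs/seed.groovy runs first and jobs/cleanup.groovy runs last, but the files matched by jobs/teams/*.groovy run in an unspecified order relative to one another.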

Related

Process flow gets stuck on table creations

I'm trying to understand the Enterprise Guide process flow. As I understand it, the process flow is supposed to make it easy to run related steps in the order they need to run, so that a dependent action later in the flow can run and be up to date.
Given that understanding, I'm getting stuck trying to make the process flow work in cases where the temporary data is purged. When closing Enterprise Guide I'm warned that the project has references to temporary data, which must be the tables I created. That should be fine: the data is on the SAS server, and I wrote code to import that data into SAS.
I would expect that the data can be regenerated when I try to run an analysis that depends on it again later, but instead I get an error indicating that the input data does not exist. If I then run the code to import the data and/or join tables in each necessary place, the process flow seems to work as expected.
See the flow that I'm working with below:
I'm sure I must be missing something. Imagine I want to rerun the rightmost linear regression. Is there a way to make the process flow import the data without doing so manually for each individual table creation the first time round?
The general answer to your question is probably that you can't really do what you want directly, but you can do it indirectly.
A process flow (of which you can have many per project, don't forget) is a single set of programs/tasks/etc. that you intend to run as a group. Typically, you will run whole process flows at once rather than just individual pieces. If you have a point at which you want to pause, look at things, and then continue, you have a few choices.
One is to have a process flow that goes to that point, then a second process flow that starts from that point. You can even take your 'import data' steps out of the process flow entirely, make an 'import data' process flow, always run that first, and then run the other process flows individually as you need them. In fact, if you use the AUTOEXEC process flow, you could have the import data steps run whenever you open the project, with the imported data ready and waiting for you.
A second is to use the UI: control+click or drag a selection box on the process flow to select a group of programs to run; select the first five, say, run them, then use the 'Run branch from program...' option to run from that point on. You could also make separate 'branches' and run just one branch at a time, making each branch dependent on the input streams.
A third option would be to have different starting points for different analysis tasks, with the import data bit after that starting point. The import could be common to the starting points, using macro variables and conditional execution to go in different directions. For example, you could have a macro variable set in the first program that says which analysis you're running, and then have a conditional after the last import step (with the import steps in sequence, not in parallel like you have them) send you off to whatever analysis task the macro variable says. You could also have macro variables that indicate whether an import has already been run in the current session, so that conditional steps skip rerunning it.
Unfortunately, there's no direct way to run something and say 'run this and all of its dependencies'.

When does an action not run on the driver in Apache Spark?

I have just started with Spark and am struggling with the concept of tasks.
Can anyone please help me understand when an action (say, reduce) does not run in the driver program?
From the Spark tutorial:
"Aggregate the elements of the dataset using a function func (which takes two arguments and returns one). The function should be commutative and associative so that it can be computed correctly in parallel."
I'm currently experimenting with an application which reads a directory of 'n' files and counts the number of words.
From the web UI, the number of tasks is equal to the number of files, and all the reduce functions appear to take place on the driver node.
Can you please describe a scenario where the reduce function won't execute at the driver? Does a task always include "transformation + action", or only "transformation"?
All the actions are performed on the cluster, and the results of the actions may end up on the driver (depending on the action).
Generally speaking, the Spark code you write around your business logic is not the program that actually runs - rather, Spark uses it to create a plan which will execute your code in the cluster. The plan groups into a single task all the operations that can be done on a partition without the need to shuffle data around. Every time Spark needs the data arranged differently (e.g. after sorting), it creates a new task and a shuffle between the first and the latter tasks.
I'll take a stab at this, although I may be missing part of the question. A task is indeed always transformation(s) plus an action. The transformations are lazy and would not submit anything, hence the need for an action. You can always call .toDebugString on your RDD to see where each job split will be; each level of indentation is a new stage. I think the reduce function showing on the driver is a bit of a misnomer, as it will run first in parallel and then merge the results. So I would expect that the task does indeed run on the workers as far as it can.
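As a minimal PySpark sketch (not the asker's application; the input path and SparkContext setup are assumptions), the word count below runs its transformations and the per-partition part of each reduce inside tasks on the executors; only the small per-partition results are merged in the driver, and reduceByKey is the point where a shuffle, and therefore a new stage, is introduced:
from operator import add
from pyspark import SparkContext

sc = SparkContext(appName="wordcount")  # master comes from spark-submit or your config

counts = (
    sc.textFile("data/*.txt")              # roughly one partition (and one task) per file
      .flatMap(lambda line: line.split())  # lazy transformation, runs inside each task
      .map(lambda w: (w, 1))
      .reduceByKey(add)                    # partial sums per task, then a shuffle -> new stage
)

# each executor reduces its own partitions; the driver only merges the partial results
total = counts.map(lambda kv: kv[1]).reduce(add)
print(total)
print(counts.toDebugString().decode())     # lineage, with indentation at the stage (shuffle) boundary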

Non-instrumenting debugger for Tcl scripts using the Tcl Library and/or Tcl internals?

I would like to know whether it is possible to build a Tcl script debugger using the Tcl Library API and/or Tcl internal interfaces (that is, whether they contain sufficient data to do so). I've noticed that existing Tcl debuggers instrument Tcl scripts and work with that additional layer. My idea was to use Tcl_CreateObjTrace to trace every evaluated command and use it as a point to retrieve the call stack, locals, etc. The problem is that not all information seems to be accessible from the API at the time of evaluation. For example, I would like to know which line is currently being evaluated, but Interp has such info only for top-level evaluations (iPtr->cmdFramePtr->line is empty for procedures' bodies). Has anyone tried such an approach? Does it make any sense? Maybe I should look into the hashed entries in Interp? Any clues and opinions would be appreciated (ideally for Tcl 8.5).
Your best bet for a non-intrusive debugging system might be to try using an execution step trace (invoked for each command called during the execution of the command to which the trace is attached) together with info frame to actually get the information. Here's a simple version, attached to source so that you can watch an entire script:
proc traceinfo args {
    # Frame -2 is the frame of the command the step trace fired for;
    # its cmd entry is that command's unsubstituted source text.
    puts [dict get [info frame -2] cmd]
}
# Run traceinfo before every command executed while source is running.
trace add execution source enterstep traceinfo
source yourscript.tcl
Be prepared for plenty of output. The dictionary from info frame can have all sorts of relevant entries, such as the line number of the command and the source file it came from; the cmd entry is the unsubstituted source of the command being called (if you want the substituted version, see the relevant arguments to the trace callback, traceinfo above).

SAS Enterprise: Run multiple Process Flows at once

I can't seem to find a straightforward way to run multiple process flows at once. I can select multiple flows and right-click, but the 'Run' option disappears.
Any ideas? Programmatic or otherwise?
Assuming you mean that you want to run multiple flows sequentially, you would use an Ordered List to do that. You can include any number of programs in a process flow.
Process flows are intended to contain all of the items you want to run in one shot, so you would not normally run many entire process flows at once. You can of course run one, then run the next, if it's only a few. I don't believe you can link programs or objects from one process flow to another.
If you mean run them simultaneously, you can do that if you set your project up to allow parallel execution and your server allows it. Under File -> Project Properties -> Code Submission, checking "Allow parallel execution on the same server" lets you run multiple things at once - but be aware that each submission runs in its own distinct SAS session and doesn't have direct access to the other submissions' temporary libraries or macro variables.

Referencing information in builds specified in a run parameter [Hudson]

Day 1 of using Hudson for our CI build. Slowly but surely getting up to speed.
My question is about run parameters. I've seen that I can use them to reference a particular run of a particular project - that's all fine.
What I don't understand (and can't find any documentation on - there's nothing at Parameterized Build) is how I refer to anything in the run defined by the run parameter.
Essentially I want to reference the %BUILD_NUMBER% and %SVN_REVISION% of the run that is selected in the run parameter.
How can I do that?
Do you really need to add extra property values or extra parameters to your job?
Since BUILD_NUMBER and SVN_REVISION are already defined as environment variables (see Building a software project), you can use those in your job.
"When a Hudson job executes, it sets some environment variables that you may use in your shell script, batch command, or Ant script"
This illustrates that you already have those values at your disposal.
You can then use them to define other environment variables/properties within your shell or Ant script.
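For instance, if one of your build steps happens to invoke a Python script (a hypothetical sketch; a shell or batch step works the same way), the injected values are just ordinary environment variables:
import os

# BUILD_NUMBER and SVN_REVISION are injected by Hudson into the build step's environment.
build_number = os.environ.get("BUILD_NUMBER", "unknown")
svn_revision = os.environ.get("SVN_REVISION", "unknown")
print("Building #%s from revision %s" % (build_number, svn_revision))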
When it comes to passing a variable value from one job to another, the Parameterized Trigger Plugin should do the trick:
The parameters section can contain a combination of one or more of the following:
a set of predefined properties
properties from a properties file read from the workspace of the triggering build
the parameters of the current build
"Subversion revision": makes sure the triggered projects are built with the same revision(s) of the triggering build.
You still have to make sure those projects are actually configured to checkout the right Subversion URLs.
Note: there might be an issue with the Join Plugin, which might not work when the Parameterized Trigger is in action.
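For example, in the triggering job's 'Predefined parameters' box you could pass the values along under names of your choosing (the left-hand names below are hypothetical; the variables on the right are expanded from the triggering build's environment):
UPSTREAM_BUILD_NUMBER=$BUILD_NUMBER
UPSTREAM_SVN_REVISION=$SVN_REVISION
The triggered job then sees UPSTREAM_BUILD_NUMBER and UPSTREAM_SVN_REVISION as ordinary build parameters.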