How to direct trial output to files - ray

I'm running an Experiment with multiple trials and I would like to log the STDOUT for the process for each trial in a file in the directory created in ray_results for that trial.
I have set log_to_driver to false since the driver couldn't keep up and I have set redirect_worker_output to true but that doesn't seem to result in what I want either.
Perhaps there is an option in ray.tune.run? I have set verbose=2 but that doesn't work.

Related

AWS Lambda download a file using Chromedriver

I have a container that is built to run selenium-chromedriver with python to download an excel(.xlsx) file from a website.
I am Using SAM to build & deploy this image to be run in AWS Lambda.
When I build the container and invoke it locally, the program executes as expected: The download occurs and I can see the file placed in the root directory of the container.
The problem is: when I deploy this image to AWS and invoke my lambda function I get no errors, however, my download is never executed. The file never appears in my root directory.
My first thought was that maybe I didn't allocate enough memory to the lambda instance. I gave it 512 MB, and the logs said it was using 416MB. Maybe there wasn't enough room to fit another file inside? So I have increased the memory provided to 1024 MB, but still no luck.
My next thought was that maybe the download was just taking a long time, so I also allowed the program to wait for 5 minutes after clicking the download to ensure that the download is given time to complete. Still no luck.
I have also tried setting the following options for chromedriver (full list of chromedriver options posted at bottom):
options.add_argument(f"--user-data-dir={'/tmp'}"),
options.add_argument(f"--data-path={'/tmp'}"),
options.add_argument(f"--disk-cache-dir={'/tmp'}")
and also setting tempfolder = mkdtemp() and passing that into the chrome options as above in place of /tmp. Still no luck.
Since this applicaton is in a container, it should run the same locally as it does on AWS. So I am wondering if it is part of the config outside of the container that is blocking my ability to download a file? Maybe the request is going out but the response is not being allowed back in?
Please let me know if there is anything I need to clarify -- Any help on this issue is greatly appreciated!
Full list of Chromedriver options
options.binary_location = '/opt/chrome/chrome'
options.headless = True
options.add_argument('--disable-extensions')
options.add_argument('--no-first-run')
options.add_argument('--ignore-certificate-errors')
options.add_argument('--disable-client-side-phishing-detection')
options.add_argument('--allow-running-insecure-content')
options.add_argument('--disable-web-security')
options.add_argument('--lang=' + random.choice(language_list))
options.add_argument('--user-agent=' + fake_user_agent.user_agent())
options.add_argument('--no-sandbox')
options.add_argument("--window-size=1920x1080")
options.add_argument("--single-process")
options.add_argument("--disable-dev-shm-usage")
options.add_argument("--disable-dev-tools")
options.add_argument("--no-zygote")
options.add_argument(f"--user-data-dir={'/tmp'}")
options.add_argument(f"--data-path={'/tmp'}")
options.add_argument(f"--disk-cache-dir={'/tmp'}")
options.add_argument("--remote-debugging-port=9222")
options.add_argument("start-maximized")
options.add_argument("enable-automation")
options.add_argument("--headless")
options.add_argument("--disable-browser-side-navigation")
options.add_argument("--disable-gpu")
driver = webdriver.Chrome("/opt/chromedriver", options=options)```
Just in case anybody stumbles across this queston in future, adding the following to chrome options solved my issue:
prefs = {
"profile.default_content_settings.popups": 0,
"download.default_directory": r"/tmp",
"directory_upgrade": True
}
options.add_experimental_option("prefs", prefs)

Executing a batch file using CFEXECUTE

I tried to run test.bat file using cfexecute. It shows timeout error after loding for sometime. The output file is blank. But when i double click the test.bat file it works fine. My code is this,
<cfexecute name="C:\Windows\System32\cmd.exe" arguments="/C C:\ColdFusion2018\cfusion\wwwroot\test.bat" timeout="60" outputfile="C:\ColdFusion2018\cfusion\wwwroot\log_output1.txt"></cfexecute>
We recommend using CFX_EXEC (Windows) instead of the built-in CFExecute. When running BAT files, we've encountered many cases where we needed to run it under a separate Windows account that had privileges different than the CF Service. CFX_EXEC enabled us to specify the specific account whereas CFExecute doesn't have the option at all. We also use CFX_EXEC for performing IP/DNS look-ups as it's a lot faster than Java, honors TTL and doesn't cache the lookup results "forever".
If you want to run test.bat using cfexecute, test.bat should be the value of the name attribute, not the arguments attribute.
<cfexecute name="C:\ColdFusion2018\cfusion\wwwroot\test.bat"
timeout="60"
arguments ="whatever applies"
outputfile="C:\ColdFusion2018\cfusion\wwwroot\log_output1.txt">
</cfexecute>
Thanks for your response,
The batch file successfully executed after suppressing the 'Press any key to continue..'(pause) in the command line. It makes the cfexecute loading till timeout. That was the issue here.

weka: how to generate libsvm training parameter

I am running libsvm through weka. Its output accuracy looks good to me, so I am planning to write a svm model by myself. However, weka didn't generate any training parameter, such as number of support vector. Therefore i cannot do anything. Searching the web, i found somebody said it would generate some parameters like the following:
optimization finished, #iter = 27
nu = 0.058475864943863545
obj = -1.871013102744184, rho = -0.19357337828800944
nSV = 9, nBSV = 0 `enter code here`
Total nSV = 9
but how come i didn't see any of them? any step that i missed? please help me. Thanks a lot.
Weka writes the output you mentioned to stderr.
So if you have started weka.sh or weka.bat from a terminal (or "command window" if you are on Windows), you should see that output appear in your terminal window after clicking "classify"
If you want to have access to this information via scripts, you can
redirect the output to a file and read in that file.
Here is how to edit the startup file weka.sh / weka.bat.
Edit this line (it is probably the last line) in order to write log info to a file instead of the terminal window:
java -cp $CP -Xmx8092m weka.gui.GUIChooser 2>>/opt/weka-stable/weka.log &
You can also add a properties file to your home directory to add more fine-grained behaviour.
https://weka.wikispaces.com/Properties+file
(You probably can also access information via the Weka Java API somehow, but you did not ask for that)

How to properly debug a binary generated by `go test -c` using GDB?

The go test command has support for the -c flag, described as follows:
-c Compile the test binary to pkg.test but do not run it.
(Where pkg is the last element of the package's import path.)
As far as I understand, generating a binary like this is the way to run it interactively using GDB. However, since the test binary is created by combining the source and test files temporarily in some /tmp/ directory, this is what happens when I run list in gdb:
Loading Go Runtime support.
(gdb) list
42 github.com/<username>/<project>/_test/_testmain.go: No such file or directory.
This means I cannot happily inspect the Go source code in GDB like I'm used to. I know it is possible to force the temporary directory to stay by passing the -work flag to the go test command, but then it is still a huge hassle since the binary is not created in that directory and such. I was wondering if anyone found a clean solution to this problem.
Go 1.5 has been released, and there is still no officially sanctioned Go debugger. I haven't had much success using GDB for effectively debugging Go programs or test binaries. However, I have had success using Delve, a non-official debugger that is still undergoing development: https://github.com/derekparker/delve
To run your test code in the debugger, simply install delve:
go get -u github.com/derekparker/delve/cmd/dlv
... and then start the tests in the debugger from within your workspace:
dlv test
From the debugger prompt, you can single-step, set breakpoints, etc.
Give it a whirl!
Unfortunately, this appears to be a known issue that's not going to be fixed. See this discussion:
https://groups.google.com/forum/#!topic/golang-nuts/nIA09gp3eNU
I've seen two solutions to this problem.
1) create a .gdbinit file with a set substitute-path command to
redirect gdb to the actual location of the source. This file could be
generated by the go tool but you'd risk overwriting someone's custom
.gdbinit file and would tie the go tool to gdb which seems like a bad
idea.
2) Replace the source file paths in the executable (which are pointing
to /tmp/...) with the location they reside on disk. This is
straightforward if the real path is shorter then the /tmp/... path.
This would likely require additional support from the compiler /
linker to make this solution more generic.
It spawned this issue on the Go Google Code issue tracker, to which the decision ended up being:
https://code.google.com/p/go/issues/detail?id=2881
This is annoying, but it is the least of many annoying possibilities.
As a rule, the go tool should not be scribbling in the source
directories, which might not even be writable, and it shouldn't be
leaving files elsewhere after it exits. There is next to nothing
interesting in _testmain.go. People testing with gdb can break on
testing.Main instead.
Russ
Status: Unfortunate
So, in short, it sucks, and while you can work around it and GDB a test executable, the development team is unlikely to make it as easy as it could be for you.
I'm still new to the golang game but for what it's worth basic debugging seems to work.
The list command you're trying to work can be used so long as you're already at a breakpoint somewhere in your code. For example:
(gdb) b aws.go:54
Breakpoint 1 at 0x61841: file /Users/mat/gocode/src/github.com/stellar/deliverator/aws/aws.go, line 54.
(gdb) r
Starting program: /Users/mat/gocode/src/github.com/stellar/deliverator/aws/aws.test
[snip: some arnings about BinaryCache]
Breakpoint 1, github.com/stellar/deliverator/aws.imageIsNewer (latest=0xc2081fe2d0, ami=0xc2081fe3c0, ~r2=false)
at /Users/mat/gocode/src/github.com/stellar/deliverator/aws/aws.go:54
54 layout := "2006-01-02T15:04:05.000Z"
(gdb) list
49 func imageIsNewer(latest *ec2.Image, ami *ec2.Image) bool {
50 if latest == nil {
51 return true
52 }
53
54 layout := "2006-01-02T15:04:05.000Z"
55
56 amiCreationTime, amiErr := time.Parse(layout, *ami.CreationDate)
57 if amiErr != nil {
58 panic(amiErr)
This is just after running the following in the aws subdir of my project:
go test -c
gdb aws.test
As an additional caveat, it does seem very selective about where breakpoints can be placed. Seems like it has to be an expression but that conclusion is only via experimentation.
If you're willing to use tools besides GDB, check out godebug. To use it, first install with:
go get github.com/mailgun/godebug
Next, insert a breakpoint somewhere by adding the following statement to your code:
_ = "breakpoint"
Now run your tests with the godebug test command.
godebug test
It supports many of the parameters from the go test command.
-test.bench string
regular expression per path component to select benchmarks to run
-test.benchmem
print memory allocations for benchmarks
-test.benchtime duration
approximate run time for each benchmark (default 1s)
-test.blockprofile string
write a goroutine blocking profile to the named file after execution
-test.blockprofilerate int
if >= 0, calls runtime.SetBlockProfileRate() (default 1)
-test.count n
run tests and benchmarks n times (default 1)
-test.coverprofile string
write a coverage profile to the named file after execution
-test.cpu string
comma-separated list of number of CPUs to use for each test
-test.cpuprofile string
write a cpu profile to the named file during execution
-test.memprofile string
write a memory profile to the named file after execution
-test.memprofilerate int
if >=0, sets runtime.MemProfileRate
-test.outputdir string
directory in which to write profiles
-test.parallel int
maximum test parallelism (default 4)
-test.run string
regular expression to select tests and examples to run
-test.short
run smaller test suite to save time
-test.timeout duration
if positive, sets an aggregate time limit for all tests
-test.trace string
write an execution trace to the named file after execution
-test.v
verbose: print additional output

Condor output file updating

I'm running several simulations using Condor and have coded the program so that it outputs a progress status in the console. This is done at the end of a loop where it simply prints the current time (this can also be percentage or elapsed time). The code looks something like this:
printf("START");
while (programNeedsToRum) {
// Run code repetitive code...
// Print program status update
printf("[%i:%i:%i]\r\n", hours, minutes, seconds);
}
printf("FINISH");
When executing normally (i.e. in the terminal/cmd/bash) this works fine, but the condor nodes don't seem to printf() the status. Only once the simulation has finished, all the status updates have been outputted to the file but then it's no longer of use. My *.sub file that I submit to condor looks like this:
universe = vanilla
executable = program
output = out/out-$(Process)
error = out/err-$(Process)
queue 100
When submitted the program executes (this is confirmed in condor_q) and the output files contain this:
START
Only once the program has finished running its corresponding output file shows (example):
START
[0:3:4]
[0:8:13]
[0:12:57]
[0:18:44]
FINISH
Whilst the program executes, the output file only contains the START text. So I came to the conclusion that the file is not updated if the node executing program is busy. So my question is, is there a way of updating the output files manually or gather any information on the program's progress in a better way?
Thanks already
Max
What you want to do is use the streaming output options. See the stream_error and stream_output options you can pass to condor_submit as outlined here: http://research.cs.wisc.edu/htcondor/manual/current/condor_submit.html
By default, HTCondor stores stdout and stderr locally on the execute node and transfers them back to the submit node on job completion. Setting stream_output to TRUE will ask HTCondor to instead stream the output as it occurs back to the submit node. You can then inspect it as it happens.
Here's something I used a few years ago to solve this problem. It uses condor_chirp which is used to transfer files from the execute host to the submitter. I have a python script that executes the program I really want to run, and redirects its output to a file. Then, periodically, I send the output file back to the submit host.
Here's the Python wrapper, stream.py:
#!/usr/bin/python
import os,sys,time
os.environ['PATH'] += ':/bin:/usr/bin:/cygdrive/c/condor/bin'
# make sure the file exists
open(sys.argv[1], 'w').close()
pid = os.fork()
if pid == 0:
os.system('%s >%s' % (' '.join (sys.argv[2:]), sys.argv[1]))
else:
while True:
time.sleep(10)
os.system('condor_chirp put %s %s' % (sys.argv[1], sys.argv[1]))
try:
os.wait4(pid, os.WNOHANG)
except OSError:
break
And my submit script. The problem ran sh hello.sh, and redirected the output to myout.txt:
universe = vanilla
executable = C:\cygwin\bin\python.exe
requirements = Arch=="INTEL" && OpSys=="WINNT60" && HAS_CYGWIN==TRUE
should_transfer_files = YES
transfer_input_files = stream.py,hello.sh
arguments = stream.py myout.txt sh hello.sh
transfer_executable = false
It does send the output in its entirety, so take that in to account if you have a lot of jobs running at once. Currently, its sending the output every 10 seconds .. you may want to adjust that.
with condor_tail you can view the output of a running process.
to see stdout just add the job-ID (and -f if you want to follow the output and see the updates immediately. Example:
condor_tail 314.0 -f