elki-cli versus ELKI GUI: I don't get equal results

Through the terminal on Ubuntu:
db#morris:~/lisbet/elki-master/elki/target$ elki-cli -algorithm outlier.lof.LOF -dbc.parser ArffParser -dbc.in /home/db/lisbet/AllData/literature/WBC/WBC_withoutdupl_norm_v10_no_ids.arff -lof.k 8 -evaluator outlier.OutlierROCCurve -rocauc.positive yes
giving
# ROCAUC: 0.6230046948356808
and in ELKI's GUI:
Running: -verbose -dbc.in /home/db/lisbet/AllData/literature/WBC/WBC_withoutdupl_norm_v10_no_ids.arff -dbc.parser ArffParser -algorithm outlier.lof.LOF -lof.k 8 -evaluator outlier.OutlierROCCurve -rocauc.positive yes
de.lmu.ifi.dbs.elki.datasource.FileBasedDatabaseConnection.parse: 18 ms
de.lmu.ifi.dbs.elki.datasource.FileBasedDatabaseConnection.filter: 0 ms
LOF #1/3: Materializing LOF neighborhoods.
de.lmu.ifi.dbs.elki.index.preprocessed.knn.MaterializeKNNPreprocessor.k: 9
Materializing k nearest neighbors (k=9): 223 [100%]
de.lmu.ifi.dbs.elki.index.preprocessed.knn.MaterializeKNNPreprocessor.precomputation-time: 10 ms
LOF #2/3: Computing LRDs.
LOF #3/3: Computing LOFs.
LOF: complete.
de.lmu.ifi.dbs.elki.algorithm.outlier.lof.LOF.runtime: 39 ms
ROCAUC: 0.6220657276995305
I don't understand why the two ROCAUC values aren't the same.
My goal in testing this is to be confident that what I am doing is right, but that is hard when I don't get matching results. Once I see that my settings are correct, I will move on to my own experiments, which I can then trust.

Pass cli as the first command-line parameter to launch the CLI, or minigui to launch the MiniGUI. The following are equivalent:
java -jar elki/target/elki-0.6.5-SNAPSHOT.jar cli
java -jar elki/target/elki-0.6.5-SNAPSHOT.jar KDDCLIApplication
java -jar elki/target/elki-0.6.5-SNAPSHOT.jar de.lmu.ifi.dbs.elki.application.KDDCLIApplication
This will work for any class extending the class AbstractApplication.
You can also do:
java -cp elki/target/elki-0.6.5-SNAPSHOT.jar de.lmu.ifi.dbs.elki.application.KDDCLIApplication
(This loads one class fewer, but that is usually not worth the effort.)
This will work for any class that has a standard public static void main(String[]) method, as that is the standard Java invocation.
But notice that -h currently still prints 0.6.0 (January 2014); that value was not updated for the 0.6.5 interim versions and will only be bumped for 0.7.0, so the reported version number is not reliable.
As for the differences you observed: try varying k by 1. If I recall correctly, we changed the meaning of the k parameter to be more consistent across different algorithms. (They are not consistent in the literature anyway.)
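For instance, one quick check is to repeat the CLI run with the neighbourhood size shifted by one and see whether it then reproduces the GUI value (same data set and options as in your command above; whether to go up or down by one depends on the direction of the change in your release):
elki-cli -algorithm outlier.lof.LOF -dbc.parser ArffParser -dbc.in /home/db/lisbet/AllData/literature/WBC/WBC_withoutdupl_norm_v10_no_ids.arff -lof.k 9 -evaluator outlier.OutlierROCCurve -rocauc.positive yes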

Related

Memcheck (valgrind) reporting inconsistent results in different Docker hosts

I have a fairly robust CI test suite for a C++ library; these tests (around 50) run on the same Docker image but on different machines.
In one machine ("A") all the memcheck (valgrind) tests pass (i.e. no memory leaks).
In the other ("B"), all tests produce the same valgrind error below.
51/56 MemCheck #51: combinations.cpp.x ....................***Exception: SegFault 0.14 sec
valgrind: m_libcfile.c:66 (vgPlain_safe_fd): Assertion 'newfd >= VG_(fd_hard_limit)' failed.
Cannot find memory tester output file: /builds/user/boost-multi/build/Testing/Temporary/MemoryChecker.51.log
The machines are very similar; both are Intel i7.
The only difference I can think of is that one is:
A. Ubuntu 22.10, Linux 5.19.0-29, docker 20.10.16
and the other:
B. Fedora 37, Linux 6.1.7-200.fc37.x86_64, docker 20.10.23
and perhaps some configuration of docker I don't know about.
Is there some configuration of Docker that might cause the difference? Or of the kernel? Or some option in valgrind to work around this problem?
I know for a fact that on real machines (not Docker) valgrind doesn't produce any memory error.
The options I use for valgrind are always --leak-check=yes --num-callers=51 --trace-children=yes --leak-check=full --track-origins=yes --gen-suppressions=all.
Valgrind version in the image is 3.19.0-1 from the debian:testing image.
Note that this isn't an error reported by valgrind, it is an error within valgrind.
Perhaps, after all, the only difference is that the Ubuntu version of valgrind is compiled in release mode and the error is simply ignored. (That doesn't make sense, though: valgrind is identical in both cases because the Docker image is the same.)
I tried removing --num-callers=51 or setting it to 12 (the default value), to no avail.
I found a difference between the images and the real machine and a workaround.
It has to do with the number of file descriptors.
(This was pointed out briefly in one of the valgrind bug threads about Mac OS: https://bugs.kde.org/show_bug.cgi?id=381815#c0.)
Inside the docker image running in Ubuntu 22.10:
ulimit -n
1048576
Inside the docker image running in Fedora 37:
ulimit -n
1073741816
(which looks like a ridiculous number or an overflow)
On the real Fedora 37 and Ubuntu 22.10 machines:
ulimit -n
1024
So, doing this in the CI recipe, "solved" the problem:
- ulimit -n # reports current value
- ulimit -n 1024 # workaround needed by valgrind in docker running in Fedora 37
- ctest ... (with memcheck)
I have no idea why this workaround works.
For reference:
$ ulimit --help
...
-n the maximum number of open file descriptors
First off, "you are doing it wrong" with your Valgrind arguments. For CI I recommend a two stage approach. Use as many default arguments as possible for the CI run (--trace-children=yes may well be necessary but not the others). If your codebase is leaky then you may need to check for leaks, but if you can maintain a zero leak policy (or only suppressed leaks) then you can tell if there are new leaks from the summary. After your CI detects an issue you can run again with the kitchen sink options to get full information. Your runs will be significantly faster without all those options.
Back to the question.
Valgrind is trying to dup() some file (the guest exe, a temp file or something like that). The fd that it gets is higher than what it thinks the nofile rlimit is, so it asserts.
A billion files is ridiculous.
Valgrind will try to call prlimit RLIMIT_NOFILE, with a fallback call to rlimit, and a second fallback to setting the limit to 1024.
To really see what is going on you need to modify the Valgrind source (m_main.c, setup_file_descriptors, set the local show to True). With this change I see
fd limits: host, before: cur 65535 max 65535
fd limits: host, after: cur 65535 max 65535
fd limits: guest : cur 65523 max 65523
Otherwise with strace I see
2049 prlimit64(0, RLIMIT_NOFILE, NULL, {rlim_cur=65535, rlim_max=65535}) = 0
2049 prlimit64(0, RLIMIT_NOFILE, {rlim_cur=65535, rlim_max=65535}, NULL) = 0
(all the above on RHEL 7.6 amd64)
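(For reference, strace output like that can be produced with an invocation along the following lines; this is my reconstruction, not the exact command used above:)
strace -f -e trace=prlimit64 valgrind /bin/true 2>&1 | grep RLIMIT_NOFILE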
EDIT: Note that the above shows Valgrind querying and setting the resource limit. If you use ulimit to lower the limit before running Valgrind, then Valgrind will try to honour that limit. Also note that Valgrind reserves a small number (8) of files for its own use.

HTCondor - Partitionable slot not working

I am following the tutorial from the Center for High Throughput Computing and the "Introduction to Configuration" section of the HTCondor website to set up a partitionable slot. Before any configuration I run
condor_status
and get the following output.
I update the file 00-minicondor in /etc/condor/config.d by adding the following lines at the end of the file.
NUM_SLOTS = 1
NUM_SLOTS_TYPE_1 = 1
SLOT_TYPE_1 = cpus=4
SLOT_TYPE_1_PARTITIONABLE = TRUE
and reconfigure
sudo condor_reconfig
Now with
condor_status
I get this output, as expected. Now I run the following command to check that everything is fine:
condor_status -af Name Slotype Cpus
and find slot1#ip-172-31-54-214.ec2.internal undefined 1 instead of slot1#ip-172-31-54-214.ec2.internal Partitionable 4 61295, which is what I would expect. Moreover, when I try to submit a job that asks for more than 1 CPU (a sketch of such a submit file is shown below, after the extra info), it does not allocate resources for it (the job stays waiting forever), although it should.
I don't know if I made some mistake during the installation process or what could be happening. I would really appreciate any help!
EXTRA INFO: In case it helps, I have installed HTCondor with the command
curl -fsSL https://get.htcondor.org | sudo /bin/bash -s -- --no-dry-run
on Ubuntu 18.04 running on an old p2.xlarge instance (it has 4 cores).
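Illustratively, a minimal submit description that asks for more than one CPU might look like this (the executable and values here are placeholders, not my actual job):
executable   = /bin/sleep
arguments    = 300
request_cpus = 2
queue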
UPDATE: After rebooting the whole thing it seems to be working. I can now send jobs with different CPUs requests and it will start them properly.
The only issue that I would say persists is that the memory allocation is not displayed properly, for example:
But in reality it is allocating enough memory for the job (in this case around 12 GB).
If I run again
condor_status -af Name Slotype Cpus
I still get something I am not supposed to
But at least it is showing the correct number of CPUs (even if it just says undefined).
What is the output of condor_q -better when the job is idle?

Score-P callpath depth limitation of 30 exceeded

I am profiling a code with Scalasca 2.0 that uses some recursion.
When I run the analyzer with scalasca -analyze myexec, it does not raise any error until the end, where it says:
Score-P callpath depth limitation of 30 exceeded.
Reached callpath depth was 34
At this point, the Scalasca results are corrupted and I cannot run cube over the produced output files.
I know for sure that the recursion depth (the number of self-calls) won't be greater than 34.
I have read that there is a variable that controls the number of "measured call-paths" (see https://www.dkrz.de/Nutzerportal-en/doku/blizzard/program-analysis/profiling), so I also tried to run scalasca with export ESD_FRAMES=40, but Scalasca still says the limit is 30.
So, is there a way to raise this Scalasca limit to a higher value?
I am writing my answer 2 months after you posted the question, so chances are you have already found a solution.
In Score-P 1.4+ it can be fixed with:
export SCOREP_PROFILING_MAX_CALLPATH_DEPTH=128
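For example, exported in the shell before repeating the run from the question (myexec being the same placeholder executable as above):
export SCOREP_PROFILING_MAX_CALLPATH_DEPTH=128
scalasca -analyze myexec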

How can I find the default MaxPermSize when -XX:+PrintFlagsFinal is not supported?

I'm working with a system where a number of jobs, implemented as Java applications, can be started simultaneously. Each job runs in a separate JVM.
Some of these jobs require a bigger permgen size than others. However, it is not feasible to let every job use the maximum value, as the OS memory is limited.
Therefore, I want to specify -XX:MaxPermSize for every job. Currently, the jobs are running without any -XX:MaxPermSize argument, so they must be using the default value. But how can I find out what the default value is?
I have seen Default values for Xmx, Xms, MaxPermSize on non-server-class machines where the accepted answer is to run java -XX:+PrintFlagsFinal, which should output the default values. However, the JVM version I'm running does not support that argument (Unrecognized VM option '+PrintFlagsFinal'). Updating to a newer JVM is not currently an option.
So what are my options for finding the default value?
System information:
> java -version
Java(TM) SE Runtime Environment (build 1.6.0_14-b08)
Java HotSpot(TM) 64-Bit Server VM (build 14.0-b16, mixed mode)
> cat /etc/issue
Welcome to SUSE Linux Enterprise Server 11 SP2 (x86_64) - Kernel \r (\l).
> uname -r
3.0.101-0.7.17-default
Default values for the various regions depend on:
The collector being used (which depends on the Java version, unless you specify it explicitly using CLI args).
The heap sizes etc. that you have specified using CLI args; the GC distributes the space according to certain ratios.
The installed (or perhaps available) RAM on the machine.
How to find out:
From GC log files (-Xloggc:gc.log), I would expect that at least the Full GC entries report the perm gen sizes; see the examples at the bottom, and the sample flags sketched after them. You can take a representative GC log file, find the maximum perm gen size from it, and decide based on that.
Additional params like PrintFlagsFinal etc. (specific to Java version)
I'll look through the 1.6 options to see if I can find something and update the post, otherwise it's time to upgrade. :-)
Here are 3 examples from different GCs (Metaspace, CMS Perm and PSPermGen are what you're looking for):
2014-11-14T08:43:53.197-0500: 782.973: [Full GC (Ergonomics) [PSYoungGen: 54477K->0K(917504K)] [ParOldGen: 1042738K->367416K(1048576K)] 1097216K->367416K(1966080K), [Metaspace: 46416K->46389K(1091584K)], 0.4689827 secs] [Times: user=3.52 sys=0.07, real=0.47 secs]
2014-10-29T06:14:56.491-0400: 6.754: [Full GC2014-10-29T06:14:56.491-0400: 6.754: [CMS: 96098K->113997K(5242880K), 0.7076870 secs] 735545K->113997K(6186624K), [CMS Perm : 13505K->13500K(51200K)], 0.7078280 secs] [Times: user=0.69 sys=0.01, real=0.71 secs]
2014-10-29T21:13:33.140-0500: 2644.411: [Full GC [PSYoungGen: 2379K->0K(695296K)] [ParOldGen: 1397977K->665667K(1398272K)] 1400357K->665667K(2093568K) [PSPermGen: 106995K->106326K(262144K)], 1.2151010 secs] [Times: user=6.83 sys=0.09, real=1.22 secs]
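A possible way to produce such a log on a HotSpot 1.6 JVM is sketched below (flag availability can differ between builds, and yourjob.jar is just a placeholder):
java -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log -jar yourjob.jar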

Gameboy emulator testing strategies?

I'm writing a gameboy emulator, and am struggling with making sure opcodes are emulated correctly. Certain operations set flag registers, and it can be hard to track whether the flag is set correctly, and where.
I want to write some sort of testing framework, but thought it'd be worth asking here for some help. Currently I see a few options:
Unit test each and every opcode with several test cases. The issue is that there are 256 8-bit opcodes and 50+ (can't remember the exact number) 16-bit opcodes. This would take a long time to do properly.
Write some sort of logging framework that logs a trace at each operation and compares it to the output of other established emulators. This would be pretty quick to do, and allows a fairly rapid overview of what exactly went wrong. The log file would look a bit like this:
...
PC = 212 Just executed opcode 7c - Register: AF: 5 30 BC: 0 13 HL: 5 ce DE: 1 cd SP: ffad
PC = 213 Just executed opcode 12 - Register: AF: 5 30 BC: 0 13 HL: 5 ce DE: 1 cd SP: ffad
...
The cons are that I need to modify the source of another emulator to output the same format, and there's no guarantee the opcode is correct, since this assumes the other emulator is.
What else should I consider?
Here is my code if it helps: https://github.com/dbousamra/scalagb
You could use already-established test ROMs. I would recommend Blargg's test ROMs. You can get them from here: http://gbdev.gg8.se/files/roms/blargg-gb-tests/.
To me the best idea is the one you already mentioned:
Take an existing emulator that is well known and for which you have the source code; let's call it the master emulator.
Take some ROMs that you can use for testing.
Test these ROMs in the emulator that is known to work well.
Modify the master emulator so that, while running, it produces a log line for each opcode it executes.
Do the same in your own emulator.
Compare the output.
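As a sketch of that comparison step, once both emulators write one line per executed opcode in the same format, a plain diff is usually enough to find the first divergence (the trace file names here are just placeholders):
diff master_trace.log my_trace.log | head -n 20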
I think this approach has several advantages:
You will have the log file from a good emulator.
The outcome of the test can be evaluated much faster.
You can use more than one reference emulator.
You can go deeper later, e.g. adding memory contents to the log and looking at the differences between the two implementations.