I followed this tutorial to learn Clojure (https://www.youtube.com/watch?v=zFPiPBIkAcQ at around 2:26). In the last example, you program the game "Snake".
(ns ...tests.snake
(:import
(java.awt Color Dimension)
(javax.swing JPanel JFrame Timer JOptionPane)
(java.awt.event ActionListener KeyListener KeyEvent)))
...
113 (defn game-panel [frame snake apple]
114 (proxy [JPanel ActionListener KeyListener] []
115 ;JPanel
116 (paintComponent [g]
117 (proxy-super paintComponent g)
118 (paint g @apple)
119 (paint g @snake))
120 (getPreferredSize []
121 (Dimension. (* (inc field-width) point-size)
122 (* (inc field-height) point-size)))
123 ;ActionListener
124 (actionPerformed [e]
125 (update-positions snake apple)
126 (if (lose? @snake)
127 (do
128 (reset-game snake apple)
129 (JOptionPane/showMessageDialog frame "You lose")))
130 (if (win? @snake)
131 (do
132 (reset-game snake apple)
133 (JOptionPane/showMessageDialog "You win")))
134 (.repaint this))
135 (keyPressed [e]
136 (let [direction (directions (.getKeyCode e))]
137 (if direction
138 (update-direction snake direction))))
139 (keyReleased [e])
140 (keyTyped [e])))
I get an IllegalArgumentException there when using "proxy".
; Syntax error (IllegalArgumentException) compiling new at (c:\[...]\Clojure_Project\tests\snake.clj:114:3).
; Unable to resolve classname: ...tests.snake.proxy$javax.swing.JPanel$ActionListener$KeyListener$1b88ffec
I thought at first it might be related to the fact that I am passing multiple arguments, but that doesn't seem to be the problem.
I use Visual Studio Code and the "Getting Started REPL" from Calva (because I don't know how to connect another one).
I don't know - did I forget to install something or import something?
I tried to look at the code of "proxy", but due to the fact that I'm not really familiar with the programming language yet, it didn't help me much.
my code: https://github.com/shadowprincess/clojure-learning
When looking at your question initially, I thought that your ...tests.snake namespace was just something you elided and not the actual namespace name.
But given your repo, it seems that it really is the namespace name you're using.
That's invalid - you can't start a namespace name with a dot. Rename it to tests.snake and the error will go away.
Unfortunately, your code in the repo still won't work because there are many other errors, but they should be easy for you to figure out on your own. And as general advice - don't run your whole project by sending separate forms to the REPL. Learn how to launch it with a single command - that will instill good habits that will be useful even when using the REPL.
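To make the naming rule concrete, here is a rough Python sketch (a hypothetical checker, not Clojure's actual reader grammar) of the one rule that matters here: no dot-separated namespace segment may be empty, which is exactly what the leading dots in ...tests.snake produce.

```python
def valid_ns(name: str) -> bool:
    # Rough sketch only: a real Clojure symbol has more rules, but the
    # one relevant here is that no dot-separated segment may be empty.
    # "...tests.snake".split(".") -> ["", "", "", "tests", "snake"],
    # so the leading dots yield empty segments and the name is invalid.
    return all(segment != "" for segment in name.split("."))

print(valid_ns("tests.snake"))     # the suggested rename
print(valid_ns("...tests.snake")) # the namespace from the repo
```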
I've got gperftools installed and collecting data, which looks reasonable so far. I see one node that gets sampled a lot - but I'm interested in the callers to that node, and I don't see them. I've tried callgrind/kcachegrind as well; I feel like I'm missing something. Here's a snippet of the output when using --text:
Total: 1844 samples
573 31.1% 31.1% 573 31.1% US_strcpy
185 10.0% 41.1% 185 10.0% US_strstr
167 9.1% 50.2% 167 9.1% US_strlen
63 3.4% 53.6% 63 3.4% PS_CompressTable
58 3.1% 56.7% 58 3.1% LX_LexInternal
51 2.8% 59.5% 51 2.8% US_CStrEql
47 2.5% 62.0% 47 2.5% 0x40472984
40 2.2% 64.2% 40 2.2% PS_DoSets
38 2.1% 66.3% 38 2.1% LX_ProcessCatRange
So I'm interested in seeing the callers to US_strcpy, but I don't seem to have any? I do get a nice call graph from kcachegrind for 0x40472984 (still trying to match that to a symbol)
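As an aside on reading the flat --text rows above: the columns are self samples, self %, running-total %, samples including callees, % including callees, and the symbol name. A throwaway Python sketch of that reading (the row is copied from the output above):

```python
# pprof --text columns: self samples, self %, running-total %,
# samples including callees, % including callees, symbol name.
row = "573  31.1%  31.1%      573  31.1% US_strcpy"
fields = row.split()
self_samples = int(fields[0])
cum_samples = int(fields[3])
symbol = fields[5]

# self == cum here, i.e. no callee time is attributed to US_strcpy;
# the flat view never shows callers at all - that needs a graph view
# (pprof --web, kcachegrind) or pprof --traces.
print(symbol, self_samples, cum_samples)
```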
There are several ways:
a) pprof --web or kcachegrind will show you callers nicely if the data is captured correctly. It is sometimes useful to do pprof --traces (only with the github.com/google/pprof version), which is somewhat like the low-tech method that Mike mentioned above.
b) if the data is really unavailable, you have a problem with stack-trace capturing and/or symbolization. For that, build gperftools with libunwind and build all of your program with debug info.
I am using Jetty "7.6.8.v20121106" as part of https://github.com/ring-clojure/ring/tree/master/ring-jetty-adapter with my server.
I am making calls using http://http-kit.org/ with the following code. Essentially I am making server calls but ignoring the response. What I am finding is that all the server threads become blocked/deadlocked after that. This seems like a really easy way to bring the server down, and I wanted to understand what is going on here.
Code from client is:
(require '[org.httpkit.client :as hk-client])
(defn hget [id]
(hk-client/get (str "http://localhost:5000/v1/pubapis/auth/ping?ping=" id)))
(doall (map hget (take 100 (range))))             ; Gives problem
(doall (map deref (map hget (take 100 (range))))) ; Doesn't give problem
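The difference between those two lines is whether the caller ever waits on the promises that hk-client/get returns. A rough Python analogy (fake_get is a hypothetical stand-in for the HTTP call, not http-kit's API):

```python
from concurrent.futures import ThreadPoolExecutor

def fake_get(i):
    # Stand-in for the HTTP request; like http-kit, the real call
    # returns immediately and is fulfilled on a worker thread.
    return i * 2

with ThreadPoolExecutor(max_workers=4) as pool:
    # like (map hget ...): fire off the requests, keep the futures
    futures = [pool.submit(fake_get, i) for i in range(100)]
    # like (map deref ...): actually block until each one completes
    results = [f.result() for f in futures]

print(results[:3])  # [0, 2, 4]
```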
Threads blocked at
sun.nio.cs.StreamEncoder.write(StreamEncoder.java:118)
and deadlocked at
java.io.PrintStream.write(PrintStream.java:479)
Would really appreciate it if someone could help explain what is going on here.
Finally found what the problem was. It took a lot of digging through, and starting over with a sample project, to find this.
When I started learning Clojure, I copied the following from somewhere for logging:
(defn log [msg & vals]
(let [line (apply format msg vals)]
(locking System/out (println line))))
The locking in there was causing a deadlock in some situations. I do not know enough about concurrency yet to understand why; I will create a separate question for that.
Removing this line fixes the problem.
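A minimal Python sketch of why that locking is dangerous (the blocking write is simulated with a sleep; this is an illustration, not the original failure): every handler thread funnels through one lock, so if the write underneath ever blocks for good, all of them stall behind it.

```python
import threading
import time

out_lock = threading.Lock()
lines = []

def log(msg):
    # Like the Clojure log fn: all threads serialize on one output lock.
    with out_lock:
        time.sleep(0.05)  # stand-in for a blocking write to System/out
        lines.append(msg)

start = time.perf_counter()
threads = [threading.Thread(target=log, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# Eight threads at 0.05 s each, fully serialized: ~0.4 s rather than
# ~0.05 s of concurrent logging. A write that never returns would
# leave the lock held forever and every other thread blocked behind it.
print(len(lines), elapsed > 0.3)
```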
I copied a FORTRAN IV program from a thesis, so it presumably worked at the time it was written. I compiled it with gfortran. When running, it stalls in an integration subroutine. I have tried easing off the residuals but to no avail. I am asking for help because (presuming no mistakes in code) gfortran might not like the archaic 66/IV code, and updating it is outside my abilities.
The program gets stuck by line 9, so I wonder if the DO loops are responsible. Note that lines 1 and 6 are unusual to me because ',1' has been added to the ends: e.g. =1,N,1.
I don't think it's necessary to show the FUNC subroutine called on line 5 but am happy to provide it if necessary.
If you need more detailed information I am happy to provide it.
00000001 13 DO 22 TDP=QDP,7,1
00000002 TD=TDP-1
00000003 X=X0+H0
00000004 IF(TD.EQ.QD) GOTO 15
00000005 CALL FUNC(N,DY,X,Y,J)
00000006 15 DO 21 RD=1,N,1
00000007 GOTO (120,121,122,123,124,125,126),TDP
00000008 120 RK(5*N*RD)=Y(RD)
00000009 GOTO 21
00000010 121 RK(RD)=HD*DY(RD)
00000011 H0=0.5*HD
00000012 F0=0.5*RK(RD)
00000013 GOTO 20
00000014 122 RK(N+RD)=HD*DY(RD)
00000015 F0=0.25*(RK(RD)+RK(N+RD))
00000016 GOTO 20
00000017 123 RK(2*N+RD)=HD*DY(RD)
00000018 H0=HD
00000019 F0=-RK(N+RD)+2.*RK(2*N+RD)
00000020 GOTO 20
00000021 124 RK(3*N+RD)=HD*DY(RD)
00000022 H0=0.66666666667*HD
00000023 F0=(7.*RK(RD)+10.*RK(N+RD)+RK(3*N+RD))/27.
00000024 GOTO 20
00000025 125 RK(4*N+RD)=HD*DY(RD)
00000026 H0=0.2*HD
00000027 F0=(28.*RK(RD)-125.*RK(N+RD)+546.*RK(2*N+RD)+54.*RK(3*N+RD)-
00000028 1378.*RK(4*N+RD))/625.
00000029 GOTO 20
00000030 126 RK(6*N+RD)=HD*DY(RD)
00000031 F0=0.1666666667*(RK(RD)+4.*RK(2*N+RD)+RK(3*N+RD))
00000032 X=X0+HD
00000033 ER=(-42.*RK(RD)-224.*RK(2*N+RD)-21.*RK(3*N+RD)+162.*RK(4*N+RD)
00000034 1+125.*RK(6*N+RD))/67.2
00000035 YN=RK(5*N+RD)+F0
00000036 IF(ABS(YN).LT.1E-8) YN=1
00000037 ER=ABS(ER/YN)
00000038 IF(ER.GT.G0) GOTO 115
00000039 IF(ED.GT.ER) GOTO 20
00000040 QD=-1
00000041 20 Y(RD)=RK(5*N+RD)+F0
00000042 21 CONTINUE
00000043 22 CONTINUE
It's difficult to be certain (not entirely sure your snippet exactly matches your source file) but your problem might arise from an old FORTRAN gotcha -- a 0 in column 6 is (or rather was) treated as a blank. Any other (non-blank) character in column 6 is/was treated as a continuation indicator, but not the 0.
Not all f66 compilers adhered to the convention of executing a loop at least once, but it was a common (non-portable) assumption.
Similarly, the assumption of all static variables was not a portable one, but can be implemented by adding a SAVE statement, beginning with f77. A further assumption that SAVE variables will be zero-initialized is even more non-portable, but most compilers have an option to implement that.
If an attempt is being made to resurrect old code, it is probably worthwhile to get it working before modernizing it incrementally so as to make it more self-documenting. The computed GOTO looks like a relatively sane one which could be replaced by select case, at a possible expense of optimization. Here the recent uses of the term "modernization" become contradictory.
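For readers who have not met a computed GOTO: GOTO (120,121,...),TDP jumps to the TDP-th label in the list, i.e. a 1-based indexed dispatch, which is why select case is a natural replacement. A small Python sketch of the idea (the stage functions are hypothetical, not the thesis code):

```python
def stage_a():
    return "store y"

def stage_b():
    return "half step"

def stage_c():
    return "full step"

# GOTO (120,121,122),TDP: jump to the TDP-th label, 1-based.
stages = [stage_a, stage_b, stage_c]

def computed_goto(tdp):
    if not 1 <= tdp <= len(stages):
        # FORTRAN would simply fall through to the next statement.
        return None
    return stages[tdp - 1]()

print(computed_goto(2))  # half step
```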
The ,1 bits are there to get the compiler to spot errors. It is quite common to do the following
DO 10 I = 1.7
That is perfectly legal, since spaces are allowed in variable names: it assigns 1.7 to a variable named DO10I. If you wish to avoid that, put in the extra number. The following will all generate errors:
DO 10 I = 1.7,1
DO 10 I = 1,7.1
DO 10 I = 1.7.1
Re the program getting stuck, try putting a CONTINUE line between labels 21 and 22. The IF-GOTO is the same as if-not-then in later versions of Fortran, and the computed GOTO is the same as a select statement. You don't need to recode it: there is nothing wrong with it other than youngsters getting confused whenever they see GOTO. All you need to do is indent it and it becomes obvious. So what you will have is
DO 22 TDP = QDP, 7, 1
...
DO 23 RD = 1, N, 1
GOTO (...) TDP
...
GOTO 21
...
GOTO 20
...
GOTO 20
...
20 CONTINUE
Y(RD) = ...
21 CONTINUE
23 CONTINUE
22 CONTINUE
You will probably end up with far more code if you try recoding it; it will look exactly the same except that the GOTOs have been replaced by other words. It is possible that the compiler is generating the wrong code, so just help it by putting in a few dummy CONTINUE statements.
I have some Clojure code that is simulating and then processing numerical data. The data are basically vectors of double values; the processing mainly involves summing their values in various ways. I will include some code below, but my question is (I think) more general - I just don't have a clue how to interpret the hprof results.
Anyway, my test code is:
(defn spin [n]
(let [c 6000
signals (spin-signals c)]
(doseq [_ (range n)] (time (spin-voxels c signals)))))
(defn -main []
(spin 4))
where spin-voxels should be more expensive than spin-signals (especially when repeated multiple times). I can give the lower level routines, but I think this question is more about me not understanding basics of the traces (below).
When I compile this with lein and then do some simple profiling:
> java -cp classes:lib/clojure-1.3.0-beta1.jar -agentlib:hprof=cpu=samples,depth=10,file=hprof.vec com.isti.compset.stack
"Elapsed time: 14118.772924 msecs"
"Elapsed time: 10082.015672 msecs"
"Elapsed time: 9212.522973 msecs"
"Elapsed time: 12968.23877 msecs"
Dumping CPU usage by sampling running threads ... done.
and the profile trace looks like:
CPU SAMPLES BEGIN (total = 4300) Sun Aug 28 15:51:40 2011
rank self accum count trace method
1 5.33% 5.33% 229 300791 clojure.core$seq.invoke
2 5.21% 10.53% 224 300786 clojure.core$seq.invoke
3 5.05% 15.58% 217 300750 clojure.core$seq.invoke
4 4.93% 20.51% 212 300787 clojure.lang.Numbers.add
5 4.74% 25.26% 204 300799 clojure.core$seq.invoke
6 2.60% 27.86% 112 300783 clojure.lang.RT.more
7 2.51% 30.37% 108 300803 clojure.lang.Numbers.multiply
8 2.42% 32.79% 104 300788 clojure.lang.RT.first
9 2.37% 35.16% 102 300831 clojure.lang.RT.more
10 2.37% 37.53% 102 300840 clojure.lang.Numbers.add
which is pretty cool. Up to here, I am happy. I can see that I am wasting time with generic handling of numerical values.
So I look at my code and decide that, as a first step, I will replace vec with d-vec:
(defn d-vec [collection]
(apply conj (vector-of :double) collection))
I am not sure that will be sufficient - I suspect I will also need to add some type annotations in various places - but it seems like a good start. So I compile and profile again:
> java -cp classes:lib/clojure-1.3.0-beta1.jar -agentlib:hprof=cpu=samples,depth=10,file=hprof.d-vec com.isti.compset.stack
"Elapsed time: 15944.278043 msecs"
"Elapsed time: 15608.099677 msecs"
"Elapsed time: 16561.659408 msecs"
"Elapsed time: 15416.414548 msecs"
Dumping CPU usage by sampling running threads ... done.
Ewww. So it is significantly slower. And the profile?
CPU SAMPLES BEGIN (total = 6425) Sun Aug 28 15:55:12 2011
rank self accum count trace method
1 26.16% 26.16% 1681 300615 clojure.core.Vec.count
2 23.28% 49.45% 1496 300607 clojure.core.Vec.count
3 7.74% 57.18% 497 300608 clojure.lang.RT.seqFrom
4 5.59% 62.77% 359 300662 clojure.core.Vec.count
5 3.72% 66.49% 239 300604 clojure.lang.RT.first
6 3.25% 69.74% 209 300639 clojure.core.Vec.count
7 1.91% 71.66% 123 300635 clojure.core.Vec.count
8 1.03% 72.68% 66 300663 clojure.core.Vec.count
9 1.00% 73.68% 64 300644 clojure.lang.RT.more
10 0.79% 74.47% 51 300666 clojure.lang.RT.first
11 0.75% 75.22% 48 300352 clojure.lang.Numbers.double_array
12 0.75% 75.97% 48 300638 clojure.lang.RT.more
13 0.64% 76.61% 41 300621 clojure.core.Vec.count
14 0.62% 77.23% 40 300631 clojure.core.Vec.cons
15 0.61% 77.84% 39 300025 java.lang.ClassLoader.defineClass1
16 0.59% 78.43% 38 300670 clojure.core.Vec.cons
17 0.58% 79.00% 37 300681 clojure.core.Vec.cons
18 0.54% 79.55% 35 300633 clojure.lang.Numbers.multiply
19 0.48% 80.03% 31 300671 clojure.lang.RT.seqFrom
20 0.47% 80.50% 30 300609 clojure.lang.Numbers.add
I have included more rows here because this is the part I don't understand.
Why on earth is Vec.count appearing so often? It's a method that returns the size of the vector. A single line lookup of an attribute.
I assume I am slower because I am still jumping back and forth between Double and double, and that things may improve again when I add more type annotations. But I don't understand what I have now, so I am not sure blundering forwards makes much sense.
Please, can anyone explain the dump above in general terms? I promise I don't repeatedly call count - instead I have lots of maps and reduces and a few explicit loops.
I wondered if I am perhaps being confused by the JIT? Maybe I am missing a pile of information because functions are being inlined? Oh, and I am using 1.3.0-beta1 because it appears to have more sensible number handling.
[UPDATE] I summarised my experiences at http://www.acooke.org/cute/Optimising1.html - got a 5x speedup (actually was 10x after cleaning some more and moving to 1.3) despite never understanding this.
Calling seq on a Vec object (an object created by vector-of) creates a VecSeq object.
The VecSeq created on a Vec calls Vec.count in its method internal-reduce, which is used by clojure.core/reduce.
So it seems that a vector created by vector-of calls Vec.count while reducing, and since you mentioned that the code does a lot of reducing, this seems to be the cause.
What remains spooky is that Vec.count seems to be very simple:
clojure.lang.Counted
(count [_] cnt)
a simple getter that doesn't do any counting.
Just thinking out loud, it looks like your code is doing a lot of back-and-forth conversion to/from Seq.
Looking at RT.seqFrom, this calls ArraySeq.createFromObject:
if(array == null || Array.getLength(array) == 0)
return null;
So could it be that using vec allows fast vector access, while using d-vec forces the use of arrays and a call to the slow java.lang.reflect.Array.getLength method (which uses reflection ...)?
I'm trying to get started with Google Perf Tools to profile some CPU-intensive applications. It's a statistical calculation that dumps each step to a file using ofstream. I'm not a C++ expert, so I'm having trouble finding the bottleneck. My first pass gives results:
Total: 857 samples
357 41.7% 41.7% 357 41.7% _write$UNIX2003
134 15.6% 57.3% 134 15.6% _exp$fenv_access_off
109 12.7% 70.0% 276 32.2% scythe::dnorm
103 12.0% 82.0% 103 12.0% _log$fenv_access_off
58 6.8% 88.8% 58 6.8% scythe::const_matrix_forward_iterator::operator*
37 4.3% 93.1% 37 4.3% scythe::matrix_forward_iterator::operator*
15 1.8% 94.9% 47 5.5% std::transform
13 1.5% 96.4% 486 56.7% SliceStep::DoStep
10 1.2% 97.5% 10 1.2% 0x0002726c
5 0.6% 98.1% 5 0.6% 0x000271c7
5 0.6% 98.7% 5 0.6% _write$NOCANCEL$UNIX2003
This is surprising, since all the real calculation occurs in SliceStep::DoStep. The "_write$UNIX2003" (where can I find out what this is?) appears to come from writing the output file. Now, what confuses me is that if I comment out all the outfile << "text" statements and run pprof, 95% is in SliceStep::DoStep and "_write$UNIX2003" goes away. However, my application does not speed up, as measured by total time; the whole thing speeds up by less than 1 percent.
What am I missing?
Added:
The pprof output without the outfile << statements is:
Total: 790 samples
205 25.9% 25.9% 205 25.9% _exp$fenv_access_off
170 21.5% 47.5% 170 21.5% _log$fenv_access_off
162 20.5% 68.0% 437 55.3% scythe::dnorm
83 10.5% 78.5% 83 10.5% scythe::const_matrix_forward_iterator::operator*
70 8.9% 87.3% 70 8.9% scythe::matrix_forward_iterator::operator*
28 3.5% 90.9% 78 9.9% std::transform
26 3.3% 94.2% 26 3.3% 0x00027262
12 1.5% 95.7% 12 1.5% _write$NOCANCEL$UNIX2003
11 1.4% 97.1% 764 96.7% SliceStep::DoStep
9 1.1% 98.2% 9 1.1% 0x00027253
6 0.8% 99.0% 6 0.8% 0x000274a6
This looks like what I'd expect, except I see no visible increase in performance (0.1 second on a 10-second calculation). The code is essentially:
ofstream outfile("out.txt");
for loop:
SliceStep::DoStep()
outfile << 'result'
outfile.close()
Update: I am timing using boost::timer, starting where the profiler starts and ending where it ends. I do not use threads or anything fancy.
From my comments:
The numbers you get from your profiler say that the program should be around 40% faster without the print statements.
The runtime, however, stays nearly the same.
Obviously one of the measurements must be wrong. That means you have to do more and better measurements.
First I suggest starting with another simple tool: the time command. This should give you a rough idea where your time is spent.
If the results are still not conclusive you need a better testcase:
Use a larger problem
Do a warmup before measuring. Do some loops and start any measurement afterwards (in the same process).
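A small sketch of the warm-up suggestion (names are illustrative): run the workload a few times in the same process before starting the clock, then average over several timed runs.

```python
import time

def bench(fn, warmup=3, runs=5):
    # Warm up in the same process so caches (and, on a JVM, the JIT)
    # settle before any measurement starts.
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs

avg = bench(lambda: sum(range(10_000)))
print(avg >= 0.0)  # True
```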
Tiristan: It's all in user. What I'm doing is pretty simple, I think... Does the fact that the file is open the whole time mean anything?
That means the profiler is wrong.
Printing 100000 lines to the console using Python results in something like:
for i in xrange(100000):
print i
To console:
time python print.py
[...]
real 0m2.370s
user 0m0.156s
sys 0m0.232s
Versus:
time python test.py > /dev/null
real 0m0.133s
user 0m0.116s
sys 0m0.008s
My point is:
Your internal measurements and time show you do not gain anything from disabling output. Google Perf Tools says you should. Who's wrong?
_write$UNIX2003 is probably referring to the write POSIX system call, which is doing the actual output (here, to your file). I/O is very slow compared to almost anything else, so it makes sense that your program is spending a lot of time there if you are writing a fair bit of output.
I'm not sure why your program wouldn't speed up when you remove the output, but I can't really make a guess from only the information you've given. It would be nice to see some of the code, or even the perftools output when the output statements are removed.
Google perftools collects samples of the call stack, so what you need is to get some visibility into those.
According to the doc, you can display the call graph at statement or address granularity. That should tell you what you need to know.