Overtone sin-osc ignoring mult and add? - clojure

Overtone seems to be ignoring the mult and add arguments to sin-osc. Wanted to make sure I'm not missing something silly before filing a bug report.
This snippet should vary the amplitude from 0.6 to 1.0:
(definst sin-test [freq 440]
(* (sin-osc:kr 0.5 0 0.2 0.8)
(sin-osc:ar freq)))
Instead its clearly multiplying by -1 to 1 (going totally silent in the middle). Setting the mult on the sin-osc:ar in this snippet also has no effect.
The equivalent in straight supercollider behaves as expected:
{ SinOsc.kr(0.5,0,0.2,0.8) * SinOsc.ar(440) }.play;
I notice that the the tremolo example on the overtone getting started page does the mult and add manually, but figured they were just being explanatory:
(definst trem [freq 440 depth 10 rate 6 length 3]
(* 0.3
(line:kr 0 1 length FREE)
(saw (+ freq (* depth (sin-osc:kr rate))))))

Related

Is there a way to count from last condition x?

I have a complex set of data that can return 3 different conditions per row. I need to be able to count the last x rows matching one of the specific conditions.
The following formula has been working well for me, but I have discovered a glitch in one instance of this formula (the formula is replicated at least a dozen times)
=ArrayFormula(LOOKUP(9.99999999999999E+307,IF(FREQUENCY(IF(AQ:AQ)=1,ROW(AQ:AQ)),IF(AQ:AQ<>1,ROW(AQ:AQ)))=0,FREQUENCY(IF(AQ:AQ=1,ROW(AQ:AQ)),IF(AQ:AQ=0,ROW(AQ:AQ))))))
Current criteria are as such:
0: Condition x met - Reset counter
1: Condition y met - Increment counter
2: Condition z met - Ignore this row
Therefore this:
1
2
2
2
1
1
0
1
1
1
Should output: 3
This:
1
2
0
2
2
1
2
1
Should output: 2
However the glitch I have encountered isn't resetting the counter when 0 is reached, for example:
1
2
1
2
1
1
2
2
2
2
0
Should output: 0
But in fact is outputting: 4
I have tested all possible conditions with that specific data set and I cannot rectify the issue. I believe there is an error in the formula (specifically the 9.99999999999999E+307) but I wrote it so long ago that I cannot successfully debug it. I have tried 1E+306 but the result is the same.
EDIT1: Upon request I have included as stripped down version of the sheet as I can while recreating the issue.
https://docs.google.com/spreadsheets/d/1SOXiFMEQelqptBvjcabMZGNgG60TRRbe_b65rzT1bi0/edit?usp=sharing
If you scroll to the bottom of the sheet you can see Col AQ has a 0, as a result the value in the cell AF2 should be 0.
You will notice in the sheet that I am using Named Ranges.
EDIT2: player0's answer was PERFECT!! <3
I modified the new formula to adapt to my spreadsheet so it could accommodate Named Ranges and drop-down lists. This question helped me a lot with that:
Convert column index into corresponding column letter
The final formula (just FYI) turned out to be:
=ARRAYFORMULA(COUNTIF(
INDIRECT(REGEXEXTRACT(ADDRESS(ROW(), column(INDIRECT($A$1 & Z$1 & "L"))), "[A-Z]+")&
MAX(IF((INDIRECT($A$1 & Z$1 & "L")=0)*(INDIRECT($A$1 & Z$1 & "L")<>""),
ROW(INDIRECT($A$1 & Z$1 & "L"))+1,5))&":"&
REGEXEXTRACT(ADDRESS(ROW(), column(INDIRECT($A$1 & Z$1 & "L"))), "[A-Z]+")), 1))
=ARRAYFORMULA(COUNTIF(INDIRECT("A"&
MAX(IF((A2:A=0)*(A2:A<>""), ROW(A2:A)+1, ROW(A2)))&":A"), 1))
spreadsheet demo

How Python calculates % function can some one please explain 3%5

How Python calculates % function? can some one please explain 3%5 outcome as 3 in Python? Answer for 5%3 is also showing 3. I use python 2.7
The Python % operator isn't percentage, it's modulo. That means the remainder part of a division. Remember when you were a kid and your math problems would be like 11 divided by 3 = 3 R 2 (remainder 2)? That's what % does. 5 % 3 = 2.
If you want to calculate percentage, do that yourself like A * 100.0 / B.

Calculating the distance between characters

Problem: I have a large number of scanned documents that are linked to the wrong records in a database. Each image has the correct ID on it somewhere that says where it belongs in the db.
I.E. A DB row could be:
| user_id | img_id | img_loc |
| 1 | 1 | /img.jpg|
img.jpg would have the user_id (1) on the image somewhere.
Method/Solution: Loop through the database. Pull the image text in to a variable with OCR and check if user_id is found anywhere in the variable. If not, flag the record/image in a log, if so do nothing and move on.
My example is simple, in the real world I have a guarantee that user_id wouldn't accidentally show up on the wrong form (it is of a specific format that has its own significance)
Right now it is working. However, it is incredibly strict. If you've worked with OCR you understand how fickle it can be. Sometimes a 7 = 1 or a 9 = 7, etc. The result is a large number of false positives. Especially among images with low quality scans.
I've addressed some of the image quality issues with some processing on my side - increase image size, adjust the black/white threshold and had satisfying results. I'd like to add the ability for the prog to recognize, for example, that "81*7*23103" is not very far from "81*9*23103"
The only way I know how to do that is to check for strings >= to the length of what I'm looking for. Calculate the distance between each character, calc an average and give it a limit on what is a good average.
Some examples:
Ex 1
81723103 - Looking for this
81923103 - Found this
--------
00200000 - distances between characters
0 + 0 + 2 + 0 + 0 + 0 + 0 + 0 = 2
2/8 = .25 (pretty good match. 0 = perfect)
Ex 2
81723103 - Looking
81158988 - Found
--------
00635885 - distances
0 + 0 + 6 + 3 + 5 + 8 + 8 + 5 = 35
35/8 = 4.375 (Not a very good match. 9 = worst)
This way I can tell it "Flag the bottom 30% only" and dump anything with an average distance > 6.
I figure I'm reinventing the wheel and wanted to share this for feedback. I see a huge increase in run time and a performance hit doing all these string operations over what I'm currently doing.

Computation of Kullback-Leibler (KL) distance between text-documents using numpy

My goal is to compute the KL distance between the following text documents:
1)The boy is having a lad relationship
2)The boy is having a boy relationship
3)It is a lovely day in NY
I first of all vectorised the documents in order to easily apply numpy
1)[1,1,1,1,1,1,1]
2)[1,2,1,1,1,2,1]
3)[1,1,1,1,1,1,1]
I then applied the following code for computing KL distance between the texts:
import numpy as np
import math
from math import log
v=[[1,1,1,1,1,1,1],[1,2,1,1,1,2,1],[1,1,1,1,1,1,1]]
c=v[0]
def kl(p, q):
p = np.asarray(p, dtype=np.float)
q = np.asarray(q, dtype=np.float)
return np.sum(np.where(p != 0,(p-q) * np.log10(p / q), 0))
for x in v:
KL=kl(x,c)
print KL
Here is the result of the above code: [0.0, 0.602059991328, 0.0].
Texts 1 and 3 are completely different, but the distance between them is 0, while texts 1 and 2, which are highly related has a distance of 0.602059991328. This isn't accurate.
Does anyone has an idea of what I'm not doing right with regards to KL? Many thanks for your suggestions.
Though I hate to add another answer, there are two points here. First, as Jaime pointed out in the comments, KL divergence (or distance - they are, according to the following documentation, the same) is designed to measure the difference between probability distributions. This means basically that what you pass to the function should be two array-likes, the elements of each of which sum to 1.
Second, scipy apparently does implement this, with a naming scheme more related to the field of information theory. The function is "entropy":
scipy.stats.entropy(pk, qk=None, base=None)
http://docs.scipy.org/doc/scipy-dev/reference/generated/scipy.stats.entropy.html
From the docs:
If qk is not None, then compute a relative entropy (also known as
Kullback-Leibler divergence or Kullback-Leibler distance) S = sum(pk *
log(pk / qk), axis=0).
The bonus of this function as well is that it will normalize the vectors you pass it if they do not sum to 1 (though this means you have to be careful with the arrays you pass - ie, how they are constructed from data).
Hope this helps, and at least a library provides it so don't have to code your own.
After a bit of googling to undersand the KL concept, I think that your problem is due to the vectorization : you're comparing the number of appearance of different words. You should either link your column indice to one word, or use a dictionnary:
# The boy is having a lad relationship It lovely day in NY
1)[1 1 1 1 1 1 1 0 0 0 0 0]
2)[1 2 1 1 1 0 1 0 0 0 0 0]
3)[0 0 1 0 1 0 0 1 1 1 1 1]
Then you can use your kl function.
To automatically vectorize to a dictionnary, see How to count the frequency of the elements in a list? (collections.Counter is exactly what you need). Then you can loop over the union of the keys of the dictionaries to compute the KL distance.
A potential issue might be in your NP definition of KL. Read the wikipedia page for formula: http://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence
Note that you multiply (p-q) by the log result. In accordance with the KL formula, this should only be p:
return np.sum(np.where(p != 0,(p) * np.log10(p / q), 0))
That may help...

Fortran95 -- Reading from a formatted text file

I need to read some values from a table. These are the first five rows, to give you some idea of what it should look like:
1 + 3 98 96 1
2 + 337 2799 2463 1
3 + 2801 3733 933 1
4 + 3734 5020 1287 1
5 + 5234 5530 297 1
My interest is in the first four columns of each row. I need to read these into arrays. I used the following code:
program ----
implicit none
integer, parameter :: totbases = 4639675, totgenes = 4395
integer :: codtot, ks
integer, dimension(totgenes) :: ngene, lend, rend
character :: genome*4639675, sign*4
open(1,file='e_coli_g_info')
open(2,file='e_coli_g_str')
do ks = 1, totgenes
read(1,100) ngene(ks),sign(ks:ks),lend(ks), rend(ks)
end do
100 format(1x,i4,8x,a1, 2(5x,i7), 22x)
do ks = 1, 100
write(*,*) ngene(ks), sign(ks:ks),lend(ks), rend(ks)
end do
end program
The loop at the end of the program is to print the first hundred entries to test that they are being read correctly. The problem is that I am getting this garbage (the fourth row is the problem):
1 + 3 757934891
2 + 337 724249387
3 + 2801 757803819
4 + 3734 757803819
5 + 5234 757935405
Clearly, the fourth column is way off. In fact, I cannot find these values anywhere in the file that I am reading from. I am using the gfortran compiler for Ubuntu 12.04. I would greatly appreciate if somebody would point me in the right direction. I'm sure it's likely that I'm missing something very obvious because I'm new at Fortran.
Fortran formats are (traditionally, there's some newer stuff that I won't go into here) fixed format, that is, they are best suited for file formats with fixed columns. I.e. column N always starts at character position M, no ifs or buts. If your file format is more "free format"-like, that is, columns are separated by whitespace, it's often easier and more robust to read data using list formatting. That is, try to do your read loop as
do ks = 1, totgenes
read(1, *) ngene(ks), sign(ks:ks), lend(ks), rend(ks)
end do
Also, as a general advice, when opening your own files, start from unit 10 and go upwards from there. Fortran implementations typically use some of the low-numbered units for standard input, output, and error (a common choice is units 1, 5, and 6). You probably don't want to redirect those.
PS 2: I haven't tried your code, but it seems that you have a bounds overflow in the sign variable. It's declared of length 4, but then you assign to index ks which goes all the way up to totgenes. As you're using gfortran on Ubuntu 12.04 (that is, gfortran 4.6), when developing compile with options "-O1 -Wall -g -fcheck=all"