We have a button. When the user clicks it, they receive action1 or action2, depending on the variable percent_to_action2 (from 0 to 100). The simplest way to pick action1 or action2 is rand() % 100 compared against percent_to_action2.
But the problem is that if, e.g., percent_to_action2 = 50, there is no guarantee that after a first random action1 the user will get action2 (by rand()). I am looking for ways to avoid long runs of repeated actions. Please suggest how to choose more accurately, taking the previous event (or all previous events) into account, with examples and comments. The goal is to avoid the excessive repetition that rand() can give: for example, with percent = 50, rand() can give action2 ten times out of ten!
P.S. percent_to_action2 can be changed at any time.
P.P.S. Sorry for my English.
My code:
int num_rand = (rand() % 100) + 1; // from 1 to 100
if (num_rand <= current_percent_to_action2)
{
    // action 1
}
else
{
    // action 2
}
What I want, with examples:
percent = 50:
action1, then action2, then action1, then action2, etc.
percent = 33:
(first chosen by rand)
if the first is action1: action1, then action1, then action2, then action1, then action1, then action2, etc.
static unsigned num_action_1 = 1; // start at 1 so the ratio below never divides by zero
static unsigned num_action_2 = 1;

double bias = double(num_action_2) / num_action_1;
double randomchance = 1.0 - current_percent_to_action2 / 100.0;
double action_1_cutoff = RAND_MAX * randomchance * bias;

if (rand() <= action_1_cutoff)
{
    // action 1
    ++num_action_1;
}
else
{
    // action 2
    ++num_action_2;
}
This will bias the randomness toward the option that has occurred less frequently. I also changed it so that action 2 happens roughly current_percent_to_action2 percent of the time, instead of action 1 as in your code. As the chart below shows, it adds some complexity, but you are far less likely to get unbalanced runs of results. In the long term the two end up virtually identical; both will eventually give strings of 10 in a row, this code just stays far more even early on.
Times in a row   Even distribution   Biased distribution
      1              50%                 50%
      2              25%                 8.3%
      3              12.5%               3.125%
      4              6.25%               1.25%
      5              3.13%               0.52%
      6              1.56%               0.22%
      7              0.78%               0.09%
      8              0.39%               0.04%
      9              0.20%               0.02%
     10              0.10%               0.01%
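If you want to sanity-check those numbers yourself, here is a small self-contained test harness for the snippet above (a sketch; the fixed 50% setting and the 100000-trial count are arbitrary choices):

#include <cstdlib>
#include <ctime>
#include <iostream>

int main() {
    std::srand(static_cast<unsigned>(std::time(nullptr)));
    const int current_percent_to_action2 = 50; // fixed for this test
    unsigned num_action_1 = 1, num_action_2 = 1;
    for (int i = 0; i < 100000; ++i) {
        // same arithmetic as the snippet above
        double bias = double(num_action_2) / num_action_1;
        double randomchance = 1.0 - current_percent_to_action2 / 100.0;
        double action_1_cutoff = RAND_MAX * randomchance * bias;
        if (std::rand() <= action_1_cutoff)
            ++num_action_1;
        else
            ++num_action_2;
    }
    std::cout << "action1: " << num_action_1
              << "  action2: " << num_action_2 << std::endl;
}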
If you don't want actions repeated, you can either a) choose from all BUT the last action, or b) choose as you do now, but keep re-choosing until you get something other than the previous action. The latter is easier to do, but is slower (possibly a LOT slower).
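A minimal sketch of option (b) for the two-action setup in this question (note that with only two actions the re-roll forces strict alternation and ignores the percentage, and it would never terminate at percent 0 or 100, so this fits best when there are three or more actions):

#include <cstdlib>

// Sketch: keep re-rolling until the action differs from the previous one.
int last_action = 0; // 0 = no previous action yet

int pick_action(int percent_to_action2) {
    int action;
    do {
        int num_rand = (std::rand() % 100) + 1;            // 1..100
        action = (num_rand <= percent_to_action2) ? 2 : 1;
    } while (action == last_action);
    last_action = action;
    return action;
}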
I need to count the number of items in my output.
So, for example, I created this:
a = 1000000
while a >= 10:
    print a
    a = a / 2
How would I count how many halving steps were carried out?
Thanks
You have two ways: the empirical way and the predictive way.
import math

a = 1000000
# closed form: the loop runs floor(log2(a/10)) + 1 times for a >= 10
print("theoretical iterations {}".format(int(math.log2(a // 10)) + 1))
counter = 0
while a >= 10:
    counter += 1
    a //= 2
print("real iterations {}".format(counter))
I get:
theoretical iterations 17
real iterations 17
The empirical way just counts the iterations, whereas the predictive way computes the count in closed form from the log2 of a, which matches the logarithmic complexity of the algorithm.
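Why the closed form works: after k halvings the value is floor(a / 2^k), and the loop body runs once for every k >= 0 with floor(a / 2^k) >= 10, which holds exactly when 2^k <= a / 10. So the count is floor(log2(a / 10)) + 1 for any a >= 10; for a = 1000000 that is floor(16.6096...) + 1 = 17.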
Alternatively, just count inside the loop:
c = 0
a = 1000000
while a >= 10:
    print(a)
    a = a // 2   # integer halving
    c = c + 1
print(c)  # number of halving steps
Let's say I have a system that distributes 8820 values into 96 values, rounded using banker's rounding (call them pulses). The formula is:
pulse = BankerRound(8820 * i/96), with i ∈ [0, 96)
This gives the following list of pulses:
0
92
184
276
368
459
551
643
735
827
919
1011
1102
1194
1286
1378
1470
1562
1654
1746
1838
1929
2021
2113
2205
2297
2389
2481
2572
2664
2756
2848
2940
3032
3124
3216
3308
3399
3491
3583
3675
3767
3859
3951
4042
4134
4226
4318
4410
4502
4594
4686
4778
4869
4961
5053
5145
5237
5329
5421
5512
5604
5696
5788
5880
5972
6064
6156
6248
6339
6431
6523
6615
6707
6799
6891
6982
7074
7166
7258
7350
7442
7534
7626
7718
7809
7901
7993
8085
8177
8269
8361
8452
8544
8636
8728
Now, suppose the system doesn't send me these pulses directly. Instead, it sends each pulse expressed in 8820ths (call them ticks):
tick = pulse * 1/8820
The list of ticks I get becomes:
0
0.010430839
0.020861678
0.031292517
0.041723356
0.052040816
0.062471655
0.072902494
0.083333333
0.093764172
0.104195011
0.11462585
0.124943311
0.13537415
0.145804989
0.156235828
0.166666667
0.177097506
0.187528345
0.197959184
0.208390023
0.218707483
0.229138322
0.239569161
0.25
0.260430839
0.270861678
0.281292517
0.291609977
0.302040816
0.312471655
0.322902494
0.333333333
0.343764172
0.354195011
0.36462585
0.375056689
0.38537415
0.395804989
0.406235828
0.416666667
0.427097506
0.437528345
0.447959184
0.458276644
0.468707483
0.479138322
0.489569161
0.5
0.510430839
0.520861678
0.531292517
0.541723356
0.552040816
0.562471655
0.572902494
0.583333333
0.593764172
0.604195011
0.61462585
0.624943311
0.63537415
0.645804989
0.656235828
0.666666667
0.677097506
0.687528345
0.697959184
0.708390023
0.718707483
0.729138322
0.739569161
0.75
0.760430839
0.770861678
0.781292517
0.791609977
0.802040816
0.812471655
0.822902494
0.833333333
0.843764172
0.854195011
0.86462585
0.875056689
0.88537415
0.895804989
0.906235828
0.916666667
0.927097506
0.937528345
0.947959184
0.958276644
0.968707483
0.979138322
0.989569161
Unfortunately, in between these ticks it also sends me fake ticks that aren't multiples of the original pulses, such as 0.029024943, which corresponds to 256, a value that isn't in the pulse list.
How can I find from this list which ticks are valid and which are fake?
I don't have the pulse list to compare with during the process, since 8820 will change over time, so I have no list to check against step by step. I need to deduce it from the ticks at each iteration.
What's the best mathematical approach to this? Maybe reasoning only in ticks and not pulses.
I've thought of checking whether the tick's error against the nearest integer pulse is smaller than that of the previous/next tick. Here it is in C++:

double pulse = tick * 96.;
double prevpulse = (tick - 1/8820.) * 96.;
double nextpulse = (tick + 1/8820.) * 96.;
int pulseRounded = round(pulse);
int buffer = lrint(tick * 8820.);

double pulseABS = abs(pulse - pulseRounded);
double prevpulseABS = abs(prevpulse - pulseRounded);
double nextpulseABS = abs(nextpulse - pulseRounded);

if (nextpulseABS > pulseABS && prevpulseABS > pulseABS) {
    // is pulse
}
but, for example, tick 0.0417234 (pulse 368) fails, since the previous tick's error comes out closer: prevpulseABS (0.00543795) is smaller than pulseABS (0.0054464).
I guess that's because this comparison doesn't take the rounding into account.
NEW POST:
Alright. Based on what I now understand, here's my revised answer.
You have the information you need to build a list of good values. Each time you switch to a new track:
vector<double> good_list;
good_list.reserve(96);
for (int i = 0; i < 96; i++)
    good_list.push_back(BankerRound(8820.0 * i / 96.0) / 8820.0);
Then, each time you want to validate the input:
auto iter = find(good_list.begin(), good_list.end(), input);
if(iter != good_list.end()) //It's a match!
cout << "Happy days! It's a match!" << endl;
else
cout << "Oh bother. It's not a match." << endl;
The problem with mathematically determining the correct pulses is the BankerRound() function, which introduces an error that grows the higher the input values get. You would then need a formula for the formula, and that's getting out of my wheelhouse. Alternatively, you could keep track of the differences between successive values: most of them would be the same, so you'd only have to check between two possible deltas. But that falls apart if you can jump between tracks or jump around within one track.
OLD POST:
If I understand the question right, the only information you're getting comes in the form p/v = y, where you know y (each element in the list of ticks you get from the device), and you know that p is the pulse and v is the values-per-beat, but you don't know what either of them is. So, pulling one data point from your post, you might have an equation like this:
p/v = 0.010430839
v, in all the examples you've used so far, is 8820, but from what I understand that value is not a guaranteed constant. The next question then is: do you have a way of determining what v is before you start getting all these decimal values? If you do, you can work out mathematically what the smallest error can be (1/v), then take your decimal value, multiply it by v, round it to the nearest whole number, and check whether the difference between its rounded and non-rounded forms falls within the bounds of your calculated error, like so:
double input; // each element in your list of doubles, such as 0.010430839
double allowed_error = 1.0 / values_per_beat;
double proposed = input * values_per_beat;
double rounded = std::round(proposed);
if (abs(rounded - proposed) < allowed_error) { cout << "It's good!" << endl; }
If, however, you are not able to ascertain values_per_beat ahead of time, then this becomes a statistical question: you must accumulate enough data samples, remove the outliers (the few that vary from the norm), and use that data. But that approach will not be realtime, and given the terms you've been using (values per beat, bpm, the value 44100), it sounds like realtime might be what you're after.
Playing around with Excel, I think you want to multiply up to (what should be) whole numbers rather than looking for the closest pulses.

Tick          Pulse         i               Error                      OK
              Tick*8820     Pulse*96/8820   ABS( i - INT( i+0.05 ) )   Error < 0.01
-----------   -----------   -------------   ------------------------   ------------
0.029024943   255.9999973   2.786394528     0.786394528                FALSE
0.0417234     368.000388    4.0054464       0.0054464                  TRUE
0             0             0               0                          TRUE
0.010430839   91.99999998   1.001360544     0.001360544                TRUE
0.020861678   184           2.002721088     0.002721088                TRUE
0.031292517   275.9999999   3.004081632     0.004081632                TRUE
0.041723356   367.9999999   4.005442176     0.005442176                TRUE
0.052040816   458.9999971   4.995918336     0.004081664                TRUE
0.062471655   550.9999971   5.99727888      0.00272112                 TRUE
0.072902494   642.9999971   6.998639424     0.001360576                TRUE
0.083333333   734.9999971   7.999999968     3.2E-08                    TRUE
The table shows your two "problem" cases (the genuinely wrong value, 256, and the one your code gets wrong, 368) followed by the first few "good" values.
If both 8820s vary at the same time, then obviously they cancel out, and i is just Tick*96.
The Error term is the difference between the calculated i and the nearest integer; if this is less than 0.01, it is a "good" value.
NOTE: the 0.05 and 0.01 values were chosen somewhat arbitrarily (i.e. an inspired first guess based on the numbers): adjust if needed. Although I've only shown the first few rows, all 96 "good" values you gave show as TRUE.
The code (completely untested) would be something like:
double pulse = tick * 8820.0;
double i = pulse * 96.0 / 8820.0;
double error = abs(i - floor(i + 0.05));
if (error < 0.05) {
    // is pulse
}
I assume you're initializing your pulses in a for loop, using int i as the loop variable; then the problem is this line:
BankerRound(8820 * i/96);
8820 * i / 96 is an all-integer operation, so the result is an integer again, with the remainder cut off (in effect, it is already rounded toward zero), and BankerRound has nothing left to round. Try this instead:
BankerRound(8820 * i / 96.0);
The same problem applies if you calculate the previous and next pulse this way: 1/8820 is an all-integer expression and evaluates to 0, so you actually subtract and add 0.
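For example, the previous/next tick computation with the same fix applied would be (a sketch of just the corrected lines, assuming tick holds the incoming value as in the question):

// 1.0 / 8820.0 forces floating-point division; the integer form 1/8820 is 0.
double prevpulse = (tick - 1.0 / 8820.0) * 96.0;
double nextpulse = (tick + 1.0 / 8820.0) * 96.0;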
Edit:
From what I read in the comments, the 'system' is not, as I previously assumed, modifiable. It actually calculates ticks in the form n / 96.0, n ∈ [0, 96) in ℕ,
however including some kind of internal rounding that is apparently independent of the sample frequency, so there is some difference from the true value of n/96.0, and the ticks multiplied by 96 do not deliver exactly the integral values in [0, 96) (thanks KarstenKoop). And some of the delivered samples are simply invalid...
So the task is to detect whether tick * 96 is close enough to an integral value to be accepted as valid.
So we need to check:
double value = tick * 96.0;
bool isValid = value - floor(value) < threshold
            || ceil(value) - value < threshold;
with some appropriately defined threshold. Assuming the values really are calculated as
double tick = round(8820*i/96.0)/8820.0;
then the maximal deviation would be slightly greater than 0.00544 (see below for a more exact value); thresholds somewhere in the region of 0.006, 0.0055, 0.00545, ... might be a reasonable choice.
The rounding might be a matter of the number of bits used internally for the sensor value (if 13 bits are available, ticks might actually be calculated as floor(8192 * i / 96.0) / 8192.0, with 8192 being 1 << 13 and floor accounting for the integer division; just a guess...).
The exact value of the maximal deviation, using 8820 as the factor, as exactly as it is representable by a double, is:
0.00544217687075132516838493756949901580810546875
The multiplication by 96 is actually not necessary: you can compare the tick directly against the threshold divided by 96, which would be:
0.0000566893424036596371706764330156147480010986328125
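Putting that together, a minimal self-contained form of the check (a sketch, under the assumption above that ticks are produced as round(8820*i/96.0)/8820.0; the 0.00545 threshold sits just above the 0.005442... maximum deviation):

#include <cmath>

// Sketch: a tick is valid if tick*96 lands close enough to an integer.
bool isValidTick(double tick) {
    double value = tick * 96.0;
    double deviation = std::fabs(value - std::round(value));
    return deviation < 0.00545; // just above the 0.00544217... bound
}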
I am making a simple (terminal) slot machine project in which 3 fruit names are printed in the terminal; if they are all the same, the player wins.
I cannot figure out how to set a fixed probability that the player wins the round (roughly a 40% chance, for example). As of now I have:
this->slotOne = rand() % 6 + 1; // chooses rand number for designated slot
this->oneFruit = spinTOfruit(this->slotOne); //converts rand number to fruit name
this->slotTwo = rand() % 6 + 1;
this->twoFruit = spinTOfruit(this->slotTwo);
this->slotThree = rand() % 6 + 1;
this->threeFruit = spinTOfruit(this->slotThree);
which picks a "fruit" based on the number, but each of the three slots has an independent 1-in-6 chance (seeing that there are 6 fruits). Since the slots are independent, the overall probability of winning is incredibly low: the first slot can be anything, and each of the other two matches it with probability 1/6, giving 1/36 (about 2.8%).
How would I fix this to create better odds (or, even better, chosen odds that can be changed when desired)?
I thought of giving the last two spins fewer options (rand() % 2, for instance), but then the last two slots would choose from the same couple of fruits every time.
The link to my project: https://github.com/tristanzickovich/slotmachine
Cheat.
Decide first whether the player wins or not:
const bool winner = (rand() % 100) < 40; // 40% odds (roughly)
Then invent an outcome that supports your decision.
if (winner)
{
    // Pick the one winning fruit.
    this->slotOne = this->slotTwo = this->slotThree = rand() % 6 + 1;
}
else
{
    // Pick a failing combo.
    do
    {
        this->slotOne = rand() % 6 + 1;
        this->slotTwo = rand() % 6 + 1;
        this->slotThree = rand() % 6 + 1;
    } while (slotOne == slotTwo && slotTwo == slotThree);
}
You can now toy with the player's emotions like the Vegas best.
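One refinement worth knowing: rand() % 100 is slightly biased whenever RAND_MAX + 1 is not a multiple of 100, which is why the odds above are only "roughly" 40%. If exact, adjustable odds matter, a sketch using <random> (the function name is mine):

#include <random>

// Sketch: decide the win with an exact probability.
bool spin_wins(std::mt19937& gen, double win_probability) {
    std::bernoulli_distribution wins(win_probability); // e.g. 0.40 for 40%
    return wins(gen);
}

// Usage:
//   std::mt19937 gen{std::random_device{}()};
//   const bool winner = spin_wins(gen, 0.40);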
I need to program a nearest-neighbor algorithm in Stata from scratch because my dataset does not allow me to use any of the available solutions (as far as I can tell).
To be precise: I have a dataset with a structure similar to the following (the original has around 14k observations)
input id value treatment match
1 0.14 0 .
2 0.32 0 .
3 0.465 1 2
4 0.878 1 2
5 0.912 1 2
6 0.001 1 1
end
I want to generate a variable called match (already included in the example above). For each observation with treatment == 1, the variable match should store the id of the observation from within treatment == 0 whose value is closest to the value of the considered observation (treatment == 1).
I am new to Stata programming, so I am not yet familiar with the syntax. My first attempt is below; however, it does not produce any changes to the match variable. I am sure this is a novice question, but I am hoping for some advice on how to get the code running.
EDIT: I have changed the code slightly and now it seems to work. Do you see any problems that may arise if I run it on a bigger dataset?
set more off
clear all

input id pscore treatment
1 0.14 0
2 0.32 0
3 0.465 1
4 0.878 1
5 0.912 1
6 0.001 1
end

gen match = .

forval i = 1/`= _N' {
    if treatment[`i'] == 1 {
        local dist 1
        forvalues j = 1/`= _N' {
            if (treatment[`j'] == 0) {
                local current_dist (pscore[`i'] - pscore[`j'])^2
                if `dist' > `current_dist' {
                    local dist `current_dist'       // update smallest distance
                    replace match = id[`j'] in `i'  // write match
                }
            }
        }
    }
}
Consider some simulated data: 1,000 observations, 200 of them untreated (treat == 0) and the rest treated (treat == 1). The code included below is much more efficient than the one originally posted. (Ties, as in your code, are not explicitly handled.)
clear
set more off
*----- example data -----
set obs 1000
set seed 32956
gen id = _n
gen pscore = runiform()
gen treat = cond(_n <= 200, 0, 1)
*----- new method -----

timer clear
timer on 1

// get id of last non-treated and first treated
// (data is sorted by treat and ids are consecutive)
bysort treat (id): gen firsttreat = id[1]
local firstt = firsttreat[_N]
local lastnt = `firstt' - 1

// start loop
gen match = .
gen dif = .
quietly forvalues i = `firstt'/`=_N' {
    // compute distances
    replace dif = (pscore[`i'] - pscore)^2
    summarize dif in 1/`lastnt', meanonly
    // identify id of minimum-distance observation
    replace match = . in 1/`lastnt'
    replace match = id in 1/`lastnt' if dif == r(min)
    summarize match in 1/`lastnt', meanonly
    // save the minimum-distance id
    replace match = r(max) in `i'
}

// clean variable and drop auxiliaries
replace match = . in 1/`lastnt'
drop dif firsttreat

timer off 1

tempfile first
save `first'
*----- your method -----

drop match
timer on 2

gen match = .
quietly forval i = 1/`= _N' {
    if treat[`i'] == 1 {
        local dist 1
        forvalues j = 1/`= _N' {
            if (treat[`j'] == 0) {
                local current_dist (pscore[`i'] - pscore[`j'])^2
                if `dist' > `current_dist' {
                    local dist `current_dist'       // update smallest distance
                    replace match = id[`j'] in `i'  // write match
                }
            }
        }
    }
}
timer off 2

tempfile second
save `second'

// check for equality of results
cf _all using `first'

// check times
timer list
The execution times, in seconds:
. timer list
1: 0.19 / 1 = 0.1930
2: 10.79 / 1 = 10.7900
The difference is huge, especially considering this dataset has only 1,000 observations.
An interesting thing to notice is that as the number of non-treated cases increases relative to the number of treated, the original method improves, but it never reaches the efficiency of the new method. As an example, invert the group sizes, so there are now 800 untreated and 200 treated (change the data setup to gen treat = cond(_n <= 800, 0, 1)). The result is
. timer list
1: 0.07 / 1 = 0.0720
2: 4.45 / 1 = 4.4470
You can see that the new method also improves and is still much faster; in fact, the relative difference is about the same.
Another way to do this is using joinby or cross. The problem is that they temporarily expand (a lot) the size of your dataset. In many cases they are not feasible due to the hard limit Stata has on the number of possible observations (see help limits). You can find an example of joinby here: https://stackoverflow.com/a/19784222/2077064.
Edit
If there is a large number of treated observations relative to untreated, your code suffers because it goes through the whole first loop many more times (due to the first if). Furthermore, each pass through that loop implies going through an inner loop, itself containing two if conditions, _N more times. The opposite case, in which there are few treated observations, means you go through the whole first loop only on a small number of occasions, speeding up your code substantially.

The reason my code maintains its efficiency is the use of in. This always offers speed gains over if: Stata goes directly to the specified observations, with no logical checking needed. Your problem provides an opportunity for that replacement, and it's wise to seize it. If my code used if where in is in place, the results would be different: your code would be faster in the case where there is a large number of untreated relative to treated, and again that is because your code would not need to go through the complete loop, requiring very little work; the first loop is short-circuited by the first if. In the opposite case, my code would still dominate.

The key is to "separate" treated from untreated and work on each group using in.
I was just writing an if statement to "spawn" objects on a map, and I was playing with percentages, but I'm not sure I'm doing it right. This is what I have:
int chance = rng_.nextInt(0, 100);
if (chance <= 20) // 20%
{
    // Spawn a chest
}
else if ((chance > 20) && (chance <= 50)) // 30%
{
    // Spawn a monster
}
// Otherwise spawn nothing

Is this a correct approach, or am I just wrong?
Edit: Ok, I have fixed the code and now I think the question is solved.
No, because:
- 30% + 70% + 30% is more than 100%
- in your code the chance of "Spawn a monster" is 40%, not 70%
Assuming rng_.nextInt(0, 100) generates a number from 0 to 99 with a uniform distribution (any number in that range is just as likely as any other), then 0-19 would be a chest (20 percent chance), 20-49 would be a monster (30 percent chance), and anything from 50 to 99 (50 percent chance) would spawn nothing. So the code would look like:
int chance = rng_.nextInt(0, 100);
if (chance < 20)
{
    // spawn a chest
}
else if (chance < 50)
{
    // spawn a monster
}
else
{
    // Do other items if required.
}
so 20 + 30 + 50 = 100, which corresponds to 0-99 (100 integer values) in your random number generation.
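And if you want the "chosen odds, changing the odds when desired" from the question, a sketch that keeps the cutoffs in variables (the percentages are placeholders; rng_.nextInt(0, 100) is assumed to return 0..99, as above):

// Sketch: adjustable spawn odds via cumulative cutoffs.
int chest_percent = 20;    // can be changed at runtime
int monster_percent = 30;

int chance = rng_.nextInt(0, 100); // assumed uniform over 0..99
if (chance < chest_percent) {
    // spawn a chest
} else if (chance < chest_percent + monster_percent) {
    // spawn a monster
} else {
    // spawn nothing
}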