Tick - no negative - chart.js

If I have a linear chart.js graph whose values are often all 0, it shows a high of 1 and a low of -1. If the values are around 3000, the high and low are around that value. I want to rule out negatives, so the bottom value should never go below 0. There are options for min and suggestedMin; however, both of these pin the bottom value at 0, so in the 3000 example the top would be 3000 and the bottom 0, which is not what I want; the 3000 case should stay as it is. I just don't want it to show -1 when the graph is all 0s; in that case it should show a top of 1 and a bottom of 0. In other words, treat 0 like the minimum possible value rather than actually forcing the axis low to it. So in the case of 3000, 2997 might be the low, but in the case of all 0s, 0 is the low, not -1. Hope that makes sense.

I ended up doing this as I only have a single dataset in each graph. Happy to know if there is a better way.
function addData(chart, label, data) {
    chart.data.labels.push(label);
    chart.data.datasets.forEach((dataset) => {
        dataset.data.push(data);
        if (dataset.data.every(item => item === 0)) {
            chart.options.scales.yAxes[0].ticks.min = 0;
        } else {
            chart.options.scales.yAxes[0].ticks.min = undefined;
        }
    });
    chart.update();
}

Related

Random selection of a number from a list of numbers that is biased toward the lowest number

At every moment t, I have a different set of positive integers. I need to randomly select one of them, satisfying the criterion that the probability of a particular number being selected from the set must be proportionally higher the lower its value is. At moment t+1, there is another set of positive integers, and again one of them has to be selected under the same criterion, and so on. How can I do this in C++?
One way would be to assign a range to each number, draw a random value, and pick the number whose range contains the drawn value.
Example
Input:
1 2 3
Ranges:
[0,300) [300, 450) [450, 550)
Probability for each number:
~55% ~27% ~18%
With a random value between 0 and 550, e.g. 135, the selected number would be 1, since 0 <= 135 < 300.
As to how you do this in C++, try it for yourself first.
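For what it's worth, the example ranges above happen to be proportional to 1/value (1 : 1/2 : 1/3 = 6 : 3 : 2 = 300 : 150 : 100), and std::discrete_distribution can build and search such ranges for you. A minimal sketch of that idea; the inverse-value weighting is my own assumption, and any decreasing weight scheme satisfies the requirement:

    #include <cstddef>
    #include <random>
    #include <vector>

    // Pick one element of 'values', weighting each by 1/value so that lower
    // values are proportionally more likely. Assumes a non-empty set of
    // positive integers. std::discrete_distribution builds the ranges and
    // does the lookup.
    int pickBiasedTowardLowest(const std::vector<int>& values, std::mt19937& gen)
    {
        std::vector<double> weights;
        weights.reserve(values.size());
        for (int v : values)
            weights.push_back(1.0 / v);          // lower value => larger weight
        std::discrete_distribution<std::size_t> dist(weights.begin(), weights.end());
        return values[dist(gen)];
    }

    // Usage: a fresh set can be passed at every moment t.
    // std::mt19937 gen{ std::random_device{}() };
    // int chosen = pickBiasedTowardLowest({1, 2, 3}, gen);   // ~55% / ~27% / ~18%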
What you're talking about is basically the Geometric distribution, or at least a chopped and normalized variant of it.
C++11 includes a geometric distribution generator, and you can use that directly if you like, but to keep things simple, you can also do something like this:
#include <cstdlib>   // rand, RAND_MAX

int genRandom(int count)
{
    while (true)
    {
        int x = 0;
        while (x < count)                 // try each index 0 .. count-1
        {
            if (rand() < RAND_MAX/2)      // with probability 0.5...
            {
                return x;
            }
            x++;
        }
        // nothing picked this pass: reject and start over
    }
}
Note that here I'm using rejection sampling to ensure that a value gets picked, rather than return count-1 if I run past it (which would give the last two items equal probability).
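To connect this back to the question (my own glue, not part of the answer above): sort the current set ascending and use the returned index, so the smallest number is the most likely to come out.

    #include <algorithm>
    #include <vector>

    // Hypothetical helper: pick one element of the current set, smallest most likely.
    int pickFromSet(std::vector<int> set)
    {
        std::sort(set.begin(), set.end());   // smallest value first
        return set[genRandom(static_cast<int>(set.size()))];
    }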

efficiently mask-out exactly 30% of array with 1M entries

My question's title is similar to this link; however, that one wasn't answered to my satisfaction.
I have an array of integers (1 000 000 entries), and need to mask exactly 30% of the elements.
My approach is to loop over the elements and roll a die for each one. Doing it in a single uninterrupted pass is good for cache locality.
As soon as I notice that exactly 300 000 elements have been masked, I need to stop. However, I might reach the end of the array with only 200 000 elements masked, forcing me to loop a second time, maybe even a third, and so on.
What's the most efficient way to ensure I won't have to loop a second time, while not being biased towards picking some elements?
Edit:
I need to preserve the order of elements.
For instance, I might have:
[12, 14, 1, 24, 5, 8]
//Masking away 30% might give me:
[0, 14, 1, 24, 0, 8]
The result of masking must be the original array, with some elements set to zero
Just do a Fisher-Yates shuffle but stop after 300000 iterations. The last 300000 elements will be the randomly chosen ones.
std::size_t size = 1000000;
for(std::size_t i = 0; i < 300000; ++i)
{
    std::size_t r = std::rand() % size;
    std::swap(array[r], array[size-1]);
    --size;
}
I'm using std::rand for brevity. Obviously you want to use something better.
The other way is this:
for(std::size_t i = 0; i < 300000;)
{
    std::size_t r = rand() % 1000000;
    if(array[r] != 0)
    {
        array[r] = 0;
        ++i;
    }
}
This has no bias and does not reorder elements, but it is inferior to Fisher-Yates, especially for high percentages, because the rejection loop slows down as more elements are already masked.
When I see a massive list, my mind always goes first to divide-and-conquer.
I won't be writing out a fully-fleshed algorithm here, just a skeleton. You seem like you have enough of a clue to take a decent idea and run with it. I think I only need to point you in the right direction. With that said...
We'd need an RNG that can return a suitably-distributed value for how many masked values could potentially be below a given cut point in the list. I'll use the halfway point of the list for said cut. Some statistician can probably set you up with the right RNG function. (Anyone?) I don't want to assume it's just uniformly random [0..mask_count), but it might be.
Given that, you might do something like this:
// the magic RNG your stats homework will provide
int random_split_sub_count_lo( int count, int sub_count, int split_point );

void mask_random_sublist( int *list, int list_count, int sub_count )
{
    if (list_count > SOME_SMALL_THRESHOLD)
    {
        int list_count_lo = list_count / 2; // arbitrary
        int list_count_hi = list_count - list_count_lo;
        int sub_count_lo = random_split_sub_count_lo( list_count, sub_count, list_count_lo );
        int sub_count_hi = sub_count - sub_count_lo;
        mask_random_sublist( list, list_count_lo, sub_count_lo );
        mask_random_sublist( list + list_count_lo, list_count_hi, sub_count_hi );
    }
    else
    {
        // insert here some simple/obvious/naive implementation that
        // would be ludicrous to use on a massive list due to complexity,
        // but which works great on very small lists. I'm assuming you
        // can do this part yourself.
    }
}
Assuming you can find someone more informed on statistical distributions than I to provide you with a lead on the randomizer you need to split the sublist count, this should give you O(n) performance, with 'n' being the number of masked entries. Also, since the recursion is set up to traverse the actual physical array in constantly-ascending-index order, cache usage should be as optimal as it's gonna get.
Caveat: There may be minor distribution issues due to the discrete nature of the list versus the 30% fraction as you recurse down and down to smaller list sizes. In practice, I suspect this may not matter much, but whatever person this solution is meant for may not be satisfied that the random distribution is truly uniform when viewed under the microscope. YMMV, I guess.
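For what it's worth (my addition, not something the answer above commits to): the split count should follow a hypergeometric distribution - when sub_count of the count positions are masked uniformly at random, the number landing in the first split_point positions is hypergeometric. A naive but exact sampler, purely as a sketch of what random_split_sub_count_lo could look like:

    #include <random>

    // Simulate the sub_count draws one at a time; each remaining chosen item
    // lands in the lower part with probability lo_slots_left / slots_left.
    // Assumes sub_count <= count and split_point <= count.
    int random_split_sub_count_lo( int count, int sub_count, int split_point )
    {
        static std::mt19937 gen{ std::random_device{}() };
        int lo = 0;
        int slots_left = count;
        int lo_slots_left = split_point;
        for (int k = 0; k < sub_count; ++k)
        {
            std::uniform_int_distribution<int> d(0, slots_left - 1);
            if (d(gen) < lo_slots_left)
            {
                ++lo;
                --lo_slots_left;
            }
            --slots_left;
        }
        return lo;
    }

This costs O(sub_count) per split, so a real implementation would want a proper hypergeometric sampler (or a binomial approximation) to keep the promised performance.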
Here's one suggestion. One million bits is only about 125 KB, which is not an onerous amount.
So create a bit array with all items initialised to zero. Then randomly select 300,000 of them (accounting for duplicates, of course) and mark those bits as one.
Then you can run through the bit array and, for any bits that are set to one (or zero, if your idea of masking means you want to process the other 700,000), do whatever action you wish to the corresponding entry in the original array.
If you want to ensure there's no possibility of duplicates when randomly selecting them, just trade off space for time by using a Fisher-Yates shuffle.
Construct a collection of all the indices and, for each of the 700,000 you want removed (or 300,000 if, as mentioned, masking means you want to process the other ones):
pick one at random from the remaining set.
copy the final element over the one selected.
reduce the set size.
This will leave you with a random subset of indices that you can use to process the integers in the main array.
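A sketch of that index-plus-bit-array idea (the names and the use of std::vector<bool> as the bit array are mine): a partial Fisher-Yates over an index array picks the distinct indices exactly as described above, the bits mark them, and a final ordered pass zeroes the data.

    #include <cstddef>
    #include <numeric>
    #include <random>
    #include <vector>

    // Mask exactly 'toMask' randomly chosen entries of 'data' without
    // reordering it. Assumes toMask <= data.size().
    void maskExactly(std::vector<int>& data, std::size_t toMask)
    {
        std::vector<std::size_t> idx(data.size());
        std::iota(idx.begin(), idx.end(), 0);            // 0, 1, 2, ...

        std::mt19937 gen{ std::random_device{}() };
        std::vector<bool> masked(data.size(), false);    // the bit array
        std::size_t remaining = idx.size();
        for (std::size_t i = 0; i < toMask; ++i)
        {
            std::uniform_int_distribution<std::size_t> d(0, remaining - 1);
            std::size_t r = d(gen);
            masked[idx[r]] = true;                       // pick one at random
            idx[r] = idx[--remaining];                   // copy final element over it, shrink set
        }

        for (std::size_t i = 0; i < data.size(); ++i)    // order preserved
            if (masked[i])
                data[i] = 0;
    }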
You want reservoir sampling. Sample code courtesy of Wikipedia:
(* S has items to sample, R will contain the result *)
ReservoirSample(S[1..n], R[1..k])
    // fill the reservoir array
    for i = 1 to k
        R[i] := S[i]
    // replace elements with gradually decreasing probability
    for i = k+1 to n
        j := random(1, i) // important: inclusive range
        if j <= k
            R[j] := S[i]
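Adapted to this question (my own translation, not part of the quoted pseudocode): reservoir-sample k = 300 000 indices out of 0..999 999 and zero those positions, leaving the order of everything else untouched.

    #include <cstddef>
    #include <random>
    #include <vector>

    // Algorithm R over the index range 0..data.size()-1; assumes k <= data.size().
    void maskViaReservoir(std::vector<int>& data, std::size_t k)
    {
        std::mt19937 gen{ std::random_device{}() };
        std::vector<std::size_t> reservoir(k);
        for (std::size_t i = 0; i < k; ++i)               // fill the reservoir
            reservoir[i] = i;
        for (std::size_t i = k; i < data.size(); ++i)     // gradually decreasing probability
        {
            std::uniform_int_distribution<std::size_t> d(0, i);   // inclusive range
            std::size_t j = d(gen);
            if (j < k)
                reservoir[j] = i;
        }
        for (std::size_t index : reservoir)
            data[index] = 0;
    }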

Return a random object from a list based on properties

This is quite a strange issue for me because I can't visualize my problem correctly. Just so that you know, I'm not really asking for code but just for an idea for an appropriate algorithm that would generate some weather based on its probability of occurring.
Here's what I want to achieve :
Let's say I have a WeatherClass, with a parameter called "Probability". I want to have different weather instances with their own probability of "happening".
enum Probability {
    Never  = -1,
    Low    = 0,
    Normal = 1,
    Always = 2
};

std::vector<WeatherClass> WeatherContainer;

WeatherClass Sunny = WeatherClass();
Sunny.Probability = Probability::Normal;

WeatherClass Rainy = WeatherClass();
Rainy.Probability = Probability::Low;

WeatherClass Cloudy = WeatherClass();
Cloudy.Probability = Probability::Normal;

WeatherContainer.push_back(Sunny);
WeatherContainer.push_back(Rainy);
WeatherContainer.push_back(Cloudy);
Now, my question is : what is the most clever way to return some weather based on its own probability of happening?
I don't know why, but I can't figure this out. My first guess would be to have some kind of "luck" variable and compare it with the probability of each element, or something similar.
Any hint or advice would be really helpful.
Greets,
required
Generally speaking, assuming you have an integer sequence of numbers representing a linear increase in probability (starting from 1, not 0!):
1,2,3,4,5,6...n
Writing pn for some specific integer (a weather in your scheme, say "6"), and Sn for the sum of all the enum integers, a linear probability can easily be defined as:
pn/Sn
This of course means the weather associated with "1" is least likely, and the one with "n" is most likely. Other schemes are possible, such as exponential - just need to normalize properly. Also, if you forgot your math:
Sn=(1+n)*n/2
Now you need to roll from this probability. One option, disregarding efficiency, to help you think about this:
Make a giant set, where each weather (or integer) appears as many times as the associated integer. 1 appears once, ..., n appears n times. This list is of size Sn by definition. Now use the random library:
int choice = rand() % Sn; // index between 0 and Sn-1 - the chosen probability indicator
You could of course randomize the list as well for extra randomness.
An example: in our array we have probmap = {1,2,2,3,3,3}. If choice==4, then probmap[4]==3. Suppose 3 corresponds to Sunny; then we have our result! There are of course ways to make this better, such as choosing different probability functions, but I think this is a good start.
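A sketch of the same idea without materialising the giant probmap, using std::discrete_distribution to do the range bookkeeping. The weight "enum value + 1" (Never -> 0, Low -> 1, Normal -> 2, Always -> 3) is my assumption; substitute whatever weights you prefer. It also assumes WeatherClass exposes the Probability member shown above.

    #include <cstddef>
    #include <random>
    #include <vector>

    // Returns an index into the container, weighted by each entry's Probability.
    std::size_t pickWeather(const std::vector<WeatherClass>& container, std::mt19937& gen)
    {
        std::vector<double> weights;
        weights.reserve(container.size());
        for (const WeatherClass& w : container)
            weights.push_back(static_cast<double>(w.Probability) + 1.0);   // Never gets weight 0
        std::discrete_distribution<std::size_t> dist(weights.begin(), weights.end());
        return dist(gen);
    }

    // Usage: WeatherClass& today = WeatherContainer[pickWeather(WeatherContainer, gen)];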
You can generate a random number between 0 and 3, subtract 1, cast it to Probability and search your vector for a matching entry.
auto result = rand();
result %= 4;
--result;
auto prob = (Probability)result;

auto index = -1;
for( auto I = 0 ; I < WeatherContainer.size() ; ++I )
    if( WeatherContainer[ I ].Probability == prob )
    {
        index = I;
        break;
    }

if( index != -1 )
{
    // Do your thing
}

Probability and random numbers

I am just starting with C++ and am creating a simple text-based adventure. I am trying to figure out how to have probability based events. For example a 50% chance that when you open box there will be a sword and a 50% chance it will be a knife. I know how to make a random number generator, but I don't know how to associate that number with something. I created a variation of what I want but it requires the user to input the random number. I am wondering how to base the if statement on whether or not the random number was greater or less than 50, not if the number the user put in was greater or less than 50.
Use the remainder (modulo) operator % with rand.
rand()%2 can give you either 0 or 1.
Let 0 be a sword and 1 be a knife.
If you also need an axe, then use rand()%3.
It can give you 0, 1, or 2.
2 represents an axe, and 0 and 1 are as above.
The ifs and elses are then obvious.
rand()%n, where n is a big number, has a higher probability of giving you smaller numbers (modulo bias), because n usually does not divide RAND_MAX+1 evenly. You can check out the random number generators in the standard library's <random> header or in Boost.
If you use rand() it generates numbers in the range 0..RAND_MAX. So for a 50% probability you could do something like:
#include <stdlib.h>

if (rand() < RAND_MAX / 2)
{
    // sword - 50% probability
}
else
{
    // knife - 50% probability
}
You can obviously extend this to more than two different cases with any given probability for each case, simply by defining appropriate thresholds in the range 0..RAND_MAX for each case, e.g.
int r = rand();

if (r < RAND_MAX / 4)
{
    // sword - 25% probability
}
else if (r < RAND_MAX / 4 * 3)   // divide first to avoid overflow when RAND_MAX == INT_MAX
{
    // knife - 50% probability
}
else
{
    // axe - 25% probability
}

Binarization methods, middle-threshold binarisation

I'm trying to binarise a picture, first of course having it prepared (grayscaled).
My method is to find the maximum and minimum grayscale values, take the middle value as my threshold, and then, iterating over all the pixels, compare each one with the threshold: if the grayscale value is larger than the threshold I put 0 in a matrix, otherwise I put 1.
But now I'm facing a problem. Usually I'm binarising images with a white background, so my algorithm is based on that feature. When I meet an image with a black background everything collapses, although I can still see the number clearly (the 0s and 1s just switch places).
How can I solve this problem and make my program more general?
Maybe I'd better look for other ways of binarization?
P.S. I looked for an understandable explanation of Otsu's threshold method, but it seems either I'm not prepared for this level of difficulty or I keep finding very complicated explanations, and I can't write it in C. If anyone could help here, it'd be wonderful.
Sorry for not answering the questions, I just didn't see them.
Firstly - the code
for (int y = 1; y < Source->Picture->Height; y++)
    for (int x = 1; x < Source->Picture->Width; x++)
    {
        unsigned green = GetGValue(Source->Canvas->Pixels[x][y]);
        unsigned red   = GetRValue(Source->Canvas->Pixels[x][y]);
        unsigned blue  = GetBValue(Source->Canvas->Pixels[x][y]);
        threshold = (0.2125*red + 0.7154*green + 0.0721*blue);
        if (min > threshold)
            min = threshold;
        if (max < threshold)
            max = threshold;
    }
middle = (max + min) / 2;
Then iterating through the image
        if (threshold < middle)
        {
            picture[x][y] = 1;
            fprintf(fo, "1");
        } else {
            picture[x][y] = 0;
            fprintf(fo, "0");
        }
    }
    fprintf(fo, "\n");
}
fclose(fo);
So I get a file, something like this
000000000
000001000
000001000
000011000
000101000
000001000
000001000
000001000
000000000
Here you can see an example of one.
Then I can interpolate it, or do something else (recognition), depending on the zeros and ones.
But if I switch the colors, the numbers won't be the same, so the recognition will not work. I wonder if there's an algorithm that can help me out.
I've never heard of Otsu's method, but I understand some of the wikipedia page so I'll try to simplify that.
1. Count how many pixels are at each level of darkness.
2. "Guess" a threshold.
3. Calculate the variance of the counts of darkness less than the threshold.
4. Calculate the variance of the counts of darkness greater than the threshold.
5. If the variance of the darker side is greater, guess a darker threshold; else guess a higher threshold. Do this like a binary search so that it ends.
6. Turn all pixels darker than the threshold black, the rest white.
Otsu's method is actually "maximizing inter-class variance", but I don't understand that part of the math.
The concept of variance, is "how far apart are the values from each other." A low variance means everything is similar. A high variance means the values are far apart. The variance of a rainbow is very high, lots of colors. The variance of the background of stackoverflow is 0, since it's all perfectly white, with no other colors. Variance is calculated more or less like this
#include <cmath>   // std::abs

double variance(unsigned int* counts, int size, int threshold, bool above) {
    // quick trick to turn the "upper" class into a "lower" one and save code:
    // start the array at the threshold; shifting the indices does not change the variance
    if (above) return variance(counts + threshold, size - threshold, size - threshold, false);
    // first we calculate the average value
    unsigned long long atotal = 0;
    unsigned long long acount = 0;
    for (int i = 0; i < threshold; ++i) {
        atotal += counts[i]*i;   // number of px times value
        acount += counts[i];
    }
    // finish calculating the average
    double average = double(atotal)/acount;
    // next we calculate the variance
    double vtotal = 0;
    for (int i = 0; i < threshold; ++i) {
        // to do so we get each value's difference from the average
        double t = std::abs(i - average);
        // and square it (I hate mathematicians)
        vtotal += counts[i]*t*t;
    }
    // and return the average of those squared values
    return vtotal/acount;
}
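For completeness, a rough sketch (my addition, assuming an 8-bit histogram with size == 256) of how the helper above could drive the threshold search: try every threshold and keep the one with the smallest weighted sum of the two class variances, which is the same as maximizing the inter-class variance.

    #include <limits>

    int otsuThreshold(unsigned int* counts, int size) {
        unsigned long long total = 0;
        for (int i = 0; i < size; ++i) total += counts[i];

        int bestThreshold = 0;
        double bestWithin = std::numeric_limits<double>::max();
        for (int t = 1; t < size; ++t) {
            unsigned long long below = 0;
            for (int i = 0; i < t; ++i) below += counts[i];
            unsigned long long above = total - below;
            if (below == 0 || above == 0) continue;      // skip empty classes
            // weighted within-class variance
            double within = (double)below/total * variance(counts, size, t, false)
                          + (double)above/total * variance(counts, size, t, true);
            if (within < bestWithin) {
                bestWithin = within;
                bestThreshold = t;
            }
        }
        return bestThreshold;
    }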
I would tackle this problem with another approach:
Compute the cumulative histogram of greyscaled values of the image.
Use as threshold the pixel value at which this cumulative count reaches half of the total pixels of the image.
The algorithm would go as follows:
int bin[256] = {0};

foreach pixel in image
    bin[pixelvalue]++;
endfor // this computes the histogram of the image

int thresholdCount = ImageWidth * ImageHeight / 2;
int count = 0;
for int i = 0 to 255
    count = count + bin[i];
    if (count > thresholdCount)
        threshold = i;
        break; // we are done
    endif
endfor
This algorithm does not compute the cumulative histogram itself but rather uses the image histogram to do what I said earlier.
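A compact C++ version of the same idea (my sketch; it assumes the greyscale values are already available as a vector of 0-255 values):

    #include <vector>

    // Median-style threshold: the grey level at which the cumulative histogram
    // passes half of the pixels.
    int medianThreshold(const std::vector<unsigned char>& grey)
    {
        unsigned long long bin[256] = {0};
        for (unsigned char g : grey)
            bin[g]++;                        // histogram

        unsigned long long half = grey.size() / 2;
        unsigned long long count = 0;
        for (int i = 0; i < 256; ++i)
        {
            count += bin[i];
            if (count > half)
                return i;                    // we are done
        }
        return 255;
    }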
If your algorithm works properly for white backgrounds but fails for black backgrounds, you simply need to detect when you have a black background and invert the values. If you assume the background value will be more common, you can simply count the number of 1s and 0s in the result; if the 0s are greater, invert the result.
Instead of using the mean of min and max, you should use the median of all points as the threshold. In general, the kth percentile (where k is the percentage of points you want as black) is more appropriate.
Another solution is to cluster the data into two clusters.