In comparing bernoulli_distribution's default constructor (a 50/50 chance of true/false) and uniform_int_distribution{0, 1} (an equally likely chance of 0 or 1), I find that bernoulli_distribution is at least 2x and as much as 6x slower than uniform_int_distribution, despite the fact that they give equivalent results.
I would expect bernoulli_distribution to perform better because it is specifically designed for the case of only two outcomes, true or false; yet it doesn't.
Given the above and the below performance metrics, are there practical uses of bernoulli distributions over uniform_int_distributions?
Results over 5 runs (Release mode, x64):
(See edit below for release runs without the debugger attached)
bernoulli: 58 ms
false: 500690
true: 499310
uniform: 9 ms
1: 499710
0: 500290
----------
bernoulli: 57 ms
false: 500921
true: 499079
uniform: 9 ms
0: 499614
1: 500386
----------
bernoulli: 61 ms
false: 500440
true: 499560
uniform: 9 ms
0: 499575
1: 500425
----------
bernoulli: 59 ms
true: 498798
false: 501202
uniform: 9 ms
1: 499485
0: 500515
----------
bernoulli: 58 ms
true: 500777
false: 499223
uniform: 9 ms
0: 500450
1: 499550
----------
Profiling code:
#include <chrono>
#include <random>
#include <iostream>
#include <unordered_map>
int main() {
    auto gb = std::mt19937{std::random_device{}()};
    auto bd = std::bernoulli_distribution{};
    auto bhist = std::unordered_map<bool, int>{};

    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < 1'000'000; ++i) {
        bhist[bd(gb)]++;
    }
    auto end = std::chrono::steady_clock::now();
    auto dif = std::chrono::duration_cast<std::chrono::milliseconds>(end - start);

    std::cout << "bernoulli: " << dif.count() << " ms\n";
    std::cout << std::boolalpha;
    for (auto& b : bhist) {
        std::cout << b.first << ": " << b.second << '\n';
    }
    std::cout << std::noboolalpha;
    std::cout << '\n';

    auto gu = std::mt19937{std::random_device{}()};
    auto u = std::uniform_int_distribution<int>{0, 1};
    auto uhist = std::unordered_map<int, int>{};

    start = std::chrono::steady_clock::now();
    for (int i = 0; i < 1'000'000; ++i) {
        uhist[u(gu)]++;
    }
    end = std::chrono::steady_clock::now();
    dif = std::chrono::duration_cast<std::chrono::milliseconds>(end - start);

    std::cout << "uniform: " << dif.count() << " ms\n";
    for (auto& b : uhist) {
        std::cout << b.first << ": " << b.second << '\n';
    }
    std::cout << '\n';
}
EDIT
I re-ran the test without the debugger attached and bernoulli still ran a good 4x slower:
bernoulli: 37 ms
false: 500250
true: 499750
uniform: 9 ms
0: 500433
1: 499567
-----
bernoulli: 36 ms
false: 500595
true: 499405
uniform: 9 ms
0: 499061
1: 500939
-----
bernoulli: 36 ms
false: 500988
true: 499012
uniform: 8 ms
0: 499596
1: 500404
-----
bernoulli: 36 ms
true: 500425
false: 499575
uniform: 8 ms
0: 499974
1: 500026
-----
bernoulli: 36 ms
false: 500847
true: 499153
uniform: 8 ms
0: 500082
1: 499918
-----
A default-constructed std::bernoulli_distribution gives equal weight to both outcomes, but you can give it a different distribution parameter to change the probabilities; supporting that may add extra complexity. A better comparison would be to use a std::uniform_real_distribution<double> and compare its result to 0.5 (by default it gives a random number in the range [0, 1)).
See here for an example:
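The code behind the missing link isn't reproduced here, but a minimal sketch of the comparison being described (assuming a std::mt19937 generator) could look like:

#include <random>

// draw a double in [0, 1) and compare against 0.5 -- roughly the work
// a fair bernoulli_distribution has to do per call
bool coin_flip(std::mt19937& gen) {
    return std::uniform_real_distribution<double>{0.0, 1.0}(gen) < 0.5;
}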
gcc output:
bernoulli: 28 ms
false: 499818
true: 500182
uniform: 31 ms
1: 500686
0: 499314
real: 29 ms
1: 500191
0: 499809
clang output:
bernoulli: 106 ms
false: 500662
true: 499338
uniform: 23 ms
1: 501263
0: 498737
real: 101 ms
1: 499683
0: 500317
The results are about the same using gcc (multiple runs tend to give uniform int a higher time, contrary to what you saw). With clang, I get bernoulli and real taking about the same time, with uniform int taking much less. Both of these are using -O3.
The class bernoulli_distribution is used to generate booleans with possibly uneven probabilities. To achieve that, it has to generate a floating-point number in the [0, 1] range and then compare it against the given probability, or do something equivalent.
It is rather obvious that this routine is likely to be slower than taking a random integer modulo 2, which is pretty much all it takes to create a uniform number in {0, 1} from a random integer.
How is that surprising? Only if the compiler somehow managed to eliminate the unnecessary operations, knowing at compile time that the probability is 50/50, could the performance even match.
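A simplified sketch of what the argument above amounts to per draw (an illustration, not the actual library implementation):

#include <random>

// bernoulli-style draw: build a real in [0, 1) from generator output,
// then one floating-point compare against p
bool bernoulli_like(std::mt19937& gen, double p) {
    return std::generate_canonical<double, 53>(gen) < p;
}

// uniform-int-style draw for {0, 1}: one raw draw, one bit test
bool uniform_like(std::mt19937& gen) {
    return gen() & 1u;
}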
Some comments and answers suggest using uniform_real_distribution instead.
I tested uniform_real_distribution(0.0f, nextafter(1.0f, 20.f)) (the nextafter to account for urd producing a half-open range) against bernoulli_distribution, and bernoulli_distribution is faster by about 20%-25% regardless of the probability. It also gave more correct results: when I tested with a true probability of 1.0, my implementation using the above urd values actually produced false negatives (granted, one or two out of five one-million-iteration runs), while bernoulli correctly produced none.
So, speed-wise: bernoulli_distribution is faster than uniform_real_distribution but slower than uniform_int_distribution.
Long story short: use the right tool for the job, don't reinvent the wheel, the STL is well-built, etc., and depending on the use case one is better than the other:
For a yes-no probability (IsPercentChance(float probability)), bernoulli_distribution is faster and better.
For a pure "give me a random bool value", uniform_int_distribution is faster and better.
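A sketch of both helpers under those conclusions (the IsPercentChance name comes from the text above; the generator parameter is an assumption):

#include <random>

// yes-no check with a given probability of true
bool IsPercentChance(std::mt19937& gen, float probability) {
    return std::bernoulli_distribution{probability}(gen);
}

// plain 50/50 random bool
bool RandomBool(std::mt19937& gen) {
    return std::uniform_int_distribution<int>{0, 1}(gen) == 1;
}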
Although there are plenty of algorithms and functions available online for generating unique combinations of any size from a list of unique items, there is none available for a list of non-unique items (i.e. a list containing repetitions of the same value).
The question is how to generate, ON-THE-FLY in a generator function, all the unique combinations from a non-unique list, without the computationally expensive need of filtering out duplicates?
I consider combination comboA to be unique if there is no other combination comboB for which sorted lists for both combinations are the same. Let's give an example of code checking for such uniqueness:
comboA = [1,2,2]
comboB = [2,1,2]
print("B is a duplicate of A" if sorted(comboA)==sorted(comboB) else "A is unique compared to B")
In the above given example B is a duplicate of A and the print() prints B is a duplicate of A.
The problem of getting a generator function capable of providing unique combinations on-the-fly for a non-unique list is solved here: Getting unique combinations from a non-unique list of items, FASTER?, but the provided generator function needs lookups and requires memory, which causes problems for a huge number of combinations.
The function provided in the current version of the answer does the job without any lookups and appears to be the right answer here, BUT ...
The goal behind getting rid of lookups is to speed up the generation of unique combinations in case of a list with duplicates.
I initially (writing the first version of this question) wrongly assumed that code which doesn't need to create a set for the lookups that assure uniqueness would have an advantage over code that needs them. That is not the case, at least not always. The code in the answer provided so far does not use lookups, but takes much more time to generate all the combinations when the list has no redundancy, or only a few redundant items.
Here some timings to illustrate the current situation:
-----------------
k: 6 len(ls): 48
Combos Used Code Time
---------------------------------------------------------
12271512 len(list(combinations(ls,k))) : 2.036 seconds
12271512 len(list(subbags(ls,k))) : 50.540 seconds
12271512 len(list(uniqueCombinations(ls,k))) : 8.174 seconds
12271512 len(set(combinations(sorted(ls),k))): 7.233 seconds
---------------------------------------------------------
12271512 len(list(combinations(ls,k))) : 2.030 seconds
1 len(list(subbags(ls,k))) : 0.001 seconds
1 len(list(uniqueCombinations(ls,k))) : 3.619 seconds
1 len(set(combinations(sorted(ls),k))): 2.592 seconds
The timings above illustrate the two extremes: no duplicates and only duplicates. All other timings fall between these two.
My interpretation of the results above is that a pure Python function (not using any C-compiled modules) can be much faster, but it can also be much slower, depending on how many duplicates are in the list. So there is probably no way around writing C/C++ code for a Python .so extension module providing the required functionality.
Instead of post-processing/filtering your output, you can pre-process your input list. This way, you can avoid generating duplicates in the first place. Pre-processing involves either sorting the input (or using a collections.Counter on it). One possible recursive realization is:
def subbags(bag, k):
    a = sorted(bag)
    n = len(a)
    sub = []

    def index_of_next_unique_item(i):
        j = i + 1
        while j < n and a[j] == a[i]:
            j += 1
        return j

    def combinate(i):
        if len(sub) == k:
            yield tuple(sub)
        elif n - i >= k - len(sub):
            sub.append(a[i])
            yield from combinate(i + 1)
            sub.pop()
            yield from combinate(index_of_next_unique_item(i))

    yield from combinate(0)

bag = [1, 2, 3, 1, 2, 1]
k = 3
i = -1

print(sorted(bag), k)
print('---')
for i, subbag in enumerate(subbags(bag, k)):
    print(subbag)
print('---')
print(i + 1)
Output:
[1, 1, 1, 2, 2, 3] 3
---
(1, 1, 1)
(1, 1, 2)
(1, 1, 3)
(1, 2, 2)
(1, 2, 3)
(2, 2, 3)
---
6
Requires some stack space for the recursion, but this + sorting the input should use substantially less time + memory than generating and discarding repeats.
The current state of the art, inspired first by a 50 and then by a 100 reputation bounty, is at the moment (instead of a Python extension module written entirely in C):
An efficient algorithm and implementation that is better than the obvious (set + combinations) approach in the best (and average) case, and is competitive with it in the worst case.
It seems to be possible to fulfill this requirement using a kind of "fake it before you make it" approach. The current state of the art is that there are two generator-function algorithms available for solving the problem of getting unique combinations from a non-unique list. The algorithm below combines both of them, which becomes possible because there appears to be a threshold value for the percentage of unique items in the list that can be used to switch appropriately between the two algorithms. The calculation of the percentage of uniqueness takes so little computation time that it doesn't even clearly show up in the final results, due to the common variation in timings.
def iterFastUniqueCombos(lstList, comboSize, percUniqueThresh=60):
    lstListSorted = sorted(lstList)
    lenListSorted = len(lstListSorted)
    percUnique = 100.0 - 100.0*(lenListSorted - len(set(lstListSorted)))/lenListSorted
    lstComboCandidate = []
    setUniqueCombos = set()

    def idxNextUnique(idxItemOfList):
        idxNextUniqueCandidate = idxItemOfList + 1
        while (
            idxNextUniqueCandidate < lenListSorted
            and
            lstListSorted[idxNextUniqueCandidate] == lstListSorted[idxItemOfList]
        ):  # while
            idxNextUniqueCandidate += 1
        return idxNextUniqueCandidate

    def combinate(idxItemOfList):
        # note: the original posting referred here to an undefined
        # `sizeOfCombo`; the parameter is named comboSize
        if len(lstComboCandidate) == comboSize:
            yield tuple(lstComboCandidate)
        elif lenListSorted - idxItemOfList >= comboSize - len(lstComboCandidate):
            lstComboCandidate.append(lstListSorted[idxItemOfList])
            yield from combinate(idxItemOfList + 1)
            lstComboCandidate.pop()
            yield from combinate(idxNextUnique(idxItemOfList))

    if percUnique > percUniqueThresh:
        from itertools import combinations
        allCombos = combinations(lstListSorted, comboSize)
        for comboCandidate in allCombos:
            if comboCandidate in setUniqueCombos:
                continue
            yield comboCandidate
            setUniqueCombos.add(comboCandidate)
    else:
        yield from combinate(0)
    #:if/else
#:def iterFastUniqueCombos()
The timings below show that the above iterFastUniqueCombos() generator function provides a clear advantage over the uniqueCombinations() variant when the list has less than 60 percent unique elements, and is no worse than the (set + combinations) based uniqueCombinations() generator function in the opposite case, where it is much faster than the iterUniqueCombos() one (due to switching between the (set + combinations) and the (no lookups) variant at the 60% threshold of unique elements in the list):
=========== sizeOfCombo: 6 sizeOfList: 48 noOfUniqueInList 1 percUnique 2
Combos: 12271512 print(len(list(combinations(lst,k)))) : 2.04968 seconds.
Combos: 1 print(len(list( iterUniqueCombos(lst,k)))) : 0.00011 seconds.
Combos: 1 print(len(list( iterFastUniqueCombos(lst,k)))) : 0.00008 seconds.
Combos: 1 print(len(list( uniqueCombinations(lst,k)))) : 3.61812 seconds.
========== sizeOfCombo: 6 sizeOfList: 48 noOfUniqueInList 48 percUnique 100
Combos: 12271512 print(len(list(combinations(lst,k)))) : 1.99383 seconds.
Combos: 12271512 print(len(list( iterUniqueCombos(lst,k)))) : 49.72461 seconds.
Combos: 12271512 print(len(list( iterFastUniqueCombos(lst,k)))) : 8.07997 seconds.
Combos: 12271512 print(len(list( uniqueCombinations(lst,k)))) : 8.11974 seconds.
========== sizeOfCombo: 6 sizeOfList: 48 noOfUniqueInList 27 percUnique 56
Combos: 12271512 print(len(list(combinations(lst,k)))) : 2.02774 seconds.
Combos: 534704 print(len(list( iterUniqueCombos(lst,k)))) : 1.60052 seconds.
Combos: 534704 print(len(list( iterFastUniqueCombos(lst,k)))) : 1.62002 seconds.
Combos: 534704 print(len(list( uniqueCombinations(lst,k)))) : 3.41156 seconds.
========== sizeOfCombo: 6 sizeOfList: 48 noOfUniqueInList 31 percUnique 64
Combos: 12271512 print(len(list(combinations(lst,k)))) : 2.03539 seconds.
Combos: 1114062 print(len(list( iterUniqueCombos(lst,k)))) : 3.49330 seconds.
Combos: 1114062 print(len(list( iterFastUniqueCombos(lst,k)))) : 3.64474 seconds.
Combos: 1114062 print(len(list( uniqueCombinations(lst,k)))) : 3.61857 seconds.
I have an idea of how to prevent a single bit flip (due to cosmic radiation or a similar externally induced event) from changing an enumeration (enum) from one defined value to another defined value, in a relatively easy way. To put it simply, each value should have an even number of ones, binary speaking. If one bit flips, the value becomes odd and is guaranteed not to match any other enum value.
I'm not sure how to actually "generate" such a sequence so that it may be used as enum values, as those values must be compile-time constants. A macro function returning the nth element of the set would do perfectly.
The first few numbers in the sequence would be 0 (000), 3 (011), 5 (101), 6 (110). I think you get the idea by now.
Non-enumeration (non-compile-time) answers are appreciated as they may help me realize how to do it myself.
To make myself clear: I want a macro generating the nth number with an even number of ones in its bit pattern, similar to macros generating the Fibonacci sequence. The lowest bit is essentially a parity bit.
Most of my memory is protected by hardware ECC, L1 cache being one exception. A single bit error in L1 has been measured to occur once every 10000h which is good enough seen from my requirements.
VRAM however is not protected. There I have mostly RGB(A) rasters, a few general-purpose rasters (like a stencil), and some geometry. The RGB raster is rather insensitive to bit flips, as it is only used for visualization. Erroneous geometry is in general very visible, very rare (a few KB), and is by design resolved by a user-induced reboot.
For a 4096x4096x8bit stencil (~16MB), the single-bit error rate in my environment is about once every 8h under average cosmic radiation, more often during solar storms. That is actually not that bad in my opinion, but I'd hate to fill in the paperwork proving to my officers why this is perfectly fine in my application and everyone else's that uses the stencil data, without regard to how the data is used. With a parity bit in the stencil value, however, I'd be able to detect most errors and, if necessary, re-generate the stencil, hoping for better results. The stencil can be generated in less than a second, so the risk of errors occurring twice in a row is considered low.
So basically, by generating a set of enumerations with parity, I dodge the bullet of current and future paperwork and research regarding how it may affect my app and others'.
If you simply want to know whether the enum is valid or a bit is flipped, you can use any values plus a parity bit that makes the total number of set bits even (this first sequence is identical to your example):
0000000 0 = 0
0000001 1 = 3
0000010 1 = 5
0000011 0 = 6
0000100 1 = 9
0000101 0 = 10
0000110 0 = 12
0000111 1 = 15
which can be done by
// assumes a GCC/Clang-style popcount builtin for the otherwise undefined helper
int number_of_bits_set(int e) { return __builtin_popcount(e); }

int encode_enum(int e) {
    return (e << 1) + (number_of_bits_set(e) % 2);
}
However, if you want to be able to restore the value, then a simple way is duplication: keep multiple copies of the value that can later be compared to each other. You'd need 3 copies to restore it. If your list of values is small, you can encode all copies into one integer.
int encode_enum(int e) {
    return (e << 20) | (e << 10) | e;
}
which, if e is less than 2^10, just copies it 3 times into a single 32-bit integer.
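A hypothetical decoder for that layout (not in the original answer) would recover each bit by 2-of-3 majority vote across the copies, correcting any single bit flip:

int decode_enum(int encoded) {
    int a = (encoded >> 20) & 0x3FF;  // copy 3
    int b = (encoded >> 10) & 0x3FF;  // copy 2
    int c = encoded & 0x3FF;          // copy 1
    return (a & b) | (a & c) | (b & c);  // per-bit majority
}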
C++14's constexpr solves this for you:
#include <iostream>

constexpr int bit_count(int val)
{
    int count = 0;
    for (int i = 0; i < 31; ++i) {
        if (val & (1 << i))
            ++count;
    }
    return count;
}

constexpr int next_bitset(int last)
{
    int candidate = last + 1;
    if (bit_count(candidate) & 1)
        return next_bitset(candidate);
    return candidate;
}

enum values
{
    a,
    b = next_bitset(a),
    c = next_bitset(b),
    d = next_bitset(c)
};

int main()
{
    std::cout << "a = " << a << std::endl;
    std::cout << "b = " << b << std::endl;
    std::cout << "c = " << c << std::endl;
    std::cout << "d = " << d << std::endl;
}
expected output:
a = 0
b = 3
c = 5
d = 6
I am running a simple program where I take a time_point with system_clock::now, then this_thread::sleep_for(seconds(1)), and then another time_point with system_clock::now.
Now if I add some extra duration to the 1st time_point, it gives exactly the same result for 1 and 2 seconds!
Here is the demo code:
#include <iostream>
#include <chrono>
#include <thread>

using namespace std;

void CheckDuration(std::chrono::duration<int> seconds)
{
    auto start = std::chrono::system_clock::now() + seconds;
    std::this_thread::sleep_for(std::chrono::seconds(1));
    auto stop = std::chrono::system_clock::now();
    cout << "Difference = " << std::chrono::duration_cast<std::chrono::seconds>(stop - start).count() << endl;
}

int main()
{
    CheckDuration(std::chrono::duration<int>(0)); // Difference = 1
    CheckDuration(std::chrono::duration<int>(1)); // Difference = 0
    CheckDuration(std::chrono::duration<int>(2)); // Difference = 0 <=== ???
    CheckDuration(std::chrono::duration<int>(3)); // Difference = -1
}
It is clarifying to add output with finer units, for example:
cout << "Difference = " << std::chrono::duration_cast<std::chrono::milliseconds>(stop-start).count() << endl;
For me, for the 3rd case (argument 2 seconds), the output is:
Difference = -998
(that is in milliseconds)
To analyze this, let T0 represent the time now() is first called in CheckDuration. So:
start == T0 + 2s
stop is called at T0, plus 1 second for sleeping, plus a tiny bit of processing time we can call epsilon. So:
stop == T0 + 1s + epsilon
Subtracting these two we get:
T0 + 1s + epsilon - (T0 + 2s)
simplifying:
epsilon - 1s
In my case, epsilon == 2ms
duration_cast truncates toward zero when the conversion cannot be made exactly. So -998ms truncates to 0s. For other duration and time_point rounding modes that may be helpful in your computations, see:
http://howardhinnant.github.io/duration_io/chrono_util.html
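C++17 later standardized those rounding modes in <chrono>; a small sketch of the -998ms case under that assumption:

#include <chrono>
#include <iostream>

int main()
{
    using namespace std::chrono;
    milliseconds d{-998};
    std::cout << duration_cast<seconds>(d).count() << '\n';      // 0  (truncate toward zero)
    std::cout << std::chrono::floor<seconds>(d).count() << '\n'; // -1 (toward negative infinity)
    std::cout << std::chrono::round<seconds>(d).count() << '\n'; // -1 (nearest; -1s is only 2ms away)
    std::cout << std::chrono::ceil<seconds>(d).count() << '\n';  // 0  (toward positive infinity)
}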
I have this multimap in my code:
multimap<long, Note> noteList;

// notes are added with this method. measureNumber is minimum `1` and doesn't go very high
void Track::addNote(Note &note) {
    long key = note.measureNumber * 1000000 + note.startTime;
    this->noteList.insert(make_pair(key, note));
}
I'm encountering problems when I try to read the notes from the last measure. In this case the song has only 8 measures and it's measure number 8 that causes problems. If I go up to 16 measures it's measure 16 that causes the problem and so on.
// (when adding notes I use as key the measureNumber * 1000000. This searches for notes within the same measure)
for (noteIT = trackIT->noteList.lower_bound(this->curMsr * 1000000);
     noteIT->first < (this->curMsr + 1) * 1000000; noteIT++) {
    if (this->curMsr == 8) {
        cout << "_______________________________________________________" << endl;
        cout << "ID:" << noteIT->first << endl;
        noteIT->second.toString();
        int blah = 0;
    }
    // code left out here that processes the notes
}
I have only added one note to the 8th measure and yet this is the result I'm getting in console:
_______________________________________________________
ID:8000001
note toString()
Duration: 8
Start Time: 1
Frequency: 880
_______________________________________________________
ID:1
note toString()
Duration: 112103488
Start Time: 44
Frequency: 0
_______________________________________________________
ID:8000001
note toString()
Duration: 8
Start Time: 1
Frequency: 880
_______________________________________________________
ID:1
note toString()
Duration: 112103488
Start Time: 44
Frequency: 0
This keeps repeating. The first result is a correct note which I've added myself but I have no idea where the note with ID: 1 is coming from.
Any ideas how to avoid this? This loop gets stuck repeating the same two results and I can't get out of it. Even if there are several notes within measure 8 (meaning several values within the multimap whose keys start with 8xxxxxx), it only repeats the first note and the non-existent one.
You aren't checking for the end of your loop correctly. Specifically, there is no guarantee that noteIT never equals trackIT->noteList.end(). Try this instead:
for (noteIT = trackIT->noteList.lower_bound(this->curMsr * 1000000);
noteIT != trackIT->noteList.end() &&
noteIT->first < (this->curMsr + 1) * 1000000;
++noteIT)
{
From the look of it, it might be better to use a call to upper_bound as the limit of your loop. That would handle the end case automatically.
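A sketch of that suggestion, reusing the question's names (trackIT, curMsr); computing the range endpoints once keeps the end-of-map case out of the loop condition:

auto first = trackIT->noteList.lower_bound(this->curMsr * 1000000);
auto last  = trackIT->noteList.upper_bound((this->curMsr + 1) * 1000000 - 1);
for (auto noteIT = first; noteIT != last; ++noteIT) {
    // process noteIT->second ...
}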
Is it possible to print a random number in C++ from a set of numbers with ONE SINGLE statement?
Let's say the set is {2, 5, 22, 55, 332}
I looked up rand() but I doubt it's possible to do in a single statement.
int numbers[] = { 2, 5, 22, 55, 332 };
int length = sizeof(numbers) / sizeof(int);
int randomNumber = numbers[rand() % length];
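If "one statement" means one expression-statement, C++17's std::array deduction lets the same table lookup be written inline (a sketch assuming a C++17 compiler with <array> and <iostream> included, and rand() seeded elsewhere):

std::cout << std::array{2, 5, 22, 55, 332}[rand() % 5] << '\n';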
Pointlessly turning things into a single expression is practically what the ternary operator was invented for (I'm having none of litb's compound-statement trickery):
std::cout << ((rand()%5==0) ? 2 :
(rand()%4==0) ? 5 :
(rand()%3==0) ? 22 :
(rand()%2==0) ? 55 :
332
) << std::endl;
Please don't rat on me to my code reviewer.
Ah, here we go, a proper uniform distribution (assuming rand() is uniform on its range) in what you could maybe call a "single statement", at a stretch.
It's an iteration-statement, but then so is a for loop with a great big block containing multiple statements. The syntax doesn't distinguish. This actually contains two statements: the whole thing is a statement, and the whole thing excluding the for(...) part is a statement. So probably "a single statement" means a single expression-statement, which this isn't. But anyway:
// weasel #1: #define for brevity. If that's against the rules,
// it can be copy and pasted 7 times below.
#define CHUNK ((((unsigned int)RAND_MAX) + 1) / 5)
// weasel #2: for loop lets me define and use a variable in C++ (not C89)
for (unsigned int n = 5*CHUNK; n >= 5*CHUNK;)
// weasel #3: sequence point in the ternary operator
((n = rand()) < CHUNK) ? std::cout << 2 << "\n" :
(n < 2*CHUNK) ? std::cout << 5 << "\n" :
(n < 3*CHUNK) ? std::cout << 22 << "\n" :
(n < 4*CHUNK) ? std::cout << 55 << "\n" :
(n < 5*CHUNK) ? std::cout << 332 << "\n" :
(void)0;
// weasel #4: retry if we get one of the few biggest values
// that stop us distributing values evenly between 5 options.
If this is going to be the only code in the entire program, and you don't want it to return the same value every time, then you need to call srand(). Fortunately this can be fitted in. Change the first line to:
for (unsigned int n = (srand((time(0) % UINT_MAX)), 5*CHUNK); n >= 5*CHUNK;)
Now, let us never speak of this day again.
Say these numbers are in a set of size 5; all you have to do is pick a random index (uniformly, to make each value equally probable). Assume the rand() method returns a random value in the range 0 to 1. Multiply it by 5 and cast to an integer, and you get equiprobable values between 0 and 4. Use that to fetch from the index.
I don't know the exact syntax in C++, but it should look like this:
my_rand_val = my_set[(int)(rand()*arr_size)]
Here I assume rand() is a method that returns a value between 0 and 1.
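In actual C++, rand() returns an integer in [0, RAND_MAX], not a real in [0, 1), so the idea above needs a scaling step; a sketch (my_set and arr_size as assumed above):

int my_rand_val = my_set[(int)(rand() / (RAND_MAX + 1.0) * arr_size)];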
Yes, it is possible. Not very intuitive, but you asked for it:
#include <time.h>
#include <stdlib.h>
#include <iostream>

int main()
{
    srand(time(0));
    // note: the (int[]){...} compound literal is a C feature that C++
    // compilers accept only as an extension
    int randomNumber = ((int[]) {2, 5, 22, 55, 332})[rand() % 5];
    std::cout << randomNumber << std::endl;
    return 0;
}
Your "single statement" criterion is very vague. Do you mean one machine instruction, or one stdlib call?
If you mean one machine instruction, the answer is no, without special hardware.
If you mean one function call, then of course it is possible. You could write a simple function to do what you want:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main()
{
    int setSize = 5;
    int set[] = {2, 5, 22, 55, 332};
    srand(time(0));
    int number = rand() % setSize;
    printf("%d %d", number, set[number]);
    return 0;
}