So, I am having a problem with my code. The program needs to randomly choose one of 4 printfs and print it in the terminal. I am new at this, so sorry about that.
#include <stdio.h>
#include <stdlib.h>
#include <locale.h>
int main () {
setlocale (LC_ALL, "Portuguese");
int opcao;
opcao = rand() % 3 + 1;
if (opcao == 0) {
printf ("\nA opção sorteada foi a de que o 1º classificado atual será o campeão (FC Porto)");
}
if (opcao == 1) {
printf ("\nA opção sorteada foi a de que o 1º classificado na 1ª volta será o campeão (SL Benfica)");
}
if (opcao == 2) {
printf ("\nA opção sorteada foi a de que Porto e Benfica farão um jogo em campo neutro para determinar o campeão!");
}
if (opcao == 4) {
printf ("\nFoi sorteada a opção de que não haverá campeão esta época");
}
return 0;
}
This is my code, but it keeps choosing the same printf over and over.
Use the <random> library instead of the obsolete and error-prone std::rand (using the modulo operator to obtain a random integer in a range is a common mistake). See Why is the new random library better than std::rand()? for more information.
#include <iostream>
#include <random>
int main()
{
std::mt19937 engine{std::random_device{}()};
std::uniform_int_distribution<int> dist{0, 3};
switch (dist(engine)) {
case 0:
std::cout << "...\n";
break;
case 1:
std::cout << "...\n";
break;
case 2:
std::cout << "...\n";
break;
case 3:
std::cout << "...\n";
break;
}
}
Here, we first create a std::mt19937 engine, which produces uniformly distributed integers in the half-open range [0, 2^32), and seed it using a std::random_device, which is supposed to generate a non-deterministic number (it may be implemented using the system time, for example). Then, we create a std::uniform_int_distribution to map the random numbers generated by the engine to integers in the inclusive interval [0, 3] by calling it with the engine as argument.
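As a quick aside on the modulo-operator warning at the top of this answer, here is a minimal sketch (my own illustration, assuming a 15-bit rand-style range of 0 to 32767) showing that mapping such a range through % 3 cannot give perfectly equal probabilities:
#include <array>
#include <iostream>
int main()
{
    constexpr int range = 32768;            // 0 .. 32767, the minimal rand() range
    std::array<long, 3> counts{};
    for (int value = 0; value < range; ++value) {
        ++counts[value % 3];                // how a naive "rand() % 3" maps values
    }
    for (int r = 0; r < 3; ++r) {
        std::cout << "residue " << r << ": " << counts[r] << " source values\n";
    }
    // residues 0 and 1 each receive 10923 source values, residue 2 only 10922,
    // so the lower results are slightly more likely than the highest one.
}
The bias is tiny here, but it grows as the requested range approaches the generator's range; std::uniform_int_distribution avoids the problem altogether.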
This can be generalized by taking a range of strings to print:
template <typename RanIt, typename F>
decltype(auto) select(RanIt begin, RanIt end, F&& f)
{
if (begin == end) {
throw std::invalid_argument{"..."};
}
thread_local std::mt19937 engine{std::random_device{}()};
using index_t = long long; // for portability
std::uniform_int_distribution<index_t> dist{0, index_t{end - begin - 1}};
return std::invoke(std::forward<F>(f), begin[dist(engine)]);
}
int main()
{
const std::array<std::string, 4> messages {
// ...
};
select(messages.begin(), messages.end(),
[](const auto& string) {
std::cout << string << '\n';
});
}
Here, we take a pair of random access iterators and a Callable object to support selecting an element from an arbitrary random-accessible range and performing an arbitrary operation on it.
First, we check if the range is empty, in which case selection is not possible and an error is reported by throwing an exception.
Then, we create a std::mt19937 engine that is thread_local (that is, each thread has its own engine) to prevent data races. The state of the engine is maintained between calls, so we only seed it once for every thread.
After that, we create a std::uniform_int_distribution to generate a random index. Note that we used long long instead of typename std::iterator_traits<RanIt>::difference_type: std::uniform_int_distribution is only guaranteed to work with short, int, long, long long, unsigned short, unsigned int, unsigned long, and unsigned long long, so if difference_type is signed char or an extended signed integer type, it results in undefined behavior. long long is the largest supported signed integer type, and we use braced initialization to prevent narrowing conversions.
Finally, we std::forward the Callable object and std::invoke it with the selected element. The decltype(auto) specifier makes sure that the type and value category of the invocation is preserved.
We call the function with an std::array and a lambda expression that prints the selected element.
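To make the decltype(auto) point concrete, here is a small hypothetical usage (my own; it assumes the select() template above, together with the headers it needs such as <random>, <functional> and <stdexcept>, is in scope) where the callable returns the element by reference and select() forwards that reference out:
#include <array>
#include <iostream>
#include <string>
int main()
{
    const std::array<std::string, 4> messages { "first", "second", "third", "fourth" };
    // The callable returns its argument unchanged; because select() is declared
    // decltype(auto), the reference is forwarded out, so `chosen` refers into
    // `messages` rather than into a copy.
    const std::string& chosen = select(messages.begin(), messages.end(),
        [](const std::string& s) -> const std::string& { return s; });
    std::cout << chosen << '\n';
}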
Since C++20, we can constrain the function template using concepts:
template <std::random_access_iterator RanIt,
std::indirectly_unary_invocable<RanIt> F>
decltype(auto) select(RanIt begin, RanIt end, F&& f)
{
// ...
}
Before C++20, we can also use SFINAE:
template <typename RanIt, typename F>
std::enable_if_t<
std::is_base_of_v<
std::random_access_iterator_tag,
typename std::iterator_traits<RanIt>::iterator_category
>,
std::invoke_result_t<F, typename std::iterator_traits<RanIt>::value_type>
> select(RanIt begin, RanIt end, F&& f)
{
// ...
}
I wanted to add some things here, adding to the previous answers.
First and foremost, you are not writing the code only for yourself. As a non-native English speaker, I understand why it may seem easier to write your code in your native language, but don't do it!
Secondly, I have made changes to the code in order to make it easier and more readable:
#include <time.h>
#include <cstdlib> // srand, rand
#include <cstdint>
#include <iostream>
#include <string>
constexpr uint32_t NUM_OF_POSSIBLE_PROMPTS = 4;
int main () {
srand(time(NULL)); // seed according to current time
for (uint32_t i = 0; i < 10; ++i)
{
int option = rand() % (NUM_OF_POSSIBLE_PROMPTS);
std::string prompt = "";
switch (option)
{
case 0:
prompt = "option 0";
break;
case 1:
prompt = "option 1";
break;
case 2:
prompt = "option 2";
break;
case 3:
prompt = "option 3";
break;
default:
// some error handling!
break;
}
std::cout << prompt << std::endl;
}
return 0;
}
I'm using switch-case instead of an if-else-if chain, which is more readable and can be more efficient.
I'm using a constexpr constant to store my hard-coded number - having magic numbers scattered through the code is a bad habit (in a real program, I would make the loop bound of 10 a constexpr value as well).
In C++ (unlike in C), we use std::cout and its operator << to print, rather than the printf function. This gives unified behavior with other kinds of streams, such as stringstream (which is helpful when building strings at run time, though it is somewhat heavy on resources).
Since this code is more organized, it is easier for you to see where a possible error may occur, and it also makes such errors less likely in the first place.
For example, using gcc's -Wswitch-enum flag will make sure that if you use an enum, all values must be handled in switch-case sections (which makes your program less error prone, of course).
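To illustrate that flag, a small hypothetical sketch (the Prompt enum and the text() helper are my own, not part of the code above); compiled with -Wswitch-enum, gcc warns if the switch omits any enumerator:
#include <iostream>

enum class Prompt { Option0, Option1, Option2, Option3 };

const char* text(Prompt p)
{
    switch (p)
    {
        case Prompt::Option0: return "option 0";
        case Prompt::Option1: return "option 1";
        case Prompt::Option2: return "option 2";
        case Prompt::Option3: return "option 3";
    }
    return ""; // not reached for valid enumerators
}

int main()
{
    std::cout << text(Prompt::Option2) << std::endl;
}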
P.S. I have added the loop only to show you that this code produces different results every round; you can also test this by running the program multiple times.
You did not provide a seed for the random number generator. From the man page,
If no seed value is provided, the functions are automatically seeded
with a value of 1.
If you have the same seed on every run, you'll always get the same random sequence.
A couple of problems with your program:
You don't have a seed, that's the reason why the numbers are repeated. Use
srand (time(NULL)); // #include <time.h>
before you use rand()
Your options are not contiguous: you check 0-2 and then 4, so when you get 3 there is no option available. If that's on purpose, ignore this remark.
With rand() % 3 + 1, your random numbers will range from 1 to 3, so opcao == 0 and opcao == 4 will never occur. For a 0-4 interval you will need something like:
opcao = rand() % 5;
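Putting both remarks together, a minimal corrected sketch of the original program (the English placeholder strings are mine; option 3 intentionally prints nothing, mirroring the original checks):
#include <cstdio>
#include <cstdlib>
#include <ctime>

int main()
{
    std::srand(std::time(NULL));   // seed once, otherwise rand() repeats the same sequence

    int opcao = std::rand() % 5;   // 0 to 4, as suggested above

    if (opcao == 0) std::printf("option 0\n");
    if (opcao == 1) std::printf("option 1\n");
    if (opcao == 2) std::printf("option 2\n");
    if (opcao == 4) std::printf("option 4\n");

    return 0;
}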
I wrote a very simple implementation of something resembling assembly/machine code.
It is even capable of recursion, as in this example:
9 6
IFEQ R0,0
RET 1
ENDIF
MOV R1,R0
SUB R1,1
CALL R1
MOV R2,R9
MUL R2,R0
RET R2
Output: 720 (factorial of 6)
Description:
9 = number of program lines
6 = program input; it is assigned to register R0 at class construction
CALL = calls the program again with the passed value (recursion)
RET = returns from the program with the specified value; register R9 is set to that output value
R0 to R9 -> general purpose registers
R0 - program input value
R9 - program output value
Edit: the program commands are:
MOV, ADD, SUB, MUL, DIV, MOD, IFEQ, IFNEQ, IFG, IFGE, IFL, IFLE, ENDIF, CALL, RET
However, the program can enter an infinite loop/recursion, e.g.:
2 0
CALL 10
RET 1 //will never be reached
How do I verify whether MY program will enter into an infinite loop/recursion?
Here's my implementation; I don't know whether it's necessary, but just in case you need it. (It's the whole code... hope you don't mind.)
#include <iostream>
#include <map>
#include <string> //std::getline
#include <sstream>
#include <vector>
#include <cstdlib> // std::atoi
namespace util
{
template<typename I>I readcin(I& input) {
std::cin >> input;
std::cin.clear(); std::cin.ignore();
return input;
}
template<typename I, typename...O> I readcin(I& input, O&... others) {
readcin(input);
return readcin(others...);
}
}
//operations
enum OP
{
MOV, ADD, SUB, MUL, DIV, MOD,
IFG, IFL,
IFEQ, IFGE, IFLE,
IFNEQ,
CALL,
RET,
ENDIF,
};
std::map<std::string, OP> OPTABLE
{
{"MOV", MOV}, {"ADD", ADD}, {"SUB", SUB}, {"MUL", MUL}, {"DIV", DIV}, {"MOD", MOD},
{"RET", RET},
{"IFG", IFG}, {"IFL", IFL},
{"IFNEQ", IFNEQ}, {"IFEQ", IFEQ}, {"IFGE", IFGE}, {"IFLE", IFLE},
{"CALL", CALL},
{"ENDIF", ENDIF}
};
//registry index
enum RI
{
R0, R1, R2, R3, R4, R5, R6, R7, R8, R9, RI_MAX
};
std::map<std::string, RI> RITABLE =
{
{"R0", R0}, {"R1", R1}, {"R2", R2}, {"R3", R3}, {"R4", R4}, {"R5", R5},
{"R6", R6}, {"R7", R7}, {"R8", R8}, {"R9", R9}
};
struct Instruction
{
OP op;
RI r1;
int r2value;
Instruction() = delete;
Instruction(OP operation, RI firstRegister, int _2ndRegValue = -1)
{
op = operation;
r1 = firstRegister;
r2value = _2ndRegValue;
}
};
class Assembly
{
private:
int REG[RI::RI_MAX] {0};
int GetRegistryValue(RI ri) const { return REG[ri]; }
void SetRegistryValue(RI ri, int val) { REG[ri] = val; }
enum CMP_FLAG{ CMP_FAIL, CMP_OK };
CMP_FLAG flag { CMP_OK };
CMP_FLAG GetFlag() const { return flag; }
void SetFlag(bool setFlag) { flag = static_cast<CMP_FLAG>(setFlag); }
std::vector<std::string> programLines;
OP ExtractOP(const std::string& line);
RI ExtractRI(const std::string& line, OP op);
int Extract2ndRIval(const std::string& line, OP op);
public:
void addCommand(const std::string& line) { programLines.push_back(line); }
void Execute();
Assembly() = delete;
Assembly(int inputValue) { REG[R0] = inputValue; }
int ReturnValue() const { return REG[R9]; }
private:
//recursion only
Assembly(int inputValue, const std::vector<std::string>& progLines) {
REG[R0] = inputValue;
programLines = progLines;
this->Execute();
}
};
OP Assembly::ExtractOP(const std::string& line)
{
std::istringstream issline{ line };
std::string operation;
issline >> operation;
return OPTABLE[operation];
}
RI Assembly::ExtractRI(const std::string& line, OP op)
{
auto space = line.find(' ');
if(op <= IFNEQ){
auto comma = line.find(',');
return RITABLE[std::string(line.begin() + space + 1, line.begin() + comma)];
}
return RI_MAX;
}
int Assembly::Extract2ndRIval(const std::string& line, OP op)
{
if(op == ENDIF) {
return -1;
}
std::size_t spaceOrComma;
if(op == CALL || op == RET) {
spaceOrComma = line.find(' ');
} else {
spaceOrComma = line.find(',');
}
std::string opval = std::string(line.begin() + spaceOrComma + 1, line.end());
auto it = RITABLE.find(opval);
if(it != RITABLE.end()){
return this->REG[it->second];
}
auto num = std::atoi(opval.c_str());
return num;
}
void Assembly::Execute()
{
for(const std::string& line : programLines)
{
OP op = ExtractOP(line);
RI r1 = ExtractRI(line, op);
int r2value = Extract2ndRIval(line, op);
Instruction command ( op, r1, r2value );
if(GetFlag() == CMP_FAIL)
{
if(command.op == ENDIF){
SetFlag(CMP_OK);
}
continue;
}
switch(command.op)
{
case MOV: { SetRegistryValue(command.r1, command.r2value); } break;
case ADD: { SetRegistryValue(command.r1, REG[command.r1] + command.r2value); } break;
case SUB: { SetRegistryValue(command.r1, REG[command.r1] - command.r2value); } break;
case MUL: { SetRegistryValue(command.r1, REG[command.r1] * command.r2value); } break;
case DIV: { SetRegistryValue(command.r1, REG[command.r1] / command.r2value); } break;
case MOD: { SetRegistryValue(command.r1, REG[command.r1] % command.r2value); } break;
case IFEQ: { SetFlag(GetRegistryValue(command.r1) == command.r2value); } break;
case IFNEQ: { SetFlag(GetRegistryValue(command.r1) != command.r2value); } break;
case IFG: { SetFlag(GetRegistryValue(command.r1) > command.r2value); } break;
case IFL: { SetFlag(GetRegistryValue(command.r1) < command.r2value); } break;
case IFGE: { SetFlag(GetRegistryValue(command.r1) >= command.r2value); } break;
case IFLE: { SetFlag(GetRegistryValue(command.r1) <= command.r2value); } break;
case RET:
{
SetRegistryValue(R9, command.r2value);
return;
}break;
//oh boy!
case CALL:
{
// std::cout << "value to call:" << command.r2value << std::endl;
Assembly recursion(command.r2value, this->programLines);
SetRegistryValue(R9, recursion.ReturnValue());
}break;
}
}
}
int main()
{
while(true)
{
int pl, input;
util::readcin(pl, input);
if(pl == 0){
break;
}
Assembly Asm(input);
for(auto i=0; i<pl; ++i)
{
std::string line;
std::getline(std::cin, line);
Asm.addCommand(line);
}
Asm.Execute();
std::cout << Asm.ReturnValue() << std::endl;
}
return 0;
}
The only way to check whether a program is stuck in an infinite loop in the general case is to check whether the program has entered the same state as a previous state. If it has entered exactly the same state previously, then it must continue executing in a loop, returning to the same state over and over following the same sequence of steps. In real programs this is essentially impossible because of the huge number of possible states the program can be in, but your assembly language only allows a much more limited number of possible states.
Since your CALL instruction works just like invoking the program at the start, and this is the only form of looping, checking whether the code enters the same state twice is simple. A CALL instruction with a certain argument has the exact same effect as invoking the program with that argument as input. If the CALL instruction has the same argument as any previously executed CALL instruction or the program's input value, then it must continue executing in a loop, endlessly returning to the same state in the same sequence of steps.
In other words, the only state that needs to be checked is the R0 value at the start of the program. Since this value is stored in an int, it can only have 2^32 possible values on any common C++ implementation, so it's reasonable and easy to brute-force check whether a given program with a given input gets stuck in an infinite loop.
In fact, it's possible to use memoization of the return values to brute-force check all possible inputs in O(N) time and O(N) space, where N is the number of possible inputs. There are various ways you could do this, but the way I would do it is to create three vectors, all with a number of elements equal to the number of possible inputs. The first vector is a bool (bit) vector that records whether or not a given input has been memoized yet; the second vector is also a bool vector and records whether a given input has already been used on the call stack; the third vector is an int vector that records the result and doubles as a linked list of input values forming the call stack, to save space. (In the code below these vectors are called is_memoized, input_pending and memoized_value respectively.)
I'd take your interpreter loop and rewrite it to be non-recursive, something like this pseudo-code:
input = reg[R0]
if is_memoized[input]:
reg[R9] = memoized_value[input]
return
input_pending[input] = true
memoized_value[input] = input // mark the top of the stack
while true:
for command in program:
...
if command.op == CALL:
argument = command.r2value
if input_pending[argument]:
// Since this input value has already been used as an input value
// somewhere on the call stack, the program is about to enter
// an identical state as a previous state and so is stuck in
// an infinite loop.
return false // program doesn't halt
if is_memoized[argument]:
REG[R9] = memoized_value[argument]
else:
memoized_value[argument] = input // stack the input value
input = argument
REG = [input, 0, 0, 0, 0, 0, 0, 0, 0, 0]
input_pending[input] = true
break // reevaluate the program from the beginning.
if command.op == RET:
argument = command.r2value
stacked_input = memoized_value[input]
input_pending[input] = false
if stacked_input == input: // at the top of the stack
REG[R9] = argument
return true // program halts for this input
is_memoized[input] = true
memoized_value[input] = argument
input = stacked_input
REG = [input, 0, 0, 0, 0, 0, 0, 0, 0, 0]
break // reevaluate the program from the beginning
You'd then call this interpreter loop for each possible input, something like this:
for i in all_possible_inputs:
if not program.execute(input = i): // the function written above
print "program doesn't halt for all inputs"
return
print "program halts for all inputs"
A recursive version should be faster since it doesn't have to reevaluate the program on each unmemoized CALL instruction, but it would require hundreds of gigabytes of stack space in the worst case. This non-recursive version only requires 17 GB of memory. Either way it's still O(N) space and time; you're just making one constant factor smaller and another bigger.
To get this to execute in a reasonable amount of time, you'd also probably want to parse the code once and execute some byte-code representation instead.
I take it you're looking for outside-the-box thinking.
Think of the halting problem this way. Turing proved programs are free from control. But why? Because the language has instructions to control execution. This means feasibly regulating and predicting execution in programs requires removing all control semantics from the language.
Even my collaborative process architecture doesn't accomplish that. It just forbids them because of the mess they make. It is still composed from a language which contains them. For example, you can use IF to break, return or continue but not for other operations. Function calls are illegal. I created such restrictions to achieve controllability. However not even a collaborative organization removes control structures from the language to prevent their misuse.
My architecture is online via my profile with a factorial example in my W article.
If the program steps into the same configuration twice then it will loop forever.
This is also true for Turing Machines, it is just that the (infinite) input is part of the machine's configuration.
This is also the intuition behind the pumping lemmas.
What constitutes a configuration in your model?
Since you have no memory and no IO, a configuration is given by the content of the registers and the line number of the current instruction (i.e. the instruction pointer).
When does a configuration change?
After every instruction, sure. But in the case of a non-branching instruction, the configurations before and after it are surely different, because even if the instruction is a NOP the line number changed.
In the case of a branch, the line number might be one that we've seen before (for a backwards branch), so that's where the machine could step into the same configuration.
The only jumping instruction of interest, to me, seems to be call. The IF-like ones will always produce different configurations because they are not expressive enough to produce iteration (jump back and repeat).
How does call change a configuration? It sets the line number to 1 and all the registers (except r0) to zero.
So the condition for a call to produce the same configuration reduces to having the same input.
If you check, in the call implementation, if the operand value has already been used before (in the simulation) then you can tell that the program will loop forever.
If a register has size n, then the possible states are O(2^n), which is generally a lot.
You must be prepared to give up after a (possibly customizable) threshold. Or, in your case where the registers are int, most C++ implementations have a 32-bit int, and modern machines can handle a 512 MiB bitmap of 2^32 bits (std::vector<bool> implements a packed bitmap, for example; index it with unsigned to avoid negative ints). A hash table is another alternative (std::unordered_set<int>). But if you used a wider type for your registers, the state would be too big to practically record every possible one, and you would need some limit. A limit is effectively built into your implementation anyway, as you will overflow the C++ call stack (C++ recursion depth) before seeing anywhere near 2^32 recursive invocations of the machine being simulated.
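A minimal sketch of that check (the names are mine, not part of the question's Assembly class): record every input value that starts a simulation of the program, and treat a repeated CALL argument as proof of infinite recursion.
#include <iostream>
#include <unordered_set>

// Returns true when `call_argument` has already been used as a program input on
// this run, i.e. the machine is about to re-enter a configuration it has seen.
bool already_seen(std::unordered_set<int>& seen_inputs, int call_argument)
{
    // insert().second is false when the value was already present in the set
    return !seen_inputs.insert(call_argument).second;
}

int main()
{
    std::unordered_set<int> seen_inputs;
    seen_inputs.insert(0);                               // the program's initial R0 input

    std::cout << std::boolalpha
              << already_seen(seen_inputs, 10) << '\n'   // false: first CALL 10
              << already_seen(seen_inputs, 10) << '\n';  // true: CALL 10 again -> infinite recursion
}
For 32-bit int registers the same idea works with a 2^32-bit std::vector<bool> bitmap instead of the hash set, as described above.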
If the registers are unbounded in their value, this reduces to the Halting Problem and is thus undecidable in the general case. (But as @Brendan says, you can still look for early repeats of the state; many programs will terminate or infinitely repeat in a simple way.)
If you change the call implementation to not zero out the other registers, you must fall back to checking the whole configuration at the call site (and not just the operand).
To check the termination of the program on every input you must proceed non-deterministically and symbolically.
There are two problems: the branches and the input value.
It is a famous theorem that an NDTM can be simulated by a TM in an exponential number of steps w.r.t. the steps of the NDTM.
The only problematic instructions are the IF ones because they create non-determinism.
You can take several approaches:
Split the simulation in two branches. One that executes the IF one that does not.
Rewrite the code to be simulated to produce an exponential (in the number of branches) number of branch-free variants of the code. You can generate them lazily.
Keep a tree of configurations, each branch in the program generating two children in the current node in the tree.
They are all equivalent.
The input value is not known, so it's hard to tell if a call ends up in the same configuration.
A possible approach is to record all the changes to the input register, for example you could end up with a description like "sub(add(add(R0, 1), 4), 5);".
This description should be easy to manipulate, as it's easy to see that in the example above R0 didn't change because you get "sub(add(R0, 5), 5);" and then "R0;".
This works by relying on the laws of arithmetic; you must take care of inverse operations, identity elements (1 * a = a) and overflow.
Basically, you need to simplify the expression.
You can then check if the input has changed at a given point in the simulated time.
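A very small sketch of that symbolic idea (all names are mine, and it only covers ADD/SUB with constant operands; MUL, DIV, register operands and overflow would force a fallback to "unknown"):
#include <iostream>
#include <optional>

// The tracked register is represented as "R0 + offset" when that form is known.
struct Symbolic
{
    std::optional<long long> offset{0};   // value = R0 + offset, if known

    void add(long long k) { if (offset) *offset += k; }
    void sub(long long k) { if (offset) *offset -= k; }
    void unknown()        { offset.reset(); }            // e.g. after MUL by a register
};

int main()
{
    Symbolic r0;
    r0.add(1);
    r0.add(4);
    r0.sub(5);   // corresponds to sub(add(add(R0, 1), 4), 5) from the text

    if (r0.offset && *r0.offset == 0) {
        std::cout << "R0 is unchanged at this point\n";
    }
}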
How do I verify whether a program will enter into an infinite loop/recursion?
In practice, the halting problem is trivial to solve. It's only impossible in theory.
The reason people think that the halting problem is impossible to solve is that the question is constructed as a false dilemma ( https://en.wikipedia.org/wiki/False_dilemma ). Specifically, the question asks to determine whether a program will always halt or will never halt; but there's a third possibility - sometimes halting (and sometimes not halting). For an example of this, imagine a program that asks the user if they want to run forever or exit immediately (and correctly does what the user requested). Note that all sane applications are like this - they're not supposed to exit until/unless the user tells them to.
More correctly, in practice there are 4 possibilities:
runs until something external causes it to halt (power turned off, hardware failure, kill -9, ...)
sometimes halts itself
always halts itself
indeterminate (unable to determine which of the 3 other cases it is)
Of course with these 4 possibilities, you can say you've created a "halting problem solver" just by classifying every program as "indeterminate", but it won't be a good solution. This gives us a kind of rating system - extremely good "halting problem solvers" rarely classify programs as "indeterminate", and extremely bad "halting problem solvers" frequently classify programs as "indeterminate".
So; how do you create a good "halting problem solver"? This involves 2 parts - generating control flow graphs ( https://en.wikipedia.org/wiki/Control-flow_graph ) for each function and a call graph ( https://en.wikipedia.org/wiki/Call_graph ) for the whole program; and value tracking.
Control Flow Graphs and Call Graph
It's not that hard to generate a control flow graph for each function/procedure/routine, just by examining the control flow instructions (call, return, jump, condition branch); and not that hard to generate a call graph while you're doing this (just by checking if a node is already in the call graph when you see a call instruction and adding it if it's not there yet).
While doing this, you want to mark control flow changes (in the control flow graph for each function) as "conditional" or "not conditional", and you want to mark functions (in the call graph for the whole program) as "conditional" or "not conditional".
By analyzing the resulting graphs you can classify trivial programs as "runs until something external causes it to halt" or "always halts itself" (e.g. this is enough to classify OP's original code as "runs until something external causes it to halt"); but the majority of programs will still be "indeterminate".
Value Tracking
Value tracking is (trying) to keep track of the possible values that could be in any register/variable/memory location at any point in time. For example, if a program reads 2 unsigned bytes from disk into 2 separate variables, you know both variables will have a value from 0 to 255. If those variables are multiplied you know the result will be a value from 0*0 to 255*255; if those variables are added you know the result will be a value from 0+0 to 255+255; etc. Of course the variable's type gives absolute maximum possible ranges, even for assembly (where there are no types) - e.g. (without knowing if it's signed or unsigned) you know that a 32-bit read from memory will return a value from -2147483648 to 4294967295.
The point of value tracking is to annotate conditional branches in the control flow graph for each function; so that you can use those annotations to help classify a program as something other than "indeterminate".
This is where things get tricky - improving the quality of a "practical halting problem solver" requires increasing the sophistication/complexity of the value tracking. I don't know if it's possible to write a perfect "halting problem solver" (that never returns "indeterminate") but I do know that it's possible to write a "halting problem solver" that is sufficient for almost all practical purposes (that returns "indeterminate" so rarely that nobody cares).
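As a tiny illustration of the value-tracking idea, a minimal interval-arithmetic sketch (my own names; it only handles non-negative ranges, which keeps the min/max rules trivial):
#include <cstdint>
#include <iostream>

struct Range
{
    std::uint64_t lo;
    std::uint64_t hi;
};

Range add(Range a, Range b) { return { a.lo + b.lo, a.hi + b.hi }; }
Range mul(Range a, Range b) { return { a.lo * b.lo, a.hi * b.hi }; }

int main()
{
    Range byte_a{0, 255};                 // an unsigned byte read from disk
    Range byte_b{0, 255};                 // another unsigned byte

    Range sum = add(byte_a, byte_b);
    Range product = mul(byte_a, byte_b);

    std::cout << "sum:     [" << sum.lo << ", " << sum.hi << "]\n";         // [0, 510]
    std::cout << "product: [" << product.lo << ", " << product.hi << "]\n"; // [0, 65025]
}
Annotations like these, attached to the conditional branches in the control flow graph, are what lets the analysis classify a program as something other than "indeterminate".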
Using the C++ standard random generator I can more or less efficiently create sequences with pre-defined distributions using language-provided tools. What about Shannon entropy? Is it possible, in some way, to define the output Shannon entropy for a provided sequence?
I tried a small experiment: I generated a long enough sequence with a linear distribution and implemented a Shannon entropy calculator. The resulting value ranges from 0.0 (absolute order) to 8.0 (absolute chaos).
template <typename T>
double shannon_entropy(T first, T last)
{
size_t frequencies_count{};
double entropy = 0.0;
std::for_each(first, last, [&entropy, &frequencies_count] (auto item) mutable {
if (0. == item) return;
double fp_item = static_cast<double>(item);
entropy += fp_item * log2(fp_item);
++frequencies_count;
});
if (frequencies_count > 256) {
return -1.0;
}
return -entropy;
}
std::vector<uint8_t> generate_random_sequence(size_t sequence_size)
{
std::vector<uint8_t> random_sequence;
std::random_device rnd_device;
std::cout << "Random device entropy: " << rnd_device.entropy() << '\n';
std::mt19937 mersenne_engine(rnd_device());
std::uniform_int_distribution<unsigned> dist(0, 255);
auto gen = std::bind(dist, mersenne_engine);
random_sequence.resize(sequence_size);
std::generate(random_sequence.begin(), random_sequence.end(), gen);
return std::move(random_sequence);
}
std::vector<double> read_random_probabilities(size_t sequence_size)
{
std::vector<size_t> bytes_distribution(256);
std::vector<double> bytes_frequencies(256);
std::vector<uint8_t> random_sequence = generate_random_sequence(sequence_size);
size_t rnd_seq_size = random_sequence.size();
std::for_each(random_sequence.begin(), random_sequence.end(), [&](uint8_t b) mutable {
++bytes_distribution[b];
});
std::transform(bytes_distribution.begin(), bytes_distribution.end(), bytes_frequencies.begin(),
[&rnd_seq_size](size_t item) {
return static_cast<double>(item) / rnd_seq_size;
});
return std::move(bytes_frequencies);
}
int main(int argc, char* argv[]) {
size_t sequence_size = 1024 * 1024;
std::vector<double> bytes_frequencies = read_random_probabilities(sequence_size);
double entropy = shannon_entropy(bytes_frequencies.begin(), bytes_frequencies.end());
std::cout << "Sequence entropy: " << std::setprecision(16) << entropy << std::endl;
std::cout << "Min possible file size assuming max theoretical compression efficiency:\n";
std::cout << (entropy * sequence_size) << " in bits\n";
std::cout << ((entropy * sequence_size) / 8) << " in bytes\n";
return EXIT_SUCCESS;
}
First, it appears that std::random_device::entropy() is hardcoded to return 32 in MSVC 2015 (which is probably 8.0 according to the Shannon definition). As you can try, that's not far from the truth; in this example the result is always close to 7.9998..., i.e. absolute chaos.
The working example is on IDEONE (by the way, their compiler hardcodes entropy to 0).
One more thing - the main question: is it possible to create a generator that generates a linearly-distributed sequence with a defined entropy, let's say from 6.0 to 7.0? Could it be implemented at all, and if yes, are there existing implementations?
First, you're viewing Shannon's theory entirely wrong. His argument (as you're using it) is simply, "given the probability of x (Pr(x)), the bits required to store x is -log2 Pr(x)". It has nothing to do with the probability of x. In this regard, you're viewing Pr(x) wrong. -log2 Pr(x), given a Pr(x) that should be uniformly 1/256, results in a required bit width of 8 bits to store. However, that's not how statistics work. Go back to thinking about Pr(x), because "the bits required" means nothing.
Your question is about statistics. Given an infinite sample, if and only if the distribution matches the ideal histogram, then as the sample size approaches infinity the observed frequency of each value will approach the expected probability. I want to make it clear that you're not looking for "-log2 Pr(x) is absolute chaos when it's 8 given Pr(x) = 1/256." A uniform distribution is not chaos. In fact, it's... well, uniform. Its properties are well known, simple, and easy to predict. You're looking for, "Does the finite sample set S meet the criteria of an independently-distributed uniform distribution (commonly known as "Independently and Identically Distributed Data" or "i.i.d.") with Pr(x) = 1/256?" This has nothing to do with Shannon's theory and goes much further back in time to the basic probability theories involving flips of a coin (in this case binomial, given assumed uniformity).
Assuming for a moment that any C++11 <random> generator meets the criteria for "statistically indistinguishable from i.i.d." (which, by the way, those generators don't), you can use them to emulate i.i.d. results. If you would like a range of data that is storable within 6..7 bits (it wasn't clear, did you mean 6 or 7 bits, because hypothetically, everything in between is doable as well), simply scale the range. For example...
#include <iostream>
#include <random>
int main() {
unsigned long low = 1 << 6; // 2^6 == 64
unsigned long limit = 1 << 7; // 2^7 == 128
// Therefore, the range is 6-bits to 7-bits (or 64 + [128 - 64])
unsigned long range = limit - low;
std::random_device rd;
std::mt19937 rng(rd()); //<< Doesn't actually meet criteria for i.i.d.
std::uniform_int_distribution<unsigned long> dist(low, limit - 1); //<< Given an engine that actually produces i.i.d. data, this would produce exactly what you're looking for
for (int i = 0; i != 10; ++i) {
unsigned long y = dist(rng);
//y is known to be in set {2^6..2^7-1} and assumed to be uniform (coin flip) over {low..low + (range-1)}.
std::cout << y << std::endl;
}
return 0;
}
The problem with this is that, while the <random> distribution classes are accurate, the random number generators (presumably aside from std::random_device, but that's system-specific) are not designed to stand up to statistical tests of fitness as i.i.d. generators.
If you would like one that does, implement a CSPRNG (my go-to is Bob Jenkins' ISAAC) that has an interface meeting the requirements of the <random> class of generators (probably just covering the basic interface of std::random_device is good enough).
To test for a statistically sound "no" or "we can't say no" for whether a set follows a specific model (and therefore whether Pr(x) is accurate and whether Shannon's entropy function is an accurate prediction), that's a whole other topic entirely. Like I said, no generator in <random> meets these criteria (except maybe std::random_device). My advice is to do research into things like the Central limit theorem, goodness-of-fit, birthday spacing, et cetera.
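As one crude example of a goodness-of-fit check over the question's 256 byte bins (entirely my own illustration, and no substitute for a proper test battery), a chi-squared statistic against the uniform expectation:
#include <array>
#include <cstddef>
#include <iostream>
#include <random>

int main()
{
    const std::size_t n = 1 << 20;                       // sample size
    std::mt19937 rng{std::random_device{}()};
    std::uniform_int_distribution<int> dist{0, 255};

    std::array<std::size_t, 256> counts{};
    for (std::size_t i = 0; i < n; ++i) {
        ++counts[static_cast<std::size_t>(dist(rng))];
    }

    const double expected = static_cast<double>(n) / 256.0;
    double chi2 = 0.0;
    for (std::size_t c : counts) {
        const double d = static_cast<double>(c) - expected;
        chi2 += d * d / expected;
    }

    // With 255 degrees of freedom, values wildly above ~300 would be suspicious;
    // this only flags gross deviations from Pr(x) = 1/256, nothing more.
    std::cout << "chi-squared: " << chi2 << '\n';
}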
To drive my point a bit more, under the assumptions of your question...
struct uniform_rng {
unsigned long x;
constexpr uniform_rng(unsigned long seed = 0) noexcept:
x{ seed }
{ };
unsigned long operator ()() noexcept {
unsigned long y = this->x++;
return y;
}
};
... would absolutely meet your criteria of being uniform (or as you say "absolute chaos"). Pr(x) is most certainly 1/N, and the bits required to store any number of the set is -log2(1/N), where N is 2 to the power of the bit width of unsigned long. However, it's not independently distributed. Because we know its properties, you can "store" its entire sequence by simply storing the seed. Surprise, all PRNGs work this way. Therefore the bits required to store the entire sequence of a PRNG is -log2(1/2^bitsForSeed). As your sample grows, the bits required to store it versus the bits you're able to generate for that sample (aka the compression ratio) approaches a limit of 0.
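To see the "the whole sequence is determined by the seed" point concretely, a small sketch (my own) with two identically seeded engines:
#include <iostream>
#include <random>

int main()
{
    std::mt19937 a{12345};
    std::mt19937 b{12345};

    // Both columns are identical: the entire output is determined by (and
    // therefore compressible to) the seed.
    for (int i = 0; i < 5; ++i) {
        std::cout << a() << ' ' << b() << '\n';
    }
}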
I cannot comment yet, but I would like to start the discussion:
From communication/information theory, it would seem that you would require probabilistic shaping methods to achieve what you want. You should be able to feed the output of any distribution function through a shaping coder, which should then redistribute the input to a specific target Shannon entropy.
Probabilistic constellation shaping has been successfully applied in fiber-optic communication: Wikipedia with some other links
It is not clear what you want to achieve, and there are several ways of lowering the Shannon entropy of your sequence:
Correlation between the bits, e.g. putting random_sequence through a simple filter.
Individual bits that are not fully random.
As an example, below you could make the bytes less random:
std::vector<uint8_t> generate_random_sequence(size_t sequence_size,
uint8_t cutoff = 10)
{
std::vector<uint8_t> random_sequence;
std::vector<uint8_t> other_sequence;
std::random_device rnd_device;
std::cout << "Random device entropy: " << rnd_device.entropy() << '\n';
std::mt19937 mersenne_engine(rnd_device());
std::uniform_int_distribution<unsigned> dist(0, 255);
auto gen = std::bind(dist, mersenne_engine);
random_sequence.resize(sequence_size);
std::generate(random_sequence.begin(), random_sequence.end(), gen);
other_sequence.resize(sequence_size);
std::generate(other_sequence.begin(), other_sequence.end(), gen);
for(size_t j = 0; j < sequence_size; ++j) {
if (other_sequence[j]<=cutoff) random_sequence[j]=0; // Or j or ...
}
return std::move(random_sequence);
}
I don't think this was the answer you were looking for - so you likely need to clarify the question more.
I am extremely new to C++, and I was wondering how I might output text based on a random number generator.
I am creating a text game. You occasionally fight things, and I want whether you win or lose to be random. For instance, if the random number is 2 (the only choices would be one or two) then it would say: "You lost!". Please keep answers simple as I am very new, and explaining your solution would be perfect.
Thanks in advance.
#include <cstdlib>
#include <iostream>
#include <ctime>
int main()
{
std::srand(std::time(0)); // use current time as seed for random generator
int random_variable = std::rand();
std::cout << "Random value on [0 " << RAND_MAX << "]: "
<< random_variable << '\n';
}
Source: http://en.cppreference.com/w/cpp/numeric/random/rand
Then, you can just compare it with your constant and take the appropriate action, e.g.:
if (random_variable > 2)
doSomething();
else
doSomethingElse();
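Putting that together with the win/lose wording from the question, a minimal sketch (the 1-or-2 convention follows the question; everything else is just the srand/rand approach shown above):
#include <cstdlib>
#include <ctime>
#include <iostream>

int main()
{
    std::srand(std::time(0));        // seed once per run

    int roll = std::rand() % 2 + 1;  // 1 or 2, as in the question

    if (roll == 2)
        std::cout << "You lost!\n";
    else
        std::cout << "You won!\n";
}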
Since so many usages of rand have been proposed here, let's do it a bit more robustly:
We will seed with std::random_device to ease into how <random> works. (You could use time(0) here; it does not really matter.)
Our actual PRNG (the thing that makes numbers) will be std::mt19937_64 (http://en.cppreference.com/w/cpp/numeric/random/mersenne_twister_engine), which is accepted as one of the better random number generators.
We will not simply inspect one bit, but tell C++ that we want a number in the range [0,1].
We will combine this into a single object that you just need to call.
A simple comparison will let us decide whether the player won or lost.
So, starting with number 1:
#include <random>
#include <functional>
#include <iostream>
int main() {
using namespace std; // because I am lazy today
random_device seeder; // call this to get a number
// more to do here
}
Now, while seeder() gives a random number, it is usually expected that you will just use this to seed your own PRNG (unless you do crypto, in which case it becomes much more complicated). So, let's do it:
mt19937_64 prng(seeder());
Well, that was easy. Now, let's make a distribution:
uniform_int_distribution<int> distribution(0, 1);
Now, to get an int that is either 0 or 1, we could just toss the prng to the distribution, as in:
int one_or_zero = distribution(prng);
But, that is cumbersome. So instead of the previous steps, we just combine everything:
auto dist = bind(uniform_int_distribution<int>(0, 1), mt19937_64(seeder()));
You can read this as "Make me a function-like variable named dist which holds a uniform distribution (every value is as likely as any other) over the range [0, 1], powered by a Mersenne Twister 64 PRNG."
All we now need to do is:
int one_or_zero = dist();
Ok, we just need to wrap a little if around a call to dist - sounds easy:
if(dist() == 0) {
cout << "You won!\n";
} else {
cout << "Sorry, you lost.\n";
}
You can see the result in action here, but be aware that the result is cached, so you'll need to fork it and run it yourself to see it change.
P.S.: Please note that this boils down to exactly two lines with semantics similar to srand/rand (swap it around a bit and you get exactly the same semantics) -- except that it avoids a whole bunch of problems associated with those functions.
#include <iostream>
#include <cstdlib> // srand, rand
#include <ctime>   // time
using namespace std;

int main()
{
    int ran_num = 0;
    srand((unsigned)time(0));
    while (ran_num != 2)            // You can add options here.
    {
        ran_num = rand() % 100;     // You can change the max number.
        cout << ran_num << " " << endl;
    }
    cout << "You lost!";
}
Since your random output has only two states, you can think about it as flipping a coin, so you can take a random function and perform a modulo division by 2, like this example (just search for 'coin toss' and you will get tons of samples):
http://www.c-program-example.com/2012/05/c-program-to-toss-coin-using-random.html
int toss = rand() % 2;
You can use toss to manage your choices.
If there are only two options, the fastest way is to look only at the value of the least significant bit.
if(randomNumber & 1) // equals 1 if the LSB is set.
cout << "You won!" << endl;
else
cout << "You lost!" << endl;
I am trying to count the number of floating point operations in one of my programs, and I think perf could be the tool I am looking for (are there any alternatives?), but I have trouble limiting it to a certain function/block of code. Let's take the following example:
#include <complex>
#include <cstdlib>
#include <iostream>
#include <type_traits>
template <typename T>
typename std::enable_if<std::is_floating_point<T>::value, T>::type myrand()
{
return static_cast <T> (std::rand()) / static_cast <T> (RAND_MAX);
}
template <typename T>
typename std::enable_if<!std::is_floating_point<T>::value, std::complex<typename T::value_type>>::type myrand()
{
typedef typename T::value_type S;
return std::complex<S>(
static_cast <S> (std::rand()) / static_cast <S> (RAND_MAX),
static_cast <S> (std::rand()) / static_cast <S> (RAND_MAX)
);
}
int main()
{
auto const a = myrand<Type>();
auto const b = myrand<Type>();
// count here
auto const c = a * b;
// stop counting here
// prevent compiler from optimizing away c
std::cout << c << "\n";
return 0;
}
The myrand() function simply returns a random number; if the type T is complex, it returns a random complex number. I did not hardcode doubles into the program because they would be optimized away by the compiler.
You can compile the file (let's call it bench.cpp) with c++ -std=c++0x -DType=double bench.cpp.
Now I would like to count the number of floating point operations, which can be done on my processor (Nehalem architecture, x86_64 where floating point is done with scalar SSE) with the event r8010 (see Intel Manual 3B, Section 19.5). This can be done with
perf stat -e r8010 ./a.out
and works as expected; however, it counts the overall number of uops (is there a table telling how many uops e.g. a movsd is?) and I am only interested in the number for the multiplication (see the example above).
How can this be done?
I finally found a way to do this, although not using perf but instead the corresponding perf API. One first has to define a perf_event_open function which is actually a syscall:
#include <cstdlib> // stdlib.h for C
#include <cstdio> // stdio.h for C
#include <cstring> // string.h for C
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/perf_event.h>
#include <asm/unistd.h>
long perf_event_open(
perf_event_attr* hw_event,
pid_t pid,
int cpu,
int group_fd,
unsigned long flags
) {
int ret = syscall(__NR_perf_event_open, hw_event, pid, cpu, group_fd, flags);
return ret;
}
Next, one selects the events one wishes to count:
perf_event_attr attr;
// select what we want to count
std::memset(&attr, 0, sizeof(perf_event_attr));
attr.size = sizeof(perf_event_attr);
attr.type = PERF_TYPE_HARDWARE;
attr.config = PERF_COUNT_HW_INSTRUCTIONS;
attr.disabled = 1;
attr.exclude_kernel = 1; // do not count the instruction the kernel executes
attr.exclude_hv = 1;
// open a file descriptor
int fd = perf_event_open(&attr, 0, -1, -1, 0);
if (fd == -1)
{
// handle error
}
In this case I want to count simply the number of instructions. Floating point instructions can be counted on my processor (Nehalem) by replacing the corresponding lines with
attr.type = PERF_TYPE_RAW;
attr.config = 0x8010; // Event Number = 10H, Umask Value = 80H
By setting the type to RAW one can basically count every event the processor is offering; the number 0x8010 specifies which one. Note that this number is highly processor-dependent! One can find the right numbers in the Intel Manual 3B, Part2, Chapter 19, by picking the right subsection.
One can then measure the code by enclosing it in
// reset and enable the counter
ioctl(fd, PERF_EVENT_IOC_RESET, 0);
ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
// perform computation that should be measured here
// disable and read out the counter
ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
long long count;
read(fd, &count, sizeof(long long));
// count now has the (approximated) result
// close the file descriptor
close(fd);
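Putting the pieces above together around the multiplication from bench.cpp, a sketch (error handling trimmed; the raw event 0x8010 is the Nehalem-specific number from the question, and the stand-in values replace the myrand<Type>() calls):
#include <cstdio>
#include <cstring>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <linux/perf_event.h>
#include <asm/unistd.h>

static long perf_event_open(perf_event_attr* hw_event, pid_t pid,
                            int cpu, int group_fd, unsigned long flags)
{
    return syscall(__NR_perf_event_open, hw_event, pid, cpu, group_fd, flags);
}

int main()
{
    perf_event_attr attr;
    std::memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.type = PERF_TYPE_RAW;
    attr.config = 0x8010;               // Event Number = 10H, Umask Value = 80H (Nehalem)
    attr.disabled = 1;
    attr.exclude_kernel = 1;
    attr.exclude_hv = 1;

    int fd = perf_event_open(&attr, 0, -1, -1, 0);
    if (fd == -1) { std::perror("perf_event_open"); return 1; }

    volatile double a = 1.5, b = 2.5;   // stand-ins for the myrand<double>() values

    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
    volatile double c = a * b;          // the region being measured
    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

    long long count = 0;
    read(fd, &count, sizeof(count));
    close(fd);

    std::printf("c = %f, counted events = %lld\n", c, count);
    return 0;
}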
I have the following function:
typedef unsigned long long int UINT64;
UINT64 getRandom(const UINT64 &begin = 0, const UINT64 &end = 100) {
return begin >= end ? 0 : begin + (UINT64) ((end - begin)*rand()/(double)RAND_MAX);
};
Whenever I call
getRandom(0, ULLONG_MAX);
or
getRandom(0, LLONG_MAX);
I always get the same value 562967133814800. How can I fix this problem?
What is rand()?
According to this the rand() function returns a value in the range [0,RAND_MAX].
What is RAND_MAX?
According to this, RAND_MAX is "an integral constant expression whose value is the maximum value returned by the rand function. This value is library-dependent, but is guaranteed to be at least 32767 on any standard library implementation."
Precision Is An Issue
You take rand()/(double)RAND_MAX, but you have perhaps only 32767 discrete values to work with. Thus, although you have big numbers, you don't really have more numbers. That could be an issue.
Seeding May Be An Issue
Also, you don't talk about how you are calling the function. Do you run the program once with LLONG_MAX and another time with ULLONG_MAX? In that case, the behaviour you are seeing is because you are implicitly using the same random seed each time. Put another way, each time you run the program it will generate the exact same sequence of random numbers.
How can I seed?
You can use the srand() function like so:
#include <stdlib.h> /* srand, rand */
#include <time.h> /* time */
int main (){
srand (time(NULL));
//The rest of your program goes here
}
Now you will get a new sequence of random numbers each time you run your program.
Overflow Is An Issue
Consider this part ((end - begin)*rand()/(double)RAND_MAX).
What is (end-begin)? It is LLONG_MAX or ULLONG_MAX; these are, by definition, the largest possible values those data types can hold. Therefore, it would be bad to multiply them by anything. Yet you do! You multiply them by rand(), which is non-zero. This will cause an overflow. But we can fix that...
Order of Operations Is An Issue
You then divide them by RAND_MAX. I think you've got your order of operations wrong here. You really meant to say:
((end - begin) * (rand()/(double)RAND_MAX) )
Note the new parentheses! (rand()/(double)RAND_MAX)
Now you are multiplying an integer by a fraction, so you are guaranteed not to overflow. But that introduces a new problem...
Promotion Is An Issue
But there's an even deeper problem. You divide an int by a double. When you do that the int is promoted to a double. A double is a floating-point number which basically means that it sacrifices precision in order to have a big range. That's probably what's biting you. As you get to bigger and bigger numbers both your ullong and your llong end up getting cast to the same value. This could be especially true if you overflowed your data type first (see above).
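To see the promotion/precision point in isolation, a small sketch (my own): near the top of the unsigned long long range, many distinct integers convert to the same double.
#include <climits>
#include <iostream>

int main()
{
    unsigned long long a = ULLONG_MAX;
    unsigned long long b = ULLONG_MAX - 1000;

    // A double has only 53 bits of mantissa, so both values round to 2^64
    // and compare equal after conversion.
    std::cout << std::boolalpha
              << (static_cast<double>(a) == static_cast<double>(b)) << '\n'; // prints true
}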
Uh oh
So, basically, everything about the PRNG you have presented is wrong.
Perhaps this is why John von Neumann said
Anyone who attempts to generate random numbers by deterministic means
is, of course, living in a state of sin.
And, sometimes, we pay for those sins.
How can I absolve myself?
C++11 provides some nice functionality. You can use it as follows
#include <iostream>
#include <random>
#include <limits>
int main(){
std::random_device rd; //Get a random seed from the OS entropy device, or whatever
std::mt19937_64 eng(rd()); //Use the 64-bit Mersenne Twister 19937 generator
//and seed it with entropy.
//Define the distribution, by default it goes from 0 to MAX(unsigned long long)
//or what have you.
std::uniform_int_distribution<unsigned long long> distr;
//Generate random numbers
for(int n=0; n<40; n++)
std::cout << distr(eng) << ' ';
std::cout << std::endl;
}
(Note that appropriately seeding the generator is difficult. This question addresses that.)
typedef unsigned long long int UINT64;
UINT64 getRandom(UINT64 const& min = 0, UINT64 const& max = 0)
{
return (((UINT64)(unsigned int)rand() << 32) + (UINT64)(unsigned int)rand()) % (max - min) + min;
}
Using the shift operation is unsafe, since unsigned long long might be less than 64 bits on some machines. You can use unsigned __int64 instead, but keep in mind that it's compiler-dependent and therefore available only in certain compilers.
unsigned __int64 getRandom(unsigned __int64 const& min, unsigned __int64 const& max)
{
return (((unsigned __int64)(unsigned int)rand() << 32) + (unsigned __int64)(unsigned int)rand()) % (max - min) + min;
}
Use your own PRNG that meets your requirements, rather than the one provided with rand, which seems not to meet them and was never guaranteed to.
Given that ULLONG_MAX and LLONG_MAX are both way bigger than the RAND_MAX value, you will certainly get "less precision than you want".
Other than that, there's a 50% chance that your value is below LLONG_MAX, as it is halfway through the range of 64-bit values.
I would suggest using the Mersenne Twister from C++11, which has a 64-bit variant:
http://www.cplusplus.com/reference/random/mt19937_64/
That should give you a value that fits in a 64-bit number.
If you "always get the same value", then it's because you haven't seeded the random number generator, using for example srand(time(0)) - you should normally only seed once, because this sets the "sequence". If the seed is very similar, e.g. you run the same program twice in short succession, you will still get the same sequence, because "time" only ticks once a second, and even then, doesn't change that much. There are various other ways to seed a random number, but for most purposes, time(0) is reasonably good.
You are overflowing the computation. In the expression
((end - begin)*rand()/(double)RAND_MAX)
you are telling the compiler to multiply (ULLONG_MAX - 0) * rand() and then divide by RAND_MAX; you should divide by RAND_MAX first, then multiply by rand().
// http://stackoverflow.com/questions/22883840/c-get-random-number-from-0-to-max-long-long-integer
#include <iostream>
#include <stdlib.h> /* srand, rand */
#include <limits.h>
using std::cout;
using std::endl;
typedef unsigned long long int UINT64;
UINT64 getRandom(const UINT64 &begin = 0, const UINT64 &end = 100) {
//return begin >= end ? 0 : begin + (UINT64) ((end - begin)*rand()/(double)RAND_MAX);
return begin >= end ? 0 : begin + (UINT64) rand()*((end - begin)/RAND_MAX);
};
int main( int argc, char *argv[] )
{
cout << getRandom(0, ULLONG_MAX) << endl;
cout << getRandom(0, ULLONG_MAX) << endl;
cout << getRandom(0, ULLONG_MAX) << endl;
return 0;
}
See it live in Coliru
#include <cstdint> // uint64_t, uint32_t
#include <cstdlib> // rand

union bigRand {
uint64_t ll;
uint32_t ii[2];
};
uint64_t rand64() {
bigRand b;
b.ii[0] = rand();
b.ii[1] = rand();
return b.ll;
}
I am not sure how portable it is. But you could easily modify it depending on how wide RAND_MAX is on the particular platform. As an upside, it is brutally simple. I mean the compiler will likely optimize it to be quite efficient, without extra arithmetic whatsoever. Just the cost of calling rand twice.
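If RAND_MAX on your platform is only 32767 (15 random bits per call), one possible adaptation along those lines is to assemble the 64-bit value from several calls; a sketch (my own, and still subject to the distribution caveats discussed in the other answers):
#include <cstdint>
#include <cstdlib>

std::uint64_t rand64_from_15bit_chunks()
{
    std::uint64_t value = 0;
    // 5 * 15 = 75 bits are generated; the excess high bits of the first chunk
    // are shifted off the top, leaving a full 64-bit result.
    for (int i = 0; i < 5; ++i) {
        value = (value << 15) | (std::uint64_t)(std::rand() & 0x7FFF);
    }
    return value;
}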
The most reasonable solution would be to use C++11's <random>; mt19937_64 would do.
Alternatively, you might try:
return ((double)rand() / ((double)RAND_MAX + 1.0)) * (end - begin + 1) + begin;
to produce numbers in a more reasonable way. However, note that just like your first attempt, this will still not produce uniformly distributed numbers (although it might be good enough).
The term (end - begin)*rand() seems to produce an overflow. You can alleviate that problem by using (end - begin) * (rand()/(double)RAND_MAX). Using the second way, I get the following results:
15498727792227194880
7275080918072332288
14445630964995612672
14728618955737210880
with the following calls:
std::cout << getRandom(0, ULLONG_MAX) << std::endl;
std::cout << getRandom(0, ULLONG_MAX) << std::endl;
std::cout << getRandom(0, ULLONG_MAX) << std::endl;
std::cout << getRandom(0, ULLONG_MAX) << std::endl;