Is jBCrypt's default log_rounds still appropriate for 2013 - password-encryption

I've been using jBCrypt version 0.3 out-of-the-box since it came out in 2010. I use the default gensalt() method, which sets the number of "log_rounds" to 10. Given the progression of password cracking hardware and methods, is this value still appropriate as a default, or should I be looking at some higher value?
Info from the javadoc...
String pw_hash = BCrypt_v03.hashpw(plain_password, BCrypt_v03.gensalt());
String strong_salt = BCrypt_v03.gensalt(10);
String stronger_salt = BCrypt_v03.gensalt(12);
The amount of work increases exponentially (2**log_rounds), so each increment is twice as much work. The default log_rounds is 10, and the valid range is 4 to 31.

I made a little test class to check the performance of checkpw() under differing salt log_rounds.
public void testCheckPerformance() {
int MULT = 1;
for( int i = 4; i < 31; i++) {
String salt = BCrypt_v03.gensalt(i);
String hashpw = BCrypt_v03.hashpw("my pwd", salt);
long startTs = System.currentTimeMillis();
for( int mult = 0; mult < MULT; mult++) {
assertTrue(BCrypt_v03.checkpw("my pwd", hashpw));
}
long endTs = System.currentTimeMillis();
System.out.println(""+i+": " + ((endTs-startTs)/MULT));
}
}
My PC is an 8 core i7 2.8GHz. The results are:
log-rounds: time in millis.
4: 3
5: 3
6: 6
7: 11
8: 22
9: 46
10: 92
11: 188
12: 349
13: 780
14: 1449
15: 2785
16: 5676
17: 11247
18: 22264
19: 45170
Using the default log_rounds=10 means that a single thread can check a login in 0.1s. This potentially limits the number of login checks per second that a single server can achieve.
So I guess the question becomes: how much time are you prepared to spend per password check, versus how many password checks per second do you want to size the system to cope with?
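To make that trade-off concrete, here is a rough back-of-the-envelope sketch (in C++ rather than Java, purely for the arithmetic). It assumes the 92 ms measured above for log_rounds = 10 as a baseline, that cost doubles with each increment, and that checks scale linearly across threads. The function and constant names are made up for illustration.

```cpp
#include <cmath>

// Baseline taken from the measurements above: log_rounds = 10 took about 92 ms.
constexpr int kBaseLogRounds = 10;
constexpr double kBaseMillis = 92.0;

// Estimated time for one password check, assuming the cost doubles
// with every increment of log_rounds (2**log_rounds iterations).
double estimateMillisPerCheck(int logRounds) {
    return kBaseMillis * std::pow(2.0, logRounds - kBaseLogRounds);
}

// Estimated login checks per second for a given number of worker threads,
// assuming the checks are CPU-bound and scale linearly across threads.
double estimateChecksPerSecond(int logRounds, int threads) {
    return threads * 1000.0 / estimateMillisPerCheck(logRounds);
}
```

For log_rounds = 12 this estimates 368 ms per check, which is in the same ballpark as the 349 ms measured above. Actual numbers depend on the hardware, so they are worth re-measuring on the target server.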

Related

bernoulli_distribution vs uniform_int_distribution

In comparing bernoulli_distribution's default constructor (50/50 chance of true/false) and uniform_int_distribution{0, 1} (uniform likely chance of 0 or 1) I find that bernoulli_distributions are at least 2x and upwards of 6x slower than uniform_int_distribution despite the fact that they give equivalent results.
I would expect bernoulli_distribution to perform better, since it is specifically designed for the probability of only two outcomes, true or false; yet it doesn't.
Given the above and the below performance metrics, are there practical uses of bernoulli distributions over uniform_int_distributions?
Results over 5 runs (Release mode, x64-bit):
(See edit below for release runs without the debugger attached)
bernoulli: 58 ms
false: 500690
true: 499310
uniform: 9 ms
1: 499710
0: 500290
----------
bernoulli: 57 ms
false: 500921
true: 499079
uniform: 9 ms
0: 499614
1: 500386
----------
bernoulli: 61 ms
false: 500440
true: 499560
uniform: 9 ms
0: 499575
1: 500425
----------
bernoulli: 59 ms
true: 498798
false: 501202
uniform: 9 ms
1: 499485
0: 500515
----------
bernoulli: 58 ms
true: 500777
false: 499223
uniform: 9 ms
0: 500450
1: 499550
----------
Profiling code:
#include <chrono>
#include <random>
#include <iostream>
#include <unordered_map>
int main() {
auto gb = std::mt19937{std::random_device{}()};
auto bd = std::bernoulli_distribution{};
auto bhist = std::unordered_map<bool, int>{};
auto start = std::chrono::steady_clock::now();
for(int i = 0; i < 1'000'000; ++i) {
bhist[bd(gb)]++;
}
auto end = std::chrono::steady_clock::now();
auto dif = std::chrono::duration_cast<std::chrono::milliseconds>(end - start);
std::cout << "bernoulli: " << dif.count() << " ms\n";
std::cout << std::boolalpha;
for(auto& b : bhist) {
std::cout << b.first << ": " << b.second << '\n';
}
std::cout << std::noboolalpha;
std::cout << '\n';
auto gu = std::mt19937{std::random_device{}()};
auto u = std::uniform_int_distribution<int>{0, 1};
auto uhist = std::unordered_map<int, int>{};
start = std::chrono::steady_clock::now();
for(int i = 0; i < 1'000'000; ++i) {
uhist[u(gu)]++;
}
end = std::chrono::steady_clock::now();
dif = std::chrono::duration_cast<std::chrono::milliseconds>(end - start);
std::cout << "uniform: " << dif.count() << " ms\n";
for(auto& b : uhist) {
std::cout << b.first << ": " << b.second << '\n';
}
std::cout << '\n';
}
EDIT
I re-ran the test without debugging symbols attached and bernoulli still ran a good 4x slower:
bernoulli: 37 ms
false: 500250
true: 499750
uniform: 9 ms
0: 500433
1: 499567
-----
bernoulli: 36 ms
false: 500595
true: 499405
uniform: 9 ms
0: 499061
1: 500939
-----
bernoulli: 36 ms
false: 500988
true: 499012
uniform: 8 ms
0: 499596
1: 500404
-----
bernoulli: 36 ms
true: 500425
false: 499575
uniform: 8 ms
0: 499974
1: 500026
-----
bernoulli: 36 ms
false: 500847
true: 499153
uniform: 8 ms
0: 500082
1: 499918
-----
A default constructed std::bernoulli_distribution gives equal weight to both outcomes, but you can give it a different distribution parameter to change the probabilities. That might cause extra complexity. A better comparison would be to use a std::uniform_real_distribution<double> and compare its result to 0.5 (by default it gives a random number in the range [0, 1)).
See here for an example:
gcc output:
bernoulli: 28 ms
false: 499818
true: 500182
uniform: 31 ms
1: 500686
0: 499314
real: 29 ms
1: 500191
0: 499809
clang output:
bernoulli: 106 ms
false: 500662
true: 499338
uniform: 23 ms
1: 501263
0: 498737
real: 101 ms
1: 499683
0: 500317
The results are about the same using gcc (multiple runs tend to give uniform int a higher time, contrary to what you saw). With clang I get bernoulli and real to be about the same, with uniform int being much less time. Both of these are using -O3.
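Since the linked example is not reproduced here, this is a minimal sketch of the uniform_real_distribution variant described above. The helper names flipWithReal and countTrue are made up for illustration; timing code is left out.

```cpp
#include <random>

// Coin flip built on uniform_real_distribution: draws from [0, 1) and
// compares the result against 0.5.
template <class Generator>
bool flipWithReal(Generator& gen) {
    std::uniform_real_distribution<double> dist;  // default range is [0, 1)
    return dist(gen) < 0.5;
}

// Counts how many of `trials` flips come up true.
template <class Generator>
int countTrue(Generator& gen, int trials) {
    int hits = 0;
    for (int i = 0; i < trials; ++i) {
        if (flipWithReal(gen)) {
            ++hits;
        }
    }
    return hits;
}
```

Calling countTrue with a std::mt19937 and one million trials should land close to 500000, like the histograms above.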
bernoulli_distribution is meant to generate booleans with possibly uneven probabilities. To do that, it has to generate a floating-point number in the [0, 1) range and compare it against the given probability, or do something equivalent.
That is likely to be slower than taking a random integer modulo 2, which is pretty much all it takes to produce a uniform value in {0, 1} from a random number.
So the result is not surprising. The performance could only match if the compiler knew at compile time that the probability is 50/50 and managed to eliminate the unnecessary operations.
Some comments and answers suggest using uniform_real_distribution instead.
I tested uniform_real_distribution(0.0f, nextafter(1.0f, 20.f)) (to account for urd being a half-open range) vs bernoulli_distribution, and bernoulli_distribution is faster by about 20%-25% regardless of the probability. It also gave more correct results: with a true probability of 1.0, my implementation using the above urd values actually gave false negatives (granted, one or two out of five one-million runs), whereas bernoulli correctly gave none.
So, speed-wise: bernoulli_distribution is faster than uniform_real_distribution but slower than uniform_int_distribution.
Long-story short, use the right tool for the job, don't reinvent the wheel, the STL is well-built, etc. and depending on the use-case one is better than the other.
For yes-no probability (IsPercentChance(float probability)), bernoulli_distribution is faster and better.
For pure "give me a random random bool value", uniform_int_distribution is faster and better.
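As a rough sketch of the two use cases, the helpers below wrap each distribution. isPercentChance follows the name mentioned above (taking a double here), coinFlip is a hypothetical name, and neither is tuned for speed since the distribution object is rebuilt on every call.

```cpp
#include <random>

// Returns true with the given probability (0.0 to 1.0).
template <class Generator>
bool isPercentChance(Generator& gen, double probability) {
    std::bernoulli_distribution dist(probability);
    return dist(gen);
}

// Plain 50/50 coin flip via uniform_int_distribution over {0, 1}.
template <class Generator>
bool coinFlip(Generator& gen) {
    std::uniform_int_distribution<int> dist(0, 1);
    return dist(gen) == 1;
}
```

In a hot loop you would probably construct the distribution once and reuse it, as the profiling code above does.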

Input a date in 8 digit numerical form and display into English form

Below is the C++ problem I have to solve, and I'm having some trouble from step 2) onwards.
1) Prompt the user to input a date in 8 digit numerical form(MMDDYYYY)
ex. 04221970
2) Display the date in English form
ex. 22nd April 1970
3) If the day the user entered is 01,21,31, add "st" after the day
4) Else if the day the user entered is 02,22, add "nd" after the day
5) Else if the day the user entered is 03,23, add "rd" after the day
6) Else add "th" after the day
All of the steps are rather straightforward. Dates are always tricky, since there are many rules, but only a couple of them apply to parsing.
Define a struct to hold the parsed value and parse the input.
[EDIT] The value of using a struct, is that it can be useful to have an intermediate function that would return this reusable bit of binary data.
struct date_s
{
unsigned int day;
unsigned int month;
unsigned int year;
};
// parsing
date_s date = {};
if (strlen(input) != 8 || sscanf(input, "%2u%2u%4u", &date.month, &date.day, &date.year) != 3)
{
// handle error
}
Validating the year is rather easy, there is nothing to do, unless you want to restrict to a specific range. For example, since we're using the Gregorian calendar, you may want to restrict to years after 1582, inclusive.
Validating the month is also very straightforward. We'll validate it along with the number of days in the month, which is the trickiest part because of February.
unsigned int day_max = 0;
switch (date.month)
{
case 1: case 3: case 5: case 7: case 8: case 10: case 12:
day_max = 31;
break;
case 4: case 6: case 9: case 11:
day_max = 30;
break;
case 2:
if (date.year % 4 != 0)
day_max = 28;
else if (date.year % 100 != 0)
day_max = 29;
else if (date.year % 400 != 0)
day_max = 28;
else
day_max = 29;
break;
// else day_max stays 0, of course
}
if (date.day < 1 || day_max < date.day)
{
// error
}
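The same validation can be packaged as two small helpers. This is only a sketch of the rule used in the switch above; the names isLeapYear and daysInMonth are made up here.

```cpp
// Gregorian leap-year rule: divisible by 4, except century years,
// unless the century year is also divisible by 400.
bool isLeapYear(unsigned int year) {
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

// Number of days in the given month, or 0 if the month is out of range.
unsigned int daysInMonth(unsigned int month, unsigned int year) {
    switch (month) {
    case 1: case 3: case 5: case 7: case 8: case 10: case 12:
        return 31;
    case 4: case 6: case 9: case 11:
        return 30;
    case 2:
        return isLeapYear(year) ? 29 : 28;
    default:
        return 0;  // invalid month
    }
}
```

A day is then valid when it is at least 1 and no greater than daysInMonth(month, year).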
After all the validating is done, all you have to do is print.
For the months you will need to define a table of strings for display.
const char* MONTH[12] = { "January", /* etc... */ };
Date suffix.
const char* SUFFIX[4] = { "st", "nd", "rd", "th" };
We now have all the data we need to print, and all within range, too.
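// 1, 21, 31 -> "st"; 2, 22 -> "nd"; 3, 23 -> "rd"; everything else, including 11 to 13, -> "th"
unsigned int ones = date.day % 10;
const char* suffix = (date.day / 10 != 1 && ones >= 1 && ones <= 3) ? SUFFIX[ones - 1] : SUFFIX[3];
printf("%u%s %s %u", date.day, suffix, MONTH[date.month - 1], date.year);
// or, for US format
printf("%s %u%s, %u", MONTH[date.month - 1], date.day, suffix, date.year);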
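Putting the pieces together, here is one possible end-to-end sketch that turns the 8-digit input into the English form. It uses std::string instead of sscanf/printf, only range-checks the month and a day of 1 to 31 (the per-month check above is left out for brevity), and the names ordinalSuffix and formatDate are made up for this example.

```cpp
#include <string>

// English ordinal suffix following the rules in the problem statement:
// 1, 21, 31 -> "st"; 2, 22 -> "nd"; 3, 23 -> "rd"; everything else -> "th".
const char* ordinalSuffix(unsigned int day) {
    if (day / 10 == 1) {
        return "th";  // 10 through 19, including 11, 12 and 13
    }
    switch (day % 10) {
    case 1: return "st";
    case 2: return "nd";
    case 3: return "rd";
    default: return "th";
    }
}

// Converts "MMDDYYYY" into e.g. "22nd April 1970".
// Returns an empty string if the input is not exactly 8 digits,
// or if the month or day is clearly out of range.
std::string formatDate(const std::string& input) {
    static const char* const MONTH[12] = {
        "January", "February", "March", "April", "May", "June",
        "July", "August", "September", "October", "November", "December"};

    if (input.size() != 8) {
        return "";
    }
    for (char c : input) {
        if (c < '0' || c > '9') {
            return "";
        }
    }

    const unsigned int month = static_cast<unsigned int>(std::stoul(input.substr(0, 2)));
    const unsigned int day = static_cast<unsigned int>(std::stoul(input.substr(2, 2)));
    const unsigned int year = static_cast<unsigned int>(std::stoul(input.substr(4, 4)));

    if (month < 1 || month > 12 || day < 1 || day > 31) {
        return "";
    }
    return std::to_string(day) + ordinalSuffix(day) + " " + MONTH[month - 1] + " " +
           std::to_string(year);
}
```

With the example from the question, formatDate("04221970") gives "22nd April 1970".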

How to get the right value?

How do I compute the right level from an XP value?
Example:
int gXP = globalDoHandle::PlayerStats_XP(p);
ostringstream sXP; sXP << "XP(RP): " << gXP;
ostringstream sLEVEL; sLEVEL << "Level: " << gLEVEL;
I want to use the XP value to get the right level.
If I have the value 24450, it should give me "10" back.
I know I can use something like this, but there are 8000 levels in the game!
if (gXP < 800) { Rank = "1"; }
else if (gXP < 2100) { Rank = "2"; }
else if (gXP < 3800) { Rank = "3"; }
...
LEVEL: XP
Level 1: 0
Level 2: 800
Level 3: 2100
Level 4: 3800
Level 5: 6100
Level 6: 9500
Level 7: 12500
Level 8: 16000
Level 9: 19800
Level 10: 24000
Level 11: 28500
Level 12: 33400
Level 13: 38700
Level 14: 44200
Level 15: 50200
Level 16: 56400
Level 17: 63000
Level 18: 69900
Level 19: 77100
Level 20: 84700
...Level 8000: 1787576850
To do such a job, you need std::upper_bound. std::upper_bound(l, h, v) returns an iterator it to the first element in the sorted range [l, h) for which *it > v, or h if there is no such element.
constexpr std::array<unsigned, 10> levels = { /* ... */ }; // xp needed for each level
unsigned level(unsigned xp)
{
// first threshold strictly above xp; its index is the number of levels already reached
auto it = std::upper_bound(cbegin(levels), cend(levels), xp);
return std::distance(cbegin(levels), it);
}
level(xp) returns the level reached with xp experience points with respect to the values of levels.
See a full demo online
Use a std::array with your exp values in.
std::array<int, 8000> exp_table = { 0, 800, 2100... };
int level=0;
for(; level < exp_table.size() && exp_table[level] <= gXP; ++level); // <= so that reaching a threshold exactly counts as that level
Rank = std::to_string(level); // If this is needed?
After this, level will be set correctly. The only thing left is how you generate your exp_table: read it from a file, or use some math formula.
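For a quick check of the table-lookup idea, here is a self-contained sketch using only the first 20 thresholds listed in the question (the full 8000-entry table would have to come from elsewhere). It uses std::upper_bound so that an XP value sitting exactly on a threshold counts as having reached that level. The names kLevelThresholds and levelForXp are made up for this example.

```cpp
#include <algorithm>
#include <array>
#include <iterator>

// XP needed to reach levels 1..20, copied from the table in the question.
constexpr std::array<unsigned, 20> kLevelThresholds = {
    0,     800,   2100,  3800,  6100,  9500,  12500, 16000, 19800, 24000,
    28500, 33400, 38700, 44200, 50200, 56400, 63000, 69900, 77100, 84700};

// Returns the level reached with `xp` experience points.
// upper_bound finds the first threshold strictly greater than xp; the number
// of thresholds before it is the level (binary search, O(log n)).
unsigned levelForXp(unsigned xp) {
    const auto it =
        std::upper_bound(kLevelThresholds.cbegin(), kLevelThresholds.cend(), xp);
    return static_cast<unsigned>(std::distance(kLevelThresholds.cbegin(), it));
}
```

levelForXp(24450) returns 10, matching the example in the question. Values beyond the last threshold are capped at the highest level in the table.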

How is this DP solution approached?

I have been trying to solve this TopCoder problem for a while now, and I'm not able to come up with a DP-based solution to it (the problem is given below). I also found a solution by someone else (also given below), but I can't even understand it.
Could anyone help me with the solution here? I don't understand what line of thought would lead to this solution? How do I go about establishing the recurrence relation in my mind? I have absolutely no clue how to approach this problem, or how the person who wrote that solution wrote it.
PS: I'm aware TopCoder has editorials for problems, but this one's hasn't been released yet. I don't know why.
Here's the problem
Fox Ciel has lots of homework to do. The homework consists of some
mutually independent tasks. Different tasks may take different amounts
of time to complete. You are given a int[] workCost. For each i, the
i-th task takes workCost[i] seconds to complete. She would like to
attend a party and meet her friends, thus she wants to finish all
tasks as quickly as possible.
The main problem is that all foxes, including Ciel, really hate doing
homework. Each fox is only willing to do one of the tasks. Luckily,
Doraemon, a robotic cat from the 22nd century, gave Fox Ciel a split
hammer: a magic gadget which can split any fox into two foxes.
You are given an int splitCost. Using the split hammer on a fox is
instantaneous. Once a hammer is used on a fox, the fox starts to
split. After splitCost seconds she will turn into two foxes -- the
original fox and another completely new fox. While a fox is splitting,
it is not allowed to use the hammer on her again.
The work on a task cannot be interrupted: once a fox starts working on
a task, she must finish it. It is not allowed for multiple foxes to
cooperate on the same task. A fox cannot work on a task while she is
being split using the hammer. It is possible to split the same fox
multiple times. It is possible to split a fox both before and after
she solves one of the tasks.
Compute and return the smallest amount of time in which the foxes can
solve all the tasks.
Here's the solution:
const int maxN = 55;
int dp[maxN][maxN*2];
int N;
int splitC;
vector<int> workC;
int rec(int,int);
int FoxAndDoraemon::minTime(vector <int> workCost, int splitCost) {
    sort(workCost.begin(), workCost.end());
    N = workCost.size();
    splitC = splitCost;
    workC = workCost;
    memset(dp, -1, sizeof(dp));
    return rec(N-1, 1);
}
int rec(int jobs, int fox)
{
    if(jobs == -1) return 0;
    int& val = dp[jobs][fox];
    if(val != -1) return val;
    val = 0;
    if( (jobs+1) <= fox) val = max(workC[jobs] , rec(jobs-1, fox-1));
    else
    {
        //split fox
        val = splitC + rec(jobs, fox + 1);
        if( !(fox == 1 && jobs > 0) )
            val = min(val, max(workC[jobs], rec(jobs-1, fox-1)));
    }
    return val;
}
DP problems usually require you to work through a couple of examples, and the only way to get good at them is practice. Try solving some standard DP problem types such as longest increasing subsequence, knapsack, coin change, matrix multiplication, TSP, etc. Then try variants of these types.
As for the above problem, a few things to note:
You need N foxes to finish N tasks (each fox works on only one task). So, once you have N foxes, you don't need to create any more. And if you have more than one task, you have to split the first fox.
With each fox you can do two things:
Split it and then calculate the minimum time taken.
Don't split, but perform the current task and calculate the time it takes to perform the remaining tasks with one fewer fox.
Note that you can only opt for not splitting if you have more than one fox (or one fox with one task).
This should give you an idea for the problem. I haven't thoroughly analyzed the problem, but the recursion doesn't seem to create overlapping calls i.e. if I have 3 tasks and 2 foxes, I'm only calling that state once and no more. So, the solution is a regular recursive solution and not a DP.
The editorial is now up at the TopCoder site. You can have a look there!
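To experiment with the recurrence outside the TopCoder harness, here is a self-contained sketch of the same memoized recursion as the solution quoted in the question. It only restates that code as a free function with a local memo table and has not been checked against the official editorial. The state is (index of the largest remaining job, number of idle foxes).

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Same recurrence as the quoted solution, as a free function.
int minTime(std::vector<int> workCost, int splitCost) {
    std::sort(workCost.begin(), workCost.end());
    const int n = static_cast<int>(workCost.size());

    // memo[jobs][fox]; -1 means "not computed yet". fox never exceeds n.
    std::vector<std::vector<int>> memo(n, std::vector<int>(n + 2, -1));

    std::function<int(int, int)> rec = [&](int jobs, int fox) -> int {
        if (jobs == -1) {
            return 0;  // no tasks left
        }
        int& val = memo[jobs][fox];
        if (val != -1) {
            return val;
        }
        if (jobs + 1 <= fox) {
            // Enough foxes for every remaining task: one fox takes the largest task now.
            val = std::max(workCost[jobs], rec(jobs - 1, fox - 1));
        } else {
            // Option 1: split, which costs splitCost and yields one more fox.
            val = splitCost + rec(jobs, fox + 1);
            // Option 2: assign the largest task to one fox, unless that would
            // use up the only fox while tasks remain.
            if (!(fox == 1 && jobs > 0)) {
                val = std::min(val, std::max(workCost[jobs], rec(jobs - 1, fox - 1)));
            }
        }
        return val;
    };

    return rec(n - 1, 1);
}
```

Tracing by hand, minTime({1, 2}, 1000) goes through one forced split and then both tasks run in parallel, giving 1002.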

I asked this yesterday, after the input given I'm still having trouble implementing

I'm not sure how to fix this or what I did wrong, but whenever I enter in a value it just closes out the run prompt.
So, seems I do have a problem somewhere in my coding. Whenever I run the program and input a variable, it always returns the same answer.."The content at location 76 is 0."
On that note, someone told me that "I don't know, but I suspect that Program A incorrectly has a fixed address being branched to on instructions 10 and 11." - mctylr but I'm not sure how to fix that..
I'm trying to figure out how to incorporate this idea from R Samuel Klatchko.. I'm still not sure what I'm missing but I can't get it to work..
const int OP_LOAD = 3;
const int OP_STORE = 4;
const int OP_ADD = 5;
...
const int OP_LOCATION_MULTIPLIER = 100;
mem[0] = OP_LOAD * OP_LOCATION_MULTIPLIER + ...;
mem[1] = OP_ADD * OP_LOCATION_MULTIPLIER + ...;
operand = memory[ j ] % OP_LOCATION_MULTIPLIER;
operation = memory[ j ] / OP_LOCATION_MULTIPLIER;
I'm new to programming, I'm not the best, so I'm going for simplicity. Also this is an SML program. Anyway, this IS a homework assignment and I'm wanting a good grade on this. So I was looking for input and making sure this program will do what I'm hoping they are looking for. Anyway, here are the instructions: Write SML (Simpletron Machine Language) programs to accomplish each of the following tasks:
A) Use a sentinel-controlled loop to read positive numbers and compute and print their sum. Terminate input when a negative number is entered.
B) Use a counter-controlled loop to read seven numbers, some positive and some negative, and compute and print the average.
C) Read a series of numbers, and determine and print the largest number. The first number read indicates how many numbers should be processed.
Without further ado, here is my program. All together.
int main()
{
const int READ = 10;
const int WRITE = 11;
const int LOAD = 20;
const int STORE = 21;
const int ADD = 30;
const int SUBTRACT = 31;
const int DIVIDE = 32;
const int MULTIPLY = 33;
const int BRANCH = 40;
const int BRANCHNEG = 41;
const int BRANCHZERO = 41;
const int HALT = 43;
int mem[100] = {0}; //Making it 100, since simpletron contains a 100 word mem.
int operation; //taking the rest of these variables straight out of the book seeing as how they were italisized.
int operand;
int accum = 0; // the special register is starting at 0
int j;
// This is for part a, it will take in positive variables in a sent-controlled loop and compute + print their sum. Variables from example in text.
memory [0] = 1010;
memory [01] = 2009;
memory [02] = 3008;
memory [03] = 2109;
memory [04] = 1109;
memory [05] = 4300;
memory [06] = 1009;
j = 0; //Makes the variable j start at 0.
while ( true )
{
operand = memory[ j ]%100; // Finds the op codes from the limit on the memory (100)
operation = memory[ j ]/100;
//using a switch loop to set up the loops for the cases
switch ( operation ){
case 10: //reads a variable into a word from loc. Enter in -1 to exit
cout <<"\n Input a positive variable: ";
cin >> memory[ operand ]; break;
case 11: // takes a word from location
cout << "\n\nThe content at location " << operand << "is " << memory[operand]; break;
case 20:// loads
accum = memory[ operand ]; break;
case 21: //stores
memory[ operand ] = accum; break;
case 30: //adds
accum += mem[operand]; break;
case 31: // subtracts
accum-= memory[ operand ]; break;
case 32: //divides
accum /=(memory[ operand ]); break;
case 33: // multiplies
accum*= memory [ operand ]; break;
case 40: // Branches to location
j = -1; break;
case 41: //branches if acc. is < 0
if (accum < 0)
j = 5;
break;
case 42: //branches if acc = 0
if (accum == 0)
j = 5;
break;
case 43: // Program ends
exit(0); break;
}
j++;
}
return 0;
}
So, seems I do have a problem
somewhere in my coding. Whenever I run
the program and input a variable, it
always returns the same answer.."The
content at location 76 is 0."
Look at what your program is doing:
memory [0] = 1010; /* Read from user and store at address 10 */
memory [01] = 2009; /* Read garbage into acc from address 9 */
memory [02] = 3008; /* Add whatever garbage is in address 8 into accumulator */
memory [03] = 2109; /* Store garbage from accumulator into address 9 */
memory [04] = 1109; /* Print the contents of address 9, which is garbage */
memory [05] = 4300; /* Stop */
memory [06] = 1009; /* Read from user and store in address 9 */
So... yeah. To debug something like this, you just need to print out all the pertinent variables of your program to see whether they're what you think they are. For example, in case 10 you could have done cout << "10: operand is " << operand << endl; and then in case 11 you could have done cout << "11: operand is " << operand << endl; and you would have seen right away that the operand was 10 in the first instruction, 9 in the second, and 8 in the 3rd.
If you want to guarantee that memory always starts with the value 0, write a for loop (you can change your declaration to do it but there are subtleties you'd need to learn):
for( int i = 0; i < sizeof(memory) / sizeof(memory[0]); ++i ) { memory[i] = 0; } // sizeof(memory) alone is the size in bytes, not the element count
Here's a working program that adds 2 values the user enters and prints the result:
memory [0] = 1009;
memory [1] = 1008;
memory [2] = 2009;
memory [3] = 3008;
memory [4] = 2109;
memory [5] = 1109;
memory [6] = 4300;
As I suggested yesterday in your original question, I believe you may have an error in case 10 and 11: they are hard coded to modify the "stack pointer" (j) to 5, if I'm reading your code correctly.
The case statements for switch (operation) don't match your opcode constants (e.g. READ = 10, BRANCH = 40). (This has been fixed in your example)
For debugging, at least add a default statement to the switch so that unknown operations are caught instead of silently ignored.
Added:
I'd also suggest printing the operation and operand as they are being executed, to help you follow the Simpletron's program execution.
You still have not fixed the usage of leading zeros of memory addresses. The C/C++ compiler interprets the leading zero as meaning octal (base 8) number system.
Your example code as posted does not even compile. Please edit and fix the variable name usage (hint: mixing mem and memory).
Code fixes removed.
The values in the switch statement are incorrect. The operation values will be 10, 11, 20, 21 etc. not 1, 2, 3 etc as you check for in your cases. Which means none of your code gets executed as you don't have cases for these numbers.
Good luck!
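Pulling the suggestions above together, here is one possible sketch of a Simpletron interpreter that can be exercised without typing at a prompt. It is not the assignment's required form: runSimpletron is a made-up name, input and output go through vectors instead of cin/cout, branches jump to the instruction's operand rather than to a fixed address, and unknown opcodes hit a default case.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Runs an SML program and returns everything it "printed".
// Opcodes follow the constants in the question (READ = 10 ... HALT = 43).
std::vector<int> runSimpletron(const std::vector<int>& program,
                               const std::vector<int>& inputs) {
    int memory[100] = {0};  // 100-word memory, zero-initialised
    for (std::size_t i = 0; i < program.size() && i < 100; ++i) {
        memory[i] = program[i];
    }

    std::vector<int> outputs;
    std::size_t nextInput = 0;
    int accum = 0;
    int pc = 0;  // index of the next instruction to execute

    while (pc >= 0 && pc < 100) {
        const int operation = memory[pc] / 100;
        const int operand = memory[pc] % 100;
        ++pc;  // advance first so that branches can simply overwrite pc

        switch (operation) {
        case 10:  // READ a value into memory[operand]
            if (nextInput >= inputs.size()) throw std::runtime_error("out of input");
            memory[operand] = inputs[nextInput++];
            break;
        case 11: outputs.push_back(memory[operand]); break;  // WRITE
        case 20: accum = memory[operand]; break;             // LOAD
        case 21: memory[operand] = accum; break;             // STORE
        case 30: accum += memory[operand]; break;            // ADD
        case 31: accum -= memory[operand]; break;            // SUBTRACT
        case 32:                                             // DIVIDE
            if (memory[operand] == 0) throw std::runtime_error("division by zero");
            accum /= memory[operand];
            break;
        case 33: accum *= memory[operand]; break;            // MULTIPLY
        case 40: pc = operand; break;                        // BRANCH
        case 41: if (accum < 0) pc = operand; break;         // BRANCHNEG
        case 42: if (accum == 0) pc = operand; break;        // BRANCHZERO
        case 43: return outputs;                             // HALT
        default: throw std::runtime_error("unknown operation");
        }
    }
    throw std::runtime_error("ran off the end of memory");
}
```

Feeding it the two-number addition program shown earlier (1009, 1008, 2009, 3008, 2109, 1109, 4300) with inputs 3 and 4 returns a single output, 7.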