Best string hash function for this example - c++

I have keys of the form AcccAA, where A is a capital letter [A..Z] and c is a digit [1..9]. I have 1500 segments.
This is my temporary hash function:
int HashFunc(string key){
int Adress = ((key[0] + key[1] + key[2] + key[3] + key[4] + key[5]) - 339) * 14;
return Adress;
}
Excel shows a lot of collisions in the middle of the range (from 400 to 900).
Please suggest a hash function that distributes the keys more evenly.

A common way to build a hash function in this case is to evaluate some polynomial with prime coefficients, like this one:
int address = key[0] +
31 * key[1] +
137 * key[2] +
1571 * key[3] +
11047 * key[4] +
77813 * key[5];
return address % kNumBuckets;
This gives a much larger dispersion over the key space. Right now, you get a lot of collisions because anagrams like AB000A and BA000A will collide, but with the above hash function the hash is much more sensitive to small changes in the input.
For a more complex but (probably) way better hash function, consider using a string hash function like the shift-add-XOR hash, which also gets good dispersion but is less intuitive.
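As a rough illustration of the shift-add-XOR idea mentioned above, here is a minimal sketch. The names saxHash and bucketFor are made up for this example, the shift amounts (5 and 2) are the commonly quoted ones rather than anything from the question, and the default of 1500 buckets is taken from the "1500 segments" in the question.

```cpp
#include <cstdint>
#include <string>

// Shift-add-XOR string hash: every character is mixed into the running
// value together with two shifted copies of that value.
std::uint32_t saxHash(const std::string& key) {
    std::uint32_t h = 0;
    for (unsigned char c : key) {
        h ^= (h << 5) + (h >> 2) + c;
    }
    return h;
}

// Hypothetical helper: reduce the hash to a bucket index.
std::uint32_t bucketFor(const std::string& key, std::uint32_t numBuckets = 1500) {
    return saxHash(key) % numBuckets;
}
```

Because each step depends on the running value, swapping two characters changes the result: saxHash("AB") and saxHash("BA") already differ, unlike a plain sum of character codes.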
Hope this helps!

One way is to construct a guaranteed collision-free number (which will not make your hash table collision free of course), as long as the possible keys fit in an integral type (e.g. int):
int number = (key[0] - 'A') + 26 * (
(key[1] - '0') + 10 * (
(key[2] - '0') + 10 * (
(key[3] - '0') + 10 * (
(key[4] - 'A') + 26 * (
(key[5] - 'A')
)))));
This works since 26 * 10 * 10 * 10 * 26 * 26 = 17576000 which fits into an int fine.
Finally simply hash this integer.
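A minimal sketch of the whole path from key to bucket might look like this. It assumes the key is exactly six characters in the AcccAA format with digits '0' to '9', and it uses a plain modulo as the final hashing step, which is only one possible choice; keyToNumber and bucketForKey are names invented for the example.

```cpp
#include <string>

// Maps a key of the form AcccAA to a unique number in [0, 17576000).
// No validation is done; the caller must pass a well-formed key.
int keyToNumber(const std::string& key) {
    return (key[0] - 'A') + 26 * (
           (key[1] - '0') + 10 * (
           (key[2] - '0') + 10 * (
           (key[3] - '0') + 10 * (
           (key[4] - 'A') + 26 *
           (key[5] - 'A')))));
}

// Hypothetical final step: reduce the unique number to a bucket index.
int bucketForKey(const std::string& key, int numBuckets = 1500) {
    return keyToNumber(key) % numBuckets;
}
```

For example, keyToNumber("A000AA") is 0 and keyToNumber("Z999ZZ") is 17575999, the largest possible value.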

Related

Adding number to pointer value

I am trying to add a number to a pointer value with the following expression:
&AddressHelper::getInstance().GetBaseAddress() + 0x39EA0;
The value for the &AddressHelper::getInstance().GetBaseAddress() is always 0x00007ff851cd3c68 {140700810412032}
Shouldn't I get 0x00007ff851cd3c68 + 0x39EA0 = 0x00007ff851d0db08 as a result?
Instead I am getting 0x00007ff851ea3168, or sometimes 0x00007ff852933168, or some other number.
Did I take the pointer value incorrectly?
With pointer arithmetic, type is taken into account,
so with:
int buffer[42];
char* start_c = reinterpret_cast<char*>(buffer);
int *start_i = buffer;
we have
start_i + 1 == &buffer[1]
reinterpret_cast<char*>(start_i + 1) == start_c + sizeof(int).
and (when sizeof(int) != 1) reinterpret_cast<char*>(start_i + 1) != start_c + 1
In your case:
(0x00007ff851ea3168 - 0x00007ff851cd3c68) / 0x39EA0 = 0x08
so the pointed-to type is 8 bytes wide (for example a DWORD64 or a 64-bit pointer; a plain DWORD is 4 bytes).
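To make the scaling visible, and to show one way of adding a raw byte offset instead, here is a small sketch. Both helper names are hypothetical, and the cast through char* is just one common way to do byte-wise arithmetic; the result must still point into the same object for the arithmetic to be well defined.

```cpp
#include <cstddef>
#include <cstdint>

// Number of bytes the address moves when `count` is added to a T*.
template <typename T>
std::ptrdiff_t bytesAdvanced(T* p, std::ptrdiff_t count) {
    return reinterpret_cast<char*>(p + count) - reinterpret_cast<char*>(p);
}

// Hypothetical helper: add an offset measured in bytes, not in elements.
template <typename T>
T* addByteOffset(T* p, std::ptrdiff_t bytes) {
    return reinterpret_cast<T*>(reinterpret_cast<char*>(p) + bytes);
}
```

With a std::uint64_t buffer, bytesAdvanced(buf, 3) is 24, while addByteOffset(buf, 16) lands on &buf[2].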

How to iterate over objective function

I want to apply a different exponent to each term in my objective function.
model.addConstr(KW == quicksum(I[t] * (1.05**(-i)) for t in Tst)
                + quicksum(Z[t] * (1.05**(-j)) for t in T))
model.setObjective(KW, GRB.MAXIMIZE)
model.optimize()
The exponents i and j should run from 1 to the number of elements in Tst and T, respectively.
So if Tst is [2020, 2021, 2022], I[2020] gets multiplied by 1.05**(-1), I[2021] by 1.05**(-2) and I[2022] by 1.05**(-3).
The same goes for Z[t], except that the list T is longer than Tst.
I tried this:
for i in range(1, len(Tst)+1):
    model.addConstr(KW == quicksum(I[t] * (1.05**(-i))))
However, KW is then always 0, which it shouldn't be. What am I missing?
I just created a second dictionary that maps each year to its exponent:
Expo = {}
i = 1
for t in T:
    Expo[t] = i
    i = i + 1
If I do:
model.addConstr(KW == quicksum(I[t] * (1.05**(Expo[t])) for t in Tst)
                + quicksum(Z[t] * (1.05**(Expo[t])) for t in T))
model.setObjective(KW, GRB.MAXIMIZE)
model.optimize()
it does what I want. But I don't think it's a very good solution :P

How do I go about getting the real result for 50%60 in C++

Please check this problem. I'm creating a time-based app, but I'm having trouble with the modulus operator (%). I want the remainder of 50 % 60, which I expect to be 10, but it just gives me the left-hand value instead, i.e. 50. How do I go about it?
Here is the relevant part of the code:
void setM(int m){
if ((m+min)>59){
hour+=((min+m)/60);
min=0;
min=(min+m)%60;
}
else min+=m;
}
In the code, m is passed in as 50 and min is 10.
How do I get min to come out as 10 from the expression min=(min+m)%60; without reversing it, i.e.
60%(min+m)
In C++, the expression a % b returns the remainder of dividing a by b. (For negative operands, the result takes the sign of a since C++11; before that, the sign was implementation-defined.)
You should use 60 % 50 if you want to divide by 50.
Or, if you want to get the minutes, I think you don't need to set min = 0 first.
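A minimal sketch of that last point, dropping the min = 0 reset so that the existing minutes take part in the modulo. The Clock struct and the name addMinutes are made up for the example, and it assumes m is never negative.

```cpp
// Hypothetical stand-in for the class that holds hour and min.
struct Clock {
    int hour = 0;
    int min = 0;

    // Adds m minutes (assumed m >= 0) and carries whole hours over.
    void addMinutes(int m) {
        int total = min + m;   // keep the old minutes in the sum
        hour += total / 60;    // whole hours contained in the sum
        min = total % 60;      // what is left after removing those hours
    }
};
```

With min at 10 and m at 50, the sum is 60, so hour goes up by one and min becomes 0. A remainder of 10 would need a sum such as 70, since 70 % 60 is 10.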
When you do 50 % 60, you get a remainder of 50, since 60 does not go into 50 even once.
To get around this, you can try something like 70 % 60 to get the value you expect, since you do not want to use 60 % 50.
This would follow the following logic:
Find the difference between 60 and min + m after min is set to zero, if min + m is less than 60. Store it in a variable var that is initially set to zero.
Check whether the result is negative; if it is, make it positive by multiplying it by -1.
When you do the operation, do min = ((min + m) + var) % 60; instead.
***Note: As I am unfamiliar with a time-based app and what its purpose is, this solution may or may not be what you need, so please tell me in the comments before downvoting if there is anything wrong with my answer. Thanks!
It looks like you are trying to convert an integral number of minutes to an hour/minute pair. That would look more like this instead:
void setM(int m)
{
hour = m / 60;
min = m % 60;
}
If you are trying to add an integral number of minutes to an existing hour/minute pair, it would look more like this:
void addM(int m)
{
int value = (hour * 60) + min;
value += m;
hour = value / 60;
min = value % 60;
}
Or
void addM(int m)
{
setM(((hour * 60) + min) + m);
}

Time optimize C++ function to find number of decoding possibilities

This is an interview practice problem from CodeFights. I have a solution that's working except for the fact that it takes too long to run for very large inputs.
Problem Description (from the link above)
A top secret message containing uppercase letters from 'A' to 'Z' has been encoded as numbers using the following mapping:
'A' -> 1
'B' -> 2
...
'Z' -> 26
You are an FBI agent and you need to determine the total number of ways that the message can be decoded.
Since the answer could be very large, take it modulo 10^9 + 7.
Example
For message = "123", the output should be
mapDecoding(message) = 3.
"123" can be decoded as "ABC" (1 2 3), "LC" (12 3) or "AW" (1 23), so the total number of ways is 3.
Input/Output
[time limit] 500ms (cpp)
[input] string message
A string containing only digits.
Guaranteed constraints:
0 ≤ message.length ≤ 10^5.
[output] integer
The total number of ways to decode the given message.
My Solution so far
We have to implement the solution in a function int mapDecoding(std::string message), so my entire solution is as follows:
/*0*/ void countValidPaths(int stIx, int endIx, std::string message, long *numPaths)
/*1*/ {
/*2*/ //check out-of-bounds error
/*3*/ if (endIx >= message.length())
/*4*/ return;
/*5*/
/*6*/ int subNum = 0, curCharNum;
/*7*/ //convert substr to int
/*8*/ for (int i = stIx; i <= endIx; ++i)
/*9*/ {
/*10*/ curCharNum = message[i] - '0';
/*11*/ subNum = subNum * 10 + curCharNum;
/*12*/ }
/*13*/
/*14*/ //check for leading 0 in two-digit number, which would not be valid
/*15*/ if (endIx > stIx && subNum < 10)
/*16*/ return;
/*17*/
/*18*/ //if number is valid
/*19*/ if (subNum <= 26 && subNum >= 1)
/*20*/ {
/*21*/ //we've reached the end of the string with success, therefore return a 1
/*22*/ if (endIx == (message.length() - 1) )
/*23*/ ++(*numPaths);
/*24*/ //branch out into the next 1- and 2-digit combos
/*25*/ else if (endIx == stIx)
/*26*/ {
/*27*/ countValidPaths(stIx, endIx + 1, message, numPaths);
/*28*/ countValidPaths(stIx + 1, endIx + 1, message, numPaths);
/*29*/ }
/*30*/ //proceed to the next digit
/*31*/ else
/*32*/ countValidPaths(endIx + 1, endIx + 1, message, numPaths);
/*33*/ }
/*34*/ }
/*35*/
/*36*/ int mapDecoding(std::string message)
/*37*/ {
/*38*/ if (message == "")
/*39*/ return 1;
/*40*/ long numPaths = 0;
/*41*/ int modByThis = static_cast<int>(std::pow(10.0, 9.0) + 7);
/*42*/ countValidPaths(0, 0, message, &numPaths);
/*43*/ return static_cast<int> (numPaths % modByThis);
/*44*/ }
The Issue
I have passed 11/12 of CodeFight's initial test cases, e.g. mapDecoding("123") = 3 and mapDecoding("11115112112") = 104. However, the last test case has message = "1221112111122221211221221212212212111221222212122221222112122212121212221212122221211112212212211211", and my program takes too long to execute:
Expected_output: 782204094
My_program_output: <empty due to timeout>
I wrote countValidPaths() as a recursive function, and its recursive calls are on lines 27, 28 and 32. I can see how such a large input would cause the code to take so long, but I'm racking my brain trying to figure out what more efficient solutions would cover all possible combinations.
Thus the million dollar question: what suggestions do you have to optimize my current program so that it runs in far less time?
A couple of suggestions.
First this problem can probably be formulated as a Dynamic Programming problem. It has that smell to me. You are computing the same thing over and over again.
The second is the insight that long contiguous sequences of "1"s and "2"s follow the Fibonacci sequence in terms of the number of possibilities. Any other value terminates the sequence. So you can split the string into runs of ones and twos, each terminated by any other digit. You will need special logic for a terminating zero, since zero does not correspond to a character on its own, and for a 7, 8 or 9 that follows a 2, since 27 to 29 are not valid codes. So split the string, count the length of each segment, look up the Fibonacci number for that length (which can be pre-computed) and multiply the values. Your example "11115112112" yields "11115" and "112112", with f(5) = 8 and f(6) = 13, and 8*13 = 104.
Your long string is a sequence of 1's and 2's that is 100 digits long. The following Java program (sorry, my C++ is rusty) correctly computes its value by this method:
import java.math.BigInteger;

public class FibPaths {
private static final int MAX_LEN = 105;
private static final BigInteger MOD_CONST = new BigInteger("1000000007");
private static BigInteger[] fibNum = new BigInteger[MAX_LEN];
private static void computeFibNums() {
fibNum[0] = new BigInteger("1");
fibNum[1] = new BigInteger("1");
for (int i = 2; i < MAX_LEN; i++) {
fibNum[i] = fibNum[i-2].add(fibNum[i-1]);
}
}
public static void main(String[] argv) {
String x = "1221112111122221211221221212212212111221222212122221222112122212121212221212122221211112212212211211";
computeFibNums();
BigInteger val = fibNum[x.length()].mod(MOD_CONST);
System.out.println("N=" + x.length() + " , val = " + val);
}
}
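The dynamic programming formulation suggested first can be sketched in C++ roughly as follows. This is one possible way to write it, not the only one: the function name mapDecodingDp is made up, and it assumes the input contains only digits, as the problem statement guarantees. It keeps only the counts for the two previous prefixes, so it runs in linear time and constant extra space.

```cpp
#include <cstddef>
#include <string>

// Counts decodings of `message` modulo 10^9 + 7.
// ways(i) = ways(i-1) if the last digit is 1..9
//         + ways(i-2) if the last two digits form 10..26.
int mapDecodingDp(const std::string& message) {
    const long long kMod = 1000000007LL;
    long long prev2 = 1;  // ways for the prefix two characters shorter
    long long prev1 = 1;  // ways for the prefix one shorter (empty prefix: 1)
    for (std::size_t i = 0; i < message.size(); ++i) {
        long long cur = 0;
        int digit = message[i] - '0';
        if (digit >= 1) {
            cur += prev1;  // decode this digit on its own
        }
        if (i >= 1) {
            int pair = (message[i - 1] - '0') * 10 + digit;
            if (pair >= 10 && pair <= 26) {
                cur += prev2;  // decode the last two digits together
            }
        }
        prev2 = prev1;
        prev1 = cur % kMod;
    }
    return static_cast<int>(prev1);
}
```

Zeros and pairs above 26 need no special casing here, because each transition is only taken when the one- or two-digit number is a valid code.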

How to store output of very large Fibonacci number?

I am making a program for nth Fibonacci number. I made the following program using recursion and memoization.
The main problem is that the value of n can go up to 10000, which means that the 10000th Fibonacci number would be more than 2000 digits long.
With a little bit of googling, I found that I could use an array and store every digit of the result in one element of the array, but I am still not able to figure out how to implement this approach in my program.
#include<iostream>
using namespace std;
long long int memo[101000];
long long int n;
long long int fib(long long int n)
{
if(n==1 || n==2)
return 1;
if(memo[n]!=0)
return memo[n];
return memo[n] = fib(n-1) + fib(n-2);
}
int main()
{
cin>>n;
long long int ans = fib(n);
cout<<ans;
}
How do I implement that approach or if there is another method that can be used to achieve such large values?
One thing that I think should be pointed out is there's other ways to implement fib that are much easier for something like C++ to compute
consider the following pseudo code
function fib (n) {
let a = 0, b = 1, _;
while (n > 0) {
_ = a;
a = b;
b = b + _;
n = n - 1;
}
return a;
}
This doesn't require memoisation and you don't have to be concerned about blowing up your stack with too many recursive calls. Recursion is a really powerful looping construct but it's one of those fubu things that's best left to langs like Lisp, Scheme, Kotlin, Lua (and a few others) that support it so elegantly.
That's not to say tail call elimination is impossible in C++, but unless you're doing something to optimise/compile for it explicitly, I'm doubtful that whatever compiler you're using would support it by default.
As for computing the exceptionally large numbers, you'll have to either get creative doing adding The Hard Way or rely upon an arbitrary precision arithmetic library like GMP. I'm sure there's other libs for this too.
Adding The Hard Way™
Remember how you used to add big numbers when you were a little tater tot, fresh off the aluminum foil?
5-year-old math
1259601512351095520986368
+ 50695640938240596831104
---------------------------
?
Well you gotta add each column, right to left. And when a column overflows into the double digits, remember to carry that 1 over to the next column.
... <-001
1259601512351095520986368
+ 50695640938240596831104
---------------------------
... <-472
The 10,000th fibonacci number is thousands of digits long, so there's no way that's going to fit in any integer C++ provides out of the box. So without relying upon a library, you could use a string or an array of single-digit numbers. To output the final number, you'll have to convert it to a string tho.
(Wolfram Alpha: fibonacci 10000)
Doing it this way, you'll perform on the order of ten million single-digit additions; it might take a while, but it should be a breeze for any modern computer to handle. Time to get to work!
Here's an example of a Bignum module in JavaScript
const Bignum =
{ fromInt: (n = 0) =>
n < 10
? [ n ]
: [ n % 10, ...Bignum.fromInt (n / 10 >> 0) ]
, fromString: (s = "0") =>
Array.from (s, Number) .reverse ()
, toString: (b) =>
b .reverse () .join ("")
, add: (b1, b2) =>
{
const len = Math.max (b1.length, b2.length)
let answer = []
let carry = 0
for (let i = 0; i < len; i = i + 1) {
const x = b1[i] || 0
const y = b2[i] || 0
const sum = x + y + carry
answer.push (sum % 10)
carry = sum / 10 >> 0
}
if (carry > 0) answer.push (carry)
return answer
}
}
We can verify that the Wolfram Alpha answer above is correct
const { fromInt, toString, add } =
Bignum
const bigfib = (n = 0) =>
{
let a = fromInt (0)
let b = fromInt (1)
let _
while (n > 0) {
_ = a
a = b
b = add (b, _)
n = n - 1
}
return toString (a)
}
bigfib (10000)
// "336447 ... 366875"
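Since the question is about C++, the same digit-by-digit idea could be sketched there as well. This is only an illustration of the technique, with one decimal digit per vector element stored least significant first; the names Digits, addDigits and bigFib are chosen for the example and mirror the JavaScript above.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using Digits = std::vector<int>;  // one decimal digit each, least significant first

// Schoolbook addition: add column by column and carry into the next one.
Digits addDigits(const Digits& x, const Digits& y) {
    const std::size_t len = std::max(x.size(), y.size());
    Digits sum;
    sum.reserve(len + 1);
    int carry = 0;
    for (std::size_t i = 0; i < len; ++i) {
        int column = carry;
        if (i < x.size()) column += x[i];
        if (i < y.size()) column += y[i];
        sum.push_back(column % 10);
        carry = column / 10;
    }
    if (carry > 0) sum.push_back(carry);
    return sum;
}

// Iterative Fibonacci on digit vectors; returns the number as a string.
std::string bigFib(int n) {
    Digits a{0}, b{1};
    for (; n > 0; --n) {
        Digits next = addDigits(a, b);
        a = std::move(b);
        b = std::move(next);
    }
    std::string text;
    for (auto it = a.rbegin(); it != a.rend(); ++it) {
        text.push_back(static_cast<char>('0' + *it));
    }
    return text;
}
```

Calling bigFib(100) gives "354224848179261915075", and bigFib(10000) produces a 2090-digit string.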
Try not to use recursion for a simple problem like fibonacci. And if you'll only use it once, don't use an array to store all results. An array of 2 elements containing the 2 previous fibonacci numbers will be enough. In each step, you then only have to sum up those 2 numbers. How can you save 2 consecutive fibonacci numbers? Well, you know that when you have 2 consecutive integers one is even and one is odd. So you can use that property to know where to get/place a fibonacci number: for fib(i), if i is even (i%2 is 0) place it in the first element of the array (index 0), else (i%2 is then 1) place it in the second element(index 1). Why can you just place it there? Well when you're calculating fib(i), the value that is on the place fib(i) should go is fib(i-2) (because (i-2)%2 is the same as i%2). But you won't need fib(i-2) any more: fib(i+1) only needs fib(i-1)(that's still in the array) and fib(i)(that just got inserted in the array).
So you could replace the recursion calls with a for loop like this:
int fibonacci(int n){
if( n <= 0){
return 0;
}
int previous[] = {0, 1}; // start with fib(0) and fib(1)
for(int i = 2; i <= n; ++i){
// modulo can be implemented with bit operations(much faster): i % 2 = i & 1
previous[i&1] += previous[(i-1)&1]; //shorter way to say: previous[i&1] = previous[i&1] + previous[(i-1)&1]
}
//Result is in previous[n&1]
return previous[n&1];
}
Recursion is often discouraged in performance-sensitive code because of the time (function calls) and resources (stack) it consumes. So when you use recursion, consider replacing it with a loop, plus a stack with simple push/pop operations if you need to save the "current position" (in C++ one can use a vector for that). In the case of Fibonacci the stack isn't even needed, but if you are iterating over a tree data structure, for example, you'll need one (depending on the implementation). While I was working on my solution, I saw that @naomik provided a solution with a while loop. That one is fine too, but I prefer the array with the modulo operation (a bit shorter).
Now concerning the problem of the size long long int has, it can be solved by using external libraries that implement operations for big numbers (like the GMP library or Boost.multiprecision). But you could also create your own version of a BigInteger-like class from Java and implement the basic operations like the one I have. I've only implemented the addition in my example (try to implement the others they are quite similar).
The main idea is simple, a BigInt represents a big decimal number by cutting its little endian representation into pieces (I'll explain why little endian at the end). The length of those pieces depends on the base you choose. If you want to work with decimal representations, it will only work if your base is a power of 10: if you choose 10 as base each piece will represent one digit, if you choose 100 (= 10^2) as base each piece will represent two consecutive digits starting from the end(see little endian), if you choose 1000 as base (10^3) each piece will represent three consecutive digits, ... and so on. Let's say that you have base 100, 12765 will then be [65, 27, 1], 1789 will be [89, 17], 505 will be [5, 5] (= [05,5]), ... with base 1000: 12765 would be [765, 12], 1789 would be [789, 1], 505 would be [505]. It's not the most efficient, but it is the most intuitive (I think ...)
The addition is then a bit like the addition on paper we learned at school:
1. begin with the lowest piece of the BigInt
2. add it to the corresponding piece of the other one
3. the lowest piece of that sum (= the sum modulo the base) becomes the corresponding piece of the final result
4. the "bigger" pieces of that sum will be added ("carried") to the sum of the following pieces
5. go to step 2 with the next piece
6. if no piece is left, add the carry and the remaining bigger pieces of the other BigInt (if it has pieces left)
For example:
9542 + 1097855 = [42, 95] + [55, 78, 09, 1]
lowest piece = 42 and 55 --> 42 + 55 = 97 = [97]
---> lowest piece of result = 97 (no carry, carry = 0)
2nd piece = 95 and 78 --> (95+78) + 0 = 173 = [73, 1]
---> 2nd piece of final result = 73
---> remaining: [1] = 1 = carry (will be added to sum of following pieces)
no piece left in first `BigInt`!
--> add carry ( [1] ) and remaining pieces from second `BigInt`( [9, 1] ) to final result
--> first additional piece: 9 + 1 = 10 = [10] (no carry)
--> second additional piece: 1 + 0 = 1 = [1] (no carry)
==> 9542 + 1 097 855 = [42, 95] + [55, 78, 09, 1] = [97, 73, 10, 1] = 1 107 397
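The steps and the worked example above can be sketched in C++ like this. It is not the class used in the demo, just a minimal illustration with base 100 and free functions; the names kBase, Pieces, toPieces and addPieces are assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint32_t kBase = 100;        // each piece holds two decimal digits
using Pieces = std::vector<std::uint32_t>;  // little endian: lowest piece first

// Splits an ordinary integer into base-100 pieces, lowest piece first.
Pieces toPieces(std::uint64_t value) {
    Pieces out;
    do {
        out.push_back(static_cast<std::uint32_t>(value % kBase));
        value /= kBase;
    } while (value > 0);
    return out;
}

// Piece-wise addition with carry, following the steps listed above.
Pieces addPieces(const Pieces& x, const Pieces& y) {
    Pieces result;
    std::uint32_t carry = 0;
    for (std::size_t i = 0; i < x.size() || i < y.size() || carry > 0; ++i) {
        std::uint32_t sum = carry;
        if (i < x.size()) sum += x[i];
        if (i < y.size()) sum += y[i];
        result.push_back(sum % kBase);  // lowest piece of the sum
        carry = sum / kBase;            // the rest is carried onward
    }
    return result;
}
```

With these helpers, addPieces(toPieces(9542), toPieces(1097855)) yields the pieces 97, 73, 10, 1, matching the example.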
Here is a demo where I used the class above to calculate the fibonacci of 10000 (result is too big to copy here)
Good luck!
PS: Why little endian? For the ease of the implementation: it allows to use push_back when adding digits and iteration while implementing the operations will start from the first piece instead of the last piece in the array.