I am spending my evening doing some programming problems from Kattis. There is one part of the problem 4 thought that I am stuck on.
Given a number, the program is supposed to return the operations (+, -, * or /) required between 4 fours to achieve that number.
For example, the input
9
would result in the output
4 + 4 + 4 / 4 = 9
My solution (not efficient, but simple) is to evaluate all possible ways to combine the operators above and see if any of the combinations achieve the wanted result.
To do this I have written the function seen below. It takes in an array of chars which are the operators to be evaluated (uo[3], could look like {+, /, *}), and the wanted result as an integer (expRes).
bool check(char uo[3], int expRes) {
int res = 4;
for(int oPos = 2; oPos >= 0; oPos--) {
switch (uo[oPos]) {
case '+' : res += 4; break;
case '-' : res -= 4; break;
case '*' : res *= 4; break;
case '/' : res /= 4; break;
}
}
return res == expRes;
}
I realized that this "sequential" approach comes with a problem: it doesn't follow the order of operations. If I was to call the function with
uo = {+, -, /}
and
expRes = 7 it would return false since 4 + 4 = 8, 8 - 4 = 4, 4 / 4 = 1.
The real answer is obviously true, since 4 + 4 - 4 / 4 = 7.
Can any of you think of a way to rewrite the function so that the evaluation follows the order of operations?
Thanks in advance!
It's an easy problem if you look at it.
You are restricted to four 4's with three operators in between, so you already know your search space. One solution is to generate the complete search space: each of the three slots can hold any of the 4 operators, giving 4^3 = 64 equations in total. Keep the results as <key, value> pairs (result, equation) so that the lookup for each test case is O(1).
Stepwise, you would:
1. Generate the complete set of equations and store them as key, value pairs
2. Take the input from the test cases
3. Check if the key exists; if yes, print the equation, else print that no equation exists
At worst this takes about 64*1000 operations, which is easily computed within a second and avoids the Time Limit Exceeded error that these competitions usually have.
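In code form (a skeleton; generateAll is left unimplemented):
// C++ syntax
#include <cstdio>
#include <map>
#include <string>
using namespace std;
map<int, string> mp;
void generateAll() {
// generate all 64 equations and store them in mp, keyed by their result
}
int main() {
generateAll();
int n, t; scanf("%d", &t);
while (t--) {
scanf("%d", &n);
if ( mp.find(n) != mp.end() ) {
// an equation exists for the input: print mp[n]
} else {
// no equation exists for the input
}
}
return 0;
}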
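To connect this back to the original question about order of operations, here is a minimal sketch of one way to fill the table. The names evaluate and buildTable are made up for this illustration, and it assumes integer division as in the question's code. The idea is to treat + and - as starting a new signed term, while * and / modify the most recent term, so multiplication and division bind tighter.

```cpp
#include <array>
#include <map>
#include <string>

// Evaluate "4 a 4 b 4 c 4" with the usual precedence (* and / before + and -).
// Each + or - starts a new signed term; * and / act on the latest term only.
int evaluate(const std::array<char, 3>& ops) {
    int terms[4];
    int count = 0;
    terms[count++] = 4;
    for (char op : ops) {
        switch (op) {
            case '*': terms[count - 1] *= 4; break;
            case '/': terms[count - 1] /= 4; break;  // integer division
            case '+': terms[count++] = 4;  break;
            case '-': terms[count++] = -4; break;
        }
    }
    int sum = 0;
    for (int i = 0; i < count; ++i) sum += terms[i];
    return sum;
}

// Build the result -> equation table for all 4^3 = 64 operator choices.
// emplace keeps the first equation found for each result.
std::map<int, std::string> buildTable() {
    const std::string ops = "+-*/";
    std::map<int, std::string> table;
    for (char a : ops)
        for (char b : ops)
            for (char c : ops) {
                const int value = evaluate({{a, b, c}});
                std::string text = std::string("4 ") + a + " 4 " + b + " 4 " + c +
                                   " 4 = " + std::to_string(value);
                table.emplace(value, text);
            }
    return table;
}
```

With this, evaluate for the operators {+, -, /} gives 7 rather than 1, and a missing key in the table means no equation exists for that input.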
So I have an int that counts upwards.
Let's say we're up to 65,000 already.
I need to get the number in the thousandth place (the 5), to be assigned to another int.
I found the following snippet which is pretty easy but not quite what I need.
You can use % operator for any number of integers you want to
separate. For example 888881%10 will give you 1 and 888881%100 will
give you 81...
Thanks!
You probably need this.
int AtPos(int number, int pos)
{
return ((number > 0 ? number : -number) / (int)pow(10, pos)) % 10;
}
If this is what you are looking for, argument validity checks should be added.
EDIT.
I just noticed that you need to assign a different digit at the specified position. So you need this improvement:
int& SetValueAtPos(int& number, int pos, int newValue)
{
int power = (int)pow(10, pos);
number -= AtPos(number, pos) * power;
number += power * newValue;
return number;
}
Additionally you can merge those functions and cache value of pow(10, pos) so it would be more optimal.
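A minimal sketch of that merged version follows. The helper names powerOfTen, digitAt and withDigitAt are invented for this example. It also swaps pow() for an integer loop, since pow() works in floating point and the cast back to int can round down on some platforms. It assumes pos is small enough that 10^pos fits in an int and that number is not INT_MIN.

```cpp
#include <cstdlib>

// Integer power of ten by repeated multiplication (no floating point involved).
// Assumption: pos is small enough that the result fits in an int.
int powerOfTen(int pos) {
    int power = 1;
    for (int i = 0; i < pos; ++i) power *= 10;
    return power;
}

// Digit of |number| at position pos (0 = ones, 3 = thousands).
int digitAt(int number, int pos) {
    return (std::abs(number) / powerOfTen(pos)) % 10;
}

// Returns number with the digit at pos replaced by newDigit (0..9).
// The power of ten is computed once and reused; the sign is preserved.
int withDigitAt(int number, int pos, int newDigit) {
    const int power = powerOfTen(pos);
    const int sign = number < 0 ? -1 : 1;
    const int magnitude = std::abs(number);
    const int oldDigit = (magnitude / power) % 10;
    return sign * (magnitude + (newDigit - oldDigit) * power);
}
```

For the question's example, digitAt(65000, 3) yields the 5 in the thousands place.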
Try something like this
int a =65432;
int b = ((a%10000)-(a%1000))/1000;
here (a%10000) = 5432
and (a%1000) = 432
so (a%10000)-(a%1000) will be 5000
and finally 5000/1000 = 5
or directly you can use
int b = (a%10000)/1000;
What @Matt suggested in a comment is:
int num1 = 65000;
int num2 = num1%10000; //num2 is 5000
num2 = num2/1000;
Output: 5
I'm trying to write code that would convert numbers into letters of the alphabet. For example, 1 will be 'A', 2 will be 'B', 3 will be 'C' and so on. I'm thinking of writing 26 if statements. I'm wondering if there's a better way to do this...
Thank you!
Use an array of letters.
char nth_letter(int n)
{
assert(n >= 1 && n <= 26);
return "ABCDEFGHIJKLMNOPQRSTUVWXYZ"[n-1]; // uppercase, as the question asks for 'A', 'B', ...
}
If you can rely on an ASCII character set, where the letters are consecutive, then you can convert directly:
char intToAlphabet( int i )
{
return static_cast<char>('A' - 1 + i);
}
If you can sometimes rely on this fact, e.g. you can set a compiler flag or similar for the particular target, you can also use this code for that specific build.
Otherwise use a static lookup table (as others have suggested).
Note that it is preferable to "assert" your range check if your numbered input comes from program variables that you know should never be out of range.
If the input comes from user-provided data where the users could potentially provide rogue data, you need a way to handle it that is not "undefined behaviour". Therefore you would have to check each value and either throw an exception (informing the user to correct their data) or use some character to indicate a bad input.
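As a rough sketch of the exception-throwing variant for user-provided input, something like the following could work. The function name letterFromUserInput is made up here, and the choice of std::out_of_range is just one reasonable option. It uses a lookup table so it does not depend on the character set having consecutive letters.

```cpp
#include <stdexcept>
#include <string>

// Converts 1..26 to 'A'..'Z'. Out-of-range input throws instead of
// invoking undefined behaviour, so the caller can report it to the user.
char letterFromUserInput(int n) {
    static const char letters[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    if (n < 1 || n > 26) {
        throw std::out_of_range("expected a number from 1 to 26, got " +
                                std::to_string(n));
    }
    return letters[n - 1];
}
```

The caller can catch std::out_of_range and ask the user to correct the data.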
Something like this:
void my_print(int x)
{
char symbol = (char)('A' + x - 1);
cout << symbol;
}
The simplest way would be using a table:
char
remap( int original )
{
static char const remap[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
return original >= 0 && original < sizeof(remap) - 1
? remap[ original ]
: '?'; // or throw, or ...
}
If you want to convert 0 to a, 1 to b, 2 to c ... 25 to z:
char convertedChar = (char)('a' + 2);
// Will print 'c'
cout << convertedChar;
Similarly, to convert 0 to A, 1 to B, 2 to C ... 25 to Z:
char convertedChar = (char)('A' + 25);
// Will print 'Z'
cout << convertedChar;
So if you want to convert 1 to A, 2 to B, 3 to C ... 26 to Z, simply subtract an offset of 1.
int toConvert = 26; // 'Z'
char convertedChar = (char)('A' + toConvert - 1);
// Will print 'Z'
cout << convertedChar;
Given a vector X of size L, where every scalar element of X is from the binary set {0,1}, the task is to find the dot product z=dot(X,Y), where vector Y of size L consists of integer-valued elements. I suspect there must be a very fast way to do it.
Let's say we have L=4; X[L]={1, 0, 0, 1}; Y[L]={-4, 2, 1, 0} and we have to find z=X[0]*Y[0] + X[1]*Y[1] + X[2]*Y[2] + X[3]*Y[3] (which in this case will give us -4).
It is obvious that X can be represented using binary digits, e.g. an integer of type int32 for L=32. Then all we have to do is find the dot product of this integer with an array of 32 integers. Do you have any ideas or suggestions on how to do it very fast?
This really would require profiling but an alternative you might want to consider:
int result=0;
int mask=1;
for ( int i = 0; i < L; i++ ){
if ( X & mask ){
result+=Y[i];
}
mask <<= 1;
}
Typically bit shifting and bitwise operations are faster than multiplication, however, the if statement might be slower than a multiplication, although with branch prediction and large L my guess is it might be faster. You would really have to profile it, though, to determine if it resulted in any speedup.
As has been pointed out in the comments below, unrolling the loop either manually or via a compiler flag (such as "-funroll-loops" on GCC) could also speed this up (eliding the loop condition).
Edit
In the comments below, the following good tweak has been proposed:
int result=0;
for ( int i = 0; i < L; i++ ){
if ( X & 1 ){
result+=Y[i];
}
X >>= 1;
}
Is a suggestion to look into SSE2 helpful? It has dot-product type operations already, plus you can trivially do 4 (or perhaps 8, I forget the register size) simple iterations of your naive loop in parallel.
SSE also has some simple logic-type operations so it may be able to do additions rather than multiplications without using any conditional operations... again you'd have to look at what ops are available.
Try this:
int result=0;
for ( int i = 0; i < L; i++ ){
result+=Y[i] & (~(((X>>i)&1)-1));
}
This avoids a conditional statement and uses bitwise operators to mask the scalar value with either zeros or ones.
Since size explicitly doesn’t matter, I think the following is probably the most efficient general-purpose code:
int result = 0;
for (size_t i = 0; i < 32; ++i)
result += Y[i] & -X[i];
Bit-encoding X just doesn’t bring anything to the table (even if the loop may potentially terminate earlier, as @Mathieu correctly noted). But omitting the if inside the loop does.
Of course, loop unrolling can speed this up drastically, as others have noted.
This solution is identical to Micheal Aaron's, but slightly faster (by my test):
long Lev=1;
long Result=0;
for (int i=0;i<L;i++) {
if (X & Lev)
Result+=Y[i];
Lev*=2;
}
I thought there was a numerical way to rapidly find the next set bit in a word, which should improve performance if your X data is very sparse, but I currently cannot find said formulation.
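One commonly used formulation, offered here as a sketch rather than as what this answer had in mind: X & (X - 1) clears the lowest set bit, and a count-trailing-zeros operation gives that bit's index. The function name sparseDot is invented for the example, and __builtin_ctz is a GCC/Clang builtin, so this assumes one of those compilers (C++20 has std::countr_zero in <bit> as a portable alternative).

```cpp
#include <cstdint>

// Sums Y[i] over the set bits of X, visiting only the set bits.
// The loop runs once per set bit, so it is cheap when X is sparse.
int sparseDot(std::uint32_t X, const int Y[]) {
    int result = 0;
    while (X != 0) {
        result += Y[__builtin_ctz(X)];  // index of the lowest set bit (GCC/Clang builtin)
        X &= X - 1;                     // clear the lowest set bit
    }
    return result;
}
```

Whether this beats the branch-free loops in the other answers depends on how sparse X is, so it would need profiling like everything else here.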
I've seen a number of responses with bit trickery (to avoid branching) but none got the loop right imho :/
Optimizing @Goz's answer:
int result=0;
for (unsigned i = 0, x = X; x != 0; ++i, x >>= 1 ) // unsigned, so a set top bit does not end the loop early
{
result += Y[i] & -(int)(x & 1);
}
Advantages:
no need to do i bit-shifting operations each time (X>>i)
the loop stops sooner if X contains 0 in higher bits
Now, I do wonder if it runs faster, especially since the premature stop of the for loop might not be as easy for loop unrolling (compared to a compile-time constant).
How about combining a shifting loop with a small lookup table?
int result=0;
for ( unsigned x=X; x!=0; x>>=4 ){ // unsigned, so the shift always reaches zero
switch (x&15) {
case 0: break;
case 1: result+=Y[0]; break;
case 2: result+=Y[1]; break;
case 3: result+=Y[0]+Y[1]; break;
case 4: result+=Y[2]; break;
case 5: result+=Y[0]+Y[2]; break;
case 6: result+=Y[1]+Y[2]; break;
case 7: result+=Y[0]+Y[1]+Y[2]; break;
case 8: result+=Y[3]; break;
case 9: result+=Y[0]+Y[3]; break;
case 10: result+=Y[1]+Y[3]; break;
case 11: result+=Y[0]+Y[1]+Y[3]; break;
case 12: result+=Y[2]+Y[3]; break;
case 13: result+=Y[0]+Y[2]+Y[3]; break;
case 14: result+=Y[1]+Y[2]+Y[3]; break;
case 15: result+=Y[0]+Y[1]+Y[2]+Y[3]; break;
}
Y+=4;
}
The performance of this will depend on how good the compiler is at optimising the switch statement, but in my experience they are pretty good at that nowadays....
There is probably no general answer to this question. You need to profile your code under all the different targets. Performance will depend on compiler optimizations such as loop unwinding and SIMD instructions that are available on most modern CPUs (x86, PPC, ARM all have their own implementations).
For small L, you can use a switch statement instead of a loop. For example, if L = 8, you could have:
int dot8(unsigned int X, const int Y[])
{
switch (X)
{
case 0: return 0;
case 1: return Y[0];
case 2: return Y[1];
case 3: return Y[0]+Y[1];
// ...
case 255: return Y[0]+Y[1]+Y[2]+Y[3]+Y[4]+Y[5]+Y[6]+Y[7];
}
assert(0 && "X too big");
}
And if L = 32, you can write a dot32() function which calls dot8() four times, inlined if possible. (If your compiler refuses to inline dot8(), you could rewrite dot8() as a macro to force inlining.) Added:
int dot32(unsigned int X, const int Y[])
{
return dot8(X >> 0 & 255, Y + 0) +
dot8(X >> 8 & 255, Y + 8) +
dot8(X >> 16 & 255, Y + 16) +
dot8(X >> 24 & 255, Y + 24);
}
This solution, as mikera points out, may have an instruction cache cost; if so, using a dot4() function might help.
Further update: This can be combined with mikera's solution:
static int dot4(unsigned int X, const int Y[])
{
switch (X)
{
case 0: return 0;
case 1: return Y[0];
case 2: return Y[1];
case 3: return Y[0]+Y[1];
//...
case 15: return Y[0]+Y[1]+Y[2]+Y[3];
}
}
Looking at the resulting assembler code with the -S -O3 options with gcc 4.3.4 on CYGWIN, I'm slightly surprised to see that this is automatically inlined within dot32(), with eight 16-entry jump-tables.
But adding __attribute__((__noinline__)) seems to produce nicer-looking assembler.
Another variation is to use fall-throughs in the switch statement, but gcc adds jmp instructions, and it doesn't look any faster.
Edit--Completely new answer: After thinking about the 100 cycle penalty mentioned by Ants Aasma, and the other answers, the above is likely not optimal. Instead, you could manually unroll the loop as in:
int dot(unsigned int X, const int Y[])
{
return (Y[0] & -!!(X & 1<<0)) +
(Y[1] & -!!(X & 1<<1)) +
(Y[2] & -!!(X & 1<<2)) +
(Y[3] & -!!(X & 1<<3)) +
//...
(Y[31] & -!!(X & 1u<<31));
}
This, on my machine, generates 32 x 5 = 160 fast instructions. A smart compiler could conceivably unroll the other suggested answers to give the same result.
But I'm still double-checking.
result = 0;
for(int i = 0; i < L ; i++)
if(X[i]!=0)
result += Y[i];
It's quite likely that the time spent to load X and Y from main memory will dominate. If this is the case for your CPU architecture, the algorithm is faster when loading less. This means that storing X as a bitmask and expanding it into L1 cache will speed up the algorithm as a whole.
Another relevant question is whether your compiler will generate optimal loads for Y. This is highly CPU and compiler dependent. But in general, it helps if the compiler can see precisely which values are needed when. You could manually unroll the loop. However, if L is a constant, leave it to the compiler:
template<int I> inline void calcZ(int (&X)[L], int(&Y)[L], int &Z) {
Z += X[I] * Y[I]; // Essentially free, as it operates in parallel with loads.
calcZ<I-1>(X,Y,Z);
}
template< > inline void calcZ<0>(int (&X)[L], int(&Y)[L], int &Z) {
Z += X[0] * Y[0];
}
inline int calcZ(int (&X)[L], int(&Y)[L]) {
int Z = 0;
calcZ<L-1>(X,Y,Z);
return Z;
}
(Konrad Rudolph questioned this in a comment, wondering about memory use. That's not the real bottleneck in modern computer architectures, bandwidth between memory and CPU is. This answer is almost irrelevant if Y is somehow already in cache. )
You can store your bit vector as a sequence of ints where each int packs a couple of coefficients as bits. Then, the component-wise multiplication is equivalent to bit-and. With this you simply need to count the number of set bits which could be done like this:
inline int count(uint32_t x) {
// see link
}
int dot(uint32_t a, uint32_t b) {
return count(a & b);
}
For a bit hack to count the set bits see http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel
Edit: Sorry I just realized only one of the vectors contains elements of {0,1} and the other one doesn't. This answer only applies to the case where both vectors are limited to coefficients from the set of {0,1}.
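For the restricted case this answer does cover, where both vectors hold only 0s and 1s, a portable stand-in for the count function could look like the sketch below. It uses Kernighan's bit-clearing loop rather than the parallel method from the linked page, and the names countSetBits and binaryDot are made up for the example.

```cpp
#include <cstdint>

// Counts set bits by repeatedly clearing the lowest one (Kernighan's method).
int countSetBits(std::uint32_t x) {
    int n = 0;
    for (; x != 0; x &= x - 1) ++n;
    return n;
}

// Dot product of two {0,1}-valued vectors packed into 32-bit words:
// the AND keeps positions where both are 1, the count sums them.
int binaryDot(std::uint32_t a, std::uint32_t b) {
    return countSetBits(a & b);
}
```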
Represent X using a linked list of the positions where x[i] = 1.
Finding the required sum then takes O(N) operations, where N is the size of your list.
Well, you want all bits to get through if it's a 1 and none if it's a 0. So you want to somehow turn 1 into -1 (i.e. 0xffffffff) while 0 stays the same. That's just -X, so you do...
Y & (-X)
for each element ... job done?
Edit2: To give a code example you can do something like this and avoid the branch:
int result=0;
for ( int i = 0; i < L; i++ )
{
result+=Y[i] & -(int)((X >> i) & 1);
}
Of course you'd be best off keeping the 1s and 0s in an array of ints and therefore avoiding the shifts.
Edit: It's also worth noting that if the values in Y are 16 bits in size then you can do 2 of these AND operations per instruction (4 if you have 64-bit registers). It does mean negating the X values one by one into a larger integer, though.
i.e. YVals = -4, 3 in 16-bit = 0xFFFC, 0x3 ... put them into one 32-bit value and you get 0xFFFC0003. If you have 1, 0 as the X vals then you form a bit mask of 0xFFFF0000, AND the two together, and you've got 2 results in 1 bitwise-and op.
Another edit:
If you want code for the 2nd method, something like this should work (though it relies on type punning through a union, which is technically undefined behaviour in C++, so it may not work on every compiler; it works on every compiler I've come across, though). It also assumes every value in Y fits in 16 bits.
union int1632
{
int32_t i32;
int16_t i16[2];
};
int result=0;
int i = 0; // declared outside the loop so the tail case below can use it
for ( ; i < (L & ~0x1); i += 2 )
{
int1632 y1632;
y1632.i16[0] = Y[i + 0];
y1632.i16[1] = Y[i + 1];
int1632 x1632;
x1632.i16[0] = -(int16_t)((X >> (i + 0)) & 1);
x1632.i16[1] = -(int16_t)((X >> (i + 1)) & 1);
int1632 res1632;
res1632.i32 = y1632.i32 & x1632.i32;
result += res1632.i16[0] + res1632.i16[1];
}
if ( i < L ) // odd L: handle the last element on its own
result+=Y[i] & -(int)((X >> i) & 1);
Hopefully the compiler will optimise out the assignments (off the top of my head I'm not sure, but the idea could be reworked so that they definitely are) and give you a small speed-up, in that you now only need to do 1 bitwise-and instead of 2. The speed-up would be minor, though...
How can I do a bitwise OR on strings?
A:
10001
01010
------
11011
Why on strings?
The bit strings can have a length of 40-50. Maybe this could be problematic with an int?
Any ideas?
I would say std::bitset is more than enough for your situation, but for more flexibility you can use boost::dynamic_bitset. Here is an example on std::bitset:
const size_t N = 64;
string a_str = "10001", b_str = "01010";
bitset<N> a(a_str), b(b_str);
bitset<N> c = a | b;
cout << c;
You should take a look at the C++ std::bitset class, which does exactly what you want.
For each char:
char result = ((a - '0') | (b - '0')) + '0'; // parenthesized: + binds tighter than |
Where a and b are two chars with ascii character 0 or 1 in them.
Why not just use a vector of int values? Doesn't the bitset still use a byte per bit?
You can also use a vector of bool values, but this is also implementation specific.
Depending on whether you need storage efficiency or speed (or the utility of container methods that a couple of these approaches lack) you might profile to decide which approach to use.
This is similar to Andreas Brinck's answer, only it returns a full output string and can compare strings of different (arbitrary) lengths.
Example in C# (not near c++ compiler right now), but it should be simple to convert it to a language of your choice.
public static string BitwiseOr(string input1, string input2)
{
char[] inarr1 = (char[])input1.ToCharArray().Reverse().ToArray();
char[] inarr2 = (char[])input2.ToCharArray().Reverse().ToArray();
char[] outarr = new char[input1.Length > input2.Length ? input1.Length : input2.Length];
for (int i = 0; i < outarr.Length ; i++)
{
char c1 = i < input1.Length ? inarr1[i] : '0';
char c2 = i < input2.Length ? inarr2[i] : '0';
outarr[i] = (char)(((c1 - '0') | (c2 - '0')) + '0');
}
return new string((char[])outarr.Reverse().ToArray());
}
Of course this is only valid if you really need it to be in a string, if not you should (as suggested in other answers) use a vector or similar data type.
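Since the question is about C++, here is a sketch of how the same idea might look there. The function name bitwiseOr is just a label for this example. It aligns the two strings on their right-hand ends and pads the shorter one with '0', and it assumes the inputs contain only the characters '0' and '1'.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Bitwise OR of two strings of '0'/'1' characters of arbitrary length.
// Assumption: both inputs contain only '0' and '1'.
std::string bitwiseOr(const std::string& a, const std::string& b) {
    const std::size_t n = std::max(a.size(), b.size());
    std::string out(n, '0');
    for (std::size_t i = 0; i < n; ++i) {
        // i counts positions from the right-hand (least significant) end.
        const char ca = i < a.size() ? a[a.size() - 1 - i] : '0';
        const char cb = i < b.size() ? b[b.size() - 1 - i] : '0';
        out[n - 1 - i] = static_cast<char>(((ca - '0') | (cb - '0')) + '0');
    }
    return out;
}
```

Called with the strings from the question, "10001" and "01010", it produces "11011".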