I have two character arrays (each of many bytes in length; for example each can be 10-12 bytes) and they represent a number in binary format. I want to check if one number is larger than the other. What is the most efficient way to check which of the two is the largest? Are there bitwise operations that one can perform to efficiently determine this?
A simple solution is to compare them from most to least significant byte:
// assuming unsigned values stored MSB-first (most significant byte at index 0)
// and both arrays of the same length; unsigned char keeps the byte comparison
// independent of the signedness of plain char
const unsigned char* larger(const unsigned char* a, const unsigned char* b, int len) {
    for (int i = 0; i < len; ++i) {
        if (a[i] > b[i]) return a;
        if (b[i] > a[i]) return b;
    }
    return a; // they are equal, so returning either one is fine
}
This requires them both to have the same size. You can work around this limitation by padding the arrays, or by adding extra checks. I don't know which one will run faster.
You can improve this by treating the arrays of char as arrays of unsigned (or better, a type the size of a machine word) if their size is a multiple of sizeof(unsigned), as that does the comparison word by word. Watch out for alignment and endianness, though: on a little-endian machine the most significant byte sits at the wrong end of each word, so you would have to byte-swap each word before comparing.
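If both numbers are unsigned, stored MSB-first, and padded to the same length, std::memcmp already performs exactly this lexicographic byte comparison, and a good C library compares a machine word at a time internally. A minimal sketch under those assumptions (the function name is just for illustration):
#include <cstddef>
#include <cstring>

// returns 1 if a > b, -1 if a < b, 0 if equal; assumes unsigned values,
// MSB-first storage, and equal (padded) lengths
int compare_bignum(const unsigned char* a, const unsigned char* b, std::size_t len)
{
    int c = std::memcmp(a, b, len);
    return (c > 0) - (c < 0);
}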
I'm assuming this is a bignum data type. You have 10-12 chars to store 80-96 bit integer values. To make it simple, I'm going to assume unsigned values.
Iterate through both arrays simultaneously comparing elements from each array. Start at the most significant element. As soon as you find one element bigger than the other, you have your answer. For extra speed, do machine word size compares by loop unrolling.
But since you are getting these values over the wire, it seems odd that the bignum class is a bottleneck. Surely the network will be your bottleneck. What's more, a good bignum class will be well optimised. Why would your own code beat it?
I think what you can do is firstly use two pointers to point at the first nonzero byte in both (from the left). If now the two effective lengths are different, output the longer one. I mean, let's assume that first is 10 bytes and second is 12 bytes. Byte 4 (first[3]) is the first nonzero byte in first, and byte 2 (second[1]) is the first nonzero byte in second. Now the effective length of first is 7 bytes, while that of second is 11. Clearly, second is larger.
Now for equal effective lengths. Compare the bytes. If equal, go to the next. Otherwise, the larger byte exists in the larger number and we finish.
You can speed this up by comparing register-sized chunks (chunks that fill a whole register): if two chunks are equal you skip as many byte comparisons as the register is wide. If they are unequal, you can fall back to byte-by-byte, or first compare half-sized chunks: if those are equal, move to the other half; if different, compare quarter-sized chunks, and so on down to single bytes (analogous to a binary search).
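A byte-wise sketch of the skip-the-leading-zeros idea, assuming unsigned values stored MSB-first and possibly different array lengths (the chunked refinements above would replace the final loop; the function name is illustrative):
#include <cstddef>

// returns 1 if a > b, -1 if a < b, 0 if equal
int compare_varlen(const unsigned char* a, std::size_t alen,
                   const unsigned char* b, std::size_t blen)
{
    while (alen > 0 && *a == 0) { ++a; --alen; }   // skip leading zero bytes
    while (blen > 0 && *b == 0) { ++b; --blen; }

    if (alen != blen)                              // longer effective length wins
        return alen > blen ? 1 : -1;

    for (std::size_t i = 0; i < alen; ++i)         // equal lengths: compare bytes
        if (a[i] != b[i])
            return a[i] > b[i] ? 1 : -1;
    return 0;
}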
I posted this as a comment but thought I might add it as an answer. Starting from the most significant byte, loop and compare until the two values are different. The number with the largest value at any iteration is therefore the largest number.
If the values are signed, depending on the encoding (such as 2's complement), the first iteration might have to be a special case.
EDIT: You just commented that the numbers are unsigned, therefore it should be fairly simple and you only need to worry about the first part.
I would like to interpret
std::vector<unsigned int> numbers
as a bitvector, i.e. the MSB of numbers[0] is the 1st bit, the MSB of numbers[1] is the 33rd bit and so on. I want to find all sequences of Ones in this vector and store the corresponding positions in a data structure. (Also a single One is defined as sequence here)
For example: I have the values 15 and 112 stored in numbers. Thus bit 29 to 32 and bit 58 to 60 are equal to one.
The challenge is to optimize the runtime of this function.
Here is my idea of how to handle this: I thought of working with two for-loops. The first loop iterates through the elements of "numbers" (let's call it element_loop), while the second loop is used to figure out the positions of all Ones within a single element (let's call it bit_loop). I thought of detecting the "rising" and "falling edges" of a sequence for that purpose.
At the beginning of every bit_loop cycle, a mask is initialized to the hex value 0x80000000. With this mask I check whether the 1st bit is equal to one. If yes, the current position (0) is stored, and the mask in binary representation "1000..." is then used to detect a "falling edge" in the next cycle. If no, the mask is shifted by one bit to the right, "0100...", in order to detect a "rising edge" in the next cycle. (Only the leading bits of the mask matter here.)
Once an edge is detected, I store the current position and shift the mask appropriately by one bit. So after a rising edge (01) I switch to falling-edge detection (10), and the other way round. While iterating through the 32-bit unsigned number, I store all edge positions in some kind of vector. This could be a 2-dimensional array, with the first column being the start of a one-sequence and the second column its end. Furthermore I will need some special treatment for the turnover from one element to the next.
Here's my general question: What do you think of this approach? Is there a way to handle this more efficiently? Thanks a lot for your help in advance.
Ben
There are various bitwise tricks to do bit scans efficiently, but if you're using C++ you can take advantage of either std::bitset or boost::dynamic_bitset to iterate over bit positions. The algorithm you described numbers bits from the MSB, while most bit-scan primitives count from the LSB, so you'll need to convert positions (within a 32-bit word, a 0-based LSB index i corresponds to your MSB-first position 32 - i).
Depending on the architecture, each bit should take roughly a cycle.
There are efficient (constant time) methods for finding the first bit set in a word, using either special processor instructions or various clever tricks (see e.g. Position of least significant bit that is set).
With a bit of care you could work backwards and use those to scan for the first one, then do some masking and bit flipping and search for the next zero, and so on.
This might give you a faster algorithm, especially if the sequences are long on average so the gain on the fast scans outweighs the cost of the bit twiddling.
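For reference, here is a plain bit-by-bit sketch of the run finder (not the exact edge-detection algorithm above, and assuming 32-bit unsigned int); the bit-scan tricks mentioned above could replace the inner loop to skip over long uniform stretches:
#include <cstddef>
#include <utility>
#include <vector>

// Scan MSB-first and record the [start, end] positions (1-based, as in the
// question) of every maximal run of 1 bits.
std::vector<std::pair<std::size_t, std::size_t>>
find_one_runs(const std::vector<unsigned int>& numbers)
{
    std::vector<std::pair<std::size_t, std::size_t>> runs;
    bool in_run = false;
    std::size_t run_start = 0;

    for (std::size_t w = 0; w < numbers.size(); ++w) {
        for (int bit = 0; bit < 32; ++bit) {                   // bit 0 == MSB of the word
            bool one = (numbers[w] >> (31 - bit)) & 1u;
            std::size_t pos = 32 * w + bit + 1;                // 1-based global position
            if (one && !in_run) { in_run = true; run_start = pos; }
            else if (!one && in_run) { in_run = false; runs.emplace_back(run_start, pos - 1); }
        }
    }
    if (in_run)                                                // a run can end at the very last bit
        runs.emplace_back(run_start, 32 * numbers.size());
    return runs;
}
With the example from the question, find_one_runs({15, 112}) yields the runs (29, 32) and (58, 60).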
There are two integer arrays, each in a very large file (each file is larger than RAM). How would you find the common elements of the arrays in linear time?
I can't find a decent solution to this problem. Any ideas?
In one pass over one file, build a bitmap (or a Bloom filter, if the integer range is too large for an in-memory bitmap).
In one pass over the other file, find the common elements (or candidate elements, if using a Bloom filter).
If you use a Bloom filter, the result is probabilistic. Additional passes can reduce the false positives (Bloom filters have no false negatives).
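A minimal Bloom-filter sketch with an assumed class name, two cheap integer mixers as hash functions, and k = 2 probes; in practice you would size the bit array and pick the number of hash functions from the expected element count and the false-positive rate you can tolerate:
#include <cstddef>
#include <cstdint>
#include <vector>

class Bloom {
public:
    explicit Bloom(std::size_t bits) : bits_(bits, false) {}
    void add(std::uint64_t x) {
        bits_[h1(x) % bits_.size()] = true;
        bits_[h2(x) % bits_.size()] = true;
    }
    bool maybe_contains(std::uint64_t x) const {   // false means definitely absent
        return bits_[h1(x) % bits_.size()] && bits_[h2(x) % bits_.size()];
    }
private:
    // two inexpensive integer mixers standing in for real hash functions
    static std::uint64_t h1(std::uint64_t x) { x ^= x >> 33; x *= 0xff51afd7ed558ccdULL; return x ^ (x >> 33); }
    static std::uint64_t h2(std::uint64_t x) { x ^= x >> 30; x *= 0xbf58476d1ce4e5b9ULL; return x ^ (x >> 27); }
    std::vector<bool> bits_;
};
Add every integer from the first file, then test each integer from the second file; anything flagged is only a candidate and needs a second pass (or an exact structure) to confirm.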
Assuming integer size is 4 bytes.
Now we can have a maximum of 2^32 integers, i.e. I can have a bit vector of 2^32 bits (512 MB) to represent all integers, where each bit represents one integer.
1. Initialize this vector with all zeroes.
2. Go through one file and set the bit corresponding to each integer you read.
3. Go through the other file and, for each integer, check whether its bit is set in the bit vector.
Time complexity O(n+m)
space complexity 512 MB
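A sketch of this, assuming the files hold raw 32-bit unsigned integers, are opened as binary FILE* streams by the caller, and the program is a 64-bit build (std::vector<bool> packs the 2^32 flags into roughly 512 MB):
#include <cstdint>
#include <cstdio>
#include <vector>

void print_common(std::FILE* file_a, std::FILE* file_b)
{
    std::vector<bool> seen(1ULL << 32, false);         // one bit per possible 32-bit value
    std::uint32_t x;

    while (std::fread(&x, sizeof x, 1, file_a) == 1)   // pass 1: mark every value in file A
        seen[x] = true;

    while (std::fread(&x, sizeof x, 1, file_b) == 1) { // pass 2: scan file B for marked values
        if (seen[x]) {
            std::printf("%u\n", x);
            seen[x] = false;                           // report each common value only once
        }
    }
}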
You can obviously use a hash table to find the common elements in expected O(n) time.
First build a hash table from the first array, then look up each element of the second array in it.
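An in-memory sketch of that, using std::unordered_set (for the file-sized version you would stream the elements rather than hold them in vectors; the function name is illustrative):
#include <cstdint>
#include <unordered_set>
#include <vector>

std::vector<std::int32_t> common_elements(const std::vector<std::int32_t>& a,
                                          const std::vector<std::int32_t>& b)
{
    std::unordered_set<std::int32_t> in_a(a.begin(), a.end());  // build from the first array
    std::vector<std::int32_t> common;
    for (std::int32_t x : b)                                    // probe with the second array
        if (in_a.erase(x))              // erase so each common value is reported once
            common.push_back(x);
    return common;
}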
Let's say enough RAM is available to hold a hash table of 5% of either given file-array (FA).
So, I can split the file arrays (FA1 and FA2) into 20 chunks each - say do a MOD 20 of the contents. We get FA1(0)....FA1(19) and FA2(0)......FA2(19). This can be done in linear time.
Hash FA1(0) in memory and compare contents of FA2(0) with this hash. Hashing and checking for existence are constant time operations.
Destroy this hash and repeat for FA1(1)...FA1(19). This is also linear. So, the whole operation is linear.
Assuming the integers have the same size and are written to the files in binary mode, you first sort the two files (using, for example, an external quicksort that reads and writes at file offsets instead of memory addresses).
Then you just walk both files from the start and check for matches; when you have a match, write it to another file (assuming you can't store the result in memory either) and keep moving through the files until EOF.
Sort the files. With fixed-length integers it can be done in O(n) time:
Read a part of the file, sort it with radix sort, and write it to a temporary file. Repeat until all the data has been processed. This part is O(n).
Merge the sorted parts. This is O(n) too. You can even skip repeated numbers.
On the sorted files, find the common subset of integers: compare the current numbers, write one down if they are equal, then step ahead by one number in the file whose current number is smaller. This is O(n).
All operations are O(n), so the final algorithm is O(n) too.
EDIT: the bitmap method is much faster if you have enough memory for the bitmap. This method works for any fixed-size integers, 64-bit for example, but a bitmap over all 64-bit values (2^64 bits, on the order of exabytes) will not be practical for at least a few years :)
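A sketch of the final merge step over two sorted binary files of raw uint32_t values, skipping repeats so each common value is reported once:
#include <cstdint>
#include <cstdio>

void print_common_sorted(std::FILE* a, std::FILE* b)
{
    std::uint32_t x, y;
    bool have_x = std::fread(&x, sizeof x, 1, a) == 1;
    bool have_y = std::fread(&y, sizeof y, 1, b) == 1;

    while (have_x && have_y) {
        if (x < y)       have_x = std::fread(&x, sizeof x, 1, a) == 1;   // advance the smaller side
        else if (y < x)  have_y = std::fread(&y, sizeof y, 1, b) == 1;
        else {
            std::printf("%u\n", x);                                      // common element
            std::uint32_t v = x;
            while (have_x && x == v) have_x = std::fread(&x, sizeof x, 1, a) == 1;  // skip repeats
            while (have_y && y == v) have_y = std::fread(&y, sizeof y, 1, b) == 1;
        }
    }
}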
I'm writing software that heavily relies on (1) accessing single bits and (2) Hamming distance computations between two bitsets A and B (i.e. the number of bits that differ between A and B). The bitsets are quite big, between 10K and 1M bits, and I have a bunch of them. Since it is impossible to know the bitset sizes at compilation time, I'm using vector<bool>, but I plan to migrate to boost::dynamic_bitset soon.
Hereafter are my questions:
(1) Any ideas about which implementations have the fastest single bit access time?
(2) To compute the Hamming distance, the naive approach is to loop over the single bits and count the differences between the two bitsets. But my feeling is that it might be much faster to loop over bytes instead of bits, compute R = byteA XOR byteB, and look up in a 256-entry table which "local" distance is associated with R. Another solution would be to store a 256 x 256 matrix and look up the distance between byteA and byteB directly, without any operation. So my question: any idea how to implement that from std::vector<bool> or boost::dynamic_bitset? In other words, do you know if there is a way to get access to the underlying byte array, or do I have to recode everything from scratch?
(1) Probably vector<char> (or even vector<int>), but that wastes at least 7/8 space on typical hardware. You don't need to unpack the bits if you use a byte or more to store them. Which of vector<bool> or dynamic_bitset is faster, I don't know. That might depend on the C++ implementation.
(2) boost::dynamic_bitset has operator^ and a count member, which together can be used to compute the Hamming distance in a probably fast, though memory-wasting way. You can also get to the underlying buffer with to_block_range; to use that, you need to implement a Hamming distance calculator as an OutputIterator.
If you do code from scratch, you can probably do even better than a byte at a time: take a word at a time from each bitset. The cost of XOR should be very low, then use either an implementation-specific builtin popcount, or else the fastest bit-twiddling popcount you can find (which may or may not involve a 256-entry lookup).
[Edit: looks as if this could apply to boost::dynamic_bitset::to_block_range, with the Block chosen as either int or long. It's a shame that it writes to an OutputIterator rather than giving you an InputIterator -- I can't immediately see how to use it to iterate over two bitsets together, except by using an extra thread or else copying one of the bitsets out to an int array first. Either way you'll take some copy overhead that could have been avoided if it had left the program control to you. The thread is pretty complicated for this task, and of course has its own overheads, and copying out the data probably isn't any better than using operator^ and count().]
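A sketch of the word-at-a-time XOR + popcount idea, assuming the bits have already been copied into plain uint64_t buffers of equal length (for example via to_block_range, or by managing the raw words yourself); __builtin_popcountll is a GCC/Clang builtin, and MSVC has __popcnt64 instead:
#include <cstddef>
#include <cstdint>
#include <vector>

// Hamming distance over two equal-length word buffers: XOR each pair of
// words and popcount the differing bits.
std::size_t hamming_distance(const std::vector<std::uint64_t>& a,
                             const std::vector<std::uint64_t>& b)
{
    std::size_t dist = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
        dist += (std::size_t)__builtin_popcountll(a[i] ^ b[i]);
    return dist;
}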
I know this will get downvoted for heresy, but here it is: you can get a pointer to the actual data from a vector using &vector[0] (for vector<bool> your mileage may vary, since it's a packed specialization that doesn't expose a bool*). Then you can iterate over it using C-style code: cast your pointer to an int pointer or something similarly wide, perform your Hamming arithmetic as above, and move the pointer one word length at a time. This only works because you know the bits are packed together contiguously, and it is fragile (for example, if the vector is modified and reallocates, the pointer is invalidated).
Possible Duplicates:
Unique (non-repeating) random numbers in O(1)?
How do you efficiently generate a list of K non-repeating integers between 0 and an upper bound N
I want to generate random numbers in a certain range, and I must be sure that each new number is not a duplicate of a former one. One solution is to store the formerly generated numbers in a container and check each new number against the container. If the number is already in the container, we generate again; otherwise we use it and add it to the container. But with each new number this operation becomes slower and slower. Is there a better approach, or any rand function that can work faster and ensure uniqueness of the generated numbers?
EDIT: Yes, there is a limit (for example from 0 to 1,000,000,000), but I want to generate 100,000 unique numbers! (It would be great if the solution used Qt features.)
Is there a range for the random numbers? If you have a limit for random numbers and you keep generating unique random numbers, then you'll end up with a list of all numbers from x..y in random order, where x-y is the valid range of your random numbers. If this is the case, you might improve speed greatly by simply generating the list of all numbers x..y and shuffling it, instead of generating the numbers.
I think there are 3 possible approaches; depending on the range size and the performance pattern you need, a different one may fit best.
1. Create a random number and see if it is in a (sorted) list. If not, add it and return it; otherwise try another.
Your list will grow and consume memory with every number you need: if every number is 32 bits, it grows by at least 32 bits every time.
Every new random number increases the hit ratio, which makes it slower.
O(n^2), I think.
2. Create a bit-array with one bit for every number in the range, and mark a bit as 1/True once that number has been returned.
Every number now takes only 1 bit; this can still be a problem if the range is big, but it's only 1 bit per possible value.
Every new random number increases the hit ratio, which makes it slower.
Roughly O(n) checks, though retries become more frequent as the range fills up.
3. Pre-populate a list with all the numbers, shuffle it, and return the Nth number.
The list will not grow and returning numbers will not get slower,
but generating the list might take a long time and a lot of memory.
O(1) per number after the setup.
Depending on the speed needed, you could store all the lists in a database; there's no need for them to be in memory except for speed.
Fill out a list with the numbers you need, then shuffle the list and pick your numbers from one end.
If you use a simple 32-bit linear congruential RNG (such as the so-called "Minimal Standard"), all you have to do is store the seed value you use and compare each generated number to it. If you ever reach that value again, your sequence is starting to repeat itself and you're out of values. This is O(1), but of course limited to 2^32-1 values (though I suppose you could use a 64-bit version as well).
There is a class of pseudo-random number generators that, I believe, has the properties you want: the Linear congruential generator. If defined properly, it will produce a list of integers from 0 to N-1, with no two numbers repeating until you've used all of the numbers in the list once.
#include <stdint.h>
/*
* Choose these values as follows:
*
* The MODULUS and INCREMENT must be relatively prime.
* The MULTIPLIER-1 must be divisible by all prime factors of the MODULUS.
* The MULTIPLIER-1 must be divisible by 4, if the MODULUS is divisible by 4.
*
* In addition, modulus must be <= 2**32 (0x0000000100000000ULL).
*
* A small example would be 8, 5, 3.
* A larger example would be 256, 129, 251.
* A useful example would be 0x0000000100000000ULL, 1664525, 1013904223.
*/
#define MODULUS (0x0000000100000000ULL)
#define MULTIPLIER (1664525)
#define INCREMENT (1013904223)
static uint64_t seed;
uint32_t lcg( void ) {
    uint64_t temp;
    temp = seed * MULTIPLIER + INCREMENT; // 64-bit intermediate product
    seed = temp % MODULUS;                // 32-bit end result
    return (uint32_t) seed;
}
All you have to do is choose a MODULUS such that it is larger than the number of numbers you'll need in a given run.
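For the poster's concrete case (100,000 unique values below 1,000,000,000), a hedged usage sketch: keep calling lcg() and discard anything at or above the bound. Uniqueness is preserved because a full-period generator never repeats a raw output within one period, so the accepted subsequence can't repeat either.
uint32_t lcg_below(uint32_t bound) {
    uint32_t x;
    do {
        x = lcg();          // full-period LCG defined above
    } while (x >= bound);   // rejection keeps the value in [0, bound)
    return x;
}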
It wouldn't be random if there were such a pattern, would it?
As far as I know you would have to store and filter all unwanted numbers...
#include <algorithm>
#include <random>
#include <vector>

unsigned int N = 1000;
std::vector<unsigned int> vals(N);
for (unsigned int i = 0; i < vals.size(); ++i)
    vals[i] = i;

// std::random_shuffle works pre-C++17; std::shuffle is the current equivalent
std::shuffle(vals.begin(), vals.end(), std::mt19937{std::random_device{}()});

unsigned int random_number_1 = vals[0];
unsigned int random_number_2 = vals[1];
unsigned int random_number_3 = vals[2];
// etc.
You could store the candidate numbers in a vector and pick them by a random index in 0..n-1. After each pick, remove the chosen element from the vector, then generate the next index in 0..n-2, and so on.
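A sketch of that, with the usual trick of swapping the chosen element to the back so removal is O(1); 'pool' is whatever set of candidate values you start with, and it must not be empty:
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

unsigned int draw_unique(std::vector<unsigned int>& pool, std::mt19937& gen)
{
    std::uniform_int_distribution<std::size_t> pick(0, pool.size() - 1);
    std::size_t i = pick(gen);
    std::swap(pool[i], pool.back());   // move the chosen value to the end
    unsigned int value = pool.back();
    pool.pop_back();                   // shrink the pool so the value can't repeat
    return value;
}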
If they can't be repeated, they aren't random.
EDIT:
Furthermore..
if they can't be repeated, they don't fit in a finite computer
How many random numbers do you need? Maybe you can apply a shuffle algorithm to a precalculated array of random numbers?
A random generator won't make its output depend on previously output values, because then it wouldn't be random. However, you can improve performance by using several pools of random values, each combined with a different salt value, which divides the quantity of numbers to check against by the number of pools you have.
If the range of the random numbers doesn't matter, you could use a really large range and hope you don't get any collisions. If your range is billions of times larger than the number of elements you expect to create, the chance of a collision is small but still there. If the numbers don't have to follow an actual random distribution, you could use a two-part number {counter}{random x digits}; that ensures a unique number, but it won't be randomly distributed.
There's not going to be a pure functional approach that isn't O(n^2) on the number of results returned so far - every time a number is generated you will need to check against every result so far. Additionally, think about what happens when you're returning e.g. the 1000th number out of 1000 - you will require on average 1000 tries until the random algorithm comes up with the last unused number, with each attempt requiring an average of 499.5 comparisons with the already-generated numbers.
It should be clear from this that your description as posted is not quite exactly what you want. The better approach, as others have said, is to take a list of e.g. 1000 numbers upfront, shuffle it, and then return numbers from that list incrementally. This will guarantee you're not returning any duplicates, and return the numbers in O(1) time after the initial setup.
You can allocate an array of bits with 1 bit for each possible number, and check/set the bit for every generated number. For example, for numbers from 0 to 65535 you need only 8192 bytes (8 KB) of memory.
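A sketch of that for the 0..65535 example (the caller must not ask for more values than the range holds, or the retry loop never ends):
#include <bitset>
#include <random>

unsigned int next_unique(std::bitset<65536>& seen, std::mt19937& gen)
{
    std::uniform_int_distribution<unsigned int> dist(0, 65535);
    unsigned int x;
    do {
        x = dist(gen);          // retry until we hit a value we have not returned before
    } while (seen.test(x));
    seen.set(x);
    return x;
}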
Here's an interesting solution I came up with:
Assume you have numbers 1 to 1000 - and you don't have enough memory.
You could put all 1000 numbers into an array, and remove them one by one, but you'll get memory overflow error.
You could split the array in two, so you have an array of 1-500 and one empty array
You could then check if the number exists in array 1, or doesn't exist in the second array.
So assuming you have 1000 numbers, you can get a random number from 1-1000. If it's less than 500, check array 1 and remove it if present. If it's NOT in array 2, you can add it.
This halves your memory usage.
If you propagate this using recursion, you can split your 500 array into a 250 array and an empty array.
Assuming empty arrays use no space, you can decrease your memory usage quite a bit.
Searching will be massively faster too, because if you break it down a lot, you generate a number such as 29. It's less than 500, less than 250, less than 125, less than 62, less than 31, greater than 15, so you do those 6 calculations, then check the array containing an average of 16/2 items - 8 in total.
I should patent this search, although I bet it already exists!
Especially given the desired number of values, you want a Linear Feedback Shift Register.
Why?
No shuffle step, nor a need to keep track of values you've already hit. As long as you go less than the full period, you should be fine.
It turns out that the Wikipedia article has some C++ code examples which are more tested than anything I would give you off the top of my head. Note that you'll want to be pulling values from inside the loops -- the loops just iterate the shift register through. You can see this in the snippet here.
(Yes, I know this was mentioned, briefly in the dupe -- saw it as I was revising. Given it hasn't been brought up here and is the best way to solve the poster's question, I think it should be brought up again.)
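As an illustration, here is the canonical 16-bit Galois LFSR with maximal-length taps (period 65535, never hitting zero); it is only a sketch of the idea, and for the question's range you would use a wider register with taps taken from a table of primitive polynomials:
#include <stdint.h>

static uint16_t lfsr = 0xACE1u;   // any nonzero seed

uint16_t lfsr_next(void) {
    unsigned lsb = lfsr & 1u;     // the output bit
    lfsr >>= 1;
    if (lsb)
        lfsr ^= 0xB400u;          // apply the feedback taps (16, 14, 13, 11)
    return lfsr;
}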
Let's say size = 100,000; then create an array of this size. Create random numbers and put them into the array. The question is which index a number goes to: randomNumber % size gives you the index.
When you put in the next number, use that function for the index and check whether a value already exists there. If it doesn't, store it; if it does, generate a new number and try again. This way you can build the list quickly. The disadvantage is that you will never get two numbers whose last section is the same.
For example, with last sections like
1231232444556
3458923444556
you will never have both numbers in your list, even though they are totally different, because their last sections are the same.
First off, there's a huge difference between random and pseudorandom. There's no way to generate perfectly random numbers from a deterministic process (such as a computer) without bringing in some physical process like latency between keystrokes or another entropy source.
The approach of saving all the numbers generated will slow down the computation rather quickly; the more numbers you have, the larger your storage needs, until you've filled up all available memory. A better method would be (as someone's already suggested) using a well known pseudorandom number generator such as the Linear Congruential Generator; it's super fast, requiring only modular multiplication and addition, and the theory behind it gets a lot of mention in Vol. 2 of Knuth's TAOCP. That way, the theory involved guarantees a rather large period before repetition, and the only storage needed are the parameters and seed used.
If you don't mind that each value can be calculated from the previous one, an LFSR or LCG is fine. If you don't want one output value to be derivable from another, you can use a block cipher in counter mode to generate the output sequence, provided the cipher's block length equals the output length.
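A sketch of the counter-mode idea with a toy 4-round Feistel network standing in for a real block cipher (it is not cryptographically strong, but any permutation of the counter yields non-repeating outputs, since a balanced Feistel is invertible no matter what the round function is; all names and constants here are illustrative):
#include <stdint.h>

static uint32_t round_mix(uint16_t half, uint32_t key) {
    uint32_t x = (half * 0x9E3779B1u) ^ key;   // arbitrary mixing
    return (x >> 16) ^ x;
}

uint32_t permute32(uint32_t counter) {
    uint16_t left = (uint16_t)(counter >> 16), right = (uint16_t)counter;
    static const uint32_t keys[4] = { 0xA511E9B3u, 0x5B7E9E37u, 0x1F123BB5u, 0x83D2C9E1u };
    for (int r = 0; r < 4; ++r) {              // balanced Feistel round: always invertible
        uint16_t tmp = right;
        right = (uint16_t)(left ^ (uint16_t)round_mix(right, keys[r]));
        left = tmp;
    }
    return ((uint32_t)left << 16) | right;
}

// permute32(0), permute32(1), permute32(2), ... never repeat within 2^32 calls.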
Use the HashSet generic class. This class does not store duplicate values. You can put all of your generated numbers into a HashSet and also check whether a value already exists; a HashSet determines the existence of an item very quickly, and it does not slow down as the collection grows, which is its biggest advantage.
For example:
HashSet<int> numbers = new HashSet<int>();
numbers.Add(1);
numbers.Add(2);
numbers.Add(1); // ignored: 1 is already in the set
foreach (var item in numbers)
{
    Console.WriteLine(item);
}
Console.ReadKey();
This is an interview question I faced recently.
Given an array of 1 and 0, find a way to partition the bits in place so that 0's are grouped together, and 1's are grouped together. It does not matter whether 1's are ahead of 0's or 0's are ahead of 1's.
An example input is 101010101, and output is either 111110000 or 000011111.
Solve the problem in less than linear time.
Make the problem simpler. The input is an integer array, with each element either 1 or 0. Output is the same integer array with integers partitioned well.
To me, this is an easy question if it can be solved in O(N). My approach is to use two pointers starting from the two ends of the array. Advance the front pointer and retreat the back pointer; when both point at integers on the wrong side, swap the two.
#include <utility>  // std::swap

int* start = array;
int* end = array + length - 1;
while (start < end) {
    // 0s belong at the end: skip *end if it is already 0
    if (*end == 0) {
        --end;
        continue;
    }
    // 1s belong at the beginning: skip *start if it is already 1
    if (*start == 1) {
        ++start;
        continue;
    }
    // here *start == 0 and *end == 1, so exchange them
    std::swap(*start, *end);
}
However, the interviewer insists there is a sub-linear solution. This has me thinking hard, but I still can't find an answer.
Can anyone help on this interview question?
UPDATE: Seeing replies on SO stating that the problem cannot be solved in sub-linear time, I can confirm my original idea that there cannot be a sub-linear solution.
Is it possible the interviewer was playing a trick?
I don't see how there can be a solution faster than linear time.
Imagine a bit array that is all 1's. Any solution will require examining every bit in this array before declaring that it is already partitioned. Examining every bit takes linear time.
It's not possible. Doing it in less than linear time implies that you don't look at every array element (like a binary search). However since there is no way to know what any element of the array is without looking at it, you must look at each array element at least once.
You can use lookup tables to make it faster, but O(n/8) is still O(n), so either the interviewer was wrong or you misunderstood the question.
It is possible to do it faster than linear time, given that you have enough memory: it can be done in O(1).
Use the bitmask as an index into a vector that maps to the partitioned bitmask.
Using your example, at index 341 (101010101) the value 496 (111110000) is stored.
Perhaps the confusion comes from "less than linear time". For example, this solution counts the number of set bits, then builds a mask containing that many 1 bits. It only iterates while there are uncounted on-bits:
// from http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetKernighan
unsigned count_bits(unsigned pX)
{
    unsigned result;
    for (result = 0; pX; ++result)
    {
        pX &= pX - 1; // clear the lowest set bit
    }
    return result;
}

unsigned n = /* the number */;

// r contains 000...111, with the number of 1's equal to the number of 1's in n
// (if n may have every bit set, guard against shifting by the full word width)
unsigned r = (1u << count_bits(n)) - 1;
Even though this minimizes the number of bits to count, it's still linear. So if this is what is meant by "sub-linear", there you go.
But if they really meant sub-linear as in logarithmic or constant, I don't see a way. You could conceivably make a look-up table for every value, but :/
Technically you could send each element of the array to a separate processor and then do it in less than linear time. If you have N processors, you could even do it in O(1) time!
As others said, I don't believe this can be done in less than linear time. For a linear-time solution, you can use STL algorithms instead of your own loop, like this:
#include <algorithm>
int a1[8] = {1, 0, 1, 0, 1, 0, 1, 0};
// remove() packs the 1s to the front; fill() zeroes the tail it returns
std::fill(std::remove(a1, a1 + 8, 0), a1 + 8, 0);
Well... it can be done in 'less than linear' time (cheeky method).
if(n % 2)
{
// Arrange all 1's to the right and DON'T check the right-most bit, because it's 1
}else{
// Arrange all 0's to the right and DON'T check the right-most bit, because it's 0.
}
So, technically you 'group' the bits in less than linear time :P
To me, the most likely interpretations are:
1. The bits are supposed to be in an int instead of an array, in which case you can use something like http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetKernighan or an 8-bit (or more) lookup table.
2. They used "sublinear" to mean "less than n operations" rather than less than O(n). But even that seems impossible, for the same reasons given below.
3. There is some other miscommunication in the question.
Otherwise the question is wrong, since all elements of the array must be examined to determine the answer, and that is at least n operations.
Listing either 0s or 1s first, and the references to bits rather than bools make me think something like the first option was intended, even though, when dealing with only one word, it doesn't make very much difference. I'm curious to know what the interviewer actually had in mind.
Splitting this work among parallel processors costs N/M (or O(N)) only if you assume that parallelism increases more slowly than problem size does. For the last ten years or so, parallelism (via the GPU) has been increasing more rapidly than typical problem sizes, and this trend looks set to continue for years to come. For a broad class of problems it is instructive to assume "infinite parallelism", or more precisely "parallelism greater than any expected problem size", because the march of progress in GPUs and cloud computing provides exactly that over time.
Assuming infinite parallelism, this problem can be solved in O(log N) time: the addition needed to count the 0 and 1 bits is associative, so a tree reduction over the array completes in about log N time steps, after which writing out the partitioned result is fully parallel.
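A single-machine sketch of that count-then-rewrite idea using the C++17 parallel algorithms (the library decides how much real parallelism you actually get):
#include <algorithm>
#include <execution>
#include <numeric>
#include <vector>

// Count the 1s with a parallel reduction, then rewrite the array as all the
// 0s followed by all the 1s; both phases parallelize cleanly.
void partition_bits(std::vector<int>& bits)
{
    const long ones = std::reduce(std::execution::par, bits.begin(), bits.end(), 0L);
    std::fill(std::execution::par, bits.begin(), bits.end() - ones, 0);
    std::fill(std::execution::par, bits.end() - ones, bits.end(), 1);
}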