Julia: check if a vector is a vector of numbers - if-statement

I'd like to check if my vector / array is made of numbers.
I've tried:
if isa(x, Array{Number})
    println("yes")
end
But it doesn't seem to work...

You have two scenarios here. (Your check fails because Julia's type parameters are invariant: a Vector{Int} is not an Array{Number}, even though Int <: Number.)
Scenario 1. You want to check whether the element type of the vector allows only numbers. Then write:
eltype(x) <: Number
Scenario 2. You want to check whether all elements of the vector actually are numbers. Then write:
all(isa.(x, Number))
The second is less efficient because it has to check the whole array. It is sometimes needed because you can have, e.g.:
x = Any[1, 2, 3]
which contains only numbers, although the element type of the vector allows it to hold things other than numbers (so it fails scenario 1 but passes scenario 2).

Related

How to pick valid pairs of integers from a list optimally?

Summary
Given a discrete list of integers, assign as many of the integers as possible to unique pairs that satisfy a condition. How do I choose these pairs optimally so I have the fewest integers left over?
Details
I have a list of 100 integers from 1 to ~1000000000.
Each integer can only be used once.
Integers are assigned to pairs (a, b) such that, let's say, 2 * a != greatestCommonDenominator(a, a + b).
The program returns the set of pairs that could be created with the fewest number of integers left over.
Problems:
Each integer can have more than one valid pair, but only one can be made with it.
List is not sorted, though it can be.
Integers in the list are not unique, meaning I have to work with indices.
I already have the functions for checking the conditions in place, I just can't figure out a functional strategy for scanning the list.
Considering how unlikely it is for a pairing to be invalid given this condition, I've tried finding the pairs I can't make first, then taking the members of each of those pairs and placing them with a different integer so they can't be invalid. From there I figure I could pair the rest arbitrarily. Of course, that ended up being far too convoluted and had a lot of redundancy in my implementation.
While I work on figuring out my system, does anyone have any better strategies I could try?

Given a set of positive integers <=k and its n subsets, find which pairs of subsets have a union equal to the original set

I have a set A consisting of the first p positive integers (1 to p), and I am given n subsets of this set. How can I count the pairs of subsets whose union is the original set A?
Of course this can be done naively by checking the size of the union of each pair: if it equals p, the union must be the whole set A. But is there a more elegant way of doing this that reduces the time complexity?
std::set_union in C++ performs at most 2*(size(set 1) + size(set 2)) - 1 comparisons, which is not good for nC2 pairs.
If you need to cope with the worst case, here are some ideas about this problem:
I suppose that using std::bitset without any further optimization would be sufficient for this task, because its union operation is much faster. If not, don't use variable-size vectors; use simple p-length 0-1 arrays/vectors or unordered_sets. I don't think variable-size vectors without an O(1) find operation would be better in the worst case.
Use heuristics to reduce the number of unions you compute. The simplest heuristic is checking the sizes of the subsets: we need only those pairs (A, B) of subsets where size(A) + size(B) >= p.
In addition to heuristics, we can count how often every number appears in the subsets (one pass over all subsets, O(n*p) at worst). After that, we can check for the presence of the numbers in increasing order of frequency, so that rare numbers rule a pair out early. Also, we can exclude the numbers that appear in every subset.
If you fix some subset A (in the outer loop, for example) and compute unions with the other subsets, you only need to check the numbers that do not appear in A. If A is large enough, this can dramatically reduce the number of operations needed.
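The bitset idea combined with the size heuristic could look roughly like this. It is a minimal sketch, not a tuned implementation: the names kMaxP, Mask, makeMask and countCoveringPairs are made up for the example, and the compile-time bound kMaxP on p is an assumption (std::bitset needs a fixed size).

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed compile-time upper bound on p; std::bitset needs a fixed size.
constexpr std::size_t kMaxP = 1024;
using Mask = std::bitset<kMaxP>;

// Convert a subset of {1, ..., p} into a bit mask (element x -> bit x - 1).
Mask makeMask(const std::vector<int>& subset) {
    Mask mask;
    for (int x : subset) {
        mask.set(static_cast<std::size_t>(x - 1));
    }
    return mask;
}

// Count the unordered pairs of subsets whose union is {1, ..., p}.
std::size_t countCoveringPairs(const std::vector<std::vector<int>>& subsets,
                               std::size_t p) {
    assert(p <= kMaxP);
    Mask full;
    for (std::size_t bit = 0; bit < p; ++bit) {
        full.set(bit);
    }

    std::vector<Mask> masks;
    std::vector<std::size_t> sizes;
    for (const auto& subset : subsets) {
        masks.push_back(makeMask(subset));
        sizes.push_back(masks.back().count());
    }

    std::size_t pairs = 0;
    for (std::size_t i = 0; i < masks.size(); ++i) {
        for (std::size_t j = i + 1; j < masks.size(); ++j) {
            // Size heuristic: two subsets with fewer than p elements in total
            // cannot cover the whole set.
            if (sizes[i] + sizes[j] < p) continue;
            if ((masks[i] | masks[j]) == full) ++pairs;
        }
    }
    return pairs;
}
```

For the subsets {1, 2}, {3, 4}, {1, 2, 3} and {4} with p = 4, the covering pairs are ({1, 2}, {3, 4}), ({3, 4}, {1, 2, 3}) and ({1, 2, 3}, {4}).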
Just a possible improvement to your approach: instead of binary searching, you can keep a boolean array to find out in O(1) whether some x appears in array i.
For example, when reading the input you record every appearance for each array i: if x appears in array i, then isThere[i][x] is true, otherwise false.
This can save some time.
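A small sketch of such a lookup table follows. Only the name isThere comes from the answer; buildIsThere and coversAll are hypothetical helpers, and the values are assumed to lie in 1..p.

```cpp
#include <cstddef>
#include <vector>

// isThere[i][x] is true exactly when value x (1..p) occurs in subset i.
// Row entries at index 0 are unused so that values can index directly.
std::vector<std::vector<bool>> buildIsThere(
        const std::vector<std::vector<int>>& subsets, int p) {
    std::vector<std::vector<bool>> isThere(
        subsets.size(),
        std::vector<bool>(static_cast<std::size_t>(p) + 1, false));
    for (std::size_t i = 0; i < subsets.size(); ++i) {
        for (int x : subsets[i]) {
            isThere[i][static_cast<std::size_t>(x)] = true;
        }
    }
    return isThere;
}

// Check whether subsets i and j together contain every value in 1..p,
// using one O(1) lookup per subset and value.
bool coversAll(const std::vector<std::vector<bool>>& isThere,
               std::size_t i, std::size_t j, int p) {
    for (int x = 1; x <= p; ++x) {
        const std::size_t v = static_cast<std::size_t>(x);
        if (!isThere[i][v] && !isThere[j][v]) return false;
    }
    return true;
}
```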

Random pairs from two lists

My question is similar to this one.
I have two lists: X with n elements and Y with m elements. Let's say they hold the row and column indices of an n x m matrix A. Now I'd like to write something to k random places in matrix A.
I thought of two solutions:
1. Get a random element x from X and a random element y from Y. Check whether something is already written to A[x][y] and, if not, write. But if k is close to m*n, I could keep shooting like this forever.
2. Create an m*n array with all possible combinations of indices, shuffle it, take the first k elements and write there. The problem I see here is that if both n and m are very big, the newly created n*m array may be huge (and shuffling may take some time too).
Karoly Horvath suggested combining the two. I guess I'd have to pick a threshold t and:
if( k/(m*n) > t ){
    use option 2.
}else{
    use option 1.
}
Any advice on how to pick t then?
Are there any other (better) approaches I missed?
There's an elegant algorithm due to Floyd for sampling without replacement from a range of integers. You can map the resulting integers in [0, n*m) to coordinates by the C++ function [m](int i) { return std::make_pair(i / m, i % m); }.
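A minimal sketch of Floyd's sampling algorithm, combined with the index-to-coordinate mapping mentioned above. The function names floydSample and randomCells are made up for the example, and std::mt19937_64 is just one possible choice of random engine.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <unordered_set>
#include <utility>
#include <vector>

// Floyd's algorithm: choose k distinct integers uniformly from [0, total).
// It draws exactly k random numbers and stores only the k chosen values.
std::unordered_set<std::uint64_t> floydSample(std::uint64_t total,
                                              std::uint64_t k,
                                              std::mt19937_64& rng) {
    assert(k <= total);
    std::unordered_set<std::uint64_t> chosen;
    for (std::uint64_t j = total - k; j < total; ++j) {
        std::uniform_int_distribution<std::uint64_t> dist(0, j);
        const std::uint64_t t = dist(rng);
        // If t was already chosen, take j instead; j cannot be in the set yet
        // because every earlier draw was smaller than j.
        if (!chosen.insert(t).second) {
            chosen.insert(j);
        }
    }
    return chosen;
}

// Pick k distinct (row, column) cells of an n x m matrix.
std::vector<std::pair<std::uint64_t, std::uint64_t>> randomCells(
        std::uint64_t n, std::uint64_t m, std::uint64_t k,
        std::mt19937_64& rng) {
    std::vector<std::pair<std::uint64_t, std::uint64_t>> cells;
    cells.reserve(k);
    for (std::uint64_t i : floydSample(n * m, k, rng)) {
        cells.emplace_back(i / m, i % m);  // row-major: row = i / m, col = i % m
    }
    return cells;
}
```

Unlike the retry approach, this never loops on collisions, and unlike the shuffle approach it never materializes all n*m indices. Note that the cells come back in the iteration order of the hash set, which is arbitrary rather than uniformly random, so shuffle the result if the order in which cells are written matters.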
The best approach depends on how full your resulting matrix will be. If you are going to fill more than half of it, your collision rate (i.e. picking a random spot that has already been written to) will be high and will make you loop far more often than you would want.
I would not generate all possibilities; instead I would build them as you go, using a list of lists: one list of all possible values of X, and for each of those a list of possible values of Y. I would initialize the X list but not the Y ones.
Every time you pick a value of x for the first time, you create a dictionary or list of m elements, then remove the one you use. The next time you pick that x you will have m-1 elements. Once an X value runs out of elements, remove it from the list so it does not get picked again. This way you are guaranteed never to pick an occupied space again, and you do not need to generate all n*m possible options.
You have n x m elements, e.g. 200 elements for a 10 x 20 matrix. Picking one out of 200 should be easy. Point is, whatever you do, you can flatten the two dimensions into one, reducing that part of the problem.
Notes:
Use floor divide and modulo operations to get row and column out of the index.
Blacklist: Store the picked index in a set to quickly skip those that were already written.
Whitelist: Store the indices that are not yet picked in a set. Whether this is better than blacklisting depends on how full your set is.
Using the right container type for the set can matter; it doesn't have to be std::set. For the blacklist, you only need fast lookup and fast insertion, so a vector<bool> might actually work pretty well. For the whitelist, you need fast random access and fast deletion, so a vector<unsigned> with the remaining indices would be a good choice.
Prepare to switch between either method depending on the circumstances.
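The blacklist variant from these notes might be sketched as follows, using a vector<bool> over the flattened indices. The function name pickWithBlacklist is hypothetical, and std::mt19937 is an assumed choice of random engine.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Blacklist approach: draw flat indices at random and skip those already used.
// Works well while k is small compared to n * m; it slows down as the matrix
// fills up, because more and more draws hit an index that is already taken.
std::vector<std::size_t> pickWithBlacklist(std::size_t n, std::size_t m,
                                           std::size_t k, std::mt19937& rng) {
    const std::size_t total = n * m;
    assert(k <= total);
    std::vector<std::size_t> picked;
    if (total == 0) return picked;

    std::vector<bool> used(total, false);  // used[i] == true: i already picked
    std::uniform_int_distribution<std::size_t> dist(0, total - 1);
    picked.reserve(k);
    while (picked.size() < k) {
        const std::size_t idx = dist(rng);
        if (used[idx]) continue;  // collision: draw again
        used[idx] = true;
        picked.push_back(idx);
    }
    return picked;
}
```

Each returned index idx corresponds to row idx / m and column idx % m.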
For an n x m matrix, you can consider [0..n*m-1] the indexes of the matrix elements.
Filling in a random index is rather trivial: just generate a random number between 0 and n*m-1, and that is the position to be filled.
Doing this operation repeatedly can be a little more tricky:
you can test whether you have already written something to a position and regenerate the random number; but as you fill the matrix you will have to regenerate indexes more and more often.
a better solution is to put all the indexes in a vector of n*m elements. Each time you generate an index, you remove it from the vector, and the next time you generate a random position between 0 and N-1, where N is the number of indexes still left in the vector.
example:
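// Assumes n, m and processMatrixLocation(int) are defined elsewhere;
// needs <vector> and <cstdlib>.
std::vector<int> indexVec;
for (int i = 0; i < n * m; i++)
    indexVec.push_back(i);
int nrOfIndexes = n * m;
while (nrOfIndexes > 0)
{
    // pick a random position among the indexes that are still left
    int pos = rand() % nrOfIndexes;
    // process the matrix index stored there, not the position itself
    processMatrixLocation(indexVec[pos]);
    indexVec.erase(indexVec.begin() + pos);
    nrOfIndexes--;
}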
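Erasing from the middle of a vector costs time proportional to the number of elements behind the erased one. Since the order of the remaining indexes does not matter here, one common alternative is to overwrite the chosen slot with the last element and pop the back, which is O(1). The sketch below stops after k picks, as the question asks; the name pickWithSwapAndPop is made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Whitelist approach: keep the indexes that are still free in a vector and
// remove each picked one in O(1) by swapping it with the last element.
std::vector<std::size_t> pickWithSwapAndPop(std::size_t n, std::size_t m,
                                            std::size_t k, std::mt19937& rng) {
    std::vector<std::size_t> remaining(n * m);
    std::iota(remaining.begin(), remaining.end(), std::size_t{0});
    assert(k <= remaining.size());

    std::vector<std::size_t> picked;
    picked.reserve(k);
    for (std::size_t step = 0; step < k; ++step) {
        std::uniform_int_distribution<std::size_t> dist(0, remaining.size() - 1);
        const std::size_t pos = dist(rng);
        picked.push_back(remaining[pos]);
        // Overwrite the chosen slot with the last element, then drop the last.
        remaining[pos] = remaining.back();
        remaining.pop_back();
    }
    return picked;
}
```

This still allocates all n*m indexes up front, so it suits the case where k is a large fraction of n*m.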

Finding the index position of the nearest value in a Fortran array

I have two sorted arrays, one containing factors (array a) that when multiplied with values from another array (array b), yields the desired value:
a(idx1) * b(idx2) = value
With idx2 known, I would like to find the idx1 of a that provides the factor necessary to get as close to value as possible.
I have looked at some different algorithms (like this one, for example), but I feel like they would all be subject to potential problems with floating point arithmetic in my particular case.
Could anyone suggest a method that would avoid this?
If I understand correctly, this expression
minloc(abs(a-value/b(idx2)))
will return the index into a of the first occurrence of the value in a which minimises the difference (as a one-element array; add dim=1 to get a scalar). I expect that the compiler will write code to scan all the elements in a, so this may not be faster in execution than a search which takes advantage of the knowledge that a and b are both sorted. In compensation, this is much quicker to write and, I expect, to debug.
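For comparison, a search that exploits the sorting could be sketched like this. It is written in C++ rather than Fortran and uses 0-based indices; the function name nearestIndex is made up, and a is assumed to be non-empty and sorted in ascending order. In Fortran the same idea would be a binary search over a.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Return the 0-based index of the element of the ascending-sorted vector a
// that is closest to target, using a binary search instead of a full scan.
std::size_t nearestIndex(const std::vector<double>& a, double target) {
    assert(!a.empty());
    // First element that is not less than target.
    auto it = std::lower_bound(a.begin(), a.end(), target);
    if (it == a.begin()) return 0;
    if (it == a.end()) return a.size() - 1;
    auto prev = it - 1;
    // Only the two neighbours of target can be closest; ties go to the lower one.
    if (std::fabs(target - *prev) <= std::fabs(*it - target)) {
        return static_cast<std::size_t>(prev - a.begin());
    }
    return static_cast<std::size_t>(it - a.begin());
}
```

With the names from the question, the call would be nearestIndex(a, value / b[idx2]). This does not remove the rounding in value / b[idx2]; it only replaces the linear scan with O(log n) comparisons.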

Deriving an ordered sequence from 5 scrambled ones

I've been assigned to do a problem that goes something like this:
My program should derive a list of integers A[1...N], where A[j] represents the jth integer in the list.
To derive it, my program will be given 5 lists, each of N integers (exactly the same ones as in A[1...N], although scrambled). Each of these lists is generated this way:
The list is put into order, just like A[1...N]. The list is then scrambled by removing zero or more integers from it and placing them BACK at any position in the list. In each of the 5 lists, each number is moved at most once (although a number could end up at a different index as a result of other numbers shifting around).
FOR EXAMPLE
Assume N is 5, and the correct sequence A is {1, 2, 3, 4, 5}
The program would be given these 5 sequences:
1,2,3,4,5
2,1,3,4,5
3,1,2,4,5
4,1,2,3,5
5,1,2,3,4
How would it be able to determine that the target/original sequence was {1,2,3,4,5}?
Could anyone point me in the right direction? (This is a homework problem)
Tell me if you need me to clarify the problem more.
Thanks!
I would create an array of size N and use it as an index for the other arrays. For instance, if you created an integer array index[N], you could manipulate it and use its values as indices into the other arrays, i.e. array1[index[i]]. Depending on how you manipulated this index array, you could use it for either scrambling or sorting.