Find largest index of nonzero in python - python-2.7

I have a module to print ~12000 lists of 60 y values against a single set of 60 x values. Would like to find the largest x value that has a non-zero y value.
Using numpy np.nonzero(y) returns every list. Also tried
b = []
for i in range(len(y)):
    if y[i] != 0: b.append(i)
print b
and it returned all 12000 indices in y.
Any help is greatly appreciated!

The where function returns a tuple, so you need to pull the first element to get at the data you want:
import numpy as np
y = [0, 0, 2, 3, 1, 0, 0, 3, 0]
print np.where(y)[0].max()
This prints 7.
[Edit...]
I just re-read Adlai's question: he has a large list of lists, each with 60 y values. If everything is in lists, and one of the lists is very large, it's probably fastest to convert the 12000-item list of 60 values each to a 12000 by 60 array, and then just use straight numpy. If y is the "outside" list, then np.array(y) should come back with shape (12000, 60). If that's the case, this is a better solution for finding which x values have a non-zero y value somewhere:
yy = np.array(y) # results in a shape (12000, 60)
np.where((yy != 0).any(axis=0))[0]
The logic is: Convert your data to a truth table by comparing to zero, then collapse the truth table with any(axis=0), then find the indices of the True entries in the collapsed truth table (the largest of these is the index you are after).
To pull it together with the x data, and wrap it up in a one-liner:
np.array(x)[np.where((np.array(y) != 0).any(axis=0))[0]].max()
This gives the largest x value that has some non-zero y value. If you want an array of largest x-values corresponding to non-zero y-value, that would be a 12,000 item list of x-values (one for every set of 60 y-values), you need something slightly different.
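For the "slightly different" per-row case, one possible sketch is shown here. It assumes x is sorted in ascending order (so the last non-zero column is also the largest x), and the small arrays and the names nonzero, last_idx and largest_x are made up for illustration; they are not from the original answer.

```python
import numpy as np

# Toy stand-ins for the real data: 3 rows of y instead of 12000,
# 4 x values instead of 60. x is assumed to be sorted ascending.
x = np.array([10.0, 20.0, 30.0, 40.0])
y = np.array([[0, 1, 0, 0],
              [0, 0, 0, 0],
              [2, 0, 0, 3]])

nonzero = y != 0

# Reverse the columns so argmax finds the first True from the right,
# then convert that back to an index into the original column order.
last_idx = y.shape[1] - 1 - np.argmax(nonzero[:, ::-1], axis=1)

# Rows with no non-zero entry at all get NaN instead of a bogus x value.
largest_x = np.where(nonzero.any(axis=1), x[last_idx], np.nan)
```

The any(axis=1) mask matters because argmax returns 0 for an all-False row, which would otherwise be mistaken for a hit in the last column.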

import numpy as np
np.max(np.where(y))

You are probably looking for numpy.where
Return elements, either from x or y, depending on condition.
If only condition is given, return condition.nonzero().
Something like this:
largestindex = numpy.max(numpy.where(item))

Related

Finding indices of elements that equal zero in the given numpy matrix

I'm trying to find the indices of the zero elements in a 3*3 integer matrix using numpy as part of the tic-tac-toe game problem. I realize that np.where is a good option for this case and tried it out, but the output I get still doesn't look right. Can you please help me code this part? I have given my partial code below.
input: s, an integer matrix of dimension 3*3
example:
output: m,a list of possible next moves, where each next move is a (r,c) tuple where r denotes the row number, c denotes the column number.
example:
[code]
m = np.where(s==0)
Here's a quick solution:
import numpy as np
s = np.matrix('0, 0, 0; 0, 1, 0; 0, 0, 0')
m = np.where(s==0)
m = list(zip(m[0], m[1]))
print(m)
s is the input matrix, where you can see that the middle square is taken, and then we use np.where() just like you did, which produces two arrays, then use zip() to combine them into tuples and list() to convert the output to a list of tuples of valid moves.
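As an alternative sketch, np.argwhere returns the (row, column) pairs directly, so no zip() is needed. This version assumes a plain 2-D ndarray rather than np.matrix, and the name moves is just illustrative.

```python
import numpy as np

# Same board as above, as an ordinary 2-D array: the middle square is taken.
s = np.array([[0, 0, 0],
              [0, 1, 0],
              [0, 0, 0]])

# np.argwhere gives one [row, col] pair per element that satisfies the condition.
# Converting to plain ints keeps the result a list of ordinary Python tuples.
moves = [(int(r), int(c)) for r, c in np.argwhere(s == 0)]
```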

Generate random floats in a set of discontinuous ranges

I am looking to generate random floats to populate a virtual reality world with objects, using the floats to define the locations of the objects. Right now it is easy to uniformly fill a "box" with objects using random.uniform call.
import random
#generate random floating coordinates
for z in range(1000):
    x = random.uniform(-1,1)
    y = random.uniform(-1,1)
    z = random.uniform(-1,1)
    #shape = vizshape.addSphere() commented out because this is an uncommon module for VR but I wanted to show how floats are being used
    #shape.setPosition([x,y,z])
What I would like to do is pass in arguments to random.uniform() that specify more than 1 range to generate floats in, something like:
x = random.uniform([-1,1],[2,3])
Of course this will cause errors but I am looking for a way to get floats in multiple, size-varying, discontinuous ranges. I'd like to do it without writing another check to see if the generated floats are within my desired ranges and then deciding to keep or throw away. Any ideas?
The idea would be to first pick a range at random and then a float inside the selected range. The problem, however, is that for this to be uniform across all ranges, their lengths should be taken into account. Otherwise, shorter ranges would get, in the long run, the same number of samples as longer ones, which is not uniform behavior.
In order to address this problem we can map all ranges to, say, the interval [0,1] in such a way that relative lengths are preserved. Here is how:
Sort the ranges in ascending order and get [a1, b1], ..., [an, bn]
Let di = bi - ai be the length of the i-th range
Let L be the sum of all di
Map a1 to 0 and b1 to d1/L.
For all i >= 2 map ai to the mapping of b_{i-1} and bi to that value plus di/L.
For all i >= 1 let xi be the mapping of ai.
Now take a random number s uniformly sampled in [0,1] and proceed as follows
Let i0 be the last index i such that xi <= s.
Use the inverse of the i0-th mapping to get the answer. In other words, answer with
f = a_{i0} + (s - x_{i0}) * L
This number f is in the i0-th interval and has been chosen uniformly at random.
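The steps above could be sketched roughly as follows. This is only one way to write it: the function name uniform_over_ranges is hypothetical, the ranges are assumed to be disjoint (low, high) pairs with at least one of positive length, and bisect is used to find the last index whose mapped start is <= s.

```python
import bisect
import random

def uniform_over_ranges(ranges):
    """Draw one float uniformly from the union of disjoint (low, high) ranges."""
    ranges = sorted(ranges)
    lengths = [high - low for low, high in ranges]   # the d_i
    total = float(sum(lengths))                      # L

    # starts[i] is x_i: where range i begins after mapping everything into [0, 1].
    starts = []
    acc = 0.0
    for d in lengths:
        starts.append(acc / total)
        acc += d

    s = random.random()                              # uniform in [0, 1)
    i0 = bisect.bisect_right(starts, s) - 1          # last i with x_i <= s
    low, high = ranges[i0]

    # Undo the mapping; min() only guards against floating-point overshoot.
    return min(high, low + (s - starts[i0]) * total)
```

For example, uniform_over_ranges([(-1, 1), (2, 3)]) should land in [2, 3] about one third of the time, since that range holds one third of the total length.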
This is a bit of a hack, but it can be easily generalized to multiple discontinuous ranges:
import random
def getRandoms(n, ranges):
    answer = []
    for _ in xrange(n):
        low,high = random.choice(ranges)
        answer.append(random.uniform(low, high))
    return answer

Python list - find tuple with minimum value but randomize the list before

I have a list of tuples. These tuples include an integer value and a category.
mylist = ((1,catA),(1,catB),(2,catA),..)
My objective is to select one tuple from the list that has a value = minimum value. There could be one or more of the tuples that have a value = minimum value. In the example above the minimum value is 1 and both CatA and CatB have value = minimum value.
To get min value I used:
min_value = min(mylist , key=lambda x: x[0])[0]
To select a tuple with a value = min_value I used:
min_tuple = min([x for x in mylist if x[0] == min_value])
However I would like to randomize the sort order so that the selected tuple doesn't always have the same category.
I tried using shuffle before selecting min_tuple but that didn't change the selection order.
random.shuffle(mylist)
min_tuple = min([x for x in mylist if x[0] == min_value])
So I am guessing that the min_tuple expression does its own ordering. Is this true? If true, can the min_tuple expression ordering be randomized as it selects a tuple with value = min_value?
Edit to add:
I was aware of random.choice from other SO question/answers and elsewhere but my question was focused on the min tuple expression specifically how/if it ordered tuples as it found min value in list.
Also my specific formulation incorrectly or needlessly did a 'double filter' for min value (eg == min_value and min() ). The answer I received here corrected this usage and also applied random.choice as a modification of my specific method.
The min call in your computation of min_tuple means that you're always going to get the tuple with the category that compares smallest. If that was what you really wanted, you should just do min(mylist) and be done with it.
If you want to randomly select from the tuples that have the minimum value, replace min with something like random.choice:
min_value = min(x[0] for x in mylist)
min_tuple = random.choice([x for x in mylist if x[0] == min_value])
Note that I've changed the calculation of min_value to work a little more directly (rather than finding the first tuple with the minimum value and then extracting just the value from it). The original way would have worked fine too.
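A small self-contained sketch of the difference, using made-up string categories in place of catA and catB: min() compares whole tuples, so it breaks ties on the category no matter how the list was shuffled, while random.choice picks among the tied tuples.

```python
import random

# Hypothetical data; the categories are plain strings here.
mylist = [(1, "catA"), (1, "catB"), (2, "catA")]

min_value = min(x[0] for x in mylist)
candidates = [x for x in mylist if x[0] == min_value]

# Shuffling does not change what min() returns: tuples compare element by
# element, so the tie on the value is broken by the smaller category.
random.shuffle(candidates)
always_same = min(candidates)

# random.choice selects uniformly among the tuples that share the minimum value.
min_tuple = random.choice(candidates)
```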

Subset sum variant with a non-zero target sum

I have an array of integers and need to apply a variant of the subset sum algorithm on it, except that instead of finding a set of integers whose sum is 0 I am trying to find a set of integers whose sum is n. I am unclear as to how to adapt one of the standard subset sum algorithms to this variant and was hoping for any insight into the problem.
This is the subset sum problem, which is NP-Complete (there is no known efficient solution to NP-Complete problems), but if your numbers are relatively small integers, there is an efficient pseudo-polynomial solution that follows the recurrence:
D(x,i) = false x<0
D(0,i) = true
D(x,0) = false x != 0
D(x,i) = D(x,i-1) OR D(x-arr[i],i-1)
Later, you need to step back on your choices, see where you decided to "reduce" (take the element), and where you decided not to "reduce" (not take the element), on the generated matrix.
This thread and this thread discuss how to get the elements for similar problems.
Here is some Python code (taken from the thread I linked to) that does the trick.
If you are not familiar with Python, read it as pseudocode; it's pretty easy to understand!
arr = [1,2,4,5]
n = len(arr)
SUM = 6
#pre processing:
D = [[True] * (n+1)]
for x in range(1,SUM+1):
    D.append([False]*(n+1))
#DP solution to populate D:
for x in range(1,SUM+1):
    for i in range(1,n+1):
        D[x][i] = D[x][i-1]
        if x >= arr[i-1]:
            D[x][i] = D[x][i] or D[x-arr[i-1]][i-1]
print D
#get a random solution:
if D[SUM][n] == False:
    print 'no solution'
else:
    sol = []
    x = SUM
    i = n
    while x != 0:
        possibleVals = []
        if D[x][i-1] == True:
            possibleVals.append(x)
        if x >= arr[i-1] and D[x-arr[i-1]][i-1] == True:
            possibleVals.append(x-arr[i-1])
        #by here possibleVals contains 1/2 solutions, depending on how many choices we have.
        #chose randomly one of them
        from random import randint
        r = possibleVals[randint(0,len(possibleVals)-1)]
        #if decided to add element:
        if r != x:
            sol.append(x-r)
        #modify i and x accordingly
        x = r
        i = i-1
    print sol
You can solve this by using dynamic programming.
Let's assume that:
N - is the sum that is required (your first input).
M - is the number of summands available (your second input).
a1...aM - are the summands available.
f[x] is true when you can reach the sum of x, and false otherwise
Now the solution:
Initially f[0] = true and f[1..N] = false - we can reach only the sum of zero without taking any summand.
Now you can iterate over all ai, where i in [1..M], and with each of them perform next operation:
f[x + ai] = f[x + ai] || f[x], for each x from N - ai down to 0 - the order of processing is relevant! (Going downward ensures each summand is used at most once.)
Finally you output f[N].
This solution has the complexity of O(N*M), so it is not very useful when you either have large input numbers or large number of summands.
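A minimal sketch of this one-dimensional table in Python, assuming all summands are positive integers. The function name can_reach is made up for illustration.

```python
def can_reach(summands, target):
    """Return True if some subset of the positive integers in summands adds up to target."""
    # f[x] is True when the sum x is reachable; only 0 is reachable at the start.
    f = [True] + [False] * target

    for a in summands:
        # Walk x downward so that f[x] still describes the state *before*
        # this summand was considered; that way a is used at most once.
        for x in range(target - a, -1, -1):
            if f[x]:
                f[x + a] = True

    return f[target]
```

With the array from the earlier answer, can_reach([1, 2, 4, 5], 6) is True (for instance 2 + 4), while can_reach([3], 6) is False because 3 may not be taken twice.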

How can I remove similar but not duplicate items from a list?

I have a list:
values = [[6.23234121,6.23246575],[1.352672,1.352689],[6.3245,123.35323,2.3]]
What is a way I can go through this list and remove all items that are within, say, 0.01 of other elements in the same list?
I know how to do it for a specific set of lists using del, but I want it to be general for if values has n lists in it and each list has n elements.
What I want to happen is perform some operation on this list
values = [[6.23234121,6.23246575],[1.352672,1.352689],[6.3245,123.35323,2.3]]
and get this output
new_values = [[6.23234121],[1.352672],[6.3245,123.35323,2.3]]
I'm going to write a function to do this for a single list, eg
>>> compact([6.23234121,6.23246575], tol=.01)
[6.23234121]
You can then get it to work on your nested structure through just [compact(l, tol) for l in lst].
Each of these methods will keep the first element that doesn't have anything closer to it in the list; for @DSM's example of [0, 0.005, 0.01, 0.015, 0.02] they'd all return [0, 0.015] (or, if you switch > to >=, [0, 0.01, 0.02]). If you want something different, you'll have to define exactly what it is more carefully.
First, the easy approach, similar to David's answer. This is O(n^2):
def compact(lst, tol):
    new = []
    for el in lst:
        # keep el only if it is farther than tol from everything kept so far
        if all(abs(el - x) > tol for x in new):
            new.append(el)
    return new
On three-element lists, that's perfectly nice. If you want to do it on three million-element lists, though, that's not going to cut it. Let's try something different:
import collections
import math
def compact(lst, tol):
    # round() needs an integer number of digits
    round_digits = int(-math.log10(tol) - 1)
    seen = collections.defaultdict(set)
    new = []
    for el in lst:
        # bin key: the element rounded to a coarser precision than tol
        rounded = round(el, round_digits)
        if all(abs(el - x) > tol for x in seen[rounded]):
            seen[rounded].add(el)
            new.append(el)
    return new
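If your tol is 0.01, then round_digits is 1. So 6.23234121 is indexed in seen as just 6.2. When we then see 6.23246575, we round it to 6.2 and look that up in the index, which should contain nearly all numbers that could possibly be within tol of the number we're looking up. Then we still have to check distances to those numbers, but only on the very few numbers that are in that index bin, instead of the entire list. One caveat: two numbers that straddle a bin boundary, such as 6.249 and 6.251, round to different keys and are never compared, so to be strictly correct you would also need to check the neighboring bins.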
This approach is O(n k), where k is the average number of elements that'll fall within one such bin. It'll only be helpful if k << n (as it typically would be, but that depends on the distribution of the numbers you're using relative to tol). Note that it also uses probably more than twice as much memory as the other approach, which could be an issue for very large lists.
Another option would be to sort the list first; then you only have to look at the previous and following elements to check for a conflict.
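A rough sketch of the sorted variant, under the assumption that returning the kept values in sorted order (rather than the original order) is acceptable. The name compact_sorted is made up here. Because the kept values are ascending, the last kept value is always the closest one, so a single comparison per element is enough.

```python
def compact_sorted(lst, tol):
    """Keep values that are more than tol above the previously kept value.

    Note: the result comes back in ascending order, not in the input order.
    """
    new = []
    for el in sorted(lst):
        # new is ascending, so new[-1] is the nearest kept value below el.
        if not new or el - new[-1] > tol:
            new.append(el)
    return new
```

This costs O(n log n) for the sort plus one pass, and it does not suffer from the bin-boundary issue of the rounding approach.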