Replacing list values in a while loop

I am trying to add two lists - a nominal list and a random-number list - element by element to create a new list made up of positive values. My issue is that the random numbers are drawn from a normal distribution (random.normalvariate(0, SD)), so occasionally summing the two lists produces negative values, which I do not want.
I have tried to resolve this with a while loop that checks whether the sum of the two lists at each item is negative and, if it is, replaces the random number with a new one. However, my code does not seem to replace the values no matter how I adjust it. Here is my current attempt:
nominalList = [1,2,3,4,5]
randomList = []
for n in xrange(0, len(nominalList)):
    randomList.append(random.normalvariate(0, SD))
while nominalList[n] + randomList[n] < 0:
    randomList[n] = random.normalvariate(0, SD)

You need to iterate over the indices again and reassign any that don't meet your criteria:
import random

SD = 7
nominalList = [1,2,3,4,5]
randomList = [random.normalvariate(0, SD) for n in nominalList]
for i, n in enumerate(nominalList):
    while n + randomList[i] < 0:
        randomList[i] = random.normalvariate(0, SD)
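After this loop, every n + randomList[i] is guaranteed to be non-negative.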
Or, another thought: just indent the while in your original code, so the check runs for each element inside the for loop:
nominalList = [1,2,3,4,5]
randomList = []
for n in xrange(0, len(nominalList)):
    randomList.append(random.normalvariate(0, SD))
    while nominalList[n] + randomList[n] < 0:
        randomList[n] = random.normalvariate(0, SD)
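The same fix can also be folded into the generation step itself. A minimal Python sketch of that variant (my own, with SD = 7 assumed as an example since the question never gives a value):

import random

SD = 7  # assumed example value; the question does not specify one
nominalList = [1, 2, 3, 4, 5]
randomList = []
for nominal in nominalList:
    r = random.normalvariate(0, SD)
    while nominal + r < 0:  # redraw until the pairwise sum is non-negative
        r = random.normalvariate(0, SD)
    randomList.append(r)

positiveList = [nom + ran for nom, ran in zip(nominalList, randomList)]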


Perfect sum problem with fixed subset size

I am looking for the least time-complex algorithm that would solve a variant of the perfect sum problem (initially: finding all variable-size subset combinations from an array [*] of integers of size n that sum to a specific number x) where the subset combination size is fixed at k, returning the possible combinations without direct duplicates and also without indirect duplicates (a combination containing exactly the same elements as another, in a different order).
I'm aware this problem is NP-hard, so I am not expecting a perfect general solution, but something that could at least run in a reasonable time in my case, with n close to 1000 and k around 10.
Things I have tried so far:
Finding a combination, then doing successive modifications on it and its modifications
Let's assume I have an array such as:
s = [1,2,3,3,4,5,6,9]
So I have n = 8, and I'd like x = 10 for k = 3
Thanks to some obscure method (brute force?) I found a subset [3,3,4].
From this subset I find other possible combinations by taking two elements out of it and replacing them with other elements having the same sum, e.g. (3, 3) can be replaced by (1, 5), since both have the same sum and the replacement numbers are not already in use. I then obtain another subset [1,5,4], and I repeat the process for all the obtained subsets... indefinitely?
The main issue, as suggested here, is that it's hard to determine when it's done, and this method is rather chaotic. I have imagined some variants of this method, but they really are works in progress.
Iterating through the set to list all k-long combinations that sum to x
Pretty self-explanatory. This is a naive method that does not work well in my case, since n is fairly large and k is not small enough to avoid a catastrophically large number of combinations (on the order of 10^27!).
I experimented with several mechanisms for restricting the search area instead of blindly iterating through all possibilities, but that is rather complicated and still a work in progress.
What would you suggest? (Snippets can be in any language, but I prefer C++)
[*] To clear up any doubt about whether the base collection can contain duplicates: I used the term "array" instead of "set" to be precise. In my case the collection can contain duplicate integers, and quite a lot of them - around 70 distinct integers for 1000 elements (rounded counts), for example.
With a reasonable sum limit this problem can be solved using an extension of the dynamic-programming approach for the subset sum problem, or the coin change problem with a predetermined number of coins. Note that we can count all variants in pseudo-polynomial time O(x*n), but the output size might grow exponentially, so generating all variants might be a problem.
Make a 3D array, list, or vector with outer dimension x+1, for example A[][][]. Every element A[p] of this list contains a list of the possible subsets with sum p.
We walk through all elements (call the current element item) of the initial "set" (I noticed repeating elements in your example, so it is not a true set).
For each item, scan the A[] list from the last entry to the beginning. (This trick helps avoid using the same item repeatedly.)
If A[i - item] contains subsets of size < k, we can add all of these subsets to A[i], appending item to each.
After the full scan, A[x] will contain all subsets with sum x of size k and less, and we can filter out only those of size k.
Example output of my quickly made Delphi program for the following data:
Lst := [1,2,3,3,4,5,6,7];
k := 3;
sum := 10;
3 3 4
2 3 5 //distinct 3's
2 3 5
1 4 5
1 3 6
1 3 6 //distinct 3's
1 2 7
To exclude variants with distinct repeated elements (if needed), we can use a non-first occurrence of item only for subsets that already contain the first occurrence (so 3 3 4 remains valid while the second 2 3 5 is never generated). A simpler post-filter alternative is sketched after the C++ code below.
I literally translated my Delphi code into C++ (which feels a bit weird, I think :)
#include <iostream>
#include <vector>
using namespace std;

int main()
{
    vector<vector<vector<int>>> A;
    vector<int> Lst = { 1, 2, 3, 3, 4, 5, 6, 7 };
    int k = 3;
    int sum = 10;
    A.push_back({ {0} });  // fictive entry so every variant is non-empty
    for (int i = 0; i < sum; i++)
        A.push_back({{}});
    for (int item : Lst) {
        // walk sums from high to low so the same item is never reused
        for (int i = sum; i >= item; i--) {
            for (int j = 0; j < A[i - item].size(); j++)
                if (A[i - item][j].size() < k + 1 &&
                    A[i - item][j].size() > 0) {
                    vector<int> t = A[i - item][j];
                    t.push_back(item);
                    A[i].push_back(t);  // add new variant including current item
                }
        }
    }
    // output the needed variants
    for (int i = 0; i < A[sum].size(); i++)
        if (A[sum][i].size() == k + 1) {  // k real elements plus the fictive 0
            for (int j = 1; j < A[sum][i].size(); j++)  // excluding fictive 0
                cout << A[sum][i][j] << " ";
            cout << endl;
        }
}
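The program above still prints the duplicate rows marked "distinct 3's" in the sample output. Besides the occurrence-tracking rule described earlier, a simpler (if less elegant) option is to canonicalise each finished subset and post-filter - a minimal Python sketch of my own, where subsets is a placeholder for the collected size-k variants:

seen = set()
for subset in subsets:           # 'subsets' stands in for the size-k rows of A[sum]
    key = tuple(sorted(subset))  # order-independent canonical form
    if key not in seen:
        seen.add(key)
        print(subset)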
Here is a complete solution in Python. Translation to C++ is left to the reader.
Like the usual subset sum, generation of the doubly linked summary of the solutions is pseudo-polynomial: it is O(count_values * distinct_sums * depths_of_sums). However, actually iterating through the solutions can be exponential. Using generators the way I did avoids holding that whole list in memory, even though it can still take a long time to run.
from collections import namedtuple

# This is a doubly linked list.
# (value, tail) will be one group of solutions.  (next_answer) is another.
SumPath = namedtuple('SumPath', 'value tail next_answer')

def fixed_sum_paths (array, target, count):
    # First find counts of values to handle duplications.
    value_repeats = {}
    for value in array:
        if value in value_repeats:
            value_repeats[value] += 1
        else:
            value_repeats[value] = 1

    # paths[depth][x] will be all subsets of size depth that sum to x.
    paths = [{} for i in range(count+1)]

    # First we add the empty set.
    paths[0][0] = SumPath(value=None, tail=None, next_answer=None)

    # Now we start adding values to it.
    for value, repeats in value_repeats.items():
        # Reversed depth avoids seeing paths we will find using this value.
        for depth in reversed(range(len(paths))):
            for result, path in paths[depth].items():
                for i in range(1, repeats+1):
                    if count < i + depth:
                        # Do not fill in too deep.
                        break
                    result += value
                    if result in paths[depth+i]:
                        path = SumPath(
                            value=value,
                            tail=path,
                            next_answer=paths[depth+i][result]
                        )
                    else:
                        path = SumPath(
                            value=value,
                            tail=path,
                            next_answer=None
                        )
                    paths[depth+i][result] = path
                    # Subtle bug fix, a path for value, value
                    # should not lead to value, other_value because
                    # we already inserted that first.
                    path = SumPath(
                        value=value,
                        tail=path.tail,
                        next_answer=None
                    )
    return paths[count][target]

def path_iter(paths):
    if paths.value is None:
        # We are the tail
        yield []
    else:
        while paths is not None:
            value = paths.value
            for answer in path_iter(paths.tail):
                answer.append(value)
                yield answer
            paths = paths.next_answer

def fixed_sums (array, target, count):
    paths = fixed_sum_paths(array, target, count)
    return path_iter(paths)

for path in fixed_sums([1,2,3,3,4,5,6,9], 10, 3):
    print(path)
Incidentally for your example, here are the solutions:
[1, 3, 6]
[1, 4, 5]
[2, 3, 5]
[3, 3, 4]
You should first sort the so-called array. Second, you should determine whether the problem is solvable at all, to save time: take the last k elements (the largest, after sorting) and check whether their sum is larger than or equal to x. If it is smaller, you are done - no such combination is possible. If it is exactly equal, you are also done - there are no other combinations. O(n) feels nice, doesn't it? If it is larger, you have a lot of work to do.
You need to store the combinations in a separate array. Replace the smallest of the k numbers with the smallest element in the array; if the sum is still larger than x, do the same with the second smallest, the third, and so on, until you get something smaller than x. Once the sum drops below x, increase the value at the last position you stopped at until you hit x; once you hit x, that is your combination. Then you can go after the previous element: if you had 1, 1, 5, 6, you can grab the 1 as well, add it to your smallest element 5 to get 6, then check whether 6 can be written as a combination of two values, stopping once you hit the value. Then repeat for the others. The problem can be solved in O(n!) time in the worst case.
As for those 10^27 combinations: do you even have that much space? At roughly 3 bits for a header and 8 bits per integer, you would need about 9.8765*10^25 terabytes just to store that colossal array - more memory than a supercomputer. You should worry about whether your computer can even store the result rather than whether you can solve the problem; with that many combinations, even a quadratic-time solution would crash your machine, and quadratic is a long way off from O(n!).
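The feasibility pre-check described above, as a minimal Python sketch of my own (not the answerer's code):

def feasible(values, x, k):
    # if even the k largest elements cannot reach x, no size-k subset can
    values = sorted(values)
    return sum(values[-k:]) >= x

print(feasible([1, 2, 3, 3, 4, 5, 6, 9], 10, 3))  # True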
A brute-force method using recursion might look like this.
For example, given variables set, x, k, the following pseudo code might work:
setSumStructure find(int[] set, int x, int k, int setIdx)
{
    int sz = set.length - setIdx;
    if (sz < k) return null;  // not enough elements left to build a size-k subset
    if (sz == k) check whether the sum of set[setIdx] -> set[set.length - 1] == x; if it does, return the set together with the sum, else return null;
    for (int i = setIdx; i < set.length - (k - 1); i++)
        filter(find(set, x - set[i], k - 1, i + 1));
    return filteredSets;
}
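A runnable version of that recursion, as a minimal Python sketch of my own (it assumes non-negative integers, as in the question's example; the early break would not be valid with negative values):

def find_combinations(values, x, k):
    values = sorted(values)
    result = []

    def recurse(start, target, size, chosen):
        if size == 0:
            if target == 0:
                result.append(list(chosen))
            return
        for i in range(start, len(values) - size + 1):
            if i > start and values[i] == values[i - 1]:
                continue  # skip indirect duplicates from repeated values
            if values[i] > target:
                break  # sorted, non-negative values: nothing later can fit
            chosen.append(values[i])
            recurse(i + 1, target - values[i], size - 1, chosen)
            chosen.pop()

    recurse(0, x, k, [])
    return result

print(find_combinations([1, 2, 3, 3, 4, 5, 6, 9], 10, 3))
# [[1, 3, 6], [1, 4, 5], [2, 3, 5], [3, 3, 4]]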

use elements from a large list till the list becomes empty (python)

I'm new to Python and I have a looping issue getting chunks of data from a list.
I have a large list, and I need to use chunks of it until it becomes entirely empty.
Let's say I have a list such as:
a = range(4000)  # range 100 - 9k
n = 99
while a:
    x = a[:n]  # want to use first 100 elements
    # some insertion work with (x) in DB
    a = a[n+1:]  # reducing first 100 elements from main list
but this method is not working.
Can anybody suggest a proper approach for this?
Thanks
a[:n] when n is 99 gets the first 99 elements - so change n to 100.
a = a[n+1:] will miss an element - so change n+1 to n
The full code:
a = range(4000)
n = 100
while a:
    x = a[:n]
    # some insertion work with (x) in DB
    a = a[n:]  # reducing first 100 elements from main list
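As a side note, a common Python idiom for this (my own sketch, not part of the answer) is a small generator that yields successive chunks instead of repeatedly copying the remainder of the list:

def chunks(seq, size):
    # yield successive slices of length 'size'; the last one may be shorter
    for start in range(0, len(seq), size):
        yield seq[start:start + size]

for x in chunks(list(range(4000)), 100):
    pass  # insertion work with x goes here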

Fastest way to remove N random objects

My question is as follows: I am currently working with a generated list of length m. However, the list is supposed to be the result of an algorithm taking n as an argument for the final length, and m is always much larger than n. Currently I am running a while loop where m is the result of len(list), i.e.:
from numpy import random as rnd

m = 400000
n = 3000
list = range(0, m)
while len(list) > n:
    rmi = rnd.randint(0, len(list))
    del list[rmi]
    print('%s/%s' % (len(list), n))
This approach certainly works, but it takes an incredibly long time to run. Is there a more efficient, less time-consuming way of removing m-n random entries from my list? The entries removed must be random, or the resulting list will no longer represent what it should.
edit:
Later in my code I then have two arrays of size n which need to be shortened to size b, the caveat here being that both lists need elements removed randomly, but the removed elements must also share the same index, i.e.:
from numpy import random as rnd

n = 3000
b = 500
list1 = range(0, n)
list2 = rnd.sample(xrange(10000), n)
while len(list1) > b:
    rmi = rnd.randint(0, len(list1))
    del list1[rmi]
    del list2[rmi]
    print('%s/%s' % (len(list1), b))
alvis' answer below answers the first part of my question; however, it does not work for the second part.
Try numpy.random.choice, which creates a random sample of your list:
https://docs.scipy.org/doc/numpy-1.10.1/reference/generated/numpy.random.choice.html
import numpy as np
...
np.random.choice(range(0, m), size=n, replace=False)  # replace=False keeps the n survivors distinct
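For the second part of the question (shortening two lists in lockstep), one option is to draw a single set of surviving indices and apply it to both arrays - a sketch of my own, not part of the original answer, with stand-in data for list2:

import numpy as np

n, b = 3000, 500
list1 = np.arange(n)
list2 = np.random.randint(0, 10000, size=n)  # stand-in for the second list

keep = np.random.choice(n, size=b, replace=False)  # the same indices survive in both
list1_short = list1[keep]
list2_short = list2[keep]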

Subset sum variant with a non-zero target sum

I have an array of integers and need to apply a variant of the subset sum algorithm to it: instead of finding a set of integers whose sum is 0, I am trying to find a set whose sum is n. I am unclear how to adapt one of the standard subset sum algorithms to this variant and was hoping for any insight into the problem.
This is the subset sum problem, which is NP-complete (no efficient solution to NP-complete problems is known), but if your numbers are relatively small integers, there is an efficient pseudo-polynomial solution that follows the recurrence:
D(x, i) = false                             if x < 0
D(0, i) = true
D(x, 0) = false                             if x != 0
D(x, i) = D(x, i-1) OR D(x - arr[i], i-1)   otherwise
Afterwards, you need to step back through your choices on the generated matrix and see where you decided to "reduce" (take the element) and where you decided not to.
This thread and this thread discuss how to recover the elements for similar problems.
Here is Python code (taken from the thread I linked to) that does the trick.
If you are not familiar with Python, read it as pseudo code - it's pretty easy to understand!
from random import randint

arr = [1,2,4,5]
n = len(arr)
SUM = 6

# pre-processing:
D = [[True] * (n+1)]
for x in range(1, SUM+1):
    D.append([False] * (n+1))

# DP solution to populate D:
for x in range(1, SUM+1):
    for i in range(1, n+1):
        D[x][i] = D[x][i-1]
        if x >= arr[i-1]:
            D[x][i] = D[x][i] or D[x-arr[i-1]][i-1]
print(D)

# get a random solution:
if D[SUM][n] == False:
    print('no solution')
else:
    sol = []
    x = SUM
    i = n
    while x != 0:
        possibleVals = []
        if D[x][i-1] == True:
            possibleVals.append(x)
        if x >= arr[i-1] and D[x-arr[i-1]][i-1] == True:
            possibleVals.append(x - arr[i-1])
        # by here possibleVals contains 1 or 2 options, depending on how many choices we have;
        # choose one of them at random
        r = possibleVals[randint(0, len(possibleVals)-1)]
        # if we decided to take the element:
        if r != x:
            sol.append(x - r)
        # modify i and x accordingly
        x = r
        i = i - 1
    print(sol)
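For this example the code prints one of the two valid solutions, {1, 5} or {2, 4}, picked at random.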
You can solve this using dynamic programming.
Let's assume that:
N is the required sum (your first input);
M is the number of summands available (your second input);
a1...aM are the summands available;
f[x] is true when the sum x is reachable, and false otherwise.
Now the solution:
Initially f[0] = true and f[1..N] = false - without taking any summand we can reach only the sum of zero.
Now iterate over all ai, i in [1..M], and with each of them perform the following operation:
f[x + ai] = f[x + ai] || f[x], for each x from N - ai down to 0 - the order of processing is relevant!
Finally, output f[N].
This solution has a complexity of O(N*M), so it is not very useful when you have either large input numbers or a large number of summands.
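A minimal Python sketch of that boolean-array formulation (my own code, assuming non-negative summands):

def can_reach(a, N):
    f = [False] * (N + 1)
    f[0] = True
    for ai in a:
        # iterate downwards so each summand is used at most once
        for x in range(N - ai, -1, -1):
            if f[x]:
                f[x + ai] = True
    return f[N]

print(can_reach([1, 2, 4, 5], 6))  # True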

More elegant way for updating a slice of a list

Given a list, I would like to apply a set of operations to a subset (slice) of the list, and store the result of each transformation in the original list.
My background is in Ada, which led me to make the following mistake:
Number_List = [0,1,2,3,4,5,6,7,8,9]
for Index, Number in enumerate(Number_List[1:]):
    Number_List[Index] = Number + 1
This gives a new Number_List of [2,3,4,5,6,7,8,9,10,9], teaching me that a slice of a list is re-indexed from 0.
I've moved to the following, which is cumbersome but functional.
Number_List = [0,1,2,3,4,5,6,7,8,9]
for Index in range(1, len(Number_List)):
    Number_List[Index] = Number_List[Index] + 1
I am looking for a more elegant way to do this.
enumerate takes a start parameter:
Number_List = [0,1,2,3,4,5,6,7,8,9]
for Index, Number in enumerate(Number_List[1:], start=1):
    Number_List[Index] = Number + 1
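This leaves Number_List as [0, 2, 3, 4, 5, 6, 7, 8, 9, 10], with the first element untouched.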
You can also write
Number_List[1:] = [x+1 for x in Number_List[1:]]
from itertools import islice

number_list = [0,1,2,3,4,5,6,7,8,9]  # example data, assumed here so the snippet runs
start, stop = 1, len(number_list)
number_list[start:stop] = (x + 1 for x in islice(number_list, start, stop))
Alternatively, use number_list[start:stop] instead of islice, but that creates another slice needlessly. Either way, slice assignment updates the list in place and avoids an explicit loop.
You can use list comprehensions and slices to great effect:
vals = list(range(10))  # gives your example numbers (list() needed in Python 3)
vals[1:] = [v + 1 for v in vals[1:]]