I was wondering how I would go about counting combinations in a list. To be more precise, I have a list made up of smaller lists, each containing 6 randomly chosen numbers, and I want to count how many times each combination occurs within the bigger list and then display the least frequent combination. So far I have tried using Counter(), but it seems it can't count lists.
Here's an example of what I want to do:
list = [[1,2,3,4,5,6],[1,5,16,35,55,22],[1,2,3,4,5,6],[5,25,35,45,55,10],[1,5,16,35,55,22],[1,2,3,4,5,6],[9,16,21,22,23,6],[9,16,21,22,23,6]]
so after counting the combinations it should print the combination [5,25,35,45,55,10]
since it only occurred once in the list
FYI: the list is going to be randomly generated with around 1 billion combinations stored, but given the range of numbers, there are only 175 million possible combinations.
FYI 2: I'm extremely new to Python.
When you construct the Counter instance you can convert your lists to tuples; the latter are hashable, which is the property an object needs to be able to serve as a key of a dict.
>>> from collections import Counter
>>> l = [[1,2,3,4,5,6],[1,5,16,35,55,22],[1,2,3,4,5,6],[5,25,35,45,55,10],[1,5,16,35,55,22],[1,2,3,4,5,6],[9,16,21,22,23,6],[9,16,21,22,23,6]]
>>> c = Counter(tuple(e) for e in l)
>>> c
Counter({(1, 2, 3, 4, 5, 6): 3, (1, 5, 16, 35, 55, 22): 2, (9, 16, 21, 22, 23, 6): 2, (5, 25, 35, 45, 55, 10): 1})
>>> list(c.most_common()[-1][0])
[5, 25, 35, 45, 55, 10]
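A minimal sketch of the same idea wrapped in a function, under two assumptions that are not in the question: the order of the numbers inside a combination does not matter (hence the sorted() call), and only the single rarest combination is needed (hence min() rather than sorting every count with most_common()). The name least_common_combination is made up for illustration.

```python
from collections import Counter

def least_common_combination(draws):
    """Return (combination, count) for the rarest combination in draws."""
    # Assumption: [1, 2, 3] and [3, 2, 1] are the same combination.
    # Drop sorted() if the order of the numbers matters.
    counts = Counter(tuple(sorted(draw)) for draw in draws)
    # min() scans the counts once instead of sorting all of them.
    combo, count = min(counts.items(), key=lambda item: item[1])
    return list(combo), count

draws = [[1, 2, 3, 4, 5, 6], [1, 5, 16, 35, 55, 22], [6, 5, 4, 3, 2, 1],
         [5, 25, 35, 45, 55, 10], [22, 55, 35, 16, 5, 1]]
result = least_common_combination(draws)
print(result)
```

Note that a Counter only holds combinations that occurred at least once, so a combination that never appears in the data will not be reported as the least common one.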
I am working on an optimization problem. I have X number of ambulance locations, where X ranges from 1-39.
There are 39 numbers [ambulance locations] to choose from (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39); we choose 3 of them since I have 3 ambulances.
I can only put my ambulance in three locations among 1-39 locations (Restriction). Assume that I want to put my Ambulance on the 5th, 19th, and 31 positions. -- Chromosome 1= [000010000000000000100000000000100000000]. In the above presentation, I am turning on 5-bit, 19-bit, and 31-bit.
Is it possible to flip a bit to a position close to the original solution? For example, keeping 2 bits on in their original positions and randomly moving the 3rd bit to a position near them. It is important for me to keep exactly 3 bits on among the 39 bits. I want a controlled mutation that produces only a small change.
My goal is to make small changes, since each bit represents a location. The purpose of the mutation is to make a small change and then evaluate the results. Therefore, the code should do something like this. For CS1: (111000000000000000000000000000000000000), I want something like (011010000000000000000000000000000000000), or (011001000000000000000000000000000000000), or (010110000000000000000000000000000000000) or (101010000000000000000000000000000000000), or (00101100000000000000000000000000000000), etc.
To achieve this mutation, what is a good way to randomly move the current positions to other positions while staying within the range of locations 1-39 (Restriction)?
You could use numpy and do something like
import numpy
s = "1110000000000000000000000000"
def mutate(s):
    arr = numpy.array(list(s))
    mask = arr == "1"
    indices_of_ones = numpy.argwhere(mask).flatten()
    # pick one of the bits that is currently on
    pick_one_1_index = numpy.random.choice(indices_of_ones)
    potential_swaps = numpy.argwhere(~mask).flatten()
    distances = numpy.abs(pick_one_1_index - potential_swaps)
    probabilities = (1/distances) # higher probabilities the less distance from its original position
    # probabilities = (1/(distances**2)) # even higher probabilities the less distance from its original position
    pick_one_0_index = numpy.random.choice(potential_swaps,p=probabilities/probabilities.sum())
    # swap: turn the chosen 1 off and the chosen 0 on, so the number of 1s is unchanged
    arr[pick_one_1_index] = '0'
    arr[pick_one_0_index] = '1'
    return "".join(arr)
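There is likely a more efficient solution.
Alternatively, you can raise the distances to a power to penalize larger distances more heavily. (Multiplying every distance by the same scalar has no effect, because the probabilities are normalized afterwards.)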
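As a rough sketch of the power idea, here is a standard-library variant of the mutation with the exponent exposed as a parameter. The name mutate_nearby and the power argument are hypothetical, and random.choices is used here in place of numpy.random.choice; it is an illustration under those assumptions, not the code from the answer.

```python
import random

def mutate_nearby(bits, power=2.0, rng=random):
    """Move one '1' to a '0' position, favouring positions close to it."""
    ones = [i for i, b in enumerate(bits) if b == "1"]
    zeros = [i for i, b in enumerate(bits) if b == "0"]
    source = rng.choice(ones)
    # Weight each candidate by 1 / distance**power; a larger power keeps
    # the move closer to the original position. The distance is never 0
    # because the source index is not among the zeros.
    weights = [1.0 / abs(source - z) ** power for z in zeros]
    target = rng.choices(zeros, weights=weights, k=1)[0]
    out = list(bits)
    out[source], out[target] = "0", "1"
    return "".join(out)

parent = "111" + "0" * 36   # 39 locations, 3 ambulances
child = mutate_nearby(parent)
print(child, child.count("1"))
```

Because one bit is turned off and one is turned on, the result always keeps exactly three bits set and never leaves the 39 positions.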
If you wanted to test different multipliers or powers for the probabilities,
you could use something like
def score_solution(s1,s2):
    # positions of the 1 bits in each string
    ix1 = set([i for i,v in enumerate(s1) if v == "1"])
    ix2 = set([i for i,v in enumerate(s2) if v == "1"])
    # exactly two positions differ after one mutation: the old and the new location
    a,b = ix1 ^ ix2
    return numpy.abs(a-b)
def get_solution_score_quantiles(sample_size=100,quantiles = [0.25,0.5,0.75]):
    scores = []
    for i in range(sample_size):
        s1 = mutate(s)
        scores.append(score_solution(s,s1))
    return numpy.quantile(scores,quantiles)
print(get_solution_score_quantiles(50))
Some query in a database allows me to count the number of documents it contains, grouped by the different values of a key. Here is a sample of the result:
{('value1',): 3, ('value2',): 11, (u'value3',): 5, (u'value4',): 35, ('value5',): 3, etc.}
I would like to compute the average and the median of 3, 11, 5, 35, 3, etc. with Python. How can I extract these values and compute them?
I'm not sure how you're getting the results, but something like this would work.
arr = {('value1',): 3, ('value2',): 11, (u'value3',): 5, (u'value4',): 35, ('value5',): 3}
vals = list(arr.values()) # Get list of values, e.g. [3, 11, 5, 35, 3]
average = sum(vals) / float(len(vals))
# Perform whatever operations you want
If you're using numpy, you can get the median with numpy.median(numpy.array(vals))
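If numpy is not available, a minimal sketch with the standard-library statistics module (Python 3.4 and later) gives the same two numbers. The dictionary below simply repeats the sample from the question; the variable names are illustrative.

```python
import statistics

counts = {('value1',): 3, ('value2',): 11, ('value3',): 5,
          ('value4',): 35, ('value5',): 3}

# Only the values matter here; the tuple keys are ignored.
vals = list(counts.values())

mean = statistics.mean(vals)
median = statistics.median(vals)
print(mean, median)
```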
I've been wanting to learn some Haskell for a while now, and I know it and similar languages have really good support for various kinds of infinite lists. So, how could I represent the sequence of tetrahedral numbers in Haskell, preferably with an explanation of what's going on?
0 0 0
1 1 1
2 3 4
3 6 10
4 10 20
5 15 35
6 21 56
7 28 84
8 36 120
In case it's not clear what's going on there, the second column is a running total of the first column, and the third column is a running total of the second column. I'd prefer that the Haskell code retain something of the "running total" approach, since that's the concept I was wondering how to express.
You're correct, Haskell is really nice for doing things like this:
first_col = [0..]
second_col = scanl1 (+) first_col
third_col = scanl1 (+) second_col
first_col is an infinite list of integers, starting at 0
scanl1 (+) calculates a lazy running sum (see the Prelude docs)
We can verify that the above code is doing the right thing:
Prelude> take 10 first_col
[0,1,2,3,4,5,6,7,8,9]
Prelude> take 10 second_col
[0,1,3,6,10,15,21,28,36,45]
Prelude> take 10 third_col
[0,1,4,10,20,35,56,84,120,165]
Adding to perimosocordiae's great answer, languages like Haskell are so slick they allow you to make an infinite list of infinite lists.
First let's define the operator that produces each successive row:
op :: [Integer] -> [Integer]
op = scanl1 (+)
As explained by perimosocordiae, this is just a lazy running sum.
We also need a base case:
tnBase :: [Integer]
tnBase = [0..]
So how do we get an infinite list of infinite lists of tetrahedral numbers? We iterate this operation on the base case, then the output produced by the base case, then that output...
tn = iterate op tnBase
iterate is in the Prelude. Such functions can be found using Hoogle, searching by name (if you have a good guess) or by type signature (you generally know the signature of what you need). Source code is usually linked from the Haddock documentation.
Presentation
(in case you're not comfortable with map, take, drop, and head)
This is all well and good, but rather useless if you don't know how to get past the first infinite list to see the second, third, etc. There are plenty of options. To get just one particular list, you can drop the first few:
getNthTN n = head (drop n tn)
Getting the first few results of each list is probably more what you're looking for though:
printFirstFew n m = print $ take m (map (take n) tn)
Here map (take n) tn will take the first n values from each list of tetrahedral numbers while take m will limit our results to the first m lists.
And lastly, I like the awesome groom package for quick interactive playing with data:
> groom $ take 10 (map (take 10) tn)
[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 3, 6, 10, 15, 21, 28, 36, 45],
[0, 1, 4, 10, 20, 35, 56, 84, 120, 165],
[0, 1, 5, 15, 35, 70, 126, 210, 330, 495],
[0, 1, 6, 21, 56, 126, 252, 462, 792, 1287],
[0, 1, 7, 28, 84, 210, 462, 924, 1716, 3003],
[0, 1, 8, 36, 120, 330, 792, 1716, 3432, 6435],
[0, 1, 9, 45, 165, 495, 1287, 3003, 6435, 12870],
[0, 1, 10, 55, 220, 715, 2002, 5005, 11440, 24310],
[0, 1, 11, 66, 286, 1001, 3003, 8008, 19448, 43758]]
So I've been playing around with python and noticed something that seems a bit odd. The semantics of -1 in selecting from a list don't seem to be consistent.
So I have a list of numbers
ls = range(1000)
The last element of the list is of course ls[-1], but if I take a sublist of that so that I get everything from, say, the midpoint to the end, I would do
ls[500:-1]
but this does not give me a list containing the last element in the list, but instead a list containing everything UP TO the last element. However if I do
ls[0:10]
I get a list that also contains the tenth element (so the selector ought to be inclusive). Why, then, does it not work for -1?
I can of course do ls[500:] or ls[500:len(ls)] (which would be silly). I was just wondering what the deal with -1 was; I realise that I don't need it there.
In list[first:last], last is not included.
The 10th element is ls[9], in ls[0:10] there isn't ls[10].
If you want a sublist that includes the last element, leave the position after the colon blank:
>>> ll=range(10)
>>> ll
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> ll[5:]
[5, 6, 7, 8, 9]
>>> ll[:]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
I get consistent behaviour for both instances:
>>> ls[0:10]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> ls[10:-1]
[10, 11, 12, 13, 14, 15, 16, 17, 18]
Note, though, that tenth element of the list is at index 9, since the list is 0-indexed. That might be where your hang-up is.
In other words, [0:10] doesn't go from index 0-10, it effectively goes from 0 to the tenth element (which gets you indexes 0-9, since the 10 is not inclusive at the end of the slice).
It seems pretty consistent to me; positive indices are also non-inclusive. I think you're doing it wrong. Remembering that range() is also non-inclusive, and that Python lists are 0-indexed, here's a sample Python session to illustrate:
>>> d = range(10)
>>> d
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> d[9]
9
>>> d[-1]
9
>>> d[0:9]
[0, 1, 2, 3, 4, 5, 6, 7, 8]
>>> d[0:-1]
[0, 1, 2, 3, 4, 5, 6, 7, 8]
>>> len(d)
10
When slicing a list,
ls[y:x]
takes the slice from element y up to but not including x. A negative index counts from the end of the list, so
ls[y:-1] == ls[y:len(ls)-1]
and the slice runs up to the last element but does not include it (as with any slice).
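The equivalence above can be checked directly. This is a small illustrative snippet; the list and the variable names are made up for the example.

```python
ls = list(range(10))           # [0, 1, ..., 9]

without_last = ls[5:-1]        # stop index -1 means len(ls) - 1, excluded
same_thing = ls[5:len(ls) - 1]
to_end = ls[5:]                # omit the stop index to include the last element
first_ten = ls[0:10]           # indexes 0..9; index 10 itself is excluded

print(without_last, same_thing, to_end, first_ten)
```

In every case the stop index is excluded, so the length of ls[a:b] for in-range, non-negative a <= b is simply b - a.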
-1 isn't special in the sense that the sequence is read backwards; rather, it wraps around the ends, so that minus one means zero minus one, exclusive (and, for a positive step value, the sequence is still read "from left to right").
So for i = [1, 2, 3, 4], i[2:-1] means from index 2 up to the beginning minus one (or, 'around to the end'), which results in [3].
The -1th element, or element 0 backwards 1, is the last element, 4, but since the end of the slice is exclusive, we get only 3.
I hope this is somewhat understandable.
Having a sorted list and some random value, I would like to find in which range the value is.
List goes like this: [0, 5, 10, 15, 20]
And value is, say 8.
The standard way would be either to go from the start until we hit a value that is bigger than ours (as in the example below), or to perform a binary search.
grid = [0, 5, 10, 15, 20]
value = 8
result_index = 0
while result_index < len(grid) and grid[result_index] < value:
    result_index += 1
print result_index
I am wondering if there is a more Pythonic approach, as this, although short, looks like a bit of an eyesore.
Thank you for your time!
>>> import bisect
>>> grid = [0, 5, 10, 15, 20]
>>> value = 8
>>> bisect.bisect(grid, value)
2
Edit:
bisect — Array bisection algorithm
def find_range(grid, value):
    for lo, hi in zip(grid, grid[1:]): # [(0, 5), (5, 10), (10, 15), (15, 20)]
        if lo <= value < hi: # previously: if value in xrange(lo, hi):
            return lo, hi
    raise ValueError("value out of range")
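The two answers can be combined: bisect finds the position with a binary search, and the neighbouring grid entries give the range. A minimal sketch, assuming half-open ranges [low, high) and a sorted grid; the name find_interval is hypothetical.

```python
import bisect

def find_interval(grid, value):
    """Return (low, high) from a sorted grid such that low <= value < high."""
    # bisect_right gives the index of the first grid entry greater than value.
    i = bisect.bisect_right(grid, value)
    if i == 0 or i == len(grid):
        # value is below the first entry, or at/above the last one
        raise ValueError("value out of range")
    return grid[i - 1], grid[i]

grid = [0, 5, 10, 15, 20]
interval = find_interval(grid, 8)
print(interval)
```

With half-open ranges a value equal to the last grid entry (20 here) is treated as out of range; if that value should belong to the final range, the boundary check would need adjusting.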