I am attempting to insert items into a list based on the index found in list a.
a = [2,1,0]
b = ['c','b','a']
c = []
for (number, letter) in zip(a, b):
    c.insert(number, [number, letter])
print(c)
This outputs: [[0, 'a'], [2, 'c'], [1, 'b']]
But I expected: [[0, 'a'], [1, 'b'], [2, 'c']]
Why does this happen?
Here's what actually happens inside your loop.
>>> c = []
>>> c.insert(2,[2,'c'])
>>> c
[[2, 'c']]
>>> c.insert(1,[1,'b'])
>>> c
[[2, 'c'], [1, 'b']]
>>> c.insert(0,[0,'a'])
>>> c
[[0, 'a'], [2, 'c'], [1, 'b']]
As you can see, the position your values are inserted at is relative to the contents of the list at the time of insertion. I'd recommend just sorting the zipped list.
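The sorting suggestion could be sketched like this. It is a minimal example that reuses the a and b lists from the question and assumes you want the same [number, letter] sublists in the result:

```python
a = [2, 1, 0]
b = ['c', 'b', 'a']

# Pair each index with its letter, then sort the pairs.
# Tuples compare element by element, so the number decides the order.
c = [[number, letter] for number, letter in sorted(zip(a, b))]
print(c)  # [[0, 'a'], [1, 'b'], [2, 'c']]
```

This avoids insert() entirely, so the result no longer depends on how long the list happens to be at each step.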
I have a long deque of lists of 4 elements.
How do I efficiently extract columns from it?
I am currently using list comprehensions, as follows:
S=[s[0] for s in sample_D]
R=[s[2] for s in sample_D]
I am not sure if this is the most efficient way to do it.
Let's take an example:
>>> sample_D = [(i, i+1, i+2, i+3) for i in range(0, 1000, 4)]
>>> sample_D
[(0, 1, 2, 3), (4, 5, 6, 7), ..., (996, 997, 998, 999)]
The zip function is useful to transpose a matrix:
The documentation describes it this way: "Returns an iterator of tuples, where the i-th tuple contains the i-th element from each of the argument sequences or iterables."
>>> list(zip(*sample_D))
[(0, 4, 8, ..., 988, 992, 996), (1, 5, ..., 993, 997), (2, 6, ..., 994, 998), (3, 7, ..., 995, 999)]
The list comprehension returns lists, while the zip method returns tuples, but the content is the same:
>>> def using_list_comp(sample, indices):
...     return tuple([t[i] for t in sample] for i in indices)
...
>>> def using_zip(sample, indices):
...     z = list(zip(*sample))
...     return tuple(z[i] for i in indices)
...
>>> assert using_list_comp(sample_D, [0, 1, 2, 3]) == tuple(list(t) for t in using_zip(sample_D, [0, 1, 2, 3]))
If you need only one column, then the list comprehension is faster:
>>> import timeit
>>> timeit.timeit(lambda: using_list_comp(sample_D,[0]))
6.561095703000319
>>> timeit.timeit(lambda: using_zip(sample_D,[0]))
10.13769362000312
But if you need multiple columns, the zip method is faster:
>>> timeit.timeit(lambda: using_list_comp(sample_D,[0, 1, 2, 3]))
25.433326307000243
>>> timeit.timeit(lambda: using_zip(sample_D,[0, 1, 2, 3]))
10.10265000200161
I have a large number of dictionaries with about 20 keys each, but I am using two dictionaries with only 2 keys as an example here:
import numpy as np
dict1 = {'A': np.array([[1, 2, 3], [4, 5, 6]]), 'B': np.array([[1, 2], [4, 5]])}
dict2 = {'A': np.array([[11, 12, 13], [14, 15, 16]]), 'B': np.array([[11, 21], [41, 51]])}
I am trying to obtain a new dictionary with concatenated arrays such that:
combinedDict['A'] =
array([[ 1,  2,  3],
       [ 4,  5,  6],
       [11, 12, 13],
       [14, 15, 16]])
combinedDict['B'] =
array([[ 1,  2],
       [ 4,  5],
       [11, 21],
       [41, 51]])
How do I write a dictionary comprehension or other approach for the above?
Use numpy.concatenate inside a dictionary comprehension:
dictkeys = ('A', 'B')
dicts = dict1, dict2
{key: np.concatenate([d[key] for d in dicts]) for key in dictkeys}
The Python documentation says the following about the zip function:
"The left-to-right evaluation order of the iterables is guaranteed. This makes possible an idiom for clustering a data series into n-length groups using zip(*[iter(s)]*n)."
I have difficulty understanding the zip(*[iter(s)]*n) idiom. Can anybody give me an example of when we should use it?
Thank you very much!
I don't know which documentation you're using, but this version of the zip() documentation (for Python 2, where zip() returns a list) has this example:
>>> x = [1, 2, 3]
>>> y = [4, 5, 6]
>>> zipped = zip(x, y)
>>> zipped
[(1, 4), (2, 5), (3, 6)]
>>> x2, y2 = zip(*zipped)
>>> x == list(x2) and y == list(y2)
True
It pairs up the elements of two lists, in their respective order, and it also has an "unzip" feature.
And since you asked, here's a slightly more understandable example:
>>> friends = ["Amy", "Bob", "Cathy"]
>>> orders = ["Burger", "Pizza", "Hot dog"]
>>> friend_order_pairs = list(zip(friends, orders))
>>> friend_order_pairs
[('Amy', 'Burger'), ('Bob', 'Pizza'), ('Cathy', 'Hot dog')]
It's 2020, but let me leave this here for reference.
The zip(*[iter(s)]*n) idiom is used to split a flat list into chunks.
For example:
>>> mylist = [1, 2, 3, 'a', 'b', 'c', 'first', 'second', 'third']
>>> list(zip(*[iter(mylist)]*3))
[(1, 2, 3), ('a', 'b', 'c'), ('first', 'second', 'third')]
The idiom is analyzed here.
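To make the mechanics concrete, here is a small sketch that unpacks the idiom step by step. The variable names (it, same_three, chunks, short) are only illustrative and are not from the documentation:

```python
mylist = [1, 2, 3, 'a', 'b', 'c', 'first', 'second', 'third']

it = iter(mylist)        # a single iterator over the list
same_three = [it] * 3    # three references to the SAME iterator, not three copies
assert same_three[0] is same_three[1] is same_three[2]

# zip() pulls one item from each argument in turn, left to right. Because every
# argument is the same iterator, each output tuple takes 3 consecutive items.
chunks = list(zip(*same_three))
print(chunks)
# [(1, 2, 3), ('a', 'b', 'c'), ('first', 'second', 'third')]

# Leftover items that do not fill a whole chunk are silently dropped.
short = list(zip(*[iter([1, 2, 3, 4, 5])] * 2))
print(short)  # [(1, 2), (3, 4)]
```

This is why the guaranteed left-to-right evaluation order matters: it is what keeps the items inside each chunk in their original sequence.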
zip() is for sticking two or more lists together.
names = ['bob', 'tim', 'larry']
ages = [15, 36, 50]
list(zip(names, ages))
Out: [('bob', 15), ('tim', 36), ('larry', 50)]
I use it to create dictionaries when I have separate lists of keys and values:
>>> keys = ('pi', 'c', 'e')
>>> values = (3.14, 3*10**8, 1.6*10**-19)
>>> dict(zip(keys, values))
{'c': 300000000, 'pi': 3.14, 'e': 1.6000000000000002e-19}
Here is how to iterate over two lists and their indices using enumerate() together with zip():
alist = ['a1', 'a2', 'a3']
blist = ['b1', 'b2', 'b3']
for i, (a, b) in enumerate(zip(alist, blist)):
    print(i, a, b)
zip() basically combines two or more iterables into a sequence of tuples, one tuple per position:
>>> alist = ['a1', 'a2', 'a3']
>>> blist = ['b1', 'b2', 'b3']
>>> list(zip(alist, blist))
[('a1', 'b1'), ('a2', 'b2'), ('a3', 'b3')]
Use izip instead.
When working with very large data sets in Python 2, you can use itertools.izip, which returns an iterator and only evaluates results when they are requested. That keeps memory use low and often performs much better. I usually use the iterator-based variants of Python functions when possible.
Imagine an example like this:
from itertools import islice,izip
w = xrange(9000000000000000000)
x = xrange(2000000000000000000)
y = xrange(9000000000000000000)
z = xrange(9000000000000000000)
# The following only returns a generator that holds an iterator for the first 100 items
# without loading that large mess of numbers into memory
first_100_items_generator = islice(izip(w,x,y,z), 100)
# Iterate through the generator and return only what you need - first 100 items
first_100_items = list(first_100_items_generator)
print(first_100_items)
Output:
[ (0, 0, 0, 0),
(1, 1, 1, 1),
(2, 2, 2, 2),
(3, 3, 3, 3),
(4, 4, 4, 4),
(5, 5, 5, 5),
(6, 6, 6, 6),
(7, 7, 7, 7),
(8, 8, 8, 8),
(9, 9, 9, 9),
(10, 10, 10, 10),
(11, 11, 11, 11)
...
...
]
So here I have four very large ranges of numbers. I used izip to zip the values, then used islice to pick out the first 100 items.
The nice thing about xrange, izip and islice is that they are all lazy, so nothing is computed until list() is finally called on the result.
It's a bit of a digression into generators, but it is good to know when you start doing large-scale data processing in Python.
Info on generators:
youtube
Generator intro
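For readers on Python 3, the same idea can be sketched without izip or xrange, since the built-in zip() and range() are already lazy there and itertools.izip no longer exists. The variable names below simply mirror the example above:

```python
from itertools import islice

# In Python 3, range() does not build a list, so these cost almost no memory.
w = range(9000000000000000000)
x = range(2000000000000000000)
y = range(9000000000000000000)
z = range(9000000000000000000)

# Nothing is computed here; islice just wraps the lazy zip iterator.
first_100_items_iter = islice(zip(w, x, y, z), 100)

# Only now are the first 100 tuples actually produced.
first_100_items = list(first_100_items_iter)
print(first_100_items[:3])  # [(0, 0, 0, 0), (1, 1, 1, 1), (2, 2, 2, 2)]
```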
Here are my lists:
projects = ["A", "B", "C"]
hours = [1,2,3]
I want my final answer to be like: {A:1,B:2,C:3}
Any suggestions?
Did you try calling the dict constructor?
dict(zip(projects,hours))
The expression zip(projects, hours) produces (key, value) tuples, which are then passed to dict, the constructor for Python's mapping type (the dictionary).
Python 2.7 and later also support dictionary comprehensions:
>>> projects = ["A", "B", "C"]
>>> hours = [1,2,3]
>>> res = {project: hour for project, hour in zip(projects, hours)}
>>> res
{'A': 1, 'B': 2, 'C': 3}
My result is {'A': 1, 'C': 3, 'B': 2}, but I want it to be exactly {'A': 1, 'B': 2, 'C': 3}. I tried sorted(), but it only printed "A, B, C" and dropped the dictionary's values.
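One possible way to keep the values, sketched here on the assumption that sorting by key is what is wanted: sorted() applied to a dictionary iterates over its keys only, whereas sorting res.items() keeps each value with its key. The names pairs and ordered are only illustrative.

```python
from collections import OrderedDict

projects = ["A", "B", "C"]
hours = [1, 2, 3]
res = dict(zip(projects, hours))

# sorted(res) would return just the keys; sorting the items keeps the values.
pairs = sorted(res.items())
print(pairs)  # [('A', 1), ('B', 2), ('C', 3)]

# An OrderedDict remembers insertion order, so building it from the
# sorted pairs gives a mapping whose keys iterate in sorted order.
ordered = OrderedDict(pairs)
for project, hour in ordered.items():
    print(project, hour)
```

On Python 3.7 and later a plain dict also preserves insertion order, so dict(pairs) would behave the same way there.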