I'm trying to replicate the format of an existing data file which has the following class structure when loaded with np.load:
<class 'numpy.ndarray'>
<class 'list'>
<class 'list'>
<class 'numpy.str_'>
It is an ndarray containing lists of lists of strings.
I'm using the following code to create the same structure (a list of lists of lists of strings), and I'm trying to convert only the outermost list into an ndarray, without also converting the inner lists into ndarrays.
captions = []
for row in attrs.iterrows():
    sorted_row = row[1].sort_values(ascending=False)
    attributes, variations = [], []
    for col, val in sorted_row[:20].iteritems():
        attributes.append([x[1] for x in word2Id if x[0] == col][0])
    variations.append(attributes)
    for i in range(9):
        variations.append(random.sample(attributes, len(attributes)))
    captions.append(variations)
np.save('train_captions.npy', captions)
When I open the resulting npy file, the class hierarchy is like this:
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
<class 'numpy.str_'>
How can I store captions in the code above so that it has the same structure as the file at the very top?
import numpy as np
letters = ["a", "b", "c", "d"]  # avoid shadowing the built-in name `list`
np.save('list.npy', letters)
read_list = np.load('list.npy').tolist()
print(read_list, type(read_list))
>>>['a', 'b', 'c', 'd'] <class 'list'>
If we don't use .tolist() the result is:
['a' 'b' 'c' 'd'] <class 'numpy.ndarray'>
When I try to replicate your code (more or less):
In [273]: captions = []
In [274]: for r in range(2):
     ...:     attributes, variations = [], []
     ...:     for c in range(2):
     ...:         attributes.append([i for i in ['a','b','c']])
     ...:     variations.append(attributes)
     ...:     for i in range(2):
     ...:         variations.append(random.sample(attributes, len(attributes)))
     ...:     captions.append(variations)
     ...:
In [275]: captions
Out[275]:
[[[['a', 'b', 'c'], ['a', 'b', 'c']],
[['a', 'b', 'c'], ['a', 'b', 'c']],
[['a', 'b', 'c'], ['a', 'b', 'c']]],
[[['a', 'b', 'c'], ['a', 'b', 'c']],
[['a', 'b', 'c'], ['a', 'b', 'c']],
[['a', 'b', 'c'], ['a', 'b', 'c']]]]
The list has several levels of nesting. When passed to np.array, the result is a 4d array of strings:
In [276]: arr = np.array(captions)
In [277]: arr.shape
Out[277]: (2, 3, 2, 3)
In [278]: arr.dtype
Out[278]: dtype('<U1')
Where possible, np.array builds an array with as many dimensions as it can.
To make an array of lists, we have to do something like:
In [279]: arr = np.empty(2, dtype=object)
In [280]: arr[0] = captions[0]
In [281]: arr[1] = captions[1]
In [282]: arr
Out[282]:
array([list([[['a', 'b', 'c'], ['a', 'b', 'c']], [['a', 'b', 'c'], ['a', 'b', 'c']], [['a', 'b', 'c'], ['a', 'b', 'c']]]),
list([[['a', 'b', 'c'], ['a', 'b', 'c']], [['a', 'b', 'c'], ['a', 'b', 'c']], [['a', 'b', 'c'], ['a', 'b', 'c']]])],
dtype=object)
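As a rough sketch of how the object-array approach above could be applied to the save/load round trip from the question: to_object_array is a hypothetical helper name, the sample captions are made up, and the temporary path is only there to keep the example self-contained. Recent NumPy versions need allow_pickle=True to load an object array. The innermost elements come back as plain str here, not numpy.str_.

```python
import os
import tempfile

import numpy as np


def to_object_array(nested):
    """Wrap only the outermost list in an ndarray; inner lists stay lists."""
    arr = np.empty(len(nested), dtype=object)
    for i, item in enumerate(nested):
        arr[i] = item  # stores the list object itself, no conversion
    return arr


# Made-up stand-in for the real captions (list of lists of lists of strings).
captions = [[['a', 'b'], ['b', 'a']], [['c', 'd'], ['d', 'c']]]
arr = to_object_array(captions)

# Save to a throwaway location and load it back.
path = os.path.join(tempfile.mkdtemp(), 'train_captions.npy')
np.save(path, arr)
loaded = np.load(path, allow_pickle=True)

print(type(loaded), type(loaded[0]), type(loaded[0][0]))
```

The loop assigns one element at a time because passing the whole nested list to np.array would again try to build a multi-dimensional string array.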
While practising Python, I have two lists:
list_a = [1, 'a', 'c', 'e', 'f']
list_b = [2, 'b', 'c', 'd', 'e']
and I want the output to be:
list_c = [3, 'a','b','c','d','e','f']
I tried:
list_c = [x + y for (x, y) in zip(list_a, list_b)]
the output is:
[3, 'ab', 'cc', 'ed', 'fe']
I also tried:
list_c = set(list_a + list_b)
the output is:
{1, 2, 'a', 'b', 'c', 'd', 'e', 'f'}
Does anyone know how to do it? The desired output is:
list_c = [3, 'a','b','c','d','e','f']
Thanks.
This is an option for your example but I'm not really sure what you want.
list_a = [1, 'a', 'c', 'e', 'f']
list_b = [2, 'b', 'c', 'd', 'e']
def merge(a, b):
    result = []
    for (r, p) in zip(a, b):
        if type(r) == type(p):
            if type(r) == int:
                # Sum the numbers; store as a string so the list can be sorted
                result.append(str(r + p))
            else:
                result.append(r)
                result.append(p)
        else:
            result.append(r)
            result.append(p)
    # Drop duplicates, then sort
    result = list(set(result))
    result.sort()
    # Turn numeric strings back into ints
    for n, k in enumerate(result):
        try:
            result[n] = int(k)
        except ValueError:
            pass
    return result
print(merge(list_a, list_b))
Prints: [3, 'a', 'b', 'c', 'd', 'e', 'f']
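A shorter sketch of the same idea, under the assumption (not confirmed by the question) that the goal is "sum of all the integers, followed by the sorted unique strings". merge_lists is a hypothetical name used only for this example.

```python
def merge_lists(a, b):
    """Return [sum of ints] followed by the sorted unique strings."""
    combined = a + b
    total = sum(x for x in combined if isinstance(x, int))
    letters = sorted({x for x in combined if isinstance(x, str)})
    return [total] + letters


list_a = [1, 'a', 'c', 'e', 'f']
list_b = [2, 'b', 'c', 'd', 'e']
print(merge_lists(list_a, list_b))
```

Unlike the zip-based version, this does not depend on the two lists having the same length or on the numbers sitting at the same position.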
I have a list and a dictionary:
list = ['a', 'b', 'c']
dict = {'1': ['a', 'd', 'e'], '2': ['b', 'c', 'f'], '3': ['b', 'a', 'e']}
I want to get the key whose value shares the most items with the list. If two keys tie for the most matches, I want both.
Assuming you're writing in Python, this is a simplistic approach: it keeps a list of the keys whose values overlap the most with the list, appending a key on a tie and starting a new list when a higher overlap is found.
l = ['a', 'b', 'c']
d = {'1': ['a', 'd', 'e'], '2': ['b', 'c', 'f'], '3': ['b', 'a', 'e']}
high = -1
key = []
for k, v in d.items():
    # Number of shared items (assumes no duplicates within l or within v)
    occ = (len(l) + len(v)) - len(set(l + v))
    if occ >= high:
        if occ == high:
            # Tie with the current best: keep both keys
            key.append(k)
        else:
            # New best: start over with this key
            key = [k]
            high = occ
print(key)
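The same tie-aware search can also be sketched with set intersection, which avoids the length arithmetic and still works when a value contains repeated items. best_matching_keys is a hypothetical helper name, not something from the question.

```python
def best_matching_keys(items, mapping):
    """Return every key whose value shares the most elements with items."""
    wanted = set(items)
    # Overlap size per key
    scores = {k: len(wanted & set(v)) for k, v in mapping.items()}
    best = max(scores.values(), default=0)
    return [k for k, score in scores.items() if score == best]


l = ['a', 'b', 'c']
d = {'1': ['a', 'd', 'e'], '2': ['b', 'c', 'f'], '3': ['b', 'a', 'e']}
print(best_matching_keys(l, d))
```

Here keys '2' and '3' each share two items with the list, so both are returned.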
If I have a dict containing {'a':'b', 'b':'c', 'c':'d'} and want to use its keys to replace the contents of list l = ['z', 'q', 'f'] with their corresponding values, how would I do it?
When I first tried to solve this problem, I figured I could enter something like list[i] = get.(i) for i in dict. That doesn't seem to work, though.
my_dict = {'a':'b', 'b':'c', 'c':'d'}
l = ['b', 'c', 'a']
new_list = [my_dict[x] for x in l]
Of course, this assumes every element of l is a key in the dictionary; otherwise the lookup raises a KeyError. Afterwards you can do l = list(new_list) if you still want to use the l variable.
The code below should take care of these corner cases:
chained keys in the dictionary, where a value is itself a key (e.g. {'a':'b', 'b':'c', 'c':'d'})
a key that repeats multiple times in the list (e.g. ['b', 'c', 'a', 'z', 'b', 'c'])
an item in the list that is not among the dictionary's keys (e.g. 'z')
Here are two solutions: one updates the same list, the other creates a new list.
Updating same list
dictionary = {'a':'b', 'b':'c', 'c':'d'}
l = ['b', 'c', 'a', 'z', 'b', 'c']
print(l)
for position, item in enumerate(l):
    if item in dictionary:
        l[position] = dictionary[item]
print(l)
Creating new list
dictionary = {'a':'b', 'b':'c', 'c':'d'}
l = ['b', 'c', 'a', 'z', 'b', 'c']
nl = []
for item in l:
    if item in dictionary.keys():
        nl.append(dictionary[item])
    else:
        nl.append(item)
print(l)
print(nl)
Sample Run
======= RESTART: C:/listByMap.py =======
['b', 'c', 'a', 'z', 'b', 'c']
['c', 'd', 'b', 'z', 'c', 'd']
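The "creating new list" variant can also be sketched in one line with dict.get, whose second argument is the fallback returned when the key is missing. The variable names below simply mirror the ones used above.

```python
dictionary = {'a': 'b', 'b': 'c', 'c': 'd'}
l = ['b', 'c', 'a', 'z', 'b', 'c']

# Look each item up once; items without a matching key are kept unchanged.
nl = [dictionary.get(item, item) for item in l]

print(l)
print(nl)
```

Because every item is looked up exactly once, chained keys such as 'a' -> 'b' -> 'c' are not followed: 'a' becomes 'b' and stops there.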
I have a huge list and want to convert it into a dictionary like this.
Sample list: ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
Output dictionary: {'a':'b', 'c':'d', 'e':'f', 'g':'h'}
I want the sequence to be intact. I read another post similar to it which uses izip from itertools. I tried using it as:
from itertools import izip
i = iter(list_name)
dic = dict(izip(i, i))
But it gives me a dictionary with the order of the pairs jumbled.
Also, the list has an even number of elements.
In Python 2 (and in Python 3 before 3.7), dicts do not preserve insertion order. You can use an OrderedDict to maintain it:
from collections import OrderedDict
from itertools import izip
i = iter(list_name)
dic = OrderedDict(izip(i, i))
Output:
In [3]: list_name = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
In [4]: i = iter(list_name)
In [5]: dic = OrderedDict(izip(i, i))
In [6]: dic
Out[6]: OrderedDict([('a', 'b'), ('c', 'd'), ('e', 'f'), ('g', 'h')])
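For reference, a sketch of the same pairing trick on Python 3, where itertools.izip no longer exists and the built-in zip is already lazy. It assumes Python 3.7 or later, where a plain dict keeps insertion order.

```python
list_name = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']

# Both arguments to zip are the SAME iterator, so each step consumes
# two consecutive elements: one for the key and one for the value.
i = iter(list_name)
dic = dict(zip(i, i))

print(dic)
```

If the list had an odd number of elements, zip would silently drop the last one.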
I am doing a minor structure manipulation using python, and have a few issues.
Currently my output is the data below.
[['a', ['b', 'c'], ['d', 'e']], ['h', ['i'], ['j']]]
I want to turn it into the structure below, but my result comes out a bit wrong. There could be multiple lists, each with a different number of entries.
(a, b, a, d), (a, c, a, e), (h, i, h, j)
What would be the best approach?
Here's a quick one:
from itertools import product, izip
data = [['a', ['b', 'c'], ['d', 'e']], ['h', ['i'], ['j']]]
result = []
for d in data:
    first = d[0]
    for v in izip(*d[1:]):
        tmp = []
        for p in product(*[first, v]):
            tmp.extend(p)
        result.append(tuple(tmp))
print result
Output:
[('a', 'b', 'a', 'd'), ('a', 'c', 'a', 'e'), ('h', 'i', 'h', 'j')]
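The snippet above is Python 2 (izip, print statement). A possible Python 3 sketch of the same transformation follows. expand is a hypothetical function name, and the product call is swapped for a plain nested generator, which also lets the leading label be longer than one character.

```python
def expand(data):
    """Pair the leading label with each zipped column of the remaining lists."""
    result = []
    for first, *groups in data:
        # zip(*groups) walks the inner lists in parallel: ('b', 'd'), ('c', 'e')
        for combo in zip(*groups):
            # Interleave the label with each value: a, b, a, d
            result.append(tuple(x for v in combo for x in (first, v)))
    return result


data = [['a', ['b', 'c'], ['d', 'e']], ['h', ['i'], ['j']]]
print(expand(data))
```

As with izip, zip stops at the shortest inner list, so extra entries in a longer list are ignored.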