please can someone help me with the codes for this nested list of numbers to look like the nested list of tuples below ie from pot to val.
pot = [[1,2,3,4],[5,6,7,8]]
val = [[(1,2),(2,3),(3,4)],[(5,6),(6,7),(7,8)]]
I used a grouper function but it didn't quite give me the desired result. Is there another way ? Thanks
for line in pot:
temp = []
for i in range(len(line)-1):
temp.append( (line[i],line[i+1]) )
val.append(temp)
May contain typos.
There's probably a better way to do it but I noticed there was no answer to your question and worked something out that does the job:
pot = [[1,2,3,4],[5,6,7,8]]
val = []
for sublist in pot:
temp = []
for n in range (1, len(sublist)):
temp.append((sublist[n-1], sublist[n]))
val.append(temp)
print val
prints
[[(1, 2), (2, 3), (3, 4)], [(5, 6), (6, 7), (7, 8)]]
I'm new to Python. Just started learning it because I'm working on a project that requires is, but I think this solves your question.
pot = [[1,2,3,4],[5,6,7,8]]
inner = []
val = []
a = 0
b = 0
for L in pot:
for x in range(len(L)):
if x>0:
a = L[x-1]
b = L[x]
inner.append((a,b))
val.append(inner)
inner = []
print val
My output running python 2.7 is:
[[(1, 2), (2, 3), (3, 4)], [(5, 6), (6, 7), (7, 8)]]
Related
My ultimate goal is to produce a *.csv file containing labeled binary term vectors for each document. In essence, a term document matrix.
Using gensim, I can produce a file with an unlabeled term matrix.
I do this by essentially copying and pasting code from here: http://radimrehurek.com/gensim/tut1.html
Given a list of documents called "texts".
corpus = [dictionary.doc2bow(text) for text in texts]
print(corpus)
[(0, 1), (1, 1), (2, 1)]
[(0, 1), (3, 1), (4, 1), (5, 1), (6, 1), (7, 1)]
[(2, 1), (5, 1), (7, 1), (8, 1)]
[(1, 1), (5, 2), (8, 1)]
[(3, 1), (6, 1), (7, 1)]
[(9, 1)]
[(9, 1), (10, 1)]
[(9, 1), (10, 1), (11, 1)]
[(4, 1), (10, 1), (11, 1)]
To convert the above vectors into a numpy matrix, I use:
scipy_csc_matrix = gensim.matutils.corpus2csc(corpus)
I then convert the sparse numpy matrix to a full array:
full_matrix = csc_matrix(scipy_csc_matrix).toarray()
Finally, I output this to a file:
with open('file.csv','wb') as f:
writer = csv.writer(f)
writer.writerows(full_matrix)
This produces a matrix of binomial vectors, but I do not know which vector represents which word. Is there an accurate way of matching words to vectors?
I've tried parsing the dictionary to creative a list of words which I would glue to the above full_matrix.
#Retrive dictionary
tokenIDs = dictionary.token2id
#Retrieve keys from dictionary and concotanate those to full_matrix
for key, value in tokenIDs.iteritems():
temp1 = unicodedata.normalize('NFKD', key).encode('ascii','ignore')
temp = [temp1]
dictlist.append(temp)
Keys = np.asarray(dictlist)
#Combine Keys and Matrix
labeled_full_matrix = np.concatenate((Keys, full_matrix), axis=1)
However, this does not work. The word ids (Keys) are not matched to the appropriate vectors.
I am under the assumption a much simpler and more elegant approach is possible. But after some time, I haven't been able to find it. Maybe someone here can help, or point me to something fundamental I've missed.
Is this what you want?
%time lda1 = models.LdaModel(corpus1, num_topics=20, id2word=dictionary1, update_every=5, chunksize=10000, passes=100)
import pandas
mixture = [dict(lda1[x]) for x in corpus1]
pandas.DataFrame(mixture).to_csv("output.csv")
This code below
dots = [[1,2,73,4],[5,36,7,18]]
pos = {1:(0,6), 2:(4,3),3:(7,5),4:(9,0), 5:(0,28), 6:(4,3),7:(7,5),8:(9,0)}
dot_pos = []
for k in dots:
for item in k:
if item in pos:
dot_pos.append(pos[item])
Gave:
[(0, 6), (4, 3), (9, 0), (0, 28), (7, 5)]
How then can I resolve this to get output like this:
[[(0, 6), (4, 3), (9, 0)],[(0, 28), (7, 5)]]
Thanks
For your desired OP, temporarily make a list and append the item in the list in your if block and then append the content after the inner for loop is executed:
dots = [[1,2,73,4],[5,36,7,18]]
pos = {1:(0,6), 2:(4,3),3:(7,5),4:(9,0), 5:(0,28), 6:(4,3),7:(7,5),8:(9,0)}
dot_pos = list()
for k in dots:
list_temp = list()
for item in k:
if item in pos:
list_temp.append(pos[item])
dot_pos.append(list_temp)
print dot_pos
A simple encapsulation work will be enough. After the loop executes, while you return,may be you return this:
return dot_pos
For the output you require,
return [dot_pos]
The code you provided gives [(0, 6), (4, 3), (9, 0), (0, 28), (7, 5)] as an output.
Assuming you want to have what you explained the code can look something like this:
dots = [[1,2,73,4],[5,36,7,18]]
pos = {1:(0,6), 2:(4,3),3:(7,5),4:(9,0), 5:(0,28), 6:(4,3),7:(7,5),8:(9,0)}
dot_pos = list()
for k in dots:
temp = list()
for item in k:
if item in pos:
temp.append(pos[item])
dot_pos.append(temp)
print dot_pos
I have multiple lists to work with. What I'm trying to do is to take a certain index for every list(in this case index 1,2,and 3), in a vertical column. And add those vertical numbers to an empty list.
line1=[1,2,3,4,5,5,6]
line2=[3,5,7,8,9,6,4]
line3=[5,6,3,7,8,3,7]
vlist1=[]
vlist2=[]
vlist3=[]
expected output
Vlist1=[1,3,5]
Vlist2=[2,5,6]
Vlist3=[3,7,3]
Having variables with numbers in them is often a design mistake. Instead, you should probably have a nested data structure. If you do that with your line1, line2 and line3 lists, you'd get a nested list:
lines = [[1,2,3,4,5,5,6],
[3,5,7,8,9,6,4],
[5,6,3,7,8,3,7]]
You can then "transpose" this list of lists with zip:
vlist = list(zip(*lines)) # note the list call is not needed in Python 2
Now you can access the inner lists (which in are actually tuples this now) by indexing or slicing into the transposed list.
first_three_vlists = vlist[:3]
in python 3 zip returns a generator object, you need to treat it like one:
from itertools import islice
vlist1,vlist2,vlist3 = islice(zip(line1,line2,line3),3)
But really you should keep your data out of your variable names. Use a list-of-lists data structure, and if you need to transpose it just do:
list(zip(*nested_list))
Out[13]: [(1, 3, 5), (2, 5, 6), (3, 7, 3), (4, 8, 7), (5, 9, 8), (5, 6, 3), (6, 4, 7)]
Use pythons zip() function, index accordingly.
>>> line1=[1,2,3,4,5,5,6]
>>> line2=[3,5,7,8,9,6,4]
>>> line3=[5,6,3,7,8,3,7]
>>> zip(line1,line2,line3)
[(1, 3, 5), (2, 5, 6), (3, 7, 3), (4, 8, 7), (5, 9, 8), (5, 6, 3), (6, 4, 7)]
Put your input lists into a list. Then to create the ith vlist, do something like this:
vlist[i] = [];
for l in list_of_lists:
vlist[i].append(l[i])
I'm brand new to Scala, having had very limited experience with functional programming through Haskell.
I'd like to try composing a list of all possible pairs constructed from a single input list. Example:
val nums = List[Int](1, 2, 3, 4, 5) // Create an input list
val pairs = composePairs(nums) // Function I'd like to create
// pairs == List[Int, Int]((1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (2, 1) ... etc)
I tried using zip on each element with the whole list, hoping that it would duplicate the one item across the whole. It didn't work (only matched the first possible pair). I'm not sure how to repeat an element (Haskell does it with cycle and take I believe), and I've had trouble following the documentation on Scala.
This leaves me thinking that there's probably a more concise, functional way to get the results I want. Does anybody have a good solution?
How about this:
val pairs = for(x <- nums; y <- nums) yield (x, y)
For those of you who don't want duplicates:
val uniquePairs = for {
(x, idxX) <- nums.zipWithIndex
(y, idxY) <- nums.zipWithIndex
if idxX < idxY
} yield (x, y)
val nums = List(1,2,3,4,5)
uniquePairs: List[(Int, Int)] = List((1, 2), (1, 3), (1, 4), (1, 5), (2, 3), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5))
Here's another version using map and flatten
val pairs = nums.flatMap(x => nums.map(y => (x,y)))
List[(Int, Int)] = List((1,1), (1,2), (1,3), (1,4), (1,5), (2,1), (2,2), (2,3), (2,4), (2,5), (3,1), (3,2), (3,3), (3,4), (3,5), (4,1), (4,2), (4,3), (4,4), (4,5), (5,1), (5,2) (5,3), (5,4), (5,5))
This can then be easily wrapped into a composePairs function if you like:
def composePairs(nums: Seq[Int]) =
nums.flatMap(x => nums.map(y => (x,y)))
I've got a list of integers and I want to be able to identify contiguous blocks of duplicates: that is, I want to produce an order-preserving list of duples where each duples contains (int_in_question, number of occurrences).
For example, if I have a list like:
[0, 0, 0, 3, 3, 2, 5, 2, 6, 6]
I want the result to be:
[(0, 3), (3, 2), (2, 1), (5, 1), (2, 1), (6, 2)]
I have a fairly simple way of doing this with a for-loop, a temp, and a counter:
result_list = []
current = source_list[0]
count = 0
for value in source_list:
if value == current:
count += 1
else:
result_list.append((current, count))
current = value
count = 1
result_list.append((current, count))
But I really like python's functional programming idioms, and I'd like to be able to do this with a simple generator expression. However I find it difficult to keep sub-counts when working with generators. I have a feeling a two-step process might get me there, but for now I'm stumped.
Is there a particularly elegant/pythonic way to do this, especially with generators?
>>> from itertools import groupby
>>> L = [0, 0, 0, 3, 3, 2, 5, 2, 6, 6]
>>> grouped_L = [(k, sum(1 for i in g)) for k,g in groupby(L)]
>>> # Or (k, len(list(g))), but that creates an intermediate list
>>> grouped_L
[(0, 3), (3, 2), (2, 1), (5, 1), (2, 1), (6, 2)]
Batteries included, as they say.
Suggestion for using sum and generator expression from JBernardo; see comment.