Construct an Adjacency List from a List of edges? - python-2.7

In the context of graph algorithms, we are usually given a convenient representation of a graph (typically an adjacency list or an adjacency matrix) to operate on.
My question is: what is an efficient way to construct an adjacency list from a given list of all edges?
For the purpose of the question, assume that the edges are given as a list of tuples (as in Python) and that (a, b) denotes a directed edge from a to b.

A combination of itertools.groupby, sorting, and a dict comprehension could get you started:
from itertools import groupby
edges = [(1, 2), (2, 3), (1, 3)]
adj = {k: [v[1] for v in g] for k, g in groupby(sorted(edges), lambda e: e[0])}
# adj: {1: [2, 3], 2: [3]}
This sorts and groups the edges by their source node, and stores a list of target nodes for each source node. Now you can access all nodes adjacent to 1 via adj[1].
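If you'd rather skip the sort, a plain loop over the edges with collections.defaultdict builds the same mapping in a single O(E) pass (a minimal sketch using the same example data):
from collections import defaultdict

edges = [(1, 2), (2, 3), (1, 3)]
adj = defaultdict(list)
for src, dst in edges:
    adj[src].append(dst)  # record the directed edge src -> dst
# adj: {1: [2, 3], 2: [3]}
As a side effect of using defaultdict, a node with no outgoing edges (3 here) simply yields an empty list instead of raising a KeyError.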

Related

Python comparing a list of tuples and a list of clustered words

I need to compare two kinds of lists, as follows.
A list of words with their frequencies:
list_1=[('psicomotricita',6), ('psicomotorio',5) , ('psicomotorie',6),('psicomotore', 7),
('bella',1), ('biella',7), ('bello',3),('zorro',4)]
A list of lists, where every sublist is a cluster of words grouped by their similarity:
list_2=[['psicomotricità', 'psicomotorio', 'psicomotorie', 'psicomotore'],
        ['bella', 'biella', 'bello'],
        ['zorro']]
So, I need to loop over every sublist of list_2 in order to pick the word that appears in list_1 with the maximum frequency.
The result should be:
final_list = ['psicomotore', 'biella', 'zorro']
Is there anybody who can help me? Thanks!
After a long struggle (I'm a newbie in Python) I solved the problem above.
So, first I converted the list of tuples to a dictionary:
d = {t[0]:int(t[1]) for t in list_1}
Second, I created the following function (it needs itemgetter from the operator module, and the parameters are actually used instead of the global names so the function is self-contained):
from operator import itemgetter

def SortTuples(clusters, freq):
    final_ls = []
    for el in clusters:
        ddd = []  # (word, frequency) pairs found in this cluster
        for key, value in freq.iteritems():
            if key in el:
                ddd.append((key, value))
        z = max(ddd, key=itemgetter(1))[0]  # word with the highest frequency
        final_ls.append(z)
    return final_ls
Calling SortTuples(list_2, d) returns the list of words with the maximum frequency in each cluster:
Out: ['psicomotore', 'biella', 'zorro']
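For what it's worth, the same result can be produced more compactly with max and the dictionary's get as the key function. This is a sketch of an alternative, not the original code, and it assumes the spellings in list_2 match the keys of d (d.get(w, 0) falls back to 0 for any word missing from list_1):
d = dict(list_1)  # word -> frequency
final_list = [max(cluster, key=lambda w: d.get(w, 0)) for cluster in list_2]
# final_list: ['psicomotore', 'biella', 'zorro']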

Intersection of a set and list of dictionaries

I have a set my_set = ("a","b","c","d","z") and a list my_list=[{"a",0.5},{"c",0.6},{"b",0.9},{"z",0.5},{"m",0.0}]. I would like a list containing only the items whose keys are in my_set. In this case the result I would like is new_list=[{"a",0.5},{"c",0.6},{"b",0.9},{"z",0.5}]
The list and set are large. Is there an efficient way to accomplish this?
Assuming that that's actually a set and a list of dicts, as stated in the question, you can try this:
In [1]: my_set = set(["a","b","c","d","z"])
In [2]: my_list=[{"a":0.5},{"c":0.6},{"b":0.9},{"z":0.5},{"m":0.0}]
In [3]: [d for d in my_list if all(k in my_set for k in d)]
Out[3]: [{'a': 0.5}, {'c': 0.6}, {'b': 0.9}, {'z': 0.5}]
This simply uses a list comprehension to check that all the keys in the dicts are contained in the set. This will have complexity of O(nm), for n dicts in the list, with m keys each (m being 1 in your case) and assuming that set-lookup is always O(1).
Note, however, that you do not really need a list of dictionaries, since all the keys seem to be different (in this example, at least), so a single dictionary would be enough.
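If you do collapse the data into a single dictionary as suggested, the filtering becomes one dict comprehension (a small sketch, reusing my_set from above and assuming every key really is unique):
merged = {"a": 0.5, "c": 0.6, "b": 0.9, "z": 0.5, "m": 0.0}  # one dict instead of a list of dicts
new_dict = {k: v for k, v in merged.items() if k in my_set}  # each set lookup is O(1) on average
# new_dict: {'a': 0.5, 'c': 0.6, 'b': 0.9, 'z': 0.5}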

scala: generating tuples from a list

I have a list val l = List(4,3,2,1), and I am trying to generate a list of tuples of the form (4,3), (4,2), and so on.
Here's what I have so far:
for (i1<-0 to l.length-1;i2<-i1+1 to l.length-1) yield (l(i1),l(i2))
The output is : Vector((4,3), (4,2), (4,1), (3,2), (3,1), (2,1))
Two questions:
It generates a Vector, not a List. How are these two different?
Is this the idiomatic scala way of doing this? I am very new to Scala, so it's important to me that I learn right.
On the first part of the question: the for comprehension iterates over the ranges 0 to l.length-1 and i1+1 to l.length-1, which are IndexedSeq[Int], hence the yielded type is IndexedSeq[(Int, Int)], whose default implementation is the final class Vector.
On the second part, your approach is valid, yet consider the following, which avoids indexed access into the list:
for (List(a,b,_*) <- xs.combinations(2).toList) yield (a,b)
Note that
xs.combinations(2).toList
List(List(4, 3), List(4, 2), List(4, 1), List(3, 2), List(3, 1), List(2, 1))
and so with List(a,b,_*) we pattern-match and extract the first two elements of each nested list (the _* indicates that any additional elements are ignored). Since the iteration is over a list, the for comprehension yields a list of pairs.

Simple path queries on large graphs

I have a question about large graph data. Suppose we have a large graph with nearly 100 million edges and around 5 million nodes. In this case, what is the best graph mining platform you know of that can return all simple paths of length <= k (for k = 3, 4, 5) between any two given nodes? The main concern is the speed of getting those paths. Another thing is that the graph is directed, but we would like the program to ignore the directions when computing the paths, yet still return the actual directed edges once it finds those paths.
For instance:
a -> c <- d -> b is a valid path between nodes 'a' and 'b' of length 3.
Thanks in advance.
So this is a way to do it in networkx. It's roughly based on the solution I gave here. I'm assuming that a->b and a<-b are two distinct paths you want. I'm going to return this as a list of lists. Each sublist is the (ordered) edges of a path.
import networkx as nx
import itertools

def getPaths(G, source, target, maxLength, excludeSet=None):
    #print source, target, maxLength, excludeSet
    if excludeSet == None:
        excludeSet = set([source])
    else:
        excludeSet.add(source)  # won't allow a path starting at source to go through source again.
    if maxLength == 0:
        excludeSet.remove(source)
        return []
    else:
        if G.has_edge(source, target):
            paths = [[(source, target)]]
        else:
            paths = []
        if G.has_edge(target, source):
            paths.append([(target, source)])
        # neighbors_iter is a big iterator that will give (neighbor, edge) for each successor of source and then for each predecessor of source.
        neighbors_iter = itertools.chain(
            ((neighbor, (source, neighbor)) for neighbor in G.successors_iter(source) if neighbor != target),
            ((neighbor, (neighbor, source)) for neighbor in G.predecessors_iter(source) if neighbor != target))
        # note that if a neighbor is both a predecessor and a successor, it shows up twice in this iteration.
        paths.extend([[edge] + path for (neighbor, edge) in neighbors_iter if neighbor not in excludeSet
                      for path in getPaths(G, neighbor, target, maxLength - 1, excludeSet)])
        excludeSet.remove(source)  # when we move back up the recursion, don't want to exclude this source any more
        return paths

G = nx.DiGraph()
G.add_edges_from([(1, 2), (2, 3), (1, 3), (1, 4), (3, 4), (4, 3)])
print getPaths(G, 1, 3, 2)
# output: [[(1, 3)], [(1, 2), (2, 3)], [(1, 4), (4, 3)], [(1, 4), (3, 4)]]
I would expect that by modifying the dijkstra algorithm in networkx you'll arrive at a more efficient algorithm (note that the dijkstra algorithm has a cutoff, but by default it's only going to return the shortest path, and it's going to follow edge direction).
Here's an alternative version of the whole paths.extend thing:
paths.extend( [[edge] + path for (neighbor,edge) in neighbors_iter if neighbor not in excludeSet for path in getPaths(G,neighbor,target,maxLength-1,excludeSet) if len(path)>0 ] )
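If the built-in path generators are acceptable, a much shorter route is to search an undirected copy of the graph with nx.all_simple_paths and map each step back to a directed edge. This is only a sketch, not the answer above, and note that when both a->b and b->a exist it reports just one of them:
UG = G.to_undirected()  # ignore edge direction while searching
for node_path in nx.all_simple_paths(UG, source=1, target=3, cutoff=2):
    # recover a directed edge behind each undirected step
    directed = [(u, v) if G.has_edge(u, v) else (v, u) for u, v in zip(node_path, node_path[1:])]
    print directed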
I would recommend using Gephi; it is easy to handle and learn.
If you find it tough, though, Neo4j will handle your requirement with a little bit of coding.

In Python, given a list of tuples, generate a list whose elements are the sums of the elements of the contained tuples

Given a list of tuples, generate a list whose elements are the sums of the elements of the contained tuples.
E.g. Input: [(1, 7), (1, 3), (3, 4, 5), (2, 2)]
Output: [8, 4, 12, 4]
This is a simple question, and anyone with basic knowledge of Python can do it.
a = input('Enter the list of tuples')  # in Python 2.7, input() evaluates the typed literal, e.g. [(1, 7), (1, 3), (3, 4, 5), (2, 2)]
b = []
for i in range(len(a)):
    b.append(sum(a[i]))  # sum the elements of each tuple
print b
I have not checked for simpler answers; feel free to look for them. And please do use the Python shell, since you can easily try out small pieces of Python code in it.
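A shorter, arguably more idiomatic variant is a list comprehension over the tuples themselves (just a sketch with the sample data hard-coded instead of read from input):
a = [(1, 7), (1, 3), (3, 4, 5), (2, 2)]
b = [sum(t) for t in a]  # sum each tuple directly, no index bookkeeping
print b  # [8, 4, 12, 4]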