Groovy groupby List generated from Map values - list

This is the List that I have
List a = ["Sep->Day02->FY21;Inter01","Sep->Day02->FY21;Inter02","Sep->Day02->FY21;Inter03","Sep->Day01->FY21;Inter18","Sep->Day01->FY21;Inter19"]
I am trying to group this and generate a new string
Expected result
"Sep->Day02->FY21",Inter01:Inter03
"Sep->Day01->FY21",Inter18:Inter19
Tried to do this
List a = ["Sep->Day02->FY21;Inter01","Sep->Day02->FY21;Inter02","Sep->Day02->FY21;Inter03","Sep->Day01->FY21;Inter18","Sep->Day01->FY21;Inter19"]
List b = []
a.each {
    b.add(it.split(";"))
}
def c = b.groupBy { it[0] }
println c
c.each { k, v ->
    println "${v}"
}
I can't find a way to get the range Inter01:Inter03 into the string. Please advise.
EDIT
The solution provided by Marmite works in the groovy console as expected. The list I am generating is from values in a map.
a.add(map[z]) Where z is the key.
When I try to use it there, I get "method not found" errors for max and min.
I tried map[z].toString() and still get the same errors. Could the fact that the values come from a map be causing this?
Code Snippet
Below is how I generate the map
def map = [:]
itr.each {
    def Per = it.getMemberName("Period") // getMemberName is a product-specific function. Sample output: May
    def Day = it.getMemberName("Day") // Sample output: Day01
    def Hour = it.getMemberName("Hour") // Sample output: Interval01
    def HourInt = it.getMemberName("Hour").reverse().take(2).reverse()
    def Year = it.getMemberName("Years")
    map.put(it.DataAsDate.format("dd/MM/yyyy")+"-"+Hour,Year+"->"+Per+"->"+Day+";"+HourInt)
}
Below is where I generate the List
def OSTinterval = OST.reverse().take(2).reverse() as Integer //This creates 01 out of Interval01
def OETinterval = OET.reverse().take(2).reverse() as Integer //This creates 03 out of Interval03
D1 = new Date(OSDDay)
D2 = new Date(OEDDay)
if (D1 == D2)
{
    (OSTinterval..OETinterval).each
    {inter ->
        z = OSDDay+"-"+"Interval"+inter.toString().padLeft(2,'0')
        coll.add(map[z].toString())
    }
}
else
{
    (D1..D2).each {
        if (it == D1){
            (OSTinterval..48).each
            {inter ->
                z = OSDDay+"-"+"Interval"+inter.toString().padLeft(2,'0')
                coll.add(map[z].toString())
            }

Basically you are looking for min and max after grouping.
I'm using a bit simplified data to focus on the processing:
def a = ['D02;I01','D02;I02','D02;I03','D01;I18','D01;I18']
println a.collect { it.split(";") }
    .groupBy { it[0] }
    .collect { k, v -> [k, v.collect { it[1] }] }
    .collect { [it[0], "${it[1].min()}:${it[1].max()}"] }
    .collect { it.join(',') }
This returns a list of the keys, each with the min and max of its values:
[D02,I01:I03, D01,I18:I18]
Step by step, this is the result after groupBy:
[D02:[[D02, I01], [D02, I02], [D02, I03]], D01:[[D01, I18], [D01, I18]]]
The next collect removes the duplicated keys
[[D02, [I01, I02, I03]], [D01, [I18, I18]]]
Finally you find the min and max of the lists
[[D02, I01:I03], [D01, I18:I18]]
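For comparison, the same group-then-min/max idea can be sketched in Python. This is only an illustration under stated assumptions: the function name summarize_ranges is hypothetical, and min/max compare the values as strings, which gives the expected order only because values such as I01 and I18 are zero-padded to the same width.

```python
from collections import defaultdict


def summarize_ranges(items, sep=";"):
    """Group 'key;value' strings by key and report min:max of the values."""
    groups = defaultdict(list)
    for item in items:
        key, value = item.split(sep, 1)
        groups[key].append(value)
    # String comparison is enough here because the values are zero-padded.
    return [f"{key},{min(values)}:{max(values)}" for key, values in groups.items()]


a = ['D02;I01', 'D02;I02', 'D02;I03', 'D01;I18', 'D01;I18']
print(summarize_ranges(a))  # ['D02,I01:I03', 'D01,I18:I18']
```

The groups come out in the order their keys first appear, because Python dictionaries preserve insertion order.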

Related

Create multiple lists and store them into a dictionary Python

Here's my situation. I have two lists:
A list which comes from a DataFrame column (OS_name)
A list with the unique values of that column (OS_values)
OS_name = df['OS_name'].tolist()
OS_values = df.OS_name.unique().tolist()
I want to create several lists (one per value in OS_values) like this :
t = []
for i in range(len(OS_name)):  # len(OS_name)-1 would skip the last element
    if OS_values[0] == OS_name[i]:
        t.append(1)
    else:
        t.append(0)
I want to create one list for each value in OS_values, store the lists in a dictionary, and finally create a DataFrame from that dictionary.
If you're able to use the value as the key it would be great, but it is not necessary.
I read that defaultdict may be helpful, but I cannot find a way to use it.
Thanks for the help and have a great day!
I solved it in the end:
dict_stable_feature = dict()
for col in df_stable:
    t1 = df[col].tolist()
    t2 = df[col].unique().tolist()
    for value in t2:
        # build one 0/1 indicator list per unique value of the column
        t = []
        for i in range(len(t1)):  # len(t1)-1 would skip the last row
            if value == t1[i]:
                t.append(1)
            else:
                t.append(0)
        cc = str(col)
        vv = "_" + str(value)
        cv = cc + vv
        dict_stable_feature[cv] = t
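A more compact version of the same idea could look like the sketch below. It uses plain Python lists so that it runs without pandas; the helper name indicator_lists and the sample values are made up for illustration. In a real DataFrame workflow, pandas' get_dummies covers a similar use case.

```python
def indicator_lists(column_name, values):
    """Return {"<column>_<value>": [1/0, ...]} with one entry per unique value."""
    # dict.fromkeys keeps the first-seen order of the unique values.
    unique_values = list(dict.fromkeys(values))
    return {
        f"{column_name}_{value}": [1 if item == value else 0 for item in values]
        for value in unique_values
    }


# Hypothetical sample data standing in for df['OS_name'].tolist()
os_name = ["linux", "windows", "linux", "mac"]
features = indicator_lists("OS_name", os_name)
print(features)
```

Every list has the same length as the input, so the resulting dictionary can be passed directly to a DataFrame constructor.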

Multiple values to same key in a Groovy Map

I am new to both coding and Groovy. I need to populate a map from the values in a list, based on a matching criterion. For example, if the 2nd character of a list value is 2, the value should go under the "Number2" key in the map. Several list values may match the same criterion. I am struggling with the code below: it works, but it always keeps only the last matching value from the list. As I understand it, a map can hold only one value per key. Is there another way to achieve this? Sorry, I'm a total rookie here. All help is appreciated. Thank you!
def map = [:]
def ent = ['123','133','124','143','125']
ent.each {
    println it.charAt(1)
}
ent.each {
    if (it.charAt(1) == '2') {
        println it.charAt(1)
        println "is in entity $it"
        map['Number2'] = it
        map.each { k, v -> println "${k}:${v}" }
    }
}
Expected Result:
['Number2':[123,124,125],'Number3':[133],'Number4':[143]]
I'm probably getting the wrong end of the stick, but do you mean:
def ent = ['123','133','124','143','125']
def map = ent.groupBy { "Number${it.charAt(1)}" }
Edit, with a pre-filter step
def ent = ['123','133','124','143','125']
// it[1] yields a String, so it can match the String literals in the list
def map = ent.findAll { it[1] in ['2', '3'] }
             .groupBy { "Number${it.charAt(1)}" }
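The underlying pattern, one key mapping to a list of values, can also be sketched in Python for comparison. The function name group_by_second_char and the allowed parameter are hypothetical; allowed plays the role of the pre-filter step.

```python
from collections import defaultdict


def group_by_second_char(entries, allowed=None):
    """Collect entries under "Number<second character>", optionally pre-filtered."""
    groups = defaultdict(list)
    for entry in entries:
        digit = entry[1]
        if allowed is None or digit in allowed:
            # Append instead of assign, so earlier matches are not overwritten.
            groups[f"Number{digit}"].append(entry)
    return dict(groups)


ent = ['123', '133', '124', '143', '125']
print(group_by_second_char(ent))
# {'Number2': ['123', '124', '125'], 'Number3': ['133'], 'Number4': ['143']}
print(group_by_second_char(ent, allowed={'2', '3'}))
# {'Number2': ['123', '124', '125'], 'Number3': ['133']}
```

The key point is the append: assigning to the same key, as in the question, keeps only the last matching value.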

How do I extract part of a tuple that's duplicate as key to a dictionary, and have the second part of the tuple as value?

I'm pretty new to Python and QGIS. Right now I'm just running scripts, but my end goal is to create a plugin.
Here's the part of the code I'm having problems with:
import math
layer = qgis.utils.iface.activeLayer()
iter = layer.getFeatures()
dict = {}
#iterate over features
for feature in iter:
    #print feature.id()
    geom = feature.geometry()
    coord = geom.asPolyline()
    points=geom.asPolyline()
    #get Endpoints
    first = points[0]
    last = points[-1]
    #Assemble Features
    dict[feature.id() ]= [first, last]
print dict
This is my result :
{0L: [(355277,6.68901e+06), (355385,6.68906e+06)], 1L: [(355238,6.68909e+06), (355340,6.68915e+06)], 2L: [(355340,6.68915e+06), (355452,6.68921e+06)], 3L: [(355340,6.68915e+06), (355364,6.6891e+06)], 4L: [(355364,6.6891e+06), (355385,6.68906e+06)], 5L: [(355261,6.68905e+06), (355364,6.6891e+06)], 6L: [(355364,6.6891e+06), (355481,6.68916e+06)], 7L: [(355385,6.68906e+06), (355501,6.68912e+06)]}
As you can see, many of the lines have a common endpoint: (355385,6.68906e+06) is shared by 7L, 4L and 0L, for example.
I would like to create a new dictionary, fetching the shared points as a key, and having the second points as value.
eg : {(355385,6.68906e+06):[(355277,6.68901e+06), (355364,6.6891e+06), (355501,6.68912e+06)]}
I have been looking through list comprehension tutorials, but without much success: most people want to delete the duplicates, whereas I would like to use them as keys (with unique IDs). Am I correct in thinking set() would still be useful?
I would be very grateful for any help, thanks in advance.
Maybe this is what you need?
dictionary = {}
for i in dict:
    for j in dict:
        c = set(dict[i]).intersection(set(dict[j]))
        if len(c) == 1:
            # ok, so now we know, that exactly one tuple exists in both
            # sets at the same time, but this one will be the key to new dictionary
            # we need the second tuple from the set to become value for this new key
            # so we can subtract the key-tuple from set to get the other tuple
            d = set(dict[i]).difference(c)
            # Now we need to get tuple back from the set
            # by doing list(c) we get list
            # and our tuple is the first element in the list, thus list(c)[0]
            c = list(c)[0]
            dictionary[c] = list(d)[0]
        else: pass
This code attaches only one tuple to the key in dictionary. If you want multiple values for each key, you can modify it so that each key would have a list of values, this can be done by simply modifying:
# some_value cannot be a set, it can be obtained with c = list(c)[0]
key = some_value
dictionary.setdefault(key, [])
dictionary[key].append(value)
So, the correct answer would be (here a is the dictionary that is called dict in the question):
dictionary = {}
for i in a:
    for j in a:
        c = set(a[i]).intersection(set(a[j]))
        if len(c) == 1:
            d = set(a[i]).difference(c)
            c = list(c)[0]
            value = list(d)[0]
            if c in dictionary and value not in dictionary[c]:
                dictionary[c].append(value)
            elif c not in dictionary:
                dictionary.setdefault(c, [])
                dictionary[c].append(value)
        else: pass
See this code:
dict={0L: [(355277,6.68901e+06), (355385,6.68906e+06)], 1L: [(355238,6.68909e+06), (355340,6.68915e+06)], 2L: [(355340,6.68915e+06), (355452,6.68921e+06)], 3L: [(355340,6.68915e+06), (355364,6.6891e+06)], 4L: [(355364,6.6891e+06), (355385,6.68906e+06)], 5L: [(355261,6.68905e+06), (355364,6.6891e+06)], 6L: [(355364,6.6891e+06), (355481,6.68916e+06)], 7L: [(355385,6.68906e+06), (355501,6.68912e+06)]}
dictionary = {}
list = []
for item in dict:
    # collect every [first, last] pair, not only those of features 0 and 1
    list.append(dict[item])
b = []
[b.append(x) for c in list for x in c if x not in b]
print b # or set(b)
res = {}
for elm in b:
    lst = []
    for item in dict:
        if dict[item][0] == elm:
            lst.append(dict[item][1])
        elif dict[item][1] == elm:
            lst.append(dict[item][0])
    res[elm] = lst
print res
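Both answers boil down to mapping each endpoint to the opposite endpoints of the lines that touch it. A single-pass sketch in Python 3 is shown below; the function names neighbours_by_endpoint and shared_only are hypothetical, the sample uses three of the lines from the question, and it assumes that shared endpoints have exactly equal coordinates, since the tuples are used as dictionary keys.

```python
from collections import defaultdict


def neighbours_by_endpoint(lines):
    """Map each endpoint to the list of opposite endpoints of lines touching it."""
    neighbours = defaultdict(list)
    for first, last in lines.values():
        neighbours[first].append(last)
        neighbours[last].append(first)
    return dict(neighbours)


def shared_only(neighbours):
    """Keep only the endpoints that belong to more than one line."""
    return {point: others for point, others in neighbours.items() if len(others) > 1}


# Three of the lines from the question (feature ids 0, 4 and 7)
lines = {
    0: [(355277, 6.68901e+06), (355385, 6.68906e+06)],
    4: [(355364, 6.6891e+06), (355385, 6.68906e+06)],
    7: [(355385, 6.68906e+06), (355501, 6.68912e+06)],
}
print(shared_only(neighbours_by_endpoint(lines)))
```

This visits every line once instead of comparing every pair of lines, which matters when the layer has many features.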

How to sort Python lists according to certain criteria

I would like to sort a list or an array using Python to achieve the following:
Say my initial list is:
example_list = ["retg_1_gertg","fsvs_1_vs","vrtv_2_srtv","srtv_2_bzt","wft_3_btb","tvsrt_3_rtbbrz"]
I would like to get all the elements that have 1 behind the first underscore together in one list and the ones that have 2 together in one list and so on. So the result should be:
sorted_list = [["retg_1_gertg","fsvs_1_vs"],["vrtv_2_srtv","srtv_2_bzt"],["wft_3_btb","tvsrt_3_rtbbrz"]]
My code:
import numpy as np
import string
example_list = ["retg_1_gertg","fsvs_1_vs","vrtv_2_srtv","srtv_2_bzt","wft_3_btb","tvsrt_3_rtbbrz"]
def sort_list(imagelist):
    # get number of wafers
    waferlist = []
    for image in imagelist:
        wafer_id = string.split(image,"_")[1]
        waferlist.append(wafer_id)
    waferlist = set(waferlist)
    waferlist = list(waferlist)
    number_of_wafers = len(waferlist)
    # create list
    sorted_list = []
    for i in range(number_of_wafers):
        sorted_list.append([])
    for i in range(number_of_wafers):
        wafer_id = waferlist[i]
        for image in imagelist:
            if string.split(image,"_")[1] == wafer_id:
                sorted_list[i].append(image)
    return sorted_list
sorted_list = sort_list(example_list)
This works, but it is really awkward and involves many for loops, which slow everything down when the lists are large.
Is there a more elegant way, using numpy or anything else?
Help is appreciated. Thanks.
I'm not sure how much more elegant this solution is; it is a bit more efficient. You could first sort the list and then go through and filter into final set of sorted lists:
example_list = ["retg_1_gertg","fsvs_1_vs","vrtv_2_srtv","srtv_2_bzt","wft_3_btb","tvsrt_3_rtbbrz"]
sorted_list = sorted(example_list, key=lambda x: x[x.index('_')+1])
result = [[]]
current_num = sorted_list[0][sorted_list[0].index('_')+1]
index = 0
for i in sorted_list:  # walk the sorted list so that equal numbers are adjacent
    if current_num != i[i.index('_')+1]:
        current_num = i[i.index('_')+1]
        index += 1
        result.append([])
    result[index].append(i)
print result
If you can make assumptions about the values after the first underscore character, you could clean it up a bit (for example, if you knew that they would always be sequential numbers starting at 1).
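Without any assumption about the numbering, a dictionary keyed on the field after the first underscore groups the names in a single pass. The sketch below is Python 3, and the function name group_by_wafer is hypothetical. Note that the ids are sorted as strings; if they can have more than one digit, sorting with key=int would be the safer choice.

```python
def group_by_wafer(names):
    """Group names by the field between the first and second underscore."""
    groups = {}
    for name in names:
        wafer_id = name.split("_")[1]
        groups.setdefault(wafer_id, []).append(name)
    # One sub-list per wafer id, ordered by id (string order).
    return [groups[wafer_id] for wafer_id in sorted(groups)]


example_list = ["retg_1_gertg", "fsvs_1_vs", "vrtv_2_srtv",
                "srtv_2_bzt", "wft_3_btb", "tvsrt_3_rtbbrz"]
print(group_by_wafer(example_list))
# [['retg_1_gertg', 'fsvs_1_vs'], ['vrtv_2_srtv', 'srtv_2_bzt'], ['wft_3_btb', 'tvsrt_3_rtbbrz']]
```

Unlike the single-character lookup above, split("_")[1] also handles ids longer than one character.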

Find all occurrences inside a list

I'm trying to implement a function to find occurrences in a list, here's my code:
def all_numbers():
    num_list = []
    c.execute("SELECT * FROM myTable")
    for row in c:
        num_list.append(row[1])
    return num_list
def compare_results():
    look_up_num = raw_input("Lucky number: ")
    occurrences = [i for i, x in enumerate(all_numbers()) if x == look_up_num]
    return occurrences
I keep getting an empty list instead of the occurrences, even when I enter a number that is in the list.
Your code does the following:
1. It fetches everything from the database. Each row is a sequence.
2. Then, it takes all these results and adds them to a list.
3. It returns this list.
4. Next, your code goes through each item in the list (remember, it's a sequence, like a tuple) and fetches the item and its index (this is what enumerate does).
5. Next, you attempt to compare the sequence with a string, and if it matches, return it as part of a list.
At #5, the script fails because you are comparing a tuple to a string. Here is a simplified example of what you are doing:
>>> def all_numbers():
...     return [(1,5), (2,6)]
...
>>> lucky_number = 5
>>> for i, x in enumerate(all_numbers()):
...     print('{} {}'.format(i, x))
...     if x == lucky_number:
...         print 'Found it!'
...
0 (1, 5)
1 (2, 6)
As you can see, at each loop, your x is the tuple, and it will never equal 5; even though actually the row exists.
You can have the database do your dirty work for you, by returning only the number of rows that match your lucky number:
def get_number_count(lucky_number):
    """ Returns the number of times the lucky_number
    appears in the database """
    c.execute('SELECT COUNT(*) FROM myTable WHERE number_column = %s', (lucky_number,))
    result = c.fetchone()
    return result[0]
def get_input_number():
    """ Get the number to be searched in the database """
    lookup_num = raw_input('Lucky number: ')
    return get_number_count(lookup_num)
raw_input is returning a string. Try converting it to a number.
occurrences = [i for i, x in enumerate(all_numbers()) if x == int(look_up_num)]
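The effect of the type mismatch can be reproduced without a database. The sketch below is Python 3 and assumes that the column holds integers; the function name find_occurrences and the sample list are made up for illustration.

```python
def find_occurrences(numbers, look_up):
    """Return the indexes at which look_up (given as text) occurs in numbers."""
    target = int(look_up)  # user input arrives as a string
    return [index for index, value in enumerate(numbers) if value == target]


numbers = [5, 6, 5, 7]  # stand-in for the values read from the table
print([i for i, x in enumerate(numbers) if x == "5"])  # [] because 5 != "5"
print(find_occurrences(numbers, "5"))                  # [0, 2]
```

int() raises ValueError for non-numeric input, so real code would probably want to catch that and ask again.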