How do I order the contents of a txt file - python-2.7

Hi, this is the code I have at the moment. It orders the following list by the age of the students (10, 12, 15), but how do I order it alphabetically by name, starting with d, etc.?
>>> student_tuples = [
...     ('john', 'A', 15),
...     ('jane', 'B', 12),
...     ('dave', 'B', 10),
... ]
>>> sorted(student_tuples, key=lambda student: student[2])
[('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]

Just sort without any key:
sorted(student_tuples)
If the names mix lower-case and capitalized spellings, sort case-insensitively:
sorted(student_tuples, key=lambda student: student[0].lower())
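To see why the case-insensitive key matters, here is a minimal sketch. It assumes one name is capitalized ('Jane'), which is not in the original data, because uppercase letters sort before lowercase ones in the default comparison.

```python
# Sample data from the question, with 'Jane' capitalized as an assumption
# to show the effect of letter case on the ordering.
student_tuples = [
    ('john', 'A', 15),
    ('Jane', 'B', 12),
    ('dave', 'B', 10),
]

# Default tuple ordering compares the first element (the name) first,
# and 'J' sorts before any lowercase letter.
by_name = sorted(student_tuples)

# Lowercasing the name in the key makes the ordering case-insensitive.
by_name_ignoring_case = sorted(student_tuples, key=lambda student: student[0].lower())

print(by_name)
print(by_name_ignoring_case)
```

The first result starts with 'Jane', the second with 'dave'.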

Related

TypeError: slice indices must be integers or None or have an __index__ method in python 2.7

for file in zip(frames_list[-round(0.2 * len(frames_list)):], masks_list[-round(0.2 * len(masks_list)):]):
    # Convert tensors to numpy arrays
    frame = frame_batches.next().numpy().astype(np.uint8)
    mask = mask_batches.next().numpy().astype(np.uint8)
    # Convert numpy arrays to images
    frame = Image.fromarray(frame)
    mask = Image.fromarray(mask)
    # Save frames and masks to correct directories
    frame.save(DATA_PATH + '{}_frames/{}'.format(dir_name, dir_name) + '/' + file[0])
    mask.save(DATA_PATH + '{}_masks/{}'.format(dir_name, dir_name) + '/' + file[1])
print("Saved {} frames to directory {}".format(len(frames_list), DATA_PATH))
print("Saved {} masks to directory {}".format(len(masks_list), DATA_PATH))
Traceback
Traceback (most recent call last):
  File "/home/khawar/Desktop/Khawar_Seg/main.py", line 190, in <module>
    generate_image_folder_structure(frame_tensors, masks_tensors, frames_list, masks_list)
  File "/home/khawar/Desktop/Khawar_Seg/main.py", line 173, in generate_image_folder_structure
    for file in zip(frames_list[-round(0.2 * len(frames_list)):], masks_list[-round(0.2 * len(masks_list)):]):
TypeError: slice indices must be integers or None or have an __index__ method
The round function in Python 2.7 returns a float, but a slice index must be an int (or None).
>>> type(round(2.0))
<type 'float'>
>>> items = [0, 1, 2, 3, 4]
>>> items[round(2.0):]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: slice indices must be integers or None or have an __index__ method
# If we cast the rounded index to an int, it will work
>>> index = int(round(2.0))
>>> type(index)
<type 'int'>
>>> items[int(round(2.0)):]
[2, 3, 4]
So, for your example code, you need to cast the indices to integers before using them in the slice -- the [<start>:<end>] part of your for loop.
frames_index = -int(round(0.2 * len(frames_list)))
masks_index = -int(round(0.2 * len(masks_list)))
for file in zip(frames_list[frames_index:], masks_list[masks_index:]):
...
To make things easier to read, I suggest that you use a function to make your index numbers:
def get_list_index(list_size):  # list_size is the len(<list>)
    float_value = round(0.2 * list_size)
    return -int(float_value)

frames_index = get_list_index(len(frames_list))
masks_index = get_list_index(len(masks_list))
for file in zip(frames_list[frames_index:], masks_list[masks_index:]):
    ...
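One edge case is worth guarding against: if the rounded count comes out as 0, the index is -0, which is just 0, and a slice starting at 0 returns the whole list rather than nothing. The sketch below is one possible way to handle that. The helper name tail_fraction and its fraction parameter are made up for illustration.

```python
def tail_fraction(items, fraction=0.2):
    """Return roughly the last `fraction` of `items` (hypothetical helper)."""
    # round() returns a float in Python 2.7, so cast it before slicing.
    count = int(round(fraction * len(items)))
    if count == 0:
        # items[-0:] is items[0:], i.e. the whole list, so handle 0 explicitly.
        return []
    return items[-count:]


print(tail_fraction(list(range(10))))  # last 2 of 10 items
print(tail_fraction([1, 2]))           # 0.2 * 2 rounds to 0, so nothing
```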
Edit:
To answer the question in your comment:
what is the meaning of : in for file in zip(frames_list[-round(0.2 * len(frames_list)):]?
The : separates the start index from the end index in python slice notation.
For instance, if you have the list ['a', 'b', 'c', 'd', 'e'] and you want only the portion from 'b' through 'd', you would use a slice starting at 1 and ending at 4, which is one more than the index of 'd'.
>>> ['a', 'b', 'c', 'd', 'e'][1:4]
['b', 'c', 'd']
Python lets you use negative indexing, so you can count back from the right side. We could write the same slice using -1 instead of 4:
>>> ['a', 'b', 'c', 'd', 'e'][1:-1]
['b', 'c', 'd']
If we wanted to have all of the items in the list, starting at 'b' and going through the end, we can change our right index to None or just leave it out:
>>> ['a', 'b', 'c', 'd', 'e'][1:None]
['b', 'c', 'd', 'e']
>>> ['a', 'b', 'c', 'd', 'e'][1:]
['b', 'c', 'd', 'e']

CSV reader putting /n after each row

I have generated a CSV file from Excel.
I am trying to read this CSV file using the Python csv module. However, after each row I get \n. How do I remove this \n?
Here is my code:
import csv

with open('/users/ps/downloads/test.csv', 'rU') as csvfile:
    spamreader = csv.reader(csvfile, dialect=csv.excel_tab)
    a = []
    for row in csvfile:
        a.append(row)
    print a
I get result like this:
['HEADER\n', 'a\n', 'b\n', 'c\n', 'd\n', 'e']
I want to have results like this:
['HEADER', 'a', 'b', 'c', 'd', 'e']
You could try replace. Note that a is a list, so a.replace('\n','') will not work; call replace on each row as you append it:
a.append(row.replace('\n',''))
You can use strip:
x = ['HEADER\n', 'a\n', 'b\n', 'c\n', 'd\n', 'e']
In [6]: def f(word):
   ...:     return word.strip()
   ...:
In [7]: map(f, x)
Out[7]: ['HEADER', 'a', 'b', 'c', 'd', 'e']
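The trailing \n appears because the loop in the question iterates over csvfile (the raw lines) rather than over spamreader. A minimal sketch of iterating the reader instead follows. The list named lines stands in for the file contents, which is an assumption based on the output shown in the question; csv.reader accepts any iterable of strings, so a real file object works the same way.

```python
import csv

# Hypothetical stand-in for the lines of test.csv (one column, as in the question).
lines = ['HEADER\n', 'a\n', 'b\n', 'c\n', 'd\n', 'e']

# Iterate over the reader, not the file: each row is a list of fields
# with the line terminator already removed.
reader = csv.reader(lines, dialect=csv.excel_tab)
a = [row[0] for row in reader if row]  # skip empty rows, keep the first field

print(a)
```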

Filter and limit on a python dictionary

Given:
obj = {}
obj['a'] = ['x', 'y', 'z']
obj['b'] = ['x', 'y', 'z', 'u', 't']
obj['c'] = ['x']
obj['d'] = ['y', 'u']
How do you select (e.g. print) the top 2 entries in this dictionary, sorted by the length of each list?
"the top 2 entries in this dictionary, sorted by the length of each list"
print(sorted(obj.values(), key=len)[:2])
The output:
[['x'], ['y', 'u']]
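The answer above returns the two shortest lists and drops the keys. If "top 2" is meant as the two longest lists, with their keys kept, a sketch along these lines may be closer to the intent (that reading of the question is an assumption):

```python
obj = {}
obj['a'] = ['x', 'y', 'z']
obj['b'] = ['x', 'y', 'z', 'u', 't']
obj['c'] = ['x']
obj['d'] = ['y', 'u']

# Sort (key, list) pairs by list length, longest first, and keep two.
longest_two = sorted(obj.items(), key=lambda item: len(item[1]), reverse=True)[:2]

print(longest_two)
```

This keeps the pairs for 'b' (5 items) and 'a' (3 items), in that order.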

Counter in Python

Is there a way to group the values in a Counter by their number of occurrences?
Example:
Let's say I have a list:
list = ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c', 'd', 'd']
When I do the Counter:
Counterlist = Counter(list)
I'll get:
Counter({'b': 3, 'a': 3, 'c': 3, 'd': 2})
Then I can look up, say, 'a' and get 3:
Counterlist['a']  # 3
But how can I look up by the count 3 instead?
Something like:
Counterlist[3]  # ['b', 'a', 'c']
Is that possible?
You can write the following
import collections
my_data = ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c', 'd', 'd']
result = collections.defaultdict(list)
for k, v in collections.Counter(my_data).items():
    result[v].append(k)
and then access result[3] to obtain the characters with that count.
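The same idea can be wrapped in a small function and checked against the sample list. The name group_by_count is made up for this sketch. The order of values inside each group depends on the Python version, so the example sorts before printing.

```python
import collections


def group_by_count(values):
    """Map each occurrence count to the list of values having that count."""
    groups = collections.defaultdict(list)
    for value, count in collections.Counter(values).items():
        groups[count].append(value)
    return dict(groups)


my_data = ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c', 'd', 'd']
groups = group_by_count(my_data)

print(sorted(groups[3]))  # values that occur three times
print(groups[2])          # values that occur twice
```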

Algorithm for sorting an array in a custom way

I was looking for an algorithm for sorting an array in a custom way but I didn't succeed in finding the proper solution to my problem. I'll describe the code in Django-like syntax but it's not necessary to limit a solution only for Django.
Let's suppose I have the following models (classes):
class Website(models.Model):
    ...

class Offer(models.Model):
    website = models.ForeignKey(Website, on_delete=models.CASCADE)
    ...
And let's suppose I have the following instances:
Offer 1 -> Website A
Offer 2 -> Website B
Offer 3 -> Website B
Offer 4 -> Website B
Offer 5 -> Website C
Offer 6 -> Website A
Offer 7 -> Website A
Offer 8 -> Website C
These instances form a sequence (array):
sequence = [Offer 1, Offer 2, Offer 3, Offer 4, Offer 5, Offer 6, Offer 7, Offer 8]
I need to sort the sequence so that Offers with the same Website never stand next to each other, while keeping the original order as close as possible.
So the sorted sequence should look like this:
sequence = [Offer 1, Offer 2, Offer 5, Offer 3, Offer 6, Offer 4, Offer 7, Offer 8]
Positive Examples:
Website A, Website B, Website A, Website C, Website A
Website A, Website B, Website C, Website B, Website C
Website A, Website B, Website A, Website B, Website A
Negative Examples:
Website A, Website B, Website B, Website A, Website B, ...
Website B, Website C, Website A, Website A, Website B, ...
Website B, Website C, Website A, Website C, Website C, ...
Thanks for any suggestion.
Try this:
def sort_custom(offers):
    sorted_offers, sorted_count, index = [], len(offers), 0
    while sorted_count > 0:
        item = offers[index]
        if not sorted_offers or sorted_offers[-1] != item:
            sorted_offers.append(item)
            sorted_count -= 1
            del offers[index]
            if index > 0: index = 0
        else:
            if index < len(offers) - 1:
                index += 1
            else:
                sorted_offers += offers
                break
    return sorted_offers
Usage:
>>> lst = ['A', 'B', 'B', 'B', 'C', 'A', 'A', 'C']
>>> sort_custom(lst)
['A', 'B', 'C', 'B', 'A', 'B', 'A', 'C']
>>> lst2 = ['C', 'A', 'C', 'A', 'C', 'A', 'A', 'A']
>>> sort_custom(lst2)
['C', 'A', 'C', 'A', 'C', 'A', 'A', 'A']
timing:
>>> # for lst = ['A', 'B', 'B', 'B', 'C', 'A', 'A', 'C']
>>> timer.repeat(3, 2000000)
[0.4880218505859375, 0.4770481586456299, 0.4776880741119385]
This should work:
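def gen_best_order(orig):
    last = None
    while len(orig) > 0:
        deli = None
        for i, m in enumerate(orig):
            # last is None on the first pass, so check it before reading .website
            if last is None or m.website != last.website:
                last = m
                deli = i
                yield m
                break
        if deli is None:
            # Every remaining item matches the previous one; yield the first anyway
            last = orig[0]
            yield orig[0]
            deli = 0
        del orig[deli]

ordered = list(gen_best_order(sequence))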
This is a generator that will try and yield elements in order, but if the next element equals the last element yielded, it will skip it. If it gets to the end of the list and there is no way to yield something that doesn't equal the previous, it just yields it anyway.
Here's an example of it working on a list of numbers:
def gen_best_order(orig):
    last = None
    while len(orig) > 0:
        deli = None
        for i, m in enumerate(orig):
            if m != last:
                last = m
                deli = i
                yield m
                break
        if deli is None:
            last = orig[0]
            yield orig[0]
            deli = 0
        del orig[deli]

nums = [1,2,3,3,4,5,5]
print 'orig:', nums
print 'reordered:', list(gen_best_order(nums))
This prints:
orig: [1, 2, 3, 3, 4, 5, 5]
reordered: [1, 2, 3, 4, 3, 5, 5]
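Both answers delete from the list they are given. Below is a sketch of the same greedy idea that works on a copy and takes a key function, so it can compare offers by website. The name spread_by_key and the (offer number, website) tuples are assumptions standing in for the Django objects. Being greedy, it can still leave equal neighbours in cases where a different arrangement would avoid them (for example B, A, A, A, C).

```python
def spread_by_key(items, key=lambda item: item):
    """Greedily reorder items so neighbours differ by key where possible."""
    remaining = list(items)  # work on a copy; the caller's list is untouched
    result = []
    last_key = object()      # sentinel that is unequal to every real key
    while remaining:
        # Index of the first item whose key differs from the previous pick;
        # fall back to index 0 when every remaining item has the same key.
        pick = next(
            (i for i, item in enumerate(remaining) if key(item) != last_key),
            0,
        )
        chosen = remaining.pop(pick)
        last_key = key(chosen)
        result.append(chosen)
    return result


# Hypothetical stand-ins for the Offer objects: (offer number, website)
offers = [(1, 'A'), (2, 'B'), (3, 'B'), (4, 'B'),
          (5, 'C'), (6, 'A'), (7, 'A'), (8, 'C')]
ordered = spread_by_key(offers, key=lambda offer: offer[1])

print([number for number, _ in ordered])
```

For the sample offers this gives the order 1, 2, 5, 3, 6, 4, 7, 8, which matches the sequence asked for in the question.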