Applying regexp and finding the highest number in a list - python-2.7

I have got a list of different names. I have a script that prints out the names from the list.
import json
import urllib2

req = urllib2.Request('http://some.api.com/')
req.add_header('AUTHORIZATION', 'Token token=hash')
response = urllib2.urlopen(req).read()
json_content = json.loads(response)
for name in json_content:
    print name['name']
Output:
Thomas001
Thomas002
Alice001
Ben001
Thomas120
I need to find the max number that comes with the name Thomas. Is there a simple way to apply a regexp to all the elements that contain "Thomas" and then apply max(list) to them? The only way I have come up with is to go through each element in the list, match a regexp for Thomas, strip the letters, and put the remaining numbers into a new list, but this seems pretty bulky.

You don't need regular expressions, and you don't need sorting. As you said, max() is fine. To be safe in case the list contains names like "Thomasson123", you can use:
names = ((x['name'][:6], x['name'][6:]) for x in json_content)
max(int(b) for a, b in names if a == 'Thomas' and b.isdigit())
The first assignment creates a generator expression, so there will be only one pass over the sequence to find the maximum.
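Since the question asked about regular expressions, here is a minimal sketch of that route for comparison. The helper name max_number_for and the hard-coded names list are assumptions for illustration; in the original script the names would come from json_content.

```python
import re


def max_number_for(names, prefix):
    """Return the largest numeric suffix among names of the form <prefix><digits>."""
    # Anchoring at both ends rejects names such as 'Thomasson123'.
    pattern = re.compile(r'^' + re.escape(prefix) + r'(\d+)$')
    numbers = [int(m.group(1)) for m in map(pattern.match, names) if m]
    # Return None instead of raising when nothing matches.
    return max(numbers) if numbers else None


# Hypothetical sample data mirroring the output shown in the question.
names = ['Thomas001', 'Thomas002', 'Alice001', 'Ben001', 'Thomas120', 'Thomasson123']
print(max_number_for(names, 'Thomas'))  # 120
```

Unlike a bare max() over a generator, this version does not raise ValueError when no name matches the prefix.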

You don't need to go for regex. Just store the results in a list and then apply the sorted function to it.
>>> l = ['Thomas001',
...      'Thomas002',
...      'Alice001',
...      'Ben001',
...      'Thomas120']
>>> [i for i in sorted(l) if i.startswith('Thomas')][-1]
'Thomas120'
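One caveat, shown as a small sketch with made-up names: sorted() compares strings character by character, so the sorting approach relies on every number being zero-padded to the same width. Comparing the numeric suffix as an integer avoids that assumption, provided everything after the prefix is digits.

```python
# Hypothetical names without zero padding.
unpadded = ['Thomas9', 'Thomas10', 'Alice1']
thomas = [n for n in unpadded if n.startswith('Thomas')]

# Lexicographic order puts 'Thomas10' before 'Thomas9' because '1' < '9'.
last_by_text = sorted(thomas)[-1]

# Comparing the suffix as an integer gives the numerically largest entry.
last_by_number = max(thomas, key=lambda n: int(n[len('Thomas'):]))

print(last_by_text)    # Thomas9
print(last_by_number)  # Thomas10
```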

Related

How to create new column that parses correct values from a row to a list

I am struggling to create a formula in Power BI that would split a single row's value into a list of the values that I want.
So I have a column that is called ID and it has values such as:
"ID001122, ID223344" or "IRRELEVANT TEXT ID112233, MORE IRRELEVANT;ID223344 TEXT"
What is important is to keep the ID and the 6 digits after it. The first example would turn into a list like this: {"ID001122","ID223344"}. The second example would look the same, just with all the irrelevant text in between removed.
I was looking for some kind of loop formula where you could use the text find function to locate where each ID starts and the middle function to extract 8 characters from that point, but I made no progress finding one. I tried making lists by splitting on the comma separator, but I noticed that not all rows use commas to separate IDs.
The end result would be that the original value is in one column, next to the list of parsed values, which could then be expanded to new rows.
ID                               | Parsed ID
"Random ID123456, Text;ID23456"  | List {"ID123456","ID23456"}
Does anyone have experience with this?
Hey, I found the answer myself with the help of a good article on a similar problem.
Here is my solution, without any further text parsing, which I can do later on.
each let
    PosList = Text.PositionOf([ID], "ID", Occurrence.All),
    List = List.Transform(PosList, (x) => Text.Middle([ID], x, 8))
in List
For example, this would turn "(ID343137,ID352973) ID358388" into {ID343137,ID352973,ID358388}.
It ended up being easier than I thought. I suppose the solution relied on lists once again!
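The same idea, sketched in Python rather than Power Query M purely to illustrate the logic: find every position of the marker and take a fixed-width slice starting at each one. The function name extract_ids and its parameters are hypothetical.

```python
def extract_ids(text, marker='ID', width=8):
    """Collect a fixed-width slice starting at every occurrence of marker."""
    ids = []
    start = text.find(marker)
    while start != -1:
        ids.append(text[start:start + width])
        # Continue searching just past the current match.
        start = text.find(marker, start + 1)
    return ids


print(extract_ids('(ID343137,ID352973) ID358388'))
# ['ID343137', 'ID352973', 'ID358388']
```

Like the M version, this does no validation: a marker followed by fewer than six digits still yields whatever characters come next.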

Why does random.sample() add square brackets and single quotes to the item sampled?

I'm trying to sample an item (which is one of the keys in a dictionary) from a list and later use the index of that item to find its corresponding value (in the same dictionary).
questions = list(capitals.keys())
answers = list(capitals.values())
for q in range(10):
    queswrite = random.sample(questions, 1)
    number = questions.index(queswrite)
    crtans = answers[number]
Here, capitals is the original dictionary from which the states (keys) and capitals (values) are being sampled.
But apparently the random.sample() method adds square brackets and single quotes to the sampled item, which prevents it from being used to look up the corresponding value in the other list.
Traceback (most recent call last):
  File "F:\test.py", line 30, in <module>
    number = questions.index(queswrite)
ValueError: ['Delaware'] is not in list
How can I prevent this?
random.sample() returns a list, containing the number of elements you requested. See the documentation:
Return a k length list of unique elements chosen from the population sequence or set. Used for random sampling without replacement.
If you want to pick just one element, you don't want a sample; you want to choose just one. For that, use the random.choice() function instead:
question = random.choice(questions)
However, given that you are using a loop, you probably really wanted to get 10 unique questions. Don't loop over range(10); instead, pick a sample of 10 random questions. That's exactly what random.sample() does for you:
for question in random.sample(questions, 10):
    # pick the answer for this question.
Next, putting both keys and values into two separate lists, then using the index of one to find the other is... inefficient and unnecessary; the keys you pick can be used directly to find the answers:
questions = list(capitals)
for question in random.sample(questions, 10):
    crtans = capitals[question]
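Putting the pieces together, here is a minimal runnable sketch. The four-entry capitals dictionary is an assumption standing in for the asker's full data, so the sample size is 3 rather than 10.

```python
import random

# Hypothetical stand-in for the asker's dictionary of states and capitals.
capitals = {
    'Delaware': 'Dover',
    'Texas': 'Austin',
    'Ohio': 'Columbus',
    'Utah': 'Salt Lake City',
}
questions = list(capitals)

# random.sample() always returns a list, even when k is 1.
one_item_list = random.sample(questions, 1)

# random.choice() returns the element itself.
single = random.choice(questions)

# Draw unique questions and look each answer up directly by key.
quiz = [(state, capitals[state]) for state in random.sample(questions, 3)]
for state, capital in quiz:
    print('%s -> %s' % (state, capital))
```

Because the dictionary is indexed by the sampled key, no parallel answers list or index lookup is needed.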

How to subtract unknown strings from list in python

I am trying to write a program where you say to it "from now on call me Jason", and it converts that into a list and subtracts everything but Jason from the list. I managed to make this work, but I want it to also subtract words that aren't in the list right now, so that they would be removed too if they were there.
You haven't posted any code, so here is how I would do it.
names = set(['John','Jason','Jim'])
callme = 'Jason'
names.intersection(set([callme]))
Alternatively, with a list comprehension:
names = ['John','Jason','Jim']
callme = ['Jason']
[N for N in names if N in callme]
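A hedged sketch of how this might apply to the sentence from the question, assuming the goal is to keep only the words that are known names. The known_names set and the sentence variable are illustrative, since the question included no code.

```python
# Hypothetical set of names the program knows about.
known_names = {'John', 'Jason', 'Jim'}
sentence = 'from now on call me Jason'

# Keep only the words that are known names; everything else is dropped,
# whichever filler words the sentence happens to contain.
matches = [word for word in sentence.split() if word in known_names]

print(matches)  # ['Jason']
```

Filtering by what to keep, rather than listing every word to remove, means unexpected words need no special handling.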

Compare a portion of String value present in 2 Lists

The code below extracts a particular value from the list srchlist and checks for that value in the list rplzlist. The contents of srchlist and rplzlist look like this:
srchlist = ["DD='A'\n", "SOUT='*'\n", 'PGM=FTP\n', 'PGM=EMAIL']
rplzlist = ['A=ZZ.VVMSSB\n', 'SOUT=*\n', 'SALEDB=TEST12']
I am extracting the characters after the '=' (equals) sign and within the single quotes using a combination of the split and translate functions.
Of the elements in srchlist, only 'SOUT' matches an entry in rplzlist.
Please let me know why the code below does not work, and suggest a better approach for comparing part of a string present in the lists.
for ele in srchlist:
    sYmls = ele.split('=')
    vAlue = sYmls[1].translate(None,'\'')
    for elem in rplzlist:
        rPls = elem.split('=')
        if vAlue in rPls:
            print("vAlue")
Here is a more Pythonic approach for what you want to do:
>>> list(set([(i.split('='))[1].translate(None,'\'') for i in srchlist]) & set([j.split('=')[1] for j in rplzlist]))
['*\n']
I used set() and then converted the whole output to a list; you may use .join() instead.
Inside set() there is a list comprehension, which is generally faster than the equivalent for loop.
Another solution, using join(), with replace() in place of translate():
>>> "".join(set([(i.split('='))[1].replace('\'','') for i in srchlist]) & set([j.split('=')[1] for j in rplzlist]))
'*\n'
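Both one-liners keep the trailing newline in the result, and the str.translate(None, ...) form only works on Python 2. A minimal sketch that normalizes each entry first and runs on Python 2 and 3 alike; the helper name value_of is an assumption for illustration.

```python
def value_of(entry):
    """Return the text after '=', without the newline or surrounding quotes."""
    _, _, value = entry.strip().partition('=')
    return value.strip("'")


srchlist = ["DD='A'\n", "SOUT='*'\n", 'PGM=FTP\n', 'PGM=EMAIL']
rplzlist = ['A=ZZ.VVMSSB\n', 'SOUT=*\n', 'SALEDB=TEST12']

# Set intersection keeps the values present on the right-hand side of both lists.
common = sorted(set(map(value_of, srchlist)) & set(map(value_of, rplzlist)))
print(common)  # ['*']
```

As for the loop in the question, note that print("vAlue") prints the literal text vAlue rather than the variable's contents.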

How to read each element within a tuple from a list

I want to write a program that reads in a list of tuples, where each tuple contains two elements. The first element can be an Object, and the second element is the quantity of that Object, like this: Mylist([{Object1,Numbers},{Object2, Numbers}]).
Then I want to read the Numbers, print the related Object Numbers times, and store the results in a list.
So given Mylist([{lol, 3},{lmao, 2}]), I should get [lol, lol, lol, lmao, lmao] as the final result.
My thought is to first unzip those tuples (imagine there are more than 2) into two tuples, where the first contains the Objects and the second contains the quantities.
After that, I would read the numbers in the second tuple and print the related Object from the first tuple that many times. But I don't know how to do this. Thanks for any help!
A list comprehension can do that:
lists:flatten([lists:duplicate(N,A) || {A, N} <- L]).
If you really want printing too, use recursion:
p([]) -> [];
p([{A,N}|T]) ->
    FmtString = string:join(lists:duplicate(N,"~p"), " ")++"\n",
    D = lists:duplicate(N,A),
    io:format(FmtString, D),
    D++p(T).
This code creates a format string for io:format/2 using lists:duplicate/2 to replicate the "~p" format specifier N times, joins them with a space with string:join/2, and adds a newline. It then uses lists:duplicate/2 again to get a list of N copies of A, prints those N items using the format string, and then combines the list with the result of a recursive call to create the function result.
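For readers more familiar with Python, the same expansion can be sketched as a nested comprehension. This is only a comparison, not a translation of the Erlang code above; the function name expand is hypothetical and the printing step is left out.

```python
def expand(pairs):
    """Repeat each item as many times as its paired count, in order."""
    return [item for item, count in pairs for _ in range(count)]


result = expand([('lol', 3), ('lmao', 2)])
print(result)  # ['lol', 'lol', 'lol', 'lmao', 'lmao']
```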