redis: matching partial keys of hash - python-2.7

In a hash, I have a bunch of key-value pairs.
My keys are in the following format: name:city
john:newyork
kate:chicago
lisa:atlanta
I'm using Python to access Redis, and in https://redis-py.readthedocs.org/en/latest/ I don't see any hash operation that does partial matching.
I would like to be able to get all keys in the hash with a given city name.
Is that possible?

It is possible, not with HASH objects but with sorted sets. As long as all elements in a sorted set have the same score, you can do lexicographical prefix matching.
Let's say you do the following (raw Redis commands, but the same applies with the Python client):
ZADD foo 0 john:newyork:<somevalue>
ZADD foo 0 john:chicago:<somevalue>
ZADD foo 0 kate:chicago:<somevalue>
....
You can then query by using ZRANGEBYLEX:
ZRANGEBYLEX foo [john: (john:\xff
This will give you all entries that start with john:, and you can extract the value with regular expressions or splitting.
Note that this is a prefix search, not a suffix search. If you want "all entries in new york", you need to reverse the order of the fields in each member of the sorted set, so that the city comes first.
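The reason the field order matters can be seen without a server. The following is only a sketch that imitates the lexicographic range query on a sorted Python list with binary search; range_by_prefix is a hypothetical helper, not a redis-py function, and the members are assumed to be plain ASCII strings.

```python
import bisect

def range_by_prefix(sorted_members, prefix):
    """Return every member that starts with prefix (hypothetical helper)."""
    # lower bound: first member >= prefix
    low = bisect.bisect_left(sorted_members, prefix)
    # upper bound: first member >= prefix + "\xff", mirroring "(john:\xff"
    high = bisect.bisect_left(sorted_members, prefix + "\xff")
    return sorted_members[low:high]

# the city comes first so that a prefix query can select by city
members = sorted(["newyork:john", "chicago:john", "chicago:kate", "atlanta:lisa"])
in_chicago = range_by_prefix(members, "chicago:")
names = [member.split(":", 1)[1] for member in in_chicago]
print(names)
```

With name:city members the same query can only select by name, which is why the answer suggests reversing the order.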

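I was able to match hash keys partially with HSCAN:
import redis
pool = redis.ConnectionPool(host='localhost', port=6379, db=0)
r = redis.StrictRedis(connection_pool=pool)
# one HSCAN call returns the next cursor plus one batch of matches;
# repeat the call with the returned cursor until the cursor is 0
print r.execute_command('HSCAN', '<hashname>', 0, 'MATCH', '*:atlanta')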
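redis-py also wraps HSCAN as hscan_iter, which follows the cursor for you. A minimal sketch, assuming r is an already connected redis-py client; fields_matching and the hash name 'people' are hypothetical names used only for illustration.

```python
def fields_matching(r, hashname, pattern):
    """Collect every hash field whose name matches a glob pattern.

    fields_matching is a hypothetical helper; r is assumed to be a
    connected redis-py client.
    """
    # hscan_iter yields (field, value) pairs and keeps issuing HSCAN
    # calls until the server reports that the scan is complete
    return [field for field, _value in r.hscan_iter(hashname, match=pattern)]

# usage, assuming a Redis server on localhost and a hash named 'people':
#   import redis
#   r = redis.StrictRedis(host='localhost', port=6379, db=0)
#   fields_matching(r, 'people', '*:atlanta')
```

The match still runs over the whole hash on the server side, so this is a convenience rather than an index.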

Sort nested dictionary in ascending order and grab outer key?

I have a dictionary that looks like:
dictionary = {'article1.txt': {'harry': 3, 'hermione': 2, 'ron': 1},
'article2.txt': {'dumbledore': 1, 'hermione': 3},
'article3.txt': {'harry': 5}}
And I'm interested in picking the article with the most occurrences of Hermione. I already have code that selects the outer keys (article1.txt, article2.txt) and the inner key hermione.
Now I want code that sorts the articles into a list in ascending order of the number of occurrences of the word hermione. In this case, I want the list ['article1.txt', 'article2.txt']. I tried it with the following code:
#these keys are generated from another part of the program
keys1 = ['article1.txt', 'article2.txt']
keys2 = ['hermione', 'hermione']
place = 0
for i in range(len(keys1)-1):
    for j in range(len(keys2)-1):
        if articles[keys1[i]][keys2[j]] > articles[keys1[i+1]][keys2[j+1]]:
            ordered_articles.append(keys1[i])
            place += 1
        else:
            ordered_articles.append(place, keys1[i])
But obviously (I'm realizing now) it doesn't make sense to iterate through the keys to check if dictionary[key] > dictionary[next_key]. This is because we would never be able to compare things not in sequence, like dictionary[key[1]] > dictionary[key[3]].
Help would be much appreciated!
It seems that what you're trying to do is sort the articles by the number of times 'hermione' occurs in them. Python has a built-in function, sorted(), that does exactly that. You can use it to sort the dictionary keys by the hermione count each of them points to.
Here's some code you can use as an example:
# filters out articles without hermione from the dictionary
# value here is the inner dict (for example: {'harry': 5})
dictionary = {key: value for key, value in dictionary.items() if 'hermione' in value}
# this function just returns the amount of hermiones in an article
# it will be used for sorting
def hermione_count(key):
    return dictionary[key]['hermione']
# dictionary.keys() is a list of the keys of the dictionary (the articles)
# key=... here means we use hermione_count as the function to sort the list
article_list = sorted(dictionary.keys(), key=hermione_count)
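For reference, here is a self-contained variant run on the dictionary from the question. It is only a sketch: articles_by_count is a hypothetical helper name, and it takes the word as a parameter instead of hard-coding hermione.

```python
dictionary = {'article1.txt': {'harry': 3, 'hermione': 2, 'ron': 1},
              'article2.txt': {'dumbledore': 1, 'hermione': 3},
              'article3.txt': {'harry': 5}}

def articles_by_count(articles, word):
    """Return article names in ascending order of how often word occurs.

    Hypothetical helper; articles that do not contain word are skipped.
    """
    with_word = [name for name in articles if word in articles[name]]
    return sorted(with_word, key=lambda name: articles[name][word])

article_list = articles_by_count(dictionary, 'hermione')
# the last element is the article with the most occurrences
most_hermione = article_list[-1]
print(article_list)
```

Because the list is ascending, the article with the most occurrences is the last element; max() with the same key function would give it directly if the full ordering is not needed.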

Applying regexp and finding the highest number in a list

I have got a list of different names. I have a script that prints out the names from the list.
import json
import urllib2
req = urllib2.Request('http://some.api.com/')
req.add_header('AUTHORIZATION', 'Token token=hash')
response = urllib2.urlopen(req).read()
json_content = json.loads(response)
for name in json_content:
    print name['name']
Output:
Thomas001
Thomas002
Alice001
Ben001
Thomas120
I need to find the max number that comes with the name Thomas. Is there a simple way to apply a regexp to all the elements that contain "Thomas" and then apply max(list) to them? The only way that I have come up with is to go through each element in the list, match the regexp for Thomas, then strip the letters and put the remaining numbers into a new list, but this seems pretty bulky.
You don't need regular expressions, and you don't need sorting. As you said, max() is fine. To be safe in case the list contains names like "Thomasson123", you can use:
names = ((x['name'][:6], x['name'][6:]) for x in json_content)
max(int(b) for a, b in names if a == 'Thomas' and b.isdigit())
The first assignment creates a generator expression, so there will be only one pass over the sequence to find the maximum.
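The same idea can be tried on the sample output directly. This is a sketch under two assumptions: the input is a plain list of strings rather than the JSON records, and max_suffix is a hypothetical helper name that takes the prefix as a parameter.

```python
def max_suffix(names, prefix):
    """Return the largest numeric suffix among names shaped like prefix + digits.

    Hypothetical helper; returns None when no name qualifies.
    """
    n = len(prefix)
    # keep only names where everything after the prefix is digits,
    # so that 'Thomasson123' is not counted as a 'Thomas' entry
    numbers = [int(name[n:]) for name in names
               if name.startswith(prefix) and name[n:].isdigit()]
    return max(numbers) if numbers else None

names = ['Thomas001', 'Thomas002', 'Alice001', 'Ben001', 'Thomas120', 'Thomasson123']
highest = max_suffix(names, 'Thomas')
print(highest)
```

Comparing the suffixes as integers keeps the result correct even if the numbers are not zero-padded to the same width.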
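You don't need to go for regex. Just store the results in a list and then apply the sorted function to it. Note that sorting the strings only gives the numeric maximum because the numbers are zero-padded to the same width.
>>> l = ['Thomas001',
...      'Thomas002',
...      'Alice001',
...      'Ben001',
...      'Thomas120']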
>>> [i for i in sorted(l) if i.startswith('Thomas')][-1]
'Thomas120'

how to neatly print a dictionary

I have a dict which contains string keys of different lengths.
I want to obtain the following result when printing the dictionary:
'short-key' value1
'short-key2' value2
...
'little-longer-key' valueXX
...
'very-very-long-keeey' valueXXXX
Until now I've been doing something like this:
for key,value in dict.iteritems():
    print key," "*(80-len(key)),value
PROBLEMS:
I don't like it. It doesn't really seem pythonic.
80 is an arbitrarily chosen number that is usually big enough. But sometimes a key may be longer than that, in which case the " "*(80-len(key)) padding is useless.
You will have to iterate twice: once to get the length of the longest key and once to print. A generator expression can make the first pass nicer. My personal preference is to only iterate on the keys of the dictionary and then do a lookup:
padded_width = max(len(x) for x in my_dict.iterkeys()) + 1
for key in my_dict:
    print(key.ljust(padded_width) + my_dict[key])
Here's a fancier version that allows more control over the padding and uses string formatting:
SPACE_BETWEEN_KEYS_AND_VALUES = 1
MINIMUM_PADDING = 10
padded_width = max(MINIMUM_PADDING, max(len(x) for x in my_dict.iterkeys()) + SPACE_BETWEEN_KEYS_AND_VALUES)
for key in my_dict:
    print("{key: <{width}}{value}".format(key=key, width=padded_width, value=my_dict[key]))
I think I prefer the string concatenation of the first example, personally.
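The two passes can also be wrapped in a function that returns the lines instead of printing them, which makes the alignment easy to check. A minimal sketch; format_aligned and its gap parameter are hypothetical names, not part of the answer above.

```python
def format_aligned(mapping, gap=1):
    """Return one line per item, with every value starting in the same column.

    Hypothetical helper; gap is the number of spaces after the longest key.
    """
    if not mapping:
        return []
    # first pass: find the longest key
    width = max(len(str(key)) for key in mapping) + gap
    # second pass: left-justify every key to that width, then append the value
    return ["{0:<{1}}{2}".format(key, width, mapping[key]) for key in mapping]

my_dict = {'short-key': 'value1', 'little-longer-key': 'valueXX'}
for line in format_aligned(my_dict):
    print(line)
```

Since the width is derived from the data, no key can ever be longer than the padding, which was the problem with the fixed 80.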

How not to order a list of pk's in a query?

I have a list of pk's and I would like to get the results in the same order in which my list is defined... But the order of the elements is being changed. Can anyone help me?
print list_ids
[31189, 31191, 31327, 31406, 31352, 31395, 31309, 30071, 31434, 31435]
obj_oportunidades=Opor.objects.in_bulk(list_ids).values()
for o in obj_oportunidades:
    print o
31395 31435 31434 30071 31309 31406 31189 31191 31352 31327
This object should be used in a template to show some results to the user... But as you can see, the order is different from the original list_ids.
It would have been nice to have this feature in SQL: sorting by a known list of values.
Instead, what you could do is:
# in_bulk() returns a dict that maps each primary key to its object
opor_by_pk = Opor.objects.in_bulk(list_ids)
# walk list_ids so the objects come out in the requested order
for pk in list_ids:
    if pk in opor_by_pk:
        print opor_by_pk[pk]
The downside is that you have to fetch all the result rows and store them before you can read them in the order you want. (The dictionary holds the table records as values and the PKeys as dict keys.)
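The dictionary-based reordering can be sketched without Django. Here a plain dict stands in for what Opor.objects.in_bulk(list_ids) returns (a dict keyed by primary key); in_requested_order and fake_rows are hypothetical names used only for illustration.

```python
def in_requested_order(objects_by_pk, ordered_pks):
    """Return the objects whose keys appear in ordered_pks, in that order.

    Hypothetical helper; keys missing from objects_by_pk are skipped.
    """
    return [objects_by_pk[pk] for pk in ordered_pks if pk in objects_by_pk]

# stand-in for the dict that in_bulk() would return
fake_rows = {31191: 'row 31191', 31189: 'row 31189', 31327: 'row 31327'}
list_ids = [31189, 31191, 31327, 99999]
ordered = in_requested_order(fake_rows, list_ids)
print(ordered)
```

Each lookup is a constant-time dict access, so the reordering costs one pass over list_ids after the rows are fetched.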
Another way: create a temp table with (Sort_Order, PKey) columns and add that to the query:
Sort_Order  PKey
1           31189
2           31191
...
So when you sort on Sort_Order, you'll get the PKeys in the order you specify.
I found a solution at http://davedash.com/2010/02/11/retrieving-elements-in-a-specific-order-in-django-and-mysql/ that suited me perfectly.
ids = [a_list, of, ordered, ids]
addons = Addon.objects.filter(id__in=ids).extra(
    select={'manual': 'FIELD(id,%s)' % ','.join(map(str,ids))},
    order_by=['manual'])
This code does something similar to MySQL's "ORDER BY FIELD".
The author of http://blog.mathieu-leplatre.info/django-create-a-queryset-from-a-list-preserving-order.html solved the problem for both MySQL and PostgreSQL.
If you are using PostgreSQL, see that page.

Comparing dictionaries with list-type values

I have the following 2 dictionaries,
d1={"aa":[1,2,3],"bb":[4,5,6],"cc":[7,8,9]}
d2={"aa":[1,2,3],"bb":[1,1,1,1,1,1],"cc":[7,8]}
How could I compare these two dictionaries and get the positions (indexes) of UNMATCHED key-value pairs? Since I am dealing with files of around 2 GB, the dictionaries contain a very large amount of data. How can this be implemented in an optimized way?
def getUniqueEntry(dictionary1, dictionary2, listOfKeys):
    assert sorted(dictionary1.keys()) == sorted(dictionary2.keys()), "Keys don't match" #check that they have the same keys
    for key in dictionary1:
        if dictionary1[key] != dictionary2[key]:
            listOfKeys.append(key)
When calling the function, the third parameter, listOfKeys, is an empty list in which the differing keys will be stored. Note that reading 2 GB worth of data into a dict requires a lot of RAM and will most likely fail.
And this is a more pythonic way. The list comprehension keeps just the keys whose values are not equal in both dictionaries:
different_keys = [key for key in d1 if d1[key] != d2[key]]
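If "positions" means the indexes inside each list, the comparison can be taken one step further. The following is a sketch under that assumption; mismatches is a hypothetical helper name, and an index that exists in only one of the two lists is counted as a mismatch.

```python
def mismatches(d1, d2):
    """Map each key whose lists differ to the indexes at which they differ.

    Hypothetical helper; a key missing from d2 is treated as an empty list.
    """
    result = {}
    for key in d1:
        a, b = d1[key], d2.get(key, [])
        if a == b:
            continue
        longest = max(len(a), len(b))
        # an index is a mismatch when the items differ or when only
        # one of the two lists is long enough to have that index
        result[key] = [i for i in range(longest)
                       if i >= len(a) or i >= len(b) or a[i] != b[i]]
    return result

d1 = {"aa": [1, 2, 3], "bb": [4, 5, 6], "cc": [7, 8, 9]}
d2 = {"aa": [1, 2, 3], "bb": [1, 1, 1, 1, 1, 1], "cc": [7, 8]}
diff = mismatches(d1, d2)
print(diff)
```

This still assumes both dictionaries fit in memory, so the RAM caveat above applies here too.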