Making query from a list with varying amount of values - django

I have for example the following lists:
list1 = ['blue', 'red']
list2 = ['green', 'yellow', 'black']
How can I build a query that searches for all the values in a list (using PostgreSQL)?
The following works, but it is hardcoded for exactly two values, so it cannot handle lists of varying length.
entry.objects.annotate(search=SearchVector('colors')).filter(search=SearchQuery('blue') | SearchQuery('red'))
I tried the following to build a 'search_query' string and place it in my query:
search_string = ''
for c in color:
    search_string += "SearchQuery('" + c + "') | "
# drop the trailing " | "
search_string = search_string[:-3]
This results in the following search strings:
SearchQuery('blue') | SearchQuery('red')
SearchQuery('green') | SearchQuery('yellow') | SearchQuery('black')
If I now put this string in my query, it does not work.
entry.objects.annotate(search=SearchVector('colors')).filter(search=search_string)
I appreciate any help to solve this problem.
Link to django postgres search documentation: https://docs.djangoproject.com/en/1.11/ref/contrib/postgres/search/

I don't know about PostgreSQL specifically, but a basic Django lookup may be enough:
# The ORM translates this lookup to SQL for whichever database backend is configured (I think)
res = Model.objects.filter(color__in=list1)
more_res = Model.objects.filter(Q(color__in=list1) | Q(color__in=list2))
These are plain Django queries; I don't know whether they fit your case.

Your search_string is not working because it is, well, a string.
You can use eval to make it work. Try this:
entry.objects.annotate(search=SearchVector('colors')).filter(search=eval(search_string))
It is not the best way though, because eval will execute arbitrary code if the values come from user input. Take a look at this
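One way to avoid both the string and eval is to combine the query objects themselves with the | operator, for instance through functools.reduce. The sketch below uses a small stand-in class named Term (hypothetical, only there so the snippet runs without a database); the helper name any_of is also made up for illustration.

```python
from functools import reduce
import operator


class Term:
    """Hypothetical stand-in for SearchQuery: records how terms are OR-ed."""

    def __init__(self, text):
        self.text = text

    def __or__(self, other):
        return Term("(%s | %s)" % (self.text, other.text))


def any_of(values, factory):
    """Build factory(v1) | factory(v2) | ... for a non-empty list of values."""
    return reduce(operator.or_, (factory(v) for v in values))


list1 = ['blue', 'red']
list2 = ['green', 'yellow', 'black']

combined1 = any_of(list1, Term)
combined2 = any_of(list2, Term)
print(combined1.text)
print(combined2.text)
```

Since SearchQuery objects can be joined with | (as in the hardcoded query from the question), passing SearchQuery as the factory, i.e. filter(search=any_of(list1, SearchQuery)), should presumably work the same way. Note that reduce raises a TypeError on an empty list, so an empty input needs its own guard.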

Related

Django Exclude Model Objects from query using a list

Having trouble with my query excluding results from a different query.
I have a table - Segment that I have already gotten entries from. It is related
to another table - Program, and I want to also run the same query on it but I want to exclude
any of the programs that were already found during the segment query.
When I try to do it, the list isn't allowed to be used in the comparison... See below:
query = "My Query String"
segment_results = Segment.objects.filter(
    Q(title__icontains=query) |
    Q(author__icontains=query) |
    Q(voice__icontains=query) |
    Q(library__icontains=query) |
    Q(summary__icontains=query)).distinct()
# There can be multiple segments in the same program
unique_programs = []
for segs in segment_results:
    if segs.program.pk not in unique_programs:
        unique_programs.append(segs.program.pk)
program_results = ((Program.objects.filter(
    Q(title__icontains=query) |
    Q(library__icontains=query) |
    Q(mc__icontains=query) |
    Q(producer__icontains=query) |
    Q(editor__icontains=query) |
    Q(remarks__icontains=query)).distinct()) &
    (Program.objects.exclude(id__in=[unique_programs])))
I can run:
for x in unique_programs:
    p = Program.objects.filter(id=x)
    print("p = %s" % p)
That gives me a list of Programs, so the lookup itself works.
I'm just not sure how to incorporate this logic into the results
query and have it exclude at the same time. I tried the exclude keyword,
but the main problem is that it doesn't like the list being in the query. I get an
error:
TypeError: int() argument must be a string, a bytes-like object or a number, not 'list'.
Feel like I am close...
The answer is simple, I was not comparing objects correctly in the filters, so
the correct statement would be:
program_results = (Program.objects.filter(
    Q(title__icontains=query) |
    Q(library__icontains=query) |
    Q(mc__icontains=query) |
    Q(producer__icontains=query) |
    Q(editor__icontains=query) |
    Q(remarks__icontains=query)) &
    (Program.objects.exclude(id__in=Program.objects.filter(id__in=unique_programs))))
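The TypeError in the question most likely comes from the extra brackets in id__in=[unique_programs]: that passes a one-element list whose single element is itself a list, and a list cannot be converted to an integer key. The plain-Python sketch below reproduces the same failure outside the ORM; the primary keys and the to_ints helper are made up for illustration.

```python
# Hypothetical primary keys collected from the segment query.
unique_programs = [3, 5, 8]

# Wrapping the list again yields ONE element, which is itself a list.
nested = [unique_programs]


def to_ints(values):
    """Mimic converting each lookup value to an integer primary key."""
    return [int(v) for v in values]


flat_ok = to_ints(unique_programs)  # every element is already an int

try:
    to_ints(nested)  # tries int([3, 5, 8]), which is not valid
    nested_failed = False
except TypeError:
    nested_failed = True

print(flat_ok, nested_failed)
```

Under that reading, passing the list directly, as in Program.objects.exclude(id__in=unique_programs), should be enough without the inner filter.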

Django filter by query result

In my fixturesquery below I am filtering by the results of teamsquery. It works, but only for the first result of teamsquery, so it only outputs the fixtures for the first team matching userteams__userID=request.user.
teamsquery = Team.objects.filter(userteams__userID=request.user)
fixturesquery = Fixtures.objects.filter(Q(hometeamID=teamsquery) |
                                        Q(awayteamID=teamsquery))
How do I fix it so it outputs the fixtures for all the results of teamsquery?
If I understand correctly, your user can have multiple teams, right?
If so, you can use:
teamsquery = Team.objects.filter(userteams__userID=request.user)
fixturesquery = Fixtures.objects.filter(Q(hometeamID__in=teamsquery) |
                                        Q(awayteamID__in=teamsquery))

Django - Full text search - Wildcard

Is it possible to use wildcards in Django full text search?
https://docs.djangoproject.com/en/1.11/ref/contrib/postgres/search/
post = request.POST.get('search')
query = SearchQuery(post)
vector = SearchVector('headline', weight='A') + SearchVector('content', weight='B')
rank = SearchRank(vector, query, weights=[0.1,0.2])
data = wiki_entry.objects.annotate(rank=SearchRank(vector,query)).filter(rank__gte=0.1).order_by('-rank')
At the moment it only matches on full words.
Characters like * % | & have no effect.
Or do I have to go back to icontains?
https://docs.djangoproject.com/en/1.11/ref/models/querysets/#icontains
Any help is appreciated
I extended Django's SearchQuery class and overrode as_sql so that it uses to_tsquery instead of plainto_tsquery. I did some simple tests and it works. I will get back here if I find cases where this causes problems.
from django.contrib.postgres.search import SearchQuery

class MySearchQuery(SearchQuery):
    def as_sql(self, compiler, connection):
        params = [self.value]
        if self.config:
            config_sql, config_params = compiler.compile(self.config)
            template = 'to_tsquery({}::regconfig, %s)'.format(config_sql)
            params = config_params + [self.value]
        else:
            template = 'to_tsquery(%s)'
        if self.invert:
            template = '!!({})'.format(template)
        return template, params
Now I can do something like query = MySearchQuery('whatever:*')
[Postgres' part] The Postgres manual mentions this only briefly (https://www.postgresql.org/docs/current/static/textsearch-controls.html#TEXTSEARCH-PARSING-QUERIES), but yes, it is possible, if you just need prefix matching:
test=# select to_tsvector('abcd') @@ to_tsquery('ab:*');
 ?column?
----------
 t
(1 row)
test=# select to_tsvector('abcd') @@ to_tsquery('ac:*');
 ?column?
----------
 f
(1 row)
Such a query will also use a GIN index (I assume you have one).
[Django's part] I'm not a Django user, so I did some quick research and found that, unfortunately, Django uses the plainto_tsquery() function, not to_tsquery(): https://docs.djangoproject.com/en/1.11/_modules/django/contrib/postgres/search/#SearchQuery
plainto_tsquery() is made for simplicity, for when the input is just plain text, so it doesn't support advanced queries:
test=# select to_tsvector('abcd') @@ plainto_tsquery('ab:*');
 ?column?
----------
 f
(1 row)
test=# select to_tsvector('abcd') @@ plainto_tsquery('ac:*');
 ?column?
----------
 f
(1 row)
So in this case, I'd recommend using plain SQL with to_tsquery(). But you need to be sure you have filtered out all special characters (like & or |) from your text input, otherwise to_tsquery() will produce wrong results or even raise an error. Or, if you can, extend django.contrib.postgres.search with the ability to work with to_tsquery() (this would be a great contribution, by the way).
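The filtering step could look roughly like the sketch below: keep only word characters from the user's text and append :* to each word, so that no operator typed by the user reaches to_tsquery(). The function name to_prefix_tsquery and the choice to join the words with & (all words must match) are assumptions made for illustration.

```python
import re


def to_prefix_tsquery(user_input):
    """Turn free text into a to_tsquery string that prefix-matches every word.

    Only word characters (letters, digits, underscore) are kept, so
    operators such as & | ! ( ) : * typed by the user are dropped.
    """
    words = re.findall(r'\w+', user_input)
    return ' & '.join(word + ':*' for word in words)


print(to_prefix_tsquery('ab'))
print(to_prefix_tsquery('foo & bar|baz'))
```

An empty or punctuation-only input yields an empty string, so the caller would probably want to skip the search in that case rather than pass it on.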
Alternatives are:
if your data is ASCII-only, you can use LIKE with prefix search and a B-tree index created with the text_pattern_ops / varchar_pattern_ops operator classes (if you need case-insensitivity, use a functional index over lower(column_name) and query with lower(column_name) like '...%'; see https://www.postgresql.org/docs/9.6/static/indexes-opclass.html);
use a pg_trgm index, which supports regular expressions and can be built as a GiST or GIN index (https://www.postgresql.org/docs/9.6/static/pgtrgm.html)

How to get 3 unique values using random.randint() in python?

I am trying to populate a list in Python 3 with 3 random items read from a file using a regex, but I keep getting duplicate items in the list.
Here is an example.
import re
import random as rn
data = '/root/Desktop/Selenium[FILTERED].log'
with open(data, 'r') as inFile:
    index = inFile.read()
    URLS = re.findall(r'https://www\.\w{1,10}\.com/view\?i=\w{1,20}', index)
    list_0 = []
    for i in range(3):
        list_0.append(URLS[rn.randint(1, 30)])
    inFile.close()
for i in range(len(list_0)):
    print(list_0[i])
What would be the cleanest way to prevent duplicate items being appended to the list?
(EDIT)
This is the code that I think has done the job quite well.
def random_sample(data):
    r_e = ['https://www\.\w{1,10}\.com/view\?i=\w{1,20}', '..']
    with open(data, 'r') as inFile:
        urls = re.findall(r'%s' % r_e[0], inFile.read())
        x = list(set(urls))
        inFile.close()
    return x
data = '/root/Desktop/[TEMP].log'
sample = random_sample(data)
for i in range(3):
    print(sample[i])
Unordered collection with no duplicate entries.
Use the builtin random.sample.
random.sample(population, k)
Return a k length list of unique elements chosen from the population sequence or set.
Used for random sampling without replacement.
Addendum
After seeing your edit, it looks like you've made things much harder than they have to be. I've hard-wired a list of URLs in the following, but the source doesn't matter. Selecting the (guaranteed unique) subset is essentially a one-liner with random.sample:
import random
# the following two lines are easily replaced
URLS = ['url1', 'url2', 'url3', 'url4', 'url5', 'url6', 'url7', 'url8']
SUBSET_SIZE = 3
# the following one-liner yields the randomized subset as a list
urlList = [URLS[i] for i in random.sample(range(len(URLS)), SUBSET_SIZE)]
print(urlList) # produces, e.g., => ['url7', 'url3', 'url4']
Note that by using len(URLS) and SUBSET_SIZE, the one-liner that does the work is not hardwired to the size of the set nor the desired subset size.
Addendum 2
If the original list of inputs contains duplicate values, the following slight modification will fix things for you:
URLS = list(set(URLS)) # this converts to a set for uniqueness, then back for indexing
urlList = [URLS[i] for i in random.sample(range(len(URLS)), SUBSET_SIZE)]
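Or more directly, sampling the elements themselves instead of their indices (random.sample needs a sequence; passing a set was deprecated in Python 3.9 and raises a TypeError from 3.11 on, so convert back to a list first):
URLS = list(set(URLS))
urlList = random.sample(URLS, SUBSET_SIZE)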
seen = set(list_0)
randValue = URLS[rn.randint(1, 30)]
# [...]
if randValue not in seen:
    seen.add(randValue)
    list_0.append(randValue)
Now you just need to repeat this in a loop and stop once the size of list_0 reaches 3.
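Put together, the loop might look like the sketch below. The function name pick_unique and the sample data are made up; the index is drawn from 0 to len(items) - 1 rather than the hardcoded randint(1, 30), which would skip the first element and fail on shorter lists.

```python
import random


def pick_unique(items, k):
    """Draw random items until k distinct values have been collected."""
    if len(set(items)) < k:
        raise ValueError("not enough distinct items")
    seen = set()
    picked = []
    while len(picked) < k:
        value = items[random.randint(0, len(items) - 1)]
        if value not in seen:
            seen.add(value)
            picked.append(value)
    return picked


urls = ['url1', 'url2', 'url2', 'url3', 'url4']  # made-up data with a duplicate
result = pick_unique(urls, 3)
print(result)
```

The up-front check matters: without it the while loop would never end when the input holds fewer than k distinct values.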

How not to order a list of pk's in a query?

I have a list of pk's and I would like to get the results in the same order as my list is defined, but the order of the elements is being changed. Can anyone help me?
print list_ids
[31189, 31191, 31327, 31406, 31352, 31395, 31309, 30071, 31434, 31435]
obj_oportunidades = Opor.objects.in_bulk(list_ids).values()
for o in obj_oportunidades:
    print o
31395 31435 31434 30071 31309 31406 31189 31191 31352 31327
This object is used in a template to show some results to the user. But as you can see, the order is different from the original list_ids.
Would have been nice to have this feature in SQL - sorting by a known list of values.
Instead, what you could do is:
obj_oportunidades = Opor.objects.in_bulk(list_ids).values()
all_opor = []
for o in obj_oportunidades:
    all_opor.append(o)
# walk the ids in the wanted order and print the matching record
for i in list_ids:
    for o in all_opor:
        if o.pk == i:
            print o
Downside is that you have to get all the result rows first and store them before getting them in the order you want. (all_opor could be a dictionary above, with the table records stored in the values and the PKeys as dict keys.)
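The dictionary variant mentioned in parentheses can be sketched in plain Python as follows. in_bulk() already returns a dict keyed by primary key, so the dict here is a made-up stand-in for that result, with strings in place of model instances.

```python
# Made-up stand-in for the dict returned by in_bulk(): primary key -> record.
rows_by_pk = {31395: 'opor 31395', 31189: 'opor 31189', 31191: 'opor 31191'}

# Ids in the order the caller wants; 31327 has no matching row here.
list_ids = [31189, 31191, 31327, 31395]

# Walk the ids in the wanted order and skip ids that returned no row.
ordered = [rows_by_pk[pk] for pk in list_ids if pk in rows_by_pk]
print(ordered)
```

Each lookup is a dict access, so this avoids rescanning the stored rows for every id.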
Another way is to create a temp table with (Sort_Order, PKey) and join it into the query:
Sort_Order PKey
1 31189
2 31191
...
So when you join it with Opor and sort on Sort_Order, you'll get the PKeys in the order you specified.
I found a solution at http://davedash.com/2010/02/11/retrieving-elements-in-a-specific-order-in-django-and-mysql/ and it suited me perfectly.
ids = [a_list, of, ordered, ids]
addons = Addon.objects.filter(id__in=ids).extra(
    select={'manual': 'FIELD(id,%s)' % ','.join(map(str, ids))},
    order_by=['manual'])
This code does something similar to MySQL's "ORDER BY FIELD".
This post: http://blog.mathieu-leplatre.info/django-create-a-queryset-from-a-list-preserving-order.html
solves the problem for both MySQL and PostgreSQL.
If you are using PostgreSQL, go to that page.