What I am trying to do is to find the earliest article that has no related article_history.
Here is what I tried, but it isn't working:
the_article = Article.objects.filter(cowcode=country).filter(pubdate__range=(start_date,end_date)).exclude(article_history_set__id > 0).order_by('pubdate')[0]
My thinking was that the query works up to the exclude: I get all the articles that match the conditions. Since I want the earliest article that has no article history attached yet, excluding all articles that have an article_history with id > 0 should work. Why doesn't it?
Would be awesome if someone could help me out here.
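exclude() only accepts keyword lookups, so Python evaluates article_history_set__id > 0 as a comparison on an undefined name before Django ever sees it (the lookup form would be article_history_set__id__gt=0). To test whether a related row exists, try: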
...end_date)).filter(article_history_set__isnull=True).order_by...
or
...end_date)).exclude(article_history_set__isnull=False).order_by…
If you have a self-referential foreign key (parent/children), you can do the same:
....filter(children__isnull=True).order_by...
or
....exclude(children__isnull=False).order_by...
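To see what the isnull lookup is asking the database for, it can help to look at the kind of SQL it corresponds to. The sketch below uses Python's sqlite3 module with hypothetical table and column names (article, article_history, article_id). It is not the SQL Django emits verbatim, only the same idea: a LEFT JOIN that keeps the rows with no match.

```python
import sqlite3


def earliest_without_history(conn):
    """Return (id, pubdate) of the earliest article with no history row, or None."""
    return conn.execute(
        """
        SELECT a.id, a.pubdate
        FROM article AS a
        LEFT JOIN article_history AS h ON h.article_id = a.id
        WHERE h.id IS NULL          -- no related history row exists
        ORDER BY a.pubdate
        LIMIT 1
        """
    ).fetchone()


# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE article (id INTEGER PRIMARY KEY, pubdate TEXT);
    CREATE TABLE article_history (id INTEGER PRIMARY KEY, article_id INTEGER);
    INSERT INTO article VALUES (1, '2011-01-01'), (2, '2011-01-05'), (3, '2011-01-09');
    INSERT INTO article_history VALUES (10, 1);
    """
)
result = earliest_without_history(conn)
```

Article 1 is the oldest but already has a history row, so the query skips it and returns article 2.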
I am trying to write a regexp to use with Crazyegg that will allow me to only gather data from my product pages.
My site structure is:
category page: www.sitename.com/categoryname/sub-categoryname
product page: www.sitename.com/productname/
My regex so far is:
^https?://([A-Za-z0-9.-]*\.)?sitename\.com/[A-Za-z0-9.-]*(/|/\?|)$
This allows everything that isn't at the sub-category level (2nd-level folder?).
The issue is that it also allows top-level category pages, so I need to exclude these by name, for example:
^https?://([A-Za-z0-9.-]*\.)?sitename\.com/(?!\babout\b|\bcheckout\b)(/|/\?|)$
Could you please help me get the exclusion correct? I've also tried using [^\babout\b|\bcheckout\b].
If you only want product pages, the regex to capture product pages is:
".*productname"
To capture category pages: ".*categoryname/sub-categoryname"
I hope this helps. If you have more questions, ask me!
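The exclusion attempted in the question can be sketched with a negative lookahead. Two things go wrong in the original attempts: a lookahead consumes no characters, so the path-segment pattern still has to follow it, and [^...] is a character class that excludes single characters, not whole words. The Python sketch below keeps the question's host and tail patterns. The excluded names are just the two from the question, and whether Crazyegg's regex engine supports lookaheads is an assumption to verify.

```python
import re

# Top-level pages to skip; taken from the question, extend as needed.
EXCLUDED = ("about", "checkout")

PRODUCT_URL = re.compile(
    r"^https?://([A-Za-z0-9.-]*\.)?sitename\.com/"
    # Reject an excluded name only when it is the whole path segment.
    r"(?!(?:" + "|".join(EXCLUDED) + r")(?:/|$))"
    # The segment itself, then the same optional tail as in the question.
    r"[A-Za-z0-9.-]+(/|/\?|)$"
)


def is_product_page(url):
    """True if the URL is a single-segment page that is not an excluded name."""
    return PRODUCT_URL.match(url) is not None
```

Because the lookahead requires "/" or the end of the string after the name, a product called about-us would still be allowed, while /about/ and /about are rejected.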
I'm building a web app using Django and Sphinx for free-text search. I need to apply additional restrictions before making a request to searchd. Consider 2 tables:
Entity
id
title
description
created_by_id
updated_by_id
created_date
updated_date
and
EntityUser
id
entity_id [FK to the table above]
joining_user_id
is_approved
created_by_id
updated_by_id
created_date
updated_date
I've built an RT index for the main table Entity and everything works fine, but now I want to query only those entities the user has joined, i.e. where a record exists in EntityUser for the specific user_id and entity_id with is_approved=1. The problem is that I can't index EntityUser, because it has no string fields: as you can see, the table only holds integers and timestamps. Even if I could build an index for that table, I'm not sure a SphinxQL query can contain a subquery against another index. Since Sphinx has been used with great success in quite big projects, I doubt this is a limitation of Sphinx. Is it bad design of the DB/application, or a lack of knowledge on my part about how to build a proper RT index? Can I somehow extend the existing index so that I can apply the restriction above?
I thought about applying the additional restrictions on the MySQL side after Sphinx returns the record IDs, but that won't work: Sphinx returns the N records with the highest weight, and after applying the restrictions the result could be empty. So I need to narrow the search scope first and then query only the entities the user can possibly see.
Adapting the example from http://sphinxsearch.com/docs/current.html#attributes, you might be able to use something like this in your conf:
...
sql_query = SELECT app_entity.id as id,
app_entity.title as title,
app_entity.description as description,
app_entityuser.joining_user_id as userid
FROM app_entity, app_entityuser
WHERE app_entity.id = app_entityuser.entity_id AND app_entityuser.is_approved = 1
sql_attr_uint = id
sql_attr_uint = userid
...
I should provide a disclaimer: I have not tried this.
I did find a related SO post, but it doesn't look like they quite solved it: Django-sphinx result filtering using attributes?
Good luck!
Actually, I've found the answer, and it has nothing to do with the design of the application or DB.
In fact it's simple: I just need to use an MVA for the RT index, as I would for a plain one (rt_attr_multi or rt_attr_multi_64). In the configuration file I have to do something like this:
...
rt_attr_multi = entity_users
}
and then populate it with the IDs of the users who have joined the Entity and have been approved. The problem was that I couldn't understand how to use MVA with an RT index, but now it's clear. I think there are not enough real-world examples of RT indexes with MVA, so I've shared this to help solve similar problems.
UPDATE: I spent the last hour fighting to generate the RT index and kept getting "unknown column: 'entity_users'". I finally found the reason: if you add an MVA to an RT index (I don't know if it's the same for a plain one), you have to not only restart the searchd daemon (service), but also DELETE everything in the "data" folder (or wherever you store your index)!
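Populating and filtering on the MVA could look roughly like the sketch below, which only builds SphinxQL statements as strings. The index name entity_rt and the helper names are hypothetical. The syntax reflects my understanding of SphinxQL (MVA values written as a parenthesized list, and an equality filter on an MVA matching when any of its values is equal), so it is worth checking against the documentation for your Sphinx version. Escaping of full-text query operators inside MATCH is not handled here.

```python
def sphinxql_quote(text):
    """Minimal escaping for a SphinxQL string literal (backslash and single quote)."""
    return "'" + text.replace("\\", "\\\\").replace("'", "\\'") + "'"


def mva_literal(user_ids):
    """Render integer IDs as an MVA value list, e.g. (3,5,7)."""
    return "(" + ",".join(str(int(u)) for u in sorted(set(user_ids))) + ")"


def build_replace(index, entity_id, title, description, user_ids):
    """Statement that (re)inserts one entity with its approved user IDs."""
    return (
        f"REPLACE INTO {index} (id, title, description, entity_users) "
        f"VALUES ({int(entity_id)}, {sphinxql_quote(title)}, "
        f"{sphinxql_quote(description)}, {mva_literal(user_ids)})"
    )


def build_search(index, query, user_id):
    """Full-text search restricted to entities the given user has joined."""
    return (
        f"SELECT id FROM {index} "
        f"WHERE MATCH({sphinxql_quote(query)}) AND entity_users = {int(user_id)}"
    )
```

The statements would then be sent to searchd over its MySQL-protocol listener with whatever MySQL client library the application already uses. Whenever an EntityUser row is approved or removed, the entity has to be re-sent with the updated ID list.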
I'm trying to get the ten most commented posts in my Django app, but I can't think of a proper way to do it.
I'm currently using the Django comments framework, and I've seen that this might be possible with aggregate or annotate, but I can't figure out how.
The thing would be:
Get all the posts
Calculate the number of comments per post (I have a comment_count method for that)
Order the posts from most commented to least
Get the first 10 (for example)
Is there any "simple" or "pythonic" way to do this? I'm a bit lost, since the comments framework is only accessible via template tags and not directly from code (unless you want to modify it).
Any help is appreciated
You're right that you need to use the annotation and aggregation features. What you need to do is group by and get a count of the object_pk of the Comment model:
from django.contrib.comments.models import Comment
from django.db.models import Count
o_list = Comment.objects.values('object_pk').annotate(ocount=Count('object_pk'))
This will assign something like the following to o_list:
[{'object_pk': '123', 'ocount': 56},
{'object_pk': '321', 'ocount': 47},
...etc...]
You could then sort the list and slice the top 10:
top_ten_objects = sorted(o_list, key=lambda k: k['ocount'], reverse=True)[:10]
You can then use the values in object_pk to retrieve the objects that the comments are attached to.
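Picking the most commented objects means ordering by count in descending order, and sorted() is ascending by default. Here is a minimal sketch on plain dicts shaped like o_list, using heapq.nlargest as an alternative to sorting the whole list. The sample rows and the helper name top_n are made up for illustration.

```python
import heapq


def top_n(rows, n=10):
    """Return the n rows with the highest 'ocount', largest first."""
    return heapq.nlargest(n, rows, key=lambda row: row["ocount"])


# Made-up rows in the same shape as o_list.
o_list = [
    {"object_pk": "123", "ocount": 56},
    {"object_pk": "321", "ocount": 47},
    {"object_pk": "7", "ocount": 90},
]
top_two = top_n(o_list, 2)
```

For a large table it is usually cheaper to let the database do this with order_by('-ocount') and a slice on the queryset, so that only ten rows are fetched.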
Annotate is the preferred way, partly because it reduces DB queries and it's basically a one-liner. While your theoretical loop would work, I bet your comment_count method queries the comments for a given post, which would mean 1 query per post that you loop over. Nasty!
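from django.contrib.comments.models import Comment
from django.db.models import Count
# Rank commented objects by number of public comments, most commented first.
posts_by_score = Comment.objects.filter(is_public=True).values('object_pk').annotate(
    score=Count('id')).order_by('-score')
# Keep only the ten highest-ranked IDs.
post_ids = [int(obj['object_pk']) for obj in posts_by_score[:10]]
# in_bulk returns a dict keyed by ID, so re-order by post_ids if ranking matters.
top_posts = Post.objects.in_bulk(post_ids)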
This code is shamelessly adapted from Django-Blog-Zinnia (no affiliation).
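One detail to watch: in_bulk returns a dict mapping each ID to its object, so the ranking has to be restored from the ordered ID list. A small sketch of that step follows, with plain strings standing in for Post objects and a hypothetical helper name.

```python
def in_ranked_order(ranked_ids, objects_by_id, limit=10):
    """Re-order an in_bulk-style dict by a ranked list of IDs, skipping missing ones."""
    ordered = [objects_by_id[pk] for pk in ranked_ids if pk in objects_by_id]
    return ordered[:limit]


# Stand-ins: IDs ranked by comment count, and the dict in_bulk would return.
ranked_ids = [42, 7, 19]
objects_by_id = {7: "post-7", 19: "post-19", 42: "post-42"}
top_posts = in_ranked_order(ranked_ids, objects_by_id, limit=2)
```

Skipping missing IDs also covers comments attached to objects that are not posts or that have since been deleted.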
I've got a query,
Bid.objects.filter(shipment=shipment, status=BidStatuses.ACCEPTED, user=request.user, items__count=0).exists()
The part that doesn't work is items__count=0. Bids have a many-to-many relationship with items. I need to check if this bid has 0 items. How can I do that?
Aggregation.
http://docs.djangoproject.com/en/1.2/topics/db/aggregation/
See the docs above and read the examples; you will find the answer there.
For the record (there is already an accepted answer with a link to Django aggregation docs), what OP needs is:
Bid.objects.annotate(item_num=models.Count('items')).filter(shipment=shipment, status=BidStatuses.ACCEPTED, user=request.user, item_num=0).exists()
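The annotate-then-filter pattern corresponds to a grouped count in SQL. The sketch below shows the idea with Python's sqlite3 module and hypothetical table names (bid, bid_items). It is an illustration of the technique, not the exact SQL Django generates.

```python
import sqlite3

# Hypothetical schema: bid 1 has two items, bid 2 has none.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE bid (id INTEGER PRIMARY KEY);
    CREATE TABLE bid_items (id INTEGER PRIMARY KEY, bid_id INTEGER, item_id INTEGER);
    INSERT INTO bid VALUES (1), (2);
    INSERT INTO bid_items VALUES (100, 1, 5), (101, 1, 6);
    """
)


def bids_without_items(conn):
    """Return the IDs of bids that have no rows in the join table."""
    rows = conn.execute(
        """
        SELECT b.id
        FROM bid AS b
        LEFT JOIN bid_items AS bi ON bi.bid_id = b.id
        GROUP BY b.id
        HAVING COUNT(bi.item_id) = 0   -- COUNT ignores the NULLs from unmatched joins
        ORDER BY b.id
        """
    ).fetchall()
    return [row[0] for row in rows]


empty_bids = bids_without_items(conn)
```

The LEFT JOIN matters here: with an inner join, bids without items would drop out before the count is taken.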
I recently implemented django-sphinx search on my website.
It works fine for each separate model.
But now my client's requirements have changed.
To implement the new functionality, I need the name of the field in which the search matched.
Suppose my query is:
"select id, name,description from table1"
and the search keyword matches the value in the field "name". Then I need to return that field name as well.
Is it possible to get the field name, or does django-sphinx provide any method that returns it?
Please help me...
As far as I know, this isn't possible. You might look at the contents of _sphinx though.
Well, with django-sphinx itself it might not be possible. But there is a workaround:
Make separate indexes, each one covering a single field that you need to search.
In your django-sphinx models, search each one like this:
search1 = SphinxSearch(index='index1')
search2 = SphinxSearch(index='index2')
...
After getting all the search results, you aggregate them, and you know which index (and therefore which field) each result came from.
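The aggregation step might look like the sketch below. It assumes each per-field search has already been reduced to a list of matching document IDs. The field names and the helper merge_by_field are hypothetical, and no django-sphinx API is used.

```python
from collections import defaultdict


def merge_by_field(results_by_field):
    """Map each document ID to the list of fields whose index returned it."""
    matched = defaultdict(list)
    for field, ids in results_by_field.items():
        for doc_id in ids:
            matched[doc_id].append(field)
    return dict(matched)


# Stand-in results: IDs returned by the 'name' index and the 'description' index.
results_by_field = {"name": [1, 4], "description": [4, 9]}
matched_fields = merge_by_field(results_by_field)
```

A document that appears under several fields, like ID 4 here, matched in each of those indexes.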