How to query mongodb with a regex in the projection? - regex

Does anyone know if there is a way to query MongoDB and have only certain fields returned by using a regex as part of the projection?
For example: given a collection with arbitrary field names, how might I query the collection and return only the fields whose names match the regex '^foo'?
Possibly something like this?
db.mycollection.find({},{$regex:"^foo"})
Thanks.
Brent.

I think you need to break the process down into two steps. The first is retrieving the field names from MongoDB.
The second is running the regex over those names; from there you can query the DB with a projection that lists only the matching fields.
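The second step can be sketched in plain Python. This is only an illustration: build_projection is a hypothetical helper, and the sample document stands in for whatever field names the first step returned. The resulting dict has the shape that find() accepts as its projection argument.

```python
import re


def build_projection(field_names, pattern):
    """Return a projection dict ({name: 1}) for the field names matching pattern."""
    matcher = re.compile(pattern)
    return {name: 1 for name in field_names if matcher.search(name)}


# Assumed result of step one: the field names of a sample document.
sample_document = {"_id": 1, "foo_a": "x", "foo_b": "y", "bar": "z"}

projection = build_projection(sample_document.keys(), r"^foo")
print(projection)
```

With a driver such as pymongo the dict would then be passed as the second argument, e.g. db.mycollection.find({}, projection). Note that MongoDB returns _id as well unless the projection sets "_id" to 0.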

Related

Quicksight breaking up strings for use of all aspects

I was wondering if anyone has ever had experience with breaking a string up in QuickSight and using individual parts of it. My example is a data set that returns tags like this: "animals|funny|dog-park". I have used "split(tags,'|',1)", but then all that gets returned is the first part (animals). I have also tried a combination of ifelse->locate->split with no luck. Is there a way to split these tags so that they are all usable, e.g. (animals) & (funny) or (funny) & (dog-park), etc.? In other words, can the associated article be listed under one tag and also, separately, under another? I know this will most likely end up being a calculated field. Thank you in advance!
Since QuickSight does not support any form of nested fields (including objects and list) in analysis, you need to normalise this into separate rows before feeding the data to QuickSight.
Otherwise, if you leave it as is, you would be limited to filtering using string contains and doing string lookup in calculated fields - nevertheless you would not be able to use these tags as categories (such as in colours field well of visuals).
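As a rough sketch of the normalisation step, done before the data reaches QuickSight: the function below turns one row with a delimited tag string into one row per tag. The column names article and tags and the helper explode_tags are assumptions for illustration, not part of any QuickSight API.

```python
def explode_tags(rows, delimiter="|"):
    """Return one output row per (article, tag) pair."""
    exploded = []
    for row in rows:
        for tag in row["tags"].split(delimiter):
            exploded.append({"article": row["article"], "tag": tag})
    return exploded


# Hypothetical input row using the tag string from the question.
rows = [{"article": "a1", "tags": "animals|funny|dog-park"}]

normalised = explode_tags(rows)
for row in normalised:
    print(row)
```

Each tag then sits in its own row, so it can be used as a category like any other column.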

How to order a Django queryset filtered using '__icontains' so that the exactly matched result comes first

I am writing a simple app in Django that searches for records in a database.
The user enters a name in the search field, and that query is used to filter records on a particular field, like this:
Result = Users.objects.filter(name__icontains=query_from_searchbox)
E.g. -
The database consists of names such as Shiv, Shivam, Shivendra, Kashiva, Varun, etc.
A search query 'shiv' returns records in the following order:
Kashiva, Shivam, Shiv and Shivendra
ordered by primary key.
My question is: how can I achieve the order
Shiv, Shivam, Shivendra and Kashiva?
I mean the most relevant results first, then the less relevant ones.
Standard Django does not do this for you, as relevance ranking is outside its scope and specific to a search app.
When you're interacting with the ORM, consider what you're actually doing with the database: it's all just SQL queries.
If you wanted to rearrange the results, you would have to post-process the queryset yourself: check for exact matches first, then use regular expressions to check for partial matches.
Search isn't really best suited to the ORM, however, so you may wish to consider dedicated search applications. They usually maintain an index, which avoids database hits, and may also offer the kind of percentage-match ordering you're looking for.
A good place to start may be Haystack.
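For small result sets, the manual rearranging mentioned above could look roughly like this. It is a sketch in plain Python over a list of names rather than a real queryset. rank_matches and its three relevance tiers (exact match, prefix match, other partial match) are assumptions chosen to reproduce the order asked for, and simple string checks are swapped in for regular expressions.

```python
def rank_matches(names, query):
    """Filter names containing query (case-insensitive) and sort by relevance."""
    needle = query.lower()

    def relevance(name):
        lowered = name.lower()
        if lowered == needle:
            return 0  # exact match
        if lowered.startswith(needle):
            return 1  # prefix match
        return 2      # match somewhere inside the name

    matches = [name for name in names if needle in name.lower()]
    # Ties within a tier are broken alphabetically.
    return sorted(matches, key=lambda name: (relevance(name), name.lower()))


names = ["Kashiva", "Shivam", "Shiv", "Shivendra", "Varun"]
ranked = rank_matches(names, "shiv")
print(ranked)
```

With a real queryset the same key function could be applied to the model instances after the __icontains filter has run, at the cost of sorting in Python instead of in the database.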

RegEx for a JSON string

I'm storing person objects as JSON in my SQLite database. The table will have a few thousand records. What I need is to query a person based on the "name" attribute.
After some investigation I found that SQLite's GLOB can be used to perform a regex-like search on the JSON elements.
My Sample JSON is something like this.
{"name":"john","age":"22","father-name":"jackson"}
Now I want a regex that returns all the records whose name attribute contains a given substring. It should be case-insensitive too.
For example, a search for "ohn" should fetch john's record.
While you can store JSON and search against it using regexes (which are rather limited in SQLite), that does not mean you should.
Instead, you should really consider splitting your JSON into fields and storing them in a normal SQLite table. Doing so will not only make searches easier, with no need to painfully parse the data every single time, but searches will be much faster too (if you add the necessary indexes).
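A minimal sketch of the split-into-columns approach, using Python's sqlite3 module and an in-memory database. The table name person and the column name father_name are assumptions; the sample values come from the question. SQLite's LIKE is case-insensitive for ASCII characters by default, which covers the case-insensitivity requirement here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, age INTEGER, father_name TEXT)")
conn.executemany(
    "INSERT INTO person (name, age, father_name) VALUES (?, ?, ?)",
    [
        ("john", 22, "jackson"),
        ("johnny", 22, "smith"),
        ("mike", 22, "johnson"),
    ],
)

# Substring search on the name column only; '%' matches any run of characters.
search = "OHN"
rows = conn.execute(
    "SELECT name FROM person WHERE name LIKE ? ORDER BY name",
    (f"%{search}%",),
).fetchall()
print(rows)
conn.close()
```

Because the condition looks only at the name column, mike is not returned even though his father's name contains the substring.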
If you do want to go down the regex route the following will extract the record:
/\{"name":"\w*ohn\w*[^\}]+\}/i
This will match each of these:
{"name":"john","age":"22","father-name":"jackson"}
{"name":"john","age":"22","father-name":"johnson"}
{"name":"johnny","age":"22","father-name":"smith"}
but not:
{"name":"fred","age":"22","father-name":"hall"},
{"name":"mike","age":"22","father-name":"johnson"}
{"name":"bob","age":"22","father-name":"todd"}

Filtered annotations without removing results

Consider a model and a query using annotations, for example the following example from the Django documentation:
http://docs.djangoproject.com/en/dev/topics/db/aggregation/
Publisher.objects.filter(book__rating__gt=3.0).annotate(num_books=Count('book'))
The result of this query will only contain objects matching the filter (i.e. publishers that have a book with a rating greater than 3.0), and these objects have been annotated. But what if I want the query to contain all objects, but only annotate the objects that match the filter (or, for example, annotate the others with 0)? Is this even possible?
No, you can't do that with this query, because that's not how the underlying SQL works: the filter restricts the rows before the annotation is computed.
The only thing I can think of is to do two queries, one with the filter/annotation and one without, then iterate through them in Python, attaching the annotation to the matching objects from the non-filtered list.
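The merge step of the two-query approach might look like this in plain Python. The publisher ids and counts are made-up stand-ins for the results of the two queries, and annotate_all is a hypothetical helper.

```python
def annotate_all(all_publisher_ids, filtered_counts):
    """Attach a num_books value to every publisher, defaulting to 0."""
    return {
        publisher_id: filtered_counts.get(publisher_id, 0)
        for publisher_id in all_publisher_ids
    }


# Stand-in for the unfiltered query: every publisher id.
all_publisher_ids = [1, 2, 3]
# Stand-in for the filtered and annotated query: id -> num_books.
filtered_counts = {1: 4, 3: 1}

num_books = annotate_all(all_publisher_ids, filtered_counts)
print(num_books)
```

Publishers missing from the filtered result end up with a count of 0 instead of being dropped.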

Django filter vs exclude

Is there a difference between filter and exclude in Django? If I have
self.get_query_set().filter(modelField=x)
and I want to add another criterion, is there a meaningful difference between the following two lines of code?
self.get_query_set().filter(user__isnull=False, modelField=x)
self.get_query_set().filter(modelField=x).exclude(user__isnull=True)
Is one considered better practice, or are they the same in both function and performance?
Both are lazily evaluated, so I would expect them to perform equivalently. The generated SQL is likely to differ, but not in a way that matters.
It depends on what you want to achieve. With boolean values it is easy to switch between .exclude() and .filter(), but what if, for example, you want to get all articles except those from March? You can write the query as
Posts.objects.exclude(date__month=3)
With .filter() it would be (though I am not sure whether this actually works):
Posts.objects.filter(date__month__in=[1,2,4,5,6,7,8,9,10,11,12])
or you would have to use a Q object.
As the function name already suggests, .exclude() is used to exclude records from the result set. For boolean values you can easily invert this and use .filter() instead, but for other values this can be trickier.
In general, exclude is the opposite of filter. In this case both examples work the same.
Here:
self.get_query_set().filter(user__isnull=False, modelField=x)
you select entries where the field user is not null and modelField has the value x.
In this case:
self.get_query_set().filter(modelField=x).exclude(user__isnull=True)
first you select entries where modelField has the value x (whether user is null or not), then you exclude the entries where the field user is null.
I think that in this case it would be better to use the first option, as it looks cleaner. But both work the same.
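The equivalence of the two forms can be illustrated without a database by modelling the rows as dicts. This is only an analogy to what the ORM does: the rows, the model_field key and the list comprehensions are assumptions standing in for the real model and queries.

```python
# Hypothetical rows; user is None where the field would be NULL.
rows = [
    {"id": 1, "user": "ann", "model_field": "x"},
    {"id": 2, "user": None, "model_field": "x"},
    {"id": 3, "user": "bob", "model_field": "y"},
]

# Analogue of .filter(user__isnull=False, modelField=x)
filter_only = [
    row for row in rows
    if row["user"] is not None and row["model_field"] == "x"
]

# Analogue of .filter(modelField=x).exclude(user__isnull=True)
matching_field = [row for row in rows if row["model_field"] == "x"]
filter_then_exclude = [row for row in matching_field if row["user"] is not None]

print(filter_only == filter_then_exclude)
print([row["id"] for row in filter_only])
```

Both forms keep only row 1, which matches the reasoning above for a plain field on the model.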