Most Efficient Way to get the "id" of the first record in Rails - ruby-on-rails-4

I'm reviewing some code and I came across a line that does the following:
Person.find_by(name: "Tom").id
The above code gets the FIRST record with a name of "Tom", then builds a model, then gets the id of that model. Since we only want the id, retrieving all of the record's data and initializing the model is unnecessary. What's the best way to optimize this using Active Record queries?
I'd like to avoid a raw sql solution. So far I have been able to come up with:
Person.where(name: "Tom").pluck(:id).first
This is faster in some situations since pluck doesn't build the actual model object and doesn't load all the data. However, pluck is going to build an array of records with name "Tom", whereas the original statement only ever returns a single object or nil - so this technique could potentially be worse depending on the where statement. I'd like to avoid the array creation and potential for having a very long list of ids returned from the server. I could add a limit(1) in the chain,
Person.where(name: "Tom").limit(1).pluck(:id).first
but it seems like I'm making this more complicated than it should be.

With Rails 6 you can use the new pick method:
Person.where(name: 'Tom').pick(:id)

This is a little verbose, but you can use select_value from the ActiveRecord connection like this:
Person.connection.select_value(Person.select(:id).where(name: 'Tom').limit(1))

This might work depending on what you're looking for.
Person.where(name: "Tom").minimum(:id)
Person.where(name: "Tom").maximum(:id)
These return the smallest or largest id among the matching records, whereas Person.where(name: "Tom").first.id uses your default ordering (the primary key, unless a default scope orders by something else such as created_at).
Either way, test and see if it works for you.

Related

Wrapping a Django query set in a raw query or vice versa?

I'm thinking of using a raw query to quickly get around limitations with either my brain or the Django ORM, but I don't want to redevelop the infrastructure required to support the existing ORM code such as filters. Right now I'm stuck with two dead ends:
Writing an inner raw query and reusing that like any other query set. Even though my raw query selects the correct columns, I can't filter on it:
AttributeError: 'RawQuerySet' object has no attribute 'filter'
This is corroborated by another answer, but I'm still hoping that that information is out of date.
Getting the SQL and parameters from the query set and wrapping that in a raw query. It seems the raw SQL should be retrievable using queryset.query.get_compiler(DEFAULT_DB_ALIAS).as_sql() - how would I get the parameters as well (obviously without actually running the query)?
One option for dealing with complex queries is to write a VIEW that encapsulates the query, and then stick a model in front of that. You will still be able to filter (and depending upon your view, you may even get push-down of parameters to improve query performance).
All you need to do to get a model that is backed by a view is have it as "unmanaged", and then have the view created by a migration operation.
It's better to try to write a QuerySet if you can, but at times it is not possible (because you are using something that cannot be expressed using the ORM, for instance, or you need to do something like a LATERAL JOIN).
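The view idea can be tried out without Django at all. The following is a minimal sketch using the standard-library sqlite3 module; the orders table, the customer_totals view, and the totals_at_least helper are hypothetical names chosen for illustration, not anything from the question.

```python
import sqlite3

# Hypothetical schema: an "orders" table plus a view that hides an aggregate query.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount INTEGER);
    INSERT INTO orders (customer, amount) VALUES
        ('alice', 10), ('alice', 30), ('bob', 5);

    -- The "complex" query lives in the view, not in application code.
    CREATE VIEW customer_totals AS
        SELECT customer, SUM(amount) AS total
        FROM orders
        GROUP BY customer;
    """
)


def totals_at_least(minimum):
    """Filter the view exactly as if it were an ordinary table."""
    rows = conn.execute(
        "SELECT customer, total FROM customer_totals "
        "WHERE total >= ? ORDER BY customer",
        (minimum,),
    )
    return rows.fetchall()


big_spenders = totals_at_least(20)
```

In Django, a model whose Meta sets managed = False and db_table to the view's name should be filterable through the ORM in much the same way, which is what the answer above describes.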

What is the best way to use query with a list and keep the list order? [duplicate]

This question already has answers here:
Django: __in query lookup doesn't maintain the order in queryset
I've searched online and could only find one blog that seemed like a hackish attempt to keep the order of a query list. I was hoping to query using the ORM with a list of strings, but doing it that way does not keep the order of the list.
From what I understand, in_bulk only works if you have the ids of the items you want to query.
Can anybody recommend an ideal way of querying by a list of strings and making sure the objects are kept in their proper order?
So in a perfect world I would be able to query a set of objects by doing something like this...
Entry.objects.filter(id__in=['list', 'of', 'strings'])
However, they do not keep order, so string could be before list etc...
The only workaround I see is the following. I may just be tired, or this may be perfectly acceptable; I'm not sure.
myIterableCorrectOrderedList = []
for i in listOfStrings:
    obj = Object.objects.get(title=str(i))
    myIterableCorrectOrderedList.append(obj)
Thank you,
The problem with your solution is that it does a separate database query for each item.
This answer gives the right solution if you're using ids: use in_bulk to create a map between ids and items, and then reorder them as you wish.
If you're not using ids, you can just create the mapping yourself:
values = ['list', 'of', 'strings']
# one database query
entries = Entry.objects.filter(field__in=values)
# one trip through the list to create the mapping
entry_map = {entry.field: entry for entry in entries}
# one more trip through the list to build the ordered entries
ordered_entries = [entry_map[value] for value in values]
(You could save yourself a line by using index, as in this example, but since index is O(n) the performance will not be good for long lists.)
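One detail worth noting about the mapping approach: entry_map[value] raises KeyError when a value has no matching row. Here is a minimal sketch that skips such values instead, using a namedtuple as a stand-in for model instances. The Entry stand-in, its payload field, and the order_like helper are assumptions for illustration; in Django the entries would come from the filter call above.

```python
from collections import namedtuple

# Stand-in for model instances; in Django these would come from
# Entry.objects.filter(field__in=values).
Entry = namedtuple("Entry", ["field", "payload"])


def order_like(values, entries):
    """Return entries in the order of `values`, skipping values with no match."""
    entry_map = {entry.field: entry for entry in entries}
    return [entry_map[value] for value in values if value in entry_map]


values = ["list", "of", "strings"]
# Simulated unordered database result; "of" has no matching row.
fetched = [Entry("strings", 3), Entry("list", 1)]
ordered = order_like(values, fetched)
```

Whether to skip missing values or let the KeyError surface depends on whether a missing row is an error in your application.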
Remember that ultimately this is all done to a database; these operations get translated down to SQL somewhere.
Your Django query loosely translated into SQL would be something like:
SELECT * FROM entry_table e WHERE e.title IN ('list', 'of', 'strings');
So, in a way, your question is equivalent to asking how to ORDER BY the order something was specified in a WHERE clause. (Needless to say, I hope, this is a confusing request to write in SQL -- NOT the way it was designed to be used.)
You can do this in a couple of ways, as documented in some other answers on StackOverflow [1] [2]. However, as you can see, both rely on adding (temporary) information to the database in order to sort the selection.
Really, this should suggest the correct answer: the information you are sorting on should be in your database. Or, back in high-level Django-land, it should be in your models. Consider revising your models to save a timestamp or an ordering when the user adds favorites, if that's what you want to preserve.
Otherwise, you're stuck with one of two options: grabbing the unordered data from the db and then "fixing" it in Python, or constructing your own SQL query and implementing your own ugly hack from one of the solutions I linked (don't do this).
tl;dr The "right" answer is to keep the sort order in the database; the "quick fix" is to massage the unsorted data from the database to your liking in Python.
EDIT: Apparently MySQL has some weird feature that will let you do this, if that happens to be your backend.
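To make the "keep the sort order in the database" suggestion concrete, here is a minimal sketch using the standard-library sqlite3 module rather than Django models. The favorites table, its position column, and both helper functions are hypothetical; the point is only that once the order is stored, a plain ORDER BY recovers it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "position" is a hypothetical column recording the order in which rows were added.
conn.execute(
    "CREATE TABLE favorites (id INTEGER PRIMARY KEY, title TEXT, position INTEGER)"
)


def add_favorite(title):
    """Append a favorite, giving it the next position."""
    (next_pos,) = conn.execute(
        "SELECT COALESCE(MAX(position), 0) + 1 FROM favorites"
    ).fetchone()
    conn.execute(
        "INSERT INTO favorites (title, position) VALUES (?, ?)", (title, next_pos)
    )


def favorites_in_order(titles):
    """Fetch the given titles in their stored order, not in argument order."""
    if not titles:
        return []
    placeholders = ", ".join("?" for _ in titles)
    rows = conn.execute(
        f"SELECT title FROM favorites WHERE title IN ({placeholders}) "
        "ORDER BY position",
        list(titles),
    )
    return [row[0] for row in rows]


for title in ["list", "of", "strings"]:
    add_favorite(title)

ordered = favorites_in_order(["strings", "list"])
```

Note that the result follows the stored position, so the order of the titles passed in no longer matters.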

Find model returns undefined when trying to get the attribute of a model by first finding the model by another attribute?

I would like to do something like:
App.Model.find({unique_attribute_a: 'foo'}).objectAt(0).get('attribute_b')
basically first finding a model by its unique attribute that is NOT its ID, then getting another attribute of that model. (objectAt(0) is used because find by attribute returns a RecordArray.)
The problem is App.Model.find({unique_attribute_a: 'foo'}).objectAt(0) is always undefined. I don't know why.
Please see the problem in the jsbin.
It looks like you want to use a filter rather than a find (or in this case a findQuery). Example here: http://jsbin.com/iwiruw/438
App.Model.find({ unique_attribute_a: 'foo' }) converts the query to an ajax query string:
/model?unique_attribute_a=foo
Ember Data expects your server to return a filtered response. It then loads this response into an ImmutableArray and makes no assumption about what you were trying to find; it only knows that the server returned something matching your query, and it groups that result into a non-changeable array (you can still modify the records, just not the array).
App.Model.filter, on the other hand, just filters the local store based on your filter function. It does have one "magical" side effect: it will call App.Model.find behind the scenes if there are no models in the store, although I am not sure if this is intended.
Typically I avoid filters, as they can cause performance issues with large data sets in Ember Data. A filter must materialize every record, which can be slow if you have thousands of records.
Someone on IRC gave me this answer, which I then modified to make it work completely. Basically, I should have used filter.
App.Office.filter( function(e){return e.get('unique_attribute_a') == 'foo'}).objectAt(0)
Then I can get the attribute like:
App.Office.filter( function(e){return e.get('unique_attribute_a') == 'foo'}).objectAt(0).get('attribute_b')
See the code in jsbin.
Does anyone know WHY filter works but find doesn't? They both return RecordArrays.

django queryset ordering

I'm listing queryset results and would like to add an option for choosing the order results are displayed.
I would like to pass the actual data from the database to another page for sorting.
I was able to achieve this by collecting all the object ids and using the Django session to recreate a new queryset based on the order criteria.
Is there any other way to achieve this?
10x
Assuming you are currently displaying the data as a table, you could try a client-side JavaScript table sorter such as tablesorter. There are lots of JavaScript table sorters.
I'm away from my development machine right now, but I think you could just pass the list of ids to a new Queryset, pk__in=list_of_object_ids, and then use the native order_by function.
For example:
objs = Object.objects.filter(pk__in=list_of_object_ids).order_by('value_to_order_by')
Anyway, that's what I would try first, though I'm sure there are better optimizations.
For example, instead of a list of object ids, you could pass a dictionary with a key:value pair that has the value you want to order by.
For example:
[{'obj_id':1,'obj_value':'foo'},{'obj_id':2,'obj_value':'foo'}]
Then use some lambda function to sort it, like here.
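A minimal sketch of that lambda-based sort, assuming the list-of-dicts shape shown above. The sample values and the sort_rows helper are made up for illustration.

```python
# Hypothetical rows: each dict pairs an object id with the value to sort on.
rows = [
    {"obj_id": 1, "obj_value": "pear"},
    {"obj_id": 2, "obj_value": "apple"},
    {"obj_id": 3, "obj_value": "fig"},
]


def sort_rows(rows, reverse=False):
    """Return a new list sorted by obj_value; the input list is left untouched."""
    return sorted(rows, key=lambda row: row["obj_value"], reverse=reverse)


sorted_ids = [row["obj_id"] for row in sort_rows(rows)]
```

Because sorted returns a new list, the original rows can be kept in the session and re-sorted with different criteria on each request.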

Custom Date Aggregate Function

I want to sort my Store models by their opening times. The Store model has an is_open function that checks the store's opening time ranges and returns a boolean indicating whether it is open. The problem is that I don't want to sort my queryset manually, for efficiency reasons. I thought that if I wrote a custom annotate function, I could filter the query more efficiently.
So I googled and found that I can extend Django's aggregate class. From what I understood, I have to use predefined SQL functions like MAX, AVG, etc. The thing is, I want to check whether today's date falls within a given list of time intervals. Can anyone tell me which SQL function name I should use?
Edit
I'd like to put the code here, but it's really spaghetti. It is a page-long piece of code that only generates time intervals and checks for the suitable one.
I want to avoid :
alg = lambda s: not (s.is_open() and s.reachable)
sorted(stores,key=alg)
and replace with :
Store.objects.annotate(is_open = CheckOpen(datetime.today())).order_by('is_open')
But I'm totally lost at how to write CheckOpen...
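For reference, the in-Python sort that the question wants to replace behaves as in the sketch below. The Store dataclass here is a hypothetical stand-in (the real is_open checks time ranges), and sort_open_first is a name made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Store:
    """Stand-in for the Store model; the real is_open() checks time ranges."""

    name: str
    open_now: bool
    reachable: bool

    def is_open(self):
        return self.open_now


def sort_open_first(stores):
    # False sorts before True, so stores that are open and reachable come first.
    # sorted() is stable, so ties keep their original relative order.
    return sorted(stores, key=lambda s: not (s.is_open() and s.reachable))


stores = [
    Store("a", False, True),
    Store("b", True, True),
    Store("c", True, False),
    Store("d", True, True),
]
names = [s.name for s in sort_open_first(stores)]
```

This calls is_open once per store in Python, which is the per-row cost the question hopes to push into the database.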
Have a look at the docs for extra.