Doctrine: QueryBuilder vs createQuery?

In Doctrine you can create DQL in 2 ways:
EntityManager::createQuery:
$query = $em->createQuery('SELECT u FROM MyProject\Model\User u WHERE u.id = ?1');
QueryBuilder:
$qb->add('select', 'u')
->add('from', 'User u')
->add('where', 'u.id = ?1')
->add('orderBy', 'u.name ASC');
I wonder what the difference is, and which one I should use?

DQL is easier to read because it is very similar to SQL. If you don't need to change the query depending on a set of parameters, it is probably the best choice.
The QueryBuilder is an API for constructing queries, so it is easier to use when you need to build a query dynamically, for example by iterating over a set of parameters or filters. You don't need any string operations (join, split, and so on) to assemble the query.
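To make the contrast concrete, here is a minimal sketch of the idea, written in Python rather than PHP. The TinyQueryBuilder class, its and_where method, and the build_user_query helper are all hypothetical names invented for illustration, loosely modelled on the where()/andWhere() style mentioned in this thread. It is not Doctrine's implementation.

```python
class TinyQueryBuilder:
    """Hypothetical, simplified stand-in for a query builder (not Doctrine's API)."""

    def __init__(self, entity, alias):
        self.entity = entity
        self.alias = alias
        self.conditions = []
        self.params = {}

    def and_where(self, condition, **params):
        # Record the condition and its parameters; return self to allow chaining.
        self.conditions.append(condition)
        self.params.update(params)
        return self

    def get_dql(self):
        # The string is assembled once, at the end, from the recorded parts.
        dql = f"SELECT {self.alias} FROM {self.entity} {self.alias}"
        if self.conditions:
            dql += " WHERE " + " AND ".join(self.conditions)
        return dql


def build_user_query(filters):
    """Add one condition per filter that was actually supplied."""
    qb = TinyQueryBuilder("User", "u")
    for field, value in filters.items():
        if value is not None:
            qb.and_where(f"u.{field} = :{field}", **{field: value})
    return qb


qb = build_user_query({"name": "alice", "status": None, "lang": "en"})
print(qb.get_dql())
print(qb.params)
```

The caller never concatenates "WHERE" or "AND" by hand; skipped filters simply add nothing. With a fixed query, the plain DQL string stays shorter and easier to read.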

The query builder is just an interface for creating a query. It should be more comfortable to use: besides add(), it has methods such as where(), andWhere(), from(), and so on. In the end, though, it just composes a query like the one you pass to createQuery().
An example of more advanced query builder use:
$em->createQueryBuilder()
->from('Project\Entities\Item', 'i')
->select("i, e")
->join("i.entity", 'e')
->where("i.lang = :lang AND e.album = :album")
->setParameter('lang', $lang)
->setParameter('album', $album);

They have different purposes:
DQL is easier to use when you know your full query.
Query builder is smarter when you have to build your query based on some conditions, loops etc.

The main difference is the overhead of the method calls. Your first sample (createQuery) makes one method call, while the QueryBuilder sample makes four. In the end both come down to a string that has to be executed: in the first example you supply the string directly, and in the second you build it through chained method calls.
If you are looking for a reason to use one over the other, it is a question of style and of what reads better. I prefer the QueryBuilder most of the time because it gives the query well-defined sections. It has also made it easier for me to add conditional logic when I needed it.

The query builder may also make unit testing easier. Say you have a repository that queries for data based on a complicated list of conditions, and you want to ensure that when a particular condition is passed into the repository, certain other conditions are added to the query. With DQL you have two options:
1) Use fixtures and test the real interaction with the DB, which I find somewhat troublesome and not really unit testing.
2) Check the generated DQL code, which can make your tests too fragile.
With QueryBuilder, you can substitute a mock and verify that the "andWhere" method is called with the expected parameter. Of course, these considerations do not apply if your query is simple and does not depend on any parameters.
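The mocking approach can be sketched as follows, in Python with the standard unittest.mock module rather than PHP. The ItemRepository class and its find_items method are hypothetical; only the andWhere method name and the two conditions are borrowed from the examples in this thread. A PHP test would use a mock of Doctrine's QueryBuilder in the same way.

```python
from unittest.mock import MagicMock


class ItemRepository:
    """Hypothetical repository that adds conditions only when they are requested."""

    def __init__(self, query_builder):
        self.qb = query_builder

    def find_items(self, lang=None, album=None):
        if lang is not None:
            self.qb.andWhere("i.lang = :lang")
        if album is not None:
            self.qb.andWhere("e.album = :album")
        return self.qb


def test_album_condition_is_added_only_when_requested():
    # Only the language filter is passed: exactly one condition is expected.
    qb = MagicMock()
    ItemRepository(qb).find_items(lang="en")
    qb.andWhere.assert_called_once_with("i.lang = :lang")

    # Both filters are passed: the album condition must be added as well.
    qb = MagicMock()
    ItemRepository(qb).find_items(lang="en", album=7)
    qb.andWhere.assert_any_call("e.album = :album")
    assert qb.andWhere.call_count == 2


test_album_condition_is_added_only_when_requested()
```

The test never touches a database and never compares a generated query string, which is what keeps it from being fragile.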

Related

Django subquery in insert

Is it possible to force django to make a subquery when I try to insert some values?
This produces two separate queries:
CommunicationLog.objects.create(
device='asdfasdf',
date_request=CommunicationLog.objects.get(pk=343).date_request,
content_obj_id=338, type_request=1, type_change=2
)
You cannot do this with create. No available API lets you do it, since this is a very unusual use case. You have to fall back to raw SQL.
Even though the API won't let you do what you want, you can still improve performance a little (which I assume is your intention) by using the .only method when querying the date:
CommunicationLog.objects.create(
device='asdfasdf',
date_request=CommunicationLog.objects.only('date_request').get(pk=343).date_request,
content_obj_id=338, type_request=1, type_change=2
)

Django Sum & Count

I have some MySQL code that looks like this:
SELECT
visitor AS team,
COUNT(*) AS rg,
SUM(vscore>hscore) AS rw,
SUM(vscore<hscore) AS rl
FROM `gamelog` WHERE status='Final'
AND date(start_et) BETWEEN %s AND %s GROUP BY visitor
I'm trying to translate this into a Django version of that query, without making multiple queries. Is this possible? I read up on how to do Sum(), and Count(), but it doesn't seem to work when I want to compare two fields like I'm doing.
Here's the best I could come up with so far, but it didn't work...
vrecord = GameLog.objects.filter(start_et__range=[start,end],visitor=i['id']
).aggregate(
Sum('vscore'>'hscore'),
Count('vscore'>'hscore'))
I also tried using 'vscore>hscore' in there, but that didn't work either. Any ideas? I need to use as few queries as possible.
In the Django version I looked at, aggregation only works on single fields. I read the code for the various aggregation functions and noticed that the single-field restriction is hardwired. When you use, say, Sum(field), Django records it for later and then passes it to the database-specific backend for conversion to SQL and execution. Note also that SUM(vscore>hscore) relies on MySQL treating a comparison as 0 or 1, which is not standard SQL. (Django 1.8 and later add conditional expressions such as Case and When, which can express a comparison between two fields inside an aggregate.)
Without those, you probably need to use a raw SQL query.
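As a rough sketch of the raw SQL route, the query from the question can be written with CASE WHEN so that it does not depend on MySQL's boolean-to-integer behaviour. The example below uses Python's standard sqlite3 module and an in-memory table purely so it can run anywhere; the sample rows and dates are made up, and sqlite3 uses ? placeholders where the MySQL driver in the question uses %s.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gamelog (visitor TEXT, vscore INTEGER, hscore INTEGER, "
    "status TEXT, start_et TEXT)"
)
# Made-up sample rows, only for illustration.
conn.executemany(
    "INSERT INTO gamelog VALUES (?, ?, ?, ?, ?)",
    [
        ("A", 3, 1, "Final", "2011-05-01 19:00:00"),
        ("A", 0, 2, "Final", "2011-05-02 19:00:00"),
        ("B", 4, 2, "Final", "2011-05-01 20:00:00"),
        ("B", 1, 0, "Scheduled", "2011-05-03 20:00:00"),
    ],
)

# One query: games played, wins and losses per visiting team.
sql = """
SELECT visitor AS team,
       COUNT(*) AS rg,
       SUM(CASE WHEN vscore > hscore THEN 1 ELSE 0 END) AS rw,
       SUM(CASE WHEN vscore < hscore THEN 1 ELSE 0 END) AS rl
FROM gamelog
WHERE status = 'Final'
  AND date(start_et) BETWEEN ? AND ?
GROUP BY visitor
ORDER BY visitor
"""
rows = conn.execute(sql, ("2011-05-01", "2011-05-31")).fetchall()
print(rows)  # [('A', 2, 1, 1), ('B', 1, 1, 0)]
```

In a Django project the same SQL string could be passed through the database cursor instead of sqlite3 directly.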

Methods for filtering within a Doctrine results Collection?

I'm very new to Doctrine, so this might seem a rather obvious question to those more experienced.
I'm writing a data import tool that has to check every row being imported contains valid data. For example, the Row has a reference to a product code, I need to check that there is a pre-existing Product object with that code. If not, flag that row as invalid.
Now I can easily do something like this for each row.
$productCode = $this->csv->getProductNumber();
$product = $doctrine->getRepository('MyBundle:Product')->findOneBy(array('code' => $productCode ));
But that seems hideously inefficient. So I thought about returning the entire Product Collection and then iterating within that.
$query = $this->getEntityManager()->createQuery('SELECT p FROM MyBundle\Entity\Product p');
$products = $query->getResult();
All well and good, but then I've got to write messy loops to search through the results.
Two questions:
1). I was wondering if I'm missing some utility methods such as you have in Magento Collections, where you can search within the Collection results without incurring additional database hits. For example, in Magento this will iterate the collection and filter on the code property.
$collection->getItemByColumnValue("code","FZTY444");
2). At the moment I'm using the query below, which returns a "rectangular array". More efficient, but could be better.
$query = $this->getEntityManager()->createQuery('SELECT p.code FROM MyBundle\Entity\Product p');
$products = $query->getResult();
Is there a way of returning a one-dimensional array without having to iterate over the result set again and transform it into a flat array, so I can use in_array() on the results?
If I understand your question correctly you want to filter an array of entities returned by getResult(). I had a similar question and I think I've figured out two ways to do it.
Method 1: Arrays
Use the array_filter function on your $products variable. Yes, this amounts to a loop behind the scenes, but I think it is a generally accepted way of filtering arrays and saves you from writing the loop yourself. You need to provide a callback (an anonymous function is preferred on PHP 5.3 and later). Here is an example:
$codes = array_filter($products, function($i) {
return $i->getCode() == '1234';
});
In your callback, return true to keep the element in $codes and false to drop it. An explicit false is not required: array_filter keeps only elements for which the callback returns a truthy value, and a callback that returns nothing yields null, which is falsy.
Method 2: Doctrine's ArrayCollection
In your custom repository, or wherever you return the result of getResult(), you can return an ArrayCollection instead. The class lives in the Doctrine\Common\Collections namespace. More documentation on the interface behind it can be found here. In this case you would have:
$query = $this->getEntityManager()->createQuery('SELECT p FROM MyBundle\Entity\Product p');
$products = new ArrayCollection($query->getResult());
You can then use the filter() method on the array collection. Use it in a very similar way to the array_filter. Except it doesn't need a first argument because you call it like this: $products->filter(function($i) { ... });
The ArrayCollection class is iterable, so you can use it in foreach loops to your heart's content, and it shouldn't behave much differently from an array of your products. Unless your code explicitly uses $products[$x], it should be plug 'n' play*.
*Note: I haven't actually tested this code or concept, but based on everything I've read it seems legit. I'll update my answer if it turns out I'm wrong.
You can use another hydration mode. $query->getResult() usually returns a result in object hydration. Take a look at $query->getScalarResult(), which should be more suitable for your needs.
More info on the Doctrine 2 website.

OR query with Q object hanging

I'm constructing a query using the Q object but it's hanging.
When I "AND" the filters together, the query works fine. Here is the example:
School.objects.filter( Q(city__search='"orlando"'), Q(schoolattribute__attribute__name__search='"subjects"') )
But when I "OR" the filters together, the query just hangs because I'm assuming there's too much to process:
School.objects.filter( Q(city__search='"orlando"') | Q(schoolattribute__attribute__name__search='"subjects"') )
I'm wondering what's going on here exactly and what can I do to mitigate it. Why does the query work when "AND" is used, but not when "OR" is used?
EDIT: Good tip @psagers. So it turns out that the AND query gets two INNER JOINs whereas the OR query gets two LEFT OUTER JOINs.
Given your situation, I'll assume the following:
You have a really big data set
You don't want to fetch too many entries
To optimize your code, you'd probably be better off using two queries:
schools_by_city = School.objects.filter(city__search='"orlando"')
schools_by_attribute_city = School.objects.filter(schoolattribute__attribute__name__search='"subjects"')
result = set(schools_by_city).union(set(schools_by_attribute_city))
This will probably perform better than your original query (because each query can use an INNER JOIN), but you should test it. If my assumptions are wrong, you should probably rethink your DB structure (e.g. use a specialized search tool instead of MySQL full-text search, rethink SchoolAttribute, or whatever floats your boat).

Case sensitive LINQ to DataSet

I am having an issue with a strongly typed DataSet exhibiting case-sensitivity using LINQ to DataSet to retrieve and filter data. In my example project, I have created a strongly typed DataSet called DataSet1. It contains a single DataTable called Customers. To instantiate and populate, I create a couple of rows (notice the casing on the names):
// Instantiate
DataSet1 ds = new DataSet1();
// Insert data
ds.Customers.AddCustomersRow(1, "Smith", "John");
ds.Customers.AddCustomersRow(2, "SMith", "Jane");
Next, I can easily fetch/filter using the DataSet's built-in Select functionality:
var res1 = ds.Customers.Select("LastName LIKE 'sm%'");
Console.WriteLine("DataSet Select: {0}", res1.Length);
DataSet Select: 2
The trouble begins when attempting to use LINQ to DataSet to perform the same operation:
var res2 = from c in ds.Customers where c.LastName.StartsWith("sm") select c;
Console.WriteLine("LINQ to DataSet: {0}", res2.Count());
LINQ to DataSet: 0
I've already checked the instantiated DataSet's CaseSensitive property as well as the Customer DataTable's CaseSensitive property--both are false. I also realize that when using the Select methodology, the DataSet performs the filtering and the LINQ query is doing something else.
My hope and desire for this type of code was to use it to Unit Test our Compiled LINQ to SQL queries so I can't really change all the current queries to use:
...where c.LastName.StartsWith("sm", StringComparison.CurrentCultureIgnoreCase) select c;
...as that changes the query in SQL. Thanks all for any suggestions!
LINQ to DataSet still uses ordinary managed methods, including the standard String.StartsWith method.
These methods have no way of knowing about the DataTable's CaseSensitive property.
Instead, you can use an ExpressionVisitor to rewrite all StartsWith (or similar) calls so that they pass StringComparison.CurrentCultureIgnoreCase.
You could also use c.LastName.ToLower().StartsWith("sm"), which matches regardless of how the stored value is cased. Good luck!