Summary: Which is quicker: updating and flushing a list of entities, or running a query builder update for each one?
We have the following situation in Doctrine ORM (version 2.3).
We have a table that looks like this
cow
wolf
elephant
koala
and we would like to use this table to sort a report of a fictional farm. The problem is that the user wishes to have a custom ordering of the animals (e.g. Koala, Elephant, Wolf, Cow). It is possible to use CONCAT or CASE to add a weight in the DQL (for example 0002wolf, 0001elephant). In my experience this is tricky to build, and when I got it working the result set was an array and not a collection.
So, to solve this we added a "weight" field to each record and, before running the select, we assign each one with a weight:
$animals = $em->getRepository('AcmeDemoBundle:Animal')->findAll();
foreach ($animals as $animal) {
if ($animal->getName() == 'koala') {
$animal->setWeight(1);
} else if ($animal->getName() == 'elephant') {
$animal->setWeight(2);
}
// etc
$em->persist($animal);
}
$em->flush();
$query = $em->createQuery(
'SELECT c FROM AcmeDemoBundle:Animal c ORDER BY c.weight'
);
This works perfectly. To avoid race conditions we added this inside a transaction block:
$em->getConnection()->beginTransaction();
// code from above
$em->getConnection()->rollback();
This is a lot more robust as it handles multiple users generating the same report. Alternatively the entities can be weighted like this:
$em->getConnection()->beginTransaction();
$qb = $em->createQueryBuilder();
$q = $qb->update('AcmeDemoBundle:Animal', 'c')
->set('c.weight', $qb->expr()->literal(1))
->where('c.name = ?1')
->setParameter(1, 'koala')
->getQuery();
$p = $q->execute();
$qb = $em->createQueryBuilder();
$q = $qb->update('AcmeDemoBundle:Animal', 'c')
->set('c.weight', $qb->expr()->literal(2))
->where('c.name = ?1')
->setParameter(1, 'elephant')
->getQuery();
$p = $q->execute();
// etc
$query = $em->createQuery(
'SELECT c FROM AcmeDemoBundle:Animal c ORDER BY c.weight'
);
$em->getConnection()->rollback();
Questions:
1) Which of the two examples would have better performance?
2) Is there a third or better way to do this bearing in mind we need a collection as a result?
Please remember that this is just an example - sorting the result set in memory is not an option, it must be done on the database level - the real statement is a 10 table join with 5 orderbys.
Initially you could make use of Doctrine's SQL logging (the \Doctrine\DBAL\Logging namespace, for example with a \Doctrine\DBAL\Logging\Profiler class that implements SQLLogger). I know that it is not the best answer, but at least you can implement it to measure the execution time of each example that you have.
namespace Doctrine\DBAL\Logging;
class Profiler implements SQLLogger
{
public $start = null;
public function __construct()
{
}
/**
* {@inheritdoc}
*/
public function startQuery($sql, array $params = null, array $types = null)
{
$this->start = microtime(true);
}
/**
* {@inheritdoc}
*/
public function stopQuery()
{
// Parentheses are required: without them older PHP versions concatenate first and subtract afterwards.
echo "execution time: " . (microtime(true) - $this->start);
}
}
In your main Doctrine configuration you can enable it as follows:
$logger = new \Doctrine\DBAL\Logging\Profiler;
$config->setSQLLogger($logger);
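On the second question (a third way that still returns entities), one option worth measuring is a DQL CASE expression selected AS HIDDEN, so the computed weight is used for ORDER BY but is not part of the result. The following is only a sketch under assumptions: the function name buildCustomOrderDql and the alias sortWeight are hypothetical, and it assumes your Doctrine version supports the HIDDEN keyword. It only builds the DQL string and its parameters, so no table has to be updated before the select.

```php
<?php

/**
 * Builds a DQL query that sorts Animal entities by a caller-supplied name order.
 * The entity name and the alias "c" come from the question; the function name
 * and the "sortWeight" alias are hypothetical.
 *
 * @param string[] $orderedNames names in the desired order
 * @return array{0: string, 1: array<string, string>} DQL string and its parameters
 */
function buildCustomOrderDql(array $orderedNames): array
{
    if ($orderedNames === []) {
        // Nothing to weigh: fall back to the plain query.
        return ['SELECT c FROM AcmeDemoBundle:Animal c', []];
    }

    $whens = [];
    $params = [];
    foreach (array_values($orderedNames) as $i => $name) {
        $whens[] = sprintf('WHEN c.name = :name%d THEN %d', $i, $i + 1);
        $params['name' . $i] = $name;
    }

    // Names that are not listed sort after all listed ones.
    $else = count($orderedNames) + 1;

    $dql = 'SELECT c, CASE ' . implode(' ', $whens) . ' ELSE ' . $else . ' END AS HIDDEN sortWeight'
         . ' FROM AcmeDemoBundle:Animal c ORDER BY sortWeight ASC';

    return [$dql, $params];
}

[$dql, $params] = buildCustomOrderDql(['koala', 'elephant', 'wolf', 'cow']);
echo $dql, "\n";
```

The result could then be passed to $em->createQuery($dql)->setParameters($params)->getResult(). Whether this beats the two UPDATE variants is something the profiler above would have to show for your data.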
I am refactoring a legacy code base that contains a lot of raw SQL; the whole thing is spaghetti code.
Sadly, Doctrine has no REPLACE INTO functionality. I know the reasons why.
I found some workarounds, like merge, but that is deprecated.
I spent a lot of hours learning Doctrine, because it is a widely used ORM, and a lot more hours building the entities.
Is there any "legal" solution to achieve this REPLACE INTO?
You can handle duplicates by catching UniqueConstraintViolationException.
$entity = new PossibleDuplicatedEntity();
try {
$em->persist($entity);
$em->flush();
}
catch (\Doctrine\DBAL\Exception\UniqueConstraintViolationException $e) {
// handle duplicated values
}
But beware: Doctrine wraps the flush in an implicit transaction. When an exception is thrown, the transaction is rolled back and the EntityManager is closed (entities are detached). See documentation.
It would be better to handle the cause of the duplication. For example, if it is caused by concurrency, try table locking etc.
If you just want to 'upsert' rows into the database and you don't care about change tracking, you could use the piece of code below. It's probably more performant than relying on catching UniqueConstraintViolationException.
I use it for batch-uploading rows into a table.
/**
* @throws Exception
*/
public static function replaceInto(EntityManagerInterface $em, iterable $entities): void
{
$i = 0;
$values = [];
$metadata = $fieldNames = $sql = null;
foreach ($entities as $entity) {
if (!isset($metadata)) {
$metadata = $em->getClassMetadata(get_class($entity));
$fieldNames = $metadata->getFieldNames();
$tableName = $metadata->getTableName();
$sql = "REPLACE INTO `$tableName` VALUES ";
}
$params = [];
foreach ($fieldNames as $fieldName) {
$paramName = ':' . $metadata->getColumnName($fieldName) . '_' . $i;
$params[] = $paramName;
$values[] = [$paramName, $metadata->getFieldValue($entity, $fieldName), $metadata->getTypeOfField($fieldName)];
}
if ($i > 0) {
$sql .= ",";
}
$sql .= "\n(" . join(', ', $params) . ")";
$i++;
}
$stmt = $em->getConnection()->prepare($sql);
foreach ($values as $value) {
$stmt->bindValue(...$value);
}
$stmt->executeQuery();
}
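The function above emits REPLACE INTO without a column list, so it relies on the order of the mapped fields matching the order of the table columns. As a hedged variation, the sketch below builds the same kind of multi-row statement with explicit column names. The helper name buildReplaceInto is hypothetical, it works on plain arrays rather than entities, and it assumes every row has the same keys and that table and column names come from trusted metadata (they are interpolated into the SQL).

```php
<?php

/**
 * Builds a multi-row REPLACE INTO statement with an explicit column list.
 * Hypothetical helper: $rows is a list of column => value maps sharing the same keys.
 *
 * @return array{0: string, 1: array<string, mixed>} SQL and its named parameters
 */
function buildReplaceInto(string $tableName, array $rows): array
{
    if ($rows === []) {
        throw new InvalidArgumentException('At least one row is required.');
    }

    // The first row defines the column list for the whole statement.
    $columns = array_keys(reset($rows));
    $tuples = [];
    $params = [];

    foreach (array_values($rows) as $i => $row) {
        $placeholders = [];
        foreach ($columns as $column) {
            // One uniquely named parameter per column and row, e.g. name_0, name_1.
            $name = $column . '_' . $i;
            $placeholders[] = ':' . $name;
            $params[$name] = $row[$column];
        }
        $tuples[] = '(' . implode(', ', $placeholders) . ')';
    }

    $quoted = array_map(fn (string $c): string => '`' . $c . '`', $columns);
    $sql = sprintf(
        'REPLACE INTO `%s` (%s) VALUES %s',
        $tableName,
        implode(', ', $quoted),
        implode(', ', $tuples)
    );

    return [$sql, $params];
}

[$sql, $params] = buildReplaceInto('animal', [
    ['id' => 1, 'name' => 'koala'],
    ['id' => 2, 'name' => 'wolf'],
]);
echo $sql, "\n";
```

The pair could then be handed to the DBAL connection, for example $em->getConnection()->executeStatement($sql, $params) on recent DBAL versions or executeUpdate() on older ones.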
I'm using Symfony 3.4 and Doctrine.
I need to update a large number of entities (300k+) using Doctrine.
I've read the batch processing article from the Doctrine docs and related topics on Stack Overflow, but regardless of the batch size (20, 100, 200, 500) I get an 'out of memory' error when I approach 20k processed entities.
Here is my function.
Can someone, please, give me a hint/suggestion how to avoid this?
protected function execute(InputInterface $input, OutputInterface $output): void
{
$io = new SymfonyStyle($input, $output);
$em = $this->getContainer()->get('doctrine.orm.entity_manager');
$em->getConfiguration()->setSQLLogger(null);
$repository = $em->getRepository('AppBundle:Order');
$qb = $repository->createQueryBuilder('o');
$totalCount = (int) $qb->select($qb->expr()->count('o'))
->where($qb->expr()->eq('o.amountOut', 0))
->getQuery()
->getSingleScalarResult();
$progressBar = $io->createProgressBar($totalCount);
$query = $qb->select('o')
->where($qb->expr()->eq('o.amountOut', 0))
->getQuery();
$iterableResult = $query->iterate();
$batchSize = 100;
$i = 0;
foreach ($iterableResult as $row) {
/** @var Order $order */
$order = $row[0];
$commissionsArr = $this->calcCommissionInOutFromOrder($order);
$amountOut = $order->getTransferAmount();
$order->setAmountOut($amountOut);
$order->setCommissionIn($commissionsArr['commission_in']);
$order->setCommissionOut($commissionsArr['commission_out']);
$em->persist($order);
$progressBar->advance();
if (0 === ($i % $batchSize)) {
$em->flush();
$em->clear();
}
++$i;
}
$em->flush();
$io->success('Suckess');
}
I found the actual answer in "Memory leak when executing Doctrine query in loop".
Quoting: "I resolved this by adding --no-debug to my command. It turns out that in debug mode, the profiler was storing information about every single query in memory."
It actually worked. I checked it using memory_get_usage().
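For anyone who wants to repeat that check, here is a minimal sketch of how memory could be sampled after each batch. It is independent of Doctrine: processInBatches is a hypothetical helper, $process handles one item, and $flush stands in for $em->flush() followed by $em->clear(). If the samples keep growing from batch to batch, something outside the unit of work (such as a query logger) is probably holding references.

```php
<?php

/**
 * Processes items in batches and records memory usage after each flush.
 * Hypothetical helper; $flush stands in for $em->flush() plus $em->clear().
 *
 * @return int[] memory usage in bytes, one sample per flush
 */
function processInBatches(iterable $items, int $batchSize, callable $process, callable $flush): array
{
    if ($batchSize < 1) {
        throw new InvalidArgumentException('Batch size must be at least 1.');
    }

    $samples = [];
    $i = 0;
    foreach ($items as $item) {
        $process($item);
        ++$i;
        // Flush after every full batch, not on the very first item.
        if ($i % $batchSize === 0) {
            $flush();
            $samples[] = memory_get_usage();
        }
    }

    // Flush the remaining items of the last, incomplete batch.
    if ($i % $batchSize !== 0) {
        $flush();
        $samples[] = memory_get_usage();
    }

    return $samples;
}

$flushes = 0;
$samples = processInBatches(
    range(1, 250),
    100,
    function ($n) {
        // Per-item work would go here.
    },
    function () use (&$flushes) {
        ++$flushes;
    }
);
printf("flushes: %d, last sample: %d bytes\n", $flushes, end($samples));
```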
Yes, as the title suggests, Doctrine is looking for a field name that's not there. That's both true and not true at the same time, though I cannot figure out how to fix it.
The full error:
File: D:\path\to\project\vendor\doctrine\dbal\lib\Doctrine\DBAL\Driver\AbstractMySQLDriver.php:71
Message: An exception occurred while executing 'SELECT DISTINCT id_2
FROM (SELECT p0_.name AS name_0, p0_.code AS code_1, p0_.id AS id_2
FROM product_statuses p0_) dctrn_result ORDER BY p0_.language_id ASC, name_0 ASC LIMIT 25
OFFSET 0':
SQLSTATE[42S22]: Column not found: 1054 Unknown column
'p0_.language_id' in 'order clause'
The query the error is caused by (from error above):
SELECT DISTINCT id_2
FROM (
SELECT p0_.name AS name_0, p0_.code AS code_1, p0_.id AS id_2
FROM product_statuses p0_
) dctrn_result
ORDER BY p0_.language_id ASC, name_0 ASC
LIMIT 25 OFFSET 0
Clearly, that query is not going to work. Either the ORDER BY should be in the sub-query, or p0_ in the ORDER BY should be replaced with dctrn_result and the sub-query should also return the language_id column.
The query is built using the QueryBuilder in the indexAction of a Controller in Zend Framework. Everything is perfectly normal, and the same function works fine when using the addOrderBy() function for a single ORDER BY statement. In this instance I wish to use two: first by language, then by name. But then the above happens.
If someone knows a full solution to this (or maybe it's a bug?), that would be nice. Else a hint in the right direction to help me solve this issue would be greatly appreciated.
Below additional information - Entity and indexAction()
ProductStatus.php - Entity - Note the presence of language_id column
/**
* @ORM\Table(name="product_statuses")
* @ORM\Entity(repositoryClass="Hzw\Product\Repository\ProductStatusRepository")
*/
class ProductStatus extends AbstractEntity
{
/**
* @var string
* @ORM\Column(name="name", type="string", length=255, nullable=false)
*/
protected $name;
/**
* @var string
* @ORM\Column(name="code", type="string", length=255, nullable=false)
*/
protected $code;
/**
* @var Language
* @ORM\ManyToOne(targetEntity="Hzw\Country\Entity\Language")
* @ORM\JoinColumn(name="language_id", referencedColumnName="id")
*/
protected $language;
/**
* @var ArrayCollection|Product[]
* @ORM\OneToMany(targetEntity="Hzw\Product\Entity\Product", mappedBy="status")
*/
protected $products;
[Getters/Setters]
}
IndexAction - Removed parts not directly related to QueryBuilder. Added in comments showing params as they are.
/** @var QueryBuilder $qb */
$qb = $this->getEntityManager()->createQueryBuilder();
$qb->select($asParam) // 'pro'
->from($emEntity, $asParam); // Hzw\Product\Entity\ProductStatus, 'pro'
if (count($queryParams) > 0 && !is_null($query)) {
// [...] creates WHERE statement, unused in this instance
}
if (isset($orderBy)) {
if (is_array($orderBy)) {
// !!! This else is executed !!! <-----
if (is_array($orderDirection)) { // 'ASC'
// [...] other code
} else {
// $orderBy = ['language', 'name'], $orderDirection = 'ASC'
foreach ($orderBy as $orderParam) {
$qb->addOrderBy($asParam . '.' . $orderParam, $orderDirection);
}
}
} else {
// This works fine. A single $orderBy with a single $orderDirection
$qb->addOrderBy($asParam . '.' . $orderBy, $orderDirection);
}
}
================================================
UPDATE: I found the problem
The above issue is not caused by incorrect mapping or a possible bug. It's that the QueryBuilder does not automatically handle associations between entities when creating queries.
My expectation was that when an entity, such as ProductStatus above, contains the id's of the relation (i.e. language_id column), that it would be possible to use those properties in the QueryBuilder without issues.
Please see my own answer below for how I changed my functionality to provide default handling of a single level of nesting (i.e. ProductStatus#language == Language, with language.name usable as an ORDER BY identifier).
Ok, after more searching around and digging into how and where this goes wrong, I found out that Doctrine does not handle relation type properties of entities during the generation of queries; or maybe it does not default to using, say, the primary key of an entity if nothing is specified.
In the use case of my question above, the language property is a @ORM\ManyToOne association to the Language entity.
My use case calls for the ability to handle at least one level of relations for default actions. So after I realized that this is not handled automatically (or with modifications such as language.id or language.name as identifiers) I decided to write a little function for it.
/**
* Adds order by parameters to QueryBuilder.
*
* Supports single level nesting of associations. For example:
*
* Entity Product
* product#name
* product#language.name
*
* Language being associated entity, but must be ordered by name.
*
* @param QueryBuilder $qb
* @param string $tableKey - short alias (e.g. 'tab' with 'table AS tab') used for the starting table
* @param string|array $orderBy - string for single orderBy, array for multiple
* @param string|array $orderDirection - string for single orderDirection (ASC default), array for multiple. Must be same count as $orderBy.
*/
public function createOrderBy(QueryBuilder $qb, $tableKey, $orderBy, $orderDirection = 'ASC')
{
if (!is_array($orderBy)) {
$orderBy = [$orderBy];
}
if (!is_array($orderDirection)) {
$orderDirection = [$orderDirection];
}
// $orderDirection is an array. We check if it's of equal length with $orderBy, else throw an error.
if (count($orderBy) !== count($orderDirection)) {
throw new \InvalidArgumentException(
$this->getTranslator()->translate(
'If you specify both OrderBy and OrderDirection as arrays, they should be of equal length.'
)
);
}
$queryKeys = [$tableKey];
foreach ($orderBy as $key => $orderParam) {
if (strpos($orderParam, '.')) {
if (substr_count($orderParam, '.') === 1) {
list($entity, $property) = explode('.', $orderParam);
$shortName = strtolower(substr($entity, 0, 3)); // Might not be unique...
$shortKey = $shortName . '_' . (count($queryKeys) + 1); // Now it's unique, so $shortKey is used as the join alias
$queryKeys[] = $shortKey;
$qb->join($tableKey . '.' . $entity, $shortKey, Join::WITH);
$qb->addOrderBy($shortKey . '.' . $property, $orderDirection[$key]);
} else {
throw new \InvalidArgumentException(
$this->getTranslator()->translate(
'Only single join statements are supported. Please write a custom function for deeper nesting.'
)
);
}
} else {
$qb->addOrderBy($tableKey . '.' . $orderParam, $orderDirection[$key]);
}
}
}
It by no means supports everything the QueryBuilder offers and is definitely not a final solution. But it gives a starting point and solid "default functionality" for an abstract function.
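The alias handling is the part that is easiest to get wrong, so here is a small sketch of that logic on its own, without the QueryBuilder. It is an assumption-laden variation rather than the function above: planOrderBy is a hypothetical helper that turns order-by specs into a plan of joins and ORDER BY fields, and it reuses the alias when the same association appears twice so the association is only joined once.

```php
<?php

/**
 * Turns order-by specs such as ['language.name', 'name'] into a plan of joins
 * and ORDER BY fields. Hypothetical helper, independent of Doctrine.
 *
 * @param string   $rootAlias alias of the root entity, e.g. 'pro'
 * @param string[] $orderBy   field names, optionally prefixed with one association
 * @return array{joins: array<string, string>, orderBy: string[]}
 */
function planOrderBy(string $rootAlias, array $orderBy): array
{
    $joins = [];  // association path => join alias
    $fields = []; // fully qualified ORDER BY fields, in input order

    foreach ($orderBy as $spec) {
        $parts = explode('.', $spec);

        if (count($parts) === 1) {
            // Plain field on the root entity.
            $fields[] = $rootAlias . '.' . $spec;
        } elseif (count($parts) === 2) {
            [$association, $property] = $parts;
            $path = $rootAlias . '.' . $association;
            if (!isset($joins[$path])) {
                // The numeric suffix keeps aliases unique even if prefixes collide.
                $joins[$path] = strtolower(substr($association, 0, 3)) . '_' . (count($joins) + 1);
            }
            $fields[] = $joins[$path] . '.' . $property;
        } else {
            throw new InvalidArgumentException('Only one level of nesting is supported: ' . $spec);
        }
    }

    return ['joins' => $joins, 'orderBy' => $fields];
}

$plan = planOrderBy('pro', ['language.name', 'name', 'language.code']);
print_r($plan);
```

Applying the plan would then be a matter of calling $qb->join($path, $alias) for each entry in joins and $qb->addOrderBy($field, $direction) for each entry in orderBy.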
I currently have a fairly complex native SQL query which is used for reporting purposes. Given the amount of data it processes, native SQL is the only efficient way to handle it.
This works fine and returns an array of arrays from the scalar results.
What I'd like to do, to keep the results consistent with every other result set in the project, is use a Data Transfer Object (DTO), returning an array of simple DTO objects.
These work really well with DQL but I can't see any way of using them with native SQL. Is this at all possible?
Doctrine can map the results of a raw SQL query to an entity, as shown here:
http://doctrine-orm.readthedocs.org/projects/doctrine-orm/en/latest/reference/native-sql.html
I cannot see support for DTOs unless you are willing to use DQL as well, so a direct solution does not exist. I tried my hand at a simple workaround that works well enough, so here are the DQL and non-DQL ways to achieve your goal.
The examples were built using Laravel and the Laravel Doctrine extension.
The DTO
The below DTO supports both DQL binding and custom mapping so the constructor must be able to work with and without parameters.
<?php namespace App\Dto;
/**
* Date with corresponding statistics for the date.
*/
class DateTotal
{
public $taskLogDate;
public $totalHours;
/**
* DateTotal constructor.
*
* @param $taskLogDate The date for which to return totals
* @param $totalHours The total hours worked on the given date
*/
public function __construct($taskLogDate = null, $totalHours = null)
{
$this->taskLogDate = $taskLogDate;
$this->totalHours = $totalHours;
}
}
Using DQL to fetch results
Here is the standard version, using DQL.
public function findRecentDateTotals($taskId)
{
$fromDate = new DateTime('6 days ago');
$fromDate->setTime(0, 0, 0);
$queryBuilder = $this->getQueryBuilder();
$queryBuilder->select('NEW App\Dto\DateTotal(taskLog.taskLogDate, SUM(taskLog.taskLogHours))')
->from('App\Entities\TaskLog', 'taskLog')
->where($queryBuilder->expr()->orX(
$queryBuilder->expr()->eq('taskLog.taskLogTask', ':taskId'),
$queryBuilder->expr()->eq(0, ':taskId')
))
->andWhere(
$queryBuilder->expr()->gt('taskLog.taskLogDate', ':fromDate')
)
->groupBy('taskLog.taskLogDate')
->orderBy('taskLog.taskLogDate', 'DESC')
->setParameter(':fromDate', $fromDate)
->setParameter(':taskId', $taskId);
$result = $queryBuilder->getQuery()->getResult();
return $result;
}
Support for DTOs with native SQL
Here is a simple helper that can marshal the array results of a raw SQL query into objects. It can be extended to do other stuff as well, perhaps custom updates and so on.
<?php namespace App\Dto;
use Doctrine\ORM\EntityManager;
/**
* Helper class to run raw SQL.
*
* @package App\Dto
*/
class RawSql
{
/**
* Run a raw SQL query.
*
* @param string $sql The raw SQL
* @param array $parameters Array of parameter names mapped to values
* @param string $className The class to pack the results into
* @return Object[] Array of objects mapped from the array results
* @throws \Doctrine\DBAL\DBALException
*/
public static function query($sql, $parameters, $className)
{
/** @var EntityManager $em */
$em = app('em');
$statement = $em->getConnection()->prepare($sql);
$statement->execute($parameters);
$results = $statement->fetchAll();
$return = array();
foreach ($results as $result) {
$resultObject = new $className();
foreach ($result as $key => $value) {
$resultObject->$key = $value;
}
$return[] = $resultObject;
}
return $return;
}
}
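The mapping loop in the helper can be tried without a database. The sketch below is a hedged variation, not the helper itself: hydrateRows is a hypothetical function, the DateTotal class is repeated (without its namespace and constructor) so the example runs on its own, and the sample row is made up. It additionally rejects column aliases that have no matching property, which avoids silently creating dynamic properties when an alias is misspelled.

```php
<?php

/** Simplified copy of the DateTotal DTO, repeated so the sketch runs on its own. */
class DateTotal
{
    public $taskLogDate;
    public $totalHours;
}

/**
 * Maps associative rows onto instances of $className.
 * Hypothetical helper: throws if a column alias has no matching property.
 *
 * @param array<int, array<string, mixed>> $rows
 * @return object[]
 */
function hydrateRows(array $rows, string $className): array
{
    $objects = [];
    foreach ($rows as $row) {
        $object = new $className();
        foreach ($row as $key => $value) {
            if (!property_exists($object, $key)) {
                throw new UnexpectedValueException("$className has no property '$key'");
            }
            $object->$key = $value;
        }
        $objects[] = $object;
    }

    return $objects;
}

// Made-up sample row, shaped like the result of the raw SQL query.
$totals = hydrateRows(
    [['taskLogDate' => '2016-01-02', 'totalHours' => '7.5']],
    DateTotal::class
);
var_dump($totals[0]->totalHours);
```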
Running the raw SQL version
The function is used and called in the same way as other repository methods, and just calls on the above helper to automate the conversion of data to objects.
public function findRecentDateTotals2($taskId)
{
$fromDate = new DateTime('6 days ago');
$sql = "
SELECT
task_log.task_log_date AS taskLogDate,
SUM(task_log.task_log_hours) AS totalHours
FROM task_log task_log
WHERE (task_log.task_log_task = :taskId OR :taskId = 0) AND task_log.task_log_date > :fromDate
GROUP BY task_log_date
ORDER BY task_log_date DESC
";
$return = RawSql::query(
$sql,
array(
'taskId' => $taskId,
'fromDate' => $fromDate->format('Y-m-d')
),
DateTotal::class
);
return $return;
}
Notes
I would not dismiss DQL too quickly as it can perform most kinds of SQL. I have however also recently been involved in building management reports, and in the world of management information the SQL queries can be as large as whole PHP files. In that case I would join you and abandon Doctrine (or any other ORM) as well.
I don't understand how to make this work.
I have:
a table partner with fields id and name
a table partner_address with two fields: id_partner and id_address
a table address with fields id and a foreign key id_town which references town(id)
a table town with fields id, name, and postal_code
I want to select all partners that are in towns with a specific postal_code.
This query works:
SELECT p.name, t.name
FROM partner p
JOIN partner_address pa
ON pa.id_partner=p.id
JOIN address a
ON pa.id_address = a.id
JOIN town t
ON a.id_town=t.id
WHERE t.postal_code='13480';
Now I want to "translate" it into Doctrine 2 full syntax, following the documentation.
So I've made a custom repository:
src/Society/Bundle/MyProjectBundle/Repository/PartnerRepository.php
In this repository, I'm trying to create the corresponding function:
<?php
namespace HQF\Bundle\PizzasBundle\Repository;
use Doctrine\ORM\EntityRepository;
class PartenaireRepository extends EntityRepository
{
/**
* Get all active partners from a given postal code.
*/
public function findAllActiveByCp($cp)
{
return $this->createQueryBuilder('p')
->where('p.dateVFin IS NULL')
->andWhere('p.cp=:cp')
->addOrderBy('p.cp', 'DESC')
->setParameter('cp', $cp);
}
}
Note: the query in the code is not the right one, but this code works in another custom repository I've made, so I'm trying to start from it.
I'm trying something like this but it doesn't work:
public function findAllActiveByCp($cp)
{
$qb = $this->createQueryBuilder('p');
return $qb
->leftJoin('partner_address pa ON pa.id_partner=p.id')
->leftJoin('address a ON pa.id_address = a.id')
->leftJoin('town t ON a.id_ville=t.id')
->where('p.dateVFin IS NULL')
->andWhere('t.cp=:cp')
->addOrderBy('t.cp', 'DESC')
->setParameter('cp', $cp);
}
I get this error:
Warning: Missing argument 2 for Doctrine\ORM\QueryBuilder::leftJoin(),
called in
/blabla/Repository/PartenaireRepository.php
on line 18 and defined in
/blabla/symfony/vendor/doctrine/orm/lib/Doctrine/ORM/QueryBuilder.php
line 767
You can only join on properties (associations) that the selected entity has.
As the first parameter of join(), leftJoin() or any other xxxJoin() you pass the name of the attribute on the selected object, and as the second the alias for the joined entity.
Try something similar to this:
$q = $this->em()->createQueryBuilder();
$q->select(['item', 'itemContact'])
->from('ModuleAdmin\Entity\CustomerEntity', 'item')
->leftJoin('item.contacts', 'itemContact')
->andWhere($q->expr()->like('item.name', ':customerNameStart'));
Of course, the CustomerEntity contains a OneToMany relation in the field contacts.
Remember that in the select statement you have to select the root entity (in my example CustomerEntity, aliased as item).
Edit by Olivier Pons to add how I found out the solution, and to mark this answer as valid, because it put me on the right track, thank you Adam!
In the file PartenaireRepository.php I've used the createQueryBuilder('p') properly. Here's how to make two joins in a row, using createQueryBuilder():
class PartenaireRepository extends EntityRepository
{
/**
* Retrieve all partners for a given postal code.
*/
public function findAllActiveByCp($cp)
{
return $this->createQueryBuilder('p')
->leftJoin('p.adresses', 'a')
->leftJoin('a.ville', 'v')
->where('v.cp=:cp')
->setParameter('cp', $cp);
... blabla
}
}
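For the joins on p.adresses and a.ville to work, the entities need mapped associations with those names. The actual entities were not posted, so the following mapping is purely an assumption that matches the tables from the question: the class names Adresse and Ville, the join table settings and the field names are guesses, and identifier fields, getters for the other classes and the usual ArrayCollection initialisation are omitted to keep the sketch short.

```php
<?php

use Doctrine\ORM\Mapping as ORM;

/**
 * @ORM\Entity
 * @ORM\Table(name="partner")
 */
class Partenaire
{
    /**
     * Hypothetical mapping: partner_address acts as a pure join table,
     * so it needs no entity of its own.
     *
     * @ORM\ManyToMany(targetEntity="Adresse")
     * @ORM\JoinTable(name="partner_address",
     *     joinColumns={@ORM\JoinColumn(name="id_partner", referencedColumnName="id")},
     *     inverseJoinColumns={@ORM\JoinColumn(name="id_address", referencedColumnName="id")}
     * )
     */
    private $adresses = []; // a real entity would use an ArrayCollection here

    public function getAdresses()
    {
        return $this->adresses;
    }
}

/**
 * @ORM\Entity
 * @ORM\Table(name="address")
 */
class Adresse
{
    /**
     * @ORM\ManyToOne(targetEntity="Ville")
     * @ORM\JoinColumn(name="id_town", referencedColumnName="id")
     */
    private $ville;
}

/**
 * @ORM\Entity
 * @ORM\Table(name="town")
 */
class Ville
{
    /**
     * @ORM\Column(name="postal_code", type="string")
     */
    private $cp;
}
```

With a mapping along these lines, 'p.adresses' and 'a.ville' in the query builder refer to the properties $adresses and $ville, and 'v.cp' refers to the field mapped to the postal_code column.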
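I believe for what you're doing, you will need to provide four arguments to the leftJoin method. Note that DQL does not accept ON as the condition type, so WITH has to be used. The first argument also has to be a mapped entity class rather than a table name, and the condition has to reference entity fields rather than column names; a pure join table such as partner_address, which has no entity of its own, cannot be joined this way.
->leftJoin('partner_address', 'pa', 'WITH', 'pa.id_partner = p.id')
So your query builder chain should look like this
public function findAllActiveByCp($cp)
{
$qb = $this->createQueryBuilder('p');
// 'partner_address', 'address' and 'town' stand in for the mapped entity class names,
// and the id_* names stand in for the corresponding entity fields.
return $qb
->leftJoin('partner_address', 'pa', 'WITH', 'pa.id_partner = p.id')
->leftJoin('address', 'a', 'WITH', 'pa.id_address = a.id')
->leftJoin('town', 't', 'WITH', 'a.id_ville = t.id')
->where('p.dateVFin IS NULL')
->andWhere('t.cp=:cp')
->addOrderBy('t.cp', 'DESC')
->setParameter('cp', $cp)
;
}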