I'm in the unfortunate position of maintaining a TYPO3 project. The TYPO3 "ORM" is really slow and awkward to use and so I want to use Doctrine ORM for some of my read operations.
TYPO3 has a pretty weird way of storing translations and so my tables look something like this:
> products
id
---
123
64
37
> product_names
product_id | locale | name
-----------+--------+--------------
64         | en     | My Product
64         | de     | Mein Artikel
Assuming the current locale is always available in some kind of magic global variable, is there a way to map this to the following class in Doctrine ORM? (Basically adding a static AND locale = "en" to the JOIN condition. Or maybe in a custom Type? Or some kind of proxy? I don't know.)
class Product {
    private string $id;
    private ProductName $name;
}

class ProductName {
    private string $name;
}
Obviously, I can't really change anything about the database or the class structure since TYPO3 expects it this way.
Currently, my only idea is to use an event subscriber to add some kind of proxy for the name in postLoad, but that would have the significant overhead of having a separate DB query every time the product name needs to be accessed - for every single product.
That structure is not the usual way TYPO3 handles translations.
Normally, translations are handled by a copy of the original record (in the same table) that carries a pointer to the original (l18n_parent) and a field indicating the record's language (sys_language_uid).
Additionally, TYPO3 identifies languages not by locale but by declared language UIDs, so the same UID can stand for different languages in different installations.
To me, your data structure looks like it was imported from somewhere else.
My application has models such as the following:
import attr

@attr.s
class Employee:
    name = attr.ib(type=str)
    department = attr.ib(type=int)
    organization_unit = attr.ib(type=int)
    pay_class = attr.ib(type=int)
    cost_center = attr.ib(type=int)
It works okay, but I'd like to refactor my application toward more of a microkernel (plugin) pattern, where there is a core Employee model that might have just the name, and plugins can add other properties. I imagine one possible solution might be:
@attr.s
class Employee:
    name = attr.ib(type=str)
    labels = attr.ib(type=list)
An employee might look like this:
Employee(
    name='John Doe',
    labels=['department:123',
            'organization_unit:456',
            'pay_class:789',
            'cost_center:012'],
)
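Whichever way you go, plugins will need helpers to move between label strings and typed properties. A minimal sketch in plain Python (the helper names are mine, not part of any Datastore API):

```python
def parse_labels(labels):
    """Split 'key:value' label strings into a dict; values stay strings,
    so numeric fields lose their type (one cost of this approach)."""
    return dict(label.split(":", 1) for label in labels)

def make_labels(props):
    """Inverse: flatten a property dict back into label strings."""
    return [f"{key}:{value}" for key, value in props.items()]
```

For example, `parse_labels(['department:123', 'pay_class:789'])` yields `{'department': '123', 'pay_class': '789'}`; note the values come back as strings, which is why range queries over numeric fields are lost.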
Perhaps another solution would be to just create an entity for each "label" with the core employee as the ancestor key. One concern with this solution is that currently writes to an entity group are limited to 1 per second, although that limitation will go away (hopefully soon) once Google upgrades existing Datastores to the new "Cloud Firestore in Datastore mode":
https://cloud.google.com/datastore/docs/firestore-or-datastore#in_native_mode
I suppose an application-level trade-off between the list property and ancestor keys approaches is that the list approach more tightly couples plugins with the core, whereas the ancestor key has a somewhat more decoupled data scheme (though not entirely).
Are there any other trade-offs I should be concerned with, performance or otherwise?
Personally I would go with multiple properties, for many reasons, but it's possible to mix all of these solutions for varying degrees of flexibility as required by the app. The main trade-offs are:
a) You can't do joins in Datastore, so storing related data in multiple entities will prevent querying with complex where clauses (the ancestor key approach).
b) You can't do range queries if you encode numeric and date fields as labels (the list property approach).
c) The index could be large and expensive if you index your labels field but only a small set of the labels actually needs to be indexed.
So, one way to mix all three is:
a) For your static data and application logic, use multiple properties.
b) For dynamic data that is not going to be used for querying, use a list of labels.
c) For pluggable data that a plugin needs to query on but doesn't need to join with the static data, create another entity that again uses a) and b), so the plugin stores all of its related data together.
Situation: I have a set of Books. A book can be one of three types: "Test", "Premium", and "Common". The data is split roughly 2%, 15%, and 83%, while the query volume per time unit is about 40%, 20%, and 40% respectively.
I see several ways to model this in the database:
1. Boolean fields: is_test, is_premium. If we need only "Test" books: Book.objects.filter(is_test=True). This could be a proxy model, for example; analogously for premium books.
2. Separate tables: books_test, books_premium, books_common.
3. Choice field: a string in ['Test', 'Premium', 'Common'].
4. Combine 1 and 2: a books_test table plus a books table with an is_premium attribute.
And we need to query this data optimally! All three book variants appear on one page. The queryset combinations that occur are: only tests, only common, common + premium, only premium.
If we use variant 1 or 3: one endpoint with a specific filter.
If we use variant 2: one of three endpoints without filters (the frontend has to know which endpoint to use), or one endpoint with conditions checked by the backend. Either way, extra logic is needed.
Which way is more correct, and why?
If you need to mix different types on one page, separate models/tables would complicate things for no good reason. The same goes for mapping more than two exclusive states to a combination of boolean fields.
This leaves you with a choice field or a separate BookType model containing the choices.
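A framework-free sketch of the choice-field idea (in Django this would be a CharField with choices; all names here are illustrative, and the filter is emulated over plain dicts):

```python
class BookType:
    """The three exclusive states as a single choice field."""
    TEST, PREMIUM, COMMON = "test", "premium", "common"
    CHOICES = [(TEST, "Test"), (PREMIUM, "Premium"), (COMMON, "Common")]

def filter_books(books, wanted):
    """Emulates Book.objects.filter(type__in=wanted) over plain dicts."""
    return [b for b in books if b["type"] in wanted]

books = [
    {"id": 1, "type": BookType.TEST},
    {"id": 2, "type": BookType.PREMIUM},
    {"id": 3, "type": BookType.COMMON},
]
# The "common + premium" page is a single filtered query, no extra endpoints:
premium_and_common = filter_books(books, {BookType.PREMIUM, BookType.COMMON})
```

Every queryset combination from the question (only tests, only common, common + premium, only premium) becomes one call with a different `wanted` set, which is the main argument for a single table with a type column.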
Typically when you implement an entity using Doctrine you map it to a table explicitly:
<?php
/**
 * @Entity
 * @Table(name="message")
 */
class Message
{
    //...
}
Or you rely on Doctrine to implicitly map your class name to a table. I have several tables which are identical in schema, but I do not wish to re-create the class each time; therefore at runtime I would like to change the table name dynamically.
Where do I start, or what would I look into overriding, to implement this odd requirement?
Surprisingly (to me), the solution is very simple. All you have to do is to get the ClassMetadata of your entity and change the name of the table it maps to:
/** @var EntityManager $em */
$class = $em->getClassMetadata('Message');
$class->setPrimaryTable(['name' => 'message_23']);
You need to be careful not to change the table name after you have loaded some entities of type Message and modified them. There's a good chance it will either produce SQL errors on saving (because of table constraints, for example), if you are lucky, or it will silently modify the wrong row (in the new table).
I suggest the following workflow:
set the desired table name;
load some entities;
modify them at will;
save them;
detach them from the entity manager (the method EntityManager::clear() is a quick way to start over);
go back to step 1 (i.e. repeat using another table).
Step #5 (detaching the entities from the entity manager) is useful even if you don't change or don't save the entities. It allows the entity manager to use less memory and work faster.
This is just one of the many methods you can use to dynamically set/change the mapping. Take a look at the documentation of class ClassMetadata for the rest of them. You can find more inspiration in the documentation page of the PHP mapping.
I have a PHP app that I am considering rewriting in either Django or Rails (I have done some maintenance work on it over the years but am not that familiar with issues like this). Ideally, I'd like to keep the db schema as close as possible to what I'm using. It has a model like the following:
menu - id, name
menu_headers - id, menu_id, parent_menu_header_id, sort, name
The logic in the getMenu($id) function is to get the menu by its id and then get the menu_headers with the correct menu_id and a parent_menu_header_id of 0. A sub-menu function then gets called that fetches submenus based upon the parent_menu_header_id. In other words, 0 means it is a root menu_header (i.e. select * from menu_headers where menu_id=$menu_id and parent_menu_header_id=0 order by sort). This all gets pushed to memcache, so performance is not a concern.
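That getMenu logic ports directly to Python; here is a sketch against sqlite3 (table and column names are taken from the question, the recursive shape of the function is my own):

```python
import sqlite3

def get_menu_headers(conn, menu_id, parent_id=0):
    """Fetch one level of headers, then recurse into each header's children."""
    rows = conn.execute(
        "SELECT id, name FROM menu_headers "
        "WHERE menu_id = ? AND parent_menu_header_id = ? ORDER BY sort",
        (menu_id, parent_id),
    ).fetchall()
    return [
        {"id": hid, "name": name, "children": get_menu_headers(conn, menu_id, hid)}
        for hid, name in rows
    ]

# Throwaway in-memory copy of the schema to exercise the function:
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE menu_headers (id INTEGER PRIMARY KEY, menu_id INTEGER, "
    "parent_menu_header_id INTEGER, sort INTEGER, name TEXT)"
)
conn.executemany(
    "INSERT INTO menu_headers VALUES (?, ?, ?, ?, ?)",
    [(1, 1, 0, 1, "Wine"), (2, 1, 1, 1, "Red"), (3, 1, 1, 2, "White")],
)
tree = get_menu_headers(conn, 1)
```

Since the result is pushed to memcache anyway, the N+1 queries from the recursion only happen on a cache miss.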
I'm considering moving the app to django and am investigating how difficult / possible this would be.
I currently have:
class Menu(models.Model):
    location = models.ForeignKey(Location)
    name = models.CharField(max_length=200)

class Menu_Header(models.Model):
    menu = models.ForeignKey(Menu)
    parent = models.ForeignKey('self', null=True, blank=True,
                               related_name="children")
A couple of issues have come up. It isn't a true foreign key relationship, and it looks like composite foreign keys are not supported. Maybe I could use something like a Root_Menu_Header which does have a true FK relationship. Is there a better way to model this? I have looked at django-mptt but think this should be possible without it. Any ideas?
thx
--edit #2
I'm probably not getting what you're saying but for example I currently have:
menu
id  name
1   test menu

menu_header
id  menu_id  parent_id  name
1   1        NULL       Wine
2   1        1          Red
3   1        1          White
When I get the Menu object, it has all 3 menu headers at the same level, so this clearly isn't working correctly. Should I be manipulating this at the view level, then? Or should the foreign key (menu_id) not be set in the menu_header table? Sorry for the confusion, but it would be a big help to figure this out. If you have any suggestions on whether it would be better to do this in Rails, that would also be appreciated.
thx
Django expects that if a foreign key is not NULL, it maps to a real object. You're not really going to be able to get around that. However, the absence of a value for parent (NULL) implicitly means there is no parent, and you'll find that developing the app around this is quite natural.
The only real problem I see is if you're trying to use the existing database (or migrate data from it). In that case, you'll only need to run a SQL update and set parent to NULL wherever it's 0.
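That migration is a single statement; here it is run against a throwaway in-memory sqlite3 copy of the schema (column names assumed from the question's tables):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE menu_header (id INTEGER PRIMARY KEY, menu_id INTEGER, "
    "parent_id INTEGER, name TEXT)"
)
# Legacy data where "no parent" is encoded as 0 rather than NULL:
conn.executemany(
    "INSERT INTO menu_header VALUES (?, ?, ?, ?)",
    [(1, 1, 0, "Wine"), (2, 1, 1, "Red"), (3, 1, 1, "White")],
)

# Django wants "no parent" to be NULL, not 0:
conn.execute("UPDATE menu_header SET parent_id = NULL WHERE parent_id = 0")
```

After the update, root rows carry NULL and Django's nullable self-referencing ForeignKey maps them cleanly.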
I've been tinkering with SQLite3 for the past couple days, and it seems like a decent database, but I'm wondering about its uses for serialization.
I need to serialize a set of key/value pairs which are linked to another table, and this is the way I've been doing this so far.
First there will be the item table:
CREATE TABLE items (id INTEGER PRIMARY KEY, details);
| id | details |
-+----+---------+
| 1 | 'test' |
-+----+---------+
| 2 | 'hello' |
-+----+---------+
| 3 | 'abc' |
-+----+---------+
Then there will be a table for each item:
CREATE TABLE itemkv## (key TEXT, value); -- where ## is an 'id' field in TABLE items
| key | value |
-+-----+-------+
|'abc'|'hello'|
-+-----+-------+
|'def'|'world'|
-+-----+-------+
|'ghi'| 90001 |
-+-----+-------+
This was working okay until I noticed that there was a one kilobyte overhead for each table. If I was only dealing with a handful of items, this would be acceptable, but I need a system that can scale.
Admittedly, this is the first time I've ever used anything related to SQL, so perhaps I don't know what a table is supposed to be used for, but I couldn't find any concept of a "sub-table" or "struct" data type. Theoretically, I could convert the key/value pairs into a string like so, "abc|hello\ndef|world\nghi|90001" and store that in a column, but it makes me wonder if that defeats the purpose of using a database in the first place, if I'm going to the trouble of converting my structures to something that could be as easily stored in a flat file.
I welcome any suggestions anybody has, including suggestions of a different library better suited to serialization purposes of this type.
You might try PRAGMA page_size = 512; prior to creating the db, or prior to creating the first table, or prior to executing a VACUUM statement. (The manual is a bit contradictory and it also depends on the sqlite3 version.)
I think it's also kind of rare to create tables dynamically at a high rate. It's good that you are normalizing your schema, but it's OK for columns to depend on a primary key and, while repeating groups are a sign of lower normalization level, it's normal for foreign keys to repeat in a reasonable schema. That is, I think there is a good possibility that you need only one table of key/value pairs, with a column that identifies client instance.
Keep in mind that flat files have allocation unit overhead as well. Watch what happens when I create a one byte file:
$ cat > /tmp/one
$ ls -l /tmp/one
-rw-r--r-- 1 ross ross 1 2009-10-11 13:18 /tmp/one
$ du -h /tmp/one
4.0K /tmp/one
$
According to ls(1) it's one byte, according to du(1) it's 4K.
Don't make a table per item. That's just wrong; it's similar to writing a class per item in your program. Make one table for all items, or perhaps store the common parts of all items in one table, with other tables referencing it for auxiliary information. Do yourself a favor and read up on database normalization rules.
In general, the tables in your database should be fixed, in the same way that the classes in your C++ program are fixed.
Why not just store a foreign key to the items table?
CREATE TABLE ItemsVK (ID INTEGER PRIMARY KEY, ItemID INTEGER, Key TEXT, Value TEXT)
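A quick sqlite3 sketch of that single shared table replacing the per-item tables (the ItemsVK name comes from the answer above; the sample rows are the question's):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, details)")
# One key/value table for all items, instead of one table per item:
conn.execute(
    "CREATE TABLE ItemsVK (ID INTEGER PRIMARY KEY, ItemID INTEGER, Key TEXT, Value)"
)
conn.execute("INSERT INTO items VALUES (1, 'test')")
conn.executemany(
    "INSERT INTO ItemsVK (ItemID, Key, Value) VALUES (?, ?, ?)",
    [(1, "abc", "hello"), (1, "def", "world"), (1, "ghi", 90001)],
)

# All pairs for one item -- the single query that replaces a per-item table:
pairs = dict(conn.execute("SELECT Key, Value FROM ItemsVK WHERE ItemID = ?", (1,)))
```

This keeps the schema fixed regardless of how many items exist, avoiding the per-table kilobyte overhead entirely.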
If it's just serialization, i.e. a one-shot save to disk and then a one-shot restore from disk, you could use JSON (there are lists of recommended C++ libraries for it).
Just serialize a datastructure:
[
{'id':1,'details':'test','items':{'abc':'hello','def':'world','ghi':'90001'}},
...
]
If you want to save some bytes, you can omit the id, details, and items keys and save a list instead, in case that's a bottleneck:
[
[1,'test', {'abc':'hello','def':'world','ghi':'90001'}],
...
]
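In Python the round trip for that compact layout is one call each way with the standard json module (shown here instead of a C++ library, as a sketch of the same idea):

```python
import json

# The compact list layout from above: [id, details, {key: value, ...}]
items = [
    [1, "test", {"abc": "hello", "def": "world", "ghi": 90001}],
]

blob = json.dumps(items)      # one-shot save: write this string to a file
restored = json.loads(blob)   # one-shot restore: read it back
```

Lists, dicts, strings, and integers all survive the round trip unchanged, which is exactly the structure the key/value tables were modeling.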