Separating vs merging objects made from database tables in static languages

Separating vs merging objects made from database tables in static languages - c++

Consider that in the database you have a table called users and a table called wallets. Among other things a user has 0, 1 or more wallets. The relation is one to many, meaning that the wallet has a foreign key pointing at the user.
Now the question is the following: When building a struct or a class for a person I see two possibilities:
1) The user has no sign of wallet. There is a function which takes a user as arguments and will fetch an array of the wallets.
2) The user has as a member which is an array containing the wallets and the wallets are fetched when the object / struct is created.
I think that the first approach may be better, since it's more modular - in the second one the users depend on the wallets, even if the user has no wallets.
Still, I am not sure which approach is better so I am looking for a comparison of both approaches.

On the application level you might have a user type like this (Go notation):
type User interface {
Wallets() []Wallet
}
Far beneath, there's a database which in your case is SQL. That should not be obvious by looking at your application, thought.
Making assumptions about dependencies beyond what they guarantee in the form of interface contracts couples the components irreversibly.
That means that if you model your application by your database's schema you're doing it wrong because your entire application is now tightly coupled to said database and any change to any part of it will have a big, unpredictable impact.
A common solution is to use a so called ORM layer, which sits between your database driver and entity models. It will take care of stuff like:
how and when should the wallets be fetched?
where in the database is a wallet's information stored?
when you remove a user, should the wallet also be deleted?
among other things.
PS: This answer applies to both, statically and dynamically typed languages.

Related

Declarative mechanism for Django model rows

With some frequency, I end up with models who contents are approximately constant. For example, I might have a set of plans that users can sign up for, with various attributes -- ID, name, order in the list of plans, cost, whether the purchaser needs to be a student/FOSS project/etc.. I'm going to rarely add/remove/change rows, and when I do there's likely going to be code changes too (eg, to change landing pages), so I'd like the contents of the model (not just the schema) to be managed in the code rather than through the Django admin (and consequently use pull requests to manage them, make sure test deploys are in sync, etc.). I'd also like it to be in the database, though, so I can select them using any column of the model, filter for things like "show me any project owned by a paid account", etc..
What are good ways of handling this?
I think my ideal would be something like "have a list of model instances in my code, and either Django's ORM magically pretends they're actually in the database, or makemigrations makes data migrations for me", but I don't think that exists?
The two workable approaches that come to mind are to either write data migrations by hand or use regular classes (or dicts) and write whatever getters and filters I actually want by hand.
Data migrations give me all the Django ORM functionality I might want, but any time I change a row I need to write a migration by hand, and figuring out the actual state involves either looking in the Django admin or looking through all the data migrations to figure out their combined impact. (Oh, and if somebody accidentally deletes the objects in the database (most like on a test install...) or a migration is screwy recovering will be a mess.)
Just using non-ORM classes in the source is a clearer, more declarative approach, but I need to replacements for many things I might normally do in the ORM (.objects.get(...), __plan__is_student, .values(plan__is_student).annotate(...), etc.).
Which of these approaches is better presumably depends on how much ORM functionality I actually want and how often I expect to be changing things.
Are there other good approaches for this? Am I missing some Django feature (or add-on) that makes this easier?

Service objects with Doctrine2 and Symfony2

I'm working on a Symfony2/Doctrine2 project which handles 2 databases on MSSqlServer.
The first database A_db has a table forms and the second one B_db has people. All my entities are defined with annotations.
I need to get all forms from forms related to people as I will explain in following lines.
I've spent some time reading related answered questions:
Most likely what I need Using EntityManager inside Doctrine 2.0
entities!
Of course, a service! Doctrine2 Best Practice, Should Entities use Services?
Services/Repositories/Whatever Using the Data Mapper Pattern, Should the Entities (Domain Objects) know about the Mapper?
So I decided that a service might be the best way to handle my needs. But it makes no clear to me how actually get this done. I mean, where to put my service class, how to define it in config.yml, how into people entity...
My wish is to set up a full service (assuming its the best implementation) to perform something like:
foreach($onePeple->getForms() as $form) {/* some code with form */}
In case service implementation for doctrine is not the best practice, then what would it be and how can I make it work?

What you're asking there is possible using the entities alone - so long as you define a relationship such as this on the Form entity:
/**
* #OneToMany(targetEntity="Form", mappedBy="User")
*/
protected $Forms;
And on the User entity:
/**
* #ManyToOne(targetEntity="User", inversedBy="Forms")
*/
protected $User;
Then you can simply load a User entity (through a service, a repository or however you wish) and then access all the forms belonging to that user through $userObj->Forms (if using magic __get on your entities, if not then using a getter method, again down to your preference). Forms is an instance of an object implementing the Doctrine\Common\Collections\Collection interface (since it is a to-many relationship) which is iterable using a foreach.
In projects I have worked on using Doctrine2, we typically use our services for getting, saving and deleting entities, along with listing entities of a type (we call them index methods). This way you can tie in extra functionality required in saving, such as updating other entities which are closely associated and so on. These services are tied to the persistence layer and contain a reference to the entity manager, leaving the entities themselves isolated and testable.
We've also had the argument of whether to put this kind of logic in repositories or services, but that is more a case of personal preference and/or project requirements.
One thing to note - services you build are not going to be linked into Doctrine. It is unaware of your service layer (it's just some userland code) so it is up to you to link it in a meaningful and clean manner. Essentially you'd want something where you could pass the entity manager in through the constructor of the service and have some kind of base service class capable of operating on all entities, which could then be extended with special logic on a per-entity basis. Repositories, however, are intrinsically linked into Doctrine, so if that is what you want then it might be the best solution. We tended to use services as they are then purely related to implementing the business rules and so on, leaving the repositories to be used as a component of the persistence layer.

What kind of validations should I use in my db models?

My form validators are pretty good, and if a form passes is_valid, all data should be ok to insert in the db. Should I still validate something on the db model? What else could there be validated on the db side? Because right now, except maybe for uniqueness ( which I can't do from my FormModel ), I can't think of anything else.
EDIT:
I did some work with Rails earlier, and there you would validate a form on the client side, using JS, and on the server side using model validations. I saw in django you can validate on the client side, using JS, and on the server side you have 2 validation checks: forms and models. This is what confused me.

All data should be validated in the database if possible whether you validate from the front end or not. The first validation should be the datatype, for instance using a date datatype will ensure that no nondates can ever get into your database. If you have relationships between tables these absolutely must be enforced at the database level. If the data must be unique, it is irresponsible to not put a unique index on it. If you have a distinct set of values that are the only ones allowed, then put them in a lookup table and add a forign key constraint to that table.
The reason why it is CRITICAL to do validations in the database itself is that the user interface will not be the only thing that interacts with the database (even if you think it will be). Other applications may do so, people will need to make data changes through imports or at a query window (to fix/change large amounts of data such as when client a buys client B and you need to convert all the data to client A). Also if you change the application interface you might lose the some of the critical data integrity checks in the rewrite. Data integrity is one of the most critical factors in database design and maintenance. If you can't count on data integrity, you have no data. I have never seen a database that lets this stuff be handled by the application that didn't lose data integrity over time. Remember the database will far outlast the current application. People will still be looking at this data for years to come. The application typically doesn't consider reporting which is where the data integrity problems tend to come to light. You don't want to have to explain why you have 10,000,000 in orders that you can't identify who they were shipped to, for instance.

If your data has a constraint that's always valid, you should force it in the model/database level (and optionally at the form level). Your DB can be input in multiple ways besides just a form where validation was checked. E.g., someone can go to the django shell to save models directly or someone could create/edit a model in the admin interface or some later designer creates a new form somehow, that doesn't validate correctly.
Granted this is only required if there are additional constraints on the data. Django automatically will validate for things like fields storing proper values, if you are using the correct field types. E.g., IntegerField validates to ensure it contains an integer, EmailField checks that its entered in the form of a valid email address, django.contrib.localflavor.us.models.PhoneNumberField is a US phone number, etc. Note, this only happens if your models have the proper fields (e.g., if you use CharFields for email addresses no validation can be performed.
But there may be other links between data structures, where you should write your own validation. E.g., if all custom orders requiring special instructions (and non-custom orders only sometimes have special instructions), you should check to enforce all custom orders have something in the special instructions field (and maybe have some minimum length).
EDIT: In response to your edit, the reason for three potential validations in django is straightforward -- different validations at different points for different reasons.
Client side (javascript/jquery) validation can't be trusted at all, and should only be given as a convenience for users almost as an afterthought (if you want a spiffy smooth interface). AFAIK, django doesn't have JS validation unless you use an external package like django-ajax-forms or something, but you don't trust that the validation is correct.
Second, there's a difference between form and model validation. One model may have multiple forms for different purposes. For example, you may have a blog with a Comment Model and allow two types of users to comment: signed in users, or anonymous users. The form for anonymous users may require giving a name/email before they comment, while the form for logged in users doesn't need those fields. The signed in user form, when processed in a view may automatically add the correct name and email addresses of the signed in user to the comment model before being saved.
In contrast, model validation always applies and will always be true at the database level, regardless of how they tried saving the data. If you want to make sure some condition always applies make sure it is at the DB level. (And you don't have to write put that validation in at the form level).

Attribute lists or inheritance jungle?

I've got 2 applications (lets call them AppA and AppB) communicating with each other.
AppA is sending objects to AppB.
There could be different objects and AppB does not support every object.
An object could be a Model (think of a game, where models are vehicles, houses, persons etc).
There could be different AppBs. Each supporting another base of objects.
E.g. there could be an AppB which just supports vehicle-models. Another AppB just supports specific airplane-models.
The current case is the following:
There is a BasicModel which has a position and an orientation.
If another user wants extra attributes, he inherits an ExpandedModel. And adds e.g. an attribute Color.
Now every user who needs additional attributes inherits from a more general model. After a while there is a VehicleModel which could activate windshield-wipers, an AircraftModel which could have landing lights or a PersonModel which could wave goodbye when a certain boolean is set to true.
The AppB always needs to be customized if it should support a new Model.
This approach has a big disadvantage: It's getting extremely complex after a few inheritations. Perhaps there are redundancies like an ExpandedAircraftModel which could use windschield-wipers too.
Another approach:
I create just one Model-class which has an attribute-list. The most easy implementation would be a std::map where the Key is the attribute-name and the Value is the attribute-value.
The user could now enter as much information as he wants. If he wants to use a windshieldwiper he just adds a "windshieldwiper - ON"-pair.
If AppB supports windshieldwipers it just looks if there is such an attribute in the list and reads the related value.
A developer of AppB needs to document well what attributes he supports. Every developer has to check if the specific attribute is already existing and how it is called (e.g. one developer could name his attribute windshieldwiper and another calls it windshield-wiper)
This could get extremely complex too and the only thing a user can relate to is the documentation or a specific standard-specification which has to be kept at a central space.
Finally, the question:
Which approach is better?
Did you see any additional disadvantages?
Is there a third approach which should be used instead of these two?

Just for a comparison, Google's Protocol Buffers uses a combination of both but leans hard toward your second example.
If you have distinctly different data that needs to be sent over the channel, you use the tool to generate a derivitive of the "message" class, but each message can contain other messages, and you can nest message definitions in themselves. When a message is sent out, the receiver checks the fields to determine what type of message it is and what fields are contained within.
The downside is that your code becomes overly verbose very quickly, as you can't really use inheritance to automate the process of acting on an incoming message, but the upside is that your protocol messages stay highly organized and easy to DEBUG since you're using a reflexive attribute list of sorts.

Django: django-transmeta - sorting comments

I have created an article site, where articles are published in several languages. I am using transmeta (http://code.google.com/p/django-transmeta/) to support multiple languages in one model.
Also I am using generic comments framework, to make articles commentable. I wonder what will happen if the same article will be commented in one language and then in another. Looks like all comments will be displayed on both variants....
The question actually is:
Is there a possibility to display only comments submitted with current language of the article?

I tried the approach of transmeta for translation of dynamic texts and I had the following experience:
You want another language, you need to change the database model which is generally undesirable
You need every item in both languages, which is not flexible
You have problems linking with other objects (as you point out in your question)
If you take the way of transmeta you will need two solutions:
The transmeta solution for translating fields in a model
For objects connected to a model using transmeta you will need an additional field to determine the language, say CharField with "en", "de", "ru" etc.
These were major drawbacks that made me rethink the approach and switch to another solution: django.contrib.sites. Every model that needs internationalization inherits from a SiteModel:
class SiteModel(models.Model):
site = models.ForeignKey(Site)
Every object that would need transmeta translation is connected to a site. Every connected object can determine its language from the parent object's site attribute.
I basically ran the wikipedia approach and had a Site object for every language on a subdomain (en., de., ru.). For every site I started a server instance that had a custom settings file which would set the SITE_ID and the language of the site. I used django.contrib.sites.managers.CurrentSiteManagerto display only the items in the language of the current site. I also had a manager that would give you objects of every language. I constructed a model that connects objects of the same model from different languages denoting that they are semantically the same (think languages left column on wikipedia). The sites all use the same database and share the same untranslated User model, so users can switch between languages without any problem.
Advantages:
Your database schema doesn't need to change for additional languages
You are flexible: add languages easily, have objects in one language only etc.
Works with (generic) foreign keys, they connect to an object and know what language it is. You can display the comments of an object and they will be in one language. This solves your problem.
Disadvantages:
It's a greater deal to setup: you need a django server instance for every site and some more glue code
If you need e.g an article in different languages, you need another model to connect them
You may not need the django Site model and could implement something that does the same without the need of multiple django server instances.
I don't know what you are trying to build and what I described might not fit to your case, but it worked out perfectly for my project (internationalized community platform built upon pinax: http://www.bpmn-community.org/ ). So if you disclose some more about your project, I might be able to advise an approach.
To finally answer your question: No, the generic comments will not work out of the box with transmeta. As you realised you will have to display comments in both languages for the article that is displayed in one language. Or you will have to hack into the comments and change the model and do other dirty stuff (not recommended). The approach I described works with comments and any other pluggable app.
To answer your questions:
Two Django instances can share one database, no problem there.
If you don't want two Django instances, but one, you will have to do the following: A middleware checks the incoming request, extracts desired language from URL (en.example.com or example.com/en/ etc.) and saves the language preference in the request object. The view will have to take the request object with the language and take care of the filtering of objects accordingly. Since there is no dedicated server for the language (like in the sites approach where the language is stored in the settings.py file), you can only get the language from the request and you will have to pass attributes from the request object to Model managers to filter objects.
You could try to fake a global language state in the django application with an approach like threadlocals middleware, however I don't know if this plays out nicely with django I18N engine (which is also does some thread magic).
If you want to go big with your site in multiple languages, I recommend going for the sites-approach.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js