How do you model a relationship that is dependent on a value? - foreign-keys

How do you model the situation where the (foreign) table to which you relate is dependent on a value in the (primary) table?
This is the case:
I have a table of Risks, and each Risk has a TreatmentType.
If the TreatmentType is 'Accept', the Risk must be linked to an item in the Persons table.
If the TreatmentType is 'Mitigate', the Risk must be linked to 1 or more items in the Controls table.
If the TreatmentType is 'Transfer', the Risk must be linked to an item in the Departments table.
I could add different FK-fields for Person, Control and Department, and implement a rule to fill only one of these dependent on the value of TreatmentType, but it seems kind of sloppy.
It's been a while since I made a design for a database, what is good practice for this?

Related

GSI vs redundancy dynamoDB

I have this scenario:
I have to save in dynamoDB table a lot of shops. Every shop has a ID string and its PK.
Every shop has a field "category" that is a string that indicates its category (food,tatoo ...).
So far everything is ok.
I have this use-case: "given in a category get all the stores of that category".
To accomplish this, two options came to my mind:
create a GSI that has like PK the "category id" and like field "shop ID".
In this way with the id of the category I get all the IDs of the stores of that category and then for each store id I query the main table to get all the info of each single store (name, address, etc.).
I create in the main table a PK called type "category_$id" (where $id is the category id) and as field the id of the store. This, as in the case of GSI, given a category ID, I have the set of IDs of the shops and then for each ID I execute the query on the same table to get all the info of that shop.
I wanted to know what the difference between these two options is in terms of cost / benefit and which is the best.
They seem to me substantially the same thing (the only difference is that the first uses another table, i.e. the index, while the second uses the same table), but I await the opinion of someone more experienced than me
One benefit of GSI is that it will result in less management. Lets say you delete/add a record from/to a main table. This will be automatically be reflected in your GSI.
In contrast, if you have two independent tables, you have to manage the synchronization between them yourself.

Database Design: Same entity different relations

I'm working on an e-commerce platform build using Django and Postgresql. In my platform I have an entity called "Category" and an entity called "Attribute". A category can have multiple attributes, but an attribute can only belong to one category.
For example, the "digital camera" category can have lens, image quality, etc. as its attributes.
I have come to realize that some attributes can belong to all categories. For example: packaging, shipping, customer service, etc.
In addition to these general attributes, in the future I may end up having attributes that belong to multiple categories. For example battery can belong to all categories in the Electronics department (categories belong to a department). But I'm not sure if this is a good idea as it may make things unnecessarily complex.
What's the best way to approach this? Please note that I need to be able to query the general attributes often. I have thought of the following solutions:
Make a default category and assign those attributes to that category. In the code write a special logic that will always look at these categories.
Allow a nullable foreign key in the attributes table. So an attribute can belong to no specific category indicating that it belongs to all categories.
Make another table for general attributes.
Store the category-attribute relationship in a third table. But then my question is how can I query for attributes that don't belong to any specific category?
I appreciate your help in advance.
UPDATE:
Not sure if this is the best solution, but I took the easy route and ended up making nullable m2m relationship between attribute and the other models. Thank you all for your help.

Postgresql + Django: What is better between same strings or a many to many relationship?

I'm currently designing my data base using postgresql with Django and I was wondering: What is best practice - having several instances of the same model with the same value or a many to many relation ship?
Let me elaborate. Let's say I'm designing a store. The store sells items. Items can have one or many statuses (e.g. ordered, shipped, delivered, paid, pre-ordered etc.).
What would be a better practice:
Relating the items to their status via a many-to-many relationship, which will lead to one status having hundreds of thousand and later millions of relations? Will so many relations become problematic?
Or is it better for each item to have a foreignkey to their statuses? So that each status only has one item. And if I would like to query all the items that have the same status (e.g. shipped), I would have to iterate over all statuses with a common name.
What would be better, especially for the long term?
I would recommend going with a many-to-many relationship.
Hundreds of thousands or even millions of relations should not be a problem. The many-to-many relationship is stored as a table with id, item_id, status_id. SQL will be performant at querying the table either by status_id or item_id even if the table gets big. This is exactly the kind of thing it was built to handle.
Let me elaborate. Let's say I'm designing a store. The store sells
items. Items can have one or many statuses (e.g. ordered, shipped,
delivered, paid, pre-ordered etc.).
If many people will have this many itens you should use manytomany relations, better let django handle with this "third table", since this table just hold ids you can interate over them using reverse lookup, i do prefer using many to many instad of simple foreignkeys.
In your case, who you will handle when your User will hold many itens? like what if my User buy one potato and 2 bananas? you will duplicate the tuple in your User Table to tell "here he have the potato and in this second one he have the banana"? so you will be slave of Disctinct attribute while you still dirtying your main table User
...
class Item(models.Model):
...
class User(models.Model):
items = models.ManyToMany(Item)
So when i query my Item and my User will only bring attributes related to them... while if you use item inside of User Model you will have multiple instances of same user.
So instead of use User.items.all() you will use User.objects.filter(id=id)and them items = [user.item for user in User.objects.filter(id=id)]
Look how complex this get and makeing your database so dirty

Creating User Generated Models and SQL Tables in Ruby on Rails 4

Considering scenario of an Inventory Management System. Inventory has many types of items, each with own table and columns. One, two or Twelve tables are not sufficient to describe the plethora of the TYPES of items as they are extremely varying. e.g. some attributes of a family of items like BIKES do not have the same attributes of CARS. It is tedious for developer to take into account the thousands of the TYPES of items and incorporate them into each model manually.
Is there a way for users to generate models themselves? thereby generating own SQL tables etc... Is there another approach to this problem? (Maybe using Semantic Web Technologies)
Coming from Spring Framework, I am fairly new to RoR development.
Thanks in Advance.
I'm not an expert, but you could do it with regular, pre-defined models.
Item_Type
Item_Attribute
Item
Item_Type would have a name variable (not unique), and perhaps any other common attributes you'd want. It would then have a has_many Item_Attributes relationship, whereas Item_Attribute belongs_to Item_Type.
So I'd make a view that allows the user to add new Item_Types and then define Item_Attributes for those item types.
Then you could have the actual Item model, each instance of which is the existence of an Item_Type in the inventory. Item belongs to Item_Type, and Item_Type has_many Items, and Item cannot have a null Item_Type.
So a user creates a new Item_Type with the name "BIKE", then adds several Item_Attributes to it, such as "Mountain" and "Red". Then the user can create a new Item that has a relationship to the "BIKE" Item_Type.
If they wanted to add a blue mountain bike instead of a red one, they would need to go through the process again, adding another Item_Type of "BIKE" except adding "Blue" as an attribute for the new instance of Item_Type's Item_Attributes.

Can you move compound keys and/or foreign keys to other tables when normalizing to 3NF (third normal form)

My database design is currently at 3NF. The issue is foreign keys and in some cases compound keys.
Can you move compound keys and/or foreign keys to create other tables provided the attributes associated with the compound/foreign keys do not rely on the primary key?
I suspect the answer is yes due to this link:
Are Foreign Keys included in Third Normal Form?
Best Answer: Just because it's a foreign key doesn't mean it also can't be considered an attribute of the primary key. The fact that it's a foreign key to begin with implies it's defining a relationship with another table, and thus would not violate [...] 3NF.
-- TheMadProfessor
https://answers.yahoo.com/question/index?qid=20081117095121AAXWBbX#
This leads me to wonder whether my current normalization stage is 3NF.
Preamble
In pure relational database theory, there is nothing to stop you having composite primary keys (PKs), and you can have foreign keys (FKs) that reference them and those FKs are necessarily composite too. Some software has difficulty with composite keys, so you often find that people add an ID column which contains an automatically generated number, which is then designated as the PK of the table. Other tables can then have (simple) FKs that reference the (simple) ID column. One not uncommon mistake is to forget that the original composite PK is still a candidate key (CK), and its uniqueness should be enforced by the DBMS with a unique constraint on the table; it becomes an alternative key (AK).
Diversion
The system of CKs, AKs and PKs works like this:
Every CK is a set of (one or more) columns that is a unique identifier for the data in the rest of each row of data in a table.
One CK may be designated as the PK.
The other CKs become AKs.
Consider this table:
CREATE TABLE elements
(
atomic_number INTEGER NOT NULL PRIMARY KEY
CHECK (atomic_number > 0 AND atomic_number < 120),
symbol CHAR(3) NOT NULL UNIQUE,
name CHAR(20) NOT NULL UNIQUE,
atomic_weight DECIMAL(8,4) NOT NULL,
period SMALLINT NOT NULL
CHECK (period BETWEEN 1 AND 7),
group CHAR(2) NOT NULL
-- 'L' for Lanthanoids, 'A' for Actinoids
CHECK (group IN ('1', '2', 'L', 'A', '3', '4', '5', '6',
'7', '8', '9', '10', '11', '12', '13',
'14', '15', '16', '17', '18')),
stable CHAR(1) DEFAULT 'Y' NOT NULL
CHECK (stable IN ('Y', 'N'))
);
Each of atomic_number, symbol and name is a candidate key. For chemistry, the symbol is most convenient as the primary key; for physics, the atomic_number is most convenient. The tables related to isotopes etc reference the atomic_number column, but the tables related to chemical compounds reference the symbol column. The three CKs here are all simple; on the other hand, the isotopes table has a compound PK consisting of the atomic number of the element (the number of protons) and the number of neutrons.
Answer
Getting back to your question, your data may well be in 3NF, or more likely BCNF (which is formally stronger than 3NF).
You'd have to show us your table schemas and specify the constraints (functional dependencies, etc) that apply to the columns before we could assess your design. But there is nothing that you've described which, a priori, prevents it from being well normalized.
I'm not sure that I understand your question. If you are asking, "Can you have a foreign key in a table without violating 3NF?" the answer is absolutely, positively yes. Nothing about any stage of normalization says that you should eliminate foreign keys. Indeed, it's pretty much impossible to normalize all but the most trivial data without using foreign keys.
** Update **
Okay, maybe now I understand your question, but then I think you've answered it for yourself. Yes, in a fully-normalized DB, you should not have non-key dependencies. If you have FKs that are not dependent on the PK, then they should be moved to another table.
To make a simple example, suppose you want to keep track of people, the city they live in, and the country that that city is in. So for your first draft, you make this structure: (Asterisk marks the PK.)
Person (person_id*, person_name, city_id, country_id)
City (city_id*, city_name)
Country (country_id*, country_name)
This is not normalized. A city is in the same country regardless of what resident of that city we are talking about. Paris is not in France when we are talking about Pierre but in Germany when we are talking about Francois. (If there are two cities with the same name, of course those are different cities and should have different records. I suppose a city could cross national boundaries, but for our purposes here let's assume that if it does, we consider it two cities that happen to touch. They would surely have different city governments, different postal systems, etc.) So we have a non-key dependency. country_id depends on city_id, not on person_id.
So to normalize this schema, we should move the country_id to a table where it is dependent solely on the PK. Presumably, the City table:
Person (person_id*, person_name, city_id)
City (city_id*, city_name, country_id)
Country (country_id*, country_name)
Do you understand what a FD (functional dependency) is? A FD is an expression with an arrow between two attribute sets. Given a table and a FD, saying that the FD holds in the table or the table has the FD or the table satisfies the FD says all subrows for the first attribute set appear with the same subrow for the second attribute set.
Do you understand that normalization involves CKs (candidate keys)? Independent of normalization we can call one CK a PK (primary key) and the others AKs (alternate keys).
Do you understand what a FK (foreign key) is? Given a database, a table and an attribute list, saying that the list is a FK referencing some attribute list in some table says that every list of values for the attributes in the first table is also a list of of values for the attributes in the second table where those attributes form a CK (candidate key).
Normalization uses FDs & JDs (join dependencies) to replace a table by projections of it that join back to it. FKs (foreign keys) are irrelevant to normalization. That's both to whether a table is in a given NF (normal form) and to decomposing to a given NF. The answer you link to is also saying that FKs are irrelevant to whether a table is in 3NF--first for a specific case, then for the general case.
Because components share column values from an original table, some FKs will arise from normalization. Just as various tables and their PKs, AKs, CKs, superkeys, FDs & JDs will. That is normalization output, not input.
Since normalization replaces tables by projections of them, column sets in common among the original & components must contain the same subrow values. That's an EQD (equality dependency). Subrow values that form CKs will thus have FKs to them arise from normalization.
But often during normalization we see that we want to replace a table by some that are projections with others that have at least the subrows of a projection. Common subrows in a projection must then appear in expanded components, but not vice versa. That's an IND (inclusion dependency). There will still be a FK when common columns form a CK in a table that is a superset of a projection. Such a design change isn't normalization, it is just a change that you noticed during normalization from a design that was wrong in not allowing all the possible business situation to be recorded to a design that doesn't have that problem.
"Compound" vs "simple" vs "empty" apply to FKs, PKs, AKs, CKs (candidate keys), & superkeys and are relevant to FD & JD parts. Definitions will specify compound/simple/empty when it matters. (Eg: Sometimes FDs are put into canonical forms involving single columns. Sometimes we can easily infer NFs hold partly based on whether CKs are simple.)
After you get your tables, declare (sufficient) constraints. Then the DBMS can enforce them. FKs, PKs, AKs, CKs, superkeys, FDs & JDs all have associated constraints. SQL lets you declare all PKs & AKs (via PRIMARY KEY & UNIQUE NOT NULL). Those declarations actually declare superkeys which happen to be CKs/PKs/AKs when no smaller ones are declared within them. Similarly SQL FOREIGN KEY declares a foreign superkey that is a FK if it is actually to a CK. Declare sufficient chains of FKs to enforce the ones you don't declare. (Via transitivity.) SQL DBMSs typically won't let you declare FK cycles. SQL also makes you declare superkeys referenced by FK declarations whether or not those columns contain a smaller declared superkey/CK and so must be a superkey. Declare or enforce via triggers any FD or JD constraints that aren't implied by CK constraints. (5NF gets rid of all such constraints except some cycles of FDs on CKs.)
Find academic textbook definitions and algorithms.