C++: Generic interface design for a database

I have a class which is used to create a connection with the database:
class DBHandler
{
public:
    DBHandler();
    ~DBHandler();
    int connect();
    int execQuery(string query);
    string getField(int row, int col);
};
Now there is another class which is used to fetch some info from the database:
class DBManager
{
public:
    DBManager();
    ~DBManager();

    // Approach 1
    string getUsername()
    {
        // create a query here and use an object of the DBHandler class to execute it.
    }

    // Approach 2
    string getUsername(struct QueryDetails& qDetails)
    {
        // create a query using the fields of the structure and execute it using the DBHandler class.
    }
};
Now here is the problem:
1) Which approach should I follow?
A) If I use approach 1, then I need to hard-code the query.
B) If I use approach 2, then I need to fill the structure each time before calling getUsername.
2) Is there any better solution, besides these two, that would be generic?
PS: Definition of the structure:
struct QueryDetails
{
    string tableName;
    vector<string> colList;
    ...
};

Your question is very broad, and the elements you give do not permit an objective best answer.
Your approach 1 has the following advantages:
it is a robust and secure approach: the queries are written with knowledge of the relevant object
if the database evolves, it's easy to find out (by text search) where specific queries are made for the tables, and to update the querying code for your object
if your object evolves, needless to say, you'll immediately realise what you have to change on the database side
The main inconvenience is that you're tightly coupled to the database. If tomorrow you change from Postgres to something else, you have to rewrite every query.
Your approach 2 has the following advantages:
it is very flexible
if your database changes, you only have to change the generic functions
The inconvenience is that this flexibility bears a lot of risk for maintenance: you can't be sure that the correct query is sent by the client, and the impact of database layout changes is very difficult to assess.
So finally, it's up to you to decide which one fits your needs better.
I'd personally tend to favour 1. But this is subjective, and I'd in any case introduce an additional layer to make the application code more independent of the database system that implements access to the data (see the sketch below).
However, depending on your needs, greater flexibility could be an advantage. For instance, if your class is in fact meant to be a middle layer for other classes to fetch their own data, then approach 2 could be the best option.
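To make that additional layer concrete, here is a minimal sketch, assuming approach 1 for the queries themselves; the interface name IDataAccess, the users table and the stub bodies are illustrative assumptions, not code from the question:

#include <string>

// Hypothetical abstraction layer: DBManager depends only on this
// interface, never on a concrete database handler.
class IDataAccess
{
public:
    virtual ~IDataAccess() = default;
    virtual int connect() = 0;
    virtual int execQuery(const std::string& query) = 0;
    virtual std::string getField(int row, int col) = 0;
};

// The existing DBHandler would implement the interface; stub bodies
// stand in for the real driver calls.
class DBHandler : public IDataAccess
{
public:
    int connect() override { return 0; }
    int execQuery(const std::string& query) override { (void)query; return 0; }
    std::string getField(int row, int col) override { (void)row; (void)col; return {}; }
};

class DBManager
{
public:
    explicit DBManager(IDataAccess& db) : db_(db) {}

    std::string getUsername(int userId)
    {
        // The query text still lives here (approach 1), but swapping
        // Postgres for another backend only means writing a new
        // IDataAccess implementation, not touching this code.
        db_.execQuery("SELECT name FROM users WHERE id = "
                      + std::to_string(userId));
        return db_.getField(0, 0);
    }

private:
    IDataAccess& db_;
};

The point is that only the IDataAccess implementations know which database sits behind the interface, so the hard-coded queries of approach 1 stop being a portability problem.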

Related

What are the trade-offs in Cloud Datastore for list property vs multiple properties vs ancestor key?

My application has models such as the following:
class Employee:
    name = attr.ib(str)
    department = attr.ib(int)
    organization_unit = attr.ib(int)
    pay_class = attr.ib(int)
    cost_center = attr.ib(int)
It works okay, but I'd like to refactor my application to more of a microkernel (plugin) pattern, where there is a core Employee model that might just have the name, and plugins can add other properties. I imagine one possible solution might be:
class Employee:
    name = attr.ib(str)
    labels = attr.ib(list)
An employee might look like this:
Employee(
    name='John Doe',
    labels=['department:123',
            'organization_unit:456',
            'pay_class:789',
            'cost_center:012']
)
Perhaps another solution would be to just create an entity for each "label" with the core employee as the ancestor key. One concern with this solution is that currently writes to an entity group are limited to 1 per second, although that limitation will go away (hopefully soon) once Google upgrades existing Datastores to the new "Cloud Firestore in Datastore mode":
https://cloud.google.com/datastore/docs/firestore-or-datastore#in_native_mode
I suppose an application-level trade-off between the list property and ancestor keys approaches is that the list approach more tightly couples plugins with the core, whereas the ancestor key has a somewhat more decoupled data scheme (though not entirely).
Are there any other trade-offs I should be concerned with, performance or otherwise?
Personally I would go with multiple properties, for many reasons, but it's possible to mix all of these solutions for varying degrees of flexibility as required by the app. The main trade-offs are:
a) You can't do joins in Datastore, so storing related data in multiple entities will prevent querying with complex where clauses (ancestor key approach).
b) You can't do range queries if you turn numeric and date fields into labels (list property approach).
c) The index could be large and expensive if you index your labels field and only a small set of the labels actually needs to be indexed.
So, one way to think about mixing all three is:
a) For your static data and application logic, use multiple properties.
b) For dynamic data that is not going to be used for querying, you can use a list of labels.
c) For pluggable data that a plugin needs to query on but doesn't need to join with the static data, you can create another entity that again uses a) and b), so the plugin stores all related data together.

Doctrine swap out table at runtime

Typically when you implement an entity using Doctrine you map it to a table explicitly:
<?php
/**
 * @Entity
 * @Table(name="message")
 */
class Message
{
    //...
}
Or you rely on Doctrine to implicitly map your class name to a table. I have several tables which are identical in schema, but I do not wish to re-create the class each time; therefore at runtime (dynamically) I would like to change the table name accordingly.
Where do I start, or what would I look into overriding, to implement this odd requirement?
Surprisingly (to me), the solution is very simple. All you have to do is get the ClassMetadata of your entity and change the name of the table it maps to:
/** @var EntityManager $em */
$class = $em->getClassMetadata('Message');
$class->setPrimaryTable(['name' => 'message_23']);
You need to be careful not to change the table name after you have loaded some entities of type Message and changed them. There is a big chance it will either produce SQL errors on saving (because of table constraints, for example), if you are lucky, or it will modify the wrong row (in the new table).
I suggest the following workflow:
set the desired table name;
load some entities;
modify them at will;
save them;
detach them from the entity manager (the method EntityManager::clear() is a quick way to start over);
go back to step 1 (i.e. repeat using another table).
Step 5 (detaching the entities from the entity manager) is useful even if you don't change or don't save the entities. It allows the entity manager to use less memory and work faster.
This is just one of many methods you can use to dynamically set or change the mapping. Take a look at the documentation of the ClassMetadata class for the rest of them. You can find more inspiration in the documentation page on PHP mapping.

Design of a class hierarchy for generating a PDF

I basically have to make a program that will generate a PDF. The PDF will have 3 different page types: a front cover sheet, a general sheet, and a last cover sheet. The header contains a fair amount of information, but the headers of the front cover sheet and the general sheet only differ by 1 item; however, that one item requires me to shift the others down in coordinates. I have a picture attached to show what I mean.
Also, the only purpose the classes are really serving is holding values to represent rectangles that will be used as targets to print text in the PDF. So they really have no need of any functionality aside from the constructor, which only initializes the values from a file of constants.
I am trying to use "good" design practice, but I am uncertain what a more efficient method is. It doesn't seem like I can use inheritance that shares the common elements, as I will always end up with something I don't need in one of the classes. I thought about just using composition and making a class for each element in the header; that would solve the problem, but then I would have a lot more classes, each holding just one data member, which doesn't seem efficient.
So I would just appreciate any suggestions on how to make this a more cohesive and sensible design.
The picture is NOT what I have currently, but it is to show that the data I need seems to be coupled awkwardly, or perhaps I am just over-complicating this.
The front sheet, general sheets and back sheet have in common that they ARE sheets. A good candidate for your class hierarchy would therefore be:
class sheet { ... };
class front_sheet : public sheet { ... };
class back_sheet : public sheet { ... };
class general_sheet : public sheet { ... };
In sheet, you should put all the common elements, and of course the common behaviour (e.g. print(), save(), get_size(), ...).
There should be a member function that calculates the position of an element on a page. As the rule depends on the kind of page, it would be a virtual function of sheet, and the front and back sheets could override the default implementation. This approach will help you easily manage the different layouts of the different pages.
class sheet {
public:
    virtual void get_position(int item_no, int& x, int& y) { x = 5; y = 14 * item_no; }
    ...
};
class back_sheet : public sheet {
public:
    void get_position(int item_no, int& x, int& y) override { x = 5; y = (item_no == 5 ? 14 : 0); }
    ...
};
As inheritance really corresponds to an "is-a" relationship, you'll get a very robust design.
You should however think about the content of your sheets. I see here two main orientations:
you could manage items in a container (e.g. a vector): it's easier to organise your output in loops than to copy-paste similar lines of code for every item
you should ask yourself whether the items could have subitems (e.g. a region could contain several subregions, like a print layout with a box and boxes inside that box). In this case, I'd recommend the use of the composite pattern (see the sketch right after this list).
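A minimal sketch of that composite idea, with illustrative names (item, box) that are assumptions rather than code from the question:

#include <memory>
#include <vector>

// Leaf element of a sheet; by default it occupies one line of text.
class item {
public:
    virtual ~item() = default;
    virtual int height() const { return 14; }
};

// A box is itself an item and can hold other items, including other
// boxes, so regions can nest arbitrarily deep.
class box : public item {
public:
    void add(std::unique_ptr<item> child) { children_.push_back(std::move(child)); }

    int height() const override {
        int h = 0;
        for (const auto& c : children_) h += c->height();  // sum of children
        return h;
    }

private:
    std::vector<std::unique_ptr<item>> children_;
};

The layout code can then treat a single rectangle and a whole nested region uniformly, which is exactly what the composite pattern buys you here.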
Edit: After having thought about the content, it could be worth coming back to the sheets and asking yourself how different their behaviour really is:
do they have different behaviour throughout their lifecycle (different ways to acquire data for PDF generation, or to use the layout dynamically)? In that case the class hierarchy would be fine.
or, after having adopted a dynamic structure such as the one suggested above, does it turn out that the only difference is the way you create/construct them? In that case, it could be worth considering keeping only one sheet class after all, but having three construction "engines", using either the prototype pattern or the builder pattern (sketched below). This would however be a significant shift in your application's architecture.
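Should the builder route be chosen, it could look roughly like the sketch below, under the assumption that the layout differences are confined to construction; all names here are illustrative:

// One sheet class; the three page types differ only in how they
// are built, not in how they behave afterwards.
struct sheet {
    // common data: rectangles, header items, ...
};

class sheet_builder {
public:
    virtual ~sheet_builder() = default;
    virtual sheet build() const = 0;
};

class front_sheet_builder : public sheet_builder {
public:
    sheet build() const override {
        sheet s;
        // place the extra header item and shift the others down here
        return s;
    }
};

class general_sheet_builder : public sheet_builder {
public:
    sheet build() const override {
        sheet s;
        // standard header layout
        return s;
    }
};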

C++/OOP Associations model and database

Edit: I called it an association because in my head it should be one, but it seems that I implemented it as an aggregation... This can also be discussed...
Context
What you learn in IT is that when you model associations, you should use pointers.
For example: you have 2 classes, Person and Movie. One Person can do several Movies, and one Movie can be done by/with several Persons. You would then have:
#include <vector>
using std::vector;

class Movie;  // forward declaration: Person refers to Movie*

class Person
{
public:
    Person() : id(0) {}
    int id;
    vector<Movie*> movies;
};

class Movie
{
public:
    Movie() : id(0) {}
    int id;
};

int main()
{
    Person person;
    Movie *movie = new Movie;
    person.movies.push_back(movie); // With a setter it would be better but...
}
or something like this (please correct me if I'm doing something wrong =D)
Where the troubles appear
Now you have many persons and movies, and you want to save them somewhere: in a database.
You get your person.
You get all the movies it is associated with, in order to construct the whole object.
But how do you get them?
Do you reconstruct a new Movie pointer for each Person concerned?
You then lose the association property that allows the objects to be linked but live their own lives.
Do you load the whole database into RAM and... ok, forget this.
What is the way to do it cleverly? What is the proper way given by the documentation?
I'm interested in simplified/pseudo code as examples, a dissertation... Thanks a lot!
Your question is very broad, and there are a number of approaches to binding database tables (and representing their foreign key connections).
It's not really only about how to represent/handle the kind of domain model snippet you're presenting in your code sample here.
Martin Fowler provided the EAA pattern catalogue, which you could reasonably research, and apply the appropriate patterns for the kind of object <-> relational mapping problems you address.
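To give one concrete example from that catalogue: the Identity Map pattern directly addresses the "do I reconstruct a new Movie pointer each time?" concern, by keeping a single registry of already-loaded objects keyed by primary key. A minimal sketch, where loadMovieRow is a stub standing in for the real database read:

#include <map>
#include <memory>

class Movie {
public:
    int id = 0;
};

// Stub standing in for the actual SQL that reads one row by primary key.
Movie loadMovieRow(int id) { Movie m; m.id = id; return m; }

// Identity map: each primary key maps to exactly one in-memory object,
// so two Persons linked to movie 42 share the same Movie instance
// instead of each owning a freshly reconstructed pointer.
class MovieRepository {
public:
    std::shared_ptr<Movie> find(int id) {
        auto it = cache_.find(id);
        if (it != cache_.end())
            return it->second;  // already loaded: reuse the same object
        auto movie = std::make_shared<Movie>(loadMovieRow(id));
        cache_.emplace(id, movie);
        return movie;
    }

private:
    std::map<int, std::shared_ptr<Movie>> cache_;
};

With this in place, the associated objects stay linked yet live their own lives, which is exactly the property the question was worried about losing.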

Unit Testing & Primary Keys

I am new to Unit Testing and think I might have dug myself into a corner.
In your unit tests, what is the best way to handle primary keys?
Hopefully an example will give some context. Say I create several instances of an object (let's say Person).
My unit test is to check that the correct relationships are being created.
My code creates Homer, his children Bart and Lisa. He also has friends Barney, Karl & Lenny.
I've separated my data layer with an interface. My preference is to keep the primary key simple, e.g. on Save, Person.PersonID = new Random().Next(10000); rather than, say, Barney.PersonID = 9110, Homer.PersonID = 3243, etc.
It doesn't matter what the primary key is, it just needs to be unique.
Any thoughts?
EDIT:
Sorry, I haven't made it clear. My project is set up to use dependency injection. The data layer is totally separate. The focus of my question is: what is practical?
I have a class called "Unique" which produces unique objects (strings, integers, etc.). It makes sure they're unique per test by keeping an internal static counter. That counter value is incremented per key generated and included in the key somehow.
So when I'm setting up my test:
var foo = new Foo {
    ID = Unique.Integer()
};
I like this, as it communicates that the value is not important for this test, just the uniqueness.
I have a similar class 'Some' that does not guarantee uniqueness. I use it when I need an arbitrary value for a test. It's useful for enums and entity objects.
None of these are threadsafe or anything like that; it's strictly test code.
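The answer describes the helper without showing it; the idea is language-agnostic, so here is a purely illustrative C++ rendering (the original is C#, and these names are assumptions):

#include <string>

// A static counter makes every generated key unique within a test run.
class Unique {
public:
    static int Integer() { return ++counter; }

    static std::string String(const std::string& prefix) {
        return prefix + "-" + std::to_string(++counter);
    }

private:
    // Plain int on purpose: as noted above, this is strictly
    // single-threaded test code, so no thread safety is attempted.
    static inline int counter = 0;
};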
There are several possible corners you may have dug yourself into that could ultimately lead to the question that you're asking.
Maybe you're worried about re-using primary keys and overwriting or incorrectly loading data that's already in the database (say, if you're testing against a dev database as opposed to a clean test database). In that case, I'd recommend you set up your unit tests to create their records' PKs using whatever sequence a normal application would, or run them against a clean, dedicated testing database.
Maybe you're concerned about the efficacy of your code with PKs beyond a simple 1, 2, 3. Rest assured, this isn't something one would typically test for in a straightforward application, because most of it is outside your application's concern: generating a number from a sequence is the DB vendor's problem, and keeping track of a number in memory is the runtime/VM's problem.
Maybe you're just trying to learn what the best practice is for this sort of thing. I would suggest you set up the database by inserting records before executing your test cases, using the same facilities that your application itself will use to insert records; presumably your application code will rely on a database-vended sequence number for PKs, and if so, use that. Finally, after your test cases have executed, your tests should roll back any changes they made to the database, to ensure the test is idempotent over multiple executions. This is my sorry attempt at describing a design pattern called test fixtures.
Consider using GUIDs. They're unique across space and time, meaning that even if two different computers generated them at the exact same instant, they would be different. In other words, they're practically guaranteed to be unique. Random numbers are never good; there is a considerable risk of collision.
You can generate a Guid using the static class and method:
Guid.NewGuid();
Assuming this is C#.
Edit:
Another thing: if you just want to generate a lot of test data without having to code it by hand or write a bunch of for loops, look into NBuilder. It might be a bit tough to get started with (fluent methods with method chaining aren't always better for readability), but it's a great way to create a huge amount of test data.
Why use random numbers? Does the numeric value of the key matter? I would just use a sequence in the database and call nextval.
The essential problem with database unit testing is that primary keys do not get reused. Rather, the database creates a new key each time you create a new record, even if you delete the record with the original key.
There are two basic ways to deal with this:
Read the generated primary key from the database and use it in your tests, or
Use a fresh copy of the database each time you test.
You could put each test in a transaction and roll the transaction back when the test completes, but rolling back transactions doesn't always play well with primary keys; the database engine will still not reuse keys that have been generated once (in SQL Server, anyway).
When a test executes against a database through another piece of code, it ceases to be a unit test. It is called an "integration test", because you are testing the interactions of different pieces of code and how they "integrate" together. Not that it really matters, but it's fun to know.
When you perform a test, the following things should occur:
Begin a db transaction
Insert known (possibly bogus) test items/entities
Call the (one and only one) function to be tested
Test the results
Rollback the transaction
These things should happen for each and every test. With NUnit, you can get away with writing steps 1 and 5 just once in a base class and then inheriting from that in each test class; NUnit will execute SetUp- and TearDown-decorated methods in a base class. (A sketch of this fixture shape appears at the end of this answer.)
In step 2, if you're using SQL, you'll have to write your queries such that they return the PK numbers back to your test code.
INSERT INTO Person(FirstName, LastName)
VALUES ('Fred', 'Flintstone');
SELECT SCOPE_IDENTITY(); -- SQL Server example; other db vendors vary on this.
Then you can do this:
INSERT INTO Person(FirstName, LastName, SpouseId)
VALUES ('Wilma', 'Flintstone', @husbandId);
SET @wifeId = SCOPE_IDENTITY();
UPDATE Person SET SpouseId = @wifeId
WHERE Person.Id = @husbandId;
SELECT @wifeId;
or whatever else you need.
In step 4, if you use SQL, you have to re-SELECT your data and test the values returned.
Steps 2 and 4 are less complicated if you are lucky enough to be able to use a decent ORM like (N)Hibernate (or whatever).
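The NUnit mechanics above are C#-specific, but the begin/rollback base-class shape is the same in most stacks. Purely as an illustration, a minimal GoogleTest (C++) sketch, in which Db and its begin/rollback methods are hypothetical stand-ins for the real data layer:

#include <gtest/gtest.h>

// Hypothetical database handle; the bodies would issue the real
// BEGIN TRANSACTION / ROLLBACK statements.
struct Db {
    void begin() { /* BEGIN TRANSACTION */ }
    void rollback() { /* ROLLBACK */ }
};

// Fixture mirroring steps 1 and 5 of the list above: every test runs
// inside a transaction that is rolled back afterwards.
class DbTest : public ::testing::Test {
protected:
    void SetUp() override { db.begin(); }       // step 1
    void TearDown() override { db.rollback(); } // step 5
    Db db;
};

TEST_F(DbTest, CreatesSpouseRelationship) {
    // steps 2-4 would go here: insert known rows, call the one
    // function under test, then re-SELECT and assert on the results.
}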