C++ class design from database schema - c++

I am writing a Perl script to parse a MySQL database schema and create C++ classes when necessary. My question is a pretty easy one, but it is something I haven't really done before and I don't know the common practice. Any object of any of the classes created will need to have "get" methods to populate this information. So my questions are twofold:
Does it make sense to call all of the get methods in the constructor so that the object has data right away? Some classes will have a lot of them, so loading as needed might make sense too. I have two constructors now: one that populates the data and one that does not.
Should I also have another "get" method that retrieves the object's copy of the data rather than the DB copy?
I could go both ways on #1 and am leaning towards yes on #2. Any advice, pointers would be much appreciated.

Usually, the most costly part of an application is round trips to the database, so it would be much more efficient to populate all your data members from a single query than to do them one at a time, either on an as-needed basis or from your constructor. Once you've paid for the round trip, you may as well get your money's worth.
Also, in general, your get* methods should be declared as const, meaning they don't change the underlying object, so having them go out to the database to populate the object would break that (which you could allow by making the member variables mutable, but that would basically defeat the purpose of const).
To break things down into concrete steps, I would recommend:
Have your constructor call a separate init() method that queries the database and populates your object's data members.
Declare your get* methods as const, and just have them return the data members.
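The two steps above can be sketched as follows. This is a minimal illustration, not generated-code output: FakeDb, Row, fetchArtist and the column names are hypothetical stand-ins for whatever MySQL client layer the real classes would use.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for one database row: column name -> value.
using Row = std::map<std::string, std::string>;

// Hypothetical stand-in for a database client. It counts round trips so
// the "one query per object" property can be observed.
struct FakeDb {
    int roundTrips = 0;
    Row fetchArtist(int id) {
        ++roundTrips;
        return Row{{"id", std::to_string(id)},
                   {"name", "Example Artist"},
                   {"genre", "Pop"}};
    }
};

class Artist {
public:
    // The constructor delegates to init(), which performs the single query.
    Artist(FakeDb& db, int id) : id_(id) { init(db); }

    // const getters: no database access, they only return cached members.
    int id() const { return id_; }
    const std::string& name() const { return name_; }
    const std::string& genre() const { return genre_; }

private:
    // One round trip fills every data member.
    void init(FakeDb& db) {
        const Row row = db.fetchArtist(id_);
        // This sketch assumes both columns are present in the row.
        assert(row.count("name") == 1 && row.count("genre") == 1);
        name_ = row.at("name");
        genre_ = row.at("genre");
    }

    int id_;
    std::string name_;
    std::string genre_;
};
```

Constructing `Artist a(db, 7);` costs one round trip, and every later call to `a.name()` or `a.genre()` is a plain member read.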

First realize that you're re-inventing the wheel here. There are a number of decent object-relational mapping libraries for database access in just about every language. For C/C++ you might look at:
http://trac.butterfat.net/public/StactiveRecord
http://debea.net/trac
Ok, with that out of the way, you probably want to create a static method in your class called find or search which is a factory for constructing objects and selecting them from the database:
Artist MJ = Artist::Find("Michael Jackson");
MJ.set("relevant", "no");
MJ.save();
Note the save method, which takes the modified object and stores it back into the database. If you actually want to create a new record, you'd use a second static factory (called Create here, since new is a reserved word in C++), which instantiates an empty object:
Artist StackOverflow = Artist::Create();
StackOverflow.set("relevant", "yes");
StackOverflow.save();
Note that the set and get methods here just set and get the values on the object, not in the database. To actually read from or write to the database you'd use the static Find method or the object's save method.
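A rough sketch of this find/set/save pattern in compilable C++ might look like the following. The in-memory artistTable() map is a hypothetical stand-in for the real database, the factory is named Create and takes the record name (both are choices made for this sketch), and Find returns std::optional so that a missing record is representable.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical in-memory "table": artist name -> (field -> value).
using Fields = std::map<std::string, std::string>;
inline std::map<std::string, Fields>& artistTable() {
    static std::map<std::string, Fields> table;
    return table;
}

class Artist {
public:
    // Factory for a new, unsaved record.
    static Artist Create(const std::string& name) { return Artist(name, {}); }

    // Factory that loads an existing record, or nothing if it is absent.
    static std::optional<Artist> Find(const std::string& name) {
        auto it = artistTable().find(name);
        if (it == artistTable().end()) return std::nullopt;
        return Artist(name, it->second);
    }

    // set/get touch only the in-memory copy held by this object.
    void set(const std::string& key, const std::string& value) {
        fields_[key] = value;
    }
    std::string get(const std::string& key) const {
        auto it = fields_.find(key);
        return it == fields_.end() ? std::string() : it->second;
    }

    // save() is the only member that writes to the store.
    void save() const { artistTable()[name_] = fields_; }

private:
    Artist(std::string name, Fields fields)
        : name_(std::move(name)), fields_(std::move(fields)) {
        assert(!name_.empty());  // the name is the lookup key in this sketch
    }

    std::string name_;
    Fields fields_;
};
```

Until save() is called, a change made with set() is visible only on that one object; a fresh Find() still returns the stored values.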

There are existing tools that reverse-engineer databases into Java classes (and probably into other languages). Consider using one of them and converting the output to C++.

I would not recommend having your get methods go to the database at all, unless absolutely necessary for your particular problem. It makes for a lot more places something could go wrong, and probably a lot of unnecessary reads on your DB, and could inadvertently tie your objects to db-specific features, losing a lot of the benefits of a tiered architecture. As far as your domain model is concerned, the database does not exist.
edit - this is for #2 (obviously). For #1 I would say no, for many of the same reasons.

Another alternative would be to not automate creating the classes, and instead create separate classes that only contain the data members that individual executables are interested in, so that those classes only pull the necessary data.
Don't know how many tables we're talking about, though, so that may explode the scope of your project.

Related

How to partly initialize c++ objects when I don't need all values from the database?

I’m trying to cut down on the number of network queries in my C++ program (to increase speed). When displaying search results, I don’t want each of the (sometimes thousands of) objects found by the search to initialize itself completely from the database when I only need to display part of this information.
It is much faster to perform one bigger query where I get all the information I want to display about the objects at once in the query (for example, for each object/row I select the id, the name and the location), passing them to a bigger constructor, and letting all other members be default values. Previously, and in other cases where I need the complete object, I just pass the ID to the object, then call initializeFromDatabase() directly to set all the other values.
//current solution (problem is, I might need many constructors like this for different purposes)
auto *myobject = new MyObject(345, "ObjectName", "Europe");
//no further (costly) initialization since I only need the following 2 values for my search results.
myobject->getName();
myobject->getLocationName();
//previous solution (resulting in too many queries)
auto *myobject = new MyObject(345);
myobject->initializeFromDatabase();
myobject->getName();
myobject->getLocationName();
//I could also query the other 30 or so members here, everything is set.
This doesn’t feel like good practice though, I would need other custom constructors for say, another search window displaying other kinds of data about the objects.
Are there any general best practices / a suitable design pattern to solve this sort of problem? Should I create a “Search object” that is its own class and that can then be used to create the complete object when needed? Or always initialize with only the database ID (setting a flag that the object is not initialized yet) and use the setters I need?
I found that a solution to this would be to use some sort of Lazy Loading, since I want to quickly load part of the object for the list, then load all of it if a user clicks on one of the objects. For example, a Virtual Proxy or the Ghost Design pattern would be suitable. I simply create a proxy object for displaying the search results (and for other lists in the program) that can create the full object on demand. Every proxy object has one constructor so I avoid the problem of using lots of different constructors for different purposes.
See Chapter 11 of Patterns of Enterprise Application Architecture by Martin Fowler (published by Addison-Wesley Professional, 2002)
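A minimal sketch of the virtual-proxy idea described above, under stated assumptions: FullObject, ObjectProxy, loadFromDatabase and the load counter are hypothetical names for this illustration, and the "query" is simulated rather than sent to a real database.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical full object. In a real program loadFromDatabase() would
// issue the expensive query; a counter stands in for that cost here.
struct FullObject {
    static inline int loads = 0;
    int id;
    std::string name;
    std::string location;
    std::string description;  // stands in for the ~30 other members

    static FullObject loadFromDatabase(int id) {
        ++loads;
        return FullObject{id, "ObjectName", "Europe", "long description"};
    }
};

// Proxy with a single constructor: it holds only what the result list
// shows and creates the full object the first time it is requested.
class ObjectProxy {
public:
    ObjectProxy(int id, std::string name, std::string location)
        : id_(id), name_(std::move(name)), location_(std::move(location)) {}

    const std::string& getName() const { return name_; }
    const std::string& getLocationName() const { return location_; }

    // Lazy load: at most one simulated query per proxy, only on demand.
    const FullObject& full() {
        if (!full_) {
            full_ = std::make_unique<FullObject>(
                FullObject::loadFromDatabase(id_));
            assert(full_->id == id_);  // loaded record matches the proxy
        }
        return *full_;
    }

private:
    int id_;
    std::string name_;
    std::string location_;
    std::unique_ptr<FullObject> full_;
};
```

The search window would build one ObjectProxy per row from the single large query and call full() only for the row the user clicks on.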

Future-proofing data types (i.e. creating structs with only one member in hopes they can be expanded later, when needed)

Say I want to create a list of variables/objects to store something very specific (say, the coordinates where an enemy needs to spawn in a video game). At first I would only need a simple point in space to store this information, but later on I may want to add the enemy type and other data specific to each element of this list. Is it good practice to write a whole new class or struct with just the initial data member I need, in the hope that whenever I need to update the list with more data per element I can just add members to this previously redundant struct/class? Furthermore, is packaging an already existing type into a new one in the spirit of being more descriptive something that actually helps code readability?
Sounds like a classic case for some good old object-oriented programming.
So the idea would basically be that you have your simple struct/class with, as mentioned in your example, coordinates. If you want to add another enemy with some extra attributes to be stored, you create a new class/struct that inherits from the very basic class. That way you get all the attributes the basic class had in your new class, plus you can define new ones.
That way you have a good structure in your code and it is easy to understand what is going on. Extensibility and reusability also benefit from this, which is why the approach is so widely used.
This might sound a little confusing at first, but I recommend reading up on inheritance and object-oriented programming in general. I promise it is not too hard once you get used to the thinking patterns.
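As a small illustration of the suggestion above, here is one possible shape for the spawn-point example. Vec2, SpawnPoint, EnemySpawnPoint and squaredDistanceFromOrigin are hypothetical names invented for this sketch.

```cpp
#include <string>

// Plain 2D point, the "already existing type" from the question.
struct Vec2 {
    float x = 0.0f;
    float y = 0.0f;
};

// Descriptive wrapper with a single member for now.
struct SpawnPoint {
    Vec2 position;
};

// Later extension: inherits position and adds enemy-specific data.
struct EnemySpawnPoint : SpawnPoint {
    std::string enemyType;
};

// Code written against the base type keeps working for derived objects.
float squaredDistanceFromOrigin(const SpawnPoint& s) {
    return s.position.x * s.position.x + s.position.y * s.position.y;
}
```

One caveat with this route: storing EnemySpawnPoint objects by value in a std::vector<SpawnPoint> would slice off the added members, so a list that mixes both kinds needs pointers or references to the base type.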

Worth using getters and setters in DTOs? (C++)

I have to write a bunch of DTOs (Data Transfer Objects) - their sole purpose is to transfer data between client app(s) and the server app, so they have a bunch of properties, a serialize function and a deserialize function.
When I've seen DTOs they often have getters and setters, but is there any point for these types of class? I did wonder if I'd ever put validation or do calculations in the methods, but I'm thinking probably not, as that seems to go beyond the scope of their purpose.
At the server end, the business layer deals with logic, and in the client the DTOs will just be used in view models (and to send data to the server).
Assuming I'm going about all of this correctly, what do people think?
Thanks!
EDIT: And if so, would there be any issue with putting the get / set implementation in the class definition? Saves repeating everything in the cpp file...
If you have a class whose explicit purpose is just to store its member variables in one place, you may as well just make them all public.
The object would likely not require a destructor (you only need a destructor if you need to clean up resources, e.g. pointers, but if you're serializing a pointer, you're just asking for trouble). It's probably nice to have some convenience constructors as syntactic sugar, but nothing really necessary.
If the data is just a Plain Old Data (POD) object for carrying data, then it's a candidate for being a struct (fully public class).
However, depending on your design, you might want to consider adding some behavior, e.g. an .action() method, that knows how to integrate the data it is carrying to your actual Model object; as opposed to having the actual Model integrating those changes itself. In effect, the DTO can be considered part of the Controller (input) instead of part of Model (data).
In any case, in any language, a getter/setter is a sign of poor encapsulation. It is not OOP to have a getter/setter for each instance field. Objects should be Rich, not Anemic. If you really want an Anemic Object, then skip the getter/setter and go directly to a fully public POD struct; there is almost no benefit to using getters/setters over a fully public struct, except that it complicates the code, so it might give you a higher rating if your workplace uses lines of code as a productivity metric.
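A fully public DTO along these lines could be sketched as below. AddressDto, its fields and the line-based wire format are assumptions made for illustration; a real client/server pair would more likely use an established serialization format.

```cpp
#include <sstream>
#include <string>

// Hypothetical DTO: all members public, no getters or setters.
struct AddressDto {
    std::string name;
    std::string city;
    int postcode = 0;

    // Toy wire format with one field per line, chosen only to keep the
    // sketch short. It assumes the string fields contain no newlines.
    std::string serialize() const {
        std::ostringstream out;
        out << name << '\n' << city << '\n' << postcode << '\n';
        return out.str();
    }

    // Inverse of serialize() for well-formed input.
    static AddressDto deserialize(const std::string& data) {
        std::istringstream in(data);
        AddressDto dto;
        std::getline(in, dto.name);
        std::getline(in, dto.city);
        in >> dto.postcode;
        return dto;
    }
};
```

Because the members are public, view-model code can read `dto.city` directly, and defining the two functions inside the struct keeps everything in the header.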

Object Oriented Design - The easiest case, but I'm confused anyway!

When I wrap up some procedural code in a class (in my case C++, but that is probably not of interest here) I'm often confused about the best way to do it. By procedural code I mean something that you could easily put in a procedure and where you use the surrounding object mainly for clarity and ease of use (error handling, logging, transaction handling...).
For example, I want to write some code, that reads stuff from the database, does some calculations on it and makes some changes to the database. For being able to do this, it needs data from the caller.
What is the best way to get this data into the object? Let's assume that it needs 7 values and a list of integers.
My ideas are:
List of Parameters of the constructor
Set Functions
List of Parameters of the central function
Advantage of the first solution is that the caller has to deliver exactly what the class needs to do the job and ensures also that the data is available right after the class has been created. The object could then be stored somewhere and the central function could be triggered by the caller whenever he wants to without any further interaction with the object.
It's almost the same in the second example, but now the central function has to check whether all necessary data has been delivered by the caller. And the question is whether you have a single set function for every piece of data or only one for all of it.
The last solution has only the advantage that the data does not have to be stored before execution. But then it looks like a normal function call and the benefits of the class approach disappear.
How do you do something like that? Are my considerations correct? Am I missing some advantages/disadvantages?
This stuff is so simple but I couldn't find any resources on it.
Edit: I'm not talking about the database connection. I mean all the data needed for the procedure to complete. For example, all the information of a bookkeeping transaction.
Let's do a poll. What do you like more:
class WriteAdress {
public:
    WriteAdress(string name, string street, string city);
    void Execute();
};
or
class WriteAdress {
public:
    void Execute(string name, string street, string city);
};
or
class WriteAdress {
public:
    void SetName(string Name);
    void SetStreet(string Street);
    void SetCity(string City);
    void Execute();
};
or
class WriteAdress {
public:
    void SetData(string name, string street, string city);
    void Execute();
};
Values should be data members if they need to be used by more than one member function. So a database handle is a prime example: you open the connection to the database and get the handle, then you pass it in to several functions to operate on the database, and finally close it. Depending on your circumstances you may open it directly in the constructor and close it in the destructor, or just accept it as a value in the constructor and store it for later use by the member functions.
On the other hand, values that are only used by one member function and may vary every call should remain function parameters rather than constructor parameters. If they are always the same for every invocation of the function then make them constructor parameters, or just initialize them in the constructor.
Do not do two-stage construction. Requiring that you call a bunch of setXYZ functions on a class after the constructor before you can call a member function is a bad plan. Either make the necessary values initialized in the constructor (whether directly, or from constructor parameters), or take them as function parameters. Whether or not you provide setters which can change the values after construction is a different decision, but an object should always be usable immediately after construction.
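The split between data members and function parameters described in this answer might look like the sketch below. FakeConnection and AddressWriter are hypothetical names, and the "connection" only records what would have been written.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a database handle: it just records writes.
struct FakeConnection {
    std::vector<std::string> written;
};

class AddressWriter {
public:
    // The handle is shared by every member function, so it is taken in
    // the constructor and stored. The object is usable immediately.
    explicit AddressWriter(FakeConnection& conn) : conn_(conn) {}

    // The address varies on every call, so it stays a function parameter.
    void Execute(const std::string& name, const std::string& street,
                 const std::string& city) {
        conn_.written.push_back(name + ", " + street + ", " + city);
    }

    // Second member function that also needs the stored handle.
    std::size_t Count() const { return conn_.written.size(); }

private:
    FakeConnection& conn_;
};
```

There is no point between construction and the first Execute() call at which the object is half-initialized, which is the property the answer asks for.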
Interface design is very important, but in your case what you need to learn is that worse is better.
First choose the simplest solution you have and write it now.
Then you'll see what the flaws are, so fix them.
Repeat until it's no longer important to fix them.
The idea is that you need experience to understand how to get directly to the "best" (or, better said, "least bad") solution for a given type of problem (that's what we call a "design pattern"). To get that experience you'll have to hit problems fast, solve them and try to deeply understand why something was wrong.
That's what you'll have to do each time you try something "new". Errors are not a problem if you fix them and learn from them.
You should use constructor parameters for all values that are necessary in any case (consider that many programming languages also support constructor overloading).
This leads to the second point: setters should be used to introduce optional parameters, or to update values.
You can also combine these methods: expect the necessary parameters in the constructor and then pass them to their setter functions. This way you have to write the validity checks only once (in the setters).
Central functions should use temporary parameters only (timestamps, ..)
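The combination described in this answer (required values in the constructor, routed through the setters so validation lives in one place) can be sketched like this. The Address class, its fields and the "must not be empty" rule are assumptions chosen for the example.

```cpp
#include <stdexcept>
#include <string>

class Address {
public:
    // Required values arrive through the constructor, which forwards them
    // to the setters so the validity check is written only once.
    Address(const std::string& name, const std::string& city) {
        SetName(name);
        SetCity(city);
    }

    void SetName(const std::string& name) {
        name_ = RequireNonEmpty(name, "name");
    }
    void SetCity(const std::string& city) {
        city_ = RequireNonEmpty(city, "city");
    }

    // Optional value: only a setter, no constructor parameter.
    void SetCountry(const std::string& country) { country_ = country; }

    const std::string& Name() const { return name_; }
    const std::string& City() const { return city_; }
    const std::string& Country() const { return country_; }

private:
    // Throws before anything is assigned, so a failed update leaves the
    // previous value in place.
    static const std::string& RequireNonEmpty(const std::string& value,
                                              const char* field) {
        if (value.empty()) {
            throw std::invalid_argument(std::string(field) +
                                        " must not be empty");
        }
        return value;
    }

    std::string name_;
    std::string city_;
    std::string country_;
};
```

Both `Address("", "London")` and a later `SetName("")` fail with the same exception, since they go through the same check.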
First off, it sounds like you are trying to do too much at once. Reading, calculating and updating are all separate operations that can themselves probably be split down further.
A technique I use when I'm thinking about the design of a method or class is to think: 'what do I want the highest-level method to ideally look like?' i.e. think about the separate components of the method and split them down. That's top-down design.
In your case, I envisaged this in my head (C#):
public static void Dostuff(...)
{
Data d = ReadDatabase(...);
d.DoCalculations(...);
UpdateDatabase(d);
}
Then do the same thing for each of those methods.
When you come to passing in parameters to your method, you need to consider whether the data you're passing in is stored or not - i.e. if your class is static (it cannot be instantiated, and is instead just a collection of methods etc) or if you make objects of the class. In other words: each object of the class has a state.
If the parameters can indeed be considered to be attributes of the class, they define its state, and should be stored as private variables with getters and setters for each, where necessary. If the class instead has no state, it should be static and the parameters passed directly to the method.
Either way, it is common, and not considered bad practice, to have both a constructor and a few get / set functions where necessary. It is also common to have to check the state of the object at the beginning of a method, so I wouldn't worry about that.
As you can see, it largely depends on what else you are doing in this class.
The reason you can't find many resources on this is that the 'right' answer is hugely domain-specific; it depends heavily on the specific project. The best way to find out is usually by experiment.
(For example: You're right about the advantages of the first two methods. An obvious disadvantage is the use of memory to store the data the whole time the object exists. This disadvantage doesn't matter in the least if your project needs two of these data objects; it's potentially a huge problem if you need a very large number. If it's a big live dataset, you're probably better querying for data as you need it, as implied by your third solution... but not definitely, as there are times when it's better to cache the data.)
When in doubt, do a quick test implementation with a simplest-possible interface; just writing it will frequently make it clearer what the pros and cons are for your project.
Specifically addressing your example it seems as though you are still thinking too procedurally.
You should make an object that initialises the connection to the database doing all relevant error checking. Then have a method on the object that writes the values in whatever convenient way you prefer. When the object is destroyed it should release the handle to the database. That would be the object oriented way to approach the problem.
I assume the only responsibility of your WriteAddress class is to write an address to a database or an output stream. If so, then you should not worry about getters and setters for the address details; instead, define an interface AddressDataProvider that is to be implemented by all classes with which your WriteAddress class will collaborate.
One of the methods on that interface would be GetAddressParts(), which would return an array of strings as required by WriteAddress. Any class that implements that method will need to respect this array structure.
Then, in WriteAddress, define a setter SetDataProvider(AddressDataProvider). This method will be called by the code that instantiates your WriteAddress object(s).
Finally, in your Execute() method, obtain the data that are required by calling GetAddressParts() on the "data provider" that you set and write out your address.
Notice that this design shields WriteAddress from subsidiary activities that are not strictly part of its responsibilities. So, WriteAddress does not care how the address details are retrieved; it does not even care about knowing and holding the address details. It just knows from where to get them and how to write them out.
This is obvious even in the description of this design: only two names WriteAddress and AddressDataProvider come up; there is no mention of database or how to pass the address details. This is usually an indication of high cohesion and low coupling.
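A compact sketch of the provider design described above, with some assumptions added for the sake of a complete example: FixedAddress and RenderAddress are hypothetical helpers, the output goes to a std::ostream rather than a database, and Execute() silently does nothing when no provider has been set.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Interface from the text: anything that can supply the address parts.
class AddressDataProvider {
public:
    virtual ~AddressDataProvider() = default;
    virtual std::vector<std::string> GetAddressParts() const = 0;
};

// Hypothetical provider; a real one might read a form or a database.
class FixedAddress : public AddressDataProvider {
public:
    std::vector<std::string> GetAddressParts() const override {
        return {"Ada Lovelace", "1 Main St", "London"};
    }
};

class WriteAddress {
public:
    explicit WriteAddress(std::ostream& out) : out_(out) {}

    void SetDataProvider(const AddressDataProvider& provider) {
        provider_ = &provider;
    }

    // Writes one part per line; does nothing if no provider was set.
    void Execute() {
        if (provider_ == nullptr) return;
        for (const std::string& part : provider_->GetAddressParts()) {
            out_ << part << '\n';
        }
    }

private:
    std::ostream& out_;
    const AddressDataProvider* provider_ = nullptr;
};

// Convenience helper: renders any provider's address into a string.
std::string RenderAddress(const AddressDataProvider& provider) {
    std::ostringstream out;
    WriteAddress writer(out);
    writer.SetDataProvider(provider);
    writer.Execute();
    return out.str();
}
```

WriteAddress never stores the address parts itself; it only knows whom to ask and how to write the result.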
I hope this helps.
You can implement each approach, since they don't exclude each other; then you'll see which are most useful.

Single Document project structure

I have previously asked about the proper way of accessing member variables present in the project. In the project, I have a CWinApp-derived class, a CMainFrm class, and a list of different view classes. Currently, I have instances of different user-defined classes instantiated in the CWinApp-derived class, while the rest of the classes use a pointer obtained from the AfxGetApp() function and then access the different user-defined classes through it. I was told by some community members on the MFC newsgroup that this is a very bad design (i.e. the parent should not know anything about an app-class, view class, or document class). However, I'm not sure how else I can access the various user-defined classes without using this design. It would be great to hear some suggestions, as I'm not familiar enough with MFC to come up with proper search terms.
"(i.e. the parent should not know anything about an app-class, view class, or document class)"
I'm not sure I understand this sentence, what do you mean with 'parent' here?
Anyway, in my opinion, the design you describe isn't really a problem. It's a trade off: do you either pass these classes to all functions that need them, complicating their use and API, or do you store them as a sort of global variables like you're doing? It depends on the data that is accessed, and how often. Data that is needed in many places can just as well be 'global'.
There are multiple ways of making data 'global': make it a member of CWinApp (that is, your CWinApp-derived class) or of CMainFrame, make an actual 'global variable', make a singleton, ...
The problem with global variables is that it becomes hard to figure out who accesses them, when, and from where. If you store the data as a member of CWinApp, you can access it through an accessor function and trace access from there (through log messages, breakpoints, ...). This, in my opinion, mitigates most of the problems associated with global variables. What I usually do nowadays is use a Loki singleton.
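To illustrate the accessor idea without depending on MFC, here is a plain C++ stand-in. App, Settings and the access counter are hypothetical; in the real project the accessor would sit on the CWinApp-derived class and be reached through AfxGetApp(), and the counter would be replaced by a log message or a breakpoint.

```cpp
#include <string>

// Hypothetical application-wide data.
struct Settings {
    std::string language = "en";
};

// Plain C++ stand-in for the application object.
class App {
public:
    static App& Instance() {
        static App app;  // created on first use
        return app;
    }

    // Single choke point for access: a log line or breakpoint placed
    // here sees every caller. This sketch just counts the calls.
    Settings& GetSettings() {
        ++settingsAccessCount_;
        return settings_;
    }

    int SettingsAccessCount() const { return settingsAccessCount_; }

private:
    App() = default;

    Settings settings_;
    int settingsAccessCount_ = 0;
};
```

Every read or write goes through GetSettings(), so tracing who touches the data only requires instrumenting that one function.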
The reason that is stated in your post for not making data a member of CWinApp, as a decoupling issue, is (in the context that you've presented it) a bit strange imo. If certain classes need access, they'll need to know of those data structures anyway, and their storage location is irrelevant. Maybe it's just because I don't know about the specifics of your design.