Dynamic SQL vs Static SQL - c++

In our current codebase we are using MFC db classes to connect to DB2. This is all old code that has been passed onto us by another development team, so we know some of the history but not all.
Most of the Code abstracts away the creation of SQL queries through functions such as Update() and Insert() that prepend something like "INSERT INTO SCHEMA.TABLE" onto a string that you supply. This is done through the recordset classes that sit on top of the database class
The other way to do the SQL queries is to execute them directly on the database class using dbclass.ExecuteSQL(String).
We are wondering what the pro's and cons of each approach is. From our perspective its much easier to do the ExecuteSQL() call, as we dont have to write another class etc, but there must be good reasons to do the other way. we are just not sure what they are.
Any help would be great!
Thanks Mark
Update----
I think I may have misunderstood Dynamic and Static SQL. I think our code always uses Dynamic, so my question really becomes, should I construct the SQL strings myself and do an ExecuteSQL() or should this be abstracted away in a class for each table in the database, as the recordset classes from mfc seem to do?

The ATL OLE DB consumer database classes are absolutely the way to go. Beyond the risks of injection (mentioned by Skurmedel), piles of string-concatenated queries will become impossible to maintain very quickly.
While the ATL classes can be initially tedious, they provide the benefit of strong-typed and named columns, result navigation, connection and session management, etc.

I would try to abstract it away if it's many SQL statements. Managing dozens of different SQL queries quickly become tedious. Also it's easier to validate input that way.

Related

How to call SQL functions / stored procedure when using the Repository pattern

What is the best way to call a SQL function / stored procedure when converting code to use the repository pattern? Specifically, I am interested in read/query capabilities.
Options
Add an ExecuteSqlQuery to IRepository
Add a new repository interface specific to the context (i.e. ILocationRepository) and add resource specific methods
Add a special "repository" for all the random stored procedures until they are all converted
Don't. Just convert the stored procedures to code and place the logic in the service layer
Option #4 does seem to be the best long term solution, but it's also going to take a lot more time and I was hoping to push this until a future phase.
Which option (above or otherwise) would be "best"?
NOTE: my architecture is based on ardalis/CleanArchitecture using ardalis/Specification, though I'm open to all suggestions.
https://github.com/ardalis/CleanArchitecture/issues/291
If necessary, or create logically grouped Query services/classes for
that purpose. It depends a bit on the functionality of the SPROC how I
would do it. Repositories should be just simple CRUD, at most with a
specification to help shape the result. More complex operations that
span many entities and/or aggregates should not be added to
repositories but modeled as separate Query objects or services. Makes
it easier to follow SOLID that way, especially SRP and OCP (and ISP)
since you're not constantly adding to your repo
interfaces/implementations.
Don't treat STORED PROCEDURES as 2nd order citizens. In general, avoid using them because they very often take away your domain code and hide it inside database, but sometimes due to performance reasons, they are your only choice. In this case, you should use option 2 and treat them same as some simple database fetch.
Option 1 is really bad because you will soon have tons of SQL in places you don't want (Application Service) and it will prevent portability to another storage media.
Option 3 is unnecessary, stored procedures are no worse than simple Entity Framework Core database access requests.
Option 4 is the reason why you cannot always avoid stored procedures. Sometimes trying to query stuff in application service/repositories will create very big performance issues. That's when, and only when, you should step in with stored procedures.

Django functions vs database functions

What is the best way to implement functions while writing an app in django? For example, I'd like to have a function that would read some data from other tables, merge then into the result and update user score based on it.
I'm using postgresql database, so I can implement it as database function and use django to directly call this function.
I could also get all those values in python, implement is as django function.
Since the model is defined in django, I feel like I shouldn't define functions in the database construction but rather implement them in python. Also, if I wanted to recreate the database on another computer, I'd need to hardcode those functions and load them into database in order to do that.
On the other hand, if the database is on another computer such function would need to call database multiple times.
Which is preferred option when implementing an app in django?
Also, how should I handle constraints, that I'd like the fields to have? Overloading the save() function or adding constraints to database fields by hand?
This is a classic problem: do it in the code or do it in the DBMS? For me, the answer comes from asking myself this question: is this logic/functionality intrinsic to the data itself, or is it intrinsic to the application?
If it is intrinsic to the data, then you want to avoid doing it in the application. This is particularly true where more than one app is going to be accessing / modifying the data. In which case you may be implementing the same logic in multiple languages / environments. This is a situation that is ripe with ways to screw up—now or in the future.
On the other hand, if this is just one app's way of thinking about the data, but other apps have different views (pun intended), then do it in the app.
BTW, beware of premature optimization. It is good to be aware of DB accesses and their costs, but unless you are talking big data, or a very time sensitive UI, then machine-time, and to a lesser degree user-time, is less important than your time. Getting v1.0 out the door is often more important. As the inimitable Fred Brooks said, "Plan to throw one away; you will anyhow."

Where to store SQL code for a C++ application?

We have a C++ application that utilizes some basic APIs to send raw queries to a MS SQL Server. Scattered through the various translation units in our program, we have simple 1-2 line queries as C++ strings, and every now and then you'll see more complex queries that can be over 20 lines.
I can't help but think that the larger queries, specifically the 20+ line ones, should not be embedded in C++ code as constant strings. I want to propose pulling these out into separate text files that are loaded on-demand by the C++ application, however I'm not sure if this is the best approach.
What design choices are typical for situations like this? I definitely feel there needs to be improvement, I just don't know if moving the SQL queries out into data files (text files) is the best idea.
You could make a DAL (Data Access Layer).
It would be the API that the rest of the program talks to. Then you can mess around and try anything and everything (Stored procedures, caching, etc.) without disturbing the main program.
Move them into their own files, or even into their own stored procedures. Queries embedded in the application cannot be changed without a recompile, and depending on your release procedures, that could severely impair your ability to respond to emergencies or deploy hot fixes. You could alter your app to cache the file contents, if you go down that road, and even periodically check the files for updates.
the best "design choice" - for many different reasons - is to use MSSQL stored procedures whenever/wherever possible.
I've seen code that segregates SQL queries into a common module, but I don't think there's much benefit to a common "queries module" (or a standalone text file) over having the SQL queries spelled out as string literals in the module that's calling them.
Stored procedures, on the other hand, increase modularity, enhance security, and can vastly improve performance.
IMHO...
I would leave the SQL embedded in the C++ functions that use it: it will be easier to read and understand what the code does.
If you have SQL queries scattered around your code I'd say that there is some problem with the overall structure of the classes you are using: you should have some (or even just one) 'low level' classes that handle the interaction with the database, and the rest of the code uses these classes.
I personally don't like using stored procedure: if you have to support a different database server the porting will be a pain, I never saw that much of a performance improvement and to understand what the code does you have to jump back and forth between the stored procedures and the C++.
It really depends, here are some notes:
1) If all your sql code resides in the application, then your application is pretty much self contained in terms of logic. This is good as you have done in the current application. In terms of speed, this can be a little slower as SQL will need to be parsed when when you run these queries(also depends if you used Prepared statements,etc which can speed it up).
2) The second approach is to put all SQL logic as stored procedures on the server. This is a very much preferred approach for even small SQL queries whether one line or not. You just build a DAL layer. In terms of performance this is very good, however the logic lives in two different systems, your C++ app and the SQL server. You will quite likely need to build a small utility application that can translate the stored procedures input and output to template code (be it C++ or any other) to make your life easier.
3) A mixed approach with the above two. I would not recommend this route.
You need to think about how these queries are likely to change over time, and compare it to how the related C++ code is likely to change. If the queries are relatively independent of the code, and have a higher likelihood of change, then I would either load them at runtime from separate files, or use stored procedures instead. That approach allows for changing the queries without recompiling the C++ code. On the other hand, if the queries are highly coupled to the C++ code, making a change in one likely to accompany a change in the other, I would keep the queries in the code. This approach makes a change more localized and less error prone.

When should I use C++ instead of SQL?

I am a C++ programmer who occasionally uses MySQL to work with databases, but my SQL knowledge is rather limited. However I am surely willing to change that.
At the moment I am trying to do analysis(!) on the data I have in my database solely with SQL queries. But I am about to give up, and instead import the data to C++ and do the analysis with C++ code.
I have discussed this with my colleagues, and they also push me to use C++, saying that SQL is not meant for complex analysis but mainly for importing (from the existing tables) and exporting (to new tables) data, and a little bit more such as merging data to - e.g. - joined tables.
Can somebody help me drawing a line? So I know when to switch to C++? Of course performance is also an issue.
What are indications that things get to complex in SQL? Or maybe I just take the wrong approach with designing the queries. Then where can I find tutorials, books, ... to take a better approach?
I hope this is not too vague. I am really a bit lost.
SQL excels at analyzing large sets of relational data.
The place to draw the line is the scale of your analysis.
If you analyze individual records one at a time, do it in your application.
If you analyze large sets of records as a unit, SQL is definitely the best tool for that job.
Row-by-row analysis is not something SQL is designed or optimized for very well. But, if you want to know something about a million-row group of data, do it in the database.
I have discussed this with my colleagues, and they also push me to use C++, saying that SQL is not meant for complex analysis but mainly for importing (from the existent tables) and exporting (to new tables) data, and a little bit more such as merging data to - e.g. - joined tables.
This is completely arbitrary. Learn SQL. There are a lot of resources available on the web for free.
You can do very complex analysis of data in SQL, provided you know how use the features that SQL offers.
SQL has features for doing relational operations, like joins and projections. Also for doing set operations like union, intersection, and restriction (subset). Also for doing basic arithmetic on numbers, like the four arithmetic operators, and built in functions like SQRT. Also statistical functions like COUNT, SUM, and AVG that can be combined with projections in very interesting ways. A good DBMS will let you extend the built in functions with your own functions written in C, C++ or maybe PL/SQL.
The power you get from these features depends on how well designed the database is. A well designed database conforms to the relational model, and should be relvant to your intended use of the data.
SQL code can be stored in the database in stored prodecures. It can be stored in SQL script files. And, as you already know, it can be embedded in application programs. In addition to SQL, you can use OLAP tools and report generators to do standard things with the data very easily.
The people who advise you to keep all of your processing in C++ sound like they have learned just enough to use a database like a big and stupid file system. A good DBMS is much more than that.
SQL is usually very efficient handling its own database (depends on the server implementation).
You should use queries to analyze the database.
The main reason for that would be the communication overhead.
Even if the server is on the local machine (remote servers would have obvious communication overhead), you'll still have to retrieve the stored information from the SQL server to your c++ program for analysis.
Now if you have 10000s of lines in the SQL you would have to get the SQL server to read them all and send them to your program where it would probably create a local copy of the data for you to work on.
If you let the SQL server do it with queries, you'll gain the complex optimizations it does according the kind of query you're executing, and in the end you can retrieve only a limited amount of data (the one you actually need) through the communication.
You made right decision to begin data analysis with SQL. Now, when you feel that your knowledge of SQL limits you, you have 2 choices: give up and switch back to familiar but not very efficient toolset (C++) or bring your level with SQL up.
It's possible that at some point SQL will become too complex too, but then C++ won't be the answer either - most likely some specialized tools.
In my opinion you should only perform analysis in C++ if no equivalent for the analysis function is provided by database server, As database servers are very smart and it is hard and almost imposible to beat the algorithm efficiency of analysis function of database server. Also bringing raw data to the application for performing analysis also includes lots of overheads.
If at some point plain SQL becomes overly complex native PL of the sever could be a good choice
I agree with JNK and Jochai, but disagree with Ascanio.
It's better to improve the knowledge in database systems.
Sql comes with it
So, this is something I've been thinking about and it seems to me that SQL, as just a platform/language for storing/manipulating data, should have no inherent advantage over a C++ or C library. It seems to me that theoretically you could build a C++ library just as efficient, if not more efficient, than SQL at doing this. In doing so, you would be able to build it from the ground up, in terms of how ints, chars, strings, and other data types are stored, and make it easier to interface with you particular application (like web development). You could even make it so that the queries could be done in a language like javascript (allowing web developers to focus on just learning one language really well).

Using SQL statements to query in-memory objects

Suppose I have a collection of C++ objects in memory and would like to query them using an SQL statement. I’m willing to implement some type of interface to expose the objects’ properties like columns of a database row. Is there a library available to accomplish this?
In essence, I’m trying to accomplish something like LINQ without using the .NET platform.
C++ objects are not the same thing as SQL tables.
If you want to use SQL syntax to query the objects, you will first need to map/persist them into a table structure (ORM, object-relational-mapping). There are a number of fine ORM solutions out there besides Linq.
Once you have your objects represented in SQL tables, you should look to the SQL engine to do the heavy lifting. Most SQL platforms can be configured to keep a table mostly or always in memory.
As an alternative, you might consider a system specifically designed to cache objects. On Linux, memcached is a leading choice.
There is a pitfall between the object world and entity-relational data (ER). Its called Object-relational impedance mismatch. Basicaly it means You need to "map" object concepts to relational database concepts which is called Object-relational mapping (ORM).
Example for Polymorphism: a derived class is not an ER concept so you need to say for example all attributes from all object belonging to the same class will be stored in the same table with all the "parents" attributes
OR a derived class (object) will be stored in the same table as the abstract class from which it is derived from.
It would be probably best to use a comunity supported ORM, but when the motives are right your company can benefit having your own ORM solution.
In our company we developed our own ORM solution (started 6 years ago so it was written in c++ and the modeller was a windows desktop application). But now we would probably use ADO.NET Entity Framework (LINQ's "big" brother) or another supported ORM
I have also been searching for something of this sort, but it seems the SQLlite is the closest I can find.