Is there some kind of persistency layer that can be used for a regularly-modified list/queue container that stores strings?
The data in the list is just strings, nothing fancy. It could be useful, though, to store a key or hash with each string for definite references, so I thought I'd wrap each string in a struct with an extra key field.
The data should be persisted on each modification, more or less, as spontaneous power-offs might happen.
I looked into Boost.Serialization and it seems easy to use, but I guess I'd have to write the whole queue every time it is modified, closing the file each time to be safe against power-offs, as I see no journaling option there.
I saw SQLite, but it could be over the top as I don't need relations or any sophisticated queries.
And I don't want to reinvent the wheel by doing it manually in some files.
Is there anything available worth looking into?
I have little experience with C++ and the OS beneath it, so I'm unaware of what's available and what's suitable, and I couldn't find anything better.
A potentially simpler alternative to relational databases, when you don't need the relations, is a "NoSQL" database. A document-oriented database might be a reasonable choice based on your description.
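For what it's worth, SQLite need not be over the top here either: with WAL (write-ahead log) journaling, each committed insert or delete survives a power cut, so the whole queue never needs rewriting. A sketch of the idea (in Python for brevity; from C++ you would use the SQLite C API; all names are illustrative):

```python
import sqlite3

# Sketch of a crash-safe string queue backed by SQLite (illustrative names).
# WAL journaling means each committed transaction survives a power cut.
class PersistentQueue:
    def __init__(self, path):
        self.db = sqlite3.connect(path)
        self.db.execute("PRAGMA journal_mode=WAL")
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS queue ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, value TEXT NOT NULL)")
        self.db.commit()

    def push(self, s):
        with self.db:  # commits on success, rolls back on error
            self.db.execute("INSERT INTO queue (value) VALUES (?)", (s,))

    def pop(self):
        with self.db:
            row = self.db.execute(
                "SELECT id, value FROM queue ORDER BY id LIMIT 1").fetchone()
            if row is None:
                return None
            self.db.execute("DELETE FROM queue WHERE id = ?", (row[0],))
            return row[1]

q = PersistentQueue(":memory:")  # a file path in real use
q.push("hello")
q.push("world")
print(q.pop())  # hello
```

The integer `id` doubles as the per-string key mentioned in the question, so the extra hash field may not even be needed.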
I have a table with data that must be filled by users. Once this data is filled, the status changes to 'completed' (status is a field inside data).
My question is, is it good practice to create a table for data to be completed and another one with completed data? Or should I only make one table with both types of data, distinguished by the status?
Not just Django
This is actually a very good general question, not necessarily specific to Django. But Django, through its easy use of linked tables (ForeignKey, ManyToMany), is a good use case for one table.
One table, or group of tables
One table has some advantages:
No need to copy the data, just change the Status field.
If there are linked tables then they don't need to be copied.
If you want to remove the original data (i.e., avoid keeping redundant data) then this avoids having to worry about deleting the linked data (and deleting it in the right sequence).
If the original add and the status change are potentially done by different processes then one table is much safer - i.e., marking the field "complete" twice is harmless but trying to delete/add a 2nd time can cause a lot of problems.
"or group of tables" is a key here. Django handles linked tables really well, but doing all of this with two separate groups of linked tables gets messy, and it's easy to forget things when you change fields or data structures.
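To make the one-table approach concrete, here is a minimal sketch in plain SQL via Python's sqlite3 (in Django this would be a single model with a status field; table and column names are illustrative):

```python
import sqlite3

# One table: "completing" a record is an UPDATE of the status column,
# not a copy between tables. All names here are illustrative.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE entry (
    id INTEGER PRIMARY KEY,
    data TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'pending')""")
db.execute("INSERT INTO entry (data) VALUES ('user input')")

# Mark as completed; running this twice is harmless (idempotent),
# unlike a delete-and-reinsert between two tables.
db.execute("UPDATE entry SET status = 'completed' WHERE id = 1")
db.execute("UPDATE entry SET status = 'completed' WHERE id = 1")

status = db.execute("SELECT status FROM entry WHERE id = 1").fetchone()[0]
print(status)  # completed
```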
One table is the optimal way to approach this particular case. Two tables requires you to enforce data integrity and consistency within your application, rather than relying on the power of your database, which is generally a very bad idea.
You should aim to normalize your database (within reason) and utilize the database's built-in constraints as much as possible to avoid erroneous data, including duplicates, redundancies, and other inconsistencies.
Here's a good write-up on several common database implementation problems. Number 4 covers your 2-table option pretty well.
If you do insist on using two tables (please don't), then at least be sure to use an artificial primary key (i.e., a unique value that is NOT just the id) to help maintain integrity. There may be matching integer id values in each table, but there should only ever be one version of each artificial primary key value between the two tables. Again, though, this is not the recommended approach, and it adds complexity in your application that you don't otherwise need.
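As a sketch of letting the database enforce this rather than the application (illustrated with Python's sqlite3; the names are assumptions, and in Django the same idea is expressed via model Meta constraints):

```python
import sqlite3

# Let the database reject bad data instead of policing it in app code.
# A CHECK constraint limits status to known values; UNIQUE bars duplicates.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE entry (
    id INTEGER PRIMARY KEY,
    data TEXT NOT NULL UNIQUE,
    status TEXT NOT NULL DEFAULT 'pending'
        CHECK (status IN ('pending', 'completed')))""")

db.execute("INSERT INTO entry (data) VALUES ('a')")
try:
    # An unknown status value is rejected by the database itself.
    db.execute("UPDATE entry SET status = 'done' WHERE id = 1")
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```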
What is the best way to implement functions while writing an app in Django? For example, I'd like to have a function that reads some data from other tables, merges it into the result, and updates a user's score based on it.
I'm using a PostgreSQL database, so I can implement it as a database function and use Django to call this function directly.
I could also get all those values in Python and implement it as a Django function.
Since the model is defined in Django, I feel like I shouldn't define functions when constructing the database but rather implement them in Python. Also, if I wanted to recreate the database on another computer, I'd need to hardcode those functions and load them into the database.
On the other hand, if the database is on another computer, such a function would need to call the database multiple times.
Which is preferred option when implementing an app in django?
Also, how should I handle constraints that I'd like the fields to have? By overriding the save() method, or by adding constraints to the database fields by hand?
This is a classic problem: do it in the code or do it in the DBMS? For me, the answer comes from asking myself this question: is this logic/functionality intrinsic to the data itself, or is it intrinsic to the application?
If it is intrinsic to the data, then you want to avoid doing it in the application. This is particularly true where more than one app is going to be accessing / modifying the data. In which case you may be implementing the same logic in multiple languages / environments. This is a situation that is ripe with ways to screw up—now or in the future.
On the other hand, if this is just one app's way of thinking about the data, but other apps have different views (pun intended), then do it in the app.
BTW, beware of premature optimization. It is good to be aware of DB accesses and their costs, but unless you are talking big data, or a very time sensitive UI, then machine-time, and to a lesser degree user-time, is less important than your time. Getting v1.0 out the door is often more important. As the inimitable Fred Brooks said, "Plan to throw one away; you will anyhow."
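On the round-trips concern above: even with the logic in the application, the asker's score update can often be a single statement that lets the database do the merging. A sketch with Python's sqlite3 and an invented schema (not the asker's actual tables):

```python
import sqlite3

# One round trip: the database aggregates the other table's data and
# writes the score back, instead of the app fetching rows and looping.
# The schema and names are purely illustrative.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE game_result (user_id INTEGER, points INTEGER);
    CREATE TABLE user_score (user_id INTEGER PRIMARY KEY, score INTEGER);
    INSERT INTO game_result VALUES (1, 10), (1, 25), (2, 5);
    INSERT INTO user_score VALUES (1, 0), (2, 0);
""")

# A single correlated UPDATE merges the results into the score.
db.execute("""
    UPDATE user_score
    SET score = (SELECT COALESCE(SUM(points), 0)
                 FROM game_result
                 WHERE game_result.user_id = user_score.user_id)
""")
score = db.execute(
    "SELECT score FROM user_score WHERE user_id = 1").fetchone()[0]
print(score)  # 35
```

In Django this would typically be an `update()` with an aggregate, but the point stands either way: intrinsic-to-data logic can live in one statement without chattiness.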
Again, sorry for my silly question, but it seems that what I've learned about relational databases should be "erased": there are no joins, so how on earth will I use Merise and UML with NoSQL?
http://en.wikipedia.org/wiki/Class_diagram
Will this one not work for NoSQL?
How you organize your project is a notion independent of the technology used for persistence. In particular, UML, ERD, and similar tools don't apply to relational databases any more than they do to document databases.
The idea that NoSQL has "No Joins" is both silly and unhelpful. It's totally correct that (most) document databases do not provide a join operator; but that just means that when you do need a join, you do it in the application code instead of the query language. The basic facts of organizing your project stay the same.
Another difference is that document databases make expressing some things easier and other things harder. For example, it's often easier to express an entity-relationship constraint in a relational database, but easier to express an inheritance hierarchy in a document database. Both notions can be supported by both technologies, and you will certainly use them when your application needs them, regardless of the technology you end up using.
In short, you should design your application without first choosing a persistence technology. Once you've got a good idea of what you want to persist, you may have a better idea of which technology is a better fit. It may be the case that what you really need is both, or you might need something totally different from either.
EDIT: The idea of a foreign key is no more magical than simply saying "this is a name of that kind of thing". It happens that many SQL databases provide some very terse and useful features for dealing with this sort of thing; specifically, constraints (this column references this other relation, so it cannot take a value unless there is a corresponding row in the referent) and cascades (if the status of the referent changes, make the corresponding change to the reference). This does make it easy to keep data consistent even at the lowest level, since there's no way to tell the database to enter a state where the referent is missing.
The important thing to distinguish, though, is that the idea of giving a database entity (a row in a relational database, a document in a document database) an identity is distinct from the notion of schema constraints. One of the nice things about document databases is that they can easily combine or reorient where data lives, so that you don't always have to have a referent that actually exists. Most document databases use the document class as part of the key, so you can still be sure that the key is meaningful, even when the referent doesn't actually exist.
But most of the time, you actually do want the referent to exist; you don't want a blog post to have an author unless that author actually exists in your system. Just how well this is supported depends a lot on the particular document database; Some databases do provide triggers, or other tools, to enforce the integrity of the reference, but many others only provide some kind of transactional capability, which requires that the integrity be enforced in the application.
The point is: for most kinds of database, every value in the database has some kind of identifier. In a relational database, that's a triple of relation:column:key; in a document database it's usually something like the pair document_class:path. When you need one entity to refer to another, you use whatever sort of key identifies that datum for that kind of database. The Foreign Key constraints found in RDBMSes are just (really useful) syntactic sugar for "if no referent exists then raise ForeignKeyError", which could be implemented with equal power in other ways, if that's helpful for your particular use.
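A minimal sketch of that application-code join and foreign-key check, using plain Python dicts to stand in for a document store (all names are hypothetical):

```python
# Plain dicts standing in for a document store, keyed by document id.
authors = {"alice": {"name": "Alice"}}
posts = {"p1": {"title": "Hello", "author_id": "alice"},
         "p2": {"title": "Orphan", "author_id": "bob"}}

def get_post_with_author(post_id):
    """Application-side 'join': resolve the reference ourselves."""
    post = posts[post_id]
    author = authors.get(post["author_id"])
    if author is None:
        # The application enforces what an RDBMS FK constraint would.
        raise KeyError(f"ForeignKeyError: no author {post['author_id']!r}")
    return {**post, "author": author}

print(get_post_with_author("p1")["author"]["name"])  # Alice
```

The organizing ideas (entities, references, integrity) are unchanged; only where the check runs has moved.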
Goal: I want users to be able to connect directly to an RDBMS (e.g., MS SQL Server) and run some queries with possible cross references.
Tool: SAP BusinessObjects XI Enterprise
Description:
The main reason is that Universe creation is pretty techy. Imagine the SQL DB structure changing frequently, maybe even daily. Hence the synchronization issues.
Is BO capable of doing a cross reference using the BO query GUI, usable by non-techies, to generate a request like:
SELECT
Classroom.Location
FROM
Student,
Classroom
WHERE
Student.Name = 'Foo' AND
Student.ClassroomName = Classroom.Name
...with only an ODBC connection and no Universe (or an auto-generated Universe)?
If yes, does it require foreign keys to be defined?
If no, is there a simple way to create and update (sync) a BO Universe directly from the DB structure? Maybe using their new XML format?
Good question.
Background
I have implemented one very large and "complex" banking database, 500+ tables, that the customer bought BO for. The "complex" is in quotes because although I created a pure 5NF (correctly Normalised to 5NF) RDB, and most developers and the power users did not find it "complex", some developers and users found it "complex". The first BO consultant could not even create a working Universe, and overran his budgeted one month. The second BO consultant created the entire Universe in 10 days. The whole structure (one 5NF RDB; 5 apps; one Universe; web reporting) all worked beautifully.
But as a result of that exercise, it was clear to me that although the Universe is very powerful, it is only required to overcome the impediments of an un-normalised database, or a data warehouse that has tables from many different source databases, which then need to be viewed together as one logical table. The first consultant was simply repeating what he was used to, doing his techie thing, and did not understand what a Normalised db meant. The second realisation was that BO Universe was simply not required for a true (normalised) RDB.
Therefore on the next large banking project, in which the RDB was pretty much 120% of the previous RDB, I advised against BO, and purchased Crystal Reports instead, which was much cheaper. It provided all the reports that users required, but it did not have the "slice and dice" capability or the data cube on the local PC. The only extra work I had to do was to provide a few Views to ease the "complex" bits of the RDB, all in a days work.
Since then, I have been involved in assignments that use BO, and fixed problems, but I have not used XI (and its auto-generated Universe). Certainly I have settled on a preference for simple reporting tools and avoiding the Universe altogether, which has proved itself many times.
In general then, yes, the BO Query GUI (even pre-XI) will absolutely read the RDB catalogue directly, and you can create and execute any report you want from that, without a Universe. Your example is no sweat at all. "Cross references" are no sweat at all. Non-techie users can create and run such reports themselves. I have done scores of these; it takes minutes. Sometimes (e.g., for Supertype-Subtype structures), creating Views eases this exercise even further.
Your Question
Exposes issues which are obstacles to that.
What is coming across is that you do not have a Relational Database. Pouring some data into a container called "relational DBMS" does not transform that content into a Relational Database.
one aspect of a true RDB is that all definitions are in the ISO/IEC/ANSI standard SQL catalogue.
if your "foreign keys" are not in the catalogue, then you do not have Foreign Keys; you do not have Referential Integrity that is defined and maintained by the server.
you probably do not have Rules and Check Constraints either; therefore you do not have Data Integrity that is defined and maintained by the server.
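To make the point about catalogue-defined Foreign Keys concrete, here is a sketch using the question's own Student/Classroom example. Python's sqlite3 is used purely for illustration; the equivalent DDL applies in MS SQL Server:

```python
import sqlite3

# The Foreign Key lives in the catalogue, so the server (not the app)
# maintains Referential Integrity, and report tools can find the relation.
db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = ON")  # SQLite requires opting in
db.executescript("""
    CREATE TABLE Classroom (Name TEXT PRIMARY KEY, Location TEXT NOT NULL);
    CREATE TABLE Student (
        Name TEXT PRIMARY KEY,
        ClassroomName TEXT NOT NULL REFERENCES Classroom(Name));
    INSERT INTO Classroom VALUES ('A1', 'Building 3');
    INSERT INTO Student VALUES ('Foo', 'A1');
""")

# The query from the question works as written.
loc = db.execute("""
    SELECT Classroom.Location FROM Student, Classroom
    WHERE Student.Name = 'Foo'
      AND Student.ClassroomName = Classroom.Name""").fetchone()[0]
print(loc)  # Building 3

# Referential Integrity is enforced by the server, not the app:
try:
    db.execute("INSERT INTO Student VALUES ('Bar', 'NoSuchRoom')")
except sqlite3.IntegrityError:
    print("rejected by FK constraint")
```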
Noting your comments regarding changing "db" structure. Evidently then, you have not normalised the data.
If the data was normalised correctly, then the structure will not change.
Sure, the structure will be extended (columns added; new tables added) but the existing structure of Entities and Attributes will not change, because they have been (a) modelled correctly and (b) normalised
therefore any app code written, or any BO Universe built (and reports created from that), are not vulnerable to such extensions to the RDB; they continue running merrily along.
Yes, of course they cannot get at the new columns and new tables, but providing that is part of the extension; the point is that the existing structure, and everything that was dependent on it, is stable.
Noting your example query. That is prima facie evidence of complete lack of normalisation: Student.ClassroomName is a denormalised column. Instead of existing once for every Student, it should exist once for each Classroom.
I am responding to your question only, but it should be noted that lack of normalisation will result in many other problems, not immediately related to your question: massive data duplication; Update Anomalies; lack of independence between the "database" and the "app" (changes in one will affect the other); lack of integrity (data and referential); lack of stability, and therefore a project that never ends.
Therefore you not only have some "structure" that changes almost daily, there is no part of that "structure" that does not change. That level of ongoing change is classic to the Prototype stage of a project; it has not yet settled down to the Development stage.
If you use BO, or the auto-generated Universe, you will have to auto-generate the Universe daily. And then re-create the report definitions daily. The users may not like the idea of re-developing a Universe plus their reports daily. Normally they wait for the UAT stage of a project, if not the Production stage.
if you have Foreign Keys, since they are in the Standard SQL catalogue, BO will find them.
if you do not have Foreign Keys, but you have some sort of "relation" between files, and some sort of naming convention from which such "relations" can be inferred, BO has a check box somewhere in the auto-generate window that will "infer foreign keys from column names". Of course, it will find "relations" that you may not have intended.
if you do not have naming conventions, then there is nothing that BO can use to infer such "relations". There is only so much magic that a product can perform.
and you still have the problem of "structure" changing all the time, so whatever magic you are relying on today may not work tomorrow.
Answer
Business Objects, Crystal Reports, and all high-end to low-end report tools are primarily written for Relational Databases, which reside in an ISO/IEC/ANSI Standard SQL DBMS. That means, if the definition is in the catalogue, they will find it. The higher-end tools have various additional options (that's what you pay for) to assist with overcoming the limitations of sub-standard contents of an RDBMS, culminating in the Universe; but as you are aware, that takes a fair amount of effort and technical qualification to implement.
The best advice I can give you, therefore, is to get a qualified modeller and model your data, such that it is stable and free of duplication, and your code is stable, etc.; such that simple (or heavy-duty) report tools can be used to (a) define reports easily and (b) run those report definitions without changing them daily. You will find that the "structure" that changes daily, doesn't. What is changing daily is your understanding of the data.
Then your wish will come true: the reports can be easily defined once, by the users, "cross references" and all, without a Universe, and they can be run whenever they like.
Related Material
Your college or project is not the first in the universe to attempt to (a) model its data or (b) implement a database, relational or not. You may be interested in the work that others have already done in this area, as much information is often available free, in order to avoid re-inventing the wheel, especially if your project does not have qualified staff. Here is a simplified version (they are happy for me to publish a generic version but not the full customer-specific version) of a recent project I did for a local college; I wrote the RDB, they wrote the app.
Simplified College Data Model
Readers who are not familiar with the Relational Modelling Standard may find IDEF1X Notation useful.
Response to Comments
To be clear then. First a definition.
a Relational Database is, in chronological order, in the context of the last few days of 2010, with over 25 years of commonly available true relational technology [over 35 years of hard-to-use relational technology], for which there are many applicable Standards, and using such definitions (Wikipedia being unfit to provide said definitions, due to the lack of technical qualification in the contributors):
adheres to the Relational Model as a principle
Normalised to at least Third Normal Form (you need 5NF to be completely free of data duplication and Update Anomalies)
complies with the various existing Standards (as applicable to each particular area)
modelled by a qualified and capable modeller
is implemented in ISO/IEC/ANSI Standard SQL (that's the Declarative Referential Integrity ala Foreign Key definitions; Rule and Check constraints; Domains; Datatypes)
is Open Architecture (to be used by any application)
treated as a corporate asset of substantial value
and therefore reasonably secured: against unauthorised access; for data and referential integrity; against uncontrolled change (unplanned changes affecting other users, etc.).
Without that, you cannot enjoy the power, performance, ease of change, and ease of use, of a Relational Database.
What it is not, is the content of an RDBMS platform. Pouring unstructured or un-organised data into a container labelled "Relational Database Engine" does not magically transform the content into the label of the container.
Therefore, if it is reasonably (not perfectly, not 100% Standard-compliant) a Relational Database, the BO Universe is definitely not required to access it and use it to its full capability (limited only by the functions of the report tool).
If it has no DRI (FK definitions), no older-style "defined keys", no naming conventions (from which "relations" can be derived), and no matching datatypes, then no report tool (or human being) will be able to find anything.
It is not just the FK definitions.
Depending on exactly which bits of a Relational Database have been implemented in the data heap, and on the capability of the report tool (how much the licence costs), some capability somewhere between the two ends of the spectrum is possible. BO without the Universe is the best of breed among report tools; their Crystal Reports product is about half the grunt. The Universe is required to provide the database definitions for the non-database.
Then there is the duplication issue. Imagine how an user is going to feel when they find out that the data that they finally got through to, after 3 months, turns out to be a duplicate that no one keeps up-to-date.
"Database" Object Definition
If you have unqualified developers or end users implementing "tables" in the "database", then there is no limit to the obstacles and contradictions they place on themselves. ("Here, I've got an RDBMS but the content isn't; I've got BO but it can't; I've got encryption but I've copied the payroll data to five places, so that people can get at it when they forget their encryption key".) Every time I think I have seen the limit of insanity, someone posts a question on SO, and teaches me again that there is no limit to insanity.
Is BO, via an ODBC connection, capable of doing a JOIN (cross reference) without a Universe, as long as the correct FKs are defined?
(ODBC has nothing to do with it; it will operate the same via a native connection or via a browser.)
For that one item, re FKs defined correctly: yes. But the purpose of my long response is to identify that there are many other factors.
It isn't a BO or BO Universe question; it is "just how insane are the users' definitions and duplication". FKs could work sometimes and not others; they could work today and not tomorrow.