Returning objects that work with a parent object's data and later invalidating them - C++

I'm working on an implementation of tables in a Qt app, but having a bit of trouble getting the design right.
I have two classes, Table and Cell.
Cell has API for setting cell properties such as borders and padding, and for getting the row and column of the cell via int Cell::row() and int Cell::column(). It is an explicitly shared class, using QExplicitlySharedDataPointer for its data. It also has an isValid() API to query whether the cell is valid.
Table has API for inserting/removing rows and columns and merging areas of cells. A Cell may be retrieved from a table using Table::cellAt(int row, int column). Rows of cells are kept as a QList<QList<Cell>>. When rows and columns are removed, the removed cells are marked as invalid by the table, which makes calls to Cell::isValid on any previously returned cells from the removed rows/columns return false.
Now to the tricky part: since calculating the row and column number of a cell is an expensive operation if you haven't already got them, the Table::cellAt(int row, int column) method sets the row/column explicitly on the Cell before returning it, and the Cell keeps them as simple int members. This means a Cell can reply quickly when queried for its row/column.
But here comes the problem: this also means that the values of Cell::row() and Cell::column() will be incorrect if rows or columns are removed or inserted before the row/column that the cell is in.
I can't mark the affected cells as invalid the same way I do when the actual row/column they belong to is removed, since someone might later retrieve a cell at that row/column again with cellAt(int, int), and that cell should not be invalid.
Does anyone have advice on a better design here?

You could do a lazy update. That is, instead of updating the cell's position information every time the table changes, only update it when Cell::row() or Cell::column() is called, and only if the table has changed since the last time the cell's position was updated. That would require keeping a version stamp or something similar in both Table and Cell. Every time the table gets updated, bump the version. Cell::row() and Cell::column() would first check whether the Cell's version is older than the Table's, and if so, recalculate the position.
Whether that extra work is worth it versus just always recalculating position or recalculating on every change, I can't say.
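A minimal sketch of this scheme, using plain standard containers in place of the Qt types (all names here, such as TableState and lookups, are invented for illustration):

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Sketch of the version-stamp idea; not the real Table/Cell API.
struct TableState {
    int version = 0;   // bumped on every row/column insert or removal
    int lookups = 0;   // counts how often the expensive path runs
};

class Cell {
public:
    Cell(std::shared_ptr<TableState> t, int r, int c)
        : table(std::move(t)), cachedRow(r), cachedCol(c),
          seenVersion(table->version) {}

    int row() { refreshIfStale(); return cachedRow; }
    int column() { refreshIfStale(); return cachedCol; }

private:
    void refreshIfStale() {
        if (seenVersion != table->version) {
            ++table->lookups;  // stand-in for the expensive position scan
            seenVersion = table->version;
        }
    }
    std::shared_ptr<TableState> table;
    int cachedRow, cachedCol;
    int seenVersion;
};
```

The appeal is that a table mutation only bumps one integer; no Cell is touched at mutation time, and a cell pays for a recalculation at most once per table change, and only if it is actually queried.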

I discussed this problem with a friend because it seemed too much like a Programming Pearls exercise. This is what we came up with:
Create a new structure for tracking indices. It could be as simple as struct Index { int n; };.
Store the row and column indices in two QList<Index*> lists; call each of these a Dimension. Using pointers is crucial, as you'll see below.
Cells no longer store their row and column values. Instead, each Cell points to one element in each of the two Dimensions. When queried for its row or column, there is now an extra pointer dereference, which should not be too expensive.
When rows or columns are added to or removed from the table, you add items to or remove items from the corresponding Dimension and update the n value of the items that follow. Storing pointers is necessary because QList copies its values.
Instead of renumbering every Cell affected by a table manipulation, an operation that costs O(rows × columns), you now update only the affected Dimension entries, which costs O(rows + columns). The downside is added code complexity and two dereferenced accesses when a Cell is queried for its row or column.
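Here is a rough sketch of that structure, with std::vector and std::shared_ptr standing in for QList<Index*> (the names are illustrative, not a real API):

```cpp
#include <cassert>
#include <memory>
#include <vector>

// One shared, renumberable index entry per row (or column).
struct Index { int n; };
using Dimension = std::vector<std::shared_ptr<Index>>;

// Build a dimension of `count` consecutive indices.
Dimension makeDimension(int count) {
    Dimension d;
    for (int i = 0; i < count; ++i)
        d.push_back(std::make_shared<Index>(Index{i}));
    return d;
}

// Remove the entry at `pos` and renumber the following ones:
// O(n) over one dimension instead of O(rows * columns) over all cells.
void removeAt(Dimension& d, int pos) {
    d.erase(d.begin() + pos);
    for (std::size_t i = pos; i < d.size(); ++i)
        d[i]->n = static_cast<int>(i);
}

// A cell only points at its two Index entries; row()/column() are a
// pointer dereference and stay correct after insertions/removals.
struct Cell {
    std::shared_ptr<Index> rowIdx, colIdx;
    int row() const { return rowIdx->n; }
    int column() const { return colIdx->n; }
};
```

Removing a row above a cell automatically shifts the cell's reported row, because the cell shares the Index object that the Dimension renumbered.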


how to deal with virtual index in a database table in Django + PostgreSQL

Here is my current scenario:
Need to add a new field to an existing table that will be used for ordering QuerySet.
This field will be an integer between 1 and some not-very-high number; I expect fewer than 1000. The whole reasoning behind this field is to use it for visual ordering on the front-end: index 1 would be the first element returned, index 2 the second, etc...
This is how the field is defined in model:
priority = models.PositiveSmallIntegerField(
    verbose_name=_(u'Priority'),
    default=0,
    null=True)
I will need to re-arrange (reorder) the whole set of elements in this table whenever a new or existing element gets this field updated. So for instance, imagine I have 3 objects in this table:
Element A - priority 1
Element B - priority 2
Element C - priority 3
If I change Element C's priority to 1, I should have:
Element C - priority 1
Element A - priority 2
Element B - priority 3
Since this is not a real db index (and can have empty values), I'll have to query for all elements in the database each time an element is created or updated, and change the priority value of each record in the table. I'm not really worried about performance, since the table will always be small, but I am worried that this approach is not the way to go, or that it simply generates too much overhead.
Maybe there is a simpler way to do this with plain SQL? If I use a unique index, though, I will get an error every time an existing priority is reused, something I don't want either.
Any pointers?
To insert at the 10th position, all you need is a single SQL query:
MyModel.objects.filter(priority__gte=10).update(priority=models.F('priority')+1)
Then you would need a similar one for deleting an element, and for swapping two elements (or whatever your use case requires). It should all be doable in a similar manner with bulk update queries; there is no need to update entries one by one.
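The shifting logic behind those bulk updates is simple to sketch; here it is over a plain in-memory array of priorities (C++ used purely for illustration, with invented function names):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Inserting at `pos` bumps every existing priority >= pos by one,
// mirroring filter(priority__gte=pos).update(priority=F('priority')+1).
void insertAtPriority(std::vector<int>& priorities, int pos) {
    for (int& p : priorities)
        if (p >= pos) ++p;
    priorities.push_back(pos);  // the new element takes priority `pos`
}

// Deleting the element holding priority `pos` closes the gap it leaves.
void removeAtPriority(std::vector<int>& priorities, int pos) {
    priorities.erase(std::find(priorities.begin(), priorities.end(), pos));
    for (int& p : priorities)
        if (p > pos) --p;
}
```

Each operation touches only the rows at or after the affected position, which is exactly what the single bulk UPDATE does server-side.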
First, you can very well index this column; just don't enforce uniqueness on it. Such standard indexes can contain nulls and duplicates; they are simply used to locate the rows matching a criterion.
Second, whether updating each populated* row on every insert/update is acceptable depends on the expected update frequency. If each user inserts several records each time they use the system and you have thousands of concurrent users, it might not be a good idea, whereas if a single user updates a number of rows once in a while, it is not much of an issue. In the same vein, consider whether other updates are hitting the same rows: you don't want to lock all rows too often if they are also being updated/locked for other fields.
*: to be accurate, you wouldn't update all populated rows, only the ones with a priority greater than or equal to the inserted one (inserting at priority 999 would only shift the items at 999 and 1000).

mysql++ (mysqlpp): how to get number of rows in result prior to iteration using fetch_row through UseQueryResult

Is there an API call provided by mysql++ to get the number of rows returned by the result?
I have code structured as follows:
// ...
mysqlpp::Query query = conn.query(queryString);
if (mysqlpp::UseQueryResult res = query.use()) {
    // some code
    while (mysqlpp::Row row = res.fetch_row()) {
        // process each row
    }
}
My previous question here would be solved easily if there were a function that returns the number of rows in the result: I could use it to allocate memory of that size and fill it in as I iterate row by row.
In case anyone runs into this:
I quote the user manual:
The most direct way to retrieve a result set is to use Query::store(). This returns a StoreQueryResult object,
which derives from std::vector, making it a random-access container of Rows. In turn,
each Row object is like a std::vector of String objects, one for each field in the result set. Therefore, you can
treat StoreQueryResult as a two-dimensional array: you can get the 5th field on the 2nd row by simply saying
result[1][4]. You can also access row elements by field name, like this: result[2]["price"].
and:
A less direct way of working with query results is to use Query::use(), which returns a UseQueryResult object.
This class acts like an STL input iterator rather than a std::vector: you walk through your result set processing
one row at a time, always going forward. You can’t seek around in the result set, and you can’t know how many
results are in the set until you find the end. In payment for that inconvenience, you get better memory efficiency,
because the entire result set doesn’t need to be stored in RAM. This is very useful when you need large result sets.
A suggestion found here: http://lists.mysql.com/plusplus/9047
is to run a COUNT(*) query first, fetch that result, and then call Query::use() for the actual query. To avoid an inconsistent count, wrap the two queries in one transaction as follows:
START TRANSACTION;
SELECT COUNT(*) FROM myTable;
SELECT * FROM myTable;
COMMIT;
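If the count is only needed to size a buffer, one more option is to skip COUNT(*) entirely and let the buffer grow while you iterate. A sketch of that, with RowSource standing in for UseQueryResult and its fetch_row() (both names here are placeholders, not mysql++ types):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for a forward-only result set like UseQueryResult.
struct RowSource {
    std::vector<std::string> rows;  // pretend result set
    std::size_t next = 0;
    const std::string* fetchRow() {
        return next < rows.size() ? &rows[next++] : nullptr;
    }
};

// Collect rows as they stream in; std::vector grows in amortized
// O(1), so no up-front row count is needed.
std::vector<std::string> collectRows(RowSource& src) {
    std::vector<std::string> out;
    while (const std::string* row = src.fetchRow())
        out.push_back(*row);
    return out;
}
```

This keeps the memory-efficiency of use() for huge result sets off the table, of course; it only helps when the rows fit in RAM anyway and the count was merely for allocation.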

How does HBase delete a row while it is still in memstore?

Assume I just added a row to a table and the row is still in the memstore, and at that point I delete it. What happens on deletion? I am not sure my understanding is right: a marker is added for that row in the memstore, and when the memstore is flushed, both the row and the marker are written to the HFile. But if that is the case, why isn't the row simply removed from the memstore?
The way HBase works is that every change is a new "insert". This lets HBase work in an efficient manner. You should also keep in mind that in many cases HBase is configured to keep x versions of each value, so a situation where the row exists only in the memstore and only one version of it should be kept is a very specific edge case. It is better for the system to work in a single, predictable, tested way than to special-case it.
It is also possible that a row with the same rowkey already exists in a store file on disk, in which case another add with the same rowkey is an update of the existing row, resulting in a new version of that row.
Flushing the newly added row together with its delete marker from the memstore allows a major compaction to remove all old versions of that row up to the version covered by the delete marker.

MFC CListCtrl updating text of any cell

This question is about understanding how to update any row programmatically.
Details.
I have a list control that accepts data either from a file or from the edit controls in the dialog. When items are added I know the position at which I added them, so I can change their subitem texts. I have also implemented sort functionality in the list, so the positions keep changing. I have an identifier column in each row so that I can recognize the row.
Now, if an outside event requires changing another column's value for a row whose ID I know, I first have to find the item's position by comparing against the ID column, and only then can I set the subitem text at that position.
This works fine, except that it takes time to find the row before the column can be updated.
Now, in order to get the row directly, I need some help.
I have gone through
http://msdn.microsoft.com/en-us/library/windows/desktop/hh298346(v=vs.85).aspx
but it does not use MFC. Please help me achieve this.
If you have many items you should consider switching to a virtual list; it is the fastest way to access the data. If you don't want to invest time in that, the easiest approach is the following:
When you populate the CListCtrl, store the ID of each item in its item data using SetItemData(). The ID will then stay associated with the item, even after re-sorting.
When you need to locate the required item, scan all items, but use GetItemData() instead of GetItemText(). Comparing the stored integers is much faster than fetching and comparing strings.
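The scan can be sketched with a plain array standing in for the list control: itemData[i] plays the role of GetItemData(i), the per-item value stored earlier with SetItemData() (findRowById is an invented name):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Find the row whose item data equals `id`. Only integer comparisons
// are involved, which is why this beats a GetItemText() scan.
int findRowById(const std::vector<std::uintptr_t>& itemData,
                std::uintptr_t id) {
    for (std::size_t i = 0; i < itemData.size(); ++i)
        if (itemData[i] == id)
            return static_cast<int>(i);
    return -1;  // not found, mirroring the usual CListCtrl convention
}
```

For very large lists one could additionally maintain a std::unordered_map from ID to row, rebuilt after each sort, to make the lookup O(1); the virtual-list route mentioned above avoids the problem entirely.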

Disable QSql(Relational)TableModel's prefetch/caching behaviour

For some (well, performance) reason, Qt's "model" classes only fetch 256 rows from the database, so if you want to append a row to the end of the recordset, you apparently must do something along the lines of:
while (model->canFetchMore())
    model->fetchMore();
This does work, and when you call model->insertRow(model->rowCount()) afterwards, the row is indeed appended after the last row of the recordset.
There are various other problems related to this behaviour: for example, when you insert or remove rows, the view rendering the model is redrawn with only 256 rows showing, and you must manually make sure the missing rows are fetched again.
Is there a way to bypass this behaviour altogether? My model is very unlikely to display more than, say, 1000 rows, but getting it to retrieve those 1000 rows seems to be a royal pain. I understand that this is a great performance optimization when dealing with larger recordsets, but for me it is a burden rather than a boon.
The model needs to be writable so I can't simply use QSqlQueryModel instead of QSqlRelationalTableModel.
From the QSqlTableModel documentation:
bool QSqlTableModel::insertRecord ( int row, const QSqlRecord & record )
Inserts the record after row. If row is negative, the record will be appended to the end.
Calls insertRows() and setRecord() internally.
Returns true if the row could be inserted, otherwise false.
See also insertRows() and removeRows().
I've not tried it yet, but I think it's not necessary to fetch the complete dataset to insert a record at the end: passing a negative row should append it.