I am using DPDK 18.02.2 in my application, where we have created a hash table with the following parameters:
/* DPDK hash table configuration parameters */
struct rte_hash_parameters flowHashParams = {
    .name = "Hash Table for FLOW",
    .entries = 128000,
    .key_len = sizeof(ipv4_flow_key_t),
    .hash_func = ipv4_flow_hash_crc,
    .hash_func_init_val = 0,
    .extra_flag = RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD, /* FLAG - multiple threads can use this hash table */
};
Adding and deleting entries in the hash table works fine in DPDK 18.02.2. After moving to DPDK 19.11.13 (the latest stable version), we are facing a crash in rte_hash_del_key(). If we remove the RTE_HASH_EXTRA_FLAGS_MULTI_WRITER_ADD extra_flag from the hash table parameters, we do not see any crash in rte_hash_del_key().
In my application, one thread adds entries to the hash table, and another thread reads that information and deletes the entries. How can we add multi-writer support to the hash table? Is there any alternative way to enable that support?
I have a DynamoDB table with userID as the partition key and no sort key.
The table also has a timestamp attribute in each item. I want to retrieve all items having a timestamp in a specified range (regardless of userID, i.e. ranging across all partitions).
After reading the docs and searching Stack Overflow (here), I found that I need to create a GSI for my table.
Hence, I created a GSI with the following keys:
Partition Key: userID
Sort Key: timestamp
I am querying the index with Java SDK using the following code:
String lastWeekDateString = getLastWeekDateString();
AmazonDynamoDB client = AmazonDynamoDBClientBuilder.standard().build();
DynamoDB dynamoDB = new DynamoDB(client);
Table table = dynamoDB.getTable("user table");
Index index = table.getIndex("userID-timestamp-index");
QuerySpec querySpec = new QuerySpec()
    .withKeyConditionExpression("timestamp > :v_timestampLowerBound")
    .withValueMap(new ValueMap()
        .withString(":v_timestampLowerBound", lastWeekDateString));
ItemCollection<QueryOutcome> items = index.query(querySpec);
Iterator<Item> iter = items.iterator();
while (iter.hasNext()) {
    Item item = iter.next();
    // extract item attributes here
}
I am getting the following error on executing this code:
Query condition missed key schema element: userID
From what I know, I should be able to query the GSI using only the sort key without giving any condition on the partition key. Please help me understand what is wrong with my implementation. Thanks.
Edit: After reading the thread here, it turns out that we cannot query a GSI with only a range on the sort key. So, what is the alternative, if any, to query the entire table by a range query on an attribute? One suggestion I found in that thread was to use year as the partition key. This will require multiple queries if the desired range spans multiple years. Also, this does not distribute the data uniformly across all partitions, since only the partition corresponding to the current year will be used for insertions for one full year. Please suggest any alternatives.
When using the DynamoDB Query operation, you must specify at least the partition key. This is why you get the error that userID is required. From the AWS Query docs:
The condition must perform an equality test on a single partition key value.
The only way to get items without the partition key is a Scan operation (but this won't be sorted by your sort key!).
If you want to get all the items sorted, you have to create a GSI with a partition key that is the same for all the items you need (e.g. add a new attribute to all items, such as "type": "item"). You can then query the GSI and specify #type = :item:
QuerySpec querySpec = new QuerySpec()
    .withKeyConditionExpression("#type = :item AND #ts > :v_timestampLowerBound")
    .withNameMap(new NameMap()
        .with("#type", "type")
        .with("#ts", "timestamp"))   // "timestamp" is a DynamoDB reserved word
    .withValueMap(new ValueMap()
        .withString(":v_timestampLowerBound", lastWeekDateString)
        .withString(":item", "item"));
A good solution for any customised querying requirement in DynamoDB is to design the right primary key scheme for the GSI.
When designing a DynamoDB primary key, the main principle is that the hash key should partition the items and the sort key should sort the items within a partition.
Having said that, I recommend using the year of the timestamp as the hash key and the month-date as the sort key.
In this case you need at most two queries.
You are right that you should avoid filtering or scanning as much as you can.
For example, if the start date and the end date fall in the same year, you need only one query:
.withKeyConditionExpression("#year = :year and #month-date > :start-month-date and #month-date < :end-month-date")
Otherwise, if the range spans two years, you need two queries:
.withKeyConditionExpression("#year = :start-year and #month-date > :start-month-date")
and
.withKeyConditionExpression("#year = :end-year and #month-date < :end-month-date")
Finally, union the result sets from the two queries.
This consumes at most two read capacity units.
For easier sort key comparisons, you might want to use a UNIX timestamp instead.
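Here is a rough sketch of the two-query case, written with the same document API as the question. The index name year-monthdate-index and the attribute names year and monthdate are assumptions; note that year is a DynamoDB reserved word, hence the expression attribute names.
// Classes from com.amazonaws.services.dynamodbv2.document, .document.spec and .document.utils
Index index = table.getIndex("year-monthdate-index");   // hypothetical GSI

// Example range: 2017-12-25 .. 2018-01-05 spans two years, so two queries
QuerySpec startYearSpec = new QuerySpec()
    .withKeyConditionExpression("#y = :startYear AND #md >= :startMonthDate")
    .withNameMap(new NameMap().with("#y", "year").with("#md", "monthdate"))
    .withValueMap(new ValueMap()
        .withInt(":startYear", 2017)
        .withString(":startMonthDate", "12-25"));

QuerySpec endYearSpec = new QuerySpec()
    .withKeyConditionExpression("#y = :endYear AND #md <= :endMonthDate")
    .withNameMap(new NameMap().with("#y", "year").with("#md", "monthdate"))
    .withValueMap(new ValueMap()
        .withInt(":endYear", 2018)
        .withString(":endMonthDate", "01-05"));

// Union of the two result sets
List<Item> results = new ArrayList<>();
for (Item item : index.query(startYearSpec)) {
    results.add(item);
}
for (Item item : index.query(endYearSpec)) {
    results.add(item);
}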
Thanks
I noticed that a DynamoDB query/scan only returns documents that contain a subset of the attributes; it appears to be just the key columns.
This means I need to do a separate BatchGetItem to fetch the actual documents referenced by those keys.
I am not using a projection expression, and according to the documentation this means the whole item should be returned.
How do I get query to return the entire document so I don't have to do a separate batch get?
One example bit of code that shows this is below. It prints out found documents, yet they contain only the primary key, the secondary key, and the sort key.
import json

import boto3

db = boto3.resource('dynamodb')

t1 = db.Table(tname)
q = {
    'IndexName': 'mysGSI',
    'KeyConditionExpression': "secKey = :val1 AND "
                              "begins_with(sortKey, :status)",
    'ExpressionAttributeValues': {
        ":val1": 'XXX',
        ":status": 'active-',
    }
}
res = t1.query(**q)
for doc in res['Items']:
    print(json.dumps(doc))
This situation is discussed in the documentation for the Select parameter. You have to read quite a lot to find this, which is not ideal.
If you query or scan a global secondary index, you can only request
attributes that are projected into the index. Global secondary index
queries cannot fetch attributes from the parent table.
Basically:
If you query the parent table then you get all attributes by default.
If you query an LSI then you get all attributes by default - they're retrieved from the projection in the LSI if all attributes are projected into the index (so that costs nothing extra) or from the base table otherwise (which will cost you more reads).
If you query or scan a GSI, you can only request attributes that are projected into the index. GSI queries cannot fetch attributes from the parent table.
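If the full items are needed directly from the index, the index itself has to project them. A GSI's projection cannot be changed in place, so the index would have to be created (or dropped and recreated) with projection type ALL. A minimal sketch of such a definition, using the AWS SDK for Java v1 model classes and the key names from the question:
// Classes from com.amazonaws.services.dynamodbv2.model
GlobalSecondaryIndex mysGsi = new GlobalSecondaryIndex()
    .withIndexName("mysGSI")
    .withKeySchema(
        new KeySchemaElement("secKey", KeyType.HASH),     // GSI partition key
        new KeySchemaElement("sortKey", KeyType.RANGE))   // GSI sort key
    .withProjection(new Projection().withProjectionType(ProjectionType.ALL))
    .withProvisionedThroughput(new ProvisionedThroughput(5L, 5L));
Projecting ALL costs more storage and write capacity, so the alternative is to keep a keys-only projection and follow up with BatchGetItem against the base table, as the question already does.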
I configured my Firebird database to autoincrement the primary key of the table.
CREATE GENERATOR GEN_CHANNEL_PARAMETER_SET_ID;
SET GENERATOR GEN_CHANNEL_PARAMETER_SET_ID TO 0;
CREATE TRIGGER CHANNEL_PARAMETER_SETS_BI FOR CHANNEL_PARAMETER_SETS
ACTIVE BEFORE INSERT POSITION 0
AS
BEGIN
if (NEW.CHANNEL_PARAMETER_SET_ID is NULL) then NEW.CHANNEL_PARAMETER_SET_ID = GEN_ID(GEN_CHANNEL_PARAMETER_SET_ID, 1);
END
Now, in my C++ program using IBPP I have the following problem:
When inserting a new row into this table, I know all the values in my C++ program except the new primary key, because the database creates it. How can I retrieve this key from the database?
Maybe someone else inserted an entry too, just a moment after I inserted mine, so retrieving the PK with the highest value could return the wrong key. How can I handle this?
Adopting Amir Rahimi Farahani's answer I found the following solution for my problem:
I use a generator:
CREATE GENERATOR GEN_CHANNEL_PARAMETER_SET_ID;
SET GENERATOR GEN_CHANNEL_PARAMETER_SET_ID TO 0;
and the following C++/IBPP/SQL code:
// SQL statement
m_DbStatement->Execute(
"SELECT NEXT VALUE FOR gen_channel_parameter_set_id FROM rdb$database"
);
// Retrieve Data
IBPP::Row ibppRow;
int64_t channelParameterSetId;
m_DbStatement->Fetch(ibppRow);
ibppRow->Get (1, channelParameterSetId);
// SQL statement
m_DbStatement->Prepare(
"INSERT INTO channel_parameter_sets "
"(channel_parameter_set_id, ...) "
"VALUES (?, ...) "
);
// Set variables
m_DbStatement->Set (1, channelParameterSetId);
...
...
// Execute
m_DbStatement->Execute ();
m_DbTransaction->CommitRetain ();
It is possible to generate and use the new id before inserting the new record:
SELECT NEXT VALUE FOR GEN_CHANNEL_PARAMETER_SET_ID FROM rdb$database
You now know the value for the new primary key.
Update:
IBPP supports RETURNING too:
// SQL statement
m_DbStatement->Prepare(
"INSERT INTO channel_parameter_sets "
"(...) VALUES (...) RETURNING channel_parameter_set_id"
);
// Execute
m_DbStatement->Execute ();
m_DbTransaction->CommitRetain ();
// Get the generated id
m_DbStatement->Get (1, channelParameterSetId);
...
To retrieve the value of the generated key (or any other column) you can use INSERT ... RETURNING ....
For example:
INSERT INTO myTable (x, y, z) VALUES (1, 2, 3) RETURNING ID
Also, a lot of drivers provide extra features to support RETURNING, but I am not familiar with IBPP.
Note that from the perspective of a driver the use of RETURNING will make the insert act like an executable stored procedure; some drivers might require you to execute it in a specific way.
I have two SQL Server tables which I would like to sync to a read SQL database. I want to flatten the data from the source database into one table in the read database using the Sync Framework. Can I do this?
The schemas in the source and target need to match. You could add a view that joins the two source tables within the source database and presents the data in the same format that your 'read' database expects.
If you're using the older providers (the same ones used by the VS Local Database Cache project item), you can use a view on the server side; however, your client can only be SQL CE. Even that is tricky: what constitutes a changed row if a change can occur in two source tables, e.g. if table 1 is updated and table 2 is not, or vice versa?
The newer SqlSyncProvider doesn't support views, as its change tracking is based on triggers and the entire provisioning works against tables.
@Scott, schemas or table structures don't need to match.
I'm trying to do the same, I suppose.
This is my question on stackoverflow:
Merging 2 tables in a single table with different schema
I worked on this problem for some time and reached some results...
For now, I'm working on the case in which changes are only tracked in the PERSON table (so if something changes in ADDRESS, the changes are not synchronized). But I suppose the code can be improved to track changes in ADDRESS too... And for now I'm not taking into consideration changes in the destination db (in the CUSTOMER table). That will be more difficult to code, I suppose...
Anyway, my solution adds a handler to ChangesSelected; there I alter the DataTable, adding the columns I need (Address and City). I get the Address and the City with a SQL SELECT and update the rows... This works for updated and inserted rows...
The problem arises with deleted rows. In my CUSTOMER db, the primary key must be Id-Address, not only Id (otherwise I can't have multiple ADDRESS rows for each PERSON). So, when Sync Framework tries to perform a deletion, the keys don't match and the deletion doesn't affect any row... I don't know how to alter a DataRow whose state is Deleted, and I also can't get the Address from the db... So I can't have the Id-Address information in the deleted DataRow...
For now, I can only perform a SQL DELETE using the Id (the only information available for a deleted row)...
Please try to improve the code and post back, so we could help each other!
This is the code. First the AddHandler, then the body of the handler.
AddHandler remoteProvider.ChangesSelected, AddressOf remoteProvider_ChangesSelected
...
Private Shared Sub remoteProvider_ChangesSelected(ByVal sender As Object, ByVal e As DbChangesSelectedEventArgs)
    If (e.Context.DataSet.Tables.Contains("PersonGlobal")) Then
        Dim person = e.Context.DataSet.Tables("PersonGlobal")

        Dim AddressColumn As New DataColumn("Address")
        AddressColumn.DataType = GetType(String)
        AddressColumn.MaxLength = 10
        AddressColumn.AllowDBNull = False
        'NULL is not allowed, so set a default value
        AddressColumn.DefaultValue = "Nessuna"

        Dim CityColumn As New DataColumn("City")
        CityColumn.DataType = GetType(String)
        CityColumn.AllowDBNull = False
        CityColumn.DefaultValue = 0

        person.Columns.Add(AddressColumn)
        person.Columns.Add(CityColumn)

        Dim newPerson = person.Clone()

        For i = 0 To person.Rows.Count - 1 Step 1
            Dim row = person.Rows(i)
            If (row.RowState <> DataRowState.Deleted) Then
                Dim query As String = "SELECT * FROM dbo.address WHERE Id = " & row("AddressId")
                Dim sqlCommand As New SqlCommand(query, serverConn)
                serverConn.Open()
                Dim reader As SqlDataReader = sqlCommand.ExecuteReader()
                Try
                    While reader.Read()
                        row("Address") = CType(reader("Address"), String)
                        row("City") = CType(reader("City"), String)
                        ' Only by importing the row do I keep the RowState values
                        newPerson.ImportRow(row)
                    End While
                Finally
                    reader.Close()
                End Try
                serverConn.Close()
            Else
                ' TODO - Deletion does not work!!!
                ' The deletion looks for the primary key on customer, which is Id-Address.
                ' We have the correct Id, but "Nessuna" as the address...
                ' We need to recover the right address...
                Dim query As String = "DELETE FROM dbo.customer WHERE Id = " & row("Id", DataRowVersion.Original)
                Dim sqlCommand As New SqlCommand(query, clientConn)
                clientConn.Open()
                sqlCommand.ExecuteNonQuery()
                clientConn.Close()
            End If
        Next

        newPerson.Columns.Remove(newPerson.Columns("AddressId"))
        e.Context.DataSet.Tables.Remove(person)
        e.Context.DataSet.Tables.Add(newPerson)
    End If
End Sub
I have a table with the following fields:
id VARCHAR(32) PRIMARY KEY,
parent VARCHAR(32),
name VARCHAR(32)
parent is a foreign key referencing the same table. This structure forms a tree that is supposed to replicate a filesystem tree. The problem is that looking up an id from a path is slow, so I want to build an index. What is the best way of doing this?
Example Data:
id   parent   name
---  -------  ----
1    NULL     root
2    1        foo
3    1        bar
4    3        baz
5    4        aii
Would Index To:
id   parent   name
---  -------  ------------------
1    NULL     root
2    1        root/foo
3    1        root/bar
4    3        root/bar/baz
5    4        root/bar/baz/aii
I am currently thinking about using a temporary table and manually running a series of INSERT ... SELECT statements in the code to build the index. (The reason I make it temporary is that if this db is accessed from a Windows system the paths need backslashes, whereas from *nix they need forward slashes.) Is there an alternative to this?
So, you have a function that does something like this (pseudocode):
int getId(char **path) {
    int id = 0;
    int i;
    char query[1000];
    for (i = 0; path[i] != NULL; i++) {
        sprintf(query, "select id from table where parent = %d and name = '%s'", id, path[i]);
        id = QuerySimple(query);  /* pseudocode helper: returns the id, or a negative value if not found */
        if (id < 0)
            break;
    }
    return id;
}
Looking at the query, you need a (non-unique) index on the columns (parent, name), but maybe you already have it.
A (temporary) table can be used like you said. Note that you can change the path separator in your program, avoiding the need for different tables for Windows and Unix. You also need to keep the additional table in sync with the master. If updates/deletes/inserts are rare, then instead of a table you can simply keep an in-memory cache of already looked-up data and clear the cache when an update happens (you can also do partial deletes on the cache if you want); a sketch follows below. In that case you can also read more data (e.g. given a parent, read all of its children) to fill up the cache faster. At the extreme, you can read the entire table into memory and work there! It depends on how much data you have, your typical access patterns (how many reads, how many writes), and the deployment environment (do you have RAM to spare?).
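A minimal sketch of that cache idea, written in Java only for illustration; PathIdCache and loadFromDb are made-up names, and the real lookup would be the SELECT from the pseudocode above:
import java.util.HashMap;
import java.util.Map;

// Caches (parent id, name) -> id lookups; cleared whenever the tree is modified.
class PathIdCache {
    private final Map<String, Integer> cache = new HashMap<>();

    Integer getId(int parentId, String name) {
        String key = parentId + "/" + name;
        Integer id = cache.get(key);
        if (id == null) {
            // stands in for: SELECT id FROM table WHERE parent = ? AND name = ?
            id = loadFromDb(parentId, name);
            if (id != null) {
                cache.put(key, id);
            }
        }
        return id;
    }

    // Call after any insert/update/delete that touches the tree.
    void invalidateAll() {
        cache.clear();
    }

    private Integer loadFromDb(int parentId, String name) {
        return null; // placeholder for the real database lookup
    }
}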