Architecture problem: access-queue block - c++

There's a resource manager class that mediates access to devices. Naturally, it must make sure never to give access to one device to two processes at the same time.
At first I thought I wouldn't need an access queue at all: there would be a method like anyFree_devicename() that returns an access handle if a device is free and NULL otherwise. But because of high contention on some devices, I ended up writing an accessQueue into every device.
Now, when you try to access a device, your pid (process id) is inserted into that device's accessQueue, and you can ask whether it is your turn using a special method.
But I found one problem: the access queues can block each other when a single command needs several devices:
Device1's queue: process 1, then process 2
Device2's queue: process 2, then process 1
And both of them are blocked.
inline bool API::Device::Device::ShallIUse(int pid)
{
    if (amIFirst(pid)) return true;   // if I'm first I can use it anyway
    std::stack<int> tempStorage;      // examined entries go acessQueue -> tempStorage
    while (acessQueue.front() != pid) // examine every process ahead of us
    {
        int frontPid = acessQueue.front();
        // take a pointer to that process so we can look into its current command
        API::ProcessManager::Process* proc =
            API::ProcessManager::TaskManager::me->giveProcess(frontPid);
        // the list of devices this process needs right now
        std::vector<API::Device::Device*>* dINeed = proc->topCommand()->devINeedPtr();
        // see whether that process is first in line on every device it needs
        for (std::size_t i = 0; i < dINeed->size(); i++)
        {
            if (!(*dINeed)[i]->amIFirst(frontPid))
            {
                // it is blocked too; put the examined entries back and give up
                // (note: push() appends at the back, so this reorders the queue)
                while (!tempStorage.empty())
                {
                    acessQueue.push(tempStorage.top());
                    tempStorage.pop();
                }
                return false;
            }
        }
        tempStorage.push(frontPid);
        acessQueue.pop();
    }
    return true;
}
I wrote this algorithm some time later, but:
It ruins the whole layer-based architecture.
It now seems to work incorrectly.
And it's crazy! We simply look through all the commands of nearly all processes and try to push some of them up the access queue. It works really slowly.

Your access queue is creating what is known as a deadlock: multiple clients become perpetually blocked because they are trying to take ownership of the same set of resources, but in a different order.
You can avoid it by assigning a unique value to each of your resources. Have the clients submit a list of desired resources to the resource manager. The resource manager's acquire method then sorts the list by resource number and attempts to allocate that set of resources in order.
This enforces a single global order for all acquisitions, so you can never deadlock.
Any given client will, of course, block until the whole set of resources it needs is available.
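As a minimal sketch of that idea, assuming each device exposes a std::mutex and a unique numeric id (both names are illustrative, not part of the original code):

#include <algorithm>
#include <mutex>
#include <vector>

struct Device {
    int id;         // unique, fixed ordering value assigned by the resource manager
    std::mutex mtx; // guards exclusive access to the device
};

// Acquire every requested device in ascending id order, so two commands
// that need overlapping device sets can never deadlock each other.
void acquireAll(std::vector<Device*>& devices) {
    std::sort(devices.begin(), devices.end(),
              [](const Device* a, const Device* b) { return a->id < b->id; });
    for (Device* d : devices)
        d->mtx.lock();
}

// Release in reverse order of acquisition.
void releaseAll(std::vector<Device*>& devices) {
    for (auto it = devices.rbegin(); it != devices.rend(); ++it)
        (*it)->mtx.unlock();
}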

Querying a growing data-set

We have a data set that grows while the application is processing the data set. After a long discussion we have come to the decision that we do not want blocking or asynchronous APIs at this time, and we will periodically query our data store.
We thought of two options to design an API for querying our storage:
A query method returns a snapshot of the data and a flag indicating whether we might have more data. When we finish iterating over the last returned snapshot, we query again to get another snapshot for the rest of the data.
A query method returns a "live" iterator over the data, and when this iterator advances it returns one of the following options: Data is available, No more data, Might have more data.
We are using C++ and we borrowed the .NET style enumerator API for reasons which are out of scope for this question. Here is some code to demonstrate the two options. Which option would you prefer?
/* ======== FIRST OPTION ============== */
// Similar to the familiar .NET enumerator.
class IFooEnumerator
{
    // true --> A data element may be accessed using the Current() method
    // false --> End of sequence. Calling Current() is an invalid operation.
    virtual bool MoveNext() = 0;
    virtual Foo Current() const = 0;
    virtual ~IFooEnumerator() {}
};

enum class Availability
{
    EndOfData,
    MightHaveMoreData,
};

class IDataProvider
{
    // Query params allow specifying the ID of the starting element. Here is the intended usage pattern:
    // 1. Call GetFoo() without specifying a starting point.
    // 2. Process all elements returned by IFooEnumerator until it ends.
    // 3. Check the availability.
    // 3.1 MightHaveMoreData --> Invoke GetFoo() again after some time, specifying the last processed
    //     element as the starting point, and repeat steps (2) and (3)
    // 3.2 EndOfData --> The data set will not grow any more and we know that we have finished processing.
    virtual std::tuple<std::unique_ptr<IFooEnumerator>, Availability> GetFoo(query-params) = 0;
};
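For comparison purposes, a hedged usage sketch of the first option (Process(), WaitABit() and the query parameters are placeholders, not part of the proposed API):

std::unique_ptr<IFooEnumerator> e;
Availability a;
std::tie(e, a) = provider.GetFoo(/* no starting point */);
for (;;)
{
    while (e->MoveNext())
        Process(e->Current()); // remember the ID of the last processed element
    if (a == Availability::EndOfData)
        break;                 // the data set is complete
    WaitABit();                // MightHaveMoreData: poll again later
    std::tie(e, a) = provider.GetFoo(/* start after the last processed ID */);
}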
/* ====== SECOND OPTION ====== */
enum class Availability
{
    HasData,
    MightHaveMoreData,
    EndOfData,
};

class IGrowingFooEnumerator
{
    // HasData:
    //     We may access the current data element by invoking Current()
    // EndOfData:
    //     The data set has finished growing and no more data elements will arrive later
    // MightHaveMoreData:
    //     The data set may still grow and we need to continue calling MoveNext() periodically
    //     (preferably after a short delay) until we get a "HasData" or "EndOfData" result.
    virtual Availability MoveNext() = 0;
    virtual Foo Current() const = 0;
    virtual ~IGrowingFooEnumerator() {}
};

class IDataProvider
{
    virtual std::unique_ptr<IGrowingFooEnumerator> GetFoo(query-params) = 0;
};
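And the corresponding hedged usage sketch of the second option (again, Process() and WaitABit() are placeholders):

auto e = provider.GetFoo(/* query params */);
for (;;)
{
    Availability a = e->MoveNext();
    if (a == Availability::HasData)   { Process(e->Current()); continue; }
    if (a == Availability::EndOfData) break;
    WaitABit(); // MightHaveMoreData: retry MoveNext() after a short delay
}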
Update
Given the current answers, I have some clarification. The debate is mainly over the interface - its expressiveness and intuitiveness in representing queries for a growing data-set that at some point in time will stop growing. The implementation of both interfaces is possible without race conditions (at least we believe so) because of the following properties:
The 1st option can be implemented correctly if the pair of the iterator + the flag represents a snapshot of the system at the time of querying. Getting snapshot semantics is a non-issue, as we use database transactions.
The 2nd option can be implemented given a correct implementation of the 1st option. The MoveNext() of the 2nd option will, internally, use something like the 1st option and re-issue the query if needed (see the sketch after this list).
The data-set can change from "Might have more data" to "End of data", but not vice versa. So if we wrongly return "Might have more data" because of a race condition, we just pay a small performance overhead because we need to query again, and the next time we will receive "End of data".
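For illustration, a minimal sketch of that adapter, under two assumptions that are mine rather than the question's: the two Availability enums are merged into the three-value one, and Foo carries an id usable as the next query's starting point:

class GrowingFooEnumerator : public IGrowingFooEnumerator
{
public:
    explicit GrowingFooEnumerator(IDataProvider& provider)
        : provider_(provider)
    {
        std::tie(inner_, flag_) = provider_.GetFoo(/* no starting point */);
    }

    Availability MoveNext() override
    {
        if (inner_->MoveNext())
        {
            current_ = inner_->Current();
            return Availability::HasData;
        }
        if (flag_ == Availability::EndOfData)
            return Availability::EndOfData;
        // Snapshot exhausted but the set may still grow: re-issue the query,
        // starting after the last element we handed out.
        // (Sketch simplification: assumes the very first snapshot was non-empty.)
        std::tie(inner_, flag_) = provider_.GetFoo(/* start after */ current_.id);
        if (inner_->MoveNext())
        {
            current_ = inner_->Current();
            return Availability::HasData;
        }
        return flag_; // EndOfData, or MightHaveMoreData (the caller retries later)
    }

    Foo Current() const override { return current_; }

private:
    IDataProvider& provider_;
    std::unique_ptr<IFooEnumerator> inner_;
    Availability flag_;
    Foo current_;
};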
"Invoke GetFoo() again after some time by specifying the last processed element as the starting point"
How are you planning to do that? If it's by using the earlier-returned IFooEnumerator, then functionally the two options are equivalent. Otherwise, letting the caller destroy the "enumerator" and then, however long afterwards, call GetFoo() to continue iteration means you lose the ability to monitor the client's ongoing interest in the query results. It may be that right now you have no need for that, but I think it's poor design to exclude the ability to track state throughout the overall result processing.
Whether the overall system will work at all really depends on many things (not going into details about your actual implementation):
No matter how you twist it, there will be a race condition between checking for "is there more data" and more data being added to the system. Which means that it's possibly pointless to try to capture the last few data items.
You probably need to limit the number of repeated runs of "is there more data", or you could end up in an endless loop of "new data came in while processing the last lot".
How easy it is to know whether data has been updated also matters: if all updates are new items with sequentially higher IDs, you can simply query "is there data above X", where X is your last ID. But if data may be updated anywhere in the database at any time (e.g. a database of where taxis currently are, updated via GPS every few seconds across thousands of cars), it may be hard to determine which entries have changed since you last read the database.
As to your implementation: in option 2, I'm not sure what you mean by the MightHaveMoreData state - either it has more data, or it hasn't, right? Repeated polling for more data is a bad design in this case, given that you will never be able to say with 100% certainty that no "new data" arrived in the time between fetching the last data and having it processed and acted on (displayed, used to buy shares on the stock market, stopped the train, or whatever it is you want to do once you have processed your new data).
A read-write lock could help: many readers get simultaneous access to the data set, while only one writer may modify it.
The idea is simple:
- when you need read-only access, a reader takes the read lock, which can be shared with other readers but is exclusive with writers;
- when you need write access, the writer takes the write lock, which is exclusive with both readers and writers.
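A minimal sketch using the standard library (C++17's std::shared_mutex; the stored data type is just a placeholder):

#include <shared_mutex>
#include <vector>

class DataSet {
public:
    std::size_t size() const {
        std::shared_lock<std::shared_mutex> lock(mtx_); // shared: many readers at once
        return items_.size();
    }
    void append(int item) {
        std::unique_lock<std::shared_mutex> lock(mtx_); // exclusive: one writer
        items_.push_back(item);
    }
private:
    mutable std::shared_mutex mtx_;
    std::vector<int> items_;
};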

thread safe class implementation for two servers

I have a class with two members:
class RequestController {
public:
    void SendRequest(); // called by multiple threads synchronously
private:
    server primary; // the primary server
    server backup;  // the backup server
};
My logic is simply this: in SendRequest(), I want to send the request to the primary server; if that fails, I want to send it to the backup server, and if the backup succeeds, I want to swap the primary and backup servers.
Here's the problem: when I do the swap, I have to lock both primary and backup (this is the part that multiple threads cannot do at the same time). In fact, I need to make sure that no thread is reading the primary server while I swap.
How do I write this piece of code efficiently? I don't want to lock the whole thing, since in most cases the primary server works and there's no need to lock.
I think this problem is generally language-independent; anyway, I tag it with C++.
Let's assume that the servers take some non-negligible amount of time to process a request. Then, if requests come fast enough, SendRequest will be called a second time while it is still waiting for one of the servers to process a previous request.
As a designer, you have two choices.
If it is OK for a server to process multiple requests simultaneously, then you do nothing.
If a server can only process a single request at a time, then you will need to perform some kind of synchronization on the code.
In case 2, since you already hold a lock on the servers, you can swap them with no ramifications.
For case 1, why not do the following:
std::mutex my_mutex;
...
// Select the server.
server* selected = nullptr;
{
    std::lock_guard<std::mutex> guard(my_mutex);
    selected = &primary;
}
// Let the selected server process the message.
bool success = selected->process();
// If the primary failed, see if we can try the backup.
if (!success) {
    {
        std::lock_guard<std::mutex> guard(my_mutex);
        if (selected == &primary) {
            selected = &backup;
        }
    }
    // Now try again.
    success = selected->process();
    // If the backup was used successfully, swap the primary and backup.
    if (success) {
        std::lock_guard<std::mutex> guard(my_mutex);
        if (selected == &backup) {
            std::swap(primary, backup);
        }
    }
}
But this could have some problems. Say, for example, that the primary fails on the first message but succeeds on the rest. If SendRequest() is called at about the same time by 3 different threads, you could see the following:
Thread 1 - sends with primary
Thread 2 - sends with primary
Thread 3 - sends with primary
Thread 1 - fails, sends with backup
Thread 2 - primary succeeds
Thread 1 - backup succeeds
Thread 1 - swaps primary and backup
Thread 3 - old primary (new backup) succeeds
Thread 3 - swaps primary and backup
If the messages keep coming fast enough, it is possible to remain in a state where you keep swapping primary and backup. The condition resolves the moment there are no pending messages; then primary and backup stay put until the next failure.
Perhaps a better way would be to never swap, but to use a better selection method instead. For example:
...
// Select the server.
server* selected = &primary;
if (!primary.last_message_successful) {
    // The most recent attempt made with the primary was a failure.
    if (backup.last_message_successful) {
        // The backup is thought to be functioning.
        selected = &backup;
    }
}
// Let the selected server process the message.
// If successful, process() sets the last_message_successful flag.
bool success = selected->process();
// If there was a failure, try again with the other one.
if (!success) {
    if (selected == &primary) {
        selected = &backup;
    } else {
        selected = &primary;
    }
    success = selected->process();
}
In this example, no lock is necessary. The primary will be used until it fails; then the backup will be used. If other messages are processed in the meantime, the primary may become usable again, in which case it will be used. Otherwise, the backup will be used until it fails. If both fail, both will be attempted: first the primary, then the backup.
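One caveat: for the lock-free version to be safe, the status flag must be atomic, otherwise concurrent reads and writes of it are a data race. A minimal sketch of the assumed server type (last_message_successful is from the example above; send_message is a hypothetical placeholder for the real I/O):

#include <atomic>

struct server {
    std::atomic<bool> last_message_successful{true};

    bool process() {
        bool ok = send_message(); // hypothetical: the actual request I/O
        last_message_successful.store(ok, std::memory_order_relaxed);
        return ok;
    }

private:
    bool send_message(); // assumed to exist; not part of the original answer
};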

How to hook a method from ANY thread within a process using unmanaged EasyHook?

I've been having some issues getting my method hooks to work. I can get the hook to work if "I" call the method that's being hooked, but when the call occurs naturally during the process's operation, it doesn't get hooked. My problem probably stems from the fact that I'm setting these hooks in my own thread that I've spawned, and apparently the LhSetInclusiveACL() method needs to know which threads you want to hook. Well, here are my issues...
I don't really care which threads apply the hook; I want them all to be hooked. For example, let's say I want the CreateICW() method from the "gdi32.dll" library hooked for the entire process "iexplorer.exe", not just for thread ID number 48291 or whatever. Knowing which threads are going to call the routines you are interested in hooking requires intimate knowledge of the internal workings of the process you are hooking. I speculate that this is generally not feasible, and it is certainly not feasible for me. Thus it's kind of impossible for me to know a priori which thread IDs need to be hooked.
The following code was taken from the "UnmanageHook" example:
extern "C" int main(int argc, wchar_t* argv[])
{
//...
//...
//...
/*
The following shows how to install and remove local hooks...
*/
FORCE(LhInstallHook(
GetProcAddress(hUser32, "MessageBeep"),
MessageBeepHook,
(PVOID)0x12345678,
hHook));
// won't invoke the hook handler because hooks are inactive after installation
MessageBeep(123);
// activate the hook for the current thread
// This is where I believe my problem is. ACLEntries is
// supposed to have a list of thread IDs that should pay
// attention to the MessageBeep() hook. Entries that are
// "0" get translated to be the "current" threadID. I want
// ALL threads and I don't want to have to try to figure out
// which threads will be spawned in the future for the given
// process. The second parameter is InThreadCount. I'm
// kind of shocked that you can't just pass in 0 or -1 or
// something for this parameter and just have it hook all
// threads in that given process.
FORCE(LhSetInclusiveACL(ACLEntries, 1, hHook));
// will be redirected into the handler...
MessageBeep(123);
//...
//...
//...
}
I've added some comments to the LhSetInclusiveACL() method call explaining the situation. Also, LhSetExclusiveACL() and the "global" versions of these methods don't seem to help either.
For reference here is the documentation for LhSetExclusiveACL:
/***********************************************************************
Sets an exclusive hook local ACL based on the given thread ID list.
Global and local ACLs are always intersected. For example if the
global ACL allows a set “G” of threads to be intercepted, and the
local ACL allows a set “L” of threads to be intercepted, then the
set “G L” will be intercepted. The “exclusive” and “inclusive”
ACL types don’t have any impact on the computation of the final
set. Those are just helpers for you to construct a set of threads.
EASYHOOK_NT_EXPORT LhSetExclusiveACL(
ULONG* InThreadIdList,
ULONG InThreadCount,
TRACED_HOOK_HANDLE InHandle);
Parameters:
InThreadIdList
An array of thread IDs. If you specific zero for an
entry in this array, it will be automatically replaced
with the calling thread ID.
InThreadCount
The count of entries listed in the thread ID list. This
value must not exceed MAX_ACE_COUNT!
InHandle
The hook handle whose local ACL is going to be set.
Return values:
STATUS_INVALID_PARAMETER_2
The limit of MAX_ACE_COUNT ACL is violated by the given buffer.
***********************************************************************/
Am I using this wrong? I imagine that this is how the majority of implementations would use this library, so why is this not working for me?
You want to use LhSetExclusiveACL instead. That way, calls from any thread get hooked, except for the threads you explicitly list in the ACL.
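A hedged sketch based on the documented semantics above: an exclusive ACL with zero entries excludes no threads, so the hook should apply to every current and future thread of the process.

// Exclude nothing: an empty exclusive ACL means all threads are intercepted.
ULONG aclEntries[1] = { 0 }; // storage only; the count passed below is 0
FORCE(LhSetExclusiveACL(aclEntries, 0, hHook));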

Incremental uploading of file using signals and slots

Before implementing this, I would like to check whether it will lead to undefined behaviour or race conditions.
When uploading files to Azure, this must be done in blocks. I want to upload 5 blocks in parallel, all reading their data from the same file. That would happen like this:
char currentDataChunk[MAX_BLOCK_SIZE]; // read() needs a real buffer; an uninitialized pointer is undefined behaviour
int currentDataChunkSize;

connect(_blobStorageProvider, SIGNAL(putBlockSucceded(int)), this, SLOT(finalizeAndUploadNextBlock(int)));

int parallelUploads = ((_item->size() / MAX_BLOCK_SIZE) >= MAX_PARALLEL_BLOCKUPLOADS) ? MAX_PARALLEL_BLOCKUPLOADS : (_item->size() / MAX_BLOCK_SIZE);
_latestProcessedBlockId = (parallelUploads - 1);

for(int i = 0; i < parallelUploads; i++) {
    currentDataChunkSize = _item->read(currentDataChunk, MAX_BLOCK_SIZE);
    ...
    uploader->putBlock(_container, _blobName, currentDataChunk, i);
}
The uploader's putBlock function issues the call through QNetworkAccessManager. When it's done, it emits a signal saying whether it failed, succeeded, or got canceled, along with the blockId so that I know which of the blocks was uploaded.
void BigBlobUploader::finalizeAndUploadNextBlock(int blockId) {
    // FINALIZE BY ADDING THE SUCCESSFUL BLOCK TO THE FUTURE BLOCK LIST
    QByteArray temp;
    for(int i = 0; i != sizeof(blockId); i++) {
        temp.append((char)(blockId >> (i * 8)));
    }
    _uploadedBlockIds.insert(blockId, QString(temp.toBase64()));
    this->uploadNextBlock();
}

void BigBlobUploader::uploadNextBlock() {
    char newDataChunk[MAX_BLOCK_SIZE]; // again, a real buffer for read()
    int newDataChunkSize = _item->read(newDataChunk, MAX_BLOCK_SIZE);
    ...
    _latestProcessedBlockId++;
    uploader->putBlock(_container, _blobName, newDataChunk, _latestProcessedBlockId);
}
My plan is to route these signals to a slot which notes that the block was uploaded (putting it in a list so I can commit the block list to finalize the blob), increments the index by one (starting at 5), fetches a new chunk of data, and redoes the whole process.
My issue is: what if two of them finish at the EXACT same time? I'm not dealing with threads here, but since the HTTP requests are threaded by default, what is the case here? Are the signals queued (or should I use Qt::QueuedConnection)? Can a slot be called in parallel? Is there a better way of doing this?
Sorry for the inconvenience; I assumed you were using .NET since you added the Windows Azure tag to this thread. I'm familiar with Windows Azure, but my understanding of Qt is limited. However, it should not be different from using signals/slots in other concurrent scenarios. This document may help: http://qt-project.org/doc/qt-4.8/signalsandslots.html.
Best Regards,
Ming Xu.
I am not familiar with QNetworkAccessManager. But in general, to deal with race conditions, use locks. In C#, the usual way is the lock keyword, something like:
private object lockingObject = new object();
In a method:
lock (lockingObject)
{
    // If a thread holds the lock, any other thread is blocked here until the lock is released.
}
In addition, you can refer to http://msdn.microsoft.com/en-us/library/c5kehkcz(v=vs.100).aspx for more information.
Best Regards,
Ming Xu.

Asynchronous network calls

I made a class that has an asynchronous OpenWebPage() function. Once you call OpenWebPage(someUrl), a handler - OnPageLoad(reply) - gets called. I have been using a member variable called lastAction to take care of things once a page is loaded: the handler checks lastAction and calls the appropriate function. For example:
this->lastAction = "homepage";
this->OpenWebPage("http://www.hardwarebase.net");
void OnPageLoad(reply)
{
    if(this->lastAction == "homepage")
    {
        this->lastAction = "login";
        this->Login(); // POSTs a form and OnPageLoad gets called again
    }
    else if(this->lastAction == "login")
    {
        this->PostLogin(); // Checks whether we logged in properly, sets lastAction to "new topic" and goes to the new topic URL
    }
    else if(this->lastAction == "new topic")
    {
        this->WriteTopic(); // Does some more stuff ... you get the point
    }
}
Now, this is rather hard to write and keep track of when we have a large number of "actions". When I was doing stuff in Python (synchronously) it was much easier, like:
OpenWebPage("http://hardwarebase.net") // Stores the loaded page HTML in self.page
OpenWebpage("http://hardwarebase.net/login", {"user": username, "pw": password}) // POSTs a form
if(self.page == ...): // now do some more checks etc.
// do something more
Imagine now that I have a queue class which holds the actions: homepage, login, new topic. How am I supposed to execute all those actions (in the proper order, one after another!) via the asynchronous callback? The first example is totally hard-coded, obviously.
I hope you understand my question, because frankly I fear this is the worst question ever written :x
P.S. All this is done in Qt.
You are inviting all manner of bugs if you try and use a single member variable to maintain state for an arbitrary number of asynchronous operations, which is what you describe above. There is no way for you to determine the order that the OpenWebPage calls complete, so there's also no way to associate the value of lastAction at any given time with any specific operation.
There are a number of ways to solve this, e.g. (see the sketch after this list):
Encapsulate web page loading in an immutable class that processes one page per instance
Return an object from OpenWebPage which tracks progress and stores the operation's state
Fire a signal when an operation completes and attach the operation's context to the signal
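As one minimal sketch of the queued-actions idea (the ActionSequence class and its names are hypothetical; in Qt you would call onPageLoaded() from the slot connected to the page-loaded signal):

#include <functional>
#include <queue>
#include <string>

class ActionSequence {
public:
    // Each action receives the loaded page and typically triggers the next OpenWebPage() call.
    using Action = std::function<void(const std::string& page)>;

    void enqueue(Action a) { actions_.push(std::move(a)); }

    // Call this from the page-loaded handler (e.g. OnPageLoad):
    // it runs exactly one queued action per completed page load, in order.
    void onPageLoaded(const std::string& page) {
        if (actions_.empty()) return;
        Action next = std::move(actions_.front());
        actions_.pop();
        next(page);
    }

private:
    std::queue<Action> actions_;
};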
Make sure each branch ends with a "return" (or remains part of an else-if chain); otherwise, once lastAction is reassigned, later branches can execute in the same OnPageLoad call.
Generally, asynchronous state management is always more complicated than synchronous. Consider replacing lastAction's type with an enumeration. Also, if OnPageLoad may run in an arbitrary thread context, you need to synchronize access to shared variables.