QWebElement manipulation of a QWebPage in a separate thread

I have a QWebPage created in the main thread (you can't create it anywhere else). I would like to manipulate this page using the QWebElement API introduced in Qt 4.6, but in a separate thread. So that thread would acquire a reference to the page and perform the necessary tree walking and attribute changes I need.
As the Threads and QObjects doc page explains, it is unsafe to manipulate QObjects in threads that don't own them unless the developer can ensure that the QObject in question will not be processing events while this manipulation is going on.
Now, this QWebPage is also being displayed in a QWebView, but the main thread will be blocked while waiting for the worker thread to finish (actually many of them, working on many different pages). Hence, the main event loop will not be running while the operation is in progress.
Thus, I believe the operation to be safe. Am I mistaken? Have I missed something? I'm basically asking for reassurance that this will not blow up in my face...

I do think you're right, and it is safe. At least, you have me convinced :)

Related

moveToThread vs deriving from QThread in Qt

When should moveToThread be preferred over subclassing QThread?
This link shows that both methods work. On what basis should I decide what to use from those two?

I would focus on the differences between the two methods. There isn't a general answer that fits all use cases, so it's good to understand exactly what they are to choose the best that fits your case.
Using moveToThread()
moveToThread() is used to control the object's thread affinity, which basically means setting the thread (or, better, the Qt event loop) in which the object's queued slot invocations and event handlers will be executed.
As shown in the documentation you linked, this can be used to run code on a different thread, basically creating a dummy worker, writing the code to run in a public slot (in the example the doWork() slot) and then using moveToThread to move it to a different event loop.
Then, a signal connected to that slot is fired. Since the object that emits the signal (the Controller in the example) lives in a different thread, and the signal is connected to our doWork method with a queued connection, the doWork method will be executed in the worker thread.
The key here is that you are creating a new event loop, run by the worker thread. Hence, once the doWork slot has started, the whole event loop will be busy until it exits, and this means that incoming signals will be queued.
Subclassing QThread
The other method described in Qt's documentation is subclassing QThread. In this case, one overrides the default implementation of the QThread::run() method, which creates an event loop, to run something else.
There's nothing wrong with this approach itself, although there are several catches.
First of all, it is very easy to write unsafe code, because the run() method is the only one in that class that will be actually run on another thread.
If, for example, you have a member variable that you initialize in the constructor and then use in the run() method, your member is initialized in the thread of the caller and then used in the new thread.
The same goes for any public method that could be called either from the caller or inside run().
Slots would also be executed in the caller's thread (unless you do something really weird such as moveToThread(this)), leading to extra confusion.
So, it is possible, but you really are on your own with this approach and you must pay extra attention.
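The catch with the constructor and run() living on different threads can be sketched without Qt. This is only a toy analogue built on std::thread; ToyThread and memberCrossesThreads are made-up names, not part of Qt, and the class only mimics the start()/wait()/run() shape of a QThread subclass.

```cpp
#include <thread>

// Toy analogue of a QThread subclass (hypothetical, not Qt):
// only the body of run() executes on the new thread.
class ToyThread {
public:
    ToyThread() : constructedIn_(std::this_thread::get_id()) {}
    ~ToyThread() { wait(); }

    void start() { thread_ = std::thread([this] { run(); }); }
    void wait() { if (thread_.joinable()) thread_.join(); }

    std::thread::id constructedIn() const { return constructedIn_; }
    std::thread::id ranIn() const { return ranIn_; }  // call only after wait()

private:
    void run() { ranIn_ = std::this_thread::get_id(); }

    std::thread::id constructedIn_;  // written in the caller's thread
    std::thread::id ranIn_;          // written in the new thread
    std::thread thread_;
};

// True when the constructor ran in the calling thread and run() ran elsewhere.
bool memberCrossesThreads()
{
    ToyThread t;
    t.start();
    t.wait();
    return t.constructedIn() == std::this_thread::get_id()
        && t.ranIn() != t.constructedIn();
}
```

Any member touched both before start() and inside run() crosses that thread boundary, which is why it needs the extra attention described above.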
Other approaches
There are of course alternatives to both approaches, depending on what you need. If you just need to run some code in background while your GUI thread is running you may consider using QtConcurrent::run().
However, keep in mind that QtConcurrent will use the global QThreadPool. If the whole pool is busy (meaning there aren't available threads in the pool), your code will not run immediately.
Another alternative, if you are on at least C++11, is to use a lower level API such as std::thread.
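A minimal sketch of the std::thread route, assuming the work is a plain computation with no Qt objects involved. The function name sumInBackground and the choice of summing a vector are illustrative only.

```cpp
#include <future>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical unit of work: sum a vector on a background thread.
long long sumInBackground(const std::vector<int>& values)
{
    std::promise<long long> resultPromise;
    std::future<long long> result = resultPromise.get_future();

    // The lambda is the only code that runs on the new thread.
    std::thread worker([&values, &resultPromise] {
        resultPromise.set_value(
            std::accumulate(values.begin(), values.end(), 0LL));
    });

    // The caller could do other work here; get() blocks until the value is set.
    long long total = result.get();
    worker.join();  // a std::thread must be joined (or detached) before destruction
    return total;
}
```

Unlike the Qt approaches, nothing here gives you an event loop or queued signal delivery; getting the result back to a GUI thread is left to you.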

As a starting point: use neither. In most cases, you have a unit of work that you wish to run asynchronously. Use QtConcurrent::run for that.
If you have an object that reacts to events and/or uses timers, it's a QObject that should be non-blocking and go in a thread, perhaps shared with other objects.
Such an object can also wrap blocking APIs.
Subclassing QThread is never necessary in practice. It's like subclassing QFile. QThread is a thread handle. It wraps a system resource. Subclassing it is a bit silly.

The simple answer is ALWAYS.
When you move an object to a thread:
it is easy to write tests for the code
it is easy to refactor the code (you can use a thread, but you don't have to)
you do not mix thread functionality with business logic
there is no problem with object lifetime
When you subclass QThread:
it is harder to write tests
the object clean-up process can get very confusing, leading to strange errors
There is a full description of the problem on the Qt blog: You’re doing it wrong….
QtConcurrent::run is also very handy.
Please remember that, by default, a slot call is queued to the receiver's thread when the signal is emitted from a thread other than the one the receiving object is assigned to. For details see the documentation of Qt::ConnectionType.

QThread is a low-level thread abstraction; first look at the high-level APIs, the QtConcurrent module and QRunnable.
If none of these is suitable for you, then read this old article; it explains how you should use QThread. Think of the thread and the task performed in that thread as separate objects; don't mix them together.
So, if you need to write some custom, specific or extended thread wrapper, then you should subclass QThread.
If you have a QObject-derived class with signals and slots, then use moveToThread on it.
In other cases use QtConcurrent, QRunnable and QThreadPool.

MFC: accessing GUI from another thread?

So generally only the main thread should access the GUI in an MFC application.
However, is that a law or just a recommendation? If I make sure, via critical sections, that only one thread accesses a certain object in the GUI, is it OK then? Or is it a problem if the MAIN thread accesses one part of the GUI while another thread accesses another, even if those two objects don't affect each other?
The reason I ask is because this simplifies my rewrite of the application a lot if I can access the GUI from another thread.

Don't do it. You'll live in a world of ASSERTs and weird behaviour if you do. The GUI works through a system of Windows messages which are 'pumped' on the main thread. If you start modifying the UI in another thread you'll have situations where your operation causes other UI messages, which will be handled by the main thread potentially at the same time you're still trying to access the UI on another thread.
MFC programming is hard enough without trying to handle this sort of thing. Instead use PostMessage to put the UI related handling onto the main thread.

I used to think it's almost forbidden to access the GUI from a worker thread in MFC and that doing so is a recipe for disaster. But recently I learned this is not such a hard rule: if you know what you are doing, you can use worker threads to access the GUI. In the Win32 Multithreaded Book the author provides an example of a 'self animated control' which is drawn completely in a worker thread.
If I remember correctly, the author pretty much said the same thing you said: if you use critical sections in the right places, you can make accessing the GUI thread-safe. The reason MFC doesn't do it by itself is performance.

Qt signals (QueuedConnection and DirectConnection)

I'm having trouble with Qt signals.
I don't understand how DirectConnection and QueuedConnection work.
I'd be thankful if someone will explain when to use which of these (sample code would be appreciated).

You won't see much of a difference unless you're working with objects having different thread affinities. Let's say you have QObjects A and B and they're both attached to different threads. A has a signal called somethingChanged() and B has a slot called handleChange().
If you use a direct connection
connect( A, SIGNAL(somethingChanged()), B, SLOT(handleChange()), Qt::DirectConnection );
the method handleChange() will actually run in A's thread. Basically, it's as if emitting the signal calls the slot method "directly". If B::handleChange() isn't thread-safe, this can cause some (difficult to locate) bugs. At the very least, you're missing out on the benefits of the extra thread.
If you change the connection method to Qt::QueuedConnection (or, in this case, let Qt decide which method to use), things get more interesting. Assuming B's thread is running an event loop, emitting the signal will post an event to B's event loop. The event loop queues the event, and eventually invokes the slot method whenever control returns to it (it being the event loop). This makes it pretty easy to deal with communication between/among threads in Qt (again, assuming your threads are running their own local event loops). You don't have to worry about locks, etc. because the event loop serializes the slot invocations.
Note: If you don't know how to change a QObject's thread affinity, look into QObject::moveToThread. That should get you started.
Edit
I should clarify my opening sentence. It does make a difference if you specify a queued connection - even for two objects on the same thread. The event is still posted to the thread's event loop. So, the method call is still asynchronous, meaning it can be delayed in unpredictable ways (depending on any other events the loop may need to process). However, if you don't specify a connection method, the direct method is automatically used for connections between objects on the same thread (at least it is in Qt 4.8).
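The difference in timing can be modelled in plain C++. This is a sketch of the idea only, not of Qt's implementation: ToyEventLoop and emitBothWays are hypothetical names, the "slot" is an ordinary lambda, and everything runs on a single thread.

```cpp
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a thread's event loop (hypothetical, not Qt):
// queued calls wait here until the loop processes them.
class ToyEventLoop {
public:
    void post(std::function<void()> call) { pending_.push(std::move(call)); }

    void processEvents() {
        while (!pending_.empty()) {
            std::function<void()> call = std::move(pending_.front());
            pending_.pop();
            call();
        }
    }

private:
    std::queue<std::function<void()>> pending_;
};

// Records the order in which things happen for a direct and a queued "emit".
std::vector<std::string> emitBothWays()
{
    std::vector<std::string> log;
    ToyEventLoop loop;
    auto slot = [&log](const std::string& how) {
        log.push_back("slot (" + how + ")");
    };

    // "Direct": emitting behaves like an ordinary function call.
    slot("direct");
    log.push_back("after direct emit");

    // "Queued": emitting only posts the call; the slot has not run yet.
    loop.post([slot] { slot("queued"); });
    log.push_back("after queued emit");

    // The slot runs once control returns to the event loop.
    loop.processEvents();
    return log;
}
```

The returned log reads: slot (direct), after direct emit, after queued emit, slot (queued). The queued slot runs only after the emitting code has moved on, which is the asynchronous behaviour described in the edit above.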

In addition to Jacob Robbins' answer:
The statement "You won't see much of a difference unless you're working with objects having different thread affinities" is wrong.
Emitting a signal to a direct connection within the same thread will execute the slot immediately, just like a simple function call.
Emitting a signal to a queued connection within the same thread will enqueue the call into the thread's event loop, so the execution will always be delayed.
QObject based class has a queued connection to itself

Jacob's answer is awesome. I'd just like to add a comparative example to Embedded Programming.
Coming from an embedded RTOS/ISR background, it was helpful to see the similarities in Qt's DirectConnection to Preemptive behavior of the ISRs and Qt's QueuedConnection to Queued Messages in an RTOS between tasks.
Side note: Coming from an Embedded background, it's difficult for me to not define the behavior in the programming. I never leave the argument as Auto, but that is just a personal opinion. I prefer everything to be explicitly written, and yes that gets difficult at times!

Can Qt call two slots simultaneously, if they get called from the same signal?

Suppose there are two slots in two different threads, and both slots are connected to a signal in a third thread. Can both slots be called at the same time by the signal, or are the calls always synchronized?
I ask because I want to send a callback data structure (wrapped in a QSharedPointer) and need to know whether a locking mechanism inside it is required.

You don't need to lock the actual signal/slot calls if you're using a Qt::QueuedConnection to pass the information to your threads, as the QueuedConnection mechanism handles this in a thread-safe manner.
That being said, you still need to protect any shared memory your threads access, regardless of how they were called. The fact that a third thread emitted a single signal to cause both slots to be called will not change this.
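A minimal sketch of protecting the shared data, written with the standard library rather than Qt so that it stands alone. SharedCounter and runTwoSlots are hypothetical names; the two std::thread objects stand in for the two receiver threads, and in Qt code a QMutex with QMutexLocker would play the role of std::mutex and std::lock_guard.

```cpp
#include <mutex>
#include <thread>

// Hypothetical payload that both slots modify.
class SharedCounter {
public:
    void increment() {
        std::lock_guard<std::mutex> lock(mutex_);
        ++value_;
    }

    int value() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return value_;
    }

private:
    mutable std::mutex mutex_;  // guards value_
    int value_ = 0;
};

// Two "slots" run concurrently in separate threads and touch the same object.
int runTwoSlots(int callsPerSlot)
{
    SharedCounter counter;
    auto slot = [&counter, callsPerSlot] {
        for (int i = 0; i < callsPerSlot; ++i) {
            counter.increment();
        }
    };

    std::thread first(slot);
    std::thread second(slot);
    first.join();
    second.join();
    return counter.value();
}
```

Without the lock, the two increments could interleave and lose updates; with it, the result is always twice callsPerSlot.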

Have a look here (official Qt documentation for Qt's signal/slot mechanism regarding threads).
Each slot is called inside its thread, therefore I am pretty sure anything can happen. You should install a lock mechanism.

Inter-thread communication. How to send a signal to another thread

In my application I have two threads
a "main thread" which is busy most of the time
an "additional thread" which sends out some HTTP request and which blocks until it gets a response.
However, the HTTP response can only be handled by the main thread, since it relies on its thread-local storage and on non-thread-safe functions.
I'm looking for a way to tell the main thread when an HTTP response has been received, and to pass it the corresponding data. The main thread should be interrupted by the additional thread and process the HTTP response as soon as possible, and afterwards continue working from the point where it was interrupted.
One way I can think of is that the additional thread suspends the main thread using SuspendThread, copies the TLS from the main thread using some inline assembler, executes the response-processing function itself and resumes the main thread afterwards.
Another way I thought of is setting a breakpoint on some specific address in the second thread's callback routine, so that the main thread gets notified when the second thread's instruction pointer hits that breakpoint and has therefore received the HTTP response.
However, neither method seems nice at all; they hurt just to think about, and they don't look really reliable.
What can I use to interrupt my main thread, telling it that it should be polite and process the HTTP response before doing anything else? Answers without dependencies on libraries are appreciated, but I would also accept a dependency if it provides a nice solution.
A follow-up question (regarding the QueueUserAPC solution) was answered; it explained that there is no safe method to get push behaviour in my case.

This may be one of those times where one works themselves into a very specific idea without reconsidering the bigger picture. There is no singular mechanism by which a single thread can stop executing in its current context, go do something else, and resume execution at the exact line from which it broke away. If it were possible, it would defeat the purpose of having threads in the first place. As you already mentioned, without stepping back and reconsidering the overall architecture, the most elegant of your options seems to be using another thread to wait for an HTTP response, have it suspend the main thread in a safe spot, process the response on its own, then resume the main thread. In this scenario you might rethink whether thread-local storage still makes sense or if something a little higher in scope would be more suitable, as you could potentially waste a lot of cycles copying it every time you interrupt the main thread.

What you are describing is what QueueUserAPC does. But the notion of using it for this sort of synchronization makes me a bit uncomfortable. If you don't know that the main thread is in a safe place to interrupt it, then you probably shouldn't interrupt it.
I suspect you would be better off giving the main thread's work to another thread so that it can sit and wait for you to send it notifications to handle work that only it can handle.
PostMessage or PostThreadMessage usually works really well for handing off bits of work to your main thread. Posted messages are handled before user input messages, but not until the thread is ready for them.

I might not understand the question, but CreateSemaphore and WaitForSingleObject should work. If one thread is waiting for the semaphore, it will resume when the other thread signals it.
Update based on the comment: The main thread can call WaitForSingleObject with a wait time of zero. In that situation, it will resume immediately if the semaphore is not signaled. The main thread could then check it on a periodic basis.

It looks like the answer should be discoverable from Microsoft's MSDN. Especially from this section on 'Synchronizing Execution of Multiple Threads'

If your main thread is a GUI thread, why not send a Windows message to it? That's what we all do to interact with the Win32 GUI from worker threads.

One way to do this that is deterministic is to periodically check whether an HTTP response has been received.
It's better for you to say what you're trying to accomplish.

In this situation I would do a couple of things. First and foremost, I would restructure the work that the main thread is doing so that it is broken into pieces that are as small as possible. That gives you a series of safe places to break execution at. Then you want to create a work queue, probably using the Microsoft SList. The SList will give you the ability to have one thread adding while another reads without the need for locking.
Once you have that in place you can essentially make your main thread run in a loop over each piece of work, checking periodically to see if there are requests to handle in the queue. Long-term what is nice about an architecture like that is that you could fairly easily eliminate the thread localized storage and parallelize the main thread by converting the slist to a work queue (probably still using the slist), and making the small pieces of work and the responses into work objects which can be dynamically distributed across any available threads.
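The loop described above might look roughly like the following. This is a portable sketch under stated assumptions, not the answer's exact design: it swaps the lock-free SList for a std::mutex-protected std::queue, and RequestQueue and responsesHandledOnMainThread are hypothetical names. The "HTTP response" is just a callback handed over by a second thread.

```cpp
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Assumption: a mutex-guarded queue stands in for the lock-free SList.
class RequestQueue {
public:
    void push(std::function<void()> request) {
        std::lock_guard<std::mutex> lock(mutex_);
        requests_.push(std::move(request));
    }

    // Runs all pending requests on the calling thread; returns how many ran.
    int drain() {
        std::queue<std::function<void()>> batch;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            std::swap(batch, requests_);
        }
        int handled = 0;
        for (; !batch.empty(); batch.pop(), ++handled) {
            batch.front()();
        }
        return handled;
    }

private:
    std::mutex mutex_;
    std::queue<std::function<void()>> requests_;
};

// The main thread works in small pieces and checks the queue between them.
bool responsesHandledOnMainThread(int pieces)
{
    RequestQueue queue;
    const std::thread::id mainId = std::this_thread::get_id();
    bool allOnMain = true;
    int handled = 0;

    // "Additional thread": pretends a response arrived and hands it over.
    std::thread http([&] {
        queue.push([&] {
            allOnMain = allOnMain && (std::this_thread::get_id() == mainId);
        });
    });

    for (int i = 0; i < pieces; ++i) {
        // ... one small piece of the main thread's work would go here ...
        handled += queue.drain();
    }

    http.join();
    handled += queue.drain();  // pick up anything posted after the last piece
    return allOnMain && handled == 1;
}
```

Because the callback runs inside drain(), it executes on the main thread and can keep using that thread's thread-local storage and non-thread-safe functions; the latency is bounded by the size of one piece of work.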